Compare commits

...

9 Commits

Author SHA1 Message Date
gdkchan
9ecbee8032 Batch inline index buffer update (#4587) 2023-03-24 14:19:54 +01:00
gdkchan
80519af67d Update short cache textures if modified (#4586) 2023-03-24 12:54:58 +01:00
gdkchan
26e30faff3 Fix handle leak on IShopServiceAccessServerInterface.CreateServerInterface (#4591) 2023-03-24 11:56:54 +01:00
Wunk
0992310b76 ARMeilleure: Check for XSAVE cpuid flag for AVX{2,512} (#4584)
Protection for the `xgetbv` instruction for systems that do not support
`xcr0` such as nehalem processors.

The `XSAVE` cpuid indicates support for `XSAVE`, `XRESTOR`, `XSETBV`,
`XGETBV` while `OSXSAVE` indicates if the operating system itself has
`XSAVE` turned on. Both must be checked at the same time.
2023-03-22 14:51:21 -03:00
Andrew Glaze
009c1101d2 CI: add a version tag to correlate release versions with commits (#4572)
* add step to tag commit with release version

* add step to tag commit with release version

* Rename step to “Create Tag”

* Fix name
2023-03-22 13:17:28 +01:00
gdkchan
ba95ee54ab Revert "Use source generated json serializers in order to improve code trimming (#4094)" (#4576)
This reverts commit 4ce4299ca2.
2023-03-21 20:14:46 -03:00
Andrey Sukharev
4ce4299ca2 Use source generated json serializers in order to improve code trimming (#4094)
* Use source generated json serializers in order to improve code trimming

* Use strongly typed github releases model to fetch updates instead of raw Newtonsoft.Json parsing

* Use separate model for LogEventArgs serialization

* Make dynamic object formatter static. Fix string builder pooling.

* Do not inherit json version of LogEventArgs from EventArgs

* Fix extra space in object formatting

* Write log json directly to stream instead of using buffer writer

* Rebase fixes

* Rebase fixes

* Rebase fixes

* Enforce block-scoped namespaces in the solution. Convert style for existing code

* Apply suggestions from code review

Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>

* Rebase indent fix

* Fix indent

* Delete unnecessary json properties

* Rebase fix

* Remove overridden json property names as they are handled in the options

* Apply suggestions from code review

Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>

* Use default json options in github api calls

* Indentation and spacing fixes

---------

Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
2023-03-21 19:41:19 -03:00
Wunk
17620d18db ARMeilleure: Add initial support for AVX512 (EVEX encoding) (cont) (#4147)
* ARMeilleure: Add AVX512{F,VL,DQ,BW} detection

Add `UseAvx512Ortho` and `UseAvx512OrthoFloat` optimization flags as
short-hands for `F+VL` and `F+VL+DQ`.

* ARMeilleure: Add initial support for EVEX instruction encoding

Does not implement rounding, or exception controls.

* ARMeilleure: Add `X86Vpternlogd`

Accelerates the vector-`Not` instruction.

* ARMeilleure: Add check for `OSXSAVE` for AVX{2,512}

* ARMeilleure: Add check for `XCR0` flags

Add XCR0 register checks for AVX and AVX512F, following the guidelines
from section 14.3 and 15.2 from the Intel Architecture Software
Developer's Manual.

* ARMeilleure: Remove redundant `ReProtect` and `Dispose`, formatting

* ARMeilleure: Move XCR0 procedure to GetXcr0Eax

* ARMeilleure: Add `XCR0` to `FeatureInfo` structure

* ARMeilleure: Utilize `ReadOnlySpan` for Xcr0 assembly

Avoids an additional allocation

* ARMeilleure: Formatting fixes

* ARMeilleure: Fix EVEX encoding src2 register index

> Just like in VEX prefix, vvvv is provided in inverted form.

* ARMeilleure: Add `X86Vpternlogd` acceleration to `Vmvn_I`

Passes unit tests, verified instruction utilization

* ARMeilleure: Fix EVEX register operand designations

Operand 2 was being sourced improperly.

EVEX encoded instructions source their operands like so:
Operand 1: ModRM:reg
Operand 2: EVEX.vvvvv
Operand 3: ModRM:r/m
Operand 4: Imm

This fixes the improper register designations when emitting vpternlog.
Now "dest", "src1", "src2" arguments emit in the proper order in EVEX instructions.

* ARMeilleure: Add `X86Vpternlogd` acceleration to `Orn_V`

* ARMeilleure: PTC version bump

* ARMeilleure: Update EVEX encoding Debug.Assert to Debug.Fail

* ARMeilleure: Update EVEX encoding comment capitalization
2023-03-20 16:09:24 -03:00
riperiperi
9f1cf6458c Vulkan: Migrate buffers between memory types to improve GPU performance (#4540)
* Initial implementation of migration between memory heaps

- Missing OOM handling
- Missing `_map` data safety when remapping
  - Copy may not have completed yet (needs some kind of fence)
  - Map may be unmapped before it is done being used. (needs scoped access)
- SSBO accesses are all "writes" - maybe pass info in another way.
- Missing keeping map type when resizing buffers (should this be done?)

* Ensure migrated data is in place before flushing.

* Fix issue where old waitable would be signalled.

- There is a real issue where existing Auto<> references need to be replaced.

* Swap bound Auto<> instances when swapping buffer backing

* Fix conversion buffers

* Don't try move buffers if the host has shared memory.

* Make GPU methods return PinnedSpan with scope

* Storage Hint

* Fix stupidity

* Fix rebase

* Tweak rules

Attempt to sidestep BOTW slowdown

* Remove line

* Migrate only when command buffers flush

* Change backing swap log to debug

* Address some feedback

* Disallow backing swap when the flush lock is held by the current thread

* Make PinnedSpan from ReadOnlySpan explicitly unsafe

* Fix some small issues

- Index buffer swap fixed
- Allocate DeviceLocal buffers using a separate block list to images.

* Remove alternative flags

* Address feedback
2023-03-19 17:56:48 -03:00
52 changed files with 984 additions and 204 deletions

View File

@@ -112,6 +112,17 @@ jobs:
repo: ${{ env.RYUJINX_TARGET_RELEASE_CHANNEL_REPO }}
token: ${{ secrets.RELEASE_TOKEN }}
- name: Create tag
uses: actions/github-script@v5
with:
script: |
github.rest.git.createRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: 'refs/tags/${{ steps.version_info.outputs.build_version }}',
sha: context.sha
})
flatpak_release:
uses: ./.github/workflows/flatpak.yml
needs: release

View File

@@ -7,6 +7,7 @@
<ItemGroup>
<ProjectReference Include="..\Ryujinx.Common\Ryujinx.Common.csproj" />
<ProjectReference Include="..\Ryujinx.Memory\Ryujinx.Memory.csproj" />
</ItemGroup>
<ItemGroup>

View File

@@ -1034,7 +1034,13 @@ namespace ARMeilleure.CodeGen.X86
Debug.Assert(opCode != BadOp, "Invalid opcode value.");
if ((flags & InstructionFlags.Vex) != 0 && HardwareCapabilities.SupportsVexEncoding)
if ((flags & InstructionFlags.Evex) != 0 && HardwareCapabilities.SupportsEvexEncoding)
{
WriteEvexInst(dest, src1, src2, type, flags, opCode);
opCode &= 0xff;
}
else if ((flags & InstructionFlags.Vex) != 0 && HardwareCapabilities.SupportsVexEncoding)
{
// In a vex encoding, only one prefix can be active at a time. The active prefix is encoded in the second byte using two bits.
@@ -1153,6 +1159,103 @@ namespace ARMeilleure.CodeGen.X86
}
}
private void WriteEvexInst(
Operand dest,
Operand src1,
Operand src2,
OperandType type,
InstructionFlags flags,
int opCode,
bool broadcast = false,
int registerWidth = 128,
int maskRegisterIdx = 0,
bool zeroElements = false)
{
int op1Idx = dest.GetRegister().Index;
int op2Idx = src1.GetRegister().Index;
int op3Idx = src2.GetRegister().Index;
WriteByte(0x62);
// P0
// Extend operand 1 register
bool r = (op1Idx & 8) == 0;
// Extend operand 3 register
bool x = (op3Idx & 16) == 0;
// Extend operand 3 register
bool b = (op3Idx & 8) == 0;
// Extend operand 1 register
bool rp = (op1Idx & 16) == 0;
// Escape code index
byte mm = 0b00;
switch ((ushort)(opCode >> 8))
{
case 0xf00: mm = 0b01; break;
case 0xf38: mm = 0b10; break;
case 0xf3a: mm = 0b11; break;
default: Debug.Fail($"Failed to EVEX encode opcode 0x{opCode:X}."); break;
}
WriteByte(
(byte)(
(r ? 0x80 : 0) |
(x ? 0x40 : 0) |
(b ? 0x20 : 0) |
(rp ? 0x10 : 0) |
mm));
// P1
// Specify 64-bit lane mode
bool w = Is64Bits(type);
// Operand 2 register index
byte vvvv = (byte)(~op2Idx & 0b1111);
// Opcode prefix
byte pp = (flags & InstructionFlags.PrefixMask) switch
{
InstructionFlags.Prefix66 => 0b01,
InstructionFlags.PrefixF3 => 0b10,
InstructionFlags.PrefixF2 => 0b11,
_ => 0
};
WriteByte(
(byte)(
(w ? 0x80 : 0) |
(vvvv << 3) |
0b100 |
pp));
// P2
// Mask register determines what elements to zero, rather than what elements to merge
bool z = zeroElements;
// Specifies register-width
byte ll = 0b00;
switch (registerWidth)
{
case 128: ll = 0b00; break;
case 256: ll = 0b01; break;
case 512: ll = 0b10; break;
default: Debug.Fail($"Invalid EVEX vector register width {registerWidth}."); break;
}
// Embedded broadcast in the case of a memory operand
bool bcast = broadcast;
// Extend operand 2 register
bool vp = (op2Idx & 16) == 0;
// Mask register index
Debug.Assert(maskRegisterIdx < 8, $"Invalid mask register index {maskRegisterIdx}.");
byte aaa = (byte)(maskRegisterIdx & 0b111);
WriteByte(
(byte)(
(z ? 0x80 : 0) |
(ll << 5) |
(bcast ? 0x10 : 0) |
(vp ? 8 : 0) |
aaa));
}
private void WriteCompactInst(Operand operand, int opCode)
{
int regIndex = operand.GetRegister().Index;

View File

@@ -20,6 +20,7 @@ namespace ARMeilleure.CodeGen.X86
Reg8Dest = 1 << 2,
RexW = 1 << 3,
Vex = 1 << 4,
Evex = 1 << 5,
PrefixBit = 16,
PrefixMask = 7 << PrefixBit,
@@ -278,6 +279,7 @@ namespace ARMeilleure.CodeGen.X86
Add(X86Instruction.Vfnmsub231sd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38bf, InstructionFlags.Vex | InstructionFlags.Prefix66 | InstructionFlags.RexW));
Add(X86Instruction.Vfnmsub231ss, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38bf, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Vpblendvb, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f3a4c, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Vpternlogd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f3a25, InstructionFlags.Evex | InstructionFlags.Prefix66));
Add(X86Instruction.Xor, new InstructionInfo(0x00000031, 0x06000083, 0x06000081, BadOp, 0x00000033, InstructionFlags.None));
Add(X86Instruction.Xorpd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000f57, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Xorps, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000f57, InstructionFlags.Vex));

View File

@@ -1,10 +1,14 @@
using Ryujinx.Memory;
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics.X86;
namespace ARMeilleure.CodeGen.X86
{
static class HardwareCapabilities
{
private delegate uint GetXcr0();
static HardwareCapabilities()
{
if (!X86Base.IsSupported)
@@ -24,6 +28,34 @@ namespace ARMeilleure.CodeGen.X86
FeatureInfo7Ebx = (FeatureFlags7Ebx)ebx7;
FeatureInfo7Ecx = (FeatureFlags7Ecx)ecx7;
}
Xcr0InfoEax = (Xcr0FlagsEax)GetXcr0Eax();
}
private static uint GetXcr0Eax()
{
if (!FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Xsave))
{
// XSAVE feature required for xgetbv
return 0;
}
ReadOnlySpan<byte> asmGetXcr0 = new byte[]
{
0x31, 0xc9, // xor ecx, ecx
0xf, 0x01, 0xd0, // xgetbv
0xc3, // ret
};
using MemoryBlock memGetXcr0 = new MemoryBlock((ulong)asmGetXcr0.Length);
memGetXcr0.Write(0, asmGetXcr0);
memGetXcr0.Reprotect(0, (ulong)asmGetXcr0.Length, MemoryPermission.ReadAndExecute);
var fGetXcr0 = Marshal.GetDelegateForFunctionPointer<GetXcr0>(memGetXcr0.Pointer);
return fGetXcr0();
}
[Flags]
@@ -44,6 +76,8 @@ namespace ARMeilleure.CodeGen.X86
Sse42 = 1 << 20,
Popcnt = 1 << 23,
Aes = 1 << 25,
Xsave = 1 << 26,
Osxsave = 1 << 27,
Avx = 1 << 28,
F16c = 1 << 29
}
@@ -52,7 +86,11 @@ namespace ARMeilleure.CodeGen.X86
public enum FeatureFlags7Ebx
{
Avx2 = 1 << 5,
Sha = 1 << 29
Avx512f = 1 << 16,
Avx512dq = 1 << 17,
Sha = 1 << 29,
Avx512bw = 1 << 30,
Avx512vl = 1 << 31
}
[Flags]
@@ -61,10 +99,21 @@ namespace ARMeilleure.CodeGen.X86
Gfni = 1 << 8,
}
[Flags]
public enum Xcr0FlagsEax
{
Sse = 1 << 1,
YmmHi128 = 1 << 2,
Opmask = 1 << 5,
ZmmHi256 = 1 << 6,
Hi16Zmm = 1 << 7
}
public static FeatureFlags1Edx FeatureInfo1Edx { get; }
public static FeatureFlags1Ecx FeatureInfo1Ecx { get; }
public static FeatureFlags7Ebx FeatureInfo7Ebx { get; } = 0;
public static FeatureFlags7Ecx FeatureInfo7Ecx { get; } = 0;
public static Xcr0FlagsEax Xcr0InfoEax { get; } = 0;
public static bool SupportsSse => FeatureInfo1Edx.HasFlag(FeatureFlags1Edx.Sse);
public static bool SupportsSse2 => FeatureInfo1Edx.HasFlag(FeatureFlags1Edx.Sse2);
@@ -76,8 +125,13 @@ namespace ARMeilleure.CodeGen.X86
public static bool SupportsSse42 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Sse42);
public static bool SupportsPopcnt => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Popcnt);
public static bool SupportsAesni => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Aes);
public static bool SupportsAvx => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Avx);
public static bool SupportsAvx => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Avx | FeatureFlags1Ecx.Xsave | FeatureFlags1Ecx.Osxsave) && Xcr0InfoEax.HasFlag(Xcr0FlagsEax.Sse | Xcr0FlagsEax.YmmHi128);
public static bool SupportsAvx2 => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx2) && SupportsAvx;
public static bool SupportsAvx512F => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx512f) && FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Xsave | FeatureFlags1Ecx.Osxsave)
&& Xcr0InfoEax.HasFlag(Xcr0FlagsEax.Sse | Xcr0FlagsEax.YmmHi128 | Xcr0FlagsEax.Opmask | Xcr0FlagsEax.ZmmHi256 | Xcr0FlagsEax.Hi16Zmm);
public static bool SupportsAvx512Vl => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx512vl) && SupportsAvx512F;
public static bool SupportsAvx512Bw => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx512bw) && SupportsAvx512F;
public static bool SupportsAvx512Dq => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx512dq) && SupportsAvx512F;
public static bool SupportsF16c => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.F16c);
public static bool SupportsSha => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Sha);
public static bool SupportsGfni => FeatureInfo7Ecx.HasFlag(FeatureFlags7Ecx.Gfni);
@@ -85,5 +139,6 @@ namespace ARMeilleure.CodeGen.X86
public static bool ForceLegacySse { get; set; }
public static bool SupportsVexEncoding => SupportsAvx && !ForceLegacySse;
public static bool SupportsEvexEncoding => SupportsAvx512F && !ForceLegacySse;
}
}

View File

@@ -180,6 +180,7 @@ namespace ARMeilleure.CodeGen.X86
Add(Intrinsic.X86Vfnmadd231ss, new IntrinsicInfo(X86Instruction.Vfnmadd231ss, IntrinsicType.Fma));
Add(Intrinsic.X86Vfnmsub231sd, new IntrinsicInfo(X86Instruction.Vfnmsub231sd, IntrinsicType.Fma));
Add(Intrinsic.X86Vfnmsub231ss, new IntrinsicInfo(X86Instruction.Vfnmsub231ss, IntrinsicType.Fma));
Add(Intrinsic.X86Vpternlogd, new IntrinsicInfo(X86Instruction.Vpternlogd, IntrinsicType.TernaryImm));
Add(Intrinsic.X86Xorpd, new IntrinsicInfo(X86Instruction.Xorpd, IntrinsicType.Binary));
Add(Intrinsic.X86Xorps, new IntrinsicInfo(X86Instruction.Xorps, IntrinsicType.Binary));
}

View File

@@ -219,6 +219,7 @@ namespace ARMeilleure.CodeGen.X86
Vfnmsub231sd,
Vfnmsub231ss,
Vpblendvb,
Vpternlogd,
Xor,
Xorpd,
Xorps,

View File

@@ -254,7 +254,22 @@ namespace ARMeilleure.Instructions
public static void Not_V(ArmEmitterContext context)
{
if (Optimizations.UseSse2)
if (Optimizations.UseAvx512Ortho)
{
OpCodeSimd op = (OpCodeSimd)context.CurrOp;
Operand n = GetVec(op.Rn);
Operand res = context.AddIntrinsic(Intrinsic.X86Vpternlogd, n, n, Const(~0b10101010));
if (op.RegisterSize == RegisterSize.Simd64)
{
res = context.VectorZeroUpper64(res);
}
context.Copy(GetVec(op.Rd), res);
}
else if (Optimizations.UseSse2)
{
OpCodeSimd op = (OpCodeSimd)context.CurrOp;
@@ -283,6 +298,22 @@ namespace ARMeilleure.Instructions
{
InstEmitSimdHelperArm64.EmitVectorBinaryOp(context, Intrinsic.Arm64OrnV);
}
else if (Optimizations.UseAvx512Ortho)
{
OpCodeSimdReg op = (OpCodeSimdReg)context.CurrOp;
Operand n = GetVec(op.Rn);
Operand m = GetVec(op.Rm);
Operand res = context.AddIntrinsic(Intrinsic.X86Vpternlogd, n, m, Const(0b11001100 | ~0b10101010));
if (op.RegisterSize == RegisterSize.Simd64)
{
res = context.VectorZeroUpper64(res);
}
context.Copy(GetVec(op.Rd), res);
}
else if (Optimizations.UseSse2)
{
OpCodeSimdReg op = (OpCodeSimdReg)context.CurrOp;

View File

@@ -151,6 +151,13 @@ namespace ARMeilleure.Instructions
{
InstEmitSimdHelper32Arm64.EmitVectorBinaryOpSimd32(context, (n, m) => context.AddIntrinsic(Intrinsic.Arm64OrnV | Intrinsic.Arm64V128, n, m));
}
else if (Optimizations.UseAvx512Ortho)
{
EmitVectorBinaryOpSimd32(context, (n, m) =>
{
return context.AddIntrinsic(Intrinsic.X86Vpternlogd, n, m, Const(0b11001100 | ~0b10101010));
});
}
else if (Optimizations.UseSse2)
{
Operand mask = context.VectorOne();

View File

@@ -34,7 +34,14 @@ namespace ARMeilleure.Instructions
public static void Vmvn_I(ArmEmitterContext context)
{
if (Optimizations.UseSse2)
if (Optimizations.UseAvx512Ortho)
{
EmitVectorUnaryOpSimd32(context, (op1) =>
{
return context.AddIntrinsic(Intrinsic.X86Vpternlogd, op1, op1, Const(0b01010101));
});
}
else if (Optimizations.UseSse2)
{
EmitVectorUnaryOpSimd32(context, (op1) =>
{

View File

@@ -173,6 +173,7 @@ namespace ARMeilleure.IntermediateRepresentation
X86Vfnmadd231ss,
X86Vfnmsub231sd,
X86Vfnmsub231ss,
X86Vpternlogd,
X86Xorpd,
X86Xorps,

View File

@@ -23,6 +23,10 @@ namespace ARMeilleure
public static bool UseSse42IfAvailable { get; set; } = true;
public static bool UsePopCntIfAvailable { get; set; } = true;
public static bool UseAvxIfAvailable { get; set; } = true;
public static bool UseAvx512FIfAvailable { get; set; } = true;
public static bool UseAvx512VlIfAvailable { get; set; } = true;
public static bool UseAvx512BwIfAvailable { get; set; } = true;
public static bool UseAvx512DqIfAvailable { get; set; } = true;
public static bool UseF16cIfAvailable { get; set; } = true;
public static bool UseFmaIfAvailable { get; set; } = true;
public static bool UseAesniIfAvailable { get; set; } = true;
@@ -47,11 +51,18 @@ namespace ARMeilleure
internal static bool UseSse42 => UseSse42IfAvailable && X86HardwareCapabilities.SupportsSse42;
internal static bool UsePopCnt => UsePopCntIfAvailable && X86HardwareCapabilities.SupportsPopcnt;
internal static bool UseAvx => UseAvxIfAvailable && X86HardwareCapabilities.SupportsAvx && !ForceLegacySse;
internal static bool UseAvx512F => UseAvx512FIfAvailable && X86HardwareCapabilities.SupportsAvx512F && !ForceLegacySse;
internal static bool UseAvx512Vl => UseAvx512VlIfAvailable && X86HardwareCapabilities.SupportsAvx512Vl && !ForceLegacySse;
internal static bool UseAvx512Bw => UseAvx512BwIfAvailable && X86HardwareCapabilities.SupportsAvx512Bw && !ForceLegacySse;
internal static bool UseAvx512Dq => UseAvx512DqIfAvailable && X86HardwareCapabilities.SupportsAvx512Dq && !ForceLegacySse;
internal static bool UseF16c => UseF16cIfAvailable && X86HardwareCapabilities.SupportsF16c;
internal static bool UseFma => UseFmaIfAvailable && X86HardwareCapabilities.SupportsFma;
internal static bool UseAesni => UseAesniIfAvailable && X86HardwareCapabilities.SupportsAesni;
internal static bool UsePclmulqdq => UsePclmulqdqIfAvailable && X86HardwareCapabilities.SupportsPclmulqdq;
internal static bool UseSha => UseShaIfAvailable && X86HardwareCapabilities.SupportsSha;
internal static bool UseGfni => UseGfniIfAvailable && X86HardwareCapabilities.SupportsGfni;
internal static bool UseAvx512Ortho => UseAvx512F && UseAvx512Vl;
internal static bool UseAvx512OrthoFloat => UseAvx512Ortho && UseAvx512Dq;
}
}

View File

@@ -30,7 +30,7 @@ namespace ARMeilleure.Translation.PTC
private const string OuterHeaderMagicString = "PTCohd\0\0";
private const string InnerHeaderMagicString = "PTCihd\0\0";
private const uint InternalVersion = 4484; //! To be incremented manually for each change to the ARMeilleure project.
private const uint InternalVersion = 4485; //! To be incremented manually for each change to the ARMeilleure project.
private const string ActualDir = "0";
private const string BackupDir = "1";
@@ -969,6 +969,7 @@ namespace ARMeilleure.Translation.PTC
(ulong)Arm64HardwareCapabilities.LinuxFeatureInfoHwCap,
(ulong)Arm64HardwareCapabilities.LinuxFeatureInfoHwCap2,
(ulong)Arm64HardwareCapabilities.MacOsFeatureInfo,
0,
0);
}
else if (RuntimeInformation.ProcessArchitecture == Architecture.X64)
@@ -977,11 +978,12 @@ namespace ARMeilleure.Translation.PTC
(ulong)X86HardwareCapabilities.FeatureInfo1Ecx,
(ulong)X86HardwareCapabilities.FeatureInfo1Edx,
(ulong)X86HardwareCapabilities.FeatureInfo7Ebx,
(ulong)X86HardwareCapabilities.FeatureInfo7Ecx);
(ulong)X86HardwareCapabilities.FeatureInfo7Ecx,
(ulong)X86HardwareCapabilities.Xcr0InfoEax);
}
else
{
return new FeatureInfo(0, 0, 0, 0);
return new FeatureInfo(0, 0, 0, 0, 0);
}
}
@@ -1002,7 +1004,7 @@ namespace ARMeilleure.Translation.PTC
return osPlatform;
}
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 78*/)]
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 86*/)]
private struct OuterHeader
{
public ulong Magic;
@@ -1034,8 +1036,8 @@ namespace ARMeilleure.Translation.PTC
}
}
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 32*/)]
private record struct FeatureInfo(ulong FeatureInfo0, ulong FeatureInfo1, ulong FeatureInfo2, ulong FeatureInfo3);
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 40*/)]
private record struct FeatureInfo(ulong FeatureInfo0, ulong FeatureInfo1, ulong FeatureInfo2, ulong FeatureInfo3, ulong FeatureInfo4);
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 128*/)]
private struct InnerHeader

View File

@@ -15,7 +15,12 @@ namespace Ryujinx.Graphics.GAL
void BackgroundContextAction(Action action, bool alwaysBackground = false);
BufferHandle CreateBuffer(int size);
BufferHandle CreateBuffer(int size, BufferHandle storageHint);
BufferHandle CreateBuffer(int size)
{
return CreateBuffer(size, BufferHandle.Null);
}
IProgram CreateProgram(ShaderSource[] shaders, ShaderInfo info);
@@ -26,7 +31,7 @@ namespace Ryujinx.Graphics.GAL
void DeleteBuffer(BufferHandle buffer);
ReadOnlySpan<byte> GetBufferData(BufferHandle buffer, int offset, int size);
PinnedSpan<byte> GetBufferData(BufferHandle buffer, int offset, int size);
Capabilities GetCapabilities();
ulong GetCurrentSync();

View File

@@ -15,8 +15,8 @@ namespace Ryujinx.Graphics.GAL
ITexture CreateView(TextureCreateInfo info, int firstLayer, int firstLevel);
ReadOnlySpan<byte> GetData();
ReadOnlySpan<byte> GetData(int layer, int level);
PinnedSpan<byte> GetData();
PinnedSpan<byte> GetData(int layer, int level);
void SetData(SpanOrArray<byte> data);
void SetData(SpanOrArray<byte> data, int layer, int level);

View File

@@ -21,9 +21,9 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Buffer
public static void Run(ref BufferGetDataCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
ReadOnlySpan<byte> result = renderer.GetBufferData(threaded.Buffers.MapBuffer(command._buffer), command._offset, command._size);
PinnedSpan<byte> result = renderer.GetBufferData(threaded.Buffers.MapBuffer(command._buffer), command._offset, command._size);
command._result.Get(threaded).Result = new PinnedSpan<byte>(result);
command._result.Get(threaded).Result = result;
}
}
}

View File

@@ -5,16 +5,25 @@
public CommandType CommandType => CommandType.CreateBuffer;
private BufferHandle _threadedHandle;
private int _size;
private BufferHandle _storageHint;
public void Set(BufferHandle threadedHandle, int size)
public void Set(BufferHandle threadedHandle, int size, BufferHandle storageHint)
{
_threadedHandle = threadedHandle;
_size = size;
_storageHint = storageHint;
}
public static void Run(ref CreateBufferCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
threaded.Buffers.AssignBuffer(command._threadedHandle, renderer.CreateBuffer(command._size));
BufferHandle hint = BufferHandle.Null;
if (command._storageHint != BufferHandle.Null)
{
hint = threaded.Buffers.MapBuffer(command._storageHint);
}
threaded.Buffers.AssignBuffer(command._threadedHandle, renderer.CreateBuffer(command._size, hint));
}
}
}

View File

@@ -18,9 +18,9 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Texture
public static void Run(ref TextureGetDataCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
ReadOnlySpan<byte> result = command._texture.Get(threaded).Base.GetData();
PinnedSpan<byte> result = command._texture.Get(threaded).Base.GetData();
command._result.Get(threaded).Result = new PinnedSpan<byte>(result);
command._result.Get(threaded).Result = result;
}
}
}

View File

@@ -22,9 +22,9 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Texture
public static void Run(ref TextureGetDataSliceCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
ReadOnlySpan<byte> result = command._texture.Get(threaded).Base.GetData(command._layer, command._level);
PinnedSpan<byte> result = command._texture.Get(threaded).Base.GetData(command._layer, command._level);
command._result.Get(threaded).Result = new PinnedSpan<byte>(result);
command._result.Get(threaded).Result = result;
}
}
}

View File

@@ -1,23 +0,0 @@
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
namespace Ryujinx.Graphics.GAL.Multithreading.Model
{
unsafe struct PinnedSpan<T> where T : unmanaged
{
private void* _ptr;
private int _size;
public PinnedSpan(ReadOnlySpan<T> span)
{
_ptr = Unsafe.AsPointer(ref MemoryMarshal.GetReference(span));
_size = span.Length;
}
public ReadOnlySpan<T> Get()
{
return new ReadOnlySpan<T>(_ptr, _size * Unsafe.SizeOf<T>());
}
}
}

View File

@@ -72,7 +72,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources
return newTex;
}
public ReadOnlySpan<byte> GetData()
public PinnedSpan<byte> GetData()
{
if (_renderer.IsGpuThread())
{
@@ -80,7 +80,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources
_renderer.New<TextureGetDataCommand>().Set(Ref(this), Ref(box));
_renderer.InvokeCommand();
return box.Result.Get();
return box.Result;
}
else
{
@@ -90,7 +90,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources
}
}
public ReadOnlySpan<byte> GetData(int layer, int level)
public PinnedSpan<byte> GetData(int layer, int level)
{
if (_renderer.IsGpuThread())
{
@@ -98,7 +98,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources
_renderer.New<TextureGetDataSliceCommand>().Set(Ref(this), Ref(box), layer, level);
_renderer.InvokeCommand();
return box.Result.Get();
return box.Result;
}
else
{

View File

@@ -265,10 +265,10 @@ namespace Ryujinx.Graphics.GAL.Multithreading
}
}
public BufferHandle CreateBuffer(int size)
public BufferHandle CreateBuffer(int size, BufferHandle storageHint)
{
BufferHandle handle = Buffers.CreateBufferHandle();
New<CreateBufferCommand>().Set(handle, size);
New<CreateBufferCommand>().Set(handle, size, storageHint);
QueueCommand();
return handle;
@@ -329,7 +329,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading
QueueCommand();
}
public ReadOnlySpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
public PinnedSpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
{
if (IsGpuThread())
{
@@ -337,7 +337,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading
New<BufferGetDataCommand>().Set(buffer, offset, size, Ref(box));
InvokeCommand();
return box.Result.Get();
return box.Result;
}
else
{

View File

@@ -0,0 +1,53 @@
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
namespace Ryujinx.Graphics.GAL
{
public unsafe struct PinnedSpan<T> : IDisposable where T : unmanaged
{
private void* _ptr;
private int _size;
private Action _disposeAction;
/// <summary>
/// Creates a new PinnedSpan from an existing ReadOnlySpan. The span *must* be pinned in memory.
/// The data must be guaranteed to live until disposeAction is called.
/// </summary>
/// <param name="span">Existing span</param>
/// <param name="disposeAction">Action to call on dispose</param>
/// <remarks>
/// If a dispose action is not provided, it is safe to assume the resource will be available until the next call.
/// </remarks>
public static PinnedSpan<T> UnsafeFromSpan(ReadOnlySpan<T> span, Action disposeAction = null)
{
return new PinnedSpan<T>(Unsafe.AsPointer(ref MemoryMarshal.GetReference(span)), span.Length, disposeAction);
}
/// <summary>
/// Creates a new PinnedSpan from an existing unsafe region. The data must be guaranteed to live until disposeAction is called.
/// </summary>
/// <param name="ptr">Pointer to the region</param>
/// <param name="size">The total items of T the region contains</param>
/// <param name="disposeAction">Action to call on dispose</param>
/// <remarks>
/// If a dispose action is not provided, it is safe to assume the resource will be available until the next call.
/// </remarks>
public PinnedSpan(void* ptr, int size, Action disposeAction = null)
{
_ptr = ptr;
_size = size;
_disposeAction = disposeAction;
}
public ReadOnlySpan<T> Get()
{
return new ReadOnlySpan<T>(_ptr, _size * Unsafe.SizeOf<T>());
}
public void Dispose()
{
_disposeAction?.Invoke();
}
}
}

View File

@@ -180,7 +180,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
int firstInstance = (int)_state.State.FirstInstance;
int inlineIndexCount = _drawState.IbStreamer.GetAndResetInlineIndexCount();
int inlineIndexCount = _drawState.IbStreamer.GetAndResetInlineIndexCount(_context.Renderer);
if (inlineIndexCount != 0)
{
@@ -670,7 +670,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
{
if (indexedInline)
{
int inlineIndexCount = _drawState.IbStreamer.GetAndResetInlineIndexCount();
int inlineIndexCount = _drawState.IbStreamer.GetAndResetInlineIndexCount(_context.Renderer);
BufferRange br = new BufferRange(_drawState.IbStreamer.GetInlineIndexBuffer(), 0, inlineIndexCount * 4);
_channel.BufferManager.SetIndexBuffer(br, IndexType.UInt);

View File

@@ -11,9 +11,13 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
/// </summary>
struct IbStreamer
{
private const int BufferCapacity = 256; // Must be a power of 2.
private BufferHandle _inlineIndexBuffer;
private int _inlineIndexBufferSize;
private int _inlineIndexCount;
private uint[] _buffer;
private int _bufferOffset;
/// <summary>
/// Indicates if any index buffer data has been pushed.
@@ -38,9 +42,11 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
/// Gets the number of elements on the current inline index buffer,
/// while also reseting it to zero for the next draw.
/// </summary>
/// <param name="renderer">Host renderer</param>
/// <returns>Inline index bufffer count</returns>
public int GetAndResetInlineIndexCount()
public int GetAndResetInlineIndexCount(IRenderer renderer)
{
UpdateRemaining(renderer);
int temp = _inlineIndexCount;
_inlineIndexCount = 0;
return temp;
@@ -58,16 +64,12 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
byte i2 = (byte)(argument >> 16);
byte i3 = (byte)(argument >> 24);
Span<uint> data = stackalloc uint[4];
int offset = _inlineIndexCount;
data[0] = i0;
data[1] = i1;
data[2] = i2;
data[3] = i3;
int offset = _inlineIndexCount * 4;
renderer.SetBufferData(GetInlineIndexBuffer(renderer, offset), offset, MemoryMarshal.Cast<uint, byte>(data));
PushData(renderer, offset, i0);
PushData(renderer, offset + 1, i1);
PushData(renderer, offset + 2, i2);
PushData(renderer, offset + 3, i3);
_inlineIndexCount += 4;
}
@@ -82,14 +84,10 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
ushort i0 = (ushort)argument;
ushort i1 = (ushort)(argument >> 16);
Span<uint> data = stackalloc uint[2];
int offset = _inlineIndexCount;
data[0] = i0;
data[1] = i1;
int offset = _inlineIndexCount * 4;
renderer.SetBufferData(GetInlineIndexBuffer(renderer, offset), offset, MemoryMarshal.Cast<uint, byte>(data));
PushData(renderer, offset, i0);
PushData(renderer, offset + 1, i1);
_inlineIndexCount += 2;
}
@@ -103,13 +101,61 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
{
uint i0 = (uint)argument;
Span<uint> data = stackalloc uint[1];
int offset = _inlineIndexCount++;
data[0] = i0;
PushData(renderer, offset, i0);
}
int offset = _inlineIndexCount++ * 4;
/// <summary>
/// Pushes a 32-bit value to the index buffer.
/// </summary>
/// <param name="renderer">Host renderer</param>
/// <param name="offset">Offset where the data should be written, in 32-bit words</param>
/// <param name="value">Index value to be written</param>
private void PushData(IRenderer renderer, int offset, uint value)
{
if (_buffer == null)
{
_buffer = new uint[BufferCapacity];
}
renderer.SetBufferData(GetInlineIndexBuffer(renderer, offset), offset, MemoryMarshal.Cast<uint, byte>(data));
// We upload data in chunks.
// If we are at the start of a chunk, then the buffer might be full,
// in that case we need to submit any existing data before overwriting the buffer.
int subOffset = offset & (BufferCapacity - 1);
if (subOffset == 0 && offset != 0)
{
int baseOffset = (offset - BufferCapacity) * sizeof(uint);
BufferHandle buffer = GetInlineIndexBuffer(renderer, baseOffset, BufferCapacity * sizeof(uint));
renderer.SetBufferData(buffer, baseOffset, MemoryMarshal.Cast<uint, byte>(_buffer));
}
_buffer[subOffset] = value;
}
/// <summary>
/// Makes sure that any pending data is submitted to the GPU before the index buffer is used.
/// </summary>
/// <param name="renderer">Host renderer</param>
private void UpdateRemaining(IRenderer renderer)
{
int offset = _inlineIndexCount;
if (offset == 0)
{
return;
}
int count = offset & (BufferCapacity - 1);
if (count == 0)
{
count = BufferCapacity;
}
int baseOffset = (offset - count) * sizeof(uint);
int length = count * sizeof(uint);
BufferHandle buffer = GetInlineIndexBuffer(renderer, baseOffset, length);
renderer.SetBufferData(buffer, baseOffset, MemoryMarshal.Cast<uint, byte>(_buffer).Slice(0, length));
}
/// <summary>
@@ -117,12 +163,13 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
/// </summary>
/// <param name="renderer">Host renderer</param>
/// <param name="offset">Offset where the data will be written</param>
/// <param name="length">Number of bytes that will be written</param>
/// <returns>Buffer handle</returns>
private BufferHandle GetInlineIndexBuffer(IRenderer renderer, int offset)
private BufferHandle GetInlineIndexBuffer(IRenderer renderer, int offset, int length)
{
// Calculate a reasonable size for the buffer that can fit all the data,
// and that also won't require frequent resizes if we need to push more data.
int size = BitUtils.AlignUp(offset + 0x10, 0x200);
int size = BitUtils.AlignUp(offset + length + 0x10, 0x200);
if (_inlineIndexBuffer == BufferHandle.Null)
{

View File

@@ -1022,13 +1022,12 @@ namespace Ryujinx.Graphics.Gpu.Image
/// This method should be used to retrieve data that was modified by the host GPU.
/// This is not cheap, avoid doing that unless strictly needed.
/// </remarks>
/// <param name="output">An output span to place the texture data into. If empty, one is generated</param>
/// <param name="output">An output span to place the texture data into</param>
/// <param name="blacklist">True if the texture should be blacklisted, false otherwise</param>
/// <param name="texture">The specific host texture to flush. Defaults to this texture</param>
/// <returns>The span containing the texture data</returns>
private ReadOnlySpan<byte> GetTextureDataFromGpu(Span<byte> output, bool blacklist, ITexture texture = null)
private void GetTextureDataFromGpu(Span<byte> output, bool blacklist, ITexture texture = null)
{
ReadOnlySpan<byte> data;
PinnedSpan<byte> data;
if (texture != null)
{
@@ -1054,9 +1053,9 @@ namespace Ryujinx.Graphics.Gpu.Image
}
}
data = ConvertFromHostCompatibleFormat(output, data);
ConvertFromHostCompatibleFormat(output, data.Get());
return data;
data.Dispose();
}
/// <summary>
@@ -1071,10 +1070,9 @@ namespace Ryujinx.Graphics.Gpu.Image
/// <param name="level">The level of the texture to flush</param>
/// <param name="blacklist">True if the texture should be blacklisted, false otherwise</param>
/// <param name="texture">The specific host texture to flush. Defaults to this texture</param>
/// <returns>The span containing the texture data</returns>
public ReadOnlySpan<byte> GetTextureDataSliceFromGpu(Span<byte> output, int layer, int level, bool blacklist, ITexture texture = null)
public void GetTextureDataSliceFromGpu(Span<byte> output, int layer, int level, bool blacklist, ITexture texture = null)
{
ReadOnlySpan<byte> data;
PinnedSpan<byte> data;
if (texture != null)
{
@@ -1100,9 +1098,9 @@ namespace Ryujinx.Graphics.Gpu.Image
}
}
data = ConvertFromHostCompatibleFormat(output, data, level, true);
ConvertFromHostCompatibleFormat(output, data.Get(), level, true);
return data;
data.Dispose();
}
/// <summary>

View File

@@ -130,6 +130,10 @@ namespace Ryujinx.Graphics.Gpu.Image
return ref descriptor;
}
}
else
{
texture.SynchronizeMemory();
}
Items[id] = texture;
@@ -233,7 +237,7 @@ namespace Ryujinx.Graphics.Gpu.Image
}
/// <summary>
/// Queues a request to update a texture's mapping.
/// Queues a request to update a texture's mapping.
/// Mapping is updated later to avoid deleting the texture if it is still sparsely mapped.
/// </summary>
/// <param name="texture">Texture with potential mapping change</param>

View File

@@ -82,7 +82,7 @@ namespace Ryujinx.Graphics.Gpu.Memory
Address = address;
Size = size;
Handle = context.Renderer.CreateBuffer((int)size);
Handle = context.Renderer.CreateBuffer((int)size, baseBuffers?.MaxBy(x => x.Size).Handle ?? BufferHandle.Null);
_useGranular = size > GranularBufferThreshold;
@@ -415,10 +415,10 @@ namespace Ryujinx.Graphics.Gpu.Memory
{
int offset = (int)(address - Address);
ReadOnlySpan<byte> data = _context.Renderer.GetBufferData(Handle, offset, (int)size);
using PinnedSpan<byte> data = _context.Renderer.GetBufferData(Handle, offset, (int)size);
// TODO: When write tracking shaders, they will need to be aware of changes in overlapping buffers.
_physicalMemory.WriteUntracked(address, data);
_physicalMemory.WriteUntracked(address, data.Get());
}
/// <summary>

View File

@@ -55,11 +55,14 @@ namespace Ryujinx.Graphics.OpenGL
(IntPtr)size);
}
public static unsafe ReadOnlySpan<byte> GetData(OpenGLRenderer renderer, BufferHandle buffer, int offset, int size)
public static unsafe PinnedSpan<byte> GetData(OpenGLRenderer renderer, BufferHandle buffer, int offset, int size)
{
// Data in the persistent buffer and host array is guaranteed to be available
// until the next time the host thread requests data.
if (HwCapabilities.UsePersistentBufferForFlush)
{
return renderer.PersistentBuffers.Default.GetBufferData(buffer, offset, size);
return PinnedSpan<byte>.UnsafeFromSpan(renderer.PersistentBuffers.Default.GetBufferData(buffer, offset, size));
}
else
{
@@ -69,7 +72,7 @@ namespace Ryujinx.Graphics.OpenGL
GL.GetBufferSubData(BufferTarget.CopyReadBuffer, (IntPtr)offset, size, target);
return new ReadOnlySpan<byte>(target.ToPointer(), size);
return new PinnedSpan<byte>(target.ToPointer(), size);
}
}

View File

@@ -39,12 +39,12 @@ namespace Ryujinx.Graphics.OpenGL.Image
throw new NotSupportedException();
}
public ReadOnlySpan<byte> GetData()
public PinnedSpan<byte> GetData()
{
return Buffer.GetData(_renderer, _buffer, _bufferOffset, _bufferSize);
}
public ReadOnlySpan<byte> GetData(int layer, int level)
public PinnedSpan<byte> GetData(int layer, int level)
{
return GetData();
}

View File

@@ -166,7 +166,7 @@ namespace Ryujinx.Graphics.OpenGL.Image
_renderer.TextureCopy.Copy(this, (TextureView)destination, srcRegion, dstRegion, linearFilter);
}
public unsafe ReadOnlySpan<byte> GetData()
public unsafe PinnedSpan<byte> GetData()
{
int size = 0;
int levels = Info.GetLevelsClamped();
@@ -196,16 +196,16 @@ namespace Ryujinx.Graphics.OpenGL.Image
data = FormatConverter.ConvertD24S8ToS8D24(data);
}
return data;
return PinnedSpan<byte>.UnsafeFromSpan(data);
}
public unsafe ReadOnlySpan<byte> GetData(int layer, int level)
public unsafe PinnedSpan<byte> GetData(int layer, int level)
{
int size = Info.GetMipSize(level);
if (HwCapabilities.UsePersistentBufferForFlush)
{
return _renderer.PersistentBuffers.Default.GetTextureData(this, size, layer, level);
return PinnedSpan<byte>.UnsafeFromSpan(_renderer.PersistentBuffers.Default.GetTextureData(this, size, layer, level));
}
else
{
@@ -213,7 +213,7 @@ namespace Ryujinx.Graphics.OpenGL.Image
int offset = WriteTo2D(target, layer, level);
return new ReadOnlySpan<byte>(target.ToPointer(), size).Slice(offset);
return new PinnedSpan<byte>((byte*)target.ToPointer() + offset, size);
}
}

View File

@@ -57,7 +57,7 @@ namespace Ryujinx.Graphics.OpenGL
ResourcePool = new ResourcePool();
}
public BufferHandle CreateBuffer(int size)
public BufferHandle CreateBuffer(int size, BufferHandle storageHint)
{
BufferCount++;
@@ -96,7 +96,7 @@ namespace Ryujinx.Graphics.OpenGL
return new HardwareInfo(GpuVendor, GpuRenderer);
}
public ReadOnlySpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
public PinnedSpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
{
return Buffer.GetData(this, buffer, offset, size);
}

View File

@@ -0,0 +1,12 @@
namespace Ryujinx.Graphics.Vulkan
{
internal enum BufferAllocationType
{
Auto = 0,
HostMappedNoCache,
HostMapped,
DeviceLocal,
DeviceLocalMapped
}
}

View File

@@ -1,7 +1,10 @@
using Ryujinx.Graphics.GAL;
using Ryujinx.Common.Logging;
using Ryujinx.Graphics.GAL;
using Silk.NET.Vulkan;
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using VkBuffer = Silk.NET.Vulkan.Buffer;
using VkFormat = Silk.NET.Vulkan.Format;
@@ -11,6 +14,12 @@ namespace Ryujinx.Graphics.Vulkan
{
private const int MaxUpdateBufferSize = 0x10000;
private const int SetCountThreshold = 100;
private const int WriteCountThreshold = 50;
private const int FlushCountThreshold = 5;
public const int DeviceLocalSizeThreshold = 256 * 1024; // 256kb
public const AccessFlags DefaultAccessFlags =
AccessFlags.IndirectCommandReadBit |
AccessFlags.ShaderReadBit |
@@ -21,10 +30,10 @@ namespace Ryujinx.Graphics.Vulkan
private readonly VulkanRenderer _gd;
private readonly Device _device;
private readonly MemoryAllocation _allocation;
private readonly Auto<DisposableBuffer> _buffer;
private readonly Auto<MemoryAllocation> _allocationAuto;
private readonly ulong _bufferHandle;
private MemoryAllocation _allocation;
private Auto<DisposableBuffer> _buffer;
private Auto<MemoryAllocation> _allocationAuto;
private ulong _bufferHandle;
private CacheByRange<BufferHolder> _cachedConvertedBuffers;
@@ -32,11 +41,28 @@ namespace Ryujinx.Graphics.Vulkan
private IntPtr _map;
private readonly MultiFenceHolder _waitable;
private MultiFenceHolder _waitable;
private bool _lastAccessIsWrite;
public BufferHolder(VulkanRenderer gd, Device device, VkBuffer buffer, MemoryAllocation allocation, int size)
private BufferAllocationType _baseType;
private BufferAllocationType _currentType;
private bool _swapQueued;
public BufferAllocationType DesiredType { get; private set; }
private int _setCount;
private int _writeCount;
private int _flushCount;
private int _flushTemp;
private ReaderWriterLock _flushLock;
private FenceHolder _flushFence;
private int _flushWaiting;
private List<Action> _swapActions;
public BufferHolder(VulkanRenderer gd, Device device, VkBuffer buffer, MemoryAllocation allocation, int size, BufferAllocationType type, BufferAllocationType currentType)
{
_gd = gd;
_device = device;
@@ -47,9 +73,153 @@ namespace Ryujinx.Graphics.Vulkan
_bufferHandle = buffer.Handle;
Size = size;
_map = allocation.HostPointer;
_baseType = type;
_currentType = currentType;
DesiredType = currentType;
_flushLock = new ReaderWriterLock();
}
public unsafe Auto<DisposableBufferView> CreateView(VkFormat format, int offset, int size)
public bool TryBackingSwap(ref CommandBufferScoped? cbs)
{
if (_swapQueued && DesiredType != _currentType)
{
// Only swap if the buffer is not used in any queued command buffer.
bool isRented = _buffer.HasRentedCommandBufferDependency(_gd.CommandBufferPool);
if (!isRented && _gd.CommandBufferPool.OwnedByCurrentThread && !_flushLock.IsReaderLockHeld)
{
var currentAllocation = _allocationAuto;
var currentBuffer = _buffer;
IntPtr currentMap = _map;
(VkBuffer buffer, MemoryAllocation allocation, BufferAllocationType resultType) = _gd.BufferManager.CreateBacking(_gd, Size, DesiredType, false, _currentType);
if (buffer.Handle != 0)
{
_flushLock.AcquireWriterLock(Timeout.Infinite);
ClearFlushFence();
_waitable = new MultiFenceHolder(Size);
_allocation = allocation;
_allocationAuto = new Auto<MemoryAllocation>(allocation);
_buffer = new Auto<DisposableBuffer>(new DisposableBuffer(_gd.Api, _device, buffer), _waitable, _allocationAuto);
_bufferHandle = buffer.Handle;
_map = allocation.HostPointer;
if (_map != IntPtr.Zero && currentMap != IntPtr.Zero)
{
// Copy data directly. Readbacks don't have to wait if this is done.
unsafe
{
new Span<byte>((void*)currentMap, Size).CopyTo(new Span<byte>((void*)_map, Size));
}
}
else
{
if (cbs == null)
{
cbs = _gd.CommandBufferPool.Rent();
}
CommandBufferScoped cbsV = cbs.Value;
Copy(_gd, cbsV, currentBuffer, _buffer, 0, 0, Size);
// Need to wait for the data to reach the new buffer before data can be flushed.
_flushFence = _gd.CommandBufferPool.GetFence(cbsV.CommandBufferIndex);
_flushFence.Get();
}
Logger.Debug?.PrintMsg(LogClass.Gpu, $"Converted {Size} buffer {_currentType} to {resultType}");
_currentType = resultType;
if (_swapActions != null)
{
foreach (var action in _swapActions)
{
action();
}
_swapActions.Clear();
}
currentBuffer.Dispose();
currentAllocation.Dispose();
_gd.PipelineInternal.SwapBuffer(currentBuffer, _buffer);
_flushLock.ReleaseWriterLock();
}
_swapQueued = false;
return true;
}
else
{
return false;
}
}
else
{
_swapQueued = false;
return true;
}
}
private void ConsiderBackingSwap()
{
if (_baseType == BufferAllocationType.Auto)
{
if (_writeCount >= WriteCountThreshold || _setCount >= SetCountThreshold || _flushCount >= FlushCountThreshold)
{
if (_flushCount > 0 || _flushTemp-- > 0)
{
// Buffers that flush should ideally be mapped in host address space for easy copies.
// If the buffer is large it will do better on GPU memory, as there will be more writes than data flushes (typically individual pages).
// If it is small, then it's likely most of the buffer will be flushed so we want it on host memory, as access is cached.
DesiredType = Size > DeviceLocalSizeThreshold ? BufferAllocationType.DeviceLocalMapped : BufferAllocationType.HostMapped;
// It's harder for a buffer that is flushed to revert to another type of mapping.
if (_flushCount > 0)
{
_flushTemp = 1000;
}
}
else if (_writeCount >= WriteCountThreshold)
{
// Buffers that are written often should ideally be in the device local heap. (Storage buffers)
DesiredType = BufferAllocationType.DeviceLocal;
}
else if (_setCount > SetCountThreshold)
{
// Buffers that have their data set often should ideally be host mapped. (Constant buffers)
DesiredType = BufferAllocationType.HostMapped;
}
_flushCount = 0;
_writeCount = 0;
_setCount = 0;
}
if (!_swapQueued && DesiredType != _currentType)
{
_swapQueued = true;
_gd.PipelineInternal.AddBackingSwap(this);
}
}
}
public unsafe Auto<DisposableBufferView> CreateView(VkFormat format, int offset, int size, Action invalidateView)
{
var bufferViewCreateInfo = new BufferViewCreateInfo()
{
@@ -62,9 +232,19 @@ namespace Ryujinx.Graphics.Vulkan
_gd.Api.CreateBufferView(_device, bufferViewCreateInfo, null, out var bufferView).ThrowOnError();
(_swapActions ??= new List<Action>()).Add(invalidateView);
return new Auto<DisposableBufferView>(new DisposableBufferView(_gd.Api, _device, bufferView), _waitable, _buffer);
}
public void InheritMetrics(BufferHolder other)
{
_setCount = other._setCount;
_writeCount = other._writeCount;
_flushCount = other._flushCount;
_flushTemp = other._flushTemp;
}
public unsafe void InsertBarrier(CommandBuffer commandBuffer, bool isWrite)
{
// If the last access is write, we always need a barrier to be sure we will read or modify
@@ -104,12 +284,22 @@ namespace Ryujinx.Graphics.Vulkan
return _buffer;
}
public Auto<DisposableBuffer> GetBuffer(CommandBuffer commandBuffer, bool isWrite = false)
public Auto<DisposableBuffer> GetBuffer(CommandBuffer commandBuffer, bool isWrite = false, bool isSSBO = false)
{
if (isWrite)
{
_writeCount++;
SignalWrite(0, Size);
}
else if (isSSBO)
{
// Always consider SSBO access for swapping to device local memory.
_writeCount++;
ConsiderBackingSwap();
}
return _buffer;
}
@@ -118,6 +308,8 @@ namespace Ryujinx.Graphics.Vulkan
{
if (isWrite)
{
_writeCount++;
SignalWrite(offset, size);
}
@@ -126,6 +318,8 @@ namespace Ryujinx.Graphics.Vulkan
public void SignalWrite(int offset, int size)
{
ConsiderBackingSwap();
if (offset == 0 && size == Size)
{
_cachedConvertedBuffers.Clear();
@@ -147,11 +341,76 @@ namespace Ryujinx.Graphics.Vulkan
return _map;
}
public unsafe ReadOnlySpan<byte> GetData(int offset, int size)
private void ClearFlushFence()
{
// Asusmes _flushLock is held as writer.
if (_flushFence != null)
{
if (_flushWaiting == 0)
{
_flushFence.Put();
}
_flushFence = null;
}
}
private void WaitForFlushFence()
{
// Assumes the _flushLock is held as reader, returns in same state.
if (_flushFence != null)
{
// If storage has changed, make sure the fence has been reached so that the data is in place.
var cookie = _flushLock.UpgradeToWriterLock(Timeout.Infinite);
if (_flushFence != null)
{
var fence = _flushFence;
Interlocked.Increment(ref _flushWaiting);
// Don't wait in the lock.
var restoreCookie = _flushLock.ReleaseLock();
fence.Wait();
_flushLock.RestoreLock(ref restoreCookie);
if (Interlocked.Decrement(ref _flushWaiting) == 0)
{
fence.Put();
}
_flushFence = null;
}
_flushLock.DowngradeFromWriterLock(ref cookie);
}
}
public unsafe PinnedSpan<byte> GetData(int offset, int size)
{
_flushLock.AcquireReaderLock(Timeout.Infinite);
WaitForFlushFence();
_flushCount++;
Span<byte> result;
if (_map != IntPtr.Zero)
{
return GetDataStorage(offset, size);
result = GetDataStorage(offset, size);
// Need to be careful here, the buffer can't be unmapped while the data is being used.
_buffer.IncrementReferenceCount();
_flushLock.ReleaseReaderLock();
return PinnedSpan<byte>.UnsafeFromSpan(result, _buffer.DecrementReferenceCount);
}
else
{
@@ -161,12 +420,17 @@ namespace Ryujinx.Graphics.Vulkan
{
_gd.FlushAllCommands();
return resource.GetFlushBuffer().GetBufferData(_gd.CommandBufferPool, this, offset, size);
result = resource.GetFlushBuffer().GetBufferData(_gd.CommandBufferPool, this, offset, size);
}
else
{
return resource.GetFlushBuffer().GetBufferData(resource.GetPool(), this, offset, size);
result = resource.GetFlushBuffer().GetBufferData(resource.GetPool(), this, offset, size);
}
_flushLock.ReleaseReaderLock();
// Flush buffer is pinned until the next GetBufferData on the thread, which is fine for current uses.
return PinnedSpan<byte>.UnsafeFromSpan(result);
}
}
@@ -190,6 +454,8 @@ namespace Ryujinx.Graphics.Vulkan
return;
}
_setCount++;
if (_map != IntPtr.Zero)
{
// If persistently mapped, set the data directly if the buffer is not currently in use.
@@ -268,6 +534,8 @@ namespace Ryujinx.Graphics.Vulkan
var dstBuffer = GetBuffer(cbs.CommandBuffer, dstOffset, data.Length, true).Get(cbs, dstOffset, data.Length).Value;
_writeCount--;
InsertBufferBarrier(
_gd,
cbs.CommandBuffer,
@@ -502,11 +770,19 @@ namespace Ryujinx.Graphics.Vulkan
public void Dispose()
{
_swapQueued = false;
_gd.PipelineInternal?.FlushCommandsIfWeightExceeding(_buffer, (ulong)Size);
_buffer.Dispose();
_allocationAuto.Dispose();
_cachedConvertedBuffers.Dispose();
_flushLock.AcquireWriterLock(Timeout.Infinite);
ClearFlushFence();
_flushLock.ReleaseWriterLock();
}
}
}

View File

@@ -4,6 +4,7 @@ using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using VkFormat = Silk.NET.Vulkan.Format;
using VkBuffer = Silk.NET.Vulkan.Buffer;
namespace Ryujinx.Graphics.Vulkan
{
@@ -16,17 +17,17 @@ namespace Ryujinx.Graphics.Vulkan
// Some drivers don't expose a "HostCached" memory type,
// so we need those alternative flags for the allocation to succeed there.
private const MemoryPropertyFlags DefaultBufferMemoryAltFlags =
private const MemoryPropertyFlags DefaultBufferMemoryNoCacheFlags =
MemoryPropertyFlags.HostVisibleBit |
MemoryPropertyFlags.HostCoherentBit;
private const MemoryPropertyFlags DeviceLocalBufferMemoryFlags =
MemoryPropertyFlags.DeviceLocalBit;
private const MemoryPropertyFlags FlushableDeviceLocalBufferMemoryFlags =
private const MemoryPropertyFlags DeviceLocalMappedBufferMemoryFlags =
MemoryPropertyFlags.DeviceLocalBit |
MemoryPropertyFlags.HostVisibleBit |
MemoryPropertyFlags.HostCoherentBit |
MemoryPropertyFlags.DeviceLocalBit;
MemoryPropertyFlags.HostCoherentBit;
private const BufferUsageFlags DefaultBufferUsageFlags =
BufferUsageFlags.TransferSrcBit |
@@ -54,14 +55,14 @@ namespace Ryujinx.Graphics.Vulkan
StagingBuffer = new StagingBuffer(gd, this);
}
public BufferHandle CreateWithHandle(VulkanRenderer gd, int size, bool deviceLocal)
public BufferHandle CreateWithHandle(VulkanRenderer gd, int size, BufferAllocationType baseType = BufferAllocationType.HostMapped, BufferHandle storageHint = default)
{
return CreateWithHandle(gd, size, deviceLocal, out _);
return CreateWithHandle(gd, size, out _, baseType, storageHint);
}
public BufferHandle CreateWithHandle(VulkanRenderer gd, int size, bool deviceLocal, out BufferHolder holder)
public BufferHandle CreateWithHandle(VulkanRenderer gd, int size, out BufferHolder holder, BufferAllocationType baseType = BufferAllocationType.HostMapped, BufferHandle storageHint = default)
{
holder = Create(gd, size, deviceLocal: deviceLocal);
holder = Create(gd, size, baseType: baseType, storageHint: storageHint);
if (holder == null)
{
return BufferHandle.Null;
@@ -74,7 +75,12 @@ namespace Ryujinx.Graphics.Vulkan
return Unsafe.As<ulong, BufferHandle>(ref handle64);
}
public unsafe BufferHolder Create(VulkanRenderer gd, int size, bool forConditionalRendering = false, bool deviceLocal = false)
public unsafe (VkBuffer buffer, MemoryAllocation allocation, BufferAllocationType resultType) CreateBacking(
VulkanRenderer gd,
int size,
BufferAllocationType type,
bool forConditionalRendering = false,
BufferAllocationType fallbackType = BufferAllocationType.Auto)
{
var usage = DefaultBufferUsageFlags;
@@ -98,48 +104,106 @@ namespace Ryujinx.Graphics.Vulkan
gd.Api.CreateBuffer(_device, in bufferCreateInfo, null, out var buffer).ThrowOnError();
gd.Api.GetBufferMemoryRequirements(_device, buffer, out var requirements);
MemoryPropertyFlags allocateFlags;
MemoryPropertyFlags allocateFlagsAlt;
MemoryAllocation allocation;
if (deviceLocal)
do
{
allocateFlags = DeviceLocalBufferMemoryFlags;
allocateFlagsAlt = DeviceLocalBufferMemoryFlags;
}
else
{
allocateFlags = DefaultBufferMemoryFlags;
allocateFlagsAlt = DefaultBufferMemoryAltFlags;
}
var allocateFlags = type switch
{
BufferAllocationType.HostMappedNoCache => DefaultBufferMemoryNoCacheFlags,
BufferAllocationType.HostMapped => DefaultBufferMemoryFlags,
BufferAllocationType.DeviceLocal => DeviceLocalBufferMemoryFlags,
BufferAllocationType.DeviceLocalMapped => DeviceLocalMappedBufferMemoryFlags,
_ => DefaultBufferMemoryFlags
};
var allocation = gd.MemoryAllocator.AllocateDeviceMemory(requirements, allocateFlags, allocateFlagsAlt);
// If an allocation with this memory type fails, fall back to the previous one.
try
{
allocation = gd.MemoryAllocator.AllocateDeviceMemory(requirements, allocateFlags, true);
}
catch (VulkanException)
{
allocation = default;
}
}
while (allocation.Memory.Handle == 0 && (--type != fallbackType));
if (allocation.Memory.Handle == 0UL)
{
gd.Api.DestroyBuffer(_device, buffer, null);
return null;
return default;
}
gd.Api.BindBufferMemory(_device, buffer, allocation.Memory, allocation.Offset);
return new BufferHolder(gd, _device, buffer, allocation, size);
return (buffer, allocation, type);
}
public Auto<DisposableBufferView> CreateView(BufferHandle handle, VkFormat format, int offset, int size)
public unsafe BufferHolder Create(
VulkanRenderer gd,
int size,
bool forConditionalRendering = false,
BufferAllocationType baseType = BufferAllocationType.HostMapped,
BufferHandle storageHint = default)
{
if (TryGetBuffer(handle, out var holder))
BufferAllocationType type = baseType;
BufferHolder storageHintHolder = null;
if (baseType == BufferAllocationType.Auto)
{
return holder.CreateView(format, offset, size);
if (gd.IsSharedMemory)
{
baseType = BufferAllocationType.HostMapped;
type = baseType;
}
else
{
type = size >= BufferHolder.DeviceLocalSizeThreshold ? BufferAllocationType.DeviceLocal : BufferAllocationType.HostMapped;
}
if (storageHint != BufferHandle.Null)
{
if (TryGetBuffer(storageHint, out storageHintHolder))
{
type = storageHintHolder.DesiredType;
}
}
}
(VkBuffer buffer, MemoryAllocation allocation, BufferAllocationType resultType) =
CreateBacking(gd, size, type, forConditionalRendering);
if (buffer.Handle != 0)
{
var holder = new BufferHolder(gd, _device, buffer, allocation, size, baseType, resultType);
if (storageHintHolder != null)
{
holder.InheritMetrics(storageHintHolder);
}
return holder;
}
return null;
}
public Auto<DisposableBuffer> GetBuffer(CommandBuffer commandBuffer, BufferHandle handle, bool isWrite)
public Auto<DisposableBufferView> CreateView(BufferHandle handle, VkFormat format, int offset, int size, Action invalidateView)
{
if (TryGetBuffer(handle, out var holder))
{
return holder.GetBuffer(commandBuffer, isWrite);
return holder.CreateView(format, offset, size, invalidateView);
}
return null;
}
public Auto<DisposableBuffer> GetBuffer(CommandBuffer commandBuffer, BufferHandle handle, bool isWrite, bool isSSBO = false)
{
if (TryGetBuffer(handle, out var holder))
{
return holder.GetBuffer(commandBuffer, isWrite, isSSBO);
}
return null;
@@ -332,14 +396,14 @@ namespace Ryujinx.Graphics.Vulkan
return null;
}
public ReadOnlySpan<byte> GetData(BufferHandle handle, int offset, int size)
public PinnedSpan<byte> GetData(BufferHandle handle, int offset, int size)
{
if (TryGetBuffer(handle, out var holder))
{
return holder.GetData(offset, size);
}
return ReadOnlySpan<byte>.Empty;
return new PinnedSpan<byte>();
}
public void SetData<T>(BufferHandle handle, int offset, ReadOnlySpan<T> data) where T : unmanaged

View File

@@ -2,14 +2,14 @@
namespace Ryujinx.Graphics.Vulkan
{
readonly struct BufferState : IDisposable
struct BufferState : IDisposable
{
public static BufferState Null => new BufferState(null, 0, 0);
private readonly int _offset;
private readonly int _size;
private readonly Auto<DisposableBuffer> _buffer;
private Auto<DisposableBuffer> _buffer;
public BufferState(Auto<DisposableBuffer> buffer, int offset, int size)
{
@@ -29,6 +29,17 @@ namespace Ryujinx.Graphics.Vulkan
}
}
public void Swap(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
if (_buffer == from)
{
_buffer.DecrementReferenceCount();
to.IncrementReferenceCount();
_buffer = to;
}
}
public void Dispose()
{
_buffer?.DecrementReferenceCount();

View File

@@ -94,7 +94,7 @@ namespace Ryujinx.Graphics.Vulkan
else
{
// If null descriptors are not supported, we need to pass the handle of a dummy buffer on unused bindings.
_dummyBuffer = gd.BufferManager.Create(gd, 0x10000, forConditionalRendering: false, deviceLocal: true);
_dummyBuffer = gd.BufferManager.Create(gd, 0x10000, forConditionalRendering: false, baseType: BufferAllocationType.DeviceLocal);
}
_dummyTexture = gd.CreateTextureView(new TextureCreateInfo(
@@ -178,7 +178,7 @@ namespace Ryujinx.Graphics.Vulkan
var buffer = assignment.Range;
int index = assignment.Binding;
Auto<DisposableBuffer> vkBuffer = _gd.BufferManager.GetBuffer(commandBuffer, buffer.Handle, false);
Auto<DisposableBuffer> vkBuffer = _gd.BufferManager.GetBuffer(commandBuffer, buffer.Handle, false, isSSBO: true);
ref Auto<DisposableBuffer> currentVkBuffer = ref _storageBufferRefs[index];
DescriptorBufferInfo info = new DescriptorBufferInfo()
@@ -640,6 +640,23 @@ namespace Ryujinx.Graphics.Vulkan
Array.Clear(_storageSet);
}
private void SwapBuffer(Auto<DisposableBuffer>[] list, Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
for (int i = 0; i < list.Length; i++)
{
if (list[i] == from)
{
list[i] = to;
}
}
}
public void SwapBuffer(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
SwapBuffer(_uniformBufferRefs, from, to);
SwapBuffer(_storageBufferRefs, from, to);
}
protected virtual void Dispose(bool disposing)
{
if (disposing)

View File

@@ -156,11 +156,11 @@ namespace Ryujinx.Graphics.Vulkan.Effects
};
int rangeSize = dimensionsBuffer.Length * sizeof(float);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize, false);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize);
_renderer.BufferManager.SetData(bufferHandle, 0, dimensionsBuffer);
ReadOnlySpan<float> sharpeningBuffer = stackalloc float[] { 1.5f - (Level * 0.01f * 1.5f)};
var sharpeningBufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, sizeof(float), false);
var sharpeningBufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, sizeof(float));
_renderer.BufferManager.SetData(sharpeningBufferHandle, 0, sharpeningBuffer);
int threadGroupWorkRegionDim = 16;

View File

@@ -87,7 +87,7 @@ namespace Ryujinx.Graphics.Vulkan.Effects
ReadOnlySpan<float> resolutionBuffer = stackalloc float[] { view.Width, view.Height };
int rangeSize = resolutionBuffer.Length * sizeof(float);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize, false);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize);
_renderer.BufferManager.SetData(bufferHandle, 0, resolutionBuffer);

View File

@@ -266,7 +266,7 @@ namespace Ryujinx.Graphics.Vulkan.Effects
ReadOnlySpan<float> resolutionBuffer = stackalloc float[] { view.Width, view.Height };
int rangeSize = resolutionBuffer.Length * sizeof(float);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize, false);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize);
_renderer.BufferManager.SetData(bufferHandle, 0, resolutionBuffer);
var bufferRanges = new BufferRange(bufferHandle, 0, rangeSize);

View File

@@ -394,7 +394,7 @@ namespace Ryujinx.Graphics.Vulkan
(region[2], region[3]) = (region[3], region[2]);
}
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize);
gd.BufferManager.SetData<float>(bufferHandle, 0, region);
@@ -495,7 +495,7 @@ namespace Ryujinx.Graphics.Vulkan
(region[2], region[3]) = (region[3], region[2]);
}
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize);
gd.BufferManager.SetData<float>(bufferHandle, 0, region);
@@ -649,7 +649,7 @@ namespace Ryujinx.Graphics.Vulkan
_pipeline.SetCommandBuffer(cbs);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ClearColorBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ClearColorBufferSize);
gd.BufferManager.SetData<float>(bufferHandle, 0, clearColor);
@@ -726,7 +726,7 @@ namespace Ryujinx.Graphics.Vulkan
(region[2], region[3]) = (region[3], region[2]);
}
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize);
gd.BufferManager.SetData<float>(bufferHandle, 0, region);
@@ -802,7 +802,7 @@ namespace Ryujinx.Graphics.Vulkan
shaderParams[2] = size;
shaderParams[3] = srcOffset;
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize);
gd.BufferManager.SetData<int>(bufferHandle, 0, shaderParams);
@@ -958,7 +958,7 @@ namespace Ryujinx.Graphics.Vulkan
shaderParams[0] = BitOperations.Log2((uint)ratio);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize);
gd.BufferManager.SetData<int>(bufferHandle, 0, shaderParams);
@@ -1050,7 +1050,7 @@ namespace Ryujinx.Graphics.Vulkan
(shaderParams[0], shaderParams[1]) = GetSampleCountXYLog2(samples);
(shaderParams[2], shaderParams[3]) = GetSampleCountXYLog2((int)TextureStorage.ConvertToSampleCountFlags(gd.Capabilities.SupportedSampleCounts, (uint)samples));
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize);
gd.BufferManager.SetData<int>(bufferHandle, 0, shaderParams);
@@ -1133,7 +1133,7 @@ namespace Ryujinx.Graphics.Vulkan
(shaderParams[0], shaderParams[1]) = GetSampleCountXYLog2(samples);
(shaderParams[2], shaderParams[3]) = GetSampleCountXYLog2((int)TextureStorage.ConvertToSampleCountFlags(gd.Capabilities.SupportedSampleCounts, (uint)samples));
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize);
gd.BufferManager.SetData<int>(bufferHandle, 0, shaderParams);
@@ -1407,7 +1407,7 @@ namespace Ryujinx.Graphics.Vulkan
pattern.OffsetIndex.CopyTo(shaderParams.Slice(0, pattern.OffsetIndex.Length));
var patternBufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false, out var patternBuffer);
var patternBufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, out var patternBuffer);
var patternBufferAuto = patternBuffer.GetBuffer();
gd.BufferManager.SetData<int>(patternBufferHandle, 0, shaderParams);

View File

@@ -82,7 +82,7 @@ namespace Ryujinx.Graphics.Vulkan
}
// Expand the repeating pattern to the number of requested primitives.
BufferHandle newBuffer = _gd.CreateBuffer(expectedSize * sizeof(int));
BufferHandle newBuffer = _gd.BufferManager.CreateWithHandle(_gd, expectedSize * sizeof(int));
// Copy the old data to the new one.
if (_repeatingBuffer != BufferHandle.Null)

View File

@@ -146,5 +146,16 @@ namespace Ryujinx.Graphics.Vulkan
{
return _buffer == buffer;
}
public void Swap(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
if (_buffer == from)
{
_buffer.DecrementReferenceCount();
to.IncrementReferenceCount();
_buffer = to;
}
}
}
}

View File

@@ -28,32 +28,25 @@ namespace Ryujinx.Graphics.Vulkan
public MemoryAllocation AllocateDeviceMemory(
MemoryRequirements requirements,
MemoryPropertyFlags flags = 0)
MemoryPropertyFlags flags = 0,
bool isBuffer = false)
{
return AllocateDeviceMemory(requirements, flags, flags);
}
public MemoryAllocation AllocateDeviceMemory(
MemoryRequirements requirements,
MemoryPropertyFlags flags,
MemoryPropertyFlags alternativeFlags)
{
int memoryTypeIndex = FindSuitableMemoryTypeIndex(requirements.MemoryTypeBits, flags, alternativeFlags);
int memoryTypeIndex = FindSuitableMemoryTypeIndex(requirements.MemoryTypeBits, flags);
if (memoryTypeIndex < 0)
{
return default;
}
bool map = flags.HasFlag(MemoryPropertyFlags.HostVisibleBit);
return Allocate(memoryTypeIndex, requirements.Size, requirements.Alignment, map);
return Allocate(memoryTypeIndex, requirements.Size, requirements.Alignment, map, isBuffer);
}
private MemoryAllocation Allocate(int memoryTypeIndex, ulong size, ulong alignment, bool map)
private MemoryAllocation Allocate(int memoryTypeIndex, ulong size, ulong alignment, bool map, bool isBuffer)
{
for (int i = 0; i < _blockLists.Count; i++)
{
var bl = _blockLists[i];
if (bl.MemoryTypeIndex == memoryTypeIndex)
if (bl.MemoryTypeIndex == memoryTypeIndex && bl.ForBuffer == isBuffer)
{
lock (bl)
{
@@ -62,18 +55,15 @@ namespace Ryujinx.Graphics.Vulkan
}
}
var newBl = new MemoryAllocatorBlockList(_api, _device, memoryTypeIndex, _blockAlignment);
var newBl = new MemoryAllocatorBlockList(_api, _device, memoryTypeIndex, _blockAlignment, isBuffer);
_blockLists.Add(newBl);
return newBl.Allocate(size, alignment, map);
}
private int FindSuitableMemoryTypeIndex(
uint memoryTypeBits,
MemoryPropertyFlags flags,
MemoryPropertyFlags alternativeFlags)
MemoryPropertyFlags flags)
{
int bestCandidateIndex = -1;
for (int i = 0; i < _physicalDeviceMemoryProperties.MemoryTypeCount; i++)
{
var type = _physicalDeviceMemoryProperties.MemoryTypes[i];
@@ -84,14 +74,27 @@ namespace Ryujinx.Graphics.Vulkan
{
return i;
}
else if (type.PropertyFlags.HasFlag(alternativeFlags))
{
bestCandidateIndex = i;
}
}
}
return bestCandidateIndex;
return -1;
}
public static bool IsDeviceMemoryShared(Vk api, PhysicalDevice physicalDevice)
{
// The device is regarded as having shared memory if all heaps have the device local bit.
api.GetPhysicalDeviceMemoryProperties(physicalDevice, out var properties);
for (int i = 0; i < properties.MemoryHeapCount; i++)
{
if (!properties.MemoryHeaps[i].Flags.HasFlag(MemoryHeapFlags.DeviceLocalBit))
{
return false;
}
}
return true;
}
public void Dispose()

View File

@@ -162,15 +162,17 @@ namespace Ryujinx.Graphics.Vulkan
private readonly Device _device;
public int MemoryTypeIndex { get; }
public bool ForBuffer { get; }
private readonly int _blockAlignment;
public MemoryAllocatorBlockList(Vk api, Device device, int memoryTypeIndex, int blockAlignment)
public MemoryAllocatorBlockList(Vk api, Device device, int memoryTypeIndex, int blockAlignment, bool forBuffer)
{
_blocks = new List<Block>();
_api = api;
_device = device;
MemoryTypeIndex = memoryTypeIndex;
ForBuffer = forBuffer;
_blockAlignment = blockAlignment;
}

View File

@@ -1297,6 +1297,25 @@ namespace Ryujinx.Graphics.Vulkan
SignalStateChange();
}
public void SwapBuffer(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
_indexBuffer.Swap(from, to);
for (int i = 0; i < _vertexBuffers.Length; i++)
{
_vertexBuffers[i].Swap(from, to);
}
for (int i = 0; i < _transformFeedbackBuffers.Length; i++)
{
_transformFeedbackBuffers[i].Swap(from, to);
}
_descriptorSetUpdater.SwapBuffer(from, to);
SignalCommandBufferChange();
}
public unsafe void TextureBarrier()
{
MemoryBarrier memoryBarrier = new MemoryBarrier()

View File

@@ -17,10 +17,13 @@ namespace Ryujinx.Graphics.Vulkan
private ulong _byteWeight;
private List<BufferHolder> _backingSwaps;
public PipelineFull(VulkanRenderer gd, Device device) : base(gd, device)
{
_activeQueries = new List<(QueryPool, bool)>();
_pendingQueryCopies = new();
_backingSwaps = new();
CommandBuffer = (Cbs = gd.CommandBufferPool.Rent()).CommandBuffer;
}
@@ -185,6 +188,20 @@ namespace Ryujinx.Graphics.Vulkan
}
}
private void TryBackingSwaps()
{
CommandBufferScoped? cbs = null;
_backingSwaps.RemoveAll((holder) => holder.TryBackingSwap(ref cbs));
cbs?.Dispose();
}
public void AddBackingSwap(BufferHolder holder)
{
_backingSwaps.Add(holder);
}
public void Restore()
{
if (Pipeline != null)
@@ -230,6 +247,8 @@ namespace Ryujinx.Graphics.Vulkan
Gd.ResetCounterPool();
TryBackingSwaps();
Restore();
}

View File

@@ -57,12 +57,12 @@ namespace Ryujinx.Graphics.Vulkan
throw new NotSupportedException();
}
public ReadOnlySpan<byte> GetData()
public PinnedSpan<byte> GetData()
{
return _gd.GetBufferData(_bufferHandle, _offset, _size);
}
public ReadOnlySpan<byte> GetData(int layer, int level)
public PinnedSpan<byte> GetData(int layer, int level)
{
return GetData();
}
@@ -128,7 +128,7 @@ namespace Ryujinx.Graphics.Vulkan
{
if (_bufferView == null)
{
_bufferView = _gd.BufferManager.CreateView(_bufferHandle, VkFormat, _offset, _size);
_bufferView = _gd.BufferManager.CreateView(_bufferHandle, VkFormat, _offset, _size, ReleaseImpl);
}
return _bufferView?.Get(cbs, _offset, _size).Value ?? default;
@@ -147,7 +147,7 @@ namespace Ryujinx.Graphics.Vulkan
return bufferView.Get(cbs, _offset, _size).Value;
}
bufferView = _gd.BufferManager.CreateView(_bufferHandle, vkFormat, _offset, _size);
bufferView = _gd.BufferManager.CreateView(_bufferHandle, vkFormat, _offset, _size, ReleaseImpl);
if (bufferView != null)
{

View File

@@ -531,7 +531,7 @@ namespace Ryujinx.Graphics.Vulkan
return bitmap;
}
public ReadOnlySpan<byte> GetData()
public PinnedSpan<byte> GetData()
{
BackgroundResource resources = _gd.BackgroundResources.Get();
@@ -539,15 +539,15 @@ namespace Ryujinx.Graphics.Vulkan
{
_gd.FlushAllCommands();
return GetData(_gd.CommandBufferPool, resources.GetFlushBuffer());
return PinnedSpan<byte>.UnsafeFromSpan(GetData(_gd.CommandBufferPool, resources.GetFlushBuffer()));
}
else
{
return GetData(resources.GetPool(), resources.GetFlushBuffer());
return PinnedSpan<byte>.UnsafeFromSpan(GetData(resources.GetPool(), resources.GetFlushBuffer()));
}
}
public ReadOnlySpan<byte> GetData(int layer, int level)
public PinnedSpan<byte> GetData(int layer, int level)
{
BackgroundResource resources = _gd.BackgroundResources.Get();
@@ -555,11 +555,11 @@ namespace Ryujinx.Graphics.Vulkan
{
_gd.FlushAllCommands();
return GetData(_gd.CommandBufferPool, resources.GetFlushBuffer(), layer, level);
return PinnedSpan<byte>.UnsafeFromSpan(GetData(_gd.CommandBufferPool, resources.GetFlushBuffer(), layer, level));
}
else
{
return GetData(resources.GetPool(), resources.GetFlushBuffer(), layer, level);
return PinnedSpan<byte>.UnsafeFromSpan(GetData(resources.GetPool(), resources.GetFlushBuffer(), layer, level));
}
}

View File

@@ -129,6 +129,17 @@ namespace Ryujinx.Graphics.Vulkan
return _buffer == buffer;
}
public void Swap(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
if (_buffer == from)
{
_buffer.DecrementReferenceCount();
to.IncrementReferenceCount();
_buffer = to;
}
}
public void Dispose()
{
// Only dispose if this buffer is not refetched on each bind.

View File

@@ -80,6 +80,7 @@ namespace Ryujinx.Graphics.Vulkan
internal bool IsAmdGcn { get; private set; }
internal bool IsMoltenVk { get; private set; }
internal bool IsTBDR { get; private set; }
internal bool IsSharedMemory { get; private set; }
public string GpuVendor { get; private set; }
public string GpuRenderer { get; private set; }
public string GpuVersion { get; private set; }
@@ -313,6 +314,8 @@ namespace Ryujinx.Graphics.Vulkan
portabilityFlags,
vertexBufferAlignment);
IsSharedMemory = MemoryAllocator.IsDeviceMemoryShared(Api, _physicalDevice);
MemoryAllocator = new MemoryAllocator(Api, _physicalDevice, _device, properties.Limits.MaxMemoryAllocationCount);
CommandBufferPool = VulkanInitialization.CreateCommandBufferPool(Api, _device, Queue, QueueLock, queueFamilyIndex);
@@ -373,9 +376,9 @@ namespace Ryujinx.Graphics.Vulkan
_initialized = true;
}
public BufferHandle CreateBuffer(int size)
public BufferHandle CreateBuffer(int size, BufferHandle storageHint)
{
return BufferManager.CreateWithHandle(this, size, false);
return BufferManager.CreateWithHandle(this, size, BufferAllocationType.Auto, storageHint);
}
public IProgram CreateProgram(ShaderSource[] sources, ShaderInfo info)
@@ -439,7 +442,7 @@ namespace Ryujinx.Graphics.Vulkan
_syncManager.RegisterFlush();
}
public ReadOnlySpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
public PinnedSpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
{
return BufferManager.GetData(buffer, offset, size);
}

View File

@@ -14,6 +14,9 @@ namespace Ryujinx.HLE.HOS.Services.Nim
// CreateServerInterface(pid, handle<unknown>, u64) -> object<nn::ec::IShopServiceAccessServer>
public ResultCode CreateServerInterface(ServiceCtx context)
{
// Close transfer memory immediately as we don't use it.
context.Device.System.KernelContext.Syscall.CloseHandle(context.Request.HandleDesc.ToCopy[0]);
MakeObject(context, new IShopServiceAccessServer());
Logger.Stub?.PrintStub(LogClass.ServiceNim);