Compare commits

...

17 Commits

Author SHA1 Message Date
Mary
b5032b3c91 vulkan: Fix access level of extensions fields and make them readonly (#4608) 2023-03-27 08:40:27 +02:00
Mary
f0a3dff136 vulkan: Remove CreateCommandBufferPool from VulkanInitialization (#4606)
It was only called in one place, that can be simplified.
2023-03-27 02:16:31 +00:00
Mary
f659dcb9d8 vulkan: fix broken "VK_EXT_subgroup_size_control" support check (#4607)
Not sure since when it was broken...
2023-03-26 19:01:30 +02:00
riperiperi
a34fb0e939 Vulkan: Insert barriers before clears (#4596)
* Vulkan: Insert barriers before clears

Newer NVIDIA GPUs seem to be able to start clearing render targets before the last rasterization task is completed, which can cause it to clear a texture while it is being sampled.

This change adds a barrier from read to write when doing a clear, assuming it has been sampled in the past. It could be possible for this to be needed for sample into draw by some GPU, but it's not right now afaik.

This should fix visual artifacts on newer NVIDIA GPUs and driver combos. Contrary to popular belief, Tetris® Effect: Connected is not affected. Testing welcome, hopefully should fix most cases of this and not cost too much performance.

* Visual Studio Moment

* Address feedback

* Address Feedback 2
2023-03-26 12:51:02 +02:00
Mary
21ce8a9b80 chore: Update Ryujinx.SDL2-CS to 2.26.3 (#4479) 2023-03-24 22:42:24 +01:00
gdkchan
9ecbee8032 Batch inline index buffer update (#4587) 2023-03-24 14:19:54 +01:00
gdkchan
80519af67d Update short cache textures if modified (#4586) 2023-03-24 12:54:58 +01:00
gdkchan
26e30faff3 Fix handle leak on IShopServiceAccessServerInterface.CreateServerInterface (#4591) 2023-03-24 11:56:54 +01:00
Wunk
0992310b76 ARMeilleure: Check for XSAVE cpuid flag for AVX{2,512} (#4584)
Protection for the `xgetbv` instruction for systems that do not support
`xcr0` such as nehalem processors.

The `XSAVE` cpuid indicates support for `XSAVE`, `XRESTOR`, `XSETBV`,
`XGETBV` while `OSXSAVE` indicates if the operating system itself has
`XSAVE` turned on. Both must be checked at the same time.
2023-03-22 14:51:21 -03:00
Andrew Glaze
009c1101d2 CI: add a version tag to correlate release versions with commits (#4572)
* add step to tag commit with release version

* add step to tag commit with release version

* Rename step to “Create Tag”

* Fix name
2023-03-22 13:17:28 +01:00
gdkchan
ba95ee54ab Revert "Use source generated json serializers in order to improve code trimming (#4094)" (#4576)
This reverts commit 4ce4299ca2.
2023-03-21 20:14:46 -03:00
Andrey Sukharev
4ce4299ca2 Use source generated json serializers in order to improve code trimming (#4094)
* Use source generated json serializers in order to improve code trimming

* Use strongly typed github releases model to fetch updates instead of raw Newtonsoft.Json parsing

* Use separate model for LogEventArgs serialization

* Make dynamic object formatter static. Fix string builder pooling.

* Do not inherit json version of LogEventArgs from EventArgs

* Fix extra space in object formatting

* Write log json directly to stream instead of using buffer writer

* Rebase fixes

* Rebase fixes

* Rebase fixes

* Enforce block-scoped namespaces in the solution. Convert style for existing code

* Apply suggestions from code review

Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>

* Rebase indent fix

* Fix indent

* Delete unnecessary json properties

* Rebase fix

* Remove overridden json property names as they are handled in the options

* Apply suggestions from code review

Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>

* Use default json options in github api calls

* Indentation and spacing fixes

---------

Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
2023-03-21 19:41:19 -03:00
Wunk
17620d18db ARMeilleure: Add initial support for AVX512 (EVEX encoding) (cont) (#4147)
* ARMeilleure: Add AVX512{F,VL,DQ,BW} detection

Add `UseAvx512Ortho` and `UseAvx512OrthoFloat` optimization flags as
short-hands for `F+VL` and `F+VL+DQ`.

* ARMeilleure: Add initial support for EVEX instruction encoding

Does not implement rounding, or exception controls.

* ARMeilleure: Add `X86Vpternlogd`

Accelerates the vector-`Not` instruction.

* ARMeilleure: Add check for `OSXSAVE` for AVX{2,512}

* ARMeilleure: Add check for `XCR0` flags

Add XCR0 register checks for AVX and AVX512F, following the guidelines
from section 14.3 and 15.2 from the Intel Architecture Software
Developer's Manual.

* ARMeilleure: Remove redundant `ReProtect` and `Dispose`, formatting

* ARMeilleure: Move XCR0 procedure to GetXcr0Eax

* ARMeilleure: Add `XCR0` to `FeatureInfo` structure

* ARMeilleure: Utilize `ReadOnlySpan` for Xcr0 assembly

Avoids an additional allocation

* ARMeilleure: Formatting fixes

* ARMeilleure: Fix EVEX encoding src2 register index

> Just like in VEX prefix, vvvv is provided in inverted form.

* ARMeilleure: Add `X86Vpternlogd` acceleration to `Vmvn_I`

Passes unit tests, verified instruction utilization

* ARMeilleure: Fix EVEX register operand designations

Operand 2 was being sourced improperly.

EVEX encoded instructions source their operands like so:
Operand 1: ModRM:reg
Operand 2: EVEX.vvvvv
Operand 3: ModRM:r/m
Operand 4: Imm

This fixes the improper register designations when emitting vpternlog.
Now "dest", "src1", "src2" arguments emit in the proper order in EVEX instructions.

* ARMeilleure: Add `X86Vpternlogd` acceleration to `Orn_V`

* ARMeilleure: PTC version bump

* ARMeilleure: Update EVEX encoding Debug.Assert to Debug.Fail

* ARMeilleure: Update EVEX encoding comment capitalization
2023-03-20 16:09:24 -03:00
riperiperi
9f1cf6458c Vulkan: Migrate buffers between memory types to improve GPU performance (#4540)
* Initial implementation of migration between memory heaps

- Missing OOM handling
- Missing `_map` data safety when remapping
  - Copy may not have completed yet (needs some kind of fence)
  - Map may be unmapped before it is done being used. (needs scoped access)
- SSBO accesses are all "writes" - maybe pass info in another way.
- Missing keeping map type when resizing buffers (should this be done?)

* Ensure migrated data is in place before flushing.

* Fix issue where old waitable would be signalled.

- There is a real issue where existing Auto<> references need to be replaced.

* Swap bound Auto<> instances when swapping buffer backing

* Fix conversion buffers

* Don't try move buffers if the host has shared memory.

* Make GPU methods return PinnedSpan with scope

* Storage Hint

* Fix stupidity

* Fix rebase

* Tweak rules

Attempt to sidestep BOTW slowdown

* Remove line

* Migrate only when command buffers flush

* Change backing swap log to debug

* Address some feedback

* Disallow backing swap when the flush lock is held by the current thread

* Make PinnedSpan from ReadOnlySpan explicitly unsafe

* Fix some small issues

- Index buffer swap fixed
- Allocate DeviceLocal buffers using a separate block list to images.

* Remove alternative flags

* Address feedback
2023-03-19 17:56:48 -03:00
gdkchan
67b4e63cff Remove MultiRange Min/MaxAddress and rename GetSlice to Slice (#4566)
* Delete MinAddress and MaxAddress from MultiRange

* Rename MultiRange.GetSlice to MultiRange.Slice
2023-03-19 17:31:35 +01:00
TSRBerry
c05c688ee8 Avoid copying more handles than we have space for (#4564)
* Avoid copying more handles than we have space for

* Use locks instead

* Reduce nesting by combining the lock statements

* Add locks for other uses of _sessionHandles and _portHandles

* Use one object to lock instead of locking twice

* Release the lock as soon as possible
2023-03-19 11:30:04 +01:00
riperiperi
b2623dc27d OpenGL: Fix inverted conditional for counter flush from #4471 (#4560)
Fixes OpenGL.
2023-03-18 20:39:05 -03:00
61 changed files with 1088 additions and 296 deletions

View File

@@ -112,6 +112,17 @@ jobs:
repo: ${{ env.RYUJINX_TARGET_RELEASE_CHANNEL_REPO }}
token: ${{ secrets.RELEASE_TOKEN }}
- name: Create tag
uses: actions/github-script@v5
with:
script: |
github.rest.git.createRef({
owner: context.repo.owner,
repo: context.repo.repo,
ref: 'refs/tags/${{ steps.version_info.outputs.build_version }}',
sha: context.sha
})
flatpak_release:
uses: ./.github/workflows/flatpak.yml
needs: release

View File

@@ -7,6 +7,7 @@
<ItemGroup>
<ProjectReference Include="..\Ryujinx.Common\Ryujinx.Common.csproj" />
<ProjectReference Include="..\Ryujinx.Memory\Ryujinx.Memory.csproj" />
</ItemGroup>
<ItemGroup>

View File

@@ -1034,7 +1034,13 @@ namespace ARMeilleure.CodeGen.X86
Debug.Assert(opCode != BadOp, "Invalid opcode value.");
if ((flags & InstructionFlags.Vex) != 0 && HardwareCapabilities.SupportsVexEncoding)
if ((flags & InstructionFlags.Evex) != 0 && HardwareCapabilities.SupportsEvexEncoding)
{
WriteEvexInst(dest, src1, src2, type, flags, opCode);
opCode &= 0xff;
}
else if ((flags & InstructionFlags.Vex) != 0 && HardwareCapabilities.SupportsVexEncoding)
{
// In a vex encoding, only one prefix can be active at a time. The active prefix is encoded in the second byte using two bits.
@@ -1153,6 +1159,103 @@ namespace ARMeilleure.CodeGen.X86
}
}
private void WriteEvexInst(
Operand dest,
Operand src1,
Operand src2,
OperandType type,
InstructionFlags flags,
int opCode,
bool broadcast = false,
int registerWidth = 128,
int maskRegisterIdx = 0,
bool zeroElements = false)
{
int op1Idx = dest.GetRegister().Index;
int op2Idx = src1.GetRegister().Index;
int op3Idx = src2.GetRegister().Index;
WriteByte(0x62);
// P0
// Extend operand 1 register
bool r = (op1Idx & 8) == 0;
// Extend operand 3 register
bool x = (op3Idx & 16) == 0;
// Extend operand 3 register
bool b = (op3Idx & 8) == 0;
// Extend operand 1 register
bool rp = (op1Idx & 16) == 0;
// Escape code index
byte mm = 0b00;
switch ((ushort)(opCode >> 8))
{
case 0xf00: mm = 0b01; break;
case 0xf38: mm = 0b10; break;
case 0xf3a: mm = 0b11; break;
default: Debug.Fail($"Failed to EVEX encode opcode 0x{opCode:X}."); break;
}
WriteByte(
(byte)(
(r ? 0x80 : 0) |
(x ? 0x40 : 0) |
(b ? 0x20 : 0) |
(rp ? 0x10 : 0) |
mm));
// P1
// Specify 64-bit lane mode
bool w = Is64Bits(type);
// Operand 2 register index
byte vvvv = (byte)(~op2Idx & 0b1111);
// Opcode prefix
byte pp = (flags & InstructionFlags.PrefixMask) switch
{
InstructionFlags.Prefix66 => 0b01,
InstructionFlags.PrefixF3 => 0b10,
InstructionFlags.PrefixF2 => 0b11,
_ => 0
};
WriteByte(
(byte)(
(w ? 0x80 : 0) |
(vvvv << 3) |
0b100 |
pp));
// P2
// Mask register determines what elements to zero, rather than what elements to merge
bool z = zeroElements;
// Specifies register-width
byte ll = 0b00;
switch (registerWidth)
{
case 128: ll = 0b00; break;
case 256: ll = 0b01; break;
case 512: ll = 0b10; break;
default: Debug.Fail($"Invalid EVEX vector register width {registerWidth}."); break;
}
// Embedded broadcast in the case of a memory operand
bool bcast = broadcast;
// Extend operand 2 register
bool vp = (op2Idx & 16) == 0;
// Mask register index
Debug.Assert(maskRegisterIdx < 8, $"Invalid mask register index {maskRegisterIdx}.");
byte aaa = (byte)(maskRegisterIdx & 0b111);
WriteByte(
(byte)(
(z ? 0x80 : 0) |
(ll << 5) |
(bcast ? 0x10 : 0) |
(vp ? 8 : 0) |
aaa));
}
private void WriteCompactInst(Operand operand, int opCode)
{
int regIndex = operand.GetRegister().Index;

View File

@@ -20,6 +20,7 @@ namespace ARMeilleure.CodeGen.X86
Reg8Dest = 1 << 2,
RexW = 1 << 3,
Vex = 1 << 4,
Evex = 1 << 5,
PrefixBit = 16,
PrefixMask = 7 << PrefixBit,
@@ -278,6 +279,7 @@ namespace ARMeilleure.CodeGen.X86
Add(X86Instruction.Vfnmsub231sd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38bf, InstructionFlags.Vex | InstructionFlags.Prefix66 | InstructionFlags.RexW));
Add(X86Instruction.Vfnmsub231ss, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38bf, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Vpblendvb, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f3a4c, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Vpternlogd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f3a25, InstructionFlags.Evex | InstructionFlags.Prefix66));
Add(X86Instruction.Xor, new InstructionInfo(0x00000031, 0x06000083, 0x06000081, BadOp, 0x00000033, InstructionFlags.None));
Add(X86Instruction.Xorpd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000f57, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Xorps, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000f57, InstructionFlags.Vex));

View File

@@ -1,10 +1,14 @@
using Ryujinx.Memory;
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics.X86;
namespace ARMeilleure.CodeGen.X86
{
static class HardwareCapabilities
{
private delegate uint GetXcr0();
static HardwareCapabilities()
{
if (!X86Base.IsSupported)
@@ -24,6 +28,34 @@ namespace ARMeilleure.CodeGen.X86
FeatureInfo7Ebx = (FeatureFlags7Ebx)ebx7;
FeatureInfo7Ecx = (FeatureFlags7Ecx)ecx7;
}
Xcr0InfoEax = (Xcr0FlagsEax)GetXcr0Eax();
}
private static uint GetXcr0Eax()
{
if (!FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Xsave))
{
// XSAVE feature required for xgetbv
return 0;
}
ReadOnlySpan<byte> asmGetXcr0 = new byte[]
{
0x31, 0xc9, // xor ecx, ecx
0xf, 0x01, 0xd0, // xgetbv
0xc3, // ret
};
using MemoryBlock memGetXcr0 = new MemoryBlock((ulong)asmGetXcr0.Length);
memGetXcr0.Write(0, asmGetXcr0);
memGetXcr0.Reprotect(0, (ulong)asmGetXcr0.Length, MemoryPermission.ReadAndExecute);
var fGetXcr0 = Marshal.GetDelegateForFunctionPointer<GetXcr0>(memGetXcr0.Pointer);
return fGetXcr0();
}
[Flags]
@@ -44,6 +76,8 @@ namespace ARMeilleure.CodeGen.X86
Sse42 = 1 << 20,
Popcnt = 1 << 23,
Aes = 1 << 25,
Xsave = 1 << 26,
Osxsave = 1 << 27,
Avx = 1 << 28,
F16c = 1 << 29
}
@@ -52,7 +86,11 @@ namespace ARMeilleure.CodeGen.X86
public enum FeatureFlags7Ebx
{
Avx2 = 1 << 5,
Sha = 1 << 29
Avx512f = 1 << 16,
Avx512dq = 1 << 17,
Sha = 1 << 29,
Avx512bw = 1 << 30,
Avx512vl = 1 << 31
}
[Flags]
@@ -61,10 +99,21 @@ namespace ARMeilleure.CodeGen.X86
Gfni = 1 << 8,
}
[Flags]
public enum Xcr0FlagsEax
{
Sse = 1 << 1,
YmmHi128 = 1 << 2,
Opmask = 1 << 5,
ZmmHi256 = 1 << 6,
Hi16Zmm = 1 << 7
}
public static FeatureFlags1Edx FeatureInfo1Edx { get; }
public static FeatureFlags1Ecx FeatureInfo1Ecx { get; }
public static FeatureFlags7Ebx FeatureInfo7Ebx { get; } = 0;
public static FeatureFlags7Ecx FeatureInfo7Ecx { get; } = 0;
public static Xcr0FlagsEax Xcr0InfoEax { get; } = 0;
public static bool SupportsSse => FeatureInfo1Edx.HasFlag(FeatureFlags1Edx.Sse);
public static bool SupportsSse2 => FeatureInfo1Edx.HasFlag(FeatureFlags1Edx.Sse2);
@@ -76,8 +125,13 @@ namespace ARMeilleure.CodeGen.X86
public static bool SupportsSse42 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Sse42);
public static bool SupportsPopcnt => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Popcnt);
public static bool SupportsAesni => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Aes);
public static bool SupportsAvx => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Avx);
public static bool SupportsAvx => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Avx | FeatureFlags1Ecx.Xsave | FeatureFlags1Ecx.Osxsave) && Xcr0InfoEax.HasFlag(Xcr0FlagsEax.Sse | Xcr0FlagsEax.YmmHi128);
public static bool SupportsAvx2 => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx2) && SupportsAvx;
public static bool SupportsAvx512F => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx512f) && FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Xsave | FeatureFlags1Ecx.Osxsave)
&& Xcr0InfoEax.HasFlag(Xcr0FlagsEax.Sse | Xcr0FlagsEax.YmmHi128 | Xcr0FlagsEax.Opmask | Xcr0FlagsEax.ZmmHi256 | Xcr0FlagsEax.Hi16Zmm);
public static bool SupportsAvx512Vl => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx512vl) && SupportsAvx512F;
public static bool SupportsAvx512Bw => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx512bw) && SupportsAvx512F;
public static bool SupportsAvx512Dq => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx512dq) && SupportsAvx512F;
public static bool SupportsF16c => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.F16c);
public static bool SupportsSha => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Sha);
public static bool SupportsGfni => FeatureInfo7Ecx.HasFlag(FeatureFlags7Ecx.Gfni);
@@ -85,5 +139,6 @@ namespace ARMeilleure.CodeGen.X86
public static bool ForceLegacySse { get; set; }
public static bool SupportsVexEncoding => SupportsAvx && !ForceLegacySse;
public static bool SupportsEvexEncoding => SupportsAvx512F && !ForceLegacySse;
}
}

View File

@@ -180,6 +180,7 @@ namespace ARMeilleure.CodeGen.X86
Add(Intrinsic.X86Vfnmadd231ss, new IntrinsicInfo(X86Instruction.Vfnmadd231ss, IntrinsicType.Fma));
Add(Intrinsic.X86Vfnmsub231sd, new IntrinsicInfo(X86Instruction.Vfnmsub231sd, IntrinsicType.Fma));
Add(Intrinsic.X86Vfnmsub231ss, new IntrinsicInfo(X86Instruction.Vfnmsub231ss, IntrinsicType.Fma));
Add(Intrinsic.X86Vpternlogd, new IntrinsicInfo(X86Instruction.Vpternlogd, IntrinsicType.TernaryImm));
Add(Intrinsic.X86Xorpd, new IntrinsicInfo(X86Instruction.Xorpd, IntrinsicType.Binary));
Add(Intrinsic.X86Xorps, new IntrinsicInfo(X86Instruction.Xorps, IntrinsicType.Binary));
}

View File

@@ -219,6 +219,7 @@ namespace ARMeilleure.CodeGen.X86
Vfnmsub231sd,
Vfnmsub231ss,
Vpblendvb,
Vpternlogd,
Xor,
Xorpd,
Xorps,

View File

@@ -254,7 +254,22 @@ namespace ARMeilleure.Instructions
public static void Not_V(ArmEmitterContext context)
{
if (Optimizations.UseSse2)
if (Optimizations.UseAvx512Ortho)
{
OpCodeSimd op = (OpCodeSimd)context.CurrOp;
Operand n = GetVec(op.Rn);
Operand res = context.AddIntrinsic(Intrinsic.X86Vpternlogd, n, n, Const(~0b10101010));
if (op.RegisterSize == RegisterSize.Simd64)
{
res = context.VectorZeroUpper64(res);
}
context.Copy(GetVec(op.Rd), res);
}
else if (Optimizations.UseSse2)
{
OpCodeSimd op = (OpCodeSimd)context.CurrOp;
@@ -283,6 +298,22 @@ namespace ARMeilleure.Instructions
{
InstEmitSimdHelperArm64.EmitVectorBinaryOp(context, Intrinsic.Arm64OrnV);
}
else if (Optimizations.UseAvx512Ortho)
{
OpCodeSimdReg op = (OpCodeSimdReg)context.CurrOp;
Operand n = GetVec(op.Rn);
Operand m = GetVec(op.Rm);
Operand res = context.AddIntrinsic(Intrinsic.X86Vpternlogd, n, m, Const(0b11001100 | ~0b10101010));
if (op.RegisterSize == RegisterSize.Simd64)
{
res = context.VectorZeroUpper64(res);
}
context.Copy(GetVec(op.Rd), res);
}
else if (Optimizations.UseSse2)
{
OpCodeSimdReg op = (OpCodeSimdReg)context.CurrOp;

View File

@@ -151,6 +151,13 @@ namespace ARMeilleure.Instructions
{
InstEmitSimdHelper32Arm64.EmitVectorBinaryOpSimd32(context, (n, m) => context.AddIntrinsic(Intrinsic.Arm64OrnV | Intrinsic.Arm64V128, n, m));
}
else if (Optimizations.UseAvx512Ortho)
{
EmitVectorBinaryOpSimd32(context, (n, m) =>
{
return context.AddIntrinsic(Intrinsic.X86Vpternlogd, n, m, Const(0b11001100 | ~0b10101010));
});
}
else if (Optimizations.UseSse2)
{
Operand mask = context.VectorOne();

View File

@@ -34,7 +34,14 @@ namespace ARMeilleure.Instructions
public static void Vmvn_I(ArmEmitterContext context)
{
if (Optimizations.UseSse2)
if (Optimizations.UseAvx512Ortho)
{
EmitVectorUnaryOpSimd32(context, (op1) =>
{
return context.AddIntrinsic(Intrinsic.X86Vpternlogd, op1, op1, Const(0b01010101));
});
}
else if (Optimizations.UseSse2)
{
EmitVectorUnaryOpSimd32(context, (op1) =>
{

View File

@@ -173,6 +173,7 @@ namespace ARMeilleure.IntermediateRepresentation
X86Vfnmadd231ss,
X86Vfnmsub231sd,
X86Vfnmsub231ss,
X86Vpternlogd,
X86Xorpd,
X86Xorps,

View File

@@ -23,6 +23,10 @@ namespace ARMeilleure
public static bool UseSse42IfAvailable { get; set; } = true;
public static bool UsePopCntIfAvailable { get; set; } = true;
public static bool UseAvxIfAvailable { get; set; } = true;
public static bool UseAvx512FIfAvailable { get; set; } = true;
public static bool UseAvx512VlIfAvailable { get; set; } = true;
public static bool UseAvx512BwIfAvailable { get; set; } = true;
public static bool UseAvx512DqIfAvailable { get; set; } = true;
public static bool UseF16cIfAvailable { get; set; } = true;
public static bool UseFmaIfAvailable { get; set; } = true;
public static bool UseAesniIfAvailable { get; set; } = true;
@@ -47,11 +51,18 @@ namespace ARMeilleure
internal static bool UseSse42 => UseSse42IfAvailable && X86HardwareCapabilities.SupportsSse42;
internal static bool UsePopCnt => UsePopCntIfAvailable && X86HardwareCapabilities.SupportsPopcnt;
internal static bool UseAvx => UseAvxIfAvailable && X86HardwareCapabilities.SupportsAvx && !ForceLegacySse;
internal static bool UseAvx512F => UseAvx512FIfAvailable && X86HardwareCapabilities.SupportsAvx512F && !ForceLegacySse;
internal static bool UseAvx512Vl => UseAvx512VlIfAvailable && X86HardwareCapabilities.SupportsAvx512Vl && !ForceLegacySse;
internal static bool UseAvx512Bw => UseAvx512BwIfAvailable && X86HardwareCapabilities.SupportsAvx512Bw && !ForceLegacySse;
internal static bool UseAvx512Dq => UseAvx512DqIfAvailable && X86HardwareCapabilities.SupportsAvx512Dq && !ForceLegacySse;
internal static bool UseF16c => UseF16cIfAvailable && X86HardwareCapabilities.SupportsF16c;
internal static bool UseFma => UseFmaIfAvailable && X86HardwareCapabilities.SupportsFma;
internal static bool UseAesni => UseAesniIfAvailable && X86HardwareCapabilities.SupportsAesni;
internal static bool UsePclmulqdq => UsePclmulqdqIfAvailable && X86HardwareCapabilities.SupportsPclmulqdq;
internal static bool UseSha => UseShaIfAvailable && X86HardwareCapabilities.SupportsSha;
internal static bool UseGfni => UseGfniIfAvailable && X86HardwareCapabilities.SupportsGfni;
internal static bool UseAvx512Ortho => UseAvx512F && UseAvx512Vl;
internal static bool UseAvx512OrthoFloat => UseAvx512Ortho && UseAvx512Dq;
}
}

View File

@@ -30,7 +30,7 @@ namespace ARMeilleure.Translation.PTC
private const string OuterHeaderMagicString = "PTCohd\0\0";
private const string InnerHeaderMagicString = "PTCihd\0\0";
private const uint InternalVersion = 4484; //! To be incremented manually for each change to the ARMeilleure project.
private const uint InternalVersion = 4485; //! To be incremented manually for each change to the ARMeilleure project.
private const string ActualDir = "0";
private const string BackupDir = "1";
@@ -969,6 +969,7 @@ namespace ARMeilleure.Translation.PTC
(ulong)Arm64HardwareCapabilities.LinuxFeatureInfoHwCap,
(ulong)Arm64HardwareCapabilities.LinuxFeatureInfoHwCap2,
(ulong)Arm64HardwareCapabilities.MacOsFeatureInfo,
0,
0);
}
else if (RuntimeInformation.ProcessArchitecture == Architecture.X64)
@@ -977,11 +978,12 @@ namespace ARMeilleure.Translation.PTC
(ulong)X86HardwareCapabilities.FeatureInfo1Ecx,
(ulong)X86HardwareCapabilities.FeatureInfo1Edx,
(ulong)X86HardwareCapabilities.FeatureInfo7Ebx,
(ulong)X86HardwareCapabilities.FeatureInfo7Ecx);
(ulong)X86HardwareCapabilities.FeatureInfo7Ecx,
(ulong)X86HardwareCapabilities.Xcr0InfoEax);
}
else
{
return new FeatureInfo(0, 0, 0, 0);
return new FeatureInfo(0, 0, 0, 0, 0);
}
}
@@ -1002,7 +1004,7 @@ namespace ARMeilleure.Translation.PTC
return osPlatform;
}
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 78*/)]
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 86*/)]
private struct OuterHeader
{
public ulong Magic;
@@ -1034,8 +1036,8 @@ namespace ARMeilleure.Translation.PTC
}
}
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 32*/)]
private record struct FeatureInfo(ulong FeatureInfo0, ulong FeatureInfo1, ulong FeatureInfo2, ulong FeatureInfo3);
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 40*/)]
private record struct FeatureInfo(ulong FeatureInfo0, ulong FeatureInfo1, ulong FeatureInfo2, ulong FeatureInfo3, ulong FeatureInfo4);
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 128*/)]
private struct InnerHeader

View File

@@ -34,7 +34,7 @@
<PackageVersion Include="Ryujinx.Graphics.Nvdec.Dependencies" Version="5.0.1-build13" />
<PackageVersion Include="Ryujinx.Graphics.Vulkan.Dependencies.MoltenVK" Version="1.2.0" />
<PackageVersion Include="Ryujinx.GtkSharp" Version="3.24.24.59-ryujinx" />
<PackageVersion Include="Ryujinx.SDL2-CS" Version="2.26.1-build23" />
<PackageVersion Include="Ryujinx.SDL2-CS" Version="2.26.3-build25" />
<PackageVersion Include="shaderc.net" Version="0.1.0" />
<PackageVersion Include="SharpZipLib" Version="1.4.2" />
<PackageVersion Include="Silk.NET.Vulkan" Version="2.16.0" />

View File

@@ -15,7 +15,12 @@ namespace Ryujinx.Graphics.GAL
void BackgroundContextAction(Action action, bool alwaysBackground = false);
BufferHandle CreateBuffer(int size);
BufferHandle CreateBuffer(int size, BufferHandle storageHint);
BufferHandle CreateBuffer(int size)
{
return CreateBuffer(size, BufferHandle.Null);
}
IProgram CreateProgram(ShaderSource[] shaders, ShaderInfo info);
@@ -26,7 +31,7 @@ namespace Ryujinx.Graphics.GAL
void DeleteBuffer(BufferHandle buffer);
ReadOnlySpan<byte> GetBufferData(BufferHandle buffer, int offset, int size);
PinnedSpan<byte> GetBufferData(BufferHandle buffer, int offset, int size);
Capabilities GetCapabilities();
ulong GetCurrentSync();

View File

@@ -15,8 +15,8 @@ namespace Ryujinx.Graphics.GAL
ITexture CreateView(TextureCreateInfo info, int firstLayer, int firstLevel);
ReadOnlySpan<byte> GetData();
ReadOnlySpan<byte> GetData(int layer, int level);
PinnedSpan<byte> GetData();
PinnedSpan<byte> GetData(int layer, int level);
void SetData(SpanOrArray<byte> data);
void SetData(SpanOrArray<byte> data, int layer, int level);

View File

@@ -21,9 +21,9 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Buffer
public static void Run(ref BufferGetDataCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
ReadOnlySpan<byte> result = renderer.GetBufferData(threaded.Buffers.MapBuffer(command._buffer), command._offset, command._size);
PinnedSpan<byte> result = renderer.GetBufferData(threaded.Buffers.MapBuffer(command._buffer), command._offset, command._size);
command._result.Get(threaded).Result = new PinnedSpan<byte>(result);
command._result.Get(threaded).Result = result;
}
}
}

View File

@@ -5,16 +5,25 @@
public CommandType CommandType => CommandType.CreateBuffer;
private BufferHandle _threadedHandle;
private int _size;
private BufferHandle _storageHint;
public void Set(BufferHandle threadedHandle, int size)
public void Set(BufferHandle threadedHandle, int size, BufferHandle storageHint)
{
_threadedHandle = threadedHandle;
_size = size;
_storageHint = storageHint;
}
public static void Run(ref CreateBufferCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
threaded.Buffers.AssignBuffer(command._threadedHandle, renderer.CreateBuffer(command._size));
BufferHandle hint = BufferHandle.Null;
if (command._storageHint != BufferHandle.Null)
{
hint = threaded.Buffers.MapBuffer(command._storageHint);
}
threaded.Buffers.AssignBuffer(command._threadedHandle, renderer.CreateBuffer(command._size, hint));
}
}
}

View File

@@ -18,9 +18,9 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Texture
public static void Run(ref TextureGetDataCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
ReadOnlySpan<byte> result = command._texture.Get(threaded).Base.GetData();
PinnedSpan<byte> result = command._texture.Get(threaded).Base.GetData();
command._result.Get(threaded).Result = new PinnedSpan<byte>(result);
command._result.Get(threaded).Result = result;
}
}
}

View File

@@ -22,9 +22,9 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Texture
public static void Run(ref TextureGetDataSliceCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
ReadOnlySpan<byte> result = command._texture.Get(threaded).Base.GetData(command._layer, command._level);
PinnedSpan<byte> result = command._texture.Get(threaded).Base.GetData(command._layer, command._level);
command._result.Get(threaded).Result = new PinnedSpan<byte>(result);
command._result.Get(threaded).Result = result;
}
}
}

View File

@@ -1,23 +0,0 @@
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
namespace Ryujinx.Graphics.GAL.Multithreading.Model
{
unsafe struct PinnedSpan<T> where T : unmanaged
{
private void* _ptr;
private int _size;
public PinnedSpan(ReadOnlySpan<T> span)
{
_ptr = Unsafe.AsPointer(ref MemoryMarshal.GetReference(span));
_size = span.Length;
}
public ReadOnlySpan<T> Get()
{
return new ReadOnlySpan<T>(_ptr, _size * Unsafe.SizeOf<T>());
}
}
}

View File

@@ -72,7 +72,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources
return newTex;
}
public ReadOnlySpan<byte> GetData()
public PinnedSpan<byte> GetData()
{
if (_renderer.IsGpuThread())
{
@@ -80,7 +80,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources
_renderer.New<TextureGetDataCommand>().Set(Ref(this), Ref(box));
_renderer.InvokeCommand();
return box.Result.Get();
return box.Result;
}
else
{
@@ -90,7 +90,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources
}
}
public ReadOnlySpan<byte> GetData(int layer, int level)
public PinnedSpan<byte> GetData(int layer, int level)
{
if (_renderer.IsGpuThread())
{
@@ -98,7 +98,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources
_renderer.New<TextureGetDataSliceCommand>().Set(Ref(this), Ref(box), layer, level);
_renderer.InvokeCommand();
return box.Result.Get();
return box.Result;
}
else
{

View File

@@ -265,10 +265,10 @@ namespace Ryujinx.Graphics.GAL.Multithreading
}
}
public BufferHandle CreateBuffer(int size)
public BufferHandle CreateBuffer(int size, BufferHandle storageHint)
{
BufferHandle handle = Buffers.CreateBufferHandle();
New<CreateBufferCommand>().Set(handle, size);
New<CreateBufferCommand>().Set(handle, size, storageHint);
QueueCommand();
return handle;
@@ -329,7 +329,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading
QueueCommand();
}
public ReadOnlySpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
public PinnedSpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
{
if (IsGpuThread())
{
@@ -337,7 +337,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading
New<BufferGetDataCommand>().Set(buffer, offset, size, Ref(box));
InvokeCommand();
return box.Result.Get();
return box.Result;
}
else
{

View File

@@ -0,0 +1,53 @@
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
namespace Ryujinx.Graphics.GAL
{
public unsafe struct PinnedSpan<T> : IDisposable where T : unmanaged
{
private void* _ptr;
private int _size;
private Action _disposeAction;
/// <summary>
/// Creates a new PinnedSpan from an existing ReadOnlySpan. The span *must* be pinned in memory.
/// The data must be guaranteed to live until disposeAction is called.
/// </summary>
/// <param name="span">Existing span</param>
/// <param name="disposeAction">Action to call on dispose</param>
/// <remarks>
/// If a dispose action is not provided, it is safe to assume the resource will be available until the next call.
/// </remarks>
public static PinnedSpan<T> UnsafeFromSpan(ReadOnlySpan<T> span, Action disposeAction = null)
{
return new PinnedSpan<T>(Unsafe.AsPointer(ref MemoryMarshal.GetReference(span)), span.Length, disposeAction);
}
/// <summary>
/// Creates a new PinnedSpan from an existing unsafe region. The data must be guaranteed to live until disposeAction is called.
/// </summary>
/// <param name="ptr">Pointer to the region</param>
/// <param name="size">The total items of T the region contains</param>
/// <param name="disposeAction">Action to call on dispose</param>
/// <remarks>
/// If a dispose action is not provided, it is safe to assume the resource will be available until the next call.
/// </remarks>
public PinnedSpan(void* ptr, int size, Action disposeAction = null)
{
_ptr = ptr;
_size = size;
_disposeAction = disposeAction;
}
public ReadOnlySpan<T> Get()
{
return new ReadOnlySpan<T>(_ptr, _size * Unsafe.SizeOf<T>());
}
public void Dispose()
{
_disposeAction?.Invoke();
}
}
}

View File

@@ -180,7 +180,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
int firstInstance = (int)_state.State.FirstInstance;
int inlineIndexCount = _drawState.IbStreamer.GetAndResetInlineIndexCount();
int inlineIndexCount = _drawState.IbStreamer.GetAndResetInlineIndexCount(_context.Renderer);
if (inlineIndexCount != 0)
{
@@ -670,7 +670,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
{
if (indexedInline)
{
int inlineIndexCount = _drawState.IbStreamer.GetAndResetInlineIndexCount();
int inlineIndexCount = _drawState.IbStreamer.GetAndResetInlineIndexCount(_context.Renderer);
BufferRange br = new BufferRange(_drawState.IbStreamer.GetInlineIndexBuffer(), 0, inlineIndexCount * 4);
_channel.BufferManager.SetIndexBuffer(br, IndexType.UInt);

View File

@@ -11,9 +11,13 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
/// </summary>
struct IbStreamer
{
private const int BufferCapacity = 256; // Must be a power of 2.
private BufferHandle _inlineIndexBuffer;
private int _inlineIndexBufferSize;
private int _inlineIndexCount;
private uint[] _buffer;
private int _bufferOffset;
/// <summary>
/// Indicates if any index buffer data has been pushed.
@@ -38,9 +42,11 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
/// Gets the number of elements on the current inline index buffer,
/// while also reseting it to zero for the next draw.
/// </summary>
/// <param name="renderer">Host renderer</param>
/// <returns>Inline index bufffer count</returns>
public int GetAndResetInlineIndexCount()
public int GetAndResetInlineIndexCount(IRenderer renderer)
{
UpdateRemaining(renderer);
int temp = _inlineIndexCount;
_inlineIndexCount = 0;
return temp;
@@ -58,16 +64,12 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
byte i2 = (byte)(argument >> 16);
byte i3 = (byte)(argument >> 24);
Span<uint> data = stackalloc uint[4];
int offset = _inlineIndexCount;
data[0] = i0;
data[1] = i1;
data[2] = i2;
data[3] = i3;
int offset = _inlineIndexCount * 4;
renderer.SetBufferData(GetInlineIndexBuffer(renderer, offset), offset, MemoryMarshal.Cast<uint, byte>(data));
PushData(renderer, offset, i0);
PushData(renderer, offset + 1, i1);
PushData(renderer, offset + 2, i2);
PushData(renderer, offset + 3, i3);
_inlineIndexCount += 4;
}
@@ -82,14 +84,10 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
ushort i0 = (ushort)argument;
ushort i1 = (ushort)(argument >> 16);
Span<uint> data = stackalloc uint[2];
int offset = _inlineIndexCount;
data[0] = i0;
data[1] = i1;
int offset = _inlineIndexCount * 4;
renderer.SetBufferData(GetInlineIndexBuffer(renderer, offset), offset, MemoryMarshal.Cast<uint, byte>(data));
PushData(renderer, offset, i0);
PushData(renderer, offset + 1, i1);
_inlineIndexCount += 2;
}
@@ -103,13 +101,61 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
{
uint i0 = (uint)argument;
Span<uint> data = stackalloc uint[1];
int offset = _inlineIndexCount++;
data[0] = i0;
PushData(renderer, offset, i0);
}
int offset = _inlineIndexCount++ * 4;
/// <summary>
/// Pushes a 32-bit value to the index buffer.
/// </summary>
/// <param name="renderer">Host renderer</param>
/// <param name="offset">Offset where the data should be written, in 32-bit words</param>
/// <param name="value">Index value to be written</param>
private void PushData(IRenderer renderer, int offset, uint value)
{
if (_buffer == null)
{
_buffer = new uint[BufferCapacity];
}
renderer.SetBufferData(GetInlineIndexBuffer(renderer, offset), offset, MemoryMarshal.Cast<uint, byte>(data));
// We upload data in chunks.
// If we are at the start of a chunk, then the buffer might be full,
// in that case we need to submit any existing data before overwriting the buffer.
int subOffset = offset & (BufferCapacity - 1);
if (subOffset == 0 && offset != 0)
{
int baseOffset = (offset - BufferCapacity) * sizeof(uint);
BufferHandle buffer = GetInlineIndexBuffer(renderer, baseOffset, BufferCapacity * sizeof(uint));
renderer.SetBufferData(buffer, baseOffset, MemoryMarshal.Cast<uint, byte>(_buffer));
}
_buffer[subOffset] = value;
}
/// <summary>
/// Makes sure that any pending data is submitted to the GPU before the index buffer is used.
/// </summary>
/// <param name="renderer">Host renderer</param>
private void UpdateRemaining(IRenderer renderer)
{
int offset = _inlineIndexCount;
if (offset == 0)
{
return;
}
int count = offset & (BufferCapacity - 1);
if (count == 0)
{
count = BufferCapacity;
}
int baseOffset = (offset - count) * sizeof(uint);
int length = count * sizeof(uint);
BufferHandle buffer = GetInlineIndexBuffer(renderer, baseOffset, length);
renderer.SetBufferData(buffer, baseOffset, MemoryMarshal.Cast<uint, byte>(_buffer).Slice(0, length));
}
/// <summary>
@@ -117,12 +163,13 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
/// </summary>
/// <param name="renderer">Host renderer</param>
/// <param name="offset">Offset where the data will be written</param>
/// <param name="length">Number of bytes that will be written</param>
/// <returns>Buffer handle</returns>
private BufferHandle GetInlineIndexBuffer(IRenderer renderer, int offset)
private BufferHandle GetInlineIndexBuffer(IRenderer renderer, int offset, int length)
{
// Calculate a reasonable size for the buffer that can fit all the data,
// and that also won't require frequent resizes if we need to push more data.
int size = BitUtils.AlignUp(offset + 0x10, 0x200);
int size = BitUtils.AlignUp(offset + length + 0x10, 0x200);
if (_inlineIndexBuffer == BufferHandle.Null)
{

View File

@@ -1022,13 +1022,12 @@ namespace Ryujinx.Graphics.Gpu.Image
/// This method should be used to retrieve data that was modified by the host GPU.
/// This is not cheap, avoid doing that unless strictly needed.
/// </remarks>
/// <param name="output">An output span to place the texture data into. If empty, one is generated</param>
/// <param name="output">An output span to place the texture data into</param>
/// <param name="blacklist">True if the texture should be blacklisted, false otherwise</param>
/// <param name="texture">The specific host texture to flush. Defaults to this texture</param>
/// <returns>The span containing the texture data</returns>
private ReadOnlySpan<byte> GetTextureDataFromGpu(Span<byte> output, bool blacklist, ITexture texture = null)
private void GetTextureDataFromGpu(Span<byte> output, bool blacklist, ITexture texture = null)
{
ReadOnlySpan<byte> data;
PinnedSpan<byte> data;
if (texture != null)
{
@@ -1054,9 +1053,9 @@ namespace Ryujinx.Graphics.Gpu.Image
}
}
data = ConvertFromHostCompatibleFormat(output, data);
ConvertFromHostCompatibleFormat(output, data.Get());
return data;
data.Dispose();
}
/// <summary>
@@ -1071,10 +1070,9 @@ namespace Ryujinx.Graphics.Gpu.Image
/// <param name="level">The level of the texture to flush</param>
/// <param name="blacklist">True if the texture should be blacklisted, false otherwise</param>
/// <param name="texture">The specific host texture to flush. Defaults to this texture</param>
/// <returns>The span containing the texture data</returns>
public ReadOnlySpan<byte> GetTextureDataSliceFromGpu(Span<byte> output, int layer, int level, bool blacklist, ITexture texture = null)
public void GetTextureDataSliceFromGpu(Span<byte> output, int layer, int level, bool blacklist, ITexture texture = null)
{
ReadOnlySpan<byte> data;
PinnedSpan<byte> data;
if (texture != null)
{
@@ -1100,9 +1098,9 @@ namespace Ryujinx.Graphics.Gpu.Image
}
}
data = ConvertFromHostCompatibleFormat(output, data, level, true);
ConvertFromHostCompatibleFormat(output, data.Get(), level, true);
return data;
data.Dispose();
}
/// <summary>
@@ -1475,8 +1473,8 @@ namespace Ryujinx.Graphics.Gpu.Image
MultiRange otherRange = texture.Range;
IEnumerable<MultiRange> regions = _sizeInfo.AllRegions().Select((region) => Range.GetSlice((ulong)region.Offset, (ulong)region.Size));
IEnumerable<MultiRange> otherRegions = texture._sizeInfo.AllRegions().Select((region) => otherRange.GetSlice((ulong)region.Offset, (ulong)region.Size));
IEnumerable<MultiRange> regions = _sizeInfo.AllRegions().Select((region) => Range.Slice((ulong)region.Offset, (ulong)region.Size));
IEnumerable<MultiRange> otherRegions = texture._sizeInfo.AllRegions().Select((region) => otherRange.Slice((ulong)region.Offset, (ulong)region.Size));
foreach (MultiRange region in regions)
{

View File

@@ -397,7 +397,7 @@ namespace Ryujinx.Graphics.Gpu.Image
int endOffset = Math.Min(spanLast + _sliceSizes[endInfo.BaseLevel + endInfo.Levels - 1], (int)Storage.Size);
int size = endOffset - spanBase;
dataSpan = _physicalMemory.GetSpan(Storage.Range.GetSlice((ulong)spanBase, (ulong)size));
dataSpan = _physicalMemory.GetSpan(Storage.Range.Slice((ulong)spanBase, (ulong)size));
}
// Only one of these will be greater than 1, as partial sync is only called when there are sub-image views.
@@ -473,7 +473,7 @@ namespace Ryujinx.Graphics.Gpu.Image
int endOffset = Math.Min(offset + _sliceSizes[level], (int)Storage.Size);
int size = endOffset - offset;
using WritableRegion region = _physicalMemory.GetWritableRegion(Storage.Range.GetSlice((ulong)offset, (ulong)size), tracked);
using WritableRegion region = _physicalMemory.GetWritableRegion(Storage.Range.Slice((ulong)offset, (ulong)size), tracked);
Storage.GetTextureDataSliceFromGpu(region.Memory.Span, layer, level, tracked, texture);
}
@@ -1419,13 +1419,13 @@ namespace Ryujinx.Graphics.Gpu.Image
for (int i = 0; i < _allOffsets.Length; i++)
{
(int layer, int level) = GetLayerLevelForView(i);
MultiRange handleRange = Storage.Range.GetSlice((ulong)_allOffsets[i], 1);
MultiRange handleRange = Storage.Range.Slice((ulong)_allOffsets[i], 1);
ulong handleBase = handleRange.GetSubRange(0).Address;
for (int j = 0; j < other._handles.Length; j++)
{
(int otherLayer, int otherLevel) = other.GetLayerLevelForView(j);
MultiRange otherHandleRange = other.Storage.Range.GetSlice((ulong)other._allOffsets[j], 1);
MultiRange otherHandleRange = other.Storage.Range.Slice((ulong)other._allOffsets[j], 1);
ulong otherHandleBase = otherHandleRange.GetSubRange(0).Address;
if (handleBase == otherHandleBase)
@@ -1502,7 +1502,7 @@ namespace Ryujinx.Graphics.Gpu.Image
// Handles list is not modified by another thread, only replaced, so this is thread safe.
// Remove modified flags from all overlapping handles, so that the textures don't flush to unmapped/remapped GPU memory.
MultiRange subRange = Storage.Range.GetSlice((ulong)handle.Offset, (ulong)handle.Size);
MultiRange subRange = Storage.Range.Slice((ulong)handle.Offset, (ulong)handle.Size);
if (range.OverlapsWith(subRange))
{

View File

@@ -130,6 +130,10 @@ namespace Ryujinx.Graphics.Gpu.Image
return ref descriptor;
}
}
else
{
texture.SynchronizeMemory();
}
Items[id] = texture;
@@ -233,7 +237,7 @@ namespace Ryujinx.Graphics.Gpu.Image
}
/// <summary>
/// Queues a request to update a texture's mapping.
/// Queues a request to update a texture's mapping.
/// Mapping is updated later to avoid deleting the texture if it is still sparsely mapped.
/// </summary>
/// <param name="texture">Texture with potential mapping change</param>

View File

@@ -82,7 +82,7 @@ namespace Ryujinx.Graphics.Gpu.Memory
Address = address;
Size = size;
Handle = context.Renderer.CreateBuffer((int)size);
Handle = context.Renderer.CreateBuffer((int)size, baseBuffers?.MaxBy(x => x.Size).Handle ?? BufferHandle.Null);
_useGranular = size > GranularBufferThreshold;
@@ -415,10 +415,10 @@ namespace Ryujinx.Graphics.Gpu.Memory
{
int offset = (int)(address - Address);
ReadOnlySpan<byte> data = _context.Renderer.GetBufferData(Handle, offset, (int)size);
using PinnedSpan<byte> data = _context.Renderer.GetBufferData(Handle, offset, (int)size);
// TODO: When write tracking shaders, they will need to be aware of changes in overlapping buffers.
_physicalMemory.WriteUntracked(address, data);
_physicalMemory.WriteUntracked(address, data.Get());
}
/// <summary>

View File

@@ -55,11 +55,14 @@ namespace Ryujinx.Graphics.OpenGL
(IntPtr)size);
}
public static unsafe ReadOnlySpan<byte> GetData(OpenGLRenderer renderer, BufferHandle buffer, int offset, int size)
public static unsafe PinnedSpan<byte> GetData(OpenGLRenderer renderer, BufferHandle buffer, int offset, int size)
{
// Data in the persistent buffer and host array is guaranteed to be available
// until the next time the host thread requests data.
if (HwCapabilities.UsePersistentBufferForFlush)
{
return renderer.PersistentBuffers.Default.GetBufferData(buffer, offset, size);
return PinnedSpan<byte>.UnsafeFromSpan(renderer.PersistentBuffers.Default.GetBufferData(buffer, offset, size));
}
else
{
@@ -69,7 +72,7 @@ namespace Ryujinx.Graphics.OpenGL
GL.GetBufferSubData(BufferTarget.CopyReadBuffer, (IntPtr)offset, size, target);
return new ReadOnlySpan<byte>(target.ToPointer(), size);
return new PinnedSpan<byte>(target.ToPointer(), size);
}
}

View File

@@ -39,12 +39,12 @@ namespace Ryujinx.Graphics.OpenGL.Image
throw new NotSupportedException();
}
public ReadOnlySpan<byte> GetData()
public PinnedSpan<byte> GetData()
{
return Buffer.GetData(_renderer, _buffer, _bufferOffset, _bufferSize);
}
public ReadOnlySpan<byte> GetData(int layer, int level)
public PinnedSpan<byte> GetData(int layer, int level)
{
return GetData();
}

View File

@@ -166,7 +166,7 @@ namespace Ryujinx.Graphics.OpenGL.Image
_renderer.TextureCopy.Copy(this, (TextureView)destination, srcRegion, dstRegion, linearFilter);
}
public unsafe ReadOnlySpan<byte> GetData()
public unsafe PinnedSpan<byte> GetData()
{
int size = 0;
int levels = Info.GetLevelsClamped();
@@ -196,16 +196,16 @@ namespace Ryujinx.Graphics.OpenGL.Image
data = FormatConverter.ConvertD24S8ToS8D24(data);
}
return data;
return PinnedSpan<byte>.UnsafeFromSpan(data);
}
public unsafe ReadOnlySpan<byte> GetData(int layer, int level)
public unsafe PinnedSpan<byte> GetData(int layer, int level)
{
int size = Info.GetMipSize(level);
if (HwCapabilities.UsePersistentBufferForFlush)
{
return _renderer.PersistentBuffers.Default.GetTextureData(this, size, layer, level);
return PinnedSpan<byte>.UnsafeFromSpan(_renderer.PersistentBuffers.Default.GetTextureData(this, size, layer, level));
}
else
{
@@ -213,7 +213,7 @@ namespace Ryujinx.Graphics.OpenGL.Image
int offset = WriteTo2D(target, layer, level);
return new ReadOnlySpan<byte>(target.ToPointer(), size).Slice(offset);
return new PinnedSpan<byte>((byte*)target.ToPointer() + offset, size);
}
}

View File

@@ -57,7 +57,7 @@ namespace Ryujinx.Graphics.OpenGL
ResourcePool = new ResourcePool();
}
public BufferHandle CreateBuffer(int size)
public BufferHandle CreateBuffer(int size, BufferHandle storageHint)
{
BufferCount++;
@@ -96,7 +96,7 @@ namespace Ryujinx.Graphics.OpenGL
return new HardwareInfo(GpuVendor, GpuRenderer);
}
public ReadOnlySpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
public PinnedSpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
{
return Buffer.GetData(this, buffer, offset, size);
}

View File

@@ -74,7 +74,7 @@ namespace Ryujinx.Graphics.OpenGL.Queries
{
result = Marshal.ReadInt64(_bufferMap);
return WaitingForValue(result);
return !WaitingForValue(result);
}
public long AwaitResult(AutoResetEvent wakeSignal = null)

View File

@@ -0,0 +1,12 @@
namespace Ryujinx.Graphics.Vulkan
{
internal enum BufferAllocationType
{
Auto = 0,
HostMappedNoCache,
HostMapped,
DeviceLocal,
DeviceLocalMapped
}
}

View File

@@ -1,7 +1,10 @@
using Ryujinx.Graphics.GAL;
using Ryujinx.Common.Logging;
using Ryujinx.Graphics.GAL;
using Silk.NET.Vulkan;
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using VkBuffer = Silk.NET.Vulkan.Buffer;
using VkFormat = Silk.NET.Vulkan.Format;
@@ -11,6 +14,12 @@ namespace Ryujinx.Graphics.Vulkan
{
private const int MaxUpdateBufferSize = 0x10000;
private const int SetCountThreshold = 100;
private const int WriteCountThreshold = 50;
private const int FlushCountThreshold = 5;
public const int DeviceLocalSizeThreshold = 256 * 1024; // 256kb
public const AccessFlags DefaultAccessFlags =
AccessFlags.IndirectCommandReadBit |
AccessFlags.ShaderReadBit |
@@ -21,10 +30,10 @@ namespace Ryujinx.Graphics.Vulkan
private readonly VulkanRenderer _gd;
private readonly Device _device;
private readonly MemoryAllocation _allocation;
private readonly Auto<DisposableBuffer> _buffer;
private readonly Auto<MemoryAllocation> _allocationAuto;
private readonly ulong _bufferHandle;
private MemoryAllocation _allocation;
private Auto<DisposableBuffer> _buffer;
private Auto<MemoryAllocation> _allocationAuto;
private ulong _bufferHandle;
private CacheByRange<BufferHolder> _cachedConvertedBuffers;
@@ -32,11 +41,28 @@ namespace Ryujinx.Graphics.Vulkan
private IntPtr _map;
private readonly MultiFenceHolder _waitable;
private MultiFenceHolder _waitable;
private bool _lastAccessIsWrite;
public BufferHolder(VulkanRenderer gd, Device device, VkBuffer buffer, MemoryAllocation allocation, int size)
private BufferAllocationType _baseType;
private BufferAllocationType _currentType;
private bool _swapQueued;
public BufferAllocationType DesiredType { get; private set; }
private int _setCount;
private int _writeCount;
private int _flushCount;
private int _flushTemp;
private ReaderWriterLock _flushLock;
private FenceHolder _flushFence;
private int _flushWaiting;
private List<Action> _swapActions;
public BufferHolder(VulkanRenderer gd, Device device, VkBuffer buffer, MemoryAllocation allocation, int size, BufferAllocationType type, BufferAllocationType currentType)
{
_gd = gd;
_device = device;
@@ -47,9 +73,153 @@ namespace Ryujinx.Graphics.Vulkan
_bufferHandle = buffer.Handle;
Size = size;
_map = allocation.HostPointer;
_baseType = type;
_currentType = currentType;
DesiredType = currentType;
_flushLock = new ReaderWriterLock();
}
public unsafe Auto<DisposableBufferView> CreateView(VkFormat format, int offset, int size)
public bool TryBackingSwap(ref CommandBufferScoped? cbs)
{
if (_swapQueued && DesiredType != _currentType)
{
// Only swap if the buffer is not used in any queued command buffer.
bool isRented = _buffer.HasRentedCommandBufferDependency(_gd.CommandBufferPool);
if (!isRented && _gd.CommandBufferPool.OwnedByCurrentThread && !_flushLock.IsReaderLockHeld)
{
var currentAllocation = _allocationAuto;
var currentBuffer = _buffer;
IntPtr currentMap = _map;
(VkBuffer buffer, MemoryAllocation allocation, BufferAllocationType resultType) = _gd.BufferManager.CreateBacking(_gd, Size, DesiredType, false, _currentType);
if (buffer.Handle != 0)
{
_flushLock.AcquireWriterLock(Timeout.Infinite);
ClearFlushFence();
_waitable = new MultiFenceHolder(Size);
_allocation = allocation;
_allocationAuto = new Auto<MemoryAllocation>(allocation);
_buffer = new Auto<DisposableBuffer>(new DisposableBuffer(_gd.Api, _device, buffer), _waitable, _allocationAuto);
_bufferHandle = buffer.Handle;
_map = allocation.HostPointer;
if (_map != IntPtr.Zero && currentMap != IntPtr.Zero)
{
// Copy data directly. Readbacks don't have to wait if this is done.
unsafe
{
new Span<byte>((void*)currentMap, Size).CopyTo(new Span<byte>((void*)_map, Size));
}
}
else
{
if (cbs == null)
{
cbs = _gd.CommandBufferPool.Rent();
}
CommandBufferScoped cbsV = cbs.Value;
Copy(_gd, cbsV, currentBuffer, _buffer, 0, 0, Size);
// Need to wait for the data to reach the new buffer before data can be flushed.
_flushFence = _gd.CommandBufferPool.GetFence(cbsV.CommandBufferIndex);
_flushFence.Get();
}
Logger.Debug?.PrintMsg(LogClass.Gpu, $"Converted {Size} buffer {_currentType} to {resultType}");
_currentType = resultType;
if (_swapActions != null)
{
foreach (var action in _swapActions)
{
action();
}
_swapActions.Clear();
}
currentBuffer.Dispose();
currentAllocation.Dispose();
_gd.PipelineInternal.SwapBuffer(currentBuffer, _buffer);
_flushLock.ReleaseWriterLock();
}
_swapQueued = false;
return true;
}
else
{
return false;
}
}
else
{
_swapQueued = false;
return true;
}
}
private void ConsiderBackingSwap()
{
if (_baseType == BufferAllocationType.Auto)
{
if (_writeCount >= WriteCountThreshold || _setCount >= SetCountThreshold || _flushCount >= FlushCountThreshold)
{
if (_flushCount > 0 || _flushTemp-- > 0)
{
// Buffers that flush should ideally be mapped in host address space for easy copies.
// If the buffer is large it will do better on GPU memory, as there will be more writes than data flushes (typically individual pages).
// If it is small, then it's likely most of the buffer will be flushed so we want it on host memory, as access is cached.
DesiredType = Size > DeviceLocalSizeThreshold ? BufferAllocationType.DeviceLocalMapped : BufferAllocationType.HostMapped;
// It's harder for a buffer that is flushed to revert to another type of mapping.
if (_flushCount > 0)
{
_flushTemp = 1000;
}
}
else if (_writeCount >= WriteCountThreshold)
{
// Buffers that are written often should ideally be in the device local heap. (Storage buffers)
DesiredType = BufferAllocationType.DeviceLocal;
}
else if (_setCount > SetCountThreshold)
{
// Buffers that have their data set often should ideally be host mapped. (Constant buffers)
DesiredType = BufferAllocationType.HostMapped;
}
_flushCount = 0;
_writeCount = 0;
_setCount = 0;
}
if (!_swapQueued && DesiredType != _currentType)
{
_swapQueued = true;
_gd.PipelineInternal.AddBackingSwap(this);
}
}
}
public unsafe Auto<DisposableBufferView> CreateView(VkFormat format, int offset, int size, Action invalidateView)
{
var bufferViewCreateInfo = new BufferViewCreateInfo()
{
@@ -62,9 +232,19 @@ namespace Ryujinx.Graphics.Vulkan
_gd.Api.CreateBufferView(_device, bufferViewCreateInfo, null, out var bufferView).ThrowOnError();
(_swapActions ??= new List<Action>()).Add(invalidateView);
return new Auto<DisposableBufferView>(new DisposableBufferView(_gd.Api, _device, bufferView), _waitable, _buffer);
}
public void InheritMetrics(BufferHolder other)
{
_setCount = other._setCount;
_writeCount = other._writeCount;
_flushCount = other._flushCount;
_flushTemp = other._flushTemp;
}
public unsafe void InsertBarrier(CommandBuffer commandBuffer, bool isWrite)
{
// If the last access is write, we always need a barrier to be sure we will read or modify
@@ -104,12 +284,22 @@ namespace Ryujinx.Graphics.Vulkan
return _buffer;
}
public Auto<DisposableBuffer> GetBuffer(CommandBuffer commandBuffer, bool isWrite = false)
public Auto<DisposableBuffer> GetBuffer(CommandBuffer commandBuffer, bool isWrite = false, bool isSSBO = false)
{
if (isWrite)
{
_writeCount++;
SignalWrite(0, Size);
}
else if (isSSBO)
{
// Always consider SSBO access for swapping to device local memory.
_writeCount++;
ConsiderBackingSwap();
}
return _buffer;
}
@@ -118,6 +308,8 @@ namespace Ryujinx.Graphics.Vulkan
{
if (isWrite)
{
_writeCount++;
SignalWrite(offset, size);
}
@@ -126,6 +318,8 @@ namespace Ryujinx.Graphics.Vulkan
public void SignalWrite(int offset, int size)
{
ConsiderBackingSwap();
if (offset == 0 && size == Size)
{
_cachedConvertedBuffers.Clear();
@@ -147,11 +341,76 @@ namespace Ryujinx.Graphics.Vulkan
return _map;
}
public unsafe ReadOnlySpan<byte> GetData(int offset, int size)
private void ClearFlushFence()
{
// Asusmes _flushLock is held as writer.
if (_flushFence != null)
{
if (_flushWaiting == 0)
{
_flushFence.Put();
}
_flushFence = null;
}
}
private void WaitForFlushFence()
{
// Assumes the _flushLock is held as reader, returns in same state.
if (_flushFence != null)
{
// If storage has changed, make sure the fence has been reached so that the data is in place.
var cookie = _flushLock.UpgradeToWriterLock(Timeout.Infinite);
if (_flushFence != null)
{
var fence = _flushFence;
Interlocked.Increment(ref _flushWaiting);
// Don't wait in the lock.
var restoreCookie = _flushLock.ReleaseLock();
fence.Wait();
_flushLock.RestoreLock(ref restoreCookie);
if (Interlocked.Decrement(ref _flushWaiting) == 0)
{
fence.Put();
}
_flushFence = null;
}
_flushLock.DowngradeFromWriterLock(ref cookie);
}
}
public unsafe PinnedSpan<byte> GetData(int offset, int size)
{
_flushLock.AcquireReaderLock(Timeout.Infinite);
WaitForFlushFence();
_flushCount++;
Span<byte> result;
if (_map != IntPtr.Zero)
{
return GetDataStorage(offset, size);
result = GetDataStorage(offset, size);
// Need to be careful here, the buffer can't be unmapped while the data is being used.
_buffer.IncrementReferenceCount();
_flushLock.ReleaseReaderLock();
return PinnedSpan<byte>.UnsafeFromSpan(result, _buffer.DecrementReferenceCount);
}
else
{
@@ -161,12 +420,17 @@ namespace Ryujinx.Graphics.Vulkan
{
_gd.FlushAllCommands();
return resource.GetFlushBuffer().GetBufferData(_gd.CommandBufferPool, this, offset, size);
result = resource.GetFlushBuffer().GetBufferData(_gd.CommandBufferPool, this, offset, size);
}
else
{
return resource.GetFlushBuffer().GetBufferData(resource.GetPool(), this, offset, size);
result = resource.GetFlushBuffer().GetBufferData(resource.GetPool(), this, offset, size);
}
_flushLock.ReleaseReaderLock();
// Flush buffer is pinned until the next GetBufferData on the thread, which is fine for current uses.
return PinnedSpan<byte>.UnsafeFromSpan(result);
}
}
@@ -190,6 +454,8 @@ namespace Ryujinx.Graphics.Vulkan
return;
}
_setCount++;
if (_map != IntPtr.Zero)
{
// If persistently mapped, set the data directly if the buffer is not currently in use.
@@ -268,6 +534,8 @@ namespace Ryujinx.Graphics.Vulkan
var dstBuffer = GetBuffer(cbs.CommandBuffer, dstOffset, data.Length, true).Get(cbs, dstOffset, data.Length).Value;
_writeCount--;
InsertBufferBarrier(
_gd,
cbs.CommandBuffer,
@@ -502,11 +770,19 @@ namespace Ryujinx.Graphics.Vulkan
public void Dispose()
{
_swapQueued = false;
_gd.PipelineInternal?.FlushCommandsIfWeightExceeding(_buffer, (ulong)Size);
_buffer.Dispose();
_allocationAuto.Dispose();
_cachedConvertedBuffers.Dispose();
_flushLock.AcquireWriterLock(Timeout.Infinite);
ClearFlushFence();
_flushLock.ReleaseWriterLock();
}
}
}

View File

@@ -4,6 +4,7 @@ using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using VkFormat = Silk.NET.Vulkan.Format;
using VkBuffer = Silk.NET.Vulkan.Buffer;
namespace Ryujinx.Graphics.Vulkan
{
@@ -16,17 +17,17 @@ namespace Ryujinx.Graphics.Vulkan
// Some drivers don't expose a "HostCached" memory type,
// so we need those alternative flags for the allocation to succeed there.
private const MemoryPropertyFlags DefaultBufferMemoryAltFlags =
private const MemoryPropertyFlags DefaultBufferMemoryNoCacheFlags =
MemoryPropertyFlags.HostVisibleBit |
MemoryPropertyFlags.HostCoherentBit;
private const MemoryPropertyFlags DeviceLocalBufferMemoryFlags =
MemoryPropertyFlags.DeviceLocalBit;
private const MemoryPropertyFlags FlushableDeviceLocalBufferMemoryFlags =
private const MemoryPropertyFlags DeviceLocalMappedBufferMemoryFlags =
MemoryPropertyFlags.DeviceLocalBit |
MemoryPropertyFlags.HostVisibleBit |
MemoryPropertyFlags.HostCoherentBit |
MemoryPropertyFlags.DeviceLocalBit;
MemoryPropertyFlags.HostCoherentBit;
private const BufferUsageFlags DefaultBufferUsageFlags =
BufferUsageFlags.TransferSrcBit |
@@ -54,14 +55,14 @@ namespace Ryujinx.Graphics.Vulkan
StagingBuffer = new StagingBuffer(gd, this);
}
public BufferHandle CreateWithHandle(VulkanRenderer gd, int size, bool deviceLocal)
public BufferHandle CreateWithHandle(VulkanRenderer gd, int size, BufferAllocationType baseType = BufferAllocationType.HostMapped, BufferHandle storageHint = default)
{
return CreateWithHandle(gd, size, deviceLocal, out _);
return CreateWithHandle(gd, size, out _, baseType, storageHint);
}
public BufferHandle CreateWithHandle(VulkanRenderer gd, int size, bool deviceLocal, out BufferHolder holder)
public BufferHandle CreateWithHandle(VulkanRenderer gd, int size, out BufferHolder holder, BufferAllocationType baseType = BufferAllocationType.HostMapped, BufferHandle storageHint = default)
{
holder = Create(gd, size, deviceLocal: deviceLocal);
holder = Create(gd, size, baseType: baseType, storageHint: storageHint);
if (holder == null)
{
return BufferHandle.Null;
@@ -74,7 +75,12 @@ namespace Ryujinx.Graphics.Vulkan
return Unsafe.As<ulong, BufferHandle>(ref handle64);
}
public unsafe BufferHolder Create(VulkanRenderer gd, int size, bool forConditionalRendering = false, bool deviceLocal = false)
public unsafe (VkBuffer buffer, MemoryAllocation allocation, BufferAllocationType resultType) CreateBacking(
VulkanRenderer gd,
int size,
BufferAllocationType type,
bool forConditionalRendering = false,
BufferAllocationType fallbackType = BufferAllocationType.Auto)
{
var usage = DefaultBufferUsageFlags;
@@ -98,48 +104,106 @@ namespace Ryujinx.Graphics.Vulkan
gd.Api.CreateBuffer(_device, in bufferCreateInfo, null, out var buffer).ThrowOnError();
gd.Api.GetBufferMemoryRequirements(_device, buffer, out var requirements);
MemoryPropertyFlags allocateFlags;
MemoryPropertyFlags allocateFlagsAlt;
MemoryAllocation allocation;
if (deviceLocal)
do
{
allocateFlags = DeviceLocalBufferMemoryFlags;
allocateFlagsAlt = DeviceLocalBufferMemoryFlags;
}
else
{
allocateFlags = DefaultBufferMemoryFlags;
allocateFlagsAlt = DefaultBufferMemoryAltFlags;
}
var allocateFlags = type switch
{
BufferAllocationType.HostMappedNoCache => DefaultBufferMemoryNoCacheFlags,
BufferAllocationType.HostMapped => DefaultBufferMemoryFlags,
BufferAllocationType.DeviceLocal => DeviceLocalBufferMemoryFlags,
BufferAllocationType.DeviceLocalMapped => DeviceLocalMappedBufferMemoryFlags,
_ => DefaultBufferMemoryFlags
};
var allocation = gd.MemoryAllocator.AllocateDeviceMemory(requirements, allocateFlags, allocateFlagsAlt);
// If an allocation with this memory type fails, fall back to the previous one.
try
{
allocation = gd.MemoryAllocator.AllocateDeviceMemory(requirements, allocateFlags, true);
}
catch (VulkanException)
{
allocation = default;
}
}
while (allocation.Memory.Handle == 0 && (--type != fallbackType));
if (allocation.Memory.Handle == 0UL)
{
gd.Api.DestroyBuffer(_device, buffer, null);
return null;
return default;
}
gd.Api.BindBufferMemory(_device, buffer, allocation.Memory, allocation.Offset);
return new BufferHolder(gd, _device, buffer, allocation, size);
return (buffer, allocation, type);
}
public Auto<DisposableBufferView> CreateView(BufferHandle handle, VkFormat format, int offset, int size)
public unsafe BufferHolder Create(
VulkanRenderer gd,
int size,
bool forConditionalRendering = false,
BufferAllocationType baseType = BufferAllocationType.HostMapped,
BufferHandle storageHint = default)
{
if (TryGetBuffer(handle, out var holder))
BufferAllocationType type = baseType;
BufferHolder storageHintHolder = null;
if (baseType == BufferAllocationType.Auto)
{
return holder.CreateView(format, offset, size);
if (gd.IsSharedMemory)
{
baseType = BufferAllocationType.HostMapped;
type = baseType;
}
else
{
type = size >= BufferHolder.DeviceLocalSizeThreshold ? BufferAllocationType.DeviceLocal : BufferAllocationType.HostMapped;
}
if (storageHint != BufferHandle.Null)
{
if (TryGetBuffer(storageHint, out storageHintHolder))
{
type = storageHintHolder.DesiredType;
}
}
}
(VkBuffer buffer, MemoryAllocation allocation, BufferAllocationType resultType) =
CreateBacking(gd, size, type, forConditionalRendering);
if (buffer.Handle != 0)
{
var holder = new BufferHolder(gd, _device, buffer, allocation, size, baseType, resultType);
if (storageHintHolder != null)
{
holder.InheritMetrics(storageHintHolder);
}
return holder;
}
return null;
}
public Auto<DisposableBuffer> GetBuffer(CommandBuffer commandBuffer, BufferHandle handle, bool isWrite)
public Auto<DisposableBufferView> CreateView(BufferHandle handle, VkFormat format, int offset, int size, Action invalidateView)
{
if (TryGetBuffer(handle, out var holder))
{
return holder.GetBuffer(commandBuffer, isWrite);
return holder.CreateView(format, offset, size, invalidateView);
}
return null;
}
public Auto<DisposableBuffer> GetBuffer(CommandBuffer commandBuffer, BufferHandle handle, bool isWrite, bool isSSBO = false)
{
if (TryGetBuffer(handle, out var holder))
{
return holder.GetBuffer(commandBuffer, isWrite, isSSBO);
}
return null;
@@ -332,14 +396,14 @@ namespace Ryujinx.Graphics.Vulkan
return null;
}
public ReadOnlySpan<byte> GetData(BufferHandle handle, int offset, int size)
public PinnedSpan<byte> GetData(BufferHandle handle, int offset, int size)
{
if (TryGetBuffer(handle, out var holder))
{
return holder.GetData(offset, size);
}
return ReadOnlySpan<byte>.Empty;
return new PinnedSpan<byte>();
}
public void SetData<T>(BufferHandle handle, int offset, ReadOnlySpan<T> data) where T : unmanaged

View File

@@ -2,14 +2,14 @@
namespace Ryujinx.Graphics.Vulkan
{
readonly struct BufferState : IDisposable
struct BufferState : IDisposable
{
public static BufferState Null => new BufferState(null, 0, 0);
private readonly int _offset;
private readonly int _size;
private readonly Auto<DisposableBuffer> _buffer;
private Auto<DisposableBuffer> _buffer;
public BufferState(Auto<DisposableBuffer> buffer, int offset, int size)
{
@@ -29,6 +29,17 @@ namespace Ryujinx.Graphics.Vulkan
}
}
public void Swap(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
if (_buffer == from)
{
_buffer.DecrementReferenceCount();
to.IncrementReferenceCount();
_buffer = to;
}
}
public void Dispose()
{
_buffer?.DecrementReferenceCount();

View File

@@ -94,7 +94,7 @@ namespace Ryujinx.Graphics.Vulkan
else
{
// If null descriptors are not supported, we need to pass the handle of a dummy buffer on unused bindings.
_dummyBuffer = gd.BufferManager.Create(gd, 0x10000, forConditionalRendering: false, deviceLocal: true);
_dummyBuffer = gd.BufferManager.Create(gd, 0x10000, forConditionalRendering: false, baseType: BufferAllocationType.DeviceLocal);
}
_dummyTexture = gd.CreateTextureView(new TextureCreateInfo(
@@ -178,7 +178,7 @@ namespace Ryujinx.Graphics.Vulkan
var buffer = assignment.Range;
int index = assignment.Binding;
Auto<DisposableBuffer> vkBuffer = _gd.BufferManager.GetBuffer(commandBuffer, buffer.Handle, false);
Auto<DisposableBuffer> vkBuffer = _gd.BufferManager.GetBuffer(commandBuffer, buffer.Handle, false, isSSBO: true);
ref Auto<DisposableBuffer> currentVkBuffer = ref _storageBufferRefs[index];
DescriptorBufferInfo info = new DescriptorBufferInfo()
@@ -236,7 +236,7 @@ namespace Ryujinx.Graphics.Vulkan
}
else if (texture is TextureView view)
{
view.Storage.InsertBarrier(cbs, AccessFlags.ShaderReadBit, stage.ConvertToPipelineStageFlags());
view.Storage.InsertWriteToReadBarrier(cbs, AccessFlags.ShaderReadBit, stage.ConvertToPipelineStageFlags());
_textureRefs[binding] = view.GetImageView();
_samplerRefs[binding] = ((SamplerHolder)sampler)?.GetSampler();
@@ -640,6 +640,23 @@ namespace Ryujinx.Graphics.Vulkan
Array.Clear(_storageSet);
}
private void SwapBuffer(Auto<DisposableBuffer>[] list, Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
for (int i = 0; i < list.Length; i++)
{
if (list[i] == from)
{
list[i] = to;
}
}
}
public void SwapBuffer(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
SwapBuffer(_uniformBufferRefs, from, to);
SwapBuffer(_storageBufferRefs, from, to);
}
protected virtual void Dispose(bool disposing)
{
if (disposing)

View File

@@ -156,11 +156,11 @@ namespace Ryujinx.Graphics.Vulkan.Effects
};
int rangeSize = dimensionsBuffer.Length * sizeof(float);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize, false);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize);
_renderer.BufferManager.SetData(bufferHandle, 0, dimensionsBuffer);
ReadOnlySpan<float> sharpeningBuffer = stackalloc float[] { 1.5f - (Level * 0.01f * 1.5f)};
var sharpeningBufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, sizeof(float), false);
var sharpeningBufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, sizeof(float));
_renderer.BufferManager.SetData(sharpeningBufferHandle, 0, sharpeningBuffer);
int threadGroupWorkRegionDim = 16;

View File

@@ -87,7 +87,7 @@ namespace Ryujinx.Graphics.Vulkan.Effects
ReadOnlySpan<float> resolutionBuffer = stackalloc float[] { view.Width, view.Height };
int rangeSize = resolutionBuffer.Length * sizeof(float);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize, false);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize);
_renderer.BufferManager.SetData(bufferHandle, 0, resolutionBuffer);

View File

@@ -266,7 +266,7 @@ namespace Ryujinx.Graphics.Vulkan.Effects
ReadOnlySpan<float> resolutionBuffer = stackalloc float[] { view.Width, view.Height };
int rangeSize = resolutionBuffer.Length * sizeof(float);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize, false);
var bufferHandle = _renderer.BufferManager.CreateWithHandle(_renderer, rangeSize);
_renderer.BufferManager.SetData(bufferHandle, 0, resolutionBuffer);
var bufferRanges = new BufferRange(bufferHandle, 0, rangeSize);

View File

@@ -322,7 +322,7 @@ namespace Ryujinx.Graphics.Vulkan
GAL.Format.S8Uint => ImageAspectFlags.StencilBit,
GAL.Format.D24UnormS8Uint or
GAL.Format.D32FloatS8Uint or
GAL.Format.S8UintD24Unorm => ImageAspectFlags.DepthBit | ImageAspectFlags.StencilBit,
GAL.Format.S8UintD24Unorm => ImageAspectFlags.DepthBit | ImageAspectFlags.StencilBit,
_ => ImageAspectFlags.ColorBit
};
}

View File

@@ -218,5 +218,23 @@ namespace Ryujinx.Graphics.Vulkan
AccessFlags.DepthStencilAttachmentWriteBit,
PipelineStageFlags.ColorAttachmentOutputBit);
}
public void InsertClearBarrier(CommandBufferScoped cbs, int index)
{
if (_colors != null)
{
int realIndex = Array.IndexOf(AttachmentIndices, index);
if (realIndex != -1)
{
_colors[realIndex].Storage?.InsertReadToWriteBarrier(cbs, AccessFlags.ColorAttachmentWriteBit, PipelineStageFlags.ColorAttachmentOutputBit);
}
}
}
public void InsertClearBarrierDS(CommandBufferScoped cbs)
{
_depthStencil?.Storage?.InsertReadToWriteBarrier(cbs, AccessFlags.DepthStencilAttachmentWriteBit, PipelineStageFlags.EarlyFragmentTestsBit);
}
}
}

View File

@@ -394,7 +394,7 @@ namespace Ryujinx.Graphics.Vulkan
(region[2], region[3]) = (region[3], region[2]);
}
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize);
gd.BufferManager.SetData<float>(bufferHandle, 0, region);
@@ -495,7 +495,7 @@ namespace Ryujinx.Graphics.Vulkan
(region[2], region[3]) = (region[3], region[2]);
}
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize);
gd.BufferManager.SetData<float>(bufferHandle, 0, region);
@@ -649,7 +649,7 @@ namespace Ryujinx.Graphics.Vulkan
_pipeline.SetCommandBuffer(cbs);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ClearColorBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ClearColorBufferSize);
gd.BufferManager.SetData<float>(bufferHandle, 0, clearColor);
@@ -726,7 +726,7 @@ namespace Ryujinx.Graphics.Vulkan
(region[2], region[3]) = (region[3], region[2]);
}
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, RegionBufferSize);
gd.BufferManager.SetData<float>(bufferHandle, 0, region);
@@ -802,7 +802,7 @@ namespace Ryujinx.Graphics.Vulkan
shaderParams[2] = size;
shaderParams[3] = srcOffset;
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize);
gd.BufferManager.SetData<int>(bufferHandle, 0, shaderParams);
@@ -958,7 +958,7 @@ namespace Ryujinx.Graphics.Vulkan
shaderParams[0] = BitOperations.Log2((uint)ratio);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize);
gd.BufferManager.SetData<int>(bufferHandle, 0, shaderParams);
@@ -1050,7 +1050,7 @@ namespace Ryujinx.Graphics.Vulkan
(shaderParams[0], shaderParams[1]) = GetSampleCountXYLog2(samples);
(shaderParams[2], shaderParams[3]) = GetSampleCountXYLog2((int)TextureStorage.ConvertToSampleCountFlags(gd.Capabilities.SupportedSampleCounts, (uint)samples));
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize);
gd.BufferManager.SetData<int>(bufferHandle, 0, shaderParams);
@@ -1133,7 +1133,7 @@ namespace Ryujinx.Graphics.Vulkan
(shaderParams[0], shaderParams[1]) = GetSampleCountXYLog2(samples);
(shaderParams[2], shaderParams[3]) = GetSampleCountXYLog2((int)TextureStorage.ConvertToSampleCountFlags(gd.Capabilities.SupportedSampleCounts, (uint)samples));
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false);
var bufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize);
gd.BufferManager.SetData<int>(bufferHandle, 0, shaderParams);
@@ -1407,7 +1407,7 @@ namespace Ryujinx.Graphics.Vulkan
pattern.OffsetIndex.CopyTo(shaderParams.Slice(0, pattern.OffsetIndex.Length));
var patternBufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, false, out var patternBuffer);
var patternBufferHandle = gd.BufferManager.CreateWithHandle(gd, ParamsBufferSize, out var patternBuffer);
var patternBufferAuto = patternBuffer.GetBuffer();
gd.BufferManager.SetData<int>(patternBufferHandle, 0, shaderParams);

View File

@@ -82,7 +82,7 @@ namespace Ryujinx.Graphics.Vulkan
}
// Expand the repeating pattern to the number of requested primitives.
BufferHandle newBuffer = _gd.CreateBuffer(expectedSize * sizeof(int));
BufferHandle newBuffer = _gd.BufferManager.CreateWithHandle(_gd, expectedSize * sizeof(int));
// Copy the old data to the new one.
if (_repeatingBuffer != BufferHandle.Null)

View File

@@ -146,5 +146,16 @@ namespace Ryujinx.Graphics.Vulkan
{
return _buffer == buffer;
}
public void Swap(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
if (_buffer == from)
{
_buffer.DecrementReferenceCount();
to.IncrementReferenceCount();
_buffer = to;
}
}
}
}

View File

@@ -28,32 +28,25 @@ namespace Ryujinx.Graphics.Vulkan
public MemoryAllocation AllocateDeviceMemory(
MemoryRequirements requirements,
MemoryPropertyFlags flags = 0)
MemoryPropertyFlags flags = 0,
bool isBuffer = false)
{
return AllocateDeviceMemory(requirements, flags, flags);
}
public MemoryAllocation AllocateDeviceMemory(
MemoryRequirements requirements,
MemoryPropertyFlags flags,
MemoryPropertyFlags alternativeFlags)
{
int memoryTypeIndex = FindSuitableMemoryTypeIndex(requirements.MemoryTypeBits, flags, alternativeFlags);
int memoryTypeIndex = FindSuitableMemoryTypeIndex(requirements.MemoryTypeBits, flags);
if (memoryTypeIndex < 0)
{
return default;
}
bool map = flags.HasFlag(MemoryPropertyFlags.HostVisibleBit);
return Allocate(memoryTypeIndex, requirements.Size, requirements.Alignment, map);
return Allocate(memoryTypeIndex, requirements.Size, requirements.Alignment, map, isBuffer);
}
private MemoryAllocation Allocate(int memoryTypeIndex, ulong size, ulong alignment, bool map)
private MemoryAllocation Allocate(int memoryTypeIndex, ulong size, ulong alignment, bool map, bool isBuffer)
{
for (int i = 0; i < _blockLists.Count; i++)
{
var bl = _blockLists[i];
if (bl.MemoryTypeIndex == memoryTypeIndex)
if (bl.MemoryTypeIndex == memoryTypeIndex && bl.ForBuffer == isBuffer)
{
lock (bl)
{
@@ -62,18 +55,15 @@ namespace Ryujinx.Graphics.Vulkan
}
}
var newBl = new MemoryAllocatorBlockList(_api, _device, memoryTypeIndex, _blockAlignment);
var newBl = new MemoryAllocatorBlockList(_api, _device, memoryTypeIndex, _blockAlignment, isBuffer);
_blockLists.Add(newBl);
return newBl.Allocate(size, alignment, map);
}
private int FindSuitableMemoryTypeIndex(
uint memoryTypeBits,
MemoryPropertyFlags flags,
MemoryPropertyFlags alternativeFlags)
MemoryPropertyFlags flags)
{
int bestCandidateIndex = -1;
for (int i = 0; i < _physicalDeviceMemoryProperties.MemoryTypeCount; i++)
{
var type = _physicalDeviceMemoryProperties.MemoryTypes[i];
@@ -84,14 +74,27 @@ namespace Ryujinx.Graphics.Vulkan
{
return i;
}
else if (type.PropertyFlags.HasFlag(alternativeFlags))
{
bestCandidateIndex = i;
}
}
}
return bestCandidateIndex;
return -1;
}
public static bool IsDeviceMemoryShared(Vk api, PhysicalDevice physicalDevice)
{
// The device is regarded as having shared memory if all heaps have the device local bit.
api.GetPhysicalDeviceMemoryProperties(physicalDevice, out var properties);
for (int i = 0; i < properties.MemoryHeapCount; i++)
{
if (!properties.MemoryHeaps[i].Flags.HasFlag(MemoryHeapFlags.DeviceLocalBit))
{
return false;
}
}
return true;
}
public void Dispose()

View File

@@ -162,15 +162,17 @@ namespace Ryujinx.Graphics.Vulkan
private readonly Device _device;
public int MemoryTypeIndex { get; }
public bool ForBuffer { get; }
private readonly int _blockAlignment;
public MemoryAllocatorBlockList(Vk api, Device device, int memoryTypeIndex, int blockAlignment)
public MemoryAllocatorBlockList(Vk api, Device device, int memoryTypeIndex, int blockAlignment, bool forBuffer)
{
_blocks = new List<Block>();
_api = api;
_device = device;
MemoryTypeIndex = memoryTypeIndex;
ForBuffer = forBuffer;
_blockAlignment = blockAlignment;
}

View File

@@ -226,6 +226,8 @@ namespace Ryujinx.Graphics.Vulkan
var attachment = new ClearAttachment(ImageAspectFlags.ColorBit, (uint)index, clearValue);
var clearRect = FramebufferParams.GetClearRect(ClearScissor, layer, layerCount);
FramebufferParams.InsertClearBarrier(Cbs, index);
Gd.Api.CmdClearAttachments(CommandBuffer, 1, &attachment, 1, &clearRect);
}
@@ -256,6 +258,8 @@ namespace Ryujinx.Graphics.Vulkan
var attachment = new ClearAttachment(flags, 0, clearValue);
var clearRect = FramebufferParams.GetClearRect(ClearScissor, layer, layerCount);
FramebufferParams.InsertClearBarrierDS(Cbs);
Gd.Api.CmdClearAttachments(CommandBuffer, 1, &attachment, 1, &clearRect);
}
@@ -1297,6 +1301,25 @@ namespace Ryujinx.Graphics.Vulkan
SignalStateChange();
}
public void SwapBuffer(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
_indexBuffer.Swap(from, to);
for (int i = 0; i < _vertexBuffers.Length; i++)
{
_vertexBuffers[i].Swap(from, to);
}
for (int i = 0; i < _transformFeedbackBuffers.Length; i++)
{
_transformFeedbackBuffers[i].Swap(from, to);
}
_descriptorSetUpdater.SwapBuffer(from, to);
SignalCommandBufferChange();
}
public unsafe void TextureBarrier()
{
MemoryBarrier memoryBarrier = new MemoryBarrier()

View File

@@ -17,10 +17,13 @@ namespace Ryujinx.Graphics.Vulkan
private ulong _byteWeight;
private List<BufferHolder> _backingSwaps;
public PipelineFull(VulkanRenderer gd, Device device) : base(gd, device)
{
_activeQueries = new List<(QueryPool, bool)>();
_pendingQueryCopies = new();
_backingSwaps = new();
CommandBuffer = (Cbs = gd.CommandBufferPool.Rent()).CommandBuffer;
}
@@ -185,6 +188,20 @@ namespace Ryujinx.Graphics.Vulkan
}
}
private void TryBackingSwaps()
{
CommandBufferScoped? cbs = null;
_backingSwaps.RemoveAll((holder) => holder.TryBackingSwap(ref cbs));
cbs?.Dispose();
}
public void AddBackingSwap(BufferHolder holder)
{
_backingSwaps.Add(holder);
}
public void Restore()
{
if (Pipeline != null)
@@ -230,6 +247,8 @@ namespace Ryujinx.Graphics.Vulkan
Gd.ResetCounterPool();
TryBackingSwaps();
Restore();
}

View File

@@ -57,12 +57,12 @@ namespace Ryujinx.Graphics.Vulkan
throw new NotSupportedException();
}
public ReadOnlySpan<byte> GetData()
public PinnedSpan<byte> GetData()
{
return _gd.GetBufferData(_bufferHandle, _offset, _size);
}
public ReadOnlySpan<byte> GetData(int layer, int level)
public PinnedSpan<byte> GetData(int layer, int level)
{
return GetData();
}
@@ -128,7 +128,7 @@ namespace Ryujinx.Graphics.Vulkan
{
if (_bufferView == null)
{
_bufferView = _gd.BufferManager.CreateView(_bufferHandle, VkFormat, _offset, _size);
_bufferView = _gd.BufferManager.CreateView(_bufferHandle, VkFormat, _offset, _size, ReleaseImpl);
}
return _bufferView?.Get(cbs, _offset, _size).Value ?? default;
@@ -147,7 +147,7 @@ namespace Ryujinx.Graphics.Vulkan
return bufferView.Get(cbs, _offset, _size).Value;
}
bufferView = _gd.BufferManager.CreateView(_bufferHandle, vkFormat, _offset, _size);
bufferView = _gd.BufferManager.CreateView(_bufferHandle, vkFormat, _offset, _size, ReleaseImpl);
if (bufferView != null)
{

View File

@@ -46,6 +46,8 @@ namespace Ryujinx.Graphics.Vulkan
private AccessFlags _lastModificationAccess;
private PipelineStageFlags _lastModificationStage;
private AccessFlags _lastReadAccess;
private PipelineStageFlags _lastReadStage;
private int _viewsCount;
private ulong _size;
@@ -440,31 +442,39 @@ namespace Ryujinx.Graphics.Vulkan
_lastModificationStage = stage;
}
public void InsertBarrier(CommandBufferScoped cbs, AccessFlags dstAccessFlags, PipelineStageFlags dstStageFlags)
public void InsertReadToWriteBarrier(CommandBufferScoped cbs, AccessFlags dstAccessFlags, PipelineStageFlags dstStageFlags)
{
if (_lastReadAccess != AccessFlags.NoneKhr)
{
ImageAspectFlags aspectFlags = Info.Format.ConvertAspectFlags();
TextureView.InsertImageBarrier(
_gd.Api,
cbs.CommandBuffer,
_imageAuto.Get(cbs).Value,
_lastReadAccess,
dstAccessFlags,
_lastReadStage,
dstStageFlags,
aspectFlags,
0,
0,
_info.GetLayers(),
_info.Levels);
_lastReadAccess = AccessFlags.NoneKhr;
_lastReadStage = PipelineStageFlags.None;
}
}
public void InsertWriteToReadBarrier(CommandBufferScoped cbs, AccessFlags dstAccessFlags, PipelineStageFlags dstStageFlags)
{
_lastReadAccess |= dstAccessFlags;
_lastReadStage |= dstStageFlags;
if (_lastModificationAccess != AccessFlags.NoneKhr)
{
ImageAspectFlags aspectFlags;
if (_info.Format.IsDepthOrStencil())
{
if (_info.Format == GAL.Format.S8Uint)
{
aspectFlags = ImageAspectFlags.StencilBit;
}
else if (_info.Format == GAL.Format.D16Unorm || _info.Format == GAL.Format.D32Float)
{
aspectFlags = ImageAspectFlags.DepthBit;
}
else
{
aspectFlags = ImageAspectFlags.DepthBit | ImageAspectFlags.StencilBit;
}
}
else
{
aspectFlags = ImageAspectFlags.ColorBit;
}
ImageAspectFlags aspectFlags = Info.Format.ConvertAspectFlags();
TextureView.InsertImageBarrier(
_gd.Api,

View File

@@ -531,7 +531,7 @@ namespace Ryujinx.Graphics.Vulkan
return bitmap;
}
public ReadOnlySpan<byte> GetData()
public PinnedSpan<byte> GetData()
{
BackgroundResource resources = _gd.BackgroundResources.Get();
@@ -539,15 +539,15 @@ namespace Ryujinx.Graphics.Vulkan
{
_gd.FlushAllCommands();
return GetData(_gd.CommandBufferPool, resources.GetFlushBuffer());
return PinnedSpan<byte>.UnsafeFromSpan(GetData(_gd.CommandBufferPool, resources.GetFlushBuffer()));
}
else
{
return GetData(resources.GetPool(), resources.GetFlushBuffer());
return PinnedSpan<byte>.UnsafeFromSpan(GetData(resources.GetPool(), resources.GetFlushBuffer()));
}
}
public ReadOnlySpan<byte> GetData(int layer, int level)
public PinnedSpan<byte> GetData(int layer, int level)
{
BackgroundResource resources = _gd.BackgroundResources.Get();
@@ -555,11 +555,11 @@ namespace Ryujinx.Graphics.Vulkan
{
_gd.FlushAllCommands();
return GetData(_gd.CommandBufferPool, resources.GetFlushBuffer(), layer, level);
return PinnedSpan<byte>.UnsafeFromSpan(GetData(_gd.CommandBufferPool, resources.GetFlushBuffer(), layer, level));
}
else
{
return GetData(resources.GetPool(), resources.GetFlushBuffer(), layer, level);
return PinnedSpan<byte>.UnsafeFromSpan(GetData(resources.GetPool(), resources.GetFlushBuffer(), layer, level));
}
}

View File

@@ -129,6 +129,17 @@ namespace Ryujinx.Graphics.Vulkan
return _buffer == buffer;
}
public void Swap(Auto<DisposableBuffer> from, Auto<DisposableBuffer> to)
{
if (_buffer == from)
{
_buffer.DecrementReferenceCount();
to.IncrementReferenceCount();
_buffer = to;
}
}
public void Dispose()
{
// Only dispose if this buffer is not refetched on each bind.

View File

@@ -20,7 +20,7 @@ namespace Ryujinx.Graphics.Vulkan
private const string AppName = "Ryujinx.Graphics.Vulkan";
private const int QueuesCount = 2;
public static string[] DesirableExtensions { get; } = new string[]
private static readonly string[] _desirableExtensions = new string[]
{
ExtConditionalRendering.ExtensionName,
ExtExtendedDynamicState.ExtensionName,
@@ -42,7 +42,7 @@ namespace Ryujinx.Graphics.Vulkan
"VK_KHR_portability_subset", // By spec, we should enable this if present.
};
public static string[] RequiredExtensions { get; } = new string[]
private static readonly string[] _requiredExtensions = new string[]
{
KhrSwapchain.ExtensionName
};
@@ -337,14 +337,14 @@ namespace Ryujinx.Graphics.Vulkan
{
string extensionName = Marshal.PtrToStringAnsi((IntPtr)pExtensionProperties[i].ExtensionName);
if (RequiredExtensions.Contains(extensionName))
if (_requiredExtensions.Contains(extensionName))
{
extensionMatches++;
}
}
}
return extensionMatches == RequiredExtensions.Length && FindSuitableQueueFamily(api, physicalDevice, surface, out _) != InvalidIndex;
return extensionMatches == _requiredExtensions.Length && FindSuitableQueueFamily(api, physicalDevice, surface, out _) != InvalidIndex;
}
internal static uint FindSuitableQueueFamily(Vk api, PhysicalDevice physicalDevice, SurfaceKHR surface, out uint queueCount)
@@ -626,7 +626,7 @@ namespace Ryujinx.Graphics.Vulkan
pExtendedFeatures = &featuresCustomBorderColor;
}
var enabledExtensions = RequiredExtensions.Union(DesirableExtensions.Intersect(supportedExtensions)).ToArray();
var enabledExtensions = _requiredExtensions.Union(_desirableExtensions.Intersect(supportedExtensions)).ToArray();
IntPtr* ppEnabledExtensions = stackalloc IntPtr[enabledExtensions.Length];
@@ -672,11 +672,6 @@ namespace Ryujinx.Graphics.Vulkan
return extensionProperties.Select(x => Marshal.PtrToStringAnsi((IntPtr)x.ExtensionName)).ToArray();
}
internal static CommandBufferPool CreateCommandBufferPool(Vk api, Device device, Queue queue, object queueLock, uint queueFamilyIndex)
{
return new CommandBufferPool(api, device, queue, queueLock, queueFamilyIndex);
}
internal unsafe static void CreateDebugMessenger(
Vk api,
GraphicsDebugLevel logLevel,

View File

@@ -80,6 +80,7 @@ namespace Ryujinx.Graphics.Vulkan
internal bool IsAmdGcn { get; private set; }
internal bool IsMoltenVk { get; private set; }
internal bool IsTBDR { get; private set; }
internal bool IsSharedMemory { get; private set; }
public string GpuVendor { get; private set; }
public string GpuRenderer { get; private set; }
public string GpuVersion { get; private set; }
@@ -167,7 +168,9 @@ namespace Ryujinx.Graphics.Vulkan
SType = StructureType.PhysicalDeviceSubgroupSizeControlPropertiesExt
};
if (Capabilities.SupportsSubgroupSizeControl)
bool supportsSubgroupSizeControl = supportedExtensions.Contains("VK_EXT_subgroup_size_control");
if (supportsSubgroupSizeControl)
{
properties2.PNext = &propertiesSubgroupSizeControl;
}
@@ -291,7 +294,7 @@ namespace Ryujinx.Graphics.Vulkan
supportedExtensions.Contains(KhrDrawIndirectCount.ExtensionName),
supportedExtensions.Contains("VK_EXT_fragment_shader_interlock"),
supportedExtensions.Contains("VK_NV_geometry_shader_passthrough"),
supportedExtensions.Contains("VK_EXT_subgroup_size_control"),
supportsSubgroupSizeControl,
featuresShaderInt8.ShaderInt8,
supportedExtensions.Contains("VK_EXT_shader_stencil_export"),
supportedExtensions.Contains(ExtConditionalRendering.ExtensionName),
@@ -313,9 +316,11 @@ namespace Ryujinx.Graphics.Vulkan
portabilityFlags,
vertexBufferAlignment);
IsSharedMemory = MemoryAllocator.IsDeviceMemoryShared(Api, _physicalDevice);
MemoryAllocator = new MemoryAllocator(Api, _physicalDevice, _device, properties.Limits.MaxMemoryAllocationCount);
CommandBufferPool = VulkanInitialization.CreateCommandBufferPool(Api, _device, Queue, QueueLock, queueFamilyIndex);
CommandBufferPool = new CommandBufferPool(Api, _device, Queue, QueueLock, queueFamilyIndex);
DescriptorSetManager = new DescriptorSetManager(_device);
@@ -373,9 +378,9 @@ namespace Ryujinx.Graphics.Vulkan
_initialized = true;
}
public BufferHandle CreateBuffer(int size)
public BufferHandle CreateBuffer(int size, BufferHandle storageHint)
{
return BufferManager.CreateWithHandle(this, size, false);
return BufferManager.CreateWithHandle(this, size, BufferAllocationType.Auto, storageHint);
}
public IProgram CreateProgram(ShaderSource[] sources, ShaderInfo info)
@@ -439,7 +444,7 @@ namespace Ryujinx.Graphics.Vulkan
_syncManager.RegisterFlush();
}
public ReadOnlySpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
public PinnedSpan<byte> GetBufferData(BufferHandle buffer, int offset, int size)
{
return BufferManager.GetData(buffer, offset, size);
}

View File

@@ -14,6 +14,9 @@ namespace Ryujinx.HLE.HOS.Services.Nim
// CreateServerInterface(pid, handle<unknown>, u64) -> object<nn::ec::IShopServiceAccessServer>
public ResultCode CreateServerInterface(ServiceCtx context)
{
// Close transfer memory immediately as we don't use it.
context.Device.System.KernelContext.Syscall.CloseHandle(context.Request.HandleDesc.ToCopy[0]);
MakeObject(context, new IShopServiceAccessServer());
Logger.Stub?.PrintStub(LogClass.ServiceNim);

View File

@@ -32,6 +32,8 @@ namespace Ryujinx.HLE.HOS.Services
0x01007FFF
};
private readonly object _handleLock = new();
private readonly KernelContext _context;
private KProcess _selfProcess;
@@ -77,7 +79,10 @@ namespace Ryujinx.HLE.HOS.Services
private void AddPort(int serverPortHandle, Func<IpcService> objectFactory)
{
_portHandles.Add(serverPortHandle);
lock (_handleLock)
{
_portHandles.Add(serverPortHandle);
}
_ports.Add(serverPortHandle, objectFactory);
}
@@ -92,7 +97,10 @@ namespace Ryujinx.HLE.HOS.Services
public void AddSessionObj(int serverSessionHandle, IpcService obj)
{
_sessionHandles.Add(serverSessionHandle);
lock (_handleLock)
{
_sessionHandles.Add(serverSessionHandle);
}
_sessions.Add(serverSessionHandle, obj);
}
@@ -126,12 +134,20 @@ namespace Ryujinx.HLE.HOS.Services
while (true)
{
int handleCount = _portHandles.Count + _sessionHandles.Count;
int handleCount;
int portHandleCount;
int[] handles;
int[] handles = ArrayPool<int>.Shared.Rent(handleCount);
_portHandles.CopyTo(handles, 0);
_sessionHandles.CopyTo(handles, _portHandles.Count);
lock (_handleLock)
{
portHandleCount = _portHandles.Count;
handleCount = portHandleCount + _sessionHandles.Count;
handles = ArrayPool<int>.Shared.Rent(handleCount);
_portHandles.CopyTo(handles, 0);
_sessionHandles.CopyTo(handles, portHandleCount);
}
// We still need a timeout here to allow the service to pick up and listen new sessions...
var rc = _context.Syscall.ReplyAndReceive(out int signaledIndex, handles.AsSpan(0, handleCount), replyTargetHandle, 1000000L);
@@ -145,7 +161,7 @@ namespace Ryujinx.HLE.HOS.Services
replyTargetHandle = 0;
if (rc == Result.Success && signaledIndex >= _portHandles.Count)
if (rc == Result.Success && signaledIndex >= portHandleCount)
{
// We got a IPC request, process it, pass to the appropriate service if needed.
int signaledHandle = handles[signaledIndex];
@@ -278,7 +294,10 @@ namespace Ryujinx.HLE.HOS.Services
else if (request.Type == IpcMessageType.HipcCloseSession || request.Type == IpcMessageType.TipcCloseSession)
{
_context.Syscall.CloseHandle(serverSessionHandle);
_sessionHandles.Remove(serverSessionHandle);
lock (_handleLock)
{
_sessionHandles.Remove(serverSessionHandle);
}
IpcService service = _sessions[serverSessionHandle];
(service as IDisposable)?.Dispose();
_sessions.Remove(serverSessionHandle);

View File

@@ -20,16 +20,6 @@ namespace Ryujinx.Memory.Range
/// </summary>
public int Count => HasSingleRange ? 1 : _ranges.Length;
/// <summary>
/// Minimum start address of all sub-ranges.
/// </summary>
public ulong MinAddress { get; }
/// <summary>
/// Maximum end address of all sub-ranges.
/// </summary>
public ulong MaxAddress { get; }
/// <summary>
/// Creates a new multi-range with a single physical region.
/// </summary>
@@ -39,8 +29,6 @@ namespace Ryujinx.Memory.Range
{
_singleRange = new MemoryRange(address, size);
_ranges = null;
MinAddress = address;
MaxAddress = address + size;
}
/// <summary>
@@ -52,30 +40,6 @@ namespace Ryujinx.Memory.Range
{
_singleRange = MemoryRange.Empty;
_ranges = ranges ?? throw new ArgumentNullException(nameof(ranges));
if (ranges.Length != 0)
{
MinAddress = ulong.MaxValue;
MaxAddress = 0UL;
foreach (MemoryRange range in ranges)
{
if (MinAddress > range.Address)
{
MinAddress = range.Address;
}
if (MaxAddress < range.EndAddress)
{
MaxAddress = range.EndAddress;
}
}
}
else
{
MinAddress = 0UL;
MaxAddress = 0UL;
}
}
/// <summary>
@@ -84,7 +48,7 @@ namespace Ryujinx.Memory.Range
/// <param name="offset">Offset of the slice into the multi-range in bytes</param>
/// <param name="size">Size of the slice in bytes</param>
/// <returns>A new multi-range representing the given slice of this one</returns>
public MultiRange GetSlice(ulong offset, ulong size)
public MultiRange Slice(ulong offset, ulong size)
{
if (HasSingleRange)
{