Compare commits

12 Commits

Author SHA1 Message Date
gdkchan
2e43d01d36 Move gl_Layer from vertex to geometry if GPU does not support it on vertex (#3866)
* Move gl_Layer from vertex to geometry if GPU does not support it on vertex

* Shader cache version bump

* PR feedback
2022-11-18 23:27:54 -03:00
riperiperi
7373ec5792 Vulkan: Clear dummy texture to (0,0,0,0) on creation (#3867)
This might fix an issue with AMD GPUs on Linux where the texture could contain random garbage data. On the Switch, it always samples as 0.
2022-11-18 23:11:34 -03:00
riperiperi
de162a648b Gpu: Fix thread safety of ReregisterRanges (#3865)
A quick fix to prevent reading the wrong value of Count when reregistering ranges for a new target buffer. Buffer flushes from another thread can modify the range list when the lock isn't active, which can change the count.

This prevents some crashes in Pokemon Scarlet/Violet. It's likely that buffer migration during flush is causing some other issues in this game, but this at least prevents the crashing.
2022-11-18 21:47:29 +01:00
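
The fix described above comes down to reading the count once while the lock is held and using only that local copy afterwards. A minimal sketch of the pattern, assuming a hypothetical `RangeListSketch` class (the type and its members are illustrative and are not the emulator's actual range list):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a list that another thread may modify at any time.
public sealed class RangeListSketch
{
    private readonly object _lock = new object();
    private readonly List<ulong> _items = new List<ulong>();

    // Live count; may change as soon as the lock is released.
    public int Count
    {
        get { lock (_lock) { return _items.Count; } }
    }

    public void Add(ulong value)
    {
        lock (_lock) { _items.Add(value); }
    }

    // Copies the items into 'buffer' and returns how many were copied.
    // Callers iterate up to the returned value, never up to Count,
    // because Count can differ once another thread adds or removes items.
    public int Snapshot(ref ulong[] buffer)
    {
        lock (_lock)
        {
            int count = _items.Count;

            if (buffer.Length < count)
            {
                Array.Resize(ref buffer, count);
            }

            _items.CopyTo(buffer, 0);

            return count;
        }
    }
}
```

Looping over the snapshot with the returned count keeps the loop bound consistent with the data that was actually copied, even if the list grows in the meantime.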
riperiperi
131baebe2a Vulkan: Don't create preload command buffer outside a render pass (#3864)
* Vulkan: Don't create preload buffer outside a render pass

The preload command buffer is used to avoid render pass splits and barriers when updating buffer data. However, when a render pass is not active (for example, at the start of a pass, or during compute invocations) buffer uploads can be performed at any time, so the optimization isn't as useful.

This PR makes it so that the preload command buffer is only used for buffer updates while a render pass is active. It's still used for textures as I don't want to shake things up right now regarding how the preload buffer is obtained before some other changes, and texture updates are a lot rarer anyway.

Improves performance slightly in Pokemon Scarlet/Violet (43 -> 48), as it was switching to compute, writing a bunch of buffers inline, then dispatching, then flushing commands... It uses 1 command buffer instead of 2 every time it does this now. Maybe it would be nice to find a faster way to sync without creating so many command buffers in a short period of time.

* Address feedback
2022-11-18 14:58:56 +00:00
riperiperi
187372cbde Prune ForceDirty and CheckModified caches on unmap (#3862)
* Prune ForceDirty and CheckModified caches on unmap

Since we're now using this for modified checks on the HLE indirect draw method, I'm worried that leaving these to forever gather cache entries isn't the best idea for performance in the long term, and it could keep old buffer objects alive for longer than they should be.

This PR adds the ability to prune invalid entries before checking these caches, and queues it whenever GPU memory is unmapped. It also aligns modified checks to the page size, as I figured it would be possible for a huge number of overlapping entries to accumulate over a game's runtime.

This prevents Super Mario Odyssey from having tens of thousands of entries in the modified cache in Metro Kingdom, and from duplicating them when entering and leaving a building (they should be cleared, as they were unmapped).

* Address Feedback
2022-11-18 14:58:24 +00:00
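
The page alignment mentioned above can be sketched as follows. This is an illustration only: `PageAlignSketch` and its 4 KiB `PageMask` are assumptions made for the example, not values taken from the emulator.

```csharp
// Hypothetical helper showing how a lookup range can be widened to page boundaries.
public static class PageAlignSketch
{
    // Assumed page size of 4 KiB, expressed as a mask of the low bits.
    public const ulong PageMask = 0xFFF;

    // Rounds the start down and the end up to a page boundary, so that
    // nearby requests map to the same cache key instead of creating
    // one entry per distinct (address, size) pair.
    public static (ulong Start, ulong Size) Align(ulong address, ulong size)
    {
        ulong start = address & ~PageMask;
        ulong end = (address + size + PageMask) & ~PageMask;

        return (start, end - start);
    }
}
```

With this scheme, two requests that fall inside the same page share one dictionary key, which is what keeps the number of entries bounded by the number of pages touched rather than by the number of distinct requests.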
TSRBerry
022d495335 am: Stub GetSaveDataSizeMax (#3857)
* am: Stub GetSaveDataSizeMax()

* am: Remove todo comment for GetSaveDataSizeMax()

* am: saveDataSize & journalDataSize should be of type long

* am: Add explanation for returning default values in GetSaveDataSizeMax()
2022-11-18 03:29:01 +00:00
Berkan Diler
c1372ed775 Use ReadOnlySpan<byte> compiler optimization in more places (#3853)
* Use ReadOnlySpan<byte> compiler optimization in more places

* Revert changes in ShaderBinaries.cs

* Remove unused using;

* Use ReadOnlySpan<byte> compiler optimization in more places
2022-11-18 03:10:44 +00:00
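
For readers unfamiliar with the optimization named in this commit: when a property returns a `ReadOnlySpan<byte>` (or another single-byte element type such as `sbyte`) built from a constant array initializer, the C# compiler can return a span over data embedded in the assembly instead of allocating an array. A minimal sketch, where `LookupTableSketch` and its table are hypothetical and chosen only for illustration:

```csharp
using System;

public static class LookupTableSketch
{
    // Number of set bits for each 4-bit value. Because the element type is
    // byte and every element is a constant, the compiler can point the span
    // at static data rather than allocating a new array on each access.
    private static ReadOnlySpan<byte> BitCountNibble => new byte[]
    {
        0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
    };

    // Counts the set bits in a byte by looking up each nibble.
    public static int PopCount(byte value)
    {
        return BitCountNibble[value & 0xF] + BitCountNibble[value >> 4];
    }
}
```

Compared with a `static readonly` array field, this form needs no static constructor and no heap allocation at startup.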
riperiperi
a16682cfd3 Allow _volatile to be set from MultiRegionHandle checks again (#3830)
* Allow _volatile to be set from MultiRegionHandle checks again

Tracking handles have a `_volatile` flag which indicates that the resource being tracked is modified every time it is used under a new sequence number. This is used to reduce the time spent reprotecting memory for tracking writes to commonly modified buffers, like constant buffers.

This optimisation works by detecting if a buffer is modified every time a check happens. If a buffer is checked but it is not dirty, then that data is likely not modified every sequence number, and should use memory protection for write tracking. If the opposite is the case all the time, it is faster to just assume it's dirty as we'd just be wasting time protecting the memory.

The new MultiRegionBitmap could not notify handles that they had been checked as part of the fast bitmap lookup, so bindings larger than 4096 bytes wouldn't trigger it at all. This meant that they would be subject to a ton of reprotection if they were modified often.

This does mean there are two separate sources for a _volatile set: VolatileOrDirty + _checkCount, and the bitmap check. These shouldn't interfere with each other, though.

This fixes performance regressions from #3775 in Pokemon Sword, and hopefully Yu-Gi-Oh! RUSH DUEL: Dawn of the Battle Royale. May affect other games.

* Fix stupid mistake
2022-11-18 02:54:20 +00:00
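
The heuristic described in this commit message could look roughly like the sketch below. Everything here is an assumption made for illustration: the class name, the `Threshold` value, and the rule for clearing the flag are hypothetical and are not taken from the actual tracking handle code.

```csharp
// Hypothetical, simplified model of a "volatile" detection heuristic.
public sealed class VolatileHeuristicSketch
{
    // Assumed number of consecutive dirty checks before the resource
    // is treated as always modified.
    private const int Threshold = 5;

    private int _consecutiveDirtyChecks;

    // When true, the caller would skip write protection and assume dirty.
    public bool IsVolatile { get; private set; }

    // Called each time the resource is checked; 'dirty' says whether it
    // had been written since the previous check.
    public void OnChecked(bool dirty)
    {
        if (!dirty)
        {
            // A clean check means the data is not modified every time,
            // so memory protection is worth using again.
            _consecutiveDirtyChecks = 0;
            IsVolatile = false;
            return;
        }

        if (++_consecutiveDirtyChecks >= Threshold)
        {
            IsVolatile = true;
        }
    }
}
```

The point of the sketch is the trade-off in the message: a resource that is dirty on every check costs less to treat as always dirty than to reprotect each time.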
riperiperi
7c53b69c30 SPIR-V: Fix unscaling helper not being able to find Array textures (#3863)
The type in the `texOp` in the textureSize instruction doesn't have the exact type on SPIR-V (for example, it is missing the Array flag). This PR gives it the proper type before giving it to the unscaling helper.

This fixes the ground textures being broken in Pokemon Scarlet/Violet when scaling. It wasn't finding the texture, so the descriptor index it provided was -1...
2022-11-18 02:37:37 +00:00
riperiperi
33a4d7d1ba GPU: Eliminate CB0 accesses when storage buffer accesses are resolved (#3847)
* Eliminate CB0 accesses

Still some work to do, decouple from hle?

* Forgot the important part somehow

* Fix and improve alignment test

* Address Feedback

* Remove some complexity when checking storage buffer alignment

* Update Ryujinx.Graphics.Shader/Translation/Optimizations/GlobalToStorage.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

2022-11-17 18:47:41 +01:00
Mary-nyan
391e08dd27 ci: Clean up Actions leftovers (#3859)
As the title says.

Fix Avalonia build versions for PRs.

Also ensure that `--self-contained` doesn't warn at build.
2022-11-17 18:30:54 +01:00
Matthew Wells
b5cf8b8af9 Capitalization to be consistent (#3860)
Thread ID Register, Floating-point Control Register, and Floating-point Status Register all had Register capitalized, so the Register in Processor State register should be capitalized.
2022-11-17 18:13:37 +01:00
44 changed files with 696 additions and 163 deletions

View File

@@ -52,26 +52,22 @@ jobs:
- uses: actions/setup-dotnet@v3
with:
dotnet-version: 7.0.x
- name: Ensure NuGet Source
uses: fabriciomurta/ensure-nuget-source@v1
- name: Get git short hash
id: git_short_hash
run: echo "result=$(git rev-parse --short "${{ github.sha }}")" >> $GITHUB_OUTPUT
shell: bash
- name: Clear
run: dotnet clean && dotnet nuget locals all --clear
- name: Build
run: dotnet build -c "${{ matrix.configuration }}" /p:Version="${{ env.RYUJINX_BASE_VERSION }}" /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER
run: dotnet build -c "${{ matrix.configuration }}" -p:Version="${{ env.RYUJINX_BASE_VERSION }}" -p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" -p:ExtraDefineConstants=DISABLE_UPDATER
- name: Test
run: dotnet test --no-build -c "${{ matrix.configuration }}"
- name: Publish Ryujinx
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish /p:Version="${{ env.RYUJINX_BASE_VERSION }}" /p:DebugType=embedded /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx --self-contained
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish -p:Version="${{ env.RYUJINX_BASE_VERSION }}" -p:DebugType=embedded -p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" -p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx --self-contained true
if: github.event_name == 'pull_request'
- name: Publish Ryujinx.Headless.SDL2
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish_sdl2_headless /p:Version="${{ env.RYUJINX_BASE_VERSION }}" /p:DebugType=embedded /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx.Headless.SDL2 --self-contained
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish_sdl2_headless -p:Version="${{ env.RYUJINX_BASE_VERSION }}" -p:DebugType=embedded -p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" -p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx.Headless.SDL2 --self-contained true
if: github.event_name == 'pull_request'
- name: Publish Ryujinx.Ava
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish_ava /p:Version="1.0.0" /p:DebugType=embedded /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx.Ava
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish_ava -p:Version="${{ env.RYUJINX_BASE_VERSION }}" -p:DebugType=embedded -p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" -p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx.Ava --self-contained true
if: github.event_name == 'pull_request'
- name: Upload Ryujinx artifact
uses: actions/upload-artifact@v3

View File

@@ -29,10 +29,6 @@ jobs:
- uses: actions/setup-dotnet@v3
with:
dotnet-version: 7.0.x
- name: Ensure NuGet Source
uses: fabriciomurta/ensure-nuget-source@v1
- name: Clear
run: dotnet clean && dotnet nuget locals all --clear
- name: Get version info
id: version_info
run: |
@@ -51,9 +47,9 @@ jobs:
run: "mkdir release_output"
- name: Publish Windows
run: |
dotnet publish -c Release -r win10-x64 -o ./publish_windows/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx --self-contained
dotnet publish -c Release -r win10-x64 -o ./publish_windows_sdl2_headless/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx.Headless.SDL2 --self-contained
dotnet publish -c Release -r win10-x64 -o ./publish_windows_ava/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx.Ava --self-contained
dotnet publish -c Release -r win10-x64 -o ./publish_windows/publish -p:Version="${{ steps.version_info.outputs.build_version }}" -p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" -p:DebugType=embedded Ryujinx --self-contained true
dotnet publish -c Release -r win10-x64 -o ./publish_windows_sdl2_headless/publish -p:Version="${{ steps.version_info.outputs.build_version }}" -p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" -p:DebugType=embedded Ryujinx.Headless.SDL2 --self-contained true
dotnet publish -c Release -r win10-x64 -o ./publish_windows_ava/publish -p:Version="${{ steps.version_info.outputs.build_version }}" -p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" -p:DebugType=embedded Ryujinx.Ava --self-contained true
- name: Packing Windows builds
run: |
pushd publish_windows
@@ -71,9 +67,9 @@ jobs:
- name: Publish Linux
run: |
dotnet publish -c Release -r linux-x64 -o ./publish_linux/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx --self-contained
dotnet publish -c Release -r linux-x64 -o ./publish_linux_sdl2_headless/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx.Headless.SDL2 --self-contained
dotnet publish -c Release -r linux-x64 -o ./publish_linux_ava/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx.Ava --self-contained
dotnet publish -c Release -r linux-x64 -o ./publish_linux/publish -p:Version="${{ steps.version_info.outputs.build_version }}" -p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" -p:DebugType=embedded Ryujinx --self-contained true
dotnet publish -c Release -r linux-x64 -o ./publish_linux_sdl2_headless/publish -p:Version="${{ steps.version_info.outputs.build_version }}" -p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" -p:DebugType=embedded Ryujinx.Headless.SDL2 --self-contained true
dotnet publish -c Release -r linux-x64 -o ./publish_linux_ava/publish -p:Version="${{ steps.version_info.outputs.build_version }}" -p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" -p:DebugType=embedded Ryujinx.Ava --self-contained true
- name: Packing Linux builds
run: |

View File

@@ -1,15 +1,11 @@
using System;
using System.Numerics;
namespace ARMeilleure.Common
{
static class BitUtils
{
private static readonly sbyte[] HbsNibbleLut;
static BitUtils()
{
HbsNibbleLut = new sbyte[] { -1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3 };
}
private static ReadOnlySpan<sbyte> HbsNibbleLut => new sbyte[] { -1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3 };
public static long FillWithOnes(int bits)
{

View File

@@ -27,7 +27,7 @@ namespace Ryujinx.Cpu
long TpidrroEl0 { get; set; }
/// <summary>
/// Processor State register.
/// Processor State Register.
/// </summary>
uint Pstate { get; set; }
@@ -109,4 +109,4 @@ namespace Ryujinx.Cpu
/// </remarks>
void StopRunning();
}
}
}

View File

@@ -21,6 +21,7 @@ namespace Ryujinx.Graphics.GAL
public readonly bool SupportsFragmentShaderOrderingIntel;
public readonly bool SupportsGeometryShaderPassthrough;
public readonly bool SupportsImageLoadFormatted;
public readonly bool SupportsLayerVertexTessellation;
public readonly bool SupportsMismatchingViewFormat;
public readonly bool SupportsCubemapView;
public readonly bool SupportsNonConstantTextureOffset;
@@ -55,6 +56,7 @@ namespace Ryujinx.Graphics.GAL
bool supportsFragmentShaderOrderingIntel,
bool supportsGeometryShaderPassthrough,
bool supportsImageLoadFormatted,
bool supportsLayerVertexTessellation,
bool supportsMismatchingViewFormat,
bool supportsCubemapView,
bool supportsNonConstantTextureOffset,
@@ -86,6 +88,7 @@ namespace Ryujinx.Graphics.GAL
SupportsFragmentShaderOrderingIntel = supportsFragmentShaderOrderingIntel;
SupportsGeometryShaderPassthrough = supportsGeometryShaderPassthrough;
SupportsImageLoadFormatted = supportsImageLoadFormatted;
SupportsLayerVertexTessellation = supportsLayerVertexTessellation;
SupportsMismatchingViewFormat = supportsMismatchingViewFormat;
SupportsCubemapView = supportsCubemapView;
SupportsNonConstantTextureOffset = supportsNonConstantTextureOffset;

View File

@@ -95,5 +95,10 @@ namespace Ryujinx.Graphics.Gpu
/// Byte alignment for block linear textures
/// </summary>
public const int GobAlignment = 64;
/// <summary>
/// Expected byte alignment for storage buffers
/// </summary>
public const int StorageAlignment = 16;
}
}

View File

@@ -138,7 +138,8 @@ namespace Ryujinx.Graphics.Gpu.Engine.Compute
qmd.CtaThreadDimension1,
qmd.CtaThreadDimension2,
localMemorySize,
sharedMemorySize);
sharedMemorySize,
_channel.BufferManager.HasUnalignedStorageBuffers);
CachedShaderProgram cs = memoryManager.Physical.ShaderCache.GetComputeShader(_channel, poolState, computeState, shaderGpuVa);
@@ -150,6 +151,33 @@ namespace Ryujinx.Graphics.Gpu.Engine.Compute
ShaderProgramInfo info = cs.Shaders[0].Info;
bool hasUnaligned = _channel.BufferManager.HasUnalignedStorageBuffers;
for (int index = 0; index < info.SBuffers.Count; index++)
{
BufferDescriptor sb = info.SBuffers[index];
ulong sbDescAddress = _channel.BufferManager.GetComputeUniformBufferAddress(0);
int sbDescOffset = 0x310 + sb.Slot * 0x10;
sbDescAddress += (ulong)sbDescOffset;
SbDescriptor sbDescriptor = _channel.MemoryManager.Physical.Read<SbDescriptor>(sbDescAddress);
_channel.BufferManager.SetComputeStorageBuffer(sb.Slot, sbDescriptor.PackAddress(), (uint)sbDescriptor.Size, sb.Flags);
}
if ((_channel.BufferManager.HasUnalignedStorageBuffers) != hasUnaligned)
{
// Refetch the shader, as assumptions about storage buffer alignment have changed.
cs = memoryManager.Physical.ShaderCache.GetComputeShader(_channel, poolState, computeState, shaderGpuVa);
_context.Renderer.Pipeline.SetProgram(cs.HostProgram);
info = cs.Shaders[0].Info;
}
for (int index = 0; index < info.CBuffers.Count; index++)
{
BufferDescriptor cb = info.CBuffers[index];
@@ -174,21 +202,6 @@ namespace Ryujinx.Graphics.Gpu.Engine.Compute
_channel.BufferManager.SetComputeUniformBuffer(cb.Slot, cbDescriptor.PackAddress(), (uint)cbDescriptor.Size);
}
for (int index = 0; index < info.SBuffers.Count; index++)
{
BufferDescriptor sb = info.SBuffers[index];
ulong sbDescAddress = _channel.BufferManager.GetComputeUniformBufferAddress(0);
int sbDescOffset = 0x310 + sb.Slot * 0x10;
sbDescAddress += (ulong)sbDescOffset;
SbDescriptor sbDescriptor = _channel.MemoryManager.Physical.Read<SbDescriptor>(sbDescAddress);
_channel.BufferManager.SetComputeStorageBuffer(sb.Slot, sbDescriptor.PackAddress(), (uint)sbDescriptor.Size, sb.Flags);
}
_channel.BufferManager.SetComputeStorageBufferBindings(info.SBuffers);
_channel.BufferManager.SetComputeUniformBufferBindings(info.CBuffers);

View File

@@ -293,9 +293,12 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
/// </summary>
private void CommitBindings()
{
var buffers = _channel.BufferManager;
var hasUnaligned = buffers.HasUnalignedStorageBuffers;
UpdateStorageBuffers();
if (!_channel.TextureManager.CommitGraphicsBindings(_shaderSpecState))
if (!_channel.TextureManager.CommitGraphicsBindings(_shaderSpecState) || (buffers.HasUnalignedStorageBuffers != hasUnaligned))
{
// Shader must be reloaded.
UpdateShaderState();
@@ -1361,7 +1364,8 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
_state.State.AlphaTestFunc,
_state.State.AlphaTestRef,
ref attributeTypes,
_drawState.HasConstantBufferDrawParameters);
_drawState.HasConstantBufferDrawParameters,
_channel.BufferManager.HasUnalignedStorageBuffers);
}
/// <summary>

View File

@@ -67,6 +67,7 @@ namespace Ryujinx.Graphics.Gpu
// Since the memory manager changed, make sure we will get pools from addresses of the new memory manager.
TextureManager.ReloadPools();
MemoryManager.Physical.BufferCache.QueuePrune();
}
/// <summary>
@@ -77,6 +78,7 @@ namespace Ryujinx.Graphics.Gpu
private void MemoryUnmappedHandler(object sender, UnmapEventArgs e)
{
TextureManager.ReloadPools();
MemoryManager.Physical.BufferCache.QueuePrune();
}
/// <summary>

View File

@@ -28,6 +28,7 @@ namespace Ryujinx.Graphics.Gpu.Memory
private readonly Dictionary<ulong, BufferCacheEntry> _dirtyCache;
private readonly Dictionary<ulong, BufferCacheEntry> _modifiedCache;
private bool _pruneCaches;
public event Action NotifyBuffersModified;
@@ -136,6 +137,11 @@ namespace Ryujinx.Graphics.Gpu.Memory
/// <param name="size">Size in bytes of the buffer</param>
public void ForceDirty(MemoryManager memoryManager, ulong gpuVa, ulong size)
{
if (_pruneCaches)
{
Prune();
}
if (!_dirtyCache.TryGetValue(gpuVa, out BufferCacheEntry result) ||
result.EndGpuAddress < gpuVa + size ||
result.UnmappedSequence != result.Buffer.UnmappedSequence)
@@ -158,17 +164,29 @@ namespace Ryujinx.Graphics.Gpu.Memory
/// <returns>True if modified, false otherwise</returns>
public bool CheckModified(MemoryManager memoryManager, ulong gpuVa, ulong size, out ulong outAddr)
{
if (!_modifiedCache.TryGetValue(gpuVa, out BufferCacheEntry result) ||
result.EndGpuAddress < gpuVa + size ||
result.UnmappedSequence != result.Buffer.UnmappedSequence)
if (_pruneCaches)
{
ulong address = TranslateAndCreateBuffer(memoryManager, gpuVa, size);
result = new BufferCacheEntry(address, gpuVa, GetBuffer(address, size));
_modifiedCache[gpuVa] = result;
Prune();
}
outAddr = result.Address;
// Align the address to avoid creating too many entries on the quick lookup dictionary.
ulong mask = BufferAlignmentMask;
ulong alignedGpuVa = gpuVa & (~mask);
ulong alignedEndGpuVa = (gpuVa + size + mask) & (~mask);
size = alignedEndGpuVa - alignedGpuVa;
if (!_modifiedCache.TryGetValue(alignedGpuVa, out BufferCacheEntry result) ||
result.EndGpuAddress < alignedEndGpuVa ||
result.UnmappedSequence != result.Buffer.UnmappedSequence)
{
ulong address = TranslateAndCreateBuffer(memoryManager, alignedGpuVa, size);
result = new BufferCacheEntry(address, alignedGpuVa, GetBuffer(address, size));
_modifiedCache[alignedGpuVa] = result;
}
outAddr = result.Address | (gpuVa & mask);
return result.Buffer.IsModified(result.Address, size);
}
@@ -435,6 +453,54 @@ namespace Ryujinx.Graphics.Gpu.Memory
}
}
/// <summary>
/// Prune any invalid entries from a quick access dictionary.
/// </summary>
/// <param name="dictionary">Dictionary to prune</param>
/// <param name="toDelete">List used to track entries to delete</param>
private void Prune(Dictionary<ulong, BufferCacheEntry> dictionary, ref List<ulong> toDelete)
{
foreach (var entry in dictionary)
{
if (entry.Value.UnmappedSequence != entry.Value.Buffer.UnmappedSequence)
{
(toDelete ??= new()).Add(entry.Key);
}
}
if (toDelete != null)
{
foreach (ulong entry in toDelete)
{
dictionary.Remove(entry);
}
}
}
/// <summary>
/// Prune any invalid entries from the quick access dictionaries.
/// </summary>
private void Prune()
{
List<ulong> toDelete = null;
Prune(_dirtyCache, ref toDelete);
toDelete?.Clear();
Prune(_modifiedCache, ref toDelete);
_pruneCaches = false;
}
/// <summary>
/// Queues a prune of invalid entries the next time a dictionary cache is accessed.
/// </summary>
public void QueuePrune()
{
_pruneCaches = true;
}
/// <summary>
/// Disposes all buffers in the cache.
/// It's an error to use the buffer manager after disposal.

View File

@@ -17,6 +17,9 @@ namespace Ryujinx.Graphics.Gpu.Memory
private readonly GpuContext _context;
private readonly GpuChannel _channel;
private int _unalignedStorageBuffers;
public bool HasUnalignedStorageBuffers => _unalignedStorageBuffers > 0;
private IndexBuffer _indexBuffer;
private readonly VertexBuffer[] _vertexBuffers;
private readonly BufferBounds[] _transformFeedbackBuffers;
@@ -38,6 +41,11 @@ namespace Ryujinx.Graphics.Gpu.Memory
/// </summary>
public BufferBounds[] Buffers { get; }
/// <summary>
/// Flag indicating if this binding is unaligned.
/// </summary>
public bool[] Unaligned { get; }
/// <summary>
/// Total amount of buffers used on the shader.
/// </summary>
@@ -51,6 +59,7 @@ namespace Ryujinx.Graphics.Gpu.Memory
{
Bindings = new BufferDescriptor[count];
Buffers = new BufferBounds[count];
Unaligned = new bool[count];
}
/// <summary>
@@ -202,6 +211,31 @@ namespace Ryujinx.Graphics.Gpu.Memory
_transformFeedbackBuffersDirty = true;
}
/// <summary>
/// Records the alignment of a storage buffer.
/// Unaligned storage buffers disable some optimizations on the shader.
/// </summary>
/// <param name="buffers">The binding list to modify</param>
/// <param name="index">Index of the storage buffer</param>
/// <param name="gpuVa">Start GPU virtual address of the buffer</param>
private void RecordStorageAlignment(BuffersPerStage buffers, int index, ulong gpuVa)
{
bool unaligned = (gpuVa & (Constants.StorageAlignment - 1)) != 0;
if (unaligned || HasUnalignedStorageBuffers)
{
// Check if the alignment changed for this binding.
ref bool currentUnaligned = ref buffers.Unaligned[index];
if (currentUnaligned != unaligned)
{
currentUnaligned = unaligned;
_unalignedStorageBuffers += unaligned ? 1 : -1;
}
}
}
/// <summary>
/// Sets a storage buffer on the compute pipeline.
/// Storage buffers can be read and written to on shaders.
@@ -214,6 +248,8 @@ namespace Ryujinx.Graphics.Gpu.Memory
{
size += gpuVa & ((ulong)_context.Capabilities.StorageBufferOffsetAlignment - 1);
RecordStorageAlignment(_cpStorageBuffers, index, gpuVa);
gpuVa = BitUtils.AlignDown(gpuVa, _context.Capabilities.StorageBufferOffsetAlignment);
ulong address = _channel.MemoryManager.Physical.BufferCache.TranslateAndCreateBuffer(_channel.MemoryManager, gpuVa, size);
@@ -234,17 +270,21 @@ namespace Ryujinx.Graphics.Gpu.Memory
{
size += gpuVa & ((ulong)_context.Capabilities.StorageBufferOffsetAlignment - 1);
BuffersPerStage buffers = _gpStorageBuffers[stage];
RecordStorageAlignment(buffers, index, gpuVa);
gpuVa = BitUtils.AlignDown(gpuVa, _context.Capabilities.StorageBufferOffsetAlignment);
ulong address = _channel.MemoryManager.Physical.BufferCache.TranslateAndCreateBuffer(_channel.MemoryManager, gpuVa, size);
if (_gpStorageBuffers[stage].Buffers[index].Address != address ||
_gpStorageBuffers[stage].Buffers[index].Size != size)
if (buffers.Buffers[index].Address != address ||
buffers.Buffers[index].Size != size)
{
_gpStorageBuffersDirty = true;
}
_gpStorageBuffers[stage].SetBounds(index, address, size, flags);
buffers.SetBounds(index, address, size, flags);
}
/// <summary>

View File

@@ -325,13 +325,15 @@ namespace Ryujinx.Graphics.Gpu.Memory
public void ReregisterRanges(Action<ulong, ulong> rangeAction)
{
ref var ranges = ref ThreadStaticArray<BufferModifiedRange>.Get();
int count;
// Range list must be consistent for this operation.
lock (_lock)
{
if (ranges.Length < Count)
count = Count;
if (ranges.Length < count)
{
Array.Resize(ref ranges, Count);
Array.Resize(ref ranges, count);
}
int i = 0;
@@ -342,7 +344,7 @@ namespace Ryujinx.Graphics.Gpu.Memory
}
ulong currentSync = _context.SyncNumber;
for (int i = 0; i < Count; i++)
for (int i = 0; i < count; i++)
{
BufferModifiedRange range = ranges[i];
if (range.SyncNumber != currentSync)

View File

@@ -36,6 +36,7 @@ namespace Ryujinx.Graphics.Gpu.Shader
/// </summary>
/// <param name="channel">GPU channel</param>
/// <param name="poolState">Texture pool state</param>
/// <param name="computeState">Compute state</param>
/// <param name="gpuVa">GPU virtual address of the compute shader</param>
/// <param name="program">Cached host program for the given state, if found</param>
/// <param name="cachedGuestCode">Cached guest code, if any found</param>
@@ -43,6 +44,7 @@ namespace Ryujinx.Graphics.Gpu.Shader
public bool TryFind(
GpuChannel channel,
GpuChannelPoolState poolState,
GpuChannelComputeState computeState,
ulong gpuVa,
out CachedShaderProgram program,
out byte[] cachedGuestCode)
@@ -50,7 +52,7 @@ namespace Ryujinx.Graphics.Gpu.Shader
program = null;
ShaderCodeAccessor codeAccessor = new ShaderCodeAccessor(channel.MemoryManager, gpuVa);
bool hasSpecList = _cache.TryFindItem(codeAccessor, out var specList, out cachedGuestCode);
return hasSpecList && specList.TryFindForCompute(channel, poolState, out program);
return hasSpecList && specList.TryFindForCompute(channel, poolState, computeState, out program);
}
/// <summary>

View File

@@ -225,6 +225,12 @@ namespace Ryujinx.Graphics.Gpu.Shader.DiskCache
return _oldSpecState.GraphicsState.EarlyZForce;
}
/// <inheritdoc/>
public bool QueryHasUnalignedStorageBuffer()
{
return _oldSpecState.GraphicsState.HasUnalignedStorageBuffer || _oldSpecState.ComputeState.HasUnalignedStorageBuffer;
}
/// <inheritdoc/>
public bool QueryViewportTransformDisable()
{

View File

@@ -22,7 +22,7 @@ namespace Ryujinx.Graphics.Gpu.Shader.DiskCache
private const ushort FileFormatVersionMajor = 1;
private const ushort FileFormatVersionMinor = 2;
private const uint FileFormatVersionPacked = ((uint)FileFormatVersionMajor << 16) | FileFormatVersionMinor;
private const uint CodeGenVersion = 3747;
private const uint CodeGenVersion = 3866;
private const string SharedTocFileName = "shared.toc";
private const string SharedDataFileName = "shared.data";

View File

@@ -636,6 +636,8 @@ namespace Ryujinx.Graphics.Gpu.Shader.DiskCache
CachedShaderStage[] shaders = new CachedShaderStage[guestShaders.Length];
List<ShaderProgram> translatedStages = new List<ShaderProgram>();
TranslatorContext previousStage = null;
for (int stageIndex = 0; stageIndex < Constants.ShaderStages; stageIndex++)
{
TranslatorContext currentStage = translatorContexts[stageIndex + 1];
@@ -668,6 +670,16 @@ namespace Ryujinx.Graphics.Gpu.Shader.DiskCache
{
translatedStages.Add(program);
}
previousStage = currentStage;
}
else if (
previousStage != null &&
previousStage.LayerOutputWritten &&
stageIndex == 3 &&
!_context.Capabilities.SupportsLayerVertexTessellation)
{
translatedStages.Add(previousStage.GenerateGeometryPassthrough());
}
}

View File

@@ -145,6 +145,12 @@ namespace Ryujinx.Graphics.Gpu.Shader
return _state.GraphicsState.HasConstantBufferDrawParameters;
}
/// <inheritdoc/>
public bool QueryHasUnalignedStorageBuffer()
{
return _state.GraphicsState.HasUnalignedStorageBuffer || _state.ComputeState.HasUnalignedStorageBuffer;
}
/// <inheritdoc/>
public InputTopology QueryPrimitiveTopology()
{

View File

@@ -128,6 +128,8 @@ namespace Ryujinx.Graphics.Gpu.Shader
public bool QueryHostSupportsImageLoadFormatted() => _context.Capabilities.SupportsImageLoadFormatted;
public bool QueryHostSupportsLayerVertexTessellation() => _context.Capabilities.SupportsLayerVertexTessellation;
public bool QueryHostSupportsNonConstantTextureOffset() => _context.Capabilities.SupportsNonConstantTextureOffset;
public bool QueryHostSupportsShaderBallot() => _context.Capabilities.SupportsShaderBallot;

View File

@@ -32,6 +32,11 @@ namespace Ryujinx.Graphics.Gpu.Shader
/// </summary>
public readonly int SharedMemorySize;
/// <summary>
/// Indicates that any storage buffer use is unaligned.
/// </summary>
public readonly bool HasUnalignedStorageBuffer;
/// <summary>
/// Creates a new GPU compute state.
/// </summary>
@@ -40,18 +45,21 @@ namespace Ryujinx.Graphics.Gpu.Shader
/// <param name="localSizeZ">Local group size Z of the compute shader</param>
/// <param name="localMemorySize">Local memory size of the compute shader</param>
/// <param name="sharedMemorySize">Shared memory size of the compute shader</param>
/// <param name="hasUnalignedStorageBuffer">Indicates that any storage buffer use is unaligned</param>
public GpuChannelComputeState(
int localSizeX,
int localSizeY,
int localSizeZ,
int localMemorySize,
int sharedMemorySize)
int sharedMemorySize,
bool hasUnalignedStorageBuffer)
{
LocalSizeX = localSizeX;
LocalSizeY = localSizeY;
LocalSizeZ = localSizeZ;
LocalMemorySize = localMemorySize;
SharedMemorySize = sharedMemorySize;
HasUnalignedStorageBuffer = hasUnalignedStorageBuffer;
}
}
}

View File

@@ -82,6 +82,11 @@ namespace Ryujinx.Graphics.Gpu.Shader
/// </summary>
public readonly bool HasConstantBufferDrawParameters;
/// <summary>
/// Indicates that any storage buffer use is unaligned.
/// </summary>
public readonly bool HasUnalignedStorageBuffer;
/// <summary>
/// Creates a new GPU graphics state.
/// </summary>
@@ -99,6 +104,7 @@ namespace Ryujinx.Graphics.Gpu.Shader
/// <param name="alphaTestReference">When alpha test is enabled, indicates the value to compare with the fragment output alpha</param>
/// <param name="attributeTypes">Type of the vertex attributes consumed by the shader</param>
/// <param name="hasConstantBufferDrawParameters">Indicates that the draw is writing the base vertex, base instance and draw index to Constant Buffer 0</param>
/// <param name="hasUnalignedStorageBuffer">Indicates that any storage buffer use is unaligned</param>
public GpuChannelGraphicsState(
bool earlyZForce,
PrimitiveTopology topology,
@@ -113,7 +119,8 @@ namespace Ryujinx.Graphics.Gpu.Shader
CompareOp alphaTestCompare,
float alphaTestReference,
ref Array32<AttributeType> attributeTypes,
bool hasConstantBufferDrawParameters)
bool hasConstantBufferDrawParameters,
bool hasUnalignedStorageBuffer)
{
EarlyZForce = earlyZForce;
Topology = topology;
@@ -129,6 +136,7 @@ namespace Ryujinx.Graphics.Gpu.Shader
AlphaTestReference = alphaTestReference;
AttributeTypes = attributeTypes;
HasConstantBufferDrawParameters = hasConstantBufferDrawParameters;
HasUnalignedStorageBuffer = hasUnalignedStorageBuffer;
}
}
}

@@ -203,12 +203,12 @@ namespace Ryujinx.Graphics.Gpu.Shader
GpuChannelComputeState computeState,
ulong gpuVa)
{
if (_cpPrograms.TryGetValue(gpuVa, out var cpShader) && IsShaderEqual(channel, poolState, cpShader, gpuVa))
if (_cpPrograms.TryGetValue(gpuVa, out var cpShader) && IsShaderEqual(channel, poolState, computeState, cpShader, gpuVa))
{
return cpShader;
}
if (_computeShaderCache.TryFind(channel, poolState, gpuVa, out cpShader, out byte[] cachedGuestCode))
if (_computeShaderCache.TryFind(channel, poolState, computeState, gpuVa, out cpShader, out byte[] cachedGuestCode))
{
_cpPrograms[gpuVa] = cpShader;
return cpShader;
@@ -356,6 +356,8 @@ namespace Ryujinx.Graphics.Gpu.Shader
CachedShaderStage[] shaders = new CachedShaderStage[Constants.ShaderStages + 1];
List<ShaderSource> shaderSources = new List<ShaderSource>();
TranslatorContext previousStage = null;
for (int stageIndex = 0; stageIndex < Constants.ShaderStages; stageIndex++)
{
TranslatorContext currentStage = translatorContexts[stageIndex + 1];
@@ -392,6 +394,16 @@ namespace Ryujinx.Graphics.Gpu.Shader
{
shaderSources.Add(CreateShaderSource(program));
}
previousStage = currentStage;
}
else if (
previousStage != null &&
previousStage.LayerOutputWritten &&
stageIndex == 3 &&
!_context.Capabilities.SupportsLayerVertexTessellation)
{
shaderSources.Add(CreateShaderSource(previousStage.GenerateGeometryPassthrough()));
}
}
@@ -473,18 +485,20 @@ namespace Ryujinx.Graphics.Gpu.Shader
/// </summary>
/// <param name="channel">GPU channel using the shader</param>
/// <param name="poolState">GPU channel state to verify shader compatibility</param>
/// <param name="computeState">GPU channel compute state to verify shader compatibility</param>
/// <param name="cpShader">Cached compute shader</param>
/// <param name="gpuVa">GPU virtual address of the shader code in memory</param>
/// <returns>True if the code is equal and the specialization state matches, false otherwise</returns>
private static bool IsShaderEqual(
GpuChannel channel,
GpuChannelPoolState poolState,
GpuChannelComputeState computeState,
CachedShaderProgram cpShader,
ulong gpuVa)
{
if (IsShaderEqual(channel.MemoryManager, cpShader.Shaders[0], gpuVa))
{
return cpShader.SpecializationState.MatchesCompute(channel, poolState, true);
return cpShader.SpecializationState.MatchesCompute(channel, poolState, computeState, true);
}
return false;

@@ -53,13 +53,14 @@ namespace Ryujinx.Graphics.Gpu.Shader
/// </summary>
/// <param name="channel">GPU channel</param>
/// <param name="poolState">Texture pool state</param>
/// <param name="computeState">Compute state</param>
/// <param name="program">Cached program, if found</param>
/// <returns>True if a compatible program is found, false otherwise</returns>
public bool TryFindForCompute(GpuChannel channel, GpuChannelPoolState poolState, out CachedShaderProgram program)
public bool TryFindForCompute(GpuChannel channel, GpuChannelPoolState poolState, GpuChannelComputeState computeState, out CachedShaderProgram program)
{
foreach (var entry in _entries)
{
if (entry.SpecializationState.MatchesCompute(channel, poolState, true))
if (entry.SpecializationState.MatchesCompute(channel, poolState, computeState, true))
{
program = entry;
return true;

@@ -531,6 +531,11 @@ namespace Ryujinx.Graphics.Gpu.Shader
return false;
}
if (graphicsState.HasUnalignedStorageBuffer != GraphicsState.HasUnalignedStorageBuffer)
{
return false;
}
return Matches(channel, poolState, checkTextures, isCompute: false);
}
@@ -539,10 +544,16 @@ namespace Ryujinx.Graphics.Gpu.Shader
/// </summary>
/// <param name="channel">GPU channel</param>
/// <param name="poolState">Texture pool state</param>
/// <param name="computeState">Compute state</param>
/// <param name="checkTextures">Indicates whether texture descriptors should be checked</param>
/// <returns>True if the state matches, false otherwise</returns>
public bool MatchesCompute(GpuChannel channel, GpuChannelPoolState poolState, bool checkTextures)
public bool MatchesCompute(GpuChannel channel, GpuChannelPoolState poolState, GpuChannelComputeState computeState, bool checkTextures)
{
if (computeState.HasUnalignedStorageBuffer != ComputeState.HasUnalignedStorageBuffer)
{
return false;
}
return Matches(channel, poolState, checkTextures, isCompute: true);
}
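
Note on the hunk above: the new HasUnalignedStorageBuffer check means that two programs translated from identical guest code can now coexist and are told apart by this flag. A minimal sketch of that lookup, assuming a plain list of entries; SpecializationListSketch and its string payload are hypothetical stand-ins, not the Ryujinx types:

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a per-shader specialization list.
public sealed class SpecializationListSketch
{
    // Each entry remembers the flag it was compiled with (illustrative payload: a name).
    private readonly List<(bool HasUnalignedStorageBuffer, string Program)> _entries = new();

    public void Add(bool hasUnalignedStorageBuffer, string program)
    {
        _entries.Add((hasUnalignedStorageBuffer, program));
    }

    // Returns the first entry whose flag equals the current state, like the early-out in MatchesCompute.
    public bool TryFind(bool hasUnalignedStorageBuffer, out string program)
    {
        foreach (var entry in _entries)
        {
            if (entry.HasUnalignedStorageBuffer == hasUnalignedStorageBuffer)
            {
                program = entry.Program;
                return true;
            }
        }

        program = null;
        return false;
    }
}
```

A miss here would presumably lead to a fresh translation with the other flag value, which is then added as a second entry.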

@@ -63,7 +63,7 @@ namespace Ryujinx.Graphics.Nvdec.Vp9
short dqv = dq[0];
ReadOnlySpan<byte> cat6Prob = (xd.Bd == 12)
? Luts.Vp9Cat6ProbHigh12
: (xd.Bd == 10) ? new ReadOnlySpan<byte>(Luts.Vp9Cat6ProbHigh12).Slice(2) : Luts.Vp9Cat6Prob;
: (xd.Bd == 10) ? Luts.Vp9Cat6ProbHigh12.Slice(2) : Luts.Vp9Cat6Prob;
int cat6Bits = (xd.Bd == 12) ? 18 : (xd.Bd == 10) ? 16 : 14;
// Keep value, range, and count as locals. The compiler produces better
// results with the locals than using r directly.

@@ -1,14 +1,12 @@
using Ryujinx.Common.Memory;
using Ryujinx.Graphics.Nvdec.Vp9.Types;
using System;
namespace Ryujinx.Graphics.Nvdec.Vp9
{
internal static class Luts
{
public static readonly byte[] SizeGroupLookup = new byte[]
{
0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3
};
public static ReadOnlySpan<byte> SizeGroupLookup => new byte[] { 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3 };
public static readonly BlockSize[][] SubsizeLookup = new BlockSize[][]
{
@@ -1070,18 +1068,18 @@ namespace Ryujinx.Graphics.Nvdec.Vp9
-(sbyte)MvClassType.MvClass10,
};
public static readonly sbyte[] Vp9MvFPTree = new sbyte[] { -0, 2, -1, 4, -2, -3 };
public static ReadOnlySpan<sbyte> Vp9MvFPTree => new sbyte[] { -0, 2, -1, 4, -2, -3 };
// Entropy
public static readonly byte[] Vp9Cat1Prob = new byte[] { 159 };
public static readonly byte[] Vp9Cat2Prob = new byte[] { 165, 145 };
public static readonly byte[] Vp9Cat3Prob = new byte[] { 173, 148, 140 };
public static readonly byte[] Vp9Cat4Prob = new byte[] { 176, 155, 140, 135 };
public static readonly byte[] Vp9Cat5Prob = new byte[] { 180, 157, 141, 134, 130 };
public static readonly byte[] Vp9Cat6Prob = new byte[] { 254, 254, 254, 252, 249, 243, 230, 196, 177, 153, 140, 133, 130, 129 };
public static ReadOnlySpan<byte> Vp9Cat1Prob => new byte[] { 159 };
public static ReadOnlySpan<byte> Vp9Cat2Prob => new byte[] { 165, 145 };
public static ReadOnlySpan<byte> Vp9Cat3Prob => new byte[] { 173, 148, 140 };
public static ReadOnlySpan<byte> Vp9Cat4Prob => new byte[] { 176, 155, 140, 135 };
public static ReadOnlySpan<byte> Vp9Cat5Prob => new byte[] { 180, 157, 141, 134, 130 };
public static ReadOnlySpan<byte> Vp9Cat6Prob => new byte[] { 254, 254, 254, 252, 249, 243, 230, 196, 177, 153, 140, 133, 130, 129 };
public static readonly byte[] Vp9Cat6ProbHigh12 = new byte[]
public static ReadOnlySpan<byte> Vp9Cat6ProbHigh12 => new byte[]
{
255, 255, 255, 255, 254, 254, 54, 252, 249, 243, 230, 196, 177, 153, 140, 133, 130, 129
};
@@ -1131,12 +1129,12 @@ namespace Ryujinx.Graphics.Nvdec.Vp9
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
};
private static readonly byte[] Vp9CoefbandTrans4X4 = new byte[]
private static ReadOnlySpan<byte> Vp9CoefbandTrans4X4 => new byte[]
{
0, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5,
};
public static byte[] get_band_translate(TxSize txSize)
public static ReadOnlySpan<byte> get_band_translate(TxSize txSize)
{
return txSize == TxSize.Tx4x4 ? Vp9CoefbandTrans4X4 : Vp9CoefbandTrans8X8Plus;
}
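
Note on the pattern used in this file: a static ReadOnlySpan<byte> property whose body is a constant byte array initializer can be compiled to a span over data embedded in the assembly, so it should not allocate a new array on each access. A minimal sketch with made-up values; LutSketch, Table and TableSkip2 are hypothetical names:

```csharp
using System;

internal static class LutSketch
{
    // Illustrative values only, not taken from the VP9 tables.
    public static ReadOnlySpan<byte> Table => new byte[] { 10, 20, 30, 40, 50 };

    // Mirrors the 10-bit path in the diff, which slices off the first two entries.
    public static ReadOnlySpan<byte> TableSkip2 => Table.Slice(2);

    // Sums every element of the span.
    public static int Sum(ReadOnlySpan<byte> span)
    {
        int sum = 0;

        foreach (byte b in span)
        {
            sum += b;
        }

        return sum;
    }
}
```

Because the property already returns a span, callers can call Slice directly, which is why the explicit new ReadOnlySpan<byte>(...) wrapper is no longer needed at the call site.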

@@ -18,6 +18,7 @@ namespace Ryujinx.Graphics.OpenGL
private static readonly Lazy<bool> _supportsQuads = new Lazy<bool>(SupportsQuadsCheck);
private static readonly Lazy<bool> _supportsSeamlessCubemapPerTexture = new Lazy<bool>(() => HasExtension("GL_ARB_seamless_cubemap_per_texture"));
private static readonly Lazy<bool> _supportsShaderBallot = new Lazy<bool>(() => HasExtension("GL_ARB_shader_ballot"));
private static readonly Lazy<bool> _supportsShaderViewportLayerArray = new Lazy<bool>(() => HasExtension("GL_ARB_shader_viewport_layer_array"));
private static readonly Lazy<bool> _supportsTextureCompressionBptc = new Lazy<bool>(() => HasExtension("GL_EXT_texture_compression_bptc"));
private static readonly Lazy<bool> _supportsTextureCompressionRgtc = new Lazy<bool>(() => HasExtension("GL_EXT_texture_compression_rgtc"));
private static readonly Lazy<bool> _supportsTextureCompressionS3tc = new Lazy<bool>(() => HasExtension("GL_EXT_texture_compression_s3tc"));
@@ -61,6 +62,7 @@ namespace Ryujinx.Graphics.OpenGL
public static bool SupportsQuads => _supportsQuads.Value;
public static bool SupportsSeamlessCubemapPerTexture => _supportsSeamlessCubemapPerTexture.Value;
public static bool SupportsShaderBallot => _supportsShaderBallot.Value;
public static bool SupportsShaderViewportLayerArray => _supportsShaderViewportLayerArray.Value;
public static bool SupportsTextureCompressionBptc => _supportsTextureCompressionBptc.Value;
public static bool SupportsTextureCompressionRgtc => _supportsTextureCompressionRgtc.Value;
public static bool SupportsTextureCompressionS3tc => _supportsTextureCompressionS3tc.Value;

@@ -117,12 +117,13 @@ namespace Ryujinx.Graphics.OpenGL
supportsFragmentShaderOrderingIntel: HwCapabilities.SupportsFragmentShaderOrdering,
supportsGeometryShaderPassthrough: HwCapabilities.SupportsGeometryShaderPassthrough,
supportsImageLoadFormatted: HwCapabilities.SupportsImageLoadFormatted,
supportsLayerVertexTessellation: HwCapabilities.SupportsShaderViewportLayerArray,
supportsMismatchingViewFormat: HwCapabilities.SupportsMismatchingViewFormat,
supportsCubemapView: true,
supportsNonConstantTextureOffset: HwCapabilities.SupportsNonConstantTextureOffset,
supportsShaderBallot: HwCapabilities.SupportsShaderBallot,
supportsTextureShadowLod: HwCapabilities.SupportsTextureShadowLod,
supportsViewportIndex: true,
supportsViewportIndex: HwCapabilities.SupportsShaderViewportLayerArray,
supportsViewportSwizzle: HwCapabilities.SupportsViewportSwizzle,
supportsIndirectParameters: HwCapabilities.SupportsIndirectParameters,
maximumUniformBuffersPerStage: 13, // TODO: Avoid hardcoding those limits here and get from driver?

@@ -1829,7 +1829,7 @@ namespace Ryujinx.Graphics.Shader.CodeGen.Spirv
if (texOp.Index < 2 || (type & SamplerType.Mask) == SamplerType.Texture3D)
{
result = ScalingHelpers.ApplyUnscaling(context, texOp, result, isBindless, isIndexed);
result = ScalingHelpers.ApplyUnscaling(context, texOp.WithType(type), result, isBindless, isIndexed);
}
return new OperationResult(AggregateType.S32, result);

@@ -10,5 +10,7 @@ namespace Ryujinx.Graphics.Shader
public const int NvnBaseVertexByteOffset = 0x640;
public const int NvnBaseInstanceByteOffset = 0x644;
public const int NvnDrawIndexByteOffset = 0x648;
public const int StorageAlignment = 16;
}
}

@@ -177,6 +177,15 @@ namespace Ryujinx.Graphics.Shader
return false;
}
/// <summary>
/// Queries whether the current draw uses unaligned storage buffer addresses.
/// </summary>
/// <returns>True if any storage buffer address is not aligned to 16 bytes, false otherwise</returns>
bool QueryHasUnalignedStorageBuffer()
{
return false;
}
/// <summary>
/// Queries host about the presence of the FrontFacing built-in variable bug.
/// </summary>
@@ -249,6 +258,15 @@ namespace Ryujinx.Graphics.Shader
return true;
}
/// <summary>
/// Queries host support for writes to Layer from vertex or tessellation shader stages.
/// </summary>
/// <returns>True if writes to layer from vertex or tessellation are supported, false otherwise</returns>
bool QueryHostSupportsLayerVertexTessellation()
{
return true;
}
/// <summary>
/// Queries host GPU non-constant texture offset support.
/// </summary>

@@ -278,13 +278,21 @@ namespace Ryujinx.Graphics.Shader.Instructions
private static int FixedFuncToUserAttribute(ShaderConfig config, int attr, bool isOutput)
{
if (attr >= AttributeConsts.FrontColorDiffuseR && attr < AttributeConsts.ClipDistance0)
bool supportsLayerFromVertexOrTess = config.GpuAccessor.QueryHostSupportsLayerVertexTessellation();
int fixedStartAttr = supportsLayerFromVertexOrTess ? 0 : 1;
if (attr == AttributeConsts.Layer && config.Stage != ShaderStage.Geometry && !supportsLayerFromVertexOrTess)
{
attr = FixedFuncToUserAttribute(config, attr, AttributeConsts.FrontColorDiffuseR, 0, isOutput);
attr = FixedFuncToUserAttribute(config, attr, AttributeConsts.Layer, 0, isOutput);
config.SetLayerOutputAttribute(attr);
}
else if (attr >= AttributeConsts.FrontColorDiffuseR && attr < AttributeConsts.ClipDistance0)
{
attr = FixedFuncToUserAttribute(config, attr, AttributeConsts.FrontColorDiffuseR, fixedStartAttr, isOutput);
}
else if (attr >= AttributeConsts.TexCoordBase && attr < AttributeConsts.TexCoordEnd)
{
attr = FixedFuncToUserAttribute(config, attr, AttributeConsts.TexCoordBase, 4, isOutput);
attr = FixedFuncToUserAttribute(config, attr, AttributeConsts.TexCoordBase, fixedStartAttr + 4, isOutput);
}
return attr;

@@ -27,5 +27,10 @@ namespace Ryujinx.Graphics.Shader.StructuredIr
CbufSlot = cbufSlot;
Handle = handle;
}
public AstTextureOperation WithType(SamplerType type)
{
return new AstTextureOperation(Inst, type, Format, Flags, CbufSlot, Handle, Index);
}
}
}

@@ -34,7 +34,7 @@ namespace Ryujinx.Graphics.Shader.Translation.Optimizations
// we can guess which storage buffer it is accessing.
// We can then replace the global memory access with a storage
// buffer access.
node = ReplaceGlobalWithStorage(node, config, storageIndex);
node = ReplaceGlobalWithStorage(block, node, config, storageIndex);
}
else if (config.Stage == ShaderStage.Compute && operation.Inst == Instruction.LoadGlobal)
{
@@ -54,7 +54,7 @@ namespace Ryujinx.Graphics.Shader.Translation.Optimizations
}
}
private static LinkedListNode<INode> ReplaceGlobalWithStorage(LinkedListNode<INode> node, ShaderConfig config, int storageIndex)
private static LinkedListNode<INode> ReplaceGlobalWithStorage(BasicBlock block, LinkedListNode<INode> node, ShaderConfig config, int storageIndex)
{
Operation operation = (Operation)node.Value;
@@ -64,42 +64,10 @@ namespace Ryujinx.Graphics.Shader.Translation.Optimizations
config.SetUsedStorageBuffer(storageIndex, isWrite);
Operand GetStorageOffset()
{
Operand addrLow = operation.GetSource(0);
Operand baseAddrLow = Cbuf(0, GetStorageCbOffset(config.Stage, storageIndex));
Operand baseAddrTrunc = Local();
Operand alignMask = Const(-config.GpuAccessor.QueryHostStorageBufferOffsetAlignment());
Operation andOp = new Operation(Instruction.BitwiseAnd, baseAddrTrunc, baseAddrLow, alignMask);
node.List.AddBefore(node, andOp);
Operand byteOffset = Local();
Operation subOp = new Operation(Instruction.Subtract, byteOffset, addrLow, baseAddrTrunc);
node.List.AddBefore(node, subOp);
if (isStg16Or8)
{
return byteOffset;
}
Operand wordOffset = Local();
Operation shrOp = new Operation(Instruction.ShiftRightU32, wordOffset, byteOffset, Const(2));
node.List.AddBefore(node, shrOp);
return wordOffset;
}
Operand[] sources = new Operand[operation.SourcesCount];
sources[0] = Const(storageIndex);
sources[1] = GetStorageOffset();
sources[1] = GetStorageOffset(block, node, config, storageIndex, operation.GetSource(0), isStg16Or8);
for (int index = 2; index < operation.SourcesCount; index++)
{
@@ -144,6 +112,170 @@ namespace Ryujinx.Graphics.Shader.Translation.Optimizations
return node;
}
private static Operand GetStorageOffset(
BasicBlock block,
LinkedListNode<INode> node,
ShaderConfig config,
int storageIndex,
Operand addrLow,
bool isStg16Or8)
{
int baseAddressCbOffset = GetStorageCbOffset(config.Stage, storageIndex);
bool storageAligned = !(config.GpuAccessor.QueryHasUnalignedStorageBuffer() || config.GpuAccessor.QueryHostStorageBufferOffsetAlignment() > Constants.StorageAlignment);
(Operand byteOffset, int constantOffset) = storageAligned ?
GetStorageOffset(block, Utils.FindLastOperation(addrLow, block), baseAddressCbOffset) :
(null, 0);
if (byteOffset == null)
{
Operand baseAddrLow = Cbuf(0, baseAddressCbOffset);
Operand baseAddrTrunc = Local();
Operand alignMask = Const(-config.GpuAccessor.QueryHostStorageBufferOffsetAlignment());
Operation andOp = new Operation(Instruction.BitwiseAnd, baseAddrTrunc, baseAddrLow, alignMask);
node.List.AddBefore(node, andOp);
Operand offset = Local();
Operation subOp = new Operation(Instruction.Subtract, offset, addrLow, baseAddrTrunc);
node.List.AddBefore(node, subOp);
byteOffset = offset;
}
else if (constantOffset != 0)
{
Operand offset = Local();
Operation addOp = new Operation(Instruction.Add, offset, byteOffset, Const(constantOffset));
node.List.AddBefore(node, addOp);
byteOffset = offset;
}
if (byteOffset != null)
{
ReplaceAddressAlignment(node.List, addrLow, byteOffset, constantOffset);
}
if (isStg16Or8)
{
return byteOffset;
}
Operand wordOffset = Local();
Operation shrOp = new Operation(Instruction.ShiftRightU32, wordOffset, byteOffset, Const(2));
node.List.AddBefore(node, shrOp);
return wordOffset;
}
private static bool IsCb0Offset(Operand operand, int offset)
{
return operand.Type == OperandType.ConstantBuffer && operand.GetCbufSlot() == 0 && operand.GetCbufOffset() == offset;
}
private static void ReplaceAddressAlignment(LinkedList<INode> list, Operand address, Operand byteOffset, int constantOffset)
{
// When we emit 16/8-bit LDG, we add extra code to determine the address alignment.
// Eliminate the storage buffer base address from this too, leaving only the byte offset.
foreach (INode useNode in address.UseOps)
{
if (useNode is Operation op && op.Inst == Instruction.BitwiseAnd)
{
Operand src1 = op.GetSource(0);
Operand src2 = op.GetSource(1);
int addressIndex = -1;
if (src1 == address && src2.Type == OperandType.Constant && src2.Value == 3)
{
addressIndex = 0;
}
else if (src2 == address && src1.Type == OperandType.Constant && src1.Value == 3)
{
addressIndex = 1;
}
if (addressIndex != -1)
{
LinkedListNode<INode> node = list.Find(op);
// Add offset calculation before the use. Needs to be on the same block.
if (node != null)
{
Operand offset = Local();
Operation addOp = new Operation(Instruction.Add, offset, byteOffset, Const(constantOffset));
list.AddBefore(node, addOp);
op.SetSource(addressIndex, offset);
}
}
}
}
}
private static (Operand, int) GetStorageOffset(BasicBlock block, Operand address, int baseAddressCbOffset)
{
if (IsCb0Offset(address, baseAddressCbOffset))
{
// Direct offset: zero.
return (Const(0), 0);
}
(address, int constantOffset) = GetStorageConstantOffset(block, address);
address = Utils.FindLastOperation(address, block);
if (IsCb0Offset(address, baseAddressCbOffset))
{
// Only constant offset
return (Const(0), constantOffset);
}
if (!(address.AsgOp is Operation offsetAdd) || offsetAdd.Inst != Instruction.Add)
{
return (null, 0);
}
Operand src1 = offsetAdd.GetSource(0);
Operand src2 = Utils.FindLastOperation(offsetAdd.GetSource(1), block);
if (IsCb0Offset(src2, baseAddressCbOffset))
{
return (src1, constantOffset);
}
else if (IsCb0Offset(src1, baseAddressCbOffset))
{
return (src2, constantOffset);
}
return (null, 0);
}
private static (Operand, int) GetStorageConstantOffset(BasicBlock block, Operand address)
{
if (!(address.AsgOp is Operation offsetAdd) || offsetAdd.Inst != Instruction.Add)
{
return (address, 0);
}
Operand src1 = offsetAdd.GetSource(0);
Operand src2 = offsetAdd.GetSource(1);
if (src2.Type != OperandType.Constant)
{
return (address, 0);
}
return (src1, src2.Value);
}
private static LinkedListNode<INode> ReplaceLdgWithLdc(LinkedListNode<INode> node, ShaderConfig config, int storageIndex)
{
Operation operation = (Operation)node.Value;
@@ -165,7 +297,7 @@ namespace Ryujinx.Graphics.Shader.Translation.Optimizations
Operand byteOffset = Local();
Operand wordOffset = Local();
Operation subOp = new Operation(Instruction.Subtract, byteOffset, addrLow, baseAddrTrunc);
Operation subOp = new Operation(Instruction.Subtract, byteOffset, addrLow, baseAddrTrunc);
Operation shrOp = new Operation(Instruction.ShiftRightU32, wordOffset, byteOffset, Const(2));
node.List.AddBefore(node, subOp);
@@ -260,7 +392,7 @@ namespace Ryujinx.Graphics.Shader.Translation.Optimizations
{
if (operand.Type == OperandType.ConstantBuffer)
{
int slot = operand.GetCbufSlot();
int slot = operand.GetCbufSlot();
int offset = operand.GetCbufOffset();
if (slot == 0 && offset >= sbStart && offset < sbEnd)
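
Note on the fallback path in GetStorageOffset: when the direct offset cannot be recovered, the base address is truncated to the host alignment and subtracted from the access address, then shifted right by 2 to get a word offset. The arithmetic can be sketched on plain integers as below, assuming the host alignment is a power of two; StorageOffsetSketch and its method names are hypothetical:

```csharp
public static class StorageOffsetSketch
{
    // Byte offset of an access relative to the base address truncated to the host alignment.
    public static int ByteOffset(int addrLow, int baseAddrLow, int hostAlignment)
    {
        // Assumption: hostAlignment is a power of two, so its negation is a mask clearing the low bits.
        int alignMask = -hostAlignment;
        int baseAddrTrunc = baseAddrLow & alignMask;

        return addrLow - baseAddrTrunc;
    }

    // Word offset for 32-bit accesses (an unsigned shift, like ShiftRightU32).
    public static int WordOffset(int byteOffset)
    {
        return (int)((uint)byteOffset >> 2);
    }
}
```

With a base of 0x1010 and an access at 0x1018, a host alignment of 0x100 truncates the base to 0x1000 and gives byte offset 0x18. With an alignment of 16 the mask leaves the base unchanged and the byte offset is 8, which is the case where the new code can skip the mask and reuse the offset operand directly.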

@@ -48,6 +48,9 @@ namespace Ryujinx.Graphics.Shader.Translation
public int Cb1DataSize { get; private set; }
public bool LayerOutputWritten { get; private set; }
public int LayerOutputAttribute { get; private set; }
public bool NextUsesFixedFuncAttributes { get; private set; }
public int UsedInputAttributes { get; private set; }
public int UsedOutputAttributes { get; private set; }
@@ -131,6 +134,20 @@ namespace Ryujinx.Graphics.Shader.Translation
_usedImages = new Dictionary<TextureInfo, TextureMeta>();
}
public ShaderConfig(
ShaderStage stage,
OutputTopology outputTopology,
int maxOutputVertices,
IGpuAccessor gpuAccessor,
TranslationOptions options) : this(gpuAccessor, options)
{
Stage = stage;
ThreadsPerInputPrimitive = 1;
OutputTopology = outputTopology;
MaxOutputVertices = maxOutputVertices;
TransformFeedbackEnabled = gpuAccessor.QueryTransformFeedbackEnabled();
}
public ShaderConfig(ShaderHeader header, IGpuAccessor gpuAccessor, TranslationOptions options) : this(gpuAccessor, options)
{
Stage = header.Stage;
@@ -240,6 +257,12 @@ namespace Ryujinx.Graphics.Shader.Translation
}
}
public void SetLayerOutputAttribute(int attr)
{
LayerOutputWritten = true;
LayerOutputAttribute = attr;
}
public void SetInputUserAttributeFixedFunc(int index)
{
UsedInputAttributes |= 1 << index;
@@ -694,5 +717,20 @@ namespace Ryujinx.Graphics.Shader.Translation
{
return FindDescriptorIndex(GetImageDescriptors(), texOp);
}
public ShaderProgramInfo CreateProgramInfo()
{
return new ShaderProgramInfo(
GetConstantBufferDescriptors(),
GetStorageBufferDescriptors(),
GetTextureDescriptors(),
GetImageDescriptors(),
Stage,
UsedFeatures.HasFlag(FeatureFlags.InstanceId),
UsedFeatures.HasFlag(FeatureFlags.DrawParameters),
UsedFeatures.HasFlag(FeatureFlags.RtLayer),
ClipDistancesWritten,
OmapTargets);
}
}
}

@@ -79,17 +79,7 @@ namespace Ryujinx.Graphics.Shader.Translation
var sInfo = StructuredProgram.MakeStructuredProgram(funcs, config);
var info = new ShaderProgramInfo(
config.GetConstantBufferDescriptors(),
config.GetStorageBufferDescriptors(),
config.GetTextureDescriptors(),
config.GetImageDescriptors(),
config.Stage,
config.UsedFeatures.HasFlag(FeatureFlags.InstanceId),
config.UsedFeatures.HasFlag(FeatureFlags.DrawParameters),
config.UsedFeatures.HasFlag(FeatureFlags.RtLayer),
config.ClipDistancesWritten,
config.OmapTargets);
var info = config.CreateProgramInfo();
return config.Options.TargetLanguage switch
{

@@ -1,7 +1,12 @@
using Ryujinx.Graphics.Shader.Decoders;
using Ryujinx.Graphics.Shader.CodeGen.Glsl;
using Ryujinx.Graphics.Shader.CodeGen.Spirv;
using Ryujinx.Graphics.Shader.Decoders;
using Ryujinx.Graphics.Shader.IntermediateRepresentation;
using Ryujinx.Graphics.Shader.StructuredIr;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using static Ryujinx.Graphics.Shader.IntermediateRepresentation.OperandHelper;
using static Ryujinx.Graphics.Shader.Translation.Translator;
@@ -18,6 +23,7 @@ namespace Ryujinx.Graphics.Shader.Translation
public ShaderStage Stage => _config.Stage;
public int Size => _config.Size;
public int Cb1DataSize => _config.Cb1DataSize;
public bool LayerOutputWritten => _config.LayerOutputWritten;
public IGpuAccessor GpuAccessor => _config.GpuAccessor;
@@ -149,5 +155,94 @@ namespace Ryujinx.Graphics.Shader.Translation
return Translator.Translate(code, _config);
}
public ShaderProgram GenerateGeometryPassthrough()
{
int outputAttributesMask = _config.UsedOutputAttributes;
int layerOutputAttr = _config.LayerOutputAttribute;
OutputTopology outputTopology;
int maxOutputVertices;
switch (GpuAccessor.QueryPrimitiveTopology())
{
case InputTopology.Points:
outputTopology = OutputTopology.PointList;
maxOutputVertices = 1;
break;
case InputTopology.Lines:
case InputTopology.LinesAdjacency:
outputTopology = OutputTopology.LineStrip;
maxOutputVertices = 2;
break;
default:
outputTopology = OutputTopology.TriangleStrip;
maxOutputVertices = 3;
break;
}
ShaderConfig config = new ShaderConfig(ShaderStage.Geometry, outputTopology, maxOutputVertices, GpuAccessor, _config.Options);
EmitterContext context = new EmitterContext(default, config, false);
for (int v = 0; v < maxOutputVertices; v++)
{
int outAttrsMask = outputAttributesMask;
while (outAttrsMask != 0)
{
int attrIndex = BitOperations.TrailingZeroCount(outAttrsMask);
outAttrsMask &= ~(1 << attrIndex);
for (int c = 0; c < 4; c++)
{
int attr = AttributeConsts.UserAttributeBase + attrIndex * 16 + c * 4;
Operand value = context.LoadAttribute(Const(attr), Const(0), Const(v));
if (attr == layerOutputAttr)
{
context.Copy(Attribute(AttributeConsts.Layer), value);
}
else
{
context.Copy(Attribute(attr), value);
config.SetOutputUserAttribute(attrIndex);
}
config.SetInputUserAttribute(attrIndex, c);
}
}
for (int c = 0; c < 4; c++)
{
int attr = AttributeConsts.PositionX + c * 4;
Operand value = context.LoadAttribute(Const(attr), Const(0), Const(v));
context.Copy(Attribute(attr), value);
}
context.EmitVertex();
}
context.EndPrimitive();
var operations = context.GetOperations();
var cfg = ControlFlowGraph.Create(operations);
var function = new Function(cfg.Blocks, "main", false, 0, 0);
var sInfo = StructuredProgram.MakeStructuredProgram(new[] { function }, config);
var info = config.CreateProgramInfo();
return config.Options.TargetLanguage switch
{
TargetLanguage.Glsl => new ShaderProgram(info, TargetLanguage.Glsl, GlslGenerator.Generate(sInfo, config)),
TargetLanguage.Spirv => new ShaderProgram(info, TargetLanguage.Spirv, SpirvGenerator.Generate(sInfo, config)),
_ => throw new NotImplementedException(config.Options.TargetLanguage.ToString())
};
}
}
}
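
Note on the attribute loop in GenerateGeometryPassthrough: it walks the set bits of UsedOutputAttributes with TrailingZeroCount and clears each bit after visiting it, then derives one attribute offset per component. A minimal sketch of that walk; AttributeMaskSketch is a hypothetical name, and the attribute base is passed in as a parameter because its real value does not appear in this diff:

```csharp
using System.Collections.Generic;
using System.Numerics;

public static class AttributeMaskSketch
{
    // Returns the indices of the set bits in mask, lowest first.
    public static List<int> SetBits(int mask)
    {
        var indices = new List<int>();

        while (mask != 0)
        {
            int index = BitOperations.TrailingZeroCount(mask);

            // Clear the bit just visited so the loop terminates.
            mask &= ~(1 << index);
            indices.Add(index);
        }

        return indices;
    }

    // Offset of one component: 16 bytes per attribute, 4 bytes per component.
    public static int ComponentOffset(int userAttributeBase, int attrIndex, int component)
    {
        return userAttributeBase + attrIndex * 16 + component * 4;
    }
}
```

The loop runs once per used attribute rather than once per possible slot, so a sparse mask stays cheap.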

@@ -210,7 +210,10 @@ namespace Ryujinx.Graphics.Vulkan
}
}
if (cbs != null && !(_buffer.HasCommandBufferDependency(cbs.Value) && _waitable.IsBufferRangeInUse(cbs.Value.CommandBufferIndex, offset, dataSize)))
if (cbs != null &&
_gd.PipelineInternal.RenderPassActive &&
!(_buffer.HasCommandBufferDependency(cbs.Value) &&
_waitable.IsBufferRangeInUse(cbs.Value.CommandBufferIndex, offset, dataSize)))
{
// If the buffer hasn't been used on the command buffer yet, try to preload the data.
// This avoids ending and beginning render passes on each buffer data upload.

@@ -130,6 +130,12 @@ namespace Ryujinx.Graphics.Vulkan
1f));
}
public void Initialize()
{
Span<byte> dummyTextureData = stackalloc byte[4];
_dummyTexture.SetData(dummyTextureData);
}
public void SetProgram(ShaderCollection program)
{
_program = program;

@@ -49,7 +49,6 @@ namespace Ryujinx.Graphics.Vulkan
private Auto<DisposableFramebuffer> _framebuffer;
private Auto<DisposableRenderPass> _renderPass;
private int _writtenAttachmentCount;
private bool _renderPassActive;
private readonly DescriptorSetUpdater _descriptorSetUpdater;
@@ -73,6 +72,7 @@ namespace Ryujinx.Graphics.Vulkan
private PipelineColorBlendAttachmentState[] _storedBlend;
public ulong DrawCount { get; private set; }
public bool RenderPassActive { get; private set; }
public unsafe PipelineBase(VulkanRenderer gd, Device device)
{
@@ -114,6 +114,8 @@ namespace Ryujinx.Graphics.Vulkan
public void Initialize()
{
_descriptorSetUpdater.Initialize();
SupportBufferUpdater = new SupportBufferUpdater(Gd);
SupportBufferUpdater.UpdateRenderScale(_renderScale, 0, SupportBuffer.RenderScaleMaxCount);
@@ -838,6 +840,11 @@ namespace Ryujinx.Graphics.Vulkan
stages.CopyTo(_newState.Stages.AsSpan().Slice(0, stages.Length));
SignalStateChange();
if (_program.IsCompute)
{
EndRenderPass();
}
}
public void Specialize<T>(in T data) where T : unmanaged
@@ -1451,7 +1458,7 @@ namespace Ryujinx.Graphics.Vulkan
private unsafe void BeginRenderPass()
{
if (!_renderPassActive)
if (!RenderPassActive)
{
var renderArea = new Rect2D(null, new Extent2D(FramebufferParams.Width, FramebufferParams.Height));
var clearValue = new ClearValue();
@@ -1467,18 +1474,18 @@ namespace Ryujinx.Graphics.Vulkan
};
Gd.Api.CmdBeginRenderPass(CommandBuffer, renderPassBeginInfo, SubpassContents.Inline);
_renderPassActive = true;
RenderPassActive = true;
}
}
public void EndRenderPass()
{
if (_renderPassActive)
if (RenderPassActive)
{
PauseTransformFeedbackInternal();
Gd.Api.CmdEndRenderPass(CommandBuffer);
SignalRenderPassEnd();
_renderPassActive = false;
RenderPassActive = false;
}
}

@@ -19,6 +19,7 @@ namespace Ryujinx.Graphics.Vulkan
public bool HasMinimalLayout { get; }
public bool UsePushDescriptors { get; }
public bool IsCompute { get; }
public uint Stages { get; }
@@ -47,7 +48,6 @@ namespace Ryujinx.Graphics.Vulkan
private VulkanRenderer _gd;
private Device _device;
private bool _initialized;
private bool _isCompute;
private ProgramPipelineState _state;
private DisposableRenderPass _dummyRenderPass;
@@ -91,7 +91,7 @@ namespace Ryujinx.Graphics.Vulkan
if (shader.StageFlags == ShaderStageFlags.ShaderStageComputeBit)
{
_isCompute = true;
IsCompute = true;
}
internalShaders[i] = shader;
@@ -163,7 +163,7 @@ namespace Ryujinx.Graphics.Vulkan
try
{
if (_isCompute)
if (IsCompute)
{
CreateBackgroundComputePipeline();
}

@@ -396,6 +396,7 @@ namespace Ryujinx.Graphics.Vulkan
supportsFragmentShaderOrderingIntel: false,
supportsGeometryShaderPassthrough: Capabilities.SupportsGeometryShaderPassthrough,
supportsImageLoadFormatted: features2.Features.ShaderStorageImageReadWithoutFormat,
supportsLayerVertexTessellation: featuresVk12.ShaderOutputLayer,
supportsMismatchingViewFormat: true,
supportsCubemapView: !IsAmdGcn,
supportsNonConstantTextureOffset: false,

@@ -26,8 +26,8 @@ namespace Ryujinx.HLE.HOS.Services.Am.AppletOE.ApplicationProxyService.Applicati
{
class IApplicationFunctions : IpcService
{
private ulong _defaultSaveDataSize = 200000000;
private ulong _defaultJournalSaveDataSize = 200000000;
private long _defaultSaveDataSize = 200000000;
private long _defaultJournalSaveDataSize = 200000000;
private KEvent _gpuErrorDetectedSystemEvent;
private KEvent _friendInvitationStorageChannelEvent;
@@ -203,13 +203,13 @@ namespace Ryujinx.HLE.HOS.Services.Am.AppletOE.ApplicationProxyService.Applicati
}
[CommandHipc(25)] // 3.0.0+
// ExtendSaveData(u8 save_data_type, nn::account::Uid, u64 save_size, u64 journal_size) -> u64 result_code
// ExtendSaveData(u8 save_data_type, nn::account::Uid, s64 save_size, s64 journal_size) -> u64 result_code
public ResultCode ExtendSaveData(ServiceCtx context)
{
SaveDataType saveDataType = (SaveDataType)context.RequestData.ReadUInt64();
Uid userId = context.RequestData.ReadStruct<Uid>();
ulong saveDataSize = context.RequestData.ReadUInt64();
ulong journalSize = context.RequestData.ReadUInt64();
long saveDataSize = context.RequestData.ReadInt64();
long journalSize = context.RequestData.ReadInt64();
// NOTE: Service calls nn::fs::ExtendApplicationSaveData.
// Since LibHac currently doesn't support this method, we can stub it for now.
@@ -225,7 +225,7 @@ namespace Ryujinx.HLE.HOS.Services.Am.AppletOE.ApplicationProxyService.Applicati
}
[CommandHipc(26)] // 3.0.0+
// GetSaveDataSize(u8 save_data_type, nn::account::Uid) -> (u64 save_size, u64 journal_size)
// GetSaveDataSize(u8 save_data_type, nn::account::Uid) -> (s64 save_size, s64 journal_size)
public ResultCode GetSaveDataSize(ServiceCtx context)
{
SaveDataType saveDataType = (SaveDataType)context.RequestData.ReadUInt64();
@@ -268,6 +268,23 @@ namespace Ryujinx.HLE.HOS.Services.Am.AppletOE.ApplicationProxyService.Applicati
return ResultCode.Success;
}
+[CommandHipc(28)] // 11.0.0+
+// GetSaveDataSizeMax() -> (s64 save_size_max, s64 journal_size_max)
+public ResultCode GetSaveDataSizeMax(ServiceCtx context)
+{
+// NOTE: We are currently using a stub for GetSaveDataSize() which returns the default values.
+// For this method we shouldn't return anything lower than that, but since we aren't interacting
+// with fs to get the actual sizes, we return the default values here as well.
+// This also helps in case ExtendSaveData() has been executed and the default values were modified.
+context.ResponseData.Write(_defaultSaveDataSize);
+context.ResponseData.Write(_defaultJournalSaveDataSize);
+Logger.Stub?.PrintStub(LogClass.ServiceAm);
+return ResultCode.Success;
+}
[CommandHipc(30)]
// BeginBlockingHomeButtonShortAndLongPressed()
public ResultCode BeginBlockingHomeButtonShortAndLongPressed(ServiceCtx context)
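
Note on the `ulong` to `long` change in this file: the same eight request bytes are interpreted differently depending on which read is used. The sketch below illustrates that difference in isolation. It assumes the request data is exposed through a `BinaryReader`-style API, and the `SizeReadSketch` class and its method names are hypothetical, not part of Ryujinx.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not Ryujinx code.
public static class SizeReadSketch
{
    // Serializes a signed 64-bit value as eight little-endian bytes,
    // the layout BinaryWriter always uses.
    public static byte[] Encode(long value)
    {
        using var stream = new MemoryStream();
        using var writer = new BinaryWriter(stream);
        writer.Write(value);
        writer.Flush();
        return stream.ToArray();
    }

    // Interprets the eight bytes as s64, as the updated handlers do.
    public static long ReadSigned(byte[] data)
    {
        using var reader = new BinaryReader(new MemoryStream(data));
        return reader.ReadInt64();
    }

    // Interprets the same eight bytes as u64, as the previous code did.
    public static ulong ReadUnsigned(byte[] data)
    {
        using var reader = new BinaryReader(new MemoryStream(data));
        return reader.ReadUInt64();
    }
}
```

For non-negative sizes such as the 200000000 default, both reads agree. They differ only when the top bit is set, where the unsigned read yields a very large size instead of a negative value.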


@@ -25,6 +25,7 @@ namespace Ryujinx.Memory.Tracking
private int _sequenceNumber;
private BitMap _sequenceNumberBitmap;
+private BitMap _dirtyCheckedBitmap;
private int _uncheckedHandles;
public bool Dirty { get; private set; } = true;
@@ -36,6 +37,7 @@ namespace Ryujinx.Memory.Tracking
_dirtyBitmap = new ConcurrentBitmap(_handles.Length, true);
_sequenceNumberBitmap = new BitMap(_handles.Length);
+_dirtyCheckedBitmap = new BitMap(_handles.Length);
int i = 0;
@@ -246,16 +248,18 @@ namespace Ryujinx.Memory.Tracking
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
-private void ParseDirtyBits(long dirtyBits, long mask, int index, long[] seqMasks, ref int baseBit, ref int prevHandle, ref ulong rgStart, ref ulong rgSize, Action<ulong, ulong> modifiedAction)
+private void ParseDirtyBits(long dirtyBits, long mask, int index, long[] seqMasks, long[] checkMasks, ref int baseBit, ref int prevHandle, ref ulong rgStart, ref ulong rgSize, Action<ulong, ulong> modifiedAction)
{
long seqMask = mask & ~seqMasks[index];
+long checkMask = (~dirtyBits) & seqMask;
dirtyBits &= seqMask;
while (dirtyBits != 0)
{
int bit = BitOperations.TrailingZeroCount(dirtyBits);
+long bitValue = 1L << bit;
-dirtyBits &= ~(1L << bit);
+dirtyBits &= ~bitValue;
int handleIndex = baseBit + bit;
@@ -273,11 +277,14 @@ namespace Ryujinx.Memory.Tracking
}
rgSize += handle.Size;
-handle.Reprotect();
+handle.Reprotect(false, (checkMasks[index] & bitValue) == 0);
+checkMasks[index] &= ~bitValue;
prevHandle = handleIndex;
}
+checkMasks[index] |= checkMask;
seqMasks[index] |= mask;
_uncheckedHandles -= BitOperations.PopCount((ulong)seqMask);
@@ -328,6 +335,7 @@ namespace Ryujinx.Memory.Tracking
ulong rgSize = 0;
long[] seqMasks = _sequenceNumberBitmap.Masks;
+long[] checkedMasks = _dirtyCheckedBitmap.Masks;
long[] masks = _dirtyBitmap.Masks;
int startIndex = startHandle >> ConcurrentBitmap.IntShift;
@@ -345,20 +353,20 @@ namespace Ryujinx.Memory.Tracking
if (startIndex == endIndex)
{
-ParseDirtyBits(startValue, startMask & endMask, startIndex, seqMasks, ref baseBit, ref prevHandle, ref rgStart, ref rgSize, modifiedAction);
+ParseDirtyBits(startValue, startMask & endMask, startIndex, seqMasks, checkedMasks, ref baseBit, ref prevHandle, ref rgStart, ref rgSize, modifiedAction);
}
else
{
-ParseDirtyBits(startValue, startMask, startIndex, seqMasks, ref baseBit, ref prevHandle, ref rgStart, ref rgSize, modifiedAction);
+ParseDirtyBits(startValue, startMask, startIndex, seqMasks, checkedMasks, ref baseBit, ref prevHandle, ref rgStart, ref rgSize, modifiedAction);
for (int i = startIndex + 1; i < endIndex; i++)
{
-ParseDirtyBits(Volatile.Read(ref masks[i]), -1L, i, seqMasks, ref baseBit, ref prevHandle, ref rgStart, ref rgSize, modifiedAction);
+ParseDirtyBits(Volatile.Read(ref masks[i]), -1L, i, seqMasks, checkedMasks, ref baseBit, ref prevHandle, ref rgStart, ref rgSize, modifiedAction);
}
long endValue = Volatile.Read(ref masks[endIndex]);
-ParseDirtyBits(endValue, endMask, endIndex, seqMasks, ref baseBit, ref prevHandle, ref rgStart, ref rgSize, modifiedAction);
+ParseDirtyBits(endValue, endMask, endIndex, seqMasks, checkedMasks, ref baseBit, ref prevHandle, ref rgStart, ref rgSize, modifiedAction);
}
if (rgSize != 0)
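
Note on the `checkMasks` bookkeeping in `ParseDirtyBits`: a bit in the check mask records that a handle was found clean the last time it was checked, so a handle that is dirty while its check bit is clear was also dirty on its previous check. The following is a minimal sketch of that logic for a single 64-bit word. It is an illustration under stated assumptions rather than the Ryujinx implementation: the `DirtyCheckSketch` class and `Parse` method are hypothetical, and region handles are reduced to bit indices.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical single-word model of the dirty/check mask bookkeeping; not Ryujinx code.
public static class DirtyCheckSketch
{
    // Returns each dirty bit in range that has not yet been checked in this sequence,
    // paired with whether this is a consecutive dirty check for that bit.
    public static List<(int Bit, bool Consecutive)> Parse(long dirtyBits, long mask, ref long seqWord, ref long checkWord)
    {
        var result = new List<(int Bit, bool Consecutive)>();

        // Only consider bits in range that were not already checked in this sequence.
        long seqMask = mask & ~seqWord;

        // Bits that are being checked now and are clean.
        long checkMask = ~dirtyBits & seqMask;

        dirtyBits &= seqMask;

        while (dirtyBits != 0)
        {
            // Visit set bits from lowest to highest.
            int bit = BitOperations.TrailingZeroCount(dirtyBits);
            long bitValue = 1L << bit;
            dirtyBits &= ~bitValue;

            // A clear check bit means the previous check did not find this bit clean.
            bool consecutive = (checkWord & bitValue) == 0;
            result.Add((bit, consecutive));

            checkWord &= ~bitValue;
        }

        // Remember which bits were found clean, and mark the whole range as checked.
        checkWord |= checkMask;
        seqWord |= mask;

        return result;
    }
}
```

With both words starting at zero, a first pass over dirty bits `0b0101` reports bits 0 and 2 as consecutive, since no clean check has been recorded yet. If the sequence word is then reset and a second pass sees `0b0110`, bit 1 is reported as not consecutive (it was clean last time) and bit 2 as consecutive.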


@@ -263,15 +263,15 @@ namespace Ryujinx.Memory.Tracking
/// </summary>
public void ForceDirty()
{
-_checkCount++;
Dirty = true;
}
/// <summary>
/// Consume the dirty flag for this handle, and reprotect so it can be set on the next write.
/// </summary>
-public void Reprotect(bool asDirty = false)
+/// <param name="asDirty">True if the handle should be reprotected as dirty, rather than have it cleared</param>
+/// <param name="consecutiveCheck">True if this reprotect is the result of consecutive dirty checks</param>
+public void Reprotect(bool asDirty, bool consecutiveCheck = false)
{
if (_volatile) return;
@@ -296,7 +296,7 @@ namespace Ryujinx.Memory.Tracking
}
else if (!asDirty)
{
-if (_checkCount > 0 && _checkCount < CheckCountForInfrequent)
+if (consecutiveCheck || (_checkCount > 0 && _checkCount < CheckCountForInfrequent))
{
if (++_volatileCount >= VolatileThreshold && _preAction == null)
{
@@ -313,6 +313,15 @@ namespace Ryujinx.Memory.Tracking
}
}
+/// <summary>
+/// Consume the dirty flag for this handle, and reprotect so it can be set on the next write.
+/// </summary>
+/// <param name="asDirty">True if the handle should be reprotected as dirty, rather than have it cleared</param>
+public void Reprotect(bool asDirty = false)
+{
+Reprotect(asDirty, false);
+}
/// <summary>
/// Register an action to perform when the tracked region is read or written.
/// The action is automatically removed after it runs.