Compare commits

..

57 Commits

Author SHA1 Message Date
43ebd7a9bb New shader cache implementation (#3194)
* New shader cache implementation

* Remove some debug code

* Take transform feedback varying count into account

* Create shader cache directory if it does not exist + fragment output map related fixes

* Remove debug code

* Only check texture descriptors if the constant buffer is bound

* Also check CPU VA on GetSpanMapped

* Remove more unused code and move cache related code

* XML docs + remove more unused methods

* Better codegen for TransformFeedbackDescriptor.AsSpan

* Support migration from old cache format, remove more unused code

Shader cache rebuild now also rewrites the shared toc and data files

* Fix migration error with BRX shaders

* Add a limit to the async translation queue

 Avoid async translation threads not being able to keep up and the queue growing very large

* Re-create specialization state on recompile

This might be required if a new version of the shader translator requires more or less state, or if there is a bug related to the GPU state access

* Make shader cache more error resilient

* Add some missing XML docs and move GpuAccessor docs to the interface/use inheritdoc

* Address early PR feedback

* Fix rebase

* Remove IRenderer.CompileShader and IShader interface, replace with new ShaderSource struct passed to CreateProgram directly

* Handle some missing exceptions

* Make shader cache purge delete both old and new shader caches

* Register textures on new specialization state

* Translate and compile shaders in forward order (eliminates diffs due to different binding numbers)

* Limit in-flight shader compilation to the maximum number of compilation threads

* Replace ParallelDiskCacheLoader state changed event with a callback function

* Better handling for invalid constant buffer 1 data length

* Do not create the old cache directory structure if the old cache does not exist

* Constant buffer use should be per-stage. This change will invalidate existing new caches (file format version was incremented)

* Replace rectangle texture with just coordinate normalization

* Skip incompatible shaders that are missing texture information, instead of crashing

This is required if we, for example, support new texture instruction to the shader translator, and then they allow access to textures that were not accessed before. In this scenario, the old cache entry is no longer usable

* Fix coordinates normalization on cubemap textures

* Check if title ID is null before combining shader cache path

* More robust constant buffer address validation on spec state

* More robust constant buffer address validation on spec state (2)

* Regenerate shader cache with one stream, rather than one per shader.

* Only create shader cache directory during initialization

* Logging improvements

* Proper shader program disposal

* PR feedback, and add a comment on serialized structs

* XML docs for RegisterTexture

Co-authored-by: riperiperi <rhy3756547@hotmail.com>
2022-04-10 10:49:44 -03:00
26a881176e Fix tail merge from block with conditional jump to multiple returns (#3267)
* Fix tail merge from block with conditional jump to multiple returns

* PPTC version bump
2022-04-09 16:56:50 +02:00
e44a43c7e1 Implement VMAD shader instruction and improve InvocationInfo and ISBERD handling (#3251)
* Implement VMAD shader instruction and improve InvocationInfo and ISBERD handling

* Shader cache version bump

* Fix typo
2022-04-08 12:42:39 +02:00
3139a85a2b Allow copy texture views to have mismatching multisample state (#3152) 2022-04-08 11:26:48 +02:00
a4e8bea866 Lop3Expression: Optimize expressions (#3184)
* lut3

* bugfixes

* TruthTable

* false/true -> 0/-1

* add or to expressions

* fix inversions

* increment cache version
2022-04-08 11:17:38 +02:00
6a9e9b5360 Remove save data creation prompt (#3252)
* begone

* review

* mods directory update
2022-04-08 11:09:35 +02:00
952f6f8a65 Calculate vertex buffer size from index buffer type (#3253)
* Calculate vertex buffer size from index buffer type

* We also need to update the size if first vertex changes
2022-04-08 11:02:06 +02:00
d04ba51bb0 amadeus: Improve and fix delay effect processing (#3205)
* amadeus: Improve and fix delay effect processing

This rework the delay effect processing by representing calculation with the appropriate matrix and by unrolling some loop in the code.
This allows better optimization by the JIT while making it more readeable.

Also fix a bug in the Surround code path found while looking back at my notes.

* Remove useless GetHashCode

* Address gdkchan's comments
2022-04-08 10:52:18 +02:00
55ee261363 service: hid: Signal event on AcquireNpadStyleSetUpdateEventHandle (#3247) 2022-04-07 15:43:14 -03:00
4e3a34412e Update to LibHac 0.16.1 (#3263) 2022-04-07 15:18:14 -03:00
3f4fb8f73a amadeus: Update to REV11 (#3230)
This should implement all ABI changes from REV11 on 14.0.0

As Nintendo changed the channel disposition for "legacy" effects (Delay, Reverb and Reverb 3D) to match the standard channel mapping, I took the liberty to just remap to the old disposition for now.
The proper changes will be handled at a later date with a complete rewriting of those 3 effects to be more readable (see https://github.com/Ryujinx/Ryujinx/pull/3205 for the first iteration of it).
2022-04-06 09:12:38 +02:00
56c56aa34d Do not clamp SNorm outputs to the [0, 1] range on OpenGL (#3260) 2022-04-05 18:09:06 -03:00
d4b960d348 Implement primitive restart draw arrays properly on OpenGL (#3256) 2022-04-04 18:43:24 -03:00
b2a225558d Do not force scissor on clear if scissor is disabled (#3258) 2022-04-04 18:30:43 -03:00
0ef0fc044a Small graphics abstraction layer cleanup (#3257) 2022-04-04 18:21:06 -03:00
04bd87ed5a Fix shader textureSize with multisample and buffer textures (#3240)
* Fix shader textureSize with multisample and buffer textures

* Replace out param with tuple return value
2022-04-04 14:43:58 -03:00
5158cdb308 infra: Put SDL2 headless release inside a GUI-less block in PR (#3218)
As title say.
2022-03-26 11:38:35 +01:00
1402d8391d Support NVDEC H264 interlaced video decoding and VIC deinterlacing (#3225)
* Support NVDEC H264 interlaced video decoding and VIC deinterlacing

* Remove unused code
2022-03-23 17:09:32 -03:00
e3b36db71c hle: Some cleanup (#3210)
* hle: Some cleanup

This PR cleaned up a bit the HLE folder and the VirtualFileSystem one, since we use LibHac, we can use some class of it directly instead of duplicate things. The "Content" of VFS folder is removed since it should be handled in the NCM service directly.
A larger cleanup should be done later since there is still be duplicated code here and there.

* Fix Headless.SDL2

* Addresses gdkchan feedback
2022-03-22 20:46:16 +01:00
ba0171d054 Memory.Tests: Make Multithreading test explicit (#3220) 2022-03-21 09:21:05 +01:00
d1146a5af2 Don't restore Viewport 0 if it hasn't been set yet. (#3219)
Fixes a driver crash when starting some games caused by #3217
2022-03-20 14:48:43 -03:00
79408b68c3 De-tile GOB when DMA copying from block linear to pitch kind memory regions (#3207)
* De-tile GOB when DMA copying from block linear to pitch kind memory regions

* XML docs + nits

* Remove using

* No flush for regular buffer copies

* Add back ulong casts, fix regression due to oversight
2022-03-20 13:55:07 -03:00
d461d4f68b Fix OpenGL issues with RTSS overlays and OBS Game Capture (#3217)
OpenGL game overlays and hooks tend to make a lot of assumptions about how games present frames to the screen, since presentation in OpenGL kind of sucks and they would like to have info such as the size of the screen, or if the contents are SRGB rather than linear.

There are two ways of getting this. OBS hooks swap buffers to get a frame for video capture, but it actually checks the bound framebuffer at the time. I made sure that this matches the output framebuffer (the window) so that the output matches the size. RTSS checks the viewport size by default, but this was actually set to the last used viewport by the game, causing the OSD to fly all across the screen depending on how it was used (or res scale). The viewport is now manually set to match the output framebuffer size.

In the case of RTSS, it also loads its resources by destructively setting a pixel pack parameter without regard to what it was set to by the guest application. OpenGL state can be set for a long period of time and is not expected to be set before each call to a method, so randomly changing it isn't great practice. To fix this, I've added a line to set the pixel unpack alignment back to 4 after presentation, which should cover RTSS loading its incredibly ugly font.

- RTSS and overlays that use it should no longer cause certain textures to load incorrectly. (mario kart 8, pokemon legends arceus)
- OBS Game Capture should no longer crop the game output incorrectly, flicker randomly, or capture with incorrect gamma.

This doesn't fix issues with how RTSS reports our frame timings.
2022-03-20 13:37:45 -03:00
b45d30acf8 oslc: Fix condition in GetSaveDataBackupSetting (#3208)
* oslc: Fix condition in GetSaveDataBackupSetting

This PR fixes a condition previously implemented in #3190 where ACNH can't be booted without an existing savedata.
Closes #3206

* Addresses gdkchan feedback
2022-03-20 13:25:29 -03:00
df70442c46 InstEmitMemoryEx: Barrier after write on ordered store (#3193)
* InstEmitMemoryEx: Barrier after write on ordered store

* increment ptc version

* 32
2022-03-19 10:32:35 -03:00
e2ffa5a125 ntc: Implement IEnsureNetworkClockAvailabilityService (#3192)
* ntc: Implement IEnsureNetworkClockAvailabilityService

This PR implement a basic `IEnsureNetworkClockAvailabilityService` checked by RE. It's needed by Splatoon 2 with Guest Internet Access enabled. Game is now playable with this setting.

* Update Ryujinx.HLE/HOS/Services/Nim/Ntc/StaticService/IEnsureNetworkClockAvailabilityService.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-03-15 04:07:07 +01:00
73feac5819 Caching local network info and using an event handler to invalidate as needed (improves menu slow down issue in FE3H) (#2761)
* Update IGeneralService.cs

Fix IPV4 local ip related frame drop in fire emblem by rewriting [CommandHipc(12)]

* Fix IPV4 Local IP Slowdown & Style Fixes

fix a missing space

* Remove unnecessary line

* Fix for hardcoding which index to use

* Replace argument with empty string.

By sending an empty string to Dns.GetHostAddresses("") you get back localhost info only.

* Add caching, undo change in GetCurrentIpAddress

Implement caching and revert the GetCurrentIP() function, speed improvements still present.

* Remove unnecessary using

* Syntax fixes and removing extra lines

Requested changes by AcK77

* Properly unsubscribe from event handler

Adds an unsubscribe in the dispose section of IGeneralService
2022-03-15 03:49:35 +01:00
e5ad1dfa48 Implement S8D24 texture format and tweak depth range detection (#2458) 2022-03-15 03:42:08 +01:00
79becc4b78 Dynamically increase buffer size when resizing (#2861)
* Grow buffers by 1.5x of its size when resizing

* Further restrict the cases where the dynamic expansion is done
2022-03-15 03:33:53 +01:00
223172ac0b Ui: Add option to show/hide console window (Windows-only) (#3170)
* Ui: Add option to show/hide console window (Windows-only)

* Ui: Only display Show Console menu item on Windows

* ConsoleHelper: Handle NULL case

This will never happen

* Address nits

* Address comments

* Address comments 2
2022-03-15 02:35:41 +01:00
8c9633d72f Initialize indexed inputs used on next shader stage (#3198) 2022-03-14 20:02:50 -03:00
1f93fd52d9 Do not initialize geometry shader passthrough attributes (#3196) 2022-03-14 19:35:41 -03:00
aac7bbd378 olsc: Implement GetSaveDataBackupSetting (#3190)
* olsc: Implement GetSaveDataBackupSetting

This PR implement GetSaveDataBackupSetting of OLSC service which is now needed by ACNH 2.0.5. The game is playable as usual if you use the same user profile as the original save file (I don't know if it was the case before), everything is checked by RE.

* addresses gdkchan feedback
2022-03-12 18:38:49 +01:00
bed516bfda Implement rotate stick 90 degrees clockwise (#3084)
* Implement swapping sticks

* Rotate 90 degrees clockwise

Co-authored-by: matesic.darko@gmail.com <Darkman1979>
2022-03-12 18:23:48 +01:00
69b05f9918 Fix GetUserDisableCount NRE (#3187) 2022-03-12 18:12:12 +01:00
fb7c80e928 Limit number of events that can be retrieved from GetDisplayVSyncEvent (#3188)
* Limit number of events that can be retrieved from GetDisplayVSyncEvent

* Cleaning

* Rename openDisplayInfos -> openDisplays
2022-03-12 17:56:19 +01:00
bb2f9df0a1 KThread: Fix GetPsr mask (#3180)
* ExecutionContext: GetPstate / SetPstate

* Put it in NativeContext

* KThread: Fix GetPsr mask

* ExecutionContext: Turn methods into Pstate property

* Address nit
2022-03-11 03:16:32 +01:00
54bfaa125d amadeus: Fix wrong Span usage in CopyHistories (#3181)
Fix a copypasta from the original Amadeus PR causing invalid
CopyHistories output.

Also added a missing size check.

This fix a crash in Mononoke Slashdown
2022-03-07 09:49:29 +01:00
7af9fcbc06 T32: Implement Data Processing (Modified Immediate) instructions (#3178)
* T32: Implement Data Processing (Modified Immediate) instructions

* Update tests

* switch -> lookup table
2022-03-06 22:25:01 +01:00
ee174be57c Mod loading from atmosphere SD directories (#3176)
* initial sd support

* GUI option

* alignment

* review changes
2022-03-06 22:12:01 +01:00
0bcbe32367 Only initialize shader outputs that are actually used on the next stage (#3054)
* Only initialize shader outputs that are actually used on the next stage

* Shader cache version bump
2022-03-06 20:42:13 +01:00
b97ff4da5e A32: Fix ALU immediate instructions (#3179)
* Tests: Add A32 tests for immediate ADC/ADCS/RSC/RSCS/SBC/SBCS

* A32: Fix bug in ADC/ADCS/RSC/RSCS/SBC/SBCS

* CpuTestAluImm32: Add more opcodes

* Increment PTC version
2022-03-05 15:23:10 -03:00
747081d2c7 Decoders: Fix instruction lengths for 16-bit B instructions (#3177) 2022-03-05 16:20:24 +01:00
497199bb50 Decoder: Exit on trapping instructions, and resume execution at trapping instruction (#3153)
* Decoder: Exit on trapping instructions, and resume execution at trapping instruction

* Resume at trapping address

* remove mustExit
2022-03-04 23:16:58 +01:00
bd9ac0fdaa T32: Implement B, B.cond, BL, BLX (#3155)
* Decoders: Make IsThumb a function of OpCode32

* OpCode32: Fix GetPc

* T32: Implement B, B.cond, BL, BLX

* rm usings
2022-03-04 23:05:08 +01:00
ac21abbb9d Preparation for initial Flatpack and FlatHub integration (#3173)
* Preparation for initial Flatpack and FlatHub integration

This integrate some initial changes required for Flatpack and distribution from FlatHub.

Also added some resources that will be used for packaging on Linux.

* Address gdkchan comment
2022-03-04 18:03:16 +01:00
a3dd04deef Implement -p or --profile command line argument (#2947)
* Implement -p or --profile command line argument

* Put command line logic all in Program and reference it elsewhere

* Address feedback
2022-03-02 09:51:37 +01:00
3705c20668 Update LibHac to v0.16.0 (#3159) 2022-02-27 00:52:25 +01:00
7b35ebc64a T32: Implement ALU (shifted register) instructions (#3135)
* T32: Implement ADC, ADD, AND, BIC, CMN, CMP, EOR, MOV, MVN, ORN, ORR, RSB, SBC, SUB, TEQ, TST (shifted register)

* OpCodeTable: Sort T32 list

* Tests: Rename RandomTestCase to PrecomputedThumbTestCase

* T32: Tests for AluRsImm instructions

* fix nit

* fix nit 2
2022-02-22 19:11:28 -03:00
0a24aa6af2 Allow textures to have their data partially mapped (#2629)
* Allow textures to have their data partially mapped

* Explicitly check for invalid memory ranges on the MultiRangeList

* Update GetWritableRegion to also support unmapped ranges
2022-02-22 13:34:16 -03:00
c9c65af59e Perform unscaled 2d engine copy on CPU if source texture isn't in cache. (#3112)
* Initial implementation of fast 2d copy

TODO: Partial copy for mismatching region/size.

* WIP

* Cleanup

* Update Ryujinx.Graphics.Gpu/Engine/Twod/TwodClass.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-02-22 11:21:29 -03:00
dc063eac83 ARMeilleure: Implement single stepping (#3133)
* Decoder: Implement SingleInstruction decoder mode

* Translator: Implement Step

* DecoderMode: Rename Normal to MultipleBlocks
2022-02-22 11:11:42 -03:00
ccf23fc629 gui: Fixes the games icon when there is an update (#3148)
* gui: Fixes the games icon when there is a game update

Currently we just load the version of the update, instead of the whole NACP file. This PR fixes that. A little cleanup is made into the code to avoid duplicate things.
(Closes #3039)

* Fix condition
2022-02-22 14:53:39 +01:00
f1460d5494 A32: Fix BLX and BXWritePC (#3151) 2022-02-22 10:41:56 -03:00
644b497df1 Collapse AsSpan().Slice(..) calls into AsSpan(..) (#3145)
* Collapse AsSpan().Slice(..) calls into AsSpan(..)

Less code and a bit faster

* Collapse an Array.Clear(array, 0, array.Length) call to Array.Clear(array)
2022-02-22 10:32:10 -03:00
fb935fd201 Add dedicated ServerBase for FileSystem services (#3142)
This should prevent filesystem services from blocking other services that don't have their own ServerBase. May improve filesystem related stutters in certain titles.

Improves button advanced cutscenes such as Miqol's Request in Xenoblade: DE when the game is on a network share (used to stutter when voice lines played).

Should probably be tested to make sure no mysterious bugs have been unearthed, and to see if any other filesystem related perf issues are improved.
2022-02-19 15:29:11 +01:00
f2087ca29e PPTC version increment (#3139) 2022-02-17 23:52:42 -03:00
284 changed files with 11975 additions and 3775 deletions

View File

@ -36,15 +36,20 @@ jobs:
return core.error(`No artifacts found`);
}
let body = `Download the artifacts for this pull request:\n`;
let hidden_headless_artifacts = `\n\n <details><summary>GUI-less (SDL2)</summary>\n`;
let hidden_debug_artifacts = `\n\n <details><summary>Only for Developers</summary>\n`;
for (const art of artifacts) {
if(art.name.includes('Debug')) {
hidden_debug_artifacts += `\n* [${art.name}](https://nightly.link/${owner}/${repo}/actions/artifacts/${art.id}.zip)`;
} else if(art.name.includes('headless-sdl2')) {
hidden_headless_artifacts += `\n* [${art.name}](https://nightly.link/${owner}/${repo}/actions/artifacts/${art.id}.zip)`;
} else {
body += `\n* [${art.name}](https://nightly.link/${owner}/${repo}/actions/artifacts/${art.id}.zip)`;
}
}
hidden_headless_artifacts += `\n</details>`;
hidden_debug_artifacts += `\n</details>`;
body += hidden_headless_artifacts;
body += hidden_debug_artifacts;
const {data: comments} = await github.issues.listComments({repo, owner, issue_number});

View File

@ -59,7 +59,7 @@ namespace ARMeilleure.CodeGen.Optimizations
BasicBlock fromPred = from.Predecessors.Count == 1 ? from.Predecessors[0] : null;
// If the block is empty, we can try to append to the predecessor and avoid unnecessary jumps.
if (from.Operations.Count == 0 && fromPred != null)
if (from.Operations.Count == 0 && fromPred != null && fromPred.SuccessorsCount == 1)
{
for (int i = 0; i < fromPred.SuccessorsCount; i++)
{

View File

@ -18,7 +18,7 @@ namespace ARMeilleure.Decoders
// For lower code quality translation, we set a lower limit since we're blocking execution.
private const int MaxInstsPerFunctionLowCq = 500;
public static Block[] Decode(IMemoryManager memory, ulong address, ExecutionMode mode, bool highCq, bool singleBlock)
public static Block[] Decode(IMemoryManager memory, ulong address, ExecutionMode mode, bool highCq, DecoderMode dMode)
{
List<Block> blocks = new List<Block>();
@ -38,7 +38,7 @@ namespace ARMeilleure.Decoders
{
block = new Block(blkAddress);
if ((singleBlock && visited.Count >= 1) || opsCount > instructionLimit || !memory.IsMapped(blkAddress))
if ((dMode != DecoderMode.MultipleBlocks && visited.Count >= 1) || opsCount > instructionLimit || !memory.IsMapped(blkAddress))
{
block.Exit = true;
block.EndAddress = blkAddress;
@ -96,6 +96,12 @@ namespace ARMeilleure.Decoders
}
}
if (dMode == DecoderMode.SingleInstruction)
{
// Only read at most one instruction
limitAddress = currBlock.Address + 1;
}
FillBlock(memory, mode, currBlock, limitAddress);
opsCount += currBlock.OpCodes.Count;
@ -115,7 +121,7 @@ namespace ARMeilleure.Decoders
currBlock.Branch = GetBlock((ulong)op.Immediate);
}
if (!IsUnconditionalBranch(lastOp) || isCall)
if (isCall || !(IsUnconditionalBranch(lastOp) || IsTrap(lastOp)))
{
currBlock.Next = GetBlock(currBlock.EndAddress);
}
@ -143,7 +149,7 @@ namespace ARMeilleure.Decoders
throw new InvalidOperationException($"Decoded a single empty exit block. Entry point = 0x{address:X}.");
}
if (!singleBlock)
if (dMode == DecoderMode.MultipleBlocks)
{
return TailCallRemover.RunPass(address, blocks);
}
@ -257,6 +263,11 @@ namespace ARMeilleure.Decoders
// so we must consider such operations as a branch in potential aswell.
if (opCode is IOpCode32Alu opAlu && opAlu.Rd == RegisterAlias.Aarch32Pc)
{
if (opCode is OpCodeT32)
{
return opCode.Instruction.Name != InstName.Tst && opCode.Instruction.Name != InstName.Teq &&
opCode.Instruction.Name != InstName.Cmp && opCode.Instruction.Name != InstName.Cmn;
}
return true;
}
@ -318,9 +329,13 @@ namespace ARMeilleure.Decoders
}
private static bool IsException(OpCode opCode)
{
return IsTrap(opCode) || opCode.Instruction.Name == InstName.Svc;
}
private static bool IsTrap(OpCode opCode)
{
return opCode.Instruction.Name == InstName.Brk ||
opCode.Instruction.Name == InstName.Svc ||
opCode.Instruction.Name == InstName.Trap ||
opCode.Instruction.Name == InstName.Und;
}

View File

@ -0,0 +1,9 @@
namespace ARMeilleure.Decoders
{
enum DecoderMode
{
MultipleBlocks,
SingleBlock,
SingleInstruction,
}
}

View File

@ -13,11 +13,25 @@ namespace ARMeilleure.Decoders
Cond = (Condition)((uint)opCode >> 28);
}
public bool IsThumb()
{
return this is OpCodeT16 || this is OpCodeT32;
}
public uint GetPc()
{
// Due to backwards compatibility and legacy behavior of ARMv4 CPUs pipeline,
// the PC actually points 2 instructions ahead.
return (uint)Address + (uint)OpCodeSizeInBytes * 2;
if (IsThumb())
{
// PC is ahead by 4 in thumb mode whether or not the current instruction
// is 16 or 32 bit.
return (uint)Address + 4u;
}
else
{
return (uint)Address + 8u;
}
}
}
}

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16BImm11 : OpCode32, IOpCode32BImm
class OpCodeT16BImm11 : OpCodeT16, IOpCode32BImm
{
public long Immediate { get; }

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16BImm8 : OpCode32, IOpCode32BImm
class OpCodeT16BImm8 : OpCodeT16, IOpCode32BImm
{
public long Immediate { get; }

View File

@ -0,0 +1,14 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32 : OpCode32
{
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32(inst, address, opCode);
public OpCodeT32(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Cond = Condition.Al;
OpCodeSizeInBytes = 4;
}
}
}

View File

@ -0,0 +1,20 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32Alu : OpCodeT32, IOpCode32Alu
{
public int Rd { get; }
public int Rn { get; }
public bool? SetFlags { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32Alu(inst, address, opCode);
public OpCodeT32Alu(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 8) & 0xf;
Rn = (opCode >> 16) & 0xf;
SetFlags = ((opCode >> 20) & 1) != 0;
}
}
}

View File

@ -0,0 +1,38 @@
using ARMeilleure.Common;
using System.Runtime.Intrinsics;
namespace ARMeilleure.Decoders
{
class OpCodeT32AluImm : OpCodeT32Alu, IOpCode32AluImm
{
public int Immediate { get; }
public bool IsRotated { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluImm(inst, address, opCode);
private static readonly Vector128<int> _factor = Vector128.Create(1, 0x00010001, 0x01000100, 0x01010101);
public OpCodeT32AluImm(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
int imm8 = (opCode >> 0) & 0xff;
int imm3 = (opCode >> 12) & 7;
int imm1 = (opCode >> 26) & 1;
int imm12 = imm8 | (imm3 << 8) | (imm1 << 11);
if ((imm12 >> 10) == 0)
{
Immediate = imm8 * _factor.GetElement((imm12 >> 8) & 3);
IsRotated = false;
}
else
{
int shift = imm12 >> 7;
Immediate = BitUtils.RotateRight(0x80 | (imm12 & 0x7f), shift, 32);
IsRotated = shift != 0;
}
}
}
}

View File

@ -0,0 +1,20 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32AluRsImm : OpCodeT32Alu, IOpCode32AluRsImm
{
public int Rm { get; }
public int Immediate { get; }
public ShiftType ShiftType { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluRsImm(inst, address, opCode);
public OpCodeT32AluRsImm(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rm = (opCode >> 0) & 0xf;
Immediate = ((opCode >> 6) & 3) | ((opCode >> 10) & 0x1c);
ShiftType = (ShiftType)((opCode >> 4) & 3);
}
}
}

View File

@ -0,0 +1,29 @@
using ARMeilleure.Instructions;
namespace ARMeilleure.Decoders
{
class OpCodeT32BImm20 : OpCodeT32, IOpCode32BImm
{
public long Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32BImm20(inst, address, opCode);
public OpCodeT32BImm20(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
uint pc = GetPc();
int imm11 = (opCode >> 0) & 0x7ff;
int j2 = (opCode >> 11) & 1;
int j1 = (opCode >> 13) & 1;
int imm6 = (opCode >> 16) & 0x3f;
int s = (opCode >> 26) & 1;
int imm32 = imm11 | (imm6 << 11) | (j1 << 17) | (j2 << 18) | (s << 19);
imm32 = (imm32 << 13) >> 12;
Immediate = pc + imm32;
Cond = (Condition)((opCode >> 22) & 0xf);
}
}
}

View File

@ -0,0 +1,35 @@
using ARMeilleure.Instructions;
namespace ARMeilleure.Decoders
{
class OpCodeT32BImm24 : OpCodeT32, IOpCode32BImm
{
public long Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32BImm24(inst, address, opCode);
public OpCodeT32BImm24(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
uint pc = GetPc();
if (inst.Name == InstName.Blx)
{
pc &= ~3u;
}
int imm11 = (opCode >> 0) & 0x7ff;
int j2 = (opCode >> 11) & 1;
int j1 = (opCode >> 13) & 1;
int imm10 = (opCode >> 16) & 0x3ff;
int s = (opCode >> 26) & 1;
int i1 = j1 ^ s ^ 1;
int i2 = j2 ^ s ^ 1;
int imm32 = imm11 | (imm10 << 11) | (i2 << 21) | (i1 << 22) | (s << 23);
imm32 = (imm32 << 9) >> 8;
Immediate = pc + imm32;
}
}
}

View File

@ -1,6 +1,7 @@
using ARMeilleure.Instructions;
using System;
using System.Collections.Generic;
using System.Numerics;
namespace ARMeilleure.Decoders
{
@ -972,8 +973,7 @@ namespace ARMeilleure.Decoders
SetA32("111100111x11<<10xxxx00011xx0xxxx", InstName.Vzip, InstEmit32.Vzip, OpCode32SimdCmpZ.Create);
#endregion
#region "OpCode Table (AArch32, T16/T32)"
// T16
#region "OpCode Table (AArch32, T16)"
SetT16("000<<xxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16ShiftImm.Create);
SetT16("0001100xxxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT16AddSubReg.Create);
SetT16("0001101xxxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT16AddSubReg.Create);
@ -1045,6 +1045,46 @@ namespace ARMeilleure.Decoders
SetT16("11100xxxxxxxxxxx", InstName.B, InstEmit32.B, OpCodeT16BImm11.Create);
#endregion
#region "OpCode Table (AArch32, T32)"
// Base
SetT32("11101011010xxxxx0xxxxxxxxxxxxxxx", InstName.Adc, InstEmit32.Adc, OpCodeT32AluRsImm.Create);
SetT32("11110x01010xxxxx0xxxxxxxxxxxxxxx", InstName.Adc, InstEmit32.Adc, OpCodeT32AluImm.Create);
SetT32("11101011000<xxxx0xxx<<<<xxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT32AluRsImm.Create);
SetT32("11110x01000<xxxx0xxx<<<<xxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT32AluImm.Create);
SetT32("11101010000<xxxx0xxx<<<<xxxxxxxx", InstName.And, InstEmit32.And, OpCodeT32AluRsImm.Create);
SetT32("11110x00000<xxxx0xxx<<<<xxxxxxxx", InstName.And, InstEmit32.And, OpCodeT32AluImm.Create);
SetT32("11110x<<<xxxxxxx10x0xxxxxxxxxxxx", InstName.B, InstEmit32.B, OpCodeT32BImm20.Create);
SetT32("11110xxxxxxxxxxx10x1xxxxxxxxxxxx", InstName.B, InstEmit32.B, OpCodeT32BImm24.Create);
SetT32("11101010001xxxxx0xxxxxxxxxxxxxxx", InstName.Bic, InstEmit32.Bic, OpCodeT32AluRsImm.Create);
SetT32("11110x00001xxxxx0xxxxxxxxxxxxxxx", InstName.Bic, InstEmit32.Bic, OpCodeT32AluImm.Create);
SetT32("11110xxxxxxxxxxx11x1xxxxxxxxxxxx", InstName.Bl, InstEmit32.Bl, OpCodeT32BImm24.Create);
SetT32("11110xxxxxxxxxxx11x0xxxxxxxxxxx0", InstName.Blx, InstEmit32.Blx, OpCodeT32BImm24.Create);
SetT32("111010110001xxxx0xxx1111xxxxxxxx", InstName.Cmn, InstEmit32.Cmn, OpCodeT32AluRsImm.Create);
SetT32("11110x010001xxxx0xxx1111xxxxxxxx", InstName.Cmn, InstEmit32.Cmn, OpCodeT32AluImm.Create);
SetT32("111010111011xxxx0xxx1111xxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT32AluRsImm.Create);
SetT32("11110x011011xxxx0xxx1111xxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT32AluImm.Create);
SetT32("11101010100<xxxx0xxx<<<<xxxxxxxx", InstName.Eor, InstEmit32.Eor, OpCodeT32AluRsImm.Create);
SetT32("11110x00100<xxxx0xxx<<<<xxxxxxxx", InstName.Eor, InstEmit32.Eor, OpCodeT32AluImm.Create);
SetT32("11101010010x11110xxxxxxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT32AluRsImm.Create);
SetT32("11110x00010x11110xxxxxxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT32AluImm.Create);
SetT32("11101010011x11110xxxxxxxxxxxxxxx", InstName.Mvn, InstEmit32.Mvn, OpCodeT32AluRsImm.Create);
SetT32("11110x00011x11110xxxxxxxxxxxxxxx", InstName.Mvn, InstEmit32.Mvn, OpCodeT32AluImm.Create);
SetT32("11101010011x<<<<0xxxxxxxxxxxxxxx", InstName.Orn, InstEmit32.Orn, OpCodeT32AluRsImm.Create);
SetT32("11110x00011x<<<<0xxxxxxxxxxxxxxx", InstName.Orn, InstEmit32.Orn, OpCodeT32AluImm.Create);
SetT32("11101010010x<<<<0xxxxxxxxxxxxxxx", InstName.Orr, InstEmit32.Orr, OpCodeT32AluRsImm.Create);
SetT32("11110x00010x<<<<0xxxxxxxxxxxxxxx", InstName.Orr, InstEmit32.Orr, OpCodeT32AluImm.Create);
SetT32("11101011110xxxxx0xxxxxxxxxxxxxxx", InstName.Rsb, InstEmit32.Rsb, OpCodeT32AluRsImm.Create);
SetT32("11110x01110xxxxx0xxxxxxxxxxxxxxx", InstName.Rsb, InstEmit32.Rsb, OpCodeT32AluImm.Create);
SetT32("11101011011xxxxx0xxxxxxxxxxxxxxx", InstName.Sbc, InstEmit32.Sbc, OpCodeT32AluRsImm.Create);
SetT32("11110x01011xxxxx0xxxxxxxxxxxxxxx", InstName.Sbc, InstEmit32.Sbc, OpCodeT32AluImm.Create);
SetT32("11101011101<xxxx0xxx<<<<xxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT32AluRsImm.Create);
SetT32("11110x01101<xxxx0xxx<<<<xxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT32AluImm.Create);
SetT32("111010101001xxxx0xxx1111xxxxxxxx", InstName.Teq, InstEmit32.Teq, OpCodeT32AluRsImm.Create);
SetT32("11110x001001xxxx0xxx1111xxxxxxxx", InstName.Teq, InstEmit32.Teq, OpCodeT32AluImm.Create);
SetT32("111010100001xxxx0xxx1111xxxxxxxx", InstName.Tst, InstEmit32.Tst, OpCodeT32AluRsImm.Create);
SetT32("11110x000001xxxx0xxx1111xxxxxxxx", InstName.Tst, InstEmit32.Tst, OpCodeT32AluImm.Create);
#endregion
FillFastLookupTable(InstA32FastLookup, AllInstA32, ToFastLookupIndexA);
FillFastLookupTable(InstT32FastLookup, AllInstT32, ToFastLookupIndexT);
FillFastLookupTable(InstA64FastLookup, AllInstA64, ToFastLookupIndexA);
@ -1092,8 +1132,11 @@ namespace ARMeilleure.Decoders
private static void SetT32(string encoding, InstName name, InstEmitter emitter, MakeOp makeOp)
{
encoding = encoding.Substring(16) + encoding.Substring(0, 16);
Set(encoding, AllInstT32, new InstDescriptor(name, emitter), makeOp);
string reversedEncoding = encoding.Substring(16) + encoding.Substring(0, 16);
MakeOp reversedMakeOp =
(InstDescriptor inst, ulong address, int opCode)
=> makeOp(inst, address, (int)BitOperations.RotateRight((uint)opCode, 16));
Set(reversedEncoding, AllInstT32, new InstDescriptor(name, emitter), reversedMakeOp);
}
private static void SetA64(string encoding, InstName name, InstEmitter emitter, MakeOp makeOp)

View File

@ -244,6 +244,23 @@ namespace ARMeilleure.Instructions
EmitAluStore(context, res);
}
public static void Orn(ArmEmitterContext context)
{
IOpCode32Alu op = (IOpCode32Alu)context.CurrOp;
Operand n = GetAluN(context);
Operand m = GetAluM(context);
Operand res = context.BitwiseOr(n, context.BitwiseNot(m));
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
EmitAluStore(context, res);
}
public static void Pkh(ArmEmitterContext context)
{
OpCode32AluRsImm op = (OpCode32AluRsImm)context.CurrOp;

View File

@ -128,7 +128,7 @@ namespace ARMeilleure.Instructions
{
Debug.Assert(value.Type == OperandType.I32);
if (IsThumb(context.CurrOp))
if (((OpCode32)context.CurrOp).IsThumb())
{
bool isReturn = IsA32Return(context);
if (!isReturn)
@ -197,7 +197,7 @@ namespace ARMeilleure.Instructions
// ARM32.
case IOpCode32AluImm op:
{
if (ShouldSetFlags(context) && op.IsRotated)
if (ShouldSetFlags(context) && op.IsRotated && setCarry)
{
SetFlag(context, PState.CFlag, Const((uint)op.Immediate >> 31));
}

View File

@ -9,18 +9,25 @@ namespace ARMeilleure.Instructions
{
public static void Brk(ArmEmitterContext context)
{
EmitExceptionCall(context, nameof(NativeInterface.Break));
OpCodeException op = (OpCodeException)context.CurrOp;
string name = nameof(NativeInterface.Break);
context.StoreToContext();
context.Call(typeof(NativeInterface).GetMethod(name), Const(op.Address), Const(op.Id));
context.LoadFromContext();
context.Return(Const(op.Address));
}
public static void Svc(ArmEmitterContext context)
{
EmitExceptionCall(context, nameof(NativeInterface.SupervisorCall));
}
private static void EmitExceptionCall(ArmEmitterContext context, string name)
{
OpCodeException op = (OpCodeException)context.CurrOp;
string name = nameof(NativeInterface.SupervisorCall);
context.StoreToContext();
context.Call(typeof(NativeInterface).GetMethod(name), Const(op.Address), Const(op.Id));
@ -41,6 +48,8 @@ namespace ARMeilleure.Instructions
context.Call(typeof(NativeInterface).GetMethod(name), Const(op.Address), Const(op.RawOpCode));
context.LoadFromContext();
context.Return(Const(op.Address));
}
}
}

View File

@ -9,19 +9,11 @@ namespace ARMeilleure.Instructions
static partial class InstEmit32
{
public static void Svc(ArmEmitterContext context)
{
EmitExceptionCall(context, nameof(NativeInterface.SupervisorCall));
}
public static void Trap(ArmEmitterContext context)
{
EmitExceptionCall(context, nameof(NativeInterface.Break));
}
private static void EmitExceptionCall(ArmEmitterContext context, string name)
{
IOpCode32Exception op = (IOpCode32Exception)context.CurrOp;
string name = nameof(NativeInterface.SupervisorCall);
context.StoreToContext();
context.Call(typeof(NativeInterface).GetMethod(name), Const(((IOpCode)op).Address), Const(op.Id));
@ -30,5 +22,20 @@ namespace ARMeilleure.Instructions
Translator.EmitSynchronization(context);
}
public static void Trap(ArmEmitterContext context)
{
IOpCode32Exception op = (IOpCode32Exception)context.CurrOp;
string name = nameof(NativeInterface.Break);
context.StoreToContext();
context.Call(typeof(NativeInterface).GetMethod(name), Const(((IOpCode)op).Address), Const(op.Id));
context.LoadFromContext();
context.Return(Const(context.CurrOp.Address));
}
}
}

View File

@ -34,7 +34,7 @@ namespace ARMeilleure.Instructions
uint pc = op.GetPc();
bool isThumb = IsThumb(context.CurrOp);
bool isThumb = ((OpCode32)context.CurrOp).IsThumb();
uint currentPc = isThumb
? pc | 1
@ -61,7 +61,7 @@ namespace ARMeilleure.Instructions
Operand addr = context.Copy(GetIntA32(context, op.Rm));
Operand bitOne = context.BitwiseAnd(addr, Const(1));
bool isThumb = IsThumb(context.CurrOp);
bool isThumb = ((OpCode32)context.CurrOp).IsThumb();
uint currentPc = isThumb
? (pc - 2) | 1
@ -71,7 +71,7 @@ namespace ARMeilleure.Instructions
SetFlag(context, PState.TFlag, bitOne);
EmitVirtualCall(context, addr);
EmitBxWritePc(context, addr);
}
public static void Bx(ArmEmitterContext context)

View File

@ -10,11 +10,6 @@ namespace ARMeilleure.Instructions
{
static class InstEmitHelper
{
public static bool IsThumb(OpCode op)
{
return op is OpCodeT16;
}
public static Operand GetExtendedM(ArmEmitterContext context, int rm, IntType type)
{
Operand value = GetIntOrZR(context, rm);
@ -186,7 +181,7 @@ namespace ARMeilleure.Instructions
SetFlag(context, PState.TFlag, mode);
Operand addr = context.ConditionalSelect(mode, pc, context.BitwiseAnd(pc, Const(~3)));
Operand addr = context.ConditionalSelect(mode, context.BitwiseAnd(pc, Const(~1)), context.BitwiseAnd(pc, Const(~3)));
InstEmitFlowHelper.EmitVirtualJump(context, addr, isReturn);
}

View File

@ -130,11 +130,6 @@ namespace ARMeilleure.Instructions
bool ordered = (accType & AccessType.Ordered) != 0;
bool exclusive = (accType & AccessType.Exclusive) != 0;
if (ordered)
{
EmitBarrier(context);
}
Operand address = context.Copy(GetIntOrSP(context, op.Rn));
Operand t = GetIntOrZR(context, op.Rt);
@ -163,6 +158,11 @@ namespace ARMeilleure.Instructions
{
EmitStoreExclusive(context, address, t, exclusive, op.Size, op.Rs, a32: false);
}
if (ordered)
{
EmitBarrier(context);
}
}
private static void EmitBarrier(ArmEmitterContext context)

View File

@ -146,13 +146,13 @@ namespace ARMeilleure.Instructions
var exclusive = (accType & AccessType.Exclusive) != 0;
var ordered = (accType & AccessType.Ordered) != 0;
if ((accType & AccessType.Load) != 0)
{
if (ordered)
{
EmitBarrier(context);
}
if ((accType & AccessType.Load) != 0)
{
if (size == DWordSizeLog2)
{
// Keep loads atomic - make the call to get the whole region and then decompose it into parts
@ -219,6 +219,11 @@ namespace ARMeilleure.Instructions
Operand value = context.ZeroExtend32(OperandType.I64, GetIntA32(context, op.Rt));
EmitStoreExclusive(context, address, value, exclusive, size, op.Rd, a32: true);
}
if (ordered)
{
EmitBarrier(context);
}
}
}

View File

@ -191,7 +191,7 @@ namespace ARMeilleure.Signal
// Is the fault address within this tracked region?
Operand inRange = context.BitwiseAnd(
context.ICompare(faultAddress, rangeAddress, Comparison.GreaterOrEqualUI),
context.ICompare(faultAddress, rangeEndAddress, Comparison.Less)
context.ICompare(faultAddress, rangeEndAddress, Comparison.LessUI)
);
// Only call tracking if in range.

View File

@ -43,6 +43,12 @@ namespace ARMeilleure.State
public long TpidrEl0 { get; set; }
public long Tpidr { get; set; }
public uint Pstate
{
get => _nativeContext.GetPstate();
set => _nativeContext.SetPstate(value);
}
public FPCR Fpcr { get; set; }
public FPSR Fpsr { get; set; }
public FPCR StandardFpcrValue => (Fpcr & (FPCR.Ahp)) | FPCR.Dn | FPCR.Fz;

View File

@ -95,6 +95,25 @@ namespace ARMeilleure.State
GetStorage().Flags[(int)flag] = value ? 1u : 0u;
}
public unsafe uint GetPstate()
{
uint value = 0;
for (int flag = 0; flag < RegisterConsts.FlagsCount; flag++)
{
value |= GetStorage().Flags[flag] != 0 ? 1u << flag : 0u;
}
return value;
}
public unsafe void SetPstate(uint value)
{
for (int flag = 0; flag < RegisterConsts.FlagsCount; flag++)
{
uint bit = 1u << flag;
GetStorage().Flags[flag] = (value & bit) == bit ? 1u : 0u;
}
}
public unsafe bool GetFPStateFlag(FPState flag)
{
if ((uint)flag >= RegisterConsts.FpFlagsCount)

View File

@ -27,7 +27,7 @@ namespace ARMeilleure.Translation.PTC
private const string OuterHeaderMagicString = "PTCohd\0\0";
private const string InnerHeaderMagicString = "PTCihd\0\0";
private const uint InternalVersion = 3061; //! To be incremented manually for each change to the ARMeilleure project.
private const uint InternalVersion = 3267; //! To be incremented manually for each change to the ARMeilleure project.
private const string ActualDir = "0";
private const string BackupDir = "1";

View File

@ -113,7 +113,7 @@ namespace ARMeilleure.Translation
}
}
Array.Clear(localDefs, 0, localDefs.Length);
Array.Clear(localDefs);
}
// Second pass, rename variables with definitions on different blocks.

View File

@ -209,6 +209,17 @@ namespace ARMeilleure.Translation
return nextAddr;
}
public ulong Step(State.ExecutionContext context, ulong address)
{
TranslatedFunction func = Translate(address, context.ExecutionMode, highCq: false, singleStep: true);
address = func.Execute(context);
EnqueueForDeletion(address, func);
return address;
}
internal TranslatedFunction GetOrTranslate(ulong address, ExecutionMode mode)
{
if (!Functions.TryGetValue(address, out TranslatedFunction func))
@ -242,7 +253,7 @@ namespace ARMeilleure.Translation
}
}
internal TranslatedFunction Translate(ulong address, ExecutionMode mode, bool highCq)
internal TranslatedFunction Translate(ulong address, ExecutionMode mode, bool highCq, bool singleStep = false)
{
var context = new ArmEmitterContext(
Memory,
@ -255,7 +266,7 @@ namespace ARMeilleure.Translation
Logger.StartPass(PassName.Decoding);
Block[] blocks = Decoder.Decode(Memory, address, mode, highCq, singleBlock: false);
Block[] blocks = Decoder.Decode(Memory, address, mode, highCq, singleStep ? DecoderMode.SingleInstruction : DecoderMode.MultipleBlocks);
Logger.EndPass(PassName.Decoding);
@ -285,14 +296,14 @@ namespace ARMeilleure.Translation
var options = highCq ? CompilerOptions.HighCq : CompilerOptions.None;
if (context.HasPtc)
if (context.HasPtc && !singleStep)
{
options |= CompilerOptions.Relocatable;
}
CompiledFunction compiledFunc = Compiler.Compile(cfg, argTypes, retType, options);
if (context.HasPtc)
if (context.HasPtc && !singleStep)
{
Hash128 hash = Ptc.ComputeHash(Memory, address, funcSize);

View File

@ -18,8 +18,10 @@
using Ryujinx.Audio.Renderer.Dsp.State;
using Ryujinx.Audio.Renderer.Parameter.Effect;
using Ryujinx.Audio.Renderer.Server.Effect;
using Ryujinx.Audio.Renderer.Utils.Math;
using System;
using System.Diagnostics;
using System.Numerics;
using System.Runtime.CompilerServices;
namespace Ryujinx.Audio.Renderer.Dsp.Command
@ -45,7 +47,7 @@ namespace Ryujinx.Audio.Renderer.Dsp.Command
private const int FixedPointPrecision = 14;
public DelayCommand(uint bufferOffset, DelayParameter parameter, Memory<DelayState> state, bool isEnabled, ulong workBuffer, int nodeId)
public DelayCommand(uint bufferOffset, DelayParameter parameter, Memory<DelayState> state, bool isEnabled, ulong workBuffer, int nodeId, bool newEffectChannelMappingSupported)
{
Enabled = true;
NodeId = nodeId;
@ -63,9 +65,14 @@ namespace Ryujinx.Audio.Renderer.Dsp.Command
InputBufferIndices[i] = (ushort)(bufferOffset + Parameter.Input[i]);
OutputBufferIndices[i] = (ushort)(bufferOffset + Parameter.Output[i]);
}
// NOTE: We do the opposite as Nintendo here for now to restore previous behaviour
// TODO: Update delay processing and remove this to use RemapLegacyChannelEffectMappingToChannelResourceMapping.
DataSourceHelper.RemapChannelResourceMappingToLegacy(newEffectChannelMappingSupported, InputBufferIndices);
DataSourceHelper.RemapChannelResourceMappingToLegacy(newEffectChannelMappingSupported, OutputBufferIndices);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
[MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
private unsafe void ProcessDelayMono(ref DelayState state, float* outputBuffer, float* inputBuffer, uint sampleCount)
{
float feedbackGain = FixedPointHelper.ToFloat(Parameter.FeedbackGain, FixedPointPrecision);
@ -78,133 +85,148 @@ namespace Ryujinx.Audio.Renderer.Dsp.Command
float input = inputBuffer[i] * 64;
float delayLineValue = state.DelayLines[0].Read();
float lowPassResult = (input * inGain + delayLineValue * feedbackGain) * state.LowPassBaseGain + state.LowPassZ[0] * state.LowPassFeedbackGain;
float temp = input * inGain + delayLineValue * feedbackGain;
state.LowPassZ[0] = lowPassResult;
state.DelayLines[0].Update(lowPassResult);
state.UpdateLowPassFilter(ref temp, 1);
outputBuffer[i] = (input * dryGain + delayLineValue * outGain) / 64;
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
[MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
private unsafe void ProcessDelayStereo(ref DelayState state, Span<IntPtr> outputBuffers, ReadOnlySpan<IntPtr> inputBuffers, uint sampleCount)
{
const ushort channelCount = 2;
Span<float> channelInput = stackalloc float[channelCount];
Span<float> delayLineValues = stackalloc float[channelCount];
Span<float> temp = stackalloc float[channelCount];
float delayFeedbackBaseGain = state.DelayFeedbackBaseGain;
float delayFeedbackCrossGain = state.DelayFeedbackCrossGain;
float inGain = FixedPointHelper.ToFloat(Parameter.InGain, FixedPointPrecision);
float dryGain = FixedPointHelper.ToFloat(Parameter.DryGain, FixedPointPrecision);
float outGain = FixedPointHelper.ToFloat(Parameter.OutGain, FixedPointPrecision);
Matrix2x2 delayFeedback = new Matrix2x2(delayFeedbackBaseGain , delayFeedbackCrossGain,
delayFeedbackCrossGain, delayFeedbackBaseGain);
for (int i = 0; i < sampleCount; i++)
{
for (int j = 0; j < channelCount; j++)
Vector2 channelInput = new Vector2
{
channelInput[j] = *((float*)inputBuffers[j] + i) * 64;
delayLineValues[j] = state.DelayLines[j].Read();
}
X = *((float*)inputBuffers[0] + i) * 64,
Y = *((float*)inputBuffers[1] + i) * 64,
};
temp[0] = channelInput[0] * inGain + delayLineValues[1] * delayFeedbackCrossGain + delayLineValues[0] * delayFeedbackBaseGain;
temp[1] = channelInput[1] * inGain + delayLineValues[0] * delayFeedbackCrossGain + delayLineValues[1] * delayFeedbackBaseGain;
for (int j = 0; j < channelCount; j++)
Vector2 delayLineValues = new Vector2()
{
float lowPassResult = state.LowPassFeedbackGain * state.LowPassZ[j] + temp[j] * state.LowPassBaseGain;
X = state.DelayLines[0].Read(),
Y = state.DelayLines[1].Read(),
};
state.LowPassZ[j] = lowPassResult;
state.DelayLines[j].Update(lowPassResult);
Vector2 temp = MatrixHelper.Transform(ref channelInput, ref delayFeedback) + channelInput * inGain;
*((float*)outputBuffers[j] + i) = (channelInput[j] * dryGain + delayLineValues[j] * outGain) / 64;
}
state.UpdateLowPassFilter(ref Unsafe.As<Vector2, float>(ref temp), channelCount);
*((float*)outputBuffers[0] + i) = (channelInput.X * dryGain + delayLineValues.X * outGain) / 64;
*((float*)outputBuffers[1] + i) = (channelInput.Y * dryGain + delayLineValues.Y * outGain) / 64;
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
[MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
private unsafe void ProcessDelayQuadraphonic(ref DelayState state, Span<IntPtr> outputBuffers, ReadOnlySpan<IntPtr> inputBuffers, uint sampleCount)
{
const ushort channelCount = 4;
Span<float> channelInput = stackalloc float[channelCount];
Span<float> delayLineValues = stackalloc float[channelCount];
Span<float> temp = stackalloc float[channelCount];
float delayFeedbackBaseGain = state.DelayFeedbackBaseGain;
float delayFeedbackCrossGain = state.DelayFeedbackCrossGain;
float inGain = FixedPointHelper.ToFloat(Parameter.InGain, FixedPointPrecision);
float dryGain = FixedPointHelper.ToFloat(Parameter.DryGain, FixedPointPrecision);
float outGain = FixedPointHelper.ToFloat(Parameter.OutGain, FixedPointPrecision);
Matrix4x4 delayFeedback = new Matrix4x4(delayFeedbackBaseGain , delayFeedbackCrossGain, delayFeedbackCrossGain, 0.0f,
delayFeedbackCrossGain, delayFeedbackBaseGain , 0.0f , delayFeedbackCrossGain,
delayFeedbackCrossGain, 0.0f , delayFeedbackBaseGain , delayFeedbackCrossGain,
0.0f , delayFeedbackCrossGain, delayFeedbackCrossGain, delayFeedbackBaseGain);
for (int i = 0; i < sampleCount; i++)
{
for (int j = 0; j < channelCount; j++)
Vector4 channelInput = new Vector4
{
channelInput[j] = *((float*)inputBuffers[j] + i) * 64;
delayLineValues[j] = state.DelayLines[j].Read();
}
X = *((float*)inputBuffers[0] + i) * 64,
Y = *((float*)inputBuffers[1] + i) * 64,
Z = *((float*)inputBuffers[2] + i) * 64,
W = *((float*)inputBuffers[3] + i) * 64
};
temp[0] = channelInput[0] * inGain + (delayLineValues[2] + delayLineValues[1]) * delayFeedbackCrossGain + delayLineValues[0] * delayFeedbackBaseGain;
temp[1] = channelInput[1] * inGain + (delayLineValues[0] + delayLineValues[3]) * delayFeedbackCrossGain + delayLineValues[1] * delayFeedbackBaseGain;
temp[2] = channelInput[2] * inGain + (delayLineValues[3] + delayLineValues[0]) * delayFeedbackCrossGain + delayLineValues[2] * delayFeedbackBaseGain;
temp[3] = channelInput[3] * inGain + (delayLineValues[1] + delayLineValues[2]) * delayFeedbackCrossGain + delayLineValues[3] * delayFeedbackBaseGain;
for (int j = 0; j < channelCount; j++)
Vector4 delayLineValues = new Vector4()
{
float lowPassResult = state.LowPassFeedbackGain * state.LowPassZ[j] + temp[j] * state.LowPassBaseGain;
X = state.DelayLines[0].Read(),
Y = state.DelayLines[1].Read(),
Z = state.DelayLines[2].Read(),
W = state.DelayLines[3].Read()
};
state.LowPassZ[j] = lowPassResult;
state.DelayLines[j].Update(lowPassResult);
Vector4 temp = MatrixHelper.Transform(ref channelInput, ref delayFeedback) + channelInput * inGain;
*((float*)outputBuffers[j] + i) = (channelInput[j] * dryGain + delayLineValues[j] * outGain) / 64;
}
state.UpdateLowPassFilter(ref Unsafe.As<Vector4, float>(ref temp), channelCount);
*((float*)outputBuffers[0] + i) = (channelInput.X * dryGain + delayLineValues.X * outGain) / 64;
*((float*)outputBuffers[1] + i) = (channelInput.Y * dryGain + delayLineValues.Y * outGain) / 64;
*((float*)outputBuffers[2] + i) = (channelInput.Z * dryGain + delayLineValues.Z * outGain) / 64;
*((float*)outputBuffers[3] + i) = (channelInput.W * dryGain + delayLineValues.W * outGain) / 64;
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
[MethodImpl(MethodImplOptions.AggressiveInlining | MethodImplOptions.AggressiveOptimization)]
private unsafe void ProcessDelaySurround(ref DelayState state, Span<IntPtr> outputBuffers, ReadOnlySpan<IntPtr> inputBuffers, uint sampleCount)
{
const ushort channelCount = 6;
Span<float> channelInput = stackalloc float[channelCount];
Span<float> delayLineValues = stackalloc float[channelCount];
Span<float> temp = stackalloc float[channelCount];
float feedbackGain = FixedPointHelper.ToFloat(Parameter.FeedbackGain, FixedPointPrecision);
float delayFeedbackBaseGain = state.DelayFeedbackBaseGain;
float delayFeedbackCrossGain = state.DelayFeedbackCrossGain;
float inGain = FixedPointHelper.ToFloat(Parameter.InGain, FixedPointPrecision);
float dryGain = FixedPointHelper.ToFloat(Parameter.DryGain, FixedPointPrecision);
float outGain = FixedPointHelper.ToFloat(Parameter.OutGain, FixedPointPrecision);
Matrix6x6 delayFeedback = new Matrix6x6(delayFeedbackBaseGain , 0.0f , 0.0f , 0.0f , delayFeedbackCrossGain, delayFeedbackCrossGain,
0.0f , delayFeedbackBaseGain , 0.0f , delayFeedbackCrossGain, delayFeedbackCrossGain, 0.0f ,
delayFeedbackCrossGain, 0.0f , delayFeedbackBaseGain , delayFeedbackCrossGain, 0.0f , 0.0f ,
0.0f , delayFeedbackCrossGain, delayFeedbackCrossGain, delayFeedbackBaseGain , 0.0f , 0.0f ,
delayFeedbackCrossGain, delayFeedbackCrossGain, 0.0f , 0.0f , delayFeedbackBaseGain , 0.0f ,
0.0f , 0.0f , 0.0f , 0.0f , 0.0f , feedbackGain);
for (int i = 0; i < sampleCount; i++)
{
for (int j = 0; j < channelCount; j++)
Vector6 channelInput = new Vector6
{
channelInput[j] = *((float*)inputBuffers[j] + i) * 64;
delayLineValues[j] = state.DelayLines[j].Read();
}
X = *((float*)inputBuffers[0] + i) * 64,
Y = *((float*)inputBuffers[1] + i) * 64,
Z = *((float*)inputBuffers[2] + i) * 64,
W = *((float*)inputBuffers[3] + i) * 64,
V = *((float*)inputBuffers[4] + i) * 64,
U = *((float*)inputBuffers[5] + i) * 64
};
temp[0] = channelInput[0] * inGain + (delayLineValues[2] + delayLineValues[4]) * delayFeedbackCrossGain + delayLineValues[0] * delayFeedbackBaseGain;
temp[1] = channelInput[1] * inGain + (delayLineValues[4] + delayLineValues[3]) * delayFeedbackCrossGain + delayLineValues[1] * delayFeedbackBaseGain;
temp[2] = channelInput[2] * inGain + (delayLineValues[3] + delayLineValues[0]) * delayFeedbackCrossGain + delayLineValues[2] * delayFeedbackBaseGain;
temp[3] = channelInput[3] * inGain + (delayLineValues[1] + delayLineValues[2]) * delayFeedbackCrossGain + delayLineValues[3] * delayFeedbackBaseGain;
temp[4] = channelInput[4] * inGain + (delayLineValues[0] + delayLineValues[1]) * delayFeedbackCrossGain + delayLineValues[4] * delayFeedbackBaseGain;
temp[5] = channelInput[5] * inGain + delayLineValues[5] * delayFeedbackBaseGain;
for (int j = 0; j < channelCount; j++)
Vector6 delayLineValues = new Vector6
{
float lowPassResult = state.LowPassFeedbackGain * state.LowPassZ[j] + temp[j] * state.LowPassBaseGain;
X = state.DelayLines[0].Read(),
Y = state.DelayLines[1].Read(),
Z = state.DelayLines[2].Read(),
W = state.DelayLines[3].Read(),
V = state.DelayLines[4].Read(),
U = state.DelayLines[5].Read()
};
state.LowPassZ[j] = lowPassResult;
state.DelayLines[j].Update(lowPassResult);
Vector6 temp = MatrixHelper.Transform(ref channelInput, ref delayFeedback) + channelInput * inGain;
*((float*)outputBuffers[j] + i) = (channelInput[j] * dryGain + delayLineValues[j] * outGain) / 64;
}
state.UpdateLowPassFilter(ref Unsafe.As<Vector6, float>(ref temp), channelCount);
*((float*)outputBuffers[0] + i) = (channelInput.X * dryGain + delayLineValues.X * outGain) / 64;
*((float*)outputBuffers[1] + i) = (channelInput.Y * dryGain + delayLineValues.Y * outGain) / 64;
*((float*)outputBuffers[2] + i) = (channelInput.Z * dryGain + delayLineValues.Z * outGain) / 64;
*((float*)outputBuffers[3] + i) = (channelInput.W * dryGain + delayLineValues.W * outGain) / 64;
*((float*)outputBuffers[4] + i) = (channelInput.V * dryGain + delayLineValues.V * outGain) / 64;
*((float*)outputBuffers[5] + i) = (channelInput.U * dryGain + delayLineValues.U * outGain) / 64;
}
}

View File

@ -63,7 +63,7 @@ namespace Ryujinx.Audio.Renderer.Dsp.Command
private Reverb3dParameter _parameter;
public Reverb3dCommand(uint bufferOffset, Reverb3dParameter parameter, Memory<Reverb3dState> state, bool isEnabled, ulong workBuffer, int nodeId)
public Reverb3dCommand(uint bufferOffset, Reverb3dParameter parameter, Memory<Reverb3dState> state, bool isEnabled, ulong workBuffer, int nodeId, bool newEffectChannelMappingSupported)
{
Enabled = true;
IsEffectEnabled = isEnabled;
@ -80,6 +80,11 @@ namespace Ryujinx.Audio.Renderer.Dsp.Command
InputBufferIndices[i] = (ushort)(bufferOffset + Parameter.Input[i]);
OutputBufferIndices[i] = (ushort)(bufferOffset + Parameter.Output[i]);
}
// NOTE: We do the opposite as Nintendo here for now to restore previous behaviour
// TODO: Update reverb 3d processing and remove this to use RemapLegacyChannelEffectMappingToChannelResourceMapping.
DataSourceHelper.RemapChannelResourceMappingToLegacy(newEffectChannelMappingSupported, InputBufferIndices);
DataSourceHelper.RemapChannelResourceMappingToLegacy(newEffectChannelMappingSupported, OutputBufferIndices);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
@ -194,7 +199,7 @@ namespace Ryujinx.Audio.Renderer.Dsp.Command
if (isSurround)
{
*((float*)outputBuffers[4] + sampleIndex) += (outputValues[4] + state.BackLeftDelayLine.Update((values[2] - values[3]) * 0.5f) + channelInput[4] * state.DryGain);
*((float*)outputBuffers[4] + sampleIndex) += (outputValues[4] + state.FrontCenterDelayLine.Update((values[2] - values[3]) * 0.5f) + channelInput[4] * state.DryGain);
}
}
}

View File

@ -66,7 +66,7 @@ namespace Ryujinx.Audio.Renderer.Dsp.Command
private const int FixedPointPrecision = 14;
public ReverbCommand(uint bufferOffset, ReverbParameter parameter, Memory<ReverbState> state, bool isEnabled, ulong workBuffer, int nodeId, bool isLongSizePreDelaySupported)
public ReverbCommand(uint bufferOffset, ReverbParameter parameter, Memory<ReverbState> state, bool isEnabled, ulong workBuffer, int nodeId, bool isLongSizePreDelaySupported, bool newEffectChannelMappingSupported)
{
Enabled = true;
IsEffectEnabled = isEnabled;
@ -85,6 +85,11 @@ namespace Ryujinx.Audio.Renderer.Dsp.Command
}
IsLongSizePreDelaySupported = isLongSizePreDelaySupported;
// NOTE: We do the opposite as Nintendo here for now to restore previous behaviour
// TODO: Update reverb processing and remove this to use RemapLegacyChannelEffectMappingToChannelResourceMapping.
DataSourceHelper.RemapChannelResourceMappingToLegacy(newEffectChannelMappingSupported, InputBufferIndices);
DataSourceHelper.RemapChannelResourceMappingToLegacy(newEffectChannelMappingSupported, OutputBufferIndices);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
@ -214,7 +219,7 @@ namespace Ryujinx.Audio.Renderer.Dsp.Command
if (isSurround)
{
outputValues[4] += state.BackLeftDelayLine.Update((feedbackOutputValues[2] - feedbackOutputValues[3]) * 0.5f);
outputValues[4] += state.FrontCenterDelayLine.Update((feedbackOutputValues[2] - feedbackOutputValues[3]) * 0.5f);
}
for (int channelIndex = 0; channelIndex < Parameter.ChannelCount; channelIndex++)

View File

@ -445,5 +445,39 @@ namespace Ryujinx.Audio.Renderer.Dsp
ToIntSlow(output, input, sampleCount);
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static void RemapLegacyChannelEffectMappingToChannelResourceMapping(bool isSupported, Span<ushort> bufferIndices)
{
if (!isSupported && bufferIndices.Length == 6)
{
ushort backLeft = bufferIndices[2];
ushort backRight = bufferIndices[3];
ushort frontCenter = bufferIndices[4];
ushort lowFrequency = bufferIndices[5];
bufferIndices[2] = frontCenter;
bufferIndices[3] = lowFrequency;
bufferIndices[4] = backLeft;
bufferIndices[5] = backRight;
}
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static void RemapChannelResourceMappingToLegacy(bool isSupported, Span<ushort> bufferIndices)
{
if (isSupported && bufferIndices.Length == 6)
{
ushort frontCenter = bufferIndices[2];
ushort lowFrequency = bufferIndices[3];
ushort backLeft = bufferIndices[4];
ushort backRight = bufferIndices[5];
bufferIndices[2] = backLeft;
bufferIndices[3] = backRight;
bufferIndices[4] = frontCenter;
bufferIndices[5] = lowFrequency;
}
}
}
}

View File

@ -17,6 +17,7 @@
using Ryujinx.Audio.Renderer.Dsp.Effect;
using Ryujinx.Audio.Renderer.Parameter.Effect;
using System.Runtime.CompilerServices;
namespace Ryujinx.Audio.Renderer.Dsp.State
{
@ -43,7 +44,6 @@ namespace Ryujinx.Audio.Renderer.Dsp.State
{
DelayLines[i] = new DelayLine(sampleRate, parameter.DelayTimeMax);
DelayLines[i].SetDelay(parameter.DelayTime);
LowPassZ[0] = 0;
}
UpdateParameter(ref parameter);
@ -69,5 +69,16 @@ namespace Ryujinx.Audio.Renderer.Dsp.State
LowPassFeedbackGain = 0.95f * FixedPointHelper.ToFloat(parameter.LowPassAmount, FixedPointPrecision);
LowPassBaseGain = 1.0f - LowPassFeedbackGain;
}
public void UpdateLowPassFilter(ref float tempRawRef, uint channelCount)
{
for (int i = 0; i < channelCount; i++)
{
float lowPassResult = LowPassFeedbackGain * LowPassZ[i] + Unsafe.Add(ref tempRawRef, i) * LowPassBaseGain;
LowPassZ[i] = lowPassResult;
DelayLines[i].Update(lowPassResult);
}
}
}
}

View File

@ -34,7 +34,7 @@ namespace Ryujinx.Audio.Renderer.Dsp.State
public DecayDelay[] DecayDelays1 { get; }
public DecayDelay[] DecayDelays2 { get; }
public IDelayLine PreDelayLine { get; }
public IDelayLine BackLeftDelayLine { get; }
public IDelayLine FrontCenterDelayLine { get; }
public float DryGain { get; private set; }
public uint[] EarlyDelayTime { get; private set; }
public float PreviousPreDelayValue { get; set; }
@ -69,7 +69,7 @@ namespace Ryujinx.Audio.Renderer.Dsp.State
}
PreDelayLine = new DelayLine3d(sampleRate, 400);
BackLeftDelayLine = new DelayLine3d(sampleRate, 5);
FrontCenterDelayLine = new DelayLine3d(sampleRate, 5);
UpdateParameter(ref parameter);
}

View File

@ -97,7 +97,7 @@ namespace Ryujinx.Audio.Renderer.Dsp.State
public DelayLine[] FdnDelayLines { get; }
public DecayDelay[] DecayDelays { get; }
public DelayLine PreDelayLine { get; }
public DelayLine BackLeftDelayLine { get; }
public DelayLine FrontCenterDelayLine { get; }
public uint[] EarlyDelayTime { get; }
public float[] EarlyGain { get; }
public uint PreDelayLineDelayTime { get; private set; }
@ -112,12 +112,12 @@ namespace Ryujinx.Audio.Renderer.Dsp.State
private ReadOnlySpan<float> GetFdnDelayTimesByLateMode(ReverbLateMode lateMode)
{
return FdnDelayTimes.AsSpan().Slice((int)lateMode * 4, 4);
return FdnDelayTimes.AsSpan((int)lateMode * 4, 4);
}
private ReadOnlySpan<float> GetDecayDelayTimesByLateMode(ReverbLateMode lateMode)
{
return DecayDelayTimes.AsSpan().Slice((int)lateMode * 4, 4);
return DecayDelayTimes.AsSpan((int)lateMode * 4, 4);
}
public ReverbState(ref ReverbParameter parameter, ulong workBuffer, bool isLongSizePreDelaySupported)
@ -149,7 +149,7 @@ namespace Ryujinx.Audio.Renderer.Dsp.State
}
PreDelayLine = new DelayLine(sampleRate, preDelayTimeMax);
BackLeftDelayLine = new DelayLine(sampleRate, 5.0f);
FrontCenterDelayLine = new DelayLine(sampleRate, 5.0f);
UpdateParameter(ref parameter);
}

View File

@ -363,6 +363,9 @@ namespace Ryujinx.Audio.Renderer.Server
case 4:
_commandProcessingTimeEstimator = new CommandProcessingTimeEstimatorVersion4(_sampleCount, _mixBufferCount);
break;
case 5:
_commandProcessingTimeEstimator = new CommandProcessingTimeEstimatorVersion5(_sampleCount, _mixBufferCount);
break;
default:
throw new NotImplementedException($"Unsupported processing time estimator version {_behaviourContext.GetCommandProcessingTimeEstimatorVersion()}.");
}

View File

@ -107,10 +107,18 @@ namespace Ryujinx.Audio.Renderer.Server
/// <remarks>This was added in system update 13.0.0</remarks>
public const int Revision10 = 10 << 24;
/// <summary>
/// REV11:
/// The "legacy" effects (Delay, Reverb and Reverb 3D) were updated to match the standard channel mapping used by the audio renderer.
/// A new version of the command estimator was added to address timing changes caused by the legacy effects changes.
/// </summary>
/// <remarks>This was added in system update 14.0.0</remarks>
public const int Revision11 = 11 << 24;
/// <summary>
/// Last revision supported by the implementation.
/// </summary>
public const int LastRevision = Revision10;
public const int LastRevision = Revision11;
/// <summary>
/// Target revision magic supported by the implementation.
@ -366,12 +374,26 @@ namespace Ryujinx.Audio.Renderer.Server
return CheckFeatureSupported(UserRevision, BaseRevisionMagic + Revision10);
}
/// <summary>
/// Check if the audio renderer should support new channel resource mapping for 5.1 on Delay, Reverb and Reverb 3D effects.
/// </summary>
/// <returns>True if the audio renderer support new channel resource mapping for 5.1.</returns>
public bool IsNewEffectChannelMappingSupported()
{
return CheckFeatureSupported(UserRevision, BaseRevisionMagic + Revision11);
}
/// <summary>
/// Get the version of the <see cref="ICommandProcessingTimeEstimator"/>.
/// </summary>
/// <returns>The version of the <see cref="ICommandProcessingTimeEstimator"/>.</returns>
public int GetCommandProcessingTimeEstimatorVersion()
{
if (CheckFeatureSupported(UserRevision, BaseRevisionMagic + Revision11))
{
return 5;
}
if (CheckFeatureSupported(UserRevision, BaseRevisionMagic + Revision10))
{
return 4;

View File

@ -336,11 +336,12 @@ namespace Ryujinx.Audio.Renderer.Server
/// <param name="workBuffer">The work buffer to use for processing.</param>
/// <param name="nodeId">The node id associated to this command.</param>
/// <param name="isLongSizePreDelaySupported">If set to true, the long size pre-delay is supported.</param>
public void GenerateReverbEffect(uint bufferOffset, ReverbParameter parameter, Memory<ReverbState> state, bool isEnabled, CpuAddress workBuffer, int nodeId, bool isLongSizePreDelaySupported)
/// <param name="newEffectChannelMappingSupported">If set to true, the new effect channel mapping for 5.1 is supported.</param>
public void GenerateReverbEffect(uint bufferOffset, ReverbParameter parameter, Memory<ReverbState> state, bool isEnabled, CpuAddress workBuffer, int nodeId, bool isLongSizePreDelaySupported, bool newEffectChannelMappingSupported)
{
if (parameter.IsChannelCountValid())
{
ReverbCommand command = new ReverbCommand(bufferOffset, parameter, state, isEnabled, workBuffer, nodeId, isLongSizePreDelaySupported);
ReverbCommand command = new ReverbCommand(bufferOffset, parameter, state, isEnabled, workBuffer, nodeId, isLongSizePreDelaySupported, newEffectChannelMappingSupported);
command.EstimatedProcessingTime = _commandProcessingTimeEstimator.Estimate(command);
@ -357,11 +358,12 @@ namespace Ryujinx.Audio.Renderer.Server
/// <param name="isEnabled">Set to true if the effect should be active.</param>
/// <param name="workBuffer">The work buffer to use for processing.</param>
/// <param name="nodeId">The node id associated to this command.</param>
public void GenerateReverb3dEffect(uint bufferOffset, Reverb3dParameter parameter, Memory<Reverb3dState> state, bool isEnabled, CpuAddress workBuffer, int nodeId)
/// <param name="newEffectChannelMappingSupported">If set to true, the new effect channel mapping for 5.1 is supported.</param>
public void GenerateReverb3dEffect(uint bufferOffset, Reverb3dParameter parameter, Memory<Reverb3dState> state, bool isEnabled, CpuAddress workBuffer, int nodeId, bool newEffectChannelMappingSupported)
{
if (parameter.IsChannelCountValid())
{
Reverb3dCommand command = new Reverb3dCommand(bufferOffset, parameter, state, isEnabled, workBuffer, nodeId);
Reverb3dCommand command = new Reverb3dCommand(bufferOffset, parameter, state, isEnabled, workBuffer, nodeId, newEffectChannelMappingSupported);
command.EstimatedProcessingTime = _commandProcessingTimeEstimator.Estimate(command);
@ -379,11 +381,12 @@ namespace Ryujinx.Audio.Renderer.Server
/// <param name="isEnabled">Set to true if the effect should be active.</param>
/// <param name="workBuffer">The work buffer to use for processing.</param>
/// <param name="nodeId">The node id associated to this command.</param>
public void GenerateDelayEffect(uint bufferOffset, DelayParameter parameter, Memory<DelayState> state, bool isEnabled, CpuAddress workBuffer, int nodeId)
/// <param name="newEffectChannelMappingSupported">If set to true, the new effect channel mapping for 5.1 is supported.</param>
public void GenerateDelayEffect(uint bufferOffset, DelayParameter parameter, Memory<DelayState> state, bool isEnabled, CpuAddress workBuffer, int nodeId, bool newEffectChannelMappingSupported)
{
if (parameter.IsChannelCountValid())
{
DelayCommand command = new DelayCommand(bufferOffset, parameter, state, isEnabled, workBuffer, nodeId);
DelayCommand command = new DelayCommand(bufferOffset, parameter, state, isEnabled, workBuffer, nodeId, newEffectChannelMappingSupported);
command.EstimatedProcessingTime = _commandProcessingTimeEstimator.Estimate(command);

View File

@ -483,31 +483,31 @@ namespace Ryujinx.Audio.Renderer.Server
}
}
private void GenerateDelayEffect(uint bufferOffset, DelayEffect effect, int nodeId)
private void GenerateDelayEffect(uint bufferOffset, DelayEffect effect, int nodeId, bool newEffectChannelMappingSupported)
{
Debug.Assert(effect.Type == EffectType.Delay);
ulong workBuffer = effect.GetWorkBuffer(-1);
_commandBuffer.GenerateDelayEffect(bufferOffset, effect.Parameter, effect.State, effect.IsEnabled, workBuffer, nodeId);
_commandBuffer.GenerateDelayEffect(bufferOffset, effect.Parameter, effect.State, effect.IsEnabled, workBuffer, nodeId, newEffectChannelMappingSupported);
}
private void GenerateReverbEffect(uint bufferOffset, ReverbEffect effect, int nodeId, bool isLongSizePreDelaySupported)
private void GenerateReverbEffect(uint bufferOffset, ReverbEffect effect, int nodeId, bool isLongSizePreDelaySupported, bool newEffectChannelMappingSupported)
{
Debug.Assert(effect.Type == EffectType.Reverb);
ulong workBuffer = effect.GetWorkBuffer(-1);
_commandBuffer.GenerateReverbEffect(bufferOffset, effect.Parameter, effect.State, effect.IsEnabled, workBuffer, nodeId, isLongSizePreDelaySupported);
_commandBuffer.GenerateReverbEffect(bufferOffset, effect.Parameter, effect.State, effect.IsEnabled, workBuffer, nodeId, isLongSizePreDelaySupported, newEffectChannelMappingSupported);
}
private void GenerateReverb3dEffect(uint bufferOffset, Reverb3dEffect effect, int nodeId)
private void GenerateReverb3dEffect(uint bufferOffset, Reverb3dEffect effect, int nodeId, bool newEffectChannelMappingSupported)
{
Debug.Assert(effect.Type == EffectType.Reverb3d);
ulong workBuffer = effect.GetWorkBuffer(-1);
_commandBuffer.GenerateReverb3dEffect(bufferOffset, effect.Parameter, effect.State, effect.IsEnabled, workBuffer, nodeId);
_commandBuffer.GenerateReverb3dEffect(bufferOffset, effect.Parameter, effect.State, effect.IsEnabled, workBuffer, nodeId, newEffectChannelMappingSupported);
}
private void GenerateBiquadFilterEffect(uint bufferOffset, BiquadFilterEffect effect, int nodeId)
@ -650,13 +650,13 @@ namespace Ryujinx.Audio.Renderer.Server
GenerateAuxEffect(mix.BufferOffset, (AuxiliaryBufferEffect)effect, nodeId);
break;
case EffectType.Delay:
GenerateDelayEffect(mix.BufferOffset, (DelayEffect)effect, nodeId);
GenerateDelayEffect(mix.BufferOffset, (DelayEffect)effect, nodeId, _rendererContext.BehaviourContext.IsNewEffectChannelMappingSupported());
break;
case EffectType.Reverb:
GenerateReverbEffect(mix.BufferOffset, (ReverbEffect)effect, nodeId, mix.IsLongSizePreDelaySupported);
GenerateReverbEffect(mix.BufferOffset, (ReverbEffect)effect, nodeId, mix.IsLongSizePreDelaySupported, _rendererContext.BehaviourContext.IsNewEffectChannelMappingSupported());
break;
case EffectType.Reverb3d:
GenerateReverb3dEffect(mix.BufferOffset, (Reverb3dEffect)effect, nodeId);
GenerateReverb3dEffect(mix.BufferOffset, (Reverb3dEffect)effect, nodeId, _rendererContext.BehaviourContext.IsNewEffectChannelMappingSupported());
break;
case EffectType.BiquadFilter:
GenerateBiquadFilterEffect(mix.BufferOffset, (BiquadFilterEffect)effect, nodeId);

View File

@ -198,7 +198,7 @@ namespace Ryujinx.Audio.Renderer.Server
return (uint)1853.2f;
}
public uint Estimate(DelayCommand command)
public virtual uint Estimate(DelayCommand command)
{
Debug.Assert(_sampleCount == 160 || _sampleCount == 240);
@ -272,7 +272,7 @@ namespace Ryujinx.Audio.Renderer.Server
}
}
public uint Estimate(ReverbCommand command)
public virtual uint Estimate(ReverbCommand command)
{
Debug.Assert(_sampleCount == 160 || _sampleCount == 240);
@ -346,7 +346,7 @@ namespace Ryujinx.Audio.Renderer.Server
}
}
public uint Estimate(Reverb3dCommand command)
public virtual uint Estimate(Reverb3dCommand command)
{
Debug.Assert(_sampleCount == 160 || _sampleCount == 240);

View File

@ -0,0 +1,253 @@
//
// Copyright (c) 2019-2022 Ryujinx
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU Lesser General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public License
// along with this program. If not, see <https://www.gnu.org/licenses/>.
//
using Ryujinx.Audio.Renderer.Dsp.Command;
using System;
using System.Diagnostics;
namespace Ryujinx.Audio.Renderer.Server
{
/// <summary>
/// <see cref="ICommandProcessingTimeEstimator"/> version 5. (added with REV11)
/// </summary>
public class CommandProcessingTimeEstimatorVersion5 : CommandProcessingTimeEstimatorVersion4
{
public CommandProcessingTimeEstimatorVersion5(uint sampleCount, uint bufferCount) : base(sampleCount, bufferCount) { }
public override uint Estimate(DelayCommand command)
{
Debug.Assert(_sampleCount == 160 || _sampleCount == 240);
if (_sampleCount == 160)
{
if (command.Enabled)
{
switch (command.Parameter.ChannelCount)
{
case 1:
return 8929;
case 2:
return 25501;
case 4:
return 47760;
case 6:
return 82203;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
else
{
switch (command.Parameter.ChannelCount)
{
case 1:
return (uint)1295.20f;
case 2:
return (uint)1213.60f;
case 4:
return (uint)942.03f;
case 6:
return (uint)1001.6f;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
}
if (command.Enabled)
{
switch (command.Parameter.ChannelCount)
{
case 1:
return 11941;
case 2:
return 37197;
case 4:
return 69750;
case 6:
return 12004;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
else
{
switch (command.Parameter.ChannelCount)
{
case 1:
return (uint)997.67f;
case 2:
return (uint)977.63f;
case 4:
return (uint)792.31f;
case 6:
return (uint)875.43f;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
}
public override uint Estimate(ReverbCommand command)
{
Debug.Assert(_sampleCount == 160 || _sampleCount == 240);
if (_sampleCount == 160)
{
if (command.Enabled)
{
switch (command.Parameter.ChannelCount)
{
case 1:
return 81475;
case 2:
return 84975;
case 4:
return 91625;
case 6:
return 95332;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
else
{
switch (command.Parameter.ChannelCount)
{
case 1:
return (uint)536.30f;
case 2:
return (uint)588.80f;
case 4:
return (uint)643.70f;
case 6:
return (uint)706.0f;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
}
if (command.Enabled)
{
switch (command.Parameter.ChannelCount)
{
case 1:
return 120170;
case 2:
return 125260;
case 4:
return 135750;
case 6:
return 141130;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
else
{
switch (command.Parameter.ChannelCount)
{
case 1:
return (uint)617.64f;
case 2:
return (uint)659.54f;
case 4:
return (uint)711.44f;
case 6:
return (uint)778.07f;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
}
public override uint Estimate(Reverb3dCommand command)
{
Debug.Assert(_sampleCount == 160 || _sampleCount == 240);
if (_sampleCount == 160)
{
if (command.Enabled)
{
switch (command.Parameter.ChannelCount)
{
case 1:
return 116750;
case 2:
return 125910;
case 4:
return 146340;
case 6:
return 165810;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
else
{
switch (command.Parameter.ChannelCount)
{
case 1:
return 735;
case 2:
return (uint)766.62f;
case 4:
return (uint)834.07f;
case 6:
return (uint)875.44f;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
}
if (command.Enabled)
{
switch (command.Parameter.ChannelCount)
{
case 1:
return 170290;
case 2:
return 183880;
case 4:
return 214700;
case 6:
return 243850;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
else
{
switch (command.Parameter.ChannelCount)
{
case 1:
return (uint)508.47f;
case 2:
return (uint)582.45f;
case 4:
return (uint)626.42f;
case 6:
return (uint)682.47f;
default:
throw new NotImplementedException($"{command.Parameter.ChannelCount}");
}
}
}
}
}

View File

@ -149,11 +149,21 @@ namespace Ryujinx.Audio.Renderer.Server.Performance
Span<byte> targetSpan = performanceOutput.Slice(nextOffset);
// NOTE: We check for the space for two headers for the final blank header.
int requiredSpace = Unsafe.SizeOf<THeader>() + Unsafe.SizeOf<TEntry>() * inputHeader.GetEntryCount()
+ Unsafe.SizeOf<TEntryDetail>() * inputHeader.GetEntryDetailCount()
+ Unsafe.SizeOf<THeader>();
if (targetSpan.Length < requiredSpace)
{
break;
}
ref THeader outputHeader = ref MemoryMarshal.Cast<byte, THeader>(targetSpan)[0];
nextOffset += Unsafe.SizeOf<THeader>();
Span<TEntry> outputEntries = MemoryMarshal.Cast<byte, TEntry>(targetSpan.Slice(nextOffset));
Span<TEntry> outputEntries = MemoryMarshal.Cast<byte, TEntry>(performanceOutput.Slice(nextOffset));
int totalProcessingTime = 0;
@ -175,7 +185,7 @@ namespace Ryujinx.Audio.Renderer.Server.Performance
}
}
Span<TEntryDetail> outputEntriesDetail = MemoryMarshal.Cast<byte, TEntryDetail>(targetSpan.Slice(nextOffset));
Span<TEntryDetail> outputEntriesDetail = MemoryMarshal.Cast<byte, TEntryDetail>(performanceOutput.Slice(nextOffset));
int effectiveEntryDetailCount = 0;

View File

@ -459,7 +459,7 @@ namespace Ryujinx.Audio.Renderer.Server.Voice
for (int i = 0; i < Constants.VoiceWaveBufferCount; i++)
{
UpdateWaveBuffer(errorInfos.AsSpan().Slice(i * 2, 2), ref WaveBuffers[i], ref parameter.WaveBuffers[i], parameter.SampleFormat, voiceUpdateState.IsWaveBufferValid[i], ref mapper, ref behaviourContext);
UpdateWaveBuffer(errorInfos.AsSpan(i * 2, 2), ref WaveBuffers[i], ref parameter.WaveBuffers[i], parameter.SampleFormat, voiceUpdateState.IsWaveBufferValid[i], ref mapper, ref behaviourContext);
}
}

View File

@ -0,0 +1,71 @@
namespace Ryujinx.Audio.Renderer.Utils.Math
{
record struct Matrix2x2
{
public float M11;
public float M12;
public float M21;
public float M22;
public Matrix2x2(float m11, float m12,
float m21, float m22)
{
M11 = m11;
M12 = m12;
M21 = m21;
M22 = m22;
}
public static Matrix2x2 operator +(Matrix2x2 value1, Matrix2x2 value2)
{
Matrix2x2 m;
m.M11 = value1.M11 + value2.M11;
m.M12 = value1.M12 + value2.M12;
m.M21 = value1.M21 + value2.M21;
m.M22 = value1.M22 + value2.M22;
return m;
}
public static Matrix2x2 operator -(Matrix2x2 value1, float value2)
{
Matrix2x2 m;
m.M11 = value1.M11 - value2;
m.M12 = value1.M12 - value2;
m.M21 = value1.M21 - value2;
m.M22 = value1.M22 - value2;
return m;
}
public static Matrix2x2 operator *(Matrix2x2 value1, float value2)
{
Matrix2x2 m;
m.M11 = value1.M11 * value2;
m.M12 = value1.M12 * value2;
m.M21 = value1.M21 * value2;
m.M22 = value1.M22 * value2;
return m;
}
public static Matrix2x2 operator *(Matrix2x2 value1, Matrix2x2 value2)
{
Matrix2x2 m;
// First row
m.M11 = value1.M11 * value2.M11 + value1.M12 * value2.M21;
m.M12 = value1.M11 * value2.M12 + value1.M12 * value2.M22;
// Second row
m.M21 = value1.M21 * value2.M11 + value1.M22 * value2.M21;
m.M22 = value1.M21 * value2.M12 + value1.M22 * value2.M22;
return m;
}
}
}

View File

@ -0,0 +1,97 @@
namespace Ryujinx.Audio.Renderer.Utils.Math
{
record struct Matrix6x6
{
public float M11;
public float M12;
public float M13;
public float M14;
public float M15;
public float M16;
public float M21;
public float M22;
public float M23;
public float M24;
public float M25;
public float M26;
public float M31;
public float M32;
public float M33;
public float M34;
public float M35;
public float M36;
public float M41;
public float M42;
public float M43;
public float M44;
public float M45;
public float M46;
public float M51;
public float M52;
public float M53;
public float M54;
public float M55;
public float M56;
public float M61;
public float M62;
public float M63;
public float M64;
public float M65;
public float M66;
public Matrix6x6(float m11, float m12, float m13, float m14, float m15, float m16,
float m21, float m22, float m23, float m24, float m25, float m26,
float m31, float m32, float m33, float m34, float m35, float m36,
float m41, float m42, float m43, float m44, float m45, float m46,
float m51, float m52, float m53, float m54, float m55, float m56,
float m61, float m62, float m63, float m64, float m65, float m66)
{
M11 = m11;
M12 = m12;
M13 = m13;
M14 = m14;
M15 = m15;
M16 = m16;
M21 = m21;
M22 = m22;
M23 = m23;
M24 = m24;
M25 = m25;
M26 = m26;
M31 = m31;
M32 = m32;
M33 = m33;
M34 = m34;
M35 = m35;
M36 = m36;
M41 = m41;
M42 = m42;
M43 = m43;
M44 = m44;
M45 = m45;
M46 = m46;
M51 = m51;
M52 = m52;
M53 = m53;
M54 = m54;
M55 = m55;
M56 = m56;
M61 = m61;
M62 = m62;
M63 = m63;
M64 = m64;
M65 = m65;
M66 = m66;
}
}
}

View File

@ -0,0 +1,45 @@
using Ryujinx.Audio.Renderer.Utils.Math;
using System.Numerics;
using System.Runtime.CompilerServices;
namespace Ryujinx.Audio.Renderer.Dsp
{
static class MatrixHelper
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector6 Transform(ref Vector6 value1, ref Matrix6x6 value2)
{
return new Vector6
{
X = value2.M11 * value1.X + value2.M12 * value1.Y + value2.M13 * value1.Z + value2.M14 * value1.W + value2.M15 * value1.V + value2.M16 * value1.U,
Y = value2.M21 * value1.X + value2.M22 * value1.Y + value2.M23 * value1.Z + value2.M24 * value1.W + value2.M25 * value1.V + value2.M26 * value1.U,
Z = value2.M31 * value1.X + value2.M32 * value1.Y + value2.M33 * value1.Z + value2.M34 * value1.W + value2.M35 * value1.V + value2.M36 * value1.U,
W = value2.M41 * value1.X + value2.M42 * value1.Y + value2.M43 * value1.Z + value2.M44 * value1.W + value2.M45 * value1.V + value2.M46 * value1.U,
V = value2.M51 * value1.X + value2.M52 * value1.Y + value2.M53 * value1.Z + value2.M54 * value1.W + value2.M55 * value1.V + value2.M56 * value1.U,
U = value2.M61 * value1.X + value2.M62 * value1.Y + value2.M63 * value1.Z + value2.M64 * value1.W + value2.M65 * value1.V + value2.M66 * value1.U,
};
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector4 Transform(ref Vector4 value1, ref Matrix4x4 value2)
{
return new Vector4
{
X = value2.M11 * value1.X + value2.M12 * value1.Y + value2.M13 * value1.Z + value2.M14 * value1.W,
Y = value2.M21 * value1.X + value2.M22 * value1.Y + value2.M23 * value1.Z + value2.M24 * value1.W,
Z = value2.M31 * value1.X + value2.M32 * value1.Y + value2.M33 * value1.Z + value2.M34 * value1.W,
W = value2.M41 * value1.X + value2.M42 * value1.Y + value2.M43 * value1.Z + value2.M44 * value1.W
};
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector2 Transform(ref Vector2 value1, ref Matrix2x2 value2)
{
return new Vector2
{
X = value2.M11 * value1.X + value2.M12 * value1.Y,
Y = value2.M21 * value1.X + value2.M22 * value1.Y,
};
}
}
}

View File

@ -0,0 +1,56 @@
using System.Runtime.CompilerServices;
namespace Ryujinx.Audio.Renderer.Utils.Math
{
record struct Vector6
{
public float X;
public float Y;
public float Z;
public float W;
public float V;
public float U;
public Vector6(float value) : this(value, value, value, value, value, value)
{
}
public Vector6(float x, float y, float z, float w, float v, float u)
{
X = x;
Y = y;
Z = z;
W = w;
V = v;
U = u;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector6 operator +(Vector6 left, Vector6 right)
{
return new Vector6(left.X + right.X,
left.Y + right.Y,
left.Z + right.Z,
left.W + right.W,
left.V + right.V,
left.U + right.U);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector6 operator *(Vector6 left, Vector6 right)
{
return new Vector6(left.X * right.X,
left.Y * right.Y,
left.Z * right.Z,
left.W * right.W,
left.V * right.V,
left.U * right.U);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static Vector6 operator *(Vector6 left, float right)
{
return left * new Vector6(right);
}
}
}

View File

@ -34,6 +34,7 @@ namespace Ryujinx.Common.Configuration
private const string DefaultModsDir = "mods";
public static string CustomModsPath { get; set; }
public static string CustomSdModsPath {get; set; }
public static string CustomNandPath { get; set; } // TODO: Actually implement this into VFS
public static string CustomSdCardPath { get; set; } // TODO: Actually implement this into VFS
@ -85,5 +86,6 @@ namespace Ryujinx.Common.Configuration
}
public static string GetModsPath() => CustomModsPath ?? Directory.CreateDirectory(Path.Combine(BaseDirPath, DefaultModsDir)).FullName;
public static string GetSdModsPath() => CustomSdModsPath ?? Directory.CreateDirectory(Path.Combine(BaseDirPath, DefaultSdcardDir, "atmosphere")).FullName;
}
}

View File

@ -5,6 +5,7 @@
public Stick Joystick { get; set; }
public bool InvertStickX { get; set; }
public bool InvertStickY { get; set; }
public bool Rotate90CW { get; set; }
public Button StickButton { get; set; }
}
}

View File

@ -47,6 +47,7 @@ namespace Ryujinx.Common.Logging
ServiceNim,
ServiceNs,
ServiceNsd,
ServiceNtc,
ServiceNv,
ServiceOlsc,
ServicePctl,

View File

@ -1,10 +1,14 @@
using System.Reflection;
using Ryujinx.Common.Configuration;
using System;
using System.Reflection;
namespace Ryujinx.Common
{
// DO NOT EDIT, filled by CI
public static class ReleaseInformations
{
private const string FlatHubChannelOwner = "flathub";
public static string BuildVersion = "%%RYUJINX_BUILD_VERSION%%";
public static string BuildGitHash = "%%RYUJINX_BUILD_GIT_HASH%%";
public static string ReleaseChannelName = "%%RYUJINX_TARGET_RELEASE_CHANNEL_NAME%%";
@ -19,6 +23,11 @@ namespace Ryujinx.Common
!ReleaseChannelRepo.StartsWith("%%");
}
public static bool IsFlatHubBuild()
{
return IsValid() && ReleaseChannelOwner.Equals(FlatHubChannelOwner);
}
public static string GetVersion()
{
if (IsValid())
@ -30,5 +39,15 @@ namespace Ryujinx.Common
return Assembly.GetEntryAssembly().GetCustomAttribute<AssemblyInformationalVersionAttribute>().InformationalVersion;
}
}
public static string GetBaseApplicationDirectory()
{
if (IsFlatHubBuild())
{
return AppDataManager.BaseDirPath;
}
return AppDomain.CurrentDomain.BaseDirectory;
}
}
}

View File

@ -1,7 +1,12 @@
using Ryujinx.Graphics.Shader.Translation;
namespace Ryujinx.Graphics.GAL
{
public struct Capabilities
{
public readonly TargetApi Api;
public readonly string VendorName;
public readonly bool HasFrontFacingBug;
public readonly bool HasVectorIndexingBug;
@ -24,6 +29,8 @@ namespace Ryujinx.Graphics.GAL
public readonly int StorageBufferOffsetAlignment;
public Capabilities(
TargetApi api,
string vendorName,
bool hasFrontFacingBug,
bool hasVectorIndexingBug,
bool supportsAstcCompression,
@ -43,6 +50,8 @@ namespace Ryujinx.Graphics.GAL
float maximumSupportedAnisotropy,
int storageBufferOffsetAlignment)
{
Api = api;
VendorName = vendorName;
HasFrontFacingBug = hasFrontFacingBug;
HasVectorIndexingBug = hasVectorIndexingBug;
SupportsAstcCompression = supportsAstcCompression;

View File

@ -1,47 +0,0 @@
namespace Ryujinx.Graphics.GAL
{
public struct DepthStencilState
{
public bool DepthTestEnable { get; }
public bool DepthWriteEnable { get; }
public bool StencilTestEnable { get; }
public CompareOp DepthFunc { get; }
public CompareOp StencilFrontFunc { get; }
public StencilOp StencilFrontSFail { get; }
public StencilOp StencilFrontDpPass { get; }
public StencilOp StencilFrontDpFail { get; }
public CompareOp StencilBackFunc { get; }
public StencilOp StencilBackSFail { get; }
public StencilOp StencilBackDpPass { get; }
public StencilOp StencilBackDpFail { get; }
public DepthStencilState(
bool depthTestEnable,
bool depthWriteEnable,
bool stencilTestEnable,
CompareOp depthFunc,
CompareOp stencilFrontFunc,
StencilOp stencilFrontSFail,
StencilOp stencilFrontDpPass,
StencilOp stencilFrontDpFail,
CompareOp stencilBackFunc,
StencilOp stencilBackSFail,
StencilOp stencilBackDpPass,
StencilOp stencilBackDpFail)
{
DepthTestEnable = depthTestEnable;
DepthWriteEnable = depthWriteEnable;
StencilTestEnable = stencilTestEnable;
DepthFunc = depthFunc;
StencilFrontFunc = stencilFrontFunc;
StencilFrontSFail = stencilFrontSFail;
StencilFrontDpPass = stencilFrontDpPass;
StencilFrontDpFail = stencilFrontDpFail;
StencilBackFunc = stencilBackFunc;
StencilBackSFail = stencilBackSFail;
StencilBackDpPass = stencilBackDpPass;
StencilBackDpFail = stencilBackDpFail;
}
}
}

View File

@ -52,7 +52,7 @@ namespace Ryujinx.Graphics.GAL
R32G32B32A32Sint,
S8Uint,
D16Unorm,
D24X8Unorm,
S8UintD24Unorm,
D32Float,
D24UnormS8Uint,
D32FloatS8Uint,
@ -266,7 +266,7 @@ namespace Ryujinx.Graphics.GAL
{
case Format.D16Unorm:
case Format.D24UnormS8Uint:
case Format.D24X8Unorm:
case Format.S8UintD24Unorm:
case Format.D32Float:
case Format.D32FloatS8Uint:
case Format.S8Uint:

View File

@ -16,11 +16,9 @@ namespace Ryujinx.Graphics.GAL
void BackgroundContextAction(Action action, bool alwaysBackground = false);
IShader CompileShader(ShaderStage stage, string code);
BufferHandle CreateBuffer(int size);
IProgram CreateProgram(IShader[] shaders, ShaderInfo info);
IProgram CreateProgram(ShaderSource[] shaders, ShaderInfo info);
ISampler CreateSampler(SamplerCreateInfo info);
ITexture CreateTexture(TextureCreateInfo info, float scale);

View File

@ -4,7 +4,6 @@ using Ryujinx.Graphics.GAL.Multithreading.Commands.CounterEvent;
using Ryujinx.Graphics.GAL.Multithreading.Commands.Program;
using Ryujinx.Graphics.GAL.Multithreading.Commands.Renderer;
using Ryujinx.Graphics.GAL.Multithreading.Commands.Sampler;
using Ryujinx.Graphics.GAL.Multithreading.Commands.Shader;
using Ryujinx.Graphics.GAL.Multithreading.Commands.Texture;
using Ryujinx.Graphics.GAL.Multithreading.Commands.Window;
using System;
@ -53,8 +52,6 @@ namespace Ryujinx.Graphics.GAL.Multithreading
{
_lookup[(int)CommandType.Action] = (Span<byte> memory, ThreadedRenderer threaded, IRenderer renderer) =>
ActionCommand.Run(ref GetCommand<ActionCommand>(memory), threaded, renderer);
_lookup[(int)CommandType.CompileShader] = (Span<byte> memory, ThreadedRenderer threaded, IRenderer renderer) =>
CompileShaderCommand.Run(ref GetCommand<CompileShaderCommand>(memory), threaded, renderer);
_lookup[(int)CommandType.CreateBuffer] = (Span<byte> memory, ThreadedRenderer threaded, IRenderer renderer) =>
CreateBufferCommand.Run(ref GetCommand<CreateBufferCommand>(memory), threaded, renderer);
_lookup[(int)CommandType.CreateProgram] = (Span<byte> memory, ThreadedRenderer threaded, IRenderer renderer) =>
@ -98,9 +95,6 @@ namespace Ryujinx.Graphics.GAL.Multithreading
_lookup[(int)CommandType.SamplerDispose] = (Span<byte> memory, ThreadedRenderer threaded, IRenderer renderer) =>
SamplerDisposeCommand.Run(ref GetCommand<SamplerDisposeCommand>(memory), threaded, renderer);
_lookup[(int)CommandType.ShaderDispose] = (Span<byte> memory, ThreadedRenderer threaded, IRenderer renderer) =>
ShaderDisposeCommand.Run(ref GetCommand<ShaderDisposeCommand>(memory), threaded, renderer);
_lookup[(int)CommandType.TextureCopyTo] = (Span<byte> memory, ThreadedRenderer threaded, IRenderer renderer) =>
TextureCopyToCommand.Run(ref GetCommand<TextureCopyToCommand>(memory), threaded, renderer);
_lookup[(int)CommandType.TextureCopyToScaled] = (Span<byte> memory, ThreadedRenderer threaded, IRenderer renderer) =>

View File

@ -3,7 +3,6 @@
enum CommandType : byte
{
Action,
CompileShader,
CreateBuffer,
CreateProgram,
CreateSampler,
@ -29,8 +28,6 @@
SamplerDispose,
ShaderDispose,
TextureCopyTo,
TextureCopyToScaled,
TextureCopyToSlice,

View File

@ -1,22 +0,0 @@
using Ryujinx.Graphics.GAL.Multithreading.Model;
using Ryujinx.Graphics.GAL.Multithreading.Resources;
namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Renderer
{
struct CompileShaderCommand : IGALCommand
{
public CommandType CommandType => CommandType.CompileShader;
private TableRef<ThreadedShader> _shader;
public void Set(TableRef<ThreadedShader> shader)
{
_shader = shader;
}
public static void Run(ref CompileShaderCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
ThreadedShader shader = command._shader.Get(threaded);
shader.EnsureCreated();
}
}
}

View File

@ -1,7 +1,4 @@
using Ryujinx.Graphics.GAL.Multithreading.Resources;
using Ryujinx.Graphics.Shader;
namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Renderer
namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Renderer
{
struct CreateBufferCommand : IGALCommand
{

View File

@ -1,6 +1,4 @@
using System;
namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Renderer
namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Renderer
{
struct PreFrameCommand : IGALCommand
{

View File

@ -1,21 +0,0 @@
using Ryujinx.Graphics.GAL.Multithreading.Model;
using Ryujinx.Graphics.GAL.Multithreading.Resources;
namespace Ryujinx.Graphics.GAL.Multithreading.Commands.Shader
{
struct ShaderDisposeCommand : IGALCommand
{
public CommandType CommandType => CommandType.ShaderDispose;
private TableRef<ThreadedShader> _shader;
public void Set(TableRef<ThreadedShader> shader)
{
_shader = shader;
}
public static void Run(ref ShaderDisposeCommand command, ThreadedRenderer threaded, IRenderer renderer)
{
command._shader.Get(threaded).Base.Dispose();
}
}
}

View File

@ -6,10 +6,10 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources.Programs
{
public ThreadedProgram Threaded { get; set; }
private IShader[] _shaders;
private ShaderSource[] _shaders;
private ShaderInfo _info;
public SourceProgramRequest(ThreadedProgram program, IShader[] shaders, ShaderInfo info)
public SourceProgramRequest(ThreadedProgram program, ShaderSource[] shaders, ShaderInfo info)
{
Threaded = program;
@ -19,14 +19,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading.Resources.Programs
public IProgram Create(IRenderer renderer)
{
IShader[] shaders = _shaders.Select(shader =>
{
var threaded = (ThreadedShader)shader;
threaded?.EnsureCreated();
return threaded?.Base;
}).ToArray();
return renderer.CreateProgram(shaders, _info);
return renderer.CreateProgram(_shaders, _info);
}
}
}

View File

@ -1,38 +0,0 @@
using Ryujinx.Graphics.GAL.Multithreading.Commands.Shader;
using Ryujinx.Graphics.GAL.Multithreading.Model;
using Ryujinx.Graphics.Shader;
namespace Ryujinx.Graphics.GAL.Multithreading.Resources
{
class ThreadedShader : IShader
{
private ThreadedRenderer _renderer;
private ShaderStage _stage;
private string _code;
public IShader Base;
public ThreadedShader(ThreadedRenderer renderer, ShaderStage stage, string code)
{
_renderer = renderer;
_stage = stage;
_code = code;
}
internal void EnsureCreated()
{
if (_code != null && Base == null)
{
Base = _renderer.BaseRenderer.CompileShader(_stage, _code);
_code = null;
}
}
public void Dispose()
{
_renderer.New<ShaderDisposeCommand>().Set(new TableRef<ThreadedShader>(_renderer, this));
_renderer.QueueCommand();
}
}
}

View File

@ -1,7 +1,6 @@
using Ryujinx.Graphics.GAL.Multithreading.Commands;
using Ryujinx.Graphics.GAL.Multithreading.Model;
using Ryujinx.Graphics.GAL.Multithreading.Resources;
using Ryujinx.Graphics.Shader;
using System;
using System.Linq;

View File

@ -250,15 +250,6 @@ namespace Ryujinx.Graphics.GAL.Multithreading
}
}
public IShader CompileShader(ShaderStage stage, string code)
{
var shader = new ThreadedShader(this, stage, code);
New<CompileShaderCommand>().Set(Ref(shader));
QueueCommand();
return shader;
}
public BufferHandle CreateBuffer(int size)
{
BufferHandle handle = Buffers.CreateBufferHandle();
@ -268,7 +259,7 @@ namespace Ryujinx.Graphics.GAL.Multithreading
return handle;
}
public IProgram CreateProgram(IShader[] shaders, ShaderInfo info)
public IProgram CreateProgram(ShaderSource[] shaders, ShaderInfo info)
{
var program = new ThreadedProgram(this);
SourceProgramRequest request = new SourceProgramRequest(program, shaders, info);

View File

@ -0,0 +1,29 @@
using Ryujinx.Graphics.Shader;
using Ryujinx.Graphics.Shader.Translation;
namespace Ryujinx.Graphics.GAL
{
public struct ShaderSource
{
public string Code { get; }
public byte[] BinaryCode { get; }
public ShaderStage Stage { get; }
public TargetLanguage Language { get; }
public ShaderSource(string code, byte[] binaryCode, ShaderStage stage, TargetLanguage language)
{
Code = code;
BinaryCode = binaryCode;
Stage = stage;
Language = language;
}
public ShaderSource(string code, ShaderStage stage, TargetLanguage language) : this(code, null, stage, language)
{
}
public ShaderSource(byte[] binaryCode, ShaderStage stage, TargetLanguage language) : this(null, binaryCode, stage, language)
{
}
}
}

View File

@ -9,7 +9,6 @@ namespace Ryujinx.Graphics.GAL
Texture2DArray,
Texture2DMultisample,
Texture2DMultisampleArray,
Rectangle,
Cubemap,
CubemapArray,
TextureBuffer

View File

@ -124,24 +124,20 @@ namespace Ryujinx.Graphics.Gpu.Engine.Compute
ulong samplerPoolGpuVa = ((ulong)_state.State.SetTexSamplerPoolAOffsetUpper << 32) | _state.State.SetTexSamplerPoolB;
ulong texturePoolGpuVa = ((ulong)_state.State.SetTexHeaderPoolAOffsetUpper << 32) | _state.State.SetTexHeaderPoolB;
GpuAccessorState gas = new GpuAccessorState(
GpuChannelPoolState poolState = new GpuChannelPoolState(
texturePoolGpuVa,
_state.State.SetTexHeaderPoolCMaximumIndex,
_state.State.SetBindlessTextureConstantBufferSlotSelect,
false,
PrimitiveTopology.Points,
default);
_state.State.SetBindlessTextureConstantBufferSlotSelect);
ShaderBundle cs = memoryManager.Physical.ShaderCache.GetComputeShader(
_channel,
gas,
shaderGpuVa,
GpuChannelComputeState computeState = new GpuChannelComputeState(
qmd.CtaThreadDimension0,
qmd.CtaThreadDimension1,
qmd.CtaThreadDimension2,
localMemorySize,
sharedMemorySize);
CachedShaderProgram cs = memoryManager.Physical.ShaderCache.GetComputeShader(_channel, poolState, computeState, shaderGpuVa);
_context.Renderer.Pipeline.SetProgram(cs.HostProgram);
_channel.TextureManager.SetComputeSamplerPool(samplerPoolGpuVa, _state.State.SetTexSamplerPoolCMaximumIndex, qmd.SamplerIndex);

View File

@ -1,7 +1,7 @@
using Ryujinx.Common;
using Ryujinx.Common.Logging;
using Ryujinx.Graphics.Device;
using Ryujinx.Graphics.Gpu.Engine.Threed;
using Ryujinx.Graphics.Gpu.Memory;
using Ryujinx.Graphics.Texture;
using System;
using System.Collections.Generic;
@ -330,10 +330,94 @@ namespace Ryujinx.Graphics.Gpu.Engine.Dma
{
// TODO: Implement remap functionality.
// Buffer to buffer copy.
bool srcIsPitchKind = memoryManager.GetKind(srcGpuVa).IsPitch();
bool dstIsPitchKind = memoryManager.GetKind(dstGpuVa).IsPitch();
if (!srcIsPitchKind && dstIsPitchKind)
{
CopyGobBlockLinearToLinear(memoryManager, srcGpuVa, dstGpuVa, size);
}
else if (srcIsPitchKind && !dstIsPitchKind)
{
CopyGobLinearToBlockLinear(memoryManager, srcGpuVa, dstGpuVa, size);
}
else
{
memoryManager.Physical.BufferCache.CopyBuffer(memoryManager, srcGpuVa, dstGpuVa, size);
}
}
}
}
/// <summary>
/// Copies block linear data with block linear GOBs to a block linear destination with linear GOBs.
/// </summary>
/// <param name="memoryManager">GPU memory manager</param>
/// <param name="srcGpuVa">Source GPU virtual address</param>
/// <param name="dstGpuVa">Destination GPU virtual address</param>
/// <param name="size">Size in bytes of the copy</param>
private static void CopyGobBlockLinearToLinear(MemoryManager memoryManager, ulong srcGpuVa, ulong dstGpuVa, ulong size)
{
if (((srcGpuVa | dstGpuVa | size) & 0xf) == 0)
{
for (ulong offset = 0; offset < size; offset += 16)
{
Vector128<byte> data = memoryManager.Read<Vector128<byte>>(ConvertGobLinearToBlockLinearAddress(srcGpuVa + offset), true);
memoryManager.Write(dstGpuVa + offset, data);
}
}
else
{
for (ulong offset = 0; offset < size; offset++)
{
byte data = memoryManager.Read<byte>(ConvertGobLinearToBlockLinearAddress(srcGpuVa + offset), true);
memoryManager.Write(dstGpuVa + offset, data);
}
}
}
/// <summary>
/// Copies block linear data with linear GOBs to a block linear destination with block linear GOBs.
/// </summary>
/// <param name="memoryManager">GPU memory manager</param>
/// <param name="srcGpuVa">Source GPU virtual address</param>
/// <param name="dstGpuVa">Destination GPU virtual address</param>
/// <param name="size">Size in bytes of the copy</param>
private static void CopyGobLinearToBlockLinear(MemoryManager memoryManager, ulong srcGpuVa, ulong dstGpuVa, ulong size)
{
if (((srcGpuVa | dstGpuVa | size) & 0xf) == 0)
{
for (ulong offset = 0; offset < size; offset += 16)
{
Vector128<byte> data = memoryManager.Read<Vector128<byte>>(srcGpuVa + offset, true);
memoryManager.Write(ConvertGobLinearToBlockLinearAddress(dstGpuVa + offset), data);
}
}
else
{
for (ulong offset = 0; offset < size; offset++)
{
byte data = memoryManager.Read<byte>(srcGpuVa + offset, true);
memoryManager.Write(ConvertGobLinearToBlockLinearAddress(dstGpuVa + offset), data);
}
}
}
/// <summary>
/// Calculates the GOB block linear address from a linear address.
/// </summary>
/// <param name="address">Linear address</param>
/// <returns>Block linear address</returns>
private static ulong ConvertGobLinearToBlockLinearAddress(ulong address)
{
// y2 y1 y0 x5 x4 x3 x2 x1 x0 -> x5 y2 y1 x4 y0 x3 x2 x1 x0
return (address & ~0x1f0UL) |
((address & 0x40) >> 2) |
((address & 0x10) << 1) |
((address & 0x180) >> 1) |
((address & 0x20) << 3);
}
/// <summary>
/// Performs a buffer to buffer, or buffer to texture copy, then optionally releases a semaphore.

View File

@ -525,7 +525,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
int scissorW = screenScissorState.Width;
int scissorH = screenScissorState.Height;
if (clearAffectedByScissor)
if (clearAffectedByScissor && _state.State.ScissorState[0].Enable)
{
ref var scissorState = ref _state.State.ScissorState[0];

View File

@ -7,7 +7,6 @@ using Ryujinx.Graphics.Shader;
using Ryujinx.Graphics.Texture;
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
namespace Ryujinx.Graphics.Gpu.Engine.Threed
{
@ -20,6 +19,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
public const int RasterizerStateIndex = 1;
public const int ScissorStateIndex = 2;
public const int VertexBufferStateIndex = 3;
public const int PrimitiveRestartStateIndex = 4;
private readonly GpuContext _context;
private readonly GpuChannel _channel;
@ -29,11 +29,14 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
private readonly StateUpdateTracker<ThreedClassState> _updateTracker;
private readonly ShaderProgramInfo[] _currentProgramInfo;
private ShaderSpecializationState _shaderSpecState;
private bool _vtgWritesRtLayer;
private byte _vsClipDistancesWritten;
private bool _prevDrawIndexed;
private IndexType _prevIndexType;
private uint _prevFirstVertex;
private bool _prevTfEnable;
/// <summary>
@ -75,6 +78,10 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
nameof(ThreedClassState.VertexBufferState),
nameof(ThreedClassState.VertexBufferEndAddress)),
new StateUpdateCallbackEntry(UpdatePrimitiveRestartState,
nameof(ThreedClassState.PrimitiveRestartDrawArrays),
nameof(ThreedClassState.PrimitiveRestartState)),
new StateUpdateCallbackEntry(UpdateTessellationState,
nameof(ThreedClassState.TessOuterLevel),
nameof(ThreedClassState.TessInnerLevel),
@ -140,8 +147,6 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
nameof(ThreedClassState.PointSpriteEnable),
nameof(ThreedClassState.PointCoordReplace)),
new StateUpdateCallbackEntry(UpdatePrimitiveRestartState, nameof(ThreedClassState.PrimitiveRestartState)),
new StateUpdateCallbackEntry(UpdateIndexBufferState,
nameof(ThreedClassState.IndexBufferState),
nameof(ThreedClassState.IndexBufferCount)),
@ -190,6 +195,17 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public void Update()
{
// If any state that the shader depends on changed,
// then we may need to compile/bind a different version
// of the shader for the new state.
if (_shaderSpecState != null)
{
if (!_shaderSpecState.MatchesGraphics(_channel, GetPoolState()))
{
ForceShaderUpdate();
}
}
// The vertex buffer size is calculated using a different
// method when doing indexed draws, so we need to make sure
// to update the vertex buffers if we are doing a regular
@ -197,9 +213,31 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
if (_drawState.DrawIndexed != _prevDrawIndexed)
{
_updateTracker.ForceDirty(VertexBufferStateIndex);
// If PrimitiveRestartDrawArrays is false and this is a non-indexed draw, we need to ensure primitive restart is disabled.
// If PrimitiveRestartDrawArrays is false and this is a indexed draw, we need to ensure primitive restart enable matches GPU state.
// If PrimitiveRestartDrawArrays is true, then primitive restart enable should always match GPU state.
// That is because "PrimitiveRestartDrawArrays" is not configurable on the backend, it is always
// true on OpenGL and always false on Vulkan.
if (!_state.State.PrimitiveRestartDrawArrays && _state.State.PrimitiveRestartState.Enable)
{
_updateTracker.ForceDirty(PrimitiveRestartStateIndex);
}
_prevDrawIndexed = _drawState.DrawIndexed;
}
// In some cases, the index type is also used to guess the
// vertex buffer size, so we must update it if the type changed too.
if (_drawState.DrawIndexed &&
(_prevIndexType != _state.State.IndexBufferState.Type ||
_prevFirstVertex != _state.State.FirstVertex))
{
_updateTracker.ForceDirty(VertexBufferStateIndex);
_prevIndexType = _state.State.IndexBufferState.Type;
_prevFirstVertex = _state.State.FirstVertex;
}
bool tfEnable = _state.State.TfEnable;
if (!tfEnable && _prevTfEnable)
@ -816,8 +854,9 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
private void UpdatePrimitiveRestartState()
{
PrimitiveRestartState primitiveRestart = _state.State.PrimitiveRestartState;
bool enable = primitiveRestart.Enable && (_drawState.DrawIndexed || _state.State.PrimitiveRestartDrawArrays);
_context.Renderer.Pipeline.SetPrimitiveRestart(primitiveRestart.Enable, primitiveRestart.Index);
_context.Renderer.Pipeline.SetPrimitiveRestart(enable, primitiveRestart.Index);
}
/// <summary>
@ -852,6 +891,9 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
/// </summary>
private void UpdateVertexBufferState()
{
IndexType indexType = _state.State.IndexBufferState.Type;
bool indexTypeSmall = indexType == IndexType.UByte || indexType == IndexType.UShort;
_drawState.IsAnyVbInstanced = false;
for (int index = 0; index < Constants.TotalVertexBuffers; index++)
@ -883,12 +925,27 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
{
// This size may be (much) larger than the real vertex buffer size.
// Avoid calculating it this way, unless we don't have any other option.
size = endAddress.Pack() - address + 1;
if (stride > 0 && indexTypeSmall)
{
// If the index type is a small integer type, then we might be still able
// to reduce the vertex buffer size based on the maximum possible index value.
ulong maxVertexBufferSize = indexType == IndexType.UByte ? 0x100UL : 0x10000UL;
maxVertexBufferSize += _state.State.FirstVertex;
maxVertexBufferSize *= (uint)stride;
size = Math.Min(size, maxVertexBufferSize);
}
}
else
{
// For non-indexed draws, we can guess the size from the vertex count
// and stride.
int firstInstance = (int)_state.State.FirstInstance;
var drawState = _state.State.VertexBufferDrawState;
@ -1019,51 +1076,53 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
/// </summary>
private void UpdateShaderState()
{
var shaderCache = _channel.MemoryManager.Physical.ShaderCache;
_vtgWritesRtLayer = false;
ShaderAddresses addresses = new ShaderAddresses();
Span<ShaderAddresses> addressesSpan = MemoryMarshal.CreateSpan(ref addresses, 1);
Span<ulong> addressesArray = MemoryMarshal.Cast<ShaderAddresses, ulong>(addressesSpan);
Span<ulong> addressesSpan = addresses.AsSpan();
ulong baseAddress = _state.State.ShaderBaseAddress.Pack();
for (int index = 0; index < 6; index++)
{
var shader = _state.State.ShaderState[index];
if (!shader.UnpackEnable() && index != 1)
{
continue;
}
addressesArray[index] = baseAddress + shader.Offset;
addressesSpan[index] = baseAddress + shader.Offset;
}
GpuAccessorState gas = new GpuAccessorState(
_state.State.TexturePoolState.Address.Pack(),
_state.State.TexturePoolState.MaximumId,
(int)_state.State.TextureBufferIndex,
_state.State.EarlyZForce,
_drawState.Topology,
_state.State.TessMode);
GpuChannelPoolState poolState = GetPoolState();
GpuChannelGraphicsState graphicsState = GetGraphicsState();
ShaderBundle gs = _channel.MemoryManager.Physical.ShaderCache.GetGraphicsShader(ref _state.State, _channel, gas, addresses);
CachedShaderProgram gs = shaderCache.GetGraphicsShader(ref _state.State, _channel, poolState, graphicsState, addresses);
_shaderSpecState = gs.SpecializationState;
byte oldVsClipDistancesWritten = _vsClipDistancesWritten;
_drawState.VsUsesInstanceId = gs.Shaders[0]?.Info.UsesInstanceId ?? false;
_vsClipDistancesWritten = gs.Shaders[0]?.Info.ClipDistancesWritten ?? 0;
_vtgWritesRtLayer = false;
_drawState.VsUsesInstanceId = gs.Shaders[1]?.Info.UsesInstanceId ?? false;
_vsClipDistancesWritten = gs.Shaders[1]?.Info.ClipDistancesWritten ?? 0;
if (oldVsClipDistancesWritten != _vsClipDistancesWritten)
{
UpdateUserClipState();
}
for (int stage = 0; stage < Constants.ShaderStages; stage++)
for (int stageIndex = 0; stageIndex < Constants.ShaderStages; stageIndex++)
{
ShaderProgramInfo info = gs.Shaders[stage]?.Info;
UpdateStageBindings(stageIndex, gs.Shaders[stageIndex + 1]?.Info);
}
_context.Renderer.Pipeline.SetProgram(gs.HostProgram);
}
private void UpdateStageBindings(int stage, ShaderProgramInfo info)
{
_currentProgramInfo[stage] = info;
if (info == null)
@ -1072,7 +1131,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
_channel.TextureManager.RentGraphicsImageBindings(stage, 0);
_channel.BufferManager.SetGraphicsStorageBufferBindings(stage, null);
_channel.BufferManager.SetGraphicsUniformBufferBindings(stage, null);
continue;
return;
}
Span<TextureBindingInfo> textureBindings = _channel.TextureManager.RentGraphicsTextureBindings(stage, info.Textures.Count);
@ -1118,7 +1177,24 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
_channel.BufferManager.SetGraphicsUniformBufferBindings(stage, info.CBuffers);
}
_context.Renderer.Pipeline.SetProgram(gs.HostProgram);
private GpuChannelPoolState GetPoolState()
{
return new GpuChannelPoolState(
_state.State.TexturePoolState.Address.Pack(),
_state.State.TexturePoolState.MaximumId,
(int)_state.State.TextureBufferIndex);
}
/// <summary>
/// Gets the current GPU channel state for shader creation or compatibility verification.
/// </summary>
/// <returns>Current GPU channel state</returns>
private GpuChannelGraphicsState GetGraphicsState()
{
return new GpuChannelGraphicsState(
_state.State.EarlyZForce,
_drawState.Topology,
_state.State.TessMode);
}
/// <summary>

View File

@ -730,7 +730,9 @@ namespace Ryujinx.Graphics.Gpu.Engine.Threed
public int PatchVertices;
public fixed uint ReservedDD0[4];
public uint TextureBarrier;
public fixed uint ReservedDE4[7];
public uint WatchdogTimer;
public Boolean32 PrimitiveRestartDrawArrays;
public fixed uint ReservedDEC[5];
public Array16<ScissorState> ScissorState;
public fixed uint ReservedF00[21];
public StencilBackMasks StencilBackMasks;

View File

@ -1,11 +1,15 @@
using Ryujinx.Graphics.Device;
using Ryujinx.Common;
using Ryujinx.Graphics.Device;
using Ryujinx.Graphics.GAL;
using Ryujinx.Graphics.Gpu.Engine.Types;
using Ryujinx.Graphics.Gpu.Image;
using Ryujinx.Graphics.Texture;
using Ryujinx.Memory;
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
namespace Ryujinx.Graphics.Gpu.Engine.Twod
{
@ -44,6 +48,180 @@ namespace Ryujinx.Graphics.Gpu.Engine.Twod
/// <param name="data">Data to be written</param>
public void Write(int offset, int data) => _state.Write(offset, data);
/// <summary>
/// Determines if data is compatible between the source and destination texture.
/// The two textures must have the same size, layout, and bytes per pixel.
/// </summary>
/// <param name="lhs">Info for the first texture</param>
/// <param name="rhs">Info for the second texture</param>
/// <param name="lhsFormat">Format of the first texture</param>
/// <param name="rhsFormat">Format of the second texture</param>
/// <returns>True if the data is compatible, false otherwise</returns>
private bool IsDataCompatible(TwodTexture lhs, TwodTexture rhs, FormatInfo lhsFormat, FormatInfo rhsFormat)
{
if (lhsFormat.BytesPerPixel != rhsFormat.BytesPerPixel ||
lhs.Height != rhs.Height ||
lhs.Depth != rhs.Depth ||
lhs.LinearLayout != rhs.LinearLayout ||
lhs.MemoryLayout.Packed != rhs.MemoryLayout.Packed)
{
return false;
}
if (lhs.LinearLayout)
{
return lhs.Stride == rhs.Stride;
}
else
{
return lhs.Width == rhs.Width;
}
}
/// <summary>
/// Determine if the given region covers the full texture, also considering width alignment.
/// </summary>
/// <param name="texture">The texture to check</param>
/// <param name="formatInfo"></param>
/// <param name="x1">Region start x</param>
/// <param name="y1">Region start y</param>
/// <param name="x2">Region end x</param>
/// <param name="y2">Region end y</param>
/// <returns>True if the region covers the full texture, false otherwise</returns>
private bool IsCopyRegionComplete(TwodTexture texture, FormatInfo formatInfo, int x1, int y1, int x2, int y2)
{
if (x1 != 0 || y1 != 0 || y2 != texture.Height)
{
return false;
}
int width;
int widthAlignment;
if (texture.LinearLayout)
{
widthAlignment = 1;
width = texture.Stride / formatInfo.BytesPerPixel;
}
else
{
widthAlignment = Constants.GobAlignment / formatInfo.BytesPerPixel;
width = texture.Width;
}
return width == BitUtils.AlignUp(x2, widthAlignment);
}
/// <summary>
/// Performs a full data copy between two textures, reading and writing guest memory directly.
/// The textures must have a matching layout, size, and bytes per pixel.
/// </summary>
/// <param name="src">The source texture</param>
/// <param name="dst">The destination texture</param>
/// <param name="w">Copy width</param>
/// <param name="h">Copy height</param>
/// <param name="bpp">Bytes per pixel</param>
private void UnscaledFullCopy(TwodTexture src, TwodTexture dst, int w, int h, int bpp)
{
var srcCalculator = new OffsetCalculator(
w,
h,
src.Stride,
src.LinearLayout,
src.MemoryLayout.UnpackGobBlocksInY(),
src.MemoryLayout.UnpackGobBlocksInZ(),
bpp);
(int _, int srcSize) = srcCalculator.GetRectangleRange(0, 0, w, h);
var memoryManager = _channel.MemoryManager;
ulong srcGpuVa = src.Address.Pack();
ulong dstGpuVa = dst.Address.Pack();
ReadOnlySpan<byte> srcSpan = memoryManager.GetSpan(srcGpuVa, srcSize, true);
int width;
int height = src.Height;
if (src.LinearLayout)
{
width = src.Stride / bpp;
}
else
{
width = src.Width;
}
// If the copy is not equal to the width and height of the texture, we will need to copy partially.
// It's worth noting that it has already been established that the src and dst are the same size.
if (w == width && h == height)
{
memoryManager.Write(dstGpuVa, srcSpan);
}
else
{
using WritableRegion dstRegion = memoryManager.GetWritableRegion(dstGpuVa, srcSize, true);
Span<byte> dstSpan = dstRegion.Memory.Span;
if (src.LinearLayout)
{
int stride = src.Stride;
int offset = 0;
int lineSize = width * bpp;
for (int y = 0; y < height; y++)
{
srcSpan.Slice(offset, lineSize).CopyTo(dstSpan.Slice(offset));
offset += stride;
}
}
else
{
// Copy with the block linear layout in mind.
// Recreate the offset calculate with bpp 1 for copy.
int stride = w * bpp;
srcCalculator = new OffsetCalculator(
stride,
h,
0,
false,
src.MemoryLayout.UnpackGobBlocksInY(),
src.MemoryLayout.UnpackGobBlocksInZ(),
1);
int strideTrunc = BitUtils.AlignDown(stride, 16);
ReadOnlySpan<Vector128<byte>> srcVec = MemoryMarshal.Cast<byte, Vector128<byte>>(srcSpan);
Span<Vector128<byte>> dstVec = MemoryMarshal.Cast<byte, Vector128<byte>>(dstSpan);
for (int y = 0; y < h; y++)
{
int x = 0;
srcCalculator.SetY(y);
for (; x < strideTrunc; x += 16)
{
int offset = srcCalculator.GetOffset(x) >> 4;
dstVec[offset] = srcVec[offset];
}
for (; x < stride; x++)
{
int offset = srcCalculator.GetOffset(x);
dstSpan[offset] = srcSpan[offset];
}
}
}
}
}
/// <summary>
/// Performs the blit operation, triggered by the register write.
/// </summary>
@ -114,16 +292,31 @@ namespace Ryujinx.Graphics.Gpu.Engine.Twod
srcX1 = 0;
}
FormatInfo dstCopyTextureFormat = dstCopyTexture.Format.Convert();
bool canDirectCopy = GraphicsConfig.Fast2DCopy &&
srcX2 == dstX2 && srcY2 == dstY2 &&
IsDataCompatible(srcCopyTexture, dstCopyTexture, srcCopyTextureFormat, dstCopyTextureFormat) &&
IsCopyRegionComplete(srcCopyTexture, srcCopyTextureFormat, srcX1, srcY1, srcX2, srcY2) &&
IsCopyRegionComplete(dstCopyTexture, dstCopyTextureFormat, dstX1, dstY1, dstX2, dstY2);
var srcTexture = memoryManager.Physical.TextureCache.FindOrCreateTexture(
memoryManager,
srcCopyTexture,
offset,
srcCopyTextureFormat,
!canDirectCopy,
false,
srcHint);
if (srcTexture == null)
{
if (canDirectCopy)
{
// Directly copy the data on CPU.
UnscaledFullCopy(srcCopyTexture, dstCopyTexture, srcX2, srcY2, srcCopyTextureFormat.BytesPerPixel);
}
return;
}
@ -132,7 +325,6 @@ namespace Ryujinx.Graphics.Gpu.Engine.Twod
// When the source texture that was found has a depth format,
// we must enforce the target texture also has a depth format,
// as copies between depth and color formats are not allowed.
FormatInfo dstCopyTextureFormat;
if (srcTexture.Format.IsDepthOrStencil())
{
@ -148,6 +340,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.Twod
dstCopyTexture,
0,
dstCopyTextureFormat,
true,
srcTexture.ScaleMode == TextureScaleMode.Scaled,
dstHint);

View File

@ -32,7 +32,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.Types
ZetaFormat.D16Unorm => new FormatInfo(Format.D16Unorm, 1, 1, 2, 1),
ZetaFormat.D24UnormS8Uint => new FormatInfo(Format.D24UnormS8Uint, 1, 1, 4, 2),
ZetaFormat.D24Unorm => new FormatInfo(Format.D24UnormS8Uint, 1, 1, 4, 1),
ZetaFormat.S8UintD24Unorm => new FormatInfo(Format.D24UnormS8Uint, 1, 1, 4, 2),
ZetaFormat.S8UintD24Unorm => new FormatInfo(Format.S8UintD24Unorm, 1, 1, 4, 2),
ZetaFormat.S8Uint => new FormatInfo(Format.S8Uint, 1, 1, 1, 1),
ZetaFormat.D32FloatS8Uint => new FormatInfo(Format.D32FloatS8Uint, 1, 1, 8, 2),
_ => FormatInfo.Default

View File

@ -238,13 +238,13 @@ namespace Ryujinx.Graphics.Gpu
/// <summary>
/// Initialize the GPU shader cache.
/// </summary>
public void InitializeShaderCache()
public void InitializeShaderCache(CancellationToken cancellationToken)
{
HostInitalized.WaitOne();
foreach (var physicalMemory in PhysicalMemoryRegistry.Values)
{
physicalMemory.ShaderCache.Initialize();
physicalMemory.ShaderCache.Initialize(cancellationToken);
}
}

View File

@ -28,6 +28,14 @@ namespace Ryujinx.Graphics.Gpu
/// </summary>
public static bool FastGpuTime = true;
/// <summary>
/// Enables or disables fast 2d engine texture copies entirely on CPU when possible.
/// Reduces stuttering and # of textures in games that copy textures around for streaming,
/// as textures will not need to be created for the copy, and the data does not need to be
/// flushed from GPU.
/// </summary>
public static bool Fast2DCopy = true;
/// <summary>
/// Enables or disables the Just-in-Time compiler for GPU Macro code.
/// </summary>

View File

@ -55,6 +55,7 @@ namespace Ryujinx.Graphics.Gpu.Image
{ 0x24a0e, new FormatInfo(Format.D24UnormS8Uint, 1, 1, 4, 2) },
{ 0x24a29, new FormatInfo(Format.D24UnormS8Uint, 1, 1, 4, 2) },
{ 0x48a29, new FormatInfo(Format.D24UnormS8Uint, 1, 1, 4, 2) },
{ 0x4912b, new FormatInfo(Format.S8UintD24Unorm, 1, 1, 4, 2) },
{ 0x25385, new FormatInfo(Format.D32FloatS8Uint, 1, 1, 8, 2) },
{ 0x253b0, new FormatInfo(Format.D32FloatS8Uint, 1, 1, 8, 2) },
{ 0xa4908, new FormatInfo(Format.R8G8B8A8Srgb, 1, 1, 4, 4) },

View File

@ -1136,18 +1136,34 @@ namespace Ryujinx.Graphics.Gpu.Image
/// <param name="range">Texture view physical memory ranges</param>
/// <param name="layerSize">Layer size on the given texture</param>
/// <param name="caps">Host GPU capabilities</param>
/// <param name="allowMs">Indicates that multisample textures are allowed to match non-multisample requested textures</param>
/// <param name="firstLayer">Texture view initial layer on this texture</param>
/// <param name="firstLevel">Texture view first mipmap level on this texture</param>
/// <returns>The level of compatiblilty a view with the given parameters created from this texture has</returns>
public TextureViewCompatibility IsViewCompatible(TextureInfo info, MultiRange range, int layerSize, Capabilities caps, out int firstLayer, out int firstLevel)
public TextureViewCompatibility IsViewCompatible(TextureInfo info, MultiRange range, int layerSize, Capabilities caps, bool allowMs, out int firstLayer, out int firstLevel)
{
TextureViewCompatibility result = TextureViewCompatibility.Full;
result = TextureCompatibility.PropagateViewCompatibility(result, TextureCompatibility.ViewFormatCompatible(Info, info, caps));
if (result != TextureViewCompatibility.Incompatible)
{
bool msTargetCompatible = false;
if (allowMs)
{
msTargetCompatible = Info.Target == Target.Texture2DMultisample && info.Target == Target.Texture2D;
}
if (!msTargetCompatible)
{
result = TextureCompatibility.PropagateViewCompatibility(result, TextureCompatibility.ViewTargetCompatible(Info, info));
if (Info.SamplesInX != info.SamplesInX || Info.SamplesInY != info.SamplesInY)
{
result = TextureViewCompatibility.Incompatible;
}
}
if (result == TextureViewCompatibility.Full && Info.FormatInfo.Format != info.FormatInfo.Format && !_context.Capabilities.SupportsMismatchingViewFormat)
{
// AMD and Intel have a bug where the view format is always ignored;
@ -1156,11 +1172,6 @@ namespace Ryujinx.Graphics.Gpu.Image
result = TextureViewCompatibility.CopyOnly;
}
if (Info.SamplesInX != info.SamplesInX || Info.SamplesInY != info.SamplesInY)
{
result = TextureViewCompatibility.Incompatible;
}
}
firstLayer = 0;

View File

@ -7,7 +7,6 @@ using Ryujinx.Graphics.Gpu.Engine.Types;
using Ryujinx.Graphics.Gpu.Image;
using Ryujinx.Graphics.Gpu.Memory;
using Ryujinx.Graphics.Texture;
using Ryujinx.Memory;
using Ryujinx.Memory.Range;
using System;
using System.Collections.Generic;
@ -40,6 +39,7 @@ namespace Ryujinx.Graphics.Gpu.Image
private readonly PhysicalMemory _physicalMemory;
private readonly MultiRangeList<Texture> _textures;
private readonly HashSet<Texture> _partiallyMappedTextures;
private Texture[] _textureOverlaps;
private OverlapInfo[] _overlapInfo;
@ -57,6 +57,7 @@ namespace Ryujinx.Graphics.Gpu.Image
_physicalMemory = physicalMemory;
_textures = new MultiRangeList<Texture>();
_partiallyMappedTextures = new HashSet<Texture>();
_textureOverlaps = new Texture[OverlapsBufferInitialCapacity];
_overlapInfo = new OverlapInfo[OverlapsBufferInitialCapacity];
@ -74,17 +75,7 @@ namespace Ryujinx.Graphics.Gpu.Image
Texture[] overlaps = new Texture[10];
int overlapCount;
MultiRange unmapped;
try
{
unmapped = ((MemoryManager)sender).GetPhysicalRegions(e.Address, e.Size);
}
catch (InvalidMemoryRegionException)
{
// This event fires on Map in case any mappings are overwritten. In that case, there may not be an existing mapping.
return;
}
MultiRange unmapped = ((MemoryManager)sender).GetPhysicalRegions(e.Address, e.Size);
lock (_textures)
{
@ -95,6 +86,24 @@ namespace Ryujinx.Graphics.Gpu.Image
{
overlaps[i].Unmapped(unmapped);
}
// If any range was previously unmapped, we also need to purge
// all partially mapped texture, as they might be fully mapped now.
for (int i = 0; i < unmapped.Count; i++)
{
if (unmapped.GetSubRange(i).Address == MemoryManager.PteUnmapped)
{
lock (_partiallyMappedTextures)
{
foreach (var texture in _partiallyMappedTextures)
{
texture.Unmapped(unmapped);
}
}
break;
}
}
}
/// <summary>
@ -194,6 +203,7 @@ namespace Ryujinx.Graphics.Gpu.Image
TwodTexture copyTexture,
ulong offset,
FormatInfo formatInfo,
bool shouldCreate,
bool preferScaling = true,
Size? sizeHint = null)
{
@ -234,6 +244,11 @@ namespace Ryujinx.Graphics.Gpu.Image
flags |= TextureSearchFlags.WithUpscale;
}
if (!shouldCreate)
{
flags |= TextureSearchFlags.NoCreate;
}
Texture texture = FindOrCreateTexture(memoryManager, flags, info, 0, sizeHint);
texture?.SynchronizeMemory();
@ -480,15 +495,29 @@ namespace Ryujinx.Graphics.Gpu.Image
return texture;
}
else if (flags.HasFlag(TextureSearchFlags.NoCreate))
{
return null;
}
// Calculate texture sizes, used to find all overlapping textures.
SizeInfo sizeInfo = info.CalculateSizeInfo(layerSize);
ulong size = (ulong)sizeInfo.TotalSize;
bool partiallyMapped = false;
if (range == null)
{
range = memoryManager.GetPhysicalRegions(info.GpuAddress, size);
for (int i = 0; i < range.Value.Count; i++)
{
if (range.Value.GetSubRange(i).Address == MemoryManager.PteUnmapped)
{
partiallyMapped = true;
break;
}
}
}
// Find view compatible matches.
@ -513,7 +542,14 @@ namespace Ryujinx.Graphics.Gpu.Image
for (int index = 0; index < overlapsCount; index++)
{
Texture overlap = _textureOverlaps[index];
TextureViewCompatibility overlapCompatibility = overlap.IsViewCompatible(info, range.Value, sizeInfo.LayerSize, _context.Capabilities, out int firstLayer, out int firstLevel);
TextureViewCompatibility overlapCompatibility = overlap.IsViewCompatible(
info,
range.Value,
sizeInfo.LayerSize,
_context.Capabilities,
flags.HasFlag(TextureSearchFlags.ForCopy),
out int firstLayer,
out int firstLevel);
if (overlapCompatibility == TextureViewCompatibility.Full)
{
@ -621,7 +657,14 @@ namespace Ryujinx.Graphics.Gpu.Image
Texture overlap = _textureOverlaps[index];
bool overlapInCache = overlap.CacheNode != null;
TextureViewCompatibility compatibility = texture.IsViewCompatible(overlap.Info, overlap.Range, overlap.LayerSize, _context.Capabilities, out int firstLayer, out int firstLevel);
TextureViewCompatibility compatibility = texture.IsViewCompatible(
overlap.Info,
overlap.Range,
overlap.LayerSize,
_context.Capabilities,
false,
out int firstLayer,
out int firstLevel);
if (overlap.IsView && compatibility == TextureViewCompatibility.Full)
{
@ -774,6 +817,14 @@ namespace Ryujinx.Graphics.Gpu.Image
_textures.Add(texture);
}
if (partiallyMapped)
{
lock (_partiallyMappedTextures)
{
_partiallyMappedTextures.Add(texture);
}
}
ShrinkOverlapsBufferIfNeeded();
for (int i = 0; i < overlapsCount; i++)
@ -963,20 +1014,34 @@ namespace Ryujinx.Graphics.Gpu.Image
depthOrLayers = info.DepthOrLayers;
}
// 2D and 2D multisample textures are not considered compatible.
// This specific case is required for copies, where the source texture might be multisample.
// In this case, we inherit the parent texture multisample state.
Target target = info.Target;
int samplesInX = info.SamplesInX;
int samplesInY = info.SamplesInY;
if (target == Target.Texture2D && parent.Target == Target.Texture2DMultisample)
{
target = Target.Texture2DMultisample;
samplesInX = parent.Info.SamplesInX;
samplesInY = parent.Info.SamplesInY;
}
return new TextureInfo(
info.GpuAddress,
width,
height,
depthOrLayers,
info.Levels,
info.SamplesInX,
info.SamplesInY,
samplesInX,
samplesInY,
info.Stride,
info.IsLinear,
info.GobBlocksInY,
info.GobBlocksInZ,
info.GobBlocksInTileX,
info.Target,
target,
info.FormatInfo,
info.DepthStencilMode,
info.SwizzleR,
@ -1069,6 +1134,11 @@ namespace Ryujinx.Graphics.Gpu.Image
{
_textures.Remove(texture);
}
lock (_partiallyMappedTextures)
{
_partiallyMappedTextures.Remove(texture);
}
}
/// <summary>

View File

@ -203,7 +203,7 @@ namespace Ryujinx.Graphics.Gpu.Image
}
if ((lhs.FormatInfo.Format == Format.D24UnormS8Uint ||
lhs.FormatInfo.Format == Format.D24X8Unorm) && rhs.FormatInfo.Format == Format.B8G8R8A8Unorm)
lhs.FormatInfo.Format == Format.S8UintD24Unorm) && rhs.FormatInfo.Format == Format.B8G8R8A8Unorm)
{
return TextureMatchQuality.FormatAlias;
}

View File

@ -879,7 +879,7 @@ namespace Ryujinx.Graphics.Gpu.Image
int sliceStart = Math.Clamp(offset, 0, subRangeSize);
int sliceEnd = Math.Clamp(endOffset, 0, subRangeSize);
if (sliceStart != sliceEnd)
if (sliceStart != sliceEnd && item.Address != MemoryManager.PteUnmapped)
{
result.Add(GenerateHandle(item.Address + (ulong)sliceStart, (ulong)(sliceEnd - sliceStart)));
}
@ -1097,11 +1097,20 @@ namespace Ryujinx.Graphics.Gpu.Image
{
// Single dirty region.
var cpuRegionHandles = new CpuRegionHandle[TextureRange.Count];
int count = 0;
for (int i = 0; i < TextureRange.Count; i++)
{
var currentRange = TextureRange.GetSubRange(i);
cpuRegionHandles[i] = GenerateHandle(currentRange.Address, currentRange.Size);
if (currentRange.Address != MemoryManager.PteUnmapped)
{
cpuRegionHandles[count++] = GenerateHandle(currentRange.Address, currentRange.Size);
}
}
if (count != TextureRange.Count)
{
Array.Resize(ref cpuRegionHandles, count);
}
var groupHandle = new TextureGroupHandle(this, 0, Storage.Size, _views, 0, 0, 0, _allOffsets.Length, cpuRegionHandles);

View File

@ -362,7 +362,7 @@ namespace Ryujinx.Graphics.Gpu.Image
return DepthStencilMode.Depth;
}
if (format == Format.D24X8Unorm || format == Format.D24UnormS8Uint)
if (format == Format.D24UnormS8Uint)
{
return component == SwizzleComponent.Red
? DepthStencilMode.Stencil

View File

@ -12,6 +12,7 @@ namespace Ryujinx.Graphics.Gpu.Image
Strict = 1 << 0,
ForSampler = 1 << 1,
ForCopy = 1 << 2,
WithUpscale = 1 << 3
WithUpscale = 1 << 3,
NoCreate = 1 << 4
}
}

View File

@ -17,6 +17,8 @@ namespace Ryujinx.Graphics.Gpu.Memory
private const ulong BufferAlignmentSize = 0x1000;
private const ulong BufferAlignmentMask = BufferAlignmentSize - 1;
private const ulong MaxDynamicGrowthSize = 0x100000;
private readonly GpuContext _context;
private readonly PhysicalMemory _physicalMemory;
@ -166,10 +168,35 @@ namespace Ryujinx.Graphics.Gpu.Memory
// Otherwise, we must delete the overlapping buffers and create a bigger buffer
// that fits all the data we need. We also need to copy the contents from the
// old buffer(s) to the new buffer.
ulong endAddress = address + size;
if (_bufferOverlaps[0].Address > address || _bufferOverlaps[0].EndAddress < endAddress)
{
// Check if the following conditions are met:
// - We have a single overlap.
// - The overlap starts at or before the requested range. That is, the overlap happens at the end.
// - The size delta between the new, merged buffer and the old one is of at most 2 pages.
// In this case, we attempt to extend the buffer further than the requested range,
// this can potentially avoid future resizes if the application keeps using overlapping
// sequential memory.
// Allowing for 2 pages (rather than just one) is necessary to catch cases where the
// range crosses a page, and after alignment, ends having a size of 2 pages.
if (overlapsCount == 1 &&
address >= _bufferOverlaps[0].Address &&
endAddress - _bufferOverlaps[0].EndAddress <= BufferAlignmentSize * 2)
{
// Try to grow the buffer by 1.5x of its current size.
// This improves performance in the cases where the buffer is resized often by small amounts.
ulong existingSize = _bufferOverlaps[0].Size;
ulong growthSize = (existingSize + Math.Min(existingSize >> 1, MaxDynamicGrowthSize)) & ~BufferAlignmentMask;
size = Math.Max(size, growthSize);
endAddress = address + size;
overlapsCount = _buffers.FindOverlapsNonOverlapping(address, size, ref _bufferOverlaps);
}
for (int index = 0; index < overlapsCount; index++)
{
Buffer buffer = _bufferOverlaps[index];
@ -183,7 +210,9 @@ namespace Ryujinx.Graphics.Gpu.Memory
}
}
Buffer newBuffer = new Buffer(_context, _physicalMemory, address, endAddress - address, _bufferOverlaps.Take(overlapsCount));
ulong newSize = endAddress - address;
Buffer newBuffer = new Buffer(_context, _physicalMemory, address, newSize, _bufferOverlaps.Take(overlapsCount));
lock (_buffers)
{
@ -202,7 +231,7 @@ namespace Ryujinx.Graphics.Gpu.Memory
buffer.DisposeData();
}
newBuffer.SynchronizeMemory(address, endAddress - address);
newBuffer.SynchronizeMemory(address, newSize);
// Existing buffers were modified, we need to rebind everything.
NotifyBuffersModified?.Invoke();

View File

@ -28,7 +28,7 @@ namespace Ryujinx.Graphics.Gpu.Memory
private const int PtLvl1Bit = PtPageBits;
private const int AddressSpaceBits = PtPageBits + PtLvl1Bits + PtLvl0Bits;
public const ulong PteUnmapped = 0xffffffff_ffffffff;
public const ulong PteUnmapped = ulong.MaxValue;
private readonly ulong[][] _pageTable;
@ -115,6 +115,73 @@ namespace Ryujinx.Graphics.Gpu.Memory
}
}
/// <summary>
/// Gets a read-only span of data from GPU mapped memory, up to the entire range specified,
/// or the last mapped page if the range is not fully mapped.
/// </summary>
/// <param name="va">GPU virtual address where the data is located</param>
/// <param name="size">Size of the data</param>
/// <param name="tracked">True if read tracking is triggered on the span</param>
/// <returns>The span of the data at the specified memory location</returns>
public ReadOnlySpan<byte> GetSpanMapped(ulong va, int size, bool tracked = false)
{
bool isContiguous = true;
int mappedSize;
if (ValidateAddress(va) && GetPte(va) != PteUnmapped && Physical.IsMapped(Translate(va)))
{
ulong endVa = va + (ulong)size;
ulong endVaAligned = (endVa + PageMask) & ~PageMask;
ulong currentVa = va & ~PageMask;
int pages = (int)((endVaAligned - currentVa) / PageSize);
for (int page = 0; page < pages - 1; page++)
{
ulong nextVa = currentVa + PageSize;
ulong nextPa = Translate(nextVa);
if (!ValidateAddress(nextVa) || GetPte(nextVa) == PteUnmapped || !Physical.IsMapped(nextPa))
{
break;
}
if (Translate(currentVa) + PageSize != nextPa)
{
isContiguous = false;
}
currentVa += PageSize;
}
currentVa += PageSize;
if (currentVa > endVa)
{
currentVa = endVa;
}
mappedSize = (int)(currentVa - va);
}
else
{
return ReadOnlySpan<byte>.Empty;
}
if (isContiguous)
{
return Physical.GetSpan(Translate(va), mappedSize, tracked);
}
else
{
Span<byte> data = new byte[mappedSize];
ReadImpl(va, data, tracked);
return data;
}
}
/// <summary>
/// Reads data from a possibly non-contiguous region of GPU mapped memory.
/// </summary>
@ -154,14 +221,15 @@ namespace Ryujinx.Graphics.Gpu.Memory
/// <summary>
/// Gets a writable region from GPU mapped memory.
/// </summary>
/// <param name="address">Start address of the range</param>
/// <param name="va">Start address of the range</param>
/// <param name="size">Size in bytes to be range</param>
/// <param name="tracked">True if write tracking is triggered on the span</param>
/// <returns>A writable region with the data at the specified memory location</returns>
public WritableRegion GetWritableRegion(ulong va, int size)
public WritableRegion GetWritableRegion(ulong va, int size, bool tracked = false)
{
if (IsContiguous(va, size))
{
return Physical.GetWritableRegion(Translate(va), size);
return Physical.GetWritableRegion(Translate(va), size, tracked);
}
else
{
@ -169,7 +237,7 @@ namespace Ryujinx.Graphics.Gpu.Memory
GetSpan(va, size).CopyTo(memory.Span);
return new WritableRegion(this, va, memory);
return new WritableRegion(this, va, memory, tracked);
}
}
@ -254,6 +322,49 @@ namespace Ryujinx.Graphics.Gpu.Memory
}
}
/// <summary>
/// Writes data to GPU mapped memory, stopping at the first unmapped page at the memory region, if any.
/// </summary>
/// <param name="va">GPU virtual address to write the data into</param>
/// <param name="data">The data to be written</param>
public void WriteMapped(ulong va, ReadOnlySpan<byte> data)
{
if (IsContiguous(va, data.Length))
{
Physical.Write(Translate(va), data);
}
else
{
int offset = 0, size;
if ((va & PageMask) != 0)
{
ulong pa = Translate(va);
size = Math.Min(data.Length, (int)PageSize - (int)(va & PageMask));
if (pa != PteUnmapped && Physical.IsMapped(pa))
{
Physical.Write(pa, data.Slice(0, size));
}
offset += size;
}
for (; offset < data.Length; offset += size)
{
ulong pa = Translate(va + (ulong)offset);
size = Math.Min(data.Length - offset, (int)PageSize);
if (pa != PteUnmapped && Physical.IsMapped(pa))
{
Physical.Write(pa, data.Slice(offset, size));
}
}
}
}
/// <summary>
/// Maps a given range of pages to the specified CPU virtual address.
/// </summary>
@ -263,7 +374,8 @@ namespace Ryujinx.Graphics.Gpu.Memory
/// <param name="pa">CPU virtual address to map into</param>
/// <param name="va">GPU virtual address to be mapped</param>
/// <param name="size">Size in bytes of the mapping</param>
public void Map(ulong pa, ulong va, ulong size)
/// <param name="kind">Kind of the resource located at the mapping</param>
public void Map(ulong pa, ulong va, ulong size, PteKind kind)
{
lock (_pageTable)
{
@ -271,7 +383,7 @@ namespace Ryujinx.Graphics.Gpu.Memory
for (ulong offset = 0; offset < size; offset += PageSize)
{
SetPte(va + offset, pa + offset);
SetPte(va + offset, PackPte(pa + offset, kind));
}
}
}
@ -339,7 +451,6 @@ namespace Ryujinx.Graphics.Gpu.Memory
/// <param name="va">Virtual address of the range</param>
/// <param name="size">Size of the range</param>
/// <returns>Multi-range with the physical regions</returns>
/// <exception cref="InvalidMemoryRegionException">The memory region specified by <paramref name="va"/> and <paramref name="size"/> is not fully mapped</exception>
public MultiRange GetPhysicalRegions(ulong va, ulong size)
{
if (IsContiguous(va, (int)size))
@ -347,11 +458,6 @@ namespace Ryujinx.Graphics.Gpu.Memory
return new MultiRange(Translate(va), size);
}
if (!IsMapped(va))
{
throw new InvalidMemoryRegionException($"The specified GPU virtual address 0x{va:X} is not mapped.");
}
ulong regionStart = Translate(va);
ulong regionSize = Math.Min(size, PageSize - (va & PageMask));
@ -366,14 +472,10 @@ namespace Ryujinx.Graphics.Gpu.Memory
for (int page = 0; page < pages - 1; page++)
{
if (!IsMapped(va + PageSize))
{
throw new InvalidMemoryRegionException($"The specified GPU virtual memory range 0x{va:X}..0x{(va + size):X} is not fully mapped.");
}
ulong currPa = Translate(va);
ulong newPa = Translate(va + PageSize);
if (Translate(va) + PageSize != newPa)
if ((currPa != PteUnmapped || newPa != PteUnmapped) && currPa + PageSize != newPa)
{
regions.Add(new MemoryRange(regionStart, regionSize));
regionStart = newPa;
@ -404,6 +506,8 @@ namespace Ryujinx.Graphics.Gpu.Memory
{
MemoryRange currentRange = range.GetSubRange(i);
if (currentRange.Address != PteUnmapped)
{
ulong address = currentRange.Address & ~PageMask;
ulong endAddress = (currentRange.EndAddress + PageMask) & ~PageMask;
@ -418,6 +522,21 @@ namespace Ryujinx.Graphics.Gpu.Memory
address += PageSize;
}
}
else
{
ulong endVa = va + (((currentRange.Size) + PageMask) & ~PageMask);
while (va < endVa)
{
if (Translate(va) != PteUnmapped)
{
return false;
}
va += PageSize;
}
}
}
return true;
}
@ -454,14 +573,37 @@ namespace Ryujinx.Graphics.Gpu.Memory
return PteUnmapped;
}
ulong baseAddress = GetPte(va);
ulong pte = GetPte(va);
if (baseAddress == PteUnmapped)
if (pte == PteUnmapped)
{
return PteUnmapped;
}
return baseAddress + (va & PageMask);
return UnpackPaFromPte(pte) + (va & PageMask);
}
/// <summary>
/// Gets the kind of a given memory page.
/// This might indicate the type of resource that can be allocated on the page, and also texture tiling.
/// </summary>
/// <param name="va">GPU virtual address</param>
/// <returns>Kind of the memory page</returns>
public PteKind GetKind(ulong va)
{
if (!ValidateAddress(va))
{
return PteKind.Invalid;
}
ulong pte = GetPte(va);
if (pte == PteUnmapped)
{
return PteKind.Invalid;
}
return UnpackKindFromPte(pte);
}
/// <summary>
@ -504,5 +646,36 @@ namespace Ryujinx.Graphics.Gpu.Memory
_pageTable[l0][l1] = pte;
}
/// <summary>
/// Creates a page table entry from a physical address and kind.
/// </summary>
/// <param name="pa">Physical address</param>
/// <param name="kind">Kind</param>
/// <returns>Page table entry</returns>
private static ulong PackPte(ulong pa, PteKind kind)
{
return pa | ((ulong)kind << 56);
}
/// <summary>
/// Unpacks kind from a page table entry.
/// </summary>
/// <param name="pte">Page table entry</param>
/// <returns>Kind</returns>
private static PteKind UnpackKindFromPte(ulong pte)
{
return (PteKind)(pte >> 56);
}
/// <summary>
/// Unpacks physical address from a page table entry.
/// </summary>
/// <param name="pte">Page table entry</param>
/// <returns>Physical address</returns>
private static ulong UnpackPaFromPte(ulong pte)
{
return pte & 0xffffffffffffffUL;
}
}
}

View File

@ -7,8 +7,6 @@ using Ryujinx.Memory.Range;
using Ryujinx.Memory.Tracking;
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading;
namespace Ryujinx.Graphics.Gpu.Memory
@ -19,8 +17,6 @@ namespace Ryujinx.Graphics.Gpu.Memory
/// </summary>
class PhysicalMemory : IDisposable
{
public const int PageSize = 0x1000;
private readonly GpuContext _context;
private IVirtualMemoryManagerTracked _cpuMemory;
private int _referenceCount;
@ -103,10 +99,12 @@ namespace Ryujinx.Graphics.Gpu.Memory
if (range.Count == 1)
{
var singleRange = range.GetSubRange(0);
if (singleRange.Address != MemoryManager.PteUnmapped)
{
return _cpuMemory.GetSpan(singleRange.Address, (int)singleRange.Size, tracked);
}
else
{
}
Span<byte> data = new byte[range.GetSize()];
int offset = 0;
@ -115,13 +113,15 @@ namespace Ryujinx.Graphics.Gpu.Memory
{
var currentRange = range.GetSubRange(i);
int size = (int)currentRange.Size;
if (currentRange.Address != MemoryManager.PteUnmapped)
{
_cpuMemory.GetSpan(currentRange.Address, size, tracked).CopyTo(data.Slice(offset, size));
}
offset += size;
}
return data;
}
}
/// <summary>
/// Gets a writable region from the application process.
@ -156,11 +156,13 @@ namespace Ryujinx.Graphics.Gpu.Memory
int offset = 0;
for (int i = 0; i < range.Count; i++)
{
MemoryRange subrange = range.GetSubRange(i);
GetSpan(subrange.Address, (int)subrange.Size).CopyTo(memory.Span.Slice(offset, (int)subrange.Size));
offset += (int)subrange.Size;
var currentRange = range.GetSubRange(i);
int size = (int)currentRange.Size;
if (currentRange.Address != MemoryManager.PteUnmapped)
{
GetSpan(currentRange.Address, size).CopyTo(memory.Span.Slice(offset, size));
}
offset += size;
}
return new WritableRegion(new MultiRangeWritableBlock(range, this), 0, memory, tracked);
@ -253,8 +255,11 @@ namespace Ryujinx.Graphics.Gpu.Memory
if (range.Count == 1)
{
var singleRange = range.GetSubRange(0);
if (singleRange.Address != MemoryManager.PteUnmapped)
{
writeCallback(singleRange.Address, data);
}
}
else
{
int offset = 0;
@ -263,7 +268,10 @@ namespace Ryujinx.Graphics.Gpu.Memory
{
var currentRange = range.GetSubRange(i);
int size = (int)currentRange.Size;
if (currentRange.Address != MemoryManager.PteUnmapped)
{
writeCallback(currentRange.Address, data.Slice(offset, size));
}
offset += size;
}
}
@ -288,11 +296,20 @@ namespace Ryujinx.Graphics.Gpu.Memory
public GpuRegionHandle BeginTracking(MultiRange range)
{
var cpuRegionHandles = new CpuRegionHandle[range.Count];
int count = 0;
for (int i = 0; i < range.Count; i++)
{
var currentRange = range.GetSubRange(i);
cpuRegionHandles[i] = _cpuMemory.BeginTracking(currentRange.Address, currentRange.Size);
if (currentRange.Address != MemoryManager.PteUnmapped)
{
cpuRegionHandles[count++] = _cpuMemory.BeginTracking(currentRange.Address, currentRange.Size);
}
}
if (count != range.Count)
{
Array.Resize(ref cpuRegionHandles, count);
}
return new GpuRegionHandle(cpuRegionHandles);
@ -323,6 +340,16 @@ namespace Ryujinx.Graphics.Gpu.Memory
return _cpuMemory.BeginSmartGranularTracking(address, size, granularity);
}
/// <summary>
/// Checks if a given memory page is mapped.
/// </summary>
/// <param name="address">CPU virtual address of the page</param>
/// <returns>True if mapped, false otherwise</returns>
public bool IsMapped(ulong address)
{
return _cpuMemory.IsMapped(address);
}
/// <summary>
/// Release our reference to the CPU memory manager.
/// </summary>

View File

@ -0,0 +1,268 @@
namespace Ryujinx.Graphics.Gpu.Memory
{
/// <summary>
/// Kind of the resource at the given memory mapping.
/// </summary>
public enum PteKind : byte
{
Invalid = 0xff,
Pitch = 0x00,
Z16 = 0x01,
Z162C = 0x02,
Z16MS22C = 0x03,
Z16MS42C = 0x04,
Z16MS82C = 0x05,
Z16MS162C = 0x06,
Z162Z = 0x07,
Z16MS22Z = 0x08,
Z16MS42Z = 0x09,
Z16MS82Z = 0x0a,
Z16MS162Z = 0x0b,
Z162CZ = 0x36,
Z16MS22CZ = 0x37,
Z16MS42CZ = 0x38,
Z16MS82CZ = 0x39,
Z16MS162CZ = 0x5f,
Z164CZ = 0x0c,
Z16MS24CZ = 0x0d,
Z16MS44CZ = 0x0e,
Z16MS84CZ = 0x0f,
Z16MS164CZ = 0x10,
S8Z24 = 0x11,
S8Z241Z = 0x12,
S8Z24MS21Z = 0x13,
S8Z24MS41Z = 0x14,
S8Z24MS81Z = 0x15,
S8Z24MS161Z = 0x16,
S8Z242CZ = 0x17,
S8Z24MS22CZ = 0x18,
S8Z24MS42CZ = 0x19,
S8Z24MS82CZ = 0x1a,
S8Z24MS162CZ = 0x1b,
S8Z242CS = 0x1c,
S8Z24MS22CS = 0x1d,
S8Z24MS42CS = 0x1e,
S8Z24MS82CS = 0x1f,
S8Z24MS162CS = 0x20,
S8Z244CSZV = 0x21,
S8Z24MS24CSZV = 0x22,
S8Z24MS44CSZV = 0x23,
S8Z24MS84CSZV = 0x24,
S8Z24MS164CSZV = 0x25,
V8Z24MS4VC12 = 0x26,
V8Z24MS4VC4 = 0x27,
V8Z24MS8VC8 = 0x28,
V8Z24MS8VC24 = 0x29,
V8Z24MS4VC121ZV = 0x2e,
V8Z24MS4VC41ZV = 0x2f,
V8Z24MS8VC81ZV = 0x30,
V8Z24MS8VC241ZV = 0x31,
V8Z24MS4VC122CS = 0x32,
V8Z24MS4VC42CS = 0x33,
V8Z24MS8VC82CS = 0x34,
V8Z24MS8VC242CS = 0x35,
V8Z24MS4VC122CZV = 0x3a,
V8Z24MS4VC42CZV = 0x3b,
V8Z24MS8VC82CZV = 0x3c,
V8Z24MS8VC242CZV = 0x3d,
V8Z24MS4VC122ZV = 0x3e,
V8Z24MS4VC42ZV = 0x3f,
V8Z24MS8VC82ZV = 0x40,
V8Z24MS8VC242ZV = 0x41,
V8Z24MS4VC124CSZV = 0x42,
V8Z24MS4VC44CSZV = 0x43,
V8Z24MS8VC84CSZV = 0x44,
V8Z24MS8VC244CSZV = 0x45,
Z24S8 = 0x46,
Z24S81Z = 0x47,
Z24S8MS21Z = 0x48,
Z24S8MS41Z = 0x49,
Z24S8MS81Z = 0x4a,
Z24S8MS161Z = 0x4b,
Z24S82CS = 0x4c,
Z24S8MS22CS = 0x4d,
Z24S8MS42CS = 0x4e,
Z24S8MS82CS = 0x4f,
Z24S8MS162CS = 0x50,
Z24S82CZ = 0x51,
Z24S8MS22CZ = 0x52,
Z24S8MS42CZ = 0x53,
Z24S8MS82CZ = 0x54,
Z24S8MS162CZ = 0x55,
Z24S84CSZV = 0x56,
Z24S8MS24CSZV = 0x57,
Z24S8MS44CSZV = 0x58,
Z24S8MS84CSZV = 0x59,
Z24S8MS164CSZV = 0x5a,
Z24V8MS4VC12 = 0x5b,
Z24V8MS4VC4 = 0x5c,
Z24V8MS8VC8 = 0x5d,
Z24V8MS8VC24 = 0x5e,
YUVB8C12Y = 0x60,
YUVB8C22Y = 0x61,
YUVB10C12Y = 0x62,
YUVB10C22Y = 0x6b,
YUVB12C12Y = 0x6c,
YUVB12C22Y = 0x6d,
Z24V8MS4VC121ZV = 0x63,
Z24V8MS4VC41ZV = 0x64,
Z24V8MS8VC81ZV = 0x65,
Z24V8MS8VC241ZV = 0x66,
Z24V8MS4VC122CS = 0x67,
Z24V8MS4VC42CS = 0x68,
Z24V8MS8VC82CS = 0x69,
Z24V8MS8VC242CS = 0x6a,
Z24V8MS4VC122CZV = 0x6f,
Z24V8MS4VC42CZV = 0x70,
Z24V8MS8VC82CZV = 0x71,
Z24V8MS8VC242CZV = 0x72,
Z24V8MS4VC122ZV = 0x73,
Z24V8MS4VC42ZV = 0x74,
Z24V8MS8VC82ZV = 0x75,
Z24V8MS8VC242ZV = 0x76,
Z24V8MS4VC124CSZV = 0x77,
Z24V8MS4VC44CSZV = 0x78,
Z24V8MS8VC84CSZV = 0x79,
Z24V8MS8VC244CSZV = 0x7a,
ZF32 = 0x7b,
ZF321Z = 0x7c,
ZF32MS21Z = 0x7d,
ZF32MS41Z = 0x7e,
ZF32MS81Z = 0x7f,
ZF32MS161Z = 0x80,
ZF322CS = 0x81,
ZF32MS22CS = 0x82,
ZF32MS42CS = 0x83,
ZF32MS82CS = 0x84,
ZF32MS162CS = 0x85,
ZF322CZ = 0x86,
ZF32MS22CZ = 0x87,
ZF32MS42CZ = 0x88,
ZF32MS82CZ = 0x89,
ZF32MS162CZ = 0x8a,
X8Z24X16V8S8MS4VC12 = 0x8b,
X8Z24X16V8S8MS4VC4 = 0x8c,
X8Z24X16V8S8MS8VC8 = 0x8d,
X8Z24X16V8S8MS8VC24 = 0x8e,
X8Z24X16V8S8MS4VC121CS = 0x8f,
X8Z24X16V8S8MS4VC41CS = 0x90,
X8Z24X16V8S8MS8VC81CS = 0x91,
X8Z24X16V8S8MS8VC241CS = 0x92,
X8Z24X16V8S8MS4VC121ZV = 0x97,
X8Z24X16V8S8MS4VC41ZV = 0x98,
X8Z24X16V8S8MS8VC81ZV = 0x99,
X8Z24X16V8S8MS8VC241ZV = 0x9a,
X8Z24X16V8S8MS4VC121CZV = 0x9b,
X8Z24X16V8S8MS4VC41CZV = 0x9c,
X8Z24X16V8S8MS8VC81CZV = 0x9d,
X8Z24X16V8S8MS8VC241CZV = 0x9e,
X8Z24X16V8S8MS4VC122CS = 0x9f,
X8Z24X16V8S8MS4VC42CS = 0xa0,
X8Z24X16V8S8MS8VC82CS = 0xa1,
X8Z24X16V8S8MS8VC242CS = 0xa2,
X8Z24X16V8S8MS4VC122CSZV = 0xa3,
X8Z24X16V8S8MS4VC42CSZV = 0xa4,
X8Z24X16V8S8MS8VC82CSZV = 0xa5,
X8Z24X16V8S8MS8VC242CSZV = 0xa6,
ZF32X16V8S8MS4VC12 = 0xa7,
ZF32X16V8S8MS4VC4 = 0xa8,
ZF32X16V8S8MS8VC8 = 0xa9,
ZF32X16V8S8MS8VC24 = 0xaa,
ZF32X16V8S8MS4VC121CS = 0xab,
ZF32X16V8S8MS4VC41CS = 0xac,
ZF32X16V8S8MS8VC81CS = 0xad,
ZF32X16V8S8MS8VC241CS = 0xae,
ZF32X16V8S8MS4VC121ZV = 0xb3,
ZF32X16V8S8MS4VC41ZV = 0xb4,
ZF32X16V8S8MS8VC81ZV = 0xb5,
ZF32X16V8S8MS8VC241ZV = 0xb6,
ZF32X16V8S8MS4VC121CZV = 0xb7,
ZF32X16V8S8MS4VC41CZV = 0xb8,
ZF32X16V8S8MS8VC81CZV = 0xb9,
ZF32X16V8S8MS8VC241CZV = 0xba,
ZF32X16V8S8MS4VC122CS = 0xbb,
ZF32X16V8S8MS4VC42CS = 0xbc,
ZF32X16V8S8MS8VC82CS = 0xbd,
ZF32X16V8S8MS8VC242CS = 0xbe,
ZF32X16V8S8MS4VC122CSZV = 0xbf,
ZF32X16V8S8MS4VC42CSZV = 0xc0,
ZF32X16V8S8MS8VC82CSZV = 0xc1,
ZF32X16V8S8MS8VC242CSZV = 0xc2,
ZF32X24S8 = 0xc3,
ZF32X24S81CS = 0xc4,
ZF32X24S8MS21CS = 0xc5,
ZF32X24S8MS41CS = 0xc6,
ZF32X24S8MS81CS = 0xc7,
ZF32X24S8MS161CS = 0xc8,
ZF32X24S82CSZV = 0xce,
ZF32X24S8MS22CSZV = 0xcf,
ZF32X24S8MS42CSZV = 0xd0,
ZF32X24S8MS82CSZV = 0xd1,
ZF32X24S8MS162CSZV = 0xd2,
ZF32X24S82CS = 0xd3,
ZF32X24S8MS22CS = 0xd4,
ZF32X24S8MS42CS = 0xd5,
ZF32X24S8MS82CS = 0xd6,
ZF32X24S8MS162CS = 0xd7,
S8 = 0x2a,
S82S = 0x2b,
Generic16Bx2 = 0xfe,
C322C = 0xd8,
C322CBR = 0xd9,
C322CBA = 0xda,
C322CRA = 0xdb,
C322BRA = 0xdc,
C32MS22C = 0xdd,
C32MS22CBR = 0xde,
C32MS24CBRA = 0xcc,
C32MS42C = 0xdf,
C32MS42CBR = 0xe0,
C32MS42CBA = 0xe1,
C32MS42CRA = 0xe2,
C32MS42BRA = 0xe3,
C32MS44CBRA = 0x2c,
C32MS8MS162C = 0xe4,
C32MS8MS162CRA = 0xe5,
C642C = 0xe6,
C642CBR = 0xe7,
C642CBA = 0xe8,
C642CRA = 0xe9,
C642BRA = 0xea,
C64MS22C = 0xeb,
C64MS22CBR = 0xec,
C64MS24CBRA = 0xcd,
C64MS42C = 0xed,
C64MS42CBR = 0xee,
C64MS42CBA = 0xef,
C64MS42CRA = 0xf0,
C64MS42BRA = 0xf1,
C64MS44CBRA = 0x2d,
C64MS8MS162C = 0xf2,
C64MS8MS162CRA = 0xf3,
C1282C = 0xf4,
C1282CR = 0xf5,
C128MS22C = 0xf6,
C128MS22CR = 0xf7,
C128MS42C = 0xf8,
C128MS42CR = 0xf9,
C128MS8MS162C = 0xfa,
C128MS8MS162CR = 0xfb,
X8C24 = 0xfc,
PitchNoSwizzle = 0xfd,
SmSkedMessage = 0xca,
SmHostMessage = 0xcb
}
static class PteKindExtensions
{
/// <summary>
/// Checks if the kind is pitch.
/// </summary>
/// <param name="kind">Kind to check</param>
/// <returns>True if pitch, false otherwise</returns>
public static bool IsPitch(this PteKind kind)
{
return kind == PteKind.Pitch || kind == PteKind.PitchNoSwizzle;
}
}
}

View File

@ -1,13 +0,0 @@
namespace Ryujinx.Graphics.Gpu.Memory
{
/// <summary>
/// Name of a GPU resource.
/// </summary>
public enum ResourceName
{
Buffer,
Texture,
TexturePool,
SamplerPool
}
}

View File

@ -2,11 +2,8 @@
using Ryujinx.Common;
using Ryujinx.Common.Configuration;
using Ryujinx.Common.Logging;
using Ryujinx.Graphics.GAL;
using Ryujinx.Graphics.Gpu.Memory;
using Ryujinx.Graphics.Gpu.Shader.Cache.Definition;
using Ryujinx.Graphics.Shader;
using Ryujinx.Graphics.Shader.Translation;
using System;
using System.Collections.Generic;
using System.IO;
@ -20,70 +17,6 @@ namespace Ryujinx.Graphics.Gpu.Shader.Cache
/// </summary>
static class CacheHelper
{
/// <summary>
/// Try to read the manifest header from a given file path.
/// </summary>
/// <param name="manifestPath">The path to the manifest file</param>
/// <param name="header">The manifest header read</param>
/// <returns>Return true if the manifest header was read</returns>
public static bool TryReadManifestHeader(string manifestPath, out CacheManifestHeader header)
{
header = default;
if (File.Exists(manifestPath))
{
Memory<byte> rawManifest = File.ReadAllBytes(manifestPath);
if (MemoryMarshal.TryRead(rawManifest.Span, out header))
{
return true;
}
}
return false;
}
/// <summary>
/// Try to read the manifest from a given file path.
/// </summary>
/// <param name="manifestPath">The path to the manifest file</param>
/// <param name="graphicsApi">The graphics api used by the cache</param>
/// <param name="hashType">The hash type of the cache</param>
/// <param name="header">The manifest header read</param>
/// <param name="entries">The entries read from the cache manifest</param>
/// <returns>Return true if the manifest was read</returns>
public static bool TryReadManifestFile(string manifestPath, CacheGraphicsApi graphicsApi, CacheHashType hashType, out CacheManifestHeader header, out HashSet<Hash128> entries)
{
header = default;
entries = new HashSet<Hash128>();
if (File.Exists(manifestPath))
{
Memory<byte> rawManifest = File.ReadAllBytes(manifestPath);
if (MemoryMarshal.TryRead(rawManifest.Span, out header))
{
Memory<byte> hashTableRaw = rawManifest.Slice(Unsafe.SizeOf<CacheManifestHeader>());
bool isValid = header.IsValid(graphicsApi, hashType, hashTableRaw.Span);
if (isValid)
{
ReadOnlySpan<Hash128> hashTable = MemoryMarshal.Cast<byte, Hash128>(hashTableRaw.Span);
foreach (Hash128 hash in hashTable)
{
entries.Add(hash);
}
}
return isValid;
}
}
return false;
}
/// <summary>
/// Compute a cache manifest from runtime data.
/// </summary>
@ -113,7 +46,7 @@ namespace Ryujinx.Graphics.Gpu.Shader.Cache
dataSpan[i++] = hash;
}
manifestHeader.UpdateChecksum(data.AsSpan().Slice(Unsafe.SizeOf<CacheManifestHeader>()));
manifestHeader.UpdateChecksum(data.AsSpan(Unsafe.SizeOf<CacheManifestHeader>()));
MemoryMarshal.Write(data, ref manifestHeader);
@ -246,82 +179,23 @@ namespace Ryujinx.Graphics.Gpu.Shader.Cache
return null;
}
/// <summary>
/// Compute the guest program code for usage while dumping to disk or hash.
/// </summary>
/// <param name="cachedShaderEntries">The guest shader entries to use</param>
/// <param name="tfd">The transform feedback descriptors</param>
/// <param name="forHashCompute">Used to determine if the guest program code is generated for hashing</param>
/// <returns>The guest program code for usage while dumping to disk or hash</returns>
private static byte[] ComputeGuestProgramCode(ReadOnlySpan<GuestShaderCacheEntry> cachedShaderEntries, TransformFeedbackDescriptor[] tfd, bool forHashCompute = false)
{
using (MemoryStream stream = new MemoryStream())
{
BinaryWriter writer = new BinaryWriter(stream);
foreach (GuestShaderCacheEntry cachedShaderEntry in cachedShaderEntries)
{
if (cachedShaderEntry != null)
{
// Code (and Code A if present)
stream.Write(cachedShaderEntry.Code);
if (forHashCompute)
{
// Guest GPU accessor header (only write this for hashes, already present in the header for dumps)
writer.WriteStruct(cachedShaderEntry.Header.GpuAccessorHeader);
}
// Texture descriptors
foreach (GuestTextureDescriptor textureDescriptor in cachedShaderEntry.TextureDescriptors.Values)
{
writer.WriteStruct(textureDescriptor);
}
}
}
// Transform feedback
if (tfd != null)
{
foreach (TransformFeedbackDescriptor transform in tfd)
{
writer.WriteStruct(new GuestShaderCacheTransformFeedbackHeader(transform.BufferIndex, transform.Stride, transform.VaryingLocations.Length));
writer.Write(transform.VaryingLocations);
}
}
return stream.ToArray();
}
}
/// <summary>
/// Compute a guest hash from shader entries.
/// </summary>
/// <param name="cachedShaderEntries">The guest shader entries to use</param>
/// <param name="tfd">The optional transform feedback descriptors</param>
/// <returns>A guest hash from shader entries</returns>
public static Hash128 ComputeGuestHashFromCache(ReadOnlySpan<GuestShaderCacheEntry> cachedShaderEntries, TransformFeedbackDescriptor[] tfd = null)
{
return XXHash128.ComputeHash(ComputeGuestProgramCode(cachedShaderEntries, tfd, true));
}
/// <summary>
/// Read transform feedback descriptors from guest.
/// </summary>
/// <param name="data">The raw guest transform feedback descriptors</param>
/// <param name="header">The guest shader program header</param>
/// <returns>The transform feedback descriptors read from guest</returns>
public static TransformFeedbackDescriptor[] ReadTransformFeedbackInformation(ref ReadOnlySpan<byte> data, GuestShaderCacheHeader header)
public static TransformFeedbackDescriptorOld[] ReadTransformFeedbackInformation(ref ReadOnlySpan<byte> data, GuestShaderCacheHeader header)
{
if (header.TransformFeedbackCount != 0)
{
TransformFeedbackDescriptor[] result = new TransformFeedbackDescriptor[header.TransformFeedbackCount];
TransformFeedbackDescriptorOld[] result = new TransformFeedbackDescriptorOld[header.TransformFeedbackCount];
for (int i = 0; i < result.Length; i++)
{
GuestShaderCacheTransformFeedbackHeader feedbackHeader = MemoryMarshal.Read<GuestShaderCacheTransformFeedbackHeader>(data);
result[i] = new TransformFeedbackDescriptor(feedbackHeader.BufferIndex, feedbackHeader.Stride, data.Slice(Unsafe.SizeOf<GuestShaderCacheTransformFeedbackHeader>(), feedbackHeader.VaryingLocationsLength).ToArray());
result[i] = new TransformFeedbackDescriptorOld(feedbackHeader.BufferIndex, feedbackHeader.Stride, data.Slice(Unsafe.SizeOf<GuestShaderCacheTransformFeedbackHeader>(), feedbackHeader.VaryingLocationsLength).ToArray());
data = data.Slice(Unsafe.SizeOf<GuestShaderCacheTransformFeedbackHeader>() + feedbackHeader.VaryingLocationsLength);
}
@ -332,205 +206,6 @@ namespace Ryujinx.Graphics.Gpu.Shader.Cache
return null;
}
/// <summary>
/// Builds gpu state flags using information from the given gpu accessor.
/// </summary>
/// <param name="gpuAccessor">The gpu accessor</param>
/// <returns>The gpu state flags</returns>
private static GuestGpuStateFlags GetGpuStateFlags(IGpuAccessor gpuAccessor)
{
GuestGpuStateFlags flags = 0;
if (gpuAccessor.QueryEarlyZForce())
{
flags |= GuestGpuStateFlags.EarlyZForce;
}
return flags;
}
/// <summary>
/// Packs the tessellation parameters from the gpu accessor.
/// </summary>
/// <param name="gpuAccessor">The gpu accessor</param>
/// <returns>The packed tessellation parameters</returns>
private static byte GetTessellationModePacked(IGpuAccessor gpuAccessor)
{
byte value;
value = (byte)((int)gpuAccessor.QueryTessPatchType() & 3);
value |= (byte)(((int)gpuAccessor.QueryTessSpacing() & 3) << 2);
if (gpuAccessor.QueryTessCw())
{
value |= 0x10;
}
return value;
}
/// <summary>
/// Create a new instance of <see cref="GuestGpuAccessorHeader"/> from an gpu accessor.
/// </summary>
/// <param name="gpuAccessor">The gpu accessor</param>
/// <returns>A new instance of <see cref="GuestGpuAccessorHeader"/></returns>
public static GuestGpuAccessorHeader CreateGuestGpuAccessorCache(IGpuAccessor gpuAccessor)
{
return new GuestGpuAccessorHeader
{
ComputeLocalSizeX = gpuAccessor.QueryComputeLocalSizeX(),
ComputeLocalSizeY = gpuAccessor.QueryComputeLocalSizeY(),
ComputeLocalSizeZ = gpuAccessor.QueryComputeLocalSizeZ(),
ComputeLocalMemorySize = gpuAccessor.QueryComputeLocalMemorySize(),
ComputeSharedMemorySize = gpuAccessor.QueryComputeSharedMemorySize(),
PrimitiveTopology = gpuAccessor.QueryPrimitiveTopology(),
TessellationModePacked = GetTessellationModePacked(gpuAccessor),
StateFlags = GetGpuStateFlags(gpuAccessor)
};
}
/// <summary>
/// Create guest shader cache entries from the runtime contexts.
/// </summary>
/// <param name="channel">The GPU channel in use</param>
/// <param name="shaderContexts">The runtime contexts</param>
/// <returns>Guest shader cahe entries from the runtime contexts</returns>
public static GuestShaderCacheEntry[] CreateShaderCacheEntries(GpuChannel channel, ReadOnlySpan<TranslatorContext> shaderContexts)
{
MemoryManager memoryManager = channel.MemoryManager;
int startIndex = shaderContexts.Length > 1 ? 1 : 0;
GuestShaderCacheEntry[] entries = new GuestShaderCacheEntry[shaderContexts.Length - startIndex];
for (int i = startIndex; i < shaderContexts.Length; i++)
{
TranslatorContext context = shaderContexts[i];
if (context == null)
{
continue;
}
GpuAccessor gpuAccessor = context.GpuAccessor as GpuAccessor;
ulong cb1DataAddress;
int cb1DataSize = gpuAccessor?.Cb1DataSize ?? 0;
if (context.Stage == ShaderStage.Compute)
{
cb1DataAddress = channel.BufferManager.GetComputeUniformBufferAddress(1);
}
else
{
int stageIndex = context.Stage switch
{
ShaderStage.TessellationControl => 1,
ShaderStage.TessellationEvaluation => 2,
ShaderStage.Geometry => 3,
ShaderStage.Fragment => 4,
_ => 0
};
cb1DataAddress = channel.BufferManager.GetGraphicsUniformBufferAddress(stageIndex, 1);
}
int size = context.Size;
TranslatorContext translatorContext2 = i == 1 ? shaderContexts[0] : null;
int sizeA = translatorContext2 != null ? translatorContext2.Size : 0;
byte[] code = new byte[size + cb1DataSize + sizeA];
memoryManager.GetSpan(context.Address, size).CopyTo(code);
if (cb1DataAddress != 0 && cb1DataSize != 0)
{
memoryManager.Physical.GetSpan(cb1DataAddress, cb1DataSize).CopyTo(code.AsSpan().Slice(size, cb1DataSize));
}
if (translatorContext2 != null)
{
memoryManager.GetSpan(translatorContext2.Address, sizeA).CopyTo(code.AsSpan().Slice(size + cb1DataSize, sizeA));
}
GuestGpuAccessorHeader gpuAccessorHeader = CreateGuestGpuAccessorCache(context.GpuAccessor);
if (gpuAccessor != null)
{
gpuAccessorHeader.TextureDescriptorCount = context.TextureHandlesForCache.Count;
}
GuestShaderCacheEntryHeader header = new GuestShaderCacheEntryHeader(
context.Stage,
size + cb1DataSize,
sizeA,
cb1DataSize,
gpuAccessorHeader);
GuestShaderCacheEntry entry = new GuestShaderCacheEntry(header, code);
if (gpuAccessor != null)
{
foreach (int textureHandle in context.TextureHandlesForCache)
{
GuestTextureDescriptor textureDescriptor = ((Image.TextureDescriptor)gpuAccessor.GetTextureDescriptor(textureHandle, -1)).ToCache();
textureDescriptor.Handle = (uint)textureHandle;
entry.TextureDescriptors.Add(textureHandle, textureDescriptor);
}
}
entries[i - startIndex] = entry;
}
return entries;
}
/// <summary>
/// Create a guest shader program.
/// </summary>
/// <param name="shaderCacheEntries">The entries composing the guest program dump</param>
/// <param name="tfd">The transform feedback descriptors in use</param>
/// <returns>The resulting guest shader program</returns>
public static byte[] CreateGuestProgramDump(GuestShaderCacheEntry[] shaderCacheEntries, TransformFeedbackDescriptor[] tfd = null)
{
using (MemoryStream resultStream = new MemoryStream())
{
BinaryWriter resultStreamWriter = new BinaryWriter(resultStream);
byte transformFeedbackCount = 0;
if (tfd != null)
{
transformFeedbackCount = (byte)tfd.Length;
}
// Header
resultStreamWriter.WriteStruct(new GuestShaderCacheHeader((byte)shaderCacheEntries.Length, transformFeedbackCount));
// Write all entries header
foreach (GuestShaderCacheEntry entry in shaderCacheEntries)
{
if (entry == null)
{
resultStreamWriter.WriteStruct(new GuestShaderCacheEntryHeader());
}
else
{
resultStreamWriter.WriteStruct(entry.Header);
}
}
// Finally, write all program code and all transform feedback information.
resultStreamWriter.Write(ComputeGuestProgramCode(shaderCacheEntries, tfd));
return resultStream.ToArray();
}
}
/// <summary>
/// Save temporary files not in archive.
/// </summary>

View File

@ -47,8 +47,6 @@ namespace Ryujinx.Graphics.Gpu.Shader.Cache
string baseCacheDirectory = CacheHelper.GetBaseCacheDirectory(titleId);
CacheMigration.Run(baseCacheDirectory, graphicsApi, hashType, shaderProvider);
_guestProgramCache = new CacheCollection(baseCacheDirectory, _hashType, CacheGraphicsApi.Guest, "", "program", GuestCacheVersion);
_hostProgramCache = new CacheCollection(baseCacheDirectory, _hashType, _graphicsApi, _shaderProvider, "host", shaderCodeGenVersion);
}

View File

@ -1,175 +0,0 @@
using ICSharpCode.SharpZipLib.Zip;
using Ryujinx.Common;
using Ryujinx.Common.Logging;
using Ryujinx.Graphics.GAL;
using Ryujinx.Graphics.Gpu.Shader.Cache.Definition;
using System;
using System.Collections.Generic;
using System.IO;
namespace Ryujinx.Graphics.Gpu.Shader.Cache
{
/// <summary>
/// Class handling shader cache migrations.
/// </summary>
static class CacheMigration
{
/// <summary>
/// Check if the given cache version need to recompute its hash.
/// </summary>
/// <param name="version">The version in use</param>
/// <param name="newVersion">The new version after migration</param>
/// <returns>True if a hash recompute is needed</returns>
public static bool NeedHashRecompute(ulong version, out ulong newVersion)
{
const ulong TargetBrokenVersion = 1717;
const ulong TargetFixedVersion = 1759;
newVersion = TargetFixedVersion;
if (version == TargetBrokenVersion)
{
return true;
}
return false;
}
private class StreamZipEntryDataSource : IStaticDataSource
{
private readonly ZipFile Archive;
private readonly ZipEntry Entry;
public StreamZipEntryDataSource(ZipFile archive, ZipEntry entry)
{
Archive = archive;
Entry = entry;
}
public Stream GetSource()
{
return Archive.GetInputStream(Entry);
}
}
/// <summary>
/// Move a file with the name of a given hash to another in the cache archive.
/// </summary>
/// <param name="archive">The archive in use</param>
/// <param name="oldKey">The old key</param>
/// <param name="newKey">The new key</param>
private static void MoveEntry(ZipFile archive, Hash128 oldKey, Hash128 newKey)
{
ZipEntry oldGuestEntry = archive.GetEntry($"{oldKey}");
if (oldGuestEntry != null)
{
archive.Add(new StreamZipEntryDataSource(archive, oldGuestEntry), $"{newKey}", CompressionMethod.Deflated);
archive.Delete(oldGuestEntry);
}
}
/// <summary>
/// Recompute all the hashes of a given cache.
/// </summary>
/// <param name="guestBaseCacheDirectory">The guest cache directory path</param>
/// <param name="hostBaseCacheDirectory">The host cache directory path</param>
/// <param name="graphicsApi">The graphics api in use</param>
/// <param name="hashType">The hash type in use</param>
/// <param name="newVersion">The version to write in the host and guest manifest after migration</param>
private static void RecomputeHashes(string guestBaseCacheDirectory, string hostBaseCacheDirectory, CacheGraphicsApi graphicsApi, CacheHashType hashType, ulong newVersion)
{
string guestManifestPath = CacheHelper.GetManifestPath(guestBaseCacheDirectory);
string hostManifestPath = CacheHelper.GetManifestPath(hostBaseCacheDirectory);
if (CacheHelper.TryReadManifestFile(guestManifestPath, CacheGraphicsApi.Guest, hashType, out _, out HashSet<Hash128> guestEntries))
{
CacheHelper.TryReadManifestFile(hostManifestPath, graphicsApi, hashType, out _, out HashSet<Hash128> hostEntries);
Logger.Info?.Print(LogClass.Gpu, "Shader cache hashes need to be recomputed, performing migration...");
string guestArchivePath = CacheHelper.GetArchivePath(guestBaseCacheDirectory);
string hostArchivePath = CacheHelper.GetArchivePath(hostBaseCacheDirectory);
ZipFile guestArchive = new ZipFile(File.Open(guestArchivePath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None));
ZipFile hostArchive = new ZipFile(File.Open(hostArchivePath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None));
CacheHelper.EnsureArchiveUpToDate(guestBaseCacheDirectory, guestArchive, guestEntries);
CacheHelper.EnsureArchiveUpToDate(hostBaseCacheDirectory, hostArchive, hostEntries);
int programIndex = 0;
HashSet<Hash128> newEntries = new HashSet<Hash128>();
foreach (Hash128 oldHash in guestEntries)
{
byte[] guestProgram = CacheHelper.ReadFromArchive(guestArchive, oldHash);
Logger.Info?.Print(LogClass.Gpu, $"Migrating shader {oldHash} ({programIndex + 1} / {guestEntries.Count})");
if (guestProgram != null)
{
ReadOnlySpan<byte> guestProgramReadOnlySpan = guestProgram;
ReadOnlySpan<GuestShaderCacheEntry> cachedShaderEntries = GuestShaderCacheEntry.Parse(ref guestProgramReadOnlySpan, out GuestShaderCacheHeader fileHeader);
TransformFeedbackDescriptor[] tfd = CacheHelper.ReadTransformFeedbackInformation(ref guestProgramReadOnlySpan, fileHeader);
Hash128 newHash = CacheHelper.ComputeGuestHashFromCache(cachedShaderEntries, tfd);
if (newHash != oldHash)
{
MoveEntry(guestArchive, oldHash, newHash);
MoveEntry(hostArchive, oldHash, newHash);
}
else
{
Logger.Warning?.Print(LogClass.Gpu, $"Same hashes for shader {oldHash}");
}
newEntries.Add(newHash);
}
programIndex++;
}
byte[] newGuestManifestContent = CacheHelper.ComputeManifest(newVersion, CacheGraphicsApi.Guest, hashType, newEntries);
byte[] newHostManifestContent = CacheHelper.ComputeManifest(newVersion, graphicsApi, hashType, newEntries);
File.WriteAllBytes(guestManifestPath, newGuestManifestContent);
File.WriteAllBytes(hostManifestPath, newHostManifestContent);
guestArchive.CommitUpdate();
hostArchive.CommitUpdate();
guestArchive.Close();
hostArchive.Close();
}
}
/// <summary>
/// Check and run cache migration if needed.
/// </summary>
/// <param name="baseCacheDirectory">The base path of the cache</param>
/// <param name="graphicsApi">The graphics api in use</param>
/// <param name="hashType">The hash type in use</param>
/// <param name="shaderProvider">The shader provider name of the cache</param>
public static void Run(string baseCacheDirectory, CacheGraphicsApi graphicsApi, CacheHashType hashType, string shaderProvider)
{
string guestBaseCacheDirectory = CacheHelper.GenerateCachePath(baseCacheDirectory, CacheGraphicsApi.Guest, "", "program");
string hostBaseCacheDirectory = CacheHelper.GenerateCachePath(baseCacheDirectory, graphicsApi, shaderProvider, "host");
string guestArchivePath = CacheHelper.GetArchivePath(guestBaseCacheDirectory);
string hostArchivePath = CacheHelper.GetArchivePath(hostBaseCacheDirectory);
bool isReadOnly = CacheHelper.IsArchiveReadOnly(guestArchivePath) || CacheHelper.IsArchiveReadOnly(hostArchivePath);
if (!isReadOnly && CacheHelper.TryReadManifestHeader(CacheHelper.GetManifestPath(guestBaseCacheDirectory), out CacheManifestHeader header))
{
if (NeedHashRecompute(header.Version, out ulong newVersion))
{
RecomputeHashes(guestBaseCacheDirectory, hostBaseCacheDirectory, graphicsApi, hashType, newVersion);
}
}
}
}
}

View File

@ -96,6 +96,7 @@ namespace Ryujinx.Graphics.Gpu.Shader.Cache.Definition
SBuffers,
Textures,
Images,
default,
Header.UseFlags.HasFlag(UseFlags.InstanceId),
Header.UseFlags.HasFlag(UseFlags.RtLayer),
Header.ClipDistancesWritten,
@ -160,7 +161,7 @@ namespace Ryujinx.Graphics.Gpu.Shader.Cache.Definition
/// <param name="programCode">The host shader program</param>
/// <param name="codeHolders">The shaders code holder</param>
/// <returns>Raw data of a new host shader cache file</returns>
internal static byte[] Create(ReadOnlySpan<byte> programCode, ShaderCodeHolder[] codeHolders)
internal static byte[] Create(ReadOnlySpan<byte> programCode, CachedShaderStage[] codeHolders)
{
HostShaderCacheHeader header = new HostShaderCacheHeader((byte)codeHolders.Length, programCode.Length);

View File

@ -0,0 +1,255 @@
using Ryujinx.Common;
using Ryujinx.Common.Logging;
using Ryujinx.Common.Memory;
using Ryujinx.Graphics.GAL;
using Ryujinx.Graphics.Gpu.Engine.Threed;
using Ryujinx.Graphics.Gpu.Shader.Cache.Definition;
using Ryujinx.Graphics.Gpu.Shader.DiskCache;
using Ryujinx.Graphics.Shader;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;
namespace Ryujinx.Graphics.Gpu.Shader.Cache
{
/// <summary>
/// Class handling shader cache migrations.
/// </summary>
static class Migration
{
// Last codegen version before the migration to the new cache.
private const ulong ShaderCodeGenVersion = 3054;
/// <summary>
/// Migrates from the old cache format to the new one.
/// </summary>
/// <param name="context">GPU context</param>
/// <param name="hostStorage">Disk cache host storage (used to create the new shader files)</param>
/// <returns>Number of migrated shaders</returns>
public static int MigrateFromLegacyCache(GpuContext context, DiskCacheHostStorage hostStorage)
{
string baseCacheDirectory = CacheHelper.GetBaseCacheDirectory(GraphicsConfig.TitleId);
string cacheDirectory = CacheHelper.GenerateCachePath(baseCacheDirectory, CacheGraphicsApi.Guest, "", "program");
// If the directory does not exist, we have no old cache.
// Exist early as the CacheManager constructor will create the directories.
if (!Directory.Exists(cacheDirectory))
{
return 0;
}
if (GraphicsConfig.EnableShaderCache && GraphicsConfig.TitleId != null)
{
CacheManager cacheManager = new CacheManager(CacheGraphicsApi.OpenGL, CacheHashType.XxHash128, "glsl", GraphicsConfig.TitleId, ShaderCodeGenVersion);
bool isReadOnly = cacheManager.IsReadOnly;
HashSet<Hash128> invalidEntries = null;
if (isReadOnly)
{
Logger.Warning?.Print(LogClass.Gpu, "Loading shader cache in read-only mode (cache in use by another program!)");
}
else
{
invalidEntries = new HashSet<Hash128>();
}
ReadOnlySpan<Hash128> guestProgramList = cacheManager.GetGuestProgramList();
for (int programIndex = 0; programIndex < guestProgramList.Length; programIndex++)
{
Hash128 key = guestProgramList[programIndex];
byte[] guestProgram = cacheManager.GetGuestProgramByHash(ref key);
if (guestProgram == null)
{
Logger.Error?.Print(LogClass.Gpu, $"Ignoring orphan shader hash {key} in cache (is the cache incomplete?)");
continue;
}
ReadOnlySpan<byte> guestProgramReadOnlySpan = guestProgram;
ReadOnlySpan<GuestShaderCacheEntry> cachedShaderEntries = GuestShaderCacheEntry.Parse(ref guestProgramReadOnlySpan, out GuestShaderCacheHeader fileHeader);
if (cachedShaderEntries[0].Header.Stage == ShaderStage.Compute)
{
Debug.Assert(cachedShaderEntries.Length == 1);
GuestShaderCacheEntry entry = cachedShaderEntries[0];
byte[] code = entry.Code.AsSpan(0, entry.Header.Size - entry.Header.Cb1DataSize).ToArray();
Span<byte> codeSpan = entry.Code;
byte[] cb1Data = codeSpan.Slice(codeSpan.Length - entry.Header.Cb1DataSize).ToArray();
ShaderProgramInfo info = new ShaderProgramInfo(
Array.Empty<BufferDescriptor>(),
Array.Empty<BufferDescriptor>(),
Array.Empty<TextureDescriptor>(),
Array.Empty<TextureDescriptor>(),
ShaderStage.Compute,
false,
false,
0,
0);
GpuChannelComputeState computeState = new GpuChannelComputeState(
entry.Header.GpuAccessorHeader.ComputeLocalSizeX,
entry.Header.GpuAccessorHeader.ComputeLocalSizeY,
entry.Header.GpuAccessorHeader.ComputeLocalSizeZ,
entry.Header.GpuAccessorHeader.ComputeLocalMemorySize,
entry.Header.GpuAccessorHeader.ComputeSharedMemorySize);
ShaderSpecializationState specState = new ShaderSpecializationState(computeState);
foreach (var td in entry.TextureDescriptors)
{
var handle = td.Key;
var data = td.Value;
specState.RegisterTexture(
0,
handle,
-1,
data.UnpackFormat(),
data.UnpackSrgb(),
data.UnpackTextureTarget(),
data.UnpackTextureCoordNormalized());
}
CachedShaderStage shader = new CachedShaderStage(info, code, cb1Data);
CachedShaderProgram program = new CachedShaderProgram(null, specState, shader);
hostStorage.AddShader(context, program, ReadOnlySpan<byte>.Empty);
}
else
{
Debug.Assert(cachedShaderEntries.Length == Constants.ShaderStages);
CachedShaderStage[] shaders = new CachedShaderStage[Constants.ShaderStages + 1];
List<ShaderProgram> shaderPrograms = new List<ShaderProgram>();
TransformFeedbackDescriptorOld[] tfd = CacheHelper.ReadTransformFeedbackInformation(ref guestProgramReadOnlySpan, fileHeader);
GuestShaderCacheEntry[] entries = cachedShaderEntries.ToArray();
GuestGpuAccessorHeader accessorHeader = entries[0].Header.GpuAccessorHeader;
TessMode tessMode = new TessMode();
int tessPatchType = accessorHeader.TessellationModePacked & 3;
int tessSpacing = (accessorHeader.TessellationModePacked >> 2) & 3;
bool tessCw = (accessorHeader.TessellationModePacked & 0x10) != 0;
tessMode.Packed = (uint)tessPatchType;
tessMode.Packed |= (uint)(tessSpacing << 4);
if (tessCw)
{
tessMode.Packed |= 0x100;
}
PrimitiveTopology topology = accessorHeader.PrimitiveTopology switch
{
InputTopology.Lines => PrimitiveTopology.Lines,
InputTopology.LinesAdjacency => PrimitiveTopology.LinesAdjacency,
InputTopology.Triangles => PrimitiveTopology.Triangles,
InputTopology.TrianglesAdjacency => PrimitiveTopology.TrianglesAdjacency,
_ => PrimitiveTopology.Points
};
GpuChannelGraphicsState graphicsState = new GpuChannelGraphicsState(
accessorHeader.StateFlags.HasFlag(GuestGpuStateFlags.EarlyZForce),
topology,
tessMode);
TransformFeedbackDescriptor[] tfdNew = null;
if (tfd != null)
{
tfdNew = new TransformFeedbackDescriptor[tfd.Length];
for (int tfIndex = 0; tfIndex < tfd.Length; tfIndex++)
{
Array32<uint> varyingLocations = new Array32<uint>();
Span<byte> varyingLocationsSpan = MemoryMarshal.Cast<uint, byte>(varyingLocations.ToSpan());
tfd[tfIndex].VaryingLocations.CopyTo(varyingLocationsSpan.Slice(0, tfd[tfIndex].VaryingLocations.Length));
tfdNew[tfIndex] = new TransformFeedbackDescriptor(
tfd[tfIndex].BufferIndex,
tfd[tfIndex].Stride,
tfd[tfIndex].VaryingLocations.Length,
ref varyingLocations);
}
}
ShaderSpecializationState specState = new ShaderSpecializationState(graphicsState, tfdNew);
for (int i = 0; i < entries.Length; i++)
{
GuestShaderCacheEntry entry = entries[i];
if (entry == null)
{
continue;
}
ShaderProgramInfo info = new ShaderProgramInfo(
Array.Empty<BufferDescriptor>(),
Array.Empty<BufferDescriptor>(),
Array.Empty<TextureDescriptor>(),
Array.Empty<TextureDescriptor>(),
(ShaderStage)(i + 1),
false,
false,
0,
0);
// NOTE: Vertex B comes first in the shader cache.
byte[] code = entry.Code.AsSpan(0, entry.Header.Size - entry.Header.Cb1DataSize).ToArray();
byte[] code2 = entry.Header.SizeA != 0 ? entry.Code.AsSpan(entry.Header.Size, entry.Header.SizeA).ToArray() : null;
Span<byte> codeSpan = entry.Code;
byte[] cb1Data = codeSpan.Slice(codeSpan.Length - entry.Header.Cb1DataSize).ToArray();
shaders[i + 1] = new CachedShaderStage(info, code, cb1Data);
if (code2 != null)
{
shaders[0] = new CachedShaderStage(null, code2, cb1Data);
}
foreach (var td in entry.TextureDescriptors)
{
var handle = td.Key;
var data = td.Value;
specState.RegisterTexture(
i,
handle,
-1,
data.UnpackFormat(),
data.UnpackSrgb(),
data.UnpackTextureTarget(),
data.UnpackTextureCoordNormalized());
}
}
CachedShaderProgram program = new CachedShaderProgram(null, specState, shaders);
hostStorage.AddShader(context, program, ReadOnlySpan<byte>.Empty);
}
}
return guestProgramList.Length;
}
return 0;
}
}
}

View File

@ -0,0 +1,19 @@
using System;
namespace Ryujinx.Graphics.Gpu.Shader.Cache
{
struct TransformFeedbackDescriptorOld
{
public int BufferIndex { get; }
public int Stride { get; }
public byte[] VaryingLocations { get; }
public TransformFeedbackDescriptorOld(int bufferIndex, int stride, byte[] varyingLocations)
{
BufferIndex = bufferIndex;
Stride = stride;
VaryingLocations = varyingLocations ?? throw new ArgumentNullException(nameof(varyingLocations));
}
}
}

View File

@ -1,222 +0,0 @@
using Ryujinx.Common.Logging;
using Ryujinx.Graphics.Gpu.Shader.Cache.Definition;
using Ryujinx.Graphics.Shader;
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
namespace Ryujinx.Graphics.Gpu.Shader
{
class CachedGpuAccessor : TextureDescriptorCapableGpuAccessor, IGpuAccessor
{
private readonly ReadOnlyMemory<byte> _data;
private readonly ReadOnlyMemory<byte> _cb1Data;
private readonly GuestGpuAccessorHeader _header;
private readonly Dictionary<int, GuestTextureDescriptor> _textureDescriptors;
private readonly TransformFeedbackDescriptor[] _tfd;
/// <summary>
/// Creates a new instance of the cached GPU state accessor for shader translation.
/// </summary>
/// <param name="context">GPU context</param>
/// <param name="data">The data of the shader</param>
/// <param name="cb1Data">The constant buffer 1 data of the shader</param>
/// <param name="header">The cache of the GPU accessor</param>
/// <param name="guestTextureDescriptors">The cache of the texture descriptors</param>
public CachedGpuAccessor(
GpuContext context,
ReadOnlyMemory<byte> data,
ReadOnlyMemory<byte> cb1Data,
GuestGpuAccessorHeader header,
IReadOnlyDictionary<int, GuestTextureDescriptor> guestTextureDescriptors,
TransformFeedbackDescriptor[] tfd) : base(context)
{
_data = data;
_cb1Data = cb1Data;
_header = header;
_textureDescriptors = new Dictionary<int, GuestTextureDescriptor>();
foreach (KeyValuePair<int, GuestTextureDescriptor> guestTextureDescriptor in guestTextureDescriptors)
{
_textureDescriptors.Add(guestTextureDescriptor.Key, guestTextureDescriptor.Value);
}
_tfd = tfd;
}
/// <summary>
/// Reads data from the constant buffer 1.
/// </summary>
/// <param name="offset">Offset in bytes to read from</param>
/// <returns>Value at the given offset</returns>
public uint ConstantBuffer1Read(int offset)
{
return MemoryMarshal.Cast<byte, uint>(_cb1Data.Span.Slice(offset))[0];
}
/// <summary>
/// Prints a log message.
/// </summary>
/// <param name="message">Message to print</param>
public void Log(string message)
{
Logger.Warning?.Print(LogClass.Gpu, $"Shader translator: {message}");
}
/// <summary>
/// Gets a span of the specified memory location, containing shader code.
/// </summary>
/// <param name="address">GPU virtual address of the data</param>
/// <param name="minimumSize">Minimum size that the returned span may have</param>
/// <returns>Span of the memory location</returns>
public override ReadOnlySpan<ulong> GetCode(ulong address, int minimumSize)
{
return MemoryMarshal.Cast<byte, ulong>(_data.Span.Slice((int)address));
}
/// <summary>
/// Checks if a given memory address is mapped.
/// </summary>
/// <param name="address">GPU virtual address to be checked</param>
/// <returns>True if the address is mapped, false otherwise</returns>
public bool MemoryMapped(ulong address)
{
return address < (ulong)_data.Length;
}
/// <summary>
/// Queries Local Size X for compute shaders.
/// </summary>
/// <returns>Local Size X</returns>
public int QueryComputeLocalSizeX()
{
return _header.ComputeLocalSizeX;
}
/// <summary>
/// Queries Local Size Y for compute shaders.
/// </summary>
/// <returns>Local Size Y</returns>
public int QueryComputeLocalSizeY()
{
return _header.ComputeLocalSizeY;
}
/// <summary>
/// Queries Local Size Z for compute shaders.
/// </summary>
/// <returns>Local Size Z</returns>
public int QueryComputeLocalSizeZ()
{
return _header.ComputeLocalSizeZ;
}
/// <summary>
/// Queries Local Memory size in bytes for compute shaders.
/// </summary>
/// <returns>Local Memory size in bytes</returns>
public int QueryComputeLocalMemorySize()
{
return _header.ComputeLocalMemorySize;
}
/// <summary>
/// Queries Shared Memory size in bytes for compute shaders.
/// </summary>
/// <returns>Shared Memory size in bytes</returns>
public int QueryComputeSharedMemorySize()
{
return _header.ComputeSharedMemorySize;
}
/// <summary>
/// Queries current primitive topology for geometry shaders.
/// </summary>
/// <returns>Current primitive topology</returns>
public InputTopology QueryPrimitiveTopology()
{
return _header.PrimitiveTopology;
}
/// <summary>
/// Queries the tessellation evaluation shader primitive winding order.
/// </summary>
/// <returns>True if the primitive winding order is clockwise, false if counter-clockwise</returns>
public bool QueryTessCw()
{
return (_header.TessellationModePacked & 0x10) != 0;
}
/// <summary>
/// Queries the tessellation evaluation shader abstract patch type.
/// </summary>
/// <returns>Abstract patch type</returns>
public TessPatchType QueryTessPatchType()
{
return (TessPatchType)(_header.TessellationModePacked & 3);
}
/// <summary>
/// Queries the tessellation evaluation shader spacing between tessellated vertices of the patch.
/// </summary>
/// <returns>Spacing between tessellated vertices of the patch</returns>
public TessSpacing QueryTessSpacing()
{
return (TessSpacing)((_header.TessellationModePacked >> 2) & 3);
}
/// <summary>
/// Gets the texture descriptor for a given texture on the pool.
/// </summary>
/// <param name="handle">Index of the texture (this is the word offset of the handle in the constant buffer)</param>
/// <param name="cbufSlot">Constant buffer slot for the texture handle</param>
/// <returns>Texture descriptor</returns>
public override Image.ITextureDescriptor GetTextureDescriptor(int handle, int cbufSlot)
{
if (!_textureDescriptors.TryGetValue(handle, out GuestTextureDescriptor textureDescriptor))
{
throw new ArgumentException();
}
return textureDescriptor;
}
/// <summary>
/// Queries transform feedback enable state.
/// </summary>
/// <returns>True if the shader uses transform feedback, false otherwise</returns>
public bool QueryTransformFeedbackEnabled()
{
return _tfd != null;
}
/// <summary>
/// Queries the varying locations that should be written to the transform feedback buffer.
/// </summary>
/// <param name="bufferIndex">Index of the transform feedback buffer</param>
/// <returns>Varying locations for the specified buffer</returns>
public ReadOnlySpan<byte> QueryTransformFeedbackVaryingLocations(int bufferIndex)
{
return _tfd[bufferIndex].VaryingLocations;
}
/// <summary>
/// Queries the stride (in bytes) of the per vertex data written into the transform feedback buffer.
/// </summary>
/// <param name="bufferIndex">Index of the transform feedback buffer</param>
/// <returns>Stride for the specified buffer</returns>
public int QueryTransformFeedbackStride(int bufferIndex)
{
return _tfd[bufferIndex].Stride;
}
/// <summary>
/// Queries if host state forces early depth testing.
/// </summary>
/// <returns>True if early depth testing is forced</returns>
public bool QueryEarlyZForce()
{
return (_header.StateFlags & GuestGpuStateFlags.EarlyZForce) != 0;
}
}
}

View File

@ -7,26 +7,33 @@ namespace Ryujinx.Graphics.Gpu.Shader
/// Represents a program composed of one or more shader stages (for graphics shaders),
/// or a single shader (for compute shaders).
/// </summary>
class ShaderBundle : IDisposable
class CachedShaderProgram : IDisposable
{
/// <summary>
/// Host shader program object.
/// </summary>
public IProgram HostProgram { get; }
/// <summary>
/// GPU state used to create this version of the shader.
/// </summary>
public ShaderSpecializationState SpecializationState { get; }
/// <summary>
/// Compiled shader for each shader stage.
/// </summary>
public ShaderCodeHolder[] Shaders { get; }
public CachedShaderStage[] Shaders { get; }
/// <summary>
/// Creates a new instance of the shader bundle.
/// </summary>
/// <param name="hostProgram">Host program with all the shader stages</param>
/// <param name="specializationState">GPU state used to create this version of the shader</param>
/// <param name="shaders">Shaders</param>
public ShaderBundle(IProgram hostProgram, params ShaderCodeHolder[] shaders)
public CachedShaderProgram(IProgram hostProgram, ShaderSpecializationState specializationState, params CachedShaderStage[] shaders)
{
HostProgram = hostProgram;
SpecializationState = specializationState;
Shaders = shaders;
}
@ -36,11 +43,6 @@ namespace Ryujinx.Graphics.Gpu.Shader
public void Dispose()
{
HostProgram.Dispose();
foreach (ShaderCodeHolder holder in Shaders)
{
holder?.HostShader?.Dispose();
}
}
}
}

View File

@ -0,0 +1,38 @@
using Ryujinx.Graphics.Shader;
namespace Ryujinx.Graphics.Gpu.Shader
{
/// <summary>
/// Cached shader code for a single shader stage.
/// </summary>
class CachedShaderStage
{
/// <summary>
/// Shader program information.
/// </summary>
public ShaderProgramInfo Info { get; }
/// <summary>
/// Maxwell binary shader code.
/// </summary>
public byte[] Code { get; }
/// <summary>
/// Constant buffer 1 data accessed by the shader.
/// </summary>
public byte[] Cb1Data { get; }
/// <summary>
/// Creates a new instance of the shader code holder.
/// </summary>
/// <param name="info">Shader program information</param>
/// <param name="code">Maxwell binary shader code</param>
/// <param name="cb1Data">Constant buffer 1 data accessed by the shader</param>
public CachedShaderStage(ShaderProgramInfo info, byte[] code, byte[] cb1Data)
{
Info = info;
Code = code;
Cb1Data = cb1Data;
}
}
}

View File

@ -0,0 +1,68 @@
using Ryujinx.Graphics.Gpu.Shader.HashTable;
using System.Collections.Generic;
namespace Ryujinx.Graphics.Gpu.Shader
{
/// <summary>
/// Compute shader cache hash table.
/// </summary>
class ComputeShaderCacheHashTable
{
private readonly PartitionedHashTable<ShaderSpecializationList> _cache;
private readonly List<CachedShaderProgram> _shaderPrograms;
/// <summary>
/// Creates a new compute shader cache hash table.
/// </summary>
public ComputeShaderCacheHashTable()
{
_cache = new PartitionedHashTable<ShaderSpecializationList>();
_shaderPrograms = new List<CachedShaderProgram>();
}
/// <summary>
/// Adds a program to the cache.
/// </summary>
/// <param name="program">Program to be added</param>
public void Add(CachedShaderProgram program)
{
var specList = _cache.GetOrAdd(program.Shaders[0].Code, new ShaderSpecializationList());
specList.Add(program);
_shaderPrograms.Add(program);
}
/// <summary>
/// Tries to find a cached program.
/// </summary>
/// <param name="channel">GPU channel</param>
/// <param name="poolState">Texture pool state</param>
/// <param name="gpuVa">GPU virtual address of the compute shader</param>
/// <param name="program">Cached host program for the given state, if found</param>
/// <param name="cachedGuestCode">Cached guest code, if any found</param>
/// <returns>True if a cached host program was found, false otherwise</returns>
public bool TryFind(
GpuChannel channel,
GpuChannelPoolState poolState,
ulong gpuVa,
out CachedShaderProgram program,
out byte[] cachedGuestCode)
{
program = null;
ShaderCodeAccessor codeAccessor = new ShaderCodeAccessor(channel.MemoryManager, gpuVa);
bool hasSpecList = _cache.TryFindItem(codeAccessor, out var specList, out cachedGuestCode);
return hasSpecList && specList.TryFindForCompute(channel, poolState, out program);
}
/// <summary>
/// Gets all programs that have been added to the table.
/// </summary>
/// <returns>Programs added to the table</returns>
public IEnumerable<CachedShaderProgram> GetPrograms()
{
foreach (var program in _shaderPrograms)
{
yield return program;
}
}
}
}

View File

@ -0,0 +1,138 @@
using Ryujinx.Common;
using Ryujinx.Common.Logging;
using System;
using System.IO;
namespace Ryujinx.Graphics.Gpu.Shader.DiskCache
{
/// <summary>
/// Represents a background disk cache writer.
/// </summary>
class BackgroundDiskCacheWriter : IDisposable
{
/// <summary>
/// Possible operation to do on the <see cref="_fileWriterWorkerQueue"/>.
/// </summary>
private enum CacheFileOperation
{
/// <summary>
/// Operation to add a shader to the cache.
/// </summary>
AddShader
}
/// <summary>
/// Represents an operation to perform on the <see cref="_fileWriterWorkerQueue"/>.
/// </summary>
private struct CacheFileOperationTask
{
/// <summary>
/// The type of operation to perform.
/// </summary>
public readonly CacheFileOperation Type;
/// <summary>
/// The data associated to this operation or null.
/// </summary>
public readonly object Data;
public CacheFileOperationTask(CacheFileOperation type, object data)
{
Type = type;
Data = data;
}
}
/// <summary>
/// Background shader cache write information.
/// </summary>
private struct AddShaderData
{
/// <summary>
/// Cached shader program.
/// </summary>
public readonly CachedShaderProgram Program;
/// <summary>
/// Binary host code.
/// </summary>
public readonly byte[] HostCode;
/// <summary>
/// Creates a new background shader cache write information.
/// </summary>
/// <param name="program">Cached shader program</param>
/// <param name="hostCode">Binary host code</param>
public AddShaderData(CachedShaderProgram program, byte[] hostCode)
{
Program = program;
HostCode = hostCode;
}
}
private readonly GpuContext _context;
private readonly DiskCacheHostStorage _hostStorage;
private readonly AsyncWorkQueue<CacheFileOperationTask> _fileWriterWorkerQueue;
/// <summary>
/// Creates a new background disk cache writer.
/// </summary>
/// <param name="context">GPU context</param>
/// <param name="hostStorage">Disk cache host storage</param>
public BackgroundDiskCacheWriter(GpuContext context, DiskCacheHostStorage hostStorage)
{
_context = context;
_hostStorage = hostStorage;
_fileWriterWorkerQueue = new AsyncWorkQueue<CacheFileOperationTask>(ProcessTask, "Gpu.BackgroundDiskCacheWriter");
}
/// <summary>
/// Processes a shader cache background operation.
/// </summary>
/// <param name="task">Task to process</param>
private void ProcessTask(CacheFileOperationTask task)
{
switch (task.Type)
{
case CacheFileOperation.AddShader:
AddShaderData data = (AddShaderData)task.Data;
try
{
_hostStorage.AddShader(_context, data.Program, data.HostCode);
}
catch (DiskCacheLoadException diskCacheLoadException)
{
Logger.Error?.Print(LogClass.Gpu, $"Error writing shader to disk cache. {diskCacheLoadException.Message}");
}
catch (IOException ioException)
{
Logger.Error?.Print(LogClass.Gpu, $"Error writing shader to disk cache. {ioException.Message}");
}
break;
}
}
/// <summary>
/// Adds a shader program to be cached in the background.
/// </summary>
/// <param name="program">Shader program to cache</param>
/// <param name="hostCode">Host binary code of the program</param>
public void AddShader(CachedShaderProgram program, byte[] hostCode)
{
_fileWriterWorkerQueue.Add(new CacheFileOperationTask(CacheFileOperation.AddShader, new AddShaderData(program, hostCode)));
}
public void Dispose()
{
Dispose(true);
}
protected virtual void Dispose(bool disposing)
{
if (disposing)
{
_fileWriterWorkerQueue.Dispose();
}
}
}
}

View File

@ -0,0 +1,216 @@
using System;
using System.IO;
using System.IO.Compression;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
namespace Ryujinx.Graphics.Gpu.Shader.DiskCache
{
/// <summary>
/// Binary data serializer.
/// </summary>
struct BinarySerializer
{
private readonly Stream _stream;
private Stream _activeStream;
/// <summary>
/// Creates a new binary serializer.
/// </summary>
/// <param name="stream">Stream to read from or write into</param>
public BinarySerializer(Stream stream)
{
_stream = stream;
_activeStream = stream;
}
/// <summary>
/// Reads data from the stream.
/// </summary>
/// <typeparam name="T">Type of the data</typeparam>
/// <param name="data">Data read</param>
public void Read<T>(ref T data) where T : unmanaged
{
Span<byte> buffer = MemoryMarshal.Cast<T, byte>(MemoryMarshal.CreateSpan(ref data, 1));
for (int offset = 0; offset < buffer.Length;)
{
offset += _activeStream.Read(buffer.Slice(offset));
}
}
/// <summary>
/// Tries to read data from the stream.
/// </summary>
/// <typeparam name="T">Type of the data</typeparam>
/// <param name="data">Data read</param>
/// <returns>True if the read was successful, false otherwise</returns>
public bool TryRead<T>(ref T data) where T : unmanaged
{
// Length is unknown on compressed streams.
if (_activeStream == _stream)
{
int size = Unsafe.SizeOf<T>();
if (_activeStream.Length - _activeStream.Position < size)
{
return false;
}
}
Read(ref data);
return true;
}
/// <summary>
/// Reads data prefixed with a magic and size from the stream.
/// </summary>
/// <typeparam name="T">Type of the data</typeparam>
/// <param name="data">Data read</param>
/// <param name="magic">Expected magic value, for validation</param>
public void ReadWithMagicAndSize<T>(ref T data, uint magic) where T : unmanaged
{
uint actualMagic = 0;
int size = 0;
Read(ref actualMagic);
Read(ref size);
if (actualMagic != magic)
{
throw new DiskCacheLoadException(DiskCacheLoadResult.FileCorruptedInvalidMagic);
}
// Structs are expected to expand but not shrink between versions.
if (size > Unsafe.SizeOf<T>())
{
throw new DiskCacheLoadException(DiskCacheLoadResult.FileCorruptedInvalidLength);
}
Span<byte> buffer = MemoryMarshal.Cast<T, byte>(MemoryMarshal.CreateSpan(ref data, 1)).Slice(0, size);
for (int offset = 0; offset < buffer.Length;)
{
offset += _activeStream.Read(buffer.Slice(offset));
}
}
/// <summary>
/// Writes data into the stream.
/// </summary>
/// <typeparam name="T">Type of the data</typeparam>
/// <param name="data">Data to be written</param>
public void Write<T>(ref T data) where T : unmanaged
{
Span<byte> buffer = MemoryMarshal.Cast<T, byte>(MemoryMarshal.CreateSpan(ref data, 1));
_activeStream.Write(buffer);
}
/// <summary>
/// Writes data prefixed with a magic and size into the stream.
/// </summary>
/// <typeparam name="T">Type of the data</typeparam>
/// <param name="data">Data to write</param>
/// <param name="magic">Magic value to write</param>
public void WriteWithMagicAndSize<T>(ref T data, uint magic) where T : unmanaged
{
int size = Unsafe.SizeOf<T>();
Write(ref magic);
Write(ref size);
Span<byte> buffer = MemoryMarshal.Cast<T, byte>(MemoryMarshal.CreateSpan(ref data, 1));
_activeStream.Write(buffer);
}
/// <summary>
/// Indicates that all data that will be read from the stream has been compressed.
/// </summary>
public void BeginCompression()
{
CompressionAlgorithm algorithm = CompressionAlgorithm.None;
Read(ref algorithm);
if (algorithm == CompressionAlgorithm.Deflate)
{
_activeStream = new DeflateStream(_stream, CompressionMode.Decompress, true);
}
}
/// <summary>
/// Indicates that all data that will be written into the stream should be compressed.
/// </summary>
/// <param name="algorithm">Compression algorithm that should be used</param>
public void BeginCompression(CompressionAlgorithm algorithm)
{
Write(ref algorithm);
if (algorithm == CompressionAlgorithm.Deflate)
{
_activeStream = new DeflateStream(_stream, CompressionLevel.SmallestSize, true);
}
}
/// <summary>
/// Indicates the end of a compressed chunck.
/// </summary>
/// <remarks>
/// Any data written after this will not be compressed unless <see cref="BeginCompression(CompressionAlgorithm)"/> is called again.
/// Any data read after this will be assumed to be uncompressed unless <see cref="BeginCompression"/> is called again.
/// </remarks>
public void EndCompression()
{
if (_activeStream != _stream)
{
_activeStream.Dispose();
_activeStream = _stream;
}
}
/// <summary>
/// Reads compressed data from the stream.
/// </summary>
/// <remarks>
/// <paramref name="data"/> must have the exact length of the uncompressed data,
/// otherwise decompression will fail.
/// </remarks>
/// <param name="stream">Stream to read from</param>
/// <param name="data">Buffer to write the uncompressed data into</param>
public static void ReadCompressed(Stream stream, Span<byte> data)
{
CompressionAlgorithm algorithm = (CompressionAlgorithm)stream.ReadByte();
switch (algorithm)
{
case CompressionAlgorithm.None:
stream.Read(data);
break;
case CompressionAlgorithm.Deflate:
stream = new DeflateStream(stream, CompressionMode.Decompress, true);
for (int offset = 0; offset < data.Length;)
{
offset += stream.Read(data.Slice(offset));
}
stream.Dispose();
break;
}
}
/// <summary>
/// Compresses and writes the compressed data into the stream.
/// </summary>
/// <param name="stream">Stream to write into</param>
/// <param name="data">Data to compress</param>
/// <param name="algorithm">Compression algorithm to be used</param>
public static void WriteCompressed(Stream stream, ReadOnlySpan<byte> data, CompressionAlgorithm algorithm)
{
stream.WriteByte((byte)algorithm);
switch (algorithm)
{
case CompressionAlgorithm.None:
stream.Write(data);
break;
case CompressionAlgorithm.Deflate:
stream = new DeflateStream(stream, CompressionLevel.SmallestSize, true);
stream.Write(data);
stream.Dispose();
break;
}
}
}
}

Some files were not shown because too many files have changed in this diff Show More