* dotnet format style --severity info
Some changes were manually reverted.
* dotnet format analyzers --severity info
Some changes have been minimally adapted.
* Restore a few unused methods and variables
* Silence dotnet format IDE0060 warnings
* Silence dotnet format IDE0052 warnings
* Address or silence dotnet format IDE1006 warnings
* Address or silence dotnet format CA2208 warnings
* Address dotnet format CA1822 warnings
* Address or silence dotnet format CA1069 warnings
* Silence CA1806 and CA1834 issues
* Address dotnet format CA1401 warnings
* Fix new dotnet-format issues after rebase
* Address review comments
* Address dotnet format CA2208 warnings properly
* Fix formatting for switch expressions
* Address most dotnet format whitespace warnings
* Apply dotnet format whitespace formatting
A few of them have been manually reverted and the corresponding warning was silenced
* Add previously silenced warnings back
I have no clue how these disappeared
* Revert formatting changes for OpCodeTable.cs
* Enable formatting for a few cases again
* Format if-blocks correctly
* Enable formatting for a few more cases again
* Fix inline comment alignment
* Run dotnet format after rebase and remove unused usings
- analyzers
- style
- whitespace
* Disable 'prefer switch expression' rule
* Add comments to disabled warnings
* Remove a few unused parameters
* Adjust namespaces
* Simplify properties and array initialization, Use const when possible, Remove trailing commas
* Start working on disabled warnings
* Fix and silence a few dotnet-format warnings again
* Address IDE0251 warnings
* Address a few disabled IDE0060 warnings
* Silence IDE0060 in .editorconfig
* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"
This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.
* dotnet format whitespace after rebase
* First dotnet format pass
* Remove unnecessary formatting exclusion
* Add unsafe dotnet format changes
* Change visibility of JitSupportDarwin to internal
* dotnet format style --severity info
Some changes were manually reverted.
* dotnet format analyzers --severity info
Some changes have been minimally adapted.
* Restore a few unused methods and variables
* Silence dotnet format IDE0052 warnings
* Address dotnet format CA1816 warnings
* Address or silence dotnet format CA1806 and a few CA1854 warnings
* Address most dotnet format whitespace warnings
* Add comments to disabled warnings
* Simplify properties and array initialization, Use const when possible, Remove trailing commas
* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"
This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.
* dotnet format whitespace after rebase
* Add trailing commas, log errors instead of throwing and remove redundant code
* dotnet format style --severity info
Some changes were manually reverted.
* Address dotnet format CA1816 warnings
* Address or silence dotnet format CA1806 and a few CA1854 warnings
* Address most dotnet format whitespace warnings
* dotnet format whitespace after rebase
* dotnet format analyzers --severity info
Some changes have been minimally adapted.
* Address most dotnet format whitespace warnings
* dotnet format whitespace after rebase
* dotnet format style --severity info
Some changes were manually reverted.
* dotnet format analyzers --severity info
Some changes have been minimally adapted.
* Address dotnet format CA1816 warnings
* Address most dotnet format whitespace warnings
* Run dotnet format style after rebase
* Run dotnet format analyzers after rebase
* Simplify properties and array initialization, Use const when possible, Remove trailing commas
* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"
This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.
* dotnet format whitespace after rebase
* Update src/Ryujinx.Audio.Backends.SDL2/SDL2HardwareDeviceDriver.cs
Co-authored-by: Ac_K <Acoustik666@gmail.com>
---------
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* dotnet format style --severity info
Some changes were manually reverted.
* Restore a few unused methods and variables
* Address most dotnet format whitespace warnings
* Apply dotnet format whitespace formatting
A few of them have been manually reverted and the corresponding warning was silenced
* Add previously silenced warnings back
I have no clue how these disappeared
* Add comments to disabled warnings
* Simplify properties and array initialization, Use const when possible, Remove trailing commas
* Address IDE0251 warnings
* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"
This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.
* dotnet format whitespace after rebase
* First dotnet format pass
* dotnet format style --severity info
Some changes were manually reverted.
* Address or silence dotnet format CA1806 and a few CA1854 warnings
* Address most dotnet format whitespace warnings
* Apply dotnet format whitespace formatting
A few of them have been manually reverted and the corresponding warning was silenced
* Add comments to disabled warnings
* Simplify properties and array initialization, Use const when possible, Remove trailing commas
* Address IDE0251 warnings
* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"
This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.
* dotnet format whitespace after rebase
* First dotnet format pass
* dotnet format style --severity info
Some changes were manually reverted.
* dotnet format analyzers --severity info
Some changes have been minimally adapted.
* Restore a few unused methods and variables
* Silence dotnet format IDE0060 warnings
* Address dotnet format CA1822 warnings
* Address most dotnet format whitespace warnings
* Apply dotnet format whitespace formatting
A few of them have been manually reverted and the corresponding warning was silenced
* Add comments to disabled warnings
* Simplify properties and array initialization, Use const when possible, Remove trailing commas
* Silence IDE0060 in .editorconfig
* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"
This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.
* dotnet format whitespace after rebase
* Final dotnet format pass and fix naming rule violations
* Apply suggestions from code review
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Remove unused constant
---------
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* dotnet format style --severity info
Some changes were manually reverted.
* Restore a few unused methods and variables
* Address review comments
* Address most dotnet format whitespace warnings
* Add comments to disabled warnings
* Address IDE0251 warnings
* dotnet format whitespace after rebase
* Remove SuppressMessage attribute for removed rule
* dotnet format style --severity info
Some changes were manually reverted.
* Restore a few unused methods and variables
* Address most dotnet format whitespace warnings
* Apply dotnet format whitespace formatting
A few of them have been manually reverted and the corresponding warning was silenced
* Add comments to disabled warnings
* Simplify properties and array initialization, Use const when possible, Remove trailing commas
* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"
This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.
* dotnet format whitespace after rebase
* Final dotnet format pass and fix naming rule violations
* dotnet format style --severity info
Some changes were manually reverted.
* Address most dotnet format whitespace warnings
* Address IDE0251 warnings
* dotnet format whitespace after rebase
* dotnet format style --severity info
Some changes were manually reverted.
* Address dotnet format CA1816 warnings
* Address dotnet format CA1401 warnings
* Address most dotnet format whitespace warnings
* Run dotnet format after rebase and remove unused usings
- analyzers
- style
- whitespace
* Simplify properties and array initialization, Use const when possible, Remove trailing commas
* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"
This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.
* dotnet format whitespace after rebase
* Address review feedback
* dotnet format style --severity info
Some changes were manually reverted.
* Restore a few unused methods and variables
* Address dotnet format CA1816 warnings
* Address most dotnet format whitespace warnings
* Simplify properties and array initialization, Use const when possible, Remove trailing commas
* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"
This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.
* dotnet format whitespace after rebase
* Fix regression introduced by 1.1733 on Intel iGPUs
* Should have actually figured the variable, oops.
* maybe something goes wrong here? honestly lost
* Shader cache bump
* misc: Implement address space size workarounds
This adds code to support userland with less than 39 bits of address
space available by trying to reserve several sizes and reducing the
guest address space when needed.
This is required for ARM64 support when the kernel is
configured to use 63..39 bits for kernel space (meaning only 38 bits are available to userland).
* Address comments
* Fix 32 bits address space support and address more comments
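The size probing described for the address space workaround can be sketched roughly as follows. This is a minimal Python illustration, not the emulator's code: `pick_address_space_bits`, `try_reserve`, and the 39/32 bit bounds are hypothetical names and defaults, and the reservation itself is simulated.

```python
from typing import Callable, Optional


def pick_address_space_bits(
    try_reserve: Callable[[int], bool],
    max_bits: int = 39,
    min_bits: int = 32,
) -> Optional[int]:
    """Return the largest bit width whose reservation succeeds, or None.

    try_reserve is a hypothetical callback that attempts to reserve
    `size` bytes of address space and reports whether it worked.
    """
    for bits in range(max_bits, min_bits - 1, -1):
        if try_reserve(1 << bits):
            return bits
    return None


def fake_reserve(size: int) -> bool:
    # Simulated host that refuses any reservation above 2**37 bytes.
    return size <= (1 << 37)


chosen_bits = pick_address_space_bits(fake_reserve)
```

Starting from the largest size and walking down keeps the guest address space as large as the host allows, at the cost of a few failed reservation attempts on constrained hosts.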
* Implement Load/Store Local/Shared and Atomic shared using new instructions
* Remove now unused code
* Fix base offset register overwrite
* Fix missing storage buffer set index when generating GLSL for Vulkan
* Shader cache version bump
* Remove more unused code
* Some PR feedback
* ARMeilleure: Do not hardcode 4KiB page size in JitCache
* test: Do not hardcode page size to 4KiB for Ryujinx.Tests.Memory.Tests
Fix running tests on Asahi Linux with 16KiB pages.
* test: Do not hardcode page size to 4KiB for Ryujinx.Tests.Cpu
Fix running tests on Asahi Linux.
The test runner still crashes when trying to run the whole test suite.
* test: Do not hardcode page size to 4KiB for Ryujinx.Tests.Cpu
Fix some crashes on Asahi Linux.
* test: Ignore Vshl test on ARM64 due to unicorn crashes
* test: Workaround hardcoded size on some tests
Change mapping of code and data in case of non 4KiB configuration.
* test: Make CpuTestT32Flow depend on code address
Fix failure with different page size.
* test: Disable CpuTestThumb.TestRandomTestCases when page size isn't 4KiB
The test data needs to be reevaluated to take different page size into account.
* Address gdkchan's comments
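The idea behind the page-size changes above (query the host page size instead of assuming 4KiB) can be illustrated with a small Python sketch. The helper names `align_up` and `align_down` are illustrative only and are not taken from the Ryujinx sources.

```python
import mmap

# Host page size: 4096 on most x86-64 systems, 16384 on Asahi Linux.
PAGE_SIZE = mmap.PAGESIZE


def align_up(value: int, alignment: int = PAGE_SIZE) -> int:
    """Round value up to the next multiple of a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)


def align_down(value: int, alignment: int = PAGE_SIZE) -> int:
    """Round value down to a multiple of a power-of-two alignment."""
    return value & ~(alignment - 1)


# A mapping sized in whole host pages, whatever the page size is.
code_region_size = align_up(0x1234)
```

Sizes and addresses derived this way stay valid on 16KiB hosts, whereas a literal `0x1000` would produce misaligned mappings there.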
* Correctly set 'shell/open/command' registry key for file associations
* File association fixes
* 'using' statements instead of blocks
* Idempotent unregistration
* Single "hey shell, we changed file associations" notification at the
end instead of 1 for every operation, speeds things up greatly.
* Adapt and fix Linux specific function as well
---------
Co-authored-by: TSR Berry <20988865+TSRBerry@users.noreply.github.com>
* Use glob patterns to match file paths
* Update ignored paths for releases
* Adjust build.yml as well
* Add names to auto-assign steps
* Fix developer team name
* Allow build workflows to run if workflows changed
This is a bare-minimum triage action that handles the big categories.
In the future we could also label all services correctly, but
I didn't feel this was required for a first iteration.
* ava: Fix OpenGL on Linux again
This shouldn't be working like that, but for some reason it does.
* Apply the correct fix
* gtk: Add warning messages for caught exceptions
* ava: Handle disposing the same way as GTK does
* Address review feedback
* Implement transform feedback emulation for hardware without native support
* Stop doing some useless buffer updates and account for non-zero base instance
* Reduce redundant updates even more
* Update descriptor init logic to account for ResourceLayout
* Fix transform feedback and storage buffers not being updated in some cases
* Shader cache version bump
* PR feedback
* SetInstancedDrawVertexCount must be always called after UpdateState
* Minor typo
* Updater: Ignore files introduced by the user in base directory
* Replicate logic in Avalonia version.
* Address requested changes
* Updater: Ignore files introduced by the user in base directory
* Replicate logic in Avalonia version.
* Address requested changes
* Address requested changes
* Address requested changes
* Comment cleanup
* Address feedback
* Forgot comment, tehe
* Texture: Fix 3D texture size when totalBlocksOfGobsInZ > 0
When there is a remainder when dividing depth by gobs in z, it is used to remove the unused part of the 3D texture's size. This was done to calculate correct sizes for single slice views of 3D textures.
However, this case can also apply to 3D textures with many slices, and more than one total block of gobs in z. In this case it's meant to trim off the end of the level size. Most textures won't encounter this as their size will be aligned, but UE4 games tend to use 3D textures with funny unaligned sizes.
The size offset should have been applied to the level size instead of the slice size, and it should only affect the slice size if it ends up larger.
Hopefully should fix issues with UE4 games without breaking other stuff, I don't have much time to test.
* Whoops
* Texture: Fix layout conversion when gobs in z is used with depth = 1
The size calculator methods deliberately reduce the gob size of textures if they are deemed too small for it. This is required to get correct sizes when iterating mip levels of a texture.
Rendering to a slice of a 3D texture can produce a 3D texture with depth 1, but a gob size matching a much larger texture. We _can't_ "correct" this gob size, as it is intended as a slice of a larger 3D texture. Ignoring it causes layout conversion to break on read and flush.
This caused an issue in Tears of the Kingdom where the compressed 3D texture used for the gloom would always break on OpenGL, and seemingly randomly break on Vulkan. In the first case, the data is forcibly flushed to decompress the BC4 texture on the CPU to upload it as 3D, which was broken due to the incorrect layout. In the second, the data may be randomly flushed if it falls out of the cache, but it will appear correct if it's able to form copy dependencies.
This change only allows gob sizes to be reduced once per mip level. For the purpose of aligned size, it can still be reduced infinitely as our texture cache isn't properly able to handle a view being _misaligned_.
The SizeCalculator has also been changed to reduce the size of rendered depth slices to only include the exact range a single depth slice will cover. (before, the size was way too small with gobs in z reduced to 1, and too large when using the correct value)
Gobs in Y logic remains untouched, we don't support Y slices of textures so it's fine as is.
This is probably worth testing in a few games as it also affects texture size and view logic.
* Improve wording
* Maybe a bit better
* Update SoftwareKeyboard to send KeyboardMode to UI
* Update GTK UI to check text against KeyboardMode
* Update Ava UI to check text against KeyboardMode
* Restructure input validation
* true when text is not empty
* Add English validation text for SoftwareKeyboardMode
* Add Chinese validation text for SoftwareKeyboardMode
* Update base on feedback
---------
Co-authored-by: TSR Berry <20988865+TSRBerry@users.noreply.github.com>
* Implement storage buffer operations using new Load/Store instruction
* Extend GenerateMultiTargetStorageOp to also match access with constant offset, and log and comments
* Remove now unused code
* Catch more complex cases of global memory usage
* Shader cache version bump
* Extend global access elimination to work with more shared memory cases
* Change alignment requirement from 16 bytes to 8 bytes, handle cases where we need more than 16 storage buffers
* Tweak preferencing to catch more cases
* Enable CB0 elimination even when host storage buffer alignment is > 16 (for Intel)
* Fix storage buffer bindings
* Simplify some code
* Shader cache version bump
* Fix typo
* Extend global memory elimination to handle shared memory with multiple possible offsets and local memory
Currently, the `Open Applet` menu is still enabled when a guest is running, which is wrong. This is not fixed by refreshing the property binding on `IsEnabled`.
* ava: Fix exit dialog while guest is running.
There is currently an issue while a game is running: the content dialog creation method checks whether `IsGameRunning` is true to show the popup.
But the condition here is wrong (`window` is null), so it silently throws a NullException in `Dispatcher.UIThread`.
This is now fixed by using the right cast.
* improve condition
* Fix spacing
* UI: Fix empty homebrew icon
We currently don't check the icon size when we read it from the homebrew data. That could cause issues on the UI side, since the buffer isn't null but empty. An extra check has been added on the UI side too.
(I cleaned up some files during my research too)
Fixes #5188
* Remove additional check
* Remove unused using
* GAL: Dispose Renderer after running deferred actions
Deferred actions from disposing physical memory instances always dispose the resources in their caches. The renderer can't be disposed before these resources get disposed, otherwise the dispose actions will not actually run, and the ThreadedRenderer may get stuck trying to enqueue too many commands when there is nothing consuming them.
This should fix most instances of the emulator freezing on close.
* Wait for main render commands to finish, but keep RenderThread alive til dispose
* Address some feedback.
* No parameterize needed
* Set thread name as part of constructor
* Port to Ava and SDL2
* memory: Check results of pinvoke calls
* Increase vm.max_map_count when running Ryujinx
* Add SupportedOSPlatform attribute for WindowsApiException
* Revert increasing vm.max_map_count via script
* Add LinuxHelper to detect and increase vm.max_map_count
With GUI dialogs, this should be a bit more user-friendly.
* Supply arguments as a list to RunPkExec
* Add error logging in case RunPkExec() fails
* Prevent Gtk from crashing
* Linux: Detect if gamemode is installed and start it when launching Ryujinx.
When using the Ryujinx.sh script to start the emulator check if gamemoderun exists and use it if it does.
Gamemode on Linux changes some system settings to make performance during gaming more consistent, mainly by changing the CPU governor to performance.
https://github.com/FeralInteractive/gamemode
* Removed if statement.
* Fix due to wrong assumption about the output of which.
Checks if the which output contains a no match response, otherwise use gamemoderun.
A case statement is used because it makes substring matching possible in sh. It also turns out that adding an empty string after env throws an error, because env attempts to parse it as a parameter.
* Missed a couple semicolons.
* Different approach for checking if gamemode is available.
Should hopefully work across all implementations of which.
* Remove unneeded which command.
* Change code to keep launch command to a single line.
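The launch logic from the gamemode commits above lives in the Ryujinx.sh shell script; the following is only a Python rendering of the same idea, with `build_launch_command` and its injectable `which` parameter as hypothetical names. It prefixes the command with `gamemoderun` when that binary is found and otherwise adds nothing, which avoids passing an empty argument.

```python
import shutil
from typing import Callable, List, Optional


def build_launch_command(
    executable: str,
    args: List[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the argv to run, wrapped in gamemoderun when it is available."""
    wrapper = which("gamemoderun")
    # Use an empty list rather than an empty string when gamemode is
    # missing, so no blank argument ends up in the final command.
    prefix = [wrapper] if wrapper else []
    return prefix + [executable] + list(args)


command = build_launch_command("./Ryujinx", ["--fullscreen"])
```

Passing `which` as a parameter is purely a testing convenience for this sketch; the `--fullscreen` argument is an arbitrary example value.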
* Add support for VK_EXT_depth_clip_control.
* Code review feedback
Minor formatting
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Check .DepthClipControl to make sure the host actually supports the feature.
* Review feedback: remove Vulkan platform switch, relying on QueryHostSupportsDepthClipControl to drive the behaviour - OpenGL returns true, and any future platforms that don't support the [-1, 1] depth mode can return false for the transformation.
---------
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Attempt at fixing hang on exit by ending the WindowNotificationManager notification loop, so that the Thread running it can exit.
* explicitly apply the NotificationManager template to allow the notification loop to begin
* NotificationHelper - remove explicit call to ApplyTemplate(). Change to ManualResetEventSlim so we can cancel the Wait on it.
* add a timeout to AudioRenderSystem.Stop()'s waiting for the termination signal, log a warning if this timeout occurs, and continue execution
* NotificationHelper - cancel first, then CompleteAdding()
* Remove AudioRenderSystem._terminationEvent, redundant
* NotificationHelper - use host.Closing event to trigger cancellation instead of _notifationManager.DetachedFromLogicalTree
* Change NotificationHelper to use an explicit Thread for background work. Wait on the cancellationToken's WaitHandle so the Thread doesn't have to deal with async. Wrap foreach in try/catch (OperationCanceledException) to swallow the escaping exception from the GetConsumingEnumerable().
* adjust formatting of AsyncWorkQueue constructor to use object initializers consistently
* use AsyncWorkQueue to do everything I added in SetNotificationManager()
* Revert "use AsyncWorkQueue to do everything I added in SetNotificationManager()"
This reverts commit f0e78366b8776ec8e2fef8ab023c0db1833155d3.
* use AsyncWorkQueue to handle the Thread-related changes previously made to NotificationHelper.SetNotificationHelper(). Wrap it in Lazy<T> and force instantiation in the TemplateApplied event handler to accommodate the fact that AsyncWorkQueue starts immediately, and the notification dispatch loop was being delayed by _templateAppliedEvent.
* impl changes suggested by AcK77
* impl changes suggested by AcK77 (more)
* Generate scaling helper functions on IR
* Delete unused code
* Split RewriteTextureSample and move gather bias add to an earlier pass
* Remove using
* Shader cache version bump
* Truncate vertex attribute format if it exceeds stride on MoltenVK
* Fix BGR format
* Move vertex attribute check to pipeline creation to avoid costs
* No need for this to be public
* fix crash when Vulkan isn't available
* add VulkanRenderer.GetPhysicalDevices() overload that provides its own Vk API object and logs on failure
* adjustments per AcK77
* Add guard against ServerBase.Dispose() being called multiple times. Add reset event to avoid Dispose() being called while the ServerLoop is still running.
* remove unused usings
* rework ServerBase to use one collection each for sessions and ports, and make all accesses thread-safe.
* fix Logger call
* use GetSessionObj(int) instead of using _sessions directly
* move _threadStopped check inside "dispose once" test
* - Replace _threadStopped event with attempt to Join() the ending thread (if that isn't the current thread) instead.
- Use the instance-local _selfProcess and (new) _selfThread variables to avoid suggesting that the current KProcess and KThread could change. Per gdkchan, they can't currently, and this old IPC system will be removed before that changes.
- Re-order Dispose() so that the Interlocked _isDisposed check is the last check before disposing, to increase the likelihood that multiple callers will result in one of them succeeding.
* code style suggestions per AcK77
* add infinite wait for thread termination
* Introduce ResourceLayout
* Part 1: Use new ResourceSegments array on UpdateAndBind
* Part 2: Use ResourceLayout to build PipelineLayout
* Delete old code
* XML docs
* Fix shader cache load NRE
* Fix typo
* GPU: Avoid using garbage size for non-cb0 storage buffers
In the depths area, Tears of the Kingdom uses a global memory access with address on constant buffer slot 6. This isn't standard and thus doesn't actually have a size 8 bytes after it, so we were reading back a garbage size that ended up very large (at least in version 1.1.0), and would synchronize a lot of data per frame.
This PR makes storage buffers created from addresses outside constant buffer slot 0 get their size as the number of bytes remaining in the GPU mapping starting at the given virtual address. This should bound the buffer to a reasonable size, and ideally stop it crossing into other memory.
* Limit max size
* Add TODO
* Feedback
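The sizing rule described above for storage buffers outside constant buffer slot 0 (bytes remaining in the GPU mapping, then capped) can be sketched as below. This is an illustrative Python version under stated assumptions: the function name, its parameters, and the cap value used in the example are hypothetical and not taken from the Ryujinx sources.

```python
def storage_buffer_size(
    virtual_address: int,
    mapping_start: int,
    mapping_size: int,
    max_size: int,
) -> int:
    """Bytes from virtual_address to the end of its mapping, capped at max_size.

    Returns 0 when the address does not fall inside the mapping.
    """
    mapping_end = mapping_start + mapping_size
    if not (mapping_start <= virtual_address < mapping_end):
        return 0
    return min(mapping_end - virtual_address, max_size)


# Example values only: a 64KiB mapping and a 16KiB cap.
size = storage_buffer_size(0x1_0800, 0x1_0000, 0x1_0000, max_size=0x4000)
```

Bounding by the mapping keeps the buffer from crossing into other memory, and the cap keeps a very large mapping from forcing a large synchronization every frame.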
* gtk: Add missing isMouseInClient check for hide-cursor
* ava: Add missing events and default isCursorInRenderer to true
This is necessary because we don't receive an initial PointerEnter event for some reason.
* amadeus: adjust VirtualDevice channel configuration reporting with HardwareDevice
* audio: sdl2: Do not report 5.1 if device doesn't support it
SDL2 5.1 to Stereo conversion is terrible and makes everything sound
quiet.
Let's not expose 5.1 if not truly supported by the device.
* Fix macOS build name in CI
Fixes updater
* Update build.yml
Don't publish x86 Mac builds
* Naming nitpick
* Berry changes
* Use the same prefix for PR and release build archives
---------
Co-authored-by: TSR Berry <20988865+TSRBerry@users.noreply.github.com>
* GPU: Remove swizzle undefined matching and rework depth aliasing
@gdkchan pointed out that UI textures in TOTK seemed to be setting their texture swizzle incorrectly (texture was RGB but was sampling A, swizzle for A was wrong), so I determined that SwizzleComponentMatches was the problem and set on eliminating it. This PR combines existing work to select the most recently modified texture (now used when selecting which aliased texture to use) with some additional changes to remove the swizzle check and support aliased view creation.
The original observation (#1538) was that we wanted to match depth textures for the purposes of aliasing with color textures, but they often had different swizzle from what was sampled (as it's generally the identity swizzle once rendered). At the time, I decided to allow swizzles to match if only the defined components matched, which fixed the issue in all known cases but could easily be broken by a game _expecting_ a given swizzle, such as a 1/0 value on a component.
This error case could also occur in textures that don't even depth alias, such as R11G11B10, as the rule was created to generally apply to all cases.
The solution is now to fail this exact match test, and allow the search for an R32 texture to create a swizzled view of a D32 texture (and other such cases). This allows the creation of a view that mismatches the requested format, which wasn't present before and was the reason for the swizzle matching approach.
The exact match and view creation rules now follow the same rules over what textures to select when there are multiple options (such as a "perfect" match and an "aliased" match at the same time). It now selects the most recently modified texture, which is done with a new sequence number in the GpuContext (because we don't have enough of these).
Reportedly fixes UI having weird coloured backgrounds in TOTK. This also fixes an issue in MK8D where returning from a race resulted in the character selection cubemaps being broken. May work around issues introduced by the "short texture cache" PR due to modification ordering, though they won't be truly fixed.
Should allow (#4365) to avoid copies in more cases. Need to test that.
I tested a bunch of games #1538 originally affected and they seem to be fine. This change affects all games so it would be good to get some wide testing on it.
* Address feedback 1, fix an issue
* Workaround: Do not allow copies for format alias.
These should be removed when D32<->R32 copy dependencies become legal
* Fix the restart after an update.
* Fix the updater for the Ava UI too.
* Fixing up the code after some change requests.
Removed a line of code that was accidentally left in.
* Fix restarting on Linux Avalonia.
* Fix issues with escaped arguments.
* Changed LastPlayed field from string to nullable DateTime
Added ApplicationData.LastPlayedString property
Added NullableDateTimeConverter for the DateTime->string conversion in Avalonia
* Added migration from string-based last_played to DateTime-based last_played_utc
* Updated comment style
* Added MarkupExtension to NullableDateTimeConverter and changed its usage
Cleaned up leftover usings
* Missed one comment
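The migration from a string-based `last_played` to a nullable, DateTime-based `last_played_utc` can be pictured with the following Python sketch. The input format string and the treatment of the stored value as UTC are assumptions made for illustration; the commit messages do not state the original format, and `migrate_last_played` is a hypothetical name.

```python
from datetime import datetime, timezone
from typing import Optional


def migrate_last_played(
    value: Optional[str],
    fmt: str = "%Y-%m-%d %H:%M:%S",  # assumed legacy format
) -> Optional[datetime]:
    """Convert a legacy last-played string into an optional UTC datetime.

    Empty or unparseable values (for example a placeholder such as
    "Never") become None instead of raising.
    """
    if not value:
        return None
    try:
        parsed = datetime.strptime(value, fmt)
    except ValueError:
        return None
    # Assumption: the stored string is interpreted as UTC.
    return parsed.replace(tzinfo=timezone.utc)


migrated = migrate_last_played("2023-05-14 18:30:00")
```

Representing "never played" as a missing value rather than a magic string lets the UI layer decide how to display it, which is the role the commits give to `LastPlayedString` and `NullableDateTimeConverter`.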
* amadeus: Allow 5.1 sink output
Also add a simple Stereo to 5.1 change for device sink.
Tested against NES - Nintendo Switch Online, which outputs stereo on the
audio renderer.
* Remove outdated comment
* refactor: clean up controller settings ui
- Remove inconsistencies between left and right side
- Use style to set ToggleButton properties (since they are all the same)
- Move topmost controller settings from one line to 2x2 grid for improved clarity
- Properly adjust borders, text widths, etc. to neighboring elements to eliminate misaligned visual lines
* fix: merge issues
* fix: prevent sliders from jumping by giving text block fixed width
* refactor: add more separators and increase margin
* refactor: center deadzone and range descriptions
* refactor: move rumble border top margin to -1 and prevent double border
* refactor: remove margins & double borders + switch profile & input selection
* style: apply suggestions from code review
Co-authored-by: Ac_K <Acoustik666@gmail.com>
---------
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* amadeus: Fix wrong channel mapping check
This was always going to happen; as a result, quadratic would break and
move the index past the channel count, effectively breaking
input/output indices.
* amadeus: Fix reverb 3d early delay wrong output index
* Fix the issue of unequal check for amiibo file date due to the lack of sub-second units in the header, causing slow opening of the amiibo interface.
* Supplement the unrepaired.
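The amiibo date fix above concerns comparing a timestamp that carries sub-second precision with one taken from a header that only has whole seconds. One way to make such a comparison stable is sketched below in Python; `same_to_the_second` is a hypothetical helper, and the actual fix in the emulator may be implemented differently.

```python
from datetime import datetime, timezone


def same_to_the_second(a: datetime, b: datetime) -> bool:
    """Compare two timestamps while ignoring any sub-second component."""
    return a.replace(microsecond=0) == b.replace(microsecond=0)


# A local file time with microseconds and a header value without them.
local_time = datetime(2023, 5, 1, 12, 0, 0, 123456, tzinfo=timezone.utc)
header_time = datetime(2023, 5, 1, 12, 0, 0, tzinfo=timezone.utc)

up_to_date = same_to_the_second(local_time, header_time)
```

A plain equality check on these two values is false even though they describe the same file, which is the kind of mismatch that would make a cached file look stale on every open.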
This fixes a potential issue where a shader lookup could match the address of a previous _different_ shader, but that shader is now partially unmapped. This would just crash with an invalid region exception.
To compare a shader in the address cache with one in memory, we get the memory at the location with the previous shader's size. However, it's possible it has been unmapped and then remapped with a smaller size. In this case, we should just get back the mapped portion of the shader, which will then fail the comparison immediately and get to compile/lookup for the new one.
This might fix a random crash in TOTK that was reported by Piplup. I don't know if it does, because I don't have the game yet.
* Add build config and extra args to create_macos_build.sh
* Use matrix strategy for releases
* Add macOS jobs
Co-authored-by: Mary <thog@protonmail.com>
* Fix wrong version argument
* Fix check for the correct amount of args
* Install latest rcodesign release
Co-authored-by: Mary <thog@protonmail.com>
* Set executable bits for PR builds on linux
---------
Co-authored-by: Mary <thog@protonmail.com>
Command buffer errors currently trigger an exception "DeviceLost" crashing the process.
Looking at [MKV's code](53a4eb26f2/MoltenVK/MoltenVK/GPUObjects/MVKQueue.mm (L392-L408)) we observe that:
- It hard fails if error is:
```
MTLCommandBufferErrorBlacklisted || MTLCommandBufferErrorNotPermitted || MTLCommandBufferErrorDeviceRemoved
```
- Otherwise fails conditionally if `config.resumeLostDevice == false` (current default)
For Ryujinx's use-case it's more graceful to resume on those errors rather than crashing the app; the error isn't totally silenced, since `mvk` still logs it.
Fixes #4704, #4575
* Vulkan: Batch vertex buffer updates
Some games can bind a large number of vertex buffers for draws. This PR allows for vertex buffers to be updated with one call rather than one per buffer.
This mostly affects the AMD Mesa driver, the testing platform was Steam Deck with Super Mario Odyssey. It was taking about 12% before, should be greatly reduced now.
A small optimization has been added to avoid looking up the same buffer multiple times, as a common pattern is for the same buffer to be bound many times in a row with different ranges.
* Only rebind vertex buffers if they have changed
* Address feedback
* Ava: Fix SystemTimeOffset calculation
During testing of #4822, Mary pointed out the way we calculate time offset is wrong in our Avalonia UI. This PR fixed that.
The axaml file is autoformatted too.
* DateTime.Now in local var
* time: Update for 15.0.0 changes
The last time we upgraded the time service was during the 9.x era, so it was about time to pick that reverse engineering work back up!
15.0.0 added a new structure on the shared memory to get steady clock raw timepoints with a granularity in nanoseconds.
This commit implements this new part.
I plan to write a follow up with a bit of refactoring of this ancient part of the emulator.
As always, reverse and work done by yours truly.
PS: As a reminder, if this change is reused anywhere else, work should be credited as Ryujinx and not my person.
* time: Do not set setup value to posix time
This should fix local and network clock returning 0 under usage with
shared memory.
This probably fixes #2430.
* Address gdkchan's comment
* Fix internal offset not working since changes and ensure that user clock has a valid clock id
* time: Report auto correcting clock and hardcode steady clock unique id
Fix Pokemon Sword Pokejobs for real.
* Address gdkchan's comment
* Ava UI: Expose games build ID for cheat management
* Fix bad merge
* Change integrity check level to error on invalid
* Add support for GDK
* Remove whitespace
* Add BID identifier
* PR Comments fix
* Restore title id in cheats GTK window
* use halign center instead of margin_left
* Merge
* fix after merge
* PR comments fix - design AVA
* PR fix - Move GetApplicationBuildId to ApplicationData class
* PR comment fix - Add empty line before method
* Align with PR #4755
* PR comments fix
* Change BuildId label to support translation
* Comments fix
* Remove unused BuildIdLabel property
* feat: introduce new shader loading state for progress tracking when writing shaders to disk
* fix: move translation to bottom of locale file
* fix: change back to foreach and add requested spacing between lines
* style: fix formatting
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
---------
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* AM: Stub some service calls
Some IPC I have stubbed during private testing and I don't want to deal with them anymore. Nothing more.
* ICommonStateGetter disposable
* GPU: Remove CPU region handle containers.
Another one for the "I don't know why I didn't do this earlier" pile.
This removes the "Cpu" prefixed region handle classes, which each mirror a region handle type from Ryujinx.Memory.
Originally, not all projects had a reference to Ryujinx.Memory, so these classes were introduced to bridge the gap. Someone else crossed that bridge since, so these classes don't have much of a purpose anymore.
This PR replaces all uses of CpuRegionHandle etc with their direct Ryujinx.Memory versions.
RegionHandle methods (specifically QueryModified) are about the hottest path there is in the entire emulator, so there is a nice boost from doing this.
* Add docs
* UI: Fix sections extraction
There is currently an issue when the update NCA doesn't contain the section we want to extract; this is fixed by adding a check.
I have fixed the inverted handler of ExeFs/Logo introduced in #4755.
Fixes #4521
* Addresses feedback
* Allow any shader SSBO constant buffer slot and offset
* Fix slot value passed to SetUsedStorageBuffer on fallback case
* Shader cache version
* Ensure that the storage buffer source constant buffer offset is word aligned
* Fix FirstBinding on GetUniformBufferDescriptors
* GPU: Allow granular buffer updates from the constant buffer updater
Sometimes, constant buffer updates can't be avoided, either due to a cb0 access that cannot be eliminated, or the game updating a buffer between draws to the detriment of everyone.
To avoid uploading the full 4096 bytes each time, this PR remembers the offset and size containing all constant buffer updates since the last sync. It will then upload that range after sync.
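The range tracking can be sketched like this (Python, hypothetical `DirtyRange` class; the real fields are `_dirtyStart` and `_dirtyEnd` on the buffer). Each update widens a single covering range, and the sync point uploads and clears it.

```python
class DirtyRange:
    """Tracks one range covering every update since the last sync."""

    def __init__(self):
        self.start = None
        self.end = None

    def mark(self, offset, size):
        """Widen the range so it also covers [offset, offset + size)."""
        end = offset + size
        if self.start is None:
            self.start, self.end = offset, end
        else:
            self.start = min(self.start, offset)
            self.end = max(self.end, end)

    def take(self):
        """Return (offset, size) to upload and clear the range, or None if clean."""
        if self.start is None:
            return None
        result = (self.start, self.end - self.start)
        self.start = self.end = None
        return result
```

Note that two small updates far apart still produce one large range, which is why a distance check between regions is worth having.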
* Allow clearing the dirty range
* Always use precise
Might want to not do this if distance between the existing range and new one is too high.
* Use old force dirty mechanism when distance between regions is too great
* Update src/Ryujinx.Graphics.Gpu/Memory/Buffer.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Fix inheritance of _dirtyStart and _dirtyEnd
---------
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Fix case sensitivity for mod subdirectories
* Small refactoring of ModLoader
* Don't share instruction list between all cheats
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
---------
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
* fix: linux launcher breaks when there are spaces in the directory path
* Add quotes around $0 as well
---------
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* UI: Move ApplicationContextMenu into a separate class
This PR removes duplicated code related to the context menu on the Application list/grid by creating a control for the menu which includes the related handlers.
I've renamed "GameList/GameGrid" to "Application" for consistency, and I've removed all unneeded fields from the project file too.
While cleaning things up, I found an issue with purging the Ptc/Shader cache: both methods list files even if the user says "No", and the shader cache is purged even if the user says "No". It's fixed.
* Address feedback
Our Vulkan backend inserts image barriers when a texture is sampled after it is rendered. This is done via a "modification flag" which is set when a render target is unbound (presuming that a texture has finished drawing to it).
Imagine the following scenario:
- Game sets render target to texture A
- Game renders to texture A
- (render pass ends)
- Game binds texture A to a sampler
- Game sets render target to texture B
- Renders to texture B using texture A (barrier required)
Because of the previous behaviour, the check to add a barrier for sampling a texture actually happens before it is registered as modified, meaning no barrier was added at all. This isn't always the case, but it was definitely causing issues in Xenoblade 2.
This doesn't fix any more complicated issues where a texture is repeatedly sampled while it is currently being rendered.
Fixes visual glitches at lower resolutions in Xenoblade 2. May fix other cases.
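The ordering problem in that scenario can be reproduced with a toy model. This is a Python sketch with hypothetical names; deferring the check to draw time is my assumption for illustration, not a statement of how the backend is actually structured.

```python
def run(check_at_draw):
    """Replay the scenario above and return the sequence of GPU operations."""
    modified = set()       # textures flagged as modified
    pending_sample = []    # samplers bound but not yet checked
    log = []
    state = {"target": None}

    def set_render_target(tex):
        if state["target"] is not None:
            modified.add(state["target"])  # unbinding marks the old target modified
        state["target"] = tex

    def barrier_if_modified(tex):
        if tex in modified:
            modified.discard(tex)
            log.append("barrier " + tex)

    def bind_sampler(tex):
        if check_at_draw:
            pending_sample.append(tex)
        else:
            barrier_if_modified(tex)  # too early: the flag is not set yet

    def draw():
        for tex in pending_sample:
            barrier_if_modified(tex)
        pending_sample.clear()
        log.append("draw to " + state["target"])

    set_render_target("A")
    draw()
    bind_sampler("A")
    set_render_target("B")
    draw()
    return log
```

With `check_at_draw=False` the log contains no barrier at all, matching the bug; with `check_at_draw=True` the barrier lands between the two draws.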
* Add hide-cursor command line argument
* gtk: Adjust SettingsWindow for hide cursor options
* ava: Adjust SettingsWindow for hide cursor options
* ava: Add override check for HideCursor arg
* Remove copy&paste sins
* ava: Leave a little more room between the options
* gtk: Fix hide cursor issues
* ava: Only hide cursor if it's within the embedded window
* GPU: Keep sampled textures without any pool references alive
Occasionally games are very wasteful and clear/write to a texture without ever sampling it. As rendered textures in NVN games seem to all have overlapping memory ranges, the texture will eventually get overwritten.
Normally, this would trigger a removal from the auto delete cache, but a pool entry would keep the texture alive. However, with these textures that are never used, they will get deleted immediately and recreated on the next frame.
This change makes it so the ShortTextureCache can keep textures that have never had a pool reference alive for a few frames, so they're not constantly being created and deleted.
This improves performance in Zelda BOTW a little.
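As a rough illustration of "alive for a few frames", here is a Python sketch of a frame-countdown cache. `ShortCache`, `frames_to_live` and the default of 3 are hypothetical; the real ShortTextureCache has its own rules.

```python
class ShortCache:
    """Keeps entries alive for a fixed number of frames after they are added."""

    def __init__(self, frames_to_live=3):
        self.frames_to_live = frames_to_live
        self._entries = {}  # texture -> frames remaining

    def add(self, texture):
        """Add a texture, or refresh its lifetime if it is already cached."""
        self._entries[texture] = self.frames_to_live

    def tick(self):
        """Advance one frame and return the textures that expired."""
        expired = []
        for texture in list(self._entries):
            self._entries[texture] -= 1
            if self._entries[texture] <= 0:
                del self._entries[texture]
                expired.append(texture)
        return expired
```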
* Cleanup
* WIP texture pre-flush
Improve performance of TextureView GetData to buffer
Fix copy/sync ordering
Fix minor bug
Make this actually work
WIP host mapping stuff
* Fix usage flags
* message
* Cleanup 1
* Fix rebase
* Fix
* Improve pre-flush rules
* Fix pre-flush
* A lot of cleanup
* Use the host memory bits
* Select the correct memory type
* Cleanup TextureGroupHandle
* Missing comment
* Remove debugging logs
* Revert BufferHandle _value access modifier
* One interrupt action at a time.
* Support D32S8 to D24S8 conversion, safeguards
* Interrupt cannot happen in sync handle's lock
Waitable needs to be checked twice now, but this should stop it from deadlocking.
* Remove unused using
* Address some feedback
* Address feedback
* Address more feedback
* Address more feedback
* Improve sync rules
Should allow for faster sync in some cases.
* GPU: Fix errors handling texture remapping
- Fixes an error where a pool entry and memory mapping changing at the same time could cause a texture to rebind its data from the wrong GPU VA (data swaps)
- Fixes an error where the texture pool could act on a mapping change before the mapping has actually been changed ("Unmapped" event happens before change, we need to signal it changed _after_ it completes)
TODO: remove textures from partially mapped list... if they aren't.
* Add Remap actions for handling post-mapping behaviours
* Remove unused code.
* Address feedback
* Nit
* Refactor attribute handling on the shader generator
* Implement gl_ViewportMask[]
* Add back the Intel FrontFacing bug workaround
* Fix GLSL transform feedback outputs mismatch with fragment stage
* Shader cache version bump
* Fix geometry shader recognition
* PR feedback
* Delete GetOperandDef and GetOperandUse
* Remove replacements that are no longer needed on GLSL compilation on Vulkan
* Fix incorrect load for per-patch outputs
* Fix build
* Use vector transform feedback outputs with fragment shaders
* Shader cache version bump
* Fix missing outputs when vector transform feedback outputs are used
* use ArrayPool, avoid 6000-7000 allocs/sec of runtime
* use ArrayPool, avoid ~7k allocs/second during game execution
* use ArrayPool, avoid ~3000 allocs/sec during game execution
* use MemoryPool, reduce 0.5 MB/sec of new allocations during game execution
* avoid over-allocation by setting List<> Capacity when known
* remove LINQ in KTimeManager.UnscheduleFutureInvocation
* KTimeManager - avoid spinning one more time when the time has arrived
* KTimeManager - let SpinWait decide when to Thread.Yield(), and don't SpinOnce() immediately after Thread.Yield()
* use MemoryPool, reduce ~175k bytes/sec allocation during game execution
* IpcService - call commands via dynamic methods instead of reflection .Invoke(). Faster to call and with fewer allocations because parameters can be passed directly instead of as an array
* Make ButtonMappingEntry a record struct to avoid allocations. Set the List<ButtonMappingEntry> capacity according to use.
* add MemoryBuffer type for working with MemoryPool<byte>
* update changes to use MemoryBuffer
* make parameter ReadOnlySpan instead of Span
* whitespace fix
* Revert "IpcService - call commands via dynamic methods instead of reflection .Invoke(). Faster to call and with fewer allocations because parameters can be passed directly instead of as an array"
This reverts commit f2c698bdf65f049e8481c9f2ec7138d9b9a8261d.
* tweak KTimeManager spin behavior
* replace MemoryBuffer with ByteMemoryPool modeled after System.Buffers.ArrayMemoryPool<T>
* make ByteMemoryPoolBuffer responsible for renting memory
* ava: Remove unused doWhileDeferred parameters
* ava: Minimally improve swkbd dialog
It's currently impossible to get the dialog to redirect focus to the InputBox.
* ava: Fix nca extraction dialog never closing
Also contains some minor cleanup
* Added HiddenFileTypes to config state, and check to file enumeration
* Added hiddenfiletypes checkboxes to the UI
* Added Ava version of HiddenFileTypes
* Inverted Hide to Show with file types, minor formatting
* all variables with a reference to 'hidden' are now 'shown'
* one more variable name changed
* review feedback
* added FileTypes extension method to get the correlating config value
* moved extension method to new folder and file in Ryujinx.Ui.Common
* added default case for ToggleFileType
* changed exception type to OutOfRangeException
* Added check for eventual symlink when displaying game files.
* Moved symlink check logic
* Moved symlink check logic
* Fixed prev commit
---------
Co-authored-by: Daniel Shala <danielshala00@gmail.com>
* hle: Deal with empty titleNames in some languages
* gui: Fix displaying the wrong title name
* Remove unnecessary bounds check
* Fix a NRE when getting the version string
* Restore empty string logic
* Flush in the middle of long command buffers.
* Vulkan: add situational "Fast Flush" mode
The AutoFlushCounter class was added to periodically flush Vulkan command buffers throughout a frame, which reduces latency to the GPU as commands are submitted and processed much sooner. This was done by allowing command buffers to flush when framebuffer attachments changed.
However, some games have incredibly long render passes with a large number of draws, and really aggressive data access that forces GPU sync.
The Vulkan backend could potentially end up building a single command buffer for 4-5ms if a pass has enough draws, such as in BOTW. In the scenario where sync is waited on immediately after submission, this would have to wait for the completion of a much longer command buffer than usual.
The solution is to force command buffer submission periodically in a "fast flush" mode. This will end up splitting render passes, but it will only enable if sync is aggressive enough.
This should improve performance in GPU limited scenarios, or in games that aggressively wait on synchronization. In some games, it may only kick in when res scaling. It won't trigger in games like SMO where sync is not an issue.
Improves performance in Pokemon Scarlet/Violet (res scaled) and BOTW (in general).
* Add conversions in milliseconds next to flush timers.
* ARMeilleure: Move TPIDR_EL0 and TPIDRRO_EL0 to NativeContext
Some games access these system registers several tens of thousands of times in a second from many different threads. While this isn't really crippling, it is a lot of wasted time spent in a reverse pinvoke transition.
Example games are Pokemon Scarlet/Violet and BOTW. These games have a lot of different potential bottlenecks so it's unlikely you will see a consistent improvement, but it definitely disappears from the cpu profile.
* Remove unreachable code.
* Add ulong conversion for offsets
* Nit
This seems to have been removed by the Post-Processing PR, but it is required for the display in OBS to be the right way up and properly scaled.
I've tested this with AA and FSR on MK8D and it seems to behave properly. Testing is welcome.
* ARMeilleure: Respect Fz flag for all floating point operations.
This is a change in strategy for emulating the Fz FPCR flag. Before, it was set before instructions that "needed it" and reset after. However, this missed a few hot instructions like the multiplication instruction, and the entirety of A32.
The new strategy is to set the Fz flag only in the following circumstances:
- Set to match FPCR before translated functions/loop are executed.
- Reset when calling SoftFloat methods, set when returning.
- Reset when exiting execution.
This allows us to remove the code around the existing Fz aware instructions, and get the accuracy benefits on all floating point instructions executed while in translated code.
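The three rules above amount to a set/reset discipline around translated code and SoftFloat calls. A Python sketch of that discipline, with a plain object standing in for the host FP control register (all names here are hypothetical):

```python
from contextlib import contextmanager


class HostFpState:
    """Stand-in for the host FP control register."""

    def __init__(self):
        self.flush_to_zero = False


@contextmanager
def translated_code(host, guest_fz):
    host.flush_to_zero = guest_fz      # set to match FPCR before executing
    try:
        yield
    finally:
        host.flush_to_zero = False     # reset when exiting execution


def call_soft_float(host, func, *args):
    saved = host.flush_to_zero
    host.flush_to_zero = False         # reset while the managed helper runs
    try:
        return func(*args)
    finally:
        host.flush_to_zero = saved     # set again when returning
```

The sketch assumes nothing else touches the flag in between, which is the same assumption the questions below are about.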
Single step executions now need to be called with a context wrapper - right now it just contains the Fz flag initialization, and won't actually do anything on ARM.
This fixes a bug in Breath of the Wild where some physics interactions could randomly crash the game due to subnormal values not flushing to zero.
This is a draft right now because I need to answer the questions:
- Does dotnet avoid changing the value of Mxcsr?
- Is it a good idea to assume that? Or should the flag set/restore be done on every managed method call, not just softfloat?
- If we assume that, do we want a unit test to verify the behaviour?
I recommend testing a bunch of games, especially games affected when this was originally added, such as #1611.
* Remove unused method
* Use FMA for Fmadd, Fmsub, Fnmadd, Fnmsub, Fmla, Fmls
...when available.
Similar implementation to A32
* Use FMA for Frecps, Frsqrts
* Don't set DAZ.
* Add round mode to ARM FP mode
* Fix mistakes
* Add test for FP state when calling managed methods
* Add explanatory comment to test.
* Cleanup
* Add A64 FPCR flags
* Vrintx_S A32 fast path on A64 backend
* Address feedback 1, re-enable DAZ
* Fix FMA instructions By Elem
* Address feedback
* Redesign use of ISampledData for accessing the SamplingNumber value on input data structs.
* Always read SamplingNumber as little-endian
* Restored field order for SixAxisSensorState. Rework to allow possibility of non-zero offsets for the SamplingNumber field. Set StructLayout Pack=8 - the KeyboardState struct is 4 bytes shorter with any other value.
* fix spelling
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
* set Pack = 1 for ISampledDataStruct types, added Unknown field to KeyboardState
* extend size of KeyboardModifier
---------
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
* vulkan: Move most of the properties enumeration to VulkanPhysicalDevice
That cleans up a bit of duplicate logic.
Also move to using a hashset for device extensions.
* vulkan: Move instance querying to VulkanInstance
Also cleanup code to use span when possible instead of unsafe pointers.
* Address gdkchan's comments
* Use index fragment shader output when dual source blend is enabled
* Shader cache version bump
* Actually set DualSourceBlendEnabled to true
* Fix XML doc
---------
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Fix missing string enum converters for the config
* Revert changing KeyboardHotkeys to struct
This needs to be done because
Avalonia's TwoWay Binding breaks otherwise.
* Use source generated json serializers in order to improve code trimming
* Use strongly typed github releases model to fetch updates instead of raw Newtonsoft.Json parsing
* Use separate model for LogEventArgs serialization
* Make dynamic object formatter static. Fix string builder pooling.
* Do not inherit json version of LogEventArgs from EventArgs
* Fix extra space in object formatting
* Write log json directly to stream instead of using buffer writer
* Rebase fixes
* Rebase fixes
* Rebase fixes
* Enforce block-scoped namespaces in the solution. Convert style for existing code
* Apply suggestions from code review
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* Rebase indent fix
* Fix indent
* Delete unnecessary json properties
* Rebase fix
* Remove overridden json property names as they are handled in the options
* Apply suggestions from code review
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* Use default json options in github api calls
* Indentation and spacing fixes
* Fix json serialization
* Fix missing JsonConverter for config enums
* Add double \n\n after the whole string, not inside join
---------
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* vulkan: Separate debug utils logic from VulkanInitialization
Also checks for VK_EXT_debug_utils existence instead of force enabling it and allow possible error during messenger init
* Address gdkchan's comment
* Use CreateDebugUtilsMessenger Span variant
* HLE: Refactoring of ApplicationLoader
* Fix SDL2 Headless
* Addresses gdkchan feedback
* Fixes LoadUnpackedNca RomFS loading
* remove useless casting
* Cleanup and fix empty application name
* Remove ProcessInfo
* Fixes typo
* ActiveProcess to ActiveApplication
* Update check
* Clean using.
* Use the correct filepath when loading Homebrew.npdm
* Fix NRE in ProcessResult if MetaLoader is null
* Add more checks for valid processId & return success
* Add missing logging statement for npdm error
* Return result for LoadKip()
* Move error logging out of PFS load extension method
This avoids logging "Could not find Main NCA"
followed by "Loading main..." when trying to start hbl.
* Fix GUIs not checking load results
* Fix style and formatting issues
* Fix formatting and wording
* gtk: Refactor LoadApplication()
---------
Co-authored-by: TSR Berry <20988865+TSRBerry@users.noreply.github.com>
* Rework StdErr-to-log redirection to use built-in FileStream, and do reads asynchronously to avoid hanging the process shutdown.
* set _disposable to false ASAP
* Simplify return statements by using ternary expressions
* Remove a redundant type conversion
* Reduce nesting by inverting "if" statements
* Try to improve code readability by using LINQ and inverting "if" statements
* Try to improve code readability by using LINQ, using ternary expressions, and inverting "if" statements
* Add line breaks to long LINQ
* Add line breaks to long LINQ
* Vulkan: Insert barriers before clears
Newer NVIDIA GPUs seem to be able to start clearing render targets before the last rasterization task is completed, which can cause it to clear a texture while it is being sampled.
This change adds a barrier from read to write when doing a clear, assuming it has been sampled in the past. It could be possible for this to be needed for sample into draw by some GPU, but it's not right now afaik.
This should fix visual artifacts on newer NVIDIA GPUs and driver combos. Contrary to popular belief, Tetris® Effect: Connected is not affected. Testing welcome, hopefully should fix most cases of this and not cost too much performance.
* Visual Studio Moment
* Address feedback
* Address Feedback 2
Protection for the `xgetbv` instruction for systems that do not support
`xcr0`, such as Nehalem processors.
The `XSAVE` cpuid bit indicates support for `XSAVE`, `XRSTOR`, `XSETBV`,
`XGETBV` while `OSXSAVE` indicates if the operating system itself has
`XSAVE` turned on. Both must be checked at the same time.
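A small sketch of the combined check, in Python for brevity. The bit positions are the documented CPUID leaf 1 ECX bits (26 for `XSAVE`, 27 for `OSXSAVE`); `can_use_xgetbv` is a hypothetical helper name.

```python
XSAVE_BIT = 1 << 26    # CPUID.1:ECX bit 26: CPU supports XSAVE/XRSTOR/XSETBV/XGETBV
OSXSAVE_BIT = 1 << 27  # CPUID.1:ECX bit 27: OS has enabled XSAVE


def can_use_xgetbv(cpuid1_ecx):
    """True only when both the CPU and the OS report XSAVE support."""
    required = XSAVE_BIT | OSXSAVE_BIT
    return (cpuid1_ecx & required) == required
```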
* Use source generated json serializers in order to improve code trimming
* Use strongly typed github releases model to fetch updates instead of raw Newtonsoft.Json parsing
* Use separate model for LogEventArgs serialization
* Make dynamic object formatter static. Fix string builder pooling.
* Do not inherit json version of LogEventArgs from EventArgs
* Fix extra space in object formatting
* Write log json directly to stream instead of using buffer writer
* Rebase fixes
* Rebase fixes
* Rebase fixes
* Enforce block-scoped namespaces in the solution. Convert style for existing code
* Apply suggestions from code review
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* Rebase indent fix
* Fix indent
* Delete unnecessary json properties
* Rebase fix
* Remove overridden json property names as they are handled in the options
* Apply suggestions from code review
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* Use default json options in github api calls
* Indentation and spacing fixes
---------
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* ARMeilleure: Add AVX512{F,VL,DQ,BW} detection
Add `UseAvx512Ortho` and `UseAvx512OrthoFloat` optimization flags as
short-hands for `F+VL` and `F+VL+DQ`.
* ARMeilleure: Add initial support for EVEX instruction encoding
Does not implement rounding, or exception controls.
* ARMeilleure: Add `X86Vpternlogd`
Accelerates the vector-`Not` instruction.
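For context, `vpternlogd` treats its immediate as an 8-entry truth table indexed by one bit from each of its three operands, so a bitwise NOT is just a particular immediate. A Python model of one 32-bit lane follows; `vpternlog` and `NOT_C` are names made up for this sketch, and 0x55 is simply one immediate that yields NOT of the third operand, not necessarily the one the JIT emits.

```python
def vpternlog(a, b, c, imm8, width=32):
    """Bitwise ternary logic: imm8 is a truth table indexed by (a_bit, b_bit, c_bit)."""
    result = 0
    for bit in range(width):
        index = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        result |= ((imm8 >> index) & 1) << bit
    return result


NOT_C = 0x55  # table entries are 1 exactly where the c bit is 0
```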
* ARMeilleure: Add check for `OSXSAVE` for AVX{2,512}
* ARMeilleure: Add check for `XCR0` flags
Add XCR0 register checks for AVX and AVX512F, following the guidelines
from section 14.3 and 15.2 from the Intel Architecture Software
Developer's Manual.
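A sketch of what those XCR0 checks look like, in Python. The bit positions are the architectural ones (SSE state bit 1, AVX state bit 2, opmask bit 5, ZMM_Hi256 bit 6, Hi16_ZMM bit 7); the function names are hypothetical.

```python
XCR0_SSE = 1 << 1        # XMM state
XCR0_AVX = 1 << 2        # upper halves of YMM
XCR0_OPMASK = 1 << 5     # k0-k7 mask registers
XCR0_ZMM_HI256 = 1 << 6  # upper halves of ZMM0-ZMM15
XCR0_HI16_ZMM = 1 << 7   # ZMM16-ZMM31

AVX_MASK = XCR0_SSE | XCR0_AVX
AVX512_MASK = AVX_MASK | XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM


def os_enables_avx(xcr0):
    """True when the OS saves/restores both XMM and YMM state."""
    return (xcr0 & AVX_MASK) == AVX_MASK


def os_enables_avx512(xcr0):
    """True when the OS also saves/restores opmask and ZMM state."""
    return (xcr0 & AVX512_MASK) == AVX512_MASK
```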
* ARMeilleure: Remove redundant `ReProtect` and `Dispose`, formatting
* ARMeilleure: Move XCR0 procedure to GetXcr0Eax
* ARMeilleure: Add `XCR0` to `FeatureInfo` structure
* ARMeilleure: Utilize `ReadOnlySpan` for Xcr0 assembly
Avoids an additional allocation
* ARMeilleure: Formatting fixes
* ARMeilleure: Fix EVEX encoding src2 register index
> Just like in VEX prefix, vvvv is provided in inverted form.
* ARMeilleure: Add `X86Vpternlogd` acceleration to `Vmvn_I`
Passes unit tests, verified instruction utilization
* ARMeilleure: Fix EVEX register operand designations
Operand 2 was being sourced improperly.
EVEX encoded instructions source their operands like so:
Operand 1: ModRM:reg
Operand 2: EVEX.vvvvv
Operand 3: ModRM:r/m
Operand 4: Imm
This fixes the improper register designations when emitting vpternlog.
Now "dest", "src1", "src2" arguments emit in the proper order in EVEX instructions.
* ARMeilleure: Add `X86Vpternlogd` acceleration to `Orn_V`
* ARMeilleure: PTC version bump
* ARMeilleure: Update EVEX encoding Debug.Assert to Debug.Fail
* ARMeilleure: Update EVEX encoding comment capitalization
* Initial implementation of migration between memory heaps
- Missing OOM handling
- Missing `_map` data safety when remapping
- Copy may not have completed yet (needs some kind of fence)
- Map may be unmapped before it is done being used. (needs scoped access)
- SSBO accesses are all "writes" - maybe pass info in another way.
- Missing keeping map type when resizing buffers (should this be done?)
* Ensure migrated data is in place before flushing.
* Fix issue where old waitable would be signalled.
- There is a real issue where existing Auto<> references need to be replaced.
* Swap bound Auto<> instances when swapping buffer backing
* Fix conversion buffers
* Don't try move buffers if the host has shared memory.
* Make GPU methods return PinnedSpan with scope
* Storage Hint
* Fix stupidity
* Fix rebase
* Tweak rules
Attempt to sidestep BOTW slowdown
* Remove line
* Migrate only when command buffers flush
* Change backing swap log to debug
* Address some feedback
* Disallow backing swap when the flush lock is held by the current thread
* Make PinnedSpan from ReadOnlySpan explicitly unsafe
* Fix some small issues
- Index buffer swap fixed
- Allocate DeviceLocal buffers using a separate block list to images.
* Remove alternative flags
* Address feedback
* Avoid copying more handles than we have space for
* Use locks instead
* Reduce nesting by combining the lock statements
* Add locks for other uses of _sessionHandles and _portHandles
* Use one object to lock instead of locking twice
* Release the lock as soon as possible
* add RecyclableMemoryStream dependency and MemoryStreamManager
* organize BinaryReader/BinaryWriter extensions
* add StreamExtensions to reduce need for BinaryWriter
* simple replacements of MemoryStream with RecyclableMemoryStream
* add write ReadOnlySequence<byte> support to IVirtualMemoryManager
* avoid 0-length array creation
* rework IpcMessage and related types to greatly reduce memory allocation by using RecyclableMemoryStream, keeping streams around longer, avoiding their creation when possible, and avoiding creation of BinaryReader and BinaryWriter when possible
* reduce LINQ-induced memory allocations with custom methods to query KPriorityQueue
* use RecyclableMemoryStream in StreamUtils, and use StreamUtils in EmbeddedResources
* add constants for nanosecond/millisecond conversions
* code formatting
* XML doc adjustments
* fix: StreamExtension.WriteByte not writing non-zero values for lengths <= 16
* XML Doc improvements. Implement StreamExtensions.WriteByte() block writes for large-enough count values.
* add copyless path for StreamExtension.Write(ReadOnlySpan<int>)
* add default implementation of IVirtualMemoryManager.Write(ulong, ReadOnlySequence<byte>); remove previous explicit implementations
* code style fixes
* remove LINQ completely from KScheduler/KPriorityQueue by implementing a custom struct-based enumerator
* GPU: Fast path for adding one texture view to a group
Texture group handles must store a list of their overlapping views, so they can be properly notified when a write is detected, and a few other things relating to texture readback. This is generally created when the group is established, with each handle looping over all views to find its overlaps. This whole process was also done when only a single view was added (and no handles were changed), however...
Sonic Frontiers had a huge cubemap array with 7350 faces (175 cubemaps * 6 faces * 7 levels), so iterating over both handles and existing views added up very fast. Since we are only adding a single view, we only need to _add_ that view to the existing overlaps, rather than recalculate them all.
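To make the difference concrete, here is a Python sketch with ranges standing in for handles and views. `rebuild` is the full recalculation and `add_view` is the fast path; both names and the range representation are assumptions for illustration only.

```python
def overlaps(a, b):
    """Half-open ranges (start, end) overlap when each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def rebuild(handles, views):
    """Full recalculation: every handle loops over every view."""
    return {h: [v for v in views if overlaps(h, v)] for h in handles}


def add_view(table, view):
    """Fast path: append the one new view to the handles it overlaps."""
    for handle, handle_views in table.items():
        if overlaps(handle, view):
            handle_views.append(view)
```

The fast path costs one pass over the handles instead of handles times views, and ends up with the same table.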
This greatly improves performance during loading screens and a few seconds into gameplay on the "open zone" sections of Sonic Frontiers. May improve loading times or stutters on some other games.
Note that the current texture cache rules will cause these views to fall out of the cache, as there are more than the hard cap, so the cost will be repaid when reloading the open zone.
I also added some code to properly remove overlaps when texture views are removed, since it seems that was missing.
This can be improved further by only iterating handles that overlap the view (filter by range), but so can a few places in TextureGroup, so better to do all at once. The full generation of overlaps could probably be improved in a similar way.
I recommend testing a few games to make sure nothing breaks.
* Address feedback
* Update sparsely mapped texture ranges without recreating
Important TODO in TexturePool. Smaller TODO: should I look into making textures with views also do this? It needs to be able to detect if the views can be instantly deleted without issue if they're now remapped.
* Actually do partial updates
* Signal group dirty after mappings changed
* Fix various issues (should work now)
* Further optimisation
Should load a lot less data (16x) when partial updating 3d textures.
* Improve stability
* Allow granular uploads on large textures, improve rules
* Actually avoid updating slices that aren't modified.
* Address some feedback, minor optimisation
* Small tweak
* Refactor DereferenceRequest
More specific initialization methods.
* Improve code for resetting handles
* Explain data loading a bit more
* Add some safety for setting null from different threads.
All texture sets come from the one thread, but null sets can come from multiple. Only decrement ref count if we succeeded the null set first.
* Address feedback 1
* Make a bit safer
* GPU: Scale counter results before addition
Counter results were being scaled on ReportCounter, which meant that the _total_ value of the counter was being scaled. Not only could this result in very large numbers and weird overflows if the game doesn't clear the counter, but it also caused the result to change drastically.
This PR changes scaling to be done when the value is added to the counter on the backend. This should evaluate the scale at the same time as before, on report counter, but avoiding the issue with scaling the total.
Fixes scaling in Warioware, at least in the demo, where it seems to compare old/new counters and broke down when scaling was enabled.
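A toy comparison of the two approaches in Python. The multiplicative `scale` factor and the names are hypothetical; the point is only that scaling each result as it is added differs from scaling the running total whenever the scale is not constant.

```python
def old_report(results, scale_at_report):
    """Old behaviour: accumulate raw results, scale the whole total on report."""
    return int(sum(results) * scale_at_report)


class Counter:
    """New behaviour: scale each result before adding it to the total."""

    def __init__(self):
        self.total = 0

    def add(self, result, scale):
        self.total += int(result * scale)
```

For example, adding 100 at scale 1.0 and then 400 at scale 0.25 gives 200 with the new approach, while scaling the raw total of 500 by 0.25 gives 125.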
* Fix issues when result is partially uploaded.
Drivers tend to write the low half first, then the high half. Retry if the high half is FFFFFFFF.
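The retry can be sketched as below (Python, hypothetical `read_counter`). It assumes the 64-bit slot still holds FFFFFFFF in its high half until the driver writes it, and the attempt limit is an arbitrary safety net for the sketch.

```python
def read_counter(read_u64, max_attempts=1000):
    """Re-read until the high 32 bits no longer look unwritten."""
    for _ in range(max_attempts):
        value = read_u64()
        if (value >> 32) != 0xFFFFFFFF:
            return value
    raise TimeoutError("counter result never completed")
```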
* use Array.Empty() instead of allocating new zero-length arrays
* structure for loops in a way that the JIT will elide array/Span bounds checking
* avoiding function calls in for loop condition tests
* avoid LINQ in a hot path
* conform with code style
* fix mistake in GetNextWaitingObject()
* fix GetNextWaitingObject() possibility of returning null if all list items have TimePoint == long.MaxValue
* make GetNextWaitingObject() behave as FIFO for multiple items with the same TimePoint
* Add flatpak release workflow
Co-authored-by: Mary <mary@mary.zone>
* infra: Update required SDK version to 7.0.200
---------
Co-authored-by: Mary <mary@mary.zone>
* Sockets: Properly convert error codes on macOS
The error codes for macOS are very different to how they are on Windows or Linux. An alternate mapping is used when the host operating system is macOS.
This PR also defaults IsDhcpEnabled to true when interfaceProperties.DhcpServerAddresses is not available.
This change was already in `macos1`.
* Address feedback
* Add Post Processing Effects
* fix events and shader issues
* fix gtk upscale slider value
* fix bgra games
* don't swap swizzle if already swapped
* restore opengl texture state after effects run
* addressed review
* use single pipeline for smaa and fsr
* call finish on all pipelines
* addressed review
* attempt fix file case
* attempt fixing file case
* fix filter level tick frequency
* adjust filter slider margins
* replace fxaa shaders with original shader
* addressed review
This allows changing base application directory behavior at build time via FORCE_EXTERNAL_BASE_DIR.
This is intended to be used by nixpkgs and flathub builds.
I also added the missing patch for macOS that we have on macos1 to avoid invalidating code signature.
* Move Ryujinx folder to Application Support on macOS
* Create a symlink to preserve back compat
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Remove extra whitespace
* Don’t create a symlink
* Update Ryujinx.Common/Configuration/AppDataManager.cs
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Revert "Don’t create a symlink"
This reverts commit 31752fe8ab.
---------
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Use SIMD acceleration for audio upsampler filter kernel for a moderate speedup
* Address formatting. Implement AVX2 fast path for high quality resampling in ResamplerHelper
* now really, are we really getting the benefit of inlining 50+ line methods?
* adding unit tests for resampler + upsampler. The upsampler ones fail for some reason
* Fixing upsampler test. Apparently this algo only works at specific ratios
---------
Co-authored-by: Logan Stromberg <lostromb@microsoft.com>
I noticed that in Xenoblade 2, the game can end up spending a lot of time adding and removing tracking handles. One of the main causes of this is actually splitting existing handles, which does the following:
- Remove existing handle from list
- Update existing handle to end at split address, create new handle starting at split address
- Add updated handle (left) to list
- Add new handle (right) to list
This costs 1 deletion and 2 insertions. When there are more handles, this gets a lot more expensive, as insertions are done by copying all values to the right, and deletions by copying values to the left.
This PR simply allows it to look up the handle being split, and replace its entry with the new end address without insertion or deletion. This makes a split only cost one insertion and a binary search lookup (very cheap). This isn't all of the cost on Xenoblade 2, but it does significantly reduce it.
There might be more to gain here: we could find a way to reduce the handle count for the game (merging on deletion? buffer deletion?), or use a different structure for virtual regions. The current one is optimal for buffer lookups, which nearly always read, whereas memory tracking has more of a balance between reads and writes. That's for a later date though; this was an easy improvement.
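The split described above can be sketched like this (Python, with handles reduced to `(start, end)` tuples in a list sorted by start address; the function names are made up for the example). The existing entry is found with a binary search and overwritten in place, so only the right half needs an insertion.

```python
def find_handle_index(handles, address):
    """Binary search for the last handle whose start is <= address."""
    lo, hi = 0, len(handles)
    while lo < hi:
        mid = (lo + hi) // 2
        if handles[mid][0] <= address:
            lo = mid + 1
        else:
            hi = mid
    return lo - 1


def split_handle(handles, address):
    """Split the handle containing address, keeping the list sorted."""
    index = find_handle_index(handles, address)
    if index < 0:
        raise ValueError("no handle contains this address")
    start, end = handles[index]
    if not (start < address < end):
        raise ValueError("address is not strictly inside a handle")
    # Left half: overwrite the existing entry, no deletion or re-insertion.
    handles[index] = (start, address)
    # Right half: the single insertion that remains.
    handles.insert(index + 1, (address, end))
```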
* Add blend microcode registers
* Add advanced blend support using host extension
* Remove debug message
* Use pre-generated table for blend functions
* XML docs
* Rename AdvancedBlendMode to AdvancedBlendOp for consistency
* Remove redundant code
* Fix some advanced blend related issues on Vulkan
* Formatting
* Clear CPU side data on GPU buffer clears
* Implement tracked fill operation that can signal other resource types except buffer
* Fix tests, add missing XML doc
* PR feedback
* ava: Refactor Updater.cs
Fix typos
Remove unused usings
Rename variables to follow naming scheme
* ava: Set file permissions when extracting update files
* gtk: Apply the same refactor to Updater.cs
* updater: Replace assert with if statement
* updater: Remove await usings again
* vulkan: Respect VK_KHR_portability_subset vertex stride alignment
We were hardcoding the alignment to 4, but per the spec it can be any value that is a power of 2.
This also enables VK_KHR_portability_subset if present, as the spec requires.
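For reference, rounding a stride up to an arbitrary power-of-two alignment looks like the sketch below (Python, hypothetical helper name). What the backend actually does with a misaligned stride is not described here; rounding up is only one possible reaction and is shown as an assumption.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of 2")
    return (value + alignment - 1) & ~(alignment - 1)
```

A stride of 6 becomes 8 with an alignment of 4, but 16 with an alignment of 16, which is why the hardcoded 4 was not enough.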
* address gdkchan's comment
* Make NeedsVertexBufferAlignment internal
This started as an attempt to remove vkGetPhysicalDeviceMemoryProperties in FindSuitableMemoryTypeIndex (as this could have some overhead and shouldn't change at runtime) and turned into a slightly bigger cleanup.
* vulkan: Enforce Vulkan 1.2+ at instance API level and 1.1+ at device level
This ensures we don't end up trying to initialize with anything currently incompatible.
* Address riperiperi's comment
I was forcing some types of texture to partially update when investigating performance with games that stream in data, and noticed that partially loading texture data was really broken on both backends.
Fixes Vulkan texture set by getting the correct expected size for the texture. Fixes partial upload on both backends for both Texture 2D Array and Cubemap using the wrong offset and uploading to the first layer/level for a handle. 3D might also be affected.
This might fix textures randomly having incorrect data in games that render to it - jumbled in the case of OpenGL, and outdated/black in the case of Vulkan. This case typically happens in UE4 games.
* Log shader compile errors with Warning level
These are infrequent enough that I think it's worth dumping any errors into the log. They also keep causing graphical glitches, and the only indication that anything went wrong is a debug log that is never enabled.
* Add maximum length for shader log
* Replace unicorn bindings with Nuget package
* Use nameof for ValueSource args
* Remove redundant code from test projects
* Fix wrong values for EmuStart()
Add notes to address this later again
* Improve formatting
* Fix formatting/alignment issues
The AutoFlushCounter would flush command buffers on any attachment change (write mask or bindings change) if there was a pending query. This is to get query results as soon as possible for draw skips, but it assumes that a full occlusion query _pass_ happened and that we want to flush its data before getting onto draws, rather than the queries being randomly interspersed throughout a pass that also draws.
Xenoblade 2 repeatedly switches between performing a samples passed query and outputting to a render target on each draw, and flips the write mask to do so. Flushing the command buffer every 2 draws isn't ideal, so it's best that we only do this if the pattern matches the large block style of occlusion query.
This change makes this flush only happen after a few consecutive query reports. "Consecutive" is interrupted by attachment changes or command buffer flush.
This doesn't really solve the issue where it resets more queries than it uses, it just stops the game doing it as often. I'm not sure of the best way to do that. The cost of resetting could probably be reduced by using query pools with more than one element and resetting in bulk.
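The heuristic can be pictured as a small counter (Python sketch; the class name and the threshold of 3 are assumptions, the text above only says "a few"). Query reports increment it, attachment changes and command buffer flushes reset it, and an attachment change only triggers a flush if enough reports happened back to back.

```python
class QueryFlushTracker:
    """Hypothetical tracker for consecutive occlusion query reports."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # assumed value for "a few"
        self.consecutive = 0

    def on_query_report(self):
        self.consecutive += 1

    def on_command_buffer_flush(self):
        # A flush interrupts the run of consecutive reports.
        self.consecutive = 0

    def on_attachment_change(self):
        # Flush only if the reports looked like one large query block.
        should_flush = self.consecutive >= self.threshold
        self.consecutive = 0
        return should_flush
```

A game that alternates one query and one draw never reaches the threshold, while a dedicated occlusion pass does.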
* Handle mismatching texture size with copy dependencies
* Create copy and render textures with the minimum possible size
* Only align width for comparisons, assume that height is always exact
* Fix IsExactMatch size check
* Allow sampler and copy textures to match textures with larger width
* Delete texture ChangeSize related code
* Move AdjustSize to TextureInfo and give it a better name, adjust usages
* Fix GetMinimumWidthInGob when minimumWidth > width
* Only update render targets that are actually cleared for clear
Avoids creating textures with incorrect sizes
* Delete UpdateRenderTargetState method that is not needed anymore
Clears now only ever sets the render targets that will be cleared rather than all of them
* Support safe blit on non-2D textures (except multisample)
* Change safe blit with different levels and layers to match CmdBlitImage path
* Remove now unused variables
* Multisample safe blit support
* Initial Apple Hypervisor based CPU emulation implementation
* Add UseHypervisor Setting
* Add basic MacOS support to Avalonia
* Fix initialization
* Fix GTK build
* Fix/silence warnings
* Change exceptions to asserts on HvAddressSpaceRange
* Replace DllImport with LibraryImport
* Fix LibraryImport
* Remove unneeded usings
* Revert outdated change
* Set DiskCacheLoadState when using hypervisor too
* Fix HvExecutionContext PC value
* Address PR feedback
* Use existing entitlements.xml file on distribution folder
---------
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
* Create bug_report.yml
* Update bug_report.yml
* Update bug_report.yml
* Create feature_request.yml
* Update feature_request.yml
* Update feature_request.yml
* Update feature_request.yml
* Update feature_request.yml
* a
* Update missing_cpu_instruction.yml
* Update missing_cpu_instruction.yml
* Update missing_cpu_instruction.yml
* Update missing_cpu_instruction.yml
* b
* addressed some of the feedback
* forget the label
* added missing text inputs
* formatting changes
* dropdown menu
added dropdown menu for os, idk if we will keep this
* addressed feedback
addressed the long overdue feedback, sorry about that
* added markdowns
everything should be addressed now i hope
* game version optional
made game version optional after further feedback
* feature request checkbox
* Relax Vulkan requirements
* Fix MaxColorAttachmentIndex
* Fix ColorBlendAttachmentStateCount value mismatch for background pipelines
* Change query capability check to check for pipeline statistics query rather than geometry shader support
* Reset queries on same command buffer
Vulkan seems to complain when the queries are reset on another command buffer. No idea why, the spec really could be written better in this regard. This fixes complaints, and hopefully any implementations that care extensively about them.
This change _guesses_ how many queries need to be reset and resets as many as possible at the same time to avoid splitting render passes. If it resets too many queries, we didn't waste too much time - if it runs out of resets it will batch reset 10 more.
The number of queries reset is the maximum number of queries in the last 3 frames. This has been worked into the AutoFlushCounter so that it only resets up to 32 if it is yet to force a command buffer submission in this attachment.
This is only done for samples passed queries right now, as they have by far the most resets.
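A simplified sketch of the guessing logic (Python; all names are hypothetical, and the cap of 32 tied to the AutoFlushCounter is left out). It resets the maximum count seen over the last 3 frames up front, and falls back to batches of 10 when the guess was too low.

```python
from collections import deque


class QueryResetPredictor:
    """Hypothetical model of guessing how many queries to reset in bulk."""

    def __init__(self, history_frames=3, extra_batch=10):
        self._history = deque(maxlen=history_frames)  # queries used per recent frame
        self._extra_batch = extra_batch
        self._used_this_frame = 0
        self._reset_available = 0
        self.reset_calls = []  # size of every bulk reset, kept for inspection

    def _bulk_reset(self, count):
        self.reset_calls.append(count)
        self._reset_available += count

    def begin_frame(self):
        self._used_this_frame = 0
        self._reset_available = 0
        predicted = max(self._history, default=0)
        if predicted > 0:
            self._bulk_reset(predicted)

    def use_query(self):
        if self._reset_available == 0:
            # Ran out of pre-reset queries: reset another small batch.
            self._bulk_reset(self._extra_batch)
        self._reset_available -= 1
        self._used_this_frame += 1

    def end_frame(self):
        self._history.append(self._used_this_frame)
```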
* Address Feedback
* Allow setting texture data from 1x to fix some textures resetting randomly
Expected targets:
- Deltarune 1+2
- Crash Team Racing
- Those new pokemon games idk
* Allow scaling of MSAA textures, propagate scale on copy.
* Fix Rebase
Oops
* Automatic disable
* A bit more aggressive
* Without the debug log
* Actually decrement the score when writing.
The only guarantee of the occlusion query type in Vulkan is that it will be zero when no samples pass, and non-zero when any samples pass. Of course, most GPUs implement this by just placing the # of samples in the result and calling it a day. However, this lax restriction means that GPUs could just report a boolean (1/0) or report a value after one is recorded, but before all samples have been counted.
MoltenVK falls in the first category - by default it only reports 1/0 for occlusion queries. Thankfully, there is a feature and flag that you can use to force compatible drivers to provide a "precise" query result, that being the real # of samples passed.
Should fix ink collision in Splatoon 2/3 on MoltenVK.
We currently load only one RomFs at a time, which could be wrong if one day we want to load more than one guest at a time.
This PR fixes that by loading romfs by pid.
* Implement support for page sizes > 4KB
* Check and work around more alignment issues
* Was not meant to change this
* Use MemoryBlock.GetPageSize() value for signal handler code
* Do not take the path for private allocations if host supports 4KB pages
* Add Flags attribute on MemoryMapFlags
* Fix dirty region size with 16kb pages
Would accidentally report a size that was too high (generally 16k instead of 4k, uploading 4x as much data)
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
* Add short duration texture cache
This texture cache takes textures that lose their last pool reference and keeps them alive until the next frame, or until an incompatible overlap removes it. This is done since under certain circumstances, a texture's reference can be wiped from a pool despite it still being in use - though typically the reference will return when rendering the next frame.
While this may slightly increase texture memory usage when quickly going through a bunch of temporary textures, it's still bounded due to the overlap removal rule.
This greatly increases performance in Hyrule Warriors: Age of Calamity. It may positively affect some UE4 games which dip framerate severely under certain circumstances.
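Roughly, the cache behaves like the two-generation dictionary below (Python sketch, not the real implementation). The names, the `dispose` callback, and the exact lifetime are assumptions: here an entry survives one frame boundary and is disposed at the next unless it is revived or removed by an overlap first.

```python
class ShortTextureCache:
    """Hypothetical short-lived cache for textures that lost their last pool reference."""

    def __init__(self, dispose):
        self._dispose = dispose  # callback that frees a texture
        self._current = {}       # released during the current frame
        self._previous = {}      # released during the previous frame

    def add(self, descriptor, texture):
        self._current[descriptor] = texture

    def try_revive(self, descriptor):
        # Hand the texture back if a pool asks for the same descriptor again.
        for generation in (self._current, self._previous):
            if descriptor in generation:
                return generation.pop(descriptor)
        return None

    def remove_overlapped(self, descriptor):
        # An incompatible overlap evicts the entry immediately.
        texture = self.try_revive(descriptor)
        if texture is not None:
            self._dispose(texture)

    def end_frame(self):
        # Entries that survived a whole frame without being revived are freed.
        for texture in self._previous.values():
            self._dispose(texture)
        self._previous = self._current
        self._current = {}
```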
* Small optimization
* Don't forget this.
* Add short cache dictionary
* Address feedback
* Address some feedback
* Ava: Fixes "Hide Cursor on Idle" for Windows
* Add check in MouseDriver and reduce the time of idling
* Fix linux error
* Change idle time everywhere for consistencies
* Add MVK basics.
* Use appropriate output attribute types
* 4kb vertex alignment, bunch of fixes
* Add reduced shader precision mode for mvk.
* Disable ASTC on MVK for now
* Only request robustness2 when it is available.
* It's just the one feature actually
* Add triangle fan conversion
* Allow NullDescriptor on MVK for some reason.
* Force safe blit on MoltenVK
* Use ASTC only when formats are all available.
* Disable multilevel 3d texture views
* Filter duplicate render targets (on backend)
* Add Automatic MoltenVK Configuration
* Do not create color attachment views with formats that are not RT compatible
* Make sure that the host format matches the vertex shader input types for invalid/unknown guest formats
* Fix rebase for Vertex Attrib State
* Fix 4b alignment for vertex
* Use asynchronous queue submits for MVK
* Ensure color clear shader has correct output type
* Update MoltenVK config
* Always use MoltenVK workarounds on MacOS
* Make MVK supersede all vendors
* Fix rebase
* Various fixes on rebase
* Get portability flags from extension
* Fix some minor rebasing issues
* Style change
* Use LibraryImport for MVKConfiguration
* Rename MoltenVK vendor to Apple
Intel and AMD GPUs on MoltenVK report with those vendors - only Apple silicon reports with vendor 0x106B.
* Fix features2 rebase conflict
* Rename fragment output type
* Add missing check for fragment output types
Might have caused the crash in MK8
* Only do fragment output specialization on MoltenVK
* Avoid copy when passing capabilities
* Self feedback
* Address feedback
Co-authored-by: gdk <gab.dark.100@gmail.com>
Co-authored-by: nastys <nastys@users.noreply.github.com>
* Ava: Move Ava logging to Logger.Debug
Since #4231 we currently redirect Avalonia logs to our Logger, which is pretty nice. But since it uses our Logging level too, it now leads to a massive flood in our Log files.
To avoid that, I've included the `AvaLogLevel` in the log message and made all Ava logs use `Logger.Debug`.
* Logs errors to Error and other to Debug
* missing level
* keep var
Currently in `MenuMainBarView.axaml` we list all available languages and hardcode the language name with the language key.
It's a bit bad because if we want to add a new language, we have to edit the `csproj` and the `axaml` with the translated language name and the language code.
I've put all translations in their respective locale files and added code to the `MainMenuBarView` constructor to generate the menu automatically. Now we just have to edit the `csproj` if we want to add a new language.
* Implement JIT Arm64 backend
* PPTC version bump
* Address some feedback from Arm64 JIT PR
* Address even more PR feedback
* Remove unused IsPageAligned function
* Sync Qc flag before calls
* Fix comment and remove unused enum
* Address riperiperi PR feedback
* Delete Breakpoint IR instruction that was only implemented for Arm64
Because we are building everything on Windows for release at the moment, Git defaults line endings to CRLF, causing issues when packing the Ryujinx.sh script.
This addresses the problem by enforcing LF for all files via .gitattributes.
* ava: Cleanup AppHost
This PR cleans up the AppHost file a bit (adding the infamous extra spaces to improve readability), re-sorts private vars, removes useless vars, and improves the code here and there, like the AudioBackend check.
Co-Authored-By: gdkchan <5624669+gdkchan@users.noreply.github.com>
* Remove 'renderer"
* Revert currentTime
* revert if condition
Co-authored-by: gdkchan <5624669+gdkchan@users.noreply.github.com>
* headless: Fix typos in command line options
* Remove nullable from command line options
Add EnableMacroHLE option
Add HideCursorOnIdle option
* headless: Adjust enable-ptc help text
* headless: Use switch statement instead of if-else chain
* headless: Improve formatting for long constructors
* headless: Remove discards from SDL_ShowCursor()
* headless: Add window icon
* Fix hiding cursor on idle
At least on Wayland, SDL2 doesn't produce any mouse motion events.
* Add new command line args: BaseDataDir and UserProfile
* headless: Read icon from embedded resource
* headless: Skip SetWindowIcon() on Windows if dll isn't present
* headless: Fix division by zero
* headless: Fix command line options not working correctly
* headless: Fix crash when viewing command line options
* headless: Load window icon bmp from memory
* Add comment to the workaround for SDL_LoadBMP_RW
* headless: Enable logging to file by default
* headless: Add 3 options for --hide-cursor
Replaces --disable-hide-cursor-on-idle
* Horizon: Impl Prepo, Fixes bugs, Clean things
* remove ToArray()
* resultCode > status
* Remove old services
* Addresses gdkchan's comments and more cleanup
* Addresses Gdkchan's feedback 2
* Reorganize services, make sure service are loaded before guest
Co-Authored-By: gdkchan <5624669+gdkchan@users.noreply.github.com>
* Create interfaces for lm and sm
Co-authored-by: gdkchan <5624669+gdkchan@users.noreply.github.com>
Avalonia seems to not like it when the artifact doesn't match the root namespace...
Address that by moving the binary to "Ryujinx" like we do on macOS build.
* IPC refactor part 3 + 4: New server HIPC message processor with source generator based serialization
* Make types match on calls to AlignUp/AlignDown
* Formatting
* Address some PR feedback
* Move BitfieldExtensions to Ryujinx.Common.Utilities and consolidate implementations
* Rename Reader/Writer to SpanReader/SpanWriter and move to Ryujinx.Common.Memory
* Implement EventType
* Address more PR feedback
* Log request processing errors since they are not normal
* Rename waitable to multiwait and add missing lock
* PR feedback
* Ac_K PR feedback
* chore: Update tests dependencies
* Apply TSR Berry suggestion to add a GC.SuppressFinalize in MemoryBlock.cs
* Ensure we wait for the test thread to be dead on PartialUnmap
* Use platform attribute for os specific tests
* Make P/Invoke methods private
* Downgrade NUnit3TestAdapter to 4.1.0
* test: Disable warning about platform compat for ThreadLocalMap()
Co-authored-by: TSR Berry <20988865+TSRBerry@users.noreply.github.com>
* Filter “._” files from the game list
* Filter all hidden files from the game list
* Fix style
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* merge OR expression into a pattern
* migrate from GetFiles/Directories to Enumerate
* Remove GetFilesInDirectory()
* Update Ryujinx.Ui.Common/App/ApplicationLibrary.cs
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* add error handling
* code cleanup
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Change AggregateType to include vector type counts
* Replace VariableType uses with AggregateType and delete VariableType
* Support new local vector types on SPIR-V and GLSL
* Start using vector outputs for texture operations
* Use vectors on more texture operations
* Use vector output for ImageLoad operations
* Replace all uses of single destination texture constructors with multi destination ones
* Update textureGatherOffsets replacement to split vector operations
* Shader cache version bump
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Vulkan: Don't flush commands when creating most sync
When the WaitForIdle method is called, we create sync as some internal GPU method may read back written buffer data. Some games randomly intersperse compute dispatch into their render passes, which results in this happening an unbounded number of times depending on how many times they run compute.
Creating sync in Vulkan is expensive, as we need to flush the current command buffer so that it can be waited on. We have a limited number of active command buffers due to how we track resource usage, so submitting too many command buffers will force us to wait for them to return to the pool.
This PR allows less "important" sync (things which are less likely to be waited on) to wait on a command buffer's result without submitting it, instead relying on AutoFlush or another, more important sync to flush it later on.
Because of the possibility of us waiting for a command buffer that hasn't submitted yet, any thread needs to be able to force the active command buffer to submit. The ability to do this has been added to the backend multithreading via an "Interrupt", though it is not supported without multithreading.
OpenGL drivers should already be doing something similar so they don't blow up when creating lots of sync, which is why this hasn't been a problem for these games over there.
Improves Vulkan performance on Xenoblade DE, Pokemon Scarlet/Violet, and Zelda BOTW (still another large issue here)
* Add strict argument
This is technically a separate concern from whether the sync is a host syncpoint.
* Remove _interrupted variable
* Actually wait for the invoke
This is required by AMD GPUs, and also may have caused some issues on other GPUs.
* Remove unused using.
* I don't know why it added these ones.
* Address Feedback
* Fix typo
* haydn: Add support for PCMFloat, PCM32 and PCM8 conversions
This adds support in the compatibility layer for sample formats other than PCM16.
This should help extend compatibility with soundio on devices that don't expose PCM16.
I omitted PCM24 conversion for now as it's not the simplest of all.
* Address TSRBerry's comment
* Address comments
* Fix conversion issue and clean up saturation usage
* Revert saturation changes
* Address gdkchan's comment
* Add conversion for 16 bit RGBA formats (not supported in Rosetta)
* Rebase fix
Rebase fix
* Forgot to remove this
* Fix RGBA16 format conversion
* Add RGBA4 -> RGBA8 conversion
* Handle host stride alignment
* Address Feedback Part 1
* Can't count
* Don't zero out rgb when alpha is 0
* Separate RGBA4 and 5-bit component formats
Not sure of a better way to name them...
* Add A1B5G5R5 conversion
* Put this in the right place.
* Make format naming consistent for capabilities
* Change method names
* Generic Math Update
Updated Several functions in Ryujinx.Common/Utilities/BitUtils to use generic math
* Updated BitUtil calls
* Removed Whitespace
* Switched decrement
* Fixed changed method calls.
The method calls were originally changed by accident because I relied too much on IntelliSense doing stuff for me
* Update Ryujinx.Common/Utilities/BitUtils.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* bsd::RecvFrom: Ryujinx does not verify output buffer size before writing socket address
* Calculate the size of BsdSockAddr
* use bsdSockAddr variable
* Update Ryujinx.HLE/HOS/Services/Sockets/Bsd/IClient.cs
Co-authored-by: Mary-nyan <thog@protonmail.com>
* Update Ryujinx.HLE/HOS/Services/Sockets/Bsd/Types/BsdSockAddr.cs
Co-authored-by: Mary-nyan <thog@protonmail.com>
* set errno to ENOMEM in case we can't write the address to memory
* Update Ryujinx.HLE/HOS/Services/Sockets/Bsd/IClient.cs
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Update Ryujinx.HLE/HOS/Services/Sockets/Bsd/IClient.cs
Co-authored-by: Mary-nyan <thog@protonmail.com>
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Replace Array.Clear(x, 0, x.Length) with Array.Clear(x)
* Use DateTime.UnixEpoch field
* Replace SHA256.ComputeHash calls with static SHA256.HashData call
More performant and avoids the need to initialize a SHA256 instance.
This fixes a warning on "慟哭そして…" by correctly handling the debug mode flag.
When debug mode isn't enabled, opening /dev/nvhost-dbg-gpu or /dev/nvhost-prof-gpu should fail with a not implemented error code.
This implements this behaviour and also defines stubbed interfaces for completeness.
I noticed a weirdly high cost for dictionary accesses from MarkLabel etc. Turns out that the hash code was always the same for labels, so the whole point of having a dictionary was missed and it was putting everything in the same bucket. I made it always hash the _data pointer as that's a good source of identifiable and "random" data.
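The effect is easy to reproduce outside the JIT. The Python sketch below is not ARMeilleure code: `Label` is a stand-in type and `id(self.data)` plays the role of hashing the data pointer. It counts equality comparisons during lookups with a constant hash versus an identity-based one.

```python
class Label:
    """Stand-in for an operand wrapper whose hash can be constant or pointer-based."""

    eq_calls = 0

    def __init__(self, data, use_pointer_hash):
        self.data = data
        self.use_pointer_hash = use_pointer_hash

    def __eq__(self, other):
        Label.eq_calls += 1
        return isinstance(other, Label) and self.data is other.data

    def __hash__(self):
        # id() stands in for the address of the underlying data.
        return id(self.data) if self.use_pointer_hash else 0


def count_comparisons(use_pointer_hash, count=200):
    """Fill a dict with labels, then count equality checks needed to look them all up."""
    payloads = [object() for _ in range(count)]
    table = {Label(p, use_pointer_hash): i for i, p in enumerate(payloads)}
    Label.eq_calls = 0
    for i, p in enumerate(payloads):
        assert table[Label(p, use_pointer_hash)] == i
    return Label.eq_calls
```

With the constant hash every key lands on the same probe chain, so each lookup degenerates into a linear scan; with the pointer-based hash it stays close to one comparison per lookup.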
* ARMeilleure: Add AVX512{F,VL,DQ,BW} detection
Add `UseAvx512Ortho` and `UseAvx512OrthoFloat` optimization flags as
short-hands for `F+VL` and `F+VL+DQ`.
* ARMeilleure: Add initial support for EVEX instruction encoding
Does not implement rounding, or exception controls.
* ARMeilleure: Add `X86Vpternlogd`
Accelerates the vector-`Not` instruction.
* ARMeilleure: Add check for `OSXSAVE` for AVX{2,512}
* ARMeilleure: Add check for `XCR0` flags
Add XCR0 register checks for AVX and AVX512F, following the guidelines in sections 14.3 and 15.2 of the Intel Architecture Software Developer's Manual.
* ARMeilleure: Increment InternalVersion
* ARMeilleure: Remove redundant `ReProtect` and `Dispose`, formatting
* ARMeilleure: Move XCR0 procedure to GetXcr0Eax
* ARMeilleure: Add `XCR0` to `FeatureInfo` structure
* ARMeilleure: Utilize `ReadOnlySpan` for Xcr0 assembly
Avoids an additional allocation
* ARMeilleure: Formatting fixes
This fixes an error from #3805 that caused a wrong conversion of ``AppKeyValueStorage`` to string.
As that information isn't really relevant without appropriate parsing, it was removed from ``ToString``.
This should get rid of the "bell warning" in Mario Kart 8 when entering Time Trials.
* Vulkan: enable VK_EXT_custom_border_color features
radv only creates the border color BO if this feature is enabled, so it crashed when creating samplers with custom border colors.
Fixes #4072
Fixes #3993
* Address gdkchan's comment
Co-authored-by: Mary <mary@mary.zone>
This was meant to be only an upgrade of how we set Unix permissions in the updater to use the new .NET 7 APIs, but I ended up finding bugs along the way.
Changelog:
- Remove direct usage of chmod to use File.SetUnixFileMode.
- Fix command line being broken when updating (same as #3744, but on Ryujinx.Ava).
- Make the Ryujinx.Ava updater fall back to the Ryujinx executable if the current name isn't found.
- Make permission setter function more generic.
* bsd: Add gdkchan's Select implementation
Co-authored-by: TSRBerry <20988865+tsrberry@users.noreply.github.com>
* bsd: Fix Select() causing a crash with an ArgumentException
.NET Sockets have to be used for the Select() call
* bsd: Make Select more generic
* bsd: Adjust namespaces and remove unused imports
* bsd: Fix NullReferenceException in Select
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* audio: Rewrite SoundIo bindings
This rewrites the SoundIo bindings to be safer and not a pedantic autogenerated mess.
* Address comments
* Switch DllImport to LibraryImport
* Address gdkchan's comment
We only used it in one spot for DPI scaling factor.
This implements the same behaviour using gdiplus.
This removes 700KB of dependency to download and around 170KB unpacked.
* hle: Do not add disabled AoC item to the list
We currently add all AoC items to a list in `ContentManager`, and the enabled check is only done when the FS service asks for the data, which is wrong. It causes an issue in MK8D, which doesn't boot even if you have disabled a DLC that isn't updated.
I've fixed it by not adding the disabled AoC item to the list, I've removed some duplicate code too.
There is still an edge case because we currently don't check the AoC item version, but that should be fixed later since MK8D now throws an error if the DLC isn't updated.
* remove useless "enabled"
We have a conversion from LDG on the compute shader to a special constant buffer binding that's used to exceed hardware limits on compute, but it was only running if the byte offset could be identified. The fallback that checks all of the bindings at runtime only checks the storage buffers.
This PR adds checking ube ranges to the LoadGlobal fallback. This extends the changes in #4011 to only check ube entries which are accessed by the shader.
Fixes particles affected by the wind in The Legend of Zelda: Breath of the Wild. May fix other weird issues with compute shaders in some games.
Try a bunch of games and drivers to make sure they don't blow up loading constants willy-nilly from searchable buffers.
* Initial implementation of metal surface across UIs
* Fix SDL2 on windows
* Update Ryujinx/Ryujinx.csproj
Co-authored-by: Mary-nyan <thog@protonmail.com>
* Address Feedback
Co-authored-by: Mary-nyan <thog@protonmail.com>
* amadeus: Add missing compressor effect from REV11
This was in my reversing notes but seems I completely forgot to
implement it
Also took the opportunity to simplify the Limiter effect a bit.
* Remove some outdated comment
* Address gdkchan's comments
* Fix accessibility violations in ListView
* Use accent colour for favourite star
* Hide progress bar when its done
* App Data Formatting
- Added space before storage unit
- Changed so minutes have 0 decimals, and hours and days have 1
* Fix theming
* Fix mismatched corner radius
* Fix accessibility violations in GridView
* More consistency between Grid and List View
* Fix margin
* Let whitespace defocus controls
* Make all structs readonly when applicable. It should reduce amount of needless defensive copies
* Make structs with trivial boilerplate equality code record structs
* Remove unnecessary readonly modifiers from TextureCreateInfo
* Make BitMap structs readonly too
* GPU: Use lazy checks for specialization state
This PR adds a new class, the SpecializationStateUpdater, that allows elements of specialization state to be updated individually, and signal the state is checked when it changes between draws, instead of building and checking it on every draw. This also avoids building spec state when
Most state updates have been moved behind the shader state update, so that their specialization state updates make it in before shaders are fetched.
Downside: Fields in GpuChannelGraphicsState are no longer readonly. To counteract copies that might be caused by this, I pass it as `ref` when possible, though maybe `in` would be better? Not really sure about the quirks of `in` and the difference probably won't show on a benchmark.
The result is around 2 extra FPS on SMO in the usual spot. Not much right now, but it will remove costs when we're doing more expensive specialization checks, such as fragment output type specialization for macos. It may also help more on other games with more draws.
* Address Feedback
* Oops
* GPU: Swap bindings array instead of copying
Reduces work on UpdateShaderState. Now the cost is a few reference moves for arrays, rather than copying data.
Downside: bindings arrays are no longer readonly.
* Micro optimisation
* Add missing docs
* Address Feedback
This reverts commit 9677ddaa5d.
SixLabors.ImageSharp switched to a shady and vague license starting with 2.x without mentioning it in their changelog.
As a result we are staying on 1.x (licensed under Apache-2) and will seek an alternative package.
* Track buffer migrations and flush source on incomplete copy
Makes sure that the modified range list is always from the latest iteration of the buffer, and flushes earlier iterations of a buffer if the data has not been migrated yet.
* Cleanup 1
* Reduce cost for redundant signal checks on Vulkan
* Only inherit the range list if there are pending ranges.
* Fix OpenGL
* Address Feedback
* Whoops
* ava: Cleanup RenderTimer
* ava: Remove ContentControl from RendererHost
* ava: Remove unused actual scale factor
* ava: Enable UseGpu for Linux
* ava: Set better initial size & Scale the window properly
* ava: Realign properties
* ava: Use explicit type & specify where the note applies
* Ensure that vertex attribute buffer index is valid on GPU
* Remove vertex buffer validation code from OpenGL
* Remove some fields that are no longer necessary
Polygon topology wasn't really supported and would only work on OpenGL on drivers that haven't removed it. As an alternative, this PR makes all cases of polygon topology use triangle fan. The topology type and transform feedback type have not been changed, as I don't think geo shader/tfb should be used with polygons.
The OpenGL spec states:
Only convex polygons are guaranteed to be drawn correctly by the GL.
For convex polygons, triangle fan is equivalent to polygon. I imagine this is probably how it works on device, as this get-out-of-jail-free card is too enticing to pass up.
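A minimal sketch of the equivalence claimed above, in Python rather than the project's own code. The helper names and the sample pentagon are made up for illustration. For a convex polygon every fan triangle has the same winding, so the triangles tile the polygon without overlap and their areas add up to the polygon's area.

```python
def fan_triangles(vertex_count):
    """Index triples of a triangle fan anchored at vertex 0."""
    return [(0, i, i + 1) for i in range(1, vertex_count - 1)]


def signed_area(points):
    """Shoelace formula; positive for counter-clockwise winding."""
    total = 0.0
    for i in range(len(points)):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % len(points)]
        total += x0 * y1 - x1 * y0
    return total / 2.0


# Hypothetical convex pentagon, listed counter-clockwise.
pentagon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3)]
triangles = fan_triangles(len(pentagon))
triangle_areas = [
    signed_area([pentagon[a], pentagon[b], pentagon[c]]) for a, b, c in triangles
]
```

For a concave polygon some of these triangles would flip winding or spill outside the outline, which is why the guarantee only covers the convex case.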
This fixes the stat display in Pokemon S/V.
* amadeus: Allow OOB read of GC-ADPCM coefficients
Fixes "Ninja Gaiden Sigma 2" and possibly "NINJA GAIDEN 3: Razor's Edge"
* amadeus: Fix wrong variable usage in delay effect
We should transform the delay line values, not the input.
* amadeus: Update GroupedBiquadFilterCommand documentation
* amadeus: Simplify PoolMapper alignment checks
* amadeus: Update Surround delay effect matrix to REV11
* amadeus: Add drop parameter support and use 32-bit integers for estimate time
Also implement accurate ExecuteAudioRendererRendering stub.
* Address gdkchan's comments
* Address gdkchan's other comments
* Address gdkchan's comment
* bsd: Fix eventfd broken logic
This commit fixes the broken eventfd logic.
The following changes were made:
- EventFd IPC definition had its arguments inverted
- EventFd events weren't fired correctly
- Poll logic was wrong and unfinished for eventfd
- Reintroduce workaround from #3385 but in a safer way, and spawn 4 threads.
* ipc: Rework a bit for multithreads
* Clean up debug logs
* Make server thread yield when managed lock isn't available
* Fix replyTargetHandle not being added in the proper locking scope
* Simplify some scopes
* Address gdkchan's comments
* Revert IPC workaround for now
* Reintroduce the EventFileDescriptor workaround
This replacement is meant to be done with the original identified byteOffset, not the one assigned later on by the below conditionals (that already has the constant offset added, for instance).
This fixes videos being pixelated in Xenoblade 3, and other regressions that might have happened since #3847.
* common: Make BinaryReaderExtensions Read & Write take unmanaged types
This allows us to not rely on Marshal.PtrToStructure and Marshal.StructureToPtr for those.
* common: Make MemoryHelper Read & Write take unmanaged types
* Update Marshal.SizeOf => Unsafe.SizeOf when appropriate and start moving software applet to unmanaged types
* ui: Only wait on _exitEvent when MainLoop is active under GTK
This fixes a dispose issue under Horizon/GTK: we don't check whether the ApplicationClient is null, so it throws an NCE. We also don't check whether the main loop is active before waiting on an event that is set in the main loop, which could lead to a freeze.
Everything works fine in GTK now.
Related issue: https://github.com/Ryujinx/Ryujinx/issues/3873
As a side note, the same kind of issue appears in the Avalonia UI too: the firmware popup doesn't show anything and the emulator just freezes.
* TSRBerry's change
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* Fix Avalonia crashing/freezing
* Add Avalonia OpenGL fixes
* Fix firmware popup on windows
* Fixes everything
* Add _initialized bool to VulkanRenderer and OpenGL Window
Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
* GAL: Send all buffer assignments at once rather than individually
The `(int first, BufferRange[] ranges)` method call has very significant performance implications when the bindings are spread out, which they generally always are in Vulkan. This change makes it so that these methods are only called a maximum of one time per draw.
Significantly improves GPU thread performance in Pokemon Scarlet/Violet.
* Address Feedback
Removed SetUniformBuffers(int first, ReadOnlySpan<BufferRange> buffers)
* GPU: Access non-prefetch command buffers directly
Saves allocating new arrays for them constantly - they can be quite small so it can be very wasteful. About 0.4% of GPU thread in SMO, but was a bit higher in S/V when I checked.
Assumes that non-prefetch command buffers won't be randomly clobbered before they finish executing, though that's probably a safe bet.
* Small change while I'm here
* Address feedback
I did this on ncbuffer2 when we were using it for LDN 3, but I noticed that it can apply to the current buffer manager too, and it's an easy performance win.
The only buffer access that can come from another thread is the overlap search for buffers that have been unmapped. Everything else, including modifications, come from the main GPU thread. That means we only need to lock the range list when it's being modified, as that's the only time where we'll cause a race with the unmapped handler.
This gives a significant performance improvement in situations where FIFO is high, like the other two PRs. Joined together they give a nice boost (73.6 master -> 79 -> 83 fps in SMO).
Since we moved to .NET 7, JsonSerializer now needs to have explicit options as arguments, which leads to some warnings in the Avalonia project. This is fixed by using our `JsonHelper` class.
* Update to LibHac 0.17.0
* Don't clear SD card saves when starting the emulator
This was an old workaround for errors that happened when a user's SD card encryption seed changed. SD card saves have been unencrypted for over a year, so we should be fine to remove the workaround.
* unicorn: Add modified ver of unicorn's const gen
* unicorn: Use upstream consts
These consts were generated from the dev branch of unicorn
* unicorn: Split common consts into multiple enums
* unicorn: Remove arch prefix from consts
* unicorn: Add new windows dll
Windows 10 - MSVC x64 shared build
* unicorn: Use absolute path for const generation
* unicorn: Remove fspcr patch
* unicorn: Fix using the wrong file extension
For some reason _NativeLibraryExtension evaluates to ".so" even on Windows.
* unicorn: Add linux shared object again
* unicorn: Add DllImportResolver
* unicorn: Try to import unicorn using an absolute path
* unicorn: Add clean target
* unicorn: Replace IsUnicornAvailable() methods
* unicorn: Skip tests instead of silently passing them if unicorn is missing
* unicorn: Write error message to stderr
* unicorn: Make Interface static
* unicorn: Include prefixed unicorn libs (libunicorn.so)
Co-authored-by: merry <git@mary.rs>
* unicorn: Add lib prefix to shared object for linux
Co-authored-by: merry <git@mary.rs>
A quick fix to prevent reading the wrong value of Count when reregistering ranges for a new target buffer. Buffer flushes from another thread can modify the range list when the lock isn't active, which can change the count.
This prevents some crashes in Pokemon Scarlet/Violet. It's likely that buffer migration during flush is causing some other issues in this game, but this at least prevents the crashing.
* Vulkan: Don't create preload buffer outside a render pass
The preload command buffer is used to avoid render pass splits and barriers when updating buffer data. However, when a render pass is not active (for example, at the start of a pass, or during compute invocations) buffer uploads can be performed at any time, so the optimization isn't as useful.
This PR makes it so that the preload command buffer is only used for buffer updates while a render pass is active. It's still used for textures as I don't want to shake things up right now regarding how the preload buffer is obtained before some other changes, and texture updates are a lot rarer anyway.
Improves performance slightly in Pokemon Scarlet/Violet (43 -> 48), as it was switching to compute, writing a bunch of buffers inline, then dispatching, then flushing commands... It uses 1 command buffer instead of 2 every time it does this now. Maybe it would be nice to find a faster way to sync without creating so many command buffers in a short period of time.
* Address feedback
* Prune ForceDirty and CheckModified caches on unmap
Since we're now using this for modified checks on the HLE indirect draw method, I'm worried that leaving these to forever gather cache entries isn't the best idea for performance in the long term, and it could keep old buffer objects alive for longer than they should be.
This PR adds the ability to prune invalid entries before checking these caches, and queues it whenever GPU memory is unmapped. It also aligns modified checks to the page size, as I figured it would be possible for a huge number of overlapping entries to accumulate over a game's runtime.
This prevents Super Mario Odyssey from having tens of thousands of entries in the modified cache in Metro Kingdom, and them duplicating when entering and leaving a building (should be cleared, as they were unmapped).
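A rough sketch of why page alignment keeps the cache small, written in Python for illustration only. The 4 KiB page size and the helper name are assumptions, not values taken from the PR.

```python
PAGE_SIZE = 0x1000  # assumed page size (4 KiB)


def align_to_pages(address, size, page_size=PAGE_SIZE):
    """Expand [address, address + size) outwards to page boundaries."""
    start = address & ~(page_size - 1)
    end = (address + size + page_size - 1) & ~(page_size - 1)
    return start, end - start


# Three slightly different checks inside the same page...
checks = [(0x1000, 16), (0x1008, 32), (0x1FF0, 8)]

# ...collapse to a single cache key once aligned.
aligned_keys = {align_to_pages(address, size) for address, size in checks}
```

Without alignment each distinct (address, size) pair would become its own cache entry.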
* Address Feedback
* am: Stub GetSaveDataSizeMax()
* am: Remove todo comment for GetSaveDataSizeMax()
* am: saveDataSize & journalDataSize should be of type long
* am: Add explanation for returning default values in GetSaveDataSizeMax()
* Use ReadOnlySpan<byte> compiler optimization in more places
* Revert changes in ShaderBinaries.cs
* Remove unused using;
* Use ReadOnlySpan<byte> compiler optimization in more places
* Allow _volatile to be set from MultiRegionHandle checks again
Tracking handles have a `_volatile` flag which indicates that the resource being tracked is modified every time it is used under a new sequence number. This is used to reduce the time spent reprotecting memory for tracking writes to commonly modified buffers, like constant buffers.
This optimisation works by detecting if a buffer is modified every time a check happens. If a buffer is checked but it is not dirty, then that data is likely not modified every sequence number, and should use memory protection for write tracking. If the opposite is the case all the time, it is faster to just assume it's dirty as we'd just be wasting time protecting the memory.
The new MultiRegionBitmap could not notify handles that they had been checked as part of the fast bitmap lookup, so bindings larger than 4096 bytes wouldn't trigger it at all. This meant that they would be subject to a ton of reprotection if they were modified often.
This does mean there are two separate sources for a _volatile set: VolatileOrDirty + _checkCount, and the bitmap check. These shouldn't interfere with each other, though.
This fixes performance regressions from #3775 in Pokemon Sword, and hopefully Yu-Gi-Oh! RUSH DUEL: Dawn of the Battle Royale. May affect other games.
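The heuristic described above can be sketched roughly as follows. This is a simplified Python model, not the actual tracking code: the class, the threshold of 5, and the `reprotect_count` counter are all hypothetical and only exist to show the idea of "dirty on every check means stop protecting".

```python
class TrackedHandle:
    """Toy model of a tracking handle with a volatile flag (hypothetical)."""

    def __init__(self, volatile_threshold=5):
        self.dirty = True
        self.is_volatile = False
        self.reprotect_count = 0  # how often we paid for memory protection
        self._dirty_streak = 0
        self._threshold = volatile_threshold

    def signal_write(self):
        """Called when a tracked write hits the protected memory."""
        self.dirty = True

    def check(self):
        """Return True if the resource must be treated as modified."""
        if self.is_volatile:
            return True  # assume dirty, skip reprotection entirely
        if not self.dirty:
            self._dirty_streak = 0  # not modified every time: keep tracking
            return False
        self._dirty_streak += 1
        if self._dirty_streak >= self._threshold:
            self.is_volatile = True
        else:
            self.dirty = False
            self.reprotect_count += 1  # reprotect to catch the next write
        return True


# A buffer written before every check becomes volatile after a few checks.
hot = TrackedHandle()
for _ in range(10):
    hot.signal_write()
    hot.check()

# A buffer written only before every other check never does.
cold = TrackedHandle()
for i in range(10):
    if i % 2 == 0:
        cold.signal_write()
    cold.check()
```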
* Fix stupid mistake
The type in the `texOp` in the textureSize instruction doesn't have the exact type on SPIR-V (for example, it is missing the Array flag). This PR gives it the proper type before giving it to the unscaling helper.
This fixes the ground textures being broken on Pokemon Scarlet/Violet when scaling. It wasn't finding the texture, so the descriptor index it provided was -1...
* Eliminate CB0 accesses
Still some work to do, decouple from hle?
* Forgot the important part somehow
* Fix and improve alignment test
* Address Feedback
* Remove some complexity when checking storage buffer alignment
* Update Ryujinx.Graphics.Shader/Translation/Optimizations/GlobalToStorage.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Thread ID Register, Floating-point Control Register, and Floating-point Status Register all had Register capitalized, so the Register in Processor State register should be capitalized.
For some reason, my fresh installation of Fedora 36 (KDE) doesn't have a
symlink for libX11.so.
This commit fixes this by trying to import the library with its major
version, falling back to the normal way otherwise.
* Revert "Add support for releasing a semaphore to DmaClass (#2926)"
This reverts commit 521a07e612.
* Revert "Revert "Add support for releasing a semaphore to DmaClass (#2926)""
This reverts commit ec8a5fd053.
* Strip non-visible control codes from strings before they are sent to the software keyboard to prevent ugly unicode blocks from being shown on the UI.
* remove debugging junk
* Initialize stringbuilder capacity at the start to prevent resizing (a tiny tiny microoptimization)
* Update remarks documentation. Remove unneeded imports.
* Removing a test that's actually just redundant
Co-authored-by: Logan Stromberg <lostromb@microsoft.com>
`MB` and `GB` can be interpreted as either base-10 or base-2 units.
`MiB` and `GiB` remove this ambiguity, so that units of memory
are always interpreted using base-2 units.
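To make the difference concrete, here is a small Python sketch (the function name is made up for illustration) that formats a byte count with base-2 units:

```python
def format_binary_size(num_bytes):
    """Format a byte count using base-2 (IEC) units."""
    for unit, factor in (("GiB", 1 << 30), ("MiB", 1 << 20), ("KiB", 1 << 10)):
        if num_bytes >= factor:
            return f"{num_bytes / factor:.2f} {unit}"
    return f"{num_bytes} B"


four_gib = format_binary_size(4 * 1024**3)       # exactly 4 GiB
four_gb = format_binary_size(4 * 1000**3)        # "4 GB" in base-10 terms
```

The same "4 GB" label therefore covers two sizes that differ by roughly 7%.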
* Implement HLE macro for DrawElementsIndirect
* Shader cache version bump
* Use GL_ARB_shader_draw_parameters extension on OpenGL
* Fix DrawIndexedIndirectCount on Vulkan when extension is not supported
* Implement DrawIndex
* Alignment
* Fix some validation errors
* Rename BaseIds to DrawParameters
* Fix incorrect index buffer and vertex buffer size in some cases
* Add HLE macros for DrawArraysInstanced and DrawElementsInstanced
* Perform a regular draw when indirect data is not modified
* Use non-indirect draw methods if indirect buffer was not GPU modified
* Only check if draw parameters match if the shader actually uses them
* Expose Macro HLE setting on GUI
* Reset FirstVertex and FirstInstance after draw
* Update shader cache version again since some people already tested this
* PR feedback
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
* Ava: Keep command line args when restarting
* UI: Move common UI functions to ProgramHelper
Add command line option to override the configured graphics backend
* Ava: Add CleanupUpdate task back
* Remove unused usings
* Revert combining common UI functions
Rename ProgramHelper to CommandLineState
Move command line parsing to CommandLineState
* Rename CommandLineProfile to Profile
* Fix assigning the wrong array to Arguments
* Update readme to mention .NET 7
* infra: Migrate to .NET 7
.NET 7 is still in preview, but this prepares for the release coming up
next month.
* Use Random.Shared in CreateRandom
* Move UInt128Utils.cs to Ryujinx.Common project
* Fix inverted parameters in System.UInt128 constructor
* Fix Visual Studio complaints on Ryujinx.Graphics.Vic
* time: Fix missing alignment enforcement in SystemClockContext
Fixes at least Smash
* time: Fix missing alignment enforcement in SteadyClockContext
Fixes games (like recent versions of Smash) using time shared memory
* Switch to .NET 7.0.100 release
* Enable Tiered PGO
* Ensure CreateId validity requirements are met when doing random generation
Also enforce correct packing layout for other Mii structures.
This fixes Mario Kart 8 crashes related to the default Miis.
* Vulkan: Implement multisample <-> non-multisample copies and depth-stencil resolve
* FramebufferParams is no longer required there
* Implement Specialization Constants and merge CopyMS Shaders (#15)
* Vulkan: Initial Specialization Constants
* Replace with specialized helper shader
* Reimplement everything
Fix nonexistent interaction with Ryu pipeline caching
Decouple specialization info from data and relocate them
Generalize mapping and add type enum to better match spv types
Use local fixed scopes instead of global unmanaged allocs
* Fix misses in initial implementation
Use correct info variable in Create2DLayerView
Add ShaderStorageImageMultisample to required feature set
* Use texture for source image
* No point in using ReadOnlyMemory
* Apply formatting feedback
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Apply formatting suggestions on shader source
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Support conversion with samples count that does not match the requested count, other minor changes
Co-authored-by: mageven <62494521+mageven@users.noreply.github.com>
* Do not clear the rejit queue when overlaps count is equal to 0.
* Ptc and PtcProfiler must be invalidated.
* Revert "Ptc and PtcProfiler must be invalidated."
This reverts commit f5b0ad9d7d.
* Fix #3710 slow path due to #3701.
* Manage state of NfcManager
Very basic state management but works with Hyrule Warriors Definitive Edition. Partially fixes #2122
* Fixes changes from review
* A64: Add fast path for Fcvtas_Gp/S/V, Fcvtau_Gp/S/V and Frinta_S/V instructions;
they use "Round to Nearest with Ties to Away" rounding mode not supported in x86.
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
The titles Mario Strikers and Super Smash Bros. U. use these instructions intensively.
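For reference, the rounding mode in question differs from the default round-to-nearest-even only on exact ties. A small Python sketch (the helper is illustrative, not the emitted code, and it ignores NaN and infinities) shows the difference, since Python's built-in `round` happens to use ties-to-even:

```python
import math


def round_ties_away(x):
    """Round to nearest integer, with ties going away from zero."""
    truncated = math.trunc(x)
    # The fractional part of a float is exactly representable, so this
    # comparison does not suffer from the classic floor(x + 0.5) pitfall.
    if abs(x - truncated) >= 0.5:
        truncated += 1 if x > 0 else -1
    return truncated


ties = [0.5, 1.5, 2.5, -0.5, -2.5]
away = [round_ties_away(x) for x in ties]
even = [round(x) for x in ties]
```

Here `away` is `[1, 2, 3, -1, -3]` while `even` is `[0, 2, 2, 0, -2]`.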
* Update Ptc.cs
* A32: Add fast path for Vcvta_RM, Vrinta_RM and Vrinta_V instructions as well.
* Add new string
You need the period there otherwise it could be read as "Głoś" -> "Preach"
* Update MainWindow.axaml
Updating to bring it in line with the other languages naming themselves in their respective languages
* Update pl_PL.json
realizing that period isn't necessary considering the string's usage (which to be fair, I should have checked when I added it)
* Update pl_PL.json
* Add Updater Message
Due to the `using` statement being scoped to the `CreateTextureView` method, `TextureStorage` would be disposed as soon as the view was returned.
This was largely fine as the TextureStorage resources were being kept alive by the views holding their own references to them, but it also meant that Dispose was called as soon as the texture was created.
Aliased Storages are TextureStorages created with the same allocation as another TextureStorage, if they have to be aliased as another format. We keep track of a TextureStorage's `_aliasedStorages` as they are created, and dispose them when the TextureStorage is disposed...
...except it is disposed immediately, before any aliased storages are even created. The aliased storages added after this will never be disposed.
This PR attempts to fix this by disposing TextureStorage when its view count reaches 0. The other use of texture storage - the D32S8 blit - still manually disposes the storage, but regular uses created via the GAL are now disposed by the view count.
I think this makes the most sense, as otherwise in the future this behaviour might be forgotten and more things could be added to the Dispose() method that don't work due to it not actually being called at the right time.
This should reduce memory leaks in Super Mario Odyssey, most noticeable when resolution scaling. The memory usage of the game is still wildly unpredictable due to how it interacts with the texture cache, but now it shouldn't grow considerably as you play... I hope. I've seen it typically recover back to the same level occasionally, though it can spike significantly.
Please test a bunch of games on multiple GPUs to make sure this doesn't break anything.
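The lifetime rule described above (dispose the storage when its view count reaches 0, and take the aliased storages with it) can be sketched like this. It is a toy Python model with made-up class and method names, not the Vulkan backend's code.

```python
class TextureStorage:
    """Toy storage that lives until its last view is released."""

    def __init__(self):
        self.view_count = 0
        self.aliased_storages = []
        self.disposed = False

    def create_view(self):
        self.view_count += 1
        return TextureView(self)

    def create_aliased_storage(self):
        # Aliases can be created at any point while views are still alive.
        alias = TextureStorage()
        self.aliased_storages.append(alias)
        return alias

    def release_view(self):
        self.view_count -= 1
        if self.view_count == 0:
            self.dispose()

    def dispose(self):
        if self.disposed:
            return
        self.disposed = True
        for alias in self.aliased_storages:
            alias.dispose()


class TextureView:
    def __init__(self, storage):
        self.storage = storage
        self._released = False

    def dispose(self):
        if not self._released:
            self._released = True
            self.storage.release_view()


storage = TextureStorage()
first_view = storage.create_view()
second_view = storage.create_view()
alias = storage.create_aliased_storage()  # created after the first view

first_view.dispose()
still_alive = not storage.disposed
second_view.dispose()
```

Because disposal is deferred to the last view, the alias created late is still in the list when cleanup runs.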
* Wrap Args in quotes
- Wrap args in quotes to allow for spaces in dir paths when restarting Ryujinx from Update.
* Wrap second instance of GetCommandLineArgs()
* Changed ryuArgs from string to string[]
* Update Ryujinx.Ava/Modules/Updater/Updater.cs
Co-authored-by: mageven <62494521+mageven@users.noreply.github.com>
* Update UpdateDialog.cs
Co-authored-by: mageven <62494521+mageven@users.noreply.github.com>
* Avoid allocations in .Parse methods
Use the Span overloads of the Parse methods when possible to avoid string allocations and remove one unnecessary array allocation
* Avoid another string allocation
* Fix various issues caused by #3679
- The arguments for the 0th dummy vertex buffer were incorrect - it was given an offset of 16 rather than a size of 16.
- The wrong size was used when doing `autoBuffer.Get` on a converted vertex buffer.
- A vertex buffer being disposed and then rebound can cause rebindings to find a different buffer where the current range is out of bounds. Avoid binding when out of range to prevent validation errors.
- The above also affects generation of converted buffers, which was a bit more fatal. Conversion functions now attempt to bound input offset/size.
* Fix offset for converted buffer
Luigi's Mansion 3 performs a non-index quads draw with 6 vertices. It's meant to ignore the last two, but the index pattern's primitive count calculation was rounding up.
No idea why the game does this but this should fix random triangles in the map.
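A small sketch of the rounding issue, in Python. The function name and the exact triangle index order are assumptions for illustration; the point is only that the quad count has to round down so that leftover vertices are ignored.

```python
def quad_list_to_triangle_indices(vertex_count):
    """Build triangle-list indices for a non-indexed quad-list draw."""
    quad_count = vertex_count // 4  # round down: ignore incomplete quads
    indices = []
    for quad in range(quad_count):
        base = quad * 4
        # Two triangles per quad (hypothetical winding choice).
        indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    return indices


six_vertex_draw = quad_list_to_triangle_indices(6)
```

With 6 vertices this produces indices for one quad only. Rounding up instead would emit a second quad that references vertices 6 and 7, which were never supplied.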
* GPU: Pass SpanOrArray for Texture SetData to avoid copy
Texture data is often converted before upload, meaning that an array was allocated to perform the conversion into. However, the backend SetData methods were being passed a Span of that data, and the Multithreaded layer does `ToArray()` on it so that it can be stored for later! This method can't extract the original array, so it creates a copy.
This PR changes the type passed for textures to a new ref struct called SpanOrArray, which is backed by either a ReadOnlySpan or an array. The benefit here is that we can have a ToArray method that doesn't copy if it is originally backed by an array.
This will also avoid a copy when running the ASTC decoder.
On NieR this was taking 38% of texture upload time, which it does a _lot_ of when you move between areas, so there should be a 1.6x performance boost when strictly uploading textures. No doubt this will also improve texture streaming performance in UE4 games, and maybe a small reduction with video playback.
From the numbers, it's probably possible to improve the upload rate by a further 1.6x by performing layout conversion on GPU. I'm not sure if we could improve it further than that - multithreading conversion on CPU would probably result in memory bottleneck.
This doesn't extend to buffers, since we don't convert their data on the GPU emulator side.
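The idea behind the type can be modelled in a few lines of Python. This is only an analogy (a `bytearray` stands in for a managed array and a `memoryview` for a span), and the class below is not the real `SpanOrArray` implementation.

```python
class SpanOrArray:
    """Wraps either an owned array or a borrowed view of some data."""

    def __init__(self, data):
        # bytearray plays the role of an array, anything else is a "span".
        self._array = data if isinstance(data, bytearray) else None
        self._span = None if self._array is not None else memoryview(data)

    def to_array(self):
        if self._array is not None:
            return self._array         # no copy: hand back the original
        return bytearray(self._span)   # a view must be copied to be stored


converted = bytearray(b"abc")
from_array = SpanOrArray(converted).to_array()
from_span = SpanOrArray(memoryview(converted)[1:]).to_array()
```

`from_array` is the very same object that was passed in, while `from_span` is a fresh copy of the viewed bytes.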
* Remove implicit cast to array.
* Fix some issues with CacheByRange
- Cache now clears under more circumstances, the most important being the fast path write.
- Cache supports partial clear which should help when more buffers join.
- Fixed an issue with I8->I16 conversion where it wouldn't register the buffer for use on dispose.
Should hopefully fix issues with https://github.com/Ryujinx/Ryujinx-Games-List/issues/4010 and maybe others.
* Fix collection modified exception
* Fix accidental use of parameterless constructor
* Replay DynamicState when restoring from helper shader
* Initial GTK implementation
* Less messy and Avalonia imp
* Move clamping to HLE and streamline imps
* Make viewmodel update consistent
* Fix rebase and add an English locale.
Co-authored-by: Mary-nyan <mary@mary.zone>
* ARMeilleure: Add `GFNI` detection
This is intended for utilizing the `gf2p8affineqb` instruction
* ARMeilleure: Add `gf2p8affineqb`
Not using the VEX or EVEX form of this instruction is intentional. There
are `GFNI` chips that do not support AVX (so no VEX encoding), such as
Tremont (Lakefield) chips as well as Jasper Lake.
13df339fe7/GenuineIntel/GenuineIntel00806A1_Lakefield_LC_InstLatX64.txt (L1297-L1299)
13df339fe7/GenuineIntel/GenuineIntel00906C0_JasperLake_InstLatX64.txt (L1252-L1254)
* ARMeilleure: Add `gfni` acceleration of `Rbit_V`
Passes all `Rbit_V*` unit tests on my `i9-11900k`
* ARMeilleure: Add `gfni` acceleration of `S{l,r}i_V`
Also added a fast-path for when the shift amount is greater than the
size of the element.
* ARMeilleure: Add `gfni` acceleration of `Shl_V` and `Sshr_V`
* ARMeilleure: Increment InternalVersion
* ARMeilleure: Fix Intrinsic and Assembler Table alignment
`gf2p8affineqb` is the longest instruction name I know of. It shouldn't
get any wider than this.
* ARMeilleure: Remove SSE2+SHA requirement for GFNI
* ARMeilleure: Add `X86GetGf2p8LogicalShiftLeft`
Used to generate GF(2^8) 8x8 bit-matrices for bit-shifting for the `gf2p8affineqb` instruction.
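As a rough illustration of why an 8x8 bit-matrix over GF(2) can express a shift, here is a Python model. It is a sketch only: each row is a mask of the input bits XORed into one output bit. The real instruction packs the matrix into a 64-bit operand in its own byte order and also XORs in an immediate; neither detail is reproduced here, and the function names are made up.

```python
def affine_byte(matrix_rows, x):
    """Multiply an 8x8 GF(2) matrix by the bits of byte x."""
    result = 0
    for i, row in enumerate(matrix_rows):
        parity = bin(row & x).count("1") & 1  # XOR of the selected bits
        result |= parity << i
    return result


def shift_left_matrix(shift):
    """Output bit i takes input bit i - shift (or 0 if that does not exist)."""
    return [(1 << (i - shift)) if i >= shift else 0 for i in range(8)]


def bit_reverse_matrix():
    """Output bit i takes input bit 7 - i, as needed for a per-byte RBIT."""
    return [1 << (7 - i) for i in range(8)]


shifted = affine_byte(shift_left_matrix(3), 0b0001_0110)
reversed_bits = affine_byte(bit_reverse_matrix(), 0b0000_0001)
```

The same multiply routine handles both operations; only the matrix changes.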
* ARMeilleure: Append `FeatureInfo7Ecx` to `FeatureInfo`
* fatal: Implement Service
This PR adds a basic implementation of the fatal service, which guest processes call when something goes wrong. Since we can already get all of this information by debugging, it's not really useful.
In any case, it avoids an unimplemented service exception. Structs/Enums are based on Atmosphère source code.
After logging the error report, I call SvcBreak. Feedback is welcome on this: since some guests call it right after the fatal service, I can remove it if needed.
* Addresses gdkchan feedback
* Zero blend state when disabled or write mask is 0
Any difference in the blend state when blend is disabled is meaningless, but Ryujinx would compare different disabled blends and compile them as separate pipelines. This change ensures that all pipelines where blend state is meaningless record it as such, which avoids compiling a bunch of pipelines that are essentially identical.
The NVIDIA driver is pretty forgiving when it comes to silly pipeline misses like this, but other drivers don't offer the same level of kindness.
This should reduce stuttering on those drivers, and might improve overall performance very slightly due to less pipeline variants being in the hash table.
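A sketch of the normalization idea in Python. The field names and the dictionary-based pipeline cache are hypothetical stand-ins for the real pipeline state; the sketch only shows how zeroing meaningless state makes otherwise identical pipelines share one cache key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlendState:
    enable: bool = False
    write_mask: int = 0xF
    src_factor: int = 0
    dst_factor: int = 0
    op: int = 0


def normalize_blend(state):
    """Zero the blend parameters whenever they cannot affect the output."""
    if not state.enable or state.write_mask == 0:
        return BlendState(enable=False, write_mask=state.write_mask)
    return state


# Two disabled blends that only differ in leftover factor values.
a = BlendState(enable=False, src_factor=1, dst_factor=2, op=3)
b = BlendState(enable=False, src_factor=4, dst_factor=5, op=6)

raw_cache = {a: "pipeline", b: "pipeline"}
normalized_cache = {normalize_blend(a): "pipeline", normalize_blend(b): "pipeline"}
```

The raw cache holds two entries for what is effectively one pipeline, while the normalized cache holds one.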
* Fix blend possibly being wrong when an attachment is unmasked
* Implemented in IR the managed methods of the Saturating region ...
... of the SoftFallback class (the SatQ ones).
The need to natively manage the Fpcr and Fpsr system registers is still a fact.
Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Ptc.InternalVersion = 3665
* Addressed PR feedback.
* Implemented in IR the managed methods of the ShlReg region of the SoftFallback class.
It also includes the last two SatQ ones (following up on https://github.com/Ryujinx/Ryujinx/pull/3665).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Fpsr and Fpcr freed.
Handling/isolation of Fpsr and Fpcr via register for IR and via memory for Tests and Threads, with synchronization to context exchanges (explicit for SoftFloat); without having to call managed methods. Thanks to the inlining work of the previous two PRs and others in this.
Tests performed locally in both release and debug modes, in both lowcq and highcq, with FastFP to true and false (explicit FP tests included). Tested with the title Tony Hawk's PS.
Depends on shlreg.
* Update InstEmitSimdHelper.cs
* De-magic Masks.
Remove the Stride and Len flags; Fpsr.NZCV are A32 only, then moved to Fpscr: this leads to emitting less IR in reference to Get/Set Fpsr/Fpcr/Fpscr methods in reference to Mrs/Msr (A64) and Vmrs/Vmsr (A32) instructions.
* Addressed PR feedback.
* Add Index Buffer conversion for quads to Vulkan
Also adds a reusable repeating pattern index buffer to use for non-indexed
draws, and generalizes the conversion cache for buffers.
* Fix some issues
* End render pass before conversion
* Resume transform feedback after we ensure we're in a pass.
* Always generate UInt32 type indices for topology conversion
* No it's not.
* Remove unused code
* Rely on TopologyRemap to convert quads to tris.
* Remove double newline
* Ensure render pass ends before stride or I8 conversion
* Change navbar from compact to default and force text overflow globally
* Fix settings window
* Fix right stick control alignment
* Initialize value and add logging for SDL IDs
* Fix alignment of setting text and improve borders
* Clean up padding and size of buttons on controller settings
* Fix right side trigger alignment and correct styling
* Revert axaml alignment
* Fix alignment of volume widget
* Fix timezone autocompletebox dropdown height
* MainWindow: Line up volume status bar item
* Remove margins and add padding to volume widget
* Make volume text localizable.
Co-authored-by: merry <git@mary.rs>
* Implemented in IR the managed methods of the Saturating region ...
... of the SoftFallback class (the SatQ ones).
The need to natively manage the Fpcr and Fpsr system registers is still a fact.
Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Ptc.InternalVersion = 3665
* Addressed PR feedback.
* Implemented in IR the managed methods of the ShlReg region of the SoftFallback class.
It also includes the last two SatQ ones (following up on https://github.com/Ryujinx/Ryujinx/pull/3665).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Update InstEmitSimdHelper.cs
* OpCodeTable: Implement Hint instructions (CSDB, SEV, SEVL, WFE, WFI, YIELD)
* A64: Remove catch-all Hint instruction
* T16: Handle unallocated hint instructions
Some thumb tests execute these assuming that they're nops.
* T32: Fill out other Hint instructions
* A32: Fill out other hint instructions
* Periodically Flush Commands for Vulkan
NVIDIA's OpenGL driver has a built-in mechanism to automatically flush commands to GPU when a lot have been queued. It's also pretty inconsistent, but we'll ignore that for now.
Our Vulkan implementation only submits a command buffer (flush equivalent) when it needs to. This is typically when another command buffer needs to be sequenced after it, presenting a frame, or an edge case where we flush around GPU queries to get results sooner.
This difference in flush behaviour causes a notable difference between Vulkan and OpenGL when we have to wait for commands. In the worst case, we will wait for a sync point that has just been created. In Vulkan, this sync point is created by flushing the command buffer, and storing a waitable fence that signals its completion. Our command buffer contains _every command that we queued since the last submit_, which could be an entire frame's worth of draws.
This has a huge effect on CPU <-> GPU latency. The more commands in a command buffer, the longer we have to wait for it to complete, which results in wasted time. Because we don't know when the guest will force us to wait, we always want the smallest possible latency.
By periodically flushing, we ensure that each command buffer takes a more consistent, smaller amount of time to execute, and that the back of the GPU queue isn't as far away when we need to wait for something to happen. This also might reduce time that the GPU is left inactive while commands are being built.
The main affected game is Pokemon Sword, which got significantly faster in overworld areas due to reduced waiting time when it flushes a shadow map from the main GPU thread.
Another affected game is BOTW, which gets faster depending on the area. This game flushes textures/buffers from its game thread, which is the bottleneck.
Flush latency and throughput may be improved on other games that are inexplicably slower than OpenGL. It's possible that certain games could have their performance _decreased_ slightly due to flushes not being free, but it is unlikely.
Also, flushing to get query results sooner has been tweaked to improve the number of full draw skips that can be done. (tested in SMO)
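The latency argument can be illustrated with a toy Python model. The class and the threshold of 64 commands are hypothetical; the real heuristic for when to flush is not described here, so treat this purely as a sketch of the effect.

```python
class CommandQueue:
    """Counts how many commands pile up behind a sync point."""

    def __init__(self, flush_threshold=None):
        self.flush_threshold = flush_threshold  # None = flush only on demand
        self.pending = 0
        self.submitted_sizes = []

    def record(self):
        self.pending += 1
        if self.flush_threshold is not None and self.pending >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.pending:
            self.submitted_sizes.append(self.pending)
            self.pending = 0

    def wait_for_sync(self):
        """Return how many commands only get submitted because of the wait."""
        unsubmitted = self.pending
        self.flush()
        return unsubmitted


on_demand = CommandQueue()
periodic = CommandQueue(flush_threshold=64)
for _ in range(1000):
    on_demand.record()
    periodic.record()

on_demand_backlog = on_demand.wait_for_sync()
periodic_backlog = periodic.wait_for_sync()
```

In this model the on-demand queue has all 1000 commands still unsubmitted when the wait arrives, while the periodic queue has only the remainder since its last flush.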
* Remove unused variable
* Fix possible issue with early query flush
* Scale SamplesPassed counter by RT scale on report
Adds a scale factor for samples passed counter report based on the render target scale at the time. This ensures that when a game reads this counter, it appears similar to the result at 1x.
This doesn't cover cases where the render target scale changes during the queried draws, though that might be better to handle along with other scope-related issues in a future rework of counters. Games generally don't count for occlusion queries over render target changes anyway.
Fixes an issue in the Splatoon games where the special charge would scale too quickly at high res, points at the end of the game would be broken (but still provide a correct winner), and playing at a low res would make it impossible to swim in ink.
May also affect LOD scaling in The Witcher 3.
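A sketch of the scaling, under the assumption that the sample count grows with the rendered area and therefore with the square of the render target scale. The function name and the choice to round up are illustrative guesses, not taken from the PR.

```python
import math


def report_samples_passed(raw_samples, render_target_scale):
    """Rescale a samples-passed result to what 1x rendering would report."""
    if render_target_scale == 1.0:
        return raw_samples
    # Assumption: samples scale with area, i.e. with scale squared.
    return math.ceil(raw_samples / (render_target_scale * render_target_scale))


at_2x = report_samples_passed(400, 2.0)
at_half = report_samples_passed(25, 0.5)
```

Both calls report 100, matching what the same draw would produce at native resolution under this assumption.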
* Update Ryujinx.Graphics.Gpu/Engine/Threed/SemaphoreUpdater.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Implement Thumb (32-bit) memory (ordered), multiply and bitfield instructions
* Remove public from interface
* Fix T32 BL immediate and implement signed and unsigned extend instructions
* Implement VRSRA, VRSHRN, VQSHRUN, VQMOVN, VQMOVUN, VQADD, VQSUB, VRHADD, VPADDL, VSUBL, VQDMULH and VMLAL Arm32 NEON instructions
* PPTC version
* Fix VQADD/VQSUB
* Improve MRC/MCR handling and exception messages
In case data is being recompiled as code, we don't want to throw at the emit stage; instead, we should only throw if it actually tries to execute.
* Vertex Buffer Alignment part 1
* Update CacheByRange
* Add Stride Change compute shader, fix storage buffers in helpers
* An AMD exclusive
* Reword
* Change rules - stride conversion when attrs misalign
* Fix stupid mistake
* Fix background pipeline compile
* Improve a few things.
* Fix some feedback
* Address Feedback
(the shader binary didn't change when I changed the source to use the subgroup size)
* Fix bug where rewritten buffer would be disposed instantly.
* Implemented in IR the managed methods of the Saturating region ...
... of the SoftFallback class (the SatQ ones).
The need to natively manage the Fpcr and Fpsr system registers is still a fact.
Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Ptc.InternalVersion = 3665
* Addressed PR feedback.
We should report errors even when not requested.
This also ensures we only clear the bits that were requested on the output.
Finally, this fixes the case where input events is 0.
* Bsd: Fix NullReferenceException in BsdSockAddr.FromIPEndPoint()
Allows "Victor Vran Overkill Edition" to boot with guest internet access enabled.
Thanks to EmulationFanatic for testing this for me!
* Bsd: Return proper error code if RemoteEndPoint is null
* Remove whitespace from empty line
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Implement intrusive red-black tree, use it for HLE kernel block manager
* Implement TreeDictionary using IntrusiveRedBlackTree
* Implement IntervalTree using IntrusiveRedBlackTree
* Implement IntervalTree (on Ryujinx.Memory) using IntrusiveRedBlackTree
* Make PredecessorOf and SuccessorOf internal, expose Predecessor and Successor properties on the node itself
* Allocation free tree node lookup
This is a very old oversight in our Poll implementation.
It has worked reliably so far because games and homebrew pass the same
buffer as input and output.
* Check if game directories have been updated before refreshing list on save.
* Cleanup spacing
* Add Avalonia and reset value after saving
* Fix Avalonia
* Fix multiple directories not being added in GTK
* Added .ToString overrides, to help diagnose and debug SpirV generated code.
* Added Spirv to team shared dictionary, so the word will not show up as a warning.
* Fixed bug where we were creating invalid constants (bool 0i and float 0i)
* Update Ryujinx.Graphics.Shader/CodeGen/Spirv/CodeGenContext.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Update Spv.Generator/Instruction.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Adjusted spacing to match style of the rest of the code.
* Added handler for FP64(double) as well, for undefined aggregate types.
* Made the operand labels a static dictionary, to avoid re-allocation on each call.
Replaced Contains/Get with a TryGetValue, to reduce the number of dictionary lookups.
* Added newline between AllOperands and ToString().
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* drop split devices, rebase
* add fallback to opengl if vulkan is not available
* addressed review
* ensure present image references are incremented and decremented when necessary
* allow changing vsync for vulkan
* fix screenshot on avalonia vulkan
* save favorite when toggled
* improve sync between popups
* use separate devices for each new window
* fix crash when closing window
* addressed review
* don't create the main window with immediate mode
* change skia vk delegate to method
* update vulkan throwonerror
* addressed review
This PR adds some calls in the `am` service:
- ISelfController: SetWirelessPriorityMode, SaveCurrentScreenshot (Partially checked by RE).
- ICommonStateGetter: GetHdcpAuthenticationState
Closes #1831 and closes #3527
This was broken by the Vulkan changes - OpenGL was building host caches at boot on one thread, which is very notably slower than when it is multithreaded.
This was caused by trying to get the program binary immediately after compilation started, which blocks. Now it does it after compilation has completed.
This PR stubs ResolverSetOptionRequest (checked by RE), but the options parsing is still missing since we don't support it in our current code.
(Closes #3479)
* SPIR-V: Initialize undefined variables with a value
Changes undefined values on spir-v shaders (caused by phi nodes) to be initialized instead of truly undefined.
Fixes an issue with NVIDIA gpus seemingly not liking when a variable is _potentially_ undefined. Not sure about the details at the moment.
Fixes:
- Tilt shift blur effect in Link's Awakening (bottom of the screen)
- Potentially block flickering on newer NVIDIA gpus in Splatoon 2? Needs testing.
Testing is welcome.
* Update Ryujinx.Graphics.Shader/CodeGen/Spirv/CodeGenContext.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* WIP Vulkan implementation
* No need to initialize attributes on the SPIR-V backend anymore
* Allow multithreading shaderc and vkCreateShaderModule
You'll only really see the benefit here with threaded-gal or parallel shader cache compile.
Fix shaderc multithreaded changes
Thread safety for shaderc Options constructor
Dunno how they managed to make a constructor not thread safe, but you do you. May avoid some freezes.
* Support multiple levels/layers for blit.
Fixes MK8D when scaled, maybe a few other games. AMD software "safe" blit not supported right now.
* TextureStorage should hold a ref of the foreign storage, otherwise it might be freed while in use
* New depth-stencil blit method for AMD
* Workaround for AMD driver bug
* Fix some tessellation related issues (still doesn't work?)
* Submit command buffer before Texture GetData. (UE4 fix)
* DrawTexture support
* Fix BGRA on OpenGL backend
* Fix rebase build break
* Support format aliasing on SetImage
* Fix uniform buffers being lost when bindings are out of order
* Fix storage buffers being lost when bindings are out of order
(also avoid allocations when changing bindings)
* Use current command buffer for unscaled copy (perf)
Avoids flushing commands and renting a command buffer when fulfilling copy dependencies and when games do unscaled copies.
* Update to .net6
* Update Silk.NET to version 2.10.1
Somehow, massive performance boost. Seems like their vtable for looking up vulkan methods was really slow before.
* Fix PrimitivesGenerated query, disable Transform Feedback queries for now
Lets Splatoon 2 work on nvidia. (mostly)
* Update counter queue to be similar to the OGL one
Fixes softlocks when games had to flush counters.
* Don't throw when ending conditional rendering for now
This should be re-enabled when conditional rendering is enabled on nvidia etc.
* Update findMSB/findLSB to match master's instruction enum
* Fix triangle overlay on SMO, Captain Toad, maybe others?
* Don't make Intel Mesa pay for Intel Windows bugs
* Fix samplers with MinFilter Linear or Nearest (fixes New Super Mario Bros U Deluxe black borders)
* Update Spv.Generator
* Add alpha test emulation on shader (but no shader specialisation yet...)
* Fix R4G4B4A4Unorm texture format permutation
* Validation layers should be enabled for any log level other than None
* Add barriers around vkCmdCopyImage
Write->Read barrier for src image (we want to wait for a write to read it)
Write->Read barrier for dst image (we want to wait for the copy to complete before use)
* Be a bit more careful with texture access flags, since it can be used for anything
* Device local mapping for all buffers
May avoid issues with drivers with NVIDIA on linux/older gpus on windows when using large buffers (?)
Also some performance things and fixes issues with opengl games loading textures weird.
* Cleanup, disable device local buffers for now.
* Add single queue support
Multiqueue seems to be a bit more responsive on NVIDIA. Should fix texture flush on intel. AMD has been forced to single queue for an experiment.
* Fix some validation errors around extended dynamic state
* Remove Intel bug workaround, it was fixed on the latest driver
* Use circular queue for checking consumption on command buffers
Speeds up games that spam command buffers a little. Avoids checking multiple command buffers if multiple are active at once.
* Use SupportBufferUpdater, add single layer flush
* Fix counter queue leak when game decides to use host conditional rendering
* Force device local storage for textures (fixes linux performance)
* Port #3019
* Insert barriers around vkCmdBlitImage (may fix some amd flicker)
* Fix transform feedback on Intel, gl_Position feedback and clears to inexistent depth buffers
* Don't pause transform feedback for multi draw
* Fix draw outside of render pass and missing capability
* Workaround for wrong last attribute on AMD (affects FFVII, STRIKERS1945, probably more)
* Better workaround for AMD vertex buffer size alignment issue
* More instructions + fixes on SPIR-V backend
* Allow custom aspect ratio on Vulkan
* Correct GTK UI status bar positions
* SPIR-V: Functions must always end with a return
* SPIR-V: Fix ImageQuerySizeLod
* SPIR-V: Set DepthReplacing execution mode when FragDepth is modified
* SPIR-V: Implement LoopContinue IR instruction
* SPIR-V: Geometry shader support
* SPIR-V: Use correct binding number on storage buffers array
* Reduce allocations for Spir-v serialization
Passes BinaryWriter instead of the stream to Write and WriteOperand
- Removes creation of BinaryWriter for each instruction
- Removes allocations for literal string
* Some optimizations to Spv.Generator
- Dictionary for lookups of type declarations, constants, extinst
- LiteralInteger internal data format -> ushort
- Deterministic HashCode implementation to avoid spirv result not being the same between runs
- Inline operand list instead of List<T>, falls back to array if many operands. (large performance boost)
TODO: improve instruction allocation, structured program creator, ssa?
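To illustrate the "Deterministic HashCode" item above: some runtimes randomize default string hashes per process, so anything keyed or ordered by them can differ between runs. The sketch below uses 32-bit FNV-1a as a stand-in; it is a swapped-in technique chosen for illustration, not necessarily what Spv.Generator implements, and the function name is hypothetical.

```python
FNV_OFFSET_BASIS_32 = 0x811C9DC5
FNV_PRIME_32 = 0x01000193


def deterministic_hash(data: bytes) -> int:
    """32-bit FNV-1a: the result depends only on the input bytes."""
    h = FNV_OFFSET_BASIS_32
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # keep the value within 32 bits
    return h
```

Because the result depends only on the bytes hashed, the same instruction stream hashes identically on every run, which keeps any hash-dependent ordering stable.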
* Pool Spv.Generator resources, cache delegates, spv opts
- Pools for Instructions and LiteralIntegers. Can be passed in when creating the generator module.
- NewInstruction is called instead of new Instruction()
- Ryujinx SpirvGenerator passes in some pools that are static. The idea is for these to be shared between threads eventually.
- Estimate code size when creating the output MemoryStream
- LiteralInteger pools using ThreadStatic pools that are initialized before and after creation... not sure of a better way since the way these are created is via implicit cast.
Also, cache delegates for Spv.Generator for functions that are passed around to GenerateBinary etc, since passing the function raw creates a delegate on each call.
TODO: update python spv cs generator to make the coregrammar with NewInstruction and the `params` overloads.
* LocalDefMap for Ssa Rewriter
Rather than allocating a large array of all registers for each block in the shader, allocate one array of all registers and clear it between blocks. Reduces allocations in the shader translator.
* SPIR-V: Transform feedback support
* SPIR-V: Fragment shader interlock support (and image coherency)
* SPIR-V: Add early fragment tests support
* SPIR-V: Implement SwizzleAdd, add missing Triangles ExecutionMode for geometry shaders, remove SamplerType field from TextureMeta
* Don't pass depth clip state right now (fix decals)
Explicitly disabling it is incorrect. OpenGL currently automatically disables based on depth clamp, which is the behaviour if this state is omitted.
* Multisampling support
* Multisampling: Use resolve if src samples count > dst samples count
* Multisampling: We can only resolve for unscaled copies
* SPIR-V: Only add FSI exec mode if used.
* SPIR-V: Use ConstantComposite for Texture Offset Vector
Fixes a bunch of freezes with SPIR-V on AMD hardware, and validation errors. Note: Obviously assumes input offsets are constant, which they currently are.
* SPIR-V: Don't OpReturn if we already OpExit'ed
Fixes spir-v parse failure and stack smashing in RADV (obviously you still need bolist)
* SPIR-V: Only use input attribute type for input attributes
Output vertex attributes should always be of type float.
* Multithreaded Pipeline Compilation
* Address some feedback
* Make this 32
* Update topology with GpuAccessorState
* Cleanup for merge (note: disables spir-v)
* Make more robust to shader compilation failure
- Don't freeze when GLSL compilation fails
- Background SPIR-V pipeline compile failure results in skipped draws, similar to GLSL compilation failure.
* Fix Multisampling
* Only update fragment scale count if a vertex texture needs a scale.
Fixes a performance regression introduced by texture scaling in the vertex stage where support buffer updates would be very frequent, even at 1x, if any textures were used on the vertex stage.
This check doesn't exactly look cheap (a flag in the shader stage would probably be preferred), but it is much cheaper than uploading scales in both vulkan and opengl, so it will do for now.
* Use a bitmap to do granular tracking for buffer uploads.
This path is only taken if the much faster check of "is the buffer rented at all" is triggered, so it doesn't actually end up costing too much, and the time saved by not ending render passes (and on gpu for not waiting on barriers) is probably helpful.
Avoids ending render passes to update buffer data (not all the time)
- 140-180 to 35-45 in SMO metro kingdom (these updates are in the UI)
- Very variable 60-150(!) to 16-25 in mario kart 8 (these updates are in the UI)
As well as allowing more data to be preloaded persistently, this will also allow more data to be loaded in the preload buffer, which should be faster as it doesn't need to insert barriers between draws. (and on tbdr, does not need to flush and reload tile memory)
Improves performance in GPU limited scenarios. Should notably improve performance on TBDR gpus. Still a lot more to do here.
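A minimal sketch of the granular tracking idea, under stated assumptions: the class name, the 256-byte granularity and the method names are hypothetical and do not mirror the real implementation. Each bit stands for one fixed-size chunk of the buffer that pending GPU work may still read, so an upload that touches no set bit does not conflict with that work.

```python
class BufferUsageBitmap:
    """Tracks which fixed-size chunks of a buffer are in use by pending GPU work."""

    def __init__(self, buffer_size, granularity=256):
        self.granularity = granularity
        self.chunk_count = (buffer_size + granularity - 1) // granularity
        self.bits = 0  # bit i set => chunk i is in use

    def _mask(self, offset, size):
        # Build a mask with one bit per chunk touched by [offset, offset + size).
        if size <= 0:
            return 0
        first = offset // self.granularity
        last = (offset + size - 1) // self.granularity
        if offset < 0 or last >= self.chunk_count:
            raise ValueError("range is outside the buffer")
        return ((1 << (last - first + 1)) - 1) << first

    def mark_used(self, offset, size):
        self.bits |= self._mask(offset, size)

    def overlaps(self, offset, size):
        return (self.bits & self._mask(offset, size)) != 0

    def clear(self):
        # Called once the pending work has completed.
        self.bits = 0
```

An upload would first do the cheap "is the buffer in use at all" check, and only consult `overlaps` when that check says yes; a `False` result means the data can be written without waiting on the previously recorded work.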
* Copy query results after RP ends, rather than ending to copy
We need to end the render pass to get the data (submit command buffer) anyways...
Reduces render passes created in games that use queries.
* Rework Query stuff a bit to avoid render pass end
Tries to reset returned queries in background when possible, rather than ending the render pass.
Still ends render pass when resetting a counter after draws, but maybe that can be solved too. (by just pulling an empty object off the pool?)
* Remove unnecessary lines
Was for testing
* Fix validation error for query reset
Need to think of a better way to do this.
* SPIR-V: Fix SwizzleAdd and some validation errors
* SPIR-V: Implement attribute indexing and StoreAttribute
* SPIR-V: Fix TextureSize for MS and Buffer sampler types
* Fix relaunch issues
* SPIR-V: Implement LogicalExclusiveOr
* SPIR-V: Constant buffer indexing support
* Ignore unsupported attributes rather than throwing (matches current GLSL behaviour)
* SPIR-V: Implement tessellation support
* SPIR-V: Geometry shader passthrough support
* SPIR-V: Implement StoreShader8/16 and StoreStorage8/16
* SPIR-V: Resolution scale support and fix TextureSample multisample with LOD bug
* SPIR-V: Fix field index for scale count
* SPIR-V: Fix another case of wrong field index
* SPIRV/GLSL: More scaling related fixes
* SPIR-V: Fix ImageLoad CompositeExtract component type
* SPIR-V: Workaround for Intel FrontFacing bug
* Enable SPIR-V backend by default
* Allow null samplers (samplers are not required when only using texelFetch to access the texture)
* Fix some validation errors related to texel block view usage flag and invalid image barrier base level
* Use explicit subgroup size if we can (might fix some block flickering on AMD)
* Take componentMask and scissor into account when clearing framebuffer attachments
* Add missing barriers around CmdFillBuffer (fixes Monster Hunter Rise flickering on NVIDIA)
* Use ClampToEdge for Clamp sampler address mode on Vulkan (fixes Hollow Knight)
Clamp is unsupported on Vulkan, but ClampToEdge behaves almost the same. ClampToBorder on the other hand (which was being used before) is pretty different
* Shader specialization for new Vulkan required state (fixes remaining alpha test issues, vertex stretching on AMD on Crash Bandicoot, etc)
* Check if the subgroup size is supported before passing an explicit size
* Only enable ShaderFloat64 if the GPU supports it
* We don't need to recompile shaders if alpha test state changed but alpha test is disabled
* Enable shader cache on Vulkan and implement MultiplyHighS32/U32 on SPIR-V (missed those before)
* Fix pipeline state saving before it is updated.
This should fix a few warnings and potential stutters due to bad pipeline states being saved in the cache. You may need to clear your guest cache.
* Allow null samplers on OpenGL backend
* _unit0Sampler should be set only for binding 0
* Remove unused PipelineConverter format variable (was causing IOR)
* Raise textures limit to 64 on Vulkan
* No need to pack the shader binaries if shader cache is disabled
* Fix backbuffer not being cleared and scissor not being re-enabled on OpenGL
* Do not clear unbound framebuffer color attachments
* Geometry shader passthrough emulation
* Consolidate UpdateDepthMode and GetDepthMode implementation
* Fix A1B5G5R5 texture format and support R4G4 on Vulkan
* Add barrier before use of some modified images
* Report 32 bit query result on AMD windows (smo issue)
* Add texture recompression support (disabled for now)
It recompresses ASTC textures into BC7, which might reduce VRAM usage significantly on games that use ASTC textures
* Do not report R4G4 format as supported on Vulkan
It was causing mario head to become white on Super Mario 64 (???)
* Improvements to -1 to 1 depth mode.
- Transformation is only applied on the last stage in the vertex pipeline.
- Should fix some issues with geometry and tessellation (hopefully)
- Reading back FragCoord Z on fragment will transform back to -1 to 1.
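For reference, a sketch of the usual remap between the two depth conventions. This assumes the transformation is the standard `(z + w) / 2` applied to the clip-space position written by the last vertex pipeline stage; the message does not spell out the formula, and the function names are hypothetical.

```python
def to_zero_to_one_depth(position):
    """Remap clip-space z from the [-w, w] convention to [0, w]."""
    x, y, z, w = position
    return (x, y, (z + w) * 0.5, w)


def ndc_depth(position):
    """Depth after the perspective divide."""
    _, _, z, w = position
    return z / w
```

A point on the near plane (`z == -w`) ends up at depth 0, a point on the far plane (`z == w`) at depth 1, and `z == 0` lands at 0.5.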
* Geometry Shader index count from ThreadsPerInputPrimitive
Generally fixes SPIR-V emitting too many triangles, may change games in OpenGL
* Remove gl_FragDepth scaling
This is always 0-1; the other two issues were causing the problems. Fixes regression with Xenoblade.
* Add Gl StencilOp enum values to Vulkan
* Update guest cache to v1.1 (due to specialization state changes)
This will explode your shader cache from earlier vulkan build, but it must be done. 😔
* Vulkan/SPIR-V support for viewport inverse
* Fix typo
* Don't create query pools for unsupported query types
* Return of the Vector Indexing Bug
One day, everyone will get this right.
* Check for transform feedback query support
Sometimes transform feedback is supported without the query type.
* Fix gl_FragCoord.z transformation
FragCoord.z is always in 0-1, even when the real depth range is -1 to 1. Turns out the only bug was geo and tess stage outputs.
Fixes Pokemon Sword/Shield, possibly others.
* Fix Avalonia Rebase
Vulkan is currently not available on Avalonia, but the build does work and you can use opengl.
* Fix headless build
* Add support for BC6 and BC7 decompression, decompress all BC formats if they are not supported by the host
* Fix BCn 4/5 conversion, GetTextureTarget
BCn 4/5 could generate invalid data when a line's size in bytes was not divisible by 4, which both backends expect.
GetTextureTarget was not creating a view with the replacement format.
* Fix dependency
* Fix inverse viewport transform vector type on SPIR-V
* Do not require null descriptors support
* If MultiViewport is not supported, do not try to set more than one viewport/scissor
* Bounds check on bitmap add.
* Flush queries on attachment change rather than program change
Occlusion queries are usually used in a depth only pass so the attachments changing is a better indication of the query block ending.
Write mask changes are also considered, since some games do a depth-only pass by setting a write mask of 0 on all the colour targets.
* Add support for avalonia (#6)
* add avalonia support
* only lock around skia flush
* addressed review
* cleanup
* add fallback size if avalonia attempts to render but the window size is 0. read desktop scale after enabling dpi check
* fix getting window handle on linux. skip render if size is 0
* Combine non-buffer with buffer image descriptor sets
* Support multisample texture copy with automatic resolve on Vulkan
* Remove old CompileShader methods from the Vulkan backend
* Add minimal pipeline layouts that only contains used bindings
They are used by helper shaders. The intention is to avoid recompiling those shaders (from GLSL to SPIR-V) if the bindings change on the translated guest shaders
* Pre-compile helper shader as SPIR-V, and some fixes
* Remove pre-compiled shaderc binary for Windows as it's no longer needed by default
* Workaround RADV crash
Enabling the descriptor indexing extension, even if it is not used, forces the radv driver to use "bolist".
* Use RobustBufferAccess on NVIDIA gpus
Avoids the SMO waterfall triangle on older NVIDIA gpus.
* Implement GPU selector and expose texture recompression on the UI and config
* Fix and enable background compute shader compilation
Also disables warnings from shader cache pipeline misses.
* Fix error due to missing subpass dependency when Attachment Write -> Shader Read barriers are added
* If S8D24 is not supported, use D32FS8
* Ensure all fences are destroyed on dispose
* Pre-allocate arrays up front on DescriptorSetUpdater, allows the removal of some checks
* Add missing clear layer parameter after rebase
* Use selected gpu from config for avalonia (#7)
* use configured device
* address review
* Fix D32S8 copy workaround (AMD)
Fixes water in Pokemon Legends Arceus on AMD GPUs. Possibly fixes other things.
* Use push descriptors for uniform buffer updates (disabled for now)
* Push descriptor support check, buffer redundancy checks
Should make push descriptors faster, needs more testing though.
* Increase light command buffer pool to 2 command buffers, throw rather than returning invalid cbs
* Adjust bindings array sizes
* Force submit command buffers if memory in use by its resources is high
* Add workaround for AMD GCN cubemap view sins
`ImageCreateCubeCompatibleBit` seems to generally break 2D array textures with mipmaps... even if they are eventually aliased as a cubemap with mipmaps. Forcing a copy here works around the issue.
This could be used in future if enabling this bit reduces performance on certain GPUs. (mobile class is generally a worry)
Currently also enabled on Linux as I don't know if they managed to dodge this bug (someone please tell me). Not enabled on Vega at the moment, but easy to add if the issue is there.
* Add mobile, non-RX variants to the GCN regex.
Also make sure that the 3 digit ones only include numbers starting with 7 or 8.
* Increase image limit per stage from 8 to 16
Xenoblade Chronicles 2 was hitting the limit of 8
* Minor code cleanup
* Fix NRE caused by SupportBufferUpdater calling pipeline ClearBuffer
* Add gpu selector to Avalonia (#8)
* Add gpu selector to avalonia settings
* show backend label on window
* some fixes
* address review
* Minor changes to the Avalonia UI
* Update graphics window UI and locales. (#9)
* Update xaml and update locales
* locale updates
Did my best here but likely needs to be checked by native speakers, especially the use of ampersands in greek, russian and turkish?
* Fix locales with more (?) correct translations.
* add separator to render widget
* fix spanish and portuguese
* Add new IdList, replaces buffer list that could not remove elements and had unbounded growth
* Don't crash the settings window if Vulkan is not supported
* Fix Actions menu not being clickable on GTK UI after relaunch
* Rename VulkanGraphicsDevice to VulkanRenderer and Renderer to OpenGLRenderer
* Fix IdList and make it not thread safe
* Revert useless OpenGL format table changes
* Fix headless project build
* List throws ArgumentOutOfRangeException
* SPIR-V: Fix tessellation
* Increase shader cache version due to tessellation fix
* Reduce number of Sync objects created (improves perf in some specific titles)
* Fix vulkan validation errors for NPOT compressed upload and GCN workaround.
* Add timestamp to the shader cache and force rebuild if host cache is outdated
* Prefer Mail box present mode for popups (#11)
* Prefer Mail box present mode
* fix debug
* switch present mode when vsync is toggled
* only disable vsync on the main window
* SPIR-V: Fix geometry shader input load with transform feedback
* BC7 Encoder: Prefer more precision on alpha rather than RGB when alpha is 0
* Fix Avalonia build
* Address initial PR feedback
* Only set transform feedback outputs on last vertex stage
* Address riperiperi PR feedback
* Remove outdated comment
* Remove unused constructor
* Only throw for negative results
* Throw for QueueSubmit and other errors
No point in delaying the inevitable
* Transform feedback decorations inside gl_PerVertex struct breaks the NVIDIA compiler
* Fix some resolution scale issues
* No need for two UpdateScale calls
* Fix comments on SPIR-V generator project
* Try to fix shader local memory size
On DOOM, a shader is using local memory, but both Low and High size are 0, CRS size is 1536, it seems to store on that region?
* Remove RectangleF that is now unused
* Fix ImageGather with multiple offsets
Needs ImageGatherExtended capability, and must use `ConstantComposite` instead of `CompositeConstruct`
* Address PR feedback from jD in all projects except Avalonia
* Address most of jD PR feedback on Avalonia
* Remove unsafe
* Fix VulkanSkiaGpu
* move present mode request out of Create Swapchain method
* split more parts of create swapchain
* addressed reviews
* addressed review
* Address second batch of jD PR feedback
* Fix buffer <-> image copy row length and height alignment
AlignUp helper does not support NPOT alignment, and ASTC textures can have NPOT block sizes
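The difference can be sketched as follows. This assumes the original helper used the common bitmask form, which is an assumption on my part; both function names are hypothetical.

```python
def align_up_pow2(value, alignment):
    """Bitmask align-up; only correct when `alignment` is a power of two."""
    return (value + alignment - 1) & ~(alignment - 1)


def align_up(value, alignment):
    """Division-based align-up; correct for any positive alignment."""
    return ((value + alignment - 1) // alignment) * alignment
```

With a power-of-two alignment both agree (`13` aligned to `4` is `16`). With an NPOT block size such as 6, the bitmask form returns `10` for an input of `10`, which is not a multiple of 6, while the division form returns `12`.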
* Better fix for NPOT alignment issue
* Use switch expressions on Vulkan EnumConversion
Thanks jD
* Fix Avalonia build
* Add Vulkan selection prompt on startup
* Grammar fixes on Vulkan prompt message
* Add missing Vulkan migration flag
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
Co-authored-by: Emmanuel Hansen <emmausssss@gmail.com>
Co-authored-by: MutantAura <44103205+MutantAura@users.noreply.github.com>
* Initial commit with a lot of testing stuff.
* Partial Unmap Cleanup Part 1
* Fix some minor issues, hopefully windows tests.
* Disable partial unmap tests on macos for now
Weird issue.
* Goodbye magic number
* Add COMPlus_EnableAlternateStackCheck for tests
`COMPlus_EnableAlternateStackCheck` is needed for NullReferenceException handling to work on linux after registering the signal handler, due to how dotnet registers its own signal handler.
* Address some feedback
* Force retry when memory is mapped in memory tracking
This case existed before, but returning `false` no longer retries, so it would crash immediately after unprotecting the memory... Now, we return `true` to deliberately retry.
This case existed before (was just broken by this change) and I don't really want to look into fixing the issue right now. Technically, this means that on guest code partial unmaps will retry _due to this_ rather than hitting the handler. I don't expect this to cause any issues.
This should fix random crashes in Xenoblade Chronicles 2.
* Use IsRangeMapped
* Suppress MockMemoryManager.UnmapEvent warning
This event is not signalled by the mock memory manager.
* Remove 4kb mapping
* Avalonia: Another Cleanup
This PR is a cleanup of the Avalonia code recently added:
- Some XAML files are autoformatted like in a previous PR.
- Dlc is renamed to DownloadableContent (Locale exclude).
- DownloadableContentManagerWindow is a bit improved (Fixes #3491).
- Some nits here and there.
* Fix GTK
* Remove AttachDebugDevTools
* Fix last warning
* Fix JSON fields
This PR cleans up the UserEditor code a bit. Two texts are added for "Name" and "User Id", because when you create a new profile, the textbox is empty without any hints. `axaml` files are autoformatted too.
* Add a sampler pool cache and improve texture pool cache
* Increase disposal timestamp delta more to be on the safe side
* Nits
* Use abstract class for PoolCache, remove factory callback
This is the first commit of a series of reformats across the codebase, as
discussed internally some weeks ago.
Since this project isn't touched that much, it shouldn't cause
conflicts with any open PRs.
* remove content dialog placeholder from all windows
* remove redundant window argument
* redesign user profile window
* wip
* use avalonia auto name generator
* add edit and new user options
* move profile image selection to content dialog
* remove usings
* fix updater
* address review
* adjust avatar dialog size
* add validation for user editor
* fix typo
* Shorten some labels
* experimental changes to try and reduce allocations in kernel threading and DMA handler
* Simplify the changes in this branch to just 1. Don't make unnecessary copies of data just for texture-texture transfers and 2. Add a fast path for 1bpp linear byte copies
* forgot to check src + dst linearity in 1bpp DMA fast path. Fixes the UE4 regression.
* removing dev log I left in
* Generalizing the DMA linear fast path to cases other than 1bpp copies
* revert kernel changes
* revert whitespace
* remove unneeded references
* PR feedback
Co-authored-by: Logan Stromberg <lostromb@microsoft.com>
Co-authored-by: gdk <gab.dark.100@gmail.com>
Some games and the Mario Odyssey Multiplayer mod do this.
The SMO multiplayer mod also needs you to revert #3394 as it uses a blocking socket to receive (otherwise it hangs), and it doesn't seem to like being forced as non-blocking.
* expand English tooltips and clean up
* small oversight
* update Spanish locale
* wording
* Internet
* address feedback
* update localization accordingly
* Add all other windows
* addressed review
* Prevent "No Update" option from being deleted
* Select no update if the current update is removed from the title update window
* fix amiibo crash
* Ryujinx.Audio: Remove BOM from files
* misc: Relicense Ryujinx.Audio under the terms of the MIT license
With the approvals of all the Ryujinx.Audio contributors, this commit
changes Ryujinx.Audio license from LGPLv3 to MIT.
* Add support for alpha to coverage dithering
* Shader cache version bump
* Fix wrong alpha register
* Ensure support buffer is cleared
* New shader specialization based approach
* add settings windows and children views
* Expose hotkeys configuration on the UI
* Remove double spacing from locale JSON
* simplify button assigner
* add cemuhook buttons and title to locale
* move common button assigner to own class
* cancel button assigner when window is closed
* remove unused setting
* address review. fix controller profile not loading default when switching devices
* fix updater file name
* Input cleanup (#37)
* addressed review
* add device type to controller device checks
* change accessibility modifier of public classes to internal
* Update Ryujinx.Ava/Ui/ViewModels/ControllerSettingsViewModel.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Update de_DE.json
* Update de_DE.json
* Update tr_TR.json
Translated newly added lines
* Update it_IT.json
* fix rebase
* update avalonia
* fix wrong key used for button text
* Align settings window elements
* Tabs to spaces
* Update brazilian portuguese translation
* Minor improvement on brazilian portuguese translation
* fix turkish translation
* remove unused text
* change view related classes to public
* unsubscribe from deferred event if dialog is closed
* Load the default language before loading any other when switching languages
* Make controller settings more compact
* increase default width of settings window, reduce profile buttons width
Co-authored-by: gdk <gab.dark.100@gmail.com>
Co-authored-by: MutantAura <44103205+MutantAura@users.noreply.github.com>
Co-authored-by: Niwu34 <67392333+Niwu34@users.noreply.github.com>
Co-authored-by: aegiff <99728970+aegiff@users.noreply.github.com>
Co-authored-by: Antonio Brugnolo <36473846+AntoSkate@users.noreply.github.com>
Because of that PR, TimeZoneRule was bigger than 0x4000 thanks to a
misuse of a constant.
This commit addresses this issue and adds a new unit test to ensure the size of
TimeZoneRule is 0x4000 bytes.
Also addresses suggestions that were lost on the original PR.
* time: Make TimeZoneRule blittable and avoid copies
This drastically reduces the overhead of using TimeZoneRule around the
codebase.
Effect on games is unknown
* Add missing Box type
* Ensure we clean the structure still
This doesn't perform any copies
* Address gdkchan's comments
* Simplify Box
* Rewrite kernel memory allocator
* Remove unused using
* Adjust private static field naming
* Change UlongBitSize to UInt64BitSize
* Fix unused argument, change argument order to be inline with official code and disable random allocation
* Fix doubling of detected gamepads (sometimes the connected event is fired when the app starts even though the pad was connected for some time now).
The fix rejects the gamepad if one with the same ID is already present.
* Fixed review findings
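The duplicate-gamepad rejection described above can be sketched as follows. This is a minimal illustration, not the project's actual input code: `connectedIds`, `OnGamepadConnected` and `OnGamepadDisconnected` are hypothetical names, and the ID is assumed to be a string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry of the IDs of currently connected gamepads.
var connectedIds = new HashSet<string>();

// Returns true if the gamepad was newly registered, false if it was rejected.
bool OnGamepadConnected(string id)
{
    // HashSet<T>.Add returns false when the element is already present,
    // so a repeated "connected" event for the same ID is ignored.
    return connectedIds.Add(id);
}

// Forget the ID so the same pad can be registered again after a reconnect.
void OnGamepadDisconnected(string id) => connectedIds.Remove(id);

Console.WriteLine(OnGamepadConnected("pad-0")); // first event: accepted
Console.WriteLine(OnGamepadConnected("pad-0")); // spurious repeat: rejected
OnGamepadDisconnected("pad-0");
Console.WriteLine(OnGamepadConnected("pad-0")); // real reconnect: accepted
```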
* Changes 1
* Changes 2
* Better ModifiedSequence handling
This should handle PreciseEvents properly, and simplifies a few things.
* Minor changes, remove debug log
* Handle stage.Info being null
Hopefully fixes Catherine crash
* Fix shader specialization fast texture lookup
* Fix some things.
* Address Feedback Part 1
* Make method static.
* Use copy dependency for textures that differ in multisample but are otherwise compatible
* Remove allowMs flag as it's no longer required for correctness, it's just an optimization now
* Dispose intermediate pool
The syncpoint maximum value represents the maximum possible syncpt value at a given time, however due to PBs being submitted before max was incremented, for a brief moment of time this is not the case which could lead to invalid behaviour if a game waits on the fence at that specific time.
* Implement syscall handlers using a source generator
* Copy FlushProcessDataCache implementation to Syscall since it was only implemented on Syscall32
* Fix wrong argument order in some syscalls
* Delete old Reflection.Emit based syscall handling code
* Improvements to the code generation
* ControlCodeMemory address and size is always 64-bit
* Refactor CPU interface
* Use IExecutionContext interface on SVC handler, change how CPU interrupts invoke the handlers
* Make CpuEngine take an ITickSource rather than returning one
The previous implementation was designed around the scenario where the CPU engine has to implement the tick source itself, for example when we have a hypervisor and the game can read CNTPCT on the host directly. However, given that we need to convert between different frequencies anyway, it's not worth it. It's better to let the user pass the tick source and redirect any reads of CNTPCT to it.
* XML docs for the public interfaces
* PPTC invalidation due to NativeInterface function name changes
* Fix build of the CPU tests
* PR feedback
As GitHub sorts our builds alphanumerically, we abuse that to fix
both new and old updater behaviour.
This should fix all our issues.
The Avalonia updater will be broken between versions 1.1.122 and 1.1.126, and
will need manual intervention.
* Prefetch capabilities before spawning translation threads.
The Backend Multithreading only expects one thread to submit commands at a time. When compiling shaders, the translator may request the host GPU capabilities from the backend. It's possible for a bunch of translators to do this at the same time.
There's a caching mechanism in place so that the capabilities are only fetched once. By triggering this before spawning the thread, the async translation threads no longer try to queue onto the backend queue all at the same time.
The Capabilities do need to be checked from the GPU thread, due to OpenGL needing a context to check them, so it's not possible to call the underlying backend directly.
* Initialize the capabilities when setting the GPU thread + missing call in headless
* Remove private variables
This PR adds the alternative enum values for StencilOp. Similar to the other enums, I added these with the same names but with Gl added to the end. These are used by homebrew using Nouveau, though they might be used by games with the official Vulkan driver.
39d90be897/rnndb/graph/nv_3ddefs.xml (L77)
Fixes some broken graphics in Citra, such as missing shadows in Mario Kart 7. Likely fixes other homebrew.
* Enable JIT service LLE
* Force disable PPTC when using the JIT service
PPTC does not support multiple guest processes
* Fix build
* Make SM service registration per emulation context rather than global
* Address PR feedback
* Fix shared memory leak on Windows
* Fix memory leak caused by RO session disposal not decrementing the memory manager ref count
* Fix UnmapViewInternal deadlock
* Was not supposed to add those back
* Back to the origins: Make memory manager take guest PA rather than host address once again
* Direct mapping with alias support on Windows
* Fixes and remove more of the emulated shared memory
* Linux support
* Make shared and transfer memory not depend on SharedMemoryStorage
* More efficient view mapping on Windows (no more restricted to 4KB pages at a time)
* Handle potential access violations caused by partial unmap
* Implement host mapping using shared memory on Linux
* Add new GetPhysicalAddressChecked method, used to ensure the virtual address is mapped before address translation
Also align GetRef behaviour with software memory manager
* We don't need a mirrorable memory block for software memory manager mode
* Disable memory aliasing tests while we don't have shared memory support on Mac
* Shared memory & SIGBUS handler for macOS
* Fix typo + nits + re-enable memory tests
* Set MAP_JIT_DARWIN on x86 Mac too
* Add back the address space mirror
* Only set MAP_JIT_DARWIN if we are mapping as executable
* Disable aliasing tests again (still fails on Mac)
* Fix UnmapView4KB (by not casting size to int)
* Use ref counting on memory blocks to delay closing the shared memory handle until all blocks using it are disposed
* Address PR feedback
* Make RO hold a reference to the guest process memory manager to avoid early disposal
Co-authored-by: nastys <nastys@users.noreply.github.com>
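The reference counting mentioned above (delaying the close of the shared memory handle until every block using it is disposed) can be sketched as follows. This is a minimal illustration under assumptions: `refCount`, `AddReference` and `ReleaseReference` are hypothetical names, and a boolean stands in for closing the real OS handle.

```csharp
using System;
using System.Threading;

int refCount = 1;          // the creator of the shared memory holds the first reference
bool handleClosed = false; // stands in for closing the real OS handle

// Called when another memory block starts using the same shared memory.
void AddReference() => Interlocked.Increment(ref refCount);

// Called when a memory block is disposed.
void ReleaseReference()
{
    // Only the release that drops the count to zero closes the handle.
    if (Interlocked.Decrement(ref refCount) == 0)
    {
        handleClosed = true;
    }
}

AddReference();                   // a second block maps the same shared memory
ReleaseReference();               // the first block is disposed
Console.WriteLine(handleClosed);  // False: one block still uses the handle
ReleaseReference();               // the last block is disposed
Console.WriteLine(handleClosed);  // True
```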
If two or more threads encounter a region of memory where a read action has been registered, then they must _both_ wait on the data.
Clearing the action before it completed was causing the null check above to fail, so the action would only be run on the first thread, and the second would end up continuing without waiting. Depending on what the game does, this could be disastrous.
This fixes a regression introduced by #3302 with Pokemon Legends Arceus, and possibly Catherine. There are likely other affected games. What is fixed in that PR should still be fixed.
* Fix various issues with texture sync
A variable called _actionRegistered is used to keep track of whether a tracking action has been registered for a given texture group handle. This variable is set when the action is registered, and should be unset when it is consumed. This is used to skip registering the tracking action if it's already registered, saving some time for render targets that are modified very often.
There were two issues with this. The worst issue was that the tracking action handler exits early if the handle's modified flag is false... which means that it never reset _actionRegistered, as that was done within the Sync() method called later. The second issue was that this variable was set true after the sync action was registered, so it was technically possible for the action to run immediately, set the flag to false, then set it to true.
Both situations would lead to the action never being registered again, as the texture group handle would be sure the action is already registered. This breaks the texture for the remaining runtime, or until it is disposed.
It was also possible for a texture to register sync once, then on future frames the last modified sync number did not update. This may have caused some more minor issues.
Seems to fix the Xenoblade flashing bug. Obviously this needs a lot of testing, since it was random chance. I typically had the most luck getting it to happen by switching time of day on the event theatre screen for a while, then entering the equipment screen by pressing X on an event.
May also fix weird things like random chance air swimming in BOTW, maybe a few texture streaming bugs.
* Exchange rather than CompareExchange
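The flag handling described above can be sketched with `Interlocked.Exchange`. This is a simplified illustration, not the actual texture group code: `actionRegistered` mirrors the `_actionRegistered` variable from the description, while `RegisterAction`, `ConsumeAction` and the `registrations` counter are hypothetical.

```csharp
using System;
using System.Threading;

int actionRegistered = 0; // 0 = no action registered, 1 = action registered
int registrations = 0;    // counts how often the (stand-in) action was registered

void RegisterAction()
{
    // Exchange sets the flag *before* registering and returns the old value,
    // so only the caller that saw 0 performs the registration.
    if (Interlocked.Exchange(ref actionRegistered, 1) == 0)
    {
        registrations++; // stands in for registering the tracking action
    }
}

void ConsumeAction()
{
    // The flag is always cleared when the action runs, even if the handler
    // would otherwise exit early, so a later registration is not skipped.
    Interlocked.Exchange(ref actionRegistered, 0);
}

RegisterAction();
RegisterAction();                 // skipped, already registered
ConsumeAction();
RegisterAction();                 // registered again after being consumed
Console.WriteLine(registrations); // 2
```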
* New shader cache implementation
* Remove some debug code
* Take transform feedback varying count into account
* Create shader cache directory if it does not exist + fragment output map related fixes
* Remove debug code
* Only check texture descriptors if the constant buffer is bound
* Also check CPU VA on GetSpanMapped
* Remove more unused code and move cache related code
* XML docs + remove more unused methods
* Better codegen for TransformFeedbackDescriptor.AsSpan
* Support migration from old cache format, remove more unused code
Shader cache rebuild now also rewrites the shared toc and data files
* Fix migration error with BRX shaders
* Add a limit to the async translation queue
Avoid async translation threads not being able to keep up and the queue growing very large
* Re-create specialization state on recompile
This might be required if a new version of the shader translator requires more or less state, or if there is a bug related to the GPU state access
* Make shader cache more error resilient
* Add some missing XML docs and move GpuAccessor docs to the interface/use inheritdoc
* Address early PR feedback
* Fix rebase
* Remove IRenderer.CompileShader and IShader interface, replace with new ShaderSource struct passed to CreateProgram directly
* Handle some missing exceptions
* Make shader cache purge delete both old and new shader caches
* Register textures on new specialization state
* Translate and compile shaders in forward order (eliminates diffs due to different binding numbers)
* Limit in-flight shader compilation to the maximum number of compilation threads
* Replace ParallelDiskCacheLoader state changed event with a callback function
* Better handling for invalid constant buffer 1 data length
* Do not create the old cache directory structure if the old cache does not exist
* Constant buffer use should be per-stage. This change will invalidate existing new caches (file format version was incremented)
* Replace rectangle texture with just coordinate normalization
* Skip incompatible shaders that are missing texture information, instead of crashing
This is required if we, for example, add support for a new texture instruction to the shader translator, which then allows access to textures that were not accessed before. In this scenario, the old cache entry is no longer usable
* Fix coordinates normalization on cubemap textures
* Check if title ID is null before combining shader cache path
* More robust constant buffer address validation on spec state
* More robust constant buffer address validation on spec state (2)
* Regenerate shader cache with one stream, rather than one per shader.
* Only create shader cache directory during initialization
* Logging improvements
* Proper shader program disposal
* PR feedback, and add a comment on serialized structs
* XML docs for RegisterTexture
Co-authored-by: riperiperi <rhy3756547@hotmail.com>
* amadeus: Improve and fix delay effect processing
This reworks the delay effect processing by representing the calculation with the appropriate matrix and by unrolling some loops in the code.
This allows better optimization by the JIT while making it more readable.
Also fixes a bug in the Surround code path found while looking back at my notes.
* Remove useless GetHashCode
* Address gdkchan's comments
This should implement all ABI changes from REV11 on 14.0.0
As Nintendo changed the channel disposition for "legacy" effects (Delay, Reverb and Reverb 3D) to match the standard channel mapping, I took the liberty to just remap to the old disposition for now.
The proper changes will be handled at a later date with a complete rewriting of those 3 effects to be more readable (see https://github.com/Ryujinx/Ryujinx/pull/3205 for the first iteration of it).
* hle: Some cleanup
This PR cleans up the HLE folder and the VirtualFileSystem one a bit. Since we use LibHac, we can use some of its classes directly instead of duplicating things. The "Content" folder of VFS is removed since it should be handled in the NCM service directly.
A larger cleanup should be done later since there is still duplicated code here and there.
* Fix Headless.SDL2
* Addresses gdkchan feedback
* De-tile GOB when DMA copying from block linear to pitch kind memory regions
* XML docs + nits
* Remove using
* No flush for regular buffer copies
* Add back ulong casts, fix regression due to oversight
OpenGL game overlays and hooks tend to make a lot of assumptions about how games present frames to the screen, since presentation in OpenGL kind of sucks and they would like to have info such as the size of the screen, or if the contents are SRGB rather than linear.
There are two ways of getting this. OBS hooks swap buffers to get a frame for video capture, but it actually checks the bound framebuffer at the time. I made sure that this matches the output framebuffer (the window) so that the output matches the size. RTSS checks the viewport size by default, but this was actually set to the last used viewport by the game, causing the OSD to fly all across the screen depending on how it was used (or res scale). The viewport is now manually set to match the output framebuffer size.
In the case of RTSS, it also loads its resources by destructively setting a pixel pack parameter without regard to what it was set to by the guest application. OpenGL state can be set for a long period of time and is not expected to be set before each call to a method, so randomly changing it isn't great practice. To fix this, I've added a line to set the pixel unpack alignment back to 4 after presentation, which should cover RTSS loading its incredibly ugly font.
- RTSS and overlays that use it should no longer cause certain textures to load incorrectly. (Mario Kart 8, Pokemon Legends Arceus)
- OBS Game Capture should no longer crop the game output incorrectly, flicker randomly, or capture with incorrect gamma.
This doesn't fix issues with how RTSS reports our frame timings.
* olsc: Fix condition in GetSaveDataBackupSetting
This PR fixes a condition previously implemented in #3190 where ACNH can't be booted without existing save data.
Closes #3206
* Addresses gdkchan feedback
* ntc: Implement IEnsureNetworkClockAvailabilityService
This PR implements a basic `IEnsureNetworkClockAvailabilityService`, checked by RE. It's needed by Splatoon 2 with Guest Internet Access enabled. The game is now playable with this setting.
* Update Ryujinx.HLE/HOS/Services/Nim/Ntc/StaticService/IEnsureNetworkClockAvailabilityService.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Update IGeneralService.cs
Fix IPv4 local IP related frame drop in Fire Emblem by rewriting [CommandHipc(12)]
* Fix IPV4 Local IP Slowdown & Style Fixes
fix a missing space
* Remove unnecessary line
* Fix for hardcoding which index to use
* Replace argument with empty string.
By sending an empty string to Dns.GetHostAddresses("") you get back localhost info only.
* Add caching, undo change in GetCurrentIpAddress
Implement caching and revert the GetCurrentIP() function, speed improvements still present.
* Remove unnecessary using
* Syntax fixes and removing extra lines
Requested changes by AcK77
* Properly unsubscribe from event handler
Adds an unsubscribe in the dispose section of IGeneralService
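The empty-string lookup and caching described above can be sketched as follows. This is only an illustration: it uses `Lazy<T>` as a stand-in for whatever cache the service actually keeps, and `localAddresses` is a hypothetical name. A real cache would also need to be invalidated when the network configuration changes.

```csharp
using System;
using System.Net;

// Passing an empty string to Dns.GetHostAddresses returns the addresses of
// the local host. Lazy<T> runs the lookup once and reuses the result.
var localAddresses = new Lazy<IPAddress[]>(() => Dns.GetHostAddresses(""));

foreach (IPAddress address in localAddresses.Value)
{
    Console.WriteLine($"{address.AddressFamily}: {address}");
}

// A second access returns the cached array without another lookup.
Console.WriteLine(ReferenceEquals(localAddresses.Value, localAddresses.Value));
```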
* Ui: Add option to show/hide console window (Windows-only)
* Ui: Only display Show Console menu item on Windows
* ConsoleHelper: Handle NULL case
This will never happen
* Address nits
* Address comments
* Address comments 2
* olsc: Implement GetSaveDataBackupSetting
This PR implements GetSaveDataBackupSetting of the OLSC service, which is now needed by ACNH 2.0.5. The game is playable as usual if you use the same user profile as the original save file (I don't know if this was the case before). Everything is checked by RE.
* addresses gdkchan feedback
Fix a copypasta from the original Amadeus PR causing invalid
CopyHistories output.
Also added a missing size check.
This fixes a crash in Mononoke Slashdown
* Preparation for initial Flatpak and Flathub integration
This integrates some initial changes required for Flatpak and distribution from Flathub.
Also added some resources that will be used for packaging on Linux.
* Address gdkchan comment
* Allow textures to have their data partially mapped
* Explicitly check for invalid memory ranges on the MultiRangeList
* Update GetWritableRegion to also support unmapped ranges
* gui: Fixes the games icon when there is a game update
Currently we just load the version of the update, instead of the whole NACP file. This PR fixes that. The code is also cleaned up slightly to avoid duplication.
(Closes #3039)
* Fix condition
* Collapse AsSpan().Slice(..) calls into AsSpan(..)
Less code and a bit faster
* Collapse an Array.Clear(array, 0, array.Length) call to Array.Clear(array)
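The two simplifications above can be shown side by side. This is a small illustration with a made-up array; note that the single-argument `Array.Clear(Array)` overload is only available from .NET 6 onwards.

```csharp
using System;

byte[] data = { 1, 2, 3, 4, 5, 6 };

// Before: create a span over the whole array, then slice it.
Span<byte> viaSlice = data.AsSpan().Slice(2, 3);

// After: a single call that takes the start index and length directly.
Span<byte> direct = data.AsSpan(2, 3);

// Both spans cover the same three elements (3, 4, 5).
Console.WriteLine(viaSlice.SequenceEqual(direct));

int[] counters = { 7, 8, 9 };

// Before: Array.Clear(counters, 0, counters.Length);
// After: the whole-array overload.
Array.Clear(counters);

Console.WriteLine(string.Join(",", counters));
```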
This should prevent filesystem services from blocking other services that don't have their own ServerBase. May improve filesystem related stutters in certain titles.
Improves button-advanced cutscenes such as Miqol's Request in Xenoblade: DE when the game is on a network share (used to stutter when voice lines played).
Should probably be tested to make sure no mysterious bugs have been unearthed, and to see if any other filesystem related perf issues are improved.
* Implement/Stub mnpp:app service and some hid calls
This PR implements/stubs the `mnpp:app` service (closes #3107) according to RE. It seems to do some telemetry for the China region only, so everything is stubbed.
This PR also fixes some inconsistencies in the hid service and stubs the EnableSixAxisSensorUnalteredPassthrough, IsSixAxisSensorUnalteredPassthroughEnabled, LoadSixAxisSensorCalibrationParameter and GetSixAxisSensorIcInformation calls (closes #3123 and closes #3124).
* Addresses Thog review
* added trace log level
* use trace log level instead of debug (#1547)
* alignment #1547
* moved trace logs toggle at the bottom #1547
* bumped config file version #3096
* added migration step #3096
* setting moved to the dev section #1547
* performance warning displayed when trace is enabled #1547
Before, it was selecting nearest neighbour, which sounded terrible. This is likely temporary until the upsampling algorithm used by the Switch is reversed.
Fixes bad audio in Skyward Sword HD.
* Do not allow render targets not explicitly written by the fragment shader to be modified
* Shader cache version bump
* Remove blank lines
* Avoid redundant color mask updates
* HostShaderCacheEntry can be null
* Avoid more redundant glColorMask calls
* nit: Mask -> Masks
* Fix currentComponentMask
* More efficient way to update _currentComponentMasks
* Making deadzones feel nice and smooth + adding rider files to .gitignore
* removing unnecessary parentheses and fixing possibility of divide by 0
* formatting :)
* fixing up ClampAxis
* fixing up ClampAxis
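A smooth deadzone with a guard against division by zero, as the commits above describe, could look roughly like this. It is a sketch of a common scaled radial deadzone, not the project's actual `ClampAxis` code; `ApplyDeadzone` and its parameters are hypothetical, and stick values are assumed to be normalized to [-1, 1].

```csharp
using System;

// Scaled radial deadzone: output is zero inside the deadzone, then ramps
// smoothly from 0 at the deadzone edge to 1 at full tilt.
static (float X, float Y) ApplyDeadzone(float x, float y, float deadzone)
{
    float magnitude = MathF.Sqrt(x * x + y * y);

    // A centred stick (magnitude 0) or a deadzone covering the whole range
    // would otherwise lead to a division by zero below.
    if (magnitude == 0f || magnitude <= deadzone || deadzone >= 1f)
    {
        return (0f, 0f);
    }

    // Rescale the remaining range to [0, 1] so there is no jump at the edge.
    float scaled = MathF.Min((magnitude - deadzone) / (1f - deadzone), 1f);
    float factor = scaled / magnitude;

    return (x * factor, y * factor);
}

Console.WriteLine(ApplyDeadzone(0.05f, 0f, 0.2f)); // inside the deadzone
Console.WriteLine(ApplyDeadzone(0.6f, 0f, 0.2f));  // halfway through the live range
Console.WriteLine(ApplyDeadzone(1f, 0f, 0.2f));    // full tilt
```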
* Add RuntimeIdentifiers properties
For Linux, Windows and OS X x86-64
This ensures that the SoundIO project gets this property when built as a subproject
* Address gdkchan's nit
Merge tags into one
- Run the extra data fix in FixExtraData on non-system saves that have no owner ID.
- Set the owner ID in the dummy application control property if an application doesn't have a proper one available.
[ If any section does not apply, replace its contents with "N/A". ]</br>
[ If you do not have the information needed for a section, replace its contents with "Unknown". ]</br>
[ Lines between [ ] (square brackets) are to be removed before posting. ]</br>
[ Please search for existing [feature requests](https://github.com/Ryujinx/Ryujinx/issues) before you make your own request. ]</br>
[ Duplicate requests will be marked as such and you will be referred to the original request. ]
### What feature are you suggesting?
#### Overview:
- [ Include the basic, high-level concepts for this feature here. ]
#### Smaller Details:
- [ These may include specific methods of implementation etc. ]
#### Nature of Request:
[ Remove all that do not apply to your request. ]
- Addition
- [ Ex: Addition of certain original features or features from other community projects. ]
- [ If you are suggesting porting features or including features from other projects, include what license they are distributed under and which libraries, if any, those projects use. ]
- Change
- Removal
- [Ex: Removal of certain features or implementation due to a specific issue/bug or because of low quality code, etc.]
### Why would this feature be useful?
[ If this is a feature for an end-user, how does it benefit the end-user? ]</br>
[ If this feature is for developers, what does it add to Ryujinx that did not already exist? ]
[ If any section does not apply, replace its contents with "N/A". ]</br>
[ If you do not have the information needed for a section, replace its contents with "Unknown". ]</br>
[ Lines between [ ] (square brackets) are to be removed before posting. ]
[ Please search for existing [missing CPU instruction](https://github.com/Ryujinx/Ryujinx/issues) before you make your own issue. ]</br>
[ See the following [issue](https://github.com/Ryujinx/Ryujinx/issues/1405) as an example ]</br>
[ Duplicate issues will be marked as such and you will be referred to the original issue. ]
### What CPU instruction is missing?
Requires the *INSTRUCTION* instruction.</br>
[ Replace *INSTRUCTION* by the instruction name, e.g. VADDL.U16 ]
```
*
```
[ Add the undefined instruction error message in the above code block ]
### Instruction name
```
*
```
[ Include the name from [armconverter.com](https://armconverter.com/?disasm) or [shell-storm.org](http://shell-storm.org/online/Online-Assembler-and-Disassembler/?arch=arm64&endianness=big&dis_with_raw=True&dis_with_ins=True) in the above code block ]
### Required by:
[ Add links from our [games list database](https://github.com/Ryujinx/Ryujinx-Games-List/issues) to the games that require this instruction ]
description: CPU Instruction is missing in Ryujinx.
title: "[CPU]"
labels: [cpu, not-implemented]
body:
  - type: textarea
    id: instruction
    attributes:
      label: CPU instruction
      description: What CPU instruction is missing?
    validations:
      required: true
  - type: textarea
    id: name
    attributes:
      label: Instruction name
      description: Include the name from [armconverter.com](https://armconverter.com/?disasm) or [shell-storm.org](http://shell-storm.org/online/Online-Assembler-and-Disassembler/?arch=arm64&endianness=big&dis_with_raw=True&dis_with_ins=True) in the above code block
    validations:
      required: true
  - type: textarea
    id: required
    attributes:
      label: Required by
      description: Add links to the [compatibility list page(s)](https://github.com/Ryujinx/Ryujinx-Games-List/issues) of the game(s) that require this instruction.
[ If any section does not apply, replace its contents with "N/A". ]</br>
[ If you do not have the information needed for a section, replace its contents with "Unknown". ]</br>
[ Lines between [ ] (square brackets) are to be removed before posting. ]
[ Please search for existing [missing service call](https://github.com/Ryujinx/Ryujinx/issues) before you make your own issue. ]</br>
[ See the following [issue](https://github.com/Ryujinx/Ryujinx/issues/1431) as an example ]</br>
[ Duplicate issues will be marked as such and you will be referred to the original issue. ]
### What service call is missing?
*SERVICE* *INTERFACE*: *NUMBER* (*NAME*) is not implemented.</br>
[ Replace *SERVICE* by the service name, e.g. appletAE ]</br>
[ Replace *INTERFACE* by the interface name, e.g. IAllSystemAppletProxiesService ]</br>
[ Replace *NUMBER* by the call number, e.g. 100 ]</br>
[ Replace *NAME* by the call name, e.g. OpenSystemAppletProxy ]</br>
[ e.g. appletAE IAllSystemAppletProxiesService: 100 (OpenSystemAppletProxy) ]
[ Add related links to the specific call from [Switchbrew](https://switchbrew.org/w/index.php?title=Services_API) and/or [SwIPC](https://reswitched.github.io/SwIPC/) ]
### Service description
```
*
```
[ Include the description/explanation from [Switchbrew](https://switchbrew.org/w/index.php?title=Services_API) and/or [SwIPC](https://reswitched.github.io/SwIPC/) in the above code block ]
### Required by:
[ Add links from our [games list database](https://github.com/Ryujinx/Ryujinx-Games-List/issues) to the games that require this call ]
      description: Include the description/explanation from [Switchbrew](https://switchbrew.org/w/index.php?title=Services_API) and/or [SwIPC](https://reswitched.github.io/SwIPC/) in the above code block
    validations:
      required: true
  - type: textarea
    id: required
    attributes:
      label: Required by
      description: Add links to the [compatibility list page(s)](https://github.com/Ryujinx/Ryujinx-Games-List/issues) of the game(s) that require this service.
description: Shader Instruction is missing in Ryujinx.
title: "[GPU]"
labels: [gpu, not-implemented]
body:
  - type: textarea
    id: instruction
    attributes:
      label: Shader instruction
      description: What shader instruction is missing?
    validations:
      required: true
  - type: textarea
    id: required
    attributes:
      label: Required by
      description: Add links to the [compatibility list page(s)](https://github.com/Ryujinx/Ryujinx-Games-List/issues) of the game(s) that require this instruction.
// Helpers to index doublewords within quad words. Essentially, looping over the vector starts at quadword Q and index Fx or Ix within it,
// depending on instruction type.
//
// Qx: The quadword register that the target vector is contained in.
// Ix: The starting index of the target vector within the quadword, with size treated as integer.
// Fx: The starting index of the target vector within the quadword, with size treated as floating point. (16 or 32)
public int Qd => GetQuadwordIndex(Vd);
public int Id => GetQuadwordSubindex(Vd) << (3 - Size);
public int Fd => GetQuadwordSubindex(Vd) << (1 - (Size & 1)); // When the top bit is truncated, 1 is fp16 which is an optional extension in ARMv8.2. We always assume 64.