Commit Graph

61 Commits

Author SHA1 Message Date
ed2f5ede0f Fix texture sampling with depth compare and LOD level or bias (#2404)
* Fix texture sampling with depth compare and LOD level or bias

* Shader cache version bump

* nit: Sorting
2021-06-25 00:54:50 +02:00
49edf14a3e Pass all inputs when geometry shader passthrough is enabled (#2362)
* Pass all inputs when geometry shader passthrough is enabled

* Shader cache version bump
2021-06-23 23:04:59 +02:00
b34c0a47b4 Fix constant buffer array size when indexing is used and other buffer descriptor and resolution scale regressions (#2298)
* Fix constant buffer array size when indexing is used

* Change default QueryConstantBufferUse value

* Fix more regressions

* Ensure proper order
2021-05-20 15:12:15 -03:00
49745cfa37 Move shader resource descriptor creation out of the backend (#2290)
* Move shader resource descriptor creation out of the backend

* Remove now unused code, and other nits

* Shader cache version bump

* Nits

* Set format for bindless image load/store

* Fix buffer write flag
2021-05-19 23:15:26 +02:00
0e9823d7e6 Fix shader buffer write flag on atomic instructions (#2261)
* Fix shader buffer write flag on atomic instructions

* Shader cache version bump
2021-05-01 20:46:21 +02:00
524fe3bea4 Implement shader HelperThreadNV (#2163)
* Implement shader HelperThreadNV

* Bump shader cache version

* Use gl_HelperInvocation since its supported across all vendors

* Nit
2021-04-02 21:50:35 +11:00
1623ab524f Improve Buffer Textures and flush Image Stores (#2088)
* Improve Buffer Textures and flush Image Stores

Fixes a number of issues with buffer textures:

- Reworked Buffer Textures to create their buffers in the TextureManager, then bind them with the BufferManager later.
  - Fixes an issue where a buffer texture's buffer could be invalidated after it is bound, but before use.
- Fixed width unpacking for large buffer textures. The width is now 32-bit rather than 16.
- Force buffer textures to be rebound whenever any buffer is created, as using the handle id wasn't reliable, and the cost of binding isn't too high.

Fixes vertex explosions and flickering animations in UE4 games.

* Set ImageStore flag... for ImageStore.

* Check the offset and size.
2021-03-08 18:43:39 -03:00
f93089a64f Implement geometry shader passthrough (#1961)
* Implement geometry shader passthrough

* Cache version change
2021-01-29 14:38:51 +11:00
4b7c7dab9e Support multiple destination operands on shader IR and shuffle predicates (#1964)
* Support multiple destination operands on shader IR and shuffle predicates

* Cache version change
2021-01-28 10:59:47 +11:00
98d0240ce6 Support shader F32 to Bool reinterpretation (#1969) 2021-01-27 09:19:30 +01:00
a8e9dd2f83 Fix regression on shader atomic SSBO operations (#1967)
* Fix regression on shader atomic SSBO operations

* Update comment
2021-01-27 11:26:23 +11:00
e453ba69f4 Add support for shader atomic min/max (S32) (#1948) 2021-01-26 17:38:33 +11:00
a1f77a5b6a Implement lazy flush-on-read for Buffers (SSBO/Copy) (#1790)
* Initial implementation of buffer flush (VERY WIP)

* Host shaders need to be rebuilt for the SSBO write flag.

* New approach with reserved regions and gl sync

* Fix a ton of buffer issues.

* Remove unused buffer unmapped behaviour

* Revert "Remove unused buffer unmapped behaviour"

This reverts commit f1700e52fb8760180ac5e0987a07d409d1e70ece.

* Delete modified ranges on unmap

Fixes potential crashes in Super Smash Bros, where a previously modified range could lie on either side of an unmap.

* Cache some more delegates.

* Dispose Sync on Close

* Also create host sync for GPFifo syncpoint increment.

* Copy buffer optimization, add docs

* Fix race condition with OpenGL Sync

* Enable read tracking on CommandBuffer, insert syncpoint on WaitForIdle

* Performance: Only flush individual pages of SSBO at a time

This avoids flushing large amounts of data when only a small amount is actually used.

* Signal Modified rather than flushing after clear

* Fix some docs and code style.

* Introduce a new test for tracking memory protection.

Sucessfully demonstrates that the bug causing write protection to be cleared by a read action has been fixed. (these tests fail on master)

* Address Comments

* Add host sync for SetReference

This ensures that any indirect draws will correctly flush any related buffer data written before them. Fixes some flashing and misplaced world geometry in MH rise.

* Make PageAlign static

* Re-enable read tracking, for reads.
2021-01-17 17:08:06 -03:00
461c24092a Implement Force Early Z Register (#1755) 2020-12-02 00:13:27 +01:00
934a78005e Simplify logic for bindless texture handling (#1667)
* Simplify logic for bindless texture handling

* Nits
2020-11-09 19:35:04 -03:00
8d168574eb Use explicit buffer and texture bindings on shaders (#1666)
* Use explicit buffer and texture bindings on shaders

* More XML docs and other nits
2020-11-08 12:10:00 +01:00
ce9105a130 Support single precision contants for double precision operations (#1673) 2020-11-06 18:54:13 +01:00
e1da7df207 Support res scale on images, correctly blacklist for SUST, move logic out of backend. (#1657)
* Support res scale on images, correctly blacklist for SUST, move logic
out of backend.

* Fix Typo
2020-11-02 16:53:23 -03:00
780c7530d6 Add scaling for Texture2DArray when using TexelFetch. (#1645)
* Add scaling for Texture2DArray when using TexelFetch.

* Should only really trigger for Texture2D. (do not emit texelfetch for
TextureBufferArray(?) and Texture1DArray)

* Address nit.
2020-10-28 21:27:46 +01:00
0031edae27 Avoid sampler conflicts on bindless samplers with the same name (#1642) 2020-10-28 21:20:43 +01:00
49f970d5bd Implement CAL and RET shader instructions (#1618)
* Add support for CAL and RET shader instructions

* Remove unused stuff

* Fix a bug that could cause the wrong values to be passed to a function

* Avoid repopulating function id dictionary every time

* PR feedback

* Fix vertex shader A/B merge
2020-10-25 17:00:44 -03:00
2dcc6333f8 Fix image binding format (#1625)
* Fix image binding format

* XML doc
2020-10-20 19:03:20 -03:00
5264d55b39 Fix gl_in being used with built-in variables that are not per-vertex (#1624) 2020-10-17 10:16:40 +02:00
b066cfc1a3 Add support for shader constant buffer slot indexing (#1608)
* Add support for shader constant buffer slot indexing

* Fix typo
2020-10-12 21:40:50 -03:00
991784868f Fix shader regression on Intel iGPUs by reverting layout changes (#1425) 2020-07-29 08:01:11 +10:00
8dbcae1ff8 Implement BGRA texture support (#1418)
* Implement BGRA texture support

* Missing AppendLine

* Remove empty lines

* Address PR feedback
2020-07-26 00:03:40 -03:00
788ca6a411 Initial transform feedback support (#1370)
* Initial transform feedback support

* Some nits and fixes

* Update ReportCounterType and Write method

* Can't change shader or TFB bindings while TFB is active

* Fix geometry shader input names with new naming
2020-07-15 13:01:10 +10:00
484eb645ae Implement Zero-Configuration Resolution Scaling (#1365)
* Initial implementation of Render Target Scaling

Works with most games I have. No GUI option right now, it is hardcoded.

Missing handling for texelFetch operation.

* Realtime Configuration, refactoring.

* texelFetch scaling on fragment shader (WIP)

* Improve Shader-Side changes.

* Fix potential crash when no color/depth bound

* Workaround random uses of textures in compute.

This was blacklisting textures in a few games despite causing no bugs. Will eventually add full support so this doesn't break anything.

* Fix scales oscillating when changing between non-native scales.

* Scaled textures on compute, cleanup, lazier uniform update.

* Cleanup.

* Fix stupidity

* Address Thog Feedback.

* Cover most of GDK's feedback (two comments remain)

* Fix bad rename

* Move IsDepthStencil to FormatExtensions, add docs.

* Fix default config, square texture detection.

* Three final fixes:

- Nearest copy when texture is integer format.
- Texture2D -> Texture3D copy correctly blacklists the texture before trying an unscaled copy (caused driver error)
- Discount small textures.

* Remove scale threshold.

Not needed right now - we'll see if we run into problems.

* All CPU modification blacklists scale.

* Fix comment.
2020-07-07 04:41:07 +02:00
5795bb1528 Support separate textures and samplers (#1216)
* Support separate textures and samplers

* Add missing bindless flag, fix SNORM format on buffer textures

* Add missing separation

* Add comments about the new handles
2020-05-27 16:07:10 +02:00
0b6d206daa Omit image format if possible, and fix BA bit (#1280)
* Omit image format if possible, and fix BA bit

* Match extension name
2020-05-27 11:00:21 +02:00
b8eb6abecc Refactor shader GPU state and memory access (#1203)
* Refactor shader GPU state and memory access

* Fix NVDEC project build

* Address PR feedback and add missing XML comments
2020-05-06 11:02:28 +10:00
03711dd7b5 Implement SULD shader instruction (#1117)
* Implement SULD shader instruction

* Some nits
2020-04-22 09:35:28 +10:00
e93ca84b14 Better IPA shader instruction implementation (#1082)
* Fix varying interpolation on fragment shader

* Some nits

* Alignment
2020-04-03 11:20:47 +11:00
5b5239ab5b Remove output interpolation qualifier (#1070) 2020-04-02 12:24:55 +11:00
f9c859c8ba Index constant buffer vec4s using ternary expressions. (#1015)
* Index constant buffer vec4s using ternary expressions.

* Remove indexed path.

We determined that it had negligible impact.

* Revert "Remove indexed path."

This reverts commit 25ec4eddfa.

* Revert "Revert "Remove indexed path.""

This reverts commit 7cd52fecb5.
2020-03-29 13:24:54 -03:00
49d7b1c7d8 Implement textureQueryLevels (#1007) 2020-03-23 08:31:31 +11:00
8bb64ac69c Improve shader sampler type selection (#989) 2020-03-15 11:24:45 +11:00
dc97457bf0 Initial support for double precision shader instructions. (#963)
* Implement DADD, DFMA and DMUL shader instructions

* Rename FP to FP32

* Correct double immediate

* Classic mistake
2020-03-03 15:02:08 +01:00
796e5d14b4 Use correct shader local memory size instead of a hardcoded size (#914)
* Use correct shader local size instead of a hardcoded size

* Remove unused uniform block

* Update XML doc

* Local memory size has 23 bits on maxwell

* Generate compute QMD struct from nv open doc header

* Remove dummy arrays when shared or local memory is not used, other improvements
2020-02-02 14:25:52 +11:00
92703af555 Address PR feedback 2020-01-09 02:13:00 +01:00
654e617fe7 Some code cleanup 2020-01-09 02:13:00 +01:00
947e14d3be Reimplement limited bindless textures support 2020-01-09 02:13:00 +01:00
9d7a142a48 Support texture rectangle targets (non-normalized coords) 2020-01-09 02:13:00 +01:00
2eccc7023a Partial support for shader memory barriers 2020-01-09 02:13:00 +01:00
f2c85c5d58 Support non-constant texture offsets on non-NVIDIA gpus 2020-01-09 02:13:00 +01:00
66d91cbc6c Use dispatch params shared memory size when available 2020-01-09 02:13:00 +01:00
0d9672f3ae Use maximum shared memory size supported by hardware 2020-01-09 02:13:00 +01:00
7ce5584f9e Support depth clip mode and disable shader fast math optimization on NVIDIA as a workaround for compiler bugs (?) 2020-01-09 02:13:00 +01:00
17fb11ddb9 Fix wrong maximum id on sampler pool in some cases 2020-01-09 02:13:00 +01:00
cb171f6ebf Support shared color mask, implement more shader instructions
Support shared color masks (used by Nouveau and maybe the NVIDIA
driver).
Support draw buffers (also required by OpenGL).
Support viewport transform disable (disabled for now as it breaks some
games).
Fix instanced rendering draw being ignored for multi draw.
Fix IADD and IADD3 immediate shader encodings, that was not matching
some ops.
Implement FFMA32I shader instruction.
Implement IMAD shader instruction.
2020-01-09 02:13:00 +01:00