Commit Graph

58 Commits

Author SHA1 Message Date
7f6b3d234a Implement IMUL, PCNT and CONT shader instructions, fix FFMA32I and HFMA32I (#2972)
* Implement IMUL shader instruction

* Implement PCNT/CONT instruction and fix FFMA32I

* Add HFMA232I to the table

* Shader cache version bump

* No Rc on Ffma32i
2022-01-10 12:08:00 -03:00
911ea38e93 Support shader gl_Color, gl_SecondaryColor and gl_TexCoord built-ins (#2817)
* Support shader gl_Color, gl_SecondaryColor and gl_TexCoord built-ins

* Shader cache version bump

* Fix back color value on fragment shader

* Disable IPA multiplication for fixed function attributes and back color selection
2021-11-08 13:18:46 -03:00
99445dd0a6 Add support for fragment shader interlock (#2768)
* Support coherent images

* Add support for fragment shader interlock

* Change to tree based match approach

* Refactor + check for branch targets and external registers

* Make detection more robust

* Use Intel fragment shader ordering if interlock is not available, use nothing if both are not available

* Remove unused field
2021-10-28 19:53:12 -03:00
d512ce122c Initial tessellation shader support (#2534)
* Initial tessellation shader support

* Nits

* Re-arrange built-in table

* This is not needed anymore

* PR feedback
2021-10-18 18:38:04 -03:00
7603dbe3c8 Add missing U8/S8 types from shader I2I instruction (#2740)
* Add missing U8/S8 types from shader I2I instruction

* Better names

* Fix dstIsSignedInt
2021-10-17 17:48:36 -03:00
a7109c767b Rewrite shader decoding stage (#2698)
* Rewrite shader decoding stage

* Fix P2R constant buffer encoding

* Fix PSET/PSETP

* PR feedback

* Log unimplemented shader instructions

* Implement NOP

* Remove using

* PR feedback
2021-10-12 22:35:31 +02:00
142cededd4 Implement Shader Instructions SUATOM and SURED (#2090)
* Initial Implementation

* Further improvements (no support for float/64-bit types)

* Merge atomic and reduce instructions, add missing format switch

* Fix rebase issues.

* Not used.

* Whoops. Fixed.

* Partial implementation of inc/dec, cleanup and TODOs

* Remove testing path

* Address Feedback
2021-08-31 02:51:57 -03:00
ee1038e542 Initial support for shader attribute indexing (#2546)
* Initial support for shader attribute indexing

* Support output indexing too, other improvements

* Fix order

* Address feedback
2021-08-27 01:44:47 +02:00
ed754af8d5 Make sure attributes used on subsequent shader stages are initialized (#2538) 2021-08-11 22:27:00 +02:00
d9d18439f6 Use a new approach for shader BRX targets (#2532)
* Use a new approach for shader BRX targets

* Make shader cache actually work

* Improve the shader pattern matching a bit

* Extend LDC search to predecessor blocks, catches more cases

* Nit

* Only save the amount of constant buffer data actually used. Avoids crashes on partially mapped buffers

* Ignore Rd on predicate instructions, as they do not have a Rd register (catches more cases)
2021-08-11 20:59:42 +02:00
7ff1f9aa12 End shader decoding when reaching a block that starts with an infinite loop (after BRX) (#2367)
* End shader decoding when reaching an infinite loop

The NV shader compiler puts these at the end of shaders.

* Update shader cache version
2021-06-15 02:09:59 +02:00
3b90adcd1d Fix shaders with mixed PBK and SSY addresses on the stack (#2329)
* Fix shaders with mixed PBK and SSY addresses on the stack

* Address PR feedback and nits
2021-06-03 01:41:53 +02:00
524fe3bea4 Implement shader HelperThreadNV (#2163)
* Implement shader HelperThreadNV

* Bump shader cache version

* Use gl_HelperInvocation since it's supported across all vendors

* Nit
2021-04-02 21:50:35 +11:00
a5d5ca0635 Shader Cache: Move bindless checking from translation to decode (#2145) 2021-03-27 00:50:26 +01:00
5be6ec6364 Fix shader LOP3 predicate write condition (#1910)
* Fix LOP3 predicate write condition

* Bump shader cache version
2021-01-14 01:07:50 +01:00
8e0a421264 Fix remap when handle is 0 (#1882)
* Nvservices cleanup and attempt to fix remap

* Unmap if remap handle is 0

* Remove mapped pool add from Remap
2021-01-10 10:11:31 +11:00
b9200dd734 Support conditional on BRK and SYNC shader instructions (#1878)
* Support conditional on BRK and SYNC shader instructions

* Add TODO comment and bump cache version
2021-01-08 22:55:55 -03:00
48f6570557 Salieri: shader cache (#1701)
Here comes Salieri, my implementation of a disk shader cache!

"I'm sure you know why I named it that."
"It doesn't really mean anything."

This implementation collects shaders at runtime and caches them so they can be compiled later, when starting a game.
2020-11-13 00:15:34 +01:00
c3d62bd078 Implement ATOM shader instruction (#1687)
* Implement ATOM shader instruction

* Fix reduction type decoding
2020-11-10 01:06:46 +01:00
49f970d5bd Implement CAL and RET shader instructions (#1618)
* Add support for CAL and RET shader instructions

* Remove unused stuff

* Fix a bug that could cause the wrong values to be passed to a function

* Avoid repopulating function id dictionary every time

* PR feedback

* Fix vertex shader A/B merge
2020-10-25 17:00:44 -03:00
2f16491712 Get rid of Reflection.Emit dependency on CPU and Shader projects (#1626)
* Get rid of Reflection.Emit dependency on CPU and Shader projects

* Remove useless private sets

* Missed those due to the alignment
2020-10-21 09:13:44 -03:00
f02791b20c Fix LOP3 (cbuf) shader instruction encoding (#1616) 2020-10-13 19:33:04 -03:00
e4777717cd Implement LEA.HI shader instruction (#1609) 2020-10-12 21:46:04 -03:00
b066cfc1a3 Add support for shader constant buffer slot indexing (#1608)
* Add support for shader constant buffer slot indexing

* Fix typo
2020-10-12 21:40:50 -03:00
0954e76a26 Improve BRX target detection heuristics (#1591) 2020-10-03 15:43:33 +10:00
e13154c83d Implement shader LEA instruction and improve bindless image load/store (#1355) 2020-07-04 01:48:44 +02:00
0b6d206daa Omit image format if possible, and fix BA bit (#1280)
* Omit image format if possible, and fix BA bit

* Match extension name
2020-05-27 11:00:21 +02:00
ff7a933ec0 Implement TMML and TMML.B (#1270)
* Implement TMML and TMML.B

This implements the TMML and TMML.B instructions

* Fix TmmlB declaration alignment

* Address gdkchan's comments

* Fix inverted encoding definitions
2020-05-23 12:04:35 +02:00
b8eb6abecc Refactor shader GPU state and memory access (#1203)
* Refactor shader GPU state and memory access

* Fix NVDEC project build

* Address PR feedback and add missing XML comments
2020-05-06 11:02:28 +10:00
3cb1fa0e85 Implement texture buffers (#1152)
* Implement texture buffers

* Throw NotSupportedException where appropriate
2020-04-25 23:02:18 +10:00
03711dd7b5 Implement SULD shader instruction (#1117)
* Implement SULD shader instruction

* Some nits
2020-04-22 09:35:28 +10:00
d599fba711 Implement FCMP shader instruction (#1067) 2020-03-30 12:04:00 +02:00
7ad8b3ef75 Move the OpActivator to OpCodeTable class to improve performance (#1001)
* Move the OpActivator to OpCodeTable class, to reduce the use of ConcurrentDictionary

* Modify code style.
2020-03-29 19:52:56 +11:00
06bf25521f Implement NOP and stub DEPBAR shader instructions (#1041)
* Implement NOP and stub DEPBAR shader instruction

* Fix a few issues and formatting stuff

* Remove OpCodeNop/Depbar and use OpCode instead

* Fix NOP shader instruction opcode

* Fix formatting
2020-03-26 19:30:16 -03:00
1586450a38 Implement VMNMX shader instruction (#1032)
* Implement VMNMX shader instruction

* No need for the gap on the enum

* Fix typo
2020-03-25 15:49:10 +01:00
6edc929894 Implement ICMP shader instruction (#1010) 2020-03-23 17:32:30 +01:00
54501962f6 Fix branch with CC and predicate, and a case of SYNC propagation (#967) 2020-03-06 11:09:49 +11:00
dc97457bf0 Initial support for double precision shader instructions. (#963)
* Implement DADD, DFMA and DMUL shader instructions

* Rename FP to FP32

* Correct double immediate

* Classic mistake
2020-03-03 15:02:08 +01:00
5a9dba0756 Sign-extend shader memory instruction offsets (#934) 2020-02-14 01:48:07 +01:00
b8e3909d80 Add a GetSpan method to the memory manager and use it on GPU (#877) 2020-01-13 10:27:50 +11:00
29a825b43b Address PR feedback
Removes a useless null check

Aligns some values to improve readability
2020-01-09 02:13:00 +01:00
18814d44b2 Address PR feedback
Add TODO comment for GL_EXT_polygon_offset_clamp
2020-01-09 02:13:00 +01:00
2eccc7023a Partial support for shader memory barriers 2020-01-09 02:13:00 +01:00
6b13c5b439 Support bindless texture gather shader instruction 2020-01-09 02:13:00 +01:00
cb171f6ebf Support shared color mask, implement more shader instructions
Support shared color masks (used by Nouveau and maybe the NVIDIA
driver).
Support draw buffers (also required by OpenGL).
Support viewport transform disable (disabled for now as it breaks some
games).
Fix instanced rendering draw being ignored for multi draw.
Fix IADD and IADD3 immediate shader encodings, which were not matching
some ops.
Implement FFMA32I shader instruction.
Implement IMAD shader instruction.
2020-01-09 02:13:00 +01:00
gdk
6a98c643ca Add a pass to turn global memory access into storage access, and do all storage related transformations on IR 2020-01-09 02:13:00 +01:00
gdk
442485adb3 Partial support for branch with CC, and fix an edge case of branch out of loop on shaders 2020-01-09 02:13:00 +01:00
gdk
b8528c6317 Implement HSET2 shader instruction and fix errors uncovered by Rodrigo tests 2020-01-09 02:13:00 +01:00
gdk
e0c95b18eb Add PSET shader instruction 2020-01-09 02:13:00 +01:00
gdk
6a8ba6d600 Add R2P shader instruction 2020-01-09 02:13:00 +01:00