Compare commits

...

98 Commits

Author SHA1 Message Date
36172ab43b Scale SamplesPassed counter by RT scale on report (#3680)
* Scale SamplesPassed counter by RT scale on report

Adds a scale factor for samples passed counter report based on the render target scale at the time. This ensures that when a game reads this counter, it appears similar to the result at 1x.

This doesn't cover cases where the the render target scale changes during the queried draws, though that might be better to handle along with other scope related issues in a future rework of counters. Games generally don't count for occlusion queries over render target changes anyways.

Fixes an issue in the Splatoon games where the special charge would scale too quickly at high res, points at the end of the game would be broken (but still provide a correct winner), and playing at a low res would make it impossible to swim in ink.

May also affect LOD scaling in The Witcher 3.

* Update Ryujinx.Graphics.Gpu/Engine/Threed/SemaphoreUpdater.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-09-11 15:58:15 +00:00
4d69286a9c Implement VRINT (vector) Arm32 NEON instructions (#3691) 2022-09-11 15:44:27 +00:00
1529e6cf0d T32: Add Vfp instructions (#3690) 2022-09-10 23:03:14 -03:00
f468db7602 Implement Thumb (32-bit) memory (ordered), multiply, extension and bitfield instructions (#3687)
* Implement Thumb (32-bit) memory (ordered), multiply and bitfield instructions

* Remove public from interface

* Fix T32 BL immediate and implement signed and unsigned extend instructions
2022-09-10 22:51:00 -03:00
gdk
c5f1d1749a Revert address space mirror changes 2022-09-10 16:23:49 +02:00
gdk
7dd69f2d0e Allocation free tree lookup 2022-09-10 16:23:49 +02:00
gdk
c646638680 Update several methods to use GetNode directly and avoid array allocations 2022-09-10 16:23:49 +02:00
gdk
65f2a82b97 Optimize PlaceholderManager.UnreserveRange 2022-09-10 16:23:49 +02:00
gdk
93dd6d525a Fix potential issue with partial unmap
We must also do the unmap operation with the RWLock, otherwise faults on the unmapped region will cause crashes and the whole thing becomes pointless
2022-09-10 16:23:49 +02:00
gdk
96d4ad952c Fix reprotection regression 2022-09-10 16:23:49 +02:00
gdk
6a07f80b76 Make RBTree node fields internal again
Prevents someone from accidentaly messing with them and leaving the tree in a invalid state
2022-09-10 16:23:49 +02:00
gdk
22214ac664 Delete unused code 2022-09-10 16:23:49 +02:00
gdk
45e520a27c Rewrite PlaceholderManager4KB to use intrusive RBTree, and to coalesce free placeholders
Also make the other placeholder manager use intrusive RBTree, allows the IntervalTree that was added just for this to be deleted
2022-09-10 16:23:49 +02:00
gdk
5b5810a46a Defer address space mirror mapping and use it only if strictly needed 2022-09-10 16:23:49 +02:00
619ac86bd0 Do not output ViewportIndex on SPIR-V if GPU does not support it (#3644)
* Do not output ViewportIndex on SPIR-V if GPU does not support it

* Bump shader cache version
2022-09-10 13:20:23 +00:00
7a1ab71c73 Update README.MD verbiage and compatibility 2022-09-10 15:07:37 +02:00
dc4ba3993b Rebind textures if format changes or they're buffer textures 2022-09-10 14:12:50 +02:00
81f1a4dc31 Allocate work buffer for audio renderer instead of using guest supplied memory (#3276)
* Allocate work buffer for audio renderer instead of using guest supplied memory

* Typo

* Use GC.AllocateArray to allocate pinned array
2022-09-10 01:16:24 +00:00
c64524a240 Add ADD (zx imm12), NOP, MOV (rs), LDA, TBB, TBH, MOV (zx imm16) and CLZ thumb instructions (#3683)
* Add ADD (zx imm12), NOP, MOV (register shifted), LDA, TBB, TBH, MOV (zx imm16) and CLZ thumb instructions, fix LDRD, STRD, CBZ, CBNZ and BLX (reg)

* Bump PPTC version
2022-09-09 22:09:11 -03:00
db45688aa8 Implement VRSRA, VRSHRN, VQSHRUN, VQMOVN, VQMOVUN, VQADD, VQSUB, VRHADD, VPADDL, VSUBL, VQDMULH and VMLAL Arm32 NEON instructions (#3677)
* Implement VRSRA, VRSHRN, VQSHRUN, VQMOVN, VQMOVUN, VQADD, VQSUB, VRHADD, VPADDL, VSUBL, VQDMULH and VMLAL Arm32 NEON instructions

* PPTC version

* Fix VQADD/VQSUB

* Improve MRC/MCR handling and exception messages

In case data is being recompiled as code, we don't want to throw at emit stage, instead we should only throw if it actually tries to execute
2022-09-09 21:47:38 -03:00
c6d82209ab Restride vertex buffer when stride causes attributes to misalign in Vulkan. (#3679)
* Vertex Buffer Alignment part 1

* Update CacheByRange

* Add Stride Change compute shader, fix storage buffers in helpers

* An AMD exclusive

* Reword

* Change rules - stride conversion when attrs misalign

* Fix stupid mistake

* Fix background pipeline compile

* Improve a few things.

* Fix some feedback

* Address Feedback

(the shader binary didn't change when i changed the source to use the subgroup size)

* Fix bug where rewritten buffer would be disposed instantly.
2022-09-08 20:30:19 -03:00
ee1825219b Clean up rejit queue (#2751) 2022-09-08 20:14:08 -03:00
7baa08dcb4 Implemented in IR the managed methods of the Saturating region ... (#3665)
* Implemented in IR the managed methods of the Saturating region ...

... of the SoftFallback class (the SatQ ones).

The need to natively manage the Fpcr and Fpsr system registers is still a fact.

Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones).

All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.

* Ptc.InternalVersion = 3665

* Addressed PR feedback.
2022-09-08 19:40:41 -03:00
408bd63b08 Transform shader LDC into constant buffer access if offset is constant (#3672)
* Transform shader LDC into constant buffer access if offset is constant

* Shader cache version bump
2022-09-07 20:25:22 -03:00
df99257d7f bsd: improve socket poll
We should report errors even when not requested.

This also ensure we only clear the bits that were requested on the output.

Finally, this fix when input events is 0.
2022-09-07 22:58:41 +02:00
f3835dc78b bsd: implement SendMMsg and RecvMMsg (#3660)
* bsd: implement sendmmsg and recvmmsg

* Fix wrong increment of vlen
2022-09-07 22:37:15 +02:00
51bb8707ef Update bug report template (#3676)
Adds some verbiage to indicate that game-specific issues should be posted instead on the game compatibility list, unless it is a provable regression.
2022-09-06 22:30:07 +02:00
5ff5fe47ba Bsd: Fix NullReferenceException in BsdSockAddr.FromIPEndPoint() (#3652)
* Bsd: Fix NullReferenceException in BsdSockAddr.FromIPEndPoint()

Allows "Victor Vran Overkill Edition" to boot with guest internet access enabled.
Thanks to EmulationFanatic for testing this for me!

* Bsd: Return proper error code if RemoteEndPoint is null

* Remove whitespace from empty line

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-09-01 22:04:01 +00:00
38275f9056 Change vsync signal to happen at 60hz, regardless of swap interval (#3642)
* Change vsync signal to happen at 60hz, regardless of swap interval

* Update Ryujinx.HLE/HOS/Services/SurfaceFlinger/SurfaceFlinger.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Fix softlock when toggling vsync

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-09-01 17:57:50 -03:00
67cbdc3a6a bsd: Fix Poll(0) returning ETIMEDOUT instead of SUCCESS
This was an oversight of the implementation.
2022-09-01 21:46:11 +02:00
131b43170e sfdsnres: fix endianess issue for port serialisation 2022-09-01 21:31:20 +02:00
730d2f4b9b Address gdkchan's comment 2022-08-31 21:33:03 +02:00
f6a7309b14 account: Implement LoadNetworkServiceLicenseKindAsync
This is needed to run Pokemon Legends Arceus 1.1.1 with guest internet enabled.

The game still get stuck at loading screen.
2022-08-31 21:33:03 +02:00
472a621589 Bsd: Fix ArgumentOutOfRangeException in SetSocketOption (#3633)
* Bsd: Fix ArgumentOutOfRangeException in SetSocketOption

* Ensure option level is Socket before checking for SoLinger
2022-08-28 14:24:19 +00:00
311c2661b8 Replace image format magic numbers with enums (#3631)
* Replace magic constants with enums

* Extra formatting

* Lower case ASTC dimensions

* Use uint for VertexAttributeFormat
2022-08-28 01:56:26 +00:00
a92e2028cb Updates Japanese localization for the Avalonia UI (#3635) 2022-08-27 07:01:30 +00:00
6922862db8 Optimize kernel memory block lookup and consolidate RBTree implementations (#3410)
* Implement intrusive red-black tree, use it for HLE kernel block manager

* Implement TreeDictionary using IntrusiveRedBlackTree

* Implement IntervalTree using IntrusiveRedBlackTree

* Implement IntervalTree (on Ryujinx.Memory) using IntrusiveRedBlackTree

* Make PredecessorOf and SuccessorOf internal, expose Predecessor and Successor properties on the node itself

* Allocation free tree node lookup
2022-08-26 18:21:48 +00:00
6592d64751 Update Turkish Translation (#3498)
Translated newly added lines and polished older entries.
2022-08-26 18:00:17 +00:00
8001c832d9 Update de_DE.json (#3502)
* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Another one

* Update de_DE.json

* addressed reviews

Co-Authored-By: Miepee <38186597+Miepee@users.noreply.github.com>

* welp

Co-Authored-By: Miepee <38186597+Miepee@users.noreply.github.com>

* Update de_DE.json

* Update de_DE.json

quick update with the latest changes

Co-Authored-By: Miepee <38186597+Miepee@users.noreply.github.com>

* Update de_DE.json

Co-Authored-By: Miepee <38186597+Miepee@users.noreply.github.com>

Co-authored-by: Miepee <38186597+Miepee@users.noreply.github.com>
Co-authored-by: reloxx13 <reloxx@interia.pl>
2022-08-26 19:49:05 +02:00
87919b193c Update zh_CN.json (#3598)
* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Delete zh_CN.json

* fix crash

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json
2022-08-26 19:36:42 +02:00
8de033e60e Avalonia - Add Polish Translation (#3569)
* Update Ryujinx.Ava.csproj

* Update MainWindow.axaml

* Create pl_PL.json

* Update pl_PL.json

* Update pl_PL.json

* Update pl_PL.json

* Update pl_PL.json

* PPTC wording changes

adding PPTC changes

Co-authored-by: Clara <moonbowjelly@gmail.com>
2022-08-26 19:24:59 +02:00
90432946ac Avalonia - Display language names in their corresponding language under "Change Language" (#3490)
* change languages to their native names

* fix Chinese language names

* Update MainWindow.axaml
2022-08-26 19:12:11 +02:00
9bad71afbf bsd: Fix Poll writting in input buffer (#3630)
This is a very old oversight on our Poll implementation.
This worked so far reliably because games and homebrews pass the same
buffer as input and output.
2022-08-26 18:10:45 +02:00
923089a298 Fast path for Inline-to-Memory texture data transfers (#3610)
* Fast path for Inline-to-Memory texture data transfers

* Only do it for block linear textures to be on the safe side
2022-08-26 02:16:41 +00:00
d9aa15eb24 pctl: Implement EndFreeCommunication
This PR Implement `EndFreeCommunication` (checked by RE). Nothing more.

Closes #2420
2022-08-25 23:18:37 +02:00
12c89a61f9 misc: Fix missing null terminator for strings with pchtxt (#3629)
As title say.
2022-08-25 19:59:15 +00:00
f5235fff29 ARMeilleure: Hardware accelerate SHA256 (#3585)
* ARMeilleure/HardwareCapabilities: Add Sha

* ARMeilleure/Intrinsic: Add X86Sha256Rnds2

* ARmeilleure: Hardware accelerate SHA256H/SHA256H2

* ARMeilleure/Intrinsic: Add X86Sha256Msg1, X86Sha256Msg2

* ARMeilleure/Intrinsic: Add X86Palignr

* ARMeilleure: Hardware accelerate SHA256SU0, SHA256SU1

* PTC: Bump InternalVersion
2022-08-25 10:12:13 +00:00
eba682b767 Implement some 32-bit Thumb instructions (#3614)
* Implement some 32-bit Thumb instructions

* Optimize OpCode32MemMult using PopCount
2022-08-25 09:59:34 +00:00
b994dafe7a Update PPTC dialog text to match label and tooltip (#3618)
* Update PPTC dialog text to match label and tooltip

* Update to requested text

* Reverting spaces

* Adding newline back in
2022-08-24 08:25:49 +00:00
54421760c3 Check if game directories have been updated before refreshing GUI (#3607)
* Check if game directories have been updated before refreshing list on save.

* Cleanup spacing

* Add Avalonia and reset value after saving

* Fix Avalonia

* Fix multiple directories not being added in GTK
2022-08-21 13:07:28 +00:00
88a0e720cb Use RGBA16 vertex format if RGB16 is not supported on Vulkan (#3552)
* Use RGBA16 vertex format if RGB16 is not supported on Vulkan

* Catch all shader compilation exceptions
2022-08-20 16:20:27 -03:00
53cc9e0561 Change 'Purge PPTC Cache' label & tooltip to reflect function behavior (#3601)
* Change PPTC purge label & tooltip

* Change Avalonia labels
2022-08-19 23:39:59 +00:00
7defc59b9d A few minor documentation fixes. (#3599)
* A few minor documentation fixes.

* Removed more invalid inheritdoc instances.
2022-08-19 18:21:06 -03:00
951700fdd8 Removed unused usings. (#3593)
* Removed unused usings.

* Added back using, now that it's used.

* Removed extra whitespace.
2022-08-18 18:04:54 +02:00
eb6430f103 Skipped over the last "Count" key explicitly, instead of relying on an exception. (#3595) 2022-08-18 02:00:27 +02:00
80a879cb44 Fix SpirV parse failure (#3597)
* Added .ToString overrides, to help diagnose and debug SpirV generated code.

* Added Spirv to team shared dictionary, so the word will not show up as a warning.

* Fixed bug where we were creating invalid constants (bool 0i and float 0i)

* Update Ryujinx.Graphics.Shader/CodeGen/Spirv/CodeGenContext.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Update Spv.Generator/Instruction.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Adjusted spacing to match style of the rest of the code.

* Added handler for FP64(double) as well, for undefined aggregate types.

* Made the operand labels a static dictionary, to avoid re-allocation on each call.
Replaced Contains/Get with a TryGetValue, to reduce the number of dictionary lookups.

* Added newline between AllOperands and ToString().

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-08-18 01:49:43 +02:00
2197f41506 Removed extra semicolons. (#3594) 2022-08-17 09:05:15 +02:00
c8f9292bab Avalonia - Couple fixes and improvements to vulkan (#3483)
* drop split devices, rebase

* add fallback to opengl if vulkan is not available

* addressed review

* ensure present image references are incremented and decremented when necessary

* allow changing vsync for vulkan

* fix screenshot on avalonia vulkan

* save favorite when toggled

* improve sync between popups

* use separate devices for each new window

* fix crash when closing window

* addressed review

* don't create the main window with immediate mode

* change skia vk delegate to method

* update vulkan throwonerror

* addressed review
2022-08-16 16:32:37 +00:00
0ec933a615 Vulkan: Add ETC2 texture formats (#3576) 2022-08-16 15:42:42 +02:00
2135b6a51a am: Stub SetWirelessPriorityMode, SetWirelessPriorityMode and GetHdcpAuthenticationState (#3535)
This PR some calls in `am` service:
- ISelfController: SetWirelessPriorityMode, SaveCurrentScreenshot (Partially checked by RE).
- ICommonStateGetter: GetHdcpAuthenticationState

Close #1831 and close #3527
2022-08-15 11:12:08 +00:00
00e35d9bf6 ControllerApplet: Override player counts when SingleMode is set (#3571) 2022-08-15 09:46:08 +02:00
6dfb6ccf8c PreAllocator: Check if instruction supports a Vex prefix in IsVexSameOperandDestSrc1 (#3587) 2022-08-14 17:35:08 -03:00
e87e8b012c Fix texture bindings using wrong sampler pool in some cases (#3583) 2022-08-14 14:00:30 -03:00
e8f1ca8427 OpenGL: Limit vertex buffer range for non-indexed draws (#3542)
* Limit vertex buffer range for non-indexed draws

* Fix typo
2022-08-11 20:21:56 -03:00
ad47bd2d4e Fix blend with RGBX color formats (#3553) 2022-08-11 18:23:25 -03:00
a5ff0024fb Rename ToSpan to AsSpan (#3556) 2022-08-11 18:07:37 -03:00
f9661a54d2 add Japanese translation to Avalonia UI (#3489)
* add Japanese translation to Avalonia UI

* translate language names

* fix raised in the review
2022-08-11 17:55:14 -03:00
66e7fdb871 OpenGL: Fix clears of unbound color targets (#3564) 2022-08-08 17:39:22 +00:00
2bb9b33da1 Implement Arm32 Sha256 and MRS Rd, CPSR instructions (#3544)
* Implement Arm32 Sha256 and MRS Rd, CPSR instructions

* Add tests using Arm64 outputs
2022-08-05 19:03:50 +02:00
1080f64df9 Implement HLE macros for render target clears (#3528)
* Implement HLE macros for render target clears

* Add constants for the offsets
2022-08-04 21:30:08 +00:00
c48a75979f Fix Multithreaded Compilation of Shader Cache on OpenGL (#3540)
This was broken by the Vulkan changes - OpenGL was building host caches at boot on one thread, which is very notably slower than when it is multithreaded.

This was caused by trying to get the program binary immediately after compilation started, which blocks. Now it does it after compilation has completed.
2022-08-03 19:37:56 -03:00
842cb26ba5 Sfdnsres; Stub ResolverSetOptionRequest (#3493)
This PR stub ResolverSetOptionRequest (checked by RE), but the options parsing is still missing since we don't support it in our current code.

(Close #3479)
2022-08-03 00:10:28 +02:00
e235d5e7bb Fix resolution scale values not being updated (#3514) 2022-08-02 23:58:56 +02:00
ed0b10c81f Fix geometry shader passthrough fallback being used when feature is supported (#3525)
* Fix geometry shader passthrough fallback being used when feature is supported

* Shader cache version bump
2022-08-02 08:44:30 +02:00
f92650fcff SPIR-V: Initialize undefined variables with 0 (#3526)
* SPIR-V: Initialize undefined variables with a value

Changes undefined values on spir-v shaders (caused by phi nodes) to be initialized instead of truly undefined.

Fixes an issue with NVIDIA gpus seemingly not liking when a variable is _potentially_ undefined. Not sure about the details at the moment.

Fixes:
- Tilt shift blur effect in Link's Awakening (bottom of the screen)
- Potentially block flickering on newer NVIDIA gpus in Splatoon 2? Needs testing.

Testing is welcome.

* Update Ryujinx.Graphics.Shader/CodeGen/Spirv/CodeGenContext.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-08-02 08:11:10 +02:00
712361f6e1 vk: Workaround XCB not availaible on FlatHub build (#3515)
Update SPB to 0.0.4-build24 which hopefully fix the issue by checking
libX11-xcb presence.
2022-08-01 08:46:19 +02:00
2232e4ae87 Vulkan backend (#2518)
* WIP Vulkan implementation

* No need to initialize attributes on the SPIR-V backend anymore

* Allow multithreading shaderc and vkCreateShaderModule

You'll only really see the benefit here with threaded-gal or parallel shader cache compile.

Fix shaderc multithreaded changes

Thread safety for shaderc Options constructor

Dunno how they managed to make a constructor not thread safe, but you do you. May avoid some freezes.

* Support multiple levels/layers for blit.

Fixes MK8D when scaled, maybe a few other games. AMD software "safe" blit not supported right now.

* TextureStorage should hold a ref of the foreign storage, otherwise it might be freed while in use

* New depth-stencil blit method for AMD

* Workaround for AMD driver bug

* Fix some tessellation related issues (still doesn't work?)

* Submit command buffer before Texture GetData. (UE4 fix)

* DrawTexture support

* Fix BGRA on OpenGL backend

* Fix rebase build break

* Support format aliasing on SetImage

* Fix uniform buffers being lost when bindings are out of order

* Fix storage buffers being lost when bindings are out of order

(also avoid allocations when changing bindings)

* Use current command buffer for unscaled copy (perf)

Avoids flushing commands and renting a command buffer when fulfilling copy dependencies and when games do unscaled copies.

* Update to .net6

* Update Silk.NET to version 2.10.1

Somehow, massive performance boost. Seems like their vtable for looking up vulkan methods was really slow before.

* Fix PrimitivesGenerated query, disable Transform Feedback queries for now

Lets Splatoon 2 work on nvidia. (mostly)

* Update counter queue to be similar to the OGL one

Fixes softlocks when games had to flush counters.

* Don't throw when ending conditional rendering for now

This should be re-enabled when conditional rendering is enabled on nvidia etc.

* Update findMSB/findLSB to match master's instruction enum

* Fix triangle overlay on SMO, Captain Toad, maybe others?

* Don't make Intel Mesa pay for Intel Windows bugs

* Fix samplers with MinFilter Linear or Nearest (fixes New Super Mario Bros U Deluxe black borders)

* Update Spv.Generator

* Add alpha test emulation on shader (but no shader specialisation yet...)

* Fix R4G4B4A4Unorm texture format permutation

* Validation layers should be enabled for any log level other than None

* Add barriers around vkCmdCopyImage

Write->Read barrier for src image (we want to wait for a write to read it)
Write->Read barrier for dst image (we want to wait for the copy to complete before use)

* Be a bit more careful with texture access flags, since it can be used for anything

* Device local mapping for all buffers

May avoid issues with drivers with NVIDIA on linux/older gpus on windows when using large buffers (?)
Also some performance things and fixes issues with opengl games loading textures weird.

* Cleanup, disable device local buffers for now.

* Add single queue support

Multiqueue seems to be a bit more responsive on NVIDIA. Should fix texture flush on intel. AMD has been forced to single queue for an experiment.

* Fix some validation errors around extended dynamic state

* Remove Intel bug workaround, it was fixed on the latest driver

* Use circular queue for checking consumption on command buffers

Speeds up games that spam command buffers a little. Avoids checking multiple command buffers if multiple are active at once.

* Use SupportBufferUpdater, add single layer flush

* Fix counter queue leak when game decides to use host conditional rendering

* Force device local storage for textures (fixes linux performance)

* Port #3019

* Insert barriers around vkCmdBlitImage (may fix some amd flicker)

* Fix transform feedback on Intel, gl_Position feedback and clears to inexistent depth buffers

* Don't pause transform feedback for multi draw

* Fix draw outside of render pass and missing capability

* Workaround for wrong last attribute on AMD (affects FFVII, STRIKERS1945, probably more)

* Better workaround for AMD vertex buffer size alignment issue

* More instructions + fixes on SPIR-V backend

* Allow custom aspect ratio on Vulkan

* Correct GTK UI status bar positions

* SPIR-V: Functions must always end with a return

* SPIR-V: Fix ImageQuerySizeLod

* SPIR-V: Set DepthReplacing execution mode when FragDepth is modified

* SPIR-V: Implement LoopContinue IR instruction

* SPIR-V: Geometry shader support

* SPIR-V: Use correct binding number on storage buffers array

* Reduce allocations for Spir-v serialization

Passes BinaryWriter instead of the stream to Write and WriteOperand

- Removes creation of BinaryWriter for each instruction
- Removes allocations for literal string

* Some optimizations to Spv.Generator

- Dictionary for lookups of type declarations, constants, extinst
- LiteralInteger internal data format -> ushort
- Deterministic HashCode implementation to avoid spirv result not being the same between runs
- Inline operand list instead of List<T>, falls back to array if many operands. (large performance boost)

TODO: improve instruction allocation, structured program creator, ssa?

* Pool Spv.Generator resources, cache delegates, spv opts

- Pools for Instructions and LiteralIntegers. Can be passed in when creating the generator module.
  - NewInstruction is called instead of new Instruction()
  - Ryujinx SpirvGenerator passes in some pools that are static. The idea is for these to be shared between threads eventually.
- Estimate code size when creating the output MemoryStream
- LiteralInteger pools using ThreadStatic pools that are initialized before and after creation... not sure of a better way since the way these are created is via implicit cast.

Also, cache delegates for Spv.Generator for functions that are passed around to GenerateBinary etc, since passing the function raw creates a delegate on each call.

TODO: update python spv cs generator to make the coregrammar with NewInstruction and the `params` overloads.

* LocalDefMap for Ssa Rewriter

Rather than allocating a large array of all registers for each block in the shader, allocate one array of all registers and clear it between blocks. Reduces allocations in the shader translator.

* SPIR-V: Transform feedback support

* SPIR-V: Fragment shader interlock support (and image coherency)

* SPIR-V: Add early fragment tests support

* SPIR-V: Implement SwizzleAdd, add missing Triangles ExecutionMode for geometry shaders, remove SamplerType field from TextureMeta

* Don't pass depth clip state right now (fix decals)

Explicitly disabling it is incorrect. OpenGL currently automatically disables based on depth clamp, which is the behaviour if this state is omitted.

* Multisampling support

* Multisampling: Use resolve if src samples count > dst samples count

* Multisampling: We can only resolve for unscaled copies

* SPIR-V: Only add FSI exec mode if used.

* SPIR-V: Use ConstantComposite for Texture Offset Vector

Fixes a bunch of freezes with SPIR-V on AMD hardware, and validation errors. Note: Obviously assumes input offsets are constant, which they currently are.

* SPIR-V: Don't OpReturn if we already OpExit'ed

Fixes spir-v parse failure and stack smashing in RADV (obviously you still need bolist)

* SPIR-V: Only use input attribute type for input attributes

Output vertex attributes should always be of type float.

* Multithreaded Pipeline Compilation

* Address some feedback

* Make this 32

* Update topology with GpuAccessorState

* Cleanup for merge (note: disables spir-v)

* Make more robust to shader compilation failure

- Don't freeze when GLSL compilation fails
- Background SPIR-V pipeline compile failure results in skipped draws, similar to GLSL compilation failure.

* Fix Multisampling

* Only update fragment scale count if a vertex texture needs a scale.

Fixes a performance regression introduced by texture scaling in the vertex stage where support buffer updates would be very frequent, even at 1x, if any textures were used on the vertex stage.

This check doesn't exactly look cheap (a flag in the shader stage would probably be preferred), but it is much cheaper than uploading scales in both vulkan and opengl, so it will do for now.

* Use a bitmap to do granular tracking for buffer uploads.

This path is only taken if the much faster check of "is the buffer rented at all" is triggered, so it doesn't actually end up costing too much, and the time saved by not ending render passes (and on gpu for not waiting on barriers) is probably helpful.

Avoids ending render passes to update buffer data (not all the time)
- 140-180 to 35-45 in SMO metro kingdom (these updates are in the UI)
- Very variable 60-150(!) to 16-25 in mario kart 8 (these updates are in the UI)

As well as allowing more data to be preloaded persistently, this will also allow more data to be loaded in the preload buffer, which should be faster as it doesn't need to insert barriers between draws. (and on tbdr, does not need to flush and reload tile memory)

Improves performance in GPU limited scenarios. Should notably improve performance on TBDR gpus. Still a lot more to do here.

* Copy query results after RP ends, rather than ending to copy

We need to end the render pass to get the data (submit command buffer) anyways...

Reduces render passes created in games that use queries.

* Rework Query stuff a bit to avoid render pass end

Tries to reset returned queries in background when possible, rather than ending the render pass.

Still ends render pass when resetting a counter after draws, but maybe that can be solved too. (by just pulling an empty object off the pool?)

* Remove unnecessary lines

Was for testing

* Fix validation error for query reset

Need to think of a better way to do this.

* SPIR-V: Fix SwizzleAdd and some validation errors

* SPIR-V: Implement attribute indexing and StoreAttribute

* SPIR-V: Fix TextureSize for MS and Buffer sampler types

* Fix relaunch issues

* SPIR-V: Implement LogicalExclusiveOr

* SPIR-V: Constant buffer indexing support

* Ignore unsupported attributes rather than throwing (matches current GLSL behaviour)

* SPIR-V: Implement tessellation support

* SPIR-V: Geometry shader passthrough support

* SPIR-V: Implement StoreShader8/16 and StoreStorage8/16

* SPIR-V: Resolution scale support and fix TextureSample multisample with LOD bug

* SPIR-V: Fix field index for scale count

* SPIR-V: Fix another case of wrong field index

* SPIRV/GLSL: More scaling related fixes

* SPIR-V: Fix ImageLoad CompositeExtract component type

* SPIR-V: Workaround for Intel FrontFacing bug

* Enable SPIR-V backend by default

* Allow null samplers (samplers are not required when only using texelFetch to access the texture)

* Fix some validation errors related to texel block view usage flag and invalid image barrier base level

* Use explicit subgroup size if we can (might fix some block flickering on AMD)

* Take componentMask and scissor into account when clearing framebuffer attachments

* Add missing barriers around CmdFillBuffer (fixes Monster Hunter Rise flickering on NVIDIA)

* Use ClampToEdge for Clamp sampler address mode on Vulkan (fixes Hollow Knight)

Clamp is unsupported on Vulkan, but ClampToEdge behaves almost the same. ClampToBorder on the other hand (which was being used before) is pretty different

* Shader specialization for new Vulkan required state (fixes remaining alpha test issues, vertex stretching on AMD on Crash Bandicoot, etc)

* Check if the subgroup size is supported before passing a explicit size

* Only enable ShaderFloat64 if the GPU supports it

* We don't need to recompile shaders if alpha test state changed but alpha test is disabled

* Enable shader cache on Vulkan and implement MultiplyHighS32/U32 on SPIR-V (missed those before)

* Fix pipeline state saving before it is updated.

This should fix a few warnings and potential stutters due to bad pipeline states being saved in the cache. You may need to clear your guest cache.

* Allow null samplers on OpenGL backend

* _unit0Sampler should be set only for binding 0

* Remove unused PipelineConverter format variable (was causing IOR)

* Raise textures limit to 64 on Vulkan

* No need to pack the shader binaries if shader cache is disabled

* Fix backbuffer not being cleared and scissor not being re-enabled on OpenGL

* Do not clear unbound framebuffer color attachments

* Geometry shader passthrough emulation

* Consolidate UpdateDepthMode and GetDepthMode implementation

* Fix A1B5G5R5 texture format and support R4G4 on Vulkan

* Add barrier before use of some modified images

* Report 32 bit query result on AMD windows (smo issue)

* Add texture recompression support (disabled for now)

It recompresses ASTC textures into BC7, which might reduce VRAM usage significantly on games that uses ASTC textures

* Do not report R4G4 format as supported on Vulkan

It was causing mario head to become white on Super Mario 64 (???)

* Improvements to -1 to 1 depth mode.

- Transformation is only applied on the last stage in the vertex pipeline.
- Should fix some issues with geometry and tessellation (hopefully)
- Reading back FragCoord Z on fragment will transform back to -1 to 1.

* Geometry Shader index count from ThreadsPerInputPrimitive

Generally fixes SPIR-V emitting too many triangles, may change games in OpenGL

* Remove gl_FragDepth scaling

This is always 0-1; the other two issues were causing the problems. Fixes regression with Xenoblade.

* Add Gl StencilOp enum values to Vulkan

* Update guest cache to v1.1 (due to specialization state changes)

This will explode your shader cache from earlier vulkan build, but it must be done. 😔

* Vulkan/SPIR-V support for viewport inverse

* Fix typo

* Don't create query pools for unsupported query types

* Return of the Vector Indexing Bug

One day, everyone will get this right.

* Check for transform feedback query support

Sometimes transform feedback is supported without the query type.

* Fix gl_FragCoord.z transformation

FragCoord.z is always in 0-1, even when the real depth range is -1 to 1. Turns out the only bug was geo and tess stage outputs.

Fixes Pokemon Sword/Shield, possibly others.

* Fix Avalonia Rebase

Vulkan is currently not available on Avalonia, but the build does work and you can use opengl.

* Fix headless build

* Add support for BC6 and BC7 decompression, decompress all BC formats if they are not supported by the host

* Fix BCn 4/5 conversion, GetTextureTarget

BCn 4/5 could generate invalid data when a line's size in bytes was not divisible by 4, which both backends expect.

GetTextureTarget was not creating a view with the replacement format.

* Fix dependency

* Fix inverse viewport transform vector type on SPIR-V

* Do not require null descriptors support

* If MultiViewport is not supported, do not try to set more than one viewport/scissor

* Bounds check on bitmap add.

* Flush queries on attachment change rather than program change

Occlusion queries are usually used in a depth only pass so the attachments changing is a better indication of the query block ending.

Write mask changes are also considered since some games do depth only pass by setting 0 write mask on all the colour targets.

* Add support for avalonia (#6)

* add avalonia support

* only lock around skia flush

* addressed review

* cleanup

* add fallback size if avalonia attempts to render but the window size is 0. read desktop scale after enabling dpi check

* fix getting window handle on linux. skip render is size is 0

* Combine non-buffer with buffer image descriptor sets

* Support multisample texture copy with automatic resolve on Vulkan

* Remove old CompileShader methods from the Vulkan backend

* Add minimal pipeline layouts that only contains used bindings

They are used by helper shaders, the intention is avoiding needing to recompile the shaders (from GLSL to SPIR-V) if the bindings changes on the translated guest shaders

* Pre-compile helper shader as SPIR-V, and some fixes

* Remove pre-compiled shaderc binary for Windows as its no longer needed by default

* Workaround RADV crash

Enabling the descriptor indexing extension, even if it is not used, forces the radv driver to use "bolist".

* Use RobustBufferAccess on NVIDIA gpus

Avoids the SMO waterfall triangle on older NVIDIA gpus.

* Implement GPU selector and expose texture recompression on the UI and config

* Fix and enable background compute shader compilation

Also disables warnings from shader cache pipeline misses.

* Fix error due to missing subpass dependency when Attachment Write -> Shader Read barriers are added

* If S8D24 is not supported, use D32FS8

* Ensure all fences are destroyed on dispose

* Pre-allocate arrays up front on DescriptorSetUpdater, allows the removal of some checks

* Add missing clear layer parameter after rebase

* Use selected gpu from config for avalonia (#7)

* use configured device

* address review

* Fix D32S8 copy workaround (AMD)

Fixes water in Pokemon Legends Arceus on AMD GPUs. Possibly fixes other things.

* Use push descriptors for uniform buffer updates (disabled for now)

* Push descriptor support check, buffer redundancy checks

Should make push descriptors faster, needs more testing though.

* Increase light command buffer pool to 2 command buffers, throw rather than returning invalid cbs

* Adjust bindings array sizes

* Force submit command buffers if memory in use by its resources is high

* Add workaround for AMD GCN cubemap view sins

`ImageCreateCubeCompatibleBit` seems to generally break 2D array textures with mipmaps... even if they are eventually aliased as a cubemap with mipmaps. Forcing a copy here works around the issue.

This could be used in future if enabling this bit reduces performance on certain GPUs. (mobile class is generally a worry)

Currently also enabled on Linux as I don't know if they managed to dodge this bug (someone please tell me). Not enabled on Vega at the moment, but easy to add if the issue is there.

* Add mobile, non-RX variants to the GCN regex.

Also make sure that the 3 digit ones only include numbers starting with 7 or 8.

* Increase image limit per stage from 8 to 16

Xenoblade Chronicles 2 was hiting the limit of 8

* Minor code cleanup

* Fix NRE caused by SupportBufferUpdater calling pipeline ClearBuffer

* Add gpu selector to Avalonia (#8)

* Add gpu selector to avalonia settings

* show backend label on window

* some fixes

* address review

* Minor changes to the Avalonia UI

* Update graphics window UI and locales. (#9)

* Update xaml and update locales

* locale updates

Did my best here but likely needs to be checked by native speakers, especially the use of ampersands in greek, russian and turkish?

* Fix locales with more (?) correct translations.

* add separator to render widget

* fix spanish and portuguese

* Add new IdList, replaces buffer list that could not remove elements and had unbounded growth

* Don't crash the settings window if Vulkan is not supported

* Fix Actions menu not being clickable on GTK UI after relaunch

* Rename VulkanGraphicsDevice to VulkanRenderer and Renderer to OpenGLRenderer

* Fix IdList and make it not thread safe

* Revert useless OpenGL format table changes

* Fix headless project build

* List throws ArgumentOutOfRangeException

* SPIR-V: Fix tessellation

* Increase shader cache version due to tessellation fix

* Reduce number of Sync objects created (improves perf in some specific titles)

* Fix vulkan validation errors for NPOT compressed upload and GCN workaround.

* Add timestamp to the shader cache and force rebuild if host cache is outdated

* Prefer Mail box present mode for popups (#11)

* Prefer Mail box present mode

* fix debug

* switch present mode when vsync is toggled

* only disable vsync on the main window

* SPIR-V: Fix geometry shader input load with transform feedback

* BC7 Encoder: Prefer more precision on alpha rather than RGB when alpha is 0

* Fix Avalonia build

* Address initial PR feedback

* Only set transform feedback outputs on last vertex stage

* Address riperiperi PR feedback

* Remove outdated comment

* Remove unused constructor

* Only throw for negative results

* Throw for QueueSubmit and other errors

No point in delaying the inevitable

* Transform feedback decorations inside gl_PerVertex struct breaks the NVIDIA compiler

* Fix some resolution scale issues

* No need for two UpdateScale calls

* Fix comments on SPIR-V generator project

* Try to fix shader local memory size

On DOOM, a shader is using local memory, but both Low and High size are 0, CRS size is 1536, it seems to store on that region?

* Remove RectangleF that is now unused

* Fix ImageGather with multiple offsets

Needs ImageGatherExtended capability, and must use `ConstantComposite` instead of `CompositeConstruct`

* Address PR feedback from jD in all projects except Avalonia

* Address most of jD PR feedback on Avalonia

* Remove unsafe

* Fix VulkanSkiaGpu

* move present mode request out of Create Swapchain method

* split more parts of create swapchain

* addressed reviews

* addressed review

* Address second batch of jD PR feedback

* Fix buffer <-> image copy row length and height alignment

AlignUp helper does not support NPOT alignment, and ASTC textures can have NPOT block sizes

* Better fix for NPOT alignment issue

* Use switch expressions on Vulkan EnumConversion

Thanks jD

* Fix Avalonia build

* Add Vulkan selection prompt on startup

* Grammar fixes on Vulkan prompt message

* Add missing Vulkan migration flag

Co-authored-by: riperiperi <rhy3756547@hotmail.com>
Co-authored-by: Emmanuel Hansen <emmausssss@gmail.com>
Co-authored-by: MutantAura <44103205+MutantAura@users.noreply.github.com>
2022-07-31 18:26:06 -03:00
14ce9e1567 Move partial unmap handler to the native signal handler (#3437)
* Initial commit with a lot of testing stuff.

* Partial Unmap Cleanup Part 1

* Fix some minor issues, hopefully windows tests.

* Disable partial unmap tests on macos for now

Weird issue.

* Goodbye magic number

* Add COMPlus_EnableAlternateStackCheck for tests

`COMPlus_EnableAlternateStackCheck` is needed for NullReferenceException handling to work on linux after registering the signal handler, due to how dotnet registers its own signal handler.

* Address some feedback

* Force retry when memory is mapped in memory tracking

This case existed before, but returning `false` no longer retries, so it would crash immediately after unprotecting the memory... Now, we return `true` to deliberately retry.

This case existed before (was just broken by this change) and I don't really want to look into fixing the issue right now. Technically, this means that on guest code partial unmaps will retry _due to this_ rather than hitting the handler. I don't expect this to cause any issues.

This should fix random crashes in Xenoblade Chronicles 2.

* Use IsRangeMapped

* Suppress MockMemoryManager.UnmapEvent warning

This event is not signalled by the mock memory manager.

* Remove 4kb mapping
2022-07-29 19:16:29 -03:00
952d013c67 Avalonia changes (#3497)
Co-authored-by: RNA <wQSZSQS2UQf5zun>
2022-07-29 01:14:37 +00:00
46c8129bf5 Avalonia: Another Cleanup (#3494)
* Avalonia: Another Cleanup

This PR is a cleanup to the avalonia code recently added:

- Some XAML file are autoformatted like a previous PR.
- Dlc is renamed to DownloadableContent (Locale exclude).
- DownloadableContentManagerWindow is a bit improved (Fixes #3491).
- Some nits here and there.

* Fix GTK

* Remove AttachDebugDevTools

* Fix last warning

* Fix JSON fields
2022-07-29 00:41:34 +02:00
8cfec5de4b Avalonia: Cleanup UserEditor a bit (#3492)
This PR cleanup the UserEditor code a bit, 2 texts are added for "Name" and "User Id", because when you create a new profile, the textbox is empty without any hints. `axaml` files are autoformated too.
2022-07-28 14:16:23 -03:00
37b6e081da Fix DMA linear texture copy fast path (#3496)
* Fix DMA linear texture copy fast path

* Formatting
2022-07-28 13:46:12 -03:00
3c3bcd82fe Add a sampler pool cache and improve texture pool cache (#3487)
* Add a sampler pool cache and improve texture pool cache

* Increase disposal timestamp delta more to be on the safe side

* Nits

* Use abstract class for PoolCache, remove factory callback
2022-07-27 21:07:48 -03:00
a00c59a46c update settings and main window tooltips (#3488) 2022-07-25 23:02:17 +02:00
1825bd87b4 misc: Reformat Ryujinx.Audio with dotnet-format (#3485)
This is the first commit of a series of reformat around the codebase as
discussed internally some weeks ago.

This project being one that isn't touched that much, it shouldn't cause
conflict with any opened PRs.
2022-07-25 15:46:33 -03:00
62f8ceb60b Resolution scaling hotkeys (#3185)
* hotkeys

* comments

* update implementation to include custom scales

* copypasta

* review changes

* hotkeys

* comments

* update implementation to include custom scales

* copypasta

* review changes

* Remove outdated configuration and force hotkeys unbound

* Add avalonia support

* Fix configuration file

* Update GTK implementation and fix config... again.

* Remove legacy implementation + nits

* Avalonia locales (DeepL)

* review

* Remove colon from chinese locale

* Update ConfigFile

* locale fix
2022-07-24 15:44:47 -03:00
1a888ae087 Add support for conditional (with CC) shader Exit instructions (#3470)
* Add support for conditional (with CC) shader Exit instructions

* Shader cache version bump

* Make CSM conditions default to false for EXIT.CC
2022-07-24 15:33:30 -03:00
84d0ca5645 feat: add traditional chinese translate (Avalonia) (#3474)
* feat: add traditional chinese translate

* update translate
2022-07-24 15:18:21 -03:00
31b8d413d5 Change MenuHeaders to embedded textblocks (#3469) 2022-07-24 14:50:06 -03:00
6e02cac952 Avalonia - Use content dialog for user profile manager (#3455)
* remove content dialog placeholder from all windows

* remove redundant window argument

* redesign user profile window

* wip

* use avalonia auto name generator

* add edit and new user options

* move profile image selection to content dialog

* remove usings

* fix updater

* address review

* adjust avatar dialog size

* add validation for user editor

* fix typo

* Shorten some labels
2022-07-24 14:38:38 -03:00
3a3380fa25 fix: Ensure to load latest version of ffmpeg libraries first (#3473)
Fix a possible crash related to older version of ffmpeg being loaded
instewad of the one shipped with the emulator.
2022-07-24 11:39:56 +02:00
2d252db0a7 GTK & Avalonia changes (#3480) 2022-07-23 12:05:51 -03:00
7f8a3541eb Fix decoding of block after shader BRA.CC instructions without predicate (#3472)
* Fix decoding of block after BRA.CC instructions without predicate

* Shader cache version bump
2022-07-23 11:53:14 -03:00
b34de74f81 Avoid adding shader buffer descriptors for constant buffers that are not used (#3478)
* Avoid adding shader buffer descriptors for constant buffers that are not used

* Shader cache version
2022-07-23 11:15:58 -03:00
5811d121df Avoid scaling 2d textures that could be used as 3d (#3464) 2022-07-15 09:24:13 -03:00
6eb85e846f Reduce some unnecessary allocations in DMA handler (#2886)
* experimental changes to try and reduce allocations in kernel threading and DMA handler

* Simplify the changes in this branch to just 1. Don't make unnecessary copies of data just for texture-texture transfers and 2. Add a fast path for 1bpp linear byte copies

* forgot to check src + dst linearity in 1bpp DMA fast path. Fixes the UE4 regression.

* removing dev log I left in

* Generalizing the DMA linear fast path to cases other than 1bpp copies

* revert kernel changes

* revert whitespace

* remove unneeded references

* PR feedback

Co-authored-by: Logan Stromberg <lostromb@microsoft.com>
Co-authored-by: gdk <gab.dark.100@gmail.com>
2022-07-14 15:45:56 -03:00
c5bddfeab8 Remove dependency for FFmpeg.AutoGen and Update FFmpeg to 5.0.1 for Windows (#3466)
* Remove dependency for FFMpeg.AutoGen

Also prepare for FFMpeg 5.0 and 5.1

* Update Ryujinx.Graphics.Nvdec.Dependencies to 5.0.1-build10

* Address gdkchan's comments

* Address Ack's comment

* Address gdkchan's comment
2022-07-14 15:13:23 +02:00
70ec5def9c BSD: Allow use of DontWait flag in Receive (#3462) 2022-07-14 11:47:25 +02:00
869 changed files with 52668 additions and 9975 deletions

View File

@ -1,6 +1,6 @@
---
name: Bug Report
about: Something doesn't work correctly in Ryujinx.
about: Something doesn't work correctly in Ryujinx. Note that game-specific issues should be instead posted on the Game Compatibility List at https://github.com/Ryujinx/Ryujinx-Games-List, unless it is a provable regression.
#assignees:
---

View File

@ -58,7 +58,6 @@ namespace ARMeilleure.CodeGen.Linking
/// <param name="a">First instance</param>
/// <param name="b">Second instance</param>
/// <returns><see langword="true"/> if not equal; otherwise <see langword="false"/></returns>
/// <inheritdoc/>
public static bool operator !=(Symbol a, Symbol b)
{
return !(a == b);

View File

@ -1,6 +1,4 @@
using ARMeilleure.Common;
using ARMeilleure.IntermediateRepresentation;
using System;
namespace ARMeilleure.CodeGen.RegisterAllocators
{

View File

@ -4,6 +4,11 @@ namespace ARMeilleure.CodeGen.X86
{
partial class Assembler
{
public static bool SupportsVexPrefix(X86Instruction inst)
{
return _instTable[(int)inst].Flags.HasFlag(InstructionFlags.Vex);
}
private const int BadOp = 0;
[Flags]
@ -152,6 +157,7 @@ namespace ARMeilleure.CodeGen.X86
Add(X86Instruction.Paddd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000ffe, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Paddq, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fd4, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Paddw, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000ffd, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Palignr, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f3a0f, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Pand, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fdb, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Pandn, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fdf, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Pavgb, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fe0, InstructionFlags.Vex | InstructionFlags.Prefix66));
@ -234,6 +240,9 @@ namespace ARMeilleure.CodeGen.X86
Add(X86Instruction.Rsqrtss, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000f52, InstructionFlags.Vex | InstructionFlags.PrefixF3));
Add(X86Instruction.Sar, new InstructionInfo(0x070000d3, 0x070000c1, BadOp, BadOp, BadOp, InstructionFlags.None));
Add(X86Instruction.Setcc, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000f90, InstructionFlags.Reg8Dest));
Add(X86Instruction.Sha256Msg1, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38cc, InstructionFlags.None));
Add(X86Instruction.Sha256Msg2, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38cd, InstructionFlags.None));
Add(X86Instruction.Sha256Rnds2, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38cb, InstructionFlags.None));
Add(X86Instruction.Shl, new InstructionInfo(0x040000d3, 0x040000c1, BadOp, BadOp, BadOp, InstructionFlags.None));
Add(X86Instruction.Shr, new InstructionInfo(0x050000d3, 0x050000c1, BadOp, BadOp, BadOp, InstructionFlags.None));
Add(X86Instruction.Shufpd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fc6, InstructionFlags.Vex | InstructionFlags.Prefix66));

View File

@ -1,5 +1,4 @@
using System;
using System.Runtime.InteropServices;
namespace ARMeilleure.CodeGen.X86
{

View File

@ -12,21 +12,28 @@ namespace ARMeilleure.CodeGen.X86
return;
}
(_, _, int ecx, int edx) = X86Base.CpuId(0x00000001, 0x00000000);
(int maxNum, _, _, _) = X86Base.CpuId(0x00000000, 0x00000000);
FeatureInfoEdx = (FeatureFlagsEdx)edx;
FeatureInfoEcx = (FeatureFlagsEcx)ecx;
(_, _, int ecx1, int edx1) = X86Base.CpuId(0x00000001, 0x00000000);
FeatureInfo1Edx = (FeatureFlags1Edx)edx1;
FeatureInfo1Ecx = (FeatureFlags1Ecx)ecx1;
if (maxNum >= 7)
{
(_, int ebx7, _, _) = X86Base.CpuId(0x00000007, 0x00000000);
FeatureInfo7Ebx = (FeatureFlags7Ebx)ebx7;
}
}
[Flags]
public enum FeatureFlagsEdx
public enum FeatureFlags1Edx
{
Sse = 1 << 25,
Sse2 = 1 << 26
}
[Flags]
public enum FeatureFlagsEcx
public enum FeatureFlags1Ecx
{
Sse3 = 1 << 0,
Pclmulqdq = 1 << 1,
@ -40,21 +47,31 @@ namespace ARMeilleure.CodeGen.X86
F16c = 1 << 29
}
public static FeatureFlagsEdx FeatureInfoEdx { get; }
public static FeatureFlagsEcx FeatureInfoEcx { get; }
[Flags]
public enum FeatureFlags7Ebx
{
Avx2 = 1 << 5,
Sha = 1 << 29
}
public static bool SupportsSse => FeatureInfoEdx.HasFlag(FeatureFlagsEdx.Sse);
public static bool SupportsSse2 => FeatureInfoEdx.HasFlag(FeatureFlagsEdx.Sse2);
public static bool SupportsSse3 => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Sse3);
public static bool SupportsPclmulqdq => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Pclmulqdq);
public static bool SupportsSsse3 => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Ssse3);
public static bool SupportsFma => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Fma);
public static bool SupportsSse41 => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Sse41);
public static bool SupportsSse42 => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Sse42);
public static bool SupportsPopcnt => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Popcnt);
public static bool SupportsAesni => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Aes);
public static bool SupportsAvx => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Avx);
public static bool SupportsF16c => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.F16c);
public static FeatureFlags1Edx FeatureInfo1Edx { get; }
public static FeatureFlags1Ecx FeatureInfo1Ecx { get; }
public static FeatureFlags7Ebx FeatureInfo7Ebx { get; } = 0;
public static bool SupportsSse => FeatureInfo1Edx.HasFlag(FeatureFlags1Edx.Sse);
public static bool SupportsSse2 => FeatureInfo1Edx.HasFlag(FeatureFlags1Edx.Sse2);
public static bool SupportsSse3 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Sse3);
public static bool SupportsPclmulqdq => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Pclmulqdq);
public static bool SupportsSsse3 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Ssse3);
public static bool SupportsFma => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Fma);
public static bool SupportsSse41 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Sse41);
public static bool SupportsSse42 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Sse42);
public static bool SupportsPopcnt => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Popcnt);
public static bool SupportsAesni => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Aes);
public static bool SupportsAvx => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Avx);
public static bool SupportsAvx2 => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx2) && SupportsAvx;
public static bool SupportsF16c => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.F16c);
public static bool SupportsSha => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Sha);
public static bool ForceLegacySse { get; set; }

View File

@ -82,6 +82,7 @@ namespace ARMeilleure.CodeGen.X86
Add(Intrinsic.X86Paddd, new IntrinsicInfo(X86Instruction.Paddd, IntrinsicType.Binary));
Add(Intrinsic.X86Paddq, new IntrinsicInfo(X86Instruction.Paddq, IntrinsicType.Binary));
Add(Intrinsic.X86Paddw, new IntrinsicInfo(X86Instruction.Paddw, IntrinsicType.Binary));
Add(Intrinsic.X86Palignr, new IntrinsicInfo(X86Instruction.Palignr, IntrinsicType.TernaryImm));
Add(Intrinsic.X86Pand, new IntrinsicInfo(X86Instruction.Pand, IntrinsicType.Binary));
Add(Intrinsic.X86Pandn, new IntrinsicInfo(X86Instruction.Pandn, IntrinsicType.Binary));
Add(Intrinsic.X86Pavgb, new IntrinsicInfo(X86Instruction.Pavgb, IntrinsicType.Binary));
@ -151,6 +152,9 @@ namespace ARMeilleure.CodeGen.X86
Add(Intrinsic.X86Roundss, new IntrinsicInfo(X86Instruction.Roundss, IntrinsicType.BinaryImm));
Add(Intrinsic.X86Rsqrtps, new IntrinsicInfo(X86Instruction.Rsqrtps, IntrinsicType.Unary));
Add(Intrinsic.X86Rsqrtss, new IntrinsicInfo(X86Instruction.Rsqrtss, IntrinsicType.Unary));
Add(Intrinsic.X86Sha256Msg1, new IntrinsicInfo(X86Instruction.Sha256Msg1, IntrinsicType.Binary));
Add(Intrinsic.X86Sha256Msg2, new IntrinsicInfo(X86Instruction.Sha256Msg2, IntrinsicType.Binary));
Add(Intrinsic.X86Sha256Rnds2, new IntrinsicInfo(X86Instruction.Sha256Rnds2, IntrinsicType.Ternary));
Add(Intrinsic.X86Shufpd, new IntrinsicInfo(X86Instruction.Shufpd, IntrinsicType.TernaryImm));
Add(Intrinsic.X86Shufps, new IntrinsicInfo(X86Instruction.Shufps, IntrinsicType.TernaryImm));
Add(Intrinsic.X86Sqrtpd, new IntrinsicInfo(X86Instruction.Sqrtpd, IntrinsicType.Unary));

View File

@ -308,11 +308,13 @@ namespace ARMeilleure.CodeGen.X86
case Instruction.Extended:
{
bool isBlend = node.Intrinsic == Intrinsic.X86Blendvpd ||
node.Intrinsic == Intrinsic.X86Blendvps ||
node.Intrinsic == Intrinsic.X86Pblendvb;
// BLENDVPD, BLENDVPS, PBLENDVB last operand is always implied to be XMM0 when VEX is not supported.
if ((node.Intrinsic == Intrinsic.X86Blendvpd ||
node.Intrinsic == Intrinsic.X86Blendvps ||
node.Intrinsic == Intrinsic.X86Pblendvb) &&
!HardwareCapabilities.SupportsVexEncoding)
// SHA256RNDS2 always has an implied XMM0 as a last operand.
if ((isBlend && !HardwareCapabilities.SupportsVexEncoding) || node.Intrinsic == Intrinsic.X86Sha256Rnds2)
{
Operand xmm0 = Xmm(X86Register.Xmm0, OperandType.V128);
@ -1297,11 +1299,15 @@ namespace ARMeilleure.CodeGen.X86
{
if (IsIntrinsic(operation.Instruction))
{
IntrinsicInfo info = IntrinsicTable.GetInfo(operation.Intrinsic);
bool hasVex = HardwareCapabilities.SupportsVexEncoding && Assembler.SupportsVexPrefix(info.Inst);
bool isUnary = operation.SourcesCount < 2;
bool hasVecDest = operation.Destination != default && operation.Destination.Type == OperandType.V128;
return !HardwareCapabilities.SupportsVexEncoding && !isUnary && hasVecDest;
return !hasVex && !isUnary && hasVecDest;
}
return false;

View File

@ -98,6 +98,7 @@ namespace ARMeilleure.CodeGen.X86
Paddd,
Paddq,
Paddw,
Palignr,
Pand,
Pandn,
Pavgb,
@ -180,6 +181,9 @@ namespace ARMeilleure.CodeGen.X86
Rsqrtss,
Sar,
Setcc,
Sha256Msg1,
Sha256Msg2,
Sha256Rnds2,
Shl,
Shr,
Shufpd,

View File

@ -1,4 +1,4 @@
using ARMeilleure.Diagnostics.EventSources;
using ARMeilleure.Diagnostics;
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
@ -206,7 +206,7 @@ namespace ARMeilleure.Common
/// <typeparam name="T">Type of elements</typeparam>
/// <param name="length">Number of elements</param>
/// <param name="fill">Fill value</param>
/// <param name="leaf"><see langword="true"/> if leaf; otherwise <see langword=""="false"/></param>
/// <param name="leaf"><see langword="true"/> if leaf; otherwise <see langword="false"/></param>
/// <returns>Allocated block</returns>
private IntPtr Allocate<T>(int length, T fill, bool leaf) where T : unmanaged
{
@ -218,7 +218,7 @@ namespace ARMeilleure.Common
_pages.Add(page);
AddressTableEventSource.Log.Allocated(size, leaf);
TranslatorEventSource.Log.AddressTableAllocated(size, leaf);
return page;
}

View File

@ -1,7 +1,6 @@
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
namespace ARMeilleure.Common
{

View File

@ -9,6 +9,9 @@ namespace ARMeilleure.Common
class Counter<T> : IDisposable where T : unmanaged
{
private bool _disposed;
/// <summary>
/// Index in the <see cref="EntryTable{T}"/>
/// </summary>
private readonly int _index;
private readonly EntryTable<T> _countTable;
@ -17,7 +20,6 @@ namespace ARMeilleure.Common
/// <see cref="EntryTable{T}"/> instance and index.
/// </summary>
/// <param name="countTable"><see cref="EntryTable{T}"/> instance</param>
/// <param name="index">Index in the <see cref="EntryTable{T}"/></param>
/// <exception cref="ArgumentNullException"><paramref name="countTable"/> is <see langword="null"/></exception>
/// <exception cref="ArgumentException"><typeparamref name="T"/> is unsupported</exception>
public Counter(EntryTable<T> countTable)
@ -68,7 +70,7 @@ namespace ARMeilleure.Common
/// <summary>
/// Releases all unmanaged and optionally managed resources used by the <see cref="Counter{T}"/> instance.
/// </summary>
/// <param name="disposing"><see langword="true"/> to dispose managed resources also; otherwise just unmanaged resouces</param>
/// <param name="disposing"><see langword="true"/> to dispose managed resources also; otherwise just unmanaged resources</param>
protected virtual void Dispose(bool disposing)
{
if (!_disposed)

View File

@ -251,6 +251,13 @@ namespace ARMeilleure.Decoders
return false;
}
// Compare and branch instructions are always conditional.
if (opCode.Instruction.Name == InstName.Cbz ||
opCode.Instruction.Name == InstName.Cbnz)
{
return false;
}
// Note: On ARM32, most instructions have conditional execution,
// so there's no "Always" (unconditional) branch like on ARM64.
// We need to check if the condition is "Always" instead.

View File

@ -151,7 +151,7 @@ namespace ARMeilleure.Decoders
public static bool VectorArgumentsInvalid(bool q, params int[] args)
{
if (q)
if (q)
{
for (int i = 0; i < args.Length; i++)
{

View File

@ -7,5 +7,8 @@
int Msb { get; }
int Lsb { get; }
int SourceMask => (int)(0xFFFFFFFF >> (31 - Msb));
int DestMask => SourceMask & (int)(0xFFFFFFFF << Lsb);
}
}

View File

@ -0,0 +1,7 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32AluImm16 : IOpCode32Alu
{
int Immediate { get; }
}
}

View File

@ -0,0 +1,11 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32AluMla : IOpCode32AluReg
{
int Ra { get; }
bool NHigh { get; }
bool MHigh { get; }
bool R { get; }
}
}

View File

@ -0,0 +1,13 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32AluUmull : IOpCode32, IOpCode32HasSetFlags
{
int RdLo { get; }
int RdHi { get; }
int Rn { get; }
int Rm { get; }
bool NHigh { get; }
bool MHigh { get; }
}
}

View File

@ -3,6 +3,7 @@ namespace ARMeilleure.Decoders
interface IOpCode32Mem : IOpCode32
{
int Rt { get; }
int Rt2 => Rt | 1;
int Rn { get; }
bool WBack { get; }

View File

@ -0,0 +1,8 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32MemRsImm : IOpCode32Mem
{
int Rm { get; }
ShiftType ShiftType { get; }
}
}

View File

@ -13,16 +13,13 @@ namespace ARMeilleure.Decoders
Cond = (Condition)((uint)opCode >> 28);
}
public bool IsThumb()
{
return this is OpCodeT16 || this is OpCodeT32;
}
public bool IsThumb { get; protected init; } = false;
public uint GetPc()
{
// Due to backwards compatibility and legacy behavior of ARMv4 CPUs pipeline,
// the PC actually points 2 instructions ahead.
if (IsThumb())
if (IsThumb)
{
// PC is ahead by 4 in thumb mode whether or not the current instruction
// is 16 or 32 bit.

View File

@ -6,12 +6,8 @@
public int Rn { get; }
public int Msb { get; }
public int Lsb { get; }
public int SourceMask => (int)(0xFFFFFFFF >> (31 - Msb));
public int DestMask => SourceMask & (int)(0xFFFFFFFF << Lsb);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32AluBf(inst, address, opCode);
public OpCode32AluBf(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32AluImm16 : OpCode32Alu
class OpCode32AluImm16 : OpCode32Alu, IOpCode32AluImm16
{
public int Immediate { get; }

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32AluMla : OpCode32, IOpCode32AluReg
class OpCode32AluMla : OpCode32, IOpCode32AluMla
{
public int Rn { get; }
public int Rm { get; }

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32AluUmull : OpCode32, IOpCode32HasSetFlags
class OpCode32AluUmull : OpCode32, IOpCode32AluUmull
{
public int RdLo { get; }
public int RdHi { get; }
@ -11,7 +11,6 @@
public bool MHigh { get; }
public bool? SetFlags { get; }
public DataOp DataOp { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32AluUmull(inst, address, opCode);
@ -26,7 +25,6 @@
MHigh = ((opCode >> 6) & 0x1) == 1;
SetFlags = ((opCode >> 20) & 0x1) != 0;
DataOp = DataOp.Arithmetic;
}
}
}

View File

@ -1,3 +1,5 @@
using System.Numerics;
namespace ARMeilleure.Decoders
{
class OpCode32MemMult : OpCode32, IOpCode32MemMult
@ -23,14 +25,7 @@ namespace ARMeilleure.Decoders
RegisterMask = opCode & 0xffff;
int regsSize = 0;
for (int index = 0; index < 16; index++)
{
regsSize += (RegisterMask >> index) & 1;
}
regsSize *= 4;
int regsSize = BitOperations.PopCount((uint)RegisterMask) * 4;
if (!u)
{

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32MemRsImm : OpCode32Mem
class OpCode32MemRsImm : OpCode32Mem, IOpCode32MemRsImm
{
public int Rm { get; }
public ShiftType ShiftType { get; }

View File

@ -0,0 +1,16 @@
namespace ARMeilleure.Decoders
{
class OpCode32Mrs : OpCode32
{
public bool R { get; }
public int Rd { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32Mrs(inst, address, opCode);
public OpCode32Mrs(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
R = ((opCode >> 22) & 1) != 0;
Rd = (opCode >> 12) & 0xf;
}
}
}

View File

@ -2,9 +2,10 @@
{
class OpCode32SimdCvtFI : OpCode32SimdS
{
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdCvtFI(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdCvtFI(inst, address, opCode, false);
public new static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdCvtFI(inst, address, opCode, true);
public OpCode32SimdCvtFI(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdCvtFI(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode, isThumb)
{
Opc = (opCode >> 7) & 0x1;

View File

@ -7,10 +7,13 @@
public int Rt { get; }
public bool Q { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdDupGP(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdDupGP(inst, address, opCode, false);
public static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdDupGP(inst, address, opCode, true);
public OpCode32SimdDupGP(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdDupGP(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode)
{
IsThumb = isThumb;
Size = 2 - (((opCode >> 21) & 0x2) | ((opCode >> 5) & 0x1)); // B:E - 0 for 32, 16 then 8.
if (Size == -1)
{

View File

@ -7,10 +7,13 @@
public int Size { get; }
public int Elems { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdImm44(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdImm44(inst, address, opCode, false);
public static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdImm44(inst, address, opCode, true);
public OpCode32SimdImm44(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdImm44(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode)
{
IsThumb = isThumb;
Size = (opCode >> 8) & 0x3;
bool single = Size != 3;

View File

@ -8,10 +8,13 @@
public bool Add { get; }
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMemImm(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMemImm(inst, address, opCode, false);
public static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMemImm(inst, address, opCode, true);
public OpCode32SimdMemImm(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdMemImm(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode)
{
IsThumb = isThumb;
Immediate = opCode & 0xff;
Rn = (opCode >> 16) & 0xf;

View File

@ -12,10 +12,13 @@
public bool DoubleWidth { get; }
public bool Add { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMemMult(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMemMult(inst, address, opCode, false);
public static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMemMult(inst, address, opCode, true);
public OpCode32SimdMemMult(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdMemMult(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode)
{
IsThumb = isThumb;
Rn = (opCode >> 16) & 0xf;
bool isLoad = (opCode & (1 << 20)) != 0;

View File

@ -11,10 +11,13 @@
public int Opc1 { get; }
public int Opc2 { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovGp(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovGp(inst, address, opCode, false);
public static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovGp(inst, address, opCode, true);
public OpCode32SimdMovGp(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdMovGp(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode)
{
IsThumb = isThumb;
// Which one is used is instruction dependant.
Op = (opCode >> 20) & 0x1;

View File

@ -9,10 +9,13 @@
public int Rt2 { get; }
public int Op { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovGpDouble(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovGpDouble(inst, address, opCode, false);
public static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovGpDouble(inst, address, opCode, true);
public OpCode32SimdMovGpDouble(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdMovGpDouble(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode)
{
IsThumb = isThumb;
// Which one is used is instruction dependant.
Op = (opCode >> 20) & 0x1;

View File

@ -11,10 +11,13 @@
public int Index { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovGpElem(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovGpElem(inst, address, opCode, false);
public static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovGpElem(inst, address, opCode, true);
public OpCode32SimdMovGpElem(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdMovGpElem(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode)
{
IsThumb = isThumb;
Op = (opCode >> 20) & 0x1;
U = ((opCode >> 23) & 1) != 0;

View File

@ -0,0 +1,12 @@
namespace ARMeilleure.Decoders
{
class OpCode32SimdMovn : OpCode32Simd
{
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdMovn(inst, address, opCode);
public OpCode32SimdMovn(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Size = (opCode >> 18) & 0x3;
}
}
}

View File

@ -4,9 +4,10 @@
{
public int Vn { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdRegS(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdRegS(inst, address, opCode, false);
public new static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdRegS(inst, address, opCode, true);
public OpCode32SimdRegS(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdRegS(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode, isThumb)
{
bool single = Size != 3;
if (single)

View File

@ -8,10 +8,13 @@
public int Opc2 { get; } // opc2 or RM (opc2<1:0>) [Vcvt, Vrint].
public int Size { get; protected set; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdS(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdS(inst, address, opCode, false);
public static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdS(inst, address, opCode, true);
public OpCode32SimdS(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdS(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode)
{
IsThumb = isThumb;
Opc = (opCode >> 15) & 0x3;
Opc2 = (opCode >> 16) & 0x7;

View File

@ -4,9 +4,10 @@
{
public OpCode32SimdSelMode Cc { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdSel(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdSel(inst, address, opCode, false);
public new static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdSel(inst, address, opCode, true);
public OpCode32SimdSel(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdSel(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode, isThumb)
{
Cc = (OpCode32SimdSelMode)((opCode >> 20) & 3);
}

View File

@ -5,10 +5,13 @@
public int Rt { get; }
public int Sreg { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdSpecial(inst, address, opCode);
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdSpecial(inst, address, opCode, false);
public static OpCode CreateT32(InstDescriptor inst, ulong address, int opCode) => new OpCode32SimdSpecial(inst, address, opCode, true);
public OpCode32SimdSpecial(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
public OpCode32SimdSpecial(InstDescriptor inst, ulong address, int opCode, bool isThumb) : base(inst, address, opCode)
{
IsThumb = isThumb;
Rt = (opCode >> 12) & 0xf;
Sreg = (opCode >> 16) & 0xf;
}

View File

@ -8,6 +8,7 @@ namespace ARMeilleure.Decoders
{
Cond = Condition.Al;
IsThumb = true;
OpCodeSizeInBytes = 2;
}
}

View File

@ -4,14 +4,13 @@ namespace ARMeilleure.Decoders
{
public int Rd { get; }
public bool Add => true;
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16Adr(inst, address, opCode);
public OpCodeT16Adr(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 8) & 7;
Rd = (opCode >> 8) & 7;
int imm = (opCode & 0xff) << 2;
Immediate = (int)(GetPc() & 0xfffffffc) + imm;

View File

@ -1,10 +1,10 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16BImmCmp : OpCodeT16
class OpCodeT16BImmCmp : OpCodeT16, IOpCode32BImm
{
public int Rn { get; }
public int Immediate { get; }
public long Immediate { get; }
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16BImmCmp(inst, address, opCode);

View File

@ -1,5 +1,4 @@
using System.Collections.Generic;
using System.Reflection.Emit;
namespace ARMeilleure.Decoders
{

View File

@ -8,6 +8,7 @@
{
Cond = Condition.Al;
IsThumb = true;
OpCodeSizeInBytes = 4;
}
}

View File

@ -0,0 +1,22 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32AluBf : OpCodeT32, IOpCode32AluBf
{
public int Rd { get; }
public int Rn { get; }
public int Msb { get; }
public int Lsb { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluBf(inst, address, opCode);
public OpCodeT32AluBf(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 8) & 0xf;
Rn = (opCode >> 16) & 0xf;
Msb = (opCode >> 0) & 0x1f;
Lsb = ((opCode >> 6) & 0x3) | ((opCode >> 10) & 0x1c);
}
}
}

View File

@ -0,0 +1,16 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32AluImm12 : OpCodeT32Alu, IOpCode32AluImm
{
public int Immediate { get; }
public bool IsRotated => false;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluImm12(inst, address, opCode);
public OpCodeT32AluImm12(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Immediate = (opCode & 0xff) | ((opCode >> 4) & 0x700) | ((opCode >> 15) & 0x800);
}
}
}

View File

@ -0,0 +1,29 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32AluMla : OpCodeT32, IOpCode32AluMla
{
public int Rn { get; }
public int Rm { get; }
public int Ra { get; }
public int Rd { get; }
public bool NHigh { get; }
public bool MHigh { get; }
public bool R { get; }
public bool? SetFlags => false;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluMla(inst, address, opCode);
public OpCodeT32AluMla(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rm = (opCode >> 0) & 0xf;
Rd = (opCode >> 8) & 0xf;
Ra = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
R = (opCode & (1 << 4)) != 0;
MHigh = ((opCode >> 4) & 0x1) == 1;
NHigh = ((opCode >> 5) & 0x1) == 1;
}
}
}

View File

@ -0,0 +1,14 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32AluReg : OpCodeT32Alu, IOpCode32AluReg
{
public int Rm { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluReg(inst, address, opCode);
public OpCodeT32AluReg(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rm = (opCode >> 0) & 0xf;
}
}
}

View File

@ -0,0 +1,28 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32AluUmull : OpCodeT32, IOpCode32AluUmull
{
public int RdLo { get; }
public int RdHi { get; }
public int Rn { get; }
public int Rm { get; }
public bool NHigh { get; }
public bool MHigh { get; }
public bool? SetFlags => false;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluUmull(inst, address, opCode);
public OpCodeT32AluUmull(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rm = (opCode >> 0) & 0xf;
RdHi = (opCode >> 8) & 0xf;
RdLo = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
MHigh = ((opCode >> 4) & 0x1) == 1;
NHigh = ((opCode >> 5) & 0x1) == 1;
}
}
}

View File

@ -0,0 +1,18 @@
using ARMeilleure.State;
namespace ARMeilleure.Decoders
{
class OpCodeT32AluUx : OpCodeT32AluReg, IOpCode32AluUx
{
public int Rotate { get; }
public int RotateBits => Rotate * 8;
public bool Add => Rn != RegisterAlias.Aarch32Pc;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluUx(inst, address, opCode);
public OpCodeT32AluUx(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rotate = (opCode >> 4) & 0x3;
}
}
}

View File

@ -1,6 +1,4 @@
using ARMeilleure.Instructions;
namespace ARMeilleure.Decoders
namespace ARMeilleure.Decoders
{
class OpCodeT32BImm20 : OpCodeT32, IOpCode32BImm
{

View File

@ -27,7 +27,7 @@ namespace ARMeilleure.Decoders
int i2 = j2 ^ s ^ 1;
int imm32 = imm11 | (imm10 << 11) | (i2 << 21) | (i1 << 22) | (s << 23);
imm32 = (imm32 << 9) >> 8;
imm32 = (imm32 << 8) >> 7;
Immediate = pc + imm32;
}

View File

@ -0,0 +1,31 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemImm8D : OpCodeT32, IOpCode32Mem
{
public int Rt { get; }
public int Rt2 { get; }
public int Rn { get; }
public bool WBack { get; }
public bool IsLoad { get; }
public bool Index { get; }
public bool Add { get; }
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemImm8D(inst, address, opCode);
public OpCodeT32MemImm8D(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt2 = (opCode >> 8) & 0xf;
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
Index = ((opCode >> 24) & 1) != 0;
Add = ((opCode >> 23) & 1) != 0;
WBack = ((opCode >> 21) & 1) != 0;
Immediate = (opCode & 0xff) << 2;
IsLoad = ((opCode >> 20) & 1) != 0;
}
}
}

View File

@ -0,0 +1,26 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemLdEx : OpCodeT32, IOpCode32MemEx
{
public int Rd => 0;
public int Rt { get; }
public int Rt2 { get; }
public int Rn { get; }
public bool WBack => false;
public bool IsLoad => true;
public bool Index => false;
public bool Add => false;
public int Immediate => 0;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemLdEx(inst, address, opCode);
public OpCodeT32MemLdEx(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt2 = (opCode >> 8) & 0xf;
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
}
}
}

View File

@ -0,0 +1,52 @@
using System.Numerics;
namespace ARMeilleure.Decoders
{
class OpCodeT32MemMult : OpCodeT32, IOpCode32MemMult
{
public int Rn { get; }
public int RegisterMask { get; }
public int Offset { get; }
public int PostOffset { get; }
public bool IsLoad { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemMult(inst, address, opCode);
public OpCodeT32MemMult(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rn = (opCode >> 16) & 0xf;
bool isLoad = (opCode & (1 << 20)) != 0;
bool w = (opCode & (1 << 21)) != 0;
bool u = (opCode & (1 << 23)) != 0;
bool p = (opCode & (1 << 24)) != 0;
RegisterMask = opCode & 0xffff;
int regsSize = BitOperations.PopCount((uint)RegisterMask) * 4;
if (!u)
{
Offset -= regsSize;
}
if (u == p)
{
Offset += 4;
}
if (w)
{
PostOffset = u ? regsSize : -regsSize;
}
else
{
PostOffset = 0;
}
IsLoad = isLoad;
}
}
}

View File

@ -0,0 +1,30 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemRsImm : OpCodeT32, IOpCode32MemRsImm
{
public int Rt { get; }
public int Rn { get; }
public int Rm { get; }
public ShiftType ShiftType => ShiftType.Lsl;
public bool WBack => false;
public bool IsLoad { get; }
public bool Index => true;
public bool Add => true;
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemRsImm(inst, address, opCode);
public OpCodeT32MemRsImm(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rm = (opCode >> 0) & 0xf;
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
IsLoad = (opCode & (1 << 20)) != 0;
Immediate = (opCode >> 4) & 3;
}
}
}

View File

@ -0,0 +1,27 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemStEx : OpCodeT32, IOpCode32MemEx
{
public int Rd { get; }
public int Rt { get; }
public int Rt2 { get; }
public int Rn { get; }
public bool WBack => false;
public bool IsLoad => false;
public bool Index => false;
public bool Add => false;
public int Immediate => 0;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemStEx(inst, address, opCode);
public OpCodeT32MemStEx(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 0) & 0xf;
Rt2 = (opCode >> 8) & 0xf;
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
}
}
}

View File

@ -0,0 +1,16 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MovImm16 : OpCodeT32Alu, IOpCode32AluImm16
{
public int Immediate { get; }
public bool IsRotated => false;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MovImm16(inst, address, opCode);
public OpCodeT32MovImm16(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Immediate = (opCode & 0xff) | ((opCode >> 4) & 0x700) | ((opCode >> 15) & 0x800) | ((opCode >> 4) & 0xf000);
}
}
}

View File

@ -0,0 +1,19 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32ShiftReg : OpCodeT32Alu, IOpCode32AluRsReg
{
public int Rm => Rn;
public int Rs { get; }
public ShiftType ShiftType { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32ShiftReg(inst, address, opCode);
public OpCodeT32ShiftReg(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rs = (opCode >> 0) & 0xf;
ShiftType = (ShiftType)((opCode >> 21) & 3);
}
}
}

View File

@ -0,0 +1,16 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32Tb : OpCodeT32, IOpCode32BReg
{
public int Rm { get; }
public int Rn { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32Tb(inst, address, opCode);
public OpCodeT32Tb(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rm = (opCode >> 0) & 0xf;
Rn = (opCode >> 16) & 0xf;
}
}
}

View File

@ -704,6 +704,7 @@ namespace ARMeilleure.Decoders
SetA32("<<<<00110100xxxxxxxxxxxxxxxxxxxx", InstName.Movt, InstEmit32.Movt, OpCode32AluImm16.Create);
SetA32("<<<<1110xxx1xxxxxxxx111xxxx1xxxx", InstName.Mrc, InstEmit32.Mrc, OpCode32System.Create);
SetA32("<<<<11000101xxxxxxxx111xxxxxxxxx", InstName.Mrrc, InstEmit32.Mrrc, OpCode32System.Create);
SetA32("<<<<00010x001111xxxx000000000000", InstName.Mrs, InstEmit32.Mrs, OpCode32Mrs.Create);
SetA32("<<<<00010x10xxxx111100000000xxxx", InstName.Msr, InstEmit32.Msr, OpCode32MsrReg.Create);
SetA32("<<<<0000000xxxxx0000xxxx1001xxxx", InstName.Mul, InstEmit32.Mul, OpCode32AluMla.Create);
SetA32("<<<<0011111x0000xxxxxxxxxxxxxxxx", InstName.Mvn, InstEmit32.Mvn, OpCode32AluImm.Create);
@ -791,186 +792,209 @@ namespace ARMeilleure.Decoders
SetA32("<<<<01101100xxxxxxxxxx000111xxxx", InstName.Uxtb16, InstEmit32.Uxtb16, OpCode32AluUx.Create);
SetA32("<<<<01101111xxxxxxxxxx000111xxxx", InstName.Uxth, InstEmit32.Uxth, OpCode32AluUx.Create);
// FP & SIMD
SetA32("111100111x110000xxx0001101x0xxx0", InstName.Aesd_V, InstEmit32.Aesd_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001100x0xxx0", InstName.Aese_V, InstEmit32.Aese_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001111x0xxx0", InstName.Aesimc_V, InstEmit32.Aesimc_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001110x0xxx0", InstName.Aesmc_V, InstEmit32.Aesmc_V, OpCode32Simd.Create);
SetA32("1111001x0x<<xxxxxxxx0111xxx0xxxx", InstName.Vabd, InstEmit32.Vabd_I, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0111x0x0xxxx", InstName.Vabdl, InstEmit32.Vabdl_I, OpCode32SimdRegLong.Create);
SetA32("<<<<11101x110000xxxx101x11x0xxxx", InstName.Vabs, InstEmit32.Vabs_S, OpCode32SimdS.Create);
SetA32("111100111x11<<01xxxx00110xx0xxxx", InstName.Vabs, InstEmit32.Vabs_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111001xxxx01110xx0xxxx", InstName.Vabs, InstEmit32.Vabs_V, OpCode32SimdCmpZ.Create);
SetA32("111100100xxxxxxxxxxx1000xxx0xxxx", InstName.Vadd, InstEmit32.Vadd_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x11xxxxxxxx101xx0x0xxxx", InstName.Vadd, InstEmit32.Vadd_S, OpCode32SimdRegS.Create);
SetA32("111100100x00xxxxxxxx1101xxx0xxxx", InstName.Vadd, InstEmit32.Vadd_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0000x0x0xxxx", InstName.Vaddl, InstEmit32.Vaddl_I, OpCode32SimdRegLong.Create);
SetA32("1111001x1x<<xxxxxxxx0001x0x0xxxx", InstName.Vaddw, InstEmit32.Vaddw_I, OpCode32SimdRegWide.Create);
SetA32("111100100x00xxxxxxxx0001xxx1xxxx", InstName.Vand, InstEmit32.Vand_I, OpCode32SimdBinary.Create);
SetA32("111100100x01xxxxxxxx0001xxx1xxxx", InstName.Vbic, InstEmit32.Vbic_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx<<x10x11xxxx", InstName.Vbic, InstEmit32.Vbic_II, OpCode32SimdImm.Create);
SetA32("111100110x11xxxxxxxx0001xxx1xxxx", InstName.Vbif, InstEmit32.Vbif, OpCode32SimdBinary.Create);
SetA32("111100110x10xxxxxxxx0001xxx1xxxx", InstName.Vbit, InstEmit32.Vbit, OpCode32SimdBinary.Create);
SetA32("111100110x01xxxxxxxx0001xxx1xxxx", InstName.Vbsl, InstEmit32.Vbsl, OpCode32SimdBinary.Create);
SetA32("111100110x<<xxxxxxxx1000xxx1xxxx", InstName.Vceq, InstEmit32.Vceq_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1110xxx0xxxx", InstName.Vceq, InstEmit32.Vceq_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x010xx0xxxx", InstName.Vceq, InstEmit32.Vceq_Z, OpCode32SimdCmpZ.Create);
SetA32("1111001x0x<<xxxxxxxx0011xxx1xxxx", InstName.Vcge, InstEmit32.Vcge_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1110xxx0xxxx", InstName.Vcge, InstEmit32.Vcge_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x001xx0xxxx", InstName.Vcge, InstEmit32.Vcge_Z, OpCode32SimdCmpZ.Create);
SetA32("1111001x0x<<xxxxxxxx0011xxx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_I, OpCode32SimdReg.Create);
SetA32("111100110x10xxxxxxxx1110xxx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x000xx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_Z, OpCode32SimdCmpZ.Create);
SetA32("111100111x11xx01xxxx0x011xx0xxxx", InstName.Vcle, InstEmit32.Vcle_Z, OpCode32SimdCmpZ.Create);
SetA32("111100111x11xx01xxxx0x100xx0xxxx", InstName.Vclt, InstEmit32.Vclt_Z, OpCode32SimdCmpZ.Create);
SetA32("<<<<11101x11010xxxxx101x01x0xxxx", InstName.Vcmp, InstEmit32.Vcmp, OpCode32SimdS.Create);
SetA32("<<<<11101x11010xxxxx101x11x0xxxx", InstName.Vcmpe, InstEmit32.Vcmpe, OpCode32SimdS.Create);
SetA32("111100111x110000xxxx01010xx0xxxx", InstName.Vcnt, InstEmit32.Vcnt, OpCode32SimdCmpZ.Create);
SetA32("<<<<11101x110111xxxx101x11x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FD, OpCode32SimdS.Create); // FP 32 and 64, scalar.
SetA32("<<<<11101x11110xxxxx101x11x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FI, OpCode32SimdCvtFI.Create); // FP32 to int.
SetA32("<<<<11101x111000xxxx101xx1x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FI, OpCode32SimdCvtFI.Create); // Int to FP32.
SetA32("111111101x1111xxxxxx101xx1x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_RM, OpCode32SimdCvtFI.Create); // The many FP32 to int encodings (fp).
SetA32("111100111x111011xxxx011xxxx0xxxx", InstName.Vcvt, InstEmit32.Vcvt_V, OpCode32SimdCmpZ.Create); // FP and integer, vector.
SetA32("<<<<11101x00xxxxxxxx101xx0x0xxxx", InstName.Vdiv, InstEmit32.Vdiv_S, OpCode32SimdRegS.Create);
SetA32("<<<<11101xx0xxxxxxxx1011x0x10000", InstName.Vdup, InstEmit32.Vdup, OpCode32SimdDupGP.Create);
SetA32("111100111x11xxxxxxxx11000xx0xxxx", InstName.Vdup, InstEmit32.Vdup_1, OpCode32SimdDupElem.Create);
SetA32("111100110x00xxxxxxxx0001xxx1xxxx", InstName.Veor, InstEmit32.Veor_I, OpCode32SimdBinary.Create);
SetA32("111100101x11xxxxxxxxxxxxxxx0xxxx", InstName.Vext, InstEmit32.Vext, OpCode32SimdExt.Create);
SetA32("<<<<11101x10xxxxxxxx101xx0x0xxxx", InstName.Vfma, InstEmit32.Vfma_S, OpCode32SimdRegS.Create);
SetA32("111100100x00xxxxxxxx1100xxx1xxxx", InstName.Vfma, InstEmit32.Vfma_V, OpCode32SimdReg.Create);
SetA32("<<<<11101x10xxxxxxxx101xx1x0xxxx", InstName.Vfms, InstEmit32.Vfms_S, OpCode32SimdRegS.Create);
SetA32("111100100x10xxxxxxxx1100xxx1xxxx", InstName.Vfms, InstEmit32.Vfms_V, OpCode32SimdReg.Create);
SetA32("<<<<11101x01xxxxxxxx101xx1x0xxxx", InstName.Vfnma, InstEmit32.Vfnma_S, OpCode32SimdRegS.Create);
SetA32("<<<<11101x01xxxxxxxx101xx0x0xxxx", InstName.Vfnms, InstEmit32.Vfnms_S, OpCode32SimdRegS.Create);
SetA32("1111001x0x<<xxxxxxxx0000xxx0xxxx", InstName.Vhadd, InstEmit32.Vhadd, OpCode32SimdReg.Create);
SetA32("111101001x10xxxxxxxxxx00xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx0111xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 1.
SetA32("111101000x10xxxxxxxx1010xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 2.
SetA32("111101000x10xxxxxxxx0110xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 3.
SetA32("111101000x10xxxxxxxx0010xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 4.
SetA32("111101001x10xxxxxxxxxx01xxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx100xxxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemPair.Create); // Regs = 1, inc = 1/2 (itype).
SetA32("111101000x10xxxxxxxx0011xxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemPair.Create); // Regs = 2, inc = 2.
SetA32("111101001x10xxxxxxxxxx10xxxxxxxx", InstName.Vld3, InstEmit32.Vld3, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx010xxxxxxxxx", InstName.Vld3, InstEmit32.Vld3, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("111101001x10xxxxxxxxxx11xxxxxxxx", InstName.Vld4, InstEmit32.Vld4, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx000xxxxxxxxx", InstName.Vld4, InstEmit32.Vld4, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("<<<<11001x01xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x11xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x11xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x01xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x11xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x11xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<1101xx01xxxxxxxx101xxxxxxxxx", InstName.Vldr, InstEmit32.Vldr, OpCode32SimdMemImm.Create);
SetA32("1111001x0x<<xxxxxxxx0110xxx0xxxx", InstName.Vmax, InstEmit32.Vmax_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1111xxx0xxxx", InstName.Vmax, InstEmit32.Vmax_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx0110xxx1xxxx", InstName.Vmin, InstEmit32.Vmin_I, OpCode32SimdReg.Create);
SetA32("111100100x10xxxxxxxx1111xxx0xxxx", InstName.Vmin, InstEmit32.Vmin_V, OpCode32SimdReg.Create);
SetA32("111111101x00xxxxxxxx10>>x0x0xxxx", InstName.Vmaxnm, InstEmit32.Vmaxnm_S, OpCode32SimdRegS.Create);
SetA32("111100110x0xxxxxxxxx1111xxx1xxxx", InstName.Vmaxnm, InstEmit32.Vmaxnm_V, OpCode32SimdReg.Create);
SetA32("111111101x00xxxxxxxx10>>x1x0xxxx", InstName.Vminnm, InstEmit32.Vminnm_S, OpCode32SimdRegS.Create);
SetA32("111100110x1xxxxxxxxx1111xxx1xxxx", InstName.Vminnm, InstEmit32.Vminnm_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx000xx1x0xxxx", InstName.Vmla, InstEmit32.Vmla_1, OpCode32SimdRegElem.Create);
SetA32("111100100xxxxxxxxxxx1001xxx0xxxx", InstName.Vmla, InstEmit32.Vmla_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x00xxxxxxxx101xx0x0xxxx", InstName.Vmla, InstEmit32.Vmla_S, OpCode32SimdRegS.Create);
SetA32("111100100x00xxxxxxxx1101xxx1xxxx", InstName.Vmla, InstEmit32.Vmla_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx010xx1x0xxxx", InstName.Vmls, InstEmit32.Vmls_1, OpCode32SimdRegElem.Create);
SetA32("<<<<11100x00xxxxxxxx101xx1x0xxxx", InstName.Vmls, InstEmit32.Vmls_S, OpCode32SimdRegS.Create);
SetA32("111100100x10xxxxxxxx1101xxx1xxxx", InstName.Vmls, InstEmit32.Vmls_V, OpCode32SimdReg.Create);
SetA32("111100110xxxxxxxxxxx1001xxx0xxxx", InstName.Vmls, InstEmit32.Vmls_I, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx01010x0x0xxxx", InstName.Vmlsl, InstEmit32.Vmlsl_I, OpCode32SimdRegLong.Create);
SetA32("<<<<11100xx0xxxxxxxx1011xxx10000", InstName.Vmov, InstEmit32.Vmov_G1, OpCode32SimdMovGpElem.Create); // From gen purpose.
SetA32("<<<<1110xxx1xxxxxxxx1011xxx10000", InstName.Vmov, InstEmit32.Vmov_G1, OpCode32SimdMovGpElem.Create); // To gen purpose.
SetA32("<<<<1100010xxxxxxxxx101000x1xxxx", InstName.Vmov, InstEmit32.Vmov_G2, OpCode32SimdMovGpDouble.Create); // To/from gen purpose x2 and single precision x2.
SetA32("<<<<1100010xxxxxxxxx101100x1xxxx", InstName.Vmov, InstEmit32.Vmov_GD, OpCode32SimdMovGpDouble.Create); // To/from gen purpose x2 and double precision.
SetA32("<<<<1110000xxxxxxxxx1010x0010000", InstName.Vmov, InstEmit32.Vmov_GS, OpCode32SimdMovGp.Create); // To/from gen purpose and single precision.
SetA32("1111001x1x000xxxxxxx0xx00x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q vector I32.
SetA32("<<<<11101x11xxxxxxxx101x0000xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm44.Create); // Scalar f16/32/64 based on size 01 10 11.
SetA32("1111001x1x000xxxxxxx10x00x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q I16.
SetA32("1111001x1x000xxxxxxx11xx0x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q (dt - from cmode).
SetA32("1111001x1x000xxxxxxx11100x11xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q I64.
SetA32("<<<<11101x110000xxxx101x01x0xxxx", InstName.Vmov, InstEmit32.Vmov_S, OpCode32SimdS.Create);
SetA32("1111001x1x001000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("1111001x1x010000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("1111001x1x100000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("111100111x11xx10xxxx001000x0xxx0", InstName.Vmovn, InstEmit32.Vmovn, OpCode32SimdCmpZ.Create);
SetA32("<<<<11101111xxxxxxxx101000010000", InstName.Vmrs, InstEmit32.Vmrs, OpCode32SimdSpecial.Create);
SetA32("<<<<11101110xxxxxxxx101000010000", InstName.Vmsr, InstEmit32.Vmsr, OpCode32SimdSpecial.Create);
SetA32("1111001x1x<<xxxxxxxx100xx1x0xxxx", InstName.Vmul, InstEmit32.Vmul_1, OpCode32SimdRegElem.Create);
SetA32("111100100x<<xxxxxxxx1001xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1001xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x10xxxxxxxx101xx0x0xxxx", InstName.Vmul, InstEmit32.Vmul_S, OpCode32SimdRegS.Create);
SetA32("111100110x00xxxxxxxx1101xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx01010x1x0xxxx", InstName.Vmull, InstEmit32.Vmull_1, OpCode32SimdRegElemLong.Create);
SetA32("1111001x1x<<xxxxxxx01100x0x0xxxx", InstName.Vmull, InstEmit32.Vmull_I, OpCode32SimdRegLong.Create);
SetA32("111100101xx0xxxxxxx01110x0x0xxxx", InstName.Vmull, InstEmit32.Vmull_I, OpCode32SimdRegLong.Create); // P8/P64
SetA32("111100111x110000xxxx01011xx0xxxx", InstName.Vmvn, InstEmit32.Vmvn_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx0xx00x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create); // D/Q vector I32.
SetA32("1111001x1x000xxxxxxx10x00x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create);
SetA32("1111001x1x000xxxxxxx110x0x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create);
SetA32("<<<<11101x110001xxxx101x01x0xxxx", InstName.Vneg, InstEmit32.Vneg_S, OpCode32SimdS.Create);
SetA32("111100111x11<<01xxxx00111xx0xxxx", InstName.Vneg, InstEmit32.Vneg_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111001xxxx01111xx0xxxx", InstName.Vneg, InstEmit32.Vneg_V, OpCode32SimdCmpZ.Create);
SetA32("<<<<11100x01xxxxxxxx101xx1x0xxxx", InstName.Vnmla, InstEmit32.Vnmla_S, OpCode32SimdRegS.Create);
SetA32("<<<<11100x01xxxxxxxx101xx0x0xxxx", InstName.Vnmls, InstEmit32.Vnmls_S, OpCode32SimdRegS.Create);
SetA32("<<<<11100x10xxxxxxxx101xx1x0xxxx", InstName.Vnmul, InstEmit32.Vnmul_S, OpCode32SimdRegS.Create);
SetA32("111100100x11xxxxxxxx0001xxx1xxxx", InstName.Vorn, InstEmit32.Vorn_I, OpCode32SimdBinary.Create);
SetA32("111100100x10xxxxxxxx0001xxx1xxxx", InstName.Vorr, InstEmit32.Vorr_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx<<x10x01xxxx", InstName.Vorr, InstEmit32.Vorr_II, OpCode32SimdImm.Create);
SetA32("111100100x<<xxxxxxxx1011x0x1xxxx", InstName.Vpadd, InstEmit32.Vpadd_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1101x0x0xxxx", InstName.Vpadd, InstEmit32.Vpadd_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx1010x0x0xxxx", InstName.Vpmax, InstEmit32.Vpmax_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1111x0x0xxxx", InstName.Vpmax, InstEmit32.Vpmax_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx1010x0x1xxxx", InstName.Vpmin, InstEmit32.Vpmin_I, OpCode32SimdReg.Create);
SetA32("111100110x10xxxxxxxx1111x0x0xxxx", InstName.Vpmin, InstEmit32.Vpmin_V, OpCode32SimdReg.Create);
SetA32("1111001x1x>>>xxxxxxx100101x1xxx0", InstName.Vqrshrn, InstEmit32.Vqrshrn, OpCode32SimdShImmNarrow.Create);
SetA32("111100111x>>>xxxxxxx100001x1xxx0", InstName.Vqrshrun, InstEmit32.Vqrshrun, OpCode32SimdShImmNarrow.Create);
SetA32("1111001x1x>>>xxxxxxx100100x1xxx0", InstName.Vqshrn, InstEmit32.Vqshrn, OpCode32SimdShImmNarrow.Create);
SetA32("111100111x111011xxxx010x0xx0xxxx", InstName.Vrecpe, InstEmit32.Vrecpe, OpCode32SimdSqrte.Create);
SetA32("111100100x00xxxxxxxx1111xxx1xxxx", InstName.Vrecps, InstEmit32.Vrecps, OpCode32SimdReg.Create);
SetA32("111100111x11xx00xxxx000<<xx0xxxx", InstName.Vrev, InstEmit32.Vrev, OpCode32SimdRev.Create);
SetA32("111111101x1110xxxxxx101x01x0xxxx", InstName.Vrint, InstEmit32.Vrint_RM, OpCode32SimdS.Create);
SetA32("<<<<11101x110110xxxx101x11x0xxxx", InstName.Vrint, InstEmit32.Vrint_Z, OpCode32SimdS.Create);
SetA32("<<<<11101x110111xxxx101x01x0xxxx", InstName.Vrintx, InstEmit32.Vrintx_S, OpCode32SimdS.Create);
SetA32("1111001x1x>>>xxxxxxx0010>xx1xxxx", InstName.Vrshr, InstEmit32.Vrshr, OpCode32SimdShImm.Create);
SetA32("111100111x111011xxxx010x1xx0xxxx", InstName.Vrsqrte, InstEmit32.Vrsqrte, OpCode32SimdSqrte.Create);
SetA32("111100100x10xxxxxxxx1111xxx1xxxx", InstName.Vrsqrts, InstEmit32.Vrsqrts, OpCode32SimdReg.Create);
SetA32("111111100xxxxxxxxxxx101xx0x0xxxx", InstName.Vsel, InstEmit32.Vsel, OpCode32SimdSel.Create);
SetA32("111100101x>>>xxxxxxx0101>xx1xxxx", InstName.Vshl, InstEmit32.Vshl, OpCode32SimdShImm.Create);
SetA32("1111001x0xxxxxxxxxxx0100xxx0xxxx", InstName.Vshl, InstEmit32.Vshl_I, OpCode32SimdReg.Create);
SetA32("1111001x1x>>>xxxxxxx101000x1xxxx", InstName.Vshll, InstEmit32.Vshll, OpCode32SimdShImmLong.Create); // A1 encoding.
SetA32("1111001x1x>>>xxxxxxx0000>xx1xxxx", InstName.Vshr, InstEmit32.Vshr, OpCode32SimdShImm.Create);
SetA32("111100101x>>>xxxxxxx100000x1xxx0", InstName.Vshrn, InstEmit32.Vshrn, OpCode32SimdShImmNarrow.Create);
SetA32("<<<<11101x110001xxxx101x11x0xxxx", InstName.Vsqrt, InstEmit32.Vsqrt_S, OpCode32SimdS.Create);
SetA32("1111001x1x>>>xxxxxxx0001>xx1xxxx", InstName.Vsra, InstEmit32.Vsra, OpCode32SimdShImm.Create);
SetA32("111101001x00xxxxxxxx<<00xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx0111xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 1.
SetA32("111101000x00xxxxxxxx1010xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 2.
SetA32("111101000x00xxxxxxxx0110xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 3.
SetA32("111101000x00xxxxxxxx0010xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 4.
SetA32("111101001x00xxxxxxxx<<01xxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx100xxxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemPair.Create); // Regs = 1, inc = 1/2 (itype).
SetA32("111101000x00xxxxxxxx0011xxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemPair.Create); // Regs = 2, inc = 2.
SetA32("111101001x00xxxxxxxx<<10xxxxxxxx", InstName.Vst3, InstEmit32.Vst3, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx010xxxxxxxxx", InstName.Vst3, InstEmit32.Vst3, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("111101001x00xxxxxxxx<<11xxxxxxxx", InstName.Vst4, InstEmit32.Vst4, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx000xxxxxxxxx", InstName.Vst4, InstEmit32.Vst4, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("<<<<11001x00xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x10xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x10xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x00xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x10xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x10xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<1101xx00xxxxxxxx101xxxxxxxxx", InstName.Vstr, InstEmit32.Vstr, OpCode32SimdMemImm.Create);
SetA32("111100110xxxxxxxxxxx1000xxx0xxxx", InstName.Vsub, InstEmit32.Vsub_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x11xxxxxxxx101xx1x0xxxx", InstName.Vsub, InstEmit32.Vsub_S, OpCode32SimdRegS.Create);
SetA32("111100100x10xxxxxxxx1101xxx0xxxx", InstName.Vsub, InstEmit32.Vsub_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0011x0x0xxxx", InstName.Vsubw, InstEmit32.Vsubw_I, OpCode32SimdRegWide.Create);
SetA32("111100111x11xxxxxxxx10xxxxx0xxxx", InstName.Vtbl, InstEmit32.Vtbl, OpCode32SimdTbl.Create);
SetA32("111100111x11<<10xxxx00001xx0xxxx", InstName.Vtrn, InstEmit32.Vtrn, OpCode32SimdCmpZ.Create);
SetA32("111100100x<<xxxxxxxx1000xxx1xxxx", InstName.Vtst, InstEmit32.Vtst, OpCode32SimdReg.Create);
SetA32("111100111x11<<10xxxx00010xx0xxxx", InstName.Vuzp, InstEmit32.Vuzp, OpCode32SimdCmpZ.Create);
SetA32("111100111x11<<10xxxx00011xx0xxxx", InstName.Vzip, InstEmit32.Vzip, OpCode32SimdCmpZ.Create);
// VFP
SetVfp("<<<<11101x110000xxxx101x11x0xxxx", InstName.Vabs, InstEmit32.Vabs_S, OpCode32SimdS.Create, OpCode32SimdS.CreateT32);
SetVfp("<<<<11100x11xxxxxxxx101xx0x0xxxx", InstName.Vadd, InstEmit32.Vadd_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11101x11010xxxxx101x01x0xxxx", InstName.Vcmp, InstEmit32.Vcmp, OpCode32SimdS.Create, OpCode32SimdS.CreateT32);
SetVfp("<<<<11101x11010xxxxx101x11x0xxxx", InstName.Vcmpe, InstEmit32.Vcmpe, OpCode32SimdS.Create, OpCode32SimdS.CreateT32);
SetVfp("<<<<11101x110111xxxx101x11x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FD, OpCode32SimdS.Create, OpCode32SimdS.CreateT32); // FP 32 and 64, scalar.
SetVfp("<<<<11101x11110xxxxx101x11x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FI, OpCode32SimdCvtFI.Create, OpCode32SimdCvtFI.CreateT32); // FP32 to int.
SetVfp("<<<<11101x111000xxxx101xx1x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FI, OpCode32SimdCvtFI.Create, OpCode32SimdCvtFI.CreateT32); // Int to FP32.
SetVfp("111111101x1111xxxxxx101xx1x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_RM, OpCode32SimdCvtFI.Create, OpCode32SimdCvtFI.CreateT32); // The many FP32 to int encodings (fp).
SetVfp("<<<<11101x00xxxxxxxx101xx0x0xxxx", InstName.Vdiv, InstEmit32.Vdiv_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11101xx0xxxxxxxx1011x0x10000", InstName.Vdup, InstEmit32.Vdup, OpCode32SimdDupGP.Create, OpCode32SimdDupGP.CreateT32);
SetVfp("<<<<11101x10xxxxxxxx101xx0x0xxxx", InstName.Vfma, InstEmit32.Vfma_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11101x10xxxxxxxx101xx1x0xxxx", InstName.Vfms, InstEmit32.Vfms_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11101x01xxxxxxxx101xx1x0xxxx", InstName.Vfnma, InstEmit32.Vfnma_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11101x01xxxxxxxx101xx0x0xxxx", InstName.Vfnms, InstEmit32.Vfnms_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11001x01xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11001x11xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11010x11xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11001x01xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11001x11xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11010x11xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<1101xx01xxxxxxxx101xxxxxxxxx", InstName.Vldr, InstEmit32.Vldr, OpCode32SimdMemImm.Create, OpCode32SimdMemImm.CreateT32);
SetVfp("111111101x00xxxxxxxx10>>x0x0xxxx", InstName.Vmaxnm, InstEmit32.Vmaxnm_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("111111101x00xxxxxxxx10>>x1x0xxxx", InstName.Vminnm, InstEmit32.Vminnm_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11100x00xxxxxxxx101xx0x0xxxx", InstName.Vmla, InstEmit32.Vmla_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11100x00xxxxxxxx101xx1x0xxxx", InstName.Vmls, InstEmit32.Vmls_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11100xx0xxxxxxxx1011xxx10000", InstName.Vmov, InstEmit32.Vmov_G1, OpCode32SimdMovGpElem.Create, OpCode32SimdMovGpElem.CreateT32); // From gen purpose.
SetVfp("<<<<1110xxx1xxxxxxxx1011xxx10000", InstName.Vmov, InstEmit32.Vmov_G1, OpCode32SimdMovGpElem.Create, OpCode32SimdMovGpElem.CreateT32); // To gen purpose.
SetVfp("<<<<1100010xxxxxxxxx101000x1xxxx", InstName.Vmov, InstEmit32.Vmov_G2, OpCode32SimdMovGpDouble.Create, OpCode32SimdMovGpDouble.CreateT32); // To/from gen purpose x2 and single precision x2.
SetVfp("<<<<1100010xxxxxxxxx101100x1xxxx", InstName.Vmov, InstEmit32.Vmov_GD, OpCode32SimdMovGpDouble.Create, OpCode32SimdMovGpDouble.CreateT32); // To/from gen purpose x2 and double precision.
SetVfp("<<<<1110000xxxxxxxxx1010x0010000", InstName.Vmov, InstEmit32.Vmov_GS, OpCode32SimdMovGp.Create, OpCode32SimdMovGp.CreateT32); // To/from gen purpose and single precision.
SetVfp("<<<<11101x11xxxxxxxx101x0000xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm44.Create, OpCode32SimdImm44.CreateT32); // Scalar f16/32/64 based on size 01 10 11.
SetVfp("<<<<11101x110000xxxx101x01x0xxxx", InstName.Vmov, InstEmit32.Vmov_S, OpCode32SimdS.Create, OpCode32SimdS.CreateT32);
SetVfp("<<<<11101111xxxxxxxx101000010000", InstName.Vmrs, InstEmit32.Vmrs, OpCode32SimdSpecial.Create, OpCode32SimdSpecial.CreateT32);
SetVfp("<<<<11101110xxxxxxxx101000010000", InstName.Vmsr, InstEmit32.Vmsr, OpCode32SimdSpecial.Create, OpCode32SimdSpecial.CreateT32);
SetVfp("<<<<11100x10xxxxxxxx101xx0x0xxxx", InstName.Vmul, InstEmit32.Vmul_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11101x110001xxxx101x01x0xxxx", InstName.Vneg, InstEmit32.Vneg_S, OpCode32SimdS.Create, OpCode32SimdS.CreateT32);
SetVfp("<<<<11100x01xxxxxxxx101xx1x0xxxx", InstName.Vnmla, InstEmit32.Vnmla_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11100x01xxxxxxxx101xx0x0xxxx", InstName.Vnmls, InstEmit32.Vnmls_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("<<<<11100x10xxxxxxxx101xx1x0xxxx", InstName.Vnmul, InstEmit32.Vnmul_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
SetVfp("111111101x1110xxxxxx101x01x0xxxx", InstName.Vrint, InstEmit32.Vrint_RM, OpCode32SimdS.Create, OpCode32SimdS.CreateT32);
SetVfp("<<<<11101x110110xxxx101x11x0xxxx", InstName.Vrint, InstEmit32.Vrint_Z, OpCode32SimdS.Create, OpCode32SimdS.CreateT32);
SetVfp("<<<<11101x110111xxxx101x01x0xxxx", InstName.Vrintx, InstEmit32.Vrintx_S, OpCode32SimdS.Create, OpCode32SimdS.CreateT32);
SetVfp("<<<<11101x110001xxxx101x11x0xxxx", InstName.Vsqrt, InstEmit32.Vsqrt_S, OpCode32SimdS.Create, OpCode32SimdS.CreateT32);
SetVfp("111111100xxxxxxxxxxx101xx0x0xxxx", InstName.Vsel, InstEmit32.Vsel, OpCode32SimdSel.Create, OpCode32SimdSel.CreateT32);
SetVfp("<<<<11001x00xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11001x10xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11010x10xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11001x00xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11001x10xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<11010x10xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create, OpCode32SimdMemMult.CreateT32);
SetVfp("<<<<1101xx00xxxxxxxx101xxxxxxxxx", InstName.Vstr, InstEmit32.Vstr, OpCode32SimdMemImm.Create, OpCode32SimdMemImm.CreateT32);
SetVfp("<<<<11100x11xxxxxxxx101xx1x0xxxx", InstName.Vsub, InstEmit32.Vsub_S, OpCode32SimdRegS.Create, OpCode32SimdRegS.CreateT32);
// ASIMD
SetA32("111100111x110000xxx0001101x0xxx0", InstName.Aesd_V, InstEmit32.Aesd_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001100x0xxx0", InstName.Aese_V, InstEmit32.Aese_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001111x0xxx0", InstName.Aesimc_V, InstEmit32.Aesimc_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001110x0xxx0", InstName.Aesmc_V, InstEmit32.Aesmc_V, OpCode32Simd.Create);
SetA32("111100110x00xxx0xxx01100x1x0xxx0", InstName.Sha256h_V, InstEmit32.Sha256h_V, OpCode32SimdReg.Create);
SetA32("111100110x01xxx0xxx01100x1x0xxx0", InstName.Sha256h2_V, InstEmit32.Sha256h2_V, OpCode32SimdReg.Create);
SetA32("111100111x111010xxx0001111x0xxx0", InstName.Sha256su0_V, InstEmit32.Sha256su0_V, OpCode32Simd.Create);
SetA32("111100110x10xxx0xxx01100x1x0xxx0", InstName.Sha256su1_V, InstEmit32.Sha256su1_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx0111xxx0xxxx", InstName.Vabd, InstEmit32.Vabd_I, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0111x0x0xxxx", InstName.Vabdl, InstEmit32.Vabdl_I, OpCode32SimdRegLong.Create);
SetA32("111100111x11<<01xxxx00110xx0xxxx", InstName.Vabs, InstEmit32.Vabs_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111001xxxx01110xx0xxxx", InstName.Vabs, InstEmit32.Vabs_V, OpCode32SimdCmpZ.Create);
SetA32("111100100xxxxxxxxxxx1000xxx0xxxx", InstName.Vadd, InstEmit32.Vadd_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1101xxx0xxxx", InstName.Vadd, InstEmit32.Vadd_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx00000x0x0xxxx", InstName.Vaddl, InstEmit32.Vaddl_I, OpCode32SimdRegLong.Create);
SetA32("1111001x1x<<xxxxxxx00001x0x0xxxx", InstName.Vaddw, InstEmit32.Vaddw_I, OpCode32SimdRegWide.Create);
SetA32("111100100x00xxxxxxxx0001xxx1xxxx", InstName.Vand, InstEmit32.Vand_I, OpCode32SimdBinary.Create);
SetA32("111100100x01xxxxxxxx0001xxx1xxxx", InstName.Vbic, InstEmit32.Vbic_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx<<x10x11xxxx", InstName.Vbic, InstEmit32.Vbic_II, OpCode32SimdImm.Create);
SetA32("111100110x11xxxxxxxx0001xxx1xxxx", InstName.Vbif, InstEmit32.Vbif, OpCode32SimdBinary.Create);
SetA32("111100110x10xxxxxxxx0001xxx1xxxx", InstName.Vbit, InstEmit32.Vbit, OpCode32SimdBinary.Create);
SetA32("111100110x01xxxxxxxx0001xxx1xxxx", InstName.Vbsl, InstEmit32.Vbsl, OpCode32SimdBinary.Create);
SetA32("111100110x<<xxxxxxxx1000xxx1xxxx", InstName.Vceq, InstEmit32.Vceq_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1110xxx0xxxx", InstName.Vceq, InstEmit32.Vceq_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x010xx0xxxx", InstName.Vceq, InstEmit32.Vceq_Z, OpCode32SimdCmpZ.Create);
SetA32("1111001x0x<<xxxxxxxx0011xxx1xxxx", InstName.Vcge, InstEmit32.Vcge_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1110xxx0xxxx", InstName.Vcge, InstEmit32.Vcge_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x001xx0xxxx", InstName.Vcge, InstEmit32.Vcge_Z, OpCode32SimdCmpZ.Create);
SetA32("1111001x0x<<xxxxxxxx0011xxx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_I, OpCode32SimdReg.Create);
SetA32("111100110x10xxxxxxxx1110xxx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x000xx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_Z, OpCode32SimdCmpZ.Create);
SetA32("111100111x11xx01xxxx0x011xx0xxxx", InstName.Vcle, InstEmit32.Vcle_Z, OpCode32SimdCmpZ.Create);
SetA32("111100111x11xx01xxxx0x100xx0xxxx", InstName.Vclt, InstEmit32.Vclt_Z, OpCode32SimdCmpZ.Create);
SetA32("111100111x110000xxxx01010xx0xxxx", InstName.Vcnt, InstEmit32.Vcnt, OpCode32SimdCmpZ.Create);
SetA32("111100111x111011xxxx011xxxx0xxxx", InstName.Vcvt, InstEmit32.Vcvt_V, OpCode32SimdCmpZ.Create); // FP and integer, vector.
SetA32("111100111x11xxxxxxxx11000xx0xxxx", InstName.Vdup, InstEmit32.Vdup_1, OpCode32SimdDupElem.Create);
SetA32("111100110x00xxxxxxxx0001xxx1xxxx", InstName.Veor, InstEmit32.Veor_I, OpCode32SimdBinary.Create);
SetA32("111100101x11xxxxxxxxxxxxxxx0xxxx", InstName.Vext, InstEmit32.Vext, OpCode32SimdExt.Create);
SetA32("111100100x00xxxxxxxx1100xxx1xxxx", InstName.Vfma, InstEmit32.Vfma_V, OpCode32SimdReg.Create);
SetA32("111100100x10xxxxxxxx1100xxx1xxxx", InstName.Vfms, InstEmit32.Vfms_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx0000xxx0xxxx", InstName.Vhadd, InstEmit32.Vhadd, OpCode32SimdReg.Create);
SetA32("111101001x10xxxxxxxxxx00xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx0111xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 1.
SetA32("111101000x10xxxxxxxx1010xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 2.
SetA32("111101000x10xxxxxxxx0110xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 3.
SetA32("111101000x10xxxxxxxx0010xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 4.
SetA32("111101001x10xxxxxxxxxx01xxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx100xxxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemPair.Create); // Regs = 1, inc = 1/2 (itype).
SetA32("111101000x10xxxxxxxx0011xxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemPair.Create); // Regs = 2, inc = 2.
SetA32("111101001x10xxxxxxxxxx10xxxxxxxx", InstName.Vld3, InstEmit32.Vld3, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx010xxxxxxxxx", InstName.Vld3, InstEmit32.Vld3, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("111101001x10xxxxxxxxxx11xxxxxxxx", InstName.Vld4, InstEmit32.Vld4, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx000xxxxxxxxx", InstName.Vld4, InstEmit32.Vld4, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("1111001x0x<<xxxxxxxx0110xxx0xxxx", InstName.Vmax, InstEmit32.Vmax_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1111xxx0xxxx", InstName.Vmax, InstEmit32.Vmax_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx0110xxx1xxxx", InstName.Vmin, InstEmit32.Vmin_I, OpCode32SimdReg.Create);
SetA32("111100100x10xxxxxxxx1111xxx0xxxx", InstName.Vmin, InstEmit32.Vmin_V, OpCode32SimdReg.Create);
SetA32("111100110x0xxxxxxxxx1111xxx1xxxx", InstName.Vmaxnm, InstEmit32.Vmaxnm_V, OpCode32SimdReg.Create);
SetA32("111100110x1xxxxxxxxx1111xxx1xxxx", InstName.Vminnm, InstEmit32.Vminnm_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx000xx1x0xxxx", InstName.Vmla, InstEmit32.Vmla_1, OpCode32SimdRegElem.Create);
SetA32("111100100xxxxxxxxxxx1001xxx0xxxx", InstName.Vmla, InstEmit32.Vmla_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1101xxx1xxxx", InstName.Vmla, InstEmit32.Vmla_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx01000x0x0xxxx", InstName.Vmlal, InstEmit32.Vmlal_I, OpCode32SimdRegLong.Create);
SetA32("1111001x1x<<xxxxxxxx010xx1x0xxxx", InstName.Vmls, InstEmit32.Vmls_1, OpCode32SimdRegElem.Create);
SetA32("111100100x10xxxxxxxx1101xxx1xxxx", InstName.Vmls, InstEmit32.Vmls_V, OpCode32SimdReg.Create);
SetA32("111100110xxxxxxxxxxx1001xxx0xxxx", InstName.Vmls, InstEmit32.Vmls_I, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx01010x0x0xxxx", InstName.Vmlsl, InstEmit32.Vmlsl_I, OpCode32SimdRegLong.Create);
SetA32("1111001x1x000xxxxxxx0xx00x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q vector I32.
SetA32("1111001x1x000xxxxxxx10x00x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q I16.
SetA32("1111001x1x000xxxxxxx11xx0x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q (dt - from cmode).
SetA32("1111001x1x000xxxxxxx11100x11xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q I64.
SetA32("1111001x1x001000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("1111001x1x010000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("1111001x1x100000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("111100111x11<<10xxxx001000x0xxx0", InstName.Vmovn, InstEmit32.Vmovn, OpCode32SimdMovn.Create);
SetA32("1111001x1x<<xxxxxxxx100xx1x0xxxx", InstName.Vmul, InstEmit32.Vmul_1, OpCode32SimdRegElem.Create);
SetA32("111100100x<<xxxxxxxx1001xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1001xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1101xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx01010x1x0xxxx", InstName.Vmull, InstEmit32.Vmull_1, OpCode32SimdRegElemLong.Create);
SetA32("1111001x1x<<xxxxxxx01100x0x0xxxx", InstName.Vmull, InstEmit32.Vmull_I, OpCode32SimdRegLong.Create);
SetA32("111100101xx0xxxxxxx01110x0x0xxxx", InstName.Vmull, InstEmit32.Vmull_I, OpCode32SimdRegLong.Create); // P8/P64
SetA32("111100111x110000xxxx01011xx0xxxx", InstName.Vmvn, InstEmit32.Vmvn_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx0xx00x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create); // D/Q vector I32.
SetA32("1111001x1x000xxxxxxx10x00x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create);
SetA32("1111001x1x000xxxxxxx110x0x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create);
SetA32("111100111x11<<01xxxx00111xx0xxxx", InstName.Vneg, InstEmit32.Vneg_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111001xxxx01111xx0xxxx", InstName.Vneg, InstEmit32.Vneg_V, OpCode32SimdCmpZ.Create);
SetA32("111100100x11xxxxxxxx0001xxx1xxxx", InstName.Vorn, InstEmit32.Vorn_I, OpCode32SimdBinary.Create);
SetA32("111100100x10xxxxxxxx0001xxx1xxxx", InstName.Vorr, InstEmit32.Vorr_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx<<x10x01xxxx", InstName.Vorr, InstEmit32.Vorr_II, OpCode32SimdImm.Create);
SetA32("111100100x<<xxxxxxxx1011x0x1xxxx", InstName.Vpadd, InstEmit32.Vpadd_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1101x0x0xxxx", InstName.Vpadd, InstEmit32.Vpadd_V, OpCode32SimdReg.Create);
SetA32("111100111x11<<00xxxx0010xxx0xxxx", InstName.Vpaddl, InstEmit32.Vpaddl, OpCode32SimdCmpZ.Create);
SetA32("1111001x0x<<xxxxxxxx1010x0x0xxxx", InstName.Vpmax, InstEmit32.Vpmax_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1111x0x0xxxx", InstName.Vpmax, InstEmit32.Vpmax_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx1010x0x1xxxx", InstName.Vpmin, InstEmit32.Vpmin_I, OpCode32SimdReg.Create);
SetA32("111100110x10xxxxxxxx1111x0x0xxxx", InstName.Vpmin, InstEmit32.Vpmin_V, OpCode32SimdReg.Create);
SetA32("1111001x0xxxxxxxxxxx0000xxx1xxxx", InstName.Vqadd, InstEmit32.Vqadd, OpCode32SimdReg.Create);
SetA32("111100100x01xxxxxxxx1011xxx0xxxx", InstName.Vqdmulh, InstEmit32.Vqdmulh, OpCode32SimdReg.Create);
SetA32("111100100x10xxxxxxxx1011xxx0xxxx", InstName.Vqdmulh, InstEmit32.Vqdmulh, OpCode32SimdReg.Create);
SetA32("111100111x11<<10xxxx00101xx0xxx0", InstName.Vqmovn, InstEmit32.Vqmovn, OpCode32SimdMovn.Create);
SetA32("111100111x11<<10xxxx001001x0xxx0", InstName.Vqmovun, InstEmit32.Vqmovun, OpCode32SimdMovn.Create);
SetA32("1111001x1x>>>xxxxxxx100101x1xxx0", InstName.Vqrshrn, InstEmit32.Vqrshrn, OpCode32SimdShImmNarrow.Create);
SetA32("111100111x>>>xxxxxxx100001x1xxx0", InstName.Vqrshrun, InstEmit32.Vqrshrun, OpCode32SimdShImmNarrow.Create);
SetA32("1111001x1x>>>xxxxxxx100100x1xxx0", InstName.Vqshrn, InstEmit32.Vqshrn, OpCode32SimdShImmNarrow.Create);
SetA32("111100111x>>>xxxxxxx100000x1xxx0", InstName.Vqshrun, InstEmit32.Vqshrun, OpCode32SimdShImmNarrow.Create);
SetA32("1111001x0xxxxxxxxxxx0010xxx1xxxx", InstName.Vqsub, InstEmit32.Vqsub, OpCode32SimdReg.Create);
SetA32("111100111x111011xxxx010x0xx0xxxx", InstName.Vrecpe, InstEmit32.Vrecpe, OpCode32SimdSqrte.Create);
SetA32("111100100x00xxxxxxxx1111xxx1xxxx", InstName.Vrecps, InstEmit32.Vrecps, OpCode32SimdReg.Create);
SetA32("111100111x11xx00xxxx000<<xx0xxxx", InstName.Vrev, InstEmit32.Vrev, OpCode32SimdRev.Create);
SetA32("1111001x0x<<xxxxxxxx0001xxx0xxxx", InstName.Vrhadd, InstEmit32.Vrhadd, OpCode32SimdReg.Create);
SetA32("111100111x111010xxxx01010xx0xxxx", InstName.Vrinta, InstEmit32.Vrinta_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111010xxxx01101xx0xxxx", InstName.Vrintm, InstEmit32.Vrintm_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111010xxxx01000xx0xxxx", InstName.Vrintn, InstEmit32.Vrintn_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111010xxxx01111xx0xxxx", InstName.Vrintp, InstEmit32.Vrintp_V, OpCode32SimdCmpZ.Create);
SetA32("1111001x1x>>>xxxxxxx0010>xx1xxxx", InstName.Vrshr, InstEmit32.Vrshr, OpCode32SimdShImm.Create);
SetA32("111100101x>>>xxxxxxx100001x1xxx0", InstName.Vrshrn, InstEmit32.Vrshrn, OpCode32SimdShImmNarrow.Create);
SetA32("111100111x111011xxxx010x1xx0xxxx", InstName.Vrsqrte, InstEmit32.Vrsqrte, OpCode32SimdSqrte.Create);
SetA32("111100100x10xxxxxxxx1111xxx1xxxx", InstName.Vrsqrts, InstEmit32.Vrsqrts, OpCode32SimdReg.Create);
SetA32("1111001x1x>>>xxxxxxx0011>xx1xxxx", InstName.Vrsra, InstEmit32.Vrsra, OpCode32SimdShImm.Create);
SetA32("111100101x>>>xxxxxxx0101>xx1xxxx", InstName.Vshl, InstEmit32.Vshl, OpCode32SimdShImm.Create);
SetA32("1111001x0xxxxxxxxxxx0100xxx0xxxx", InstName.Vshl, InstEmit32.Vshl_I, OpCode32SimdReg.Create);
SetA32("1111001x1x>>>xxxxxxx101000x1xxxx", InstName.Vshll, InstEmit32.Vshll, OpCode32SimdShImmLong.Create); // A1 encoding.
SetA32("1111001x1x>>>xxxxxxx0000>xx1xxxx", InstName.Vshr, InstEmit32.Vshr, OpCode32SimdShImm.Create);
SetA32("111100101x>>>xxxxxxx100000x1xxx0", InstName.Vshrn, InstEmit32.Vshrn, OpCode32SimdShImmNarrow.Create);
SetA32("1111001x1x>>>xxxxxxx0001>xx1xxxx", InstName.Vsra, InstEmit32.Vsra, OpCode32SimdShImm.Create);
SetA32("111101001x00xxxxxxxx<<00xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx0111xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 1.
SetA32("111101000x00xxxxxxxx1010xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 2.
SetA32("111101000x00xxxxxxxx0110xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 3.
SetA32("111101000x00xxxxxxxx0010xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 4.
SetA32("111101001x00xxxxxxxx<<01xxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx100xxxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemPair.Create); // Regs = 1, inc = 1/2 (itype).
SetA32("111101000x00xxxxxxxx0011xxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemPair.Create); // Regs = 2, inc = 2.
SetA32("111101001x00xxxxxxxx<<10xxxxxxxx", InstName.Vst3, InstEmit32.Vst3, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx010xxxxxxxxx", InstName.Vst3, InstEmit32.Vst3, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("111101001x00xxxxxxxx<<11xxxxxxxx", InstName.Vst4, InstEmit32.Vst4, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx000xxxxxxxxx", InstName.Vst4, InstEmit32.Vst4, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("111100110xxxxxxxxxxx1000xxx0xxxx", InstName.Vsub, InstEmit32.Vsub_I, OpCode32SimdReg.Create);
SetA32("111100100x10xxxxxxxx1101xxx0xxxx", InstName.Vsub, InstEmit32.Vsub_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx00010x0x0xxxx", InstName.Vsubl, InstEmit32.Vsubl_I, OpCode32SimdRegLong.Create);
SetA32("1111001x1x<<xxxxxxx00011x0x0xxxx", InstName.Vsubw, InstEmit32.Vsubw_I, OpCode32SimdRegWide.Create);
SetA32("111100111x11xxxxxxxx10xxxxx0xxxx", InstName.Vtbl, InstEmit32.Vtbl, OpCode32SimdTbl.Create);
SetA32("111100111x11<<10xxxx00001xx0xxxx", InstName.Vtrn, InstEmit32.Vtrn, OpCode32SimdCmpZ.Create);
SetA32("111100100x<<xxxxxxxx1000xxx1xxxx", InstName.Vtst, InstEmit32.Vtst, OpCode32SimdReg.Create);
SetA32("111100111x11<<10xxxx00010xx0xxxx", InstName.Vuzp, InstEmit32.Vuzp, OpCode32SimdCmpZ.Create);
SetA32("111100111x11<<10xxxx00011xx0xxxx", InstName.Vzip, InstEmit32.Vzip, OpCode32SimdCmpZ.Create);
#endregion
#region "OpCode Table (AArch32, T16)"
@ -1003,7 +1027,7 @@ namespace ARMeilleure.Decoders
SetT16("01000101xxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT16AluRegHigh.Create);
SetT16("01000110xxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16AluRegHigh.Create);
SetT16("010001110xxxx000", InstName.Bx, InstEmit32.Bx, OpCodeT16BReg.Create);
SetT16("010001111xxxx000", InstName.Blx, InstEmit32.Blx, OpCodeT16BReg.Create);
SetT16("010001111xxxx000", InstName.Blx, InstEmit32.Blxr, OpCodeT16BReg.Create);
SetT16("01001xxxxxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT16MemLit.Create);
SetT16("0101000xxxxxxxxx", InstName.Str, InstEmit32.Str, OpCodeT16MemReg.Create);
SetT16("0101001xxxxxxxxx", InstName.Strh, InstEmit32.Strh, OpCodeT16MemReg.Create);
@ -1051,44 +1075,71 @@ namespace ARMeilleure.Decoders
SetT32("11110x01010xxxxx0xxxxxxxxxxxxxxx", InstName.Adc, InstEmit32.Adc, OpCodeT32AluImm.Create);
SetT32("11101011000<xxxx0xxx<<<<xxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT32AluRsImm.Create);
SetT32("11110x01000<xxxx0xxx<<<<xxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT32AluImm.Create);
SetT32("11110x100000xxxx0xxxxxxxxxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT32AluImm12.Create);
SetT32("11101010000<xxxx0xxx<<<<xxxxxxxx", InstName.And, InstEmit32.And, OpCodeT32AluRsImm.Create);
SetT32("11110x00000<xxxx0xxx<<<<xxxxxxxx", InstName.And, InstEmit32.And, OpCodeT32AluImm.Create);
SetT32("11110x<<<xxxxxxx10x0xxxxxxxxxxxx", InstName.B, InstEmit32.B, OpCodeT32BImm20.Create);
SetT32("11110xxxxxxxxxxx10x1xxxxxxxxxxxx", InstName.B, InstEmit32.B, OpCodeT32BImm24.Create);
SetT32("11110011011011110xxxxxxxxx0xxxxx", InstName.Bfc, InstEmit32.Bfc, OpCodeT32AluBf.Create);
SetT32("111100110110<<<<0xxxxxxxxx0xxxxx", InstName.Bfi, InstEmit32.Bfi, OpCodeT32AluBf.Create);
SetT32("11101010001xxxxx0xxxxxxxxxxxxxxx", InstName.Bic, InstEmit32.Bic, OpCodeT32AluRsImm.Create);
SetT32("11110x00001xxxxx0xxxxxxxxxxxxxxx", InstName.Bic, InstEmit32.Bic, OpCodeT32AluImm.Create);
SetT32("11110xxxxxxxxxxx11x1xxxxxxxxxxxx", InstName.Bl, InstEmit32.Bl, OpCodeT32BImm24.Create);
SetT32("11110xxxxxxxxxxx11x0xxxxxxxxxxx0", InstName.Blx, InstEmit32.Blx, OpCodeT32BImm24.Create);
SetT32("111110101011xxxx1111xxxx1000xxxx", InstName.Clz, InstEmit32.Clz, OpCodeT32AluReg.Create);
SetT32("111010110001xxxx0xxx1111xxxxxxxx", InstName.Cmn, InstEmit32.Cmn, OpCodeT32AluRsImm.Create);
SetT32("11110x010001xxxx0xxx1111xxxxxxxx", InstName.Cmn, InstEmit32.Cmn, OpCodeT32AluImm.Create);
SetT32("111010111011xxxx0xxx1111xxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT32AluRsImm.Create);
SetT32("11110x011011xxxx0xxx1111xxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT32AluImm.Create);
SetT32("11101010100<xxxx0xxx<<<<xxxxxxxx", InstName.Eor, InstEmit32.Eor, OpCodeT32AluRsImm.Create);
SetT32("11110x00100<xxxx0xxx<<<<xxxxxxxx", InstName.Eor, InstEmit32.Eor, OpCodeT32AluImm.Create);
SetT32("111110000101xxxx<<<<10x1xxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm8.Create);
SetT32("111110000101xxxx<<<<1100xxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm8.Create);
SetT32("111110000101xxxx<<<<11x1xxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm8.Create);
SetT32("111010001101xxxxxxxx111110101111", InstName.Lda, InstEmit32.Lda, OpCodeT32MemLdEx.Create);
SetT32("111010001101xxxxxxxx111110001111", InstName.Ldab, InstEmit32.Ldab, OpCodeT32MemLdEx.Create);
SetT32("111010001101xxxxxxxx111111101111", InstName.Ldaex, InstEmit32.Ldaex, OpCodeT32MemLdEx.Create);
SetT32("111010001101xxxxxxxx111111001111", InstName.Ldaexb, InstEmit32.Ldaexb, OpCodeT32MemLdEx.Create);
SetT32("111010001101xxxxxxxxxxxx11111111", InstName.Ldaexd, InstEmit32.Ldaexd, OpCodeT32MemLdEx.Create);
SetT32("111010001101xxxxxxxx111111011111", InstName.Ldaexh, InstEmit32.Ldaexh, OpCodeT32MemLdEx.Create);
SetT32("111010001101xxxxxxxx111110011111", InstName.Ldah, InstEmit32.Ldah, OpCodeT32MemLdEx.Create);
SetT32("1110100010x1xxxxxxxxxxxxxxxxxxxx", InstName.Ldm, InstEmit32.Ldm, OpCodeT32MemMult.Create);
SetT32("1110100100x1xxxxxxxxxxxxxxxxxxxx", InstName.Ldm, InstEmit32.Ldm, OpCodeT32MemMult.Create);
SetT32("111110000101xxxxxxxx10x1xxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm8.Create);
SetT32("111110000101xxxxxxxx1100xxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm8.Create);
SetT32("111110000101xxxxxxxx11x1xxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm8.Create);
SetT32("111110001101xxxxxxxxxxxxxxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm12.Create);
SetT32("111110000001xxxx<<<<10x1xxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm8.Create);
SetT32("111110000001xxxx<<<<1100xxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm8.Create);
SetT32("111110000001xxxx<<<<11x1xxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm8.Create);
SetT32("111110000101<<<<xxxx000000xxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemRsImm.Create);
SetT32("111110000001xxxxxxxx10x1xxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm8.Create);
SetT32("111110000001xxxxxxxx1100xxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm8.Create);
SetT32("111110000001xxxxxxxx11x1xxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm8.Create);
SetT32("111110001001xxxxxxxxxxxxxxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm12.Create);
SetT32("111110000011xxxx<<<<10x1xxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm8.Create);
SetT32("111110000011xxxx<<<<1100xxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm8.Create);
SetT32("111110000011xxxx<<<<11x1xxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm8.Create);
SetT32("111110000001xxxx<<<<000000xxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemRsImm.Create);
SetT32("11101000x111<<<<xxxxxxxxxxxxxxxx", InstName.Ldrd, InstEmit32.Ldrd, OpCodeT32MemImm8D.Create);
SetT32("11101001x1x1<<<<xxxxxxxxxxxxxxxx", InstName.Ldrd, InstEmit32.Ldrd, OpCodeT32MemImm8D.Create);
SetT32("111110000011xxxxxxxx10x1xxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm8.Create);
SetT32("111110000011xxxxxxxx1100xxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm8.Create);
SetT32("111110000011xxxxxxxx11x1xxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm8.Create);
SetT32("111110001011xxxxxxxxxxxxxxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm12.Create);
SetT32("111110010001xxxx<<<<10x1xxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm8.Create);
SetT32("111110010001xxxx<<<<1100xxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm8.Create);
SetT32("111110010001xxxx<<<<11x1xxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm8.Create);
SetT32("111110000011xxxx<<<<000000xxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemRsImm.Create);
SetT32("111110010001xxxxxxxx10x1xxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm8.Create);
SetT32("111110010001xxxxxxxx1100xxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm8.Create);
SetT32("111110010001xxxxxxxx11x1xxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm8.Create);
SetT32("111110011001xxxxxxxxxxxxxxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm12.Create);
SetT32("111110010011xxxx<<<<10x1xxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm8.Create);
SetT32("111110010011xxxx<<<<1100xxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm8.Create);
SetT32("111110010011xxxx<<<<11x1xxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm8.Create);
SetT32("111110010001xxxx<<<<000000xxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemRsImm.Create);
SetT32("111110010011xxxxxxxx10x1xxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm8.Create);
SetT32("111110010011xxxxxxxx1100xxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm8.Create);
SetT32("111110010011xxxxxxxx11x1xxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm8.Create);
SetT32("111110011011xxxxxxxxxxxxxxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm12.Create);
SetT32("111110010011xxxx<<<<000000xxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemRsImm.Create);
SetT32("111110110000xxxx<<<<xxxx0000xxxx", InstName.Mla, InstEmit32.Mla, OpCodeT32AluMla.Create);
SetT32("111110110000xxxxxxxxxxxx0001xxxx", InstName.Mls, InstEmit32.Mls, OpCodeT32AluMla.Create);
SetT32("11101010010x11110xxxxxxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT32AluRsImm.Create);
SetT32("111110100xxxxxxx1111xxxx0000xxxx", InstName.Mov, InstEmit32.Mov, OpCodeT32ShiftReg.Create);
SetT32("11110x00010x11110xxxxxxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT32AluImm.Create);
SetT32("11110x100100xxxx0xxxxxxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT32MovImm16.Create);
SetT32("11110x101100xxxx0xxxxxxxxxxxxxxx", InstName.Movt, InstEmit32.Movt, OpCodeT32MovImm16.Create);
SetT32("111110110000xxxx1111xxxx0000xxxx", InstName.Mul, InstEmit32.Mul, OpCodeT32AluMla.Create);
SetT32("11101010011x11110xxxxxxxxxxxxxxx", InstName.Mvn, InstEmit32.Mvn, OpCodeT32AluRsImm.Create);
SetT32("11110x00011x11110xxxxxxxxxxxxxxx", InstName.Mvn, InstEmit32.Mvn, OpCodeT32AluImm.Create);
SetT32("11110011101011111000000000000000", InstName.Nop, InstEmit32.Nop, OpCodeT32.Create);
SetT32("11101010011x<<<<0xxxxxxxxxxxxxxx", InstName.Orn, InstEmit32.Orn, OpCodeT32AluRsImm.Create);
SetT32("11110x00011x<<<<0xxxxxxxxxxxxxxx", InstName.Orn, InstEmit32.Orn, OpCodeT32AluImm.Create);
SetT32("11101010010x<<<<0xxxxxxxxxxxxxxx", InstName.Orr, InstEmit32.Orr, OpCodeT32AluRsImm.Create);
@ -1097,18 +1148,56 @@ namespace ARMeilleure.Decoders
SetT32("11110x01110xxxxx0xxxxxxxxxxxxxxx", InstName.Rsb, InstEmit32.Rsb, OpCodeT32AluImm.Create);
SetT32("11101011011xxxxx0xxxxxxxxxxxxxxx", InstName.Sbc, InstEmit32.Sbc, OpCodeT32AluRsImm.Create);
SetT32("11110x01011xxxxx0xxxxxxxxxxxxxxx", InstName.Sbc, InstEmit32.Sbc, OpCodeT32AluImm.Create);
SetT32("111100110100xxxx0xxxxxxxxx0xxxxx", InstName.Sbfx, InstEmit32.Sbfx, OpCodeT32AluBf.Create);
SetT32("111110111001xxxx1111xxxx1111xxxx", InstName.Sdiv, InstEmit32.Sdiv, OpCodeT32AluMla.Create);
SetT32("111110110001xxxx<<<<xxxx00xxxxxx", InstName.Smla__, InstEmit32.Smla__, OpCodeT32AluMla.Create);
SetT32("111110111100xxxxxxxxxxxx0000xxxx", InstName.Smlal, InstEmit32.Smlal, OpCodeT32AluUmull.Create);
SetT32("111110111100xxxxxxxxxxxx10xxxxxx", InstName.Smlal__, InstEmit32.Smlal__, OpCodeT32AluUmull.Create);
SetT32("111110110011xxxx<<<<xxxx000xxxxx", InstName.Smlaw_, InstEmit32.Smlaw_, OpCodeT32AluMla.Create);
SetT32("111110110101xxxx<<<<xxxx000xxxxx", InstName.Smmla, InstEmit32.Smmla, OpCodeT32AluMla.Create);
SetT32("111110110110xxxxxxxxxxxx000xxxxx", InstName.Smmls, InstEmit32.Smmls, OpCodeT32AluMla.Create);
SetT32("111110110001xxxx1111xxxx00xxxxxx", InstName.Smul__, InstEmit32.Smul__, OpCodeT32AluMla.Create);
SetT32("111110111000xxxxxxxxxxxx0000xxxx", InstName.Smull, InstEmit32.Smull, OpCodeT32AluUmull.Create);
SetT32("111110110011xxxx1111xxxx000xxxxx", InstName.Smulw_, InstEmit32.Smulw_, OpCodeT32AluMla.Create);
SetT32("111010001100xxxxxxxx111110101111", InstName.Stl, InstEmit32.Stl, OpCodeT32MemStEx.Create);
SetT32("111010001100xxxxxxxx111110001111", InstName.Stlb, InstEmit32.Stlb, OpCodeT32MemStEx.Create);
SetT32("111010001100xxxxxxxx11111110xxxx", InstName.Stlex, InstEmit32.Stlex, OpCodeT32MemStEx.Create);
SetT32("111010001100xxxxxxxx11111100xxxx", InstName.Stlexb, InstEmit32.Stlexb, OpCodeT32MemStEx.Create);
SetT32("111010001100xxxxxxxxxxxx1111xxxx", InstName.Stlexd, InstEmit32.Stlexd, OpCodeT32MemStEx.Create);
SetT32("111010001100xxxxxxxx11111101xxxx", InstName.Stlexh, InstEmit32.Stlexh, OpCodeT32MemStEx.Create);
SetT32("111010001100xxxxxxxx111110011111", InstName.Stlh, InstEmit32.Stlh, OpCodeT32MemStEx.Create);
SetT32("1110100010x0xxxx0xxxxxxxxxxxxxxx", InstName.Stm, InstEmit32.Stm, OpCodeT32MemMult.Create);
SetT32("1110100100x0xxxx0xxxxxxxxxxxxxxx", InstName.Stm, InstEmit32.Stm, OpCodeT32MemMult.Create);
SetT32("111110000100xxxxxxxx1<<>xxxxxxxx", InstName.Str, InstEmit32.Str, OpCodeT32MemImm8.Create);
SetT32("111110001100xxxxxxxxxxxxxxxxxxxx", InstName.Str, InstEmit32.Str, OpCodeT32MemImm12.Create);
SetT32("111110000100<<<<xxxx000000xxxxxx", InstName.Str, InstEmit32.Str, OpCodeT32MemRsImm.Create);
SetT32("111110000000xxxxxxxx1<<>xxxxxxxx", InstName.Strb, InstEmit32.Strb, OpCodeT32MemImm8.Create);
SetT32("111110001000xxxxxxxxxxxxxxxxxxxx", InstName.Strb, InstEmit32.Strb, OpCodeT32MemImm12.Create);
SetT32("111110000000<<<<xxxx000000xxxxxx", InstName.Strb, InstEmit32.Strb, OpCodeT32MemRsImm.Create);
SetT32("11101000x110<<<<xxxxxxxxxxxxxxxx", InstName.Strd, InstEmit32.Strd, OpCodeT32MemImm8D.Create);
SetT32("11101001x1x0<<<<xxxxxxxxxxxxxxxx", InstName.Strd, InstEmit32.Strd, OpCodeT32MemImm8D.Create);
SetT32("111110000010xxxxxxxx1<<>xxxxxxxx", InstName.Strh, InstEmit32.Strh, OpCodeT32MemImm8.Create);
SetT32("111110001010xxxxxxxxxxxxxxxxxxxx", InstName.Strh, InstEmit32.Strh, OpCodeT32MemImm12.Create);
SetT32("111110000010<<<<xxxx000000xxxxxx", InstName.Strh, InstEmit32.Strh, OpCodeT32MemRsImm.Create);
SetT32("11101011101<xxxx0xxx<<<<xxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT32AluRsImm.Create);
SetT32("11110x01101<xxxx0xxx<<<<xxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT32AluImm.Create);
SetT32("111110100100xxxx1111xxxx10xxxxxx", InstName.Sxtb, InstEmit32.Sxtb, OpCodeT32AluUx.Create);
SetT32("111110100010xxxx1111xxxx10xxxxxx", InstName.Sxtb16, InstEmit32.Sxtb16, OpCodeT32AluUx.Create);
SetT32("111110100000xxxx1111xxxx10xxxxxx", InstName.Sxth, InstEmit32.Sxth, OpCodeT32AluUx.Create);
SetT32("111010001101xxxx111100000000xxxx", InstName.Tbb, InstEmit32.Tbb, OpCodeT32Tb.Create);
SetT32("111010001101xxxx111100000001xxxx", InstName.Tbh, InstEmit32.Tbh, OpCodeT32Tb.Create);
SetT32("111010101001xxxx0xxx1111xxxxxxxx", InstName.Teq, InstEmit32.Teq, OpCodeT32AluRsImm.Create);
SetT32("11110x001001xxxx0xxx1111xxxxxxxx", InstName.Teq, InstEmit32.Teq, OpCodeT32AluImm.Create);
SetT32("111010100001xxxx0xxx1111xxxxxxxx", InstName.Tst, InstEmit32.Tst, OpCodeT32AluRsImm.Create);
SetT32("11110x000001xxxx0xxx1111xxxxxxxx", InstName.Tst, InstEmit32.Tst, OpCodeT32AluImm.Create);
SetT32("111100111100xxxx0xxxxxxxxx0xxxxx", InstName.Ubfx, InstEmit32.Ubfx, OpCodeT32AluBf.Create);
SetT32("111110111011xxxx1111xxxx1111xxxx", InstName.Udiv, InstEmit32.Udiv, OpCodeT32AluMla.Create);
SetT32("111110111110xxxxxxxxxxxx0110xxxx", InstName.Umaal, InstEmit32.Umaal, OpCodeT32AluUmull.Create);
SetT32("111110111110xxxxxxxxxxxx0000xxxx", InstName.Umlal, InstEmit32.Umlal, OpCodeT32AluUmull.Create);
SetT32("111110111010xxxxxxxxxxxx0000xxxx", InstName.Umull, InstEmit32.Umull, OpCodeT32AluUmull.Create);
SetT32("111110100101xxxx1111xxxx10xxxxxx", InstName.Uxtb, InstEmit32.Uxtb, OpCodeT32AluUx.Create);
SetT32("111110100011xxxx1111xxxx10xxxxxx", InstName.Uxtb16, InstEmit32.Uxtb16, OpCodeT32AluUx.Create);
SetT32("111110100001xxxx1111xxxx10xxxxxx", InstName.Uxth, InstEmit32.Uxth, OpCodeT32AluUx.Create);
#endregion
FillFastLookupTable(InstA32FastLookup, AllInstA32, ToFastLookupIndexA);
@ -1165,6 +1254,18 @@ namespace ARMeilleure.Decoders
Set(reversedEncoding, AllInstT32, new InstDescriptor(name, emitter), reversedMakeOp);
}
private static void SetVfp(string encoding, InstName name, InstEmitter emitter, MakeOp makeOpA32, MakeOp makeOpT32)
{
SetA32(encoding, name, emitter, makeOpA32);
string thumbEncoding = encoding;
if (thumbEncoding.StartsWith("<<<<"))
{
thumbEncoding = "1110" + thumbEncoding.Substring(4);
}
SetT32(thumbEncoding, name, emitter, makeOpT32);
}
private static void SetA64(string encoding, InstName name, InstEmitter emitter, MakeOp makeOp)
{
Set(encoding, AllInstA64, new InstDescriptor(name, emitter), makeOp);

View File

@ -1,51 +0,0 @@
using System.Diagnostics.Tracing;
namespace ARMeilleure.Diagnostics.EventSources
{
[EventSource(Name = "ARMeilleure")]
class AddressTableEventSource : EventSource
{
public static readonly AddressTableEventSource Log = new();
private ulong _size;
private ulong _leafSize;
private PollingCounter _sizeCounter;
private PollingCounter _leafSizeCounter;
public AddressTableEventSource()
{
_sizeCounter = new PollingCounter("addr-tab-alloc", this, () => _size / 1024d / 1024d)
{
DisplayName = "AddressTable Total Bytes Allocated",
DisplayUnits = "MB"
};
_leafSizeCounter = new PollingCounter("addr-tab-leaf-alloc", this, () => _leafSize / 1024d / 1024d)
{
DisplayName = "AddressTable Total Leaf Bytes Allocated",
DisplayUnits = "MB"
};
}
public void Allocated(int bytes, bool leaf)
{
_size += (uint)bytes;
if (leaf)
{
_leafSize += (uint)bytes;
}
}
protected override void Dispose(bool disposing)
{
_leafSizeCounter.Dispose();
_leafSizeCounter = null;
_sizeCounter.Dispose();
_sizeCounter = null;
base.Dispose(disposing);
}
}
}

View File

@ -0,0 +1,67 @@
using System.Diagnostics.Tracing;
using System.Threading;
namespace ARMeilleure.Diagnostics
{
[EventSource(Name = "ARMeilleure")]
class TranslatorEventSource : EventSource
{
public static readonly TranslatorEventSource Log = new();
private int _rejitQueue;
private ulong _funcTabSize;
private ulong _funcTabLeafSize;
private PollingCounter _rejitQueueCounter;
private PollingCounter _funcTabSizeCounter;
private PollingCounter _funcTabLeafSizeCounter;
public TranslatorEventSource()
{
_rejitQueueCounter = new PollingCounter("rejit-queue-length", this, () => _rejitQueue)
{
DisplayName = "Rejit Queue Length"
};
_funcTabSizeCounter = new PollingCounter("addr-tab-alloc", this, () => _funcTabSize / 1024d / 1024d)
{
DisplayName = "AddressTable Total Bytes Allocated",
DisplayUnits = "MB"
};
_funcTabLeafSizeCounter = new PollingCounter("addr-tab-leaf-alloc", this, () => _funcTabLeafSize / 1024d / 1024d)
{
DisplayName = "AddressTable Total Leaf Bytes Allocated",
DisplayUnits = "MB"
};
}
public void RejitQueueAdd(int count)
{
Interlocked.Add(ref _rejitQueue, count);
}
public void AddressTableAllocated(int bytes, bool leaf)
{
_funcTabSize += (uint)bytes;
if (leaf)
{
_funcTabLeafSize += (uint)bytes;
}
}
protected override void Dispose(bool disposing)
{
_rejitQueueCounter.Dispose();
_rejitQueueCounter = null;
_funcTabLeafSizeCounter.Dispose();
_funcTabLeafSizeCounter = null;
_funcTabSizeCounter.Dispose();
_funcTabSizeCounter = null;
base.Dispose(disposing);
}
}
}

View File

@ -74,7 +74,7 @@ namespace ARMeilleure.Instructions
public static void Bfc(ArmEmitterContext context)
{
OpCode32AluBf op = (OpCode32AluBf)context.CurrOp;
IOpCode32AluBf op = (IOpCode32AluBf)context.CurrOp;
Operand d = GetIntA32(context, op.Rd);
Operand res = context.BitwiseAnd(d, Const(~op.DestMask));
@ -84,7 +84,7 @@ namespace ARMeilleure.Instructions
public static void Bfi(ArmEmitterContext context)
{
OpCode32AluBf op = (OpCode32AluBf)context.CurrOp;
IOpCode32AluBf op = (IOpCode32AluBf)context.CurrOp;
Operand n = GetIntA32(context, op.Rn);
Operand d = GetIntA32(context, op.Rd);
@ -185,7 +185,7 @@ namespace ARMeilleure.Instructions
public static void Movt(ArmEmitterContext context)
{
OpCode32AluImm16 op = (OpCode32AluImm16)context.CurrOp;
IOpCode32AluImm16 op = (IOpCode32AluImm16)context.CurrOp;
Operand d = GetIntA32(context, op.Rd);
Operand imm = Const(op.Immediate << 16); // Immeditate value as top halfword.
@ -389,7 +389,7 @@ namespace ARMeilleure.Instructions
public static void Sbfx(ArmEmitterContext context)
{
OpCode32AluBf op = (OpCode32AluBf)context.CurrOp;
IOpCode32AluBf op = (IOpCode32AluBf)context.CurrOp;
var msb = op.Lsb + op.Msb; // For this instruction, the msb is actually a width.
@ -484,7 +484,7 @@ namespace ARMeilleure.Instructions
public static void Ubfx(ArmEmitterContext context)
{
OpCode32AluBf op = (OpCode32AluBf)context.CurrOp;
IOpCode32AluBf op = (IOpCode32AluBf)context.CurrOp;
var msb = op.Lsb + op.Msb; // For this instruction, the msb is actually a width.

View File

@ -128,7 +128,7 @@ namespace ARMeilleure.Instructions
{
Debug.Assert(value.Type == OperandType.I32);
if (((OpCode32)context.CurrOp).IsThumb())
if (((OpCode32)context.CurrOp).IsThumb)
{
bool isReturn = IsA32Return(context);
if (!isReturn)
@ -205,7 +205,7 @@ namespace ARMeilleure.Instructions
return Const(op.Immediate);
}
case OpCode32AluImm16 op: return Const(op.Immediate);
case IOpCode32AluImm16 op: return Const(op.Immediate);
case IOpCode32AluRsImm op: return GetMShiftedByImmediate(context, op, setCarry);
case IOpCode32AluRsReg op: return GetMShiftedByReg(context, op, setCarry);

View File

@ -1,7 +1,5 @@
using ARMeilleure.Decoders;
using ARMeilleure.Translation;
using static ARMeilleure.Instructions.InstEmitFlowHelper;
using static ARMeilleure.IntermediateRepresentation.Operand.Factory;
namespace ARMeilleure.Instructions

View File

@ -34,7 +34,7 @@ namespace ARMeilleure.Instructions
uint pc = op.GetPc();
bool isThumb = ((OpCode32)context.CurrOp).IsThumb();
bool isThumb = ((OpCode32)context.CurrOp).IsThumb;
uint currentPc = isThumb
? pc | 1
@ -61,7 +61,7 @@ namespace ARMeilleure.Instructions
Operand addr = context.Copy(GetIntA32(context, op.Rm));
Operand bitOne = context.BitwiseAnd(addr, Const(1));
bool isThumb = ((OpCode32)context.CurrOp).IsThumb();
bool isThumb = ((OpCode32)context.CurrOp).IsThumb;
uint currentPc = isThumb
? (pc - 2) | 1
@ -88,7 +88,7 @@ namespace ARMeilleure.Instructions
{
OpCodeT16BImmCmp op = (OpCodeT16BImmCmp)context.CurrOp;
Operand value = GetIntOrZR(context, op.Rn);
Operand value = GetIntA32(context, op.Rn);
Operand lblTarget = context.GetLabel((ulong)op.Immediate);
if (onNotZero)
@ -107,5 +107,30 @@ namespace ARMeilleure.Instructions
context.SetIfThenBlockState(op.IfThenBlockConds);
}
public static void Tbb(ArmEmitterContext context) => EmitTb(context, halfword: false);
public static void Tbh(ArmEmitterContext context) => EmitTb(context, halfword: true);
private static void EmitTb(ArmEmitterContext context, bool halfword)
{
OpCodeT32Tb op = (OpCodeT32Tb)context.CurrOp;
Operand halfwords;
if (halfword)
{
Operand address = context.Add(GetIntA32(context, op.Rn), context.ShiftLeft(GetIntA32(context, op.Rm), Const(1)));
halfwords = InstEmitMemoryHelper.EmitReadInt(context, address, 1);
}
else
{
Operand address = context.Add(GetIntA32(context, op.Rn), GetIntA32(context, op.Rm));
halfwords = InstEmitMemoryHelper.EmitReadIntAligned(context, address, 0);
}
Operand targetAddress = context.Add(Const((int)op.GetPc()), context.ShiftLeft(halfwords, Const(1)));
EmitVirtualJump(context, targetAddress, isReturn: false);
}
}
}

View File

@ -204,15 +204,15 @@ namespace ARMeilleure.Instructions
context.BranchIfTrue(lblBigEndian, GetFlag(PState.EFlag));
Load(op.Rt, 0, WordSizeLog2);
Load(op.Rt | 1, 4, WordSizeLog2);
Load(op.Rt, 0, WordSizeLog2);
Load(op.Rt2, 4, WordSizeLog2);
context.Branch(lblEnd);
context.MarkLabel(lblBigEndian);
Load(op.Rt | 1, 0, WordSizeLog2);
Load(op.Rt, 4, WordSizeLog2);
Load(op.Rt2, 0, WordSizeLog2);
Load(op.Rt, 4, WordSizeLog2);
context.MarkLabel(lblEnd);
}
@ -237,15 +237,15 @@ namespace ARMeilleure.Instructions
context.BranchIfTrue(lblBigEndian, GetFlag(PState.EFlag));
Store(op.Rt, 0, WordSizeLog2);
Store(op.Rt | 1, 4, WordSizeLog2);
Store(op.Rt, 0, WordSizeLog2);
Store(op.Rt2, 4, WordSizeLog2);
context.Branch(lblEnd);
context.MarkLabel(lblBigEndian);
Store(op.Rt | 1, 0, WordSizeLog2);
Store(op.Rt, 4, WordSizeLog2);
Store(op.Rt2, 0, WordSizeLog2);
Store(op.Rt, 4, WordSizeLog2);
context.MarkLabel(lblEnd);
}

View File

@ -172,13 +172,13 @@ namespace ARMeilleure.Instructions
context.BranchIfTrue(lblBigEndian, GetFlag(PState.EFlag));
SetIntA32(context, op.Rt, valueLow);
SetIntA32(context, op.Rt | 1, valueHigh);
SetIntA32(context, op.Rt2, valueHigh);
context.Branch(lblEnd);
context.MarkLabel(lblBigEndian);
SetIntA32(context, op.Rt | 1, valueLow);
SetIntA32(context, op.Rt2, valueLow);
SetIntA32(context, op.Rt, valueHigh);
context.MarkLabel(lblEnd);
@ -195,7 +195,7 @@ namespace ARMeilleure.Instructions
// Split the result into 2 words (based on endianness)
Operand lo = context.ZeroExtend32(OperandType.I64, GetIntA32(context, op.Rt));
Operand hi = context.ZeroExtend32(OperandType.I64, GetIntA32(context, op.Rt | 1));
Operand hi = context.ZeroExtend32(OperandType.I64, GetIntA32(context, op.Rt2));
Operand lblBigEndian = Label();
Operand lblEnd = Label();

View File

@ -123,6 +123,41 @@ namespace ARMeilleure.Instructions
context.CurrOp is OpCodeSimdMemSs);
}
public static Operand EmitReadInt(ArmEmitterContext context, Operand address, int size)
{
Operand temp = context.AllocateLocal(size == 3 ? OperandType.I64 : OperandType.I32);
Operand lblSlowPath = Label();
Operand lblEnd = Label();
Operand physAddr = EmitPtPointerLoad(context, address, lblSlowPath, write: false, size);
Operand value = default;
switch (size)
{
case 0: value = context.Load8 (physAddr); break;
case 1: value = context.Load16(physAddr); break;
case 2: value = context.Load (OperandType.I32, physAddr); break;
case 3: value = context.Load (OperandType.I64, physAddr); break;
}
context.Copy(temp, value);
if (!context.Memory.Type.IsHostMapped())
{
context.Branch(lblEnd);
context.MarkLabel(lblSlowPath, BasicBlockFrequency.Cold);
context.Copy(temp, EmitReadIntFallback(context, address, size));
context.MarkLabel(lblEnd);
}
return temp;
}
private static void EmitReadInt(ArmEmitterContext context, Operand address, int rt, int size)
{
Operand lblSlowPath = Label();
@ -419,6 +454,11 @@ namespace ARMeilleure.Instructions
}
private static void EmitReadIntFallback(ArmEmitterContext context, Operand address, int rt, int size)
{
SetInt(context, rt, EmitReadIntFallback(context, address, size));
}
private static Operand EmitReadIntFallback(ArmEmitterContext context, Operand address, int size)
{
MethodInfo info = null;
@ -430,7 +470,7 @@ namespace ARMeilleure.Instructions
case 3: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.ReadUInt64)); break;
}
SetInt(context, rt, context.Call(info, address));
return context.Call(info, address);
}
private static void EmitReadVectorFallback(
@ -547,7 +587,7 @@ namespace ARMeilleure.Instructions
{
switch (context.CurrOp)
{
case OpCode32MemRsImm op: return GetMShiftedByImmediate(context, op, setCarry);
case IOpCode32MemRsImm op: return GetMShiftedByImmediate(context, op, setCarry);
case IOpCode32MemReg op: return GetIntA32(context, op.Rm);
@ -564,7 +604,7 @@ namespace ARMeilleure.Instructions
return new InvalidOperationException($"Invalid OpCode type \"{opCode?.GetType().Name ?? "null"}\".");
}
public static Operand GetMShiftedByImmediate(ArmEmitterContext context, OpCode32MemRsImm op, bool setCarry)
public static Operand GetMShiftedByImmediate(ArmEmitterContext context, IOpCode32MemRsImm op, bool setCarry)
{
Operand m = GetIntA32(context, op.Rm);

View File

@ -25,7 +25,7 @@ namespace ARMeilleure.Instructions
public static void Mla(ArmEmitterContext context)
{
OpCode32AluMla op = (OpCode32AluMla)context.CurrOp;
IOpCode32AluMla op = (IOpCode32AluMla)context.CurrOp;
Operand n = GetAluN(context);
Operand m = GetAluM(context);
@ -43,7 +43,7 @@ namespace ARMeilleure.Instructions
public static void Mls(ArmEmitterContext context)
{
OpCode32AluMla op = (OpCode32AluMla)context.CurrOp;
IOpCode32AluMla op = (IOpCode32AluMla)context.CurrOp;
Operand n = GetAluN(context);
Operand m = GetAluM(context);
@ -71,7 +71,7 @@ namespace ARMeilleure.Instructions
private static void EmitSmmul(ArmEmitterContext context, MullFlags flags)
{
OpCode32AluMla op = (OpCode32AluMla)context.CurrOp;
IOpCode32AluMla op = (IOpCode32AluMla)context.CurrOp;
Operand n = context.SignExtend32(OperandType.I64, GetIntA32(context, op.Rn));
Operand m = context.SignExtend32(OperandType.I64, GetIntA32(context, op.Rm));
@ -99,7 +99,7 @@ namespace ARMeilleure.Instructions
public static void Smla__(ArmEmitterContext context)
{
OpCode32AluMla op = (OpCode32AluMla)context.CurrOp;
IOpCode32AluMla op = (IOpCode32AluMla)context.CurrOp;
Operand n = GetIntA32(context, op.Rn);
Operand m = GetIntA32(context, op.Rm);
@ -142,7 +142,7 @@ namespace ARMeilleure.Instructions
public static void Smlal__(ArmEmitterContext context)
{
OpCode32AluUmull op = (OpCode32AluUmull)context.CurrOp;
IOpCode32AluUmull op = (IOpCode32AluUmull)context.CurrOp;
Operand n = GetIntA32(context, op.Rn);
Operand m = GetIntA32(context, op.Rm);
@ -180,7 +180,7 @@ namespace ARMeilleure.Instructions
public static void Smlaw_(ArmEmitterContext context)
{
OpCode32AluMla op = (OpCode32AluMla)context.CurrOp;
IOpCode32AluMla op = (IOpCode32AluMla)context.CurrOp;
Operand n = GetIntA32(context, op.Rn);
Operand m = GetIntA32(context, op.Rm);
@ -210,7 +210,7 @@ namespace ARMeilleure.Instructions
public static void Smul__(ArmEmitterContext context)
{
OpCode32AluMla op = (OpCode32AluMla)context.CurrOp;
IOpCode32AluMla op = (IOpCode32AluMla)context.CurrOp;
Operand n = GetIntA32(context, op.Rn);
Operand m = GetIntA32(context, op.Rm);
@ -240,7 +240,7 @@ namespace ARMeilleure.Instructions
public static void Smull(ArmEmitterContext context)
{
OpCode32AluUmull op = (OpCode32AluUmull)context.CurrOp;
IOpCode32AluUmull op = (IOpCode32AluUmull)context.CurrOp;
Operand n = context.SignExtend32(OperandType.I64, GetIntA32(context, op.Rn));
Operand m = context.SignExtend32(OperandType.I64, GetIntA32(context, op.Rm));
@ -261,7 +261,7 @@ namespace ARMeilleure.Instructions
public static void Smulw_(ArmEmitterContext context)
{
OpCode32AluMla op = (OpCode32AluMla)context.CurrOp;
IOpCode32AluMla op = (IOpCode32AluMla)context.CurrOp;
Operand n = GetIntA32(context, op.Rn);
Operand m = GetIntA32(context, op.Rm);
@ -285,7 +285,7 @@ namespace ARMeilleure.Instructions
public static void Umaal(ArmEmitterContext context)
{
OpCode32AluUmull op = (OpCode32AluUmull)context.CurrOp;
IOpCode32AluUmull op = (IOpCode32AluUmull)context.CurrOp;
Operand n = context.ZeroExtend32(OperandType.I64, GetIntA32(context, op.Rn));
Operand m = context.ZeroExtend32(OperandType.I64, GetIntA32(context, op.Rm));
@ -310,7 +310,7 @@ namespace ARMeilleure.Instructions
public static void Umull(ArmEmitterContext context)
{
OpCode32AluUmull op = (OpCode32AluUmull)context.CurrOp;
IOpCode32AluUmull op = (IOpCode32AluUmull)context.CurrOp;
Operand n = context.ZeroExtend32(OperandType.I64, GetIntA32(context, op.Rn));
Operand m = context.ZeroExtend32(OperandType.I64, GetIntA32(context, op.Rm));
@ -331,7 +331,7 @@ namespace ARMeilleure.Instructions
private static void EmitMlal(ArmEmitterContext context, bool signed)
{
OpCode32AluUmull op = (OpCode32AluUmull)context.CurrOp;
IOpCode32AluUmull op = (IOpCode32AluUmull)context.CurrOp;
Operand n = GetIntA32(context, op.Rn);
Operand m = GetIntA32(context, op.Rm);

View File

@ -2,8 +2,6 @@
using ARMeilleure.IntermediateRepresentation;
using ARMeilleure.Translation;
using System;
using System.Diagnostics;
using static ARMeilleure.Instructions.InstEmitFlowHelper;
using static ARMeilleure.Instructions.InstEmitHelper;
using static ARMeilleure.Instructions.InstEmitSimdHelper;
@ -779,6 +777,13 @@ namespace ARMeilleure.Instructions
}
}
public static void Vmlal_I(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
EmitVectorTernaryLongOpI32(context, (d, n, m) => context.Add(d, context.Multiply(n, m)), !op.U);
}
public static void Vmls_S(ArmEmitterContext context)
{
if (Optimizations.FastFP && Optimizations.UseSse2)
@ -994,6 +999,13 @@ namespace ARMeilleure.Instructions
}
}
public static void Vpaddl(ArmEmitterContext context)
{
OpCode32Simd op = (OpCode32Simd)context.CurrOp;
EmitVectorPairwiseLongOpI32(context, (op1, op2) => context.Add(op1, op2), (op.Opc & 1) == 0);
}
public static void Vpmax_V(ArmEmitterContext context)
{
if (Optimizations.FastFP && Optimizations.UseSse2)
@ -1016,7 +1028,7 @@ namespace ARMeilleure.Instructions
}
else
{
EmitVectorPairwiseOpI32(context, (op1, op2) =>
EmitVectorPairwiseOpI32(context, (op1, op2) =>
{
Operand greater = op.U ? context.ICompareGreaterUI(op1, op2) : context.ICompareGreater(op1, op2);
return context.ConditionalSelect(greater, op1, op2);
@ -1054,6 +1066,62 @@ namespace ARMeilleure.Instructions
}
}
public static void Vqadd(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
EmitSaturatingAddSubBinaryOp(context, add: true, !op.U);
}
public static void Vqdmulh(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
int eSize = 8 << op.Size;
EmitVectorBinaryOpI32(context, (op1, op2) =>
{
if (op.Size == 2)
{
op1 = context.SignExtend32(OperandType.I64, op1);
op2 = context.SignExtend32(OperandType.I64, op2);
}
Operand res = context.Multiply(op1, op2);
res = context.ShiftRightSI(res, Const(eSize - 1));
res = EmitSatQ(context, res, eSize, signedSrc: true, signedDst: true);
if (op.Size == 2)
{
res = context.ConvertI64ToI32(res);
}
return res;
}, signed: true);
}
public static void Vqmovn(ArmEmitterContext context)
{
OpCode32SimdMovn op = (OpCode32SimdMovn)context.CurrOp;
bool signed = !op.Q;
EmitVectorUnaryNarrowOp32(context, (op1) => EmitSatQ(context, op1, 8 << op.Size, signed, signed), signed);
}
public static void Vqmovun(ArmEmitterContext context)
{
OpCode32SimdMovn op = (OpCode32SimdMovn)context.CurrOp;
EmitVectorUnaryNarrowOp32(context, (op1) => EmitSatQ(context, op1, 8 << op.Size, signedSrc: true, signedDst: false), signed: true);
}
public static void Vqsub(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
EmitSaturatingAddSubBinaryOp(context, add: false, !op.U);
}
public static void Vrev(ArmEmitterContext context)
{
OpCode32SimdRev op = (OpCode32SimdRev)context.CurrOp;
@ -1204,6 +1272,30 @@ namespace ARMeilleure.Instructions
}
}
public static void Vrhadd(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
EmitVectorBinaryOpI32(context, (op1, op2) =>
{
if (op.Size == 2)
{
op1 = context.ZeroExtend32(OperandType.I64, op1);
op2 = context.ZeroExtend32(OperandType.I64, op2);
}
Operand res = context.Add(context.Add(op1, op2), Const(op1.Type, 1L));
res = context.ShiftRightUI(res, Const(1));
if (op.Size == 2)
{
res = context.ConvertI64ToI32(res);
}
return res;
}, !op.U);
}
public static void Vrsqrte(ArmEmitterContext context)
{
OpCode32SimdSqrte op = (OpCode32SimdSqrte)context.CurrOp;
@ -1351,6 +1443,13 @@ namespace ARMeilleure.Instructions
}
}
public static void Vsubl_I(ArmEmitterContext context)
{
OpCode32SimdRegLong op = (OpCode32SimdRegLong)context.CurrOp;
EmitVectorBinaryLongOpI32(context, (op1, op2) => context.Subtract(op1, op2), !op.U);
}
public static void Vsubw_I(ArmEmitterContext context)
{
OpCode32SimdRegWide op = (OpCode32SimdRegWide)context.CurrOp;
@ -1358,6 +1457,46 @@ namespace ARMeilleure.Instructions
EmitVectorBinaryWideOpI32(context, (op1, op2) => context.Subtract(op1, op2), !op.U);
}
private static void EmitSaturatingAddSubBinaryOp(ArmEmitterContext context, bool add, bool signed)
{
OpCode32Simd op = (OpCode32Simd)context.CurrOp;
EmitVectorBinaryOpI32(context, (ne, me) =>
{
if (op.Size <= 2)
{
if (op.Size == 2)
{
ne = signed ? context.SignExtend32(OperandType.I64, ne) : context.ZeroExtend32(OperandType.I64, ne);
me = signed ? context.SignExtend32(OperandType.I64, me) : context.ZeroExtend32(OperandType.I64, me);
}
Operand res = add ? context.Add(ne, me) : context.Subtract(ne, me);
res = EmitSatQ(context, res, 8 << op.Size, signedSrc: true, signed);
if (op.Size == 2)
{
res = context.ConvertI64ToI32(res);
}
return res;
}
else if (add) /* if (op.Size == 3) */
{
return signed
? EmitBinarySignedSatQAdd(context, ne, me)
: EmitBinaryUnsignedSatQAdd(context, ne, me);
}
else /* if (sub) */
{
return signed
? EmitBinarySignedSatQSub(context, ne, me)
: EmitBinaryUnsignedSatQSub(context, ne, me);
}
}, signed);
}
private static void EmitSse41MaxMinNumOpF32(ArmEmitterContext context, bool isMaxNum, bool scalar)
{
IOpCode32Simd op = (IOpCode32Simd)context.CurrOp;

View File

@ -323,6 +323,60 @@ namespace ARMeilleure.Instructions
}
}
// VRINTA (vector).
public static void Vrinta_V(ArmEmitterContext context)
{
EmitVectorUnaryOpF32(context, (m) => EmitRoundMathCall(context, MidpointRounding.AwayFromZero, m));
}
// VRINTM (vector).
public static void Vrintm_V(ArmEmitterContext context)
{
if (Optimizations.UseSse2)
{
EmitVectorUnaryOpSimd32(context, (m) =>
{
return context.AddIntrinsic(Intrinsic.X86Roundps, m, Const(X86GetRoundControl(FPRoundingMode.TowardsMinusInfinity)));
});
}
else
{
EmitVectorUnaryOpF32(context, (m) => EmitUnaryMathCall(context, nameof(Math.Floor), m));
}
}
// VRINTN (vector).
public static void Vrintn_V(ArmEmitterContext context)
{
if (Optimizations.UseSse2)
{
EmitVectorUnaryOpSimd32(context, (m) =>
{
return context.AddIntrinsic(Intrinsic.X86Roundps, m, Const(X86GetRoundControl(FPRoundingMode.ToNearest)));
});
}
else
{
EmitVectorUnaryOpF32(context, (m) => EmitRoundMathCall(context, MidpointRounding.ToEven, m));
}
}
// VRINTP (vector).
public static void Vrintp_V(ArmEmitterContext context)
{
if (Optimizations.UseSse2)
{
EmitVectorUnaryOpSimd32(context, (m) =>
{
return context.AddIntrinsic(Intrinsic.X86Roundps, m, Const(X86GetRoundControl(FPRoundingMode.TowardsPlusInfinity)));
});
}
else
{
EmitVectorUnaryOpF32(context, (m) => EmitUnaryMathCall(context, nameof(Math.Ceiling), m));
}
}
// VRINTZ (floating-point).
public static void Vrint_Z(ArmEmitterContext context)
{

View File

@ -100,7 +100,7 @@ namespace ARMeilleure.Instructions
Operand n = GetVec(op.Rn);
Operand m = GetVec(op.Rm);
Operand res = context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.HashLower)), d, n, m);
Operand res = InstEmitSimdHashHelper.EmitSha256h(context, d, n, m, part2: false);
context.Copy(GetVec(op.Rd), res);
}
@ -113,7 +113,7 @@ namespace ARMeilleure.Instructions
Operand n = GetVec(op.Rn);
Operand m = GetVec(op.Rm);
Operand res = context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.HashUpper)), d, n, m);
Operand res = InstEmitSimdHashHelper.EmitSha256h(context, n, d, m, part2: true);
context.Copy(GetVec(op.Rd), res);
}
@ -125,7 +125,7 @@ namespace ARMeilleure.Instructions
Operand d = GetVec(op.Rd);
Operand n = GetVec(op.Rn);
Operand res = context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Sha256SchedulePart1)), d, n);
Operand res = InstEmitSimdHashHelper.EmitSha256su0(context, d, n);
context.Copy(GetVec(op.Rd), res);
}
@ -138,7 +138,7 @@ namespace ARMeilleure.Instructions
Operand n = GetVec(op.Rn);
Operand m = GetVec(op.Rm);
Operand res = context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Sha256SchedulePart2)), d, n, m);
Operand res = InstEmitSimdHashHelper.EmitSha256su1(context, d, n, m);
context.Copy(GetVec(op.Rd), res);
}

View File

@ -0,0 +1,64 @@
using ARMeilleure.Decoders;
using ARMeilleure.IntermediateRepresentation;
using ARMeilleure.Translation;
using static ARMeilleure.Instructions.InstEmitHelper;
namespace ARMeilleure.Instructions
{
static partial class InstEmit32
{
#region "Sha256"
public static void Sha256h_V(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
Operand d = GetVecA32(op.Qd);
Operand n = GetVecA32(op.Qn);
Operand m = GetVecA32(op.Qm);
Operand res = InstEmitSimdHashHelper.EmitSha256h(context, d, n, m, part2: false);
context.Copy(GetVecA32(op.Qd), res);
}
public static void Sha256h2_V(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
Operand d = GetVecA32(op.Qd);
Operand n = GetVecA32(op.Qn);
Operand m = GetVecA32(op.Qm);
Operand res = InstEmitSimdHashHelper.EmitSha256h(context, n, d, m, part2: true);
context.Copy(GetVecA32(op.Qd), res);
}
public static void Sha256su0_V(ArmEmitterContext context)
{
OpCode32Simd op = (OpCode32Simd)context.CurrOp;
Operand d = GetVecA32(op.Qd);
Operand m = GetVecA32(op.Qm);
Operand res = InstEmitSimdHashHelper.EmitSha256su0(context, d, m);
context.Copy(GetVecA32(op.Qd), res);
}
public static void Sha256su1_V(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
Operand d = GetVecA32(op.Qd);
Operand n = GetVecA32(op.Qn);
Operand m = GetVecA32(op.Qm);
Operand res = InstEmitSimdHashHelper.EmitSha256su1(context, d, n, m);
context.Copy(GetVecA32(op.Qd), res);
}
#endregion
}
}

View File

@ -0,0 +1,56 @@
using ARMeilleure.IntermediateRepresentation;
using ARMeilleure.Translation;
using System;
using static ARMeilleure.IntermediateRepresentation.Operand.Factory;
namespace ARMeilleure.Instructions
{
static class InstEmitSimdHashHelper
{
public static Operand EmitSha256h(ArmEmitterContext context, Operand x, Operand y, Operand w, bool part2)
{
if (Optimizations.UseSha)
{
Operand src1 = context.AddIntrinsic(Intrinsic.X86Shufps, y, x, Const(0xbb));
Operand src2 = context.AddIntrinsic(Intrinsic.X86Shufps, y, x, Const(0x11));
Operand w2 = context.AddIntrinsic(Intrinsic.X86Punpckhqdq, w, w);
Operand round2 = context.AddIntrinsic(Intrinsic.X86Sha256Rnds2, src1, src2, w);
Operand round4 = context.AddIntrinsic(Intrinsic.X86Sha256Rnds2, src2, round2, w2);
Operand res = context.AddIntrinsic(Intrinsic.X86Shufps, round4, round2, Const(part2 ? 0x11 : 0xbb));
return res;
}
String method = part2 ? nameof(SoftFallback.HashUpper) : nameof(SoftFallback.HashLower);
return context.Call(typeof(SoftFallback).GetMethod(method), x, y, w);
}
public static Operand EmitSha256su0(ArmEmitterContext context, Operand x, Operand y)
{
if (Optimizations.UseSha)
{
return context.AddIntrinsic(Intrinsic.X86Sha256Msg1, x, y);
}
return context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Sha256SchedulePart1)), x, y);
}
public static Operand EmitSha256su1(ArmEmitterContext context, Operand x, Operand y, Operand z)
{
if (Optimizations.UseSha && Optimizations.UseSsse3)
{
Operand extr = context.AddIntrinsic(Intrinsic.X86Palignr, z, y, Const(4));
Operand tmp = context.AddIntrinsic(Intrinsic.X86Paddd, extr, x);
Operand res = context.AddIntrinsic(Intrinsic.X86Sha256Msg2, tmp, z);
return res;
}
return context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Sha256SchedulePart2)), x, y, z);
}
}
}

View File

@ -1281,7 +1281,7 @@ namespace ARMeilleure.Instructions
public static void EmitSseOrAvxExitFtzAndDazModesOpF(ArmEmitterContext context, Operand isTrue = default)
{
isTrue = isTrue == default
isTrue = isTrue == default
? context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetFpcrFz)))
: isTrue;
@ -1352,7 +1352,7 @@ namespace ARMeilleure.Instructions
if (op.Size <= 2)
{
de = EmitSatQ(context, emit(ne), op.Size, signedSrc: true, signedDst: true);
de = EmitSignedSrcSatQ(context, emit(ne), op.Size, signedDst: true);
}
else /* if (op.Size == 3) */
{
@ -1419,15 +1419,18 @@ namespace ARMeilleure.Instructions
{
Operand temp = add ? context.Add(ne, me) : context.Subtract(ne, me);
de = EmitSatQ(context, temp, op.Size, signedSrc: true, signedDst: signed);
de = EmitSignedSrcSatQ(context, temp, op.Size, signedDst: signed);
}
else if (add) /* if (op.Size == 3) */
else /* if (op.Size == 3) */
{
de = EmitBinarySatQAdd(context, ne, me, signed);
}
else /* if (sub) */
{
de = EmitBinarySatQSub(context, ne, me, signed);
if (add)
{
de = signed ? EmitBinarySignedSatQAdd(context, ne, me) : EmitBinaryUnsignedSatQAdd(context, ne, me);
}
else /* if (sub) */
{
de = signed ? EmitBinarySignedSatQSub(context, ne, me) : EmitBinaryUnsignedSatQSub(context, ne, me);
}
}
res = EmitVectorInsert(context, res, de, index, op.Size);
@ -1445,11 +1448,11 @@ namespace ARMeilleure.Instructions
{
Operand temp = context.Add(ne, me);
de = EmitSatQ(context, temp, op.Size, signedSrc: true, signedDst: signed);
de = EmitSignedSrcSatQ(context, temp, op.Size, signedDst: signed);
}
else /* if (op.Size == 3) */
{
de = EmitBinarySatQAccumulate(context, ne, me, signed);
de = signed ? EmitBinarySignedSatQAcc(context, ne, me) : EmitBinaryUnsignedSatQAcc(context, ne, me);
}
res = EmitVectorInsert(context, res, de, index, op.Size);
@ -1475,7 +1478,7 @@ namespace ARMeilleure.Instructions
me = EmitVectorExtract(context, ((OpCodeSimdReg)op).Rm, index, op.Size, signed);
}
Operand de = EmitSatQ(context, emit(ne, me), op.Size, true, signed);
Operand de = EmitSignedSrcSatQ(context, emit(ne, me), op.Size, signedDst: signed);
res = EmitVectorInsert(context, res, de, index, op.Size);
}
@ -1520,7 +1523,9 @@ namespace ARMeilleure.Instructions
{
Operand ne = EmitVectorExtract(context, op.Rn, index, op.Size + 1, signedSrc);
Operand temp = EmitSatQ(context, ne, op.Size, signedSrc, signedDst);
Operand temp = signedSrc
? EmitSignedSrcSatQ(context, ne, op.Size, signedDst)
: EmitUnsignedSrcSatQ(context, ne, op.Size, signedDst);
res = EmitVectorInsert(context, res, temp, part + index, op.Size);
}
@ -1528,74 +1533,248 @@ namespace ARMeilleure.Instructions
context.Copy(d, res);
}
// TSrc (16bit, 32bit, 64bit; signed, unsigned) > TDst (8bit, 16bit, 32bit; signed, unsigned).
public static Operand EmitSatQ(ArmEmitterContext context, Operand op, int sizeDst, bool signedSrc, bool signedDst)
// TSrc (16bit, 32bit, 64bit; signed) > TDst (8bit, 16bit, 32bit; signed, unsigned).
// long SignedSrcSignedDstSatQ(long op, int size); ulong SignedSrcUnsignedDstSatQ(long op, int size);
public static Operand EmitSignedSrcSatQ(ArmEmitterContext context, Operand op, int sizeDst, bool signedDst)
{
if ((uint)sizeDst > 2u)
{
throw new ArgumentOutOfRangeException(nameof(sizeDst));
}
Debug.Assert(op.Type == OperandType.I64 && (uint)sizeDst <= 2u);
MethodInfo info;
Operand lbl1 = Label();
Operand lblEnd = Label();
if (signedSrc)
{
info = signedDst
? typeof(SoftFallback).GetMethod(nameof(SoftFallback.SignedSrcSignedDstSatQ))
: typeof(SoftFallback).GetMethod(nameof(SoftFallback.SignedSrcUnsignedDstSatQ));
}
else
{
info = signedDst
? typeof(SoftFallback).GetMethod(nameof(SoftFallback.UnsignedSrcSignedDstSatQ))
: typeof(SoftFallback).GetMethod(nameof(SoftFallback.UnsignedSrcUnsignedDstSatQ));
}
int eSize = 8 << sizeDst;
return context.Call(info, op, Const(sizeDst));
Operand maxT = signedDst ? Const((1L << (eSize - 1)) - 1L) : Const((1UL << eSize) - 1UL);
Operand minT = signedDst ? Const(-(1L << (eSize - 1))) : Const(0UL);
Operand res = context.Copy(context.AllocateLocal(OperandType.I64), op);
context.BranchIf(lbl1, res, maxT, Comparison.LessOrEqual);
context.Copy(res, maxT);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lbl1);
context.BranchIf(lblEnd, res, minT, Comparison.GreaterOrEqual);
context.Copy(res, minT);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lblEnd);
return res;
}
// TSrc (64bit) == TDst (64bit); signed.
public static Operand EmitUnarySignedSatQAbsOrNeg(ArmEmitterContext context, Operand op)
// TSrc (16bit, 32bit, 64bit; unsigned) > TDst (8bit, 16bit, 32bit; signed, unsigned).
// long UnsignedSrcSignedDstSatQ(ulong op, int size); ulong UnsignedSrcUnsignedDstSatQ(ulong op, int size);
public static Operand EmitUnsignedSrcSatQ(ArmEmitterContext context, Operand op, int sizeDst, bool signedDst)
{
Debug.Assert(((OpCodeSimd)context.CurrOp).Size == 3, "Invalid element size.");
Debug.Assert(op.Type == OperandType.I64 && (uint)sizeDst <= 2u);
return context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.UnarySignedSatQAbsOrNeg)), op);
Operand lblEnd = Label();
int eSize = 8 << sizeDst;
Operand maxL = signedDst ? Const((1L << (eSize - 1)) - 1L) : Const((1UL << eSize) - 1UL);
Operand res = context.Copy(context.AllocateLocal(OperandType.I64), op);
context.BranchIf(lblEnd, res, maxL, Comparison.LessOrEqualUI);
context.Copy(res, maxL);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lblEnd);
return res;
}
// TSrcs (64bit) == TDst (64bit); signed, unsigned.
public static Operand EmitBinarySatQAdd(ArmEmitterContext context, Operand op1, Operand op2, bool signed)
// long UnarySignedSatQAbsOrNeg(long op);
private static Operand EmitUnarySignedSatQAbsOrNeg(ArmEmitterContext context, Operand op)
{
Debug.Assert(((OpCodeSimd)context.CurrOp).Size == 3, "Invalid element size.");
Debug.Assert(op.Type == OperandType.I64);
MethodInfo info = signed
? typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinarySignedSatQAdd))
: typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinaryUnsignedSatQAdd));
Operand lblEnd = Label();
return context.Call(info, op1, op2);
Operand minL = Const(long.MinValue);
Operand maxL = Const(long.MaxValue);
Operand res = context.Copy(context.AllocateLocal(OperandType.I64), op);
context.BranchIf(lblEnd, res, minL, Comparison.NotEqual);
context.Copy(res, maxL);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lblEnd);
return res;
}
// TSrcs (64bit) == TDst (64bit); signed, unsigned.
public static Operand EmitBinarySatQSub(ArmEmitterContext context, Operand op1, Operand op2, bool signed)
// long BinarySignedSatQAdd(long op1, long op2);
public static Operand EmitBinarySignedSatQAdd(ArmEmitterContext context, Operand op1, Operand op2)
{
Debug.Assert(((OpCodeSimd)context.CurrOp).Size == 3, "Invalid element size.");
Debug.Assert(op1.Type == OperandType.I64 && op2.Type == OperandType.I64);
MethodInfo info = signed
? typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinarySignedSatQSub))
: typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinaryUnsignedSatQSub));
Operand lblEnd = Label();
return context.Call(info, op1, op2);
Operand minL = Const(long.MinValue);
Operand maxL = Const(long.MaxValue);
Operand zero = Const(0L);
Operand res = context.Copy(context.AllocateLocal(OperandType.I64), context.Add(op1, op2));
Operand left = context.BitwiseNot(context.BitwiseExclusiveOr(op1, op2));
Operand right = context.BitwiseExclusiveOr(op1, res);
context.BranchIf(lblEnd, context.BitwiseAnd(left, right), zero, Comparison.GreaterOrEqual);
Operand isPositive = context.ICompareGreaterOrEqual(op1, zero);
context.Copy(res, context.ConditionalSelect(isPositive, maxL, minL));
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lblEnd);
return res;
}
// TSrcs (64bit) == TDst (64bit); signed, unsigned.
public static Operand EmitBinarySatQAccumulate(ArmEmitterContext context, Operand op1, Operand op2, bool signed)
// ulong BinaryUnsignedSatQAdd(ulong op1, ulong op2);
public static Operand EmitBinaryUnsignedSatQAdd(ArmEmitterContext context, Operand op1, Operand op2)
{
Debug.Assert(((OpCodeSimd)context.CurrOp).Size == 3, "Invalid element size.");
Debug.Assert(op1.Type == OperandType.I64 && op2.Type == OperandType.I64);
MethodInfo info = signed
? typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinarySignedSatQAcc))
: typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinaryUnsignedSatQAcc));
Operand lblEnd = Label();
return context.Call(info, op1, op2);
Operand maxUL = Const(ulong.MaxValue);
Operand res = context.Copy(context.AllocateLocal(OperandType.I64), context.Add(op1, op2));
context.BranchIf(lblEnd, res, op1, Comparison.GreaterOrEqualUI);
context.Copy(res, maxUL);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lblEnd);
return res;
}
// long BinarySignedSatQSub(long op1, long op2);
public static Operand EmitBinarySignedSatQSub(ArmEmitterContext context, Operand op1, Operand op2)
{
Debug.Assert(op1.Type == OperandType.I64 && op2.Type == OperandType.I64);
Operand lblEnd = Label();
Operand minL = Const(long.MinValue);
Operand maxL = Const(long.MaxValue);
Operand zero = Const(0L);
Operand res = context.Copy(context.AllocateLocal(OperandType.I64), context.Subtract(op1, op2));
Operand left = context.BitwiseExclusiveOr(op1, op2);
Operand right = context.BitwiseExclusiveOr(op1, res);
context.BranchIf(lblEnd, context.BitwiseAnd(left, right), zero, Comparison.GreaterOrEqual);
Operand isPositive = context.ICompareGreaterOrEqual(op1, zero);
context.Copy(res, context.ConditionalSelect(isPositive, maxL, minL));
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lblEnd);
return res;
}
// ulong BinaryUnsignedSatQSub(ulong op1, ulong op2);
public static Operand EmitBinaryUnsignedSatQSub(ArmEmitterContext context, Operand op1, Operand op2)
{
Debug.Assert(op1.Type == OperandType.I64 && op2.Type == OperandType.I64);
Operand lblEnd = Label();
Operand zero = Const(0L);
Operand res = context.Copy(context.AllocateLocal(OperandType.I64), context.Subtract(op1, op2));
context.BranchIf(lblEnd, op1, op2, Comparison.GreaterOrEqualUI);
context.Copy(res, zero);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lblEnd);
return res;
}
// long BinarySignedSatQAcc(ulong op1, long op2);
private static Operand EmitBinarySignedSatQAcc(ArmEmitterContext context, Operand op1, Operand op2)
{
Debug.Assert(op1.Type == OperandType.I64 && op2.Type == OperandType.I64);
Operand lbl1 = Label();
Operand lbl2 = Label();
Operand lblEnd = Label();
Operand maxL = Const(long.MaxValue);
Operand zero = Const(0L);
Operand res = context.Copy(context.AllocateLocal(OperandType.I64), context.Add(op1, op2));
context.BranchIf(lbl1, op1, maxL, Comparison.GreaterUI);
Operand notOp2AndRes = context.BitwiseAnd(context.BitwiseNot(op2), res);
context.BranchIf(lblEnd, notOp2AndRes, zero, Comparison.GreaterOrEqual);
context.Copy(res, maxL);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lbl1);
context.BranchIf(lbl2, op2, zero, Comparison.Less);
context.Copy(res, maxL);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lbl2);
context.BranchIf(lblEnd, res, maxL, Comparison.LessOrEqualUI);
context.Copy(res, maxL);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lblEnd);
return res;
}
// ulong BinaryUnsignedSatQAcc(long op1, ulong op2);
private static Operand EmitBinaryUnsignedSatQAcc(ArmEmitterContext context, Operand op1, Operand op2)
{
Debug.Assert(op1.Type == OperandType.I64 && op2.Type == OperandType.I64);
Operand lbl1 = Label();
Operand lblEnd = Label();
Operand maxUL = Const(ulong.MaxValue);
Operand maxL = Const(long.MaxValue);
Operand zero = Const(0L);
Operand res = context.Copy(context.AllocateLocal(OperandType.I64), context.Add(op1, op2));
context.BranchIf(lbl1, op1, zero, Comparison.Less);
context.BranchIf(lblEnd, res, op1, Comparison.GreaterOrEqualUI);
context.Copy(res, maxUL);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lbl1);
context.BranchIf(lblEnd, op2, maxL, Comparison.GreaterUI);
context.BranchIf(lblEnd, res, zero, Comparison.GreaterOrEqual);
context.Copy(res, zero);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
context.Branch(lblEnd);
context.MarkLabel(lblEnd);
return res;
}
public static Operand EmitFloatAbs(ArmEmitterContext context, Operand value, bool single, bool vector)

View File

@ -219,6 +219,25 @@ namespace ARMeilleure.Instructions
// Integer
public static void EmitVectorUnaryAccumulateOpI32(ArmEmitterContext context, Func1I emit, bool signed)
{
OpCode32Simd op = (OpCode32Simd)context.CurrOp;
Operand res = GetVecA32(op.Qd);
int elems = op.GetBytesCount() >> op.Size;
for (int index = 0; index < elems; index++)
{
Operand de = EmitVectorExtract32(context, op.Qd, op.Id + index, op.Size, signed);
Operand me = EmitVectorExtract32(context, op.Qm, op.Im + index, op.Size, signed);
res = EmitVectorInsert(context, res, context.Add(de, emit(me)), op.Id + index, op.Size);
}
context.Copy(GetVecA32(op.Qd), res);
}
public static void EmitVectorUnaryOpI32(ArmEmitterContext context, Func1I emit, bool signed)
{
OpCode32Simd op = (OpCode32Simd)context.CurrOp;
@ -385,6 +404,18 @@ namespace ARMeilleure.Instructions
EmitVectorUnaryOpI32(context, emit, true);
}
public static void EmitVectorUnaryOpSx32(ArmEmitterContext context, Func1I emit, bool accumulate)
{
if (accumulate)
{
EmitVectorUnaryAccumulateOpI32(context, emit, true);
}
else
{
EmitVectorUnaryOpI32(context, emit, true);
}
}
public static void EmitVectorBinaryOpSx32(ArmEmitterContext context, Func2I emit)
{
EmitVectorBinaryOpI32(context, emit, true);
@ -400,6 +431,18 @@ namespace ARMeilleure.Instructions
EmitVectorUnaryOpI32(context, emit, false);
}
public static void EmitVectorUnaryOpZx32(ArmEmitterContext context, Func1I emit, bool accumulate)
{
if (accumulate)
{
EmitVectorUnaryAccumulateOpI32(context, emit, false);
}
else
{
EmitVectorUnaryOpI32(context, emit, false);
}
}
public static void EmitVectorBinaryOpZx32(ArmEmitterContext context, Func2I emit)
{
EmitVectorBinaryOpI32(context, emit, false);
@ -592,6 +635,34 @@ namespace ARMeilleure.Instructions
context.Copy(GetVecA32(op.Qd), res);
}
public static void EmitVectorPairwiseLongOpI32(ArmEmitterContext context, Func2I emit, bool signed)
{
OpCode32Simd op = (OpCode32Simd)context.CurrOp;
int elems = (op.Q ? 16 : 8) >> op.Size;
int pairs = elems >> 1;
int id = (op.Vd & 1) * pairs;
Operand res = GetVecA32(op.Qd);
for (int index = 0; index < pairs; index++)
{
int pairIndex = index << 1;
Operand m1 = EmitVectorExtract32(context, op.Qm, op.Im + pairIndex, op.Size, signed);
Operand m2 = EmitVectorExtract32(context, op.Qm, op.Im + pairIndex + 1, op.Size, signed);
if (op.Size == 2)
{
m1 = signed ? context.SignExtend32(OperandType.I64, m1) : context.ZeroExtend32(OperandType.I64, m1);
m2 = signed ? context.SignExtend32(OperandType.I64, m2) : context.ZeroExtend32(OperandType.I64, m2);
}
res = EmitVectorInsert(context, res, emit(m1, m2), id + index, op.Size + 1);
}
context.Copy(GetVecA32(op.Qd), res);
}
// Narrow
public static void EmitVectorUnaryNarrowOp32(ArmEmitterContext context, Func1I emit, bool signed = false)

View File

@ -1004,7 +1004,7 @@ namespace ARMeilleure.Instructions
e = EmitShrImm64(context, e, signedSrc, roundConst, shift); // shift <= 32
}
e = EmitSatQ(context, e, op.Size, signedSrc, signedDst);
e = signedSrc ? EmitSignedSrcSatQ(context, e, op.Size, signedDst) : EmitUnsignedSrcSatQ(context, e, op.Size, signedDst);
res = EmitVectorInsert(context, res, e, part + index, op.Size);
}

View File

@ -33,64 +33,24 @@ namespace ARMeilleure.Instructions
EmitShrImmSaturatingNarrowOp(context, op.U ? ShrImmSaturatingNarrowFlags.VectorZxZx : ShrImmSaturatingNarrowFlags.VectorSxSx);
}
public static void Vqshrun(ArmEmitterContext context)
{
EmitShrImmSaturatingNarrowOp(context, ShrImmSaturatingNarrowFlags.VectorSxZx);
}
public static void Vrshr(ArmEmitterContext context)
{
OpCode32SimdShImm op = (OpCode32SimdShImm)context.CurrOp;
int shift = GetImmShr(op);
long roundConst = 1L << (shift - 1);
EmitRoundShrImmOp(context, accumulate: false);
}
if (op.U)
{
if (op.Size < 2)
{
EmitVectorUnaryOpZx32(context, (op1) =>
{
op1 = context.Add(op1, Const(op1.Type, roundConst));
public static void Vrshrn(ArmEmitterContext context)
{
EmitRoundShrImmNarrowOp(context, signed: false);
}
return context.ShiftRightUI(op1, Const(shift));
});
}
else if (op.Size == 2)
{
EmitVectorUnaryOpZx32(context, (op1) =>
{
op1 = context.ZeroExtend32(OperandType.I64, op1);
op1 = context.Add(op1, Const(op1.Type, roundConst));
return context.ConvertI64ToI32(context.ShiftRightUI(op1, Const(shift)));
});
}
else /* if (op.Size == 3) */
{
EmitVectorUnaryOpZx32(context, (op1) => EmitShrImm64(context, op1, signed: false, roundConst, shift));
}
}
else
{
if (op.Size < 2)
{
EmitVectorUnaryOpSx32(context, (op1) =>
{
op1 = context.Add(op1, Const(op1.Type, roundConst));
return context.ShiftRightSI(op1, Const(shift));
});
}
else if (op.Size == 2)
{
EmitVectorUnaryOpSx32(context, (op1) =>
{
op1 = context.SignExtend32(OperandType.I64, op1);
op1 = context.Add(op1, Const(op1.Type, roundConst));
return context.ConvertI64ToI32(context.ShiftRightSI(op1, Const(shift)));
});
}
else /* if (op.Size == 3) */
{
EmitVectorUnaryOpZx32(context, (op1) => EmitShrImm64(context, op1, signed: true, roundConst, shift));
}
}
public static void Vrsra(ArmEmitterContext context)
{
EmitRoundShrImmOp(context, accumulate: true);
}
public static void Vshl(ArmEmitterContext context)
@ -191,6 +151,89 @@ namespace ARMeilleure.Instructions
}
}
public static void EmitRoundShrImmOp(ArmEmitterContext context, bool accumulate)
{
OpCode32SimdShImm op = (OpCode32SimdShImm)context.CurrOp;
int shift = GetImmShr(op);
long roundConst = 1L << (shift - 1);
if (op.U)
{
if (op.Size < 2)
{
EmitVectorUnaryOpZx32(context, (op1) =>
{
op1 = context.Add(op1, Const(op1.Type, roundConst));
return context.ShiftRightUI(op1, Const(shift));
}, accumulate);
}
else if (op.Size == 2)
{
EmitVectorUnaryOpZx32(context, (op1) =>
{
op1 = context.ZeroExtend32(OperandType.I64, op1);
op1 = context.Add(op1, Const(op1.Type, roundConst));
return context.ConvertI64ToI32(context.ShiftRightUI(op1, Const(shift)));
}, accumulate);
}
else /* if (op.Size == 3) */
{
EmitVectorUnaryOpZx32(context, (op1) => EmitShrImm64(context, op1, signed: false, roundConst, shift), accumulate);
}
}
else
{
if (op.Size < 2)
{
EmitVectorUnaryOpSx32(context, (op1) =>
{
op1 = context.Add(op1, Const(op1.Type, roundConst));
return context.ShiftRightSI(op1, Const(shift));
}, accumulate);
}
else if (op.Size == 2)
{
EmitVectorUnaryOpSx32(context, (op1) =>
{
op1 = context.SignExtend32(OperandType.I64, op1);
op1 = context.Add(op1, Const(op1.Type, roundConst));
return context.ConvertI64ToI32(context.ShiftRightSI(op1, Const(shift)));
}, accumulate);
}
else /* if (op.Size == 3) */
{
EmitVectorUnaryOpZx32(context, (op1) => EmitShrImm64(context, op1, signed: true, roundConst, shift), accumulate);
}
}
}
private static void EmitRoundShrImmNarrowOp(ArmEmitterContext context, bool signed)
{
OpCode32SimdShImm op = (OpCode32SimdShImm)context.CurrOp;
int shift = GetImmShr(op);
long roundConst = 1L << (shift - 1);
EmitVectorUnaryNarrowOp32(context, (op1) =>
{
if (op.Size <= 1)
{
op1 = context.Add(op1, Const(op1.Type, roundConst));
op1 = signed ? context.ShiftRightSI(op1, Const(shift)) : context.ShiftRightUI(op1, Const(shift));
}
else /* if (op.Size == 2 && round) */
{
op1 = EmitShrImm64(context, op1, signed, roundConst, shift); // shift <= 32
}
return op1;
}, signed);
}
private static Operand EmitShlRegOp(ArmEmitterContext context, Operand op, Operand shiftLsB, int size, bool unsigned)
{
if (shiftLsB.Type == OperandType.I64)
@ -289,7 +332,7 @@ namespace ARMeilleure.Instructions
op1 = EmitShrImm64(context, op1, signedSrc, roundConst, shift); // shift <= 32
}
return EmitSatQ(context, op1, 8 << op.Size, signedDst);
return EmitSatQ(context, op1, 8 << op.Size, signedSrc, signedDst);
}, signedSrc);
}
@ -313,15 +356,20 @@ namespace ARMeilleure.Instructions
return context.Call(info, value, Const(roundConst), Const(shift));
}
private static Operand EmitSatQ(ArmEmitterContext context, Operand value, int eSize, bool signed)
private static Operand EmitSatQ(ArmEmitterContext context, Operand value, int eSize, bool signedSrc, bool signedDst)
{
Debug.Assert(eSize <= 32);
long intMin = signed ? -(1L << (eSize - 1)) : 0;
long intMax = signed ? (1L << (eSize - 1)) - 1 : (1L << eSize) - 1;
long intMin = signedDst ? -(1L << (eSize - 1)) : 0;
long intMax = signedDst ? (1L << (eSize - 1)) - 1 : (1L << eSize) - 1;
Operand gt = context.ICompareGreater(value, Const(value.Type, intMax));
Operand lt = context.ICompareLess(value, Const(value.Type, intMin));
Operand gt = signedSrc
? context.ICompareGreater(value, Const(value.Type, intMax))
: context.ICompareGreaterUI(value, Const(value.Type, intMax));
Operand lt = signedSrc
? context.ICompareLess(value, Const(value.Type, intMin))
: context.ICompareLessUI(value, Const(value.Type, intMin));
value = context.ConditionalSelect(gt, Const(value.Type, intMax), value);
value = context.ConditionalSelect(lt, Const(value.Type, intMin), value);

View File

@ -16,18 +16,13 @@ namespace ARMeilleure.Instructions
{
OpCode32System op = (OpCode32System)context.CurrOp;
if (op.Coproc != 15)
if (op.Coproc != 15 || op.Opc1 != 0)
{
InstEmit.Und(context);
return;
}
if (op.Opc1 != 0)
{
throw new NotImplementedException($"Unknown MRC Opc1 0x{op.Opc1:X16} at 0x{op.Address:X16}.");
}
MethodInfo info;
switch (op.CRn)
@ -35,7 +30,7 @@ namespace ARMeilleure.Instructions
case 13: // Process and Thread Info.
if (op.CRm != 0)
{
throw new NotImplementedException($"Unknown MRC CRm 0x{op.CRm:X16} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown MRC CRm 0x{op.CRm:X} at 0x{op.Address:X} (0x{op.RawOpCode:X}).");
}
switch (op.Opc2)
@ -44,7 +39,7 @@ namespace ARMeilleure.Instructions
info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetTpidrEl032)); break;
default:
throw new NotImplementedException($"Unknown MRC Opc2 0x{op.Opc2:X16} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown MRC Opc2 0x{op.Opc2:X} at 0x{op.Address:X} (0x{op.RawOpCode:X}).");
}
break;
@ -59,11 +54,11 @@ namespace ARMeilleure.Instructions
return; // No-op.
default:
throw new NotImplementedException($"Unknown MRC Opc2 0x{op.Opc2:X16} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown MRC Opc2 0x{op.Opc2:X16} at 0x{op.Address:X16} (0x{op.RawOpCode:X}).");
}
default:
throw new NotImplementedException($"Unknown MRC CRm 0x{op.CRm:X16} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown MRC CRm 0x{op.CRm:X16} at 0x{op.Address:X16} (0x{op.RawOpCode:X}).");
}
default:
@ -77,18 +72,13 @@ namespace ARMeilleure.Instructions
{
OpCode32System op = (OpCode32System)context.CurrOp;
if (op.Coproc != 15)
if (op.Coproc != 15 || op.Opc1 != 0)
{
InstEmit.Und(context);
return;
}
if (op.Opc1 != 0)
{
throw new NotImplementedException($"Unknown MRC Opc1 0x{op.Opc1:X16} at 0x{op.Address:X16}.");
}
MethodInfo info;
switch (op.CRn)
@ -96,7 +86,7 @@ namespace ARMeilleure.Instructions
case 13: // Process and Thread Info.
if (op.CRm != 0)
{
throw new NotImplementedException($"Unknown MRC CRm 0x{op.CRm:X16} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown MRC CRm 0x{op.CRm:X} at 0x{op.Address:X} (0x{op.RawOpCode:X}).");
}
switch (op.Opc2)
@ -108,13 +98,13 @@ namespace ARMeilleure.Instructions
info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetTpidr32)); break;
default:
throw new NotImplementedException($"Unknown MRC Opc2 0x{op.Opc2:X16} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown MRC Opc2 0x{op.Opc2:X} at 0x{op.Address:X} (0x{op.RawOpCode:X}).");
}
break;
default:
throw new NotImplementedException($"Unknown MRC 0x{op.RawOpCode:X8} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown MRC 0x{op.RawOpCode:X} at 0x{op.Address:X}.");
}
if (op.Rt == RegisterAlias.Aarch32Pc)
@ -154,13 +144,13 @@ namespace ARMeilleure.Instructions
info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetCntpctEl0)); break;
default:
throw new NotImplementedException($"Unknown MRRC Opc1 0x{opc:X16} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown MRRC Opc1 0x{opc:X} at 0x{op.Address:X} (0x{op.RawOpCode:X}).");
}
break;
default:
throw new NotImplementedException($"Unknown MRRC 0x{op.RawOpCode:X8} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown MRRC 0x{op.RawOpCode:X} at 0x{op.Address:X}.");
}
Operand result = context.Call(info);
@ -169,6 +159,31 @@ namespace ARMeilleure.Instructions
SetIntA32(context, op.CRn, context.ConvertI64ToI32(context.ShiftRightUI(result, Const(32))));
}
public static void Mrs(ArmEmitterContext context)
{
OpCode32Mrs op = (OpCode32Mrs)context.CurrOp;
if (op.R)
{
throw new NotImplementedException("SPSR");
}
else
{
Operand vSh = context.ShiftLeft(GetFlag(PState.VFlag), Const((int)PState.VFlag));
Operand cSh = context.ShiftLeft(GetFlag(PState.CFlag), Const((int)PState.CFlag));
Operand zSh = context.ShiftLeft(GetFlag(PState.ZFlag), Const((int)PState.ZFlag));
Operand nSh = context.ShiftLeft(GetFlag(PState.NFlag), Const((int)PState.NFlag));
Operand qSh = context.ShiftLeft(GetFlag(PState.QFlag), Const((int)PState.QFlag));
Operand spsr = context.BitwiseOr(context.BitwiseOr(nSh, zSh), context.BitwiseOr(cSh, vSh));
spsr = context.BitwiseOr(spsr, qSh);
// TODO: Remaining flags.
SetIntA32(context, op.Rd, spsr);
}
}
public static void Msr(ArmEmitterContext context)
{
OpCode32MsrReg op = (OpCode32MsrReg)context.CurrOp;
@ -240,7 +255,7 @@ namespace ARMeilleure.Instructions
case 0b1000: // FPEXC
throw new NotImplementedException("Supervisor Only");
default:
throw new NotImplementedException($"Unknown VMRS 0x{op.RawOpCode:X8} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown VMRS 0x{op.RawOpCode:X} at 0x{op.Address:X}.");
}
}
@ -263,7 +278,7 @@ namespace ARMeilleure.Instructions
case 0b1000: // FPEXC
throw new NotImplementedException("Supervisor Only");
default:
throw new NotImplementedException($"Unknown VMSR 0x{op.RawOpCode:X8} at 0x{op.Address:X16}.");
throw new NotImplementedException($"Unknown VMSR 0x{op.RawOpCode:X} at 0x{op.Address:X}.");
}
}

View File

@ -545,6 +545,8 @@ namespace ARMeilleure.Instructions
Strexh,
Strh,
Sxtb16,
Tbb,
Tbh,
Teq,
Trap,
Tst,
@ -601,6 +603,7 @@ namespace ARMeilleure.Instructions
Vmin,
Vminnm,
Vmla,
Vmlal,
Vmls,
Vmlsl,
Vmov,
@ -618,15 +621,28 @@ namespace ARMeilleure.Instructions
Vorn,
Vorr,
Vpadd,
Vpaddl,
Vpmax,
Vpmin,
Vqadd,
Vqdmulh,
Vqmovn,
Vqmovun,
Vqrshrn,
Vqrshrun,
Vqshrn,
Vqshrun,
Vqsub,
Vrev,
Vrhadd,
Vrint,
Vrinta,
Vrintm,
Vrintn,
Vrintp,
Vrintx,
Vrshr,
Vrshrn,
Vsel,
Vshl,
Vshll,
@ -643,8 +659,10 @@ namespace ARMeilleure.Instructions
Vrecps,
Vrsqrte,
Vrsqrts,
Vrsra,
Vsra,
Vsub,
Vsubl,
Vsubw,
Vtbl,
Vtrn,

View File

@ -2,8 +2,6 @@ using ARMeilleure.Memory;
using ARMeilleure.State;
using ARMeilleure.Translation;
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
namespace ARMeilleure.Instructions
{

View File

@ -91,7 +91,7 @@ namespace ARMeilleure.Instructions
}
else /* if (eSize != 64) */
{
return SignedSrcSignedDstSatQ(value << shiftLsB, size);
return SignedSrcSignedDstSatQ(value << shiftLsB, size); // InstEmitSimdHelper.EmitSignedSrcSatQ(signedDst: true).
}
}
else /* if (shiftLsB == 0) */
@ -135,7 +135,7 @@ namespace ARMeilleure.Instructions
}
else /* if (eSize != 64) */
{
return UnsignedSrcUnsignedDstSatQ(value << shiftLsB, size);
return UnsignedSrcUnsignedDstSatQ(value << shiftLsB, size); // InstEmitSimdHelper.EmitUnsignedSrcSatQ(signedDst: false).
}
}
else /* if (shiftLsB == 0) */
@ -509,7 +509,7 @@ namespace ARMeilleure.Instructions
#endregion
#region "Saturating"
public static long SignedSrcSignedDstSatQ(long op, int size)
private static long SignedSrcSignedDstSatQ(long op, int size)
{
ExecutionContext context = NativeInterface.GetContext();
@ -536,54 +536,7 @@ namespace ARMeilleure.Instructions
}
}
public static ulong SignedSrcUnsignedDstSatQ(long op, int size)
{
ExecutionContext context = NativeInterface.GetContext();
int eSize = 8 << size;
ulong tMaxValue = (1UL << eSize) - 1UL;
ulong tMinValue = 0UL;
if (op > (long)tMaxValue)
{
context.Fpsr |= FPSR.Qc;
return tMaxValue;
}
else if (op < (long)tMinValue)
{
context.Fpsr |= FPSR.Qc;
return tMinValue;
}
else
{
return (ulong)op;
}
}
public static long UnsignedSrcSignedDstSatQ(ulong op, int size)
{
ExecutionContext context = NativeInterface.GetContext();
int eSize = 8 << size;
long tMaxValue = (1L << (eSize - 1)) - 1L;
if (op > (ulong)tMaxValue)
{
context.Fpsr |= FPSR.Qc;
return tMaxValue;
}
else
{
return (long)op;
}
}
public static ulong UnsignedSrcUnsignedDstSatQ(ulong op, int size)
private static ulong UnsignedSrcUnsignedDstSatQ(ulong op, int size)
{
ExecutionContext context = NativeInterface.GetContext();
@ -602,208 +555,6 @@ namespace ARMeilleure.Instructions
return op;
}
}
public static long UnarySignedSatQAbsOrNeg(long op)
{
ExecutionContext context = NativeInterface.GetContext();
if (op == long.MinValue)
{
context.Fpsr |= FPSR.Qc;
return long.MaxValue;
}
else
{
return op;
}
}
public static long BinarySignedSatQAdd(long op1, long op2)
{
ExecutionContext context = NativeInterface.GetContext();
long add = op1 + op2;
if ((~(op1 ^ op2) & (op1 ^ add)) < 0L)
{
context.Fpsr |= FPSR.Qc;
if (op1 < 0L)
{
return long.MinValue;
}
else
{
return long.MaxValue;
}
}
else
{
return add;
}
}
public static ulong BinaryUnsignedSatQAdd(ulong op1, ulong op2)
{
ExecutionContext context = NativeInterface.GetContext();
ulong add = op1 + op2;
if ((add < op1) && (add < op2))
{
context.Fpsr |= FPSR.Qc;
return ulong.MaxValue;
}
else
{
return add;
}
}
public static long BinarySignedSatQSub(long op1, long op2)
{
ExecutionContext context = NativeInterface.GetContext();
long sub = op1 - op2;
if (((op1 ^ op2) & (op1 ^ sub)) < 0L)
{
context.Fpsr |= FPSR.Qc;
if (op1 < 0L)
{
return long.MinValue;
}
else
{
return long.MaxValue;
}
}
else
{
return sub;
}
}
public static ulong BinaryUnsignedSatQSub(ulong op1, ulong op2)
{
ExecutionContext context = NativeInterface.GetContext();
ulong sub = op1 - op2;
if (op1 < op2)
{
context.Fpsr |= FPSR.Qc;
return ulong.MinValue;
}
else
{
return sub;
}
}
public static long BinarySignedSatQAcc(ulong op1, long op2)
{
ExecutionContext context = NativeInterface.GetContext();
if (op1 <= (ulong)long.MaxValue)
{
// op1 from ulong.MinValue to (ulong)long.MaxValue
// op2 from long.MinValue to long.MaxValue
long add = (long)op1 + op2;
if ((~op2 & add) < 0L)
{
context.Fpsr |= FPSR.Qc;
return long.MaxValue;
}
else
{
return add;
}
}
else if (op2 >= 0L)
{
// op1 from (ulong)long.MaxValue + 1UL to ulong.MaxValue
// op2 from (long)ulong.MinValue to long.MaxValue
context.Fpsr |= FPSR.Qc;
return long.MaxValue;
}
else
{
// op1 from (ulong)long.MaxValue + 1UL to ulong.MaxValue
// op2 from long.MinValue to (long)ulong.MinValue - 1L
ulong add = op1 + (ulong)op2;
if (add > (ulong)long.MaxValue)
{
context.Fpsr |= FPSR.Qc;
return long.MaxValue;
}
else
{
return (long)add;
}
}
}
public static ulong BinaryUnsignedSatQAcc(long op1, ulong op2)
{
ExecutionContext context = NativeInterface.GetContext();
if (op1 >= 0L)
{
// op1 from (long)ulong.MinValue to long.MaxValue
// op2 from ulong.MinValue to ulong.MaxValue
ulong add = (ulong)op1 + op2;
if ((add < (ulong)op1) && (add < op2))
{
context.Fpsr |= FPSR.Qc;
return ulong.MaxValue;
}
else
{
return add;
}
}
else if (op2 > (ulong)long.MaxValue)
{
// op1 from long.MinValue to (long)ulong.MinValue - 1L
// op2 from (ulong)long.MaxValue + 1UL to ulong.MaxValue
return (ulong)op1 + op2;
}
else
{
// op1 from long.MinValue to (long)ulong.MinValue - 1L
// op2 from ulong.MinValue to (ulong)long.MaxValue
long add = op1 + (long)op2;
if (add < (long)ulong.MinValue)
{
context.Fpsr |= FPSR.Qc;
return ulong.MinValue;
}
else
{
return (ulong)add;
}
}
}
#endregion
#region "Count"
@ -1129,7 +880,7 @@ namespace ARMeilleure.Instructions
return Sha256Hash(hash_abcd, hash_efgh, wk, part1: true);
}
public static V128 HashUpper(V128 hash_efgh, V128 hash_abcd, V128 wk)
public static V128 HashUpper(V128 hash_abcd, V128 hash_efgh, V128 wk)
{
return Sha256Hash(hash_abcd, hash_efgh, wk, part1: false);
}

View File

@ -71,6 +71,7 @@ namespace ARMeilleure.IntermediateRepresentation
X86Paddd,
X86Paddq,
X86Paddw,
X86Palignr,
X86Pand,
X86Pandn,
X86Pavgb,
@ -140,6 +141,9 @@ namespace ARMeilleure.IntermediateRepresentation
X86Roundss,
X86Rsqrtps,
X86Rsqrtss,
X86Sha256Msg1,
X86Sha256Msg2,
X86Sha256Rnds2,
X86Shufpd,
X86Shufps,
X86Sqrtpd,

View File

@ -21,6 +21,7 @@ namespace ARMeilleure
public static bool UseFmaIfAvailable { get; set; } = true;
public static bool UseAesniIfAvailable { get; set; } = true;
public static bool UsePclmulqdqIfAvailable { get; set; } = true;
public static bool UseShaIfAvailable { get; set; } = true;
public static bool ForceLegacySse
{
@ -40,5 +41,6 @@ namespace ARMeilleure
internal static bool UseFma => UseFmaIfAvailable && HardwareCapabilities.SupportsFma;
internal static bool UseAesni => UseAesniIfAvailable && HardwareCapabilities.SupportsAesni;
internal static bool UsePclmulqdq => UsePclmulqdqIfAvailable && HardwareCapabilities.SupportsPclmulqdq;
internal static bool UseSha => UseShaIfAvailable && HardwareCapabilities.SupportsSha;
}
}

View File

@ -197,12 +197,29 @@ namespace ARMeilleure.Signal
// Only call tracking if in range.
context.BranchIfFalse(nextLabel, inRange, BasicBlockFrequency.Cold);
context.Copy(inRegionLocal, Const(1));
Operand offset = context.BitwiseAnd(context.Subtract(faultAddress, rangeAddress), Const(~PageMask));
// Call the tracking action, with the pointer's relative offset to the base address.
Operand trackingActionPtr = context.Load(OperandType.I64, Const((ulong)signalStructPtr + rangeBaseOffset + 20));
context.Call(trackingActionPtr, OperandType.I32, offset, Const(PageSize), isWrite, Const(0));
context.Copy(inRegionLocal, Const(0));
Operand skipActionLabel = Label();
// Tracking action should be non-null to call it, otherwise assume false return.
context.BranchIfFalse(skipActionLabel, trackingActionPtr);
Operand result = context.Call(trackingActionPtr, OperandType.I32, offset, Const(PageSize), isWrite, Const(0));
context.Copy(inRegionLocal, result);
context.MarkLabel(skipActionLabel);
// If the tracking action returns false or does not exist, it might be an invalid access due to a partial overlap on Windows.
if (OperatingSystem.IsWindows())
{
context.BranchIfTrue(endLabel, inRegionLocal);
context.Copy(inRegionLocal, WindowsPartialUnmapHandler.EmitRetryFromAccessViolation(context));
}
context.Branch(endLabel);

View File

@ -0,0 +1,84 @@
using ARMeilleure.IntermediateRepresentation;
using ARMeilleure.Translation;
using System;
using static ARMeilleure.IntermediateRepresentation.Operand.Factory;
namespace ARMeilleure.Signal
{
public struct NativeWriteLoopState
{
public int Running;
public int Error;
}
public static class TestMethods
{
public delegate bool DebugPartialUnmap();
public delegate int DebugThreadLocalMapGetOrReserve(int threadId, int initialState);
public delegate void DebugNativeWriteLoop(IntPtr nativeWriteLoopPtr, IntPtr writePtr);
public static DebugPartialUnmap GenerateDebugPartialUnmap()
{
EmitterContext context = new EmitterContext();
var result = WindowsPartialUnmapHandler.EmitRetryFromAccessViolation(context);
context.Return(result);
// Compile and return the function.
ControlFlowGraph cfg = context.GetControlFlowGraph();
OperandType[] argTypes = new OperandType[] { OperandType.I64 };
return Compiler.Compile(cfg, argTypes, OperandType.I32, CompilerOptions.HighCq).Map<DebugPartialUnmap>();
}
public static DebugThreadLocalMapGetOrReserve GenerateDebugThreadLocalMapGetOrReserve(IntPtr structPtr)
{
EmitterContext context = new EmitterContext();
var result = WindowsPartialUnmapHandler.EmitThreadLocalMapIntGetOrReserve(context, structPtr, context.LoadArgument(OperandType.I32, 0), context.LoadArgument(OperandType.I32, 1));
context.Return(result);
// Compile and return the function.
ControlFlowGraph cfg = context.GetControlFlowGraph();
OperandType[] argTypes = new OperandType[] { OperandType.I64 };
return Compiler.Compile(cfg, argTypes, OperandType.I32, CompilerOptions.HighCq).Map<DebugThreadLocalMapGetOrReserve>();
}
public static DebugNativeWriteLoop GenerateDebugNativeWriteLoop()
{
EmitterContext context = new EmitterContext();
// Loop a write to the target address until "running" is false.
Operand structPtr = context.Copy(context.LoadArgument(OperandType.I64, 0));
Operand writePtr = context.Copy(context.LoadArgument(OperandType.I64, 1));
Operand loopLabel = Label();
context.MarkLabel(loopLabel);
context.Store(writePtr, Const(12345));
Operand running = context.Load(OperandType.I32, structPtr);
context.BranchIfTrue(loopLabel, running);
context.Return();
// Compile and return the function.
ControlFlowGraph cfg = context.GetControlFlowGraph();
OperandType[] argTypes = new OperandType[] { OperandType.I64 };
return Compiler.Compile(cfg, argTypes, OperandType.None, CompilerOptions.HighCq).Map<DebugNativeWriteLoop>();
}
}
}

View File

@ -0,0 +1,186 @@
using ARMeilleure.IntermediateRepresentation;
using ARMeilleure.Translation;
using Ryujinx.Common.Memory.PartialUnmaps;
using System;
using static ARMeilleure.IntermediateRepresentation.Operand.Factory;
namespace ARMeilleure.Signal
{
/// <summary>
/// Methods to handle signals caused by partial unmaps. See the structs for C# implementations of the methods.
/// </summary>
internal static class WindowsPartialUnmapHandler
{
public static Operand EmitRetryFromAccessViolation(EmitterContext context)
{
IntPtr partialRemapStatePtr = PartialUnmapState.GlobalState;
IntPtr localCountsPtr = IntPtr.Add(partialRemapStatePtr, PartialUnmapState.LocalCountsOffset);
// Get the lock first.
EmitNativeReaderLockAcquire(context, IntPtr.Add(partialRemapStatePtr, PartialUnmapState.PartialUnmapLockOffset));
IntPtr getCurrentThreadId = WindowsSignalHandlerRegistration.GetCurrentThreadIdFunc();
Operand threadId = context.Call(Const((ulong)getCurrentThreadId), OperandType.I32);
Operand threadIndex = EmitThreadLocalMapIntGetOrReserve(context, localCountsPtr, threadId, Const(0));
Operand endLabel = Label();
Operand retry = context.AllocateLocal(OperandType.I32);
Operand threadIndexValidLabel = Label();
context.BranchIfFalse(threadIndexValidLabel, context.ICompareEqual(threadIndex, Const(-1)));
context.Copy(retry, Const(1)); // Always retry when thread local cannot be allocated.
context.Branch(endLabel);
context.MarkLabel(threadIndexValidLabel);
Operand threadLocalPartialUnmapsPtr = EmitThreadLocalMapIntGetValuePtr(context, localCountsPtr, threadIndex);
Operand threadLocalPartialUnmaps = context.Load(OperandType.I32, threadLocalPartialUnmapsPtr);
Operand partialUnmapsCount = context.Load(OperandType.I32, Const((ulong)IntPtr.Add(partialRemapStatePtr, PartialUnmapState.PartialUnmapsCountOffset)));
context.Copy(retry, context.ICompareNotEqual(threadLocalPartialUnmaps, partialUnmapsCount));
Operand noRetryLabel = Label();
context.BranchIfFalse(noRetryLabel, retry);
// if (retry) {
context.Store(threadLocalPartialUnmapsPtr, partialUnmapsCount);
context.Branch(endLabel);
context.MarkLabel(noRetryLabel);
// }
context.MarkLabel(endLabel);
// Finally, release the lock and return the retry value.
EmitNativeReaderLockRelease(context, IntPtr.Add(partialRemapStatePtr, PartialUnmapState.PartialUnmapLockOffset));
return retry;
}
public static Operand EmitThreadLocalMapIntGetOrReserve(EmitterContext context, IntPtr threadLocalMapPtr, Operand threadId, Operand initialState)
{
Operand idsPtr = Const((ulong)IntPtr.Add(threadLocalMapPtr, ThreadLocalMap<int>.ThreadIdsOffset));
Operand i = context.AllocateLocal(OperandType.I32);
context.Copy(i, Const(0));
// (Loop 1) Check all slots for a matching Thread ID (while also trying to allocate)
Operand endLabel = Label();
Operand loopLabel = Label();
context.MarkLabel(loopLabel);
Operand offset = context.Multiply(i, Const(sizeof(int)));
Operand idPtr = context.Add(idsPtr, context.SignExtend32(OperandType.I64, offset));
// Check that this slot has the thread ID.
Operand existingId = context.CompareAndSwap(idPtr, threadId, threadId);
// If it was already the thread ID, then we just need to return i.
context.BranchIfTrue(endLabel, context.ICompareEqual(existingId, threadId));
context.Copy(i, context.Add(i, Const(1)));
context.BranchIfTrue(loopLabel, context.ICompareLess(i, Const(ThreadLocalMap<int>.MapSize)));
// (Loop 2) Try take a slot that is 0 with our Thread ID.
context.Copy(i, Const(0)); // Reset i.
Operand loop2Label = Label();
context.MarkLabel(loop2Label);
Operand offset2 = context.Multiply(i, Const(sizeof(int)));
Operand idPtr2 = context.Add(idsPtr, context.SignExtend32(OperandType.I64, offset2));
// Try and swap in the thread id on top of 0.
Operand existingId2 = context.CompareAndSwap(idPtr2, Const(0), threadId);
Operand idNot0Label = Label();
// If it was 0, then we need to initialize the struct entry and return i.
context.BranchIfFalse(idNot0Label, context.ICompareEqual(existingId2, Const(0)));
Operand structsPtr = Const((ulong)IntPtr.Add(threadLocalMapPtr, ThreadLocalMap<int>.StructsOffset));
Operand structPtr = context.Add(structsPtr, context.SignExtend32(OperandType.I64, offset2));
context.Store(structPtr, initialState);
context.Branch(endLabel);
context.MarkLabel(idNot0Label);
context.Copy(i, context.Add(i, Const(1)));
context.BranchIfTrue(loop2Label, context.ICompareLess(i, Const(ThreadLocalMap<int>.MapSize)));
context.Copy(i, Const(-1)); // Could not place the thread in the list.
context.MarkLabel(endLabel);
return context.Copy(i);
}
private static Operand EmitThreadLocalMapIntGetValuePtr(EmitterContext context, IntPtr threadLocalMapPtr, Operand index)
{
Operand offset = context.Multiply(index, Const(sizeof(int)));
Operand structsPtr = Const((ulong)IntPtr.Add(threadLocalMapPtr, ThreadLocalMap<int>.StructsOffset));
return context.Add(structsPtr, context.SignExtend32(OperandType.I64, offset));
}
private static void EmitThreadLocalMapIntRelease(EmitterContext context, IntPtr threadLocalMapPtr, Operand threadId, Operand index)
{
Operand offset = context.Multiply(index, Const(sizeof(int)));
Operand idsPtr = Const((ulong)IntPtr.Add(threadLocalMapPtr, ThreadLocalMap<int>.ThreadIdsOffset));
Operand idPtr = context.Add(idsPtr, context.SignExtend32(OperandType.I64, offset));
context.CompareAndSwap(idPtr, threadId, Const(0));
}
private static void EmitAtomicAddI32(EmitterContext context, Operand ptr, Operand additive)
{
Operand loop = Label();
context.MarkLabel(loop);
Operand initial = context.Load(OperandType.I32, ptr);
Operand newValue = context.Add(initial, additive);
Operand replaced = context.CompareAndSwap(ptr, initial, newValue);
context.BranchIfFalse(loop, context.ICompareEqual(initial, replaced));
}
private static void EmitNativeReaderLockAcquire(EmitterContext context, IntPtr nativeReaderLockPtr)
{
Operand writeLockPtr = Const((ulong)IntPtr.Add(nativeReaderLockPtr, NativeReaderWriterLock.WriteLockOffset));
// Spin until we can acquire the write lock.
Operand spinLabel = Label();
context.MarkLabel(spinLabel);
// Old value must be 0 to continue (we gained the write lock)
context.BranchIfTrue(spinLabel, context.CompareAndSwap(writeLockPtr, Const(0), Const(1)));
// Increment reader count.
EmitAtomicAddI32(context, Const((ulong)IntPtr.Add(nativeReaderLockPtr, NativeReaderWriterLock.ReaderCountOffset)), Const(1));
// Release write lock.
context.CompareAndSwap(writeLockPtr, Const(1), Const(0));
}
private static void EmitNativeReaderLockRelease(EmitterContext context, IntPtr nativeReaderLockPtr)
{
// Decrement reader count.
EmitAtomicAddI32(context, Const((ulong)IntPtr.Add(nativeReaderLockPtr, NativeReaderWriterLock.ReaderCountOffset)), Const(-1));
}
}
}

View File

@ -3,7 +3,7 @@ using System.Runtime.InteropServices;
namespace ARMeilleure.Signal
{
class WindowsSignalHandlerRegistration
unsafe class WindowsSignalHandlerRegistration
{
[DllImport("kernel32.dll")]
private static extern IntPtr AddVectoredExceptionHandler(uint first, IntPtr handler);
@ -11,6 +11,14 @@ namespace ARMeilleure.Signal
[DllImport("kernel32.dll")]
private static extern ulong RemoveVectoredExceptionHandler(IntPtr handle);
[DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
static extern IntPtr LoadLibrary([MarshalAs(UnmanagedType.LPStr)] string lpFileName);
[DllImport("kernel32.dll", CharSet = CharSet.Ansi, ExactSpelling = true, SetLastError = true)]
private static extern IntPtr GetProcAddress(IntPtr hModule, string procName);
private static IntPtr _getCurrentThreadIdPtr;
public static IntPtr RegisterExceptionHandler(IntPtr action)
{
return AddVectoredExceptionHandler(1, action);
@ -20,5 +28,17 @@ namespace ARMeilleure.Signal
{
return RemoveVectoredExceptionHandler(handle) != 0;
}
public static IntPtr GetCurrentThreadIdFunc()
{
if (_getCurrentThreadIdPtr == IntPtr.Zero)
{
IntPtr handle = LoadLibrary("kernel32.dll");
_getCurrentThreadIdPtr = GetProcAddress(handle, "GetCurrentThreadId");
}
return _getCurrentThreadIdPtr;
}
}
}

View File

@ -14,7 +14,7 @@ namespace ARMeilleure.Translation
public BasicBlock Entry { get; }
public IntrusiveList<BasicBlock> Blocks { get; }
public BasicBlock[] PostOrderBlocks => _postOrderBlocks;
public int[] PostOrderMap => _postOrderMap;
public int[] PostOrderMap => _postOrderMap;
public ControlFlowGraph(BasicBlock entry, IntrusiveList<BasicBlock> blocks, int localsCount)
{

View File

@ -127,7 +127,7 @@ namespace ARMeilleure.Translation
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpcr)));
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpscr))); // A32 only.
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsr)));
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc))); // A32 only.
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetFpsrQc)));
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetTpidrEl0)));
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SetTpidrEl032))); // A32 only.
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.SignalMemoryTracking)));
@ -140,12 +140,6 @@ namespace ARMeilleure.Translation
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.WriteUInt64)));
SetDelegateInfo(typeof(NativeInterface).GetMethod(nameof(NativeInterface.WriteVector128)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinarySignedSatQAcc)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinarySignedSatQAdd)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinarySignedSatQSub)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinaryUnsignedSatQAcc)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinaryUnsignedSatQAdd)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.BinaryUnsignedSatQSub)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.CountLeadingSigns)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.CountLeadingZeros)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Crc32b)));
@ -188,8 +182,6 @@ namespace ARMeilleure.Translation
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.SignedShlReg)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.SignedShlRegSatQ)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.SignedShrImm64)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.SignedSrcSignedDstSatQ)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.SignedSrcUnsignedDstSatQ)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Tbl1)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Tbl2)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Tbl3)));
@ -198,12 +190,9 @@ namespace ARMeilleure.Translation
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Tbx2)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Tbx3)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Tbx4)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.UnarySignedSatQAbsOrNeg)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.UnsignedShlReg)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.UnsignedShlRegSatQ)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.UnsignedShrImm64)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.UnsignedSrcSignedDstSatQ)));
SetDelegateInfo(typeof(SoftFallback).GetMethod(nameof(SoftFallback.UnsignedSrcUnsignedDstSatQ)));
SetDelegateInfo(typeof(SoftFloat16_32).GetMethod(nameof(SoftFloat16_32.FPConvert)));
SetDelegateInfo(typeof(SoftFloat16_64).GetMethod(nameof(SoftFloat16_64.FPConvert)));

View File

@ -19,8 +19,6 @@ namespace ARMeilleure.Translation
public int Count => _count;
public IntervalTree() { }
#region Public Methods
/// <summary>
@ -344,7 +342,7 @@ namespace ARMeilleure.Translation
}
/// <summary>
/// Removes the value from the dictionary after searching for it with <paramref name="key">.
/// Removes the value from the dictionary after searching for it with <paramref name="key"/>.
/// </summary>
/// <param name="key">Key to search for</param>
/// <returns>Number of deleted values</returns>

View File

@ -27,7 +27,7 @@ namespace ARMeilleure.Translation.PTC
private const string OuterHeaderMagicString = "PTCohd\0\0";
private const string InnerHeaderMagicString = "PTCihd\0\0";
private const uint InternalVersion = 3439; //! To be incremented manually for each change to the ARMeilleure project.
private const uint InternalVersion = 3683; //! To be incremented manually for each change to the ARMeilleure project.
private const string ActualDir = "0";
private const string BackupDir = "1";
@ -946,9 +946,12 @@ namespace ARMeilleure.Translation.PTC
return BitConverter.IsLittleEndian;
}
private static ulong GetFeatureInfo()
private static FeatureInfo GetFeatureInfo()
{
return (ulong)HardwareCapabilities.FeatureInfoEdx << 32 | (uint)HardwareCapabilities.FeatureInfoEcx;
return new FeatureInfo(
(uint)HardwareCapabilities.FeatureInfo1Ecx,
(uint)HardwareCapabilities.FeatureInfo1Edx,
(uint)HardwareCapabilities.FeatureInfo7Ebx);
}
private static byte GetMemoryManagerMode()
@ -968,7 +971,7 @@ namespace ARMeilleure.Translation.PTC
return osPlatform;
}
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 50*/)]
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 54*/)]
private struct OuterHeader
{
public ulong Magic;
@ -976,7 +979,7 @@ namespace ARMeilleure.Translation.PTC
public uint CacheFileVersion;
public bool Endianness;
public ulong FeatureInfo;
public FeatureInfo FeatureInfo;
public byte MemoryManagerMode;
public uint OSPlatform;
@ -999,6 +1002,9 @@ namespace ARMeilleure.Translation.PTC
}
}
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 12*/)]
private record struct FeatureInfo(uint FeatureInfo0, uint FeatureInfo1, uint FeatureInfo2);
[StructLayout(LayoutKind.Sequential, Pack = 1/*, Size = 128*/)]
private struct InnerHeader
{

View File

@ -44,15 +44,11 @@ namespace ARMeilleure.Translation
private readonly IJitMemoryAllocator _allocator;
private readonly ConcurrentQueue<KeyValuePair<ulong, TranslatedFunction>> _oldFuncs;
private readonly ConcurrentDictionary<ulong, object> _backgroundSet;
private readonly ConcurrentStack<RejitRequest> _backgroundStack;
private readonly AutoResetEvent _backgroundTranslatorEvent;
private readonly ReaderWriterLock _backgroundTranslatorLock;
internal TranslatorCache<TranslatedFunction> Functions { get; }
internal AddressTable<ulong> FunctionTable { get; }
internal EntryTable<uint> CountTable { get; }
internal TranslatorStubs Stubs { get; }
internal TranslatorQueue Queue { get; }
internal IMemoryManager Memory { get; }
private volatile int _threadCount;
@ -67,10 +63,7 @@ namespace ARMeilleure.Translation
_oldFuncs = new ConcurrentQueue<KeyValuePair<ulong, TranslatedFunction>>();
_backgroundSet = new ConcurrentDictionary<ulong, object>();
_backgroundStack = new ConcurrentStack<RejitRequest>();
_backgroundTranslatorEvent = new AutoResetEvent(false);
_backgroundTranslatorLock = new ReaderWriterLock();
Queue = new TranslatorQueue();
JitCache.Initialize(allocator);
@ -87,43 +80,6 @@ namespace ARMeilleure.Translation
}
}
private void TranslateStackedSubs()
{
while (_threadCount != 0)
{
_backgroundTranslatorLock.AcquireReaderLock(Timeout.Infinite);
if (_backgroundStack.TryPop(out RejitRequest request) &&
_backgroundSet.TryRemove(request.Address, out _))
{
TranslatedFunction func = Translate(request.Address, request.Mode, highCq: true);
Functions.AddOrUpdate(request.Address, func.GuestSize, func, (key, oldFunc) =>
{
EnqueueForDeletion(key, oldFunc);
return func;
});
if (PtcProfiler.Enabled)
{
PtcProfiler.UpdateEntry(request.Address, request.Mode, highCq: true);
}
RegisterFunction(request.Address, func);
_backgroundTranslatorLock.ReleaseReaderLock();
}
else
{
_backgroundTranslatorLock.ReleaseReaderLock();
_backgroundTranslatorEvent.WaitOne();
}
}
// Wake up any other background translator threads, to encourage them to exit.
_backgroundTranslatorEvent.Set();
}
public void Execute(State.ExecutionContext context, ulong address)
{
if (Interlocked.Increment(ref _threadCount) == 1)
@ -155,7 +111,7 @@ namespace ARMeilleure.Translation
{
bool last = i != 0 && i == unboundedThreadCount - 1;
Thread backgroundTranslatorThread = new Thread(TranslateStackedSubs)
Thread backgroundTranslatorThread = new Thread(BackgroundTranslate)
{
Name = "CPU.BackgroundTranslatorThread." + i,
Priority = last ? ThreadPriority.Lowest : ThreadPriority.Normal
@ -186,10 +142,9 @@ namespace ARMeilleure.Translation
if (Interlocked.Decrement(ref _threadCount) == 0)
{
_backgroundTranslatorEvent.Set();
ClearJitCache();
Queue.Dispose();
Stubs.Dispose();
FunctionTable.Dispose();
CountTable.Dispose();
@ -317,6 +272,27 @@ namespace ARMeilleure.Translation
return new TranslatedFunction(func, counter, funcSize, highCq);
}
private void BackgroundTranslate()
{
while (_threadCount != 0 && Queue.TryDequeue(out RejitRequest request))
{
TranslatedFunction func = Translate(request.Address, request.Mode, highCq: true);
Functions.AddOrUpdate(request.Address, func.GuestSize, func, (key, oldFunc) =>
{
EnqueueForDeletion(key, oldFunc);
return func;
});
if (PtcProfiler.Enabled)
{
PtcProfiler.UpdateEntry(request.Address, request.Mode, highCq: true);
}
RegisterFunction(request.Address, func);
}
}
private struct Range
{
public ulong Start { get; }
@ -504,11 +480,7 @@ namespace ARMeilleure.Translation
internal void EnqueueForRejit(ulong guestAddress, ExecutionMode mode)
{
if (_backgroundSet.TryAdd(guestAddress, null))
{
_backgroundStack.Push(new RejitRequest(guestAddress, mode));
_backgroundTranslatorEvent.Set();
}
Queue.Enqueue(guestAddress, mode);
}
private void EnqueueForDeletion(ulong guestAddress, TranslatedFunction func)
@ -542,26 +514,23 @@ namespace ARMeilleure.Translation
private void ClearRejitQueue(bool allowRequeue)
{
_backgroundTranslatorLock.AcquireWriterLock(Timeout.Infinite);
if (allowRequeue)
if (!allowRequeue)
{
while (_backgroundStack.TryPop(out var request))
Queue.Clear();
return;
}
lock (Queue.Sync)
{
while (Queue.Count > 0 && Queue.TryDequeue(out RejitRequest request))
{
if (Functions.TryGetValue(request.Address, out var func) && func.CallCounter != null)
{
Volatile.Write(ref func.CallCounter.Value, 0);
}
_backgroundSet.TryRemove(request.Address, out _);
}
}
else
{
_backgroundStack.Clear();
}
_backgroundTranslatorLock.ReleaseWriterLock();
}
}
}

View File

@ -0,0 +1,121 @@
using ARMeilleure.Diagnostics;
using ARMeilleure.State;
using System;
using System.Collections.Generic;
using System.Threading;
namespace ARMeilleure.Translation
{
/// <summary>
/// Represents a queue of <see cref="RejitRequest"/>.
/// </summary>
/// <remarks>
/// This does not necessarily behave like a queue, i.e: a FIFO collection.
/// </remarks>
sealed class TranslatorQueue : IDisposable
{
private bool _disposed;
private readonly Stack<RejitRequest> _requests;
private readonly HashSet<ulong> _requestAddresses;
/// <summary>
/// Gets the object used to synchronize access to the <see cref="TranslatorQueue"/>.
/// </summary>
public object Sync { get; }
/// <summary>
/// Gets the number of requests in the <see cref="TranslatorQueue"/>.
/// </summary>
public int Count => _requests.Count;
/// <summary>
/// Initializes a new instance of the <see cref="TranslatorQueue"/> class.
/// </summary>
public TranslatorQueue()
{
Sync = new object();
_requests = new Stack<RejitRequest>();
_requestAddresses = new HashSet<ulong>();
}
/// <summary>
/// Enqueues a request with the specified <paramref name="address"/> and <paramref name="mode"/>.
/// </summary>
/// <param name="address">Address of request</param>
/// <param name="mode"><see cref="ExecutionMode"/> of request</param>
public void Enqueue(ulong address, ExecutionMode mode)
{
lock (Sync)
{
if (_requestAddresses.Add(address))
{
_requests.Push(new RejitRequest(address, mode));
TranslatorEventSource.Log.RejitQueueAdd(1);
Monitor.Pulse(Sync);
}
}
}
/// <summary>
/// Tries to dequeue a <see cref="RejitRequest"/>. This will block the thread until a <see cref="RejitRequest"/>
/// is enqueued or the <see cref="TranslatorQueue"/> is disposed.
/// </summary>
/// <param name="result"><see cref="RejitRequest"/> dequeued</param>
/// <returns><see langword="true"/> on success; otherwise <see langword="false"/></returns>
public bool TryDequeue(out RejitRequest result)
{
while (!_disposed)
{
lock (Sync)
{
if (_requests.TryPop(out result))
{
_requestAddresses.Remove(result.Address);
TranslatorEventSource.Log.RejitQueueAdd(-1);
return true;
}
Monitor.Wait(Sync);
}
}
result = default;
return false;
}
/// <summary>
/// Clears the <see cref="TranslatorQueue"/>.
/// </summary>
public void Clear()
{
lock (Sync)
{
TranslatorEventSource.Log.RejitQueueAdd(-_requests.Count);
_requests.Clear();
_requestAddresses.Clear();
Monitor.PulseAll(Sync);
}
}
/// <summary>
/// Releases all resources used by the <see cref="TranslatorQueue"/> instance.
/// </summary>
public void Dispose()
{
if (!_disposed)
{
_disposed = true;
Clear();
}
}
}
}

View File

@ -36,8 +36,8 @@
## Compatibility
As of January 2022, Ryujinx has been tested on approximately 3,500 titles; over 3,200 boot past menus and into gameplay, with roughly 2,500 of those being considered playable.
You can check out the compatibility list [here](https://github.com/Ryujinx/Ryujinx-Games-List/issues). Anyone is free to submit an updated test on an existing game entry; simply follow the new issue template and testing guidelines, or post as a reply to the applicable game issue. Use the search function to see if a game has been tested already!
As of September 2022, Ryujinx has been tested on approximately 3,600 titles; over 3,400 boot past menus and into gameplay, with roughly 2,700 of those being considered playable. You can check out the compatibility list [here](https://github.com/Ryujinx/Ryujinx-Games-List/issues).
Anyone is free to submit a new game test or update an existing game test entry; simply follow the new issue template and testing guidelines, or post as a reply to the applicable game issue. Use the search function to see if a game has been tested already!
## Usage
@ -118,11 +118,11 @@ If you'd like to support the project financially, Ryujinx has an active Patreon
<img src="https://images.squarespace-cdn.com/content/v1/560c1d39e4b0b4fae0c9cf2a/1567548955044-WVD994WZP76EWF15T0L3/Patreon+Button.png?format=500w" width="150">
</a>
All the developers working on the project do so on their free time, but the project has several expenses:
All developers working on the project do so in their free time, but the project has several expenses:
* Hackable Nintendo Switch consoles to reverse-engineer the hardware
* Additional computer hardware for testing purposes (e.g. GPUs to diagnose graphical bugs, etc.)
* Licenses for various software development tools (e.g. Jetbrains, LDN servers, IDA)
* Web hosting and infrastructure maintenance
* Licenses for various software development tools (e.g. Jetbrains, IDA)
* Web hosting and infrastructure maintenance (e.g. LDN servers)
All funds received through Patreon are considered a donation to support the project. Patrons receive early access to progress reports and exclusive access to developer interviews.

View File

@ -4,7 +4,6 @@ using Ryujinx.Memory;
using Ryujinx.SDL2.Common;
using System;
using System.Collections.Concurrent;
using System.Runtime.InteropServices;
using System.Threading;
using static Ryujinx.Audio.Integration.IHardwareDeviceDriver;

Some files were not shown because too many files have changed in this diff Show More