Compare commits

...

240 Commits

Author SHA1 Message Date
730d2f4b9b Address gdkchan's comment 2022-08-31 21:33:03 +02:00
f6a7309b14 account: Implement LoadNetworkServiceLicenseKindAsync
This is needed to run Pokemon Legends Arceus 1.1.1 with guest internet enabled.

The game still get stuck at loading screen.
2022-08-31 21:33:03 +02:00
472a621589 Bsd: Fix ArgumentOutOfRangeException in SetSocketOption (#3633)
* Bsd: Fix ArgumentOutOfRangeException in SetSocketOption

* Ensure option level is Socket before checking for SoLinger
2022-08-28 14:24:19 +00:00
311c2661b8 Replace image format magic numbers with enums (#3631)
* Replace magic constants with enums

* Extra formatting

* Lower case ASTC dimensions

* Use uint for VertexAttributeFormat
2022-08-28 01:56:26 +00:00
a92e2028cb Updates Japanese localization for the Avalonia UI (#3635) 2022-08-27 07:01:30 +00:00
6922862db8 Optimize kernel memory block lookup and consolidate RBTree implementations (#3410)
* Implement intrusive red-black tree, use it for HLE kernel block manager

* Implement TreeDictionary using IntrusiveRedBlackTree

* Implement IntervalTree using IntrusiveRedBlackTree

* Implement IntervalTree (on Ryujinx.Memory) using IntrusiveRedBlackTree

* Make PredecessorOf and SuccessorOf internal, expose Predecessor and Successor properties on the node itself

* Allocation free tree node lookup
2022-08-26 18:21:48 +00:00
6592d64751 Update Turkish Translation (#3498)
Translated newly added lines and polished older entries.
2022-08-26 18:00:17 +00:00
8001c832d9 Update de_DE.json (#3502)
* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Another one

* Update de_DE.json

* addressed reviews

Co-Authored-By: Miepee <38186597+Miepee@users.noreply.github.com>

* welp

Co-Authored-By: Miepee <38186597+Miepee@users.noreply.github.com>

* Update de_DE.json

* Update de_DE.json

quick update with the latest changes

Co-Authored-By: Miepee <38186597+Miepee@users.noreply.github.com>

* Update de_DE.json

Co-Authored-By: Miepee <38186597+Miepee@users.noreply.github.com>

Co-authored-by: Miepee <38186597+Miepee@users.noreply.github.com>
Co-authored-by: reloxx13 <reloxx@interia.pl>
2022-08-26 19:49:05 +02:00
87919b193c Update zh_CN.json (#3598)
* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Delete zh_CN.json

* fix crash

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json
2022-08-26 19:36:42 +02:00
8de033e60e Avalonia - Add Polish Translation (#3569)
* Update Ryujinx.Ava.csproj

* Update MainWindow.axaml

* Create pl_PL.json

* Update pl_PL.json

* Update pl_PL.json

* Update pl_PL.json

* Update pl_PL.json

* PPTC wording changes

adding PPTC changes

Co-authored-by: Clara <moonbowjelly@gmail.com>
2022-08-26 19:24:59 +02:00
90432946ac Avalonia - Display language names in their corresponding language under "Change Language" (#3490)
* change languages to their native names

* fix Chinese language names

* Update MainWindow.axaml
2022-08-26 19:12:11 +02:00
9bad71afbf bsd: Fix Poll writting in input buffer (#3630)
This is a very old oversight on our Poll implementation.
This worked so far reliably because games and homebrews pass the same
buffer as input and output.
2022-08-26 18:10:45 +02:00
923089a298 Fast path for Inline-to-Memory texture data transfers (#3610)
* Fast path for Inline-to-Memory texture data transfers

* Only do it for block linear textures to be on the safe side
2022-08-26 02:16:41 +00:00
d9aa15eb24 pctl: Implement EndFreeCommunication
This PR Implement `EndFreeCommunication` (checked by RE). Nothing more.

Closes #2420
2022-08-25 23:18:37 +02:00
12c89a61f9 misc: Fix missing null terminator for strings with pchtxt (#3629)
As title say.
2022-08-25 19:59:15 +00:00
f5235fff29 ARMeilleure: Hardware accelerate SHA256 (#3585)
* ARMeilleure/HardwareCapabilities: Add Sha

* ARMeilleure/Intrinsic: Add X86Sha256Rnds2

* ARmeilleure: Hardware accelerate SHA256H/SHA256H2

* ARMeilleure/Intrinsic: Add X86Sha256Msg1, X86Sha256Msg2

* ARMeilleure/Intrinsic: Add X86Palignr

* ARMeilleure: Hardware accelerate SHA256SU0, SHA256SU1

* PTC: Bump InternalVersion
2022-08-25 10:12:13 +00:00
eba682b767 Implement some 32-bit Thumb instructions (#3614)
* Implement some 32-bit Thumb instructions

* Optimize OpCode32MemMult using PopCount
2022-08-25 09:59:34 +00:00
b994dafe7a Update PPTC dialog text to match label and tooltip (#3618)
* Update PPTC dialog text to match label and tooltip

* Update to requested text

* Reverting spaces

* Adding newline back in
2022-08-24 08:25:49 +00:00
54421760c3 Check if game directories have been updated before refreshing GUI (#3607)
* Check if game directories have been updated before refreshing list on save.

* Cleanup spacing

* Add Avalonia and reset value after saving

* Fix Avalonia

* Fix multiple directories not being added in GTK
2022-08-21 13:07:28 +00:00
88a0e720cb Use RGBA16 vertex format if RGB16 is not supported on Vulkan (#3552)
* Use RGBA16 vertex format if RGB16 is not supported on Vulkan

* Catch all shader compilation exceptions
2022-08-20 16:20:27 -03:00
53cc9e0561 Change 'Purge PPTC Cache' label & tooltip to reflect function behavior (#3601)
* Change PPTC purge label & tooltip

* Change Avalonia labels
2022-08-19 23:39:59 +00:00
7defc59b9d A few minor documentation fixes. (#3599)
* A few minor documentation fixes.

* Removed more invalid inheritdoc instances.
2022-08-19 18:21:06 -03:00
951700fdd8 Removed unused usings. (#3593)
* Removed unused usings.

* Added back using, now that it's used.

* Removed extra whitespace.
2022-08-18 18:04:54 +02:00
eb6430f103 Skipped over the last "Count" key explicitly, instead of relying on an exception. (#3595) 2022-08-18 02:00:27 +02:00
80a879cb44 Fix SpirV parse failure (#3597)
* Added .ToString overrides, to help diagnose and debug SpirV generated code.

* Added Spirv to team shared dictionary, so the word will not show up as a warning.

* Fixed bug where we were creating invalid constants (bool 0i and float 0i)

* Update Ryujinx.Graphics.Shader/CodeGen/Spirv/CodeGenContext.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Update Spv.Generator/Instruction.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Adjusted spacing to match style of the rest of the code.

* Added handler for FP64(double) as well, for undefined aggregate types.

* Made the operand labels a static dictionary, to avoid re-allocation on each call.
Replaced Contains/Get with a TryGetValue, to reduce the number of dictionary lookups.

* Added newline between AllOperands and ToString().

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-08-18 01:49:43 +02:00
2197f41506 Removed extra semicolons. (#3594) 2022-08-17 09:05:15 +02:00
c8f9292bab Avalonia - Couple fixes and improvements to vulkan (#3483)
* drop split devices, rebase

* add fallback to opengl if vulkan is not available

* addressed review

* ensure present image references are incremented and decremented when necessary

* allow changing vsync for vulkan

* fix screenshot on avalonia vulkan

* save favorite when toggled

* improve sync between popups

* use separate devices for each new window

* fix crash when closing window

* addressed review

* don't create the main window with immediate mode

* change skia vk delegate to method

* update vulkan throwonerror

* addressed review
2022-08-16 16:32:37 +00:00
0ec933a615 Vulkan: Add ETC2 texture formats (#3576) 2022-08-16 15:42:42 +02:00
2135b6a51a am: Stub SetWirelessPriorityMode, SetWirelessPriorityMode and GetHdcpAuthenticationState (#3535)
This PR some calls in `am` service:
- ISelfController: SetWirelessPriorityMode, SaveCurrentScreenshot (Partially checked by RE).
- ICommonStateGetter: GetHdcpAuthenticationState

Close #1831 and close #3527
2022-08-15 11:12:08 +00:00
00e35d9bf6 ControllerApplet: Override player counts when SingleMode is set (#3571) 2022-08-15 09:46:08 +02:00
6dfb6ccf8c PreAllocator: Check if instruction supports a Vex prefix in IsVexSameOperandDestSrc1 (#3587) 2022-08-14 17:35:08 -03:00
e87e8b012c Fix texture bindings using wrong sampler pool in some cases (#3583) 2022-08-14 14:00:30 -03:00
e8f1ca8427 OpenGL: Limit vertex buffer range for non-indexed draws (#3542)
* Limit vertex buffer range for non-indexed draws

* Fix typo
2022-08-11 20:21:56 -03:00
ad47bd2d4e Fix blend with RGBX color formats (#3553) 2022-08-11 18:23:25 -03:00
a5ff0024fb Rename ToSpan to AsSpan (#3556) 2022-08-11 18:07:37 -03:00
f9661a54d2 add Japanese translation to Avalonia UI (#3489)
* add Japanese translation to Avalonia UI

* translate language names

* fix raised in the review
2022-08-11 17:55:14 -03:00
66e7fdb871 OpenGL: Fix clears of unbound color targets (#3564) 2022-08-08 17:39:22 +00:00
2bb9b33da1 Implement Arm32 Sha256 and MRS Rd, CPSR instructions (#3544)
* Implement Arm32 Sha256 and MRS Rd, CPSR instructions

* Add tests using Arm64 outputs
2022-08-05 19:03:50 +02:00
1080f64df9 Implement HLE macros for render target clears (#3528)
* Implement HLE macros for render target clears

* Add constants for the offsets
2022-08-04 21:30:08 +00:00
c48a75979f Fix Multithreaded Compilation of Shader Cache on OpenGL (#3540)
This was broken by the Vulkan changes - OpenGL was building host caches at boot on one thread, which is very notably slower than when it is multithreaded.

This was caused by trying to get the program binary immediately after compilation started, which blocks. Now it does it after compilation has completed.
2022-08-03 19:37:56 -03:00
842cb26ba5 Sfdnsres; Stub ResolverSetOptionRequest (#3493)
This PR stub ResolverSetOptionRequest (checked by RE), but the options parsing is still missing since we don't support it in our current code.

(Close #3479)
2022-08-03 00:10:28 +02:00
e235d5e7bb Fix resolution scale values not being updated (#3514) 2022-08-02 23:58:56 +02:00
ed0b10c81f Fix geometry shader passthrough fallback being used when feature is supported (#3525)
* Fix geometry shader passthrough fallback being used when feature is supported

* Shader cache version bump
2022-08-02 08:44:30 +02:00
f92650fcff SPIR-V: Initialize undefined variables with 0 (#3526)
* SPIR-V: Initialize undefined variables with a value

Changes undefined values on spir-v shaders (caused by phi nodes) to be initialized instead of truly undefined.

Fixes an issue with NVIDIA gpus seemingly not liking when a variable is _potentially_ undefined. Not sure about the details at the moment.

Fixes:
- Tilt shift blur effect in Link's Awakening (bottom of the screen)
- Potentially block flickering on newer NVIDIA gpus in Splatoon 2? Needs testing.

Testing is welcome.

* Update Ryujinx.Graphics.Shader/CodeGen/Spirv/CodeGenContext.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-08-02 08:11:10 +02:00
712361f6e1 vk: Workaround XCB not availaible on FlatHub build (#3515)
Update SPB to 0.0.4-build24 which hopefully fix the issue by checking
libX11-xcb presence.
2022-08-01 08:46:19 +02:00
2232e4ae87 Vulkan backend (#2518)
* WIP Vulkan implementation

* No need to initialize attributes on the SPIR-V backend anymore

* Allow multithreading shaderc and vkCreateShaderModule

You'll only really see the benefit here with threaded-gal or parallel shader cache compile.

Fix shaderc multithreaded changes

Thread safety for shaderc Options constructor

Dunno how they managed to make a constructor not thread safe, but you do you. May avoid some freezes.

* Support multiple levels/layers for blit.

Fixes MK8D when scaled, maybe a few other games. AMD software "safe" blit not supported right now.

* TextureStorage should hold a ref of the foreign storage, otherwise it might be freed while in use

* New depth-stencil blit method for AMD

* Workaround for AMD driver bug

* Fix some tessellation related issues (still doesn't work?)

* Submit command buffer before Texture GetData. (UE4 fix)

* DrawTexture support

* Fix BGRA on OpenGL backend

* Fix rebase build break

* Support format aliasing on SetImage

* Fix uniform buffers being lost when bindings are out of order

* Fix storage buffers being lost when bindings are out of order

(also avoid allocations when changing bindings)

* Use current command buffer for unscaled copy (perf)

Avoids flushing commands and renting a command buffer when fulfilling copy dependencies and when games do unscaled copies.

* Update to .net6

* Update Silk.NET to version 2.10.1

Somehow, massive performance boost. Seems like their vtable for looking up vulkan methods was really slow before.

* Fix PrimitivesGenerated query, disable Transform Feedback queries for now

Lets Splatoon 2 work on nvidia. (mostly)

* Update counter queue to be similar to the OGL one

Fixes softlocks when games had to flush counters.

* Don't throw when ending conditional rendering for now

This should be re-enabled when conditional rendering is enabled on nvidia etc.

* Update findMSB/findLSB to match master's instruction enum

* Fix triangle overlay on SMO, Captain Toad, maybe others?

* Don't make Intel Mesa pay for Intel Windows bugs

* Fix samplers with MinFilter Linear or Nearest (fixes New Super Mario Bros U Deluxe black borders)

* Update Spv.Generator

* Add alpha test emulation on shader (but no shader specialisation yet...)

* Fix R4G4B4A4Unorm texture format permutation

* Validation layers should be enabled for any log level other than None

* Add barriers around vkCmdCopyImage

Write->Read barrier for src image (we want to wait for a write to read it)
Write->Read barrier for dst image (we want to wait for the copy to complete before use)

* Be a bit more careful with texture access flags, since it can be used for anything

* Device local mapping for all buffers

May avoid issues with drivers with NVIDIA on linux/older gpus on windows when using large buffers (?)
Also some performance things and fixes issues with opengl games loading textures weird.

* Cleanup, disable device local buffers for now.

* Add single queue support

Multiqueue seems to be a bit more responsive on NVIDIA. Should fix texture flush on intel. AMD has been forced to single queue for an experiment.

* Fix some validation errors around extended dynamic state

* Remove Intel bug workaround, it was fixed on the latest driver

* Use circular queue for checking consumption on command buffers

Speeds up games that spam command buffers a little. Avoids checking multiple command buffers if multiple are active at once.

* Use SupportBufferUpdater, add single layer flush

* Fix counter queue leak when game decides to use host conditional rendering

* Force device local storage for textures (fixes linux performance)

* Port #3019

* Insert barriers around vkCmdBlitImage (may fix some amd flicker)

* Fix transform feedback on Intel, gl_Position feedback and clears to inexistent depth buffers

* Don't pause transform feedback for multi draw

* Fix draw outside of render pass and missing capability

* Workaround for wrong last attribute on AMD (affects FFVII, STRIKERS1945, probably more)

* Better workaround for AMD vertex buffer size alignment issue

* More instructions + fixes on SPIR-V backend

* Allow custom aspect ratio on Vulkan

* Correct GTK UI status bar positions

* SPIR-V: Functions must always end with a return

* SPIR-V: Fix ImageQuerySizeLod

* SPIR-V: Set DepthReplacing execution mode when FragDepth is modified

* SPIR-V: Implement LoopContinue IR instruction

* SPIR-V: Geometry shader support

* SPIR-V: Use correct binding number on storage buffers array

* Reduce allocations for Spir-v serialization

Passes BinaryWriter instead of the stream to Write and WriteOperand

- Removes creation of BinaryWriter for each instruction
- Removes allocations for literal string

* Some optimizations to Spv.Generator

- Dictionary for lookups of type declarations, constants, extinst
- LiteralInteger internal data format -> ushort
- Deterministic HashCode implementation to avoid spirv result not being the same between runs
- Inline operand list instead of List<T>, falls back to array if many operands. (large performance boost)

TODO: improve instruction allocation, structured program creator, ssa?

* Pool Spv.Generator resources, cache delegates, spv opts

- Pools for Instructions and LiteralIntegers. Can be passed in when creating the generator module.
  - NewInstruction is called instead of new Instruction()
  - Ryujinx SpirvGenerator passes in some pools that are static. The idea is for these to be shared between threads eventually.
- Estimate code size when creating the output MemoryStream
- LiteralInteger pools using ThreadStatic pools that are initialized before and after creation... not sure of a better way since the way these are created is via implicit cast.

Also, cache delegates for Spv.Generator for functions that are passed around to GenerateBinary etc, since passing the function raw creates a delegate on each call.

TODO: update python spv cs generator to make the coregrammar with NewInstruction and the `params` overloads.

* LocalDefMap for Ssa Rewriter

Rather than allocating a large array of all registers for each block in the shader, allocate one array of all registers and clear it between blocks. Reduces allocations in the shader translator.

* SPIR-V: Transform feedback support

* SPIR-V: Fragment shader interlock support (and image coherency)

* SPIR-V: Add early fragment tests support

* SPIR-V: Implement SwizzleAdd, add missing Triangles ExecutionMode for geometry shaders, remove SamplerType field from TextureMeta

* Don't pass depth clip state right now (fix decals)

Explicitly disabling it is incorrect. OpenGL currently automatically disables based on depth clamp, which is the behaviour if this state is omitted.

* Multisampling support

* Multisampling: Use resolve if src samples count > dst samples count

* Multisampling: We can only resolve for unscaled copies

* SPIR-V: Only add FSI exec mode if used.

* SPIR-V: Use ConstantComposite for Texture Offset Vector

Fixes a bunch of freezes with SPIR-V on AMD hardware, and validation errors. Note: Obviously assumes input offsets are constant, which they currently are.

* SPIR-V: Don't OpReturn if we already OpExit'ed

Fixes spir-v parse failure and stack smashing in RADV (obviously you still need bolist)

* SPIR-V: Only use input attribute type for input attributes

Output vertex attributes should always be of type float.

* Multithreaded Pipeline Compilation

* Address some feedback

* Make this 32

* Update topology with GpuAccessorState

* Cleanup for merge (note: disables spir-v)

* Make more robust to shader compilation failure

- Don't freeze when GLSL compilation fails
- Background SPIR-V pipeline compile failure results in skipped draws, similar to GLSL compilation failure.

* Fix Multisampling

* Only update fragment scale count if a vertex texture needs a scale.

Fixes a performance regression introduced by texture scaling in the vertex stage where support buffer updates would be very frequent, even at 1x, if any textures were used on the vertex stage.

This check doesn't exactly look cheap (a flag in the shader stage would probably be preferred), but it is much cheaper than uploading scales in both vulkan and opengl, so it will do for now.

* Use a bitmap to do granular tracking for buffer uploads.

This path is only taken if the much faster check of "is the buffer rented at all" is triggered, so it doesn't actually end up costing too much, and the time saved by not ending render passes (and on gpu for not waiting on barriers) is probably helpful.

Avoids ending render passes to update buffer data (not all the time)
- 140-180 to 35-45 in SMO metro kingdom (these updates are in the UI)
- Very variable 60-150(!) to 16-25 in mario kart 8 (these updates are in the UI)

As well as allowing more data to be preloaded persistently, this will also allow more data to be loaded in the preload buffer, which should be faster as it doesn't need to insert barriers between draws. (and on tbdr, does not need to flush and reload tile memory)

Improves performance in GPU limited scenarios. Should notably improve performance on TBDR gpus. Still a lot more to do here.

* Copy query results after RP ends, rather than ending to copy

We need to end the render pass to get the data (submit command buffer) anyways...

Reduces render passes created in games that use queries.

* Rework Query stuff a bit to avoid render pass end

Tries to reset returned queries in background when possible, rather than ending the render pass.

Still ends render pass when resetting a counter after draws, but maybe that can be solved too. (by just pulling an empty object off the pool?)

* Remove unnecessary lines

Was for testing

* Fix validation error for query reset

Need to think of a better way to do this.

* SPIR-V: Fix SwizzleAdd and some validation errors

* SPIR-V: Implement attribute indexing and StoreAttribute

* SPIR-V: Fix TextureSize for MS and Buffer sampler types

* Fix relaunch issues

* SPIR-V: Implement LogicalExclusiveOr

* SPIR-V: Constant buffer indexing support

* Ignore unsupported attributes rather than throwing (matches current GLSL behaviour)

* SPIR-V: Implement tessellation support

* SPIR-V: Geometry shader passthrough support

* SPIR-V: Implement StoreShader8/16 and StoreStorage8/16

* SPIR-V: Resolution scale support and fix TextureSample multisample with LOD bug

* SPIR-V: Fix field index for scale count

* SPIR-V: Fix another case of wrong field index

* SPIRV/GLSL: More scaling related fixes

* SPIR-V: Fix ImageLoad CompositeExtract component type

* SPIR-V: Workaround for Intel FrontFacing bug

* Enable SPIR-V backend by default

* Allow null samplers (samplers are not required when only using texelFetch to access the texture)

* Fix some validation errors related to texel block view usage flag and invalid image barrier base level

* Use explicit subgroup size if we can (might fix some block flickering on AMD)

* Take componentMask and scissor into account when clearing framebuffer attachments

* Add missing barriers around CmdFillBuffer (fixes Monster Hunter Rise flickering on NVIDIA)

* Use ClampToEdge for Clamp sampler address mode on Vulkan (fixes Hollow Knight)

Clamp is unsupported on Vulkan, but ClampToEdge behaves almost the same. ClampToBorder on the other hand (which was being used before) is pretty different

* Shader specialization for new Vulkan required state (fixes remaining alpha test issues, vertex stretching on AMD on Crash Bandicoot, etc)

* Check if the subgroup size is supported before passing a explicit size

* Only enable ShaderFloat64 if the GPU supports it

* We don't need to recompile shaders if alpha test state changed but alpha test is disabled

* Enable shader cache on Vulkan and implement MultiplyHighS32/U32 on SPIR-V (missed those before)

* Fix pipeline state saving before it is updated.

This should fix a few warnings and potential stutters due to bad pipeline states being saved in the cache. You may need to clear your guest cache.

* Allow null samplers on OpenGL backend

* _unit0Sampler should be set only for binding 0

* Remove unused PipelineConverter format variable (was causing IOR)

* Raise textures limit to 64 on Vulkan

* No need to pack the shader binaries if shader cache is disabled

* Fix backbuffer not being cleared and scissor not being re-enabled on OpenGL

* Do not clear unbound framebuffer color attachments

* Geometry shader passthrough emulation

* Consolidate UpdateDepthMode and GetDepthMode implementation

* Fix A1B5G5R5 texture format and support R4G4 on Vulkan

* Add barrier before use of some modified images

* Report 32 bit query result on AMD windows (smo issue)

* Add texture recompression support (disabled for now)

It recompresses ASTC textures into BC7, which might reduce VRAM usage significantly on games that uses ASTC textures

* Do not report R4G4 format as supported on Vulkan

It was causing mario head to become white on Super Mario 64 (???)

* Improvements to -1 to 1 depth mode.

- Transformation is only applied on the last stage in the vertex pipeline.
- Should fix some issues with geometry and tessellation (hopefully)
- Reading back FragCoord Z on fragment will transform back to -1 to 1.

* Geometry Shader index count from ThreadsPerInputPrimitive

Generally fixes SPIR-V emitting too many triangles, may change games in OpenGL

* Remove gl_FragDepth scaling

This is always 0-1; the other two issues were causing the problems. Fixes regression with Xenoblade.

* Add Gl StencilOp enum values to Vulkan

* Update guest cache to v1.1 (due to specialization state changes)

This will explode your shader cache from earlier vulkan build, but it must be done. 😔

* Vulkan/SPIR-V support for viewport inverse

* Fix typo

* Don't create query pools for unsupported query types

* Return of the Vector Indexing Bug

One day, everyone will get this right.

* Check for transform feedback query support

Sometimes transform feedback is supported without the query type.

* Fix gl_FragCoord.z transformation

FragCoord.z is always in 0-1, even when the real depth range is -1 to 1. Turns out the only bug was geo and tess stage outputs.

Fixes Pokemon Sword/Shield, possibly others.

* Fix Avalonia Rebase

Vulkan is currently not available on Avalonia, but the build does work and you can use opengl.

* Fix headless build

* Add support for BC6 and BC7 decompression, decompress all BC formats if they are not supported by the host

* Fix BCn 4/5 conversion, GetTextureTarget

BCn 4/5 could generate invalid data when a line's size in bytes was not divisible by 4, which both backends expect.

GetTextureTarget was not creating a view with the replacement format.

* Fix dependency

* Fix inverse viewport transform vector type on SPIR-V

* Do not require null descriptors support

* If MultiViewport is not supported, do not try to set more than one viewport/scissor

* Bounds check on bitmap add.

* Flush queries on attachment change rather than program change

Occlusion queries are usually used in a depth only pass so the attachments changing is a better indication of the query block ending.

Write mask changes are also considered since some games do depth only pass by setting 0 write mask on all the colour targets.

* Add support for avalonia (#6)

* add avalonia support

* only lock around skia flush

* addressed review

* cleanup

* add fallback size if avalonia attempts to render but the window size is 0. read desktop scale after enabling dpi check

* fix getting window handle on linux. skip render is size is 0

* Combine non-buffer with buffer image descriptor sets

* Support multisample texture copy with automatic resolve on Vulkan

* Remove old CompileShader methods from the Vulkan backend

* Add minimal pipeline layouts that only contains used bindings

They are used by helper shaders, the intention is avoiding needing to recompile the shaders (from GLSL to SPIR-V) if the bindings changes on the translated guest shaders

* Pre-compile helper shader as SPIR-V, and some fixes

* Remove pre-compiled shaderc binary for Windows as its no longer needed by default

* Workaround RADV crash

Enabling the descriptor indexing extension, even if it is not used, forces the radv driver to use "bolist".

* Use RobustBufferAccess on NVIDIA gpus

Avoids the SMO waterfall triangle on older NVIDIA gpus.

* Implement GPU selector and expose texture recompression on the UI and config

* Fix and enable background compute shader compilation

Also disables warnings from shader cache pipeline misses.

* Fix error due to missing subpass dependency when Attachment Write -> Shader Read barriers are added

* If S8D24 is not supported, use D32FS8

* Ensure all fences are destroyed on dispose

* Pre-allocate arrays up front on DescriptorSetUpdater, allows the removal of some checks

* Add missing clear layer parameter after rebase

* Use selected gpu from config for avalonia (#7)

* use configured device

* address review

* Fix D32S8 copy workaround (AMD)

Fixes water in Pokemon Legends Arceus on AMD GPUs. Possibly fixes other things.

* Use push descriptors for uniform buffer updates (disabled for now)

* Push descriptor support check, buffer redundancy checks

Should make push descriptors faster, needs more testing though.

* Increase light command buffer pool to 2 command buffers, throw rather than returning invalid cbs

* Adjust bindings array sizes

* Force submit command buffers if memory in use by its resources is high

* Add workaround for AMD GCN cubemap view sins

`ImageCreateCubeCompatibleBit` seems to generally break 2D array textures with mipmaps... even if they are eventually aliased as a cubemap with mipmaps. Forcing a copy here works around the issue.

This could be used in future if enabling this bit reduces performance on certain GPUs. (mobile class is generally a worry)

Currently also enabled on Linux as I don't know if they managed to dodge this bug (someone please tell me). Not enabled on Vega at the moment, but easy to add if the issue is there.

* Add mobile, non-RX variants to the GCN regex.

Also make sure that the 3 digit ones only include numbers starting with 7 or 8.

* Increase image limit per stage from 8 to 16

Xenoblade Chronicles 2 was hiting the limit of 8

* Minor code cleanup

* Fix NRE caused by SupportBufferUpdater calling pipeline ClearBuffer

* Add gpu selector to Avalonia (#8)

* Add gpu selector to avalonia settings

* show backend label on window

* some fixes

* address review

* Minor changes to the Avalonia UI

* Update graphics window UI and locales. (#9)

* Update xaml and update locales

* locale updates

Did my best here but likely needs to be checked by native speakers, especially the use of ampersands in greek, russian and turkish?

* Fix locales with more (?) correct translations.

* add separator to render widget

* fix spanish and portuguese

* Add new IdList, replaces buffer list that could not remove elements and had unbounded growth

* Don't crash the settings window if Vulkan is not supported

* Fix Actions menu not being clickable on GTK UI after relaunch

* Rename VulkanGraphicsDevice to VulkanRenderer and Renderer to OpenGLRenderer

* Fix IdList and make it not thread safe

* Revert useless OpenGL format table changes

* Fix headless project build

* List throws ArgumentOutOfRangeException

* SPIR-V: Fix tessellation

* Increase shader cache version due to tessellation fix

* Reduce number of Sync objects created (improves perf in some specific titles)

* Fix vulkan validation errors for NPOT compressed upload and GCN workaround.

* Add timestamp to the shader cache and force rebuild if host cache is outdated

* Prefer Mail box present mode for popups (#11)

* Prefer Mail box present mode

* fix debug

* switch present mode when vsync is toggled

* only disable vsync on the main window

* SPIR-V: Fix geometry shader input load with transform feedback

* BC7 Encoder: Prefer more precision on alpha rather than RGB when alpha is 0

* Fix Avalonia build

* Address initial PR feedback

* Only set transform feedback outputs on last vertex stage

* Address riperiperi PR feedback

* Remove outdated comment

* Remove unused constructor

* Only throw for negative results

* Throw for QueueSubmit and other errors

No point in delaying the inevitable

* Transform feedback decorations inside gl_PerVertex struct breaks the NVIDIA compiler

* Fix some resolution scale issues

* No need for two UpdateScale calls

* Fix comments on SPIR-V generator project

* Try to fix shader local memory size

On DOOM, a shader is using local memory, but both Low and High size are 0, CRS size is 1536, it seems to store on that region?

* Remove RectangleF that is now unused

* Fix ImageGather with multiple offsets

Needs ImageGatherExtended capability, and must use `ConstantComposite` instead of `CompositeConstruct`

* Address PR feedback from jD in all projects except Avalonia

* Address most of jD PR feedback on Avalonia

* Remove unsafe

* Fix VulkanSkiaGpu

* move present mode request out of Create Swapchain method

* split more parts of create swapchain

* addressed reviews

* addressed review

* Address second batch of jD PR feedback

* Fix buffer <-> image copy row length and height alignment

AlignUp helper does not support NPOT alignment, and ASTC textures can have NPOT block sizes

* Better fix for NPOT alignment issue

* Use switch expressions on Vulkan EnumConversion

Thanks jD

* Fix Avalonia build

* Add Vulkan selection prompt on startup

* Grammar fixes on Vulkan prompt message

* Add missing Vulkan migration flag

Co-authored-by: riperiperi <rhy3756547@hotmail.com>
Co-authored-by: Emmanuel Hansen <emmausssss@gmail.com>
Co-authored-by: MutantAura <44103205+MutantAura@users.noreply.github.com>
2022-07-31 18:26:06 -03:00
14ce9e1567 Move partial unmap handler to the native signal handler (#3437)
* Initial commit with a lot of testing stuff.

* Partial Unmap Cleanup Part 1

* Fix some minor issues, hopefully windows tests.

* Disable partial unmap tests on macos for now

Weird issue.

* Goodbye magic number

* Add COMPlus_EnableAlternateStackCheck for tests

`COMPlus_EnableAlternateStackCheck` is needed for NullReferenceException handling to work on linux after registering the signal handler, due to how dotnet registers its own signal handler.

* Address some feedback

* Force retry when memory is mapped in memory tracking

This case existed before, but returning `false` no longer retries, so it would crash immediately after unprotecting the memory... Now, we return `true` to deliberately retry.

This case existed before (was just broken by this change) and I don't really want to look into fixing the issue right now. Technically, this means that on guest code partial unmaps will retry _due to this_ rather than hitting the handler. I don't expect this to cause any issues.

This should fix random crashes in Xenoblade Chronicles 2.

* Use IsRangeMapped

* Suppress MockMemoryManager.UnmapEvent warning

This event is not signalled by the mock memory manager.

* Remove 4kb mapping
2022-07-29 19:16:29 -03:00
952d013c67 Avalonia changes (#3497)
Co-authored-by: RNA <wQSZSQS2UQf5zun>
2022-07-29 01:14:37 +00:00
46c8129bf5 Avalonia: Another Cleanup (#3494)
* Avalonia: Another Cleanup

This PR is a cleanup to the avalonia code recently added:

- Some XAML file are autoformatted like a previous PR.
- Dlc is renamed to DownloadableContent (Locale exclude).
- DownloadableContentManagerWindow is a bit improved (Fixes #3491).
- Some nits here and there.

* Fix GTK

* Remove AttachDebugDevTools

* Fix last warning

* Fix JSON fields
2022-07-29 00:41:34 +02:00
8cfec5de4b Avalonia: Cleanup UserEditor a bit (#3492)
This PR cleanup the UserEditor code a bit, 2 texts are added for "Name" and "User Id", because when you create a new profile, the textbox is empty without any hints. `axaml` files are autoformated too.
2022-07-28 14:16:23 -03:00
37b6e081da Fix DMA linear texture copy fast path (#3496)
* Fix DMA linear texture copy fast path

* Formatting
2022-07-28 13:46:12 -03:00
3c3bcd82fe Add a sampler pool cache and improve texture pool cache (#3487)
* Add a sampler pool cache and improve texture pool cache

* Increase disposal timestamp delta more to be on the safe side

* Nits

* Use abstract class for PoolCache, remove factory callback
2022-07-27 21:07:48 -03:00
a00c59a46c update settings and main window tooltips (#3488) 2022-07-25 23:02:17 +02:00
1825bd87b4 misc: Reformat Ryujinx.Audio with dotnet-format (#3485)
This is the first commit of a series of reformat around the codebase as
discussed internally some weeks ago.

This project being one that isn't touched that much, it shouldn't cause
conflict with any opened PRs.
2022-07-25 15:46:33 -03:00
62f8ceb60b Resolution scaling hotkeys (#3185)
* hotkeys

* comments

* update implementation to include custom scales

* copypasta

* review changes

* hotkeys

* comments

* update implementation to include custom scales

* copypasta

* review changes

* Remove outdated configuration and force hotkeys unbound

* Add avalonia support

* Fix configuration file

* Update GTK implementation and fix config... again.

* Remove legacy implementation + nits

* Avalonia locales (DeepL)

* review

* Remove colon from chinese locale

* Update ConfigFile

* locale fix
2022-07-24 15:44:47 -03:00
1a888ae087 Add support for conditional (with CC) shader Exit instructions (#3470)
* Add support for conditional (with CC) shader Exit instructions

* Shader cache version bump

* Make CSM conditions default to false for EXIT.CC
2022-07-24 15:33:30 -03:00
84d0ca5645 feat: add traditional chinese translate (Avalonia) (#3474)
* feat: add traditional chinese translate

* update translate
2022-07-24 15:18:21 -03:00
31b8d413d5 Change MenuHeaders to embedded textblocks (#3469) 2022-07-24 14:50:06 -03:00
6e02cac952 Avalonia - Use content dialog for user profile manager (#3455)
* remove content dialog placeholder from all windows

* remove redundant window argument

* redesign user profile window

* wip

* use avalonia auto name generator

* add edit and new user options

* move profile image selection to content dialog

* remove usings

* fix updater

* address review

* adjust avatar dialog size

* add validation for user editor

* fix typo

* Shorten some labels
2022-07-24 14:38:38 -03:00
3a3380fa25 fix: Ensure to load latest version of ffmpeg libraries first (#3473)
Fix a possible crash related to older version of ffmpeg being loaded
instewad of the one shipped with the emulator.
2022-07-24 11:39:56 +02:00
2d252db0a7 GTK & Avalonia changes (#3480) 2022-07-23 12:05:51 -03:00
7f8a3541eb Fix decoding of block after shader BRA.CC instructions without predicate (#3472)
* Fix decoding of block after BRA.CC instructions without predicate

* Shader cache version bump
2022-07-23 11:53:14 -03:00
b34de74f81 Avoid adding shader buffer descriptors for constant buffers that are not used (#3478)
* Avoid adding shader buffer descriptors for constant buffers that are not used

* Shader cache version
2022-07-23 11:15:58 -03:00
5811d121df Avoid scaling 2d textures that could be used as 3d (#3464) 2022-07-15 09:24:13 -03:00
6eb85e846f Reduce some unnecessary allocations in DMA handler (#2886)
* experimental changes to try and reduce allocations in kernel threading and DMA handler

* Simplify the changes in this branch to just 1. Don't make unnecessary copies of data just for texture-texture transfers and 2. Add a fast path for 1bpp linear byte copies

* forgot to check src + dst linearity in 1bpp DMA fast path. Fixes the UE4 regression.

* removing dev log I left in

* Generalizing the DMA linear fast path to cases other than 1bpp copies

* revert kernel changes

* revert whitespace

* remove unneeded references

* PR feedback

Co-authored-by: Logan Stromberg <lostromb@microsoft.com>
Co-authored-by: gdk <gab.dark.100@gmail.com>
2022-07-14 15:45:56 -03:00
c5bddfeab8 Remove dependency for FFmpeg.AutoGen and Update FFmpeg to 5.0.1 for Windows (#3466)
* Remove dependency for FFMpeg.AutoGen

Also prepare for FFMpeg 5.0 and 5.1

* Update Ryujinx.Graphics.Nvdec.Dependencies to 5.0.1-build10

* Address gdkchan's comments

* Address Ack's comment

* Address gdkchan's comment
2022-07-14 15:13:23 +02:00
70ec5def9c BSD: Allow use of DontWait flag in Receive (#3462) 2022-07-14 11:47:25 +02:00
7853faa334 Ava/MainWindow: Do not show Show Console menu item on non-Windows (#3461) 2022-07-12 12:58:31 +00:00
b7fb474bfe Handle the case where byte optionValues are sent to BSD (#3405)
Some games and the Mario Odyssey Multiplayer mod do this.

The SMO multiplayer mod also needs you to revert #3394 as it uses a blocking socket to receive (otherwise it hangs), and it doesn't seem to like being forced as non-blocking.
2022-07-12 00:50:01 +02:00
2fa6413ed8 Avalonia - Add border to Flyouts (#3341)
* add borders to menus

* apply to dropdowns

* darken the border for dark theme

* fix duplicate keys
2022-07-12 00:44:35 +02:00
4523a73f75 Propagate Shader phi nodes with the same source value from all blocks (#3457)
* Propagate Shader phi nodes with the same source value from all incoming blocks

* Shader cache version bump
2022-07-12 00:36:58 +02:00
f4c47f3c9a Avalonia - Make tooltips more useful and descriptive, update Spanish localization (#3453)
* expand English tooltips and clean up

* small oversight

* update Spanish locale

* wording

* Internet

* address feedback

* update localization accordingly
2022-07-12 00:32:14 +02:00
7d9a5feccb Avalonia - Couple fixes and improvements (#3451)
* fix updater check crash

* remove line

* reduce cheat window sizes

* enable tiered compilation and r2r

* remove warning on LaunchState

* remove warnings related to tasks

* addressed review

* undo csproj indentation

* fix tabs in axaml file

* remove double line

* remove R2R
2022-07-12 00:25:33 +02:00
14ae4e276f Avalonia - Further Optimize Chinese Translation (#3452)
* Optimize Chinese Translation

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* test

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Update zh_CN.json

* Delete zh_CN.json

* Add files via upload

* Update zh_CN.json

* Update zh_CN.json
2022-07-12 00:12:52 +02:00
3af42d6c7e UI - Avalonia Part 3 (#3441)
* Add all other windows

* addreesed review

* Prevent "No Update" option from being deleted

* Select no update is the current update is removed from the title update window

* fix amiibo crash
2022-07-08 15:47:11 -03:00
bccf5e8b5a Avalonia - Use loaded config when assigning controller input (#3447)
* Use loaded config when assigning controller input

* Fix crash when switch player in controller window
2022-07-08 15:28:45 -03:00
d86a116e1e ensure mouse cursor is only hidden when mouse is in renderer (#3448) 2022-07-08 15:16:30 -03:00
4c2ab880ef misc: Relicense Ryujinx.Audio under the terms of the MIT license (#3449)
* Ryujinx.Audio: Remove BOM from files

* misc: Relicense Ryujinx.Audio under the terms of the MIT license

With the approvals of all the Ryujinx.Audio contributors, this commit
changes Ryujinx.Audio license from LGPLv3 to MIT.
2022-07-08 19:45:53 +02:00
bc5bb4459e Fix deadlock in mouse input on Avalonia (#3444)
* fix deadlock in mouse input

* apply @AcK77 changes
2022-07-08 09:53:48 -03:00
55e97959b9 Fix Vi managed and stray layers open/close/destroy (#3438)
* Fix Vi managed and stray layers open/close/destroy

* OpenLayer should set the state to ManagedOpened
2022-07-06 13:37:36 -03:00
f7ef6364b7 Implement CPU FCVT Half <-> Double conversion variants (#3439)
* Half <-> Double conversion support

* Add tests, fast path and deduplicate SoftFloat code

* PPTC version
2022-07-06 13:40:31 +02:00
b46b63e06a Add support for alpha to coverage dithering (#3069)
* Add support for alpha to coverage dithering

* Shader cache version bump

* Fix wrong alpha register

* Ensure support buffer is cleared

* New shader specialization based approach
2022-07-05 19:58:36 -03:00
594246ea47 UI - Avalonia Part 2 (#3351)
* add settings windows and children views

* Expose hotkeys configuration on the UI

* Remove double spacing from locale JSON

* simplify button assigner

* add cemuhook buttons and title to locale

* move common button assigner to own class

* cancel button assigner when window is closed

* remove unused setting

* address review. fix controller profile not loading default when switching devices

* fix updater file name

* Input cleanup (#37)

* addressed review

* add device type to controller device checks

* change accessibility modifier of public classes to internal

* Update Ryujinx.Ava/Ui/ViewModels/ControllerSettingsViewModel.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Update de_DE.json

* Update de_DE.json

* Update tr_TR.json

Translated newly added lines

* Update it_IT.json

* fix rebase

* update avalonia

* fix wrong key used for button text

* Align settings window elements

* Tabs to spaces

* Update brazilian portuguese translation

* Minor improvement on brazilian portuguese translation

* fix turkish translation

* remove unused text

* change view related classes to public

* unsubscribe from deferred event if dialog is closed

* Load the default language before loading any other when switching languages

* Make controller settings more compact

* increase default width of settings window, reduce profile buttons width

Co-authored-by: gdk <gab.dark.100@gmail.com>
Co-authored-by: MutantAura <44103205+MutantAura@users.noreply.github.com>
Co-authored-by: Niwu34 <67392333+Niwu34@users.noreply.github.com>
Co-authored-by: aegiff <99728970+aegiff@users.noreply.github.com>
Co-authored-by: Antonio Brugnolo <36473846+AntoSkate@users.noreply.github.com>
2022-07-05 20:06:31 +02:00
d21b403886 Stub GetTemperature (#3429) 2022-07-03 10:17:24 +02:00
5afd521c5a Bindless elimination for constant sampler handle (#3424)
* Bindless elimination for constant sampler handle

* Shader cache version bump

* Update TextureHandle.ReadPackedId for new bindless elimination
2022-07-02 15:03:35 -03:00
0c66d71fe8 ui: Fix timezone abbreviation since #3361 (#3430)
As title say
2022-06-29 22:08:30 +02:00
bdc4fa81f2 Add Simplified Chinese to Avalonia (V2) (#3416)
* Add files via upload

* Update Ryujinx.Ava.csproj

* Update MainWindow.axaml
2022-06-25 17:03:48 +02:00
625f5fb88a Account for pool change on texture bindings cache (#3420)
* Account for pool change on texture bindings cache

* Reduce the number of checks needed
2022-06-25 16:52:38 +02:00
2382717600 timezone: Fix regression caused by #3361 (#3418)
Because of that PR, TimeZoneRule was bigger than 0x4000 thanks to a
misuse of a constant.

This commit address this issue and add a new unit test to ensure the size of
TimeZoneRule is 0x4000 bytes.

Also address suggestions that were lost on the original PR.
2022-06-24 21:11:56 +02:00
30ee70a9bc time: Make TimeZoneRule blittable and avoid copies (#3361)
* time: Make TimeZoneRule blittable and avoid copies

This drastically reduce overhead of using TimeZoneRule around the
codebase.

Effect on games is unknown

* Add missing Box type

* Ensure we clean the structure still

This doesn't perform any copies

* Address gdkchan's comments

* Simplify Box
2022-06-24 19:04:57 +02:00
232b1012b0 Fix ThreadingLock deadlock on invalid access and TerminateProcess (#3407) 2022-06-24 02:53:16 +02:00
e747f5cd83 Ensure texture ID is valid before getting texture descriptor (#3406) 2022-06-24 02:41:57 +02:00
8aff17a93c UI: Some Avalonia cleanup (#3358) 2022-06-23 15:59:02 -03:00
f2a41b7a1c Rewrite kernel memory allocator (#3316)
* Rewrite kernel memory allocator

* Remove unused using

* Adjust private static field naming

* Change UlongBitSize to UInt64BitSize

* Fix unused argument, change argument order to be inline with official code and disable random allocation
2022-06-22 12:28:14 -03:00
c881cd2d14 Fix doubling of detected gamepads on program start (#3398)
* Fix doubling of detected gamepads (sometimes the connected event is fired when the app starts even though the pad was connected for some time now).

The fix rejects the gamepad if one with the same ID is already present.

* Fixed review findings
2022-06-20 19:01:55 +02:00
68f9091870 Account for res scale changes when updating bindings (#3403)
Fixes a regression introduced by the texture bindings PR.

Also renames TextureStatePerStage, as it's no longer per stage.
2022-06-17 17:41:38 -03:00
99ffc061d3 Optimize Texture Binding and Shader Specialization Checks (#3399)
* Changes 1

* Changes 2

* Better ModifiedSequence handling

This should handle PreciseEvents properly, and simplifies a few things.

* Minor changes, remove debug log

* Handle stage.Info being null

Hopefully fixes Catherine crash

* Fix shader specialization fast texture lookup

* Fix some things.

* Address Feedback Part 1

* Make method static.
2022-06-17 13:09:14 -03:00
d987cacfb7 Fix VIC out of bounds copy (#3386)
* Fix VIC out of bounds copy

* Update the assert
2022-06-17 12:01:52 -03:00
851f56b08a Support Array/3D depth-stencil render target, and single layer clears (#3400)
* Support Array/3D depth-stencil render target, and single layer clears

* Alignment
2022-06-14 13:30:39 -03:00
b1bd6a50b5 Less invasive fix for EventFd blocking operations (#3394) 2022-06-12 09:29:12 +02:00
70895bdb04 Allow concurrent BSD EventFd read/write (#3385) 2022-06-11 14:58:30 -03:00
830cbf91bb Ignore ClipControl on draw texture fallback (#3388) 2022-06-11 14:31:17 -03:00
9a9349f0f4 Fix instanced indexed inline draw index count (#3389) 2022-06-10 23:44:49 -03:00
46cc7b55f0 Fix instanced indexed inline draws (#3383) 2022-06-05 21:24:28 -03:00
dd8f97ab9e Remove freed memory range from tree on memory block disposal (#3347)
* Remove freed memory range from tree on memory block disposal

* PR feedback
2022-06-05 15:12:42 -03:00
633c5ec330 Extend uses count from ushort to uint on Operand Data structure (#3374) 2022-06-05 14:15:27 -03:00
a3e7bb8eb4 Copy dependency for multisample and non-multisample textures (#3382)
* Use copy dependency for textures that differs in multisample but are otherwise compatible

* Remove allowMs flag as it's no longer required for correctness, it's just an optimization now

* Dispose intermmediate pool
2022-06-05 14:06:47 -03:00
2073ba2919 Fix a potential GPFIFO submission race (#3378)
The syncpoint maximum value represents the maximum possible syncpt value at a given time, however due to PBs being submitted before max was incremented, for a brief moment of time this is not the case which could lead to invalid behaviour if a game waits on the fence at that specific time.
2022-06-04 21:36:36 +02:00
d03124a992 Fix 3D semaphore counter type 0 handling (#3380)
Counter type 0 actually releases the semaphore payload rather than a constant zero as was previously thought. This is required by Skyrim.
2022-06-02 19:51:36 -03:00
59490d54b5 infra: Switch to win10-x64 RID and fix PR comment for Avalonia and SDL2 artifact rename (#3375)
* infra: Switch to win10-x64 RID and fix PR comment for Avalonia and SDL2 artifact rename

* Address gdkchan's comments
2022-06-01 02:01:16 +02:00
e546e5933f Rewrite SVC handler using source generators rather than IL emit (#3371)
* Implement syscall handlers using a source generator

* Copy FlushProcessDataCache implementation to Syscall since it was only implemented on Syscall32

* Fix wrong argument order in some syscalls

* Delete old Reflection.Emit based syscall handling code

* Improvements to the code generation

* ControlCodeMemory address and size is always 64-bit
2022-05-31 17:12:46 -03:00
0c87bf9ea4 Refactor CPU interface to allow the implementation of other CPU emulators (#3362)
* Refactor CPU interface

* Use IExecutionContext interface on SVC handler, change how CPU interrupts invokes the handlers

* Make CpuEngine take a ITickSource rather than returning one

The previous implementation had the scenario where the CPU engine had to implement the tick source in mind, like for example, when we have a hypervisor and the game can read CNTPCT on the host directly. However given that we need to do conversion due to different frequencies anyway, it's not worth it. It's better to just let the user pass the tick source and redirect any reads to CNTPCT to the user tick source

* XML docs for the public interfaces

* PPTC invalidation due to NativeInterface function name changes

* Fix build of the CPU tests

* PR feedback
2022-05-31 16:29:35 -03:00
9827dc35e1 Allow loading NSPs without a NCA inside (#3364)
* Allow loading NSPs without a NCA inside

* Set isHomebrew as true
2022-05-31 16:16:59 -03:00
448723d3b3 Don't force DPI aware on Avalonia - it already has it covered. (#3354) 2022-05-21 23:32:50 +02:00
89294b7772 Fix audio renderer error message result code base (#3348) 2022-05-19 00:59:27 +02:00
7b9c4757dd UI - Scale end framebuffer blit (#3342)
* Scale end framebuffer blit

* fix

* fix

* apply changes to avalonia
2022-05-16 18:10:29 -03:00
b8fc97adf2 Fix Avalonia updater 2022-05-15 21:01:12 +02:00
c1a7b5bcdb fix amiibo image path (#3345) 2022-05-15 20:47:00 +02:00
be1c375589 gh-actions: Prefix Avalonia builds with test- and disable prerelease.
As GitHub sort our builds in an alphanumeric way, we abuse that to fix
both new and old updater behaviour.

This should fix all our issues.

Avalonia updater will be broken between version 1.1.122 to 1.1.126, and
will need manual intervention.
2022-05-15 18:05:55 +02:00
378d19f87a gh-actions: Attempt to fix the whole mess up with Avalonia changes
Marked as prerelease just in case it break even more
2022-05-15 17:50:16 +02:00
f59f65ec4f add avalonia builds to release (#3339) 2022-05-15 16:28:32 +02:00
7bc4971cf9 misc: Clean up of CS project after Avalonia merge (#3340)
This reformat Avalonia csproj file, remove unused deps and reajust
Ryujinx csproj a bit after some other changes

Also updated OpenTK.Graphics
2022-05-15 16:02:15 +02:00
3551c18902 sdl2: Update to Ryujinx.SDL2-CS 2.0.22 (#3317)
Update to latest SDL2 release

Fix #2905, #2837 and #2767.
2022-05-15 13:51:30 +02:00
deb99d2cae Avalonia UI - Part 1 (#3270)
* avalonia part 1

* remove vulkan ui backend

* move ui common files to ui common project

* get name for oading screen from device

* rebase.

* review 1

* review 1.1

* review

* cleanup

* addressed review

* use cancellation token

* review

* review

* rebased

* cancel library loading when closing window

* remove star  image, use fonticon instead

* delete render control frame buffer when game ends. change position of fav star

* addressed @Thog review

* ensure the right ui is downloaded in updates

* fix crash when showing not supported dialog during controller request

* add prefix to artifact names

* Auto-format Avalonia project

* Fix input

* Fix build, simplify app disposal

* remove nv stutter thread

* addressed review

* add missing change

* maintain window size if new size is zero length

* add game, handheld, docked to local

* reverse scale main window

* Update de_DE.json

* Update de_DE.json

* Update de_DE.json

* Update italian json

* Update it_IT.json

* let render timer poll with no wait

* remove unused code

* more unused code

* enabled tiered compilation and trimming

* check if window event is not closed before signaling

* fix atmospher case

* locale fix

* locale fix

* remove explicit tiered compilation declarations

* Remove ) it_IT.json

* Remove ) de_DE.json

* Update it_IT.json

* Update pt_BR locale with latest strings

* Remove ')'

* add more strings to locale

* update locale

* remove extra slash

* remove extra slash

* set firmware version to 0 if key's not found

* fix

* revert timer changes

* lock  on object instead

* Update it_IT.json

* remove unused method

* add load screen text to locale

* drop swap event

* Update de_DE.json

* Update de_DE.json

* do null check when stopping emulator

* Update de_DE.json

* Create tr_TR.json

* Add tr_TR

* Add tr_TR + Turkish

* Update it_IT.json

* Update Ryujinx.Ava/Input/AvaloniaMappingHelper.cs

Co-authored-by: Ac_K <Acoustik666@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ac_K <Acoustik666@gmail.com>

* Apply suggestions from code review

Co-authored-by: Ac_K <Acoustik666@gmail.com>

* addressed review

* Update Ryujinx.Ava/Ui/Backend/OpenGl/OpenGlRenderTarget.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* use avalonia's inbuilt renderer on linux

* removed whitespace

* workaround for queue render crash with vsync off

* drop custom backend

* format files

* fix not closing issue

* remove warnings

* rebase

* update avalonia library

* Reposition the Text and Button on About Page

* Assign build version

* Remove appveyor text

Co-authored-by: gdk <gab.dark.100@gmail.com>
Co-authored-by: Niwu34 <67392333+Niwu34@users.noreply.github.com>
Co-authored-by: Antonio Brugnolo <36473846+AntoSkate@users.noreply.github.com>
Co-authored-by: aegiff <99728970+aegiff@users.noreply.github.com>
Co-authored-by: Ac_K <Acoustik666@gmail.com>
Co-authored-by: MostlyWhat <78652091+MostlyWhat@users.noreply.github.com>
2022-05-15 13:30:15 +02:00
9ba73ffbe5 Prefetch capabilities before spawning translation threads. (#3338)
* Prefetch capabilities before spawning translation threads.

The Backend Multithreading only expects one thread to submit commands at a time. When compiling shaders, the translator may request the host GPU capabilities from the backend. It's possible for a bunch of translators to do this at the same time.

There's a caching mechanism in place so that the capabilities are only fetched once. By triggering this before spawning the thread, the async translation threads no longer try to queue onto the backend queue all at the same time.

The Capabilities do need to be checked from the GPU thread, due to OpenGL needing a context to check them, so it's not possible to call the underlying backend directly.

* Initialize the capabilities when setting the GPU thread + missing call in headless

* Remove private variables
2022-05-14 11:58:33 -03:00
43b4b34376 Implement Viewport Transform Disable (#3328)
* Initial implementation (no specialization)

* Use specialization

* Fix render scale, increase code gen version

* Revert accidental change

* Address Feedback
2022-05-12 10:47:13 -03:00
92ca1cb0cb hid: Various fixes and cleanup (#3326)
* hid: Various fix and cleanup

* Add IsValidNpadIdType

* remove ()
2022-05-08 00:28:54 +02:00
50d7ecf76d Add alternative "GL" enum values for StencilOp (#3321)
This PR adds the alternative enum values for StencilOp. Similar to the other enums, I added these with the same names but with Gl added to the end. These are used by homebrew using Nouveau, though they might be used by games with the official Vulkan driver.

39d90be897/rnndb/graph/nv_3ddefs.xml (L77)

Fixes some broken graphics in Citra, such as missing shadows in Mario Kart 7. Likely fixes other homebrew.
2022-05-05 21:16:58 +02:00
42a2a80b87 Enable JIT service LLE (#2959)
* Enable JIT service LLE

* Force disable PPTC when using the JIT service

PPTC does not support multiple guest processes

* Fix build

* Make SM service registration per emulation context rather than global

* Address PR feedback
2022-05-05 15:23:30 -03:00
54deded929 Fix shared memory leak on Windows (#3319)
* Fix shared memory leak on Windows

* Fix memory leak caused by RO session disposal not decrementing the memory manager ref count

* Fix UnmapViewInternal deadlock

* Was not supposed to add those back
2022-05-05 14:58:59 -03:00
39bdf6d41e infra: Warn about support drop of old Windows versions (#3299)
* infra: Warn about support drop of old Windows versions

See #3298.

* Address comment
2022-05-04 20:21:27 +02:00
074190e03c Remove AddProtection count > 0 assert (#3315) 2022-05-04 14:07:10 -03:00
256514c7c9 Update the artifact build's version number (#3297) 2022-05-03 23:35:12 +02:00
556be08c4e Implement PM GetProcessInfo atmosphere extension (partially) (#2966) 2022-05-03 23:28:32 +02:00
1cbca5eecb Implement code memory syscalls (#2958)
* Implement code memory syscalls

* Remove owner process validation

* Add 32-bit code memory syscalls

* Remove unused field
2022-05-03 13:16:31 +02:00
95017b8c66 Support memory aliasing (#2954)
* Back to the origins: Make memory manager take guest PA rather than host address once again

* Direct mapping with alias support on Windows

* Fixes and remove more of the emulated shared memory

* Linux support

* Make shared and transfer memory not depend on SharedMemoryStorage

* More efficient view mapping on Windows (no more restricted to 4KB pages at a time)

* Handle potential access violations caused by partial unmap

* Implement host mapping using shared memory on Linux

* Add new GetPhysicalAddressChecked method, used to ensure the virtual address is mapped before address translation

Also align GetRef behaviour with software memory manager

* We don't need a mirrorable memory block for software memory manager mode

* Disable memory aliasing tests while we don't have shared memory support on Mac

* Shared memory & SIGBUS handler for macOS

* Fix typo + nits + re-enable memory tests

* Set MAP_JIT_DARWIN on x86 Mac too

* Add back the address space mirror

* Only set MAP_JIT_DARWIN if we are mapping as executable

* Disable aliasing tests again (still fails on Mac)

* Fix UnmapView4KB (by not casting size to int)

* Use ref counting on memory blocks to delay closing the shared memory handle until all blocks using it are disposed

* Address PR feedback

* Make RO hold a reference to the guest process memory manager to avoid early disposal

Co-authored-by: nastys <nastys@users.noreply.github.com>
2022-05-02 20:30:02 -03:00
4a892fbdc9 Fix flush action from multiple threads regression (#3311)
If two or more threads encounter a region of memory where a read action has been registered, then they must _both_ wait on the data.

Clearing the action before it completed was causing the null check above to fail, so the action would only be run on the first thread, and the second would end up continuing without waiting. Depending on what the game does, this could be disasterous.

This fixes a regression introduced by #3302 with Pokemon Legends Arceus, and possibly Catherine. There are likely other affected games. What is fixed in that PR should still be fixed.
2022-05-02 12:31:53 +02:00
9eb5b7a10d Restrict cases where vertex buffer size from index buffer type is used (#3304) 2022-05-01 11:12:34 -03:00
d64594ec74 Fix various issues with texture sync (#3302)
* Fix various issues with texture sync

A variable called _actionRegistered is used to keep track of whether a tracking action has been registered for a given texture group handle. This variable is set when the action is registered, and should be unset when it is consumed. This is used to skip registering the tracking action if it's already registered, saving some time for render targets that are modified very often.

There were two issues with this. The worst issue was that the tracking action handler exits early if the handle's modified flag is false... which means that it never reset _actionRegistered, as that was done within the Sync() method called later. The second issue was that this variable was set true after the sync action was registered, so it was technically possible for the action to run immediately, set the flag to false, then set it to true.

Both situations would lead to the action never being registered again, as the texture group handle would be sure the action is already registered. This breaks the texture for the remaining runtime, or until it is disposed.

It was also possible for a texture to register sync once, then on future frames the last modified sync number did not update. This may have caused some more minor issues.

Seems to fix the Xenoblade flashing bug. Obviously this needs a lot of testing, since it was random chance. I typically had the most luck getting it to happen by switching time of day on the event theatre screen for a while, then entering the equipment screen by pressing X on an event.

May also fix weird things like random chance air swimming in BOTW, maybe a few texture streaming bugs.

* Exchange rather than CompareExchange
2022-04-29 18:34:11 -03:00
6a1a03566a T32: Implement load/store single (immediate) (#3186)
* T32: Implement load/store single (immediate)

* tests

* tidy formatting

* address comments
2022-04-21 01:25:43 +02:00
13f5294aa3 Update NpadController.cs (#3284) 2022-04-20 13:22:45 +02:00
9444b4a647 Implement HwOpus multistream functions (#3275)
* Implement HwOpus multistream functions

* Avoid one copy
2022-04-15 23:16:28 +02:00
610fc84f3e ReactiveObject: Handle case when oldValue is null (#3268) 2022-04-15 12:58:27 +02:00
247d26b4b5 ForceDpiAware: X11 implementation (#3269)
* ForceDpiAware: X11 implementation

* address comments
2022-04-10 19:04:22 +02:00
43ebd7a9bb New shader cache implementation (#3194)
* New shader cache implementation

* Remove some debug code

* Take transform feedback varying count into account

* Create shader cache directory if it does not exist + fragment output map related fixes

* Remove debug code

* Only check texture descriptors if the constant buffer is bound

* Also check CPU VA on GetSpanMapped

* Remove more unused code and move cache related code

* XML docs + remove more unused methods

* Better codegen for TransformFeedbackDescriptor.AsSpan

* Support migration from old cache format, remove more unused code

Shader cache rebuild now also rewrites the shared toc and data files

* Fix migration error with BRX shaders

* Add a limit to the async translation queue

 Avoid async translation threads not being able to keep up and the queue growing very large

* Re-create specialization state on recompile

This might be required if a new version of the shader translator requires more or less state, or if there is a bug related to the GPU state access

* Make shader cache more error resilient

* Add some missing XML docs and move GpuAccessor docs to the interface/use inheritdoc

* Address early PR feedback

* Fix rebase

* Remove IRenderer.CompileShader and IShader interface, replace with new ShaderSource struct passed to CreateProgram directly

* Handle some missing exceptions

* Make shader cache purge delete both old and new shader caches

* Register textures on new specialization state

* Translate and compile shaders in forward order (eliminates diffs due to different binding numbers)

* Limit in-flight shader compilation to the maximum number of compilation threads

* Replace ParallelDiskCacheLoader state changed event with a callback function

* Better handling for invalid constant buffer 1 data length

* Do not create the old cache directory structure if the old cache does not exist

* Constant buffer use should be per-stage. This change will invalidate existing new caches (file format version was incremented)

* Replace rectangle texture with just coordinate normalization

* Skip incompatible shaders that are missing texture information, instead of crashing

This is required if we, for example, support new texture instruction to the shader translator, and then they allow access to textures that were not accessed before. In this scenario, the old cache entry is no longer usable

* Fix coordinates normalization on cubemap textures

* Check if title ID is null before combining shader cache path

* More robust constant buffer address validation on spec state

* More robust constant buffer address validation on spec state (2)

* Regenerate shader cache with one stream, rather than one per shader.

* Only create shader cache directory during initialization

* Logging improvements

* Proper shader program disposal

* PR feedback, and add a comment on serialized structs

* XML docs for RegisterTexture

Co-authored-by: riperiperi <rhy3756547@hotmail.com>
2022-04-10 10:49:44 -03:00
26a881176e Fix tail merge from block with conditional jump to multiple returns (#3267)
* Fix tail merge from block with conditional jump to multiple returns

* PPTC version bump
2022-04-09 16:56:50 +02:00
e44a43c7e1 Implement VMAD shader instruction and improve InvocationInfo and ISBERD handling (#3251)
* Implement VMAD shader instruction and improve InvocationInfo and ISBERD handling

* Shader cache version bump

* Fix typo
2022-04-08 12:42:39 +02:00
3139a85a2b Allow copy texture views to have mismatching multisample state (#3152) 2022-04-08 11:26:48 +02:00
a4e8bea866 Lop3Expression: Optimize expressions (#3184)
* lut3

* bugfixes

* TruthTable

* false/true -> 0/-1

* add or to expressions

* fix inversions

* increment cache version
2022-04-08 11:17:38 +02:00
6a9e9b5360 Remove save data creation prompt (#3252)
* begone

* review

* mods directory update
2022-04-08 11:09:35 +02:00
952f6f8a65 Calculate vertex buffer size from index buffer type (#3253)
* Calculate vertex buffer size from index buffer type

* We also need to update the size if first vertex changes
2022-04-08 11:02:06 +02:00
d04ba51bb0 amadeus: Improve and fix delay effect processing (#3205)
* amadeus: Improve and fix delay effect processing

This rework the delay effect processing by representing calculation with the appropriate matrix and by unrolling some loop in the code.
This allows better optimization by the JIT while making it more readeable.

Also fix a bug in the Surround code path found while looking back at my notes.

* Remove useless GetHashCode

* Address gdkchan's comments
2022-04-08 10:52:18 +02:00
55ee261363 service: hid: Signal event on AcquireNpadStyleSetUpdateEventHandle (#3247) 2022-04-07 15:43:14 -03:00
4e3a34412e Update to LibHac 0.16.1 (#3263) 2022-04-07 15:18:14 -03:00
3f4fb8f73a amadeus: Update to REV11 (#3230)
This should implement all ABI changes from REV11 on 14.0.0

As Nintendo changed the channel disposition for "legacy" effects (Delay, Reverb and Reverb 3D) to match the standard channel mapping, I took the liberty to just remap to the old disposition for now.
The proper changes will be handled at a later date with a complete rewriting of those 3 effects to be more readable (see https://github.com/Ryujinx/Ryujinx/pull/3205 for the first iteration of it).
2022-04-06 09:12:38 +02:00
56c56aa34d Do not clamp SNorm outputs to the [0, 1] range on OpenGL (#3260) 2022-04-05 18:09:06 -03:00
d4b960d348 Implement primitive restart draw arrays properly on OpenGL (#3256) 2022-04-04 18:43:24 -03:00
b2a225558d Do not force scissor on clear if scissor is disabled (#3258) 2022-04-04 18:30:43 -03:00
0ef0fc044a Small graphics abstraction layer cleanup (#3257) 2022-04-04 18:21:06 -03:00
04bd87ed5a Fix shader textureSize with multisample and buffer textures (#3240)
* Fix shader textureSize with multisample and buffer textures

* Replace out param with tuple return value
2022-04-04 14:43:58 -03:00
5158cdb308 infra: Put SDL2 headless release inside a GUI-less block in PR (#3218)
As title say.
2022-03-26 11:38:35 +01:00
1402d8391d Support NVDEC H264 interlaced video decoding and VIC deinterlacing (#3225)
* Support NVDEC H264 interlaced video decoding and VIC deinterlacing

* Remove unused code
2022-03-23 17:09:32 -03:00
e3b36db71c hle: Some cleanup (#3210)
* hle: Some cleanup

This PR cleaned up a bit the HLE folder and the VirtualFileSystem one, since we use LibHac, we can use some class of it directly instead of duplicate things. The "Content" of VFS folder is removed since it should be handled in the NCM service directly.
A larger cleanup should be done later since there is still be duplicated code here and there.

* Fix Headless.SDL2

* Addresses gdkchan feedback
2022-03-22 20:46:16 +01:00
ba0171d054 Memory.Tests: Make Multithreading test explicit (#3220) 2022-03-21 09:21:05 +01:00
d1146a5af2 Don't restore Viewport 0 if it hasn't been set yet. (#3219)
Fixes a driver crash when starting some games caused by #3217
2022-03-20 14:48:43 -03:00
79408b68c3 De-tile GOB when DMA copying from block linear to pitch kind memory regions (#3207)
* De-tile GOB when DMA copying from block linear to pitch kind memory regions

* XML docs + nits

* Remove using

* No flush for regular buffer copies

* Add back ulong casts, fix regression due to oversight
2022-03-20 13:55:07 -03:00
d461d4f68b Fix OpenGL issues with RTSS overlays and OBS Game Capture (#3217)
OpenGL game overlays and hooks tend to make a lot of assumptions about how games present frames to the screen, since presentation in OpenGL kind of sucks and they would like to have info such as the size of the screen, or if the contents are SRGB rather than linear.

There are two ways of getting this. OBS hooks swap buffers to get a frame for video capture, but it actually checks the bound framebuffer at the time. I made sure that this matches the output framebuffer (the window) so that the output matches the size. RTSS checks the viewport size by default, but this was actually set to the last used viewport by the game, causing the OSD to fly all across the screen depending on how it was used (or res scale). The viewport is now manually set to match the output framebuffer size.

In the case of RTSS, it also loads its resources by destructively setting a pixel pack parameter without regard to what it was set to by the guest application. OpenGL state can be set for a long period of time and is not expected to be set before each call to a method, so randomly changing it isn't great practice. To fix this, I've added a line to set the pixel unpack alignment back to 4 after presentation, which should cover RTSS loading its incredibly ugly font.

- RTSS and overlays that use it should no longer cause certain textures to load incorrectly. (mario kart 8, pokemon legends arceus)
- OBS Game Capture should no longer crop the game output incorrectly, flicker randomly, or capture with incorrect gamma.

This doesn't fix issues with how RTSS reports our frame timings.
2022-03-20 13:37:45 -03:00
b45d30acf8 oslc: Fix condition in GetSaveDataBackupSetting (#3208)
* oslc: Fix condition in GetSaveDataBackupSetting

This PR fixes a condition previously implemented in #3190 where ACNH can't be booted without an existing savedata.
Closes #3206

* Addresses gdkchan feedback
2022-03-20 13:25:29 -03:00
df70442c46 InstEmitMemoryEx: Barrier after write on ordered store (#3193)
* InstEmitMemoryEx: Barrier after write on ordered store

* increment ptc version

* 32
2022-03-19 10:32:35 -03:00
e2ffa5a125 ntc: Implement IEnsureNetworkClockAvailabilityService (#3192)
* ntc: Implement IEnsureNetworkClockAvailabilityService

This PR implement a basic `IEnsureNetworkClockAvailabilityService` checked by RE. It's needed by Splatoon 2 with Guest Internet Access enabled. Game is now playable with this setting.

* Update Ryujinx.HLE/HOS/Services/Nim/Ntc/StaticService/IEnsureNetworkClockAvailabilityService.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-03-15 04:07:07 +01:00
73feac5819 Caching local network info and using an event handler to invalidate as needed (improves menu slow down issue in FE3H) (#2761)
* Update IGeneralService.cs

Fix IPV4 local ip related frame drop in fire emblem by rewriting [CommandHipc(12)]

* Fix IPV4 Local IP Slowdown & Style Fixes

fix a missing space

* Remove unnecessary line

* Fix for hardcoding which index to use

* Replace argument with empty string.

By sending an empty string to Dns.GetHostAddresses("") you get back localhost info only.

* Add caching, undo change in GetCurrentIpAddress

Implement caching and revert the GetCurrentIP() function, speed improvements still present.

* Remove unnecessary using

* Syntax fixes and removing extra lines

Requested changes by AcK77

* Properly unsubscribe from event handler

Adds an unsubscribe in the dispose section of IGeneralService
2022-03-15 03:49:35 +01:00
e5ad1dfa48 Implement S8D24 texture format and tweak depth range detection (#2458) 2022-03-15 03:42:08 +01:00
79becc4b78 Dynamically increase buffer size when resizing (#2861)
* Grow buffers by 1.5x of its size when resizing

* Further restrict the cases where the dynamic expansion is done
2022-03-15 03:33:53 +01:00
223172ac0b Ui: Add option to show/hide console window (Windows-only) (#3170)
* Ui: Add option to show/hide console window (Windows-only)

* Ui: Only display Show Console menu item on Windows

* ConsoleHelper: Handle NULL case

This will never happen

* Address nits

* Address comments

* Address comments 2
2022-03-15 02:35:41 +01:00
8c9633d72f Initialize indexed inputs used on next shader stage (#3198) 2022-03-14 20:02:50 -03:00
1f93fd52d9 Do not initialize geometry shader passthrough attributes (#3196) 2022-03-14 19:35:41 -03:00
aac7bbd378 olsc: Implement GetSaveDataBackupSetting (#3190)
* olsc: Implement GetSaveDataBackupSetting

This PR implement GetSaveDataBackupSetting of OLSC service which is now needed by ACNH 2.0.5. The game is playable as usual if you use the same user profile as the original save file (I don't know if it was the case before), everything is checked by RE.

* addresses gdkchan feedback
2022-03-12 18:38:49 +01:00
bed516bfda Implement rotate stick 90 degrees clockwise (#3084)
* Implement swapping sticks

* Rotate 90 degrees clockwise

Co-authored-by: matesic.darko@gmail.com <Darkman1979>
2022-03-12 18:23:48 +01:00
69b05f9918 Fix GetUserDisableCount NRE (#3187) 2022-03-12 18:12:12 +01:00
fb7c80e928 Limit number of events that can be retrieved from GetDisplayVSyncEvent (#3188)
* Limit number of events that can be retrieved from GetDisplayVSyncEvent

* Cleaning

* Rename openDisplayInfos -> openDisplays
2022-03-12 17:56:19 +01:00
bb2f9df0a1 KThread: Fix GetPsr mask (#3180)
* ExecutionContext: GetPstate / SetPstate

* Put it in NativeContext

* KThread: Fix GetPsr mask

* ExecutionContext: Turn methods into Pstate property

* Address nit
2022-03-11 03:16:32 +01:00
54bfaa125d amadeus: Fix wrong Span usage in CopyHistories (#3181)
Fix a copypasta from the original Amadeus PR causing invalid
CopyHistories output.

Also added a missing size check.

This fix a crash in Mononoke Slashdown
2022-03-07 09:49:29 +01:00
7af9fcbc06 T32: Implement Data Processing (Modified Immediate) instructions (#3178)
* T32: Implement Data Processing (Modified Immediate) instructions

* Update tests

* switch -> lookup table
2022-03-06 22:25:01 +01:00
ee174be57c Mod loading from atmosphere SD directories (#3176)
* initial sd support

* GUI option

* alignment

* review changes
2022-03-06 22:12:01 +01:00
0bcbe32367 Only initialize shader outputs that are actually used on the next stage (#3054)
* Only initialize shader outputs that are actually used on the next stage

* Shader cache version bump
2022-03-06 20:42:13 +01:00
b97ff4da5e A32: Fix ALU immediate instructions (#3179)
* Tests: Add A32 tests for immediate ADC/ADCS/RSC/RSCS/SBC/SBCS

* A32: Fix bug in ADC/ADCS/RSC/RSCS/SBC/SBCS

* CpuTestAluImm32: Add more opcodes

* Increment PTC version
2022-03-05 15:23:10 -03:00
747081d2c7 Decoders: Fix instruction lengths for 16-bit B instructions (#3177) 2022-03-05 16:20:24 +01:00
497199bb50 Decoder: Exit on trapping instructions, and resume execution at trapping instruction (#3153)
* Decoder: Exit on trapping instructions, and resume execution at trapping instruction

* Resume at trapping address

* remove mustExit
2022-03-04 23:16:58 +01:00
bd9ac0fdaa T32: Implement B, B.cond, BL, BLX (#3155)
* Decoders: Make IsThumb a function of OpCode32

* OpCode32: Fix GetPc

* T32: Implement B, B.cond, BL, BLX

* rm usings
2022-03-04 23:05:08 +01:00
ac21abbb9d Preparation for initial Flatpack and FlatHub integration (#3173)
* Preparation for initial Flatpack and FlatHub integration

This integrate some initial changes required for Flatpack and distribution from FlatHub.

Also added some resources that will be used for packaging on Linux.

* Address gdkchan comment
2022-03-04 18:03:16 +01:00
a3dd04deef Implement -p or --profile command line argument (#2947)
* Implement -p or --profile command line argument

* Put command line logic all in Program and reference it elsewhere

* Address feedback
2022-03-02 09:51:37 +01:00
3705c20668 Update LibHac to v0.16.0 (#3159) 2022-02-27 00:52:25 +01:00
7b35ebc64a T32: Implement ALU (shifted register) instructions (#3135)
* T32: Implement ADC, ADD, AND, BIC, CMN, CMP, EOR, MOV, MVN, ORN, ORR, RSB, SBC, SUB, TEQ, TST (shifted register)

* OpCodeTable: Sort T32 list

* Tests: Rename RandomTestCase to PrecomputedThumbTestCase

* T32: Tests for AluRsImm instructions

* fix nit

* fix nit 2
2022-02-22 19:11:28 -03:00
0a24aa6af2 Allow textures to have their data partially mapped (#2629)
* Allow textures to have their data partially mapped

* Explicitly check for invalid memory ranges on the MultiRangeList

* Update GetWritableRegion to also support unmapped ranges
2022-02-22 13:34:16 -03:00
c9c65af59e Perform unscaled 2d engine copy on CPU if source texture isn't in cache. (#3112)
* Initial implementation of fast 2d copy

TODO: Partial copy for mismatching region/size.

* WIP

* Cleanup

* Update Ryujinx.Graphics.Gpu/Engine/Twod/TwodClass.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-02-22 11:21:29 -03:00
dc063eac83 ARMeilleure: Implement single stepping (#3133)
* Decoder: Implement SingleInstruction decoder mode

* Translator: Implement Step

* DecoderMode: Rename Normal to MultipleBlocks
2022-02-22 11:11:42 -03:00
ccf23fc629 gui: Fixes the games icon when there is an update (#3148)
* gui: Fixes the games icon when there is a game update

Currently we just load the version of the update, instead of the whole NACP file. This PR fixes that. A little cleanup is made into the code to avoid duplicate things.
(Closes #3039)

* Fix condition
2022-02-22 14:53:39 +01:00
f1460d5494 A32: Fix BLX and BXWritePC (#3151) 2022-02-22 10:41:56 -03:00
644b497df1 Collapse AsSpan().Slice(..) calls into AsSpan(..) (#3145)
* Collapse AsSpan().Slice(..) calls into AsSpan(..)

Less code and a bit faster

* Collapse an Array.Clear(array, 0, array.Length) call to Array.Clear(array)
2022-02-22 10:32:10 -03:00
fb935fd201 Add dedicated ServerBase for FileSystem services (#3142)
This should prevent filesystem services from blocking other services that don't have their own ServerBase. May improve filesystem related stutters in certain titles.

Improves button advanced cutscenes such as Miqol's Request in Xenoblade: DE when the game is on a network share (used to stutter when voice lines played).

Should probably be tested to make sure no mysterious bugs have been unearthed, and to see if any other filesystem related perf issues are improved.
2022-02-19 15:29:11 +01:00
f2087ca29e PPTC version increment (#3139) 2022-02-17 23:52:42 -03:00
92d166ecb7 Enable CPU JIT cache invalidation (#2965)
* Enable CPU JIT cache invalidation

* Invalidate cache on IC IVAU
2022-02-18 02:53:18 +01:00
72e543e946 Prefer texture over textureSize for sampler type (#3132)
* Prefer texture over textureSize for sampler type

* Shader cache version bump
2022-02-18 02:44:46 +01:00
98c838b24c Use BitOperations methods and delete now unused BitUtils methods (#3134)
Replaces BitUtils.CountTrailingZeros/CountLeadingZeros/IsPowerOfTwo with BitOperations methods
2022-02-18 02:35:23 +01:00
63c9c64196 Move kernel syscall logs to new trace log level (#3137) 2022-02-18 02:14:05 +01:00
a113ed0811 Implement/Stub mnpp:app service and some hid calls (#3131)
* Implement/Stub mnpp:app service and some hid calls

This PR Implement/Stub the `mnpp:app` service (closes #3107) accordingly to RE, which seems to do some telemetry for China region only, so everything is stubbed.

This PR fixes some inconsistencies in the hid service too and stub EnableSixAxisSensorUnalteredPassthrough, IsSixAxisSensorUnalteredPassthroughEnabled, LoadSixAxisSensorCalibrationParameter, GetSixAxisSensorIcInformation calls (closes #3123 and closes #3124).

* Addresses Thog review
2022-02-18 02:00:06 +01:00
747876dc67 Decoders: Add IOpCode32HasSetFlags (#3136) 2022-02-18 01:33:43 +01:00
95cc18a7b4 Added trace log level (#3096)
* added trace log level

* use trace log level instead of debug ( #1547)

* alignment #1547

* moved trace logs toggle at the bottom #1547

* bumped config file version #3096

* added migration step #3096

* setting moved to the dev section #1547

* performance warning displayed when trace is enabled #1547
2022-02-17 21:08:07 -03:00
c017c77365 Change ServiceNv map creation logs to the Debug level (#3058) 2022-02-18 00:41:21 +01:00
98e05ee4b7 ARMeilleure: Thumb support (All T16 instructions) (#3105)
* Decoders: Add InITBlock argument

* OpCodeTable: Minor cleanup

* OpCodeTable: Remove existing thumb instruction implementations

* OpCodeTable: Prepare for thumb instructions

* OpCodeTables: Improve thumb fast lookup

* Tests: Prepare for thumb tests

* T16: Implement BX

* T16: Implement LSL/LSR/ASR (imm)

* T16: Implement ADDS, SUBS (reg)

* T16: Implement ADDS, SUBS (3-bit immediate)

* T16: Implement MOVS, CMP, ADDS, SUBS (8-bit immediate)

* T16: Implement ANDS, EORS, LSLS, LSRS, ASRS, ADCS, SBCS, RORS, TST, NEGS, CMP, CMN, ORRS, MULS, BICS, MVNS (low registers)

* T16: Implement ADD, CMP, MOV (high reg)

* T16: Implement BLX (reg)

* T16: Implement LDR (literal)

* T16: Implement {LDR,STR}{,H,B,SB,SH} (register)

* T16: Implement {LDR,STR}{,B,H} (immediate)

* T16: Implement LDR/STR (SP)

* T16: Implement ADR

* T16: Implement Add to SP (immediate)

* T16: Implement ADD/SUB (SP)

* T16: Implement SXTH, SXTB, UXTH, UTXB

* T16: Implement CBZ, CBNZ

* T16: Implement PUSH, POP

* T16: Implement REV, REV16, REVSH

* T16: Implement NOP

* T16: Implement LDM, STM

* T16: Implement SVC

* T16: Implement B (conditional)

* T16: Implement B (unconditional)

* T16: Implement IT

* fixup! T16: Implement ADD/SUB (SP)

* fixup! T16: Implement Add to SP (immediate)

* fixup! T16: Implement IT

* CpuTestThumb: Add randomized tests

* Remove inITBlock argument

* Address nits

* Use index to handle IfThenBlockState

* Reduce line noise

* fixup

* nit
2022-02-17 19:39:45 -03:00
868919e101 misc: Update GtkSharp.Dependencies and speed up initial Windows build (#3128)
Update GtkSharp.Dependencies to fix between menu flickering and enable
the SkipInstallGtk in csproj to avoid downloading unused GTK3 install
locally.
2022-02-17 22:10:48 +01:00
9ca040c0ff Use ReadOnlySpan<byte> compiler optimization for static data (#3130) 2022-02-17 21:38:50 +01:00
7e9011673b Use a basic cubic interpolation for the audren upsampler (#3129)
Before, it was selecting nearest neighbour, which sounded terrible. This is likely temporary til the upsampling algorithm used by the switch is reversed.

Fixes bad audio in Skyward Sword HD.
2022-02-17 20:19:29 +01:00
741db8e43d amadeus: Fix PCMFloat datasource command v1 (#3127)
Really simple copy pasta error here.

Shouldn't affect anything as float support was added at the same REV as
datasource command v2.
2022-02-16 23:55:40 +01:00
3bd357045f Do not allow render targets not explicitly written by the fragment shader to be modified (#3063)
* Do not allow render targets not explicitly written by the fragment shader to be modified

* Shader cache version bump

* Remove blank lines

* Avoid redundant color mask updates

* HostShaderCacheEntry can be null

* Avoid more redundant glColorMask calls

* nit: Mask -> Masks

* Fix currentComponentMask

* More efficient way to update _currentComponentMasks
2022-02-16 23:15:39 +01:00
ab5d77c0c4 amadeus: Fix limiter correctness (#3126)
This fixes missing audio on Nintendo Switch Sports Online Play Test.
2022-02-16 21:38:45 +01:00
7bfb5f79b8 When copying linear textures, DMA should ignore region X/Y (#3121) 2022-02-16 11:13:45 +01:00
8cc2479825 Adjusting how deadzones are calculated (#3079)
* Making deadzones feel nice and smooth + adding rider files to .gitignore

* removing unnecessary parentheses and fixing possibility of divide by 0

* formatting :)

* fixing up ClampAxis

* fixing up ClampAxis
2022-02-16 11:06:52 +01:00
8f35345729 Use Enum and Delegate.CreateDelegate generic overloads (#3111)
* Use Enum generic overloads

* Remove EnumExtensions.cs

* Use Delegate.CreateDelegate generic overloads
2022-02-13 10:50:07 -03:00
ce71f9144e InstEmitMemory32: Literal loads always have word-aligned PC (#3104) 2022-02-11 17:51:03 -03:00
f861f0bca2 Fix missing geometry shader passthrough inputs (#3106)
* Fix missing geometry shader passthrough inputs

* Shader cache version bump
2022-02-11 19:52:20 +01:00
571496d243 Ship SoundIO library only for the specified runtime (#3103)
* Add RuntimeIdentifers properties

For Linux, Windows and OS X x86-64
This ensures that the SoundIO project gets this property when built as a subproject

* Address gdkchan's nit

Merge tags into one
2022-02-11 00:15:13 +01:00
c3c3914ed3 Add a limit on the number of uses a constant may have (#3097) 2022-02-09 17:42:47 -03:00
6dffe0fad4 misc: Make PID unsigned long instead of long (#3043) 2022-02-09 17:18:07 -03:00
86b37d0ff7 ARMeilleure: A32: Implement SHSUB8 and UHSUB8 (#3089)
* ARMeilleure: A32: Implement UHSUB8

* ARMeilleure: A32: Implement SHSUB8
2022-02-08 10:46:42 +01:00
863c581190 fix headless sdl2 option string (#3093) 2022-02-07 11:50:51 +01:00
5c3112aaeb Convert the bool to a lowercase string (#3080)
mesa_glthread doesn't accept PascalCase input
2022-02-06 12:52:39 -03:00
88d3ffb97c ARMeilleure: A32: Implement SHADD8 (#3086) 2022-02-06 12:25:45 -03:00
222b1ad7da ARMeilleure: OpCodeTable: Add CMN (RsReg) (#3087) 2022-02-06 02:01:05 +01:00
d317cfd639 Try to ensure save data always has a valid owner ID (#3057)
- Run the extra data fix in FixExtraData on non-system saves that have no owner ID.
- Set the owner ID in the dummy application control property if an application doesn't have a proper one available.
2022-02-02 14:49:49 -07:00
76b9041adf Fix the pronunciation of Ryujinx (#3059) 2022-01-31 18:34:21 +01:00
b944941733 Fix bug that could cause depth buffer to be missing after clear (#3067) 2022-01-31 00:11:43 -03:00
0dddcd012c Remove Appveyor from Readme and SLN (#3026)
* Replace Appveyor with Github badge.

* Delete appveyor.yml

* Remove Appveyor from SLN
2022-01-30 16:41:22 +01:00
bd412afb9f Fix small precision error on CPU reciprocal estimate instructions (#3061)
* Fix small precision error on CPU reciprocal estimate instructions

* PPTC version bump
2022-01-29 23:59:34 +01:00
20ce37dee6 kernel: A bit of refactoring and fix GetThreadContext3 correctness (#3042)
* Start refactoring kernel a bit and import some changes from kernel decoupling PR

* kernel: Put output always at the start in Syscall functions

* kernel: Rewrite GetThreadContext3 to use a structure and to be accurate

* kernel: make KernelTransfer use generic types and simplify

* Fix some warning and do not use getters on MemoryInfo

* Address gdkchan's comment

* GetThreadContext3: use correct pause flag
2022-01-29 22:18:03 +01:00
c52158b733 Add timestamp to 16-byte/4-word semaphore releases. (#3049)
* Add timestamp to 16-byte semaphore releases.

BOTW was reading a ulong 8 bytes after a semaphore return. Turns out this is the timestamp it was trying to do performance calculation with, so I've made it write when necessary.

This mode was also added to the DMA semaphore I added recently, as it is required by a few games. (i think quake?)

The timestamp code has been moved to GPU context. Check other games with an unusually low framerate cap or dynamic resolution to see if they have improved.

* Cast dma semaphore payload to ulong to fill the space

* Write timestamp first

Might be just worrying too much, but we don't want the applcation reading timestamp if it sees the payload before timestamp is written.
2022-01-27 22:50:32 +01:00
fd6d3ec88f Fix res scale parameters not being updated in vertex shader (#3046)
This fixes an issue where the render scale array would not be updated when technically the scales on the flat array were the same, but the start index for the vertex scales was different.
2022-01-27 14:17:13 -03:00
0a0a95fd81 Convert Octal-Mode to Decimal (#3041)
Apparently C# doesn't use 0 as a prefix like C does.
2022-01-25 23:31:04 +01:00
26019c7d06 Fix regression on PR builds version number since new release system 2022-01-24 18:49:14 +01:00
f3bfd799e1 Fix calls passing V128 values on Linux (#3034)
* Fix calls passing V128 values on Linux

* PPTC version bump
2022-01-24 11:23:24 +01:00
1325 changed files with 97246 additions and 19955 deletions

View File

@ -39,13 +39,14 @@ jobs:
- os: windows-latest
OS_NAME: Windows x64
DOTNET_RUNTIME_IDENTIFIER: win-x64
DOTNET_RUNTIME_IDENTIFIER: win10-x64
RELEASE_ZIP_OS_NAME: win_x64
fail-fast: false
env:
POWERSHELL_TELEMETRY_OPTOUT: 1
DOTNET_CLI_TELEMETRY_OPTOUT: 1
RYUJINX_BASE_VERSION: "1.1.0"
steps:
- uses: actions/checkout@v2
- uses: actions/setup-dotnet@v1
@ -59,24 +60,33 @@ jobs:
- name: Clear
run: dotnet clean && dotnet nuget locals all --clear
- name: Build
run: dotnet build -c "${{ matrix.configuration }}" /p:Version="1.1.0" /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER
run: dotnet build -c "${{ matrix.configuration }}" /p:Version="${{ env.RYUJINX_BASE_VERSION }}" /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER
- name: Test
run: dotnet test -c "${{ matrix.configuration }}"
- name: Publish Ryujinx
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish /p:Version="1.1.0" /p:DebugType=embedded /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx --self-contained
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish /p:Version="${{ env.RYUJINX_BASE_VERSION }}" /p:DebugType=embedded /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx --self-contained
if: github.event_name == 'pull_request'
- name: Publish Ryujinx.Headless.SDL2
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish_sdl2_headless /p:Version="1.1.0" /p:DebugType=embedded /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx.Headless.SDL2 --self-contained
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish_sdl2_headless /p:Version="${{ env.RYUJINX_BASE_VERSION }}" /p:DebugType=embedded /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx.Headless.SDL2 --self-contained
if: github.event_name == 'pull_request'
- name: Publish Ryujinx.Ava
run: dotnet publish -c "${{ matrix.configuration }}" -r "${{ matrix.DOTNET_RUNTIME_IDENTIFIER }}" -o ./publish_ava /p:Version="1.0.0" /p:DebugType=embedded /p:SourceRevisionId="${{ steps.git_short_hash.outputs.result }}" /p:ExtraDefineConstants=DISABLE_UPDATER Ryujinx.Ava
if: github.event_name == 'pull_request'
- name: Upload Ryujinx artifact
uses: actions/upload-artifact@v2
with:
name: ryujinx-${{ matrix.configuration }}-1.0.0+${{ steps.git_short_hash.outputs.result }}-${{ matrix.RELEASE_ZIP_OS_NAME }}
name: ryujinx-${{ matrix.configuration }}-${{ env.RYUJINX_BASE_VERSION }}+${{ steps.git_short_hash.outputs.result }}-${{ matrix.RELEASE_ZIP_OS_NAME }}
path: publish
if: github.event_name == 'pull_request'
- name: Upload Ryujinx.Headless.SDL2 artifact
uses: actions/upload-artifact@v2
with:
name: ryujinx-headless-sdl2-${{ matrix.configuration }}-1.0.0+${{ steps.git_short_hash.outputs.result }}-${{ matrix.RELEASE_ZIP_OS_NAME }}
name: sdl2-ryujinx-headless-${{ matrix.configuration }}-${{ env.RYUJINX_BASE_VERSION }}+${{ steps.git_short_hash.outputs.result }}-${{ matrix.RELEASE_ZIP_OS_NAME }}
path: publish_sdl2_headless
if: github.event_name == 'pull_request'
- name: Upload Ryujinx.Ava artifact
uses: actions/upload-artifact@v2
with:
name: ava-ryujinx-${{ matrix.configuration }}-${{ env.RYUJINX_BASE_VERSION }}+${{ steps.git_short_hash.outputs.result }}-${{ matrix.RELEASE_ZIP_OS_NAME }}
path: publish_ava
if: github.event_name == 'pull_request'

View File

@ -36,15 +36,25 @@ jobs:
return core.error(`No artifacts found`);
}
let body = `Download the artifacts for this pull request:\n`;
let hidden_avalonia_artifacts = `\n\n <details><summary>Experimental GUI (Avalonia)</summary>\n`;
let hidden_headless_artifacts = `\n\n <details><summary>GUI-less (SDL2)</summary>\n`;
let hidden_debug_artifacts = `\n\n <details><summary>Only for Developers</summary>\n`;
for (const art of artifacts) {
if(art.name.includes('Debug')){
if(art.name.includes('Debug')) {
hidden_debug_artifacts += `\n* [${art.name}](https://nightly.link/${owner}/${repo}/actions/artifacts/${art.id}.zip)`;
}else{
} else if(art.name.includes('ava-ryujinx')) {
hidden_avalonia_artifacts += `\n* [${art.name}](https://nightly.link/${owner}/${repo}/actions/artifacts/${art.id}.zip)`;
} else if(art.name.includes('sdl2-ryujinx-headless')) {
hidden_headless_artifacts += `\n* [${art.name}](https://nightly.link/${owner}/${repo}/actions/artifacts/${art.id}.zip)`;
} else {
body += `\n* [${art.name}](https://nightly.link/${owner}/${repo}/actions/artifacts/${art.id}.zip)`;
}
}
hidden_avalonia_artifacts += `\n</details>`;
hidden_headless_artifacts += `\n</details>`;
hidden_debug_artifacts += `\n</details>`;
body += hidden_avalonia_artifacts;
body += hidden_headless_artifacts;
body += hidden_debug_artifacts;
const {data: comments} = await github.issues.listComments({repo, owner, issue_number});

View File

@ -51,8 +51,9 @@ jobs:
run: "mkdir release_output"
- name: Publish Windows
run: |
dotnet publish -c Release -r win-x64 -o ./publish_windows/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx --self-contained
dotnet publish -c Release -r win-x64 -o ./publish_windows_sdl2_headless/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx.Headless.SDL2 --self-contained
dotnet publish -c Release -r win10-x64 -o ./publish_windows/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx --self-contained
dotnet publish -c Release -r win10-x64 -o ./publish_windows_sdl2_headless/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx.Headless.SDL2 --self-contained
dotnet publish -c Release -r win10-x64 -o ./publish_windows_ava/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx.Ava --self-contained
- name: Packing Windows builds
run: |
pushd publish_windows
@ -60,7 +61,11 @@ jobs:
popd
pushd publish_windows_sdl2_headless
7z a ../release_output/ryujinx-headless-sdl2-${{ steps.version_info.outputs.build_version }}-win_x64.zip publish
7z a ../release_output/sdl2-ryujinx-headless-${{ steps.version_info.outputs.build_version }}-win_x64.zip publish
popd
pushd publish_windows_ava
7z a ../release_output/test-ava-ryujinx-${{ steps.version_info.outputs.build_version }}-win_x64.zip publish
popd
shell: bash
@ -68,6 +73,7 @@ jobs:
run: |
dotnet publish -c Release -r linux-x64 -o ./publish_linux/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx --self-contained
dotnet publish -c Release -r linux-x64 -o ./publish_linux_sdl2_headless/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx.Headless.SDL2 --self-contained
dotnet publish -c Release -r linux-x64 -o ./publish_linux_ava/publish /p:Version="${{ steps.version_info.outputs.build_version }}" /p:SourceRevisionId="${{ steps.version_info.outputs.git_short_hash }}" /p:DebugType=embedded Ryujinx.Ava --self-contained
- name: Packing Linux builds
run: |
@ -76,7 +82,11 @@ jobs:
popd
pushd publish_linux_sdl2_headless
tar -czvf ../release_output/ryujinx-headless-sdl2-${{ steps.version_info.outputs.build_version }}-linux_x64.tar.gz publish
tar -czvf ../release_output/sdl2-ryujinx-headless-${{ steps.version_info.outputs.build_version }}-linux_x64.tar.gz publish
popd
pushd publish_linux_ava
tar -czvf ../release_output/test-ava-ryujinx-${{ steps.version_info.outputs.build_version }}-linux_x64.tar.gz publish
popd
shell: bash

3
.gitignore vendored
View File

@ -74,6 +74,9 @@ _TeamCity*
# DotCover is a Code Coverage Tool
*.dotCover
# Rider is a Visual Studio alternative
.idea/*
# NCrunch
*.ncrunch*
.*crunch*.local.xml

View File

@ -58,7 +58,6 @@ namespace ARMeilleure.CodeGen.Linking
/// <param name="a">First instance</param>
/// <param name="b">Second instance</param>
/// <returns><see langword="true"/> if not equal; otherwise <see langword="false"/></returns>
/// <inheritdoc/>
public static bool operator !=(Symbol a, Symbol b)
{
return !(a == b);

View File

@ -59,7 +59,7 @@ namespace ARMeilleure.CodeGen.Optimizations
BasicBlock fromPred = from.Predecessors.Count == 1 ? from.Predecessors[0] : null;
// If the block is empty, we can try to append to the predecessor and avoid unnecessary jumps.
if (from.Operations.Count == 0 && fromPred != null)
if (from.Operations.Count == 0 && fromPred != null && fromPred.SuccessorsCount == 1)
{
for (int i = 0; i < fromPred.SuccessorsCount; i++)
{

View File

@ -1,6 +1,4 @@
using ARMeilleure.Common;
using ARMeilleure.IntermediateRepresentation;
using System;
namespace ARMeilleure.CodeGen.RegisterAllocators
{

View File

@ -4,6 +4,11 @@ namespace ARMeilleure.CodeGen.X86
{
partial class Assembler
{
public static bool SupportsVexPrefix(X86Instruction inst)
{
return _instTable[(int)inst].Flags.HasFlag(InstructionFlags.Vex);
}
private const int BadOp = 0;
[Flags]
@ -152,6 +157,7 @@ namespace ARMeilleure.CodeGen.X86
Add(X86Instruction.Paddd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000ffe, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Paddq, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fd4, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Paddw, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000ffd, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Palignr, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f3a0f, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Pand, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fdb, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Pandn, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fdf, InstructionFlags.Vex | InstructionFlags.Prefix66));
Add(X86Instruction.Pavgb, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fe0, InstructionFlags.Vex | InstructionFlags.Prefix66));
@ -234,6 +240,9 @@ namespace ARMeilleure.CodeGen.X86
Add(X86Instruction.Rsqrtss, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000f52, InstructionFlags.Vex | InstructionFlags.PrefixF3));
Add(X86Instruction.Sar, new InstructionInfo(0x070000d3, 0x070000c1, BadOp, BadOp, BadOp, InstructionFlags.None));
Add(X86Instruction.Setcc, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000f90, InstructionFlags.Reg8Dest));
Add(X86Instruction.Sha256Msg1, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38cc, InstructionFlags.None));
Add(X86Instruction.Sha256Msg2, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38cd, InstructionFlags.None));
Add(X86Instruction.Sha256Rnds2, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x000f38cb, InstructionFlags.None));
Add(X86Instruction.Shl, new InstructionInfo(0x040000d3, 0x040000c1, BadOp, BadOp, BadOp, InstructionFlags.None));
Add(X86Instruction.Shr, new InstructionInfo(0x050000d3, 0x050000c1, BadOp, BadOp, BadOp, InstructionFlags.None));
Add(X86Instruction.Shufpd, new InstructionInfo(BadOp, BadOp, BadOp, BadOp, 0x00000fc6, InstructionFlags.Vex | InstructionFlags.Prefix66));

View File

@ -1,5 +1,4 @@
using System;
using System.Runtime.InteropServices;
namespace ARMeilleure.CodeGen.X86
{

View File

@ -12,21 +12,28 @@ namespace ARMeilleure.CodeGen.X86
return;
}
(_, _, int ecx, int edx) = X86Base.CpuId(0x00000001, 0x00000000);
(int maxNum, _, _, _) = X86Base.CpuId(0x00000000, 0x00000000);
FeatureInfoEdx = (FeatureFlagsEdx)edx;
FeatureInfoEcx = (FeatureFlagsEcx)ecx;
(_, _, int ecx1, int edx1) = X86Base.CpuId(0x00000001, 0x00000000);
FeatureInfo1Edx = (FeatureFlags1Edx)edx1;
FeatureInfo1Ecx = (FeatureFlags1Ecx)ecx1;
if (maxNum >= 7)
{
(_, int ebx7, _, _) = X86Base.CpuId(0x00000007, 0x00000000);
FeatureInfo7Ebx = (FeatureFlags7Ebx)ebx7;
}
}
[Flags]
public enum FeatureFlagsEdx
public enum FeatureFlags1Edx
{
Sse = 1 << 25,
Sse2 = 1 << 26
}
[Flags]
public enum FeatureFlagsEcx
public enum FeatureFlags1Ecx
{
Sse3 = 1 << 0,
Pclmulqdq = 1 << 1,
@ -40,21 +47,31 @@ namespace ARMeilleure.CodeGen.X86
F16c = 1 << 29
}
public static FeatureFlagsEdx FeatureInfoEdx { get; }
public static FeatureFlagsEcx FeatureInfoEcx { get; }
[Flags]
public enum FeatureFlags7Ebx
{
Avx2 = 1 << 5,
Sha = 1 << 29
}
public static bool SupportsSse => FeatureInfoEdx.HasFlag(FeatureFlagsEdx.Sse);
public static bool SupportsSse2 => FeatureInfoEdx.HasFlag(FeatureFlagsEdx.Sse2);
public static bool SupportsSse3 => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Sse3);
public static bool SupportsPclmulqdq => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Pclmulqdq);
public static bool SupportsSsse3 => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Ssse3);
public static bool SupportsFma => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Fma);
public static bool SupportsSse41 => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Sse41);
public static bool SupportsSse42 => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Sse42);
public static bool SupportsPopcnt => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Popcnt);
public static bool SupportsAesni => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Aes);
public static bool SupportsAvx => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.Avx);
public static bool SupportsF16c => FeatureInfoEcx.HasFlag(FeatureFlagsEcx.F16c);
public static FeatureFlags1Edx FeatureInfo1Edx { get; }
public static FeatureFlags1Ecx FeatureInfo1Ecx { get; }
public static FeatureFlags7Ebx FeatureInfo7Ebx { get; } = 0;
public static bool SupportsSse => FeatureInfo1Edx.HasFlag(FeatureFlags1Edx.Sse);
public static bool SupportsSse2 => FeatureInfo1Edx.HasFlag(FeatureFlags1Edx.Sse2);
public static bool SupportsSse3 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Sse3);
public static bool SupportsPclmulqdq => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Pclmulqdq);
public static bool SupportsSsse3 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Ssse3);
public static bool SupportsFma => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Fma);
public static bool SupportsSse41 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Sse41);
public static bool SupportsSse42 => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Sse42);
public static bool SupportsPopcnt => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Popcnt);
public static bool SupportsAesni => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Aes);
public static bool SupportsAvx => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.Avx);
public static bool SupportsAvx2 => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Avx2) && SupportsAvx;
public static bool SupportsF16c => FeatureInfo1Ecx.HasFlag(FeatureFlags1Ecx.F16c);
public static bool SupportsSha => FeatureInfo7Ebx.HasFlag(FeatureFlags7Ebx.Sha);
public static bool ForceLegacySse { get; set; }

View File

@ -82,6 +82,7 @@ namespace ARMeilleure.CodeGen.X86
Add(Intrinsic.X86Paddd, new IntrinsicInfo(X86Instruction.Paddd, IntrinsicType.Binary));
Add(Intrinsic.X86Paddq, new IntrinsicInfo(X86Instruction.Paddq, IntrinsicType.Binary));
Add(Intrinsic.X86Paddw, new IntrinsicInfo(X86Instruction.Paddw, IntrinsicType.Binary));
Add(Intrinsic.X86Palignr, new IntrinsicInfo(X86Instruction.Palignr, IntrinsicType.TernaryImm));
Add(Intrinsic.X86Pand, new IntrinsicInfo(X86Instruction.Pand, IntrinsicType.Binary));
Add(Intrinsic.X86Pandn, new IntrinsicInfo(X86Instruction.Pandn, IntrinsicType.Binary));
Add(Intrinsic.X86Pavgb, new IntrinsicInfo(X86Instruction.Pavgb, IntrinsicType.Binary));
@ -151,6 +152,9 @@ namespace ARMeilleure.CodeGen.X86
Add(Intrinsic.X86Roundss, new IntrinsicInfo(X86Instruction.Roundss, IntrinsicType.BinaryImm));
Add(Intrinsic.X86Rsqrtps, new IntrinsicInfo(X86Instruction.Rsqrtps, IntrinsicType.Unary));
Add(Intrinsic.X86Rsqrtss, new IntrinsicInfo(X86Instruction.Rsqrtss, IntrinsicType.Unary));
Add(Intrinsic.X86Sha256Msg1, new IntrinsicInfo(X86Instruction.Sha256Msg1, IntrinsicType.Binary));
Add(Intrinsic.X86Sha256Msg2, new IntrinsicInfo(X86Instruction.Sha256Msg2, IntrinsicType.Binary));
Add(Intrinsic.X86Sha256Rnds2, new IntrinsicInfo(X86Instruction.Sha256Rnds2, IntrinsicType.Ternary));
Add(Intrinsic.X86Shufpd, new IntrinsicInfo(X86Instruction.Shufpd, IntrinsicType.TernaryImm));
Add(Intrinsic.X86Shufps, new IntrinsicInfo(X86Instruction.Shufps, IntrinsicType.TernaryImm));
Add(Intrinsic.X86Sqrtpd, new IntrinsicInfo(X86Instruction.Sqrtpd, IntrinsicType.Unary));

View File

@ -308,11 +308,13 @@ namespace ARMeilleure.CodeGen.X86
case Instruction.Extended:
{
bool isBlend = node.Intrinsic == Intrinsic.X86Blendvpd ||
node.Intrinsic == Intrinsic.X86Blendvps ||
node.Intrinsic == Intrinsic.X86Pblendvb;
// BLENDVPD, BLENDVPS, PBLENDVB last operand is always implied to be XMM0 when VEX is not supported.
if ((node.Intrinsic == Intrinsic.X86Blendvpd ||
node.Intrinsic == Intrinsic.X86Blendvps ||
node.Intrinsic == Intrinsic.X86Pblendvb) &&
!HardwareCapabilities.SupportsVexEncoding)
// SHA256RNDS2 always has an implied XMM0 as a last operand.
if ((isBlend && !HardwareCapabilities.SupportsVexEncoding) || node.Intrinsic == Intrinsic.X86Sha256Rnds2)
{
Operand xmm0 = Xmm(X86Register.Xmm0, OperandType.V128);
@ -796,6 +798,8 @@ namespace ARMeilleure.CodeGen.X86
}
}
node.SetSources(sources.ToArray());
if (dest != default)
{
if (dest.Type == OperandType.V128)
@ -823,8 +827,6 @@ namespace ARMeilleure.CodeGen.X86
node.Destination = retReg;
}
}
node.SetSources(sources.ToArray());
}
private static void HandleTailcallSystemVAbi(IntrusiveList<Operation> nodes, StackAllocator stackAlloc, Operation node)
@ -1297,11 +1299,15 @@ namespace ARMeilleure.CodeGen.X86
{
if (IsIntrinsic(operation.Instruction))
{
IntrinsicInfo info = IntrinsicTable.GetInfo(operation.Intrinsic);
bool hasVex = HardwareCapabilities.SupportsVexEncoding && Assembler.SupportsVexPrefix(info.Inst);
bool isUnary = operation.SourcesCount < 2;
bool hasVecDest = operation.Destination != default && operation.Destination.Type == OperandType.V128;
return !HardwareCapabilities.SupportsVexEncoding && !isUnary && hasVecDest;
return !hasVex && !isUnary && hasVecDest;
}
return false;

View File

@ -98,6 +98,7 @@ namespace ARMeilleure.CodeGen.X86
Paddd,
Paddq,
Paddw,
Palignr,
Pand,
Pandn,
Pavgb,
@ -180,6 +181,9 @@ namespace ARMeilleure.CodeGen.X86
Rsqrtss,
Sar,
Setcc,
Sha256Msg1,
Sha256Msg2,
Sha256Rnds2,
Shl,
Shr,
Shufpd,

View File

@ -9,13 +9,17 @@ namespace ARMeilleure.CodeGen.X86
{
static class X86Optimizer
{
private const int MaxConstantUses = 10000;
public static void RunPass(ControlFlowGraph cfg)
{
var constants = new Dictionary<ulong, Operand>();
Operand GetConstantCopy(BasicBlock block, Operation operation, Operand source)
{
if (!constants.TryGetValue(source.Value, out var constant))
// If the constant has many uses, we also force a new constant mov to be added, in order
// to avoid overflow of the counts field (that is limited to 16 bits).
if (!constants.TryGetValue(source.Value, out var constant) || constant.UsesCount > MaxConstantUses)
{
constant = Local(source.Type);
@ -23,7 +27,7 @@ namespace ARMeilleure.CodeGen.X86
block.Operations.AddBefore(operation, copyOp);
constants.Add(source.Value, constant);
constants[source.Value] = constant;
}
return constant;

View File

@ -206,7 +206,7 @@ namespace ARMeilleure.Common
/// <typeparam name="T">Type of elements</typeparam>
/// <param name="length">Number of elements</param>
/// <param name="fill">Fill value</param>
/// <param name="leaf"><see langword="true"/> if leaf; otherwise <see langword=""="false"/></param>
/// <param name="leaf"><see langword="true"/> if leaf; otherwise <see langword="false"/></param>
/// <returns>Allocated block</returns>
private IntPtr Allocate<T>(int length, T fill, bool leaf) where T : unmanaged
{

View File

@ -1,7 +1,6 @@
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
namespace ARMeilleure.Common
{

View File

@ -9,6 +9,9 @@ namespace ARMeilleure.Common
class Counter<T> : IDisposable where T : unmanaged
{
private bool _disposed;
/// <summary>
/// Index in the <see cref="EntryTable{T}"/>
/// </summary>
private readonly int _index;
private readonly EntryTable<T> _countTable;
@ -17,7 +20,6 @@ namespace ARMeilleure.Common
/// <see cref="EntryTable{T}"/> instance and index.
/// </summary>
/// <param name="countTable"><see cref="EntryTable{T}"/> instance</param>
/// <param name="index">Index in the <see cref="EntryTable{T}"/></param>
/// <exception cref="ArgumentNullException"><paramref name="countTable"/> is <see langword="null"/></exception>
/// <exception cref="ArgumentException"><typeparamref name="T"/> is unsupported</exception>
public Counter(EntryTable<T> countTable)
@ -68,7 +70,7 @@ namespace ARMeilleure.Common
/// <summary>
/// Releases all unmanaged and optionally managed resources used by the <see cref="Counter{T}"/> instance.
/// </summary>
/// <param name="disposing"><see langword="true"/> to dispose managed resources also; otherwise just unmanaged resouces</param>
/// <param name="disposing"><see langword="true"/> to dispose managed resources also; otherwise just unmanaged resources</param>
protected virtual void Dispose(bool disposing)
{
if (!_disposed)

View File

@ -18,7 +18,7 @@ namespace ARMeilleure.Decoders
// For lower code quality translation, we set a lower limit since we're blocking execution.
private const int MaxInstsPerFunctionLowCq = 500;
public static Block[] Decode(IMemoryManager memory, ulong address, ExecutionMode mode, bool highCq, bool singleBlock)
public static Block[] Decode(IMemoryManager memory, ulong address, ExecutionMode mode, bool highCq, DecoderMode dMode)
{
List<Block> blocks = new List<Block>();
@ -38,7 +38,7 @@ namespace ARMeilleure.Decoders
{
block = new Block(blkAddress);
if ((singleBlock && visited.Count >= 1) || opsCount > instructionLimit || !memory.IsMapped(blkAddress))
if ((dMode != DecoderMode.MultipleBlocks && visited.Count >= 1) || opsCount > instructionLimit || !memory.IsMapped(blkAddress))
{
block.Exit = true;
block.EndAddress = blkAddress;
@ -96,6 +96,12 @@ namespace ARMeilleure.Decoders
}
}
if (dMode == DecoderMode.SingleInstruction)
{
// Only read at most one instruction
limitAddress = currBlock.Address + 1;
}
FillBlock(memory, mode, currBlock, limitAddress);
opsCount += currBlock.OpCodes.Count;
@ -115,7 +121,7 @@ namespace ARMeilleure.Decoders
currBlock.Branch = GetBlock((ulong)op.Immediate);
}
if (!IsUnconditionalBranch(lastOp) || isCall)
if (isCall || !(IsUnconditionalBranch(lastOp) || IsTrap(lastOp)))
{
currBlock.Next = GetBlock(currBlock.EndAddress);
}
@ -143,7 +149,7 @@ namespace ARMeilleure.Decoders
throw new InvalidOperationException($"Decoded a single empty exit block. Entry point = 0x{address:X}.");
}
if (!singleBlock)
if (dMode == DecoderMode.MultipleBlocks)
{
return TailCallRemover.RunPass(address, blocks);
}
@ -195,12 +201,13 @@ namespace ARMeilleure.Decoders
ulong limitAddress)
{
ulong address = block.Address;
int itBlockSize = 0;
OpCode opCode;
do
{
if (address >= limitAddress)
if (address >= limitAddress && itBlockSize == 0)
{
break;
}
@ -210,6 +217,15 @@ namespace ARMeilleure.Decoders
block.OpCodes.Add(opCode);
address += (ulong)opCode.OpCodeSizeInBytes;
if (opCode is OpCodeT16IfThen it)
{
itBlockSize = it.IfThenBlockSize;
}
else if (itBlockSize > 0)
{
itBlockSize--;
}
}
while (!(IsBranch(opCode) || IsException(opCode)));
@ -247,6 +263,11 @@ namespace ARMeilleure.Decoders
// so we must consider such operations as a branch in potential aswell.
if (opCode is IOpCode32Alu opAlu && opAlu.Rd == RegisterAlias.Aarch32Pc)
{
if (opCode is OpCodeT32)
{
return opCode.Instruction.Name != InstName.Tst && opCode.Instruction.Name != InstName.Teq &&
opCode.Instruction.Name != InstName.Cmp && opCode.Instruction.Name != InstName.Cmn;
}
return true;
}
@ -308,9 +329,13 @@ namespace ARMeilleure.Decoders
}
private static bool IsException(OpCode opCode)
{
return IsTrap(opCode) || opCode.Instruction.Name == InstName.Svc;
}
private static bool IsTrap(OpCode opCode)
{
return opCode.Instruction.Name == InstName.Brk ||
opCode.Instruction.Name == InstName.Svc ||
opCode.Instruction.Name == InstName.Trap ||
opCode.Instruction.Name == InstName.Und;
}
@ -345,7 +370,14 @@ namespace ARMeilleure.Decoders
}
else
{
return new OpCode(inst, address, opCode);
if (mode == ExecutionMode.Aarch32Thumb)
{
return new OpCodeT16(inst, address, opCode);
}
else
{
return new OpCode(inst, address, opCode);
}
}
}
}

View File

@ -0,0 +1,9 @@
namespace ARMeilleure.Decoders
{
enum DecoderMode
{
MultipleBlocks,
SingleBlock,
SingleInstruction,
}
}

View File

@ -0,0 +1,9 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32Adr
{
int Rd { get; }
int Immediate { get; }
}
}

View File

@ -1,10 +1,8 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32Alu : IOpCode32
interface IOpCode32Alu : IOpCode32, IOpCode32HasSetFlags
{
int Rd { get; }
int Rn { get; }
bool SetFlags { get; }
}
}

View File

@ -0,0 +1,9 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32AluImm : IOpCode32Alu
{
int Immediate { get; }
bool IsRotated { get; }
}
}

View File

@ -0,0 +1,10 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32AluRsImm : IOpCode32Alu
{
int Rm { get; }
int Immediate { get; }
ShiftType ShiftType { get; }
}
}

View File

@ -0,0 +1,10 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32AluRsReg : IOpCode32Alu
{
int Rm { get; }
int Rs { get; }
ShiftType ShiftType { get; }
}
}

View File

@ -0,0 +1,6 @@
namespace ARMeilleure.Decoders;
interface IOpCode32Exception
{
int Id { get; }
}

View File

@ -0,0 +1,7 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32HasSetFlags
{
bool? SetFlags { get; }
}
}

View File

@ -7,5 +7,9 @@ namespace ARMeilleure.Decoders
bool WBack { get; }
bool IsLoad { get; }
bool Index { get; }
bool Add { get; }
int Immediate { get; }
}
}

View File

@ -9,5 +9,7 @@ namespace ARMeilleure.Decoders
int PostOffset { get; }
bool IsLoad { get; }
int Offset { get; }
}
}

View File

@ -0,0 +1,7 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32MemReg : IOpCode32Mem
{
int Rm { get; }
}
}

View File

@ -0,0 +1,8 @@
namespace ARMeilleure.Decoders
{
interface IOpCode32MemRsImm : IOpCode32Mem
{
int Rm { get; }
ShiftType ShiftType { get; }
}
}

View File

@ -18,10 +18,9 @@ namespace ARMeilleure.Decoders
public OpCode(InstDescriptor inst, ulong address, int opCode)
{
Address = address;
RawOpCode = opCode;
Instruction = inst;
Address = address;
RawOpCode = opCode;
RegisterSize = RegisterSize.Int64;
}

View File

@ -13,11 +13,25 @@ namespace ARMeilleure.Decoders
Cond = (Condition)((uint)opCode >> 28);
}
public bool IsThumb()
{
return this is OpCodeT16 || this is OpCodeT32;
}
public uint GetPc()
{
// Due to backwards compatibility and legacy behavior of ARMv4 CPUs pipeline,
// the PC actually points 2 instructions ahead.
return (uint)Address + (uint)OpCodeSizeInBytes * 2;
if (IsThumb())
{
// PC is ahead by 4 in thumb mode whether or not the current instruction
// is 16 or 32 bit.
return (uint)Address + 4u;
}
else
{
return (uint)Address + 8u;
}
}
}
}

View File

@ -5,7 +5,7 @@ namespace ARMeilleure.Decoders
public int Rd { get; }
public int Rn { get; }
public bool SetFlags { get; }
public bool? SetFlags { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32Alu(inst, address, opCode);

View File

@ -2,7 +2,7 @@ using ARMeilleure.Common;
namespace ARMeilleure.Decoders
{
class OpCode32AluImm : OpCode32Alu
class OpCode32AluImm : OpCode32Alu, IOpCode32AluImm
{
public int Immediate { get; }

View File

@ -10,7 +10,7 @@
public bool NHigh { get; }
public bool MHigh { get; }
public bool R { get; }
public bool SetFlags { get; }
public bool? SetFlags { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32AluMla(inst, address, opCode);

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32AluRsImm : OpCode32Alu
class OpCode32AluRsImm : OpCode32Alu, IOpCode32AluRsImm
{
public int Rm { get; }
public int Immediate { get; }

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32AluRsReg : OpCode32Alu
class OpCode32AluRsReg : OpCode32Alu, IOpCode32AluRsReg
{
public int Rm { get; }
public int Rs { get; }

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32AluUmull : OpCode32
class OpCode32AluUmull : OpCode32, IOpCode32HasSetFlags
{
public int RdLo { get; }
public int RdHi { get; }
@ -10,7 +10,7 @@
public bool NHigh { get; }
public bool MHigh { get; }
public bool SetFlags { get; }
public bool? SetFlags { get; }
public DataOp DataOp { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32AluUmull(inst, address, opCode);

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32Exception : OpCode32
class OpCode32Exception : OpCode32, IOpCode32Exception
{
public int Id { get; }

View File

@ -1,3 +1,5 @@
using System.Numerics;
namespace ARMeilleure.Decoders
{
class OpCode32MemMult : OpCode32, IOpCode32MemMult
@ -23,14 +25,7 @@ namespace ARMeilleure.Decoders
RegisterMask = opCode & 0xffff;
int regsSize = 0;
for (int index = 0; index < 16; index++)
{
regsSize += (RegisterMask >> index) & 1;
}
regsSize *= 4;
int regsSize = BitOperations.PopCount((uint)RegisterMask) * 4;
if (!u)
{

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32MemReg : OpCode32Mem
class OpCode32MemReg : OpCode32Mem, IOpCode32MemReg
{
public int Rm { get; }

View File

@ -1,6 +1,6 @@
namespace ARMeilleure.Decoders
{
class OpCode32MemRsImm : OpCode32Mem
class OpCode32MemRsImm : OpCode32Mem, IOpCode32MemRsImm
{
public int Rm { get; }
public ShiftType ShiftType { get; }

View File

@ -0,0 +1,16 @@
namespace ARMeilleure.Decoders
{
class OpCode32Mrs : OpCode32
{
public bool R { get; }
public int Rd { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCode32Mrs(inst, address, opCode);
public OpCode32Mrs(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
R = ((opCode >> 22) & 1) != 0;
Rd = (opCode >> 12) & 0xf;
}
}
}

View File

@ -0,0 +1,24 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16AddSubImm3: OpCodeT16, IOpCode32AluImm
{
public int Rd { get; }
public int Rn { get; }
public bool? SetFlags => null;
public int Immediate { get; }
public bool IsRotated { get; }
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16AddSubImm3(inst, address, opCode);
public OpCodeT16AddSubImm3(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 0) & 0x7;
Rn = (opCode >> 3) & 0x7;
Immediate = (opCode >> 6) & 0x7;
IsRotated = false;
}
}
}

View File

@ -0,0 +1,20 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16AddSubReg : OpCodeT16, IOpCode32AluReg
{
public int Rm { get; }
public int Rd { get; }
public int Rn { get; }
public bool? SetFlags => null;
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16AddSubReg(inst, address, opCode);
public OpCodeT16AddSubReg(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 0) & 0x7;
Rn = (opCode >> 3) & 0x7;
Rm = (opCode >> 6) & 0x7;
}
}
}

View File

@ -0,0 +1,23 @@
using ARMeilleure.State;
namespace ARMeilleure.Decoders
{
class OpCodeT16AddSubSp : OpCodeT16, IOpCode32AluImm
{
public int Rd => RegisterAlias.Aarch32Sp;
public int Rn => RegisterAlias.Aarch32Sp;
public bool? SetFlags => false;
public int Immediate { get; }
public bool IsRotated => false;
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16AddSubSp(inst, address, opCode);
public OpCodeT16AddSubSp(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Immediate = ((opCode >> 0) & 0x7f) << 2;
}
}
}

View File

@ -0,0 +1,20 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16Adr : OpCodeT16, IOpCode32Adr
{
public int Rd { get; }
public bool Add => true;
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16Adr(inst, address, opCode);
public OpCodeT16Adr(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 8) & 7;
int imm = (opCode & 0xff) << 2;
Immediate = (int)(GetPc() & 0xfffffffc) + imm;
}
}
}

View File

@ -1,22 +1,24 @@
namespace ARMeilleure.Decoders
namespace ARMeilleure.Decoders
{
class OpCodeT16AluImm8 : OpCodeT16, IOpCode32Alu
class OpCodeT16AluImm8 : OpCodeT16, IOpCode32AluImm
{
private int _rdn;
public int Rd { get; }
public int Rn { get; }
public int Rd => _rdn;
public int Rn => _rdn;
public bool SetFlags => false;
public bool? SetFlags => null;
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16AluImm8(inst, address, opCode);
public bool IsRotated { get; }
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16AluImm8(inst, address, opCode);
public OpCodeT16AluImm8(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 8) & 0x7;
Rn = (opCode >> 8) & 0x7;
Immediate = (opCode >> 0) & 0xff;
_rdn = (opCode >> 8) & 0x7;
IsRotated = false;
}
}
}
}

View File

@ -0,0 +1,24 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16AluImmZero : OpCodeT16, IOpCode32AluImm
{
public int Rd { get; }
public int Rn { get; }
public bool? SetFlags => null;
public int Immediate { get; }
public bool IsRotated { get; }
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16AluImmZero(inst, address, opCode);
public OpCodeT16AluImmZero(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 0) & 0x7;
Rn = (opCode >> 3) & 0x7;
Immediate = 0;
IsRotated = false;
}
}
}

View File

@ -0,0 +1,20 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16AluRegHigh : OpCodeT16, IOpCode32AluReg
{
public int Rm { get; }
public int Rd { get; }
public int Rn { get; }
public bool? SetFlags => false;
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16AluRegHigh(inst, address, opCode);
public OpCodeT16AluRegHigh(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = ((opCode >> 0) & 0x7) | ((opCode >> 4) & 0x8);
Rn = ((opCode >> 0) & 0x7) | ((opCode >> 4) & 0x8);
Rm = (opCode >> 3) & 0xf;
}
}
}

View File

@ -0,0 +1,20 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16AluRegLow : OpCodeT16, IOpCode32AluReg
{
public int Rm { get; }
public int Rd { get; }
public int Rn { get; }
public bool? SetFlags => null;
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16AluRegLow(inst, address, opCode);
public OpCodeT16AluRegLow(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 0) & 0x7;
Rn = (opCode >> 0) & 0x7;
Rm = (opCode >> 3) & 0x7;
}
}
}

View File

@ -0,0 +1,22 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16AluUx : OpCodeT16, IOpCode32AluUx
{
public int Rm { get; }
public int Rd { get; }
public int Rn { get; }
public bool? SetFlags => false;
public int RotateBits => 0;
public bool Add => false;
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16AluUx(inst, address, opCode);
public OpCodeT16AluUx(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 0) & 0x7;
Rm = (opCode >> 3) & 0x7;
}
}
}

View File

@ -0,0 +1,15 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16BImm11 : OpCodeT16, IOpCode32BImm
{
public long Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16BImm11(inst, address, opCode);
public OpCodeT16BImm11(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
int imm = (opCode << 21) >> 20;
Immediate = GetPc() + imm;
}
}
}

View File

@ -0,0 +1,17 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16BImm8 : OpCodeT16, IOpCode32BImm
{
public long Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16BImm8(inst, address, opCode);
public OpCodeT16BImm8(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Cond = (Condition)((opCode >> 8) & 0xf);
int imm = (opCode << 24) >> 23;
Immediate = GetPc() + imm;
}
}
}

View File

@ -0,0 +1,19 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16BImmCmp : OpCodeT16, IOpCode32BImm
{
public int Rn { get; }
public long Immediate { get; }
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16BImmCmp(inst, address, opCode);
public OpCodeT16BImmCmp(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rn = (opCode >> 0) & 0x7;
int imm = ((opCode >> 2) & 0x3e) | ((opCode >> 3) & 0x40);
Immediate = (int)GetPc() + imm;
}
}
}

View File

@ -1,4 +1,4 @@
namespace ARMeilleure.Decoders
namespace ARMeilleure.Decoders
{
class OpCodeT16BReg : OpCodeT16, IOpCode32BReg
{
@ -11,4 +11,4 @@ namespace ARMeilleure.Decoders
Rm = (opCode >> 3) & 0xf;
}
}
}
}

View File

@ -0,0 +1,14 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16Exception : OpCodeT16, IOpCode32Exception
{
public int Id { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16Exception(inst, address, opCode);
public OpCodeT16Exception(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Id = opCode & 0xFF;
}
}
}

View File

@ -0,0 +1,33 @@
using System.Collections.Generic;
namespace ARMeilleure.Decoders
{
class OpCodeT16IfThen : OpCodeT16
{
public Condition[] IfThenBlockConds { get; }
public int IfThenBlockSize { get { return IfThenBlockConds.Length; } }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16IfThen(inst, address, opCode);
public OpCodeT16IfThen(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
List<Condition> conds = new();
int cond = (opCode >> 4) & 0xf;
int mask = opCode & 0xf;
conds.Add((Condition)cond);
while ((mask & 7) != 0)
{
int newLsb = (mask >> 3) & 1;
cond = (cond & 0xe) | newLsb;
mask <<= 1;
conds.Add((Condition)cond);
}
IfThenBlockConds = conds.ToArray();
}
}
}

View File

@ -0,0 +1,58 @@
using ARMeilleure.Instructions;
using System;
namespace ARMeilleure.Decoders
{
class OpCodeT16MemImm5 : OpCodeT16, IOpCode32Mem
{
public int Rt { get; }
public int Rn { get; }
public bool WBack => false;
public bool IsLoad { get; }
public bool Index => true;
public bool Add => true;
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16MemImm5(inst, address, opCode);
public OpCodeT16MemImm5(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt = (opCode >> 0) & 7;
Rn = (opCode >> 3) & 7;
switch (inst.Name)
{
case InstName.Ldr:
case InstName.Ldrb:
case InstName.Ldrh:
IsLoad = true;
break;
case InstName.Str:
case InstName.Strb:
case InstName.Strh:
IsLoad = false;
break;
}
switch (inst.Name)
{
case InstName.Str:
case InstName.Ldr:
Immediate = ((opCode >> 6) & 0x1f) << 2;
break;
case InstName.Strb:
case InstName.Ldrb:
Immediate = ((opCode >> 6) & 0x1f);
break;
case InstName.Strh:
case InstName.Ldrh:
Immediate = ((opCode >> 6) & 0x1f) << 1;
break;
default:
throw new InvalidOperationException();
}
}
}
}

View File

@ -0,0 +1,26 @@
using ARMeilleure.State;
namespace ARMeilleure.Decoders
{
class OpCodeT16MemLit : OpCodeT16, IOpCode32Mem
{
public int Rt { get; }
public int Rn => RegisterAlias.Aarch32Pc;
public bool WBack => false;
public bool IsLoad => true;
public bool Index => true;
public bool Add => true;
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16MemLit(inst, address, opCode);
public OpCodeT16MemLit(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt = (opCode >> 8) & 7;
Immediate = (opCode & 0xff) << 2;
}
}
}

View File

@ -0,0 +1,34 @@
using ARMeilleure.Instructions;
using System;
using System.Numerics;
namespace ARMeilleure.Decoders
{
class OpCodeT16MemMult : OpCodeT16, IOpCode32MemMult
{
public int Rn { get; }
public int RegisterMask { get; }
public int PostOffset { get; }
public bool IsLoad { get; }
public int Offset { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16MemMult(inst, address, opCode);
public OpCodeT16MemMult(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
RegisterMask = opCode & 0xff;
Rn = (opCode >> 8) & 7;
int regCount = BitOperations.PopCount((uint)RegisterMask);
Offset = 0;
PostOffset = 4 * regCount;
IsLoad = inst.Name switch
{
InstName.Ldm => true,
InstName.Stm => false,
_ => throw new InvalidOperationException()
};
}
}
}

View File

@ -0,0 +1,27 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16MemReg : OpCodeT16, IOpCode32MemReg
{
public int Rm { get; }
public int Rt { get; }
public int Rn { get; }
public bool WBack => false;
public bool IsLoad { get; }
public bool Index => true;
public bool Add => true;
public int Immediate => throw new System.InvalidOperationException();
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16MemReg(inst, address, opCode);
public OpCodeT16MemReg(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt = (opCode >> 0) & 7;
Rn = (opCode >> 3) & 7;
Rm = (opCode >> 6) & 7;
IsLoad = ((opCode >> 9) & 7) >= 3;
}
}
}

View File

@ -0,0 +1,28 @@
using ARMeilleure.State;
namespace ARMeilleure.Decoders
{
class OpCodeT16MemSp : OpCodeT16, IOpCode32Mem
{
public int Rt { get; }
public int Rn => RegisterAlias.Aarch32Sp;
public bool WBack => false;
public bool IsLoad { get; }
public bool Index => true;
public bool Add => true;
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16MemSp(inst, address, opCode);
public OpCodeT16MemSp(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt = (opCode >> 8) & 7;
IsLoad = ((opCode >> 11) & 1) != 0;
Immediate = ((opCode >> 0) & 0xff) << 2;
}
}
}

View File

@ -0,0 +1,42 @@
using ARMeilleure.Instructions;
using ARMeilleure.State;
using System;
using System.Numerics;
namespace ARMeilleure.Decoders
{
class OpCodeT16MemStack : OpCodeT16, IOpCode32MemMult
{
public int Rn => RegisterAlias.Aarch32Sp;
public int RegisterMask { get; }
public int PostOffset { get; }
public bool IsLoad { get; }
public int Offset { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16MemStack(inst, address, opCode);
public OpCodeT16MemStack(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
int extra = (opCode >> 8) & 1;
int regCount = BitOperations.PopCount((uint)opCode & 0x1ff);
switch (inst.Name)
{
case InstName.Push:
RegisterMask = (opCode & 0xff) | (extra << 14);
IsLoad = false;
Offset = -4 * regCount;
PostOffset = -4 * regCount;
break;
case InstName.Pop:
RegisterMask = (opCode & 0xff) | (extra << 15);
IsLoad = true;
Offset = 0;
PostOffset = 4 * regCount;
break;
default:
throw new InvalidOperationException();
}
}
}
}

View File

@ -0,0 +1,24 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16ShiftImm : OpCodeT16, IOpCode32AluRsImm
{
public int Rd { get; }
public int Rn { get; }
public int Rm { get; }
public int Immediate { get; }
public ShiftType ShiftType { get; }
public bool? SetFlags => null;
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16ShiftImm(inst, address, opCode);
public OpCodeT16ShiftImm(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 0) & 0x7;
Rm = (opCode >> 3) & 0x7;
Immediate = (opCode >> 6) & 0x1F;
ShiftType = (ShiftType)((opCode >> 11) & 3);
}
}
}

View File

@ -0,0 +1,27 @@
namespace ARMeilleure.Decoders
{
class OpCodeT16ShiftReg : OpCodeT16, IOpCode32AluRsReg
{
public int Rm { get; }
public int Rs { get; }
public int Rd { get; }
public int Rn { get; }
public ShiftType ShiftType { get; }
public bool? SetFlags => null;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16ShiftReg(inst, address, opCode);
public OpCodeT16ShiftReg(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 0) & 7;
Rm = (opCode >> 0) & 7;
Rn = (opCode >> 3) & 7;
Rs = (opCode >> 3) & 7;
ShiftType = (ShiftType)(((opCode >> 6) & 1) | ((opCode >> 7) & 2));
}
}
}

View File

@ -0,0 +1,24 @@
using ARMeilleure.State;
namespace ARMeilleure.Decoders
{
class OpCodeT16SpRel : OpCodeT16, IOpCode32AluImm
{
public int Rd { get; }
public int Rn => RegisterAlias.Aarch32Sp;
public bool? SetFlags => false;
public int Immediate { get; }
public bool IsRotated => false;
public static new OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT16SpRel(inst, address, opCode);
public OpCodeT16SpRel(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 8) & 0x7;
Immediate = ((opCode >> 0) & 0xff) << 2;
}
}
}

View File

@ -0,0 +1,14 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32 : OpCode32
{
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32(inst, address, opCode);
public OpCodeT32(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Cond = Condition.Al;
OpCodeSizeInBytes = 4;
}
}
}

View File

@ -0,0 +1,20 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32Alu : OpCodeT32, IOpCode32Alu
{
public int Rd { get; }
public int Rn { get; }
public bool? SetFlags { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32Alu(inst, address, opCode);
public OpCodeT32Alu(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 8) & 0xf;
Rn = (opCode >> 16) & 0xf;
SetFlags = ((opCode >> 20) & 1) != 0;
}
}
}

View File

@ -0,0 +1,38 @@
using ARMeilleure.Common;
using System.Runtime.Intrinsics;
namespace ARMeilleure.Decoders
{
class OpCodeT32AluImm : OpCodeT32Alu, IOpCode32AluImm
{
public int Immediate { get; }
public bool IsRotated { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluImm(inst, address, opCode);
private static readonly Vector128<int> _factor = Vector128.Create(1, 0x00010001, 0x01000100, 0x01010101);
public OpCodeT32AluImm(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
int imm8 = (opCode >> 0) & 0xff;
int imm3 = (opCode >> 12) & 7;
int imm1 = (opCode >> 26) & 1;
int imm12 = imm8 | (imm3 << 8) | (imm1 << 11);
if ((imm12 >> 10) == 0)
{
Immediate = imm8 * _factor.GetElement((imm12 >> 8) & 3);
IsRotated = false;
}
else
{
int shift = imm12 >> 7;
Immediate = BitUtils.RotateRight(0x80 | (imm12 & 0x7f), shift, 32);
IsRotated = shift != 0;
}
}
}
}

View File

@ -0,0 +1,20 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32AluRsImm : OpCodeT32Alu, IOpCode32AluRsImm
{
public int Rm { get; }
public int Immediate { get; }
public ShiftType ShiftType { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32AluRsImm(inst, address, opCode);
public OpCodeT32AluRsImm(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rm = (opCode >> 0) & 0xf;
Immediate = ((opCode >> 6) & 3) | ((opCode >> 10) & 0x1c);
ShiftType = (ShiftType)((opCode >> 4) & 3);
}
}
}

View File

@ -0,0 +1,27 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32BImm20 : OpCodeT32, IOpCode32BImm
{
public long Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32BImm20(inst, address, opCode);
public OpCodeT32BImm20(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
uint pc = GetPc();
int imm11 = (opCode >> 0) & 0x7ff;
int j2 = (opCode >> 11) & 1;
int j1 = (opCode >> 13) & 1;
int imm6 = (opCode >> 16) & 0x3f;
int s = (opCode >> 26) & 1;
int imm32 = imm11 | (imm6 << 11) | (j1 << 17) | (j2 << 18) | (s << 19);
imm32 = (imm32 << 13) >> 12;
Immediate = pc + imm32;
Cond = (Condition)((opCode >> 22) & 0xf);
}
}
}

View File

@ -0,0 +1,35 @@
using ARMeilleure.Instructions;
namespace ARMeilleure.Decoders
{
class OpCodeT32BImm24 : OpCodeT32, IOpCode32BImm
{
public long Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32BImm24(inst, address, opCode);
public OpCodeT32BImm24(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
uint pc = GetPc();
if (inst.Name == InstName.Blx)
{
pc &= ~3u;
}
int imm11 = (opCode >> 0) & 0x7ff;
int j2 = (opCode >> 11) & 1;
int j1 = (opCode >> 13) & 1;
int imm10 = (opCode >> 16) & 0x3ff;
int s = (opCode >> 26) & 1;
int i1 = j1 ^ s ^ 1;
int i2 = j2 ^ s ^ 1;
int imm32 = imm11 | (imm10 << 11) | (i2 << 21) | (i1 << 22) | (s << 23);
imm32 = (imm32 << 9) >> 8;
Immediate = pc + imm32;
}
}
}

View File

@ -0,0 +1,25 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemImm12 : OpCodeT32, IOpCode32Mem
{
public int Rt { get; }
public int Rn { get; }
public bool WBack => false;
public bool IsLoad { get; }
public bool Index => true;
public bool Add => true;
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemImm12(inst, address, opCode);
public OpCodeT32MemImm12(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
Immediate = opCode & 0xfff;
IsLoad = ((opCode >> 20) & 1) != 0;
}
}
}

View File

@ -0,0 +1,29 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemImm8 : OpCodeT32, IOpCode32Mem
{
public int Rt { get; }
public int Rn { get; }
public bool WBack { get; }
public bool IsLoad { get; }
public bool Index { get; }
public bool Add { get; }
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemImm8(inst, address, opCode);
public OpCodeT32MemImm8(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
Index = ((opCode >> 10) & 1) != 0;
Add = ((opCode >> 9) & 1) != 0;
WBack = ((opCode >> 8) & 1) != 0;
Immediate = opCode & 0xff;
IsLoad = ((opCode >> 20) & 1) != 0;
}
}
}

View File

@ -0,0 +1,31 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemImm8D : OpCodeT32, IOpCode32Mem
{
public int Rt { get; }
public int Rt2 { get; }
public int Rn { get; }
public bool WBack { get; }
public bool IsLoad { get; }
public bool Index { get; }
public bool Add { get; }
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemImm8D(inst, address, opCode);
public OpCodeT32MemImm8D(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt2 = (opCode >> 8) & 0xf;
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
Index = ((opCode >> 24) & 1) != 0;
Add = ((opCode >> 23) & 1) != 0;
WBack = ((opCode >> 21) & 1) != 0;
Immediate = opCode & 0xff;
IsLoad = ((opCode >> 20) & 1) != 0;
}
}
}

View File

@ -0,0 +1,24 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemLdEx : OpCodeT32, IOpCode32MemEx
{
public int Rd => 0;
public int Rt { get; }
public int Rn { get; }
public bool WBack => false;
public bool IsLoad => true;
public bool Index => false;
public bool Add => false;
public int Immediate => 0;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemLdEx(inst, address, opCode);
public OpCodeT32MemLdEx(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
}
}
}

View File

@ -0,0 +1,52 @@
using System.Numerics;
namespace ARMeilleure.Decoders
{
class OpCodeT32MemMult : OpCodeT32, IOpCode32MemMult
{
public int Rn { get; }
public int RegisterMask { get; }
public int Offset { get; }
public int PostOffset { get; }
public bool IsLoad { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemMult(inst, address, opCode);
public OpCodeT32MemMult(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rn = (opCode >> 16) & 0xf;
bool isLoad = (opCode & (1 << 20)) != 0;
bool w = (opCode & (1 << 21)) != 0;
bool u = (opCode & (1 << 23)) != 0;
bool p = (opCode & (1 << 24)) != 0;
RegisterMask = opCode & 0xffff;
int regsSize = BitOperations.PopCount((uint)RegisterMask) * 4;
if (!u)
{
Offset -= regsSize;
}
if (u == p)
{
Offset += 4;
}
if (w)
{
PostOffset = u ? regsSize : -regsSize;
}
else
{
PostOffset = 0;
}
IsLoad = isLoad;
}
}
}

View File

@ -0,0 +1,30 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemRsImm : OpCodeT32, IOpCode32MemRsImm
{
public int Rt { get; }
public int Rn { get; }
public int Rm { get; }
public ShiftType ShiftType => ShiftType.Lsl;
public bool WBack => false;
public bool IsLoad { get; }
public bool Index => true;
public bool Add => true;
public int Immediate { get; }
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemRsImm(inst, address, opCode);
public OpCodeT32MemRsImm(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rm = (opCode >> 0) & 0xf;
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
IsLoad = (opCode & (1 << 20)) != 0;
Immediate = (opCode >> 4) & 3;
}
}
}

View File

@ -0,0 +1,25 @@
namespace ARMeilleure.Decoders
{
class OpCodeT32MemStEx : OpCodeT32, IOpCode32MemEx
{
public int Rd { get; }
public int Rt { get; }
public int Rn { get; }
public bool WBack => false;
public bool IsLoad => false;
public bool Index => false;
public bool Add => false;
public int Immediate => 0;
public new static OpCode Create(InstDescriptor inst, ulong address, int opCode) => new OpCodeT32MemStEx(inst, address, opCode);
public OpCodeT32MemStEx(InstDescriptor inst, ulong address, int opCode) : base(inst, address, opCode)
{
Rd = (opCode >> 0) & 0xf;
Rt = (opCode >> 12) & 0xf;
Rn = (opCode >> 16) & 0xf;
}
}
}

View File

@ -1,7 +1,7 @@
using ARMeilleure.Instructions;
using ARMeilleure.State;
using System;
using System.Collections.Generic;
using System.Numerics;
namespace ARMeilleure.Decoders
{
@ -29,9 +29,9 @@ namespace ARMeilleure.Decoders
}
}
private static List<InstInfo> AllInstA32 = new List<InstInfo>();
private static List<InstInfo> AllInstT32 = new List<InstInfo>();
private static List<InstInfo> AllInstA64 = new List<InstInfo>();
private static List<InstInfo> AllInstA32 = new();
private static List<InstInfo> AllInstT32 = new();
private static List<InstInfo> AllInstA64 = new();
private static InstInfo[][] InstA32FastLookup = new InstInfo[FastLookupSize][];
private static InstInfo[][] InstT32FastLookup = new InstInfo[FastLookupSize][];
@ -628,7 +628,7 @@ namespace ARMeilleure.Decoders
SetA64("0>001110<<0xxxxx011110xxxxxxxxxx", InstName.Zip2_V, InstEmit.Zip2_V, OpCodeSimdReg.Create);
#endregion
#region "OpCode Table (AArch32)"
#region "OpCode Table (AArch32, A32)"
// Base
SetA32("<<<<0010101xxxxxxxxxxxxxxxxxxxxx", InstName.Adc, InstEmit32.Adc, OpCode32AluImm.Create);
SetA32("<<<<0000101xxxxxxxxxxxxxxxx0xxxx", InstName.Adc, InstEmit32.Adc, OpCode32AluRsImm.Create);
@ -649,11 +649,11 @@ namespace ARMeilleure.Decoders
SetA32("1111101xxxxxxxxxxxxxxxxxxxxxxxxx", InstName.Blx, InstEmit32.Blx, OpCode32BImm.Create);
SetA32("<<<<000100101111111111110011xxxx", InstName.Blx, InstEmit32.Blxr, OpCode32BReg.Create);
SetA32("<<<<000100101111111111110001xxxx", InstName.Bx, InstEmit32.Bx, OpCode32BReg.Create);
SetT32("xxxxxxxxxxxxxxxx010001110xxxx000", InstName.Bx, InstEmit32.Bx, OpCodeT16BReg.Create);
SetA32("11110101011111111111000000011111", InstName.Clrex, InstEmit32.Clrex, OpCode32.Create);
SetA32("<<<<000101101111xxxx11110001xxxx", InstName.Clz, InstEmit32.Clz, OpCode32AluReg.Create);
SetA32("<<<<00110111xxxx0000xxxxxxxxxxxx", InstName.Cmn, InstEmit32.Cmn, OpCode32AluImm.Create);
SetA32("<<<<00010111xxxx0000xxxxxxx0xxxx", InstName.Cmn, InstEmit32.Cmn, OpCode32AluRsImm.Create);
SetA32("<<<<00010111xxxx0000xxxx0xx1xxxx", InstName.Cmn, InstEmit32.Cmn, OpCode32AluRsReg.Create);
SetA32("<<<<00110101xxxx0000xxxxxxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCode32AluImm.Create);
SetA32("<<<<00010101xxxx0000xxxxxxx0xxxx", InstName.Cmp, InstEmit32.Cmp, OpCode32AluRsImm.Create);
SetA32("<<<<00010101xxxx0000xxxx0xx1xxxx", InstName.Cmp, InstEmit32.Cmp, OpCode32AluRsReg.Create);
@ -701,10 +701,10 @@ namespace ARMeilleure.Decoders
SetA32("<<<<0001101x0000xxxxxxxxxxx0xxxx", InstName.Mov, InstEmit32.Mov, OpCode32AluRsImm.Create);
SetA32("<<<<0001101x0000xxxxxxxx0xx1xxxx", InstName.Mov, InstEmit32.Mov, OpCode32AluRsReg.Create);
SetA32("<<<<00110000xxxxxxxxxxxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCode32AluImm16.Create);
SetT32("xxxxxxxxxxxxxxxx00100xxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16AluImm8.Create);
SetA32("<<<<00110100xxxxxxxxxxxxxxxxxxxx", InstName.Movt, InstEmit32.Movt, OpCode32AluImm16.Create);
SetA32("<<<<1110xxx1xxxxxxxx111xxxx1xxxx", InstName.Mrc, InstEmit32.Mrc, OpCode32System.Create);
SetA32("<<<<11000101xxxxxxxx111xxxxxxxxx", InstName.Mrrc, InstEmit32.Mrrc, OpCode32System.Create);
SetA32("<<<<00010x001111xxxx000000000000", InstName.Mrs, InstEmit32.Mrs, OpCode32Mrs.Create);
SetA32("<<<<00010x10xxxx111100000000xxxx", InstName.Msr, InstEmit32.Msr, OpCode32MsrReg.Create);
SetA32("<<<<0000000xxxxx0000xxxx1001xxxx", InstName.Mul, InstEmit32.Mul, OpCode32AluMla.Create);
SetA32("<<<<0011111x0000xxxxxxxxxxxxxxxx", InstName.Mvn, InstEmit32.Mvn, OpCode32AluImm.Create);
@ -732,6 +732,8 @@ namespace ARMeilleure.Decoders
SetA32("<<<<0000110xxxxxxxxxxxxx0xx1xxxx", InstName.Sbc, InstEmit32.Sbc, OpCode32AluRsReg.Create);
SetA32("<<<<0111101xxxxxxxxxxxxxx101xxxx", InstName.Sbfx, InstEmit32.Sbfx, OpCode32AluBf.Create);
SetA32("<<<<01110001xxxx1111xxxx0001xxxx", InstName.Sdiv, InstEmit32.Sdiv, OpCode32AluMla.Create);
SetA32("<<<<01100011xxxxxxxx11111001xxxx", InstName.Shadd8, InstEmit32.Shadd8, OpCode32AluReg.Create);
SetA32("<<<<01100011xxxxxxxx11111111xxxx", InstName.Shsub8, InstEmit32.Shsub8, OpCode32AluReg.Create);
SetA32("<<<<00010000xxxxxxxxxxxx1xx0xxxx", InstName.Smla__, InstEmit32.Smla__, OpCode32AluMla.Create);
SetA32("<<<<0000111xxxxxxxxxxxxx1001xxxx", InstName.Smlal, InstEmit32.Smlal, OpCode32AluUmull.Create);
SetA32("<<<<00010100xxxxxxxxxxxx1xx0xxxx", InstName.Smlal__, InstEmit32.Smlal__, OpCode32AluUmull.Create);
@ -780,6 +782,7 @@ namespace ARMeilleure.Decoders
SetA32("<<<<0111111xxxxxxxxxxxxxx101xxxx", InstName.Ubfx, InstEmit32.Ubfx, OpCode32AluBf.Create);
SetA32("<<<<01110011xxxx1111xxxx0001xxxx", InstName.Udiv, InstEmit32.Udiv, OpCode32AluMla.Create);
SetA32("<<<<01100111xxxxxxxx11111001xxxx", InstName.Uhadd8, InstEmit32.Uhadd8, OpCode32AluReg.Create);
SetA32("<<<<01100111xxxxxxxx11111111xxxx", InstName.Uhsub8, InstEmit32.Uhsub8, OpCode32AluReg.Create);
SetA32("<<<<00000100xxxxxxxxxxxx1001xxxx", InstName.Umaal, InstEmit32.Umaal, OpCode32AluUmull.Create);
SetA32("<<<<0000101xxxxxxxxxxxxx1001xxxx", InstName.Umlal, InstEmit32.Umlal, OpCode32AluUmull.Create);
SetA32("<<<<0000100xxxxxxxxxxxxx1001xxxx", InstName.Umull, InstEmit32.Umull, OpCode32AluUmull.Create);
@ -790,193 +793,345 @@ namespace ARMeilleure.Decoders
SetA32("<<<<01101111xxxxxxxxxx000111xxxx", InstName.Uxth, InstEmit32.Uxth, OpCode32AluUx.Create);
// FP & SIMD
SetA32("111100111x110000xxx0001101x0xxx0", InstName.Aesd_V, InstEmit32.Aesd_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001100x0xxx0", InstName.Aese_V, InstEmit32.Aese_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001111x0xxx0", InstName.Aesimc_V, InstEmit32.Aesimc_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001110x0xxx0", InstName.Aesmc_V, InstEmit32.Aesmc_V, OpCode32Simd.Create);
SetA32("1111001x0x<<xxxxxxxx0111xxx0xxxx", InstName.Vabd, InstEmit32.Vabd_I, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0111x0x0xxxx", InstName.Vabdl, InstEmit32.Vabdl_I, OpCode32SimdRegLong.Create);
SetA32("<<<<11101x110000xxxx101x11x0xxxx", InstName.Vabs, InstEmit32.Vabs_S, OpCode32SimdS.Create);
SetA32("111100111x11<<01xxxx00110xx0xxxx", InstName.Vabs, InstEmit32.Vabs_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111001xxxx01110xx0xxxx", InstName.Vabs, InstEmit32.Vabs_V, OpCode32SimdCmpZ.Create);
SetA32("111100100xxxxxxxxxxx1000xxx0xxxx", InstName.Vadd, InstEmit32.Vadd_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x11xxxxxxxx101xx0x0xxxx", InstName.Vadd, InstEmit32.Vadd_S, OpCode32SimdRegS.Create);
SetA32("111100100x00xxxxxxxx1101xxx0xxxx", InstName.Vadd, InstEmit32.Vadd_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0000x0x0xxxx", InstName.Vaddl, InstEmit32.Vaddl_I, OpCode32SimdRegLong.Create);
SetA32("1111001x1x<<xxxxxxxx0001x0x0xxxx", InstName.Vaddw, InstEmit32.Vaddw_I, OpCode32SimdRegWide.Create);
SetA32("111100100x00xxxxxxxx0001xxx1xxxx", InstName.Vand, InstEmit32.Vand_I, OpCode32SimdBinary.Create);
SetA32("111100100x01xxxxxxxx0001xxx1xxxx", InstName.Vbic, InstEmit32.Vbic_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx<<x10x11xxxx", InstName.Vbic, InstEmit32.Vbic_II, OpCode32SimdImm.Create);
SetA32("111100110x11xxxxxxxx0001xxx1xxxx", InstName.Vbif, InstEmit32.Vbif, OpCode32SimdBinary.Create);
SetA32("111100110x10xxxxxxxx0001xxx1xxxx", InstName.Vbit, InstEmit32.Vbit, OpCode32SimdBinary.Create);
SetA32("111100110x01xxxxxxxx0001xxx1xxxx", InstName.Vbsl, InstEmit32.Vbsl, OpCode32SimdBinary.Create);
SetA32("111100110x<<xxxxxxxx1000xxx1xxxx", InstName.Vceq, InstEmit32.Vceq_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1110xxx0xxxx", InstName.Vceq, InstEmit32.Vceq_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x010xx0xxxx", InstName.Vceq, InstEmit32.Vceq_Z, OpCode32SimdCmpZ.Create);
SetA32("1111001x0x<<xxxxxxxx0011xxx1xxxx", InstName.Vcge, InstEmit32.Vcge_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1110xxx0xxxx", InstName.Vcge, InstEmit32.Vcge_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x001xx0xxxx", InstName.Vcge, InstEmit32.Vcge_Z, OpCode32SimdCmpZ.Create);
SetA32("1111001x0x<<xxxxxxxx0011xxx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_I, OpCode32SimdReg.Create);
SetA32("111100110x10xxxxxxxx1110xxx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x000xx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_Z, OpCode32SimdCmpZ.Create);
SetA32("111100111x11xx01xxxx0x011xx0xxxx", InstName.Vcle, InstEmit32.Vcle_Z, OpCode32SimdCmpZ.Create);
SetA32("111100111x11xx01xxxx0x100xx0xxxx", InstName.Vclt, InstEmit32.Vclt_Z, OpCode32SimdCmpZ.Create);
SetA32("<<<<11101x11010xxxxx101x01x0xxxx", InstName.Vcmp, InstEmit32.Vcmp, OpCode32SimdS.Create);
SetA32("<<<<11101x11010xxxxx101x11x0xxxx", InstName.Vcmpe, InstEmit32.Vcmpe, OpCode32SimdS.Create);
SetA32("111100111x110000xxxx01010xx0xxxx", InstName.Vcnt, InstEmit32.Vcnt, OpCode32SimdCmpZ.Create);
SetA32("<<<<11101x110111xxxx101x11x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FD, OpCode32SimdS.Create); // FP 32 and 64, scalar.
SetA32("<<<<11101x11110xxxxx101x11x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FI, OpCode32SimdCvtFI.Create); // FP32 to int.
SetA32("<<<<11101x111000xxxx101xx1x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FI, OpCode32SimdCvtFI.Create); // Int to FP32.
SetA32("111111101x1111xxxxxx101xx1x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_RM, OpCode32SimdCvtFI.Create); // The many FP32 to int encodings (fp).
SetA32("111100111x111011xxxx011xxxx0xxxx", InstName.Vcvt, InstEmit32.Vcvt_V, OpCode32SimdCmpZ.Create); // FP and integer, vector.
SetA32("<<<<11101x00xxxxxxxx101xx0x0xxxx", InstName.Vdiv, InstEmit32.Vdiv_S, OpCode32SimdRegS.Create);
SetA32("<<<<11101xx0xxxxxxxx1011x0x10000", InstName.Vdup, InstEmit32.Vdup, OpCode32SimdDupGP.Create);
SetA32("111100111x11xxxxxxxx11000xx0xxxx", InstName.Vdup, InstEmit32.Vdup_1, OpCode32SimdDupElem.Create);
SetA32("111100110x00xxxxxxxx0001xxx1xxxx", InstName.Veor, InstEmit32.Veor_I, OpCode32SimdBinary.Create);
SetA32("111100101x11xxxxxxxxxxxxxxx0xxxx", InstName.Vext, InstEmit32.Vext, OpCode32SimdExt.Create);
SetA32("<<<<11101x10xxxxxxxx101xx0x0xxxx", InstName.Vfma, InstEmit32.Vfma_S, OpCode32SimdRegS.Create);
SetA32("111100100x00xxxxxxxx1100xxx1xxxx", InstName.Vfma, InstEmit32.Vfma_V, OpCode32SimdReg.Create);
SetA32("<<<<11101x10xxxxxxxx101xx1x0xxxx", InstName.Vfms, InstEmit32.Vfms_S, OpCode32SimdRegS.Create);
SetA32("111100100x10xxxxxxxx1100xxx1xxxx", InstName.Vfms, InstEmit32.Vfms_V, OpCode32SimdReg.Create);
SetA32("<<<<11101x01xxxxxxxx101xx1x0xxxx", InstName.Vfnma, InstEmit32.Vfnma_S, OpCode32SimdRegS.Create);
SetA32("<<<<11101x01xxxxxxxx101xx0x0xxxx", InstName.Vfnms, InstEmit32.Vfnms_S, OpCode32SimdRegS.Create);
SetA32("1111001x0x<<xxxxxxxx0000xxx0xxxx", InstName.Vhadd, InstEmit32.Vhadd, OpCode32SimdReg.Create);
SetA32("111101001x10xxxxxxxxxx00xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx0111xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 1.
SetA32("111101000x10xxxxxxxx1010xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 2.
SetA32("111101000x10xxxxxxxx0110xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 3.
SetA32("111101000x10xxxxxxxx0010xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 4.
SetA32("111101001x10xxxxxxxxxx01xxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx100xxxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemPair.Create); // Regs = 1, inc = 1/2 (itype).
SetA32("111101000x10xxxxxxxx0011xxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemPair.Create); // Regs = 2, inc = 2.
SetA32("111101001x10xxxxxxxxxx10xxxxxxxx", InstName.Vld3, InstEmit32.Vld3, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx010xxxxxxxxx", InstName.Vld3, InstEmit32.Vld3, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("111101001x10xxxxxxxxxx11xxxxxxxx", InstName.Vld4, InstEmit32.Vld4, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx000xxxxxxxxx", InstName.Vld4, InstEmit32.Vld4, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("<<<<11001x01xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x11xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x11xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x01xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x11xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x11xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<1101xx01xxxxxxxx101xxxxxxxxx", InstName.Vldr, InstEmit32.Vldr, OpCode32SimdMemImm.Create);
SetA32("1111001x0x<<xxxxxxxx0110xxx0xxxx", InstName.Vmax, InstEmit32.Vmax_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1111xxx0xxxx", InstName.Vmax, InstEmit32.Vmax_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx0110xxx1xxxx", InstName.Vmin, InstEmit32.Vmin_I, OpCode32SimdReg.Create);
SetA32("111100100x10xxxxxxxx1111xxx0xxxx", InstName.Vmin, InstEmit32.Vmin_V, OpCode32SimdReg.Create);
SetA32("111111101x00xxxxxxxx10>>x0x0xxxx", InstName.Vmaxnm, InstEmit32.Vmaxnm_S, OpCode32SimdRegS.Create);
SetA32("111100110x0xxxxxxxxx1111xxx1xxxx", InstName.Vmaxnm, InstEmit32.Vmaxnm_V, OpCode32SimdReg.Create);
SetA32("111111101x00xxxxxxxx10>>x1x0xxxx", InstName.Vminnm, InstEmit32.Vminnm_S, OpCode32SimdRegS.Create);
SetA32("111100110x1xxxxxxxxx1111xxx1xxxx", InstName.Vminnm, InstEmit32.Vminnm_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx000xx1x0xxxx", InstName.Vmla, InstEmit32.Vmla_1, OpCode32SimdRegElem.Create);
SetA32("111100100xxxxxxxxxxx1001xxx0xxxx", InstName.Vmla, InstEmit32.Vmla_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x00xxxxxxxx101xx0x0xxxx", InstName.Vmla, InstEmit32.Vmla_S, OpCode32SimdRegS.Create);
SetA32("111100100x00xxxxxxxx1101xxx1xxxx", InstName.Vmla, InstEmit32.Vmla_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx010xx1x0xxxx", InstName.Vmls, InstEmit32.Vmls_1, OpCode32SimdRegElem.Create);
SetA32("<<<<11100x00xxxxxxxx101xx1x0xxxx", InstName.Vmls, InstEmit32.Vmls_S, OpCode32SimdRegS.Create);
SetA32("111100100x10xxxxxxxx1101xxx1xxxx", InstName.Vmls, InstEmit32.Vmls_V, OpCode32SimdReg.Create);
SetA32("111100110xxxxxxxxxxx1001xxx0xxxx", InstName.Vmls, InstEmit32.Vmls_I, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx01010x0x0xxxx", InstName.Vmlsl, InstEmit32.Vmlsl_I, OpCode32SimdRegLong.Create);
SetA32("<<<<11100xx0xxxxxxxx1011xxx10000", InstName.Vmov, InstEmit32.Vmov_G1, OpCode32SimdMovGpElem.Create); // From gen purpose.
SetA32("<<<<1110xxx1xxxxxxxx1011xxx10000", InstName.Vmov, InstEmit32.Vmov_G1, OpCode32SimdMovGpElem.Create); // To gen purpose.
SetA32("<<<<1100010xxxxxxxxx101000x1xxxx", InstName.Vmov, InstEmit32.Vmov_G2, OpCode32SimdMovGpDouble.Create); // To/from gen purpose x2 and single precision x2.
SetA32("<<<<1100010xxxxxxxxx101100x1xxxx", InstName.Vmov, InstEmit32.Vmov_GD, OpCode32SimdMovGpDouble.Create); // To/from gen purpose x2 and double precision.
SetA32("<<<<1110000xxxxxxxxx1010x0010000", InstName.Vmov, InstEmit32.Vmov_GS, OpCode32SimdMovGp.Create); // To/from gen purpose and single precision.
SetA32("1111001x1x000xxxxxxx0xx00x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q vector I32.
SetA32("<<<<11101x11xxxxxxxx101x0000xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm44.Create); // Scalar f16/32/64 based on size 01 10 11.
SetA32("1111001x1x000xxxxxxx10x00x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q I16.
SetA32("1111001x1x000xxxxxxx11xx0x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q (dt - from cmode).
SetA32("1111001x1x000xxxxxxx11100x11xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q I64.
SetA32("<<<<11101x110000xxxx101x01x0xxxx", InstName.Vmov, InstEmit32.Vmov_S, OpCode32SimdS.Create);
SetA32("1111001x1x001000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("1111001x1x010000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("1111001x1x100000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("111100111x11xx10xxxx001000x0xxx0", InstName.Vmovn, InstEmit32.Vmovn, OpCode32SimdCmpZ.Create);
SetA32("<<<<11101111xxxxxxxx101000010000", InstName.Vmrs, InstEmit32.Vmrs, OpCode32SimdSpecial.Create);
SetA32("<<<<11101110xxxxxxxx101000010000", InstName.Vmsr, InstEmit32.Vmsr, OpCode32SimdSpecial.Create);
SetA32("1111001x1x<<xxxxxxxx100xx1x0xxxx", InstName.Vmul, InstEmit32.Vmul_1, OpCode32SimdRegElem.Create);
SetA32("111100100x<<xxxxxxxx1001xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1001xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x10xxxxxxxx101xx0x0xxxx", InstName.Vmul, InstEmit32.Vmul_S, OpCode32SimdRegS.Create);
SetA32("111100110x00xxxxxxxx1101xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx01010x1x0xxxx", InstName.Vmull, InstEmit32.Vmull_1, OpCode32SimdRegElemLong.Create);
SetA32("1111001x1x<<xxxxxxx01100x0x0xxxx", InstName.Vmull, InstEmit32.Vmull_I, OpCode32SimdRegLong.Create);
SetA32("111100101xx0xxxxxxx01110x0x0xxxx", InstName.Vmull, InstEmit32.Vmull_I, OpCode32SimdRegLong.Create); // P8/P64
SetA32("111100111x110000xxxx01011xx0xxxx", InstName.Vmvn, InstEmit32.Vmvn_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx0xx00x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create); // D/Q vector I32.
SetA32("1111001x1x000xxxxxxx10x00x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create);
SetA32("1111001x1x000xxxxxxx110x0x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create);
SetA32("<<<<11101x110001xxxx101x01x0xxxx", InstName.Vneg, InstEmit32.Vneg_S, OpCode32SimdS.Create);
SetA32("111100111x11<<01xxxx00111xx0xxxx", InstName.Vneg, InstEmit32.Vneg_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111001xxxx01111xx0xxxx", InstName.Vneg, InstEmit32.Vneg_V, OpCode32SimdCmpZ.Create);
SetA32("<<<<11100x01xxxxxxxx101xx1x0xxxx", InstName.Vnmla, InstEmit32.Vnmla_S, OpCode32SimdRegS.Create);
SetA32("<<<<11100x01xxxxxxxx101xx0x0xxxx", InstName.Vnmls, InstEmit32.Vnmls_S, OpCode32SimdRegS.Create);
SetA32("<<<<11100x10xxxxxxxx101xx1x0xxxx", InstName.Vnmul, InstEmit32.Vnmul_S, OpCode32SimdRegS.Create);
SetA32("111100100x11xxxxxxxx0001xxx1xxxx", InstName.Vorn, InstEmit32.Vorn_I, OpCode32SimdBinary.Create);
SetA32("111100100x10xxxxxxxx0001xxx1xxxx", InstName.Vorr, InstEmit32.Vorr_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx<<x10x01xxxx", InstName.Vorr, InstEmit32.Vorr_II, OpCode32SimdImm.Create);
SetA32("111100100x<<xxxxxxxx1011x0x1xxxx", InstName.Vpadd, InstEmit32.Vpadd_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1101x0x0xxxx", InstName.Vpadd, InstEmit32.Vpadd_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx1010x0x0xxxx", InstName.Vpmax, InstEmit32.Vpmax_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1111x0x0xxxx", InstName.Vpmax, InstEmit32.Vpmax_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx1010x0x1xxxx", InstName.Vpmin, InstEmit32.Vpmin_I, OpCode32SimdReg.Create);
SetA32("111100110x10xxxxxxxx1111x0x0xxxx", InstName.Vpmin, InstEmit32.Vpmin_V, OpCode32SimdReg.Create);
SetA32("1111001x1x>>>xxxxxxx100101x1xxx0", InstName.Vqrshrn, InstEmit32.Vqrshrn, OpCode32SimdShImmNarrow.Create);
SetA32("111100111x>>>xxxxxxx100001x1xxx0", InstName.Vqrshrun, InstEmit32.Vqrshrun, OpCode32SimdShImmNarrow.Create);
SetA32("1111001x1x>>>xxxxxxx100100x1xxx0", InstName.Vqshrn, InstEmit32.Vqshrn, OpCode32SimdShImmNarrow.Create);
SetA32("111100111x111011xxxx010x0xx0xxxx", InstName.Vrecpe, InstEmit32.Vrecpe, OpCode32SimdSqrte.Create);
SetA32("111100100x00xxxxxxxx1111xxx1xxxx", InstName.Vrecps, InstEmit32.Vrecps, OpCode32SimdReg.Create);
SetA32("111100111x11xx00xxxx000<<xx0xxxx", InstName.Vrev, InstEmit32.Vrev, OpCode32SimdRev.Create);
SetA32("111111101x1110xxxxxx101x01x0xxxx", InstName.Vrint, InstEmit32.Vrint_RM, OpCode32SimdS.Create);
SetA32("<<<<11101x110110xxxx101x11x0xxxx", InstName.Vrint, InstEmit32.Vrint_Z, OpCode32SimdS.Create);
SetA32("<<<<11101x110111xxxx101x01x0xxxx", InstName.Vrintx, InstEmit32.Vrintx_S, OpCode32SimdS.Create);
SetA32("1111001x1x>>>xxxxxxx0010>xx1xxxx", InstName.Vrshr, InstEmit32.Vrshr, OpCode32SimdShImm.Create);
SetA32("111100111x111011xxxx010x1xx0xxxx", InstName.Vrsqrte, InstEmit32.Vrsqrte, OpCode32SimdSqrte.Create);
SetA32("111100100x10xxxxxxxx1111xxx1xxxx", InstName.Vrsqrts, InstEmit32.Vrsqrts, OpCode32SimdReg.Create);
SetA32("111111100xxxxxxxxxxx101xx0x0xxxx", InstName.Vsel, InstEmit32.Vsel, OpCode32SimdSel.Create);
SetA32("111100101x>>>xxxxxxx0101>xx1xxxx", InstName.Vshl, InstEmit32.Vshl, OpCode32SimdShImm.Create);
SetA32("1111001x0xxxxxxxxxxx0100xxx0xxxx", InstName.Vshl, InstEmit32.Vshl_I, OpCode32SimdReg.Create);
SetA32("1111001x1x>>>xxxxxxx101000x1xxxx", InstName.Vshll, InstEmit32.Vshll, OpCode32SimdShImmLong.Create); // A1 encoding.
SetA32("1111001x1x>>>xxxxxxx0000>xx1xxxx", InstName.Vshr, InstEmit32.Vshr, OpCode32SimdShImm.Create);
SetA32("111100101x>>>xxxxxxx100000x1xxx0", InstName.Vshrn, InstEmit32.Vshrn, OpCode32SimdShImmNarrow.Create);
SetA32("<<<<11101x110001xxxx101x11x0xxxx", InstName.Vsqrt, InstEmit32.Vsqrt_S, OpCode32SimdS.Create);
SetA32("1111001x1x>>>xxxxxxx0001>xx1xxxx", InstName.Vsra, InstEmit32.Vsra, OpCode32SimdShImm.Create);
SetA32("111101001x00xxxxxxxx<<00xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx0111xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 1.
SetA32("111101000x00xxxxxxxx1010xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 2.
SetA32("111101000x00xxxxxxxx0110xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 3.
SetA32("111101000x00xxxxxxxx0010xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 4.
SetA32("111101001x00xxxxxxxx<<01xxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx100xxxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemPair.Create); // Regs = 1, inc = 1/2 (itype).
SetA32("111101000x00xxxxxxxx0011xxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemPair.Create); // Regs = 2, inc = 2.
SetA32("111101001x00xxxxxxxx<<10xxxxxxxx", InstName.Vst3, InstEmit32.Vst3, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx010xxxxxxxxx", InstName.Vst3, InstEmit32.Vst3, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("111101001x00xxxxxxxx<<11xxxxxxxx", InstName.Vst4, InstEmit32.Vst4, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx000xxxxxxxxx", InstName.Vst4, InstEmit32.Vst4, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("<<<<11001x00xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x10xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x10xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x00xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x10xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x10xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<1101xx00xxxxxxxx101xxxxxxxxx", InstName.Vstr, InstEmit32.Vstr, OpCode32SimdMemImm.Create);
SetA32("111100110xxxxxxxxxxx1000xxx0xxxx", InstName.Vsub, InstEmit32.Vsub_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x11xxxxxxxx101xx1x0xxxx", InstName.Vsub, InstEmit32.Vsub_S, OpCode32SimdRegS.Create);
SetA32("111100100x10xxxxxxxx1101xxx0xxxx", InstName.Vsub, InstEmit32.Vsub_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0011x0x0xxxx", InstName.Vsubw, InstEmit32.Vsubw_I, OpCode32SimdRegWide.Create);
SetA32("111100111x11xxxxxxxx10xxxxx0xxxx", InstName.Vtbl, InstEmit32.Vtbl, OpCode32SimdTbl.Create);
SetA32("111100111x11<<10xxxx00001xx0xxxx", InstName.Vtrn, InstEmit32.Vtrn, OpCode32SimdCmpZ.Create);
SetA32("111100100x<<xxxxxxxx1000xxx1xxxx", InstName.Vtst, InstEmit32.Vtst, OpCode32SimdReg.Create);
SetA32("111100111x11<<10xxxx00010xx0xxxx", InstName.Vuzp, InstEmit32.Vuzp, OpCode32SimdCmpZ.Create);
SetA32("111100111x11<<10xxxx00011xx0xxxx", InstName.Vzip, InstEmit32.Vzip, OpCode32SimdCmpZ.Create);
SetA32("111100111x110000xxx0001101x0xxx0", InstName.Aesd_V, InstEmit32.Aesd_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001100x0xxx0", InstName.Aese_V, InstEmit32.Aese_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001111x0xxx0", InstName.Aesimc_V, InstEmit32.Aesimc_V, OpCode32Simd.Create);
SetA32("111100111x110000xxx0001110x0xxx0", InstName.Aesmc_V, InstEmit32.Aesmc_V, OpCode32Simd.Create);
SetA32("111100110x00xxx0xxx01100x1x0xxx0", InstName.Sha256h_V, InstEmit32.Sha256h_V, OpCode32SimdReg.Create);
SetA32("111100110x01xxx0xxx01100x1x0xxx0", InstName.Sha256h2_V, InstEmit32.Sha256h2_V, OpCode32SimdReg.Create);
SetA32("111100111x111010xxx0001111x0xxx0", InstName.Sha256su0_V, InstEmit32.Sha256su0_V, OpCode32Simd.Create);
SetA32("111100110x10xxx0xxx01100x1x0xxx0", InstName.Sha256su1_V, InstEmit32.Sha256su1_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx0111xxx0xxxx", InstName.Vabd, InstEmit32.Vabd_I, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0111x0x0xxxx", InstName.Vabdl, InstEmit32.Vabdl_I, OpCode32SimdRegLong.Create);
SetA32("<<<<11101x110000xxxx101x11x0xxxx", InstName.Vabs, InstEmit32.Vabs_S, OpCode32SimdS.Create);
SetA32("111100111x11<<01xxxx00110xx0xxxx", InstName.Vabs, InstEmit32.Vabs_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111001xxxx01110xx0xxxx", InstName.Vabs, InstEmit32.Vabs_V, OpCode32SimdCmpZ.Create);
SetA32("111100100xxxxxxxxxxx1000xxx0xxxx", InstName.Vadd, InstEmit32.Vadd_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x11xxxxxxxx101xx0x0xxxx", InstName.Vadd, InstEmit32.Vadd_S, OpCode32SimdRegS.Create);
SetA32("111100100x00xxxxxxxx1101xxx0xxxx", InstName.Vadd, InstEmit32.Vadd_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0000x0x0xxxx", InstName.Vaddl, InstEmit32.Vaddl_I, OpCode32SimdRegLong.Create);
SetA32("1111001x1x<<xxxxxxxx0001x0x0xxxx", InstName.Vaddw, InstEmit32.Vaddw_I, OpCode32SimdRegWide.Create);
SetA32("111100100x00xxxxxxxx0001xxx1xxxx", InstName.Vand, InstEmit32.Vand_I, OpCode32SimdBinary.Create);
SetA32("111100100x01xxxxxxxx0001xxx1xxxx", InstName.Vbic, InstEmit32.Vbic_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx<<x10x11xxxx", InstName.Vbic, InstEmit32.Vbic_II, OpCode32SimdImm.Create);
SetA32("111100110x11xxxxxxxx0001xxx1xxxx", InstName.Vbif, InstEmit32.Vbif, OpCode32SimdBinary.Create);
SetA32("111100110x10xxxxxxxx0001xxx1xxxx", InstName.Vbit, InstEmit32.Vbit, OpCode32SimdBinary.Create);
SetA32("111100110x01xxxxxxxx0001xxx1xxxx", InstName.Vbsl, InstEmit32.Vbsl, OpCode32SimdBinary.Create);
SetA32("111100110x<<xxxxxxxx1000xxx1xxxx", InstName.Vceq, InstEmit32.Vceq_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1110xxx0xxxx", InstName.Vceq, InstEmit32.Vceq_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x010xx0xxxx", InstName.Vceq, InstEmit32.Vceq_Z, OpCode32SimdCmpZ.Create);
SetA32("1111001x0x<<xxxxxxxx0011xxx1xxxx", InstName.Vcge, InstEmit32.Vcge_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1110xxx0xxxx", InstName.Vcge, InstEmit32.Vcge_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x001xx0xxxx", InstName.Vcge, InstEmit32.Vcge_Z, OpCode32SimdCmpZ.Create);
SetA32("1111001x0x<<xxxxxxxx0011xxx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_I, OpCode32SimdReg.Create);
SetA32("111100110x10xxxxxxxx1110xxx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_V, OpCode32SimdReg.Create);
SetA32("111100111x11xx01xxxx0x000xx0xxxx", InstName.Vcgt, InstEmit32.Vcgt_Z, OpCode32SimdCmpZ.Create);
SetA32("111100111x11xx01xxxx0x011xx0xxxx", InstName.Vcle, InstEmit32.Vcle_Z, OpCode32SimdCmpZ.Create);
SetA32("111100111x11xx01xxxx0x100xx0xxxx", InstName.Vclt, InstEmit32.Vclt_Z, OpCode32SimdCmpZ.Create);
SetA32("<<<<11101x11010xxxxx101x01x0xxxx", InstName.Vcmp, InstEmit32.Vcmp, OpCode32SimdS.Create);
SetA32("<<<<11101x11010xxxxx101x11x0xxxx", InstName.Vcmpe, InstEmit32.Vcmpe, OpCode32SimdS.Create);
SetA32("111100111x110000xxxx01010xx0xxxx", InstName.Vcnt, InstEmit32.Vcnt, OpCode32SimdCmpZ.Create);
SetA32("<<<<11101x110111xxxx101x11x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FD, OpCode32SimdS.Create); // FP 32 and 64, scalar.
SetA32("<<<<11101x11110xxxxx101x11x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FI, OpCode32SimdCvtFI.Create); // FP32 to int.
SetA32("<<<<11101x111000xxxx101xx1x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_FI, OpCode32SimdCvtFI.Create); // Int to FP32.
SetA32("111111101x1111xxxxxx101xx1x0xxxx", InstName.Vcvt, InstEmit32.Vcvt_RM, OpCode32SimdCvtFI.Create); // The many FP32 to int encodings (fp).
SetA32("111100111x111011xxxx011xxxx0xxxx", InstName.Vcvt, InstEmit32.Vcvt_V, OpCode32SimdCmpZ.Create); // FP and integer, vector.
SetA32("<<<<11101x00xxxxxxxx101xx0x0xxxx", InstName.Vdiv, InstEmit32.Vdiv_S, OpCode32SimdRegS.Create);
SetA32("<<<<11101xx0xxxxxxxx1011x0x10000", InstName.Vdup, InstEmit32.Vdup, OpCode32SimdDupGP.Create);
SetA32("111100111x11xxxxxxxx11000xx0xxxx", InstName.Vdup, InstEmit32.Vdup_1, OpCode32SimdDupElem.Create);
SetA32("111100110x00xxxxxxxx0001xxx1xxxx", InstName.Veor, InstEmit32.Veor_I, OpCode32SimdBinary.Create);
SetA32("111100101x11xxxxxxxxxxxxxxx0xxxx", InstName.Vext, InstEmit32.Vext, OpCode32SimdExt.Create);
SetA32("<<<<11101x10xxxxxxxx101xx0x0xxxx", InstName.Vfma, InstEmit32.Vfma_S, OpCode32SimdRegS.Create);
SetA32("111100100x00xxxxxxxx1100xxx1xxxx", InstName.Vfma, InstEmit32.Vfma_V, OpCode32SimdReg.Create);
SetA32("<<<<11101x10xxxxxxxx101xx1x0xxxx", InstName.Vfms, InstEmit32.Vfms_S, OpCode32SimdRegS.Create);
SetA32("111100100x10xxxxxxxx1100xxx1xxxx", InstName.Vfms, InstEmit32.Vfms_V, OpCode32SimdReg.Create);
SetA32("<<<<11101x01xxxxxxxx101xx1x0xxxx", InstName.Vfnma, InstEmit32.Vfnma_S, OpCode32SimdRegS.Create);
SetA32("<<<<11101x01xxxxxxxx101xx0x0xxxx", InstName.Vfnms, InstEmit32.Vfnms_S, OpCode32SimdRegS.Create);
SetA32("1111001x0x<<xxxxxxxx0000xxx0xxxx", InstName.Vhadd, InstEmit32.Vhadd, OpCode32SimdReg.Create);
SetA32("111101001x10xxxxxxxxxx00xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx0111xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 1.
SetA32("111101000x10xxxxxxxx1010xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 2.
SetA32("111101000x10xxxxxxxx0110xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 3.
SetA32("111101000x10xxxxxxxx0010xxxxxxxx", InstName.Vld1, InstEmit32.Vld1, OpCode32SimdMemPair.Create); // Regs = 4.
SetA32("111101001x10xxxxxxxxxx01xxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx100xxxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemPair.Create); // Regs = 1, inc = 1/2 (itype).
SetA32("111101000x10xxxxxxxx0011xxxxxxxx", InstName.Vld2, InstEmit32.Vld2, OpCode32SimdMemPair.Create); // Regs = 2, inc = 2.
SetA32("111101001x10xxxxxxxxxx10xxxxxxxx", InstName.Vld3, InstEmit32.Vld3, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx010xxxxxxxxx", InstName.Vld3, InstEmit32.Vld3, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("111101001x10xxxxxxxxxx11xxxxxxxx", InstName.Vld4, InstEmit32.Vld4, OpCode32SimdMemSingle.Create);
SetA32("111101000x10xxxxxxxx000xxxxxxxxx", InstName.Vld4, InstEmit32.Vld4, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("<<<<11001x01xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x11xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x11xxxxxxxx1011xxxxxxx0", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x01xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x11xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x11xxxxxxxx1010xxxxxxxx", InstName.Vldm, InstEmit32.Vldm, OpCode32SimdMemMult.Create);
SetA32("<<<<1101xx01xxxxxxxx101xxxxxxxxx", InstName.Vldr, InstEmit32.Vldr, OpCode32SimdMemImm.Create);
SetA32("1111001x0x<<xxxxxxxx0110xxx0xxxx", InstName.Vmax, InstEmit32.Vmax_I, OpCode32SimdReg.Create);
SetA32("111100100x00xxxxxxxx1111xxx0xxxx", InstName.Vmax, InstEmit32.Vmax_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx0110xxx1xxxx", InstName.Vmin, InstEmit32.Vmin_I, OpCode32SimdReg.Create);
SetA32("111100100x10xxxxxxxx1111xxx0xxxx", InstName.Vmin, InstEmit32.Vmin_V, OpCode32SimdReg.Create);
SetA32("111111101x00xxxxxxxx10>>x0x0xxxx", InstName.Vmaxnm, InstEmit32.Vmaxnm_S, OpCode32SimdRegS.Create);
SetA32("111100110x0xxxxxxxxx1111xxx1xxxx", InstName.Vmaxnm, InstEmit32.Vmaxnm_V, OpCode32SimdReg.Create);
SetA32("111111101x00xxxxxxxx10>>x1x0xxxx", InstName.Vminnm, InstEmit32.Vminnm_S, OpCode32SimdRegS.Create);
SetA32("111100110x1xxxxxxxxx1111xxx1xxxx", InstName.Vminnm, InstEmit32.Vminnm_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx000xx1x0xxxx", InstName.Vmla, InstEmit32.Vmla_1, OpCode32SimdRegElem.Create);
SetA32("111100100xxxxxxxxxxx1001xxx0xxxx", InstName.Vmla, InstEmit32.Vmla_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x00xxxxxxxx101xx0x0xxxx", InstName.Vmla, InstEmit32.Vmla_S, OpCode32SimdRegS.Create);
SetA32("111100100x00xxxxxxxx1101xxx1xxxx", InstName.Vmla, InstEmit32.Vmla_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx010xx1x0xxxx", InstName.Vmls, InstEmit32.Vmls_1, OpCode32SimdRegElem.Create);
SetA32("<<<<11100x00xxxxxxxx101xx1x0xxxx", InstName.Vmls, InstEmit32.Vmls_S, OpCode32SimdRegS.Create);
SetA32("111100100x10xxxxxxxx1101xxx1xxxx", InstName.Vmls, InstEmit32.Vmls_V, OpCode32SimdReg.Create);
SetA32("111100110xxxxxxxxxxx1001xxx0xxxx", InstName.Vmls, InstEmit32.Vmls_I, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx01010x0x0xxxx", InstName.Vmlsl, InstEmit32.Vmlsl_I, OpCode32SimdRegLong.Create);
SetA32("<<<<11100xx0xxxxxxxx1011xxx10000", InstName.Vmov, InstEmit32.Vmov_G1, OpCode32SimdMovGpElem.Create); // From gen purpose.
SetA32("<<<<1110xxx1xxxxxxxx1011xxx10000", InstName.Vmov, InstEmit32.Vmov_G1, OpCode32SimdMovGpElem.Create); // To gen purpose.
SetA32("<<<<1100010xxxxxxxxx101000x1xxxx", InstName.Vmov, InstEmit32.Vmov_G2, OpCode32SimdMovGpDouble.Create); // To/from gen purpose x2 and single precision x2.
SetA32("<<<<1100010xxxxxxxxx101100x1xxxx", InstName.Vmov, InstEmit32.Vmov_GD, OpCode32SimdMovGpDouble.Create); // To/from gen purpose x2 and double precision.
SetA32("<<<<1110000xxxxxxxxx1010x0010000", InstName.Vmov, InstEmit32.Vmov_GS, OpCode32SimdMovGp.Create); // To/from gen purpose and single precision.
SetA32("1111001x1x000xxxxxxx0xx00x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q vector I32.
SetA32("<<<<11101x11xxxxxxxx101x0000xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm44.Create); // Scalar f16/32/64 based on size 01 10 11.
SetA32("1111001x1x000xxxxxxx10x00x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q I16.
SetA32("1111001x1x000xxxxxxx11xx0x01xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q (dt - from cmode).
SetA32("1111001x1x000xxxxxxx11100x11xxxx", InstName.Vmov, InstEmit32.Vmov_I, OpCode32SimdImm.Create); // D/Q I64.
SetA32("<<<<11101x110000xxxx101x01x0xxxx", InstName.Vmov, InstEmit32.Vmov_S, OpCode32SimdS.Create);
SetA32("1111001x1x001000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("1111001x1x010000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("1111001x1x100000xxx0101000x1xxxx", InstName.Vmovl, InstEmit32.Vmovl, OpCode32SimdLong.Create);
SetA32("111100111x11xx10xxxx001000x0xxx0", InstName.Vmovn, InstEmit32.Vmovn, OpCode32SimdCmpZ.Create);
SetA32("<<<<11101111xxxxxxxx101000010000", InstName.Vmrs, InstEmit32.Vmrs, OpCode32SimdSpecial.Create);
SetA32("<<<<11101110xxxxxxxx101000010000", InstName.Vmsr, InstEmit32.Vmsr, OpCode32SimdSpecial.Create);
SetA32("1111001x1x<<xxxxxxxx100xx1x0xxxx", InstName.Vmul, InstEmit32.Vmul_1, OpCode32SimdRegElem.Create);
SetA32("111100100x<<xxxxxxxx1001xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1001xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x10xxxxxxxx101xx0x0xxxx", InstName.Vmul, InstEmit32.Vmul_S, OpCode32SimdRegS.Create);
SetA32("111100110x00xxxxxxxx1101xxx1xxxx", InstName.Vmul, InstEmit32.Vmul_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxx01010x1x0xxxx", InstName.Vmull, InstEmit32.Vmull_1, OpCode32SimdRegElemLong.Create);
SetA32("1111001x1x<<xxxxxxx01100x0x0xxxx", InstName.Vmull, InstEmit32.Vmull_I, OpCode32SimdRegLong.Create);
SetA32("111100101xx0xxxxxxx01110x0x0xxxx", InstName.Vmull, InstEmit32.Vmull_I, OpCode32SimdRegLong.Create); // P8/P64
SetA32("111100111x110000xxxx01011xx0xxxx", InstName.Vmvn, InstEmit32.Vmvn_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx0xx00x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create); // D/Q vector I32.
SetA32("1111001x1x000xxxxxxx10x00x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create);
SetA32("1111001x1x000xxxxxxx110x0x11xxxx", InstName.Vmvn, InstEmit32.Vmvn_II, OpCode32SimdImm.Create);
SetA32("<<<<11101x110001xxxx101x01x0xxxx", InstName.Vneg, InstEmit32.Vneg_S, OpCode32SimdS.Create);
SetA32("111100111x11<<01xxxx00111xx0xxxx", InstName.Vneg, InstEmit32.Vneg_V, OpCode32SimdCmpZ.Create);
SetA32("111100111x111001xxxx01111xx0xxxx", InstName.Vneg, InstEmit32.Vneg_V, OpCode32SimdCmpZ.Create);
SetA32("<<<<11100x01xxxxxxxx101xx1x0xxxx", InstName.Vnmla, InstEmit32.Vnmla_S, OpCode32SimdRegS.Create);
SetA32("<<<<11100x01xxxxxxxx101xx0x0xxxx", InstName.Vnmls, InstEmit32.Vnmls_S, OpCode32SimdRegS.Create);
SetA32("<<<<11100x10xxxxxxxx101xx1x0xxxx", InstName.Vnmul, InstEmit32.Vnmul_S, OpCode32SimdRegS.Create);
SetA32("111100100x11xxxxxxxx0001xxx1xxxx", InstName.Vorn, InstEmit32.Vorn_I, OpCode32SimdBinary.Create);
SetA32("111100100x10xxxxxxxx0001xxx1xxxx", InstName.Vorr, InstEmit32.Vorr_I, OpCode32SimdBinary.Create);
SetA32("1111001x1x000xxxxxxx<<x10x01xxxx", InstName.Vorr, InstEmit32.Vorr_II, OpCode32SimdImm.Create);
SetA32("111100100x<<xxxxxxxx1011x0x1xxxx", InstName.Vpadd, InstEmit32.Vpadd_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1101x0x0xxxx", InstName.Vpadd, InstEmit32.Vpadd_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx1010x0x0xxxx", InstName.Vpmax, InstEmit32.Vpmax_I, OpCode32SimdReg.Create);
SetA32("111100110x00xxxxxxxx1111x0x0xxxx", InstName.Vpmax, InstEmit32.Vpmax_V, OpCode32SimdReg.Create);
SetA32("1111001x0x<<xxxxxxxx1010x0x1xxxx", InstName.Vpmin, InstEmit32.Vpmin_I, OpCode32SimdReg.Create);
SetA32("111100110x10xxxxxxxx1111x0x0xxxx", InstName.Vpmin, InstEmit32.Vpmin_V, OpCode32SimdReg.Create);
SetA32("1111001x1x>>>xxxxxxx100101x1xxx0", InstName.Vqrshrn, InstEmit32.Vqrshrn, OpCode32SimdShImmNarrow.Create);
SetA32("111100111x>>>xxxxxxx100001x1xxx0", InstName.Vqrshrun, InstEmit32.Vqrshrun, OpCode32SimdShImmNarrow.Create);
SetA32("1111001x1x>>>xxxxxxx100100x1xxx0", InstName.Vqshrn, InstEmit32.Vqshrn, OpCode32SimdShImmNarrow.Create);
SetA32("111100111x111011xxxx010x0xx0xxxx", InstName.Vrecpe, InstEmit32.Vrecpe, OpCode32SimdSqrte.Create);
SetA32("111100100x00xxxxxxxx1111xxx1xxxx", InstName.Vrecps, InstEmit32.Vrecps, OpCode32SimdReg.Create);
SetA32("111100111x11xx00xxxx000<<xx0xxxx", InstName.Vrev, InstEmit32.Vrev, OpCode32SimdRev.Create);
SetA32("111111101x1110xxxxxx101x01x0xxxx", InstName.Vrint, InstEmit32.Vrint_RM, OpCode32SimdS.Create);
SetA32("<<<<11101x110110xxxx101x11x0xxxx", InstName.Vrint, InstEmit32.Vrint_Z, OpCode32SimdS.Create);
SetA32("<<<<11101x110111xxxx101x01x0xxxx", InstName.Vrintx, InstEmit32.Vrintx_S, OpCode32SimdS.Create);
SetA32("1111001x1x>>>xxxxxxx0010>xx1xxxx", InstName.Vrshr, InstEmit32.Vrshr, OpCode32SimdShImm.Create);
SetA32("111100111x111011xxxx010x1xx0xxxx", InstName.Vrsqrte, InstEmit32.Vrsqrte, OpCode32SimdSqrte.Create);
SetA32("111100100x10xxxxxxxx1111xxx1xxxx", InstName.Vrsqrts, InstEmit32.Vrsqrts, OpCode32SimdReg.Create);
SetA32("111111100xxxxxxxxxxx101xx0x0xxxx", InstName.Vsel, InstEmit32.Vsel, OpCode32SimdSel.Create);
SetA32("111100101x>>>xxxxxxx0101>xx1xxxx", InstName.Vshl, InstEmit32.Vshl, OpCode32SimdShImm.Create);
SetA32("1111001x0xxxxxxxxxxx0100xxx0xxxx", InstName.Vshl, InstEmit32.Vshl_I, OpCode32SimdReg.Create);
SetA32("1111001x1x>>>xxxxxxx101000x1xxxx", InstName.Vshll, InstEmit32.Vshll, OpCode32SimdShImmLong.Create); // A1 encoding.
SetA32("1111001x1x>>>xxxxxxx0000>xx1xxxx", InstName.Vshr, InstEmit32.Vshr, OpCode32SimdShImm.Create);
SetA32("111100101x>>>xxxxxxx100000x1xxx0", InstName.Vshrn, InstEmit32.Vshrn, OpCode32SimdShImmNarrow.Create);
SetA32("<<<<11101x110001xxxx101x11x0xxxx", InstName.Vsqrt, InstEmit32.Vsqrt_S, OpCode32SimdS.Create);
SetA32("1111001x1x>>>xxxxxxx0001>xx1xxxx", InstName.Vsra, InstEmit32.Vsra, OpCode32SimdShImm.Create);
SetA32("111101001x00xxxxxxxx<<00xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx0111xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 1.
SetA32("111101000x00xxxxxxxx1010xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 2.
SetA32("111101000x00xxxxxxxx0110xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 3.
SetA32("111101000x00xxxxxxxx0010xxxxxxxx", InstName.Vst1, InstEmit32.Vst1, OpCode32SimdMemPair.Create); // Regs = 4.
SetA32("111101001x00xxxxxxxx<<01xxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx100xxxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemPair.Create); // Regs = 1, inc = 1/2 (itype).
SetA32("111101000x00xxxxxxxx0011xxxxxxxx", InstName.Vst2, InstEmit32.Vst2, OpCode32SimdMemPair.Create); // Regs = 2, inc = 2.
SetA32("111101001x00xxxxxxxx<<10xxxxxxxx", InstName.Vst3, InstEmit32.Vst3, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx010xxxxxxxxx", InstName.Vst3, InstEmit32.Vst3, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("111101001x00xxxxxxxx<<11xxxxxxxx", InstName.Vst4, InstEmit32.Vst4, OpCode32SimdMemSingle.Create);
SetA32("111101000x00xxxxxxxx000xxxxxxxxx", InstName.Vst4, InstEmit32.Vst4, OpCode32SimdMemPair.Create); // Inc = 1/2 (itype).
SetA32("<<<<11001x00xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x10xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x10xxxxxxxx1011xxxxxxx0", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x00xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11001x10xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<11010x10xxxxxxxx1010xxxxxxxx", InstName.Vstm, InstEmit32.Vstm, OpCode32SimdMemMult.Create);
SetA32("<<<<1101xx00xxxxxxxx101xxxxxxxxx", InstName.Vstr, InstEmit32.Vstr, OpCode32SimdMemImm.Create);
SetA32("111100110xxxxxxxxxxx1000xxx0xxxx", InstName.Vsub, InstEmit32.Vsub_I, OpCode32SimdReg.Create);
SetA32("<<<<11100x11xxxxxxxx101xx1x0xxxx", InstName.Vsub, InstEmit32.Vsub_S, OpCode32SimdRegS.Create);
SetA32("111100100x10xxxxxxxx1101xxx0xxxx", InstName.Vsub, InstEmit32.Vsub_V, OpCode32SimdReg.Create);
SetA32("1111001x1x<<xxxxxxxx0011x0x0xxxx", InstName.Vsubw, InstEmit32.Vsubw_I, OpCode32SimdRegWide.Create);
SetA32("111100111x11xxxxxxxx10xxxxx0xxxx", InstName.Vtbl, InstEmit32.Vtbl, OpCode32SimdTbl.Create);
SetA32("111100111x11<<10xxxx00001xx0xxxx", InstName.Vtrn, InstEmit32.Vtrn, OpCode32SimdCmpZ.Create);
SetA32("111100100x<<xxxxxxxx1000xxx1xxxx", InstName.Vtst, InstEmit32.Vtst, OpCode32SimdReg.Create);
SetA32("111100111x11<<10xxxx00010xx0xxxx", InstName.Vuzp, InstEmit32.Vuzp, OpCode32SimdCmpZ.Create);
SetA32("111100111x11<<10xxxx00011xx0xxxx", InstName.Vzip, InstEmit32.Vzip, OpCode32SimdCmpZ.Create);
#endregion
FillFastLookupTable(InstA32FastLookup, AllInstA32);
FillFastLookupTable(InstT32FastLookup, AllInstT32);
FillFastLookupTable(InstA64FastLookup, AllInstA64);
#region "OpCode Table (AArch32, T16)"
SetT16("000<<xxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16ShiftImm.Create);
SetT16("0001100xxxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT16AddSubReg.Create);
SetT16("0001101xxxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT16AddSubReg.Create);
SetT16("0001110xxxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT16AddSubImm3.Create);
SetT16("0001111xxxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT16AddSubImm3.Create);
SetT16("00100xxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16AluImm8.Create);
SetT16("00101xxxxxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT16AluImm8.Create);
SetT16("00110xxxxxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT16AluImm8.Create);
SetT16("00111xxxxxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT16AluImm8.Create);
SetT16("0100000000xxxxxx", InstName.And, InstEmit32.And, OpCodeT16AluRegLow.Create);
SetT16("0100000001xxxxxx", InstName.Eor, InstEmit32.Eor, OpCodeT16AluRegLow.Create);
SetT16("0100000010xxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16ShiftReg.Create);
SetT16("0100000011xxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16ShiftReg.Create);
SetT16("0100000100xxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16ShiftReg.Create);
SetT16("0100000101xxxxxx", InstName.Adc, InstEmit32.Adc, OpCodeT16AluRegLow.Create);
SetT16("0100000110xxxxxx", InstName.Sbc, InstEmit32.Sbc, OpCodeT16AluRegLow.Create);
SetT16("0100000111xxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16ShiftReg.Create);
SetT16("0100001000xxxxxx", InstName.Tst, InstEmit32.Tst, OpCodeT16AluRegLow.Create);
SetT16("0100001001xxxxxx", InstName.Rsb, InstEmit32.Rsb, OpCodeT16AluImmZero.Create);
SetT16("0100001010xxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT16AluRegLow.Create);
SetT16("0100001011xxxxxx", InstName.Cmn, InstEmit32.Cmn, OpCodeT16AluRegLow.Create);
SetT16("0100001100xxxxxx", InstName.Orr, InstEmit32.Orr, OpCodeT16AluRegLow.Create);
SetT16("0100001101xxxxxx", InstName.Mul, InstEmit32.Mul, OpCodeT16AluRegLow.Create);
SetT16("0100001110xxxxxx", InstName.Bic, InstEmit32.Bic, OpCodeT16AluRegLow.Create);
SetT16("0100001111xxxxxx", InstName.Mvn, InstEmit32.Mvn, OpCodeT16AluRegLow.Create);
SetT16("01000100xxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT16AluRegHigh.Create);
SetT16("01000101xxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT16AluRegHigh.Create);
SetT16("01000110xxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT16AluRegHigh.Create);
SetT16("010001110xxxx000", InstName.Bx, InstEmit32.Bx, OpCodeT16BReg.Create);
SetT16("010001111xxxx000", InstName.Blx, InstEmit32.Blx, OpCodeT16BReg.Create);
SetT16("01001xxxxxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT16MemLit.Create);
SetT16("0101000xxxxxxxxx", InstName.Str, InstEmit32.Str, OpCodeT16MemReg.Create);
SetT16("0101001xxxxxxxxx", InstName.Strh, InstEmit32.Strh, OpCodeT16MemReg.Create);
SetT16("0101010xxxxxxxxx", InstName.Strb, InstEmit32.Strb, OpCodeT16MemReg.Create);
SetT16("0101011xxxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT16MemReg.Create);
SetT16("0101100xxxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT16MemReg.Create);
SetT16("0101101xxxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT16MemReg.Create);
SetT16("0101110xxxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT16MemReg.Create);
SetT16("0101111xxxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT16MemReg.Create);
SetT16("01100xxxxxxxxxxx", InstName.Str, InstEmit32.Str, OpCodeT16MemImm5.Create);
SetT16("01101xxxxxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT16MemImm5.Create);
SetT16("01110xxxxxxxxxxx", InstName.Strb, InstEmit32.Strb, OpCodeT16MemImm5.Create);
SetT16("01111xxxxxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT16MemImm5.Create);
SetT16("10000xxxxxxxxxxx", InstName.Strh, InstEmit32.Strh, OpCodeT16MemImm5.Create);
SetT16("10001xxxxxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT16MemImm5.Create);
SetT16("10010xxxxxxxxxxx", InstName.Str, InstEmit32.Str, OpCodeT16MemSp.Create);
SetT16("10011xxxxxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT16MemSp.Create);
SetT16("10100xxxxxxxxxxx", InstName.Adr, InstEmit32.Adr, OpCodeT16Adr.Create);
SetT16("10101xxxxxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT16SpRel.Create);
SetT16("101100000xxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT16AddSubSp.Create);
SetT16("101100001xxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT16AddSubSp.Create);
SetT16("1011001000xxxxxx", InstName.Sxth, InstEmit32.Sxth, OpCodeT16AluUx.Create);
SetT16("1011001001xxxxxx", InstName.Sxtb, InstEmit32.Sxtb, OpCodeT16AluUx.Create);
SetT16("1011001010xxxxxx", InstName.Uxth, InstEmit32.Uxth, OpCodeT16AluUx.Create);
SetT16("1011001011xxxxxx", InstName.Uxtb, InstEmit32.Uxtb, OpCodeT16AluUx.Create);
SetT16("101100x1xxxxxxxx", InstName.Cbz, InstEmit32.Cbz, OpCodeT16BImmCmp.Create);
SetT16("1011010xxxxxxxxx", InstName.Push, InstEmit32.Stm, OpCodeT16MemStack.Create);
SetT16("1011101000xxxxxx", InstName.Rev, InstEmit32.Rev, OpCodeT16AluRegLow.Create);
SetT16("1011101001xxxxxx", InstName.Rev16, InstEmit32.Rev16, OpCodeT16AluRegLow.Create);
SetT16("1011101011xxxxxx", InstName.Revsh, InstEmit32.Revsh, OpCodeT16AluRegLow.Create);
SetT16("101110x1xxxxxxxx", InstName.Cbnz, InstEmit32.Cbnz, OpCodeT16BImmCmp.Create);
SetT16("1011110xxxxxxxxx", InstName.Pop, InstEmit32.Ldm, OpCodeT16MemStack.Create);
SetT16("10111111xxxx0000", InstName.Nop, InstEmit32.Nop, OpCodeT16.Create);
SetT16("10111111xxxx>>>>", InstName.It, InstEmit32.It, OpCodeT16IfThen.Create);
SetT16("11000xxxxxxxxxxx", InstName.Stm, InstEmit32.Stm, OpCodeT16MemMult.Create);
SetT16("11001xxxxxxxxxxx", InstName.Ldm, InstEmit32.Ldm, OpCodeT16MemMult.Create);
SetT16("1101<<<xxxxxxxxx", InstName.B, InstEmit32.B, OpCodeT16BImm8.Create);
SetT16("11011111xxxxxxxx", InstName.Svc, InstEmit32.Svc, OpCodeT16Exception.Create);
SetT16("11100xxxxxxxxxxx", InstName.B, InstEmit32.B, OpCodeT16BImm11.Create);
#endregion
#region "OpCode Table (AArch32, T32)"
// Base
SetT32("11101011010xxxxx0xxxxxxxxxxxxxxx", InstName.Adc, InstEmit32.Adc, OpCodeT32AluRsImm.Create);
SetT32("11110x01010xxxxx0xxxxxxxxxxxxxxx", InstName.Adc, InstEmit32.Adc, OpCodeT32AluImm.Create);
SetT32("11101011000<xxxx0xxx<<<<xxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT32AluRsImm.Create);
SetT32("11110x01000<xxxx0xxx<<<<xxxxxxxx", InstName.Add, InstEmit32.Add, OpCodeT32AluImm.Create);
SetT32("11101010000<xxxx0xxx<<<<xxxxxxxx", InstName.And, InstEmit32.And, OpCodeT32AluRsImm.Create);
SetT32("11110x00000<xxxx0xxx<<<<xxxxxxxx", InstName.And, InstEmit32.And, OpCodeT32AluImm.Create);
SetT32("11110x<<<xxxxxxx10x0xxxxxxxxxxxx", InstName.B, InstEmit32.B, OpCodeT32BImm20.Create);
SetT32("11110xxxxxxxxxxx10x1xxxxxxxxxxxx", InstName.B, InstEmit32.B, OpCodeT32BImm24.Create);
SetT32("11101010001xxxxx0xxxxxxxxxxxxxxx", InstName.Bic, InstEmit32.Bic, OpCodeT32AluRsImm.Create);
SetT32("11110x00001xxxxx0xxxxxxxxxxxxxxx", InstName.Bic, InstEmit32.Bic, OpCodeT32AluImm.Create);
SetT32("11110xxxxxxxxxxx11x1xxxxxxxxxxxx", InstName.Bl, InstEmit32.Bl, OpCodeT32BImm24.Create);
SetT32("11110xxxxxxxxxxx11x0xxxxxxxxxxx0", InstName.Blx, InstEmit32.Blx, OpCodeT32BImm24.Create);
SetT32("111010110001xxxx0xxx1111xxxxxxxx", InstName.Cmn, InstEmit32.Cmn, OpCodeT32AluRsImm.Create);
SetT32("11110x010001xxxx0xxx1111xxxxxxxx", InstName.Cmn, InstEmit32.Cmn, OpCodeT32AluImm.Create);
SetT32("111010111011xxxx0xxx1111xxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT32AluRsImm.Create);
SetT32("11110x011011xxxx0xxx1111xxxxxxxx", InstName.Cmp, InstEmit32.Cmp, OpCodeT32AluImm.Create);
SetT32("11101010100<xxxx0xxx<<<<xxxxxxxx", InstName.Eor, InstEmit32.Eor, OpCodeT32AluRsImm.Create);
SetT32("11110x00100<xxxx0xxx<<<<xxxxxxxx", InstName.Eor, InstEmit32.Eor, OpCodeT32AluImm.Create);
SetT32("111010001101xxxxxxxx111111101111", InstName.Ldaex, InstEmit32.Ldaex, OpCodeT32MemLdEx.Create);
SetT32("1110100010x1xxxxxxxxxxxxxxxxxxxx", InstName.Ldm, InstEmit32.Ldm, OpCodeT32MemMult.Create);
SetT32("1110100100x1xxxxxxxxxxxxxxxxxxxx", InstName.Ldm, InstEmit32.Ldm, OpCodeT32MemMult.Create);
SetT32("111110000101xxxx<<<<10x1xxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm8.Create);
SetT32("111110000101xxxx<<<<1100xxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm8.Create);
SetT32("111110000101xxxx<<<<11x1xxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm8.Create);
SetT32("111110001101xxxxxxxxxxxxxxxxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemImm12.Create);
SetT32("111110000101<<<<xxxx000000xxxxxx", InstName.Ldr, InstEmit32.Ldr, OpCodeT32MemRsImm.Create);
SetT32("111110000001xxxx<<<<10x1xxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm8.Create);
SetT32("111110000001xxxx<<<<1100xxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm8.Create);
SetT32("111110000001xxxx<<<<11x1xxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm8.Create);
SetT32("111110001001xxxxxxxxxxxxxxxxxxxx", InstName.Ldrb, InstEmit32.Ldrb, OpCodeT32MemImm12.Create);
SetT32("1110100>x1>1<<<<xxxxxxxxxxxxxxxx", InstName.Ldrd, InstEmit32.Ldrd, OpCodeT32MemImm8D.Create);
SetT32("111110000011xxxx<<<<10x1xxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm8.Create);
SetT32("111110000011xxxx<<<<1100xxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm8.Create);
SetT32("111110000011xxxx<<<<11x1xxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm8.Create);
SetT32("111110001011xxxxxxxxxxxxxxxxxxxx", InstName.Ldrh, InstEmit32.Ldrh, OpCodeT32MemImm12.Create);
SetT32("111110010001xxxx<<<<10x1xxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm8.Create);
SetT32("111110010001xxxx<<<<1100xxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm8.Create);
SetT32("111110010001xxxx<<<<11x1xxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm8.Create);
SetT32("111110011001xxxxxxxxxxxxxxxxxxxx", InstName.Ldrsb, InstEmit32.Ldrsb, OpCodeT32MemImm12.Create);
SetT32("111110010011xxxx<<<<10x1xxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm8.Create);
SetT32("111110010011xxxx<<<<1100xxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm8.Create);
SetT32("111110010011xxxx<<<<11x1xxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm8.Create);
SetT32("111110011011xxxxxxxxxxxxxxxxxxxx", InstName.Ldrsh, InstEmit32.Ldrsh, OpCodeT32MemImm12.Create);
SetT32("11101010010x11110xxxxxxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT32AluRsImm.Create);
SetT32("11110x00010x11110xxxxxxxxxxxxxxx", InstName.Mov, InstEmit32.Mov, OpCodeT32AluImm.Create);
SetT32("11101010011x11110xxxxxxxxxxxxxxx", InstName.Mvn, InstEmit32.Mvn, OpCodeT32AluRsImm.Create);
SetT32("11110x00011x11110xxxxxxxxxxxxxxx", InstName.Mvn, InstEmit32.Mvn, OpCodeT32AluImm.Create);
SetT32("11101010011x<<<<0xxxxxxxxxxxxxxx", InstName.Orn, InstEmit32.Orn, OpCodeT32AluRsImm.Create);
SetT32("11110x00011x<<<<0xxxxxxxxxxxxxxx", InstName.Orn, InstEmit32.Orn, OpCodeT32AluImm.Create);
SetT32("11101010010x<<<<0xxxxxxxxxxxxxxx", InstName.Orr, InstEmit32.Orr, OpCodeT32AluRsImm.Create);
SetT32("11110x00010x<<<<0xxxxxxxxxxxxxxx", InstName.Orr, InstEmit32.Orr, OpCodeT32AluImm.Create);
SetT32("11101011110xxxxx0xxxxxxxxxxxxxxx", InstName.Rsb, InstEmit32.Rsb, OpCodeT32AluRsImm.Create);
SetT32("11110x01110xxxxx0xxxxxxxxxxxxxxx", InstName.Rsb, InstEmit32.Rsb, OpCodeT32AluImm.Create);
SetT32("11101011011xxxxx0xxxxxxxxxxxxxxx", InstName.Sbc, InstEmit32.Sbc, OpCodeT32AluRsImm.Create);
SetT32("11110x01011xxxxx0xxxxxxxxxxxxxxx", InstName.Sbc, InstEmit32.Sbc, OpCodeT32AluImm.Create);
SetT32("111010001100xxxxxxxx11111110xxxx", InstName.Stlex, InstEmit32.Stlex, OpCodeT32MemStEx.Create);
SetT32("1110100010x0xxxx0xxxxxxxxxxxxxxx", InstName.Stm, InstEmit32.Stm, OpCodeT32MemMult.Create);
SetT32("1110100100x0xxxx0xxxxxxxxxxxxxxx", InstName.Stm, InstEmit32.Stm, OpCodeT32MemMult.Create);
SetT32("111110000100xxxxxxxx1<<>xxxxxxxx", InstName.Str, InstEmit32.Str, OpCodeT32MemImm8.Create);
SetT32("111110001100xxxxxxxxxxxxxxxxxxxx", InstName.Str, InstEmit32.Str, OpCodeT32MemImm12.Create);
SetT32("111110000100<<<<xxxx000000xxxxxx", InstName.Str, InstEmit32.Str, OpCodeT32MemRsImm.Create);
SetT32("111110000000xxxxxxxx1<<>xxxxxxxx", InstName.Strb, InstEmit32.Strb, OpCodeT32MemImm8.Create);
SetT32("111110001000xxxxxxxxxxxxxxxxxxxx", InstName.Strb, InstEmit32.Strb, OpCodeT32MemImm12.Create);
SetT32("1110100>x1>0<<<<xxxxxxxxxxxxxxxx", InstName.Strd, InstEmit32.Strd, OpCodeT32MemImm8D.Create);
SetT32("111110000010xxxxxxxx1<<>xxxxxxxx", InstName.Strh, InstEmit32.Strh, OpCodeT32MemImm8.Create);
SetT32("111110001010xxxxxxxxxxxxxxxxxxxx", InstName.Strh, InstEmit32.Strh, OpCodeT32MemImm12.Create);
SetT32("11101011101<xxxx0xxx<<<<xxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT32AluRsImm.Create);
SetT32("11110x01101<xxxx0xxx<<<<xxxxxxxx", InstName.Sub, InstEmit32.Sub, OpCodeT32AluImm.Create);
SetT32("111010101001xxxx0xxx1111xxxxxxxx", InstName.Teq, InstEmit32.Teq, OpCodeT32AluRsImm.Create);
SetT32("11110x001001xxxx0xxx1111xxxxxxxx", InstName.Teq, InstEmit32.Teq, OpCodeT32AluImm.Create);
SetT32("111010100001xxxx0xxx1111xxxxxxxx", InstName.Tst, InstEmit32.Tst, OpCodeT32AluRsImm.Create);
SetT32("11110x000001xxxx0xxx1111xxxxxxxx", InstName.Tst, InstEmit32.Tst, OpCodeT32AluImm.Create);
#endregion
FillFastLookupTable(InstA32FastLookup, AllInstA32, ToFastLookupIndexA);
FillFastLookupTable(InstT32FastLookup, AllInstT32, ToFastLookupIndexT);
FillFastLookupTable(InstA64FastLookup, AllInstA64, ToFastLookupIndexA);
}
private static void FillFastLookupTable(InstInfo[][] table, List<InstInfo> allInsts)
private static void FillFastLookupTable(InstInfo[][] table, List<InstInfo> allInsts, Func<int, int> ToFastLookupIndex)
{
List<InstInfo>[] temp = new List<InstInfo>[FastLookupSize];
@ -1007,20 +1162,30 @@ namespace ARMeilleure.Decoders
private static void SetA32(string encoding, InstName name, InstEmitter emitter, MakeOp makeOp)
{
Set(encoding, ExecutionMode.Aarch32Arm, new InstDescriptor(name, emitter), makeOp);
Set(encoding, AllInstA32, new InstDescriptor(name, emitter), makeOp);
}
private static void SetT16(string encoding, InstName name, InstEmitter emitter, MakeOp makeOp)
{
encoding = "xxxxxxxxxxxxxxxx" + encoding;
Set(encoding, AllInstT32, new InstDescriptor(name, emitter), makeOp);
}
private static void SetT32(string encoding, InstName name, InstEmitter emitter, MakeOp makeOp)
{
Set(encoding, ExecutionMode.Aarch32Thumb, new InstDescriptor(name, emitter), makeOp);
string reversedEncoding = encoding.Substring(16) + encoding.Substring(0, 16);
MakeOp reversedMakeOp =
(InstDescriptor inst, ulong address, int opCode)
=> makeOp(inst, address, (int)BitOperations.RotateRight((uint)opCode, 16));
Set(reversedEncoding, AllInstT32, new InstDescriptor(name, emitter), reversedMakeOp);
}
private static void SetA64(string encoding, InstName name, InstEmitter emitter, MakeOp makeOp)
{
Set(encoding, ExecutionMode.Aarch64, new InstDescriptor(name, emitter), makeOp);
Set(encoding, AllInstA64, new InstDescriptor(name, emitter), makeOp);
}
private static void Set(string encoding, ExecutionMode mode, InstDescriptor inst, MakeOp makeOp)
private static void Set(string encoding, List<InstInfo> list, InstDescriptor inst, MakeOp makeOp)
{
int bit = encoding.Length - 1;
int value = 0;
@ -1069,7 +1234,7 @@ namespace ARMeilleure.Decoders
if (xBits == 0)
{
InsertInst(new InstInfo(xMask, value, inst, makeOp), mode);
list.Add(new InstInfo(xMask, value, inst, makeOp));
return;
}
@ -1085,34 +1250,24 @@ namespace ARMeilleure.Decoders
if (mask != blacklisted)
{
InsertInst(new InstInfo(xMask, value | mask, inst, makeOp), mode);
list.Add(new InstInfo(xMask, value | mask, inst, makeOp));
}
}
}
private static void InsertInst(InstInfo info, ExecutionMode mode)
{
switch (mode)
{
case ExecutionMode.Aarch32Arm: AllInstA32.Add(info); break;
case ExecutionMode.Aarch32Thumb: AllInstT32.Add(info); break;
case ExecutionMode.Aarch64: AllInstA64.Add(info); break;
}
}
public static (InstDescriptor inst, MakeOp makeOp) GetInstA32(int opCode)
{
return GetInstFromList(InstA32FastLookup[ToFastLookupIndex(opCode)], opCode);
return GetInstFromList(InstA32FastLookup[ToFastLookupIndexA(opCode)], opCode);
}
public static (InstDescriptor inst, MakeOp makeOp) GetInstT32(int opCode)
{
return GetInstFromList(InstT32FastLookup[ToFastLookupIndex(opCode)], opCode);
return GetInstFromList(InstT32FastLookup[ToFastLookupIndexT(opCode)], opCode);
}
public static (InstDescriptor inst, MakeOp makeOp) GetInstA64(int opCode)
{
return GetInstFromList(InstA64FastLookup[ToFastLookupIndex(opCode)], opCode);
return GetInstFromList(InstA64FastLookup[ToFastLookupIndexA(opCode)], opCode);
}
private static (InstDescriptor inst, MakeOp makeOp) GetInstFromList(InstInfo[] insts, int opCode)
@ -1128,9 +1283,14 @@ namespace ARMeilleure.Decoders
return (new InstDescriptor(InstName.Und, InstEmit.Und), null);
}
private static int ToFastLookupIndex(int value)
private static int ToFastLookupIndexA(int value)
{
return ((value >> 10) & 0x00F) | ((value >> 18) & 0xFF0);
}
private static int ToFastLookupIndexT(int value)
{
return (value >> 4) & 0xFFF;
}
}
}

View File

@ -1,13 +1,14 @@
// https://www.intel.com/content/dam/doc/white-paper/advanced-encryption-standard-new-instructions-set-paper.pdf
using ARMeilleure.State;
using System;
namespace ARMeilleure.Instructions
{
static class CryptoHelper
{
#region "LookUp Tables"
private static readonly byte[] _sBox = new byte[]
private static ReadOnlySpan<byte> _sBox => new byte[]
{
0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,
0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,
@ -27,7 +28,7 @@ namespace ARMeilleure.Instructions
0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16
};
private static readonly byte[] _invSBox = new byte[]
private static ReadOnlySpan<byte> _invSBox => new byte[]
{
0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38, 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb,
0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87, 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb,
@ -47,7 +48,7 @@ namespace ARMeilleure.Instructions
0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26, 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d
};
private static readonly byte[] _gfMul02 = new byte[]
private static ReadOnlySpan<byte> _gfMul02 => new byte[]
{
0x00, 0x02, 0x04, 0x06, 0x08, 0x0a, 0x0c, 0x0e, 0x10, 0x12, 0x14, 0x16, 0x18, 0x1a, 0x1c, 0x1e,
0x20, 0x22, 0x24, 0x26, 0x28, 0x2a, 0x2c, 0x2e, 0x30, 0x32, 0x34, 0x36, 0x38, 0x3a, 0x3c, 0x3e,
@ -67,7 +68,7 @@ namespace ARMeilleure.Instructions
0xfb, 0xf9, 0xff, 0xfd, 0xf3, 0xf1, 0xf7, 0xf5, 0xeb, 0xe9, 0xef, 0xed, 0xe3, 0xe1, 0xe7, 0xe5
};
private static readonly byte[] _gfMul03 = new byte[]
private static ReadOnlySpan<byte> _gfMul03 => new byte[]
{
0x00, 0x03, 0x06, 0x05, 0x0c, 0x0f, 0x0a, 0x09, 0x18, 0x1b, 0x1e, 0x1d, 0x14, 0x17, 0x12, 0x11,
0x30, 0x33, 0x36, 0x35, 0x3c, 0x3f, 0x3a, 0x39, 0x28, 0x2b, 0x2e, 0x2d, 0x24, 0x27, 0x22, 0x21,
@ -87,7 +88,7 @@ namespace ARMeilleure.Instructions
0x0b, 0x08, 0x0d, 0x0e, 0x07, 0x04, 0x01, 0x02, 0x13, 0x10, 0x15, 0x16, 0x1f, 0x1c, 0x19, 0x1a
};
private static readonly byte[] _gfMul09 = new byte[]
private static ReadOnlySpan<byte> _gfMul09 => new byte[]
{
0x00, 0x09, 0x12, 0x1b, 0x24, 0x2d, 0x36, 0x3f, 0x48, 0x41, 0x5a, 0x53, 0x6c, 0x65, 0x7e, 0x77,
0x90, 0x99, 0x82, 0x8b, 0xb4, 0xbd, 0xa6, 0xaf, 0xd8, 0xd1, 0xca, 0xc3, 0xfc, 0xf5, 0xee, 0xe7,
@ -107,7 +108,7 @@ namespace ARMeilleure.Instructions
0x31, 0x38, 0x23, 0x2a, 0x15, 0x1c, 0x07, 0x0e, 0x79, 0x70, 0x6b, 0x62, 0x5d, 0x54, 0x4f, 0x46
};
private static readonly byte[] _gfMul0B = new byte[]
private static ReadOnlySpan<byte> _gfMul0B => new byte[]
{
0x00, 0x0b, 0x16, 0x1d, 0x2c, 0x27, 0x3a, 0x31, 0x58, 0x53, 0x4e, 0x45, 0x74, 0x7f, 0x62, 0x69,
0xb0, 0xbb, 0xa6, 0xad, 0x9c, 0x97, 0x8a, 0x81, 0xe8, 0xe3, 0xfe, 0xf5, 0xc4, 0xcf, 0xd2, 0xd9,
@ -127,7 +128,7 @@ namespace ARMeilleure.Instructions
0xca, 0xc1, 0xdc, 0xd7, 0xe6, 0xed, 0xf0, 0xfb, 0x92, 0x99, 0x84, 0x8f, 0xbe, 0xb5, 0xa8, 0xa3
};
private static readonly byte[] _gfMul0D = new byte[]
private static ReadOnlySpan<byte> _gfMul0D => new byte[]
{
0x00, 0x0d, 0x1a, 0x17, 0x34, 0x39, 0x2e, 0x23, 0x68, 0x65, 0x72, 0x7f, 0x5c, 0x51, 0x46, 0x4b,
0xd0, 0xdd, 0xca, 0xc7, 0xe4, 0xe9, 0xfe, 0xf3, 0xb8, 0xb5, 0xa2, 0xaf, 0x8c, 0x81, 0x96, 0x9b,
@ -147,7 +148,7 @@ namespace ARMeilleure.Instructions
0xdc, 0xd1, 0xc6, 0xcb, 0xe8, 0xe5, 0xf2, 0xff, 0xb4, 0xb9, 0xae, 0xa3, 0x80, 0x8d, 0x9a, 0x97
};
private static readonly byte[] _gfMul0E = new byte[]
private static ReadOnlySpan<byte> _gfMul0E => new byte[]
{
0x00, 0x0e, 0x1c, 0x12, 0x38, 0x36, 0x24, 0x2a, 0x70, 0x7e, 0x6c, 0x62, 0x48, 0x46, 0x54, 0x5a,
0xe0, 0xee, 0xfc, 0xf2, 0xd8, 0xd6, 0xc4, 0xca, 0x90, 0x9e, 0x8c, 0x82, 0xa8, 0xa6, 0xb4, 0xba,
@ -167,12 +168,12 @@ namespace ARMeilleure.Instructions
0xd7, 0xd9, 0xcb, 0xc5, 0xef, 0xe1, 0xf3, 0xfd, 0xa7, 0xa9, 0xbb, 0xb5, 0x9f, 0x91, 0x83, 0x8d
};
private static readonly byte[] _srPerm = new byte[]
private static ReadOnlySpan<byte> _srPerm => new byte[]
{
0, 13, 10, 7, 4, 1, 14, 11, 8, 5, 2, 15, 12, 9, 6, 3
};
private static readonly byte[] _isrPerm = new byte[]
private static ReadOnlySpan<byte> _isrPerm => new byte[]
{
0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12, 1, 6, 11
};

View File

@ -20,7 +20,7 @@ namespace ARMeilleure.Instructions
Operand res = context.Add(n, m);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
@ -44,7 +44,7 @@ namespace ARMeilleure.Instructions
res = context.Add(res, carry);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
@ -64,7 +64,7 @@ namespace ARMeilleure.Instructions
Operand res = context.BitwiseAnd(n, m);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
@ -110,7 +110,7 @@ namespace ARMeilleure.Instructions
Operand res = context.BitwiseAnd(n, context.BitwiseNot(m));
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
@ -161,7 +161,7 @@ namespace ARMeilleure.Instructions
Operand res = context.BitwiseExclusiveOr(n, m);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
@ -175,7 +175,7 @@ namespace ARMeilleure.Instructions
Operand m = GetAluM(context);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, m);
}
@ -204,7 +204,7 @@ namespace ARMeilleure.Instructions
Operand res = context.Multiply(n, m);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
@ -219,7 +219,7 @@ namespace ARMeilleure.Instructions
Operand res = context.BitwiseNot(m);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
@ -236,7 +236,24 @@ namespace ARMeilleure.Instructions
Operand res = context.BitwiseOr(n, m);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
EmitAluStore(context, res);
}
public static void Orn(ArmEmitterContext context)
{
IOpCode32Alu op = (IOpCode32Alu)context.CurrOp;
Operand n = GetAluN(context);
Operand m = GetAluM(context);
Operand res = context.BitwiseOr(n, context.BitwiseNot(m));
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
@ -315,7 +332,7 @@ namespace ARMeilleure.Instructions
res = context.Subtract(res, borrow);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
@ -335,7 +352,7 @@ namespace ARMeilleure.Instructions
Operand res = context.Subtract(m, n);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
@ -359,7 +376,7 @@ namespace ARMeilleure.Instructions
res = context.Subtract(res, borrow);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
@ -387,6 +404,16 @@ namespace ARMeilleure.Instructions
EmitDiv(context, false);
}
public static void Shadd8(ArmEmitterContext context)
{
EmitHadd8(context, false);
}
public static void Shsub8(ArmEmitterContext context)
{
EmitHsub8(context, false);
}
public static void Ssat(ArmEmitterContext context)
{
OpCode32Sat op = (OpCode32Sat)context.CurrOp;
@ -410,7 +437,7 @@ namespace ARMeilleure.Instructions
Operand res = context.Subtract(n, m);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
@ -474,20 +501,12 @@ namespace ARMeilleure.Instructions
public static void Uhadd8(ArmEmitterContext context)
{
OpCode32AluReg op = (OpCode32AluReg)context.CurrOp;
EmitHadd8(context, true);
}
Operand m = GetIntA32(context, op.Rm);
Operand n = GetIntA32(context, op.Rn);
Operand xor, res;
res = context.BitwiseAnd(m, n);
xor = context.BitwiseExclusiveOr(m, n);
xor = context.ShiftRightUI(xor, Const(1));
xor = context.BitwiseAnd(xor, Const(0x7F7F7F7Fu));
res = context.Add(res, xor);
SetIntA32(context, op.Rd, res);
public static void Uhsub8(ArmEmitterContext context)
{
EmitHsub8(context, true);
}
public static void Usat(ArmEmitterContext context)
@ -659,6 +678,71 @@ namespace ARMeilleure.Instructions
context.MarkLabel(lblEnd);
}
private static void EmitHadd8(ArmEmitterContext context, bool unsigned)
{
OpCode32AluReg op = (OpCode32AluReg)context.CurrOp;
Operand m = GetIntA32(context, op.Rm);
Operand n = GetIntA32(context, op.Rn);
Operand xor, res, carry;
// This relies on the equality x+y == ((x&y) << 1) + (x^y).
// Note that x^y always contains the LSB of the result.
// Since we want to calculate (x+y)/2, we can instead calculate (x&y) + ((x^y)>>1).
// We mask by 0x7F to remove the LSB so that it doesn't leak into the field below.
res = context.BitwiseAnd(m, n);
carry = context.BitwiseExclusiveOr(m, n);
xor = context.ShiftRightUI(carry, Const(1));
xor = context.BitwiseAnd(xor, Const(0x7F7F7F7Fu));
res = context.Add(res, xor);
if (!unsigned)
{
// Propagates the sign bit from (x^y)>>1 upwards by one.
carry = context.BitwiseAnd(carry, Const(0x80808080u));
res = context.BitwiseExclusiveOr(res, carry);
}
SetIntA32(context, op.Rd, res);
}
private static void EmitHsub8(ArmEmitterContext context, bool unsigned)
{
OpCode32AluReg op = (OpCode32AluReg)context.CurrOp;
Operand m = GetIntA32(context, op.Rm);
Operand n = GetIntA32(context, op.Rn);
Operand left, right, carry, res;
// This relies on the equality x-y == (x^y) - (((x^y)&y) << 1).
// Note that x^y always contains the LSB of the result.
// Since we want to calculate (x+y)/2, we can instead calculate ((x^y)>>1) - ((x^y)&y).
carry = context.BitwiseExclusiveOr(m, n);
left = context.ShiftRightUI(carry, Const(1));
right = context.BitwiseAnd(carry, m);
// We must now perform a partitioned subtraction.
// We can do this because minuend contains 7 bit fields.
// We use the extra bit in minuend as a bit to borrow from; we set this bit.
// We invert this bit at the end as this tells us if that bit was borrowed from.
res = context.BitwiseOr(left, Const(0x80808080));
res = context.Subtract(res, right);
res = context.BitwiseExclusiveOr(res, Const(0x80808080));
if (!unsigned)
{
// We then sign extend the result into this bit.
carry = context.BitwiseAnd(carry, Const(0x80808080));
res = context.BitwiseExclusiveOr(res, carry);
}
SetIntA32(context, op.Rd, res);
}
private static void EmitSat(ArmEmitterContext context, int intMin, int intMax)
{
OpCode32Sat op = (OpCode32Sat)context.CurrOp;
@ -769,7 +853,7 @@ namespace ARMeilleure.Instructions
{
IOpCode32Alu op = (IOpCode32Alu)context.CurrOp;
EmitGenericAluStoreA32(context, op.Rd, op.SetFlags, value);
EmitGenericAluStoreA32(context, op.Rd, ShouldSetFlags(context), value);
}
}
}
}

View File

@ -12,6 +12,18 @@ namespace ARMeilleure.Instructions
{
static class InstEmitAluHelper
{
public static bool ShouldSetFlags(ArmEmitterContext context)
{
IOpCode32HasSetFlags op = (IOpCode32HasSetFlags)context.CurrOp;
if (op.SetFlags == null)
{
return !context.IsInIfThenBlock;
}
return op.SetFlags.Value;
}
public static void EmitNZFlagsCheck(ArmEmitterContext context, Operand d)
{
SetFlag(context, PState.NFlag, context.ICompareLess (d, Const(d.Type, 0)));
@ -116,7 +128,7 @@ namespace ARMeilleure.Instructions
{
Debug.Assert(value.Type == OperandType.I32);
if (IsThumb(context.CurrOp))
if (((OpCode32)context.CurrOp).IsThumb())
{
bool isReturn = IsA32Return(context);
if (!isReturn)
@ -183,9 +195,9 @@ namespace ARMeilleure.Instructions
switch (context.CurrOp)
{
// ARM32.
case OpCode32AluImm op:
case IOpCode32AluImm op:
{
if (op.SetFlags && op.IsRotated)
if (ShouldSetFlags(context) && op.IsRotated && setCarry)
{
SetFlag(context, PState.CFlag, Const((uint)op.Immediate >> 31));
}
@ -195,10 +207,8 @@ namespace ARMeilleure.Instructions
case OpCode32AluImm16 op: return Const(op.Immediate);
case OpCode32AluRsImm op: return GetMShiftedByImmediate(context, op, setCarry);
case OpCode32AluRsReg op: return GetMShiftedByReg(context, op, setCarry);
case OpCodeT16AluImm8 op: return Const(op.Immediate);
case IOpCode32AluRsImm op: return GetMShiftedByImmediate(context, op, setCarry);
case IOpCode32AluRsReg op: return GetMShiftedByReg(context, op, setCarry);
case IOpCode32AluReg op: return GetIntA32(context, op.Rm);
@ -249,7 +259,7 @@ namespace ARMeilleure.Instructions
}
// ARM32 helpers.
public static Operand GetMShiftedByImmediate(ArmEmitterContext context, OpCode32AluRsImm op, bool setCarry)
public static Operand GetMShiftedByImmediate(ArmEmitterContext context, IOpCode32AluRsImm op, bool setCarry)
{
Operand m = GetIntA32(context, op.Rm);
@ -267,7 +277,7 @@ namespace ARMeilleure.Instructions
if (shift != 0)
{
setCarry &= op.SetFlags;
setCarry &= ShouldSetFlags(context);
switch (op.ShiftType)
{
@ -305,7 +315,7 @@ namespace ARMeilleure.Instructions
return shift;
}
public static Operand GetMShiftedByReg(ArmEmitterContext context, OpCode32AluRsReg op, bool setCarry)
public static Operand GetMShiftedByReg(ArmEmitterContext context, IOpCode32AluRsReg op, bool setCarry)
{
Operand m = GetIntA32(context, op.Rm);
Operand s = context.ZeroExtend8(OperandType.I32, GetIntA32(context, op.Rs));
@ -314,7 +324,7 @@ namespace ARMeilleure.Instructions
Operand zeroResult = m;
Operand shiftResult = m;
setCarry &= op.SetFlags;
setCarry &= ShouldSetFlags(context);
switch (op.ShiftType)
{

View File

@ -9,18 +9,25 @@ namespace ARMeilleure.Instructions
{
public static void Brk(ArmEmitterContext context)
{
EmitExceptionCall(context, nameof(NativeInterface.Break));
OpCodeException op = (OpCodeException)context.CurrOp;
string name = nameof(NativeInterface.Break);
context.StoreToContext();
context.Call(typeof(NativeInterface).GetMethod(name), Const(op.Address), Const(op.Id));
context.LoadFromContext();
context.Return(Const(op.Address));
}
public static void Svc(ArmEmitterContext context)
{
EmitExceptionCall(context, nameof(NativeInterface.SupervisorCall));
}
private static void EmitExceptionCall(ArmEmitterContext context, string name)
{
OpCodeException op = (OpCodeException)context.CurrOp;
string name = nameof(NativeInterface.SupervisorCall);
context.StoreToContext();
context.Call(typeof(NativeInterface).GetMethod(name), Const(op.Address), Const(op.Id));
@ -41,6 +48,8 @@ namespace ARMeilleure.Instructions
context.Call(typeof(NativeInterface).GetMethod(name), Const(op.Address), Const(op.RawOpCode));
context.LoadFromContext();
context.Return(Const(op.Address));
}
}
}

View File

@ -1,7 +1,5 @@
using ARMeilleure.Decoders;
using ARMeilleure.Translation;
using static ARMeilleure.Instructions.InstEmitFlowHelper;
using static ARMeilleure.IntermediateRepresentation.Operand.Factory;
namespace ARMeilleure.Instructions
@ -10,25 +8,32 @@ namespace ARMeilleure.Instructions
{
public static void Svc(ArmEmitterContext context)
{
EmitExceptionCall(context, nameof(NativeInterface.SupervisorCall));
}
IOpCode32Exception op = (IOpCode32Exception)context.CurrOp;
public static void Trap(ArmEmitterContext context)
{
EmitExceptionCall(context, nameof(NativeInterface.Break));
}
private static void EmitExceptionCall(ArmEmitterContext context, string name)
{
OpCode32Exception op = (OpCode32Exception)context.CurrOp;
string name = nameof(NativeInterface.SupervisorCall);
context.StoreToContext();
context.Call(typeof(NativeInterface).GetMethod(name), Const(op.Address), Const(op.Id));
context.Call(typeof(NativeInterface).GetMethod(name), Const(((IOpCode)op).Address), Const(op.Id));
context.LoadFromContext();
Translator.EmitSynchronization(context);
}
public static void Trap(ArmEmitterContext context)
{
IOpCode32Exception op = (IOpCode32Exception)context.CurrOp;
string name = nameof(NativeInterface.Break);
context.StoreToContext();
context.Call(typeof(NativeInterface).GetMethod(name), Const(((IOpCode)op).Address), Const(op.Id));
context.LoadFromContext();
context.Return(Const(context.CurrOp.Address));
}
}
}

View File

@ -34,7 +34,7 @@ namespace ARMeilleure.Instructions
uint pc = op.GetPc();
bool isThumb = IsThumb(context.CurrOp);
bool isThumb = ((OpCode32)context.CurrOp).IsThumb();
uint currentPc = isThumb
? pc | 1
@ -61,17 +61,17 @@ namespace ARMeilleure.Instructions
Operand addr = context.Copy(GetIntA32(context, op.Rm));
Operand bitOne = context.BitwiseAnd(addr, Const(1));
bool isThumb = IsThumb(context.CurrOp);
bool isThumb = ((OpCode32)context.CurrOp).IsThumb();
uint currentPc = isThumb
? pc | 1
? (pc - 2) | 1
: pc - 4;
SetIntA32(context, GetBankedRegisterAlias(context.Mode, RegisterAlias.Aarch32Lr), Const(currentPc));
SetFlag(context, PState.TFlag, bitOne);
EmitVirtualCall(context, addr);
EmitBxWritePc(context, addr);
}
public static void Bx(ArmEmitterContext context)
@ -80,5 +80,32 @@ namespace ARMeilleure.Instructions
EmitBxWritePc(context, GetIntA32(context, op.Rm), op.Rm);
}
public static void Cbnz(ArmEmitterContext context) => EmitCb(context, onNotZero: true);
public static void Cbz(ArmEmitterContext context) => EmitCb(context, onNotZero: false);
private static void EmitCb(ArmEmitterContext context, bool onNotZero)
{
OpCodeT16BImmCmp op = (OpCodeT16BImmCmp)context.CurrOp;
Operand value = GetIntA32(context, op.Rn);
Operand lblTarget = context.GetLabel((ulong)op.Immediate);
if (onNotZero)
{
context.BranchIfTrue(lblTarget, value);
}
else
{
context.BranchIfFalse(lblTarget, value);
}
}
public static void It(ArmEmitterContext context)
{
OpCodeT16IfThen op = (OpCodeT16IfThen)context.CurrOp;
context.SetIfThenBlockState(op.IfThenBlockConds);
}
}
}

View File

@ -10,11 +10,6 @@ namespace ARMeilleure.Instructions
{
static class InstEmitHelper
{
public static bool IsThumb(OpCode op)
{
return op is OpCodeT16;
}
public static Operand GetExtendedM(ArmEmitterContext context, int rm, IntType type)
{
Operand value = GetIntOrZR(context, rm);
@ -47,6 +42,20 @@ namespace ARMeilleure.Instructions
}
}
public static Operand GetIntA32AlignedPC(ArmEmitterContext context, int regIndex)
{
if (regIndex == RegisterAlias.Aarch32Pc)
{
OpCode32 op = (OpCode32)context.CurrOp;
return Const((int)(op.GetPc() & 0xfffffffc));
}
else
{
return Register(GetRegisterAlias(context.Mode, regIndex), RegisterType.Integer, OperandType.I32);
}
}
public static Operand GetVecA32(int regIndex)
{
return Register(regIndex, RegisterType.Vector, OperandType.V128);
@ -172,7 +181,7 @@ namespace ARMeilleure.Instructions
SetFlag(context, PState.TFlag, mode);
Operand addr = context.ConditionalSelect(mode, pc, context.BitwiseAnd(pc, Const(~3)));
Operand addr = context.ConditionalSelect(mode, context.BitwiseAnd(pc, Const(~1)), context.BitwiseAnd(pc, Const(~3)));
InstEmitFlowHelper.EmitVirtualJump(context, addr, isReturn);
}

View File

@ -32,7 +32,7 @@ namespace ARMeilleure.Instructions
public static void Ldm(ArmEmitterContext context)
{
OpCode32MemMult op = (OpCode32MemMult)context.CurrOp;
IOpCode32MemMult op = (IOpCode32MemMult)context.CurrOp;
Operand n = GetIntA32(context, op.Rn);
@ -95,7 +95,7 @@ namespace ARMeilleure.Instructions
public static void Stm(ArmEmitterContext context)
{
OpCode32MemMult op = (OpCode32MemMult)context.CurrOp;
IOpCode32MemMult op = (IOpCode32MemMult)context.CurrOp;
Operand n = context.Copy(GetIntA32(context, op.Rn));
@ -151,9 +151,9 @@ namespace ARMeilleure.Instructions
private static void EmitLoadOrStore(ArmEmitterContext context, int size, AccessType accType)
{
OpCode32Mem op = (OpCode32Mem)context.CurrOp;
IOpCode32Mem op = (IOpCode32Mem)context.CurrOp;
Operand n = context.Copy(GetIntA32(context, op.Rn));
Operand n = context.Copy(GetIntA32AlignedPC(context, op.Rn));
Operand m = GetMemM(context, setCarry: false);
Operand temp = default;
@ -255,5 +255,11 @@ namespace ARMeilleure.Instructions
}
}
}
public static void Adr(ArmEmitterContext context)
{
IOpCode32Adr op = (IOpCode32Adr)context.CurrOp;
SetIntA32(context, op.Rd, Const(op.Immediate));
}
}
}

View File

@ -130,11 +130,6 @@ namespace ARMeilleure.Instructions
bool ordered = (accType & AccessType.Ordered) != 0;
bool exclusive = (accType & AccessType.Exclusive) != 0;
if (ordered)
{
EmitBarrier(context);
}
Operand address = context.Copy(GetIntOrSP(context, op.Rn));
Operand t = GetIntOrZR(context, op.Rt);
@ -163,6 +158,11 @@ namespace ARMeilleure.Instructions
{
EmitStoreExclusive(context, address, t, exclusive, op.Size, op.Rs, a32: false);
}
if (ordered)
{
EmitBarrier(context);
}
}
private static void EmitBarrier(ArmEmitterContext context)

View File

@ -146,13 +146,13 @@ namespace ARMeilleure.Instructions
var exclusive = (accType & AccessType.Exclusive) != 0;
var ordered = (accType & AccessType.Ordered) != 0;
if (ordered)
{
EmitBarrier(context);
}
if ((accType & AccessType.Load) != 0)
{
if (ordered)
{
EmitBarrier(context);
}
if (size == DWordSizeLog2)
{
// Keep loads atomic - make the call to get the whole region and then decompose it into parts
@ -219,6 +219,11 @@ namespace ARMeilleure.Instructions
Operand value = context.ZeroExtend32(OperandType.I64, GetIntA32(context, op.Rt));
EmitStoreExclusive(context, address, value, exclusive, size, op.Rd, a32: true);
}
if (ordered)
{
EmitBarrier(context);
}
}
}

View File

@ -547,11 +547,11 @@ namespace ARMeilleure.Instructions
{
switch (context.CurrOp)
{
case OpCode32MemRsImm op: return GetMShiftedByImmediate(context, op, setCarry);
case IOpCode32MemRsImm op: return GetMShiftedByImmediate(context, op, setCarry);
case OpCode32MemReg op: return GetIntA32(context, op.Rm);
case IOpCode32MemReg op: return GetIntA32(context, op.Rm);
case OpCode32Mem op: return Const(op.Immediate);
case IOpCode32Mem op: return Const(op.Immediate);
case OpCode32SimdMemImm op: return Const(op.Immediate);
@ -564,7 +564,7 @@ namespace ARMeilleure.Instructions
return new InvalidOperationException($"Invalid OpCode type \"{opCode?.GetType().Name ?? "null"}\".");
}
public static Operand GetMShiftedByImmediate(ArmEmitterContext context, OpCode32MemRsImm op, bool setCarry)
public static Operand GetMShiftedByImmediate(ArmEmitterContext context, IOpCode32MemRsImm op, bool setCarry)
{
Operand m = GetIntA32(context, op.Rm);

View File

@ -33,7 +33,7 @@ namespace ARMeilleure.Instructions
Operand res = context.Add(a, context.Multiply(n, m));
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
@ -250,13 +250,13 @@ namespace ARMeilleure.Instructions
Operand hi = context.ConvertI64ToI32(context.ShiftRightUI(res, Const(32)));
Operand lo = context.ConvertI64ToI32(res);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
EmitGenericAluStoreA32(context, op.RdHi, op.SetFlags, hi);
EmitGenericAluStoreA32(context, op.RdLo, op.SetFlags, lo);
EmitGenericAluStoreA32(context, op.RdHi, ShouldSetFlags(context), hi);
EmitGenericAluStoreA32(context, op.RdLo, ShouldSetFlags(context), lo);
}
public static void Smulw_(ArmEmitterContext context)
@ -320,13 +320,13 @@ namespace ARMeilleure.Instructions
Operand hi = context.ConvertI64ToI32(context.ShiftRightUI(res, Const(32)));
Operand lo = context.ConvertI64ToI32(res);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
EmitGenericAluStoreA32(context, op.RdHi, op.SetFlags, hi);
EmitGenericAluStoreA32(context, op.RdLo, op.SetFlags, lo);
EmitGenericAluStoreA32(context, op.RdHi, ShouldSetFlags(context), hi);
EmitGenericAluStoreA32(context, op.RdLo, ShouldSetFlags(context), lo);
}
private static void EmitMlal(ArmEmitterContext context, bool signed)
@ -356,13 +356,13 @@ namespace ARMeilleure.Instructions
Operand hi = context.ConvertI64ToI32(context.ShiftRightUI(res, Const(32)));
Operand lo = context.ConvertI64ToI32(res);
if (op.SetFlags)
if (ShouldSetFlags(context))
{
EmitNZFlagsCheck(context, res);
}
EmitGenericAluStoreA32(context, op.RdHi, op.SetFlags, hi);
EmitGenericAluStoreA32(context, op.RdLo, op.SetFlags, lo);
EmitGenericAluStoreA32(context, op.RdHi, ShouldSetFlags(context), hi);
EmitGenericAluStoreA32(context, op.RdLo, ShouldSetFlags(context), lo);
}
private static void UpdateQFlag(ArmEmitterContext context, Operand q)

View File

@ -3613,7 +3613,7 @@ namespace ARMeilleure.Instructions
Operand masked = context.AddIntrinsic(Intrinsic.X86Pand, value, expMask);
Operand isNaNInf = context.AddIntrinsic(Intrinsic.X86Pcmpeqd, masked, expMask);
value = context.AddIntrinsic(Intrinsic.X86Paddw, value, roundMask);
value = context.AddIntrinsic(Intrinsic.X86Paddd, value, roundMask);
value = context.AddIntrinsic(Intrinsic.X86Pand, value, truncMask);
return context.AddIntrinsic(Intrinsic.X86Blendvps, value, oValue, isNaNInf);

View File

@ -2,8 +2,6 @@
using ARMeilleure.IntermediateRepresentation;
using ARMeilleure.Translation;
using System;
using System.Diagnostics;
using static ARMeilleure.Instructions.InstEmitFlowHelper;
using static ARMeilleure.Instructions.InstEmitHelper;
using static ARMeilleure.Instructions.InstEmitSimdHelper;

View File

@ -105,11 +105,48 @@ namespace ARMeilleure.Instructions
}
else if (op.Size == 1 && op.Opc == 3) // Double -> Half.
{
throw new NotImplementedException("Double-precision to half-precision.");
if (Optimizations.UseF16c)
{
Debug.Assert(!Optimizations.ForceLegacySse);
Operand n = GetVec(op.Rn);
Operand res = context.AddIntrinsic(Intrinsic.X86Cvtsd2ss, context.VectorZero(), n);
res = context.AddIntrinsic(Intrinsic.X86Vcvtps2ph, res, Const(X86GetRoundControl(FPRoundingMode.ToNearest)));
context.Copy(GetVec(op.Rd), res);
}
else
{
Operand ne = context.VectorExtract(OperandType.FP64, GetVec(op.Rn), 0);
Operand res = context.Call(typeof(SoftFloat64_16).GetMethod(nameof(SoftFloat64_16.FPConvert)), ne);
res = context.ZeroExtend16(OperandType.I64, res);
context.Copy(GetVec(op.Rd), EmitVectorInsert(context, context.VectorZero(), res, 0, 1));
}
}
else if (op.Size == 3 && op.Opc == 1) // Double -> Half.
else if (op.Size == 3 && op.Opc == 1) // Half -> Double.
{
throw new NotImplementedException("Half-precision to double-precision.");
if (Optimizations.UseF16c)
{
Operand n = GetVec(op.Rn);
Operand res = context.AddIntrinsic(Intrinsic.X86Vcvtph2ps, GetVec(op.Rn));
res = context.AddIntrinsic(Intrinsic.X86Cvtss2sd, context.VectorZero(), res);
res = context.VectorZeroUpper64(res);
context.Copy(GetVec(op.Rd), res);
}
else
{
Operand ne = EmitVectorExtractZx(context, op.Rn, 0, 1);
Operand res = context.Call(typeof(SoftFloat16_64).GetMethod(nameof(SoftFloat16_64.FPConvert)), ne);
context.Copy(GetVec(op.Rd), context.VectorInsert(context.VectorZero(), res, 0));
}
}
else // Invalid encoding.
{

View File

@ -100,7 +100,7 @@ namespace ARMeilleure.Instructions
Operand n = GetVec(op.Rn);
Operand m = GetVec(op.Rm);
Operand res = context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.HashLower)), d, n, m);
Operand res = InstEmitSimdHashHelper.EmitSha256h(context, d, n, m, part2: false);
context.Copy(GetVec(op.Rd), res);
}
@ -113,7 +113,7 @@ namespace ARMeilleure.Instructions
Operand n = GetVec(op.Rn);
Operand m = GetVec(op.Rm);
Operand res = context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.HashUpper)), d, n, m);
Operand res = InstEmitSimdHashHelper.EmitSha256h(context, n, d, m, part2: true);
context.Copy(GetVec(op.Rd), res);
}
@ -125,7 +125,7 @@ namespace ARMeilleure.Instructions
Operand d = GetVec(op.Rd);
Operand n = GetVec(op.Rn);
Operand res = context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Sha256SchedulePart1)), d, n);
Operand res = InstEmitSimdHashHelper.EmitSha256su0(context, d, n);
context.Copy(GetVec(op.Rd), res);
}
@ -138,7 +138,7 @@ namespace ARMeilleure.Instructions
Operand n = GetVec(op.Rn);
Operand m = GetVec(op.Rm);
Operand res = context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Sha256SchedulePart2)), d, n, m);
Operand res = InstEmitSimdHashHelper.EmitSha256su1(context, d, n, m);
context.Copy(GetVec(op.Rd), res);
}

View File

@ -0,0 +1,64 @@
using ARMeilleure.Decoders;
using ARMeilleure.IntermediateRepresentation;
using ARMeilleure.Translation;
using static ARMeilleure.Instructions.InstEmitHelper;
namespace ARMeilleure.Instructions
{
static partial class InstEmit32
{
#region "Sha256"
public static void Sha256h_V(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
Operand d = GetVecA32(op.Qd);
Operand n = GetVecA32(op.Qn);
Operand m = GetVecA32(op.Qm);
Operand res = InstEmitSimdHashHelper.EmitSha256h(context, d, n, m, part2: false);
context.Copy(GetVecA32(op.Qd), res);
}
public static void Sha256h2_V(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
Operand d = GetVecA32(op.Qd);
Operand n = GetVecA32(op.Qn);
Operand m = GetVecA32(op.Qm);
Operand res = InstEmitSimdHashHelper.EmitSha256h(context, n, d, m, part2: true);
context.Copy(GetVecA32(op.Qd), res);
}
public static void Sha256su0_V(ArmEmitterContext context)
{
OpCode32Simd op = (OpCode32Simd)context.CurrOp;
Operand d = GetVecA32(op.Qd);
Operand m = GetVecA32(op.Qm);
Operand res = InstEmitSimdHashHelper.EmitSha256su0(context, d, m);
context.Copy(GetVecA32(op.Qd), res);
}
public static void Sha256su1_V(ArmEmitterContext context)
{
OpCode32SimdReg op = (OpCode32SimdReg)context.CurrOp;
Operand d = GetVecA32(op.Qd);
Operand n = GetVecA32(op.Qn);
Operand m = GetVecA32(op.Qm);
Operand res = InstEmitSimdHashHelper.EmitSha256su1(context, d, n, m);
context.Copy(GetVecA32(op.Qd), res);
}
#endregion
}
}

View File

@ -0,0 +1,56 @@
using ARMeilleure.IntermediateRepresentation;
using ARMeilleure.Translation;
using System;
using static ARMeilleure.IntermediateRepresentation.Operand.Factory;
namespace ARMeilleure.Instructions
{
static class InstEmitSimdHashHelper
{
public static Operand EmitSha256h(ArmEmitterContext context, Operand x, Operand y, Operand w, bool part2)
{
if (Optimizations.UseSha)
{
Operand src1 = context.AddIntrinsic(Intrinsic.X86Shufps, y, x, Const(0xbb));
Operand src2 = context.AddIntrinsic(Intrinsic.X86Shufps, y, x, Const(0x11));
Operand w2 = context.AddIntrinsic(Intrinsic.X86Punpckhqdq, w, w);
Operand round2 = context.AddIntrinsic(Intrinsic.X86Sha256Rnds2, src1, src2, w);
Operand round4 = context.AddIntrinsic(Intrinsic.X86Sha256Rnds2, src2, round2, w2);
Operand res = context.AddIntrinsic(Intrinsic.X86Shufps, round4, round2, Const(part2 ? 0x11 : 0xbb));
return res;
}
String method = part2 ? nameof(SoftFallback.HashUpper) : nameof(SoftFallback.HashLower);
return context.Call(typeof(SoftFallback).GetMethod(method), x, y, w);
}
public static Operand EmitSha256su0(ArmEmitterContext context, Operand x, Operand y)
{
if (Optimizations.UseSha)
{
return context.AddIntrinsic(Intrinsic.X86Sha256Msg1, x, y);
}
return context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Sha256SchedulePart1)), x, y);
}
public static Operand EmitSha256su1(ArmEmitterContext context, Operand x, Operand y, Operand z)
{
if (Optimizations.UseSha && Optimizations.UseSsse3)
{
Operand extr = context.AddIntrinsic(Intrinsic.X86Palignr, z, y, Const(4));
Operand tmp = context.AddIntrinsic(Intrinsic.X86Paddd, extr, x);
Operand res = context.AddIntrinsic(Intrinsic.X86Sha256Msg2, tmp, z);
return res;
}
return context.Call(typeof(SoftFallback).GetMethod(nameof(SoftFallback.Sha256SchedulePart2)), x, y, z);
}
}
}

View File

@ -12,7 +12,8 @@ namespace ARMeilleure.Instructions
{
static partial class InstEmit
{
private const int DczSizeLog2 = 4;
private const int DczSizeLog2 = 4; // Log2 size in words
public const int DczSizeInBytes = 4 << DczSizeLog2;
public static void Hint(ArmEmitterContext context)
{
@ -32,13 +33,13 @@ namespace ARMeilleure.Instructions
switch (GetPackedId(op))
{
case 0b11_011_0000_0000_001: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetCtrEl0)); break;
case 0b11_011_0000_0000_111: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetDczidEl0)); break;
case 0b11_011_0100_0010_000: EmitGetNzcv(context); return;
case 0b11_011_0100_0100_000: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetFpcr)); break;
case 0b11_011_0100_0100_001: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetFpsr)); break;
case 0b11_011_1101_0000_010: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetTpidrEl0)); break;
case 0b11_011_1101_0000_011: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetTpidr)); break;
case 0b11_011_0000_0000_001: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetCtrEl0)); break;
case 0b11_011_0000_0000_111: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetDczidEl0)); break;
case 0b11_011_0100_0010_000: EmitGetNzcv(context); return;
case 0b11_011_0100_0100_000: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetFpcr)); break;
case 0b11_011_0100_0100_001: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetFpsr)); break;
case 0b11_011_1101_0000_010: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetTpidrEl0)); break;
case 0b11_011_1101_0000_011: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetTpidrroEl0)); break;
case 0b11_011_1110_0000_000: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetCntfrqEl0)); break;
case 0b11_011_1110_0000_001: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetCntpctEl0)); break;
case 0b11_011_1110_0000_010: info = typeof(NativeInterface).GetMethod(nameof(NativeInterface.GetCntvctEl0)); break;
@ -87,7 +88,7 @@ namespace ARMeilleure.Instructions
// DC ZVA
Operand t = GetIntOrZR(context, op.Rt);
for (long offset = 0; offset < (4 << DczSizeLog2); offset += 8)
for (long offset = 0; offset < DczSizeInBytes; offset += 8)
{
Operand address = context.Add(t, Const(offset));
@ -98,7 +99,12 @@ namespace ARMeilleure.Instructions
}
// No-op
case 0b11_011_0111_1110_001: //DC CIVAC
case 0b11_011_0111_1110_001: // DC CIVAC
break;
case 0b11_011_0111_0101_001: // IC IVAU
Operand target = Register(op.Rt, RegisterType.Integer, OperandType.I64);
context.Call(typeof(NativeInterface).GetMethod(nameof(NativeInterface.InvalidateCacheLine)), target);
break;
}
}

Some files were not shown because too many files have changed in this diff Show More