Compare commits

..

3 Commits

Author SHA1 Message Date
5af8ce7c38 A64: Add fast path for Fcvtas_Gp/S/V, Fcvtau_Gp/S/V and Frinta_S/V in… (#3712)
* A64: Add fast path for Fcvtas_Gp/S/V, Fcvtau_Gp/S/V and Frinta_S/V instructions;

they use "Round to Nearest with Ties to Away" rounding mode not supported in x86.

All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.

The titles Mario Strikers and Super Smash Bros. U. use these instructions intensively.

* Update Ptc.cs

* A32: Add fast path for Vcvta_RM, Vrinta_RM and Vrinta_V instructions aswell.
2022-10-19 00:21:33 +00:00
77c4291c34 Avalonia: Update Polish Translation (#3722)
* Add new string

You need the period there otherwise it could be read as "Głoś" -> "Preach"

* Update MainWindow.axaml

Updating to bring it in line with the other languages naming themselves in their respective languages

* Update pl_PL.json

realizing that period isn't necessary considering the string's usage (which to be fair, I should have checked when I added it)

* Update pl_PL.json

* Add Updater Message
2022-10-19 00:10:28 +00:00
6e92b7a378 Dispose Vulkan TextureStorage when views hit 0 instead of immediately (#3738)
Due to the `using` statement being scoped to the `CreateTextureView` method, `TextureStorage` would be disposed as soon as the view was returned.

This was largely fine as the TextureStorage resources were being kept alive by the views holding their own references to them, but it also meant that dispose is only called as soon as the texture is created.

Aliased Storages are TextureStorages created with the same allocation as another TextureStorage, if they have to be aliased as another format. We keep track of a TextureStorage's `_aliasedStorages` as they are created, and dispose them when the TextureStorage is disposed...

...except it is disposed immediately, before any aliased storages are even created. The aliased storages added after this will never be disposed.

This PR attempts to fix this by disposing TextureStorage when its view count reaches 0. The other use of texture storage - the D32S8 blit - still manually disposes the storage, but regular uses created via the GAL are now disposed by the view count.

I think this makes the most sense, as otherwise in the future this behaviour might be forgotton and more things could be added to the Dispose() method that don't work due to it not actually calling at the right time.

This should improve memory leaks in Super Mario Odyssey, most noticeable when resolution scaling. The memory usage of the game is still wildly unpredictable due to how it interacts with the texture cache, but now it shouldn't get considerably longer as you play... I hope. I've seen it typically recover back to the same level occasionally, though it can spike significantly.

Please test a bunch of games on multiple GPUs to make sure this doesn't break anything.
2022-10-18 23:52:08 +00:00
10 changed files with 252 additions and 49 deletions

View File

@ -1616,20 +1616,34 @@ namespace ARMeilleure.Instructions
}
public static void Frinta_S(ArmEmitterContext context)
{
if (Optimizations.UseSse41)
{
EmitSse41ScalarRoundOpF(context, FPRoundingMode.ToNearestAway);
}
else
{
EmitScalarUnaryOpF(context, (op1) =>
{
return EmitRoundMathCall(context, MidpointRounding.AwayFromZero, op1);
});
}
}
public static void Frinta_V(ArmEmitterContext context)
{
if (Optimizations.UseSse41)
{
EmitSse41VectorRoundOpF(context, FPRoundingMode.ToNearestAway);
}
else
{
EmitVectorUnaryOpF(context, (op1) =>
{
return EmitRoundMathCall(context, MidpointRounding.AwayFromZero, op1);
});
}
}
public static void Frinti_S(ArmEmitterContext context)
{
@ -3516,9 +3530,18 @@ namespace ARMeilleure.Instructions
Operand n = GetVec(op.Rn);
Operand res;
if (roundMode != FPRoundingMode.ToNearestAway)
{
Intrinsic inst = (op.Size & 1) != 0 ? Intrinsic.X86Roundsd : Intrinsic.X86Roundss;
Operand res = context.AddIntrinsic(inst, n, Const(X86GetRoundControl(roundMode)));
res = context.AddIntrinsic(inst, n, Const(X86GetRoundControl(roundMode)));
}
else
{
res = EmitSse41RoundToNearestWithTiesToAwayOpF(context, n, scalar: true);
}
if ((op.Size & 1) != 0)
{
@ -3538,9 +3561,18 @@ namespace ARMeilleure.Instructions
Operand n = GetVec(op.Rn);
Operand res;
if (roundMode != FPRoundingMode.ToNearestAway)
{
Intrinsic inst = (op.Size & 1) != 0 ? Intrinsic.X86Roundpd : Intrinsic.X86Roundps;
Operand res = context.AddIntrinsic(inst, n, Const(X86GetRoundControl(roundMode)));
res = context.AddIntrinsic(inst, n, Const(X86GetRoundControl(roundMode)));
}
else
{
res = EmitSse41RoundToNearestWithTiesToAwayOpF(context, n, scalar: false);
}
if (op.RegisterSize == RegisterSize.Simd64)
{

View File

@ -163,34 +163,76 @@ namespace ARMeilleure.Instructions
}
public static void Fcvtas_Gp(ArmEmitterContext context)
{
if (Optimizations.UseSse41)
{
EmitSse41Fcvts_Gp(context, FPRoundingMode.ToNearestAway, isFixed: false);
}
else
{
EmitFcvt_s_Gp(context, (op1) => EmitRoundMathCall(context, MidpointRounding.AwayFromZero, op1));
}
}
public static void Fcvtas_S(ArmEmitterContext context)
{
if (Optimizations.UseSse41)
{
EmitSse41FcvtsOpF(context, FPRoundingMode.ToNearestAway, scalar: true);
}
else
{
EmitFcvt(context, (op1) => EmitRoundMathCall(context, MidpointRounding.AwayFromZero, op1), signed: true, scalar: true);
}
}
public static void Fcvtas_V(ArmEmitterContext context)
{
if (Optimizations.UseSse41)
{
EmitSse41FcvtsOpF(context, FPRoundingMode.ToNearestAway, scalar: false);
}
else
{
EmitFcvt(context, (op1) => EmitRoundMathCall(context, MidpointRounding.AwayFromZero, op1), signed: true, scalar: false);
}
}
public static void Fcvtau_Gp(ArmEmitterContext context)
{
if (Optimizations.UseSse41)
{
EmitSse41Fcvtu_Gp(context, FPRoundingMode.ToNearestAway, isFixed: false);
}
else
{
EmitFcvt_u_Gp(context, (op1) => EmitRoundMathCall(context, MidpointRounding.AwayFromZero, op1));
}
}
public static void Fcvtau_S(ArmEmitterContext context)
{
if (Optimizations.UseSse41)
{
EmitSse41FcvtuOpF(context, FPRoundingMode.ToNearestAway, scalar: true);
}
else
{
EmitFcvt(context, (op1) => EmitRoundMathCall(context, MidpointRounding.AwayFromZero, op1), signed: false, scalar: true);
}
}
public static void Fcvtau_V(ArmEmitterContext context)
{
if (Optimizations.UseSse41)
{
EmitSse41FcvtuOpF(context, FPRoundingMode.ToNearestAway, scalar: false);
}
else
{
EmitFcvt(context, (op1) => EmitRoundMathCall(context, MidpointRounding.AwayFromZero, op1), signed: false, scalar: false);
}
}
public static void Fcvtl_V(ArmEmitterContext context)
{
@ -1223,7 +1265,14 @@ namespace ARMeilleure.Instructions
nRes = context.AddIntrinsic(Intrinsic.X86Mulps, nRes, fpScaledMask);
}
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundps, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar);
}
Operand nInt = context.AddIntrinsic(Intrinsic.X86Cvtps2dq, nRes);
@ -1265,7 +1314,14 @@ namespace ARMeilleure.Instructions
nRes = context.AddIntrinsic(Intrinsic.X86Mulpd, nRes, fpScaledMask);
}
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundpd, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar);
}
Operand nLong = EmitSse2CvtDoubleToInt64OpF(context, nRes, scalar);
@ -1314,7 +1370,14 @@ namespace ARMeilleure.Instructions
nRes = context.AddIntrinsic(Intrinsic.X86Mulps, nRes, fpScaledMask);
}
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundps, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar);
}
Operand zero = context.VectorZero();
@ -1369,7 +1432,14 @@ namespace ARMeilleure.Instructions
nRes = context.AddIntrinsic(Intrinsic.X86Mulpd, nRes, fpScaledMask);
}
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundpd, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar);
}
Operand zero = context.VectorZero();
@ -1424,7 +1494,14 @@ namespace ARMeilleure.Instructions
nRes = context.AddIntrinsic(Intrinsic.X86Mulss, nRes, fpScaledMask);
}
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundss, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar: true);
}
Operand nIntOrLong = op.RegisterSize == RegisterSize.Int32
? context.AddIntrinsicInt (Intrinsic.X86Cvtss2si, nRes)
@ -1464,7 +1541,14 @@ namespace ARMeilleure.Instructions
nRes = context.AddIntrinsic(Intrinsic.X86Mulsd, nRes, fpScaledMask);
}
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundsd, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar: true);
}
Operand nIntOrLong = op.RegisterSize == RegisterSize.Int32
? context.AddIntrinsicInt (Intrinsic.X86Cvtsd2si, nRes)
@ -1512,7 +1596,14 @@ namespace ARMeilleure.Instructions
nRes = context.AddIntrinsic(Intrinsic.X86Mulss, nRes, fpScaledMask);
}
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundss, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar: true);
}
Operand zero = context.VectorZero();
@ -1567,7 +1658,14 @@ namespace ARMeilleure.Instructions
nRes = context.AddIntrinsic(Intrinsic.X86Mulsd, nRes, fpScaledMask);
}
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundsd, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar: true);
}
Operand zero = context.VectorZero();

View File

@ -203,6 +203,9 @@ namespace ARMeilleure.Instructions
FPRoundingMode roundMode;
switch (rm)
{
case 0b00:
roundMode = FPRoundingMode.ToNearestAway;
break;
case 0b01:
roundMode = FPRoundingMode.ToNearest;
break;
@ -228,7 +231,7 @@ namespace ARMeilleure.Instructions
bool unsigned = op.Opc == 0;
int rm = op.Opc2 & 3;
if (Optimizations.UseSse41 && rm != 0b00)
if (Optimizations.UseSse41)
{
EmitSse41ConvertInt32(context, RMToRoundMode(rm), !unsigned);
}
@ -267,15 +270,21 @@ namespace ARMeilleure.Instructions
int rm = op.Opc2 & 3;
if (Optimizations.UseSse2 && rm != 0b00)
if (Optimizations.UseSse41)
{
EmitScalarUnaryOpSimd32(context, (m) =>
{
Intrinsic inst = (op.Size & 1) == 0 ? Intrinsic.X86Roundss : Intrinsic.X86Roundsd;
FPRoundingMode roundMode = RMToRoundMode(rm);
if (roundMode != FPRoundingMode.ToNearestAway)
{
Intrinsic inst = (op.Size & 1) == 0 ? Intrinsic.X86Roundss : Intrinsic.X86Roundsd;
return context.AddIntrinsic(inst, m, Const(X86GetRoundControl(roundMode)));
}
else
{
return EmitSse41RoundToNearestWithTiesToAwayOpF(context, m, scalar: true);
}
});
}
else
@ -304,9 +313,19 @@ namespace ARMeilleure.Instructions
// VRINTA (vector).
public static void Vrinta_V(ArmEmitterContext context)
{
if (Optimizations.UseSse41)
{
EmitVectorUnaryOpSimd32(context, (m) =>
{
return EmitSse41RoundToNearestWithTiesToAwayOpF(context, m, scalar: false);
});
}
else
{
EmitVectorUnaryOpF32(context, (m) => EmitRoundMathCall(context, MidpointRounding.AwayFromZero, m));
}
}
// VRINTM (vector).
public static void Vrintm_V(ArmEmitterContext context)
@ -413,7 +432,14 @@ namespace ARMeilleure.Instructions
Operand nRes = context.AddIntrinsic(Intrinsic.X86Cmpss, n, n, Const((int)CmpCondition.OrderedQ));
nRes = context.AddIntrinsic(Intrinsic.X86Pand, nRes, n);
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundss, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar: true);
}
Operand zero = context.VectorZero();
@ -464,7 +490,14 @@ namespace ARMeilleure.Instructions
Operand nRes = context.AddIntrinsic(Intrinsic.X86Cmpsd, n, n, Const((int)CmpCondition.OrderedQ));
nRes = context.AddIntrinsic(Intrinsic.X86Pand, nRes, n);
if (roundMode != FPRoundingMode.ToNearestAway)
{
nRes = context.AddIntrinsic(Intrinsic.X86Roundsd, nRes, Const(X86GetRoundControl(roundMode)));
}
else
{
nRes = EmitSse41RoundToNearestWithTiesToAwayOpF(context, nRes, scalar: true);
}
Operand zero = context.VectorZero();

View File

@ -33,6 +33,14 @@ namespace ARMeilleure.Instructions
};
public static readonly long ZeroMask = 128L << 56 | 128L << 48 | 128L << 40 | 128L << 32 | 128L << 24 | 128L << 16 | 128L << 8 | 128L << 0;
public static ulong X86GetGf2p8LogicalShiftLeft(int shift)
{
ulong identity = (0b00000001UL << 56) | (0b00000010UL << 48) | (0b00000100UL << 40) | (0b00001000UL << 32) |
(0b00010000UL << 24) | (0b00100000UL << 16) | (0b01000000UL << 8) | (0b10000000UL << 0);
return shift >= 0 ? identity >> (shift * 8) : identity << (-shift * 8);
}
#endregion
#region "X86 SSE Intrinsics"
@ -243,19 +251,44 @@ namespace ARMeilleure.Instructions
throw new ArgumentException($"Invalid rounding mode \"{roundMode}\".");
}
public static ulong X86GetGf2p8LogicalShiftLeft(int shift)
public static Operand EmitSse41RoundToNearestWithTiesToAwayOpF(ArmEmitterContext context, Operand n, bool scalar)
{
ulong identity =
(0b00000001UL << 56) |
(0b00000010UL << 48) |
(0b00000100UL << 40) |
(0b00001000UL << 32) |
(0b00010000UL << 24) |
(0b00100000UL << 16) |
(0b01000000UL << 8) |
(0b10000000UL << 0);
Debug.Assert(n.Type == OperandType.V128);
return shift >= 0 ? identity >> (shift * 8) : identity << (-shift * 8);
Operand nCopy = context.Copy(n);
Operand rC = Const(X86GetRoundControl(FPRoundingMode.TowardsZero));
IOpCodeSimd op = (IOpCodeSimd)context.CurrOp;
if ((op.Size & 1) == 0)
{
Operand signMask = scalar ? X86GetScalar(context, int.MinValue) : X86GetAllElements(context, int.MinValue);
signMask = context.AddIntrinsic(Intrinsic.X86Pand, signMask, nCopy);
// 0x3EFFFFFF == BitConverter.SingleToInt32Bits(0.5f) - 1
Operand valueMask = scalar ? X86GetScalar(context, 0x3EFFFFFF) : X86GetAllElements(context, 0x3EFFFFFF);
valueMask = context.AddIntrinsic(Intrinsic.X86Por, valueMask, signMask);
nCopy = context.AddIntrinsic(scalar ? Intrinsic.X86Addss : Intrinsic.X86Addps, nCopy, valueMask);
nCopy = context.AddIntrinsic(scalar ? Intrinsic.X86Roundss : Intrinsic.X86Roundps, nCopy, rC);
}
else
{
Operand signMask = scalar ? X86GetScalar(context, long.MinValue) : X86GetAllElements(context, long.MinValue);
signMask = context.AddIntrinsic(Intrinsic.X86Pand, signMask, nCopy);
// 0x3FDFFFFFFFFFFFFFL == BitConverter.DoubleToInt64Bits(0.5d) - 1L
Operand valueMask = scalar ? X86GetScalar(context, 0x3FDFFFFFFFFFFFFFL) : X86GetAllElements(context, 0x3FDFFFFFFFFFFFFFL);
valueMask = context.AddIntrinsic(Intrinsic.X86Por, valueMask, signMask);
nCopy = context.AddIntrinsic(scalar ? Intrinsic.X86Addsd : Intrinsic.X86Addpd, nCopy, valueMask);
nCopy = context.AddIntrinsic(scalar ? Intrinsic.X86Roundsd : Intrinsic.X86Roundpd, nCopy, rC);
}
return nCopy;
}
public static Operand EmitCountSetBits8(ArmEmitterContext context, Operand op) // "size" is 8 (SIMD&FP Inst.).

View File

@ -2,9 +2,10 @@ namespace ARMeilleure.State
{
public enum FPRoundingMode
{
ToNearest = 0,
ToNearest = 0, // With ties to even.
TowardsPlusInfinity = 1,
TowardsMinusInfinity = 2,
TowardsZero = 3
TowardsZero = 3,
ToNearestAway = 4 // With ties to away.
}
}

View File

@ -27,7 +27,7 @@ namespace ARMeilleure.Translation.PTC
private const string OuterHeaderMagicString = "PTCohd\0\0";
private const string InnerHeaderMagicString = "PTCihd\0\0";
private const uint InternalVersion = 3710; //! To be incremented manually for each change to the ARMeilleure project.
private const uint InternalVersion = 3713; //! To be incremented manually for each change to the ARMeilleure project.
private const string ActualDir = "0";
private const string BackupDir = "1";

View File

@ -588,5 +588,9 @@
"SettingsTabGraphicsPreferredGpuTooltip": "Wybierz kartę graficzną, która będzie używana z backendem graficznym Vulkan.\n\nNie wpływa na GPU używane przez OpenGL.\n\nW razie wątpliwości ustaw flagę GPU jako \"dGPU\". Jeśli żadnej nie ma, pozostaw nietknięte.",
"SettingsAppRequiredRestartMessage": "Wymagane Zrestartowanie Ryujinx",
"SettingsGpuBackendRestartMessage": "Zmieniono ustawienia Backendu Graficznego lub GPU. Będzie to wymagało ponownego uruchomienia",
"SettingsGpuBackendRestartSubMessage": "Czy chcesz zrestartować teraz?"
"SettingsGpuBackendRestartSubMessage": "Czy chcesz zrestartować teraz?",
"RyujinxUpdaterMessage": "Czy chcesz zaktualizować Ryujinx do najnowszej wersji?",
"SettingsTabHotkeysVolumeUpHotkey": "Zwiększ Głośność:",
"SettingsTabHotkeysVolumeDownHotkey": "Zmniejsz Głośność:",
"VolumeShort": "Głoś"
}

View File

@ -164,7 +164,7 @@
<MenuItem
Command="{ReflectionBinding ChangeLanguage}"
CommandParameter="pl_PL"
Header="Polish" />
Header="Polski" />
<MenuItem
Command="{ReflectionBinding ChangeLanguage}"
CommandParameter="ru_RU"

View File

@ -480,6 +480,8 @@ namespace Ryujinx.Graphics.Vulkan
if (--_viewsCount == 0)
{
_gd.PipelineInternal?.FlushCommandsIfWeightExceeding(_imageAuto, _size);
Dispose();
}
}

View File

@ -310,7 +310,7 @@ namespace Ryujinx.Graphics.Vulkan
internal TextureView CreateTextureView(TextureCreateInfo info, float scale)
{
// This should be disposed when all views are destroyed.
using var storage = CreateTextureStorage(info, scale);
var storage = CreateTextureStorage(info, scale);
return storage.CreateView(info, 0, 0);
}