* [PATCH v2 0/5] tests/tcg/ppc64le: fix the build of TCG tests with Clang
@ 2022-03-03 17:20 matheus.ferst
From: matheus.ferst @ 2022-03-03 17:20 UTC
  To: qemu-devel, qemu-ppc
  Cc: thuth, Matheus Ferst, danielhb413, richard.henderson, groug,
      philippe.mathieu.daude, clg, mrezanin, alex.bennee, david

From: Matheus Ferst <matheus.ferst@eldorado.org.br>

Because the configuration scripts used -mbig and -mlittle, building the
PPC TCG tests with Clang was silently skipped. With the patch that fixes
these options[1], "make check-tcg" fails with build and runtime errors.
This patch series tries to fix some of these problems.

The first patch fixes tests/tcg/ppc64le/mtfsf.c by removing the GCC-only
builtins used to emit mtfsf and mffs; these insns can be emitted with
inline asm instead.

The second patch addresses differences in the output of float_madds.c.
The __builtin_fmaf used in this test emits fmadds with GCC and xsmaddasp
with LLVM. The former had its rounding errors fixed in d04ca895dc7f
("target/ppc: Add helpers for fmadds et al"); we apply a similar fix to
xsmaddasp.

Then there are build and runtime errors in the bcdsub test. According to
the GCC docs[2], the '-mpower8-vector' flag provides some bcdsub
builtins, so it would be reasonable to assume that the rest of the
toolchain knows about the insn if the compiler accepts this flag. Clang
has supported the flag since version 3.6[3], but the insn and builtins
were only added in LLVM 14[4]. I couldn't find a good config-time
solution, so we use __has_builtin to check for __builtin_bcdsub at
compile time and, if it is not available, emit the opcode with a
".long". If __has_builtin itself is not available, we assume that
-mpower8-vector is enough and that the insn is available.

Even when building with a newer Clang that accepts the bcdsub insn, the
test fails at runtime because LLVM mishandles "__int128" arguments in
inline asm: no error or warning is emitted, but the generated code only
loads one doubleword of the VSR. The third patch avoids this issue by
passing the VSR values in GPR pairs, as we did in 84ade98e87ea
("target/ppc: do not silence snan in xscvspdpn").

The last patch fixes the build of tests/tcg/ppc64le/non_signalling_xscv.c
with -mabi=elfv1. Clang only recognizes VSX registers in the clobber
list of inline asm when using ELFv2, so we use VSRs >= 32 and list them
by their Altivec names.

Finally, the insns tested by tests/tcg/ppc64le/byte_reverse.c are not
yet supported by LLVM. Since the configuration script uses '-mpower10'
to check for POWER10 support and Clang doesn't support this flag, "make
check-tcg" doesn't fail. We should probably change this check in the
future, but since LLVM support for POWER10 seems incomplete, I guess we
can leave it for now.

v2:
- New patch to address non_signalling_xscv.c build problems with Clang
  and ELFv1;
- Rework of the bcdsub patch to work with LLVM < 14 and avoid vector
  types.

[1] https://lists.gnu.org/archive/html/qemu-devel/2022-01/msg06506.html
[2] https://gcc.gnu.org/onlinedocs/gcc-8.3.0/gcc/PowerPC-AltiVec_002fVSX-Built-in-Functions.html
[3] https://github.com/llvm/llvm-project/commit/59eb767e11d4ffefb5f55409524e5c8416b2b0db
[4] https://github.com/llvm/llvm-project/commit/c933c2eb334660c131f4afc9d194fafb0cec0423

Matheus Ferst (5):
  tests/tcg/ppc64le: use inline asm instead of __builtin_mtfsf
  target/ppc: change xs[n]madd[am]sp to use float64r32_muladd
  tests/tcg/ppc64le: drop __int128 usage in bcdsub
  tests/tcg/ppc64le: emit bcdsub with .long when needed
  tests/tcg/ppc64le: Use Altivec register names in clobbler list

 target/ppc/fpu_helper.c                 |  93 +++++++----
 tests/tcg/ppc64le/bcdsub.c              | 126 +++++++++++-------
 tests/tcg/ppc64le/mtfsf.c               |  19 ++--
 tests/tcg/ppc64le/non_signalling_xscv.c |  16 +--
 4 files changed, 112 insertions(+), 142 deletions(-)

--
2.25.1
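The compile-time detection described in the cover letter amounts to the
following sketch. It only illustrates the pattern (the real macro is in
patch 4 of this series); HAVE_BCDSUB_INSN is a name invented here, and
the nested #if is just one way to spell the same check:

  #if defined(__has_builtin)
  # if __has_builtin(__builtin_bcdsub)
  #  define HAVE_BCDSUB_INSN 1  /* builtin present, assume the assembler knows the mnemonic */
  # else
  #  define HAVE_BCDSUB_INSN 0  /* e.g. LLVM < 14: emit the opcode with ".long" */
  # endif
  #else
  # define HAVE_BCDSUB_INSN 1   /* no __has_builtin: trust -mpower8-vector */
  #endif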
* [PATCH v2 1/5] tests/tcg/ppc64le: use inline asm instead of __builtin_mtfsf 2022-03-03 17:20 [PATCH v2 0/5] tests/tcg/ppc64le: fix the build of TCG tests with Clang matheus.ferst @ 2022-03-03 17:20 ` matheus.ferst 2022-03-03 18:46 ` Richard Henderson 2022-03-03 17:20 ` [PATCH v2 2/5] target/ppc: change xs[n]madd[am]sp to use float64r32_muladd matheus.ferst ` (3 subsequent siblings) 4 siblings, 1 reply; 14+ messages in thread From: matheus.ferst @ 2022-03-03 17:20 UTC (permalink / raw) To: qemu-devel, qemu-ppc Cc: thuth, Matheus Ferst, danielhb413, richard.henderson, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david From: Matheus Ferst <matheus.ferst@eldorado.org.br> LLVM/Clang does not support __builtin_mtfsf. Acked-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> --- tests/tcg/ppc64le/mtfsf.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/tests/tcg/ppc64le/mtfsf.c b/tests/tcg/ppc64le/mtfsf.c index b3d31f3637..bed5b1afa4 100644 --- a/tests/tcg/ppc64le/mtfsf.c +++ b/tests/tcg/ppc64le/mtfsf.c @@ -1,8 +1,12 @@ #include <stdlib.h> +#include <stdint.h> #include <assert.h> #include <signal.h> #include <sys/prctl.h> +#define MTFSF(FLM, FRB) asm volatile ("mtfsf %0, %1" :: "i" (FLM), "f" (FRB)) +#define MFFS(FRT) asm("mffs %0" : "=f" (FRT)) + #define FPSCR_VE 7 /* Floating-point invalid operation exception enable */ #define FPSCR_VXSOFT 10 /* Floating-point invalid operation exception (soft) */ #define FPSCR_FI 17 /* Floating-point fraction inexact */ @@ -21,10 +25,7 @@ void sigfpe_handler(int sig, siginfo_t *si, void *ucontext) int main(void) { - union { - double d; - long long ll; - } fpscr; + uint64_t fpscr; struct sigaction sa = { .sa_sigaction = sigfpe_handler, @@ -40,10 +41,9 @@ int main(void) prctl(PR_SET_FPEXC, PR_FP_EXC_PRECISE); /* First test if the FI bit is being set correctly */ - fpscr.ll = FP_FI; - __builtin_mtfsf(0b11111111, fpscr.d); - fpscr.d = __builtin_mffs(); - assert((fpscr.ll & FP_FI) != 0); + MTFSF(0b11111111, FP_FI); + MFFS(fpscr); + assert((fpscr & FP_FI) != 0); /* Then test if the deferred exception is being called correctly */ sigaction(SIGFPE, &sa, NULL); @@ -54,8 +54,7 @@ int main(void) * But if a different exception is chosen si_code check should * change accordingly. */ - fpscr.ll = FP_VE | FP_VXSOFT; - __builtin_mtfsf(0b11111111, fpscr.d); + MTFSF(0b11111111, FP_VE | FP_VXSOFT); return 1; } -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 1/5] tests/tcg/ppc64le: use inline asm instead of __builtin_mtfsf 2022-03-03 17:20 ` [PATCH v2 1/5] tests/tcg/ppc64le: use inline asm instead of __builtin_mtfsf matheus.ferst @ 2022-03-03 18:46 ` Richard Henderson 0 siblings, 0 replies; 14+ messages in thread From: Richard Henderson @ 2022-03-03 18:46 UTC (permalink / raw) To: matheus.ferst, qemu-devel, qemu-ppc Cc: thuth, danielhb413, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david On 3/3/22 07:20, matheus.ferst@eldorado.org.br wrote: > From: Matheus Ferst<matheus.ferst@eldorado.org.br> > > LLVM/Clang does not support __builtin_mtfsf. > > Acked-by: Alex Bennée<alex.bennee@linaro.org> > Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br> > --- > tests/tcg/ppc64le/mtfsf.c | 19 +++++++++---------- > 1 file changed, 9 insertions(+), 10 deletions(-) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 2/5] target/ppc: change xs[n]madd[am]sp to use float64r32_muladd 2022-03-03 17:20 [PATCH v2 0/5] tests/tcg/ppc64le: fix the build of TCG tests with Clang matheus.ferst 2022-03-03 17:20 ` [PATCH v2 1/5] tests/tcg/ppc64le: use inline asm instead of __builtin_mtfsf matheus.ferst @ 2022-03-03 17:20 ` matheus.ferst 2022-03-03 18:49 ` Richard Henderson 2022-03-03 17:20 ` [RFC PATCH v2 3/5] tests/tcg/ppc64le: drop __int128 usage in bcdsub matheus.ferst ` (2 subsequent siblings) 4 siblings, 1 reply; 14+ messages in thread From: matheus.ferst @ 2022-03-03 17:20 UTC (permalink / raw) To: qemu-devel, qemu-ppc Cc: thuth, Matheus Ferst, danielhb413, richard.henderson, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david From: Matheus Ferst <matheus.ferst@eldorado.org.br> Change VSX Scalar Multiply-Add/Subtract Type-A/M Single Precision helpers to use float64r32_muladd. This method should correctly handle all rounding modes, so the workaround for float_round_nearest_even can be dropped. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> --- target/ppc/fpu_helper.c | 93 ++++++++++++++++------------------------- 1 file changed, 35 insertions(+), 58 deletions(-) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index 8f970288f5..c973968ed6 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -1916,22 +1916,19 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode, * fld - vsr_t field (VsrD(*) or VsrW(*)) * sfprf - set FPRF */ -#define VSX_RE(op, nels, tp, fld, sfprf, r2sp) \ +#define VSX_RE(op, nels, tp, op_tp, fld, sfprf) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ - int i; \ + int i, flags; \ \ helper_reset_fpstatus(env); \ \ for (i = 0; i < nels; i++) { \ - if (unlikely(tp##_is_signaling_nan(xb->fld, &env->fp_status))) { \ + t.fld = op_tp##_div(tp##_one, xb->fld, &env->fp_status); \ + flags = get_float_exception_flags(&env->fp_status); \ + if (unlikely(flags & float_flag_invalid_snan)) { \ float_invalid_op_vxsnan(env, GETPC()); \ - } \ - t.fld = tp##_div(tp##_one, xb->fld, &env->fp_status); \ - \ - if (r2sp) { \ - t.fld = do_frsp(env, t.fld, GETPC()); \ } \ \ if (sfprf) { \ @@ -1943,10 +1940,10 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ do_float_check_status(env, GETPC()); \ } -VSX_RE(xsredp, 1, float64, VsrD(0), 1, 0) -VSX_RE(xsresp, 1, float64, VsrD(0), 1, 1) -VSX_RE(xvredp, 2, float64, VsrD(i), 0, 0) -VSX_RE(xvresp, 4, float32, VsrW(i), 0, 0) +VSX_RE(xsredp, 1, float64, float64, VsrD(0), 1) +VSX_RE(xsresp, 1, float64, float64r32, VsrD(0), 1) +VSX_RE(xvredp, 2, float64, float64, VsrD(i), 0) +VSX_RE(xvresp, 4, float32, float32, VsrW(i), 0) /* * VSX_SQRT - VSX floating point square root @@ -1998,10 +1995,11 @@ VSX_SQRT(xvsqrtsp, 4, float32, VsrW(i), 0, 0) * op - instruction mnemonic * nels - number of elements (1, 2 or 4) * tp - type (float32 or float64) + * op_tp - operation type * fld - vsr_t field (VsrD(*) or VsrW(*)) * sfprf - set FPRF */ -#define VSX_RSQRTE(op, nels, tp, fld, sfprf, r2sp) \ +#define VSX_RSQRTE(op, nels, tp, op_tp, fld, sfprf) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ @@ -2012,15 +2010,12 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ for (i = 0; i < nels; i++) { \ float_status tstat = env->fp_status; \ set_float_exception_flags(0, &tstat); \ - t.fld = tp##_sqrt(xb->fld, &tstat); \ - t.fld = 
tp##_div(tp##_one, t.fld, &tstat); \ + t.fld = op_tp##_sqrt(xb->fld, &tstat); \ + t.fld = op_tp##_div(tp##_one, t.fld, &tstat); \ env->fp_status.float_exception_flags |= tstat.float_exception_flags; \ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \ float_invalid_op_sqrt(env, tstat.float_exception_flags, \ sfprf, GETPC()); \ - } \ - if (r2sp) { \ - t.fld = do_frsp(env, t.fld, GETPC()); \ } \ \ if (sfprf) { \ @@ -2032,10 +2027,10 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ do_float_check_status(env, GETPC()); \ } -VSX_RSQRTE(xsrsqrtedp, 1, float64, VsrD(0), 1, 0) -VSX_RSQRTE(xsrsqrtesp, 1, float64, VsrD(0), 1, 1) -VSX_RSQRTE(xvrsqrtedp, 2, float64, VsrD(i), 0, 0) -VSX_RSQRTE(xvrsqrtesp, 4, float32, VsrW(i), 0, 0) +VSX_RSQRTE(xsrsqrtedp, 1, float64, float64, VsrD(0), 1) +VSX_RSQRTE(xsrsqrtesp, 1, float64, float64r32, VsrD(0), 1) +VSX_RSQRTE(xvrsqrtedp, 2, float64, float64, VsrD(i), 0) +VSX_RSQRTE(xvrsqrtesp, 4, float32, float32, VsrW(i), 0) /* * VSX_TDIV - VSX floating point test for divide @@ -2156,9 +2151,8 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, VsrW(i), -126, 23) * maddflgs - flags for the float*muladd routine that control the * various forms (madd, msub, nmadd, nmsub) * sfprf - set FPRF - * r2sp - round intermediate double precision result to single precision */ -#define VSX_MADD(op, nels, tp, fld, maddflgs, sfprf, r2sp) \ +#define VSX_MADD(op, nels, tp, fld, maddflgs, sfprf) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ ppc_vsr_t *s1, ppc_vsr_t *s2, ppc_vsr_t *s3) \ { \ @@ -2170,20 +2164,7 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ for (i = 0; i < nels; i++) { \ float_status tstat = env->fp_status; \ set_float_exception_flags(0, &tstat); \ - if (r2sp && (tstat.float_rounding_mode == float_round_nearest_even)) {\ - /* \ - * Avoid double rounding errors by rounding the intermediate \ - * result to odd. 
\ - */ \ - set_float_rounding_mode(float_round_to_zero, &tstat); \ - t.fld = tp##_muladd(s1->fld, s3->fld, s2->fld, \ - maddflgs, &tstat); \ - t.fld |= (get_float_exception_flags(&tstat) & \ - float_flag_inexact) != 0; \ - } else { \ - t.fld = tp##_muladd(s1->fld, s3->fld, s2->fld, \ - maddflgs, &tstat); \ - } \ + t.fld = tp##_muladd(s1->fld, s3->fld, s2->fld, maddflgs, &tstat); \ env->fp_status.float_exception_flags |= tstat.float_exception_flags; \ \ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \ @@ -2191,10 +2172,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ sfprf, GETPC()); \ } \ \ - if (r2sp) { \ - t.fld = do_frsp(env, t.fld, GETPC()); \ - } \ - \ if (sfprf) { \ helper_compute_fprf_float64(env, t.fld); \ } \ @@ -2203,24 +2180,24 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ do_float_check_status(env, GETPC()); \ } -VSX_MADD(XSMADDDP, 1, float64, VsrD(0), MADD_FLGS, 1, 0) -VSX_MADD(XSMSUBDP, 1, float64, VsrD(0), MSUB_FLGS, 1, 0) -VSX_MADD(XSNMADDDP, 1, float64, VsrD(0), NMADD_FLGS, 1, 0) -VSX_MADD(XSNMSUBDP, 1, float64, VsrD(0), NMSUB_FLGS, 1, 0) -VSX_MADD(XSMADDSP, 1, float64, VsrD(0), MADD_FLGS, 1, 1) -VSX_MADD(XSMSUBSP, 1, float64, VsrD(0), MSUB_FLGS, 1, 1) -VSX_MADD(XSNMADDSP, 1, float64, VsrD(0), NMADD_FLGS, 1, 1) -VSX_MADD(XSNMSUBSP, 1, float64, VsrD(0), NMSUB_FLGS, 1, 1) +VSX_MADD(XSMADDDP, 1, float64, VsrD(0), MADD_FLGS, 1) +VSX_MADD(XSMSUBDP, 1, float64, VsrD(0), MSUB_FLGS, 1) +VSX_MADD(XSNMADDDP, 1, float64, VsrD(0), NMADD_FLGS, 1) +VSX_MADD(XSNMSUBDP, 1, float64, VsrD(0), NMSUB_FLGS, 1) +VSX_MADD(XSMADDSP, 1, float64r32, VsrD(0), MADD_FLGS, 1) +VSX_MADD(XSMSUBSP, 1, float64r32, VsrD(0), MSUB_FLGS, 1) +VSX_MADD(XSNMADDSP, 1, float64r32, VsrD(0), NMADD_FLGS, 1) +VSX_MADD(XSNMSUBSP, 1, float64r32, VsrD(0), NMSUB_FLGS, 1) -VSX_MADD(xvmadddp, 2, float64, VsrD(i), MADD_FLGS, 0, 0) -VSX_MADD(xvmsubdp, 2, float64, VsrD(i), MSUB_FLGS, 0, 0) -VSX_MADD(xvnmadddp, 2, float64, VsrD(i), NMADD_FLGS, 0, 0) -VSX_MADD(xvnmsubdp, 2, float64, VsrD(i), NMSUB_FLGS, 0, 0) +VSX_MADD(xvmadddp, 2, float64, VsrD(i), MADD_FLGS, 0) +VSX_MADD(xvmsubdp, 2, float64, VsrD(i), MSUB_FLGS, 0) +VSX_MADD(xvnmadddp, 2, float64, VsrD(i), NMADD_FLGS, 0) +VSX_MADD(xvnmsubdp, 2, float64, VsrD(i), NMSUB_FLGS, 0) -VSX_MADD(xvmaddsp, 4, float32, VsrW(i), MADD_FLGS, 0, 0) -VSX_MADD(xvmsubsp, 4, float32, VsrW(i), MSUB_FLGS, 0, 0) -VSX_MADD(xvnmaddsp, 4, float32, VsrW(i), NMADD_FLGS, 0, 0) -VSX_MADD(xvnmsubsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0) +VSX_MADD(xvmaddsp, 4, float32, VsrW(i), MADD_FLGS, 0) +VSX_MADD(xvmsubsp, 4, float32, VsrW(i), MSUB_FLGS, 0) +VSX_MADD(xvnmaddsp, 4, float32, VsrW(i), NMADD_FLGS, 0) +VSX_MADD(xvnmsubsp, 4, float32, VsrW(i), NMSUB_FLGS, 0) /* * VSX_MADDQ - VSX floating point quad-precision muliply/add -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
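Some background on the workaround that patch 2 removes, for readers who
have not seen the round-to-odd trick before (this is a summary of the
reasoning, not text from the patch). Write x = a*b + c for the exact
product-sum, RN_p for round-to-nearest at p significand bits, and RO_p
for round-to-odd (round toward zero, then set the last bit if the result
is inexact, which is what the deleted code emulated by OR-ing the
inexact flag into the LSB):

  \mathrm{RN}_{24}\bigl(\mathrm{RN}_{53}(x)\bigr) \ne \mathrm{RN}_{24}(x)
      \quad \text{for some } x \text{ (double rounding)}

  \mathrm{RN}_{24}\bigl(\mathrm{RO}_{53}(x)\bigr) = \mathrm{RN}_{24}(x)
      \quad \text{since float64 has at least two extra bits } (53 \ge 24 + 2)

float64r32_muladd sidesteps the issue by rounding the exact result only
once, directly to single-precision precision and range, which is why the
special case for float_round_nearest_even is no longer needed.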
* Re: [PATCH v2 2/5] target/ppc: change xs[n]madd[am]sp to use float64r32_muladd 2022-03-03 17:20 ` [PATCH v2 2/5] target/ppc: change xs[n]madd[am]sp to use float64r32_muladd matheus.ferst @ 2022-03-03 18:49 ` Richard Henderson 2022-03-03 20:53 ` Matheus K. Ferst 0 siblings, 1 reply; 14+ messages in thread From: Richard Henderson @ 2022-03-03 18:49 UTC (permalink / raw) To: matheus.ferst, qemu-devel, qemu-ppc Cc: thuth, danielhb413, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david On 3/3/22 07:20, matheus.ferst@eldorado.org.br wrote: > From: Matheus Ferst <matheus.ferst@eldorado.org.br> > > Change VSX Scalar Multiply-Add/Subtract Type-A/M Single Precision > helpers to use float64r32_muladd. This method should correctly handle > all rounding modes, so the workaround for float_round_nearest_even can > be dropped. > > Reviewed-by: Richard Henderson <richard.henderson@linaro.org> > Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> > --- > target/ppc/fpu_helper.c | 93 ++++++++++++++++------------------------- > 1 file changed, 35 insertions(+), 58 deletions(-) > > diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c > index 8f970288f5..c973968ed6 100644 > --- a/target/ppc/fpu_helper.c > +++ b/target/ppc/fpu_helper.c > @@ -1916,22 +1916,19 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode, > * fld - vsr_t field (VsrD(*) or VsrW(*)) > * sfprf - set FPRF > */ > -#define VSX_RE(op, nels, tp, fld, sfprf, r2sp) \ > +#define VSX_RE(op, nels, tp, op_tp, fld, sfprf) \ > void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ > { \ > ppc_vsr_t t = { }; \ > - int i; \ > + int i, flags; \ > \ > helper_reset_fpstatus(env); \ > \ > for (i = 0; i < nels; i++) { \ > - if (unlikely(tp##_is_signaling_nan(xb->fld, &env->fp_status))) { \ > + t.fld = op_tp##_div(tp##_one, xb->fld, &env->fp_status); \ > + flags = get_float_exception_flags(&env->fp_status); \ > + if (unlikely(flags & float_flag_invalid_snan)) { You seem to have squashed the change to recip-estimate into the same patch as muladd. > -#define VSX_RSQRTE(op, nels, tp, fld, sfprf, r2sp) \ > +#define VSX_RSQRTE(op, nels, tp, op_tp, fld, sfprf) \ And recip-sqrt-estimate. I guess it's ok to squash, since it's all related, but you should update the patch description if you leave it this way. r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 2/5] target/ppc: change xs[n]madd[am]sp to use float64r32_muladd 2022-03-03 18:49 ` Richard Henderson @ 2022-03-03 20:53 ` Matheus K. Ferst 0 siblings, 0 replies; 14+ messages in thread From: Matheus K. Ferst @ 2022-03-03 20:53 UTC (permalink / raw) To: Richard Henderson, qemu-devel, qemu-ppc Cc: thuth, danielhb413, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david On 03/03/2022 15:49, Richard Henderson wrote: > On 3/3/22 07:20, matheus.ferst@eldorado.org.br wrote: >> From: Matheus Ferst <matheus.ferst@eldorado.org.br> >> >> Change VSX Scalar Multiply-Add/Subtract Type-A/M Single Precision >> helpers to use float64r32_muladd. This method should correctly handle >> all rounding modes, so the workaround for float_round_nearest_even can >> be dropped. >> >> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> >> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> >> --- >> target/ppc/fpu_helper.c | 93 ++++++++++++++++------------------------- >> 1 file changed, 35 insertions(+), 58 deletions(-) >> >> diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c >> index 8f970288f5..c973968ed6 100644 >> --- a/target/ppc/fpu_helper.c >> +++ b/target/ppc/fpu_helper.c >> @@ -1916,22 +1916,19 @@ void helper_xsdivqp(CPUPPCState *env, uint32_t >> opcode, >> * fld - vsr_t field (VsrD(*) or VsrW(*)) >> * sfprf - set FPRF >> */ >> -#define VSX_RE(op, nels, tp, fld, sfprf, >> r2sp) \ >> +#define VSX_RE(op, nels, tp, op_tp, fld, >> sfprf) \ >> void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t >> *xb) \ >> >> { >> \ >> ppc_vsr_t t = { >> }; \ >> - int >> i; \ >> + int i, >> flags; \ >> >> \ >> >> helper_reset_fpstatus(env); >> \ >> >> \ >> for (i = 0; i < nels; i++) >> { \ >> - if (unlikely(tp##_is_signaling_nan(xb->fld, >> &env->fp_status))) { \ >> + t.fld = op_tp##_div(tp##_one, xb->fld, >> &env->fp_status); \ >> + flags = >> get_float_exception_flags(&env->fp_status); \ >> + if (unlikely(flags & float_flag_invalid_snan)) { > > You seem to have squashed the change to recip-estimate into the same > patch as muladd. > > >> -#define VSX_RSQRTE(op, nels, tp, fld, sfprf, >> r2sp) \ >> +#define VSX_RSQRTE(op, nels, tp, op_tp, fld, >> sfprf) \ > > And recip-sqrt-estimate. > > I guess it's ok to squash, since it's all related, but you should update > the patch > description if you leave it this way. > Sorry, I cherry-picked the wrong branch. This patch should just be a rebase of v1. I'll send the changes to VSX_{ADD_SUB,MUL,DIV,RE,SQRT,RSQRTE} in a separate patch series since it's not test-related. Thanks, Matheus K. Ferst Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/> Analista de Software Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html> ^ permalink raw reply [flat|nested] 14+ messages in thread
* [RFC PATCH v2 3/5] tests/tcg/ppc64le: drop __int128 usage in bcdsub 2022-03-03 17:20 [PATCH v2 0/5] tests/tcg/ppc64le: fix the build of TCG tests with Clang matheus.ferst 2022-03-03 17:20 ` [PATCH v2 1/5] tests/tcg/ppc64le: use inline asm instead of __builtin_mtfsf matheus.ferst 2022-03-03 17:20 ` [PATCH v2 2/5] target/ppc: change xs[n]madd[am]sp to use float64r32_muladd matheus.ferst @ 2022-03-03 17:20 ` matheus.ferst 2022-03-03 18:57 ` Richard Henderson 2022-03-03 17:20 ` [RFC PATCH v2 4/5] tests/tcg/ppc64le: emit bcdsub with .long when needed matheus.ferst 2022-03-03 17:20 ` [PATCH v2 5/5] tests/tcg/ppc64le: Use Altivec register names in clobbler list matheus.ferst 4 siblings, 1 reply; 14+ messages in thread From: matheus.ferst @ 2022-03-03 17:20 UTC (permalink / raw) To: qemu-devel, qemu-ppc Cc: thuth, Matheus Ferst, danielhb413, richard.henderson, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david From: Matheus Ferst <matheus.ferst@eldorado.org.br> Using __int128 with inline asm constraints like "v" generates incorrect code when compiling with LLVM/Clang (e.g., only one doubleword of the VSR is loaded). Instead, use a GPR pair to pass the 128-bits value and load the VSR with mtvsrd/xxmrghd. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> --- I'm avoiding newer insns like mtvsrdd/mfvsrld to move values between VSR and GPR so we can run this test in a POWER8 machine. v2: - No vector types to avoid endian-dependent code. --- tests/tcg/ppc64le/bcdsub.c | 117 +++++++++++++++++-------------------- 1 file changed, 53 insertions(+), 64 deletions(-) diff --git a/tests/tcg/ppc64le/bcdsub.c b/tests/tcg/ppc64le/bcdsub.c index 8c188cae6d..c9ca5357cb 100644 --- a/tests/tcg/ppc64le/bcdsub.c +++ b/tests/tcg/ppc64le/bcdsub.c @@ -1,6 +1,7 @@ #include <assert.h> #include <unistd.h> #include <signal.h> +#include <stdint.h> #define CRF_LT (1 << 3) #define CRF_GT (1 << 2) @@ -8,21 +9,32 @@ #define CRF_SO (1 << 0) #define UNDEF 0 -#define BCDSUB(vra, vrb, ps) \ - asm ("bcdsub. %1,%2,%3,%4;" \ - "mfocrf %0,0b10;" \ - : "=r" (cr), "=v" (vrt) \ - : "v" (vra), "v" (vrb), "i" (ps) \ - : ); +#define BCDSUB(AH, AL, BH, BL, PS) \ + asm ("mtvsrd 32, %3\n\t" \ + "mtvsrd 33, %4\n\t" \ + "xxmrghd 32, 32, 33\n\t" \ + "mtvsrd 33, %5\n\t" \ + "mtvsrd 34, %6\n\t" \ + "xxmrghd 33, 33, 34\n\t" \ + "bcdsub. 
0, 0, 1, %7\n\t" \ + "mfocrf %0, 0b10\n\t" \ + "mfvsrd %1, 32\n\t" \ + "xxswapd 32, 32\n\t" \ + "mfvsrd %2, 32\n\t" \ + : "=r" (cr), "=r" (th), "=r" (tl) \ + : "r" (AH), "r" (AL), "r" (BH), "r" (BL), "i" (PS) \ + : "v0", "v1", "v2"); -#define TEST(vra, vrb, ps, exp_res, exp_cr6) \ +#define TEST(AH, AL, BH, BL, PS, TH, TL, CR6) \ do { \ - __int128 vrt = 0; \ int cr = 0; \ - BCDSUB(vra, vrb, ps); \ - if (exp_res) \ - assert(vrt == exp_res); \ - assert((cr >> 4) == exp_cr6); \ + uint64_t th, tl; \ + BCDSUB(AH, AL, BH, BL, PS); \ + if (TH || TL) { \ + assert(tl == TL); \ + assert(th == TH); \ + } \ + assert((cr >> 4) == CR6); \ } while (0) @@ -33,13 +45,13 @@ */ void test_bcdsub_eq(void) { - __int128 a, b; - /* maximum positive BCD value */ - a = b = (((__int128) 0x9999999999999999) << 64 | 0x999999999999999c); - - TEST(a, b, 0, 0xc, CRF_EQ); - TEST(a, b, 1, 0xf, CRF_EQ); + TEST(0x9999999999999999, 0x999999999999999c, + 0x9999999999999999, 0x999999999999999c, + 0, 0x0, 0xc, CRF_EQ); + TEST(0x9999999999999999, 0x999999999999999c, + 0x9999999999999999, 0x999999999999999c, + 1, 0x0, 0xf, CRF_EQ); } /* @@ -49,21 +61,16 @@ void test_bcdsub_eq(void) */ void test_bcdsub_gt(void) { - __int128 a, b, c; + /* maximum positive and negative one BCD values */ + TEST(0x9999999999999999, 0x999999999999999c, 0x0, 0x1d, 0, + 0x0, 0xc, (CRF_GT | CRF_SO)); + TEST(0x9999999999999999, 0x999999999999999c, 0x0, 0x1d, 1, + 0x0, 0xf, (CRF_GT | CRF_SO)); - /* maximum positive BCD value */ - a = (((__int128) 0x9999999999999999) << 64 | 0x999999999999999c); - - /* negative one BCD value */ - b = (__int128) 0x1d; - - TEST(a, b, 0, 0xc, (CRF_GT | CRF_SO)); - TEST(a, b, 1, 0xf, (CRF_GT | CRF_SO)); - - c = (((__int128) 0x9999999999999999) << 64 | 0x999999999999998c); - - TEST(c, b, 0, a, CRF_GT); - TEST(c, b, 1, (a | 0x3), CRF_GT); + TEST(0x9999999999999999, 0x999999999999998c, 0x0, 0x1d, 0, + 0x9999999999999999, 0x999999999999999c, CRF_GT); + TEST(0x9999999999999999, 0x999999999999998c, 0x0, 0x1d, 1, + 0x9999999999999999, 0x999999999999999f, CRF_GT); } /* @@ -73,45 +80,27 @@ void test_bcdsub_gt(void) */ void test_bcdsub_lt(void) { - __int128 a, b; + /* positive zero and positive one BCD values */ + TEST(0x0, 0xc, 0x0, 0x1c, 0, 0x0, 0x1d, CRF_LT); + TEST(0x0, 0xc, 0x0, 0x1c, 1, 0x0, 0x1d, CRF_LT); - /* positive zero BCD value */ - a = (__int128) 0xc; - - /* positive one BCD value */ - b = (__int128) 0x1c; - - TEST(a, b, 0, 0x1d, CRF_LT); - TEST(a, b, 1, 0x1d, CRF_LT); - - /* maximum negative BCD value */ - a = (((__int128) 0x9999999999999999) << 64 | 0x999999999999999d); - - /* positive one BCD value */ - b = (__int128) 0x1c; - - TEST(a, b, 0, 0xd, (CRF_LT | CRF_SO)); - TEST(a, b, 1, 0xd, (CRF_LT | CRF_SO)); + /* maximum negative and positive one BCD values */ + TEST(0x9999999999999999, 0x999999999999999d, 0x0, 0x1c, 0, + 0x0, 0xd, (CRF_LT | CRF_SO)); + TEST(0x9999999999999999, 0x999999999999999d, 0x0, 0x1c, 1, + 0x0, 0xd, (CRF_LT | CRF_SO)); } void test_bcdsub_invalid(void) { - __int128 a, b; + TEST(0x0, 0x1c, 0x0, 0xf00, 0, UNDEF, UNDEF, CRF_SO); + TEST(0x0, 0x1c, 0x0, 0xf00, 1, UNDEF, UNDEF, CRF_SO); - /* positive one BCD value */ - a = (__int128) 0x1c; - b = 0xf00; + TEST(0x0, 0xf00, 0x0, 0x1c, 0, UNDEF, UNDEF, CRF_SO); + TEST(0x0, 0xf00, 0x0, 0x1c, 1, UNDEF, UNDEF, CRF_SO); - TEST(a, b, 0, UNDEF, CRF_SO); - TEST(a, b, 1, UNDEF, CRF_SO); - - TEST(b, a, 0, UNDEF, CRF_SO); - TEST(b, a, 1, UNDEF, CRF_SO); - - a = 0xbad; - - TEST(a, b, 0, UNDEF, CRF_SO); - TEST(a, b, 1, UNDEF, CRF_SO); + TEST(0x0, 0xbad, 0x0, 0xf00, 0, UNDEF, 
UNDEF, CRF_SO); + TEST(0x0, 0xbad, 0x0, 0xf00, 1, UNDEF, UNDEF, CRF_SO); } int main(void) -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
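The rewrite above replaces each __int128 constant with its high and low
doublewords. The mapping can be sanity-checked on the host with a small
standalone program (an illustration only, not part of the patch; it
assumes a compiler that provides __int128, e.g. GCC):

  #include <assert.h>
  #include <stdint.h>

  int main(void)
  {
      /* maximum positive BCD value used in test_bcdsub_eq() */
      unsigned __int128 a = ((unsigned __int128)0x9999999999999999ULL << 64)
                            | 0x999999999999999cULL;
      uint64_t ah = (uint64_t)(a >> 64); /* high doubleword -> AH */
      uint64_t al = (uint64_t)a;         /* low doubleword  -> AL */
      assert(ah == 0x9999999999999999ULL);
      assert(al == 0x999999999999999cULL);
      return 0;
  }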
* Re: [RFC PATCH v2 3/5] tests/tcg/ppc64le: drop __int128 usage in bcdsub 2022-03-03 17:20 ` [RFC PATCH v2 3/5] tests/tcg/ppc64le: drop __int128 usage in bcdsub matheus.ferst @ 2022-03-03 18:57 ` Richard Henderson 0 siblings, 0 replies; 14+ messages in thread From: Richard Henderson @ 2022-03-03 18:57 UTC (permalink / raw) To: matheus.ferst, qemu-devel, qemu-ppc Cc: thuth, danielhb413, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david On 3/3/22 07:20, matheus.ferst@eldorado.org.br wrote: > I'm avoiding newer insns like mtvsrdd/mfvsrld to move values between VSR > and GPR so we can run this test in a POWER8 machine. This... > +#define BCDSUB(AH, AL, BH, BL, PS) \ > + asm ("mtvsrd 32, %3\n\t" \ > + "mtvsrd 33, %4\n\t" \ > + "xxmrghd 32, 32, 33\n\t" \ > + "mtvsrd 33, %5\n\t" \ > + "mtvsrd 34, %6\n\t" \ > + "xxmrghd 33, 33, 34\n\t" \ > + "bcdsub. 0, 0, 1, %7\n\t" \ > + "mfocrf %0, 0b10\n\t" \ > + "mfvsrd %1, 32\n\t" \ > + "xxswapd 32, 32\n\t" \ > + "mfvsrd %2, 32\n\t" \ > + : "=r" (cr), "=r" (th), "=r" (tl) \ > + : "r" (AH), "r" (AL), "r" (BH), "r" (BL), "i" (PS) \ > + : "v0", "v1", "v2"); ... belongs here as a comment. > + if (TH || TL) { \ Would be clearer with TH != UNDEF || TL != UNDEF. Otherwise, Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
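Applied to the TEST macro from patch 3, the suggested comparison would
look roughly like this (a hypothetical v3 shape, not code from the
series):

  if ((TH) != UNDEF || (TL) != UNDEF) {                   \
      assert(th == (TH));                                 \
      assert(tl == (TL));                                 \
  }                                                       \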
* [RFC PATCH v2 4/5] tests/tcg/ppc64le: emit bcdsub with .long when needed 2022-03-03 17:20 [PATCH v2 0/5] tests/tcg/ppc64le: fix the build of TCG tests with Clang matheus.ferst ` (2 preceding siblings ...) 2022-03-03 17:20 ` [RFC PATCH v2 3/5] tests/tcg/ppc64le: drop __int128 usage in bcdsub matheus.ferst @ 2022-03-03 17:20 ` matheus.ferst 2022-03-03 19:19 ` Richard Henderson 2022-03-03 17:20 ` [PATCH v2 5/5] tests/tcg/ppc64le: Use Altivec register names in clobbler list matheus.ferst 4 siblings, 1 reply; 14+ messages in thread From: matheus.ferst @ 2022-03-03 17:20 UTC (permalink / raw) To: qemu-devel, qemu-ppc Cc: thuth, Matheus Ferst, danielhb413, richard.henderson, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david From: Matheus Ferst <matheus.ferst@eldorado.org.br> Based on GCC docs[1], we use the '-mpower8-vector' flag at config-time to detect the toolchain support to the bcdsub instruction. LLVM/Clang supports this flag since version 3.6[2], but the instruction and related builtins were only added in LLVM 14[3]. In the absence of other means to detect this support at config-time, we resort to __has_builtin to identify the presence of __builtin_bcdsub at compile-time. If the builtin is not available, the instruction is emitted with a ".long". [1] https://gcc.gnu.org/onlinedocs/gcc-8.3.0/gcc/PowerPC-AltiVec_002fVSX-Built-in-Functions.html [2] https://github.com/llvm/llvm-project/commit/59eb767e11d4ffefb5f55409524e5c8416b2b0db [3] https://github.com/llvm/llvm-project/commit/c933c2eb334660c131f4afc9d194fafb0cec0423 Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> --- tests/tcg/ppc64le/bcdsub.c | 55 +++++++++++++++++++++----------------- 1 file changed, 30 insertions(+), 25 deletions(-) diff --git a/tests/tcg/ppc64le/bcdsub.c b/tests/tcg/ppc64le/bcdsub.c index c9ca5357cb..445d50f07d 100644 --- a/tests/tcg/ppc64le/bcdsub.c +++ b/tests/tcg/ppc64le/bcdsub.c @@ -9,32 +9,37 @@ #define CRF_SO (1 << 0) #define UNDEF 0 -#define BCDSUB(AH, AL, BH, BL, PS) \ - asm ("mtvsrd 32, %3\n\t" \ - "mtvsrd 33, %4\n\t" \ - "xxmrghd 32, 32, 33\n\t" \ - "mtvsrd 33, %5\n\t" \ - "mtvsrd 34, %6\n\t" \ - "xxmrghd 33, 33, 34\n\t" \ - "bcdsub. 0, 0, 1, %7\n\t" \ - "mfocrf %0, 0b10\n\t" \ - "mfvsrd %1, 32\n\t" \ - "xxswapd 32, 32\n\t" \ - "mfvsrd %2, 32\n\t" \ - : "=r" (cr), "=r" (th), "=r" (tl) \ - : "r" (AH), "r" (AL), "r" (BH), "r" (BL), "i" (PS) \ - : "v0", "v1", "v2"); +#if defined(__has_builtin) && !__has_builtin(__builtin_bcdsub) +#define BCDSUB(T, A, B, PS) \ + ".long 4 << 26 | (" #T ") << 21 | (" #A ") << 16 | (" #B ") << 11" \ + " | 1 << 10 | (" #PS ") << 9 | 65\n\t" +#else +#define BCDSUB(T, A, B, PS) "bcdsub. 
" #T ", " #A ", " #B ", " #PS "\n\t" +#endif -#define TEST(AH, AL, BH, BL, PS, TH, TL, CR6) \ - do { \ - int cr = 0; \ - uint64_t th, tl; \ - BCDSUB(AH, AL, BH, BL, PS); \ - if (TH || TL) { \ - assert(tl == TL); \ - assert(th == TH); \ - } \ - assert((cr >> 4) == CR6); \ +#define TEST(AH, AL, BH, BL, PS, TH, TL, CR6) \ + do { \ + int cr = 0; \ + uint64_t th, tl; \ + asm ("mtvsrd 32, %3\n\t" \ + "mtvsrd 33, %4\n\t" \ + "xxmrghd 32, 32, 33\n\t" \ + "mtvsrd 33, %5\n\t" \ + "mtvsrd 34, %6\n\t" \ + "xxmrghd 33, 33, 34\n\t" \ + BCDSUB(0, 0, 1, PS) \ + "mfocrf %0, 0b10\n\t" \ + "mfvsrd %1, 32\n\t" \ + "xxswapd 32, 32\n\t" \ + "mfvsrd %2, 32\n\t" \ + : "=r" (cr), "=r" (th), "=r" (tl) \ + : "r" (AH), "r" (AL), "r" (BH), "r" (BL) \ + : "v0", "v1", "v2"); \ + if (TH || TL) { \ + assert(tl == TL); \ + assert(th == TH); \ + } \ + assert((cr >> 4) == CR6); \ } while (0) -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [RFC PATCH v2 4/5] tests/tcg/ppc64le: emit bcdsub with .long when needed 2022-03-03 17:20 ` [RFC PATCH v2 4/5] tests/tcg/ppc64le: emit bcdsub with .long when needed matheus.ferst @ 2022-03-03 19:19 ` Richard Henderson 0 siblings, 0 replies; 14+ messages in thread From: Richard Henderson @ 2022-03-03 19:19 UTC (permalink / raw) To: matheus.ferst, qemu-devel, qemu-ppc Cc: thuth, danielhb413, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david On 3/3/22 07:20, matheus.ferst@eldorado.org.br wrote: > From: Matheus Ferst<matheus.ferst@eldorado.org.br> > > Based on GCC docs[1], we use the '-mpower8-vector' flag at config-time > to detect the toolchain support to the bcdsub instruction. LLVM/Clang > supports this flag since version 3.6[2], but the instruction and related > builtins were only added in LLVM 14[3]. In the absence of other means to > detect this support at config-time, we resort to __has_builtin to > identify the presence of __builtin_bcdsub at compile-time. If the > builtin is not available, the instruction is emitted with a ".long". > > [1]https://gcc.gnu.org/onlinedocs/gcc-8.3.0/gcc/PowerPC-AltiVec_002fVSX-Built-in-Functions.html > [2]https://github.com/llvm/llvm-project/commit/59eb767e11d4ffefb5f55409524e5c8416b2b0db > [3]https://github.com/llvm/llvm-project/commit/c933c2eb334660c131f4afc9d194fafb0cec0423 > > Signed-off-by: Matheus Ferst<matheus.ferst@eldorado.org.br> > --- > tests/tcg/ppc64le/bcdsub.c | 55 +++++++++++++++++++++----------------- > 1 file changed, 30 insertions(+), 25 deletions(-) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v2 5/5] tests/tcg/ppc64le: Use Altivec register names in clobbler list 2022-03-03 17:20 [PATCH v2 0/5] tests/tcg/ppc64le: fix the build of TCG tests with Clang matheus.ferst ` (3 preceding siblings ...) 2022-03-03 17:20 ` [RFC PATCH v2 4/5] tests/tcg/ppc64le: emit bcdsub with .long when needed matheus.ferst @ 2022-03-03 17:20 ` matheus.ferst 2022-03-03 19:30 ` Richard Henderson 4 siblings, 1 reply; 14+ messages in thread From: matheus.ferst @ 2022-03-03 17:20 UTC (permalink / raw) To: qemu-devel, qemu-ppc Cc: thuth, Matheus Ferst, danielhb413, richard.henderson, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david From: Matheus Ferst <matheus.ferst@eldorado.org.br> LLVM/Clang doesn't know the VSX registers when compiling with -mabi=elfv1. Use only registers >= 32 and list them with their Altivec name. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> --- tests/tcg/ppc64le/non_signalling_xscv.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/tests/tcg/ppc64le/non_signalling_xscv.c b/tests/tcg/ppc64le/non_signalling_xscv.c index 91e25cad46..836df71ef0 100644 --- a/tests/tcg/ppc64le/non_signalling_xscv.c +++ b/tests/tcg/ppc64le/non_signalling_xscv.c @@ -6,16 +6,16 @@ #define TEST(INSN, B_HI, B_LO, T_HI, T_LO) \ do { \ uint64_t th, tl, bh = B_HI, bl = B_LO; \ - asm("mtvsrd 0, %2\n\t" \ - "mtvsrd 1, %3\n\t" \ - "xxmrghd 0, 0, 1\n\t" \ - INSN " 0, 0\n\t" \ - "mfvsrd %0, 0\n\t" \ - "xxswapd 0, 0\n\t" \ - "mfvsrd %1, 0\n\t" \ + asm("mtvsrd 32, %2\n\t" \ + "mtvsrd 33, %3\n\t" \ + "xxmrghd 32, 32, 33\n\t" \ + INSN " 32, 32\n\t" \ + "mfvsrd %0, 32\n\t" \ + "xxswapd 32, 32\n\t" \ + "mfvsrd %1, 32\n\t" \ : "=r" (th), "=r" (tl) \ : "r" (bh), "r" (bl) \ - : "vs0", "vs1"); \ + : "v0", "v1"); \ printf(INSN "(0x%016" PRIx64 "%016" PRIx64 ") = 0x%016" PRIx64 \ "%016" PRIx64 "\n", bh, bl, th, tl); \ assert(th == T_HI && tl == T_LO); \ -- 2.25.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v2 5/5] tests/tcg/ppc64le: Use Altivec register names in clobbler list 2022-03-03 17:20 ` [PATCH v2 5/5] tests/tcg/ppc64le: Use Altivec register names in clobbler list matheus.ferst @ 2022-03-03 19:30 ` Richard Henderson 2022-03-03 20:53 ` Matheus K. Ferst 0 siblings, 1 reply; 14+ messages in thread From: Richard Henderson @ 2022-03-03 19:30 UTC (permalink / raw) To: matheus.ferst, qemu-devel, qemu-ppc Cc: thuth, danielhb413, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david On 3/3/22 07:20, matheus.ferst@eldorado.org.br wrote: > From: Matheus Ferst <matheus.ferst@eldorado.org.br> > > LLVM/Clang doesn't know the VSX registers when compiling with > -mabi=elfv1. Use only registers >= 32 and list them with their Altivec > name. > > Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> This description isn't quite right. The change to the m[tf]vsr insns is a generic bug fix, and not related to Clang. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 5/5] tests/tcg/ppc64le: Use Altivec register names in clobbler list 2022-03-03 19:30 ` Richard Henderson @ 2022-03-03 20:53 ` Matheus K. Ferst 2022-03-03 21:05 ` Richard Henderson 0 siblings, 1 reply; 14+ messages in thread From: Matheus K. Ferst @ 2022-03-03 20:53 UTC (permalink / raw) To: Richard Henderson, qemu-devel, qemu-ppc Cc: thuth, danielhb413, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david On 03/03/2022 16:30, Richard Henderson wrote: > On 3/3/22 07:20, matheus.ferst@eldorado.org.br wrote: >> From: Matheus Ferst <matheus.ferst@eldorado.org.br> >> >> LLVM/Clang doesn't know the VSX registers when compiling with >> -mabi=elfv1. Use only registers >= 32 and list them with their Altivec >> name. >> >> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> > > This description isn't quite right. The change to the m[tf]vsr insns is > a generic bug > fix, and not related to Clang. > I'm not sure if I understood. I'm targeting the Clang problem with this patch, is something else being fixed by this change? AFAICT, the old "mtvsrd 0, %2" and "mfvsrd %0, 0" were correct, I'm just changing from VSR 0 to VSR 32 to allow the clobber with Clang, but GCC doesn't seem to have this limitation with ELFv1. Thanks, Matheus K. Ferst Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/> Analista de Software Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v2 5/5] tests/tcg/ppc64le: Use Altivec register names in clobbler list 2022-03-03 20:53 ` Matheus K. Ferst @ 2022-03-03 21:05 ` Richard Henderson 0 siblings, 0 replies; 14+ messages in thread From: Richard Henderson @ 2022-03-03 21:05 UTC (permalink / raw) To: Matheus K. Ferst, qemu-devel, qemu-ppc Cc: thuth, danielhb413, groug, philippe.mathieu.daude, clg, mrezanin, alex.bennee, david On 3/3/22 10:53, Matheus K. Ferst wrote: > On 03/03/2022 16:30, Richard Henderson wrote: >> On 3/3/22 07:20, matheus.ferst@eldorado.org.br wrote: >>> From: Matheus Ferst <matheus.ferst@eldorado.org.br> >>> >>> LLVM/Clang doesn't know the VSX registers when compiling with >>> -mabi=elfv1. Use only registers >= 32 and list them with their Altivec >>> name. >>> >>> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> >> >> This description isn't quite right. The change to the m[tf]vsr insns is a generic bug >> fix, and not related to Clang. >> > > I'm not sure if I understood. I'm targeting the Clang problem with this patch, is > something else being fixed by this change? > AFAICT, the old "mtvsrd 0, %2" and "mfvsrd %0, 0" were correct, I'm just changing from VSR > 0 to VSR 32 to allow the clobber with Clang, but GCC doesn't seem to have this limitation > with ELFv1. Oh, whoops, I mis-read the patch. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~ ^ permalink raw reply [flat|nested] 14+ messages in thread
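Some background on why the Altivec names work here (a summary of the
Power ISA VSX register-file layout, not taken from the patch):

  /*
   * The 64 VSX registers overlay the older register files:
   *   VSR  0..31 : doubleword 0 aliases FPR 0..31
   *   VSR 32..63 : alias the Altivec/VMX registers VR 0..31
   *
   * So "mtvsrd 32, ..." writes the register that Altivec code calls v0,
   * and the asm clobber list can name it "v0"/"v1" even when the
   * compiler rejects the VSX names (as Clang does with -mabi=elfv1).
   */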