* [Qemu-devel] [patch] target-alpha: squashed fpu qualifiers patch @ 2009-12-18 22:09 ` Richard Henderson 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson 0 siblings, 1 reply; 18+ messages in thread From: Richard Henderson @ 2009-12-18 22:09 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues [-- Attachment #1: Type: text/plain, Size: 142 bytes --] This is a squashed version of the 3 or 4 incremental patches that I had sent out for implementing the alpha fpu instruction qualifiers. r~ [-- Attachment #2: commit-fpu-qual --] [-- Type: text/plain, Size: 47063 bytes --] commit 572164702dd83955fc8783c85811ec86c3fb6e4a Author: Richard Henderson <rth@twiddle.net> Date: Fri Dec 18 10:50:32 2009 -0800 target-alpha: Implement fp insn qualifiers. Adds a third constant argument to the fpu helpers, which contain the unparsed qualifier bits. The helper functions use new begin_fp/end_fp routines that extract the rounding mode from the qualifier bits, as well as raise exceptions for non-finite inputs and outputs also as directed by the qualifier bits. cpu_alpha_load/store_fpcr modified to load/store the majority of the bits from env->fpcr. This because we hadn't been saving a few of the fpcr bits in the fp_status field: in particular DNZ. Re-implement cvttq without saturation of overflow results, to match the Alpha specification. Signed-off-by: Richard Henderson <rth@twiddle.net> diff --git a/target-alpha/cpu.h b/target-alpha/cpu.h index c0dff4b..c1c0470 100644 --- a/target-alpha/cpu.h +++ b/target-alpha/cpu.h @@ -430,9 +430,13 @@ enum { }; /* Arithmetic exception */ -enum { - EXCP_ARITH_OVERFLOW, -}; +#define EXC_M_IOV (1<<16) /* Integer Overflow */ +#define EXC_M_INE (1<<15) /* Inexact result */ +#define EXC_M_UNF (1<<14) /* Underflow */ +#define EXC_M_FOV (1<<13) /* Overflow */ +#define EXC_M_DZE (1<<12) /* Division by zero */ +#define EXC_M_INV (1<<11) /* Invalid operation */ +#define EXC_M_SWC (1<<10) /* Software completion */ enum { IR_V0 = 0, diff --git a/target-alpha/helper.c b/target-alpha/helper.c index be7d37b..94821bd 100644 --- a/target-alpha/helper.c +++ b/target-alpha/helper.c @@ -27,41 +27,13 @@ uint64_t cpu_alpha_load_fpcr (CPUState *env) { - uint64_t ret = 0; - int flags, mask; - - flags = env->fp_status.float_exception_flags; - ret |= (uint64_t) flags << 52; - if (flags) - ret |= FPCR_SUM; - env->ipr[IPR_EXC_SUM] &= ~0x3E; - env->ipr[IPR_EXC_SUM] |= flags << 1; - - mask = env->fp_status.float_exception_mask; - if (mask & float_flag_invalid) - ret |= FPCR_INVD; - if (mask & float_flag_divbyzero) - ret |= FPCR_DZED; - if (mask & float_flag_overflow) - ret |= FPCR_OVFD; - if (mask & float_flag_underflow) - ret |= FPCR_UNFD; - if (mask & float_flag_inexact) - ret |= FPCR_INED; - - switch (env->fp_status.float_rounding_mode) { - case float_round_nearest_even: - ret |= 2ULL << FPCR_DYN_SHIFT; - break; - case float_round_down: - ret |= 1ULL << FPCR_DYN_SHIFT; - break; - case float_round_up: - ret |= 3ULL << FPCR_DYN_SHIFT; - break; - case float_round_to_zero: - break; - } + uint64_t ret = env->fp_status.float_exception_flags; + + if (ret) + ret = FPCR_SUM | (ret << 52); + + ret |= env->fpcr & ~(FPCR_SUM | FPCR_STATUS_MASK); + return ret; } @@ -69,6 +41,8 @@ void cpu_alpha_store_fpcr (CPUState *env, uint64_t val) { int round_mode, mask; + env->fpcr = val; + set_float_exception_flags((val >> 52) & 0x3F, &env->fp_status); mask = 0; @@ -86,6 +60,7 @@ void cpu_alpha_store_fpcr (CPUState *env, uint64_t val) switch ((val >> FPCR_DYN_SHIFT) & 
3) { case 0: + default: round_mode = float_round_to_zero; break; case 1: @@ -100,6 +75,11 @@ void cpu_alpha_store_fpcr (CPUState *env, uint64_t val) break; } set_float_rounding_mode(round_mode, &env->fp_status); + + mask = 0; + if ((val & (FPCR_UNDZ|FPCR_UNFD)) == (FPCR_UNDZ|FPCR_UNFD)) + mask = 1; + set_flush_to_zero(mask, &env->fp_status); } #if defined(CONFIG_USER_ONLY) diff --git a/target-alpha/helper.h b/target-alpha/helper.h index bedd3c0..1521a84 100644 --- a/target-alpha/helper.h +++ b/target-alpha/helper.h @@ -41,33 +41,33 @@ DEF_HELPER_1(store_fpcr, void, i64) DEF_HELPER_1(f_to_memory, i32, i64) DEF_HELPER_1(memory_to_f, i64, i32) -DEF_HELPER_2(addf, i64, i64, i64) -DEF_HELPER_2(subf, i64, i64, i64) -DEF_HELPER_2(mulf, i64, i64, i64) -DEF_HELPER_2(divf, i64, i64, i64) -DEF_HELPER_1(sqrtf, i64, i64) +DEF_HELPER_3(addf, i64, i64, i64, i32) +DEF_HELPER_3(subf, i64, i64, i64, i32) +DEF_HELPER_3(mulf, i64, i64, i64, i32) +DEF_HELPER_3(divf, i64, i64, i64, i32) +DEF_HELPER_2(sqrtf, i64, i64, i32) DEF_HELPER_1(g_to_memory, i64, i64) DEF_HELPER_1(memory_to_g, i64, i64) -DEF_HELPER_2(addg, i64, i64, i64) -DEF_HELPER_2(subg, i64, i64, i64) -DEF_HELPER_2(mulg, i64, i64, i64) -DEF_HELPER_2(divg, i64, i64, i64) -DEF_HELPER_1(sqrtg, i64, i64) +DEF_HELPER_3(addg, i64, i64, i64, i32) +DEF_HELPER_3(subg, i64, i64, i64, i32) +DEF_HELPER_3(mulg, i64, i64, i64, i32) +DEF_HELPER_3(divg, i64, i64, i64, i32) +DEF_HELPER_2(sqrtg, i64, i64, i32) DEF_HELPER_1(s_to_memory, i32, i64) DEF_HELPER_1(memory_to_s, i64, i32) -DEF_HELPER_2(adds, i64, i64, i64) -DEF_HELPER_2(subs, i64, i64, i64) -DEF_HELPER_2(muls, i64, i64, i64) -DEF_HELPER_2(divs, i64, i64, i64) -DEF_HELPER_1(sqrts, i64, i64) - -DEF_HELPER_2(addt, i64, i64, i64) -DEF_HELPER_2(subt, i64, i64, i64) -DEF_HELPER_2(mult, i64, i64, i64) -DEF_HELPER_2(divt, i64, i64, i64) -DEF_HELPER_1(sqrtt, i64, i64) +DEF_HELPER_3(adds, i64, i64, i64, i32) +DEF_HELPER_3(subs, i64, i64, i64, i32) +DEF_HELPER_3(muls, i64, i64, i64, i32) +DEF_HELPER_3(divs, i64, i64, i64, i32) +DEF_HELPER_2(sqrts, i64, i64, i32) + +DEF_HELPER_3(addt, i64, i64, i64, i32) +DEF_HELPER_3(subt, i64, i64, i64, i32) +DEF_HELPER_3(mult, i64, i64, i64, i32) +DEF_HELPER_3(divt, i64, i64, i64, i32) +DEF_HELPER_2(sqrtt, i64, i64, i32) DEF_HELPER_2(cmptun, i64, i64, i64) DEF_HELPER_2(cmpteq, i64, i64, i64) @@ -81,15 +81,15 @@ DEF_HELPER_2(cpys, i64, i64, i64) DEF_HELPER_2(cpysn, i64, i64, i64) DEF_HELPER_2(cpyse, i64, i64, i64) -DEF_HELPER_1(cvtts, i64, i64) -DEF_HELPER_1(cvtst, i64, i64) -DEF_HELPER_1(cvttq, i64, i64) -DEF_HELPER_1(cvtqs, i64, i64) -DEF_HELPER_1(cvtqt, i64, i64) -DEF_HELPER_1(cvtqf, i64, i64) -DEF_HELPER_1(cvtgf, i64, i64) -DEF_HELPER_1(cvtgq, i64, i64) -DEF_HELPER_1(cvtqg, i64, i64) +DEF_HELPER_2(cvtts, i64, i64, i32) +DEF_HELPER_2(cvtst, i64, i64, i32) +DEF_HELPER_2(cvttq, i64, i64, i32) +DEF_HELPER_2(cvtqs, i64, i64, i32) +DEF_HELPER_2(cvtqt, i64, i64, i32) +DEF_HELPER_2(cvtqf, i64, i64, i32) +DEF_HELPER_2(cvtgf, i64, i64, i32) +DEF_HELPER_2(cvtgq, i64, i64, i32) +DEF_HELPER_2(cvtqg, i64, i64, i32) DEF_HELPER_1(cvtlq, i64, i64) DEF_HELPER_1(cvtql, i64, i64) DEF_HELPER_1(cvtqlv, i64, i64) diff --git a/target-alpha/op_helper.c b/target-alpha/op_helper.c index b2abf6c..2d1c3d5 100644 --- a/target-alpha/op_helper.c +++ b/target-alpha/op_helper.c @@ -24,7 +24,7 @@ /*****************************************************************************/ /* Exceptions processing helpers */ -void helper_excp (int excp, int error) +void QEMU_NORETURN helper_excp (int excp, int error) { 
env->exception_index = excp; env->error_code = error; @@ -78,7 +78,7 @@ uint64_t helper_addqv (uint64_t op1, uint64_t op2) uint64_t tmp = op1; op1 += op2; if (unlikely((tmp ^ op2 ^ (-1ULL)) & (tmp ^ op1) & (1ULL << 63))) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return op1; } @@ -88,7 +88,7 @@ uint64_t helper_addlv (uint64_t op1, uint64_t op2) uint64_t tmp = op1; op1 = (uint32_t)(op1 + op2); if (unlikely((tmp ^ op2 ^ (-1UL)) & (tmp ^ op1) & (1UL << 31))) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return op1; } @@ -98,7 +98,7 @@ uint64_t helper_subqv (uint64_t op1, uint64_t op2) uint64_t res; res = op1 - op2; if (unlikely((op1 ^ op2) & (res ^ op1) & (1ULL << 63))) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return res; } @@ -108,7 +108,7 @@ uint64_t helper_sublv (uint64_t op1, uint64_t op2) uint32_t res; res = op1 - op2; if (unlikely((op1 ^ op2) & (res ^ op1) & (1UL << 31))) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return res; } @@ -118,7 +118,7 @@ uint64_t helper_mullv (uint64_t op1, uint64_t op2) int64_t res = (int64_t)op1 * (int64_t)op2; if (unlikely((int32_t)res != res)) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return (int64_t)((int32_t)res); } @@ -130,7 +130,7 @@ uint64_t helper_mulqv (uint64_t op1, uint64_t op2) muls64(&tl, &th, op1, op2); /* If th != 0 && th != -1, then we had an overflow */ if (unlikely((th + 1) > 1)) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return tl; } @@ -370,8 +370,175 @@ uint64_t helper_unpkbw (uint64_t op1) /* Floating point helpers */ +/* ??? Not implemented is setting EXC_MASK, containing a bitmask of + destination registers of instructions that have caused arithmetic + traps. Not needed for userspace emulation, or for complete + emulation of the entire fpu stack within qemu. But we would need + it to invoke a guest kernel's entArith trap handler properly. + + It would be possible to encode the FP destination register in the + QUAL parameter for the FPU helpers below; additional changes would + be required for ADD/V et al above. */ + +#define QUAL_RM_N 0x080 /* Round mode nearest even */ +#define QUAL_RM_C 0x000 /* Round mode chopped */ +#define QUAL_RM_M 0x040 /* Round mode minus infinity */ +#define QUAL_RM_D 0x0c0 /* Round mode dynamic */ +#define QUAL_RM_MASK 0x0c0 + +#define QUAL_U 0x100 /* Underflow enable (fp output) */ +#define QUAL_V 0x100 /* Overflow enable (int output) */ +#define QUAL_S 0x400 /* Software completion enable */ +#define QUAL_I 0x200 /* Inexact detection enable */ + +/* If the floating-point qualifiers specified a rounding mode, + set that rounding mode and remember the original mode for + resetting at the end of the instruction. */ +static inline uint32_t begin_fp_roundmode(uint32_t qual) +{ + uint32_t rm = FP_STATUS.float_rounding_mode, old_rm = rm; + + switch (qual & QUAL_RM_MASK) { + default: + case QUAL_RM_N: + rm = float_round_nearest_even; + break; + case QUAL_RM_C: + rm = float_round_to_zero; + break; + case QUAL_RM_M: + rm = float_round_down; + break; + case QUAL_RM_D: + return old_rm; + } + if (old_rm != rm) + set_float_rounding_mode(rm, &FP_STATUS); + return old_rm; +} + +/* Zero the exception flags so that we can determine if the current + instruction raises any exceptions. 
Save the old acrued exception + status so that we can restore them at the end of the insn. */ +static inline uint32_t begin_fp_exception(void) +{ + uint32_t old_exc = (uint32_t)FP_STATUS.float_exception_flags << 8; + set_float_exception_flags(0, &FP_STATUS); + return old_exc; +} + +static inline uint32_t begin_fp_flush_to_zero(uint32_t quals) +{ + /* If underflow detection is disabled, silently flush to zero. + Note that flush-to-zero mode may already be enabled via the FPCR. */ + if ((quals & QUAL_U) == 0 && !FP_STATUS.flush_to_zero) { + set_flush_to_zero(1, &FP_STATUS); + return 0x10000; + } + return 0; +} + +/* Begin processing an fp operation. Return a token that should be passed + when completing the fp operation. */ +static uint32_t begin_fp(uint32_t quals) +{ + uint32_t ret = 0; + + ret |= begin_fp_roundmode(quals); + ret |= begin_fp_flush_to_zero(quals); + ret |= begin_fp_exception(); + + return ret; +} + +/* End processing an fp operation. */ + +static inline void end_fp_roundmode(uint32_t orig) +{ + uint32_t rm = FP_STATUS.float_rounding_mode, old_rm = orig & 0xff; + if (unlikely(rm != old_rm)) + set_float_rounding_mode(old_rm, &FP_STATUS); +} + +static inline void end_fp_flush_to_zero(uint32_t orig) +{ + if (orig & 0x10000) + set_flush_to_zero(0, &FP_STATUS); +} + +static void end_fp_exception(uint32_t quals, uint32_t orig) +{ + uint8_t exc = FP_STATUS.float_exception_flags; + + /* If inexact detection is disabled, silently clear it. */ + if ((quals & QUAL_I) == 0) + exc &= ~float_flag_inexact; + + orig = (orig >> 8) & 0xff; + set_float_exception_flags(exc | orig, &FP_STATUS); + + /* Raise an exception as required. */ + if (unlikely(exc)) { + if (quals & QUAL_S) + exc &= ~FP_STATUS.float_exception_mask; + if (exc) { + uint32_t hw_exc = 0; + + if (exc & float_flag_invalid) + hw_exc |= EXC_M_INV; + if (exc & float_flag_divbyzero) + hw_exc |= EXC_M_DZE; + if (exc & float_flag_overflow) + hw_exc |= EXC_M_FOV; + if (exc & float_flag_underflow) + hw_exc |= EXC_M_UNF; + if (exc & float_flag_inexact) + hw_exc |= EXC_M_INE; + + helper_excp(EXCP_ARITH, hw_exc); + } + } +} + +static void end_fp(uint32_t quals, uint32_t orig) +{ + end_fp_roundmode(orig); + end_fp_flush_to_zero(orig); + end_fp_exception(quals, orig); +} + +static uint64_t remap_ieee_input(uint32_t quals, uint64_t a) +{ + uint64_t frac; + uint32_t exp; + + exp = (uint32_t)(a >> 52) & 0x7ff; + frac = a & 0xfffffffffffffull; + + if (exp == 0) { + if (frac != 0) { + /* If DNZ is set, flush denormals to zero on input. */ + if (env->fpcr & FPCR_DNZ) + a = a & (1ull << 63); + /* If software completion not enabled, trap. */ + else if ((quals & QUAL_S) == 0) + helper_excp(EXCP_ARITH, EXC_M_UNF); + } + } else if (exp == 0x7ff) { + /* Infinity or NaN. If software completion is not enabled, trap. + If /s is enabled, we'll properly signal for SNaN on output. */ + /* ??? I'm not sure these exception bit flags are correct. I do + know that the Linux kernel, at least, doesn't rely on them and + just emulates the insn to figure out what exception to use. */ + if ((quals & QUAL_S) == 0) + helper_excp(EXCP_ARITH, frac ? 
EXC_M_INV : EXC_M_FOV); + } + + return a; +} + /* F floating (VAX) */ -static inline uint64_t float32_to_f(float32 fa) +static uint64_t float32_to_f(float32 fa) { uint64_t r, exp, mant, sig; CPU_FloatU a; @@ -404,7 +571,7 @@ static inline uint64_t float32_to_f(float32 fa) return r; } -static inline float32 f_to_float32(uint64_t a) +static float32 f_to_float32(uint64_t a) { uint32_t exp, mant_sig; CPU_FloatU r; @@ -447,58 +614,83 @@ uint64_t helper_memory_to_f (uint32_t a) return r; } -uint64_t helper_addf (uint64_t a, uint64_t b) +uint64_t helper_addf (uint64_t a, uint64_t b, uint32_t quals) { float32 fa, fb, fr; + uint32_t token; fa = f_to_float32(a); fb = f_to_float32(b); + + token = begin_fp(quals); fr = float32_add(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float32_to_f(fr); } -uint64_t helper_subf (uint64_t a, uint64_t b) +uint64_t helper_subf (uint64_t a, uint64_t b, uint32_t quals) { float32 fa, fb, fr; + uint32_t token; fa = f_to_float32(a); fb = f_to_float32(b); + + token = begin_fp(quals); fr = float32_sub(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float32_to_f(fr); } -uint64_t helper_mulf (uint64_t a, uint64_t b) +uint64_t helper_mulf (uint64_t a, uint64_t b, uint32_t quals) { float32 fa, fb, fr; + uint32_t token; fa = f_to_float32(a); fb = f_to_float32(b); + + token = begin_fp(quals); fr = float32_mul(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float32_to_f(fr); } -uint64_t helper_divf (uint64_t a, uint64_t b) +uint64_t helper_divf (uint64_t a, uint64_t b, uint32_t quals) { float32 fa, fb, fr; + uint32_t token; fa = f_to_float32(a); fb = f_to_float32(b); + + token = begin_fp(quals); fr = float32_div(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float32_to_f(fr); } -uint64_t helper_sqrtf (uint64_t t) +uint64_t helper_sqrtf (uint64_t t, uint32_t quals) { float32 ft, fr; + uint32_t token; ft = f_to_float32(t); + + token = begin_fp(quals); fr = float32_sqrt(ft, &FP_STATUS); + end_fp(quals, token); + return float32_to_f(fr); } /* G floating (VAX) */ -static inline uint64_t float64_to_g(float64 fa) +static uint64_t float64_to_g(float64 fa) { uint64_t r, exp, mant, sig; CPU_DoubleU a; @@ -531,7 +723,7 @@ static inline uint64_t float64_to_g(float64 fa) return r; } -static inline float64 g_to_float64(uint64_t a) +static float64 g_to_float64(uint64_t a) { uint64_t exp, mant_sig; CPU_DoubleU r; @@ -574,52 +766,77 @@ uint64_t helper_memory_to_g (uint64_t a) return r; } -uint64_t helper_addg (uint64_t a, uint64_t b) +uint64_t helper_addg (uint64_t a, uint64_t b, uint32_t quals) { float64 fa, fb, fr; + uint32_t token; fa = g_to_float64(a); fb = g_to_float64(b); + + token = begin_fp(quals); fr = float64_add(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float64_to_g(fr); } -uint64_t helper_subg (uint64_t a, uint64_t b) +uint64_t helper_subg (uint64_t a, uint64_t b, uint32_t quals) { float64 fa, fb, fr; + uint32_t token; fa = g_to_float64(a); fb = g_to_float64(b); + + token = begin_fp(quals); fr = float64_sub(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float64_to_g(fr); } -uint64_t helper_mulg (uint64_t a, uint64_t b) +uint64_t helper_mulg (uint64_t a, uint64_t b, uint32_t quals) { float64 fa, fb, fr; - + uint32_t token; + fa = g_to_float64(a); fb = g_to_float64(b); + + token = begin_fp(quals); fr = float64_mul(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float64_to_g(fr); } -uint64_t helper_divg (uint64_t a, uint64_t b) +uint64_t helper_divg (uint64_t a, uint64_t b, uint32_t quals) { float64 fa, fb, fr; + uint32_t token; fa = 
g_to_float64(a); fb = g_to_float64(b); + + token = begin_fp(quals); fr = float64_div(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float64_to_g(fr); } -uint64_t helper_sqrtg (uint64_t a) +uint64_t helper_sqrtg (uint64_t a, uint32_t quals) { float64 fa, fr; + uint32_t token; fa = g_to_float64(a); + + token = begin_fp(quals); fr = float64_sqrt(fa, &FP_STATUS); + end_fp(quals, token); + return float64_to_g(fr); } @@ -627,7 +844,7 @@ uint64_t helper_sqrtg (uint64_t a) /* S floating (single) */ /* Taken from linux/arch/alpha/kernel/traps.c, s_mem_to_reg. */ -static inline uint64_t float32_to_s_int(uint32_t fi) +static uint64_t float32_to_s_int(uint32_t fi) { uint32_t frac = fi & 0x7fffff; uint32_t sign = fi >> 31; @@ -649,7 +866,7 @@ static inline uint64_t float32_to_s_int(uint32_t fi) | ((uint64_t)frac << 29)); } -static inline uint64_t float32_to_s(float32 fa) +static uint64_t float32_to_s(float32 fa) { CPU_FloatU a; a.f = fa; @@ -678,52 +895,77 @@ uint64_t helper_memory_to_s (uint32_t a) return float32_to_s_int(a); } -uint64_t helper_adds (uint64_t a, uint64_t b) +static float32 input_s(uint32_t quals, uint64_t a) +{ + return s_to_float32(remap_ieee_input(quals, a)); +} + +uint64_t helper_adds (uint64_t a, uint64_t b, uint32_t quals) { float32 fa, fb, fr; + uint32_t token; - fa = s_to_float32(a); - fb = s_to_float32(b); + token = begin_fp(quals); + fa = input_s(quals, a); + fb = input_s(quals, b); fr = float32_add(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float32_to_s(fr); } -uint64_t helper_subs (uint64_t a, uint64_t b) +uint64_t helper_subs (uint64_t a, uint64_t b, uint32_t quals) { float32 fa, fb, fr; + uint32_t token; - fa = s_to_float32(a); - fb = s_to_float32(b); + token = begin_fp(quals); + fa = input_s(quals, a); + fb = input_s(quals, b); fr = float32_sub(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float32_to_s(fr); } -uint64_t helper_muls (uint64_t a, uint64_t b) +uint64_t helper_muls (uint64_t a, uint64_t b, uint32_t quals) { float32 fa, fb, fr; + uint32_t token; - fa = s_to_float32(a); - fb = s_to_float32(b); + token = begin_fp(quals); + fa = input_s(quals, a); + fb = input_s(quals, b); fr = float32_mul(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float32_to_s(fr); } -uint64_t helper_divs (uint64_t a, uint64_t b) +uint64_t helper_divs (uint64_t a, uint64_t b, uint32_t quals) { float32 fa, fb, fr; + uint32_t token; - fa = s_to_float32(a); - fb = s_to_float32(b); + token = begin_fp(quals); + fa = input_s(quals, a); + fb = input_s(quals, b); fr = float32_div(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float32_to_s(fr); } -uint64_t helper_sqrts (uint64_t a) +uint64_t helper_sqrts (uint64_t a, uint32_t quals) { float32 fa, fr; + uint32_t token; - fa = s_to_float32(a); + token = begin_fp(quals); + fa = input_s(quals, a); fr = float32_sqrt(fa, &FP_STATUS); + end_fp(quals, token); + return float32_to_s(fr); } @@ -745,52 +987,78 @@ static inline uint64_t float64_to_t(float64 fa) return r.ll; } -uint64_t helper_addt (uint64_t a, uint64_t b) +/* Raise any exceptions needed for using F, given the insn qualifiers. 
*/ +static float64 input_t(uint32_t quals, uint64_t a) +{ + return t_to_float64(remap_ieee_input(quals, a)); +} + +uint64_t helper_addt (uint64_t a, uint64_t b, uint32_t quals) { float64 fa, fb, fr; + uint32_t token; - fa = t_to_float64(a); - fb = t_to_float64(b); + token = begin_fp(quals); + fa = input_t(quals, a); + fb = input_t(quals, b); fr = float64_add(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float64_to_t(fr); } -uint64_t helper_subt (uint64_t a, uint64_t b) +uint64_t helper_subt (uint64_t a, uint64_t b, uint32_t quals) { float64 fa, fb, fr; + uint32_t token; - fa = t_to_float64(a); - fb = t_to_float64(b); + token = begin_fp(quals); + fa = input_t(quals, a); + fb = input_t(quals, b); fr = float64_sub(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float64_to_t(fr); } -uint64_t helper_mult (uint64_t a, uint64_t b) +uint64_t helper_mult (uint64_t a, uint64_t b, uint32_t quals) { float64 fa, fb, fr; + uint32_t token; - fa = t_to_float64(a); - fb = t_to_float64(b); + token = begin_fp(quals); + fa = input_t(quals, a); + fb = input_t(quals, b); fr = float64_mul(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float64_to_t(fr); } -uint64_t helper_divt (uint64_t a, uint64_t b) +uint64_t helper_divt (uint64_t a, uint64_t b, uint32_t quals) { float64 fa, fb, fr; + uint32_t token; - fa = t_to_float64(a); - fb = t_to_float64(b); + token = begin_fp(quals); + fa = input_t(quals, a); + fb = input_t(quals, b); fr = float64_div(fa, fb, &FP_STATUS); + end_fp(quals, token); + return float64_to_t(fr); } -uint64_t helper_sqrtt (uint64_t a) +uint64_t helper_sqrtt (uint64_t a, uint32_t quals) { float64 fa, fr; + uint32_t token; - fa = t_to_float64(a); + token = begin_fp(quals); + fa = input_t(quals, a); fr = float64_sqrt(fa, &FP_STATUS); + end_fp(quals, token); + return float64_to_t(fr); } @@ -813,6 +1081,8 @@ uint64_t helper_cpyse(uint64_t a, uint64_t b) /* Comparisons */ +/* ??? Software completion qualifier missing. */ + uint64_t helper_cmptun (uint64_t a, uint64_t b) { float64 fa, fb; @@ -905,70 +1175,218 @@ uint64_t helper_cmpglt(uint64_t a, uint64_t b) } /* Floating point format conversion */ -uint64_t helper_cvtts (uint64_t a) +uint64_t helper_cvtts (uint64_t a, uint32_t quals) { float64 fa; float32 fr; + uint32_t token; - fa = t_to_float64(a); + token = begin_fp(quals); + fa = input_t(quals, a); fr = float64_to_float32(fa, &FP_STATUS); + end_fp(quals, token); + return float32_to_s(fr); } -uint64_t helper_cvtst (uint64_t a) +uint64_t helper_cvtst (uint64_t a, uint32_t quals) { float32 fa; float64 fr; + uint32_t token; - fa = s_to_float32(a); + token = begin_fp(quals); + fa = input_s(quals, a); fr = float32_to_float64(fa, &FP_STATUS); + end_fp(quals, token); + return float64_to_t(fr); } -uint64_t helper_cvtqs (uint64_t a) +uint64_t helper_cvtqs (uint64_t a, uint32_t quals) { - float32 fr = int64_to_float32(a, &FP_STATUS); + float32 fr; + uint32_t token; + + token = begin_fp(quals); + fr = int64_to_float32(a, &FP_STATUS); + end_fp(quals, token); + return float32_to_s(fr); } -uint64_t helper_cvttq (uint64_t a) +/* Implement float64 to uint64 conversion without overflow enabled. + In this mode we must supply the truncated result. This behaviour + is used by the compiler to get unsigned conversion for free with + the same instruction. 
*/ + +static uint64_t cvttq_internal(uint64_t a) { - float64 fa = t_to_float64(a); - return float64_to_int64_round_to_zero(fa, &FP_STATUS); + uint64_t frac, ret = 0; + uint32_t exp, sign, exc = 0; + int shift; + + sign = (a >> 63); + exp = (uint32_t)(a >> 52) & 0x7ff; + frac = a & 0xfffffffffffffull; + + if (exp == 0) { + if (unlikely(frac != 0)) + goto do_underflow; + } else if (exp == 0x7ff) { + if (frac == 0) + exc = float_flag_overflow; + else + exc = float_flag_invalid; + } else { + /* Restore implicit bit. */ + frac |= 0x10000000000000ull; + + /* Note that neither overflow exceptions nor inexact exceptions + are desired. This lets us streamline the checks quite a bit. */ + shift = exp - 1023 - 52; + if (shift >= 0) { + /* In this case the number is so large that we must shift + the fraction left. There is no rounding to do. */ + if (shift < 63) { + ret = frac << shift; + if ((ret >> shift) != frac) + exc = float_flag_overflow; + } + } else { + uint64_t round; + + /* In this case the number is smaller than the fraction as + represented by the 52 bit number. Here we must think + about rounding the result. Handle this by shifting the + fractional part of the number into the high bits of ROUND. + This will let us efficiently handle round-to-nearest. */ + shift = -shift; + if (shift < 63) { + ret = frac >> shift; + round = frac << (64 - shift); + } else { + /* The exponent is so small we shift out everything. + Leave a sticky bit for proper rounding below. */ + do_underflow: + round = 1; + } + + if (round) { + exc = float_flag_inexact; + switch (FP_STATUS.float_rounding_mode) { + case float_round_nearest_even: + if (round == (1ull << 63)) { + /* Fraction is exactly 0.5; round to even. */ + ret += (ret & 1); + } else if (round > (1ull << 63)) { + ret += 1; + } + break; + case float_round_to_zero: + break; + case float_round_up: + if (!sign) + ret += 1; + break; + case float_round_down: + if (sign) + ret += 1; + break; + } + } + } + if (sign) + ret = -ret; + } + if (unlikely(exc)) + float_raise(exc, &FP_STATUS); + + return ret; +} + +uint64_t helper_cvttq (uint64_t a, uint32_t quals) +{ + uint64_t ret; + uint32_t token; + + /* ??? There's an arugument to be made that when /S is enabled, we + should provide the standard IEEE saturated result, instead of + the truncated result that we *must* provide when /V is disabled. + However, that's not how either the Tru64 or Linux completion + handlers actually work, and GCC knows it. 
*/ + + token = begin_fp(quals); + a = remap_ieee_input(quals, a); + ret = cvttq_internal(a); + end_fp(quals, token); + + return ret; } -uint64_t helper_cvtqt (uint64_t a) +uint64_t helper_cvtqt (uint64_t a, uint32_t quals) { - float64 fr = int64_to_float64(a, &FP_STATUS); + float64 fr; + uint32_t token; + + token = begin_fp(quals); + fr = int64_to_float64(a, &FP_STATUS); + end_fp(quals, token); + return float64_to_t(fr); } -uint64_t helper_cvtqf (uint64_t a) +uint64_t helper_cvtqf (uint64_t a, uint32_t quals) { - float32 fr = int64_to_float32(a, &FP_STATUS); + float32 fr; + uint32_t token; + + token = begin_fp(quals); + fr = int64_to_float32(a, &FP_STATUS); + end_fp(quals, token); + return float32_to_f(fr); } -uint64_t helper_cvtgf (uint64_t a) +uint64_t helper_cvtgf (uint64_t a, uint32_t quals) { float64 fa; float32 fr; + uint32_t token; fa = g_to_float64(a); + + token = begin_fp(quals); fr = float64_to_float32(fa, &FP_STATUS); + end_fp(quals, token); + return float32_to_f(fr); } -uint64_t helper_cvtgq (uint64_t a) +uint64_t helper_cvtgq (uint64_t a, uint32_t quals) { - float64 fa = g_to_float64(a); - return float64_to_int64_round_to_zero(fa, &FP_STATUS); + float64 fa; + uint64_t ret; + uint32_t token; + + fa = g_to_float64(a); + + token = begin_fp(quals); + ret = float64_to_int64(fa, &FP_STATUS); + end_fp(quals, token); + + return ret; } -uint64_t helper_cvtqg (uint64_t a) +uint64_t helper_cvtqg (uint64_t a, uint32_t quals) { float64 fr; + uint32_t token; + + token = begin_fp(quals); fr = int64_to_float64(a, &FP_STATUS); + end_fp(quals, token); + return float64_to_g(fr); } @@ -979,35 +1397,24 @@ uint64_t helper_cvtlq (uint64_t a) return (lo & 0x3FFFFFFF) | (hi & 0xc0000000); } -static inline uint64_t __helper_cvtql(uint64_t a, int s, int v) -{ - uint64_t r; - - r = ((uint64_t)(a & 0xC0000000)) << 32; - r |= ((uint64_t)(a & 0x7FFFFFFF)) << 29; - - if (v && (int64_t)((int32_t)r) != (int64_t)r) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); - } - if (s) { - /* TODO */ - } - return r; -} - uint64_t helper_cvtql (uint64_t a) { - return __helper_cvtql(a, 0, 0); + return ((a & 0xC0000000) << 32) | ((a & 0x7FFFFFFF) << 29); } uint64_t helper_cvtqlv (uint64_t a) { - return __helper_cvtql(a, 0, 1); + if ((int32_t)a != (int64_t)a) + helper_excp(EXCP_ARITH, EXC_M_IOV); + return helper_cvtql(a); } uint64_t helper_cvtqlsv (uint64_t a) { - return __helper_cvtql(a, 1, 1); + /* ??? I'm pretty sure there's nothing that /sv needs to do that /v + doesn't do. The only thing I can think is that /sv is a valid + instruction merely for completeness in the ISA. 
*/ + return helper_cvtqlv(a); } /* PALcode support special instructions */ diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 45cb697..e0ca0ed 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -442,81 +442,79 @@ static void gen_fcmov(TCGCond inv_cond, int ra, int rb, int rc) gen_set_label(l1); } -#define FARITH2(name) \ -static inline void glue(gen_f, name)(int rb, int rc) \ -{ \ - if (unlikely(rc == 31)) \ - return; \ - \ - if (rb != 31) \ - gen_helper_ ## name (cpu_fir[rc], cpu_fir[rb]); \ - else { \ - TCGv tmp = tcg_const_i64(0); \ - gen_helper_ ## name (cpu_fir[rc], tmp); \ - tcg_temp_free(tmp); \ - } \ +#define FARITH2(name) \ +static inline void glue(gen_f, name)(int rb, int rc) \ +{ \ + if (unlikely(rc == 31)) \ + return; \ + \ + if (rb != 31) \ + gen_helper_ ## name (cpu_fir[rc], cpu_fir[rb]); \ + else { \ + TCGv tmp = tcg_const_i64(0); \ + gen_helper_ ## name (cpu_fir[rc], tmp); \ + tcg_temp_free(tmp); \ + } \ } -FARITH2(sqrts) -FARITH2(sqrtf) -FARITH2(sqrtg) -FARITH2(sqrtt) -FARITH2(cvtgf) -FARITH2(cvtgq) -FARITH2(cvtqf) -FARITH2(cvtqg) -FARITH2(cvtst) -FARITH2(cvtts) -FARITH2(cvttq) -FARITH2(cvtqs) -FARITH2(cvtqt) FARITH2(cvtlq) FARITH2(cvtql) FARITH2(cvtqlv) FARITH2(cvtqlsv) -#define FARITH3(name) \ -static inline void glue(gen_f, name)(int ra, int rb, int rc) \ -{ \ - if (unlikely(rc == 31)) \ - return; \ - \ - if (ra != 31) { \ - if (rb != 31) \ - gen_helper_ ## name (cpu_fir[rc], cpu_fir[ra], cpu_fir[rb]); \ - else { \ - TCGv tmp = tcg_const_i64(0); \ - gen_helper_ ## name (cpu_fir[rc], cpu_fir[ra], tmp); \ - tcg_temp_free(tmp); \ - } \ - } else { \ - TCGv tmp = tcg_const_i64(0); \ - if (rb != 31) \ - gen_helper_ ## name (cpu_fir[rc], tmp, cpu_fir[rb]); \ - else \ - gen_helper_ ## name (cpu_fir[rc], tmp, tmp); \ - tcg_temp_free(tmp); \ - } \ +#define QFARITH2(name) \ +static inline void glue(gen_f, name)(int rb, int rc, int opc) \ +{ \ + TCGv_i32 quals; \ + if (unlikely(rc == 31)) \ + return; \ + quals = tcg_const_i32(opc & ~0x3f); \ + if (rb != 31) \ + gen_helper_ ## name (cpu_fir[rc], cpu_fir[rb], quals); \ + else { \ + TCGv tmp = tcg_const_i64(0); \ + gen_helper_ ## name (cpu_fir[rc], tmp, quals); \ + tcg_temp_free(tmp); \ + } \ + tcg_temp_free_i32(quals); \ +} +QFARITH2(sqrts) +QFARITH2(sqrtf) +QFARITH2(sqrtg) +QFARITH2(sqrtt) +QFARITH2(cvtgf) +QFARITH2(cvtgq) +QFARITH2(cvtqf) +QFARITH2(cvtqg) +QFARITH2(cvtst) +QFARITH2(cvtts) +QFARITH2(cvttq) +QFARITH2(cvtqs) +QFARITH2(cvtqt) + +#define FARITH3(name) \ +static inline void glue(gen_f, name)(int ra, int rb, int rc) \ +{ \ + TCGv zero, ta, tb; \ + if (unlikely(rc == 31)) \ + return; \ + ta = cpu_fir[ra]; \ + tb = cpu_fir[rb]; \ + if (unlikely(ra == 31)) { \ + zero = tcg_const_i64(0); \ + ta = zero; \ + } \ + if (unlikely(rb == 31)) { \ + if (ra != 31) \ + zero = tcg_const_i64(0); \ + tb = zero; \ + } \ + gen_helper_ ## name (cpu_fir[rc], ta, tb); \ + if (ra == 31 || rb == 31) \ + tcg_temp_free(zero); \ } - -FARITH3(addf) -FARITH3(subf) -FARITH3(mulf) -FARITH3(divf) -FARITH3(addg) -FARITH3(subg) -FARITH3(mulg) -FARITH3(divg) FARITH3(cmpgeq) FARITH3(cmpglt) FARITH3(cmpgle) -FARITH3(adds) -FARITH3(subs) -FARITH3(muls) -FARITH3(divs) -FARITH3(addt) -FARITH3(subt) -FARITH3(mult) -FARITH3(divt) FARITH3(cmptun) FARITH3(cmpteq) FARITH3(cmptlt) @@ -525,6 +523,47 @@ FARITH3(cpys) FARITH3(cpysn) FARITH3(cpyse) +#define QFARITH3(name) \ +static inline void glue(gen_f, name)(int ra, int rb, int rc, int opc) \ +{ \ + TCGv zero, ta, tb; \ + TCGv_i32 quals; \ + if (unlikely(rc == 31)) \ + return; 
\ + ta = cpu_fir[ra]; \ + tb = cpu_fir[rb]; \ + if (unlikely(ra == 31)) { \ + zero = tcg_const_i64(0); \ + ta = zero; \ + } \ + if (unlikely(rb == 31)) { \ + if (ra != 31) \ + zero = tcg_const_i64(0); \ + tb = zero; \ + } \ + quals = tcg_const_i32(opc & ~0x3f); \ + gen_helper_ ## name (cpu_fir[rc], ta, tb, quals); \ + tcg_temp_free_i32(quals); \ + if (ra == 31 || rb == 31) \ + tcg_temp_free(zero); \ +} +QFARITH3(addf) +QFARITH3(subf) +QFARITH3(mulf) +QFARITH3(divf) +QFARITH3(addg) +QFARITH3(subg) +QFARITH3(mulg) +QFARITH3(divg) +QFARITH3(adds) +QFARITH3(subs) +QFARITH3(muls) +QFARITH3(divs) +QFARITH3(addt) +QFARITH3(subt) +QFARITH3(mult) +QFARITH3(divt) + static inline uint64_t zapnot_mask(uint8_t lit) { uint64_t mask = 0; @@ -1607,7 +1646,7 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) } break; case 0x14: - switch (fpfn) { /* f11 & 0x3F */ + switch (fpfn) { /* fn11 & 0x3F */ case 0x04: /* ITOFS */ if (!(ctx->amask & AMASK_FIX)) @@ -1626,13 +1665,13 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) /* SQRTF */ if (!(ctx->amask & AMASK_FIX)) goto invalid_opc; - gen_fsqrtf(rb, rc); + gen_fsqrtf(rb, rc, fn11); break; case 0x0B: /* SQRTS */ if (!(ctx->amask & AMASK_FIX)) goto invalid_opc; - gen_fsqrts(rb, rc); + gen_fsqrts(rb, rc, fn11); break; case 0x14: /* ITOFF */ @@ -1663,13 +1702,13 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) /* SQRTG */ if (!(ctx->amask & AMASK_FIX)) goto invalid_opc; - gen_fsqrtg(rb, rc); + gen_fsqrtg(rb, rc, fn11); break; case 0x02B: /* SQRTT */ if (!(ctx->amask & AMASK_FIX)) goto invalid_opc; - gen_fsqrtt(rb, rc); + gen_fsqrtt(rb, rc, fn11); break; default: goto invalid_opc; @@ -1677,47 +1716,42 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) break; case 0x15: /* VAX floating point */ - /* XXX: rounding mode and trap are ignored (!) */ - switch (fpfn) { /* f11 & 0x3F */ + switch (fpfn) { /* fn11 & 0x3F */ case 0x00: /* ADDF */ - gen_faddf(ra, rb, rc); + gen_faddf(ra, rb, rc, fn11); break; case 0x01: /* SUBF */ - gen_fsubf(ra, rb, rc); + gen_fsubf(ra, rb, rc, fn11); break; case 0x02: /* MULF */ - gen_fmulf(ra, rb, rc); + gen_fmulf(ra, rb, rc, fn11); break; case 0x03: /* DIVF */ - gen_fdivf(ra, rb, rc); + gen_fdivf(ra, rb, rc, fn11); break; case 0x1E: /* CVTDG */ -#if 0 // TODO - gen_fcvtdg(rb, rc); -#else + /* TODO */ goto invalid_opc; -#endif - break; case 0x20: /* ADDG */ - gen_faddg(ra, rb, rc); + gen_faddg(ra, rb, rc, fn11); break; case 0x21: /* SUBG */ - gen_fsubg(ra, rb, rc); + gen_fsubg(ra, rb, rc, fn11); break; case 0x22: /* MULG */ - gen_fmulg(ra, rb, rc); + gen_fmulg(ra, rb, rc, fn11); break; case 0x23: /* DIVG */ - gen_fdivg(ra, rb, rc); + gen_fdivg(ra, rb, rc, fn11); break; case 0x25: /* CMPGEQ */ @@ -1733,27 +1767,23 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) break; case 0x2C: /* CVTGF */ - gen_fcvtgf(rb, rc); + gen_fcvtgf(rb, rc, fn11); break; case 0x2D: /* CVTGD */ -#if 0 // TODO - gen_fcvtgd(rb, rc); -#else + /* TODO */ goto invalid_opc; -#endif - break; case 0x2F: /* CVTGQ */ - gen_fcvtgq(rb, rc); + gen_fcvtgq(rb, rc, fn11); break; case 0x3C: /* CVTQF */ - gen_fcvtqf(rb, rc); + gen_fcvtqf(rb, rc, fn11); break; case 0x3E: /* CVTQG */ - gen_fcvtqg(rb, rc); + gen_fcvtqg(rb, rc, fn11); break; default: goto invalid_opc; @@ -1761,39 +1791,38 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) break; case 0x16: /* IEEE floating-point */ - /* XXX: rounding mode and traps are ignored (!) 
*/ - switch (fpfn) { /* f11 & 0x3F */ + switch (fpfn) { /* fn11 & 0x3F */ case 0x00: /* ADDS */ - gen_fadds(ra, rb, rc); + gen_fadds(ra, rb, rc, fn11); break; case 0x01: /* SUBS */ - gen_fsubs(ra, rb, rc); + gen_fsubs(ra, rb, rc, fn11); break; case 0x02: /* MULS */ - gen_fmuls(ra, rb, rc); + gen_fmuls(ra, rb, rc, fn11); break; case 0x03: /* DIVS */ - gen_fdivs(ra, rb, rc); + gen_fdivs(ra, rb, rc, fn11); break; case 0x20: /* ADDT */ - gen_faddt(ra, rb, rc); + gen_faddt(ra, rb, rc, fn11); break; case 0x21: /* SUBT */ - gen_fsubt(ra, rb, rc); + gen_fsubt(ra, rb, rc, fn11); break; case 0x22: /* MULT */ - gen_fmult(ra, rb, rc); + gen_fmult(ra, rb, rc, fn11); break; case 0x23: /* DIVT */ - gen_fdivt(ra, rb, rc); + gen_fdivt(ra, rb, rc, fn11); break; case 0x24: /* CMPTUN */ @@ -1812,26 +1841,25 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) gen_fcmptle(ra, rb, rc); break; case 0x2C: - /* XXX: incorrect */ if (fn11 == 0x2AC || fn11 == 0x6AC) { /* CVTST */ - gen_fcvtst(rb, rc); + gen_fcvtst(rb, rc, fn11); } else { /* CVTTS */ - gen_fcvtts(rb, rc); + gen_fcvtts(rb, rc, fn11); } break; case 0x2F: /* CVTTQ */ - gen_fcvttq(rb, rc); + gen_fcvttq(rb, rc, fn11); break; case 0x3C: /* CVTQS */ - gen_fcvtqs(rb, rc); + gen_fcvtqs(rb, rc, fn11); break; case 0x3E: /* CVTQT */ - gen_fcvtqt(rb, rc); + gen_fcvtqt(rb, rc, fn11); break; default: goto invalid_opc; ^ permalink raw reply related [flat|nested] 18+ messages in thread
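As a reference for the qualifier handling in the patch above, here is a minimal, self-contained sketch of how the QUAL_* bits carved out of the Alpha fn11 function field select a rounding mode and the trap-enable flags. It is illustrative only: the RM_* enum constants and qual_to_rounding() stand in for softfloat's float_round_* values and are not part of the patch, and the example fn11 value assumes the usual ADDT/SUI encoding.

#include <stdint.h>
#include <stdio.h>

#define QUAL_RM_N    0x080   /* round to nearest even */
#define QUAL_RM_C    0x000   /* chopped (round toward zero) */
#define QUAL_RM_M    0x040   /* round toward minus infinity */
#define QUAL_RM_D    0x0c0   /* dynamic: take the mode from the FPCR */
#define QUAL_RM_MASK 0x0c0

#define QUAL_U       0x100   /* underflow (fp) / overflow (int) enable */
#define QUAL_I       0x200   /* inexact detection enable */
#define QUAL_S       0x400   /* software completion enable */

/* Stand-ins for softfloat's float_round_* constants. */
enum { RM_NEAREST, RM_CHOPPED, RM_MINUS, RM_DYNAMIC };

static int qual_to_rounding(uint32_t fn11)
{
    switch (fn11 & QUAL_RM_MASK) {
    case QUAL_RM_C:
        return RM_CHOPPED;
    case QUAL_RM_M:
        return RM_MINUS;
    case QUAL_RM_D:
        return RM_DYNAMIC;   /* begin_fp_roundmode keeps the FPCR mode */
    default:
        return RM_NEAREST;   /* QUAL_RM_N */
    }
}

int main(void)
{
    uint32_t fn11 = 0x7a0;   /* assumed ADDT/SUI: /S, /U, /I set, nearest rounding */

    printf("rm=%d s=%d u=%d i=%d\n", qual_to_rounding(fn11),
           !!(fn11 & QUAL_S), !!(fn11 & QUAL_U), !!(fn11 & QUAL_I));
    return 0;
}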
* [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 2009-12-18 22:09 ` [Qemu-devel] [patch] target-alpha: squashed fpu qualifiers patch Richard Henderson @ 2010-01-04 22:46 ` Richard Henderson 2009-12-31 19:54 ` [Qemu-devel] [PATCH 1/6] target-alpha: Fix gdb access to fpcr and unique Richard Henderson ` (8 more replies) 0 siblings, 9 replies; 18+ messages in thread From: Richard Henderson @ 2010-01-04 22:46 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues, aurelien I've split up the FPCR as requested by Aurelien. We no longer set anything in FP_STATUS after the execution of the operation; we only copy data from FP_STATUS to the corresponding env->fpcr field. I have totally rewritten the patch to be more along the lines that Laurent was suggesting, in that the rounding mode and other qualifiers are parsed entirely within the translator. I no longer pass the FN11 field to the helper functions. Unlike Laurent's prototype, I do not set the rounding mode at every FP instruction; I remember the previous setting of the rounding mode within a TB. Similarly for the flush-to-zero field. I do not handle VAX instructions at all. The existing VAX support is mostly broken, and I didn't feel like compounding the problem. r~ -- Richard Henderson (6): target-alpha: Fix gdb access to fpcr and unique. target-alpha: Split up FPCR value into separate fields. target-alpha: Reduce internal processor registers for user-mode. target-alpha: Clean up arithmetic traps. target-alpha: Mark helper_excp as NORETURN. target-alpha: Implement IEEE FP qualifiers. ^ permalink raw reply [flat|nested] 18+ messages in thread
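To illustrate the "remember the previous setting of the rounding mode within a TB" point above: the translator only needs to emit code that changes FP_STATUS when the mode requested by the current instruction differs from whatever the translation block last established. A minimal sketch under that assumption; the DisasCtxSketch type and the need_set_roundmode()/init_ctx() names are made up for the example and are not the actual QEMU code.

#include <stdint.h>

#define QUAL_RM_MASK 0x0c0   /* rounding-mode bits of the fn11 field */
#define RM_UNSET     (-1)    /* no mode emitted yet in this TB */

typedef struct {
    int tb_rm;               /* last rounding mode emitted in this TB */
} DisasCtxSketch;

static void init_ctx(DisasCtxSketch *ctx)
{
    ctx->tb_rm = RM_UNSET;   /* force the first FP insn to set a mode */
}

/* Returns 1 when the translator must generate a call that changes the
   FP rounding mode, 0 when the setting left by an earlier instruction
   in the same translation block can simply be reused. */
static int need_set_roundmode(DisasCtxSketch *ctx, uint32_t fn11)
{
    int rm = fn11 & QUAL_RM_MASK;

    if (rm == ctx->tb_rm) {
        return 0;
    }
    ctx->tb_rm = rm;
    return 1;
}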
* [Qemu-devel] [PATCH 1/6] target-alpha: Fix gdb access to fpcr and unique. 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson @ 2009-12-31 19:54 ` Richard Henderson 2009-12-31 20:41 ` [Qemu-devel] [PATCH 2/6] target-alpha: Split up FPCR value into separate fields Richard Henderson ` (7 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Richard Henderson @ 2009-12-31 19:54 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues, aurelien cpu_gdb_read/write_register need to access the fpcr via the cpu_alpha_load/store_fpcr functions. The unique register is number 66 in the gdb remote protocol. Signed-off-by: Richard Henderson <rth@twiddle.net> --- gdbstub.c | 88 +++++++++++++++++++++++++++++++++++++----------------------- 1 files changed, 54 insertions(+), 34 deletions(-) diff --git a/gdbstub.c b/gdbstub.c index 6180171..4877103 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -1307,52 +1307,72 @@ static int cpu_gdb_write_register(CPUState *env, uint8_t *mem_buf, int n) } #elif defined (TARGET_ALPHA) -#define NUM_CORE_REGS 65 +#define NUM_CORE_REGS 67 static int cpu_gdb_read_register(CPUState *env, uint8_t *mem_buf, int n) { - if (n < 31) { - GET_REGL(env->ir[n]); - } - else if (n == 31) { - GET_REGL(0); - } - else if (n<63) { - uint64_t val; + uint64_t val; + CPU_DoubleU d; - val = *((uint64_t *)&env->fir[n-32]); - GET_REGL(val); - } - else if (n==63) { - GET_REGL(env->fpcr); - } - else if (n==64) { - GET_REGL(env->pc); - } - else { - GET_REGL(0); + switch (n) { + case 0 ... 30: + val = env->ir[n]; + break; + case 32 ... 62: + d.d = env->fir[n - 32]; + val = d.ll; + break; + case 63: + val = cpu_alpha_load_fpcr(env); + break; + case 64: + val = env->pc; + break; + case 66: + val = env->unique; + break; + case 31: + case 65: + /* 31 really is the zero register; 65 is unassigned in the + gdb protocol, but is still required to occupy 8 bytes. */ + val = 0; + break; + default: + return 0; } - - return 0; + GET_REGL(val); } static int cpu_gdb_write_register(CPUState *env, uint8_t *mem_buf, int n) { - target_ulong tmp; - tmp = ldtul_p(mem_buf); + target_ulong tmp = ldtul_p(mem_buf); + CPU_DoubleU d; - if (n < 31) { + switch (n) { + case 0 ... 30: env->ir[n] = tmp; + break; + case 32 ... 62: + d.ll = tmp; + env->fir[n - 32] = d.d; + break; + case 63: + cpu_alpha_store_fpcr(env, tmp); + break; + case 64: + env->pc = tmp; + break; + case 66: + env->unique = tmp; + break; + case 31: + case 65: + /* 31 really is the zero register; 65 is unassigned in the + gdb protocol, but is still required to occupy 8 bytes. */ + break; + default: + return 0; } - - if (n > 31 && n < 63) { - env->fir[n - 32] = ldfl_p(mem_buf); - } - - if (n == 64 ) { - env->pc=tmp; - } - return 8; } #elif defined (TARGET_S390X) -- 1.6.5.2 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 2/6] target-alpha: Split up FPCR value into separate fields. 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson 2009-12-31 19:54 ` [Qemu-devel] [PATCH 1/6] target-alpha: Fix gdb access to fpcr and unique Richard Henderson @ 2009-12-31 20:41 ` Richard Henderson 2010-01-04 19:19 ` [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode Richard Henderson ` (6 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Richard Henderson @ 2009-12-31 20:41 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues, aurelien The fpcr_exc_status, fpcr_exc_mask, and fpcr_dyn_round fields are stored in <softfloat.h> format for convenience during regular execution. Revert the addition of float_exception_mask to float_status, added in ba0e276db4b51bd2255a5d5ff8902c70d32ade40. Signed-off-by: Richard Henderson <rth@twiddle.net> --- fpu/softfloat.h | 1 - target-alpha/cpu.h | 24 ++++++-- target-alpha/helper.c | 167 +++++++++++++++++++++++++++++++++---------------- 3 files changed, 131 insertions(+), 61 deletions(-) diff --git a/fpu/softfloat.h b/fpu/softfloat.h index 9d82694..636591b 100644 --- a/fpu/softfloat.h +++ b/fpu/softfloat.h @@ -187,7 +187,6 @@ typedef struct float_status { signed char float_detect_tininess; signed char float_rounding_mode; signed char float_exception_flags; - signed char float_exception_mask; #ifdef FLOATX80 signed char floatx80_rounding_precision; #endif diff --git a/target-alpha/cpu.h b/target-alpha/cpu.h index c0dff4b..4722415 100644 --- a/target-alpha/cpu.h +++ b/target-alpha/cpu.h @@ -145,6 +145,10 @@ enum { #define FPCR_UNFD (1ULL << 61) #define FPCR_UNDZ (1ULL << 60) #define FPCR_DYN_SHIFT 58 +#define FPCR_DYN_CHOPPED (0ULL << FPCR_DYN_SHIFT) +#define FPCR_DYN_MINUS (1ULL << FPCR_DYN_SHIFT) +#define FPCR_DYN_NORMAL (2ULL << FPCR_DYN_SHIFT) +#define FPCR_DYN_PLUS (3ULL << FPCR_DYN_SHIFT) #define FPCR_DYN_MASK (3ULL << FPCR_DYN_SHIFT) #define FPCR_IOV (1ULL << 57) #define FPCR_INE (1ULL << 56) @@ -341,17 +345,27 @@ struct pal_handler_t { struct CPUAlphaState { uint64_t ir[31]; - float64 fir[31]; - float_status fp_status; - uint64_t fpcr; + float64 fir[31]; uint64_t pc; uint64_t lock; uint32_t pcc[2]; uint64_t ipr[IPR_LAST]; uint64_t ps; uint64_t unique; - int saved_mode; /* Used for HW_LD / HW_ST */ - int intr_flag; /* For RC and RS */ + float_status fp_status; + /* The following fields make up the FPCR, but in FP_STATUS format. 
*/ + uint8_t fpcr_exc_status; + uint8_t fpcr_exc_mask; + uint8_t fpcr_dyn_round; + uint8_t fpcr_flush_to_zero; + uint8_t fpcr_dnz; + uint8_t fpcr_dnod; + uint8_t fpcr_undz; + + /* Used for HW_LD / HW_ST */ + uint8_t saved_mode; + /* For RC and RS */ + uint8_t intr_flag; #if TARGET_LONG_BITS > HOST_LONG_BITS /* temporary fixed-point registers diff --git a/target-alpha/helper.c b/target-alpha/helper.c index be7d37b..57830e4 100644 --- a/target-alpha/helper.c +++ b/target-alpha/helper.c @@ -27,79 +27,136 @@ uint64_t cpu_alpha_load_fpcr (CPUState *env) { - uint64_t ret = 0; - int flags, mask; - - flags = env->fp_status.float_exception_flags; - ret |= (uint64_t) flags << 52; - if (flags) - ret |= FPCR_SUM; - env->ipr[IPR_EXC_SUM] &= ~0x3E; - env->ipr[IPR_EXC_SUM] |= flags << 1; - - mask = env->fp_status.float_exception_mask; - if (mask & float_flag_invalid) - ret |= FPCR_INVD; - if (mask & float_flag_divbyzero) - ret |= FPCR_DZED; - if (mask & float_flag_overflow) - ret |= FPCR_OVFD; - if (mask & float_flag_underflow) - ret |= FPCR_UNFD; - if (mask & float_flag_inexact) - ret |= FPCR_INED; - - switch (env->fp_status.float_rounding_mode) { + uint64_t r = 0; + uint8_t t; + + t = env->fpcr_exc_status; + if (t) { + r = FPCR_SUM; + if (t & float_flag_invalid) { + r |= FPCR_INV; + } + if (t & float_flag_divbyzero) { + r |= FPCR_DZE; + } + if (t & float_flag_overflow) { + r |= FPCR_OVF; + } + if (t & float_flag_underflow) { + r |= FPCR_UNF; + } + if (t & float_flag_inexact) { + r |= FPCR_INE; + } + } + + t = env->fpcr_exc_mask; + if (t & float_flag_invalid) { + r |= FPCR_INVD; + } + if (t & float_flag_divbyzero) { + r |= FPCR_DZED; + } + if (t & float_flag_overflow) { + r |= FPCR_OVFD; + } + if (t & float_flag_underflow) { + r |= FPCR_UNFD; + } + if (t & float_flag_inexact) { + r |= FPCR_INED; + } + + switch (env->fpcr_dyn_round) { case float_round_nearest_even: - ret |= 2ULL << FPCR_DYN_SHIFT; + r |= FPCR_DYN_NORMAL; break; case float_round_down: - ret |= 1ULL << FPCR_DYN_SHIFT; + r |= FPCR_DYN_MINUS; break; case float_round_up: - ret |= 3ULL << FPCR_DYN_SHIFT; + r |= FPCR_DYN_PLUS; break; case float_round_to_zero: + r |= FPCR_DYN_CHOPPED; break; } - return ret; + + if (env->fpcr_dnz) { + r |= FPCR_DNZ; + } + if (env->fpcr_dnod) { + r |= FPCR_DNOD; + } + if (env->fpcr_undz) { + r |= FPCR_UNDZ; + } + + return r; } void cpu_alpha_store_fpcr (CPUState *env, uint64_t val) { - int round_mode, mask; + uint8_t t; - set_float_exception_flags((val >> 52) & 0x3F, &env->fp_status); + t = 0; + if (val & FPCR_INV) { + t |= float_flag_invalid; + } + if (val & FPCR_DZE) { + t |= float_flag_divbyzero; + } + if (val & FPCR_OVF) { + t |= float_flag_overflow; + } + if (val & FPCR_UNF) { + t |= float_flag_underflow; + } + if (val & FPCR_INE) { + t |= float_flag_inexact; + } + env->fpcr_exc_status = t; - mask = 0; - if (val & FPCR_INVD) - mask |= float_flag_invalid; - if (val & FPCR_DZED) - mask |= float_flag_divbyzero; - if (val & FPCR_OVFD) - mask |= float_flag_overflow; - if (val & FPCR_UNFD) - mask |= float_flag_underflow; - if (val & FPCR_INED) - mask |= float_flag_inexact; - env->fp_status.float_exception_mask = mask; + t = 0; + if (val & FPCR_INVD) { + t |= float_flag_invalid; + } + if (val & FPCR_DZED) { + t |= float_flag_divbyzero; + } + if (val & FPCR_OVFD) { + t |= float_flag_overflow; + } + if (val & FPCR_UNFD) { + t |= float_flag_underflow; + } + if (val & FPCR_INED) { + t |= float_flag_inexact; + } + env->fpcr_exc_mask = t; - switch ((val >> FPCR_DYN_SHIFT) & 3) { - case 0: - round_mode = 
float_round_to_zero; + switch (val & FPCR_DYN_MASK) { + case FPCR_DYN_CHOPPED: + t = float_round_to_zero; break; - case 1: - round_mode = float_round_down; + case FPCR_DYN_MINUS: + t = float_round_down; break; - case 2: - round_mode = float_round_nearest_even; + case FPCR_DYN_NORMAL: + t = float_round_nearest_even; break; - case 3: - default: /* this avoids a gcc (< 4.4) warning */ - round_mode = float_round_up; + case FPCR_DYN_PLUS: + t = float_round_up; break; } - set_float_rounding_mode(round_mode, &env->fp_status); + env->fpcr_dyn_round = t; + + env->fpcr_flush_to_zero + = (val & (FPCR_UNDZ|FPCR_UNFD)) == (FPCR_UNDZ|FPCR_UNFD); + + env->fpcr_dnz = (val & FPCR_DNZ) != 0; + env->fpcr_dnod = (val & FPCR_DNOD) != 0; + env->fpcr_undz = (val & FPCR_UNDZ) != 0; } #if defined(CONFIG_USER_ONLY) -- 1.6.5.2 ^ permalink raw reply related [flat|nested] 18+ messages in thread
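One subtlety worth highlighting from patch 2/6 above: the new fpcr_flush_to_zero field is set only when the FPCR has both FPCR_UNFD (underflow trap disabled) and FPCR_UNDZ (underflow to zero) set, not either bit alone. Below is a small standalone check of that predicate, using the bit positions from cpu.h; the main() harness is just an illustration, not part of the patch.

#include <stdint.h>
#include <stdio.h>

#define FPCR_UNFD (1ULL << 61)   /* underflow trap disabled */
#define FPCR_UNDZ (1ULL << 60)   /* underflow to zero */

/* Mirrors the rule in cpu_alpha_store_fpcr: flush denormal results to
   zero only when the FPCR asks for both UNFD and UNDZ. */
static int fpcr_flush_to_zero(uint64_t fpcr)
{
    return (fpcr & (FPCR_UNDZ | FPCR_UNFD)) == (FPCR_UNDZ | FPCR_UNFD);
}

int main(void)
{
    printf("%d %d %d\n",
           fpcr_flush_to_zero(FPCR_UNFD),               /* 0 */
           fpcr_flush_to_zero(FPCR_UNDZ),               /* 0 */
           fpcr_flush_to_zero(FPCR_UNFD | FPCR_UNDZ));  /* 1 */
    return 0;
}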
* [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode. 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson 2009-12-31 19:54 ` [Qemu-devel] [PATCH 1/6] target-alpha: Fix gdb access to fpcr and unique Richard Henderson 2009-12-31 20:41 ` [Qemu-devel] [PATCH 2/6] target-alpha: Split up FPCR value into separate fields Richard Henderson @ 2010-01-04 19:19 ` Richard Henderson 2010-01-06 9:55 ` Tristan Gingold 2010-01-04 19:24 ` [Qemu-devel] [PATCH 4/6] target-alpha: Clean up arithmetic traps Richard Henderson ` (5 subsequent siblings) 8 siblings, 1 reply; 18+ messages in thread From: Richard Henderson @ 2010-01-04 19:19 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues, aurelien The existing set of IPRs is totally irrelevant to user-mode emulation. Indeed, they most are irrelevant to implementing kernel-mode emulation, and would only be relevant to PAL-mode emulation, which I suspect that no one will ever attempt. Reducing the set of processor registers reduces the size of the CPU state. Signed-off-by: Richard Henderson <rth@twiddle.net> --- linux-user/main.c | 4 +--- target-alpha/cpu.h | 6 ++++++ target-alpha/translate.c | 45 +++++++++++++++++++++++++++------------------ 3 files changed, 34 insertions(+), 21 deletions(-) diff --git a/linux-user/main.c b/linux-user/main.c index a0d8ce7..91e5009 100644 --- a/linux-user/main.c +++ b/linux-user/main.c @@ -3050,10 +3050,8 @@ int main(int argc, char **argv, char **envp) for(i = 0; i < 28; i++) { env->ir[i] = ((abi_ulong *)regs)[i]; } - env->ipr[IPR_USP] = regs->usp; - env->ir[30] = regs->usp; + env->ir[IR_SP] = regs->usp; env->pc = regs->pc; - env->unique = regs->unique; } #elif defined(TARGET_CRIS) { diff --git a/target-alpha/cpu.h b/target-alpha/cpu.h index 4722415..3728d83 100644 --- a/target-alpha/cpu.h +++ b/target-alpha/cpu.h @@ -193,6 +193,11 @@ enum { /* Internal processor registers */ /* XXX: TOFIX: most of those registers are implementation dependant */ enum { +#if defined(CONFIG_USER_ONLY) + IPR_EXC_ADDR, + IPR_EXC_SUM, + IPR_EXC_MASK, +#else /* Ebox IPRs */ IPR_CC = 0xC0, /* 21264 */ IPR_CC_CTL = 0xC1, /* 21264 */ @@ -306,6 +311,7 @@ enum { IPR_VPTB, IPR_WHAMI, IPR_ALT_MODE, +#endif IPR_LAST, }; diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 87813e7..515c8c7 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -2721,7 +2721,6 @@ static const struct cpu_def_t cpu_defs[] = { CPUAlphaState * cpu_alpha_init (const char *cpu_model) { CPUAlphaState *env; - uint64_t hwpcb; int implver, amask, i, max; env = qemu_mallocz(sizeof(CPUAlphaState)); @@ -2752,24 +2751,34 @@ CPUAlphaState * cpu_alpha_init (const char *cpu_model) | FPCR_UNFD | FPCR_INED | FPCR_DNOD)); #endif pal_init(env); + /* Initialize IPR */ - hwpcb = env->ipr[IPR_PCBB]; - env->ipr[IPR_ASN] = 0; - env->ipr[IPR_ASTEN] = 0; - env->ipr[IPR_ASTSR] = 0; - env->ipr[IPR_DATFX] = 0; - /* XXX: fix this */ - // env->ipr[IPR_ESP] = ldq_raw(hwpcb + 8); - // env->ipr[IPR_KSP] = ldq_raw(hwpcb + 0); - // env->ipr[IPR_SSP] = ldq_raw(hwpcb + 16); - // env->ipr[IPR_USP] = ldq_raw(hwpcb + 24); - env->ipr[IPR_FEN] = 0; - env->ipr[IPR_IPL] = 31; - env->ipr[IPR_MCES] = 0; - env->ipr[IPR_PERFMON] = 0; /* Implementation specific */ - // env->ipr[IPR_PTBR] = ldq_raw(hwpcb + 32); - env->ipr[IPR_SISR] = 0; - env->ipr[IPR_VIRBND] = -1ULL; +#if defined (CONFIG_USER_ONLY) + env->ipr[IPR_EXC_ADDR] = 0; + env->ipr[IPR_EXC_SUM] = 0; + env->ipr[IPR_EXC_MASK] = 0; +#else + { + 
uint64_t hwpcb; + hwpcb = env->ipr[IPR_PCBB]; + env->ipr[IPR_ASN] = 0; + env->ipr[IPR_ASTEN] = 0; + env->ipr[IPR_ASTSR] = 0; + env->ipr[IPR_DATFX] = 0; + /* XXX: fix this */ + // env->ipr[IPR_ESP] = ldq_raw(hwpcb + 8); + // env->ipr[IPR_KSP] = ldq_raw(hwpcb + 0); + // env->ipr[IPR_SSP] = ldq_raw(hwpcb + 16); + // env->ipr[IPR_USP] = ldq_raw(hwpcb + 24); + env->ipr[IPR_FEN] = 0; + env->ipr[IPR_IPL] = 31; + env->ipr[IPR_MCES] = 0; + env->ipr[IPR_PERFMON] = 0; /* Implementation specific */ + // env->ipr[IPR_PTBR] = ldq_raw(hwpcb + 32); + env->ipr[IPR_SISR] = 0; + env->ipr[IPR_VIRBND] = -1ULL; + } +#endif qemu_init_vcpu(env); return env; -- 1.6.5.2 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode. 2010-01-04 19:19 ` [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode Richard Henderson @ 2010-01-06 9:55 ` Tristan Gingold 2010-01-06 16:29 ` Richard Henderson 0 siblings, 1 reply; 18+ messages in thread From: Tristan Gingold @ 2010-01-06 9:55 UTC (permalink / raw) To: Richard Henderson; +Cc: QEMU Developers On Jan 4, 2010, at 8:19 PM, Richard Henderson wrote: > The existing set of IPRs is totally irrelevant to user-mode emulation. > Indeed, they most are irrelevant to implementing kernel-mode emulation, > and would only be relevant to PAL-mode emulation, which I suspect that > no one will ever attempt. Interesting, that's the approach I used to emulate an es40 (i.e. full emulation of the 21264). This had the advantage of being able to use the genuine ROM. Tristan. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode. 2010-01-06 9:55 ` Tristan Gingold @ 2010-01-06 16:29 ` Richard Henderson 2010-01-06 17:04 ` Andreas Färber 0 siblings, 1 reply; 18+ messages in thread From: Richard Henderson @ 2010-01-06 16:29 UTC (permalink / raw) To: Tristan Gingold; +Cc: QEMU Developers On 01/06/2010 01:55 AM, Tristan Gingold wrote: >> The existing set of IPRs is totally irrelevant to user-mode emulation. >> Indeed, they most are irrelevant to implementing kernel-mode emulation, >> and would only be relevant to PAL-mode emulation, which I suspect that >> no one will ever attempt. > > Interesting, that's the approach I used to emulate an es40 (i.e. full emulation of the 21264). > This had the advantage of being able to use the genuine ROM. Heh. Well, far be it from me to dissuade. There's surely room for wanting to emulate at that level. However, since (1) ROMs other than the few supported by MILO are probably not redistributable and (2) I have the sense that generic kernel stuff would go faster emulating the PALcode specification, I think there's also a reason to do it that way too. All that said, there's currently nothing that distinguishes anything other than user-mode and system-mode. I would be delighted to see your es40 patches... r~ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode. 2010-01-06 16:29 ` Richard Henderson @ 2010-01-06 17:04 ` Andreas Färber 2010-01-07 11:54 ` Tristan Gingold 0 siblings, 1 reply; 18+ messages in thread From: Andreas Färber @ 2010-01-06 17:04 UTC (permalink / raw) To: Richard Henderson; +Cc: Tristan Gingold, QEMU Developers On 06.01.2010 at 17:29, Richard Henderson wrote: > since (1) ROMs other than the few supported by MILO are probably not > redistributable Tristan's trick here was to provide a way for the user to extract a non-distributable ROM. I fear that the controversy over whether this should be in qemu-system-alpha or not kept the patch series from being committed. > I would be delighted to see your es40 patches... v4: http://lists.nongnu.org/archive/html/qemu-devel/2009-03/msg01541.html See also http://repo.or.cz/w/qemu/es40.git Since that series added devices necessary for system emulation, I assume that parts of it need to be converted to the new qdev infrastructure. Andreas ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode. 2010-01-06 17:04 ` Andreas Färber @ 2010-01-07 11:54 ` Tristan Gingold 2010-01-07 20:13 ` Andreas Färber 0 siblings, 1 reply; 18+ messages in thread From: Tristan Gingold @ 2010-01-07 11:54 UTC (permalink / raw) To: Andreas Färber; +Cc: QEMU Developers, Richard Henderson On Jan 6, 2010, at 6:04 PM, Andreas Färber wrote: > > On 06.01.2010 at 17:29, Richard Henderson wrote: > >> since (1) ROMs other than the few supported by MILO are probably not redistributable > > Tristan's trick here was to provide a way for the user to extract a non-distributable ROM. I fear that the controversy over whether this should be in qemu-system-alpha or not kept the patch series from being committed. No. This point was not discussed. It is always possible to use another ROM. I didn't continue this work because I hadn't had the time. The ROM is indeed an issue. MILO is a possible solution, but it is not available on the 21264 and can only boot Linux. I am not even sure it is still supported. As I was mostly interested in Tru64 and VMS, I took the SRM way. Note that SRM rescue files are available from the hp.com site (which doesn't mean it is legal), and they were already used in another free es40 emulator. I really don't plan to write a ROM for Alpha, although for performance it can make sense to emulate PAL when the OS is running. >> I would be delighted to see your es40 patches... > > v4: http://lists.nongnu.org/archive/html/qemu-devel/2009-03/msg01541.html > > See also http://repo.or.cz/w/qemu/es40.git > > Since that series added devices necessary for system emulation, I assume that parts of it need to be converted to the new qdev infrastructure. Right. This is not very recent. IIRC, I could reach a bash prompt, but I don't remember which kernel (2.2 or 2.6) I used. SRM was working, as well as AlphaBIOS. The VMS bootloader crashed very early. Maybe your Alpha CPU fix could improve the situation. Tru64 crashed during boot. IOTLB was not supported in qemu. Windows crashed during boot too. Tristan. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode. 2010-01-07 11:54 ` Tristan Gingold @ 2010-01-07 20:13 ` Andreas Färber 0 siblings, 0 replies; 18+ messages in thread From: Andreas Färber @ 2010-01-07 20:13 UTC (permalink / raw) To: Tristan Gingold; +Cc: QEMU Developers, Richard Henderson Am 07.01.2010 um 12:54 schrieb Tristan Gingold: > > On Jan 6, 2010, at 6:04 PM, Andreas Färber wrote: > >> >> Am 06.01.2010 um 17:29 schrieb Richard Henderson: >> >>> since (1) ROMs other than the few supported by MILO are probably >>> not redistributable >> >> Tristan's trick here was to provide a way for the user to extract a >> non-distributable ROM. I fear the controversy of whether this >> should be in qemu-system-alpha or not kept the patch series from >> being committed. > > No. This point was not discussed. I was referring to this thread: http://lists.nongnu.org/archive/html/qemu-devel/2009-03/msg00795.html But I see now that it was discussed on v1, not v4. Anyway, part of the series was actually applied, just not the core es40 system emulation part. Andreas ^ permalink raw reply [flat|nested] 18+ messages in thread
* [Qemu-devel] [PATCH 4/6] target-alpha: Clean up arithmetic traps. 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson ` (2 preceding siblings ...) 2010-01-04 19:19 ` [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode Richard Henderson @ 2010-01-04 19:24 ` Richard Henderson 2010-01-04 19:25 ` [Qemu-devel] [PATCH 5/6] target-alpha: Mark helper_excp as NORETURN Richard Henderson ` (4 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Richard Henderson @ 2010-01-04 19:24 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues, aurelien Replace the EXCP_ARITH_OVERFLOW placeholder with the complete set of bits from the EXC_SUM IPR. Use them in the existing places where we raise arithmetic exceptions. Signed-off-by: Richard Henderson <rth@twiddle.net> --- target-alpha/cpu.h | 10 +++++++--- target-alpha/op_helper.c | 12 ++++++------ 2 files changed, 13 insertions(+), 9 deletions(-) diff --git a/target-alpha/cpu.h b/target-alpha/cpu.h index 3728d83..eda1b4a 100644 --- a/target-alpha/cpu.h +++ b/target-alpha/cpu.h @@ -450,9 +450,13 @@ enum { }; /* Arithmetic exception */ -enum { - EXCP_ARITH_OVERFLOW, -}; +#define EXC_M_IOV (1<<16) /* Integer Overflow */ +#define EXC_M_INE (1<<15) /* Inexact result */ +#define EXC_M_UNF (1<<14) /* Underflow */ +#define EXC_M_FOV (1<<13) /* Overflow */ +#define EXC_M_DZE (1<<12) /* Division by zero */ +#define EXC_M_INV (1<<11) /* Invalid operation */ +#define EXC_M_SWC (1<<10) /* Software completion */ enum { IR_V0 = 0, diff --git a/target-alpha/op_helper.c b/target-alpha/op_helper.c index b2abf6c..48245dd 100644 --- a/target-alpha/op_helper.c +++ b/target-alpha/op_helper.c @@ -78,7 +78,7 @@ uint64_t helper_addqv (uint64_t op1, uint64_t op2) uint64_t tmp = op1; op1 += op2; if (unlikely((tmp ^ op2 ^ (-1ULL)) & (tmp ^ op1) & (1ULL << 63))) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return op1; } @@ -88,7 +88,7 @@ uint64_t helper_addlv (uint64_t op1, uint64_t op2) uint64_t tmp = op1; op1 = (uint32_t)(op1 + op2); if (unlikely((tmp ^ op2 ^ (-1UL)) & (tmp ^ op1) & (1UL << 31))) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return op1; } @@ -98,7 +98,7 @@ uint64_t helper_subqv (uint64_t op1, uint64_t op2) uint64_t res; res = op1 - op2; if (unlikely((op1 ^ op2) & (res ^ op1) & (1ULL << 63))) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return res; } @@ -108,7 +108,7 @@ uint64_t helper_sublv (uint64_t op1, uint64_t op2) uint32_t res; res = op1 - op2; if (unlikely((op1 ^ op2) & (res ^ op1) & (1UL << 31))) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return res; } @@ -118,7 +118,7 @@ uint64_t helper_mullv (uint64_t op1, uint64_t op2) int64_t res = (int64_t)op1 * (int64_t)op2; if (unlikely((int32_t)res != res)) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return (int64_t)((int32_t)res); } @@ -130,7 +130,7 @@ uint64_t helper_mulqv (uint64_t op1, uint64_t op2) muls64(&tl, &th, op1, op2); /* If th != 0 && th != -1, then we had an overflow */ if (unlikely((th + 1) > 1)) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); + helper_excp(EXCP_ARITH, EXC_M_IOV); } return tl; } -- 1.6.5.2 ^ permalink raw reply related [flat|nested] 18+ messages in thread
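The overflow check used by the /V integer helpers above is worth spelling out: signed addition overflows exactly when both operands share a sign and the result does not. A minimal standalone sketch of the same test (not QEMU code; only the EXC_M_IOV value is copied from the patch):

#include <stdint.h>
#include <stdio.h>

#define EXC_M_IOV (1 << 16)   /* Integer Overflow, as defined in cpu.h above */

/* Returns the EXC_SUM bit an ADDQ/V-style add would raise, or 0. */
static int addqv_exc_bits(uint64_t a, uint64_t b, uint64_t *res)
{
    uint64_t r = a + b;
    *res = r;
    /* Bit 63 of (a ^ b ^ -1) is set when a and b have the same sign;
       bit 63 of (a ^ r) is set when the result's sign differs from a's. */
    if ((a ^ b ^ -1ULL) & (a ^ r) & (1ULL << 63)) {
        return EXC_M_IOV;
    }
    return 0;
}

int main(void)
{
    uint64_t r;
    printf("0x%x\n", addqv_exc_bits(INT64_MAX, 1, &r));  /* 0x10000: overflow */
    printf("0x%x\n", addqv_exc_bits(1, 2, &r));          /* 0x0: no overflow  */
    return 0;
}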
* [Qemu-devel] [PATCH 5/6] target-alpha: Mark helper_excp as NORETURN. 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson ` (3 preceding siblings ...) 2010-01-04 19:24 ` [Qemu-devel] [PATCH 4/6] target-alpha: Clean up arithmetic traps Richard Henderson @ 2010-01-04 19:25 ` Richard Henderson 2010-01-04 22:27 ` [Qemu-devel] [PATCH 6/6] target-alpha: Implement IEEE FP qualifiers Richard Henderson ` (3 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Richard Henderson @ 2010-01-04 19:25 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues, aurelien Signed-off-by: Richard Henderson <rth@twiddle.net> --- target-alpha/op_helper.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/target-alpha/op_helper.c b/target-alpha/op_helper.c index 48245dd..a322f12 100644 --- a/target-alpha/op_helper.c +++ b/target-alpha/op_helper.c @@ -24,7 +24,7 @@ /*****************************************************************************/ /* Exceptions processing helpers */ -void helper_excp (int excp, int error) +void QEMU_NORETURN helper_excp (int excp, int error) { env->exception_index = excp; env->error_code = error; -- 1.6.5.2 ^ permalink raw reply related [flat|nested] 18+ messages in thread
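Marking helper_excp as no-return matters mostly to the compiler. A small sketch of the idea, assuming QEMU_NORETURN expands to the usual GCC attribute (the real definition lives in QEMU's headers, and the names below are illustrative only):

#include <stdio.h>
#include <stdlib.h>

#define NORETURN __attribute__((noreturn))   /* assumed stand-in for QEMU_NORETURN */

static void NORETURN raise_exception(int excp)
{
    fprintf(stderr, "exception %d\n", excp);
    exit(1);   /* placeholder; the real helper instead records the exception and leaves the cpu loop */
}

static int checked_div(int a, int b)
{
    if (b == 0) {
        raise_exception(42);
        /* No dead "return 0;" is needed here: the attribute tells the
           compiler control never comes back from raise_exception(). */
    }
    return a / b;
}

int main(void)
{
    printf("%d\n", checked_div(10, 2));
    return 0;
}

Besides letting the optimizer drop unreachable code after such calls, the annotation silences spurious missing-return and may-be-used-uninitialized warnings in callers that end by raising an exception.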
* [Qemu-devel] [PATCH 6/6] target-alpha: Implement IEEE FP qualifiers. 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson ` (4 preceding siblings ...) 2010-01-04 19:25 ` [Qemu-devel] [PATCH 5/6] target-alpha: Mark helper_excp as NORETURN Richard Henderson @ 2010-01-04 22:27 ` Richard Henderson 2010-01-26 16:35 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson ` (2 subsequent siblings) 8 siblings, 0 replies; 18+ messages in thread From: Richard Henderson @ 2010-01-04 22:27 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues, aurelien IEEE FP instructions are split up so that the rounding mode coming from the instruction and exceptions (both masking and delivery) are handled external to the base FP operation. FP exceptions are properly raised for non-finite inputs to instructions that do not indicate software completion. A shortcut is applied if CONFIG_SOFTFLOAT_INLINE is defined at the top of translate.c: data is loaded and stored into FP_STATUS directly instead of using the functional interface defined by "softfloat.h". Signed-off-by: Richard Henderson <rth@twiddle.net> --- target-alpha/helper.h | 21 ++- target-alpha/op_helper.c | 261 +++++++++++++++++++++-- target-alpha/translate.c | 521 ++++++++++++++++++++++++++++++++++++++-------- 3 files changed, 688 insertions(+), 115 deletions(-) diff --git a/target-alpha/helper.h b/target-alpha/helper.h index bedd3c0..79cf375 100644 --- a/target-alpha/helper.h +++ b/target-alpha/helper.h @@ -83,7 +83,6 @@ DEF_HELPER_2(cpyse, i64, i64, i64) DEF_HELPER_1(cvtts, i64, i64) DEF_HELPER_1(cvtst, i64, i64) -DEF_HELPER_1(cvttq, i64, i64) DEF_HELPER_1(cvtqs, i64, i64) DEF_HELPER_1(cvtqt, i64, i64) DEF_HELPER_1(cvtqf, i64, i64) @@ -91,9 +90,25 @@ DEF_HELPER_1(cvtgf, i64, i64) DEF_HELPER_1(cvtgq, i64, i64) DEF_HELPER_1(cvtqg, i64, i64) DEF_HELPER_1(cvtlq, i64, i64) + +DEF_HELPER_1(cvttq, i64, i64) +DEF_HELPER_1(cvttq_c, i64, i64) +DEF_HELPER_1(cvttq_svic, i64, i64) + DEF_HELPER_1(cvtql, i64, i64) -DEF_HELPER_1(cvtqlv, i64, i64) -DEF_HELPER_1(cvtqlsv, i64, i64) +DEF_HELPER_1(cvtql_v, i64, i64) +DEF_HELPER_1(cvtql_sv, i64, i64) + +DEF_HELPER_1(setroundmode, void, i32) +DEF_HELPER_1(setflushzero, void, i32) +DEF_HELPER_0(fp_exc_clear, void) +DEF_HELPER_0(fp_exc_get, i32) +DEF_HELPER_2(fp_exc_raise, void, i32, i32) +DEF_HELPER_2(fp_exc_raise_s, void, i32, i32) + +DEF_HELPER_1(ieee_input, i64, i64) +DEF_HELPER_1(ieee_input_cmp, i64, i64) +DEF_HELPER_1(ieee_input_s, i64, i64) #if !defined (CONFIG_USER_ONLY) DEF_HELPER_0(hw_rei, void) diff --git a/target-alpha/op_helper.c b/target-alpha/op_helper.c index a322f12..7b1a869 100644 --- a/target-alpha/op_helper.c +++ b/target-alpha/op_helper.c @@ -370,6 +370,130 @@ uint64_t helper_unpkbw (uint64_t op1) /* Floating point helpers */ +void helper_setroundmode (uint32_t val) +{ + set_float_rounding_mode(val, &FP_STATUS); +} + +void helper_setflushzero (uint32_t val) +{ + set_flush_to_zero(val, &FP_STATUS); +} + +void helper_fp_exc_clear (void) +{ + set_float_exception_flags(0, &FP_STATUS); +} + +uint32_t helper_fp_exc_get (void) +{ + return get_float_exception_flags(&FP_STATUS); +} + +/* Raise exceptions for ieee fp insns without software completion. + In that case there are no exceptions that don't trap; the mask + doesn't apply. 
*/ +void helper_fp_exc_raise(uint32_t exc, uint32_t regno) +{ + if (exc) { + uint32_t hw_exc = 0; + + env->ipr[IPR_EXC_MASK] |= 1ull << regno; + + if (exc & float_flag_invalid) { + hw_exc |= EXC_M_INV; + } + if (exc & float_flag_divbyzero) { + hw_exc |= EXC_M_DZE; + } + if (exc & float_flag_overflow) { + hw_exc |= EXC_M_FOV; + } + if (exc & float_flag_underflow) { + hw_exc |= EXC_M_UNF; + } + if (exc & float_flag_inexact) { + hw_exc |= EXC_M_INE; + } + helper_excp(EXCP_ARITH, hw_exc); + } +} + +/* Raise exceptions for ieee fp insns with software completion. */ +void helper_fp_exc_raise_s(uint32_t exc, uint32_t regno) +{ + if (exc) { + env->fpcr_exc_status |= exc; + + exc &= ~env->fpcr_exc_mask; + if (exc) { + helper_fp_exc_raise(exc, regno); + } + } +} + +/* Input remapping without software completion. Handle denormal-map-to-zero + and trap for all other non-finite numbers. */ +uint64_t helper_ieee_input(uint64_t val) +{ + uint32_t exp = (uint32_t)(val >> 52) & 0x7ff; + uint64_t frac = val & 0xfffffffffffffull; + + if (exp == 0) { + if (frac != 0) { + /* If DNZ is set flush denormals to zero on input. */ + if (env->fpcr_dnz) { + val &= 1ull << 63; + } else { + helper_excp(EXCP_ARITH, EXC_M_UNF); + } + } + } else if (exp == 0x7ff) { + /* Infinity or NaN. */ + /* ??? I'm not sure these exception bit flags are correct. I do + know that the Linux kernel, at least, doesn't rely on them and + just emulates the insn to figure out what exception to use. */ + helper_excp(EXCP_ARITH, frac ? EXC_M_INV : EXC_M_FOV); + } + return val; +} + +/* Similar, but does not trap for infinities. Used for comparisons. */ +uint64_t helper_ieee_input_cmp(uint64_t val) +{ + uint32_t exp = (uint32_t)(val >> 52) & 0x7ff; + uint64_t frac = val & 0xfffffffffffffull; + + if (exp == 0) { + if (frac != 0) { + /* If DNZ is set flush denormals to zero on input. */ + if (env->fpcr_dnz) { + val &= 1ull << 63; + } else { + helper_excp(EXCP_ARITH, EXC_M_UNF); + } + } + } else if (exp == 0x7ff && frac) { + /* NaN. */ + helper_excp(EXCP_ARITH, EXC_M_INV); + } + return val; +} + +/* Input remapping with software completion enabled. All we have to do + is handle denormal-map-to-zero; all other inputs get exceptions as + needed from the actual operation. */ +uint64_t helper_ieee_input_s(uint64_t val) +{ + if (env->fpcr_dnz) { + uint32_t exp = (uint32_t)(val >> 52) & 0x7ff; + if (exp == 0) { + val &= 1ull << 63; + } + } + return val; +} + /* F floating (VAX) */ static inline uint64_t float32_to_f(float32 fa) { @@ -447,6 +571,9 @@ uint64_t helper_memory_to_f (uint32_t a) return r; } +/* ??? Emulating VAX arithmetic with IEEE arithmetic is wrong. We should + either implement VAX arithmetic properly or just signal invalid opcode. */ + uint64_t helper_addf (uint64_t a, uint64_t b) { float32 fa, fb, fr; @@ -931,10 +1058,107 @@ uint64_t helper_cvtqs (uint64_t a) return float32_to_s(fr); } -uint64_t helper_cvttq (uint64_t a) +/* Implement float64 to uint64 conversion without saturation -- we must + supply the truncated result. This behaviour is used by the compiler + to get unsigned conversion for free with the same instruction. + + The VI flag is set when overflow or inexact exceptions should be raised. 
*/ + +static inline uint64_t helper_cvttq_internal(uint64_t a, int roundmode, int VI) { - float64 fa = t_to_float64(a); - return float64_to_int64_round_to_zero(fa, &FP_STATUS); + uint64_t frac, ret = 0; + uint32_t exp, sign, exc = 0; + int shift; + + sign = (a >> 63); + exp = (uint32_t)(a >> 52) & 0x7ff; + frac = a & 0xfffffffffffffull; + + if (exp == 0) { + if (unlikely(frac != 0)) { + goto do_underflow; + } + } else if (exp == 0x7ff) { + exc = (frac ? float_flag_invalid : VI ? float_flag_overflow : 0); + } else { + /* Restore implicit bit. */ + frac |= 0x10000000000000ull; + + shift = exp - 1023 - 52; + if (shift >= 0) { + /* In this case the number is so large that we must shift + the fraction left. There is no rounding to do. */ + if (shift < 63) { + ret = frac << shift; + if (VI && (ret >> shift) != frac) { + exc = float_flag_overflow; + } + } + } else { + uint64_t round; + + /* In this case the number is smaller than the fraction as + represented by the 52 bit number. Here we must think + about rounding the result. Handle this by shifting the + fractional part of the number into the high bits of ROUND. + This will let us efficiently handle round-to-nearest. */ + shift = -shift; + if (shift < 63) { + ret = frac >> shift; + round = frac << (64 - shift); + } else { + /* The exponent is so small we shift out everything. + Leave a sticky bit for proper rounding below. */ + do_underflow: + round = 1; + } + + if (round) { + exc = (VI ? float_flag_inexact : 0); + switch (roundmode) { + case float_round_nearest_even: + if (round == (1ull << 63)) { + /* Fraction is exactly 0.5; round to even. */ + ret += (ret & 1); + } else if (round > (1ull << 63)) { + ret += 1; + } + break; + case float_round_to_zero: + break; + case float_round_up: + ret += 1 - sign; + break; + case float_round_down: + ret += sign; + break; + } + } + } + if (sign) { + ret = -ret; + } + } + if (unlikely(exc)) { + float_raise(exc, &FP_STATUS); + } + + return ret; +} + +uint64_t helper_cvttq(uint64_t a) +{ + return helper_cvttq_internal(a, FP_STATUS.float_rounding_mode, 1); +} + +uint64_t helper_cvttq_c(uint64_t a) +{ + return helper_cvttq_internal(a, float_round_to_zero, 0); +} + +uint64_t helper_cvttq_svic(uint64_t a) +{ + return helper_cvttq_internal(a, float_round_to_zero, 1); } uint64_t helper_cvtqt (uint64_t a) @@ -979,35 +1203,24 @@ uint64_t helper_cvtlq (uint64_t a) return (lo & 0x3FFFFFFF) | (hi & 0xc0000000); } -static inline uint64_t __helper_cvtql(uint64_t a, int s, int v) -{ - uint64_t r; - - r = ((uint64_t)(a & 0xC0000000)) << 32; - r |= ((uint64_t)(a & 0x7FFFFFFF)) << 29; - - if (v && (int64_t)((int32_t)r) != (int64_t)r) { - helper_excp(EXCP_ARITH, EXCP_ARITH_OVERFLOW); - } - if (s) { - /* TODO */ - } - return r; -} - uint64_t helper_cvtql (uint64_t a) { - return __helper_cvtql(a, 0, 0); + return ((a & 0xC0000000) << 32) | ((a & 0x7FFFFFFF) << 29); } -uint64_t helper_cvtqlv (uint64_t a) +uint64_t helper_cvtql_v (uint64_t a) { - return __helper_cvtql(a, 0, 1); + if ((int32_t)a != (int64_t)a) + helper_excp(EXCP_ARITH, EXC_M_IOV); + return helper_cvtql(a); } -uint64_t helper_cvtqlsv (uint64_t a) +uint64_t helper_cvtql_sv (uint64_t a) { - return __helper_cvtql(a, 1, 1); + /* ??? I'm pretty sure there's nothing that /sv needs to do that /v + doesn't do. The only thing I can think is that /sv is a valid + instruction merely for completeness in the ISA. 
*/ + return helper_cvtql_v(a); } /* PALcode support special instructions */ diff --git a/target-alpha/translate.c b/target-alpha/translate.c index 515c8c7..a11e5ed 100644 --- a/target-alpha/translate.c +++ b/target-alpha/translate.c @@ -33,6 +33,7 @@ #include "helper.h" #undef ALPHA_DEBUG_DISAS +#define CONFIG_SOFTFLOAT_INLINE #ifdef ALPHA_DEBUG_DISAS # define LOG_DISAS(...) qemu_log_mask(CPU_LOG_TB_IN_ASM, ## __VA_ARGS__) @@ -49,6 +50,11 @@ struct DisasContext { #endif CPUAlphaState *env; uint32_t amask; + + /* Current rounding mode for this TB. */ + int tb_rm; + /* Current flush-to-zero setting for this TB. */ + int tb_ftz; }; /* global register indexes */ @@ -442,62 +448,333 @@ static void gen_fcmov(TCGCond inv_cond, int ra, int rb, int rc) gen_set_label(l1); } -#define FARITH2(name) \ -static inline void glue(gen_f, name)(int rb, int rc) \ -{ \ - if (unlikely(rc == 31)) \ - return; \ - \ - if (rb != 31) \ - gen_helper_ ## name (cpu_fir[rc], cpu_fir[rb]); \ - else { \ - TCGv tmp = tcg_const_i64(0); \ - gen_helper_ ## name (cpu_fir[rc], tmp); \ - tcg_temp_free(tmp); \ - } \ +#define QUAL_RM_N 0x080 /* Round mode nearest even */ +#define QUAL_RM_C 0x000 /* Round mode chopped */ +#define QUAL_RM_M 0x040 /* Round mode minus infinity */ +#define QUAL_RM_D 0x0c0 /* Round mode dynamic */ +#define QUAL_RM_MASK 0x0c0 + +#define QUAL_U 0x100 /* Underflow enable (fp output) */ +#define QUAL_V 0x100 /* Overflow enable (int output) */ +#define QUAL_S 0x400 /* Software completion enable */ +#define QUAL_I 0x200 /* Inexact detection enable */ + +static void gen_qual_roundmode(DisasContext *ctx, int fn11) +{ + TCGv_i32 tmp; + + fn11 &= QUAL_RM_MASK; + if (fn11 == ctx->tb_rm) { + return; + } + ctx->tb_rm = fn11; + + tmp = tcg_temp_new_i32(); + switch (fn11) { + case QUAL_RM_N: + tcg_gen_movi_i32(tmp, float_round_nearest_even); + break; + case QUAL_RM_C: + tcg_gen_movi_i32(tmp, float_round_to_zero); + break; + case QUAL_RM_M: + tcg_gen_movi_i32(tmp, float_round_down); + break; + case QUAL_RM_D: + tcg_gen_ld8u_i32(tmp, cpu_env, offsetof(CPUState, fpcr_dyn_round)); + break; + } + +#if defined(CONFIG_SOFTFLOAT_INLINE) + /* ??? The "softfloat.h" interface is to call set_float_rounding_mode. + With CONFIG_SOFTFLOAT that expands to an out-of-line call that just + sets the one field. */ + tcg_gen_st8_i32(tmp, cpu_env, + offsetof(CPUState, fp_status.float_rounding_mode)); +#else + gen_helper_setroundmode(tmp); +#endif + + tcg_temp_free_i32(tmp); +} + +static void gen_qual_flushzero(DisasContext *ctx, int fn11) +{ + TCGv_i32 tmp; + + fn11 &= QUAL_U; + if (fn11 == ctx->tb_ftz) { + return; + } + ctx->tb_ftz = fn11; + + tmp = tcg_temp_new_i32(); + if (fn11) { + /* Underflow is enabled, use the FPCR setting. */ + tcg_gen_ld8u_i32(tmp, cpu_env, offsetof(CPUState, fpcr_flush_to_zero)); + } else { + /* Underflow is disabled, force flush-to-zero. 
*/ + tcg_gen_movi_i32(tmp, 1); + } + +#if defined(CONFIG_SOFTFLOAT_INLINE) + tcg_gen_st8_i32(tmp, cpu_env, + offsetof(CPUState, fp_status.flush_to_zero)); +#else + gen_helper_setflushzero(tmp); +#endif + + tcg_temp_free_i32(tmp); +} + +static TCGv gen_ieee_input(int reg, int fn11, int is_cmp) +{ + TCGv val = tcg_temp_new(); + if (reg == 31) { + tcg_gen_movi_i64(val, 0); + } else if (fn11 & QUAL_S) { + gen_helper_ieee_input_s(val, cpu_fir[reg]); + } else if (is_cmp) { + gen_helper_ieee_input_cmp(val, cpu_fir[reg]); + } else { + gen_helper_ieee_input(val, cpu_fir[reg]); + } + return val; +} + +static void gen_fp_exc_clear(void) +{ +#if defined(CONFIG_SOFTFLOAT_INLINE) + TCGv_i32 zero = tcg_const_i32(0); + tcg_gen_st8_i32(zero, cpu_env, + offsetof(CPUState, fp_status.float_exception_flags)); + tcg_temp_free_i32(zero); +#else + gen_helper_fp_exc_clear(); +#endif +} + +static void gen_fp_exc_raise_ignore(int rc, int fn11, int ignore) +{ + /* ??? We ought to be able to do something with imprecise exceptions. + E.g. notice we're still in the trap shadow of something within the + TB and do not generate the code to signal the exception; end the TB + when an exception is forced to arrive, either by consumption of a + register value or TRAPB or EXCB. */ + TCGv_i32 exc = tcg_temp_new_i32(); + TCGv_i32 reg; + +#if defined(CONFIG_SOFTFLOAT_INLINE) + tcg_gen_ld8u_i32(exc, cpu_env, + offsetof(CPUState, fp_status.float_exception_flags)); +#else + gen_helper_fp_exc_get(exc); +#endif + + if (ignore) { + tcg_gen_andi_i32(exc, exc, ~ignore); + } + + /* ??? Pass in the regno of the destination so that the helper can + set EXC_MASK, which contains a bitmask of destination registers + that have caused arithmetic traps. A simple userspace emulation + does not require this. We do need it for a guest kernel's entArith, + or if we were to do something clever with imprecise exceptions. */ + reg = tcg_const_i32(rc + 32); + + if (fn11 & QUAL_S) { + gen_helper_fp_exc_raise_s(exc, reg); + } else { + gen_helper_fp_exc_raise(exc, reg); + } + + tcg_temp_free_i32(reg); + tcg_temp_free_i32(exc); +} + +static inline void gen_fp_exc_raise(int rc, int fn11) +{ + gen_fp_exc_raise_ignore(rc, fn11, fn11 & QUAL_I ? 0 : float_flag_inexact); } -FARITH2(sqrts) + +#define FARITH2(name) \ +static inline void glue(gen_f, name)(int rb, int rc) \ +{ \ + if (unlikely(rc == 31)) { \ + return; \ + } \ + if (rb != 31) { \ + gen_helper_ ## name (cpu_fir[rc], cpu_fir[rb]); \ + } else { \ + TCGv tmp = tcg_const_i64(0); \ + gen_helper_ ## name (cpu_fir[rc], tmp); \ + tcg_temp_free(tmp); \ + } \ +} +FARITH2(cvtlq) +FARITH2(cvtql) +FARITH2(cvtql_v) +FARITH2(cvtql_sv) + +/* ??? VAX instruction qualifiers ignored. 
*/ FARITH2(sqrtf) FARITH2(sqrtg) -FARITH2(sqrtt) FARITH2(cvtgf) FARITH2(cvtgq) FARITH2(cvtqf) FARITH2(cvtqg) -FARITH2(cvtst) -FARITH2(cvtts) -FARITH2(cvttq) -FARITH2(cvtqs) -FARITH2(cvtqt) -FARITH2(cvtlq) -FARITH2(cvtql) -FARITH2(cvtqlv) -FARITH2(cvtqlsv) - -#define FARITH3(name) \ -static inline void glue(gen_f, name)(int ra, int rb, int rc) \ -{ \ - if (unlikely(rc == 31)) \ - return; \ - \ - if (ra != 31) { \ - if (rb != 31) \ - gen_helper_ ## name (cpu_fir[rc], cpu_fir[ra], cpu_fir[rb]); \ - else { \ - TCGv tmp = tcg_const_i64(0); \ - gen_helper_ ## name (cpu_fir[rc], cpu_fir[ra], tmp); \ - tcg_temp_free(tmp); \ - } \ - } else { \ - TCGv tmp = tcg_const_i64(0); \ - if (rb != 31) \ - gen_helper_ ## name (cpu_fir[rc], tmp, cpu_fir[rb]); \ - else \ - gen_helper_ ## name (cpu_fir[rc], tmp, tmp); \ - tcg_temp_free(tmp); \ - } \ + +static void gen_ieee_arith2(DisasContext *ctx, void (*helper)(TCGv, TCGv), + int rb, int rc, int fn11) +{ + TCGv vb; + + /* ??? This is wrong: the instruction is not a nop, it still may + raise exceptions. */ + if (unlikely(rc == 31)) { + return; + } + + gen_qual_roundmode(ctx, fn11); + gen_qual_flushzero(ctx, fn11); + gen_fp_exc_clear(); + + vb = gen_ieee_input(rb, fn11, 0); + helper(cpu_fir[rc], vb); + tcg_temp_free(vb); + + gen_fp_exc_raise(rc, fn11); +} + +#define IEEE_ARITH2(name) \ +static inline void glue(gen_f, name)(DisasContext *ctx, \ + int rb, int rc, int fn11) \ +{ \ + gen_ieee_arith2(ctx, gen_helper_##name, rb, rc, fn11); \ +} +IEEE_ARITH2(sqrts) +IEEE_ARITH2(sqrtt) +IEEE_ARITH2(cvtst) +IEEE_ARITH2(cvtts) + +static void gen_fcvttq(DisasContext *ctx, int rb, int rc, int fn11) +{ + TCGv vb; + int ignore = 0; + + /* ??? This is wrong: the instruction is not a nop, it still may + raise exceptions. */ + if (unlikely(rc == 31)) { + return; + } + + /* No need to set flushzero, since we have an integer output. */ + gen_fp_exc_clear(); + vb = gen_ieee_input(rb, fn11, 0); + + /* Almost all integer conversions use cropped rounding, and most + also do not have integer overflow enabled. Special case that. */ + switch (fn11) { + case QUAL_RM_C: + gen_helper_cvttq_c(cpu_fir[rc], vb); + break; + case QUAL_V | QUAL_RM_C: + case QUAL_S | QUAL_V | QUAL_RM_C: + ignore = float_flag_inexact; + /* FALLTHRU */ + case QUAL_S | QUAL_V | QUAL_I | QUAL_RM_C: + gen_helper_cvttq_svic(cpu_fir[rc], vb); + break; + default: + gen_qual_roundmode(ctx, fn11); + gen_helper_cvttq(cpu_fir[rc], vb); + ignore |= (fn11 & QUAL_V ? 0 : float_flag_overflow); + ignore |= (fn11 & QUAL_I ? 0 : float_flag_inexact); + break; + } + tcg_temp_free(vb); + + gen_fp_exc_raise_ignore(rc, fn11, ignore); } +static void gen_ieee_intcvt(DisasContext *ctx, void (*helper)(TCGv, TCGv), + int rb, int rc, int fn11) +{ + TCGv vb; + + /* ??? This is wrong: the instruction is not a nop, it still may + raise exceptions. */ + if (unlikely(rc == 31)) { + return; + } + + gen_qual_roundmode(ctx, fn11); + + if (rb == 31) { + vb = tcg_const_i64(0); + } else { + vb = cpu_fir[rb]; + } + + /* The only exception that can be raised by integer conversion + is inexact. Thus we only need to worry about exceptions when + inexact handling is requested. 
*/ + if (fn11 & QUAL_I) { + gen_fp_exc_clear(); + helper(cpu_fir[rc], vb); + gen_fp_exc_raise(rc, fn11); + } else { + helper(cpu_fir[rc], vb); + } + + if (rb == 31) { + tcg_temp_free(vb); + } +} + +#define IEEE_INTCVT(name) \ +static inline void glue(gen_f, name)(DisasContext *ctx, \ + int rb, int rc, int fn11) \ +{ \ + gen_ieee_intcvt(ctx, gen_helper_##name, rb, rc, fn11); \ +} +IEEE_INTCVT(cvtqs) +IEEE_INTCVT(cvtqt) + +#define FARITH3(name) \ +static inline void glue(gen_f, name)(int ra, int rb, int rc) \ +{ \ + TCGv va, vb; \ + \ + if (unlikely(rc == 31)) { \ + return; \ + } \ + if (ra == 31) { \ + va = tcg_const_i64(0); \ + } else { \ + va = cpu_fir[ra]; \ + } \ + if (rb == 31) { \ + vb = tcg_const_i64(0); \ + } else { \ + vb = cpu_fir[rb]; \ + } \ + \ + gen_helper_ ## name (cpu_fir[rc], va, vb); \ + \ + if (ra == 31) { \ + tcg_temp_free(va); \ + } \ + if (rb == 31) { \ + tcg_temp_free(vb); \ + } \ +} +/* ??? Ought to expand these inline; simple masking operations. */ +FARITH3(cpys) +FARITH3(cpysn) +FARITH3(cpyse) + +/* ??? VAX instruction qualifiers ignored. */ FARITH3(addf) FARITH3(subf) FARITH3(mulf) @@ -509,21 +786,80 @@ FARITH3(divg) FARITH3(cmpgeq) FARITH3(cmpglt) FARITH3(cmpgle) -FARITH3(adds) -FARITH3(subs) -FARITH3(muls) -FARITH3(divs) -FARITH3(addt) -FARITH3(subt) -FARITH3(mult) -FARITH3(divt) -FARITH3(cmptun) -FARITH3(cmpteq) -FARITH3(cmptlt) -FARITH3(cmptle) -FARITH3(cpys) -FARITH3(cpysn) -FARITH3(cpyse) + +static void gen_ieee_arith3(DisasContext *ctx, + void (*helper)(TCGv, TCGv, TCGv), + int ra, int rb, int rc, int fn11) +{ + TCGv va, vb; + + /* ??? This is wrong: the instruction is not a nop, it still may + raise exceptions. */ + if (unlikely(rc == 31)) { + return; + } + + gen_qual_roundmode(ctx, fn11); + gen_qual_flushzero(ctx, fn11); + gen_fp_exc_clear(); + + va = gen_ieee_input(ra, fn11, 0); + vb = gen_ieee_input(rb, fn11, 0); + helper(cpu_fir[rc], va, vb); + tcg_temp_free(va); + tcg_temp_free(vb); + + gen_fp_exc_raise(rc, fn11); +} + +#define IEEE_ARITH3(name) \ +static inline void glue(gen_f, name)(DisasContext *ctx, \ + int ra, int rb, int rc, int fn11) \ +{ \ + gen_ieee_arith3(ctx, gen_helper_##name, ra, rb, rc, fn11); \ +} +IEEE_ARITH3(adds) +IEEE_ARITH3(subs) +IEEE_ARITH3(muls) +IEEE_ARITH3(divs) +IEEE_ARITH3(addt) +IEEE_ARITH3(subt) +IEEE_ARITH3(mult) +IEEE_ARITH3(divt) + +static void gen_ieee_compare(DisasContext *ctx, + void (*helper)(TCGv, TCGv, TCGv), + int ra, int rb, int rc, int fn11) +{ + TCGv va, vb; + + /* ??? This is wrong: the instruction is not a nop, it still may + raise exceptions. 
*/ + if (unlikely(rc == 31)) { + return; + } + + gen_fp_exc_clear(); + + va = gen_ieee_input(ra, fn11, 1); + vb = gen_ieee_input(rb, fn11, 1); + helper(cpu_fir[rc], va, vb); + tcg_temp_free(va); + tcg_temp_free(vb); + + gen_fp_exc_raise(rc, fn11); +} + +#define IEEE_CMP3(name) \ +static inline void glue(gen_f, name)(DisasContext *ctx, \ + int ra, int rb, int rc, int fn11) \ +{ \ + gen_ieee_compare(ctx, gen_helper_##name, ra, rb, rc, fn11); \ +} +IEEE_CMP3(cmptun) +IEEE_CMP3(cmpteq) +IEEE_CMP3(cmptlt) +IEEE_CMP3(cmptle) static inline uint64_t zapnot_mask(uint8_t lit) { @@ -1607,7 +1943,7 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) } break; case 0x14: - switch (fpfn) { /* f11 & 0x3F */ + switch (fpfn) { /* fn11 & 0x3F */ case 0x04: /* ITOFS */ if (!(ctx->amask & AMASK_FIX)) @@ -1632,7 +1968,7 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) /* SQRTS */ if (!(ctx->amask & AMASK_FIX)) goto invalid_opc; - gen_fsqrts(rb, rc); + gen_fsqrts(ctx, rb, rc, fn11); break; case 0x14: /* ITOFF */ @@ -1669,7 +2005,7 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) /* SQRTT */ if (!(ctx->amask & AMASK_FIX)) goto invalid_opc; - gen_fsqrtt(rb, rc); + gen_fsqrtt(ctx, rb, rc, fn11); break; default: goto invalid_opc; @@ -1678,7 +2014,7 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) case 0x15: /* VAX floating point */ /* XXX: rounding mode and trap are ignored (!) */ - switch (fpfn) { /* f11 & 0x3F */ + switch (fpfn) { /* fn11 & 0x3F */ case 0x00: /* ADDF */ gen_faddf(ra, rb, rc); @@ -1761,77 +2097,75 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) break; case 0x16: /* IEEE floating-point */ - /* XXX: rounding mode and traps are ignored (!) */ - switch (fpfn) { /* f11 & 0x3F */ + switch (fpfn) { /* fn11 & 0x3F */ case 0x00: /* ADDS */ - gen_fadds(ra, rb, rc); + gen_fadds(ctx, ra, rb, rc, fn11); break; case 0x01: /* SUBS */ - gen_fsubs(ra, rb, rc); + gen_fsubs(ctx, ra, rb, rc, fn11); break; case 0x02: /* MULS */ - gen_fmuls(ra, rb, rc); + gen_fmuls(ctx, ra, rb, rc, fn11); break; case 0x03: /* DIVS */ - gen_fdivs(ra, rb, rc); + gen_fdivs(ctx, ra, rb, rc, fn11); break; case 0x20: /* ADDT */ - gen_faddt(ra, rb, rc); + gen_faddt(ctx, ra, rb, rc, fn11); break; case 0x21: /* SUBT */ - gen_fsubt(ra, rb, rc); + gen_fsubt(ctx, ra, rb, rc, fn11); break; case 0x22: /* MULT */ - gen_fmult(ra, rb, rc); + gen_fmult(ctx, ra, rb, rc, fn11); break; case 0x23: /* DIVT */ - gen_fdivt(ra, rb, rc); + gen_fdivt(ctx, ra, rb, rc, fn11); break; case 0x24: /* CMPTUN */ - gen_fcmptun(ra, rb, rc); + gen_fcmptun(ctx, ra, rb, rc, fn11); break; case 0x25: /* CMPTEQ */ - gen_fcmpteq(ra, rb, rc); + gen_fcmpteq(ctx, ra, rb, rc, fn11); break; case 0x26: /* CMPTLT */ - gen_fcmptlt(ra, rb, rc); + gen_fcmptlt(ctx, ra, rb, rc, fn11); break; case 0x27: /* CMPTLE */ - gen_fcmptle(ra, rb, rc); + gen_fcmptle(ctx, ra, rb, rc, fn11); break; case 0x2C: - /* XXX: incorrect */ if (fn11 == 0x2AC || fn11 == 0x6AC) { /* CVTST */ - gen_fcvtst(rb, rc); + gen_fcvtst(ctx, rb, rc, fn11); } else { /* CVTTS */ - gen_fcvtts(rb, rc); + gen_fcvtts(ctx, rb, rc, fn11); } break; case 0x2F: /* CVTTQ */ - gen_fcvttq(rb, rc); + gen_fcvttq(ctx, rb, rc, fn11); break; case 0x3C: /* CVTQS */ - gen_fcvtqs(rb, rc); + gen_fcvtqs(ctx, rb, rc, fn11); break; case 0x3E: /* CVTQT */ - gen_fcvtqt(rb, rc); + gen_fcvtqt(ctx, rb, rc, fn11); break; default: goto invalid_opc; @@ -1910,11 +2244,11 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn) break; case 0x130: /* 
CVTQL/V */ - gen_fcvtqlv(rb, rc); + gen_fcvtql_v(rb, rc); break; case 0x530: /* CVTQL/SV */ - gen_fcvtqlsv(rb, rc); + gen_fcvtql_sv(rb, rc); break; default: goto invalid_opc; @@ -2597,6 +2931,17 @@ static inline void gen_intermediate_code_internal(CPUState *env, ctx.mem_idx = ((env->ps >> 3) & 3); ctx.pal_mode = env->ipr[IPR_EXC_ADDR] & 1; #endif + + /* ??? Every TB begins with unset rounding mode, to be initialized on + the first fp insn of the TB. Alternately we could define a proper + default for every TB (e.g. QUAL_RM_N or QUAL_RM_D) and make sure + to reset the FP_STATUS to that default at the end of any TB that + changes the default. We could even (gasp) dynamiclly figure out + what default would be most efficient given the running program. */ + ctx.tb_rm = -1; + /* Similarly for flush-to-zero. */ + ctx.tb_ftz = -1; + num_insns = 0; max_insns = tb->cflags & CF_COUNT_MASK; if (max_insns == 0) -- 1.6.5.2 ^ permalink raw reply related [flat|nested] 18+ messages in thread
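All of the qualifier handling in the patch above keys off the 11-bit function field (fn11). As a compact reference, here is a standalone sketch that decodes the same QUAL_* bits defined in translate.c; nothing here is QEMU code, and the strings merely stand in for softfloat's rounding-mode constants:

#include <stdio.h>

/* Values copied from the QUAL_* definitions in the patch above. */
#define QUAL_RM_N    0x080   /* round to nearest even  */
#define QUAL_RM_C    0x000   /* chopped (toward zero)  */
#define QUAL_RM_M    0x040   /* toward minus infinity  */
#define QUAL_RM_D    0x0c0   /* dynamic, taken from FPCR */
#define QUAL_RM_MASK 0x0c0
#define QUAL_V       0x100   /* integer overflow enable  */
#define QUAL_U       0x100   /* fp underflow enable      */
#define QUAL_I       0x200   /* inexact detection enable */
#define QUAL_S       0x400   /* software completion      */

static const char *qual_rounding(int fn11)
{
    switch (fn11 & QUAL_RM_MASK) {
    case QUAL_RM_N: return "nearest-even";
    case QUAL_RM_C: return "chopped";
    case QUAL_RM_M: return "minus-infinity";
    default:        return "dynamic (FPCR)";   /* QUAL_RM_D */
    }
}

int main(void)
{
    int fn11 = QUAL_S | QUAL_V | QUAL_I | QUAL_RM_C;   /* the /svic combination handled specially by gen_fcvttq */
    printf("rm=%s s=%d v=%d i=%d\n", qual_rounding(fn11),
           !!(fn11 & QUAL_S), !!(fn11 & QUAL_V), !!(fn11 & QUAL_I));
    return 0;
}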
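The reworked CVTTQ deliberately does not saturate: for a double in [2^63, 2^64) the result is the low 64 bits of the integer value, which is what lets the compiler reuse the same instruction for unsigned conversions. A rough standalone illustration of that behaviour, decoding the IEEE fields by hand as helper_cvttq_internal does, but ignoring rounding, denormals, NaNs, infinities and exception flags (function names here are made up for the sketch):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Chopped float64 -> 64-bit conversion without saturation (sketch only). */
static uint64_t cvttq_chopped(double d)
{
    uint64_t bits, frac, ret = 0;
    int exp, shift;

    memcpy(&bits, &d, sizeof(bits));
    exp  = (bits >> 52) & 0x7ff;
    frac = (bits & 0xfffffffffffffull) | 0x10000000000000ull;  /* restore implicit bit */

    shift = exp - 1023 - 52;
    if (shift >= 0) {
        ret = (shift < 63) ? frac << shift : 0;   /* wraps like the patch: high bits are simply lost */
    } else if (shift > -64) {
        ret = frac >> -shift;                     /* truncate toward zero */
    }
    return (bits >> 63) ? -ret : ret;
}

int main(void)
{
    /* Largest double below 2^64: a saturating conversion would clamp it;
       the non-saturating semantics give back its low 64 bits unchanged. */
    printf("%llu\n", (unsigned long long)cvttq_chopped(18446744073709549568.0));
    printf("%lld\n", (long long)cvttq_chopped(-3.9));   /* -3 */
    return 0;
}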
* Re: [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson ` (5 preceding siblings ...) 2010-01-04 22:27 ` [Qemu-devel] [PATCH 6/6] target-alpha: Implement IEEE FP qualifiers Richard Henderson @ 2010-01-26 16:35 ` Richard Henderson 2010-02-09 18:47 ` Richard Henderson 2010-02-23 22:58 ` Aurelien Jarno 8 siblings, 0 replies; 18+ messages in thread From: Richard Henderson @ 2010-01-26 16:35 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues, aurelien Ping? r~ On 01/04/2010 02:46 PM, Richard Henderson wrote: > I've split up the FPCR as requested by Aurelien. We no longer > set anything in FP_STATUS after the execution of the operation, > only copy data from FP_STATUS to some env->fpcr field. > > I have totally rewritten the patch to be more along the line > that Laurent was suggesting, in that the rounding mode and other > qualifiers are totally parsed within the translator. I no longer > pass the FN11 field to the helper functions. > > Unlike Laurent's prototype, I do not set the rounding mode at > every FP instruction; I remember the previous setting of the > rounding mode within a TB. Similarly for the flush-to-zero field. > > I do not handle VAX instructions at all. The existing VAX support > is mostly broken, and I didn't feel like compounding the problem. > > > r~ > > > -- > Richard Henderson (6): > target-alpha: Fix gdb access to fpcr and unique. > target-alpha: Split up FPCR value into separate fields. > target-alpha: Reduce internal processor registers for user-mode. > target-alpha: Clean up arithmetic traps. > target-alpha: Mark helper_excp as NORETURN. > target-alpha: Implement IEEE FP qualifiers. > > > ^ permalink raw reply [flat|nested] 18+ messages in thread
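The "remember the rounding mode within a TB" point from the cover letter quoted above is the key difference from the earlier prototype: the translator only emits code to change the rounding mode when an instruction in the block actually asks for a different one. A toy illustration of that pattern (names are illustrative, not the actual QEMU identifiers):

#include <stdio.h>

typedef struct {
    int tb_rm;   /* rounding mode already established in this TB, -1 = unknown */
} Ctx;

static void emit_set_rounding(int rm)
{
    printf("  emit: set rounding mode %d\n", rm);   /* stand-in for the TCG store / helper call */
}

static void gen_fp_insn(Ctx *ctx, int insn_rm)
{
    if (insn_rm != ctx->tb_rm) {       /* only re-emit when the mode changes */
        emit_set_rounding(insn_rm);
        ctx->tb_rm = insn_rm;
    }
    printf("  emit: fp operation\n");
}

int main(void)
{
    Ctx ctx = { .tb_rm = -1 };         /* every TB starts with the mode unknown */
    gen_fp_insn(&ctx, 0);              /* first fp insn: sets the mode   */
    gen_fp_insn(&ctx, 0);              /* same mode: no extra code       */
    gen_fp_insn(&ctx, 2);              /* different mode: one more set   */
    return 0;
}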
* Re: [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson ` (6 preceding siblings ...) 2010-01-26 16:35 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson @ 2010-02-09 18:47 ` Richard Henderson 2010-02-23 22:58 ` Aurelien Jarno 8 siblings, 0 replies; 18+ messages in thread From: Richard Henderson @ 2010-02-09 18:47 UTC (permalink / raw) To: qemu-devel; +Cc: laurent.desnogues, aurelien Ping 2. r~ On 01/04/2010 02:46 PM, Richard Henderson wrote: > I've split up the FPCR as requested by Aurelien. We no longer > set anything in FP_STATUS after the execution of the operation, > only copy data from FP_STATUS to some env->fpcr field. > > I have totally rewritten the patch to be more along the line > that Laurent was suggesting, in that the rounding mode and other > qualifiers are totally parsed within the translator. I no longer > pass the FN11 field to the helper functions. > > Unlike Laurent's prototype, I do not set the rounding mode at > every FP instruction; I remember the previous setting of the > rounding mode within a TB. Similarly for the flush-to-zero field. > > I do not handle VAX instructions at all. The existing VAX support > is mostly broken, and I didn't feel like compounding the problem. > > > r~ > > > -- > Richard Henderson (6): > target-alpha: Fix gdb access to fpcr and unique. > target-alpha: Split up FPCR value into separate fields. > target-alpha: Reduce internal processor registers for user-mode. > target-alpha: Clean up arithmetic traps. > target-alpha: Mark helper_excp as NORETURN. > target-alpha: Implement IEEE FP qualifiers. > > > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson ` (7 preceding siblings ...) 2010-02-09 18:47 ` Richard Henderson @ 2010-02-23 22:58 ` Aurelien Jarno 2010-02-24 11:24 ` Richard Henderson 8 siblings, 1 reply; 18+ messages in thread From: Aurelien Jarno @ 2010-02-23 22:58 UTC (permalink / raw) To: Richard Henderson; +Cc: laurent.desnogues, qemu-devel On Mon, Jan 04, 2010 at 02:46:05PM -0800, Richard Henderson wrote: > I've split up the FPCR as requested by Aurelien. We no longer > set anything in FP_STATUS after the execution of the operation, > only copy data from FP_STATUS to some env->fpcr field. > > I have totally rewritten the patch to be more along the line > that Laurent was suggesting, in that the rounding mode and other > qualifiers are totally parsed within the translator. I no longer > pass the FN11 field to the helper functions. > What's the benefit of doing that? I don't say it's wrong, I just want to understand. Otherwise the patch looks good, so it can probably be applied without any change. In the meanwhile, I have applied patches 1 to 5. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurelien@aurel32.net http://www.aurel32.net ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 2010-02-23 22:58 ` Aurelien Jarno @ 2010-02-24 11:24 ` Richard Henderson 2010-02-28 16:49 ` Aurelien Jarno 0 siblings, 1 reply; 18+ messages in thread From: Richard Henderson @ 2010-02-24 11:24 UTC (permalink / raw) To: Aurelien Jarno; +Cc: laurent.desnogues, qemu-devel On 02/23/2010 02:58 PM, Aurelien Jarno wrote: >> I have totally rewritten the patch to be more along the line >> that Laurent was suggesting, in that the rounding mode and other >> qualifiers are totally parsed within the translator. I no longer >> pass the FN11 field to the helper functions. > > What's the benefit of doing that? I don't say it's wrong, I just want > to understand. Otherwise the patch looks good, so it can probably be > applied without any change. I seem to recall Laurent opining that doing the interpretation of the opcode in two different places was less than clean, and in the end I agree with him. FWIW, this configuration would also be compatible with a future TCG enhancement to generate fp code, whereas the first config would not. r~ ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 2010-02-24 11:24 ` Richard Henderson @ 2010-02-28 16:49 ` Aurelien Jarno 0 siblings, 0 replies; 18+ messages in thread From: Aurelien Jarno @ 2010-02-28 16:49 UTC (permalink / raw) To: Richard Henderson; +Cc: laurent.desnogues, qemu-devel On Wed, Feb 24, 2010 at 12:24:55PM +0100, Richard Henderson wrote: > On 02/23/2010 02:58 PM, Aurelien Jarno wrote: > >>I have totally rewritten the patch to be more along the line > >>that Laurent was suggesting, in that the rounding mode and other > >>qualifiers are totally parsed within the translator. I no longer > >>pass the FN11 field to the helper functions. > > > >What's the benefit of doing that? I don't say it's wrong, I just want > >to understand. Otherwise the patch looks good, so it can probably be > >applied without any change. > > I seem to recall Laurent opining that doing the interpretation > of the opcode in two different places was less than clean, and > in the end I agree with him. > > FWIW, this configuration would also be compatible with a > future TCG enhancement to generate fp code, whereas the first > config would not. I have applied the patch, but in order to avoid doing the same for all targets, it might be a good idea to directly provide TCG functions to modify FP_STATUS instead of using the interface from softfloat.h. This would also have the advantage of clearly defining this interface, and make sure that the alpha target is not broken by a change in softfloat.h. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurelien@aurel32.net http://www.aurel32.net ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2010-02-28 16:50 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <20091228201020.GD5695@hall.aurel32.net> 2009-12-18 22:09 ` [Qemu-devel] [patch] target-alpha: squashed fpu qualifiers patch Richard Henderson 2010-01-04 22:46 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson 2009-12-31 19:54 ` [Qemu-devel] [PATCH 1/6] target-alpha: Fix gdb access to fpcr and unique Richard Henderson 2009-12-31 20:41 ` [Qemu-devel] [PATCH 2/6] target-alpha: Split up FPCR value into separate fields Richard Henderson 2010-01-04 19:19 ` [Qemu-devel] [PATCH 3/6] target-alpha: Reduce internal processor registers for user-mode Richard Henderson 2010-01-06 9:55 ` Tristan Gingold 2010-01-06 16:29 ` Richard Henderson 2010-01-06 17:04 ` Andreas Färber 2010-01-07 11:54 ` Tristan Gingold 2010-01-07 20:13 ` Andreas Färber 2010-01-04 19:24 ` [Qemu-devel] [PATCH 4/6] target-alpha: Clean up arithmetic traps Richard Henderson 2010-01-04 19:25 ` [Qemu-devel] [PATCH 5/6] target-alpha: Mark helper_excp as NORETURN Richard Henderson 2010-01-04 22:27 ` [Qemu-devel] [PATCH 6/6] target-alpha: Implement IEEE FP qualifiers Richard Henderson 2010-01-26 16:35 ` [Qemu-devel] [PATCH 0/6] target-alpha: fpu qualifiers, round 2 Richard Henderson 2010-02-09 18:47 ` Richard Henderson 2010-02-23 22:58 ` Aurelien Jarno 2010-02-24 11:24 ` Richard Henderson 2010-02-28 16:49 ` Aurelien Jarno