[PATCH 0/3] target/s390x: Implement DIVIDE TO INTEGER

* [PATCH 0/3] target/s390x: Implement DIVIDE TO INTEGER
@ 2026-01-21 22:12 Ilya Leoshkevich
  2026-01-21 22:12 ` [PATCH 1/3] target/s390x: Dump Floating-Point-Control Register Ilya Leoshkevich
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Ilya Leoshkevich @ 2026-01-21 22:12 UTC
  To: Thomas Huth, Richard Henderson
  Cc: David Hildenbrand, qemu-s390x, qemu-devel, Ilya Leoshkevich

Hi,

This series implements the DIVIDE TO INTEGER instruction, which is
required to run LuaJIT.

Patch 1 is a debugging helper. Patch 2 is the implementation.

Since the instruction is quite complex, I've extensively tested it
using a libFuzzer-based harness [1] that compares emulation with native
execution at ~15k exec/s. The tests (patch 3) use data generated
this way.

Best regards,
Ilya

[1] https://github.com/iii-i/qemu/commits/iii/wip/fuzz-tcg-v1/

Ilya Leoshkevich (3):
  target/s390x: Dump Floating-Point-Control Register
  target/s390x: Implement DIVIDE TO INTEGER
  tests/tcg/s390x: Test DIVIDE TO INTEGER

 target/s390x/cpu-dump.c             |   1 +
 target/s390x/helper.h               |   2 +
 target/s390x/tcg/fpu_helper.c       | 199 +++++++++++++++++++++++++
 target/s390x/tcg/insn-data.h.inc    |   5 +-
 target/s390x/tcg/translate.c        |  26 ++++
 tests/tcg/s390x/Makefile.target     |   3 +
 tests/tcg/s390x/divide-to-integer.c | 215 ++++++++++++++++++++++++++++
 7 files changed, 450 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/s390x/divide-to-integer.c

-- 
2.52.0




* [PATCH 1/3] target/s390x: Dump Floating-Point-Control Register
  2026-01-21 22:12 [PATCH 0/3] target/s390x: Implement DIVIDE TO INTEGER Ilya Leoshkevich
@ 2026-01-21 22:12 ` Ilya Leoshkevich
  2026-01-22 16:40   ` Alex Bennée
  2026-01-21 22:12 ` [PATCH 2/3] target/s390x: Implement DIVIDE TO INTEGER Ilya Leoshkevich
  2026-01-21 22:12 ` [PATCH 3/3] tests/tcg/s390x: Test " Ilya Leoshkevich
  2 siblings, 1 reply; 10+ messages in thread
From: Ilya Leoshkevich @ 2026-01-21 22:12 UTC
  To: Thomas Huth, Richard Henderson
  Cc: David Hildenbrand, qemu-s390x, qemu-devel, Ilya Leoshkevich

Knowing the value of this register is very useful for debugging
floating-point code.
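
For readers decoding the dumped value, here is a minimal sketch that splits
an FPC word into byte fields. The layout (masks in the top byte, flags in the
next byte, then the data-exception code, with the BFP rounding mode in the low
three bits) is an assumption, inferred from how the rest of this series uses
the register: the helper tests masks via `env->fpc >> 24`, and the expected
FPC values in the tests carry DXC values such as 0x08 in the third byte. The
names `fpc_fields` and `fpc_decode` are hypothetical and not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the FPC; the field layout is an assumption. */
struct fpc_fields {
    unsigned masks;   /* bits 31..24: IEEE exception masks */
    unsigned flags;   /* bits 23..16: IEEE exception flags */
    unsigned dxc;     /* bits 15..8:  data-exception code */
    unsigned bfp_rm;  /* bits 2..0:   BFP rounding mode */
};

static struct fpc_fields fpc_decode(uint32_t fpc)
{
    struct fpc_fields f;

    f.masks = (fpc >> 24) & 0xff;
    f.flags = (fpc >> 16) & 0xff;
    f.dxc = (fpc >> 8) & 0xff;
    f.bfp_rm = fpc & 0x7;
    return f;
}

/* Usage: values taken from the expected results in the test patch. */
void fpc_decode_example(void)
{
    struct fpc_fields f = fpc_decode(0xfc000807);

    assert(f.masks == 0xfc);
    assert(f.flags == 0);
    assert(f.dxc == 0x08);
    assert(f.bfp_rm == 7);
    /* An unmasked exception only sets a flag bit. */
    assert(fpc_decode(0x00800000).flags == 0x80);
}
```

With such a decoder, a line like "FPC=fc000807" in the dump can be read as
"all masks set, DXC 0x08, rounding mode 7" at a glance.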

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 target/s390x/cpu-dump.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/s390x/cpu-dump.c b/target/s390x/cpu-dump.c
index 869d3a4ad54..5b852928031 100644
--- a/target/s390x/cpu-dump.c
+++ b/target/s390x/cpu-dump.c
@@ -63,6 +63,7 @@ void s390_cpu_dump_state(CPUState *cs, FILE *f, int flags)
                              (i % 4) == 3 ? '\n' : ' ');
             }
         }
+        qemu_fprintf(f, "FPC=%08" PRIx32 "\n", env->fpc);
     }
 
 #ifndef CONFIG_USER_ONLY
-- 
2.52.0




* [PATCH 2/3] target/s390x: Implement DIVIDE TO INTEGER
  2026-01-21 22:12 [PATCH 0/3] target/s390x: Implement DIVIDE TO INTEGER Ilya Leoshkevich
  2026-01-21 22:12 ` [PATCH 1/3] target/s390x: Dump Floating-Point-Control Register Ilya Leoshkevich
@ 2026-01-21 22:12 ` Ilya Leoshkevich
  2026-01-22  1:04   ` Richard Henderson
  2026-01-21 22:12 ` [PATCH 3/3] tests/tcg/s390x: Test " Ilya Leoshkevich
  2 siblings, 1 reply; 10+ messages in thread
From: Ilya Leoshkevich @ 2026-01-21 22:12 UTC
  To: Thomas Huth, Richard Henderson
  Cc: David Hildenbrand, qemu-s390x, qemu-devel, Ilya Leoshkevich

DIVIDE TO INTEGER computes a floating-point remainder and is used by
LuaJIT, so add it to QEMU.

The instruction comes in two flavors, one for floats and one for
doubles, which are very similar. Since it's also quite complex,
copy-pasting the implementation would result in barely maintainable
code. Mitigate that using macros. An alternative would be an .inc file,
but that looks like overkill.
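
As a rough illustration of what the instruction produces (not of the helper
in the diff below), the simplest case reduces to an integral quotient
n = trunc(a / b) and a remainder r = a - b * n. The sketch assumes rounding
toward zero only, finite nonzero inputs, and a quotient small enough to fit
in a `long long`. It ignores the other M4 rounding modes, partial results,
NaNs, infinities, condition codes and exceptions. `dti_result` and
`divide_to_integer_toward_zero` are hypothetical names.

```c
#include <assert.h>

struct dti_result {
    double rem;   /* remainder, what the instruction leaves in r1 */
    double quot;  /* integral quotient, what it leaves in r3 */
};

/*
 * Hypothetical model of the round-toward-zero case. Only valid when
 * a / b fits in a long long and a - b * n is exactly representable.
 */
static struct dti_result divide_to_integer_toward_zero(double a, double b)
{
    struct dti_result res;

    res.quot = (double)(long long)(a / b); /* the cast truncates */
    res.rem = a - b * res.quot;
    return res;
}

/* Usage: the remainder keeps the sign of the dividend. */
void divide_to_integer_example(void)
{
    struct dti_result r = divide_to_integer_toward_zero(7.0, 2.0);

    assert(r.quot == 3.0 && r.rem == 1.0);
    r = divide_to_integer_toward_zero(-7.0, 2.0);
    assert(r.quot == -3.0 && r.rem == -1.0);
}
```

The real helper has to do considerably more than this, which is why it works
in float128 and distinguishes final from partial results.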

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 target/s390x/helper.h            |   2 +
 target/s390x/tcg/fpu_helper.c    | 199 +++++++++++++++++++++++++++++++
 target/s390x/tcg/insn-data.h.inc |   5 +-
 target/s390x/tcg/translate.c     |  26 ++++
 4 files changed, 231 insertions(+), 1 deletion(-)

diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 1a8a76abb98..f2b24c65a88 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -46,6 +46,8 @@ DEF_HELPER_FLAGS_3(sxb, TCG_CALL_NO_WG, i128, env, i128, i128)
 DEF_HELPER_FLAGS_3(deb, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(ddb, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(dxb, TCG_CALL_NO_WG, i128, env, i128, i128)
+DEF_HELPER_5(didb, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(dieb, void, env, i32, i32, i32, i32)
 DEF_HELPER_FLAGS_3(meeb, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(mdeb, TCG_CALL_NO_WG, i64, env, i64, i64)
 DEF_HELPER_FLAGS_3(mdb, TCG_CALL_NO_WG, i64, env, i64, i64)
diff --git a/target/s390x/tcg/fpu_helper.c b/target/s390x/tcg/fpu_helper.c
index 1ba43715ac1..f524c4257fb 100644
--- a/target/s390x/tcg/fpu_helper.c
+++ b/target/s390x/tcg/fpu_helper.c
@@ -286,6 +286,205 @@ Int128 HELPER(dxb)(CPUS390XState *env, Int128 a, Int128 b)
     return RET128(ret);
 }
 
+static float128 float128_precision_round_to_float32(float128 x)
+{
+    x.low = 0;
+    x.high = deposit64(x.high, 0, 25, 0);
+    return x;
+}
+
+static float128 float128_precision_round_to_float64(float128 x)
+{
+    x.low = deposit64(x.low, 0, 60, 0);
+    return x;
+}
+
+static int float128_get_exp(float128 x)
+{
+    return extract64(x.high, 48, 15) - 16383;
+}
+
+static float128 float128_set_exp(float128 x, int exp)
+{
+    x.high = deposit64(x.high, 48, 15, exp + 16383);
+    return x;
+}
+
+static float128 float128_adjust_exp(float128 x, int delta)
+{
+    return float128_set_exp(x, float128_get_exp(x) + delta);
+}
+
+static bool float128_is_int(float128 x)
+{
+    return extract64(x.high, 0, 48) == 0 && x.low == 0;
+}
+
+static float32 extract_float32(CPUS390XState *env, uint32_t r)
+{
+    return env->vregs[r][0] >> 32;
+}
+
+static void deposit_float32(CPUS390XState *env, uint32_t r, float32 x)
+{
+    env->vregs[r][0] = deposit64(env->vregs[r][0], 32, 32, x);
+}
+
+static float64 extract_float64(CPUS390XState *env, uint32_t r)
+{
+    return env->vregs[r][0];
+}
+
+static void deposit_float64(CPUS390XState *env, uint32_t r, float64 x)
+{
+    env->vregs[r][0] = x;
+}
+
+#define DIVIDE_TO_INTEGER(name, floatN, p, exp_max, exp_bias)                  \
+void HELPER(name)(CPUS390XState *env, uint32_t r1, uint32_t r2,                \
+                  uint32_t r3, uint32_t m4)                                    \
+{                                                                              \
+    int float_exception_flags = 0;                                             \
+    floatN a, b, n, r;                                                         \
+    int dxc = -1;                                                              \
+    uint32_t cc;                                                               \
+                                                                               \
+    a = extract_ ## floatN(env, r1);                                           \
+    b = extract_ ## floatN(env, r2);                                           \
+                                                                               \
+    /* POp table "Results: DIVIDE TO INTEGER (Part 1 of 2)" */                 \
+    if (floatN ## _is_signaling_nan(a, &env->fpu_status)) {                    \
+        r = n = floatN ## _silence_nan(a, &env->fpu_status);                   \
+        cc = 1;                                                                \
+        float_exception_flags |= float_flag_invalid;                           \
+    } else if (floatN ## _is_signaling_nan(b, &env->fpu_status)) {             \
+        r = n = floatN ## _silence_nan(b, &env->fpu_status);                   \
+        cc = 1;                                                                \
+        float_exception_flags |= float_flag_invalid;                           \
+    } else if (floatN ## _is_quiet_nan(a, &env->fpu_status)) {                 \
+        r = n = a;                                                             \
+        cc = 1;                                                                \
+    } else if (floatN ## _is_quiet_nan(b, &env->fpu_status)) {                 \
+        r = n = b;                                                             \
+        cc = 1;                                                                \
+    } else if (floatN ## _is_infinity(a) || floatN ## _is_zero(b)) {           \
+        r = n = floatN ## _default_nan(&env->fpu_status);                      \
+        cc = 1;                                                                \
+        float_exception_flags |= float_flag_invalid;                           \
+    } else if (floatN ## _is_infinity(b))  {                                   \
+        r = a;                                                                 \
+        n = floatN ## _set_sign(floatN ## _zero,                               \
+                                floatN ## _is_neg(a) != floatN ## _is_neg(b)); \
+        cc = 0;                                                                \
+    } else {                                                                   \
+        float128 a128, b128, m128, n128, q128, r128;                           \
+        bool is_final, is_q128_smallish;                                       \
+        int old_mode, r128_exp;                                                \
+        uint32_t r_flags;                                                      \
+                                                                               \
+        /* Compute precise quotient */                                         \
+        a128 = floatN ## _to_float128(a, &env->fpu_status);                    \
+        b128 = floatN ## _to_float128(b, &env->fpu_status);                    \
+        q128 = float128_div(a128, b128, &env->fpu_status);                     \
+                                                                               \
+        /* Final or partial case? */                                           \
+        is_q128_smallish = float128_get_exp(q128) < p;                         \
+        is_final = is_q128_smallish || float128_is_int(q128);                  \
+                                                                               \
+        /*                                                                     \
+         * Final quotient is rounded using M4,                                 \
+         * partial quotient is rounded toward zero.                            \
+         */                                                                    \
+        old_mode = s390_swap_bfp_rounding_mode(env, is_final ? m4 : 5);        \
+        n128 = float128_round_to_int(q128, &env->fpu_status);                  \
+        s390_restore_bfp_rounding_mode(env, old_mode);                         \
+                                                                               \
+        /*                                                                     \
+         * Intermediate values are precision-rounded,                          \
+         * see "Intermediate Values" in POp.                                   \
+         */                                                                    \
+        n128 = float128_precision_round_to_ ## floatN(n128);                   \
+                                                                               \
+        /* Compute remainder */                                                \
+        m128 = float128_mul(b128, n128, &env->fpu_status);                     \
+        env->fpu_status.float_exception_flags = 0;                             \
+        r128 = float128_sub(a128, m128, &env->fpu_status);                     \
+        r128_exp = float128_get_exp(r128);                                     \
+        r = float128_to_## floatN(r128, &env->fpu_status);                     \
+        r_flags = env->fpu_status.float_exception_flags;                       \
+                                                                               \
+        /* POp table "Results: DIVIDE TO INTEGER (Part 2 of 2)" */             \
+        if (is_q128_smallish) {                                                \
+            cc = 0;                                                            \
+            if (!floatN ## _is_zero(r)) {                                      \
+                if (r128_exp < -(exp_max - 1)) {                               \
+                    if ((env->fpc >> 24) & S390_IEEE_MASK_UNDERFLOW) {         \
+                        float_exception_flags |= float_flag_underflow;         \
+                        dxc = 0x10;                                            \
+                        r128 = float128_adjust_exp(r128, exp_bias);            \
+                        r = float128_to_## floatN(r128, &env->fpu_status);     \
+                    }                                                          \
+                } else if (r_flags & float_flag_inexact) {                     \
+                    float_exception_flags |= float_flag_inexact;               \
+                    if ((env->fpc >> 24) & S390_IEEE_MASK_INEXACT) {           \
+                        /*                                                     \
+                         * Check whether remainder was truncated (rounded      \
+                         * toward zero) or incremented.                        \
+                         */                                                    \
+                        if (float128_lt(                                       \
+                                floatN ## _to_float128(floatN ## _abs(r),      \
+                                                       &env->fpu_status),      \
+                                float128_abs(r128), &env->fpu_status)) {       \
+                           dxc = 0x8;                                          \
+                        } else {                                               \
+                           dxc = 0xc;                                          \
+                        }                                                      \
+                    }                                                          \
+                }                                                              \
+            }                                                                  \
+        } else if (float128_get_exp(n128) > exp_max) {                         \
+            n128 = float128_adjust_exp(n128, -exp_bias);                       \
+            cc = floatN ## _is_zero(r) ? 1 : 3;                                \
+        } else {                                                               \
+            cc = floatN ## _is_zero(r) ? 0 : 2;                                \
+        }                                                                      \
+                                                                               \
+        /* Adjust sign of zero */                                              \
+        if (floatN ## _is_zero(r)) {                                           \
+            r = floatN ## _set_sign(r, float128_is_neg(a128));                 \
+        }                                                                      \
+        n = float128_to_ ## floatN(n128, &env->fpu_status);                    \
+        if (floatN ## _is_zero(n)) {                                           \
+            n = floatN ## _set_sign(n,                                         \
+                                    float128_is_neg(a128) !=                   \
+                                        float128_is_neg(b128));                \
+        }                                                                      \
+    }                                                                          \
+                                                                               \
+    /* Flush the results if needed */                                          \
+    if ((float_exception_flags & float_flag_invalid) &&                        \
+        ((env->fpc >> 24) & S390_IEEE_MASK_INVALID)) {                         \
+        /* The action for invalid operation is "Suppress" */                   \
+    } else {                                                                   \
+        /* The action for other exceptions is "Complete" */                    \
+        deposit_ ## floatN(env, r1, r);                                        \
+        deposit_ ## floatN(env, r3, n);                                        \
+        env->cc_op = cc;                                                       \
+    }                                                                          \
+                                                                               \
+    /* Raise an exception if needed */                                         \
+    if (dxc == -1) {                                                           \
+        env->fpu_status.float_exception_flags = float_exception_flags;         \
+        handle_exceptions(env, false, GETPC());                                \
+    } else {                                                                   \
+        env->fpu_status.float_exception_flags = 0;                             \
+        tcg_s390_data_exception(env, dxc, GETPC());                            \
+    }                                                                          \
+}
+
+DIVIDE_TO_INTEGER(dieb, float32, 24, 127, 192)
+DIVIDE_TO_INTEGER(didb, float64, 53, 1023, 1536)
+
 /* 32-bit FP multiplication */
 uint64_t HELPER(meeb)(CPUS390XState *env, uint64_t f1, uint64_t f2)
 {
diff --git a/target/s390x/tcg/insn-data.h.inc b/target/s390x/tcg/insn-data.h.inc
index baaafe922e9..0d5392eac54 100644
--- a/target/s390x/tcg/insn-data.h.inc
+++ b/target/s390x/tcg/insn-data.h.inc
@@ -9,7 +9,7 @@
  *  OPC  = (op << 8) | op2 where op is the major, op2 the minor opcode
  *  NAME = name of the opcode, used internally
  *  FMT  = format of the opcode (defined in insn-format.h.inc)
- *  FAC  = facility the opcode is available in (defined in DisasFacility)
+ *  FAC  = facility the opcode is available in (defined in translate.c)
  *  I1   = func in1_xx fills o->in1
  *  I2   = func in2_xx fills o->in2
  *  P    = func prep_xx initializes o->*out*
@@ -361,6 +361,9 @@
     C(0xb91d, DSGFR,   RRE,   Z,   r1p1, r2_32s, r1_P, 0, divs64, 0)
     C(0xe30d, DSG,     RXY_a, Z,   r1p1, m2_64, r1_P, 0, divs64, 0)
     C(0xe31d, DSGF,    RXY_a, Z,   r1p1, m2_32s, r1_P, 0, divs64, 0)
+/* DIVIDE TO INTEGER */
+    D(0xb35b, DIDBR,   RRF_b, Z,   0, 0, 0, 0, dib, 0, 64)
+    D(0xb353, DIEBR,   RRF_b, Z,   0, 0, 0, 0, dib, 0, 32)
 
 /* EXCLUSIVE OR */
     C(0x1700, XR,      RR_a,  Z,   r1, r2, new, r1_32, xor, nz32)
diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c
index 540c5a569c0..a3b753bc829 100644
--- a/target/s390x/tcg/translate.c
+++ b/target/s390x/tcg/translate.c
@@ -2283,6 +2283,32 @@ static DisasJumpType op_dxb(DisasContext *s, DisasOps *o)
     return DISAS_NEXT;
 }
 
+static DisasJumpType op_dib(DisasContext *s, DisasOps *o)
+{
+    const bool fpe = s390_has_feat(S390_FEAT_FLOATING_POINT_EXT);
+    uint8_t m4 = get_field(s, m4);
+
+    if (get_field(s, r1) == get_field(s, r2) ||
+        get_field(s, r1) == get_field(s, r3) ||
+        get_field(s, r2) == get_field(s, r3)) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    if (m4 == 2 || (!fpe && m4 == 3) || m4 > 7) {
+        gen_program_exception(s, PGM_SPECIFICATION);
+        return DISAS_NORETURN;
+    }
+
+    (s->insn->data == 32 ? gen_helper_dieb : gen_helper_didb)(
+        tcg_env, tcg_constant_i32(get_field(s, r1)),
+        tcg_constant_i32(get_field(s, r2)),
+        tcg_constant_i32(get_field(s, r3)), tcg_constant_i32(m4));
+    set_cc_static(s);
+
+    return DISAS_NEXT;
+}
+
 static DisasJumpType op_ear(DisasContext *s, DisasOps *o)
 {
     int r2 = get_field(s, r2);
-- 
2.52.0




* [PATCH 3/3] tests/tcg/s390x: Test DIVIDE TO INTEGER
  2026-01-21 22:12 [PATCH 0/3] target/s390x: Implement DIVIDE TO INTEGER Ilya Leoshkevich
  2026-01-21 22:12 ` [PATCH 1/3] target/s390x: Dump Floating-Point-Control Register Ilya Leoshkevich
  2026-01-21 22:12 ` [PATCH 2/3] target/s390x: Implement DIVIDE TO INTEGER Ilya Leoshkevich
@ 2026-01-21 22:12 ` Ilya Leoshkevich
  2026-01-22 16:43   ` Alex Bennée
  2 siblings, 1 reply; 10+ messages in thread
From: Ilya Leoshkevich @ 2026-01-21 22:12 UTC
  To: Thomas Huth, Richard Henderson
  Cc: David Hildenbrand, qemu-s390x, qemu-devel, Ilya Leoshkevich

Add a test to prevent regressions. Data is generated using a
libFuzzer-based fuzzer and hopefully covers all the important corner
cases.
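
The test vectors in this patch are given as raw bit patterns, with the
approximate decimal value in a comment. A minimal sketch for converting
between the two when reading or extending the table follows. It assumes the
host `float` is IEEE binary32. `float_from_bits` and `bits_from_float` are
hypothetical helpers and not part of the test.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without aliasing issues. */
static float float_from_bits(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Inverse of float_from_bits(). */
static uint32_t bits_from_float(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof(bits));
    return bits;
}

/* Usage: patterns that appear in the test table. */
void float_bits_example(void)
{
    float a = float_from_bits(0x43e1f1f1); /* commented as 451 */

    assert(a > 451.0f && a < 452.0f);
    assert(float_from_bits(0x3f800000) == 1.0f);
    assert(bits_from_float(-0.0f) == 0x80000000);
}
```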

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tests/tcg/s390x/Makefile.target     |   3 +
 tests/tcg/s390x/divide-to-integer.c | 215 ++++++++++++++++++++++++++++
 2 files changed, 218 insertions(+)
 create mode 100644 tests/tcg/s390x/divide-to-integer.c

diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index da5fe71a407..d5ec01d04fd 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -49,14 +49,17 @@ TESTS+=cvd
 TESTS+=cvb
 TESTS+=ts
 TESTS+=ex-smc
+TESTS+=divide-to-integer
 
 cdsg: CFLAGS+=-pthread
 cdsg: LDFLAGS+=-pthread
 
 rxsbg: CFLAGS+=-O2
+divide-to-integer: CFLAGS+=-O2
 
 cgebra: LDFLAGS+=-lm
 clgebr: LDFLAGS+=-lm
+divide-to-integer: LDFLAGS+=-lm
 
 include $(S390X_SRC)/pgm-specification.mak
 $(PGM_SPECIFICATION_TESTS): pgm-specification-user.o
diff --git a/tests/tcg/s390x/divide-to-integer.c b/tests/tcg/s390x/divide-to-integer.c
new file mode 100644
index 00000000000..49a18d07b31
--- /dev/null
+++ b/tests/tcg/s390x/divide-to-integer.c
@@ -0,0 +1,215 @@
+/*
+ * Test DIEBR and DIDBR instructions.
+ *
+ * Most inputs were discovered by fuzzing and exercise various corner cases in
+ * the helpers.
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <asm/ucontext.h>
+
+static void sigfpe_handler(int sig, siginfo_t *info, void *puc)
+{
+    struct ucontext *uc = puc;
+    unsigned short *xr_insn;
+    int r;
+
+    xr_insn = (unsigned short *)(uc->uc_mcontext.regs.psw.addr - 6);
+    r = *xr_insn & 0xf;
+    uc->uc_mcontext.regs.gprs[r] = sig;
+}
+
+#define DIVIDE_TO_INTEGER(name, floatN)                                        \
+static inline __attribute__((__always_inline__)) int                           \
+name(floatN *r1, floatN r2, floatN *r3, int m4, int *sig)                      \
+{                                                                              \
+    int cc;                                                                    \
+                                                                               \
+    asm(/* Make the initial CC predictable for suppression tests */            \
+        "xr %[sig],%[sig]\n"                                                   \
+        #name " %[r1],%[r3],%[r2],%[m4]\n"                                     \
+        "ipm %[cc]\n"                                                          \
+        "srl %[cc],28"                                                         \
+        /*                                                                     \
+         * Use earlyclobbers to prevent the compiler from reusing floating     \
+         * point registers. This instruction doesn't like it.                  \
+         */                                                                    \
+        : [r1] "+&f" (*r1), [r3] "+&f" (*r3), [sig] "=r" (*sig), [cc] "=d" (cc)\
+        : [r2] "f" (r2), [m4] "i" (m4)                                         \
+        : "cc");                                                               \
+                                                                               \
+    return cc;                                                                 \
+}
+
+DIVIDE_TO_INTEGER(diebr, float)
+DIVIDE_TO_INTEGER(didbr, double)
+
+#define TEST_DIVIDE_TO_INTEGER(name, intN, int_fmt, floatN, float_fmt)         \
+static inline __attribute__((__always_inline__)) int                           \
+test_ ## name(unsigned intN r1i, unsigned intN r2i, int m4, int fpc,           \
+              unsigned intN r1o, unsigned intN r3o, int cco, unsigned int fpco,\
+              int sigo)                                                        \
+{                                                                              \
+    union {                                                                    \
+        floatN f;                                                              \
+        unsigned intN i;                                                       \
+    } r1, r2, r3;                                                              \
+    int cc, err = 0, sig;                                                      \
+                                                                               \
+    r1.i = r1i;                                                                \
+    r2.i = r2i;                                                                \
+    r3.i = 0x12345678;                                                         \
+    printf("[ RUN      ] %" float_fmt "(0x%" int_fmt                           \
+           ") / %" float_fmt "(0x%" int_fmt ")\n", r1.f, r1.i, r2.f, r2.i);    \
+    asm volatile("sfpc %[fpc]" : : [fpc] "r" (fpc));                           \
+    cc = name(&r1.f, r2.f, &r3.f, m4, &sig);                                   \
+    asm volatile("stfpc %[fpc]" : [fpc] "=Q" (fpc));                           \
+    if (r1.i != r1o) {                                                         \
+        printf("[  FAILED  ] remainder 0x%" int_fmt                            \
+               " != expected 0x%" int_fmt "\n", r1.i, r1o);                    \
+        err += 1;                                                              \
+    }                                                                          \
+    if (r3.i != r3o) {                                                         \
+        printf("[  FAILED  ] quotient 0x%" int_fmt                             \
+               " != expected 0x%" int_fmt "\n", r3.i, r3o);                    \
+        err += 1;                                                              \
+    }                                                                          \
+    if (cc != cco) {                                                           \
+        printf("[  FAILED  ] cc %d != expected %d\n", cc, cco);                \
+        err += 1;                                                              \
+    }                                                                          \
+    if (fpc != fpco) {                                                         \
+        printf("[  FAILED  ] fpc 0x%x != expected 0x%x\n", fpc, fpco);         \
+        err += 1;                                                              \
+    }                                                                          \
+    if (sig != sigo) {                                                         \
+        printf("[  FAILED  ] signal 0x%x != expected 0x%x\n", sig, sigo);      \
+        err += 1;                                                              \
+    }                                                                          \
+                                                                               \
+    return err;                                                                \
+}
+
+TEST_DIVIDE_TO_INTEGER(diebr, int, "x", float, "f")
+TEST_DIVIDE_TO_INTEGER(didbr, long, "lx", double, "lf")
+
+int main(void)
+{
+    struct sigaction act = {
+        .sa_sigaction = sigfpe_handler,
+        .sa_flags = SA_SIGINFO,
+    };
+    int err = 0;
+
+    /* Set up SIGFPE handler */
+    if (sigaction(SIGFPE, &act, NULL)) {
+        printf("[  FAILED  ] sigaction(SIGFPE) failed\n");
+        return EXIT_FAILURE;
+    }
+
+    /* 451 / 460 */
+    err += test_diebr(0x43e1f1f1, 0x43e61616, 7, 0,
+                      0x43e1f1f1, 0, 0, 0, 0);
+
+    /* 480 / 0 */
+    err += test_diebr(0x43f00000, 0, 0, 0,
+                      0x7fc00000, 0x7fc00000, 1, 0x800000, 0);
+
+    /* QNaN / QNaN */
+    err += test_diebr(0xffffffff, 0xffffffff, 0, 0,
+                      0xffffffff, 0xffffffff, 1, 0, 0);
+
+    /* -2.08E-8 / -2.08E-8 */
+    err += test_diebr(0xb2b2b2b2, 0xb2b2b2b2, 0, 0,
+                      0x80000000, 0x3f800000, 0, 0, 0);
+
+    /* 4.62E-2 / -7.94E-11 */
+    err += test_diebr(0x3d3d3d3d, 0xaeaeaeae, 0, 0,
+                      0x2f38b8c0, 0xce0aaaab, 2, 0, 0);
+
+    /* 1.07E-31 / 2.19 */
+    err += test_diebr(0x0c0c0c0c, 0x400c0c0c, 6, 0,
+                      0xc00c0c0c, 0x3f800000, 0, 0x80000, 0);
+
+    /* 2.98E+29 / -5.7E-29 */
+    err += test_diebr(0x7070ffff, 0x90909090, 0, 0,
+                      0x6431c0c0, 0xbf5562aa, 3, 0, 0);
+
+    /* -2.19E-5 / 2.58E-26 */
+    err += test_diebr(0xb7b7b7b7, 0x15000000, 7, 0,
+                      0x80000000, 0xe237b7b7, 0, 0, 0);
+
+    /* 0 / 0 */
+    err += test_diebr(0, 0, 1, 0,
+                      0x7fc00000, 0x7fc00000, 1, 0x800000, 0);
+
+    /* 4.3E-33 / -2.08E-8 with SIGFPE */
+    err += test_diebr(0x09b2b2b2, 0xb2b2b2b2, 0, 0xfc000007,
+                      0xb2b2b2b1, 0xbf800000, 0, 0xfc000807, SIGFPE);
+
+    /* 1.19E-39 / -1.28E-9 */
+    err += test_diebr(0x000d0100, 0xb0b0b0b0, 6, 0xfc000000,
+                      0x5ed01000, 0x80000000, 0, 0xfc001000, SIGFPE);
+
+    /* 5.35E+7 / -5.21E+20 */
+    err += test_diebr(0x4c4c4c4c, 0xe1e1e1e1, 0, 0xfc000007,
+                      0xe1e1e1e1, 0xbf800000, 0, 0xfc000c07, SIGFPE);
+
+    /* 0 / 0 with SIGFPE */
+    err += test_diebr(0, 0, 0, 0xfc000007,
+                      0, 0x12345678, 0, 0xfc008007, SIGFPE);
+
+    /* 5.76E-16 / 5.39E+34 */
+    err += test_diebr(0x26262626, 0x79262626, 6, 0,
+                      0xf9262626, 0x3f800000, 0, 0x80000, 0);
+
+    /* -4.97E+17 / 2.03E-38 */
+    err += test_diebr(0xdcdcdcdc, 0x00dcdcdc, 7, 0xfc000000,
+                      0x80000000, 0xbb800000, 1, 0xfc000000, 0);
+
+    /* -1.23E+17 / SNaN */
+    err += test_diebr(0xdbdb240b, 0xffac73ff, 4, 0,
+                      0xffec73ff, 0xffec73ff, 1, 0x800000, 0);
+
+    /* 2.34E-38 / 3.27E-33 with SIGFPE */
+    err += test_diebr(0x00ff0987, 0x0987c6f6, 6, 0x08000000,
+                      0x8987c6b6, 0x3f800000, 0, 0x8000800, SIGFPE);
+
+    /* -5.93E+11 / -2.7E+4 */
+    err += test_diebr(0xd30a0040, 0xc6d30a00, 0, 0xc4000000,
+                      0xc74a4400, 0x4ba766c6, 2, 0xc4000000, 0);
+
+    /* 9.86E-32 / -inf */
+    err += test_diebr(0x0c000029, 0xff800000, 0, 0,
+                      0xc000029, 0x80000000, 0, 0, 0);
+
+    /* QNaN / SNaN */
+    err += test_diebr(0xffff94ff, 0xff94ff24, 4, 7,
+                      0xffd4ff24, 0xffd4ff24, 1, 0x800007, 0);
+
+    /* 2.8E-43 / -inf */
+    err += test_diebr(0x000000c8, 0xff800000, 0, 0x7c000007,
+                      0x000000c8, 0x80000000, 0, 0x7c000007, 0);
+
+    /* -1.7E+38 / -inf */
+    err += test_diebr(0xff00003d, 0xff800000, 0, 0,
+                      0xff00003d, 0, 0, 0, 0);
+
+    /* 1.94E-304 / 1.94E-304 */
+    err += test_didbr(0x00e100e100e100e1, 0x00e100e100e100e1, 0, 1,
+                      0, 0x3ff0000000000000, 0, 1, 0);
+
+    /* 4.82E-299 / 5.29E-308 */
+    err += test_didbr(0x0200230200230200, 0x0023020023020023, 0, 0,
+                      0x8001a017d247b3f4, 0x41cb2aa05f000000, 0, 0, 0);
+
+    /* -1.38E-75 / -3.77E+208 with SIGFPE */
+    err += test_didbr(0xb063eb3d63b063eb, 0xeb3d63b063eb3d63, 3, 0xe8000000,
+                      0x6b3d63b063eb3d63, 0x3ff0000000000000, 0, 0xe8000c00,
+                      SIGFPE);
+
+    return err ? EXIT_FAILURE : EXIT_SUCCESS;
+}
-- 
2.52.0
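
The expected values in the test vectors above are raw IEEE 754 bit patterns. As a minimal sketch for checking the decimal comments against the hex constants, the helpers below reinterpret a pattern as a C float or double. The names bits_to_float and bits_to_double are hypothetical and not part of the patch, and the host is assumed to use IEEE 754 binary32/binary64:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a binary32 bit pattern as a float. */
static float bits_to_float(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Hypothetical helper: reinterpret a binary64 bit pattern as a double. */
static double bits_to_double(uint64_t bits)
{
    double d;

    memcpy(&d, &bits, sizeof(d));
    return d;
}
```

For example, 0x3f800000 decodes to 1.0f, 0xbf800000 to -1.0f, and 0x3ff0000000000000 to 1.0.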




* Re: [PATCH 2/3] target/s390x: Implement DIVIDE TO INTEGER
  2026-01-21 22:12 ` [PATCH 2/3] target/s390x: Implement DIVIDE TO INTEGER Ilya Leoshkevich
@ 2026-01-22  1:04   ` Richard Henderson
  2026-01-22 13:14     ` Ilya Leoshkevich
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Henderson @ 2026-01-22  1:04 UTC (permalink / raw)
  To: Ilya Leoshkevich, Thomas Huth; +Cc: David Hildenbrand, qemu-s390x, qemu-devel

On 1/22/26 09:12, Ilya Leoshkevich wrote:
> +static bool float128_is_int(float128 x)
> +{
> +    return extract64(x.high, 0, 48) == 0 && x.low == 0;
> +}

This isn't testing for integer, it's testing for 1.0eNN,
i.e. a power of two.
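
To illustrate the point above, here is a minimal sketch. It uses binary64 instead of binary128 to stay short, and frac_is_zero and f64_is_int are hypothetical names, not softfloat functions. A zero fraction field only says the value is a power of two (or zero/infinity), whereas an integer test has to look at which fraction bits lie below the binary point:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static uint64_t f64_bits(double x)
{
    uint64_t u;

    memcpy(&u, &x, sizeof(u));
    return u;
}

/* Analogue of the quoted check: looks only at the fraction field. */
static bool frac_is_zero(double x)
{
    return (f64_bits(x) & ((1ULL << 52) - 1)) == 0;
}

/* Integer test: no fraction bits may remain below the binary point. */
static bool f64_is_int(double x)
{
    uint64_t u = f64_bits(x);
    int biased = (u >> 52) & 0x7ff;
    uint64_t frac = u & ((1ULL << 52) - 1);
    int e = biased - 1023;

    if (biased == 0x7ff) {
        return false;           /* infinity or NaN */
    }
    if (biased == 0) {
        return frac == 0;       /* zero is an integer, subnormals are not */
    }
    if (e < 0) {
        return false;           /* 0 < |x| < 1 */
    }
    if (e >= 52) {
        return true;            /* every fraction bit is integral */
    }
    return (frac & ((1ULL << (52 - e)) - 1)) == 0;
}
```

With these definitions, frac_is_zero(0.5) is true although 0.5 is not an integer, and frac_is_zero(3.0) is false although 3.0 is one; f64_is_int classifies both correctly.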

> +        /* Compute precise quotient */                                         \
> +        a128 = floatN ## _to_float128(a, &env->fpu_status);                    \
> +        b128 = floatN ## _to_float128(b, &env->fpu_status);                    \
> +        q128 = float128_div(a128, b128, &env->fpu_status);                     \
> +                                                                               \
> +        /* Final or partial case? */                                           \
> +        is_q128_smallish = float128_get_exp(q128) < p;                         \
> +        is_final = is_q128_smallish || float128_is_int(q128);                  \

The language from the manual,

# If the precise quotient is not an integer and the two
# integers closest to this precise quotient cannot both
# be represented exactly in the precision of the quotient ...

does not appear to be what you are computing here.
Certainly none of this relates to "precision of the quotient".
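
One possible reading of the quoted condition, as a hedged sketch rather than the manual's definition: an integer is exactly representable in a p-bit significand if, after dropping trailing zero bits (which the exponent can absorb), what remains fits in p bits. fits_in_precision and both_neighbours_fit are hypothetical helpers, the exponent range is ignored, and n is assumed to be below UINT64_MAX:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can n be stored exactly in a p-bit significand? */
static bool fits_in_precision(uint64_t n, int p)
{
    if (n == 0) {
        return true;
    }
    while ((n & 1) == 0) {
        n >>= 1;                /* trailing zeros move into the exponent */
    }
    return p >= 64 || n < (1ULL << p);
}

/*
 * For a non-integer quotient whose magnitude lies between n and n + 1:
 * are both neighbouring integers exactly representable?
 */
static bool both_neighbours_fit(uint64_t n, int p)
{
    return fits_in_precision(n, p) && fits_in_precision(n + 1, p);
}
```

For binary32 (p = 24), 2^24 is representable but 2^24 + 1 is not, so a non-integer quotient between those two values would not have both neighbours representable.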

I would imagine that all of this would be easier to accomplish if you
did this in fpu/ with FloatParts instead of continually swapping in and
out of float128.


r~




* Re: [PATCH 2/3] target/s390x: Implement DIVIDE TO INTEGER
  2026-01-22  1:04   ` Richard Henderson
@ 2026-01-22 13:14     ` Ilya Leoshkevich
  0 siblings, 0 replies; 10+ messages in thread
From: Ilya Leoshkevich @ 2026-01-22 13:14 UTC (permalink / raw)
  To: Thomas Huth, Richard Henderson, David Hildenbrand, qemu-devel

On 1/22/26 02:04, Richard Henderson wrote:
> On 1/22/26 09:12, Ilya Leoshkevich wrote:
>> +static bool float128_is_int(float128 x)
>> +{
>> +    return extract64(x.high, 0, 48) == 0 && x.low == 0;
>> +}
>
> This isn't testing for integer, it's testing for 1.0eNN,
> i.e. a power of two.

Whoops. Not sure what I was thinking here.


>> +        /* Compute precise quotient */                                         \
>> +        a128 = floatN ## _to_float128(a, &env->fpu_status);                    \
>> +        b128 = floatN ## _to_float128(b, &env->fpu_status);                    \
>> +        q128 = float128_div(a128, b128, &env->fpu_status);                     \
>> +                                                                               \
>> +        /* Final or partial case? */                                           \
>> +        is_q128_smallish = float128_get_exp(q128) < p;                         \
>> +        is_final = is_q128_smallish || float128_is_int(q128);                  \
>
> The language from the manual,
>
> # If the precise quotient is not an integer and the two
> # integers closest to this precise quotient cannot both
> # be represented exactly in the precision of the quotient ...
>
> does not appear to be what you are computing here.
> Certainly none of this relates to "precision of the quotient".

I was following the tables instead; I think they are more precise w.r.t.
what needs to be checked.

float128_is_int(q128) was supposed to replace the r=0 check, but it's 
probably unnecessary here altogether, because if the division result is 
precise, it doesn't matter which way we round. And I'm explicitly 
checking for r=0 in other places.


> I would imagine that all of this would be easier to accomplish if you 
> did this in fpu/ with FloatParts instead of continually swapping in 
> and out of float128.

That sounds good, I think I will also be able to reuse pick_nan and 
simplify exponent manipulations then. I will give it a try.


> r~ 



* Re: [PATCH 1/3] target/s390x: Dump Floating-Point-Control Register
  2026-01-21 22:12 ` [PATCH 1/3] target/s390x: Dump Floating-Point-Control Register Ilya Leoshkevich
@ 2026-01-22 16:40   ` Alex Bennée
  0 siblings, 0 replies; 10+ messages in thread
From: Alex Bennée @ 2026-01-22 16:40 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Thomas Huth, Richard Henderson, David Hildenbrand, qemu-s390x,
	qemu-devel

Ilya Leoshkevich <iii@linux.ibm.com> writes:

> Knowing the value of this register is very useful for debugging
> floating-point code.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* Re: [PATCH 3/3] tests/tcg/s390x: Test DIVIDE TO INTEGER
  2026-01-21 22:12 ` [PATCH 3/3] tests/tcg/s390x: Test " Ilya Leoshkevich
@ 2026-01-22 16:43   ` Alex Bennée
  2026-01-22 16:59     ` Ilya Leoshkevich
  0 siblings, 1 reply; 10+ messages in thread
From: Alex Bennée @ 2026-01-22 16:43 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Thomas Huth, Richard Henderson, David Hildenbrand, qemu-s390x,
	qemu-devel

Ilya Leoshkevich <iii@linux.ibm.com> writes:

> Add a test to prevent regressions. Data is generated using a
> libFuzzer-based fuzzer and hopefully covers all the important corner
> cases.
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  tests/tcg/s390x/Makefile.target     |   3 +
>  tests/tcg/s390x/divide-to-integer.c | 215 ++++++++++++++++++++++++++++
>  2 files changed, 218 insertions(+)
>  create mode 100644 tests/tcg/s390x/divide-to-integer.c
>
> diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
> index da5fe71a407..d5ec01d04fd 100644
> --- a/tests/tcg/s390x/Makefile.target
> +++ b/tests/tcg/s390x/Makefile.target
> @@ -49,14 +49,17 @@ TESTS+=cvd
>  TESTS+=cvb
>  TESTS+=ts
>  TESTS+=ex-smc
> +TESTS+=divide-to-integer
>  
>  cdsg: CFLAGS+=-pthread
>  cdsg: LDFLAGS+=-pthread
>  
>  rxsbg: CFLAGS+=-O2
> +divide-to-integer: CFLAGS+=-O2

As we generally compile -O0 to make life easier for people debugging
behaviour via gdbstub could we have an explanation of why -O2 is needed
here? Is it the same reason as rxsbg?

<snip>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



* Re: [PATCH 3/3] tests/tcg/s390x: Test DIVIDE TO INTEGER
  2026-01-22 16:43   ` Alex Bennée
@ 2026-01-22 16:59     ` Ilya Leoshkevich
  2026-01-22 18:09       ` Alex Bennée
  0 siblings, 1 reply; 10+ messages in thread
From: Ilya Leoshkevich @ 2026-01-22 16:59 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Thomas Huth, Richard Henderson, David Hildenbrand, qemu-s390x,
	qemu-devel


On 1/22/26 17:43, Alex Bennée wrote:
> Ilya Leoshkevich <iii@linux.ibm.com> writes:
>
>> Add a test to prevent regressions. Data is generated using a
>> libFuzzer-based fuzzer and hopefully covers all the important corner
>> cases.
>>
>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
>> ---
>>   tests/tcg/s390x/Makefile.target     |   3 +
>>   tests/tcg/s390x/divide-to-integer.c | 215 ++++++++++++++++++++++++++++
>>   2 files changed, 218 insertions(+)
>>   create mode 100644 tests/tcg/s390x/divide-to-integer.c
>>
>> diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
>> index da5fe71a407..d5ec01d04fd 100644
>> --- a/tests/tcg/s390x/Makefile.target
>> +++ b/tests/tcg/s390x/Makefile.target
>> @@ -49,14 +49,17 @@ TESTS+=cvd
>>   TESTS+=cvb
>>   TESTS+=ts
>>   TESTS+=ex-smc
>> +TESTS+=divide-to-integer
>>   
>>   cdsg: CFLAGS+=-pthread
>>   cdsg: LDFLAGS+=-pthread
>>   
>>   rxsbg: CFLAGS+=-O2
>> +divide-to-integer: CFLAGS+=-O2
> As we generally compile -O0 to make life easier for people debugging
> behaviour via gdbstub could we have an explanation of why -O2 is needed
> here? Is it the same reason as rxsbg?
>
> <snip>

Yes, this is because inlining is mandatory for the mask argument, and 
-O0 does not honor always_inline.




* Re: [PATCH 3/3] tests/tcg/s390x: Test DIVIDE TO INTEGER
  2026-01-22 16:59     ` Ilya Leoshkevich
@ 2026-01-22 18:09       ` Alex Bennée
  0 siblings, 0 replies; 10+ messages in thread
From: Alex Bennée @ 2026-01-22 18:09 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Thomas Huth, Richard Henderson, David Hildenbrand, qemu-s390x,
	qemu-devel

Ilya Leoshkevich <iii@linux.ibm.com> writes:

> On 1/22/26 17:43, Alex Bennée wrote:
>> Ilya Leoshkevich <iii@linux.ibm.com> writes:
>>
>>> Add a test to prevent regressions. Data is generated using a
>>> libFuzzer-based fuzzer and hopefully covers all the important corner
>>> cases.
>>>
>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
>>> ---
>>>   tests/tcg/s390x/Makefile.target     |   3 +
>>>   tests/tcg/s390x/divide-to-integer.c | 215 ++++++++++++++++++++++++++++
>>>   2 files changed, 218 insertions(+)
>>>   create mode 100644 tests/tcg/s390x/divide-to-integer.c
>>>
>>> diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
>>> index da5fe71a407..d5ec01d04fd 100644
>>> --- a/tests/tcg/s390x/Makefile.target
>>> +++ b/tests/tcg/s390x/Makefile.target
>>> @@ -49,14 +49,17 @@ TESTS+=cvd
>>>   TESTS+=cvb
>>>   TESTS+=ts
>>>   TESTS+=ex-smc
>>> +TESTS+=divide-to-integer
>>>   
>>>   cdsg: CFLAGS+=-pthread
>>>   cdsg: LDFLAGS+=-pthread
>>>   
>>>   rxsbg: CFLAGS+=-O2
>>> +divide-to-integer: CFLAGS+=-O2
>> As we generally compile -O0 to make life easier for people debugging
>> behaviour via gdbstub could we have an explanation of why -O2 is needed
>> here? Is it the same reason as rxsbg?
>>
>> <snip>
>
> Yes, this is because inlining is mandatory for the mask argument, and
> -O0 does not honor always_inline.

Could we add a comment to that effect in the Makefile? Otherwise looks
fine to me:

Acked-by: Alex Bennée <alex.bennee@linaro.org>


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



end of thread, other threads:[~2026-01-22 18:11 UTC | newest]

Thread overview: 10+ messages
2026-01-21 22:12 [PATCH 0/3] target/s390x: Implement DIVIDE TO INTEGER Ilya Leoshkevich
2026-01-21 22:12 ` [PATCH 1/3] target/s390x: Dump Floating-Point-Control Register Ilya Leoshkevich
2026-01-22 16:40   ` Alex Bennée
2026-01-21 22:12 ` [PATCH 2/3] target/s390x: Implement DIVIDE TO INTEGER Ilya Leoshkevich
2026-01-22  1:04   ` Richard Henderson
2026-01-22 13:14     ` Ilya Leoshkevich
2026-01-21 22:12 ` [PATCH 3/3] tests/tcg/s390x: Test " Ilya Leoshkevich
2026-01-22 16:43   ` Alex Bennée
2026-01-22 16:59     ` Ilya Leoshkevich
2026-01-22 18:09       ` Alex Bennée
