qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw
@ 2011-06-06 14:25 Paolo Bonzini
  2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h Paolo Bonzini
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-06 14:25 UTC (permalink / raw)
  To: qemu-devel

Functions like ldl_be_p and ldl_le_p are currently used only as building
blocks for {ld,st}XX_p.  As such, they are in cpu-all.h even though they
have absolutely no dependency on the target.

In order to make them globally available, this series moves them to
bswap.h instead.

An interesting part of this is that there are functions also for floating
point values.  Leaving them in cpu-all.h would be possible but untidy.
In fact handling these is easy, but it requires to make softfloat.h
target-dependent as well.  This is what patch 2 does.

v1->v2:
        rebase, use softfloat-specialize.h instead of introducing
        softfloat-target.h

Paolo Bonzini (3):
  move WORDS_ALIGNED to qemu-common.h
  softfloat: change default nan definitions to variables
  move unaligned memory access functions to bswap.h

 Makefile.hw                |    2 +-
 bswap.h                    |  474 ++++++++++++++++++++++++++++++++++++++++++++
 configure                  |    3 +-
 cpu-all.h                  |  446 +-----------------------------------------
 cpu-common.h               |    4 -
 fpu/softfloat-specialize.h |   72 +++++++
 fpu/softfloat.h            |   60 +-----
 qemu-common.h              |    4 +
 8 files changed, 562 insertions(+), 503 deletions(-)

-- 
1.7.4.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h
  2011-06-06 14:25 [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw Paolo Bonzini
@ 2011-06-06 14:25 ` Paolo Bonzini
  2011-06-06 17:15   ` Andreas Färber
  2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 2/3] softfloat: change default nan definitions to variables Paolo Bonzini
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-06 14:25 UTC (permalink / raw)
  To: qemu-devel

This is not a CPU interface.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-common.h  |    4 ----
 qemu-common.h |    4 ++++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index 7a86d63..7979a18 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -3,10 +3,6 @@
 
 /* CPU interfaces that are target indpendent.  */
 
-#if defined(__arm__) || defined(__sparc__) || defined(__mips__) || defined(__hppa__) || defined(__ia64__)
-#define WORDS_ALIGNED
-#endif
-
 #ifdef TARGET_PHYS_ADDR_BITS
 #include "targphys.h"
 #endif
diff --git a/qemu-common.h b/qemu-common.h
index b851b20..7484ef8 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -4,6 +4,10 @@
 
 #include "config-host.h"
 
+#if defined(__arm__) || defined(__sparc__) || defined(__mips__) || defined(__hppa__) || defined(__ia64__)
+#define WORDS_ALIGNED
+#endif
+
 #define QEMU_NORETURN __attribute__ ((__noreturn__))
 #ifdef CONFIG_GCC_ATTRIBUTE_WARN_UNUSED_RESULT
 #define QEMU_WARN_UNUSED_RESULT __attribute__((warn_unused_result))
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH v2 2/3] softfloat: change default nan definitions to variables
  2011-06-06 14:25 [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw Paolo Bonzini
  2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h Paolo Bonzini
@ 2011-06-06 14:25 ` Paolo Bonzini
  2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h Paolo Bonzini
  2011-07-06 13:29 ` [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw Paolo Bonzini
  3 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-06 14:25 UTC (permalink / raw)
  To: qemu-devel

Most definitions in softfloat.h are really target-independent, but the
file is not because it includes definitions of the default NaN values.
Change those to variables to allow including softfloat.h from files that
are not compiled per-target.  By making them const, the compiler is
allowed to optimize them into softfloat functions that use them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 fpu/softfloat-specialize.h |   72 ++++++++++++++++++++++++++++++++++++++++++++
 fpu/softfloat.h            |   60 +++++-------------------------------
 2 files changed, 81 insertions(+), 51 deletions(-)

diff --git a/fpu/softfloat-specialize.h b/fpu/softfloat-specialize.h
index c7d35a1..c165205 100644
--- a/fpu/softfloat-specialize.h
+++ b/fpu/softfloat-specialize.h
@@ -35,6 +35,78 @@ these four paragraphs for those parts of this code that are retained.
 
 =============================================================================*/
 
+#if defined(TARGET_MIPS) || defined(TARGET_SH4) || defined(TARGET_UNICORE32)
+#define SNAN_BIT_IS_ONE		1
+#else
+#define SNAN_BIT_IS_ONE		0
+#endif
+
+/*----------------------------------------------------------------------------
+| The pattern for a default generated half-precision NaN.
+*----------------------------------------------------------------------------*/
+#if defined(TARGET_ARM)
+const float16 float16_default_nan = const_float16(0x7E00);
+#elif SNAN_BIT_IS_ONE
+const float16 float16_default_nan = const_float16(0x7DFF);
+#else
+const float16 float16_default_nan = const_float16(0xFE00);
+#endif
+
+/*----------------------------------------------------------------------------
+| The pattern for a default generated single-precision NaN.
+*----------------------------------------------------------------------------*/
+#if defined(TARGET_SPARC)
+const float32 float32_default_nan = const_float32(0x7FFFFFFF);
+#elif defined(TARGET_PPC) || defined(TARGET_ARM) || defined(TARGET_ALPHA)
+const float32 float32_default_nan = const_float32(0x7FC00000);
+#elif SNAN_BIT_IS_ONE
+const float32 float32_default_nan = const_float32(0x7FBFFFFF);
+#else
+const float32 float32_default_nan = const_float32(0xFFC00000);
+#endif
+
+/*----------------------------------------------------------------------------
+| The pattern for a default generated double-precision NaN.
+*----------------------------------------------------------------------------*/
+#if defined(TARGET_SPARC)
+const float64 float64_default_nan = const_float64(LIT64( 0x7FFFFFFFFFFFFFFF ));
+#elif defined(TARGET_PPC) || defined(TARGET_ARM) || defined(TARGET_ALPHA)
+const float64 float64_default_nan = const_float64(LIT64( 0x7FF8000000000000 ));
+#elif SNAN_BIT_IS_ONE
+const float64 float64_default_nan = const_float64(LIT64( 0x7FF7FFFFFFFFFFFF ));
+#else
+const float64 float64_default_nan = const_float64(LIT64( 0xFFF8000000000000 ));
+#endif
+
+/*----------------------------------------------------------------------------
+| The pattern for a default generated extended double-precision NaN.
+*----------------------------------------------------------------------------*/
+#if SNAN_BIT_IS_ONE
+#define floatx80_default_nan_high 0x7FFF
+#define floatx80_default_nan_low  LIT64( 0xBFFFFFFFFFFFFFFF )
+#else
+#define floatx80_default_nan_high 0xFFFF
+#define floatx80_default_nan_low  LIT64( 0xC000000000000000 )
+#endif
+
+const floatx80 floatx80_default_nan = make_floatx80(floatx80_default_nan_high,
+                                                    floatx80_default_nan_low);
+
+/*----------------------------------------------------------------------------
+| The pattern for a default generated quadruple-precision NaN.  The `high' and
+| `low' values hold the most- and least-significant bits, respectively.
+*----------------------------------------------------------------------------*/
+#if SNAN_BIT_IS_ONE
+#define float128_default_nan_high LIT64( 0x7FFF7FFFFFFFFFFF )
+#define float128_default_nan_low  LIT64( 0xFFFFFFFFFFFFFFFF )
+#else
+#define float128_default_nan_high LIT64( 0xFFFF800000000000 )
+#define float128_default_nan_low  LIT64( 0x0000000000000000 )
+#endif
+
+const float128 float128_default_nan = make_float128(float128_default_nan_high,
+                                                    float128_default_nan_low);
+
 /*----------------------------------------------------------------------------
 | Raises the exceptions specified by `flags'.  Floating-point traps can be
 | defined here if desired.  It is currently not possible for such a trap
diff --git a/fpu/softfloat.h b/fpu/softfloat.h
index bde2500..3bb7d8f 100644
--- a/fpu/softfloat.h
+++ b/fpu/softfloat.h
@@ -43,7 +43,7 @@ these four paragraphs for those parts of this code that are retained.
 #endif
 
 #include <inttypes.h>
-#include "config.h"
+#include "config-host.h"
 
 /*----------------------------------------------------------------------------
 | Each of the following `typedef's defines the most convenient type that holds
@@ -68,12 +68,6 @@ typedef int64_t int64;
 #define LIT64( a ) a##LL
 #define INLINE static inline
 
-#if defined(TARGET_MIPS) || defined(TARGET_SH4) || defined(TARGET_UNICORE32)
-#define SNAN_BIT_IS_ONE		1
-#else
-#define SNAN_BIT_IS_ONE		0
-#endif
-
 #define STATUS_PARAM , float_status *status
 #define STATUS(field) status->field
 #define STATUS_VAR , status
@@ -142,6 +136,7 @@ typedef struct {
     uint64_t low, high;
 #endif
 } float128;
+#define make_float128(high_, low_) ((float128) { .high = high_, .low = low_ })
 
 /*----------------------------------------------------------------------------
 | Software IEC/IEEE floating-point underflow tininess-detection mode.
@@ -248,13 +243,7 @@ float16 float16_maybe_silence_nan( float16 );
 /*----------------------------------------------------------------------------
 | The pattern for a default generated half-precision NaN.
 *----------------------------------------------------------------------------*/
-#if defined(TARGET_ARM)
-#define float16_default_nan make_float16(0x7E00)
-#elif SNAN_BIT_IS_ONE
-#define float16_default_nan make_float16(0x7DFF)
-#else
-#define float16_default_nan make_float16(0xFE00)
-#endif
+extern const float16 float16_default_nan;
 
 /*----------------------------------------------------------------------------
 | Software IEC/IEEE single-precision conversion routines.
@@ -357,15 +346,7 @@ INLINE float32 float32_set_sign(float32 a, int sign)
 /*----------------------------------------------------------------------------
 | The pattern for a default generated single-precision NaN.
 *----------------------------------------------------------------------------*/
-#if defined(TARGET_SPARC)
-#define float32_default_nan make_float32(0x7FFFFFFF)
-#elif defined(TARGET_PPC) || defined(TARGET_ARM) || defined(TARGET_ALPHA)
-#define float32_default_nan make_float32(0x7FC00000)
-#elif SNAN_BIT_IS_ONE
-#define float32_default_nan make_float32(0x7FBFFFFF)
-#else
-#define float32_default_nan make_float32(0xFFC00000)
-#endif
+extern const float32 float32_default_nan;
 
 /*----------------------------------------------------------------------------
 | Software IEC/IEEE double-precision conversion routines.
@@ -470,15 +451,7 @@ INLINE float64 float64_set_sign(float64 a, int sign)
 /*----------------------------------------------------------------------------
 | The pattern for a default generated double-precision NaN.
 *----------------------------------------------------------------------------*/
-#if defined(TARGET_SPARC)
-#define float64_default_nan make_float64(LIT64( 0x7FFFFFFFFFFFFFFF ))
-#elif defined(TARGET_PPC) || defined(TARGET_ARM) || defined(TARGET_ALPHA)
-#define float64_default_nan make_float64(LIT64( 0x7FF8000000000000 ))
-#elif SNAN_BIT_IS_ONE
-#define float64_default_nan make_float64(LIT64( 0x7FF7FFFFFFFFFFFF ))
-#else
-#define float64_default_nan make_float64(LIT64( 0xFFF8000000000000 ))
-#endif
+extern const float64 float64_default_nan;
 
 /*----------------------------------------------------------------------------
 | Software IEC/IEEE extended double-precision conversion routines.
@@ -561,17 +534,9 @@ INLINE int floatx80_is_any_nan(floatx80 a)
 #define floatx80_infinity make_floatx80(0x7fff, 0x8000000000000000LL)
 
 /*----------------------------------------------------------------------------
-| The pattern for a default generated extended double-precision NaN.  The
-| `high' and `low' values hold the most- and least-significant bits,
-| respectively.
+| The pattern for a default generated extended double-precision NaN.
 *----------------------------------------------------------------------------*/
-#if SNAN_BIT_IS_ONE
-#define floatx80_default_nan_high 0x7FFF
-#define floatx80_default_nan_low  LIT64( 0xBFFFFFFFFFFFFFFF )
-#else
-#define floatx80_default_nan_high 0xFFFF
-#define floatx80_default_nan_low  LIT64( 0xC000000000000000 )
-#endif
+extern const floatx80 floatx80_default_nan;
 
 /*----------------------------------------------------------------------------
 | Software IEC/IEEE quadruple-precision conversion routines.
@@ -648,15 +613,8 @@ INLINE int float128_is_any_nan(float128 a)
 }
 
 /*----------------------------------------------------------------------------
-| The pattern for a default generated quadruple-precision NaN.  The `high' and
-| `low' values hold the most- and least-significant bits, respectively.
+| The pattern for a default generated quadruple-precision NaN.
 *----------------------------------------------------------------------------*/
-#if SNAN_BIT_IS_ONE
-#define float128_default_nan_high LIT64( 0x7FFF7FFFFFFFFFFF )
-#define float128_default_nan_low  LIT64( 0xFFFFFFFFFFFFFFFF )
-#else
-#define float128_default_nan_high LIT64( 0xFFFF800000000000 )
-#define float128_default_nan_low  LIT64( 0x0000000000000000 )
-#endif
+extern const float128 float128_default_nan;
 
 #endif /* !SOFTFLOAT_H */
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h
  2011-06-06 14:25 [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw Paolo Bonzini
  2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h Paolo Bonzini
  2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 2/3] softfloat: change default nan definitions to variables Paolo Bonzini
@ 2011-06-06 14:25 ` Paolo Bonzini
  2011-06-06 19:56   ` Richard Henderson
  2011-07-06 13:29 ` [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw Paolo Bonzini
  3 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-06 14:25 UTC (permalink / raw)
  To: qemu-devel

This is just code movement, and moving the fpu/ include path from
target-dependent to target-independent Make variables.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Makefile.hw |    2 +-
 bswap.h     |  474 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 configure   |    3 +-
 cpu-all.h   |  446 +-------------------------------------------------------
 4 files changed, 477 insertions(+), 448 deletions(-)

diff --git a/Makefile.hw b/Makefile.hw
index b9181ab..659e441 100644
--- a/Makefile.hw
+++ b/Makefile.hw
@@ -9,7 +9,7 @@ include $(SRC_PATH)/rules.mak
 
 $(call set-vpath, $(SRC_PATH):$(SRC_PATH)/hw)
 
-QEMU_CFLAGS+=-I.. -I$(SRC_PATH)/fpu
+QEMU_CFLAGS+=-I..
 
 include $(SRC_PATH)/Makefile.objs
 
diff --git a/bswap.h b/bswap.h
index 82a7951..f41bebe 100644
--- a/bswap.h
+++ b/bswap.h
@@ -11,6 +11,8 @@
 #include <machine/bswap.h>
 #else
 
+#include "softfloat.h"
+
 #ifdef CONFIG_BYTESWAP_H
 #include <byteswap.h>
 #else
@@ -237,4 +239,476 @@ static inline uint32_t qemu_bswap_len(uint32_t value, int len)
     return bswap32(value) >> (32 - 8 * len);
 }
 
+typedef union {
+    float32 f;
+    uint32_t l;
+} CPU_FloatU;
+
+typedef union {
+    float64 d;
+#if defined(HOST_WORDS_BIGENDIAN)
+    struct {
+        uint32_t upper;
+        uint32_t lower;
+    } l;
+#else
+    struct {
+        uint32_t lower;
+        uint32_t upper;
+    } l;
+#endif
+    uint64_t ll;
+} CPU_DoubleU;
+
+typedef union {
+     floatx80 d;
+     struct {
+         uint64_t lower;
+         uint16_t upper;
+     } l;
+} CPU_LDoubleU;
+
+typedef union {
+    float128 q;
+#if defined(HOST_WORDS_BIGENDIAN)
+    struct {
+        uint32_t upmost;
+        uint32_t upper;
+        uint32_t lower;
+        uint32_t lowest;
+    } l;
+    struct {
+        uint64_t upper;
+        uint64_t lower;
+    } ll;
+#else
+    struct {
+        uint32_t lowest;
+        uint32_t lower;
+        uint32_t upper;
+        uint32_t upmost;
+    } l;
+    struct {
+        uint64_t lower;
+        uint64_t upper;
+    } ll;
+#endif
+} CPU_QuadU;
+
+/* unaligned/endian-independent pointer access */
+
+/*
+ * the generic syntax is:
+ *
+ * load: ld{type}{sign}{size}{endian}_p(ptr)
+ *
+ * store: st{type}{size}{endian}_p(ptr, val)
+ *
+ * Note there are small differences with the softmmu access API!
+ *
+ * type is:
+ * (empty): integer access
+ *   f    : float access
+ *
+ * sign is:
+ * (empty): for floats or 32 bit size
+ *   u    : unsigned
+ *   s    : signed
+ *
+ * size is:
+ *   b: 8 bits
+ *   w: 16 bits
+ *   l: 32 bits
+ *   q: 64 bits
+ *
+ * endian is:
+ * (empty): 8 bit access
+ *   be   : big endian
+ *   le   : little endian
+ */
+static inline int ldub_p(const void *ptr)
+{
+    return *(uint8_t *)ptr;
+}
+
+static inline int ldsb_p(const void *ptr)
+{
+    return *(int8_t *)ptr;
+}
+
+static inline void stb_p(void *ptr, int v)
+{
+    *(uint8_t *)ptr = v;
+}
+
+/* NOTE: on arm, putting 2 in /proc/sys/debug/alignment so that the
+   kernel handles unaligned load/stores may give better results, but
+   it is a system wide setting : bad */
+#if defined(HOST_WORDS_BIGENDIAN) || defined(WORDS_ALIGNED)
+
+/* conservative code for little endian unaligned accesses */
+static inline int lduw_le_p(const void *ptr)
+{
+#ifdef _ARCH_PPC
+    int val;
+    __asm__ __volatile__ ("lhbrx %0,0,%1" : "=r" (val) : "r" (ptr));
+    return val;
+#else
+    const uint8_t *p = ptr;
+    return p[0] | (p[1] << 8);
+#endif
+}
+
+static inline int ldsw_le_p(const void *ptr)
+{
+#ifdef _ARCH_PPC
+    int val;
+    __asm__ __volatile__ ("lhbrx %0,0,%1" : "=r" (val) : "r" (ptr));
+    return (int16_t)val;
+#else
+    const uint8_t *p = ptr;
+    return (int16_t)(p[0] | (p[1] << 8));
+#endif
+}
+
+static inline int ldl_le_p(const void *ptr)
+{
+#ifdef _ARCH_PPC
+    int val;
+    __asm__ __volatile__ ("lwbrx %0,0,%1" : "=r" (val) : "r" (ptr));
+    return val;
+#else
+    const uint8_t *p = ptr;
+    return p[0] | (p[1] << 8) | (p[2] << 16) | (p[3] << 24);
+#endif
+}
+
+static inline uint64_t ldq_le_p(const void *ptr)
+{
+    const uint8_t *p = ptr;
+    uint32_t v1, v2;
+    v1 = ldl_le_p(p);
+    v2 = ldl_le_p(p + 4);
+    return v1 | ((uint64_t)v2 << 32);
+}
+
+static inline void stw_le_p(void *ptr, int v)
+{
+#ifdef _ARCH_PPC
+    __asm__ __volatile__ ("sthbrx %1,0,%2" : "=m" (*(uint16_t *)ptr) : "r" (v), "r" (ptr));
+#else
+    uint8_t *p = ptr;
+    p[0] = v;
+    p[1] = v >> 8;
+#endif
+}
+
+static inline void stl_le_p(void *ptr, int v)
+{
+#ifdef _ARCH_PPC
+    __asm__ __volatile__ ("stwbrx %1,0,%2" : "=m" (*(uint32_t *)ptr) : "r" (v), "r" (ptr));
+#else
+    uint8_t *p = ptr;
+    p[0] = v;
+    p[1] = v >> 8;
+    p[2] = v >> 16;
+    p[3] = v >> 24;
+#endif
+}
+
+static inline void stq_le_p(void *ptr, uint64_t v)
+{
+    uint8_t *p = ptr;
+    stl_le_p(p, (uint32_t)v);
+    stl_le_p(p + 4, v >> 32);
+}
+
+/* float access */
+
+static inline float32 ldfl_le_p(const void *ptr)
+{
+    union {
+        float32 f;
+        uint32_t i;
+    } u;
+    u.i = ldl_le_p(ptr);
+    return u.f;
+}
+
+static inline void stfl_le_p(void *ptr, float32 v)
+{
+    union {
+        float32 f;
+        uint32_t i;
+    } u;
+    u.f = v;
+    stl_le_p(ptr, u.i);
+}
+
+static inline float64 ldfq_le_p(const void *ptr)
+{
+    CPU_DoubleU u;
+    u.l.lower = ldl_le_p(ptr);
+    u.l.upper = ldl_le_p(ptr + 4);
+    return u.d;
+}
+
+static inline void stfq_le_p(void *ptr, float64 v)
+{
+    CPU_DoubleU u;
+    u.d = v;
+    stl_le_p(ptr, u.l.lower);
+    stl_le_p(ptr + 4, u.l.upper);
+}
+
+#else
+
+static inline int lduw_le_p(const void *ptr)
+{
+    return *(uint16_t *)ptr;
+}
+
+static inline int ldsw_le_p(const void *ptr)
+{
+    return *(int16_t *)ptr;
+}
+
+static inline int ldl_le_p(const void *ptr)
+{
+    return *(uint32_t *)ptr;
+}
+
+static inline uint64_t ldq_le_p(const void *ptr)
+{
+    return *(uint64_t *)ptr;
+}
+
+static inline void stw_le_p(void *ptr, int v)
+{
+    *(uint16_t *)ptr = v;
+}
+
+static inline void stl_le_p(void *ptr, int v)
+{
+    *(uint32_t *)ptr = v;
+}
+
+static inline void stq_le_p(void *ptr, uint64_t v)
+{
+    *(uint64_t *)ptr = v;
+}
+
+/* float access */
+
+static inline float32 ldfl_le_p(const void *ptr)
+{
+    return *(float32 *)ptr;
+}
+
+static inline float64 ldfq_le_p(const void *ptr)
+{
+    return *(float64 *)ptr;
+}
+
+static inline void stfl_le_p(void *ptr, float32 v)
+{
+    *(float32 *)ptr = v;
+}
+
+static inline void stfq_le_p(void *ptr, float64 v)
+{
+    *(float64 *)ptr = v;
+}
+#endif
+
+#if !defined(HOST_WORDS_BIGENDIAN) || defined(WORDS_ALIGNED)
+
+static inline int lduw_be_p(const void *ptr)
+{
+#if defined(__i386__)
+    int val;
+    asm volatile ("movzwl %1, %0\n"
+                  "xchgb %b0, %h0\n"
+                  : "=q" (val)
+                  : "m" (*(uint16_t *)ptr));
+    return val;
+#else
+    const uint8_t *b = ptr;
+    return ((b[0] << 8) | b[1]);
+#endif
+}
+
+static inline int ldsw_be_p(const void *ptr)
+{
+#if defined(__i386__)
+    int val;
+    asm volatile ("movzwl %1, %0\n"
+                  "xchgb %b0, %h0\n"
+                  : "=q" (val)
+                  : "m" (*(uint16_t *)ptr));
+    return (int16_t)val;
+#else
+    const uint8_t *b = ptr;
+    return (int16_t)((b[0] << 8) | b[1]);
+#endif
+}
+
+static inline int ldl_be_p(const void *ptr)
+{
+#if defined(__i386__) || defined(__x86_64__)
+    int val;
+    asm volatile ("movl %1, %0\n"
+                  "bswap %0\n"
+                  : "=r" (val)
+                  : "m" (*(uint32_t *)ptr));
+    return val;
+#else
+    const uint8_t *b = ptr;
+    return (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3];
+#endif
+}
+
+static inline uint64_t ldq_be_p(const void *ptr)
+{
+    uint32_t a,b;
+    a = ldl_be_p(ptr);
+    b = ldl_be_p((uint8_t *)ptr + 4);
+    return (((uint64_t)a<<32)|b);
+}
+
+static inline void stw_be_p(void *ptr, int v)
+{
+#if defined(__i386__)
+    asm volatile ("xchgb %b0, %h0\n"
+                  "movw %w0, %1\n"
+                  : "=q" (v)
+                  : "m" (*(uint16_t *)ptr), "0" (v));
+#else
+    uint8_t *d = (uint8_t *) ptr;
+    d[0] = v >> 8;
+    d[1] = v;
+#endif
+}
+
+static inline void stl_be_p(void *ptr, int v)
+{
+#if defined(__i386__) || defined(__x86_64__)
+    asm volatile ("bswap %0\n"
+                  "movl %0, %1\n"
+                  : "=r" (v)
+                  : "m" (*(uint32_t *)ptr), "0" (v));
+#else
+    uint8_t *d = (uint8_t *) ptr;
+    d[0] = v >> 24;
+    d[1] = v >> 16;
+    d[2] = v >> 8;
+    d[3] = v;
+#endif
+}
+
+static inline void stq_be_p(void *ptr, uint64_t v)
+{
+    stl_be_p(ptr, v >> 32);
+    stl_be_p((uint8_t *)ptr + 4, v);
+}
+
+/* float access */
+
+static inline float32 ldfl_be_p(const void *ptr)
+{
+    union {
+        float32 f;
+        uint32_t i;
+    } u;
+    u.i = ldl_be_p(ptr);
+    return u.f;
+}
+
+static inline void stfl_be_p(void *ptr, float32 v)
+{
+    union {
+        float32 f;
+        uint32_t i;
+    } u;
+    u.f = v;
+    stl_be_p(ptr, u.i);
+}
+
+static inline float64 ldfq_be_p(const void *ptr)
+{
+    CPU_DoubleU u;
+    u.l.upper = ldl_be_p(ptr);
+    u.l.lower = ldl_be_p((uint8_t *)ptr + 4);
+    return u.d;
+}
+
+static inline void stfq_be_p(void *ptr, float64 v)
+{
+    CPU_DoubleU u;
+    u.d = v;
+    stl_be_p(ptr, u.l.upper);
+    stl_be_p((uint8_t *)ptr + 4, u.l.lower);
+}
+
+#else
+
+static inline int lduw_be_p(const void *ptr)
+{
+    return *(uint16_t *)ptr;
+}
+
+static inline int ldsw_be_p(const void *ptr)
+{
+    return *(int16_t *)ptr;
+}
+
+static inline int ldl_be_p(const void *ptr)
+{
+    return *(uint32_t *)ptr;
+}
+
+static inline uint64_t ldq_be_p(const void *ptr)
+{
+    return *(uint64_t *)ptr;
+}
+
+static inline void stw_be_p(void *ptr, int v)
+{
+    *(uint16_t *)ptr = v;
+}
+
+static inline void stl_be_p(void *ptr, int v)
+{
+    *(uint32_t *)ptr = v;
+}
+
+static inline void stq_be_p(void *ptr, uint64_t v)
+{
+    *(uint64_t *)ptr = v;
+}
+
+/* float access */
+
+static inline float32 ldfl_be_p(const void *ptr)
+{
+    return *(float32 *)ptr;
+}
+
+static inline float64 ldfq_be_p(const void *ptr)
+{
+    return *(float64 *)ptr;
+}
+
+static inline void stfl_be_p(void *ptr, float32 v)
+{
+    *(float32 *)ptr = v;
+}
+
+static inline void stfq_be_p(void *ptr, float64 v)
+{
+    *(float64 *)ptr = v;
+}
+
+#endif
+
 #endif /* BSWAP_H */
diff --git a/configure b/configure
index d38b952..29c5bcc 100755
--- a/configure
+++ b/configure
@@ -233,7 +233,7 @@ QEMU_CFLAGS="-Wall -Wundef -Wwrite-strings -Wmissing-prototypes $QEMU_CFLAGS"
 QEMU_CFLAGS="-Wstrict-prototypes -Wredundant-decls $QEMU_CFLAGS"
 QEMU_CFLAGS="-D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE $QEMU_CFLAGS"
 QEMU_CFLAGS="-D_FORTIFY_SOURCE=2 $QEMU_CFLAGS"
-QEMU_INCLUDES="-I. -I\$(SRC_PATH)"
+QEMU_INCLUDES="-I. -I\$(SRC_PATH) -I\$(SRC_PATH)/fpu"
 LDFLAGS="-g $LDFLAGS"
 
 # make source path absolute
@@ -3415,7 +3415,6 @@ else
   includes="-I\$(SRC_PATH)/tcg/\$(ARCH) $includes"
 fi
 includes="-I\$(SRC_PATH)/tcg $includes"
-includes="-I\$(SRC_PATH)/fpu $includes"
 
 if test "$target_user_only" = "yes" ; then
     libdis_config_mak=libdis-user/config.mak
diff --git a/cpu-all.h b/cpu-all.h
index 880f570..bdf1f08 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -35,8 +35,6 @@
  * TARGET_WORDS_BIGENDIAN : same for target cpu
  */
 
-#include "softfloat.h"
-
 #if defined(HOST_WORDS_BIGENDIAN) != defined(TARGET_WORDS_BIGENDIAN)
 #define BSWAP_NEEDED
 #endif
@@ -114,64 +112,6 @@ static inline void tswap64s(uint64_t *s)
 #define bswaptls(s) bswap64s(s)
 #endif
 
-typedef union {
-    float32 f;
-    uint32_t l;
-} CPU_FloatU;
-
-/* NOTE: arm FPA is horrible as double 32 bit words are stored in big
-   endian ! */
-typedef union {
-    float64 d;
-#if defined(HOST_WORDS_BIGENDIAN)
-    struct {
-        uint32_t upper;
-        uint32_t lower;
-    } l;
-#else
-    struct {
-        uint32_t lower;
-        uint32_t upper;
-    } l;
-#endif
-    uint64_t ll;
-} CPU_DoubleU;
-
-typedef union {
-     floatx80 d;
-     struct {
-         uint64_t lower;
-         uint16_t upper;
-     } l;
-} CPU_LDoubleU;
-
-typedef union {
-    float128 q;
-#if defined(HOST_WORDS_BIGENDIAN)
-    struct {
-        uint32_t upmost;
-        uint32_t upper;
-        uint32_t lower;
-        uint32_t lowest;
-    } l;
-    struct {
-        uint64_t upper;
-        uint64_t lower;
-    } ll;
-#else
-    struct {
-        uint32_t lowest;
-        uint32_t lower;
-        uint32_t upper;
-        uint32_t upmost;
-    } l;
-    struct {
-        uint64_t lower;
-        uint64_t upper;
-    } ll;
-#endif
-} CPU_QuadU;
-
 /* CPU memory access without any memory or io remapping */
 
 /*
@@ -207,392 +147,8 @@ typedef union {
  *   user   : user mode access using soft MMU
  *   kernel : kernel mode access using soft MMU
  */
-static inline int ldub_p(const void *ptr)
-{
-    return *(uint8_t *)ptr;
-}
-
-static inline int ldsb_p(const void *ptr)
-{
-    return *(int8_t *)ptr;
-}
-
-static inline void stb_p(void *ptr, int v)
-{
-    *(uint8_t *)ptr = v;
-}
-
-/* NOTE: on arm, putting 2 in /proc/sys/debug/alignment so that the
-   kernel handles unaligned load/stores may give better results, but
-   it is a system wide setting : bad */
-#if defined(HOST_WORDS_BIGENDIAN) || defined(WORDS_ALIGNED)
-
-/* conservative code for little endian unaligned accesses */
-static inline int lduw_le_p(const void *ptr)
-{
-#ifdef _ARCH_PPC
-    int val;
-    __asm__ __volatile__ ("lhbrx %0,0,%1" : "=r" (val) : "r" (ptr));
-    return val;
-#else
-    const uint8_t *p = ptr;
-    return p[0] | (p[1] << 8);
-#endif
-}
-
-static inline int ldsw_le_p(const void *ptr)
-{
-#ifdef _ARCH_PPC
-    int val;
-    __asm__ __volatile__ ("lhbrx %0,0,%1" : "=r" (val) : "r" (ptr));
-    return (int16_t)val;
-#else
-    const uint8_t *p = ptr;
-    return (int16_t)(p[0] | (p[1] << 8));
-#endif
-}
-
-static inline int ldl_le_p(const void *ptr)
-{
-#ifdef _ARCH_PPC
-    int val;
-    __asm__ __volatile__ ("lwbrx %0,0,%1" : "=r" (val) : "r" (ptr));
-    return val;
-#else
-    const uint8_t *p = ptr;
-    return p[0] | (p[1] << 8) | (p[2] << 16) | (p[3] << 24);
-#endif
-}
-
-static inline uint64_t ldq_le_p(const void *ptr)
-{
-    const uint8_t *p = ptr;
-    uint32_t v1, v2;
-    v1 = ldl_le_p(p);
-    v2 = ldl_le_p(p + 4);
-    return v1 | ((uint64_t)v2 << 32);
-}
-
-static inline void stw_le_p(void *ptr, int v)
-{
-#ifdef _ARCH_PPC
-    __asm__ __volatile__ ("sthbrx %1,0,%2" : "=m" (*(uint16_t *)ptr) : "r" (v), "r" (ptr));
-#else
-    uint8_t *p = ptr;
-    p[0] = v;
-    p[1] = v >> 8;
-#endif
-}
-
-static inline void stl_le_p(void *ptr, int v)
-{
-#ifdef _ARCH_PPC
-    __asm__ __volatile__ ("stwbrx %1,0,%2" : "=m" (*(uint32_t *)ptr) : "r" (v), "r" (ptr));
-#else
-    uint8_t *p = ptr;
-    p[0] = v;
-    p[1] = v >> 8;
-    p[2] = v >> 16;
-    p[3] = v >> 24;
-#endif
-}
-
-static inline void stq_le_p(void *ptr, uint64_t v)
-{
-    uint8_t *p = ptr;
-    stl_le_p(p, (uint32_t)v);
-    stl_le_p(p + 4, v >> 32);
-}
-
-/* float access */
-
-static inline float32 ldfl_le_p(const void *ptr)
-{
-    union {
-        float32 f;
-        uint32_t i;
-    } u;
-    u.i = ldl_le_p(ptr);
-    return u.f;
-}
-
-static inline void stfl_le_p(void *ptr, float32 v)
-{
-    union {
-        float32 f;
-        uint32_t i;
-    } u;
-    u.f = v;
-    stl_le_p(ptr, u.i);
-}
-
-static inline float64 ldfq_le_p(const void *ptr)
-{
-    CPU_DoubleU u;
-    u.l.lower = ldl_le_p(ptr);
-    u.l.upper = ldl_le_p(ptr + 4);
-    return u.d;
-}
-
-static inline void stfq_le_p(void *ptr, float64 v)
-{
-    CPU_DoubleU u;
-    u.d = v;
-    stl_le_p(ptr, u.l.lower);
-    stl_le_p(ptr + 4, u.l.upper);
-}
-
-#else
-
-static inline int lduw_le_p(const void *ptr)
-{
-    return *(uint16_t *)ptr;
-}
-
-static inline int ldsw_le_p(const void *ptr)
-{
-    return *(int16_t *)ptr;
-}
-
-static inline int ldl_le_p(const void *ptr)
-{
-    return *(uint32_t *)ptr;
-}
-
-static inline uint64_t ldq_le_p(const void *ptr)
-{
-    return *(uint64_t *)ptr;
-}
-
-static inline void stw_le_p(void *ptr, int v)
-{
-    *(uint16_t *)ptr = v;
-}
-
-static inline void stl_le_p(void *ptr, int v)
-{
-    *(uint32_t *)ptr = v;
-}
-
-static inline void stq_le_p(void *ptr, uint64_t v)
-{
-    *(uint64_t *)ptr = v;
-}
-
-/* float access */
-
-static inline float32 ldfl_le_p(const void *ptr)
-{
-    return *(float32 *)ptr;
-}
-
-static inline float64 ldfq_le_p(const void *ptr)
-{
-    return *(float64 *)ptr;
-}
-
-static inline void stfl_le_p(void *ptr, float32 v)
-{
-    *(float32 *)ptr = v;
-}
-
-static inline void stfq_le_p(void *ptr, float64 v)
-{
-    *(float64 *)ptr = v;
-}
-#endif
-
-#if !defined(HOST_WORDS_BIGENDIAN) || defined(WORDS_ALIGNED)
-
-static inline int lduw_be_p(const void *ptr)
-{
-#if defined(__i386__)
-    int val;
-    asm volatile ("movzwl %1, %0\n"
-                  "xchgb %b0, %h0\n"
-                  : "=q" (val)
-                  : "m" (*(uint16_t *)ptr));
-    return val;
-#else
-    const uint8_t *b = ptr;
-    return ((b[0] << 8) | b[1]);
-#endif
-}
-
-static inline int ldsw_be_p(const void *ptr)
-{
-#if defined(__i386__)
-    int val;
-    asm volatile ("movzwl %1, %0\n"
-                  "xchgb %b0, %h0\n"
-                  : "=q" (val)
-                  : "m" (*(uint16_t *)ptr));
-    return (int16_t)val;
-#else
-    const uint8_t *b = ptr;
-    return (int16_t)((b[0] << 8) | b[1]);
-#endif
-}
-
-static inline int ldl_be_p(const void *ptr)
-{
-#if defined(__i386__) || defined(__x86_64__)
-    int val;
-    asm volatile ("movl %1, %0\n"
-                  "bswap %0\n"
-                  : "=r" (val)
-                  : "m" (*(uint32_t *)ptr));
-    return val;
-#else
-    const uint8_t *b = ptr;
-    return (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3];
-#endif
-}
-
-static inline uint64_t ldq_be_p(const void *ptr)
-{
-    uint32_t a,b;
-    a = ldl_be_p(ptr);
-    b = ldl_be_p((uint8_t *)ptr + 4);
-    return (((uint64_t)a<<32)|b);
-}
-
-static inline void stw_be_p(void *ptr, int v)
-{
-#if defined(__i386__)
-    asm volatile ("xchgb %b0, %h0\n"
-                  "movw %w0, %1\n"
-                  : "=q" (v)
-                  : "m" (*(uint16_t *)ptr), "0" (v));
-#else
-    uint8_t *d = (uint8_t *) ptr;
-    d[0] = v >> 8;
-    d[1] = v;
-#endif
-}
-
-static inline void stl_be_p(void *ptr, int v)
-{
-#if defined(__i386__) || defined(__x86_64__)
-    asm volatile ("bswap %0\n"
-                  "movl %0, %1\n"
-                  : "=r" (v)
-                  : "m" (*(uint32_t *)ptr), "0" (v));
-#else
-    uint8_t *d = (uint8_t *) ptr;
-    d[0] = v >> 24;
-    d[1] = v >> 16;
-    d[2] = v >> 8;
-    d[3] = v;
-#endif
-}
-
-static inline void stq_be_p(void *ptr, uint64_t v)
-{
-    stl_be_p(ptr, v >> 32);
-    stl_be_p((uint8_t *)ptr + 4, v);
-}
-
-/* float access */
-
-static inline float32 ldfl_be_p(const void *ptr)
-{
-    union {
-        float32 f;
-        uint32_t i;
-    } u;
-    u.i = ldl_be_p(ptr);
-    return u.f;
-}
-
-static inline void stfl_be_p(void *ptr, float32 v)
-{
-    union {
-        float32 f;
-        uint32_t i;
-    } u;
-    u.f = v;
-    stl_be_p(ptr, u.i);
-}
-
-static inline float64 ldfq_be_p(const void *ptr)
-{
-    CPU_DoubleU u;
-    u.l.upper = ldl_be_p(ptr);
-    u.l.lower = ldl_be_p((uint8_t *)ptr + 4);
-    return u.d;
-}
-
-static inline void stfq_be_p(void *ptr, float64 v)
-{
-    CPU_DoubleU u;
-    u.d = v;
-    stl_be_p(ptr, u.l.upper);
-    stl_be_p((uint8_t *)ptr + 4, u.l.lower);
-}
-
-#else
-
-static inline int lduw_be_p(const void *ptr)
-{
-    return *(uint16_t *)ptr;
-}
-
-static inline int ldsw_be_p(const void *ptr)
-{
-    return *(int16_t *)ptr;
-}
-
-static inline int ldl_be_p(const void *ptr)
-{
-    return *(uint32_t *)ptr;
-}
-
-static inline uint64_t ldq_be_p(const void *ptr)
-{
-    return *(uint64_t *)ptr;
-}
-
-static inline void stw_be_p(void *ptr, int v)
-{
-    *(uint16_t *)ptr = v;
-}
-
-static inline void stl_be_p(void *ptr, int v)
-{
-    *(uint32_t *)ptr = v;
-}
-
-static inline void stq_be_p(void *ptr, uint64_t v)
-{
-    *(uint64_t *)ptr = v;
-}
-
-/* float access */
-
-static inline float32 ldfl_be_p(const void *ptr)
-{
-    return *(float32 *)ptr;
-}
-
-static inline float64 ldfq_be_p(const void *ptr)
-{
-    return *(float64 *)ptr;
-}
-
-static inline void stfl_be_p(void *ptr, float32 v)
-{
-    *(float32 *)ptr = v;
-}
-
-static inline void stfq_be_p(void *ptr, float64 v)
-{
-    *(float64 *)ptr = v;
-}
-
-#endif
 
-/* target CPU memory access functions */
+/* target-endianness CPU memory access functions */
 #if defined(TARGET_WORDS_BIGENDIAN)
 #define lduw_p(p) lduw_be_p(p)
 #define ldsw_p(p) ldsw_be_p(p)
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h
  2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h Paolo Bonzini
@ 2011-06-06 17:15   ` Andreas Färber
  2011-06-06 20:24     ` Paolo Bonzini
  2011-06-06 22:15     ` Richard Henderson
  0 siblings, 2 replies; 15+ messages in thread
From: Andreas Färber @ 2011-06-06 17:15 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Am 06.06.2011 um 16:25 schrieb Paolo Bonzini:

> This is not a CPU interface.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

> diff --git a/qemu-common.h b/qemu-common.h
> index b851b20..7484ef8 100644
> --- a/qemu-common.h
> +++ b/qemu-common.h
> @@ -4,6 +4,10 @@
>
> #include "config-host.h"
>
> +#if defined(__arm__) || defined(__sparc__) || defined(__mips__) ||  
> defined(__hppa__) || defined(__ia64__)
> +#define WORDS_ALIGNED
> +#endif

Since it depends on the host and you're placing it directly under  
config-host.h inclusion, might it make sense to move the decision into  
configure instead, so that it ends up in config-host.h?

Andreas

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h
  2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h Paolo Bonzini
@ 2011-06-06 19:56   ` Richard Henderson
  2011-06-06 20:27     ` Paolo Bonzini
  2011-06-06 23:07     ` malc
  0 siblings, 2 replies; 15+ messages in thread
From: Richard Henderson @ 2011-06-06 19:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Patches 1-3:
Reviewed-by: Richard Henderson <rth@twiddle.net>

That said,

On 06/06/2011 09:25 AM, Paolo Bonzini wrote:
> +/* conservative code for little endian unaligned accesses */
> +static inline int lduw_le_p(const void *ptr)
> +{
> +#ifdef _ARCH_PPC
> +    int val;
> +    __asm__ __volatile__ ("lhbrx %0,0,%1" : "=r" (val) : "r" (ptr));
> +    return val;
> +#else
> +    const uint8_t *p = ptr;
> +    return p[0] | (p[1] << 8);
> +#endif
> +}

Can we add a patch 4/3 that removes this sort of hard-coded
assembly stuff in favour of generic gcc code.  E.g.

static inline int lduw_le_p(const void *ptr)
{
  struct pack { uint16_t x __attribute__((packed)); };
  const struct pack *p = ptr;
  uint16_t ret;

  ret = p->x;
  ret = le_bswap(ret, 16);

  return ret;
}

One could fairly well macroize all 6 instances.

The compiler knows that ppc and i386 can do the unaligned loads.
The compiler *should* be able to match the bswap_N patterns, and
thus also fold the unaligned load with the bswap for ppc.

I can confirm that gcc head does in fact do the right thing for
ppc32/64 here, as well as for i386/x86_64.  I don't have previous
versions of gcc checked out atm...

I havn't checked, but this *ought* to enable the load-asi bswap
instruction for sparcv9.  I also havn't checked what happens for
a target like sparcv8 that lacks both unaligned load and bswap,
to see that we don't simply double the number of shifts.  However,
I'd be tempted to file that as a gcc missed optimization bug.


r~

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h
  2011-06-06 17:15   ` Andreas Färber
@ 2011-06-06 20:24     ` Paolo Bonzini
  2011-06-06 22:15     ` Richard Henderson
  1 sibling, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-06 20:24 UTC (permalink / raw)
  To: Andreas Färber; +Cc: qemu-devel

On 06/06/2011 07:15 PM, Andreas Färber wrote:
> Am 06.06.2011 um 16:25 schrieb Paolo Bonzini:
>
>> This is not a CPU interface.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>
>> diff --git a/qemu-common.h b/qemu-common.h
>> index b851b20..7484ef8 100644
>> --- a/qemu-common.h
>> +++ b/qemu-common.h
>> @@ -4,6 +4,10 @@
>>
>> #include "config-host.h"
>>
>> +#if defined(__arm__) || defined(__sparc__) || defined(__mips__) ||
>> defined(__hppa__) || defined(__ia64__)
>> +#define WORDS_ALIGNED
>> +#endif
>
> Since it depends on the host and you're placing it directly under
> config-host.h inclusion, might it make sense to move the decision into
> configure instead, so that it ends up in config-host.h?

Richard's suggestion will get rid of this altogether.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h
  2011-06-06 19:56   ` Richard Henderson
@ 2011-06-06 20:27     ` Paolo Bonzini
  2011-06-06 22:05       ` Richard Henderson
  2011-06-06 23:07     ` malc
  1 sibling, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-06 20:27 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

On 06/06/2011 09:56 PM, Richard Henderson wrote:
> Can we add a patch 4/3 that removes this sort of hard-coded
> assembly stuff in favour of generic gcc code.

Then it's also possible to commit patch 2 from this submission, and wait 
while I rework the other two to use generic GCC code.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h
  2011-06-06 20:27     ` Paolo Bonzini
@ 2011-06-06 22:05       ` Richard Henderson
  0 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2011-06-06 22:05 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 06/06/2011 01:27 PM, Paolo Bonzini wrote:
> On 06/06/2011 09:56 PM, Richard Henderson wrote:
>> Can we add a patch 4/3 that removes this sort of hard-coded
>> assembly stuff in favour of generic gcc code.
> 
> Then it's also possible to commit patch 2 from this submission, and
> wait while I rework the other two to use generic GCC code.

I began the message by acking all of 1 through 3.

The re-work for GCC code ought to be a follow-on patch.


r~

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h
  2011-06-06 17:15   ` Andreas Färber
  2011-06-06 20:24     ` Paolo Bonzini
@ 2011-06-06 22:15     ` Richard Henderson
  2011-06-07  9:17       ` Paolo Bonzini
  1 sibling, 1 reply; 15+ messages in thread
From: Richard Henderson @ 2011-06-06 22:15 UTC (permalink / raw)
  To: Andreas Färber; +Cc: Paolo Bonzini, qemu-devel

On 06/06/2011 10:15 AM, Andreas Färber wrote:
> Am 06.06.2011 um 16:25 schrieb Paolo Bonzini:
>> +#if defined(__arm__) || defined(__sparc__) || defined(__mips__) || defined(__hppa__) || defined(__ia64__)
>> +#define WORDS_ALIGNED
>> +#endif
> 
> Since it depends on the host and you're placing it directly under
> config-host.h inclusion, might it make sense to move the decision
> into configure instead, so that it ends up in config-host.h?

Hum, I now understand what Paulo was talking about elsewhere in the thread.

If he takes my suggestion to re-write the unaligned functions with GCC packed
support, these host ifdefs go away, and this objection disappears.  The
question becomes one of ordering.

Do we take his existing 3-part patch as-is, and the packed patch as a followup?
Do we convert to packed accesses first and move it around after?
Do we do it all in one step?


r~

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h
  2011-06-06 19:56   ` Richard Henderson
  2011-06-06 20:27     ` Paolo Bonzini
@ 2011-06-06 23:07     ` malc
  2011-06-07 16:54       ` Richard Henderson
  1 sibling, 1 reply; 15+ messages in thread
From: malc @ 2011-06-06 23:07 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, qemu-devel

On Mon, 6 Jun 2011, Richard Henderson wrote:

> Patches 1-3:
> Reviewed-by: Richard Henderson <rth@twiddle.net>
> 
> That said,
> 
> On 06/06/2011 09:25 AM, Paolo Bonzini wrote:
> > +/* conservative code for little endian unaligned accesses */
> > +static inline int lduw_le_p(const void *ptr)
> > +{
> > +#ifdef _ARCH_PPC
> > +    int val;
> > +    __asm__ __volatile__ ("lhbrx %0,0,%1" : "=r" (val) : "r" (ptr));
> > +    return val;
> > +#else
> > +    const uint8_t *p = ptr;
> > +    return p[0] | (p[1] << 8);
> > +#endif
> > +}
> 
> Can we add a patch 4/3 that removes this sort of hard-coded
> assembly stuff in favour of generic gcc code.  E.g.
> 
> static inline int lduw_le_p(const void *ptr)
> {
>   struct pack { uint16_t x __attribute__((packed)); };
>   const struct pack *p = ptr;
>   uint16_t ret;
> 
>   ret = p->x;
>   ret = le_bswap(ret, 16);
> 
>   return ret;
> }
> 
> One could fairly well macroize all 6 instances.
> 
> The compiler knows that ppc and i386 can do the unaligned loads.
> The compiler *should* be able to match the bswap_N patterns, and
> thus also fold the unaligned load with the bswap for ppc.
> 
> I can confirm that gcc head does in fact do the right thing for
> ppc32/64 here, as well as for i386/x86_64.  I don't have previous
> versions of gcc checked out atm...

Depends on how bswap_16 is defined. If it is __builtin_bswap16
then 4.5.0 and 4.6.0 generate byte reversed loads, and previous
versions lack that builtin, so i don't think this generic code
should go in.

> 
> I havn't checked, but this *ought* to enable the load-asi bswap
> instruction for sparcv9.  I also havn't checked what happens for
> a target like sparcv8 that lacks both unaligned load and bswap,
> to see that we don't simply double the number of shifts.  However,
> I'd be tempted to file that as a gcc missed optimization bug.
> 
> 
> r~
> 

-- 
mailto:av1474@comtv.ru

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h
  2011-06-06 22:15     ` Richard Henderson
@ 2011-06-07  9:17       ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-07  9:17 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Andreas Färber, qemu-devel

On 06/07/2011 12:15 AM, Richard Henderson wrote:
> Do we take his existing 3-part patch as-is, and the packed patch as a followup?
> Do we convert to packed accesses first and move it around after?
> Do we do it all in one step?

Either of the first two works for me.

However, since this series was a start towards fixing real bugs reported 
by Coverity:

     qemu-kvm-0.14.0/hw/scsi-bus.c:190:
     sign_extension: Suspicious implicit sign extension:

     "cmd[10]" with type "unsigned char" (8 bits, unsigned) is promoted
     in "cmd[13] | (cmd[12] << 8) | (cmd[11] << 16) | (cmd[10] << 24)"
     to type "int" (32 bits, signed), then sign-extended to type
     "unsigned long" (64 bits, unsigned).

     If "cmd[13] | (cmd[12] << 8) | (cmd[11] << 16) | (cmd[10] << 24)"
     is greater than 0x7FFFFFFF, the upper bits of the result will all
     be 1.

... and there were objections on requiring recent GCC, perhaps it's 
better to just commit it as is.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h
  2011-06-06 23:07     ` malc
@ 2011-06-07 16:54       ` Richard Henderson
  2011-06-07 17:29         ` malc
  0 siblings, 1 reply; 15+ messages in thread
From: Richard Henderson @ 2011-06-07 16:54 UTC (permalink / raw)
  To: malc; +Cc: Paolo Bonzini, qemu-devel

On 06/06/2011 04:07 PM, malc wrote:
> Depends on how bswap_16 is defined. If it is __builtin_bswap16
> then 4.5.0 and 4.6.0 generate byte reversed loads, and previous
> versions lack that builtin, so i don't think this generic code
> should go in.

It would continue to be defined as-is, without direct reference
to the __builtin_bswap functions.  But you're right that the 
generic code would depend on the bswap optimization pass that
recognizes the mask/shift/or pattern and converts it to the
builtins internally.

What if we kept the ppc ifdefs, but converted the rest to the
gcc generic code?


r~

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h
  2011-06-07 16:54       ` Richard Henderson
@ 2011-06-07 17:29         ` malc
  0 siblings, 0 replies; 15+ messages in thread
From: malc @ 2011-06-07 17:29 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, qemu-devel

On Tue, 7 Jun 2011, Richard Henderson wrote:

> On 06/06/2011 04:07 PM, malc wrote:
> > Depends on how bswap_16 is defined. If it is __builtin_bswap16
> > then 4.5.0 and 4.6.0 generate byte reversed loads, and previous
> > versions lack that builtin, so i don't think this generic code
> > should go in.
> 
> It would continue to be defined as-is, without direct reference
> to the __builtin_bswap functions.  But you're right that the 
> generic code would depend on the bswap optimization pass that
> recognizes the mask/shift/or pattern and converts it to the
> builtins internally.
> 
> What if we kept the ppc ifdefs, but converted the rest to the
> gcc generic code?
> 

No objections.

-- 
mailto:av1474@comtv.ru

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw
  2011-06-06 14:25 [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw Paolo Bonzini
                   ` (2 preceding siblings ...)
  2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h Paolo Bonzini
@ 2011-07-06 13:29 ` Paolo Bonzini
  3 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-07-06 13:29 UTC (permalink / raw)
  To: qemu-devel

On 06/06/2011 04:25 PM, Paolo Bonzini wrote:
> Functions like ldl_be_p and ldl_le_p are currently used only as building
> blocks for {ld,st}XX_p.  As such, they are in cpu-all.h even though they
> have absolutely no dependency on the target.
>
> In order to make them globally available, this series moves them to
> bswap.h instead.
>
> An interesting part of this is that there are functions also for floating
> point values.  Leaving them in cpu-all.h would be possible but untidy.
> In fact handling these is easy, but it requires to make softfloat.h
> target-dependent as well.  This is what patch 2 does.
>
> v1->v2:
>          rebase, use softfloat-specialize.h instead of introducing
>          softfloat-target.h
>
> Paolo Bonzini (3):
>    move WORDS_ALIGNED to qemu-common.h
>    softfloat: change default nan definitions to variables
>    move unaligned memory access functions to bswap.h
>
>   Makefile.hw                |    2 +-
>   bswap.h                    |  474 ++++++++++++++++++++++++++++++++++++++++++++
>   configure                  |    3 +-
>   cpu-all.h                  |  446 +-----------------------------------------
>   cpu-common.h               |    4 -
>   fpu/softfloat-specialize.h |   72 +++++++
>   fpu/softfloat.h            |   60 +-----
>   qemu-common.h              |    4 +
>   8 files changed, 562 insertions(+), 503 deletions(-)
>

Ping?

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2011-07-06 13:30 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-06 14:25 [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw Paolo Bonzini
2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 1/3] move WORDS_ALIGNED to qemu-common.h Paolo Bonzini
2011-06-06 17:15   ` Andreas Färber
2011-06-06 20:24     ` Paolo Bonzini
2011-06-06 22:15     ` Richard Henderson
2011-06-07  9:17       ` Paolo Bonzini
2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 2/3] softfloat: change default nan definitions to variables Paolo Bonzini
2011-06-06 14:25 ` [Qemu-devel] [PATCH v2 3/3] move unaligned memory access functions to bswap.h Paolo Bonzini
2011-06-06 19:56   ` Richard Henderson
2011-06-06 20:27     ` Paolo Bonzini
2011-06-06 22:05       ` Richard Henderson
2011-06-06 23:07     ` malc
2011-06-07 16:54       ` Richard Henderson
2011-06-07 17:29         ` malc
2011-07-06 13:29 ` [Qemu-devel] [PATCH v2 0/3] make endian-independent unaligned memory access functions available to libhw Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).