qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] RFC: fix for random Qemu crashes
@ 2007-11-15 23:18 J. Mayer
  2007-11-15 23:49 ` andrzej zaborowski
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: J. Mayer @ 2007-11-15 23:18 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2665 bytes --]

Some may have experienced of having some Qemu builds crashing,
apparently at random places, but in a reproducable way.
I found one reason for this crashes: it appears that with the growth of
the op.c file, there may be cases where we could reach the inlining
limits of gcc. In such a case, gcc would not inline some declared
"inline" function but would emit a call and provide a separate function.
Unfortunately, this is not acceptable in op.o context as it will
slowdown the emulation and because the call is likely to break the
specific compilation rules (ie reserved registers) used while compiling
op.o
I found some workaround to avoid this behavior and I'd like to get
opinions about it.
The first idea is to change all occurences of 'inline' with
'always_inline' in all headers included in op.c. It then appeared to me
that always_inline is not globally declared and that the definition is
duplicated in vl.h and exec-all.h. But it's not declared in
darwin-user/qemu.h and linux-user/qemu.h, which is not consistent with
the declaration in vl.h. Further investigations showed me that the
osdep.h header is the one that is actually included everywhere. Even if
those are more compiler than OS dependent, I decided to move the
definitions for glue, tostring, likely, unlikely, always_inline and
REGPARM to this header so they can be globally used. I also changed the
include orders in darwin-user/qemu.h to be sure this header will be
included first. This patch is attached in common_defs.diff.
Giving this patch, I've been able to replace all occurence of 'inline'
with 'always_inline' in all headers included from op.c (given the
generated .d file). Some would say I'd better add a #define inline
always_inline somewhere. I personnally dislike this solution as this
kind of macro as it tends to expand recursivally (always_inline
definition contains the inline word) and this may lead to compilation
warnings or errors in some context; one could do tests using the linux
kernel headers to get convinced that it can happen.
Then, I choosed to replace 'inline' by 'always_inline', which is more
invasive but have less risks of side effects. The diff is attached in
always_inline.diff.
The last thing that helps solve the problem is to change the inlining
limits of gcc, at least to compile the op.o file.Unfortunatelly, there
is no way to disable those limits (I checked in the source code), then I
put them to an arbitrary high level. I also added the -funit-at-a-time
switch, as this kind of optimisation would not be relevant in op.o
context. The diff is attached in gcc_inline_limits.diff.

Please comment.

-- 
J. Mayer <l_indien@magic.fr>
Never organized

[-- Attachment #2: common_defs.diff --]
[-- Type: text/x-patch, Size: 3823 bytes --]

Index: exec-all.h
===================================================================
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.70
diff -u -d -d -p -r1.70 exec-all.h
--- exec-all.h	4 Nov 2007 02:24:57 -0000	1.70
+++ exec-all.h	15 Nov 2007 22:58:45 -0000
@@ -21,36 +21,6 @@
 /* allow to see translation results - the slowdown should be negligible, so we leave it */
 #define DEBUG_DISAS
 
-#ifndef glue
-#define xglue(x, y) x ## y
-#define glue(x, y) xglue(x, y)
-#define stringify(s)	tostring(s)
-#define tostring(s)	#s
-#endif
-
-#ifndef likely
-#if __GNUC__ < 3
-#define __builtin_expect(x, n) (x)
-#endif
-
-#define likely(x)   __builtin_expect(!!(x), 1)
-#define unlikely(x)   __builtin_expect(!!(x), 0)
-#endif
-
-#ifndef always_inline
-#if (__GNUC__ < 3) || defined(__APPLE__)
-#define always_inline inline
-#else
-#define always_inline __attribute__ (( always_inline )) inline
-#endif
-#endif
-
-#ifdef __i386__
-#define REGPARM(n) __attribute((regparm(n)))
-#else
-#define REGPARM(n)
-#endif
-
 /* is_jmp field values */
 #define DISAS_NEXT    0 /* next instruction can be analyzed */
 #define DISAS_JUMP    1 /* only pc was modified dynamically */
Index: osdep.h
===================================================================
RCS file: /sources/qemu/qemu/osdep.h,v
retrieving revision 1.10
diff -u -d -d -p -r1.10 osdep.h
--- osdep.h	7 Jun 2007 23:09:47 -0000	1.10
+++ osdep.h	15 Nov 2007 22:58:45 -0000
@@ -3,6 +3,43 @@
 
 #include <stdarg.h>
 
+#ifndef glue
+#define xglue(x, y) x ## y
+#define glue(x, y) xglue(x, y)
+#define stringify(s)	tostring(s)
+#define tostring(s)	#s
+#endif
+
+#ifndef likely
+#if __GNUC__ < 3
+#define __builtin_expect(x, n) (x)
+#endif
+
+#define likely(x)   __builtin_expect(!!(x), 1)
+#define unlikely(x)   __builtin_expect(!!(x), 0)
+#endif
+
+#ifndef MIN
+#define MIN(a, b) (((a) < (b)) ? (a) : (b))
+#endif
+#ifndef MAX
+#define MAX(a, b) (((a) > (b)) ? (a) : (b))
+#endif
+
+#ifndef always_inline
+#if (__GNUC__ < 3) || defined(__APPLE__)
+#define always_inline inline
+#else
+#define always_inline __attribute__ (( always_inline )) inline
+#endif
+#endif
+
+#ifdef __i386__
+#define REGPARM(n) __attribute((regparm(n)))
+#else
+#define REGPARM(n)
+#endif
+
 #define qemu_printf printf
 
 void *qemu_malloc(size_t size);
Index: vl.h
===================================================================
RCS file: /sources/qemu/qemu/vl.h,v
retrieving revision 1.295
diff -u -d -d -p -r1.295 vl.h
--- vl.h	11 Nov 2007 17:56:38 -0000	1.295
+++ vl.h	15 Nov 2007 22:58:45 -0000
@@ -29,37 +29,6 @@
 /* FIXME: Remove this.  */
 #include "block.h"
 
-#ifndef glue
-#define xglue(x, y) x ## y
-#define glue(x, y) xglue(x, y)
-#define stringify(s)	tostring(s)
-#define tostring(s)	#s
-#endif
-
-#ifndef likely
-#if __GNUC__ < 3
-#define __builtin_expect(x, n) (x)
-#endif
-
-#define likely(x)   __builtin_expect(!!(x), 1)
-#define unlikely(x)   __builtin_expect(!!(x), 0)
-#endif
-
-#ifndef MIN
-#define MIN(a, b) (((a) < (b)) ? (a) : (b))
-#endif
-#ifndef MAX
-#define MAX(a, b) (((a) > (b)) ? (a) : (b))
-#endif
-
-#ifndef always_inline
-#if (__GNUC__ < 3) || defined(__APPLE__)
-#define always_inline inline
-#else
-#define always_inline __attribute__ (( always_inline )) inline
-#endif
-#endif
-
 #include "audio/audio.h"
 
 /* vl.c */
Index: darwin-user/qemu.h
===================================================================
RCS file: /sources/qemu/qemu/darwin-user/qemu.h,v
retrieving revision 1.1
diff -u -d -d -p -r1.1 qemu.h
--- darwin-user/qemu.h	18 Jan 2007 20:06:33 -0000	1.1
+++ darwin-user/qemu.h	15 Nov 2007 22:58:45 -0000
@@ -1,13 +1,13 @@
 #ifndef GEMU_H
 #define GEMU_H
 
-#include "thunk.h"
-
 #include <signal.h>
 #include <string.h>
 
 #include "cpu.h"
 
+#include "thunk.h"
+
 #include "gdbstub.h"
 
 typedef siginfo_t target_siginfo_t;

[-- Attachment #3: always_inline.diff --]
[-- Type: text/x-patch, Size: 37976 bytes --]

Index: bswap.h
===================================================================
RCS file: /sources/qemu/qemu/bswap.h,v
retrieving revision 1.6
diff -u -d -d -p -r1.6 bswap.h
--- bswap.h	16 Sep 2007 21:07:48 -0000	1.6
+++ bswap.h	15 Nov 2007 22:59:46 -0000
@@ -43,32 +43,32 @@
 
 #endif /* !HAVE_BYTESWAP_H */
 
-static inline uint16_t bswap16(uint16_t x)
+static always_inline uint16_t bswap16(uint16_t x)
 {
     return bswap_16(x);
 }
 
-static inline uint32_t bswap32(uint32_t x)
+static always_inline uint32_t bswap32(uint32_t x)
 {
     return bswap_32(x);
 }
 
-static inline uint64_t bswap64(uint64_t x)
+static always_inline uint64_t bswap64(uint64_t x)
 {
     return bswap_64(x);
 }
 
-static inline void bswap16s(uint16_t *s)
+static always_inline void bswap16s(uint16_t *s)
 {
     *s = bswap16(*s);
 }
 
-static inline void bswap32s(uint32_t *s)
+static always_inline void bswap32s(uint32_t *s)
 {
     *s = bswap32(*s);
 }
 
-static inline void bswap64s(uint64_t *s)
+static always_inline void bswap64s(uint64_t *s)
 {
     *s = bswap64(*s);
 }
@@ -86,32 +86,32 @@ static inline void bswap64s(uint64_t *s)
 #endif
 
 #define CPU_CONVERT(endian, size, type)\
-static inline type endian ## size ## _to_cpu(type v)\
+static always_inline type endian ## size ## _to_cpu(type v)\
 {\
     return endian ## _bswap(v, size);\
 }\
 \
-static inline type cpu_to_ ## endian ## size(type v)\
+static always_inline type cpu_to_ ## endian ## size(type v)\
 {\
     return endian ## _bswap(v, size);\
 }\
 \
-static inline void endian ## size ## _to_cpus(type *p)\
+static always_inline void endian ## size ## _to_cpus(type *p)\
 {\
     endian ## _bswaps(p, size)\
 }\
 \
-static inline void cpu_to_ ## endian ## size ## s(type *p)\
+static always_inline void cpu_to_ ## endian ## size ## s(type *p)\
 {\
     endian ## _bswaps(p, size)\
 }\
 \
-static inline type endian ## size ## _to_cpup(const type *p)\
+static always_inline type endian ## size ## _to_cpup(const type *p)\
 {\
     return endian ## size ## _to_cpu(*p);\
 }\
 \
-static inline void cpu_to_ ## endian ## size ## w(type *p, type v)\
+static always_inline void cpu_to_ ## endian ## size ## w(type *p, type v)\
 {\
      *p = cpu_to_ ## endian ## size(v);\
 }
@@ -138,7 +138,7 @@ CPU_CONVERT(le, 64, uint64_t)
 
 #else
 
-static inline void cpu_to_le16wu(uint16_t *p, uint16_t v)
+static always_inline void cpu_to_le16wu(uint16_t *p, uint16_t v)
 {
     uint8_t *p1 = (uint8_t *)p;
 
@@ -146,7 +146,7 @@ static inline void cpu_to_le16wu(uint16_
     p1[1] = v >> 8;
 }
 
-static inline void cpu_to_le32wu(uint32_t *p, uint32_t v)
+static always_inline void cpu_to_le32wu(uint32_t *p, uint32_t v)
 {
     uint8_t *p1 = (uint8_t *)p;
 
@@ -156,19 +156,19 @@ static inline void cpu_to_le32wu(uint32_
     p1[3] = v >> 24;
 }
 
-static inline uint16_t le16_to_cpupu(const uint16_t *p)
+static always_inline uint16_t le16_to_cpupu(const uint16_t *p)
 {
     const uint8_t *p1 = (const uint8_t *)p;
     return p1[0] | (p1[1] << 8);
 }
 
-static inline uint32_t le32_to_cpupu(const uint32_t *p)
+static always_inline uint32_t le32_to_cpupu(const uint32_t *p)
 {
     const uint8_t *p1 = (const uint8_t *)p;
     return p1[0] | (p1[1] << 8) | (p1[2] << 16) | (p1[3] << 24);
 }
 
-static inline void cpu_to_be16wu(uint16_t *p, uint16_t v)
+static always_inline void cpu_to_be16wu(uint16_t *p, uint16_t v)
 {
     uint8_t *p1 = (uint8_t *)p;
 
@@ -176,7 +176,7 @@ static inline void cpu_to_be16wu(uint16_
     p1[1] = v;
 }
 
-static inline void cpu_to_be32wu(uint32_t *p, uint32_t v)
+static always_inline void cpu_to_be32wu(uint32_t *p, uint32_t v)
 {
     uint8_t *p1 = (uint8_t *)p;
 
Index: cpu-all.h
===================================================================
RCS file: /sources/qemu/qemu/cpu-all.h,v
retrieving revision 1.78
diff -u -d -d -p -r1.78 cpu-all.h
--- cpu-all.h	14 Nov 2007 10:51:00 -0000	1.78
+++ cpu-all.h	15 Nov 2007 22:59:46 -0000
@@ -45,62 +45,62 @@
 
 #ifdef BSWAP_NEEDED
 
-static inline uint16_t tswap16(uint16_t s)
+static always_inline uint16_t tswap16(uint16_t s)
 {
     return bswap16(s);
 }
 
-static inline uint32_t tswap32(uint32_t s)
+static always_inline uint32_t tswap32(uint32_t s)
 {
     return bswap32(s);
 }
 
-static inline uint64_t tswap64(uint64_t s)
+static always_inline uint64_t tswap64(uint64_t s)
 {
     return bswap64(s);
 }
 
-static inline void tswap16s(uint16_t *s)
+static always_inline void tswap16s(uint16_t *s)
 {
     *s = bswap16(*s);
 }
 
-static inline void tswap32s(uint32_t *s)
+static always_inline void tswap32s(uint32_t *s)
 {
     *s = bswap32(*s);
 }
 
-static inline void tswap64s(uint64_t *s)
+static always_inline void tswap64s(uint64_t *s)
 {
     *s = bswap64(*s);
 }
 
 #else
 
-static inline uint16_t tswap16(uint16_t s)
+static always_inline uint16_t tswap16(uint16_t s)
 {
     return s;
 }
 
-static inline uint32_t tswap32(uint32_t s)
+static always_inline uint32_t tswap32(uint32_t s)
 {
     return s;
 }
 
-static inline uint64_t tswap64(uint64_t s)
+static always_inline uint64_t tswap64(uint64_t s)
 {
     return s;
 }
 
-static inline void tswap16s(uint16_t *s)
+static always_inline void tswap16s(uint16_t *s)
 {
 }
 
-static inline void tswap32s(uint32_t *s)
+static always_inline void tswap32s(uint32_t *s)
 {
 }
 
-static inline void tswap64s(uint64_t *s)
+static always_inline void tswap64s(uint64_t *s)
 {
 }
 
@@ -170,17 +170,17 @@ typedef union {
  *   user   : user mode access using soft MMU
  *   kernel : kernel mode access using soft MMU
  */
-static inline int ldub_p(void *ptr)
+static always_inline int ldub_p(void *ptr)
 {
     return *(uint8_t *)ptr;
 }
 
-static inline int ldsb_p(void *ptr)
+static always_inline int ldsb_p(void *ptr)
 {
     return *(int8_t *)ptr;
 }
 
-static inline void stb_p(void *ptr, int v)
+static always_inline void stb_p(void *ptr, int v)
 {
     *(uint8_t *)ptr = v;
 }
@@ -191,7 +191,7 @@ static inline void stb_p(void *ptr, int 
 #if defined(WORDS_BIGENDIAN) || defined(WORDS_ALIGNED)
 
 /* conservative code for little endian unaligned accesses */
-static inline int lduw_le_p(void *ptr)
+static always_inline int lduw_le_p(void *ptr)
 {
 #ifdef __powerpc__
     int val;
@@ -203,7 +203,7 @@ static inline int lduw_le_p(void *ptr)
 #endif
 }
 
-static inline int ldsw_le_p(void *ptr)
+static always_inline int ldsw_le_p(void *ptr)
 {
 #ifdef __powerpc__
     int val;
@@ -215,7 +215,7 @@ static inline int ldsw_le_p(void *ptr)
 #endif
 }
 
-static inline int ldl_le_p(void *ptr)
+static always_inline int ldl_le_p(void *ptr)
 {
 #ifdef __powerpc__
     int val;
@@ -227,7 +227,7 @@ static inline int ldl_le_p(void *ptr)
 #endif
 }
 
-static inline uint64_t ldq_le_p(void *ptr)
+static always_inline uint64_t ldq_le_p(void *ptr)
 {
     uint8_t *p = ptr;
     uint32_t v1, v2;
@@ -236,7 +236,7 @@ static inline uint64_t ldq_le_p(void *pt
     return v1 | ((uint64_t)v2 << 32);
 }
 
-static inline void stw_le_p(void *ptr, int v)
+static always_inline void stw_le_p(void *ptr, int v)
 {
 #ifdef __powerpc__
     __asm__ __volatile__ ("sthbrx %1,0,%2" : "=m" (*(uint16_t *)ptr) : "r" (v), "r" (ptr));
@@ -247,7 +247,7 @@ static inline void stw_le_p(void *ptr, i
 #endif
 }
 
-static inline void stl_le_p(void *ptr, int v)
+static always_inline void stl_le_p(void *ptr, int v)
 {
 #ifdef __powerpc__
     __asm__ __volatile__ ("stwbrx %1,0,%2" : "=m" (*(uint32_t *)ptr) : "r" (v), "r" (ptr));
@@ -260,7 +260,7 @@ static inline void stl_le_p(void *ptr, i
 #endif
 }
 
-static inline void stq_le_p(void *ptr, uint64_t v)
+static always_inline void stq_le_p(void *ptr, uint64_t v)
 {
     uint8_t *p = ptr;
     stl_le_p(p, (uint32_t)v);
@@ -269,7 +269,7 @@ static inline void stq_le_p(void *ptr, u
 
 /* float access */
 
-static inline float32 ldfl_le_p(void *ptr)
+static always_inline float32 ldfl_le_p(void *ptr)
 {
     union {
         float32 f;
@@ -279,7 +279,7 @@ static inline float32 ldfl_le_p(void *pt
     return u.f;
 }
 
-static inline void stfl_le_p(void *ptr, float32 v)
+static always_inline void stfl_le_p(void *ptr, float32 v)
 {
     union {
         float32 f;
@@ -289,7 +289,7 @@ static inline void stfl_le_p(void *ptr, 
     stl_le_p(ptr, u.i);
 }
 
-static inline float64 ldfq_le_p(void *ptr)
+static always_inline float64 ldfq_le_p(void *ptr)
 {
     CPU_DoubleU u;
     u.l.lower = ldl_le_p(ptr);
@@ -297,7 +297,7 @@ static inline float64 ldfq_le_p(void *pt
     return u.d;
 }
 
-static inline void stfq_le_p(void *ptr, float64 v)
+static always_inline void stfq_le_p(void *ptr, float64 v)
 {
     CPU_DoubleU u;
     u.d = v;
@@ -307,59 +307,59 @@ static inline void stfq_le_p(void *ptr, 
 
 #else
 
-static inline int lduw_le_p(void *ptr)
+static always_inline int lduw_le_p(void *ptr)
 {
     return *(uint16_t *)ptr;
 }
 
-static inline int ldsw_le_p(void *ptr)
+static always_inline int ldsw_le_p(void *ptr)
 {
     return *(int16_t *)ptr;
 }
 
-static inline int ldl_le_p(void *ptr)
+static always_inline int ldl_le_p(void *ptr)
 {
     return *(uint32_t *)ptr;
 }
 
-static inline uint64_t ldq_le_p(void *ptr)
+static always_inline uint64_t ldq_le_p(void *ptr)
 {
     return *(uint64_t *)ptr;
 }
 
-static inline void stw_le_p(void *ptr, int v)
+static always_inline void stw_le_p(void *ptr, int v)
 {
     *(uint16_t *)ptr = v;
 }
 
-static inline void stl_le_p(void *ptr, int v)
+static always_inline void stl_le_p(void *ptr, int v)
 {
     *(uint32_t *)ptr = v;
 }
 
-static inline void stq_le_p(void *ptr, uint64_t v)
+static always_inline void stq_le_p(void *ptr, uint64_t v)
 {
     *(uint64_t *)ptr = v;
 }
 
 /* float access */
 
-static inline float32 ldfl_le_p(void *ptr)
+static always_inline float32 ldfl_le_p(void *ptr)
 {
     return *(float32 *)ptr;
 }
 
-static inline float64 ldfq_le_p(void *ptr)
+static always_inline float64 ldfq_le_p(void *ptr)
 {
     return *(float64 *)ptr;
 }
 
-static inline void stfl_le_p(void *ptr, float32 v)
+static always_inline void stfl_le_p(void *ptr, float32 v)
 {
     *(float32 *)ptr = v;
 }
 
-static inline void stfq_le_p(void *ptr, float64 v)
+static always_inline void stfq_le_p(void *ptr, float64 v)
 {
     *(float64 *)ptr = v;
 }
@@ -367,7 +367,7 @@ static inline void stfq_le_p(void *ptr, 
 
 #if !defined(WORDS_BIGENDIAN) || defined(WORDS_ALIGNED)
 
-static inline int lduw_be_p(void *ptr)
+static always_inline int lduw_be_p(void *ptr)
 {
 #if defined(__i386__)
     int val;
@@ -382,7 +382,7 @@ static inline int lduw_be_p(void *ptr)
 #endif
 }
 
-static inline int ldsw_be_p(void *ptr)
+static always_inline int ldsw_be_p(void *ptr)
 {
 #if defined(__i386__)
     int val;
@@ -397,7 +397,7 @@ static inline int ldsw_be_p(void *ptr)
 #endif
 }
 
-static inline int ldl_be_p(void *ptr)
+static always_inline int ldl_be_p(void *ptr)
 {
 #if defined(__i386__) || defined(__x86_64__)
     int val;
@@ -412,7 +412,7 @@ static inline int ldl_be_p(void *ptr)
 #endif
 }
 
-static inline uint64_t ldq_be_p(void *ptr)
+static always_inline uint64_t ldq_be_p(void *ptr)
 {
     uint32_t a,b;
     a = ldl_be_p(ptr);
@@ -420,7 +420,7 @@ static inline uint64_t ldq_be_p(void *pt
     return (((uint64_t)a<<32)|b);
 }
 
-static inline void stw_be_p(void *ptr, int v)
+static always_inline void stw_be_p(void *ptr, int v)
 {
 #if defined(__i386__)
     asm volatile ("xchgb %b0, %h0\n"
@@ -434,7 +434,7 @@ static inline void stw_be_p(void *ptr, i
 #endif
 }
 
-static inline void stl_be_p(void *ptr, int v)
+static always_inline void stl_be_p(void *ptr, int v)
 {
 #if defined(__i386__) || defined(__x86_64__)
     asm volatile ("bswap %0\n"
@@ -450,7 +450,7 @@ static inline void stl_be_p(void *ptr, i
 #endif
 }
 
-static inline void stq_be_p(void *ptr, uint64_t v)
+static always_inline void stq_be_p(void *ptr, uint64_t v)
 {
     stl_be_p(ptr, v >> 32);
     stl_be_p(ptr + 4, v);
@@ -458,7 +458,7 @@ static inline void stq_be_p(void *ptr, u
 
 /* float access */
 
-static inline float32 ldfl_be_p(void *ptr)
+static always_inline float32 ldfl_be_p(void *ptr)
 {
     union {
         float32 f;
@@ -468,7 +468,7 @@ static inline float32 ldfl_be_p(void *pt
     return u.f;
 }
 
-static inline void stfl_be_p(void *ptr, float32 v)
+static always_inline void stfl_be_p(void *ptr, float32 v)
 {
     union {
         float32 f;
@@ -478,7 +478,7 @@ static inline void stfl_be_p(void *ptr, 
     stl_be_p(ptr, u.i);
 }
 
-static inline float64 ldfq_be_p(void *ptr)
+static always_inline float64 ldfq_be_p(void *ptr)
 {
     CPU_DoubleU u;
     u.l.upper = ldl_be_p(ptr);
@@ -486,7 +486,7 @@ static inline float64 ldfq_be_p(void *pt
     return u.d;
 }
 
-static inline void stfq_be_p(void *ptr, float64 v)
+static always_inline void stfq_be_p(void *ptr, float64 v)
 {
     CPU_DoubleU u;
     u.d = v;
@@ -496,59 +496,59 @@ static inline void stfq_be_p(void *ptr, 
 
 #else
 
-static inline int lduw_be_p(void *ptr)
+static always_inline int lduw_be_p(void *ptr)
 {
     return *(uint16_t *)ptr;
 }
 
-static inline int ldsw_be_p(void *ptr)
+static always_inline int ldsw_be_p(void *ptr)
 {
     return *(int16_t *)ptr;
 }
 
-static inline int ldl_be_p(void *ptr)
+static always_inline int ldl_be_p(void *ptr)
 {
     return *(uint32_t *)ptr;
 }
 
-static inline uint64_t ldq_be_p(void *ptr)
+static always_inline uint64_t ldq_be_p(void *ptr)
 {
     return *(uint64_t *)ptr;
 }
 
-static inline void stw_be_p(void *ptr, int v)
+static always_inline void stw_be_p(void *ptr, int v)
 {
     *(uint16_t *)ptr = v;
 }
 
-static inline void stl_be_p(void *ptr, int v)
+static always_inline void stl_be_p(void *ptr, int v)
 {
     *(uint32_t *)ptr = v;
 }
 
-static inline void stq_be_p(void *ptr, uint64_t v)
+static always_inline void stq_be_p(void *ptr, uint64_t v)
 {
     *(uint64_t *)ptr = v;
 }
 
 /* float access */
 
-static inline float32 ldfl_be_p(void *ptr)
+static always_inline float32 ldfl_be_p(void *ptr)
 {
     return *(float32 *)ptr;
 }
 
-static inline float64 ldfq_be_p(void *ptr)
+static always_inline float64 ldfq_be_p(void *ptr)
 {
     return *(float64 *)ptr;
 }
 
-static inline void stfl_be_p(void *ptr, float32 v)
+static always_inline void stfl_be_p(void *ptr, float32 v)
 {
     *(float32 *)ptr = v;
 }
 
-static inline void stfq_be_p(void *ptr, float64 v)
+static always_inline void stfq_be_p(void *ptr, float64 v)
 {
     *(float64 *)ptr = v;
 }
@@ -809,13 +809,13 @@ CPUReadMemoryFunc **cpu_get_io_memory_re
 
 void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                             int len, int is_write);
-static inline void cpu_physical_memory_read(target_phys_addr_t addr,
-                                            uint8_t *buf, int len)
+static always_inline void cpu_physical_memory_read(target_phys_addr_t addr,
+                                                   uint8_t *buf, int len)
 {
     cpu_physical_memory_rw(addr, buf, len, 0);
 }
-static inline void cpu_physical_memory_write(target_phys_addr_t addr,
-                                             const uint8_t *buf, int len)
+static always_inline void cpu_physical_memory_write(target_phys_addr_t addr,
+                                                    const uint8_t *buf, int len)
 {
     cpu_physical_memory_rw(addr, (uint8_t *)buf, len, 1);
 }
@@ -839,18 +839,18 @@ int cpu_memory_rw_debug(CPUState *env, t
 #define CODE_DIRTY_FLAG 0x02
 
 /* read dirty bit (return 0 or 1) */
-static inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
+static always_inline int cpu_physical_memory_is_dirty(ram_addr_t addr)
 {
     return phys_ram_dirty[addr >> TARGET_PAGE_BITS] == 0xff;
 }
 
-static inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
-                                                int dirty_flags)
+static always_inline int cpu_physical_memory_get_dirty(ram_addr_t addr,
+                                                       int dirty_flags)
 {
     return phys_ram_dirty[addr >> TARGET_PAGE_BITS] & dirty_flags;
 }
 
-static inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
+static always_inline void cpu_physical_memory_set_dirty(ram_addr_t addr)
 {
     phys_ram_dirty[addr >> TARGET_PAGE_BITS] = 0xff;
 }
@@ -867,21 +867,21 @@ void dump_exec_info(FILE *f,
 
 #if defined(__powerpc__)
 
-static inline uint32_t get_tbl(void)
+static always_inline uint32_t get_tbl(void)
 {
     uint32_t tbl;
     asm volatile("mftb %0" : "=r" (tbl));
     return tbl;
 }
 
-static inline uint32_t get_tbu(void)
+static always_inline uint32_t get_tbu(void)
 {
 	uint32_t tbl;
 	asm volatile("mftbu %0" : "=r" (tbl));
 	return tbl;
 }
 
-static inline int64_t cpu_get_real_ticks(void)
+static always_inline int64_t cpu_get_real_ticks(void)
 {
     uint32_t l, h, h1;
     /* NOTE: we test if wrapping has occurred */
@@ -895,7 +895,7 @@ static inline int64_t cpu_get_real_ticks
 
 #elif defined(__i386__)
 
-static inline int64_t cpu_get_real_ticks(void)
+static always_inline int64_t cpu_get_real_ticks(void)
 {
     int64_t val;
     asm volatile ("rdtsc" : "=A" (val));
@@ -904,7 +904,7 @@ static inline int64_t cpu_get_real_ticks
 
 #elif defined(__x86_64__)
 
-static inline int64_t cpu_get_real_ticks(void)
+static always_inline int64_t cpu_get_real_ticks(void)
 {
     uint32_t low,high;
     int64_t val;
@@ -917,7 +917,7 @@ static inline int64_t cpu_get_real_ticks
 
 #elif defined(__ia64)
 
-static inline int64_t cpu_get_real_ticks(void)
+static always_inline int64_t cpu_get_real_ticks(void)
 {
 	int64_t val;
 	asm volatile ("mov %0 = ar.itc" : "=r"(val) :: "memory");
@@ -926,7 +926,7 @@ static inline int64_t cpu_get_real_ticks
 
 #elif defined(__s390__)
 
-static inline int64_t cpu_get_real_ticks(void)
+static always_inline int64_t cpu_get_real_ticks(void)
 {
     int64_t val;
     asm volatile("stck 0(%1)" : "=m" (val) : "a" (&val) : "cc");
@@ -935,7 +935,7 @@ static inline int64_t cpu_get_real_ticks
 
 #elif defined(__sparc_v8plus__) || defined(__sparc_v8plusa__) || defined(__sparc_v9__)
 
-static inline int64_t cpu_get_real_ticks (void)
+static always_inline int64_t cpu_get_real_ticks (void)
 {
 #if     defined(_LP64)
         uint64_t        rval;
@@ -957,7 +957,7 @@ static inline int64_t cpu_get_real_ticks
 
 #elif defined(__mips__)
 
-static inline int64_t cpu_get_real_ticks(void)
+static always_inline int64_t cpu_get_real_ticks(void)
 {
 #if __mips_isa_rev >= 2
     uint32_t count;
@@ -979,7 +979,7 @@ static inline int64_t cpu_get_real_ticks
 /* The host CPU doesn't have an easily accessible cycle counter.
    Just return a monotonically increasing value.  This will be
    totally wrong, but hopefully better than nothing.  */
-static inline int64_t cpu_get_real_ticks (void)
+static always_inline int64_t cpu_get_real_ticks (void)
 {
     static int64_t ticks = 0;
     return ticks++;
@@ -988,7 +988,7 @@ static inline int64_t cpu_get_real_ticks
 
 /* profiling */
 #ifdef CONFIG_PROFILER
-static inline int64_t profile_getclock(void)
+static always_inline int64_t profile_getclock(void)
 {
     return cpu_get_real_ticks();
 }
Index: dyngen.c
===================================================================
RCS file: /sources/qemu/qemu/dyngen.c,v
retrieving revision 1.57
diff -u -d -d -p -r1.57 dyngen.c
--- dyngen.c	7 Nov 2007 16:07:32 -0000	1.57
+++ dyngen.c	15 Nov 2007 22:59:46 -0000
@@ -219,6 +219,7 @@ struct nlist_extended
 
 #endif /* CONFIG_FORMAT_MACH */
 
+#include "osdep.h"
 #include "bswap.h"
 
 enum {
Index: exec-all.h
===================================================================
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.70
diff -u -d -d -p -r1.70 exec-all.h
--- exec-all.h	4 Nov 2007 02:24:57 -0000	1.70
+++ exec-all.h	15 Nov 2007 22:59:46 -0000
@@ -115,9 +90,9 @@ void tlb_flush(CPUState *env, int flush_
 int tlb_set_page_exec(CPUState *env, target_ulong vaddr,
                       target_phys_addr_t paddr, int prot,
                       int mmu_idx, int is_softmmu);
-static inline int tlb_set_page(CPUState *env, target_ulong vaddr,
-                               target_phys_addr_t paddr, int prot,
-                               int mmu_idx, int is_softmmu)
+static always_inline int tlb_set_page(CPUState *env, target_ulong vaddr,
+                                      target_phys_addr_t paddr, int prot,
+                                      int mmu_idx, int is_softmmu)
 {
     if (prot & PAGE_READ)
         prot |= PAGE_EXEC;
@@ -209,14 +184,14 @@ typedef struct TranslationBlock {
     struct TranslationBlock *jmp_first;
 } TranslationBlock;
 
-static inline unsigned int tb_jmp_cache_hash_page(target_ulong pc)
+static always_inline unsigned int tb_jmp_cache_hash_page(target_ulong pc)
 {
     target_ulong tmp;
     tmp = pc ^ (pc >> (TARGET_PAGE_BITS - TB_JMP_PAGE_BITS));
     return (tmp >> TB_JMP_PAGE_BITS) & TB_JMP_PAGE_MASK;
 }
 
-static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
+static always_inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
 {
     target_ulong tmp;
     tmp = pc ^ (pc >> (TARGET_PAGE_BITS - TB_JMP_PAGE_BITS));
@@ -224,7 +199,7 @@ static inline unsigned int tb_jmp_cache_
 	    (tmp & TB_JMP_ADDR_MASK));
 }
 
-static inline unsigned int tb_phys_hash_func(unsigned long pc)
+static always_inline unsigned int tb_phys_hash_func(unsigned long pc)
 {
     return pc & (CODE_GEN_PHYS_HASH_SIZE - 1);
 }
@@ -242,7 +217,8 @@ extern uint8_t *code_gen_ptr;
 #if defined(USE_DIRECT_JUMP)
 
 #if defined(__powerpc__)
-static inline void tb_set_jmp_target1(unsigned long jmp_addr, unsigned long addr)
+static always_inline void tb_set_jmp_target1(unsigned long jmp_addr,
+                                             unsigned long addr)
 {
     uint32_t val, *ptr;
 
@@ -259,7 +235,8 @@ static inline void tb_set_jmp_target1(un
     asm volatile ("isync" : : : "memory");
 }
 #elif defined(__i386__)
-static inline void tb_set_jmp_target1(unsigned long jmp_addr, unsigned long addr)
+static always_inline void tb_set_jmp_target1(unsigned long jmp_addr,
+                                             unsigned long addr)
 {
     /* patch the branch destination */
     *(uint32_t *)jmp_addr = addr - (jmp_addr + 4);
@@ -267,8 +244,8 @@ static inline void tb_set_jmp_target1(un
 }
 #endif
 
-static inline void tb_set_jmp_target(TranslationBlock *tb,
-                                     int n, unsigned long addr)
+static always_inline void tb_set_jmp_target(TranslationBlock *tb,
+                                            int n, unsigned long addr)
 {
     unsigned long offset;
 
@@ -282,16 +259,16 @@ static inline void tb_set_jmp_target(Tra
 #else
 
 /* set the jump target */
-static inline void tb_set_jmp_target(TranslationBlock *tb,
-                                     int n, unsigned long addr)
+static always_inline void tb_set_jmp_target(TranslationBlock *tb,
+                                            int n, unsigned long addr)
 {
     tb->tb_next[n] = addr;
 }
 
 #endif
 
-static inline void tb_add_jump(TranslationBlock *tb, int n,
-                               TranslationBlock *tb_next)
+static always_inline void tb_add_jump(TranslationBlock *tb, int n,
+                                      TranslationBlock *tb_next)
 {
     /* NOTE: this test is only needed for thread safety */
     if (!tb->jmp_next[n]) {
@@ -389,7 +366,7 @@ extern CPUReadMemoryFunc *io_mem_read[IO
 extern void *io_mem_opaque[IO_MEM_NB_ENTRIES];
 
 #if defined(__powerpc__)
-static inline int testandset (int *p)
+static always_inline int testandset (int *p)
 {
     int ret;
     __asm__ __volatile__ (
@@ -405,7 +382,7 @@ static inline int testandset (int *p)
     return ret;
 }
 #elif defined(__i386__)
-static inline int testandset (int *p)
+static always_inline int testandset (int *p)
 {
     long int readval = 0;
 
@@ -416,7 +393,7 @@ static inline int testandset (int *p)
     return readval;
 }
 #elif defined(__x86_64__)
-static inline int testandset (int *p)
+static always_inline int testandset (int *p)
 {
     long int readval = 0;
 
@@ -427,7 +404,7 @@ static inline int testandset (int *p)
     return readval;
 }
 #elif defined(__s390__)
-static inline int testandset (int *p)
+static always_inline int testandset (int *p)
 {
     int ret;
 
@@ -439,7 +416,7 @@ static inline int testandset (int *p)
     return ret;
 }
 #elif defined(__alpha__)
-static inline int testandset (int *p)
+static always_inline int testandset (int *p)
 {
     int ret;
     unsigned long one;
@@ -456,7 +433,7 @@ static inline int testandset (int *p)
     return ret;
 }
 #elif defined(__sparc__)
-static inline int testandset (int *p)
+static always_inline int testandset (int *p)
 {
 	int ret;
 
@@ -468,7 +445,7 @@ static inline int testandset (int *p)
 	return (ret ? 1 : 0);
 }
 #elif defined(__arm__)
-static inline int testandset (int *spinlock)
+static always_inline int testandset (int *spinlock)
 {
     register unsigned int ret;
     __asm__ __volatile__("swp %0, %1, [%2]"
@@ -478,7 +455,7 @@ static inline int testandset (int *spinl
     return ret;
 }
 #elif defined(__mc68000)
-static inline int testandset (int *p)
+static always_inline int testandset (int *p)
 {
     char ret;
     __asm__ __volatile__("tas %1; sne %0"
@@ -491,12 +468,12 @@ static inline int testandset (int *p)
 
 #include <ia64intrin.h>
 
-static inline int testandset (int *p)
+static always_inline int testandset (int *p)
 {
     return __sync_lock_test_and_set (p, 1);
 }
 #elif defined(__mips__)
-static inline int testandset (int *p)
+static always_inline int testandset (int *p)
 {
     int ret;
 
@@ -524,30 +501,30 @@ typedef int spinlock_t;
 #define SPIN_LOCK_UNLOCKED 0
 
 #if defined(CONFIG_USER_ONLY)
-static inline void spin_lock(spinlock_t *lock)
+static always_inline void spin_lock(spinlock_t *lock)
 {
     while (testandset(lock));
 }
 
-static inline void spin_unlock(spinlock_t *lock)
+static always_inline void spin_unlock(spinlock_t *lock)
 {
     *lock = 0;
 }
 
-static inline int spin_trylock(spinlock_t *lock)
+static always_inline int spin_trylock(spinlock_t *lock)
 {
     return !testandset(lock);
 }
 #else
-static inline void spin_lock(spinlock_t *lock)
+static always_inline void spin_lock(spinlock_t *lock)
 {
 }
 
-static inline void spin_unlock(spinlock_t *lock)
+static always_inline void spin_unlock(spinlock_t *lock)
 {
 }
 
-static inline int spin_trylock(spinlock_t *lock)
+static always_inline int spin_trylock(spinlock_t *lock)
 {
     return 1;
 }
@@ -585,7 +562,8 @@ void tlb_fill(target_ulong addr, int is_
 #endif
 
 #if defined(CONFIG_USER_ONLY)
-static inline target_ulong get_phys_addr_code(CPUState *env, target_ulong addr)
+static always_inline target_ulong get_phys_addr_code(CPUState *env,
+                                                     target_ulong addr)
 {
     return addr;
 }
@@ -593,7 +571,8 @@ static inline target_ulong get_phys_addr
 /* NOTE: this function can trigger an exception */
 /* NOTE2: the returned address is not exactly the physical address: it
    is the offset relative to phys_ram_base */
-static inline target_ulong get_phys_addr_code(CPUState *env, target_ulong addr)
+static always_inline target_ulong get_phys_addr_code(CPUState *env,
+                                                     target_ulong addr)
 {
     int mmu_idx, index, pd;
 
@@ -627,7 +606,7 @@ void kqemu_modify_page(CPUState *env, ra
 void kqemu_cpu_interrupt(CPUState *env);
 void kqemu_record_dump(void);
 
-static inline int kqemu_is_ok(CPUState *env)
+static always_inline int kqemu_is_ok(CPUState *env)
 {
     return(env->kqemu_enabled &&
            (env->cr[0] & CR0_PE_MASK) &&
Index: qemu-common.h
===================================================================
RCS file: /sources/qemu/qemu/qemu-common.h,v
retrieving revision 1.1
diff -u -d -d -p -r1.1 qemu-common.h
--- qemu-common.h	11 Nov 2007 02:51:16 -0000	1.1
+++ qemu-common.h	15 Nov 2007 22:59:46 -0000
@@ -36,7 +36,7 @@ extern int qemu_ftruncate64(int, int64_t
 #define ftruncate qemu_ftruncate64
 
 
-static inline char *realpath(const char *path, char *resolved_path)
+static always_inline char *realpath(const char *path, char *resolved_path)
 {
     _fullpath(resolved_path, path, _MAX_PATH);
     return resolved_path;
Index: softmmu-semi.h
===================================================================
RCS file: /sources/qemu/qemu/softmmu-semi.h,v
retrieving revision 1.3
diff -u -d -d -p -r1.3 softmmu-semi.h
--- softmmu-semi.h	11 Nov 2007 14:26:46 -0000	1.3
+++ softmmu-semi.h	15 Nov 2007 22:59:46 -0000
@@ -7,14 +7,14 @@
  * This code is licenced under the GPL
  */
 
-static inline uint32_t softmmu_tget32(CPUState *env, uint32_t addr)
+static always_inline uint32_t softmmu_tget32(CPUState *env, uint32_t addr)
 {
     uint32_t val;
 
     cpu_memory_rw_debug(env, addr, (uint8_t *)&val, 4, 0);
     return tswap32(val);
 }
-static inline uint32_t softmmu_tget8(CPUState *env, uint32_t addr)
+static always_inline uint32_t softmmu_tget8(CPUState *env, uint32_t addr)
 {
     uint8_t val;
 
@@ -24,7 +24,8 @@ static inline uint32_t softmmu_tget8(CPU
 #define tget32(p) softmmu_tget32(env, p)
 #define tget8(p) softmmu_tget8(env, p)
 
-static inline void softmmu_tput32(CPUState *env, uint32_t addr, uint32_t val)
+static always_inline void softmmu_tput32(CPUState *env,
+                                         uint32_t addr, uint32_t val)
 {
     val = tswap32(val);
     cpu_memory_rw_debug(env, addr, (uint8_t *)&val, 4, 1);
Index: softmmu_header.h
===================================================================
RCS file: /sources/qemu/qemu/softmmu_header.h,v
retrieving revision 1.18
diff -u -d -d -p -r1.18 softmmu_header.h
--- softmmu_header.h	14 Oct 2007 07:07:05 -0000	1.18
+++ softmmu_header.h	15 Nov 2007 22:59:46 -0000
@@ -79,7 +79,7 @@ void REGPARM(2) glue(glue(__st, SUFFIX),
 
 #define CPU_TLB_ENTRY_BITS 4
 
-static inline RES_TYPE glue(glue(ld, USUFFIX), MEMSUFFIX)(target_ulong ptr)
+static always_inline RES_TYPE glue(glue(ld, USUFFIX), MEMSUFFIX)(target_ulong ptr)
 {
     int res;
 
@@ -122,7 +122,7 @@ static inline RES_TYPE glue(glue(ld, USU
 }
 
 #if DATA_SIZE <= 2
-static inline int glue(glue(lds, SUFFIX), MEMSUFFIX)(target_ulong ptr)
+static always_inline int glue(glue(lds, SUFFIX), MEMSUFFIX)(target_ulong ptr)
 {
     int res;
 
@@ -169,7 +169,8 @@ static inline int glue(glue(lds, SUFFIX)
 }
 #endif
 
-static inline void glue(glue(st, SUFFIX), MEMSUFFIX)(target_ulong ptr, RES_TYPE v)
+static always_inline void glue(glue(st, SUFFIX), MEMSUFFIX)(target_ulong ptr,
+                                                            RES_TYPE v)
 {
     asm volatile ("movl %0, %%edx\n"
                   "movl %0, %%eax\n"
@@ -223,7 +224,7 @@ static inline void glue(glue(st, SUFFIX)
 
 /* generic load/store macros */
 
-static inline RES_TYPE glue(glue(ld, USUFFIX), MEMSUFFIX)(target_ulong ptr)
+static always_inline RES_TYPE glue(glue(ld, USUFFIX), MEMSUFFIX)(target_ulong ptr)
 {
     int index;
     RES_TYPE res;
@@ -245,7 +246,7 @@ static inline RES_TYPE glue(glue(ld, USU
 }
 
 #if DATA_SIZE <= 2
-static inline int glue(glue(lds, SUFFIX), MEMSUFFIX)(target_ulong ptr)
+static always_inline int glue(glue(lds, SUFFIX), MEMSUFFIX)(target_ulong ptr)
 {
     int res, index;
     target_ulong addr;
@@ -270,7 +271,8 @@ static inline int glue(glue(lds, SUFFIX)
 
 /* generic store macro */
 
-static inline void glue(glue(st, SUFFIX), MEMSUFFIX)(target_ulong ptr, RES_TYPE v)
+static always_inline void glue(glue(st, SUFFIX), MEMSUFFIX)(target_ulong ptr,
+                                                            RES_TYPE v)
 {
     int index;
     target_ulong addr;
@@ -296,7 +298,7 @@ static inline void glue(glue(st, SUFFIX)
 #if ACCESS_TYPE != (NB_MMU_MODES + 1)
 
 #if DATA_SIZE == 8
-static inline float64 glue(ldfq, MEMSUFFIX)(target_ulong ptr)
+static always_inline float64 glue(ldfq, MEMSUFFIX)(target_ulong ptr)
 {
     union {
         float64 d;
@@ -306,7 +308,7 @@ static inline float64 glue(ldfq, MEMSUFF
     return u.d;
 }
 
-static inline void glue(stfq, MEMSUFFIX)(target_ulong ptr, float64 v)
+static always_inline void glue(stfq, MEMSUFFIX)(target_ulong ptr, float64 v)
 {
     union {
         float64 d;
@@ -318,7 +320,7 @@ static inline void glue(stfq, MEMSUFFIX)
 #endif /* DATA_SIZE == 8 */
 
 #if DATA_SIZE == 4
-static inline float32 glue(ldfl, MEMSUFFIX)(target_ulong ptr)
+static always_inline float32 glue(ldfl, MEMSUFFIX)(target_ulong ptr)
 {
     union {
         float32 f;
@@ -328,7 +330,7 @@ static inline float32 glue(ldfl, MEMSUFF
     return u.f;
 }
 
-static inline void glue(stfl, MEMSUFFIX)(target_ulong ptr, float32 v)
+static always_inline void glue(stfl, MEMSUFFIX)(target_ulong ptr, float32 v)
 {
     union {
         float32 f;
Index: softmmu_template.h
===================================================================
RCS file: /sources/qemu/qemu/softmmu_template.h,v
retrieving revision 1.19
diff -u -d -d -p -r1.19 softmmu_template.h
--- softmmu_template.h	14 Oct 2007 07:07:05 -0000	1.19
+++ softmmu_template.h	15 Nov 2007 22:59:46 -0000
@@ -50,8 +50,8 @@
 static DATA_TYPE glue(glue(slow_ld, SUFFIX), MMUSUFFIX)(target_ulong addr,
                                                         int mmu_idx,
                                                         void *retaddr);
-static inline DATA_TYPE glue(io_read, SUFFIX)(target_phys_addr_t physaddr,
-                                              target_ulong tlb_addr)
+static always_inline DATA_TYPE glue(io_read, SUFFIX)(target_phys_addr_t physaddr,
+                                                     target_ulong tlb_addr)
 {
     DATA_TYPE res;
     int index;
@@ -183,10 +183,10 @@ static void glue(glue(slow_st, SUFFIX), 
                                                    int mmu_idx,
                                                    void *retaddr);
 
-static inline void glue(io_write, SUFFIX)(target_phys_addr_t physaddr,
-                                          DATA_TYPE val,
-                                          target_ulong tlb_addr,
-                                          void *retaddr)
+static always_inline void glue(io_write, SUFFIX)(target_phys_addr_t physaddr,
+                                                 DATA_TYPE val,
+                                                 target_ulong tlb_addr,
+                                                 void *retaddr)
 {
     int index;
 
Index: vl.h
===================================================================
RCS file: /sources/qemu/qemu/vl.h,v
retrieving revision 1.295
diff -u -d -d -p -r1.295 vl.h
--- vl.h	11 Nov 2007 17:56:38 -0000	1.295
+++ vl.h	15 Nov 2007 22:59:46 -0000
@@ -319,12 +288,13 @@ struct DisplayState {
                           uint8_t *image, uint8_t *mask);
 };
 
-static inline void dpy_update(DisplayState *s, int x, int y, int w, int h)
+static always_inline void dpy_update(DisplayState *s, int x,
+                                     int y, int w, int h)
 {
     s->dpy_update(s, x, y, w, h);
 }
 
-static inline void dpy_resize(DisplayState *s, int w, int h)
+static always_inline void dpy_resize(DisplayState *s, int w, int h)
 {
     s->dpy_resize(s, w, h);
 }
@@ -464,42 +434,42 @@ unsigned int qemu_get_be16(QEMUFile *f);
 unsigned int qemu_get_be32(QEMUFile *f);
 uint64_t qemu_get_be64(QEMUFile *f);
 
-static inline void qemu_put_be64s(QEMUFile *f, const uint64_t *pv)
+static always_inline void qemu_put_be64s(QEMUFile *f, const uint64_t *pv)
 {
     qemu_put_be64(f, *pv);
 }
 
-static inline void qemu_put_be32s(QEMUFile *f, const uint32_t *pv)
+static always_inline void qemu_put_be32s(QEMUFile *f, const uint32_t *pv)
 {
     qemu_put_be32(f, *pv);
 }
 
-static inline void qemu_put_be16s(QEMUFile *f, const uint16_t *pv)
+static always_inline void qemu_put_be16s(QEMUFile *f, const uint16_t *pv)
 {
     qemu_put_be16(f, *pv);
 }
 
-static inline void qemu_put_8s(QEMUFile *f, const uint8_t *pv)
+static always_inline void qemu_put_8s(QEMUFile *f, const uint8_t *pv)
 {
     qemu_put_byte(f, *pv);
 }
 
-static inline void qemu_get_be64s(QEMUFile *f, uint64_t *pv)
+static always_inline void qemu_get_be64s(QEMUFile *f, uint64_t *pv)
 {
     *pv = qemu_get_be64(f);
 }
 
-static inline void qemu_get_be32s(QEMUFile *f, uint32_t *pv)
+static always_inline void qemu_get_be32s(QEMUFile *f, uint32_t *pv)
 {
     *pv = qemu_get_be32(f);
 }
 
-static inline void qemu_get_be16s(QEMUFile *f, uint16_t *pv)
+static always_inline void qemu_get_be16s(QEMUFile *f, uint16_t *pv)
 {
     *pv = qemu_get_be16(f);
 }
 
-static inline void qemu_get_8s(QEMUFile *f, uint8_t *pv)
+static always_inline void qemu_get_8s(QEMUFile *f, uint8_t *pv)
 {
     *pv = qemu_get_byte(f);
 }
@@ -1021,7 +993,7 @@ struct clk_setup_t {
     clk_setup_cb cb;
     void *opaque;
 };
-static inline void clk_setup (clk_setup_t *clk, uint32_t freq)
+static always_inline void clk_setup (clk_setup_t *clk, uint32_t freq)
 {
     if (clk->cb != NULL)
         (*clk->cb)(clk->opaque, freq);
@@ -1054,16 +1026,16 @@ extern QEMUMachine ss5_machine, ss10_mac
 void *iommu_init(target_phys_addr_t addr);
 void sparc_iommu_memory_rw(void *opaque, target_phys_addr_t addr,
                                  uint8_t *buf, int len, int is_write);
-static inline void sparc_iommu_memory_read(void *opaque,
-                                           target_phys_addr_t addr,
-                                           uint8_t *buf, int len)
+static always_inline void sparc_iommu_memory_read(void *opaque,
+                                                  target_phys_addr_t addr,
+                                                  uint8_t *buf, int len)
 {
     sparc_iommu_memory_rw(opaque, addr, buf, len, 0);
 }
 
-static inline void sparc_iommu_memory_write(void *opaque,
-                                            target_phys_addr_t addr,
-                                            uint8_t *buf, int len)
+static always_inline void sparc_iommu_memory_write(void *opaque,
+                                                   target_phys_addr_t addr,
+                                                   uint8_t *buf, int len)
 {
     sparc_iommu_memory_rw(opaque, addr, buf, len, 1);
 }

[-- Attachment #4: gcc_inline_limits.diff --]
[-- Type: text/x-patch, Size: 885 bytes --]

Index: Makefile.target
===================================================================
RCS file: /sources/qemu/qemu/Makefile.target,v
retrieving revision 1.223
diff -u -d -d -p -r1.223 Makefile.target
--- Makefile.target	11 Nov 2007 14:26:46 -0000	1.223
+++ Makefile.target	15 Nov 2007 22:57:58 -0000
@@ -117,6 +116,10 @@ OP_CFLAGS+=$(call cc-option, -fno-align-
 OP_CFLAGS+=$(call cc-option, -fno-align-jumps, "")
 OP_CFLAGS+=$(call cc-option, -fno-align-functions, $(call cc-option, -malign-functions=0, ""))
 OP_CFLAGS+=$(call cc-option, -fno-section-anchors, "")
+OP_CFLAGS+=$(call cc-option, -fno-unit-at-a-time, "")
+OP_CFLAGS+=$(call cc-option, --param inline-unit-growth=1000000, "")
+OP_CFLAGS+=$(call cc-option, -finline-limit=100000, "")
+OP_CFLAGS+=$(call cc-option, --param large-function-growth=100000, "")
 
 ifeq ($(ARCH),i386)
 HELPER_CFLAGS+=-fomit-frame-pointer

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-15 23:18 [Qemu-devel] RFC: fix for random Qemu crashes J. Mayer
@ 2007-11-15 23:49 ` andrzej zaborowski
  2007-11-16  0:09   ` J. Mayer
  2007-11-16 15:06 ` Heikki Lindholm
  2007-11-16 15:52 ` Paul Brook
  2 siblings, 1 reply; 18+ messages in thread
From: andrzej zaborowski @ 2007-11-15 23:49 UTC (permalink / raw)
  To: qemu-devel

Hi,

On 16/11/2007, J. Mayer <l_indien@magic.fr> wrote:
> Some may have experienced of having some Qemu builds crashing,
> apparently at random places, but in a reproducable way.
> I found one reason for this crashes: it appears that with the growth of
> the op.c file, there may be cases where we could reach the inlining
> limits of gcc. In such a case, gcc would not inline some declared
> "inline" function but would emit a call and provide a separate function.
> Unfortunately, this is not acceptable in op.o context as it will
> slowdown the emulation and because the call is likely to break the
> specific compilation rules (ie reserved registers) used while compiling
> op.o
> I found some workaround to avoid this behavior and I'd like to get
> opinions about it.
> The first idea is to change all occurences of 'inline' with
> 'always_inline' in all headers included in op.c. It then appeared to me
> that always_inline is not globally declared and that the definition is
> duplicated in vl.h and exec-all.h. But it's not declared in
> darwin-user/qemu.h and linux-user/qemu.h, which is not consistent with
> the declaration in vl.h. Further investigations showed me that the
> osdep.h header is the one that is actually included everywhere. Even if
> those are more compiler than OS dependent, I decided to move the
> definitions for glue, tostring, likely, unlikely, always_inline and
> REGPARM to this header so they can be globally used. I also changed the
> include orders in darwin-user/qemu.h to be sure this header will be
> included first. This patch is attached in common_defs.diff.

I had also noticed that glue and tostring definitions were present in
three places in qemu and it seemed wrong to me, so I'm in favour of
this patch. I can't say much about the other patches.

After armv6/armv7 support was merged on Sunday, I had a consistent
segfaults in the generated code and I tracked it down to cpsr_write
function not being inlined properly because it grew in size. Changing
the inline to always_inline helped but we decided to instead move the
function to target-arm/helper.c. I don't know which approach is
better, the performance hit shouldn't be notable (in case of ARM
cpsr).

> Giving this patch, I've been able to replace all occurence of 'inline'
> with 'always_inline' in all headers included from op.c (given the
> generated .d file). Some would say I'd better add a #define inline
> always_inline somewhere. I personnally dislike this solution as this
> kind of macro as it tends to expand recursivally (always_inline
> definition contains the inline word) and this may lead to compilation
> warnings or errors in some context; one could do tests using the linux
> kernel headers to get convinced that it can happen.
> Then, I choosed to replace 'inline' by 'always_inline', which is more
> invasive but have less risks of side effects. The diff is attached in
> always_inline.diff.
> The last thing that helps solve the problem is to change the inlining
> limits of gcc, at least to compile the op.o file.Unfortunatelly, there
> is no way to disable those limits (I checked in the source code), then I
> put them to an arbitrary high level. I also added the -funit-at-a-time
> switch, as this kind of optimisation would not be relevant in op.o
> context. The diff is attached in gcc_inline_limits.diff.
>
> Please comment.
>
> --
> J. Mayer <l_indien@magic.fr>
> Never organized
>
>

Regards

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-15 23:49 ` andrzej zaborowski
@ 2007-11-16  0:09   ` J. Mayer
  0 siblings, 0 replies; 18+ messages in thread
From: J. Mayer @ 2007-11-16  0:09 UTC (permalink / raw)
  To: qemu-devel


On Fri, 2007-11-16 at 00:49 +0100, andrzej zaborowski wrote:
> Hi,
> 
> On 16/11/2007, J. Mayer <l_indien@magic.fr> wrote:
> > Some may have experienced of having some Qemu builds crashing,
> > apparently at random places, but in a reproducable way.
> > I found one reason for this crashes: it appears that with the growth of
> > the op.c file, there may be cases where we could reach the inlining
> > limits of gcc. In such a case, gcc would not inline some declared
> > "inline" function but would emit a call and provide a separate function.
> > Unfortunately, this is not acceptable in op.o context as it will
> > slowdown the emulation and because the call is likely to break the
> > specific compilation rules (ie reserved registers) used while compiling
> > op.o
> > I found some workaround to avoid this behavior and I'd like to get
> > opinions about it.
> > The first idea is to change all occurences of 'inline' with
> > 'always_inline' in all headers included in op.c. It then appeared to me
> > that always_inline is not globally declared and that the definition is
> > duplicated in vl.h and exec-all.h. But it's not declared in
> > darwin-user/qemu.h and linux-user/qemu.h, which is not consistent with
> > the declaration in vl.h. Further investigations showed me that the
> > osdep.h header is the one that is actually included everywhere. Even if
> > those are more compiler than OS dependent, I decided to move the
> > definitions for glue, tostring, likely, unlikely, always_inline and
> > REGPARM to this header so they can be globally used. I also changed the
> > include orders in darwin-user/qemu.h to be sure this header will be
> > included first. This patch is attached in common_defs.diff.
> 
> I had also noticed that glue and tostring definitions were present in
> three places in qemu and it seemed wrong to me, so I'm in favour of
> this patch. I can't say much about the other patches.
> 
> After armv6/armv7 support was merged on Sunday, I had a consistent
> segfaults in the generated code and I tracked it down to cpsr_write
> function not being inlined properly because it grew in size. Changing
> the inline to always_inline helped but we decided to instead move the
> function to target-arm/helper.c. I don't know which approach is
> better, the performance hit shouldn't be notable (in case of ARM
> cpsr).

When the problem appears in rarely used or large functions, moving them
to helpers might be the proper solution. When it appears in load/store
functions (which is the case I experienced), this would not be
acceptable, as those functions are used very often and are really time
critical. Furthermore, this problem is likely to be encountered more
often as the targets features increase and then need to be solved, even
if using more helpers might be a work-around to temporary hide the
problem.

[...]

-- 
J. Mayer <l_indien@magic.fr>
Never organized

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-15 23:18 [Qemu-devel] RFC: fix for random Qemu crashes J. Mayer
  2007-11-15 23:49 ` andrzej zaborowski
@ 2007-11-16 15:06 ` Heikki Lindholm
  2007-11-16 15:35   ` Jocelyn Mayer
  2007-11-16 15:59   ` Jamie Lokier
  2007-11-16 15:52 ` Paul Brook
  2 siblings, 2 replies; 18+ messages in thread
From: Heikki Lindholm @ 2007-11-16 15:06 UTC (permalink / raw)
  To: qemu-devel

J. Mayer kirjoitti:
> Some may have experienced of having some Qemu builds crashing,
> apparently at random places, but in a reproducable way.
> I found one reason for this crashes: it appears that with the growth of
> the op.c file, there may be cases where we could reach the inlining
> limits of gcc. In such a case, gcc would not inline some declared
> "inline" function but would emit a call and provide a separate function.
> Unfortunately, this is not acceptable in op.o context as it will
> slowdown the emulation and because the call is likely to break the
> specific compilation rules (ie reserved registers) used while compiling
> op.o

Does -winline give a warning when this happens?

-- Heikki Lindholm

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-16 15:06 ` Heikki Lindholm
@ 2007-11-16 15:35   ` Jocelyn Mayer
  2007-11-16 15:42     ` Heikki Lindholm
  2007-11-16 15:59   ` Jamie Lokier
  1 sibling, 1 reply; 18+ messages in thread
From: Jocelyn Mayer @ 2007-11-16 15:35 UTC (permalink / raw)
  To: qemu-devel


On Fri, 2007-11-16 at 17:06 +0200, Heikki Lindholm wrote:
> J. Mayer kirjoitti:
> > Some may have experienced of having some Qemu builds crashing,
> > apparently at random places, but in a reproducable way.
> > I found one reason for this crashes: it appears that with the growth of
> > the op.c file, there may be cases where we could reach the inlining
> > limits of gcc. In such a case, gcc would not inline some declared
> > "inline" function but would emit a call and provide a separate function.
> > Unfortunately, this is not acceptable in op.o context as it will
> > slowdown the emulation and because the call is likely to break the
> > specific compilation rules (ie reserved registers) used while compiling
> > op.o
> 
> Does -winline give a warning when this happens?

I did not check this, but getting a warning won't help us a lot. The
generated Qemu executable would still crash.

-- 
Jocelyn Mayer <l_indien@magic.fr>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-16 15:35   ` Jocelyn Mayer
@ 2007-11-16 15:42     ` Heikki Lindholm
  2007-11-16 16:34       ` Jocelyn Mayer
  0 siblings, 1 reply; 18+ messages in thread
From: Heikki Lindholm @ 2007-11-16 15:42 UTC (permalink / raw)
  To: l_indien, qemu-devel

Jocelyn Mayer kirjoitti:
> On Fri, 2007-11-16 at 17:06 +0200, Heikki Lindholm wrote:
>> J. Mayer kirjoitti:
>>> Some may have experienced of having some Qemu builds crashing,
>>> apparently at random places, but in a reproducable way.
>>> I found one reason for this crashes: it appears that with the growth of
>>> the op.c file, there may be cases where we could reach the inlining
>>> limits of gcc. In such a case, gcc would not inline some declared
>>> "inline" function but would emit a call and provide a separate function.
>>> Unfortunately, this is not acceptable in op.o context as it will
>>> slowdown the emulation and because the call is likely to break the
>>> specific compilation rules (ie reserved registers) used while compiling
>>> op.o
>> Does -winline give a warning when this happens?
> 
> I did not check this, but getting a warning won't help us a lot. The
> generated Qemu executable would still crash.

But a warning makes the problem more noticable. I tried -Winline and it 
does give warnings, but I'm not sure whether I'm hitting the same 
behaviour you see. I silenced the lot with 
--param-max-inline-insns-single=4000

-- Heikki Lindholm

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-15 23:18 [Qemu-devel] RFC: fix for random Qemu crashes J. Mayer
  2007-11-15 23:49 ` andrzej zaborowski
  2007-11-16 15:06 ` Heikki Lindholm
@ 2007-11-16 15:52 ` Paul Brook
  2007-11-16 16:05   ` Jocelyn Mayer
  2 siblings, 1 reply; 18+ messages in thread
From: Paul Brook @ 2007-11-16 15:52 UTC (permalink / raw)
  To: qemu-devel; +Cc: J. Mayer

> Then, I choosed to replace 'inline' by 'always_inline', which is more
> invasive but have less risks of side effects. The diff is attached in
> always_inline.diff.
> The last thing that helps solve the problem is to change the inlining
> limits of gcc, at least to compile the op.o file.

Presumably we only need one of the last two patches? It seems rather pointless 
to have always_inline *and* change the inlining heuristics.

I'm ok with using always_inline for op.o (and things it uses directly) as this 
is required for correctness. I'm not convinced that that using always_inline 
everywhere is such a good idea.

Paul

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-16 15:06 ` Heikki Lindholm
  2007-11-16 15:35   ` Jocelyn Mayer
@ 2007-11-16 15:59   ` Jamie Lokier
  2007-11-16 20:13     ` Jocelyn Mayer
  1 sibling, 1 reply; 18+ messages in thread
From: Jamie Lokier @ 2007-11-16 15:59 UTC (permalink / raw)
  To: qemu-devel

Heikki Lindholm wrote:
> J. Mayer kirjoitti:
> >Some may have experienced of having some Qemu builds crashing,
> >apparently at random places, but in a reproducable way.
> >I found one reason for this crashes: it appears that with the growth of
> >the op.c file, there may be cases where we could reach the inlining
> >limits of gcc. In such a case, gcc would not inline some declared
> >"inline" function but would emit a call and provide a separate function.
> >Unfortunately, this is not acceptable in op.o context as it will
> >slowdown the emulation and because the call is likely to break the
> >specific compilation rules (ie reserved registers) used while compiling
> >op.o
> 
> Does -winline give a warning when this happens?

And can it be repaired with __attribute__((__always_inline__))?

-- Jamie

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-16 15:52 ` Paul Brook
@ 2007-11-16 16:05   ` Jocelyn Mayer
  2007-11-16 20:32     ` andrzej zaborowski
  0 siblings, 1 reply; 18+ messages in thread
From: Jocelyn Mayer @ 2007-11-16 16:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paul Brook


On Fri, 2007-11-16 at 15:52 +0000, Paul Brook wrote:
> > Then, I choosed to replace 'inline' by 'always_inline', which is more
> > invasive but have less risks of side effects. The diff is attached in
> > always_inline.diff.
> > The last thing that helps solve the problem is to change the inlining
> > limits of gcc, at least to compile the op.o file.
> 
> Presumably we only need one of the last two patches? It seems rather pointless 
> to have always_inline *and* change the inlining heuristics.

>From the tests I made, it seems that adding always_inline helps but
unfortunatelly does not solve all cases. Should check in the gcc source
code why it is so...

> I'm ok with using always_inline for op.o (and things it uses directly) as this 
> is required for correctness. I'm not convinced that that using always_inline 
> everywhere is such a good idea.

That's exactly what I did: I changed 'inline' to 'always_inline' in
headers that are included by op.c, I did not made any change in other
headers. If some of the functions I changed should not be changed, that
means that there is another bug, ie those functions should not be used
by op.c in any way then should be moved to other headers, or maybe some
of those headers should not be included directly or indirectly in
op.c... If you see such cases, then we may better fix those issues
first...


-- 
Jocelyn Mayer <l_indien@magic.fr>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-16 15:42     ` Heikki Lindholm
@ 2007-11-16 16:34       ` Jocelyn Mayer
  0 siblings, 0 replies; 18+ messages in thread
From: Jocelyn Mayer @ 2007-11-16 16:34 UTC (permalink / raw)
  To: qemu-devel


On Fri, 2007-11-16 at 17:42 +0200, Heikki Lindholm wrote:
> Jocelyn Mayer kirjoitti:
> > On Fri, 2007-11-16 at 17:06 +0200, Heikki Lindholm wrote:
> >> J. Mayer kirjoitti:
> >>> Some may have experienced of having some Qemu builds crashing,
> >>> apparently at random places, but in a reproducable way.
> >>> I found one reason for this crashes: it appears that with the growth of
> >>> the op.c file, there may be cases where we could reach the inlining
> >>> limits of gcc. In such a case, gcc would not inline some declared
> >>> "inline" function but would emit a call and provide a separate function.
> >>> Unfortunately, this is not acceptable in op.o context as it will
> >>> slowdown the emulation and because the call is likely to break the
> >>> specific compilation rules (ie reserved registers) used while compiling
> >>> op.o
> >> Does -winline give a warning when this happens?
> > 
> > I did not check this, but getting a warning won't help us a lot. The
> > generated Qemu executable would still crash.
> 
> But a warning makes the problem more noticable.

yes, sure.

>  I tried -Winline and it 
> does give warnings, but I'm not sure whether I'm hitting the same 
> behaviour you see. I silenced the lot with 
> --param-max-inline-insns-single=4000

This is not sufficient. You will still have problems if triggering the
growth limit of a single function or the whole compilation unit.
Furthermore, 4000 is not a lot, I think it should be set to a very high
value we could never reach in real cases.


I did a build with current CVS and the -Winline switch set, using
gcc-3.4.6 on a x86_64 host. This showed me more problems than I
expected:

while compiling slirp code, I got a lot of warnings from slirp headers:
warning: inlining failed in call to 'slirp_remque': function body not
available
Those seem to be real bugs to be fixed.

from linux-user/syscall.c, I got a lot of warnings "--param
large-function-growth limit reached" in tswap32/tswap64 functions. This
does not seem to be critical, as syscall is actually a slow operation.

from target-i386/helper.c, I got a gew warnings "--param
max-inline-insns-single limit reached"

from target-arm/translate.c, got a few "--param max-inline-insns-single
limit reached"

from target-sparc/translate.c, got a few "--param
max-inline-insns-single limit reached after inlining into the callee"

from target-mips/translate.c: got a lot of "--param inline-unit-growth
limit reached" from gen-op.h, which can be quite a problem
from target-mips/op_helper.c: a lot of " --param inline-unit-growth
 limit reached", mostly for load/store functions, which may be
problematic.

from target-alpha/translate.c: got a lot of "--param inline-unit-growth
limit reached"

More warnings are issued for full system emulation:
from target-i386/translate.c: a lot of "--param large-function-growth
limit reached" from softmmu_header.h, which means code fetch is done via
function, which is not great.

from hw/pxa2xx.c: "--param large-function-growth limit reached". Should
not be problematic.

from target-ppc/helper.c: lots of "--param inline-unit-growth limit
reached" for load/store functions, which may be problematic

I noticed no warnings from target-ppc/op.c. But I've seen the bug
triggered with the build I done before commiting wy latest patches (with
those patches applied to a clean repository...) for the PowerPC 64
target, resulting in an emulator that would crash while entering user
mode (with the CDROM images I tested).
I'm not sure if there could be some cases gcc won't warn even when not
inlining an declared inline function, or if the bug what not triggered
this time, for an unknown reason as I just did a build from a checkout
with just the -Winline flag added (then the build should be quite
equivalent to the one I did this morning before commiting).

-- 
Jocelyn Mayer <l_indien@magic.fr>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-16 15:59   ` Jamie Lokier
@ 2007-11-16 20:13     ` Jocelyn Mayer
  0 siblings, 0 replies; 18+ messages in thread
From: Jocelyn Mayer @ 2007-11-16 20:13 UTC (permalink / raw)
  To: qemu-devel


On Fri, 2007-11-16 at 15:59 +0000, Jamie Lokier wrote:
> Heikki Lindholm wrote:
> > J. Mayer kirjoitti:
> > >Some may have experienced of having some Qemu builds crashing,
> > >apparently at random places, but in a reproducable way.
> > >I found one reason for this crashes: it appears that with the growth of
> > >the op.c file, there may be cases where we could reach the inlining
> > >limits of gcc. In such a case, gcc would not inline some declared
> > >"inline" function but would emit a call and provide a separate function.
> > >Unfortunately, this is not acceptable in op.o context as it will
> > >slowdown the emulation and because the call is likely to break the
> > >specific compilation rules (ie reserved registers) used while compiling
> > >op.o
> > 
> > Does -winline give a warning when this happens?
> 
> And can it be repaired with __attribute__((__always_inline__))?

As I already reported in the previous message, this helps solve the
problem but is not sufficient. Please refer to the proposed patches and
the rest of the discussion.

Further invastigation also showed me that there may be problems while
compiling translate-op.c. It seems the solution would be to change the
gcc inlining limits into the BASE_CFLAGS too _and_ define functions in
dyngen.h as always_inline. This would also avoid most inline related
warnings during the compilation (but maybe not solve all cases, as seen
in the op.c case).
Reading gcc source code showed me there are cases where a function could
be not inline when marked 'inline' without generating any warning
message. What I don't know is if those cases are likely to happen during
normal compilations, but as the op.c compilation seems to be buggy and
never emit those warnings, I guess we can reach those cases.

-- 
Jocelyn Mayer <l_indien@magic.fr>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-16 16:05   ` Jocelyn Mayer
@ 2007-11-16 20:32     ` andrzej zaborowski
  2007-11-17  0:04       ` J. Mayer
  0 siblings, 1 reply; 18+ messages in thread
From: andrzej zaborowski @ 2007-11-16 20:32 UTC (permalink / raw)
  To: l_indien, qemu-devel; +Cc: Paul Brook

On 16/11/2007, Jocelyn Mayer <l_indien@magic.fr> wrote:
>
> On Fri, 2007-11-16 at 15:52 +0000, Paul Brook wrote:
> > > Then, I choosed to replace 'inline' by 'always_inline', which is more
> > > invasive but have less risks of side effects. The diff is attached in
> > > always_inline.diff.
> > > The last thing that helps solve the problem is to change the inlining
> > > limits of gcc, at least to compile the op.o file.
> >
> > Presumably we only need one of the last two patches? It seems rather pointless
> > to have always_inline *and* change the inlining heuristics.
>
> >From the tests I made, it seems that adding always_inline helps but
> unfortunatelly does not solve all cases. Should check in the gcc source
> code why it is so...
>
> > I'm ok with using always_inline for op.o (and things it uses directly) as this
> > is required for correctness. I'm not convinced that that using always_inline
> > everywhere is such a good idea.
>
> That's exactly what I did: I changed 'inline' to 'always_inline' in
> headers that are included by op.c, I did not made any change in other
> headers.

I think a line like

#define inline __attribute__ (( always_inline )) inline

in dyngen-exec.h should be enough?

Regards

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-16 20:32     ` andrzej zaborowski
@ 2007-11-17  0:04       ` J. Mayer
  2007-11-17  2:58         ` [Qemu-devel] " Ben Pfaff
  2007-11-17 10:57         ` [Qemu-devel] " andrzej zaborowski
  0 siblings, 2 replies; 18+ messages in thread
From: J. Mayer @ 2007-11-17  0:04 UTC (permalink / raw)
  To: andrzej zaborowski; +Cc: qemu-devel, Paul Brook


On Fri, 2007-11-16 at 21:32 +0100, andrzej zaborowski wrote:
> On 16/11/2007, Jocelyn Mayer <l_indien@magic.fr> wrote:
> >
> > On Fri, 2007-11-16 at 15:52 +0000, Paul Brook wrote:
> > > > Then, I choosed to replace 'inline' by 'always_inline', which is more
> > > > invasive but have less risks of side effects. The diff is attached in
> > > > always_inline.diff.
> > > > The last thing that helps solve the problem is to change the inlining
> > > > limits of gcc, at least to compile the op.o file.
> > >
> > > Presumably we only need one of the last two patches? It seems rather pointless
> > > to have always_inline *and* change the inlining heuristics.
> >
> > >From the tests I made, it seems that adding always_inline helps but
> > unfortunatelly does not solve all cases. Should check in the gcc source
> > code why it is so...
> >
> > > I'm ok with using always_inline for op.o (and things it uses directly) as this
> > > is required for correctness. I'm not convinced that that using always_inline
> > > everywhere is such a good idea.
> >
> > That's exactly what I did: I changed 'inline' to 'always_inline' in
> > headers that are included by op.c, I did not made any change in other
> > headers.
> 
> I think a line like
> 
> #define inline __attribute__ (( always_inline )) inline
> 
> in dyngen-exec.h should be 

As I already pointed it in the first message of the thread, this kind of
define would expand recursivelly, which is particullary ugly, and which
can in some cases lead to compiler warnings or errors. I already had
this kind of problems using the linux kernel headers which preciselly
uses this definitition.
But, once again, adding always_inline to functions does not completelly
solve the problem (please read the thread !) or at least does not solves
it with all gcc versions. The inline growth limits tweaking seems needed
too.

-- 
J. Mayer <l_indien@magic.fr>
Never organized

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] Re: RFC: fix for random Qemu crashes
  2007-11-17  0:04       ` J. Mayer
@ 2007-11-17  2:58         ` Ben Pfaff
  2007-11-17  8:22           ` J. Mayer
  2007-11-17 10:57         ` [Qemu-devel] " andrzej zaborowski
  1 sibling, 1 reply; 18+ messages in thread
From: Ben Pfaff @ 2007-11-17  2:58 UTC (permalink / raw)
  To: qemu-devel

"J. Mayer" <l_indien@magic.fr> writes:

> On Fri, 2007-11-16 at 21:32 +0100, andrzej zaborowski wrote:
>> I think a line like
>> 
>> #define inline __attribute__ (( always_inline )) inline
>> 
>> in dyngen-exec.h should be 
>
> As I already pointed it in the first message of the thread, this kind of
> define would expand recursivelly, [...]

No.  A macro is not expanded within its own expansion.  See ISO
C99:

     6.10.3.4  Rescanning and further replacement
[...]
2    If the name of the macro being replaced is found during this
     scan of the replacement list (not including the rest of the
     source file's preprocessing tokens), it is not replaced.

If it still bothers you, you could write it as
        #define inline __attribute__ (( always_inline )) __inline__
since GCC accepts __inline__ as a synonym for inline.
-- 
Ben Pfaff 
http://benpfaff.org

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] Re: RFC: fix for random Qemu crashes
  2007-11-17  2:58         ` [Qemu-devel] " Ben Pfaff
@ 2007-11-17  8:22           ` J. Mayer
  0 siblings, 0 replies; 18+ messages in thread
From: J. Mayer @ 2007-11-17  8:22 UTC (permalink / raw)
  To: blp, qemu-devel


On Fri, 2007-11-16 at 18:58 -0800, Ben Pfaff wrote:
> "J. Mayer" <l_indien@magic.fr> writes:
> 
> > On Fri, 2007-11-16 at 21:32 +0100, andrzej zaborowski wrote:
> >> I think a line like
> >> 
> >> #define inline __attribute__ (( always_inline )) inline
> >> 
> >> in dyngen-exec.h should be 
> >
> > As I already pointed it in the first message of the thread, this kind of
> > define would expand recursivelly, [...]
> 
> No.  A macro is not expanded within its own expansion.  See ISO
> C99:

I just take a look of what happens in *real life* while compiling the
linux kernel which uses such a definition.... As I reported, I had
compilation problems due to this behavior and did inspect the
preprocessor output and saw this result. I did not check if it happens
only with some versions of gcc or if this behavior has been changed with
newer releases, I have to admit.

> 
>      6.10.3.4  Rescanning and further replacement
> [...]
> 2    If the name of the macro being replaced is found during this
>      scan of the replacement list (not including the rest of the
>      source file's preprocessing tokens), it is not replaced.
> 
> If it still bothers you, you could write it as
>         #define inline __attribute__ (( always_inline )) __inline__
> since GCC accepts __inline__ as a synonym for inline.

You're right, this would be a good solution to avoid many changes in the
code.

-- 
J. Mayer <l_indien@magic.fr>
Never organized

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-17  0:04       ` J. Mayer
  2007-11-17  2:58         ` [Qemu-devel] " Ben Pfaff
@ 2007-11-17 10:57         ` andrzej zaborowski
  2007-11-17 11:13           ` J. Mayer
  1 sibling, 1 reply; 18+ messages in thread
From: andrzej zaborowski @ 2007-11-17 10:57 UTC (permalink / raw)
  To: J. Mayer; +Cc: qemu-devel

On 17/11/2007, J. Mayer <l_indien@magic.fr> wrote:
>
> On Fri, 2007-11-16 at 21:32 +0100, andrzej zaborowski wrote:
> > On 16/11/2007, Jocelyn Mayer <l_indien@magic.fr> wrote:
> > >
> > > On Fri, 2007-11-16 at 15:52 +0000, Paul Brook wrote:
> > > > > Then, I choosed to replace 'inline' by 'always_inline', which is more
> > > > > invasive but have less risks of side effects. The diff is attached in
> > > > > always_inline.diff.
> > > > > The last thing that helps solve the problem is to change the inlining
> > > > > limits of gcc, at least to compile the op.o file.
> > > >
> > > > Presumably we only need one of the last two patches? It seems rather pointless
> > > > to have always_inline *and* change the inlining heuristics.
> > >
> > > >From the tests I made, it seems that adding always_inline helps but
> > > unfortunatelly does not solve all cases. Should check in the gcc source
> > > code why it is so...
> > >
> > > > I'm ok with using always_inline for op.o (and things it uses directly) as this
> > > > is required for correctness. I'm not convinced that that using always_inline
> > > > everywhere is such a good idea.
> > >
> > > That's exactly what I did: I changed 'inline' to 'always_inline' in
> > > headers that are included by op.c, I did not made any change in other
> > > headers.
> >
> > I think a line like
> >
> > #define inline __attribute__ (( always_inline )) inline
> >
> > in dyngen-exec.h should be
>
> As I already pointed it in the first message of the thread, this kind of
> define would expand recursivelly, which is particullary ugly, and which
> can in some cases lead to compiler warnings or errors. I already had
> this kind of problems using the linux kernel headers which preciselly
> uses this definitition.

My point here is that you can use dyngen-exec.h for the macro so that
the functions are only always_inline'd when used in op.c, not in other
files, I think that's what pbrook mean too. For example cpsr_write
from target-arm/exec.h was used in op.c as well as in vl.c. There's no
problem if it isn't inlined in vl.c, the fix should only affect op.c
which is a special case, for other files let gcc decide in the way it
was designed by gcc authors.

> But, once again, adding always_inline to functions does not completelly
> solve the problem (please read the thread !) or at least does not solves
> it with all gcc versions.

Yes, I'm not saying anything about the other part.
Regards

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] RFC: fix for random Qemu crashes
  2007-11-17 10:57         ` [Qemu-devel] " andrzej zaborowski
@ 2007-11-17 11:13           ` J. Mayer
  0 siblings, 0 replies; 18+ messages in thread
From: J. Mayer @ 2007-11-17 11:13 UTC (permalink / raw)
  To: andrzej zaborowski; +Cc: qemu-devel


On Sat, 2007-11-17 at 11:57 +0100, andrzej zaborowski wrote:
> On 17/11/2007, J. Mayer <l_indien@magic.fr> wrote:
> >
> > On Fri, 2007-11-16 at 21:32 +0100, andrzej zaborowski wrote:
> > > On 16/11/2007, Jocelyn Mayer <l_indien@magic.fr> wrote:
> > > >
> > > > On Fri, 2007-11-16 at 15:52 +0000, Paul Brook wrote:
> > > > > > Then, I choosed to replace 'inline' by 'always_inline', which is more
> > > > > > invasive but have less risks of side effects. The diff is attached in
> > > > > > always_inline.diff.
> > > > > > The last thing that helps solve the problem is to change the inlining
> > > > > > limits of gcc, at least to compile the op.o file.
> > > > >
> > > > > Presumably we only need one of the last two patches? It seems rather pointless
> > > > > to have always_inline *and* change the inlining heuristics.
> > > >
> > > > >From the tests I made, it seems that adding always_inline helps but
> > > > unfortunatelly does not solve all cases. Should check in the gcc source
> > > > code why it is so...
> > > >
> > > > > I'm ok with using always_inline for op.o (and things it uses directly) as this
> > > > > is required for correctness. I'm not convinced that that using always_inline
> > > > > everywhere is such a good idea.
> > > >
> > > > That's exactly what I did: I changed 'inline' to 'always_inline' in
> > > > headers that are included by op.c, I did not made any change in other
> > > > headers.
> > >
> > > I think a line like
> > >
> > > #define inline __attribute__ (( always_inline )) inline
> > >
> > > in dyngen-exec.h should be
> >
> > As I already pointed it in the first message of the thread, this kind of
> > define would expand recursivelly, which is particullary ugly, and which
> > can in some cases lead to compiler warnings or errors. I already had
> > this kind of problems using the linux kernel headers which preciselly
> > uses this definitition.
> 
> My point here is that you can use dyngen-exec.h for the macro so that
> the functions are only always_inline'd when used in op.c, not in other
> files, I think that's what pbrook mean too. For example cpsr_write
> from target-arm/exec.h was used in op.c as well as in vl.c. There's no
> problem if it isn't inlined in vl.c, the fix should only affect op.c
> which is a special case, for other files let gcc decide in the way it
> was designed by gcc authors.

The problem in op.c is that not inlining lead to crashes.
But not inlining functions in many other places would lead to great
performances issues.
dyngen-exec.h is included only in cpu-exec.c, host-utils.c, op.c and
op_helpers.c (from what I see in the .d files).
But, for example, not inlining code fetch or gen_op_xxx in translate.c
is a very bad news for code translation efficiency. Not inlining in
exec.c also have a great performance impact.
It seems to me that there are much more cases we want always_inline rule
to be applied than cases for which it's not so important. And it never
hurts if the inlining is not done in the later case but it can have a
great impact if inlining is not done for most functions that are
declared inline in Qemu.

[...]

-- 
J. Mayer <l_indien@magic.fr>
Never organized

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] [RFC] Fix for random Qemu crashes
@ 2007-11-18 15:48 J. Mayer
  0 siblings, 0 replies; 18+ messages in thread
From: J. Mayer @ 2007-11-18 15:48 UTC (permalink / raw)
  To: qemu-devel

[-- Attachment #1: Type: text/plain, Size: 369 bytes --]

Here's an updated patch to fix the inlining problems that make some Qemu
targets crash randomly.
As we have at least one broken target in the CVS because of this bug
(and maybe more), we have an urgent need of a fix. I'll then commit this
patch today if there is no other fix proposed that actually solves the
problem.

-- 
J. Mayer <l_indien@magic.fr>
Never organized

[-- Attachment #2: always_inline.diff --]
[-- Type: text/x-patch, Size: 4374 bytes --]

Index: exec-all.h
===================================================================
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.70
diff -u -d -d -p -r1.70 exec-all.h
--- exec-all.h	4 Nov 2007 02:24:57 -0000	1.70
+++ exec-all.h	18 Nov 2007 15:44:16 -0000
@@ -21,36 +21,6 @@
 /* allow to see translation results - the slowdown should be negligible, so we leave it */
 #define DEBUG_DISAS
 
-#ifndef glue
-#define xglue(x, y) x ## y
-#define glue(x, y) xglue(x, y)
-#define stringify(s)	tostring(s)
-#define tostring(s)	#s
-#endif
-
-#ifndef likely
-#if __GNUC__ < 3
-#define __builtin_expect(x, n) (x)
-#endif
-
-#define likely(x)   __builtin_expect(!!(x), 1)
-#define unlikely(x)   __builtin_expect(!!(x), 0)
-#endif
-
-#ifndef always_inline
-#if (__GNUC__ < 3) || defined(__APPLE__)
-#define always_inline inline
-#else
-#define always_inline __attribute__ (( always_inline )) inline
-#endif
-#endif
-
-#ifdef __i386__
-#define REGPARM(n) __attribute((regparm(n)))
-#else
-#define REGPARM(n)
-#endif
-
 /* is_jmp field values */
 #define DISAS_NEXT    0 /* next instruction can be analyzed */
 #define DISAS_JUMP    1 /* only pc was modified dynamically */
Index: osdep.h
===================================================================
RCS file: /sources/qemu/qemu/osdep.h,v
retrieving revision 1.10
diff -u -d -d -p -r1.10 osdep.h
--- osdep.h	7 Jun 2007 23:09:47 -0000	1.10
+++ osdep.h	18 Nov 2007 15:44:16 -0000
@@ -3,6 +3,44 @@
 
 #include <stdarg.h>
 
+#ifndef glue
+#define xglue(x, y) x ## y
+#define glue(x, y) xglue(x, y)
+#define stringify(s)	tostring(s)
+#define tostring(s)	#s
+#endif
+
+#ifndef likely
+#if __GNUC__ < 3
+#define __builtin_expect(x, n) (x)
+#endif
+
+#define likely(x)   __builtin_expect(!!(x), 1)
+#define unlikely(x)   __builtin_expect(!!(x), 0)
+#endif
+
+#ifndef MIN
+#define MIN(a, b) (((a) < (b)) ? (a) : (b))
+#endif
+#ifndef MAX
+#define MAX(a, b) (((a) > (b)) ? (a) : (b))
+#endif
+
+#ifndef always_inline
+#if (__GNUC__ < 3) || defined(__APPLE__)
+#define always_inline inline
+#else
+#define always_inline __attribute__ (( always_inline )) __inline__
+#endif
+#endif
+#define inline always_inline
+
+#ifdef __i386__
+#define REGPARM(n) __attribute((regparm(n)))
+#else
+#define REGPARM(n)
+#endif
+
 #define qemu_printf printf
 
 void *qemu_malloc(size_t size);
Index: qemu-common.h
===================================================================
RCS file: /sources/qemu/qemu/qemu-common.h,v
retrieving revision 1.2
diff -u -d -d -p -r1.2 qemu-common.h
--- qemu-common.h	17 Nov 2007 17:14:38 -0000	1.2
+++ qemu-common.h	18 Nov 2007 15:44:16 -0000
@@ -62,37 +62,6 @@ static inline char *realpath(const char 
 
 #endif /* !defined(NEED_CPU_H) */
 
-#ifndef glue
-#define xglue(x, y) x ## y
-#define glue(x, y) xglue(x, y)
-#define stringify(s)	tostring(s)
-#define tostring(s)	#s
-#endif
-
-#ifndef likely
-#if __GNUC__ < 3
-#define __builtin_expect(x, n) (x)
-#endif
-
-#define likely(x)   __builtin_expect(!!(x), 1)
-#define unlikely(x)   __builtin_expect(!!(x), 0)
-#endif
-
-#ifndef MIN
-#define MIN(a, b) (((a) < (b)) ? (a) : (b))
-#endif
-#ifndef MAX
-#define MAX(a, b) (((a) > (b)) ? (a) : (b))
-#endif
-
-#ifndef always_inline
-#if (__GNUC__ < 3) || defined(__APPLE__)
-#define always_inline inline
-#else
-#define always_inline __attribute__ (( always_inline )) inline
-#endif
-#endif
-
 /* bottom halves */
 typedef struct QEMUBH QEMUBH;
 
Index: translate-op.c
===================================================================
RCS file: /sources/qemu/qemu/translate-op.c,v
retrieving revision 1.3
diff -u -d -d -p -r1.3 translate-op.c
--- translate-op.c	18 Nov 2007 01:44:36 -0000	1.3
+++ translate-op.c	18 Nov 2007 15:44:16 -0000
@@ -24,6 +24,7 @@
 #include <inttypes.h>
 
 #include "config.h"
+#include "osdep.h"
 
 enum {
 #define DEF(s, n, copy_size) INDEX_op_ ## s,
Index: darwin-user/qemu.h
===================================================================
RCS file: /sources/qemu/qemu/darwin-user/qemu.h,v
retrieving revision 1.1
diff -u -d -d -p -r1.1 qemu.h
--- darwin-user/qemu.h	18 Jan 2007 20:06:33 -0000	1.1
+++ darwin-user/qemu.h	18 Nov 2007 15:44:16 -0000
@@ -1,13 +1,13 @@
 #ifndef GEMU_H
 #define GEMU_H
 
-#include "thunk.h"
-
 #include <signal.h>
 #include <string.h>
 
 #include "cpu.h"
 
+#include "thunk.h"
+
 #include "gdbstub.h"
 
 typedef siginfo_t target_siginfo_t;

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-11-18 15:48 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-15 23:18 [Qemu-devel] RFC: fix for random Qemu crashes J. Mayer
2007-11-15 23:49 ` andrzej zaborowski
2007-11-16  0:09   ` J. Mayer
2007-11-16 15:06 ` Heikki Lindholm
2007-11-16 15:35   ` Jocelyn Mayer
2007-11-16 15:42     ` Heikki Lindholm
2007-11-16 16:34       ` Jocelyn Mayer
2007-11-16 15:59   ` Jamie Lokier
2007-11-16 20:13     ` Jocelyn Mayer
2007-11-16 15:52 ` Paul Brook
2007-11-16 16:05   ` Jocelyn Mayer
2007-11-16 20:32     ` andrzej zaborowski
2007-11-17  0:04       ` J. Mayer
2007-11-17  2:58         ` [Qemu-devel] " Ben Pfaff
2007-11-17  8:22           ` J. Mayer
2007-11-17 10:57         ` [Qemu-devel] " andrzej zaborowski
2007-11-17 11:13           ` J. Mayer
  -- strict thread matches above, loose matches on Subject: below --
2007-11-18 15:48 [Qemu-devel] [RFC] Fix " J. Mayer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).