qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/4] x86_64 mttcg
@ 2018-08-14  1:37 Emilio G. Cota
  2018-08-14  1:37 ` [Qemu-devel] [PATCH 1/4] hw/i386/pc: hold the BQL when calling cpu_get_ticks Emilio G. Cota
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Emilio G. Cota @ 2018-08-14  1:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
	Alex Bennée

With this series I can boot a busybox image and several Ubuntu images
with various -smp's.

The speedup we get is in line with what we get with other ISAs, e.g.
-smp 4 takes 34s instead of 1min with thread=single.

I've run this through the Valgrind race detectors, and the remaining
races reported seem benign.

I have not tested i386, so I'm not turning mttcg on for it yet.

Patches 1-2 comply with cpu_get_ticks' documentation. I'm not too
happy about effectively serializing rdtsc, since Linux guests call it
very often (for each vCPU!). But as usual, correctness goes before
performance or scalability.

You can fetch the series from:
  https://github.com/cota/qemu/tree/x86_64-mttcg

Thanks,

		Emilio

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 1/4] hw/i386/pc: hold the BQL when calling cpu_get_ticks
  2018-08-14  1:37 [Qemu-devel] [PATCH 0/4] x86_64 mttcg Emilio G. Cota
@ 2018-08-14  1:37 ` Emilio G. Cota
  2018-08-14  6:37   ` Paolo Bonzini
  2018-08-14  1:37 ` [Qemu-devel] [PATCH 2/4] cpus: assert that the BQL is held in cpu_get_ticks Emilio G. Cota
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Emilio G. Cota @ 2018-08-14  1:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
	Alex Bennée

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 hw/i386/pc.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 83a444472b..7371cd9960 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -158,7 +158,18 @@ static uint64_t ioportF0_read(void *opaque, hwaddr addr, unsigned size)
 /* TSC handling */
 uint64_t cpu_get_tsc(CPUX86State *env)
 {
-    return cpu_get_ticks();
+    uint64_t ret;
+    bool locked;
+
+    locked = qemu_mutex_iothread_locked();
+    if (!locked) {
+        qemu_mutex_lock_iothread();
+    }
+    ret = cpu_get_ticks();
+    if (!locked) {
+        qemu_mutex_unlock_iothread();
+    }
+    return ret;
 }
 
 /* IRQ handling */
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 2/4] cpus: assert that the BQL is held in cpu_get_ticks
  2018-08-14  1:37 [Qemu-devel] [PATCH 0/4] x86_64 mttcg Emilio G. Cota
  2018-08-14  1:37 ` [Qemu-devel] [PATCH 1/4] hw/i386/pc: hold the BQL when calling cpu_get_ticks Emilio G. Cota
@ 2018-08-14  1:37 ` Emilio G. Cota
  2018-08-14  1:38 ` [Qemu-devel] [PATCH 3/4] target/i386/translate: use thread-local storage in !user-mode Emilio G. Cota
  2018-08-14  1:38 ` [Qemu-devel] [PATCH 4/4] configure: enable mttcg for x86_64 Emilio G. Cota
  3 siblings, 0 replies; 8+ messages in thread
From: Emilio G. Cota @ 2018-08-14  1:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
	Alex Bennée

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 cpus.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/cpus.c b/cpus.c
index b5844b7103..b23bf6fef7 100644
--- a/cpus.c
+++ b/cpus.c
@@ -308,6 +308,8 @@ int64_t cpu_get_ticks(void)
 {
     int64_t ticks;
 
+    g_assert(qemu_mutex_iothread_locked());
+
     if (use_icount) {
         return cpu_get_icount();
     }
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 3/4] target/i386/translate: use thread-local storage in !user-mode
  2018-08-14  1:37 [Qemu-devel] [PATCH 0/4] x86_64 mttcg Emilio G. Cota
  2018-08-14  1:37 ` [Qemu-devel] [PATCH 1/4] hw/i386/pc: hold the BQL when calling cpu_get_ticks Emilio G. Cota
  2018-08-14  1:37 ` [Qemu-devel] [PATCH 2/4] cpus: assert that the BQL is held in cpu_get_ticks Emilio G. Cota
@ 2018-08-14  1:38 ` Emilio G. Cota
  2018-08-14  6:31   ` Paolo Bonzini
  2018-08-14  1:38 ` [Qemu-devel] [PATCH 4/4] configure: enable mttcg for x86_64 Emilio G. Cota
  3 siblings, 1 reply; 8+ messages in thread
From: Emilio G. Cota @ 2018-08-14  1:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
	Alex Bennée

Needed for MTTCG.

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 target/i386/translate.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 07d185e7b6..f4c2a41f8f 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -71,26 +71,34 @@
 
 //#define MACRO_TEST   1
 
+/* we need thread-local storage for mttcg */
+#ifdef CONFIG_USER_ONLY
+#define I386_THREAD
+#else
+#define I386_THREAD __thread
+#endif
+
 /* global register indexes */
-static TCGv cpu_A0;
-static TCGv cpu_cc_dst, cpu_cc_src, cpu_cc_src2, cpu_cc_srcT;
+static TCGv cpu_cc_dst, cpu_cc_src, cpu_cc_src2;
 static TCGv_i32 cpu_cc_op;
 static TCGv cpu_regs[CPU_NB_REGS];
 static TCGv cpu_seg_base[6];
 static TCGv_i64 cpu_bndl[4];
 static TCGv_i64 cpu_bndu[4];
 /* local temps */
-static TCGv cpu_T0, cpu_T1;
+static I386_THREAD TCGv cpu_cc_srcT;
+static I386_THREAD TCGv cpu_A0;
+static I386_THREAD TCGv cpu_T0, cpu_T1;
 /* local register indexes (only used inside old micro ops) */
-static TCGv cpu_tmp0, cpu_tmp4;
-static TCGv_ptr cpu_ptr0, cpu_ptr1;
-static TCGv_i32 cpu_tmp2_i32, cpu_tmp3_i32;
-static TCGv_i64 cpu_tmp1_i64;
+static I386_THREAD TCGv cpu_tmp0, cpu_tmp4;
+static I386_THREAD TCGv_ptr cpu_ptr0, cpu_ptr1;
+static I386_THREAD TCGv_i32 cpu_tmp2_i32, cpu_tmp3_i32;
+static I386_THREAD TCGv_i64 cpu_tmp1_i64;
 
 #include "exec/gen-icount.h"
 
 #ifdef TARGET_X86_64
-static int x86_64_hregs;
+static I386_THREAD int x86_64_hregs;
 #endif
 
 typedef struct DisasContext {
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [Qemu-devel] [PATCH 4/4] configure: enable mttcg for x86_64
  2018-08-14  1:37 [Qemu-devel] [PATCH 0/4] x86_64 mttcg Emilio G. Cota
                   ` (2 preceding siblings ...)
  2018-08-14  1:38 ` [Qemu-devel] [PATCH 3/4] target/i386/translate: use thread-local storage in !user-mode Emilio G. Cota
@ 2018-08-14  1:38 ` Emilio G. Cota
  3 siblings, 0 replies; 8+ messages in thread
From: Emilio G. Cota @ 2018-08-14  1:38 UTC (permalink / raw)
  To: qemu-devel
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Eduardo Habkost, Michael S . Tsirkin, Marcel Apfelbaum,
	Alex Bennée

Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 configure | 1 +
 1 file changed, 1 insertion(+)

diff --git a/configure b/configure
index 2a7796ea80..db216bafb1 100755
--- a/configure
+++ b/configure
@@ -6935,6 +6935,7 @@ case "$target_name" in
     TARGET_BASE_ARCH=i386
     gdb_xml_files="i386-64bit.xml i386-64bit-core.xml i386-64bit-sse.xml"
     target_compiler=$cross_cc_x86_64
+    mttcg="yes"
   ;;
   alpha)
     mttcg="yes"
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] target/i386/translate: use thread-local storage in !user-mode
  2018-08-14  1:38 ` [Qemu-devel] [PATCH 3/4] target/i386/translate: use thread-local storage in !user-mode Emilio G. Cota
@ 2018-08-14  6:31   ` Paolo Bonzini
  2018-08-14  7:34     ` Emilio G. Cota
  0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2018-08-14  6:31 UTC (permalink / raw)
  To: Emilio G. Cota, qemu-devel
  Cc: Peter Crosthwaite, Richard Henderson, Eduardo Habkost,
	Michael S . Tsirkin, Marcel Apfelbaum, Alex Bennée

On 14/08/2018 03:38, Emilio G. Cota wrote:
> Needed for MTTCG.
> 
> Signed-off-by: Emilio G. Cota <cota@braap.org>

Why not always use TLS, even in user-mode?

Paolo

> ---
>  target/i386/translate.c | 24 ++++++++++++++++--------
>  1 file changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index 07d185e7b6..f4c2a41f8f 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -71,26 +71,34 @@
>  
>  //#define MACRO_TEST   1
>  
> +/* we need thread-local storage for mttcg */
> +#ifdef CONFIG_USER_ONLY
> +#define I386_THREAD
> +#else
> +#define I386_THREAD __thread
> +#endif
> +
>  /* global register indexes */
> -static TCGv cpu_A0;
> -static TCGv cpu_cc_dst, cpu_cc_src, cpu_cc_src2, cpu_cc_srcT;
> +static TCGv cpu_cc_dst, cpu_cc_src, cpu_cc_src2;
>  static TCGv_i32 cpu_cc_op;
>  static TCGv cpu_regs[CPU_NB_REGS];
>  static TCGv cpu_seg_base[6];
>  static TCGv_i64 cpu_bndl[4];
>  static TCGv_i64 cpu_bndu[4];
>  /* local temps */
> -static TCGv cpu_T0, cpu_T1;
> +static I386_THREAD TCGv cpu_cc_srcT;
> +static I386_THREAD TCGv cpu_A0;
> +static I386_THREAD TCGv cpu_T0, cpu_T1;
>  /* local register indexes (only used inside old micro ops) */
> -static TCGv cpu_tmp0, cpu_tmp4;
> -static TCGv_ptr cpu_ptr0, cpu_ptr1;
> -static TCGv_i32 cpu_tmp2_i32, cpu_tmp3_i32;
> -static TCGv_i64 cpu_tmp1_i64;
> +static I386_THREAD TCGv cpu_tmp0, cpu_tmp4;
> +static I386_THREAD TCGv_ptr cpu_ptr0, cpu_ptr1;
> +static I386_THREAD TCGv_i32 cpu_tmp2_i32, cpu_tmp3_i32;
> +static I386_THREAD TCGv_i64 cpu_tmp1_i64;
>  
>  #include "exec/gen-icount.h"
>  
>  #ifdef TARGET_X86_64
> -static int x86_64_hregs;
> +static I386_THREAD int x86_64_hregs;
>  #endif
>  
>  typedef struct DisasContext {
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 1/4] hw/i386/pc: hold the BQL when calling cpu_get_ticks
  2018-08-14  1:37 ` [Qemu-devel] [PATCH 1/4] hw/i386/pc: hold the BQL when calling cpu_get_ticks Emilio G. Cota
@ 2018-08-14  6:37   ` Paolo Bonzini
  0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2018-08-14  6:37 UTC (permalink / raw)
  To: Emilio G. Cota, qemu-devel
  Cc: Peter Crosthwaite, Richard Henderson, Eduardo Habkost,
	Michael S . Tsirkin, Marcel Apfelbaum, Alex Bennée

On 14/08/2018 03:37, Emilio G. Cota wrote:
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> ---
>  hw/i386/pc.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 83a444472b..7371cd9960 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -158,7 +158,18 @@ static uint64_t ioportF0_read(void *opaque, hwaddr addr, unsigned size)
>  /* TSC handling */
>  uint64_t cpu_get_tsc(CPUX86State *env)
>  {
> -    return cpu_get_ticks();
> +    uint64_t ret;
> +    bool locked;
> +
> +    locked = qemu_mutex_iothread_locked();
> +    if (!locked) {
> +        qemu_mutex_lock_iothread();
> +    }
> +    ret = cpu_get_ticks();
> +    if (!locked) {
> +        qemu_mutex_unlock_iothread();
> +    }
> +    return ret;
>  }
>  
>  /* IRQ handling */
> 

We could use a spinlock for the rare case of timers_state.cpu_ticks_prev
> ticks (or even, on 64-bit, a seqlock+spinlock).  I'll give it a shot.

Paolo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Qemu-devel] [PATCH 3/4] target/i386/translate: use thread-local storage in !user-mode
  2018-08-14  6:31   ` Paolo Bonzini
@ 2018-08-14  7:34     ` Emilio G. Cota
  0 siblings, 0 replies; 8+ messages in thread
From: Emilio G. Cota @ 2018-08-14  7:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Peter Crosthwaite, Richard Henderson, Eduardo Habkost,
	Michael S . Tsirkin, Marcel Apfelbaum, Alex Bennée

On Tue, Aug 14, 2018 at 08:31:02 +0200, Paolo Bonzini wrote:
> On 14/08/2018 03:38, Emilio G. Cota wrote:
> > Needed for MTTCG.
> > 
> > Signed-off-by: Emilio G. Cota <cota@braap.org>
> 
> Why not always use TLS, even in user-mode?

To avoid TLS waste; user-mode uses a single TCGContext,
so a single copy of these variables is all that's needed because
code generation is serialized with a lock.

If in user-mode we just have a few threads it'd be no big deal,
but apps can easily spawn thousands of threads.

		E.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-08-14  7:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-08-14  1:37 [Qemu-devel] [PATCH 0/4] x86_64 mttcg Emilio G. Cota
2018-08-14  1:37 ` [Qemu-devel] [PATCH 1/4] hw/i386/pc: hold the BQL when calling cpu_get_ticks Emilio G. Cota
2018-08-14  6:37   ` Paolo Bonzini
2018-08-14  1:37 ` [Qemu-devel] [PATCH 2/4] cpus: assert that the BQL is held in cpu_get_ticks Emilio G. Cota
2018-08-14  1:38 ` [Qemu-devel] [PATCH 3/4] target/i386/translate: use thread-local storage in !user-mode Emilio G. Cota
2018-08-14  6:31   ` Paolo Bonzini
2018-08-14  7:34     ` Emilio G. Cota
2018-08-14  1:38 ` [Qemu-devel] [PATCH 4/4] configure: enable mttcg for x86_64 Emilio G. Cota

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).