qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [RFC PATCH 0/3] tcg: tb lock updates
@ 2016-12-06 20:56 Pranith Kumar
  2016-12-06 20:56 ` [Qemu-devel] [RFC PATCH 1/3] tcg: Release tb_lock in the order acquired Pranith Kumar
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Pranith Kumar @ 2016-12-06 20:56 UTC (permalink / raw)
  To: alex.bennee; +Cc: qemu-devel, rth, mst, cota

Hello,

Please find a few minor tb_lock updates for the MTTCG series. The main
motivation was to explicitly unlock tb_lock wherever possible so that
tb_lock_reset() can eventually be removed. That is not yet possible,
since we still take interrupts and exit the execution loop with tb_lock
held from multiple locations. Even so, I think explicit unlocking is a
good idea.
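To illustrate the difference between the two styles, here is a minimal
editorial sketch (hypothetical stand-ins, not actual QEMU code): a
blanket reset clears the lock state regardless of where it was taken,
whereas explicit unlocking pairs every lock() with an unlock() on each
path.

```c
#include <assert.h>
#include <stdbool.h>

static bool tb_lock_held;

static void tb_lock(void)   { assert(!tb_lock_held); tb_lock_held = true; }
static void tb_unlock(void) { assert(tb_lock_held);  tb_lock_held = false; }

/* The style being phased out: a blanket reset that forgives
 * whatever state the caller left the lock in. */
static void tb_lock_reset(void) { tb_lock_held = false; }

/* The style this series moves toward: each lock() is paired with
 * an explicit unlock() on every path through the function. */
static int do_work_explicit(void)
{
    tb_lock();
    /* ... critical section ... */
    tb_unlock();
    return 0;
}
```

With explicit pairing, a missing unlock trips the assertion at the next
lock() instead of being silently papered over by a reset.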

Thanks,

Pranith Kumar (3):
  tcg: Release tb_lock in the order acquired
  tcg: Reuse hashed pc value
  tcg: Explicitly unlock tb_lock

 cpu-exec.c         | 17 ++++++-----------
 exec.c             |  2 ++
 hw/i386/kvmvapic.c |  1 +
 translate-all.c    |  3 +++
 4 files changed, 12 insertions(+), 11 deletions(-)

-- 
2.11.0

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Qemu-devel] [RFC PATCH 1/3] tcg: Release tb_lock in the order acquired
  2016-12-06 20:56 [Qemu-devel] [RFC PATCH 0/3] tcg: tb lock updates Pranith Kumar
@ 2016-12-06 20:56 ` Pranith Kumar
  2016-12-07  8:39   ` Alex Bennée
  2016-12-06 20:56 ` [Qemu-devel] [RFC PATCH 2/3] tcg: Reuse hashed pc value Pranith Kumar
  2016-12-06 20:56 ` [Qemu-devel] [RFC PATCH 3/3] tcg: Explicitly unlock tb_lock Pranith Kumar
  2 siblings, 1 reply; 7+ messages in thread
From: Pranith Kumar @ 2016-12-06 20:56 UTC (permalink / raw)
  To: alex.bennee; +Cc: qemu-devel, rth, mst, cota

We acquire the mmap lock and the tb lock in one order but release them
in a different order. There is no need for it to be that way.

This patch was inspired by a previous patch by Emilio G. Cota
(https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg03785.html).
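The ordering after this patch can be sketched as follows (an editorial
illustration with stub lock functions, not the real QEMU primitives):
locks are released in the reverse (LIFO) order in which they were
acquired, so the critical sections nest cleanly.

```c
#include <assert.h>
#include <stdbool.h>

static bool mmap_held, tb_held;

static void mmap_lock(void)   { mmap_held = true; }
static void mmap_unlock(void) { mmap_held = false; }
static void tb_lock(void)     { tb_held = true; }
static void tb_unlock(void)   { tb_held = false; }

/* After the patch: mmap_lock -> tb_lock -> tb_unlock -> mmap_unlock,
 * i.e. strictly nested critical sections. */
static void translate_critical_section(void)
{
    mmap_lock();
    tb_lock();
    /* ... look up or generate the TB under both locks ... */
    tb_unlock();
    mmap_unlock();
}
```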

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
---
 cpu-exec.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index aa8318d864..f4a00f5047 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -318,7 +318,6 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
     TranslationBlock *tb;
     target_ulong cs_base, pc;
     uint32_t flags;
-    bool have_tb_lock = false;
 
     /* we record a subset of the CPU state. It will
        always be the same before a given translated block
@@ -336,7 +335,6 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
              */
             mmap_lock();
             tb_lock();
-            have_tb_lock = true;
 
             /* There's a chance that our desired tb has been translated while
              * taking the locks so we check again inside the lock.
@@ -347,6 +345,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
                 tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
             }
 
+            tb_unlock();
             mmap_unlock();
         }
 
@@ -364,17 +363,12 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
 #endif
     /* See if we can patch the calling TB. */
     if (last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-        if (!have_tb_lock) {
-            tb_lock();
-            have_tb_lock = true;
-        }
         if (!tb->invalid) {
+            tb_lock();
             tb_add_jump(last_tb, tb_exit, tb);
+            tb_unlock();
         }
     }
-    if (have_tb_lock) {
-        tb_unlock();
-    }
     return tb;
 }
 
-- 
2.11.0


* [Qemu-devel] [RFC PATCH 2/3] tcg: Reuse hashed pc value
  2016-12-06 20:56 [Qemu-devel] [RFC PATCH 0/3] tcg: tb lock updates Pranith Kumar
  2016-12-06 20:56 ` [Qemu-devel] [RFC PATCH 1/3] tcg: Release tb_lock in the order acquired Pranith Kumar
@ 2016-12-06 20:56 ` Pranith Kumar
  2016-12-06 20:56 ` [Qemu-devel] [RFC PATCH 3/3] tcg: Explicitly unlock tb_lock Pranith Kumar
  2 siblings, 0 replies; 7+ messages in thread
From: Pranith Kumar @ 2016-12-06 20:56 UTC (permalink / raw)
  To: alex.bennee; +Cc: qemu-devel, rth, mst, cota

Reuse the hashed pc value instead of calculating it again.
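The change amounts to hoisting a common subexpression: compute the
jump-cache index once and reuse it for both the lookup and the later
cache fill. A simplified stand-in (the hash function and table here are
hypothetical, not QEMU's real tb_jmp_cache_hash_func()):

```c
#include <assert.h>
#include <stddef.h>

#define TB_JMP_CACHE_BITS 12
#define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)

/* Simplified stand-in for a pc-indexed jump-cache hash. */
static unsigned int tb_jmp_cache_hash(unsigned long pc)
{
    return (unsigned int)((pc >> 2) & (TB_JMP_CACHE_SIZE - 1));
}

static void *tb_jmp_cache[TB_JMP_CACHE_SIZE];

/* Hash once; the same index serves the fast-path lookup and the
 * subsequent cache update, as in the patch below. */
static unsigned int lookup_and_fill(unsigned long pc, void *tb)
{
    unsigned int pc_hash = tb_jmp_cache_hash(pc);
    if (tb_jmp_cache[pc_hash] == NULL) {
        tb_jmp_cache[pc_hash] = tb;   /* slow path: install new entry */
    }
    return pc_hash;
}
```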

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
---
 cpu-exec.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index f4a00f5047..13cb15de0e 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -323,7 +323,8 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
        always be the same before a given translated block
        is executed. */
     cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
-    tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)]);
+    unsigned int pc_hash = tb_jmp_cache_hash_func(pc);
+    tb = atomic_rcu_read(&cpu->tb_jmp_cache[pc_hash]);
     if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base ||
                  tb->flags != flags)) {
         tb = tb_htable_lookup(cpu, pc, cs_base, flags);
@@ -350,7 +351,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
         }
 
         /* We add the TB in the virtual pc hash table for the fast lookup */
-        atomic_set(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)], tb);
+        atomic_set(&cpu->tb_jmp_cache[pc_hash], tb);
     }
 #ifndef CONFIG_USER_ONLY
     /* We don't take care of direct jumps when address mapping changes in
-- 
2.11.0


* [Qemu-devel] [RFC PATCH 3/3] tcg: Explicitly unlock tb_lock
  2016-12-06 20:56 [Qemu-devel] [RFC PATCH 0/3] tcg: tb lock updates Pranith Kumar
  2016-12-06 20:56 ` [Qemu-devel] [RFC PATCH 1/3] tcg: Release tb_lock in the order acquired Pranith Kumar
  2016-12-06 20:56 ` [Qemu-devel] [RFC PATCH 2/3] tcg: Reuse hashed pc value Pranith Kumar
@ 2016-12-06 20:56 ` Pranith Kumar
  2 siblings, 0 replies; 7+ messages in thread
From: Pranith Kumar @ 2016-12-06 20:56 UTC (permalink / raw)
  To: alex.bennee; +Cc: qemu-devel, rth, mst, cota

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
---
 exec.c             | 2 ++
 hw/i386/kvmvapic.c | 1 +
 translate-all.c    | 3 +++
 3 files changed, 6 insertions(+)

diff --git a/exec.c b/exec.c
index 46e2044b1f..f49088b259 100644
--- a/exec.c
+++ b/exec.c
@@ -2115,11 +2115,13 @@ static void check_watchpoint(int offset, int len, MemTxAttrs attrs, int flags)
                 tb_lock();
                 tb_check_watchpoint(cpu);
                 if (wp->flags & BP_STOP_BEFORE_ACCESS) {
+                    tb_unlock();
                     cpu->exception_index = EXCP_DEBUG;
                     cpu_loop_exit(cpu);
                 } else {
                     cpu_get_tb_cpu_state(env, &pc, &cs_base, &cpu_flags);
                     tb_gen_code(cpu, pc, cs_base, cpu_flags, 1);
+                    tb_unlock();
                     cpu_loop_exit_noexc(cpu);
                 }
             }
diff --git a/hw/i386/kvmvapic.c b/hw/i386/kvmvapic.c
index c8d908ede6..ffee94dd88 100644
--- a/hw/i386/kvmvapic.c
+++ b/hw/i386/kvmvapic.c
@@ -454,6 +454,7 @@ static void patch_instruction(VAPICROMState *s, X86CPU *cpu, target_ulong ip)
          *  longjmps back into the cpu_exec loop. */
         tb_lock();
         tb_gen_code(cs, current_pc, current_cs_base, current_flags, 1);
+        tb_unlock();
         cpu_loop_exit_noexc(cs);
     }
 }
diff --git a/translate-all.c b/translate-all.c
index cf828aa927..240c0a5c3d 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1282,6 +1282,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
  buffer_overflow:
         /* flush must be done */
         tb_flush(cpu);
+        tb_unlock();
         mmap_unlock();
         cpu_loop_exit(cpu);
     }
@@ -1526,6 +1527,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
            modifying the memory. It will ensure that it cannot modify
            itself */
         tb_gen_code(cpu, current_pc, current_cs_base, current_flags, 1);
+        tb_unlock();
         cpu_loop_exit_noexc(cpu);
     }
 #endif
@@ -1802,6 +1804,7 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
     /* FIXME: In theory this could raise an exception.  In practice
        we have already translated the block once so it's probably ok.  */
     tb_gen_code(cpu, pc, cs_base, flags, cflags);
+    tb_unlock();
 
     /* TODO: If env->pc != tb->pc (i.e. the faulting instruction was not
      * the first in the TB) then we end up generating a whole new TB and
-- 
2.11.0


* Re: [Qemu-devel] [RFC PATCH 1/3] tcg: Release tb_lock in the order acquired
  2016-12-06 20:56 ` [Qemu-devel] [RFC PATCH 1/3] tcg: Release tb_lock in the order acquired Pranith Kumar
@ 2016-12-07  8:39   ` Alex Bennée
  2016-12-07 15:38     ` Pranith Kumar
  0 siblings, 1 reply; 7+ messages in thread
From: Alex Bennée @ 2016-12-07  8:39 UTC (permalink / raw)
  To: Pranith Kumar; +Cc: qemu-devel, rth, mst, cota


Pranith Kumar <bobby.prani@gmail.com> writes:

> We acquire mmap lock and tb lock in one order and release them in a
> different order. This does not need to be that way.
>
> This patch was inspired by a previous patch by Emilio G. Cota
> (https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg03785.html).
>
> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
> ---
>  cpu-exec.c | 12 +++---------
>  1 file changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/cpu-exec.c b/cpu-exec.c
> index aa8318d864..f4a00f5047 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -318,7 +318,6 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
>      TranslationBlock *tb;
>      target_ulong cs_base, pc;
>      uint32_t flags;
> -    bool have_tb_lock = false;
>
>      /* we record a subset of the CPU state. It will
>         always be the same before a given translated block
> @@ -336,7 +335,6 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
>               */
>              mmap_lock();
>              tb_lock();
> -            have_tb_lock = true;
>
>              /* There's a chance that our desired tb has been translated while
>               * taking the locks so we check again inside the lock.
> @@ -347,6 +345,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
>                  tb = tb_gen_code(cpu, pc, cs_base, flags, 0);
>              }
>
> +            tb_unlock();
>              mmap_unlock();
>          }
>
> @@ -364,17 +363,12 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
>  #endif
>      /* See if we can patch the calling TB. */
>      if (last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
> -        if (!have_tb_lock) {
> -            tb_lock();
> -            have_tb_lock = true;
> -        }
>          if (!tb->invalid) {
> +            tb_lock();
>              tb_add_jump(last_tb, tb_exit, tb);
> +            tb_unlock();
>          }
>      }
> -    if (have_tb_lock) {
> -        tb_unlock();
> -    }
>      return tb;
>  }


Do you have any numbers for this? The main concern is that we are
trying to avoid bouncing the lock too much, and while this is cleaner
it could cause more contention.

--
Alex Bennée


* Re: [Qemu-devel] [RFC PATCH 1/3] tcg: Release tb_lock in the order acquired
  2016-12-07  8:39   ` Alex Bennée
@ 2016-12-07 15:38     ` Pranith Kumar
  2016-12-08 17:52       ` Paolo Bonzini
  0 siblings, 1 reply; 7+ messages in thread
From: Pranith Kumar @ 2016-12-07 15:38 UTC (permalink / raw)
  To: Alex Bennée; +Cc: qemu-devel, rth, mst, cota


Hi Alex,

Alex Bennée writes:

>
> Do you have any numbers for this? The main reason being we are trying to
> avoid bouncing the lock too much and while this is cleaner it could
> cause more contention.

I did not really consider performance while cleaning this up. However,
on a closer look I think we can drop the tb lock acquisition around
adding the jump by using atomics. I've attached the patch below. This
should address any concern about a negative performance impact.

I will include this patch if you think it's okay.

Thanks,

diff --git a/cpu-exec.c b/cpu-exec.c
index 13cb15de0e..93debe64b6 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -365,9 +365,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
     /* See if we can patch the calling TB. */
     if (last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
         if (!tb->invalid) {
-            tb_lock();
             tb_add_jump(last_tb, tb_exit, tb);
-            tb_unlock();
         }
     }
     return tb;
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 84a3240df6..60597cb07e 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -336,7 +336,7 @@ static inline void tb_set_jmp_target(TranslationBlock *tb,
 static inline void tb_add_jump(TranslationBlock *tb, int n,
                                TranslationBlock *tb_next)
 {
-    if (tb->jmp_list_next[n]) {
+    if (atomic_cmpxchg(&tb->jmp_list_next[n], 0, tb_next->jmp_list_first)) {
         /* Another thread has already done this while we were
          * outside of the lock; nothing to do in this case */
         return;
@@ -351,7 +351,6 @@ static inline void tb_add_jump(TranslationBlock *tb, int n,
     tb_set_jmp_target(tb, n, (uintptr_t)tb_next->tc_ptr);
 
     /* add in TB jmp circular list */
-    tb->jmp_list_next[n] = tb_next->jmp_list_first;
     tb_next->jmp_list_first = (uintptr_t)tb | n;
 }
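For reference, the core idiom the hunk above relies on can be isolated
in a small stand-alone sketch (hypothetical names, C11 atomics rather
than QEMU's wrappers): a slot is claimed with compare-and-swap only if
it is still zero, so a failed exchange means another thread already
patched the jump.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for one tb->jmp_list_next[n] slot. */
static _Atomic uintptr_t jmp_slot;

/* Returns 1 if we installed 'val'; 0 if another thread won the race
 * and the slot was already non-zero. */
static int try_claim(uintptr_t val)
{
    uintptr_t expected = 0;
    return atomic_compare_exchange_strong(&jmp_slot, &expected, val);
}
```

Whether claiming the slot this way is enough to keep the whole circular
jump list consistent is exactly the question the patch leaves open.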
 

-- 
Pranith


* Re: [Qemu-devel] [RFC PATCH 1/3] tcg: Release tb_lock in the order acquired
  2016-12-07 15:38     ` Pranith Kumar
@ 2016-12-08 17:52       ` Paolo Bonzini
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Bonzini @ 2016-12-08 17:52 UTC (permalink / raw)
  To: Pranith Kumar, Alex Bennée; +Cc: mst, cota, qemu-devel, rth



On 07/12/2016 16:38, Pranith Kumar wrote:
> 
> Hi Alex,
> 
> Alex Bennée writes:
> 
>>
>> Do you have any numbers for this? The main reason being we are trying to
>> avoid bouncing the lock too much and while this is cleaner it could
>> cause more contention.
> 
> I did not really consider performance while cleaning this up. However, I
> looked closer and I think we can remove the tb lock acquisition while adding
> the jump by using atomics. I've attached the patch below. This should remove
> any concern for a negative performance impact.
> 
> I will include this patch if you think it's okay.

I don't like this; it's too tricky.  You'd have to prove that it's
correct and that it keeps the list consistent.  There's nothing wrong
with locking like

	lock(L1)
	lock(L2)
	unlock(L1)
	unlock(L2)

so I'm not sure I see the point of this patch.
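Concretely, the sequence above is deadlock-free because deadlock
avoidance depends only on a consistent *acquisition* order, not on the
release order. A minimal pthread sketch of this (editorial
illustration, not QEMU code):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t L2 = PTHREAD_MUTEX_INITIALIZER;
static int protected_value;

/* As long as every thread takes L1 before L2, releasing L1 before L2
 * (out of acquisition order) is perfectly safe. */
static int hand_over(void)
{
    pthread_mutex_lock(&L1);
    pthread_mutex_lock(&L2);
    pthread_mutex_unlock(&L1);   /* released before L2: still fine */
    protected_value = 42;
    pthread_mutex_unlock(&L2);
    return protected_value;
}
```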

Paolo

> Thanks,
> 
> diff --git a/cpu-exec.c b/cpu-exec.c
> index 13cb15de0e..93debe64b6 100644
> --- a/cpu-exec.c
> +++ b/cpu-exec.c
> @@ -365,9 +365,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
>      /* See if we can patch the calling TB. */
>      if (last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>          if (!tb->invalid) {
> -            tb_lock();
>              tb_add_jump(last_tb, tb_exit, tb);
> -            tb_unlock();
>          }
>      }
>      return tb;
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 84a3240df6..60597cb07e 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -336,7 +336,7 @@ static inline void tb_set_jmp_target(TranslationBlock *tb,
>  static inline void tb_add_jump(TranslationBlock *tb, int n,
>                                 TranslationBlock *tb_next)
>  {
> -    if (tb->jmp_list_next[n]) {
> +    if (atomic_cmpxchg(&tb->jmp_list_next[n], 0, tb_next->jmp_list_first)) {
>          /* Another thread has already done this while we were
>           * outside of the lock; nothing to do in this case */
>          return;
> @@ -351,7 +351,6 @@ static inline void tb_add_jump(TranslationBlock *tb, int n,
>      tb_set_jmp_target(tb, n, (uintptr_t)tb_next->tc_ptr);
>  
>      /* add in TB jmp circular list */
> -    tb->jmp_list_next[n] = tb_next->jmp_list_first;
>      tb_next->jmp_list_first = (uintptr_t)tb | n;
>  }
>  
> 

