public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fix QEMU vcpu thread race with apic_reset
@ 2008-04-26  5:26 Ryan Harper
  2008-04-26  5:33 ` Ryan Harper
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Ryan Harper @ 2008-04-26  5:26 UTC (permalink / raw)
  To: kvm-devel; +Cc: Ryan Harper, Avi Kivity

There is a race between when the vcpu thread issues a create ioctl and when
apic_reset() gets called resulting in getting a badfd error.

main thread                vcpu thread
diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index 78127de..3513e8c 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -31,7 +31,9 @@ extern int smp_cpus;
 static int qemu_kvm_reset_requested;
 
 pthread_mutex_t qemu_mutex = PTHREAD_MUTEX_INITIALIZER;
+pthread_mutex_t vcpu_mutex = PTHREAD_MUTEX_INITIALIZER;
 pthread_cond_t qemu_aio_cond = PTHREAD_COND_INITIALIZER;
+pthread_cond_t qemu_vcpuup_cond = PTHREAD_COND_INITIALIZER;
 __thread struct vcpu_info *vcpu;
 
 struct qemu_kvm_signal_table {
@@ -369,6 +371,11 @@ static void *ap_main_loop(void *_env)
     sigfillset(&signals);
     sigprocmask(SIG_BLOCK, &signals, NULL);
     kvm_create_vcpu(kvm_context, env->cpu_index);
+    /* block until cond_wait occurs */
+    pthread_mutex_lock(&vcpu_mutex);
+    /* now we can signal */
+    pthread_cond_signal(&qemu_vcpuup_cond);
+    pthread_mutex_unlock(&vcpu_mutex);
     kvm_qemu_init_env(env);
     kvm_main_loop_cpu(env);
     return NULL;
@@ -388,9 +395,10 @@ static void kvm_add_signal(struct qemu_kvm_signal_table *sigtab, int signum)
 
 void kvm_init_new_ap(int cpu, CPUState *env)
 {
+    pthread_mutex_lock(&vcpu_mutex);
     pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env);
-    /* FIXME: wait for thread to spin up */
-    usleep(200);
+    pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex);
+    pthread_mutex_unlock(&vcpu_mutex);
 }
 
 static void qemu_kvm_init_signal_tables(void)

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix QEMU vcpu thread race with apic_reset
  2008-04-26  5:26 [PATCH] Fix QEMU vcpu thread race with apic_reset Ryan Harper
@ 2008-04-26  5:33 ` Ryan Harper
  2008-04-26  7:16 ` Avi Kivity
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Ryan Harper @ 2008-04-26  5:33 UTC (permalink / raw)
  To: kvm-devel; +Cc: Avi Kivity

* Ryan Harper <ryanh@us.ibm.com> [2008-04-26 00:27]:
> There is a race between when the vcpu thread issues a create ioctl and when
> apic_reset() gets called resulting in getting a badfd error.
> 
> main thread                vcpu thread

guilt refresh clipped my text short.


main thread                vcpu thread
-----------                -----------
qemu/hw/pc.c:pc_new_cpu()
cpu_init()
cpu_x86_init()
kvm_init_new_ap()          ap_main_loop()
                              *blocks*
usleep()
apic_init()
kvm_set_lapic()
kvm_ioctl with unitilized context
badfd

To fix this, ensure we create the vcpu in the vcpu thread before returning from
kvm_init_new_ap.  Synchronize on a new mutux, vcpu_mutex, and wait for the
vcpuup condition before signaling to ensure the main thread is waiting before we
send the signal.

With this patch, I can launch 64 kvm guests, 1 second apart and not see any Bad
File descriptor errors.

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>


-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix QEMU vcpu thread race with apic_reset
  2008-04-26  5:26 [PATCH] Fix QEMU vcpu thread race with apic_reset Ryan Harper
  2008-04-26  5:33 ` Ryan Harper
@ 2008-04-26  7:16 ` Avi Kivity
  2008-04-28 16:26   ` Ryan Harper
  2008-04-26 16:58 ` Ulrich Drepper
  2008-04-26 17:04 ` Ulrich Drepper
  3 siblings, 1 reply; 9+ messages in thread
From: Avi Kivity @ 2008-04-26  7:16 UTC (permalink / raw)
  To: Ryan Harper; +Cc: kvm-devel

Ryan Harper wrote:
> There is a race between when the vcpu thread issues a create ioctl and when
> apic_reset() gets called resulting in getting a badfd error.
>
>   

The problem is indeed there, but the fix is wrong:

> main thread                vcpu thread
> diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
> index 78127de..3513e8c 100644
> --- a/qemu/qemu-kvm.c
> +++ b/qemu/qemu-kvm.c
> @@ -31,7 +31,9 @@ extern int smp_cpus;
>  static int qemu_kvm_reset_requested;
>  
>  pthread_mutex_t qemu_mutex = PTHREAD_MUTEX_INITIALIZER;
> +pthread_mutex_t vcpu_mutex = PTHREAD_MUTEX_INITIALIZER;
>  pthread_cond_t qemu_aio_cond = PTHREAD_COND_INITIALIZER;
> +pthread_cond_t qemu_vcpuup_cond = PTHREAD_COND_INITIALIZER;
>  __thread struct vcpu_info *vcpu;
>  
>  struct qemu_kvm_signal_table {
> @@ -369,6 +371,11 @@ static void *ap_main_loop(void *_env)
>      sigfillset(&signals);
>      sigprocmask(SIG_BLOCK, &signals, NULL);
>      kvm_create_vcpu(kvm_context, env->cpu_index);
> +    /* block until cond_wait occurs */
> +    pthread_mutex_lock(&vcpu_mutex);
> +    /* now we can signal */
> +    pthread_cond_signal(&qemu_vcpuup_cond);
> +    pthread_mutex_unlock(&vcpu_mutex);
>      kvm_qemu_init_env(env);
>      kvm_main_loop_cpu(env);
>      return NULL;
> @@ -388,9 +395,10 @@ static void kvm_add_signal(struct qemu_kvm_signal_table *sigtab, int signum)
>  
>  void kvm_init_new_ap(int cpu, CPUState *env)
>  {
> +    pthread_mutex_lock(&vcpu_mutex);
>      pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env);
> -    /* FIXME: wait for thread to spin up */
> -    usleep(200);
> +    pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex);
>   

pthread_cond_wait() is never correct outside a loop.  The signal may 
arrive before wait is called.

The usual idiom is

    while (condition is not fulfilled)
         pthread_cond_wait();

I see you have something there to ensure we block, but please use the 
right idiom.

> +    pthread_mutex_unlock(&vcpu_mutex);
>  }
>  

Please reuse qemu_mutex for this, no need for a new one.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix QEMU vcpu thread race with apic_reset
  2008-04-26  5:26 [PATCH] Fix QEMU vcpu thread race with apic_reset Ryan Harper
  2008-04-26  5:33 ` Ryan Harper
  2008-04-26  7:16 ` Avi Kivity
@ 2008-04-26 16:58 ` Ulrich Drepper
  2008-04-26 17:04 ` Ulrich Drepper
  3 siblings, 0 replies; 9+ messages in thread
From: Ulrich Drepper @ 2008-04-26 16:58 UTC (permalink / raw)
  To: Ryan Harper; +Cc: kvm-devel, Avi Kivity

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ryan Harper wrote:
> +    /* block until cond_wait occurs */
> +    pthread_mutex_lock(&vcpu_mutex);
> +    /* now we can signal */
> +    pthread_cond_signal(&qemu_vcpuup_cond);
> +    pthread_mutex_unlock(&vcpu_mutex);

It is not necessary to take the mutex for a condvar if you're just
waking a waiter.  This just unnecessarily slows things down.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFIE19D2ijCOnn/RHQRAvLqAJ9j+7ICPHpB/gCthCelDgYn5poGMgCfaZVy
+tE5zOuxqlBMUR7Fgufw/wY=
=5j4s
-----END PGP SIGNATURE-----

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix QEMU vcpu thread race with apic_reset
  2008-04-26  5:26 [PATCH] Fix QEMU vcpu thread race with apic_reset Ryan Harper
                   ` (2 preceding siblings ...)
  2008-04-26 16:58 ` Ulrich Drepper
@ 2008-04-26 17:04 ` Ulrich Drepper
  2008-04-26 17:56   ` Anthony Liguori
  3 siblings, 1 reply; 9+ messages in thread
From: Ulrich Drepper @ 2008-04-26 17:04 UTC (permalink / raw)
  To: Ryan Harper; +Cc: kvm-devel, Avi Kivity

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ryan Harper wrote:
> @@ -388,9 +395,10 @@ static void kvm_add_signal(struct qemu_kvm_signal_table *sigtab, int signum)
>  
>  void kvm_init_new_ap(int cpu, CPUState *env)
>  {
> +    pthread_mutex_lock(&vcpu_mutex);
>      pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env);
> -    /* FIXME: wait for thread to spin up */
> -    usleep(200);
> +    pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex);
> +    pthread_mutex_unlock(&vcpu_mutex);

And something is very wrong here.  The pattern for using a condvar is

1 take mutex

2 check condition

3 if condition is not fulfilled
3a   call cond_wait
3b   when returning, go back to step 2

4 unlock mutex


Anything else is buggy.

So, either your condvar use is wrong or you don't really want a condvar
in the first place.  I haven't checked the code.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFIE2CD2ijCOnn/RHQRAs4kAJ40kbWjNJAzj2gGdbo/sSxZTx5b0ACglbis
kw7ST4eJK9CXhNbjKphNsUo=
=ISaC
-----END PGP SIGNATURE-----

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix QEMU vcpu thread race with apic_reset
  2008-04-26 17:04 ` Ulrich Drepper
@ 2008-04-26 17:56   ` Anthony Liguori
  0 siblings, 0 replies; 9+ messages in thread
From: Anthony Liguori @ 2008-04-26 17:56 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: kvm-devel, Avi Kivity

Ulrich Drepper wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Ryan Harper wrote:
>   
>> @@ -388,9 +395,10 @@ static void kvm_add_signal(struct qemu_kvm_signal_table *sigtab, int signum)
>>  
>>  void kvm_init_new_ap(int cpu, CPUState *env)
>>  {
>> +    pthread_mutex_lock(&vcpu_mutex);
>>      pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env);
>> -    /* FIXME: wait for thread to spin up */
>> -    usleep(200);
>> +    pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex);
>> +    pthread_mutex_unlock(&vcpu_mutex);
>>     
>
> And something is very wrong here.  The pattern for using a condvar is
>
> 1 take mutex
>
> 2 check condition
>
> 3 if condition is not fulfilled
> 3a   call cond_wait
> 3b   when returning, go back to step 2
>
> 4 unlock mutex
>
>
> Anything else is buggy.
>
> So, either your condvar use is wrong or you don't really want a condvar
> in the first place.  I haven't checked the code.
>   

A flag is needed in the vcpu structure that indicates whether the vcpu 
spun up or not.  This is what the while () condition should be.  This is 
needed as the thread may spin up before it gets to the 
pthread_cond_wait() in which case the signal happens when noone is 
waiting on it.  The other reason a while () is usually needed is that 
cond_signal may not wake up the right thread so it's necessary to check 
whether you really have something to do.  Not really a problem here but 
the former race is.

A condvar is definitely the right thing to use here.

Regards,

Anthony Liguori

> - --
> ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.7 (GNU/Linux)
>
> iD8DBQFIE2CD2ijCOnn/RHQRAs4kAJ40kbWjNJAzj2gGdbo/sSxZTx5b0ACglbis
> kw7ST4eJK9CXhNbjKphNsUo=
> =ISaC
> -----END PGP SIGNATURE-----
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
> Don't miss this year's exciting event. There's still time to save $100. 
> Use priority code J8TL2D2. 
> http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
> _______________________________________________
> kvm-devel mailing list
> kvm-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/kvm-devel
>   


-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix QEMU vcpu thread race with apic_reset
  2008-04-26  7:16 ` Avi Kivity
@ 2008-04-28 16:26   ` Ryan Harper
  2008-04-28 20:02     ` Anthony Liguori
  2008-04-29  0:46     ` Ulrich Drepper
  0 siblings, 2 replies; 9+ messages in thread
From: Ryan Harper @ 2008-04-28 16:26 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm-devel, Ryan Harper

* Avi Kivity <avi@qumranet.com> [2008-04-26 02:23]:
> 
> Please reuse qemu_mutex for this, no need for a new one.

I'm having a little trouble wrapping my head around all of the locking
here.  If I avoid qemu_mutex and use a new one, I've got everything
working.  However, attemping to use qemu_mutex is stepping into a pile
of locking crap.  I'm sure there is a good reason... 

The current code looks like this:

Thread1:
main()
  kvm_qemu_init() // mutex_lock()
  machine->init()
   pc_init1()
    pc_new_cpu()
     cpu_init()
      cpu_x86_init()
       kvm_init_new_ap() // create vcpu Thread2
       <--      
      <--      
     <--      
    <--      
   <--      
  <--      
  kvm_main_loop() // mutex_unlock()
                        
Thread2:
ap_main_loop()
 /* vcpu init */
 kvm_main_loop_cpu()
  kvm_main_loop_wait() // mutex_unlock() on enter, lock on exit
   kvm_eat_signals()   // mutex_lock() on enter, unlock on exit
   <--
  <--
 <--


It wedges up in kvm_init_new_ap() if I attempt acquire qemu_mutex.
Quite obvious after I looked at the call trace and discovered
kvm_qemu_init() locking on exit.  I see other various functions that
unlock and then lock; I really don't want to wade into this mess...
rather whomever cooked it up should do some cleanup.  I tried the
unlock, then re-lock on exit in kvm_init_new_ap() but that also wedged.

Here is a rework with a new flag in vcpu_info indicating vcpu
creation.  Tested this with 64 1VCPU guests booting with 1 second delay,
and single 16-way SMP guest boot.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@us.ibm.com


diffstat output:
 qemu-kvm.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
---
diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index 78127de..2768ea5 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -31,7 +31,9 @@ extern int smp_cpus;
 static int qemu_kvm_reset_requested;
 
 pthread_mutex_t qemu_mutex = PTHREAD_MUTEX_INITIALIZER;
+pthread_mutex_t vcpu_mutex = PTHREAD_MUTEX_INITIALIZER;
 pthread_cond_t qemu_aio_cond = PTHREAD_COND_INITIALIZER;
+pthread_cond_t qemu_vcpuup_cond = PTHREAD_COND_INITIALIZER;
 __thread struct vcpu_info *vcpu;
 
 struct qemu_kvm_signal_table {
@@ -53,6 +55,7 @@ struct vcpu_info {
     int stop;
     int stopped;
     int reload_regs;
+    int created;
 } vcpu_info[256];
 
 pthread_t io_thread;
@@ -369,6 +372,10 @@ static void *ap_main_loop(void *_env)
     sigfillset(&signals);
     sigprocmask(SIG_BLOCK, &signals, NULL);
     kvm_create_vcpu(kvm_context, env->cpu_index);
+    pthread_mutex_lock(&vcpu_mutex);
+    vcpu->created = 1;
+    pthread_cond_signal(&qemu_vcpuup_cond);
+    pthread_mutex_unlock(&vcpu_mutex);
     kvm_qemu_init_env(env);
     kvm_main_loop_cpu(env);
     return NULL;
@@ -388,9 +395,12 @@ static void kvm_add_signal(struct qemu_kvm_signal_table *sigtab, int signum)
 
 void kvm_init_new_ap(int cpu, CPUState *env)
 {
+    pthread_mutex_lock(&vcpu_mutex);
     pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env);
-    /* FIXME: wait for thread to spin up */
-    usleep(200);
+    while (vcpu_info[cpu].created == 0) {
+        pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex);
+    }
+    pthread_mutex_unlock(&vcpu_mutex);
 }
 
 static void qemu_kvm_init_signal_tables(void)

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix QEMU vcpu thread race with apic_reset
  2008-04-28 16:26   ` Ryan Harper
@ 2008-04-28 20:02     ` Anthony Liguori
  2008-04-29  0:46     ` Ulrich Drepper
  1 sibling, 0 replies; 9+ messages in thread
From: Anthony Liguori @ 2008-04-28 20:02 UTC (permalink / raw)
  To: Ryan Harper; +Cc: kvm-devel, Avi Kivity

Ryan Harper wrote:
> * Avi Kivity <avi@qumranet.com> [2008-04-26 02:23]:
>   
>> Please reuse qemu_mutex for this, no need for a new one.
>>     
>
> I'm having a little trouble wrapping my head around all of the locking
> here.  If I avoid qemu_mutex and use a new one, I've got everything
> working.  However, attemping to use qemu_mutex is stepping into a pile
> of locking crap.  I'm sure there is a good reason... 
>
> The current code looks like this:
>
> Thread1:
> main()
>   kvm_qemu_init() // mutex_lock()
>   machine->init()
>    pc_init1()
>     pc_new_cpu()
>      cpu_init()
>       cpu_x86_init()
>        kvm_init_new_ap() // create vcpu Thread2
>        <--      
>       <--      
>      <--      
>     <--      
>    <--      
>   <--      
>   kvm_main_loop() // mutex_unlock()
>                         
> Thread2:
> ap_main_loop()
>  /* vcpu init */
>  kvm_main_loop_cpu()
>   kvm_main_loop_wait() // mutex_unlock() on enter, lock on exit
>    kvm_eat_signals()   // mutex_lock() on enter, unlock on exit
>    <--
>   <--
>  <--
>   

The qemu_mutex is meant to ensure that the QEMU code is only ever 
entered by one thread at a time.  QEMU is not thread-safe so this is 
necessary.  It's a little odd because we're taking this lock initially 
by being called from QEMU code.  Here's the basic theory of locking AFAICT:

kvm_main_loop_cpu() is the main loop for each VCPU.  It must acquire the 
QEMU mutex since it will call into normal QEMU code to process events.  
Whenever a VCPU is allowed to run, or when the VCPU is idling, it needs 
to release the QEMU mutex.  The former is done via the post_kvm_run() 
and pre_kvm_run() hooks.  The later is done within kvm_main_loop_wait().

kvm_main_loop_wait() will release the QEMU mutex and call 
kvm_eat_signals() which calls kvm_eat_signal() which will issue a 
sigtimedwait().  This is where we actually idle (and why SIGIO is so 
important right now).  We don't want to idle with the QEMU mutex held as 
that may result in dead lock so this is why we release it here.

kvm_eat_signal() has to acquire the lock again in order to dispatch IO 
events (via kvm_process_signal()).

Regards,

Anthony Liguori

> It wedges up in kvm_init_new_ap() if I attempt acquire qemu_mutex.
> Quite obvious after I looked at the call trace and discovered
> kvm_qemu_init() locking on exit.  I see other various functions that
> unlock and then lock; I really don't want to wade into this mess...
> rather whomever cooked it up should do some cleanup.  I tried the
> unlock, then re-lock on exit in kvm_init_new_ap() but that also wedged.
>
> Here is a rework with a new flag in vcpu_info indicating vcpu
> creation.  Tested this with 64 1VCPU guests booting with 1 second delay,
> and single 16-way SMP guest boot.
>
>   


-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Fix QEMU vcpu thread race with apic_reset
  2008-04-28 16:26   ` Ryan Harper
  2008-04-28 20:02     ` Anthony Liguori
@ 2008-04-29  0:46     ` Ulrich Drepper
  1 sibling, 0 replies; 9+ messages in thread
From: Ulrich Drepper @ 2008-04-29  0:46 UTC (permalink / raw)
  To: Ryan Harper; +Cc: kvm-devel, Avi Kivity

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> @@ -369,6 +372,10 @@ static void *ap_main_loop(void *_env)
>      sigfillset(&signals);
>      sigprocmask(SIG_BLOCK, &signals, NULL);
>      kvm_create_vcpu(kvm_context, env->cpu_index);
> +    pthread_mutex_lock(&vcpu_mutex);
> +    vcpu->created = 1;
> +    pthread_cond_signal(&qemu_vcpuup_cond);
> +    pthread_mutex_unlock(&vcpu_mutex);
>      kvm_qemu_init_env(env);
>      kvm_main_loop_cpu(env);
>      return NULL;

Still no locking needed on any CPU we support.  The memory access is
atomic and that's all that counts here.  With the mutex taken the woken
thread immediately runs into a brick wall and has to be put to sleep again.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFIFm/w2ijCOnn/RHQRAmxRAKDHoem5zvdt6gSVdoX6vQoYPr106QCcC3IA
RzKcqRUNgUmUDzqnrkjHrAI=
=MNbq
-----END PGP SIGNATURE-----

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-04-29  0:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-04-26  5:26 [PATCH] Fix QEMU vcpu thread race with apic_reset Ryan Harper
2008-04-26  5:33 ` Ryan Harper
2008-04-26  7:16 ` Avi Kivity
2008-04-28 16:26   ` Ryan Harper
2008-04-28 20:02     ` Anthony Liguori
2008-04-29  0:46     ` Ulrich Drepper
2008-04-26 16:58 ` Ulrich Drepper
2008-04-26 17:04 ` Ulrich Drepper
2008-04-26 17:56   ` Anthony Liguori

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox