* [Qemu-devel] Role of qemu_fair_mutex
@ 2011-01-03  9:46 Jan Kiszka
  2011-01-03 10:01 ` [Qemu-devel] " Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2011-01-03  9:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: Marcelo Tosatti, kvm

Hi,

at least in kvm mode, the qemu_fair_mutex seems to have lost its
function of balancing qemu_global_mutex access between the io-thread and
vcpus. It's now only taken by the latter, isn't it?

This and the fact that qemu-kvm does not use this kind of lock made me
wonder what its role is and if it is still relevant in practice. I'd
like to unify the execution models of qemu-kvm and qemu, and this lock
is the most obvious difference (there are surely more subtle ones as
well...).

Jan



* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-03  9:46 [Qemu-devel] Role of qemu_fair_mutex Jan Kiszka
@ 2011-01-03 10:01 ` Avi Kivity
  2011-01-03 10:03   ` Jan Kiszka
  2011-01-04 14:17   ` Anthony Liguori
  0 siblings, 2 replies; 13+ messages in thread
From: Avi Kivity @ 2011-01-03 10:01 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, qemu-devel, kvm

On 01/03/2011 11:46 AM, Jan Kiszka wrote:
> Hi,
>
> at least in kvm mode, the qemu_fair_mutex seems to have lost its
> function of balancing qemu_global_mutex access between the io-thread and
> vcpus. It's now only taken by the latter, isn't it?
>
> This and the fact that qemu-kvm does not use this kind of lock made me
> wonder what its role is and if it is still relevant in practice. I'd
> like to unify the execution models of qemu-kvm and qemu, and this lock
> is the most obvious difference (there are surely more subtle ones as
> well...).
>

IIRC it was used for tcg, which has a problem that kvm doesn't have: a 
tcg vcpu needs to hold qemu_mutex when it runs, which means there will 
always be contention on qemu_mutex.  In the absence of fairness, the tcg 
thread could dominate qemu_mutex and starve the iothread.

This doesn't happen with kvm since kvm vcpus drop qemu_mutex when running.
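
To make the contrast concrete, here is a minimal pthread-based sketch of
the two vcpu loops; everything below is invented for illustration and is
not a qemu interface:

#include <pthread.h>

static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;

static void run_guest_tcg(void) { /* translate and execute guest code */ }
static void run_guest_kvm(void) { /* ioctl(vcpu_fd, KVM_RUN, 0) */ }

/* tcg vcpu: holds the global mutex for its whole execution slice, so the
 * iothread contends with it every time it wants the lock. */
static void *tcg_vcpu_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&global_mutex);
        run_guest_tcg();                 /* lock held while the guest runs */
        pthread_mutex_unlock(&global_mutex);
    }
    return NULL;
}

/* kvm vcpu: the guest runs outside the lock, which is only taken briefly
 * to handle the exit reason, so no extra fairness device is needed. */
static void *kvm_vcpu_thread(void *arg)
{
    (void)arg;
    for (;;) {
        run_guest_kvm();                 /* guest runs without the lock */
        pthread_mutex_lock(&global_mutex);
        /* handle the exit reason (mmio, pio, ...) under the lock */
        pthread_mutex_unlock(&global_mutex);
    }
    return NULL;
}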

-- 
error compiling committee.c: too many arguments to function


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-03 10:01 ` [Qemu-devel] " Avi Kivity
@ 2011-01-03 10:03   ` Jan Kiszka
  2011-01-03 10:08     ` Avi Kivity
  2011-01-04 14:17   ` Anthony Liguori
  1 sibling, 1 reply; 13+ messages in thread
From: Jan Kiszka @ 2011-01-03 10:03 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, qemu-devel, kvm

Am 03.01.2011 11:01, Avi Kivity wrote:
> On 01/03/2011 11:46 AM, Jan Kiszka wrote:
>> Hi,
>>
>> at least in kvm mode, the qemu_fair_mutex seems to have lost its
>> function of balancing qemu_global_mutex access between the io-thread and
>> vcpus. It's now only taken by the latter, isn't it?
>>
>> This and the fact that qemu-kvm does not use this kind of lock made me
>> wonder what its role is and if it is still relevant in practice. I'd
>> like to unify the execution models of qemu-kvm and qemu, and this lock
>> is the most obvious difference (there are surely more subtle ones as
>> well...).
>>
> 
> IIRC it was used for tcg, which has a problem that kvm doesn't have: a
> tcg vcpu needs to hold qemu_mutex when it runs, which means there will
> always be contention on qemu_mutex.  In the absence of fairness, the tcg
> thread could dominate qemu_mutex and starve the iothread.
> 
> This doesn't happen with kvm since kvm vcpus drop qemu_mutex when running.
> 

I see. Then I guess we should do this:

diff --git a/cpus.c b/cpus.c
index 9bf5224..0de8552 100644
--- a/cpus.c
+++ b/cpus.c
@@ -734,9 +734,7 @@ static sigset_t block_io_signals(void)
 void qemu_mutex_lock_iothread(void)
 {
     if (kvm_enabled()) {
-        qemu_mutex_lock(&qemu_fair_mutex);
         qemu_mutex_lock(&qemu_global_mutex);
-        qemu_mutex_unlock(&qemu_fair_mutex);
     } else {
         qemu_mutex_lock(&qemu_fair_mutex);
         if (qemu_mutex_trylock(&qemu_global_mutex)) {
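
For context, with this hunk applied the function would read roughly as
follows. The else branch is paraphrased from memory of the cpus.c of that
period (the exact kick call and thread variable may differ), not quoted
from any tree:

void qemu_mutex_lock_iothread(void)
{
    if (kvm_enabled()) {
        qemu_mutex_lock(&qemu_global_mutex);
    } else {
        qemu_mutex_lock(&qemu_fair_mutex);
        if (qemu_mutex_trylock(&qemu_global_mutex)) {
            /* a tcg vcpu holds the global lock: kick it out of the guest
             * (SIG_IPI) and wait; holding qemu_fair_mutex keeps the vcpu
             * from grabbing the global lock back before we get it */
            qemu_thread_signal(tcg_cpu_thread, SIG_IPI);
            qemu_mutex_lock(&qemu_global_mutex);
        }
        qemu_mutex_unlock(&qemu_fair_mutex);
    }
}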

Jan



* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-03 10:03   ` Jan Kiszka
@ 2011-01-03 10:08     ` Avi Kivity
  0 siblings, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2011-01-03 10:08 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, qemu-devel, kvm

On 01/03/2011 12:03 PM, Jan Kiszka wrote:
> Am 03.01.2011 11:01, Avi Kivity wrote:
> >  On 01/03/2011 11:46 AM, Jan Kiszka wrote:
> >>  Hi,
> >>
> >>  at least in kvm mode, the qemu_fair_mutex seems to have lost its
> >>  function of balancing qemu_global_mutex access between the io-thread and
> >>  vcpus. It's now only taken by the latter, isn't it?
> >>
> >>  This and the fact that qemu-kvm does not use this kind of lock made me
> >>  wonder what its role is and if it is still relevant in practice. I'd
> >>  like to unify the execution models of qemu-kvm and qemu, and this lock
> >>  is the most obvious difference (there are surely more subtle ones as
> >>  well...).
> >>
> >
> >  IIRC it was used for tcg, which has a problem that kvm doesn't have: a
> >  tcg vcpu needs to hold qemu_mutex when it runs, which means there will
> >  always be contention on qemu_mutex.  In the absence of fairness, the tcg
> >  thread could dominate qemu_mutex and starve the iothread.
> >
> >  This doesn't happen with kvm since kvm vcpus drop qemu_mutex when running.
> >
>
> I see. Then I guess we should do this:
>
> diff --git a/cpus.c b/cpus.c
> index 9bf5224..0de8552 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -734,9 +734,7 @@ static sigset_t block_io_signals(void)
>   void qemu_mutex_lock_iothread(void)
>   {
>       if (kvm_enabled()) {
> -        qemu_mutex_lock(&qemu_fair_mutex);
>           qemu_mutex_lock(&qemu_global_mutex);
> -        qemu_mutex_unlock(&qemu_fair_mutex);
>       } else {
>           qemu_mutex_lock(&qemu_fair_mutex);
>           if (qemu_mutex_trylock(&qemu_global_mutex)) {

I think so, though Anthony or Marcelo should confirm my interpretation 
first.

-- 
error compiling committee.c: too many arguments to function


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-03 10:01 ` [Qemu-devel] " Avi Kivity
  2011-01-03 10:03   ` Jan Kiszka
@ 2011-01-04 14:17   ` Anthony Liguori
  2011-01-04 14:27     ` Avi Kivity
  2011-01-04 21:39     ` Marcelo Tosatti
  1 sibling, 2 replies; 13+ messages in thread
From: Anthony Liguori @ 2011-01-04 14:17 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, Jan Kiszka, qemu-devel, kvm

On 01/03/2011 04:01 AM, Avi Kivity wrote:
> On 01/03/2011 11:46 AM, Jan Kiszka wrote:
>> Hi,
>>
>> at least in kvm mode, the qemu_fair_mutex seems to have lost its
>> function of balancing qemu_global_mutex access between the io-thread and
>> vcpus. It's now only taken by the latter, isn't it?
>>
>> This and the fact that qemu-kvm does not use this kind of lock made me
>> wonder what its role is and if it is still relevant in practice. I'd
>> like to unify the execution models of qemu-kvm and qemu, and this lock
>> is the most obvious difference (there are surely more subtle ones as
>> well...).
>>
>
> IIRC it was used for tcg, which has a problem that kvm doesn't have: a 
> tcg vcpu needs to hold qemu_mutex when it runs, which means there will 
> always be contention on qemu_mutex.  In the absence of fairness, the 
> tcg thread could dominate qemu_mutex and starve the iothread.

No, it's actually the opposite IIRC.

TCG relies on the following behavior: a guest VCPU runs until 1) it 
encounters a HLT instruction, or 2) an event occurs that forces the TCG 
execution to break.

(2) really means that the TCG thread receives a signal.  Usually, this 
is the periodic timer signal.

When the TCG thread breaks out of execution, it needs to let the IO 
thread run for at least one iteration.  Coordinating the execution of 
the IO thread such that it's guaranteed to run at least once, and then 
having it drop the qemu mutex long enough for the TCG thread to acquire 
it, is the purpose of the qemu_fair_mutex.
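
A minimal sketch of the vcpu-side half of that handoff, under the
assumption that the iothread holds qemu_fair_mutex from before it asks
for the global lock until after it has acquired it (names follow the
thread, details are illustrative):

static void tcg_yield_to_iothread(void)
{
    qemu_mutex_unlock(&qemu_global_mutex);

    /* the iothread holds qemu_fair_mutex while it waits for the global
     * lock, so this lock/unlock pair cannot complete until the iothread
     * has actually won qemu_global_mutex for its iteration */
    qemu_mutex_lock(&qemu_fair_mutex);
    qemu_mutex_unlock(&qemu_fair_mutex);

    qemu_mutex_lock(&qemu_global_mutex);
}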

Regards,

Anthony Liguori

> This doesn't happen with kvm since kvm vcpus drop qemu_mutex when 
> running.


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-04 14:17   ` Anthony Liguori
@ 2011-01-04 14:27     ` Avi Kivity
  2011-01-04 14:55       ` Anthony Liguori
  2011-01-04 21:39     ` Marcelo Tosatti
  1 sibling, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2011-01-04 14:27 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Marcelo Tosatti, Jan Kiszka, qemu-devel, kvm

On 01/04/2011 04:17 PM, Anthony Liguori wrote:
> On 01/03/2011 04:01 AM, Avi Kivity wrote:
>> On 01/03/2011 11:46 AM, Jan Kiszka wrote:
>>> Hi,
>>>
>>> at least in kvm mode, the qemu_fair_mutex seems to have lost its
>>> function of balancing qemu_global_mutex access between the io-thread 
>>> and
>>> vcpus. It's now only taken by the latter, isn't it?
>>>
>>> This and the fact that qemu-kvm does not use this kind of lock made me
>>> wonder what its role is and if it is still relevant in practice. I'd
>>> like to unify the execution models of qemu-kvm and qemu, and this lock
>>> is the most obvious difference (there are surely more subtle ones as
>>> well...).
>>>
>>
>> IIRC it was used for tcg, which has a problem that kvm doesn't have: 
>> a tcg vcpu needs to hold qemu_mutex when it runs, which means there 
>> will always be contention on qemu_mutex.  In the absence of fairness, 
>> the tcg thread could dominate qemu_mutex and starve the iothread.
>
> No, it's actually the opposite IIRC.
>
> TCG relies on the following behavior.   A guest VCPU runs until 1) it 
> encounters a HLT instruction 2) an event occurs that forces the TCG 
> execution to break.
>
> (2) really means that the TCG thread receives a signal.  Usually, this 
> is the periodic timer signal.

What about a completion?  An I/O completes, the I/O thread wakes up, 
needs to acquire the global lock (and force tcg off it), inject an 
interrupt, and go back to sleep.

>
> When the TCG thread, it needs to let the IO thread run for at least 
> one iteration.  Coordinating the execution of the IO thread such that 
> it's guaranteed to run at least once and then having it drop the qemu 
> mutex long enough for the TCG thread to acquire it is the purpose of 
> the qemu_fair_mutex.

That doesn't compute - the iothread doesn't hog the global lock (it 
sleeps most of the time, and drops the lock while sleeping), so the 
iothread cannot starve out tcg.  On the other hand, tcg does hog the 
global lock, so it needs to be made to give it up so the iothread can 
run, as in my completion example above.

I think the abstraction we need here is a priority lock, with higher 
priority given to the iothread.  A lock() operation that takes 
precedence would atomically signal the current owner to drop the lock.
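
A minimal sketch of what such a priority lock could look like; everything
below is invented for illustration, and it still has the handoff caveat
discussed later in the thread:

#include <pthread.h>
#include <signal.h>
#include <sched.h>
#include <stdatomic.h>

typedef struct {
    pthread_mutex_t mutex;
    atomic_int      urgent;  /* a high-priority waiter (the iothread) exists */
    pthread_t       owner;   /* the low-priority lock hog, i.e. the tcg thread */
} PrioLock;

/* high-priority side: announce urgency, poke the owner, take the lock */
static void prio_lock_urgent(PrioLock *pl)
{
    atomic_store(&pl->urgent, 1);
    /* assumes the owner installed a no-op SIGUSR1 handler, so the signal
     * merely breaks it out of guest execution (stand-in for SIG_IPI) */
    pthread_kill(pl->owner, SIGUSR1);
    pthread_mutex_lock(&pl->mutex);
    atomic_store(&pl->urgent, 0);
}

/* low-priority side: called from the execution loop; drop the lock while
 * an urgent waiter exists, then take it back and continue */
static void prio_lock_maybe_yield(PrioLock *pl)
{
    if (atomic_load(&pl->urgent)) {
        pthread_mutex_unlock(&pl->mutex);
        sched_yield();               /* give the waiter a chance to win */
        pthread_mutex_lock(&pl->mutex);
    }
}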

Under kvm we'd run a normal mutex, so it wouldn't need to take the 
extra mutex.

-- 
error compiling committee.c: too many arguments to function


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-04 14:27     ` Avi Kivity
@ 2011-01-04 14:55       ` Anthony Liguori
  2011-01-04 15:12         ` Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Anthony Liguori @ 2011-01-04 14:55 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, Jan Kiszka, qemu-devel, kvm

On 01/04/2011 08:27 AM, Avi Kivity wrote:
> On 01/04/2011 04:17 PM, Anthony Liguori wrote:
>> On 01/03/2011 04:01 AM, Avi Kivity wrote:
>>> On 01/03/2011 11:46 AM, Jan Kiszka wrote:
>>>> Hi,
>>>>
>>>> at least in kvm mode, the qemu_fair_mutex seems to have lost its
>>>> function of balancing qemu_global_mutex access between the 
>>>> io-thread and
>>>> vcpus. It's now only taken by the latter, isn't it?
>>>>
>>>> This and the fact that qemu-kvm does not use this kind of lock made me
>>>> wonder what its role is and if it is still relevant in practice. I'd
>>>> like to unify the execution models of qemu-kvm and qemu, and this lock
>>>> is the most obvious difference (there are surely more subtle ones as
>>>> well...).
>>>>
>>>
>>> IIRC it was used for tcg, which has a problem that kvm doesn't have: 
>>> a tcg vcpu needs to hold qemu_mutex when it runs, which means there 
>>> will always be contention on qemu_mutex.  In the absence of 
>>> fairness, the tcg thread could dominate qemu_mutex and starve the 
>>> iothread.
>>
>> No, it's actually the opposite IIRC.
>>
>> TCG relies on the following behavior.   A guest VCPU runs until 1) it 
>> encounters a HLT instruction 2) an event occurs that forces the TCG 
>> execution to break.
>>
>> (2) really means that the TCG thread receives a signal.  Usually, 
>> this is the periodic timer signal.
>
> What about a completion?  an I/O completes, the I/O thread wakes up, 
> needs to acquire the global lock (and force tcg off it) inject and 
> interrupt, and go back to sleep.

I/O completion triggers an fd to become readable.  This will cause 
select to break and the io thread will attempt to acquire the 
qemu_mutex.  When acquiring the mutex in TCG mode, the io thread will 
send a SIG_IPI to the TCG VCPU thread.
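
The loop shape being described is roughly the following; this is an
illustration, not a verbatim main_loop_wait():

#include <sys/select.h>

static void iothread_loop_sketch(void)
{
    fd_set rfds;
    struct timeval timeout;
    int nfds = 0, ret;

    for (;;) {
        FD_ZERO(&rfds);
        /* ... add the fds registered by device models, aio, the monitor ... */
        timeout.tv_sec = 0;
        timeout.tv_usec = 10000;

        qemu_mutex_unlock_iothread();  /* never sleep in select() under the lock */
        ret = select(nfds + 1, &rfds, NULL, NULL, &timeout);
        qemu_mutex_lock_iothread();    /* in tcg mode this kicks the vcpu */

        if (ret > 0) {
            /* dispatch fd handlers: I/O completions, chardevs, monitor, ... */
        }
        /* then run expired timers and bottom halves, still under the lock */
    }
}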


>>
>> When the TCG thread, it needs to let the IO thread run for at least 
>> one iteration.  Coordinating the execution of the IO thread such that 
>> it's guaranteed to run at least once and then having it drop the qemu 
>> mutex long enough for the TCG thread to acquire it is the purpose of 
>> the qemu_fair_mutex.
>
> That doesn't compute - the iothread doesn't hog the global lock (it 
> sleeps most of the time, and drops the lock while sleeping), so the 
> iothread cannot starve out tcg.

The fact that the iothread drops the global lock during sleep is a 
detail that shouldn't affect correctness.  The IO thread is absolutely 
allowed to run for arbitrary periods of time without dropping the qemu 
mutex.

>   On the other hand, tcg does hog the global lock, so it needs to be 
> made to give it up so the iothread can run, for example my completion 
> example.

It's very easy to ask TCG to give up the qemu_mutex by using 
cpu_interrupt().  It will drop the qemu_mutex and it will not attempt to 
acquire it again until you restart the VCPU.

> I think the abstraction we need here is a priority lock, with higher 
> priority given to the iothread.  A lock() operation that takes 
> precedence would atomically signal the current owner to drop the lock.

The I/O thread can reliably acquire the lock whenever it needs to.

If you drop all of the qemu_fair_mutex stuff and leave the qemu_mutex 
getting dropped around select, TCG will generally work reliably.  But 
this is not race free.  Just dropping a lock does not result in reliable 
hand off.

I think a generational counter could work, and a condition variable could work.
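
One way to read that suggestion, purely as an illustration (nothing below
exists in qemu): a generation counter published under a condition variable
lets the thread that gives up the lock wait until the waiter has really
taken it.

#include <pthread.h>

static pthread_mutex_t global_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t gen_lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gen_cond     = PTHREAD_COND_INITIALIZER;
static unsigned long   generation;

/* waiter (iothread): take the global lock, then publish a new generation */
static void lock_and_announce(void)
{
    pthread_mutex_lock(&global_mutex);
    pthread_mutex_lock(&gen_lock);
    generation++;
    pthread_cond_broadcast(&gen_cond);
    pthread_mutex_unlock(&gen_lock);
}

/* lock hog (tcg): called while holding global_mutex, and only once a waiter
 * is known to be pending (e.g. after a SIG_IPI); drop the lock and do not
 * try to re-take it until the waiter has bumped the generation */
static void yield_lock_once(void)
{
    unsigned long seen;

    pthread_mutex_lock(&gen_lock);
    seen = generation;
    pthread_mutex_unlock(&gen_lock);

    pthread_mutex_unlock(&global_mutex);

    pthread_mutex_lock(&gen_lock);
    while (generation == seen) {
        pthread_cond_wait(&gen_cond, &gen_lock);
    }
    pthread_mutex_unlock(&gen_lock);

    pthread_mutex_lock(&global_mutex);
}

The handoff is explicit: the waiter announces that it actually got the
lock by bumping the generation, which is exactly the property a bare
unlock/lock pair does not give.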

Regards,

Anthony Liguori


> Under kvm we'd run a normal mutex, so the it wouldn't need to take the 
> extra mutex.
>


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-04 14:55       ` Anthony Liguori
@ 2011-01-04 15:12         ` Avi Kivity
  2011-01-04 15:43           ` Anthony Liguori
  0 siblings, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2011-01-04 15:12 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Marcelo Tosatti, Jan Kiszka, qemu-devel, kvm

On 01/04/2011 04:55 PM, Anthony Liguori wrote:
>
>>>
>>> When the TCG thread, it needs to let the IO thread run for at least 
>>> one iteration.  Coordinating the execution of the IO thread such 
>>> that it's guaranteed to run at least once and then having it drop 
>>> the qemu mutex long enough for the TCG thread to acquire it is the 
>>> purpose of the qemu_fair_mutex.
>>
>> That doesn't compute - the iothread doesn't hog the global lock (it 
>> sleeps most of the time, and drops the lock while sleeping), so the 
>> iothread cannot starve out tcg.
>
> The fact that the iothread drops the global lock during sleep is a 
> detail that shouldn't affect correctness.  The IO thread is absolutely 
> allowed to run for arbitrary periods of time without dropping the qemu 
> mutex.

No, it's not, since it will stop vcpus in their tracks.  Whenever we 
hold qemu_mutex for unbounded time, that's a bug.  I think the only 
place is live migration and perhaps tcg?

>
>>   On the other hand, tcg does hog the global lock, so it needs to be 
>> made to give it up so the iothread can run, for example my completion 
>> example.
>
> It's very easy to ask TCG to give up the qemu_mutex by using 
> cpu_interrupt().  It will drop the qemu_mutex and it will not attempt 
> to acquire it again until you restart the VCPU.

Maybe that's the solution:

def acquire_global_mutex():
    if not tcg_thread:
       cpu_interrupt()
    global_mutex.acquire()

def release_global_mutex():
     global_mutex.release()
     if not tcg_thread:
        cpu_resume()

though it's racy, two non-tcg threads can cause an early resume.

>
>> I think the abstraction we need here is a priority lock, with higher 
>> priority given to the iothread.  A lock() operation that takes 
>> precedence would atomically signal the current owner to drop the lock.
>
> The I/O thread can reliably acquire the lock whenever it needs to.
>
> If you drop all of the qemu_fair_mutex stuff and leave the qemu_mutex 
> getting dropped around select, TCG will generally work reliably.  But 
> this is not race free. 

What would be the impact of a race here?

> Just dropping a lock does not result in reliable hand off.

Why do we want a handoff in the first place?

I don't think we do.  I think we want the iothread to run in preference 
to tcg, since tcg is a lock hog under guest control, while the iothread 
is not a lock hog (excepting live migration).

>
> I think a generational counter could work and a condition could work.

-- 
error compiling committee.c: too many arguments to function


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-04 15:12         ` Avi Kivity
@ 2011-01-04 15:43           ` Anthony Liguori
  2011-01-05  8:55             ` Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Anthony Liguori @ 2011-01-04 15:43 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, Jan Kiszka, qemu-devel, kvm

On 01/04/2011 09:12 AM, Avi Kivity wrote:
> On 01/04/2011 04:55 PM, Anthony Liguori wrote:
>>
>>>>
>>>> When the TCG thread, it needs to let the IO thread run for at least 
>>>> one iteration.  Coordinating the execution of the IO thread such 
>>>> that it's guaranteed to run at least once and then having it drop 
>>>> the qemu mutex long enough for the TCG thread to acquire it is the 
>>>> purpose of the qemu_fair_mutex.
>>>
>>> That doesn't compute - the iothread doesn't hog the global lock (it 
>>> sleeps most of the time, and drops the lock while sleeping), so the 
>>> iothread cannot starve out tcg.
>>
>> The fact that the iothread drops the global lock during sleep is a 
>> detail that shouldn't affect correctness.  The IO thread is 
>> absolutely allowed to run for arbitrary periods of time without 
>> dropping the qemu mutex.
>
> No, it's not, since it will stop vcpus in their tracks.  Whenever we 
> hold qemu_mutex for unbounded time, that's a bug.

I'm not sure that designing the io thread to hold the lock for a 
"bounded" amount of time is a good design point.  What is an accepted 
amount of time for it to hold the lock?

Instead of the VCPU relying on the IO thread to eventually drop the 
lock, it seems far superior to have the VCPU thread indicate to the IO 
thread that it needs the lock.

As of right now, the IO thread can indicate to the VCPU thread that it 
needs the lock so having a symmetric interface seems obvious.  Of 
course, you need to pick one to have more priority in case both indicate 
they need to use the lock at the same exact time.

>   I think the only place is live migration and perhaps tcg?

qcow2 and anything else that puts the IO thread to sleep.

>>>   On the other hand, tcg does hog the global lock, so it needs to be 
>>> made to give it up so the iothread can run, for example my 
>>> completion example.
>>
>> It's very easy to ask TCG to give up the qemu_mutex by using 
>> cpu_interrupt().  It will drop the qemu_mutex and it will not attempt 
>> to acquire it again until you restart the VCPU.
>
> Maybe that's the solution:
>
> def acquire_global_mutex():
>    if not tcg_thread:
>       cpu_interrupt()

It's not quite as direct as this at the moment but this is also not 
really a bad idea.  Right now we just send a SIG_IPI but cpu_interrupt 
would be better.

>    global_mutex.aquire()
>
> release_global_mutex():
>     global_mutex.release()
>     if not tcg_thread:
>        cpu_resume()
>
> though it's racy, two non-tcg threads can cause an early resume.
>
>>
>>> I think the abstraction we need here is a priority lock, with higher 
>>> priority given to the iothread.  A lock() operation that takes 
>>> precedence would atomically signal the current owner to drop the lock.
>>
>> The I/O thread can reliably acquire the lock whenever it needs to.
>>
>> If you drop all of the qemu_fair_mutex stuff and leave the qemu_mutex 
>> getting dropped around select, TCG will generally work reliably.  But 
>> this is not race free. 
>
> What would be the impact of a race here?

Racy is probably the wrong word.  To give a concrete example of why one 
is better than the other, consider live migration.

It would be reasonable to have a check in live migration to iterate 
unless there was higher priority work.  If a VCPU thread needs to 
acquire the mutex, that could be considered higher priority work.  If 
you don't have an explicit hand off, it's not possible to implement such 
logic.

>> Just dropping a lock does not result in reliable hand off.
>
> Why do we want a handoff in the first place?
>
> I don't think we do.  I think we want the iothread to run in 
> preference to tcg, since tcg is a lock hog under guest control, while 
> the iothread is not a lock hog (excepting live migration).

The io thread is a lock hog practically speaking.

Regards,

Anthony Liguori

>>
>> I think a generational counter could work and a condition could work.
>


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-04 14:17   ` Anthony Liguori
  2011-01-04 14:27     ` Avi Kivity
@ 2011-01-04 21:39     ` Marcelo Tosatti
  2011-01-05 16:44       ` Anthony Liguori
  1 sibling, 1 reply; 13+ messages in thread
From: Marcelo Tosatti @ 2011-01-04 21:39 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Jan Kiszka, Avi Kivity, kvm, qemu-devel

On Tue, Jan 04, 2011 at 08:17:26AM -0600, Anthony Liguori wrote:
> On 01/03/2011 04:01 AM, Avi Kivity wrote:
> >On 01/03/2011 11:46 AM, Jan Kiszka wrote:
> >>Hi,
> >>
> >>at least in kvm mode, the qemu_fair_mutex seems to have lost its
> >>function of balancing qemu_global_mutex access between the io-thread and
> >>vcpus. It's now only taken by the latter, isn't it?
> >>
> >>This and the fact that qemu-kvm does not use this kind of lock made me
> >>wonder what its role is and if it is still relevant in practice. I'd
> >>like to unify the execution models of qemu-kvm and qemu, and this lock
> >>is the most obvious difference (there are surely more subtle ones as
> >>well...).
> >>
> >
> >IIRC it was used for tcg, which has a problem that kvm doesn't
> >have: a tcg vcpu needs to hold qemu_mutex when it runs, which
> >means there will always be contention on qemu_mutex.  In the
> >absence of fairness, the tcg thread could dominate qemu_mutex and
> >starve the iothread.
> 
> No, it's actually the opposite IIRC.
> 
> TCG relies on the following behavior.   A guest VCPU runs until 1)
> it encounters a HLT instruction 2) an event occurs that forces the
> TCG execution to break.
> 
> (2) really means that the TCG thread receives a signal.  Usually,
> this is the periodic timer signal.
> 
> When the TCG thread, it needs to let the IO thread run for at least
> one iteration.  Coordinating the execution of the IO thread such
> that it's guaranteed to run at least once and then having it drop
> the qemu mutex long enough for the TCG thread to acquire it is the
> purpose of the qemu_fair_mutex.

It's the vcpu threads that starve the IO thread.


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-04 15:43           ` Anthony Liguori
@ 2011-01-05  8:55             ` Avi Kivity
  0 siblings, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2011-01-05  8:55 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Marcelo Tosatti, Jan Kiszka, qemu-devel, kvm

On 01/04/2011 05:43 PM, Anthony Liguori wrote:
>>> The fact that the iothread drops the global lock during sleep is a 
>>> detail that shouldn't affect correctness.  The IO thread is 
>>> absolutely allowed to run for arbitrary periods of time without 
>>> dropping the qemu mutex.
>>
>> No, it's not, since it will stop vcpus in their tracks.  Whenever we 
>> hold qemu_mutex for unbounded time, that's a bug.
>
>
> I'm not sure that designing the io thread to hold the lock for a 
> "bounded" amount of time is a good design point.  What is an accepted 
> amount of time for it to hold the lock?

Ultimately, zero.  It's ridiculous to talk about 64-vcpu guests or 
multiqueue virtio on one hand, and have everything serialize on a global 
lock on the other hand.

A reasonable amount of time would be (heavyweight_vmexit_time / 
nr_vcpu), which would ensure that the lock never dominates performance.  
I don't think it's achievable, probably the time to bounce the lock's 
cache line exceeds this.

I'd be happy with "a few microseconds" for now.

> Instead of the VCPU relying on the IO thread to eventually drop the 
> lock, it seems far superior to have the VCPU thread indicate to the IO 
> thread that it needs the lock.

I don't see why.  First, the iothread is not the lock hog, tcg is.  
Second, you can't usually break out of iothread tasks (unlike tcg).

> As of right now, the IO thread can indicate to the VCPU thread that it 
> needs the lock so having a symmetric interface seems obvious.  Of 
> course, you need to pick one to have more priority in case both 
> indicate they need to use the lock at the same exact time.

io and tcg are not symmetric.  If you let io have the higher priority, 
all io will complete and the iothread will go back to sleep.  If you let 
tcg have the higher priority, the guest will spin.

qemu-kvm works fine without any prioritization, since there are no lock 
hogs.

>
>>   I think the only place is live migration and perhaps tcg?
>
> qcow2 and anything else that puts the IO thread to sleep.

... while holding the lock.  All those are bugs; we should never ever 
sleep while holding the lock, since it converts an HPET read from 
something that is cpu bound into something that is io bound.

>>
>>>
>>>> I think the abstraction we need here is a priority lock, with 
>>>> higher priority given to the iothread.  A lock() operation that 
>>>> takes precedence would atomically signal the current owner to drop 
>>>> the lock.
>>>
>>> The I/O thread can reliably acquire the lock whenever it needs to.
>>>
>>> If you drop all of the qemu_fair_mutex stuff and leave the 
>>> qemu_mutex getting dropped around select, TCG will generally work 
>>> reliably.  But this is not race free. 
>>
>> What would be the impact of a race here?
>
> Racy is probably the wrong word.  To give a concrete example of why 
> one is better than the other, consider live migration.
>
> It would be reasonable to have a check in live migration to iterate 
> unless there was higher priority work.  If a VCPU thread needs to 
> acquire the mutex, that could be considered higher priority work.  If 
> you don't have an explicit hand off, it's not possible to implement 
> such logic.

Live migration need not hold the global lock while copying memory.  
Failing that, a priority lock would work (in order of increasing 
priority: tcg -> live migration -> kvm-vcpu -> iothread), but I don't 
think it's a good direction to pursue.  The Linux mantra is, if you have 
lock contention, don't improve the lock, improve the locking to remove 
the contention until you no longer understand the code.  It's a lot 
harder but playing with priorities is a dead end IMO.

>
>>> Just dropping a lock does not result in reliable hand off.
>>
>> Why do we want a handoff in the first place?
>>
>> I don't think we do.  I think we want the iothread to run in 
>> preference to tcg, since tcg is a lock hog under guest control, while 
>> the iothread is not a lock hog (excepting live migration).
>
> The io thread is a lock hog practically speaking.

It's not.  Give it the highest priority and it will drop the lock and 
sleep.  Give tcg the highest priority and it will hold the lock and spin.

-- 
error compiling committee.c: too many arguments to function


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-04 21:39     ` Marcelo Tosatti
@ 2011-01-05 16:44       ` Anthony Liguori
  2011-01-05 17:08         ` Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Anthony Liguori @ 2011-01-05 16:44 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Jan Kiszka, Avi Kivity, kvm, qemu-devel

On 01/04/2011 03:39 PM, Marcelo Tosatti wrote:
> On Tue, Jan 04, 2011 at 08:17:26AM -0600, Anthony Liguori wrote:
>    
>> On 01/03/2011 04:01 AM, Avi Kivity wrote:
>>      
>>> On 01/03/2011 11:46 AM, Jan Kiszka wrote:
>>>        
>>>> Hi,
>>>>
>>>> at least in kvm mode, the qemu_fair_mutex seems to have lost its
>>>> function of balancing qemu_global_mutex access between the io-thread and
>>>> vcpus. It's now only taken by the latter, isn't it?
>>>>
>>>> This and the fact that qemu-kvm does not use this kind of lock made me
>>>> wonder what its role is and if it is still relevant in practice. I'd
>>>> like to unify the execution models of qemu-kvm and qemu, and this lock
>>>> is the most obvious difference (there are surely more subtle ones as
>>>> well...).
>>>>
>>>>          
>>> IIRC it was used for tcg, which has a problem that kvm doesn't
>>> have: a tcg vcpu needs to hold qemu_mutex when it runs, which
>>> means there will always be contention on qemu_mutex.  In the
>>> absence of fairness, the tcg thread could dominate qemu_mutex and
>>> starve the iothread.
>>>        
>> No, it's actually the opposite IIRC.
>>
>> TCG relies on the following behavior.   A guest VCPU runs until 1)
>> it encounters a HLT instruction 2) an event occurs that forces the
>> TCG execution to break.
>>
>> (2) really means that the TCG thread receives a signal.  Usually,
>> this is the periodic timer signal.
>>
>> When the TCG thread, it needs to let the IO thread run for at least
>> one iteration.  Coordinating the execution of the IO thread such
>> that it's guaranteed to run at least once and then having it drop
>> the qemu mutex long enough for the TCG thread to acquire it is the
>> purpose of the qemu_fair_mutex.
>>      
> Its the vcpu threads that starve the IO thread.
>    

I'm not sure if this is a difference in semantics or if we're not 
understanding each other.

With TCG, the VCPU thread will dominate the qemu_mutex and cause the IO 
thread to contend heavily on it.

But the IO thread can always force TCG to exit its loop (and does so 
when leaving select()).  So the TCG thread may keep the IO thread 
hungry, but it never "starves" it.

OTOH, the TCG thread struggles to hand over execution to the IO thread 
while making sure that it gets back the qemu_mutex in a timely fashion.  
That's the tricky part.  Avi's point is that by giving up the lock at 
select time, we prevent starvation but my concern is that because the 
time between select intervals is unbounded (and potentially very, very 
long), it's effectively starvation.

Regards,

Anthony Liguori


* [Qemu-devel] Re: Role of qemu_fair_mutex
  2011-01-05 16:44       ` Anthony Liguori
@ 2011-01-05 17:08         ` Avi Kivity
  0 siblings, 0 replies; 13+ messages in thread
From: Avi Kivity @ 2011-01-05 17:08 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Marcelo Tosatti, Jan Kiszka, qemu-devel, kvm

On 01/05/2011 06:44 PM, Anthony Liguori wrote:
> On 01/04/2011 03:39 PM, Marcelo Tosatti wrote:
>> On Tue, Jan 04, 2011 at 08:17:26AM -0600, Anthony Liguori wrote:
>>> On 01/03/2011 04:01 AM, Avi Kivity wrote:
>>>> On 01/03/2011 11:46 AM, Jan Kiszka wrote:
>>>>> Hi,
>>>>>
>>>>> at least in kvm mode, the qemu_fair_mutex seems to have lost its
>>>>> function of balancing qemu_global_mutex access between the 
>>>>> io-thread and
>>>>> vcpus. It's now only taken by the latter, isn't it?
>>>>>
>>>>> This and the fact that qemu-kvm does not use this kind of lock 
>>>>> made me
>>>>> wonder what its role is and if it is still relevant in practice. I'd
>>>>> like to unify the execution models of qemu-kvm and qemu, and this 
>>>>> lock
>>>>> is the most obvious difference (there are surely more subtle ones as
>>>>> well...).
>>>>>
>>>> IIRC it was used for tcg, which has a problem that kvm doesn't
>>>> have: a tcg vcpu needs to hold qemu_mutex when it runs, which
>>>> means there will always be contention on qemu_mutex.  In the
>>>> absence of fairness, the tcg thread could dominate qemu_mutex and
>>>> starve the iothread.
>>> No, it's actually the opposite IIRC.
>>>
>>> TCG relies on the following behavior.   A guest VCPU runs until 1)
>>> it encounters a HLT instruction 2) an event occurs that forces the
>>> TCG execution to break.
>>>
>>> (2) really means that the TCG thread receives a signal.  Usually,
>>> this is the periodic timer signal.
>>>
>>> When the TCG thread, it needs to let the IO thread run for at least
>>> one iteration.  Coordinating the execution of the IO thread such
>>> that it's guaranteed to run at least once and then having it drop
>>> the qemu mutex long enough for the TCG thread to acquire it is the
>>> purpose of the qemu_fair_mutex.
>> Its the vcpu threads that starve the IO thread.
>
> I'm not sure if this is a difference in semantics or if we're not 
> understanding each other.

I think, the latter.

>
> With TCG, the VCPU thread will dominate the qemu_mutex and cause the 
> IO thread to contend heavily on it.
>
> But the IO thread can always force TCG to exit it's loop (and does so 
> when leaving select()).  So the TCG thread make keep the IO thread 
> hungry, but it never "starves" it.

With a pure qemu_mutex_acquire(), tcg does starve out the iothread.  
SIG_IPI/cpu_interrupt and qemu_fair_mutex were introduced to solve this 
starvation; kvm doesn't require them.

>
> OTOH, the TCG thread struggles to hand over execution to the IO thread 
> while making sure that it gets back the qemu_mutex in a timely 
> fashion.  That's the tricky part.  Avi's point is that by giving up 
> the lock at select time, we prevent starvation but my concern is that 
> because the time between select intervals is unbounded (and 
> potentially very, very lock), it's effectively starvation.

It isn't starvation, since the iothread will eventually drain its work.

Suppose we do hand over to tcg while the iothread still has pending 
work.  What now? tcg will not drop the lock voluntarily.  When will the 
iothread complete its work?

Do we immediately interrupt tcg again?  If so, why did we give it the lock?
Do we sleep for a while and then reacquire the lock?  For how long?  
AFAWCT, tcg may be spinning waiting for a completion.

There's simply no scope for an iothread->tcg handoff.  The situation is 
not symmetric, it's more a client/server relationship.

Zooming out for a bit, let's see what our options are:

- the current qemu_fair_mutex/SIG_IPI thing
- a priority lock, which simply encapsulates the current 
qemu_fair_mutex.  tcg is made to drop the lock whenever anyone else 
attempts to acquire it.  No change in behaviour, just coding.
- make tcg take the qemu lock only in helper code; make sure we only do 
tcg things in the tcg thread (like playing with the tlb).  No need for 
special locking, but will reduce tcg throughput somewhat, my estimation 
is measurably but not significantly (see the sketch below).
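
A rough sketch of that third option: guest code runs without the global
lock, and only helpers that touch shared device state take it.  Only
qemu_mutex_lock_iothread()/qemu_mutex_unlock_iothread() are real
interfaces here; the other names are invented placeholders.

#include <stdint.h>

static uint32_t device_mmio_read(uint32_t addr) { (void)addr; return 0; }

static uint32_t helper_io_read_sketch(uint32_t addr)
{
    uint32_t val;

    qemu_mutex_lock_iothread();      /* serialize against the iothread */
    val = device_mmio_read(addr);    /* shared device state, under the lock */
    qemu_mutex_unlock_iothread();
    return val;
}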

-- 
error compiling committee.c: too many arguments to function

