From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: Jan Kiszka <jan.kiszka@web.de>,
qemu-devel <qemu-devel@nongnu.org>, kvm <kvm@vger.kernel.org>,
Marcelo Tosatti <mtosatti@redhat.com>
Subject: Re: Role of qemu_fair_mutex
Date: Tue, 04 Jan 2011 09:43:20 -0600
Message-ID: <4D234018.2010102@codemonkey.ws>
In-Reply-To: <4D2338CB.7020900@redhat.com>
On 01/04/2011 09:12 AM, Avi Kivity wrote:
> On 01/04/2011 04:55 PM, Anthony Liguori wrote:
>>
>>>>
>>>> When the TCG thread is running, it needs to let the IO thread run
>>>> for at least one iteration. Coordinating the execution of the IO
>>>> thread such that it's guaranteed to run at least once, and then
>>>> having it drop the qemu mutex long enough for the TCG thread to
>>>> acquire it, is the purpose of the qemu_fair_mutex.
>>>
>>> That doesn't compute - the iothread doesn't hog the global lock (it
>>> sleeps most of the time, and drops the lock while sleeping), so the
>>> iothread cannot starve out tcg.
>>
>> The fact that the iothread drops the global lock during sleep is a
>> detail that shouldn't affect correctness. The IO thread is
>> absolutely allowed to run for arbitrary periods of time without
>> dropping the qemu mutex.
>
> No, it's not, since it will stop vcpus in their tracks. Whenever we
> hold qemu_mutex for unbounded time, that's a bug.
I'm not sure that designing the io thread to hold the lock for a
"bounded" amount of time is a good design point. What is an accepted
amount of time for it to hold the lock?
Instead of the VCPU relying on the IO thread to eventually drop the
lock, it seems far superior to have the VCPU thread indicate to the IO
thread that it needs the lock.
As of right now, the IO thread can already indicate to the VCPU thread
that it needs the lock, so having a symmetric interface seems obvious.
Of course, you need to give one side priority in case both indicate
they need the lock at exactly the same time.
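The symmetric hand-off described above can be sketched as a small
wrapper around a plain mutex, where a waiter announces its need before
blocking and the holder polls for contention at safe points. This is
only an illustration; HandoffLock and contended() are invented names,
not QEMU API:

```python
import threading

class HandoffLock:
    """Sketch of a lock with explicit hand-off: a waiter announces that
    it needs the lock, and the current holder can poll that flag at safe
    points and yield deliberately instead of holding on indefinitely."""

    def __init__(self):
        self._lock = threading.Lock()
        self._meta = threading.Lock()
        self._waiters = 0            # threads that have announced a need

    def acquire(self):
        with self._meta:
            self._waiters += 1       # announce intent before blocking
        self._lock.acquire()
        with self._meta:
            self._waiters -= 1       # we are no longer waiting

    def release(self):
        self._lock.release()

    def contended(self):
        """Holder polls this at safe points to decide whether to yield."""
        with self._meta:
            return self._waiters > 0
```

A holder such as the IO thread would check contended() between
iterations of its loop and drop the lock when it returns true.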
> I think the only place is live migration and perhaps tcg?
qcow2 and anything else that puts the IO thread to sleep.
>>> On the other hand, tcg does hog the global lock, so it needs to be
>>> made to give it up so the iothread can run, as in my completion
>>> example.
>>
>> It's very easy to ask TCG to give up the qemu_mutex by using
>> cpu_interrupt(). It will drop the qemu_mutex and it will not attempt
>> to acquire it again until you restart the VCPU.
>
> Maybe that's the solution:
>
> def acquire_global_mutex():
>     if not tcg_thread:
>         cpu_interrupt()
It's not quite as direct as this at the moment, but it's not a bad
idea either. Right now we just send a SIG_IPI, but cpu_interrupt()
would be better.
>     global_mutex.acquire()
>
> def release_global_mutex():
>     global_mutex.release()
>     if not tcg_thread:
>         cpu_resume()
>
> though it's racy, two non-tcg threads can cause an early resume.
>
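One way to close the early-resume race noted in the quote (two non-tcg
threads causing an early resume) is to count non-TCG threads and only
interrupt on the first arrival and resume on the last departure. The
following is a sketch only, with cpu_interrupt() and cpu_resume()
replaced by recording stubs rather than QEMU's real functions:

```python
import threading

interrupt_calls = []
resume_calls = []

def cpu_interrupt():
    interrupt_calls.append(1)    # stub: would kick TCG off the lock

def cpu_resume():
    resume_calls.append(1)       # stub: would let the vcpu run again

global_mutex = threading.Lock()
count_lock = threading.Lock()
non_tcg = 0                      # non-TCG threads inside acquire/release

def acquire_global_mutex(tcg_thread=False):
    global non_tcg
    if not tcg_thread:
        with count_lock:
            non_tcg += 1
            if non_tcg == 1:
                cpu_interrupt()  # only the first waiter kicks TCG
    global_mutex.acquire()

def release_global_mutex(tcg_thread=False):
    global non_tcg
    global_mutex.release()
    if not tcg_thread:
        with count_lock:
            non_tcg -= 1
            if non_tcg == 0:
                cpu_resume()     # resume only when the last one leaves
```

With the counter, a second non-tcg thread arriving while the first is
still inside the pair neither re-interrupts nor resumes early.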
>>
>>> I think the abstraction we need here is a priority lock, with higher
>>> priority given to the iothread. A lock() operation that takes
>>> precedence would atomically signal the current owner to drop the lock.
>>
>> The I/O thread can reliably acquire the lock whenever it needs to.
>>
>> If you drop all of the qemu_fair_mutex stuff and leave the qemu_mutex
>> getting dropped around select, TCG will generally work reliably. But
>> this is not race free.
>
> What would be the impact of a race here?
Racy is probably the wrong word. To give a concrete example of why one
approach is better than the other, consider live migration.
It would be reasonable to have a check in live migration that keeps
iterating unless there is higher-priority work. If a VCPU thread needs
to acquire the mutex, that could be considered higher-priority work.
If you don't have an explicit hand-off, it's not possible to implement
such logic.
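That migration check might look like the following sketch;
MigrationState, vcpu_waiting, and migrate_one_chunk() are all invented
names for illustration, not QEMU code:

```python
class MigrationState:
    """Toy stand-in for live-migration state."""
    def __init__(self, chunks):
        self.remaining = chunks
        self.vcpu_waiting = False    # set by a vcpu's lock request

    def migrate_one_chunk(self):
        self.remaining -= 1          # pretend to send one chunk of RAM

def migration_iterate(state):
    """Return True when done, False when yielding to higher-priority
    work (a vcpu waiting on the lock); the caller reschedules us."""
    while state.remaining > 0:
        if state.vcpu_waiting:       # explicit hand-off: defer to vcpu
            return False
        state.migrate_one_chunk()
    return True
```

Without some explicit signal like vcpu_waiting, the loop has no way to
know it should stop iterating and drop the lock.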
>> Just dropping a lock does not result in reliable hand off.
>
> Why do we want a handoff in the first place?
>
> I don't think we do. I think we want the iothread to run in
> preference to tcg, since tcg is a lock hog under guest control, while
> the iothread is not a lock hog (excepting live migration).
Practically speaking, the IO thread is a lock hog.
Regards,
Anthony Liguori
>>
>> I think a generational counter could work, and a condition variable
>> could work.
>
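The generational-counter-plus-condition idea quoted above could be
sketched as follows; the Generation class and its method names are
hypothetical, invented here for illustration:

```python
import threading

class Generation:
    """Sketch: the IO thread bumps a generation each time it completes
    a full iteration; TCG records the generation it saw and waits until
    at least one complete iteration has happened after that point."""

    def __init__(self):
        self._gen = 0
        self._cond = threading.Condition()

    def bump(self):
        # IO thread: one full iteration has completed.
        with self._cond:
            self._gen += 1
            self._cond.notify_all()

    def wait_past(self, seen):
        # TCG thread: block until the counter moves past `seen`.
        with self._cond:
            while self._gen <= seen:
                self._cond.wait()

    def current(self):
        with self._cond:
            return self._gen
```

This gives TCG the "IO thread ran at least once" guarantee without any
special fairness logic in the mutex itself.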