qemu-devel.nongnu.org archive mirror
From: Avi Kivity <avi@redhat.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Liu Ping Fan <pingfank@linux.vnet.ibm.com>,
	Stefan Hajnoczi <stefanha@gmail.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	liu ping fan <qemulist@gmail.com>,
	Anthony Liguori <anthony@codemonkey.ws>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [patch v4 12/16] e1000: apply fine lock on e1000
Date: Thu, 25 Oct 2012 19:02:01 +0200
Message-ID: <50897089.50304@redhat.com>
In-Reply-To: <50896B51.3040708@siemens.com>

On 10/25/2012 06:39 PM, Jan Kiszka wrote:
>> 
>> That doesn't work cross-thread.
>> 
>> vcpu A: write to device X, dma-ing to device Y
>> vcpu B: write to device Y, dma-ing to device X
> 
> We will deny DMA-ing from device X on behalf of a VCPU, i.e. in dispatch
> context, to Y.
> 
> What we do not deny, though, is DMA-ing from an I/O thread that
> processes an event for device X. 

I would really like to avoid depending on the context.  In real hardware, there is no such thing.

> If the invoked callback of device X
> holds the device lock across some DMA request to Y, then we risk running
> into the same ABBA issue. Hmm...

Yup.
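
To make the ABBA ordering concrete, here is a small single-threaded replay in Python. It is only a sketch: lock_x and lock_y stand in for the per-device locks of X and Y, and nothing here is a QEMU API. A non-blocking acquire is used so the example reports the deadlock instead of hanging.

```python
import threading

def replay_abba():
    lock_x = threading.Lock()   # stand-in for device X's lock
    lock_y = threading.Lock()   # stand-in for device Y's lock

    # Context A enters device X's handler, context B enters device Y's.
    lock_x.acquire()
    lock_y.acquire()

    # Each handler now DMAs into the other device while still holding
    # its own lock. A blocking acquire here would never return, so we
    # probe with a non-blocking attempt instead.
    a_gets_y = lock_y.acquire(blocking=False)
    b_gets_x = lock_x.acquire(blocking=False)

    lock_x.release()
    lock_y.release()
    return a_gets_y, b_gets_x
```

Calling replay_abba() returns (False, False): neither context can make progress, whether it is a vcpu or an I/O thread that holds the first lock.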

> 
>> 
>> My suggestion was to drop the locks around DMA, then re-acquire the lock
>> and re-validate data.
> 
> Maybe possible, but hairy depending on the device model.

It's unpleasant, yes.

Note that, depending on the device, we may not need to re-validate data; it may be sufficient to load it into local variables so we know it is consistent at some point.  But all those solutions suffer from requiring device model authors to understand all those issues, rather than just adding a simple lock around access to their data structures.
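
A minimal Python sketch of the "snapshot, drop the lock, DMA, re-acquire, revalidate" pattern follows. The class, the ring_base register and the generation counter are all hypothetical and chosen only for illustration; a real device model would have to decide for itself what needs revalidating.

```python
import threading

class SnapshotDevice:
    """Hypothetical device model; none of these names come from QEMU."""

    def __init__(self):
        self.lock = threading.Lock()
        self.ring_base = 0x1000   # guest-programmable register
        self.generation = 0       # bumped on every register write
        self.completed = []

    def mmio_write_ring_base(self, value):
        with self.lock:
            self.ring_base = value
            self.generation += 1

    def process_event(self, dma):
        # 1. Snapshot what we need while holding the device lock.
        with self.lock:
            base = self.ring_base
            gen = self.generation
        # 2. Do the DMA with no device lock held, so a DMA that lands
        #    on another device's MMIO cannot form an ABBA cycle.
        data = dma(base)
        # 3. Re-acquire and revalidate before committing the result.
        with self.lock:
            if gen != self.generation:
                return False      # state changed underneath us
            self.completed.append((base, data))
            return True
```

If the dma callback itself ends up writing ring_base (as a nested MMIO access might), process_event() returns False instead of deadlocking, and the caller has to retry with fresh state. That retry logic is exactly the burden on device model authors mentioned above.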

>>>>> I see that we have an all-or-nothing problem here: to address this
>>>>> properly, we need to convert the IRQ path to lock-less (or at least
>>>>> compatible with holding per-device locks) as well.
>>>>
>>>> There is a transitional path where writing to a register that can cause
>>>> IRQ changes takes both the big lock and the local lock.
>>>>
>>>> Eventually, though, of course all inner subsystems must be threaded for
>>>> this work to have value.
>>>>
>>>
>>> But that transitional path must not introduce regressions. Opening a
>>> race window between IRQ cause update and event injection is such a
>>> thing, just like dropping concurrent requests on the floor.
>> 
>> Can you explain the race?
> 
> Context A				Context B
> 
> device.lock
> ...
> device.interrupt_cause = 0
> lower_irq = true
> ...
> device.unlock
> 					device.lock
> 					...
> 					device.interrupt_cause = 42
> 					raise_irq = true
> 					...
> 					device.unlock
> 					if (raise_irq)
> 						bql.lock
> 						set_irq(device.irqno)
> 						bql.unlock
> if (lower_irq)
> 	bql.lock
> 	clear_irq(device.irqno)
> 	bql.unlock
> 
> 
> And there it goes, our interrupt event.
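
The interleaving quoted above can be replayed deterministically in a single thread. The following Python sketch does that; Device and IrqLine are hypothetical stand-ins, and the locks are indicated only in comments because the order of operations is fixed by hand.

```python
class IrqLine:
    """Hypothetical interrupt line, normally updated under the bql."""

    def __init__(self):
        self.level = 0

    def set_irq(self):
        self.level = 1

    def clear_irq(self):
        self.level = 0


class Device:
    def __init__(self):
        self.interrupt_cause = 0


def replay_lost_interrupt():
    dev, line = Device(), IrqLine()

    # Context A, under device.lock: clear the cause, decide to lower.
    dev.interrupt_cause = 0
    lower_irq = True

    # Context A drops device.lock; context B takes it and sets a cause.
    dev.interrupt_cause = 42
    raise_irq = True

    # Context B drops device.lock and reaches the bql first.
    if raise_irq:
        line.set_irq()

    # Context A reaches the bql last and acts on its stale decision.
    if lower_irq:
        line.clear_irq()

    return dev.interrupt_cause, line.level
```

replay_lost_interrupt() returns (42, 0): the device still reports a pending cause, but the line is low, so the guest never sees the interrupt.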

Obviously you'll need to reacquire the device lock after taking bql and revalidate stuff.  But that is not what I am suggesting.  Instead, any path that can lead to an irq update (or timer update etc) will take both the bql and the device lock.  After the first pass, this will leave only side-effect-free register reads and writes running without the bql, which is silly if we keep it that way, but we will follow with a threaded timer and irq subsystem and we'll peel away those big locks.

  device_mmio_write:
    if register is involved in irq or timers or block layer or really anything that matters:
      bql.acquire
    device.lock.acquire
    do stuff
    device.lock.release
    if that big condition from above was true:
      bql.release

-- 
error compiling committee.c: too many arguments to function

