From: Jason Wang <jasowang@redhat.com>
To: Li Qiang <liq3ea@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	Dmitry Fleytman <dmitry.fleytman@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>, Li Qiang <liq3ea@163.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	Alexander Bulekov <alxndr@bu.edu>,
	Gerd Hoffmann <kraxel@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [RFC 0/3] try to solve the DMA to MMIO issue
Date: Thu, 3 Sep 2020 14:16:02 +0800
Message-ID: <b6c01013-7b26-5da2-ba8b-401ff3e58256@redhat.com>
In-Reply-To: <CAKXe6SLB280LqvMzfMbRvMNff6Yt21unVpCs7TygF_bSPCTe8w@mail.gmail.com>


On 2020/9/3 12:50 PM, Li Qiang wrote:
> Jason Wang <jasowang@redhat.com> wrote on Thu, Sep 3, 2020 at 12:24 PM:
>>
>>> On 2020/9/3 12:06 PM, Alexander Bulekov wrote:
>>> On 200903 1154, Jason Wang wrote:
>>>> On 2020/9/3 12:22 AM, Li Qiang wrote:
>>>>> The QEMU device fuzzer has found several DMA-to-MMIO issues.
>>>>> They occur when the guest driver programs a DMA address that
>>>>> points at MMIO: the device's MMIO handler triggers the DMA, and
>>>>> since the DMA target is MMIO this triggers another dispatch and
>>>>> reenters the MMIO handler. However, most devices are not
>>>>> reentrant.
>>>>>
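>>>>> As an illustration, a minimal sketch of the pattern (MyDevState,
>>>>> REG_DMA_ADDR and REG_DMA_KICK are invented names, not from the
>>>>> patches):
>>>>>
>>>>>     /* MMIO write handler of a hypothetical DMA-capable PCI device */
>>>>>     static void mydev_mmio_write(void *opaque, hwaddr addr,
>>>>>                                  uint64_t val, unsigned size)
>>>>>     {
>>>>>         MyDevState *s = opaque;
>>>>>
>>>>>         switch (addr) {
>>>>>         case REG_DMA_ADDR:    /* guest programs the DMA address */
>>>>>             s->dma_addr = val;
>>>>>             break;
>>>>>         case REG_DMA_KICK:    /* guest starts the transfer */
>>>>>             /*
>>>>>              * If s->dma_addr points into this device's own MMIO
>>>>>              * BAR, this call dispatches back into
>>>>>              * mydev_mmio_write() before it returns.
>>>>>              */
>>>>>             pci_dma_write(PCI_DEVICE(s), s->dma_addr,
>>>>>                           s->buf, sizeof(s->buf));
>>>>>             break;
>>>>>         }
>>>>>     }
>>>>>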
>>>>> What DMA to MMIO causes depends on the device emulator; mostly
>>>>> it crashes QEMU. The following are three classic DMA-to-MMIO
>>>>> issues:
>>>>>
>>>>> e1000e: https://bugs.launchpad.net/qemu/+bug/1886362
>>>>> xhci: https://bugs.launchpad.net/qemu/+bug/1891354
>>>>> virtio-gpu: https://bugs.launchpad.net/qemu/+bug/1888606
>>>>>
>>>>> I think the DMA-to-MMIO issues can be classified as follows:
>>>>> 1. the device DMAs to itself
>>>>> 2. device A DMAs to device B, which DMAs to device C
>>>>> 3. device A DMAs to device B, which DMAs back to device A
>>>>>
>>>>> The first case of course should not be allowed.
>>>>> The second case I think is OK, since a device's IO handler makes
>>>>> no assumption about where the IO data comes from, whether from a
>>>>> driver or from another device; this is what P2P DMA relies on.
>>>>> The third case I think should also not be allowed.
>>>>>
>>>>> So our issue reduces to one case: do not allow a device's IO
>>>>> handler to reenter itself.
>>>>>
>>>>> Paolo suggested that we could refactor the device emulation with
>>>>> BHs, but that is a lot of work.
>>>>> I have considered several proposals to address this, and have
>>>>> also discussed them with Jason Wang in private email.
>>>>>
>>>>> I could solve this issue either in the core framework or in each
>>>>> specific device. After trying several methods I chose to address
>>>>> it per-device, for the following reasons:
>>>>> 1. If we address it in the core framework, we have to record and
>>>>> check the device or MR info in the MR dispatch write function.
>>>>> Unfortunately we have no such info in the core framework.
>>>>> 2. The performance would also decrease considerably.
>>>>> 3. Only the device itself knows its IO.
>>>> I think we still need to seek a way to address this issue completely.
>>>>
>>>> How about adding a flag in MemoryRegionOps and detecting the
>>>> reentrancy through that flag?
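>>>>
>>>> Roughly like this (the flag and the depth counter are hypothetical,
>>>> not existing API):
>>>>
>>>>     struct MemoryRegionOps {
>>>>         /* ... existing callbacks ... */
>>>>         bool disallow_reentrancy;   /* hypothetical new flag */
>>>>     };
>>>>
>>>>     /* in the dispatch path, before invoking ops->write() */
>>>>     if (mr->ops->disallow_reentrancy && mr->dispatch_depth > 0) {
>>>>         qemu_log_mask(LOG_GUEST_ERROR,
>>>>                       "blocked reentrant write to %s\n",
>>>>                       memory_region_name(mr));
>>>>         return MEMTX_ERROR;
>>>>     }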
>>> What happens for devices with multiple MemoryRegions? Make all the
>>> MemoryRegionOps share the same flag?
>>
>> I think there could be two approaches:
>>
>> 1) record the device in MR as Qiang mentioned
> I have tried this as we discussed, but have the following concerns:
> 1. Performance: we need to check/record/clean the MR in an
> array/hashtable.
>
> 2. Handling multiple MRs and alias MRs in the memory layer is
> complicated and costly.
> So if we leave the MR issue to the device itself, it is just what
> this patch does: let the device address the reentrancy issue.
>
> Another solution: connect an MR with its corresponding device. A
> device already ties itself to its MRs through the 'opaque' field
> used when the MR callbacks are invoked. We then add a flag in the
> device, which requires modifying the MR registration interface.
>
> So in the memory layer we can check/record/clean MR->device->flag.
> But this cannot address the DMA-(in-BH)-to-MMIO issue, as the BH
> runs in the main thread.
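>
> Roughly (the back-pointer and the flag are invented for illustration):
>
>     /* memory layer, before invoking the MR callback */
>     DeviceState *dev = mr->dev;       /* hypothetical MR->device link */
>     if (dev && dev->io_in_progress) {
>         return MEMTX_ERROR;           /* reject the reentrant access */
>     }
>     if (dev) {
>         dev->io_in_progress = true;
>     }
>     /* ... dispatch to mr->ops via the 'opaque' pointer ... */
>     if (dev) {
>         dev->io_in_progress = false;
>     }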


This is probably good enough to start. From my point of view, there are
two different issues:

1) re-entrant MMIO handlers
2) MMIO handlers synchronizing with BHs

For 1), we'd better solve it in the core; 2) can only be solved in the
device.

Thanks


>
> Thanks,
> Li Qiang
>
>
>
>> 2) only forbid reentrancy in the MMIO handler and depend on the device
>> to solve the multiple-MemoryRegion issue: if the regions want to access
>> the same data, access must be synchronized internally
>>
>> But the point is still to try to solve it at the memory-region layer;
>> otherwise we may still hit similar issues.
>>
>>
>>> What about the virtio-gpu bug, where the problem happens in a bh->mmio
>>> access rather than an mmio->mmio access?
>>
>> Yes, it needs more thought, but as a first step we can try to fix the
>> MMIO handler issue and do the BH fix on top.
>
>
>> Thanks
>>
>>
>>> -Alex
>>>
>>>> Thanks
>>>>
>>>>
>>>>> Most device emulation is protected by the BQL, so only one piece
>>>>> of device emulation code can run at a time. We can add a flag to
>>>>> indicate that IO is running; the first two patches do this. For
>>>>> simplicity at the RFC stage I just set it on entering the IO
>>>>> callback and clear it on exiting. It should really be
>>>>> checked/set/cleared according to each device's IO emulation.
>>>>> The second issue itself suffers from a race condition, so there I
>>>>> use an atomic.
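>>>>>
>>>>> A sketch of that guard (mydev_mmio_write, mydev_handle_io and
>>>>> io_in_progress are illustrative names, not the exact code in the
>>>>> patches):
>>>>>
>>>>>     static void mydev_mmio_write(void *opaque, hwaddr addr,
>>>>>                                  uint64_t val, unsigned size)
>>>>>     {
>>>>>         MyDevState *s = opaque;
>>>>>
>>>>>         /* BQL held: a nested dispatch reenters here synchronously */
>>>>>         if (s->io_in_progress) {
>>>>>             return;               /* drop the reentrant access */
>>>>>         }
>>>>>         s->io_in_progress = true;
>>>>>         mydev_handle_io(s, addr, val, size);    /* the real work */
>>>>>         s->io_in_progress = false;
>>>>>     }
>>>>>
>>>>> Where a BH can race with the MMIO handler, the flag needs an atomic
>>>>> test-and-set rather than a plain bool.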
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Li Qiang (3):
>>>>>      e1000e: make the IO handler reentrant
>>>>>      xhci: make the IO handler reentrant
>>>>>      virtio-gpu: make the IO handler reentrant
>>>>>
>>>>>     hw/display/virtio-gpu.c        | 10 ++++++
>>>>>     hw/net/e1000e.c                | 35 +++++++++++++++++++-
>>>>>     hw/usb/hcd-xhci.c              | 60 ++++++++++++++++++++++++++++++++++
>>>>>     hw/usb/hcd-xhci.h              |  1 +
>>>>>     include/hw/virtio/virtio-gpu.h |  1 +
>>>>>     5 files changed, 106 insertions(+), 1 deletion(-)
>>>>>



