From: Jason Wang <jasowang@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Fam Zheng <famz@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org,
	Alex Williamson <alex.williamson@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jintack Lim <jintack@cs.columbia.edu>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH 03/10] intel-iommu: add iommu lock
Date: Sat, 28 Apr 2018 11:11:53 +0800
Message-ID: <cd0a28a9-2082-9c77-0b72-f2451764bd9e@redhat.com>
In-Reply-To: <20180428030601.GJ13269@xz-mi>



On 2018-04-28 11:06, Peter Xu wrote:
> On Sat, Apr 28, 2018 at 10:42:11AM +0800, Jason Wang wrote:
>>
>> On 2018-04-28 10:24, Peter Xu wrote:
>>> On Sat, Apr 28, 2018 at 09:43:54AM +0800, Jason Wang wrote:
>>>> On 2018-04-27 14:26, Peter Xu wrote:
>>>>> On Fri, Apr 27, 2018 at 01:13:02PM +0800, Jason Wang wrote:
>>>>>> On 2018-04-25 12:51, Peter Xu wrote:
>>>>>>> Add a per-iommu big lock to protect IOMMU status.  Currently the only
>>>>>>> thing to be protected is the IOTLB cache, since that can be accessed
>>>>>>> even without BQL, e.g., in IO dataplane.
>>>>>>>
>>>>>>> Note that device page tables should not need any protection.  The safety
>>>>>>> of that should be provided by guest OS.  E.g., when a page entry is
>>>>>>> freed, the guest OS should be responsible to make sure that no device
>>>>>>> will be using that page any more.
>>>>>>>
>>>>>>> Reported-by: Fam Zheng<famz@redhat.com>
>>>>>>> Signed-off-by: Peter Xu<peterx@redhat.com>
>>>>>>> ---
>>>>>>>      include/hw/i386/intel_iommu.h |  8 ++++++++
>>>>>>>      hw/i386/intel_iommu.c         | 31 +++++++++++++++++++++++++++++--
>>>>>>>      2 files changed, 37 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
>>>>>>> index 220697253f..1a8ba8e415 100644
>>>>>>> --- a/include/hw/i386/intel_iommu.h
>>>>>>> +++ b/include/hw/i386/intel_iommu.h
>>>>>>> @@ -262,6 +262,14 @@ struct IntelIOMMUState {
>>>>>>>          uint8_t w1cmask[DMAR_REG_SIZE]; /* RW1C(Write 1 to Clear) bytes */
>>>>>>>          uint8_t womask[DMAR_REG_SIZE];  /* WO (write only - read returns 0) */
>>>>>>>          uint32_t version;
>>>>>>> +    /*
>>>>>>> +     * Protects IOMMU states in general.  Normally we don't need to
>>>>>>> +     * take this lock when we are with BQL held.  However we have code
>>>>>>> +     * paths that may run even without BQL.  In those cases, we need
>>>>>>> +     * to take the lock when we have access to IOMMU state
>>>>>>> +     * informations, e.g., the IOTLB.
>>>>>>> +     */
>>>>>>> +    QemuMutex iommu_lock;
>>>>>> Some questions:
>>>>>>
>>>>>> 1) Do we need to protect context cache too?
>>>>> IMHO the context cache entry should work even without a lock.  That's a
>>>>> bit tricky since there are two cases in which this cache is updated:
>>>>>
>>>>>      (1) first translation of the address space of a device
>>>>>      (2) invalidation of context entries
>>>>>
>>>>> For (2) IMHO we don't need to worry, since the guest OS should be
>>>>> controlling that part; that is, a device should not be doing any
>>>>> translation (DMA operations) when the context entry is invalidated.
>>>>>
>>>>> For (1) the worst case is that the context entry cache is updated
>>>>> multiple times with the same value by multiple threads.  IMHO that'll
>>>>> be fine too.
>>>>>
>>>>> But yes for sure we can protect that too with the iommu lock.
>>>>>
>>>>>> 2) Can we just reuse qemu BQL here?
>>>>> I would prefer not.  As I mentioned, at least I have spent too much
>>>>> time on fighting BQL already.  I really hope we can start to use
>>>>> isolated locks when capable.  BQL is always the worst choice to me.
>>>> Just a thought, using BQL may greatly simplify the code actually (consider
>>>> we don't plan to remove BQL now).
>>> Frankly speaking I don't understand why using BQL may greatly simplify
>>> the code... :( IMHO the lock here is really not a complicated one.
>>>
>>> Note that IMO BQL is mostly helpful when we really want something to
>>> be run sequentially with some other things _already_ protected by BQL.
>> Except for the translate path from dataplane, I believe all other code is
>> already protected by BQL.
>>
>>> In this case, all the stuff is inside VT-d code itself (or other
>>> IOMMUs), why bother taking the BQL to make our life harder?
>> It looks to me it was as simple as:
>>
>> @@ -494,6 +494,7 @@ static MemoryRegionSection
>> flatview_do_translate(FlatView *fv,
>>       IOMMUMemoryRegionClass *imrc;
>>       hwaddr page_mask = (hwaddr)(-1);
>>       hwaddr plen = (hwaddr)(-1);
>> +    bool locked = false;
>>
>>       if (plen_out) {
>>           plen = *plen_out;
>> @@ -510,8 +511,15 @@ static MemoryRegionSection
>> flatview_do_translate(FlatView *fv,
>>           }
>>           imrc = memory_region_get_iommu_class_nocheck(iommu_mr);
>>
>> +        if (!qemu_mutex_iothread_locked()) {
>> +            locked = true;
>> +            qemu_mutex_lock_iothread();
>> +        }
>>           iotlb = imrc->translate(iommu_mr, addr, is_write ?
>>                                   IOMMU_WO : IOMMU_RO);
>> +        if (locked) {
>> +            qemu_mutex_unlock_iothread();
>> +        }
>>           addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
>>                   | (addr & iotlb.addr_mask));
>>           page_mask &= iotlb.addr_mask;
> We'll need to add the flags thing too.  How do we flag-out existing
> thread-safe IOMMUs?

We can let thread-safe IOMMU code choose to set a flag somewhere.
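
To make the flag idea concrete, here is a rough standalone sketch. It is not QEMU code: the pthread mutex only stands in for the BQL, big_lock_held stands in for qemu_mutex_iothread_locked(), and DemoIOMMU, thread_safe and generic_translate() are hypothetical names made up for illustration. The generic path takes the big lock only when the IOMMU has not declared itself thread safe and the caller does not already hold the lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the BQL; this is NOT QEMU's real lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
/* Per-thread "do I hold the big lock?" state (stand-in for
 * qemu_mutex_iothread_locked()). */
static _Thread_local bool big_lock_held;

typedef struct DemoIOMMU DemoIOMMU;
struct DemoIOMMU {
    bool thread_safe;               /* hypothetical opt-out flag */
    uint64_t (*translate)(DemoIOMMU *iommu, uint64_t addr);
    unsigned calls_with_big_lock;   /* bookkeeping for the demo only */
};

/* Toy translate callback: records whether the big lock was held. */
static uint64_t demo_translate(DemoIOMMU *iommu, uint64_t addr)
{
    if (big_lock_held) {
        iommu->calls_with_big_lock++;
    }
    return addr + 0x1000;
}

/* Generic wrapper: serialize with the big lock unless the IOMMU
 * opted out or the caller already holds the lock. */
static uint64_t generic_translate(DemoIOMMU *iommu, uint64_t addr)
{
    bool locked = false;
    uint64_t out;

    if (!iommu->thread_safe && !big_lock_held) {
        pthread_mutex_lock(&big_lock);
        big_lock_held = true;
        locked = true;
    }
    out = iommu->translate(iommu, addr);
    if (locked) {
        big_lock_held = false;
        pthread_mutex_unlock(&big_lock);
    }
    return out;
}
```

With this shape, an IOMMU that does its own locking sets thread_safe and never touches the big lock on the translate path, while all the others get serialized without any change to their code.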

>
>>
>>> So, even if we want to provide a general lock for the translation
>>> procedure, I would prefer we add a per AddressSpace lock but not BQL.
>> It could be, but it needs more work in each specific IOMMU's code.
>>
>>> However, that will still need some extra flag showing whether we
>>> need the protection or not.  For example, we may need to explicitly
>>> turn that off for Power and s390.  Would that really be worth it?
>> It would cost just several lines of code, anything wrong with this?
> It's not about anything wrong; it's just about preference.
>
> I never said BQL won't work here.  It will work.  But if you have
> spent tens of hours working on BQL-related problems maybe you'll have
> the same preference as me... :)
>
> IMHO the point is to decide which might be simpler and more efficient
> in general, really.

So I'm not against your approach. It could go on top of the BQL patch, I
think.

>
>>> So my final preference is still current patch - we solve thread-safety
>>> problems in VT-d and IOMMU code.  Again, we really should make sure
>>> all IOMMUs work with multithreads.
>>>
>>>>>> 3) I think the issue is common to all other kinds of IOMMU, so can we simply
>>>>>> synchronize before calling ->translate() in memory.c. This seems a more
>>>>>> common solution.
>>>>> I suspect Power and s390 live well with that.  I think it means that at
>>>>> least these platforms won't have concurrency problems.  I'm adding
>>>>> DavidG to the loop in case there are further comments.  IMHO we should
>>>>> just make sure IOMMU code is thread safe, and fix problems if there are any.
>>>>>
>>>>> Thanks,
>>>>>
>>>> Yes, it needs some investigation, but we have other IOMMUs like AMD, and we
>>>> could have a flag to bypass BQL if IOMMU can synchronize by itself.
>>> AMD is still experimental only.  If we really want to use it in
>>> production, IMHO it'll need more testing and tuning, not only for
>>> thread safety but for other things too.  So again, we can just fix them
>>> when needed.  I still don't see it as a reason to depend on BQL here.
>> Well, it's not about BQL specifically, it's about whether we have or need a
>> generic thread safety solution for all IOMMUs.
>>
>> We have more IOMMUs than just AMD, s390 and ppc:
>>
>> # git grep imrc-\>translate\ =
>> hw/alpha/typhoon.c:    imrc->translate = typhoon_translate_iommu;
>> hw/dma/rc4030.c:    imrc->translate = rc4030_dma_translate;
>> hw/i386/amd_iommu.c:    imrc->translate = amdvi_translate;
>> hw/i386/intel_iommu.c:    imrc->translate = vtd_iommu_translate;
>> hw/ppc/spapr_iommu.c:    imrc->translate = spapr_tce_translate_iommu;
>> hw/s390x/s390-pci-bus.c:    imrc->translate = s390_translate_iommu;
>> hw/sparc/sun4m_iommu.c:    imrc->translate = sun4m_translate_iommu;
>> hw/sparc64/sun4u_iommu.c:    imrc->translate = sun4u_translate_iommu;
>>
>> And we know there will be more in the near future.
> Again - here I would suggest we consider thread safety when implementing
> new ones.  I suppose it should not be hard to achieve.
>
> I don't have any new input here since I have given some in previous
> posts already.  If this is still under discussion before the next
> post, I'll pick this patch out of the series since it is not
> related to the other patches at all, so it can be dealt with in isolation.
>
> Thanks,
>

I fully understand your motivation; I just want to see if we can do
something simple for all the other IOMMUs. I think this series can go in on
its own without worrying about other IOMMUs, for sure.
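
For comparison, the per-IOMMU lock approach from the patch boils down to something like the standalone sketch below. This is only an illustration under assumptions: the real code uses QemuMutex and the VT-d IOTLB structures, whereas here a pthread mutex and a small direct-mapped array stand in for them, and every Demo* and demo_* name is hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_IOTLB_SIZE 16

typedef struct {
    bool valid;
    uint64_t iova_pfn;
    uint64_t phys_pfn;
} DemoIOTLBEntry;

typedef struct {
    pthread_mutex_t iommu_lock;             /* protects iotlb[] only */
    DemoIOTLBEntry iotlb[DEMO_IOTLB_SIZE];  /* toy direct-mapped cache */
} DemoIOMMUState;

static void demo_iommu_init(DemoIOMMUState *s)
{
    pthread_mutex_init(&s->iommu_lock, NULL);
    for (size_t i = 0; i < DEMO_IOTLB_SIZE; i++) {
        s->iotlb[i].valid = false;
    }
}

/* Lookup may run without the BQL (e.g. from dataplane), so it takes
 * the per-IOMMU lock itself. */
static bool demo_iotlb_lookup(DemoIOMMUState *s, uint64_t iova_pfn,
                              uint64_t *phys_pfn)
{
    bool hit;

    pthread_mutex_lock(&s->iommu_lock);
    DemoIOTLBEntry *e = &s->iotlb[iova_pfn % DEMO_IOTLB_SIZE];
    hit = e->valid && e->iova_pfn == iova_pfn;
    if (hit) {
        *phys_pfn = e->phys_pfn;
    }
    pthread_mutex_unlock(&s->iommu_lock);
    return hit;
}

/* Insert or overwrite a cached translation under the same lock. */
static void demo_iotlb_update(DemoIOMMUState *s, uint64_t iova_pfn,
                              uint64_t phys_pfn)
{
    pthread_mutex_lock(&s->iommu_lock);
    DemoIOTLBEntry *e = &s->iotlb[iova_pfn % DEMO_IOTLB_SIZE];
    e->valid = true;
    e->iova_pfn = iova_pfn;
    e->phys_pfn = phys_pfn;
    pthread_mutex_unlock(&s->iommu_lock);
}

/* Invalidation path: drop every cached entry under the lock. */
static void demo_iotlb_flush_all(DemoIOMMUState *s)
{
    pthread_mutex_lock(&s->iommu_lock);
    for (size_t i = 0; i < DEMO_IOTLB_SIZE; i++) {
        s->iotlb[i].valid = false;
    }
    pthread_mutex_unlock(&s->iommu_lock);
}
```

The lock stays private to one IOMMU instance, so a lookup from one thread and an invalidation from another serialize against each other without involving the BQL.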

Thanks
