From: Jason Wang <jasowang@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Fam Zheng <famz@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org,
	Alex Williamson <alex.williamson@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jintack Lim <jintack@cs.columbia.edu>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH 03/10] intel-iommu: add iommu lock
Date: Sat, 28 Apr 2018 10:42:11 +0800
Message-ID: <635e37b2-30a6-b204-3005-e3e098cb38f8@redhat.com>
In-Reply-To: <20180428022407.GG13269@xz-mi>



On 2018-04-28 10:24, Peter Xu wrote:
> On Sat, Apr 28, 2018 at 09:43:54AM +0800, Jason Wang wrote:
>>
>> On 2018-04-27 14:26, Peter Xu wrote:
>>> On Fri, Apr 27, 2018 at 01:13:02PM +0800, Jason Wang wrote:
>>>> On 2018-04-25 12:51, Peter Xu wrote:
>>>>> Add a per-iommu big lock to protect IOMMU status.  Currently the only
>>>>> thing to be protected is the IOTLB cache, since that can be accessed
>>>>> even without BQL, e.g., in IO dataplane.
>>>>>
>>>>> Note that device page tables should not need any protection.  The safety
>>>>> of that should be provided by guest OS.  E.g., when a page entry is
>>>>> freed, the guest OS should be responsible to make sure that no device
>>>>> will be using that page any more.
>>>>>
>>>>> Reported-by: Fam Zheng <famz@redhat.com>
>>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>>>> ---
>>>>>     include/hw/i386/intel_iommu.h |  8 ++++++++
>>>>>     hw/i386/intel_iommu.c         | 31 +++++++++++++++++++++++++++++--
>>>>>     2 files changed, 37 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
>>>>> index 220697253f..1a8ba8e415 100644
>>>>> --- a/include/hw/i386/intel_iommu.h
>>>>> +++ b/include/hw/i386/intel_iommu.h
>>>>> @@ -262,6 +262,14 @@ struct IntelIOMMUState {
>>>>>         uint8_t w1cmask[DMAR_REG_SIZE]; /* RW1C(Write 1 to Clear) bytes */
>>>>>         uint8_t womask[DMAR_REG_SIZE];  /* WO (write only - read returns 0) */
>>>>>         uint32_t version;
>>>>> +    /*
>>>>> +     * Protects IOMMU states in general.  Normally we don't need to
>>>>> +     * take this lock when we are with BQL held.  However we have code
>>>>> +     * paths that may run even without BQL.  In those cases, we need
>>>>> +     * to take the lock when we have access to IOMMU state
>>>>> +     * informations, e.g., the IOTLB.
>>>>> +     */
>>>>> +    QemuMutex iommu_lock;
>>>> Some questions:
>>>>
>>>> 1) Do we need to protect context cache too?
>>> IMHO the context cache entry should work even without a lock.  That's
>>> a bit tricky, since there are two cases in which this cache is updated:
>>>
>>>     (1) first translation of the address space of a device
>>>     (2) invalidation of context entries
>>>
>>> For (2), IMHO we don't need to worry, since the guest OS should be
>>> controlling that part; that is, the device should not be doing any
>>> translation (DMA operations) when the context entry is invalidated.
>>>
>>> For (1), the worst case is that the context entry cache is updated
>>> multiple times with the same value by multiple threads.  IMHO that'll
>>> be fine too.
>>>
>>> But yes for sure we can protect that too with the iommu lock.
>>>
>>>> 2) Can we just reuse qemu BQL here?
>>> I would prefer not.  As I mentioned, I have already spent too much
>>> time fighting the BQL.  I really hope we can start using isolated
>>> locks where possible.  The BQL is always the worst choice to me.
>> Just a thought: using the BQL may actually greatly simplify the code
>> (considering we don't plan to remove the BQL now).
> Frankly speaking I don't understand why using BQL may greatly simplify
> the code... :( IMHO the lock here is really not a complicated one.
>
> Note that IMO BQL is mostly helpful when we really want something to
> be run sequentially with some other things _already_ protected by BQL.

Except for the translate path from dataplane, I believe all other code
is already protected by the BQL.

> In this case, all the stuff is inside VT-d code itself (or other
> IOMMUs), why bother taking the BQL to make our life harder?

It looks to me as simple as:

@@ -494,6 +494,7 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
      IOMMUMemoryRegionClass *imrc;
      hwaddr page_mask = (hwaddr)(-1);
      hwaddr plen = (hwaddr)(-1);
+    bool locked = false;

      if (plen_out) {
          plen = *plen_out;
@@ -510,8 +511,15 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
          }
          imrc = memory_region_get_iommu_class_nocheck(iommu_mr);

+        if (!qemu_mutex_iothread_locked()) {
+            locked = true;
+            qemu_mutex_lock_iothread();
+        }
          iotlb = imrc->translate(iommu_mr, addr, is_write ?
                                  IOMMU_WO : IOMMU_RO);
+        if (locked) {
+            qemu_mutex_unlock_iothread();
+        }
          addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
                  | (addr & iotlb.addr_mask));
          page_mask &= iotlb.addr_mask;
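
To show the pattern in the hunk above outside of QEMU, here is a minimal
standalone sketch. It assumes a pthread mutex plus a thread-local flag as
stand-ins for the BQL and qemu_mutex_iothread_locked(); the names
big_lock_*, toy_translate() and translate_with_big_lock() are
hypothetical and are not QEMU APIs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for qemu_mutex_lock_iothread() and friends.
 * These are assumptions for illustration, not QEMU's implementation.
 */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool big_lock_held; /* per-thread "do I own it?" flag */

static bool big_lock_is_held(void)
{
    return big_lock_held;
}

static void big_lock_acquire(void)
{
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
}

static void big_lock_release(void)
{
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

/* Toy translate callback: requires the big lock, adds a fixed offset. */
static uint64_t toy_translate(uint64_t addr)
{
    assert(big_lock_is_held());
    return addr + 0x1000;
}

/* Take the big lock only if the calling thread does not already own it. */
static uint64_t translate_with_big_lock(uint64_t addr)
{
    bool locked = false;
    uint64_t out;

    if (!big_lock_is_held()) {
        big_lock_acquire();
        locked = true;
    }
    out = toy_translate(addr);
    if (locked) {
        big_lock_release();
    }
    return out;
}
```

A caller that already holds the lock (the usual vCPU or main-loop case)
passes straight through, while a caller without it (the dataplane case)
pays for one acquire/release pair around the callback.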


>
> So, even if we want to provide a general lock for the translation
> procedure, I would prefer we add a per-AddressSpace lock rather than
> the BQL.

It could be, but that needs more work in each specific IOMMU's code.

> However, that will still need some extra flag showing whether we need
> the protection or not.  For example, we may need to explicitly turn
> it off for Power and s390.  Would that really be worth it?

It would cost just a few lines of code. Is anything wrong with this?

>
> So my final preference is still the current patch - we solve
> thread-safety problems in VT-d and IOMMU code.  Again, we really should
> make sure all IOMMUs work with multiple threads.
>
>>>> 3) I think the issue is common to all other kinds of IOMMU, so can we simply
>>>> synchronize before calling ->translate() in memory.c?  This seems a more
>>>> generic solution.
>>> I suspect Power and s390 live well with that.  I think it means that
>>> at least these platforms won't have concurrency problems.  I'm adding
>>> DavidG to the loop in case there are further comments.  IMHO we should
>>> just make sure the IOMMU code is thread safe, and fix problems if there
>>> are any.
>>>
>>> Thanks,
>>>
>> Yes, it needs some investigation, but we have other IOMMUs like AMD, and we
>> could have a flag to bypass BQL if IOMMU can synchronize by itself.
> AMD is still only experimental.  If we really want to use it in
> production IMHO it'll need more testing and tuning, not only for
> thread safety but for other things too.  So again, we can just fix
> them when needed.  I still don't see it as a reason to depend on BQL
> here.

Well, it's not about BQL specifically, it's about whether we have or 
need a generic thread safety solution for all IOMMUs.

We have more IOMMUs than just AMD, s390 and ppc:

# git grep imrc-\>translate\ =
hw/alpha/typhoon.c:    imrc->translate = typhoon_translate_iommu;
hw/dma/rc4030.c:    imrc->translate = rc4030_dma_translate;
hw/i386/amd_iommu.c:    imrc->translate = amdvi_translate;
hw/i386/intel_iommu.c:    imrc->translate = vtd_iommu_translate;
hw/ppc/spapr_iommu.c:    imrc->translate = spapr_tce_translate_iommu;
hw/s390x/s390-pci-bus.c:    imrc->translate = s390_translate_iommu;
hw/sparc/sun4m_iommu.c:    imrc->translate = sun4m_translate_iommu;
hw/sparc64/sun4u_iommu.c:    imrc->translate = sun4u_translate_iommu;

And we know there will be more in the near future.
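
As a rough illustration of the "generic lock plus opt-out flag" idea
discussed above, here is a standalone sketch. Everything in it is
hypothetical: ToyIommuClass, its self_synchronized field, core_lock and
core_translate() are made up for the example, and the real
IOMMUMemoryRegionClass has no such flag today.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-class descriptor; not QEMU's IOMMUMemoryRegionClass. */
typedef struct ToyIommuClass {
    uint64_t (*translate)(uint64_t addr);
    bool self_synchronized; /* true: the callback does its own locking */
} ToyIommuClass;

/* One generic lock owned by the core translate path. */
static pthread_mutex_t core_lock = PTHREAD_MUTEX_INITIALIZER;
static int core_lock_acquisitions; /* protected by core_lock */

/* Two toy callbacks standing in for different IOMMU models. */
static uint64_t identity_translate(uint64_t addr)
{
    return addr;
}

static uint64_t offset_translate(uint64_t addr)
{
    return addr + 0x1000;
}

/*
 * Core entry point: serialize callbacks that have not opted out, and
 * call self-synchronized ones directly.
 */
static uint64_t core_translate(const ToyIommuClass *c, uint64_t addr)
{
    uint64_t out;

    assert(c && c->translate);
    if (c->self_synchronized) {
        return c->translate(addr);
    }
    pthread_mutex_lock(&core_lock);
    core_lock_acquisitions++;
    out = c->translate(addr);
    pthread_mutex_unlock(&core_lock);
    return out;
}
```

With this shape, an IOMMU model that has been audited for thread safety
sets the flag and skips the generic lock, while every other model is
serialized by default without touching its code.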

Thanks

>
> I'll see what others think about it.
>
> CCing Paolo, Stefan and Fam too.
>
> Thanks,
>
