From: Jason Wang <jasowang@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Fam Zheng <famz@redhat.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
qemu-devel@nongnu.org,
Alex Williamson <alex.williamson@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Jintack Lim <jintack@cs.columbia.edu>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH 03/10] intel-iommu: add iommu lock
Date: Sat, 28 Apr 2018 10:42:11 +0800
Message-ID: <635e37b2-30a6-b204-3005-e3e098cb38f8@redhat.com>
In-Reply-To: <20180428022407.GG13269@xz-mi>
On 2018-04-28 10:24, Peter Xu wrote:
> On Sat, Apr 28, 2018 at 09:43:54AM +0800, Jason Wang wrote:
>>
>> On 2018-04-27 14:26, Peter Xu wrote:
>>> On Fri, Apr 27, 2018 at 01:13:02PM +0800, Jason Wang wrote:
>>>> On 2018-04-25 12:51, Peter Xu wrote:
>>>>> Add a per-iommu big lock to protect IOMMU status. Currently the only
>>>>> thing to be protected is the IOTLB cache, since that can be accessed
>>>>> even without BQL, e.g., in IO dataplane.
>>>>>
>>>>> Note that device page tables should not need any protection. The safety
>>>>> of that should be provided by guest OS. E.g., when a page entry is
>>>>> freed, the guest OS should be responsible to make sure that no device
>>>>> will be using that page any more.
>>>>>
>>>>> Reported-by: Fam Zheng<famz@redhat.com>
>>>>> Signed-off-by: Peter Xu<peterx@redhat.com>
>>>>> ---
>>>>> include/hw/i386/intel_iommu.h | 8 ++++++++
>>>>> hw/i386/intel_iommu.c | 31 +++++++++++++++++++++++++++++--
>>>>> 2 files changed, 37 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
>>>>> index 220697253f..1a8ba8e415 100644
>>>>> --- a/include/hw/i386/intel_iommu.h
>>>>> +++ b/include/hw/i386/intel_iommu.h
>>>>> @@ -262,6 +262,14 @@ struct IntelIOMMUState {
>>>>> uint8_t w1cmask[DMAR_REG_SIZE]; /* RW1C(Write 1 to Clear) bytes */
>>>>> uint8_t womask[DMAR_REG_SIZE]; /* WO (write only - read returns 0) */
>>>>> uint32_t version;
>>>>> + /*
>>>>> + * Protects IOMMU states in general. Normally we don't need to
>>>>> + * take this lock when we are with BQL held. However we have code
>>>>> + * paths that may run even without BQL. In those cases, we need
>>>>> + * to take the lock when we have access to IOMMU state
>>>>> + * information, e.g., the IOTLB.
>>>>> + */
>>>>> + QemuMutex iommu_lock;
>>>> Some questions:
>>>>
>>>> 1) Do we need to protect context cache too?
>>> IMHO the context cache entry should work even without a lock. That's a
>>> bit tricky since there are two cases in which this cache is updated:
>>>
>>> (1) first translation of the address space of a device
>>> (2) invalidation of context entries
>>>
>>> For (2), IMHO we don't need to worry, since the guest OS should be
>>> controlling that part: a device should not be doing any translation
>>> (DMA operations) when the context entry is invalidated.
>>>
>>> For (1), the worst case is that the context entry cache is updated
>>> multiple times with the same value by multiple threads. IMHO that'll
>>> be fine too.
>>>
>>> But yes for sure we can protect that too with the iommu lock.
>>>
>>>> 2) Can we just reuse qemu BQL here?
>>> I would prefer not. As I mentioned, I have already spent too much
>>> time fighting the BQL. I really hope we can start to use isolated
>>> locks where possible. To me the BQL is always the worst choice.
>> Just a thought: using the BQL may actually simplify the code a lot
>> (considering that we don't plan to remove the BQL now).
> Frankly speaking I don't understand why using BQL may greatly simplify
> the code... :( IMHO the lock here is really not a complicated one.
>
> Note that IMO BQL is mostly helpful when we really want something to
> be run sequentially with some other things _already_ protected by BQL.
Except for the translate path from dataplane, I believe all other code
is already protected by the BQL.
> In this case, all the stuff is inside VT-d code itself (or other
> IOMMUs), why bother taking the BQL to make our life harder?
It looks to me like it is as simple as:
@@ -494,6 +494,7 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
     IOMMUMemoryRegionClass *imrc;
     hwaddr page_mask = (hwaddr)(-1);
     hwaddr plen = (hwaddr)(-1);
+    bool locked = false;

     if (plen_out) {
         plen = *plen_out;
@@ -510,8 +511,15 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
         }
         imrc = memory_region_get_iommu_class_nocheck(iommu_mr);

+        if (!qemu_mutex_iothread_locked()) {
+            locked = true;
+            qemu_mutex_lock_iothread();
+        }
         iotlb = imrc->translate(iommu_mr, addr, is_write ?
                                 IOMMU_WO : IOMMU_RO);
+        if (locked) {
+            qemu_mutex_unlock_iothread();
+        }
         addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
                 | (addr & iotlb.addr_mask));
         page_mask &= iotlb.addr_mask;
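
For anyone who wants to try the pattern outside of QEMU, the conditional
locking in the snippet above can be sketched as a standalone C fragment.
Everything in it is an assumption made for illustration: big_lock,
big_lock_is_held(), fake_translate() and translate_locked() are
hypothetical names, and a plain pthread mutex with a thread-local flag
stands in for the BQL. It is not meant to describe how QEMU implements
qemu_mutex_iothread_locked().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the BQL: one global mutex plus a per-thread
 * flag that records whether the current thread holds it. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool big_lock_held;

static bool big_lock_is_held(void)
{
    return big_lock_held;
}

static void big_lock_acquire(void)
{
    assert(!big_lock_held);          /* the mutex is not recursive */
    pthread_mutex_lock(&big_lock);
    big_lock_held = true;
}

static void big_lock_release(void)
{
    assert(big_lock_held);
    big_lock_held = false;
    pthread_mutex_unlock(&big_lock);
}

/* Hypothetical translate callback: identity mapping that counts calls
 * and checks that it always runs with the lock held. */
static int translate_calls;

static unsigned long fake_translate(unsigned long addr)
{
    assert(big_lock_is_held());
    translate_calls++;
    return addr;
}

/* Take the lock only if the caller does not hold it already, and drop it
 * again only if we were the one who took it. */
static unsigned long translate_locked(unsigned long addr)
{
    bool locked = false;
    unsigned long out;

    if (!big_lock_is_held()) {
        big_lock_acquire();
        locked = true;
    }
    out = fake_translate(addr);
    if (locked) {
        big_lock_release();
    }
    return out;
}
```

A caller that already holds the lock (the vcpu or main-loop case) keeps
it across the call, while a caller without it (the dataplane case) takes
and releases it around the translation only.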
>
> So, even if we want to provide a general lock for the translation
> procedure, I would prefer we add a per AddressSpace lock but not BQL.
It could be, but that needs more work in each specific IOMMU's code.
> However, that will still need some extra flag showing whether we
> need the protection or not. For example, we may need to explicitly
> turn that off for Power and s390. Would that really be worth it?
It would cost just a few lines of code. Is anything wrong with this?
>
> So my final preference is still the current patch: we solve
> thread-safety problems in the VT-d and IOMMU code. Again, we really
> should make sure all IOMMUs work with multiple threads.
>
>>>> 3) I think the issue is common to all other kinds of IOMMU, so can we simply
>>>> synchronize before calling ->translate() in memory.c. This seems a more
>>>> common solution.
>>> I suspect Power and s390 live well with that. I think it means that
>>> at least these platforms won't have concurrency problems. I'm adding
>>> DavidG to the loop in case there are further comments. IMHO we should
>>> just make sure the IOMMU code is thread safe, and fix problems if
>>> there are any.
>>>
>>> Thanks,
>>>
>> Yes, it needs some investigation, but we have other IOMMUs like AMD, and we
>> could have a flag to bypass BQL if IOMMU can synchronize by itself.
> AMD is still only experimental. If we really want to use it in
> production, IMHO it'll need more testing and tuning, not only for
> thread safety but for other things too. So again, we can just fix
> those when needed. I still don't see it as a reason to depend on the
> BQL here.
Well, it's not about BQL specifically, it's about whether we have or
need a generic thread safety solution for all IOMMUs.
We have more IOMMUs than just AMD, s390 and ppc:
# git grep imrc-\>translate\ =
hw/alpha/typhoon.c: imrc->translate = typhoon_translate_iommu;
hw/dma/rc4030.c: imrc->translate = rc4030_dma_translate;
hw/i386/amd_iommu.c: imrc->translate = amdvi_translate;
hw/i386/intel_iommu.c: imrc->translate = vtd_iommu_translate;
hw/ppc/spapr_iommu.c: imrc->translate = spapr_tce_translate_iommu;
hw/s390x/s390-pci-bus.c: imrc->translate = s390_translate_iommu;
hw/sparc/sun4m_iommu.c: imrc->translate = sun4m_translate_iommu;
hw/sparc64/sun4u_iommu.c: imrc->translate = sun4u_translate_iommu;
And we know there will be more in the near future.
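
The idea of a generic solution with an opt-out flag, raised earlier in
this thread, could look roughly like the following. This is only a sketch
under assumptions: IommuClassSketch, self_synchronized, generic_lock and
translate_generic() are hypothetical names that do not exist in QEMU, and
a plain pthread mutex stands in for whatever lock the generic layer would
actually use.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-IOMMU class: a translate callback plus a flag saying
 * whether the implementation already does its own locking. */
typedef struct IommuClassSketch {
    uint64_t (*translate)(uint64_t addr);
    bool self_synchronized;
} IommuClassSketch;

/* Hypothetical lock owned by the generic translation layer. */
static pthread_mutex_t generic_lock = PTHREAD_MUTEX_INITIALIZER;
static int generic_lock_taken;   /* times the wrapper took the lock */

/* Generic entry point: serialize ->translate() unless the IOMMU has
 * declared that it synchronizes by itself. */
static uint64_t translate_generic(const IommuClassSketch *c, uint64_t addr)
{
    uint64_t out;

    if (c->self_synchronized) {
        return c->translate(addr);
    }
    pthread_mutex_lock(&generic_lock);
    generic_lock_taken++;
    out = c->translate(addr);
    pthread_mutex_unlock(&generic_lock);
    return out;
}

/* Two toy translate callbacks, used only for demonstration. */
static uint64_t identity_translate(uint64_t addr)
{
    return addr;
}

static uint64_t offset_translate(uint64_t addr)
{
    return addr + 0x1000;
}
```

An IOMMU model that does its own locking would set self_synchronized and
be called directly; all others would be serialized by the generic lock.
The mutex in this sketch is not recursive, so translate_generic() must
not be called with generic_lock already held.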
Thanks
>
> I'll see what others think about it.
>
> CCing Paolo, Stefan and Fam too.
>
> Thanks,
>