From: Peter Xu <peterx@redhat.com>
To: Jason Wang <jasowang@redhat.com>
Cc: qemu-devel@nongnu.org, David Gibson <david@gibson.dropbear.id.au>,
Lan Tianyu <tianyu.lan@intel.com>,
Marcel Apfelbaum <marcel@redhat.com>,
"Michael S.Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 7/7] intel_iommu: support passthrough (PT)
Date: Tue, 18 Apr 2017 11:50:33 +0800
Message-ID: <20170418035033.GH16703@pxdev.xzpeter.org>
In-Reply-To: <1c5c9278-92b1-6f56-b251-53bf60c7f4e8@redhat.com>
On Tue, Apr 18, 2017 at 11:23:35AM +0800, Jason Wang wrote:
> On 2017-04-17 18:58, Peter Xu wrote:
[...]
> >+static void vtd_switch_address_space(VTDAddressSpace *as)
> >+{
> >+ bool use_iommu;
> >+
> >+ assert(as);
> >+
> >+ use_iommu = as->iommu_state->dmar_enabled;
> >+ if (use_iommu) {
> >+ /* Further checks per-device configuration */
> >+ use_iommu &= !vtd_dev_pt_enabled(as);
> >+ }
>
> Looks like you can use as->iommu_state->dmar_enabled &&
> !vtd_dev_pt_enabled(as)
vtd_dev_pt_enabled() needs to read guest memory (starting from the
root entry), which is slightly slow. I was trying to avoid
unnecessary reads.
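
For illustration only, here is a standalone sketch of the two forms under
discussion. The Sketch* type and sketch_* helpers are hypothetical stand-ins
(not QEMU code); the slow guest-memory lookup is modelled by a counter. Since
C's && short-circuits, both forms should skip the lookup when DMAR is disabled:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for VTDAddressSpace; not the real definition. */
typedef struct SketchAddressSpace {
    bool dmar_enabled;    /* global DMA remapping enable */
    bool pt_context;      /* pretend result of the guest-memory lookup */
    unsigned slow_reads;  /* how many times the slow lookup ran */
} SketchAddressSpace;

/* Stand-in for vtd_dev_pt_enabled(): models the costly guest-memory read. */
static bool sketch_dev_pt_enabled(SketchAddressSpace *as)
{
    as->slow_reads++;
    return as->pt_context;
}

/* Guarded form, mirroring the structure used in the patch. */
static bool sketch_use_iommu_guarded(SketchAddressSpace *as)
{
    bool use_iommu;

    assert(as);
    use_iommu = as->dmar_enabled;
    if (use_iommu) {
        /* Only consult the per-device state when DMAR is on. */
        use_iommu &= !sketch_dev_pt_enabled(as);
    }
    return use_iommu;
}

/* Single-expression form; && does not evaluate its right operand
 * when the left operand is false. */
static bool sketch_use_iommu_and(SketchAddressSpace *as)
{
    assert(as);
    return as->dmar_enabled && !sketch_dev_pt_enabled(as);
}
```

With dmar_enabled == false, slow_reads stays at zero for either helper; with
it set, each call performs exactly one lookup.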
[...]
> >@@ -991,6 +1058,18 @@ static void vtd_do_iommu_translate(VTDAddressSpace *vtd_as, PCIBus *bus,
> > cc_entry->context_cache_gen = s->context_cache_gen;
> > }
> >+ /*
> >+ * We don't need to translate for pass-through context entries.
> >+ * Also, let's ignore IOTLB caching as well for PT devices.
> >+ */
> >+ if (vtd_ce_get_type(&ce) == VTD_CONTEXT_TT_PASS_THROUGH) {
> >+ entry->translated_addr = entry->iova;
> >+ entry->addr_mask = VTD_PAGE_SIZE - 1;
> >+ entry->perm = IOMMU_RW;
> >+ trace_vtd_translate_pt(source_id, entry->iova);
> >+ return;
> >+ }
>
> Several questions here:
>
> 1) Is this just for vhost?
No. When caching mode is not enabled, all pass-through devices should
use this path.
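
As a rough standalone sketch of what "this path" does for a PT context entry
(the Sketch* names and the 4 KiB page size are assumptions made for the
example, not the QEMU definitions), the entry is filled with an identity
mapping and read/write permission:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL  /* assumed 4 KiB page granularity */

/* Hypothetical permission flags; SKETCH_RW is read | write. */
typedef enum {
    SKETCH_NONE = 0,
    SKETCH_RO   = 1,
    SKETCH_WO   = 2,
    SKETCH_RW   = 3
} SketchPerm;

/* Hypothetical stand-in for the translation result entry. */
typedef struct SketchTLBEntry {
    uint64_t iova;             /* address issued by the device */
    uint64_t translated_addr;  /* address after translation */
    uint64_t addr_mask;        /* low bits covered by this entry */
    SketchPerm perm;
} SketchTLBEntry;

/* Pass-through: no page walk, the IOVA maps to itself. */
static void sketch_translate_pt(SketchTLBEntry *entry)
{
    assert(entry);
    entry->translated_addr = entry->iova;
    entry->addr_mask = SKETCH_PAGE_SIZE - 1;
    entry->perm = SKETCH_RW;
}
```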
> 2) Since this is done after IOTLB querying, do we need flush IOTLB during
> address switching?
IMHO, if the guest switches the address space for a device, it is also
required to send an IOTLB flush for that device/domain.
[...]
> > static void vtd_switch_address_space_all(IntelIOMMUState *s)
> > {
> > GHashTableIter iter;
> >@@ -2849,6 +2914,10 @@ static void vtd_init(IntelIOMMUState *s)
> > s->ecap |= VTD_ECAP_DT;
> > }
> >+ if (x86_iommu->pt_supported) {
> >+ s->ecap |= VTD_ECAP_PT;
> >+ }
>
> Since we support migration now, we need a compat setting for pre-2.10.
Oh yes. If I set pt=off by default, it should be okay then, right?
Then, at some point, we can switch to on by default, with a touch-up
in include/hw/compat.h I guess?
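
A minimal standalone sketch of what such a touch-up might look like, assuming
a per-release list of driver/property/value overrides. Everything named
Sketch*/SKETCH_* is hypothetical, and the "intel-iommu" driver string and the
2.9 cut-off are assumptions, not taken from this patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a global-property record; not the QEMU type. */
typedef struct SketchGlobalProperty {
    const char *driver;
    const char *property;
    const char *value;
} SketchGlobalProperty;

/* Assumed override for machine types older than 2.10: keep PT hidden
 * so the guest-visible ECAP register does not change across migration. */
#define SKETCH_HW_COMPAT_2_9 \
    { \
        .driver   = "intel-iommu", \
        .property = "pt", \
        .value    = "off", \
    },

static const SketchGlobalProperty sketch_compat_2_9[] = {
    SKETCH_HW_COMPAT_2_9
};

/* Return the compat override for driver/property, or NULL if there is none. */
static const char *sketch_compat_lookup(const char *driver,
                                        const char *property)
{
    size_t i;
    size_t n = sizeof(sketch_compat_2_9) / sizeof(sketch_compat_2_9[0]);

    assert(driver && property);
    for (i = 0; i < n; i++) {
        if (strcmp(sketch_compat_2_9[i].driver, driver) == 0 &&
            strcmp(sketch_compat_2_9[i].property, property) == 0) {
            return sketch_compat_2_9[i].value;
        }
    }
    return NULL;
}
```

Older machine types would then see pt=off regardless of the new default,
while newer ones pick up whatever default the property declares.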
>
> >+
> > if (s->caching_mode) {
> > s->cap |= VTD_CAP_CM;
> > }
> >diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> >index 29d6707..0e73a65 100644
> >--- a/hw/i386/intel_iommu_internal.h
> >+++ b/hw/i386/intel_iommu_internal.h
> >@@ -187,6 +187,7 @@
> > /* Interrupt Remapping support */
> > #define VTD_ECAP_IR (1ULL << 3)
> > #define VTD_ECAP_EIM (1ULL << 4)
> >+#define VTD_ECAP_PT (1ULL << 6)
> > #define VTD_ECAP_MHMV (15ULL << 20)
> > /* CAP_REG */
> >diff --git a/hw/i386/trace-events b/hw/i386/trace-events
> >index 04a6980..867ad0b 100644
> >--- a/hw/i386/trace-events
> >+++ b/hw/i386/trace-events
> >@@ -38,6 +38,7 @@ vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova 0x%"P
> > vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip iova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set"
> > vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
> > vtd_as_unmap_whole(uint8_t bus, uint8_t slot, uint8_t fn, uint64_t iova, uint64_t size) "Device %02x:%02x.%x start 0x%"PRIx64" size 0x%"PRIx64
> >+vtd_translate_pt(uint16_t sid, uint64_t addr) "source id 0x%"PRIu16", iova 0x%"PRIx64
> > # hw/i386/amd_iommu.c
> > amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 0x%"PRIx64" + offset 0x%"PRIx32
> >diff --git a/hw/i386/x86-iommu.c b/hw/i386/x86-iommu.c
> >index 02b8825..293caf8 100644
> >--- a/hw/i386/x86-iommu.c
> >+++ b/hw/i386/x86-iommu.c
> >@@ -91,6 +91,7 @@ static void x86_iommu_realize(DeviceState *dev, Error **errp)
> > static Property x86_iommu_properties[] = {
> > DEFINE_PROP_BOOL("intremap", X86IOMMUState, intr_supported, false),
> > DEFINE_PROP_BOOL("device-iotlb", X86IOMMUState, dt_supported, false),
> >+ DEFINE_PROP_BOOL("pt", X86IOMMUState, pt_supported, true),
>
> Do you know if AMD IOMMU supports this?
AMD IOMMU should support this. IIUC it's the first bit of the Device Table Entry.
Thanks,
--
Peter Xu