From: "Derrick, Jonathan" <jonathan.derrick@intel.com>
To: "baolu.lu@linux.intel.com" <baolu.lu@linux.intel.com>,
"drake@endlessm.com" <drake@endlessm.com>
Cc: "bhelgaas@google.com" <bhelgaas@google.com>,
"dwmw2@infradead.org" <dwmw2@infradead.org>,
"iommu@lists.linux-foundation.org"
<iommu@lists.linux-foundation.org>,
"linux@endlessm.com" <linux@endlessm.com>
Subject: Re: [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed
Date: Thu, 9 Apr 2020 00:33:00 +0000 [thread overview]
Message-ID: <ae740b4de5fd6cf7f52ee1baa7b9347fed26a3b4.camel@intel.com> (raw)
In-Reply-To: <a99ff4f4edc4fd5495c9d6b245a590a256c9261b.camel@intel.com>
Hm that didn't come through right..
On Wed, 2020-04-08 at 18:16 -0600, Jonathan Derrick wrote:
> Hi Daniel,
>
> Reviving this thread
>
> On Thu, 2020-02-20 at 18:06 +0800, Daniel Drake wrote:
> > > On Wed, Feb 19, 2020 at 11:40 AM Lu Baolu <baolu.lu@linux.intel.com> wrote:
> > > > With respect, this is problematical. The parent and all subdevices share
> > > > a single translation entry. The DMA mask should be consistent.
> > > >
> > > > Otherwise, for example, subdevice A has 64-bit DMA capability and uses
> > > > an identity domain for DMA translation. While subdevice B has 32-bit DMA
> > > > capability and is forced to switch to DMA domain. Subdevice A will be
> > > > impacted without any notification.
> >
> > Looking closer, this problematic codepath may already be happening for VMD,
> > under intel_iommu_add_device(). Consider this function running for a VMD
> > subdevice, we hit:
> >
> > domain = iommu_get_domain_for_dev(dev);
> >
> > I can't quite grasp the code flow here, but domain->type now always seems
> > to return the domain type of the real DMA device, which seems like pretty
> > reasonable behaviour.
> >
> > if (domain->type == IOMMU_DOMAIN_DMA) {
> >
> > and as detailed in previous mails, the real VMD device seems to be in a DMA
> > domain by default, so we continue.
> >
> > if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
> >
> > Now we checked the default domain type of the subdevice. This seems rather
> > likely to return IDENTITY because that's effectively the default type...
> >
> > ret = iommu_request_dm_for_dev(dev);
> > if (ret) {
> > dmar_remove_one_dev_info(dev);
> > dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
> > domain_add_dev_info(si_domain, dev);
> > dev_info(dev,
> > "Device uses a private identity domain.\n");
> > }
> > }
> >
> > and now we're doing the bad stuff that Lu pointed out: we only have one
> > mapping shared for all the subdevices, so if we end up changing it for one
> > subdevice, we're likely to be breaking another.
> > In this case iommu_request_dm_for_dev() returns -EBUSY without doing anything
> > and the following private identity code fortunately seems to have no
> > consequential effects - the real DMA device continues to operate in the DMA
> > domain, and all subdevice DMA requests go through the DMA mapping codepath.
> > That's probably why VMD appears to be working fine anyway, but this seems
> > fragile.
> >
> > The following changes enforce that the real DMA device is in the DMA domain,
> > and avoid the intel_iommu_add_device() codepaths that would try to change
> > it to a different domain type. Let me know if I'm on the right lines...
> > ---
> > drivers/iommu/intel-iommu.c | 16 ++++++++++++++++
> > drivers/pci/controller/intel-nvme-remap.c | 6 ++++++
> > 2 files changed, 22 insertions(+)
> >
> > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> > index 9644a5b3e0ae..8872b8d1780d 100644
> > --- a/drivers/iommu/intel-iommu.c
> > +++ b/drivers/iommu/intel-iommu.c
> > @@ -2911,6 +2911,9 @@ static int device_def_domain_type(struct device *dev)
> > if (dev_is_pci(dev)) {
> > struct pci_dev *pdev = to_pci_dev(dev);
> >
> > + if (pci_real_dma_dev(pdev) != pdev)
> > + return IOMMU_DOMAIN_DMA;
> > +
> > if (device_is_rmrr_locked(dev))
> > return IOMMU_DOMAIN_DMA;
> >
> > @@ -5580,6 +5583,7 @@ static bool intel_iommu_capable(enum iommu_cap cap)
> >
> > static int intel_iommu_add_device(struct device *dev)
> > {
> > + struct device *real_dev = dev;
> > struct dmar_domain *dmar_domain;
> > struct iommu_domain *domain;
> > struct intel_iommu *iommu;
> > @@ -5591,6 +5595,17 @@ static int intel_iommu_add_device(struct device *dev)
> > if (!iommu)
> > return -ENODEV;
> >
> > + if (dev_is_pci(dev))
> > + real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
> > +
> > + if (real_dev != dev) {
> > + domain = iommu_get_domain_for_dev(real_dev);
> > + if (domain->type != IOMMU_DOMAIN_DMA) {
> > + dev_err(dev, "Real DMA device not in DMA domain; can't handle DMA\n");
> > + return -ENODEV;
> > + }
> > + }
> > +
> > iommu_device_link(&iommu->iommu, dev);
> >
> > if (translation_pre_enabled(iommu))
> >
>
> We need one additional change to enforce IOMMU_DOMAIN_DMA on the real
> dma dev, otherwise it could be put into Identity and the subdevices as
> DMA leading to this WARN:
>
> struct intel_iommu *domain_get_iommu(struct dmar_domain *domain)
> {
> int iommu_id;
>
> /* si_domain and vm domain should not get here. */
> if (WARN_ON(domain->domain.type != IOMMU_DOMAIN_DMA))
> return NULL;
>
>
> We could probably define and enforce it in device_def_domain_type. We
> could also try moving real dma dev to DMA on the first subdevice, like
> below. Though there might be a few cases we can't do that.
>
>
> [snip]
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4be549478691..6c80c6c9d808 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3049,6 +3049,9 @@ static int device_def_domain_type(struct device *dev)
if ((iommu_identity_mapping & IDENTMAP_GFX) && IS_GFX_DEVICE(pdev))
return IOMMU_DOMAIN_IDENTITY;
+ if (pci_real_dma_dev(pdev) != pdev)
+ return IOMMU_DOMAIN_DMA;
+
/*
* We want to start off with all devices in the 1:1 domain, and
* take them out later if we find they can't access all of memory.
@@ -5780,12 +5783,32 @@ static bool intel_iommu_capable(enum iommu_cap cap)
return false;
}
+static int intel_iommu_request_dma_domain_for_dev(struct device *dev,
+ struct dmar_domain *domain)
+{
+ int ret;
+
+ ret = iommu_request_dma_domain_for_dev(dev);
+ if (ret) {
+ dmar_remove_one_dev_info(dev);
+ domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
+ if (!get_private_domain_for_dev(dev)) {
+ dev_warn(dev,
+ "Failed to get a private domain.\n");
+ return -ENOMEM;
+ }
+ }
+
+ return 0;
+}
+
static int intel_iommu_add_device(struct device *dev)
{
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
struct intel_iommu *iommu;
struct iommu_group *group;
+ struct device *real_dev = dev;
u8 bus, devfn;
int ret;
@@ -5809,6 +5832,21 @@ static int intel_iommu_add_device(struct device *dev)
domain = iommu_get_domain_for_dev(dev);
dmar_domain = to_dmar_domain(domain);
+
+ if (dev_is_pci(dev))
+ real_dev = &pci_real_dma_dev(to_pci_dev(dev))->dev;
+
+ if (real_dev != dev) {
+ domain = iommu_get_domain_for_dev(real_dev);
+ if (domain->type != IOMMU_DOMAIN_DMA) {
+ dmar_remove_one_dev_info(real_dev);
+
+ ret = intel_iommu_request_dma_domain_for_dev(real_dev, dmar_domain);
+ if (ret)
+ goto unlink;
+ }
+ }
+
if (domain->type == IOMMU_DOMAIN_DMA) {
if (device_def_domain_type(dev) == IOMMU_DOMAIN_IDENTITY) {
ret = iommu_request_dm_for_dev(dev);
@@ -5822,20 +5860,12 @@ static int intel_iommu_add_device(struct device *dev)
}
} else {
if (device_def_domain_type(dev) == IOMMU_DOMAIN_DMA) {
- ret = iommu_request_dma_domain_for_dev(dev);
- if (ret) {
- dmar_remove_one_dev_info(dev);
- dmar_domain->flags |= DOMAIN_FLAG_LOSE_CHILDREN;
- if (!get_private_domain_for_dev(dev)) {
- dev_warn(dev,
- "Failed to get a private domain.\n");
- ret = -ENOMEM;
- goto unlink;
- }
+ ret = intel_iommu_request_dma_domain_for_dev(dev, dmar_domain);
+ if (ret)
+ goto unlink;
- dev_info(dev,
- "Device uses a private dma domain.\n");
- }
+ dev_info(dev,
+ "Device uses a private dma domain.\n");
}
}
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
prev parent reply other threads:[~2020-04-09 0:33 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-02-19 3:21 [PATCH v4] iommu/vt-d: consider real PCI device when checking if mapping is needed Daniel Drake
2020-02-19 3:40 ` Lu Baolu
2020-02-20 3:36 ` Daniel Drake
2020-02-20 10:06 ` Daniel Drake
2020-02-20 11:58 ` Lu Baolu
2020-02-27 18:19 ` Derrick, Jonathan
2020-04-09 0:16 ` Derrick, Jonathan
2020-04-09 0:33 ` Derrick, Jonathan [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ae740b4de5fd6cf7f52ee1baa7b9347fed26a3b4.camel@intel.com \
--to=jonathan.derrick@intel.com \
--cc=baolu.lu@linux.intel.com \
--cc=bhelgaas@google.com \
--cc=drake@endlessm.com \
--cc=dwmw2@infradead.org \
--cc=iommu@lists.linux-foundation.org \
--cc=linux@endlessm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.