From: Alex Williamson <alex.williamson@redhat.com>
To: Jerome Glisse <jglisse@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>, Keith Busch <keith.busch@intel.com>,
linux-nvdimm@lists.01.org, linux-rdma@vger.kernel.org,
linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
Jason Gunthorpe <jgg@mellanox.com>,
Bjorn Helgaas <helgaas@kernel.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Bjorn Helgaas <bhelgaas@google.com>,
Max Gurtovoy <maxg@mellanox.com>, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 04/12] pci-p2p: Clear ACS P2P flags for all client devices
Date: Fri, 5 Jan 2018 08:41:08 -0700 [thread overview]
Message-ID: <20180105084108.47a45561@t450s.home> (raw)
In-Reply-To: <20180105064547.GA3212@redhat.com>
On Fri, 5 Jan 2018 01:47:01 -0500
Jerome Glisse <jglisse@redhat.com> wrote:
> On Thu, Jan 04, 2018 at 08:33:00PM -0700, Alex Williamson wrote:
> > On Thu, 4 Jan 2018 17:00:47 -0700
> > Logan Gunthorpe <logang@deltatee.com> wrote:
> >
> > > On 04/01/18 03:35 PM, Alex Williamson wrote:
> > > > Yep, flipping these ACS bits invalidates any IOMMU groups that depend
> > > > on the isolation of that downstream port and I suspect also any peers
> > > > within the same PCI slot of that port and their downstream devices. The
> > > > entire sub-hierarchy grouping needs to be re-evaluated. This
> > > > potentially affects running devices that depend on that isolation, so
> > > > I'm not sure how that happens dynamically. A boot option might be
> > > > easier. Thanks,
> > >
> > > I don't see how this is the case in current kernel code. It appears to
> > > only enable ACS globally if the IOMMU requests it.
> >
> > IOMMU groups don't exist unless the IOMMU is enabled and x86 and ARM
> > both request ACS be enabled if an IOMMU is present, so I'm not sure
> > what you're getting at here. Also, in reply to your other email, if
> > the IOMMU is enabled, every device handled by the IOMMU is a member of
> > an IOMMU group, see struct device.iommu_group. There's an
> > iommu_group_get() accessor to get a reference to it.
> >
> > > I also don't see how turning off ACS isolation for a specific device is
> > > going to hurt anything. The IOMMU should still be able to keep going on
> > > unaware that anything has changed. The only worry is that a security
> > > hole may now be created if a user was relying on the isolation between
> > > two devices that are in different VMs or something. However, if a user
> > > was relying on this, they probably shouldn't have turned on P2P in the
> > > first place.
> >
> > That's exactly what IOMMU groups represent, the smallest set of devices
> > which have DMA isolation from other devices. By poking this hole, the
> > IOMMU group is invalid. We cannot turn off ACS only for a specific
> > device, in order to enable p2p it needs to be disabled at every
> > downstream port between the devices where we want to enable p2p.
> > Depending on the topology, that could mean we're also enabling p2p for
> > unrelated devices. Those unrelated devices might be in active use and
> > the p2p IOVAs now have a different destination which is no longer IOMMU
> > translated.
> >
> > > We started with a fairly unintelligent choice to simply disable ACS on
> > > any kernel that had CONFIG_PCI_P2P set. However, this did not seem like
> > > a good idea going forward. Instead, we now selectively disable the ACS
> > > bit only on the downstream ports that are involved in P2P transactions.
> > > This seems like the safest choice and still allows people to (carefully)
> > > use P2P adjacent to other devices that need to be isolated.
> >
> > I don't see that the code is doing much checking that adjacent devices
> > are also affected by the p2p change and of course the IOMMU group is
> > entirely invalid once the p2p holes start getting poked.
> >
> > > I don't think anyone wants another boot option that must be set in order
> > > to use this functionality (and only some hardware would require this).
> > > That's just a huge pain for users.
> >
> > No, but nor do we need IOMMU groups that no longer represent what
> > they're intended to describe, or runtime, unchecked routing changes
> > through the topology for devices that might already be using
> > conflicting IOVA ranges. Maybe soft hotplugs are another possibility,
> > designate a sub-hierarchy to be removed and re-scanned with ACS
> > disabled. Otherwise it seems like disabling and re-enabling ACS needs
> > to also handle merging and splitting groups dynamically. Thanks,
> >
>
> Dumb question: can we map the PCI BAR address of one device into the
> IOMMU page table of another device, i.e. the way we would DMA map a
> regular system page?
>
> It would be much better in my view to follow such a path if that is
> at all possible from a hardware point of view (I am not sure where to
> dig in the specification to answer my question above).
Yes, we can bounce device MMIO through the IOMMU, or at least vfio does
enable these mappings to allow p2p between devices assigned to the same
VM and we've seen evidence that they work, but like p2p in general,
there are likely platform dependencies. For Logan this doesn't really
solve the problem unless the device and downstream port support ATS and
ACS Direct Translated P2P is enabled. That would allow the device to do
p2p with addresses pre-translated by the IOMMU and cached by ATS on the
device without the transaction needing to traverse all the way to the
IOMMU on the root bus. The security of ATS is pretty questionable in
general, but that would likely be a solution that wouldn't require
manipulating the groupings. Thanks,
Alex