From: Alex Williamson <alex.williamson@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: "Logan Gunthorpe" <logang@deltatee.com>,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
linux-nvdimm@lists.01.org, linux-block@vger.kernel.org,
"Stephen Bates" <sbates@raithlin.com>,
"Christoph Hellwig" <hch@lst.de>, "Jens Axboe" <axboe@kernel.dk>,
"Keith Busch" <keith.busch@intel.com>,
"Sagi Grimberg" <sagi@grimberg.me>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Jason Gunthorpe" <jgg@mellanox.com>,
"Max Gurtovoy" <maxg@mellanox.com>,
"Dan Williams" <dan.j.williams@intel.com>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Benjamin Herrenschmidt" <benh@kernel.crashing.org>
Subject: Re: [PATCH 04/12] pci-p2p: Clear ACS P2P flags for all client devices
Date: Thu, 4 Jan 2018 15:35:51 -0700 [thread overview]
Message-ID: <20180104153551.3118f71b@t450s.home> (raw)
In-Reply-To: <20180104215721.GF189897@bhelgaas-glaptop.roam.corp.google.com>
On Thu, 4 Jan 2018 15:57:21 -0600
Bjorn Helgaas <helgaas@kernel.org> wrote:
> [+cc Alex]
>
> On Thu, Jan 04, 2018 at 12:01:29PM -0700, Logan Gunthorpe wrote:
> > When the ACS P2P flags are set in the downstream port of the switch,
> > any P2P TLPs will be sent back to the root complex. The whole point of
> > the P2P work is to have TLPs avoid the RC seeing it may not support
> > P2P transactions or, at best, it will perform poorly. So we clear these
>
> "It will perform poorly" seems like an empirical statement about the
> hardware you've seen, not something that's driven by the spec. So I'm
> not sure it adds information that will be useful long-term.
>
> > flags for all devices involved in transactions with the p2pmem.
>
> I'm not sure how this coordinates with other uses of ACS, e.g., VFIO.
> I think that should be addressed here somehow.
Yep, flipping these ACS bits invalidates any IOMMU groups that depend
on the isolation of that downstream port and I suspect also any peers
within the same PCI slot of that port and their downstream devices. The
entire sub-hierarchy grouping needs to be re-evaluated. This
potentially affects running devices that depend on that isolation, so
I'm not sure how that happens dynamically. A boot option might be
easier. Thanks,
Alex
> > A count of the number of requests to disable the flags is maintained.
> > When the count transitions from 1 to 0, the old flags are restored.
> >
> > Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> > ---
> > drivers/pci/p2p.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++++++--
> > include/linux/pci.h | 2 +
> > 2 files changed, 143 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/pci/p2p.c b/drivers/pci/p2p.c
> > index 87cec87b02e3..617adaa905d2 100644
> > --- a/drivers/pci/p2p.c
> > +++ b/drivers/pci/p2p.c
> > @@ -227,6 +227,89 @@ static struct pci_dev *find_parent_pci_dev(struct device *dev)
> > }
> >
> > /*
> > + * The ACS flags for P2P Request Redirect and P2P Completion Redirect need
> > + * to be disabled in the downstream port of each device in order for
> > + * the TLPs to not be forwarded up to the RC.
> > + */
> > +#define PCI_P2P_ACS_FLAGS (PCI_ACS_RR | PCI_ACS_CR)
> > +
> > +static int pci_p2pmem_disable_acs(struct pci_dev *pdev)
> > +{
> > + int pos;
> > + u16 ctrl;
> > + struct pci_dev *downstream;
> > +
> > + downstream = pci_dev_get(pci_upstream_bridge(pdev));
> > + if (!downstream) {
> > + dev_err(&pdev->dev, "could not find downstream port\n");
> > + return -ENODEV;
> > + }
> > +
> > + device_lock(&downstream->dev);
> > + if (downstream->p2p_acs_requests++)
> > + goto unlock_and_return;
> > +
> > + pos = pci_find_ext_capability(downstream, PCI_EXT_CAP_ID_ACS);
> > + if (!pos)
> > + goto unlock_and_return;
> > +
> > + pci_read_config_word(downstream, pos + PCI_ACS_CTRL, &ctrl);
> > +
> > + downstream->p2p_old_acs_flags = ctrl & PCI_P2P_ACS_FLAGS;
> > +
> > + if (downstream->p2p_old_acs_flags)
> > + dev_info(&pdev->dev, "disabling p2p acs flags: %x\n", ctrl);
> > +
> > + ctrl &= ~PCI_P2P_ACS_FLAGS;
> > +
> > + pci_write_config_word(downstream, pos + PCI_ACS_CTRL, ctrl);
> > +
> > +unlock_and_return:
> > + device_unlock(&downstream->dev);
> > + pci_dev_put(downstream);
> > + return 0;
> > +}
> > +
> > +static int pci_p2pmem_reset_acs(struct pci_dev *pdev)
> > +{
> > + int pos;
> > + u16 ctrl;
> > + struct pci_dev *downstream;
> > +
> > + downstream = pci_dev_get(pci_upstream_bridge(pdev));
> > + if (!downstream)
> > + return -ENODEV;
> > +
> > + device_lock(&downstream->dev);
> > +
> > + /* Only actually reset the flags on a 1->0 transition */
> > + if (!downstream->p2p_acs_requests)
> > + goto unlock_and_return;
> > +
> > + if (--downstream->p2p_acs_requests)
> > + goto unlock_and_return;
> > +
> > + pos = pci_find_ext_capability(downstream, PCI_EXT_CAP_ID_ACS);
> > + if (!pos)
> > + goto unlock_and_return;
> > +
> > + pci_read_config_word(downstream, pos + PCI_ACS_CTRL, &ctrl);
> > +
> > + ctrl &= ~PCI_P2P_ACS_FLAGS;
> > + ctrl |= downstream->p2p_old_acs_flags;
> > +
> > + if (downstream->p2p_old_acs_flags)
> > + dev_info(&pdev->dev, "resetting p2p acs flags: %x\n", ctrl);
> > +
> > + pci_write_config_word(downstream, pos + PCI_ACS_CTRL, ctrl);
> > +
> > +unlock_and_return:
> > + device_unlock(&downstream->dev);
> > + pci_dev_put(downstream);
> > + return 0;
> > +}
> > +
> > +/*
> > * If a device is behind a switch, we try to find the upstream bridge
> > * port of the switch. This requires two calls to pci_upstream_bridge:
> > * one for the upstream port on the switch, one on the upstream port
> > @@ -340,6 +423,10 @@ int pci_p2pmem_add_client(struct list_head *head, struct device *dev)
> > ret = -EXDEV;
> > goto put_client;
> > }
> > +
> > + ret = pci_p2pmem_disable_acs(client);
> > + if (!ret)
> > + goto put_client;
> > }
> >
> > new_item = kzalloc(sizeof(*new_item), GFP_KERNEL);
> > @@ -363,6 +450,9 @@ EXPORT_SYMBOL_GPL(pci_p2pmem_add_client);
> >
> > static void pci_p2pmem_client_free(struct pci_p2pmem_client *item)
> > {
> > + if (item->p2pmem)
> > + pci_p2pmem_reset_acs(item->client);
> > +
> > list_del(&item->list);
> > pci_dev_put(item->client);
> > pci_dev_put(item->p2pmem);
> > @@ -383,6 +473,7 @@ void pci_p2pmem_remove_client(struct list_head *head, struct device *dev)
> > {
> > struct pci_p2pmem_client *pos, *tmp;
> > struct pci_dev *pdev;
> > + struct pci_dev *p2pmem = NULL;
> >
> > pdev = find_parent_pci_dev(dev);
> > if (!pdev)
> > @@ -392,9 +483,16 @@ void pci_p2pmem_remove_client(struct list_head *head, struct device *dev)
> > if (pos->client != pdev)
> > continue;
> >
> > + if (!p2pmem)
> > + p2pmem = pci_dev_get(pos->p2pmem);
> > +
> > pci_p2pmem_client_free(pos);
> > }
> >
> > + if (p2pmem && list_empty(head))
> > + pci_p2pmem_reset_acs(p2pmem);
> > +
> > + pci_dev_put(p2pmem);
> > pci_dev_put(pdev);
> > }
> > EXPORT_SYMBOL_GPL(pci_p2pmem_remove_client);
> > @@ -412,6 +510,10 @@ void pci_p2pmem_client_list_free(struct list_head *head)
> > {
> > struct pci_p2pmem_client *pos, *tmp;
> >
> > + pos = list_first_entry_or_null(head, struct pci_p2pmem_client, list);
> > + if (pos && pos->p2pmem)
> > + pci_p2pmem_reset_acs(pos->p2pmem);
> > +
> > list_for_each_entry_safe(pos, tmp, head, list)
> > pci_p2pmem_client_free(pos);
> > }
> > @@ -440,6 +542,40 @@ static bool upstream_bridges_match_list(struct pci_dev *pdev,
> > return ret;
> > }
> >
> > +static int bind_clients(struct pci_dev *p2pmem, struct list_head *clients)
> > +{
> > + int ret;
> > + struct pci_p2pmem_client *pos, *unwind_pos;
> > +
> > + ret = pci_p2pmem_disable_acs(p2pmem);
> > + if (ret)
> > + return ret;
> > +
> > + list_for_each_entry(pos, clients, list) {
> > + ret = pci_p2pmem_disable_acs(pos->client);
> > + if (ret)
> > + goto unwind;
> > +
> > + pos->p2pmem = pci_dev_get(p2pmem);
> > + }
> > +
> > + return 0;
> > +
> > +unwind:
> > + list_for_each_entry(unwind_pos, clients, list) {
> > + if (pos == unwind_pos)
> > + break;
> > +
> > + pci_p2pmem_reset_acs(unwind_pos->client);
> > + pci_dev_put(unwind_pos->p2pmem);
> > + unwind_pos->p2pmem = NULL;
> > + }
> > +
> > + pci_p2pmem_reset_acs(p2pmem);
> > +
> > + return ret;
> > +}
> > +
> > /**
> > * pci_p2pmem_find - find a p2p mem device compatible with the specified device
> > * @dev: list of device to check (NULL-terminated)
> > @@ -450,11 +586,13 @@ static bool upstream_bridges_match_list(struct pci_dev *pdev,
> > *
> > * Returns a pointer to the PCI device with a reference taken (use pci_dev_put
> > * to return the reference) or NULL if no compatible device is found.
> > + *
> > + * The P2P ACS flags on all applicable PCI devices will be cleared and
> > + * reset when the client is removed from the list.
> > */
> > struct pci_dev *pci_p2pmem_find(struct list_head *clients)
> > {
> > struct pci_dev *pdev = NULL;
> > - struct pci_p2pmem_client *pos;
> >
> > while ((pdev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, pdev))) {
> > if (!pdev->p2p || !pdev->p2p->published)
> > @@ -463,8 +601,8 @@ struct pci_dev *pci_p2pmem_find(struct list_head *clients)
> > if (!upstream_bridges_match_list(pdev, clients))
> > continue;
> >
> > - list_for_each_entry(pos, clients, list)
> > - pos->p2pmem = pdev;
> > + if (bind_clients(pdev, clients))
> > + continue;
> >
> > return pdev;
> > }
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 047aea679e87..cdd4d3c3e562 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -435,6 +435,8 @@ struct pci_dev {
> > #endif
> > #ifdef CONFIG_PCI_P2P
> > struct pci_p2p *p2p;
> > + unsigned int p2p_acs_requests;
> > + u16 p2p_old_acs_flags;
> > #endif
> > phys_addr_t rom; /* Physical address of ROM if it's not from the BAR */
> > size_t romlen; /* Length of ROM if it's not from the BAR */
> > --
> > 2.11.0
> >
next prev parent reply other threads:[~2018-01-04 22:35 UTC|newest]
Thread overview: 56+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-04 19:01 [PATCH 00/11] Copy Offload in NVMe Fabrics with P2P PCI Memory Logan Gunthorpe
[not found] ` <20180104190137.7654-1-logang-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-04 19:01 ` [PATCH 01/12] pci-p2p: Support peer to peer memory Logan Gunthorpe
2018-01-04 21:40 ` Bjorn Helgaas
[not found] ` <20180104214028.GD189897-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
2018-01-04 23:06 ` Logan Gunthorpe
[not found] ` <20180104190137.7654-2-logang-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-04 21:59 ` Bjorn Helgaas
[not found] ` <20180104215941.GG189897-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
2018-01-05 0:20 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 02/12] pci-p2p: Add sysfs group to display p2pmem stats Logan Gunthorpe
2018-01-04 21:50 ` Bjorn Helgaas
[not found] ` <20180104215040.GE189897-1RhO1Y9PlrlHTL0Zs8A6p5iNqAH0jzoTYJqu5kTmcBRl57MIdRCFDg@public.gmane.org>
2018-01-04 22:25 ` Jason Gunthorpe
2018-01-04 23:13 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 03/12] pci-p2p: Add PCI p2pmem dma mappings to adjust the bus offset Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 04/12] pci-p2p: Clear ACS P2P flags for all client devices Logan Gunthorpe
[not found] ` <20180104190137.7654-5-logang-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-04 21:57 ` Bjorn Helgaas
2018-01-04 22:35 ` Alex Williamson [this message]
[not found] ` <20180104153551.3118f71b-1yVPhWWZRC1BDLzU/O5InQ@public.gmane.org>
2018-01-05 0:00 ` Logan Gunthorpe
2018-01-05 1:09 ` Logan Gunthorpe
2018-01-05 3:33 ` Alex Williamson
2018-01-05 6:47 ` Jerome Glisse
2018-01-05 15:41 ` Alex Williamson
[not found] ` <20180104203300.79487c98-DGNDKt5SQtizQB+pC5nmwQ@public.gmane.org>
2018-01-05 17:10 ` Logan Gunthorpe
[not found] ` <77ec7893-6ff5-ebdf-163d-fc4e02077cc2-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-05 17:18 ` Alex Williamson
2018-01-04 19:01 ` [PATCH 05/12] block: Introduce PCI P2P flags for request and request queue Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 06/12] IB/core: Add optional PCI P2P flag to rdma_rw_ctx_[init|destroy]() Logan Gunthorpe
2018-01-04 19:22 ` Jason Gunthorpe
[not found] ` <20180104192225.GS11348-uk2M96/98Pc@public.gmane.org>
2018-01-04 19:52 ` Logan Gunthorpe
2018-01-04 22:13 ` Jason Gunthorpe
2018-01-04 23:44 ` Logan Gunthorpe
2018-01-05 4:50 ` Jason Gunthorpe
2018-01-08 14:59 ` Christoph Hellwig
2018-01-08 18:09 ` Jason Gunthorpe
[not found] ` <20180108180917.GF11348-uk2M96/98Pc@public.gmane.org>
2018-01-08 18:17 ` Logan Gunthorpe
[not found] ` <3daea7fe-f64a-36a9-ca80-0cb4d9acf171-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-08 18:29 ` Jason Gunthorpe
2018-01-08 18:34 ` Christoph Hellwig
[not found] ` <20180108183434.GA15549-jcswGhMUV9g@public.gmane.org>
2018-01-08 18:44 ` Logan Gunthorpe
[not found] ` <cffacbee-477a-fbe9-19cc-373cd2ec6fef-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-08 18:57 ` Christoph Hellwig
[not found] ` <20180108185743.GA15936-jcswGhMUV9g@public.gmane.org>
2018-01-08 19:05 ` Logan Gunthorpe
[not found] ` <7d276107-d2ae-530b-3d56-e104e22b4eea-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-09 16:47 ` Christoph Hellwig
2018-01-08 19:49 ` Jason Gunthorpe
2018-01-09 16:46 ` Christoph Hellwig
2018-01-09 17:10 ` Jason Gunthorpe
2018-01-08 19:01 ` Jason Gunthorpe
[not found] ` <20180108190116.GI11348-uk2M96/98Pc@public.gmane.org>
2018-01-09 16:55 ` Christoph Hellwig
2018-01-04 19:01 ` [PATCH 07/12] nvme-pci: clean up CMB initialization Logan Gunthorpe
[not found] ` <20180104190137.7654-8-logang-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-04 19:08 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 08/12] nvme-pci: clean up SMBSZ bit definitions Logan Gunthorpe
[not found] ` <20180104190137.7654-9-logang-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-04 19:08 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 09/12] nvme-pci: Use PCI p2pmem subsystem to manage the CMB Logan Gunthorpe
[not found] ` <20180104190137.7654-10-logang-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-05 15:30 ` Marta Rybczynska
[not found] ` <281272464.115248600.1515166226009.JavaMail.zimbra-FNhOzJFKnXGHXe+LvDLADg@public.gmane.org>
2018-01-05 18:14 ` Logan Gunthorpe
2018-01-05 18:11 ` Keith Busch
2018-01-05 18:19 ` Logan Gunthorpe
[not found] ` <6c89e4b4-1854-c251-a5ec-7e54bc8085fc-OTvnGxWRz7hWk0Htik3J/w@public.gmane.org>
2018-01-05 19:01 ` Keith Busch
2018-01-05 19:04 ` Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 10/12] nvme-pci: Add support for P2P memory in requests Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 11/12] nvme-pci: Add a quirk for a pseudo CMB Logan Gunthorpe
2018-01-04 19:01 ` [PATCH 12/12] nvmet: Optionally use PCI P2P memory Logan Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20180104153551.3118f71b@t450s.home \
--to=alex.williamson@redhat.com \
--cc=axboe@kernel.dk \
--cc=benh@kernel.crashing.org \
--cc=bhelgaas@google.com \
--cc=dan.j.williams@intel.com \
--cc=hch@lst.de \
--cc=helgaas@kernel.org \
--cc=jgg@mellanox.com \
--cc=jglisse@redhat.com \
--cc=keith.busch@intel.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nvdimm@lists.01.org \
--cc=linux-nvme@lists.infradead.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=logang@deltatee.com \
--cc=maxg@mellanox.com \
--cc=sagi@grimberg.me \
--cc=sbates@raithlin.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox