From: Don Dutile <ddutile@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>,
Logan Gunthorpe <logang@deltatee.com>,
linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
linux-block@vger.kernel.org, linux-pci@vger.kernel.org,
linux-mm@kvack.org, iommu@lists.linux-foundation.org
Cc: "Stephen Bates" <sbates@raithlin.com>,
"Christoph Hellwig" <hch@lst.de>,
"Dan Williams" <dan.j.williams@intel.com>,
"Jason Gunthorpe" <jgg@ziepe.ca>,
"Christian König" <christian.koenig@amd.com>,
"Matthew Wilcox" <willy@infradead.org>,
"Daniel Vetter" <daniel.vetter@ffwll.ch>,
"Jakowski Andrzej" <andrzej.jakowski@intel.com>,
"Minturn Dave B" <dave.b.minturn@intel.com>,
"Jason Ekstrand" <jason@jlekstrand.net>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"Xiong Jianxin" <jianxin.xiong@intel.com>,
"Bjorn Helgaas" <helgaas@kernel.org>,
"Ira Weiny" <ira.weiny@intel.com>,
"Robin Murphy" <robin.murphy@arm.com>
Subject: Re: [PATCH 02/16] PCI/P2PDMA: Avoid pci_get_slot() which sleeps
Date: Tue, 11 May 2021 12:05:40 -0400
Message-ID: <a6830332-c866-451f-3c6a-585cbf295ff8@redhat.com>
In-Reply-To: <d6220bff-83fc-6c03-76f7-32e9e00e40fd@nvidia.com>
On 5/2/21 1:35 AM, John Hubbard wrote:
> On 4/8/21 10:01 AM, Logan Gunthorpe wrote:
>> In order to use upstream_bridge_distance_warn() from a dma_map function,
>> it must not sleep. However, pci_get_slot() takes the pci_bus_sem so it
>> might sleep.
>>
>> In order to avoid this, try to get the host bridge's device from
>> bus->self, and if that is not set, just get the first element in the
>> device list. It should be impossible for the host bridge's device to
>> go away while references are held on child devices, so the first element
>> should not be able to change and, thus, this should be safe.
>>
>> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
>> ---
>> drivers/pci/p2pdma.c | 14 ++++++++++++--
>> 1 file changed, 12 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
>> index bd89437faf06..473a08940fbc 100644
>> --- a/drivers/pci/p2pdma.c
>> +++ b/drivers/pci/p2pdma.c
>> @@ -311,16 +311,26 @@ static const struct pci_p2pdma_whitelist_entry {
>> static bool __host_bridge_whitelist(struct pci_host_bridge *host,
>> bool same_host_bridge)
>> {
>> - struct pci_dev *root = pci_get_slot(host->bus, PCI_DEVFN(0, 0));
>> const struct pci_p2pdma_whitelist_entry *entry;
>> + struct pci_dev *root = host->bus->self;
>> unsigned short vendor, device;
>> + /*
>> + * This makes the assumption that the first device on the bus is the
>> + * bridge itself and it has the devfn of 00.0. This assumption should
>> + * hold for the devices in the white list above, and if there are cases
>> + * where this isn't true they will have to be dealt with when such a
>> + * case is added to the whitelist.
>
> Actually, it makes the assumption that the first device *in the list*
> (the host->bus->devices list) is 00.0. The previous code made the
> assumption that you wrote.
>
> By the way, pre-existing code comment: pci_p2pdma_whitelist[] seems
> really short. From a naive point of view, I'd expect that there must be
> a lot more CPUs/chipsets that can do pci p2p, what do you think? I
> wonder if we have to be so super strict, anyway. It just seems extremely
> limited, and I suspect there will be some additions to the list as soon
> as we start to use this.
>
>
>> + */
>> if (!root)
>> + root = list_first_entry_or_null(&host->bus->devices,
>> + struct pci_dev, bus_list);
>
> OK, yes this avoids taking the pci_bus_sem, but it's kind of cheating.
> Why is it OK to avoid taking any locks in order to retrieve the
> first entry from the list, but in order to retrieve any other entry, you
> have to acquire the pci_bus_sem, and get a reference as well? Something
> is inconsistent there.
>
> The new version here also no longer takes a reference on the device,
> which is also cheating. But I'm guessing that the unstated assumption
> here is that there is always at least one entry in the list. But if
> that's true, then it's better to show clearly that assumption, instead
> of hiding it in an implicit call that skips both locking and reference
> counting.
>
> You could add a new function, which is a cut-down version of pci_get_slot(),
> like this, and call this from __host_bridge_whitelist():
>
> /*
> * A special purpose variant of pci_get_slot() that doesn't take the pci_bus_sem
> * lock, and only looks for the 00.0 bus-device-function. Once the PCI bus is
> * up, it is safe to call this, because there will always be a top-level PCI
> * root device.
> *
> * Other assumptions: the root device is the first device in the list, and the
> * root device is numbered 00.0.
> */
> struct pci_dev *pci_get_root_slot(struct pci_bus *bus)
> {
> struct pci_dev *root;
> unsigned devfn = PCI_DEVFN(0, 0);
>
> root = list_first_entry_or_null(&bus->devices, struct pci_dev,
> bus_list);
> if (root && root->devfn == devfn)
> goto out;
>
... add a flag (set for p2pdma use) to the function to print out what root->devfn is and what
the device is, so the needed quirk and/or modification can be added to handle the case where this assumption fails;
or make it a pr_debug() that can be flipped on for the failing situation, again so the needed change can be added to accommodate it.
> root = NULL;
> out:
> pci_dev_get(root);
> return root;
> }
> EXPORT_SYMBOL(pci_get_root_slot);
>
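To make the suggestion above a bit more concrete, here is a rough userspace mock, not kernel code. struct mock_pci_dev, struct mock_bus, mock_get_root_slot() and the verbose flag are all made-up names for illustration, and a plain singly linked list stands in for bus->devices. It checks for an empty list before looking at devfn, and reports what it found when the 00.0 assumption fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Same encoding as the kernel's PCI_DEVFN(): 5 bits of slot, 3 bits of function. */
#define MOCK_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Hypothetical stand-in for struct pci_dev; only the fields used here. */
struct mock_pci_dev {
	unsigned int devfn;
	unsigned short vendor;
	unsigned short device;
	int refcount;               /* stand-in for pci_dev_get()/pci_dev_put() */
	struct mock_pci_dev *next;  /* stand-in for the bus_list linkage */
};

/* Hypothetical stand-in for struct pci_bus. */
struct mock_bus {
	struct mock_pci_dev *devices; /* first device on the bus, or NULL */
};

/*
 * Return the first device on the bus if it is 00.0, with a reference taken;
 * otherwise return NULL. With verbose set, report what was found instead, so
 * the needed quirk can be worked out.
 */
struct mock_pci_dev *mock_get_root_slot(struct mock_bus *bus, bool verbose)
{
	struct mock_pci_dev *root;

	assert(bus != NULL);
	root = bus->devices;

	if (!root) {
		if (verbose)
			fprintf(stderr, "root slot: device list is empty\n");
		return NULL;
	}

	if (root->devfn != MOCK_DEVFN(0, 0)) {
		if (verbose)
			fprintf(stderr,
				"root slot: first device is %02x.%x [%04x:%04x], not 00.0\n",
				root->devfn >> 3, root->devfn & 0x07,
				(unsigned int)root->vendor,
				(unsigned int)root->device);
		return NULL;
	}

	root->refcount++;
	return root;
}
```

In the kernel the fprintf() calls would presumably become pci_dbg() or pr_debug(), so the message stays off unless someone is chasing a platform where the assumption fails.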
> ...I think that's a lot clearer to the reader, about what's going on here.
>
> Note that I'm not really sure if it *is* safe, I would need to ask other
> PCIe subsystem developers with more experience. But I don't think anyone
> is trying to make p2pdma calls so early that PCIe buses are uninitialized.
>
>
>> +
>> + if (!root || root->devfn)
>> return false;
>> vendor = root->vendor;
>> device = root->device;
>> - pci_dev_put(root);
and the reason to remove the pci_dev_put() is because it can sleep as well?
Is that OK, given the pci_dev_get() that John put into the new pci_get_root_slot()?
... seems like a locking version with no get/puts is needed, or fix the host-bridge setups so that bus->self is never NULL.
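One way to read the "no get/puts" option, sketched as a userspace mock rather than kernel code (struct mock_root and mock_root_ids() are invented names): copy out the two IDs the whitelist needs while the device is known to be valid, so no pointer and no reference leaves the helper. Whether that is safe without pci_bus_sem is exactly the open question here; the mock does not model locking at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct pci_dev fields the whitelist reads. */
struct mock_root {
	unsigned int devfn;
	unsigned short vendor;
	unsigned short device;
};

/*
 * Copy the root device's IDs out by value. Returns false if there is no
 * root device or it is not at devfn 00.0, mirroring the
 * "if (!root || root->devfn) return false;" check in the patch.
 * Nothing is returned by pointer, so there is no reference to drop later.
 */
bool mock_root_ids(const struct mock_root *root,
		   unsigned short *vendor, unsigned short *device)
{
	assert(vendor != NULL && device != NULL);

	if (!root || root->devfn)
		return false;

	*vendor = root->vendor;
	*device = root->device;
	return true;
}
```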
>> for (entry = pci_p2pdma_whitelist; entry->vendor; entry++) {
>> if (vendor != entry->vendor || device != entry->device)
>>
>
> thanks,