linux-mm.kvack.org archive mirror
From: Robin Murphy <robin.murphy@arm.com>
To: Logan Gunthorpe <logang@deltatee.com>,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-mm@kvack.org, iommu@lists.linux-foundation.org
Cc: "Stephen Bates" <sbates@raithlin.com>,
	"Christoph Hellwig" <hch@lst.de>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Christian König" <christian.koenig@amd.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Don Dutile" <ddutile@redhat.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Minturn Dave B" <dave.b.minturn@intel.com>,
	"Jason Ekstrand" <jason@jlekstrand.net>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Xiong Jianxin" <jianxin.xiong@intel.com>,
	"Bjorn Helgaas" <helgaas@kernel.org>,
	"Ira Weiny" <ira.weiny@intel.com>,
	"Martin Oliveira" <martin.oliveira@eideticom.com>,
	"Chaitanya Kulkarni" <ckulkarnilinux@gmail.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"Chaitanya Kulkarni" <kch@nvidia.com>
Subject: Re: [PATCH v7 01/21] lib/scatterlist: add flag for indicating P2PDMA segments in an SGL
Date: Wed, 29 Jun 2022 19:02:33 +0100
Message-ID: <d84a0498-3b7f-3d38-2bfd-9a175db4002a@arm.com>
In-Reply-To: <c42b5ee3-5d4f-7e44-8885-26b8417208ae@deltatee.com>

On 2022-06-29 16:39, Logan Gunthorpe wrote:
> 
> 
> 
> On 2022-06-29 03:05, Robin Murphy wrote:
>> On 2022-06-15 17:12, Logan Gunthorpe wrote:
>>> Make use of the third free LSB in scatterlist's page_link on 64bit
>>> systems.
>>>
>>> The extra bit will be used by dma_[un]map_sg_p2pdma() to determine when a
>>> given SGL segment's dma_address points to a PCI bus address.
>>> dma_unmap_sg_p2pdma() will need to perform different cleanup when a
>>> segment is marked as a bus address.
>>>
>>> The new bit will only be used when CONFIG_PCI_P2PDMA is set; this means
>>> PCI P2PDMA will require CONFIG_64BIT. This should be acceptable as the
>>> majority of P2PDMA use cases are restricted to newer root complexes and
>>> roughly require the extra address space for memory BARs used in the
>>> transactions.
>>>
>>> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
>>> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
>>> ---
>>>    drivers/pci/Kconfig         |  5 +++++
>>>    include/linux/scatterlist.h | 44 ++++++++++++++++++++++++++++++++++++-
>>>    2 files changed, 48 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
>>> index 133c73207782..5cc7cba1941f 100644
>>> --- a/drivers/pci/Kconfig
>>> +++ b/drivers/pci/Kconfig
>>> @@ -164,6 +164,11 @@ config PCI_PASID
>>>    config PCI_P2PDMA
>>>        bool "PCI peer-to-peer transfer support"
>>>        depends on ZONE_DEVICE
>>> +    #
>>> +    # The need for the scatterlist DMA bus address flag means PCI P2PDMA
>>> +    # requires 64bit
>>> +    #
>>> +    depends on 64BIT
>>>        select GENERIC_ALLOCATOR
>>>        help
>>>          Enables drivers to do PCI peer-to-peer transactions to and from
>>> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
>>> index 7ff9d6386c12..6561ca8aead8 100644
>>> --- a/include/linux/scatterlist.h
>>> +++ b/include/linux/scatterlist.h
>>> @@ -64,12 +64,24 @@ struct sg_append_table {
>>>    #define SG_CHAIN    0x01UL
>>>    #define SG_END        0x02UL
>>>    +/*
>>> + * bit 2 is the third free bit in the page_link on 64bit systems which
>>> + * is used by dma_unmap_sg() to determine if the dma_address is a
>>> + * bus address when doing P2PDMA.
>>> + */
>>> +#ifdef CONFIG_PCI_P2PDMA
>>> +#define SG_DMA_BUS_ADDRESS    0x04UL
>>> +static_assert(__alignof__(struct page) >= 8);
>>> +#else
>>> +#define SG_DMA_BUS_ADDRESS    0x00UL
>>> +#endif
>>> +
>>>    /*
>>>     * We overload the LSB of the page pointer to indicate whether it's
>>>     * a valid sg entry, or whether it points to the start of a new scatterlist.
>>>     * Those low bits are there for everyone! (thanks mason :-)
>>>     */
>>> -#define SG_PAGE_LINK_MASK (SG_CHAIN | SG_END)
>>> +#define SG_PAGE_LINK_MASK (SG_CHAIN | SG_END | SG_DMA_BUS_ADDRESS)
>>>      static inline unsigned int __sg_flags(struct scatterlist *sg)
>>>    {
>>> @@ -91,6 +103,11 @@ static inline bool sg_is_last(struct scatterlist *sg)
>>>        return __sg_flags(sg) & SG_END;
>>>    }
>>>    +static inline bool sg_is_dma_bus_address(struct scatterlist *sg)
>>> +{
>>> +    return __sg_flags(sg) & SG_DMA_BUS_ADDRESS;
>>> +}
>>> +
>>>    /**
>>>     * sg_assign_page - Assign a given page to an SG entry
>>>     * @sg:            SG entry
>>> @@ -245,6 +262,31 @@ static inline void sg_unmark_end(struct scatterlist *sg)
>>>        sg->page_link &= ~SG_END;
>>>    }
>>>    +/**
>>> + * sg_dma_mark_bus_address - Mark the scatterlist entry as a bus address
>>> + * @sg:         SG entryScatterlist
>>
>> entryScatterlist?
>>
>>> + *
>>> + * Description:
>>> + *   Marks the passed in sg entry to indicate that the dma_address is
>>> + *   a bus address and doesn't need to be unmapped.
>>> + **/
>>> +static inline void sg_dma_mark_bus_address(struct scatterlist *sg)
>>> +{
>>> +    sg->page_link |= SG_DMA_BUS_ADDRESS;
>>> +}
>>> +
>>> +/**
>>> + * sg_dma_unmark_bus_address - Unmark the scatterlist entry as a bus address
>>> + * @sg:         SG entryScatterlist
>>> + *
>>> + * Description:
>>> + *   Clears the bus address mark.
>>> + **/
>>> +static inline void sg_dma_unmark_bus_address(struct scatterlist *sg)
>>> +{
>>> +    sg->page_link &= ~SG_DMA_BUS_ADDRESS;
>>> +}
>>
>> Does this serve any useful purpose? If a page is determined to be device
>> memory, it's not going to suddenly stop being device memory, and if the
>> underlying sg is recycled to point elsewhere then sg_assign_page() will
>> still (correctly) clear this flag anyway. Trying to reason about this
>> beyond superficial API symmetry - i.e. why exactly would a caller need
>> to call it, and what would the implications be of failing to do so -
>> seems to lead straight to confusion.
>>
>> In fact I'd be inclined to have sg_assign_page() be responsible for
>> setting the flag automatically as well, and thus not need
>> sg_dma_mark_bus_address() either, however I can see the argument for
>> doing it this way round to not entangle the APIs too much, so I don't
>> have any great objection to that.
> 
> Yes, I think you misunderstand what this is for. The SG_DMA_BUS_ADDRESS
> flag doesn't mark the segment for the page, but for the dma address. It
> cannot be set in sg_assign_page() seeing it's not a property of the page
> but a property of the dma_address in the sgl.
> 
> It's not meant for use by regular SG users, it's only meant for use
> inside DMA mapping implementations. The purpose is to know whether a
> given dma_address in the SGL is a bus address or regular memory because
> the two different types must be unmapped differently. We can't rely on
> the page because, as you know, with many dma_map_sg() implementations
> the dma_address entry in the sgl does not map to the same memory as the
> page. Or to put it
> another way: is_pci_p2pdma_page(sg->page) does not imply that
> sg->dma_address points to a bus address.
> 
> Does that make sense?

Ah, you're quite right, in trying to take in the whole series at once 
first thing in the morning I did fail to properly grasp that detail, so 
indeed the sg_assign_page() thing couldn't possibly work, but as I said 
that's fine anyway. I still think the lifecycle management is a bit off 
though - equivalently, a bus address doesn't stop being a bus address, 
so it would seem appropriate to update this flag whenever
sg_dma_address() is assigned to, and at no other time.
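
To make the lifecycle point concrete, here is a minimal userspace sketch,
not kernel code and not what the series implements. The flag values mirror
the patch, but struct model_sg and every model_* helper are hypothetical
names made up for illustration. The assumption is that the flag is written
in exactly one place, together with the DMA address, so the unmap path can
trust it without a separate mark/unmark step:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as in the patch; everything else here is a userspace model. */
#define SG_CHAIN           0x01UL
#define SG_END             0x02UL
#define SG_DMA_BUS_ADDRESS 0x04UL
#define SG_PAGE_LINK_MASK  (SG_CHAIN | SG_END | SG_DMA_BUS_ADDRESS)

/* Hypothetical stand-in for struct scatterlist. */
struct model_sg {
	unsigned long page_link;   /* page pointer with flags in the low bits */
	uint64_t dma_address;
};

static bool model_sg_is_dma_bus_address(const struct model_sg *sg)
{
	return (sg->page_link & SG_DMA_BUS_ADDRESS) != 0;
}

/*
 * Hypothetical helper: the only place the flag is touched. Assigning a
 * DMA address always states whether it is a bus address, so a recycled
 * entry can never carry a stale flag.
 */
static void model_sg_set_dma_address(struct model_sg *sg, uint64_t addr,
				     bool is_bus_address)
{
	sg->dma_address = addr;
	if (is_bus_address)
		sg->page_link |= SG_DMA_BUS_ADDRESS;
	else
		sg->page_link &= ~SG_DMA_BUS_ADDRESS;
}

/*
 * Model of the unmap path: bus-address segments were never mapped
 * through the IOMMU or bounce buffers, so they are skipped. Returns
 * the number of segments that needed a real unmap.
 */
static int model_unmap_sg(const struct model_sg *sgl, int nents)
{
	int unmapped = 0;

	assert(sgl != NULL);
	for (int i = 0; i < nents; i++) {
		if (model_sg_is_dma_bus_address(&sgl[i]))
			continue;
		unmapped++;
	}
	return unmapped;
}
```

With this shape the flag follows the dma_address rather than the page,
which is the distinction Logan describes above, and the page pointer bits
outside SG_PAGE_LINK_MASK are left untouched by the helper.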

Thanks,
Robin.

