From: Jason Gunthorpe <jgg@nvidia.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>,
	"Kim, Dongwon" <dongwon.kim@intel.com>,
	David Hildenbrand <david@redhat.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	Hugh Dickins <hughd@google.com>,
	"Kasireddy, Vivek" <vivek.kasireddy@intel.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Peter Xu <peterx@redhat.com>,
	"Chang, Junxiao" <junxiao.chang@intel.com>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH v7 3/6] mm/gup: Introduce memfd_pin_folios() for pinning memfd folios (v7)
Date: Wed, 13 Dec 2023 13:06:59 -0400
Message-ID: <20231213170659.GB3261327@nvidia.com>
In-Reply-To: <20231213153634.GA7301@lst.de>

On Wed, Dec 13, 2023 at 04:36:34PM +0100, Christoph Hellwig wrote:
> On Wed, Dec 13, 2023 at 08:31:55AM -0400, Jason Gunthorpe wrote:
> > > That is, populate a scatterlist with ubuf->pagecount number of entries,
> > > where each segment is of size PAGE_SIZE, in order to be consistent and
> > > support a wide variety of DMA importers that may not properly handle
> > > segments that are larger than PAGE_SIZE.
> > 
> > No! This is totally wrong, sg lists must aggregate up to the limits
> > specified in the struct device. We have importer helpers that do this
> > aggregation.
> > 
> > If some driver is working with a sglist and can't handle this it is
> > simply broken. Do not mess up core code to accommodate such things.
> 
> Well.. There's no single driver that is broken, it's more the whole
> dmabuf philosophy that wants things to be mappable by multiple devices
> without knowing their limits beforehand.  So you'll get this cargo
> culting.

It is not so bad: the API has the importer pass a struct device to the
exporter, which the exporter can use in the usual way to shape the sg
list.

But really, I think in most cases importers don't strictly need the sg
list to be in a certain configuration; it is just a combination of lazy
driver writers and a lack of common helpers to iterate over the sg
list in the way they need.

RDMA has done this right, but for it to work efficiently the exporter
*must* aggregate all contiguous memory into a single sg element,
otherwise you lose the HW's large page support.

Jason
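
The aggregation argued for above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not
kernel code and not the udmabuf implementation: struct model_seg,
model_coalesce(), model_count_blocks() and MODEL_PAGE_SIZE are
hypothetical names invented for this example. In the kernel, the usual
tools for this job would be helpers such as
sg_alloc_table_from_pages_segment() combined with
dma_get_max_seg_size(), and RDMA walks the result with block iterators
such as rdma_for_each_block(). The model merges physically contiguous
pages into one segment, capped at a per-device maximum segment size,
and then counts how many hardware blocks an importer would program for
a given block size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* hypothetical page size for this model */

/* Hypothetical stand-in for one scatterlist element. */
struct model_seg {
	uint64_t addr;	/* start address of the segment */
	uint64_t len;	/* length in bytes */
};

/*
 * Merge contiguous page frame numbers into segments no larger than
 * max_seg_size (the model's analogue of a per-device segment limit).
 * Returns the number of segments written, or SIZE_MAX if 'out' is
 * too small to hold them.
 */
size_t model_coalesce(const uint64_t *pfns, size_t npages,
		      uint64_t max_seg_size,
		      struct model_seg *out, size_t out_cap)
{
	size_t nsegs = 0;

	assert(max_seg_size >= MODEL_PAGE_SIZE);

	for (size_t i = 0; i < npages; i++) {
		uint64_t addr = pfns[i] * MODEL_PAGE_SIZE;

		if (nsegs > 0) {
			struct model_seg *last = &out[nsegs - 1];

			/* Extend the previous segment if this page follows it. */
			if (last->addr + last->len == addr &&
			    last->len + MODEL_PAGE_SIZE <= max_seg_size) {
				last->len += MODEL_PAGE_SIZE;
				continue;
			}
		}

		if (nsegs == out_cap)
			return SIZE_MAX;

		/* Start a new segment with this page. */
		out[nsegs].addr = addr;
		out[nsegs].len = MODEL_PAGE_SIZE;
		nsegs++;
	}
	return nsegs;
}

/*
 * Count the block_size-aligned blocks needed to cover every segment.
 * block_size must be a power of two.
 */
size_t model_count_blocks(const struct model_seg *segs, size_t nsegs,
			  uint64_t block_size)
{
	size_t blocks = 0;

	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

	for (size_t i = 0; i < nsegs; i++) {
		uint64_t start = segs[i].addr & ~(block_size - 1);
		uint64_t end = segs[i].addr + segs[i].len;

		blocks += (end - start + block_size - 1) / block_size;
	}
	return blocks;
}
```

With pages 10-13 and 20-21 and a 64 KiB segment limit, the model
produces two segments instead of six PAGE_SIZE entries. The block
iteration then lets an importer pick whatever granularity its hardware
supports without the exporter having to pre-split anything.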
