From: Jason Gunthorpe <jgg@ziepe.ca>
To: Logan Gunthorpe <logang@deltatee.com>
Cc: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-block@vger.kernel.org, linux-pci@vger.kernel.org,
	linux-mm@kvack.org, "Christoph Hellwig" <hch@lst.de>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Christian König" <christian.koenig@amd.com>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Don Dutile" <ddutile@redhat.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Minturn Dave B" <dave.b.minturn@intel.com>,
	"Jason Ekstrand" <jason@jlekstrand.net>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Xiong Jianxin" <jianxin.xiong@intel.com>,
	"Bjorn Helgaas" <helgaas@kernel.org>,
	"Ira Weiny" <ira.weiny@intel.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Martin Oliveira" <martin.oliveira@eideticom.com>,
	"Chaitanya Kulkarni" <ckulkarnilinux@gmail.com>,
	"Ralph Campbell" <rcampbell@nvidia.com>,
	"Stephen Bates" <sbates@raithlin.com>
Subject: Re: [PATCH v10 1/8] mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pages
Date: Fri, 23 Sep 2022 19:58:49 -0300	[thread overview]
Message-ID: <Yy46KbD/PvhaHA6X@ziepe.ca> (raw)
In-Reply-To: <2327d393-af5c-3f4c-b9b9-6852b9d72f90@deltatee.com>

On Fri, Sep 23, 2022 at 02:11:03PM -0600, Logan Gunthorpe wrote:
> 
> 
> On 2022-09-23 13:53, Jason Gunthorpe wrote:
> > On Fri, Sep 23, 2022 at 01:08:31PM -0600, Logan Gunthorpe wrote:
> > I'm encouraging Dan to work on better infrastructure in pgmap core
> > because every pgmap implementation has this issue currently.
> > 
> > For that reason it is probably not so relevant to this series.
> > 
> > Perhaps just clarify in the commit message that the FOLL_LONGTERM
> > restriction is to copy DAX until the pgmap page refcounts are fixed.
> 
> Ok, I'll add that note.
> 
> As for the try_grab_page() fix, to me it doesn't fit well in
> try_grab_page() without doing a bunch of cleanup to change the
> error handling, and the same would have to be added to try_grab_folio().
> So I think it's better to leave it where it was, but move it below the
> respective grab calls. Does the incremental patch below look correct?

Oh? I was thinking of just a very simple thing:

--- a/mm/gup.c
+++ b/mm/gup.c
@@ -225,6 +225,11 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
                node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, 1);
        }
 
+       if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page))) {
+               gup_put_folio(page_folio(page), 1, flags);
+               return false;
+       }
+
        return true;
 }
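
To make the ordering in the hunk above concrete, here is a small userspace sketch (not kernel code) of "grab first, then gate, and undo the grab on rejection". Everything prefixed sk_ or SK_ is a hypothetical stand-in invented for illustration; only the shape of the check follows the diff. Skipping the put when no reference was requested is an assumption of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel flags; the values are arbitrary. */
#define SK_FOLL_GET        0x1u
#define SK_FOLL_PCI_P2PDMA 0x2u

/* Toy page: a reference count plus a marker for P2PDMA memory. */
struct sk_page {
	int refcount;
	bool is_p2pdma;
};

/* Drop one reference; plays the role gup_put_folio() has in the patch. */
static void sk_put_page(struct sk_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Take the reference first, then apply the gate. If the page is P2PDMA
 * memory and the caller did not opt in, undo the grab and report failure.
 */
static bool sk_try_grab_page(struct sk_page *page, unsigned int flags)
{
	if (flags & SK_FOLL_GET)
		page->refcount++;

	if (!(flags & SK_FOLL_PCI_P2PDMA) && page->is_p2pdma) {
		/* Assumption of this sketch: only put what was actually taken. */
		if (flags & SK_FOLL_GET)
			sk_put_page(page);
		return false;
	}

	return true;
}
```

A caller that omits the opt-in flag sees a failed grab and an unchanged reference count; a caller that sets it keeps the reference.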


> I am confused about what happens if neither FOLL_PIN nor FOLL_GET
> is set (which the documentation for try_grab_x() says is possible, but
> other documentation suggests that FOLL_GET is automatically set).
> In that case it'd be impossible to do the check if we can't
> access the page.

try_grab_page() operates under the PTL, so it can probably touch the
page OK (though perhaps we don't even need to check anything).

try_grab_folio() cannot be called without PIN/GET, so like this perhaps:

@@ -123,11 +123,14 @@ static inline struct folio *try_get_folio(struct page *page, int refs)
  */
 struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
 {
+       struct folio *folio;
+
+       if (WARN_ON((flags & (FOLL_GET | FOLL_PIN)) == 0))
+               return NULL;
+
        if (flags & FOLL_GET)
-               return try_get_folio(page, refs);
+               folio = try_get_folio(page, refs);
        else if (flags & FOLL_PIN) {
-               struct folio *folio;
-
                /*
                 * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
                 * right zone, so fail and let the caller fall back to the slow
@@ -160,11 +163,14 @@ struct folio *try_grab_folio(struct page *page, int refs, unsigned int flags)
                                        refs * (GUP_PIN_COUNTING_BIAS - 1));
                node_stat_mod_folio(folio, NR_FOLL_PIN_ACQUIRED, refs);
 
-               return folio;
        }
 
-       WARN_ON_ONCE(1);
-       return NULL;
+       if (unlikely(!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page))) {
+               gup_put_folio(folio, refs, flags);
+               return NULL;
+       }
+
+       return folio;
 }
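
The restructured flow in the hunk above can be sketched in userspace as follows. This is only an illustration under stated assumptions: the fs_ and FS_ names are hypothetical, the "bias" accounting is a toy version of what GUP_PIN_COUNTING_BIAS does for small folios, and a refcount of zero stands in for a failed speculative get. It shows the three steps in order: reject calls with neither GET nor PIN, take the references, then apply the P2PDMA gate and return exactly the references that were taken if the gate rejects the folio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flags and pin bias. */
#define FS_FOLL_GET        0x1u
#define FS_FOLL_PIN        0x2u
#define FS_FOLL_PCI_P2PDMA 0x4u
#define FS_PIN_BIAS        1024

/* Toy folio: a reference count plus a marker for P2PDMA memory. */
struct fs_folio {
	int refcount;
	bool is_p2pdma;
};

/* How much one grab of 'refs' adds to the refcount for these flags. */
static int fs_ref_amount(int refs, unsigned int flags)
{
	return (flags & FS_FOLL_GET) ? refs : refs * FS_PIN_BIAS;
}

/* Undo a grab made with the same refs and flags. */
static void fs_put_folio(struct fs_folio *folio, int refs, unsigned int flags)
{
	int amount = fs_ref_amount(refs, flags);

	assert(folio->refcount >= amount);
	folio->refcount -= amount;
}

static struct fs_folio *fs_try_grab_folio(struct fs_folio *folio, int refs,
					  unsigned int flags)
{
	/* Step 1: the caller must ask for GET or PIN. */
	if ((flags & (FS_FOLL_GET | FS_FOLL_PIN)) == 0)
		return NULL;

	/* Toy model of a failed speculative get: the folio is already free. */
	if (folio->refcount == 0)
		return NULL;

	/* Step 2: take the references. */
	folio->refcount += fs_ref_amount(refs, flags);

	/* Step 3: gate on the opt-in flag and give back what was taken. */
	if (!(flags & FS_FOLL_PCI_P2PDMA) && folio->is_p2pdma) {
		fs_put_folio(folio, refs, flags);
		return NULL;
	}

	return folio;
}
```

The put on the rejection path uses the same refs and flags as the grab, so a rejected caller leaves the count exactly where it found it.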

Jason


