From: Jerome Glisse <jglisse@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: John Hubbard <jhubbard@nvidia.com>,
John Hubbard <john.hubbard@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux MM <linux-mm@kvack.org>, Jan Kara <jack@suse.cz>,
tom@talpey.com, Al Viro <viro@zeniv.linux.org.uk>,
benve@cisco.com, Christoph Hellwig <hch@infradead.org>,
Christopher Lameter <cl@linux.com>,
"Dalessandro, Dennis" <dennis.dalessandro@intel.com>,
Doug Ledford <dledford@redhat.com>,
Jason Gunthorpe <jgg@ziepe.ca>,
Matthew Wilcox <willy@infradead.org>,
Michal Hocko <mhocko@kernel.org>,
mike.marciniszyn@intel.com, rcampbell@nvidia.com,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions
Date: Tue, 4 Dec 2018 19:36:48 -0500 [thread overview]
Message-ID: <20181205003648.GT2937@redhat.com> (raw)
In-Reply-To: <CAPcyv4iNtamDAY9raab=iXhSZByecedBpnGybjLM+PuDMwq7SQ@mail.gmail.com>
On Tue, Dec 04, 2018 at 03:03:02PM -0800, Dan Williams wrote:
> On Tue, Dec 4, 2018 at 1:56 PM John Hubbard <jhubbard@nvidia.com> wrote:
> >
> > On 12/4/18 12:28 PM, Dan Williams wrote:
> > > On Mon, Dec 3, 2018 at 4:17 PM <john.hubbard@gmail.com> wrote:
> > >>
> > >> From: John Hubbard <jhubbard@nvidia.com>
> > >>
> > >> Introduces put_user_page(), which simply calls put_page().
> > >> This provides a way to update all get_user_pages*() callers,
> > >> so that they call put_user_page(), instead of put_page().
> > >>
> > >> Also introduces put_user_pages(), and a few dirty/locked variations,
> > >> as a replacement for release_pages(), and also as a replacement
> > >> for open-coded loops that release multiple pages.
> > >> These may be used for subsequent performance improvements,
> > >> via batching of pages to be released.
> > >>
> > >> This is the first step of fixing the problem described in [1]. The steps
> > >> are:
> > >>
> > >> 1) (This patch): provide put_user_page*() routines, intended to be used
> > >> for releasing pages that were pinned via get_user_pages*().
> > >>
> > >> 2) Convert all of the call sites for get_user_pages*(), to
> > >> invoke put_user_page*(), instead of put_page(). This involves dozens of
> > >> call sites, and will take some time.
> > >>
> > >> 3) After (2) is complete, use get_user_pages*() and put_user_page*() to
> > >> implement tracking of these pages. This tracking will be separate from
> > >> the existing struct page refcounting.
> > >>
> > >> 4) Use the tracking and identification of these pages, to implement
> > >> special handling (especially in writeback paths) when the pages are
> > >> backed by a filesystem. Again, [1] provides details as to why that is
> > >> desirable.
> > >
> > > I thought at Plumbers we talked about using a page bit to tag pages
> > > that have had their reference count elevated by get_user_pages()? That
> > > way there is no need to distinguish put_page() from put_user_page() it
> > > just happens internally to put_page(). At the conference Matthew was
> > > offering to free up a page bit for this purpose.
> > >
> >
> > ...but then, upon further discussion in that same session, we realized that
> > that doesn't help. You need a reference count. Otherwise a random put_page
> > could affect your dma-pinned pages, etc, etc.
>
> Ok, sorry, I mis-remembered. So, you're effectively trying to capture
> the end of the page pin event separate from the final 'put' of the
> page? Makes sense.
>
> > I was not able to actually find any place where a single additional page
> > bit would help our situation, which is why this still uses LRU fields for
> > both the two bits required (the RFC [1] still applies), and the dma_pinned_count.
>
> Except the LRU fields are already in use for ZONE_DEVICE pages... how
> does this proposal interact with those?
>
> > [1] https://lore.kernel.org/r/20181110085041.10071-7-jhubbard@nvidia.com
> >
> > >> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
> > >>
> > >> Reviewed-by: Jan Kara <jack@suse.cz>
> > >
> > > Wish, you could have been there Jan. I'm missing why it's safe to
> > > assume that a single put_user_page() is paired with a get_user_page()?
> > >
> >
> > A put_user_page() per page, or a put_user_pages() for an array of pages. See
> > patch 0002 for several examples.
>
> Yes, however I was more concerned about validation and trying to
> locate missed places where put_page() is used instead of
> put_user_page().
>
> It would be interesting to see if we could have a debug mode where
> get_user_pages() returned dynamically allocated pages from a known
> address range and catch drivers that operate on a user-pinned page
> without using the proper helper to 'put' it. I think we might also
> need a ref_user_page() for drivers that may do their own get_page()
> and expect the dma_pinned_count to also increase.
Total crazy idea for this, but this is the right time of day
for this (for me at least it is beer time :)) What about mapping
all struct page in two different range of kernel virtual address
and when get user space is use it returns a pointer from the second
range of kernel virtual address to the struct page. Then in put_page
you know for sure if the code putting the page got it from GUP or
from somewhere else. page_to_pfn() would need some trickery to
handle that.
Dunno if we are running out of kernel virtual address (outside
32bits that i believe we are trying to shot down quietly behind
the bar).
Cheers,
J�r�me
next prev parent reply other threads:[~2018-12-05 0:36 UTC|newest]
Thread overview: 214+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-04 0:17 [PATCH 0/2] put_user_page*(): start converting the call sites john.hubbard
2018-12-04 0:17 ` [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions john.hubbard
2018-12-04 7:53 ` Mike Rapoport
2018-12-05 1:40 ` John Hubbard
2018-12-05 1:40 ` John Hubbard
2018-12-04 20:28 ` Dan Williams
2018-12-04 21:56 ` John Hubbard
2018-12-04 23:03 ` Dan Williams
2018-12-05 0:36 ` Jerome Glisse [this message]
2018-12-05 0:40 ` Dan Williams
2018-12-05 0:59 ` John Hubbard
2018-12-05 0:59 ` John Hubbard
2018-12-05 0:58 ` John Hubbard
2018-12-05 1:00 ` Dan Williams
2018-12-05 1:15 ` Matthew Wilcox
2018-12-05 1:44 ` Jerome Glisse
2018-12-05 1:57 ` John Hubbard
2018-12-05 1:57 ` John Hubbard
2018-12-07 2:45 ` John Hubbard
2018-12-07 2:45 ` John Hubbard
2018-12-07 19:16 ` Jerome Glisse
2018-12-07 19:26 ` Dan Williams
2018-12-07 19:40 ` Jerome Glisse
2018-12-08 0:52 ` John Hubbard
2018-12-08 0:52 ` John Hubbard
2018-12-08 2:24 ` Jerome Glisse
2018-12-10 10:28 ` Jan Kara
2018-12-12 15:03 ` Jerome Glisse
2018-12-12 16:27 ` Dan Williams
2018-12-12 17:02 ` Jerome Glisse
2018-12-12 17:49 ` Dan Williams
2018-12-12 19:07 ` John Hubbard
2018-12-12 19:07 ` John Hubbard
2018-12-12 21:30 ` Jerome Glisse
2018-12-12 21:40 ` Dan Williams
2018-12-12 21:53 ` Jerome Glisse
2018-12-12 22:11 ` Matthew Wilcox
2018-12-12 22:16 ` Jerome Glisse
2018-12-12 23:37 ` Jason Gunthorpe
2018-12-12 23:46 ` John Hubbard
2018-12-12 23:46 ` John Hubbard
2018-12-12 23:54 ` Dan Williams
2018-12-13 0:01 ` Jerome Glisse
2018-12-13 0:18 ` Dan Williams
2018-12-13 0:44 ` Jerome Glisse
2018-12-13 3:26 ` Jason Gunthorpe
2018-12-13 3:20 ` Jason Gunthorpe
2018-12-13 12:43 ` Jerome Glisse
2018-12-13 13:40 ` Tom Talpey
2018-12-13 14:18 ` Jerome Glisse
2018-12-13 14:51 ` Tom Talpey
2018-12-13 15:18 ` Jerome Glisse
2018-12-13 18:12 ` Tom Talpey
2018-12-13 19:18 ` Jerome Glisse
2018-12-14 10:41 ` Jan Kara
2018-12-14 15:25 ` Jerome Glisse
2018-12-12 21:56 ` John Hubbard
2018-12-12 21:56 ` John Hubbard
2018-12-12 22:04 ` Jerome Glisse
2018-12-12 22:11 ` John Hubbard
2018-12-12 22:11 ` John Hubbard
2018-12-12 22:14 ` Jerome Glisse
2018-12-12 22:17 ` John Hubbard
2018-12-12 22:17 ` John Hubbard
2018-12-12 21:46 ` Dave Chinner
2018-12-12 21:59 ` Jerome Glisse
2018-12-13 0:51 ` Dave Chinner
2018-12-13 2:02 ` Jerome Glisse
2018-12-13 15:56 ` Christopher Lameter
2018-12-13 16:02 ` Jerome Glisse
2018-12-14 6:00 ` Dave Chinner
2018-12-14 15:13 ` Jerome Glisse
2018-12-14 3:52 ` John Hubbard
2018-12-14 3:52 ` John Hubbard
2018-12-14 5:21 ` Dan Williams
2018-12-14 6:11 ` John Hubbard
2018-12-14 15:20 ` Jerome Glisse
2018-12-14 19:38 ` Dan Williams
2018-12-14 19:48 ` Matthew Wilcox
2018-12-14 19:53 ` Dave Hansen
2018-12-14 20:03 ` Matthew Wilcox
2018-12-14 20:17 ` Dan Williams
2018-12-14 20:29 ` Matthew Wilcox
2018-12-15 0:41 ` John Hubbard
2018-12-17 8:56 ` Jan Kara
2018-12-17 18:28 ` Dan Williams
2018-12-14 15:43 ` Jan Kara
2018-12-16 21:58 ` Dave Chinner
2018-12-17 18:11 ` Jerome Glisse
2018-12-17 18:34 ` Matthew Wilcox
2018-12-17 19:48 ` Jerome Glisse
2018-12-17 19:51 ` Matthew Wilcox
2018-12-17 19:54 ` Jerome Glisse
2018-12-17 19:59 ` Matthew Wilcox
2018-12-17 20:55 ` Jerome Glisse
2018-12-17 21:03 ` Matthew Wilcox
2018-12-17 21:15 ` Jerome Glisse
2018-12-18 1:09 ` Dave Chinner
2018-12-18 6:12 ` Darrick J. Wong
2018-12-18 9:30 ` Jan Kara
2018-12-18 23:29 ` John Hubbard
2018-12-18 23:29 ` John Hubbard
2018-12-19 2:07 ` Jerome Glisse
2018-12-19 11:08 ` Jan Kara
2018-12-20 10:54 ` John Hubbard
2018-12-20 10:54 ` John Hubbard
2018-12-20 16:50 ` Jerome Glisse
2018-12-20 16:50 ` Jerome Glisse
2018-12-20 16:57 ` Dan Williams
2018-12-20 16:49 ` Jerome Glisse
2018-12-20 16:49 ` Jerome Glisse
2019-01-03 1:55 ` Jerome Glisse
2019-01-03 1:55 ` Jerome Glisse
2019-01-03 3:27 ` John Hubbard
2019-01-03 3:27 ` John Hubbard
2019-01-03 14:57 ` Jerome Glisse
2019-01-03 14:57 ` Jerome Glisse
2019-01-03 9:26 ` Jan Kara
2019-01-03 9:26 ` Jan Kara
2019-01-03 14:44 ` Jerome Glisse
2019-01-03 14:44 ` Jerome Glisse
2019-01-11 2:59 ` John Hubbard
2019-01-11 2:59 ` John Hubbard
2019-01-11 16:51 ` Jerome Glisse
2019-01-11 16:51 ` Jerome Glisse
2019-01-12 1:04 ` John Hubbard
2019-01-12 1:04 ` John Hubbard
2019-01-12 2:02 ` Jerome Glisse
2019-01-12 2:38 ` John Hubbard
2019-01-12 2:38 ` John Hubbard
2019-01-12 2:46 ` Jerome Glisse
2019-01-12 3:06 ` John Hubbard
2019-01-12 3:06 ` John Hubbard
2019-01-12 3:25 ` Jerome Glisse
2019-01-12 20:46 ` John Hubbard
2019-01-12 20:46 ` John Hubbard
2019-01-14 14:54 ` Jan Kara
2019-01-14 17:21 ` Jerome Glisse
2019-01-14 19:09 ` John Hubbard
2019-01-14 19:09 ` John Hubbard
2019-01-14 19:09 ` John Hubbard
2019-01-15 8:34 ` Jan Kara
2019-01-15 21:39 ` John Hubbard
2019-01-15 21:39 ` John Hubbard
2019-01-15 8:07 ` Jan Kara
2019-01-15 17:15 ` Jerome Glisse
2019-01-15 21:56 ` John Hubbard
2019-01-15 21:56 ` John Hubbard
2019-01-15 22:12 ` Jerome Glisse
2019-01-16 0:44 ` John Hubbard
2019-01-16 0:44 ` John Hubbard
2019-01-16 1:56 ` Jerome Glisse
2019-01-16 2:01 ` Dan Williams
2019-01-16 2:23 ` Jerome Glisse
2019-01-16 4:34 ` Dave Chinner
2019-01-16 14:50 ` Jerome Glisse
2019-01-16 22:51 ` Dave Chinner
2019-01-16 11:38 ` Jan Kara
2019-01-16 13:08 ` Jerome Glisse
2019-01-17 5:42 ` John Hubbard
2019-01-17 5:42 ` John Hubbard
2019-01-17 15:21 ` Jerome Glisse
2019-01-18 0:16 ` Dave Chinner
2019-01-18 1:59 ` Jerome Glisse
2019-01-17 9:30 ` Jan Kara
2019-01-17 15:17 ` Jerome Glisse
2019-01-22 15:24 ` Jan Kara
2019-01-22 16:46 ` Jerome Glisse
2019-01-23 18:02 ` Jan Kara
2019-01-23 19:04 ` Jerome Glisse
2019-01-29 0:22 ` John Hubbard
2019-01-29 1:23 ` Jerome Glisse
2019-01-29 6:41 ` John Hubbard
2019-01-29 10:12 ` Jan Kara
2019-01-30 2:21 ` John Hubbard
2019-01-17 5:25 ` John Hubbard
2019-01-17 5:25 ` John Hubbard
2019-01-17 9:04 ` Jan Kara
2019-01-12 3:14 ` Jerome Glisse
2018-12-18 10:33 ` Jan Kara
2018-12-18 23:42 ` Dave Chinner
2018-12-19 3:03 ` Jason Gunthorpe
2018-12-19 5:26 ` Dan Williams
2018-12-19 11:19 ` Jan Kara
2018-12-19 10:28 ` Dave Chinner
2018-12-19 11:35 ` Jan Kara
2018-12-19 16:56 ` Jason Gunthorpe
2018-12-19 22:33 ` Dave Chinner
2018-12-20 9:07 ` Jan Kara
2018-12-20 16:54 ` Jerome Glisse
2018-12-20 16:54 ` Jerome Glisse
2018-12-19 13:24 ` Jan Kara
2018-12-08 5:18 ` Matthew Wilcox
2018-12-12 19:13 ` John Hubbard
2018-12-12 19:13 ` John Hubbard
2018-12-08 7:16 ` Dan Williams
2018-12-08 16:33 ` Jerome Glisse
2018-12-08 16:48 ` Christoph Hellwig
2018-12-08 17:47 ` Jerome Glisse
2018-12-08 18:26 ` Christoph Hellwig
2018-12-08 18:45 ` Jerome Glisse
2018-12-08 18:09 ` Dan Williams
2018-12-08 18:12 ` Christoph Hellwig
2018-12-11 6:18 ` Dave Chinner
2018-12-05 5:52 ` Dan Williams
2018-12-05 11:16 ` Jan Kara
2018-12-04 0:17 ` [PATCH 2/2] infiniband/mm: convert put_page() to put_user_page*() john.hubbard
2018-12-04 17:10 ` [PATCH 0/2] put_user_page*(): start converting the call sites David Laight
2018-12-05 1:05 ` John Hubbard
2018-12-05 14:08 ` David Laight
2018-12-28 8:37 ` Pavel Machek
-- strict thread matches above, loose matches on Subject: below --
2019-02-08 7:56 [PATCH 0/2] mm: put_user_page() call site conversion first john.hubbard
2019-02-08 7:56 ` [PATCH 1/2] mm: introduce put_user_page*(), placeholder versions john.hubbard
2019-02-08 10:32 ` Mike Rapoport
2019-02-08 20:44 ` John Hubbard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181205003648.GT2937@redhat.com \
--to=jglisse@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=benve@cisco.com \
--cc=cl@linux.com \
--cc=dan.j.williams@intel.com \
--cc=dennis.dalessandro@intel.com \
--cc=dledford@redhat.com \
--cc=hch@infradead.org \
--cc=jack@suse.cz \
--cc=jgg@ziepe.ca \
--cc=jhubbard@nvidia.com \
--cc=john.hubbard@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=mike.marciniszyn@intel.com \
--cc=rcampbell@nvidia.com \
--cc=tom@talpey.com \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.