From: Balbir Singh <bsingharora@gmail.com>
To: john.hubbard@gmail.com
Cc: Matthew Wilcox <willy@infradead.org>,
Michal Hocko <mhocko@kernel.org>,
Christopher Lameter <cl@linux.com>,
Jason Gunthorpe <jgg@ziepe.ca>,
Dan Williams <dan.j.williams@intel.com>, Jan Kara <jack@suse.cz>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-rdma <linux-rdma@vger.kernel.org>,
linux-fsdevel@vger.kernel.org, John Hubbard <jhubbard@nvidia.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Jerome Glisse <jglisse@redhat.com>,
Christoph Hellwig <hch@infradead.org>,
Ralph Campbell <rcampbell@nvidia.com>
Subject: Re: [PATCH 2/6] mm: introduce put_user_page*(), placeholder versions
Date: Fri, 12 Oct 2018 18:35:22 +1100 [thread overview]
Message-ID: <20181012073521.GJ8537@350D> (raw)
In-Reply-To: <20181012060014.10242-3-jhubbard@nvidia.com>
On Thu, Oct 11, 2018 at 11:00:10PM -0700, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
>
> Introduces put_user_page(), which simply calls put_page().
> This provides a way to update all get_user_pages*() callers,
> so that they call put_user_page(), instead of put_page().
>
> Also introduces put_user_pages(), and a few dirty/locked variations,
> as a replacement for release_pages(), and also as a replacement
> for open-coded loops that release multiple pages.
> These may be used for subsequent performance improvements,
> via batching of pages to be released.
>
> This is the first step of fixing the problem described in [1]. The steps
> are:
>
> 1) (This patch): provide put_user_page*() routines, intended to be used
> for releasing pages that were pinned via get_user_pages*().
>
> 2) Convert all of the call sites for get_user_pages*(), to
> invoke put_user_page*(), instead of put_page(). This involves dozens of
> call sites, any will take some time.
>
> 3) After (2) is complete, use get_user_pages*() and put_user_page*() to
> implement tracking of these pages. This tracking will be separate from
> the existing struct page refcounting.
>
> 4) Use the tracking and identification of these pages, to implement
> special handling (especially in writeback paths) when the pages are
> backed by a filesystem. Again, [1] provides details as to why that is
> desirable.
>
> [1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"
>
> CC: Matthew Wilcox <willy@infradead.org>
> CC: Michal Hocko <mhocko@kernel.org>
> CC: Christopher Lameter <cl@linux.com>
> CC: Jason Gunthorpe <jgg@ziepe.ca>
> CC: Dan Williams <dan.j.williams@intel.com>
> CC: Jan Kara <jack@suse.cz>
> CC: Al Viro <viro@zeniv.linux.org.uk>
> CC: Jerome Glisse <jglisse@redhat.com>
> CC: Christoph Hellwig <hch@infradead.org>
> CC: Ralph Campbell <rcampbell@nvidia.com>
>
> Reviewed-by: Jan Kara <jack@suse.cz>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
> include/linux/mm.h | 20 +++++++++++
> mm/swap.c | 83 ++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 103 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0416a7204be3..76d18aada9f8 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -943,6 +943,26 @@ static inline void put_page(struct page *page)
> __put_page(page);
> }
>
> +/*
> + * put_user_page() - release a page that had previously been acquired via
> + * a call to one of the get_user_pages*() functions.
> + *
> + * Pages that were pinned via get_user_pages*() must be released via
> + * either put_user_page(), or one of the put_user_pages*() routines
> + * below. This is so that eventually, pages that are pinned via
> + * get_user_pages*() can be separately tracked and uniquely handled. In
> + * particular, interactions with RDMA and filesystems need special
> + * handling.
> + */
> +static inline void put_user_page(struct page *page)
> +{
> + put_page(page);
> +}
> +
> +void put_user_pages_dirty(struct page **pages, unsigned long npages);
> +void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
> +void put_user_pages(struct page **pages, unsigned long npages);
> +
> #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> #define SECTION_IN_PAGE_FLAGS
> #endif
> diff --git a/mm/swap.c b/mm/swap.c
> index 26fc9b5f1b6c..efab3a6b6f91 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -134,6 +134,89 @@ void put_pages_list(struct list_head *pages)
> }
> EXPORT_SYMBOL(put_pages_list);
>
> +/*
> + * put_user_pages_dirty() - for each page in the @pages array, make
> + * that page (or its head page, if a compound page) dirty, if it was
> + * previously listed as clean. Then, release the page using
> + * put_user_page().
> + *
> + * Please see the put_user_page() documentation for details.
> + *
> + * set_page_dirty(), which does not lock the page, is used here.
> + * Therefore, it is the caller's responsibility to ensure that this is
> + * safe. If not, then put_user_pages_dirty_lock() should be called instead.
> + *
> + * @pages: array of pages to be marked dirty and released.
> + * @npages: number of pages in the @pages array.
> + *
> + */
> +void put_user_pages_dirty(struct page **pages, unsigned long npages)
> +{
> + unsigned long index;
> +
> + for (index = 0; index < npages; index++) {
Do we need any checks on npages, npages <= (PUD_SHIFT - PAGE_SHIFT)?
> + struct page *page = compound_head(pages[index]);
> +
> + if (!PageDirty(page))
> + set_page_dirty(page);
> +
> + put_user_page(page);
> + }
> +}
> +EXPORT_SYMBOL(put_user_pages_dirty);
> +
> +/*
> + * put_user_pages_dirty_lock() - for each page in the @pages array, make
> + * that page (or its head page, if a compound page) dirty, if it was
> + * previously listed as clean. Then, release the page using
> + * put_user_page().
> + *
> + * Please see the put_user_page() documentation for details.
> + *
> + * This is just like put_user_pages_dirty(), except that it invokes
> + * set_page_dirty_lock(), instead of set_page_dirty().
> + *
> + * @pages: array of pages to be marked dirty and released.
> + * @npages: number of pages in the @pages array.
> + *
> + */
> +void put_user_pages_dirty_lock(struct page **pages, unsigned long npages)
> +{
> + unsigned long index;
> +
> + for (index = 0; index < npages; index++) {
> + struct page *page = compound_head(pages[index]);
> +
> + if (!PageDirty(page))
> + set_page_dirty_lock(page);
> +
> + put_user_page(page);
> + }
> +}
> +EXPORT_SYMBOL(put_user_pages_dirty_lock);
> +
This can be collapsed w.r.t put_user_pages_dirty, a function pointer indirection
for the locked vs unlocked case, not sure how that affects function optimization.
> +/*
> + * put_user_pages() - for each page in the @pages array, release the page
> + * using put_user_page().
> + *
> + * Please see the put_user_page() documentation for details.
> + *
> + * This is just like put_user_pages_dirty(), except that it invokes
> + * set_page_dirty_lock(), instead of set_page_dirty().
The comment is incorrect.
> + *
> + * @pages: array of pages to be marked dirty and released.
> + * @npages: number of pages in the @pages array.
> + *
> + */
> +void put_user_pages(struct page **pages, unsigned long npages)
> +{
> + unsigned long index;
> +
> + for (index = 0; index < npages; index++)
> + put_user_page(pages[index]);
> +}
Ditto in terms of code duplication
How about
for_each_page_index(index, npages) {
<do the dirty bits if needed>
put_user_pages(pages[index]
}
Then pass what you want the page iterator to do
> +EXPORT_SYMBOL(put_user_pages);
> +
> /*
> * get_kernel_pages() - pin kernel pages in memory
> * @kiov: An array of struct kvec structures
> --
> 2.19.1
>
Balbir Singh.
next prev parent reply other threads:[~2018-10-12 7:35 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-12 6:00 [PATCH 0/6] RFC: gup+dma: tracking dma-pinned pages john.hubbard
2018-10-12 6:00 ` [PATCH 1/6] mm: get_user_pages: consolidate error handling john.hubbard
2018-10-12 6:30 ` Balbir Singh
2018-10-12 22:45 ` John Hubbard
2018-10-12 22:45 ` John Hubbard
2018-10-12 6:00 ` [PATCH 2/6] mm: introduce put_user_page*(), placeholder versions john.hubbard
2018-10-12 7:35 ` Balbir Singh [this message]
2018-10-12 22:31 ` John Hubbard
2018-10-12 22:31 ` John Hubbard
2018-10-12 6:00 ` [PATCH 3/6] infiniband/mm: convert put_page() to put_user_page*() john.hubbard
2018-10-12 6:00 ` [PATCH 4/6] mm: introduce page->dma_pinned_flags, _count john.hubbard
2018-10-12 10:56 ` Balbir Singh
2018-10-13 0:15 ` John Hubbard
2018-10-13 0:15 ` John Hubbard
2018-10-24 11:00 ` Balbir Singh
2018-11-02 23:27 ` John Hubbard
2018-11-02 23:27 ` John Hubbard
2018-10-13 3:55 ` Dave Chinner
2018-10-13 7:34 ` John Hubbard
2018-10-13 7:34 ` John Hubbard
2018-10-13 16:47 ` Christoph Hellwig
2018-10-13 21:19 ` John Hubbard
2018-10-13 21:19 ` John Hubbard
2018-11-05 7:10 ` John Hubbard
2018-11-05 7:10 ` John Hubbard
2018-11-05 9:54 ` Jan Kara
2018-11-06 0:26 ` John Hubbard
2018-11-06 0:26 ` John Hubbard
2018-11-06 2:47 ` Dave Chinner
2018-11-06 11:00 ` Jan Kara
2018-11-06 20:41 ` Dave Chinner
2018-11-07 6:36 ` John Hubbard
2018-11-07 6:36 ` John Hubbard
2018-10-13 23:01 ` Dave Chinner
2018-10-16 8:51 ` Jan Kara
2018-10-17 1:48 ` John Hubbard
2018-10-17 1:48 ` John Hubbard
2018-10-17 11:09 ` Michal Hocko
2018-10-18 0:03 ` John Hubbard
2018-10-18 0:03 ` John Hubbard
2018-10-19 8:11 ` Michal Hocko
2018-10-12 6:00 ` [PATCH 5/6] mm: introduce zone_gup_lock, for dma-pinned pages john.hubbard
2018-10-12 6:00 ` [PATCH 6/6] mm: track gup pages with page->dma_pinned_* fields john.hubbard
2018-10-12 11:07 ` Balbir Singh
2018-10-13 0:33 ` John Hubbard
2018-10-13 0:33 ` John Hubbard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181012073521.GJ8537@350D \
--to=bsingharora@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=dan.j.williams@intel.com \
--cc=hch@infradead.org \
--cc=jack@suse.cz \
--cc=jgg@ziepe.ca \
--cc=jglisse@redhat.com \
--cc=jhubbard@nvidia.com \
--cc=john.hubbard@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-rdma@vger.kernel.org \
--cc=mhocko@kernel.org \
--cc=rcampbell@nvidia.com \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.