From: Jerome Glisse <jglisse@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
"Al Viro" <viro@zeniv.linux.org.uk>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
"Björn Töpel" <bjorn.topel@intel.com>,
"Christoph Hellwig" <hch@infradead.org>,
"Dan Williams" <dan.j.williams@intel.com>,
"Daniel Vetter" <daniel@ffwll.ch>,
"Dave Chinner" <david@fromorbit.com>,
"David Airlie" <airlied@linux.ie>,
"David S . Miller" <davem@davemloft.net>,
"Ira Weiny" <ira.weiny@intel.com>, "Jan Kara" <jack@suse.cz>,
"Jason Gunthorpe" <jgg@ziepe.ca>, "Jens Axboe" <axboe@kernel.dk>,
"Jonathan Corbet" <corbet@lwn.net>,
"Magnus Karlsson" <magnus.karlsson@intel.com>,
"Mauro Carvalho Chehab" <mchehab@kernel.org>,
"Michael Ellerman" <mpe@ellerman.id.au>,
"Michal Hocko" <mhocko@suse.com>,
"Mike Kravetz" <mike.kravetz@oracle.com>,
"Paul Mackerras" <paulus@samba.org>,
"Shuah Khan" <shuah@kernel.org>,
"Vlastimil Babka" <vbabka@suse.cz>,
bpf@vger.kernel.org, dri-devel@lists.freedesktop.org,
kvm@vger.kernel.org, linux-block@vger.kernel.org,
linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org,
linux-rdma@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
netdev@vger.kernel.org, linux-mm@kvack.org,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN
Date: Mon, 4 Nov 2019 12:33:25 -0500 [thread overview]
Message-ID: <20191104173325.GD5134@redhat.com> (raw)
In-Reply-To: <20191103211813.213227-6-jhubbard@nvidia.com>
On Sun, Nov 03, 2019 at 01:18:00PM -0800, John Hubbard wrote:
> Introduce pin_user_pages*() variations of get_user_pages*() calls,
> and also pin_longterm_pages*() variations.
>
> These variants all set FOLL_PIN, which is also introduced, and
> thoroughly documented.
>
> The pin_longterm*() variants also set FOLL_LONGTERM, in addition
> to FOLL_PIN:
>
> pin_user_pages()
> pin_user_pages_remote()
> pin_user_pages_fast()
>
> pin_longterm_pages()
> pin_longterm_pages_remote()
> pin_longterm_pages_fast()
>
> All pages that are pinned via the above calls, must be unpinned via
> put_user_page().
>
> The underlying rules are:
>
> * These are gup-internal flags, so the call sites should not directly
> set FOLL_PIN nor FOLL_LONGTERM. That behavior is enforced with
> assertions, for the new FOLL_PIN flag. However, for the pre-existing
> FOLL_LONGTERM flag, which has some call sites that still directly
> set FOLL_LONGTERM, there is no assertion yet.
>
> * Call sites that want to indicate that they are going to do DirectIO
> ("DIO") or something with similar characteristics, should call a
> get_user_pages()-like wrapper call that sets FOLL_PIN. These wrappers
> will:
> * Start with "pin_user_pages" instead of "get_user_pages". That
> makes it easy to find and audit the call sites.
> * Set FOLL_PIN
>
> * For pages that are received via FOLL_PIN, those pages must be returned
> via put_user_page().
>
> Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases
> in this documentation. (I've reworded it and expanded on it slightly.)
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Few nitpick belows, nonetheless:
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> Documentation/vm/index.rst | 1 +
> Documentation/vm/pin_user_pages.rst | 212 ++++++++++++++++++++++
> include/linux/mm.h | 62 ++++++-
> mm/gup.c | 265 +++++++++++++++++++++++++---
> 4 files changed, 514 insertions(+), 26 deletions(-)
> create mode 100644 Documentation/vm/pin_user_pages.rst
>
[...]
> diff --git a/Documentation/vm/pin_user_pages.rst b/Documentation/vm/pin_user_pages.rst
> new file mode 100644
> index 000000000000..3910f49ca98c
> --- /dev/null
> +++ b/Documentation/vm/pin_user_pages.rst
[...]
> +
> +FOLL_PIN, FOLL_GET, FOLL_LONGTERM: when to use which flags
> +==========================================================
> +
> +Thanks to Jan Kara, Vlastimil Babka and several other -mm people, for describing
> +these categories:
> +
> +CASE 1: Direct IO (DIO)
> +-----------------------
> +There are GUP references to pages that are serving
> +as DIO buffers. These buffers are needed for a relatively short time (so they
> +are not "long term"). No special synchronization with page_mkclean() or
> +munmap() is provided. Therefore, flags to set at the call site are: ::
> +
> + FOLL_PIN
> +
> +...but rather than setting FOLL_PIN directly, call sites should use one of
> +the pin_user_pages*() routines that set FOLL_PIN.
> +
> +CASE 2: RDMA
> +------------
> +There are GUP references to pages that are serving as DMA
> +buffers. These buffers are needed for a long time ("long term"). No special
> +synchronization with page_mkclean() or munmap() is provided. Therefore, flags
> +to set at the call site are: ::
> +
> + FOLL_PIN | FOLL_LONGTERM
> +
> +NOTE: Some pages, such as DAX pages, cannot be pinned with longterm pins. That's
> +because DAX pages do not have a separate page cache, and so "pinning" implies
> +locking down file system blocks, which is not (yet) supported in that way.
> +
> +CASE 3: ODP
> +-----------
> +(Mellanox/Infiniband On Demand Paging: the hardware supports
> +replayable page faulting). There are GUP references to pages serving as DMA
> +buffers. For ODP, MMU notifiers are used to synchronize with page_mkclean()
> +and munmap(). Therefore, normal GUP calls are sufficient, so neither flag
> +needs to be set.
I would not include ODP or anything like it here, they do not use
GUP anymore and i believe it is more confusing here. I would how-
ever include some text in this documentation explaining that hard-
ware that support page fault is superior as it does not incur any
of the issues described here.
> +
> +CASE 4: Pinning for struct page manipulation only
> +-------------------------------------------------
> +Here, normal GUP calls are sufficient, so neither flag needs to be set.
> +
[...]
> diff --git a/mm/gup.c b/mm/gup.c
> index 199da99e8ffc..1aea48427879 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
[...]
> @@ -1014,7 +1018,16 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
> BUG_ON(*locked != 1);
> }
>
> - if (pages)
> + /*
> + * FOLL_PIN and FOLL_GET are mutually exclusive. Traditional behavior
> + * is to set FOLL_GET if the caller wants pages[] filled in (but has
> + * carelessly failed to specify FOLL_GET), so keep doing that, but only
> + * for FOLL_GET, not for the newer FOLL_PIN.
> + *
> + * FOLL_PIN always expects pages to be non-null, but no need to assert
> + * that here, as any failures will be obvious enough.
> + */
> + if (pages && !(flags & FOLL_PIN))
> flags |= FOLL_GET;
Did you look at user that have pages and not FOLL_GET set ?
I believe it would be better to first fix them to end up
with FOLL_GET set and then error out if pages is != NULL but
nor FOLL_GET or FOLL_PIN is set.
>
> pages_done = 0;
> @@ -2373,24 +2402,9 @@ static int __gup_longterm_unlocked(unsigned long start, int nr_pages,
> return ret;
> }
>
> -/**
> - * get_user_pages_fast() - pin user pages in memory
> - * @start: starting user address
> - * @nr_pages: number of pages from start to pin
> - * @gup_flags: flags modifying pin behaviour
> - * @pages: array that receives pointers to the pages pinned.
> - * Should be at least nr_pages long.
> - *
> - * Attempt to pin user pages in memory without taking mm->mmap_sem.
> - * If not successful, it will fall back to taking the lock and
> - * calling get_user_pages().
> - *
> - * Returns number of pages pinned. This may be fewer than the number
> - * requested. If nr_pages is 0 or negative, returns 0. If no pages
> - * were pinned, returns -errno.
> - */
> -int get_user_pages_fast(unsigned long start, int nr_pages,
> - unsigned int gup_flags, struct page **pages)
> +static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
> + unsigned int gup_flags,
> + struct page **pages)
Usualy function are rename to _old_func_name ie add _ in front. So
here it would become _get_user_pages_fast but i know some people
don't like that as sometimes we endup with ___function_overloaded :)
> {
> unsigned long addr, len, end;
> int nr = 0, ret = 0;
> @@ -2435,4 +2449,215 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
[...]
> +/**
> + * pin_user_pages_remote() - pin pages for (typically) use by Direct IO, and
> + * return the pages to the user.
Not a fan of (typically) maybe:
pin_user_pages_remote() - pin pages of a remote process (task != current)
I think here the remote part if more important that DIO. Remote is use by
other thing that DIO.
> + *
> + * Nearly the same as get_user_pages_remote(), except that FOLL_PIN is set. See
> + * get_user_pages_remote() for documentation on the function arguments, because
> + * the arguments here are identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/vm/pin_user_pages.rst for details.
> + *
> + * This is intended for Case 1 (DIO) in Documentation/vm/pin_user_pages.rst. It
> + * is NOT intended for Case 2 (RDMA: long-term pins).
> + */
> +long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked)
> +{
> + /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> + if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> + return -EINVAL;
> +
> + gup_flags |= FOLL_TOUCH | FOLL_REMOTE | FOLL_PIN;
> +
> + return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
> + locked, gup_flags);
> +}
> +EXPORT_SYMBOL(pin_user_pages_remote);
> +
> +/**
> + * pin_longterm_pages_remote() - pin pages for (typically) use by Direct IO, and
> + * return the pages to the user.
I think you copy pasted this from pin_user_pages_remote() :)
> + *
> + * Nearly the same as get_user_pages_remote(), but note that FOLL_TOUCH is not
> + * set, and FOLL_PIN and FOLL_LONGTERM are set. See get_user_pages_remote() for
> + * documentation on the function arguments, because the arguments here are
> + * identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/vm/pin_user_pages.rst for further details.
> + *
> + * FOLL_LONGTERM means that the pages are being pinned for "long term" use,
> + * typically by a non-CPU device, and we cannot be sure that waiting for a
> + * pinned page to become unpin will be effective.
> + *
> + * This is intended for Case 2 (RDMA: long-term pins) in
> + * Documentation/vm/pin_user_pages.rst.
> + */
> +long pin_longterm_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked)
> +{
> + /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> + if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> + return -EINVAL;
> +
> + /*
> + * FIXME: as noted in the get_user_pages_remote() implementation, it
> + * is not yet possible to safely set FOLL_LONGTERM here. FOLL_LONGTERM
> + * needs to be set, but for now the best we can do is a "TODO" item.
> + */
> + gup_flags |= FOLL_REMOTE | FOLL_PIN;
Wouldn't it be better to not add pin_longterm_pages_remote() until
it can be properly implemented ?
WARNING: multiple messages have this Message-ID (diff)
From: Jerome Glisse <jglisse@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: "Michal Hocko" <mhocko@suse.com>, "Jan Kara" <jack@suse.cz>,
kvm@vger.kernel.org, linux-doc@vger.kernel.org,
"David Airlie" <airlied@linux.ie>,
"Dave Chinner" <david@fromorbit.com>,
dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
"Paul Mackerras" <paulus@samba.org>,
linux-kselftest@vger.kernel.org,
"Ira Weiny" <ira.weiny@intel.com>,
"Jonathan Corbet" <corbet@lwn.net>,
linux-rdma@vger.kernel.org,
"Christoph Hellwig" <hch@infradead.org>,
"Jason Gunthorpe" <jgg@ziepe.ca>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Björn Töpel" <bjorn.topel@intel.com>,
linux-media@vger.kernel.org, "Shuah Khan" <shuah@kernel.org>,
linux-block@vger.kernel.org,
"Alex Williamson" <alex.williamson@redhat.com>,
"Al Viro" <viro@zeniv.linux.org.uk>,
"Dan Williams" <dan.j.williams@intel.com>,
"Mauro Carvalho Chehab" <mchehab@kernel.org>,
bpf@vger.kernel.org,
"Magnus Karlsson" <magnus.karlsson@intel.com>,
"Jens Axboe" <axboe@kernel.dk>,
netdev@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
"Daniel Vetter" <daniel@ffwll.ch>,
linux-fsdevel@vger.kernel.org,
"Andrew Morton" <akpm@linux-foundation.org>,
linuxppc-dev@lists.ozlabs.org,
"David S . Miller" <davem@davemloft.net>,
"Mike Kravetz" <mike.kravetz@oracle.com>
Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN
Date: Mon, 4 Nov 2019 12:33:25 -0500 [thread overview]
Message-ID: <20191104173325.GD5134@redhat.com> (raw)
In-Reply-To: <20191103211813.213227-6-jhubbard@nvidia.com>
On Sun, Nov 03, 2019 at 01:18:00PM -0800, John Hubbard wrote:
> Introduce pin_user_pages*() variations of get_user_pages*() calls,
> and also pin_longterm_pages*() variations.
>
> These variants all set FOLL_PIN, which is also introduced, and
> thoroughly documented.
>
> The pin_longterm*() variants also set FOLL_LONGTERM, in addition
> to FOLL_PIN:
>
> pin_user_pages()
> pin_user_pages_remote()
> pin_user_pages_fast()
>
> pin_longterm_pages()
> pin_longterm_pages_remote()
> pin_longterm_pages_fast()
>
> All pages that are pinned via the above calls, must be unpinned via
> put_user_page().
>
> The underlying rules are:
>
> * These are gup-internal flags, so the call sites should not directly
> set FOLL_PIN nor FOLL_LONGTERM. That behavior is enforced with
> assertions, for the new FOLL_PIN flag. However, for the pre-existing
> FOLL_LONGTERM flag, which has some call sites that still directly
> set FOLL_LONGTERM, there is no assertion yet.
>
> * Call sites that want to indicate that they are going to do DirectIO
> ("DIO") or something with similar characteristics, should call a
> get_user_pages()-like wrapper call that sets FOLL_PIN. These wrappers
> will:
> * Start with "pin_user_pages" instead of "get_user_pages". That
> makes it easy to find and audit the call sites.
> * Set FOLL_PIN
>
> * For pages that are received via FOLL_PIN, those pages must be returned
> via put_user_page().
>
> Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases
> in this documentation. (I've reworded it and expanded on it slightly.)
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Few nitpick belows, nonetheless:
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> Documentation/vm/index.rst | 1 +
> Documentation/vm/pin_user_pages.rst | 212 ++++++++++++++++++++++
> include/linux/mm.h | 62 ++++++-
> mm/gup.c | 265 +++++++++++++++++++++++++---
> 4 files changed, 514 insertions(+), 26 deletions(-)
> create mode 100644 Documentation/vm/pin_user_pages.rst
>
[...]
> diff --git a/Documentation/vm/pin_user_pages.rst b/Documentation/vm/pin_user_pages.rst
> new file mode 100644
> index 000000000000..3910f49ca98c
> --- /dev/null
> +++ b/Documentation/vm/pin_user_pages.rst
[...]
> +
> +FOLL_PIN, FOLL_GET, FOLL_LONGTERM: when to use which flags
> +==========================================================
> +
> +Thanks to Jan Kara, Vlastimil Babka and several other -mm people, for describing
> +these categories:
> +
> +CASE 1: Direct IO (DIO)
> +-----------------------
> +There are GUP references to pages that are serving
> +as DIO buffers. These buffers are needed for a relatively short time (so they
> +are not "long term"). No special synchronization with page_mkclean() or
> +munmap() is provided. Therefore, flags to set at the call site are: ::
> +
> + FOLL_PIN
> +
> +...but rather than setting FOLL_PIN directly, call sites should use one of
> +the pin_user_pages*() routines that set FOLL_PIN.
> +
> +CASE 2: RDMA
> +------------
> +There are GUP references to pages that are serving as DMA
> +buffers. These buffers are needed for a long time ("long term"). No special
> +synchronization with page_mkclean() or munmap() is provided. Therefore, flags
> +to set at the call site are: ::
> +
> + FOLL_PIN | FOLL_LONGTERM
> +
> +NOTE: Some pages, such as DAX pages, cannot be pinned with longterm pins. That's
> +because DAX pages do not have a separate page cache, and so "pinning" implies
> +locking down file system blocks, which is not (yet) supported in that way.
> +
> +CASE 3: ODP
> +-----------
> +(Mellanox/Infiniband On Demand Paging: the hardware supports
> +replayable page faulting). There are GUP references to pages serving as DMA
> +buffers. For ODP, MMU notifiers are used to synchronize with page_mkclean()
> +and munmap(). Therefore, normal GUP calls are sufficient, so neither flag
> +needs to be set.
I would not include ODP or anything like it here, they do not use
GUP anymore and i believe it is more confusing here. I would how-
ever include some text in this documentation explaining that hard-
ware that support page fault is superior as it does not incur any
of the issues described here.
> +
> +CASE 4: Pinning for struct page manipulation only
> +-------------------------------------------------
> +Here, normal GUP calls are sufficient, so neither flag needs to be set.
> +
[...]
> diff --git a/mm/gup.c b/mm/gup.c
> index 199da99e8ffc..1aea48427879 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
[...]
> @@ -1014,7 +1018,16 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
> BUG_ON(*locked != 1);
> }
>
> - if (pages)
> + /*
> + * FOLL_PIN and FOLL_GET are mutually exclusive. Traditional behavior
> + * is to set FOLL_GET if the caller wants pages[] filled in (but has
> + * carelessly failed to specify FOLL_GET), so keep doing that, but only
> + * for FOLL_GET, not for the newer FOLL_PIN.
> + *
> + * FOLL_PIN always expects pages to be non-null, but no need to assert
> + * that here, as any failures will be obvious enough.
> + */
> + if (pages && !(flags & FOLL_PIN))
> flags |= FOLL_GET;
Did you look at user that have pages and not FOLL_GET set ?
I believe it would be better to first fix them to end up
with FOLL_GET set and then error out if pages is != NULL but
nor FOLL_GET or FOLL_PIN is set.
>
> pages_done = 0;
> @@ -2373,24 +2402,9 @@ static int __gup_longterm_unlocked(unsigned long start, int nr_pages,
> return ret;
> }
>
> -/**
> - * get_user_pages_fast() - pin user pages in memory
> - * @start: starting user address
> - * @nr_pages: number of pages from start to pin
> - * @gup_flags: flags modifying pin behaviour
> - * @pages: array that receives pointers to the pages pinned.
> - * Should be at least nr_pages long.
> - *
> - * Attempt to pin user pages in memory without taking mm->mmap_sem.
> - * If not successful, it will fall back to taking the lock and
> - * calling get_user_pages().
> - *
> - * Returns number of pages pinned. This may be fewer than the number
> - * requested. If nr_pages is 0 or negative, returns 0. If no pages
> - * were pinned, returns -errno.
> - */
> -int get_user_pages_fast(unsigned long start, int nr_pages,
> - unsigned int gup_flags, struct page **pages)
> +static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
> + unsigned int gup_flags,
> + struct page **pages)
Usualy function are rename to _old_func_name ie add _ in front. So
here it would become _get_user_pages_fast but i know some people
don't like that as sometimes we endup with ___function_overloaded :)
> {
> unsigned long addr, len, end;
> int nr = 0, ret = 0;
> @@ -2435,4 +2449,215 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
[...]
> +/**
> + * pin_user_pages_remote() - pin pages for (typically) use by Direct IO, and
> + * return the pages to the user.
Not a fan of (typically) maybe:
pin_user_pages_remote() - pin pages of a remote process (task != current)
I think here the remote part if more important that DIO. Remote is use by
other thing that DIO.
> + *
> + * Nearly the same as get_user_pages_remote(), except that FOLL_PIN is set. See
> + * get_user_pages_remote() for documentation on the function arguments, because
> + * the arguments here are identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/vm/pin_user_pages.rst for details.
> + *
> + * This is intended for Case 1 (DIO) in Documentation/vm/pin_user_pages.rst. It
> + * is NOT intended for Case 2 (RDMA: long-term pins).
> + */
> +long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked)
> +{
> + /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> + if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> + return -EINVAL;
> +
> + gup_flags |= FOLL_TOUCH | FOLL_REMOTE | FOLL_PIN;
> +
> + return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
> + locked, gup_flags);
> +}
> +EXPORT_SYMBOL(pin_user_pages_remote);
> +
> +/**
> + * pin_longterm_pages_remote() - pin pages for (typically) use by Direct IO, and
> + * return the pages to the user.
I think you copy pasted this from pin_user_pages_remote() :)
> + *
> + * Nearly the same as get_user_pages_remote(), but note that FOLL_TOUCH is not
> + * set, and FOLL_PIN and FOLL_LONGTERM are set. See get_user_pages_remote() for
> + * documentation on the function arguments, because the arguments here are
> + * identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/vm/pin_user_pages.rst for further details.
> + *
> + * FOLL_LONGTERM means that the pages are being pinned for "long term" use,
> + * typically by a non-CPU device, and we cannot be sure that waiting for a
> + * pinned page to become unpin will be effective.
> + *
> + * This is intended for Case 2 (RDMA: long-term pins) in
> + * Documentation/vm/pin_user_pages.rst.
> + */
> +long pin_longterm_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked)
> +{
> + /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> + if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> + return -EINVAL;
> +
> + /*
> + * FIXME: as noted in the get_user_pages_remote() implementation, it
> + * is not yet possible to safely set FOLL_LONGTERM here. FOLL_LONGTERM
> + * needs to be set, but for now the best we can do is a "TODO" item.
> + */
> + gup_flags |= FOLL_REMOTE | FOLL_PIN;
Wouldn't it be better to not add pin_longterm_pages_remote() until
it can be properly implemented ?
WARNING: multiple messages have this Message-ID (diff)
From: Jerome Glisse <jglisse@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
"Al Viro" <viro@zeniv.linux.org.uk>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Benjamin Herrenschmidt" <benh@kernel.crashing.org>,
"Björn Töpel" <bjorn.topel@intel.com>,
"Christoph Hellwig" <hch@infradead.org>,
"Dan Williams" <dan.j.williams@intel.com>,
"Daniel Vetter" <daniel@ffwll.ch>,
"Dave Chinner" <david@fromorbit.com>,
"David Airlie" <airlied@linux.ie>,
"David S . Miller" <davem@davemloft.net>,
"Ira Weiny" <ira.weiny@intel.com>, "Jan Kara" <jack@suse.cz>,
"Jason Gunthorpe" <jgg@ziepe.ca>, "Jens Axboe" <axboe@kernel.dk>,
"Jonathan Corbet" <corbet@lwn.net>,
"Magnus Karlsson" <magnus.karlsson@intel.com>,
"Mauro Carvalho Chehab" <mchehab@kernel.org>,
"Michael Ellerman" <mpe@ellerman.id.au>
Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN
Date: Mon, 4 Nov 2019 12:33:25 -0500 [thread overview]
Message-ID: <20191104173325.GD5134@redhat.com> (raw)
In-Reply-To: <20191103211813.213227-6-jhubbard@nvidia.com>
On Sun, Nov 03, 2019 at 01:18:00PM -0800, John Hubbard wrote:
> Introduce pin_user_pages*() variations of get_user_pages*() calls,
> and also pin_longterm_pages*() variations.
>
> These variants all set FOLL_PIN, which is also introduced, and
> thoroughly documented.
>
> The pin_longterm*() variants also set FOLL_LONGTERM, in addition
> to FOLL_PIN:
>
> pin_user_pages()
> pin_user_pages_remote()
> pin_user_pages_fast()
>
> pin_longterm_pages()
> pin_longterm_pages_remote()
> pin_longterm_pages_fast()
>
> All pages that are pinned via the above calls, must be unpinned via
> put_user_page().
>
> The underlying rules are:
>
> * These are gup-internal flags, so the call sites should not directly
> set FOLL_PIN nor FOLL_LONGTERM. That behavior is enforced with
> assertions, for the new FOLL_PIN flag. However, for the pre-existing
> FOLL_LONGTERM flag, which has some call sites that still directly
> set FOLL_LONGTERM, there is no assertion yet.
>
> * Call sites that want to indicate that they are going to do DirectIO
> ("DIO") or something with similar characteristics, should call a
> get_user_pages()-like wrapper call that sets FOLL_PIN. These wrappers
> will:
> * Start with "pin_user_pages" instead of "get_user_pages". That
> makes it easy to find and audit the call sites.
> * Set FOLL_PIN
>
> * For pages that are received via FOLL_PIN, those pages must be returned
> via put_user_page().
>
> Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases
> in this documentation. (I've reworded it and expanded on it slightly.)
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Few nitpick belows, nonetheless:
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> Documentation/vm/index.rst | 1 +
> Documentation/vm/pin_user_pages.rst | 212 ++++++++++++++++++++++
> include/linux/mm.h | 62 ++++++-
> mm/gup.c | 265 +++++++++++++++++++++++++---
> 4 files changed, 514 insertions(+), 26 deletions(-)
> create mode 100644 Documentation/vm/pin_user_pages.rst
>
[...]
> diff --git a/Documentation/vm/pin_user_pages.rst b/Documentation/vm/pin_user_pages.rst
> new file mode 100644
> index 000000000000..3910f49ca98c
> --- /dev/null
> +++ b/Documentation/vm/pin_user_pages.rst
[...]
> +
> +FOLL_PIN, FOLL_GET, FOLL_LONGTERM: when to use which flags
> +==========================================================
> +
> +Thanks to Jan Kara, Vlastimil Babka and several other -mm people, for describing
> +these categories:
> +
> +CASE 1: Direct IO (DIO)
> +-----------------------
> +There are GUP references to pages that are serving
> +as DIO buffers. These buffers are needed for a relatively short time (so they
> +are not "long term"). No special synchronization with page_mkclean() or
> +munmap() is provided. Therefore, flags to set at the call site are: ::
> +
> + FOLL_PIN
> +
> +...but rather than setting FOLL_PIN directly, call sites should use one of
> +the pin_user_pages*() routines that set FOLL_PIN.
> +
> +CASE 2: RDMA
> +------------
> +There are GUP references to pages that are serving as DMA
> +buffers. These buffers are needed for a long time ("long term"). No special
> +synchronization with page_mkclean() or munmap() is provided. Therefore, flags
> +to set at the call site are: ::
> +
> + FOLL_PIN | FOLL_LONGTERM
> +
> +NOTE: Some pages, such as DAX pages, cannot be pinned with longterm pins. That's
> +because DAX pages do not have a separate page cache, and so "pinning" implies
> +locking down file system blocks, which is not (yet) supported in that way.
> +
> +CASE 3: ODP
> +-----------
> +(Mellanox/Infiniband On Demand Paging: the hardware supports
> +replayable page faulting). There are GUP references to pages serving as DMA
> +buffers. For ODP, MMU notifiers are used to synchronize with page_mkclean()
> +and munmap(). Therefore, normal GUP calls are sufficient, so neither flag
> +needs to be set.
I would not include ODP or anything like it here, they do not use
GUP anymore and i believe it is more confusing here. I would how-
ever include some text in this documentation explaining that hard-
ware that support page fault is superior as it does not incur any
of the issues described here.
> +
> +CASE 4: Pinning for struct page manipulation only
> +-------------------------------------------------
> +Here, normal GUP calls are sufficient, so neither flag needs to be set.
> +
[...]
> diff --git a/mm/gup.c b/mm/gup.c
> index 199da99e8ffc..1aea48427879 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
[...]
> @@ -1014,7 +1018,16 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
> BUG_ON(*locked != 1);
> }
>
> - if (pages)
> + /*
> + * FOLL_PIN and FOLL_GET are mutually exclusive. Traditional behavior
> + * is to set FOLL_GET if the caller wants pages[] filled in (but has
> + * carelessly failed to specify FOLL_GET), so keep doing that, but only
> + * for FOLL_GET, not for the newer FOLL_PIN.
> + *
> + * FOLL_PIN always expects pages to be non-null, but no need to assert
> + * that here, as any failures will be obvious enough.
> + */
> + if (pages && !(flags & FOLL_PIN))
> flags |= FOLL_GET;
Did you look at user that have pages and not FOLL_GET set ?
I believe it would be better to first fix them to end up
with FOLL_GET set and then error out if pages is != NULL but
nor FOLL_GET or FOLL_PIN is set.
>
> pages_done = 0;
> @@ -2373,24 +2402,9 @@ static int __gup_longterm_unlocked(unsigned long start, int nr_pages,
> return ret;
> }
>
> -/**
> - * get_user_pages_fast() - pin user pages in memory
> - * @start: starting user address
> - * @nr_pages: number of pages from start to pin
> - * @gup_flags: flags modifying pin behaviour
> - * @pages: array that receives pointers to the pages pinned.
> - * Should be at least nr_pages long.
> - *
> - * Attempt to pin user pages in memory without taking mm->mmap_sem.
> - * If not successful, it will fall back to taking the lock and
> - * calling get_user_pages().
> - *
> - * Returns number of pages pinned. This may be fewer than the number
> - * requested. If nr_pages is 0 or negative, returns 0. If no pages
> - * were pinned, returns -errno.
> - */
> -int get_user_pages_fast(unsigned long start, int nr_pages,
> - unsigned int gup_flags, struct page **pages)
> +static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
> + unsigned int gup_flags,
> + struct page **pages)
Usualy function are rename to _old_func_name ie add _ in front. So
here it would become _get_user_pages_fast but i know some people
don't like that as sometimes we endup with ___function_overloaded :)
> {
> unsigned long addr, len, end;
> int nr = 0, ret = 0;
> @@ -2435,4 +2449,215 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
[...]
> +/**
> + * pin_user_pages_remote() - pin pages for (typically) use by Direct IO, and
> + * return the pages to the user.
Not a fan of (typically) maybe:
pin_user_pages_remote() - pin pages of a remote process (task != current)
I think here the remote part if more important that DIO. Remote is use by
other thing that DIO.
> + *
> + * Nearly the same as get_user_pages_remote(), except that FOLL_PIN is set. See
> + * get_user_pages_remote() for documentation on the function arguments, because
> + * the arguments here are identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/vm/pin_user_pages.rst for details.
> + *
> + * This is intended for Case 1 (DIO) in Documentation/vm/pin_user_pages.rst. It
> + * is NOT intended for Case 2 (RDMA: long-term pins).
> + */
> +long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked)
> +{
> + /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> + if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> + return -EINVAL;
> +
> + gup_flags |= FOLL_TOUCH | FOLL_REMOTE | FOLL_PIN;
> +
> + return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
> + locked, gup_flags);
> +}
> +EXPORT_SYMBOL(pin_user_pages_remote);
> +
> +/**
> + * pin_longterm_pages_remote() - pin pages for (typically) use by Direct IO, and
> + * return the pages to the user.
I think you copy pasted this from pin_user_pages_remote() :)
> + *
> + * Nearly the same as get_user_pages_remote(), but note that FOLL_TOUCH is not
> + * set, and FOLL_PIN and FOLL_LONGTERM are set. See get_user_pages_remote() for
> + * documentation on the function arguments, because the arguments here are
> + * identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/vm/pin_user_pages.rst for further details.
> + *
> + * FOLL_LONGTERM means that the pages are being pinned for "long term" use,
> + * typically by a non-CPU device, and we cannot be sure that waiting for a
> + * pinned page to become unpin will be effective.
> + *
> + * This is intended for Case 2 (RDMA: long-term pins) in
> + * Documentation/vm/pin_user_pages.rst.
> + */
> +long pin_longterm_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked)
> +{
> + /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> + if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> + return -EINVAL;
> +
> + /*
> + * FIXME: as noted in the get_user_pages_remote() implementation, it
> + * is not yet possible to safely set FOLL_LONGTERM here. FOLL_LONGTERM
> + * needs to be set, but for now the best we can do is a "TODO" item.
> + */
> + gup_flags |= FOLL_REMOTE | FOLL_PIN;
Wouldn't it be better to not add pin_longterm_pages_remote() until
it can be properly implemented ?
WARNING: multiple messages have this Message-ID (diff)
From: Jerome Glisse <jglisse@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: "Michal Hocko" <mhocko@suse.com>, "Jan Kara" <jack@suse.cz>,
kvm@vger.kernel.org, linux-doc@vger.kernel.org,
"David Airlie" <airlied@linux.ie>,
"Dave Chinner" <david@fromorbit.com>,
dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
"Paul Mackerras" <paulus@samba.org>,
linux-kselftest@vger.kernel.org,
"Ira Weiny" <ira.weiny@intel.com>,
"Jonathan Corbet" <corbet@lwn.net>,
linux-rdma@vger.kernel.org,
"Michael Ellerman" <mpe@ellerman.id.au>,
"Christoph Hellwig" <hch@infradead.org>,
"Jason Gunthorpe" <jgg@ziepe.ca>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Björn Töpel" <bjorn.topel@intel.com>,
linux-media@vger.kernel.org, "Shuah Khan" <shuah@kernel.org>,
linux-block@vger.kernel.org,
"Alex Williamson" <alex.williamson@redhat.com>,
"Al Viro" <viro@zeniv.linux.org.uk>,
"Dan Williams" <dan.j.williams@intel.com>,
"Mauro Carvalho Chehab" <mchehab@kernel.org>,
bpf@vger.kernel.org,
"Magnus Karlsson" <magnus.karlsson@intel.com>,
"Jens Axboe" <axboe@kernel.dk>,
netdev@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
linux-fsdevel@vger.kernel.org,
"Andrew Morton" <akpm@linux-foundation.org>,
linuxppc-dev@lists.ozlabs.org,
"David S . Miller" <davem@davemloft.net>,
"Mike Kravetz" <mike.kravetz@oracle.com>
Subject: Re: [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN
Date: Mon, 4 Nov 2019 12:33:25 -0500 [thread overview]
Message-ID: <20191104173325.GD5134@redhat.com> (raw)
Message-ID: <20191104173325.7mcboHumvJyqzPE2iRYCtywWE93XUnyV9bfPJ6pCpRY@z> (raw)
In-Reply-To: <20191103211813.213227-6-jhubbard@nvidia.com>
On Sun, Nov 03, 2019 at 01:18:00PM -0800, John Hubbard wrote:
> Introduce pin_user_pages*() variations of get_user_pages*() calls,
> and also pin_longterm_pages*() variations.
>
> These variants all set FOLL_PIN, which is also introduced, and
> thoroughly documented.
>
> The pin_longterm*() variants also set FOLL_LONGTERM, in addition
> to FOLL_PIN:
>
> pin_user_pages()
> pin_user_pages_remote()
> pin_user_pages_fast()
>
> pin_longterm_pages()
> pin_longterm_pages_remote()
> pin_longterm_pages_fast()
>
> All pages that are pinned via the above calls, must be unpinned via
> put_user_page().
>
> The underlying rules are:
>
> * These are gup-internal flags, so the call sites should not directly
> set FOLL_PIN nor FOLL_LONGTERM. That behavior is enforced with
> assertions, for the new FOLL_PIN flag. However, for the pre-existing
> FOLL_LONGTERM flag, which has some call sites that still directly
> set FOLL_LONGTERM, there is no assertion yet.
>
> * Call sites that want to indicate that they are going to do DirectIO
> ("DIO") or something with similar characteristics, should call a
> get_user_pages()-like wrapper call that sets FOLL_PIN. These wrappers
> will:
> * Start with "pin_user_pages" instead of "get_user_pages". That
> makes it easy to find and audit the call sites.
> * Set FOLL_PIN
>
> * For pages that are received via FOLL_PIN, those pages must be returned
> via put_user_page().
>
> Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases
> in this documentation. (I've reworded it and expanded on it slightly.)
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Few nitpick belows, nonetheless:
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> Documentation/vm/index.rst | 1 +
> Documentation/vm/pin_user_pages.rst | 212 ++++++++++++++++++++++
> include/linux/mm.h | 62 ++++++-
> mm/gup.c | 265 +++++++++++++++++++++++++---
> 4 files changed, 514 insertions(+), 26 deletions(-)
> create mode 100644 Documentation/vm/pin_user_pages.rst
>
[...]
> diff --git a/Documentation/vm/pin_user_pages.rst b/Documentation/vm/pin_user_pages.rst
> new file mode 100644
> index 000000000000..3910f49ca98c
> --- /dev/null
> +++ b/Documentation/vm/pin_user_pages.rst
[...]
> +
> +FOLL_PIN, FOLL_GET, FOLL_LONGTERM: when to use which flags
> +==========================================================
> +
> +Thanks to Jan Kara, Vlastimil Babka and several other -mm people, for describing
> +these categories:
> +
> +CASE 1: Direct IO (DIO)
> +-----------------------
> +There are GUP references to pages that are serving
> +as DIO buffers. These buffers are needed for a relatively short time (so they
> +are not "long term"). No special synchronization with page_mkclean() or
> +munmap() is provided. Therefore, flags to set at the call site are: ::
> +
> + FOLL_PIN
> +
> +...but rather than setting FOLL_PIN directly, call sites should use one of
> +the pin_user_pages*() routines that set FOLL_PIN.
> +
> +CASE 2: RDMA
> +------------
> +There are GUP references to pages that are serving as DMA
> +buffers. These buffers are needed for a long time ("long term"). No special
> +synchronization with page_mkclean() or munmap() is provided. Therefore, flags
> +to set at the call site are: ::
> +
> + FOLL_PIN | FOLL_LONGTERM
> +
> +NOTE: Some pages, such as DAX pages, cannot be pinned with longterm pins. That's
> +because DAX pages do not have a separate page cache, and so "pinning" implies
> +locking down file system blocks, which is not (yet) supported in that way.
> +
> +CASE 3: ODP
> +-----------
> +(Mellanox/Infiniband On Demand Paging: the hardware supports
> +replayable page faulting). There are GUP references to pages serving as DMA
> +buffers. For ODP, MMU notifiers are used to synchronize with page_mkclean()
> +and munmap(). Therefore, normal GUP calls are sufficient, so neither flag
> +needs to be set.
I would not include ODP or anything like it here, they do not use
GUP anymore and i believe it is more confusing here. I would how-
ever include some text in this documentation explaining that hard-
ware that support page fault is superior as it does not incur any
of the issues described here.
> +
> +CASE 4: Pinning for struct page manipulation only
> +-------------------------------------------------
> +Here, normal GUP calls are sufficient, so neither flag needs to be set.
> +
[...]
> diff --git a/mm/gup.c b/mm/gup.c
> index 199da99e8ffc..1aea48427879 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
[...]
> @@ -1014,7 +1018,16 @@ static __always_inline long __get_user_pages_locked(struct task_struct *tsk,
> BUG_ON(*locked != 1);
> }
>
> - if (pages)
> + /*
> + * FOLL_PIN and FOLL_GET are mutually exclusive. Traditional behavior
> + * is to set FOLL_GET if the caller wants pages[] filled in (but has
> + * carelessly failed to specify FOLL_GET), so keep doing that, but only
> + * for FOLL_GET, not for the newer FOLL_PIN.
> + *
> + * FOLL_PIN always expects pages to be non-null, but no need to assert
> + * that here, as any failures will be obvious enough.
> + */
> + if (pages && !(flags & FOLL_PIN))
> flags |= FOLL_GET;
Did you look at user that have pages and not FOLL_GET set ?
I believe it would be better to first fix them to end up
with FOLL_GET set and then error out if pages is != NULL but
nor FOLL_GET or FOLL_PIN is set.
>
> pages_done = 0;
> @@ -2373,24 +2402,9 @@ static int __gup_longterm_unlocked(unsigned long start, int nr_pages,
> return ret;
> }
>
> -/**
> - * get_user_pages_fast() - pin user pages in memory
> - * @start: starting user address
> - * @nr_pages: number of pages from start to pin
> - * @gup_flags: flags modifying pin behaviour
> - * @pages: array that receives pointers to the pages pinned.
> - * Should be at least nr_pages long.
> - *
> - * Attempt to pin user pages in memory without taking mm->mmap_sem.
> - * If not successful, it will fall back to taking the lock and
> - * calling get_user_pages().
> - *
> - * Returns number of pages pinned. This may be fewer than the number
> - * requested. If nr_pages is 0 or negative, returns 0. If no pages
> - * were pinned, returns -errno.
> - */
> -int get_user_pages_fast(unsigned long start, int nr_pages,
> - unsigned int gup_flags, struct page **pages)
> +static int internal_get_user_pages_fast(unsigned long start, int nr_pages,
> + unsigned int gup_flags,
> + struct page **pages)
Usualy function are rename to _old_func_name ie add _ in front. So
here it would become _get_user_pages_fast but i know some people
don't like that as sometimes we endup with ___function_overloaded :)
> {
> unsigned long addr, len, end;
> int nr = 0, ret = 0;
> @@ -2435,4 +2449,215 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
[...]
> +/**
> + * pin_user_pages_remote() - pin pages for (typically) use by Direct IO, and
> + * return the pages to the user.
Not a fan of (typically) maybe:
pin_user_pages_remote() - pin pages of a remote process (task != current)
I think here the remote part if more important that DIO. Remote is use by
other thing that DIO.
> + *
> + * Nearly the same as get_user_pages_remote(), except that FOLL_PIN is set. See
> + * get_user_pages_remote() for documentation on the function arguments, because
> + * the arguments here are identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/vm/pin_user_pages.rst for details.
> + *
> + * This is intended for Case 1 (DIO) in Documentation/vm/pin_user_pages.rst. It
> + * is NOT intended for Case 2 (RDMA: long-term pins).
> + */
> +long pin_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked)
> +{
> + /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> + if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> + return -EINVAL;
> +
> + gup_flags |= FOLL_TOUCH | FOLL_REMOTE | FOLL_PIN;
> +
> + return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
> + locked, gup_flags);
> +}
> +EXPORT_SYMBOL(pin_user_pages_remote);
> +
> +/**
> + * pin_longterm_pages_remote() - pin pages for (typically) use by Direct IO, and
> + * return the pages to the user.
I think you copy pasted this from pin_user_pages_remote() :)
> + *
> + * Nearly the same as get_user_pages_remote(), but note that FOLL_TOUCH is not
> + * set, and FOLL_PIN and FOLL_LONGTERM are set. See get_user_pages_remote() for
> + * documentation on the function arguments, because the arguments here are
> + * identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/vm/pin_user_pages.rst for further details.
> + *
> + * FOLL_LONGTERM means that the pages are being pinned for "long term" use,
> + * typically by a non-CPU device, and we cannot be sure that waiting for a
> + * pinned page to become unpin will be effective.
> + *
> + * This is intended for Case 2 (RDMA: long-term pins) in
> + * Documentation/vm/pin_user_pages.rst.
> + */
> +long pin_longterm_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
> + unsigned long start, unsigned long nr_pages,
> + unsigned int gup_flags, struct page **pages,
> + struct vm_area_struct **vmas, int *locked)
> +{
> + /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> + if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> + return -EINVAL;
> +
> + /*
> + * FIXME: as noted in the get_user_pages_remote() implementation, it
> + * is not yet possible to safely set FOLL_LONGTERM here. FOLL_LONGTERM
> + * needs to be set, but for now the best we can do is a "TODO" item.
> + */
> + gup_flags |= FOLL_REMOTE | FOLL_PIN;
Wouldn't it be better to not add pin_longterm_pages_remote() until
it can be properly implemented ?
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
next prev parent reply other threads:[~2019-11-04 17:33 UTC|newest]
Thread overview: 197+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-03 21:17 [PATCH v2 00/18] mm/gup: track dma-pinned pages: FOLL_PIN, FOLL_LONGTERM John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-03 21:17 ` [PATCH v2 01/18] mm/gup: pass flags arg to __gup_device_* functions John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-04 16:39 ` Jerome Glisse
2019-11-04 16:39 ` Jerome Glisse
2019-11-04 16:39 ` Jerome Glisse
2019-11-04 16:39 ` Jerome Glisse
2019-11-03 21:17 ` [PATCH v2 02/18] mm/gup: factor out duplicate code from four routines John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-04 16:51 ` Jerome Glisse
2019-11-04 16:51 ` Jerome Glisse
2019-11-04 16:51 ` Jerome Glisse
2019-11-04 16:51 ` Jerome Glisse
2019-11-03 21:17 ` [PATCH v2 03/18] goldish_pipe: rename local pin_user_pages() routine John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-04 16:52 ` Jerome Glisse
2019-11-04 16:52 ` Jerome Glisse
2019-11-04 16:52 ` Jerome Glisse
2019-11-04 16:52 ` Jerome Glisse
2019-11-03 21:17 ` [PATCH v2 04/18] media/v4l2-core: set pages dirty upon releasing DMA buffers John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-03 21:17 ` John Hubbard
2019-11-10 10:10 ` Hans Verkuil
2019-11-10 10:10 ` Hans Verkuil
2019-11-10 10:10 ` Hans Verkuil
2019-11-10 10:10 ` Hans Verkuil
2019-11-11 21:46 ` John Hubbard
2019-11-11 21:46 ` John Hubbard
2019-11-11 21:46 ` John Hubbard
2019-11-11 21:46 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 05/18] mm/gup: introduce pin_user_pages*() and FOLL_PIN John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-04 17:33 ` Jerome Glisse [this message]
2019-11-04 17:33 ` Jerome Glisse
2019-11-04 17:33 ` Jerome Glisse
2019-11-04 17:33 ` Jerome Glisse
2019-11-04 19:04 ` John Hubbard
2019-11-04 19:04 ` John Hubbard
2019-11-04 19:04 ` John Hubbard
2019-11-04 19:04 ` John Hubbard
2019-11-04 19:18 ` Jerome Glisse
2019-11-04 19:18 ` Jerome Glisse
2019-11-04 19:18 ` Jerome Glisse
2019-11-04 19:18 ` Jerome Glisse
2019-11-04 19:30 ` John Hubbard
2019-11-04 19:30 ` John Hubbard
2019-11-04 19:30 ` John Hubbard
2019-11-04 19:30 ` John Hubbard
2019-11-04 19:52 ` Jerome Glisse
2019-11-04 19:52 ` Jerome Glisse
2019-11-04 19:52 ` Jerome Glisse
2019-11-04 19:52 ` Jerome Glisse
2019-11-04 20:09 ` John Hubbard
2019-11-04 20:09 ` John Hubbard
2019-11-04 20:09 ` John Hubbard
2019-11-04 20:31 ` Jason Gunthorpe
2019-11-04 20:31 ` Jason Gunthorpe
2019-11-04 20:31 ` Jason Gunthorpe
2019-11-04 20:40 ` John Hubbard
2019-11-04 20:40 ` John Hubbard
2019-11-04 20:40 ` John Hubbard
2019-11-04 20:31 ` Jerome Glisse
2019-11-04 20:31 ` Jerome Glisse
2019-11-04 20:31 ` Jerome Glisse
2019-11-04 20:31 ` Jerome Glisse
2019-11-04 20:37 ` Jason Gunthorpe
2019-11-04 20:37 ` Jason Gunthorpe
2019-11-04 20:37 ` Jason Gunthorpe
2019-11-04 20:57 ` John Hubbard
2019-11-04 20:57 ` John Hubbard
2019-11-04 20:57 ` John Hubbard
2019-11-04 21:15 ` Jason Gunthorpe
2019-11-04 21:15 ` Jason Gunthorpe
2019-11-04 21:15 ` Jason Gunthorpe
2019-11-04 21:34 ` John Hubbard
2019-11-04 21:34 ` John Hubbard
2019-11-04 21:34 ` John Hubbard
2019-11-04 20:33 ` David Rientjes
2019-11-04 20:33 ` David Rientjes
2019-11-04 20:33 ` David Rientjes
2019-11-04 20:33 ` David Rientjes
2019-11-04 20:48 ` Jerome Glisse
2019-11-04 20:48 ` Jerome Glisse
2019-11-04 20:48 ` Jerome Glisse
2019-11-04 20:48 ` Jerome Glisse
2019-11-05 13:10 ` Mike Rapoport
2019-11-05 13:10 ` Mike Rapoport
2019-11-05 13:10 ` Mike Rapoport
2019-11-05 13:10 ` Mike Rapoport
2019-11-05 19:00 ` John Hubbard
2019-11-05 19:00 ` John Hubbard
2019-11-05 19:00 ` John Hubbard
2019-11-05 19:00 ` John Hubbard
2019-11-07 2:25 ` Ira Weiny
2019-11-07 2:25 ` Ira Weiny
2019-11-07 2:25 ` Ira Weiny
2019-11-07 2:25 ` Ira Weiny
2019-11-07 8:07 ` Mike Rapoport
2019-11-07 8:07 ` Mike Rapoport
2019-11-07 8:07 ` Mike Rapoport
2019-11-07 8:07 ` Mike Rapoport
2019-11-03 21:18 ` [PATCH v2 06/18] goldish_pipe: convert to pin_user_pages() and put_user_page() John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 07/18] infiniband: set FOLL_PIN, FOLL_LONGTERM via pin_longterm_pages*() John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-04 20:33 ` Jason Gunthorpe
2019-11-04 20:33 ` Jason Gunthorpe
2019-11-04 20:33 ` Jason Gunthorpe
2019-11-04 20:48 ` John Hubbard
2019-11-04 20:48 ` John Hubbard
2019-11-04 20:48 ` John Hubbard
2019-11-04 20:57 ` Jason Gunthorpe
2019-11-04 20:57 ` Jason Gunthorpe
2019-11-04 20:57 ` Jason Gunthorpe
2019-11-04 22:03 ` John Hubbard
2019-11-04 22:03 ` John Hubbard
2019-11-04 22:03 ` John Hubbard
2019-11-05 2:32 ` Jason Gunthorpe
2019-11-05 2:32 ` Jason Gunthorpe
2019-11-05 2:32 ` Jason Gunthorpe
2019-11-07 2:26 ` Ira Weiny
2019-11-07 2:26 ` Ira Weiny
2019-11-07 2:26 ` Ira Weiny
2019-11-07 2:26 ` Ira Weiny
2019-11-03 21:18 ` [PATCH v2 08/18] mm/process_vm_access: set FOLL_PIN via pin_user_pages_remote() John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-04 17:41 ` Jerome Glisse
2019-11-04 17:41 ` Jerome Glisse
2019-11-04 17:41 ` Jerome Glisse
2019-11-04 17:41 ` Jerome Glisse
2019-11-03 21:18 ` [PATCH v2 09/18] drm/via: set FOLL_PIN via pin_user_pages_fast() John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-04 17:44 ` Jerome Glisse
2019-11-04 17:44 ` Jerome Glisse
2019-11-04 17:44 ` Jerome Glisse
2019-11-04 17:44 ` Jerome Glisse
2019-11-04 18:22 ` John Hubbard
2019-11-04 18:22 ` John Hubbard
2019-11-04 18:22 ` John Hubbard
2019-11-04 18:22 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 10/18] fs/io_uring: set FOLL_PIN via pin_user_pages() John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 11/18] net/xdp: " John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 12/18] mm/gup: track FOLL_PIN pages John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-04 18:52 ` Jerome Glisse
2019-11-04 18:52 ` Jerome Glisse
2019-11-04 18:52 ` Jerome Glisse
2019-11-04 18:52 ` Jerome Glisse
2019-11-04 22:49 ` John Hubbard
2019-11-04 22:49 ` John Hubbard
2019-11-04 22:49 ` John Hubbard
2019-11-04 23:49 ` Jerome Glisse
2019-11-04 23:49 ` Jerome Glisse
2019-11-04 23:49 ` Jerome Glisse
2019-11-04 23:49 ` Jerome Glisse
2019-11-05 0:18 ` John Hubbard
2019-11-05 0:18 ` John Hubbard
2019-11-05 0:18 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 13/18] media/v4l2-core: pin_longterm_pages (FOLL_PIN) and put_user_page() conversion John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-10 10:11 ` Hans Verkuil
2019-11-10 10:11 ` Hans Verkuil
2019-11-10 10:11 ` Hans Verkuil
2019-11-10 10:11 ` Hans Verkuil
2019-11-03 21:18 ` [PATCH v2 14/18] vfio, mm: " John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 15/18] powerpc: book3s64: convert to pin_longterm_pages() and put_user_page() John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 16/18] mm/gup_benchmark: support pin_user_pages() and related calls John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 17/18] selftests/vm: run_vmtests: invoke gup_benchmark with basic FOLL_PIN coverage John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` [PATCH v2 18/18] mm/gup: remove support for gup(FOLL_LONGTERM) John Hubbard
2019-11-03 21:18 ` John Hubbard
2019-11-03 21:18 ` John Hubbard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191104173325.GD5134@redhat.com \
--to=jglisse@redhat.com \
--cc=airlied@linux.ie \
--cc=akpm@linux-foundation.org \
--cc=alex.williamson@redhat.com \
--cc=axboe@kernel.dk \
--cc=benh@kernel.crashing.org \
--cc=bjorn.topel@intel.com \
--cc=bpf@vger.kernel.org \
--cc=corbet@lwn.net \
--cc=dan.j.williams@intel.com \
--cc=daniel@ffwll.ch \
--cc=davem@davemloft.net \
--cc=david@fromorbit.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=hch@infradead.org \
--cc=ira.weiny@intel.com \
--cc=jack@suse.cz \
--cc=jgg@ziepe.ca \
--cc=jhubbard@nvidia.com \
--cc=kvm@vger.kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-media@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-rdma@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=magnus.karlsson@intel.com \
--cc=mchehab@kernel.org \
--cc=mhocko@suse.com \
--cc=mike.kravetz@oracle.com \
--cc=mpe@ellerman.id.au \
--cc=netdev@vger.kernel.org \
--cc=paulus@samba.org \
--cc=shuah@kernel.org \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.