* [PATCH v2 1/3] mm/gup_benchmark: handle gup failures
[not found] <1522962072-182137-1-git-send-email-mst@redhat.com>
@ 2018-04-05 21:03 ` Michael S. Tsirkin
2018-04-07 20:08 ` Linus Torvalds
2018-04-05 21:03 ` [PATCH v2 2/3] gup: return -EFAULT on access_ok failure Michael S. Tsirkin
2018-04-05 21:03 ` [PATCH v2 3/3] mm/gup: document return value Michael S. Tsirkin
2 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2018-04-05 21:03 UTC (permalink / raw)
To: linux-kernel
Cc: Kirill A . Shutemov, Andrew Morton, Huang Ying, Jonathan Corbet,
Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
Thorsten Leemhuis, stable, linux-mm
__gup_benchmark_ioctl does not handle the case where
get_user_pages_fast fails:
- a negative return code will cause a buffer overrun
- returning with partial success will cause use of
uninitialized memory.
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
mm/gup_benchmark.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/gup_benchmark.c b/mm/gup_benchmark.c
index 5c8e2ab..d743035 100644
--- a/mm/gup_benchmark.c
+++ b/mm/gup_benchmark.c
@@ -23,7 +23,7 @@ static int __gup_benchmark_ioctl(unsigned int cmd,
struct page **pages;
nr_pages = gup->size / PAGE_SIZE;
- pages = kvmalloc(sizeof(void *) * nr_pages, GFP_KERNEL);
+ pages = kvzalloc(sizeof(void *) * nr_pages, GFP_KERNEL);
if (!pages)
return -ENOMEM;
@@ -41,7 +41,8 @@ static int __gup_benchmark_ioctl(unsigned int cmd,
}
nr = get_user_pages_fast(addr, nr, gup->flags & 1, pages + i);
- i += nr;
+ if (nr > 0)
+ i += nr;
}
end_time = ktime_get();
--
MST
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 2/3] gup: return -EFAULT on access_ok failure
[not found] <1522962072-182137-1-git-send-email-mst@redhat.com>
2018-04-05 21:03 ` [PATCH v2 1/3] mm/gup_benchmark: handle gup failures Michael S. Tsirkin
@ 2018-04-05 21:03 ` Michael S. Tsirkin
2018-04-05 21:03 ` [PATCH v2 3/3] mm/gup: document return value Michael S. Tsirkin
2 siblings, 0 replies; 6+ messages in thread
From: Michael S. Tsirkin @ 2018-04-05 21:03 UTC (permalink / raw)
To: linux-kernel
Cc: Kirill A . Shutemov, Andrew Morton, Huang Ying, Jonathan Corbet,
Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
Thorsten Leemhuis, stable, syzbot+6304bf97ef436580fede, linux-mm
get_user_pages_fast is supposed to be a faster drop-in equivalent of
get_user_pages. As such, callers expect it to return a negative return
code when passed an invalid address, and never expect it to
return 0 when passed a positive number of pages, since
its documentation says:
* Returns number of pages pinned. This may be fewer than the number
* requested. If nr_pages is 0 or negative, returns 0. If no pages
* were pinned, returns -errno.
When get_user_pages_fast fall back on get_user_pages this is exactly
what happens. Unfortunately the implementation is inconsistent: it returns 0 if
passed a kernel address, confusing callers: for example, the following
is pretty common but does not appear to do the right thing with a kernel
address:
ret = get_user_pages_fast(addr, 1, writeable, &page);
if (ret < 0)
return ret;
Change get_user_pages_fast to return -EFAULT when supplied a
kernel address to make it match expectations.
All caller have been audited for consistency with the documented
semantics.
Lightly tested.
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thorsten Leemhuis <regressions@leemhuis.info>
Cc: stable@vger.kernel.org
Fixes: 5b65c4677a57 ("mm, x86/mm: Fix performance regression in get_user_pages_fast()")
Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
mm/gup.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/gup.c b/mm/gup.c
index 6afae32..8f3a064 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1806,9 +1806,12 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
len = (unsigned long) nr_pages << PAGE_SHIFT;
end = start + len;
+ if (nr_pages <= 0)
+ return 0;
+
if (unlikely(!access_ok(write ? VERIFY_WRITE : VERIFY_READ,
(void __user *)start, len)))
- return 0;
+ return -EFAULT;
if (gup_fast_permitted(start, nr_pages, write)) {
local_irq_disable();
--
MST
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 3/3] mm/gup: document return value
[not found] <1522962072-182137-1-git-send-email-mst@redhat.com>
2018-04-05 21:03 ` [PATCH v2 1/3] mm/gup_benchmark: handle gup failures Michael S. Tsirkin
2018-04-05 21:03 ` [PATCH v2 2/3] gup: return -EFAULT on access_ok failure Michael S. Tsirkin
@ 2018-04-05 21:03 ` Michael S. Tsirkin
2 siblings, 0 replies; 6+ messages in thread
From: Michael S. Tsirkin @ 2018-04-05 21:03 UTC (permalink / raw)
To: linux-kernel
Cc: Kirill A . Shutemov, Andrew Morton, Huang Ying, Jonathan Corbet,
Linus Torvalds, Peter Zijlstra, Thomas Gleixner,
Thorsten Leemhuis, Ralf Baechle, James Hogan, Martin Schwidefsky,
Heiko Carstens, Yoshinori Sato, Rich Felker, David S. Miller,
linux-mips, linux-s390, linux-sh, sparclinux, linux-mm
__get_user_pages_fast handles errors differently from
get_user_pages_fast: the former always returns the number of pages
pinned, the later might return a negative error code.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
arch/mips/mm/gup.c | 2 ++
arch/s390/mm/gup.c | 2 ++
arch/sh/mm/gup.c | 2 ++
arch/sparc/mm/gup.c | 4 ++++
mm/gup.c | 4 +++-
mm/util.c | 6 ++++--
6 files changed, 17 insertions(+), 3 deletions(-)
diff --git a/arch/mips/mm/gup.c b/arch/mips/mm/gup.c
index 1e4658e..5a4875ca 100644
--- a/arch/mips/mm/gup.c
+++ b/arch/mips/mm/gup.c
@@ -178,6 +178,8 @@ static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end,
/*
* Like get_user_pages_fast() except its IRQ-safe in that it won't fall
* back to the regular GUP.
+ * Note a difference with get_user_pages_fast: this always returns the
+ * number of pages pinned, 0 if no pages were pinned.
*/
int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
struct page **pages)
diff --git a/arch/s390/mm/gup.c b/arch/s390/mm/gup.c
index 05c8abd..2809d11 100644
--- a/arch/s390/mm/gup.c
+++ b/arch/s390/mm/gup.c
@@ -220,6 +220,8 @@ static inline int gup_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr,
/*
* Like get_user_pages_fast() except its IRQ-safe in that it won't fall
* back to the regular GUP.
+ * Note a difference with get_user_pages_fast: this always returns the
+ * number of pages pinned, 0 if no pages were pinned.
*/
int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
struct page **pages)
diff --git a/arch/sh/mm/gup.c b/arch/sh/mm/gup.c
index 8045b5b..56c86ca 100644
--- a/arch/sh/mm/gup.c
+++ b/arch/sh/mm/gup.c
@@ -160,6 +160,8 @@ static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end,
/*
* Like get_user_pages_fast() except its IRQ-safe in that it won't fall
* back to the regular GUP.
+ * Note a difference with get_user_pages_fast: this always returns the
+ * number of pages pinned, 0 if no pages were pinned.
*/
int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
struct page **pages)
diff --git a/arch/sparc/mm/gup.c b/arch/sparc/mm/gup.c
index 5335ba3..ca3eb69 100644
--- a/arch/sparc/mm/gup.c
+++ b/arch/sparc/mm/gup.c
@@ -192,6 +192,10 @@ static int gup_pud_range(pgd_t pgd, unsigned long addr, unsigned long end,
return 1;
}
+/*
+ * Note a difference with get_user_pages_fast: this always returns the
+ * number of pages pinned, 0 if no pages were pinned.
+ */
int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
struct page **pages)
{
diff --git a/mm/gup.c b/mm/gup.c
index 8f3a064..5cb5bb1 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1740,7 +1740,9 @@ bool gup_fast_permitted(unsigned long start, int nr_pages, int write)
/*
* Like get_user_pages_fast() except it's IRQ-safe in that it won't fall back to
- * the regular GUP. It will only return non-negative values.
+ * the regular GUP.
+ * Note a difference with get_user_pages_fast: this always returns the
+ * number of pages pinned, 0 if no pages were pinned.
*/
int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
struct page **pages)
diff --git a/mm/util.c b/mm/util.c
index c125050..db2f005 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -297,8 +297,10 @@ void arch_pick_mmap_layout(struct mm_struct *mm)
/*
* Like get_user_pages_fast() except its IRQ-safe in that it won't fall
* back to the regular GUP.
- * If the architecture not support this function, simply return with no
- * page pinned
+ * Note a difference with get_user_pages_fast: this always returns the
+ * number of pages pinned, 0 if no pages were pinned.
+ * If the architecture does not support this function, simply return with no
+ * pages pinned.
*/
int __weak __get_user_pages_fast(unsigned long start,
int nr_pages, int write, struct page **pages)
--
MST
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/3] mm/gup_benchmark: handle gup failures
2018-04-05 21:03 ` [PATCH v2 1/3] mm/gup_benchmark: handle gup failures Michael S. Tsirkin
@ 2018-04-07 20:08 ` Linus Torvalds
2018-04-08 3:12 ` Michael S. Tsirkin
0 siblings, 1 reply; 6+ messages in thread
From: Linus Torvalds @ 2018-04-07 20:08 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Linux Kernel Mailing List, Kirill A . Shutemov, Andrew Morton,
Huang Ying, Jonathan Corbet, Peter Zijlstra, Thomas Gleixner,
Thorsten Leemhuis, stable, linux-mm
On Thu, Apr 5, 2018 at 2:03 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> nr = get_user_pages_fast(addr, nr, gup->flags & 1, pages + i);
> - i += nr;
> + if (nr > 0)
> + i += nr;
Can we just make this robust while at it, and just make it
if (nr <= 0)
break;
instead? Then it doesn't care about zero vs negative error, and
wouldn't get stuck in an endless loop if it got zero.
Linus
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/3] mm/gup_benchmark: handle gup failures
2018-04-07 20:08 ` Linus Torvalds
@ 2018-04-08 3:12 ` Michael S. Tsirkin
2018-04-10 0:38 ` Andrew Morton
0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2018-04-08 3:12 UTC (permalink / raw)
To: Linus Torvalds
Cc: Linux Kernel Mailing List, Kirill A . Shutemov, Andrew Morton,
Huang Ying, Jonathan Corbet, Peter Zijlstra, Thomas Gleixner,
Thorsten Leemhuis, stable, linux-mm
On Sat, Apr 07, 2018 at 01:08:43PM -0700, Linus Torvalds wrote:
> On Thu, Apr 5, 2018 at 2:03 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > nr = get_user_pages_fast(addr, nr, gup->flags & 1, pages + i);
> > - i += nr;
> > + if (nr > 0)
> > + i += nr;
>
> Can we just make this robust while at it, and just make it
>
> if (nr <= 0)
> break;
>
> instead? Then it doesn't care about zero vs negative error, and
> wouldn't get stuck in an endless loop if it got zero.
>
> Linus
I don't mind though it alredy breaks out on the next cycle:
if (nr != gup->nr_pages_per_call)
break;
the only issue is i getting corrupted when nr < 0;
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/3] mm/gup_benchmark: handle gup failures
2018-04-08 3:12 ` Michael S. Tsirkin
@ 2018-04-10 0:38 ` Andrew Morton
0 siblings, 0 replies; 6+ messages in thread
From: Andrew Morton @ 2018-04-10 0:38 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Linus Torvalds, Linux Kernel Mailing List, Kirill A . Shutemov,
Huang Ying, Jonathan Corbet, Peter Zijlstra, Thomas Gleixner,
Thorsten Leemhuis, stable, linux-mm
On Sun, 8 Apr 2018 06:12:13 +0300 "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sat, Apr 07, 2018 at 01:08:43PM -0700, Linus Torvalds wrote:
> > On Thu, Apr 5, 2018 at 2:03 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > nr = get_user_pages_fast(addr, nr, gup->flags & 1, pages + i);
> > > - i += nr;
> > > + if (nr > 0)
> > > + i += nr;
> >
> > Can we just make this robust while at it, and just make it
> >
> > if (nr <= 0)
> > break;
> >
> > instead? Then it doesn't care about zero vs negative error, and
> > wouldn't get stuck in an endless loop if it got zero.
> >
> > Linus
>
> I don't mind though it alredy breaks out on the next cycle:
>
> if (nr != gup->nr_pages_per_call)
> break;
>
> the only issue is i getting corrupted when nr < 0;
>
It does help readability to have the thing bail out as soon as we see
something go bad. This?
--- a/mm/gup_benchmark.c~mm-gup_benchmark-handle-gup-failures-fix
+++ a/mm/gup_benchmark.c
@@ -41,8 +41,9 @@ static int __gup_benchmark_ioctl(unsigne
}
nr = get_user_pages_fast(addr, nr, gup->flags & 1, pages + i);
- if (nr > 0)
- i += nr;
+ if (nr <= 0)
+ break;
+ i += nr;
}
end_time = ktime_get();
_
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2018-04-10 0:38 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <1522962072-182137-1-git-send-email-mst@redhat.com>
2018-04-05 21:03 ` [PATCH v2 1/3] mm/gup_benchmark: handle gup failures Michael S. Tsirkin
2018-04-07 20:08 ` Linus Torvalds
2018-04-08 3:12 ` Michael S. Tsirkin
2018-04-10 0:38 ` Andrew Morton
2018-04-05 21:03 ` [PATCH v2 2/3] gup: return -EFAULT on access_ok failure Michael S. Tsirkin
2018-04-05 21:03 ` [PATCH v2 3/3] mm/gup: document return value Michael S. Tsirkin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).