From: Nick Piggin <npiggin@suse.de>
To: Andy Whitcroft <apw@shadowen.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
shaggy@austin.ibm.com, jens.axboe@oracle.com,
torvalds@linux-foundation.org, linux-mm@kvack.org,
linux-arch@vger.kernel.org
Subject: Re: [patch 2/2] lockless get_user_pages
Date: Wed, 28 May 2008 14:28:27 +0200 [thread overview]
Message-ID: <20080528122827.GH2630@wotan.suse.de> (raw)
In-Reply-To: <20080528113906.GA699@shadowen.org>
On Wed, May 28, 2008 at 12:39:06PM +0100, Andy Whitcroft wrote:
> On Sun, May 25, 2008 at 04:52:27PM +0200, Nick Piggin wrote:
> >
> > The downside of using fast_gup is that if there is not a pte with the
> > correct permissions for the access, we end up falling back to get_user_pages
> > and so the fast_gup is just extra work. This should not be the common case
> > in performance critical code, I'd hope.
>
> >From what I can see of the algorithm, as it stops at the first non-present
> page, as long as the optimisation you allude to in the comments was
> implemented the overhead would be much less, and basically constant for
> the failure case.
Right.
> See below for my take on that optimisation.
>
> > Signed-off-by: Nick Piggin <npiggin@suse.de>
> >
> > ---
> > arch/x86/mm/Makefile | 2
> > arch/x86/mm/gup.c | 193 ++++++++++++++++++++++++++++++++++++++++++++++
> > fs/bio.c | 8 -
> > fs/direct-io.c | 10 --
> > fs/splice.c | 41 ---------
> > include/asm-x86/uaccess.h | 5 +
> > include/linux/mm.h | 19 ++++
> > 7 files changed, 225 insertions(+), 53 deletions(-)
>
> I do wonder if it would be logical to introduce the generic version and
> the x86 implementation as separate patches as there will be further
> architecture specific implementations.
Yeah it probably would be. Probably also convert call sites over one
at a time too, which would make eg. regressions more bisectable. Will
do that.
> > Index: linux-2.6/include/linux/mm.h
> > ===================================================================
> > --- linux-2.6.orig/include/linux/mm.h
> > +++ linux-2.6/include/linux/mm.h
> > @@ -12,6 +12,7 @@
> > #include <linux/prio_tree.h>
> > #include <linux/debug_locks.h>
> > #include <linux/mm_types.h>
> > +#include <linux/uaccess.h> /* for __HAVE_ARCH_FAST_GUP */
> >
> > struct mempolicy;
> > struct anon_vma;
> > @@ -830,6 +831,24 @@ extern int mprotect_fixup(struct vm_area
> > struct vm_area_struct **pprev, unsigned long start,
> > unsigned long end, unsigned long newflags);
> >
> > +#ifndef __HAVE_ARCH_FAST_GUP
> > +/* Should be moved to asm-generic, and architectures can include it if they
> > + * don't implement their own fast_gup.
> > + */
> > +#define fast_gup(start, nr_pages, write, pages) \
> > +({ \
> > + struct mm_struct *mm = current->mm; \
> > + int ret; \
> > + \
> > + down_read(&mm->mmap_sem); \
> > + ret = get_user_pages(current, mm, start, nr_pages, \
> > + write, 0, pages, NULL); \
> > + up_read(&mm->mmap_sem); \
> > + \
> > + ret; \
> > +})
> > +#endif
> > +
>
> Could this not be inserted into linux/uaccess.h? That feels like a more
> natural fit given the location of the __HAVE_* flag as well. Feels like
> that would pull something out of mm.h and as call sites convert they
> would necessarily get fixed should they need the additional header.
>
> I had a quick try and it cirtainly seems like moving it there compiles
> on x86 at least (with the optimised version removed).
I don't know. uaccess seems to me like copying blindly to or from
userspace addresses (basically doesn't otherwise care about how the
vm is implemented). get_user_pages has a more mm.hish feel.
Doesn't bother me so much though.
> > + if (!gup_pud_range(pgd, addr, next, write, pages, &nr))
> > + goto slow;
> > + } while (pgdp++, addr = next, addr != end);
> > + local_irq_enable();
> > +
> > + VM_BUG_ON(nr != (end - start) >> PAGE_SHIFT);
> > + return nr;
> > +
> > + {
> > + int i, ret;
> > +
> > +slow:
> > + local_irq_enable();
> > +slow_irqon:
> > + /* Could optimise this more by keeping what we've already got */
> > + for (i = 0; i < nr; i++)
> > + put_page(pages[i]);
>
> it feels like optimising this would be pretty simple, is not the
> following sufficient:
>
> start += nr << PAGE_SHIFT;
> pages += nr;
Yeah... I can't remember whether I had any reservations about that.
But no, I think just doing that should be fine.
I suppose the get_user_pages return value has to be checked too.
>
> > +
> > + down_read(&mm->mmap_sem);
> > + ret = get_user_pages(current, mm, start,
> > + (end - start) >> PAGE_SHIFT, write, 0, pages, NULL);
> > + up_read(&mm->mmap_sem);
if (nr > 0) {
if (ret < 0)
ret = nr;
else
ret += nr;
}
OK. I'll try to implement that and test it.
> > +
> > + return ret;
> > + }
> > +}
> > Index: linux-2.6/include/asm-x86/uaccess.h
> > ===================================================================
> > --- linux-2.6.orig/include/asm-x86/uaccess.h
> > +++ linux-2.6/include/asm-x86/uaccess.h
> > @@ -3,3 +3,8 @@
> > #else
> > # include "uaccess_64.h"
> > #endif
> > +
> > +#define __HAVE_ARCH_FAST_GUP
> > +struct page;
> > +int fast_gup(unsigned long start, int nr_pages, int write, struct page **pages);
> > +
>
> Reviewed-by: Andy Whitcroft <apw@shadowen.org>
Thanks. I'll resubmit after incorporating your comments.
Thanks,
Nick
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-05-28 12:28 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-05-25 14:48 [patch 1/2] x86: implement pte_special Nick Piggin
2008-05-25 14:52 ` [patch 2/2] lockless get_user_pages Nick Piggin
2008-05-25 17:18 ` Johannes Weiner
2008-05-26 1:40 ` Nick Piggin
2008-05-26 15:02 ` Johannes Weiner
2008-05-27 0:57 ` KOSAKI Motohiro
2008-05-27 2:28 ` Nick Piggin
2008-05-27 2:46 ` KOSAKI Motohiro
2008-05-27 2:57 ` Nick Piggin
2008-05-29 3:43 ` Nick Piggin
2008-05-28 11:39 ` Andy Whitcroft
2008-05-28 12:28 ` Nick Piggin [this message]
2008-05-28 14:32 ` Nick Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080528122827.GH2630@wotan.suse.de \
--to=npiggin@suse.de \
--cc=akpm@linux-foundation.org \
--cc=apw@shadowen.org \
--cc=jens.axboe@oracle.com \
--cc=linux-arch@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=shaggy@austin.ibm.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).