linux-arm-kernel.lists.infradead.org archive mirror
From: steve.capper@linaro.org (Steve Capper)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATH V2 1/6] mm: Introduce a general RCU get_user_pages_fast.
Date: Wed, 27 Aug 2014 13:50:28 +0100
Message-ID: <20140827125027.GA7765@linaro.org>
In-Reply-To: <20140827085442.GD16376@arm.com>

On Wed, Aug 27, 2014 at 09:54:42AM +0100, Will Deacon wrote:
> Hi Steve,
> 

Hey Will,

> A few minor comments (took me a while to understand how this works, so I
> thought I'd make some noise :)

A big thank you for reading through it :-).

> 
> On Thu, Aug 21, 2014 at 04:43:27PM +0100, Steve Capper wrote:
> > get_user_pages_fast attempts to pin user pages by walking the page
> > tables directly and avoids taking locks. Thus the walker needs to be
> > protected from page table pages being freed from under it, and needs
> > to block any THP splits.
> > 
> > One way to achieve this is to have the walker disable interrupts, and
> > rely on IPIs from the TLB flushing code blocking before the page table
> > pages are freed.
> > 
> > On some platforms we have hardware broadcast of TLB invalidations, thus
> > the TLB flushing code doesn't necessarily need to broadcast IPIs; and
> > spuriously broadcasting IPIs can hurt system performance if done too
> > often.
> > 
> > This problem has been solved on PowerPC and Sparc by batching up page
> > table pages belonging to more than one mm_user, then scheduling an
> > rcu_sched callback to free the pages. This RCU page table free logic
> > has been promoted to core code and is activated when one enables
> > HAVE_RCU_TABLE_FREE. Unfortunately, these architectures implement
> > their own get_user_pages_fast routines.
> > 
> > The RCU page table free logic coupled with an IPI broadcast on THP
> > split (which is a rare event), allows one to protect a page table
> > walker by merely disabling the interrupts during the walk.
> 
> Disabling interrupts isn't completely free (it's a self-synchronising
> operation on ARM). It would be interesting to see if your futex workload
> performance is improved by my simple irq_save optimisation for ARM:
> 
>   https://git.kernel.org/cgit/linux/kernel/git/will/linux.git/commit/?h=misc-patches&id=312a70adfa6f22e9d62803dd21400f481253e58b
> 
> (I've been struggling to show anything other than tiny improvements from
> that patch).
> 

This looks like a useful optimisation; I'll have a think about workloads that
fire many futexes on THP tails. (The test I used only fired off one futex).

> > This patch provides a general RCU implementation of get_user_pages_fast
> > that can be used by architectures that perform hardware broadcast of
> > TLB invalidations.
> > 
> > It is based heavily on the PowerPC implementation by Nick Piggin.
> 
> [...]
> 
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 91d044b..2f684fa 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -10,6 +10,10 @@
> >  #include <linux/swap.h>
> >  #include <linux/swapops.h>
> >  
> > +#include <linux/sched.h>
> > +#include <linux/rwsem.h>
> > +#include <asm/pgtable.h>
> > +
> >  #include "internal.h"
> >  
> >  static struct page *no_page_table(struct vm_area_struct *vma,
> > @@ -672,3 +676,277 @@ struct page *get_dump_page(unsigned long addr)
> >  	return page;
> >  }
> >  #endif /* CONFIG_ELF_CORE */
> > +
> > +#ifdef CONFIG_HAVE_RCU_GUP
> > +
> > +#ifdef __HAVE_ARCH_PTE_SPECIAL
> 
> Do we actually require this (pte special) if hugepages are disabled or
> not supported?

We need this logic if we want to use fast_gup on normal pages safely. The
special bit indicates that we should not attempt to take a reference to the
underlying page.

Huge pages are guaranteed not to be special.
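
To illustrate the point about the special bit, here is a minimal userspace
sketch. It is not the code from the patch: the toy_* names, the pte bit
layout and the flat mem_map array are all made up for this example. It only
shows the shape of the check: the walker stops at the first pte that is not
present or is marked special, without touching a refcount, so that the
caller can fall back to the slow path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software pte layout, for illustration only. */
typedef uint64_t toy_pte_t;
#define TOY_PTE_PRESENT	(1ULL << 0)
#define TOY_PTE_SPECIAL	(1ULL << 1)
#define TOY_PFN_SHIFT	12

/* Stand-in for struct page: only the reference count matters here. */
struct toy_page {
	int refcount;
};

static bool toy_pte_present(toy_pte_t pte)
{
	return (pte & TOY_PTE_PRESENT) != 0;
}

static bool toy_pte_special(toy_pte_t pte)
{
	return (pte & TOY_PTE_SPECIAL) != 0;
}

static size_t toy_pte_pfn(toy_pte_t pte)
{
	return (size_t)(pte >> TOY_PFN_SHIFT);
}

/*
 * Walk nr_ptes entries and take a reference on each backing page.
 * Returns the number of pages pinned. The walk stops at the first pte
 * that is not present, is special, or points outside mem_map; no
 * reference is taken for that pte.
 */
static int toy_gup_pte_range(const toy_pte_t *ptes, size_t nr_ptes,
			     struct toy_page *mem_map, size_t map_len,
			     struct toy_page **pages)
{
	int nr = 0;

	for (size_t i = 0; i < nr_ptes; i++) {
		toy_pte_t pte = ptes[i];
		size_t pfn;

		/* Special ptes must never have a page reference taken. */
		if (!toy_pte_present(pte) || toy_pte_special(pte))
			break;

		pfn = toy_pte_pfn(pte);
		if (pfn >= map_len)
			break;

		mem_map[pfn].refcount++;
		pages[nr++] = &mem_map[pfn];
	}

	return nr;
}
```

With three ptes where the second one is special, only the first page ends up
pinned and the special one is left alone.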

Cheers,
-- 
Steve

Thread overview: 26+ messages
2014-08-21 15:43 [PATH V2 0/6] RCU get_user_pages_fast and __get_user_pages_fast Steve Capper
2014-08-21 15:43 ` [PATH V2 1/6] mm: Introduce a general RCU get_user_pages_fast Steve Capper
2014-08-27  8:54   ` Will Deacon
2014-08-27 12:50     ` Steve Capper [this message]
2014-08-27 13:14       ` Will Deacon
2014-08-27 14:28   ` Catalin Marinas
2014-08-27 14:42     ` Steve Capper
2014-08-27 15:01   ` Russell King - ARM Linux
2014-08-28  8:59     ` Steve Capper
2014-08-21 15:43 ` [PATH V2 2/6] arm: mm: Introduce special ptes for LPAE Steve Capper
2014-08-27 10:46   ` Catalin Marinas
2014-08-27 12:52     ` Steve Capper
2014-08-21 15:43 ` [PATH V2 3/6] arm: mm: Enable HAVE_RCU_TABLE_FREE logic Steve Capper
2014-08-27 11:50   ` Catalin Marinas
2014-08-27 12:59     ` Steve Capper
2014-08-21 15:43 ` [PATH V2 4/6] arm: mm: Enable RCU fast_gup Steve Capper
2014-08-27 11:51   ` Catalin Marinas
2014-08-27 13:01     ` Steve Capper
2014-08-21 15:43 ` [PATH V2 5/6] arm64: mm: Enable HAVE_RCU_TABLE_FREE logic Steve Capper
2014-08-27 10:48   ` Catalin Marinas
2014-08-27 13:08     ` Steve Capper
2014-08-21 15:43 ` [PATH V2 6/6] arm64: mm: Enable RCU fast_gup Steve Capper
2014-08-27 11:09   ` Catalin Marinas
2014-08-27 13:43     ` Steve Capper
2014-08-21 20:42 ` [PATH V2 0/6] RCU get_user_pages_fast and __get_user_pages_fast Dann Frazier
2014-08-22  8:11   ` Steve Capper
