linux-arch.vger.kernel.org archive mirror
From: Steve Capper <steve.capper@linaro.org>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: linux-arch@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: Introduce a general RCU get_user_pages_fast.
Date: Tue, 14 Oct 2014 12:58:55 +0100
Message-ID: <20141014115854.GA32351@linaro.org>
In-Reply-To: <1413284274-13521-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com>

On Tue, Oct 14, 2014 at 04:27:53PM +0530, Aneesh Kumar K.V wrote:
> get_user_pages_fast attempts to pin user pages by walking the page
> tables directly and avoids taking locks. Thus the walker needs to be
> protected from page table pages being freed from under it, and needs
> to block any THP splits.
> 
> One way to achieve this is to have the walker disable interrupts, and
> rely on IPIs from the TLB flushing code blocking before the page table
> pages are freed.
> 
> On some platforms we have hardware broadcast of TLB invalidations, thus
> the TLB flushing code doesn't necessarily need to broadcast IPIs; and
> spuriously broadcasting IPIs can hurt system performance if done too
> often.
> 
> This problem has been solved on PowerPC and Sparc by batching up page
> table pages belonging to more than one mm_user, then scheduling an
> rcu_sched callback to free the pages. This RCU page table free logic
> has been promoted to core code and is activated when one enables
> HAVE_RCU_TABLE_FREE. Unfortunately, these architectures implement
> their own get_user_pages_fast routines.
> 
> The RCU page table free logic, coupled with an IPI broadcast on THP
> split (which is a rare event), allows one to protect a page table
> walker by merely disabling interrupts during the walk.
> 
> This patch provides a general RCU implementation of get_user_pages_fast
> that can be used by architectures that perform hardware broadcast of
> TLB invalidations.
> 
> It is based heavily on the PowerPC implementation.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
> 
> NOTE: I kept the patch description as it is and also retained the documentation.
> I also dropped the Tested-by and other SOB tags, because I was not sure whether
> they want to be blamed for the bugs here. Please feel free to update.

Hi Aneesh,
Thank you for coding this up.

I've compiled and briefly tested this on arm (with and without LPAE),
and arm64. I ran a custom futex on THP tail test, and this passed.
I'll test this a little more aggressively with ltp.

I think Linus has already pulled in the RCU gup I posted. Could you
please write a patch against that commit instead?
2667f50 mm: introduce a general RCU get_user_pages_fast()

I had one issue compiling this, pgd_huge was undefined. I think this
is only defined for PowerPC? Could a stub definition of pgd_huge be
added?
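
For illustration, a stub along these lines might be enough. This is a
sketch only: where it would live and its exact form are assumptions on
my part, and the pgd_t below is a stand-in typedef so that the fragment
builds outside the kernel tree.

```c
/* Stand-in for the kernel's pgd_t; hypothetical, for illustration only. */
typedef struct { unsigned long pgd; } pgd_t;

/*
 * Fallback for architectures that do not provide their own pgd_huge():
 * report that no PGD entry ever maps a huge page, so a generic walker
 * always takes the normal (non-huge) path at this level.
 */
#ifndef pgd_huge
#define pgd_huge(x)	0
#endif
```

An architecture that does support huge mappings at the PGD level would
define pgd_huge itself before this point, and the #ifndef guard would
then leave that definition alone.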

Cheers,
-- 
Steve

