From: Will Deacon <will.deacon@arm.com>
To: Steve Capper <steve.capper@linaro.org>
Cc: "linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	"linux@arm.linux.org.uk" <linux@arm.linux.org.uk>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"gary.robertson@linaro.org" <gary.robertson@linaro.org>,
	"christoffer.dall@linaro.org" <christoffer.dall@linaro.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"anders.roxell@linaro.org" <anders.roxell@linaro.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"dann.frazier@canonical.com" <dann.frazier@canonical.com>,
	Mark Rutland <Mark.Rutland@arm.com>,
	"mgorman@suse.de" <mgorman@suse.de>
Subject: Re: [PATH V2 1/6] mm: Introduce a general RCU get_user_pages_fast.
Date: Wed, 27 Aug 2014 09:54:42 +0100
Message-ID: <20140827085442.GD16376@arm.com>
In-Reply-To: <1408635812-31584-2-git-send-email-steve.capper@linaro.org>

Hi Steve,

A few minor comments (took me a while to understand how this works, so I
thought I'd make some noise :)

On Thu, Aug 21, 2014 at 04:43:27PM +0100, Steve Capper wrote:
> get_user_pages_fast attempts to pin user pages by walking the page
> tables directly and avoids taking locks. Thus the walker needs to be
> protected from page table pages being freed from under it, and needs
> to block any THP splits.
> 
> One way to achieve this is to have the walker disable interrupts, and
> rely on IPIs from the TLB flushing code blocking before the page table
> pages are freed.
> 
> On some platforms we have hardware broadcast of TLB invalidations, thus
> the TLB flushing code doesn't necessarily need to broadcast IPIs; and
> spuriously broadcasting IPIs can hurt system performance if done too
> often.
> 
> This problem has been solved on PowerPC and Sparc by batching up page
> table pages belonging to more than one mm_user, then scheduling an
> rcu_sched callback to free the pages. This RCU page table free logic
> has been promoted to core code and is activated when one enables
> HAVE_RCU_TABLE_FREE. Unfortunately, these architectures implement
> their own get_user_pages_fast routines.
> 
> The RCU page table free logic, coupled with an IPI broadcast on THP
> split (which is a rare event), allows one to protect a page table
> walker by merely disabling the interrupts during the walk.

Disabling interrupts isn't completely free (it's a self-synchronising
operation on ARM). It would be interesting to see if your futex workload
performance is improved by my simple irq_save optimisation for ARM:

  https://git.kernel.org/cgit/linux/kernel/git/will/linux.git/commit/?h=misc-patches&id=312a70adfa6f22e9d62803dd21400f481253e58b

(I've been struggling to show anything other than tiny improvements from
that patch).
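
For readers following the thread, the lockless walk described in the quoted
commit message can be modelled in user space. What follows is only a minimal
sketch under stated assumptions, not the code from the patch: every name
(model_page, model_pte, model_get_speculative, model_put, model_gup_one) is
hypothetical, C11 atomics stand in for the kernel's page reference counting,
and the interrupt disabling and RCU-deferred table freeing that keep the
table memory alive during the walk are not modelled at all. It shows the
pattern a lockless walker typically relies on: snapshot the entry, take a
speculative reference, then re-read the entry and back out if it changed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; none of these names come from the patch. */
struct model_page {
	atomic_int refcount;		/* 0 means the page is being freed */
};

struct model_pte {
	_Atomic(uintptr_t) val;		/* 0 = not present, else a model_page pointer */
};

/* Take a reference only if the page is not already on its way to being freed. */
static bool model_get_speculative(struct model_page *page)
{
	int old = atomic_load(&page->refcount);

	while (old > 0) {
		/* On failure, 'old' is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&page->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously taken with model_get_speculative(). */
static void model_put(struct model_page *page)
{
	int old = atomic_fetch_sub(&page->refcount, 1);

	assert(old > 0);
	(void)old;
}

/*
 * Lockless lookup of a single entry. Returns the pinned page, or NULL if
 * the entry is empty, the page is going away, or the entry changed while
 * we were pinning it. A NULL return means "fall back to the slow path".
 */
static struct model_page *model_gup_one(struct model_pte *ptep)
{
	uintptr_t snap = atomic_load(&ptep->val);
	struct model_page *page = (struct model_page *)snap;

	if (!page)
		return NULL;

	if (!model_get_speculative(page))
		return NULL;

	/* Re-check: the entry may have been changed under us. */
	if (atomic_load(&ptep->val) != snap) {
		model_put(page);
		return NULL;
	}

	return page;
}
```

A caller would treat a NULL return as a request to retry through a slower,
locked lookup rather than as an error. The re-check after taking the
reference is what makes the pin trustworthy: without it, the walker could
end up holding a page that the entry no longer maps.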

> This patch provides a general RCU implementation of get_user_pages_fast
> that can be used by architectures that perform hardware broadcast of
> TLB invalidations.
> 
> It is based heavily on the PowerPC implementation by Nick Piggin.

[...]

> diff --git a/mm/gup.c b/mm/gup.c
> index 91d044b..2f684fa 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -10,6 +10,10 @@
>  #include <linux/swap.h>
>  #include <linux/swapops.h>
>  
> +#include <linux/sched.h>
> +#include <linux/rwsem.h>
> +#include <asm/pgtable.h>
> +
>  #include "internal.h"
>  
>  static struct page *no_page_table(struct vm_area_struct *vma,
> @@ -672,3 +676,277 @@ struct page *get_dump_page(unsigned long addr)
>  	return page;
>  }
>  #endif /* CONFIG_ELF_CORE */
> +
> +#ifdef CONFIG_HAVE_RCU_GUP
> +
> +#ifdef __HAVE_ARCH_PTE_SPECIAL

Do we actually require this (pte special) if hugepages are disabled or
not supported?

Will

