From: Andy Lutomirski <luto@amacapital.net>
To: Peter Zijlstra <peterz@infradead.org>,
	torvalds@linux-foundation.org, paulmck@linux.vnet.ibm.com,
	tglx@linutronix.de, akpm@linux-foundation.org, riel@redhat.com,
	mgorman@suse.de, oleg@redhat.com, mingo@redhat.com,
	minchan@kernel.org, kamezawa.hiroyu@jp.fujitsu.com,
	viro@zeniv.linux.org.uk, laijs@cn.fujitsu.com, dave@stgolabs.net
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC][PATCH 0/6] Another go at speculative page faults
Date: Mon, 20 Oct 2014 17:07:02 -0700
Message-ID: <5445A3A6.2@amacapital.net>
In-Reply-To: <20141020215633.717315139@infradead.org>

On 10/20/2014 02:56 PM, Peter Zijlstra wrote:
> Hi,
> 
> I figured I'd give my 2010 speculative fault series another spin:
> 
>   https://lkml.org/lkml/2010/1/4/257
> 
> Since then I think many of the outstanding issues have changed sufficiently to
> warrant another go. In particular Al Viro's delayed fput seems to have made it
> entirely 'normal' to delay fput(). Lai Jiangshan's SRCU rewrite provided us
> with call_srcu() and my preemptible mmu_gather removed the TLB flushes from
> under the PTL.
> 
> The code needs way more attention but builds a kernel and runs the
> micro-benchmark so I figured I'd post it before sinking more time into it.
> 
> I realize the micro-bench is about as good as it gets for this series and not
> very realistic otherwise, but I think it does show the potential benefit the
> approach has.

Does this mean that an entire fault can complete without ever taking
mmap_sem at all?  If so, that's a *huge* win.

I'm a bit concerned about drivers that assume that the vma is unchanged
during .fault processing.  In particular, is there a race between .close
and .fault?  Would it make sense to add a per-vma rw lock and hold it
during vma modification and .fault calls?

--Andy

> 
> (patches go against .18-rc1+)
> 
> ---
> 
> Using Kamezawa's multi-fault micro-bench from: https://lkml.org/lkml/2010/1/6/28
> 
> My Ivy Bridge EP (2*10*2) has a ~58% improvement in pagefault throughput:
> 
> PRE:
> 
> root@ivb-ep:~# perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault 20
> 
>  Performance counter stats for './multi-fault 20' (5 runs):
> 
>        149,441,555      page-faults                  ( +-  1.25% )
>      2,153,651,828      cache-misses                 ( +-  1.09% )
> 
>       60.003082014 seconds time elapsed              ( +-  0.00% )
> 
> POST:
> 
> root@ivb-ep:~# perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault 20
> 
>  Performance counter stats for './multi-fault 20' (5 runs):
> 
>        236,442,626      page-faults                  ( +-  0.08% )
>      2,796,353,939      cache-misses                 ( +-  1.01% )
> 
>       60.002792431 seconds time elapsed              ( +-  0.00% )
> 
> 
> My Ivy Bridge EX (4*15*2) has a ~78% improvement in pagefault throughput:
> 
> PRE:
> 
> root@ivb-ex:~# perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault 60
> 
>  Performance counter stats for './multi-fault 60' (5 runs):
> 
>        105,789,078      page-faults                 ( +-  2.24% )
>      1,314,072,090      cache-misses                ( +-  1.17% )
> 
>       60.009243533 seconds time elapsed             ( +-  0.00% )
> 
> POST:
> 
> root@ivb-ex:~# perf stat -e page-faults,cache-misses --repeat 5 ./multi-fault 60
> 
>  Performance counter stats for './multi-fault 60' (5 runs):
> 
>        187,751,767      page-faults                 ( +-  2.24% )
>      1,792,758,664      cache-misses                ( +-  2.30% )
> 
>       60.011611579 seconds time elapsed             ( +-  0.00% )
> 
> (I've not yet looked at why the EX sucks chunks compared to the EP box, I
>  suspect we contend on other locks, but it could be anything.)
> 
> ---
> 
>  arch/x86/mm/fault.c      |  35 ++-
>  include/linux/mm.h       |  19 +-
>  include/linux/mm_types.h |   5 +
>  kernel/fork.c            |   1 +
>  mm/init-mm.c             |   1 +
>  mm/internal.h            |  18 ++
>  mm/memory.c              | 672 ++++++++++++++++++++++++++++-------------------
>  mm/mmap.c                | 101 +++++--
>  8 files changed, 544 insertions(+), 308 deletions(-)
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 



