From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Nick Piggin <npiggin@suse.de>,
Andrea Arcangeli <aarcange@redhat.com>,
Avi Kivity <avi@redhat.com>, Thomas Gleixner <tglx@linutronix.de>,
Rik van Riel <riel@redhat.com>, Ingo Molnar <mingo@elte.hu>,
akpm@linux-foundation.org,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
David Miller <davem@davemloft.net>,
Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Mel Gorman <mel@csn.ul.ie>
Subject: Re: [PATCH 07/13] powerpc: Preemptible mmu_gather
Date: Tue, 13 Apr 2010 12:06:40 +1000
Message-ID: <1271124400.13059.36.camel@pasglop>
In-Reply-To: <1270800847.20295.3225.camel@laptop>
On Fri, 2010-04-09 at 10:14 +0200, Peter Zijlstra wrote:
>
> and doing that vaddr collection right along with it in the same batch.
>
> I think that that would work, Ben, Dave?
Well, ours aren't struct pages.
I.e., there are fundamentally 3 things that we are trying to batch here :-)
1- The original mmu_gather: batching the freeing of the actual user
pages, so that the TLB flush can be delayed/gathered, plus there might
be some micro-improvement in passing the page list to the allocator for
freeing all at once. This is thus purely a batch of struct pages.
2- The batching of the TLB flushes (or hash invalidates in the ppc
case) proper, which needs the addition of the vaddr for things like
sparc and powerpc since we don't just invalidate the whole bloody thing
unlike x86 :-) On powerpc, we actually need more, we need the actual PTE
content since it also contains tracking information relative to where
things have been put in the hash table.
3- The batching of the freeing of the page table structure, which we
want to delay more than batch, ie, the goal here is to delay that
freeing using RCU until everybody has stopped walking them. This does
rely on the RCU grace period being "interrupt safe", ie, there's no
rcu_read_lock() in the low level TLB or hash miss code, but that code
runs with interrupts off.
Now, 2. has a problem I described earlier, which is that we must not
have the possibility of introducing a duplicate in the hash table, thus
it must not be possible to put a new PTE in until the previous one has
been flushed or bad things would happen. This is why powerpc doesn't use
the mmu_gather the way it was originally intended to do both 1. and 2.
but really only for 1., while for 2. we use a small batch that only
exists between lazy_mmu_enter/exit, since those are always fully enclosed
by a pte lock section.
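To make the shape of that small batch concrete, here is a minimal user-space
sketch in C. It is not the powerpc code: every name in it (flush_batch,
batch_enter, batch_add, batch_exit, BATCH_MAX) is hypothetical, and the
"flush" only counts entries instead of invalidating anything. It just shows
the property described above: entries carry both the vaddr and the old PTE
content, and nothing stays queued once the enclosing section is left.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX 8                 /* hypothetical batch capacity */

struct flush_entry {
	uintptr_t vaddr;            /* address whose translation goes away */
	uint64_t  old_pte;          /* old PTE content (hash slot tracking bits) */
};

struct flush_batch {
	int    active;              /* set between batch_enter() and batch_exit() */
	size_t nr;                  /* entries currently queued */
	size_t flushed_total;       /* illustration only: entries flushed so far */
	struct flush_entry entries[BATCH_MAX];
};

/* Stand-in for the real invalidation: account for the entries, empty the batch. */
void batch_flush(struct flush_batch *b)
{
	b->flushed_total += b->nr;
	b->nr = 0;
}

/* Called right after taking the (hypothetical) PTE lock. */
void batch_enter(struct flush_batch *b)
{
	assert(!b->active);
	b->active = 1;
}

/* Queue one invalidation, or do it immediately when no batch is open. */
void batch_add(struct flush_batch *b, uintptr_t vaddr, uint64_t old_pte)
{
	if (!b->active) {
		b->flushed_total++;
		return;
	}
	b->entries[b->nr].vaddr = vaddr;
	b->entries[b->nr].old_pte = old_pte;
	b->nr++;
	if (b->nr == BATCH_MAX)
		batch_flush(b);
}

/* Called right before dropping the PTE lock: nothing may stay pending. */
void batch_exit(struct flush_batch *b)
{
	batch_flush(b);
	b->active = 0;
}
```

Because batch_exit() always drains the batch before the lock is dropped, no
new PTE can be installed for an address that still has a pending invalidate,
which is the duplicate-in-the-hash-table case the paragraph above rules out.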
3., as you have noticed, relies on the irq stuff. Plus there seems to be a
dubious optimization here with mm_users. Might be worth sorting that
out. However, it's a very different goal than 1. and 2. in the sense
that batching proper is a minor issue, what we want is synchronization
with walkers, and that batching is a way to lower the cost of that
synchronization (allocating of the RCU struct etc...).
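As a rough illustration of that last point, here is a user-space C sketch of
deferred page-table freeing. It is only a model under stated assumptions:
there is no real RCU here, grace_period_elapsed() is a hypothetical stand-in
for the callback that would run once all walkers are done, and the other
names (table_batch, deferred_free, table_free_later, table_batch_submit,
TABLE_BATCH_MAX) are made up as well. What it shows is that one batch
structure is allocated per group of tables rather than per table, and that
nothing is freed before the grace period has passed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_BATCH_MAX 4           /* hypothetical tables per batch */

struct table_batch {
	size_t nr;
	void  *tables[TABLE_BATCH_MAX];
	struct table_batch *next;   /* link in the pending list */
};

struct deferred_free {
	struct table_batch *cur;     /* batch currently being filled */
	struct table_batch *pending; /* batches waiting for the grace period */
	size_t freed;                /* illustration only: tables freed so far */
};

/* Hand the current batch over to the "wait for walkers" list. */
void table_batch_submit(struct deferred_free *d)
{
	if (!d->cur)
		return;
	d->cur->next = d->pending;
	d->pending = d->cur;
	d->cur = NULL;
}

/*
 * Queue one page-table page for later freeing. Returns 0 on success and
 * -1 if the batch structure cannot be allocated (the caller would then
 * need some fallback, which this sketch does not model).
 */
int table_free_later(struct deferred_free *d, void *table)
{
	if (!d->cur) {
		d->cur = calloc(1, sizeof(*d->cur));
		if (!d->cur)
			return -1;
	}
	assert(d->cur->nr < TABLE_BATCH_MAX);
	d->cur->tables[d->cur->nr++] = table;
	if (d->cur->nr == TABLE_BATCH_MAX)
		table_batch_submit(d);
	return 0;
}

/* Stand-in for the post-grace-period callback: now freeing is safe. */
void grace_period_elapsed(struct deferred_free *d)
{
	while (d->pending) {
		struct table_batch *b = d->pending;

		d->pending = b->next;
		for (size_t i = 0; i < b->nr; i++) {
			free(b->tables[i]);
			d->freed++;
		}
		free(b);
	}
}
```

The batching itself does nothing for correctness here; it only spreads the
cost of the one allocation over TABLE_BATCH_MAX tables.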
Cheers,
Ben.