From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Nick Piggin <npiggin@kernel.dk>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [RFC] page-table walkers vs memory order
Date: Sat, 4 Aug 2012 16:06:05 -0700
Message-ID: <20120804230605.GJ3307@linux.vnet.ibm.com>
In-Reply-To: <20120804224705.GD10459@redhat.com>

On Sun, Aug 05, 2012 at 12:47:05AM +0200, Andrea Arcangeli wrote:
> On Sat, Aug 04, 2012 at 03:02:45PM -0700, Paul E. McKenney wrote:
> > OK, I'll bite.  ;-)
> 
> :))
> 
> > The most sane way for this to happen is with feedback-driven techniques
> > involving profiling, similar to what is done for basic-block reordering
> > or branch prediction.  The idea is that you compile the kernel in an
> > as-yet (and thankfully) mythical pointer-profiling mode, which records
> > the values of pointer loads and also measures the pointer-load latency.
> > If a situation is found where a given pointer almost always has the
> > same value but has high load latency (for example, is almost always a
> > high-latency cache miss), this fact is recorded and fed back into a
> > subsequent kernel build.  This subsequent kernel build might choose to
> > speculate the value of the pointer concurrently with the pointer load.
> > 
> > And of course, when interpreting the phrase "most sane way" at the
> > beginning of the prior paragraph, it would probably be wise to keep
> > in mind who wrote it.  And that "most sane way" might have little or
> > no resemblance to anything that typical kernel hackers would consider
> > anywhere near sanity.  ;-)
> 
> I see. The above scenario is surely a fair enough assumption. We're
> clearly stretching the constraints to see what is theoretically
> possible and this is a very clear explanation of how gcc could have a
> hardcoded "guessed" address in the .text.
> 
> The next step to clarify now is how gcc can safely dereference such a
> "guessed" address without the kernel knowing about it.
> 
> If gcc really did dereference a guessed address coming from a
> profiling run without the kernel being aware of it, it would eventually
> crash the kernel with an oops. gcc cannot know what another CPU will
> do with the kernel pagetables. It'd be perfectly legitimate to
> temporarily move the data at the "guessed address" to another page and
> to update the pointer through stop_cpu during some weird "cpu
> offlining scenario" or anything you can imagine. I mean gcc must
> behave in all cases so it's not allowed to dereference the guessed
> address at any given time.
> 
> The only way gcc could do the alpha thing and dereference the guessed
> address before the real pointer is in cooperation with the kernel.
> The kernel should provide gcc "safe ranges" that won't crash the
> kernel, and/or gcc could provide a .fixup section similar to the
> current .fixup and the kernel should look it up during the page fault
> handler in case the kernel is ok with temporarily getting faults in
> that range. And in turn it can't happen unless we explicitly decide to
> allow gcc to do it.

And these are indeed some good reasons why I am not a fan of pointer-value
speculation.  ;-)
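
To make the concern concrete, here is a rough userspace sketch of what
a value-speculating compiler might emit for a simple pointer
dereference.  Everything in it is hypothetical: struct node,
guessed_node, read_plain() and read_speculated() are illustrative
names, not kernel code, and the profiling mode that would produce the
guess is, as noted above, mythical.  The guess here points at a static
object so the sketch is safe to run; the objection quoted above is
precisely that a real guessed address carries no such guarantee.

```c
#include <stddef.h>

struct node {
	int val;
};

/* Hypothetical: the object a profiling run says the pointer "almost
 * always" refers to.  In the scenario discussed, this would be a
 * constant address baked into .text. */
struct node guessed_node = { .val = 42 };

/* What the programmer wrote: load the pointer, then dereference it. */
int read_plain(struct node **pp)
{
	struct node *p = *pp;

	return p->val;
}

/* What a value-speculating compiler might emit instead (assumption). */
int read_speculated(struct node **pp)
{
	/* Speculative load through the guessed address.  It does not
	 * depend on *pp, so it can be issued before or concurrently
	 * with the pointer load. */
	int v = guessed_node.val;
	struct node *p = *pp;

	/* Check the guess; on a miss, redo the load the slow way. */
	if (p != &guessed_node)
		v = p->val;
	return v;
}
```

Both functions return the same value when nothing else is going on.
The difference is that read_speculated() touches the guessed location
even when *pp points elsewhere, and its first load is no longer ordered
after the pointer load by a data dependency.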

> > > Furthermore the ACCESS_ONCE that Peter's patch added to gup_fast
> > > pud/pgd can't prevent the compiler from reading a guessed pmdp
> > > address as a volatile variable, before reading the pmdp pointer and
> > > comparing it with the guessed address! So if it's 5 you worry about,
> > > then adding ACCESS_ONCE in pudp/pgdp/pmdp is useless and won't fix
> > > it. You should have added a barrier() instead.
> > 
> > Most compiler writers I have discussed this with agreed that a volatile
> > cast would suppress value speculation.  The "volatile" keyword is not
> > all that well specified in the C and C++ standards, but as "nix" said
> > at http://lwn.net/Articles/509731/:
> > 
> > 	volatile's meaning as 'minimize optimizations applied to things
> > 	manipulating anything of volatile type, do not duplicate, elide,
> > 	move, fold, spindle or mutilate' is of long standing.
> 
> Ok, so if the above optimization would be possible, volatile would
> stop it too, thanks for the quote and the explanation.
> 
> On a side note I believe there are a few barrier()s that may be worth
> converting to ACCESS_ONCE, which would take care of case 6) too in
> addition to avoiding clobbering more CPU registers than strictly
> necessary. Not very important but a possible microoptimization.

Agreed on both points.
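
For reference, a minimal userspace sketch of the two idioms being
compared.  The macro bodies are believed to mirror the kernel's gcc
definitions of this era and rely on the gcc/clang __typeof__ and
inline-asm extensions; load_with_access_once() and load_with_barrier()
are hypothetical helpers for illustration only, not code from any
patch in this thread.

```c
#include <stddef.h>

/* Volatile cast: constrains this one access only. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Compiler-only barrier: emits no instructions, but tells the
 * compiler that all of memory may have changed. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Fetch the pointer through a volatile access, then use the local
 * copy for both the NULL check and the dereference. */
int load_with_access_once(int **pp)
{
	int *p = ACCESS_ONCE(*pp);

	return p ? *p : -1;
}

/* Same logic using barrier().  The "memory" clobber forces the
 * compiler to discard every memory value it had cached in
 * registers, not just *pp. */
int load_with_barrier(int **pp)
{
	int *p = *pp;

	barrier();
	return p ? *p : -1;
}
```

The register-pressure point above is visible in the shape of the two
helpers: ACCESS_ONCE() names the single location that matters, while
barrier() invalidates whatever else the compiler happened to be
holding at that point.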

> > That said, value speculation as a compiler optimization makes me a bit
> > nervous, so my current feeling is that it should be suppressed entirely.
> > 
> > Hey, you asked, even if only implicitly!  ;-)
> 
> You're reading my mind! :)

Or successfully carrying out value speculation on it.  ;-)

							Thanx, Paul


