From: Jeremy Fitzhardinge <jeremy-TSDbQ3PG+2Y@public.gmane.org>
To: Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
Cc: Linus Torvalds
<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Nick Piggin <npiggin-l3A5Bk7waGM@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
shaggy-V7BBcbaFuwjMbYB6QlFGEg@public.gmane.org,
axboe-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
linux-arch-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Clark Williams <williams-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>
Subject: Re: [patch 2/2]: introduce fast_gup
Date: Fri, 18 Apr 2008 19:58:58 +1000
Message-ID: <480870E2.20507@goop.org>
In-Reply-To: <1208448768.7115.30.camel@twins>
Peter Zijlstra wrote:
> On Thu, 2008-04-17 at 08:25 -0700, Linus Torvalds wrote:
>
>> On Thu, 17 Apr 2008, Peter Zijlstra wrote:
>>
>>> Would this be sufficient to address that comment's concern?
>>>
>> It would be nicer to just add a "native_get_pte()" to x86, to match the
>> already-existing "native_set_pte()".
>>
>
> See, I _knew_ I was missing something obvious :-/
>
>
>> And that "barrier()" should be "smp_rmb()". They may be the same code
>> sequence, but from a conceptual angle, "smp_rmb()" makes a whole lot more
>> sense.
>>
>> Finally, I don't think that comment is correct in the first place. It's
>> not that simple. The thing is, even *with* the memory barrier in place, we
>> may have:
>>
>>     CPU#1                   CPU#2
>>     =====                   =====
>>
>>     fast_gup:
>>      - read low word
>>
>>                             native_set_pte_present:
>>                              - set low word to 0
>>                              - set high word to new value
>>
>>      - read high word
>>
>>                              - set low word to new value
>>
>> and so you read a low word that is associated with a *different* high
>> word! Notice?
>>
>> So trivial memory ordering is _not_ enough.
>>
>> So I think the code literally needs to be something like this
>>
>> #ifdef CONFIG_X86_PAE
>>
>> static inline pte_t native_get_pte(pte_t *ptep)
>> {
>>         pte_t pte;
>>
>> retry:
>>         pte.pte_low = ptep->pte_low;
>>         smp_rmb();
>>         pte.pte_high = ptep->pte_high;
>>         smp_rmb();
>>         if (unlikely(pte.pte_low != ptep->pte_low))
>>                 goto retry;
>>         return pte;
>> }
>>
>> #else
>>
>> #define native_get_pte(ptep) (*(ptep))
>>
>> #endif
>>
>> but I have admittedly not really thought it fully through.
>>
>
> Looks sane here; Clark can you give this a spin?
>
> Jeremy, did I get the paravirt stuff right?
>
You shouldn't need to do anything special for paravirt. set_pte is
necessary because it may have side-effects (like a hypervisor call), but
get_pte should be side-effect free. There's no other need for it; any
special processing on the pte value itself is done in pte_val().
J
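
For readers who want to experiment with the interleaving quoted above outside the kernel, here is a minimal userspace sketch in C11. It is not the kernel code: `struct split_pte`, `split_pte_set()`, `split_pte_get()` and `replay_torn_read()` are hypothetical names, the C11 fences only stand in for `smp_wmb()`/`smp_rmb()`, and the writer's barriers are an assumption (the thread only gives the order of its three stores). `replay_torn_read()` replays the CPU#1/CPU#2 steps on a single thread, so it shows the mixed-up value and the re-check deterministically. It is an illustration, not a proof that the retry loop is sufficient.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PAE pte: two separately accessed 32-bit halves. */
struct split_pte {
    _Atomic uint32_t low;
    _Atomic uint32_t high;
};

/* Writer in the order described in the thread: clear low, set high, set low.
 * The release fences are an assumed stand-in for smp_wmb(). */
static void split_pte_set(struct split_pte *p, uint64_t val)
{
    atomic_store_explicit(&p->low, 0, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->high, (uint32_t)(val >> 32), memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->low, (uint32_t)val, memory_order_relaxed);
}

/* Reader modelled on the quoted native_get_pte(): read low, read high,
 * then re-read low and retry if it changed in between. */
static uint64_t split_pte_get(struct split_pte *p)
{
    uint32_t low, high;

    do {
        low = atomic_load_explicit(&p->low, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);   /* stands in for smp_rmb() */
        high = atomic_load_explicit(&p->high, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
    } while (low != atomic_load_explicit(&p->low, memory_order_relaxed));

    return ((uint64_t)high << 32) | low;
}

/* Single-threaded replay of the quoted interleaving. Returns the value a
 * reader WITHOUT the re-check would have assembled; *caught reports whether
 * re-reading the low word would have noticed the change. */
static uint64_t replay_torn_read(uint64_t old_val, uint64_t new_val, bool *caught)
{
    struct split_pte p;

    atomic_init(&p.low, (uint32_t)old_val);
    atomic_init(&p.high, (uint32_t)(old_val >> 32));

    uint32_t low = atomic_load(&p.low);                /* CPU#1: read low word  */
    atomic_store(&p.low, 0);                           /* CPU#2: low = 0        */
    atomic_store(&p.high, (uint32_t)(new_val >> 32));  /* CPU#2: high = new     */
    uint32_t high = atomic_load(&p.high);              /* CPU#1: read high word */
    atomic_store(&p.low, (uint32_t)new_val);           /* CPU#2: low = new      */

    *caught = (low != atomic_load(&p.low));
    return ((uint64_t)high << 32) | low;
}
```

Calling `replay_torn_read()` with an old and a new value whose low words differ yields the old low word paired with the new high word, which is exactly the mismatch described in the quoted diagram, and sets `*caught` to true because the final low word no longer matches the one read first.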