All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew MacLeod <amacleod@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: paulmck@linux.vnet.ibm.com, Torvald Riegel <triegel@redhat.com>,
	Jan Kara <jack@suse.cz>, LKML <linux-kernel@vger.kernel.org>,
	linux-ia64@vger.kernel.org, dsterba@suse.cz, ptesarik@suse.cz,
	rguenther@suse.de, gcc@gcc.gnu.org
Subject: Re: Memory corruption due to word sharing
Date: Fri, 03 Feb 2012 19:16:43 +0000	[thread overview]
Message-ID: <4F2C329B.2080107@redhat.com> (raw)
In-Reply-To: <CA+55aFx1NjR9XMQ9iLyVk05Erwtqgf7hfZkBCMJzkCMFJS8CHw@mail.gmail.com>

On 02/03/2012 12:16 PM, Linus Torvalds wrote:
>
> So we have several atomics we use in the kernel, with the more common being
>
>   - add (and subtract) and cmpchg of both 'int' and 'long'
This would be __atomic_fetch_add, __atomic_fetch_sub, and 
__atomic_compare_exchange.

For 4.8 __atomic_compare_exchange is planned to be better optimized then 
it is now...  ie, it currently uses the same form as c++ requires:

  atomic_compare_exchange (&var, &expected, value, weak/strong, 
memorymodel).

'expected' is updated in place with the current value if it doesn't 
match.  With the address of expected taken, we dont always do a good job 
generating code for it... I plan to remedy that in 4.8 so that it is 
efficient and doesn't impact optimization of 'expected' elsewhere.
>   - add_return (add and return new value)
__atomic_add_fetch returns the new value. (__atomic_fetch_add returns 
the old value).   If it isn't as efficient as it needs to be, the RTL 
pattern can be fixed.    what sequence do you currently use for this?  
The compiler currently generates the equivilent of

lock; xadd
add

ie, it performs the atomic add then re-adds the same value to the 
previous value to get the atomic post-add value.  If there is something 
more efficient, we ought to be able to do the same.
>   - special cases of the above:
>        dec_and_test (decrement and test result for zero)
>        inc_and_test (decrement and test result for zero)
>        add_negative (add and check if result is negative)
>
>     The special cases are because older x86 cannot do the generic
> "add_return" efficiently - it needs xadd - but can do atomic versions
> that test the end result and give zero or sign information.
Since these are older x86 only, could you use add_return() always and 
then have the compiler use new peephole optimizations to detect those 
usage patterns and change the instruction sequence for x86 when 
required?  would that be acceptable?    Or maybe you don't trust the 
compiler :-)   Or maybe I can innocently ask if the performance impact 
on older x86 matters enough any more? :-)
>   - atomic_add_unless() - basically an optimized cmpxchg.

is this the reverse of a compare_exchange and add?  Ie, add if the value 
ISN'T expected?   or some form of compare_exchange_and_add?    This 
might require a new atomic builltin.

What exactly does it do?

>   - atomic bit array operations (bit set, clear, set-and-test,
> clear-and-test). We do them on "unsigned long" exclusively, and in
> fact we do them on arrays of unsigned long, ie we have the whole "bts
> reg,mem" semantics. I'm not sure we really care about the atomic
> versions for the arrays, so it's possible we only really care about a
> single long.
>
>     The only complication with the bit setting is that we have a
> concept of "set/clear bit with memory barrier before or after the bit"
> (for locking). We don't do the whole release/acquire thing, though.
Are these functions wrappers to a  tight load, mask, cmpxchg loop? or 
something else?  These could also require new built-ins if they can't be 
constructed from the existing operations...
>
>   - compare_xchg_double
>
> We also do byte/word atomic increments and decrements, but that' sin
> the x86 spinlock implementation, so it's not a generic need.
The existing __atomic builtins will work on 1,2,4,8 or 16 byte values 
regardless of type, as long as the hardware supports those sizes.  so 
x86-64 can do a 16 byte cmpxchg.

In theory, the add_fetch and sub_fetch are suppose to use INC/DEC if the 
operand is 1/-1 and the result isn't used.  If it isnt doing this right 
now, I will fix it.
> We also do the add version in particular as CPU-local optimizations
> that do not need to be SMP-safe, but do need to be interrupt-safe. On
> x86, this is just an r-m-w op, on most other architectures it ends up
> being the usual load-locked/store-conditional.
>
It may be possible to add modifier extensions to the memory model 
component for such a thing. ie

v = __atomic_add_fetch (&v, __ATOMIC_RELAXED | __ATOMIC_CPU_LOCAL)

which will allow fine tuning for something more specific like this. 
Targets which dont care can ignore it, but x86 could have atomic_add  
avoid the lock when the CPU_LOCAL modifier flag is present.

> I think that's pretty much it, but maybe I'm missing something.
>
> Of course, locking itself tends to be special cases of the above with
> extra memory barriers, but it's usually hidden in asm for other
> reasons (the bit-op + barrier being a special case).

All of the __atomic operations are currently optimization barriers in 
both directions, the optimizers tend to treat them like function calls.  
I hope to enable some sorts of optimizations eventually, especially 
based on memory model... but for now we play it safe.

Synchronization barriers are inserted based on the memory model used.  
If it can be determined that something additional is required that the 
existing memory model doesn't cover, it could be possible to add 
extensions beyond the c++11 memory model.. ( ie add new 
__ATOMIC_OTHER_BARRIER_KIND models)

Andrew


WARNING: multiple messages have this Message-ID (diff)
From: Andrew MacLeod <amacleod@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: paulmck@linux.vnet.ibm.com, Torvald Riegel <triegel@redhat.com>,
	Jan Kara <jack@suse.cz>, LKML <linux-kernel@vger.kernel.org>,
	linux-ia64@vger.kernel.org, dsterba@suse.cz, ptesarik@suse.cz,
	rguenther@suse.de, gcc@gcc.gnu.org
Subject: Re: Memory corruption due to word sharing
Date: Fri, 03 Feb 2012 14:16:43 -0500	[thread overview]
Message-ID: <4F2C329B.2080107@redhat.com> (raw)
In-Reply-To: <CA+55aFx1NjR9XMQ9iLyVk05Erwtqgf7hfZkBCMJzkCMFJS8CHw@mail.gmail.com>

On 02/03/2012 12:16 PM, Linus Torvalds wrote:
>
> So we have several atomics we use in the kernel, with the more common being
>
>   - add (and subtract) and cmpchg of both 'int' and 'long'
This would be __atomic_fetch_add, __atomic_fetch_sub, and 
__atomic_compare_exchange.

For 4.8 __atomic_compare_exchange is planned to be better optimized then 
it is now...  ie, it currently uses the same form as c++ requires:

  atomic_compare_exchange (&var, &expected, value, weak/strong, 
memorymodel).

'expected' is updated in place with the current value if it doesn't 
match.  With the address of expected taken, we dont always do a good job 
generating code for it... I plan to remedy that in 4.8 so that it is 
efficient and doesn't impact optimization of 'expected' elsewhere.
>   - add_return (add and return new value)
__atomic_add_fetch returns the new value. (__atomic_fetch_add returns 
the old value).   If it isn't as efficient as it needs to be, the RTL 
pattern can be fixed.    what sequence do you currently use for this?  
The compiler currently generates the equivilent of

lock; xadd
add

ie, it performs the atomic add then re-adds the same value to the 
previous value to get the atomic post-add value.  If there is something 
more efficient, we ought to be able to do the same.
>   - special cases of the above:
>        dec_and_test (decrement and test result for zero)
>        inc_and_test (decrement and test result for zero)
>        add_negative (add and check if result is negative)
>
>     The special cases are because older x86 cannot do the generic
> "add_return" efficiently - it needs xadd - but can do atomic versions
> that test the end result and give zero or sign information.
Since these are older x86 only, could you use add_return() always and 
then have the compiler use new peephole optimizations to detect those 
usage patterns and change the instruction sequence for x86 when 
required?  would that be acceptable?    Or maybe you don't trust the 
compiler :-)   Or maybe I can innocently ask if the performance impact 
on older x86 matters enough any more? :-)
>   - atomic_add_unless() - basically an optimized cmpxchg.

is this the reverse of a compare_exchange and add?  Ie, add if the value 
ISN'T expected?   or some form of compare_exchange_and_add?    This 
might require a new atomic builltin.

What exactly does it do?

>   - atomic bit array operations (bit set, clear, set-and-test,
> clear-and-test). We do them on "unsigned long" exclusively, and in
> fact we do them on arrays of unsigned long, ie we have the whole "bts
> reg,mem" semantics. I'm not sure we really care about the atomic
> versions for the arrays, so it's possible we only really care about a
> single long.
>
>     The only complication with the bit setting is that we have a
> concept of "set/clear bit with memory barrier before or after the bit"
> (for locking). We don't do the whole release/acquire thing, though.
Are these functions wrappers to a  tight load, mask, cmpxchg loop? or 
something else?  These could also require new built-ins if they can't be 
constructed from the existing operations...
>
>   - compare_xchg_double
>
> We also do byte/word atomic increments and decrements, but that' sin
> the x86 spinlock implementation, so it's not a generic need.
The existing __atomic builtins will work on 1,2,4,8 or 16 byte values 
regardless of type, as long as the hardware supports those sizes.  so 
x86-64 can do a 16 byte cmpxchg.

In theory, the add_fetch and sub_fetch are suppose to use INC/DEC if the 
operand is 1/-1 and the result isn't used.  If it isnt doing this right 
now, I will fix it.
> We also do the add version in particular as CPU-local optimizations
> that do not need to be SMP-safe, but do need to be interrupt-safe. On
> x86, this is just an r-m-w op, on most other architectures it ends up
> being the usual load-locked/store-conditional.
>
It may be possible to add modifier extensions to the memory model 
component for such a thing. ie

v = __atomic_add_fetch (&v, __ATOMIC_RELAXED | __ATOMIC_CPU_LOCAL)

which will allow fine tuning for something more specific like this. 
Targets which dont care can ignore it, but x86 could have atomic_add  
avoid the lock when the CPU_LOCAL modifier flag is present.

> I think that's pretty much it, but maybe I'm missing something.
>
> Of course, locking itself tends to be special cases of the above with
> extra memory barriers, but it's usually hidden in asm for other
> reasons (the bit-op + barrier being a special case).

All of the __atomic operations are currently optimization barriers in 
both directions, the optimizers tend to treat them like function calls.  
I hope to enable some sorts of optimizations eventually, especially 
based on memory model... but for now we play it safe.

Synchronization barriers are inserted based on the memory model used.  
If it can be determined that something additional is required that the 
existing memory model doesn't cover, it could be possible to add 
extensions beyond the c++11 memory model.. ( ie add new 
__ATOMIC_OTHER_BARRIER_KIND models)

Andrew


  reply	other threads:[~2012-02-03 19:16 UTC|newest]

Thread overview: 134+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-01 15:19 Memory corruption due to word sharing Jan Kara
2012-02-01 15:19 ` Jan Kara
2012-02-01 15:34 ` Markus Trippelsdorf
2012-02-01 15:34   ` Markus Trippelsdorf
2012-02-01 16:37 ` Colin Walters
2012-02-01 16:37   ` Colin Walters
2012-02-01 16:56   ` Linus Torvalds
2012-02-01 16:56     ` Linus Torvalds
2012-02-01 17:11     ` Jiri Kosina
2012-02-01 17:11       ` Jiri Kosina
2012-02-01 17:37       ` Linus Torvalds
2012-02-01 17:37         ` Linus Torvalds
2012-02-01 17:41       ` Michael Matz
2012-02-01 17:41         ` Michael Matz
2012-02-01 18:09         ` David Miller
2012-02-01 18:09           ` David Miller
2012-02-01 18:45           ` Jeff Law
2012-02-01 18:45             ` Jeff Law
2012-02-01 19:09             ` Linus Torvalds
2012-02-01 19:09               ` Linus Torvalds
2012-02-02 15:51               ` Jeff Garzik
2012-02-02 15:51                 ` Jeff Garzik
2012-02-01 18:57           ` Linus Torvalds
2012-02-01 18:57             ` Linus Torvalds
2012-02-01 19:04           ` Peter Bergner
2012-02-01 19:04             ` Peter Bergner
2012-02-01 18:52         ` Linus Torvalds
2012-02-01 18:52           ` Linus Torvalds
2012-02-02  9:35           ` Richard Guenther
2012-02-02  9:35             ` Richard Guenther
2012-02-02  9:37           ` Richard Guenther
2012-02-02  9:37             ` Richard Guenther
2012-02-02 13:43           ` Michael Matz
2012-02-02 13:43             ` Michael Matz
2012-02-01 16:41 ` Linus Torvalds
2012-02-01 16:41   ` Linus Torvalds
2012-02-01 17:42   ` Torvald Riegel
2012-02-01 17:42     ` Torvald Riegel
2012-02-01 19:40     ` Jakub Jelinek
2012-02-01 19:40       ` Jakub Jelinek
2012-02-01 20:01       ` Linus Torvalds
2012-02-01 20:01         ` Linus Torvalds
2012-02-01 20:16         ` Jakub Jelinek
2012-02-01 20:16           ` Jakub Jelinek
2012-02-01 20:44           ` Linus Torvalds
2012-02-01 20:44             ` Linus Torvalds
2012-02-02 15:58             ` Aldy Hernandez
2012-02-02 15:58               ` Aldy Hernandez
2012-02-02 16:28               ` Michael Matz
2012-02-02 16:28                 ` Michael Matz
2012-02-02 17:51                 ` Linus Torvalds
2012-02-02 17:51                   ` Linus Torvalds
2012-02-01 20:19         ` Linus Torvalds
2012-02-01 20:19           ` Linus Torvalds
2012-02-02  9:46           ` Richard Guenther
2012-02-02  9:46             ` Richard Guenther
2012-02-01 19:44     ` Boehm, Hans
2012-02-01 19:44       ` Boehm, Hans
2012-02-01 19:54       ` Jeff Law
2012-02-01 19:54         ` Jeff Law
2012-02-01 19:47     ` Linus Torvalds
2012-02-01 19:47       ` Linus Torvalds
2012-02-01 19:58       ` Alan Cox
2012-02-01 19:58         ` Alan Cox
2012-02-01 20:41       ` Torvald Riegel
2012-02-01 20:41         ` Torvald Riegel
2012-02-01 20:59         ` Linus Torvalds
2012-02-01 20:59           ` Linus Torvalds
2012-02-01 21:24           ` Torvald Riegel
2012-02-01 21:24             ` Torvald Riegel
2012-02-01 21:55             ` Linus Torvalds
2012-02-01 21:55               ` Linus Torvalds
2012-02-01 21:25           ` Boehm, Hans
2012-02-01 21:25             ` Boehm, Hans
2012-02-01 22:27             ` Linus Torvalds
2012-02-01 22:27               ` Linus Torvalds
2012-02-01 22:45           ` Paul E. McKenney
2012-02-01 22:45             ` Paul E. McKenney
2012-02-01 23:11             ` Linus Torvalds
2012-02-01 23:11               ` Linus Torvalds
2012-02-02 18:42               ` Paul E. McKenney
2012-02-02 18:42                 ` Paul E. McKenney
2012-02-02 19:08                 ` Linus Torvalds
2012-02-02 19:08                   ` Linus Torvalds
2012-02-02 19:37                   ` Paul E. McKenney
2012-02-02 19:37                     ` Paul E. McKenney
2012-02-03 16:38                     ` Andrew MacLeod
2012-02-03 16:38                       ` Andrew MacLeod
2012-02-03 17:16                       ` Linus Torvalds
2012-02-03 17:16                         ` Linus Torvalds
2012-02-03 19:16                         ` Andrew MacLeod [this message]
2012-02-03 19:16                           ` Andrew MacLeod
2012-02-03 20:00                           ` Linus Torvalds
2012-02-03 20:00                             ` Linus Torvalds
2012-02-03 20:19                             ` Paul E. McKenney
2012-02-03 20:19                               ` Paul E. McKenney
2012-02-06 15:38                             ` Torvald Riegel
2012-02-06 15:38                               ` Torvald Riegel
2012-02-10 19:27                             ` Richard Henderson
2012-02-10 19:27                               ` Richard Henderson
2012-02-02 11:19           ` Ingo Molnar
2012-02-02 11:19             ` Ingo Molnar
2012-02-01 21:04       ` Boehm, Hans
2012-02-01 21:04         ` Boehm, Hans
2012-02-02  9:28         ` Bernd Petrovitsch
2012-02-02  9:28           ` Bernd Petrovitsch
2012-02-01 17:08 ` Torvald Riegel
2012-02-01 17:08   ` Torvald Riegel
2012-02-01 17:29   ` Linus Torvalds
2012-02-01 17:29     ` Linus Torvalds
2012-02-01 20:53     ` Torvald Riegel
2012-02-01 20:53       ` Torvald Riegel
2012-02-01 21:20       ` Linus Torvalds
2012-02-01 21:20         ` Linus Torvalds
2012-02-01 21:37         ` Torvald Riegel
2012-02-01 21:37           ` Torvald Riegel
2012-02-01 22:18           ` Boehm, Hans
2012-02-01 22:18             ` Boehm, Hans
2012-02-01 17:52 ` Dennis Clarke
2012-02-01 17:52   ` Dennis Clarke
2012-02-02 11:11 ` James Courtier-Dutton
2012-02-02 11:11   ` James Courtier-Dutton
2012-02-02 11:24   ` Richard Guenther
2012-02-02 11:24     ` Richard Guenther
2012-02-02 11:13 ` David Sterba
2012-02-02 11:13   ` David Sterba
2012-02-02 11:23   ` Richard Guenther
2012-02-02 11:23     ` Richard Guenther
2012-02-03  6:45 ` DJ Delorie
2012-02-03  6:45   ` DJ Delorie
2012-02-03  9:37   ` Richard Guenther
2012-02-03  9:37     ` Richard Guenther
2012-02-03 10:03     ` Matthew Gretton-Dann
2012-02-03 10:03       ` Matthew Gretton-Dann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F2C329B.2080107@redhat.com \
    --to=amacleod@redhat.com \
    --cc=dsterba@suse.cz \
    --cc=gcc@gcc.gnu.org \
    --cc=jack@suse.cz \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=ptesarik@suse.cz \
    --cc=rguenther@suse.de \
    --cc=torvalds@linux-foundation.org \
    --cc=triegel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.