Re: [Qemu-devel] [PATCH v2 1/2] add a header file for atomic operations

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Paolo Bonzini <pbonzini@redhat.com>
To: Richard Henderson <rth@twiddle.net>
Cc: Liu Ping Fan <qemulist@gmail.com>,
	Anthony Liguori <anthony@codemonkey.ws>,
	qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v2 1/2] add a header file for atomic operations
Date: Tue, 18 Jun 2013 10:36:30 +0200	[thread overview]
Message-ID: <51C01C0E.5030501@redhat.com> (raw)
In-Reply-To: <51BF5C0F.6020209@twiddle.net>

Il 17/06/2013 20:57, Richard Henderson ha scritto:
> On 06/16/2013 04:21 AM, Liu Ping Fan wrote:
>> +#if QEMU_GNUC_PREREQ(4, 8)
>> +#ifndef __ATOMIC_RELAXED
>> +#define __ATOMIC_RELAXED 0
>> +#endif
> 
> Why all the ifdefs?  If __atomic support is present, then __ATOMIC defines will
> exist.

Then I can just use "#ifdef __ATOMIC_RELAXED" instead of
"#if QEMU_GNUC_PREREQ(4, 8)"?

>> +/* Compiler barrier */
>> +#define barrier()   ({ asm volatile("" ::: "memory"); (void)0; })
>> +
>> +#if !QEMU_GNUC_PREREQ(4, 8)
> 
> With C++11 atomic support we should define barrier as
> 
>   __atomic_signal_fence(__ATOMIC_ACQ_REL)

I don't really understand fences other than SEQ_CST, so I was hesitant to use
them.  The asms should work anyway.

>>  #if defined(__powerpc64__)
>> -#define smp_rmb()   asm volatile("lwsync" ::: "memory")
>> +#define smp_rmb()   ({ asm volatile("lwsync" ::: "memory"); (void)0; })
>>  #else
>> -#define smp_rmb()   asm volatile("sync" ::: "memory")
>> +#define smp_rmb()   ({ asm volatile("sync" ::: "memory"); (void)0; })
>>  #endif
> 
> Err... lwsync corresponds to ACQ_REL consistency, while sync corresponds to
> SEQ_CST consistency.  The newly added document says you want SEQ_CST while the
> old code implies ACQ_REL.
> 
> Which is true?

I have no idea, but I can say which semantics I want:

1) Linux kernel memory barrier semantics for smp_*mb (i.e. express the barriers
in terms of read/write/full, not in terms of acq/rel/seqcst);

2) Java volatile semantics for atomic_mb_*.

Basically, I cannot claim I understand this stuff 100%, but at least I could
use sources I trust to implement it.

My rationale was that both have been in use for a while and have proven to be
both efficient and useful.  They are more easily understood than the C11 memory
model, though not as complete.  I wanted something that people could figure out
relatively easily (tested on myself while working on the RCU stuff).

Most important, Java volatile can be described in terms of barriers as in the
JSR-133 cookbook, making it easy to convert one to the other.  This also reflected
in the names.

Finally, Java volatile semantics are reasonably fast for all architectures,
including ARM and PPC.

I think my understanding that Java volatile == SEQ_CST was wrong, though,
and I should rewrite the documentation.  More on this below.

>> +#ifndef smp_mb
>> +#define smp_mb()    __sync_synchronize()
>> +#endif
> 
> Use __atomic_thread_fence here for completeness?
> 
>> +#ifndef atomic_read
>> +#define atomic_read(ptr)       (*(__typeof__(*ptr) *volatile) (ptr))
>>  #endif
>>  
>> +#ifndef atomic_set
>> +#define atomic_set(ptr, i)     ((*(__typeof__(*ptr) *volatile) (ptr)) = (i))
>> +#endif
> 
> Use
> 
> __atomic_load(..., __ATOMIC_RELAXED)
> __atomic_store(..., __ATOMIC_RELAXED)
> 
> ?

Same here, I didn't want proliferation of #ifdefs beyond what is actually required.

>> + * And for the few ia64 lovers that exist, an atomic_mb_read is a ld.acq,
>> + * while an atomic_mb_set is a st.rel followed by a memory barrier.
> ...
>> + */
>> +#ifndef atomic_mb_read
>> +#if QEMU_GNUC_PREREQ(4, 8)
>> +#define atomic_mb_read(ptr)       ({             \
>> +    typeof(*ptr) _val;                           \
>> +    __atomic_load(ptr, &_val, __ATOMIC_SEQ_CST); \
>> +    _val;                                        \
>> +})
>> +#else
>> +#define atomic_mb_read(ptr)    ({           \
>> +    typeof(*ptr) _val = atomic_read(ptr);   \
>> +    smp_rmb();                              \
>> +    _val;                                   \
> 
> This latter definition is ACQUIRE not SEQ_CST (except for ia64).  Without
> load_acquire, one needs barriers before and after the atomic_read in order to
> implement SEQ_CST.
> 
> So again I have to ask, what semantics are you actually looking for here?

As mentioned above, I wanted Java volatile semantics.  I misunderstood
that Java volatile == SEQ_CST.  The Java memory model FAQ says:
http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile

     Each read of a volatile will see the last write to that volatile
     by any thread; in effect, they are designated by the programmer as
     fields for which it is never acceptable to see a "stale" value as
     a result of caching or reordering. The compiler and runtime are
     prohibited from allocating them in registers. They must also ensure
     that after they are written, they are flushed out of the cache to
     main memory, so they can immediately become visible to other threads.
     Similarly, before a volatile field is read, the cache must be
     invalidated so that the value in main memory, not the local processor
     cache, is the one seen. There are also additional restrictions on
     reordering accesses to volatile variables. 

(Since almost all architectures are cache coherent, I take that "cache"
really means the store buffers.  The JSR-133 cookbook basically says the
same).  In addition, mentioned by the cookbook but not the FAQ, normal
stores cannot be moved after a volatile stores, and volatile loads
cannot be moved after a normal load.

The FAQ also has an "important note", however:

    Important Note: Note that it is important for both threads to access
    the same volatile variable in order to properly set up the happens-before
    relationship. It is not the case that everything visible to thread A
    when it writes volatile field f becomes visible to thread B after it
    reads volatile field g. The release and acquire have to "match" (i.e.,
    be performed on the same volatile field) to have the right semantics. 

Is this final "important note" the difference between ACQ_REL and SEQ_CST?

>> +#define atomic_mb_set(ptr, i)  do {         \
>> +    smp_wmb();                              \
>> +    atomic_set(ptr, i);                     \
>> +    smp_mb();                               \
>> +} while (0)
> 
> Really only a smp_wmb?  What about outstanding loads?

Probably the same as before.  Also, as mentioned in the documentation
smp_rmb() + smp_wmb() imply a load-store barrier on all machines.

Thanks for reviewing this, you're probably the only person in the community
that understands this stuff.

Paolo

next prev parent reply	other threads:[~2013-06-18  8:36 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-16 11:21 [Qemu-devel] [PATCH v2 0/2] make AioContext's bh re-entrant Liu Ping Fan
2013-06-16 11:21 ` [Qemu-devel] [PATCH v2 1/2] add a header file for atomic operations Liu Ping Fan
2013-06-17 18:57   ` Richard Henderson
2013-06-18  8:36     ` Paolo Bonzini [this message]
2013-06-18 11:03       ` Paolo Bonzini
2013-06-18 14:38       ` Richard Henderson
2013-06-18 15:04         ` Paolo Bonzini
2013-06-18 13:24     ` [Qemu-devel] Java volatile vs. C11 seq_cst (was Re: [PATCH v2 1/2] add a header file for atomic operations) Paolo Bonzini
2013-06-18 14:50       ` Paul E. McKenney
2013-06-18 15:29         ` Peter Sewell
2013-06-18 15:37         ` Torvald Riegel
2013-06-19  1:53           ` Paul E. McKenney
2013-06-19  7:11             ` Torvald Riegel
2013-06-20 15:18               ` Paul E. McKenney
2013-06-18 16:08         ` Paolo Bonzini
2013-06-18 16:38           ` Torvald Riegel
2013-06-19  1:57             ` Paul E. McKenney
2013-06-19  9:31             ` Paolo Bonzini
2013-06-19 13:15               ` Torvald Riegel
2013-06-19 15:14                 ` Paolo Bonzini
2013-06-19 20:25                   ` Torvald Riegel
2013-06-20  7:53                     ` Paolo Bonzini
2013-06-22 10:55                       ` Torvald Riegel
2013-06-18 15:26       ` Torvald Riegel
2013-06-18 17:38       ` Andrew Haley
2013-06-19  9:30         ` Paolo Bonzini
2013-06-19 15:36           ` Andrew Haley
2013-06-16 11:21 ` [Qemu-devel] [PATCH v2 2/2] QEMUBH: make AioContext's bh re-entrant Liu Ping Fan
2013-06-17 15:28   ` Stefan Hajnoczi
2013-06-17 16:41     ` Paolo Bonzini
2013-06-18  2:19     ` liu ping fan
2013-06-18  9:31       ` Stefan Hajnoczi
2013-06-18 15:14   ` mdroth
2013-06-18 16:19     ` mdroth
2013-06-18 19:20     ` Paolo Bonzini
2013-06-18 22:26       ` mdroth
2013-06-19  9:27         ` Paolo Bonzini
2013-06-20  9:11           ` Stefan Hajnoczi
2013-06-17  7:11 ` [Qemu-devel] [PATCH v2 0/2] " Paolo Bonzini
2013-06-18  2:40   ` liu ping fan
2013-06-18  8:36     ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51C01C0E.5030501@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=anthony@codemonkey.ws \
    --cc=qemu-devel@nongnu.org \
    --cc=qemulist@gmail.com \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).