From: Eric Dumazet <eric.dumazet@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>,
	mingo@elte.hu, akpm@linux-foundation.org, paulus@samba.org,
	arnd@arndb.de, linux-kernel@vger.kernel.org
Subject: [PATCH] x86: atomic64_t should be 8 bytes aligned
Date: Fri, 03 Jul 2009 00:08:26 +0200
Message-ID: <4A4D2FDA.3000509@gmail.com>
In-Reply-To: <alpine.LFD.2.01.0907021419560.3210@localhost.localdomain>
Linus Torvalds wrote:
>
> On Thu, 2 Jul 2009, Eric Dumazet wrote:
>> Using a fixed initial value (instead of __atomic64_read()) is even faster,
>> it apparently permits cpu to use an appropriate bus transaction.
>
> Yeah, I guess it does a "read-for-write-ownership" and allows the thing to
> be done as a single cache transaction.
>
> If we read it first, it will first get the cacheline for shared-read, and
> then the cmpxchg8b will need to turn it from shared to exclusive.
>
> Of course, the _optimal_ situation would be if the cmpxchg8b didn't
> actually do the write at all when the value matches (and all cores could
> just keep it shared), but I guess that's not going to happen.
>
> Too bad there is no pure 8-byte read op. Using MMX has too many downsides.
>
> Btw, your numbers imply that for the atomic64_add_return(), we really
> would be much better off not reading the original value at all. Again, in
> that case, we really do want the "read-for-write-ownership" cache
> transaction, not a read.
>
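The "fixed initial value" read discussed above can be sketched in user space. This is only an illustration under stated assumptions: sketch_atomic64_read() is a hypothetical name, not the kernel's atomic64_read(), and GCC's __atomic_compare_exchange_n builtin stands in for the hand-written cmpxchg8b assembly. On 32-bit x86 the compiler is expected to emit a locked cmpxchg8b for it, but that depends on the target and flags.

```c
#include <stdint.h>

/* Hypothetical helper for illustration; not the kernel implementation. */
static inline uint64_t sketch_atomic64_read(uint64_t *p)
{
	uint64_t expected = 0;	/* fixed guess: no plain load of *p beforehand */

	/*
	 * If *p == 0 the compare-exchange stores 0 back, which changes nothing.
	 * Otherwise it fails and copies the current value of *p into 'expected'.
	 * Either way 'expected' ends up holding the value that was in *p.
	 */
	__atomic_compare_exchange_n(p, &expected, 0, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return expected;
}
```

The point of the design is that the only access to the cache line is the compare-exchange itself, so the line can be requested for ownership right away instead of being fetched shared and then upgraded.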
I forgot to mention that if an atomic64_t spans two cache lines, my test
program is 10x slower.

So we probably need an __attribute__((aligned(8))) on the atomic64_t definition
as well, since alignof(atomic64_t) is 4, not 8 (!!!)

#include <stdio.h>

typedef struct {
	unsigned long long counter;
} atomic64_t;

typedef struct {
	unsigned long long __attribute__((aligned(8))) counter;
} atomic64_ta;

struct {
	int a;
	atomic64_t counter;
} s __attribute__((aligned(64)));

struct {
	int a;
	atomic64_ta counter;
} sa __attribute__((aligned(64)));

int main()
{
	printf("alignof(atomic64_t)=%d\n", __alignof__(atomic64_t));
	/* Note: this passes atomic64_t, not atomic64_ta, so it reports the unaligned type again. */
	printf("alignof(atomic64_ta)=%d\n", __alignof__(atomic64_t));
	printf("alignof(unsigned long long)=%d\n", __alignof__(unsigned long long));
	printf("&s.counter=%p\n", &s.counter);
	printf("&sa.counter=%p\n", &sa.counter);
	return 0;
}

$ gcc -O2 -o test test.c ; ./test
alignof(atomic64_t)=4
alignof(atomic64_ta)=4
alignof(unsigned long long)=8
&s.counter=0x80496c4
&sa.counter=0x8049708
$ gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.3.3/configure --enable-languages=c,c++ --prefix=/usr
Thread model: posix
gcc version 4.3.3 (GCC)
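
Why 8-byte alignment is enough can be checked with a little arithmetic. The sketch below is an illustration, not part of the patch: the helper names are hypothetical, and the 64-byte line size is an assumption taken from the aligned(64) used in the test program above.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64u	/* assumed line size, as in aligned(64) above */

/* Returns 1 if the bytes [addr, addr + size) touch more than one cache line. */
static int sketch_spans_two_lines(uintptr_t addr, size_t size)
{
	return (addr / SKETCH_CACHE_LINE) !=
	       ((addr + size - 1) / SKETCH_CACHE_LINE);
}

/*
 * Walks every offset inside one cache line that the given alignment allows
 * and counts how many of them make an 8-byte object straddle the boundary.
 */
static int sketch_count_straddles(size_t align)
{
	int n = 0;

	for (size_t off = 0; off < SKETCH_CACHE_LINE; off += align)
		n += sketch_spans_two_lines(off, 8);
	return n;
}
```

With 4-byte alignment one offset (60) puts the 8 bytes across two lines; with 8-byte alignment the last allowed offset is 56, so the object always ends inside the line. The address printed for s.counter above sits at offset 4 within its line, so that particular instance does not straddle, but nothing in the type prevents another instance from doing so.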
[PATCH] x86: atomic64_t should be 8 bytes aligned
LOCKed instructions that span two cache lines are painful.
Make sure an atomic64_t is 8-byte aligned.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
diff --git a/arch/x86/include/asm/atomic_32.h b/arch/x86/include/asm/atomic_32.h
index 2503d4e..1927a56 100644
--- a/arch/x86/include/asm/atomic_32.h
+++ b/arch/x86/include/asm/atomic_32.h
@@ -250,7 +250,7 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
 /* An 64bit atomic type */
 
 typedef struct {
-	unsigned long long counter;
+	unsigned long long __attribute__((__aligned__(8))) counter;
 } atomic64_t;
 
 #define ATOMIC64_INIT(val)	{ (val) }