From: Eric Dumazet <eric.dumazet@gmail.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>,
mingo@elte.hu, akpm@linux-foundation.org, paulus@samba.org,
arnd@arndb.de, linux-kernel@vger.kernel.org
Subject: [PATCH] x86: atomic64_t should be 8 bytes aligned
Date: Fri, 03 Jul 2009 00:08:26 +0200
Message-ID: <4A4D2FDA.3000509@gmail.com>
In-Reply-To: <alpine.LFD.2.01.0907021419560.3210@localhost.localdomain>
Linus Torvalds wrote:
>
> On Thu, 2 Jul 2009, Eric Dumazet wrote:
>> Using a fixed initial value (instead of __atomic64_read()) is even faster,
>> it apparently permits cpu to use an appropriate bus transaction.
>
> Yeah, I guess it does a "read-for-write-ownership" and allows the thing to
> be done as a single cache transaction.
>
> If we read it first, it will first get the cacheline for shared-read, and
> then the cmpxchg8b will need to turn it from shared to exclusive.
>
> Of course, the _optimal_ situation would be if the cmpxchg8b didn't
> actually do the write at all when the value matches (and all cores could
> just keep it shared), but I guess that's not going to happen.
>
> Too bad there is no pure 8-byte read op. Using MMX has too many downsides.
>
> Btw, your numbers imply that for the atomic64_add_return(), we really
> would be much better off not reading the original value at all. Again, in
> that case, we really do want the "read-for-write-ownership" cache
> transaction, not a read.
>
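For illustration, the "fixed initial value" idea discussed above can be
sketched in GNU C. This is only a sketch under stated assumptions, not the
kernel code: it uses the GCC/Clang builtin __sync_val_compare_and_swap()
instead of hand-written cmpxchg8b assembly, and the names sketch_read64(),
sketch_add_return64() and sketch_demo() are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, not the kernel implementation.
 * Assumption: on 32-bit x86 (i586 or later) the builtin is compiled
 * to a LOCK CMPXCHG8B instruction.
 */

/* Read a 64-bit value with one compare-and-swap and no prior plain load. */
static uint64_t sketch_read64(volatile uint64_t *p)
{
	/*
	 * Compare against a fixed guess (0) and store that same guess.
	 * If *p is 0 it is rewritten with 0, otherwise nothing is stored.
	 * Either way the builtin returns the value found in memory.
	 */
	return __sync_val_compare_and_swap(p, 0ULL, 0ULL);
}

/* Add delta and return the new value, starting from a fixed guess. */
static uint64_t sketch_add_return64(volatile uint64_t *p, uint64_t delta)
{
	uint64_t old = 0;	/* fixed initial guess, no initial read */

	for (;;) {
		uint64_t seen = __sync_val_compare_and_swap(p, old, old + delta);

		if (seen == old)
			return old + delta;
		old = seen;	/* retry with the value actually observed */
	}
}

/* Usage example: both helpers agree on the counter's value. */
static void sketch_demo(void)
{
	volatile uint64_t v = 5;

	assert(sketch_read64(&v) == 5);
	assert(sketch_add_return64(&v, 3) == 8);
	assert(sketch_read64(&v) == 8);
}
```

When the first guess is wrong, the add loop pays for one extra
compare-and-swap; in exchange, the first access to the cache line is already
a write-intent access rather than a shared read.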
I forgot to mention that if an atomic64_t spans two cache lines, my test
program is 10x slower.
So we probably need an __attribute__((aligned(8))) on the atomic64_t definition
as well, since alignof(atomic64_t) is 4, not 8 (!!!)
#include <stdio.h>

/* Current definition: no explicit alignment on the counter. */
typedef struct {
	unsigned long long counter;
} atomic64_t;

/* Proposed definition: counter forced to 8-byte alignment. */
typedef struct {
	unsigned long long __attribute__((aligned(8))) counter;
} atomic64_ta;

struct {
	int a;
	atomic64_t counter;
} s __attribute__((aligned(64)));

struct {
	int a;
	atomic64_ta counter;
} sa __attribute__((aligned(64)));

int main(void)
{
	/* __alignof__ yields a size_t, hence %zu */
	printf("alignof(atomic64_t)=%zu\n", __alignof__(atomic64_t));
	/* Query atomic64_ta here; the run below passed atomic64_t, so it shows 4. */
	printf("alignof(atomic64_ta)=%zu\n", __alignof__(atomic64_ta));
	printf("alignof(unsigned long long)=%zu\n", __alignof__(unsigned long long));
	printf("&s.counter=%p\n", (void *)&s.counter);
	printf("&sa.counter=%p\n", (void *)&sa.counter);
	return 0;
}
$ gcc -O2 -o test test.c ; ./test
alignof(atomic64_t)=4
alignof(atomic64_ta)=4
alignof(unsigned long long)=8
&s.counter=0x80496c4
&sa.counter=0x8049708
$ gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.3.3/configure --enable-languages=c,c++ --prefix=/usr
Thread model: posix
gcc version 4.3.3 (GCC)
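To make the cache-line concern concrete, here is a small sketch of the
arithmetic. It assumes 64-byte cache lines (the same figure used in the
aligned(64) attributes above); sketch_spans_two_lines() and
sketch_straddle_demo() are hypothetical names used only for this example.
With 4-byte alignment, one of the 16 possible offsets within a line
(offset 60) makes an 8-byte object cross into the next line; with 8-byte
alignment that cannot happen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64u	/* assumed cache line size in bytes */

/* Return 1 if [addr, addr + size) touches more than one cache line (size > 0). */
static int sketch_spans_two_lines(uintptr_t addr, size_t size)
{
	uintptr_t first = addr / SKETCH_CACHE_LINE;
	uintptr_t last = (addr + size - 1) / SKETCH_CACHE_LINE;

	return first != last;
}

static void sketch_straddle_demo(void)
{
	uintptr_t off;

	/* 8-byte aligned offsets 0, 8, ..., 56 never cross a line. */
	for (off = 0; off < SKETCH_CACHE_LINE; off += 8)
		assert(!sketch_spans_two_lines(off, 8));

	/* 4-byte aligned only: offset 60 covers bytes 60..67 and crosses. */
	assert(sketch_spans_two_lines(60, 8));

	/* The address printed for s.counter above stays within one line. */
	assert(!sketch_spans_two_lines(0x80496c4, 8));
}
```

So s.counter in the run above is merely misaligned; the slow case only
shows up when a 4-byte-aligned atomic64_t happens to land on the last four
bytes of a line.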
[PATCH] x86: atomic64_t should be 8 bytes aligned
LOCKed instructions that span two cache lines are painful.
Make sure an atomic64_t is 8-byte aligned.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
diff --git a/arch/x86/include/asm/atomic_32.h b/arch/x86/include/asm/atomic_32.h
index 2503d4e..1927a56 100644
--- a/arch/x86/include/asm/atomic_32.h
+++ b/arch/x86/include/asm/atomic_32.h
@@ -250,7 +250,7 @@ static inline int atomic_add_unless(atomic_t *v, int a, int u)
 /* An 64bit atomic type */
 
 typedef struct {
-	unsigned long long counter;
+	unsigned long long __attribute__((__aligned__(8))) counter;
 } atomic64_t;
 
 #define ATOMIC64_INIT(val)	{ (val) }
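
As a possible follow-up, the alignment could also be checked at build time.
The sketch below is an assumption, not part of this patch: it uses C11
_Static_assert (which needs a newer compiler than the gcc 4.3.3 shown above)
on a hypothetical copy of the type named sketch_atomic64_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical copy of the patched type, for illustration only. */
typedef struct {
	unsigned long long __attribute__((__aligned__(8))) counter;
} sketch_atomic64_t;

/* Same layout as the test program: an int followed by the counter. */
struct sketch_holder {
	int a;
	sketch_atomic64_t counter;
};

/* Fail the build if the type or its placement loses 8-byte alignment. */
_Static_assert(__alignof__(sketch_atomic64_t) == 8,
	       "sketch_atomic64_t must be 8-byte aligned");
_Static_assert(offsetof(struct sketch_holder, counter) % 8 == 0,
	       "counter must sit on an 8-byte boundary");
```

With the attribute in place, the compiler pads after the int so that the
counter starts at offset 8 instead of offset 4.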