* [OT] ppc64 serialization problem @ 2006-03-29 1:58 Greg Smith 2006-03-29 3:11 ` Benjamin Herrenschmidt 2006-03-29 4:07 ` Paul Mackerras 0 siblings, 2 replies; 9+ messages in thread From: Greg Smith @ 2006-03-29 1:58 UTC (permalink / raw) To: linuxppc-dev We have a multi-threaded app running on a p520 in 64 bit mode. Thread A does pthread_mutex_lock(&lock); u32 &= ~bitA; pthread_mutex_unlock(&lock); and Thread B does pthread_mutex_lock(&lock); u32 |= bitB; A = u32; B = u32; pthread_mutex_unlock(&lock); On rare occasions, values A and B will differ! In the examples that I have seen, there is contention with `lock'. This phenomenon does not occur on ppc32 or a number of other architectures that we support. I confess I do not know the linux version nor the glibc version nor what pthreads implementation is being used. I'll find that out shortly. What I am curious about is where the problem might lie (kernel/lib/pthreads/app) so I can ask the right people. Thank you for your patience, Greg Smith ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [OT] ppc64 serialization problem 2006-03-29 1:58 [OT] ppc64 serialization problem Greg Smith @ 2006-03-29 3:11 ` Benjamin Herrenschmidt 2006-03-29 4:08 ` Greg Smith 2006-03-29 4:07 ` Paul Mackerras 1 sibling, 1 reply; 9+ messages in thread From: Benjamin Herrenschmidt @ 2006-03-29 3:11 UTC (permalink / raw) To: Greg Smith; +Cc: linuxppc-dev On Tue, 2006-03-28 at 20:58 -0500, Greg Smith wrote: > We have a multi-threaded app running on a p520 in 64 bit mode. > > Thread A does > > pthread_mutex_lock(&lock); > u32 &= ~bitA; > pthread_mutex_unlock(&lock); > > and Thread B does > > pthread_mutex_lock(&lock); > u32 |= bitB; > A = u32; > B = u32; > pthread_mutex_unlock(&lock); > > On rare occasions, values A and B will differ! In the examples that I > have seen, there is contention with `lock'. This phenomenon does not > occur on ppc32 or a number of other architectures that we support. How did you actually "look" at A and B ? is that also protected by the lock ? > I confess I do not know the linux version nor the glibc version nor what > pthreads implementation is being used. I'll find that out shortly. That's fairly important to know those yes. > What I am curious about is where the problem might lie > (kernel/lib/pthreads/app) so I can ask the right people. > > Thank you for your patience, > Greg Smith > > > > _______________________________________________ > Linuxppc-dev mailing list > Linuxppc-dev@ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc-dev ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [OT] ppc64 serialization problem 2006-03-29 3:11 ` Benjamin Herrenschmidt @ 2006-03-29 4:08 ` Greg Smith 2006-03-29 4:21 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 9+ messages in thread From: Greg Smith @ 2006-03-29 4:08 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: linuxppc-dev Very fair questions!! Actually the code was pthread_mutex_lock(&lock); u32 |= bitB; TRACE("A", u32, ...); TRACE("B", u32, ...); pthread_mutex_unlock(&lock); where TRACE is a function call (entering a trace entry to an in-storage wrap-around table). So for the "A" call, u32 could have come directly from a register and for "B" from the storage location. I'll have the user (actually a fellow developer) send me the assembly listing to make sure. He has tested SLES9 (kernel 2.6.5, glibc 2.3.3, gcc 3.3.3) and Debian (kernel 2.6.16, glibc 2.3.6, gcc 4.0.3). The TRACE occurs while the lock is held. Now the interesting part. I suggested he try u64 instead of u32. That works!! He is suspecting a recent firmware upgrade may have something to do with the problem. Thank you, Greg Smith On Wed, 2006-03-29 at 14:11 +1100, Benjamin Herrenschmidt wrote: > On Tue, 2006-03-28 at 20:58 -0500, Greg Smith wrote: > > We have a multi-threaded app running on a p520 in 64 bit mode. > > > > Thread A does > > > > pthread_mutex_lock(&lock); > > u32 &= ~bitA; > > pthread_mutex_unlock(&lock); > > > > and Thread B does > > > > pthread_mutex_lock(&lock); > > u32 |= bitB; > > A = u32; > > B = u32; > > pthread_mutex_unlock(&lock); > > > > On rare occasions, values A and B will differ! In the examples that I > > have seen, there is contention with `lock'. This phenomenon does not > > occur on ppc32 or a number of other architectures that we support. > > How did you actually "look" at A and B ? is that also protected by the > lock ? > > > I confess I do not know the linux version nor the glibc version nor what > > pthreads implementation is being used. I'll find that out shortly. > > That's fairly important to know those yes. > > > What I am curious about is where the problem might lie > > (kernel/lib/pthreads/app) so I can ask the right people. > > > > Thank you for your patience, > > Greg Smith ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [OT] ppc64 serialization problem 2006-03-29 4:08 ` Greg Smith @ 2006-03-29 4:21 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 9+ messages in thread From: Benjamin Herrenschmidt @ 2006-03-29 4:21 UTC (permalink / raw) To: Greg Smith; +Cc: linuxppc-dev On Tue, 2006-03-28 at 23:08 -0500, Greg Smith wrote: > Very fair questions!! > > Actually the code was > > pthread_mutex_lock(&lock); > u32 |= bitB; > TRACE("A", u32, ...); > TRACE("B", u32, ...); > pthread_mutex_unlock(&lock); > > where TRACE is a function call (entering a trace entry to an in-storage > wrap-around table). So for the "A" call, u32 could have come directly > from a register and for "B" from the storage location. I'll have the > user (actually a fellow developer) send me the assembly listing to make > sure. Could you try to make a small program that reproduces the problem instead ? > He has tested SLES9 (kernel 2.6.5, glibc 2.3.3, gcc 3.3.3) and Debian > (kernel 2.6.16, glibc 2.3.6, gcc 4.0.3). > > The TRACE occurs while the lock is held. > > Now the interesting part. > > I suggested he try u64 instead of u32. That works!! > > He is suspecting a recent firmware upgrade may have something to do with > the problem. I doubt it, but I need more informations. > Thank you, > Greg Smith > > On Wed, 2006-03-29 at 14:11 +1100, Benjamin Herrenschmidt wrote: > > On Tue, 2006-03-28 at 20:58 -0500, Greg Smith wrote: > > > We have a multi-threaded app running on a p520 in 64 bit mode. > > > > > > Thread A does > > > > > > pthread_mutex_lock(&lock); > > > u32 &= ~bitA; > > > pthread_mutex_unlock(&lock); > > > > > > and Thread B does > > > > > > pthread_mutex_lock(&lock); > > > u32 |= bitB; > > > A = u32; > > > B = u32; > > > pthread_mutex_unlock(&lock); > > > > > > On rare occasions, values A and B will differ! In the examples that I > > > have seen, there is contention with `lock'. This phenomenon does not > > > occur on ppc32 or a number of other architectures that we support. > > > > How did you actually "look" at A and B ? is that also protected by the > > lock ? > > > > > I confess I do not know the linux version nor the glibc version nor what > > > pthreads implementation is being used. I'll find that out shortly. > > > > That's fairly important to know those yes. > > > > > What I am curious about is where the problem might lie > > > (kernel/lib/pthreads/app) so I can ask the right people. > > > > > > Thank you for your patience, > > > Greg Smith > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [OT] ppc64 serialization problem 2006-03-29 1:58 [OT] ppc64 serialization problem Greg Smith 2006-03-29 3:11 ` Benjamin Herrenschmidt @ 2006-03-29 4:07 ` Paul Mackerras 2006-03-29 18:20 ` Greg Smith 1 sibling, 1 reply; 9+ messages in thread From: Paul Mackerras @ 2006-03-29 4:07 UTC (permalink / raw) To: Greg Smith; +Cc: linuxppc-dev Greg Smith writes: > On rare occasions, values A and B will differ! In the examples that I > have seen, there is contention with `lock'. This phenomenon does not > occur on ppc32 or a number of other architectures that we support. I would be interested to see the assembler code being generated for your code snippets. Paul. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [OT] ppc64 serialization problem 2006-03-29 4:07 ` Paul Mackerras @ 2006-03-29 18:20 ` Greg Smith 2006-03-29 18:32 ` Olaf Hering 0 siblings, 1 reply; 9+ messages in thread From: Greg Smith @ 2006-03-29 18:20 UTC (permalink / raw) To: Paul Mackerras; +Cc: linuxppc-dev On Wed, 2006-03-29 at 15:07 +1100, Paul Mackerras wrote: > Greg Smith writes: > > > On rare occasions, values A and B will differ! In the examples that I > > have seen, there is contention with `lock'. This phenomenon does not > > occur on ppc32 or a number of other architectures that we support. > > I would be interested to see the assembler code being generated for > your code snippets. > > Paul. This is looking like a compiler optimization issue. We have two fields _u32 x,y; Field y is what we are trying to serialize with the lock. However, while we are doing that it appears another thread is updating x. When x is updated and with optimization, the code is doing an 8 byte load of x and y then doing an 8 byte store, wiping out what was stored in y. Which explains why changing the type to _u64 seemed to fix the problem. Below are some of the notes the other developer sent me. Thanks, Greg Smith There are OTHER field neighbouring the ints_state field.. (and this is probably some of the flags).. gcc, when optimizing, gcc load and stores 64 bit values, *INCLUDING* the ints_state field effectivelly breaking serialization. Here is a sample program : struct _test1 { unsigned long long z; unsigned char c; unsigned int a:1; const volatile unsigned int b; }; int main() { } unsigned int toto(struct _test1 *data) { data->a=1; return data->b; } And here is the relevant asm snippet : .A ld 11,8(9) - get 64 bits ->a lis 0,0x80 * sldi 0,0,32 * or 0,11,0 * .B std 0,8(9) * Store 64 bits -> a ld 9,112(31) * lwz 0,12(9) * get 32 bit b So it is clear between .A and .B -> value of b was read then written. If anything happens between .A and .B to ->b, then it is lost. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [OT] ppc64 serialization problem 2006-03-29 18:20 ` Greg Smith @ 2006-03-29 18:32 ` Olaf Hering 2006-03-29 18:42 ` Ivan Warren 0 siblings, 1 reply; 9+ messages in thread From: Olaf Hering @ 2006-03-29 18:32 UTC (permalink / raw) To: Greg Smith; +Cc: linuxppc-dev, Paul Mackerras On Wed, Mar 29, Greg Smith wrote: > So it is clear between .A and .B -> value of b was read then written. If > anything happens between .A and .B to ->b, then it is lost. Even gcc4.1 does it that way with -O0. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [OT] ppc64 serialization problem 2006-03-29 18:32 ` Olaf Hering @ 2006-03-29 18:42 ` Ivan Warren 2006-03-29 21:29 ` Ivan Warren 0 siblings, 1 reply; 9+ messages in thread From: Ivan Warren @ 2006-03-29 18:42 UTC (permalink / raw) To: Olaf Hering; +Cc: Greg Smith, linuxppc-dev, Paul Mackerras [-- Attachment #1: Type: text/plain, Size: 938 bytes --] Olaf Hering wrote: > On Wed, Mar 29, Greg Smith wrote: > > >> So it is clear between .A and .B -> value of b was read then written. If >> anything happens between .A and .B to ->b, then it is lost. >> > > Even gcc4.1 does it that way with -O0 Yeah.. Noticed that..gcc 3.3.3 and gcc 4.0.3 also do this Even declaring the field volatile has no effect.. Question : Is this *NORMAL* behavior ? I discussed this with some C folks and they seem to think (at least some of them) that it's broken per standards.. (volatile fields should be only read/written IFF read/written explicitly - pretty much what section 6.3.7 of the C std says). Anyway I can produce perfectly valid sample code (to my understanding anyway) that produces incorrect result.. YUCK ! This is also possibly a problem with the powerpc64 back end of gcc.. Anyway.. It's OT.. Sorry to have bothered you guys ! (And thanks Greg for pointing me here). --Ivan [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/x-pkcs7-signature, Size: 3216 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [OT] ppc64 serialization problem 2006-03-29 18:42 ` Ivan Warren @ 2006-03-29 21:29 ` Ivan Warren 0 siblings, 0 replies; 9+ messages in thread From: Ivan Warren @ 2006-03-29 21:29 UTC (permalink / raw) To: Ivan Warren; +Cc: Greg Smith, linuxppc-dev, Paul Mackerras, Olaf Hering [-- Attachment #1: Type: text/plain, Size: 474 bytes --] Ivan Warren wrote: > Question : > Is this *NORMAL* behavior ? I discussed this with some C folks and > they seem to think (at least some of them) that it's broken per > standards.. (volatile fields should be only read/written IFF > read/written explicitly - pretty much what section 6.3.7 of the C std > says). > ...For Info... I've submitted a gcc PR : http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26933 Anyway.. My apologies since this was a bit off topic.. --Ivan [-- Attachment #2: S/MIME Cryptographic Signature --] [-- Type: application/x-pkcs7-signature, Size: 3216 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2006-03-29 21:29 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-03-29 1:58 [OT] ppc64 serialization problem Greg Smith 2006-03-29 3:11 ` Benjamin Herrenschmidt 2006-03-29 4:08 ` Greg Smith 2006-03-29 4:21 ` Benjamin Herrenschmidt 2006-03-29 4:07 ` Paul Mackerras 2006-03-29 18:20 ` Greg Smith 2006-03-29 18:32 ` Olaf Hering 2006-03-29 18:42 ` Ivan Warren 2006-03-29 21:29 ` Ivan Warren
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).