On 08/26/2015 12:18 PM, Jason Gunthorpe wrote: > On Wed, Aug 26, 2015 at 09:37:50AM -0400, Doug Ledford wrote: > >>> Sure looks like these two race: >>> >>> if (test_and_set_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)).. >>> clear_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags); >>> >>> Resulting in corruption of the flags. >> >> Incorrect. On all arches, the set_bit and friends functions are defined >> to be atomic. > > Ah, right, I always forget about that because they don't use the > atomic type or naming scheme like everything else :| > >> Which brings me to what I was alluding to earlier. In your patch, you >> move the complete(mcast->done); to inside the spinlock. I don't >> actually like that. While the flags are atomic and therefore visible >> the moment they are done, any other writes to the mcast struct might be >> reordered. By releasing the spinlock before you call complete(), you >> force a write memory barrier and make all changes visible. That way >> anything that was waiting on that complete has a guaranteed clean >> picture when they get woke up. Calling complete() within the spinlock >> takes that guarantee away. > > complete() is a guarenteed memory barrier. > > /** > * complete: - signals a single thread waiting on this completion > * @x: holds the state of this particular completion > [..] > * It may be assumed that this function implies a write memory barrier before > * changing the task state if and only if any tasks are woken up. > */ That still takes us back to the fact that the locking changes are unneeded. I'm not opposed to them, but as you mentioned in your first email, they should go with the changes that require them, and none of the changes in the first patch require them. Which means that if we want to keep them, it might be worth splitting them out and giving them their own patch with an explanation of why they are a benefit (lightly contended code, saves a release/reacquire on the failure path). -- Doug Ledford GPG KeyID: 0E572FDD