From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: From: David Howells In-Reply-To: <43A08D50.30707@yahoo.com.au> References: <43A08D50.30707@yahoo.com.au> <439FFF63.6020105@yahoo.com.au> <439F6EAB.6030903@yahoo.com.au> <439E122E.3080902@yahoo.com.au> <22479.1134467689@warthog.cambridge.redhat.com> <13613.1134557656@warthog.cambridge.redhat.com> <15157.1134560767@warthog.cambridge.redhat.com> Subject: Re: [PATCH 1/19] MUTEX: Introduce simple mutex implementation Date: Fri, 16 Dec 2005 12:00:38 +0000 Message-ID: <12832.1134734438@warthog.cambridge.redhat.com> Sender: dhowells@redhat.com To: Nick Piggin Cc: David Howells , torvalds@osdl.org, akpm@osdl.org, hch@infradead.org, arjan@infradead.org, matthew@wil.cx, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org List-ID: Nick Piggin wrote: > I was under the impression that with cmpxchg, you don't need the mutex lock. > If you do then sure, cmpxchg doesn't buy you anything (even if the arch does > natively support it). Consider the slow path... Imagine the mutex has three states: 0 (unset), 1 (held), 3 (contention). MUTEX PROCESS A PROCESS B PROCESS C ====== ============== ============== ============== 1,B,- 1,B,- -->mutex_lock() 1,B,- cmpxchg(0,1) [failed] 1,B,- -->__mutex_lock() 1,B,- -->mutex_unlock() 1,B,- cmpxchg(1,0) [success] 0,-,- <--mutex_unlock() 0,-,- spin_lock 0,-,A cmpxchg(0,1) [success] 1,A,A spin_unlock 1,A,- <--__mutex_lock() 1,A,- <--mutex_lock() Or: MUTEX PROCESS A PROCESS B PROCESS C ====== ============== ============== ============== 1,B,- 1,B,- -->mutex_lock() 1,B,- cmpxchg(0,1) [failed] 1,B,- -->__mutex_lock() 1,B,- spin_lock 1,B,A cmpxchg(0,1) [failed] 1,B,A -->mutex_unlock() 1,B,A cmpxchg(1,0) [success] 0,-,A <--mutex_unlock() 0,-,A cmpxchg(1,3) [failed] 0,-,A cmpxchg(0,1) [success] 1,A,A spin_unlock 1,A,- <--__mutex_lock() 1,A,- <--mutex_lock() Or: MUTEX PROCESS A PROCESS B PROCESS C ====== ============== ============== ============== 1,B,- 1,B,- -->mutex_lock() 1,B,- cmpxchg(0,1) [failed] 1,B,- -->__mutex_lock() 1,B,- spin_lock 1,B,A cmpxchg(0,1) [failed] 1,B,A -->mutex_unlock() 1,B,A cmpxchg(1,0) [success] 0,-,A <--mutex_unlock() 0,-,A cmpxchg(1,3) [failed] 0,-,A -->mutex_lock() 0,-,A cmpxchg(0,1) [success] 1,C,A <--mutex_lock() 1,C,A cmpxchg(0,1) [failed] 1,C,A cmpxchg(1,3) [success] 3,A,A spin_unlock 3,A,- <--__mutex_lock() 3,A,- <--mutex_lock() See how many cmpxchgs you may end up doing? Now imagine that cmpxchg() is implemented as: spin_lock_irqsave(&common_lock[N], flags); actual = *state; if (actual == old) *state = new; spin_unlock_irqrestore(&common_lock[N], flags); Now my point about using LL/SC is that: 1,C,A cmpxchg(0,1) [failed] 1,C,A cmpxchg(1,3) [success] 3,C,A ... Can be turned into: 1,C,A x = LL() 1,C,A x |= 2; 1,C,A SC(3) [success] 3,C,A ... On x86 you could use: 1,C,A LOCK OR (2) 3,C,A ... instead. Now, contention isn't very likely, so using CMPXCHG _may_ be good enough _if_ you have it. But if you have to emulate it by using spinlocks, you're far better off just wrapping the entire thing in spinlocks and not pretending use atomic ops to access the counter; unless, of course, you have somthing that can do a 1-bit XCHG or better... MUTEX PROCESS A PROCESS B PROCESS C ====== ============== ============== ============== 0,-,- -->mutex_lock() 0,-,- xchg(1) == 0 1,B,- <--mutex_lock() 1,B,- 1,B,- -->mutex_lock() 1,B,- xchg(1) == 1 1,B,- -->__mutex_lock() 1,B,- -->mutex_unlock() 1,B,B spin_lock 1,B,B set(0) 0,-,B spin_unlock 0,-,- <--mutex_unlock() 0,-,- spin_lock 0,-,A xchg(1) == 0 1,A,A spin_unlock 1,A,- <--__mutex_lock() 1,A,- <--mutex_lock() mutex_unlock() should get the spinlock here before modifying the count, because if there's anything on the queue, it should wake up the first waiter rather than clearing the count. David