From mboxrd@z Thu Jan 1 00:00:00 1970
From: lemonlime51@gmail.com (Matthias Bonne)
Date: Sun, 15 Mar 2015 23:49:07 +0200
Subject: Question on mutex code
In-Reply-To: <1426381746.28068.70.camel@stgolabs.net>
References: <54F64E10.7050801@gmail.com>
 <1425992639.3991.11.camel@opteya.com>
 <5504BECB.50605@gmail.com>
 <1426381401.28068.68.camel@stgolabs.net>
 <1426381746.28068.70.camel@stgolabs.net>
Message-ID: <5505FE53.1060807@gmail.com>
To: kernelnewbies@lists.kernelnewbies.org
List-Id: kernelnewbies.lists.kernelnewbies.org

On 03/15/15 03:09, Davidlohr Bueso wrote:
> On Sat, 2015-03-14 at 18:03 -0700, Davidlohr Bueso wrote:
>> Good analysis, but not quite accurate for one simple fact: mutex
>> trylocks _only_ use fastpaths (obviously just depend on the counter
>> cmpxchg to 0), so you never fallback to the slowpath you are mentioning,
>> thus the race is non existent. Please see the arch code.
>
> For debug we use the trylock slowpath, but so does everything else, so
> again you cannot hit this scenario.
>

You are correct of course - this is why I said that CONFIG_DEBUG_MUTEXES
must be enabled for this to happen. Can you explain why this scenario is
still not possible in the debug case?

The debug case uses mutex-null.h, which contains these macros:

#define __mutex_fastpath_lock(count, fail_fn)           fail_fn(count)
#define __mutex_fastpath_lock_retval(count)             (-1)
#define __mutex_fastpath_unlock(count, fail_fn)         fail_fn(count)
#define __mutex_fastpath_trylock(count, fail_fn)        fail_fn(count)
#define __mutex_slowpath_needs_to_unlock()              1

So both mutex_trylock() and mutex_unlock() always use the slow paths.

The slowpath for mutex_unlock() is __mutex_unlock_slowpath(), which
simply calls __mutex_unlock_common_slowpath(), and the latter starts
like this:

        /*
         * As a performance measurement, release the lock before doing other
         * wakeup related duties to follow. This allows other tasks to acquire
         * the lock sooner, while still handling cleanups in past unlock calls.
         * This can be done as we do not enforce strict equivalence between the
         * mutex counter and wait_list.
         *
         *
         * Some architectures leave the lock unlocked in the fastpath failure
         * case, others need to leave it locked. In the later case we have to
         * unlock it here - as the lock counter is currently 0 or negative.
         */
        if (__mutex_slowpath_needs_to_unlock())
                atomic_set(&lock->count, 1);

        spin_lock_mutex(&lock->wait_lock, flags);
        [...]

So the counter is set to 1 before taking the spinlock, which I think
might cause the race. Did I miss something?
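
In case it helps, here is a small user-space sketch of the interleaving I
have in mind. This is not the kernel code: toy_mutex, unlocker(),
trylocker() and unlocker_in_window are names I made up, a pthread mutex
stands in for wait_lock, and a C11 atomic_int stands in for lock->count.
It only tries to show that once the counter is released before wait_lock
is taken, a concurrent trylock can win the lock in that window (build with
something like "cc -pthread model.c"):

/*
 * Toy user-space model only - NOT the kernel code. All names below are
 * invented for this sketch.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_mutex {
        atomic_int count;               /* 1: unlocked, 0: locked, <0: waiters */
        pthread_mutex_t wait_lock;      /* stands in for lock->wait_lock */
};

static struct toy_mutex lock = {
        .count = 0,                     /* starts out held by the unlocker */
        .wait_lock = PTHREAD_MUTEX_INITIALIZER,
};

/* true between the counter release and taking wait_lock in unlocker() */
static atomic_bool unlocker_in_window;

/* models __mutex_unlock_common_slowpath() in the debug configuration */
static void *unlocker(void *arg)
{
        (void)arg;

        /* release the counter first, like atomic_set(&lock->count, 1)... */
        atomic_store(&lock.count, 1);
        atomic_store(&unlocker_in_window, true);

        /*
         * ...window: the mutex already looks free, but wait_lock is not
         * held yet and none of the debug bookkeeping has run.
         */
        pthread_mutex_lock(&lock.wait_lock);
        atomic_store(&unlocker_in_window, false);
        /* debug bookkeeping / waiter wakeup would happen here */
        pthread_mutex_unlock(&lock.wait_lock);
        return NULL;
}

/* models __mutex_trylock_slowpath() */
static void *trylocker(void *arg)
{
        int prev;

        (void)arg;

        pthread_mutex_lock(&lock.wait_lock);
        prev = atomic_exchange(&lock.count, -1);
        /* no waiters in this model, so drop back from -1 to 0 (locked) */
        atomic_store(&lock.count, 0);
        if (prev == 1 && atomic_load(&unlocker_in_window))
                printf("trylock got the mutex while the unlocker was still "
                       "between the counter release and wait_lock\n");
        pthread_mutex_unlock(&lock.wait_lock);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, unlocker, NULL);
        pthread_create(&b, NULL, trylocker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
}

Whether the message actually prints depends on how the two threads get
scheduled (running it in a loop helps), but as far as I can see nothing
in the model - or in the debug slowpaths - stops the trylock from taking
the mutex before the unlocker has acquired wait_lock.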