All of lore.kernel.org
 help / color / mirror / Atom feed
From: lemonlime51@gmail.com (Matthias Bonne)
To: kernelnewbies@lists.kernelnewbies.org
Subject: Question on mutex code
Date: Wed, 04 Mar 2015 02:13:04 +0200	[thread overview]
Message-ID: <54F64E10.7050801@gmail.com> (raw)

Hello,

I am trying to understand how mutexes work in the kernel, and I think
there might be a race between mutex_trylock() and mutex_unlock(). More
specifically, the race is between the functions
__mutex_trylock_slowpath and __mutex_unlock_common_slowpath (both
defined in kernel/locking/mutex.c).

Consider the following sequence of events:

0. Suppose a mutex is locked by task A and has no waiters.

1. Task B calls mutex_trylock().

2. mutex_trylock() calls the architecture-specific
    __mutex_fastpath_trylock(), with __mutex_trylock_slowpath() as
    fail_fn.

3. According to the description of __mutex_fastpath_trylock() (for
    example in include/asm-generic/mutex-dec.h), "if the architecture
    has no effective trylock variant, it should call the fail_fn
    spinlock-based trylock variant unconditionally". So
    __mutex_fastpath_trylock() may now call __mutex_trylock_slowpath().

4. Task A releases the mutex.

5. Task B, in __mutex_trylock_slowpath, executes:

         /* No need to trylock if the mutex is locked. */
         if (mutex_is_locked(lock))
                 return 0;

    Since the mutex is no longer locked, the function continues.

6. Task C, which runs on a different cpu than task B, locks the mutex
    again.

7. Task B, in __mutex_trylock_slowpath(), continues:

         spin_lock_mutex(&lock->wait_lock, flags);

         prev = atomic_xchg(&lock->count, -1);
         if (likely(prev == 1)) {
                 mutex_set_owner(lock);
                 mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
         }

At this point task B holds mutex->wait_lock, prev is 0 (because there
are no waiters other than task B, so the count was 0) and the mutex
count is set to -1.

5. Task C calls mutex_unlock() to unlock the mutex.

6. mutex_unlock() calls the architecture-specific function
    __mutex_fastpath_unlock(), which fails (because the mutex count is
    -1), so it now calls __mutex_unlock_slowpath(), which calls
    __mutex_unlock_common_slowpath().

7. __mutex_unlock_common_slowpath() sets the mutex count to 1
    unconditionally, before spinning on mutex->wait_lock.

8. Task B, in __mutex_trylock_slowpath, continues:

         /* Set it back to 0 if there are no waiters: */
         if (likely(list_empty(&lock->wait_list)))
                 atomic_set(&lock->count, 0);

         spin_unlock_mutex(&lock->wait_lock, flags);

         return prev == 1;

    mutex->wait_list is still empty, so the code sets the mutex count to
    zero (which means the mutex is locked), releases mutex->wait_lock,
    and returns 0 (which means that the mutex is locked by someone else,
    and cannot be acquired).

9. Task C, in __mutex_unlock_common_slowpath, acquires
    mutex->wait_lock, unlocks it immediately (because there are no
    waiters to wake up) and returns.

The end result is that the mutex count is 0 (locked), although the
owner has just released it, and nobody else is holding the mutex. So it
can no longer be acquired by anyone.

Am I missing something that prevents the above scenario from happening?
If not, should I post a patch that fixes it to LKML? Or is it
considered too "theoretical" and cannot happen in practice?

Thanks.

             reply	other threads:[~2015-03-04  0:13 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-04  0:13 Matthias Bonne [this message]
2015-03-10 13:03 ` Question on mutex code Yann Droneaud
2015-03-10 13:03   ` Yann Droneaud
2015-03-10 14:59   ` Valdis.Kletnieks at vt.edu
2015-03-10 14:59     ` Valdis.Kletnieks
2015-03-14 23:08     ` Matthias Bonne
2015-03-14 23:08       ` Matthias Bonne
2015-03-14 23:05   ` Matthias Bonne
2015-03-14 23:05     ` Matthias Bonne
2015-03-15  1:03     ` Davidlohr Bueso
2015-03-15  1:04       ` Davidlohr Bueso
2015-03-15  1:09       ` Davidlohr Bueso
2015-03-15  1:10         ` Davidlohr Bueso
2015-03-15 21:49         ` Matthias Bonne
2015-03-15 21:49           ` Matthias Bonne
2015-03-15 22:10           ` Rabin Vincent
2015-03-15 22:11             ` Rabin Vincent
2015-03-16  3:40             ` Matthias Bonne
2015-03-16  3:40               ` Matthias Bonne
2015-03-15 22:18           ` Davidlohr Bueso
2015-03-15 22:19             ` Davidlohr Bueso
2015-03-15 22:23             ` Davidlohr Bueso
2015-03-15 22:24               ` Davidlohr Bueso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54F64E10.7050801@gmail.com \
    --to=lemonlime51@gmail.com \
    --cc=kernelnewbies@lists.kernelnewbies.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.