From: Christian Schoenebeck <qemu_oss@crudebyte.com>
To: qemu-devel@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Geoffrey McRae <geoff@hostfission.com>,
Gerd Hoffmann <kraxel@redhat.com>
Subject: recursive locks (in general)
Date: Fri, 21 Aug 2020 13:12:50 +0200 [thread overview]
Message-ID: <4046931.6zmTeCK0lb@silver> (raw)
In-Reply-To: <c84d95de-c71d-3272-6b41-95753634482a@redhat.com>
On Donnerstag, 20. August 2020 12:54:49 CEST Paolo Bonzini wrote:
> More seriously: programming with concurrency is first and foremost about
> understanding invariants[1]. Locks are relatively simple to reason
> about because they enforce invariants at the points where they are
> acquired or released.
As a side note, independent of the actual QBL issue discussed, while I agree
with you that nested locks should be avoided as much as possible, especially
on heterogenous large scale projects like QEMU; please let me correct some of
the things you said about recursive locks in general:
> However, recursive locks are pure evil. :)
That's a common overgeneralization of the potential issues when dealing with
recursive locks. Especially this claim ...
> but callers have no clue about what invariants are provided around calls
> to do_it(). So, understanding complex code that uses recursive mutexes
> is effectively impossible.
... is incorrect.
It is correct that you can run into deadlocks if you don't know how to deal
with nested recursive locks appropriately. It is incorrect though to say they
were not managable.
There is a golden rule with recursive locks: You always have to preserve a
clear hierarchy. Say you have the following recursive mutexes:
RecursiveMutex mutex0;
RecursiveMutex mutex1;
RecursiveMutex mutex2;
...
RecursiveMutex mutexN;
where the suffix shall identify the hierarchy, i.e. h(mutex0) = 0,
h(mutex1) = 1, ... h(mutexN) = N. Then the golden rule is that in any call
stack the nested locks must always preserve the same transitive hierarchy,
e.g.:
h(lock1) <= h(lock2) <= ... <= h(lockK)
Example (using lock-guard notation), let's say ascending transitivity is
chosen, then the following is Ok, as it does not violate chosen transitivity:
synchronized(mutex0) {
synchronized(mutex1) {
...
synchronized(mutexN) {
}
}
}
Likewise, the following is Ok as well:
synchronized(mutex3) {
synchronized(mutex8) {
...
synchronized(mutexN) {
}
}
}
whereas this would be illegal:
synchronized(mutex3) {
synchronized(mutex2) { // < violates chosen transitivity
...
synchronized(mutexN) {
}
}
}
Now let's say one day somebody accidentally broke that rule and ran into a
deadlock. What you can do to resolve the situation is capturing the call stack
of each mutex's [last] lock. And when you triggered the deadlock, you know
that at least one of the threads violated the lock hierarchy. So you look at
the individual call stacks and correct the program flow appropriately to
restore the hierarchy. And the latter is BTW independent of (i.e. any side
effects) of other threads, so it is really just about looking at what exactly
ONE thread is doing.
And for the latter reason, there are also more systematic approaches to ensure
correctness: for instance a built-in hierarchy check in the individual Mutex
implementation which would then e.g. raise an assertaion fault on early
testing whenever a developer broke the hierarchy in a nested lock sequence.
Another solution would be integrating this hierarchy check into a (e.g.
static) sanatizer, as this information can already be derived directly from
the AST in most cases. But unfortunatley this does not exist yet in any
sanatizer yet AFAIK.
For me, a non-recursive mutex makes sense for one use case: if the intention
is to lock the mutex on one thread while allowing to unlock it on another
thread. For all other use cases I would (personally) prefer a recursive type,
as it guards a clear ownership relation and hence allows to guard and prevent
many mistakes.
Best regards,
Christian Schoenebeck
next prev parent reply other threads:[~2020-08-21 11:43 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-19 6:29 [PATCH v5 0/1] audio/jack: fix use after free segfault Geoffrey McRae
2020-08-19 6:29 ` [PATCH v5 1/1] " Geoffrey McRae
2020-08-19 15:21 ` Christian Schoenebeck
2020-08-19 15:27 ` Geoffrey McRae
2020-08-20 5:37 ` Gerd Hoffmann
2020-08-20 10:06 ` Christian Schoenebeck
2020-08-20 10:54 ` Paolo Bonzini
2020-08-20 12:00 ` Christian Schoenebeck
2020-08-21 13:13 ` Paolo Bonzini
2020-08-26 13:48 ` PTHREAD_MUTEX_ERRORCHECK and fork() Christian Schoenebeck
2020-08-21 11:12 ` Christian Schoenebeck [this message]
2020-08-21 13:08 ` recursive locks (in general) Paolo Bonzini
2020-08-21 15:25 ` Christian Schoenebeck
2020-08-21 11:28 ` [PATCH v5 1/1] audio/jack: fix use after free segfault Geoffrey McRae
2020-08-21 13:13 ` Paolo Bonzini
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4046931.6zmTeCK0lb@silver \
--to=qemu_oss@crudebyte.com \
--cc=geoff@hostfission.com \
--cc=kraxel@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).