From: Davidlohr Bueso <dave@stgolabs.net>
To: Manfred Spraul <manfred@colorfullife.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>,
Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
"H. Peter Anvin" <hpa@zytor.com>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
linux-next@vger.kernel.org, 1vier1@web.de,
felixh@informatik.uni-bremen.de, stable@vger.kernel.org
Subject: Re: [PATCH 1/2] ipc/sem.c: Fix complex_count vs. simple op race
Date: Mon, 20 Jun 2016 17:30:15 -0700 [thread overview]
Message-ID: <20160621003015.GA15593@linux-80c1.suse> (raw)
In-Reply-To: <1466280142-19741-1-git-send-email-manfred@colorfullife.com>
On Sat, 18 Jun 2016, Manfred Spraul wrote:
>diff --git a/include/linux/sem.h b/include/linux/sem.h
>index 976ce3a..d0efd6e 100644
>--- a/include/linux/sem.h
>+++ b/include/linux/sem.h
>@@ -21,6 +21,7 @@ struct sem_array {
> struct list_head list_id; /* undo requests on this array */
> int sem_nsems; /* no. of semaphores in array */
> int complex_count; /* pending complex operations */
I see this patch also takes care of complex_count needing READ/WRITE_ONCE
as you change that to always be done with the big sem_perm lock.
>+ bool complex_mode; /* no parallel simple ops */
But I'm wondering about races with this one. Doesn't complex_mode need
acquire/release semantics?
> };
>
[...]
> /*
>- * Wait until all currently ongoing simple ops have completed.
>+ * Enter the mode suitable for non-simple operations:
> * Caller must own sem_perm.lock.
>- * New simple ops cannot start, because simple ops first check
>- * that sem_perm.lock is free.
>- * that a) sem_perm.lock is free and b) complex_count is 0.
> */
>-static void sem_wait_array(struct sem_array *sma)
>+static void complexmode_enter(struct sem_array *sma)
> {
> int i;
> struct sem *sem;
>
>- if (sma->complex_count) {
>- /* The thread that increased sma->complex_count waited on
>- * all sem->lock locks. Thus we don't need to wait again.
>- */
>+ if (sma->complex_mode) {
>+ /* We are already in complex_mode. Nothing to do */
This complex_mode load is serialized because both complexmode_enter() and
_tryleave(), which are the main calls that modify the variable, are serialized
by sem_perm lock, right?
> return;
> }
Btw, I like that this logic is much simpler, just by reading the comments :)
>+ WRITE_ONCE(sma->complex_mode, true);
>+
>+ /* We need a full barrier:
>+ * The write to complex_mode must be visible
>+ * before we read the first sem->lock spinlock state.
>+ */
>+ smp_mb();
>
> for (i = 0; i < sma->sem_nsems; i++) {
> sem = sma->sem_base + i;
>@@ -285,6 +294,29 @@ static void sem_wait_array(struct sem_array *sma)
> }
>
> /*
>+ * Try to leave the mode that disallows simple operations:
>+ * Caller must own sem_perm.lock.
>+ */
>+static void complexmode_tryleave(struct sem_array *sma)
>+{
>+ if (sma->complex_count) {
>+ /* Complex ops are sleeping.
>+ * We must stay in complex mode
>+ */
>+ return;
>+ }
>+ /*
>+ * Immediately after setting complex_mode to false,
>+ * a simple op can start. Thus: all memory writes
>+ * performed by the current operation must be visible
>+ * before we set complex_mode to false.
>+ */
>+ smp_wmb();
>+
>+ WRITE_ONCE(sma->complex_mode, false);
smp_store_release()? See below.
>+}
>+
>+/*
> * If the request contains only one semaphore operation, and there are
> * no complex transactions pending, lock only the semaphore involved.
> * Otherwise, lock the entire semaphore array, since we either have
>@@ -300,56 +332,38 @@ static inline int sem_lock(struct sem_array *sma, struct sembuf *sops,
> /* Complex operation - acquire a full lock */
> ipc_lock_object(&sma->sem_perm);
>
>- /* And wait until all simple ops that are processed
>- * right now have dropped their locks.
>- */
>- sem_wait_array(sma);
>+ /* Prevent parallel simple ops */
>+ complexmode_enter(sma);
> return -1;
> }
>
> /*
> * Only one semaphore affected - try to optimize locking.
>- * The rules are:
>- * - optimized locking is possible if no complex operation
>- * is either enqueued or processed right now.
>- * - The test for enqueued complex ops is simple:
>- * sma->complex_count != 0
>- * - Testing for complex ops that are processed right now is
>- * a bit more difficult. Complex ops acquire the full lock
>- * and first wait that the running simple ops have completed.
>- * (see above)
>- * Thus: If we own a simple lock and the global lock is free
>- * and complex_count is now 0, then it will stay 0 and
>- * thus just locking sem->lock is sufficient.
>+ * Optimized locking is possible if no complex operation
>+ * is either enqueued or processed right now.
>+ *
>+ * Both facts are tracked by complex_mode.
> */
> sem = sma->sem_base + sops->sem_num;
>
>- if (sma->complex_count == 0) {
>+ /*
>+ * Initial check for complex_mode. Just an optimization,
>+ * no locking.
>+ */
>+ if (!READ_ONCE(sma->complex_mode)) {
We have no lock (which is the whole point), I think we need stronger
guarantees here to avoid racing with another thread that holds sem_perm
lock and is entering complexmode for the first time or vice versa with
tryleave(). An smp_load_acquire here would pair with the suggested
smp_store_release calls.
Thanks,
Davidlohr
next prev parent reply other threads:[~2016-06-21 0:30 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-15 5:23 linux-next: manual merge of the akpm-current tree with the tip tree Stephen Rothwell
2016-06-18 19:39 ` Manfred Spraul
2016-06-18 20:02 ` [PATCH 1/2] ipc/sem.c: Fix complex_count vs. simple op race Manfred Spraul
2016-06-18 20:02 ` Manfred Spraul
2016-06-18 20:02 ` [PATCH 2/2] ipc/sem: sem_lock with hysteresis Manfred Spraul
2016-06-21 20:29 ` Davidlohr Bueso
2016-06-25 17:37 ` Manfred Spraul
2016-06-28 17:54 ` Davidlohr Bueso
2016-06-20 23:04 ` [PATCH 1/2] ipc/sem.c: Fix complex_count vs. simple op race Andrew Morton
2016-06-20 23:04 ` Andrew Morton
2016-06-23 18:55 ` Manfred Spraul
2016-06-21 0:30 ` Davidlohr Bueso [this message]
2016-06-23 19:22 ` Manfred Spraul
2016-06-28 5:27 ` Davidlohr Bueso
2016-06-30 19:28 ` Manfred Spraul
2016-07-01 16:52 ` Davidlohr Bueso
-- strict thread matches above, loose matches on Subject: below --
2016-06-25 17:37 [PATCH 0/2] ipc/sem.c: sem_lock fixes Manfred Spraul
2016-06-25 17:37 ` [PATCH 1/2] ipc/sem.c: Fix complex_count vs. simple op race Manfred Spraul
2016-06-25 19:41 ` Holger Hoffstätte
2016-07-13 5:06 [PATCH 0/2] ipc/sem.c: sem_lock fixes Manfred Spraul
2016-07-13 5:06 ` [PATCH 1/2] ipc/sem.c: Fix complex_count vs. simple op race Manfred Spraul
2016-07-16 1:27 ` Davidlohr Bueso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160621003015.GA15593@linux-80c1.suse \
--to=dave@stgolabs.net \
--cc=1vier1@web.de \
--cc=akpm@linux-foundation.org \
--cc=felixh@informatik.uni-bremen.de \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-next@vger.kernel.org \
--cc=manfred@colorfullife.com \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=sfr@canb.auug.org.au \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.