From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>, Al Viro <viro@ZenIV.linux.org.uk>,
Bart Van Assche <bvanassche@acm.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Neil Brown <neilb@suse.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] sched/wait: avoid abort_exclusive_wait() in __wait_on_bit_lock()
Date: Thu, 1 Sep 2016 21:01:41 +0200
Message-ID: <20160901190141.GJ10138@twins.programming.kicks-ass.net>
In-Reply-To: <20160826124552.GB28904@redhat.com>
On Fri, Aug 26, 2016 at 02:45:52PM +0200, Oleg Nesterov wrote:
> We do not need anything tricky to avoid the race,
The race being:

  CPU0                                      CPU1                                    CPU2

                                            __wait_on_bit_lock()
                                              bit_wait_io()
                                                io_schedule()

  clear_bit_unlock()
  __wake_up_common(.nr_exclusive=1)
    list_for_each_entry()
      if (curr->func() && !--nr_exclusive)
        break

                                                                                    signal()

                                                if (signal_pending_state() == TRUE)
                                                  return -EINTR

And no progress because CPU1 exits without acquiring the lock and CPU0
thinks it's done because it woke someone.
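The lost wakeup can be replayed in a single-threaded userspace model. This is only a sketch under stated assumptions: `struct model_waiter`, `model_wake_up()`, `model_waiter_resume()` and `model_replay()` are made-up names, only exclusive waiters are modelled, and there is no real concurrency or atomicity. It is not the kernel's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one exclusive waiter; not a kernel type. */
struct model_waiter {
	bool queued;		/* still on the wait queue */
	bool signal_pending;	/* a signal arrived while it slept */
	bool has_lock;		/* ended up owning the bit lock */
};

/*
 * Model of __wake_up_common() for exclusive waiters only: wake queued
 * waiters in order and stop once nr_exclusive of them have been woken.
 * Returns the index of the last waiter woken, or -1 if none was queued.
 */
static int model_wake_up(struct model_waiter *w, size_t n, int nr_exclusive)
{
	int last = -1;

	assert(nr_exclusive > 0);
	for (size_t i = 0; i < n; i++) {
		if (!w[i].queued)
			continue;
		w[i].queued = false;	/* autoremove_wake_function() analogue */
		last = (int)i;
		if (!--nr_exclusive)
			break;
	}
	return last;
}

/*
 * What the woken waiter does next. With lock_first == false it checks the
 * signal first and bails out (the racy order); with lock_first == true it
 * tries the lock before looking at the signal.
 */
static int model_waiter_resume(struct model_waiter *w, bool *locked,
			       bool lock_first)
{
	if (!lock_first && w->signal_pending)
		return -EINTR;
	if (!*locked) {
		*locked = true;
		w->has_lock = true;
		return 0;
	}
	/* Lock still held: report the signal, or "would sleep again". */
	return w->signal_pending ? -EINTR : -EAGAIN;
}

/*
 * Replay the scenario: two sleepers, the first with a pending signal, then
 * one unlock plus one exclusive wakeup. Returns how many waiters hold the
 * lock afterwards.
 */
static int model_replay(bool lock_first)
{
	struct model_waiter w[2] = {
		{ .queued = true, .signal_pending = true },
		{ .queued = true },
	};
	bool locked = true;		/* the unlocker holds the lock at first */
	int woken;

	locked = false;			/* clear_bit_unlock() */
	woken = model_wake_up(w, 2, 1);	/* wakes w[0] only */
	if (woken >= 0)
		model_waiter_resume(&w[woken], &locked, lock_first);

	return (int)w[0].has_lock + (int)w[1].has_lock;
}
```

In this model `model_replay(false)` leaves the lock free with the second waiter still queued and nobody left to wake it, while `model_replay(true)` ends with the signalled waiter owning the lock, so a later unlock will issue the next wakeup.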
> we can just call finish_wait() if action() fails.
That would be bit_wait*() returning -EINTR because sigpending.
Sure, you can always call that; the first thing the loop does is
prepare again, so no harm. That, however, does not connect to your
condition... /me puzzled
> test_and_set_bit() implies mb() so
> the lockless list_empty_careful() case is fine, we can not miss the
> condition if we race with unlock_page().
You're talking about this ordering?:

  finish_wait()                   clear_bit_unlock();
    list_empty_careful()
  /* MB implied */                smp_mb__after_atomic();
  test_and_set_bit()              wake_up_page()
                                    ...
                                      autoremove_wake_function()
                                        list_del_init();

That could do with spelling out, I feel. :-)
>  __wait_on_bit_lock(wait_queue_head_t *wq, struct wait_bit_queue *q,
>  			wait_bit_action_f *action, unsigned mode)
>  {
> +	int ret = 0;
>
> +	for (;;) {
>  		prepare_to_wait_exclusive(wq, &q->wait, mode);
> +		if (test_bit(q->key.bit_nr, q->key.flags)) {
> +			ret = action(&q->key, mode);
> +			/*
> +			 * Ensure that clear_bit() + wake_up() right after
> +			 * test_and_set_bit() below can't see us; it should
> +			 * wake up another exclusive waiter if we fail.
> +			 */
> +			if (ret)
> +				finish_wait(wq, &q->wait);
> +		}
> +		if (!test_and_set_bit(q->key.bit_nr, q->key.flags)) {
So this is the actual difference: instead of failing the lock and
aborting on signal, we acquire the lock if possible. If it's not
possible, someone else has it, which guarantees that someone else will
do an unlock, which implies another wakeup, and life goes on.
> +			if (!ret)
> +				finish_wait(wq, &q->wait);
> +			return 0;
> +		} else if (ret) {
> +			return ret;
> +		}
> +	}
>  }
> I am not sure we even want to conditionalize both finish_wait()'s,
> we could simply call it unconditionally and once before test_and_set(),
> the spurious wakeup is unlikely case.

	ret = 0;
	for (;;) {
		prepare_to_wait_exclusive(wq, &q->wait, mode);
		if (test_bit(q->key.bit_nr, q->key.flags))
			ret = action(&q->key, mode);
		if (!test_and_set_bit(q->key.bit_nr, q->key.flags)) {
			/* we got the lock anyway, ignore the signal */
			ret = 0;
			break;
		}
		if (ret)
			break;
	}
	finish_wait(wq, &q->wait);
	return ret;

Would not that work too?
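
To make the control flow of that loop easier to poke at, here is a minimal userspace sketch. Everything prefixed `model_` and both `action_*` callbacks are hypothetical stand-ins: the bit helpers are plain non-atomic code, the wait-queue calls are reduced to comments, and the action callback fakes "what happened while we slept". It only illustrates the return-value logic, not the locking or ordering guarantees of the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the wait-bit key; not the kernel structure. */
struct model_key {
	unsigned long flags;	/* word holding the lock bit */
	int bit_nr;
};

typedef int (*model_action_f)(struct model_key *key);

/* Non-atomic stand-ins for test_bit() and test_and_set_bit(). */
static bool model_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

static bool model_test_and_set_bit(int nr, unsigned long *addr)
{
	bool old = model_test_bit(nr, addr);

	*addr |= 1UL << nr;
	return old;
}

/* Loop shape from the mail, minus the wait-queue calls. */
static int model_wait_on_bit_lock(struct model_key *key, model_action_f action)
{
	int ret = 0;

	assert(action != NULL);
	for (;;) {
		/* prepare_to_wait_exclusive() would go here */
		if (model_test_bit(key->bit_nr, &key->flags))
			ret = action(key);
		if (!model_test_and_set_bit(key->bit_nr, &key->flags)) {
			ret = 0;	/* got the lock anyway, ignore the signal */
			break;
		}
		if (ret)
			break;
	}
	/* finish_wait() would go here */
	return ret;
}

/* The holder unlocks while we "sleep", and a signal is pending too. */
static int action_unlocked_and_signalled(struct model_key *key)
{
	key->flags &= ~(1UL << key->bit_nr);
	return -EINTR;
}

/* A signal is pending and the lock is still held. */
static int action_signalled(struct model_key *key)
{
	(void)key;
	return -EINTR;
}
```

With the bit initially set, `action_unlocked_and_signalled` makes the call return 0 with the bit set again (the signal is ignored because the lock was taken), while `action_signalled` makes it return -EINTR and leaves the bit owned by whoever held it.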