From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>, Al Viro <viro@ZenIV.linux.org.uk>,
Bart Van Assche <bvanassche@acm.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Neil Brown <neilb@suse.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] sched/wait: avoid abort_exclusive_wait() in __wait_on_bit_lock()
Date: Fri, 2 Sep 2016 14:06:02 +0200
Message-ID: <20160902120601.GA26495@redhat.com>
In-Reply-To: <20160901190141.GJ10138@twins.programming.kicks-ass.net>
On 09/01, Peter Zijlstra wrote:
>
> On Fri, Aug 26, 2016 at 02:45:52PM +0200, Oleg Nesterov wrote:
>
> > We do not need anything tricky to avoid the race,
>
> The race being:
>
> CPU0 CPU1 CPU2
>
> __wait_on_bit_lock()
> bit_wait_io()
> io_schedule()
>
> clear_bit_unlock()
> __wake_up_common(.nr_exclusive=1)
> list_for_each_entry()
> if (curr->func() && --nr_exclusive)
> break
>
> signal()
>
> if (signal_pending_state()) == TRUE
> return -EINTR
>
> And no progress because CPU1 exits without acquiring the lock and CPU0
> thinks its done because it woke someone.
Yes,
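
To make the lost wakeup concrete, here is a minimal single-threaded toy model of the interleaving above. It is only a sketch under stated assumptions: every name in it (toy_waiter, toy_bit_lock, toy_unlock, toy_naive_resume) is hypothetical and none of it is kernel code. It models a waiter that checks for a signal first and gives up without taking the lock or passing the wakeup on.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_waiter {
	bool queued;       /* still on the exclusive wait queue */
	bool woken;        /* consumed a wakeup */
	bool sig_pending;  /* a signal arrived while it slept */
};

struct toy_bit_lock {
	bool locked;               /* the lock bit */
	struct toy_waiter *q[2];   /* exclusive waiters, in queue order */
};

/*
 * Models clear_bit_unlock() followed by a wakeup with nr_exclusive == 1:
 * the first queued waiter is woken and auto-removed, then the walk stops.
 * Returns the number of waiters woken.
 */
int toy_unlock(struct toy_bit_lock *l)
{
	l->locked = false;
	for (size_t i = 0; i < 2; i++) {
		struct toy_waiter *w = l->q[i];

		if (w && w->queued) {
			w->queued = false;
			w->woken = true;
			return 1;
		}
	}
	return 0;
}

/*
 * The problematic waiter: after waking it checks for a signal first and
 * bails out without taking the lock and without passing the wakeup on.
 */
int toy_naive_resume(struct toy_bit_lock *l, struct toy_waiter *w)
{
	if (w->sig_pending)
		return -EINTR;
	if (l->locked)
		return -EBUSY;
	l->locked = true;
	return 0;
}
```

With two queued waiters, the first one having a signal pending, toy_unlock() wakes only the first. toy_naive_resume() then returns -EINTR, the bit stays clear, and the second waiter is still queued and never woken.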
> > we can just call finish_wait() if action() fails.
>
> That would be bit_wait*() returning -EINTR because sigpending.
Hmm. Not sure I understand... Let me reply just in case, even if
I am sure you get it right.
Yes, in the likely case we are going to fail with -EINTR, but only
if the test-and-set after that fails.
> Sure, you can always call that, first thing through the loop does
> prepare again, so no harm. That however does not connect to your
> condition,.. /me puzzled
If ->action() fails we will abort the loop in any case, so
prepare_to_wait_exclusive() won't be called again. So in this case
finish_wait() does the right thing.
> > test_and_set_bit() implies mb() so
> > the lockless list_empty_careful() case is fine, we can not miss the
> > condition if we race with unlock_page().
>
> You're talking about this ordering?:
>
> finish_wait() clear_bit_unlock();
> list_empty_careful()
>
> /* MB implied */ smp_mb__after_atomic();
> test_and_set_bit() wake_up_page()
> ...
> autoremove_wake_function()
> list_del_init();
>
>
> That could do with spelling out I feel.. :-)
Yes, yes.
> > __wait_on_bit_lock(wait_queue_head_t *wq, struct wait_bit_queue *q,
> > wait_bit_action_f *action, unsigned mode)
> > {
> > + int ret = 0;
> >
> > + for (;;) {
> > prepare_to_wait_exclusive(wq, &q->wait, mode);
> > + if (test_bit(q->key.bit_nr, q->key.flags)) {
> > + ret = action(&q->key, mode);
> > + /*
> > + * Ensure that clear_bit() + wake_up() right after
> > + * test_and_set_bit() below can't see us; it should
> > + * wake up another exclusive waiter if we fail.
> > + */
> > + if (ret)
> > + finish_wait(wq, &q->wait);
> > + }
> > + if (!test_and_set_bit(q->key.bit_nr, q->key.flags)) {
>
> So this is the actual difference, instead of failing the lock and
> aborting on signal, we acquire the lock if possible. If its not
> possible, someone else has it, which guarantees that someone else will
> do an unlock which implies another wakeup and life goes on.
Yes. This way we eliminate the need for the additional wake_up.
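
As an illustration of that control flow, here is a rough userspace model of the loop. It is a sketch, not the kernel implementation: struct model_wait, model_wait_on_bit_lock() and the three act_*() callbacks are hypothetical names, the wait queue is reduced to a single "queued" flag, and the tail of the loop (what happens after test_and_set_bit()) is not in the hunk quoted above, so that part is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; not the kernel data structures. */
struct model_wait {
	bool bit;     /* the lock bit */
	bool queued;  /* are we on the wait queue? */
};

typedef int (*model_action_f)(struct model_wait *m);

/* Sleep interrupted by a signal, the bit is still held by someone else. */
int act_signal_only(struct model_wait *m)
{
	(void)m;
	return -EINTR;
}

/* The owner unlocked while we slept, and a signal is pending as well. */
int act_unlock_then_signal(struct model_wait *m)
{
	m->bit = false;
	return -EINTR;
}

/* Plain wakeup: the owner unlocked, no signal. */
int act_unlock(struct model_wait *m)
{
	m->bit = false;
	return 0;
}

/*
 * Note: an action that returns 0 without the bit ever being cleared
 * would spin forever in this model, since nothing else can run.
 */
int model_wait_on_bit_lock(struct model_wait *m, model_action_f action)
{
	int ret = 0;

	for (;;) {
		m->queued = true;               /* prepare_to_wait_exclusive() */
		if (m->bit) {                   /* test_bit() */
			ret = action(m);
			if (ret)
				m->queued = false;  /* finish_wait() */
		}
		if (!m->bit) {                  /* test_and_set_bit() succeeds */
			m->bit = true;
			if (!ret)               /* assumed tail, not in the quoted hunk */
				m->queued = false;
			return 0;
		} else if (ret) {
			return ret;             /* the holder will unlock and wake */
		}
	}
}
```

In this model a signal that races with an unlock (act_unlock_then_signal) still ends with the lock taken and a return value of 0. With act_signal_only the bit is still held, so the function returns -EINTR after it has left the queue, and the wakeup that follows the holder's unlock goes to another waiter.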
Oleg.