All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	David Howells <dhowells@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: wake_up_process implied memory barrier clarification
Date: Fri, 28 Aug 2015 18:06:37 +0200	[thread overview]
Message-ID: <20150828160637.GA4393@redhat.com> (raw)
In-Reply-To: <20150828145121.GG5301@dhcp22.suse.cz>

On 08/28, Michal Hocko wrote:
>
> On Thu 27-08-15 20:26:54, Oleg Nesterov wrote:
> > On 08/27, Michal Hocko wrote:
> > >
> > > --- a/Documentation/memory-barriers.txt
> > > +++ b/Documentation/memory-barriers.txt
> > > @@ -2031,6 +2031,9 @@ something up.  The barrier occurs before the task state is cleared, and so sits
> > >  	    <general barrier>		  STORE current->state
> > >  	LOAD event_indicated
> > >
> > > +Please note that wake_up_process is an exception here because it implies
> > > +the write memory barrier unconditionally.
> > > +
> >
> > I simply can't understand (can't even parse) this part of memory-barriers.txt.
>
> Do you mean the added text or the example above it?

Both ;)

but note that "load from X might see 0" is true of course, and in this
sense wake_up_process() is not exception:

	X = 1;
	wmb();	// unless I am totally confused this just adds more confusion
	Y = 1;
	wake_up_process(TASK);

vs TASK doing

	for (;;) {
		set_current_state(...);
		if (Y)
			break;
		schedule();
	}

	BUG_ON(X == 0)

is not correct, it can actually can hit the BUG_ON() above. However, if
wake_up_process() actually wakes a sleeping TASK up, then it should also
see X = 1. Even without wmb(), even if we do

	Y = 1;
	X = 1;
	wake_up_process(TASK);

> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -1967,8 +1967,7 @@ static void try_to_wake_up_local(struct task_struct *p)
> > >   *
> > >   * Return: 1 if the process was woken up, 0 if it was already running.
> > >   *
> > > - * It may be assumed that this function implies a write memory barrier before
> > > - * changing the task state if and only if any tasks are woken up.
> > > + * It may be assumed that this function implies a write memory barrier.
> > >   */
> >
> > I won't argue, technically this is correct of course. And I agree that
> > the old comment is misleading.
>
> Well the reason I've noticed is the following race in the scsi code
>     CPU0                                        CPU1
> scsi_error_handler                      scsi_host_dev_release
>                                           kthread_stop()
>   while (!kthread_should_stop()) {
>                                             set_bit(KTHREAD_SHOULD_STOP)
>                                             wake_up_process()
>                                             wait_for_completion()
>
>     set_current_state(TASK_INTERRUPTIBLE)
>     schedule()

Heh. This looks like a common mistake. See fecdf8be2d91e04b0a9a4f79ff06499 ;)

But I believe this is another thing.

> I have read the comment for wake_up_process and was wondering that
> moving set_current_state before kthread_should_stop wouldn't be enough
> because the the task at CPU0 might be TASK_RUNNIG and so wake_up_process
> wouldn't wake up it and the missing write barrier could lead to a missed
> KTHREAD_SHOULD_STOP.

And that is why try_to_wake_up()->smp_mb__before_spinlock() needs to
serialize STORE(CONDITION) and the subsequent LOAD(p->state). The fact
that it actually does wmb() is just implementation detail, that is what
I tried to say.

> > To me, this comment should just explain that this function implies a barrier
> > but only in a sense that you do not need another one after CONDITION = T and
> > before wake_up_process().
>
> I have no objection against more precise wording here but what we have is just
> misleading.

Yes, yes, I agree. Just I do not know what exactly it should document.

Oleg.


  reply	other threads:[~2015-08-28 16:09 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-27 12:27 wake_up_process implied memory barrier clarification Michal Hocko
2015-08-27 12:43 ` Peter Zijlstra
2015-08-27 13:14   ` Michal Hocko
2015-08-27 18:26     ` Oleg Nesterov
2015-08-28 14:51       ` Michal Hocko
2015-08-28 16:06         ` Oleg Nesterov [this message]
2015-08-29  9:25           ` Boqun Feng
2015-08-29 14:27             ` Oleg Nesterov
2015-08-31  0:37               ` Boqun Feng
2015-08-31 18:33                 ` Oleg Nesterov
2015-08-31 20:37                   ` Paul E. McKenney
2015-09-01  3:40                     ` Boqun Feng
2015-09-01  4:03                       ` Paul E. McKenney
2015-09-01  9:59                       ` Oleg Nesterov
2015-09-01 14:50                         ` Boqun Feng
2015-09-01 16:39                           ` Oleg Nesterov
2015-09-02  1:10                             ` Boqun Feng
2015-09-07 17:06                               ` Oleg Nesterov
2015-09-08  0:22                                 ` Boqun Feng
2015-09-01  9:41                     ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150828160637.GA4393@redhat.com \
    --to=oleg@redhat.com \
    --cc=dhowells@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.