From: Peter Zijlstra <peterz@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Douglas Hatch <doug.hatch@hp.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@kernel.org>, Waiman Long <Waiman.Long@hp.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"Norton, Scott J" <scott.norton@hp.com>,
Peter Anvin <hpa@zytor.com>,
"linux-tip-commits@vger.kernel.org"
<linux-tip-commits@vger.kernel.org>
Subject: Re: [tip:locking/core] locking/pvqspinlock: Replace xchg() by the more descriptive set_mb()
Date: Tue, 12 May 2015 10:45:29 +0200
Message-ID: <20150512084529.GC21418@twins.programming.kicks-ass.net>
In-Reply-To: <CA+55aFxre4JPFoPUC1UqsCvG8=nSka=AJMvFgqKzG8XiNWoD=A@mail.gmail.com>
On Mon, May 11, 2015 at 10:50:42AM -0700, Linus Torvalds wrote:
> On Mon, May 11, 2015 at 7:54 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Hmm, so I looked at the set_mb() definitions and I figure we want to do
> > something like the below, right?
>
> I don't think you need to do this for the non-smp cases.
Well, it's the store tearing thing again; we use WRITE_ONCE() in
smp_store_release() for the same reason. We want it to be a single
store.
> The whole
> thing is about smp memory ordering, so on UP you don't even need the
> WRITE_ONCE(), much less a barrier.
No, we actually need both still on UP.
Imagine the following sequence:
	for (;;) {
		set_current_state(TASK_KILLABLE);
		if (cond)
			break;
		schedule();
	}
	__set_current_state(TASK_RUNNING);

vs

	<IRQ>
	  wake_up_process(p);
As we know, set_current_state() is set_mb(), and thus will look like:
	current->state = TASK_KILLABLE;
	smp_mb();
	if (cond)
		break;
So without the WRITE_ONCE() we can get store tearing. Suppose our
compiler is insane and translates the store into four single-byte stores:

	current->state[0] = TASK_UNINTERRUPTIBLE;
	current->state[1] = TASK_WAKEKILL >> 8;
	current->state[2] = 0;
	current->state[3] = 0;
The obvious failure here is for the wakeup interrupt to land between [0]
and [1]:
	current->state[0] = TASK_UNINTERRUPTIBLE;
	<IRQ>
	  wake_up_process(p);
	    p->state = TASK_RUNNING;
	current->state[1] = TASK_WAKEKILL >> 8;
	current->state[2] = 0;
	current->state[3] = 0;
With the end result that ->state == TASK_WAKEKILL, from which we'll not
wake up unless killed.
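
To make the tearing scenario concrete, here is a rough userspace sketch
of it. Everything in it is an assumption made for illustration:
torn_store(), whole_store_then_irq() and fake_wakeup() are hypothetical
helpers, the "interrupt" is just a function call, and the TASK_* values
are picked so that TASK_WAKEKILL sits in the second byte as in the
example above; the real constants live in the kernel headers and have
differed between versions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only, not copied from any particular kernel. */
#define TASK_RUNNING          0x0000u
#define TASK_UNINTERRUPTIBLE  0x0002u
#define TASK_WAKEKILL         0x0100u
#define TASK_KILLABLE         (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)

/* Hypothetical stand-in for wake_up_process(): mark the task runnable. */
static void fake_wakeup(uint32_t *state)
{
	*state = TASK_RUNNING;
}

/*
 * Store val into *state one byte at a time, lowest byte first, and
 * deliver the fake wakeup right after byte irq_after has been written.
 * Pass -1 for no interrupt. Returns the final value of *state.
 */
static uint32_t torn_store(uint32_t *state, uint32_t val, int irq_after)
{
	assert(irq_after >= -1 && irq_after < 4);

	for (int i = 0; i < 4; i++) {
		uint32_t mask = 0xffu << (8 * i);

		/* Replace only byte i of the word. */
		*state = (*state & ~mask) | (val & mask);
		if (i == irq_after)
			fake_wakeup(state);
	}
	return *state;
}

/* A single, untorn store: the wakeup can only land before or after it. */
static uint32_t whole_store_then_irq(uint32_t *state, uint32_t val)
{
	*state = val;
	fake_wakeup(state);
	return *state;
}
```

Starting from TASK_RUNNING, torn_store(&s, TASK_KILLABLE, 0) leaves
TASK_WAKEKILL behind, which is the stuck state described above, while
whole_store_then_irq() ends in TASK_RUNNING because the wakeup
overwrites a complete value.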
Similarly, without the barrier(), our friendly compiler is allowed to
do:
	if (cond)
		break;
	current->state = TASK_KILLABLE;
	schedule();
Which we all know to be broken.
So no, set_mb() (or smp_store_mb()) very much does need the WRITE_ONCE()
and a barrier() on UP.
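
For reference, a minimal userspace sketch of what such a UP definition
could look like. The macros below are simplified approximations that
rely on GCC/Clang extensions, not the kernel's actual definitions;
store_mb_up(), struct fake_task and wait_step() are hypothetical names
used only here, and the TASK_KILLABLE value is illustrative.

```c
#include <assert.h>

/* Simplified userspace approximations; the kernel versions are more involved. */
#define barrier()          __asm__ __volatile__("" ::: "memory")
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/*
 * UP flavour: one volatile store, so the compiler may not split it,
 * followed by a compiler-only barrier, so later memory accesses cannot
 * be hoisted above it. No CPU fence is emitted.
 */
#define store_mb_up(var, value) \
	do { WRITE_ONCE(var, value); barrier(); } while (0)

#define TASK_RUNNING   0x0000u
#define TASK_KILLABLE  0x0102u	/* illustrative value */

/* Hypothetical stand-in for the bit of task_struct we care about. */
struct fake_task {
	unsigned int state;
};

/*
 * One pass of the wait loop. Returns 1 if the caller would go on to
 * schedule(), 0 if the condition was already true.
 */
static int wait_step(struct fake_task *t, const int *cond)
{
	store_mb_up(t->state, TASK_KILLABLE);	/* state first ... */
	if (READ_ONCE(*cond)) {			/* ... then the check */
		WRITE_ONCE(t->state, TASK_RUNNING);
		return 0;
	}
	return 1;
}
```

The point of the sketch is only the ordering inside wait_step(): the
state store is emitted as a single access, and the load of *cond stays
after it in the generated code.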