From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
linux-kernel <linux-kernel@vger.kernel.org>, x86 <x86@kernel.org>,
Nadav Amit <namit@vmware.com>, paulmck <paulmck@linux.ibm.com>,
Peter Zijlstra <peterz@infradead.org>,
Will Deacon <will.deacon@arm.com>
Subject: Re: [PATCH] cpu/hotplug: Cache number of online CPUs
Date: Fri, 5 Jul 2019 11:38:53 -0400 (EDT) [thread overview]
Message-ID: <824482130.8027.1562341133252.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20190705084910.GA6592@gmail.com>
----- On Jul 5, 2019, at 4:49 AM, Ingo Molnar mingo@kernel.org wrote:
> * Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
>
>> ----- On Jul 4, 2019, at 6:33 PM, Thomas Gleixner tglx@linutronix.de wrote:
>>
>> > On Thu, 4 Jul 2019, Mathieu Desnoyers wrote:
>> >> ----- On Jul 4, 2019, at 5:10 PM, Thomas Gleixner tglx@linutronix.de wrote:
>> >> >
>> >> > num_online_cpus() is racy today vs. CPU hotplug operations as
>> >> > long as you don't hold the hotplug lock.
>> >>
>> >> Fair point, AFAIU none of the loads performed within num_online_cpus()
>> >> seem to rely on atomic nor volatile accesses. So not using a volatile
>> >> access to load the cached value should not introduce any regression.
>> >>
>> >> I'm concerned that some code may rely on re-fetching of the cached
>> >> value between iterations of a loop. The lack of READ_ONCE() would
>> >> let the compiler keep a lifted load within a register and never
>> >> re-fetch, unless there is a cpu_relax() or a barrier() within the
>> >> loop.
>> >
>> > If someone really wants to write code which can handle concurrent CPU
>> > hotplug operations and rely on that information, then it's probably better
>> > to write out:
>> >
>> > ncpus = READ_ONCE(__num_online_cpus);
>> >
>> > explicitely along with a big fat comment.
>> >
>> > I can't figure out why one wants to do that and how it is supposed to work,
>> > but my brain is in shutdown mode already :)
>> >
>> > I'd rather write a proper kernel doc comment for num_online_cpus() which
>> > explains what the constraints are instead of pretending that the READ_ONCE
>> > in the inline has any meaning.
>>
>> The other aspect I am concerned about is freedom given to the compiler
>> to perform the store to __num_online_cpus non-atomically, or the load
>> non-atomically due to memory pressure.
>
> What connection does "memory pressure" have to what the compiler does?
>
> Did you confuse it with "register pressure"?
My brain wires must have been crossed when I wrote that. Indeed, I mean
register pressure.
>
>> Is that something we should be concerned about ?
>
> Once I understand it :)
>
>> I thought we had WRITE_ONCE and READ_ONCE to take care of that kind of
>> situation.
>
> Store and load tearing is one of the minor properties of READ_ONCE() and
> WRITE_ONCE() - the main properties are the ordering guarantees.
>
> Since __num_online_cpus is neither weirdly aligned nor is it written via
> constants I don't see how load/store tearing could occur. Can you outline
> such a scenario?
I'm referring to the examples provided in Documentation/core-api/atomic_ops.rst,
e.g.:
"For another example, consider the following code::
tmp_a = a;
do_something_with(tmp_a);
do_something_else_with(tmp_a);
If the compiler can prove that do_something_with() does not store to the
variable a, then the compiler is within its rights to manufacture an
additional load as follows::
tmp_a = a;
do_something_with(tmp_a);
tmp_a = a;
do_something_else_with(tmp_a);"
So if instead of "a" we have num_online_cpus(), the compiler would be
within its right to re-fetch the number of online cpus between the two
calls, which seems unexpected.
>
>> The semantic I am looking for here is C11's relaxed atomics.
>
> What does this mean?
C11 states:
"Atomic operations specifying memory_order_relaxed are relaxed only with respect
to memory ordering. Implementations must still guarantee that any given atomic access
to a particular atomic object be indivisible with respect to all other atomic accesses
to that object."
So I am concerned that num_online_cpus() as proposed in this patch
try to access __num_online_cpus non-atomically, and without using
READ_ONCE().
Similarly, the update-side should use WRITE_ONCE(). Protecting with a mutex
does not provide mutual exclusion against concurrent readers of that variable.
Again based on Documentation/core-api/atomic_ops.rst:
"For a final example, consider the following code, assuming that the
variable a is set at boot time before the second CPU is brought online
and never changed later, so that memory barriers are not needed::
if (a)
b = 9;
else
b = 42;
The compiler is within its rights to manufacture an additional store
by transforming the above code into the following::
b = 42;
if (a)
b = 9;
This could come as a fatal surprise to other code running concurrently
that expected b to never have the value 42 if a was zero. To prevent
the compiler from doing this, write something like::
if (a)
WRITE_ONCE(b, 9);
else
WRITE_ONCE(b, 42);"
If the compiler decides to manufacture additional stores when updating
the __num_online_cpus cached value, then loads could observe this
intermediate value because they fetch it without holding any mutex.
Thanks,
Mathieu
>
> Thanks,
>
> Ingo
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
next prev parent reply other threads:[~2019-07-05 15:38 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-04 20:42 [PATCH] cpu/hotplug: Cache number of online CPUs Thomas Gleixner
2019-07-04 20:59 ` Mathieu Desnoyers
2019-07-04 21:10 ` Thomas Gleixner
2019-07-04 22:00 ` Mathieu Desnoyers
2019-07-04 22:33 ` Thomas Gleixner
2019-07-04 23:34 ` Mathieu Desnoyers
2019-07-05 8:49 ` Ingo Molnar
2019-07-05 15:38 ` Mathieu Desnoyers [this message]
2019-07-05 20:53 ` Thomas Gleixner
2019-07-05 21:00 ` Thomas Gleixner
2019-07-06 23:24 ` Mathieu Desnoyers
2019-07-08 13:43 ` [PATCH V2] " Thomas Gleixner
2019-07-08 14:07 ` Paul E. McKenney
2019-07-08 14:20 ` Thomas Gleixner
2019-07-09 14:23 ` [PATCH V3] " Thomas Gleixner
2019-07-09 15:52 ` Mathieu Desnoyers
2019-07-22 7:58 ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2019-07-25 14:11 ` tip-bot for Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=824482130.8027.1562341133252.JavaMail.zimbra@efficios.com \
--to=mathieu.desnoyers@efficios.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=namit@vmware.com \
--cc=paulmck@linux.ibm.com \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
--cc=will.deacon@arm.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.