* [GIT PULL] percpu fixes for v4.14-rc3
@ 2017-10-03 13:26 Tejun Heo
2017-10-03 17:27 ` Linus Torvalds
0 siblings, 1 reply; 3+ messages in thread
From: Tejun Heo @ 2017-10-03 13:26 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, Christoph Lameter, Dennis Zhou
Hello, Linus.
Rather important fixes this time.
* The new percpu area allocator had a subtle bug in how it iterates
the memory regions and could skip viable areas, which led to
allocation failures for module static percpu variables. Dennis
fixed the bug and another non-critical one in stat calculation.
* Mark noticed that the generic implementations of percpu local atomic
reads aren't properly protected against irqs and there's a (slim)
chance for split reads on some 32bit systems. Generic
implementations are updated to disable irq when read size is larger
than ulong size. This may have made some 32bit archs which can do
atomic local 64bit accesses generate sub-optimal code. We need to
find them out and implement arch-specific overrides.
Thanks.
The following changes since commit e365806ac289457263a133bd32df8df49897f612:
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs (2017-09-25 18:24:14 -0700)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git for-4.14-fixes
for you to fetch changes up to 1fa4df3e688902d033dfda796eb83ae6ad8d0488:
percpu: fix iteration to prevent skipping over block (2017-09-28 07:39:27 -0700)
----------------------------------------------------------------
Dennis Zhou (2):
percpu: fix starting offset for chunk statistics traversal
percpu: fix iteration to prevent skipping over block
Mark Rutland (1):
percpu: make this_cpu_generic_read() atomic w.r.t. interrupts
include/asm-generic/percpu.h | 24 ++++++++++++++++++++++--
mm/percpu-stats.c | 2 +-
mm/percpu.c | 4 ++++
3 files changed, 27 insertions(+), 3 deletions(-)
--
tejun
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [GIT PULL] percpu fixes for v4.14-rc3
2017-10-03 13:26 [GIT PULL] percpu fixes for v4.14-rc3 Tejun Heo
@ 2017-10-03 17:27 ` Linus Torvalds
2017-10-03 18:32 ` Tejun Heo
0 siblings, 1 reply; 3+ messages in thread
From: Linus Torvalds @ 2017-10-03 17:27 UTC (permalink / raw)
To: Tejun Heo; +Cc: Linux Kernel Mailing List, Christoph Lameter, Dennis Zhou
On Tue, Oct 3, 2017 at 6:26 AM, Tejun Heo <tj@kernel.org> wrote:
>
> * Mark noticed that the generic implementations of percpu local atomic
> reads aren't properly protected against irqs and there's a (slim)
> chance for split reads on some 32bit systems.
Grr.
Do we really want to support 64-bit percpu operations on 32-bit architectures?
It does kind of break the whole point of percpu operations, and I
would like to point out that I find things like
static DEFINE_PER_CPU(u64, running_sample_length);
which is preceded by a comment that talks about how this is accessed
from critical code and explicitly mentions NMI's.
So protection them against interrupts isn't actually going to *fix* anything.
Doing a
git grep DEFINE_PER_CPU.*64
isn't likely to find everything, but maybe it's a representative
sample. There aren't that many of those things, and some of them are
very much ok (ie only 64-bit architectures, or explicitly using
"atomic64_t" to avoid access issues)
I dunno. I have pulled you change, but it does make me go "people are
doing something wrong".
Maybe we could just aim to disallow everything but CPU-native accesses?
Linus
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [GIT PULL] percpu fixes for v4.14-rc3
2017-10-03 17:27 ` Linus Torvalds
@ 2017-10-03 18:32 ` Tejun Heo
0 siblings, 0 replies; 3+ messages in thread
From: Tejun Heo @ 2017-10-03 18:32 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Christoph Lameter, Dennis Zhou
Hello, Linus.
On Tue, Oct 03, 2017 at 10:27:42AM -0700, Linus Torvalds wrote:
> On Tue, Oct 3, 2017 at 6:26 AM, Tejun Heo <tj@kernel.org> wrote:
> >
> > * Mark noticed that the generic implementations of percpu local atomic
> > reads aren't properly protected against irqs and there's a (slim)
> > chance for split reads on some 32bit systems.
>
> Grr.
>
> Do we really want to support 64-bit percpu operations on 32-bit architectures?
I don't know. However, AFAICS, no 32bit arch provides 64bit
optimizations and we've been doing the same irq-disable protection on
all !read percpu ops, so bringing reads in line is at least the
immediately right thing to do.
> It does kind of break the whole point of percpu operations, and I
> would like to point out that I find things like
>
> static DEFINE_PER_CPU(u64, running_sample_length);
>
> which is preceded by a comment that talks about how this is accessed
> from critical code and explicitly mentions NMI's.
>
> So protection them against interrupts isn't actually going to *fix* anything.
NMI users would have to do their own protection (running_sample_length
seems to be accessed only from NMI context) but there are quite a few
existing irq users which were risking a very slight chance of doing
corrupt split reads.
> Doing a
>
> git grep DEFINE_PER_CPU.*64
>
> isn't likely to find everything, but maybe it's a representative
> sample. There aren't that many of those things, and some of them are
> very much ok (ie only 64-bit architectures, or explicitly using
> "atomic64_t" to avoid access issues)
>
> I dunno. I have pulled you change, but it does make me go "people are
> doing something wrong".
>
> Maybe we could just aim to disallow everything but CPU-native accesses?
Here are a couple points to consider.
* On a lot of archs, most of percpu operations need to be protected
explicitly anyway regardless of size. This patch shifts things a
bit worse but not drastically. IOW, removing 64bit support on 32bit
isn't gonna remove most of explicit context protections.
* Using 64bit percpu ops is a bit of cop-out, where we trade off some
overhead on 32bit for performance / simplicity on 64bit, which
doesn't seem too different from what we do when we use explicit
64bit variables in general. And we do the latter quite a bit.
The question is whether we want to force percpu users to explicitly
worry about 32bit machines and shape the code accordingly, which can
possibly incur overhead / complexity on 64bit while resulting in
better code on 32bit.
Given that we need explicit protections on most operations anyway, I
lean towards keeping it. I don't think removing 64-on-32 support will
buy us anything noticeable enough to justify the inconvenience.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-10-03 18:33 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-03 13:26 [GIT PULL] percpu fixes for v4.14-rc3 Tejun Heo
2017-10-03 17:27 ` Linus Torvalds
2017-10-03 18:32 ` Tejun Heo
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.