[GIT PULL] percpu fixes for v4.14-rc3

All of lore.kernel.org
 help / color / mirror / Atom feed

* [GIT PULL] percpu fixes for v4.14-rc3
@ 2017-10-03 13:26 Tejun Heo
  2017-10-03 17:27 ` Linus Torvalds
  0 siblings, 1 reply; 3+ messages in thread
From: Tejun Heo @ 2017-10-03 13:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, Christoph Lameter, Dennis Zhou

Hello, Linus.

Rather important fixes this time.

* The new percpu area allocator had a subtle bug in how it iterates
  the memory regions and could skip viable areas, which led to
  allocation failures for module static percpu variables.  Dennis
  fixed the bug and another non-critical one in stat calculation.

* Mark noticed that the generic implementations of percpu local atomic
  reads aren't properly protected against irqs and there's a (slim)
  chance for split reads on some 32bit systems.  Generic
  implementations are updated to disable irq when read size is larger
  than ulong size.  This may have made some 32bit archs which can do
  atomic local 64bit accesses generate sub-optimal code.  We need to
  find them out and implement arch-specific overrides.

Thanks.

The following changes since commit e365806ac289457263a133bd32df8df49897f612:

  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs (2017-09-25 18:24:14 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu.git for-4.14-fixes

for you to fetch changes up to 1fa4df3e688902d033dfda796eb83ae6ad8d0488:

  percpu: fix iteration to prevent skipping over block (2017-09-28 07:39:27 -0700)

----------------------------------------------------------------
Dennis Zhou (2):
      percpu: fix starting offset for chunk statistics traversal
      percpu: fix iteration to prevent skipping over block

Mark Rutland (1):
      percpu: make this_cpu_generic_read() atomic w.r.t. interrupts

 include/asm-generic/percpu.h | 24 ++++++++++++++++++++++--
 mm/percpu-stats.c            |  2 +-
 mm/percpu.c                  |  4 ++++
 3 files changed, 27 insertions(+), 3 deletions(-)

-- 
tejun

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [GIT PULL] percpu fixes for v4.14-rc3
  2017-10-03 13:26 [GIT PULL] percpu fixes for v4.14-rc3 Tejun Heo
@ 2017-10-03 17:27 ` Linus Torvalds
  2017-10-03 18:32   ` Tejun Heo
  0 siblings, 1 reply; 3+ messages in thread
From: Linus Torvalds @ 2017-10-03 17:27 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Linux Kernel Mailing List, Christoph Lameter, Dennis Zhou

On Tue, Oct 3, 2017 at 6:26 AM, Tejun Heo <tj@kernel.org> wrote:
>
> * Mark noticed that the generic implementations of percpu local atomic
>   reads aren't properly protected against irqs and there's a (slim)
>   chance for split reads on some 32bit systems.

Grr.

Do we really want to support 64-bit percpu operations on 32-bit architectures?

It does kind of break the whole point of percpu operations, and I
would like to point out that I find things like

   static DEFINE_PER_CPU(u64, running_sample_length);

which is preceded by a comment that talks about how this is accessed
from critical code and explicitly mentions NMI's.

So protection them against interrupts isn't actually going to *fix* anything.

Doing a

   git grep DEFINE_PER_CPU.*64

isn't likely to find everything, but maybe it's a representative
sample. There aren't that many of those things, and some of them are
very much ok (ie only 64-bit architectures, or explicitly using
"atomic64_t" to avoid access issues)

I dunno. I have pulled you change, but it does make me go "people are
doing something wrong".

Maybe we could just aim to disallow everything but CPU-native accesses?

              Linus

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [GIT PULL] percpu fixes for v4.14-rc3
  2017-10-03 17:27 ` Linus Torvalds
@ 2017-10-03 18:32   ` Tejun Heo
  0 siblings, 0 replies; 3+ messages in thread
From: Tejun Heo @ 2017-10-03 18:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, Christoph Lameter, Dennis Zhou

Hello, Linus.

On Tue, Oct 03, 2017 at 10:27:42AM -0700, Linus Torvalds wrote:
> On Tue, Oct 3, 2017 at 6:26 AM, Tejun Heo <tj@kernel.org> wrote:
> >
> > * Mark noticed that the generic implementations of percpu local atomic
> >   reads aren't properly protected against irqs and there's a (slim)
> >   chance for split reads on some 32bit systems.
> 
> Grr.
> 
> Do we really want to support 64-bit percpu operations on 32-bit architectures?

I don't know.  However, AFAICS, no 32bit arch provides 64bit
optimizations and we've been doing the same irq-disable protection on
all !read percpu ops, so bringing reads in line is at least the
immediately right thing to do.

> It does kind of break the whole point of percpu operations, and I
> would like to point out that I find things like
> 
>    static DEFINE_PER_CPU(u64, running_sample_length);
> 
> which is preceded by a comment that talks about how this is accessed
> from critical code and explicitly mentions NMI's.
>
> So protection them against interrupts isn't actually going to *fix* anything.

NMI users would have to do their own protection (running_sample_length
seems to be accessed only from NMI context) but there are quite a few
existing irq users which were risking a very slight chance of doing
corrupt split reads.

> Doing a
> 
>    git grep DEFINE_PER_CPU.*64
> 
> isn't likely to find everything, but maybe it's a representative
> sample. There aren't that many of those things, and some of them are
> very much ok (ie only 64-bit architectures, or explicitly using
> "atomic64_t" to avoid access issues)
> 
> I dunno. I have pulled you change, but it does make me go "people are
> doing something wrong".
> 
> Maybe we could just aim to disallow everything but CPU-native accesses?

Here are a couple points to consider.

* On a lot of archs, most of percpu operations need to be protected
  explicitly anyway regardless of size.  This patch shifts things a
  bit worse but not drastically.  IOW, removing 64bit support on 32bit
  isn't gonna remove most of explicit context protections.

* Using 64bit percpu ops is a bit of cop-out, where we trade off some
  overhead on 32bit for performance / simplicity on 64bit, which
  doesn't seem too different from what we do when we use explicit
  64bit variables in general.  And we do the latter quite a bit.

The question is whether we want to force percpu users to explicitly
worry about 32bit machines and shape the code accordingly, which can
possibly incur overhead / complexity on 64bit while resulting in
better code on 32bit.

Given that we need explicit protections on most operations anyway, I
lean towards keeping it.  I don't think removing 64-on-32 support will
buy us anything noticeable enough to justify the inconvenience.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-10-03 18:33 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-03 13:26 [GIT PULL] percpu fixes for v4.14-rc3 Tejun Heo
2017-10-03 17:27 ` Linus Torvalds
2017-10-03 18:32   ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.