Re: [PATCH v3 3/3] arm64, compiler-context-analysis: Permit alias analysis through __READ_ONCE() with CONFIG_LTO=y

public inbox for llvm@lists.linux.dev

From: David Laight <david.laight.linux@gmail.com>
To: Marco Elver <elver@google.com>
Cc: Will Deacon <will@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Boqun Feng <boqun.feng@gmail.com>,
	Waiman Long <longman@redhat.com>,
	Bart Van Assche <bvanassche@acm.org>,
	llvm@lists.linux.dev, Catalin Marinas <catalin.marinas@arm.com>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, kernel test robot <lkp@intel.com>,
	Boqun Feng <boqun@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH v3 3/3] arm64, compiler-context-analysis: Permit alias analysis through __READ_ONCE() with CONFIG_LTO=y
Date: Sun, 15 Feb 2026 23:18:16 +0000
Message-ID: <20260215231816.2398e4f5@pumpkin>
In-Reply-To: <CANpmjNOnOqq2k41=iNsgZLXAzspVQhcTZ88Nvms_c_0M+7b7YQ@mail.gmail.com>

On Sun, 15 Feb 2026 23:43:23 +0100
Marco Elver <elver@google.com> wrote:

> On Sun, 15 Feb 2026 at 23:16, David Laight <david.laight.linux@gmail.com> wrote:
> >
> > On Sun, 15 Feb 2026 22:55:44 +0100
> > Marco Elver <elver@google.com> wrote:
> >  
> > > On Fri, 6 Feb 2026 at 19:26, David Laight <david.laight.linux@gmail.com> wrote:  
> > > > On Fri, 6 Feb 2026 16:09:35 +0100
> > > > Marco Elver <elver@google.com> wrote:
> > > >  
> > > > >  On Wed, 4 Feb 2026 at 15:15, Will Deacon <will@kernel.org> wrote:  
> > > > > >
> > > > > > On Wed, Feb 04, 2026 at 02:14:00PM +0100, Peter Zijlstra wrote:  
> > > > > > > On Wed, Feb 04, 2026 at 11:46:02AM +0100, Marco Elver wrote:  
> > > > > > > > On Tue, 3 Feb 2026 at 12:47, Will Deacon <will@kernel.org> wrote:
> > > > > > > > [...]  
> > > > > > > > > > > What does GCC do with this? :/  
> > > > > > > > > >
> > > > > > > > > > GCC currently doesn't see it, LTO is clang only.  
> > > > > > > > >
> > > > > > > > > LTO is just one way that a compiler could end up breaking dependency
> > > > > > > > > chains, so I really want to maintain the option to enable this path for
> > > > > > > > > GCC in case we run into problems caused by other optimisations in future.  
> > > > > > > >
> > > > > > > > It will work for GCC, but only from GCC 11. Before that __auto_type
> > > > > > > > does not drop qualifiers:
> > > > > > > > https://godbolt.org/z/sc5bcnzKd (switch to GCC 11 to see it compile)
> > > > > > > >
> > > > > > > > So to summarize, all supported Clang versions deal with __auto_type
> > > > > > > > correctly for the fallback; GCC from version 11 does (kernel currently
> > > > > > > > supports GCC 8 and above). From GCC 14 and Clang 19 we have
> > > > > > > > __typeof_unqual__.
> > > > > > > >
> > > > > > > > I really don't see another way forward; there's no other good way to
> > > > > > > > solve this issue. I would advise against pessimizing new compilers and
> > > > > > > > features because maybe one day we might still want to enable this
> > > > > > > > version of READ_ONCE() for GCC 8-10.
> > > > > > > >
> > > > > > > > Should we one day choose to enable this READ_ONCE() version for GCC,
> > > > > > > > we will (a) either have bumped the minimum GCC version to 11+, or (b)
> > > > > > > > we can only do so from GCC 11. At this point GCC 11 was released 5
> > > > > > > > years ago!  
> > > > > > >
> > > > > > > There is, from this thread:
> > > > > > >
> > > > > > >   https://lkml.kernel.org/r/20260111182010.GH3634291@ZenIV
> > > > > > >
> > > > > > > another trick to strip qualifiers:
> > > > > > >
> > > > > > >   #define unqual_non_array(T) __typeof__(((T(*)(void))0)())
> > > > > > >
> > > > > > > which will work from GCC-8.4 onwards. Arguably, it should be possible to
> > > > > > > raise the minimum from 8 to 8.4 (IMO).  
> > > > >
> > > > > That looks like an interesting option.
> > > > >  
> > > > > > That sounds reasonable to me but I'm not usually the one to push back
> > > > > > on raising the minimum compiler version!
> > > > > >  
> > > > > > > But yes; in general I think it is fine to have 'old' compilers generate
> > > > > > > suboptimal code.  
> > > > > >
> > > > > > I'm absolutely fine with the codegen being terrible for ancient
> > > > > > toolchains as long as it's correct.  
> > > > >
> > > > > From that discussion a month ago and this one, it seems we need
> > > > > something to fix __unqual_scalar_typeof().
> > > > >
> > > > > What's the way forward?
> > > > >
> > > > > 1. Bump minimum GCC version to 8.4. Replace __unqual_scalar_typeof()
> > > > > for old compilers with the better unqual_non_array hack?
> > > > >
> > > > > 2. Leave __unqual_scalar_typeof() as-is. The patch "compiler: Use
> > > > > __typeof_unqual__() for __unqual_scalar_typeof()" will fix the codegen
> > > > > issues for new compilers. Doesn't fix not dropping 'const' for old
> > > > > compilers for non-scalar types, and requires localized workarounds
> > > > > (like this patch here).
> > > > >
> > > > > Either way we need a fix for this arm64 LTO version to fix the
> > > > > context-analysis "see through" the inline asm (how this patch series
> > > > > started).
> > > > >
> > > > > Option #1 needs a lot more due-diligence and testing that it all works
> > > > > for all compilers and configs (opening Pandora's Box :-)). For option
> > > > > #2 we just need these patches here to at least fix the acute issue
> > > > > with this arm64 LTO version.  
> > > >
> > > > Option 3.
> > > >
> > > > Look at where/why they are used and change the code to do it differently.
> > > > Don't forget the similar __unsigned_scalar_typeof() in bitfield.h.
> > > > (I posted a patch that nuked that one not long ago - used sizeof instead.)
> > > >
> > > > The one in minmax_array (in minmax.h) is particularly pointless.
> > > > The value 'suffers' integer promotion as soon as it is used, nothing
> > > > wrong with 'auto _x = x + 0' there.
> > > > That will work elsewhere.  
> > >
> > > Agreed that getting rid of __unqual_scalar_typeof() in favor of 'auto'
> > > where possible is the way to go.
> > >
> > > Unfortunately I spent the last week occasionally glancing at this
> > > arm64 READ_ONCE problem, and could not come up with something that
> > > avoids using typeof_unqual() or __unqual_scalar_typeof(). I'm inclined
> > > to go with the unqual_non_array hack, but not make this available as a
> > > macro for general use - we have too many of these horrid macros, don't
> > > want to add more to this hack pile.  
> >
> > Agreed, having to do such things inside what are already horrid 'functions'
> > is one thing, but when they get used in 'normal' code it is silly.
> >
> > Have you checked whether sizes other than 1, 2, 4 and 8 are ever used?
> > There aren't any in an x86-64 allmodconfig build and it used to be an error.
> > Even if there are a handful, having to use a different define wouldn't
> > really be an issue.
> > Removing that support would make READ_ONCE() easier to write/understand
> > and (hopefully) compile faster - there is a measurable cost for the
> > 'size check' in the x86-64 build, the arm LTO expansion must be significant.  
> 
> I found e.g. xen_get_runstate_snapshot_cpu_delta() uses the >8 byte
> case via __READ_ONCE(). READ_ONCE() itself is already restricted to <=
> 8 bytes (due to that static assert), but that itself uses the
> __READ_ONCE() helper which these patches were touching.

One thing that might reduce the cost of that static_assert is to move
the error function out of it - defining that in every expansion can't help.
A few places do that, but it really needs a helper - say:
#define compiletime_assert_fn(fn, msg) \
	__noreturn extern void fn(void) __compiletime_error(msg)
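
A minimal userspace sketch of how such a helper might be used, so that the
error function is declared once and each expansion contains only the size
test and the call. Everything prefixed demo_ is hypothetical, and the two
attribute macros are assumed stand-ins for the kernel's __noreturn and error
attribute, not the kernel definitions themselves:

```c
/* Assumed stand-ins for the kernel's attribute macros. */
#define demo_noreturn			__attribute__((__noreturn__))
#define demo_compiletime_error(msg)	__attribute__((__error__(msg)))

/* Declare an error function; meant to be used once, at file scope. */
#define compiletime_assert_fn(fn, msg) \
	demo_noreturn extern void fn(void) demo_compiletime_error(msg)

compiletime_assert_fn(demo_bad_read_size, "demo_read_int(): unsupported size");

/*
 * Per-expansion part: just the test and the call. The check relies on the
 * compiler discarding the dead call, so this sketch compiles it out when
 * optimisation is disabled.
 */
#ifdef __OPTIMIZE__
#define demo_check_size(x)					\
	do {							\
		if (sizeof(x) > sizeof(long long))		\
			demo_bad_read_size();			\
	} while (0)
#else
#define demo_check_size(x) do { } while (0)
#endif

/* Hypothetical user: a volatile read guarded by the shared size check. */
static inline int demo_read_int(const int *p)
{
	demo_check_size(*p);
	return *(const volatile int *)p;
}
```

Whether this makes a measurable difference to build time would need
checking; the sketch only shows the shape of the change.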

> 
> We could invert the game: have READ_ONCE() which just deals with <= 8
> bytes. And __READ_ONCE() which uses READ_ONCE() if <= 8 bytes, and the
> non-atomic case if >8 bytes. However, I fear the static size check
> won't go away because the asm-generic version of __READ_ONCE() happily
> works on any type (it's just a volatile cast+deref) - I don't know how
> we'd enforce the size limit otherwise.

That should probably be a NON_ATOMIC_READ_ONCE() that doesn't 'fall back'.
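
One possible reading of that split, as a userspace sketch rather than a
proposed patch: the size-checked macro accepts only 1, 2, 4 and 8 byte
objects, and the non-atomic one is a plain volatile cast and dereference
that never dispatches to the other. All DEMO_/demo_ names are hypothetical,
and the macros below are assumptions for illustration, not the kernel's
READ_ONCE() or __READ_ONCE():

```c
#include <stdint.h>

/* Hypothetical object that is too large for a single load. */
struct demo_snapshot {
	uint64_t a, b, c;
};

/* Size-checked variant: rejects anything but 1, 2, 4 or 8 bytes. */
#define DEMO_READ_ONCE(x)						\
	({								\
		_Static_assert(sizeof(x) == 1 || sizeof(x) == 2 ||	\
			       sizeof(x) == 4 || sizeof(x) == 8,	\
			       "DEMO_READ_ONCE(): unsupported size");	\
		*(const volatile __typeof__(x) *)&(x);			\
	})

/* Explicitly non-atomic variant: volatile cast + deref, no dispatch. */
#define DEMO_NON_ATOMIC_READ_ONCE(x) \
	(*(const volatile __typeof__(x) *)&(x))

static inline int demo_read_word(int *p)
{
	return DEMO_READ_ONCE(*p);
}

static inline struct demo_snapshot demo_read_snapshot(struct demo_snapshot *p)
{
	/* May be torn: the compiler is free to use several loads here. */
	return DEMO_NON_ATOMIC_READ_ONCE(*p);
}
```

With this shape, passing a struct demo_snapshot to DEMO_READ_ONCE() fails
to compile, so callers that want the large read have to ask for it by name.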

	David
