public inbox for llvm@lists.linux.dev
 help / color / mirror / Atom feed
From: "Paul Heidekrüger" <paul.heidekrueger@in.tum.de>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: paulmck@kernel.org, will@kernel.org, peterz@infradead.org,
	boqun.feng@gmail.com, parri.andrea@gmail.com,
	linux-kernel@vger.kernel.org, llvm@lists.linux.dev,
	elver@google.com, charalampos.mainas@gmail.com,
	pramod.bhatotia@in.tum.de
Subject: Re: Potentially Broken Address Dependency via test_bit() When Compiling With Clang
Date: Tue, 2 Nov 2021 19:35:57 +0100	[thread overview]
Message-ID: <YYGFDUR5dJ784P88@Pauls-MacBook-Pro> (raw)
In-Reply-To: <20211028143446.GA1351384@rowland.harvard.edu>

On Thu, Oct 28, 2021 at 10:34:46AM -0400, Alan Stern wrote:
> On Thu, Oct 28, 2021 at 02:37:47PM +0200, Paul Heidekrüger wrote:
> > On Wed, Oct 27, 2021 at 10:27:20AM -0400, Alan Stern wrote:
> > > On Wed, Oct 27, 2021 at 12:19:48PM +0200, Paul Heidekrüger wrote:
> 
> > > > Address dependency in source code, lines 373 - 375 in fs/afs/addr_list.c:
> > > > 
> > > > > [...]
> > > > >   index = READ_ONCE(ac->alist->preferred);
> > > > >   if (test_bit(index, &set))
> > > > >     goto selected;
> > > > > [...]
> > > > 
> > > > where test_bit() expands to the following in 
> > > > include/asm-generic/bitops/non-atomic.h, lines 115 - 122:
> > > > 
> > > > > static __always_inline int
> > > > > arch_test_bit(unsigned int nr, const volatile unsigned long *addr)
> > > > > {
> > > > >   return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
> > > > > }
> > > > > #define test_bit arch_test_bit
> 
> > > However, I can't follow the IR code.  Can you please explain in ordinary 
> > > English how the LLVM compiler manages to lose track of this dependency?
> > > 
> > > Alan Stern
> > 
> > Here's what we think might be going on:
> > - In 'arch_test_bit()', 'addr[BIT_WORD(nr)]' expands to 'addr[(nr) / 64]'.
> > - Since 'addr' points to an 'unsigned long', any other result than '0' for
> >   '(nr) / 64' would be out of bounds and therefore undefined.
> > - We assume LLVM is able to figure this out and use it to get rid of the
> >   address computation all together.
> 
> Ah, that explains it.  Yes, when set is a single unsigned long (or an 
> array of length 1), the address dependency is only syntactic, not 
> semantic.  As a result, we should expect that compilers will sometimes 
> not preserve it.

In LKMM's explanation.txt, lines 450 - 453 state:

> A read event and another memory access event are linked by an address
> dependency if the value obtained by the read affects the location
> accessed by the other event.

If we understand correctly, the ambiguity you're pointing out is that by
looking at 'addr[BIT_WORD(nr)]', one could deduce an address dependency by
seeing an array subscript operator, using a value which can be traced back to a
'READ_ONCE()' (syntactic).

However, since 'addr' points to an 'unsigned long', the 'READ_ONCE()' never
affects the location accessed in 'arch_test_bit()' as the offset computed in
the subscript operator can only be, as clang identified, '0' (semantic).

> The danger, of course, is that people relying on an ordering prescribed
> by the LKMM may get fooled because (unbeknownst to them) the dependency
> in question is not semantic.

Do you think this would warrant a change to LKMM's explanation.txt to make this
more explicit?

> It would be great if a static checker
> could detect such things -- but this would require some way for us to
> inform the checker about when the code does rely on a dependency
> ordering.

The compiler passes we're working on will hopefully be able to do exactly that,
not taking into account the programmer's intent of course.

Many thanks,
Paul

  reply	other threads:[~2021-11-02 18:43 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-27 10:19 Potentially Broken Address Dependency via test_bit() When Compiling With Clang Paul Heidekrüger
2021-10-27 11:56 ` David Laight
2021-10-27 12:17 ` Peter Zijlstra
2021-10-27 12:24   ` Marco Elver
2021-10-27 12:34   ` Peter Zijlstra
2021-10-27 14:27 ` Alan Stern
2021-10-28 12:37   ` Paul Heidekrüger
2021-10-28 14:34     ` Alan Stern
2021-11-02 18:35       ` Paul Heidekrüger [this message]
2021-11-02 19:01         ` Alan Stern
2021-11-04 18:16           ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YYGFDUR5dJ784P88@Pauls-MacBook-Pro \
    --to=paul.heidekrueger@in.tum.de \
    --cc=boqun.feng@gmail.com \
    --cc=charalampos.mainas@gmail.com \
    --cc=elver@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llvm@lists.linux.dev \
    --cc=parri.andrea@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pramod.bhatotia@in.tum.de \
    --cc=stern@rowland.harvard.edu \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox