public inbox for llvm@lists.linux.dev
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: "Paul Heidekrüger" <paul.heidekrueger@in.tum.de>
Cc: paulmck@kernel.org, will@kernel.org, boqun.feng@gmail.com,
	stern@rowland.harvard.edu, parri.andrea@gmail.com,
	linux-kernel@vger.kernel.org, llvm@lists.linux.dev,
	elver@google.com, charalampos.mainas@gmail.com,
	pramod.bhatotia@in.tum.de
Subject: Re: Potentially Broken Address Dependency via test_bit() When Compiling With Clang
Date: Wed, 27 Oct 2021 14:17:47 +0200	[thread overview]
Message-ID: <20211027121747.GI174703@worktop.programming.kicks-ass.net> (raw)
In-Reply-To: <YXknxGFjvaB46d/p@Pauls-MacBook-Pro>

On Wed, Oct 27, 2021 at 12:19:48PM +0200, Paul Heidekrüger wrote:
> Hi all,
> 
> For my bachelor thesis, I have been working on the infamous problem of 
> potentially broken dependency orderings in the Linux kernel. I'm being 
> advised by Marco Elver, Charalampos Mainas, Pramod Bhatotia (Cc'd).

Nice! Great to see someone working on this!

> For context, see: 
> https://linuxplumbersconf.org/event/7/contributions/821/attachments/598/1075/LPC_2020_--_Dependency_ordering.pdf
> 
> Our approach consists of two LLVM compiler passes which annotate 
> dependencies in unoptimised intermediate representation (IR) and verify 
> the annotated dependencies in optimised IR. ATM, the passes only 
> recognise a subset of address dependencies - everything is still WIP ;-)
> 
> We have been cross-compiling with a slightly modified version of 
> allyesconfig for arm64, and the passes have now found a case that we 
> would like to share with LKML for feedback: an address dependency being 
> broken (?) through compiler optimisations in 
> fs/afs/addr_list.c::afs_iterate_addresses().
> 
> Address dependency in source code, lines 373 - 375 in fs/afs/addr_list.c:
> 
> > [...]
> >   index = READ_ONCE(ac->alist->preferred);
> >   if (test_bit(index, &set))
> >     goto selected;
> > [...]
> 
> where test_bit() expands to the following in 
> include/asm-generic/bitops/non-atomic.h, lines 115 - 122:
> 
> > static __always_inline int
> > arch_test_bit(unsigned int nr, const volatile unsigned long *addr)
> > {
> >   return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
> > }
> > #define test_bit arch_test_bit
> 
> The address dependency gets preserved in unoptimised IR since the virtual register %33 transitively depends on %28:
> 
> > %28 = load volatile i8, i8* %preferred, align 2, !annotation !15
> > store i8 %28, i8* %tmp21, align 1
> > %29 = load i8, i8* %tmp21, align 1
> > %conv23 = zext i8 %29 to i32
> > store i32 %conv23, i32* %index, align 4
> > %30 = load i32, i32* %index, align 4
> > store i32 %30, i32* %nr.addr.i, align 4
> > store i64* %set, i64** %addr.addr.i, align 8
> > %31 = load i64*, i64** %addr.addr.i, align 8
> > %32 = load i32, i32* %nr.addr.i, align 4
> > %div.i = udiv i32 %32, 64
> > %idxprom.i = zext i32 %div.i to i64
> > %arrayidx.i = getelementptr i64, i64* %31, i64 %idxprom.i
> > %33 = load volatile i64, i64* %arrayidx.i, align 8, !annotation !16
> 
> In optimised IR, there is no dependency between the two volatile loads 
> anymore:
> 
> > %11 = load volatile i8, i8* %preferred, align 2, !annotation !19
> > %conv25 = zext i8 %11 to i32
> > %set.0. = load volatile i64, i64* %set, align 8
> 
> Now, since @nr traces back to the READ_ONCE() to @index, does this make 
> the load from @addr in test_bit() address-dependent on that READ_ONCE()? 
> Should the load from @addr therefore be ordered against the READ_ONCE()?

I would personally not consider this a dependend load. The result
depends on two loads, but there is no actual ordering between them.

  r1 = *x
  r2 = *y
  b = 1 & (r1 >> r2);

(more or less)

A dependent load would be something where the address of the second load
depends on the value of the first load, eg:

  r1 = *x;
  r2 = *(y + r1);

typically derefencing or array accesses have this pattern. The canonical
example being rcu_dereference(), and is the reason Paul Mckenney is
arguing that pointers should carry dependecies; I'll let him refer to
the many C language papers on this.

Other examples, ones we're actually worried about the compiler breaking,
are, for example, the array access as found in __ktime_get_fast_ns():

	seq = READ_ONCE(tkf->seq);
	tkr = tkf->base + (seq & 1);
	now = tkr->...

Here the dependency is on an integer (seq), and worse, only a single bit
of it. If the compiler were this to transform into something like:

	seq = READ_ONCE(tkf->seq)
	if (seq & 1) {
		// use tkf->base[1]
	} else {
		// use tkf->base[0]
	}

Then it would be broken, since the condition doesn't order the two loads
and they can be re-ordered. Which in turn breaks the premise of the
seqcount_latch construct -- see the comment that goes with
raw_write_seqcount_latch() in seqlock.h.

hth,

~Peter


  parent reply	other threads:[~2021-10-27 12:17 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-27 10:19 Potentially Broken Address Dependency via test_bit() When Compiling With Clang Paul Heidekrüger
2021-10-27 11:56 ` David Laight
2021-10-27 12:17 ` Peter Zijlstra [this message]
2021-10-27 12:24   ` Marco Elver
2021-10-27 12:34   ` Peter Zijlstra
2021-10-27 14:27 ` Alan Stern
2021-10-28 12:37   ` Paul Heidekrüger
2021-10-28 14:34     ` Alan Stern
2021-11-02 18:35       ` Paul Heidekrüger
2021-11-02 19:01         ` Alan Stern
2021-11-04 18:16           ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211027121747.GI174703@worktop.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=boqun.feng@gmail.com \
    --cc=charalampos.mainas@gmail.com \
    --cc=elver@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=llvm@lists.linux.dev \
    --cc=parri.andrea@gmail.com \
    --cc=paul.heidekrueger@in.tum.de \
    --cc=paulmck@kernel.org \
    --cc=pramod.bhatotia@in.tum.de \
    --cc=stern@rowland.harvard.edu \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox