linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: "Nicholas Piggin" <npiggin@gmail.com>
To: "Michael Ellerman" <mpe@ellerman.id.au>, <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [RFC PATCH] powerpc: Optimise barriers for fully ordered atomics
Date: Tue, 16 Apr 2024 12:12:27 +1000	[thread overview]
Message-ID: <D0L6LU8KAPN2.1PBOHAD6WL1TE@gmail.com> (raw)
In-Reply-To: <878r1hd24t.fsf@mail.lhotse>

On Sat Apr 13, 2024 at 7:48 PM AEST, Michael Ellerman wrote:
> Nicholas Piggin <npiggin@gmail.com> writes:
> > "Fully ordered" atomics (RMW that return a value) are said to have a
> > full barrier before and after the atomic operation. This is implemented
> > as:
> >
> >   hwsync
> >   larx
> >   ...
> >   stcx.
> >   bne-
> >   hwsync
> >
> > This is slow on POWER processors because hwsync and stcx. require a
> > round-trip to the nest (~= L2 cache). The hwsyncs can be avoided with
> > the sequence:
> >
> >   lwsync
> >   larx
> >   ...
> >   stcx.
> >   bne-
> >   isync
> >
> > lwsync prevents all reorderings except store/load reordering, so the
> > larx could be execued ahead of a prior store becoming visible. However
> > the stcx. is a store, so it is ordered by the lwsync against all prior
> > access and if the value in memory had been modified since the larx, it
> > will fail. So the point at which the larx executes is not a concern
> > because the stcx. always verifies the memory was unchanged.
> >
> > The isync prevents subsequent instructions being executed before the
> > stcx. executes, and stcx. is necessarily visible to the system after
> > it executes, so there is no opportunity for it (or prior stores, thanks
> > to lwsync) to become visible after a subsequent load or store.
>
> AFAICS commit b97021f85517 ("powerpc: Fix atomic_xxx_return barrier
> semantics") disagrees.
>
> That was 2011, so maybe it's wrong or outdated?

Hmm, thanks for the reference. I didn't know about that.

isync or ordering execution / completion of a load after a previous
plain store doesn't guarantee ordering, because stores drain from
queues and become visible some time after they complete.

Okay, but I was thinking a successful stcx. should have to be visible.
I guess that's not true if it broke on P7. Maybe it's possible in the
architecture to have a successful stcx. not having this property though,
I find it pretty hard to read this part of the architecture. Clearly
it has to be visible to other processors performing larx/stcx.,
otherwise the canonical stcx. ; bne- ; isync sequence can't provide
mutual exclusion.

I wonder if the P7 breakage was caused by to the "wormhole coherency"
thing that was changed ("fixed") in POWER9. I'll have to look into it
a bit more. The cache coherency guy I was talking to retired before
answering :/ I'll have to find another victim.

I should try to break a P8 too.

>
> Either way it would be good to have some litmus tests to back this up.
>
> cheers
>
> ps. isn't there a rule against sending barrier patches after midnight? ;)

There should be if not.

Thanks,
Nick

      reply	other threads:[~2024-04-16  2:13 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-12 17:25 [RFC PATCH] powerpc: Optimise barriers for fully ordered atomics Nicholas Piggin
2024-04-13  9:48 ` Michael Ellerman
2024-04-16  2:12   ` Nicholas Piggin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D0L6LU8KAPN2.1PBOHAD6WL1TE@gmail.com \
    --to=npiggin@gmail.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).