The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: David Laight <david.laight.linux@gmail.com>
To: Heiko Carstens <hca@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Juergen Christ <jchrist@linux.ibm.com>,
	linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org
Subject: Re: [PATCH v5 1/7] s390/percpu: Infrastructure for more efficient this_cpu operations
Date: Tue, 2 Jun 2026 14:32:09 +0100	[thread overview]
Message-ID: <20260602143209.1ce024e3@pumpkin> (raw)
In-Reply-To: <20260601150813.9109C90-hca@linux.ibm.com>

On Mon, 1 Jun 2026 17:08:13 +0200
Heiko Carstens <hca@linux.ibm.com> wrote:

> On Mon, Jun 01, 2026 at 04:49:35PM +0200, Alexander Gordeev wrote:
> > On Mon, Jun 01, 2026 at 03:27:50PM +0200, Heiko Carstens wrote:
> > ...  
> > > > > +	if ((insn & 0xff0f) != 0xe300)  
> > > > 
> > > > Any reason mask/not to check the register number as well?  
> > > 
> > > Just to save code. If there would be a mismatch with the percpu
> > > register, something serious would be wrong..  
> > 
> > Is it worth checking the disp then? ;)  
> 
> It is: the check makes sure this is an AG instruction, which adds the
> percpu offset from lowcore - by checking that the displacement is
> correct, as well as that the base register is zero.
> 
> There could be a different AG instruction within the inline assembly,
> for whatever reason.

Do you actually even need to check the instruction?

This sequence can only work for simple per-cpu accesses, so I don't
see a reason to let the specified register point anywhere other than the
base of the per-cpu data.

That means the process switch code can just load the register with the
base of the per-cpu data for the new cpu.
If that happens before the 'AG' is executed it won't matter.

The only reason would be to support non-offsettable memory accesses.
But it looks like the 'laag    %r5,%r2,0(%r4)' in the example has an
offset (of zero).
Probably only stops you doing a direct access of an array.

That would mean that needs_fixup goes in the bin and percpu_exit() becomes:
	...
	reg = regs->percpu_register;
	if (likely(!reg))
		return;
	lc->percpu_register = reg;
	regs->gprs[reg] = lc->percpu_offset
}

I guess I'm missing something?

-- David

> 
> > > > > + * Inline assemblies making use of this typically have a code sequence like:
> > > > > + *
> > > > > + *   MVIY_PERCPU(...) <- start of percpu code section
> > > > > + *   AG_ALT(...)      <- add percpu offset; must be the second instruction
> > > > > + *   atomic_op	      <- atomic op  
> > > >                  \t here, but should be spaces?  
> > > 
> > > I can't follow? We have tabs in comments all over the place in s390 code.  
> > 
> > The other '<-' comments below and above use spaces, but this one
> > mixes spaces with '\t'.  
> 
> Because it is not possible to use tabs there. We put tabs in our
> comments whenever possible.
> 
> > > > Probably it worth noting that no extra instructions between atomic_op
> > > > and MVIY_ALT may exist (e.g. in the future), especially ones that use
> > > > the percpu_register.  
> > > 
> > > That's not true. There may be additional instructions, they may just
> > > not use the same register that is used for the percpu variable.  
> > 
> > That is what I tried to say, but I also thought it is intentionally
> > only two instructions between mviy brackets allowed: ag + atomic_op.
> > Am I wrong here?  
> 
> Yes, you are wrong. You can use as many instructions as you want, as
> long as the inline assembly follows the rules with respect to the
> register which contains the percpu variable.
> 
> But seriously: all of this works only with atomic ops, so I don't see
> reason why one put more instructions there. We actively try to keep
> our inline assemblies as small as possible.
> 
> Even though with relocatable lowcore and alternatives the C code looks
> huge, while it only generates very few instructions.
> 
> > > But that was true before this patch as well, even though it might not
> > > have been "obvious".  
> > 
> > Hmm.. I do not get it. We would not get scheduled away before this patch,
> > aren't we?  
> 
> True. What I tried to say: before and after this patch there was and
> is no code which _uses_ the pointer of the percpu variable more than
> once. Accessing the percpu variable twice within such a section would
> be potentially broken, since between two accesses an interrupt / nmi
> could happen, which could modify the percpu var contents, which again
> could result in loss of information.
> 


  reply	other threads:[~2026-06-02 13:32 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-26  5:56 [PATCH v5 0/7] s390: Improve this_cpu operations Heiko Carstens
2026-05-26  5:56 ` [PATCH v5 1/7] s390/percpu: Infrastructure for more efficient " Heiko Carstens
2026-06-01  7:10   ` Alexander Gordeev
2026-06-01 13:27     ` Heiko Carstens
2026-06-01 14:49       ` Alexander Gordeev
2026-06-01 15:08         ` Heiko Carstens
2026-06-02 13:32           ` David Laight [this message]
2026-06-02 13:54             ` Heiko Carstens
2026-06-02 14:36               ` David Laight
2026-06-02 10:08   ` Alexander Gordeev
2026-05-26  5:56 ` [PATCH v5 2/7] s390/percpu: Add missing do { } while (0) constructs Heiko Carstens
2026-06-01  7:14   ` Alexander Gordeev
2026-05-26  5:56 ` [PATCH v5 3/7] s390/percpu: Use new percpu code section for arch_this_cpu_add() Heiko Carstens
2026-06-02 11:33   ` Alexander Gordeev
2026-05-26  5:56 ` [PATCH v5 4/7] s390/percpu: Use new percpu code section for arch_this_cpu_add_return() Heiko Carstens
2026-06-02 11:34   ` Alexander Gordeev
2026-05-26  5:57 ` [PATCH v5 5/7] s390/percpu: Use new percpu code section for arch_this_cpu_[and|or]() Heiko Carstens
2026-06-02 11:35   ` Alexander Gordeev
2026-05-26  5:57 ` [PATCH v5 6/7] s390/percpu: Provide arch_this_cpu_read() implementation Heiko Carstens
2026-06-02 11:40   ` Alexander Gordeev
2026-05-26  5:57 ` [PATCH v5 7/7] s390/percpu: Provide arch_this_cpu_write() implementation Heiko Carstens
2026-06-02 11:40   ` Alexander Gordeev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260602143209.1ce024e3@pumpkin \
    --to=david.laight.linux@gmail.com \
    --cc=agordeev@linux.ibm.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=jchrist@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=svens@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox