From: Segher Boessenkool <segher@kernel.crashing.org>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Paul Mackerras <paulus@samba.org>,
linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible
Date: Mon, 20 Sep 2021 16:23:04 -0500 [thread overview]
Message-ID: <20210920212303.GZ1583@gate.crashing.org> (raw)
In-Reply-To: <db9d01d5c543c5add4b2beadb03d39e99c7ada2c.1632126669.git.christophe.leroy@csgroup.eu>
Hi!
On Mon, Sep 20, 2021 at 10:31:17AM +0200, Christophe Leroy wrote:
> Today we get the following code generation for bitops like
> set or clear bit:
>
> c0009fe0: 39 40 08 00 li r10,2048
> c0009fe4: 7c e0 40 28 lwarx r7,0,r8
> c0009fe8: 7c e7 53 78 or r7,r7,r10
> c0009fec: 7c e0 41 2d stwcx. r7,0,r8
>
> c000d568: 39 00 18 00 li r8,6144
> c000d56c: 7c c0 38 28 lwarx r6,0,r7
> c000d570: 7c c6 40 78 andc r6,r6,r8
> c000d574: 7c c0 39 2d stwcx. r6,0,r7
>
> Most set bits are constant on lower 16 bits, so it can easily
> be replaced by the "immediate" version of the operation. Allow
> GCC to choose between the normal or immediate form.
You can also handle the second sixteen bits (the "shifted" half), by
using oris etc. The "%eN" output modifier prints an "s" for this:
/* If the low 16 bits are 0, but some other bit is set, write 's'. */
But this doesn't handle non-constant arguments, so you're likely better
off using what you have noe.
> For clear bits, on 32 bits 'rlwinm' can be used instead of 'andc' for
> when all bits to be cleared are consecutive.
Or when all you want to keep are consecutive (you do handle that now :-) )
> On 64 bits we don't have any equivalent single operation for clearing,
> single bits or a few bits, we'd need two 'rldicl' so it is not
> worth it, the li/andc sequence is doing the same.
You can use rlwinm whenever you want to clear all top 32 bits.
A sometimes nice idiom is ori x,x,N ; xori x,x,N to clear the bits N
(or oris/xoris). But it's two insns no matter what (but no spare
register is needed).
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> +static inline unsigned long test_and_clear_bits(unsigned long mask, volatile unsigned long *_p)
> +{
> + unsigned long old, t;
> + unsigned long *p = (unsigned long *)_p;
> +
> + if (IS_ENABLED(CONFIG_PPC32) &&
> + __builtin_constant_p(mask) && is_rlwinm_mask_valid(mask)) {
is_rlwinm_mask_valid(~mask)? So that test_and_clear_bits(0, ...) will
work with rlwinm, and test_and_clear_bits(0xffffffff, ...) will not make
gas scream bloody murder ("illegal bitmask"). Tha mask you pass to the
instruction is ~mask after all.
Looks great except that one nit. Thanks :-)
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Segher
WARNING: multiple messages have this Message-ID (diff)
From: Segher Boessenkool <segher@kernel.crashing.org>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Paul Mackerras <paulus@samba.org>,
Michael Ellerman <mpe@ellerman.id.au>,
linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible
Date: Mon, 20 Sep 2021 16:23:04 -0500 [thread overview]
Message-ID: <20210920212303.GZ1583@gate.crashing.org> (raw)
In-Reply-To: <db9d01d5c543c5add4b2beadb03d39e99c7ada2c.1632126669.git.christophe.leroy@csgroup.eu>
Hi!
On Mon, Sep 20, 2021 at 10:31:17AM +0200, Christophe Leroy wrote:
> Today we get the following code generation for bitops like
> set or clear bit:
>
> c0009fe0: 39 40 08 00 li r10,2048
> c0009fe4: 7c e0 40 28 lwarx r7,0,r8
> c0009fe8: 7c e7 53 78 or r7,r7,r10
> c0009fec: 7c e0 41 2d stwcx. r7,0,r8
>
> c000d568: 39 00 18 00 li r8,6144
> c000d56c: 7c c0 38 28 lwarx r6,0,r7
> c000d570: 7c c6 40 78 andc r6,r6,r8
> c000d574: 7c c0 39 2d stwcx. r6,0,r7
>
> Most set bits are constant on lower 16 bits, so it can easily
> be replaced by the "immediate" version of the operation. Allow
> GCC to choose between the normal or immediate form.
You can also handle the second sixteen bits (the "shifted" half), by
using oris etc. The "%eN" output modifier prints an "s" for this:
/* If the low 16 bits are 0, but some other bit is set, write 's'. */
But this doesn't handle non-constant arguments, so you're likely better
off using what you have noe.
> For clear bits, on 32 bits 'rlwinm' can be used instead of 'andc' for
> when all bits to be cleared are consecutive.
Or when all you want to keep are consecutive (you do handle that now :-) )
> On 64 bits we don't have any equivalent single operation for clearing,
> single bits or a few bits, we'd need two 'rldicl' so it is not
> worth it, the li/andc sequence is doing the same.
You can use rlwinm whenever you want to clear all top 32 bits.
A sometimes nice idiom is ori x,x,N ; xori x,x,N to clear the bits N
(or oris/xoris). But it's two insns no matter what (but no spare
register is needed).
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> +static inline unsigned long test_and_clear_bits(unsigned long mask, volatile unsigned long *_p)
> +{
> + unsigned long old, t;
> + unsigned long *p = (unsigned long *)_p;
> +
> + if (IS_ENABLED(CONFIG_PPC32) &&
> + __builtin_constant_p(mask) && is_rlwinm_mask_valid(mask)) {
is_rlwinm_mask_valid(~mask)? So that test_and_clear_bits(0, ...) will
work with rlwinm, and test_and_clear_bits(0xffffffff, ...) will not make
gas scream bloody murder ("illegal bitmask"). Tha mask you pass to the
instruction is ~mask after all.
Looks great except that one nit. Thanks :-)
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Segher
next prev parent reply other threads:[~2021-09-20 21:25 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-20 8:31 [PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible Christophe Leroy
2021-09-20 8:31 ` Christophe Leroy
2021-09-20 8:31 ` [PATCH v4 2/3] powerpc/atomics: " Christophe Leroy
2021-09-20 8:31 ` Christophe Leroy
2021-09-20 8:31 ` [PATCH v4 3/3] powerpc/atomics: Remove atomic_inc()/atomic_dec() and friends Christophe Leroy
2021-09-20 8:31 ` Christophe Leroy
2021-09-20 21:23 ` Segher Boessenkool [this message]
2021-09-20 21:23 ` [PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible Segher Boessenkool
2021-09-21 15:15 ` Christophe Leroy
2021-09-21 15:15 ` Christophe Leroy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210920212303.GZ1583@gate.crashing.org \
--to=segher@kernel.crashing.org \
--cc=christophe.leroy@csgroup.eu \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=paulus@samba.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.