public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Oltean <olteanv@gmail.com>
To: Dan Carpenter <error27@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	netdev@vger.kernel.org, kernel-janitors@vger.kernel.org
Subject: Re: [PATCH net] lib: packing: fix shift wrapping in bit_reverse()
Date: Wed, 7 Dec 2022 15:02:40 +0200	[thread overview]
Message-ID: <20221207130240.aoojta5n5enec72y@skbuf> (raw)
In-Reply-To: <20221207122254.otq7biekqz2nzhgl@skbuf>

On Wed, Dec 07, 2022 at 02:22:54PM +0200, Vladimir Oltean wrote:
> Wait a second, I deliberately wrote the code without conditionals.
> Let me look at the code disassembly before and after the patch and see
> what they look like.

Before (make lib/packing.lst on arm64):

	for (i = 0; i < width; i++) {
  1c:	310004ec 	adds	w12, w7, #0x1
  20:	540001c0 	b.eq	58 <adjust_for_msb_right_quirk+0x58>  // b.none
  24:	52800004 	mov	w4, #0x0                   	// #0
		bit = (val & (1 << i)) != 0;
  28:	5280002b 	mov	w11, #0x1                   	// #1
  2c:	d503201f 	nop
  30:	1ac42166 	lsl	w6, w11, w4
		new_val |= (bit << (width - i - 1));
  34:	4b0400e9 	sub	w9, w7, w4
		bit = (val & (1 << i)) != 0;
  38:	93407cc6 	sxtw	x6, w6		/* We don't want sign extension */
  3c:	ea0a00df 	tst	x6, x10
  40:	1a9f07e6 	cset	w6, ne  // ne = any
	for (i = 0; i < width; i++) {
  44:	6b07009f 	cmp	w4, w7
  48:	11000484 	add	w4, w4, #0x1
		new_val |= (bit << (width - i - 1));
  4c:	1ac920c6 	lsl	w6, w6, w9
  50:	aa060108 	orr	x8, x8, x6
	for (i = 0; i < width; i++) {
  54:	54fffee1 	b.ne	30 <adjust_for_msb_right_quirk+0x30>  // b.any


Then this modified code:

static u64 bit_reverse(u64 val, unsigned int width)
{
	u64 new_val = 0;
	unsigned int i;

	for (i = 0; i < width; i++)
		if (val & BIT_ULL(i))
			new_val |= BIT_ULL(width - i - 1);

	return new_val;
}

compiles to this:

	for (i = 0; i < width; i++) {
  1c:	310004eb 	adds	w11, w7, #0x1
  20:	540001c0 	b.eq	58 <adjust_for_msb_right_quirk+0x58>  // b.none
  24:	52800005 	mov	w5, #0x0                   	// #0
			new_val |= BIT_ULL(width - i - 1);
  28:	d280002a 	mov	x10, #0x1                   	// #1
  2c:	14000002 	b	34 <adjust_for_msb_right_quirk+0x34>	/* this is an unconditional jump to "sub w4, w7, w5", skipping the assignment to w5 */
  30:	2a0803e5 	mov	w5, w8			/* this is only hit by the backwards jump b.ne 30 at the end */
  34:	4b0500e4 	sub	w4, w7, w5
		if (val & BIT_ULL(i))
  38:	9ac52528 	lsr	x8, x9, x5
			new_val |= BIT_ULL(width - i - 1);
  3c:	f240011f 	tst	x8, #0x1
	for (i = 0; i < width; i++) {
  40:	110004a8 	add	w8, w5, #0x1
			new_val |= BIT_ULL(width - i - 1);
  44:	9ac42144 	lsl	x4, x10, x4
  48:	aa0400c4 	orr	x4, x6, x4
  4c:	9a861086 	csel	x6, x4, x6, ne  // ne = any
	for (i = 0; i < width; i++) {
  50:	6b0700bf 	cmp	w5, w7
  54:	54fffee1 	b.ne	30 <adjust_for_msb_right_quirk+0x30>  // b.any

which indeed has no branch in the main loop (between 0x30 and 0x54),
because as I explained, the "b 34" doesn't count - it's not in the loop.

Additionally, this rewritten code which is considerably more obscure in C:

static u64 bit_reverse(u64 val, unsigned int width)
{
	u64 new_val = 0;
	unsigned int i;

	for (i = 0; i < width; i++)
		new_val |= !!(val & BIT_ULL(i)) ?
			   BIT_ULL(width - i - 1) : 0;

	return new_val;
}

compiles to the exact same assembly code as the variant with "if":

	for (i = 0; i < width; i++)
  1c:	310004eb 	adds	w11, w7, #0x1
  20:	540001c0 	b.eq	58 <adjust_for_msb_right_quirk+0x58>  // b.none
  24:	52800005 	mov	w5, #0x0                   	// #0
		new_val |= !!(val & BIT_ULL(i)) ?
  28:	d280002a 	mov	x10, #0x1                   	// #1
  2c:	14000002 	b	34 <adjust_for_msb_right_quirk+0x34>
  30:	2a0803e5 	mov	w5, w8
  34:	4b0500e4 	sub	w4, w7, w5
  38:	9ac52528 	lsr	x8, x9, x5
  3c:	f240011f 	tst	x8, #0x1
	for (i = 0; i < width; i++)
  40:	110004a8 	add	w8, w5, #0x1
		new_val |= !!(val & BIT_ULL(i)) ?
  44:	9ac42144 	lsl	x4, x10, x4
  48:	aa0400c4 	orr	x4, x6, x4
  4c:	9a861086 	csel	x6, x4, x6, ne  // ne = any
	for (i = 0; i < width; i++)
  50:	6b0700bf 	cmp	w5, w7
  54:	54fffee1 	b.ne	30 <adjust_for_msb_right_quirk+0x30>  // b.any

So this is good with the "if".

  parent reply	other threads:[~2022-12-07 13:02 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-07 11:23 [PATCH net] lib: packing: fix shift wrapping in bit_reverse() Dan Carpenter
2022-12-07 12:19 ` Vladimir Oltean
2022-12-07 12:21   ` Dan Carpenter
2022-12-07 12:22     ` Vladimir Oltean
2022-12-07 12:51       ` Dan Carpenter
2022-12-07 13:06         ` Vladimir Oltean
2022-12-07 13:02       ` Vladimir Oltean [this message]
2022-12-07 19:41     ` David Laight
2022-12-08 16:58       ` Vladimir Oltean
2022-12-09  8:21 ` Uladzislau Koshchanka
2022-12-09 14:30   ` Vladimir Oltean
2022-12-09 21:01     ` Uladzislau Koshchanka
2022-12-09 22:06       ` Vladimir Oltean
2022-12-09 22:07         ` Vladimir Oltean
2022-12-10  0:44         ` [PATCH] lib: packing: replace bit_reverse() with bitrev8() Uladzislau Koshchanka
2022-12-12 23:04           ` Vladimir Oltean
2022-12-12 23:30           ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221207130240.aoojta5n5enec72y@skbuf \
    --to=olteanv@gmail.com \
    --cc=davem@davemloft.net \
    --cc=error27@gmail.com \
    --cc=kernel-janitors@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox