From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Puranjay Mohan <puranjay@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>,
bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, puranjay12@gmail.com
Subject: Re: [PATCH bpf] arm32, bpf: reimplement sign-extension mov instruction
Date: Mon, 22 Apr 2024 12:14:13 +0100 [thread overview]
Message-ID: <ZiZGhVQx2ei/7Xlx@shell.armlinux.org.uk> (raw)
In-Reply-To: <20240419182832.27707-1-puranjay@kernel.org>
On Fri, Apr 19, 2024 at 06:28:32PM +0000, Puranjay Mohan wrote:
> The current implementation of the mov instruction with sign extension
> has the following problems:
>
> 1. It clobbers the source register if it is not stacked because it
> sign extends the source and then moves it to the destination.
> 2. If the dst_reg is stacked, the current code doesn't write the value
> back in case of 64-bit mov.
> 3. There is room for improvement by emitting fewer instructions.
>
> The steps for fixing this and the instructions emitted by the JIT are
> explained below with examples in all combinations:
>
> Case A: offset == 32:
> =====================
> Case A.1: src and dst are stacked registers:
> --------------------------------------------
> 1. Load src_lo into tmp_lo
> 2. Store tmp_lo into dst_lo
> 3. Sign extend tmp_lo into tmp_hi
> 4. Store tmp_hi to dst_hi
>
> Example: r3 = (s32)r3
> r3 is a stacked register
>
> ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
> // str to dst_lo is not emitted because src_lo == dst_lo
> asr r7, r6, #31 // Sign extend tmp_lo into tmp_hi
> str r7, [r11, #-12] // Store tmp_hi into r3_hi
>
> Case A.2: src is stacked but dst is not:
> ----------------------------------------
> 1. Load src_lo into dst_lo
> 2. Sign extend dst_lo into dst_hi
>
> Example: r6 = (s32)r3
> r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
>
> ldr r4, [r11, #-16] // Load r3_lo into r6_lo
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case A.3: src is not stacked but dst is stacked:
> ------------------------------------------------
> 1. Store src_lo into dst_lo
> 2. Sign extend src_lo into tmp_hi
> 3. Store tmp_hi to dst_hi
>
> Example: r3 = (s32)r6
> r3 is stacked and r6 maps to {ARM_R5, ARM_R4}
>
> str r4, [r11, #-16] // Store r6_lo to r3_lo
> asr r7, r4, #31 // Sign extend r6_lo into tmp_hi
> str r7, [r11, #-12] // Store tmp_hi to dest_hi
>
> Case A.4: Both src and dst are not stacked:
> -------------------------------------------
> 1. Mov src_lo into dst_lo
> 2. Sign extend src_lo into dst_hi
>
> Example: (bf) r6 = (s32)r6
> r6 maps to {ARM_R5, ARM_R4}
>
> // Mov not emitted because dst == src
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case B: offset != 32:
> =====================
> Case B.1: src and dst are stacked registers:
> --------------------------------------------
> 1. Load src_lo into tmp_lo
> 2. Sign extend tmp_lo according to offset.
> 3. Store tmp_lo into dst_lo
> 4. Sign extend tmp_lo into tmp_hi
> 5. Store tmp_hi to dst_hi
>
> Example: r9 = (s8)r3
> r9 and r3 are both stacked registers
>
> ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
> lsl r6, r6, #24 // Sign extend tmp_lo
> asr r6, r6, #24 // ..
> str r6, [r11, #-56] // Store tmp_lo to r9_lo
> asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
> str r7, [r11, #-52] // Store tmp_hi to r9_hi
>
> Case B.2: src is stacked but dst is not:
> ----------------------------------------
> 1. Load src_lo into dst_lo
> 2. Sign extend dst_lo according to offset.
> 3. Sign extend tmp_lo into dst_hi
>
> Example: r6 = (s8)r3
> r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
>
> ldr r4, [r11, #-16] // Load r3_lo to r6_lo
> lsl r4, r4, #24 // Sign extend r6_lo
> asr r4, r4, #24 // ..
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case B.3: src is not stacked but dst is stacked:
> ------------------------------------------------
> 1. Sign extend src_lo into tmp_lo according to offset.
> 2. Store tmp_lo into dst_lo.
> 3. Sign extend src_lo into tmp_hi.
> 4. Store tmp_hi to dst_hi.
>
> Example: r3 = (s8)r1
> r3 is stacked and r1 maps to {ARM_R3, ARM_R2}
>
> lsl r6, r2, #24 // Sign extend r1_lo to tmp_lo
> asr r6, r6, #24 // ..
> str r6, [r11, #-16] // Store tmp_lo to r3_lo
> asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
> str r7, [r11, #-12] // Store tmp_hi to r3_hi
>
> Case B.4: Both src and dst are not stacked:
> -------------------------------------------
> 1. Sign extend src_lo into dst_lo according to offset.
> 2. Sign extend dst_lo into dst_hi.
>
> Example: r6 = (s8)r1
> r6 maps to {ARM_R5, ARM_R4} and r1 maps to {ARM_R3, ARM_R2}
>
> lsl r4, r2, #24 // Sign extend r1_lo to r6_lo
> asr r4, r4, #24 // ..
> asr r5, r4, #31 // Sign extend r6_lo to r6_hi
>
> Fixes: fc832653fa0d ("arm32, bpf: add support for sign-extension mov instruction")
> Reported-by: syzbot+186522670e6722692d86@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000e9a8d80615163f2a@google.com/
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
This looks really great, thanks for putting in the extra effort!
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Thanks!
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2024-04-22 11:15 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-19 18:28 [PATCH bpf] arm32, bpf: reimplement sign-extension mov instruction Puranjay Mohan
2024-04-22 11:14 ` Russell King (Oracle) [this message]
2024-04-22 15:30 ` patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZiZGhVQx2ei/7Xlx@shell.armlinux.org.uk \
--to=linux@armlinux.org.uk \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=martin.lau@linux.dev \
--cc=puranjay12@gmail.com \
--cc=puranjay@kernel.org \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).