From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Puranjay Mohan <puranjay@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>,
bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, puranjay12@gmail.com
Subject: Re: [PATCH bpf] arm32, bpf: reimplement sign-extension mov instruction
Date: Mon, 22 Apr 2024 12:14:13 +0100 [thread overview]
Message-ID: <ZiZGhVQx2ei/7Xlx@shell.armlinux.org.uk> (raw)
In-Reply-To: <20240419182832.27707-1-puranjay@kernel.org>
On Fri, Apr 19, 2024 at 06:28:32PM +0000, Puranjay Mohan wrote:
> The current implementation of the mov instruction with sign extension
> has the following problems:
>
> 1. It clobbers the source register if it is not stacked because it
> sign extends the source and then moves it to the destination.
> 2. If the dst_reg is stacked, the current code doesn't write the value
> back in case of 64-bit mov.
> 3. There is room for improvement by emitting fewer instructions.
>
> The steps for fixing this and the instructions emitted by the JIT are
> explained below with examples in all combinations:
>
> Case A: offset == 32:
> =====================
> Case A.1: src and dst are stacked registers:
> --------------------------------------------
> 1. Load src_lo into tmp_lo
> 2. Store tmp_lo into dst_lo
> 3. Sign extend tmp_lo into tmp_hi
> 4. Store tmp_hi to dst_hi
>
> Example: r3 = (s32)r3
> r3 is a stacked register
>
> ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
> // str to dst_lo is not emitted because src_lo == dst_lo
> asr r7, r6, #31 // Sign extend tmp_lo into tmp_hi
> str r7, [r11, #-12] // Store tmp_hi into r3_hi
>
> Case A.2: src is stacked but dst is not:
> ----------------------------------------
> 1. Load src_lo into dst_lo
> 2. Sign extend dst_lo into dst_hi
>
> Example: r6 = (s32)r3
> r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
>
> ldr r4, [r11, #-16] // Load r3_lo into r6_lo
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case A.3: src is not stacked but dst is stacked:
> ------------------------------------------------
> 1. Store src_lo into dst_lo
> 2. Sign extend src_lo into tmp_hi
> 3. Store tmp_hi to dst_hi
>
> Example: r3 = (s32)r6
> r3 is stacked and r6 maps to {ARM_R5, ARM_R4}
>
> str r4, [r11, #-16] // Store r6_lo to r3_lo
> asr r7, r4, #31 // Sign extend r6_lo into tmp_hi
> str r7, [r11, #-12] // Store tmp_hi to dest_hi
>
> Case A.4: Both src and dst are not stacked:
> -------------------------------------------
> 1. Mov src_lo into dst_lo
> 2. Sign extend src_lo into dst_hi
>
> Example: (bf) r6 = (s32)r6
> r6 maps to {ARM_R5, ARM_R4}
>
> // Mov not emitted because dst == src
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case B: offset != 32:
> =====================
> Case B.1: src and dst are stacked registers:
> --------------------------------------------
> 1. Load src_lo into tmp_lo
> 2. Sign extend tmp_lo according to offset.
> 3. Store tmp_lo into dst_lo
> 4. Sign extend tmp_lo into tmp_hi
> 5. Store tmp_hi to dst_hi
>
> Example: r9 = (s8)r3
> r9 and r3 are both stacked registers
>
> ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
> lsl r6, r6, #24 // Sign extend tmp_lo
> asr r6, r6, #24 // ..
> str r6, [r11, #-56] // Store tmp_lo to r9_lo
> asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
> str r7, [r11, #-52] // Store tmp_hi to r9_hi
>
> Case B.2: src is stacked but dst is not:
> ----------------------------------------
> 1. Load src_lo into dst_lo
> 2. Sign extend dst_lo according to offset.
> 3. Sign extend tmp_lo into dst_hi
>
> Example: r6 = (s8)r3
> r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
>
> ldr r4, [r11, #-16] // Load r3_lo to r6_lo
> lsl r4, r4, #24 // Sign extend r6_lo
> asr r4, r4, #24 // ..
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case B.3: src is not stacked but dst is stacked:
> ------------------------------------------------
> 1. Sign extend src_lo into tmp_lo according to offset.
> 2. Store tmp_lo into dst_lo.
> 3. Sign extend src_lo into tmp_hi.
> 4. Store tmp_hi to dst_hi.
>
> Example: r3 = (s8)r1
> r3 is stacked and r1 maps to {ARM_R3, ARM_R2}
>
> lsl r6, r2, #24 // Sign extend r1_lo to tmp_lo
> asr r6, r6, #24 // ..
> str r6, [r11, #-16] // Store tmp_lo to r3_lo
> asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
> str r7, [r11, #-12] // Store tmp_hi to r3_hi
>
> Case B.4: Both src and dst are not stacked:
> -------------------------------------------
> 1. Sign extend src_lo into dst_lo according to offset.
> 2. Sign extend dst_lo into dst_hi.
>
> Example: r6 = (s8)r1
> r6 maps to {ARM_R5, ARM_R4} and r1 maps to {ARM_R3, ARM_R2}
>
> lsl r4, r2, #24 // Sign extend r1_lo to r6_lo
> asr r4, r4, #24 // ..
> asr r5, r4, #31 // Sign extend r6_lo to r6_hi
>
> Fixes: fc832653fa0d ("arm32, bpf: add support for sign-extension mov instruction")
> Reported-by: syzbot+186522670e6722692d86@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000e9a8d80615163f2a@google.com/
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
This looks really great, thanks for putting in the extra effort!
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Thanks!
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
WARNING: multiple messages have this Message-ID (diff)
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Puranjay Mohan <puranjay@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
Martin KaFai Lau <martin.lau@linux.dev>,
Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>,
bpf@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, puranjay12@gmail.com
Subject: Re: [PATCH bpf] arm32, bpf: reimplement sign-extension mov instruction
Date: Mon, 22 Apr 2024 12:14:13 +0100 [thread overview]
Message-ID: <ZiZGhVQx2ei/7Xlx@shell.armlinux.org.uk> (raw)
In-Reply-To: <20240419182832.27707-1-puranjay@kernel.org>
On Fri, Apr 19, 2024 at 06:28:32PM +0000, Puranjay Mohan wrote:
> The current implementation of the mov instruction with sign extension
> has the following problems:
>
> 1. It clobbers the source register if it is not stacked because it
> sign extends the source and then moves it to the destination.
> 2. If the dst_reg is stacked, the current code doesn't write the value
> back in case of 64-bit mov.
> 3. There is room for improvement by emitting fewer instructions.
>
> The steps for fixing this and the instructions emitted by the JIT are
> explained below with examples in all combinations:
>
> Case A: offset == 32:
> =====================
> Case A.1: src and dst are stacked registers:
> --------------------------------------------
> 1. Load src_lo into tmp_lo
> 2. Store tmp_lo into dst_lo
> 3. Sign extend tmp_lo into tmp_hi
> 4. Store tmp_hi to dst_hi
>
> Example: r3 = (s32)r3
> r3 is a stacked register
>
> ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
> // str to dst_lo is not emitted because src_lo == dst_lo
> asr r7, r6, #31 // Sign extend tmp_lo into tmp_hi
> str r7, [r11, #-12] // Store tmp_hi into r3_hi
>
> Case A.2: src is stacked but dst is not:
> ----------------------------------------
> 1. Load src_lo into dst_lo
> 2. Sign extend dst_lo into dst_hi
>
> Example: r6 = (s32)r3
> r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
>
> ldr r4, [r11, #-16] // Load r3_lo into r6_lo
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case A.3: src is not stacked but dst is stacked:
> ------------------------------------------------
> 1. Store src_lo into dst_lo
> 2. Sign extend src_lo into tmp_hi
> 3. Store tmp_hi to dst_hi
>
> Example: r3 = (s32)r6
> r3 is stacked and r6 maps to {ARM_R5, ARM_R4}
>
> str r4, [r11, #-16] // Store r6_lo to r3_lo
> asr r7, r4, #31 // Sign extend r6_lo into tmp_hi
> str r7, [r11, #-12] // Store tmp_hi to dest_hi
>
> Case A.4: Both src and dst are not stacked:
> -------------------------------------------
> 1. Mov src_lo into dst_lo
> 2. Sign extend src_lo into dst_hi
>
> Example: (bf) r6 = (s32)r6
> r6 maps to {ARM_R5, ARM_R4}
>
> // Mov not emitted because dst == src
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case B: offset != 32:
> =====================
> Case B.1: src and dst are stacked registers:
> --------------------------------------------
> 1. Load src_lo into tmp_lo
> 2. Sign extend tmp_lo according to offset.
> 3. Store tmp_lo into dst_lo
> 4. Sign extend tmp_lo into tmp_hi
> 5. Store tmp_hi to dst_hi
>
> Example: r9 = (s8)r3
> r9 and r3 are both stacked registers
>
> ldr r6, [r11, #-16] // Load r3_lo into tmp_lo
> lsl r6, r6, #24 // Sign extend tmp_lo
> asr r6, r6, #24 // ..
> str r6, [r11, #-56] // Store tmp_lo to r9_lo
> asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
> str r7, [r11, #-52] // Store tmp_hi to r9_hi
>
> Case B.2: src is stacked but dst is not:
> ----------------------------------------
> 1. Load src_lo into dst_lo
> 2. Sign extend dst_lo according to offset.
> 3. Sign extend tmp_lo into dst_hi
>
> Example: r6 = (s8)r3
> r6 maps to {ARM_R5, ARM_R4} and r3 is stacked
>
> ldr r4, [r11, #-16] // Load r3_lo to r6_lo
> lsl r4, r4, #24 // Sign extend r6_lo
> asr r4, r4, #24 // ..
> asr r5, r4, #31 // Sign extend r6_lo into r6_hi
>
> Case B.3: src is not stacked but dst is stacked:
> ------------------------------------------------
> 1. Sign extend src_lo into tmp_lo according to offset.
> 2. Store tmp_lo into dst_lo.
> 3. Sign extend src_lo into tmp_hi.
> 4. Store tmp_hi to dst_hi.
>
> Example: r3 = (s8)r1
> r3 is stacked and r1 maps to {ARM_R3, ARM_R2}
>
> lsl r6, r2, #24 // Sign extend r1_lo to tmp_lo
> asr r6, r6, #24 // ..
> str r6, [r11, #-16] // Store tmp_lo to r3_lo
> asr r7, r6, #31 // Sign extend tmp_lo to tmp_hi
> str r7, [r11, #-12] // Store tmp_hi to r3_hi
>
> Case B.4: Both src and dst are not stacked:
> -------------------------------------------
> 1. Sign extend src_lo into dst_lo according to offset.
> 2. Sign extend dst_lo into dst_hi.
>
> Example: r6 = (s8)r1
> r6 maps to {ARM_R5, ARM_R4} and r1 maps to {ARM_R3, ARM_R2}
>
> lsl r4, r2, #24 // Sign extend r1_lo to r6_lo
> asr r4, r4, #24 // ..
> asr r5, r4, #31 // Sign extend r6_lo to r6_hi
>
> Fixes: fc832653fa0d ("arm32, bpf: add support for sign-extension mov instruction")
> Reported-by: syzbot+186522670e6722692d86@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/all/000000000000e9a8d80615163f2a@google.com/
> Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
This looks really great, thanks for putting in the extra effort!
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Thanks!
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2024-04-22 11:14 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-19 18:28 [PATCH bpf] arm32, bpf: reimplement sign-extension mov instruction Puranjay Mohan
2024-04-19 18:28 ` Puranjay Mohan
2024-04-22 11:14 ` Russell King (Oracle) [this message]
2024-04-22 11:14 ` Russell King (Oracle)
2024-04-22 15:30 ` patchwork-bot+netdevbpf
2024-04-22 15:30 ` patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZiZGhVQx2ei/7Xlx@shell.armlinux.org.uk \
--to=linux@armlinux.org.uk \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=haoluo@google.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=martin.lau@linux.dev \
--cc=puranjay12@gmail.com \
--cc=puranjay@kernel.org \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.