From: "Heiko Stübner" <heiko@sntech.de>
To: Andrew Jones <ajones@ventanamicro.com>
Cc: linux-riscv@lists.infradead.org, palmer@dabbelt.com,
christoph.muellner@vrull.eu, conor@kernel.org,
philipp.tomsich@vrull.eu, jszhang@kernel.org
Subject: Re: [PATCH v4 4/5] RISC-V: add infrastructure to allow different str* implementations
Date: Thu, 12 Jan 2023 17:05:19 +0100
Message-ID: <2094019.QZUTf85G27@diego>
In-Reply-To: <20230110121320.zr4tk4nl2w57klqg@orel>

Hi Andrew,

thanks a lot for taking the time to provide these extensive and really
helpful comments.

On Tuesday, 10 January 2023, 13:13:20 CET, Andrew Jones wrote:
> On Mon, Jan 09, 2023 at 07:17:54PM +0100, Heiko Stuebner wrote:
> > From: Heiko Stuebner <heiko.stuebner@vrull.eu>
> >
> > Depending on supported extensions on specific RISC-V cores,
> > optimized str* functions might make sense.
> >
> > This adds basic infrastructure to allow patching the function calls
> > via alternatives later on.
> >
> > The Linux kernel provides standard implementations for string functions
> > but when architectures want to extend them, they need to provide their
> > own.
> >
> > The added generic string functions are done in assembler (taken from
> > disassembling the main-kernel functions for now) to allow us to control
> > the used registers and extend them with optimized variants.
> >
> > Signed-off-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
> > ---
> > arch/riscv/include/asm/string.h | 10 +++++++++
> > arch/riscv/kernel/riscv_ksyms.c | 3 +++
> > arch/riscv/lib/Makefile | 3 +++
> > arch/riscv/lib/strcmp.S | 37 ++++++++++++++++++++++++++++++
> > arch/riscv/lib/strlen.S | 28 +++++++++++++++++++++++
> > arch/riscv/lib/strncmp.S | 40 +++++++++++++++++++++++++++++++++
> > arch/riscv/purgatory/Makefile | 13 +++++++++++
> > 7 files changed, 134 insertions(+)
> > create mode 100644 arch/riscv/lib/strcmp.S
> > create mode 100644 arch/riscv/lib/strlen.S
> > create mode 100644 arch/riscv/lib/strncmp.S
> >
> > diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
> > index 909049366555..a96b1fea24fe 100644
> > --- a/arch/riscv/include/asm/string.h
> > +++ b/arch/riscv/include/asm/string.h
> > @@ -18,6 +18,16 @@ extern asmlinkage void *__memcpy(void *, const void *, size_t);
> > #define __HAVE_ARCH_MEMMOVE
> > extern asmlinkage void *memmove(void *, const void *, size_t);
> > extern asmlinkage void *__memmove(void *, const void *, size_t);
> > +
> > +#define __HAVE_ARCH_STRCMP
> > +extern asmlinkage int strcmp(const char *cs, const char *ct);
> > +
> > +#define __HAVE_ARCH_STRLEN
> > +extern asmlinkage __kernel_size_t strlen(const char *);
> > +
> > +#define __HAVE_ARCH_STRNCMP
> > +extern asmlinkage int strncmp(const char *cs, const char *ct, size_t count);
> > +
> > /* For those files which don't want to check by kasan. */
> > #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
> > #define memcpy(dst, src, len) __memcpy(dst, src, len)
> > diff --git a/arch/riscv/kernel/riscv_ksyms.c b/arch/riscv/kernel/riscv_ksyms.c
> > index 5ab1c7e1a6ed..a72879b4249a 100644
> > --- a/arch/riscv/kernel/riscv_ksyms.c
> > +++ b/arch/riscv/kernel/riscv_ksyms.c
> > @@ -12,6 +12,9 @@
> > EXPORT_SYMBOL(memset);
> > EXPORT_SYMBOL(memcpy);
> > EXPORT_SYMBOL(memmove);
> > +EXPORT_SYMBOL(strcmp);
> > +EXPORT_SYMBOL(strlen);
> > +EXPORT_SYMBOL(strncmp);
> > EXPORT_SYMBOL(__memset);
> > EXPORT_SYMBOL(__memcpy);
> > EXPORT_SYMBOL(__memmove);
> > diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
> > index 25d5c9664e57..6c74b0bedd60 100644
> > --- a/arch/riscv/lib/Makefile
> > +++ b/arch/riscv/lib/Makefile
> > @@ -3,6 +3,9 @@ lib-y += delay.o
> > lib-y += memcpy.o
> > lib-y += memset.o
> > lib-y += memmove.o
> > +lib-y += strcmp.o
> > +lib-y += strlen.o
> > +lib-y += strncmp.o
> > lib-$(CONFIG_MMU) += uaccess.o
> > lib-$(CONFIG_64BIT) += tishift.o
> >
> > diff --git a/arch/riscv/lib/strcmp.S b/arch/riscv/lib/strcmp.S
> > new file mode 100644
> > index 000000000000..94440fb8390c
> > --- /dev/null
> > +++ b/arch/riscv/lib/strcmp.S
> > @@ -0,0 +1,37 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int strcmp(const char *cs, const char *ct) */
> > +SYM_FUNC_START(strcmp)
> > + /*
> > + * Returns
> > + * a0 - comparison result, value like strcmp
> > + *
> > + * Parameters
> > + * a0 - string1
> > + * a1 - string2
> > + *
> > + * Clobbers
> > + * t0, t1, t2
> > + */
> > + mv t2, a1
>
> The above instruction and the 'mv a1, t2' below appear to be attempting
> to preserve a1, but that shouldn't be necessary.
Correct, and it's gone now.
> > +1:
> > + lbu t1, 0(a0)
> > + lbu t0, 0(a1)
>
> I'd rather have t0 be 0(a0) and t1 be 0(a1)
ok
> > + addi a0, a0, 1
> > + addi a1, a1, 1
> > + beq t1, t0, 3f
> > + li a0, 1
> > + bgeu t1, t0, 2f
> > + li a0, -1
> > +2:
> > + mv a1, t2
> > + ret
> > +3:
> > + bnez t1, 1b
> > + li a0, 0
> > + j 2b
>
> For fun I removed one conditional and one unconditional branch (untested)
>
> 1:
> lbu t0, 0(a0)
> lbu t1, 0(a1)
> addi a0, a0, 1
> addi a1, a1, 1
> bne t0, t1, 2f
> bnez t0, 1b
> li a0, 0
> ret
> 2:
> slt a1, t1, t0
> slli a1, a1, 1
> li a0, -1
> add a0, a0, a1
> ret
Yep, that
- looks correct
- and also seems to produce correct results,

also when including your
	sub a0, t0, t1
comment from the follow-up, which then produces the same result as the
zbb-variant.

I've also checked the return-value convention against the documentation:
- 0 if s1 and s2 are equal;
- a negative value if s1 is less than s2;
- a positive value if s1 is greater than s2.

And I added a comment above it pointing out this fact for the next
person stumbling over this :-)
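
For readers following along, here is a minimal C sketch of what the
reworked strcmp loop computes. It is an illustration only, not the code
from the patch: the names sketch_strcmp and sketch_sign are hypothetical,
and the register comments assume the t0 = 0(a0), t1 = 0(a1) assignment
suggested in the review. The first function models the "sub a0, t0, t1"
tail, the second the slt/slli/li/add tail from the quoted suggestion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: models the loop with the "sub a0, t0, t1" tail. */
static int sketch_strcmp(const char *cs, const char *ct)
{
	assert(cs != NULL && ct != NULL);

	for (;;) {
		/* lbu t0, 0(a0); addi a0, a0, 1 */
		unsigned char c0 = (unsigned char)*cs++;
		/* lbu t1, 0(a1); addi a1, a1, 1 */
		unsigned char c1 = (unsigned char)*ct++;

		if (c0 != c1)			/* bne t0, t1, 2f */
			return (int)c0 - (int)c1;	/* sub a0, t0, t1 */
		if (c0 == 0)			/* bnez t0, 1b not taken */
			return 0;		/* li a0, 0 */
	}
}

/* Hypothetical helper: the branchless tail, -1 + 2 * (t1 < t0). */
static int sketch_sign(unsigned char c0, unsigned char c1)
{
	int gt = c1 < c0;		/* slt a1, t1, t0 */

	return -1 + (gt << 1);		/* slli a1, a1, 1; li a0, -1; add */
}
```

Both tails satisfy the zero/negative/positive convention listed above;
the sub variant returns the byte difference, while the slt variant
returns exactly -1 or 1.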
> > +SYM_FUNC_END(strcmp)
> > diff --git a/arch/riscv/lib/strlen.S b/arch/riscv/lib/strlen.S
> > new file mode 100644
> > index 000000000000..09a7aaff26c8
> > --- /dev/null
> > +++ b/arch/riscv/lib/strlen.S
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int strlen(const char *s) */
> > +SYM_FUNC_START(strlen)
> > + /*
> > + * Returns
> > + * a0 - string length
> > + *
> > + * Parameters
> > + * a0 - String to measure
> > + *
> > + * Clobbers:
> > + * t0, t1
> > + */
> > + mv t1, a0
> > +1:
> > + lbu t0, 0(t1)
> > + bnez t0, 2f
> > + sub a0, t1, a0
> > + ret
> > +2:
> > + addi t1, t1, 1
> > + j 1b
>
> Slightly reorganizing looks better (to me)
>
> mv t1, a0
> 1:
> lbu t0, 0(t1)
> beqz t0, 2f
> addi t1, t1, 1
> j 1b
> 2:
> sub a0, t1, a0
> ret
ok
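
As a rough cross-check, the reorganized strlen loop quoted above maps to
the following C sketch. The name sketch_strlen is hypothetical and the
code is only meant to mirror the quoted instruction order, not to replace
the assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the reorganized strlen loop. */
static size_t sketch_strlen(const char *s)
{
	const char *p = s;		/* mv t1, a0 */

	assert(s != NULL);

	while (*p != '\0')		/* lbu t0, 0(t1); beqz t0, 2f */
		p++;			/* addi t1, t1, 1; j 1b */

	return (size_t)(p - s);		/* sub a0, t1, a0 */
}
```

The exit test sits at the top of the loop, so the common path is a single
forward branch that is only taken at the terminating NUL.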
> > +SYM_FUNC_END(strlen)
> > diff --git a/arch/riscv/lib/strncmp.S b/arch/riscv/lib/strncmp.S
> > new file mode 100644
> > index 000000000000..493ab6febcb2
> > --- /dev/null
> > +++ b/arch/riscv/lib/strncmp.S
> > @@ -0,0 +1,40 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm.h>
> > +#include <asm-generic/export.h>
> > +
> > +/* int strncmp(const char *cs, const char *ct, size_t count) */
> > +SYM_FUNC_START(strncmp)
> > + /*
> > + * Returns
> > + * a0 - comparison result, value like strncmp
> > + *
> > + * Parameters
> > + * a0 - string1
> > + * a1 - string2
> > + * a2 - number of characters to compare
> > + *
> > + * Clobbers
> > + * t0, t1, t2
> > + */
> > + li t0, 0
> > +1:
> > + beq a2, t0, 4f
> > + add t1, a0, t0
> > + add t2, a1, t0
> > + lbu t1, 0(t1)
> > + lbu t2, 0(t2)
> > + beq t1, t2, 3f
> > + li a0, 1
> > + bgeu t1, t2, 2f
> > + li a0, -1
> > +2:
> > + ret
> > +3:
> > + addi t0, t0, 1
> > + bnez t1, 1b
> > +4:
> > + li a0, 0
> > + j 2b
>
> (untested)
>
> li t2, 0
> 1:
> beq a2, t2, 2f
> lbu t0, 0(a0)
> lbu t1, 0(a1)
> addi a0, a0, 1
> addi a1, a1, 1
> bne t0, t1, 3f
> addi t2, t2, 1
> bnez t0, 1b
> 2:
> li a0, 0
> ret
> 3:
> slt a1, t1, t0
> slli a1, a1, 1
> li a0, -1
> add a0, a0, a1
> ret
Same here: I went over the changed assembly and verified that
it produces the same results as the original, and then did a second
pass to also add the
	sub a0, t0, t1
replacement.
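
A minimal C sketch of the strncmp variant discussed here, assuming the
quoted loop structure with the "sub a0, t0, t1" tail swapped in. The name
sketch_strncmp is hypothetical; the index i plays the role of the t2
counter, and indexing is used instead of advancing a0/a1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compares at most count bytes. */
static int sketch_strncmp(const char *cs, const char *ct, size_t count)
{
	size_t i;

	assert(cs != NULL && ct != NULL);

	/* li t2, 0; beq a2, t2, 2f; ...; addi t2, t2, 1 */
	for (i = 0; i != count; i++) {
		unsigned char c0 = (unsigned char)cs[i];	/* lbu t0, 0(a0) */
		unsigned char c1 = (unsigned char)ct[i];	/* lbu t1, 0(a1) */

		if (c0 != c1)			/* bne t0, t1, 3f */
			return (int)c0 - (int)c1;	/* sub a0, t0, t1 */
		if (c0 == 0)			/* bnez t0, 1b not taken */
			break;
	}

	return 0;				/* li a0, 0 */
}
```

The count check happens before any byte is loaded, so a count of zero
returns 0 without touching either string.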
> > +SYM_FUNC_END(strncmp)
> > diff --git a/arch/riscv/purgatory/Makefile b/arch/riscv/purgatory/Makefile
> > index dd58e1d99397..d16bf715a586 100644
> > --- a/arch/riscv/purgatory/Makefile
> > +++ b/arch/riscv/purgatory/Makefile
> > @@ -2,6 +2,7 @@
> > OBJECT_FILES_NON_STANDARD := y
> >
> > purgatory-y := purgatory.o sha256.o entry.o string.o ctype.o memcpy.o memset.o
> > +purgatory-y += strcmp.o strlen.o strncmp.o
> >
> > targets += $(purgatory-y)
> > PURGATORY_OBJS = $(addprefix $(obj)/,$(purgatory-y))
> > @@ -18,6 +19,15 @@ $(obj)/memcpy.o: $(srctree)/arch/riscv/lib/memcpy.S FORCE
> > $(obj)/memset.o: $(srctree)/arch/riscv/lib/memset.S FORCE
> > $(call if_changed_rule,as_o_S)
> >
> > +$(obj)/strcmp.o: $(srctree)/arch/riscv/lib/strcmp.S FORCE
> > + $(call if_changed_rule,as_o_S)
> > +
> > +$(obj)/strlen.o: $(srctree)/arch/riscv/lib/strlen.S FORCE
> > + $(call if_changed_rule,as_o_S)
> > +
> > +$(obj)/strncmp.o: $(srctree)/arch/riscv/lib/strncmp.S FORCE
> > + $(call if_changed_rule,as_o_S)
> > +
> > $(obj)/sha256.o: $(srctree)/lib/crypto/sha256.c FORCE
> > $(call if_changed_rule,cc_o_c)
> >
> > @@ -77,6 +87,9 @@ CFLAGS_ctype.o += $(PURGATORY_CFLAGS)
> > AFLAGS_REMOVE_entry.o += -Wa,-gdwarf-2
> > AFLAGS_REMOVE_memcpy.o += -Wa,-gdwarf-2
> > AFLAGS_REMOVE_memset.o += -Wa,-gdwarf-2
> > +AFLAGS_REMOVE_strcmp.o += -Wa,-gdwarf-2
> > +AFLAGS_REMOVE_strlen.o += -Wa,-gdwarf-2
> > +AFLAGS_REMOVE_strncmp.o += -Wa,-gdwarf-2
> >
> > $(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
> > $(call if_changed,ld)
> >
>
> With at least the removal of the unnecessary preserving of a1 in strcmp,
> then it looks correct to me, so
>
> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
>
> But I think there's room for making it more readable, and maybe even
> optimized, as I've tried to do.
Thanks a lot
Heiko
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv