From: Conor Dooley <conor.dooley@microchip.com>
To: Andrew Jones <ajones@ventanamicro.com>
Cc: <linux-riscv@lists.infradead.org>,
<kvm-riscv@lists.infradead.org>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Anup Patel <apatel@ventanamicro.com>,
Heiko Stuebner <heiko@sntech.de>,
Atish Patra <atishp@rivosinc.com>,
Jisheng Zhang <jszhang@kernel.org>
Subject: Re: [PATCH 0/9] RISC-V: Apply Zicboz to clear_page and memset
Date: Tue, 20 Dec 2022 12:55:07 +0000
Message-ID: <Y6Gwq3f6sFa2gBNZ@wendy>
In-Reply-To: <20221027130247.31634-1-ajones@ventanamicro.com>
Hey Drew,
I assume you're not gonna respin this one before the xmas holidays etc,
but a v2 is on the cards, right?
Thanks,
Conor.
On Thu, Oct 27, 2022 at 03:02:38PM +0200, Andrew Jones wrote:
> When the Zicboz extension is available we can more rapidly zero naturally
> aligned Zicboz block sized chunks of memory. As pages are always page
> aligned and are larger than any Zicboz block size will be,
> clear_page() appears to be a good candidate for the extension. While cycle
> count and energy consumption should also be considered, we can be pretty
> certain that implementing clear_page() with the Zicboz extension is a win
> by comparing the new dynamic instruction count with its current count[1].
> Doing so we see that the new count is less than half the old count (see
> patch 4's commit message for more details). Another candidate for the
> extension is memset(), but, since memset() isn't just used for zeroing
> memory and it accepts arbitrarily aligned addresses and arbitrary sizes,
> it's not as obvious whether adding support for Zicboz will be an overall
> win. In order to make a determination, I've done some analysis and written
> my conclusions in the bullets below.
>
> * When compiling the kernel without CONFIG_RISCV_ISA_ZICBOZ, memset()
> doesn't change, so that's fine.
>
> * The overhead added to memset() when the Zicboz extension isn't present,
> but CONFIG_RISCV_ISA_ZICBOZ is selected, is 3 jumps to known targets,
> which I believe is fine.
>
> * The overhead added to a memset() invocation which is not zeroing memory
> is 7 instructions, where 3 are branches. This seems fine and,
> furthermore, memset() is almost always invoked to zero memory (99% [2]).
>
> * When memset() is invoked to zero memory, the proposed Zicboz extended
> memset() always has a lower dynamic instruction count than the current
> memset() as long as the input address is Zicboz block aligned and the
> length is >= the block size.
>
> * When memset() is invoked to zero memory, the proposed Zicboz extended
> memset() is always worse than the current memset() for unaligned or too
> small inputs, but it's only at most a few dozen instructions worse.
> I think this is probably fine, especially considering the large majority
> of zeroing invocations are 64 bytes or larger and are aligned to a
> power-of-2 boundary, 64-byte or larger (77% [2]).
>
> [1] I ported the functions under test to userspace and linked them with
> a test program. Then, I ran them under gdb with a script[3] which
> counted instructions by single stepping.
>
> [2] I wrote bpftrace scripts[4] to count memset() invocations and see how
> often it is used to zero memory and how often it is given block size
> aligned input addresses with block size or larger lengths. The workload was
> just random desktop stuff including streaming video and compiling.
> While I did run this on my x86 notebook, I still expect the data to
> be representative on RISC-V. Note, x86 has clear_page() so the
> memset() data regarding alignment and size weren't overinflated by
> page zeroing invocations. Grepping also shows the large majority of
> memset() calls are to zero memory (93%).
>
> [3] https://gist.github.com/jones-drew/487791c956ceca8c18adc2847eec9c60
> [4] https://gist.github.com/jones-drew/1e860692cf6fc0fb2a82a04c9ce720fe
>
> These patches are based on the following pending series
>
> 1. "[PATCH v2 0/3] RISC-V: Ensure Zicbom has a valid block size"
> 20221024091309.406906-1-ajones@ventanamicro.com
>
> 2. "[PATCH 0/8] riscv: improve boot time isa extensions handling"
> 20221006070818.3616-1-jszhang@kernel.org
> Also including the additional patch proposed here
> 20221013162038.ehseju2neic2xu5z@kamzik
>
> The patches are also available here
> https://github.com/jones-drew/linux/commits/riscv/zicboz
>
> To test over QEMU this branch may be used to enable Zicboz
> https://gitlab.com/jones-drew/qemu/-/commits/riscv/zicboz
>
> To test running a KVM guest with Zicboz this kvmtool branch may be used
> https://github.com/jones-drew/kvmtool/commits/riscv/zicboz
>
> Thanks,
> drew
>
> Andrew Jones (9):
> RISC-V: Factor out body of riscv_init_cbom_blocksize loop
> RISC-V: Add Zicboz detection and block size parsing
> RISC-V: insn-def: Define cbo.zero
> RISC-V: Use Zicboz in clear_page when available
> RISC-V: KVM: Provide UAPI for Zicboz block size
> RISC-V: KVM: Expose Zicboz to the guest
> RISC-V: lib: Improve memset assembler formatting
> RISC-V: lib: Use named labels in memset
> RISC-V: Use Zicboz in memset when available
>
> arch/riscv/Kconfig | 13 ++
> arch/riscv/include/asm/cacheflush.h | 3 +-
> arch/riscv/include/asm/hwcap.h | 1 +
> arch/riscv/include/asm/insn-def.h | 50 ++++++
> arch/riscv/include/asm/page.h | 6 +-
> arch/riscv/include/uapi/asm/kvm.h | 2 +
> arch/riscv/kernel/cpu.c | 1 +
> arch/riscv/kernel/cpufeature.c | 10 ++
> arch/riscv/kernel/setup.c | 2 +-
> arch/riscv/kvm/vcpu.c | 11 ++
> arch/riscv/lib/Makefile | 1 +
> arch/riscv/lib/clear_page.S | 28 ++++
> arch/riscv/lib/memset.S | 241 +++++++++++++++++++---------
> arch/riscv/mm/cacheflush.c | 64 +++++---
> 14 files changed, 325 insertions(+), 108 deletions(-)
> create mode 100644 arch/riscv/lib/clear_page.S
>
> --
> 2.37.3
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
>
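For readers following the alignment and length conditions in the quoted
bullets, here is a minimal userspace sketch of how a zeroing request could be
split into an unaligned head, whole naturally aligned blocks, and a tail. It
is not the code from the series: the names zero_plan, zero_plan_for and
zero_range are hypothetical, the block size is passed in by the caller (64
bytes in the example below is an assumption), and a plain memset() stands in
for the cbo.zero instruction so that the sketch runs anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout; this is not the code from the series. */
struct zero_plan {
	size_t head;   /* bytes before the first block-aligned address */
	size_t blocks; /* whole, naturally aligned blocks (cbo.zero candidates) */
	size_t tail;   /* bytes left over after the last whole block */
};

/* Split [addr, addr + len) around naturally aligned block_size chunks. */
static struct zero_plan zero_plan_for(uintptr_t addr, size_t len,
				      size_t block_size)
{
	struct zero_plan p = { 0, 0, 0 };
	size_t misalign, remaining;

	/* Assumption: the block size is a nonzero power of 2. */
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);

	misalign = (size_t)(addr & (block_size - 1));
	p.head = misalign ? block_size - misalign : 0;

	/* Too small to even reach the first aligned address. */
	if (p.head >= len) {
		p.head = len;
		return p;
	}

	remaining = len - p.head;
	p.blocks = remaining / block_size;
	p.tail = remaining % block_size;
	return p;
}

/* Zero a range following the plan; memset() stands in for cbo.zero. */
static void zero_range(void *dst, size_t len, size_t block_size)
{
	unsigned char *p = dst;
	struct zero_plan plan = zero_plan_for((uintptr_t)p, len, block_size);
	size_t i;

	memset(p, 0, plan.head);
	p += plan.head;

	for (i = 0; i < plan.blocks; i++) {
		/* On Zicboz hardware, one cbo.zero would clear this block. */
		memset(p, 0, block_size);
		p += block_size;
	}

	memset(p, 0, plan.tail);
}
```

With an assumed 64-byte block, a 4096-byte page at a page-aligned address
yields no head, 64 whole blocks and no tail, which is the clear_page() case.
A 100-byte request starting 8 bytes past a block boundary yields no whole
blocks at all, which is the "unaligned or too small" case where the extra
checks are pure overhead.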
Thread overview: 40+ messages
2022-10-27 13:02 [PATCH 0/9] RISC-V: Apply Zicboz to clear_page and memset Andrew Jones
2022-10-27 13:02 ` [PATCH 1/9] RISC-V: Factor out body of riscv_init_cbom_blocksize loop Andrew Jones
2022-10-27 14:58 ` Heiko Stübner
2022-10-30 20:31 ` Conor Dooley
2022-10-31 8:11 ` Andrew Jones
2022-10-27 13:02 ` [PATCH 2/9] RISC-V: Add Zicboz detection and block size parsing Andrew Jones
2022-10-27 15:03 ` Heiko Stübner
2022-10-27 15:42 ` Andrew Jones
2022-10-30 20:47 ` Conor Dooley
2022-10-31 8:12 ` Andrew Jones
2022-11-13 22:24 ` Conor Dooley
2022-11-14 8:29 ` Andrew Jones
2022-10-27 13:02 ` [PATCH 3/9] RISC-V: insn-def: Define cbo.zero Andrew Jones
2022-10-27 15:37 ` Heiko Stübner
2022-10-30 21:08 ` Conor Dooley
2022-10-31 8:18 ` Andrew Jones
2022-10-27 13:02 ` [PATCH 4/9] RISC-V: Use Zicboz in clear_page when available Andrew Jones
2022-10-27 13:02 ` [PATCH 5/9] RISC-V: KVM: Provide UAPI for Zicboz block size Andrew Jones
2022-10-30 21:23 ` Conor Dooley
2022-11-27 5:37 ` Anup Patel
2022-10-27 13:02 ` [PATCH 6/9] RISC-V: KVM: Expose Zicboz to the guest Andrew Jones
2022-10-30 21:23 ` Conor Dooley
2022-11-27 5:38 ` Anup Patel
2022-10-27 13:02 ` [PATCH 7/9] RISC-V: lib: Improve memset assembler formatting Andrew Jones
2022-10-30 21:27 ` Conor Dooley
2022-10-27 13:02 ` [PATCH 8/9] RISC-V: lib: Use named labels in memset Andrew Jones
2022-10-30 22:15 ` Conor Dooley
2022-10-31 8:24 ` Andrew Jones
2022-10-27 13:02 ` [PATCH 9/9] RISC-V: Use Zicboz in memset when available Andrew Jones
2022-10-30 22:35 ` Conor Dooley
2022-10-31 8:30 ` Andrew Jones
2022-11-03 2:43 ` Palmer Dabbelt
2022-11-03 10:21 ` Andrew Jones
2022-10-29 9:59 ` [PATCH 0/9] RISC-V: Apply Zicboz to clear_page and memset Andrew Jones
2022-10-30 20:23 ` Conor Dooley
2022-10-31 8:39 ` Andrew Jones
2022-11-01 10:37 ` Andrew Jones
2022-11-01 10:53 ` Andrew Jones
2022-12-20 12:55 ` Conor Dooley [this message]
2022-12-26 18:56 ` Andrew Jones