linux-riscv.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Conor Dooley <conor@kernel.org>
To: Andrew Jones <ajones@ventanamicro.com>
Cc: linux-riscv@lists.infradead.org, kvm-riscv@lists.infradead.org,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Albert Ou <aou@eecs.berkeley.edu>,
	Anup Patel <apatel@ventanamicro.com>,
	Heiko Stuebner <heiko@sntech.de>,
	Conor Dooley <conor.dooley@microchip.com>,
	Atish Patra <atishp@rivosinc.com>,
	Jisheng Zhang <jszhang@kernel.org>
Subject: Re: [PATCH 0/9] RISC-V: Apply Zicboz to clear_page and memset
Date: Sun, 30 Oct 2022 20:23:27 +0000	[thread overview]
Message-ID: <Y17dP7/SJXk4kLpr@spud> (raw)
In-Reply-To: <20221029095953.zhzb47lbiepptqpn@kamzik>

On Sat, Oct 29, 2022 at 11:59:53AM +0200, Andrew Jones wrote:
> On Thu, Oct 27, 2022 at 03:02:38PM +0200, Andrew Jones wrote:
> > When the Zicboz extension is available we can more rapidly zero naturally
> > aligned Zicboz block sized chunks of memory. As pages are always page
> > aligned and are larger than any Zicboz block size will be, then
> > clear_page() appears to be a good candidate for the extension. While cycle
> > count and energy consumption should also be considered, we can be pretty
> > certain that implementing clear_page() with the Zicboz extension is a win
> > by comparing the new dynamic instruction count with its current count[1].
> > Doing so we see that the new count is less than half the old count (see
> > patch4's commit message for more details). Another candidate for the
> > extension is memset(), but, since memset() isn't just used for zeroing
> > memory and it accepts arbitrarily aligned addresses and arbitrary sizes,
> > it's not as obvious if adding support for Zicboz will be an overall win.
> > In order to make a determination, I've done some analysis and wrote my
> > conclusions in the bullets below.
> > 
> > * When compiling the kernel without CONFIG_RISCV_ISA_ZICBOZ, memset()
> >   doesn't change, so that's fine.
> > 
> > * The overhead added to memset() when the Zicboz extension isn't present,
> >   but CONFIG_RISCV_ISA_ZICBOZ is selected, is 3 jumps to known targets,
> >   which I believe is fine.
> > 
> > * The overhead added to a memset() invocation which is not zeroing memory
> >   is 7 instructions, where 3 are branches. This seems fine and,
> >   furthermore, memset() is almost always invoked to zero memory (99% [2]).
> > 
> > * When memset() is invoked to zero memory, the proposed Zicboz extended
> >   memset() always has a lower dynamic instruction count than the current
> >   memset() as long as the input address is Zicboz block aligned and the
> >   length is >= the block size.
> > 
> > * When memset() is invoked to zero memory, the proposed Zicboz extended
> >   memset() is always worse for unaligned or too small inputs than the
> >   current memset(), but it's only at most a few dozen instructions worse.
> >   I think this is probably fine, especially considering the large majority
> >   of zeroing invocations are 64 bytes or larger and are aligned to a
> >   power-of-2 boundary, 64-byte or larger (77% [2]).
> > 
> > [1] I ported the functions under test to userspace and linked them with
> >     a test program. Then, I ran them under gdb with a script[3] which
> >     counted instructions by single stepping.
> > 
> > [2] I wrote bpftrace scripts[4] to count memset() invocations to see the
> >     frequency of it being used to zero memory and have block size aligned
> >     input addresses with block size or larger lengths. The workload was
> >     just random desktop stuff including streaming video and compiling.
> >     While I did run this on my x86 notebook, I still expect the data to
> >     be representative on RISC-V. Note, x86 has clear_page() so the
> >     memset() data regarding alignment and size weren't over inflated by
> >     page zeroing invocations. Grepping also shows the large majority of
> >     memset() calls are to zero memory (93%).
> > 
> > [3] https://gist.github.com/jones-drew/487791c956ceca8c18adc2847eec9c60
> > [4] https://gist.github.com/jones-drew/1e860692cf6fc0fb2a82a04c9ce720fe
> > 
> > These patches are based on the following pending series
> > 
> > 1. "[PATCH v2 0/3] RISC-V: Ensure Zicbom has a valid block size"
> >    20221024091309.406906-1-ajones@ventanamicro.com
> > 
> > 2. "[PATCH 0/8] riscv: improve boot time isa extensions handling"
> >    20221006070818.3616-1-jszhang@kernel.org
> >    Also including the additional patch proposed here
> >    20221013162038.ehseju2neic2xu5z@kamzik
> > 
> > The patches are also available here
> > https://github.com/jones-drew/linux/commits/riscv/zicboz
> > 
> > To test over QEMU this branch may be used to enable Zicboz
> > https://gitlab.com/jones-drew/qemu/-/commits/riscv/zicboz
> > 
> > To test running a KVM guest with Zicboz this kvmtool branch may be used
> > https://github.com/jones-drew/kvmtool/commits/riscv/zicboz
> > 
> > Thanks,
> > drew
> > 
> > Andrew Jones (9):
> >   RISC-V: Factor out body of riscv_init_cbom_blocksize loop
> >   RISC-V: Add Zicboz detection and block size parsing
> >   RISC-V: insn-def: Define cbo.zero
> >   RISC-V: Use Zicboz in clear_page when available
> >   RISC-V: KVM: Provide UAPI for Zicboz block size
> >   RISC-V: KVM: Expose Zicboz to the guest
> >   RISC-V: lib: Improve memset assembler formatting
> >   RISC-V: lib: Use named labels in memset
> >   RISC-V: Use Zicboz in memset when available
> > 
> >  arch/riscv/Kconfig                  |  13 ++
> >  arch/riscv/include/asm/cacheflush.h |   3 +-
> >  arch/riscv/include/asm/hwcap.h      |   1 +
> >  arch/riscv/include/asm/insn-def.h   |  50 ++++++
> >  arch/riscv/include/asm/page.h       |   6 +-
> >  arch/riscv/include/uapi/asm/kvm.h   |   2 +
> >  arch/riscv/kernel/cpu.c             |   1 +
> >  arch/riscv/kernel/cpufeature.c      |  10 ++
> >  arch/riscv/kernel/setup.c           |   2 +-
> >  arch/riscv/kvm/vcpu.c               |  11 ++
> >  arch/riscv/lib/Makefile             |   1 +
> >  arch/riscv/lib/clear_page.S         |  28 ++++
> >  arch/riscv/lib/memset.S             | 241 +++++++++++++++++++---------
> >  arch/riscv/mm/cacheflush.c          |  64 +++++---
> >  14 files changed, 325 insertions(+), 108 deletions(-)
> >  create mode 100644 arch/riscv/lib/clear_page.S
> > 
> > -- 
> > 2.37.3
> >
> 
> FYI, I just tried this with clang. It compiles but doesn't boot when
> Zicboz is present (it does boot when Zicboz is disabled in QEMU). I'm
> suspicious of the ALTERNATIVE() stuff, but I'll debug more on Monday.
> Also, I see building with LLVM=1 doesn't work, but that appears to be
> an issue introduced with "[PATCH 0/8] riscv: improve boot time isa
> extensions handling" which this series is based on.

Yeah, I was hitting the LLVM failures when I gave it a go the other day,
thanks for passing it on to Jisheng.. I meant to do that and forgot.
FWIW, I did find a build error with gcc too:

riscv64-unknown-linux-gnu-ld: arch/riscv/purgatory/purgatory.ro: in function `.L3␂1':
ctype.c:(.text+0x215a): undefined reference to `riscv_cboz_block_size'

defconfig should be:
https://gist.github.com/ConchuOD/e14293a28b04dca2125bdcc0d97d366a
& toolchain stuff:
CONFIG_CC_VERSION_TEXT="riscv64-unknown-linux-gnu-gcc (g5964b5cd727) 11.1.0"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=110100
CONFIG_AS_IS_GNU=y
CONFIG_AS_VERSION=23700
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=23700

My toolchain predates zicbo* being introduced & I set the config option.
I didn't do any specific digging as to the cause.

Thanks,
Conor.


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  reply	other threads:[~2022-10-30 20:23 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-27 13:02 [PATCH 0/9] RISC-V: Apply Zicboz to clear_page and memset Andrew Jones
2022-10-27 13:02 ` [PATCH 1/9] RISC-V: Factor out body of riscv_init_cbom_blocksize loop Andrew Jones
2022-10-27 14:58   ` Heiko Stübner
2022-10-30 20:31   ` Conor Dooley
2022-10-31  8:11     ` Andrew Jones
2022-10-27 13:02 ` [PATCH 2/9] RISC-V: Add Zicboz detection and block size parsing Andrew Jones
2022-10-27 15:03   ` Heiko Stübner
2022-10-27 15:42     ` Andrew Jones
2022-10-30 20:47   ` Conor Dooley
2022-10-31  8:12     ` Andrew Jones
2022-11-13 22:24     ` Conor Dooley
2022-11-14  8:29       ` Andrew Jones
2022-10-27 13:02 ` [PATCH 3/9] RISC-V: insn-def: Define cbo.zero Andrew Jones
2022-10-27 15:37   ` Heiko Stübner
2022-10-30 21:08   ` Conor Dooley
2022-10-31  8:18     ` Andrew Jones
2022-10-27 13:02 ` [PATCH 4/9] RISC-V: Use Zicboz in clear_page when available Andrew Jones
2022-10-27 13:02 ` [PATCH 5/9] RISC-V: KVM: Provide UAPI for Zicboz block size Andrew Jones
2022-10-30 21:23   ` Conor Dooley
2022-11-27  5:37   ` Anup Patel
2022-10-27 13:02 ` [PATCH 6/9] RISC-V: KVM: Expose Zicboz to the guest Andrew Jones
2022-10-30 21:23   ` Conor Dooley
2022-11-27  5:38   ` Anup Patel
2022-10-27 13:02 ` [PATCH 7/9] RISC-V: lib: Improve memset assembler formatting Andrew Jones
2022-10-30 21:27   ` Conor Dooley
2022-10-27 13:02 ` [PATCH 8/9] RISC-V: lib: Use named labels in memset Andrew Jones
2022-10-30 22:15   ` Conor Dooley
2022-10-31  8:24     ` Andrew Jones
2022-10-27 13:02 ` [PATCH 9/9] RISC-V: Use Zicboz in memset when available Andrew Jones
2022-10-30 22:35   ` Conor Dooley
2022-10-31  8:30     ` Andrew Jones
2022-11-03  2:43   ` Palmer Dabbelt
2022-11-03 10:21     ` Andrew Jones
2022-10-29  9:59 ` [PATCH 0/9] RISC-V: Apply Zicboz to clear_page and memset Andrew Jones
2022-10-30 20:23   ` Conor Dooley [this message]
2022-10-31  8:39     ` Andrew Jones
2022-11-01 10:37 ` Andrew Jones
2022-11-01 10:53   ` Andrew Jones
2022-12-20 12:55 ` Conor Dooley
2022-12-26 18:56   ` Andrew Jones

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y17dP7/SJXk4kLpr@spud \
    --to=conor@kernel.org \
    --cc=ajones@ventanamicro.com \
    --cc=aou@eecs.berkeley.edu \
    --cc=apatel@ventanamicro.com \
    --cc=atishp@rivosinc.com \
    --cc=conor.dooley@microchip.com \
    --cc=heiko@sntech.de \
    --cc=jszhang@kernel.org \
    --cc=kvm-riscv@lists.infradead.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).