From: Alistair Francis <alistair23@gmail.com>
To: Rajnesh Kanwal <rkanwal@rivosinc.com>
Cc: qemu-riscv@nongnu.org, qemu-devel@nongnu.org,
alistair.francis@wdc.com, bin.meng@windriver.com,
liweiwei@iscas.ac.cn, dbarboza@ventanamicro.com,
zhiwei_liu@linux.alibaba.com, atishp@rivosinc.com,
apatel@ventanamicro.com, beeman@rivosinc.com,
jason.chien@sifive.com, frank.chang@sifive.com,
richard.henderson@linaro.org, bmeng.cn@gmail.com
Subject: Re: [PATCH v7] target/riscv: Add support to access ctrsource, ctrtarget, ctrdata regs.
Date: Mon, 17 Feb 2025 15:24:38 +1000
Message-ID: <CAKmqyKP82mMut92UMPTgsOZMFbKmND2+yK6ABVhTjTXZnD1rig@mail.gmail.com>
In-Reply-To: <20250212-b4-ctr_upstream_v6-v7-1-4e8159ea33bf@rivosinc.com>
On Wed, Feb 12, 2025 at 8:20 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
>
> CTR entries are accessed through the ctrsource, ctrtarget and ctrdata
> registers using the smcsrind/sscsrind extension. This commit extends
> the csrind extension to support CTR registers.
>
> ctrsource is accessible through the xireg CSR, ctrtarget through
> xireg2 and ctrdata through xireg3.
>
> CTR supports a maximum depth of 256 entries, which are accessed using
> the xiselect range 0x200 to 0x2ff.
>
> This commit also adds properties to enable the CTR extension. CTR can
> be enabled using smctr=true and ssctr=true now.
>
> Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> Acked-by: Alistair Francis <alistair.francis@wdc.com>
Thanks!
Applied to riscv-to-apply.next
Alistair
> ---
> This series enables Control Transfer Records extension support on riscv
> platform. This extension is similar to Arch LBR in x86 and BRBE in ARM.
> The Extension has been ratified and this series is based on v1.0 [0]
>
> CTR extension depends on both the implementation of S-mode and Sscsrind
> extension v1.0.0 [1]. CTR accesses the ctrsource, ctrtarget and ctrdata CSRs
> using the sscsrind extension.
>
> The series is based on Smcdeleg/Ssccfg counter delegation extension [2]
> patches [3]. CTR itself doesn't depend on counter delegation support. This
> rebase is basically to include the Smcsrind patches.
>
> Here is the link to a quick start guide [4] to set up and run a basic perf demo
> on Linux to use the CTR extension.
>
> QEMU patches can be found here:
> https://github.com/rajnesh-kanwal/qemu/tree/b4/ctr_upstream_v7
>
> OpenSBI patch can be found here:
> https://github.com/rajnesh-kanwal/opensbi/tree/ctr_upstream_v2
>
> Linux kernel patches can be found here:
> https://github.com/rajnesh-kanwal/linux/tree/b4/ctr_upstream_v2
>
> [0]: https://github.com/riscv/riscv-control-transfer-records/releases/tag/v1.0
> [1]: https://github.com/riscvarchive/riscv-indirect-csr-access/releases/tag/v1.0.0
> [2]: https://github.com/riscvarchive/riscv-smcdeleg-ssccfg/releases/tag/v1.0.0
> [3]: https://lore.kernel.org/qemu-riscv/20241203-counter_delegation-v4-0-c12a89baed86@rivosinc.com/
> [4]: https://github.com/rajnesh-kanwal/linux/wiki/Running-CTR-basic-demo-on-QEMU-RISC%E2%80%90V-Virt-machine
> ---
> Changes in v7:
> v7: Rebased on latest riscv-to-apply.next. Given 6 out of 7 patches
> are already in riscv-to-apply.next, this version only contains the
> last patch which failed to apply.
>
> v6: Rebased on latest riscv-to-apply.for-upstream.
> - https://lore.kernel.org/qemu-devel/20250205-b4-ctr_upstream_v6-v6-0-439d8e06c8ef@rivosinc.com
>
> v5: Improvements based on Richard Henderson's feedback.
> - Fixed code gen logic to use gen_update_pc() instead of
> tcg_constant_tl().
> - Some function renaming.
> - Rebased onto v4 of counter delegation series.
> - https://lore.kernel.org/qemu-riscv/20241205-b4-ctr_upstream_v3-v5-0-60b993aa567d@rivosinc.com/
>
> v4: Improvements based on Richard Henderson's feedback.
> - Refactored CTR related code generation to move more code into
> translation side and avoid unnecessary code execution in generated
> code.
> - Added missing code in machine.c to migrate the new state.
> - https://lore.kernel.org/r/20241204-b4-ctr_upstream_v3-v4-0-d3ce6bef9432@rivosinc.com
>
> v3: Improvements based on Jason Chien and Frank Chang's feedback.
> - Created single set of MACROs for CTR CSRs in cpu_bit.h
> - Some fixes in riscv_ctr_add_entry.
> - Return zero for vs/sireg4-6 for CTR 0x200 to 0x2ff range.
> - Improved extension dependency check.
> - Fixed invalid ctrctl csr selection bug in riscv_ctr_freeze.
> - Added implied rules for Smctr and Ssctr.
> - Added missing SMSTATEEN0_CTR bit in mstateen0 and hstateen0 write ops.
> - Some more cosmetic changes.
> - https://lore.kernel.org/qemu-riscv/20241104-b4-ctr_upstream_v3-v3-0-32fd3c48205f@rivosinc.com/
>
> v2: Lots of improvements based on Jason Chien's feedback including:
> - Added CTR recording for cm.jalt, cm.jt, cm.popret, cm.popretz.
> - Fixed and added more CTR extension enable checks.
> - Fixed CTR CSR predicate functions.
> - Fixed external trap xTE bit checks.
> - One fix in freeze function for VS-mode.
> - Lots of minor code improvements.
> - Added checks in sctrclr instruction helper.
> - https://lore.kernel.org/qemu-riscv/20240619152708.135991-1-rkanwal@rivosinc.com/
>
> v1:
> - https://lore.kernel.org/qemu-riscv/20240529160950.132754-1-rkanwal@rivosinc.com/
> ---
> target/riscv/cpu.c | 26 +++++++-
> target/riscv/csr.c | 150 ++++++++++++++++++++++++++++++++++++++++++++-
> target/riscv/tcg/tcg-cpu.c | 11 ++++
> 3 files changed, 185 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 8264c81e889424dfd491cec0ef95eeffc8fcc5b6..522d6584e4c3be7070e5a59f70f5948be8196a77 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -216,6 +216,8 @@ const RISCVIsaExtData isa_edata_arr[] = {
> ISA_EXT_DATA_ENTRY(ssu64xl, PRIV_VERSION_1_12_0, has_priv_1_12),
> ISA_EXT_DATA_ENTRY(supm, PRIV_VERSION_1_13_0, ext_supm),
> ISA_EXT_DATA_ENTRY(svade, PRIV_VERSION_1_11_0, ext_svade),
> + ISA_EXT_DATA_ENTRY(smctr, PRIV_VERSION_1_12_0, ext_smctr),
> + ISA_EXT_DATA_ENTRY(ssctr, PRIV_VERSION_1_12_0, ext_ssctr),
> ISA_EXT_DATA_ENTRY(svadu, PRIV_VERSION_1_12_0, ext_svadu),
> ISA_EXT_DATA_ENTRY(svinval, PRIV_VERSION_1_12_0, ext_svinval),
> ISA_EXT_DATA_ENTRY(svnapot, PRIV_VERSION_1_12_0, ext_svnapot),
> @@ -1599,6 +1601,8 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
> MULTI_EXT_CFG_BOOL("smcdeleg", ext_smcdeleg, false),
> MULTI_EXT_CFG_BOOL("sscsrind", ext_sscsrind, false),
> MULTI_EXT_CFG_BOOL("ssccfg", ext_ssccfg, false),
> + MULTI_EXT_CFG_BOOL("smctr", ext_smctr, false),
> + MULTI_EXT_CFG_BOOL("ssctr", ext_ssctr, false),
> MULTI_EXT_CFG_BOOL("zifencei", ext_zifencei, true),
> MULTI_EXT_CFG_BOOL("zicfilp", ext_zicfilp, false),
> MULTI_EXT_CFG_BOOL("zicfiss", ext_zicfiss, false),
> @@ -2863,6 +2867,26 @@ static RISCVCPUImpliedExtsRule SSPM_IMPLIED = {
> },
> };
>
> +static RISCVCPUImpliedExtsRule SMCTR_IMPLIED = {
> + .ext = CPU_CFG_OFFSET(ext_smctr),
> + .implied_misa_exts = RVS,
> + .implied_multi_exts = {
> + CPU_CFG_OFFSET(ext_sscsrind),
> +
> + RISCV_IMPLIED_EXTS_RULE_END
> + },
> +};
> +
> +static RISCVCPUImpliedExtsRule SSCTR_IMPLIED = {
> + .ext = CPU_CFG_OFFSET(ext_ssctr),
> + .implied_misa_exts = RVS,
> + .implied_multi_exts = {
> + CPU_CFG_OFFSET(ext_sscsrind),
> +
> + RISCV_IMPLIED_EXTS_RULE_END
> + },
> +};
> +
> RISCVCPUImpliedExtsRule *riscv_misa_ext_implied_rules[] = {
> &RVA_IMPLIED, &RVD_IMPLIED, &RVF_IMPLIED,
> &RVM_IMPLIED, &RVV_IMPLIED, NULL
> @@ -2881,7 +2905,7 @@ RISCVCPUImpliedExtsRule *riscv_multi_ext_implied_rules[] = {
> &ZVFH_IMPLIED, &ZVFHMIN_IMPLIED, &ZVKN_IMPLIED,
> &ZVKNC_IMPLIED, &ZVKNG_IMPLIED, &ZVKNHB_IMPLIED,
> &ZVKS_IMPLIED, &ZVKSC_IMPLIED, &ZVKSG_IMPLIED, &SSCFG_IMPLIED,
> - &SUPM_IMPLIED, &SSPM_IMPLIED,
> + &SUPM_IMPLIED, &SSPM_IMPLIED, &SMCTR_IMPLIED, &SSCTR_IMPLIED,
> NULL
> };
>
> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> index a62c50f057f487753a79393306641d3e50085ee5..d0068ce98c156abd67b7d08f94f29edb957143bd 100644
> --- a/target/riscv/csr.c
> +++ b/target/riscv/csr.c
> @@ -2431,6 +2431,13 @@ static bool xiselect_cd_range(target_ulong isel)
> return (ISELECT_CD_FIRST <= isel && isel <= ISELECT_CD_LAST);
> }
>
> +static bool xiselect_ctr_range(int csrno, target_ulong isel)
> +{
> + /* MIREG-MIREG6 for the range 0x200-0x2ff are not used by CTR. */
> + return CTR_ENTRIES_FIRST <= isel && isel <= CTR_ENTRIES_LAST &&
> + csrno < CSR_MIREG;
> +}
> +
> static int rmw_iprio(target_ulong xlen,
> target_ulong iselect, uint8_t *iprio,
> target_ulong *val, target_ulong new_val,
> @@ -2476,6 +2483,124 @@ static int rmw_iprio(target_ulong xlen,
> return 0;
> }
>
> +static int rmw_ctrsource(CPURISCVState *env, int isel, target_ulong *val,
> + target_ulong new_val, target_ulong wr_mask)
> +{
> + /*
> + * CTR arrays are treated as circular buffers and TOS always points to next
> + * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
> + * 0 is always the latest one, traversal is a bit different here. See the
> + * below example.
> + *
> + * Depth = 16.
> + *
> + * idx [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
> + * TOS H
> + * entry 6 5 4 3 2 1 0 F E D C B A 9 8 7
> + */
> + const uint64_t entry = isel - CTR_ENTRIES_FIRST;
> + const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
> + uint64_t idx;
> +
> + /* Entry greater than depth-1 is read-only zero */
> + if (entry >= depth) {
> + if (val) {
> + *val = 0;
> + }
> + return 0;
> + }
> +
> + idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
> + idx = (idx - entry - 1) & (depth - 1);
> +
> + if (val) {
> + *val = env->ctr_src[idx];
> + }
> +
> + env->ctr_src[idx] = (env->ctr_src[idx] & ~wr_mask) | (new_val & wr_mask);
> +
> + return 0;
> +}
> +
> +static int rmw_ctrtarget(CPURISCVState *env, int isel, target_ulong *val,
> + target_ulong new_val, target_ulong wr_mask)
> +{
> + /*
> + * CTR arrays are treated as circular buffers and TOS always points to next
> + * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
> + * 0 is always the latest one, traversal is a bit different here. See the
> + * below example.
> + *
> + * Depth = 16.
> + *
> + * idx [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
> + * head H
> + * entry 6 5 4 3 2 1 0 F E D C B A 9 8 7
> + */
> + const uint64_t entry = isel - CTR_ENTRIES_FIRST;
> + const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
> + uint64_t idx;
> +
> + /* Entry greater than depth-1 is read-only zero */
> + if (entry >= depth) {
> + if (val) {
> + *val = 0;
> + }
> + return 0;
> + }
> +
> + idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
> + idx = (idx - entry - 1) & (depth - 1);
> +
> + if (val) {
> + *val = env->ctr_dst[idx];
> + }
> +
> + env->ctr_dst[idx] = (env->ctr_dst[idx] & ~wr_mask) | (new_val & wr_mask);
> +
> + return 0;
> +}
> +
> +static int rmw_ctrdata(CPURISCVState *env, int isel, target_ulong *val,
> + target_ulong new_val, target_ulong wr_mask)
> +{
> + /*
> + * CTR arrays are treated as circular buffers and TOS always points to next
> + * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
> + * 0 is always the latest one, traversal is a bit different here. See the
> + * below example.
> + *
> + * Depth = 16.
> + *
> + * idx [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
> + * head H
> + * entry 6 5 4 3 2 1 0 F E D C B A 9 8 7
> + */
> + const uint64_t entry = isel - CTR_ENTRIES_FIRST;
> + const uint64_t mask = wr_mask & CTRDATA_MASK;
> + const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
> + uint64_t idx;
> +
> + /* Entry greater than depth-1 is read-only zero */
> + if (entry >= depth) {
> + if (val) {
> + *val = 0;
> + }
> + return 0;
> + }
> +
> + idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
> + idx = (idx - entry - 1) & (depth - 1);
> +
> + if (val) {
> + *val = env->ctr_data[idx];
> + }
> +
> + env->ctr_data[idx] = (env->ctr_data[idx] & ~mask) | (new_val & mask);
> +
> + return 0;
> +}
> +
> static RISCVException rmw_xireg_aia(CPURISCVState *env, int csrno,
> target_ulong isel, target_ulong *val,
> target_ulong new_val, target_ulong wr_mask)
> @@ -2628,6 +2753,27 @@ done:
> return ret;
> }
>
> +static int rmw_xireg_ctr(CPURISCVState *env, int csrno,
> + target_ulong isel, target_ulong *val,
> + target_ulong new_val, target_ulong wr_mask)
> +{
> + if (!riscv_cpu_cfg(env)->ext_smctr && !riscv_cpu_cfg(env)->ext_ssctr) {
> + return -EINVAL;
> + }
> +
> + if (csrno == CSR_SIREG || csrno == CSR_VSIREG) {
> + return rmw_ctrsource(env, isel, val, new_val, wr_mask);
> + } else if (csrno == CSR_SIREG2 || csrno == CSR_VSIREG2) {
> + return rmw_ctrtarget(env, isel, val, new_val, wr_mask);
> + } else if (csrno == CSR_SIREG3 || csrno == CSR_VSIREG3) {
> + return rmw_ctrdata(env, isel, val, new_val, wr_mask);
> + } else if (val) {
> + *val = 0;
> + }
> +
> + return 0;
> +}
> +
> /*
> * rmw_xireg_csrind: Perform indirect access to xireg and xireg2-xireg6
> *
> @@ -2639,11 +2785,13 @@ static int rmw_xireg_csrind(CPURISCVState *env, int csrno,
> target_ulong isel, target_ulong *val,
> target_ulong new_val, target_ulong wr_mask)
> {
> - int ret = -EINVAL;
> bool virt = csrno == CSR_VSIREG ? true : false;
> + int ret = -EINVAL;
>
> if (xiselect_cd_range(isel)) {
> ret = rmw_xireg_cd(env, csrno, isel, val, new_val, wr_mask);
> + } else if (xiselect_ctr_range(csrno, isel)) {
> + ret = rmw_xireg_ctr(env, csrno, isel, val, new_val, wr_mask);
> } else {
> /*
> * As per the specification, access to unimplented region is undefined
> diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> index 027b0324136961c61efb3fcca7a8dc13920d5e4d..29f6a3a72901abd9d56744834c6b0c28ae8cf685 100644
> --- a/target/riscv/tcg/tcg-cpu.c
> +++ b/target/riscv/tcg/tcg-cpu.c
> @@ -681,6 +681,17 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
> return;
> }
>
> + if ((cpu->cfg.ext_smctr || cpu->cfg.ext_ssctr) &&
> + (!riscv_has_ext(env, RVS) || !cpu->cfg.ext_sscsrind)) {
> + if (cpu_cfg_ext_is_user_set(CPU_CFG_OFFSET(ext_smctr)) ||
> + cpu_cfg_ext_is_user_set(CPU_CFG_OFFSET(ext_ssctr))) {
> + error_setg(errp, "Smctr and Ssctr require S-mode and Sscsrind");
> + return;
> + }
> + cpu->cfg.ext_smctr = false;
> + cpu->cfg.ext_ssctr = false;
> + }
> +
> /*
> * Disable isa extensions based on priv spec after we
> * validated and set everything we need.
>
> ---
> base-commit: 485adaaf6657dd5070dbefed593b2923a397a63f
> change-id: 20250205-b4-ctr_upstream_v6-71418cd245ee
>
> Best regards,
> --
> Rajnesh Kanwal
>
>
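For readers following the index arithmetic in rmw_ctrsource(),
rmw_ctrtarget() and rmw_ctrdata() in the quoted patch, here is a minimal
standalone sketch of the logical-entry to physical-slot mapping and of the
masked write. It is not code from the patch: ctr_isel_to_idx() and
ctr_masked_write() are hypothetical helper names, the write pointer and
depth are passed in as plain integers instead of being read from
sctrstatus and sctrdepth, and the two range constants simply restate the
0x200 to 0x2ff xiselect range given in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values: the xiselect range for CTR entries per the commit message. */
#define CTR_ENTRIES_FIRST 0x200u
#define CTR_ENTRIES_LAST  0x2ffu

/*
 * Hypothetical helper: translate an xiselect value into the physical slot
 * of a circular buffer.
 *
 * wrptr is the next empty slot, so wrptr - 1 holds the latest record,
 * which is logical entry 0. depth must be a power of two.
 *
 * Returns false when isel is outside the CTR range or when the logical
 * entry is not below depth; the patch treats the latter case as
 * read-only zero.
 */
static bool ctr_isel_to_idx(uint32_t isel, uint32_t wrptr, uint32_t depth,
                            uint32_t *idx)
{
    if (isel < CTR_ENTRIES_FIRST || isel > CTR_ENTRIES_LAST) {
        return false;
    }

    uint32_t entry = isel - CTR_ENTRIES_FIRST;
    if (entry >= depth) {
        return false;
    }

    /* Unsigned wrap-around plus the mask walks backwards from wrptr - 1. */
    *idx = (wrptr - entry - 1u) & (depth - 1u);
    return true;
}

/* Hypothetical helper: update only the bits selected by wr_mask. */
static uint64_t ctr_masked_write(uint64_t old_val, uint64_t new_val,
                                 uint64_t wr_mask)
{
    return (old_val & ~wr_mask) | (new_val & wr_mask);
}
```

With depth 16 and a write pointer of 7, as in the diagram in the patch
comments, xiselect 0x200 resolves to slot 6, 0x207 wraps to slot 0xF, and
0x20f lands on slot 7.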
Thread overview: 3+ messages
2025-02-12 10:18 [PATCH v7] target/riscv: Add support to access ctrsource, ctrtarget, ctrdata regs Rajnesh Kanwal
2025-02-17 5:24 ` Alistair Francis [this message]
2025-02-17 12:06 ` Rajnesh Kanwal