* [PATCH v5 00/21] Add Counter delegation ISA extension support
@ 2025-03-27 19:35 Atish Patra
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra,
Kaiwen Xue, Clément Léger, Charlie Jenkins
This series adds the counter delegation extension support. It is based on
very early PoC work done by Kevin Xue and mostly rewritten after that.
The counter delegation ISA extension (Smcdeleg/Ssccfg) actually depends
on multiple ISA extensions.
1. S[m|s]csrind : The indirect CSR extension[1], which defines 5 additional
registers ([M|S|VS]IREG2-[M|S|VS]IREG6) to address the size limitation of
the RISC-V CSR address space.
2. Smstateen: The stateen bit[60] controls access to the registers reached
indirectly via the above indirect registers.
3. Smcdeleg/Ssccfg: The counter delegation extensions[2].
The counter delegation extension allows Supervisor mode to program the
hpmevent and hpmcounter registers directly without needing assistance from
M-mode via SBI calls. This results in faster perf profiling and very
few traps. This extension also introduces a scountinhibit CSR which allows
S-mode to stop/start any counter directly. As the counter delegation
extension can potentially have more than 100 CSRs, the specification
leverages the indirect CSR extension to save the precious CSR address range.
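To illustrate why indirect access saves CSR address space, here is a minimal
Python toy model of the select/alias pattern. It is only a sketch and not
kernel code: the class, the helper names and the select base value are
hypothetical, chosen for illustration rather than taken from the specification
or this series.

```python
class IndirectCsrFile:
    """Toy model of select/alias indirect CSR access (not real hardware)."""

    def __init__(self):
        self.select = 0      # models a *iselect CSR
        self._backing = {}   # (select value, alias index) -> stored value

    def write_ireg(self, alias, value):
        # A write to alias register N lands in the slot chosen by `select`.
        self._backing[(self.select, alias)] = value

    def read_ireg(self, alias):
        # Unwritten slots read as zero in this toy model.
        return self._backing.get((self.select, alias), 0)


def csr_ind_write(csrs, select, alias, value):
    """Select the target register, then write through the alias register."""
    csrs.select = select
    csrs.write_ireg(alias, value)


def csr_ind_read(csrs, select, alias):
    """Select the target register, then read through the alias register."""
    csrs.select = select
    return csrs.read_ireg(alias)


# Hypothetical select base, for illustration only.
HYPOTHETICAL_COUNTER_BASE = 0x40

csrs = IndirectCsrFile()
for idx in range(3, 6):
    csr_ind_write(csrs, HYPOTHETICAL_COUNTER_BASE + idx, 1, 0x1000 + idx)
```

Only the select register and a handful of alias registers occupy CSR
addresses; the number of reachable registers grows with the select value range.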
Due to the dependency on these extensions, the following extensions must be
enabled in qemu to use the counter delegation feature in S-mode.
"smstateen=true,sscofpmf=true,ssccfg=true,smcdeleg=true,smcsrind=true,sscsrind=true"
Alternatively, virt machine users can just use the "max" cpu instead.
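As a small convenience sketch, the property string above can be generated and
checked from a list of extension names. The helper names below are
hypothetical and are not part of qemu or this series; only the extension names
come from the list above.

```python
# Extensions that must be enabled together, as listed in the cover letter.
REQUIRED_EXTENSIONS = (
    "smstateen", "sscofpmf", "ssccfg", "smcdeleg", "smcsrind", "sscsrind",
)


def missing_extensions(enabled):
    """Return the required extensions absent from `enabled`, in list order."""
    return [ext for ext in REQUIRED_EXTENSIONS if ext not in enabled]


def cpu_properties(extensions=REQUIRED_EXTENSIONS):
    """Build the comma separated "name=true" property string."""
    return ",".join(f"{ext}=true" for ext in extensions)


print(cpu_properties())
print(missing_extensions({"smstateen", "ssccfg"}))
```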
When we access the counters directly in S-mode, we also need to solve the
following problems.
1. Event to counter mapping
2. Event encoding discovery
The RISC-V ISA doesn't define any standard for either the event encoding or the
event to counter mapping rules. Until now, the SBI PMU implementation has relied
on a device tree binding[3] to discover the event to counter mapping of the RISC-V
platform in the firmware. The SBI PMU specification[4] defines event encoding
for standard perf events as well. Thus, the kernel can query the appropriate
counter for a given event from the firmware.
However, the kernel doesn't need any firmware interaction for hardware
counters if counter delegation is available in the hardware. Thus, the driver
needs to discover the above mappings/encodings by itself without any assistance
from firmware.
Solution to Problem #1:
This patch series solves the above problem #1 by extending the perf tool so
that the event json file can specify the counter constraints of each event,
which can then be passed to the driver to choose the best counter for a given
event. The perf stat metric series[5] from Weilin already extends the perf tool
to parse the "Counter" property to specify the hardware counter restriction.
As that series has not been revised in a while, I have rebased it and included it
in this series. I can include only the necessary parts of that patch required
for this series if needed.
This series extends that support by converting comma separated string to a
bitmap. The counter constraint bitmap is passed to the perf driver via
newly introduced "counterid_mask" property set in "config2".
However, it results in the following event string which has repeated information
about the counters both in list and bitmask format. I am not sure how I can pass
the list information to the driver directly. That's why I added a
counterid_mask property.
Additionally, PATCH5 in [5] parses the bitmask information from the
string and puts it into the metric group structure. We can just convert it in
python easily and pass it to the metric group instead. PATCH19 does exactly
that and sets the counterid_mask property.
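The list-to-bitmap conversion can be sketched in a few lines of Python. This
is only an illustration of the idea, not the code from PATCH19: the function
names are hypothetical, and the way the mask is appended to the event string
is an assumed format.

```python
def counter_list_to_mask(counters: str) -> int:
    """Convert a comma separated counter list such as "0,1" to a bitmap."""
    mask = 0
    for field in counters.split(","):
        field = field.strip()
        if not field:
            continue  # tolerate an empty list or stray commas
        mask |= 1 << int(field)
    return mask


def event_string(base: str, counters: str) -> str:
    """Append counterid_mask to an event string (assumed format)."""
    mask = counter_list_to_mask(counters)
    return f"{base},counterid_mask={mask:#x}" if mask else base


print(event_string("event=0x8a", "0,1"))
```

An event without a "Counter" property yields a zero mask here, so the event
string is left unchanged in that case.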
@Weilin @Ian : Please let me know if there is a better way to solve the problem I
described.
Due to the new counterid_mask property, the layout in empty-pmu-events.c
changed; PATCH 21 updates it using the existing script.
Possible solutions to Problem #2:
1. Extend the PMU DT parsing support to the kernel as well. However, that
requires additional support in ACPI based systems. It also needs more
infrastructure in the virtualization stack.
2. Rename perf legacy events to riscv specific names. This will require users to
use perf differently than on other ISAs, which is not ideal.
3. Define an architecture specific override function for legacy events. An earlier
RFC version did that, but it is not preferred as arch specific behavior in the perf
tool has other ramifications on the tests.
4. Ian graciously helped and sent a generic fix[6] for #3 that prefers json
over legacy encoding. Unfortunately, it had some regressions and discussions
are ongoing about whether it is a viable solution.
5. Specify the encodings in the driver. There were earlier concerns about managing
these in the driver as these encodings are vendor specific in the absence of ISA
guidelines. However, we also need to support counter virtualization and legacy
event users (without the perf tool) as described in [7]. That's why this series
adopts this solution, similar to other ISAs. The vendors can define their pmu
event encoding and event to counter mapping in the driver.
Note: This solution is still compatible with solution #4 by Ian. It gives vendors
flexibility to define legacy event encoding in either the driver or json file
if Ian's series [6] is merged. If we can get rid of the legacy events in the
future, we can just rely on the json encodings. I have not added a json file for
qemu as I have not included Ian's patches in this series. But I have verified them
with a virt machine specific json file.
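A rough Python sketch of what a vendor specific table for solution #5 could
look like is shown below. It only illustrates the lookup idea: the real table
lives in the C driver, and every vendor ID, encoding and counter mask here is
made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LegacyEvent:
    encoding: int      # vendor specific event selector value (hypothetical)
    counter_mask: int  # bit i set => counter i may count this event


# All vendor IDs, encodings and masks below are invented for illustration.
VENDOR_TABLES = {
    0x1234: {
        "cycles": LegacyEvent(0x01, 0x1),
        "instructions": LegacyEvent(0x02, 0x4),
        "cache-misses": LegacyEvent(0x10019, 0x78),
    },
}


def pick_counter(vendor_id, event, busy_mask=0):
    """Return (encoding, counter index) or None if nothing is available."""
    table = VENDOR_TABLES.get(vendor_id)
    if table is None or event not in table:
        return None
    ev = table[event]
    free = ev.counter_mask & ~busy_mask
    if free == 0:
        return None
    # Isolate the lowest set bit and convert it to a counter index.
    idx = (free & -free).bit_length() - 1
    return ev.encoding, idx
```

Picking the lowest free counter is just one possible policy; the driver is
free to choose differently.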
The Qemu patches are available in upstream now.
The Linux kernel patches can be found here:
https://github.com/atishp04/linux/tree/b4/counter_delegation_v4
[1] https://github.com/riscv/riscv-indirect-csr-access
[2] https://github.com/riscv/riscv-smcdeleg-ssccfg
[3] https://www.kernel.org/doc/Documentation/devicetree/bindings/perf/riscv%2Cpmu.yaml
[4] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-pmu.adoc
[5] https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/
[6] https://lore.kernel.org/lkml/20250109222109.567031-1-irogers@google.com/
[7] https://lore.kernel.org/lkml/20241026121758.143259-1-irogers@google.com/T/#m653a6b98919a365a361a698032502bd26af9f6ba
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
Changes in v5:
- Fixed dt_binding_check errors.
- Added the ISA extension dependency for counter delegation extensions.
- Replaced the boolean variables with static key conditional check required at boot time.
- Miscellaneous minor code restructuring.
- Link to v4: https://lore.kernel.org/r/20250205-counter_delegation-v4-0-835cfa88e3b1@rivosinc.com
Changes in v4:
- Added ISA dependencies as per dt schema instead of description.
- Fixed a few compilation issues due to patch reordering in v3.
- Link to v3: https://lore.kernel.org/r/20250127-counter_delegation-v3-0-64894d7e16d5@rivosinc.com
Changes in v3:
- Fixed the dtb binding check failures.
- Included the fix reported by Rajnesh Kanwal for guest counter overflow.
- Rearranged the overflow handling more efficiently for better modularity.
- Link to v2: https://lore.kernel.org/r/20250114-counter_delegation-v2-0-8ba74cdb851b@rivosinc.com
Changes in v2:
- Dropped architecture specific overrides for event encoding.
- Dropped hwprobe bits.
- Added a vendor specific event encoding table to support vendor specific event
encoding and counter mapping.
- Fixed a few bugs and cleaned up the code.
- Link to v1: https://lore.kernel.org/r/20240217005738.3744121-1-atishp@rivosinc.com
---
Atish Patra (17):
RISC-V: Add Sxcsrind ISA extension definition and parsing
dt-bindings: riscv: add Sxcsrind ISA extension description
RISC-V: Define indirect CSR access helpers
RISC-V: Add Smcntrpmf extension parsing
dt-bindings: riscv: add Smcntrpmf ISA extension description
RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing
dt-bindings: riscv: add Counter delegation ISA extensions description
RISC-V: perf: Restructure the SBI PMU code
RISC-V: perf: Modify the counter discovery mechanism
RISC-V: perf: Add a mechanism to defined legacy event encoding
RISC-V: perf: Implement supervisor counter delegation support
RISC-V: perf: Use config2/vendor table for event to counter mapping
RISC-V: perf: Add legacy event encodings via sysfs
RISC-V: perf: Add Qemu virt machine events
tools/perf: Support event code for arch standard events
tools/perf: Pass the Counter constraint values in the pmu events
Sync empty-pmu-events.c with autogenerated one
Charlie Jenkins (1):
RISC-V: perf: Skip PMU SBI extension when not implemented
Kaiwen Xue (2):
RISC-V: Add Sxcsrind ISA extension CSR definitions
RISC-V: Add Sscfg extension CSR definition
Weilin Wang (1):
perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping
.../devicetree/bindings/riscv/extensions.yaml | 67 ++
MAINTAINERS | 4 +-
arch/riscv/include/asm/csr.h | 57 ++
arch/riscv/include/asm/csr_ind.h | 42 +
arch/riscv/include/asm/hwcap.h | 8 +
arch/riscv/include/asm/kvm_vcpu_pmu.h | 4 +-
arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 +-
arch/riscv/include/asm/vendorid_list.h | 4 +
arch/riscv/kernel/cpufeature.c | 27 +
arch/riscv/kvm/Makefile | 4 +-
arch/riscv/kvm/vcpu_sbi.c | 2 +-
drivers/perf/Kconfig | 16 +-
drivers/perf/Makefile | 4 +-
drivers/perf/{riscv_pmu.c => riscv_pmu_common.c} | 0
drivers/perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} | 982 +++++++++++++++++----
include/linux/perf/riscv_pmu.h | 26 +-
.../perf/pmu-events/arch/riscv/arch-standard.json | 10 +
tools/perf/pmu-events/empty-pmu-events.c | 299 ++++---
tools/perf/pmu-events/jevents.py | 218 ++++-
tools/perf/pmu-events/pmu-events.h | 32 +-
20 files changed, 1490 insertions(+), 318 deletions(-)
---
base-commit: 74e48c9286409128a3dccd0efd4b83b3f9ef95fd
change-id: 20240715-counter_delegation-628a32f8c9cc
--
Regards,
Atish patra
* [PATCH v5 01/21] perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
From: Weilin Wang <weilin.wang@intel.com>
These functions are added to parse event counter restrictions and counter
availability info from json files so that the metric grouping method could
do grouping based on the counter restriction of events and the counters
that are available on the system.
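As a rough illustration of the parsing step, the sketch below reads the
"Counter" property from a json event list. It is not the jevents.py code from
this patch: the function name is hypothetical and the sample events are
simplified stand-ins for real json files.

```python
import json


def parse_event_counters(json_text):
    """Map EventName -> list of counter indices from the "Counter" property."""
    result = {}
    for event in json.loads(json_text):
        name = event.get("EventName")
        if name is None:
            continue
        raw = event.get("Counter", "")
        # An absent or empty property means no counter restriction is listed.
        result[name] = [int(c) for c in raw.split(",") if c.strip()]
    return result


# Simplified sample input; the field values are illustrative assumptions.
SAMPLE = """[
  {"EventName": "segment_reg_loads.any", "EventCode": "0x06", "Counter": "0,1"},
  {"EventName": "bp_l1_btb_correct", "EventCode": "0x8a"}
]"""

print(parse_event_counters(SAMPLE))
```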
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
---
tools/perf/pmu-events/empty-pmu-events.c | 299 ++++++++++++++++++++-----------
tools/perf/pmu-events/jevents.py | 205 ++++++++++++++++++++-
tools/perf/pmu-events/pmu-events.h | 32 +++-
3 files changed, 419 insertions(+), 117 deletions(-)
diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c
index 1c7a2cfa321f..3a7ec31576f5 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -20,73 +20,73 @@ struct pmu_table_entry {
static const char *const big_c_string =
/* offset=0 */ "tool\000"
-/* offset=5 */ "duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000"
-/* offset=78 */ "user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000"
-/* offset=145 */ "system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000"
-/* offset=210 */ "has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000"
-/* offset=283 */ "num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000"
-/* offset=425 */ "num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000"
-/* offset=525 */ "num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000"
-/* offset=639 */ "num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000"
-/* offset=712 */ "num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000"
-/* offset=795 */ "slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000"
-/* offset=902 */ "smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000"
-/* offset=1006 */ "system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000"
-/* offset=1102 */ "default_core\000"
-/* offset=1115 */ "bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000"
-/* offset=1174 */ "bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000"
-/* offset=1233 */ "l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000"
-/* offset=1328 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\000"
-/* offset=1427 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\000"
-/* offset=1557 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\000"
-/* offset=1672 */ "hisi_sccl,ddrc\000"
-/* offset=1687 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000"
-/* offset=1773 */ "uncore_cbox\000"
-/* offset=1785 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000"
-/* offset=2016 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000"
-/* offset=2081 */ "event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000"
-/* offset=2152 */ "hisi_sccl,l3c\000"
-/* offset=2166 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000"
-/* offset=2246 */ "uncore_imc_free_running\000"
-/* offset=2270 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000"
-/* offset=2365 */ "uncore_imc\000"
-/* offset=2376 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000"
-/* offset=2454 */ "uncore_sys_ddr_pmu\000"
-/* offset=2473 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000"
-/* offset=2546 */ "uncore_sys_ccn_pmu\000"
-/* offset=2565 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000"
-/* offset=2639 */ "uncore_sys_cmn_pmu\000"
-/* offset=2658 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000"
-/* offset=2798 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000"
-/* offset=2820 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000"
-/* offset=2883 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000"
-/* offset=3049 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
-/* offset=3113 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
-/* offset=3180 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000"
-/* offset=3251 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000"
-/* offset=3345 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000"
-/* offset=3479 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000"
-/* offset=3543 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000"
-/* offset=3611 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000"
-/* offset=3681 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000"
-/* offset=3703 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000"
-/* offset=3725 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000"
-/* offset=3745 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000"
+/* offset=5 */ "duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000\000"
+/* offset=79 */ "user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000\000"
+/* offset=147 */ "system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000\000"
+/* offset=213 */ "has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000\000"
+/* offset=287 */ "num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000\000"
+/* offset=430 */ "num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000\000"
+/* offset=531 */ "num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000\000"
+/* offset=646 */ "num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000\000"
+/* offset=720 */ "num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000\000"
+/* offset=804 */ "slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000\000"
+/* offset=912 */ "smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000\000"
+/* offset=1017 */ "system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000\000"
+/* offset=1114 */ "default_core\000"
+/* offset=1127 */ "bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000"
+/* offset=1187 */ "bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000"
+/* offset=1247 */ "l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000"
+/* offset=1343 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000"
+/* offset=1446 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000"
+/* offset=1580 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000"
+/* offset=1699 */ "hisi_sccl,ddrc\000"
+/* offset=1714 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000"
+/* offset=1801 */ "uncore_cbox\000"
+/* offset=1813 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000"
+/* offset=2048 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000"
+/* offset=2114 */ "event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000"
+/* offset=2186 */ "hisi_sccl,l3c\000"
+/* offset=2200 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000"
+/* offset=2281 */ "uncore_imc_free_running\000"
+/* offset=2305 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000"
+/* offset=2401 */ "uncore_imc\000"
+/* offset=2412 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000"
+/* offset=2491 */ "uncore_sys_ddr_pmu\000"
+/* offset=2510 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000"
+/* offset=2584 */ "uncore_sys_ccn_pmu\000"
+/* offset=2603 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000"
+/* offset=2678 */ "uncore_sys_cmn_pmu\000"
+/* offset=2697 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000"
+/* offset=2838 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000"
+/* offset=2860 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000"
+/* offset=2923 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000"
+/* offset=3089 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
+/* offset=3153 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
+/* offset=3220 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000"
+/* offset=3291 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000"
+/* offset=3385 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000"
+/* offset=3519 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000"
+/* offset=3583 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000"
+/* offset=3651 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000"
+/* offset=3721 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000"
+/* offset=3743 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000"
+/* offset=3765 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000"
+/* offset=3785 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000"
;
static const struct compact_pmu_event pmu_events__common_tool[] = {
-{ 5 }, /* duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000 */
-{ 210 }, /* has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000 */
-{ 283 }, /* num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000 */
-{ 425 }, /* num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000 */
-{ 525 }, /* num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000 */
-{ 639 }, /* num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000 */
-{ 712 }, /* num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000 */
-{ 795 }, /* slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000 */
-{ 902 }, /* smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000 */
-{ 145 }, /* system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000 */
-{ 1006 }, /* system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000 */
-{ 78 }, /* user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000 */
+{ 5 }, /* duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000\000 */
+{ 213 }, /* has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000\000 */
+{ 287 }, /* num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000\000 */
+{ 430 }, /* num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000\000 */
+{ 531 }, /* num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000\000 */
+{ 646 }, /* num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000\000 */
+{ 720 }, /* num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000\000 */
+{ 804 }, /* slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000\000 */
+{ 912 }, /* smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000\000 */
+{ 147 }, /* system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000\000 */
+{ 1017 }, /* system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000\000 */
+{ 79 }, /* user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000\000 */
};
@@ -99,29 +99,29 @@ const struct pmu_table_entry pmu_events__common[] = {
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_default_core[] = {
-{ 1115 }, /* bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000 */
-{ 1174 }, /* bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000 */
-{ 1427 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\000 */
-{ 1557 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\000 */
-{ 1233 }, /* l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000 */
-{ 1328 }, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\000 */
+{ 1127 }, /* bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000 */
+{ 1187 }, /* bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000 */
+{ 1446 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000 */
+{ 1580 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000 */
+{ 1247 }, /* l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000 */
+{ 1343 }, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_ddrc[] = {
-{ 1687 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000 */
+{ 1714 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_l3c[] = {
-{ 2166 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000 */
+{ 2200 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_cbox[] = {
-{ 2016 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000 */
-{ 2081 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000 */
-{ 1785 }, /* unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000 */
+{ 2048 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000 */
+{ 2114 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000 */
+{ 1813 }, /* unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_imc[] = {
-{ 2376 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000 */
+{ 2412 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_imc_free_running[] = {
-{ 2270 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000 */
+{ 2305 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000 */
};
@@ -129,51 +129,51 @@ const struct pmu_table_entry pmu_events__test_soc_cpu[] = {
{
.entries = pmu_events__test_soc_cpu_default_core,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_default_core),
- .pmu_name = { 1102 /* default_core\000 */ },
+ .pmu_name = { 1114 /* default_core\000 */ },
},
{
.entries = pmu_events__test_soc_cpu_hisi_sccl_ddrc,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_ddrc),
- .pmu_name = { 1672 /* hisi_sccl,ddrc\000 */ },
+ .pmu_name = { 1699 /* hisi_sccl,ddrc\000 */ },
},
{
.entries = pmu_events__test_soc_cpu_hisi_sccl_l3c,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_l3c),
- .pmu_name = { 2152 /* hisi_sccl,l3c\000 */ },
+ .pmu_name = { 2186 /* hisi_sccl,l3c\000 */ },
},
{
.entries = pmu_events__test_soc_cpu_uncore_cbox,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_cbox),
- .pmu_name = { 1773 /* uncore_cbox\000 */ },
+ .pmu_name = { 1801 /* uncore_cbox\000 */ },
},
{
.entries = pmu_events__test_soc_cpu_uncore_imc,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc),
- .pmu_name = { 2365 /* uncore_imc\000 */ },
+ .pmu_name = { 2401 /* uncore_imc\000 */ },
},
{
.entries = pmu_events__test_soc_cpu_uncore_imc_free_running,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc_free_running),
- .pmu_name = { 2246 /* uncore_imc_free_running\000 */ },
+ .pmu_name = { 2281 /* uncore_imc_free_running\000 */ },
},
};
static const struct compact_pmu_event pmu_metrics__test_soc_cpu_default_core[] = {
-{ 2798 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000 */
-{ 3479 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */
-{ 3251 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */
-{ 3345 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */
-{ 3543 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
-{ 3611 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
-{ 2883 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */
-{ 2820 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */
-{ 3745 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */
-{ 3681 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */
-{ 3703 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */
-{ 3725 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */
-{ 3180 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */
-{ 3049 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
-{ 3113 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
+{ 2838 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000 */
+{ 3519 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */
+{ 3291 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */
+{ 3385 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */
+{ 3583 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
+{ 3651 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
+{ 2923 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */
+{ 2860 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */
+{ 3785 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */
+{ 3721 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */
+{ 3743 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */
+{ 3765 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */
+{ 3220 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */
+{ 3089 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
+{ 3153 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
};
@@ -181,18 +181,18 @@ const struct pmu_table_entry pmu_metrics__test_soc_cpu[] = {
{
.entries = pmu_metrics__test_soc_cpu_default_core,
.num_entries = ARRAY_SIZE(pmu_metrics__test_soc_cpu_default_core),
- .pmu_name = { 1102 /* default_core\000 */ },
+ .pmu_name = { 1114 /* default_core\000 */ },
},
};
static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ccn_pmu[] = {
-{ 2565 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000 */
+{ 2603 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_cmn_pmu[] = {
-{ 2658 }, /* sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000 */
+{ 2697 }, /* sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ddr_pmu[] = {
-{ 2473 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000 */
+{ 2510 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000 */
};
@@ -200,17 +200,17 @@ const struct pmu_table_entry pmu_events__test_soc_sys[] = {
{
.entries = pmu_events__test_soc_sys_uncore_sys_ccn_pmu,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ccn_pmu),
- .pmu_name = { 2546 /* uncore_sys_ccn_pmu\000 */ },
+ .pmu_name = { 2584 /* uncore_sys_ccn_pmu\000 */ },
},
{
.entries = pmu_events__test_soc_sys_uncore_sys_cmn_pmu,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_cmn_pmu),
- .pmu_name = { 2639 /* uncore_sys_cmn_pmu\000 */ },
+ .pmu_name = { 2678 /* uncore_sys_cmn_pmu\000 */ },
},
{
.entries = pmu_events__test_soc_sys_uncore_sys_ddr_pmu,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ddr_pmu),
- .pmu_name = { 2454 /* uncore_sys_ddr_pmu\000 */ },
+ .pmu_name = { 2491 /* uncore_sys_ddr_pmu\000 */ },
},
};
@@ -227,6 +227,12 @@ struct pmu_metrics_table {
uint32_t num_pmus;
};
+/* Struct used to make the PMU counter layout table implementation opaque to callers. */
+struct pmu_layouts_table {
+ const struct compact_pmu_event *entries;
+ size_t length;
+};
+
/*
* Map a CPU to its table of PMU events. The CPU is identified by the
* cpuid field, which is an arch-specific identifier for the CPU.
@@ -240,6 +246,7 @@ struct pmu_events_map {
const char *cpuid;
struct pmu_events_table event_table;
struct pmu_metrics_table metric_table;
+ struct pmu_layouts_table layout_table;
};
/*
@@ -273,6 +280,7 @@ const struct pmu_events_map pmu_events_map[] = {
.cpuid = 0,
.event_table = { 0, 0 },
.metric_table = { 0, 0 },
+ .layout_table = { 0, 0 },
}
};
@@ -317,6 +325,8 @@ static void decompress_event(int offset, struct pmu_event *pe)
pe->unit = (*p == '\0' ? NULL : p);
while (*p++);
pe->long_desc = (*p == '\0' ? NULL : p);
+ while (*p++);
+ pe->counters_list = (*p == '\0' ? NULL : p);
}
static void decompress_metric(int offset, struct pmu_metric *pm)
@@ -348,6 +358,19 @@ static void decompress_metric(int offset, struct pmu_metric *pm)
pm->event_grouping = *p - '0';
}
+static void decompress_layout(int offset, struct pmu_layout *pm)
+{
+ const char *p = &big_c_string[offset];
+
+ pm->pmu = (*p == '\0' ? NULL : p);
+ while (*p++);
+ pm->desc = (*p == '\0' ? NULL : p);
+ p++;
+ pm->counters_num_gp = *p - '0';
+ p++;
+ pm->counters_num_fixed = *p - '0';
+}
+
static int pmu_events_table__for_each_event_pmu(const struct pmu_events_table *table,
const struct pmu_table_entry *pmu,
pmu_event_iter_fn fn,
@@ -503,6 +526,21 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table,
return 0;
}
+int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table,
+ pmu_layout_iter_fn fn,
+ void *data) {
+ for (size_t i = 0; i < table->length; i++) {
+ struct pmu_layout pm;
+ int ret;
+
+ decompress_layout(table->entries[i].offset, &pm);
+ ret = fn(&pm, data);
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
static const struct pmu_events_map *map_for_cpu(struct perf_cpu cpu)
{
static struct {
@@ -595,6 +633,34 @@ const struct pmu_metrics_table *pmu_metrics_table__find(void)
return map ? &map->metric_table : NULL;
}
+const struct pmu_layouts_table *perf_pmu__find_layouts_table(void)
+{
+ const struct pmu_layouts_table *table = NULL;
+ struct perf_cpu cpu = {-1};
+ char *cpuid = get_cpuid_allow_env_override(cpu);
+ int i;
+
+ /* On some platforms that use a cpus map, cpuid can be NULL for
+ * PMUs other than CORE PMUs.
+ */
+ if (!cpuid)
+ return NULL;
+
+ i = 0;
+ for (;;) {
+ const struct pmu_events_map *map = &pmu_events_map[i++];
+ if (!map->arch)
+ break;
+
+ if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
+ table = &map->layout_table;
+ break;
+ }
+ }
+ free(cpuid);
+ return table;
+}
+
const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid)
{
for (const struct pmu_events_map *tables = &pmu_events_map[0];
@@ -616,6 +682,16 @@ const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const
}
return NULL;
}
+const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid)
+{
+ for (const struct pmu_events_map *tables = &pmu_events_map[0];
+ tables->arch;
+ tables++) {
+ if (!strcmp(tables->arch, arch) && !strcmp_cpuid_str(tables->cpuid, cpuid))
+ return &tables->layout_table;
+ }
+ return NULL;
+}
int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data)
{
@@ -644,6 +720,19 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data)
return 0;
}
+int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data)
+{
+ for (const struct pmu_events_map *tables = &pmu_events_map[0];
+ tables->arch;
+ tables++) {
+ int ret = pmu_layouts_table__for_each_layout(&tables->layout_table, fn, data);
+
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
const struct pmu_events_table *find_sys_events_table(const char *name)
{
for (const struct pmu_sys_events *tables = &pmu_sys_event_tables[0];
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index 3e204700b59a..fa7c466a5ef3 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -23,6 +23,8 @@ _metric_tables = []
_sys_metric_tables = []
# Mapping between sys event table names and sys metric table names.
_sys_event_table_to_metric_table_mapping = {}
+# List of regular PMU counter layout tables.
+_pmu_layouts_tables = []
# Map from an event name to an architecture standard
# JsonEvent. Architecture standard events are in json files in the top
# f'{_args.starting_dir}/{_args.arch}' directory.
@@ -31,6 +33,10 @@ _arch_std_events = {}
_pending_events = []
# Name of events table to be written out
_pending_events_tblname = None
+# PMU counter layout to write out when the layout table is closed
+_pending_pmu_counts = []
+# Name of PMU counter layout table to be written out
+_pending_pmu_counts_tblname = None
# Metrics to write out when the table is closed
_pending_metrics = []
# Name of metrics table to be written out
@@ -51,6 +57,11 @@ _json_event_attributes = [
'long_desc'
]
+# Attributes that are in pmu_unit_layout.
+_json_layout_attributes = [
+ 'pmu', 'desc'
+]
+
# Attributes that are in pmu_metric rather than pmu_event.
_json_metric_attributes = [
'metric_name', 'metric_group', 'metric_expr', 'metric_threshold',
@@ -265,7 +276,7 @@ class JsonEvent:
def unit_to_pmu(unit: str) -> Optional[str]:
"""Convert a JSON Unit to Linux PMU name."""
- if not unit:
+ if not unit or unit == "core":
return 'default_core'
# Comment brought over from jevents.c:
# it's not realistic to keep adding these, we need something more scalable ...
@@ -336,6 +347,19 @@ class JsonEvent:
if 'Errata' in jd:
extra_desc += ' Spec update: ' + jd['Errata']
self.pmu = unit_to_pmu(jd.get('Unit'))
+ # The list of counter(s) the event could be collected with
+ class Counter:
+ gp = str()
+ fixed = str()
+ self.counters = {'list': str(), 'num': Counter()}
+ self.counters['list'] = jd.get('Counter')
+ # Number of generic counters
+ self.counters['num'].gp = jd.get('CountersNumGeneric')
+ # Number of fixed counters
+ self.counters['num'].fixed = jd.get('CountersNumFixed')
+ # If the event uses an MSR, other events that use the same MSR cannot be
+ # scheduled for collection at the same time.
+ self.msr = jd.get('MSRIndex')
filter = jd.get('Filter')
self.unit = jd.get('ScaleUnit')
self.perpkg = jd.get('PerPkg')
@@ -411,8 +435,20 @@ class JsonEvent:
s += f'\t{attr} = {value},\n'
return s + '}'
- def build_c_string(self, metric: bool) -> str:
+ def build_c_string(self, metric: bool, layout: bool) -> str:
s = ''
+ if layout:
+ for attr in _json_layout_attributes:
+ x = getattr(self, attr)
+ if attr in _json_enum_attributes:
+ s += x if x else '0'
+ else:
+ s += f'{x}\\000' if x else '\\000'
+ x = self.counters['num'].gp
+ s += x if x else '0'
+ x = self.counters['num'].fixed
+ s += x if x else '0'
+ return s
for attr in _json_metric_attributes if metric else _json_event_attributes:
x = getattr(self, attr)
if metric and x and attr == 'metric_expr':
@@ -425,15 +461,18 @@ class JsonEvent:
s += x if x else '0'
else:
s += f'{x}\\000' if x else '\\000'
+ if not metric:
+ x = self.counters['list']
+ s += f'{x}\\000' if x else '\\000'
return s
- def to_c_string(self, metric: bool) -> str:
+ def to_c_string(self, metric: bool, layout: bool) -> str:
"""Representation of the event as a C struct initializer."""
def fix_comment(s: str) -> str:
return s.replace('*/', r'\*\/')
- s = self.build_c_string(metric)
+ s = self.build_c_string(metric, layout)
return f'{{ { _bcs.offsets[s] } }}, /* {fix_comment(s)} */\n'
@@ -472,6 +511,8 @@ def preprocess_arch_std_files(archpath: str) -> None:
_arch_std_events[event.name.lower()] = event
if event.metric_name:
_arch_std_events[event.metric_name.lower()] = event
+ if event.counters['num'].gp:
+ _arch_std_events[event.pmu.lower()] = event
except Exception as e:
raise RuntimeError(f'Failure processing \'{item.name}\' in \'{archpath}\'') from e
@@ -483,6 +524,8 @@ def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
_pending_events.append(e)
if e.metric_name:
_pending_metrics.append(e)
+ if e.counters['num'].gp:
+ _pending_pmu_counts.append(e)
def print_pending_events() -> None:
@@ -526,8 +569,8 @@ def print_pending_events() -> None:
last_pmu = event.pmu
pmus.add((event.pmu, pmu_name))
- _args.output_file.write(event.to_c_string(metric=False))
last_name = event.name
+ _args.output_file.write(event.to_c_string(metric=False, layout=False))
_pending_events = []
_args.output_file.write(f"""
@@ -582,7 +625,7 @@ def print_pending_metrics() -> None:
last_pmu = metric.pmu
pmus.add((metric.pmu, pmu_name))
- _args.output_file.write(metric.to_c_string(metric=True))
+ _args.output_file.write(metric.to_c_string(metric=True, layout=False))
_pending_metrics = []
_args.output_file.write(f"""
@@ -600,6 +643,35 @@ const struct pmu_table_entry {_pending_metrics_tblname}[] = {{
""")
_args.output_file.write('};\n\n')
+def print_pending_pmu_counter_layout_table() -> None:
+ '''Print counter layout data from counter.json file to counter layout table in
+ c-string'''
+
+ def pmu_counts_cmp_key(j: JsonEvent) -> Tuple[bool, str]:
+ def fix_none(s: Optional[str]) -> str:
+ if s is None:
+ return ''
+ return s
+
+ return (j.desc is not None, fix_none(j.pmu))
+
+ global _pending_pmu_counts
+ if not _pending_pmu_counts:
+ return
+
+ global _pending_pmu_counts_tblname
+ global _pmu_layouts_tables
+ _pmu_layouts_tables.append(_pending_pmu_counts_tblname)
+
+ _args.output_file.write(
+ f'static const struct compact_pmu_event {_pending_pmu_counts_tblname}[] = {{\n')
+
+ for pmu_layout in sorted(_pending_pmu_counts, key=pmu_counts_cmp_key):
+ _args.output_file.write(pmu_layout.to_c_string(metric=False, layout=True))
+ _pending_pmu_counts = []
+
+ _args.output_file.write('};\n\n')
+
def get_topic(topic: str) -> str:
if topic.endswith('metrics.json'):
return 'metrics'
@@ -636,10 +708,12 @@ def preprocess_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
pmu_name = f"{event.pmu}\\000"
if event.name:
_bcs.add(pmu_name, metric=False)
- _bcs.add(event.build_c_string(metric=False), metric=False)
+ _bcs.add(event.build_c_string(metric=False, layout=False), metric=False)
if event.metric_name:
_bcs.add(pmu_name, metric=True)
- _bcs.add(event.build_c_string(metric=True), metric=True)
+ _bcs.add(event.build_c_string(metric=True, layout=False), metric=True)
+ if event.counters['num'].gp:
+ _bcs.add(event.build_c_string(metric=False, layout=True), metric=False)
def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
"""Process a JSON file during the main walk."""
@@ -656,11 +730,14 @@ def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
if item.is_dir() and is_leaf_dir_ignoring_sys(item.path):
print_pending_events()
print_pending_metrics()
+ print_pending_pmu_counter_layout_table()
global _pending_events_tblname
_pending_events_tblname = file_name_to_table_name('pmu_events_', parents, item.name)
global _pending_metrics_tblname
_pending_metrics_tblname = file_name_to_table_name('pmu_metrics_', parents, item.name)
+ global _pending_pmu_counts_tblname
+ _pending_pmu_counts_tblname = file_name_to_table_name('pmu_layouts_', parents, item.name)
if item.name == 'sys':
_sys_event_table_to_metric_table_mapping[_pending_events_tblname] = _pending_metrics_tblname
@@ -694,6 +771,12 @@ struct pmu_metrics_table {
uint32_t num_pmus;
};
+/* Struct used to make the PMU counter layout table implementation opaque to callers. */
+struct pmu_layouts_table {
+ const struct compact_pmu_event *entries;
+ size_t length;
+};
+
/*
* Map a CPU to its table of PMU events. The CPU is identified by the
* cpuid field, which is an arch-specific identifier for the CPU.
@@ -707,6 +790,7 @@ struct pmu_events_map {
const char *cpuid;
struct pmu_events_table event_table;
struct pmu_metrics_table metric_table;
+ struct pmu_layouts_table layout_table;
};
/*
@@ -762,6 +846,12 @@ const struct pmu_events_map pmu_events_map[] = {
metric_size = '0'
if event_size == '0' and metric_size == '0':
continue
+ layout_tblname = file_name_to_table_name('pmu_layouts_', [], row[2].replace('/', '_'))
+ if layout_tblname in _pmu_layouts_tables:
+ layout_size = f'ARRAY_SIZE({layout_tblname})'
+ else:
+ layout_tblname = 'NULL'
+ layout_size = '0'
cpuid = row[0].replace('\\', '\\\\')
_args.output_file.write(f"""{{
\t.arch = "{arch}",
@@ -773,6 +863,10 @@ const struct pmu_events_map pmu_events_map[] = {
\t.metric_table = {{
\t\t.pmus = {metric_tblname},
\t\t.num_pmus = {metric_size}
+\t}},
+\t.layout_table = {{
+\t\t.entries = {layout_tblname},
+\t\t.length = {layout_size}
\t}}
}},
""")
@@ -783,6 +877,7 @@ const struct pmu_events_map pmu_events_map[] = {
\t.cpuid = 0,
\t.event_table = { 0, 0 },
\t.metric_table = { 0, 0 },
+\t.layout_table = { 0, 0 },
}
};
""")
@@ -851,6 +946,9 @@ static void decompress_event(int offset, struct pmu_event *pe)
_args.output_file.write('\tp++;')
else:
_args.output_file.write('\twhile (*p++);')
+ _args.output_file.write('\twhile (*p++);')
+ _args.output_file.write(f'\n\tpe->counters_list = ')
+ _args.output_file.write("(*p == '\\0' ? NULL : p);\n")
_args.output_file.write("""}
static void decompress_metric(int offset, struct pmu_metric *pm)
@@ -871,6 +969,30 @@ static void decompress_metric(int offset, struct pmu_metric *pm)
_args.output_file.write('\twhile (*p++);')
_args.output_file.write("""}
+static void decompress_layout(int offset, struct pmu_layout *pm)
+{
+\tconst char *p = &big_c_string[offset];
+""")
+ for attr in _json_layout_attributes:
+ _args.output_file.write(f'\n\tpm->{attr} = ')
+ if attr in _json_enum_attributes:
+ _args.output_file.write("*p - '0';\n")
+ else:
+ _args.output_file.write("(*p == '\\0' ? NULL : p);\n")
+ if attr == _json_layout_attributes[-1]:
+ continue
+ if attr in _json_enum_attributes:
+ _args.output_file.write('\tp++;')
+ else:
+ _args.output_file.write('\twhile (*p++);')
+ _args.output_file.write('\tp++;')
+ _args.output_file.write(f'\n\tpm->counters_num_gp = ')
+ _args.output_file.write("*p - '0';\n")
+ _args.output_file.write('\tp++;')
+ _args.output_file.write(f'\n\tpm->counters_num_fixed = ')
+ _args.output_file.write("*p - '0';\n")
+ _args.output_file.write("""}
+
static int pmu_events_table__for_each_event_pmu(const struct pmu_events_table *table,
const struct pmu_table_entry *pmu,
pmu_event_iter_fn fn,
@@ -1026,6 +1148,21 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table,
return 0;
}
+int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table,
+ pmu_layout_iter_fn fn,
+ void *data) {
+ for (size_t i = 0; i < table->length; i++) {
+ struct pmu_layout pm;
+ int ret;
+
+ decompress_layout(table->entries[i].offset, &pm);
+ ret = fn(&pm, data);
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
static const struct pmu_events_map *map_for_cpu(struct perf_cpu cpu)
{
static struct {
@@ -1118,6 +1255,34 @@ const struct pmu_metrics_table *pmu_metrics_table__find(void)
return map ? &map->metric_table : NULL;
}
+const struct pmu_layouts_table *perf_pmu__find_layouts_table(void)
+{
+ const struct pmu_layouts_table *table = NULL;
+ struct perf_cpu cpu = {-1};
+ char *cpuid = get_cpuid_allow_env_override(cpu);
+ int i;
+
+ /* On some platforms that use a cpus map, cpuid can be NULL for
+ * PMUs other than CORE PMUs.
+ */
+ if (!cpuid)
+ return NULL;
+
+ i = 0;
+ for (;;) {
+ const struct pmu_events_map *map = &pmu_events_map[i++];
+ if (!map->arch)
+ break;
+
+ if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
+ table = &map->layout_table;
+ break;
+ }
+ }
+ free(cpuid);
+ return table;
+}
+
const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid)
{
for (const struct pmu_events_map *tables = &pmu_events_map[0];
@@ -1139,6 +1304,16 @@ const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const
}
return NULL;
}
+const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid)
+{
+ for (const struct pmu_events_map *tables = &pmu_events_map[0];
+ tables->arch;
+ tables++) {
+ if (!strcmp(tables->arch, arch) && !strcmp_cpuid_str(tables->cpuid, cpuid))
+ return &tables->layout_table;
+ }
+ return NULL;
+}
int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data)
{
@@ -1167,6 +1342,19 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data)
return 0;
}
+int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data)
+{
+ for (const struct pmu_events_map *tables = &pmu_events_map[0];
+ tables->arch;
+ tables++) {
+ int ret = pmu_layouts_table__for_each_layout(&tables->layout_table, fn, data);
+
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
const struct pmu_events_table *find_sys_events_table(const char *name)
{
for (const struct pmu_sys_events *tables = &pmu_sys_event_tables[0];
@@ -1330,6 +1518,7 @@ struct pmu_table_entry {
ftw(arch_path, [], process_one_file)
print_pending_events()
print_pending_metrics()
+ print_pending_pmu_counter_layout_table()
print_mapping_table(archs)
print_system_mapping_table()
diff --git a/tools/perf/pmu-events/pmu-events.h b/tools/perf/pmu-events/pmu-events.h
index 675562e6f770..9a5cbec32513 100644
--- a/tools/perf/pmu-events/pmu-events.h
+++ b/tools/perf/pmu-events/pmu-events.h
@@ -45,6 +45,11 @@ struct pmu_event {
const char *desc;
const char *topic;
const char *long_desc;
+ /**
+ * The list of counter(s) the event could be collected on.
+ * e.g., "0,1,2,3,4,5,6,7".
+ */
+ const char *counters_list;
const char *pmu;
const char *unit;
bool perpkg;
@@ -67,8 +72,18 @@ struct pmu_metric {
enum metric_event_groups event_grouping;
};
+struct pmu_layout {
+ const char *pmu;
+ const char *desc;
+ /** Total number of generic counters*/
+ int counters_num_gp;
+ /** Total number of fixed counters. Set to zero if no fixed counter on the unit.*/
+ int counters_num_fixed;
+};
+
struct pmu_events_table;
struct pmu_metrics_table;
+struct pmu_layouts_table;
#define PMU_EVENTS__NOT_FOUND -1000
@@ -80,6 +95,9 @@ typedef int (*pmu_metric_iter_fn)(const struct pmu_metric *pm,
const struct pmu_metrics_table *table,
void *data);
+typedef int (*pmu_layout_iter_fn)(const struct pmu_layout *pm,
+ void *data);
+
int pmu_events_table__for_each_event(const struct pmu_events_table *table,
struct perf_pmu *pmu,
pmu_event_iter_fn fn,
@@ -92,10 +110,13 @@ int pmu_events_table__for_each_event(const struct pmu_events_table *table,
* search of all tables.
*/
int pmu_events_table__find_event(const struct pmu_events_table *table,
- struct perf_pmu *pmu,
- const char *name,
- pmu_event_iter_fn fn,
- void *data);
+ struct perf_pmu *pmu,
+ const char *name,
+ pmu_event_iter_fn fn,
+ void *data);
+int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table,
+ pmu_layout_iter_fn fn,
+ void *data);
size_t pmu_events_table__num_events(const struct pmu_events_table *table,
struct perf_pmu *pmu);
@@ -104,10 +125,13 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table, pm
const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu *pmu);
const struct pmu_metrics_table *pmu_metrics_table__find(void);
+const struct pmu_layouts_table *perf_pmu__find_layouts_table(void);
const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid);
const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const char *cpuid);
+const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid);
int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data);
int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data);
+int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data);
const struct pmu_events_table *find_sys_events_table(const char *name);
const struct pmu_metrics_table *find_sys_metrics_table(const char *name);
--
2.43.0
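
Reviewer note (not part of the patch): the layout records added above are
stored in big_c_string as "pmu\0desc\0" followed by one character for the
generic counter count and one for the fixed counter count. The sketch below
is a minimal stand-alone illustration of that decoding, not the generated
code. The names layout_sketch, skip_field and decode_layout_sketch are
hypothetical. It assumes each count is a single decimal digit, and it skips
the whole desc field, which gives the same result as the single p++ in
decompress_layout() only when the description is empty.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of struct pmu_layout, for illustration only. */
struct layout_sketch {
	const char *pmu;
	const char *desc;
	int counters_num_gp;
	int counters_num_fixed;
};

/* Return the start of the field that follows the NUL-terminated field at p. */
static const char *skip_field(const char *p)
{
	return p + strlen(p) + 1;
}

/*
 * Decode one record laid out as "pmu\0desc\0GF", where G and F are
 * single ASCII digits (assumption: counts are in the range 0..9).
 */
static void decode_layout_sketch(const char *rec, struct layout_sketch *out)
{
	const char *p = rec;

	assert(rec != NULL && out != NULL);

	/* An empty string field is reported as NULL, as in the patch. */
	out->pmu = (*p == '\0' ? NULL : p);
	p = skip_field(p);
	out->desc = (*p == '\0' ? NULL : p);
	p = skip_field(p);

	/* The two counts are stored as characters, not as strings. */
	out->counters_num_gp = *p - '0';
	p++;
	out->counters_num_fixed = *p - '0';
}
```

With an empty description the record "default_core\0\00083" decodes to
pmu = "default_core", desc = NULL, 8 generic counters and 3 fixed counters.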
* [PATCH v5 02/21] RISC-V: Add Sxcsrind ISA extension CSR definitions
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
2025-03-27 19:35 ` [PATCH v5 01/21] perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 03/21] RISC-V: Add Sxcsrind ISA extension definition and parsing Atish Patra
` (18 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra,
Kaiwen Xue, Clément Léger
From: Kaiwen Xue <kaiwenx@rivosinc.com>
This adds definitions of the new CSRs and bits defined in the Sxcsrind ISA
extension. These CSRs enable an indirect access mechanism for accessing
any CSRs in M-, S-, and VS-mode. The range of the select values
and ireg will be defined by the ISA extensions that use the Sxcsrind extension.
Signed-off-by: Kaiwen Xue <kaiwenx@rivosinc.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
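Reviewer note (not part of the commit message): the new xIREG2..xIREG6
addresses are not contiguous. In each mode the select register sits at
base + 0, ireg..ireg3 at base + 1..3, and ireg4..ireg6 at base + 5..7, so
base + 4 is skipped. A minimal sketch of that mapping, using values copied
from the defines in this patch; the SKETCH_* names and sketch_ireg_addr()
are hypothetical and do not exist in the kernel:

```c
#include <assert.h>

/* Select register addresses, copied from the defines in csr.h. */
#define SKETCH_CSR_SISELECT	0x150
#define SKETCH_CSR_VSISELECT	0x250
#define SKETCH_CSR_MISELECT	0x350

/*
 * Hypothetical helper: return the CSR address of xiregN (N = 1..6, where
 * N = 1 is the plain xireg) for the given xiselect address.
 */
static unsigned int sketch_ireg_addr(unsigned int iselect, unsigned int n)
{
	assert(n >= 1 && n <= 6);

	/* ireg..ireg3 are at +1..+3; ireg4..ireg6 are at +5..+7. */
	return iselect + n + (n >= 4 ? 1 : 0);
}
```

For example, sketch_ireg_addr(SKETCH_CSR_SISELECT, 4) yields 0x155, which
matches CSR_SIREG4 below.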
arch/riscv/include/asm/csr.h | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 6fed42e37705..bce56a83c384 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -333,6 +333,12 @@
/* Supervisor-Level Window to Indirectly Accessed Registers (AIA) */
#define CSR_SISELECT 0x150
#define CSR_SIREG 0x151
+/* Supervisor-Level Window to Indirectly Accessed Registers (Sxcsrind) */
+#define CSR_SIREG2 0x152
+#define CSR_SIREG3 0x153
+#define CSR_SIREG4 0x155
+#define CSR_SIREG5 0x156
+#define CSR_SIREG6 0x157
/* Supervisor-Level Interrupts (AIA) */
#define CSR_STOPEI 0x15c
@@ -380,6 +386,14 @@
/* VS-Level Window to Indirectly Accessed Registers (H-extension with AIA) */
#define CSR_VSISELECT 0x250
#define CSR_VSIREG 0x251
+/*
+ * VS-Level Window to Indirectly Accessed Registers (H-extension with Sxcsrind)
+ */
+#define CSR_VSIREG2 0x252
+#define CSR_VSIREG3 0x253
+#define CSR_VSIREG4 0x255
+#define CSR_VSIREG5 0x256
+#define CSR_VSIREG6 0x257
/* VS-Level Interrupts (H-extension with AIA) */
#define CSR_VSTOPEI 0x25c
@@ -422,6 +436,12 @@
/* Machine-Level Window to Indirectly Accessed Registers (AIA) */
#define CSR_MISELECT 0x350
#define CSR_MIREG 0x351
+/* Machine-Level Window to Indirectly Accessed Registers (Sxcsrind) */
+#define CSR_MIREG2 0x352
+#define CSR_MIREG3 0x353
+#define CSR_MIREG4 0x355
+#define CSR_MIREG5 0x356
+#define CSR_MIREG6 0x357
/* Machine-Level Interrupts (AIA) */
#define CSR_MTOPEI 0x35c
@@ -467,6 +487,11 @@
# define CSR_IEH CSR_MIEH
# define CSR_ISELECT CSR_MISELECT
# define CSR_IREG CSR_MIREG
+# define CSR_IREG2 CSR_MIREG2
+# define CSR_IREG3 CSR_MIREG3
+# define CSR_IREG4 CSR_MIREG4
+# define CSR_IREG5 CSR_MIREG5
+# define CSR_IREG6 CSR_MIREG6
# define CSR_IPH CSR_MIPH
# define CSR_TOPEI CSR_MTOPEI
# define CSR_TOPI CSR_MTOPI
@@ -492,6 +517,11 @@
# define CSR_IEH CSR_SIEH
# define CSR_ISELECT CSR_SISELECT
# define CSR_IREG CSR_SIREG
+# define CSR_IREG2 CSR_SIREG2
+# define CSR_IREG3 CSR_SIREG3
+# define CSR_IREG4 CSR_SIREG4
+# define CSR_IREG5 CSR_SIREG5
+# define CSR_IREG6 CSR_SIREG6
# define CSR_IPH CSR_SIPH
# define CSR_TOPEI CSR_STOPEI
# define CSR_TOPI CSR_STOPI
--
2.43.0
* [PATCH v5 03/21] RISC-V: Add Sxcsrind ISA extension definition and parsing
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
2025-03-27 19:35 ` [PATCH v5 01/21] perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping Atish Patra
2025-03-27 19:35 ` [PATCH v5 02/21] RISC-V: Add Sxcsrind ISA extension CSR definitions Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 04/21] dt-bindings: riscv: add Sxcsrind ISA extension description Atish Patra
` (17 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
The S[m|s]csrind extension extends the indirect CSR access mechanism
defined in the Smaia/Ssaia extensions.
This patch only adds the extension definition and parsing.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/hwcap.h | 5 +++++
arch/riscv/kernel/cpufeature.c | 2 ++
2 files changed, 7 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 869da082252a..3d6e706fc5b2 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -100,6 +100,8 @@
#define RISCV_ISA_EXT_ZICCRSE 91
#define RISCV_ISA_EXT_SVADE 92
#define RISCV_ISA_EXT_SVADU 93
+#define RISCV_ISA_EXT_SSCSRIND 94
+#define RISCV_ISA_EXT_SMCSRIND 95
#define RISCV_ISA_EXT_XLINUXENVCFG 127
@@ -109,9 +111,12 @@
#ifdef CONFIG_RISCV_M_MODE
#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SMAIA
#define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SMNPM
+#define RISCV_ISA_EXT_SxCSRIND RISCV_ISA_EXT_SMCSRIND
#else
#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SSAIA
#define RISCV_ISA_EXT_SUPM RISCV_ISA_EXT_SSNPM
+#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SSAIA
+#define RISCV_ISA_EXT_SxCSRIND RISCV_ISA_EXT_SSCSRIND
#endif
#endif /* _ASM_RISCV_HWCAP_H */
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 40ac72e407b6..eddbab038301 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -394,11 +394,13 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_BUNDLE(zvksg, riscv_zvksg_bundled_exts),
__RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT),
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+ __RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
__RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM),
__RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts),
__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
+ __RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
__RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts),
__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
__RISCV_ISA_EXT_DATA(svade, RISCV_ISA_EXT_SVADE),
--
2.43.0
* [PATCH v5 04/21] dt-bindings: riscv: add Sxcsrind ISA extension description
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (2 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 03/21] RISC-V: Add Sxcsrind ISA extension definition and parsing Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 05/21] RISC-V: Define indirect CSR access helpers Atish Patra
` (16 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
Add the S[m|s]csrind ISA extension description.
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
Documentation/devicetree/bindings/riscv/extensions.yaml | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index a63b994e0763..0520a9d8b1cd 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -128,6 +128,14 @@ properties:
changes to interrupts as frozen at commit ccbddab ("Merge pull
request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
+ - const: smcsrind
+ description: |
+ The standard Smcsrind supervisor-level extension extends the
+ indirect CSR access mechanism defined by the Smaia extension. This
+ extension allows other ISA extensions to use the indirect CSR access
+ mechanism in M-mode as ratified in the 20240326 version of the
+ privileged ISA specification.
+
- const: smmpm
description: |
The standard Smmpm extension for M-mode pointer masking as
@@ -146,6 +154,14 @@ properties:
added by other RISC-V extensions in H/S/VS/U/VU modes and as
ratified at commit a28bfae (Ratified (#7)) of riscv-state-enable.
+ - const: sscsrind
+ description: |
+ The standard Sscsrind supervisor-level extension extends the
+ indirect CSR access mechanism defined by the Ssaia extension. This
+ extension allows other ISA extensions to use the indirect CSR access
+ mechanism in S-mode as ratified in the 20240326 version of the
+ privileged ISA specification.
+
- const: ssaia
description: |
The standard Ssaia supervisor-level extension for the advanced
--
2.43.0
* [PATCH v5 05/21] RISC-V: Define indirect CSR access helpers
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (3 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 04/21] dt-bindings: riscv: add Sxcsrind ISA extension description Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 06/21] RISC-V: Add Smcntrpmf extension parsing Atish Patra
` (15 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
Indirect CSR access requires multiple instructions to read or write a CSR.
Add a few helpers for ease of use.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/csr_ind.h | 42 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
diff --git a/arch/riscv/include/asm/csr_ind.h b/arch/riscv/include/asm/csr_ind.h
new file mode 100644
index 000000000000..d36e1e06ed2b
--- /dev/null
+++ b/arch/riscv/include/asm/csr_ind.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2024 Rivos Inc.
+ */
+
+#ifndef _ASM_RISCV_CSR_IND_H
+#define _ASM_RISCV_CSR_IND_H
+
+#include <asm/csr.h>
+
+#define csr_ind_read(iregcsr, iselbase, iseloff) ({ \
+ unsigned long value = 0; \
+ unsigned long flags; \
+ local_irq_save(flags); \
+ csr_write(CSR_ISELECT, iselbase + iseloff); \
+ value = csr_read(iregcsr); \
+ local_irq_restore(flags); \
+ value; \
+})
+
+#define csr_ind_write(iregcsr, iselbase, iseloff, value) ({ \
+ unsigned long flags; \
+ local_irq_save(flags); \
+ csr_write(CSR_ISELECT, iselbase + iseloff); \
+ csr_write(iregcsr, value); \
+ local_irq_restore(flags); \
+})
+
+#define csr_ind_warl(iregcsr, iselbase, iseloff, warl_val) ({ \
+ unsigned long old_val = 0, value = 0; \
+ unsigned long flags; \
+ local_irq_save(flags); \
+ csr_write(CSR_ISELECT, iselbase + iseloff); \
+ old_val = csr_read(iregcsr); \
+ csr_write(iregcsr, warl_val); \
+ value = csr_read(iregcsr); \
+ csr_write(iregcsr, old_val); \
+ local_irq_restore(flags); \
+ value; \
+})
+
+#endif
--
2.43.0
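For readers following along, the select-then-access sequence used by the helpers in this patch can be modelled in plain user-space C. This is only a sketch: the `sim_*` names, the eight-entry register bank and the `writable_mask` field are hypothetical and exist purely for illustration. The real macros additionally wrap the sequence in local_irq_save()/local_irq_restore(), presumably so that nothing can reprogram the select register between the two steps; the model below leaves that out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one select register plus a small bank of
 * registers that can only be reached through an alias register. */
#define SIM_NUM_IND_REGS 8

struct sim_ind_csr {
	unsigned long select;                 /* stands in for CSR_ISELECT */
	unsigned long bank[SIM_NUM_IND_REGS]; /* state behind the alias */
	unsigned long writable_mask;          /* WARL model: only these bits stick */
};

/* Models a write to the alias register (for example CSR_IREG). */
static void sim_alias_write(struct sim_ind_csr *c, unsigned long val)
{
	assert(c->select < SIM_NUM_IND_REGS);
	c->bank[c->select] = val & c->writable_mask;
}

/* Models a read of the alias register. */
static unsigned long sim_alias_read(const struct sim_ind_csr *c)
{
	assert(c->select < SIM_NUM_IND_REGS);
	return c->bank[c->select];
}

/* Same two-step shape as csr_ind_read(): program the select, then read. */
unsigned long sim_ind_read(struct sim_ind_csr *c, unsigned long base,
			   unsigned long off)
{
	c->select = base + off;
	return sim_alias_read(c);
}

/* Same two-step shape as csr_ind_write(): program the select, then write. */
void sim_ind_write(struct sim_ind_csr *c, unsigned long base,
		   unsigned long off, unsigned long val)
{
	c->select = base + off;
	sim_alias_write(c, val);
}

/* Same shape as csr_ind_warl(): write a probe value, read back what the
 * register actually kept, then restore the previous contents. */
unsigned long sim_ind_warl(struct sim_ind_csr *c, unsigned long base,
			   unsigned long off, unsigned long probe)
{
	unsigned long old, seen;

	c->select = base + off;
	old = sim_alias_read(c);
	sim_alias_write(c, probe);
	seen = sim_alias_read(c);
	sim_alias_write(c, old);
	return seen;
}
```

In this model, probing with all ones through sim_ind_warl() returns the writable mask of the selected register and leaves its previous value in place, which is the property the WARL helper relies on.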
* [PATCH v5 06/21] RISC-V: Add Smcntrpmf extension parsing
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (4 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 05/21] RISC-V: Define indirect CSR access helpers Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 07/21] dt-bindings: riscv: add Smcntrpmf ISA extension description Atish Patra
` (14 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra,
Clément Léger
The Smcntrpmf extension allows M-mode to enable privilege mode filtering
for the cycle/instret counters. However, the cyclecfg/instretcfg CSRs are
available in Ssccfg only if Smcntrpmf is present.
That is why the kernel needs to detect the presence of the Smcntrpmf
extension and enable privilege mode filtering for the cycle/instret counters.
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/kernel/cpufeature.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 3d6e706fc5b2..b4eddcb57842 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -102,6 +102,7 @@
#define RISCV_ISA_EXT_SVADU 93
#define RISCV_ISA_EXT_SSCSRIND 94
#define RISCV_ISA_EXT_SMCSRIND 95
+#define RISCV_ISA_EXT_SMCNTRPMF 96
#define RISCV_ISA_EXT_XLINUXENVCFG 127
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index eddbab038301..e3e40cfe7967 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -394,6 +394,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_BUNDLE(zvksg, riscv_zvksg_bundled_exts),
__RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT),
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+ __RISCV_ISA_EXT_DATA(smcntrpmf, RISCV_ISA_EXT_SMCNTRPMF),
__RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
__RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM),
__RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts),
--
2.43.0
* [PATCH v5 07/21] dt-bindings: riscv: add Smcntrpmf ISA extension description
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (5 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 06/21] RISC-V: Add Smcntrpmf extension parsing Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 08/21] RISC-V: Add Sscfg extension CSR definition Atish Patra
` (13 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
Add the description for the Smcntrpmf ISA extension.
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
Documentation/devicetree/bindings/riscv/extensions.yaml | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index 0520a9d8b1cd..c2025949295f 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -136,6 +136,12 @@ properties:
mechanism in M-mode as ratified in the 20240326 version of the
privileged ISA specification.
+ - const: smcntrpmf
+ description: |
+ The standard Smcntrpmf supervisor-level extension for the machine mode
+ to enable privilege mode filtering for cycle and instret counters as
+ ratified in the 20240326 version of the privileged ISA specification.
+
- const: smmpm
description: |
The standard Smmpm extension for M-mode pointer masking as
--
2.43.0
* [PATCH v5 08/21] RISC-V: Add Sscfg extension CSR definition
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (6 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 07/21] dt-bindings: riscv: add Smcntrpmf ISA extension description Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 09/21] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing Atish Patra
` (12 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra,
Kaiwen Xue, Clément Léger
From: Kaiwen Xue <kaiwenx@rivosinc.com>
This adds the scountinhibit CSR definition and the S-mode accessible
hpmevent bits defined by Smcdeleg/Ssccfg. scountinhibit allows S-mode to
start/stop counters directly, without invoking SBI calls to M-mode. It is
also used to figure out which counters M-mode has delegated to S-mode.
Signed-off-by: Kaiwen Xue <kaiwenx@rivosinc.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
---
arch/riscv/include/asm/csr.h | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index bce56a83c384..3d2d4f886c77 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -230,6 +230,31 @@
#define SMSTATEEN0_HSENVCFG (_ULL(1) << SMSTATEEN0_HSENVCFG_SHIFT)
#define SMSTATEEN0_SSTATEEN0_SHIFT 63
#define SMSTATEEN0_SSTATEEN0 (_ULL(1) << SMSTATEEN0_SSTATEEN0_SHIFT)
+/* HPMEVENT bits. These are accessible in S-mode via Smcdeleg/Ssccfg */
+#ifdef CONFIG_64BIT
+#define HPMEVENT_OF (BIT_ULL(63))
+#define HPMEVENT_MINH (BIT_ULL(62))
+#define HPMEVENT_SINH (BIT_ULL(61))
+#define HPMEVENT_UINH (BIT_ULL(60))
+#define HPMEVENT_VSINH (BIT_ULL(59))
+#define HPMEVENT_VUINH (BIT_ULL(58))
+#else
+#define HPMEVENTH_OF (BIT_ULL(31))
+#define HPMEVENTH_MINH (BIT_ULL(30))
+#define HPMEVENTH_SINH (BIT_ULL(29))
+#define HPMEVENTH_UINH (BIT_ULL(28))
+#define HPMEVENTH_VSINH (BIT_ULL(27))
+#define HPMEVENTH_VUINH (BIT_ULL(26))
+
+#define HPMEVENT_OF (HPMEVENTH_OF << 32)
+#define HPMEVENT_MINH (HPMEVENTH_MINH << 32)
+#define HPMEVENT_SINH (HPMEVENTH_SINH << 32)
+#define HPMEVENT_UINH (HPMEVENTH_UINH << 32)
+#define HPMEVENT_VSINH (HPMEVENTH_VSINH << 32)
+#define HPMEVENT_VUINH (HPMEVENTH_VUINH << 32)
+#endif
+
+#define SISELECT_SSCCFG_BASE 0x40
/* mseccfg bits */
#define MSECCFG_PMM ENVCFG_PMM
@@ -311,6 +336,7 @@
#define CSR_SCOUNTEREN 0x106
#define CSR_SENVCFG 0x10a
#define CSR_SSTATEEN0 0x10c
+#define CSR_SCOUNTINHIBIT 0x120
#define CSR_SSCRATCH 0x140
#define CSR_SEPC 0x141
#define CSR_SCAUSE 0x142
--
2.43.0
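As a reading aid for the 32-bit branch of the hunk above: the HPMEVENTH_* values are bit positions within the high half (the hpmeventNh register), and shifting them left by 32 gives the same positions as the 64-bit HPMEVENT_* definitions. The sketch below checks that relationship in user-space C. Every `SIM_`/`sim_` name is hypothetical, as is the "count in U-mode only" helper; it only illustrates that setting an xINH bit inhibits counting in that mode.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_BIT_ULL(n)		(1ULL << (n))

/* Bit positions inside the high 32-bit half, as in the 32-bit branch. */
#define SIM_HPMEVENTH_OF	SIM_BIT_ULL(31)
#define SIM_HPMEVENTH_MINH	SIM_BIT_ULL(30)
#define SIM_HPMEVENTH_SINH	SIM_BIT_ULL(29)
#define SIM_HPMEVENTH_UINH	SIM_BIT_ULL(28)
#define SIM_HPMEVENTH_VSINH	SIM_BIT_ULL(27)
#define SIM_HPMEVENTH_VUINH	SIM_BIT_ULL(26)

/* The same bits expressed as positions in the full 64-bit value. */
#define SIM_HPMEVENT_OF		(SIM_HPMEVENTH_OF << 32)
#define SIM_HPMEVENT_MINH	(SIM_HPMEVENTH_MINH << 32)
#define SIM_HPMEVENT_SINH	(SIM_HPMEVENTH_SINH << 32)
#define SIM_HPMEVENT_UINH	(SIM_HPMEVENTH_UINH << 32)
#define SIM_HPMEVENT_VSINH	(SIM_HPMEVENTH_VSINH << 32)
#define SIM_HPMEVENT_VUINH	(SIM_HPMEVENTH_VUINH << 32)

/* Both spellings must land on the bits used by the 64-bit branch. */
static_assert(SIM_HPMEVENT_OF == SIM_BIT_ULL(63), "OF must be bit 63");
static_assert(SIM_HPMEVENT_VUINH == SIM_BIT_ULL(58), "VUINH must be bit 58");

struct sim_hpmevent_halves {
	uint32_t lo;	/* value for the low-half register */
	uint32_t hi;	/* value for the high-half register */
};

/* Split a 64-bit event value into the two halves a 32-bit system writes. */
struct sim_hpmevent_halves sim_split_hpmevent(uint64_t val)
{
	struct sim_hpmevent_halves h = {
		.lo = (uint32_t)val,
		.hi = (uint32_t)(val >> 32),
	};
	return h;
}

/* Recombine the two halves into the 64-bit view. */
uint64_t sim_join_hpmevent(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: count only in U-mode by inhibiting all other modes. */
uint64_t sim_user_only_filter(uint64_t event)
{
	return event | SIM_HPMEVENT_MINH | SIM_HPMEVENT_SINH |
	       SIM_HPMEVENT_VSINH | SIM_HPMEVENT_VUINH;
}
```

Splitting the result of sim_user_only_filter() leaves the event selector in the low half and the inhibit bits in the high half, which is why the 32-bit definitions are given relative to bit 0 of the high-half register.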
* [PATCH v5 09/21] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (7 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 08/21] RISC-V: Add Sscfg extension CSR definition Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 10/21] dt-bindings: riscv: add Counter delegation ISA extensions description Atish Patra
` (11 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
The Smcdeleg extension allows M-mode to delegate selected counters
to S-mode so that it can access those counters and the corresponding
hpmevent CSRs without M-mode.
Ssccfg (‘Ss’ for Privileged architecture and Supervisor-level
extension, ‘ccfg’ for Counter Configuration) provides access to
delegated counters and new supervisor-level state.
This patch just adds these definitions and enables parsing.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/hwcap.h | 2 ++
arch/riscv/kernel/cpufeature.c | 24 ++++++++++++++++++++++++
2 files changed, 26 insertions(+)
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index b4eddcb57842..fa5e01bcb990 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -103,6 +103,8 @@
#define RISCV_ISA_EXT_SSCSRIND 94
#define RISCV_ISA_EXT_SMCSRIND 95
#define RISCV_ISA_EXT_SMCNTRPMF 96
+#define RISCV_ISA_EXT_SSCCFG 97
+#define RISCV_ISA_EXT_SMCDELEG 98
#define RISCV_ISA_EXT_XLINUXENVCFG 127
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index e3e40cfe7967..f72552adb257 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -150,6 +150,27 @@ static int riscv_ext_svadu_validate(const struct riscv_isa_ext_data *data,
return 0;
}
+static int riscv_ext_smcdeleg_validate(const struct riscv_isa_ext_data *data,
+ const unsigned long *isa_bitmap)
+{
+ if (__riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_SSCSRIND) &&
+ __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZIHPM) &&
+ __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_ZICNTR))
+ return 0;
+
+ return -EPROBE_DEFER;
+}
+
+static int riscv_ext_ssccfg_validate(const struct riscv_isa_ext_data *data,
+ const unsigned long *isa_bitmap)
+{
+ if (!riscv_ext_smcdeleg_validate(data, isa_bitmap) &&
+ __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_SMCDELEG))
+ return 0;
+
+ return -EPROBE_DEFER;
+}
+
static const unsigned int riscv_zk_bundled_exts[] = {
RISCV_ISA_EXT_ZBKB,
RISCV_ISA_EXT_ZBKC,
@@ -394,12 +415,15 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_BUNDLE(zvksg, riscv_zvksg_bundled_exts),
__RISCV_ISA_EXT_DATA(zvkt, RISCV_ISA_EXT_ZVKT),
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
+ __RISCV_ISA_EXT_DATA_VALIDATE(smcdeleg, RISCV_ISA_EXT_SMCDELEG,
+ riscv_ext_smcdeleg_validate),
__RISCV_ISA_EXT_DATA(smcntrpmf, RISCV_ISA_EXT_SMCNTRPMF),
__RISCV_ISA_EXT_DATA(smcsrind, RISCV_ISA_EXT_SMCSRIND),
__RISCV_ISA_EXT_DATA(smmpm, RISCV_ISA_EXT_SMMPM),
__RISCV_ISA_EXT_SUPERSET(smnpm, RISCV_ISA_EXT_SMNPM, riscv_xlinuxenvcfg_exts),
__RISCV_ISA_EXT_DATA(smstateen, RISCV_ISA_EXT_SMSTATEEN),
__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
+ __RISCV_ISA_EXT_DATA_VALIDATE(ssccfg, RISCV_ISA_EXT_SSCCFG, riscv_ext_ssccfg_validate),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
__RISCV_ISA_EXT_DATA(sscsrind, RISCV_ISA_EXT_SSCSRIND),
__RISCV_ISA_EXT_SUPERSET(ssnpm, RISCV_ISA_EXT_SSNPM, riscv_xlinuxenvcfg_exts),
--
2.43.0
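The dependency chain enforced by the two validate callbacks in this patch can be summarised with a small user-space model. This is a sketch under simplifying assumptions: the `sim_*` names and the 32-bit bitmap are hypothetical, and the functions return a bool instead of 0 or -EPROBE_DEFER as the kernel callbacks do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical extension IDs, used as bit indices in a small bitmap. */
enum sim_ext {
	SIM_EXT_ZICNTR,
	SIM_EXT_ZIHPM,
	SIM_EXT_SSCSRIND,
	SIM_EXT_SMCDELEG,
	SIM_EXT_SSCCFG,
	SIM_EXT_MAX,
};

/* Returns true if extension e is set in the bitmap. */
static bool sim_has(uint32_t isa, enum sim_ext e)
{
	assert(e < SIM_EXT_MAX);
	return (isa & (1u << e)) != 0;
}

/* Mirrors riscv_ext_smcdeleg_validate(): Smcdeleg is only accepted when
 * Sscsrind, Zihpm and Zicntr are all present. */
bool sim_smcdeleg_ok(uint32_t isa)
{
	return sim_has(isa, SIM_EXT_SSCSRIND) &&
	       sim_has(isa, SIM_EXT_ZIHPM) &&
	       sim_has(isa, SIM_EXT_ZICNTR);
}

/* Mirrors riscv_ext_ssccfg_validate(): Ssccfg needs everything Smcdeleg
 * needs, plus Smcdeleg itself. */
bool sim_ssccfg_ok(uint32_t isa)
{
	return sim_smcdeleg_ok(isa) && sim_has(isa, SIM_EXT_SMCDELEG);
}
```

One consequence of this layering is that a bitmap with Sscsrind, Zihpm and Zicntr but without Smcdeleg passes the first check and fails the second.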
* [PATCH v5 10/21] dt-bindings: riscv: add Counter delegation ISA extensions description
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (8 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 09/21] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-31 15:38 ` Conor Dooley
2025-03-27 19:35 ` [PATCH v5 11/21] RISC-V: perf: Restructure the SBI PMU code Atish Patra
` (10 subsequent siblings)
20 siblings, 1 reply; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
Add description for the Smcdeleg/Ssccfg extension.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
.../devicetree/bindings/riscv/extensions.yaml | 45 ++++++++++++++++++++++
1 file changed, 45 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index c2025949295f..f34bc66940c0 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -128,6 +128,13 @@ properties:
changes to interrupts as frozen at commit ccbddab ("Merge pull
request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
+ - const: smcdeleg
+ description: |
+ The standard Smcdeleg supervisor-level extension for the machine mode
+ to delegate the hpmcounters to supervisor mode so that they are
+ directly accessible in the supervisor mode as ratified in the
+ 20240213 version of the privileged ISA specification.
+
- const: smcsrind
description: |
The standard Smcsrind supervisor-level extension extends the
@@ -175,6 +182,14 @@ properties:
behavioural changes to interrupts as frozen at commit ccbddab
("Merge pull request #42 from riscv/jhauser-2023-RC4") of riscv-aia.
+ - const: ssccfg
+ description: |
+ The standard Ssccfg supervisor-level extension for configuring
+ the delegated hpmcounters to be accessible directly in supervisor
+ mode as ratified in the 20240213 version of the privileged ISA
+ specification. This extension depends on Sscsrind, Smcdeleg, Zihpm,
+ Zicntr extensions.
+
- const: sscofpmf
description: |
The standard Sscofpmf supervisor-level extension for count overflow
@@ -695,6 +710,36 @@ properties:
then:
contains:
const: zca
+ # Smcdeleg depends on Sscsrind, Zihpm, Zicntr
+ - if:
+ contains:
+ const: smcdeleg
+ then:
+ allOf:
+ - contains:
+ const: sscsrind
+ - contains:
+ const: zihpm
+ - contains:
+ const: zicntr
+ # Ssccfg depends on Smcdeleg, Sscsrind, Zihpm, Zicntr, Sscofpmf, Smcntrpmf
+ - if:
+ contains:
+ const: ssccfg
+ then:
+ allOf:
+ - contains:
+ const: smcdeleg
+ - contains:
+ const: sscsrind
+ - contains:
+ const: sscofpmf
+ - contains:
+ const: smcntrpmf
+ - contains:
+ const: zihpm
+ - contains:
+ const: zicntr
allOf:
# Zcf extension does not exist on rv64
--
2.43.0
* [PATCH v5 11/21] RISC-V: perf: Restructure the SBI PMU code
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (9 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 10/21] dt-bindings: riscv: add Counter delegation ISA extensions description Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-04-04 13:49 ` Will Deacon
2025-03-27 19:35 ` [PATCH v5 12/21] RISC-V: perf: Modify the counter discovery mechanism Atish Patra
` (9 subsequent siblings)
20 siblings, 1 reply; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra,
Clément Léger
With Ssccfg/Smcdeleg, we no longer need the SBI PMU extension to program or
access the hpmcounters/hpmevents. However, we still need it for firmware
counters.
Rename the driver and its related code to a generic name that covers both
the SBI and the ISA mechanisms for hpmcounter-related operations. Take this
opportunity to update the Kconfig names to closely match the new driver name.
No functional change intended.
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
MAINTAINERS | 4 +-
arch/riscv/include/asm/kvm_vcpu_pmu.h | 4 +-
arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 +-
arch/riscv/kvm/Makefile | 4 +-
arch/riscv/kvm/vcpu_sbi.c | 2 +-
drivers/perf/Kconfig | 16 +-
drivers/perf/Makefile | 4 +-
drivers/perf/{riscv_pmu.c => riscv_pmu_common.c} | 0
drivers/perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} | 214 +++++++++++++---------
include/linux/perf/riscv_pmu.h | 8 +-
10 files changed, 151 insertions(+), 107 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index ed7aa6867674..b6d174f7735e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -20406,9 +20406,9 @@ M: Atish Patra <atishp@atishpatra.org>
R: Anup Patel <anup@brainfault.org>
L: linux-riscv@lists.infradead.org
S: Supported
-F: drivers/perf/riscv_pmu.c
+F: drivers/perf/riscv_pmu_common.c
+F: drivers/perf/riscv_pmu_dev.c
F: drivers/perf/riscv_pmu_legacy.c
-F: drivers/perf/riscv_pmu_sbi.c
RISC-V SPACEMIT SoC Support
M: Yixun Lan <dlan@gentoo.org>
diff --git a/arch/riscv/include/asm/kvm_vcpu_pmu.h b/arch/riscv/include/asm/kvm_vcpu_pmu.h
index 1d85b6617508..aa75f52e9092 100644
--- a/arch/riscv/include/asm/kvm_vcpu_pmu.h
+++ b/arch/riscv/include/asm/kvm_vcpu_pmu.h
@@ -13,7 +13,7 @@
#include <asm/kvm_vcpu_insn.h>
#include <asm/sbi.h>
-#ifdef CONFIG_RISCV_PMU_SBI
+#ifdef CONFIG_RISCV_PMU
#define RISCV_KVM_MAX_FW_CTRS 32
#define RISCV_KVM_MAX_HW_CTRS 32
#define RISCV_KVM_MAX_COUNTERS (RISCV_KVM_MAX_HW_CTRS + RISCV_KVM_MAX_FW_CTRS)
@@ -128,5 +128,5 @@ static inline int kvm_riscv_vcpu_pmu_incr_fw(struct kvm_vcpu *vcpu, unsigned lon
static inline void kvm_riscv_vcpu_pmu_deinit(struct kvm_vcpu *vcpu) {}
static inline void kvm_riscv_vcpu_pmu_reset(struct kvm_vcpu *vcpu) {}
-#endif /* CONFIG_RISCV_PMU_SBI */
+#endif /* CONFIG_RISCV_PMU */
#endif /* !__KVM_VCPU_RISCV_PMU_H */
diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
index 4ed6203cdd30..745690d9e8b4 100644
--- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
+++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
@@ -90,7 +90,7 @@ extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_sta;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental;
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor;
-#ifdef CONFIG_RISCV_PMU_SBI
+#ifdef CONFIG_RISCV_PMU
extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu;
#endif
#endif /* __RISCV_KVM_VCPU_SBI_H__ */
diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
index 4e0bba91d284..1ffe8c3400c0 100644
--- a/arch/riscv/kvm/Makefile
+++ b/arch/riscv/kvm/Makefile
@@ -23,11 +23,11 @@ kvm-y += vcpu_exit.o
kvm-y += vcpu_fp.o
kvm-y += vcpu_insn.o
kvm-y += vcpu_onereg.o
-kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o
+kvm-$(CONFIG_RISCV_PMU) += vcpu_pmu.o
kvm-y += vcpu_sbi.o
kvm-y += vcpu_sbi_base.o
kvm-y += vcpu_sbi_hsm.o
-kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_sbi_pmu.o
+kvm-$(CONFIG_RISCV_PMU) += vcpu_sbi_pmu.o
kvm-y += vcpu_sbi_replace.o
kvm-y += vcpu_sbi_sta.o
kvm-y += vcpu_sbi_system.o
diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
index d1c83a77735e..7bb4517921d9 100644
--- a/arch/riscv/kvm/vcpu_sbi.c
+++ b/arch/riscv/kvm/vcpu_sbi.c
@@ -20,7 +20,7 @@ static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_v01 = {
};
#endif
-#ifndef CONFIG_RISCV_PMU_SBI
+#ifndef CONFIG_RISCV_PMU
static const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu = {
.extid_start = -1UL,
.extid_end = -1UL,
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 4e268de351c4..b3bdff2a99a4 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -75,7 +75,7 @@ config ARM_XSCALE_PMU
depends on ARM_PMU && CPU_XSCALE
def_bool y
-config RISCV_PMU
+config RISCV_PMU_COMMON
depends on RISCV
bool "RISC-V PMU framework"
default y
@@ -86,7 +86,7 @@ config RISCV_PMU
can reuse it.
config RISCV_PMU_LEGACY
- depends on RISCV_PMU
+ depends on RISCV_PMU_COMMON
bool "RISC-V legacy PMU implementation"
default y
help
@@ -95,15 +95,15 @@ config RISCV_PMU_LEGACY
of cycle/instruction counter and doesn't support counter overflow,
or programmable counters. It will be removed in future.
-config RISCV_PMU_SBI
- depends on RISCV_PMU && RISCV_SBI
- bool "RISC-V PMU based on SBI PMU extension"
+config RISCV_PMU
+ depends on RISCV_PMU_COMMON && RISCV_SBI
+ bool "RISC-V PMU based on SBI PMU extension and/or Counter delegation extension"
default y
help
Say y if you want to use the CPU performance monitor
- using SBI PMU extension on RISC-V based systems. This option provides
- full perf feature support i.e. counter overflow, privilege mode
- filtering, counter configuration.
+ using SBI PMU extension or counter delegation ISA extension on RISC-V
+ based systems. This option provides full perf feature support i.e.
+ counter overflow, privilege mode filtering, counter configuration.
config STARFIVE_STARLINK_PMU
depends on ARCH_STARFIVE || COMPILE_TEST
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index de71d2574857..0805d740c773 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -16,9 +16,9 @@ obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o
obj-$(CONFIG_HISI_PMU) += hisilicon/
obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o
obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
-obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
+obj-$(CONFIG_RISCV_PMU_COMMON) += riscv_pmu_common.o
obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o
-obj-$(CONFIG_RISCV_PMU_SBI) += riscv_pmu_sbi.o
+obj-$(CONFIG_RISCV_PMU) += riscv_pmu_dev.o
obj-$(CONFIG_STARFIVE_STARLINK_PMU) += starfive_starlink_pmu.o
obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
diff --git a/drivers/perf/riscv_pmu.c b/drivers/perf/riscv_pmu_common.c
similarity index 100%
rename from drivers/perf/riscv_pmu.c
rename to drivers/perf/riscv_pmu_common.c
diff --git a/drivers/perf/riscv_pmu_sbi.c b/drivers/perf/riscv_pmu_dev.c
similarity index 87%
rename from drivers/perf/riscv_pmu_sbi.c
rename to drivers/perf/riscv_pmu_dev.c
index 698de8ddf895..6cebbc16bfe4 100644
--- a/drivers/perf/riscv_pmu_sbi.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -8,7 +8,7 @@
* sparc64 and x86 code.
*/
-#define pr_fmt(fmt) "riscv-pmu-sbi: " fmt
+#define pr_fmt(fmt) "riscv-pmu-dev: " fmt
#include <linux/mod_devicetable.h>
#include <linux/perf/riscv_pmu.h>
@@ -87,6 +87,8 @@ static const struct attribute_group *riscv_pmu_attr_groups[] = {
static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS;
/*
+ * This structure is SBI specific, but counter delegation also requires the
+ * counter width and CSR mapping. Reuse it for now.
* RISC-V doesn't have heterogeneous harts yet. This need to be part of
* per_cpu in case of harts with different pmu counters
*/
@@ -119,7 +121,7 @@ struct sbi_pmu_event_data {
};
};
-static struct sbi_pmu_event_data pmu_hw_event_map[] = {
+static struct sbi_pmu_event_data pmu_hw_event_sbi_map[] = {
[PERF_COUNT_HW_CPU_CYCLES] = {.hw_gen_event = {
SBI_PMU_HW_CPU_CYCLES,
SBI_PMU_EVENT_TYPE_HW, 0}},
@@ -153,7 +155,7 @@ static struct sbi_pmu_event_data pmu_hw_event_map[] = {
};
#define C(x) PERF_COUNT_HW_CACHE_##x
-static struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_MAX]
+static struct sbi_pmu_event_data pmu_cache_event_sbi_map[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
[C(L1D)] = {
@@ -298,7 +300,7 @@ static struct sbi_pmu_event_data pmu_cache_event_map[PERF_COUNT_HW_CACHE_MAX]
},
};
-static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata)
+static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata)
{
struct sbiret ret;
@@ -313,25 +315,25 @@ static void pmu_sbi_check_event(struct sbi_pmu_event_data *edata)
}
}
-static void pmu_sbi_check_std_events(struct work_struct *work)
+static void rvpmu_sbi_check_std_events(struct work_struct *work)
{
- for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_map); i++)
- pmu_sbi_check_event(&pmu_hw_event_map[i]);
+ for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++)
+ rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]);
- for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_map); i++)
- for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_map[i]); j++)
- for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_map[i][j]); k++)
- pmu_sbi_check_event(&pmu_cache_event_map[i][j][k]);
+ for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++)
+ for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++)
+ for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++)
+ rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]);
}
-static DECLARE_WORK(check_std_events_work, pmu_sbi_check_std_events);
+static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events);
-static int pmu_sbi_ctr_get_width(int idx)
+static int rvpmu_ctr_get_width(int idx)
{
return pmu_ctr_list[idx].width;
}
-static bool pmu_sbi_ctr_is_fw(int cidx)
+static bool rvpmu_ctr_is_fw(int cidx)
{
union sbi_pmu_ctr_info *info;
@@ -373,12 +375,12 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
}
EXPORT_SYMBOL_GPL(riscv_pmu_get_hpm_info);
-static uint8_t pmu_sbi_csr_index(struct perf_event *event)
+static uint8_t rvpmu_csr_index(struct perf_event *event)
{
return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE;
}
-static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
+static unsigned long rvpmu_sbi_get_filter_flags(struct perf_event *event)
{
unsigned long cflags = 0;
bool guest_events = false;
@@ -399,7 +401,7 @@ static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
return cflags;
}
-static int pmu_sbi_ctr_get_idx(struct perf_event *event)
+static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
@@ -409,7 +411,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
uint64_t cbase = 0, cmask = rvpmu->cmask;
unsigned long cflags = 0;
- cflags = pmu_sbi_get_filter_flags(event);
+ cflags = rvpmu_sbi_get_filter_flags(event);
/*
* In legacy mode, we have to force the fixed counters for those events
@@ -446,7 +448,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
return -ENOENT;
/* Additional sanity check for the counter id */
- if (pmu_sbi_ctr_is_fw(idx)) {
+ if (rvpmu_ctr_is_fw(idx)) {
if (!test_and_set_bit(idx, cpuc->used_fw_ctrs))
return idx;
} else {
@@ -457,7 +459,7 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
return -ENOENT;
}
-static void pmu_sbi_ctr_clear_idx(struct perf_event *event)
+static void rvpmu_ctr_clear_idx(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
@@ -465,13 +467,13 @@ static void pmu_sbi_ctr_clear_idx(struct perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
int idx = hwc->idx;
- if (pmu_sbi_ctr_is_fw(idx))
+ if (rvpmu_ctr_is_fw(idx))
clear_bit(idx, cpuc->used_fw_ctrs);
else
clear_bit(idx, cpuc->used_hw_ctrs);
}
-static int pmu_event_find_cache(u64 config)
+static int sbi_pmu_event_find_cache(u64 config)
{
unsigned int cache_type, cache_op, cache_result, ret;
@@ -487,7 +489,7 @@ static int pmu_event_find_cache(u64 config)
if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
return -EINVAL;
- ret = pmu_cache_event_map[cache_type][cache_op][cache_result].event_idx;
+ ret = pmu_cache_event_sbi_map[cache_type][cache_op][cache_result].event_idx;
return ret;
}
@@ -503,7 +505,7 @@ static bool pmu_sbi_is_fw_event(struct perf_event *event)
return false;
}
-static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig)
+static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
{
u32 type = event->attr.type;
u64 config = event->attr.config;
@@ -519,10 +521,10 @@ static int pmu_sbi_event_map(struct perf_event *event, u64 *econfig)
case PERF_TYPE_HARDWARE:
if (config >= PERF_COUNT_HW_MAX)
return -EINVAL;
- ret = pmu_hw_event_map[event->attr.config].event_idx;
+ ret = pmu_hw_event_sbi_map[event->attr.config].event_idx;
break;
case PERF_TYPE_HW_CACHE:
- ret = pmu_event_find_cache(config);
+ ret = sbi_pmu_event_find_cache(config);
break;
case PERF_TYPE_RAW:
/*
@@ -648,7 +650,7 @@ static int pmu_sbi_snapshot_setup(struct riscv_pmu *pmu, int cpu)
return 0;
}
-static u64 pmu_sbi_ctr_read(struct perf_event *event)
+static u64 rvpmu_sbi_ctr_read(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
@@ -690,25 +692,25 @@ static u64 pmu_sbi_ctr_read(struct perf_event *event)
return val;
}
-static void pmu_sbi_set_scounteren(void *arg)
+static void rvpmu_set_scounteren(void *arg)
{
struct perf_event *event = (struct perf_event *)arg;
if (event->hw.idx != -1)
csr_write(CSR_SCOUNTEREN,
- csr_read(CSR_SCOUNTEREN) | BIT(pmu_sbi_csr_index(event)));
+ csr_read(CSR_SCOUNTEREN) | BIT(rvpmu_csr_index(event)));
}
-static void pmu_sbi_reset_scounteren(void *arg)
+static void rvpmu_reset_scounteren(void *arg)
{
struct perf_event *event = (struct perf_event *)arg;
if (event->hw.idx != -1)
csr_write(CSR_SCOUNTEREN,
- csr_read(CSR_SCOUNTEREN) & ~BIT(pmu_sbi_csr_index(event)));
+ csr_read(CSR_SCOUNTEREN) & ~BIT(rvpmu_csr_index(event)));
}
-static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)
+static void rvpmu_sbi_ctr_start(struct perf_event *event, u64 ival)
{
struct sbiret ret;
struct hw_perf_event *hwc = &event->hw;
@@ -728,10 +730,10 @@ static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)
if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
(hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- pmu_sbi_set_scounteren((void *)event);
+ rvpmu_set_scounteren((void *)event);
}
-static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
+static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
{
struct sbiret ret;
struct hw_perf_event *hwc = &event->hw;
@@ -741,7 +743,7 @@ static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
(hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- pmu_sbi_reset_scounteren((void *)event);
+ rvpmu_reset_scounteren((void *)event);
if (sbi_pmu_snapshot_available())
flag |= SBI_PMU_STOP_FLAG_TAKE_SNAPSHOT;
@@ -767,7 +769,7 @@ static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
}
}
-static int pmu_sbi_find_num_ctrs(void)
+static int rvpmu_sbi_find_num_ctrs(void)
{
struct sbiret ret;
@@ -778,7 +780,7 @@ static int pmu_sbi_find_num_ctrs(void)
return sbi_err_map_linux_errno(ret.error);
}
-static int pmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
+static int rvpmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
{
struct sbiret ret;
int i, num_hw_ctr = 0, num_fw_ctr = 0;
@@ -809,7 +811,7 @@ static int pmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
return 0;
}
-static inline void pmu_sbi_stop_all(struct riscv_pmu *pmu)
+static inline void rvpmu_sbi_stop_all(struct riscv_pmu *pmu)
{
/*
* No need to check the error because we are disabling all the counters
@@ -819,7 +821,7 @@ static inline void pmu_sbi_stop_all(struct riscv_pmu *pmu)
0, pmu->cmask, SBI_PMU_STOP_FLAG_RESET, 0, 0, 0);
}
-static inline void pmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
+static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
{
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
struct riscv_pmu_snapshot_data *sdata = cpu_hw_evt->snapshot_addr;
@@ -863,8 +865,8 @@ static inline void pmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
* while the overflowed counters need to be started with updated initialization
* value.
*/
-static inline void pmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt,
- u64 ctr_ovf_mask)
+static inline void rvpmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt,
+ u64 ctr_ovf_mask)
{
int idx = 0, i;
struct perf_event *event;
@@ -902,8 +904,8 @@ static inline void pmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt,
}
}
-static inline void pmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
- u64 ctr_ovf_mask)
+static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
+ u64 ctr_ovf_mask)
{
int i, idx = 0;
struct perf_event *event;
@@ -937,18 +939,18 @@ static inline void pmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_
}
}
-static void pmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
- u64 ctr_ovf_mask)
+static void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
+ u64 ctr_ovf_mask)
{
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
if (sbi_pmu_snapshot_available())
- pmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
+ rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
else
- pmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask);
+ rvpmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask);
}
-static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
+static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
{
struct perf_sample_data data;
struct pt_regs *regs;
@@ -980,7 +982,7 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
}
pmu = to_riscv_pmu(event->pmu);
- pmu_sbi_stop_hw_ctrs(pmu);
+ rvpmu_sbi_stop_hw_ctrs(pmu);
/* Overflow status register should only be read after counter are stopped */
if (sbi_pmu_snapshot_available())
@@ -1049,13 +1051,55 @@ static irqreturn_t pmu_sbi_ovf_handler(int irq, void *dev)
hw_evt->state = 0;
}
- pmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
+ rvpmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
perf_sample_event_took(sched_clock() - start_clock);
return IRQ_HANDLED;
}
-static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
+static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
+{
+ rvpmu_sbi_ctr_start(event, ival);
+ /* TODO: Counter delegation implementation */
+}
+
+static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
+{
+ rvpmu_sbi_ctr_stop(event, flag);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_find_num_ctrs(void)
+{
+ return rvpmu_sbi_find_num_ctrs();
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_get_ctrinfo(int nctr, unsigned long *mask)
+{
+ return rvpmu_sbi_get_ctrinfo(nctr, mask);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
+{
+ return rvpmu_sbi_event_map(event, econfig);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_ctr_get_idx(struct perf_event *event)
+{
+ return rvpmu_sbi_ctr_get_idx(event);
+ /* TODO: Counter delegation implementation */
+}
+
+static u64 rvpmu_ctr_read(struct perf_event *event)
+{
+ return rvpmu_sbi_ctr_read(event);
+ /* TODO: Counter delegation implementation */
+}
+
+static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
{
struct riscv_pmu *pmu = hlist_entry_safe(node, struct riscv_pmu, node);
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
@@ -1070,7 +1114,7 @@ static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
csr_write(CSR_SCOUNTEREN, 0x2);
/* Stop all the counters so that they can be enabled from perf */
- pmu_sbi_stop_all(pmu);
+ rvpmu_sbi_stop_all(pmu);
if (riscv_pmu_use_irq) {
cpu_hw_evt->irq = riscv_pmu_irq;
@@ -1084,7 +1128,7 @@ static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
return 0;
}
-static int pmu_sbi_dying_cpu(unsigned int cpu, struct hlist_node *node)
+static int rvpmu_dying_cpu(unsigned int cpu, struct hlist_node *node)
{
if (riscv_pmu_use_irq) {
disable_percpu_irq(riscv_pmu_irq);
@@ -1099,7 +1143,7 @@ static int pmu_sbi_dying_cpu(unsigned int cpu, struct hlist_node *node)
return 0;
}
-static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pdev)
+static int rvpmu_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pdev)
{
int ret;
struct cpu_hw_events __percpu *hw_events = pmu->hw_events;
@@ -1139,7 +1183,7 @@ static int pmu_sbi_setup_irqs(struct riscv_pmu *pmu, struct platform_device *pde
return -ENODEV;
}
- ret = request_percpu_irq(riscv_pmu_irq, pmu_sbi_ovf_handler, "riscv-pmu", hw_events);
+ ret = request_percpu_irq(riscv_pmu_irq, rvpmu_ovf_handler, "riscv-pmu", hw_events);
if (ret) {
pr_err("registering percpu irq failed [%d]\n", ret);
return ret;
@@ -1215,7 +1259,7 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
}
-static void pmu_sbi_event_init(struct perf_event *event)
+static void rvpmu_event_init(struct perf_event *event)
{
/*
* The permissions are set at event_init so that we do not depend
@@ -1229,7 +1273,7 @@ static void pmu_sbi_event_init(struct perf_event *event)
event->hw.flags |= PERF_EVENT_FLAG_LEGACY;
}
-static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm)
+static void rvpmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS)
return;
@@ -1257,14 +1301,14 @@ static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm)
* that it is possible to do so to avoid any race.
* And we must notify all cpus here because threads that currently run
* on other cpus will try to directly access the counter too without
- * calling pmu_sbi_ctr_start.
+ * calling rvpmu_sbi_ctr_start.
*/
if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS)
on_each_cpu_mask(mm_cpumask(mm),
- pmu_sbi_set_scounteren, (void *)event, 1);
+ rvpmu_set_scounteren, (void *)event, 1);
}
-static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *mm)
+static void rvpmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS)
return;
@@ -1286,7 +1330,7 @@ static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *m
if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS)
on_each_cpu_mask(mm_cpumask(mm),
- pmu_sbi_reset_scounteren, (void *)event, 1);
+ rvpmu_reset_scounteren, (void *)event, 1);
}
static void riscv_pmu_update_counter_access(void *info)
@@ -1329,7 +1373,7 @@ static const struct ctl_table sbi_pmu_sysctl_table[] = {
},
};
-static int pmu_sbi_device_probe(struct platform_device *pdev)
+static int rvpmu_device_probe(struct platform_device *pdev)
{
struct riscv_pmu *pmu = NULL;
int ret = -ENODEV;
@@ -1340,7 +1384,7 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
if (!pmu)
return -ENOMEM;
- num_counters = pmu_sbi_find_num_ctrs();
+ num_counters = rvpmu_find_num_ctrs();
if (num_counters < 0) {
pr_err("SBI PMU extension doesn't provide any counters\n");
goto out_free;
@@ -1353,10 +1397,10 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
}
/* cache all the information about counters now */
- if (pmu_sbi_get_ctrinfo(num_counters, &cmask))
+ if (rvpmu_get_ctrinfo(num_counters, &cmask))
goto out_free;
- ret = pmu_sbi_setup_irqs(pmu, pdev);
+ ret = rvpmu_setup_irqs(pmu, pdev);
if (ret < 0) {
pr_info("Perf sampling/filtering is not supported as sscof extension is not available\n");
pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
@@ -1366,17 +1410,17 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
pmu->pmu.attr_groups = riscv_pmu_attr_groups;
pmu->pmu.parent = &pdev->dev;
pmu->cmask = cmask;
- pmu->ctr_start = pmu_sbi_ctr_start;
- pmu->ctr_stop = pmu_sbi_ctr_stop;
- pmu->event_map = pmu_sbi_event_map;
- pmu->ctr_get_idx = pmu_sbi_ctr_get_idx;
- pmu->ctr_get_width = pmu_sbi_ctr_get_width;
- pmu->ctr_clear_idx = pmu_sbi_ctr_clear_idx;
- pmu->ctr_read = pmu_sbi_ctr_read;
- pmu->event_init = pmu_sbi_event_init;
- pmu->event_mapped = pmu_sbi_event_mapped;
- pmu->event_unmapped = pmu_sbi_event_unmapped;
- pmu->csr_index = pmu_sbi_csr_index;
+ pmu->ctr_start = rvpmu_ctr_start;
+ pmu->ctr_stop = rvpmu_ctr_stop;
+ pmu->event_map = rvpmu_event_map;
+ pmu->ctr_get_idx = rvpmu_ctr_get_idx;
+ pmu->ctr_get_width = rvpmu_ctr_get_width;
+ pmu->ctr_clear_idx = rvpmu_ctr_clear_idx;
+ pmu->ctr_read = rvpmu_ctr_read;
+ pmu->event_init = rvpmu_event_init;
+ pmu->event_mapped = rvpmu_event_mapped;
+ pmu->event_unmapped = rvpmu_event_unmapped;
+ pmu->csr_index = rvpmu_csr_index;
ret = riscv_pm_pmu_register(pmu);
if (ret)
@@ -1432,14 +1476,14 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
return ret;
}
-static struct platform_driver pmu_sbi_driver = {
- .probe = pmu_sbi_device_probe,
+static struct platform_driver rvpmu_driver = {
+ .probe = rvpmu_device_probe,
.driver = {
- .name = RISCV_PMU_SBI_PDEV_NAME,
+ .name = RISCV_PMU_PDEV_NAME,
},
};
-static int __init pmu_sbi_devinit(void)
+static int __init rvpmu_devinit(void)
{
int ret;
struct platform_device *pdev;
@@ -1454,20 +1498,20 @@ static int __init pmu_sbi_devinit(void)
ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_STARTING,
"perf/riscv/pmu:starting",
- pmu_sbi_starting_cpu, pmu_sbi_dying_cpu);
+ rvpmu_starting_cpu, rvpmu_dying_cpu);
if (ret) {
pr_err("CPU hotplug notifier could not be registered: %d\n",
ret);
return ret;
}
- ret = platform_driver_register(&pmu_sbi_driver);
+ ret = platform_driver_register(&rvpmu_driver);
if (ret)
return ret;
- pdev = platform_device_register_simple(RISCV_PMU_SBI_PDEV_NAME, -1, NULL, 0);
+ pdev = platform_device_register_simple(RISCV_PMU_PDEV_NAME, -1, NULL, 0);
if (IS_ERR(pdev)) {
- platform_driver_unregister(&pmu_sbi_driver);
+ platform_driver_unregister(&rvpmu_driver);
return PTR_ERR(pdev);
}
@@ -1476,4 +1520,4 @@ static int __init pmu_sbi_devinit(void)
return ret;
}
-device_initcall(pmu_sbi_devinit)
+device_initcall(rvpmu_devinit)
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index 701974639ff2..525acd6d96d0 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -13,7 +13,7 @@
#include <linux/ptrace.h>
#include <linux/interrupt.h>
-#ifdef CONFIG_RISCV_PMU
+#ifdef CONFIG_RISCV_PMU_COMMON
/*
* The RISCV_MAX_COUNTERS parameter should be specified.
@@ -21,7 +21,7 @@
#define RISCV_MAX_COUNTERS 64
#define RISCV_OP_UNSUPP (-EOPNOTSUPP)
-#define RISCV_PMU_SBI_PDEV_NAME "riscv-pmu-sbi"
+#define RISCV_PMU_PDEV_NAME "riscv-pmu"
#define RISCV_PMU_LEGACY_PDEV_NAME "riscv-pmu-legacy"
#define RISCV_PMU_STOP_FLAG_RESET 1
@@ -87,10 +87,10 @@ void riscv_pmu_legacy_skip_init(void);
static inline void riscv_pmu_legacy_skip_init(void) {};
#endif
struct riscv_pmu *riscv_pmu_alloc(void);
-#ifdef CONFIG_RISCV_PMU_SBI
+#ifdef CONFIG_RISCV_PMU
int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr);
#endif
-#endif /* CONFIG_RISCV_PMU */
+#endif /* CONFIG_RISCV_PMU_COMMON */
#endif /* _RISCV_PMU_H */
--
2.43.0
* [PATCH v5 12/21] RISC-V: perf: Modify the counter discovery mechanism
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (10 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 11/21] RISC-V: perf: Restructure the SBI PMU code Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 13/21] RISC-V: perf: Add a mechanism to defined legacy event encoding Atish Patra
` (8 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
If both counter delegation and SBI PMU are present, counter
delegation will be used for hardware PMU counters while the SBI PMU
will be used for firmware counters. Thus, the driver has to probe
the counter info via SBI PMU to distinguish the firmware counters.
The hybrid scheme also requires improving the informational
logging messages to tell the user which underlying interface is
used for each use case.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
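The hybrid discovery described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions: `classify_counters`, `struct ctr_masks` and the `types[]` array (a simulated SBI probe result, with a negative entry standing for a missing logical counter id) are hypothetical names and are not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the SBI counter types used by the driver. */
enum ctr_type { CTR_TYPE_HW = 0, CTR_TYPE_FW = 1 };

struct ctr_masks {
	uint64_t hw_mask;	/* hardware counters reached through SBI */
	uint64_t fw_mask;	/* firmware counters, always reached through SBI */
	uint32_t num_hw;
	uint32_t num_fw;
};

/*
 * Classify the counters reported by a simulated SBI probe.
 * types[i] < 0 marks a logical counter id that does not exist; ids are
 * not expected to be contiguous, so such entries are skipped.
 */
static struct ctr_masks classify_counters(const int *types, uint32_t n,
					  bool cdeleg_available)
{
	struct ctr_masks m = { 0 };

	for (uint32_t i = 0; i < n && i < 64; i++) {
		if (types[i] < 0)
			continue;
		if (types[i] == CTR_TYPE_FW) {
			/* Firmware counters are tracked in their own mask. */
			m.fw_mask |= UINT64_C(1) << i;
			m.num_fw++;
		} else if (!cdeleg_available) {
			/* With delegation, hardware counters are found elsewhere. */
			m.hw_mask |= UINT64_C(1) << i;
			m.num_hw++;
		}
	}
	return m;
}
```

With delegation available, only the firmware mask is populated from the SBI probe, which mirrors why the driver still has to query SBI PMU even when the hardware counters are delegated.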
drivers/perf/riscv_pmu_dev.c | 130 ++++++++++++++++++++++++++++++++-----------
1 file changed, 96 insertions(+), 34 deletions(-)
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index 6cebbc16bfe4..c0397bd68b91 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -66,6 +66,20 @@ static bool sbi_v2_available;
static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available);
#define sbi_pmu_snapshot_available() \
static_branch_unlikely(&sbi_pmu_snapshot_available)
+static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available);
+static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
+
+/* Avoid unnecessary code patching in the one-time boot path */
+#define riscv_pmu_cdeleg_available_boot() \
+ static_key_enabled(&riscv_pmu_cdeleg_available)
+#define riscv_pmu_sbi_available_boot() \
+ static_key_enabled(&riscv_pmu_sbi_available)
+
+/* Perform a runtime code patching with static key */
+#define riscv_pmu_cdeleg_available() \
+ static_branch_unlikely(&riscv_pmu_cdeleg_available)
+#define riscv_pmu_sbi_available() \
+ static_branch_likely(&riscv_pmu_sbi_available)
static struct attribute *riscv_arch_formats_attr[] = {
&format_attr_event.attr,
@@ -88,7 +102,8 @@ static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS;
/*
* This structure is SBI specific but counter delegation also require counter
- * width, csr mapping. Reuse it for now.
+ * width, csr mapping. Reuse it for now as we can have firmware counters on
+ * platforms with counter delegation support.
* RISC-V doesn't have heterogeneous harts yet. This need to be part of
* per_cpu in case of harts with different pmu counters
*/
@@ -100,6 +115,8 @@ static unsigned int riscv_pmu_irq;
/* Cache the available counters in a bitmask */
static unsigned long cmask;
+/* Cache the available firmware counters in another bitmask */
+static unsigned long firmware_cmask;
struct sbi_pmu_event_data {
union {
@@ -780,34 +797,38 @@ static int rvpmu_sbi_find_num_ctrs(void)
return sbi_err_map_linux_errno(ret.error);
}
-static int rvpmu_sbi_get_ctrinfo(int nctr, unsigned long *mask)
+static u32 rvpmu_deleg_find_ctrs(void)
+{
+ /* TODO */
+ return 0;
+}
+
+static int rvpmu_sbi_get_ctrinfo(u32 nsbi_ctr, u32 *num_fw_ctr, u32 *num_hw_ctr)
{
struct sbiret ret;
- int i, num_hw_ctr = 0, num_fw_ctr = 0;
+ int i;
union sbi_pmu_ctr_info cinfo;
- pmu_ctr_list = kcalloc(nctr, sizeof(*pmu_ctr_list), GFP_KERNEL);
- if (!pmu_ctr_list)
- return -ENOMEM;
-
- for (i = 0; i < nctr; i++) {
+ for (i = 0; i < nsbi_ctr; i++) {
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_GET_INFO, i, 0, 0, 0, 0, 0);
if (ret.error)
/* The logical counter ids are not expected to be contiguous */
continue;
- *mask |= BIT(i);
-
cinfo.value = ret.value;
- if (cinfo.type == SBI_PMU_CTR_TYPE_FW)
- num_fw_ctr++;
- else
- num_hw_ctr++;
- pmu_ctr_list[i].value = cinfo.value;
+ if (cinfo.type == SBI_PMU_CTR_TYPE_FW) {
+ /* Track firmware counters in a different mask */
+ firmware_cmask |= BIT(i);
+ pmu_ctr_list[i].value = cinfo.value;
+ *num_fw_ctr = *num_fw_ctr + 1;
+ } else if (cinfo.type == SBI_PMU_CTR_TYPE_HW &&
+ !riscv_pmu_cdeleg_available_boot()) {
+ *num_hw_ctr = *num_hw_ctr + 1;
+ cmask |= BIT(i);
+ pmu_ctr_list[i].value = cinfo.value;
+ }
}
- pr_info("%d firmware and %d hardware counters\n", num_fw_ctr, num_hw_ctr);
-
return 0;
}
@@ -1069,16 +1090,41 @@ static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
/* TODO: Counter delegation implementation */
}
-static int rvpmu_find_num_ctrs(void)
+static int rvpmu_find_ctrs(void)
{
- return rvpmu_sbi_find_num_ctrs();
- /* TODO: Counter delegation implementation */
-}
+ u32 num_sbi_counters = 0, num_deleg_counters = 0;
+ u32 num_hw_ctr = 0, num_fw_ctr = 0, num_ctr = 0;
+ /*
+ * We don't know how many firmware counters are available. Just allocate
+ * for the maximum number of counters the driver can support. The default is 64 anyway.
+ */
+ pmu_ctr_list = kcalloc(RISCV_MAX_COUNTERS, sizeof(*pmu_ctr_list),
+ GFP_KERNEL);
+ if (!pmu_ctr_list)
+ return -ENOMEM;
-static int rvpmu_get_ctrinfo(int nctr, unsigned long *mask)
-{
- return rvpmu_sbi_get_ctrinfo(nctr, mask);
- /* TODO: Counter delegation implementation */
+ if (riscv_pmu_cdeleg_available_boot())
+ num_deleg_counters = rvpmu_deleg_find_ctrs();
+
+ /* This is required for firmware counters even if the above is true */
+ if (riscv_pmu_sbi_available_boot()) {
+ num_sbi_counters = rvpmu_sbi_find_num_ctrs();
+ /* cache all the information about counters now */
+ rvpmu_sbi_get_ctrinfo(num_sbi_counters, &num_fw_ctr, &num_hw_ctr);
+ }
+
+ if (num_sbi_counters > RISCV_MAX_COUNTERS || num_deleg_counters > RISCV_MAX_COUNTERS)
+ return -ENOSPC;
+
+ if (riscv_pmu_cdeleg_available_boot()) {
+ pr_info("%u firmware and %u hardware counters\n", num_fw_ctr, num_deleg_counters);
+ num_ctr = num_fw_ctr + num_deleg_counters;
+ } else {
+ pr_info("%u firmware and %u hardware counters\n", num_fw_ctr, num_hw_ctr);
+ num_ctr = num_sbi_counters;
+ }
+
+ return num_ctr;
}
static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
@@ -1379,12 +1425,21 @@ static int rvpmu_device_probe(struct platform_device *pdev)
int ret = -ENODEV;
int num_counters;
- pr_info("SBI PMU extension is available\n");
+ if (riscv_pmu_cdeleg_available_boot()) {
+ pr_info("hpmcounters will use the counter delegation ISA extension\n");
+ if (riscv_pmu_sbi_available_boot())
+ pr_info("Firmware counters will use SBI PMU extension\n");
+ else
+ pr_info("Firmware counters will not be available as SBI PMU extension is not present\n");
+ } else if (riscv_pmu_sbi_available_boot()) {
+ pr_info("Both hpmcounters and firmware counters will use SBI PMU extension\n");
+ }
+
pmu = riscv_pmu_alloc();
if (!pmu)
return -ENOMEM;
- num_counters = rvpmu_find_num_ctrs();
+ num_counters = rvpmu_find_ctrs();
if (num_counters < 0) {
pr_err("SBI PMU extension doesn't provide any counters\n");
goto out_free;
@@ -1396,9 +1451,6 @@ static int rvpmu_device_probe(struct platform_device *pdev)
pr_info("SBI returned more than maximum number of counters. Limiting the number of counters to %d\n", num_counters);
}
- /* cache all the information about counters now */
- if (rvpmu_get_ctrinfo(num_counters, &cmask))
- goto out_free;
ret = rvpmu_setup_irqs(pmu, pdev);
if (ret < 0) {
@@ -1488,13 +1540,23 @@ static int __init rvpmu_devinit(void)
int ret;
struct platform_device *pdev;
- if (sbi_spec_version < sbi_mk_version(0, 3) ||
- !sbi_probe_extension(SBI_EXT_PMU)) {
- return 0;
- }
+ if (sbi_spec_version >= sbi_mk_version(0, 3) &&
+ sbi_probe_extension(SBI_EXT_PMU))
+ static_branch_enable(&riscv_pmu_sbi_available);
if (sbi_spec_version >= sbi_mk_version(2, 0))
sbi_v2_available = true;
+ /*
+ * We need all three extensions to be present to access the counters
+ * in S-mode via Supervisor Counter delegation.
+ */
+ if (riscv_isa_extension_available(NULL, SSCCFG) &&
+ riscv_isa_extension_available(NULL, SMCDELEG) &&
+ riscv_isa_extension_available(NULL, SSCSRIND))
+ static_branch_enable(&riscv_pmu_cdeleg_available);
+
+ if (!(riscv_pmu_sbi_available_boot() || riscv_pmu_cdeleg_available_boot()))
+ return 0;
ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_RISCV_STARTING,
"perf/riscv/pmu:starting",
--
2.43.0
* [PATCH v5 13/21] RISC-V: perf: Add a mechanism to defined legacy event encoding
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (11 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 12/21] RISC-V: perf: Modify the counter discovery mechanism Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 14/21] RISC-V: perf: Implement supervisor counter delegation support Atish Patra
` (7 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
The RISC-V ISA doesn't define any standard event encodings or specify
any event to counter mapping. Thus, event encoding information
and the corresponding counter mapping for those events need to be
provided in the driver for each vendor.
Add a framework to support that. The individual platform events
will be added later.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
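As a rough illustration of the framework described above, the lookup by (vendorid, archid, implid) might look like the userspace sketch below. Everything in it is an assumption made for the example: the table contents, the ids `0x1234/0x1/0x10`, the event encodings and the helper name `find_hw_event_map` are made up and do not describe any real platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pmu_event {
	uint64_t event_id;	/* vendor specific event encoding */
	uint32_t counter_mask;	/* counters that may count this event */
};

struct vendor_events {
	unsigned long vendorid, archid, implid;
	const struct pmu_event *hw_event_map;
};

/* Made-up event encodings for an imaginary vendor. */
static const struct pmu_event example_hw_map[] = {
	{ 0x01, 0x1 },	/* e.g. cycles, counter 0 only */
	{ 0x02, 0x4 },	/* e.g. instructions, counter 2 only */
};

static const struct vendor_events vendor_table[] = {
	{ 0x1234, 0x1, 0x10, example_hw_map },
};

/* Return the event map for an exact id match, or NULL if none is known. */
static const struct pmu_event *find_hw_event_map(unsigned long vendorid,
						 unsigned long archid,
						 unsigned long implid)
{
	size_t n = sizeof(vendor_table) / sizeof(vendor_table[0]);

	for (size_t i = 0; i < n; i++) {
		const struct vendor_events *v = &vendor_table[i];

		if (v->vendorid == vendorid && v->archid == archid &&
		    v->implid == implid)
			return v->hw_event_map;
	}
	return NULL;
}
```

A NULL result corresponds to the "No default PMU events found" case in the patch, where no table entry matches the running hart.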
drivers/perf/riscv_pmu_dev.c | 54 +++++++++++++++++++++++++++++++++++++++++-
include/linux/perf/riscv_pmu.h | 13 ++++++++++
2 files changed, 66 insertions(+), 1 deletion(-)
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index c0397bd68b91..6f64404a6e3d 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -317,6 +317,56 @@ static struct sbi_pmu_event_data pmu_cache_event_sbi_map[PERF_COUNT_HW_CACHE_MAX
},
};
+/*
+ * Vendor specific PMU events.
+ */
+struct riscv_pmu_event {
+ u64 event_id;
+ u32 counter_mask;
+};
+
+struct riscv_vendor_pmu_events {
+ unsigned long vendorid;
+ unsigned long archid;
+ unsigned long implid;
+ const struct riscv_pmu_event *hw_event_map;
+ const struct riscv_pmu_event (*cache_event_map)[PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX];
+};
+
+#define RISCV_VENDOR_PMU_EVENTS(_vendorid, _archid, _implid, _hw_event_map, _cache_event_map) \
+ { .vendorid = _vendorid, .archid = _archid, .implid = _implid, \
+ .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map },
+
+static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = {
+};
+
+const struct riscv_pmu_event *current_pmu_hw_event_map;
+const struct riscv_pmu_event (*current_pmu_cache_event_map)[PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX];
+
+static void rvpmu_vendor_register_events(void)
+{
+ int cpu = raw_smp_processor_id();
+ unsigned long vendor_id = riscv_cached_mvendorid(cpu);
+ unsigned long impl_id = riscv_cached_mimpid(cpu);
+ unsigned long arch_id = riscv_cached_marchid(cpu);
+
+ for (int i = 0; i < ARRAY_SIZE(pmu_vendor_events_table); i++) {
+ if (pmu_vendor_events_table[i].vendorid == vendor_id &&
+ pmu_vendor_events_table[i].implid == impl_id &&
+ pmu_vendor_events_table[i].archid == arch_id) {
+ current_pmu_hw_event_map = pmu_vendor_events_table[i].hw_event_map;
+ current_pmu_cache_event_map = pmu_vendor_events_table[i].cache_event_map;
+ break;
+ }
+ }
+
+ if (!current_pmu_hw_event_map || !current_pmu_cache_event_map) {
+ pr_info("No default PMU events found\n");
+ }
+}
+
static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata)
{
struct sbiret ret;
@@ -1552,8 +1602,10 @@ static int __init rvpmu_devinit(void)
*/
if (riscv_isa_extension_available(NULL, SSCCFG) &&
riscv_isa_extension_available(NULL, SMCDELEG) &&
- riscv_isa_extension_available(NULL, SSCSRIND))
+ riscv_isa_extension_available(NULL, SSCSRIND)) {
static_branch_enable(&riscv_pmu_cdeleg_available);
+ rvpmu_vendor_register_events();
+ }
if (!(riscv_pmu_sbi_available_boot() || riscv_pmu_cdeleg_available_boot()))
return 0;
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index 525acd6d96d0..a3e1fdd5084a 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -28,6 +28,19 @@
#define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1
+#define HW_OP_UNSUPPORTED 0xFFFF
+#define CACHE_OP_UNSUPPORTED 0xFFFF
+
+#define PERF_MAP_ALL_UNSUPPORTED \
+ [0 ... PERF_COUNT_HW_MAX - 1] = {HW_OP_UNSUPPORTED, 0x0}
+
+#define PERF_CACHE_MAP_ALL_UNSUPPORTED \
+[0 ... C(MAX) - 1] = { \
+ [0 ... C(OP_MAX) - 1] = { \
+ [0 ... C(RESULT_MAX) - 1] = {CACHE_OP_UNSUPPORTED, 0x0} \
+ }, \
+}
+
struct cpu_hw_events {
/* currently enabled events */
int n_events;
--
2.43.0
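The PERF_MAP_ALL_UNSUPPORTED style of macro in the patch above relies on range designators (`[a ... b] = value`), a GNU C extension accepted by gcc and clang, where a later designator overrides an earlier one. A minimal sketch of that idiom, using hypothetical names (`EV_MAX`, `EV_UNSUPPORTED`, `example_map`, `event_supported`) that are not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

#define EV_MAX		4
#define EV_UNSUPPORTED	0xFFFF

struct pmu_event {
	uint64_t event_id;
	uint32_t counter_mask;
};

/*
 * Mark every slot unsupported first, then override the one event this
 * imaginary platform implements. The later initializer for [1] wins.
 */
static const struct pmu_event example_map[EV_MAX] = {
	[0 ... EV_MAX - 1] = { EV_UNSUPPORTED, 0x0 },
	[1] = { 0x42, 0x8 },
};

/* An event is usable only if its slot was explicitly overridden. */
static int event_supported(unsigned int idx)
{
	return idx < EV_MAX && example_map[idx].event_id != EV_UNSUPPORTED;
}
```

Some compilers warn about the overridden initializer when extra warnings are enabled; the override is intentional here, since it lets a vendor table list only the events it actually implements.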
* [PATCH v5 14/21] RISC-V: perf: Implement supervisor counter delegation support
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (12 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 13/21] RISC-V: perf: Add a mechanism to defined legacy event encoding Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-08-28 9:56 ` [External] " yunhui cui
2025-03-27 19:35 ` [PATCH v5 15/21] RISC-V: perf: Skip PMU SBI extension when not implemented Atish Patra
` (6 subsequent siblings)
20 siblings, 1 reply; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
There are a few new RISC-V ISA extensions (ssccfg, sscsrind, smcntrpmf) which
allow the hpmcounter/hpmevents to be programmed directly from S-mode. The
implementation detects the ISA extensions at runtime and uses them if
available instead of the SBI PMU extension. The SBI PMU extension will still be
used for firmware counters if the user requests it.
The current Linux driver relies on the event encoding defined by the SBI PMU
specification for standard perf events. However, there is no standard
event encoding available in the ISA. In the future, we may want to
decouple the counter delegation and SBI PMU completely. In that case,
counter delegation supported platforms must rely on the event encoding
defined in the perf json file or in the pmu driver.
For firmware events, it will continue to use the SBI PMU encoding as
one cannot support firmware events without SBI PMU.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/csr.h | 1 +
drivers/perf/riscv_pmu_dev.c | 561 +++++++++++++++++++++++++++++++++--------
include/linux/perf/riscv_pmu.h | 3 +
3 files changed, 462 insertions(+), 103 deletions(-)
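As a reading aid for the diff below, the backend selection described in the commit message can be sketched in plain C. This is a simplified user-space model, not the driver code: EX_TYPE_RAW and the ex_* names are hypothetical, while the bit-63 firmware test and the 55:0 raw event mask follow pmu_sbi_is_fw_event() and RISCV_PMU_DELEG_RAW_EVENT_MASK in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf raw event type; value is arbitrary. */
#define EX_TYPE_RAW 4

enum ex_backend { EX_BACKEND_SBI, EX_BACKEND_CDELEG };

/* Firmware events are raw events with bit 63 of config set. */
static bool ex_is_fw_event(uint32_t type, uint64_t config)
{
	return type == EX_TYPE_RAW && (config >> 63) == 1;
}

/* Delegation handles everything except firmware events, when available. */
static enum ex_backend ex_pick_backend(bool cdeleg_available, uint32_t type,
				       uint64_t config)
{
	if (cdeleg_available && !ex_is_fw_event(type, config))
		return EX_BACKEND_CDELEG;
	return EX_BACKEND_SBI;
}

/* Raw events under delegation keep only bits 55:0 of config. */
static uint64_t ex_deleg_raw_config(uint64_t config)
{
	return config & ((UINT64_C(1) << 56) - 1);
}
```

The same check appears in rvpmu_ctr_start(), rvpmu_ctr_stop(), rvpmu_event_map() and rvpmu_ctr_get_idx(), so a firmware event always takes the SBI path even on a delegation-capable platform.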
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 3d2d4f886c77..8b2f5ae1d60e 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -255,6 +255,7 @@
#endif
#define SISELECT_SSCCFG_BASE 0x40
+#define HPMEVENT_MASK GENMASK_ULL(63, 56)
/* mseccfg bits */
#define MSECCFG_PMM ENVCFG_PMM
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index 6f64404a6e3d..7c4a1ef15866 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -27,6 +27,8 @@
#include <asm/cpufeature.h>
#include <asm/vendor_extensions.h>
#include <asm/vendor_extensions/andes.h>
+#include <asm/hwcap.h>
+#include <asm/csr_ind.h>
#define ALT_SBI_PMU_OVERFLOW(__ovl) \
asm volatile(ALTERNATIVE_2( \
@@ -59,14 +61,31 @@ asm volatile(ALTERNATIVE( \
#define PERF_EVENT_FLAG_USER_ACCESS BIT(SYSCTL_USER_ACCESS)
#define PERF_EVENT_FLAG_LEGACY BIT(SYSCTL_LEGACY)
-PMU_FORMAT_ATTR(event, "config:0-47");
+#define RVPMU_SBI_PMU_FORMAT_ATTR "config:0-47"
+#define RVPMU_CDELEG_PMU_FORMAT_ATTR "config:0-55"
+
+static ssize_t __maybe_unused rvpmu_format_show(struct device *dev, struct device_attribute *attr,
+ char *buf);
+
+#define RVPMU_ATTR_ENTRY(_name, _func, _config) ( \
+ &((struct dev_ext_attribute[]) { \
+ { __ATTR(_name, 0444, _func, NULL), (void *)_config } \
+ })[0].attr.attr)
+
+#define RVPMU_FORMAT_ATTR_ENTRY(_name, _config) \
+ RVPMU_ATTR_ENTRY(_name, rvpmu_format_show, (char *)_config)
+
PMU_FORMAT_ATTR(firmware, "config:62-63");
static bool sbi_v2_available;
static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available);
#define sbi_pmu_snapshot_available() \
static_branch_unlikely(&sbi_pmu_snapshot_available)
+
static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available);
+#define riscv_pmu_sbi_available() \
+ static_branch_likely(&riscv_pmu_sbi_available)
+
static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
/* Avoid unnecessary code patching in the one time booting path*/
@@ -81,19 +100,35 @@ static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
#define riscv_pmu_sbi_available() \
static_branch_likely(&riscv_pmu_sbi_available)
-static struct attribute *riscv_arch_formats_attr[] = {
- &format_attr_event.attr,
+static struct attribute *riscv_sbi_pmu_formats_attr[] = {
+ RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_SBI_PMU_FORMAT_ATTR),
+ &format_attr_firmware.attr,
+ NULL,
+};
+
+static struct attribute_group riscv_sbi_pmu_format_group = {
+ .name = "format",
+ .attrs = riscv_sbi_pmu_formats_attr,
+};
+
+static const struct attribute_group *riscv_sbi_pmu_attr_groups[] = {
+ &riscv_sbi_pmu_format_group,
+ NULL,
+};
+
+static struct attribute *riscv_cdeleg_pmu_formats_attr[] = {
+ RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_CDELEG_PMU_FORMAT_ATTR),
&format_attr_firmware.attr,
NULL,
};
-static struct attribute_group riscv_pmu_format_group = {
+static struct attribute_group riscv_cdeleg_pmu_format_group = {
.name = "format",
- .attrs = riscv_arch_formats_attr,
+ .attrs = riscv_cdeleg_pmu_formats_attr,
};
-static const struct attribute_group *riscv_pmu_attr_groups[] = {
- &riscv_pmu_format_group,
+static const struct attribute_group *riscv_cdeleg_pmu_attr_groups[] = {
+ &riscv_cdeleg_pmu_format_group,
NULL,
};
@@ -395,6 +430,14 @@ static void rvpmu_sbi_check_std_events(struct work_struct *work)
static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events);
+static ssize_t rvpmu_format_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct dev_ext_attribute *eattr = container_of(attr,
+ struct dev_ext_attribute, attr);
+ return sysfs_emit(buf, "%s\n", (char *)eattr->var);
+}
+
static int rvpmu_ctr_get_width(int idx)
{
return pmu_ctr_list[idx].width;
@@ -447,6 +490,38 @@ static uint8_t rvpmu_csr_index(struct perf_event *event)
return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE;
}
+static uint64_t get_deleg_priv_filter_bits(struct perf_event *event)
+{
+ u64 priv_filter_bits = 0;
+ bool guest_events = false;
+
+ if (event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS)
+ guest_events = true;
+ if (event->attr.exclude_kernel)
+ priv_filter_bits |= guest_events ? HPMEVENT_VSINH : HPMEVENT_SINH;
+ if (event->attr.exclude_user)
+ priv_filter_bits |= guest_events ? HPMEVENT_VUINH : HPMEVENT_UINH;
+ if (guest_events && event->attr.exclude_hv)
+ priv_filter_bits |= HPMEVENT_SINH;
+ if (event->attr.exclude_host)
+ priv_filter_bits |= HPMEVENT_UINH | HPMEVENT_SINH;
+ if (event->attr.exclude_guest)
+ priv_filter_bits |= HPMEVENT_VSINH | HPMEVENT_VUINH;
+
+ return priv_filter_bits;
+}
+
+static bool pmu_sbi_is_fw_event(struct perf_event *event)
+{
+ u32 type = event->attr.type;
+ u64 config = event->attr.config;
+
+ if (type == PERF_TYPE_RAW && ((config >> 63) == 1))
+ return true;
+ else
+ return false;
+}
+
static unsigned long rvpmu_sbi_get_filter_flags(struct perf_event *event)
{
unsigned long cflags = 0;
@@ -475,7 +550,8 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
struct sbiret ret;
int idx;
- uint64_t cbase = 0, cmask = rvpmu->cmask;
+ u64 cbase = 0;
+ unsigned long ctr_mask = rvpmu->cmask;
unsigned long cflags = 0;
cflags = rvpmu_sbi_get_filter_flags(event);
@@ -488,21 +564,23 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
if ((hwc->flags & PERF_EVENT_FLAG_LEGACY) && (event->attr.type == PERF_TYPE_HARDWARE)) {
if (event->attr.config == PERF_COUNT_HW_CPU_CYCLES) {
cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
- cmask = 1;
+ ctr_mask = 1;
} else if (event->attr.config == PERF_COUNT_HW_INSTRUCTIONS) {
cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
- cmask = BIT(CSR_INSTRET - CSR_CYCLE);
+ ctr_mask = BIT(CSR_INSTRET - CSR_CYCLE);
}
+ } else if (pmu_sbi_is_fw_event(event)) {
+ ctr_mask = firmware_cmask;
}
/* retrieve the available counter index */
#if defined(CONFIG_32BIT)
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
- cmask, cflags, hwc->event_base, hwc->config,
+ ctr_mask, cflags, hwc->event_base, hwc->config,
hwc->config >> 32);
#else
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
- cmask, cflags, hwc->event_base, hwc->config, 0);
+ ctr_mask, cflags, hwc->event_base, hwc->config, 0);
#endif
if (ret.error) {
pr_debug("Not able to find a counter for event %lx config %llx\n",
@@ -511,7 +589,7 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
}
idx = ret.value;
- if (!test_bit(idx, &rvpmu->cmask) || !pmu_ctr_list[idx].value)
+ if (!test_bit(idx, &ctr_mask) || !pmu_ctr_list[idx].value)
return -ENOENT;
/* Additional sanity check for the counter id */
@@ -561,17 +639,6 @@ static int sbi_pmu_event_find_cache(u64 config)
return ret;
}
-static bool pmu_sbi_is_fw_event(struct perf_event *event)
-{
- u32 type = event->attr.type;
- u64 config = event->attr.config;
-
- if ((type == PERF_TYPE_RAW) && ((config >> 63) == 1))
- return true;
- else
- return false;
-}
-
static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
{
u32 type = event->attr.type;
@@ -602,7 +669,6 @@ static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
* 10 - SBI firmware events
* 11 - Risc-V platform specific firmware event
*/
-
switch (config >> 62) {
case 0:
/* Return error any bits [48-63] is set as it is not allowed by the spec */
@@ -634,6 +700,84 @@ static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
return ret;
}
+static int cdeleg_pmu_event_find_cache(u64 config, u64 *eventid, uint32_t *counter_mask)
+{
+ unsigned int cache_type, cache_op, cache_result;
+
+ if (!current_pmu_cache_event_map)
+ return -ENOENT;
+
+ cache_type = (config >> 0) & 0xff;
+ if (cache_type >= PERF_COUNT_HW_CACHE_MAX)
+ return -EINVAL;
+
+ cache_op = (config >> 8) & 0xff;
+ if (cache_op >= PERF_COUNT_HW_CACHE_OP_MAX)
+ return -EINVAL;
+
+ cache_result = (config >> 16) & 0xff;
+ if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
+ return -EINVAL;
+
+ if (eventid)
+ *eventid = current_pmu_cache_event_map[cache_type][cache_op]
+ [cache_result].event_id;
+ if (counter_mask)
+ *counter_mask = current_pmu_cache_event_map[cache_type][cache_op]
+ [cache_result].counter_mask;
+
+ return 0;
+}
+
+static int rvpmu_cdeleg_event_map(struct perf_event *event, u64 *econfig)
+{
+ u32 type = event->attr.type;
+ u64 config = event->attr.config;
+ int ret = 0;
+
+ /*
+ * There are two ways standard perf events can be mapped to platform specific
+ * encoding.
+ * 1. The vendor may specify the encodings in the driver.
+ * 2. The Perf tool for RISC-V may remap the standard perf event to platform
+ * specific encoding.
+ *
+ * The RISC-V ISA doesn't define any standard event encoding, so the perf tool allows
+ * vendors to define it via a json file. The encoding defined in the json will override
+ * the perf legacy encoding. However, some users may want to run performance
+ * monitoring without the perf tool as well. That's why vendors may specify the event
+ * encoding in the driver as well if they want to support that use case too.
+ * If an encoding is defined in the json, it will be encoded as a raw event.
+ */
+
+ switch (type) {
+ case PERF_TYPE_HARDWARE:
+ if (config >= PERF_COUNT_HW_MAX)
+ return -EINVAL;
+ if (!current_pmu_hw_event_map)
+ return -ENOENT;
+
+ *econfig = current_pmu_hw_event_map[config].event_id;
+ if (*econfig == HW_OP_UNSUPPORTED)
+ ret = -ENOENT;
+ break;
+ case PERF_TYPE_HW_CACHE:
+ ret = cdeleg_pmu_event_find_cache(config, econfig, NULL);
+ if (!ret && *econfig == CACHE_OP_UNSUPPORTED)
+ ret = -ENOENT;
+ break;
+ case PERF_TYPE_RAW:
+ *econfig = config & RISCV_PMU_DELEG_RAW_EVENT_MASK;
+ break;
+ default:
+ ret = -ENOENT;
+ break;
+ }
+
+ /* event_base is not used for counter delegation */
+ return ret;
+}
+
static void pmu_sbi_snapshot_free(struct riscv_pmu *pmu)
{
int cpu;
@@ -717,7 +861,7 @@ static int pmu_sbi_snapshot_setup(struct riscv_pmu *pmu, int cpu)
return 0;
}
-static u64 rvpmu_sbi_ctr_read(struct perf_event *event)
+static u64 rvpmu_ctr_read(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
@@ -794,10 +938,6 @@ static void rvpmu_sbi_ctr_start(struct perf_event *event, u64 ival)
if (ret.error && (ret.error != SBI_ERR_ALREADY_STARTED))
pr_err("Starting counter idx %d failed with error %d\n",
hwc->idx, sbi_err_map_linux_errno(ret.error));
-
- if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
- (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- rvpmu_set_scounteren((void *)event);
}
static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
@@ -808,10 +948,6 @@ static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
struct riscv_pmu_snapshot_data *sdata = cpu_hw_evt->snapshot_addr;
- if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
- (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
- rvpmu_reset_scounteren((void *)event);
-
if (sbi_pmu_snapshot_available())
flag |= SBI_PMU_STOP_FLAG_TAKE_SNAPSHOT;
@@ -847,12 +983,6 @@ static int rvpmu_sbi_find_num_ctrs(void)
return sbi_err_map_linux_errno(ret.error);
}
-static u32 rvpmu_deleg_find_ctrs(void)
-{
- /* TODO */
- return 0;
-}
-
static int rvpmu_sbi_get_ctrinfo(u32 nsbi_ctr, u32 *num_fw_ctr, u32 *num_hw_ctr)
{
struct sbiret ret;
@@ -930,53 +1060,75 @@ static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
}
}
-/*
- * This function starts all the used counters in two step approach.
- * Any counter that did not overflow can be start in a single step
- * while the overflowed counters need to be started with updated initialization
- * value.
- */
-static inline void rvpmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt,
- u64 ctr_ovf_mask)
+static void rvpmu_deleg_ctr_start_mask(unsigned long mask)
{
- int idx = 0, i;
- struct perf_event *event;
- unsigned long flag = SBI_PMU_START_FLAG_SET_INIT_VALUE;
- unsigned long ctr_start_mask = 0;
- uint64_t max_period;
- struct hw_perf_event *hwc;
- u64 init_val = 0;
+ unsigned long scountinhibit_val = 0;
- for (i = 0; i < BITS_TO_LONGS(RISCV_MAX_COUNTERS); i++) {
- ctr_start_mask = cpu_hw_evt->used_hw_ctrs[i] & ~ctr_ovf_mask;
- /* Start all the counters that did not overflow in a single shot */
- sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, i * BITS_PER_LONG, ctr_start_mask,
- 0, 0, 0, 0);
- }
+ scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
+ scountinhibit_val &= ~mask;
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
+}
+
+static void rvpmu_deleg_ctr_enable_irq(struct perf_event *event)
+{
+ unsigned long hpmevent_curr;
+ unsigned long of_mask;
+ struct hw_perf_event *hwc = &event->hw;
+ int counter_idx = hwc->idx;
+ unsigned long sip_val = csr_read(CSR_SIP);
+
+ if (!is_sampling_event(event) || (sip_val & SIP_LCOFIP))
+ return;
- /* Reinitialize and start all the counter that overflowed */
- while (ctr_ovf_mask) {
- if (ctr_ovf_mask & 0x01) {
- event = cpu_hw_evt->events[idx];
- hwc = &event->hw;
- max_period = riscv_pmu_ctr_get_width_mask(event);
- init_val = local64_read(&hwc->prev_count) & max_period;
#if defined(CONFIG_32BIT)
- sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
- flag, init_val, init_val >> 32, 0);
+ hpmevent_curr = csr_ind_read(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx);
+ of_mask = (u32)~HPMEVENTH_OF;
#else
- sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
- flag, init_val, 0, 0);
+ hpmevent_curr = csr_ind_read(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx);
+ of_mask = ~HPMEVENT_OF;
+#endif
+
+ hpmevent_curr &= of_mask;
+#if defined(CONFIG_32BIT)
+ csr_ind_write(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr);
+#else
+ csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr);
#endif
- perf_event_update_userpage(event);
- }
- ctr_ovf_mask = ctr_ovf_mask >> 1;
- idx++;
- }
}
-static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
- u64 ctr_ovf_mask)
+static void rvpmu_deleg_ctr_start(struct perf_event *event, u64 ival)
+{
+ unsigned long scountinhibit_val = 0;
+ struct hw_perf_event *hwc = &event->hw;
+
+#if defined(CONFIG_32BIT)
+ csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival & 0xFFFFFFFF);
+ csr_ind_write(CSR_SIREG4, SISELECT_SSCCFG_BASE, hwc->idx, ival >> BITS_PER_LONG);
+#else
+ csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival);
+#endif
+
+ rvpmu_deleg_ctr_enable_irq(event);
+
+ scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
+ scountinhibit_val &= ~BIT(hwc->idx);
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
+}
+
+static void rvpmu_deleg_ctr_stop_mask(unsigned long mask)
+{
+ unsigned long scountinhibit_val = 0;
+
+ scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
+ scountinhibit_val |= mask;
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
+}
+
+static void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
+ u64 ctr_ovf_mask)
{
int i, idx = 0;
struct perf_event *event;
@@ -1010,15 +1162,53 @@ static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_h
}
}
-static void rvpmu_sbi_start_overflow_mask(struct riscv_pmu *pmu,
- u64 ctr_ovf_mask)
+/*
+ * This function starts all the used counters in a two-step approach.
+ * Any counter that did not overflow can be started in a single step,
+ * while the overflowed counters need to be started with an updated initialization
+ * value.
+ */
+static void rvpmu_start_overflow_mask(struct riscv_pmu *pmu, u64 ctr_ovf_mask)
{
+ int idx = 0, i;
+ struct perf_event *event;
+ unsigned long ctr_start_mask = 0;
+ u64 max_period, init_val = 0;
+ struct hw_perf_event *hwc;
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
if (sbi_pmu_snapshot_available())
- rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
- else
- rvpmu_sbi_start_ovf_ctrs_sbi(cpu_hw_evt, ctr_ovf_mask);
+ return rvpmu_sbi_start_ovf_ctrs_snapshot(cpu_hw_evt, ctr_ovf_mask);
+
+ /* Start all the counters that did not overflow */
+ if (riscv_pmu_cdeleg_available()) {
+ ctr_start_mask = cpu_hw_evt->used_hw_ctrs[0] & ~ctr_ovf_mask;
+ rvpmu_deleg_ctr_start_mask(ctr_start_mask);
+ } else {
+ for (i = 0; i < BITS_TO_LONGS(RISCV_MAX_COUNTERS); i++) {
+ ctr_start_mask = cpu_hw_evt->used_hw_ctrs[i] & ~ctr_ovf_mask;
+ /* Start all the counters that did not overflow in a single shot */
+ sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, i * BITS_PER_LONG,
+ ctr_start_mask, 0, 0, 0, 0);
+ }
+ }
+
+ /* Reinitialize and start all the counters that overflowed */
+ while (ctr_ovf_mask) {
+ if (ctr_ovf_mask & 0x01) {
+ event = cpu_hw_evt->events[idx];
+ hwc = &event->hw;
+ max_period = riscv_pmu_ctr_get_width_mask(event);
+ init_val = local64_read(&hwc->prev_count) & max_period;
+ if (riscv_pmu_cdeleg_available())
+ rvpmu_deleg_ctr_start(event, init_val);
+ else
+ rvpmu_sbi_ctr_start(event, init_val);
+ perf_event_update_userpage(event);
+ }
+ ctr_ovf_mask = ctr_ovf_mask >> 1;
+ idx++;
+ }
}
static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
@@ -1053,7 +1243,10 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
}
pmu = to_riscv_pmu(event->pmu);
- rvpmu_sbi_stop_hw_ctrs(pmu);
+ if (riscv_pmu_cdeleg_available())
+ rvpmu_deleg_ctr_stop_mask(cpu_hw_evt->used_hw_ctrs[0]);
+ else
+ rvpmu_sbi_stop_hw_ctrs(pmu);
/* Overflow status register should only be read after counter are stopped */
if (sbi_pmu_snapshot_available())
@@ -1122,25 +1315,177 @@ static irqreturn_t rvpmu_ovf_handler(int irq, void *dev)
hw_evt->state = 0;
}
- rvpmu_sbi_start_overflow_mask(pmu, overflowed_ctrs);
+ rvpmu_start_overflow_mask(pmu, overflowed_ctrs);
perf_sample_event_took(sched_clock() - start_clock);
return IRQ_HANDLED;
}
+static int get_deleg_hw_ctr_width(int counter_offset)
+{
+ unsigned long hpm_warl;
+ int num_bits;
+
+ if (counter_offset < 3 || counter_offset > 31)
+ return 0;
+
+ hpm_warl = csr_ind_warl(CSR_SIREG, SISELECT_SSCCFG_BASE, counter_offset, -1);
+ num_bits = __fls(hpm_warl);
+
+#if defined(CONFIG_32BIT)
+ hpm_warl = csr_ind_warl(CSR_SIREG4, SISELECT_SSCCFG_BASE, counter_offset, -1);
+ num_bits += __fls(hpm_warl);
+#endif
+ return num_bits;
+}
+
+static int rvpmu_deleg_find_ctrs(void)
+{
+ int i, num_hw_ctr = 0;
+ union sbi_pmu_ctr_info cinfo;
+ unsigned long scountinhibit_old = 0;
+
+ /* Do a WARL write/read to detect which hpmcounters have been delegated */
+ scountinhibit_old = csr_read(CSR_SCOUNTINHIBIT);
+ csr_write(CSR_SCOUNTINHIBIT, -1);
+ cmask = csr_read(CSR_SCOUNTINHIBIT);
+
+ csr_write(CSR_SCOUNTINHIBIT, scountinhibit_old);
+
+ for_each_set_bit(i, &cmask, RISCV_MAX_HW_COUNTERS) {
+ if (unlikely(i == 1))
+ continue; /* This should never happen as TM is read only */
+ cinfo.value = 0;
+ cinfo.type = SBI_PMU_CTR_TYPE_HW;
+ /*
+ * If counter delegation is enabled, the csr stored to the cinfo will
+ * be a virtual counter that the delegation attempts to read.
+ */
+ cinfo.csr = CSR_CYCLE + i;
+ if (i == 0 || i == 2)
+ cinfo.width = 63;
+ else
+ cinfo.width = get_deleg_hw_ctr_width(i);
+
+ num_hw_ctr++;
+ pmu_ctr_list[i].value = cinfo.value;
+ }
+
+ return num_hw_ctr;
+}
+
+static int get_deleg_fixed_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+ return -EINVAL;
+}
+
+static int get_deleg_next_hpm_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+ unsigned long hw_ctr_mask = 0;
+
+ /*
+ * TODO: Assume that every hpmcounter can monitor every event for now.
+ * The event to counter mapping should come from the json file.
+ * The mapping should also tell if sampling is supported or not.
+ */
+
+ /* Select only hpmcounters */
+ hw_ctr_mask = cmask & (~0x7);
+ hw_ctr_mask &= ~(cpuc->used_hw_ctrs[0]);
+ return __ffs(hw_ctr_mask);
+}
+
+static void update_deleg_hpmevent(int counter_idx, uint64_t event_value, uint64_t filter_bits)
+{
+ u64 hpmevent_value = 0;
+
+ /* OF bit should be set during start if sampling is requested */
+ hpmevent_value = (event_value & ~HPMEVENT_MASK) | filter_bits | HPMEVENT_OF;
+#if defined(CONFIG_32BIT)
+ csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_value & 0xFFFFFFFF);
+ if (riscv_isa_extension_available(NULL, SSCOFPMF))
+ csr_ind_write(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx,
+ hpmevent_value >> BITS_PER_LONG);
+#else
+ csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_value);
+#endif
+}
+
+static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
+ struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+ unsigned long hw_ctr_max_id;
+ u64 priv_filter;
+ int idx;
+
+ /*
+ * TODO: We should not rely on SBI Perf encoding to check if the event
+ * is a fixed one or not.
+ */
+ if (!is_sampling_event(event)) {
+ idx = get_deleg_fixed_hw_idx(cpuc, event);
+ if (idx == 0 || idx == 2) {
+ /* Priv mode filter bits are only available if smcntrpmf is present */
+ if (riscv_isa_extension_available(NULL, SMCNTRPMF))
+ goto found_idx;
+ else
+ goto skip_update;
+ }
+ }
+
+ hw_ctr_max_id = __fls(cmask);
+ idx = get_deleg_next_hpm_hw_idx(cpuc, event);
+ if (idx < 3 || idx > hw_ctr_max_id)
+ goto out_err;
+found_idx:
+ priv_filter = get_deleg_priv_filter_bits(event);
+ update_deleg_hpmevent(idx, hwc->config, priv_filter);
+skip_update:
+ if (!test_and_set_bit(idx, cpuc->used_hw_ctrs))
+ return idx;
+out_err:
+ return -ENOENT;
+}
+
static void rvpmu_ctr_start(struct perf_event *event, u64 ival)
{
- rvpmu_sbi_ctr_start(event, ival);
- /* TODO: Counter delegation implementation */
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event))
+ rvpmu_deleg_ctr_start(event, ival);
+ else
+ rvpmu_sbi_ctr_start(event, ival);
+
+ if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
+ (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
+ rvpmu_set_scounteren((void *)event);
}
static void rvpmu_ctr_stop(struct perf_event *event, unsigned long flag)
{
- rvpmu_sbi_ctr_stop(event, flag);
- /* TODO: Counter delegation implementation */
+ struct hw_perf_event *hwc = &event->hw;
+
+ if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
+ (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
+ rvpmu_reset_scounteren((void *)event);
+
+ if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event)) {
+ /*
+ * The counter is already stopped. No need to stop again. Counter
+ * mapping will be reset in clear_idx function.
+ */
+ if (flag != RISCV_PMU_STOP_FLAG_RESET)
+ rvpmu_deleg_ctr_stop_mask(BIT(hwc->idx));
+ else
+ update_deleg_hpmevent(hwc->idx, 0, 0);
+ } else {
+ rvpmu_sbi_ctr_stop(event, flag);
+ }
}
-static int rvpmu_find_ctrs(void)
+static u32 rvpmu_find_ctrs(void)
{
u32 num_sbi_counters = 0, num_deleg_counters = 0;
u32 num_hw_ctr = 0, num_fw_ctr = 0, num_ctr = 0;
@@ -1179,20 +1524,18 @@ static int rvpmu_find_ctrs(void)
static int rvpmu_event_map(struct perf_event *event, u64 *econfig)
{
- return rvpmu_sbi_event_map(event, econfig);
- /* TODO: Counter delegation implementation */
+ if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event))
+ return rvpmu_cdeleg_event_map(event, econfig);
+ else
+ return rvpmu_sbi_event_map(event, econfig);
}
static int rvpmu_ctr_get_idx(struct perf_event *event)
{
- return rvpmu_sbi_ctr_get_idx(event);
- /* TODO: Counter delegation implementation */
-}
-
-static u64 rvpmu_ctr_read(struct perf_event *event)
-{
- return rvpmu_sbi_ctr_read(event);
- /* TODO: Counter delegation implementation */
+ if (riscv_pmu_cdeleg_available() && !pmu_sbi_is_fw_event(event))
+ return rvpmu_deleg_ctr_get_idx(event);
+ else
+ return rvpmu_sbi_ctr_get_idx(event);
}
static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
@@ -1210,7 +1553,16 @@ static int rvpmu_starting_cpu(unsigned int cpu, struct hlist_node *node)
csr_write(CSR_SCOUNTEREN, 0x2);
/* Stop all the counters so that they can be enabled from perf */
- rvpmu_sbi_stop_all(pmu);
+ if (riscv_pmu_cdeleg_available()) {
+ rvpmu_deleg_ctr_stop_mask(cmask);
+ if (riscv_pmu_sbi_available()) {
+ /* Stop the firmware counters as well */
+ sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP, 0, firmware_cmask,
+ 0, 0, 0, 0);
+ }
+ } else {
+ rvpmu_sbi_stop_all(pmu);
+ }
if (riscv_pmu_use_irq) {
cpu_hw_evt->irq = riscv_pmu_irq;
@@ -1509,8 +1861,11 @@ static int rvpmu_device_probe(struct platform_device *pdev)
pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE;
}
- pmu->pmu.attr_groups = riscv_pmu_attr_groups;
pmu->pmu.parent = &pdev->dev;
+ if (riscv_pmu_cdeleg_available_boot())
+ pmu->pmu.attr_groups = riscv_cdeleg_pmu_attr_groups;
+ else
+ pmu->pmu.attr_groups = riscv_sbi_pmu_attr_groups;
pmu->cmask = cmask;
pmu->ctr_start = rvpmu_ctr_start;
pmu->ctr_stop = rvpmu_ctr_stop;
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index a3e1fdd5084a..9e2758c32e8b 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -20,6 +20,7 @@
*/
#define RISCV_MAX_COUNTERS 64
+#define RISCV_MAX_HW_COUNTERS 32
#define RISCV_OP_UNSUPP (-EOPNOTSUPP)
#define RISCV_PMU_PDEV_NAME "riscv-pmu"
#define RISCV_PMU_LEGACY_PDEV_NAME "riscv-pmu-legacy"
@@ -28,6 +29,8 @@
#define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1
+#define RISCV_PMU_DELEG_RAW_EVENT_MASK GENMASK_ULL(55, 0)
+
#define HW_OP_UNSUPPORTED 0xFFFF
#define CACHE_OP_UNSUPPORTED 0xFFFF
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
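The scountinhibit handling in the patch above (WARL probe, start mask, stop mask) can be modelled in user space. The sketch below is only an illustration under stated assumptions: the CSR is simulated by a struct field, writes are filtered through a hypothetical "delegated" mask to imitate WARL behaviour, and all ex_* names are invented here rather than taken from the driver.

```c
#include <stdint.h>

/* Simulated hart state; the driver uses real CSR accesses instead. */
struct ex_hart {
	uint32_t delegated;     /* bits M-mode made writable from S-mode */
	uint32_t scountinhibit; /* 1 = counter is inhibited (stopped) */
};

/* WARL-style write: bits outside the delegated set are ignored. */
static void ex_csr_write(struct ex_hart *h, uint32_t val)
{
	h->scountinhibit = val & h->delegated;
}

static uint32_t ex_csr_read(const struct ex_hart *h)
{
	return h->scountinhibit;
}

/* Probe delegated counters: write all ones, read back, restore. */
static uint32_t ex_probe_delegated(struct ex_hart *h)
{
	uint32_t old = ex_csr_read(h);
	uint32_t mask;

	ex_csr_write(h, UINT32_MAX);
	mask = ex_csr_read(h);
	ex_csr_write(h, old);
	return mask;
}

/* Start counters: clear their inhibit bits with a read-modify-write. */
static void ex_start_mask(struct ex_hart *h, uint32_t mask)
{
	ex_csr_write(h, ex_csr_read(h) & ~mask);
}

/* Stop counters: set their inhibit bits with a read-modify-write. */
static void ex_stop_mask(struct ex_hart *h, uint32_t mask)
{
	ex_csr_write(h, ex_csr_read(h) | mask);
}
```

Because starting and stopping are plain CSR read-modify-write sequences, no SBI call or trap to M-mode is involved on this path.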
* [PATCH v5 15/21] RISC-V: perf: Skip PMU SBI extension when not implemented
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (13 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 14/21] RISC-V: perf: Implement supervisor counter delegation support Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 16/21] RISC-V: perf: Use config2/vendor table for event to counter mapping Atish Patra
` (5 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra,
Charlie Jenkins
From: Charlie Jenkins <charlie@rivosinc.com>
When the PMU SBI extension is not implemented, sbi_v2_available should
not be set to true. The SBI implementation for counter config matching
and firmware counter read should also be skipped when the SBI extension
is not implemented.
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
drivers/perf/riscv_pmu_dev.c | 38 +++++++++++++++++++++++---------------
1 file changed, 23 insertions(+), 15 deletions(-)
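To make the intended gating explicit, here is a small user-space sketch of the detection logic this patch moves to: both flags depend on the PMU extension probe, and only then on the spec version. The ex_* names are hypothetical, and the version encoding (major in bits 30:24, minor in bits 23:0) is assumed to mirror the kernel's sbi_mk_version().

```c
#include <stdbool.h>

/* Assumed to mirror sbi_mk_version(): major in bits 30:24, minor below. */
static unsigned long ex_mk_version(unsigned long major, unsigned long minor)
{
	return ((major & 0x7f) << 24) | (minor & 0xffffff);
}

struct ex_sbi_caps {
	bool pmu_available; /* models riscv_pmu_sbi_available */
	bool v2_available;  /* models sbi_v2_available */
};

/* Neither flag may be set unless the PMU extension itself is present. */
static struct ex_sbi_caps ex_detect(unsigned long spec_version,
				    bool pmu_ext_present)
{
	struct ex_sbi_caps caps = { false, false };

	if (pmu_ext_present) {
		if (spec_version >= ex_mk_version(0, 3))
			caps.pmu_available = true;
		if (spec_version >= ex_mk_version(2, 0))
			caps.v2_available = true;
	}
	return caps;
}
```

With the old ordering, a platform reporting SBI v2.0 without the PMU extension would still have set sbi_v2_available; in this model both flags stay false in that case.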
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index 7c4a1ef15866..d1cc8310423f 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -417,18 +417,22 @@ static void rvpmu_sbi_check_event(struct sbi_pmu_event_data *edata)
}
}
-static void rvpmu_sbi_check_std_events(struct work_struct *work)
+static void rvpmu_check_std_events(struct work_struct *work)
{
- for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++)
- rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]);
-
- for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++)
- for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++)
- for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++)
- rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]);
+ if (riscv_pmu_sbi_available()) {
+ for (int i = 0; i < ARRAY_SIZE(pmu_hw_event_sbi_map); i++)
+ rvpmu_sbi_check_event(&pmu_hw_event_sbi_map[i]);
+
+ for (int i = 0; i < ARRAY_SIZE(pmu_cache_event_sbi_map); i++)
+ for (int j = 0; j < ARRAY_SIZE(pmu_cache_event_sbi_map[i]); j++)
+ for (int k = 0; k < ARRAY_SIZE(pmu_cache_event_sbi_map[i][j]); k++)
+ rvpmu_sbi_check_event(&pmu_cache_event_sbi_map[i][j][k]);
+ } else {
+ DO_ONCE_LITE_IF(1, pr_err, "Boot time config matching not required for smcdeleg\n");
+ }
}
-static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events);
+static DECLARE_WORK(check_std_events_work, rvpmu_check_std_events);
static ssize_t rvpmu_format_show(struct device *dev,
struct device_attribute *attr, char *buf)
@@ -556,6 +560,9 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
cflags = rvpmu_sbi_get_filter_flags(event);
+ if (!riscv_pmu_sbi_available())
+ return -ENOENT;
+
/*
* In legacy mode, we have to force the fixed counters for those events
* but not in the user access mode as we want to use the other counters
@@ -878,7 +885,7 @@ static u64 rvpmu_ctr_read(struct perf_event *event)
return val;
}
- if (pmu_sbi_is_fw_event(event)) {
+ if (pmu_sbi_is_fw_event(event) && riscv_pmu_sbi_available()) {
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_FW_READ,
hwc->idx, 0, 0, 0, 0, 0);
if (ret.error)
@@ -1945,12 +1952,13 @@ static int __init rvpmu_devinit(void)
int ret;
struct platform_device *pdev;
- if (sbi_spec_version >= sbi_mk_version(0, 3) &&
- sbi_probe_extension(SBI_EXT_PMU))
- static_branch_enable(&riscv_pmu_sbi_available);
+ if (sbi_probe_extension(SBI_EXT_PMU)) {
+ if (sbi_spec_version >= sbi_mk_version(0, 3))
+ static_branch_enable(&riscv_pmu_sbi_available);
+ if (sbi_spec_version >= sbi_mk_version(2, 0))
+ sbi_v2_available = true;
+ }
- if (sbi_spec_version >= sbi_mk_version(2, 0))
- sbi_v2_available = true;
/*
* We need all three extensions to be present to access the counters
* in S-mode via Supervisor Counter delegation.
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v5 16/21] RISC-V: perf: Use config2/vendor table for event to counter mapping
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (14 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 15/21] RISC-V: perf: Skip PMU SBI extension when not implemented Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 17/21] RISC-V: perf: Add legacy event encodings via sysfs Atish Patra
` (4 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
The counter restriction specified in the json file is passed to
the drivers via the config2 parameter in perf attributes. This allows
any platform vendor to define their custom mapping between events and
hpmcounters without any rules defined in the ISA.
For legacy events, the platform vendor may define the mapping in
the driver in the vendor event table.
The cycle and instruction counters are fixed (0 and 2
respectively) by the ISA and map to the legacy events. The platform
vendor must specify this in the driver if they are intended to be used
while profiling. Otherwise, the vendor can just specify the alternate
hpmcounters that may monitor and/or sample the cycle/instruction counts.
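As a rough illustration of the selection logic described above, the following Python model mirrors what the driver does with the config2 mask. It is only a sketch: the two constants copy the values added to riscv_pmu.h in this patch, but the function names and the explicit "no counter" result are assumptions made for the example, not driver code.

```python
# Illustrative model only; the helper names are hypothetical.
CYCLE_FIXED_CTR_MASK = 0x01        # value of RISCV_PMU_CYCLE_FIXED_CTR_MASK
INSTRUCTION_FIXED_CTR_MASK = 0x04  # value of RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK


def fixed_counter_index(config2):
    """Return 0 (cycle), 2 (instret), or None if no fixed counter is allowed."""
    if config2 & CYCLE_FIXED_CTR_MASK:
        return 0
    if config2 & INSTRUCTION_FIXED_CTR_MASK:
        return 2
    return None


def next_hpm_counter_index(cmask, used, event_mask):
    """Pick the lowest free programmable counter that the event may use."""
    # Drop counters 0-2 (cycle, time, instret) and counters already in use.
    candidates = cmask & ~0x7 & ~used
    # A zero event mask means "no restriction", as in the driver.
    if event_mask:
        candidates &= event_mask
    if candidates == 0:
        return None  # this sketch reports "no counter" explicitly
    # Index of the lowest set bit.
    return (candidates & -candidates).bit_length() - 1
```

For example, with all 32 counters implemented, counter 3 busy, and an event restricted to counters 3-5 (mask 0x38), the model picks counter 4.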
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
drivers/perf/riscv_pmu_dev.c | 79 ++++++++++++++++++++++++++++++++++--------
include/linux/perf/riscv_pmu.h | 2 ++
2 files changed, 67 insertions(+), 14 deletions(-)
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index d1cc8310423f..92ff42aca44b 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -76,6 +76,7 @@ static ssize_t __maybe_unused rvpmu_format_show(struct device *dev, struct devic
RVPMU_ATTR_ENTRY(_name, rvpmu_format_show, (char *)_config)
PMU_FORMAT_ATTR(firmware, "config:62-63");
+PMU_FORMAT_ATTR(counterid_mask, "config2:0-31");
static bool sbi_v2_available;
static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available);
@@ -119,6 +120,7 @@ static const struct attribute_group *riscv_sbi_pmu_attr_groups[] = {
static struct attribute *riscv_cdeleg_pmu_formats_attr[] = {
RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_CDELEG_PMU_FORMAT_ATTR),
&format_attr_firmware.attr,
+ &format_attr_counterid_mask.attr,
NULL,
};
@@ -1381,24 +1383,77 @@ static int rvpmu_deleg_find_ctrs(void)
return num_hw_ctr;
}
+/*
+ * The json file must correctly specify counter 0 or counter 2 is available
+ * in the counter lists for cycle/instret events. Otherwise, the drivers have
+ * no way to figure out if a fixed counter must be used and pick a programmable
+ * counter if available.
+ */
static int get_deleg_fixed_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
{
- return -EINVAL;
+ struct hw_perf_event *hwc = &event->hw;
+ bool guest_events = event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS;
+
+ if (guest_events) {
+ if (hwc->event_base == SBI_PMU_HW_CPU_CYCLES)
+ return 0;
+ if (hwc->event_base == SBI_PMU_HW_INSTRUCTIONS)
+ return 2;
+ else
+ return -EINVAL;
+ }
+
+ if (!event->attr.config2)
+ return -EINVAL;
+
+ if (event->attr.config2 & RISCV_PMU_CYCLE_FIXED_CTR_MASK)
+ return 0; /* CY counter */
+ else if (event->attr.config2 & RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK)
+ return 2; /* IR counter */
+ else
+ return -EINVAL;
}
static int get_deleg_next_hpm_hw_idx(struct cpu_hw_events *cpuc, struct perf_event *event)
{
- unsigned long hw_ctr_mask = 0;
+ u32 hw_ctr_mask = 0, temp_mask = 0;
+ u32 type = event->attr.type;
+ u64 config = event->attr.config;
+ int ret;
- /*
- * TODO: Treat every hpmcounter can monitor every event for now.
- * The event to counter mapping should come from the json file.
- * The mapping should also tell if sampling is supported or not.
- */
+ /* Select only available hpmcounters */
+ hw_ctr_mask = cmask & (~0x7) & ~(cpuc->used_hw_ctrs[0]);
+
+ switch (type) {
+ case PERF_TYPE_HARDWARE:
+ temp_mask = current_pmu_hw_event_map[config].counter_mask;
+ break;
+ case PERF_TYPE_HW_CACHE:
+ ret = cdeleg_pmu_event_find_cache(config, NULL, &temp_mask);
+ if (ret)
+ return ret;
+ break;
+ case PERF_TYPE_RAW:
+ /*
+ * Mask off the counters that can't monitor this event (specified via json)
+ * The counter mask for this event is set in config2 via the property 'Counter'
+ * in the json file or manual configuration of config2. If the config2 is not set,
+ * it is assumed all the available hpmcounters can monitor this event.
+ * Note: This assumption may fail for virtualization use case where the hypervisor
+ * (e.g. KVM) virtualizes the counter. Any event to counter mapping provided by the
+ * guest is meaningless from a hypervisor perspective. Thus, the hypervisor doesn't
+ * set config2 when creating kernel counter and relies on the default host mapping.
+ */
+ if (event->attr.config2)
+ temp_mask = event->attr.config2;
+ break;
+ default:
+ break;
+ }
+
+ if (temp_mask)
+ hw_ctr_mask &= temp_mask;
- /* Select only hpmcounters */
- hw_ctr_mask = cmask & (~0x7);
- hw_ctr_mask &= ~(cpuc->used_hw_ctrs[0]);
return __ffs(hw_ctr_mask);
}
@@ -1427,10 +1482,6 @@ static int rvpmu_deleg_ctr_get_idx(struct perf_event *event)
u64 priv_filter;
int idx;
- /*
- * TODO: We should not rely on SBI Perf encoding to check if the event
- * is a fixed one or not.
- */
if (!is_sampling_event(event)) {
idx = get_deleg_fixed_hw_idx(cpuc, event);
if (idx == 0 || idx == 2) {
diff --git a/include/linux/perf/riscv_pmu.h b/include/linux/perf/riscv_pmu.h
index 9e2758c32e8b..e58f83811988 100644
--- a/include/linux/perf/riscv_pmu.h
+++ b/include/linux/perf/riscv_pmu.h
@@ -30,6 +30,8 @@
#define RISCV_PMU_CONFIG1_GUEST_EVENTS 0x1
#define RISCV_PMU_DELEG_RAW_EVENT_MASK GENMASK_ULL(55, 0)
+#define RISCV_PMU_CYCLE_FIXED_CTR_MASK 0x01
+#define RISCV_PMU_INSTRUCTION_FIXED_CTR_MASK 0x04
#define HW_OP_UNSUPPORTED 0xFFFF
#define CACHE_OP_UNSUPPORTED 0xFFFF
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v5 17/21] RISC-V: perf: Add legacy event encodings via sysfs
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (15 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 16/21] RISC-V: perf: Use config2/vendor table for event to counter mapping Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 18/21] RISC-V: perf: Add Qemu virt machine events Atish Patra
` (3 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
Define sysfs details for the legacy events so that any tool can
parse these to understand the minimum set of legacy events
supported by the platform. The sysfs entry will describe both event
encoding and corresponding counter map so that a perf event can be
programmed accordingly.
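To make the format concrete: the RVPMU_EVENT_CMASK_ATTR macro in this patch builds a string of the form "event=<config>,counterid_mask=<mask>" followed by a newline. A tool could split such a string roughly as sketched below. The function name is hypothetical, and the sketch assumes every term is a "key=value" pair with a numeric value.

```python
def parse_event_attr(text):
    """Parse 'event=0x01,counterid_mask=0xFFFFFFF8' into a dict of ints."""
    fields = {}
    for term in text.strip().split(","):
        key, _, value = term.partition("=")
        # Base 0 lets int() accept both decimal and 0x-prefixed values.
        fields[key] = int(value, 0)
    return fields
```

Feeding it the string generated for the QEMU "cycles" event later in this series yields an event code of 1 and a counter mask of 0xFFFFFFF8.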
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
drivers/perf/riscv_pmu_dev.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index 92ff42aca44b..8a079949e3a4 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -129,7 +129,20 @@ static struct attribute_group riscv_cdeleg_pmu_format_group = {
.attrs = riscv_cdeleg_pmu_formats_attr,
};
+#define RVPMU_EVENT_ATTR_RESOLVE(m) #m
+#define RVPMU_EVENT_CMASK_ATTR(_name, _var, config, mask) \
+ PMU_EVENT_ATTR_STRING(_name, rvpmu_event_attr_##_var, \
+ "event=" RVPMU_EVENT_ATTR_RESOLVE(config) \
+ ",counterid_mask=" RVPMU_EVENT_ATTR_RESOLVE(mask) "\n")
+
+#define RVPMU_EVENT_ATTR_PTR(name) (&rvpmu_event_attr_##name.attr.attr)
+
+static struct attribute_group riscv_cdeleg_pmu_event_group __ro_after_init = {
+ .name = "events",
+};
+
static const struct attribute_group *riscv_cdeleg_pmu_attr_groups[] = {
+ &riscv_cdeleg_pmu_event_group,
&riscv_cdeleg_pmu_format_group,
NULL,
};
@@ -369,11 +382,14 @@ struct riscv_vendor_pmu_events {
const struct riscv_pmu_event *hw_event_map;
const struct riscv_pmu_event (*cache_event_map)[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX];
+ struct attribute **attrs_events;
};
-#define RISCV_VENDOR_PMU_EVENTS(_vendorid, _archid, _implid, _hw_event_map, _cache_event_map) \
+#define RISCV_VENDOR_PMU_EVENTS(_vendorid, _archid, _implid, _hw_event_map, \
+ _cache_event_map, _attrs) \
{ .vendorid = _vendorid, .archid = _archid, .implid = _implid, \
- .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map },
+ .hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map, \
+ .attrs_events = _attrs },
static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = {
};
@@ -395,6 +411,8 @@ static void rvpmu_vendor_register_events(void)
pmu_vendor_events_table[i].archid == arch_id) {
current_pmu_hw_event_map = pmu_vendor_events_table[i].hw_event_map;
current_pmu_cache_event_map = pmu_vendor_events_table[i].cache_event_map;
+ riscv_cdeleg_pmu_event_group.attrs =
+ pmu_vendor_events_table[i].attrs_events;
break;
}
}
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v5 18/21] RISC-V: perf: Add Qemu virt machine events
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (16 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 17/21] RISC-V: perf: Add legacy event encodings via sysfs Atish Patra
@ 2025-03-27 19:35 ` Atish Patra
2025-03-27 19:36 ` [PATCH v5 19/21] tools/perf: Support event code for arch standard events Atish Patra
` (2 subsequent siblings)
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:35 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
The QEMU virt machine supports a very minimal set of legacy perf events.
Add them to the vendor table so that users can use them when
counter delegation is enabled.
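Every event in this table uses the counter mask 0xFFFFFFF8. A short sketch (the helper name is hypothetical) shows which counter indices such a mask selects, assuming bit N of the mask stands for counter N as in the counterid_mask format:

```python
def counters_in_mask(mask):
    """List the counter indices whose bits are set in a 32-bit mask."""
    return [i for i in range(32) if mask & (1 << i)]
```

For 0xFFFFFFF8 this gives counters 3 through 31, that is, every programmable hpmcounter and none of the three low indices (cycle, time, instret).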
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
arch/riscv/include/asm/vendorid_list.h | 4 ++++
drivers/perf/riscv_pmu_dev.c | 36 ++++++++++++++++++++++++++++++++++
2 files changed, 40 insertions(+)
diff --git a/arch/riscv/include/asm/vendorid_list.h b/arch/riscv/include/asm/vendorid_list.h
index a5150cdf34d8..0eefc844923e 100644
--- a/arch/riscv/include/asm/vendorid_list.h
+++ b/arch/riscv/include/asm/vendorid_list.h
@@ -10,4 +10,8 @@
#define SIFIVE_VENDOR_ID 0x489
#define THEAD_VENDOR_ID 0x5b7
+#define QEMU_VIRT_VENDOR_ID 0x000
+#define QEMU_VIRT_IMPL_ID 0x000
+#define QEMU_VIRT_ARCH_ID 0x000
+
#endif
diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
index 8a079949e3a4..cd2ac4cf34f1 100644
--- a/drivers/perf/riscv_pmu_dev.c
+++ b/drivers/perf/riscv_pmu_dev.c
@@ -26,6 +26,7 @@
#include <asm/sbi.h>
#include <asm/cpufeature.h>
#include <asm/vendor_extensions.h>
+#include <asm/vendorid_list.h>
#include <asm/vendor_extensions/andes.h>
#include <asm/hwcap.h>
#include <asm/csr_ind.h>
@@ -391,7 +392,42 @@ struct riscv_vendor_pmu_events {
.hw_event_map = _hw_event_map, .cache_event_map = _cache_event_map, \
.attrs_events = _attrs },
+/* QEMU virt PMU events */
+static const struct riscv_pmu_event qemu_virt_hw_event_map[PERF_COUNT_HW_MAX] = {
+ PERF_MAP_ALL_UNSUPPORTED,
+ [PERF_COUNT_HW_CPU_CYCLES] = {0x01, 0xFFFFFFF8},
+ [PERF_COUNT_HW_INSTRUCTIONS] = {0x02, 0xFFFFFFF8}
+};
+
+static const struct riscv_pmu_event qemu_virt_cache_event_map[PERF_COUNT_HW_CACHE_MAX]
+ [PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+ PERF_CACHE_MAP_ALL_UNSUPPORTED,
+ [C(DTLB)][C(OP_READ)][C(RESULT_MISS)] = {0x10019, 0xFFFFFFF8},
+ [C(DTLB)][C(OP_WRITE)][C(RESULT_MISS)] = {0x1001B, 0xFFFFFFF8},
+
+ [C(ITLB)][C(OP_READ)][C(RESULT_MISS)] = {0x10021, 0xFFFFFFF8},
+};
+
+RVPMU_EVENT_CMASK_ATTR(cycles, cycles, 0x01, 0xFFFFFFF8);
+RVPMU_EVENT_CMASK_ATTR(instructions, instructions, 0x02, 0xFFFFFFF8);
+RVPMU_EVENT_CMASK_ATTR(dTLB-load-misses, dTLB_load_miss, 0x10019, 0xFFFFFFF8);
+RVPMU_EVENT_CMASK_ATTR(dTLB-store-misses, dTLB_store_miss, 0x1001B, 0xFFFFFFF8);
+RVPMU_EVENT_CMASK_ATTR(iTLB-load-misses, iTLB_load_miss, 0x10021, 0xFFFFFFF8);
+
+static struct attribute *qemu_virt_event_group[] = {
+ RVPMU_EVENT_ATTR_PTR(cycles),
+ RVPMU_EVENT_ATTR_PTR(instructions),
+ RVPMU_EVENT_ATTR_PTR(dTLB_load_miss),
+ RVPMU_EVENT_ATTR_PTR(dTLB_store_miss),
+ RVPMU_EVENT_ATTR_PTR(iTLB_load_miss),
+ NULL,
+};
+
static struct riscv_vendor_pmu_events pmu_vendor_events_table[] = {
+ RISCV_VENDOR_PMU_EVENTS(QEMU_VIRT_VENDOR_ID, QEMU_VIRT_ARCH_ID, QEMU_VIRT_IMPL_ID,
+ qemu_virt_hw_event_map, qemu_virt_cache_event_map,
+ qemu_virt_event_group)
};
const struct riscv_pmu_event *current_pmu_hw_event_map;
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v5 19/21] tools/perf: Support event code for arch standard events
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (17 preceding siblings ...)
2025-03-27 19:35 ` [PATCH v5 18/21] RISC-V: perf: Add Qemu virt machine events Atish Patra
@ 2025-03-27 19:36 ` Atish Patra
2025-03-27 19:36 ` [PATCH v5 20/21] tools/perf: Pass the Counter constraint values in the pmu events Atish Patra
2025-03-27 19:36 ` [PATCH v5 21/21] Sync empty-pmu-events.c with autogenerated one Atish Patra
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:36 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
RISC-V relies on the event encoding from the json file. That includes
arch standard events. If an event code is present, the event has already
been updated with the correct encoding, so there is no need to update it
again; doing so would lose the event encoding.
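The precedence rule can be reduced to a few lines. The sketch below is a simplified stand-in for the logic in JsonEvent, not the actual jevents.py code: the function name, its parameters, and the example strings are assumptions chosen for illustration.

```python
def resolve_event(event, eventcode, arch_std_event):
    """Return the encoding to use for an entry that names an arch-std event."""
    # Fall back to the arch-std encoding only when the json entry carried
    # no event code of its own; otherwise keep what was already built.
    if not eventcode:
        return arch_std_event
    return event
```

With an event code present the entry's own encoding is kept; without one the arch standard encoding is used, as before this patch.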
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
tools/perf/pmu-events/arch/riscv/arch-standard.json | 10 ++++++++++
tools/perf/pmu-events/jevents.py | 4 +++-
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/tools/perf/pmu-events/arch/riscv/arch-standard.json b/tools/perf/pmu-events/arch/riscv/arch-standard.json
new file mode 100644
index 000000000000..96e21f088558
--- /dev/null
+++ b/tools/perf/pmu-events/arch/riscv/arch-standard.json
@@ -0,0 +1,10 @@
+[
+ {
+ "EventName": "cycles",
+ "BriefDescription": "cycle executed"
+ },
+ {
+ "EventName": "instructions",
+ "BriefDescription": "instruction retired"
+ }
+]
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index fa7c466a5ef3..fdb7ddf093d2 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -417,7 +417,9 @@ class JsonEvent:
self.long_desc += extra_desc
if arch_std:
if arch_std.lower() in _arch_std_events:
- event = _arch_std_events[arch_std.lower()].event
+ # No need to replace as eventcode would have updated the event before
+ if not eventcode:
+ event = _arch_std_events[arch_std.lower()].event
# Copy from the architecture standard event to self for undefined fields.
for attr, value in _arch_std_events[arch_std.lower()].__dict__.items():
if hasattr(self, attr) and not getattr(self, attr):
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v5 20/21] tools/perf: Pass the Counter constraint values in the pmu events
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (18 preceding siblings ...)
2025-03-27 19:36 ` [PATCH v5 19/21] tools/perf: Support event code for arch standard events Atish Patra
@ 2025-03-27 19:36 ` Atish Patra
2025-04-23 0:17 ` Atish Patra
2025-03-27 19:36 ` [PATCH v5 21/21] Sync empty-pmu-events.c with autogenerated one Atish Patra
20 siblings, 1 reply; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:36 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
RISC-V doesn't have any standard event to counter mapping discovery
mechanism in the ISA. The ISA defines 29 programmable counters and
platforms can choose to implement any number of them and map any
events to any counters. Thus, the perf tool needs to inform the driver
about the counter mapping of each event.
The current perf infrastructure only parses the 'Counter' constraints
in metrics. This patch extends that to pass in the pmu events so that
any driver can retrieve those values via perf attributes if defined
accordingly.
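The conversion from a 'Counter' list to the counterid_mask term can be sketched as a standalone snippet. This follows the approach of the patch but is not identical to it: it ORs the bits together instead of summing them, so a repeated counter id cannot corrupt the mask, and the append_counter_mask helper is a hypothetical wrapper added for the example.

```python
def counter_list_to_bitmask(counterlist):
    """Convert a comma-separated counter list such as '0,1' to a bitmask."""
    bitmask = 0
    for pos in (int(c) for c in counterlist.split(",")):
        bitmask |= 1 << pos
    return bitmask


def append_counter_mask(event, counterlist):
    """Append ',counterid_mask=<hex>' to an event string if a list is given."""
    if not counterlist:
        return event
    return f"{event},counterid_mask={counter_list_to_bitmask(counterlist):#x}"
```

A counter list of "0,1" becomes counterid_mask=0x3, which matches the strings that show up in the regenerated empty-pmu-events.c in the last patch of the series.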
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
tools/perf/pmu-events/jevents.py | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
index fdb7ddf093d2..f9f274678a32 100755
--- a/tools/perf/pmu-events/jevents.py
+++ b/tools/perf/pmu-events/jevents.py
@@ -274,6 +274,11 @@ class JsonEvent:
return fixed[name.lower()]
return event
+ def counter_list_to_bitmask(counterlist):
+ counter_ids = list(map(int, counterlist.split(',')))
+ bitmask = sum(1 << pos for pos in counter_ids)
+ return bitmask
+
def unit_to_pmu(unit: str) -> Optional[str]:
"""Convert a JSON Unit to Linux PMU name."""
if not unit or unit == "core":
@@ -427,6 +432,10 @@ class JsonEvent:
else:
raise argparse.ArgumentTypeError('Cannot find arch std event:', arch_std)
+ if self.counters['list']:
+ bitmask = counter_list_to_bitmask(self.counters['list'])
+ event += f',counterid_mask={bitmask:#x}'
+
self.event = real_event(self.name, event)
def __repr__(self) -> str:
--
2.43.0
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v5 21/21] Sync empty-pmu-events.c with autogenerated one
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
` (19 preceding siblings ...)
2025-03-27 19:36 ` [PATCH v5 20/21] tools/perf: Pass the Counter constraint values in the pmu events Atish Patra
@ 2025-03-27 19:36 ` Atish Patra
20 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-03-27 19:36 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Atish Patra
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
tools/perf/pmu-events/empty-pmu-events.c | 144 +++++++++++++++----------------
1 file changed, 72 insertions(+), 72 deletions(-)
diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c
index 3a7ec31576f5..22f0463dc522 100644
--- a/tools/perf/pmu-events/empty-pmu-events.c
+++ b/tools/perf/pmu-events/empty-pmu-events.c
@@ -36,42 +36,42 @@ static const char *const big_c_string =
/* offset=1127 */ "bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000"
/* offset=1187 */ "bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000"
/* offset=1247 */ "l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000"
-/* offset=1343 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000"
-/* offset=1446 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000"
-/* offset=1580 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000"
-/* offset=1699 */ "hisi_sccl,ddrc\000"
-/* offset=1714 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000"
-/* offset=1801 */ "uncore_cbox\000"
-/* offset=1813 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000"
-/* offset=2048 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000"
-/* offset=2114 */ "event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000"
-/* offset=2186 */ "hisi_sccl,l3c\000"
-/* offset=2200 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000"
-/* offset=2281 */ "uncore_imc_free_running\000"
-/* offset=2305 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000"
-/* offset=2401 */ "uncore_imc\000"
-/* offset=2412 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000"
-/* offset=2491 */ "uncore_sys_ddr_pmu\000"
-/* offset=2510 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000"
-/* offset=2584 */ "uncore_sys_ccn_pmu\000"
-/* offset=2603 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000"
-/* offset=2678 */ "uncore_sys_cmn_pmu\000"
-/* offset=2697 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000"
-/* offset=2838 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000"
-/* offset=2860 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000"
-/* offset=2923 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000"
-/* offset=3089 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
-/* offset=3153 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
-/* offset=3220 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000"
-/* offset=3291 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000"
-/* offset=3385 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000"
-/* offset=3519 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000"
-/* offset=3583 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000"
-/* offset=3651 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000"
-/* offset=3721 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000"
-/* offset=3743 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000"
-/* offset=3765 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000"
-/* offset=3785 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000"
+/* offset=1343 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80,counterid_mask=0x3\000\00000\000\0000,1\000"
+/* offset=1465 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20,counterid_mask=0x3\000\00000\000\0000,1\000"
+/* offset=1618 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000,counterid_mask=0x3\000\00000\000\0000,1\000"
+/* offset=1756 */ "hisi_sccl,ddrc\000"
+/* offset=1771 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000"
+/* offset=1858 */ "uncore_cbox\000"
+/* offset=1870 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81,counterid_mask=0x3\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000"
+/* offset=2124 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000"
+/* offset=2190 */ "event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000"
+/* offset=2262 */ "hisi_sccl,l3c\000"
+/* offset=2276 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000"
+/* offset=2357 */ "uncore_imc_free_running\000"
+/* offset=2381 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000"
+/* offset=2477 */ "uncore_imc\000"
+/* offset=2488 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000"
+/* offset=2567 */ "uncore_sys_ddr_pmu\000"
+/* offset=2586 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000"
+/* offset=2660 */ "uncore_sys_ccn_pmu\000"
+/* offset=2679 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000"
+/* offset=2754 */ "uncore_sys_cmn_pmu\000"
+/* offset=2773 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000"
+/* offset=2914 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000"
+/* offset=2936 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000"
+/* offset=2999 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000"
+/* offset=3165 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
+/* offset=3229 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
+/* offset=3296 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000"
+/* offset=3367 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000"
+/* offset=3461 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000"
+/* offset=3595 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000"
+/* offset=3659 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000"
+/* offset=3727 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000"
+/* offset=3797 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000"
+/* offset=3819 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000"
+/* offset=3841 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000"
+/* offset=3861 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000"
;
static const struct compact_pmu_event pmu_events__common_tool[] = {
@@ -101,27 +101,27 @@ const struct pmu_table_entry pmu_events__common[] = {
static const struct compact_pmu_event pmu_events__test_soc_cpu_default_core[] = {
{ 1127 }, /* bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000 */
{ 1187 }, /* bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000 */
-{ 1446 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000 */
-{ 1580 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000 */
+{ 1465 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20,counterid_mask=0x3\000\00000\000\0000,1\000 */
+{ 1618 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000,counterid_mask=0x3\000\00000\000\0000,1\000 */
{ 1247 }, /* l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000 */
-{ 1343 }, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000 */
+{ 1343 }, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80,counterid_mask=0x3\000\00000\000\0000,1\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_ddrc[] = {
-{ 1714 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000 */
+{ 1771 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_l3c[] = {
-{ 2200 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000 */
+{ 2276 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_cbox[] = {
-{ 2048 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000 */
-{ 2114 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000 */
-{ 1813 }, /* unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000 */
+{ 2124 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000 */
+{ 2190 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000 */
+{ 1870 }, /* unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81,counterid_mask=0x3\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_imc[] = {
-{ 2412 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000 */
+{ 2488 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_imc_free_running[] = {
-{ 2305 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000 */
+{ 2381 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000 */
};
@@ -134,46 +134,46 @@ const struct pmu_table_entry pmu_events__test_soc_cpu[] = {
{
.entries = pmu_events__test_soc_cpu_hisi_sccl_ddrc,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_ddrc),
- .pmu_name = { 1699 /* hisi_sccl,ddrc\000 */ },
+ .pmu_name = { 1756 /* hisi_sccl,ddrc\000 */ },
},
{
.entries = pmu_events__test_soc_cpu_hisi_sccl_l3c,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_l3c),
- .pmu_name = { 2186 /* hisi_sccl,l3c\000 */ },
+ .pmu_name = { 2262 /* hisi_sccl,l3c\000 */ },
},
{
.entries = pmu_events__test_soc_cpu_uncore_cbox,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_cbox),
- .pmu_name = { 1801 /* uncore_cbox\000 */ },
+ .pmu_name = { 1858 /* uncore_cbox\000 */ },
},
{
.entries = pmu_events__test_soc_cpu_uncore_imc,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc),
- .pmu_name = { 2401 /* uncore_imc\000 */ },
+ .pmu_name = { 2477 /* uncore_imc\000 */ },
},
{
.entries = pmu_events__test_soc_cpu_uncore_imc_free_running,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc_free_running),
- .pmu_name = { 2281 /* uncore_imc_free_running\000 */ },
+ .pmu_name = { 2357 /* uncore_imc_free_running\000 */ },
},
};
static const struct compact_pmu_event pmu_metrics__test_soc_cpu_default_core[] = {
-{ 2838 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000 */
-{ 3519 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */
-{ 3291 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */
-{ 3385 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */
-{ 3583 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
-{ 3651 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
-{ 2923 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */
-{ 2860 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */
-{ 3785 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */
-{ 3721 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */
-{ 3743 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */
-{ 3765 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */
-{ 3220 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */
-{ 3089 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
-{ 3153 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
+{ 2914 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000 */
+{ 3595 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */
+{ 3367 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */
+{ 3461 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */
+{ 3659 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
+{ 3727 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
+{ 2999 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */
+{ 2936 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */
+{ 3861 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */
+{ 3797 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */
+{ 3819 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */
+{ 3841 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */
+{ 3296 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */
+{ 3165 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
+{ 3229 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
};
@@ -186,13 +186,13 @@ const struct pmu_table_entry pmu_metrics__test_soc_cpu[] = {
};
static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ccn_pmu[] = {
-{ 2603 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000 */
+{ 2679 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_cmn_pmu[] = {
-{ 2697 }, /* sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000 */
+{ 2773 }, /* sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000 */
};
static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ddr_pmu[] = {
-{ 2510 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000 */
+{ 2586 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000 */
};
@@ -200,17 +200,17 @@ const struct pmu_table_entry pmu_events__test_soc_sys[] = {
{
.entries = pmu_events__test_soc_sys_uncore_sys_ccn_pmu,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ccn_pmu),
- .pmu_name = { 2584 /* uncore_sys_ccn_pmu\000 */ },
+ .pmu_name = { 2660 /* uncore_sys_ccn_pmu\000 */ },
},
{
.entries = pmu_events__test_soc_sys_uncore_sys_cmn_pmu,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_cmn_pmu),
- .pmu_name = { 2678 /* uncore_sys_cmn_pmu\000 */ },
+ .pmu_name = { 2754 /* uncore_sys_cmn_pmu\000 */ },
},
{
.entries = pmu_events__test_soc_sys_uncore_sys_ddr_pmu,
.num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ddr_pmu),
- .pmu_name = { 2491 /* uncore_sys_ddr_pmu\000 */ },
+ .pmu_name = { 2567 /* uncore_sys_ddr_pmu\000 */ },
},
};
--
2.43.0
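Note on the hunks above: the test events that carry the counter list
"0,1" as their last field also gain counterid_mask=0x3 in their event
string. The sketch below shows one way such a comma-separated counter
list maps to a bitmask. It is only an illustration: counters_to_mask()
is a hypothetical helper name, and how the series itself derives the
mask is not shown in this part of the thread.

```python
def counters_to_mask(counters: str) -> int:
    """Turn a comma-separated counter list such as "0,1" into a bitmask.

    Hypothetical helper for illustration; an empty list yields 0.
    """
    mask = 0
    for token in counters.split(","):
        token = token.strip()
        if token:                      # skip empty tokens from "" or "0,,1"
            mask |= 1 << int(token)    # set the bit for this counter index
    return mask


print(hex(counters_to_mask("0,1")))    # 0x3, matching counterid_mask=0x3
```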
* Re: [PATCH v5 10/21] dt-bindings: riscv: add Counter delegation ISA extensions description
2025-03-27 19:35 ` [PATCH v5 10/21] dt-bindings: riscv: add Counter delegation ISA extensions description Atish Patra
@ 2025-03-31 15:38 ` Conor Dooley
0 siblings, 0 replies; 28+ messages in thread
From: Conor Dooley @ 2025-03-31 15:38 UTC (permalink / raw)
To: Atish Patra
Cc: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang, linux-riscv, linux-kernel, devicetree,
kvm, kvm-riscv, linux-arm-kernel, linux-perf-users
On Thu, Mar 27, 2025 at 12:35:51PM -0700, Atish Patra wrote:
> Add description for the Smcdeleg/Ssccfg extension.
>
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
* Re: [PATCH v5 11/21] RISC-V: perf: Restructure the SBI PMU code
2025-03-27 19:35 ` [PATCH v5 11/21] RISC-V: perf: Restructure the SBI PMU code Atish Patra
@ 2025-04-04 13:49 ` Will Deacon
2025-04-23 0:02 ` Atish Patra
0 siblings, 1 reply; 28+ messages in thread
From: Will Deacon @ 2025-04-04 13:49 UTC (permalink / raw)
To: Atish Patra
Cc: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang, linux-riscv, linux-kernel,
Conor Dooley, devicetree, kvm, kvm-riscv, linux-arm-kernel,
linux-perf-users, Clément Léger
On Thu, Mar 27, 2025 at 12:35:52PM -0700, Atish Patra wrote:
> With Ssccfg/Smcdeleg, we no longer need SBI PMU extension to program/
> access hpmcounter/events. However, we do need it for firmware counters.
> Rename the driver and its related code to represent generic name
> that will handle both sbi and ISA mechanism for hpmcounter related
> operations. Take this opportunity to update the Kconfig names to
> match the new driver name closely.
>
> No functional change intended.
>
> Reviewed-by: Clément Léger <cleger@rivosinc.com>
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
> MAINTAINERS | 4 +-
> arch/riscv/include/asm/kvm_vcpu_pmu.h | 4 +-
> arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 +-
> arch/riscv/kvm/Makefile | 4 +-
> arch/riscv/kvm/vcpu_sbi.c | 2 +-
> drivers/perf/Kconfig | 16 +-
> drivers/perf/Makefile | 4 +-
> drivers/perf/{riscv_pmu.c => riscv_pmu_common.c} | 0
> drivers/perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} | 214 +++++++++++++---------
I'm still against this renaming churn. It sucks for backporting and
you're also changing the name of the driver, which could be used by
scripts in userspace (e.g. module listings, udev rules, cmdline options)
Will
* Re: [PATCH v5 11/21] RISC-V: perf: Restructure the SBI PMU code
2025-04-04 13:49 ` Will Deacon
@ 2025-04-23 0:02 ` Atish Patra
0 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-04-23 0:02 UTC (permalink / raw)
To: Will Deacon
Cc: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang, linux-riscv, linux-kernel,
Conor Dooley, devicetree, kvm, kvm-riscv, linux-arm-kernel,
linux-perf-users, Clément Léger
On 4/4/25 6:49 AM, Will Deacon wrote:
> On Thu, Mar 27, 2025 at 12:35:52PM -0700, Atish Patra wrote:
>> With Ssccfg/Smcdeleg, we no longer need SBI PMU extension to program/
>> access hpmcounter/events. However, we do need it for firmware counters.
>> Rename the driver and its related code to represent generic name
>> that will handle both sbi and ISA mechanism for hpmcounter related
>> operations. Take this opportunity to update the Kconfig names to
>> match the new driver name closely.
>>
>> No functional change intended.
>>
>> Reviewed-by: Clément Léger <cleger@rivosinc.com>
>> Signed-off-by: Atish Patra <atishp@rivosinc.com>
>> ---
>> MAINTAINERS | 4 +-
>> arch/riscv/include/asm/kvm_vcpu_pmu.h | 4 +-
>> arch/riscv/include/asm/kvm_vcpu_sbi.h | 2 +-
>> arch/riscv/kvm/Makefile | 4 +-
>> arch/riscv/kvm/vcpu_sbi.c | 2 +-
>> drivers/perf/Kconfig | 16 +-
>> drivers/perf/Makefile | 4 +-
>> drivers/perf/{riscv_pmu.c => riscv_pmu_common.c} | 0
>> drivers/perf/{riscv_pmu_sbi.c => riscv_pmu_dev.c} | 214 +++++++++++++---------
>
> I'm still against this renaming churn. It sucks for backporting and
> you're also changing the name of the driver, which could be used by
> scripts in userspace (e.g. module listings, udev rules, cmdline options)
>
OK. I will revert the file and driver name changes. I hope the config
renaming and the code refactoring that separates counter delegation
(the hardware method) from SBI calls (the firmware-assisted method)
are okay?
> Will
* Re: [PATCH v5 01/21] perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping
2025-03-27 19:35 ` [PATCH v5 01/21] perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping Atish Patra
@ 2025-04-23 0:13 ` Atish Patra
0 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-04-23 0:13 UTC (permalink / raw)
To: Namhyung Kim, Ian Rogers, weilin.wang
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Paul Walmsley,
Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Anup Patel, Atish Patra, Mark Rutland, Will Deacon,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Alexander Shishkin, Jiri Olsa, Adrian Hunter
On 3/27/25 12:35 PM, Atish Patra wrote:
> From: Weilin Wang <weilin.wang@intel.com>
>
> These functions are added to parse event counter restrictions and counter
> availability info from json files so that the metric grouping method could
> do grouping based on the counter restriction of events and the counters
> that are available on the system.
>
Hi Ian/Weilin,
Any thoughts on this patch? We would like to understand whether this is
an acceptable direction for the perf tool. I can rework the patch to
isolate only the counter restriction part if required.
Please ignore the diff in pmu-events/empty-pmu-events.c, as it may
change with the rebase.
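For context on why that diff is so large: every event record in
big_c_string gains one more NUL-terminated field at its end, so each
later offset shifts by the number of records before it. Below is a
minimal sketch of that effect, assuming the field order suggested by
the generated comments (name, topic, desc, event, compat, two flag
characters, unit, long_desc, then the new counter field). The helper
names encode_event() and offsets() are made up for illustration and
are not the jevents.py implementation.

```python
NUL = "\0"


def encode_event(name, topic, desc, event, compat="", flags="00",
                 unit="", long_desc="", counters=None):
    """Pack one event as NUL-terminated fields; flags are two raw chars."""
    head = [name, topic, desc, event, compat]
    tail = [unit, long_desc]
    if counters is not None:           # trailing field assumed to be new
        tail.append(counters)
    return ("".join(f + NUL for f in head) + flags +
            "".join(f + NUL for f in tail))


def offsets(entries):
    """Return the start offset of each entry in the concatenated string."""
    out, pos = [], 0
    for entry in entries:
        out.append(pos)
        pos += len(entry)
    return out


pmu = "tool" + NUL
duration = dict(name="duration_time", topic="tool",
                desc="Wall clock interval time in nanoseconds",
                event="config=1")
user = dict(name="user_time", topic="tool",
            desc="User (non-kernel) time in nanoseconds",
            event="config=2")

old = offsets([pmu, encode_event(**duration), encode_event(**user)])
new = offsets([pmu, encode_event(**duration, counters=""),
               encode_event(**user, counters="")])
print(old)  # [0, 5, 78]
print(new)  # [0, 5, 79]
```

The 78 -> 79 shift for user_time matches the first changed offsets in
the quoted diff below; the shift keeps growing by one for each earlier
record, plus the length of any non-empty counter list such as "0,1".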
> Signed-off-by: Weilin Wang <weilin.wang@intel.com>
> ---
> tools/perf/pmu-events/empty-pmu-events.c | 299 ++++++++++++++++++++-----------
> tools/perf/pmu-events/jevents.py | 205 ++++++++++++++++++++-
> tools/perf/pmu-events/pmu-events.h | 32 +++-
> 3 files changed, 419 insertions(+), 117 deletions(-)
>
> diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c
> index 1c7a2cfa321f..3a7ec31576f5 100644
> --- a/tools/perf/pmu-events/empty-pmu-events.c
> +++ b/tools/perf/pmu-events/empty-pmu-events.c
> @@ -20,73 +20,73 @@ struct pmu_table_entry {
>
> static const char *const big_c_string =
> /* offset=0 */ "tool\000"
> -/* offset=5 */ "duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000"
> -/* offset=78 */ "user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000"
> -/* offset=145 */ "system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000"
> -/* offset=210 */ "has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000"
> -/* offset=283 */ "num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000"
> -/* offset=425 */ "num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000"
> -/* offset=525 */ "num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000"
> -/* offset=639 */ "num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000"
> -/* offset=712 */ "num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000"
> -/* offset=795 */ "slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000"
> -/* offset=902 */ "smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000"
> -/* offset=1006 */ "system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000"
> -/* offset=1102 */ "default_core\000"
> -/* offset=1115 */ "bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000"
> -/* offset=1174 */ "bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000"
> -/* offset=1233 */ "l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000"
> -/* offset=1328 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\000"
> -/* offset=1427 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\000"
> -/* offset=1557 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\000"
> -/* offset=1672 */ "hisi_sccl,ddrc\000"
> -/* offset=1687 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000"
> -/* offset=1773 */ "uncore_cbox\000"
> -/* offset=1785 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000"
> -/* offset=2016 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000"
> -/* offset=2081 */ "event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000"
> -/* offset=2152 */ "hisi_sccl,l3c\000"
> -/* offset=2166 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000"
> -/* offset=2246 */ "uncore_imc_free_running\000"
> -/* offset=2270 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000"
> -/* offset=2365 */ "uncore_imc\000"
> -/* offset=2376 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000"
> -/* offset=2454 */ "uncore_sys_ddr_pmu\000"
> -/* offset=2473 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000"
> -/* offset=2546 */ "uncore_sys_ccn_pmu\000"
> -/* offset=2565 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000"
> -/* offset=2639 */ "uncore_sys_cmn_pmu\000"
> -/* offset=2658 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000"
> -/* offset=2798 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000"
> -/* offset=2820 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000"
> -/* offset=2883 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000"
> -/* offset=3049 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
> -/* offset=3113 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
> -/* offset=3180 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000"
> -/* offset=3251 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000"
> -/* offset=3345 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000"
> -/* offset=3479 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000"
> -/* offset=3543 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000"
> -/* offset=3611 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000"
> -/* offset=3681 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000"
> -/* offset=3703 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000"
> -/* offset=3725 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000"
> -/* offset=3745 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000"
> +/* offset=5 */ "duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000\000"
> +/* offset=79 */ "user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000\000"
> +/* offset=147 */ "system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000\000"
> +/* offset=213 */ "has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000\000"
> +/* offset=287 */ "num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000\000"
> +/* offset=430 */ "num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000\000"
> +/* offset=531 */ "num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000\000"
> +/* offset=646 */ "num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000\000"
> +/* offset=720 */ "num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000\000"
> +/* offset=804 */ "slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000\000"
> +/* offset=912 */ "smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000\000"
> +/* offset=1017 */ "system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000\000"
> +/* offset=1114 */ "default_core\000"
> +/* offset=1127 */ "bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000"
> +/* offset=1187 */ "bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000"
> +/* offset=1247 */ "l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000"
> +/* offset=1343 */ "segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000"
> +/* offset=1446 */ "dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000"
> +/* offset=1580 */ "eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000"
> +/* offset=1699 */ "hisi_sccl,ddrc\000"
> +/* offset=1714 */ "uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000"
> +/* offset=1801 */ "uncore_cbox\000"
> +/* offset=1813 */ "unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000"
> +/* offset=2048 */ "event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000"
> +/* offset=2114 */ "event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000"
> +/* offset=2186 */ "hisi_sccl,l3c\000"
> +/* offset=2200 */ "uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000"
> +/* offset=2281 */ "uncore_imc_free_running\000"
> +/* offset=2305 */ "uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000"
> +/* offset=2401 */ "uncore_imc\000"
> +/* offset=2412 */ "uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000"
> +/* offset=2491 */ "uncore_sys_ddr_pmu\000"
> +/* offset=2510 */ "sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000"
> +/* offset=2584 */ "uncore_sys_ccn_pmu\000"
> +/* offset=2603 */ "sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000"
> +/* offset=2678 */ "uncore_sys_cmn_pmu\000"
> +/* offset=2697 */ "sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000"
> +/* offset=2838 */ "CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000"
> +/* offset=2860 */ "IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000"
> +/* offset=2923 */ "Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000"
> +/* offset=3089 */ "dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
> +/* offset=3153 */ "icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000"
> +/* offset=3220 */ "cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000"
> +/* offset=3291 */ "DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000"
> +/* offset=3385 */ "DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000"
> +/* offset=3519 */ "DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000"
> +/* offset=3583 */ "DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000"
> +/* offset=3651 */ "DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000"
> +/* offset=3721 */ "M1\000\000ipc + M2\000\000\000\000\000\000\000\00000"
> +/* offset=3743 */ "M2\000\000ipc + M1\000\000\000\000\000\000\000\00000"
> +/* offset=3765 */ "M3\000\0001 / M3\000\000\000\000\000\000\000\00000"
> +/* offset=3785 */ "L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000"
> ;
>
> static const struct compact_pmu_event pmu_events__common_tool[] = {
> -{ 5 }, /* duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000 */
> -{ 210 }, /* has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000 */
> -{ 283 }, /* num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000 */
> -{ 425 }, /* num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000 */
> -{ 525 }, /* num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000 */
> -{ 639 }, /* num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000 */
> -{ 712 }, /* num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000 */
> -{ 795 }, /* slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000 */
> -{ 902 }, /* smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000 */
> -{ 145 }, /* system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000 */
> -{ 1006 }, /* system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000 */
> -{ 78 }, /* user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000 */
> +{ 5 }, /* duration_time\000tool\000Wall clock interval time in nanoseconds\000config=1\000\00000\000\000\000 */
> +{ 213 }, /* has_pmem\000tool\0001 if persistent memory installed otherwise 0\000config=4\000\00000\000\000\000 */
> +{ 287 }, /* num_cores\000tool\000Number of cores. A core consists of 1 or more thread, with each thread being associated with a logical Linux CPU\000config=5\000\00000\000\000\000 */
> +{ 430 }, /* num_cpus\000tool\000Number of logical Linux CPUs. There may be multiple such CPUs on a core\000config=6\000\00000\000\000\000 */
> +{ 531 }, /* num_cpus_online\000tool\000Number of online logical Linux CPUs. There may be multiple such CPUs on a core\000config=7\000\00000\000\000\000 */
> +{ 646 }, /* num_dies\000tool\000Number of dies. Each die has 1 or more cores\000config=8\000\00000\000\000\000 */
> +{ 720 }, /* num_packages\000tool\000Number of packages. Each package has 1 or more die\000config=9\000\00000\000\000\000 */
> +{ 804 }, /* slots\000tool\000Number of functional units that in parallel can execute parts of an instruction\000config=0xa\000\00000\000\000\000 */
> +{ 912 }, /* smt_on\000tool\0001 if simultaneous multithreading (aka hyperthreading) is enable otherwise 0\000config=0xb\000\00000\000\000\000 */
> +{ 147 }, /* system_time\000tool\000System/kernel time in nanoseconds\000config=3\000\00000\000\000\000 */
> +{ 1017 }, /* system_tsc_freq\000tool\000The amount a Time Stamp Counter (TSC) increases per second\000config=0xc\000\00000\000\000\000 */
> +{ 79 }, /* user_time\000tool\000User (non-kernel) time in nanoseconds\000config=2\000\00000\000\000\000 */
>
> };
>
> @@ -99,29 +99,29 @@ const struct pmu_table_entry pmu_events__common[] = {
> };
>
> static const struct compact_pmu_event pmu_events__test_soc_cpu_default_core[] = {
> -{ 1115 }, /* bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000 */
> -{ 1174 }, /* bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000 */
> -{ 1427 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\000 */
> -{ 1557 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\000 */
> -{ 1233 }, /* l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000 */
> -{ 1328 }, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\000 */
> +{ 1127 }, /* bp_l1_btb_correct\000branch\000L1 BTB Correction\000event=0x8a\000\00000\000\000\000 */
> +{ 1187 }, /* bp_l2_btb_correct\000branch\000L2 BTB Correction\000event=0x8b\000\00000\000\000\000 */
> +{ 1446 }, /* dispatch_blocked.any\000other\000Memory cluster signals to block micro-op dispatch for any reason\000event=9,period=200000,umask=0x20\000\00000\000\0000,1\000 */
> +{ 1580 }, /* eist_trans\000other\000Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions\000event=0x3a,period=200000\000\00000\000\0000,1\000 */
> +{ 1247 }, /* l3_cache_rd\000cache\000L3 cache access, read\000event=0x40\000\00000\000Attributable Level 3 cache access, read\000\000 */
> +{ 1343 }, /* segment_reg_loads.any\000other\000Number of segment register loads\000event=6,period=200000,umask=0x80\000\00000\000\0000,1\000 */
> };
> static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_ddrc[] = {
> -{ 1687 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000 */
> +{ 1714 }, /* uncore_hisi_ddrc.flux_wcmd\000uncore\000DDRC write commands\000event=2\000\00000\000DDRC write commands\000\000 */
> };
> static const struct compact_pmu_event pmu_events__test_soc_cpu_hisi_sccl_l3c[] = {
> -{ 2166 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000 */
> +{ 2200 }, /* uncore_hisi_l3c.rd_hit_cpipe\000uncore\000Total read hits\000event=7\000\00000\000Total read hits\000\000 */
> };
> static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_cbox[] = {
> -{ 2016 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000 */
> -{ 2081 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000 */
> -{ 1785 }, /* unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000 */
> +{ 2048 }, /* event-hyphen\000uncore\000UNC_CBO_HYPHEN\000event=0xe0\000\00000\000UNC_CBO_HYPHEN\000\000 */
> +{ 2114 }, /* event-two-hyph\000uncore\000UNC_CBO_TWO_HYPH\000event=0xc0\000\00000\000UNC_CBO_TWO_HYPH\000\000 */
> +{ 1813 }, /* unc_cbo_xsnp_response.miss_eviction\000uncore\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\000event=0x22,umask=0x81\000\00000\000A cross-core snoop resulted from L3 Eviction which misses in some processor core\0000,1\000 */
> };
> static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_imc[] = {
> -{ 2376 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000 */
> +{ 2412 }, /* uncore_imc.cache_hits\000uncore\000Total cache hits\000event=0x34\000\00000\000Total cache hits\000\000 */
> };
> static const struct compact_pmu_event pmu_events__test_soc_cpu_uncore_imc_free_running[] = {
> -{ 2270 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000 */
> +{ 2305 }, /* uncore_imc_free_running.cache_miss\000uncore\000Total cache misses\000event=0x12\000\00000\000Total cache misses\000\000 */
>
> };
>
> @@ -129,51 +129,51 @@ const struct pmu_table_entry pmu_events__test_soc_cpu[] = {
> {
> .entries = pmu_events__test_soc_cpu_default_core,
> .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_default_core),
> - .pmu_name = { 1102 /* default_core\000 */ },
> + .pmu_name = { 1114 /* default_core\000 */ },
> },
> {
> .entries = pmu_events__test_soc_cpu_hisi_sccl_ddrc,
> .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_ddrc),
> - .pmu_name = { 1672 /* hisi_sccl,ddrc\000 */ },
> + .pmu_name = { 1699 /* hisi_sccl,ddrc\000 */ },
> },
> {
> .entries = pmu_events__test_soc_cpu_hisi_sccl_l3c,
> .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_hisi_sccl_l3c),
> - .pmu_name = { 2152 /* hisi_sccl,l3c\000 */ },
> + .pmu_name = { 2186 /* hisi_sccl,l3c\000 */ },
> },
> {
> .entries = pmu_events__test_soc_cpu_uncore_cbox,
> .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_cbox),
> - .pmu_name = { 1773 /* uncore_cbox\000 */ },
> + .pmu_name = { 1801 /* uncore_cbox\000 */ },
> },
> {
> .entries = pmu_events__test_soc_cpu_uncore_imc,
> .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc),
> - .pmu_name = { 2365 /* uncore_imc\000 */ },
> + .pmu_name = { 2401 /* uncore_imc\000 */ },
> },
> {
> .entries = pmu_events__test_soc_cpu_uncore_imc_free_running,
> .num_entries = ARRAY_SIZE(pmu_events__test_soc_cpu_uncore_imc_free_running),
> - .pmu_name = { 2246 /* uncore_imc_free_running\000 */ },
> + .pmu_name = { 2281 /* uncore_imc_free_running\000 */ },
> },
> };
>
> static const struct compact_pmu_event pmu_metrics__test_soc_cpu_default_core[] = {
> -{ 2798 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000 */
> -{ 3479 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */
> -{ 3251 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */
> -{ 3345 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */
> -{ 3543 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
> -{ 3611 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
> -{ 2883 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */
> -{ 2820 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */
> -{ 3745 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */
> -{ 3681 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */
> -{ 3703 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */
> -{ 3725 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */
> -{ 3180 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */
> -{ 3049 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
> -{ 3113 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
> +{ 2838 }, /* CPI\000\0001 / IPC\000\000\000\000\000\000\000\00000 */
> +{ 3519 }, /* DCache_L2_All\000\000DCache_L2_All_Hits + DCache_L2_All_Miss\000\000\000\000\000\000\000\00000 */
> +{ 3291 }, /* DCache_L2_All_Hits\000\000l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit\000\000\000\000\000\000\000\00000 */
> +{ 3385 }, /* DCache_L2_All_Miss\000\000max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss\000\000\000\000\000\000\000\00000 */
> +{ 3583 }, /* DCache_L2_Hits\000\000d_ratio(DCache_L2_All_Hits, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
> +{ 3651 }, /* DCache_L2_Misses\000\000d_ratio(DCache_L2_All_Miss, DCache_L2_All)\000\000\000\000\000\000\000\00000 */
> +{ 2923 }, /* Frontend_Bound_SMT\000\000idq_uops_not_delivered.core / (4 * (cpu_clk_unhalted.thread / 2 * (1 + cpu_clk_unhalted.one_thread_active / cpu_clk_unhalted.ref_xclk)))\000\000\000\000\000\000\000\00000 */
> +{ 2860 }, /* IPC\000group1\000inst_retired.any / cpu_clk_unhalted.thread\000\000\000\000\000\000\000\00000 */
> +{ 3785 }, /* L1D_Cache_Fill_BW\000\00064 * l1d.replacement / 1e9 / duration_time\000\000\000\000\000\000\000\00000 */
> +{ 3721 }, /* M1\000\000ipc + M2\000\000\000\000\000\000\000\00000 */
> +{ 3743 }, /* M2\000\000ipc + M1\000\000\000\000\000\000\000\00000 */
> +{ 3765 }, /* M3\000\0001 / M3\000\000\000\000\000\000\000\00000 */
> +{ 3220 }, /* cache_miss_cycles\000group1\000dcache_miss_cpi + icache_miss_cycles\000\000\000\000\000\000\000\00000 */
> +{ 3089 }, /* dcache_miss_cpi\000\000l1d\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
> +{ 3153 }, /* icache_miss_cycles\000\000l1i\\-loads\\-misses / inst_retired.any\000\000\000\000\000\000\000\00000 */
>
> };
>
> @@ -181,18 +181,18 @@ const struct pmu_table_entry pmu_metrics__test_soc_cpu[] = {
> {
> .entries = pmu_metrics__test_soc_cpu_default_core,
> .num_entries = ARRAY_SIZE(pmu_metrics__test_soc_cpu_default_core),
> - .pmu_name = { 1102 /* default_core\000 */ },
> + .pmu_name = { 1114 /* default_core\000 */ },
> },
> };
>
> static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ccn_pmu[] = {
> -{ 2565 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000 */
> +{ 2603 }, /* sys_ccn_pmu.read_cycles\000uncore\000ccn read-cycles event\000config=0x2c\0000x01\00000\000\000\000 */
> };
> static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_cmn_pmu[] = {
> -{ 2658 }, /* sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000 */
> +{ 2697 }, /* sys_cmn_pmu.hnf_cache_miss\000uncore\000Counts total cache misses in first lookup result (high priority)\000eventid=1,type=5\000(434|436|43c|43a).*\00000\000\000\000 */
> };
> static const struct compact_pmu_event pmu_events__test_soc_sys_uncore_sys_ddr_pmu[] = {
> -{ 2473 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000 */
> +{ 2510 }, /* sys_ddr_pmu.write_cycles\000uncore\000ddr write-cycles event\000event=0x2b\000v8\00000\000\000\000 */
>
> };
>
> @@ -200,17 +200,17 @@ const struct pmu_table_entry pmu_events__test_soc_sys[] = {
> {
> .entries = pmu_events__test_soc_sys_uncore_sys_ccn_pmu,
> .num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ccn_pmu),
> - .pmu_name = { 2546 /* uncore_sys_ccn_pmu\000 */ },
> + .pmu_name = { 2584 /* uncore_sys_ccn_pmu\000 */ },
> },
> {
> .entries = pmu_events__test_soc_sys_uncore_sys_cmn_pmu,
> .num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_cmn_pmu),
> - .pmu_name = { 2639 /* uncore_sys_cmn_pmu\000 */ },
> + .pmu_name = { 2678 /* uncore_sys_cmn_pmu\000 */ },
> },
> {
> .entries = pmu_events__test_soc_sys_uncore_sys_ddr_pmu,
> .num_entries = ARRAY_SIZE(pmu_events__test_soc_sys_uncore_sys_ddr_pmu),
> - .pmu_name = { 2454 /* uncore_sys_ddr_pmu\000 */ },
> + .pmu_name = { 2491 /* uncore_sys_ddr_pmu\000 */ },
> },
> };
>
> @@ -227,6 +227,12 @@ struct pmu_metrics_table {
> uint32_t num_pmus;
> };
>
> +/* Struct used to make the PMU counter layout table implementation opaque to callers. */
> +struct pmu_layouts_table {
> + const struct compact_pmu_event *entries;
> + size_t length;
> +};
> +
> /*
> * Map a CPU to its table of PMU events. The CPU is identified by the
> * cpuid field, which is an arch-specific identifier for the CPU.
> @@ -240,6 +246,7 @@ struct pmu_events_map {
> const char *cpuid;
> struct pmu_events_table event_table;
> struct pmu_metrics_table metric_table;
> + struct pmu_layouts_table layout_table;
> };
>
> /*
> @@ -273,6 +280,7 @@ const struct pmu_events_map pmu_events_map[] = {
> .cpuid = 0,
> .event_table = { 0, 0 },
> .metric_table = { 0, 0 },
> + .layout_table = { 0, 0 },
> }
> };
>
> @@ -317,6 +325,8 @@ static void decompress_event(int offset, struct pmu_event *pe)
> pe->unit = (*p == '\0' ? NULL : p);
> while (*p++);
> pe->long_desc = (*p == '\0' ? NULL : p);
> + while (*p++);
> + pe->counters_list = (*p == '\0' ? NULL : p);
> }
>
> static void decompress_metric(int offset, struct pmu_metric *pm)
> @@ -348,6 +358,19 @@ static void decompress_metric(int offset, struct pmu_metric *pm)
> pm->event_grouping = *p - '0';
> }
>
> +static void decompress_layout(int offset, struct pmu_layout *pm)
> +{
> + const char *p = &big_c_string[offset];
> +
> + pm->pmu = (*p == '\0' ? NULL : p);
> + while (*p++);
> + pm->desc = (*p == '\0' ? NULL : p);
> + p++;
> + pm->counters_num_gp = *p - '0';
> + p++;
> + pm->counters_num_fixed = *p - '0';
> +}
> +
> static int pmu_events_table__for_each_event_pmu(const struct pmu_events_table *table,
> const struct pmu_table_entry *pmu,
> pmu_event_iter_fn fn,
> @@ -503,6 +526,21 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table,
> return 0;
> }
>
> +int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table,
> + pmu_layout_iter_fn fn,
> + void *data) {
> + for (size_t i = 0; i < table->length; i++) {
> + struct pmu_layout pm;
> + int ret;
> +
> + decompress_layout(table->entries[i].offset, &pm);
> + ret = fn(&pm, data);
> + if (ret)
> + return ret;
> + }
> + return 0;
> +}
> +
> static const struct pmu_events_map *map_for_cpu(struct perf_cpu cpu)
> {
> static struct {
> @@ -595,6 +633,34 @@ const struct pmu_metrics_table *pmu_metrics_table__find(void)
> return map ? &map->metric_table : NULL;
> }
>
> +const struct pmu_layouts_table *perf_pmu__find_layouts_table(void)
> +{
> + const struct pmu_layouts_table *table = NULL;
> + struct perf_cpu cpu = {-1};
> + char *cpuid = get_cpuid_allow_env_override(cpu);
> + int i;
> +
> + /* on some platforms which uses cpus map, cpuid can be NULL for
> + * PMUs other than CORE PMUs.
> + */
> + if (!cpuid)
> + return NULL;
> +
> + i = 0;
> + for (;;) {
> + const struct pmu_events_map *map = &pmu_events_map[i++];
> + if (!map->arch)
> + break;
> +
> + if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
> + table = &map->layout_table;
> + break;
> + }
> + }
> + free(cpuid);
> + return table;
> +}
> +
> const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid)
> {
> for (const struct pmu_events_map *tables = &pmu_events_map[0];
> @@ -616,6 +682,16 @@ const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const
> }
> return NULL;
> }
> +const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid)
> +{
> + for (const struct pmu_events_map *tables = &pmu_events_map[0];
> + tables->arch;
> + tables++) {
> + if (!strcmp(tables->arch, arch) && !strcmp_cpuid_str(tables->cpuid, cpuid))
> + return &tables->layout_table;
> + }
> + return NULL;
> +}
>
> int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data)
> {
> @@ -644,6 +720,19 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data)
> return 0;
> }
>
> +int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data)
> +{
> + for (const struct pmu_events_map *tables = &pmu_events_map[0];
> + tables->arch;
> + tables++) {
> + int ret = pmu_layouts_table__for_each_layout(&tables->layout_table, fn, data);
> +
> + if (ret)
> + return ret;
> + }
> + return 0;
> +}
> +
> const struct pmu_events_table *find_sys_events_table(const char *name)
> {
> for (const struct pmu_sys_events *tables = &pmu_sys_event_tables[0];
> diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
> index 3e204700b59a..fa7c466a5ef3 100755
> --- a/tools/perf/pmu-events/jevents.py
> +++ b/tools/perf/pmu-events/jevents.py
> @@ -23,6 +23,8 @@ _metric_tables = []
> _sys_metric_tables = []
> # Mapping between sys event table names and sys metric table names.
> _sys_event_table_to_metric_table_mapping = {}
> +# List of regular PMU counter layout tables.
> +_pmu_layouts_tables = []
> # Map from an event name to an architecture standard
> # JsonEvent. Architecture standard events are in json files in the top
> # f'{_args.starting_dir}/{_args.arch}' directory.
> @@ -31,6 +33,10 @@ _arch_std_events = {}
> _pending_events = []
> # Name of events table to be written out
> _pending_events_tblname = None
> +# PMU counter layout to write out when the layout table is closed
> +_pending_pmu_counts = []
> +# Name of PMU counter layout table to be written out
> +_pending_pmu_counts_tblname = None
> # Metrics to write out when the table is closed
> _pending_metrics = []
> # Name of metrics table to be written out
> @@ -51,6 +57,11 @@ _json_event_attributes = [
> 'long_desc'
> ]
>
> +# Attributes that are in pmu_unit_layout.
> +_json_layout_attributes = [
> + 'pmu', 'desc'
> +]
> +
> # Attributes that are in pmu_metric rather than pmu_event.
> _json_metric_attributes = [
> 'metric_name', 'metric_group', 'metric_expr', 'metric_threshold',
> @@ -265,7 +276,7 @@ class JsonEvent:
>
> def unit_to_pmu(unit: str) -> Optional[str]:
> """Convert a JSON Unit to Linux PMU name."""
> - if not unit:
> + if not unit or unit == "core":
> return 'default_core'
> # Comment brought over from jevents.c:
> # it's not realistic to keep adding these, we need something more scalable ...
> @@ -336,6 +347,19 @@ class JsonEvent:
> if 'Errata' in jd:
> extra_desc += ' Spec update: ' + jd['Errata']
> self.pmu = unit_to_pmu(jd.get('Unit'))
> + # The list of counter(s) the event could be collected with
> + class Counter:
> + gp = str()
> + fixed = str()
> + self.counters = {'list': str(), 'num': Counter()}
> + self.counters['list'] = jd.get('Counter')
> + # Number of generic counters
> + self.counters['num'].gp = jd.get('CountersNumGeneric')
> + # Number of fixed counters
> + self.counters['num'].fixed = jd.get('CountersNumFixed')
> + # If the event uses an MSR, other events that use the same MSR cannot be
> + # scheduled for collection at the same time.
> + self.msr = jd.get('MSRIndex')
> filter = jd.get('Filter')
> self.unit = jd.get('ScaleUnit')
> self.perpkg = jd.get('PerPkg')
> @@ -411,8 +435,20 @@ class JsonEvent:
> s += f'\t{attr} = {value},\n'
> return s + '}'
>
> - def build_c_string(self, metric: bool) -> str:
> + def build_c_string(self, metric: bool, layout: bool) -> str:
> s = ''
> + if layout:
> + for attr in _json_layout_attributes:
> + x = getattr(self, attr)
> + if attr in _json_enum_attributes:
> + s += x if x else '0'
> + else:
> + s += f'{x}\\000' if x else '\\000'
> + x = self.counters['num'].gp
> + s += x if x else '0'
> + x = self.counters['num'].fixed
> + s += x if x else '0'
> + return s
> for attr in _json_metric_attributes if metric else _json_event_attributes:
> x = getattr(self, attr)
> if metric and x and attr == 'metric_expr':
> @@ -425,15 +461,18 @@ class JsonEvent:
> s += x if x else '0'
> else:
> s += f'{x}\\000' if x else '\\000'
> + if not metric:
> + x = self.counters['list']
> + s += f'{x}\\000' if x else '\\000'
> return s
>
> - def to_c_string(self, metric: bool) -> str:
> + def to_c_string(self, metric: bool, layout: bool) -> str:
> """Representation of the event as a C struct initializer."""
>
> def fix_comment(s: str) -> str:
> return s.replace('*/', r'\*\/')
>
> - s = self.build_c_string(metric)
> + s = self.build_c_string(metric, layout)
> return f'{{ { _bcs.offsets[s] } }}, /* {fix_comment(s)} */\n'
>
>
> @@ -472,6 +511,8 @@ def preprocess_arch_std_files(archpath: str) -> None:
> _arch_std_events[event.name.lower()] = event
> if event.metric_name:
> _arch_std_events[event.metric_name.lower()] = event
> + if event.counters['num'].gp:
> + _arch_std_events[event.pmu.lower()] = event
> except Exception as e:
> raise RuntimeError(f'Failure processing \'{item.name}\' in \'{archpath}\'') from e
>
> @@ -483,6 +524,8 @@ def add_events_table_entries(item: os.DirEntry, topic: str) -> None:
> _pending_events.append(e)
> if e.metric_name:
> _pending_metrics.append(e)
> + if e.counters['num'].gp:
> + _pending_pmu_counts.append(e)
>
>
> def print_pending_events() -> None:
> @@ -526,8 +569,8 @@ def print_pending_events() -> None:
> last_pmu = event.pmu
> pmus.add((event.pmu, pmu_name))
>
> - _args.output_file.write(event.to_c_string(metric=False))
> last_name = event.name
> + _args.output_file.write(event.to_c_string(metric=False, layout=False))
> _pending_events = []
>
> _args.output_file.write(f"""
> @@ -582,7 +625,7 @@ def print_pending_metrics() -> None:
> last_pmu = metric.pmu
> pmus.add((metric.pmu, pmu_name))
>
> - _args.output_file.write(metric.to_c_string(metric=True))
> + _args.output_file.write(metric.to_c_string(metric=True, layout=False))
> _pending_metrics = []
>
> _args.output_file.write(f"""
> @@ -600,6 +643,35 @@ const struct pmu_table_entry {_pending_metrics_tblname}[] = {{
> """)
> _args.output_file.write('};\n\n')
>
> +def print_pending_pmu_counter_layout_table() -> None:
> + '''Print counter layout data from counter.json file to counter layout table in
> + c-string'''
> +
> + def pmu_counts_cmp_key(j: JsonEvent) -> Tuple[bool, str]:
> + def fix_none(s: Optional[str]) -> str:
> + if s is None:
> + return ''
> + return s
> +
> + return (j.desc is not None, fix_none(j.pmu))
> +
> + global _pending_pmu_counts
> + if not _pending_pmu_counts:
> + return
> +
> + global _pending_pmu_counts_tblname
> + global _pmu_layouts_tables
> + _pmu_layouts_tables.append(_pending_pmu_counts_tblname)
> +
> + _args.output_file.write(
> + f'static const struct compact_pmu_event {_pending_pmu_counts_tblname}[] = {{\n')
> +
> + for pmu_layout in sorted(_pending_pmu_counts, key=pmu_counts_cmp_key):
> + _args.output_file.write(pmu_layout.to_c_string(metric=False, layout=True))
> + _pending_pmu_counts = []
> +
> + _args.output_file.write('};\n\n')
> +
> def get_topic(topic: str) -> str:
> if topic.endswith('metrics.json'):
> return 'metrics'
> @@ -636,10 +708,12 @@ def preprocess_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
> pmu_name = f"{event.pmu}\\000"
> if event.name:
> _bcs.add(pmu_name, metric=False)
> - _bcs.add(event.build_c_string(metric=False), metric=False)
> + _bcs.add(event.build_c_string(metric=False, layout=False), metric=False)
> if event.metric_name:
> _bcs.add(pmu_name, metric=True)
> - _bcs.add(event.build_c_string(metric=True), metric=True)
> + _bcs.add(event.build_c_string(metric=True, layout=False), metric=True)
> + if event.counters['num'].gp:
> + _bcs.add(event.build_c_string(metric=False, layout=True), metric=False)
>
> def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
> """Process a JSON file during the main walk."""
> @@ -656,11 +730,14 @@ def process_one_file(parents: Sequence[str], item: os.DirEntry) -> None:
> if item.is_dir() and is_leaf_dir_ignoring_sys(item.path):
> print_pending_events()
> print_pending_metrics()
> + print_pending_pmu_counter_layout_table()
>
> global _pending_events_tblname
> _pending_events_tblname = file_name_to_table_name('pmu_events_', parents, item.name)
> global _pending_metrics_tblname
> _pending_metrics_tblname = file_name_to_table_name('pmu_metrics_', parents, item.name)
> + global _pending_pmu_counts_tblname
> + _pending_pmu_counts_tblname = file_name_to_table_name('pmu_layouts_', parents, item.name)
>
> if item.name == 'sys':
> _sys_event_table_to_metric_table_mapping[_pending_events_tblname] = _pending_metrics_tblname
> @@ -694,6 +771,12 @@ struct pmu_metrics_table {
> uint32_t num_pmus;
> };
>
> +/* Struct used to make the PMU counter layout table implementation opaque to callers. */
> +struct pmu_layouts_table {
> + const struct compact_pmu_event *entries;
> + size_t length;
> +};
> +
> /*
> * Map a CPU to its table of PMU events. The CPU is identified by the
> * cpuid field, which is an arch-specific identifier for the CPU.
> @@ -707,6 +790,7 @@ struct pmu_events_map {
> const char *cpuid;
> struct pmu_events_table event_table;
> struct pmu_metrics_table metric_table;
> + struct pmu_layouts_table layout_table;
> };
>
> /*
> @@ -762,6 +846,12 @@ const struct pmu_events_map pmu_events_map[] = {
> metric_size = '0'
> if event_size == '0' and metric_size == '0':
> continue
> + layout_tblname = file_name_to_table_name('pmu_layouts_', [], row[2].replace('/', '_'))
> + if layout_tblname in _pmu_layouts_tables:
> + layout_size = f'ARRAY_SIZE({layout_tblname})'
> + else:
> + layout_tblname = 'NULL'
> + layout_size = '0'
> cpuid = row[0].replace('\\', '\\\\')
> _args.output_file.write(f"""{{
> \t.arch = "{arch}",
> @@ -773,6 +863,10 @@ const struct pmu_events_map pmu_events_map[] = {
> \t.metric_table = {{
> \t\t.pmus = {metric_tblname},
> \t\t.num_pmus = {metric_size}
> +\t}},
> +\t.layout_table = {{
> +\t\t.entries = {layout_tblname},
> +\t\t.length = {layout_size}
> \t}}
> }},
> """)
> @@ -783,6 +877,7 @@ const struct pmu_events_map pmu_events_map[] = {
> \t.cpuid = 0,
> \t.event_table = { 0, 0 },
> \t.metric_table = { 0, 0 },
> +\t.layout_table = { 0, 0 },
> }
> };
> """)
> @@ -851,6 +946,9 @@ static void decompress_event(int offset, struct pmu_event *pe)
> _args.output_file.write('\tp++;')
> else:
> _args.output_file.write('\twhile (*p++);')
> + _args.output_file.write('\twhile (*p++);')
> + _args.output_file.write(f'\n\tpe->counters_list = ')
> + _args.output_file.write("(*p == '\\0' ? NULL : p);\n")
> _args.output_file.write("""}
>
> static void decompress_metric(int offset, struct pmu_metric *pm)
> @@ -871,6 +969,30 @@ static void decompress_metric(int offset, struct pmu_metric *pm)
> _args.output_file.write('\twhile (*p++);')
> _args.output_file.write("""}
>
> +static void decompress_layout(int offset, struct pmu_layout *pm)
> +{
> +\tconst char *p = &big_c_string[offset];
> +""")
> + for attr in _json_layout_attributes:
> + _args.output_file.write(f'\n\tpm->{attr} = ')
> + if attr in _json_enum_attributes:
> + _args.output_file.write("*p - '0';\n")
> + else:
> + _args.output_file.write("(*p == '\\0' ? NULL : p);\n")
> + if attr == _json_layout_attributes[-1]:
> + continue
> + if attr in _json_enum_attributes:
> + _args.output_file.write('\tp++;')
> + else:
> + _args.output_file.write('\twhile (*p++);')
> + _args.output_file.write('\tp++;')
> + _args.output_file.write(f'\n\tpm->counters_num_gp = ')
> + _args.output_file.write("*p - '0';\n")
> + _args.output_file.write('\tp++;')
> + _args.output_file.write(f'\n\tpm->counters_num_fixed = ')
> + _args.output_file.write("*p - '0';\n")
> + _args.output_file.write("""}
> +
> static int pmu_events_table__for_each_event_pmu(const struct pmu_events_table *table,
> const struct pmu_table_entry *pmu,
> pmu_event_iter_fn fn,
> @@ -1026,6 +1148,21 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table,
> return 0;
> }
>
> +int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table,
> + pmu_layout_iter_fn fn,
> + void *data) {
> + for (size_t i = 0; i < table->length; i++) {
> + struct pmu_layout pm;
> + int ret;
> +
> + decompress_layout(table->entries[i].offset, &pm);
> + ret = fn(&pm, data);
> + if (ret)
> + return ret;
> + }
> + return 0;
> +}
> +
> static const struct pmu_events_map *map_for_cpu(struct perf_cpu cpu)
> {
> static struct {
> @@ -1118,6 +1255,34 @@ const struct pmu_metrics_table *pmu_metrics_table__find(void)
> return map ? &map->metric_table : NULL;
> }
>
> +const struct pmu_layouts_table *perf_pmu__find_layouts_table(void)
> +{
> + const struct pmu_layouts_table *table = NULL;
> + struct perf_cpu cpu = {-1};
> + char *cpuid = get_cpuid_allow_env_override(cpu);
> + int i;
> +
> + /* on some platforms which uses cpus map, cpuid can be NULL for
> + * PMUs other than CORE PMUs.
> + */
> + if (!cpuid)
> + return NULL;
> +
> + i = 0;
> + for (;;) {
> + const struct pmu_events_map *map = &pmu_events_map[i++];
> + if (!map->arch)
> + break;
> +
> + if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
> + table = &map->layout_table;
> + break;
> + }
> + }
> + free(cpuid);
> + return table;
> +}
> +
> const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid)
> {
> for (const struct pmu_events_map *tables = &pmu_events_map[0];
> @@ -1139,6 +1304,16 @@ const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const
> }
> return NULL;
> }
> +const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid)
> +{
> + for (const struct pmu_events_map *tables = &pmu_events_map[0];
> + tables->arch;
> + tables++) {
> + if (!strcmp(tables->arch, arch) && !strcmp_cpuid_str(tables->cpuid, cpuid))
> + return &tables->layout_table;
> + }
> + return NULL;
> +}
>
> int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data)
> {
> @@ -1167,6 +1342,19 @@ int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data)
> return 0;
> }
>
> +int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data)
> +{
> + for (const struct pmu_events_map *tables = &pmu_events_map[0];
> + tables->arch;
> + tables++) {
> + int ret = pmu_layouts_table__for_each_layout(&tables->layout_table, fn, data);
> +
> + if (ret)
> + return ret;
> + }
> + return 0;
> +}
> +
> const struct pmu_events_table *find_sys_events_table(const char *name)
> {
> for (const struct pmu_sys_events *tables = &pmu_sys_event_tables[0];
> @@ -1330,6 +1518,7 @@ struct pmu_table_entry {
> ftw(arch_path, [], process_one_file)
> print_pending_events()
> print_pending_metrics()
> + print_pending_pmu_counter_layout_table()
>
> print_mapping_table(archs)
> print_system_mapping_table()
> diff --git a/tools/perf/pmu-events/pmu-events.h b/tools/perf/pmu-events/pmu-events.h
> index 675562e6f770..9a5cbec32513 100644
> --- a/tools/perf/pmu-events/pmu-events.h
> +++ b/tools/perf/pmu-events/pmu-events.h
> @@ -45,6 +45,11 @@ struct pmu_event {
> const char *desc;
> const char *topic;
> const char *long_desc;
> + /**
> + * The list of counter(s) the event could be collected on.
> + * e.g., "0,1,2,3,4,5,6,7".
> + */
> + const char *counters_list;
> const char *pmu;
> const char *unit;
> bool perpkg;
> @@ -67,8 +72,18 @@ struct pmu_metric {
> enum metric_event_groups event_grouping;
> };
>
> +struct pmu_layout {
> + const char *pmu;
> + const char *desc;
> + /** Total number of generic counters*/
> + int counters_num_gp;
> + /** Total number of fixed counters. Set to zero if no fixed counter on the unit.*/
> + int counters_num_fixed;
> +};
> +
> struct pmu_events_table;
> struct pmu_metrics_table;
> +struct pmu_layouts_table;
>
> #define PMU_EVENTS__NOT_FOUND -1000
>
> @@ -80,6 +95,9 @@ typedef int (*pmu_metric_iter_fn)(const struct pmu_metric *pm,
> const struct pmu_metrics_table *table,
> void *data);
>
> +typedef int (*pmu_layout_iter_fn)(const struct pmu_layout *pm,
> + void *data);
> +
> int pmu_events_table__for_each_event(const struct pmu_events_table *table,
> struct perf_pmu *pmu,
> pmu_event_iter_fn fn,
> @@ -92,10 +110,13 @@ int pmu_events_table__for_each_event(const struct pmu_events_table *table,
> * search of all tables.
> */
> int pmu_events_table__find_event(const struct pmu_events_table *table,
> - struct perf_pmu *pmu,
> - const char *name,
> - pmu_event_iter_fn fn,
> - void *data);
> + struct perf_pmu *pmu,
> + const char *name,
> + pmu_event_iter_fn fn,
> + void *data);
> +int pmu_layouts_table__for_each_layout(const struct pmu_layouts_table *table,
> + pmu_layout_iter_fn fn,
> + void *data);
> size_t pmu_events_table__num_events(const struct pmu_events_table *table,
> struct perf_pmu *pmu);
>
> @@ -104,10 +125,13 @@ int pmu_metrics_table__for_each_metric(const struct pmu_metrics_table *table, pm
>
> const struct pmu_events_table *perf_pmu__find_events_table(struct perf_pmu *pmu);
> const struct pmu_metrics_table *pmu_metrics_table__find(void);
> +const struct pmu_layouts_table *perf_pmu__find_layouts_table(void);
> const struct pmu_events_table *find_core_events_table(const char *arch, const char *cpuid);
> const struct pmu_metrics_table *find_core_metrics_table(const char *arch, const char *cpuid);
> +const struct pmu_layouts_table *find_core_layouts_table(const char *arch, const char *cpuid);
> int pmu_for_each_core_event(pmu_event_iter_fn fn, void *data);
> int pmu_for_each_core_metric(pmu_metric_iter_fn fn, void *data);
> +int pmu_for_each_core_layout(pmu_layout_iter_fn fn, void *data);
>
> const struct pmu_events_table *find_sys_events_table(const char *name);
> const struct pmu_metrics_table *find_sys_metrics_table(const char *name);
>
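For readers following the layout encoding in this patch, here is a minimal
Python sketch of the record format that build_c_string(layout=True) emits and
that decompress_layout() is meant to read back: two NUL-terminated strings
(pmu, desc) followed by one ASCII digit each for the generic and fixed counter
counts. The names Layout, encode_layout and decode_layout are hypothetical and
exist only for this illustration; they are not part of jevents.py.

```python
from typing import NamedTuple, Optional


class Layout(NamedTuple):
    """Illustrative stand-in for struct pmu_layout (hypothetical name)."""
    pmu: Optional[str]
    desc: Optional[str]
    counters_num_gp: int
    counters_num_fixed: int


def encode_layout(pmu: Optional[str], desc: Optional[str],
                  num_gp: int, num_fixed: int) -> str:
    """Pack a layout record: NUL-terminated strings, then two ASCII digits."""
    # Assumption: each count fits in one digit, mirroring the `*p - '0'` decode.
    if not (0 <= num_gp <= 9 and 0 <= num_fixed <= 9):
        raise ValueError('single-digit encoding only supports counts 0..9')
    return f'{pmu or ""}\0{desc or ""}\0{num_gp}{num_fixed}'


def decode_layout(blob: str, offset: int = 0) -> Layout:
    """Unpack a layout record starting at `offset` inside a larger string."""

    def read_str(pos: int):
        # Read up to the terminating NUL; an empty string maps to None,
        # like the `(*p == '\0' ? NULL : p)` pattern in the generated C.
        end = blob.index('\0', pos)
        text = blob[pos:end]
        return (text or None), end + 1

    pmu, pos = read_str(offset)
    desc, pos = read_str(pos)
    num_gp = ord(blob[pos]) - ord('0')
    num_fixed = ord(blob[pos + 1]) - ord('0')
    return Layout(pmu, desc, num_gp, num_fixed)
```

Two things stand out from the sketch. The single-digit encoding caps each
count at 9. The sketch also skips the whole desc string before reading the
digits, whereas the quoted decompress_layout() advances with a single p++
after desc, which appears to line up with the digits only when desc is empty.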
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v5 20/21] tools/perf: Pass the Counter constraint values in the pmu events
2025-03-27 19:36 ` [PATCH v5 20/21] tools/perf: Pass the Counter constraint values in the pmu events Atish Patra
@ 2025-04-23 0:17 ` Atish Patra
0 siblings, 0 replies; 28+ messages in thread
From: Atish Patra @ 2025-04-23 0:17 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers
Cc: linux-riscv, linux-kernel, Conor Dooley, devicetree, kvm,
kvm-riscv, linux-arm-kernel, linux-perf-users, Jiri Olsa,
Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Alexander Shishkin, Adrian Hunter
On 3/27/25 12:36 PM, Atish Patra wrote:
> RISC-V doesn't have any standard event to counter mapping discovery
> mechanism in the ISA. The ISA defines 29 programmable counters and
> platforms can choose to implement any number of them and map any
> events to any counters. Thus, the perf tool needs to inform the driver
> about the counter mapping of each event.
>
> The current perf infrastructure only parses the 'Counter' constraints
> in metrics. This patch extends that to pass in the pmu events so that
> any driver can retrieve those values via perf attributes if defined
> accordingly.
>
Hi Ian/Arnaldo/Namhyung,
Any thoughts on this patch? Please let me know if there is a better
approach for passing the counter constraints to the driver.
The RISC-V PMU driver maps attr.config2 to the counterid_mask value
so that the driver can parse the counter restrictions.
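
For illustration, the conversion from a JSON 'Counter' list to the
counterid_mask term can be sketched as below. This is a minimal standalone
sketch, not the code from the patch: append_counterid_mask is a hypothetical
helper name, the sample event string is made up, and the sketch ORs bits
together instead of summing them so that a repeated counter id does not
corrupt the mask.

```python
def counter_list_to_bitmask(counterlist: str) -> int:
    """Convert a comma-separated list of counter ids (e.g. "3,4,5") to a bitmask."""
    bitmask = 0
    for field in counterlist.split(','):
        field = field.strip()
        if not field:
            continue  # tolerate empty fields such as a trailing comma
        # OR instead of add: a duplicated id sets the same bit only once.
        bitmask |= 1 << int(field)
    return bitmask


def append_counterid_mask(event: str, counterlist: str) -> str:
    """Hypothetical helper: append the counterid_mask term to an event string."""
    if not counterlist:
        return event  # no constraint given, leave the event untouched
    bitmask = counter_list_to_bitmask(counterlist)
    return f'{event},counterid_mask={bitmask:#x}'


if __name__ == '__main__':
    # Made-up event string and counter list, for demonstration only.
    print(append_counterid_mask('event=0x1', '3,4,5'))
```

With a counter list of "3,4,5", bits 3, 4 and 5 are set, so the mask is 0x38.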
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
> tools/perf/pmu-events/jevents.py | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/tools/perf/pmu-events/jevents.py b/tools/perf/pmu-events/jevents.py
> index fdb7ddf093d2..f9f274678a32 100755
> --- a/tools/perf/pmu-events/jevents.py
> +++ b/tools/perf/pmu-events/jevents.py
> @@ -274,6 +274,11 @@ class JsonEvent:
> return fixed[name.lower()]
> return event
>
> + def counter_list_to_bitmask(counterlist):
> + counter_ids = list(map(int, counterlist.split(',')))
> + bitmask = sum(1 << pos for pos in counter_ids)
> + return bitmask
> +
> def unit_to_pmu(unit: str) -> Optional[str]:
> """Convert a JSON Unit to Linux PMU name."""
> if not unit or unit == "core":
> @@ -427,6 +432,10 @@ class JsonEvent:
> else:
> raise argparse.ArgumentTypeError('Cannot find arch std event:', arch_std)
>
> + if self.counters['list']:
> + bitmask = counter_list_to_bitmask(self.counters['list'])
> + event += f',counterid_mask={bitmask:#x}'
> +
> self.event = real_event(self.name, event)
>
> def __repr__(self) -> str:
>
* Re: [External] [PATCH v5 14/21] RISC-V: perf: Implement supervisor counter delegation support
2025-03-27 19:35 ` [PATCH v5 14/21] RISC-V: perf: Implement supervisor counter delegation support Atish Patra
@ 2025-08-28 9:56 ` yunhui cui
0 siblings, 0 replies; 28+ messages in thread
From: yunhui cui @ 2025-08-28 9:56 UTC (permalink / raw)
To: Atish Patra
Cc: Paul Walmsley, Palmer Dabbelt, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Anup Patel, Atish Patra, Will Deacon, Mark Rutland,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Namhyung Kim, Alexander Shishkin, Jiri Olsa, Ian Rogers,
Adrian Hunter, weilin.wang, linux-riscv, linux-kernel,
Conor Dooley, devicetree, kvm, kvm-riscv, linux-arm-kernel,
linux-perf-users
Hi Atish,
On Fri, Mar 28, 2025 at 3:43 AM Atish Patra <atishp@rivosinc.com> wrote:
>
> There are a few new RISC-V ISA extensions (ssccfg, sscsrind, smcntrpmf) which
> allow the hpmcounter/hpmevents to be programmed directly from S-mode. The
> implementation detects the ISA extensions at runtime and uses them if
> available instead of the SBI PMU extension. The SBI PMU extension will still be
> used for firmware counters if the user requests it.
>
> The current Linux driver relies on the event encoding defined by the SBI PMU
> specification for standard perf events. However, there is no standard
> event encoding available in the ISA. In the future, we may want to
> decouple the counter delegation and SBI PMU completely. In that case,
> counter delegation supported platforms must rely on the event encoding
> defined in the perf json file or in the pmu driver.
>
> Firmware events will continue to use the SBI PMU encoding, as
> firmware events cannot be supported without SBI PMU.
>
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
> arch/riscv/include/asm/csr.h | 1 +
> drivers/perf/riscv_pmu_dev.c | 561 +++++++++++++++++++++++++++++++++--------
> include/linux/perf/riscv_pmu.h | 3 +
> 3 files changed, 462 insertions(+), 103 deletions(-)
>
> diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
> index 3d2d4f886c77..8b2f5ae1d60e 100644
> --- a/arch/riscv/include/asm/csr.h
> +++ b/arch/riscv/include/asm/csr.h
> @@ -255,6 +255,7 @@
> #endif
>
> #define SISELECT_SSCCFG_BASE 0x40
> +#define HPMEVENT_MASK GENMASK_ULL(63, 56)
>
> /* mseccfg bits */
> #define MSECCFG_PMM ENVCFG_PMM
> diff --git a/drivers/perf/riscv_pmu_dev.c b/drivers/perf/riscv_pmu_dev.c
> index 6f64404a6e3d..7c4a1ef15866 100644
> --- a/drivers/perf/riscv_pmu_dev.c
> +++ b/drivers/perf/riscv_pmu_dev.c
> @@ -27,6 +27,8 @@
> #include <asm/cpufeature.h>
> #include <asm/vendor_extensions.h>
> #include <asm/vendor_extensions/andes.h>
> +#include <asm/hwcap.h>
> +#include <asm/csr_ind.h>
>
> #define ALT_SBI_PMU_OVERFLOW(__ovl) \
> asm volatile(ALTERNATIVE_2( \
> @@ -59,14 +61,31 @@ asm volatile(ALTERNATIVE( \
> #define PERF_EVENT_FLAG_USER_ACCESS BIT(SYSCTL_USER_ACCESS)
> #define PERF_EVENT_FLAG_LEGACY BIT(SYSCTL_LEGACY)
>
> -PMU_FORMAT_ATTR(event, "config:0-47");
> +#define RVPMU_SBI_PMU_FORMAT_ATTR "config:0-47"
> +#define RVPMU_CDELEG_PMU_FORMAT_ATTR "config:0-55"
> +
> +static ssize_t __maybe_unused rvpmu_format_show(struct device *dev, struct device_attribute *attr,
> + char *buf);
> +
> +#define RVPMU_ATTR_ENTRY(_name, _func, _config) ( \
> + &((struct dev_ext_attribute[]) { \
> + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \
> + })[0].attr.attr)
> +
> +#define RVPMU_FORMAT_ATTR_ENTRY(_name, _config) \
> + RVPMU_ATTR_ENTRY(_name, rvpmu_format_show, (char *)_config)
> +
> PMU_FORMAT_ATTR(firmware, "config:62-63");
>
> static bool sbi_v2_available;
> static DEFINE_STATIC_KEY_FALSE(sbi_pmu_snapshot_available);
> #define sbi_pmu_snapshot_available() \
> static_branch_unlikely(&sbi_pmu_snapshot_available)
> +
> static DEFINE_STATIC_KEY_FALSE(riscv_pmu_sbi_available);
> +#define riscv_pmu_sbi_available() \
> + static_branch_likely(&riscv_pmu_sbi_available)
> +
> static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
>
> /* Avoid unnecessary code patching in the one time booting path*/
> @@ -81,19 +100,35 @@ static DEFINE_STATIC_KEY_FALSE(riscv_pmu_cdeleg_available);
> #define riscv_pmu_sbi_available() \
> static_branch_likely(&riscv_pmu_sbi_available)
>
> -static struct attribute *riscv_arch_formats_attr[] = {
> - &format_attr_event.attr,
> +static struct attribute *riscv_sbi_pmu_formats_attr[] = {
> + RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_SBI_PMU_FORMAT_ATTR),
> + &format_attr_firmware.attr,
> + NULL,
> +};
> +
> +static struct attribute_group riscv_sbi_pmu_format_group = {
> + .name = "format",
> + .attrs = riscv_sbi_pmu_formats_attr,
> +};
> +
> +static const struct attribute_group *riscv_sbi_pmu_attr_groups[] = {
> + &riscv_sbi_pmu_format_group,
> + NULL,
> +};
> +
> +static struct attribute *riscv_cdeleg_pmu_formats_attr[] = {
> + RVPMU_FORMAT_ATTR_ENTRY(event, RVPMU_CDELEG_PMU_FORMAT_ATTR),
> &format_attr_firmware.attr,
> NULL,
> };
>
> -static struct attribute_group riscv_pmu_format_group = {
> +static struct attribute_group riscv_cdeleg_pmu_format_group = {
> .name = "format",
> - .attrs = riscv_arch_formats_attr,
> + .attrs = riscv_cdeleg_pmu_formats_attr,
> };
>
> -static const struct attribute_group *riscv_pmu_attr_groups[] = {
> - &riscv_pmu_format_group,
> +static const struct attribute_group *riscv_cdeleg_pmu_attr_groups[] = {
> + &riscv_cdeleg_pmu_format_group,
> NULL,
> };
>
> @@ -395,6 +430,14 @@ static void rvpmu_sbi_check_std_events(struct work_struct *work)
>
> static DECLARE_WORK(check_std_events_work, rvpmu_sbi_check_std_events);
>
> +static ssize_t rvpmu_format_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct dev_ext_attribute *eattr = container_of(attr,
> + struct dev_ext_attribute, attr);
> + return sysfs_emit(buf, "%s\n", (char *)eattr->var);
> +}
> +
> static int rvpmu_ctr_get_width(int idx)
> {
> return pmu_ctr_list[idx].width;
> @@ -447,6 +490,38 @@ static uint8_t rvpmu_csr_index(struct perf_event *event)
> return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE;
> }
>
> +static uint64_t get_deleg_priv_filter_bits(struct perf_event *event)
> +{
> + u64 priv_filter_bits = 0;
> + bool guest_events = false;
> +
> + if (event->attr.config1 & RISCV_PMU_CONFIG1_GUEST_EVENTS)
> + guest_events = true;
> + if (event->attr.exclude_kernel)
> + priv_filter_bits |= guest_events ? HPMEVENT_VSINH : HPMEVENT_SINH;
> + if (event->attr.exclude_user)
> + priv_filter_bits |= guest_events ? HPMEVENT_VUINH : HPMEVENT_UINH;
> + if (guest_events && event->attr.exclude_hv)
> + priv_filter_bits |= HPMEVENT_SINH;
> + if (event->attr.exclude_host)
> + priv_filter_bits |= HPMEVENT_UINH | HPMEVENT_SINH;
> + if (event->attr.exclude_guest)
> + priv_filter_bits |= HPMEVENT_VSINH | HPMEVENT_VUINH;
> +
> + return priv_filter_bits;
> +}
> +
> +static bool pmu_sbi_is_fw_event(struct perf_event *event)
> +{
> + u32 type = event->attr.type;
> + u64 config = event->attr.config;
> +
> + if (type == PERF_TYPE_RAW && ((config >> 63) == 1))
> + return true;
> + else
> + return false;
> +}
> +
> static unsigned long rvpmu_sbi_get_filter_flags(struct perf_event *event)
> {
> unsigned long cflags = 0;
> @@ -475,7 +550,8 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
> struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
> struct sbiret ret;
> int idx;
> - uint64_t cbase = 0, cmask = rvpmu->cmask;
> + u64 cbase = 0;
> + unsigned long ctr_mask = rvpmu->cmask;
> unsigned long cflags = 0;
>
> cflags = rvpmu_sbi_get_filter_flags(event);
> @@ -488,21 +564,23 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
> if ((hwc->flags & PERF_EVENT_FLAG_LEGACY) && (event->attr.type == PERF_TYPE_HARDWARE)) {
> if (event->attr.config == PERF_COUNT_HW_CPU_CYCLES) {
> cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
> - cmask = 1;
> + ctr_mask = 1;
> } else if (event->attr.config == PERF_COUNT_HW_INSTRUCTIONS) {
> cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
> - cmask = BIT(CSR_INSTRET - CSR_CYCLE);
> + ctr_mask = BIT(CSR_INSTRET - CSR_CYCLE);
> }
> + } else if (pmu_sbi_is_fw_event(event)) {
> + ctr_mask = firmware_cmask;
> }
>
> /* retrieve the available counter index */
> #if defined(CONFIG_32BIT)
> ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
> - cmask, cflags, hwc->event_base, hwc->config,
> + ctr_mask, cflags, hwc->event_base, hwc->config,
> hwc->config >> 32);
> #else
> ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
> - cmask, cflags, hwc->event_base, hwc->config, 0);
> + ctr_mask, cflags, hwc->event_base, hwc->config, 0);
> #endif
> if (ret.error) {
> pr_debug("Not able to find a counter for event %lx config %llx\n",
> @@ -511,7 +589,7 @@ static int rvpmu_sbi_ctr_get_idx(struct perf_event *event)
> }
>
> idx = ret.value;
> - if (!test_bit(idx, &rvpmu->cmask) || !pmu_ctr_list[idx].value)
> + if (!test_bit(idx, &ctr_mask) || !pmu_ctr_list[idx].value)
> return -ENOENT;
>
> /* Additional sanity check for the counter id */
> @@ -561,17 +639,6 @@ static int sbi_pmu_event_find_cache(u64 config)
> return ret;
> }
>
> -static bool pmu_sbi_is_fw_event(struct perf_event *event)
> -{
> - u32 type = event->attr.type;
> - u64 config = event->attr.config;
> -
> - if ((type == PERF_TYPE_RAW) && ((config >> 63) == 1))
> - return true;
> - else
> - return false;
> -}
> -
> static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
> {
> u32 type = event->attr.type;
> @@ -602,7 +669,6 @@ static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
> * 10 - SBI firmware events
> * 11 - Risc-V platform specific firmware event
> */
> -
> switch (config >> 62) {
> case 0:
> /* Return error any bits [48-63] is set as it is not allowed by the spec */
> @@ -634,6 +700,84 @@ static int rvpmu_sbi_event_map(struct perf_event *event, u64 *econfig)
> return ret;
> }
>
> +static int cdeleg_pmu_event_find_cache(u64 config, u64 *eventid, uint32_t *counter_mask)
> +{
> + unsigned int cache_type, cache_op, cache_result;
> +
> + if (!current_pmu_cache_event_map)
> + return -ENOENT;
> +
> + cache_type = (config >> 0) & 0xff;
> + if (cache_type >= PERF_COUNT_HW_CACHE_MAX)
> + return -EINVAL;
> +
> + cache_op = (config >> 8) & 0xff;
> + if (cache_op >= PERF_COUNT_HW_CACHE_OP_MAX)
> + return -EINVAL;
> +
> + cache_result = (config >> 16) & 0xff;
> + if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
> + return -EINVAL;
> +
> + if (eventid)
> + *eventid = current_pmu_cache_event_map[cache_type][cache_op]
> + [cache_result].event_id;
> + if (counter_mask)
> + *counter_mask = current_pmu_cache_event_map[cache_type][cache_op]
> + [cache_result].counter_mask;
> +
> + return 0;
> +}
> +
> +static int rvpmu_cdeleg_event_map(struct perf_event *event, u64 *econfig)
> +{
> + u32 type = event->attr.type;
> + u64 config = event->attr.config;
> + int ret = 0;
> +
> + /*
> + * There are two ways standard perf events can be mapped to platform specific
> + * encoding.
> + * 1. The vendor may specify the encodings in the driver.
> + * 2. The Perf tool for RISC-V may remap the standard perf event to platform
> + * specific encoding.
> + *
> + * As the RISC-V ISA doesn't define any standard event encoding, the perf tool allows
> + * vendors to define it via a json file. The encoding defined in the json will override
> + * the perf legacy encoding. However, some users may want to run performance
> + * monitoring without the perf tool as well. That's why vendors may specify the event
> + * encoding in the driver as well if they want to support that use case too.
> + * If an encoding is defined in the json, it will be encoded as a raw event.
> + */
> +
> + switch (type) {
> + case PERF_TYPE_HARDWARE:
> + if (config >= PERF_COUNT_HW_MAX)
> + return -EINVAL;
> + if (!current_pmu_hw_event_map)
> + return -ENOENT;
> +
> + *econfig = current_pmu_hw_event_map[config].event_id;
> + if (*econfig == HW_OP_UNSUPPORTED)
> + ret = -ENOENT;
> + break;
> + case PERF_TYPE_HW_CACHE:
> + ret = cdeleg_pmu_event_find_cache(config, econfig, NULL);
> + if (*econfig == HW_OP_UNSUPPORTED)
> + ret = -ENOENT;
> + break;
> + case PERF_TYPE_RAW:
> + *econfig = config & RISCV_PMU_DELEG_RAW_EVENT_MASK;
> + break;
> + default:
> + ret = -ENOENT;
> + break;
> + }
> +
> + /* event_base is not used for counter delegation */
> + return ret;
> +}
> +
> static void pmu_sbi_snapshot_free(struct riscv_pmu *pmu)
> {
> int cpu;
> @@ -717,7 +861,7 @@ static int pmu_sbi_snapshot_setup(struct riscv_pmu *pmu, int cpu)
> return 0;
> }
>
> -static u64 rvpmu_sbi_ctr_read(struct perf_event *event)
> +static u64 rvpmu_ctr_read(struct perf_event *event)
> {
> struct hw_perf_event *hwc = &event->hw;
> int idx = hwc->idx;
> @@ -794,10 +938,6 @@ static void rvpmu_sbi_ctr_start(struct perf_event *event, u64 ival)
> if (ret.error && (ret.error != SBI_ERR_ALREADY_STARTED))
> pr_err("Starting counter idx %d failed with error %d\n",
> hwc->idx, sbi_err_map_linux_errno(ret.error));
> -
> - if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
> - (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
> - rvpmu_set_scounteren((void *)event);
> }
>
> static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
> @@ -808,10 +948,6 @@ static void rvpmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
> struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
> struct riscv_pmu_snapshot_data *sdata = cpu_hw_evt->snapshot_addr;
>
> - if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
> - (hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
> - rvpmu_reset_scounteren((void *)event);
> -
> if (sbi_pmu_snapshot_available())
> flag |= SBI_PMU_STOP_FLAG_TAKE_SNAPSHOT;
>
> @@ -847,12 +983,6 @@ static int rvpmu_sbi_find_num_ctrs(void)
> return sbi_err_map_linux_errno(ret.error);
> }
>
> -static u32 rvpmu_deleg_find_ctrs(void)
> -{
> - /* TODO */
> - return 0;
> -}
> -
> static int rvpmu_sbi_get_ctrinfo(u32 nsbi_ctr, u32 *num_fw_ctr, u32 *num_hw_ctr)
> {
> struct sbiret ret;
> @@ -930,53 +1060,75 @@ static inline void rvpmu_sbi_stop_hw_ctrs(struct riscv_pmu *pmu)
> }
> }
>
> -/*
> - * This function starts all the used counters in two step approach.
> - * Any counter that did not overflow can be start in a single step
> - * while the overflowed counters need to be started with updated initialization
> - * value.
> - */
> -static inline void rvpmu_sbi_start_ovf_ctrs_sbi(struct cpu_hw_events *cpu_hw_evt,
> - u64 ctr_ovf_mask)
> +static void rvpmu_deleg_ctr_start_mask(unsigned long mask)
> {
> - int idx = 0, i;
> - struct perf_event *event;
> - unsigned long flag = SBI_PMU_START_FLAG_SET_INIT_VALUE;
> - unsigned long ctr_start_mask = 0;
> - uint64_t max_period;
> - struct hw_perf_event *hwc;
> - u64 init_val = 0;
> + unsigned long scountinhibit_val = 0;
>
> - for (i = 0; i < BITS_TO_LONGS(RISCV_MAX_COUNTERS); i++) {
> - ctr_start_mask = cpu_hw_evt->used_hw_ctrs[i] & ~ctr_ovf_mask;
> - /* Start all the counters that did not overflow in a single shot */
> - sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, i * BITS_PER_LONG, ctr_start_mask,
> - 0, 0, 0, 0);
> - }
> + scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
> + scountinhibit_val &= ~mask;
> +
> + csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
> +}
> +
> +static void rvpmu_deleg_ctr_enable_irq(struct perf_event *event)
> +{
> + unsigned long hpmevent_curr;
> + unsigned long of_mask;
> + struct hw_perf_event *hwc = &event->hw;
> + int counter_idx = hwc->idx;
> + unsigned long sip_val = csr_read(CSR_SIP);
> +
> + if (!is_sampling_event(event) || (sip_val & SIP_LCOFIP))
> + return;
>
> - /* Reinitialize and start all the counter that overflowed */
> - while (ctr_ovf_mask) {
> - if (ctr_ovf_mask & 0x01) {
> - event = cpu_hw_evt->events[idx];
> - hwc = &event->hw;
> - max_period = riscv_pmu_ctr_get_width_mask(event);
> - init_val = local64_read(&hwc->prev_count) & max_period;
> #if defined(CONFIG_32BIT)
> - sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
> - flag, init_val, init_val >> 32, 0);
> + hpmevent_curr = csr_ind_read(CSR_SIREG5, SISELECT_SSCCFG_BASE, counter_idx);
> + of_mask = (u32)~HPMEVENTH_OF;
> #else
> - sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_START, idx, 1,
> - flag, init_val, 0, 0);
> + hpmevent_curr = csr_ind_read(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx);
> + of_mask = ~HPMEVENT_OF;
> +#endif
There are too many #if defined(CONFIG_32BIT) checks in the code. Could
we centralize their definitions in a unified place and unify the
32-bit/64-bit logic?
> +
> + hpmevent_curr &= of_mask;
> +#if defined(CONFIG_32BIT)
> + csr_ind_write(CSR_SIREG4, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr);
> +#else
> + csr_ind_write(CSR_SIREG2, SISELECT_SSCCFG_BASE, counter_idx, hpmevent_curr);
> #endif
> - perf_event_update_userpage(event);
> - }
> - ctr_ovf_mask = ctr_ovf_mask >> 1;
> - idx++;
> - }
> }
>
> -static inline void rvpmu_sbi_start_ovf_ctrs_snapshot(struct cpu_hw_events *cpu_hw_evt,
> - u64 ctr_ovf_mask)
> +static void rvpmu_deleg_ctr_start(struct perf_event *event, u64 ival)
> +{
> + unsigned long scountinhibit_val = 0;
> + struct hw_perf_event *hwc = &event->hw;
> +
> +#if defined(CONFIG_32BIT)
> + csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival & 0xFFFFFFFF);
> + csr_ind_write(CSR_SIREG4, SISELECT_SSCCFG_BASE, hwc->idx, ival >> BITS_PER_LONG);
> +#else
> + csr_ind_write(CSR_SIREG, SISELECT_SSCCFG_BASE, hwc->idx, ival);
> +#endif
> +
> + rvpmu_deleg_ctr_enable_irq(event);
> +
> + scountinhibit_val = csr_read(CSR_SCOUNTINHIBIT);
> + scountinhibit_val &= ~(1 << hwc->idx);
> +
> + csr_write(CSR_SCOUNTINHIBIT, scountinhibit_val);
> +}
> +
...
>
> +#define RISCV_PMU_DELEG_RAW_EVENT_MASK GENMASK_ULL(55, 0)
> +
> #define HW_OP_UNSUPPORTED 0xFFFF
> #define CACHE_OP_UNSUPPORTED 0xFFFF
>
>
> --
> 2.43.0
>
>
Thanks,
Yunhui
Thread overview: 28+ messages
2025-03-27 19:35 [PATCH v5 00/21] Add Counter delegation ISA extension support Atish Patra
2025-03-27 19:35 ` [PATCH v5 01/21] perf pmu-events: Add functions in jevent.py to parse counter and event info for hardware aware grouping Atish Patra
2025-04-23 0:13 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 02/21] RISC-V: Add Sxcsrind ISA extension CSR definitions Atish Patra
2025-03-27 19:35 ` [PATCH v5 03/21] RISC-V: Add Sxcsrind ISA extension definition and parsing Atish Patra
2025-03-27 19:35 ` [PATCH v5 04/21] dt-bindings: riscv: add Sxcsrind ISA extension description Atish Patra
2025-03-27 19:35 ` [PATCH v5 05/21] RISC-V: Define indirect CSR access helpers Atish Patra
2025-03-27 19:35 ` [PATCH v5 06/21] RISC-V: Add Smcntrpmf extension parsing Atish Patra
2025-03-27 19:35 ` [PATCH v5 07/21] dt-bindings: riscv: add Smcntrpmf ISA extension description Atish Patra
2025-03-27 19:35 ` [PATCH v5 08/21] RISC-V: Add Sscfg extension CSR definition Atish Patra
2025-03-27 19:35 ` [PATCH v5 09/21] RISC-V: Add Ssccfg/Smcdeleg ISA extension definition and parsing Atish Patra
2025-03-27 19:35 ` [PATCH v5 10/21] dt-bindings: riscv: add Counter delegation ISA extensions description Atish Patra
2025-03-31 15:38 ` Conor Dooley
2025-03-27 19:35 ` [PATCH v5 11/21] RISC-V: perf: Restructure the SBI PMU code Atish Patra
2025-04-04 13:49 ` Will Deacon
2025-04-23 0:02 ` Atish Patra
2025-03-27 19:35 ` [PATCH v5 12/21] RISC-V: perf: Modify the counter discovery mechanism Atish Patra
2025-03-27 19:35 ` [PATCH v5 13/21] RISC-V: perf: Add a mechanism to defined legacy event encoding Atish Patra
2025-03-27 19:35 ` [PATCH v5 14/21] RISC-V: perf: Implement supervisor counter delegation support Atish Patra
2025-08-28 9:56 ` [External] " yunhui cui
2025-03-27 19:35 ` [PATCH v5 15/21] RISC-V: perf: Skip PMU SBI extension when not implemented Atish Patra
2025-03-27 19:35 ` [PATCH v5 16/21] RISC-V: perf: Use config2/vendor table for event to counter mapping Atish Patra
2025-03-27 19:35 ` [PATCH v5 17/21] RISC-V: perf: Add legacy event encodings via sysfs Atish Patra
2025-03-27 19:35 ` [PATCH v5 18/21] RISC-V: perf: Add Qemu virt machine events Atish Patra
2025-03-27 19:36 ` [PATCH v5 19/21] tools/perf: Support event code for arch standard events Atish Patra
2025-03-27 19:36 ` [PATCH v5 20/21] tools/perf: Pass the Counter constraint values in the pmu events Atish Patra
2025-04-23 0:17 ` Atish Patra
2025-03-27 19:36 ` [PATCH v5 21/21] Sync empty-pmu-events.c with autogenerated one Atish Patra