All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/1] lib: pmu: allow to use the highest available counter
@ 2022-06-23 14:13 Sergey Matyukevich
  2022-06-23 15:30 ` Anup Patel
  0 siblings, 1 reply; 8+ messages in thread
From: Sergey Matyukevich @ 2022-06-23 14:13 UTC (permalink / raw)
  To: opensbi

Both OpenSBI and OS explicitly assume that there is no hardware counter
with index 1: hardware uses that bit for TM control. So OpenSBI filters
out that index in sanity checks. However the range sanity checks do not
treat that index in a special way. As a result, OpenSBI does not allow
to use the firmware counter with the highest index. This change modifies
range checks to allow access to the highest index firmware counter.

The simple test is to make sure that there is no counter multiplexing
in the following command:

$ perf stat -e \
	r8000000000000000,r8000000000000001,r8000000000000002,r8000000000000003, \
	r8000000000000004,r8000000000000005,r8000000000000006,r8000000000000007, \
	r8000000000000008,r8000000000000009,r800000000000000a,r800000000000000b, \
	r800000000000000c,r800000000000000d,r800000000000000e,r800000000000000f  \
	ls

Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>

---

Note that perf RISC-V SBI Linux driver is adapted for the current OpenSBI
behavior: it passes counters mask with exluded highest valid index. So the
accompanying fix is also required for Linux, see the patches posted to
the risc-v mailing list:

https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c

 lib/sbi/sbi_pmu.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
index 3ecf536..b386d33 100644
--- a/lib/sbi/sbi_pmu.c
+++ b/lib/sbi/sbi_pmu.c
@@ -115,7 +115,7 @@ static int pmu_ctr_validate(uint32_t cidx, uint32_t *event_idx_code)
 
 	event_idx_val = active_events[hartid][cidx];
 
-	if (cidx >= total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
+	if (cidx > total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
 		return SBI_EINVAL;
 
 	event_idx_type = get_cidx_type(event_idx_val);
@@ -347,13 +347,13 @@ int sbi_pmu_ctr_start(unsigned long cbase, unsigned long cmask,
 	int ret = SBI_EINVAL;
 	bool bUpdate = FALSE;
 
-	if (sbi_fls(ctr_mask) >= total_ctrs)
+	if (sbi_fls(ctr_mask) > total_ctrs)
 		return ret;
 
 	if (flags & SBI_PMU_START_FLAG_SET_INIT_VALUE)
 		bUpdate = TRUE;
 
-	for_each_set_bit_from(cbase, &ctr_mask, total_ctrs) {
+	for_each_set_bit_from(cbase, &ctr_mask, total_ctrs + 1) {
 		event_idx_type = pmu_ctr_validate(cbase, &event_code);
 		if (event_idx_type < 0)
 			/* Continue the start operation for other counters */
@@ -423,10 +423,10 @@ int sbi_pmu_ctr_stop(unsigned long cbase, unsigned long cmask,
 	uint32_t event_code;
 	unsigned long ctr_mask = cmask << cbase;
 
-	if (sbi_fls(ctr_mask) >= total_ctrs)
+	if (sbi_fls(ctr_mask) > total_ctrs)
 		return SBI_EINVAL;
 
-	for_each_set_bit_from(cbase, &ctr_mask, total_ctrs) {
+	for_each_set_bit_from(cbase, &ctr_mask, total_ctrs + 1) {
 		event_idx_type = pmu_ctr_validate(cbase, &event_code);
 		if (event_idx_type < 0)
 			/* Continue the stop operation for other counters */
@@ -598,7 +598,7 @@ static int pmu_ctr_find_fw(unsigned long cbase, unsigned long cmask, u32 hartid)
 	else
 		fw_base = cbase;
 
-	for (i = fw_base; i < total_ctrs; i++)
+	for (i = fw_base; i <= total_ctrs; i++)
 		if ((active_events[hartid][i] == SBI_PMU_EVENT_IDX_INVALID) &&
 		    ((1UL << i) & ctr_mask))
 			return i;
@@ -618,7 +618,7 @@ int sbi_pmu_ctr_cfg_match(unsigned long cidx_base, unsigned long cidx_mask,
 	unsigned long tmp = cidx_mask << cidx_base;
 
 	/* Do a basic sanity check of counter base & mask */
-	if (sbi_fls(tmp) >= total_ctrs || event_type >= SBI_PMU_EVENT_TYPE_MAX)
+	if (sbi_fls(tmp) > total_ctrs || event_type >= SBI_PMU_EVENT_TYPE_MAX)
 		return SBI_EINVAL;
 
 	if (flags & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
@@ -690,7 +690,7 @@ int sbi_pmu_ctr_get_info(uint32_t cidx, unsigned long *ctr_info)
 	struct sbi_scratch *scratch = sbi_scratch_thishart_ptr();
 
 	/* Sanity check. Counter1 is not mapped at all */
-	if (cidx >= total_ctrs || cidx == 1)
+	if (cidx > total_ctrs || cidx == 1)
 		return SBI_EINVAL;
 
 	/* We have 31 HW counters with 31 being the last index(MHPMCOUNTER31) */
@@ -719,7 +719,7 @@ static void pmu_reset_event_map(u32 hartid)
 	int j;
 
 	/* Initialize the counter to event mapping table */
-	for (j = 3; j < total_ctrs; j++)
+	for (j = 3; j <= total_ctrs; j++)
 		active_events[hartid][j] = SBI_PMU_EVENT_IDX_INVALID;
 	for (j = 0; j < SBI_PMU_FW_CTR_MAX; j++)
 		sbi_memset(&fw_event_map[hartid][j], 0,
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 1/1] lib: pmu: allow to use the highest available counter
  2022-06-23 14:13 [PATCH 1/1] lib: pmu: allow to use the highest available counter Sergey Matyukevich
@ 2022-06-23 15:30 ` Anup Patel
  2022-06-23 17:59   ` Atish Patra
  0 siblings, 1 reply; 8+ messages in thread
From: Anup Patel @ 2022-06-23 15:30 UTC (permalink / raw)
  To: opensbi

On Thu, Jun 23, 2022 at 7:43 PM Sergey Matyukevich <geomatsi@gmail.com> wrote:
>
> Both OpenSBI and OS explicitly assume that there is no hardware counter
> with index 1: hardware uses that bit for TM control. So OpenSBI filters
> out that index in sanity checks. However the range sanity checks do not
> treat that index in a special way. As a result, OpenSBI does not allow
> to use the firmware counter with the highest index. This change modifies
> range checks to allow access to the highest index firmware counter.
>
> The simple test is to make sure that there is no counter multiplexing
> in the following command:
>
> $ perf stat -e \
>         r8000000000000000,r8000000000000001,r8000000000000002,r8000000000000003, \
>         r8000000000000004,r8000000000000005,r8000000000000006,r8000000000000007, \
>         r8000000000000008,r8000000000000009,r800000000000000a,r800000000000000b, \
>         r800000000000000c,r800000000000000d,r800000000000000e,r800000000000000f  \
>         ls
>
> Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
>
> ---
>
> Note that perf RISC-V SBI Linux driver is adapted for the current OpenSBI
> behavior: it passes counters mask with exluded highest valid index. So the
> accompanying fix is also required for Linux, see the patches posted to
> the risc-v mailing list:
>
> https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
>
>  lib/sbi/sbi_pmu.c | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
> index 3ecf536..b386d33 100644
> --- a/lib/sbi/sbi_pmu.c
> +++ b/lib/sbi/sbi_pmu.c
> @@ -115,7 +115,7 @@ static int pmu_ctr_validate(uint32_t cidx, uint32_t *event_idx_code)
>
>         event_idx_val = active_events[hartid][cidx];
>
> -       if (cidx >= total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> +       if (cidx > total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))

Instead of changing all comparisons of total_ctrs, why not set
"num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 3;" in sbi_pmu_init() ?

Regards,
Anup

>                 return SBI_EINVAL;
>
>         event_idx_type = get_cidx_type(event_idx_val);
> @@ -347,13 +347,13 @@ int sbi_pmu_ctr_start(unsigned long cbase, unsigned long cmask,
>         int ret = SBI_EINVAL;
>         bool bUpdate = FALSE;
>
> -       if (sbi_fls(ctr_mask) >= total_ctrs)
> +       if (sbi_fls(ctr_mask) > total_ctrs)
>                 return ret;
>
>         if (flags & SBI_PMU_START_FLAG_SET_INIT_VALUE)
>                 bUpdate = TRUE;
>
> -       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs) {
> +       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs + 1) {
>                 event_idx_type = pmu_ctr_validate(cbase, &event_code);
>                 if (event_idx_type < 0)
>                         /* Continue the start operation for other counters */
> @@ -423,10 +423,10 @@ int sbi_pmu_ctr_stop(unsigned long cbase, unsigned long cmask,
>         uint32_t event_code;
>         unsigned long ctr_mask = cmask << cbase;
>
> -       if (sbi_fls(ctr_mask) >= total_ctrs)
> +       if (sbi_fls(ctr_mask) > total_ctrs)
>                 return SBI_EINVAL;
>
> -       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs) {
> +       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs + 1) {
>                 event_idx_type = pmu_ctr_validate(cbase, &event_code);
>                 if (event_idx_type < 0)
>                         /* Continue the stop operation for other counters */
> @@ -598,7 +598,7 @@ static int pmu_ctr_find_fw(unsigned long cbase, unsigned long cmask, u32 hartid)
>         else
>                 fw_base = cbase;
>
> -       for (i = fw_base; i < total_ctrs; i++)
> +       for (i = fw_base; i <= total_ctrs; i++)
>                 if ((active_events[hartid][i] == SBI_PMU_EVENT_IDX_INVALID) &&
>                     ((1UL << i) & ctr_mask))
>                         return i;
> @@ -618,7 +618,7 @@ int sbi_pmu_ctr_cfg_match(unsigned long cidx_base, unsigned long cidx_mask,
>         unsigned long tmp = cidx_mask << cidx_base;
>
>         /* Do a basic sanity check of counter base & mask */
> -       if (sbi_fls(tmp) >= total_ctrs || event_type >= SBI_PMU_EVENT_TYPE_MAX)
> +       if (sbi_fls(tmp) > total_ctrs || event_type >= SBI_PMU_EVENT_TYPE_MAX)
>                 return SBI_EINVAL;
>
>         if (flags & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
> @@ -690,7 +690,7 @@ int sbi_pmu_ctr_get_info(uint32_t cidx, unsigned long *ctr_info)
>         struct sbi_scratch *scratch = sbi_scratch_thishart_ptr();
>
>         /* Sanity check. Counter1 is not mapped at all */
> -       if (cidx >= total_ctrs || cidx == 1)
> +       if (cidx > total_ctrs || cidx == 1)
>                 return SBI_EINVAL;
>
>         /* We have 31 HW counters with 31 being the last index(MHPMCOUNTER31) */
> @@ -719,7 +719,7 @@ static void pmu_reset_event_map(u32 hartid)
>         int j;
>
>         /* Initialize the counter to event mapping table */
> -       for (j = 3; j < total_ctrs; j++)
> +       for (j = 3; j <= total_ctrs; j++)
>                 active_events[hartid][j] = SBI_PMU_EVENT_IDX_INVALID;
>         for (j = 0; j < SBI_PMU_FW_CTR_MAX; j++)
>                 sbi_memset(&fw_event_map[hartid][j], 0,
> --
> 2.36.1
>
>
> --
> opensbi mailing list
> opensbi at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/opensbi


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/1] lib: pmu: allow to use the highest available counter
  2022-06-23 15:30 ` Anup Patel
@ 2022-06-23 17:59   ` Atish Patra
  2022-06-24  3:40     ` Anup Patel
  2022-06-24  7:47     ` Sergey Matyukevich
  0 siblings, 2 replies; 8+ messages in thread
From: Atish Patra @ 2022-06-23 17:59 UTC (permalink / raw)
  To: opensbi

On Thu, Jun 23, 2022 at 8:31 AM Anup Patel <anup@brainfault.org> wrote:
>
> On Thu, Jun 23, 2022 at 7:43 PM Sergey Matyukevich <geomatsi@gmail.com> wrote:
> >
> > Both OpenSBI and OS explicitly assume that there is no hardware counter
> > with index 1: hardware uses that bit for TM control. So OpenSBI filters
> > out that index in sanity checks. However the range sanity checks do not
> > treat that index in a special way. As a result, OpenSBI does not allow
> > to use the firmware counter with the highest index. This change modifies
> > range checks to allow access to the highest index firmware counter.
> >
> > The simple test is to make sure that there is no counter multiplexing
> > in the following command:
> >
> > $ perf stat -e \
> >         r8000000000000000,r8000000000000001,r8000000000000002,r8000000000000003, \
> >         r8000000000000004,r8000000000000005,r8000000000000006,r8000000000000007, \
> >         r8000000000000008,r8000000000000009,r800000000000000a,r800000000000000b, \
> >         r800000000000000c,r800000000000000d,r800000000000000e,r800000000000000f  \
> >         ls
> >
> > Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
> >
> > ---
> >
> > Note that perf RISC-V SBI Linux driver is adapted for the current OpenSBI
> > behavior: it passes counters mask with exluded highest valid index. So the
> > accompanying fix is also required for Linux, see the patches posted to
> > the risc-v mailing list:
> >
> > https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
> >
> >  lib/sbi/sbi_pmu.c | 18 +++++++++---------
> >  1 file changed, 9 insertions(+), 9 deletions(-)
> >
> > diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
> > index 3ecf536..b386d33 100644
> > --- a/lib/sbi/sbi_pmu.c
> > +++ b/lib/sbi/sbi_pmu.c
> > @@ -115,7 +115,7 @@ static int pmu_ctr_validate(uint32_t cidx, uint32_t *event_idx_code)
> >
> >         event_idx_val = active_events[hartid][cidx];
> >
> > -       if (cidx >= total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> > +       if (cidx > total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
>
> Instead of changing all comparisons of total_ctrs, why not set
> "num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 3;" in sbi_pmu_init() ?
>

Yes. That is simpler. We need a few other fixes along with that as
well. Here is the diff

diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
index 3ecf536002a4..a532746c74be 100644
--- a/lib/sbi/sbi_pmu.c
+++ b/lib/sbi/sbi_pmu.c
@@ -594,7 +594,7 @@ static int pmu_ctr_find_fw(unsigned long cbase,
unsigned long cmask, u32 hartid)
        unsigned long ctr_mask = cmask << cbase;

        if (cbase <= num_hw_ctrs)
-               fw_base = num_hw_ctrs + 1;
+               fw_base = num_hw_ctrs;
        else
                fw_base = cbase;

@@ -694,7 +694,7 @@ int sbi_pmu_ctr_get_info(uint32_t cidx, unsigned
long *ctr_info)
                return SBI_EINVAL;

        /* We have 31 HW counters with 31 being the last index(MHPMCOUNTER31) */
-       if (cidx <= num_hw_ctrs) {
+       if (cidx < num_hw_ctrs) {
                cinfo.type = SBI_PMU_CTR_TYPE_HW;
                cinfo.csr = CSR_CYCLE + cidx;
                /* mcycle & minstret are always 64 bit */
@@ -749,7 +749,7 @@ int sbi_pmu_init(struct sbi_scratch *scratch, bool
cold_boot)
                sbi_platform_pmu_init(plat);

                /* mcycle & minstret is available always */
-               num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 2;
+               num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 3;
                total_ctrs = num_hw_ctrs + SBI_PMU_FW_CTR_MAX;
        }

As the counters are zero indexed, the firmware counter should start
from num_hw_ctrs
E.g. If a hardware has 8 programmable counters, num_hw_ctrs will be set to 11.
with counter value (0-10). Firmware counters should range from 11-26.

> Regards,
> Anup
>
> >                 return SBI_EINVAL;
> >
> >         event_idx_type = get_cidx_type(event_idx_val);
> > @@ -347,13 +347,13 @@ int sbi_pmu_ctr_start(unsigned long cbase, unsigned long cmask,
> >         int ret = SBI_EINVAL;
> >         bool bUpdate = FALSE;
> >
> > -       if (sbi_fls(ctr_mask) >= total_ctrs)
> > +       if (sbi_fls(ctr_mask) > total_ctrs)
> >                 return ret;
> >
> >         if (flags & SBI_PMU_START_FLAG_SET_INIT_VALUE)
> >                 bUpdate = TRUE;
> >
> > -       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs) {
> > +       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs + 1) {
> >                 event_idx_type = pmu_ctr_validate(cbase, &event_code);
> >                 if (event_idx_type < 0)
> >                         /* Continue the start operation for other counters */
> > @@ -423,10 +423,10 @@ int sbi_pmu_ctr_stop(unsigned long cbase, unsigned long cmask,
> >         uint32_t event_code;
> >         unsigned long ctr_mask = cmask << cbase;
> >
> > -       if (sbi_fls(ctr_mask) >= total_ctrs)
> > +       if (sbi_fls(ctr_mask) > total_ctrs)
> >                 return SBI_EINVAL;
> >
> > -       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs) {
> > +       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs + 1) {
> >                 event_idx_type = pmu_ctr_validate(cbase, &event_code);
> >                 if (event_idx_type < 0)
> >                         /* Continue the stop operation for other counters */
> > @@ -598,7 +598,7 @@ static int pmu_ctr_find_fw(unsigned long cbase, unsigned long cmask, u32 hartid)
> >         else
> >                 fw_base = cbase;
> >
> > -       for (i = fw_base; i < total_ctrs; i++)
> > +       for (i = fw_base; i <= total_ctrs; i++)
> >                 if ((active_events[hartid][i] == SBI_PMU_EVENT_IDX_INVALID) &&
> >                     ((1UL << i) & ctr_mask))
> >                         return i;
> > @@ -618,7 +618,7 @@ int sbi_pmu_ctr_cfg_match(unsigned long cidx_base, unsigned long cidx_mask,
> >         unsigned long tmp = cidx_mask << cidx_base;
> >
> >         /* Do a basic sanity check of counter base & mask */
> > -       if (sbi_fls(tmp) >= total_ctrs || event_type >= SBI_PMU_EVENT_TYPE_MAX)
> > +       if (sbi_fls(tmp) > total_ctrs || event_type >= SBI_PMU_EVENT_TYPE_MAX)
> >                 return SBI_EINVAL;
> >
> >         if (flags & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
> > @@ -690,7 +690,7 @@ int sbi_pmu_ctr_get_info(uint32_t cidx, unsigned long *ctr_info)
> >         struct sbi_scratch *scratch = sbi_scratch_thishart_ptr();
> >
> >         /* Sanity check. Counter1 is not mapped at all */
> > -       if (cidx >= total_ctrs || cidx == 1)
> > +       if (cidx > total_ctrs || cidx == 1)
> >                 return SBI_EINVAL;
> >
> >         /* We have 31 HW counters with 31 being the last index(MHPMCOUNTER31) */
> > @@ -719,7 +719,7 @@ static void pmu_reset_event_map(u32 hartid)
> >         int j;
> >
> >         /* Initialize the counter to event mapping table */
> > -       for (j = 3; j < total_ctrs; j++)
> > +       for (j = 3; j <= total_ctrs; j++)
> >                 active_events[hartid][j] = SBI_PMU_EVENT_IDX_INVALID;
> >         for (j = 0; j < SBI_PMU_FW_CTR_MAX; j++)
> >                 sbi_memset(&fw_event_map[hartid][j], 0,
> > --
> > 2.36.1
> >
> >
> > --
> > opensbi mailing list
> > opensbi at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/opensbi
>
> --
> opensbi mailing list
> opensbi at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/opensbi



-- 
Regards,
Atish


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 1/1] lib: pmu: allow to use the highest available counter
  2022-06-23 17:59   ` Atish Patra
@ 2022-06-24  3:40     ` Anup Patel
  2022-06-24  7:47     ` Sergey Matyukevich
  1 sibling, 0 replies; 8+ messages in thread
From: Anup Patel @ 2022-06-24  3:40 UTC (permalink / raw)
  To: opensbi

Hi Atish and Sergey,

On Thu, Jun 23, 2022 at 11:29 PM Atish Patra <atishp@atishpatra.org> wrote:
>
> On Thu, Jun 23, 2022 at 8:31 AM Anup Patel <anup@brainfault.org> wrote:
> >
> > On Thu, Jun 23, 2022 at 7:43 PM Sergey Matyukevich <geomatsi@gmail.com> wrote:
> > >
> > > Both OpenSBI and OS explicitly assume that there is no hardware counter
> > > with index 1: hardware uses that bit for TM control. So OpenSBI filters
> > > out that index in sanity checks. However the range sanity checks do not
> > > treat that index in a special way. As a result, OpenSBI does not allow
> > > to use the firmware counter with the highest index. This change modifies
> > > range checks to allow access to the highest index firmware counter.
> > >
> > > The simple test is to make sure that there is no counter multiplexing
> > > in the following command:
> > >
> > > $ perf stat -e \
> > >         r8000000000000000,r8000000000000001,r8000000000000002,r8000000000000003, \
> > >         r8000000000000004,r8000000000000005,r8000000000000006,r8000000000000007, \
> > >         r8000000000000008,r8000000000000009,r800000000000000a,r800000000000000b, \
> > >         r800000000000000c,r800000000000000d,r800000000000000e,r800000000000000f  \
> > >         ls
> > >
> > > Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
> > >
> > > ---
> > >
> > > Note that perf RISC-V SBI Linux driver is adapted for the current OpenSBI
> > > behavior: it passes counters mask with exluded highest valid index. So the
> > > accompanying fix is also required for Linux, see the patches posted to
> > > the risc-v mailing list:
> > >
> > > https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
> > >
> > >  lib/sbi/sbi_pmu.c | 18 +++++++++---------
> > >  1 file changed, 9 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
> > > index 3ecf536..b386d33 100644
> > > --- a/lib/sbi/sbi_pmu.c
> > > +++ b/lib/sbi/sbi_pmu.c
> > > @@ -115,7 +115,7 @@ static int pmu_ctr_validate(uint32_t cidx, uint32_t *event_idx_code)
> > >
> > >         event_idx_val = active_events[hartid][cidx];
> > >
> > > -       if (cidx >= total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> > > +       if (cidx > total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> >
> > Instead of changing all comparisons of total_ctrs, why not set
> > "num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 3;" in sbi_pmu_init() ?
> >
>
> Yes. That is simpler. We need a few other fixes along with that as
> well. Here is the diff
>
> diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
> index 3ecf536002a4..a532746c74be 100644
> --- a/lib/sbi/sbi_pmu.c
> +++ b/lib/sbi/sbi_pmu.c
> @@ -594,7 +594,7 @@ static int pmu_ctr_find_fw(unsigned long cbase,
> unsigned long cmask, u32 hartid)
>         unsigned long ctr_mask = cmask << cbase;
>
>         if (cbase <= num_hw_ctrs)
> -               fw_base = num_hw_ctrs + 1;
> +               fw_base = num_hw_ctrs;
>         else
>                 fw_base = cbase;
>
> @@ -694,7 +694,7 @@ int sbi_pmu_ctr_get_info(uint32_t cidx, unsigned
> long *ctr_info)
>                 return SBI_EINVAL;
>
>         /* We have 31 HW counters with 31 being the last index(MHPMCOUNTER31) */
> -       if (cidx <= num_hw_ctrs) {
> +       if (cidx < num_hw_ctrs) {
>                 cinfo.type = SBI_PMU_CTR_TYPE_HW;
>                 cinfo.csr = CSR_CYCLE + cidx;
>                 /* mcycle & minstret are always 64 bit */
> @@ -749,7 +749,7 @@ int sbi_pmu_init(struct sbi_scratch *scratch, bool
> cold_boot)
>                 sbi_platform_pmu_init(plat);
>
>                 /* mcycle & minstret is available always */
> -               num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 2;
> +               num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 3;
>                 total_ctrs = num_hw_ctrs + SBI_PMU_FW_CTR_MAX;
>         }
>
> As the counters are zero indexed, the firmware counter should start
> from num_hw_ctrs
> E.g. If a hardware has 8 programmable counters, num_hw_ctrs will be set to 11.
> with counter value (0-10). Firmware counters should range from 11-26.

Can one of you quickly send a v2 (preferably today) so that I can include it for
OpenSBI v1.1 release ?

Regards,
Anup

>
> > Regards,
> > Anup
> >
> > >                 return SBI_EINVAL;
> > >
> > >         event_idx_type = get_cidx_type(event_idx_val);
> > > @@ -347,13 +347,13 @@ int sbi_pmu_ctr_start(unsigned long cbase, unsigned long cmask,
> > >         int ret = SBI_EINVAL;
> > >         bool bUpdate = FALSE;
> > >
> > > -       if (sbi_fls(ctr_mask) >= total_ctrs)
> > > +       if (sbi_fls(ctr_mask) > total_ctrs)
> > >                 return ret;
> > >
> > >         if (flags & SBI_PMU_START_FLAG_SET_INIT_VALUE)
> > >                 bUpdate = TRUE;
> > >
> > > -       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs) {
> > > +       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs + 1) {
> > >                 event_idx_type = pmu_ctr_validate(cbase, &event_code);
> > >                 if (event_idx_type < 0)
> > >                         /* Continue the start operation for other counters */
> > > @@ -423,10 +423,10 @@ int sbi_pmu_ctr_stop(unsigned long cbase, unsigned long cmask,
> > >         uint32_t event_code;
> > >         unsigned long ctr_mask = cmask << cbase;
> > >
> > > -       if (sbi_fls(ctr_mask) >= total_ctrs)
> > > +       if (sbi_fls(ctr_mask) > total_ctrs)
> > >                 return SBI_EINVAL;
> > >
> > > -       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs) {
> > > +       for_each_set_bit_from(cbase, &ctr_mask, total_ctrs + 1) {
> > >                 event_idx_type = pmu_ctr_validate(cbase, &event_code);
> > >                 if (event_idx_type < 0)
> > >                         /* Continue the stop operation for other counters */
> > > @@ -598,7 +598,7 @@ static int pmu_ctr_find_fw(unsigned long cbase, unsigned long cmask, u32 hartid)
> > >         else
> > >                 fw_base = cbase;
> > >
> > > -       for (i = fw_base; i < total_ctrs; i++)
> > > +       for (i = fw_base; i <= total_ctrs; i++)
> > >                 if ((active_events[hartid][i] == SBI_PMU_EVENT_IDX_INVALID) &&
> > >                     ((1UL << i) & ctr_mask))
> > >                         return i;
> > > @@ -618,7 +618,7 @@ int sbi_pmu_ctr_cfg_match(unsigned long cidx_base, unsigned long cidx_mask,
> > >         unsigned long tmp = cidx_mask << cidx_base;
> > >
> > >         /* Do a basic sanity check of counter base & mask */
> > > -       if (sbi_fls(tmp) >= total_ctrs || event_type >= SBI_PMU_EVENT_TYPE_MAX)
> > > +       if (sbi_fls(tmp) > total_ctrs || event_type >= SBI_PMU_EVENT_TYPE_MAX)
> > >                 return SBI_EINVAL;
> > >
> > >         if (flags & SBI_PMU_CFG_FLAG_SKIP_MATCH) {
> > > @@ -690,7 +690,7 @@ int sbi_pmu_ctr_get_info(uint32_t cidx, unsigned long *ctr_info)
> > >         struct sbi_scratch *scratch = sbi_scratch_thishart_ptr();
> > >
> > >         /* Sanity check. Counter1 is not mapped at all */
> > > -       if (cidx >= total_ctrs || cidx == 1)
> > > +       if (cidx > total_ctrs || cidx == 1)
> > >                 return SBI_EINVAL;
> > >
> > >         /* We have 31 HW counters with 31 being the last index(MHPMCOUNTER31) */
> > > @@ -719,7 +719,7 @@ static void pmu_reset_event_map(u32 hartid)
> > >         int j;
> > >
> > >         /* Initialize the counter to event mapping table */
> > > -       for (j = 3; j < total_ctrs; j++)
> > > +       for (j = 3; j <= total_ctrs; j++)
> > >                 active_events[hartid][j] = SBI_PMU_EVENT_IDX_INVALID;
> > >         for (j = 0; j < SBI_PMU_FW_CTR_MAX; j++)
> > >                 sbi_memset(&fw_event_map[hartid][j], 0,
> > > --
> > > 2.36.1
> > >
> > >
> > > --
> > > opensbi mailing list
> > > opensbi at lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/opensbi
> >
> > --
> > opensbi mailing list
> > opensbi at lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/opensbi
>
>
>
> --
> Regards,
> Atish


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/1] lib: pmu: allow to use the highest available counter
  2022-06-23 17:59   ` Atish Patra
  2022-06-24  3:40     ` Anup Patel
@ 2022-06-24  7:47     ` Sergey Matyukevich
  2022-06-24  8:44       ` Anup Patel
  1 sibling, 1 reply; 8+ messages in thread
From: Sergey Matyukevich @ 2022-06-24  7:47 UTC (permalink / raw)
  To: opensbi

Hello,

> > > Both OpenSBI and OS explicitly assume that there is no hardware counter
> > > with index 1: hardware uses that bit for TM control. So OpenSBI filters
> > > out that index in sanity checks. However the range sanity checks do not
> > > treat that index in a special way. As a result, OpenSBI does not allow
> > > to use the firmware counter with the highest index. This change modifies
> > > range checks to allow access to the highest index firmware counter.
> > >
> > > The simple test is to make sure that there is no counter multiplexing
> > > in the following command:
> > >
> > > $ perf stat -e \
> > >         r8000000000000000,r8000000000000001,r8000000000000002,r8000000000000003, \
> > >         r8000000000000004,r8000000000000005,r8000000000000006,r8000000000000007, \
> > >         r8000000000000008,r8000000000000009,r800000000000000a,r800000000000000b, \
> > >         r800000000000000c,r800000000000000d,r800000000000000e,r800000000000000f  \
> > >         ls
> > >
> > > Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
> > >
> > > ---
> > >
> > > Note that perf RISC-V SBI Linux driver is adapted for the current OpenSBI
> > > behavior: it passes counters mask with exluded highest valid index. So the
> > > accompanying fix is also required for Linux, see the patches posted to
> > > the risc-v mailing list:
> > >
> > > https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
> > >
> > >  lib/sbi/sbi_pmu.c | 18 +++++++++---------
> > >  1 file changed, 9 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
> > > index 3ecf536..b386d33 100644
> > > --- a/lib/sbi/sbi_pmu.c
> > > +++ b/lib/sbi/sbi_pmu.c
> > > @@ -115,7 +115,7 @@ static int pmu_ctr_validate(uint32_t cidx, uint32_t *event_idx_code)
> > >
> > >         event_idx_val = active_events[hartid][cidx];
> > >
> > > -       if (cidx >= total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> > > +       if (cidx > total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> >
> > Instead of changing all comparisons of total_ctrs, why not set
> > "num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 3;" in sbi_pmu_init() ?
> >
> 
> Yes. That is simpler. We need a few other fixes along with that as
> well. Here is the diff

Indeed, this is simpler. But as I mentioned in my reply to v2, this way
you report incorrect number of hw counters to kernel just to get the
correct mask of counters in response. And that behavior will have to
be followed by all the other SBI implementations. 

Current OpenSBI implementation implies the only gap in counters: index 1.
So my approach in v1 version of this patch was as follows:
- send the accurate number of counters to the kernel (hw + fw)
- fix range checks to take into account gap for index 1
This way OpenSBI just sends correct information to the kernel and
expects (and verifies) correct data in response. Accompanying kernel
fixes are the first two patches in the kernel series, see:
https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c

The 3rd kernel patch switches from array to IDR. With that change kernel
driver will support any gaps in counters, not only index 1. So OpenSBI
or any other SBI implementation will be able to exclude some hw or fw
counters if needed without changing kernel driver. So far I can see
two possible examples where it could be useful:
- platform specific quirks in OpenSBI to exclude non-functional counters
- keep some hw counters for internal use in OpenSBI or any other SBI implementation

Regards,
Sergey


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/1] lib: pmu: allow to use the highest available counter
  2022-06-24  7:47     ` Sergey Matyukevich
@ 2022-06-24  8:44       ` Anup Patel
  2022-06-24 10:01         ` Sergey Matyukevich
  0 siblings, 1 reply; 8+ messages in thread
From: Anup Patel @ 2022-06-24  8:44 UTC (permalink / raw)
  To: opensbi

On Fri, Jun 24, 2022 at 1:18 PM Sergey Matyukevich <geomatsi@gmail.com> wrote:
>
> Hello,
>
> > > > Both OpenSBI and OS explicitly assume that there is no hardware counter
> > > > with index 1: hardware uses that bit for TM control. So OpenSBI filters
> > > > out that index in sanity checks. However the range sanity checks do not
> > > > treat that index in a special way. As a result, OpenSBI does not allow
> > > > to use the firmware counter with the highest index. This change modifies
> > > > range checks to allow access to the highest index firmware counter.
> > > >
> > > > The simple test is to make sure that there is no counter multiplexing
> > > > in the following command:
> > > >
> > > > $ perf stat -e \
> > > >         r8000000000000000,r8000000000000001,r8000000000000002,r8000000000000003, \
> > > >         r8000000000000004,r8000000000000005,r8000000000000006,r8000000000000007, \
> > > >         r8000000000000008,r8000000000000009,r800000000000000a,r800000000000000b, \
> > > >         r800000000000000c,r800000000000000d,r800000000000000e,r800000000000000f  \
> > > >         ls
> > > >
> > > > Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
> > > >
> > > > ---
> > > >
> > > > Note that perf RISC-V SBI Linux driver is adapted for the current OpenSBI
> > > > behavior: it passes counters mask with exluded highest valid index. So the
> > > > accompanying fix is also required for Linux, see the patches posted to
> > > > the risc-v mailing list:
> > > >
> > > > https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
> > > >
> > > >  lib/sbi/sbi_pmu.c | 18 +++++++++---------
> > > >  1 file changed, 9 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
> > > > index 3ecf536..b386d33 100644
> > > > --- a/lib/sbi/sbi_pmu.c
> > > > +++ b/lib/sbi/sbi_pmu.c
> > > > @@ -115,7 +115,7 @@ static int pmu_ctr_validate(uint32_t cidx, uint32_t *event_idx_code)
> > > >
> > > >         event_idx_val = active_events[hartid][cidx];
> > > >
> > > > -       if (cidx >= total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> > > > +       if (cidx > total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> > >
> > > Instead of changing all comparisons of total_ctrs, why not set
> > > "num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 3;" in sbi_pmu_init() ?
> > >
> >
> > Yes. That is simpler. We need a few other fixes along with that as
> > well. Here is the diff
>
> Indeed, this is simpler. But as I mentioned in my reply to v2, this way
> you report incorrect number of hw counters to kernel just to get the
> correct mask of counters in response. And that behavior will have to
> be followed by all the other SBI implementations.

I don't think this patch reports an incorrect number of counters because
the time CSR is indeed a counter which counts timer ticks.

Just like cycle and instret CSRs, the time CSR is also a fixed counter
which cannot count any other HW event.

The only special thing about time CSR is that as-per RISC-V privilege
specification the time CSR is always counting and there is no way to
halt/resume it because it is an alias of platform-level free-running
mtime counter.

It's totally up to the SBI implementation whether it wants to expose
fixed HW counters (cycle, time, and insret) as SBI PMU counters
or not. In fact, the SBI PMU specification defines counter_idx as
logical counter with no relation to HW counter numbering.

A perfectly valid SBI implementation is allowed to do various things:
1) Intermix HW counters and FW counters (e.g. counter_idx = 0 is
    FW counter but counter_idx = 1 is HW counter)
2) Make a particular HW counter unusable through SBI PMU due
    to some errata (e.g. If mhpmcounter3 is broken then platform
    will not set bit[3] in any of the counter-mask provided in the
   "riscv,pmu" DT node which is parsed by OpenSBI). The
   sbi_pmu_config_matching() will never select a counter_idx
   which is not part of any counter-mask provided by the platform.
3) Create on-demand mapping of counter_idx to HW counters
   because SBI implementation is doing trap-n-emulate of
   counters. This is particularly true for hypervisors like KVM
   when counters are shared between host and guest so the
   hypervisor will on-demand allocate a HW counter and map
   it to a logical counter_idx seen by the guest. (Note: same
   approach is taken by KVM ARM)

>
> Current OpenSBI implementation implies the only gap in counters: index 1.
> So my approach in v1 version of this patch was as follows:
> - send the accurate number of counters to the kernel (hw + fw)
> - fix range checks to take into account gap for index 1
> This way OpenSBI just sends correct information to the kernel and
> expects (and verifies) correct data in response. Accompanying kernel
> fixes are the first two patches in the kernel series, see:
> https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
>
> The 3rd kernel patch switches from array to IDR. With that change kernel
> driver will support any gaps in counters, not only index 1. So OpenSBI
> or any other SBI implementation will be able to exclude some hw or fw
> counters if needed without changing kernel driver. So far I can see
> two possible examples where it could be useful:
> - platform specific quirks in OpenSBI to exclude non-functional counters
> - keep some hw counters for internal use in OpenSBI or any other SBI implementation

I think there is some confusion with regards to counter_idx.

As-per SBI spec, the counter_idx is a logical number so if there are N
counters then 0 <= counter_idx < N. There are no gaps in counter_idx
numbering. Of course, an SBI implementation might intentionally never
configure a particular counter_idx because it is special or broken.

Regards,
Anup

>
> Regards,
> Sergey
>
> --
> opensbi mailing list
> opensbi at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/opensbi


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/1] lib: pmu: allow to use the highest available counter
  2022-06-24  8:44       ` Anup Patel
@ 2022-06-24 10:01         ` Sergey Matyukevich
  2022-06-24 11:30           ` Anup Patel
  0 siblings, 1 reply; 8+ messages in thread
From: Sergey Matyukevich @ 2022-06-24 10:01 UTC (permalink / raw)
  To: opensbi

Hi Anup,

> > > > > Both OpenSBI and OS explicitly assume that there is no hardware counter
> > > > > with index 1: hardware uses that bit for TM control. So OpenSBI filters
> > > > > out that index in sanity checks. However the range sanity checks do not
> > > > > treat that index in a special way. As a result, OpenSBI does not allow
> > > > > to use the firmware counter with the highest index. This change modifies
> > > > > range checks to allow access to the highest index firmware counter.
> > > > >
> > > > > The simple test is to make sure that there is no counter multiplexing
> > > > > in the following command:
> > > > >
> > > > > $ perf stat -e \
> > > > >         r8000000000000000,r8000000000000001,r8000000000000002,r8000000000000003, \
> > > > >         r8000000000000004,r8000000000000005,r8000000000000006,r8000000000000007, \
> > > > >         r8000000000000008,r8000000000000009,r800000000000000a,r800000000000000b, \
> > > > >         r800000000000000c,r800000000000000d,r800000000000000e,r800000000000000f  \
> > > > >         ls
> > > > >
> > > > > Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
> > > > >
> > > > > ---
> > > > >
> > > > > Note that perf RISC-V SBI Linux driver is adapted for the current OpenSBI
> > > > > behavior: it passes counters mask with exluded highest valid index. So the
> > > > > accompanying fix is also required for Linux, see the patches posted to
> > > > > the risc-v mailing list:
> > > > >
> > > > > https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
> > > > >
> > > > >  lib/sbi/sbi_pmu.c | 18 +++++++++---------
> > > > >  1 file changed, 9 insertions(+), 9 deletions(-)
> > > > >
> > > > > diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
> > > > > index 3ecf536..b386d33 100644
> > > > > --- a/lib/sbi/sbi_pmu.c
> > > > > +++ b/lib/sbi/sbi_pmu.c
> > > > > @@ -115,7 +115,7 @@ static int pmu_ctr_validate(uint32_t cidx, uint32_t *event_idx_code)
> > > > >
> > > > >         event_idx_val = active_events[hartid][cidx];
> > > > >
> > > > > -       if (cidx >= total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> > > > > +       if (cidx > total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> > > >
> > > > Instead of changing all comparisons of total_ctrs, why not set
> > > > "num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 3;" in sbi_pmu_init() ?
> > > >
> > >
> > > Yes. That is simpler. We need a few other fixes along with that as
> > > well. Here is the diff
> >
> > Indeed, this is simpler. But as I mentioned in my reply to v2, this way
> > you report incorrect number of hw counters to kernel just to get the
> > correct mask of counters in response. And that behavior will have to
> > be followed by all the other SBI implementations.
> 
> I don't think this patch reports an incorrect number of counters because
> the time CSR is indeed a counter which counts timer ticks.
> 
> Just like cycle and instret CSRs, the time CSR is also a fixed counter
> which cannot count any other HW event.
> 
> The only special thing about time CSR is that as-per RISC-V privilege
> specification the time CSR is always counting and there is no way to
> halt/resume it because it is an alias of platform-level free-running
> mtime counter.
> 
> It's totally up to the SBI implementation whether it wants to expose
> fixed HW counters (cycle, time, and insret) as SBI PMU counters
> or not. In fact, the SBI PMU specification defines counter_idx as
> logical counter with no relation to HW counter numbering.
> 
> A perfectly valid SBI implementation is allowed to do various things:
> 1) Intermix HW counters and FW counters (e.g. counter_idx = 0 is
>     FW counter but counter_idx = 1 is HW counter)
> 2) Make a particular HW counter unusable through SBI PMU due
>     to some errata (e.g. If mhpmcounter3 is broken then platform
>     will not set bit[3] in any of the counter-mask provided in the
>    "riscv,pmu" DT node which is parsed by OpenSBI). The
>    sbi_pmu_config_matching() will never select a counter_idx
>    which is not part of any counter-mask provided by the platform.
> 3) Create on-demand mapping of counter_idx to HW counters
>    because SBI implementation is doing trap-n-emulate of
>    counters. This is particularly true for hypervisors like KVM
>    when counters are shared between host and guest so the
>    hypervisor will on-demand allocate a HW counter and map
>    it to a logical counter_idx seen by the guest. (Note: same
>    approach is taken by KVM ARM)
> 
> >
> > Current OpenSBI implementation implies the only gap in counters: index 1.
> > So my approach in v1 version of this patch was as follows:
> > - send the accurate number of counters to the kernel (hw + fw)
> > - fix range checks to take into account gap for index 1
> > This way OpenSBI just sends correct information to the kernel and
> > expects (and verifies) correct data in response. Accompanying kernel
> > fixes are the first two patches in the kernel series, see:
> > https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
> >
> > The 3rd kernel patch switches from array to IDR. With that change kernel
> > driver will support any gaps in counters, not only index 1. So OpenSBI
> > or any other SBI implementation will be able to exclude some hw or fw
> > counters if needed without changing kernel driver. So far I can see
> > two possible examples where it could be useful:
> > - platform specific quirks in OpenSBI to exclude non-functional counters
> > - keep some hw counters for internal use in OpenSBI or any other SBI implementation
> 
> I think there is some confusion with regards to counter_idx.
> 
> As-per SBI spec, the counter_idx is a logical number so if there are N
> counters then 0 <= counter_idx < N. There are no gaps in counter_idx
> numbering. Of course, an SBI implementation might intentionally never
> configure a particular counter_idx because it is special or broken.

Thanks for clarifications! So, in brief, it is OK for valid SBI
implementation to do the following:

- to report _total_ number of counters (hw + fw), including special
  ones (like timer) or broken ones or reserved ones
- sbi_pmu_ctr_cfg_match should never send special/reserved/broken
  counters to OS

That makes sense to me. So v2 changes suggested by Atish look reasonable.
But I guess commit message needs to be clarified. 

I will shortly send you v3 version of the OpenSBI patch with updated
commit message. I will also send v2 of the kernel patches. 

Regards,
Sergey


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/1] lib: pmu: allow to use the highest available counter
  2022-06-24 10:01         ` Sergey Matyukevich
@ 2022-06-24 11:30           ` Anup Patel
  0 siblings, 0 replies; 8+ messages in thread
From: Anup Patel @ 2022-06-24 11:30 UTC (permalink / raw)
  To: opensbi

On Fri, Jun 24, 2022 at 3:31 PM Sergey Matyukevich <geomatsi@gmail.com> wrote:
>
> Hi Anup,
>
> > > > > > Both OpenSBI and OS explicitly assume that there is no hardware counter
> > > > > > with index 1: hardware uses that bit for TM control. So OpenSBI filters
> > > > > > out that index in sanity checks. However the range sanity checks do not
> > > > > > treat that index in a special way. As a result, OpenSBI does not allow
> > > > > > to use the firmware counter with the highest index. This change modifies
> > > > > > range checks to allow access to the highest index firmware counter.
> > > > > >
> > > > > > The simple test is to make sure that there is no counter multiplexing
> > > > > > in the following command:
> > > > > >
> > > > > > $ perf stat -e \
> > > > > >         r8000000000000000,r8000000000000001,r8000000000000002,r8000000000000003, \
> > > > > >         r8000000000000004,r8000000000000005,r8000000000000006,r8000000000000007, \
> > > > > >         r8000000000000008,r8000000000000009,r800000000000000a,r800000000000000b, \
> > > > > >         r800000000000000c,r800000000000000d,r800000000000000e,r800000000000000f  \
> > > > > >         ls
> > > > > >
> > > > > > Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
> > > > > >
> > > > > > ---
> > > > > >
> > > > > > Note that perf RISC-V SBI Linux driver is adapted for the current OpenSBI
> > > > > > behavior: it passes counters mask with exluded highest valid index. So the
> > > > > > accompanying fix is also required for Linux, see the patches posted to
> > > > > > the risc-v mailing list:
> > > > > >
> > > > > > https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
> > > > > >
> > > > > >  lib/sbi/sbi_pmu.c | 18 +++++++++---------
> > > > > >  1 file changed, 9 insertions(+), 9 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/sbi/sbi_pmu.c b/lib/sbi/sbi_pmu.c
> > > > > > index 3ecf536..b386d33 100644
> > > > > > --- a/lib/sbi/sbi_pmu.c
> > > > > > +++ b/lib/sbi/sbi_pmu.c
> > > > > > @@ -115,7 +115,7 @@ static int pmu_ctr_validate(uint32_t cidx, uint32_t *event_idx_code)
> > > > > >
> > > > > >         event_idx_val = active_events[hartid][cidx];
> > > > > >
> > > > > > -       if (cidx >= total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> > > > > > +       if (cidx > total_ctrs || (event_idx_val == SBI_PMU_EVENT_IDX_INVALID))
> > > > >
> > > > > Instead of changing all comparisons of total_ctrs, why not set
> > > > > "num_hw_ctrs = sbi_hart_mhpm_count(scratch) + 3;" in sbi_pmu_init() ?
> > > > >
> > > >
> > > > Yes. That is simpler. We need a few other fixes along with that as
> > > > well. Here is the diff
> > >
> > > Indeed, this is simpler. But as I mentioned in my reply to v2, this way
> > > you report incorrect number of hw counters to kernel just to get the
> > > correct mask of counters in response. And that behavior will have to
> > > be followed by all the other SBI implementations.
> >
> > I don't think this patch reports an incorrect number of counters because
> > the time CSR is indeed a counter which counts timer ticks.
> >
> > Just like cycle and instret CSRs, the time CSR is also a fixed counter
> > which cannot count any other HW event.
> >
> > The only special thing about time CSR is that as-per RISC-V privilege
> > specification the time CSR is always counting and there is no way to
> > halt/resume it because it is an alias of platform-level free-running
> > mtime counter.
> >
> > It's totally up to the SBI implementation whether it wants to expose
> > fixed HW counters (cycle, time, and insret) as SBI PMU counters
> > or not. In fact, the SBI PMU specification defines counter_idx as
> > logical counter with no relation to HW counter numbering.
> >
> > A perfectly valid SBI implementation is allowed to do various things:
> > 1) Intermix HW counters and FW counters (e.g. counter_idx = 0 is
> >     FW counter but counter_idx = 1 is HW counter)
> > 2) Make a particular HW counter unusable through SBI PMU due
> >     to some errata (e.g. If mhpmcounter3 is broken then platform
> >     will not set bit[3] in any of the counter-mask provided in the
> >    "riscv,pmu" DT node which is parsed by OpenSBI). The
> >    sbi_pmu_config_matching() will never select a counter_idx
> >    which is not part of any counter-mask provided by the platform.
> > 3) Create on-demand mapping of counter_idx to HW counters
> >    because SBI implementation is doing trap-n-emulate of
> >    counters. This is particularly true for hypervisors like KVM
> >    when counters are shared between host and guest so the
> >    hypervisor will on-demand allocate a HW counter and map
> >    it to a logical counter_idx seen by the guest. (Note: same
> >    approach is taken by KVM ARM)
> >
> > >
> > > Current OpenSBI implementation implies the only gap in counters: index 1.
> > > So my approach in v1 version of this patch was as follows:
> > > - send the accurate number of counters to the kernel (hw + fw)
> > > - fix range checks to take into account gap for index 1
> > > This way OpenSBI just sends correct information to the kernel and
> > > expects (and verifies) correct data in response. Accompanying kernel
> > > fixes are the first two patches in the kernel series, see:
> > > https://lore.kernel.org/linux-riscv/20220623112735.357093-1-geomatsi at gmail.com/T/#m4811c83cf767cdaaeb31e2530d4a51034fcf6c3c
> > >
> > > The 3rd kernel patch switches from array to IDR. With that change kernel
> > > driver will support any gaps in counters, not only index 1. So OpenSBI
> > > or any other SBI implementation will be able to exclude some hw or fw
> > > counters if needed without changing kernel driver. So far I can see
> > > two possible examples where it could be useful:
> > > - platform specific quirks in OpenSBI to exclude non-functional counters
> > > - keep some hw counters for internal use in OpenSBI or any other SBI implementation
> >
> > I think there is some confusion with regards to counter_idx.
> >
> > As-per SBI spec, the counter_idx is a logical number so if there are N
> > counters then 0 <= counter_idx < N. There are no gaps in counter_idx
> > numbering. Of course, an SBI implementation might intentionally never
> > configure a particular counter_idx because it is special or broken.
>
> Thanks for clarifications! So, in brief, it is OK for valid SBI
> implementation to do the following:
>
> - to report _total_ number of counters (hw + fw), including special
>   ones (like timer) or broken ones or reserved ones

Yes, as long as SBI implementation controls what counters can be
configured successfully.

> - sbi_pmu_ctr_cfg_match should never send special/reserved/broken
>   counters to OS

Yes, sbi_pmu_counter_config_matching() should:
1) Never select broken or reserved or special counters
2) Select fixed counters (such as cycle or insret) only for specific PMU event

>
> That makes sense to me. So v2 changes suggested by Atish look reasonable.
> But I guess commit message needs to be clarified.
>
> I will shortly send you v3 version of the OpenSBI patch with updated
> commit message. I will also send v2 of the kernel patches.

Sure, thanks.

Also, thanks for catching this issue.

>
> Regards,
> Sergey

Regards,
Anup


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-06-24 11:30 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-06-23 14:13 [PATCH 1/1] lib: pmu: allow to use the highest available counter Sergey Matyukevich
2022-06-23 15:30 ` Anup Patel
2022-06-23 17:59   ` Atish Patra
2022-06-24  3:40     ` Anup Patel
2022-06-24  7:47     ` Sergey Matyukevich
2022-06-24  8:44       ` Anup Patel
2022-06-24 10:01         ` Sergey Matyukevich
2022-06-24 11:30           ` Anup Patel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.