public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] Updates to mlxbf-pmc
@ 2024-02-09  8:39 Shravan Kumar Ramani
  0 siblings, 0 replies; 14+ messages in thread
From: Shravan Kumar Ramani @ 2024-02-09  8:39 UTC (permalink / raw)
  To: Hans de Goede, Ilpo Jarvinen, Vadim Pasternak, David Thompson
  Cc: Shravan Kumar Ramani, platform-driver-x86, linux-kernel

This submission contains 4 patches:
(N) Patch 1 replaces all uintN_t usage with kernel-style types
(N) Patch 2 resolves signed/unsigned int mix-up
Patch 3 adds support for 64-bit counters and tracking cycle count
Patch 4 adds support for the clock_measure performance block

v1 -> v2
Added 2 new patches to address generic issues
Replaced all uintN usage in the driver
Fixed signed/unsigned mix-up and replaced identifiers accordingly
Replaced kstrtoint with kstrtouint as applicable
Retained devm_kasprintf usage since other instances require dynamic allocation

Shravan Kumar Ramani (4):
  platform/mellanox: mlxbf-pmc: Replace uintN_t with kernel-style types
  platform/mellanox: mlxbf-pmc: Fix signed/unsigned mix-up
  platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and
    cycle count
  platform/mellanox: mlxbf-pmc: Add support for clock_measure
    performance block

 drivers/platform/mellanox/mlxbf-pmc.c | 381 +++++++++++++++++++-------
 1 file changed, 278 insertions(+), 103 deletions(-)

-- 
2.30.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 0/4] Updates to mlxbf-pmc
@ 2024-05-20 11:56 Shravan Kumar Ramani
  2024-05-20 11:56 ` [PATCH v2 1/4] Documentation/ABI: Add document for Mellanox PMC driver Shravan Kumar Ramani
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Shravan Kumar Ramani @ 2024-05-20 11:56 UTC (permalink / raw)
  To: Hans de Goede, Ilpo Jarvinen, Vadim Pasternak, David Thompson
  Cc: Shravan Kumar Ramani, platform-driver-x86, linux-kernel

This submission contains 4 patches relating to mlxbf-pmc.

Patch 1 adds documentation for the sysfs files created by the driver.
Patches 2 and 3 add specific functionality to the driver for supporting
64-bit ocunters, cycle count and clock_measure performance block.
Patch 4 adds documentation for the newly added sysfs entries.

v1 -> v2
Added patch 4 to document sysfs entries added in patches 2 and 3.

Shravan Kumar Ramani (4):
  Documentation/ABI: Add document for Mellanox PMC driver
  platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and
    cycle count
  platform/mellanox: mlxbf-pmc: Add support for clock_measure
    performance block
  Documentation/ABI: Add new sysfs fields to sysfs-platform-mellanox-pmc

 .../ABI/testing/sysfs-platform-mellanox-pmc   |  65 +++++++
 drivers/platform/mellanox/mlxbf-pmc.c         | 180 +++++++++++++++++-
 2 files changed, 241 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-platform-mellanox-pmc

-- 
2.30.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/4] Documentation/ABI: Add document for Mellanox PMC driver
  2024-05-20 11:56 [PATCH v2 0/4] Updates to mlxbf-pmc Shravan Kumar Ramani
@ 2024-05-20 11:56 ` Shravan Kumar Ramani
  2024-05-27 10:33   ` Ilpo Järvinen
  2024-05-20 11:56 ` [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count Shravan Kumar Ramani
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Shravan Kumar Ramani @ 2024-05-20 11:56 UTC (permalink / raw)
  To: Hans de Goede, Ilpo Jarvinen, Vadim Pasternak, David Thompson
  Cc: Shravan Kumar Ramani, platform-driver-x86, linux-kernel

The sysfs interface is created for programming and monitoring the
performance counters in various HW blocks of Mellanox BlueField-1,
BlueField-2 and BlueField-3.

Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
---
 .../ABI/testing/sysfs-platform-mellanox-pmc   | 49 +++++++++++++++++++
 1 file changed, 49 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-platform-mellanox-pmc

diff --git a/Documentation/ABI/testing/sysfs-platform-mellanox-pmc b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
new file mode 100644
index 000000000000..47094024dbeb
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
@@ -0,0 +1,49 @@
+HID           Driver         Description
+MLNXBFD0      mlxbf-pmc      Performance counters (BlueField-1)
+MLNXBFD1      mlxbf-pmc      Performance counters (BlueField-2)
+MLNXBFD2      mlxbf-pmc      Performance counters (BlueField-3)
+
+What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/event_list
+Date:		Dec 2020
+KernelVersion:	5.10
+Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
+Description:
+		List of events supported by the counters in the specific block.
+		It is used to extract the event number or ID associated with
+		each event.
+
+What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/event<N>
+Date:		Dec 2020
+KernelVersion:	5.10
+Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
+Description:
+		Event monitored by corresponding counter. This is used to
+		program or read back the event that should be or is currently
+		being monitored by counter<N>.
+
+What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/counter<N>
+Date:		Dec 2020
+KernelVersion:	5.10
+Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
+Description:
+		Counter value of the event being monitored. This is used to
+		read the counter value of the event which was programmed using
+		event<N>. This is also used to clear or reset the counter value.
+
+What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/enable
+Date:		Dec 2020
+KernelVersion:	5.10
+Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
+Description:
+		Start or stop counters. This is used to start the counters
+		for monitoring the programmed events and also to stop the
+		counters after the desired duration.
+
+What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/<reg>
+Date:		Dec 2020
+KernelVersion:	5.10
+Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
+Description:
+		Value of register. This is used to read or reset the registers
+		where various performance statistics are counted for each block.
+
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count
  2024-05-20 11:56 [PATCH v2 0/4] Updates to mlxbf-pmc Shravan Kumar Ramani
  2024-05-20 11:56 ` [PATCH v2 1/4] Documentation/ABI: Add document for Mellanox PMC driver Shravan Kumar Ramani
@ 2024-05-20 11:56 ` Shravan Kumar Ramani
  2024-05-27 11:39   ` Ilpo Järvinen
  2024-05-20 11:56 ` [PATCH v2 3/4] platform/mellanox: mlxbf-pmc: Add support for clock_measure performance block Shravan Kumar Ramani
  2024-05-20 11:56 ` [PATCH v2 4/4] Documentation/ABI: Add new sysfs fields to sysfs-platform-mellanox-pmc Shravan Kumar Ramani
  3 siblings, 1 reply; 14+ messages in thread
From: Shravan Kumar Ramani @ 2024-05-20 11:56 UTC (permalink / raw)
  To: Hans de Goede, Ilpo Jarvinen, Vadim Pasternak, David Thompson
  Cc: Shravan Kumar Ramani, platform-driver-x86, linux-kernel

Add support for programming any counter to monitor the cycle count.
Since counting of cycles using 32-bit ocunters would result in frequent
wraparounds, add the ability to combine 2 adjacent 32-bit counters to
form 1 64-bit counter.
Both these features are supported by BlueField-3 PMC hardware, hence
the required bit-fields are exposed by the driver via sysfs to allow
the user to configure as needed.

Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
---
 drivers/platform/mellanox/mlxbf-pmc.c | 134 ++++++++++++++++++++++++++
 1 file changed, 134 insertions(+)

diff --git a/drivers/platform/mellanox/mlxbf-pmc.c b/drivers/platform/mellanox/mlxbf-pmc.c
index 4ed9c7fd2b62..635ecc3b3845 100644
--- a/drivers/platform/mellanox/mlxbf-pmc.c
+++ b/drivers/platform/mellanox/mlxbf-pmc.c
@@ -88,6 +88,8 @@
 #define MLXBF_PMC_CRSPACE_PERFMON_CTL(n) (n * MLXBF_PMC_CRSPACE_PERFMON_REG0_SZ)
 #define MLXBF_PMC_CRSPACE_PERFMON_EN BIT(30)
 #define MLXBF_PMC_CRSPACE_PERFMON_CLR BIT(28)
+#define MLXBF_PMC_CRSPACE_PERFMON_UOC GENMASK(15, 0)
+#define MLXBF_PMC_CRSPACE_PERFMON_COUNT_CLOCK(n) (MLXBF_PMC_CRSPACE_PERFMON_CTL(n) + 0x4)
 #define MLXBF_PMC_CRSPACE_PERFMON_VAL0(n) (MLXBF_PMC_CRSPACE_PERFMON_CTL(n) + 0xc)
 
 /**
@@ -114,6 +116,8 @@ struct mlxbf_pmc_attribute {
  * @attr_event: Attributes for "event" sysfs files
  * @attr_event_list: Attributes for "event_list" sysfs files
  * @attr_enable: Attributes for "enable" sysfs files
+ * @attr_use_odd_counter: Attributes for "use_odd_counter" sysfs files
+ * @attr_count_clock: Attributes for "count_clock" sysfs files
  * @block_attr: All attributes needed for the block
  * @block_attr_grp: Attribute group for the block
  */
@@ -126,6 +130,8 @@ struct mlxbf_pmc_block_info {
 	struct mlxbf_pmc_attribute *attr_event;
 	struct mlxbf_pmc_attribute attr_event_list;
 	struct mlxbf_pmc_attribute attr_enable;
+	struct mlxbf_pmc_attribute attr_use_odd_counter;
+	struct mlxbf_pmc_attribute attr_count_clock;
 	struct attribute *block_attr[MLXBF_PMC_MAX_ATTRS];
 	struct attribute_group block_attr_grp;
 };
@@ -1763,6 +1769,103 @@ static ssize_t mlxbf_pmc_enable_store(struct device *dev,
 	return count;
 }
 
+/* Show function for "use_odd_counter" sysfs files - only for crspace */
+static ssize_t mlxbf_pmc_use_odd_counter_show(struct device *dev,
+					      struct device_attribute *attr, char *buf)
+{
+	struct mlxbf_pmc_attribute *attr_use_odd_counter = container_of(
+		attr, struct mlxbf_pmc_attribute, dev_attr);
+	unsigned int blk_num;
+	u32 value, reg;
+
+	blk_num = attr_use_odd_counter->nr;
+
+	if (mlxbf_pmc_readl(pmc->block[blk_num].mmio_base +
+			MLXBF_PMC_CRSPACE_PERFMON_CTL(pmc->block[blk_num].counters),
+			&reg))
+		return -EINVAL;
+
+	value = FIELD_GET(MLXBF_PMC_CRSPACE_PERFMON_UOC, reg);
+
+	return sysfs_emit(buf, "%u\n", value);
+}
+
+/* Store function for "use_odd_counter" sysfs files - only for crspace */
+static ssize_t mlxbf_pmc_use_odd_counter_store(struct device *dev,
+					       struct device_attribute *attr,
+					       const char *buf, size_t count)
+{
+	struct mlxbf_pmc_attribute *attr_use_odd_counter = container_of(
+		attr, struct mlxbf_pmc_attribute, dev_attr);
+	unsigned int blk_num;
+	u32 uoc, reg;
+	int err;
+
+	blk_num = attr_use_odd_counter->nr;
+
+	err = kstrtouint(buf, 0, &uoc);
+	if (err < 0)
+		return err;
+
+	err = mlxbf_pmc_readl(pmc->block[blk_num].mmio_base +
+		MLXBF_PMC_CRSPACE_PERFMON_CTL(pmc->block[blk_num].counters),
+		&reg);
+	if (err)
+		return -EINVAL;
+
+	reg &= ~MLXBF_PMC_CRSPACE_PERFMON_UOC;
+	reg |= FIELD_PREP(MLXBF_PMC_CRSPACE_PERFMON_UOC, uoc);
+
+	mlxbf_pmc_write(pmc->block[blk_num].mmio_base +
+		MLXBF_PMC_CRSPACE_PERFMON_CTL(pmc->block[blk_num].counters),
+		MLXBF_PMC_WRITE_REG_32, reg);
+
+	return count;
+}
+
+/* Show function for "count_clock" sysfs files - only for crspace */
+static ssize_t mlxbf_pmc_count_clock_show(struct device *dev,
+					  struct device_attribute *attr, char *buf)
+{
+	struct mlxbf_pmc_attribute *attr_count_clock = container_of(
+		attr, struct mlxbf_pmc_attribute, dev_attr);
+	unsigned int blk_num;
+	u32 reg;
+
+	blk_num = attr_count_clock->nr;
+
+	if (mlxbf_pmc_readl(pmc->block[blk_num].mmio_base +
+			MLXBF_PMC_CRSPACE_PERFMON_COUNT_CLOCK(pmc->block[blk_num].counters),
+			&reg))
+		return -EINVAL;
+
+	return sysfs_emit(buf, "%u\n", reg);
+}
+
+/* Store function for "count_clock" sysfs files - only for crspace */
+static ssize_t mlxbf_pmc_count_clock_store(struct device *dev,
+					   struct device_attribute *attr,
+					   const char *buf, size_t count)
+{
+	struct mlxbf_pmc_attribute *attr_count_clock = container_of(
+		attr, struct mlxbf_pmc_attribute, dev_attr);
+	unsigned int blk_num;
+	u32 reg;
+	int err;
+
+	blk_num = attr_count_clock->nr;
+
+	err = kstrtouint(buf, 0, &reg);
+	if (err < 0)
+		return err;
+
+	mlxbf_pmc_write(pmc->block[blk_num].mmio_base +
+		MLXBF_PMC_CRSPACE_PERFMON_COUNT_CLOCK(pmc->block[blk_num].counters),
+		MLXBF_PMC_WRITE_REG_32, reg);
+
+	return count;
+}
+
 /* Populate attributes for blocks with counters to monitor performance */
 static int mlxbf_pmc_init_perftype_counter(struct device *dev, unsigned int blk_num)
 {
@@ -1799,6 +1902,37 @@ static int mlxbf_pmc_init_perftype_counter(struct device *dev, unsigned int blk_
 		attr = NULL;
 	}
 
+	if (pmc->block[blk_num].type == MLXBF_PMC_TYPE_CRSPACE) {
+		/*
+		 * Couple adjacent odd and even 32-bit counters to form 64-bit counters
+		 * using "use_odd_counter" sysfs which has one bit per even counter.
+		 */
+		attr = &pmc->block[blk_num].attr_use_odd_counter;
+		attr->dev_attr.attr.mode = 0644;
+		attr->dev_attr.show = mlxbf_pmc_use_odd_counter_show;
+		attr->dev_attr.store = mlxbf_pmc_use_odd_counter_store;
+		attr->nr = blk_num;
+		attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
+							  "use_odd_counter");
+		if (!attr->dev_attr.attr.name)
+			return -ENOMEM;
+		pmc->block[blk_num].block_attr[++i] = &attr->dev_attr.attr;
+		attr = NULL;
+
+		/* Program crspace counters to count clock cycles using "count_clock" sysfs */
+		attr = &pmc->block[blk_num].attr_count_clock;
+		attr->dev_attr.attr.mode = 0644;
+		attr->dev_attr.show = mlxbf_pmc_count_clock_show;
+		attr->dev_attr.store = mlxbf_pmc_count_clock_store;
+		attr->nr = blk_num;
+		attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
+							  "count_clock");
+		if (!attr->dev_attr.attr.name)
+			return -ENOMEM;
+		pmc->block[blk_num].block_attr[++i] = &attr->dev_attr.attr;
+		attr = NULL;
+	}
+
 	pmc->block[blk_num].attr_counter = devm_kcalloc(
 		dev, pmc->block[blk_num].counters,
 		sizeof(struct mlxbf_pmc_attribute), GFP_KERNEL);
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/4] platform/mellanox: mlxbf-pmc: Add support for clock_measure performance block
  2024-05-20 11:56 [PATCH v2 0/4] Updates to mlxbf-pmc Shravan Kumar Ramani
  2024-05-20 11:56 ` [PATCH v2 1/4] Documentation/ABI: Add document for Mellanox PMC driver Shravan Kumar Ramani
  2024-05-20 11:56 ` [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count Shravan Kumar Ramani
@ 2024-05-20 11:56 ` Shravan Kumar Ramani
  2024-05-20 11:56 ` [PATCH v2 4/4] Documentation/ABI: Add new sysfs fields to sysfs-platform-mellanox-pmc Shravan Kumar Ramani
  3 siblings, 0 replies; 14+ messages in thread
From: Shravan Kumar Ramani @ 2024-05-20 11:56 UTC (permalink / raw)
  To: Hans de Goede, Ilpo Jarvinen, Vadim Pasternak, David Thompson
  Cc: Shravan Kumar Ramani, platform-driver-x86, linux-kernel

The HW clock_measure counter info is passed to the driver from ACPI.
Create a new sub-directory for clock_measure events and provide
read access to the user. Writes are blocked since the fields are RO.

Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
---
 drivers/platform/mellanox/mlxbf-pmc.c | 46 ++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/drivers/platform/mellanox/mlxbf-pmc.c b/drivers/platform/mellanox/mlxbf-pmc.c
index 635ecc3b3845..1212a96fb3eb 100644
--- a/drivers/platform/mellanox/mlxbf-pmc.c
+++ b/drivers/platform/mellanox/mlxbf-pmc.c
@@ -865,6 +865,37 @@ static const struct mlxbf_pmc_events mlxbf_pmc_llt_miss_events[] = {
 	{75, "HISTOGRAM_HISTOGRAM_BIN9"},
 };
 
+static const struct mlxbf_pmc_events mlxbf_pmc_clock_events[] = {
+	{ 0x0, "FMON_CLK_LAST_COUNT_PLL_D1_INST0" },
+	{ 0x4, "REFERENCE_WINDOW_WIDTH_PLL_D1_INST0" },
+	{ 0x8, "FMON_CLK_LAST_COUNT_PLL_D1_INST1" },
+	{ 0xc, "REFERENCE_WINDOW_WIDTH_PLL_D1_INST1" },
+	{ 0x10, "FMON_CLK_LAST_COUNT_PLL_G1" },
+	{ 0x14, "REFERENCE_WINDOW_WIDTH_PLL_G1" },
+	{ 0x18, "FMON_CLK_LAST_COUNT_PLL_W1" },
+	{ 0x1c, "REFERENCE_WINDOW_WIDTH_PLL_W1" },
+	{ 0x20, "FMON_CLK_LAST_COUNT_PLL_T1" },
+	{ 0x24, "REFERENCE_WINDOW_WIDTH_PLL_T1" },
+	{ 0x28, "FMON_CLK_LAST_COUNT_PLL_A0" },
+	{ 0x2c, "REFERENCE_WINDOW_WIDTH_PLL_A0" },
+	{ 0x30, "FMON_CLK_LAST_COUNT_PLL_C0" },
+	{ 0x34, "REFERENCE_WINDOW_WIDTH_PLL_C0" },
+	{ 0x38, "FMON_CLK_LAST_COUNT_PLL_N1" },
+	{ 0x3c, "REFERENCE_WINDOW_WIDTH_PLL_N1" },
+	{ 0x40, "FMON_CLK_LAST_COUNT_PLL_I1" },
+	{ 0x44, "REFERENCE_WINDOW_WIDTH_PLL_I1" },
+	{ 0x48, "FMON_CLK_LAST_COUNT_PLL_R1" },
+	{ 0x4c, "REFERENCE_WINDOW_WIDTH_PLL_R1" },
+	{ 0x50, "FMON_CLK_LAST_COUNT_PLL_P1" },
+	{ 0x54, "REFERENCE_WINDOW_WIDTH_PLL_P1" },
+	{ 0x58, "FMON_CLK_LAST_COUNT_REF_100_INST0" },
+	{ 0x5c, "REFERENCE_WINDOW_WIDTH_REF_100_INST0" },
+	{ 0x60, "FMON_CLK_LAST_COUNT_REF_100_INST1" },
+	{ 0x64, "REFERENCE_WINDOW_WIDTH_REF_100_INST1" },
+	{ 0x68, "FMON_CLK_LAST_COUNT_REF_156" },
+	{ 0x6c, "REFERENCE_WINDOW_WIDTH_REF_156" },
+};
+
 static struct mlxbf_pmc_context *pmc;
 
 /* UUID used to probe ATF service. */
@@ -1038,6 +1069,9 @@ static const struct mlxbf_pmc_events *mlxbf_pmc_event_list(const char *blk, size
 	} else if (strstr(blk, "llt")) {
 		events = mlxbf_pmc_llt_events;
 		size = ARRAY_SIZE(mlxbf_pmc_llt_events);
+	} else if (strstr(blk, "clock_measure")) {
+		events = mlxbf_pmc_clock_events;
+		size = ARRAY_SIZE(mlxbf_pmc_clock_events);
 	} else {
 		events = NULL;
 		size = 0;
@@ -1472,14 +1506,15 @@ static int mlxbf_pmc_read_event(unsigned int blk_num, u32 cnt_num, bool is_l3, u
 /* Method to read a register */
 static int mlxbf_pmc_read_reg(unsigned int blk_num, u32 offset, u64 *result)
 {
-	u32 ecc_out;
+	u32 reg;
 
-	if (strstr(pmc->block_name[blk_num], "ecc")) {
+	if ((strstr(pmc->block_name[blk_num], "ecc")) ||
+	    (strstr(pmc->block_name[blk_num], "clock_measure"))) {
 		if (mlxbf_pmc_readl(pmc->block[blk_num].mmio_base + offset,
-				    &ecc_out))
+				    &reg))
 			return -EFAULT;
 
-		*result = ecc_out;
+		*result = reg;
 		return 0;
 	}
 
@@ -1493,6 +1528,9 @@ static int mlxbf_pmc_read_reg(unsigned int blk_num, u32 offset, u64 *result)
 /* Method to write to a register */
 static int mlxbf_pmc_write_reg(unsigned int blk_num, u32 offset, u64 data)
 {
+	if (strstr(pmc->block_name[blk_num], "clock_measure"))
+		return -EINVAL;
+
 	if (strstr(pmc->block_name[blk_num], "ecc")) {
 		return mlxbf_pmc_write(pmc->block[blk_num].mmio_base + offset,
 				       MLXBF_PMC_WRITE_REG_32, data);
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 4/4] Documentation/ABI: Add new sysfs fields to sysfs-platform-mellanox-pmc
  2024-05-20 11:56 [PATCH v2 0/4] Updates to mlxbf-pmc Shravan Kumar Ramani
                   ` (2 preceding siblings ...)
  2024-05-20 11:56 ` [PATCH v2 3/4] platform/mellanox: mlxbf-pmc: Add support for clock_measure performance block Shravan Kumar Ramani
@ 2024-05-20 11:56 ` Shravan Kumar Ramani
  3 siblings, 0 replies; 14+ messages in thread
From: Shravan Kumar Ramani @ 2024-05-20 11:56 UTC (permalink / raw)
  To: Hans de Goede, Ilpo Jarvinen, Vadim Pasternak, David Thompson
  Cc: Shravan Kumar Ramani, platform-driver-x86, linux-kernel

Document newly added "count_clock" and "use_odd_counter" sysfs entries
for the Mellanox BlueField PMC driver.

Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
Reviewed-by: David Thompson <davthompson@nvidia.com>
---
 .../ABI/testing/sysfs-platform-mellanox-pmc      | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-platform-mellanox-pmc b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
index 47094024dbeb..b9973ebec2fd 100644
--- a/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
+++ b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
@@ -47,3 +47,19 @@ Description:
 		Value of register. This is used to read or reset the registers
 		where various performance statistics are counted for each block.
 
+What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/count_clock
+Date:		May 2024
+KernelVersion:	6.10
+Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
+Description:
+		Use counter for counting cycles. This is used to program any of
+		the counters in the block to count cycles.
+
+What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/use_odd_counter
+Date:		May 2024
+KernelVersion:	6.10
+Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
+Description:
+		Form 64-bit counter using 2 32-bit counters. This is used to combine
+		2 adjacent counters to form a single 64-bit counter.
+
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/4] Documentation/ABI: Add document for Mellanox PMC driver
  2024-05-20 11:56 ` [PATCH v2 1/4] Documentation/ABI: Add document for Mellanox PMC driver Shravan Kumar Ramani
@ 2024-05-27 10:33   ` Ilpo Järvinen
  0 siblings, 0 replies; 14+ messages in thread
From: Ilpo Järvinen @ 2024-05-27 10:33 UTC (permalink / raw)
  To: Shravan Kumar Ramani
  Cc: Hans de Goede, Vadim Pasternak, David Thompson,
	platform-driver-x86, LKML

On Mon, 20 May 2024, Shravan Kumar Ramani wrote:

> The sysfs interface is created for programming and monitoring the

This patch is not creating interface. I suggest you change the wording to:

"Document the sysfs interface for ...

> performance counters in various HW blocks of Mellanox BlueField-1,
> BlueField-2 and BlueField-3.
> 
> Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
> Reviewed-by: David Thompson <davthompson@nvidia.com>
> ---
>  .../ABI/testing/sysfs-platform-mellanox-pmc   | 49 +++++++++++++++++++
>  1 file changed, 49 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-platform-mellanox-pmc
> 
> diff --git a/Documentation/ABI/testing/sysfs-platform-mellanox-pmc b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
> new file mode 100644
> index 000000000000..47094024dbeb
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-platform-mellanox-pmc
> @@ -0,0 +1,49 @@
> +HID           Driver         Description
> +MLNXBFD0      mlxbf-pmc      Performance counters (BlueField-1)
> +MLNXBFD1      mlxbf-pmc      Performance counters (BlueField-2)
> +MLNXBFD2      mlxbf-pmc      Performance counters (BlueField-3)
> +
> +What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/event_list
> +Date:		Dec 2020
> +KernelVersion:	5.10
> +Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
> +Description:
> +		List of events supported by the counters in the specific block.
> +		It is used to extract the event number or ID associated with
> +		each event.
> +
> +What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/event<N>
> +Date:		Dec 2020
> +KernelVersion:	5.10
> +Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
> +Description:
> +		Event monitored by corresponding counter. This is used to
> +		program or read back the event that should be or is currently
> +		being monitored by counter<N>.
> +
> +What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/counter<N>
> +Date:		Dec 2020
> +KernelVersion:	5.10
> +Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
> +Description:
> +		Counter value of the event being monitored. This is used to
> +		read the counter value of the event which was programmed using
> +		event<N>. This is also used to clear or reset the counter value.

Please document how to clear/reset it in more concrete terms (I didn't 
check the code but my guess would one writes zero there to reset the 
counter?).

> +What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/enable
> +Date:		Dec 2020
> +KernelVersion:	5.10
> +Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
> +Description:
> +		Start or stop counters. This is used to start the counters
> +		for monitoring the programmed events and also to stop the
> +		counters after the desired duration.
> +
> +What:		/sys/bus/platform/devices/<HID>/hwmon/hwmonX/<block>/<reg>
> +Date:		Dec 2020
> +KernelVersion:	5.10
> +Contact:	"Shravan Kumar Ramani <shravankr@nvidia.com>"
> +Description:
> +		Value of register. This is used to read or reset the registers
> +		where various performance statistics are counted for each block.

Ditto for the reset part.

-- 
 i.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count
  2024-05-20 11:56 ` [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count Shravan Kumar Ramani
@ 2024-05-27 11:39   ` Ilpo Järvinen
  2024-06-03 10:29     ` Shravan Ramani
  0 siblings, 1 reply; 14+ messages in thread
From: Ilpo Järvinen @ 2024-05-27 11:39 UTC (permalink / raw)
  To: Shravan Kumar Ramani
  Cc: Hans de Goede, Vadim Pasternak, David Thompson,
	platform-driver-x86, LKML

On Mon, 20 May 2024, Shravan Kumar Ramani wrote:

> Add support for programming any counter to monitor the cycle count.
> Since counting of cycles using 32-bit ocunters would result in frequent

counters

> wraparounds, add the ability to combine 2 adjacent 32-bit counters to
> form 1 64-bit counter.
> Both these features are supported by BlueField-3 PMC hardware, hence
> the required bit-fields are exposed by the driver via sysfs to allow
> the user to configure as needed.

I'm trying to understand what happens for the other counter, when the 
use_odd_counter is enabled? This change also doesn't add code that would 
make the other counter -EBUSY, should that be done?

For 64-bit counter, I suppose the userspace is expected to read the full 
counter from two sysfs files and combine the value (your documentation 
doesn't explain this)? That seems non-optimal, why cannot kernel just 
return the full combined 64-value directly in kernel?

Similarly, are these cycle counters occupying the same space as non-cycle 
counters (so both can/cannot be used that the same time)? I'm asking this 
because you're adding a parallel interface to read the value and if it's 
either-or, I don't understand why the value needs to be read from 
different file depending on the counter counting in cycles or not.

-- 
 i.

> Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
> Reviewed-by: David Thompson <davthompson@nvidia.com>
> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
> ---
>  drivers/platform/mellanox/mlxbf-pmc.c | 134 ++++++++++++++++++++++++++
>  1 file changed, 134 insertions(+)
> 
> diff --git a/drivers/platform/mellanox/mlxbf-pmc.c b/drivers/platform/mellanox/mlxbf-pmc.c
> index 4ed9c7fd2b62..635ecc3b3845 100644
> --- a/drivers/platform/mellanox/mlxbf-pmc.c
> +++ b/drivers/platform/mellanox/mlxbf-pmc.c
> @@ -88,6 +88,8 @@
>  #define MLXBF_PMC_CRSPACE_PERFMON_CTL(n) (n * MLXBF_PMC_CRSPACE_PERFMON_REG0_SZ)
>  #define MLXBF_PMC_CRSPACE_PERFMON_EN BIT(30)
>  #define MLXBF_PMC_CRSPACE_PERFMON_CLR BIT(28)
> +#define MLXBF_PMC_CRSPACE_PERFMON_UOC GENMASK(15, 0)
> +#define MLXBF_PMC_CRSPACE_PERFMON_COUNT_CLOCK(n) (MLXBF_PMC_CRSPACE_PERFMON_CTL(n) + 0x4)
>  #define MLXBF_PMC_CRSPACE_PERFMON_VAL0(n) (MLXBF_PMC_CRSPACE_PERFMON_CTL(n) + 0xc)
>  
>  /**
> @@ -114,6 +116,8 @@ struct mlxbf_pmc_attribute {
>   * @attr_event: Attributes for "event" sysfs files
>   * @attr_event_list: Attributes for "event_list" sysfs files
>   * @attr_enable: Attributes for "enable" sysfs files
> + * @attr_use_odd_counter: Attributes for "use_odd_counter" sysfs files
> + * @attr_count_clock: Attributes for "count_clock" sysfs files
>   * @block_attr: All attributes needed for the block
>   * @block_attr_grp: Attribute group for the block
>   */
> @@ -126,6 +130,8 @@ struct mlxbf_pmc_block_info {
>  	struct mlxbf_pmc_attribute *attr_event;
>  	struct mlxbf_pmc_attribute attr_event_list;
>  	struct mlxbf_pmc_attribute attr_enable;
> +	struct mlxbf_pmc_attribute attr_use_odd_counter;
> +	struct mlxbf_pmc_attribute attr_count_clock;
>  	struct attribute *block_attr[MLXBF_PMC_MAX_ATTRS];
>  	struct attribute_group block_attr_grp;
>  };
> @@ -1763,6 +1769,103 @@ static ssize_t mlxbf_pmc_enable_store(struct device *dev,
>  	return count;
>  }
>  
> +/* Show function for "use_odd_counter" sysfs files - only for crspace */
> +static ssize_t mlxbf_pmc_use_odd_counter_show(struct device *dev,
> +					      struct device_attribute *attr, char *buf)
> +{
> +	struct mlxbf_pmc_attribute *attr_use_odd_counter = container_of(
> +		attr, struct mlxbf_pmc_attribute, dev_attr);
> +	unsigned int blk_num;
> +	u32 value, reg;
> +
> +	blk_num = attr_use_odd_counter->nr;
> +
> +	if (mlxbf_pmc_readl(pmc->block[blk_num].mmio_base +
> +			MLXBF_PMC_CRSPACE_PERFMON_CTL(pmc->block[blk_num].counters),
> +			&reg))
> +		return -EINVAL;
> +
> +	value = FIELD_GET(MLXBF_PMC_CRSPACE_PERFMON_UOC, reg);
> +
> +	return sysfs_emit(buf, "%u\n", value);
> +}
> +
> +/* Store function for "use_odd_counter" sysfs files - only for crspace */
> +static ssize_t mlxbf_pmc_use_odd_counter_store(struct device *dev,
> +					       struct device_attribute *attr,
> +					       const char *buf, size_t count)
> +{
> +	struct mlxbf_pmc_attribute *attr_use_odd_counter = container_of(
> +		attr, struct mlxbf_pmc_attribute, dev_attr);
> +	unsigned int blk_num;
> +	u32 uoc, reg;
> +	int err;
> +
> +	blk_num = attr_use_odd_counter->nr;
> +
> +	err = kstrtouint(buf, 0, &uoc);
> +	if (err < 0)
> +		return err;
> +
> +	err = mlxbf_pmc_readl(pmc->block[blk_num].mmio_base +
> +		MLXBF_PMC_CRSPACE_PERFMON_CTL(pmc->block[blk_num].counters),
> +		&reg);
> +	if (err)
> +		return -EINVAL;
> +
> +	reg &= ~MLXBF_PMC_CRSPACE_PERFMON_UOC;
> +	reg |= FIELD_PREP(MLXBF_PMC_CRSPACE_PERFMON_UOC, uoc);
> +
> +	mlxbf_pmc_write(pmc->block[blk_num].mmio_base +
> +		MLXBF_PMC_CRSPACE_PERFMON_CTL(pmc->block[blk_num].counters),
> +		MLXBF_PMC_WRITE_REG_32, reg);
> +
> +	return count;
> +}
> +
> +/* Show function for "count_clock" sysfs files - only for crspace */
> +static ssize_t mlxbf_pmc_count_clock_show(struct device *dev,
> +					  struct device_attribute *attr, char *buf)
> +{
> +	struct mlxbf_pmc_attribute *attr_count_clock = container_of(
> +		attr, struct mlxbf_pmc_attribute, dev_attr);
> +	unsigned int blk_num;
> +	u32 reg;
> +
> +	blk_num = attr_count_clock->nr;
> +
> +	if (mlxbf_pmc_readl(pmc->block[blk_num].mmio_base +
> +			MLXBF_PMC_CRSPACE_PERFMON_COUNT_CLOCK(pmc->block[blk_num].counters),
> +			&reg))
> +		return -EINVAL;
> +
> +	return sysfs_emit(buf, "%u\n", reg);
> +}
> +
> +/* Store function for "count_clock" sysfs files - only for crspace */
> +static ssize_t mlxbf_pmc_count_clock_store(struct device *dev,
> +					   struct device_attribute *attr,
> +					   const char *buf, size_t count)
> +{
> +	struct mlxbf_pmc_attribute *attr_count_clock = container_of(
> +		attr, struct mlxbf_pmc_attribute, dev_attr);
> +	unsigned int blk_num;
> +	u32 reg;
> +	int err;
> +
> +	blk_num = attr_count_clock->nr;
> +
> +	err = kstrtouint(buf, 0, &reg);
> +	if (err < 0)
> +		return err;
> +
> +	mlxbf_pmc_write(pmc->block[blk_num].mmio_base +
> +		MLXBF_PMC_CRSPACE_PERFMON_COUNT_CLOCK(pmc->block[blk_num].counters),
> +		MLXBF_PMC_WRITE_REG_32, reg);
> +
> +	return count;
> +}
> +
>  /* Populate attributes for blocks with counters to monitor performance */
>  static int mlxbf_pmc_init_perftype_counter(struct device *dev, unsigned int blk_num)
>  {
> @@ -1799,6 +1902,37 @@ static int mlxbf_pmc_init_perftype_counter(struct device *dev, unsigned int blk_
>  		attr = NULL;
>  	}
>  
> +	if (pmc->block[blk_num].type == MLXBF_PMC_TYPE_CRSPACE) {
> +		/*
> +		 * Couple adjacent odd and even 32-bit counters to form 64-bit counters
> +		 * using "use_odd_counter" sysfs which has one bit per even counter.
> +		 */
> +		attr = &pmc->block[blk_num].attr_use_odd_counter;
> +		attr->dev_attr.attr.mode = 0644;
> +		attr->dev_attr.show = mlxbf_pmc_use_odd_counter_show;
> +		attr->dev_attr.store = mlxbf_pmc_use_odd_counter_store;
> +		attr->nr = blk_num;
> +		attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
> +							  "use_odd_counter");
> +		if (!attr->dev_attr.attr.name)
> +			return -ENOMEM;
> +		pmc->block[blk_num].block_attr[++i] = &attr->dev_attr.attr;
> +		attr = NULL;
> +
> +		/* Program crspace counters to count clock cycles using "count_clock" sysfs */
> +		attr = &pmc->block[blk_num].attr_count_clock;
> +		attr->dev_attr.attr.mode = 0644;
> +		attr->dev_attr.show = mlxbf_pmc_count_clock_show;
> +		attr->dev_attr.store = mlxbf_pmc_count_clock_store;
> +		attr->nr = blk_num;
> +		attr->dev_attr.attr.name = devm_kasprintf(dev, GFP_KERNEL,
> +							  "count_clock");
> +		if (!attr->dev_attr.attr.name)
> +			return -ENOMEM;
> +		pmc->block[blk_num].block_attr[++i] = &attr->dev_attr.attr;
> +		attr = NULL;
> +	}


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count
  2024-05-27 11:39   ` Ilpo Järvinen
@ 2024-06-03 10:29     ` Shravan Ramani
  2024-06-11  7:14       ` Ilpo Järvinen
  0 siblings, 1 reply; 14+ messages in thread
From: Shravan Ramani @ 2024-06-03 10:29 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Hans de Goede, Vadim Pasternak, David Thompson,
	platform-driver-x86@vger.kernel.org, LKML


> > Both these features are supported by BlueField-3 PMC hardware, hence
> > the required bit-fields are exposed by the driver via sysfs to allow
> > the user to configure as needed.
> 
> I'm trying to understand what happens for the other counter, when the
> use_odd_counter is enabled? This change also doesn't add code that would
> make the other counter -EBUSY, should that be done?

When 2 32-bit counters are coupled to form a 64-bit counter using this setting,
one counter will hold the lower 32 bits while the other will hold the upper 32.
So the other counter (or syses corresponding to it) also needs to be accessed.

> For 64-bit counter, I suppose the userspace is expected to read the full
> counter from two sysfs files and combine the value (your documentation
> doesn't explain this)? That seems non-optimal, why cannot kernel just
> return the full combined 64-value directly in kernel?

I will add more clear comments for this.
While it is true that the driver could combine the 2 fields and present a
64-bit value via one of the sysfs, the reason for the current approach is that
there are other interfaces which expose the same counters for our platform
and there are tools that are expected to work on top of both interfaces for
the purpose of collecting performance stats. The other interfaces follow this
approach of having lower and upper 32-bits separately in each counter, and
the tools expect the same. Hence the driver follows this approach to keep
things consistent across the BlueField platform.

> 
> Similarly, are these cycle counters occupying the same space as non-cycle
> counters (so both can/cannot be used that the same time)? I'm asking this
> because you're adding a parallel interface to read the value and if it's
> either-or, I don't understand why the value needs to be read from
> different file depending on the counter counting in cycles or not.

It is the same file. The count_clock sysfs exposes 16 bits, one for each counter,
to allow the user to dedicate any of the 16 counters to counting cycles. Once
set, the corresponding counter can no longer monitor other events, and the
same sysfs can be accessed to read the cycle count. Again, I will try capture
this better and more elaborately in the documentation.

Thanks,
Shravan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count
  2024-06-03 10:29     ` Shravan Ramani
@ 2024-06-11  7:14       ` Ilpo Järvinen
  2024-06-11 13:34         ` Shravan Ramani
  0 siblings, 1 reply; 14+ messages in thread
From: Ilpo Järvinen @ 2024-06-11  7:14 UTC (permalink / raw)
  To: Shravan Ramani
  Cc: Hans de Goede, Vadim Pasternak, David Thompson,
	platform-driver-x86@vger.kernel.org, LKML

On Mon, 3 Jun 2024, Shravan Ramani wrote:

> > > Both these features are supported by BlueField-3 PMC hardware, hence
> > > the required bit-fields are exposed by the driver via sysfs to allow
> > > the user to configure as needed.
> > 
> > I'm trying to understand what happens for the other counter, when the
> > use_odd_counter is enabled? This change also doesn't add code that would
> > make the other counter -EBUSY, should that be done?
> 
> When 2 32-bit counters are coupled to form a 64-bit counter using this setting,
> one counter will hold the lower 32 bits while the other will hold the upper 32.
> So the other counter (or syses corresponding to it) also needs to be accessed.
> 
> > For 64-bit counter, I suppose the userspace is expected to read the full
> > counter from two sysfs files and combine the value (your documentation
> > doesn't explain this)? That seems non-optimal, why cannot kernel just
> > return the full combined 64-value directly in kernel?
> 
> I will add more clear comments for this.
> While it is true that the driver could combine the 2 fields and present a
> 64-bit value via one of the sysfs, the reason for the current approach is that
> there are other interfaces which expose the same counters for our platform
> and there are tools that are expected to work on top of both interfaces for
> the purpose of collecting performance stats.

> The other interfaces follow this
> approach of having lower and upper 32-bits separately in each counter, and
> the tools expect the same. Hence the driver follows this approach to keep
> things consistent across the BlueField platform.

Hi,

I went to look through the existing arrays in mlxbf-pmc.c but did not find 
any entries that would have clearly indicated the counters being hi/lo 
parts of the same counter. There were a few 0/1 ones which could be the 
same counter although I suspect even they are not parts of the same 
counter but two separate entities called 0 and 1 having the same counter.

Could you please elaborate further what you meant with the note about 
other interfaces above so I can better assess the claim?

-- 
 i.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count
  2024-06-11  7:14       ` Ilpo Järvinen
@ 2024-06-11 13:34         ` Shravan Ramani
  2024-06-12  7:28           ` Ilpo Järvinen
  0 siblings, 1 reply; 14+ messages in thread
From: Shravan Ramani @ 2024-06-11 13:34 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Hans de Goede, Vadim Pasternak, David Thompson,
	platform-driver-x86@vger.kernel.org, LKML

> > When 2 32-bit counters are coupled to form a 64-bit counter using this setting,
> > one counter will hold the lower 32 bits while the other will hold the upper 32.
> > So the other counter (or syses corresponding to it) also needs to be accessed.
> >
> > > For 64-bit counter, I suppose the userspace is expected to read the full
> > > counter from two sysfs files and combine the value (your documentation
> > > doesn't explain this)? That seems non-optimal, why cannot kernel just
> > > return the full combined 64-value directly in kernel?
> > 
> > I will add more clear comments for this.
> > While it is true that the driver could combine the 2 fields and present a
> > 64-bit value via one of the sysfs, the reason for the current approach is that
> > there are other interfaces which expose the same counters for our platform
> > and there are tools that are expected to work on top of both interfaces for
> > the purpose of collecting performance stats.
>
> > The other interfaces follow this
> > approach of having lower and upper 32-bits separately in each counter, and
> > the tools expect the same. Hence the driver follows this approach to keep
> > things consistent across the BlueField platform.
>
> Hi,
>
> I went to look through the existing arrays in mlxbf-pmc.c but did not find
> any entries that would have clearly indicated the counters being hi/lo
> parts of the same counter. There were a few 0/1 ones which could be the
> same counter although I suspect even they are not parts of the same
> counter but two separate entities called 0 and 1 having the same counter.
>
> Could you please elaborate further what you meant with the note about
> other interfaces above so I can better assess the claim?

When combining 2 counters using the "use_odd_counter" setting, the mechanism
of joining them or assigning upper or lower 32 bits to a counter is handled in HW
and not by the driver. For example, if bit0 of "use_odd_counter" is set, counter0
and counter1 (which were originally separate counters) automatically become
the lower and upper bits of one 64-bit value. The user needs to read both these
sysfs separately to get the full 64-bit value. The driver does not do any special
handling for such cases, merely provides access to both counter0 and counter1.

Since the events supported by the blocks are quite HW centric and low-level in
nature, the driver is generally used alongside various tools which work on top of
this driver to collect telemetry info and provide more readable statistics to the
end-user. Similar to this driver, there are other FW interfaces providing access to
these counters (same and other additional ones as well that belong to other HW
blocks). For the sake of consistency and to allow the tools to be compatible with
all interfaces, the counter data needs to be accessible in the same way, ie, as 32-bit
upper and lower values in counter0 and counter1 sysfs as in the above case.

Thanks,
Shravan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count
  2024-06-11 13:34         ` Shravan Ramani
@ 2024-06-12  7:28           ` Ilpo Järvinen
  2024-06-14 10:46             ` Shravan Ramani
  0 siblings, 1 reply; 14+ messages in thread
From: Ilpo Järvinen @ 2024-06-12  7:28 UTC (permalink / raw)
  To: Shravan Ramani
  Cc: Hans de Goede, Vadim Pasternak, David Thompson,
	platform-driver-x86@vger.kernel.org, LKML

On Tue, 11 Jun 2024, Shravan Ramani wrote:

> > > When 2 32-bit counters are coupled to form a 64-bit counter using this setting,
> > > one counter will hold the lower 32 bits while the other will hold the upper 32.
> > > So the other counter (or syses corresponding to it) also needs to be accessed.
> > >
> > > > For 64-bit counter, I suppose the userspace is expected to read the full
> > > > counter from two sysfs files and combine the value (your documentation
> > > > doesn't explain this)? That seems non-optimal, why cannot kernel just
> > > > return the full combined 64-value directly in kernel?
> > > 
> > > I will add more clear comments for this.
> > > While it is true that the driver could combine the 2 fields and present a
> > > 64-bit value via one of the sysfs, the reason for the current approach is that
> > > there are other interfaces which expose the same counters for our platform
> > > and there are tools that are expected to work on top of both interfaces for
> > > the purpose of collecting performance stats.
> >
> > > The other interfaces follow this
> > > approach of having lower and upper 32-bits separately in each counter, and
> > > the tools expect the same. Hence the driver follows this approach to keep
> > > things consistent across the BlueField platform.
> >
> > Hi,
> >
> > I went to look through the existing arrays in mlxbf-pmc.c but did not find
> > any entries that would have clearly indicated the counters being hi/lo
> > parts of the same counter. There were a few 0/1 ones which could be the
> > same counter although I suspect even they are not parts of the same
> > counter but two separate entities called 0 and 1 having the same counter.
> >
> > Could you please elaborate further what you meant with the note about
> > other interfaces above so I can better assess the claim?
> 
> When combining 2 counters using the "use_odd_counter" setting, the mechanism
> of joining them or assigning upper or lower 32 bits to a counter is handled in HW
> and not by the driver. For example, if bit0 of "use_odd_counter" is set, counter0
> and counter1 (which were originally separate counters) automatically become
> the lower and upper bits of one 64-bit value. The user needs to read both these
> sysfs separately to get the full 64-bit value. The driver does not do any special
> handling for such cases, merely provides access to both counter0 and counter1.

I know all this by now, but we're discussion here is whether kernel should 
do "special handling". Although, it's not really correct to depict 
representing 64-bit counter in its entirety as "special handling".

I think the kernel should combine the 64-bit halved and you argumented 
it shouldn't. When I went to confirm the claim your argument was based 
on, I couldn't find on what basis the claim was made.

> Since the events supported by the blocks are quite HW centric and low-level in
> nature, the driver is generally used alongside various tools which work on top of
> this driver to collect telemetry info and provide more readable statistics to the
> end-user. Similar to this driver, there are other FW interfaces providing access to
> these counters (same and other additional ones as well that belong to other HW
> blocks). For the sake of consistency and to allow the tools to be compatible with
> all interfaces, the counter data needs to be accessible in the same way, ie, as 32-bit
> upper and lower values in counter0 and counter1 sysfs as in the above case.

This does nothing to answer my question. Where in the kernel, there's an 
example where a 64-bit counter for BlueField platform is presented as 2 
32-bit counters? If there isn't any examples in the kernel, your statement 
about consistency within the platform doesn't hold water, quoted (again) 
here for clarity what I'm refering to:

"The other interfaces follow this approach of having lower and upper 
32-bits separately in each counter, and the tools expect the same.
Hence the driver follows this approach to keep things consistent across 
the BlueField platform."

Where I can find those "other interfaces" that already follow this 
convention?

-- 
 i.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count
  2024-06-12  7:28           ` Ilpo Järvinen
@ 2024-06-14 10:46             ` Shravan Ramani
  2024-06-14 10:58               ` Ilpo Järvinen
  0 siblings, 1 reply; 14+ messages in thread
From: Shravan Ramani @ 2024-06-14 10:46 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Hans de Goede, Vadim Pasternak, David Thompson,
	platform-driver-x86@vger.kernel.org, LKML


> This does nothing to answer my question. Where in the kernel, there's an
> example where a 64-bit counter for BlueField platform is presented as 2
> 32-bit counters? If there isn't any examples in the kernel, your statement
> about consistency within the platform doesn't hold water, quoted (again)
> here for clarity what I'm refering to:
>
> "The other interfaces follow this approach of having lower and upper
> 32-bits separately in each counter, and the tools expect the same.
> Hence the driver follows this approach to keep things consistent across
> the BlueField platform."
>
> Where I can find those "other interfaces" that already follow this
> convention?

Ah, I think I misunderstood the question and went on elaborating the
same thing, apologies. The other interfaces are not part of the kernel.
They are part of the BlueField Software Package, which also contains
the tools that put together the performance metrics.
My thinking was that since this is a platform driver and is used along
with the BlueField Software Package, consistency with the tools which
were developed following the same convention could be considered,
as long as the driver is not doing something non-standard, of course.
I can change the driver handling to present 64-bit data if you insist.

Thanks,
Shravan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count
  2024-06-14 10:46             ` Shravan Ramani
@ 2024-06-14 10:58               ` Ilpo Järvinen
  0 siblings, 0 replies; 14+ messages in thread
From: Ilpo Järvinen @ 2024-06-14 10:58 UTC (permalink / raw)
  To: Shravan Ramani
  Cc: Hans de Goede, Vadim Pasternak, David Thompson,
	platform-driver-x86@vger.kernel.org, LKML

On Fri, 14 Jun 2024, Shravan Ramani wrote:

> > This does nothing to answer my question. Where in the kernel, there's an
> > example where a 64-bit counter for BlueField platform is presented as 2
> > 32-bit counters? If there isn't any examples in the kernel, your statement
> > about consistency within the platform doesn't hold water, quoted (again)
> > here for clarity what I'm refering to:
> >
> > "The other interfaces follow this approach of having lower and upper
> > 32-bits separately in each counter, and the tools expect the same.
> > Hence the driver follows this approach to keep things consistent across
> > the BlueField platform."
> >
> > Where I can find those "other interfaces" that already follow this
> > convention?
> 
> Ah, I think I misunderstood the question and went on elaborating the
> same thing, apologies. The other interfaces are not part of the kernel.
> They are part of the BlueField Software Package, which also contains
> the tools that put together the performance metrics.
> My thinking was that since this is a platform driver and is used along
> with the BlueField Software Package, consistency with the tools which
> were developed following the same convention could be considered,
> as long as the driver is not doing something non-standard, of course.
> I can change the driver handling to present 64-bit data if you insist.

I'd certainly prefer 64-bit data be presented as such by the kernel.
While you make that change, please make sure the driver correctly handles 
the lower dword wraps without returning an inconsistent reading (assuming 
the counter parts are read non-atomically, it is a common pitfall).

-- 
 i.


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-06-14 10:59 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-20 11:56 [PATCH v2 0/4] Updates to mlxbf-pmc Shravan Kumar Ramani
2024-05-20 11:56 ` [PATCH v2 1/4] Documentation/ABI: Add document for Mellanox PMC driver Shravan Kumar Ramani
2024-05-27 10:33   ` Ilpo Järvinen
2024-05-20 11:56 ` [PATCH v2 2/4] platform/mellanox: mlxbf-pmc: Add support for 64-bit counters and cycle count Shravan Kumar Ramani
2024-05-27 11:39   ` Ilpo Järvinen
2024-06-03 10:29     ` Shravan Ramani
2024-06-11  7:14       ` Ilpo Järvinen
2024-06-11 13:34         ` Shravan Ramani
2024-06-12  7:28           ` Ilpo Järvinen
2024-06-14 10:46             ` Shravan Ramani
2024-06-14 10:58               ` Ilpo Järvinen
2024-05-20 11:56 ` [PATCH v2 3/4] platform/mellanox: mlxbf-pmc: Add support for clock_measure performance block Shravan Kumar Ramani
2024-05-20 11:56 ` [PATCH v2 4/4] Documentation/ABI: Add new sysfs fields to sysfs-platform-mellanox-pmc Shravan Kumar Ramani
  -- strict thread matches above, loose matches on Subject: below --
2024-02-09  8:39 [PATCH v2 0/4] Updates to mlxbf-pmc Shravan Kumar Ramani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox