linux-perf-users.vger.kernel.org archive mirror
* [PATCH v6 0/5] Hwmon PMUs
@ 2024-10-22 18:06 Ian Rogers
  2024-10-22 18:06 ` [PATCH v6 1/5] tools api io: Ensure line_len_out is always initialized Ian Rogers
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Ian Rogers @ 2024-10-22 18:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Ravi Bangoria, Weilin Wang,
	Yoshihiro Furudera, James Clark, Athira Jajeev, Howard Chu,
	Oliver Upton, Changbin Du, Ze Gao, Junhao He, linux-kernel,
	linux-perf-users

Following the convention of the tool PMU, create a hwmon PMU that
exposes hwmon data for reading. For example, the following shows
reading the CPU temperature and 2 fan speeds alongside the uncore
frequency:
```
$ perf stat -e temp_cpu,fan1,hwmon_thinkpad/fan2/,tool/num_cpus_online/ -M UNCORE_FREQ -I 1000
     1.001153138              52.00 'C   temp_cpu
     1.001153138              2,588 rpm  fan1
     1.001153138              2,482 rpm  hwmon_thinkpad/fan2/
     1.001153138                  8      tool/num_cpus_online/
     1.001153138      1,077,101,397      UNC_CLOCK.SOCKET                 #     1.08 UNCORE_FREQ
     1.001153138      1,012,773,595      duration_time
...
```

Additional data on the hwmon events is in perf list:
```
$ perf list
...
hwmon:
...
  temp_core_0 OR temp2
       [Temperature in unit coretemp named Core 0. crit=100'C,max=100'C crit_alarm=0'C. Unit:
        hwmon_coretemp]
...
```

v6: Add string.h #include for issue reported by kernel test robot.
v5: Fix asan issue in parse_hwmon_filename caught by a TMA metric.
v4: Drop merged patches 1 to 10. Separate adding the hwmon_pmu from
    the update to perf_pmu to use it. Try to make source of literal
    strings clearer via named #defines. Fix a number of GCC warnings.
v3: Rebase, add Namhyung's acked-by to patches 1 to 10.
v2: Address Namhyung's review feedback. Rebase dropping 4 patches
    applied by Arnaldo, fix build breakage reported by Arnaldo.

Ian Rogers (5):
  tools api io: Ensure line_len_out is always initialized
  perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs
  perf pmu: Add calls enabling the hwmon_pmu
  perf test: Add hwmon "PMU" test
  perf docs: Document tool and hwmon events

 tools/lib/api/io.h                     |   1 +
 tools/perf/Documentation/perf-list.txt |  15 +
 tools/perf/tests/Build                 |   1 +
 tools/perf/tests/builtin-test.c        |   1 +
 tools/perf/tests/hwmon_pmu.c           | 243 ++++++++
 tools/perf/tests/tests.h               |   1 +
 tools/perf/util/Build                  |   1 +
 tools/perf/util/evsel.c                |   9 +
 tools/perf/util/hwmon_pmu.c            | 821 +++++++++++++++++++++++++
 tools/perf/util/hwmon_pmu.h            | 154 +++++
 tools/perf/util/pmu.c                  |  20 +
 tools/perf/util/pmu.h                  |   2 +
 tools/perf/util/pmus.c                 |   9 +
 tools/perf/util/pmus.h                 |   3 +
 14 files changed, 1281 insertions(+)
 create mode 100644 tools/perf/tests/hwmon_pmu.c
 create mode 100644 tools/perf/util/hwmon_pmu.c
 create mode 100644 tools/perf/util/hwmon_pmu.h

-- 
2.47.0.163.g1226f6d8fa-goog



* [PATCH v6 1/5] tools api io: Ensure line_len_out is always initialized
  2024-10-22 18:06 [PATCH v6 0/5] Hwmon PMUs Ian Rogers
@ 2024-10-22 18:06 ` Ian Rogers
  2024-10-22 18:06 ` [PATCH v6 2/5] perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs Ian Rogers
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2024-10-22 18:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Ravi Bangoria, Weilin Wang,
	Yoshihiro Furudera, James Clark, Athira Jajeev, Howard Chu,
	Oliver Upton, Changbin Du, Ze Gao, Junhao He, linux-kernel,
	linux-perf-users

Ensure initialization to avoid compiler warnings about potential use
of uninitialized variables.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/lib/api/io.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/lib/api/io.h b/tools/lib/api/io.h
index d3eb04d1bc89..1731996b2c32 100644
--- a/tools/lib/api/io.h
+++ b/tools/lib/api/io.h
@@ -189,6 +189,7 @@ static inline ssize_t io__getdelim(struct io *io, char **line_out, size_t *line_
 err_out:
 	free(line);
 	*line_out = NULL;
+	*line_len_out = 0;
 	return -ENOMEM;
 }
 
-- 
2.47.0.163.g1226f6d8fa-goog



* [PATCH v6 2/5] perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs
  2024-10-22 18:06 [PATCH v6 0/5] Hwmon PMUs Ian Rogers
  2024-10-22 18:06 ` [PATCH v6 1/5] tools api io: Ensure line_len_out is always initialized Ian Rogers
@ 2024-10-22 18:06 ` Ian Rogers
  2024-10-22 18:06 ` [PATCH v6 3/5] perf pmu: Add calls enabling the hwmon_pmu Ian Rogers
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2024-10-22 18:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Ravi Bangoria, Weilin Wang,
	Yoshihiro Furudera, James Clark, Athira Jajeev, Howard Chu,
	Oliver Upton, Changbin Du, Ze Gao, Junhao He, linux-kernel,
	linux-perf-users

Add a tool PMU for hwmon events but don't enable it yet.

The hwmon sysfs ABI is defined in
Documentation/hwmon/sysfs-interface.rst. Create a PMU that reads the
hwmon input and can be used in `perf stat` and metrics much as an
uncore PMU can.

For example, when enabled by a later patch, the following shows
reading the CPU temperature and 2 fan speeds alongside the uncore
frequency:
```
$ perf stat -e temp_cpu,fan1,hwmon_thinkpad/fan2/,tool/num_cpus_online/ -M UNCORE_FREQ -I 1000
     1.001153138              52.00 'C   temp_cpu
     1.001153138              2,588 rpm  fan1
     1.001153138              2,482 rpm  hwmon_thinkpad/fan2/
     1.001153138                  8      tool/num_cpus_online/
     1.001153138      1,077,101,397      UNC_CLOCK.SOCKET                 #     1.08 UNCORE_FREQ
     1.001153138      1,012,773,595      duration_time
...
```

The PMUs are named from /sys/class/hwmon/hwmon<num>/name and have an
alias of hwmon<num>.
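
As a rough illustration of that naming, the sketch below builds a
"hwmon_<name>" string from the contents of a name file. The helper
make_hwmon_pmu_name() is hypothetical and exists only for this example;
the set of replaced characters follows fix_name() in the diff below, and
the buffer handling is an assumption:
```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: build a PMU name of the form "hwmon_<name>" from the
 * contents of a hwmon "name" file. make_hwmon_pmu_name() is a hypothetical
 * helper, not part of this series.
 */
int make_hwmon_pmu_name(const char *name_file_contents, char *out, size_t out_len)
{
	int len = snprintf(out, out_len, "hwmon_%s", name_file_contents);

	if (len < 0 || (size_t)len >= out_len)
		return -1; /* Encoding error or the name did not fit. */

	/* Skip the "hwmon_" prefix and tidy the remainder in place. */
	for (char *p = out + strlen("hwmon_"); *p != '\0'; p++) {
		if (*p == '\n') {
			*p = '\0'; /* sysfs files typically end with a newline. */
			break;
		}
		if (strchr(" :,/\t", *p))
			*p = '_';
		else
			*p = (char)tolower((unsigned char)*p);
	}
	return 0;
}
```
With this sketch a name file containing "thinkpad" would give
"hwmon_thinkpad", matching the PMU name used in the perf stat example.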

Hwmon data is presented in multiple <type><number>_<item> files. An
event is identified either by <type><number> or by <type> followed by
the contents of the <type><number>_label file, if it exists. The
<type><number>_input file gives the data read by perf.
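
To make the file naming concrete, here is a minimal sketch of splitting
such a filename into its parts. split_hwmon_name() is a hypothetical
helper written for this example; it is simpler than the
parse_hwmon_filename() added by the patch and does not validate the type
or item against the known lists:
```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative only: split "<type><number>_<item>" into its parts.
 * split_hwmon_name() is a hypothetical helper, not a function from this
 * series.
 */
bool split_hwmon_name(const char *filename, char *type, size_t type_len,
		      int *number, const char **item)
{
	size_t i = 0;
	char *end;

	/* The type is the leading run of non-digit characters, e.g. "temp". */
	while (filename[i] != '\0' && !isdigit((unsigned char)filename[i]))
		i++;
	if (i == 0 || i >= type_len || filename[i] == '\0')
		return false;
	memcpy(type, filename, i);
	type[i] = '\0';

	/* The number follows the type, e.g. the 1 in "temp1_input". */
	*number = (int)strtol(&filename[i], &end, 10);
	if (*end != '_' || end[1] == '\0')
		return false;

	/* Whatever follows the underscore is the item, e.g. "input". */
	*item = end + 1;
	return true;
}
```
For "temp1_input" this yields the type "temp", the number 1 and the item
"input". A file such as "name" has no number and is rejected.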

When enabled by a later patch, in `perf list` the other hwmon <item>
files are used to give a richer description, for example:
```
hwmon:
  temp1
       [Temperature in unit acpitz named temp1. Unit: hwmon_acpitz]
  in0
       [Voltage in unit bat0 named in0. Unit: hwmon_bat0]
  temp_core_0 OR temp2
       [Temperature in unit coretemp named Core 0. crit=100'C,max=100'C crit_alarm=0'C. Unit:
        hwmon_coretemp]
  temp_core_1 OR temp3
       [Temperature in unit coretemp named Core 1. crit=100'C,max=100'C crit_alarm=0'C. Unit:
        hwmon_coretemp]
...
  temp_package_id_0 OR temp1
       [Temperature in unit coretemp named Package id 0. crit=100'C,max=100'C crit_alarm=0'C.
        Unit: hwmon_coretemp]
  temp1
       [Temperature in unit iwlwifi_1 named temp1. Unit: hwmon_iwlwifi_1]
  temp_composite OR temp1
       [Temperature in unit nvme named Composite. alarm=0'C,crit=86.85'C,max=75.85'C,
        min=-273.15'C. Unit: hwmon_nvme]
  temp_sensor_1 OR temp2
       [Temperature in unit nvme named Sensor 1. max=65261.8'C,min=-273.15'C. Unit: hwmon_nvme]
  temp_sensor_2 OR temp3
       [Temperature in unit nvme named Sensor 2. max=65261.8'C,min=-273.15'C. Unit: hwmon_nvme]
  fan1
       [Fan in unit thinkpad named fan1. Unit: hwmon_thinkpad]
  fan2
       [Fan in unit thinkpad named fan2. Unit: hwmon_thinkpad]
...
  temp_cpu OR temp1
       [Temperature in unit thinkpad named CPU. Unit: hwmon_thinkpad]
  temp_gpu OR temp2
       [Temperature in unit thinkpad named GPU. Unit: hwmon_thinkpad]
  curr1
       [Current in unit ucsi_source_psy_usbc000_0 named curr1. max=1.5A. Unit:
        hwmon_ucsi_source_psy_usbc000_0]
  in0
       [Voltage in unit ucsi_source_psy_usbc000_0 named in0. max=5V,min=5V. Unit:
        hwmon_ucsi_source_psy_usbc000_0]
```

As there may be multiple hwmon devices, a range of PMU types is
reserved for their use. A type within that range also identifies a PMU
as a hwmon PMU.
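
A minimal sketch of that encoding follows. The EXAMPLE_* constants and
both helpers are hypothetical and chosen only for illustration; the real
range bounds are PERF_PMU_TYPE_HWMON_START and PERF_PMU_TYPE_HWMON_END:
```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bounds for illustration; not the values used by perf. */
#define EXAMPLE_HWMON_START 0xFFFF0000u
#define EXAMPLE_HWMON_END   0xFFFFFFFDu

/* Map a sysfs directory name such as "hwmon3" to a PMU type; 0 on failure. */
unsigned int example_hwmon_type(const char *sysfs_name)
{
	unsigned long num;
	char *end;

	if (strncmp(sysfs_name, "hwmon", 5) != 0)
		return 0;
	num = strtoul(sysfs_name + 5, &end, 10);
	if (end == sysfs_name + 5 || *end != '\0')
		return 0; /* No digits, or trailing characters. */
	if (num > EXAMPLE_HWMON_END - EXAMPLE_HWMON_START)
		return 0; /* Does not fit in the reserved range. */
	return EXAMPLE_HWMON_START + (unsigned int)num;
}

/* A PMU type inside the reserved range belongs to a hwmon PMU. */
bool example_is_hwmon_type(unsigned int type)
{
	return type >= EXAMPLE_HWMON_START && type <= EXAMPLE_HWMON_END;
}
```
Deriving the type from the hwmon<num> directory number keeps each device's
type stable for as long as its sysfs directory name is stable.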

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/Build       |   1 +
 tools/perf/util/hwmon_pmu.c | 821 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/hwmon_pmu.h | 154 +++++++
 tools/perf/util/pmu.h       |   2 +
 4 files changed, 978 insertions(+)
 create mode 100644 tools/perf/util/hwmon_pmu.c
 create mode 100644 tools/perf/util/hwmon_pmu.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 1eedead5f2f2..78b990c04f71 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -83,6 +83,7 @@ perf-util-y += pmu.o
 perf-util-y += pmus.o
 perf-util-y += pmu-flex.o
 perf-util-y += pmu-bison.o
+perf-util-y += hwmon_pmu.o
 perf-util-y += tool_pmu.o
 perf-util-y += svghelper.o
 perf-util-$(CONFIG_LIBTRACEEVENT) += trace-event-info.o
diff --git a/tools/perf/util/hwmon_pmu.c b/tools/perf/util/hwmon_pmu.c
new file mode 100644
index 000000000000..519f2a8e512e
--- /dev/null
+++ b/tools/perf/util/hwmon_pmu.c
@@ -0,0 +1,821 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+#include "counts.h"
+#include "debug.h"
+#include "evsel.h"
+#include "hashmap.h"
+#include "hwmon_pmu.h"
+#include "pmu.h"
+#include <internal/xyarray.h>
+#include <internal/threadmap.h>
+#include <perf/threadmap.h>
+#include <sys/types.h>
+#include <ctype.h>
+#include <dirent.h>
+#include <fcntl.h>
+#include <string.h>
+#include <api/fs/fs.h>
+#include <api/io.h>
+#include <linux/zalloc.h>
+
+const char * const hwmon_type_strs[HWMON_TYPE_MAX] = {
+	NULL,
+	"cpu",
+	"curr",
+	"energy",
+	"fan",
+	"humidity",
+	"in",
+	"intrusion",
+	"power",
+	"pwm",
+	"temp",
+};
+#define LONGEST_HWMON_TYPE_STR "intrusion"
+
+static const char *const hwmon_units[HWMON_TYPE_MAX] = {
+	NULL,
+	"V",   /* cpu */
+	"A",   /* curr */
+	"J",   /* energy */
+	"rpm", /* fan */
+	"%",   /* humidity */
+	"V",   /* in */
+	"",    /* intrusion */
+	"W",   /* power */
+	"Hz",  /* pwm */
+	"'C",  /* temp */
+};
+
+const char * const hwmon_item_strs[HWMON_ITEM__MAX] = {
+	NULL,
+	"accuracy",
+	"alarm",
+	"auto_channels_temp",
+	"average",
+	"average_highest",
+	"average_interval",
+	"average_interval_max",
+	"average_interval_min",
+	"average_lowest",
+	"average_max",
+	"average_min",
+	"beep",
+	"cap",
+	"cap_hyst",
+	"cap_max",
+	"cap_min",
+	"crit",
+	"crit_hyst",
+	"div",
+	"emergency",
+	"emergency_hist",
+	"enable",
+	"fault",
+	"freq",
+	"highest",
+	"input",
+	"label",
+	"lcrit",
+	"lcrit_hyst",
+	"lowest",
+	"max",
+	"max_hyst",
+	"min",
+	"min_hyst",
+	"mod",
+	"offset",
+	"pulses",
+	"rated_max",
+	"rated_min",
+	"reset_history",
+	"target",
+	"type",
+	"vid",
+};
+#define LONGEST_HWMON_ITEM_STR "average_interval_max"
+
+struct hwmon_pmu {
+	struct perf_pmu pmu;
+	struct hashmap events;
+	int hwmon_dir_fd;
+};
+
+/**
+ * union hwmon_pmu_event_key: Key for hwmon_pmu->events. Each key represents an
+ * event.
+ *
+ * Related hwmon files start with the <type><number> that this key represents.
+ */
+union hwmon_pmu_event_key {
+	long type_and_num;
+	struct {
+		int num :16;
+		enum hwmon_type type :8;
+	};
+};
+
+/**
+ * struct hwmon_pmu_event_value: Value in hwmon_pmu->events.
+ *
+ * Hwmon files are of the form <type><number>_<item> and may have a suffix
+ * _alarm.
+ */
+struct hwmon_pmu_event_value {
+	/** @items: which item files are present. */
+	DECLARE_BITMAP(items, HWMON_ITEM__MAX);
+	/** @alarm_items: which item files with an _alarm suffix are present. */
+	DECLARE_BITMAP(alarm_items, HWMON_ITEM__MAX);
+	/** @label: contents of <type><number>_label if present. */
+	char *label;
+	/** @name: name computed from label of the form <type>_<label>. */
+	char *name;
+};
+
+bool perf_pmu__is_hwmon(const struct perf_pmu *pmu)
+{
+	return pmu && pmu->type >= PERF_PMU_TYPE_HWMON_START &&
+		pmu->type <= PERF_PMU_TYPE_HWMON_END;
+}
+
+bool evsel__is_hwmon(const struct evsel *evsel)
+{
+	return perf_pmu__is_hwmon(evsel->pmu);
+}
+
+static size_t hwmon_pmu__event_hashmap_hash(long key, void *ctx __maybe_unused)
+{
+	return ((union hwmon_pmu_event_key)key).type_and_num;
+}
+
+static bool hwmon_pmu__event_hashmap_equal(long key1, long key2, void *ctx __maybe_unused)
+{
+	return ((union hwmon_pmu_event_key)key1).type_and_num ==
+	       ((union hwmon_pmu_event_key)key2).type_and_num;
+}
+
+static int hwmon_strcmp(const void *a, const void *b)
+{
+	const char *sa = a;
+	const char * const *sb = b;
+
+	return strcmp(sa, *sb);
+}
+
+bool parse_hwmon_filename(const char *filename,
+			  enum hwmon_type *type,
+			  int *number,
+			  enum hwmon_item *item,
+			  bool *alarm)
+{
+	char fn_type[24];
+	const char **elem;
+	const char *fn_item = NULL;
+	size_t fn_item_len;
+
+	assert(strlen(LONGEST_HWMON_TYPE_STR) < sizeof(fn_type));
+	strlcpy(fn_type, filename, sizeof(fn_type));
+	for (size_t i = 0; fn_type[i] != '\0'; i++) {
+		if (fn_type[i] >= '0' && fn_type[i] <= '9') {
+			fn_type[i] = '\0';
+			*number = strtoul(&filename[i], (char **)&fn_item, 10);
+			if (*fn_item == '_')
+				fn_item++;
+			break;
+		}
+		if (fn_type[i] == '_') {
+			fn_type[i] = '\0';
+			*number = -1;
+			fn_item = &filename[i + 1];
+			break;
+		}
+	}
+	if (fn_item == NULL || fn_type[0] == '\0' || (item != NULL && fn_item[0] == '\0')) {
+		pr_debug("hwmon_pmu: not a hwmon file '%s'\n", filename);
+		return false;
+	}
+	elem = bsearch(&fn_type, hwmon_type_strs + 1, ARRAY_SIZE(hwmon_type_strs) - 1,
+		       sizeof(hwmon_type_strs[0]), hwmon_strcmp);
+	if (!elem) {
+		pr_debug("hwmon_pmu: not a hwmon type '%s' in file name '%s'\n",
+			 fn_type, filename);
+		return false;
+	}
+
+	*type = elem - &hwmon_type_strs[0];
+	if (!item)
+		return true;
+
+	*alarm = false;
+	fn_item_len = strlen(fn_item);
+	if (fn_item_len > 6 && !strcmp(&fn_item[fn_item_len - 6], "_alarm")) {
+		assert(strlen(LONGEST_HWMON_ITEM_STR) < sizeof(fn_type));
+		strlcpy(fn_type, fn_item, fn_item_len - 5);
+		fn_item = fn_type;
+		*alarm = true;
+	}
+	elem = bsearch(fn_item, hwmon_item_strs + 1, ARRAY_SIZE(hwmon_item_strs) - 1,
+		       sizeof(hwmon_item_strs[0]), hwmon_strcmp);
+	if (!elem) {
+		pr_debug("hwmon_pmu: not a hwmon item '%s' in file name '%s'\n",
+			 fn_item, filename);
+		return false;
+	}
+	*item = elem - &hwmon_item_strs[0];
+	return true;
+}
+
+static void fix_name(char *p)
+{
+	char *s = strchr(p, '\n');
+
+	if (s)
+		*s = '\0';
+
+	while (*p != '\0') {
+		if (strchr(" :,/\n\t", *p))
+			*p = '_';
+		else
+			*p = tolower(*p);
+		p++;
+	}
+}
+
+static int hwmon_pmu__read_events(struct hwmon_pmu *pmu)
+{
+	DIR *dir;
+	struct dirent *ent;
+	int dup_fd, err = 0;
+	struct hashmap_entry *cur, *tmp;
+	size_t bkt;
+
+	if (pmu->pmu.sysfs_aliases_loaded)
+		return 0;
+
+	/* Use a dup-ed fd as closedir will close it. */
+	dup_fd = dup(pmu->hwmon_dir_fd);
+	if (dup_fd == -1)
+		return -ENOMEM;
+
+	dir = fdopendir(dup_fd);
+	if (!dir) {
+		close(dup_fd);
+		return -ENOMEM;
+	}
+
+	while ((ent = readdir(dir)) != NULL) {
+		enum hwmon_type type;
+		int number;
+		enum hwmon_item item;
+		bool alarm;
+		union hwmon_pmu_event_key key = {};
+		struct hwmon_pmu_event_value *value;
+
+		if (ent->d_type != DT_REG)
+			continue;
+
+		if (!parse_hwmon_filename(ent->d_name, &type, &number, &item, &alarm)) {
+			pr_debug("Not a hwmon file '%s'\n", ent->d_name);
+			continue;
+		}
+		key.num = number;
+		key.type = type;
+		if (!hashmap__find(&pmu->events, key.type_and_num, &value)) {
+			value = zalloc(sizeof(*value));
+			if (!value) {
+				err = -ENOMEM;
+				goto err_out;
+			}
+			err = hashmap__add(&pmu->events, key.type_and_num, value);
+			if (err) {
+				free(value);
+				err = -ENOMEM;
+				goto err_out;
+			}
+		}
+		__set_bit(item, alarm ? value->alarm_items : value->items);
+		if (item == HWMON_ITEM_LABEL) {
+			char buf[128];
+			int fd = openat(pmu->hwmon_dir_fd, ent->d_name, O_RDONLY);
+			ssize_t read_len;
+
+			if (fd < 0)
+				continue;
+
+			read_len = read(fd, buf, sizeof(buf));
+
+			while (read_len > 0 && buf[read_len - 1] == '\n')
+				read_len--;
+
+			if (read_len > 0)
+				buf[read_len] = '\0';
+
+			if (buf[0] == '\0') {
+				pr_debug("hwmon_pmu: empty label file %s %s\n",
+					 pmu->pmu.name, ent->d_name);
+				close(fd);
+				continue;
+			}
+			value->label = strdup(buf);
+			if (!value->label) {
+				pr_debug("hwmon_pmu: memory allocation failure\n");
+				close(fd);
+				continue;
+			}
+			snprintf(buf, sizeof(buf), "%s_%s", hwmon_type_strs[type], value->label);
+			fix_name(buf);
+			value->name = strdup(buf);
+			if (!value->name)
+				pr_debug("hwmon_pmu: memory allocation failure\n");
+			close(fd);
+		}
+	}
+	hashmap__for_each_entry_safe((&pmu->events), cur, tmp, bkt) {
+		union hwmon_pmu_event_key key = {
+			.type_and_num = cur->key,
+		};
+		struct hwmon_pmu_event_value *value = cur->pvalue;
+
+		if (!test_bit(HWMON_ITEM_INPUT, value->items)) {
+			pr_debug("hwmon_pmu: removing event '%s%d' that has no input file\n",
+				hwmon_type_strs[key.type], key.num);
+			hashmap__delete(&pmu->events, key.type_and_num, &key, &value);
+			zfree(&value->label);
+			zfree(&value->name);
+			free(value);
+		}
+	}
+	pmu->pmu.sysfs_aliases_loaded = true;
+
+err_out:
+	closedir(dir);
+	return err;
+}
+
+struct perf_pmu *hwmon_pmu__new(struct list_head *pmus, int hwmon_dir, const char *sysfs_name, const char *name)
+{
+	char buf[32];
+	struct hwmon_pmu *hwm;
+
+	hwm = zalloc(sizeof(*hwm));
+	if (!hwm)
+		return NULL;
+
+
+	hwm->hwmon_dir_fd = hwmon_dir;
+	hwm->pmu.type = PERF_PMU_TYPE_HWMON_START + strtoul(sysfs_name + 5, NULL, 10);
+	if (hwm->pmu.type > PERF_PMU_TYPE_HWMON_END) {
+		pr_err("Unable to encode hwmon type from %s in valid PMU type\n", sysfs_name);
+		goto err_out;
+	}
+	snprintf(buf, sizeof(buf), "hwmon_%s", name);
+	fix_name(buf + 6);
+	hwm->pmu.name = strdup(buf);
+	if (!hwm->pmu.name)
+		goto err_out;
+	hwm->pmu.alias_name = strdup(sysfs_name);
+	if (!hwm->pmu.alias_name)
+		goto err_out;
+	hwm->pmu.cpus = perf_cpu_map__new("0");
+	if (!hwm->pmu.cpus)
+		goto err_out;
+	INIT_LIST_HEAD(&hwm->pmu.format);
+	INIT_LIST_HEAD(&hwm->pmu.aliases);
+	INIT_LIST_HEAD(&hwm->pmu.caps);
+	hashmap__init(&hwm->events, hwmon_pmu__event_hashmap_hash,
+		      hwmon_pmu__event_hashmap_equal, /*ctx=*/NULL);
+
+	list_add_tail(&hwm->pmu.list, pmus);
+	return &hwm->pmu;
+err_out:
+	free((char *)hwm->pmu.name);
+	free(hwm->pmu.alias_name);
+	free(hwm);
+	close(hwmon_dir);
+	return NULL;
+}
+
+void hwmon_pmu__exit(struct perf_pmu *pmu)
+{
+	struct hwmon_pmu *hwm = container_of(pmu, struct hwmon_pmu, pmu);
+	struct hashmap_entry *cur, *tmp;
+	size_t bkt;
+
+	hashmap__for_each_entry_safe((&hwm->events), cur, tmp, bkt) {
+		struct hwmon_pmu_event_value *value = cur->pvalue;
+
+		zfree(&value->label);
+		zfree(&value->name);
+		free(value);
+	}
+	hashmap__clear(&hwm->events);
+	close(hwm->hwmon_dir_fd);
+}
+
+static size_t hwmon_pmu__describe_items(struct hwmon_pmu *hwm, char *out_buf, size_t out_buf_len,
+					union hwmon_pmu_event_key key,
+					const unsigned long *items, bool is_alarm)
+{
+	size_t bit;
+	char buf[64];
+	size_t len = 0;
+
+	for_each_set_bit(bit, items, HWMON_ITEM__MAX) {
+		int fd;
+
+		if (bit == HWMON_ITEM_LABEL || bit == HWMON_ITEM_INPUT)
+			continue;
+
+		snprintf(buf, sizeof(buf), "%s%d_%s%s",
+			hwmon_type_strs[key.type],
+			key.num,
+			hwmon_item_strs[bit],
+			is_alarm ? "_alarm" : "");
+		fd = openat(hwm->hwmon_dir_fd, buf, O_RDONLY);
+		if (fd >= 0) {
+			ssize_t read_len = read(fd, buf, sizeof(buf));
+
+			while (read_len > 0 && buf[read_len - 1] == '\n')
+				read_len--;
+
+			if (read_len > 0) {
+				long long val;
+
+				buf[read_len] = '\0';
+				val = strtoll(buf, /*endptr=*/NULL, 10);
+				len += snprintf(out_buf + len, out_buf_len - len, "%s%s%s=%g%s",
+						len == 0 ? " " : ", ",
+						hwmon_item_strs[bit],
+						is_alarm ? "_alarm" : "",
+						(double)val / 1000.0,
+						hwmon_units[key.type]);
+			}
+			close(fd);
+		}
+	}
+	return len;
+}
+
+int hwmon_pmu__for_each_event(struct perf_pmu *pmu, void *state, pmu_event_callback cb)
+{
+	struct hwmon_pmu *hwm = container_of(pmu, struct hwmon_pmu, pmu);
+	struct hashmap_entry *cur;
+	size_t bkt;
+
+	if (hwmon_pmu__read_events(hwm))
+		return false;
+
+	hashmap__for_each_entry((&hwm->events), cur, bkt) {
+		static const char *const hwmon_scale_units[HWMON_TYPE_MAX] = {
+			NULL,
+			"0.001V", /* cpu */
+			"0.001A", /* curr */
+			"0.001J", /* energy */
+			"1rpm",   /* fan */
+			"0.001%", /* humidity */
+			"0.001V", /* in */
+			NULL,     /* intrusion */
+			"0.001W", /* power */
+			"1Hz",    /* pwm */
+			"0.001'C", /* temp */
+		};
+		static const char *const hwmon_desc[HWMON_TYPE_MAX] = {
+			NULL,
+			"CPU core reference voltage",   /* cpu */
+			"Current",                      /* curr */
+			"Cumulative energy use",        /* energy */
+			"Fan",                          /* fan */
+			"Humidity",                     /* humidity */
+			"Voltage",                      /* in */
+			"Chassis intrusion detection",  /* intrusion */
+			"Power use",                    /* power */
+			"Pulse width modulation fan control", /* pwm */
+			"Temperature",                  /* temp */
+		};
+		char alias_buf[64];
+		char desc_buf[256];
+		char encoding_buf[128];
+		union hwmon_pmu_event_key key = {
+			.type_and_num = cur->key,
+		};
+		struct hwmon_pmu_event_value *value = cur->pvalue;
+		struct pmu_event_info info = {
+			.pmu = pmu,
+			.name = value->name,
+			.alias = alias_buf,
+			.scale_unit = hwmon_scale_units[key.type],
+			.desc = desc_buf,
+			.long_desc = NULL,
+			.encoding_desc = encoding_buf,
+			.topic = "hwmon",
+			.pmu_name = pmu->name,
+			.event_type_desc = "Hwmon event",
+		};
+		int ret;
+		size_t len;
+
+		len = snprintf(alias_buf, sizeof(alias_buf), "%s%d",
+			       hwmon_type_strs[key.type], key.num);
+		if (!info.name) {
+			info.name = info.alias;
+			info.alias = NULL;
+		}
+
+		len = snprintf(desc_buf, sizeof(desc_buf), "%s in unit %s named %s.",
+			hwmon_desc[key.type],
+			pmu->name + 6,
+			value->label ?: info.name);
+
+		len += hwmon_pmu__describe_items(hwm, desc_buf + len, sizeof(desc_buf) - len,
+						key, value->items, /*is_alarm=*/false);
+
+		len += hwmon_pmu__describe_items(hwm, desc_buf + len, sizeof(desc_buf) - len,
+						key, value->alarm_items, /*is_alarm=*/true);
+
+		snprintf(encoding_buf, sizeof(encoding_buf), "%s/config=0x%lx/",
+			 pmu->name, cur->key);
+
+		ret = cb(state, &info);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
+size_t hwmon_pmu__num_events(struct perf_pmu *pmu)
+{
+	struct hwmon_pmu *hwm = container_of(pmu, struct hwmon_pmu, pmu);
+
+	hwmon_pmu__read_events(hwm);
+	return hashmap__size(&hwm->events);
+}
+
+bool hwmon_pmu__have_event(struct perf_pmu *pmu, const char *name)
+{
+	struct hwmon_pmu *hwm = container_of(pmu, struct hwmon_pmu, pmu);
+	enum hwmon_type type;
+	int number;
+	union hwmon_pmu_event_key key = {};
+	struct hashmap_entry *cur;
+	size_t bkt;
+
+	if (!parse_hwmon_filename(name, &type, &number, /*item=*/NULL, /*is_alarm=*/NULL))
+		return false;
+
+	if (hwmon_pmu__read_events(hwm))
+		return false;
+
+	key.type = type;
+	key.num = number;
+	if (hashmap_find(&hwm->events, key.type_and_num, /*value=*/NULL))
+		return true;
+	if (key.num != -1)
+		return false;
+	/* Item is of form <type>_ which means we should match <type>_<label>. */
+	hashmap__for_each_entry((&hwm->events), cur, bkt) {
+		struct hwmon_pmu_event_value *value = cur->pvalue;
+
+		key.type_and_num = cur->key;
+		if (key.type == type && value->name && !strcasecmp(name, value->name))
+			return true;
+	}
+	return false;
+}
+
+static int hwmon_pmu__config_term(const struct hwmon_pmu *hwm,
+				  struct perf_event_attr *attr,
+				  struct parse_events_term *term,
+				  struct parse_events_error *err)
+{
+	if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER) {
+		enum hwmon_type type;
+		int number;
+
+		if (parse_hwmon_filename(term->config, &type, &number,
+					 /*item=*/NULL, /*is_alarm=*/NULL)) {
+			if (number == -1) {
+				/*
+				 * Item is of form <type>_ which means we should
+				 * match <type>_<label>.
+				 */
+				struct hashmap_entry *cur;
+				size_t bkt;
+
+				attr->config = 0;
+				hashmap__for_each_entry((&hwm->events), cur, bkt) {
+					union hwmon_pmu_event_key key = {
+						.type_and_num = cur->key,
+					};
+					struct hwmon_pmu_event_value *value = cur->pvalue;
+
+					if (key.type == type && value->name &&
+					    !strcasecmp(term->config, value->name)) {
+						attr->config = key.type_and_num;
+						break;
+					}
+				}
+				if (attr->config == 0)
+					return -EINVAL;
+			} else {
+				union hwmon_pmu_event_key key = {
+					.type = type,
+					.num = number,
+				};
+
+				attr->config = key.type_and_num;
+			}
+			return 0;
+		}
+	}
+	if (err) {
+		char *err_str;
+
+		parse_events_error__handle(err, term->err_val,
+					asprintf(&err_str,
+						"unexpected hwmon event term (%s) %s",
+						parse_events__term_type_str(term->type_term),
+						term->config) < 0
+					? strdup("unexpected hwmon event term")
+					: err_str,
+					NULL);
+	}
+	return -EINVAL;
+}
+
+int hwmon_pmu__config_terms(const struct perf_pmu *pmu,
+			    struct perf_event_attr *attr,
+			    struct parse_events_terms *terms,
+			    struct parse_events_error *err)
+{
+	const struct hwmon_pmu *hwm = container_of(pmu, struct hwmon_pmu, pmu);
+	struct parse_events_term *term;
+
+	assert(pmu->sysfs_aliases_loaded);
+	list_for_each_entry(term, &terms->terms, list) {
+		if (hwmon_pmu__config_term(hwm, attr, term, err))
+			return -EINVAL;
+	}
+
+	return 0;
+
+}
+
+int hwmon_pmu__check_alias(struct parse_events_terms *terms, struct perf_pmu_info *info,
+			   struct parse_events_error *err)
+{
+	struct parse_events_term *term =
+		list_first_entry(&terms->terms, struct parse_events_term, list);
+
+	if (term->type_term == PARSE_EVENTS__TERM_TYPE_USER) {
+		enum hwmon_type type;
+		int number;
+
+		if (parse_hwmon_filename(term->config, &type, &number,
+					 /*item=*/NULL, /*is_alarm=*/NULL)) {
+			info->unit = hwmon_units[type];
+			if (type == HWMON_TYPE_FAN || type == HWMON_TYPE_PWM ||
+			    type == HWMON_TYPE_INTRUSION)
+				info->scale = 1;
+			else
+				info->scale = 0.001;
+		}
+		return 0;
+	}
+	if (err) {
+		char *err_str;
+
+		parse_events_error__handle(err, term->err_val,
+					asprintf(&err_str,
+						"unexpected hwmon event term (%s) %s",
+						parse_events__term_type_str(term->type_term),
+						term->config) < 0
+					? strdup("unexpected hwmon event term")
+					: err_str,
+					NULL);
+	}
+	return -EINVAL;
+}
+
+int perf_pmus__read_hwmon_pmus(struct list_head *pmus)
+{
+	char *line = NULL;
+	DIR *class_hwmon_dir;
+	struct dirent *class_hwmon_ent;
+	char buf[PATH_MAX];
+	const char *sysfs = sysfs__mountpoint();
+
+	if (!sysfs)
+		return 0;
+
+	scnprintf(buf, sizeof(buf), "%s/class/hwmon/", sysfs);
+	class_hwmon_dir = opendir(buf);
+	if (!class_hwmon_dir)
+		return 0;
+
+	while ((class_hwmon_ent = readdir(class_hwmon_dir)) != NULL) {
+		size_t line_len;
+		int hwmon_dir, name_fd;
+		struct io io;
+
+		if (class_hwmon_ent->d_type != DT_LNK)
+			continue;
+
+		scnprintf(buf, sizeof(buf), "%s/class/hwmon/%s", sysfs, class_hwmon_ent->d_name);
+		hwmon_dir = open(buf, O_DIRECTORY);
+		if (hwmon_dir == -1) {
+			pr_debug("hwmon_pmu: not a directory: '%s/class/hwmon/%s'\n",
+				 sysfs, class_hwmon_ent->d_name);
+			continue;
+		}
+		name_fd = openat(hwmon_dir, "name", O_RDONLY);
+		if (name_fd == -1) {
+			pr_debug("hwmon_pmu: failure to open '%s/class/hwmon/%s/name'\n",
+				  sysfs, class_hwmon_ent->d_name);
+			close(hwmon_dir);
+			continue;
+		}
+		io__init(&io, name_fd, buf, sizeof(buf));
+		io__getline(&io, &line, &line_len);
+		if (line_len > 0 && line[line_len - 1] == '\n')
+			line[line_len - 1] = '\0';
+		hwmon_pmu__new(pmus, hwmon_dir, class_hwmon_ent->d_name, line);
+		close(name_fd);
+	}
+	free(line);
+	closedir(class_hwmon_dir);
+	return 0;
+}
+
+#define FD(e, x, y) (*(int *)xyarray__entry(e->core.fd, x, y))
+
+int evsel__hwmon_pmu_open(struct evsel *evsel,
+			  struct perf_thread_map *threads,
+			  int start_cpu_map_idx, int end_cpu_map_idx)
+{
+	struct hwmon_pmu *hwm = container_of(evsel->pmu, struct hwmon_pmu, pmu);
+	union hwmon_pmu_event_key key = {
+		.type_and_num = evsel->core.attr.config,
+	};
+	int idx = 0, thread = 0, nthreads, err = 0;
+
+	nthreads = perf_thread_map__nr(threads);
+	for (idx = start_cpu_map_idx; idx < end_cpu_map_idx; idx++) {
+		for (thread = 0; thread < nthreads; thread++) {
+			char buf[64];
+			int fd;
+
+			snprintf(buf, sizeof(buf), "%s%d_input",
+				 hwmon_type_strs[key.type], key.num);
+
+			fd = openat(hwm->hwmon_dir_fd, buf, O_RDONLY);
+			FD(evsel, idx, thread) = fd;
+			if (fd < 0) {
+				err = -errno;
+				goto out_close;
+			}
+		}
+	}
+	return 0;
+out_close:
+	if (err)
+		threads->err_thread = thread;
+
+	do {
+		while (--thread >= 0) {
+			if (FD(evsel, idx, thread) >= 0)
+				close(FD(evsel, idx, thread));
+			FD(evsel, idx, thread) = -1;
+		}
+		thread = nthreads;
+	} while (--idx >= 0);
+	return err;
+}
+
+int evsel__hwmon_pmu_read(struct evsel *evsel, int cpu_map_idx, int thread)
+{
+	char buf[32];
+	int fd;
+	ssize_t len;
+	struct perf_counts_values *count, *old_count = NULL;
+
+	if (evsel->prev_raw_counts)
+		old_count = perf_counts(evsel->prev_raw_counts, cpu_map_idx, thread);
+
+	count = perf_counts(evsel->counts, cpu_map_idx, thread);
+	fd = FD(evsel, cpu_map_idx, thread);
+	len = pread(fd, buf, sizeof(buf), 0);
+	if (len <= 0) {
+		count->lost++;
+		return -EINVAL;
+	}
+	buf[len] = '\0';
+	if (old_count) {
+		count->val = old_count->val + strtoll(buf, NULL, 10);
+		count->run = old_count->run + 1;
+		count->ena = old_count->ena + 1;
+	} else {
+		count->val = strtoll(buf, NULL, 10);
+		count->run++;
+		count->ena++;
+	}
+	return 0;
+}
diff --git a/tools/perf/util/hwmon_pmu.h b/tools/perf/util/hwmon_pmu.h
new file mode 100644
index 000000000000..8061301fcd8e
--- /dev/null
+++ b/tools/perf/util/hwmon_pmu.h
@@ -0,0 +1,154 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+#ifndef __HWMON_PMU_H
+#define __HWMON_PMU_H
+
+#include "pmu.h"
+
+struct list_head;
+
+/**
+ * enum hwmon_type:
+ *
+ * As described in Documentation/hwmon/sysfs-interface.rst hwmon events are
+ * defined over multiple files of the form <type><num>_<item>. This enum
+ * captures potential <type> values.
+ *
+ * This enum is exposed for testing.
+ */
+enum hwmon_type {
+	HWMON_TYPE_NONE,
+
+	HWMON_TYPE_CPU,
+	HWMON_TYPE_CURR,
+	HWMON_TYPE_ENERGY,
+	HWMON_TYPE_FAN,
+	HWMON_TYPE_HUMIDITY,
+	HWMON_TYPE_IN,
+	HWMON_TYPE_INTRUSION,
+	HWMON_TYPE_POWER,
+	HWMON_TYPE_PWM,
+	HWMON_TYPE_TEMP,
+
+	HWMON_TYPE_MAX
+};
+
+/**
+ * enum hwmon_item:
+ *
+ * Similar to enum hwmon_type but describes the item part of a sysfs filename.
+ *
+ * This enum is exposed for testing.
+ */
+enum hwmon_item {
+	HWMON_ITEM_NONE,
+
+	HWMON_ITEM_ACCURACY,
+	HWMON_ITEM_ALARM,
+	HWMON_ITEM_AUTO_CHANNELS_TEMP,
+	HWMON_ITEM_AVERAGE,
+	HWMON_ITEM_AVERAGE_HIGHEST,
+	HWMON_ITEM_AVERAGE_INTERVAL,
+	HWMON_ITEM_AVERAGE_INTERVAL_MAX,
+	HWMON_ITEM_AVERAGE_INTERVAL_MIN,
+	HWMON_ITEM_AVERAGE_LOWEST,
+	HWMON_ITEM_AVERAGE_MAX,
+	HWMON_ITEM_AVERAGE_MIN,
+	HWMON_ITEM_BEEP,
+	HWMON_ITEM_CAP,
+	HWMON_ITEM_CAP_HYST,
+	HWMON_ITEM_CAP_MAX,
+	HWMON_ITEM_CAP_MIN,
+	HWMON_ITEM_CRIT,
+	HWMON_ITEM_CRIT_HYST,
+	HWMON_ITEM_DIV,
+	HWMON_ITEM_EMERGENCY,
+	HWMON_ITEM_EMERGENCY_HYST,
+	HWMON_ITEM_ENABLE,
+	HWMON_ITEM_FAULT,
+	HWMON_ITEM_FREQ,
+	HWMON_ITEM_HIGHEST,
+	HWMON_ITEM_INPUT,
+	HWMON_ITEM_LABEL,
+	HWMON_ITEM_LCRIT,
+	HWMON_ITEM_LCRIT_HYST,
+	HWMON_ITEM_LOWEST,
+	HWMON_ITEM_MAX,
+	HWMON_ITEM_MAX_HYST,
+	HWMON_ITEM_MIN,
+	HWMON_ITEM_MIN_HYST,
+	HWMON_ITEM_MOD,
+	HWMON_ITEM_OFFSET,
+	HWMON_ITEM_PULSES,
+	HWMON_ITEM_RATED_MAX,
+	HWMON_ITEM_RATED_MIN,
+	HWMON_ITEM_RESET_HISTORY,
+	HWMON_ITEM_TARGET,
+	HWMON_ITEM_TYPE,
+	HWMON_ITEM_VID,
+
+	HWMON_ITEM__MAX,
+};
+
+/** Strings that correspond to enum hwmon_type. */
+extern const char * const hwmon_type_strs[HWMON_TYPE_MAX];
+/** Strings that correspond to enum hwmon_item. */
+extern const char * const hwmon_item_strs[HWMON_ITEM__MAX];
+
+bool perf_pmu__is_hwmon(const struct perf_pmu *pmu);
+bool evsel__is_hwmon(const struct evsel *evsel);
+
+/**
+ * parse_hwmon_filename() - Parse filename into constituent parts.
+ *
+ * @filename: To be parsed, of the form <type><number>_<item>.
+ * @type: The type defined from the parsed file name.
+ * @number: The number of the type, for example there may be more than 1 fan.
+ * @item: A hwmon <type><number> may have multiple associated items.
+ * @alarm: Is the filename for an alarm value?
+ *
+ * An example of a hwmon filename is "temp1_input". The type is temp for a
+ * temperature value. The number is 1. The item within the file is an input
+ * value - the temperature itself. This file doesn't contain an alarm value.
+ *
+ * Exposed for testing.
+ */
+bool parse_hwmon_filename(const char *filename,
+			  enum hwmon_type *type,
+			  int *number,
+			  enum hwmon_item *item,
+			  bool *alarm);
+
+/**
+ * hwmon_pmu__new() - Allocate and construct a hwmon PMU.
+ *
+ * @pmus: The list of PMUs to be added to.
+ * @hwmon_dir: An O_DIRECTORY file descriptor for a hwmon directory.
+ * @sysfs_name: Name of the hwmon sysfs directory like hwmon0.
+ * @name: The contents of the "name" file in the hwmon directory.
+ *
+ * Exposed for testing. Regular construction should happen via
+ * perf_pmus__read_hwmon_pmus.
+ */
+struct perf_pmu *hwmon_pmu__new(struct list_head *pmus, int hwmon_dir,
+				const char *sysfs_name, const char *name);
+void hwmon_pmu__exit(struct perf_pmu *pmu);
+
+int hwmon_pmu__for_each_event(struct perf_pmu *pmu, void *state, pmu_event_callback cb);
+size_t hwmon_pmu__num_events(struct perf_pmu *pmu);
+bool hwmon_pmu__have_event(struct perf_pmu *pmu, const char *name);
+int hwmon_pmu__config_terms(const struct perf_pmu *pmu,
+			    struct perf_event_attr *attr,
+			    struct parse_events_terms *terms,
+			    struct parse_events_error *err);
+int hwmon_pmu__check_alias(struct parse_events_terms *terms, struct perf_pmu_info *info,
+			   struct parse_events_error *err);
+
+int perf_pmus__read_hwmon_pmus(struct list_head *pmus);
+
+
+int evsel__hwmon_pmu_open(struct evsel *evsel,
+			 struct perf_thread_map *threads,
+			 int start_cpu_map_idx, int end_cpu_map_idx);
+int evsel__hwmon_pmu_read(struct evsel *evsel, int cpu_map_idx, int thread);
+
+#endif /* __HWMON_PMU_H */
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index e400db9e9eb1..10f23f48479b 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -37,6 +37,8 @@ struct perf_pmu_caps {
 };
 
 enum {
+	PERF_PMU_TYPE_HWMON_START = 0xFFFF0000,
+	PERF_PMU_TYPE_HWMON_END   = 0xFFFFFFFD,
 	PERF_PMU_TYPE_TOOL = 0xFFFFFFFE,
 	PERF_PMU_TYPE_FAKE = 0xFFFFFFFF,
 };
-- 
2.47.0.163.g1226f6d8fa-goog
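
As an illustration of the <type><number>_<item> naming described in the
parse_hwmon_filename() kernel-doc above, here is a minimal sketch. It is
not the implementation from this series: sketch_split_hwmon_filename() is
a hypothetical helper that only splits the name into strings and does not
map them to enum hwmon_type or enum hwmon_item, and it rejects names that
lack an _<item> suffix.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the series): split "<type><number>_<item>"
 * into its three parts. Returns false if the name does not have that shape
 * or a part does not fit in the caller's buffer.
 */
bool sketch_split_hwmon_filename(const char *filename,
				 char *type, size_t type_sz,
				 int *number,
				 char *item, size_t item_sz)
{
	size_t type_len = 0, item_len;
	const char *p;
	char *end;
	long num;

	assert(filename && type && number && item);

	/* <type> is the leading run of characters before the first digit. */
	while (filename[type_len] && filename[type_len] != '_' &&
	       !isdigit((unsigned char)filename[type_len]))
		type_len++;
	if (type_len == 0 || type_len >= type_sz)
		return false;

	/* <number> must follow directly and be terminated by '_'. */
	p = filename + type_len;
	if (!isdigit((unsigned char)*p))
		return false;
	num = strtol(p, &end, 10);
	if (*end != '_' || num > INT_MAX)
		return false;

	/* <item> is everything after the '_' and must not be empty. */
	item_len = strlen(end + 1);
	if (item_len == 0 || item_len >= item_sz)
		return false;

	memcpy(type, filename, type_len);
	type[type_len] = '\0';
	*number = (int)num;
	memcpy(item, end + 1, item_len + 1);
	return true;
}
```

With this sketch "temp1_input" splits into type "temp", number 1 and item
"input", while "name" is rejected because it has no number.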



* [PATCH v6 3/5] perf pmu: Add calls enabling the hwmon_pmu
  2024-10-22 18:06 [PATCH v6 0/5] Hwmon PMUs Ian Rogers
  2024-10-22 18:06 ` [PATCH v6 1/5] tools api io: Ensure line_len_out is always initialized Ian Rogers
  2024-10-22 18:06 ` [PATCH v6 2/5] perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs Ian Rogers
@ 2024-10-22 18:06 ` Ian Rogers
  2024-10-22 18:06 ` [PATCH v6 4/5] perf test: Add hwmon "PMU" test Ian Rogers
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2024-10-22 18:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Ravi Bangoria, Weilin Wang,
	Yoshihiro Furudera, James Clark, Athira Jajeev, Howard Chu,
	Oliver Upton, Changbin Du, Ze Gao, Junhao He, linux-kernel,
	linux-perf-users

Add the base PMU calls necessary for hwmon_pmu(s) to be
created/deleted and events found, listed, opened and read.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/evsel.c |  9 +++++++++
 tools/perf/util/pmu.c   | 20 ++++++++++++++++++++
 tools/perf/util/pmus.c  |  2 ++
 3 files changed, 31 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 14663ad14c53..a1cbf82a9753 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -55,6 +55,7 @@
 #include "off_cpu.h"
 #include "pmu.h"
 #include "pmus.h"
+#include "hwmon_pmu.h"
 #include "tool_pmu.h"
 #include "rlimit.h"
 #include "../perf-sys.h"
@@ -1801,6 +1802,9 @@ int evsel__read_counter(struct evsel *evsel, int cpu_map_idx, int thread)
 	if (evsel__is_tool(evsel))
 		return evsel__tool_pmu_read(evsel, cpu_map_idx, thread);
 
+	if (evsel__is_hwmon(evsel))
+		return evsel__hwmon_pmu_read(evsel, cpu_map_idx, thread);
+
 	if (evsel__is_retire_lat(evsel))
 		return evsel__read_retire_lat(evsel, cpu_map_idx, thread);
 
@@ -2246,6 +2250,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 					    start_cpu_map_idx,
 					    end_cpu_map_idx);
 	}
+	if (evsel__is_hwmon(evsel)) {
+		return evsel__hwmon_pmu_open(evsel, threads,
+					     start_cpu_map_idx,
+					     end_cpu_map_idx);
+	}
 
 	for (idx = start_cpu_map_idx; idx < end_cpu_map_idx; idx++) {
 
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 0789758598c0..a02df2c80f42 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -18,6 +18,7 @@
 #include "debug.h"
 #include "evsel.h"
 #include "pmu.h"
+#include "hwmon_pmu.h"
 #include "pmus.h"
 #include "tool_pmu.h"
 #include <util/pmu-bison.h>
@@ -1529,6 +1530,9 @@ int perf_pmu__config_terms(const struct perf_pmu *pmu,
 {
 	struct parse_events_term *term;
 
+	if (perf_pmu__is_hwmon(pmu))
+		return hwmon_pmu__config_terms(pmu, attr, terms, err);
+
 	list_for_each_entry(term, &terms->terms, list) {
 		if (pmu_config_term(pmu, attr, term, terms, zero, apply_hardcoded, err))
 			return -EINVAL;
@@ -1661,6 +1665,11 @@ int perf_pmu__check_alias(struct perf_pmu *pmu, struct parse_events_terms *head_
 	info->scale    = 0.0;
 	info->snapshot = false;
 
+	if (perf_pmu__is_hwmon(pmu)) {
+		ret = hwmon_pmu__check_alias(head_terms, info, err);
+		goto out;
+	}
+
 	/* Fake PMU doesn't rewrite terms. */
 	if (perf_pmu__is_fake(pmu))
 		goto out;
@@ -1834,6 +1843,8 @@ bool perf_pmu__have_event(struct perf_pmu *pmu, const char *name)
 		return false;
 	if (perf_pmu__is_tool(pmu) && tool_pmu__skip_event(name))
 		return false;
+	if (perf_pmu__is_hwmon(pmu))
+		return hwmon_pmu__have_event(pmu, name);
 	if (perf_pmu__find_alias(pmu, name, /*load=*/ true) != NULL)
 		return true;
 	if (pmu->cpu_aliases_added || !pmu->events_table)
@@ -1845,6 +1856,9 @@ size_t perf_pmu__num_events(struct perf_pmu *pmu)
 {
 	size_t nr;
 
+	if (perf_pmu__is_hwmon(pmu))
+		return hwmon_pmu__num_events(pmu);
+
 	pmu_aliases_parse(pmu);
 	nr = pmu->sysfs_aliases + pmu->sys_json_aliases;
 
@@ -1908,6 +1922,9 @@ int perf_pmu__for_each_event(struct perf_pmu *pmu, bool skip_duplicate_pmus,
 	int ret = 0;
 	struct strbuf sb;
 
+	if (perf_pmu__is_hwmon(pmu))
+		return hwmon_pmu__for_each_event(pmu, state, cb);
+
 	strbuf_init(&sb, /*hint=*/ 0);
 	pmu_aliases_parse(pmu);
 	pmu_add_cpu_aliases(pmu);
@@ -2303,6 +2320,9 @@ int perf_pmu__pathname_fd(int dirfd, const char *pmu_name, const char *filename,
 
 void perf_pmu__delete(struct perf_pmu *pmu)
 {
+	if (perf_pmu__is_hwmon(pmu))
+		hwmon_pmu__exit(pmu);
+
 	perf_pmu__del_formats(&pmu->format);
 	perf_pmu__del_aliases(pmu);
 	perf_pmu__del_caps(pmu);
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index 107de86c2637..5c3e88adb9e6 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -15,6 +15,7 @@
 #include "evsel.h"
 #include "pmus.h"
 #include "pmu.h"
+#include "hwmon_pmu.h"
 #include "tool_pmu.h"
 #include "print-events.h"
 #include "strbuf.h"
@@ -234,6 +235,7 @@ static void pmu_read_sysfs(bool core_only)
 	if (!core_only) {
 		tool_pmu = perf_pmus__tool_pmu();
 		list_add_tail(&tool_pmu->list, &other_pmus);
+		perf_pmus__read_hwmon_pmus(&other_pmus);
 	}
 	list_sort(NULL, &other_pmus, pmus_cmp);
 	if (!list_empty(&core_pmus)) {
-- 
2.47.0.163.g1226f6d8fa-goog
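
Every hook in this patch is guarded by perf_pmu__is_hwmon(), whose
definition is not part of this message. A minimal sketch of how such a
check could work, assuming it is a range test on the PMU type using the
PERF_PMU_TYPE_HWMON_START/END values reserved in pmu.h by patch 2/5. The
sketch_* names are hypothetical and the slot helper is purely
illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the pmu.h hunk in patch 2/5. */
#define SKETCH_PMU_TYPE_HWMON_START 0xFFFF0000u
#define SKETCH_PMU_TYPE_HWMON_END   0xFFFFFFFDu

/* Assumed shape of the check: the type lies in the reserved hwmon range. */
bool sketch_type_is_hwmon(uint32_t type)
{
	return type >= SKETCH_PMU_TYPE_HWMON_START &&
	       type <= SKETCH_PMU_TYPE_HWMON_END;
}

/* Hypothetical: offset of a hwmon PMU type within the reserved range. */
uint32_t sketch_hwmon_slot(uint32_t type)
{
	assert(sketch_type_is_hwmon(type));
	return type - SKETCH_PMU_TYPE_HWMON_START;
}
```

Under this assumption the tool PMU (0xFFFFFFFE) and the fake PMU
(0xFFFFFFFF) fall outside the range, so they keep taking their existing
code paths.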



* [PATCH v6 4/5] perf test: Add hwmon "PMU" test
  2024-10-22 18:06 [PATCH v6 0/5] Hwmon PMUs Ian Rogers
                   ` (2 preceding siblings ...)
  2024-10-22 18:06 ` [PATCH v6 3/5] perf pmu: Add calls enabling the hwmon_pmu Ian Rogers
@ 2024-10-22 18:06 ` Ian Rogers
  2024-10-22 18:06 ` [PATCH v6 5/5] perf docs: Document tool and hwmon events Ian Rogers
  2024-10-24  3:06 ` [PATCH v6 0/5] Hwmon PMUs Namhyung Kim
  5 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2024-10-22 18:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Ravi Bangoria, Weilin Wang,
	Yoshihiro Furudera, James Clark, Athira Jajeev, Howard Chu,
	Oliver Upton, Changbin Du, Ze Gao, Junhao He, linux-kernel,
	linux-perf-users

Based on a mix of the sysfs PMU test (for creating the reference
files) and the tool PMU test, test that parsing given hwmon events
with their aliases creates the expected config values.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/tests/Build          |   1 +
 tools/perf/tests/builtin-test.c |   1 +
 tools/perf/tests/hwmon_pmu.c    | 243 ++++++++++++++++++++++++++++++++
 tools/perf/tests/tests.h        |   1 +
 tools/perf/util/pmus.c          |   7 +
 tools/perf/util/pmus.h          |   3 +
 6 files changed, 256 insertions(+)
 create mode 100644 tools/perf/tests/hwmon_pmu.c

diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 03cbdf7c50a0..f80bb46a9fc4 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -66,6 +66,7 @@ perf-test-y += sigtrap.o
 perf-test-y += event_groups.o
 perf-test-y += symbols.o
 perf-test-y += util.o
+perf-test-y += hwmon_pmu.o
 perf-test-y += tool_pmu.o
 
 ifeq ($(SRCARCH),$(filter $(SRCARCH),x86 arm arm64 powerpc))
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 50533446e747..7a550a37f615 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -73,6 +73,7 @@ static struct test_suite *generic_tests[] = {
 	&suite__PERF_RECORD,
 	&suite__pmu,
 	&suite__pmu_events,
+	&suite__hwmon_pmu,
 	&suite__tool_pmu,
 	&suite__dso_data,
 	&suite__perf_evsel__roundtrip_name_test,
diff --git a/tools/perf/tests/hwmon_pmu.c b/tools/perf/tests/hwmon_pmu.c
new file mode 100644
index 000000000000..e32b7804661b
--- /dev/null
+++ b/tools/perf/tests/hwmon_pmu.c
@@ -0,0 +1,243 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+#include "debug.h"
+#include "evlist.h"
+#include "parse-events.h"
+#include "tests.h"
+#include "hwmon_pmu.h"
+#include <fcntl.h>
+#include <sys/stat.h>
+
+static const struct test_event {
+	const char *name;
+	const char *alias;
+	long config;
+} test_events[] = {
+	{
+		"temp_test_hwmon_event1",
+		"temp1",
+		0xA0001,
+	},
+	{
+		"temp_test_hwmon_event2",
+		"temp2",
+		0xA0002,
+	},
+};
+
+/* Cleanup test PMU directory. */
+static int test_pmu_put(const char *dir, struct perf_pmu *hwm)
+{
+	char buf[PATH_MAX + 20];
+	int ret;
+
+	if (scnprintf(buf, sizeof(buf), "rm -fr %s", dir) < 0) {
+		pr_err("Failure to set up buffer for \"%s\"\n", dir);
+		return -EINVAL;
+	}
+	ret = system(buf);
+	if (ret)
+		pr_err("Failure to \"%s\"\n", buf);
+
+	perf_pmu__delete(hwm);
+	return ret;
+}
+
+/*
+ * Prepare test PMU directory data, normally exported by kernel at
+ * /sys/class/hwmon/hwmon<number>/. Give as input a buffer to hold the file
+ * path, the result is PMU loaded using that directory.
+ */
+static struct perf_pmu *test_pmu_get(char *dir, size_t sz)
+{
+	const char *test_hwmon_name_nl = "A test hwmon PMU\n";
+	const char *test_hwmon_name = "A test hwmon PMU";
+	/* Simulated hwmon items. */
+	const struct test_item {
+		const char *name;
+		const char *value;
+	} test_items[] = {
+		{ "temp1_label", "test hwmon event1\n", },
+		{ "temp1_input", "40000\n", },
+		{ "temp2_label", "test hwmon event2\n", },
+		{ "temp2_input", "50000\n", },
+	};
+	int dirfd, file;
+	struct perf_pmu *hwm = NULL;
+	ssize_t len;
+
+	/* Create equivalent of sysfs mount point. */
+	scnprintf(dir, sz, "/tmp/perf-hwmon-pmu-test-XXXXXX");
+	if (!mkdtemp(dir)) {
+		pr_err("mkdtemp failed\n");
+		dir[0] = '\0';
+		return NULL;
+	}
+	dirfd = open(dir, O_DIRECTORY);
+	if (dirfd < 0) {
+		pr_err("Failed to open test directory \"%s\"\n", dir);
+		goto err_out;
+	}
+
+	/* Create the test hwmon directory and give it a name. */
+	if (mkdirat(dirfd, "hwmon1234", 0755) < 0) {
+		pr_err("Failed to mkdir hwmon directory\n");
+		goto err_out;
+	}
+	file = openat(dirfd, "hwmon1234/name", O_WRONLY | O_CREAT, 0600);
+	if (file < 0) {
+		pr_err("Failed to open for writing file \"name\"\n");
+		goto err_out;
+	}
+	len = strlen(test_hwmon_name_nl);
+	if (write(file, test_hwmon_name_nl, len) < len) {
+		close(file);
+		pr_err("Failed to write to 'name' file\n");
+		goto err_out;
+	}
+	close(file);
+
+	/* Create test hwmon files. */
+	for (size_t i = 0; i < ARRAY_SIZE(test_items); i++) {
+		const struct test_item *item = &test_items[i];
+
+		file = openat(dirfd, item->name, O_WRONLY | O_CREAT, 0600);
+		if (file < 0) {
+			pr_err("Failed to open for writing file \"%s\"\n", item->name);
+			goto err_out;
+		}
+
+		if (write(file, item->value, strlen(item->value)) < 0) {
+			pr_err("Failed to write to file \"%s\"\n", item->name);
+			close(file);
+			goto err_out;
+		}
+		close(file);
+	}
+
+	/* Make the PMU reading the files created above. */
+	hwm = perf_pmus__add_test_hwmon_pmu(dirfd, "hwmon1234", test_hwmon_name);
+	if (!hwm)
+		pr_err("Test hwmon creation failed\n");
+
+err_out:
+	if (!hwm) {
+		test_pmu_put(dir, hwm);
+		if (dirfd >= 0)
+			close(dirfd);
+	}
+	return hwm;
+}
+
+static int do_test(size_t i, bool with_pmu, bool with_alias)
+{
+	const char *test_event = with_alias ? test_events[i].alias : test_events[i].name;
+	struct evlist *evlist = evlist__new();
+	struct evsel *evsel;
+	struct parse_events_error err;
+	int ret;
+	char str[128];
+	bool found = false;
+
+	if (!evlist) {
+		pr_err("evlist allocation failed\n");
+		return TEST_FAIL;
+	}
+
+	if (with_pmu)
+		snprintf(str, sizeof(str), "hwmon_a_test_hwmon_pmu/%s/", test_event);
+	else
+		strlcpy(str, test_event, sizeof(str));
+
+	pr_debug("Testing '%s'\n", str);
+	parse_events_error__init(&err);
+	ret = parse_events(evlist, str, &err);
+	if (ret) {
+		/* The evlist is deleted once, at the out label below. */
+
+		pr_debug("FAILED %s:%d failed to parse event '%s', err %d\n",
+			 __FILE__, __LINE__, str, ret);
+		parse_events_error__print(&err, str);
+		ret = TEST_FAIL;
+		goto out;
+	}
+
+	ret = TEST_OK;
+	if (with_pmu ? (evlist->core.nr_entries != 1) : (evlist->core.nr_entries < 1)) {
+		pr_debug("FAILED %s:%d Unexpected number of events for '%s' of %d\n",
+			 __FILE__, __LINE__, str, evlist->core.nr_entries);
+		ret = TEST_FAIL;
+		goto out;
+	}
+
+	evlist__for_each_entry(evlist, evsel) {
+		if (!perf_pmu__is_hwmon(evsel->pmu))
+			continue;
+
+		if (evsel->core.attr.config != (u64)test_events[i].config) {
+			pr_debug("FAILED %s:%d Unexpected config for '%s', %lld != %ld\n",
+				__FILE__, __LINE__, str,
+				evsel->core.attr.config,
+				test_events[i].config);
+			ret = TEST_FAIL;
+			goto out;
+		}
+		found = true;
+	}
+
+	if (!found) {
+		pr_debug("FAILED %s:%d Didn't find hwmon event '%s' in parsed evsels\n",
+			 __FILE__, __LINE__, str);
+		ret = TEST_FAIL;
+	}
+
+out:
+	evlist__delete(evlist);
+	return ret;
+}
+
+static int test__hwmon_pmu(bool with_pmu)
+{
+	char dir[PATH_MAX];
+	struct perf_pmu *pmu = test_pmu_get(dir, sizeof(dir));
+	int ret = TEST_OK;
+
+	if (!pmu)
+		return TEST_FAIL;
+
+	for (size_t i = 0; i < ARRAY_SIZE(test_events); i++) {
+		ret = do_test(i, with_pmu, /*with_alias=*/false);
+
+		if (ret != TEST_OK)
+			break;
+
+		ret = do_test(i, with_pmu, /*with_alias=*/true);
+
+		if (ret != TEST_OK)
+			break;
+	}
+	test_pmu_put(dir, pmu);
+	return ret;
+}
+
+static int test__hwmon_pmu_without_pmu(struct test_suite *test __maybe_unused,
+				      int subtest __maybe_unused)
+{
+	return test__hwmon_pmu(/*with_pmu=*/false);
+}
+
+static int test__hwmon_pmu_with_pmu(struct test_suite *test __maybe_unused,
+				   int subtest __maybe_unused)
+{
+	return test__hwmon_pmu(/*with_pmu=*/true);
+}
+
+static struct test_case tests__hwmon_pmu[] = {
+	TEST_CASE("Parsing without PMU name", hwmon_pmu_without_pmu),
+	TEST_CASE("Parsing with PMU name", hwmon_pmu_with_pmu),
+	{	.name = NULL, }
+};
+
+struct test_suite suite__hwmon_pmu = {
+	.desc = "Hwmon PMU",
+	.test_cases = tests__hwmon_pmu,
+};
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 1ed76d4156b6..260daa77eb06 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -83,6 +83,7 @@ DECLARE_SUITE(perf_evsel__tp_sched_test);
 DECLARE_SUITE(syscall_openat_tp_fields);
 DECLARE_SUITE(pmu);
 DECLARE_SUITE(pmu_events);
+DECLARE_SUITE(hwmon_pmu);
 DECLARE_SUITE(tool_pmu);
 DECLARE_SUITE(attr);
 DECLARE_SUITE(dso_data);
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index 5c3e88adb9e6..451c6e00ad70 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -733,6 +733,13 @@ struct perf_pmu *perf_pmus__add_test_pmu(int test_sysfs_dirfd, const char *name)
 	return perf_pmu__lookup(&other_pmus, test_sysfs_dirfd, name, /*eager_load=*/true);
 }
 
+struct perf_pmu *perf_pmus__add_test_hwmon_pmu(int hwmon_dir,
+					       const char *sysfs_name,
+					       const char *name)
+{
+	return hwmon_pmu__new(&other_pmus, hwmon_dir, sysfs_name, name);
+}
+
 struct perf_pmu *perf_pmus__fake_pmu(void)
 {
 	static struct perf_pmu fake = {
diff --git a/tools/perf/util/pmus.h b/tools/perf/util/pmus.h
index e1742b56eec7..a0cb0eb2ff97 100644
--- a/tools/perf/util/pmus.h
+++ b/tools/perf/util/pmus.h
@@ -30,6 +30,9 @@ bool perf_pmus__supports_extended_type(void);
 char *perf_pmus__default_pmu_name(void);
 
 struct perf_pmu *perf_pmus__add_test_pmu(int test_sysfs_dirfd, const char *name);
+struct perf_pmu *perf_pmus__add_test_hwmon_pmu(int hwmon_dir,
+					       const char *sysfs_name,
+					       const char *name);
 struct perf_pmu *perf_pmus__fake_pmu(void);
 
 #endif /* __PMUS_H */
-- 
2.47.0.163.g1226f6d8fa-goog
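
The simulated temp1_input file in the test holds "40000\n". hwmon sysfs
reports temperatures in millidegrees Celsius, so that value corresponds
to 40.00 'C in the style of the cover letter output. A minimal sketch of
the conversion follows; sketch_parse_temp_celsius() is a hypothetical
helper, and how the series itself applies the scale is not shown in this
message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert the text of a temp<N>_input file
 * (millidegrees Celsius) to degrees Celsius. Returns false if the text
 * does not start with a number or the number is out of range.
 */
bool sketch_parse_temp_celsius(const char *text, double *celsius)
{
	char *end;
	long long raw;

	assert(text && celsius);

	errno = 0;
	raw = strtoll(text, &end, 10);
	if (end == text || errno == ERANGE)
		return false;

	*celsius = (double)raw / 1000.0;
	return true;
}
```

The trailing newline that sysfs files carry is simply left unparsed by
strtoll(), so no trimming is needed before the call.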



* [PATCH v6 5/5] perf docs: Document tool and hwmon events
  2024-10-22 18:06 [PATCH v6 0/5] Hwmon PMUs Ian Rogers
                   ` (3 preceding siblings ...)
  2024-10-22 18:06 ` [PATCH v6 4/5] perf test: Add hwmon "PMU" test Ian Rogers
@ 2024-10-22 18:06 ` Ian Rogers
  2024-10-24  3:06 ` [PATCH v6 0/5] Hwmon PMUs Namhyung Kim
  5 siblings, 0 replies; 15+ messages in thread
From: Ian Rogers @ 2024-10-22 18:06 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Kan Liang, Ravi Bangoria, Weilin Wang,
	Yoshihiro Furudera, James Clark, Athira Jajeev, Howard Chu,
	Oliver Upton, Changbin Du, Ze Gao, Junhao He, linux-kernel,
	linux-perf-users

Add a few paragraphs on tool and hwmon events.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/Documentation/perf-list.txt | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 14621f39b375..d0c65fad419a 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -243,6 +243,21 @@ For accessing trace point events perf needs to have read access to
 /sys/kernel/tracing, even when perf_event_paranoid is in a relaxed
 setting.
 
+TOOL/HWMON EVENTS
+-----------------
+
+Some events don't have an associated PMU, instead reading values
+available to software without perf_event_open. As these events don't
+support sampling they can only really be read by tools like perf stat.
+
+Tool events provide times and certain system parameters. Examples
+include duration_time, user_time, system_time and num_cpus_online.
+
+Hwmon events provide easy access to hwmon sysfs data typically in
+/sys/class/hwmon. This information includes temperatures, fan speeds
+and energy usage.
+
+
 TRACING
 -------
 
-- 
2.47.0.163.g1226f6d8fa-goog



* Re: [PATCH v6 0/5]  Hwmon PMUs
  2024-10-22 18:06 [PATCH v6 0/5] Hwmon PMUs Ian Rogers
                   ` (4 preceding siblings ...)
  2024-10-22 18:06 ` [PATCH v6 5/5] perf docs: Document tool and hwmon events Ian Rogers
@ 2024-10-24  3:06 ` Namhyung Kim
  2024-10-24  7:07   ` Ian Rogers
  5 siblings, 1 reply; 15+ messages in thread
From: Namhyung Kim @ 2024-10-24  3:06 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Ravi Bangoria, Weilin Wang, Yoshihiro Furudera,
	James Clark, Athira Jajeev, Howard Chu, Oliver Upton, Changbin Du,
	Ze Gao, Junhao He, linux-kernel, linux-perf-users

Hi Ian,

On Tue, Oct 22, 2024 at 11:06:18AM -0700, Ian Rogers wrote:
> Following the convention of the tool PMU, create a hwmon PMU that
> exposes hwmon data for reading. For example, the following shows
> reading the CPU temperature and 2 fan speeds alongside the uncore
> frequency:
> ```
> $ perf stat -e temp_cpu,fan1,hwmon_thinkpad/fan2/,tool/num_cpus_online/ -M UNCORE_FREQ -I 1000
>      1.001153138              52.00 'C   temp_cpu
>      1.001153138              2,588 rpm  fan1
>      1.001153138              2,482 rpm  hwmon_thinkpad/fan2/
>      1.001153138                  8      tool/num_cpus_online/
>      1.001153138      1,077,101,397      UNC_CLOCK.SOCKET                 #     1.08 UNCORE_FREQ
>      1.001153138      1,012,773,595      duration_time
> ...
> ```
> 
> Additional data on the hwmon events is in perf list:
> ```
> $ perf list
> ...
> hwmon:
> ...
>   temp_core_0 OR temp2
>        [Temperature in unit coretemp named Core 0. crit=100'C,max=100'C crit_alarm=0'C. Unit:
>         hwmon_coretemp]
> ...
> ```
> 
> v6: Add string.h #include for issue reported by kernel test robot.
> v5: Fix asan issue in parse_hwmon_filename caught by a TMA metric.
> v4: Drop merged patches 1 to 10. Separate adding the hwmon_pmu from
>     the update to perf_pmu to use it. Try to make source of literal
>     strings clearer via named #defines. Fix a number of GCC warnings.
> v3: Rebase, add Namhyung's acked-by to patches 1 to 10.
> v2: Address Namhyung's review feedback. Rebase dropping 4 patches
>     applied by Arnaldo, fix build breakage reported by Arnaldo.
> 
> Ian Rogers (5):
>   tools api io: Ensure line_len_out is always initialized
>   perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs
>   perf pmu: Add calls enabling the hwmon_pmu
>   perf test: Add hwmon "PMU" test
>   perf docs: Document tool and hwmon events

I think patch 2 can easily be split into core and other parts
like dealing with aliases and units.  I believe it'd be helpful for
others (like me) to understand how it works.

Please take a look at 'perf/hwmon-pmu' branch in:

  https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung

> 
>  tools/lib/api/io.h                     |   1 +
>  tools/perf/Documentation/perf-list.txt |  15 +
>  tools/perf/tests/Build                 |   1 +
>  tools/perf/tests/builtin-test.c        |   1 +
>  tools/perf/tests/hwmon_pmu.c           | 243 ++++++++
>  tools/perf/tests/tests.h               |   1 +
>  tools/perf/util/Build                  |   1 +
>  tools/perf/util/evsel.c                |   9 +
>  tools/perf/util/hwmon_pmu.c            | 821 +++++++++++++++++++++++++
>  tools/perf/util/hwmon_pmu.h            | 154 +++++
>  tools/perf/util/pmu.c                  |  20 +
>  tools/perf/util/pmu.h                  |   2 +
>  tools/perf/util/pmus.c                 |   9 +
>  tools/perf/util/pmus.h                 |   3 +
>  14 files changed, 1281 insertions(+)
>  create mode 100644 tools/perf/tests/hwmon_pmu.c
>  create mode 100644 tools/perf/util/hwmon_pmu.c
>  create mode 100644 tools/perf/util/hwmon_pmu.h
> 
> -- 
> 2.47.0.163.g1226f6d8fa-goog
> 


* Re: [PATCH v6 0/5] Hwmon PMUs
  2024-10-24  3:06 ` [PATCH v6 0/5] Hwmon PMUs Namhyung Kim
@ 2024-10-24  7:07   ` Ian Rogers
  2024-10-24 16:40     ` Namhyung Kim
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Rogers @ 2024-10-24  7:07 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Ravi Bangoria, Weilin Wang, Yoshihiro Furudera,
	James Clark, Athira Jajeev, Howard Chu, Oliver Upton, Changbin Du,
	Ze Gao, Junhao He, linux-kernel, linux-perf-users

On Wed, Oct 23, 2024 at 8:06 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Tue, Oct 22, 2024 at 11:06:18AM -0700, Ian Rogers wrote:
> > Following the convention of the tool PMU, create a hwmon PMU that
> > exposes hwmon data for reading. For example, the following shows
> > reading the CPU temperature and 2 fan speeds alongside the uncore
> > frequency:
> > ```
> > $ perf stat -e temp_cpu,fan1,hwmon_thinkpad/fan2/,tool/num_cpus_online/ -M UNCORE_FREQ -I 1000
> >      1.001153138              52.00 'C   temp_cpu
> >      1.001153138              2,588 rpm  fan1
> >      1.001153138              2,482 rpm  hwmon_thinkpad/fan2/
> >      1.001153138                  8      tool/num_cpus_online/
> >      1.001153138      1,077,101,397      UNC_CLOCK.SOCKET                 #     1.08 UNCORE_FREQ
> >      1.001153138      1,012,773,595      duration_time
> > ...
> > ```
> >
> > Additional data on the hwmon events is in perf list:
> > ```
> > $ perf list
> > ...
> > hwmon:
> > ...
> >   temp_core_0 OR temp2
> >        [Temperature in unit coretemp named Core 0. crit=100'C,max=100'C crit_alarm=0'C. Unit:
> >         hwmon_coretemp]
> > ...
> > ```
> >
> > v6: Add string.h #include for issue reported by kernel test robot.
> > v5: Fix asan issue in parse_hwmon_filename caught by a TMA metric.
> > v4: Drop merged patches 1 to 10. Separate adding the hwmon_pmu from
> >     the update to perf_pmu to use it. Try to make source of literal
> >     strings clearer via named #defines. Fix a number of GCC warnings.
> > v3: Rebase, add Namhyung's acked-by to patches 1 to 10.
> > v2: Address Namhyung's review feedback. Rebase dropping 4 patches
> >     applied by Arnaldo, fix build breakage reported by Arnaldo.
> >
> > Ian Rogers (5):
> >   tools api io: Ensure line_len_out is always initialized
> >   perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs
> >   perf pmu: Add calls enabling the hwmon_pmu
> >   perf test: Add hwmon "PMU" test
> >   perf docs: Document tool and hwmon events
>
> I think patch 2 can easily be split into core and other parts
> like dealing with aliases and units.  I believe it'd be helpful for
> others (like me) to understand how it works.
>
> Please take a look at 'perf/hwmon-pmu' branch in:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks Namhyung but I'm not really seeing this making anything simpler
and I can see significant new bugs. Your new patch:
https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git/commit/?h=perf/hwmon-pmu&id=85c78b5bf71fb3e67ae815f7b2d044648fa08391
Has taken about 40% out of patch 2, but done so by splitting function
declarations from their definitions, enum declarations from any use,
etc. It also adds in code like:

snprintf(buf, sizeof(buf), "%s_input", evsel->name);

but this would be a strange thing to do. The evsel->name is rewritten
by fallback logic, so cycles may become cycles:u if kernel profiling
is restricted. This is why we have metric-id in the evsel as we cannot
rely on the evsel->name not mutating when looking up events for the
sake of metrics. Using the name as part of a sysfs filename lookup
doesn't make sense to me as now the evsel fallback logic can break a
hwmon event. In the original patch the code was:

snprintf(buf, sizeof(buf), "%s%d_input", hwmon_type_strs[key.type], key.num);

where those two values are constants: key.type and key.num are both
embedded in the config value, which the evsel fallback logic won't
change. But bringing in the code that does that basically brings in
all of the rest of patch 2.
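
To illustrate the config-based lookup, a minimal sketch (not code from
the series). It assumes the low 16 bits of the config hold the number and
the next 8 bits hold the enum hwmon_type value, which is consistent with
the 0xA0001 and 0xA0002 configs expected for temp1 and temp2 in patch 4.
The string table is an assumption mirroring the enum order, and
sketch_input_filename() is a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed strings, in the order of enum hwmon_type from hwmon_pmu.h. */
static const char * const sketch_type_strs[] = {
	NULL, "cpu", "curr", "energy", "fan", "humidity",
	"in", "intrusion", "power", "pwm", "temp",
};

/*
 * Hypothetical helper: build "<type><num>_input" from a config value.
 * Returns 0 on success, -1 on an unknown type or a too-small buffer.
 */
int sketch_input_filename(uint64_t config, char *buf, size_t sz)
{
	unsigned int num = (unsigned int)(config & 0xFFFF);
	unsigned int type = (unsigned int)((config >> 16) & 0xFF);
	size_t nr_types = sizeof(sketch_type_strs) / sizeof(sketch_type_strs[0]);
	int n;

	assert(buf && sz > 0);

	if (type == 0 || type >= nr_types)
		return -1;

	n = snprintf(buf, sz, "%s%u_input", sketch_type_strs[type], num);
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}
```

Because the filename depends only on the config, rewriting evsel->name
(for example by appending a modifier) cannot change which sysfs file is
opened.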

So the patch adds a PMU that looks broken. Rather than simplifying
things, it creates a broken intermediate state; should that be fixed
for the benefit of bisects?
It also complicates understanding as the declarations of functions and
enums have kernel-doc, but now the definitions of enums and functions
are split apart. For me, to understand the code I'd want to squash the
patches back together again so I could see a declaration with its
definition.

Thanks,
Ian


* Re: [PATCH v6 0/5] Hwmon PMUs
  2024-10-24  7:07   ` Ian Rogers
@ 2024-10-24 16:40     ` Namhyung Kim
  2024-10-25  1:33       ` Ian Rogers
  0 siblings, 1 reply; 15+ messages in thread
From: Namhyung Kim @ 2024-10-24 16:40 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Ravi Bangoria, Weilin Wang, Yoshihiro Furudera,
	James Clark, Athira Jajeev, Howard Chu, Oliver Upton, Changbin Du,
	Ze Gao, Junhao He, linux-kernel, linux-perf-users

On Thu, Oct 24, 2024 at 12:07:46AM -0700, Ian Rogers wrote:
> On Wed, Oct 23, 2024 at 8:06 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Tue, Oct 22, 2024 at 11:06:18AM -0700, Ian Rogers wrote:
> > > Following the convention of the tool PMU, create a hwmon PMU that
> > > exposes hwmon data for reading. For example, the following shows
> > > reading the CPU temperature and 2 fan speeds alongside the uncore
> > > frequency:
> > > ```
> > > $ perf stat -e temp_cpu,fan1,hwmon_thinkpad/fan2/,tool/num_cpus_online/ -M UNCORE_FREQ -I 1000
> > >      1.001153138              52.00 'C   temp_cpu
> > >      1.001153138              2,588 rpm  fan1
> > >      1.001153138              2,482 rpm  hwmon_thinkpad/fan2/
> > >      1.001153138                  8      tool/num_cpus_online/
> > >      1.001153138      1,077,101,397      UNC_CLOCK.SOCKET                 #     1.08 UNCORE_FREQ
> > >      1.001153138      1,012,773,595      duration_time
> > > ...
> > > ```
> > >
> > > Additional data on the hwmon events is in perf list:
> > > ```
> > > $ perf list
> > > ...
> > > hwmon:
> > > ...
> > >   temp_core_0 OR temp2
> > >        [Temperature in unit coretemp named Core 0. crit=100'C,max=100'C crit_alarm=0'C. Unit:
> > >         hwmon_coretemp]
> > > ...
> > > ```
> > >
> > > v6: Add string.h #include for issue reported by kernel test robot.
> > > v5: Fix asan issue in parse_hwmon_filename caught by a TMA metric.
> > > v4: Drop merged patches 1 to 10. Separate adding the hwmon_pmu from
> > >     the update to perf_pmu to use it. Try to make source of literal
> > >     strings clearer via named #defines. Fix a number of GCC warnings.
> > > v3: Rebase, add Namhyung's acked-by to patches 1 to 10.
> > > v2: Address Namhyung's review feedback. Rebase dropping 4 patches
> > >     applied by Arnaldo, fix build breakage reported by Arnaldo.
> > >
> > > Ian Rogers (5):
> > >   tools api io: Ensure line_len_out is always initialized
> > >   perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs
> > >   perf pmu: Add calls enabling the hwmon_pmu
> > >   perf test: Add hwmon "PMU" test
> > >   perf docs: Document tool and hwmon events
> >
> > I think the patch 2 can be easily splitted into core and other parts
> > like dealing with aliases and units.  I believe it'd be helpful for
> > others (like me) to understand how it works.
> >
> > Please take a look at 'perf/hwmon-pmu' branch in:
> >
> >   https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> 
> Thanks Namhyung but I'm not really seeing this making anything simpler
> and I can see significant new bugs. Your new patch:
> https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git/commit/?h=perf/hwmon-pmu&id=85c78b5bf71fb3e67ae815f7b2d044648fa08391
> Has taken about 40% out of patch 2, but done so by splitting function
> declarations from their definitions, enum declarations from any use,

Yeah, it's just because I was lazy and you can split header files too
(and please do so).

> etc. It also adds in code like:
> 
> snprintf(buf, sizeof(buf), "%s_input", evsel->name);
> 
> but this would be a strange thing to do. The evsel->name is rewritten
> by fallback logic, so cycles may become cycles:u if kernel profiling

I know it doesn't work but just want to highlight how it's supposed to
work.  Eventually what we need is a correct file name.  In fact, I think
it'd work if we can pass a correct event name probably like:

  perf stat -e hwmon5/name=fan1/ true

> is restricted. This is why we have metric-id in the evsel as we cannot
> rely on the evsel->name not mutating when looking up events for the
> sake of metrics. Using the name as part of a sysfs filename lookup
> doesn't make sense to me as now the evsel fallback logic can break a
> hwmon event. In the original patch the code was:

The fallback logic is used only if the kernel returns an error.  Thus
it'd be fine as long as it correctly finds the sysfs filename.  But it's
not used in the final code and the change is a simple one-liner.

> 
> snprintf(buf, sizeof(buf), "%s%d_input", hwmon_type_strs[key.type], key.num);
> 
> where those two values are constants and key.type and key.num both
> values embedded in the config value the evsel fallback logic won't
> change. But bringing in the code that does that basically brings in
> all of the rest of patch 2.

Right, that's why I did it that way.

> 
> So the patch is adding a PMU that looks broken, so rather than
> simplifying things it just creates a broken intermediate state and
> should that be fixed for the benefit of bisects?

Actually it's not broken since it's not enabled yet. :)


> It also complicates understanding as the declarations of functions and
> enums have kernel-doc, but now the definitions of enums and functions
> are split apart. For me, to understand the code I'd want to squash the
> patches back together again so I could see a declaration with its
> definition.

Yep, please move the declarations to patch 3.

Thanks,
Namhyung



* Re: [PATCH v6 0/5] Hwmon PMUs
  2024-10-24 16:40     ` Namhyung Kim
@ 2024-10-25  1:33       ` Ian Rogers
  2024-10-25 17:30         ` Namhyung Kim
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Rogers @ 2024-10-25  1:33 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Ravi Bangoria, Weilin Wang, Yoshihiro Furudera,
	James Clark, Athira Jajeev, Howard Chu, Oliver Upton, Changbin Du,
	Ze Gao, Junhao He, linux-kernel, linux-perf-users

On Thu, Oct 24, 2024 at 9:41 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, Oct 24, 2024 at 12:07:46AM -0700, Ian Rogers wrote:
> > On Wed, Oct 23, 2024 at 8:06 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > Hi Ian,
> > >
> > > On Tue, Oct 22, 2024 at 11:06:18AM -0700, Ian Rogers wrote:
> > > > Following the convention of the tool PMU, create a hwmon PMU that
> > > > exposes hwmon data for reading. For example, the following shows
> > > > reading the CPU temperature and 2 fan speeds alongside the uncore
> > > > frequency:
> > > > ```
> > > > $ perf stat -e temp_cpu,fan1,hwmon_thinkpad/fan2/,tool/num_cpus_online/ -M UNCORE_FREQ -I 1000
> > > >      1.001153138              52.00 'C   temp_cpu
> > > >      1.001153138              2,588 rpm  fan1
> > > >      1.001153138              2,482 rpm  hwmon_thinkpad/fan2/
> > > >      1.001153138                  8      tool/num_cpus_online/
> > > >      1.001153138      1,077,101,397      UNC_CLOCK.SOCKET                 #     1.08 UNCORE_FREQ
> > > >      1.001153138      1,012,773,595      duration_time
> > > > ...
> > > > ```
> > > >
> > > > Additional data on the hwmon events is in perf list:
> > > > ```
> > > > $ perf list
> > > > ...
> > > > hwmon:
> > > > ...
> > > >   temp_core_0 OR temp2
> > > >        [Temperature in unit coretemp named Core 0. crit=100'C,max=100'C crit_alarm=0'C. Unit:
> > > >         hwmon_coretemp]
> > > > ...
> > > > ```
> > > >
> > > > v6: Add string.h #include for issue reported by kernel test robot.
> > > > v5: Fix asan issue in parse_hwmon_filename caught by a TMA metric.
> > > > v4: Drop merged patches 1 to 10. Separate adding the hwmon_pmu from
> > > >     the update to perf_pmu to use it. Try to make source of literal
> > > >     strings clearer via named #defines. Fix a number of GCC warnings.
> > > > v3: Rebase, add Namhyung's acked-by to patches 1 to 10.
> > > > v2: Address Namhyung's review feedback. Rebase dropping 4 patches
> > > >     applied by Arnaldo, fix build breakage reported by Arnaldo.
> > > >
> > > > Ian Rogers (5):
> > > >   tools api io: Ensure line_len_out is always initialized
> > > >   perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs
> > > >   perf pmu: Add calls enabling the hwmon_pmu
> > > >   perf test: Add hwmon "PMU" test
> > > >   perf docs: Document tool and hwmon events
> > >
> > > I think the patch 2 can be easily splitted into core and other parts
> > > like dealing with aliases and units.  I believe it'd be helpful for
> > > others (like me) to understand how it works.
> > >
> > > Please take a look at 'perf/hwmon-pmu' branch in:
> > >
> > >   https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> >
> > Thanks Namhyung but I'm not really seeing this making anything simpler
> > and I can see significant new bugs. Your new patch:
> > https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git/commit/?h=perf/hwmon-pmu&id=85c78b5bf71fb3e67ae815f7b2d044648fa08391
> > Has taken about 40% out of patch 2, but done so by splitting function
> > declarations from their definitions, enum declarations from any use,
>
> Yeah, it's just because I was lazy and you can split header files too
> (and please do so).
>
> > etc. It also adds in code like:
> >
> > snprintf(buf, sizeof(buf), "%s_input", evsel->name);
> >
> > but this would be a strange thing to do. The evsel->name is rewritten
> > by fallback logic, so cycles may become cycles:u if kernel profiling
>
> I know it doesn't work but just want to highlight how it's supposed to
> work.  Eventually what we need is a correct file name.  In fact, I think
> it'd work if we can pass a correct event name probably like:
>
>   perf stat -e hwmon5/name=fan1/ true

But this isn't what the term name and evsel's name are for. They are
to allow you to do:
```
$ perf stat -e cycles/name=foobar/ true

Performance counter stats for 'true':

        1,126,942      foobar

      0.001681805 seconds time elapsed

      0.001757000 seconds user
      0.000000000 seconds sys
```
Why would you do this in code, changing a fundamental part of evsel
behavior, only to delete it in the next patch?

> > is restricted. This is why we have metric-id in the evsel as we cannot
> > rely on the evsel->name not mutating when looking up events for the
> > sake of metrics. Using the name as part of a sysfs filename lookup
> > doesn't make sense to me as now the evsel fallback logic can break a
> > hwmon event. In the original patch the code was:
>
> The fallback logic is used only if the kernel returns an error.  Thus
> it'd be fine as long as it correctly finds the sysfs filename.  But it's
> not used in the final code and the change is a simple one-liner.

But it's not. It's changing what evsel->name means to be an event
encoding. How does reverse config to name lookup work in this model?
How does the normal use of the name term work?
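
To make the point concrete, here is a minimal sketch (not the code from
the patch). The sketch_* names, the two-entry type table and the bit
layout are assumptions made up for illustration. Both the sysfs file
name and the default event name are derived from the config value, so
nothing depends on evsel->name, which the user may set with the name
term and which the fallback logic may rewrite:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical subset of sensor types; a real table would be larger. */
enum sketch_hwmon_type { SKETCH_FAN, SKETCH_TEMP, SKETCH_TYPE_MAX };

static const char *const sketch_type_strs[SKETCH_TYPE_MAX] = {
	[SKETCH_FAN] = "fan",
	[SKETCH_TEMP] = "temp",
};

/* Hypothetical key: sensor type plus number, e.g. temp + 2 for "temp2". */
struct sketch_key {
	enum sketch_hwmon_type type;
	unsigned int num;
};

/* Assumed packing: type in bits 16-23 of config, number in bits 0-15. */
static uint64_t sketch_key_to_config(struct sketch_key key)
{
	return ((uint64_t)key.type << 16) | (key.num & 0xffff);
}

static struct sketch_key sketch_config_to_key(uint64_t config)
{
	struct sketch_key key = {
		.type = (enum sketch_hwmon_type)((config >> 16) & 0xff),
		.num = (unsigned int)(config & 0xffff),
	};

	return key;
}

/*
 * Build the sysfs file name (e.g. "temp2_input") from config alone.
 * Returns the snprintf() result, or -1 if the type is unknown.
 */
static int sketch_input_filename(uint64_t config, char *buf, size_t len)
{
	struct sketch_key key = sketch_config_to_key(config);

	if ((unsigned int)key.type >= SKETCH_TYPE_MAX)
		return -1;
	return snprintf(buf, len, "%s%u_input",
			sketch_type_strs[key.type], key.num);
}

/* Reverse lookup: recover the default event name (e.g. "temp2"). */
static int sketch_event_name(uint64_t config, char *buf, size_t len)
{
	struct sketch_key key = sketch_config_to_key(config);

	if ((unsigned int)key.type >= SKETCH_TYPE_MAX)
		return -1;
	return snprintf(buf, len, "%s%u",
			sketch_type_strs[key.type], key.num);
}
```

With this shape an event given a name term of foobar would still read
the same sysfs file, because only the config value is consulted.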

> >
> > snprintf(buf, sizeof(buf), "%s%d_input", hwmon_type_strs[key.type], key.num);
> >
> > where those two values are constants and key.type and key.num both
> > values embedded in the config value the evsel fallback logic won't
> > change. But bringing in the code that does that basically brings in
> > all of the rest of patch 2.
>
> Right, that's why I did that way.
>
> >
> > So the patch is adding a PMU that looks broken, so rather than
> > simplifying things it just creates a broken intermediate state and
> > should that be fixed for the benefit of bisects?
>
> Actually it's not broken since it's not enabled yet. :)
>
>
> > It also complicates understanding as the declarations of functions and
> > enums have kernel-doc, but now the definitions of enums and functions
> > are split apart. For me, to understand the code I'd want to squash the
> > patches back together again so I could see a declaration with its
> > definition.
>
> Yep, please move the declarations to the patch 3.

So I think moving the enum declarations into one patch is okay. But as
the enum values have no bearing on hardware constants, or anything
outside of the code that uses them, it smells strange to me. Ultimately
this is going to do little to the lines of code count but damage
readability. I'm not sure why we're doing this given the kernel model
for adding a driver is to add it as a large chunk. For example, here
is adding the intel PT driver:
https://lore.kernel.org/all/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com/T/#u

Thanks,
Ian


* Re: [PATCH v6 0/5] Hwmon PMUs
  2024-10-25  1:33       ` Ian Rogers
@ 2024-10-25 17:30         ` Namhyung Kim
  2024-10-25 18:26           ` Ian Rogers
  0 siblings, 1 reply; 15+ messages in thread
From: Namhyung Kim @ 2024-10-25 17:30 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Ravi Bangoria, Weilin Wang, Yoshihiro Furudera,
	James Clark, Athira Jajeev, Howard Chu, Oliver Upton, Changbin Du,
	Ze Gao, Junhao He, linux-kernel, linux-perf-users

On Thu, Oct 24, 2024 at 06:33:27PM -0700, Ian Rogers wrote:
> On Thu, Oct 24, 2024 at 9:41 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > On Thu, Oct 24, 2024 at 12:07:46AM -0700, Ian Rogers wrote:
> > > On Wed, Oct 23, 2024 at 8:06 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > >
> > > > Hi Ian,
> > > >
> > > > On Tue, Oct 22, 2024 at 11:06:18AM -0700, Ian Rogers wrote:
> > > > > Following the convention of the tool PMU, create a hwmon PMU that
> > > > > exposes hwmon data for reading. For example, the following shows
> > > > > reading the CPU temperature and 2 fan speeds alongside the uncore
> > > > > frequency:
> > > > > ```
> > > > > $ perf stat -e temp_cpu,fan1,hwmon_thinkpad/fan2/,tool/num_cpus_online/ -M UNCORE_FREQ -I 1000
> > > > >      1.001153138              52.00 'C   temp_cpu
> > > > >      1.001153138              2,588 rpm  fan1
> > > > >      1.001153138              2,482 rpm  hwmon_thinkpad/fan2/
> > > > >      1.001153138                  8      tool/num_cpus_online/
> > > > >      1.001153138      1,077,101,397      UNC_CLOCK.SOCKET                 #     1.08 UNCORE_FREQ
> > > > >      1.001153138      1,012,773,595      duration_time
> > > > > ...
> > > > > ```
> > > > >
> > > > > Additional data on the hwmon events is in perf list:
> > > > > ```
> > > > > $ perf list
> > > > > ...
> > > > > hwmon:
> > > > > ...
> > > > >   temp_core_0 OR temp2
> > > > >        [Temperature in unit coretemp named Core 0. crit=100'C,max=100'C crit_alarm=0'C. Unit:
> > > > >         hwmon_coretemp]
> > > > > ...
> > > > > ```
> > > > >
> > > > > v6: Add string.h #include for issue reported by kernel test robot.
> > > > > v5: Fix asan issue in parse_hwmon_filename caught by a TMA metric.
> > > > > v4: Drop merged patches 1 to 10. Separate adding the hwmon_pmu from
> > > > >     the update to perf_pmu to use it. Try to make source of literal
> > > > >     strings clearer via named #defines. Fix a number of GCC warnings.
> > > > > v3: Rebase, add Namhyung's acked-by to patches 1 to 10.
> > > > > v2: Address Namhyung's review feedback. Rebase dropping 4 patches
> > > > >     applied by Arnaldo, fix build breakage reported by Arnaldo.
> > > > >
> > > > > Ian Rogers (5):
> > > > >   tools api io: Ensure line_len_out is always initialized
> > > > >   perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs
> > > > >   perf pmu: Add calls enabling the hwmon_pmu
> > > > >   perf test: Add hwmon "PMU" test
> > > > >   perf docs: Document tool and hwmon events
> > > >
> > > > I think the patch 2 can be easily splitted into core and other parts
> > > > like dealing with aliases and units.  I believe it'd be helpful for
> > > > others (like me) to understand how it works.
> > > >
> > > > Please take a look at 'perf/hwmon-pmu' branch in:
> > > >
> > > >   https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> > >
> > > Thanks Namhyung but I'm not really seeing this making anything simpler
> > > and I can see significant new bugs. Your new patch:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git/commit/?h=perf/hwmon-pmu&id=85c78b5bf71fb3e67ae815f7b2d044648fa08391
> > > Has taken about 40% out of patch 2, but done so by splitting function
> > > declarations from their definitions, enum declarations from any use,
> >
> > Yeah, it's just because I was lazy and you can split header files too
> > (and please do so).
> >
> > > etc. It also adds in code like:
> > >
> > > snprintf(buf, sizeof(buf), "%s_input", evsel->name);
> > >
> > > but this would be a strange thing to do. The evsel->name is rewritten
> > > by fallback logic, so cycles may become cycles:u if kernel profiling
> >
> > I know it doesn't work but just want to highlight how it's supposed to
> > work.  Eventually what we need is a correct file name.  In fact, I think
> > it'd work if we can pass a correct event name probably like:
> >
> >   perf stat -e hwmon5/name=fan1/ true
> 
> But this isn't what the term name and evsel's name are for. They are
> to allow you to do:
> ```
> $ perf stat -e cycles/name=foobar/ true
> 
> Performance counter stats for 'true':
> 
>         1,126,942      foobar
> 
>       0.001681805 seconds time elapsed
> 
>       0.001757000 seconds user
>       0.000000000 seconds sys
> ```
> Why would you do this in code, change a fundamental of evsel behavior,
> then just to delete it in the next patch?

Well, I didn't change the actual behavior and it doesn't work yet.
The deletion is just one line, and I think it reveals the intention of
the next patch very well.

> 
> > > is restricted. This is why we have metric-id in the evsel as we cannot
> > > rely on the evsel->name not mutating when looking up events for the
> > > sake of metrics. Using the name as part of a sysfs filename lookup
> > > doesn't make sense to me as now the evsel fallback logic can break a
> > > hwmon event. In the original patch the code was:
> >
> > The fallback logic is used only if the kernel returns an error.  Thus
> > it'd be fine as long as it correctly finds the sysfs filename.  But it's
> > not used in the final code and the change is a simple one-liner.
> 
> But it's not. It's changing what evsel->name means to be an event
> encoding. How does reverse config to name lookup work in this model?
> How does the normal use of the name term work?

It's intermediate code that is not activated yet.  So I think it's about
showing how the code works.  If you really don't want to use evsel->name,
maybe you can put in a dummy name with a comment saying it'll be updated
in the next patch.

> 
> > >
> > > snprintf(buf, sizeof(buf), "%s%d_input", hwmon_type_strs[key.type], key.num);
> > >
> > > where those two values are constants and key.type and key.num both
> > > values embedded in the config value the evsel fallback logic won't
> > > change. But bringing in the code that does that basically brings in
> > > all of the rest of patch 2.
> >
> > Right, that's why I did that way.
> >
> > >
> > > So the patch is adding a PMU that looks broken, so rather than
> > > simplifying things it just creates a broken intermediate state and
> > > should that be fixed for the benefit of bisects?
> >
> > Actually it's not broken since it's not enabled yet. :)
> >
> >
> > > It also complicates understanding as the declarations of functions and
> > > enums have kernel-doc, but now the definitions of enums and functions
> > > are split apart. For me, to understand the code I'd want to squash the
> > > patches back together again so I could see a declaration with its
> > > definition.
> >
> > Yep, please move the declarations to the patch 3.
> 
> So I think moving the enum declarations into one patch is okay. But as
> the enum values have no bearing on hardware constants, or something
> outside of the code that uses them it smells strange to me. Ultimately
> this is going to do little to the lines of code count but damage
> readability. I'm not sure why we're doing this given the kernel model
> for adding a driver is to add it as a large chunk. For example, here
> is adding the intel PT driver:
> https://lore.kernel.org/all/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com/T/#u

Maybe others can understand a big patch easily, but I can't.

Thanks,
Namhyung



* Re: [PATCH v6 0/5] Hwmon PMUs
  2024-10-25 17:30         ` Namhyung Kim
@ 2024-10-25 18:26           ` Ian Rogers
  2024-10-25 21:01             ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Rogers @ 2024-10-25 18:26 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Ravi Bangoria, Weilin Wang, Yoshihiro Furudera,
	James Clark, Athira Jajeev, Howard Chu, Oliver Upton, Changbin Du,
	Ze Gao, Junhao He, linux-kernel, linux-perf-users

On Fri, Oct 25, 2024 at 10:30 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Thu, Oct 24, 2024 at 06:33:27PM -0700, Ian Rogers wrote:
> > On Thu, Oct 24, 2024 at 9:41 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > On Thu, Oct 24, 2024 at 12:07:46AM -0700, Ian Rogers wrote:
> > > > On Wed, Oct 23, 2024 at 8:06 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > >
> > > > > Hi Ian,
> > > > >
> > > > > On Tue, Oct 22, 2024 at 11:06:18AM -0700, Ian Rogers wrote:
> > > > > > Following the convention of the tool PMU, create a hwmon PMU that
> > > > > > exposes hwmon data for reading. For example, the following shows
> > > > > > reading the CPU temperature and 2 fan speeds alongside the uncore
> > > > > > frequency:
> > > > > > ```
> > > > > > $ perf stat -e temp_cpu,fan1,hwmon_thinkpad/fan2/,tool/num_cpus_online/ -M UNCORE_FREQ -I 1000
> > > > > >      1.001153138              52.00 'C   temp_cpu
> > > > > >      1.001153138              2,588 rpm  fan1
> > > > > >      1.001153138              2,482 rpm  hwmon_thinkpad/fan2/
> > > > > >      1.001153138                  8      tool/num_cpus_online/
> > > > > >      1.001153138      1,077,101,397      UNC_CLOCK.SOCKET                 #     1.08 UNCORE_FREQ
> > > > > >      1.001153138      1,012,773,595      duration_time
> > > > > > ...
> > > > > > ```
> > > > > >
> > > > > > Additional data on the hwmon events is in perf list:
> > > > > > ```
> > > > > > $ perf list
> > > > > > ...
> > > > > > hwmon:
> > > > > > ...
> > > > > >   temp_core_0 OR temp2
> > > > > >        [Temperature in unit coretemp named Core 0. crit=100'C,max=100'C crit_alarm=0'C. Unit:
> > > > > >         hwmon_coretemp]
> > > > > > ...
> > > > > > ```
> > > > > >
> > > > > > v6: Add string.h #include for issue reported by kernel test robot.
> > > > > > v5: Fix asan issue in parse_hwmon_filename caught by a TMA metric.
> > > > > > v4: Drop merged patches 1 to 10. Separate adding the hwmon_pmu from
> > > > > >     the update to perf_pmu to use it. Try to make source of literal
> > > > > >     strings clearer via named #defines. Fix a number of GCC warnings.
> > > > > > v3: Rebase, add Namhyung's acked-by to patches 1 to 10.
> > > > > > v2: Address Namhyung's review feedback. Rebase dropping 4 patches
> > > > > >     applied by Arnaldo, fix build breakage reported by Arnaldo.
> > > > > >
> > > > > > Ian Rogers (5):
> > > > > >   tools api io: Ensure line_len_out is always initialized
> > > > > >   perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs
> > > > > >   perf pmu: Add calls enabling the hwmon_pmu
> > > > > >   perf test: Add hwmon "PMU" test
> > > > > >   perf docs: Document tool and hwmon events
> > > > >
> > > > > I think the patch 2 can be easily splitted into core and other parts
> > > > > like dealing with aliases and units.  I believe it'd be helpful for
> > > > > others (like me) to understand how it works.
> > > > >
> > > > > Please take a look at 'perf/hwmon-pmu' branch in:
> > > > >
> > > > >   https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> > > >
> > > > Thanks Namhyung but I'm not really seeing this making anything simpler
> > > > and I can see significant new bugs. Your new patch:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git/commit/?h=perf/hwmon-pmu&id=85c78b5bf71fb3e67ae815f7b2d044648fa08391
> > > > Has taken about 40% out of patch 2, but done so by splitting function
> > > > declarations from their definitions, enum declarations from any use,
> > >
> > > Yeah, it's just because I was lazy and you can split header files too
> > > (and please do so).
> > >
> > > > etc. It also adds in code like:
> > > >
> > > > snprintf(buf, sizeof(buf), "%s_input", evsel->name);
> > > >
> > > > but this would be a strange thing to do. The evsel->name is rewritten
> > > > by fallback logic, so cycles may become cycles:u if kernel profiling
> > >
> > > I know it doesn't work but just want to highlight how it's supposed to
> > > work.  Eventually what we need is a correct file name.  In fact, I think
> > > it'd work if we can pass a correct event name probably like:
> > >
> > >   perf stat -e hwmon5/name=fan1/ true
> >
> > But this isn't what the term name and evsel's name are for. They are
> > to allow you to do:
> > ```
> > $ perf stat -e cycles/name=foobar/ true
> >
> > Performance counter stats for 'true':
> >
> >         1,126,942      foobar
> >
> >       0.001681805 seconds time elapsed
> >
> >       0.001757000 seconds user
> >       0.000000000 seconds sys
> > ```
> > Why would you do this in code, change a fundamental of evsel behavior,
> > then just to delete it in the next patch?
>
> Well, I didn't change the actual behavior and it doesn't work yet.
> The deletion is just one line, and I think it reveals the intention of
> the next patch very well.
>
> >
> > > > is restricted. This is why we have metric-id in the evsel as we cannot
> > > > rely on the evsel->name not mutating when looking up events for the
> > > > sake of metrics. Using the name as part of a sysfs filename lookup
> > > > doesn't make sense to me as now the evsel fallback logic can break a
> > > > hwmon event. In the original patch the code was:
> > >
> > > The fallback logic is used only if the kernel returns an error.  Thus
> > > it'd be fine as long as it correctly finds the sysfs filename.  But it's
> > > not used in the final code and the change is a simple one-liner.
> >
> > But it's not. It's changing what evsel->name means to be an event
> > encoding. How does reverse config to name lookup work in this model?
> > How does the normal use of the name term work?
>
> It's intermediate code that is not activated yet.  So I think it's about
> to say how the code works.  If you really don't like to use evsel->name,
> maybe you can put a dummy name with a comment saying it'll be updated in
> next patch.
>
> >
> > > >
> > > > snprintf(buf, sizeof(buf), "%s%d_input", hwmon_type_strs[key.type], key.num);
> > > >
> > > > where those two values are constants and key.type and key.num both
> > > > values embedded in the config value the evsel fallback logic won't
> > > > change. But bringing in the code that does that basically brings in
> > > > all of the rest of patch 2.
> > >
> > > Right, that's why I did that way.
> > >
> > > >
> > > > So the patch is adding a PMU that looks broken, so rather than
> > > > simplifying things it just creates a broken intermediate state and
> > > > should that be fixed for the benefit of bisects?
> > >
> > > Actually it's not broken since it's not enabled yet. :)
> > >
> > >
> > > > It also complicates understanding as the declarations of functions and
> > > > enums have kernel-doc, but now the definitions of enums and functions
> > > > are split apart. For me, to understand the code I'd want to squash the
> > > > patches back together again so I could see a declaration with its
> > > > definition.
> > >
> > > Yep, please move the declarations to the patch 3.
> >
> > So I think moving the enum declarations into one patch is okay. But as
> > the enum values have no bearing on hardware constants, or something
> > outside of the code that uses them it smells strange to me. Ultimately
> > this is going to do little to the lines of code count but damage
> > readability. I'm not sure why we're doing this given the kernel model
> > for adding a driver is to add it as a large chunk. For example, here
> > is adding the intel PT driver:
> > https://lore.kernel.org/all/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com/T/#u
>
> Maybe others can understand a big patch easily, but I'm not.

My understanding is that we make small patches so that the codebase is
more bisectable. When there is something new, like a driver or here a
hwmon PMU, the first patch is large and then we switch to the small
patch model. I have seen patches adding constants ahead of them being
used, but not normally as enums. I've already reduced the size of the
patch by moving everything that isn't hwmon PMU out of the patch and
most of that has already landed. Moving enums out of a header file is
okay, it shouldn't break the build (a compiler may complain about unused
enums), but then I end up copying comments into commit messages and
doing something alien to what is done in the rest of the kernel.
Declaring a function without defining it is in many cases a compiler
error, and for good reason. Adding changes that are, or could be,
compiler errors goes against making things bisectable.

So breaking up this patch is bad as:
1) it doesn't match existing kernel style,
2) it makes the patch harder to understand (declarations split from
definitions, etc.),
3) with new compiler errors/warnings the code will be less bisectable
as we're deliberately doing things we think wrong for the sake of a
lines-of-code size,
4) we increase the number of patches and commit messages, with commit
messages duplicating comments for things like functions or enums being
added,
5) with your patches we create an intermediate PMU with different
conventions than the rest of the code base and with bugs, impacting
bisectability and the ability to understand the code base.

So I'm arguing against doing this as it is contrary to both our normal
objectives and existing style. I have no real way of knowing when I've
cut something up small enough, and if we're not building the code then
how do I build/test the intermediate states? I'm just out on a wild
goose chase.

Thanks,
Ian


* Re: [PATCH v6 0/5] Hwmon PMUs
  2024-10-25 18:26           ` Ian Rogers
@ 2024-10-25 21:01             ` Arnaldo Carvalho de Melo
  2024-10-25 23:07               ` Ian Rogers
  0 siblings, 1 reply; 15+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-10-25 21:01 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Namhyung Kim, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
	Ravi Bangoria, Weilin Wang, Yoshihiro Furudera, James Clark,
	Athira Jajeev, Howard Chu, Oliver Upton, Changbin Du, Ze Gao,
	Junhao He, linux-kernel, linux-perf-users

On Fri, Oct 25, 2024 at 11:26:26AM -0700, Ian Rogers wrote:
> On Fri, Oct 25, 2024 at 10:30 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > On Thu, Oct 24, 2024 at 06:33:27PM -0700, Ian Rogers wrote:
> > > So I think moving the enum declarations into one patch is okay. But as
> > > the enum values have no bearing on hardware constants, or something
> > > outside of the code that uses them it smells strange to me. Ultimately
> > > this is going to do little to the lines of code count but damage
> > > readability. I'm not sure why we're doing this given the kernel model
> > > for adding a driver is to add it as a large chunk. For example, here
> > > is adding the intel PT driver:
> > > https://lore.kernel.org/all/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com/T/#u

> > Maybe others can understand a big patch easily, but I'm not.
 
> My understanding is that we make small patches so that the codebase is
> more bisectable. When there is something new, like a driver or here a

That is super important, having patches being super small and doing just
one thing helps in bisecting problems.

If two things are done in one patch, and one of them causes a problem,
then bisection is a very effective way of finding out what exactly
caused a problem.

But bisection is not the only benefit from breaking down larger patches
into smaller ones.

We want to have more people joining our ranks, doing low level tooling
and kernel work.

Writing new functionality as a series of patches of growing complexity
is a way to reduce the cognitive load of understanding how something
works.

As much as trying to emulate how the kernel community works is a good
model, as that community has been producing a lot of good code at a
frantic, athletic pace, and as much as I can agree with you that adding
a new piece of code will not affect bisectability as it's new code, I
think having it broken down into multiple patches benefits reviewing.

Reviewing is something we should do more, but it's very taxing.

One would rather try to write as much code as possible, leaving to
others the reviewing part.

But it's a balancing act.

Whatever we can do to help reviewers, like taking into account what they
say they would prefer as a way to submit our work, even if it isn't
exactly to our liking, is one such thing.

So if Namhyung says that it would be best for you to try to break down
your patches into smaller ones, like I did say to you in the past, even
taking the trouble to do it myself, in the process introducing problems,
later fixed, I think you should try to do what he says.

He is the maintainer, try to address his comments.

- Arnaldo


* Re: [PATCH v6 0/5] Hwmon PMUs
  2024-10-25 21:01             ` Arnaldo Carvalho de Melo
@ 2024-10-25 23:07               ` Ian Rogers
  2024-10-26 17:16                 ` Namhyung Kim
  0 siblings, 1 reply; 15+ messages in thread
From: Ian Rogers @ 2024-10-25 23:07 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Namhyung Kim, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang,
	Ravi Bangoria, Weilin Wang, Yoshihiro Furudera, James Clark,
	Athira Jajeev, Howard Chu, Oliver Upton, Changbin Du, Ze Gao,
	Junhao He, linux-kernel, linux-perf-users

On Fri, Oct 25, 2024 at 2:01 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Fri, Oct 25, 2024 at 11:26:26AM -0700, Ian Rogers wrote:
> > On Fri, Oct 25, 2024 at 10:30 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > On Thu, Oct 24, 2024 at 06:33:27PM -0700, Ian Rogers wrote:
> > > > So I think moving the enum declarations into one patch is okay. But as
> > > > the enum values have no bearing on hardware constants, or something
> > > > outside of the code that uses them it smells strange to me. Ultimately
> > > > this is going to do little to the lines of code count but damage
> > > > readability. I'm not sure why we're doing this given the kernel model
> > > > for adding a driver is to add it as a large chunk. For example, here
> > > > is adding the intel PT driver:
> > > > https://lore.kernel.org/all/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com/T/#u
>
> > > Maybe others can understand a big patch easily, but I'm not.
>
> > My understanding is that we make small patches so that the codebase is
> > more bisectable. When there is something new, like a driver or here a
>
> That is super important, having patches being super small and doing just
> one thing helps in bisecting problems.
>
> If two things are done in one patch, and one of them causes a problem,
> then bisection is a very effective way of finding out what exactly
> caused a problem.
>
> But bisection is not the only benefit from breaking down larger patches
> into smaller ones.
>
> We want to have more people joining our ranks, doing low level tooling
> and kernel work.
>
> Writing new functionality in a series of patches, growing in complexity
> is a way to reduce the cognitive load of understanding how something
> works.
>
> As much as trying to emulate how the kernel community works is a good
> model as that community has been producing a lot of good code in a
> frantic, athletic pace, and as much as I can agree with you that adding
> a new piece of code will not affect bisectability as it's new code, I
> think having it broken down in multiple patches benefits reviewing.

Can you explain how, as asked, separating the declaration of a
function from its definition aids in reviewing? As a reviewer, I want
to know the scope of a function and its documentation. Placing them in
2 separate patches doesn't benefit my reviewing.

> Reviewing is something we should do more, but it's very taxing.
>
> One would rather try to write as much code as possible, leaving to
> others the reviewing part.
>
> But it's a balancing act.
>
> Whatever we can do to help reviewers, like taking into account what they
> say they would prefer as a way to submit our work, even if it isn't
> exactly of our liking, is one such thing.
>
> So if Namhyung says that it would be best for you to try to break down
> your patches into smaller ones, like I did say to you in the past, even
> taking the trouble to do it myself, in the process introducing problems,
> later fixed, I think you should try to do what he says.
>
> He is the maintainer, try to address his comments.

I think I've written long emails addressing the comments. Just saying
"too big" (1) doesn't match how existing drivers are added (although
I've split the code many times so the addition is the smallest it can
be) and (2), as I've pointed out, makes the code harder to bisect, work
with compilers and understand.

I think there is far too much developer pushback going on; it feels
capricious. I'm lucky, as I'll just go push into Google's tree. I'm
only persisting here for upstream's benefit and ultimately my benefit
when I pull from upstream. Perfect shouldn't be the enemy of good, but
frequently (more often than not for me) reviewer comments aren't
improving the code; they are significantly stalling it:

1) parallel testing
https://lore.kernel.org/lkml/20241025192109.132482-1-irogers@google.com/
1.1) pushed back because it used an #ifdef __linux__ to maintain some
posix library code (a now dropped complaint)
1.2) pushed back for improvements in test numbering, addressed in:
https://lore.kernel.org/lkml/20241025192109.132482-11-irogers@google.com/
not an unreasonable thing to do but feature creep. Hey we'll only take
your work helping us if you also add feature xyz

2) libdw clean up
https://lore.kernel.org/lkml/20241017002520.59124-1-irogers@google.com/
Pushed back as more cross architecture output would make the commit
messages better. Doesn't sound crazily unreasonable until you realize
the function that is being called and needing cross platform testing
is 6 lines long and only applies when you do analysis of x86 perf.data
files on non-x86 platforms. We heavily test the code on x86 and the
chance that cross platform testing will show anything is very small.

On the other hand, I can point at unreviewed maintainer code going into
the tree, and at code that I've pointed out is broken from a
fundamental CS perspective yet is also taken into the tree.

RISC-V has been damaged and now in the driver they are trying to
work around the perf tool. There were already comments to this effect
in the ARM breakpoint driver's code.

On Intel we now have TPEBS (which took far far too long to land)
behind a flag which means we've made accurate top-down analysis
require an additional flag on all newer Intel models, something I
pushed against.

So the reviewing is inconsistent, damages the code (a maintainer may
disagree with the reviewer and developers saying otherwise but the
maintainer has to be followed to land) and is constantly stalling
development. Fixing reference counting took years to land because of
endless stalling, any reasonable developer would have just given up.
It is hard to imagine the state the code base would be in without it.

Of the patches I've mentioned, how many are code health and how many
are a feature I can say working on is part of my day job? I see a
deliberate lack of understanding of what a developer needs. As for
saying I've not tried to address comments, I'd say 90% of the noise on
linux-perf-users is me resending patches, mine and others', to address
comments. Here I've made the patches a size that makes sense. I can
move the enums, which feels like a compiler error along the lines of
"static function defined but not used", but besides this, changing
evsel's name meaning to make it part of the event encoding is imo
wrong. As for having separate patches for a function declaration and
then one for its definition, can you imagine taking this to its extreme
and what the patches would look like if you did this? In making things
smaller, as has happened already in this series, it is never clear you
will hit a magical maintainer happy threshold. Knowing how to make a
"right" patch is even harder when it is inconsistent with the rest of
Linux development.

Thanks,
Ian


* Re: [PATCH v6 0/5] Hwmon PMUs
  2024-10-25 23:07               ` Ian Rogers
@ 2024-10-26 17:16                 ` Namhyung Kim
  0 siblings, 0 replies; 15+ messages in thread
From: Namhyung Kim @ 2024-10-26 17:16 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Adrian Hunter,
	Kan Liang, Ravi Bangoria, Weilin Wang, Yoshihiro Furudera,
	James Clark, Athira Jajeev, Howard Chu, Oliver Upton, Changbin Du,
	Ze Gao, Junhao He, linux-kernel, linux-perf-users

Hello Ian,

Thanks for your email explaining the concerns.

On Fri, Oct 25, 2024 at 04:07:47PM -0700, Ian Rogers wrote:
> On Fri, Oct 25, 2024 at 2:01 PM Arnaldo Carvalho de Melo
> <acme@kernel.org> wrote:
> >
> > On Fri, Oct 25, 2024 at 11:26:26AM -0700, Ian Rogers wrote:
> > > On Fri, Oct 25, 2024 at 10:30 AM Namhyung Kim <namhyung@kernel.org> wrote:
> > > > On Thu, Oct 24, 2024 at 06:33:27PM -0700, Ian Rogers wrote:
> > > > > So I think moving the enum declarations into one patch is okay. But as
> > > > > the enum values have no bearing on hardware constants, or something
> > > > > outside of the code that uses them it smells strange to me. Ultimately
> > > > > this is going to do little to the lines of code count but damage
> > > > > readability. I'm not sure why we're doing this given the kernel model
> > > > > for adding a driver is to add it as a large chunk. For example, here
> > > > > is adding the intel PT driver:
> > > > > https://lore.kernel.org/all/1422614392-114498-1-git-send-email-alexander.shishkin@linux.intel.com/T/#u
> >
> > > > Maybe others can understand a big patch easily, but I'm not.
> >
> > > My understanding is that we make small patches so that the codebase is
> > > more bisectable. When there is something new, like a driver or here a
> >
> > That is super important, having patches being super small and doing just
> > one thing helps in bisecting problems.
> >
> > If two things are done in one patch, and one of them causes a problem,
> > then bisection is a very effective way of finding out what exactly
> > caused a problem.
> >
> > But bisection is not the only benefit from breaking down larger patches
> > into smaller ones.
> >
> > We want to have more people joining our ranks, doing low level tooling
> > and kernel work.
> >
> > Writing new functionality in a series of patches, growing in complexity
> > > is a way to reduce the cognitive load of understanding how something
> > works.
> >
> > As much as trying to emulate how the kernel community works is a good
> > model as that community has been producing a lot of good code in a
> > frantic, athletic pace, and as much as I can agree with you that adding
> > > a new piece of code will not affect bisectability as it's new code, I
> > > think having it broken down in multiple patches benefits reviewing.
> 
> Can you explain how, as asked, separating the declaration of a
> function from its definition aids in reviewing? As a reviewer, I want
> to know the scope of a function and its documentation. Placing them in
> 2 separate patches doesn't benefit my reviewing.

No, it's my fault.  Please move the declaration into the same patch.

> 
> > Reviewing is something we should do more, but it's very taxing.
> >
> > One would rather try to write as much code as possible, leaving to
> > others the reviewing part.
> >
> > But it's a balancing act.
> >
> > Whatever we can do to help reviewers, like taking into account what they
> > say they would prefer as a way to submit our work, even if it isn't
> > exactly of our liking, is one such thing.
> >
> > So if Namhyung says that it would be best for you to try to break down
> > your patches into smaller ones, like I did say to you in the past, even
> > taking the trouble to do it myself, in the process introducing problems,
> > later fixed, I think you should try to do what he says.
> >
> > He is the maintainer, try to address his comments.
> 
> I think I've written long emails addressing the comments. Just saying
> too big (1) doesn't match how existing drivers are added (although
> I've split the code many times so the addition is the smallest it can

I think it's different from drivers, which can easily be separated by a
config option and are highly hardware dependent.  Or maybe it's just a
maintainers' preference.


> be) (2) as I've pointed out makes the code harder to bisect, work with
> compilers and understand.

I don't agree.  The intention is to help others' understanding of the
code.  Well, I agree it will require more effort from the author, but I
believe that having the code in a more digestible size would be a
benefit in the long run.

I feel like the whole perf code base is getting bigger, and it is harder
to know all the details.  So I'm asking contributors to do more work to
reduce the burden in some way.

> 
> I think there is far too much developer push back going on, it feels
> capricious, I'm lucky as I'll just go push into Google's tree. I'm
> only persisting here for upstream's benefit and ultimately my benefit
> when I pull from upstream. Perfect shouldn't be the enemy of good, but
> frequently (more often than not for me) reviewer comments aren't
> improving the code they are significantly stalling it:

I'm sorry that you felt that way.  I was trying to improve the code and
keep the code simple and concise.  But it's sometimes hard to draw the
line where it's acceptable.  Probably my previous decisions were bad,
but I tried to be as reasonable as possible.

> 
> 1) parallel testing
> https://lore.kernel.org/lkml/20241025192109.132482-1-irogers@google.com/
> 1.1) pushed back because it used an #ifdef __linux__ to maintain some
> posix library code (a now dropped complaint)

I don't know that we have that in other places.  So I was curious if we
care about other platforms.  Probably we can just delete the unused
code, but as I said, I can live with this.


> 1.2) pushed back for improvements in test numbering, addressed in:
> https://lore.kernel.org/lkml/20241025192109.132482-11-irogers@google.com/
> not an unreasonable thing to do but feature creep. Hey we'll only take
> your work helping us if you also add feature xyz

Well I think you changed the numbering in the parallel testing and I
asked to keep it continuous as of now.

> 
> 2) libdw clean up
> https://lore.kernel.org/lkml/20241017002520.59124-1-irogers@google.com/
> Pushed back as more cross architecture output would make the commit
> messages better. Doesn't sound crazily unreasonable until you realize
> the function that is being called and needing cross platform testing
> is 6 lines long and only applies when you do analysis of x86 perf.data
> files on non-x86 platforms. We heavily test the code on x86 and the
> chance that cross platform testing will show anything is very small.

I think we're fine except for the register naming.  I haven't reviewed
that part yet and I'll do that next week.

> 
> On the other hand I can point at unreviewed maintainer code going into
> the tree and code where I've pointed out it is broken, from a
> fundamental CS perspective, it is also taken into the tree.
> 
> RISC-V has been damaged and now in the driver they are trying to
> work around the perf tool. There were already comments to this effect
> in the ARM breakpoint driver's code.

It's sad we broke some arch.  But it should be easier to fix the tool
than the kernel driver.  Let's find a way to fix the problems in a
better way.  Sorry if I missed some previous discussion.

> 
> On Intel we now have TPEBS (which took far far too long to land)
> behind a flag which means we've made accurate top-down analysis
> require an additional flag on all newer Intel models, something I
> pushed against.

This is new code that requires complex and non-intuitive operations
like mixing perf record and perf stat together.  I remember some people
had doubts about the approach, so it may deserve stricter and longer
reviews.

Even then, I tried to review quickly and accepted some minor
disagreements.  It's unfortunate it took too long but it happens.

> 
> So the reviewing is inconsistent, damages the code (a maintainer may
> disagree with the reviewer and developers saying otherwise but the
> maintainer has to be followed to land) and is constantly stalling
> development. Fixing reference counting took years to land because of
> endless stalling, any reasonable developer would have just given up.
> It is hard to imagine the state the code base would be in without it.

Right it works great, thanks for your effort.

But I think any collaborative work carries some kind of burden (or
stalling) for coordination.  If said patches touch a lot of areas, there
is a high chance of rebases, more arguments on the interface, etc.

> 
> Of the patches I've mentioned how many are code health and how many
> are a feature I can say working on is part of my day job? I see a
> deliberate lack of understanding of what a developer needs. To say
> I've not tried to address comments, I'd say 90% of the noise on
> linux-perf-users is me resending patches, mine and others, to address

Thanks for your hard work.  Maybe I'm the bottleneck of your
productivity.  But I cannot take patches without review, and reviewing
patches takes time.  Having more active reviewers would help.


> comments. Here I've made the patches a size that makes sense. I can
> move the enums, which feels like a compiler error along the lines of
> "static function defined but not used" but beside this, changing
> evsel's name meaning to make it part of the event encoding is imo
> wrong, having separate patches for a function declaration and then 1
> for its definition, can you imagine taking this to its extreme and
> what the patches would look like if you did this? In making things
> smaller, as has happened already in this series, it is never clear you
> will hit a magical maintainer happy threshold. Knowing how to make a
> "right" patch is even harder when it is inconsistent with the rest of
> Linux development.

At least for this case, I don't think moving the declaration would cause
you a lot of trouble.  Please take a look at my updated 'perf/hwmon-pmu'
branch again.

For the magical maintainer threshold, I admit it can be different for
each maintainer.  But I think the review process is there to find a
point of agreement.  I don't know where it is even myself, but we can
argue with each other.

Thanks,
Namhyung



end of thread, other threads:[~2024-10-26 17:16 UTC | newest]

Thread overview: 15+ messages
2024-10-22 18:06 [PATCH v6 0/5] Hwmon PMUs Ian Rogers
2024-10-22 18:06 ` [PATCH v6 1/5] tools api io: Ensure line_len_out is always initialized Ian Rogers
2024-10-22 18:06 ` [PATCH v6 2/5] perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs Ian Rogers
2024-10-22 18:06 ` [PATCH v6 3/5] perf pmu: Add calls enabling the hwmon_pmu Ian Rogers
2024-10-22 18:06 ` [PATCH v6 4/5] perf test: Add hwmon "PMU" test Ian Rogers
2024-10-22 18:06 ` [PATCH v6 5/5] perf docs: Document tool and hwmon events Ian Rogers
2024-10-24  3:06 ` [PATCH v6 0/5] Hwmon PMUs Namhyung Kim
2024-10-24  7:07   ` Ian Rogers
2024-10-24 16:40     ` Namhyung Kim
2024-10-25  1:33       ` Ian Rogers
2024-10-25 17:30         ` Namhyung Kim
2024-10-25 18:26           ` Ian Rogers
2024-10-25 21:01             ` Arnaldo Carvalho de Melo
2024-10-25 23:07               ` Ian Rogers
2024-10-26 17:16                 ` Namhyung Kim
