linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf/core: move all of the pmu devices into their own location
@ 2025-02-03 19:25 Greg Kroah-Hartman
  2025-02-03 19:44 ` Ian Rogers
  2025-02-04  7:41 ` Alexander Shishkin
  0 siblings, 2 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-03 19:25 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel
  Cc: Greg Kroah-Hartman, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Liang, Kan

In sysfs, for some reason, all pmu devices seem to show up in the "root"
of /sys/devices/ making for a confusing mess as these devices are not
really at the root of the system at all.

Create a fake root devices, "pmu_bus" and place them all under there if
they do not already have a parent device set, cleaning up sysfs to look
more sane.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---

Note, if you all don't like "pmu_bus" for the name, that's fine, please
let me know and I can rename it to something else, but it should be
something to get these objects out of the root sysfs directory.

 kernel/events/core.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index bcb09e011e9e..786537faed2c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11767,6 +11767,11 @@ static const struct attribute_group *pmu_dev_groups[] = {
 };
 
 static int pmu_bus_running;
+
+static struct device pmu_bus_root = {
+	.init_name	= "pmu_bus",
+};
+
 static struct bus_type pmu_bus = {
 	.name		= "event_source",
 	.dev_groups	= pmu_dev_groups,
@@ -11790,7 +11795,10 @@ static int pmu_dev_alloc(struct pmu *pmu)
 
 	dev_set_drvdata(pmu->dev, pmu);
 	pmu->dev->bus = &pmu_bus;
-	pmu->dev->parent = pmu->parent;
+	if (pmu->parent)
+		pmu->dev->parent = pmu->parent;
+	else
+		pmu->dev->parent = &pmu_bus_root;
 	pmu->dev->release = pmu_dev_release;
 
 	ret = dev_set_name(pmu->dev, "%s", pmu->name);
@@ -14232,9 +14240,17 @@ static int __init perf_event_sysfs_init(void)
 
 	mutex_lock(&pmus_lock);
 
-	ret = bus_register(&pmu_bus);
-	if (ret)
+	ret = device_register(&pmu_bus_root);
+	if (ret) {
+		put_device(&pmu_bus_root);
 		goto unlock;
+	}
+
+	ret = bus_register(&pmu_bus);
+	if (ret) {
+		device_unregister(&pmu_bus_root);
+		goto unlock;
+	}
 
 	list_for_each_entry(pmu, &pmus, entry) {
 		if (pmu->dev)
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-03 19:25 [PATCH] perf/core: move all of the pmu devices into their own location Greg Kroah-Hartman
@ 2025-02-03 19:44 ` Ian Rogers
  2025-02-04  7:05   ` Greg Kroah-Hartman
  2025-02-04  7:41 ` Alexander Shishkin
  1 sibling, 1 reply; 22+ messages in thread
From: Ian Rogers @ 2025-02-03 19:44 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan

On Mon, Feb 3, 2025 at 11:25 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> In sysfs, for some reason, all pmu devices seem to show up in the "root"
> of /sys/devices/ making for a confusing mess as these devices are not
> really at the root of the system at all.
>
> Create a fake root devices, "pmu_bus" and place them all under there if
> they do not already have a parent device set, cleaning up sysfs to look
> more sane.
>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: "Liang, Kan" <kan.liang@linux.intel.com>
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>
> Note, if you all don't like "pmu_bus" for the name, that's fine, please
> let me know and I can rename it to something else, but it should be
> something to get these objects out of the root sysfs directory.

Excuse my ignorance, does the event_source bus already provide a
similar solution?
```
$ ls /sys/bus/event_source/devices
breakpoint   i915       msr         uncore_arb     uncore_cbox_3
uncore_cbox_7              uprobe
cpu          intel_bts  power       uncore_cbox_0  uncore_cbox_4  uncore_clock
cstate_core  intel_pt   software    uncore_cbox_1  uncore_cbox_5
uncore_imc_free_running_0
cstate_pkg   kprobe     tracepoint  uncore_cbox_2  uncore_cbox_6
uncore_imc_free_running_1
$ ls /sys/devices
breakpoint   intel_pt     platform    uncore_arb     uncore_cbox_5
         uprobe
cpu          isa          pnp0        uncore_cbox_0  uncore_cbox_6
         virtual
cstate_core  kprobe       power       uncore_cbox_1  uncore_cbox_7
cstate_pkg   LNXSYSTM:00  software    uncore_cbox_2  uncore_clock
i915         msr          system      uncore_cbox_3  uncore_imc_free_running_0
intel_bts    pci0000:00   tracepoint  uncore_cbox_4  uncore_imc_free_running_1
```

Thanks,
Ian

>  kernel/events/core.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index bcb09e011e9e..786537faed2c 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -11767,6 +11767,11 @@ static const struct attribute_group *pmu_dev_groups[] = {
>  };
>
>  static int pmu_bus_running;
> +
> +static struct device pmu_bus_root = {
> +       .init_name      = "pmu_bus",
> +};
> +
>  static struct bus_type pmu_bus = {
>         .name           = "event_source",
>         .dev_groups     = pmu_dev_groups,
> @@ -11790,7 +11795,10 @@ static int pmu_dev_alloc(struct pmu *pmu)
>
>         dev_set_drvdata(pmu->dev, pmu);
>         pmu->dev->bus = &pmu_bus;
> -       pmu->dev->parent = pmu->parent;
> +       if (pmu->parent)
> +               pmu->dev->parent = pmu->parent;
> +       else
> +               pmu->dev->parent = &pmu_bus_root;
>         pmu->dev->release = pmu_dev_release;
>
>         ret = dev_set_name(pmu->dev, "%s", pmu->name);
> @@ -14232,9 +14240,17 @@ static int __init perf_event_sysfs_init(void)
>
>         mutex_lock(&pmus_lock);
>
> -       ret = bus_register(&pmu_bus);
> -       if (ret)
> +       ret = device_register(&pmu_bus_root);
> +       if (ret) {
> +               put_device(&pmu_bus_root);
>                 goto unlock;
> +       }
> +
> +       ret = bus_register(&pmu_bus);
> +       if (ret) {
> +               device_unregister(&pmu_bus_root);
> +               goto unlock;
> +       }
>
>         list_for_each_entry(pmu, &pmus, entry) {
>                 if (pmu->dev)
> --
> 2.48.1
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-03 19:44 ` Ian Rogers
@ 2025-02-04  7:05   ` Greg Kroah-Hartman
  2025-02-04 18:17     ` Ian Rogers
  0 siblings, 1 reply; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-04  7:05 UTC (permalink / raw)
  To: Ian Rogers
  Cc: linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan

On Mon, Feb 03, 2025 at 11:44:13AM -0800, Ian Rogers wrote:
> On Mon, Feb 3, 2025 at 11:25 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > In sysfs, for some reason, all pmu devices seem to show up in the "root"
> > of /sys/devices/ making for a confusing mess as these devices are not
> > really at the root of the system at all.
> >
> > Create a fake root devices, "pmu_bus" and place them all under there if
> > they do not already have a parent device set, cleaning up sysfs to look
> > more sane.
> >
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > Cc: Jiri Olsa <jolsa@kernel.org>
> > Cc: Ian Rogers <irogers@google.com>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: "Liang, Kan" <kan.liang@linux.intel.com>
> > Cc: linux-perf-users@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > ---
> >
> > Note, if you all don't like "pmu_bus" for the name, that's fine, please
> > let me know and I can rename it to something else, but it should be
> > something to get these objects out of the root sysfs directory.
> 
> Excuse my ignorance, does the event_source bus already provide a
> similar solution?
> ```
> $ ls /sys/bus/event_source/devices
> breakpoint   i915       msr         uncore_arb     uncore_cbox_3
> uncore_cbox_7              uprobe
> cpu          intel_bts  power       uncore_cbox_0  uncore_cbox_4  uncore_clock
> cstate_core  intel_pt   software    uncore_cbox_1  uncore_cbox_5
> uncore_imc_free_running_0
> cstate_pkg   kprobe     tracepoint  uncore_cbox_2  uncore_cbox_6
> uncore_imc_free_running_1
> $ ls /sys/devices
> breakpoint   intel_pt     platform    uncore_arb     uncore_cbox_5
>          uprobe
> cpu          isa          pnp0        uncore_cbox_0  uncore_cbox_6
>          virtual
> cstate_core  kprobe       power       uncore_cbox_1  uncore_cbox_7
> cstate_pkg   LNXSYSTM:00  software    uncore_cbox_2  uncore_clock
> i915         msr          system      uncore_cbox_3  uncore_imc_free_running_0
> intel_bts    pci0000:00   tracepoint  uncore_cbox_4  uncore_imc_free_running_1
> ```

Those are the exact same device structures :)

Look at /sys/bus/event_source/devices/ those are symlinks back to
/sys/devices.  In other words, that's the exact same structures, and
the mess in /sys/devices/ with these event devices is what I am trying
to clean up.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-03 19:25 [PATCH] perf/core: move all of the pmu devices into their own location Greg Kroah-Hartman
  2025-02-03 19:44 ` Ian Rogers
@ 2025-02-04  7:41 ` Alexander Shishkin
  2025-02-04 10:16   ` Greg Kroah-Hartman
  1 sibling, 1 reply; 22+ messages in thread
From: Alexander Shishkin @ 2025-02-04  7:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman, linux-perf-users, linux-kernel
  Cc: Greg Kroah-Hartman, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Liang, Kan, alexander.shishkin

Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:

> In sysfs, for some reason, all pmu devices seem to show up in the "root"
> of /sys/devices/ making for a confusing mess as these devices are not
> really at the root of the system at all.
>
> Create a fake root devices, "pmu_bus" and place them all under there if
> they do not already have a parent device set, cleaning up sysfs to look
> more sane.

Yeah, so what happens to the userspace that uses them via /sys/devices/*
directly? Even I have scripts that do that.

Regards,
--
Alex

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04  7:41 ` Alexander Shishkin
@ 2025-02-04 10:16   ` Greg Kroah-Hartman
  2025-02-04 14:06     ` Liang, Kan
  0 siblings, 1 reply; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-04 10:16 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Jiri Olsa,
	Ian Rogers, Adrian Hunter, Liang, Kan

On Tue, Feb 04, 2025 at 09:41:03AM +0200, Alexander Shishkin wrote:
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
> 
> > In sysfs, for some reason, all pmu devices seem to show up in the "root"
> > of /sys/devices/ making for a confusing mess as these devices are not
> > really at the root of the system at all.
> >
> > Create a fake root devices, "pmu_bus" and place them all under there if
> > they do not already have a parent device set, cleaning up sysfs to look
> > more sane.
> 
> Yeah, so what happens to the userspace that uses them via /sys/devices/*
> directly? Even I have scripts that do that.

You should never be doing that, as you have no idea what type of devices
are in that location in the tree.  You should be doing what the
documentation says to do, and look in /sys/bus/event_source/devices/
instead.  That didn't change here.

Again, system topology can, and will, change all the time in
/sys/devices/ so expect that.  The only "stable" locations are the
symlinks in the /sys/bus/ and /sys/class/ locations, which is why those
symlinks are present.

Been that way for over a decade now :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 10:16   ` Greg Kroah-Hartman
@ 2025-02-04 14:06     ` Liang, Kan
  2025-02-04 15:21       ` Greg Kroah-Hartman
  2025-02-04 16:28       ` Greg Kroah-Hartman
  0 siblings, 2 replies; 22+ messages in thread
From: Liang, Kan @ 2025-02-04 14:06 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Alexander Shishkin
  Cc: linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Jiri Olsa,
	Ian Rogers, Adrian Hunter



On 2025-02-04 5:16 a.m., Greg Kroah-Hartman wrote:
> On Tue, Feb 04, 2025 at 09:41:03AM +0200, Alexander Shishkin wrote:
>> Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
>>
>>> In sysfs, for some reason, all pmu devices seem to show up in the "root"
>>> of /sys/devices/ making for a confusing mess as these devices are not
>>> really at the root of the system at all.
>>>
>>> Create a fake root devices, "pmu_bus" and place them all under there if
>>> they do not already have a parent device set, cleaning up sysfs to look
>>> more sane.
>>
>> Yeah, so what happens to the userspace that uses them via /sys/devices/*
>> directly? Even I have scripts that do that.
> 
> You should never be doing that, as you have no idea what type of devices
> are in that location in the tree.  You should be doing what the
> documentation says to do, and look in /sys/bus/event_source/devices/
> instead.  That didn't change here.
>

Not just the script, the /sys/devices/ is also used in the current perf
tool. For example,
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/mem-events.c#n192

And the comments and document in the perf tool.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/pmu.c#n39
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/perf-list.txt#n191

I think it should bring big impact for the end user, especially when
they still use an older perf tool and script.

Thanks,
Kan

> Again, system topology can, and will, change all the time in
> /sys/devices/ so expect that.  The only "stable" locations are the
> symlinks in the /sys/bus/ and /sys/class/ locations, which is why those
> symlinks are present.
> 
> Been that way for over a decade now :)
> 
> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 14:06     ` Liang, Kan
@ 2025-02-04 15:21       ` Greg Kroah-Hartman
  2025-02-04 16:28       ` Greg Kroah-Hartman
  1 sibling, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-04 15:21 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Alexander Shishkin, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter

On Tue, Feb 04, 2025 at 09:06:18AM -0500, Liang, Kan wrote:
> 
> 
> On 2025-02-04 5:16 a.m., Greg Kroah-Hartman wrote:
> > On Tue, Feb 04, 2025 at 09:41:03AM +0200, Alexander Shishkin wrote:
> >> Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
> >>
> >>> In sysfs, for some reason, all pmu devices seem to show up in the "root"
> >>> of /sys/devices/ making for a confusing mess as these devices are not
> >>> really at the root of the system at all.
> >>>
> >>> Create a fake root devices, "pmu_bus" and place them all under there if
> >>> they do not already have a parent device set, cleaning up sysfs to look
> >>> more sane.
> >>
> >> Yeah, so what happens to the userspace that uses them via /sys/devices/*
> >> directly? Even I have scripts that do that.
> > 
> > You should never be doing that, as you have no idea what type of devices
> > are in that location in the tree.  You should be doing what the
> > documentation says to do, and look in /sys/bus/event_source/devices/
> > instead.  That didn't change here.
> >
> 
> Not just the script, the /sys/devices/ is also used in the current perf
> tool. For example,
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/mem-events.c#n192

Ugh, it's ignoring the documentation that perf provides for how to find
these devices.  And it's really really risky to do that, you are
"assuming" that a subdir called "events" is never in any other type of
device.

I'll submit a patch to fix that.

> And the comments and document in the perf tool.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/pmu.c#n39
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/perf-list.txt#n191

I'll fix that document and the tools as well.

> I think it should bring big impact for the end user, especially when
> they still use an older perf tool and script.

perf is tied to the kernel tree, so we should be ok here.  We can also
take the perf tool fix for all stable kernels as it will be backwards
compatible.

The idea that somehow "events" are better than any other type of device
in the system and belong all individually in the root of the device tree
is a long-standing bug that should be fixed.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 14:06     ` Liang, Kan
  2025-02-04 15:21       ` Greg Kroah-Hartman
@ 2025-02-04 16:28       ` Greg Kroah-Hartman
  2025-02-04 16:41         ` Vince Weaver
  2025-02-04 18:23         ` Liang, Kan
  1 sibling, 2 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-04 16:28 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Alexander Shishkin, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter

On Tue, Feb 04, 2025 at 09:06:18AM -0500, Liang, Kan wrote:
> 
> 
> On 2025-02-04 5:16 a.m., Greg Kroah-Hartman wrote:
> > On Tue, Feb 04, 2025 at 09:41:03AM +0200, Alexander Shishkin wrote:
> >> Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
> >>
> >>> In sysfs, for some reason, all pmu devices seem to show up in the "root"
> >>> of /sys/devices/ making for a confusing mess as these devices are not
> >>> really at the root of the system at all.
> >>>
> >>> Create a fake root devices, "pmu_bus" and place them all under there if
> >>> they do not already have a parent device set, cleaning up sysfs to look
> >>> more sane.
> >>
> >> Yeah, so what happens to the userspace that uses them via /sys/devices/*
> >> directly? Even I have scripts that do that.
> > 
> > You should never be doing that, as you have no idea what type of devices
> > are in that location in the tree.  You should be doing what the
> > documentation says to do, and look in /sys/bus/event_source/devices/
> > instead.  That didn't change here.
> >
> 
> Not just the script, the /sys/devices/ is also used in the current perf
> tool. For example,
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/mem-events.c#n192
> 
> And the comments and document in the perf tool.
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/pmu.c#n39

Note, this file looks to work properly, it's just the comment that is
incorrect.

Any hints on what perf command or test I should run to verify I did get
this all right?

> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/perf-list.txt#n191

I'll fix that.

> I think it should bring big impact for the end user, especially when
> they still use an older perf tool and script.

It seems that the majority of the perf code IS looking in the correct
place, just mem-events.c seemed wrong.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 16:28       ` Greg Kroah-Hartman
@ 2025-02-04 16:41         ` Vince Weaver
  2025-02-04 17:12           ` Greg Kroah-Hartman
  2025-02-04 18:23         ` Liang, Kan
  1 sibling, 1 reply; 22+ messages in thread
From: Vince Weaver @ 2025-02-04 16:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Liang, Kan, Alexander Shishkin, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter

On Tue, 4 Feb 2025, Greg Kroah-Hartman wrote:

> 
> It seems that the majority of the perf code IS looking in the correct
> place, just mem-events.c seemed wrong.

I hate to tell you, but other places in userspace are depending on the 
current setup.  libpfm4, used by PAPI, is looking directly in /sys/devices 
for pmus and will break with the changes you are planning.

Vince Weaver
vincent.weaver@maine.edu


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 16:41         ` Vince Weaver
@ 2025-02-04 17:12           ` Greg Kroah-Hartman
  2025-02-04 17:49             ` Ian Rogers
  0 siblings, 1 reply; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-04 17:12 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Liang, Kan, Alexander Shishkin, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter

On Tue, Feb 04, 2025 at 11:41:38AM -0500, Vince Weaver wrote:
> On Tue, 4 Feb 2025, Greg Kroah-Hartman wrote:
> 
> > 
> > It seems that the majority of the perf code IS looking in the correct
> > place, just mem-events.c seemed wrong.
> 
> I hate to tell you, but other places in userspace are depending on the 
> current setup.  libpfm4, used by PAPI, is looking directly in /sys/devices 
> for pmus and will break with the changes you are planning.

Then that too needs to be fixed, sorry.  Again, devices can, and will,
move around in /sys/devices/ you can never hard-code any paths there.
Any userspace code must ALWAYS be able to handle that, that's a sysfs
requirement.

And do you have a link to the source for that code?  Good news is that
if the code is fixed in userspace, it will work for any kernel (old or
new).

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 17:12           ` Greg Kroah-Hartman
@ 2025-02-04 17:49             ` Ian Rogers
  2025-02-04 18:03               ` Greg Kroah-Hartman
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Rogers @ 2025-02-04 17:49 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Stephane Eranian, Vince Weaver
  Cc: Liang, Kan, Alexander Shishkin, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Adrian Hunter

On Tue, Feb 4, 2025 at 9:12 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Tue, Feb 04, 2025 at 11:41:38AM -0500, Vince Weaver wrote:
> > On Tue, 4 Feb 2025, Greg Kroah-Hartman wrote:
> >
> > >
> > > It seems that the majority of the perf code IS looking in the correct
> > > place, just mem-events.c seemed wrong.
> >
> > I hate to tell you, but other places in userspace are depending on the
> > current setup.  libpfm4, used by PAPI, is looking directly in /sys/devices
> > for pmus and will break with the changes you are planning.
>
> Then that too needs to be fixed, sorry.  Again, devices can, and will,
> move around in /sys/devices/ you can never hard-code any paths there.
> Any userspace code must ALWAYS be able to handle that, that's a sysfs
> requirement.
>
> And do you have a link to the source for that code?  Good news is that
> if the code is fixed in userspace, it will work for any kernel (old or
> new).

+Stephane Eranian

I see use of /sys/devices in at least:
https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/lib/pfmlib_perf_event_pmu.c
I imagine it isn't a big job to clean it up.

Thanks,
Ian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 17:49             ` Ian Rogers
@ 2025-02-04 18:03               ` Greg Kroah-Hartman
  2025-02-05  1:21                 ` Vince Weaver
  0 siblings, 1 reply; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-04 18:03 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Stephane Eranian, Vince Weaver, Liang, Kan, Alexander Shishkin,
	linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Jiri Olsa,
	Adrian Hunter

On Tue, Feb 04, 2025 at 09:49:55AM -0800, Ian Rogers wrote:
> On Tue, Feb 4, 2025 at 9:12 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Feb 04, 2025 at 11:41:38AM -0500, Vince Weaver wrote:
> > > On Tue, 4 Feb 2025, Greg Kroah-Hartman wrote:
> > >
> > > >
> > > > It seems that the majority of the perf code IS looking in the correct
> > > > place, just mem-events.c seemed wrong.
> > >
> > > I hate to tell you, but other places in userspace are depending on the
> > > current setup.  libpfm4, used by PAPI, is looking directly in /sys/devices
> > > for pmus and will break with the changes you are planning.
> >
> > Then that too needs to be fixed, sorry.  Again, devices can, and will,
> > move around in /sys/devices/ you can never hard-code any paths there.
> > Any userspace code must ALWAYS be able to handle that, that's a sysfs
> > requirement.
> >
> > And do you have a link to the source for that code?  Good news is that
> > if the code is fixed in userspace, it will work for any kernel (old or
> > new).
> 
> +Stephane Eranian
> 
> I see use of /sys/devices in at least:
> https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/lib/pfmlib_perf_event_pmu.c
> I imagine it isn't a big job to clean it up.

Nope, all that has to happen is this line:
	snprintf(buf, PATH_MAX, "/sys/devices/%s/events/%s", e->pmu ? e->pmu : "cpu", e->name);
be changed to:
	snprintf(buf, PATH_MAX, "/sys/bus/event_source/devices/%s/events/%s", e->pmu ? e->pmu : "cpu", e->name);


thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04  7:05   ` Greg Kroah-Hartman
@ 2025-02-04 18:17     ` Ian Rogers
  2025-02-05  5:41       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Rogers @ 2025-02-04 18:17 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan

On Mon, Feb 3, 2025 at 11:05 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Mon, Feb 03, 2025 at 11:44:13AM -0800, Ian Rogers wrote:
> > On Mon, Feb 3, 2025 at 11:25 AM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
[snip]
> > > Note, if you all don't like "pmu_bus" for the name, that's fine, please
> > > let me know and I can rename it to something else, but it should be
> > > something to get these objects out of the root sysfs directory.
> >
> > Excuse my ignorance, does the event_source bus already provide a
> > similar solution?
> > ```
> > $ ls /sys/bus/event_source/devices
> > breakpoint   i915       msr         uncore_arb     uncore_cbox_3
> > uncore_cbox_7              uprobe
> > cpu          intel_bts  power       uncore_cbox_0  uncore_cbox_4  uncore_clock
> > cstate_core  intel_pt   software    uncore_cbox_1  uncore_cbox_5
> > uncore_imc_free_running_0
> > cstate_pkg   kprobe     tracepoint  uncore_cbox_2  uncore_cbox_6
> > uncore_imc_free_running_1
> > $ ls /sys/devices
> > breakpoint   intel_pt     platform    uncore_arb     uncore_cbox_5
> >          uprobe
> > cpu          isa          pnp0        uncore_cbox_0  uncore_cbox_6
> >          virtual
> > cstate_core  kprobe       power       uncore_cbox_1  uncore_cbox_7
> > cstate_pkg   LNXSYSTM:00  software    uncore_cbox_2  uncore_clock
> > i915         msr          system      uncore_cbox_3  uncore_imc_free_running_0
> > intel_bts    pci0000:00   tracepoint  uncore_cbox_4  uncore_imc_free_running_1
> > ```
>
> Those are the exact same device structures :)
>
> Look at /sys/bus/event_source/devices/ those are symlinks back to
> /sys/devices.  In other words, that's the exact same structures, and
> the mess in /sys/devices/ with these event devices is what I am trying
> to clean up.

Thanks! Again my ignorance, why do we need pmu_bus? Could the symlinks
in event_source be the actual devices? Perhaps for temporary
compatibility some symlinks could be placed in /sys/devices?

Thanks,
Ian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 16:28       ` Greg Kroah-Hartman
  2025-02-04 16:41         ` Vince Weaver
@ 2025-02-04 18:23         ` Liang, Kan
  2025-02-05 16:00           ` Greg Kroah-Hartman
  1 sibling, 1 reply; 22+ messages in thread
From: Liang, Kan @ 2025-02-04 18:23 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Alexander Shishkin, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter



On 2025-02-04 11:28 a.m., Greg Kroah-Hartman wrote:
> On Tue, Feb 04, 2025 at 09:06:18AM -0500, Liang, Kan wrote:
>>
>>
>> On 2025-02-04 5:16 a.m., Greg Kroah-Hartman wrote:
>>> On Tue, Feb 04, 2025 at 09:41:03AM +0200, Alexander Shishkin wrote:
>>>> Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
>>>>
>>>>> In sysfs, for some reason, all pmu devices seem to show up in the "root"
>>>>> of /sys/devices/ making for a confusing mess as these devices are not
>>>>> really at the root of the system at all.
>>>>>
>>>>> Create a fake root devices, "pmu_bus" and place them all under there if
>>>>> they do not already have a parent device set, cleaning up sysfs to look
>>>>> more sane.
>>>>
>>>> Yeah, so what happens to the userspace that uses them via /sys/devices/*
>>>> directly? Even I have scripts that do that.
>>>
>>> You should never be doing that, as you have no idea what type of devices
>>> are in that location in the tree.  You should be doing what the
>>> documentation says to do, and look in /sys/bus/event_source/devices/
>>> instead.  That didn't change here.
>>>
>>
>> Not just the script, the /sys/devices/ is also used in the current perf
>> tool. For example,
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/mem-events.c#n192
>>
>> And the comments and document in the perf tool.
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/pmu.c#n39
> 
> Note, this file looks to work properly, it's just the comment that is
> incorrect.
> 
> Any hints on what perf command or test I should run to verify I did get
> this all right?

For the perf tool, "perf test" should be run at minimum.

> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/perf-list.txt#n191
> 
> I'll fix that.
> 
>> I think it should bring big impact for the end user, especially when
>> they still use an older perf tool and script.
> 
> It seems that the majority of the perf code IS looking in the correct
> place, just mem-events.c seemed wrong.
> 

There should be two more.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/builtin-stat.c#n100

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/x86/util/iostat.c#n35

Thanks,
Kan

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 18:03               ` Greg Kroah-Hartman
@ 2025-02-05  1:21                 ` Vince Weaver
  2025-02-05  5:45                   ` Greg Kroah-Hartman
  0 siblings, 1 reply; 22+ messages in thread
From: Vince Weaver @ 2025-02-05  1:21 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Ian Rogers, Stephane Eranian, Vince Weaver, Liang, Kan,
	Alexander Shishkin, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Adrian Hunter

[-- Attachment #1: Type: text/plain, Size: 2185 bytes --]

On Tue, 4 Feb 2025, Greg Kroah-Hartman wrote:

> On Tue, Feb 04, 2025 at 09:49:55AM -0800, Ian Rogers wrote:
> > On Tue, Feb 4, 2025 at 9:12 AM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Tue, Feb 04, 2025 at 11:41:38AM -0500, Vince Weaver wrote:
> > > > On Tue, 4 Feb 2025, Greg Kroah-Hartman wrote:
> > > >
> > > > >
> > > > > It seems that the majority of the perf code IS looking in the correct
> > > > > place, just mem-events.c seemed wrong.
> > > >
> > > > I hate to tell you, but other places in userspace are depending on the
> > > > current setup.  libpfm4, used by PAPI, is looking directly in /sys/devices
> > > > for pmus and will break with the changes you are planning.
> > >
> > > Then that too needs to be fixed, sorry.  Again, devices can, and will,
> > > move around in /sys/devices/ you can never hard-code any paths there.
> > > Any userspace code must ALWAYS be able to handle that, that's a sysfs
> > > requirement.
> > >
> > > And do you have a link to the source for that code?  Good news is that
> > > if the code is fixed in userspace, it will work for any kernel (old or
> > > new).
> > 
> > +Stephane Eranian
> > 
> > I see use of /sys/devices in at least:
> > https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/lib/pfmlib_perf_event_pmu.c
> > I imagine it isn't a big job to clean it up.
> 
> Nope, all that has to happen is this line:
> 	snprintf(buf, PATH_MAX, "/sys/devices/%s/events/%s", e->pmu ? e->pmu : "cpu", e->name);
> be changed to:
> 	snprintf(buf, PATH_MAX, "/sys/bus/event_source/devices/%s/events/%s", e->pmu ? e->pmu : "cpu", e->name);

it doesn't matter if it's a one line change, your proposed change breaks 
userspace.

I thought the rule was Linux didn't break userspace.
Has that changed recently?

libpfm4/PAPI are widely used and deployed and in some cases possibly 
statically linked into other projects.  It will take years for the change 
to filter out to all users.

Is there a technical reason for this proposed change of yours, or are you 
just arbitrarily breaking userspace programs because you think /sys looks 
cluttered to your personal taste?

Vince Weaver
vincent.weaver@maine.edu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 18:17     ` Ian Rogers
@ 2025-02-05  5:41       ` Greg Kroah-Hartman
  2025-02-05 16:48         ` Ian Rogers
  0 siblings, 1 reply; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-05  5:41 UTC (permalink / raw)
  To: Ian Rogers
  Cc: linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan

On Tue, Feb 04, 2025 at 10:17:17AM -0800, Ian Rogers wrote:
> On Mon, Feb 3, 2025 at 11:05 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Mon, Feb 03, 2025 at 11:44:13AM -0800, Ian Rogers wrote:
> > > On Mon, Feb 3, 2025 at 11:25 AM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> [snip]
> > > > Note, if you all don't like "pmu_bus" for the name, that's fine, please
> > > > let me know and I can rename it to something else, but it should be
> > > > something to get these objects out of the root sysfs directory.
> > >
> > > Excuse my ignorance, does the event_source bus already provide a
> > > similar solution?
> > > ```
> > > $ ls /sys/bus/event_source/devices
> > > breakpoint   i915       msr         uncore_arb     uncore_cbox_3
> > > uncore_cbox_7              uprobe
> > > cpu          intel_bts  power       uncore_cbox_0  uncore_cbox_4  uncore_clock
> > > cstate_core  intel_pt   software    uncore_cbox_1  uncore_cbox_5
> > > uncore_imc_free_running_0
> > > cstate_pkg   kprobe     tracepoint  uncore_cbox_2  uncore_cbox_6
> > > uncore_imc_free_running_1
> > > $ ls /sys/devices
> > > breakpoint   intel_pt     platform    uncore_arb     uncore_cbox_5
> > >          uprobe
> > > cpu          isa          pnp0        uncore_cbox_0  uncore_cbox_6
> > >          virtual
> > > cstate_core  kprobe       power       uncore_cbox_1  uncore_cbox_7
> > > cstate_pkg   LNXSYSTM:00  software    uncore_cbox_2  uncore_clock
> > > i915         msr          system      uncore_cbox_3  uncore_imc_free_running_0
> > > intel_bts    pci0000:00   tracepoint  uncore_cbox_4  uncore_imc_free_running_1
> > > ```
> >
> > Those are the exact same device structures :)
> >
> > Look at /sys/bus/event_source/devices/ those are symlinks back to
> > /sys/devices.  In other words, that's the exact same structures, and
> > the mess in /sys/devices/ with these event devices is what I am trying
> > to clean up.
> 
> Thanks! Again my ignorance, why do we need pmu_bus?

You need some sort of "bus" to group all of your devices on.  It could
also be a class, your call.  This is how the driver model in the kernel
works.

> Could the symlinks in event_source be the actual devices?

That's not how the driver model works.

> Perhaps for temporary compatibility some symlinks could be placed in
> /sys/devices?

No, /sys/devices/ is never guaranteed to have stable placement.

And define "temporary" :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-05  1:21                 ` Vince Weaver
@ 2025-02-05  5:45                   ` Greg Kroah-Hartman
  2025-02-05 15:06                     ` Vince Weaver
  0 siblings, 1 reply; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-05  5:45 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Ian Rogers, Stephane Eranian, Liang, Kan, Alexander Shishkin,
	linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Jiri Olsa,
	Adrian Hunter

On Tue, Feb 04, 2025 at 08:21:17PM -0500, Vince Weaver wrote:
> On Tue, 4 Feb 2025, Greg Kroah-Hartman wrote:
> 
> > On Tue, Feb 04, 2025 at 09:49:55AM -0800, Ian Rogers wrote:
> > > On Tue, Feb 4, 2025 at 9:12 AM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Tue, Feb 04, 2025 at 11:41:38AM -0500, Vince Weaver wrote:
> > > > > On Tue, 4 Feb 2025, Greg Kroah-Hartman wrote:
> > > > >
> > > > > >
> > > > > > It seems that the majority of the perf code IS looking in the correct
> > > > > > place, just mem-events.c seemed wrong.
> > > > >
> > > > > I hate to tell you, but other places in userspace are depending on the
> > > > > current setup.  libpfm4, used by PAPI, is looking directly in /sys/devices
> > > > > for pmus and will break with the changes you are planning.
> > > >
> > > > Then that too needs to be fixed, sorry.  Again, devices can, and will,
> > > > move around in /sys/devices/ you can never hard-code any paths there.
> > > > Any userspace code must ALWAYS be able to handle that, that's a sysfs
> > > > requirement.
> > > >
> > > > And do you have a link to the source for that code?  Good news is that
> > > > if the code is fixed in userspace, it will work for any kernel (old or
> > > > new).
> > > 
> > > +Stephane Eranian
> > > 
> > > I see use of /sys/devices in at least:
> > > https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/lib/pfmlib_perf_event_pmu.c
> > > I imagine it isn't a big job to clean it up.
> > 
> > Nope, all that has to happen is this line:
> > 	snprintf(buf, PATH_MAX, "/sys/devices/%s/events/%s", e->pmu ? e->pmu : "cpu", e->name);
> > be changed to:
> > 	snprintf(buf, PATH_MAX, "/sys/bus/event_source/devices/%s/events/%s", e->pmu ? e->pmu : "cpu", e->name);
> 
> it doesn't matter if it's a one line change, your proposed change breaks 
> userspace.

It breaks broken userspace :)

> I thought the rule was Linux didn't break userspace.
> Has that changed recently?

No, which is why I am working to fix these tools so that it will not be
broken.

> libpfm4/PAPI are widely used and deployed and in some cases possibly 
> statically linked into other projects.  It will take years for the change 
> to filter out to all users.

Great, as it is easy to get this type of fix into the stable kernels,
and then fix these tools, by the time those old systems eventually
update to a newer kernel version, it should "just work".

> Is there a technical reason for this proposed change of yours, or are you 
> just arbitrarily breaking userspace programs because you think /sys looks 
> cluttered to your personal taste?

I'm fixing up the bug where all of these devices accidentally got dumped
into the root of the device tree in the system.

Again, /sys/devices/ is NEVER guaranteed to have specific placement,
that's the whole point of sysfs in the first place.  We can, and do,
move devices around in there all the time (even every other boot), it's
just that some tools accidentally didn't realize this and now need to be
fixed.

This "grey area" between the kernel and userspace of sysfs (and other
virtual filesystems), always gets tweaked and changed over time.  sysfs
makes it much simpler to make those changes in relation to other
filesystems like /proc/ which is why it was designed this way in the
first place.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-05  5:45                   ` Greg Kroah-Hartman
@ 2025-02-05 15:06                     ` Vince Weaver
  2025-02-05 15:36                       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 22+ messages in thread
From: Vince Weaver @ 2025-02-05 15:06 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Vince Weaver, Ian Rogers, Stephane Eranian, Liang, Kan,
	Alexander Shishkin, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Adrian Hunter

On Wed, 5 Feb 2025, Greg Kroah-Hartman wrote:

> On Tue, Feb 04, 2025 at 08:21:17PM -0500, Vince Weaver wrote:
> > 
> > it doesn't matter if it's a one line change, your proposed change breaks 
> > userspace.
> 
> It breaks broken userspace :)

no, it breaks working code that's been working for 15 years, apparently on 
some whim by you.

I thought we didn't break userspace.

> > I thought the rule was Linux didn't break userspace.
> > Has that changed recently?
> 
> No, which is why I am working to fix these tools so that it will not be
> broken.

that... is some impressive doublethink.

> > libpfm4/PAPI are widely used and deployed and in some cases possibly 
> > statically linked into other projects.  It will take years for the change 
> > to filter out to all users.
> 
> Great, as it is easy to get this type of fix into the stable kernels,
> and then fix these tools, by the time those old systems eventually
> update to a newer kernel version, it should "just work".

no, that is the worst case scenario.  PAPI is installed on older systems, 
often custom non-vendor packages.  So if you get this into stable then 
these older RHEL systems the kernel will be updated suddenly breaking the 
users with no warning.

Even if we manage to get a change into libpfm4 and PAPI it will probably 
be months before the next stable release, and again people tend not to 
upgrade PAPI releases that often.


> I'm fixing up the bug where all of these devices accidentally got dumped
> into the root of the device tree in the system.

is this a "bug" or just you shuffling around files for no reason?

> Again, /sys/devices/ is NEVER guaranteed to have specific placement,
> that's the whole point of sysfs in the first place.  We can, and do,
> move devices around in there all the time (even every other boot), it's
> just that some tools accidentally didn't realize this and now need to be
> fixed.

great.  Slap a note about this happening in the appropriate "deprecated" 
file with a timeline of 5 years out or whatever, and then when that comes 
around then send your patch.

Vince Weaver
vincent.weaver@maine.edu

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-05 15:06                     ` Vince Weaver
@ 2025-02-05 15:36                       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-05 15:36 UTC (permalink / raw)
  To: Vince Weaver
  Cc: Ian Rogers, Stephane Eranian, Liang, Kan, Alexander Shishkin,
	linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland, Jiri Olsa,
	Adrian Hunter

On Wed, Feb 05, 2025 at 10:06:55AM -0500, Vince Weaver wrote:
> On Wed, 5 Feb 2025, Greg Kroah-Hartman wrote:
> 
> > On Tue, Feb 04, 2025 at 08:21:17PM -0500, Vince Weaver wrote:
> > > 
> > > it doesn't matter if it's a one line change, your proposed change breaks 
> > > userspace.
> > 
> > It breaks broken userspace :)
> 
> no, it breaks working code that's been working for 15 years, apparently on 
> some whim by you.

These devices have been here for 15 years?  I know I've been ignoring
them for a while, but I don't think that long.  Let me check, ok at
least 15, but there was not many event sources back then, and we didn't
really notice it.

Then in 2023, there was an attempt to fix up parent events in commit
143f83e2003a ("perf: Allow a PMU to have a parent"), which does properly
put pmu devices within the device tree.

So if for no other reason, the userspace tools need to be fixed up to
find those devices, they are broken as of 2023 as they can't find them
in the tree unless they follow the in-kernel documentation that says to
look in /sys/bus/event_source/*/devices (as documented in
Documentation/admin-guide/perf/* and many many Documentation/ABI/
entries.

Also, modern systems have way more "perf objects" than before, so
dumping them all at the root of sysfs is not ok, it might not have been
noticed in the past, but on modern systems today, it is.

> I thought we didn't break userspace.

We try not to, yes.  Again, this is an area where we do change all the
time, and in fact, sysfs was designed to allow this.  That's the rule of
sysfs, never treat the /sys/devices/ tree as stable.  Seems that some
tools didn't get that memo, so I'll go fix them up, not a big deal.

> > > libpfm4/PAPI are widely used and deployed and in some cases possibly 
> > > statically linked into other projects.  It will take years for the change 
> > > to filter out to all users.
> > 
> > Great, as it is easy to get this type of fix into the stable kernels,
> > and then fix these tools, by the time those old systems eventually
> > update to a newer kernel version, it should "just work".
> 
> no, that is the worst case scenario.  PAPI is installed on older systems, 
> often custom non-vendor packages.  So if you get this into stable then 
> these older RHEL systems the kernel will be updated suddenly breaking the 
> users with no warning.

So you really update RHEL systems with a custom kernel?  That kind of
defeats the purpose of RHEL, and wastes your support money you are
paying for it :)

RHEL can handle this just fine, they will update their packages
properly over time, I can wait.

> Even if we manage to get a change into libpfm4 and PAPI it will probably 
> be months before the next stable release, and again people tend not to 
> upgrade PAPI releases that often.

Fair enough, what is the release cycle of those tools?

> > I'm fixing up the bug where all of these devices accidentally got dumped
> > into the root of the device tree in the system.
> 
> is this a "bug" or just you shuffling around files for no reason?

It is a "bug" in that the devices should never have been there in the
first place, AND that userspace should never assume that /sys/devices is
stable.

It's also a "bug" in that these tools might end up trying to touch
devices they really aren't supposed to be touching as assuming anything
with an "events" subdir in the root of sysfs is going to break anyway on
their own.

Heck, I'm tempted to create a "/sys/devices/reset_me/events/" object now
just to see what kind of tools trip over that :)

> > Again, /sys/devices/ is NEVER guaranteed to have specific placement,
> > that's the whole point of sysfs in the first place.  We can, and do,
> > move devices around in there all the time (even every other boot), it's
> > just that some tools accidentally didn't realize this and now need to be
> > fixed.
> 
> great.  Slap a note about this happening in the appropriate "deprecated" 
> file with a timeline of 5 years out or whatever, and then when that comes 
> around then send your patch.

I'll send an updated patch soon and put it behind a config option for
those systems that want to do the right thing, update userspace, and
then remove the config option in a few years.  We've done it in the
past, not a big deal.

thanks for the review.

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-04 18:23         ` Liang, Kan
@ 2025-02-05 16:00           ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-05 16:00 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Alexander Shishkin, linux-perf-users, linux-kernel,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Mark Rutland, Jiri Olsa, Ian Rogers, Adrian Hunter

On Tue, Feb 04, 2025 at 01:23:51PM -0500, Liang, Kan wrote:
> 
> 
> On 2025-02-04 11:28 a.m., Greg Kroah-Hartman wrote:
> > On Tue, Feb 04, 2025 at 09:06:18AM -0500, Liang, Kan wrote:
> >>
> >>
> >> On 2025-02-04 5:16 a.m., Greg Kroah-Hartman wrote:
> >>> On Tue, Feb 04, 2025 at 09:41:03AM +0200, Alexander Shishkin wrote:
> >>>> Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
> >>>>
> >>>>> In sysfs, for some reason, all pmu devices seem to show up in the "root"
> >>>>> of /sys/devices/ making for a confusing mess as these devices are not
> >>>>> really at the root of the system at all.
> >>>>>
> >>>>> Create a fake root devices, "pmu_bus" and place them all under there if
> >>>>> they do not already have a parent device set, cleaning up sysfs to look
> >>>>> more sane.
> >>>>
> >>>> Yeah, so what happens to the userspace that uses them via /sys/devices/*
> >>>> directly? Even I have scripts that do that.
> >>>
> >>> You should never be doing that, as you have no idea what type of devices
> >>> are in that location in the tree.  You should be doing what the
> >>> documentation says to do, and look in /sys/bus/event_source/devices/
> >>> instead.  That didn't change here.
> >>>
> >>
> >> Not just the script, the /sys/devices/ is also used in the current perf
> >> tool. For example,
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/mem-events.c#n192
> >>
> >> And the comments and document in the perf tool.
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/pmu.c#n39
> > 
> > Note, this file looks to work properly, it's just the comment that is
> > incorrect.
> > 
> > Any hints on what perf command or test I should run to verify I did get
> > this all right?
> 
> For the perf tool, "perf test" should be run at minimum.

Cool, I never noticed that, I'll use that to start with.

> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/perf-list.txt#n191
> > 
> > I'll fix that.
> > 
> >> I think it should bring big impact for the end user, especially when
> >> they still use an older perf tool and script.
> > 
> > It seems that the majority of the perf code IS looking in the correct
> > place, just mem-events.c seemed wrong.
> > 
> 
> There should be two more.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/builtin-stat.c#n100

But that's a cpu device, not a event_bus device, right?

Oh, no, wow, yeah, that is an event_bus device.  Now that's confusing, I
imagine the sysfs cpu authors get annoyed at that one...

> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/x86/util/iostat.c#n35

Ah, missed that one too, thanks!

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-05  5:41       ` Greg Kroah-Hartman
@ 2025-02-05 16:48         ` Ian Rogers
  2025-02-05 18:45           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 22+ messages in thread
From: Ian Rogers @ 2025-02-05 16:48 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan

On Tue, Feb 4, 2025 at 9:41 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Tue, Feb 04, 2025 at 10:17:17AM -0800, Ian Rogers wrote:
> > On Mon, Feb 3, 2025 at 11:05 PM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Mon, Feb 03, 2025 at 11:44:13AM -0800, Ian Rogers wrote:
> > > > On Mon, Feb 3, 2025 at 11:25 AM Greg Kroah-Hartman
> > > > <gregkh@linuxfoundation.org> wrote:
> > [snip]
> > > > > Note, if you all don't like "pmu_bus" for the name, that's fine, please
> > > > > let me know and I can rename it to something else, but it should be
> > > > > something to get these objects out of the root sysfs directory.
> > > >
> > > > Excuse my ignorance, does the event_source bus already provide a
> > > > similar solution?
> > > > ```
> > > > $ ls /sys/bus/event_source/devices
> > > > breakpoint   i915       msr         uncore_arb     uncore_cbox_3
> > > > uncore_cbox_7              uprobe
> > > > cpu          intel_bts  power       uncore_cbox_0  uncore_cbox_4  uncore_clock
> > > > cstate_core  intel_pt   software    uncore_cbox_1  uncore_cbox_5
> > > > uncore_imc_free_running_0
> > > > cstate_pkg   kprobe     tracepoint  uncore_cbox_2  uncore_cbox_6
> > > > uncore_imc_free_running_1
> > > > $ ls /sys/devices
> > > > breakpoint   intel_pt     platform    uncore_arb     uncore_cbox_5
> > > >          uprobe
> > > > cpu          isa          pnp0        uncore_cbox_0  uncore_cbox_6
> > > >          virtual
> > > > cstate_core  kprobe       power       uncore_cbox_1  uncore_cbox_7
> > > > cstate_pkg   LNXSYSTM:00  software    uncore_cbox_2  uncore_clock
> > > > i915         msr          system      uncore_cbox_3  uncore_imc_free_running_0
> > > > intel_bts    pci0000:00   tracepoint  uncore_cbox_4  uncore_imc_free_running_1
> > > > ```
> > >
> > > Those are the exact same device structures :)
> > >
> > > Look at /sys/bus/event_source/devices/ those are symlinks back to
> > > /sys/devices.  In other words, that's the exact same structures, and
> > > the mess in /sys/devices/ with these event devices is what I am trying
> > > to clean up.
> >
> > Thanks! Again my ignorance, why do we need pmu_bus?
>
> You need some sort of "bus" to group all of your devices on.  It could
> also be a class, your call.  This is how the driver model in the kernel
> works.

Sure, I wasn't clear how this mapped out in sysfs.

> > Could the symlinks in event_source be the actual devices?
>
> That's not how the driver model works.

Ok, so I guess that's what's confusing me. Where will the new devices
be in sysfs? If it is /sys/bus/pmu_bus then why is
/sys/bus/event_source a different case? I guess I should read docs and
the source :-)

> > Perhaps for temporary compatibility some symlinks could be placed in
> > /sys/devices?
>
> No, /sys/devices/ is never guaranteed to have stable placement.
>
> And define "temporary" :)

I need to finish my time machine to resolve issues like this :-) An
unfortunate thing with the PMU acronym is that it is overloaded and
can mean Power Management Unit. Events and perf are other names used
in the code and they don't seem overloaded, but it isn't clear to me
we care about the name being overloaded. I'm afflicted by thinking of
Digital Rights Management every time I hear DRM.

Thanks,
Ian

> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] perf/core: move all of the pmu devices into their own location
  2025-02-05 16:48         ` Ian Rogers
@ 2025-02-05 18:45           ` Greg Kroah-Hartman
  0 siblings, 0 replies; 22+ messages in thread
From: Greg Kroah-Hartman @ 2025-02-05 18:45 UTC (permalink / raw)
  To: Ian Rogers
  Cc: linux-perf-users, linux-kernel, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Liang, Kan

On Wed, Feb 05, 2025 at 08:48:21AM -0800, Ian Rogers wrote:
> On Tue, Feb 4, 2025 at 9:41 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, Feb 04, 2025 at 10:17:17AM -0800, Ian Rogers wrote:
> > > On Mon, Feb 3, 2025 at 11:05 PM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Mon, Feb 03, 2025 at 11:44:13AM -0800, Ian Rogers wrote:
> > > > > On Mon, Feb 3, 2025 at 11:25 AM Greg Kroah-Hartman
> > > > > <gregkh@linuxfoundation.org> wrote:
> > > [snip]
> > > > > > Note, if you all don't like "pmu_bus" for the name, that's fine, please
> > > > > > let me know and I can rename it to something else, but it should be
> > > > > > something to get these objects out of the root sysfs directory.
> > > > >
> > > > > Excuse my ignorance, does the event_source bus already provide a
> > > > > similar solution?
> > > > > ```
> > > > > $ ls /sys/bus/event_source/devices
> > > > > breakpoint   i915       msr         uncore_arb     uncore_cbox_3
> > > > > uncore_cbox_7              uprobe
> > > > > cpu          intel_bts  power       uncore_cbox_0  uncore_cbox_4  uncore_clock
> > > > > cstate_core  intel_pt   software    uncore_cbox_1  uncore_cbox_5
> > > > > uncore_imc_free_running_0
> > > > > cstate_pkg   kprobe     tracepoint  uncore_cbox_2  uncore_cbox_6
> > > > > uncore_imc_free_running_1
> > > > > $ ls /sys/devices
> > > > > breakpoint   intel_pt     platform    uncore_arb     uncore_cbox_5
> > > > >          uprobe
> > > > > cpu          isa          pnp0        uncore_cbox_0  uncore_cbox_6
> > > > >          virtual
> > > > > cstate_core  kprobe       power       uncore_cbox_1  uncore_cbox_7
> > > > > cstate_pkg   LNXSYSTM:00  software    uncore_cbox_2  uncore_clock
> > > > > i915         msr          system      uncore_cbox_3  uncore_imc_free_running_0
> > > > > intel_bts    pci0000:00   tracepoint  uncore_cbox_4  uncore_imc_free_running_1
> > > > > ```
> > > >
> > > > Those are the exact same device structures :)
> > > >
> > > > Look at /sys/bus/event_source/devices/ those are symlinks back to
> > > > /sys/devices.  In other words, that's the exact same structures, and
> > > > the mess in /sys/devices/ with these event devices is what I am trying
> > > > to clean up.
> > >
> > > Thanks! Again my ignorance, why do we need pmu_bus?
> >
> > You need some sort of "bus" to group all of your devices on.  It could
> > also be a class, your call.  This is how the driver model in the kernel
> > works.
> 
> Sure, I wasn't clear how this mapped out in sysfs.
> 
> > > Could the symlinks in event_source be the actual devices?
> >
> > That's not how the driver model works.
> 
> Ok, so I guess that's what's confusing me. Where will the new devices
> be in sysfs? If it is /sys/bus/pmu_bus then why is
> /sys/bus/event_source a different case? I guess I should read docs and
> the source :-)

Look at what is in /sys/bus/event_source/devices/ today.  That's a bunch
of symlinks to where in /sys/devices/ the devices are located.

I just moved all of the pmu bus devices under /sys/devices/pmu_bus/
instead of them showing up in the root at /sys/devices/ as that's not
where a bus should be placing individual devices like this.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2025-02-06 14:27 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-03 19:25 [PATCH] perf/core: move all of the pmu devices into their own location Greg Kroah-Hartman
2025-02-03 19:44 ` Ian Rogers
2025-02-04  7:05   ` Greg Kroah-Hartman
2025-02-04 18:17     ` Ian Rogers
2025-02-05  5:41       ` Greg Kroah-Hartman
2025-02-05 16:48         ` Ian Rogers
2025-02-05 18:45           ` Greg Kroah-Hartman
2025-02-04  7:41 ` Alexander Shishkin
2025-02-04 10:16   ` Greg Kroah-Hartman
2025-02-04 14:06     ` Liang, Kan
2025-02-04 15:21       ` Greg Kroah-Hartman
2025-02-04 16:28       ` Greg Kroah-Hartman
2025-02-04 16:41         ` Vince Weaver
2025-02-04 17:12           ` Greg Kroah-Hartman
2025-02-04 17:49             ` Ian Rogers
2025-02-04 18:03               ` Greg Kroah-Hartman
2025-02-05  1:21                 ` Vince Weaver
2025-02-05  5:45                   ` Greg Kroah-Hartman
2025-02-05 15:06                     ` Vince Weaver
2025-02-05 15:36                       ` Greg Kroah-Hartman
2025-02-04 18:23         ` Liang, Kan
2025-02-05 16:00           ` Greg Kroah-Hartman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).