* [RFC v3 0/2] CPU-Idle latency selftest framework
@ 2021-04-04 8:33 Pratik Rajesh Sampat
2021-04-09 5:23 ` Doug Smythies
0 siblings, 1 reply; 12+ messages in thread
From: Pratik Rajesh Sampat @ 2021-04-04 8:33 UTC (permalink / raw)
To: rjw, daniel.lezcano, shuah, dsmythies, ego, svaidy, linux-pm,
linux-kernel, linux-kselftest, pratik.r.sampat, psampat
Changelog
RFC v2-->v3
Based on comments by Doug Smythies,
1. Changed commit log to reflect the test must be run as super user.
2. Added a comment specifying a method to run the test bash script
without recompiling.
3. Enable all the idle states after the experiments are completed so
that the system is in a coherent state after the tests have run.
4. Correct the return status of a CPU that cannot be off-lined.
RFC v2: https://lkml.org/lkml/2021/4/1/615
---
A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations from the advertised latency and residency
values.
The patchset measures latencies for two kinds of events: IPIs and timers.
As this is a software-only mechanism, there will be additional latencies
from the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU,
and the measured latencies must be viewed relative to that baseline.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/latency_test/:
IPI test:
ipi_cpu_dest = Destination CPU for the IPI
ipi_cpu_src = Origin of the IPI
ipi_latency_ns = Measured latency time in ns
Timeout test:
timeout_cpu_src = CPU on which the timer is to be queued
timeout_expected_ns = Timer duration
timeout_diff_ns = Difference between actual and expected timer duration
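For reference, the knobs above can be driven by hand from a root shell. A
minimal sketch follows; the directory name matches the cover letter, and
per the module's design a write to ipi_cpu_dest (or timeout_expected_ns)
triggers the measurement, after which the result file can be read:

```shell
# Hypothetical manual walk-through of the debugfs knobs listed above.
# Requires root and the test module loaded; otherwise falls through.
LAT_DIR=${LAT_DIR:-/sys/kernel/debug/latency_test}

if [ -d "$LAT_DIR" ]; then
    # Fire an IPI at CPU 1 (the write triggers the measurement) ...
    echo 1 > "$LAT_DIR/ipi_cpu_dest"
    # ... then read back the measured wakeup latency in ns
    cat "$LAT_DIR/ipi_latency_ns"

    # Queue a 10us timer on the current CPU and read the wakeup error
    echo 10000 > "$LAT_DIR/timeout_expected_ns"
    cat "$LAT_DIR/timeout_diff_ns"
else
    echo "latency_test debugfs dir not present; insmod the module first"
fi
```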
Sample output on a POWER9 system is as follows:
# --IPI Latency Test---
# Baseline Average IPI latency(ns): 3114
# Observed Average IPI latency(ns) - State0: 3265
# Observed Average IPI latency(ns) - State1: 3507
# Observed Average IPI latency(ns) - State2: 3739
# Observed Average IPI latency(ns) - State3: 3807
# Observed Average IPI latency(ns) - State4: 17070
# Observed Average IPI latency(ns) - State5: 1038174
# Observed Average IPI latency(ns) - State6: 1068784
#
# --Timeout Latency Test--
# Baseline Average timeout diff(ns): 1420
# Observed Average timeout diff(ns) - State0: 1640
# Observed Average timeout diff(ns) - State1: 1764
# Observed Average timeout diff(ns) - State2: 1715
# Observed Average timeout diff(ns) - State3: 1845
# Observed Average timeout diff(ns) - State4: 16581
# Observed Average timeout diff(ns) - State5: 939977
# Observed Average timeout diff(ns) - State6: 1073024
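As a quick sanity check on numbers like these, the per-state wakeup
overhead can be estimated by subtracting the baseline from each observed
average. A small sketch using the IPI figures from the sample run above:

```shell
# Subtract the 100%-loaded baseline from each observed per-state average
# (numbers taken from the sample POWER9 IPI run above).
baseline=3114
for pair in 0:3265 1:3507 2:3739 3:3807 4:17070 5:1038174 6:1068784; do
    state=${pair%%:*}
    observed=${pair##*:}
    overhead=$((observed - baseline))
    echo "State$state wakeup overhead: $overhead ns"
done
```

The jump between State3 and State4, and again at State5, is exactly the
kind of deviation this test is meant to surface.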
Things to keep in mind:
1. This kernel module + bash driver does not guarantee idleness on a
core when the IPI and the timer are armed. It only invokes sleep and
hopes that the core is idle once the IPI/timer is delivered to it.
Hence this program must be run on a completely idle system for best
results.
2. Even on a completely idle system, there may be book-keeping or
jitter tasks that can run on the core we want idle. This can create
outliers in the latency measurement. Thankfully, these outliers
should be large enough to be weeded out easily.
3. A userspace-only selftest variant was also sent out as an RFC, based
on suggestions on the previous patchset, to simplify the kernel
complexity. However, the userspace-only approach had more noise in
the latency measurement due to userspace-kernel interactions,
which led to run-to-run variance and a less accurate test.
Another downside of a userspace program is that it takes orders of
magnitude longer to complete a full-system test compared to the
kernel framework.
RFC patch: https://lkml.org/lkml/2020/9/2/356
4. For Intel systems, the timer-based latencies don't exactly give out
the measure of idle latencies. This is because of a hardware
optimization mechanism that pre-arms a CPU when a timer is set to
wake it up. That doesn't make this metric useless for Intel systems;
it just means that it measures IPI/timer response latency rather
than idle wakeup latency.
(Source: https://lkml.org/lkml/2020/9/2/610)
As a solution to this problem, a hardware-based latency analyzer has
been devised by Artem Bityutskiy from Intel:
https://youtu.be/Opk92aQyvt0?t=8266
https://intel.github.io/wult/
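To illustrate point 2 above, outlier samples can be weeded out before
averaging. The 5x-median threshold and the sample values below are
made-up placeholders for illustration, not part of the patchset:

```shell
# Drop samples exceeding 5x the median (threshold is an assumption),
# then average the survivors; sample latencies are illustrative only.
samples="3265 3107 3198 48211 3240"
result=$(echo "$samples" | tr ' ' '\n' | sort -n | awk '
    { v[NR] = $0 }
    END {
        med = v[int((NR + 1) / 2)]
        n = 0; sum = 0
        for (i = 1; i <= NR; i++)
            if (v[i] <= 5 * med) { sum += v[i]; n++ }
        printf "kept %d/%d samples, avg %d ns", n, NR, sum / n
    }')
echo "$result"
```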
Pratik Rajesh Sampat (2):
cpuidle: Extract IPI based and timer based wakeup latency from idle
states
selftest/cpuidle: Add support for cpuidle latency measurement
drivers/cpuidle/Makefile | 1 +
drivers/cpuidle/test-cpuidle_latency.c | 157 ++++++++++
lib/Kconfig.debug | 10 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/cpuidle/Makefile | 6 +
tools/testing/selftests/cpuidle/cpuidle.sh | 326 +++++++++++++++++++++
tools/testing/selftests/cpuidle/settings | 2 +
7 files changed, 503 insertions(+)
create mode 100644 drivers/cpuidle/test-cpuidle_latency.c
create mode 100644 tools/testing/selftests/cpuidle/Makefile
create mode 100755 tools/testing/selftests/cpuidle/cpuidle.sh
create mode 100644 tools/testing/selftests/cpuidle/settings
--
2.17.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC v3 0/2] CPU-Idle latency selftest framework
2021-04-04 8:33 Pratik Rajesh Sampat
@ 2021-04-09 5:23 ` Doug Smythies
2021-04-09 7:43 ` Pratik Sampat
0 siblings, 1 reply; 12+ messages in thread
From: Doug Smythies @ 2021-04-09 5:23 UTC (permalink / raw)
To: Pratik Rajesh Sampat
Cc: rjw, Daniel Lezcano, shuah, ego, svaidy, Linux PM list,
Linux Kernel Mailing List, linux-kselftest, pratik.r.sampat
Hi Pratik,
I tried V3 on an Intel i5-10600K processor with 6 cores and 12 CPUs.
The core to cpu mappings are:
core 0 has cpus 0 and 6
core 1 has cpus 1 and 7
core 2 has cpus 2 and 8
core 3 has cpus 3 and 9
core 4 has cpus 4 and 10
core 5 has cpus 5 and 11
By default, it will test CPUs 0,2,4,6,8,10 on cores 0,2,4,0,2,4.
Wouldn't it make more sense to test each core once?
With the source CPU always 0, I think the results from the destination
CPUs 0 and 6, on core 0, bias the results, at least in the deeper idle
states. They don't make much difference in the shallow states. Myself,
I wouldn't include them in the results.
Example, where I used the -v option for all CPUs:
--IPI Latency Test---
--Baseline IPI Latency measurement: CPU Busy--
SRC_CPU DEST_CPU IPI_Latency(ns)
0 0 101
0 1 790
0 2 609
0 3 595
0 4 737
0 5 759
0 6 780
0 7 741
0 8 574
0 9 681
0 10 527
0 11 552
Baseline Avg IPI latency(ns): 620 <<<< suggest 656 here
---Enabling state: 0---
SRC_CPU DEST_CPU IPI_Latency(ns)
0 0 76
0 1 471
0 2 420
0 3 462
0 4 454
0 5 468
0 6 453
0 7 473
0 8 380
0 9 483
0 10 492
0 11 454
Expected IPI latency(ns): 0
Observed Avg IPI latency(ns) - State 0: 423 <<<<< suggest 456 here
---Enabling state: 1---
SRC_CPU DEST_CPU IPI_Latency(ns)
0 0 112
0 1 866
0 2 663
0 3 851
0 4 1090
0 5 1314
0 6 1941
0 7 1458
0 8 687
0 9 802
0 10 1041
0 11 1284
Expected IPI latency(ns): 1000
Observed Avg IPI latency(ns) - State 1: 1009 <<<< suggest 1006 here
---Enabling state: 2---
SRC_CPU DEST_CPU IPI_Latency(ns)
0 0 75
0 1 16362
0 2 16785
0 3 19650
0 4 17356
0 5 17606
0 6 2217
0 7 17958
0 8 17332
0 9 16615
0 10 17382
0 11 17423
Expected IPI latency(ns): 120000
Observed Avg IPI latency(ns) - State 2: 14730 <<<< suggest 17447 here
---Enabling state: 3---
SRC_CPU DEST_CPU IPI_Latency(ns)
0 0 103
0 1 17416
0 2 17961
0 3 16651
0 4 17867
0 5 17726
0 6 2178
0 7 16620
0 8 20951
0 9 16567
0 10 17131
0 11 17563
Expected IPI latency(ns): 1034000
Observed Avg IPI latency(ns) - State 3: 14894 <<<< suggest 17645 here
Hope this helps.
... Doug
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC v3 0/2] CPU-Idle latency selftest framework
2021-04-09 5:23 ` Doug Smythies
@ 2021-04-09 7:43 ` Pratik Sampat
2021-04-09 14:26 ` Doug Smythies
0 siblings, 1 reply; 12+ messages in thread
From: Pratik Sampat @ 2021-04-09 7:43 UTC (permalink / raw)
To: Doug Smythies
Cc: rjw, Daniel Lezcano, shuah, ego, svaidy, Linux PM list,
Linux Kernel Mailing List, linux-kselftest, pratik.r.sampat
Hello Doug,
On 09/04/21 10:53 am, Doug Smythies wrote:
> Hi Pratik,
>
> I tried V3 on an Intel i5-10600K processor with 6 cores and 12 CPUs.
> The core to cpu mappings are:
> core 0 has cpus 0 and 6
> core 1 has cpus 1 and 7
> core 2 has cpus 2 and 8
> core 3 has cpus 3 and 9
> core 4 has cpus 4 and 10
> core 5 has cpus 5 and 11
>
> By default, it will test CPUs 0,2,4,6,8,10 on cores 0,2,4,0,2,4.
> Wouldn't it make more sense to test each core once?
Ideally it would be better to run on all the CPUs; however, on larger systems
that I'm testing on, with hundreds of cores and a high thread count, the
execution time increases while not particularly bringing any additional
information to the table.
That is why it made sense to run on only one of the threads of each core, to
make the experiment faster while preserving accuracy.
To handle various thread topologies it may be worthwhile to parse
/sys/devices/system/cpu/cpuX/topology/thread_siblings_list for each core and
use this information to run only once per physical core, rather than
assuming the topology.
What are your thoughts on a mechanism like this?
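A sketch of that per-core selection idea, with the sysfs reads stubbed out
by sample sibling strings (the real script would cat each CPU's
thread_siblings_list); the parsing mirrors the "," and "-" cpuset syntax:

```shell
# Pick one CPU per physical core from thread-sibling lists. The sibling
# strings below are sample stand-ins for
# /sys/devices/system/cpu/cpuX/topology/thread_siblings_list.
expand_cpuset() {
    # Expand "," (individual) and "-" (range) cpuset notation
    echo "$1" | awk '/-/{for (i = $1; i <= $2; i++) print i; next} {print}' RS=, FS=-
}

seen=" "
picked=""
for siblings in "0,6" "1,7" "2,8" "0,6" "3-4"; do
    first=$(expand_cpuset "$siblings" | head -n 1)
    # Skip threads whose core has already contributed a CPU
    case "$seen" in *" $first "*) continue ;; esac
    for c in $(expand_cpuset "$siblings"); do
        seen="$seen$c "
    done
    picked="$picked$first "
done
picked="${picked% }"
echo "CPUs selected, one per core: $picked"
```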
> With the source CPU always 0, I think the results from the results
> from the destination CPUs 0 and 6, on core 0 bias the results, at
> least in the deeper idle states. They don't make much difference in
> the shallow states. Myself, I wouldn't include them in the results.
I agree, the CPU0->CPU0 same-core interaction is causing a bias. I could omit
that observation while computing the average.
In verbose mode I'll omit all the sibling threads of CPU0, and in the default
(quick) mode just CPU0's latency can be omitted while computing the average.
Thank you,
Pratik
> Example, where I used the -v option for all CPUs:
>
> --IPI Latency Test---
> --Baseline IPI Latency measurement: CPU Busy--
> SRC_CPU DEST_CPU IPI_Latency(ns)
> 0 0 101
> 0 1 790
> 0 2 609
> 0 3 595
> 0 4 737
> 0 5 759
> 0 6 780
> 0 7 741
> 0 8 574
> 0 9 681
> 0 10 527
> 0 11 552
> Baseline Avg IPI latency(ns): 620 <<<< suggest 656 here
> ---Enabling state: 0---
> SRC_CPU DEST_CPU IPI_Latency(ns)
> 0 0 76
> 0 1 471
> 0 2 420
> 0 3 462
> 0 4 454
> 0 5 468
> 0 6 453
> 0 7 473
> 0 8 380
> 0 9 483
> 0 10 492
> 0 11 454
> Expected IPI latency(ns): 0
> Observed Avg IPI latency(ns) - State 0: 423 <<<<< suggest 456 here
> ---Enabling state: 1---
> SRC_CPU DEST_CPU IPI_Latency(ns)
> 0 0 112
> 0 1 866
> 0 2 663
> 0 3 851
> 0 4 1090
> 0 5 1314
> 0 6 1941
> 0 7 1458
> 0 8 687
> 0 9 802
> 0 10 1041
> 0 11 1284
> Expected IPI latency(ns): 1000
> Observed Avg IPI latency(ns) - State 1: 1009 <<<< suggest 1006 here
> ---Enabling state: 2---
> SRC_CPU DEST_CPU IPI_Latency(ns)
> 0 0 75
> 0 1 16362
> 0 2 16785
> 0 3 19650
> 0 4 17356
> 0 5 17606
> 0 6 2217
> 0 7 17958
> 0 8 17332
> 0 9 16615
> 0 10 17382
> 0 11 17423
> Expected IPI latency(ns): 120000
> Observed Avg IPI latency(ns) - State 2: 14730 <<<< suggest 17447 here
> ---Enabling state: 3---
> SRC_CPU DEST_CPU IPI_Latency(ns)
> 0 0 103
> 0 1 17416
> 0 2 17961
> 0 3 16651
> 0 4 17867
> 0 5 17726
> 0 6 2178
> 0 7 16620
> 0 8 20951
> 0 9 16567
> 0 10 17131
> 0 11 17563
> Expected IPI latency(ns): 1034000
> Observed Avg IPI latency(ns) - State 3: 14894 <<<< suggest 17645 here
>
> Hope this helps.
>
> ... Doug
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC v3 0/2] CPU-Idle latency selftest framework
2021-04-09 7:43 ` Pratik Sampat
@ 2021-04-09 14:26 ` Doug Smythies
0 siblings, 0 replies; 12+ messages in thread
From: Doug Smythies @ 2021-04-09 14:26 UTC (permalink / raw)
To: Pratik Sampat
Cc: rjw, Daniel Lezcano, shuah, ego, svaidy, Linux PM list,
Linux Kernel Mailing List, linux-kselftest, pratik.r.sampat,
dsmythies
On Fri, Apr 9, 2021 at 12:43 AM Pratik Sampat <psampat@linux.ibm.com> wrote:
> On 09/04/21 10:53 am, Doug Smythies wrote:
> > I tried V3 on an Intel i5-10600K processor with 6 cores and 12 CPUs.
> > The core to cpu mappings are:
> > core 0 has cpus 0 and 6
> > core 1 has cpus 1 and 7
> > core 2 has cpus 2 and 8
> > core 3 has cpus 3 and 9
> > core 4 has cpus 4 and 10
> > core 5 has cpus 5 and 11
> >
> > By default, it will test CPUs 0,2,4,6,8,10 on cores 0,2,4,0,2,4.
> > Wouldn't it make more sense to test each core once?
>
> Ideally it would be better to run on all the CPUs; however, on larger systems
> that I'm testing on, with hundreds of cores and a high thread count, the
> execution time increases while not particularly bringing any additional
> information to the table.
>
> That is why it made sense to run on only one of the threads of each core, to
> make the experiment faster while preserving accuracy.
>
> To handle various thread topologies it may be worthwhile to parse
> /sys/devices/system/cpu/cpuX/topology/thread_siblings_list for each core and
> use this information to run only once per physical core, rather than
> assuming the topology.
>
> What are your thoughts on a mechanism like this?
Yes, seems like a good solution.
... Doug
^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC v3 0/2] CPU-Idle latency selftest framework
@ 2023-09-11 5:36 Aboorva Devarajan
2023-09-11 5:36 ` [RFC v3 1/2] powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events Aboorva Devarajan
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Aboorva Devarajan @ 2023-09-11 5:36 UTC (permalink / raw)
To: aboorvad, mpe, npiggin, rmclure, arnd, joel, shuah,
linux-kselftest, linuxppc-dev, linux-kernel, pratik.r.sampat
Cc: sshegde, srikar
Changelog: v2 -> v3
* Minimal code refactoring
* Rebased on v6.6-rc1
RFC v1:
https://lore.kernel.org/all/20210611124154.56427-1-psampat@linux.ibm.com/
RFC v2:
https://lore.kernel.org/all/20230828061530.126588-2-aboorvad@linux.vnet.ibm.com/
Other related RFC:
https://lore.kernel.org/all/20210430082804.38018-1-psampat@linux.ibm.com/
Userspace selftest:
https://lkml.org/lkml/2020/9/2/356
----
A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations from the advertised latency and residency
values.
The patchset measures latencies for two kinds of events: IPIs and timers.
As this is a software-only mechanism, there will be additional latencies
from the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU,
and the measured latencies must be viewed relative to that baseline.
To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.
The kernel module provides the following interfaces within
/sys/kernel/debug/powerpc/latency_test/:
IPI test:
ipi_cpu_dest = Destination CPU for the IPI
ipi_cpu_src = Origin of the IPI
ipi_latency_ns = Measured latency time in ns
Timeout test:
timeout_cpu_src = CPU on which the timer is to be queued
timeout_expected_ns = Timer duration
timeout_diff_ns = Difference between actual and expected timer duration
Sample output is as follows:
# --IPI Latency Test---
# Baseline Avg IPI latency(ns): 2720
# Observed Avg IPI latency(ns) - State snooze: 2565
# Observed Avg IPI latency(ns) - State stop0_lite: 3856
# Observed Avg IPI latency(ns) - State stop0: 3670
# Observed Avg IPI latency(ns) - State stop1: 3872
# Observed Avg IPI latency(ns) - State stop2: 17421
# Observed Avg IPI latency(ns) - State stop4: 1003922
# Observed Avg IPI latency(ns) - State stop5: 1058870
#
# --Timeout Latency Test--
# Baseline Avg timeout diff(ns): 1435
# Observed Avg timeout diff(ns) - State snooze: 1709
# Observed Avg timeout diff(ns) - State stop0_lite: 2028
# Observed Avg timeout diff(ns) - State stop0: 1954
# Observed Avg timeout diff(ns) - State stop1: 1895
# Observed Avg timeout diff(ns) - State stop2: 14556
# Observed Avg timeout diff(ns) - State stop4: 873988
# Observed Avg timeout diff(ns) - State stop5: 959137
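Given output like this, flagging the "significant deviations" the cover
letter aims for can be sketched as a simple comparison against the
advertised exit latency. The 10% slack and the advertised numbers below
are made-up placeholders, not values from the run above:

```shell
# Flag states whose observed wakeup cost exceeds the advertised exit
# latency by more than 10%; thresholds and advertised values here are
# illustrative assumptions only.
check_deviation() {
    state=$1 advertised=$2 observed=$3
    limit=$((advertised + advertised / 10))
    if [ "$observed" -gt "$limit" ]; then
        echo "$state: observed ${observed}ns exceeds advertised ${advertised}ns (+10%)"
    fi
}
check_deviation stop2 10000 17421
check_deviation stop0 4000 3670
```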
Aboorva Devarajan (2):
powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events
powerpc/selftest: Add support for cpuidle latency measurement
arch/powerpc/Kconfig.debug | 10 +
arch/powerpc/kernel/Makefile | 1 +
arch/powerpc/kernel/test_cpuidle_latency.c | 154 ++++++
tools/testing/selftests/powerpc/Makefile | 1 +
.../powerpc/cpuidle_latency/.gitignore | 2 +
.../powerpc/cpuidle_latency/Makefile | 6 +
.../cpuidle_latency/cpuidle_latency.sh | 443 ++++++++++++++++++
.../powerpc/cpuidle_latency/settings | 1 +
8 files changed, 618 insertions(+)
create mode 100644 arch/powerpc/kernel/test_cpuidle_latency.c
create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/Makefile
create mode 100755 tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/settings
--
2.25.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [RFC v3 1/2] powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events
2023-09-11 5:36 [RFC v3 0/2] CPU-Idle latency selftest framework Aboorva Devarajan
@ 2023-09-11 5:36 ` Aboorva Devarajan
2023-09-12 22:54 ` Michael Ellerman
2023-09-11 5:36 ` [RFC v3 2/2] powerpc/selftest: Add support for cpuidle latency measurement Aboorva Devarajan
2023-09-25 5:06 ` [RFC v3 0/2] CPU-Idle latency selftest framework Aboorva Devarajan
2 siblings, 1 reply; 12+ messages in thread
From: Aboorva Devarajan @ 2023-09-11 5:36 UTC (permalink / raw)
To: aboorvad, mpe, npiggin, rmclure, arnd, joel, shuah,
linux-kselftest, linuxppc-dev, linux-kernel, pratik.r.sampat
Cc: sshegde, srikar
From: Pratik R. Sampat <psampat@linux.ibm.com>
Introduce a mechanism to fire directed IPIs from a source CPU to a
specified target CPU and measure the time incurred on waking up the
target CPU in response.
Also, introduce a mechanism to queue a hrtimer on a specified CPU and
subsequently measure the time taken to wakeup the CPU.
Define a simple debugfs interface that allows for adjusting the
settings to trigger IPI and timer events on a designated CPU, and to
observe the resulting cpuidle wakeup latencies.
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
---
arch/powerpc/Kconfig.debug | 10 ++
arch/powerpc/kernel/Makefile | 1 +
arch/powerpc/kernel/test_cpuidle_latency.c | 154 +++++++++++++++++++++
3 files changed, 165 insertions(+)
create mode 100644 arch/powerpc/kernel/test_cpuidle_latency.c
diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 2a54fadbeaf5..e175fc3028ac 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -391,3 +391,13 @@ config KASAN_SHADOW_OFFSET
default 0xe0000000 if PPC32
default 0xa80e000000000000 if PPC_BOOK3S_64
default 0xa8001c0000000000 if PPC_BOOK3E_64
+
+config CPUIDLE_LATENCY_SELFTEST
+ tristate "Cpuidle latency selftests"
+ depends on CPU_IDLE
+ help
+ Provides a kernel module that runs tests using IPIs and
+ timers to measure cpuidle latency.
+
+ Say M if you want these self tests to build as a module.
+ Say N if you are unsure.
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 2919433be355..3c5a576bbcf2 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -94,6 +94,7 @@ obj-$(CONFIG_PPC_BARRIER_NOSPEC) += security.o
obj-$(CONFIG_PPC64) += vdso64_wrapper.o
obj-$(CONFIG_ALTIVEC) += vecemu.o
obj-$(CONFIG_PPC_BOOK3S_IDLE) += idle_book3s.o
+obj-$(CONFIG_CPUIDLE_LATENCY_SELFTEST) += test_cpuidle_latency.o
procfs-y := proc_powerpc.o
obj-$(CONFIG_PROC_FS) += $(procfs-y)
rtaspci-$(CONFIG_PPC64)-$(CONFIG_PCI) := rtas_pci.o
diff --git a/arch/powerpc/kernel/test_cpuidle_latency.c b/arch/powerpc/kernel/test_cpuidle_latency.c
new file mode 100644
index 000000000000..c932222a8f76
--- /dev/null
+++ b/arch/powerpc/kernel/test_cpuidle_latency.c
@@ -0,0 +1,154 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Module-based API test facility for cpuidle latency using IPIs and timers
+ */
+
+#include <linux/debugfs.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+/*
+ * IPI based wakeup latencies
+ * Measure time taken for a CPU to wake up on an IPI sent from another CPU
+ * The latency measured also includes the latency of sending the IPI
+ */
+struct latency {
+ unsigned int src_cpu;
+ unsigned int dest_cpu;
+ ktime_t time_start;
+ ktime_t time_end;
+ u64 latency_ns;
+} ipi_wakeup;
+
+static void measure_latency(void *info)
+{
+ struct latency *v = (struct latency *)info;
+ ktime_t time_diff;
+
+ v->time_end = ktime_get();
+ time_diff = ktime_sub(v->time_end, v->time_start);
+ v->latency_ns = ktime_to_ns(time_diff);
+}
+
+void run_smp_call_function_test(unsigned int cpu)
+{
+ ipi_wakeup.src_cpu = smp_processor_id();
+ ipi_wakeup.dest_cpu = cpu;
+ ipi_wakeup.time_start = ktime_get();
+ smp_call_function_single(cpu, measure_latency, &ipi_wakeup, 1);
+}
+
+/*
+ * Timer based wakeup latencies
+ * Measure time taken for a CPU to wake up on a timer being armed and fired
+ */
+struct timer_data {
+ unsigned int src_cpu;
+ u64 timeout;
+ ktime_t time_start;
+ ktime_t time_end;
+ struct hrtimer timer;
+ u64 timeout_diff_ns;
+} timer_wakeup;
+
+static enum hrtimer_restart hrtimer_callback(struct hrtimer *hrtimer)
+{
+ struct timer_data *w;
+ ktime_t time_diff;
+
+ w = container_of(hrtimer, struct timer_data, timer);
+ w->time_end = ktime_get();
+
+ time_diff = ktime_sub(w->time_end, w->time_start);
+ time_diff = ktime_sub(time_diff, ns_to_ktime(w->timeout));
+ w->timeout_diff_ns = ktime_to_ns(time_diff);
+ return HRTIMER_NORESTART;
+}
+
+static void run_timer_test(unsigned int ns)
+{
+ hrtimer_init(&timer_wakeup.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+ timer_wakeup.timer.function = hrtimer_callback;
+ timer_wakeup.src_cpu = smp_processor_id();
+ timer_wakeup.timeout = ns;
+ timer_wakeup.time_start = ktime_get();
+
+ hrtimer_start(&timer_wakeup.timer, ns_to_ktime(ns),
+ HRTIMER_MODE_REL_PINNED);
+}
+
+static struct dentry *dir;
+
+static int cpu_read_op(void *data, u64 *dest_cpu)
+{
+ *dest_cpu = ipi_wakeup.dest_cpu;
+ return 0;
+}
+
+/*
+ * Send a directed IPI from the current CPU (source) to the destination CPU and
+ * measure the latency on wakeup.
+ */
+static int cpu_write_op(void *data, u64 value)
+{
+ run_smp_call_function_test(value);
+ return 0;
+}
+DEFINE_SIMPLE_ATTRIBUTE(ipi_ops, cpu_read_op, cpu_write_op, "%llu\n");
+
+static int timeout_read_op(void *data, u64 *timeout)
+{
+ *timeout = timer_wakeup.timeout;
+ return 0;
+}
+
+/* Queue a hrtimer on a specified destination CPU and measure the time taken to
+ * wakeup the CPU.
+ */
+static int timeout_write_op(void *data, u64 value)
+{
+ run_timer_test(value);
+ return 0;
+}
+DEFINE_SIMPLE_ATTRIBUTE(timeout_ops, timeout_read_op, timeout_write_op, "%llu\n");
+
+static int __init latency_init(void)
+{
+ struct dentry *temp;
+
+ dir = debugfs_create_dir("latency_test", arch_debugfs_dir);
+ if (!dir) {
+ pr_alert("latency_test: failed to create /sys/kernel/debug/powerpc/latency_test\n");
+ return -1;
+ }
+ temp = debugfs_create_file("ipi_cpu_dest", 0644, dir, NULL, &ipi_ops);
+ if (!temp) {
+ pr_alert("latency_test: failed to create /sys/kernel/debug/powerpc/ipi_cpu_dest\n");
+ return -1;
+ }
+ debugfs_create_u64("ipi_latency_ns", 0444, dir, &ipi_wakeup.latency_ns);
+ debugfs_create_u32("ipi_cpu_src", 0444, dir, &ipi_wakeup.src_cpu);
+
+ temp = debugfs_create_file("timeout_expected_ns", 0644, dir, NULL, &timeout_ops);
+ if (!temp) {
+ pr_alert("latency_test: failed to create /sys/kernel/debug/powerpc/timeout_expected_ns\n");
+ return -1;
+ }
+ debugfs_create_u64("timeout_diff_ns", 0444, dir, &timer_wakeup.timeout_diff_ns);
+ debugfs_create_u32("timeout_cpu_src", 0444, dir, &timer_wakeup.src_cpu);
+ pr_info("Latency Test module loaded\n");
+ return 0;
+}
+
+static void __exit latency_cleanup(void)
+{
+ pr_info("Cleaning up Latency Test module.\n");
+ debugfs_remove_recursive(dir);
+}
+
+module_init(latency_init);
+module_exit(latency_cleanup);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("IBM Corporation");
+MODULE_DESCRIPTION("Measuring idle latency for IPIs and Timers");
--
2.25.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [RFC v3 2/2] powerpc/selftest: Add support for cpuidle latency measurement
2023-09-11 5:36 [RFC v3 0/2] CPU-Idle latency selftest framework Aboorva Devarajan
2023-09-11 5:36 ` [RFC v3 1/2] powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events Aboorva Devarajan
@ 2023-09-11 5:36 ` Aboorva Devarajan
2023-09-25 5:06 ` [RFC v3 0/2] CPU-Idle latency selftest framework Aboorva Devarajan
2 siblings, 0 replies; 12+ messages in thread
From: Aboorva Devarajan @ 2023-09-11 5:36 UTC (permalink / raw)
To: aboorvad, mpe, npiggin, rmclure, arnd, joel, shuah,
linux-kselftest, linuxppc-dev, linux-kernel, pratik.r.sampat
Cc: sshegde, srikar
From: Pratik R. Sampat <psampat@linux.ibm.com>
The cpuidle latency selftest provides support to systematically extract,
analyse and present IPI and timer based wakeup latencies for each CPU
and each idle state available on the system.
The selftest leverages test_cpuidle_latency module's debugfs interface
to interact and extract latency information from the kernel.
The selftest inserts the module if it is not already inserted, disables
all the idle states, and enables them one by one, testing the following:
1. Keeping the source CPU constant, iterate through all the cores and
pick a single CPU for each core, measuring IPI latency for a baseline
(the CPU is kept busy with a cat /dev/random > /dev/null workload) and
then when the CPU is idle.
2. Iterating through all the CPU cores and selecting one CPU for each
core, a timer with an expected duration equivalent to the residency of
the deepest enabled idle state is queued on the selected target CPU;
the difference between the expected timer duration and the actual time
of wakeup is then determined.
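Step 1's busy baseline can be pictured as pinning a load onto the target
CPU, triggering the IPI knob, and then removing the load. A minimal
sketch; the path follows this patch's debugfs layout, the CPU number is
illustrative, and it requires root with the module loaded:

```shell
# Sketch of the busy-baseline IPI measurement from step 1. Falls through
# with a message when the debugfs knobs are unavailable.
knob_dir=/sys/kernel/debug/powerpc/latency_test
target=1

if [ -w "$knob_dir/ipi_cpu_dest" ]; then
    # Pin a 100% load onto the target CPU in the background
    taskset -c "$target" sh -c 'cat /dev/random > /dev/null' &
    load_pid=$!
    # The write fires the IPI and records the latency
    echo "$target" > "$knob_dir/ipi_cpu_dest"
    echo "busy-baseline IPI latency: $(cat "$knob_dir/ipi_latency_ns") ns"
    kill "$load_pid"
else
    echo "latency_test knobs unavailable; load the module as root first"
fi
```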
To run this test specifically:
$ sudo make -C tools/testing/selftests \
TARGETS="powerpc/cpuidle_latency" run_tests
There are a few optional arguments too that the script can take
[-h <help>]
[-i <run timer tests>]
[-m <location of the module>]
[-s <source cpu for ipi test>]
[-o <location of the output>]
[-v <verbose> (run on all cpus)]
Default Output location in:
tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.log
To run the test without re-compiling:
$ cd tools/testing/selftests/powerpc/cpuidle_latency/
$ sudo ./cpuidle_latency.sh
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
---
tools/testing/selftests/powerpc/Makefile | 1 +
.../powerpc/cpuidle_latency/.gitignore | 2 +
.../powerpc/cpuidle_latency/Makefile | 6 +
.../cpuidle_latency/cpuidle_latency.sh | 443 ++++++++++++++++++
.../powerpc/cpuidle_latency/settings | 1 +
5 files changed, 453 insertions(+)
create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/Makefile
create mode 100755 tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/settings
diff --git a/tools/testing/selftests/powerpc/Makefile b/tools/testing/selftests/powerpc/Makefile
index 49f2ad1793fd..efac7270ce1f 100644
--- a/tools/testing/selftests/powerpc/Makefile
+++ b/tools/testing/selftests/powerpc/Makefile
@@ -17,6 +17,7 @@ SUB_DIRS = alignment \
benchmarks \
cache_shape \
copyloops \
+ cpuidle_latency \
dexcr \
dscr \
mm \
diff --git a/tools/testing/selftests/powerpc/cpuidle_latency/.gitignore b/tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
new file mode 100644
index 000000000000..987f8852dc59
--- /dev/null
+++ b/tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+cpuidle_latency.log
diff --git a/tools/testing/selftests/powerpc/cpuidle_latency/Makefile b/tools/testing/selftests/powerpc/cpuidle_latency/Makefile
new file mode 100644
index 000000000000..04492b6d2582
--- /dev/null
+++ b/tools/testing/selftests/powerpc/cpuidle_latency/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+all:
+
+TEST_PROGS := cpuidle_latency.sh
+
+include ../../lib.mk
diff --git a/tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh b/tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
new file mode 100755
index 000000000000..c6b1beffa85f
--- /dev/null
+++ b/tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
@@ -0,0 +1,443 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# CPU-Idle latency selftest enables systematic retrieval and presentation
+# of IPI and timer-triggered wake-up latencies for every CPU and available
+# system idle state by leveraging the test_cpuidle_latency module.
+#
+# Author: Pratik R. Sampat <psampat at linux.ibm.com>
+# Author: Aboorva Devarajan <aboorvad at linux.ibm.com>
+
+DISABLE=1
+ENABLE=0
+
+LOG=cpuidle_latency.log
+MODULE=/lib/modules/$(uname -r)/kernel/arch/powerpc/kernel/test_cpuidle_latency.ko
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+exit_status=0
+
+RUN_TIMER_TEST=1
+TIMEOUT=1000000
+VERBOSE=0
+
+IPI_SRC_CPU=0
+
+helpme() {
+ printf "Usage: %s [-h] [-todg args]
+ [-h <help>]
+ [-s <source cpu for ipi test> (default: 0)]
+ [-m <location of the module>]
+ [-o <location of the output>]
+ [-v <verbose> (execute test across all CPU threads)]
+ [-i <run timer tests>]
+ \n" "$0"
+ exit 2
+}
+
+cpu_is_online() {
+ local cpu=$1
+ if [ ! -f "/sys/devices/system/cpu/cpu$cpu/online" ]; then
+ printf "CPU %s: file not found: /sys/devices/system/cpu/cpu%s/online" "$cpu" "$cpu"
+ return 0
+ fi
+ status=$(cat /sys/devices/system/cpu/cpu"$cpu"/online)
+ return "$status"
+}
+
+check_valid_cpu() {
+ local cpu="$1"
+ local cpu_count
+
+ cpu_count="$(nproc)" # Get the number of CPUs on the system
+
+ if [[ "$cpu" =~ ^[0-9]+$ ]]; then
+ if ((cpu >= 0 && cpu < cpu_count)); then
+ cpu_is_online "$cpu"
+ online_status=$?
+ if [ "$online_status" -eq "1" ]; then
+ return 1
+ else
+ printf "CPU %s is offline." "$cpu"
+ return 0
+ fi
+ fi
+ fi
+ return 0
+}
+
+parse_arguments() {
+ while getopts ht:m:s:o:vt:it: arg; do
+ case $arg in
+ h) # --help
+ helpme
+ ;;
+ m) # --mod-file
+ MODULE=$OPTARG
+ ;;
+ s) #
+ IPI_SRC_CPU=$OPTARG
+ check_valid_cpu "$IPI_SRC_CPU"
+ cpu_status=$?
+ if [ "$cpu_status" == "0" ]; then
+ printf "%s is an invalid CPU. Exiting.." "$IPI_SRC_CPU"
+ exit
+ fi
+ ;;
+ o) # output log files
+ LOG=$OPTARG
+ ;;
+ v) # verbose mode - execute tests across all CPU threads
+ VERBOSE=1
+ ;;
+ i) # run timer tests
+ RUN_TIMER_TEST=1
+ ;;
+ \?)
+ helpme
+ ;;
+ esac
+ done
+}
+
+ins_mod() {
+ debugfs_file=/sys/kernel/debug/powerpc/latency_test/ipi_latency_ns
+ # Check if the module is already loaded
+ if [ -f "$debugfs_file" ]; then
+ printf "Module %s already loaded\n\n" "$MODULE"
+ return 0
+ fi
+ # Try to load the module
+ if [ ! -f "$MODULE" ]; then
+ printf "%s module does not exist. Exiting\n" "$MODULE"
+ exit $ksft_skip
+ fi
+ printf "Inserting %s module\n\n" "$MODULE"
+ insmod "$MODULE"
+ if [ $? != 0 ]; then
+ printf "Insmod %s failed\n" "$MODULE"
+ exit $ksft_skip
+ fi
+}
+
+compute_average() {
+ arr=("$@")
+ sum=0
+ size=${#arr[@]}
+ if [ "$size" == 0 ]; then
+ avg=0
+ return 1
+ fi
+ for i in "${arr[@]}"; do
+ sum=$((sum + i))
+ done
+ avg=$((sum / size))
+}
+
+# Perform operation on each CPU for the given state
+# $1 - Operation: enable (0) / disable (1)
+# $2 - State to enable
+op_state() {
+ for ((cpu = 0; cpu < NUM_CPUS; cpu++)); do
+ cpu_is_online "$cpu"
+ local cpu_status=$?
+ if [ "$cpu_status" == 0 ]; then
+ continue
+ fi
+ echo "$1" >/sys/devices/system/cpu/cpu"$cpu"/cpuidle/state"$2"/disable
+ done
+}
+
+cpuidle_enable_state() {
+ state=$1
+ op_state "$ENABLE" "$state"
+}
+
+cpuidle_disable_state() {
+ state=$1
+ op_state "$DISABLE" "$state"
+}
+
+# Enable/Disable all stop states for all CPUs
+# $1 - Operation: enable (0) / disable (1)
+op_cpuidle() {
+ for ((state = 0; state < NUM_STATES; state++)); do
+ op_state "$1" "$state"
+ done
+}
+
+extract_state_information() {
+ for ((state = 0; state < NUM_STATES; state++)); do
+ state_name=$(cat /sys/devices/system/cpu/cpu"$IPI_SRC_CPU"/cpuidle/state"$state"/name)
+ state_name_arr+=("$state_name")
+ done
+}
+
+# Extract latency in microseconds and convert to nanoseconds
+extract_latency() {
+ for ((state = 0; state < NUM_STATES; state++)); do
+ latency=$(($(cat /sys/devices/system/cpu/cpu"$IPI_SRC_CPU"/cpuidle/state"$state"/latency) * 1000))
+ latency_arr+=("$latency")
+ done
+}
+
+# Simple linear search in an array
+# $1 - Element to search for
+# $2 - Array
+element_in() {
+ local item="$1"
+ shift
+ for element in "$@"; do
+ if [ "$element" == "$item" ]; then
+ return 0
+ fi
+ done
+ return 1
+}
+
+# Parse a cpuset string with "," (individual) and "-" (range) entries,
+# printing one CPU id per line, e.g. "0,4-6" -> 0 4 5 6
+# $1 - cpuset string
+parse_cpuset() {
+ echo "$1" | awk '/-/{for (i=$1; i<=$2; i++)printf "%s%s",i,ORS;next} {print}' RS=, FS=-
+}
+
+extract_core_information() {
+ declare -a thread_arr
+ for ((cpu = 0; cpu < NUM_CPUS; cpu++)); do
+ cpu_is_online "$cpu"
+ local cpu_status=$?
+ if [ "$cpu_status" == 0 ]; then
+ continue
+ fi
+
+ siblings=$(cat /sys/devices/system/cpu/cpu"$cpu"/topology/thread_siblings_list)
+ sib_arr=()
+
+ for c in $(parse_cpuset "$siblings"); do
+ sib_arr+=("$c")
+ done
+
+ if [ "$VERBOSE" == 1 ]; then
+ core_arr+=("$cpu")
+ continue
+ fi
+ if element_in "${sib_arr[0]}" "${thread_arr[@]}"; then
+ continue
+ fi
+ core_arr+=("${sib_arr[0]}")
+
+ for thread in "${sib_arr[@]}"; do
+ thread_arr+=("$thread")
+ done
+ done
+
+ src_siblings=$(cat /sys/devices/system/cpu/cpu"$IPI_SRC_CPU"/topology/thread_siblings_list)
+ for c in $(parse_cpuset "$src_siblings"); do
+ first_core_arr+=("$c")
+ done
+}
+
+# Run the IPI test
+# $1 - "baseline" to measure on a 100% busy CPU, anything else for a regular run
+# $2 destination cpu
+ipi_test_once() {
+ dest_cpu=$2
+ if [ "$1" = "baseline" ]; then
+ # Keep the CPU busy (/dev/urandom, unlike /dev/random, never blocks)
+ taskset -c "$dest_cpu" cat /dev/urandom >/dev/null &
+ task_pid=$!
+ # Wait for the workload to achieve 100% CPU usage
+ sleep 1
+ fi
+ taskset -c "$IPI_SRC_CPU" echo "$dest_cpu" >/sys/kernel/debug/powerpc/latency_test/ipi_cpu_dest
+ ipi_latency=$(cat /sys/kernel/debug/powerpc/latency_test/ipi_latency_ns)
+ src_cpu=$(cat /sys/kernel/debug/powerpc/latency_test/ipi_cpu_src)
+ if [ "$1" = "baseline" ]; then
+ kill "$task_pid"
+ wait "$task_pid" 2>/dev/null
+ fi
+}
+
+# Incrementally enable idle states one by one and compute the latency
+run_ipi_tests() {
+ extract_latency
+ # Disable idle states for CPUs
+ op_cpuidle "$DISABLE"
+
+ declare -a avg_arr
+ printf "...IPI Latency Test...\n" | tee -a "$LOG"
+
+ printf "...Baseline IPI Latency measurement: CPU Busy...\n" >>"$LOG"
+ printf "%s %10s %12s\n" "SRC_CPU" "DEST_CPU" "IPI_Latency(ns)" >>"$LOG"
+ for cpu in "${core_arr[@]}"; do
+ cpu_is_online "$cpu"
+ local cpu_status=$?
+ if [ "$cpu_status" == 0 ]; then
+ continue
+ fi
+ ipi_test_once "baseline" "$cpu"
+ printf "%-3s %10s %12s\n" "$src_cpu" "$cpu" "$ipi_latency" >>"$LOG"
+ # Skip computing latency average from the source CPU to avoid bias
+ if element_in "$cpu" "${first_core_arr[@]}"; then
+ continue
+ fi
+ avg_arr+=("$ipi_latency")
+ done
+ compute_average "${avg_arr[@]}"
+ printf "Baseline Avg IPI latency(ns): %s\n" "$avg" | tee -a "$LOG"
+
+ for ((state = 0; state < NUM_STATES; state++)); do
+ unset avg_arr
+ printf "...Enabling state: %s...\n" "${state_name_arr[$state]}" >>"$LOG"
+ cpuidle_enable_state "$state"
+ printf "%s %10s %12s\n" "SRC_CPU" "DEST_CPU" "IPI_Latency(ns)" >>"$LOG"
+ for cpu in "${core_arr[@]}"; do
+ cpu_is_online "$cpu"
+ local cpu_status=$?
+ if [ "$cpu_status" == 0 ]; then
+ continue
+ fi
+ # Let the destination CPU settle into idle, then run the IPI test
+ sleep 1
+ ipi_test_once "test" "$cpu"
+ printf "%-3s %10s %12s\n" "$src_cpu" "$cpu" "$ipi_latency" >>"$LOG"
+ # Skip computing latency average from the source CPU to avoid bias
+ if element_in "$cpu" "${first_core_arr[@]}"; then
+ continue
+ fi
+ avg_arr+=("$ipi_latency")
+ done
+
+ compute_average "${avg_arr[@]}"
+ printf "Expected IPI latency(ns): %s\n" "${latency_arr[$state]}" >>"$LOG"
+ printf "Observed Avg IPI latency(ns) - State %s: %s\n" "${state_name_arr[$state]}" "$avg" | tee -a "$LOG"
+ cpuidle_disable_state "$state"
+ done
+}
+
+# Extract the residency in microseconds and convert to nanoseconds.
+# Add 200 ns so that the timer stays for a little longer than the residency
+extract_residency() {
+ for ((state = 0; state < NUM_STATES; state++)); do
+ residency=$(($(cat /sys/devices/system/cpu/cpu"$IPI_SRC_CPU"/cpuidle/state"$state"/residency) * 1000 + 200))
+ residency_arr+=("$residency")
+ done
+}
+
+# Run the Timeout test
+# $1 - "baseline" to measure on a 100% busy CPU, anything else for a regular run
+# $2 destination cpu
+# $3 timeout
+timeout_test_once() {
+ dest_cpu=$2
+ if [ "$1" = "baseline" ]; then
+ # Keep the CPU busy (/dev/urandom, unlike /dev/random, never blocks)
+ taskset -c "$dest_cpu" cat /dev/urandom >/dev/null &
+ task_pid=$!
+ # Wait for the workload to achieve 100% CPU usage
+ sleep 1
+ fi
+ taskset -c "$dest_cpu" sleep 1
+ taskset -c "$dest_cpu" echo "$3" >/sys/kernel/debug/powerpc/latency_test/timeout_expected_ns
+ # Wait for the result to populate
+ sleep 0.1
+ timeout_diff=$(cat /sys/kernel/debug/powerpc/latency_test/timeout_diff_ns)
+ src_cpu=$(cat /sys/kernel/debug/powerpc/latency_test/timeout_cpu_src)
+ if [ "$1" = "baseline" ]; then
+ kill "$task_pid"
+ wait "$task_pid" 2>/dev/null
+ fi
+}
+
+run_timeout_tests() {
+ extract_residency
+ # Disable idle states for all CPUs
+ op_cpuidle "$DISABLE"
+
+ declare -a avg_arr
+ printf "\n...Timeout Latency Test...\n" | tee -a "$LOG"
+
+ printf "...Baseline Timeout Latency measurement: CPU Busy...\n" >>"$LOG"
+ printf "%s %10s\n" "Wakeup_src" "Baseline_delay(ns)" >>"$LOG"
+ for cpu in "${core_arr[@]}"; do
+ cpu_is_online "$cpu"
+ local cpu_status=$?
+ if [ "$cpu_status" == 0 ]; then
+ continue
+ fi
+ timeout_test_once "baseline" "$cpu" "$TIMEOUT"
+ printf "%-3s %13s\n" "$src_cpu" "$timeout_diff" >>"$LOG"
+ avg_arr+=("$timeout_diff")
+ done
+ compute_average "${avg_arr[@]}"
+ printf "Baseline Avg timeout diff(ns): %s\n" "$avg" | tee -a "$LOG"
+
+ for ((state = 0; state < NUM_STATES; state++)); do
+ unset avg_arr
+ printf "...Enabling state: %s...\n" "${state_name_arr["$state"]}" >>"$LOG"
+ cpuidle_enable_state "$state"
+ printf "%s %10s\n" "Wakeup_src" "Delay(ns)" >>"$LOG"
+ for cpu in "${core_arr[@]}"; do
+ cpu_is_online "$cpu"
+ local cpu_status=$?
+ if [ "$cpu_status" == 0 ]; then
+ continue
+ fi
+ timeout_test_once "test" "$cpu" "$TIMEOUT"
+ printf "%-3s %13s\n" "$src_cpu" "$timeout_diff" >>"$LOG"
+ avg_arr+=("$timeout_diff")
+ done
+ compute_average "${avg_arr[@]}"
+ printf "Expected timeout(ns): %s\n" "${residency_arr["$state"]}" >>"$LOG"
+ printf "Observed Avg timeout diff(ns) - State %s: %s\n" "${state_name_arr["$state"]}" "$avg" | tee -a "$LOG"
+ cpuidle_disable_state "$state"
+ done
+}
+
+# Exit early when the test is not intended to run
+exit_test() {
+ printf "Exiting: test not intended to run.\n"
+ exit "$ksft_skip"
+}
+
+printf "By the time it concludes, this test will have enabled all CPU idle states.\n"
+printf "Note: it does not restore the original idle-state configuration.\n"
+
+declare -a residency_arr
+declare -a latency_arr
+declare -a core_arr
+declare -a first_core_arr
+declare -a state_name_arr
+
+parse_arguments "$@"
+
+rm -f "$LOG"
+touch "$LOG"
+
+NUM_CPUS=$(nproc --all)
+NUM_STATES=$(ls -d /sys/devices/system/cpu/cpu"$IPI_SRC_CPU"/cpuidle/state* | wc -l)
+
+extract_core_information
+extract_state_information
+
+ins_mod "$MODULE"
+
+run_ipi_tests
+if [ "$RUN_TIMER_TEST" == "1" ]; then
+ run_timeout_tests
+fi
+
+# Enable all idle states for all CPUs
+op_cpuidle "$ENABLE"
+printf "Full Output logged at: %s\n" "$LOG"
+
+if [ -f "$MODULE" ]; then
+ printf "Removing %s module\n" "$MODULE"
+ rmmod "$MODULE"
+fi
+
+exit "$exit_status"
diff --git a/tools/testing/selftests/powerpc/cpuidle_latency/settings b/tools/testing/selftests/powerpc/cpuidle_latency/settings
new file mode 100644
index 000000000000..e7b9417537fb
--- /dev/null
+++ b/tools/testing/selftests/powerpc/cpuidle_latency/settings
@@ -0,0 +1 @@
+timeout=0
--
2.25.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RFC v3 1/2] powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events
2023-09-11 5:36 ` [RFC v3 1/2] powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events Aboorva Devarajan
@ 2023-09-12 22:54 ` Michael Ellerman
2023-09-21 11:00 ` Aboorva Devarajan
0 siblings, 1 reply; 12+ messages in thread
From: Michael Ellerman @ 2023-09-12 22:54 UTC (permalink / raw)
To: Aboorva Devarajan, aboorvad, npiggin, rmclure, arnd, joel, shuah,
linux-kselftest, linuxppc-dev, linux-kernel, pratik.r.sampat
Cc: sshegde, srikar
Aboorva Devarajan <aboorvad@linux.vnet.ibm.com> writes:
> From: Pratik R. Sampat <psampat@linux.ibm.com>
>
> Introduce a mechanism to fire directed IPIs from a source CPU to a
> specified target CPU and measure the time incurred on waking up the
> target CPU in response.
>
> Also, introduce a mechanism to queue a hrtimer on a specified CPU and
> subsequently measure the time taken to wakeup the CPU.
>
> Define a simple debugfs interface that allows for adjusting the
> settings to trigger IPI and timer events on a designated CPU, and to
> observe the resulting cpuidle wakeup latencies.
>
> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com>
> Signed-off-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
> ---
> arch/powerpc/Kconfig.debug | 10 ++
> arch/powerpc/kernel/Makefile | 1 +
> arch/powerpc/kernel/test_cpuidle_latency.c | 154 +++++++++++++++++++++
I don't see anything here that's powerpc specific?
Which makes me wonder 1) could this be done with some existing generic
mechanism?, and 2) if not can this test code be made generic.
At the very least this should be Cc'ed to the cpuidle lists &
maintainers given it's a test for cpuidle latency :)
cheers
* Re: [RFC v3 1/2] powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events
2023-09-12 22:54 ` Michael Ellerman
@ 2023-09-21 11:00 ` Aboorva Devarajan
2023-09-21 23:41 ` Michael Ellerman
0 siblings, 1 reply; 12+ messages in thread
From: Aboorva Devarajan @ 2023-09-21 11:00 UTC (permalink / raw)
To: Michael Ellerman
Cc: sshegde, srikar, npiggin, rmclure, arnd, joel, shuah,
linux-kselftest, linuxppc-dev, linux-kernel, pratik.r.sampat
On Wed, 2023-09-13 at 08:54 +1000, Michael Ellerman wrote:
> Aboorva Devarajan <aboorvad@linux.vnet.ibm.com> writes:
> > From: Pratik R. Sampat <psampat@linux.ibm.com>
> >
> > Introduce a mechanism to fire directed IPIs from a source CPU to a
> > specified target CPU and measure the time incurred on waking up the
> > target CPU in response.
> >
> > Also, introduce a mechanism to queue a hrtimer on a specified CPU
> > and
> > subsequently measure the time taken to wakeup the CPU.
> >
> > Define a simple debugfs interface that allows for adjusting the
> > settings to trigger IPI and timer events on a designated CPU, and
> > to
> > observe the resulting cpuidle wakeup latencies.
> >
> > Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> > Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com>
> > Signed-off-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
> > ---
> > arch/powerpc/Kconfig.debug | 10 ++
> > arch/powerpc/kernel/Makefile | 1 +
> > arch/powerpc/kernel/test_cpuidle_latency.c | 154
> > +++++++++++++++++++++
>
> I don't see anything here that's powerpc specific?
>
> Which makes me wonder 1) could this be done with some existing
> generic
> mechanism?, and 2) if not can this test code be made generic.
>
> At the very least this should be Cc'ed to the cpuidle lists &
> maintainers given it's a test for cpuidle latency :)
>
> cheers
Hi Michael,
Thanks a lot for taking a look at this.
Yes, this test case can be used as a generic benchmark for evaluating
CPU idle latencies across different architectures. However, it has thus
far been tested and used exclusively on PowerPC, so we thought it would
be more beneficial to incorporate it into a PowerPC-specific selftest
suite. But I will work on making it a generic selftest and send across
a v4.
* Re: [RFC v3 1/2] powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events
2023-09-21 11:00 ` Aboorva Devarajan
@ 2023-09-21 23:41 ` Michael Ellerman
0 siblings, 0 replies; 12+ messages in thread
From: Michael Ellerman @ 2023-09-21 23:41 UTC (permalink / raw)
To: Aboorva Devarajan
Cc: sshegde, srikar, npiggin, rmclure, arnd, joel, shuah,
linux-kselftest, linuxppc-dev, linux-kernel, pratik.r.sampat
Aboorva Devarajan <aboorvad@linux.vnet.ibm.com> writes:
> On Wed, 2023-09-13 at 08:54 +1000, Michael Ellerman wrote:
>> Aboorva Devarajan <aboorvad@linux.vnet.ibm.com> writes:
>> > From: Pratik R. Sampat <psampat@linux.ibm.com>
>> >
>> > Introduce a mechanism to fire directed IPIs from a source CPU to a
>> > specified target CPU and measure the time incurred on waking up the
>> > target CPU in response.
>> >
>> > Also, introduce a mechanism to queue a hrtimer on a specified CPU
>> > and
>> > subsequently measure the time taken to wakeup the CPU.
>> >
>> > Define a simple debugfs interface that allows for adjusting the
>> > settings to trigger IPI and timer events on a designated CPU, and
>> > to
>> > observe the resulting cpuidle wakeup latencies.
>> >
>> > Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
>> > Signed-off-by: Pratik R. Sampat <psampat@linux.ibm.com>
>> > Signed-off-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
>> > ---
>> > arch/powerpc/Kconfig.debug | 10 ++
>> > arch/powerpc/kernel/Makefile | 1 +
>> > arch/powerpc/kernel/test_cpuidle_latency.c | 154
>> > +++++++++++++++++++++
>>
>> I don't see anything here that's powerpc specific?
>>
>> Which makes me wonder 1) could this be done with some existing
>> generic
>> mechanism?, and 2) if not can this test code be made generic.
>>
>> At the very least this should be Cc'ed to the cpuidle lists &
>> maintainers given it's a test for cpuidle latency :)
>>
>> cheers
>
> Hi Michael,
>
> Thanks a lot for taking a look at this.
>
> Yes, this test-case can be used as a generic benchmark for evaluating
> CPU idle latencies across different architectures, as it has thus far
> been exclusively tested and used on PowerPC, so we thought it would be
> more beneficial to incorporate it into a PowerPC specific self-test
> suite. But I will work on making it a generic self-test and send across
> a v4.
I'd suggest just posting v3 again but Cc'ing the cpuidle lists &
maintainers, to see if there is any interest in making it generic.
cheers
* Re: [RFC v3 0/2] CPU-Idle latency selftest framework
2023-09-11 5:36 [RFC v3 0/2] CPU-Idle latency selftest framework Aboorva Devarajan
2023-09-11 5:36 ` [RFC v3 1/2] powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events Aboorva Devarajan
2023-09-11 5:36 ` [RFC v3 2/2] powerpc/selftest: Add support for cpuidle latency measurement Aboorva Devarajan
@ 2023-09-25 5:06 ` Aboorva Devarajan
2023-10-12 4:48 ` Aboorva Devarajan
2 siblings, 1 reply; 12+ messages in thread
From: Aboorva Devarajan @ 2023-09-25 5:06 UTC (permalink / raw)
To: mpe, npiggin, rmclure, arnd, joel, shuah, linux-kselftest,
linuxppc-dev, linux-kernel, pratik.r.sampat, aboorvad
Cc: sshegde, srikar, rafael, daniel.lezcano, linux-pm
On Mon, 2023-09-11 at 11:06 +0530, Aboorva Devarajan wrote:
CC'ing CPUidle lists and maintainers,
Patch Summary:
The patchset introduces a kernel module and userspace driver designed
for estimating the wakeup latency experienced when waking up from
various CPU idle states. It primarily measures latencies related to two
types of events: Inter-Processor Interrupts (IPIs) and Timers.
Background:
Initially, these patches were introduced as a generic self-test.
However, it was later discovered that Intel platforms incorporate
timer-based wakeup optimizations. These optimizations allow CPUs to
perform a pre-wakeup, which limits the effectiveness of latency
observation in certain scenarios because it only measures the optimized
wakeup latency [1].
Therefore, in this RFC, the self-test is specifically integrated into
PowerPC, as it has been tested and used in PowerPC so far.
Another proposal is to introduce these patches as a generic cpuidle IPI
and timer wake-up test. While this method may not give an exact
measurement of latency variations at the hardware level, it can still
help assess this metric from a software-observability standpoint.
Looking forward to hearing what you think and any suggestions you may
have regarding this. Thanks.
[1]
https://lore.kernel.org/linux-pm/20200914174625.GB25628@in.ibm.com/T/#m5c004b9b1a918f669e91b3d0f33e2e3500923234
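One note for anyone adapting the selftest script in patch 2/2: it
expands sysfs cpuset strings (such as a thread_siblings_list reading
"0,4-6") with a fairly dense awk idiom. A standalone sketch of that
idiom, extracted from the script for illustration:

```shell
# Expand a cpuset string using "," for individual CPUs and "-" for
# ranges, printing one CPU id per line. RS=, makes awk treat each
# comma-separated entry as a record; FS=- splits range endpoints.
parse_cpuset() {
    echo "$1" | awk '/-/{for (i=$1; i<=$2; i++) printf "%s%s", i, ORS; next} {print}' RS=, FS=-
}

parse_cpuset "0,4-6"    # prints 0, 4, 5, 6 (one per line)
```

The same function backs both the sibling-thread and the source-CPU
expansion in the script, so any portability change here applies in
both places.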
> Changelog: v2 -> v3
>
> * Minimal code refactoring
> * Rebased on v6.6-rc1
>
> RFC v1:
> https://lore.kernel.org/all/20210611124154.56427-1-psampat@linux.ibm.com/
>
> RFC v2:
> https://lore.kernel.org/all/20230828061530.126588-2-aboorvad@linux.vnet.ibm.com/
>
> Other related RFC:
> https://lore.kernel.org/all/20210430082804.38018-1-psampat@linux.ibm.com/
>
> Userspace selftest:
> https://lkml.org/lkml/2020/9/2/356
>
> ----
>
> A kernel module + userspace driver to estimate the wakeup latency
> caused by going into stop states. The motivation behind this program
> is
> to find significant deviations behind advertised latency and
> residency
> values.
>
> The patchset measures latencies for two kinds of events. IPIs and
> Timers
> As this is a software-only mechanism, there will be additional
> latencies
> of the kernel-firmware-hardware interactions. To account for that,
> the
> program also measures a baseline latency on a 100 percent loaded CPU
> and the latencies achieved must be in view relative to that.
>
> To achieve this, we introduce a kernel module and expose its control
> knobs through the debugfs interface that the selftests can engage
> with.
>
> The kernel module provides the following interfaces within
> /sys/kernel/debug/powerpc/latency_test/ for,
>
> IPI test:
> ipi_cpu_dest = Destination CPU for the IPI
> ipi_cpu_src = Origin of the IPI
> ipi_latency_ns = Measured latency time in ns
> Timeout test:
> timeout_cpu_src = CPU on which the timer to be queued
> timeout_expected_ns = Timer duration
> timeout_diff_ns = Difference of actual duration vs expected timer
>
> Sample output is as follows:
>
> # --IPI Latency Test---
> # Baseline Avg IPI latency(ns): 2720
> # Observed Avg IPI latency(ns) - State snooze: 2565
> # Observed Avg IPI latency(ns) - State stop0_lite: 3856
> # Observed Avg IPI latency(ns) - State stop0: 3670
> # Observed Avg IPI latency(ns) - State stop1: 3872
> # Observed Avg IPI latency(ns) - State stop2: 17421
> # Observed Avg IPI latency(ns) - State stop4: 1003922
> # Observed Avg IPI latency(ns) - State stop5: 1058870
> #
> # --Timeout Latency Test--
> # Baseline Avg timeout diff(ns): 1435
> # Observed Avg timeout diff(ns) - State snooze: 1709
> # Observed Avg timeout diff(ns) - State stop0_lite: 2028
> # Observed Avg timeout diff(ns) - State stop0: 1954
> # Observed Avg timeout diff(ns) - State stop1: 1895
> # Observed Avg timeout diff(ns) - State stop2: 14556
> # Observed Avg timeout diff(ns) - State stop4: 873988
> # Observed Avg timeout diff(ns) - State stop5: 959137
>
> Aboorva Devarajan (2):
> powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer
> events
> powerpc/selftest: Add support for cpuidle latency measurement
>
> arch/powerpc/Kconfig.debug | 10 +
> arch/powerpc/kernel/Makefile | 1 +
> arch/powerpc/kernel/test_cpuidle_latency.c | 154 ++++++
> tools/testing/selftests/powerpc/Makefile | 1 +
> .../powerpc/cpuidle_latency/.gitignore | 2 +
> .../powerpc/cpuidle_latency/Makefile | 6 +
> .../cpuidle_latency/cpuidle_latency.sh | 443
> ++++++++++++++++++
> .../powerpc/cpuidle_latency/settings | 1 +
> 8 files changed, 618 insertions(+)
> create mode 100644 arch/powerpc/kernel/test_cpuidle_latency.c
> create mode 100644
> tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
> create mode 100644
> tools/testing/selftests/powerpc/cpuidle_latency/Makefile
> create mode 100755
> tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
> create mode 100644
> tools/testing/selftests/powerpc/cpuidle_latency/settings
>
* Re: [RFC v3 0/2] CPU-Idle latency selftest framework
2023-09-25 5:06 ` [RFC v3 0/2] CPU-Idle latency selftest framework Aboorva Devarajan
@ 2023-10-12 4:48 ` Aboorva Devarajan
0 siblings, 0 replies; 12+ messages in thread
From: Aboorva Devarajan @ 2023-10-12 4:48 UTC (permalink / raw)
To: 20230911053620.87973-1-aboorvad, mpe, rafael, daniel.lezcano,
linux-pm
Cc: sshegde, srikar, npiggin, rmclure, arnd, joel, shuah,
linux-kselftest, linuxppc-dev, linux-kernel, pratik.r.sampat
On Mon, 2023-09-25 at 10:36 +0530, Aboorva Devarajan wrote:
Gentle ping to check whether there is any feedback or comments on this
patch set.
Thanks
Aboorva
> On Mon, 2023-09-11 at 11:06 +0530, Aboorva Devarajan wrote:
>
> CC'ing CPUidle lists and maintainers,
>
> Patch Summary:
>
> The patchset introduces a kernel module and userspace driver designed
> for estimating the wakeup latency experienced when waking up from
> various CPU idle states. It primarily measures latencies related to
> two
> types of events: Inter-Processor Interrupts (IPIs) and Timers.
>
> Background:
>
> Initially, these patches were introduced as a generic self-test.
> However, it was later discovered that Intel platforms incorporate
> timer-based wakeup optimizations. These optimizations allow CPUs to
> perform a pre-wakeup, which limits the effectiveness of latency
> observation in certain scenarios because it only measures the
> optimized
> wakeup latency [1].
>
> Therefore, in this RFC, the self-test is specifically integrated into
> PowerPC, as it has been tested and used in PowerPC so far.
>
> Another proposal is to introduce these patches as a generic cpuidle
> IPI
> and timer wake-up test. While this method may not give us an exact
> measurement of latency variations at the hardware level, it can still
> help us assess this metric from a software observability standpoint.
>
> Looking forward to hearing what you think and any suggestions you may
> have regarding this. Thanks.
>
> [1]
> https://lore.kernel.org/linux-pm/20200914174625.GB25628@in.ibm.com/T/#m5c004b9b1a918f669e91b3d0f33e2e3500923234
>
> > Changelog: v2 -> v3
> >
> > * Minimal code refactoring
> > * Rebased on v6.6-rc1
> >
> > RFC v1:
> > https://lore.kernel.org/all/20210611124154.56427-1-psampat@linux.ibm.com/
> >
> > RFC v2:
> > https://lore.kernel.org/all/20230828061530.126588-2-aboorvad@linux.vnet.ibm.com/
> >
> > Other related RFC:
> > https://lore.kernel.org/all/20210430082804.38018-1-psampat@linux.ibm.com/
> >
> > Userspace selftest:
> > https://lkml.org/lkml/2020/9/2/356
> >
> > ----
> >
> > A kernel module + userspace driver to estimate the wakeup latency
> > caused by going into stop states. The motivation behind this
> > program
> > is
> > to find significant deviations behind advertised latency and
> > residency
> > values.
> >
> > The patchset measures latencies for two kinds of events. IPIs and
> > Timers
> > As this is a software-only mechanism, there will be additional
> > latencies
> > of the kernel-firmware-hardware interactions. To account for that,
> > the
> > program also measures a baseline latency on a 100 percent loaded
> > CPU
> > and the latencies achieved must be in view relative to that.
> >
> > To achieve this, we introduce a kernel module and expose its
> > control
> > knobs through the debugfs interface that the selftests can engage
> > with.
> >
> > The kernel module provides the following interfaces within
> > /sys/kernel/debug/powerpc/latency_test/ for,
> >
> > IPI test:
> > ipi_cpu_dest = Destination CPU for the IPI
> > ipi_cpu_src = Origin of the IPI
> > ipi_latency_ns = Measured latency time in ns
> > Timeout test:
> > timeout_cpu_src = CPU on which the timer to be queued
> > timeout_expected_ns = Timer duration
> > timeout_diff_ns = Difference of actual duration vs expected
> > timer
> >
> > Sample output is as follows:
> >
> > # --IPI Latency Test---
> > # Baseline Avg IPI latency(ns): 2720
> > # Observed Avg IPI latency(ns) - State snooze: 2565
> > # Observed Avg IPI latency(ns) - State stop0_lite: 3856
> > # Observed Avg IPI latency(ns) - State stop0: 3670
> > # Observed Avg IPI latency(ns) - State stop1: 3872
> > # Observed Avg IPI latency(ns) - State stop2: 17421
> > # Observed Avg IPI latency(ns) - State stop4: 1003922
> > # Observed Avg IPI latency(ns) - State stop5: 1058870
> > #
> > # --Timeout Latency Test--
> > # Baseline Avg timeout diff(ns): 1435
> > # Observed Avg timeout diff(ns) - State snooze: 1709
> > # Observed Avg timeout diff(ns) - State stop0_lite: 2028
> > # Observed Avg timeout diff(ns) - State stop0: 1954
> > # Observed Avg timeout diff(ns) - State stop1: 1895
> > # Observed Avg timeout diff(ns) - State stop2: 14556
> > # Observed Avg timeout diff(ns) - State stop4: 873988
> > # Observed Avg timeout diff(ns) - State stop5: 959137
> >
> > Aboorva Devarajan (2):
> > powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer
> > events
> > powerpc/selftest: Add support for cpuidle latency measurement
> >
> > arch/powerpc/Kconfig.debug | 10 +
> > arch/powerpc/kernel/Makefile | 1 +
> > arch/powerpc/kernel/test_cpuidle_latency.c | 154 ++++++
> > tools/testing/selftests/powerpc/Makefile | 1 +
> > .../powerpc/cpuidle_latency/.gitignore | 2 +
> > .../powerpc/cpuidle_latency/Makefile | 6 +
> > .../cpuidle_latency/cpuidle_latency.sh | 443
> > ++++++++++++++++++
> > .../powerpc/cpuidle_latency/settings | 1 +
> > 8 files changed, 618 insertions(+)
> > create mode 100644 arch/powerpc/kernel/test_cpuidle_latency.c
> > create mode 100644
> > tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
> > create mode 100644
> > tools/testing/selftests/powerpc/cpuidle_latency/Makefile
> > create mode 100755
> > tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
> > create mode 100644
> > tools/testing/selftests/powerpc/cpuidle_latency/settings
> >