public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf bench sched pipe: fix enforced blocking reads in worker_thread
@ 2025-03-23 14:01 Dirk Gouders
  2025-03-23 19:02 ` Ingo Molnar
  2025-03-24 16:34 ` Namhyung Kim
  0 siblings, 2 replies; 3+ messages in thread
From: Dirk Gouders @ 2025-03-23 14:01 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Dirk Gouders, Ian Rogers, Adrian Hunter, Peter Zijlstra,
	Ingo Molnar, LKML, linux-perf-users

The function worker_thread() is programmed in a way that roughly
doubles the number of expectable context switches, because it enforces
blocking reads:

 Performance counter stats for 'perf bench sched pipe':

         2,000,004      context-switches

      11.859548321 seconds time elapsed

       0.674871000 seconds user
       8.076890000 seconds sys

The result of this behavior is that the blocking reads by far dominate
the performance analysis of 'perf bench sched pipe':

Samples: 78K of event 'cycles:P', Event count (approx.): 27964965844
Overhead  Command     Shared Object         Symbol
  25.28%  sched-pipe  [kernel.kallsyms]     [k] read_hpet
   8.11%  sched-pipe  [kernel.kallsyms]     [k] retbleed_untrain_ret
   2.82%  sched-pipe  [kernel.kallsyms]     [k] pipe_write

From the code, it is unclear if that behavior is wanted but the log
says that at least Ingo Molnar aims to mimic lmbench's lat_ctx, that
doesn't handle the pipe ends that way
(https://sourceforge.net/p/lmbench/code/HEAD/tree/trunk/lmbench2/src/lat_ctx.c)

Fix worker_thread() by always first feeding the write ends of the pipes
and then trying to read.

This roughly halves the context switches and runtime of pure
'perf bench sched pipe':

 Performance counter stats for 'perf bench sched pipe':

         1,005,770      context-switches

       6.033448041 seconds time elapsed

       0.423142000 seconds user
       4.519829000 seconds sys

And the blocking reads do no longer dominate the analysis at the above
extreme:

Samples: 40K of event 'cycles:P', Event count (approx.): 14309364879
Overhead  Command     Shared Object         Symbol
  12.20%  sched-pipe  [kernel.kallsyms]     [k] read_hpet
   9.23%  sched-pipe  [kernel.kallsyms]     [k] retbleed_untrain_ret
   3.68%  sched-pipe  [kernel.kallsyms]     [k] pipe_write

Signed-off-by: Dirk Gouders <dirk@gouders.net>
---
 tools/perf/bench/sched-pipe.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/tools/perf/bench/sched-pipe.c b/tools/perf/bench/sched-pipe.c
index e2562677df96..70139036d68f 100644
--- a/tools/perf/bench/sched-pipe.c
+++ b/tools/perf/bench/sched-pipe.c
@@ -204,17 +204,10 @@ static void *worker_thread(void *__tdata)
 	}
 
 	for (i = 0; i < loops; i++) {
-		if (!td->nr) {
-			ret = read_pipe(td);
-			BUG_ON(ret != sizeof(int));
-			ret = write(td->pipe_write, &m, sizeof(int));
-			BUG_ON(ret != sizeof(int));
-		} else {
-			ret = write(td->pipe_write, &m, sizeof(int));
-			BUG_ON(ret != sizeof(int));
-			ret = read_pipe(td);
-			BUG_ON(ret != sizeof(int));
-		}
+		ret = write(td->pipe_write, &m, sizeof(int));
+		BUG_ON(ret != sizeof(int));
+		ret = read_pipe(td);
+		BUG_ON(ret != sizeof(int));
 	}
 
 	return NULL;
-- 
2.45.3


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] perf bench sched pipe: fix enforced blocking reads in worker_thread
  2025-03-23 14:01 [PATCH] perf bench sched pipe: fix enforced blocking reads in worker_thread Dirk Gouders
@ 2025-03-23 19:02 ` Ingo Molnar
  2025-03-24 16:34 ` Namhyung Kim
  1 sibling, 0 replies; 3+ messages in thread
From: Ingo Molnar @ 2025-03-23 19:02 UTC (permalink / raw)
  To: Dirk Gouders
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers, Adrian Hunter,
	Peter Zijlstra, LKML, linux-perf-users


* Dirk Gouders <dirk@gouders.net> wrote:

> The function worker_thread() is programmed in a way that roughly
> doubles the number of expectable context switches, because it enforces
> blocking reads:
> 
>  Performance counter stats for 'perf bench sched pipe':
> 
>          2,000,004      context-switches
> 
>       11.859548321 seconds time elapsed
> 
>        0.674871000 seconds user
>        8.076890000 seconds sys
> 
> The result of this behavior is that the blocking reads by far dominate
> the performance analysis of 'perf bench sched pipe':
> 
> Samples: 78K of event 'cycles:P', Event count (approx.): 27964965844
> Overhead  Command     Shared Object         Symbol
>   25.28%  sched-pipe  [kernel.kallsyms]     [k] read_hpet
>    8.11%  sched-pipe  [kernel.kallsyms]     [k] retbleed_untrain_ret
>    2.82%  sched-pipe  [kernel.kallsyms]     [k] pipe_write
> 
> From the code, it is unclear if that behavior is wanted but the log
> says that at least Ingo Molnar aims to mimic lmbench's lat_ctx, that
> doesn't handle the pipe ends that way
> (https://sourceforge.net/p/lmbench/code/HEAD/tree/trunk/lmbench2/src/lat_ctx.c)
> 
> Fix worker_thread() by always first feeding the write ends of the pipes
> and then trying to read.
> 
> This roughly halves the context switches and runtime of pure
> 'perf bench sched pipe':
> 
>  Performance counter stats for 'perf bench sched pipe':
> 
>          1,005,770      context-switches
> 
>        6.033448041 seconds time elapsed
> 
>        0.423142000 seconds user
>        4.519829000 seconds sys
> 
> And the blocking reads do no longer dominate the analysis at the above
> extreme:
> 
> Samples: 40K of event 'cycles:P', Event count (approx.): 14309364879
> Overhead  Command     Shared Object         Symbol
>   12.20%  sched-pipe  [kernel.kallsyms]     [k] read_hpet
>    9.23%  sched-pipe  [kernel.kallsyms]     [k] retbleed_untrain_ret
>    3.68%  sched-pipe  [kernel.kallsyms]     [k] pipe_write
> 
> Signed-off-by: Dirk Gouders <dirk@gouders.net>
> ---
>  tools/perf/bench/sched-pipe.c | 15 ++++-----------
>  1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/perf/bench/sched-pipe.c b/tools/perf/bench/sched-pipe.c
> index e2562677df96..70139036d68f 100644
> --- a/tools/perf/bench/sched-pipe.c
> +++ b/tools/perf/bench/sched-pipe.c
> @@ -204,17 +204,10 @@ static void *worker_thread(void *__tdata)
>  	}
>  
>  	for (i = 0; i < loops; i++) {
> -		if (!td->nr) {
> -			ret = read_pipe(td);
> -			BUG_ON(ret != sizeof(int));
> -			ret = write(td->pipe_write, &m, sizeof(int));
> -			BUG_ON(ret != sizeof(int));
> -		} else {
> -			ret = write(td->pipe_write, &m, sizeof(int));
> -			BUG_ON(ret != sizeof(int));
> -			ret = read_pipe(td);
> -			BUG_ON(ret != sizeof(int));
> -		}
> +		ret = write(td->pipe_write, &m, sizeof(int));
> +		BUG_ON(ret != sizeof(int));
> +		ret = read_pipe(td);
> +		BUG_ON(ret != sizeof(int));

Yeah, this was unintended:

Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] perf bench sched pipe: fix enforced blocking reads in worker_thread
  2025-03-23 14:01 [PATCH] perf bench sched pipe: fix enforced blocking reads in worker_thread Dirk Gouders
  2025-03-23 19:02 ` Ingo Molnar
@ 2025-03-24 16:34 ` Namhyung Kim
  1 sibling, 0 replies; 3+ messages in thread
From: Namhyung Kim @ 2025-03-24 16:34 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Dirk Gouders
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users

On Sun, 23 Mar 2025 15:01:01 +0100, Dirk Gouders wrote:
> The function worker_thread() is programmed in a way that roughly
> doubles the number of expectable context switches, because it enforces
> blocking reads:
> 
>  Performance counter stats for 'perf bench sched pipe':
> 
>          2,000,004      context-switches
> 
> [...]
Applied to perf-tools-next, thanks!

Best regards,
Namhyung



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-03-24 16:34 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-23 14:01 [PATCH] perf bench sched pipe: fix enforced blocking reads in worker_thread Dirk Gouders
2025-03-23 19:02 ` Ingo Molnar
2025-03-24 16:34 ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox