linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Fernand Sieber <sieberf@amazon.com>
Cc: <linux-perf-users@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	Fernand Sieber <sieberf@amazon.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>
Subject: [PATCH] perf: Timehist account sch delay for scheduled out running
Date: Thu, 13 Jun 2024 20:59:06 +0200	[thread overview]
Message-ID: <20240613185906.31082-1-sieberf@amazon.com> (raw)

When using perf timehist, sch delay is only computed for a waking task,
not for a pre empted task. This patches addresses this problem.

Example of `perf timehist` report before the patch for `stress` task
competing with each other.

First column is wait time, second column sch delay, third column
runtime.
```
       1.492060 [0000]  s    stress[81]                          1.999      0.000      2.000      R  next: stress[83]
       1.494060 [0000]  s    stress[83]                          2.000      0.000      2.000      R  next: stress[81]
       1.496060 [0000]  s    stress[81]                          2.000      0.000      2.000      R  next: stress[83]
       1.498060 [0000]  s    stress[83]                          2.000      0.000      1.999      R  next: stress[81]
```

After the patch, it looks like this (note that all wait time is now sch
delay instead of zero):
```
1.492060 [0000]  s    stress[81]                          1.999      1.999      2.000      R  next: stress[83]
1.494060 [0000]  s    stress[83]                          2.000      2.000      2.000      R  next: stress[81]
1.496060 [0000]  s    stress[81]                          2.000      2.000      2.000      R  next: stress[83]
1.498060 [0000]  s    stress[83]                          2.000      2.000      1.999      R  next: stress[81]
```

In timehist:
* wait time represents the duration waiting for any system resource
* sch delay represents the duration waiting for cpu system resources

This is based on perf comments (dt_wait = wait time, dt_delay = sch
delay):
```
/*
 * Explanation of delta-time stats:
 *
 *            t = time of current schedule out event
 *        tprev = time of previous sched out event
 *                also time of schedule-in event for current task
 *    last_time = time of last sched change event for current task
 *                (i.e, time process was last scheduled out)
 * ready_to_run = time of wakeup for current task
 *
 * -----|------------|------------|------------|------
 *    last         ready        tprev          t
 *    time         to run
 *
 *      |-------- dt_wait --------|
 *                   |- dt_delay -|-- dt_run --|
 *
 *   dt_run = run time of current task
 *  dt_wait = time between last schedule out event for task and tprev
 *            represents time spent off the cpu
 * dt_delay = time between wakeup and schedule-in of task
 */
 ```

The problem with the current logic is that last time is only set when
waking a task. Therefore it is not set for a pre empted task. To fix
this, we set last time to the current sample time if a scheduled out
task (on the switch tracepoint) is in state running.

Signed-off-by: Fernand Sieber <sieberf@amazon.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 tools/perf/builtin-sched.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 5977c49ae2c7..7422c930abaf 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2659,7 +2659,10 @@ static int timehist_sched_change_event(struct perf_tool *tool,
 		tr->last_state = state;
 
 		/* sched out event for task so reset ready to run time */
-		tr->ready_to_run = 0;
+		if (state == 'R')
+			tr->ready_to_run = t;
+		else
+			tr->ready_to_run = 0;
 	}
 
 	evsel__save_time(evsel, sample->time, sample->cpu);
-- 
2.40.1


             reply	other threads:[~2024-06-13 18:59 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-13 18:59 Fernand Sieber [this message]
2024-06-15 20:50 ` [PATCH] perf: Timehist account sch delay for scheduled out running Madadi Vineeth Reddy
2024-06-18  8:44   ` Sieber, Fernand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240613185906.31082-1-sieberf@amazon.com \
    --to=sieberf@amazon.com \
    --cc=acme@kernel.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).