[PATCH 02/12] sched_debug: Fix runqueue task parsing logic and state filtering

public inbox for linux-rt-users@vger.kernel.org

From: Clark Williams <clrkwllms@kernel.org>
To: linux-rt-users@vger.kernel.org
Cc: Clark Williams <williams@redhat.com>,
	Claude <noreply@anthropic.com>,
	Clark Williams <clrkwllms@kernel.org>,
	wander@redhat.com, debarbos@redhat.com, marco.chiappero@suse.com,
	chris.friesen@windriver.com, luochunsheng@ustc.edu
Subject: [PATCH 02/12] sched_debug: Fix runqueue task parsing logic and state filtering
Date: Thu, 16 Oct 2025 21:24:34 -0500
Message-ID: <20251017022444.118802-2-clrkwllms@kernel.org>
In-Reply-To: <20251017022444.118802-1-clrkwllms@kernel.org>

From: Clark Williams <williams@redhat.com>

Refactor parse_task_lines() to correctly parse runnable tasks from
sched_debug output with improved state filtering and iteration.

Key fixes:
- Rename skipwords() to skip2word() and fix its logic so the returned
  pointer sits at the start of the Nth word (counting from 1) instead
  of just past the end of it
- Fix NEW_TASK_FORMAT detection to account for the 'S' column, which
  shifts the task field positions by one word
- Add state filtering for NEW_TASK_FORMAT: skip the '>R' (currently
  running) line and any line whose state is neither 'R' nor 'X'
- Fix loop termination: stop after (nr_entries-1) tasks, since the
  running task is excluded, and never read past the end of the buffer
- Fix buffer allocation size to include the null terminator (+1)
- Add comments explaining the parsing state machine
- Initialize ptr=NULL to prevent uninitialized-use warnings

This resolves task parsing failures where stalld was incorrectly
reading task fields or missing runnable tasks on newer kernels.
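
As an illustration of the word-positioning fix, here is a minimal
standalone sketch. The helpers skip_blanks(), skip_word() and
nth_word() are hypothetical stand-ins written for this note, not the
stalld functions; they only mirror the order of operations that
skip2word() uses in the diff (skip leading blanks once, then for each
earlier word skip the word and the blanks after it).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: advance past leading spaces and tabs. */
static const char *skip_blanks(const char *p)
{
	while (*p == ' ' || *p == '\t')
		p++;
	return p;
}

/* Hypothetical helper: advance past the current word. */
static const char *skip_word(const char *p)
{
	while (*p && !isspace((unsigned char)*p))
		p++;
	return p;
}

/* Return a pointer to the first character of the nth word (1-based). */
static const char *nth_word(const char *line, int n)
{
	const char *p;

	assert(line != NULL);
	p = skip_blanks(line);
	for (int i = 1; i < n; i++) {
		p = skip_word(p);	/* step over word i */
		p = skip_blanks(p);	/* land on the start of word i+1 */
	}
	return p;
}
```

With the line " R  kworker/0:1   42  100", nth_word(line, 3) points at
"42", so a following strtol() or length scan starts on the field itself
rather than on the whitespace in front of it.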

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Clark Williams <clrkwllms@kernel.org>
Signed-off-by: Clark Williams <williams@redhat.com>
---
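Note for reviewers: the state filter described in the commit message
can be read as a single predicate. The sketch below is an assumption
made for illustration only. should_record() and the enum are
hypothetical (only the names OLD_TASK_FORMAT and NEW_TASK_FORMAT come
from the patch), and the old-format rule simply mirrors the existing
"*ptr == 'R'" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Format names taken from the patch; the enum itself is hypothetical. */
enum task_format { OLD_TASK_FORMAT, NEW_TASK_FORMAT };

/* Decide whether a task line should be recorded as a waiting task. */
static bool should_record(enum task_format fmt, const char *line)
{
	assert(line != NULL);

	/* look at the first non-blank character of the line */
	while (*line == ' ' || *line == '\t')
		line++;

	/* old format: only the running task carries an 'R' label */
	if (fmt == OLD_TASK_FORMAT)
		return *line != 'R';

	/* new format: '>R' marks the task currently on the CPU */
	if (*line == '>')
		return false;

	/* new format: keep runnable ('R') and dying ('X') tasks only */
	return *line == 'R' || *line == 'X';
}
```

For example, in the new format a line starting with ">R" or " S" is
skipped, while a line starting with " R" or " X" is kept.
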
 src/sched_debug.c | 89 +++++++++++++++++++++++++++++++++++------------
 1 file changed, 66 insertions(+), 23 deletions(-)
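
The "+1" sizing fix follows the usual pattern for copying a span of
text into a C string. The sketch below shows that pattern in isolation;
copy_span() is a hypothetical helper written for this note and is not
part of stalld. It tests for an empty span before adding the terminator
byte, which is a choice of this sketch rather than something taken from
the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy the bytes in [start, end) into a freshly allocated,
 * NUL-terminated buffer. Returns NULL on an empty span or on
 * allocation failure. The caller frees the result.
 */
static char *copy_span(const char *start, const char *end)
{
	size_t len;
	char *buf;

	assert(start != NULL && end != NULL);
	if (end <= start)
		return NULL;

	len = (size_t)(end - start);
	buf = malloc(len + 1);	/* one extra byte for the terminator */
	if (buf == NULL)
		return NULL;

	memcpy(buf, start, len);
	buf[len] = '\0';
	return buf;
}
```

Without the extra byte, writing the terminator at buf[len] would land
one byte past the allocation.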

diff --git a/src/sched_debug.c b/src/sched_debug.c
index 180932ca7aa3..04a12ef2a7c2 100644
--- a/src/sched_debug.c
+++ b/src/sched_debug.c
@@ -122,7 +122,8 @@ static char *alloc_and_fill_cpu_buffer(int cpu, char *sched_dbg, int sched_dbg_s
 	if (!next_cpu_start)
 		next_cpu_start = sched_dbg + sched_dbg_size;
 
-	size = next_cpu_start - cpu_start;
+	/* add one for the null terminator */
+	size = next_cpu_start - cpu_start + 1;
 
 	if (size <= 0)
 		return NULL;
@@ -171,12 +172,13 @@ static inline char *nextline(char *str)
  * skip a specified number of words on a task line
  */
 
-static inline char *skipwords(char *ptr, int nwords)
+static inline char *skip2word(char *ptr, int nwords)
 {
 	int i;
-	for (i=0; i < nwords; i++) {
-		ptr = skipspaces(ptr);
+	ptr = skipspaces(ptr);
+	for (i=1; i < nwords; i++) {
 		ptr = skipchars(ptr);
+		ptr = skipspaces(ptr);
 	}
 	return ptr;
 }
@@ -228,12 +230,14 @@ static int detect_task_format(void)
 	config_buffer_size = bufsiz;
 	log_msg("initial config_buffer_size set to %zu\n", config_buffer_size);
 
+	/* find the delimiter for task information */
 	ptr = strstr(buffer, TASK_MARKER);
 	if (ptr == NULL) {
 		die("unable to find 'runnable tasks' in buffer, invalid input\n");
 		exit(-1);
 	}
 
+	/* move to the column header line */
 	ptr = nextline(ptr);
 	i = 0;
 
@@ -245,6 +249,8 @@ static int detect_task_format(void)
 	if (strncmp(ptr, "S", strlen("S")) == 0) {
 		log_msg("detect_task_format: NEW_TASK_FORMAT detected\n");
 		retval = NEW_TASK_FORMAT;
+		/* move the word offset by one */
+		i++;
 	}
 	else {
 		log_msg("detect_task_format: OLD_TASK_FORMAT detected\n");
@@ -357,45 +363,81 @@ static int is_runnable(int pid)
 static int parse_task_lines(char *buffer, struct task_info *task_info, int nr_entries)
 {
 	int pid, ctxsw, prio, comm_size;
-	char *ptr, *line, *end;
+	char *ptr=NULL, *line = buffer, *end;
+	char *buffer_end = buffer + strlen(buffer);
 	struct task_info *task;
 	char comm[COMM_SIZE];
 	int tasks = 0;
 
-	if ((ptr = strstr(buffer, TASK_MARKER)) == NULL)
-		die ("no runnable task section found!\n");
-
 	/*
 	 * If we have less than two tasks on the CPU there is no
 	 * possibility of a stall.
-	 */
+ 	 */
 	if (nr_entries < 2)
 		return 0;
+
+
+	/* search for the task marker header */
+	ptr = strstr(buffer, TASK_MARKER);
+	if (ptr == NULL)
+		die ("no runnable task section found!\n");
+
 	line = ptr;
 
-	/* skip header and divider */
+	/* skip "runnable tasks:" */
+ 	line = nextline(line);
+
+	/* skip header lines */
 	line = nextline(line);
+
+	/* skip divider line */
 	line = nextline(line);
-	
-	/* now loop over the task info */
-	while (tasks < nr_entries) {
+	/* at this point, line should point to the start of a task line */
+
+	/* now loop over the task info
+	 * note that we always discount the task that's on the cpu, so the
+	 * number of waiting tasks will always be at least one less than
+	 * nr_entries.
+	 */
+	while ((line < buffer_end) && tasks < (nr_entries-1)) {
 		task = &task_info[tasks];
 
+		/* move ptr to the first word of the line */
+		ptr = skipspaces(line);
+
 		/*
 		 * In 3.X kernels, only the singular RUNNING task receives
 		 * a "running state" label. Therefore, only care about
-		 * tasks that are not R (running on a CPU).
+		 * tasks that are not R (runnable on a CPU).
 		 */
 		if ((config_task_format == OLD_TASK_FORMAT) &&
 			(*ptr == 'R')) {
 			/* Go to the end of the line and ignore this task. */
-			ptr = strchr(ptr, '\n');
-			ptr++;
+			line = nextline(line);
 			continue;
 		}
 
+		/*
+		 * In newer kernels (>=4.x) every task line has a state column,
+		 * and the task currently running on the CPU is marked '>R'.
+		 * We don't care about the currently running task, so skip
+		 * it.
+		 * We also skip any state other than 'R' (runnable)
+		 * and 'X' (dying).
+		 */
+		if (config_task_format == NEW_TASK_FORMAT) {
+			if (*ptr == '>' || (*ptr != 'R' && *ptr != 'X')) {
+				line = nextline(line);
+				continue;
+			}
+		}
+
+		/*
+		 * At this point we have a task line to record
+		 */
+		
 		/* get the task field */
-		ptr = skipwords(line, config_task_format_offsets.task);
+		ptr = skip2word(line, config_task_format_offsets.task);
 
 		/* Find the end of the task field */
 		end = skipchars(ptr);
@@ -408,20 +450,21 @@ static int parse_task_lines(char *buffer, struct task_info *task_info, int nr_en
 		}
 		strncpy(comm, ptr, comm_size);
 		comm[comm_size] = '\0';
-		ptr = end;
 
 		/* get the PID field */
-		ptr = skipwords(line, config_task_format_offsets.pid);
+		ptr = skip2word(line, config_task_format_offsets.pid);
 		pid = strtol(ptr, NULL, 10);
 
 		/* get the context switches field */
-		ptr = skipwords(line, config_task_format_offsets.switches);
+		ptr = skip2word(line, config_task_format_offsets.switches);
 		ctxsw = strtol(ptr, NULL, 10);
 
 		/* get the prio field */
-		ptr = skipwords(line, config_task_format_offsets.prio);
+		ptr = skip2word(line, config_task_format_offsets.prio);
 		prio = strtol(ptr, NULL, 10);
 
+		/*log_msg("DEBUG: task%d comm:%s pid:%d ctxsw:%d prio:%d\n", tasks, comm, pid, ctxsw, prio);*/
+
                 /*
                  * In older formats, we must check to
                  * see if the process is runnable prior to storing header
@@ -437,10 +480,10 @@ static int parse_task_lines(char *buffer, struct task_info *task_info, int nr_en
 			task->since = time(NULL);
 			/* increment the count of tasks processed */
 			tasks++;
-		} else {
-			continue;
 		}
 
+		/* move our line pointer to the next available line */
+		line = nextline(line);
 	}
 	return tasks;
 }
-- 
2.51.0
