From: Steven Rostedt <rostedt@goodmis.org>
To: Ze Gao <zegao2021@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ian Rogers <irogers@google.com>, Ingo Molnar <mingo@redhat.com>,
	Jiri Olsa <jolsa@kernel.org>, Mark Rutland <mark.rutland@arm.com>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org,
	linux-trace-devel@vger.kernel.org, Ze Gao <zegao@tencent.com>
Subject: Re: [RFC PATCH v4 5/7] sched, tracing: add to report task state in symbolic chars
Date: Wed, 2 Aug 2023 11:07:01 -0400
Message-ID: <20230802110701.5227346d@gandalf.local.home>
In-Reply-To: <20230802121116.324604-6-zegao@tencent.com>

On Wed,  2 Aug 2023 08:10:00 -0400
Ze Gao <zegao2021@gmail.com> wrote:

> From: Ze Gao <zegao2021@gmail.com>
> 
> Internal representations of task state are likely to be changed
> or reordered, and reporting them to userspace without exporting
> them as part of the API is basically wrong: it can easily break
> userspace observability tools as the kernel evolves. For example,
> perf suffers from this and still reports wrong states as of this
> writing.
> 
> OTOH, some masqueraded states like TASK_REPORT_IDLE and
> TASK_REPORT_MAX are also reported inadvertently, which confuses
> things even more, and most userspace tools do not even take them
> into consideration.
> 
> So add a new field alongside the old raw value to report task
> state as a symbolic char, which is self-explanatory and needs
> no further translation. Since the old raw value is kept, this
> does not break any userspace tool.
> 
> Note for PREEMPT_ACTIVE, we introduce 'p' to report it and use
> the old conventions for the rest.
> 
> Signed-off-by: Ze Gao <zegao@tencent.com>
> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Acked-by: Ian Rogers <irogers@google.com>
> ---
>  include/trace/events/sched.h | 44 ++++++++++++++++++++++--------------
>  1 file changed, 27 insertions(+), 17 deletions(-)
> 
> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> index 7d34db20b2c6..1c7b94793495 100644
> --- a/include/trace/events/sched.h
> +++ b/include/trace/events/sched.h
> @@ -6,6 +6,7 @@
>  #define _TRACE_SCHED_H
>  
>  #include <linux/kthread.h>
> +#include <linux/sched.h>
>  #include <linux/sched/numa_balancing.h>
>  #include <linux/tracepoint.h>
>  #include <linux/binfmts.h>
> @@ -214,6 +215,27 @@ static inline int __trace_sched_switch_state(bool preempt,
>  
>  	return state ? (1 << (state - 1)) : state;
>  }
> +
> +static inline char __trace_sched_switch_state_char(bool preempt,
> +						   unsigned int prev_state,
> +						   struct task_struct *p)
> +{
> +	long state;
> +
> +#ifdef CONFIG_SCHED_DEBUG
> +	BUG_ON(p != current);

BUG? Why not WARN_ON()?

-- Steve

> +#endif /* CONFIG_SCHED_DEBUG */
> +
> +	/*
> +	 * For PREEMPT_ACTIVE, we introduce 'p' to report it and use the old
> +	 * conventions for the rest.
> +	 */
> +	if (preempt)
> +		return 'p';
> +
> +	state = __task_state_index(prev_state, p->exit_state);
> +	return task_index_to_char(state);
> +}
>  #endif /* CREATE_TRACE_POINTS */
>  
>  /*
> @@ -236,6 +258,7 @@ TRACE_EVENT(sched_switch,
>  		__field(	int,	prev_state			)
>  		__array(	char,	prev_comm,	TASK_COMM_LEN	)
>  		__array(	char,	next_comm,	TASK_COMM_LEN	)
> +		__field(	char,	prev_state_char			)
>  	),
>  
>  	TP_fast_assign(
> @@ -246,26 +269,13 @@ TRACE_EVENT(sched_switch,
>  		__entry->prev_state		= __trace_sched_switch_state(preempt, prev_state, prev);
>  		memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
>  		memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
> +		__entry->prev_state_char	= __trace_sched_switch_state_char(preempt, prev_state, prev);
>  		/* XXX SCHED_DEADLINE */
>  	),
>  
> -	TP_printk("prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d",
> -		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio,
> -
> -		(__entry->prev_state & (TASK_REPORT_MAX - 1)) ?
> -		  __print_flags(__entry->prev_state & (TASK_REPORT_MAX - 1), "|",
> -				{ TASK_INTERRUPTIBLE, "S" },
> -				{ TASK_UNINTERRUPTIBLE, "D" },
> -				{ __TASK_STOPPED, "T" },
> -				{ __TASK_TRACED, "t" },
> -				{ EXIT_DEAD, "X" },
> -				{ EXIT_ZOMBIE, "Z" },
> -				{ TASK_PARKED, "P" },
> -				{ TASK_DEAD, "I" }) :
> -		  "R",
> -
> -		__entry->prev_state & TASK_REPORT_MAX ? "+" : "",
> -		__entry->next_comm, __entry->next_pid, __entry->next_prio)
> +	TP_printk("prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%c ==> next_comm=%s next_pid=%d next_prio=%d",
> +		__entry->prev_comm, __entry->prev_pid, __entry->prev_prio, __entry->prev_state_char, __entry->next_comm,
> +		__entry->next_pid, __entry->next_prio)
>  );
>  
>  /*
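
For anyone following along, the mapping the new helper performs can be
sketched in plain userspace C. This is only an illustration, not the
kernel code: the sk_* names are hypothetical, the state bit values are
assumptions that mirror include/linux/sched.h around the time of this
series (they are exactly the kind of internal detail that may change),
and the "RSDTtXZPI" string is assumed to match the array behind
task_index_to_char().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Assumed bit values for illustration only; the real definitions live in
 * include/linux/sched.h and are not a stable userspace API.
 */
#define SK_TASK_UNINTERRUPTIBLE	0x02u
#define SK_TASK_NOLOAD		0x400u	/* assumption */
#define SK_TASK_IDLE		(SK_TASK_UNINTERRUPTIBLE | SK_TASK_NOLOAD)
#define SK_TASK_REPORT		0x7fu	/* R, S, D, T, t, X, Z, P bits */
#define SK_TASK_REPORT_IDLE	(SK_TASK_REPORT + 1)

/* 1-based position of the highest set bit, 0 when x == 0 (like fls()). */
static unsigned int sk_fls(unsigned int x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Userspace stand-in for __task_state_index(). */
static unsigned int sk_state_index(unsigned int task_state,
				   unsigned int exit_state)
{
	unsigned int state = (task_state | exit_state) & SK_TASK_REPORT;

	/* TASK_IDLE is masqueraded as its own reportable state. */
	if (task_state == SK_TASK_IDLE)
		state = SK_TASK_REPORT_IDLE;

	return sk_fls(state);
}

/* Userspace stand-in for task_index_to_char(). */
static char sk_index_to_char(unsigned int idx)
{
	static const char state_char[] = "RSDTtXZPI";

	assert(idx < sizeof(state_char) - 1);
	return state_char[idx];
}

/* Mirrors the quoted helper: 'p' for preemption, old letters otherwise. */
static char sk_switch_state_char(bool preempt, unsigned int prev_state,
				 unsigned int exit_state)
{
	if (preempt)
		return 'p';

	return sk_index_to_char(sk_state_index(prev_state, exit_state));
}
```

Under these assumed values, sk_switch_state_char(false, 0x01, 0) gives
'S', and any call with preempt set gives 'p' regardless of prev_state.
A consumer that reads the char directly no longer needs to know any of
the bit values above.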


