All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event
@ 2023-01-17 15:21 Steven Rostedt
  2023-01-17 15:21 ` [PATCH 1/6 v2] tracing: Simplify calculating entry size using struct_size() Steven Rostedt
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Steven Rostedt @ 2023-01-17 15:21 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi, Ross Zwisler,
	Ching-lin Yu

I finally got around to implementing what I've been wanting to do for a long
time! That is to pass stacktraces from the start to the end event of a
synthetic event. Specifically, I need to get the stacktrace of a task as it
schedules out, but I don't want to see it until it schedules back in and I
check the timings to show that it is longer than normal. That is now done
with:

 # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events
 # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace  if prev_state == 2' >> events/sched/sched_switch/trigger
 # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger

The above creates a synthetic event that will trigger on the max delta of a
task blocked in an uninterruptible state. It will show you the stack trace
of where that occurred!

 # echo 1 > events/synthetic/block_lat/enable
 # ls -lR > /dev/null
 # cat trace

# tracer: nop
#
# entries-in-buffer/entries-written: 2/2   #P:8
#
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
          <idle>-0       [005] d..4.   521.164922: block_lat: pid=0 delta=8322 stack=STACK:
=> __schedule+0x448/0x7b0
=> schedule+0x5a/0xb0
=> io_schedule+0x42/0x70
=> bit_wait_io+0xd/0x60
=> __wait_on_bit+0x4b/0x140
=> out_of_line_wait_on_bit+0x91/0xb0
=> jbd2_journal_commit_transaction+0x1679/0x1a70
=> kjournald2+0xa9/0x280
=> kthread+0xe9/0x110
=> ret_from_fork+0x2c/0x50  

           <...>-2       [004] d..4.   525.184257: block_lat: pid=2 delta=76 stack=STACK:
=> __schedule+0x448/0x7b0
=> schedule+0x5a/0xb0
=> schedule_timeout+0x11a/0x150
=> wait_for_completion_killable+0x144/0x1f0
=> __kthread_create_on_node+0xe7/0x1e0
=> kthread_create_on_node+0x51/0x70
=> create_worker+0xcc/0x1a0
=> worker_thread+0x2ad/0x380
=> kthread+0xe9/0x110
=> ret_from_fork+0x2c/0x50

Changes since v1: https://lore.kernel.org/linux-trace-kernel/20230117044010.838685230@goodmis.org/

- Fixed an issue with passing a stacktrace variable to another variable.

- Can now use the stacktrace in the histogram as well:

  # echo 'hist:delta.buckets=100,stack.stacktrace:sort=delta' > events/synthetic/block_lat/trigger
  # cat events/synthetic/block_lat/hist
  
  # event histogram
  # 
  # trigger info: hist:keys=delta.buckets=100,stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active]
  # 
    
  { delta: ~ 0-99, stacktrace:      
           event_hist_trigger+0x464/0x480
           event_triggers_call+0x52/0xe0
           trace_event_buffer_commit+0x193/0x250
           trace_event_raw_event_sched_switch+0xfc/0x150
           __traceiter_sched_switch+0x41/0x60 
           __schedule+0x448/0x7b0
           schedule_idle+0x26/0x40
           cpu_startup_entry+0x19/0x20
           start_secondary+0xed/0xf0
           secondary_startup_64_no_verify+0xe0/0xeb
  } hitcount:          6
  { delta: ~ 0-99, stacktrace:
           event_hist_trigger+0x464/0x480
           event_triggers_call+0x52/0xe0
           trace_event_buffer_commit+0x193/0x250
           trace_event_raw_event_sched_switch+0xfc/0x150
           __traceiter_sched_switch+0x41/0x60
           __schedule+0x448/0x7b0
           schedule_idle+0x26/0x40
           cpu_startup_entry+0x19/0x20
           __pfx_kernel_init+0x0/0x10
           arch_call_rest_init+0xa/0x24
           start_kernel+0x964/0x98d
           secondary_startup_64_no_verify+0xe0/0xeb
  } hitcount:          3
[..]
  { delta: ~ 8500-8599, stacktrace:
           event_hist_trigger+0x464/0x480
           event_triggers_call+0x52/0xe0
           trace_event_buffer_commit+0x193/0x250
           trace_event_raw_event_sched_switch+0xfc/0x150
           __traceiter_sched_switch+0x41/0x60
           __schedule+0x448/0x7b0
           schedule_idle+0x26/0x40
           cpu_startup_entry+0x19/0x20
           start_secondary+0xed/0xf0
           secondary_startup_64_no_verify+0xe0/0xeb
  } hitcount:          1

  Totals:
      Hits: 89 
      Entries: 11
      Dropped: 0



Steven Rostedt (Google) (6):
      tracing: Simplify calculating entry size using struct_size()
      tracing: Allow stacktraces to be saved as histogram variables
      tracing: Allow synthetic events to pass around stacktraces
      tracing/histogram: Add stacktrace type
      tracing/histogram: Document variable stacktrace
      tracing/histogram: Add simple tests for stacktrace usage of synthetic events

----
 Documentation/trace/histogram.rst                  | 129 +++++++++++++++++++++
 kernel/trace/trace.c                               |   2 +-
 kernel/trace/trace.h                               |   4 +
 kernel/trace/trace_events_hist.c                   |  63 +++++++---
 kernel/trace/trace_events_synth.c                  |  83 ++++++++++++-
 kernel/trace/trace_synth.h                         |   1 +
 .../inter-event/trigger-synthetic-event-stack.tc   |  24 ++++
 .../inter-event/trigger-synthetic-event-syntax.tc  |   6 +
 8 files changed, 294 insertions(+), 18 deletions(-)
 create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-event-stack.tc

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/6 v2] tracing: Simplify calculating entry size using struct_size()
  2023-01-17 15:21 [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Steven Rostedt
@ 2023-01-17 15:21 ` Steven Rostedt
  2023-01-17 15:21 ` [PATCH 2/6 v2] tracing: Allow stacktraces to be saved as histogram variables Steven Rostedt
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2023-01-17 15:21 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi, Ross Zwisler,
	Ching-lin Yu

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

When tracing a dynamic string field for a synthetic event, the offset
calculation for where to write the next event can use struct_size() to
find what the current size of the structure is.

This simplifies the code and makes it less error prone.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_synth.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index 67592eed0be8..9f79cd689b79 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -416,8 +416,7 @@ static unsigned int trace_string(struct synth_trace_event *entry,
 	if (is_dynamic) {
 		u32 data_offset;
 
-		data_offset = offsetof(typeof(*entry), fields);
-		data_offset += event->n_u64 * sizeof(u64);
+		data_offset = struct_size(entry, fields, event->n_u64);
 		data_offset += data_size;
 
 		len = kern_fetch_store_strlen((unsigned long)str_val);
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/6 v2] tracing: Allow stacktraces to be saved as histogram variables
  2023-01-17 15:21 [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Steven Rostedt
  2023-01-17 15:21 ` [PATCH 1/6 v2] tracing: Simplify calculating entry size using struct_size() Steven Rostedt
@ 2023-01-17 15:21 ` Steven Rostedt
  2023-01-17 15:21 ` [PATCH 3/6 v2] tracing: Allow synthetic events to pass around stacktraces Steven Rostedt
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2023-01-17 15:21 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi, Ross Zwisler,
	Ching-lin Yu

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Allow to save stacktraces into a histogram variable. This will be used by
synthetic events to allow a stacktrace from one event to be passed and
displayed by another event.

The special keyword "stacktrace" is to be used to trigger a stack
trace for the event that the histogram trigger is attached to.

  echo 'hist:keys=pid:st=stacktrace" > events/sched/sched_waking/trigger

Currently nothing can get access to the "$st" variable above that contains
the stack trace, but that will soon change.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_hist.c | 52 ++++++++++++++++++++++++++------
 1 file changed, 42 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index fcaf226b7744..bd67e7c3f258 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1360,6 +1360,8 @@ static const char *hist_field_name(struct hist_field *field,
 			field_name = field->name;
 	} else if (field->flags & HIST_FIELD_FL_TIMESTAMP)
 		field_name = "common_timestamp";
+	else if (field->flags & HIST_FIELD_FL_STACKTRACE)
+		field_name = "stacktrace";
 	else if (field->flags & HIST_FIELD_FL_HITCOUNT)
 		field_name = "hitcount";
 
@@ -1980,6 +1982,10 @@ static struct hist_field *create_hist_field(struct hist_trigger_data *hist_data,
 
 	if (flags & HIST_FIELD_FL_STACKTRACE) {
 		hist_field->fn_num = HIST_FIELD_FN_NOP;
+		hist_field->size = HIST_STACKTRACE_SIZE;
+		hist_field->type = kstrdup_const("unsigned long[]", GFP_KERNEL);
+		if (!hist_field->type)
+			goto free;
 		goto out;
 	}
 
@@ -2349,6 +2355,8 @@ parse_field(struct hist_trigger_data *hist_data, struct trace_event_file *file,
 		hist_data->enable_timestamps = true;
 		if (*flags & HIST_FIELD_FL_TIMESTAMP_USECS)
 			hist_data->attrs->ts_in_usecs = true;
+	} else if (strcmp(field_name, "stacktrace") == 0) {
+		*flags |= HIST_FIELD_FL_STACKTRACE;
 	} else if (strcmp(field_name, "common_cpu") == 0)
 		*flags |= HIST_FIELD_FL_CPU;
 	else if (strcmp(field_name, "hitcount") == 0)
@@ -3117,13 +3125,24 @@ static inline void __update_field_vars(struct tracing_map_elt *elt,
 		var_val = hist_fn_call(val, elt, buffer, rbe, rec);
 		var_idx = var->var.idx;
 
-		if (val->flags & HIST_FIELD_FL_STRING) {
+		if (val->flags & (HIST_FIELD_FL_STRING |
+				  HIST_FIELD_FL_STACKTRACE)) {
 			char *str = elt_data->field_var_str[j++];
 			char *val_str = (char *)(uintptr_t)var_val;
 			unsigned int size;
 
-			size = min(val->size, STR_VAR_LEN_MAX);
-			strscpy(str, val_str, size);
+			if (val->flags & HIST_FIELD_FL_STRING) {
+				size = min(val->size, STR_VAR_LEN_MAX);
+				strscpy(str, val_str, size);
+			} else {
+				int e;
+
+				e = stack_trace_save((void *)str,
+						     HIST_STACKTRACE_DEPTH,
+						     HIST_STACKTRACE_SKIP);
+				if (e < HIST_STACKTRACE_DEPTH - 1)
+					((unsigned long *)str)[e] = 0;
+			}
 			var_val = (u64)(uintptr_t)str;
 		}
 		tracing_map_set_var(elt, var_idx, var_val);
@@ -3822,7 +3841,8 @@ static void save_field_var(struct hist_trigger_data *hist_data,
 {
 	hist_data->field_vars[hist_data->n_field_vars++] = field_var;
 
-	if (field_var->val->flags & HIST_FIELD_FL_STRING)
+	/* Stack traces are saved in the string storage too */
+	if (field_var->val->flags & (HIST_FIELD_FL_STRING | HIST_FIELD_FL_STACKTRACE))
 		hist_data->n_field_var_str++;
 }
 
@@ -4101,7 +4121,8 @@ static int action_create(struct hist_trigger_data *hist_data,
 			}
 
 			hist_data->save_vars[hist_data->n_save_vars++] = field_var;
-			if (field_var->val->flags & HIST_FIELD_FL_STRING)
+			if (field_var->val->flags &
+			    (HIST_FIELD_FL_STRING | HIST_FIELD_FL_STACKTRACE))
 				hist_data->n_save_var_str++;
 			kfree(param);
 		}
@@ -4349,7 +4370,8 @@ static int create_var_field(struct hist_trigger_data *hist_data,
 	if (!ret && hist_data->fields[val_idx]->flags & HIST_FIELD_FL_EXECNAME)
 		update_var_execname(hist_data->fields[val_idx]);
 
-	if (!ret && hist_data->fields[val_idx]->flags & HIST_FIELD_FL_STRING)
+	if (!ret && hist_data->fields[val_idx]->flags &
+	    (HIST_FIELD_FL_STRING | HIST_FIELD_FL_STACKTRACE))
 		hist_data->fields[val_idx]->var_str_idx = hist_data->n_var_str++;
 
 	return ret;
@@ -5090,7 +5112,8 @@ static void hist_trigger_elt_update(struct hist_trigger_data *hist_data,
 		if (hist_field->flags & HIST_FIELD_FL_VAR) {
 			var_idx = hist_field->var.idx;
 
-			if (hist_field->flags & HIST_FIELD_FL_STRING) {
+			if (hist_field->flags &
+			    (HIST_FIELD_FL_STRING | HIST_FIELD_FL_STACKTRACE)) {
 				unsigned int str_start, var_str_idx, idx;
 				char *str, *val_str;
 				unsigned int size;
@@ -5103,9 +5126,18 @@ static void hist_trigger_elt_update(struct hist_trigger_data *hist_data,
 				str = elt_data->field_var_str[idx];
 				val_str = (char *)(uintptr_t)hist_val;
 
-				size = min(hist_field->size, STR_VAR_LEN_MAX);
-				strscpy(str, val_str, size);
-
+				if (hist_field->flags & HIST_FIELD_FL_STRING) {
+					size = min(hist_field->size, STR_VAR_LEN_MAX);
+					strscpy(str, val_str, size);
+				} else {
+					int e;
+
+					e = stack_trace_save((void *)str,
+							     HIST_STACKTRACE_DEPTH,
+							     HIST_STACKTRACE_SKIP);
+					if (e < HIST_STACKTRACE_DEPTH - 1)
+						((unsigned long *)str)[e] = 0;
+				}
 				hist_val = (u64)(uintptr_t)str;
 			}
 			tracing_map_set_var(elt, var_idx, hist_val);
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/6 v2] tracing: Allow synthetic events to pass around stacktraces
  2023-01-17 15:21 [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Steven Rostedt
  2023-01-17 15:21 ` [PATCH 1/6 v2] tracing: Simplify calculating entry size using struct_size() Steven Rostedt
  2023-01-17 15:21 ` [PATCH 2/6 v2] tracing: Allow stacktraces to be saved as histogram variables Steven Rostedt
@ 2023-01-17 15:21 ` Steven Rostedt
  2023-01-17 15:21 ` [PATCH 4/6 v2] tracing/histogram: Add stacktrace type Steven Rostedt
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2023-01-17 15:21 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi, Ross Zwisler,
	Ching-lin Yu

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Allow a stacktrace from one event to be displayed by the end event of a
synthetic event. This is very useful when looking for the longest latency
of a sleep or something blocked on I/O.

 # cd /sys/kernel/tracing/
 # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events
 # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace  if prev_state == 1||prev_state == 2' > events/sched/sched_switch/trigger
 # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger

The above creates a "block_lat" synthetic event that take the stacktrace of
when a task schedules out in either the interruptible or uninterruptible
states, and on a new per process max $delta (the time it was scheduled
out), will print the process id and the stacktrace.

  # echo 1 > events/synthetic/block_lat/enable
  # cat trace
 #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
 #              | |         |   |||||     |         |
    kworker/u16:0-767     [006] d..4.   560.645045: block_lat: pid=767 delta=66 stack=STACK:
 => __schedule
 => schedule
 => pipe_read
 => vfs_read
 => ksys_read
 => do_syscall_64
 => 0x966000aa

           <idle>-0       [003] d..4.   561.132117: block_lat: pid=0 delta=413787 stack=STACK:
 => __schedule
 => schedule
 => schedule_hrtimeout_range_clock
 => do_sys_poll
 => __x64_sys_poll
 => do_syscall_64
 => 0x966000aa

            <...>-153     [006] d..4.   562.068407: block_lat: pid=153 delta=54 stack=STACK:
 => __schedule
 => schedule
 => io_schedule
 => rq_qos_wait
 => wbt_wait
 => __rq_qos_throttle
 => blk_mq_submit_bio
 => submit_bio_noacct_nocheck
 => ext4_bio_write_page
 => mpage_submit_page
 => mpage_process_page_bufs
 => mpage_prepare_extent_to_map
 => ext4_do_writepages
 => ext4_writepages
 => do_writepages
 => __writeback_single_inode

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace.h              |  4 ++
 kernel/trace/trace_events_hist.c  |  7 ++-
 kernel/trace/trace_events_synth.c | 80 ++++++++++++++++++++++++++++++-
 kernel/trace/trace_synth.h        |  1 +
 4 files changed, 87 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index a56b0b7d23ef..28294f6e15b9 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -113,6 +113,10 @@ enum trace_type {
 #define MEM_FAIL(condition, fmt, ...)					\
 	DO_ONCE_LITE_IF(condition, pr_err, "ERROR: " fmt, ##__VA_ARGS__)
 
+#define HIST_STACKTRACE_DEPTH	16
+#define HIST_STACKTRACE_SIZE	(HIST_STACKTRACE_DEPTH * sizeof(unsigned long))
+#define HIST_STACKTRACE_SKIP	5
+
 /*
  * syscalls are special, and need special handling, this is why
  * they are not included in trace_entries.h
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index bd67e7c3f258..599c056b6048 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -480,10 +480,6 @@ DEFINE_HIST_FIELD_FN(u8);
 #define for_each_hist_key_field(i, hist_data)	\
 	for ((i) = (hist_data)->n_vals; (i) < (hist_data)->n_fields; (i)++)
 
-#define HIST_STACKTRACE_DEPTH	16
-#define HIST_STACKTRACE_SIZE	(HIST_STACKTRACE_DEPTH * sizeof(unsigned long))
-#define HIST_STACKTRACE_SKIP	5
-
 #define HITCOUNT_IDX		0
 #define HIST_KEY_SIZE_MAX	(MAX_FILTER_STR_VAL + HIST_STACKTRACE_SIZE)
 
@@ -3867,6 +3863,9 @@ static int check_synth_field(struct synth_event *event,
 	    && field->is_dynamic)
 		return 0;
 
+	if (strstr(hist_field->type, "long[") && field->is_stack)
+		return 0;
+
 	if (strcmp(field->type, hist_field->type) != 0) {
 		if (field->size != hist_field->size ||
 		    (!field->is_string && field->is_signed != hist_field->is_signed))
diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index 9f79cd689b79..1cfba626ea1e 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -173,6 +173,14 @@ static int synth_field_is_string(char *type)
 	return false;
 }
 
+static int synth_field_is_stack(char *type)
+{
+	if (strstr(type, "long[") != NULL)
+		return true;
+
+	return false;
+}
+
 static int synth_field_string_size(char *type)
 {
 	char buf[4], *end, *start;
@@ -248,6 +256,8 @@ static int synth_field_size(char *type)
 		size = sizeof(gfp_t);
 	else if (synth_field_is_string(type))
 		size = synth_field_string_size(type);
+	else if (synth_field_is_stack(type))
+		size = 0;
 
 	return size;
 }
@@ -292,6 +302,8 @@ static const char *synth_field_fmt(char *type)
 		fmt = "%x";
 	else if (synth_field_is_string(type))
 		fmt = "%.*s";
+	else if (synth_field_is_stack(type))
+		fmt = "%s";
 
 	return fmt;
 }
@@ -371,6 +383,23 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
 						 i == se->n_fields - 1 ? "" : " ");
 				n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
 			}
+		} else if (se->fields[i]->is_stack) {
+			u32 offset, data_offset, len;
+			unsigned long *p, *end;
+
+			offset = (u32)entry->fields[n_u64];
+			data_offset = offset & 0xffff;
+			len = offset >> 16;
+
+			p = (void *)entry + data_offset;
+			end = (void *)p + len - (sizeof(long) - 1);
+
+			trace_seq_printf(s, "%s=STACK:\n", se->fields[i]->name);
+
+			for (; *p && p < end; p++)
+				trace_seq_printf(s, "=> %pS\n", (void *)*p);
+			n_u64++;
+
 		} else {
 			struct trace_print_flags __flags[] = {
 			    __def_gfpflag_names, {-1, NULL} };
@@ -446,6 +475,43 @@ static unsigned int trace_string(struct synth_trace_event *entry,
 	return len;
 }
 
+static unsigned int trace_stack(struct synth_trace_event *entry,
+				 struct synth_event *event,
+				 long *stack,
+				 unsigned int data_size,
+				 unsigned int *n_u64)
+{
+	unsigned int len;
+	u32 data_offset;
+	void *data_loc;
+
+	data_offset = struct_size(entry, fields, event->n_u64);
+	data_offset += data_size;
+
+	for (len = 0; len < HIST_STACKTRACE_DEPTH; len++) {
+		if (!stack[len])
+			break;
+	}
+
+	/* Include the zero'd element if it fits */
+	if (len < HIST_STACKTRACE_DEPTH)
+		len++;
+
+	len *= sizeof(long);
+
+	/* Find the dynamic section to copy the stack into. */
+	data_loc = (void *)entry + data_offset;
+	memcpy(data_loc, stack, len);
+
+	/* Fill in the field that holds the offset/len combo */
+	data_offset |= len << 16;
+	*(u32 *)&entry->fields[*n_u64] = data_offset;
+
+	(*n_u64)++;
+
+	return len;
+}
+
 static notrace void trace_event_raw_event_synth(void *__data,
 						u64 *var_ref_vals,
 						unsigned int *var_ref_idx)
@@ -498,6 +564,12 @@ static notrace void trace_event_raw_event_synth(void *__data,
 					   event->fields[i]->is_dynamic,
 					   data_size, &n_u64);
 			data_size += len; /* only dynamic string increments */
+		} if (event->fields[i]->is_stack) {
+		        long *stack = (long *)var_ref_vals[val_idx];
+
+			len = trace_stack(entry, event, stack,
+					   data_size, &n_u64);
+			data_size += len;
 		} else {
 			struct synth_field *field = event->fields[i];
 			u64 val = var_ref_vals[val_idx];
@@ -560,6 +632,9 @@ static int __set_synth_event_print_fmt(struct synth_event *event,
 		    event->fields[i]->is_dynamic)
 			pos += snprintf(buf + pos, LEN_OR_ZERO,
 				", __get_str(%s)", event->fields[i]->name);
+		else if (event->fields[i]->is_stack)
+			pos += snprintf(buf + pos, LEN_OR_ZERO,
+				", __get_stacktrace(%s)", event->fields[i]->name);
 		else
 			pos += snprintf(buf + pos, LEN_OR_ZERO,
 					", REC->%s", event->fields[i]->name);
@@ -696,7 +771,8 @@ static struct synth_field *parse_synth_field(int argc, char **argv,
 		ret = -EINVAL;
 		goto free;
 	} else if (size == 0) {
-		if (synth_field_is_string(field->type)) {
+		if (synth_field_is_string(field->type) ||
+		    synth_field_is_stack(field->type)) {
 			char *type;
 
 			len = sizeof("__data_loc ") + strlen(field->type) + 1;
@@ -727,6 +803,8 @@ static struct synth_field *parse_synth_field(int argc, char **argv,
 
 	if (synth_field_is_string(field->type))
 		field->is_string = true;
+	else if (synth_field_is_stack(field->type))
+		field->is_stack = true;
 
 	field->is_signed = synth_field_signed(field->type);
  out:
diff --git a/kernel/trace/trace_synth.h b/kernel/trace/trace_synth.h
index b29595fe3ac5..43f6fb6078db 100644
--- a/kernel/trace/trace_synth.h
+++ b/kernel/trace/trace_synth.h
@@ -18,6 +18,7 @@ struct synth_field {
 	bool is_signed;
 	bool is_string;
 	bool is_dynamic;
+	bool is_stack;
 };
 
 struct synth_event {
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 4/6 v2] tracing/histogram: Add stacktrace type
  2023-01-17 15:21 [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Steven Rostedt
                   ` (2 preceding siblings ...)
  2023-01-17 15:21 ` [PATCH 3/6 v2] tracing: Allow synthetic events to pass around stacktraces Steven Rostedt
@ 2023-01-17 15:21 ` Steven Rostedt
  2023-01-17 15:21 ` [PATCH 5/6 v2] tracing/histogram: Document variable stacktrace Steven Rostedt
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2023-01-17 15:21 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi, Ross Zwisler,
	Ching-lin Yu

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Now that stacktraces can be part of synthetic events, allow a key to be
typed as a stacktrace.

  # cd /sys/kernel/tracing
  # echo 's:block_lat u64 delta; unsigned long stack[];' >> dynamic_events
  # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 2' >> events/sched/sched_switch/trigger
  # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,st2=$st:onmatch(sched.sched_switch).trace(block_lat,$delta,$st2)' >> events/sched/sched_switch/trigger
  # echo 'hist:keys=delta.buckets=100,stack.stacktrace:sort=delta' > events/synthetic/block_lat/trigger

  # cat events/synthetic/block_lat/hist

  # event histogram
  #
  # trigger info: hist:keys=delta.buckets=100,stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active]
  #

  { delta: ~ 0-99, stacktrace:
           event_hist_trigger+0x464/0x480
           event_triggers_call+0x52/0xe0
           trace_event_buffer_commit+0x193/0x250
           trace_event_raw_event_sched_switch+0xfc/0x150
           __traceiter_sched_switch+0x41/0x60
           __schedule+0x448/0x7b0
           schedule_idle+0x26/0x40
           cpu_startup_entry+0x19/0x20
           start_secondary+0xed/0xf0
           secondary_startup_64_no_verify+0xe0/0xeb
  } hitcount:          6
  { delta: ~ 0-99, stacktrace:
           event_hist_trigger+0x464/0x480
           event_triggers_call+0x52/0xe0
           trace_event_buffer_commit+0x193/0x250
           trace_event_raw_event_sched_switch+0xfc/0x150
           __traceiter_sched_switch+0x41/0x60
           __schedule+0x448/0x7b0
           schedule_idle+0x26/0x40
           cpu_startup_entry+0x19/0x20
           __pfx_kernel_init+0x0/0x10
           arch_call_rest_init+0xa/0x24
           start_kernel+0x964/0x98d
           secondary_startup_64_no_verify+0xe0/0xeb
  } hitcount:          3
  { delta: ~ 0-99, stacktrace:
           event_hist_trigger+0x464/0x480
           event_triggers_call+0x52/0xe0
           trace_event_buffer_commit+0x193/0x250
           trace_event_raw_event_sched_switch+0xfc/0x150
           __traceiter_sched_switch+0x41/0x60
           __schedule+0x448/0x7b0
           schedule+0x5a/0xb0
           worker_thread+0xaf/0x380
           kthread+0xe9/0x110
           ret_from_fork+0x2c/0x50
  } hitcount:          1
  { delta: ~ 100-199, stacktrace:
           event_hist_trigger+0x464/0x480
           event_triggers_call+0x52/0xe0
           trace_event_buffer_commit+0x193/0x250
           trace_event_raw_event_sched_switch+0xfc/0x150
           __traceiter_sched_switch+0x41/0x60
           __schedule+0x448/0x7b0
           schedule_idle+0x26/0x40
           cpu_startup_entry+0x19/0x20
           start_secondary+0xed/0xf0
           secondary_startup_64_no_verify+0xe0/0xeb
  } hitcount:         15
  [..]
  { delta: ~ 8500-8599, stacktrace:
           event_hist_trigger+0x464/0x480
           event_triggers_call+0x52/0xe0
           trace_event_buffer_commit+0x193/0x250
           trace_event_raw_event_sched_switch+0xfc/0x150
           __traceiter_sched_switch+0x41/0x60
           __schedule+0x448/0x7b0
           schedule_idle+0x26/0x40
           cpu_startup_entry+0x19/0x20
           start_secondary+0xed/0xf0
           secondary_startup_64_no_verify+0xe0/0xeb
  } hitcount:          1

  Totals:
      Hits: 89
      Entries: 11
      Dropped: 0

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_hist.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index 599c056b6048..888b7a394ce5 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -1716,6 +1716,8 @@ static const char *get_hist_field_flags(struct hist_field *hist_field)
 		flags_str = "percent";
 	else if (hist_field->flags & HIST_FIELD_FL_GRAPH)
 		flags_str = "graph";
+	else if (hist_field->flags & HIST_FIELD_FL_STACKTRACE)
+		flags_str = "stacktrace";
 
 	return flags_str;
 }
@@ -2312,6 +2314,8 @@ parse_field(struct hist_trigger_data *hist_data, struct trace_event_file *file,
 			*flags |= HIST_FIELD_FL_EXECNAME;
 		else if (strcmp(modifier, "syscall") == 0)
 			*flags |= HIST_FIELD_FL_SYSCALL;
+		else if (strcmp(modifier, "stacktrace") == 0)
+			*flags |= HIST_FIELD_FL_STACKTRACE;
 		else if (strcmp(modifier, "log2") == 0)
 			*flags |= HIST_FIELD_FL_LOG2;
 		else if (strcmp(modifier, "usecs") == 0)
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 5/6 v2] tracing/histogram: Document variable stacktrace
  2023-01-17 15:21 [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Steven Rostedt
                   ` (3 preceding siblings ...)
  2023-01-17 15:21 ` [PATCH 4/6 v2] tracing/histogram: Add stacktrace type Steven Rostedt
@ 2023-01-17 15:21 ` Steven Rostedt
  2023-01-17 15:21 ` [PATCH 6/6 v2] tracing/histogram: Add simple tests for stacktrace usage of synthetic events Steven Rostedt
  2023-02-10 21:25 ` [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Tom Zanussi
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2023-01-17 15:21 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi, Ross Zwisler,
	Ching-lin Yu

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Add a little documentation (and a useful example) of how a stacktrace can
be used within a histogram variable and synthetic event.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 Documentation/trace/histogram.rst | 129 ++++++++++++++++++++++++++++++
 1 file changed, 129 insertions(+)

diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst
index f95459aa984f..5c391328b9bb 100644
--- a/Documentation/trace/histogram.rst
+++ b/Documentation/trace/histogram.rst
@@ -81,6 +81,7 @@ Documentation written by Tom Zanussi
 	.usecs         display a common_timestamp in microseconds
         .percent       display a number of percentage value
         .graph         display a bar-graph of a value
+	.stacktrace    display as a stacktrace (must by a long[] type)
 	=============  =================================================
 
   Note that in general the semantics of a given field aren't
@@ -1786,6 +1787,8 @@ or assigned to a variable and referenced in a subsequent expression::
   # echo 'hist:keys=next_pid:us_per_sec=1000000 ...' >> event/trigger
   # echo 'hist:keys=next_pid:timestamp_secs=common_timestamp/$us_per_sec ...' >> event/trigger
 
+Variables can even hold stacktraces, which are useful with synthetic events.
+
 2.2.2 Synthetic Events
 ----------------------
 
@@ -1940,6 +1943,132 @@ the ".buckets" modifier and specify a size (in this case groups of 10).
       Entries: 16
       Dropped: 0
 
+To save stacktraces, create a synthetic event with a field of type "unsigned long[]"
+or even just "long[]". For example, to see how long a task is blocked in an
+uninterruptible state:
+
+  # cd /sys/kernel/tracing
+  # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events
+  # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace  if prev_state == 2' >> events/sched/sched_switch/trigger
+  # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger
+  # echo 1 > events/synthetic/block_lat/enable
+  # cat trace
+
+  # tracer: nop
+  #
+  # entries-in-buffer/entries-written: 2/2   #P:8
+  #
+  #                                _-----=> irqs-off/BH-disabled
+  #                               / _----=> need-resched
+  #                              | / _---=> hardirq/softirq
+  #                              || / _--=> preempt-depth
+  #                              ||| / _-=> migrate-disable
+  #                              |||| /     delay
+  #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
+  #              | |         |   |||||     |         |
+            <idle>-0       [005] d..4.   521.164922: block_lat: pid=0 delta=8322 stack=STACK:
+  => __schedule+0x448/0x7b0
+  => schedule+0x5a/0xb0
+  => io_schedule+0x42/0x70
+  => bit_wait_io+0xd/0x60
+  => __wait_on_bit+0x4b/0x140
+  => out_of_line_wait_on_bit+0x91/0xb0
+  => jbd2_journal_commit_transaction+0x1679/0x1a70
+  => kjournald2+0xa9/0x280
+  => kthread+0xe9/0x110
+  => ret_from_fork+0x2c/0x50
+
+             <...>-2       [004] d..4.   525.184257: block_lat: pid=2 delta=76 stack=STACK:
+  => __schedule+0x448/0x7b0
+  => schedule+0x5a/0xb0
+  => schedule_timeout+0x11a/0x150
+  => wait_for_completion_killable+0x144/0x1f0
+  => __kthread_create_on_node+0xe7/0x1e0
+  => kthread_create_on_node+0x51/0x70
+  => create_worker+0xcc/0x1a0
+  => worker_thread+0x2ad/0x380
+  => kthread+0xe9/0x110
+  => ret_from_fork+0x2c/0x50
+
+A synthetic event that has a stacktrace field may use it as a key in histogram:
+
+  # echo 'hist:delta.buckets=100,stack.stacktrace:sort=delta' > events/synthetic/block_lat/trigger
+  # cat events/synthetic/block_lat/hist
+
+  # event histogram
+  #
+  # trigger info: hist:keys=delta.buckets=100,stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active]
+  #
+
+  { delta: ~ 0-99, stacktrace:
+           event_hist_trigger+0x464/0x480
+           event_triggers_call+0x52/0xe0
+           trace_event_buffer_commit+0x193/0x250
+           trace_event_raw_event_sched_switch+0xfc/0x150
+           __traceiter_sched_switch+0x41/0x60
+           __schedule+0x448/0x7b0
+           schedule_idle+0x26/0x40
+           cpu_startup_entry+0x19/0x20
+           start_secondary+0xed/0xf0
+           secondary_startup_64_no_verify+0xe0/0xeb
+  } hitcount:          6
+  { delta: ~ 0-99, stacktrace:
+           event_hist_trigger+0x464/0x480
+           event_triggers_call+0x52/0xe0
+           trace_event_buffer_commit+0x193/0x250
+           trace_event_raw_event_sched_switch+0xfc/0x150
+           __traceiter_sched_switch+0x41/0x60
+           __schedule+0x448/0x7b0
+           schedule_idle+0x26/0x40
+           cpu_startup_entry+0x19/0x20
+           __pfx_kernel_init+0x0/0x10
+           arch_call_rest_init+0xa/0x24
+           start_kernel+0x964/0x98d
+           secondary_startup_64_no_verify+0xe0/0xeb
+  } hitcount:          3
+  { delta: ~ 0-99, stacktrace:
+           event_hist_trigger+0x464/0x480
+           event_triggers_call+0x52/0xe0
+           trace_event_buffer_commit+0x193/0x250
+           trace_event_raw_event_sched_switch+0xfc/0x150
+           __traceiter_sched_switch+0x41/0x60
+           __schedule+0x448/0x7b0
+           schedule+0x5a/0xb0
+           worker_thread+0xaf/0x380
+           kthread+0xe9/0x110
+           ret_from_fork+0x2c/0x50
+  } hitcount:          1
+  { delta: ~ 100-199, stacktrace:
+           event_hist_trigger+0x464/0x480
+           event_triggers_call+0x52/0xe0
+           trace_event_buffer_commit+0x193/0x250
+           trace_event_raw_event_sched_switch+0xfc/0x150
+           __traceiter_sched_switch+0x41/0x60
+           __schedule+0x448/0x7b0
+           schedule_idle+0x26/0x40
+           cpu_startup_entry+0x19/0x20
+           start_secondary+0xed/0xf0
+           secondary_startup_64_no_verify+0xe0/0xeb
+  } hitcount:         15
+  [..]
+  { delta: ~ 8500-8599, stacktrace:
+           event_hist_trigger+0x464/0x480
+           event_triggers_call+0x52/0xe0
+           trace_event_buffer_commit+0x193/0x250
+           trace_event_raw_event_sched_switch+0xfc/0x150
+           __traceiter_sched_switch+0x41/0x60
+           __schedule+0x448/0x7b0
+           schedule_idle+0x26/0x40
+           cpu_startup_entry+0x19/0x20
+           start_secondary+0xed/0xf0
+           secondary_startup_64_no_verify+0xe0/0xeb
+  } hitcount:          1
+
+  Totals:
+      Hits: 89
+      Entries: 11
+      Dropped: 0
+
 2.2.3 Hist trigger 'handlers' and 'actions'
 -------------------------------------------
 
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 6/6 v2] tracing/histogram: Add simple tests for stacktrace usage of synthetic events
  2023-01-17 15:21 [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Steven Rostedt
                   ` (4 preceding siblings ...)
  2023-01-17 15:21 ` [PATCH 5/6 v2] tracing/histogram: Document variable stacktrace Steven Rostedt
@ 2023-01-17 15:21 ` Steven Rostedt
  2023-02-10 21:25 ` [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Tom Zanussi
  6 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2023-01-17 15:21 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Tom Zanussi, Ross Zwisler,
	Ching-lin Yu

From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Update the selftests to include a test of passing a stacktrace between the
events of a synthetic event.

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 kernel/trace/trace.c                          |  2 +-
 .../trigger-synthetic-event-stack.tc          | 24 +++++++++++++++++++
 .../trigger-synthetic-event-syntax.tc         |  6 +++++
 3 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-event-stack.tc

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d445789dc247..beb7b0e77874 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5803,7 +5803,7 @@ static const char readme_msg[] =
 #ifdef CONFIG_SYNTH_EVENTS
 	"  events/synthetic_events\t- Create/append/remove/show synthetic events\n"
 	"\t  Write into this file to define/undefine new synthetic events.\n"
-	"\t     example: echo 'myevent u64 lat; char name[]' >> synthetic_events\n"
+	"\t     example: echo 'myevent u64 lat; char name[]; long[] stack' >> synthetic_events\n"
 #endif
 #endif
 ;
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-event-stack.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-event-stack.tc
new file mode 100644
index 000000000000..755dbe94ccf4
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-event-stack.tc
@@ -0,0 +1,24 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: event trigger - test inter-event histogram trigger trace action with dynamic string param
+# requires: set_event synthetic_events events/sched/sched_process_exec/hist "long[]' >> synthetic_events":README
+
+fail() { #msg
+    echo $1
+    exit_fail
+}
+
+echo "Test create synthetic event with stack"
+
+
+echo 's:wake_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events
+echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace  if prev_state == 1||prev_state == 2' >> events/sched/sched_switch/trigger
+echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(wake_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger
+echo 1 > events/synthetic/wake_lat/enable
+sleep 1
+
+if ! grep -q "=>.*sched" trace; then
+    fail "Failed to create synthetic event with stack"
+fi
+
+exit 0
diff --git a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-event-syntax.tc b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-event-syntax.tc
index 2968cdc7df30..366f1f3ad906 100644
--- a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-event-syntax.tc
+++ b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic-event-syntax.tc
@@ -70,6 +70,12 @@ grep "myevent[[:space:]]unsigned long var" synthetic_events
 echo "myevent char var[10]" > synthetic_events
 grep "myevent[[:space:]]char\[10\] var" synthetic_events
 
+if grep -q 'long\[\]' README; then
+  # test stacktrace type
+  echo "myevent unsigned long[] var" > synthetic_events
+  grep "myevent[[:space:]]unsigned long\[\] var" synthetic_events
+fi
+
 do_reset
 
 exit 0
-- 
2.39.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event
  2023-01-17 15:21 [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Steven Rostedt
                   ` (5 preceding siblings ...)
  2023-01-17 15:21 ` [PATCH 6/6 v2] tracing/histogram: Add simple tests for stacktrace usage of synthetic events Steven Rostedt
@ 2023-02-10 21:25 ` Tom Zanussi
  6 siblings, 0 replies; 8+ messages in thread
From: Tom Zanussi @ 2023-02-10 21:25 UTC (permalink / raw)
  To: Steven Rostedt, linux-kernel, linux-trace-kernel
  Cc: Masami Hiramatsu, Andrew Morton, Ross Zwisler, Ching-lin Yu

Hi Steve,

On Tue, 2023-01-17 at 10:21 -0500, Steven Rostedt wrote:
> I finally got around to implementing what I've been wanting to do for
> a long
> time! That is to pass stacktraces from the start to the end event of
> a
> synthetic event. Specifically, I need to get the stacktrace of a task

Hey, this is really great!  Thanks for doing this, and sorry I haven't
gotten around to looking at it until now.

Anyway, while playing with it I found a few things that didn't work as
expected, and will send a few patches with some fixes shortly.

Thanks again, this is really useful!

Tom





> as it
> schedules out, but I don't want to see it until it schedules back in
> and I
> check the timings to show that it is longer than normal. That is now
> done
> with:
> 
>  # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' >
> dynamic_events
>  # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace 
> if prev_state == 2' >> events/sched/sched_switch/trigger
>  # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-
> $ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >>
> events/sched/sched_switch/trigger
> 
> The above creates a synthetic event that will trigger on the max
> delta of a
> task blocked in an uninterruptible state. It will show you the stack
> trace
> of where that occurred!
> 
>  # echo 1 > events/synthetic/block_lat/enable
>  # ls -lR > /dev/null
>  # cat trace
> 
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 2/2   #P:8
> #
> #                                _-----=> irqs-off/BH-disabled
> #                               / _----=> need-resched
> #                              | / _---=> hardirq/softirq
> #                              || / _--=> preempt-depth
> #                              ||| / _-=> migrate-disable
> #                              |||| /     delay
> #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
> #              | |         |   |||||     |         |
>           <idle>-0       [005] d..4.   521.164922: block_lat: pid=0
> delta=8322 stack=STACK:
> => __schedule+0x448/0x7b0
> => schedule+0x5a/0xb0
> => io_schedule+0x42/0x70
> => bit_wait_io+0xd/0x60
> => __wait_on_bit+0x4b/0x140
> => out_of_line_wait_on_bit+0x91/0xb0
> => jbd2_journal_commit_transaction+0x1679/0x1a70
> => kjournald2+0xa9/0x280
> => kthread+0xe9/0x110
> => ret_from_fork+0x2c/0x50  
> 
>            <...>-2       [004] d..4.   525.184257: block_lat: pid=2
> delta=76 stack=STACK:
> => __schedule+0x448/0x7b0
> => schedule+0x5a/0xb0
> => schedule_timeout+0x11a/0x150
> => wait_for_completion_killable+0x144/0x1f0
> => __kthread_create_on_node+0xe7/0x1e0
> => kthread_create_on_node+0x51/0x70
> => create_worker+0xcc/0x1a0
> => worker_thread+0x2ad/0x380
> => kthread+0xe9/0x110
> => ret_from_fork+0x2c/0x50
> 
> Changes since v1:
> https://lore.kernel.org/linux-trace-kernel/20230117044010.838685230@goodmis.org/
> 
> - Fixed an issue with passing a stacktrace variable to another
> variable.
> 
> - Can now use the stacktrace in the histogram as well:
> 
>   # echo 'hist:delta.buckets=100,stack.stacktrace:sort=delta' >
> events/synthetic/block_lat/trigger
>   # cat events/synthetic/block_lat/hist
>   
>   # event histogram
>   # 
>   # trigger info:
> hist:keys=delta.buckets=100,stacktrace:vals=hitcount:sort=delta.bucke
> ts=100:size=2048 [active]
>   # 
>     
>   { delta: ~ 0-99, stacktrace:      
>            event_hist_trigger+0x464/0x480
>            event_triggers_call+0x52/0xe0
>            trace_event_buffer_commit+0x193/0x250
>            trace_event_raw_event_sched_switch+0xfc/0x150
>            __traceiter_sched_switch+0x41/0x60 
>            __schedule+0x448/0x7b0
>            schedule_idle+0x26/0x40
>            cpu_startup_entry+0x19/0x20
>            start_secondary+0xed/0xf0
>            secondary_startup_64_no_verify+0xe0/0xeb
>   } hitcount:          6
>   { delta: ~ 0-99, stacktrace:
>            event_hist_trigger+0x464/0x480
>            event_triggers_call+0x52/0xe0
>            trace_event_buffer_commit+0x193/0x250
>            trace_event_raw_event_sched_switch+0xfc/0x150
>            __traceiter_sched_switch+0x41/0x60
>            __schedule+0x448/0x7b0
>            schedule_idle+0x26/0x40
>            cpu_startup_entry+0x19/0x20
>            __pfx_kernel_init+0x0/0x10
>            arch_call_rest_init+0xa/0x24
>            start_kernel+0x964/0x98d
>            secondary_startup_64_no_verify+0xe0/0xeb
>   } hitcount:          3
> [..]
>   { delta: ~ 8500-8599, stacktrace:
>            event_hist_trigger+0x464/0x480
>            event_triggers_call+0x52/0xe0
>            trace_event_buffer_commit+0x193/0x250
>            trace_event_raw_event_sched_switch+0xfc/0x150
>            __traceiter_sched_switch+0x41/0x60
>            __schedule+0x448/0x7b0
>            schedule_idle+0x26/0x40
>            cpu_startup_entry+0x19/0x20
>            start_secondary+0xed/0xf0
>            secondary_startup_64_no_verify+0xe0/0xeb
>   } hitcount:          1
> 
>   Totals:
>       Hits: 89 
>       Entries: 11
>       Dropped: 0
> 
> 
> 
> Steven Rostedt (Google) (6):
>       tracing: Simplify calculating entry size using struct_size()
>       tracing: Allow stacktraces to be saved as histogram variables
>       tracing: Allow synthetic events to pass around stacktraces
>       tracing/histogram: Add stacktrace type
>       tracing/histogram: Document variable stacktrace
>       tracing/histogram: Add simple tests for stacktrace usage of
> synthetic events
> 
> ----
>  Documentation/trace/histogram.rst                  | 129
> +++++++++++++++++++++
>  kernel/trace/trace.c                               |   2 +-
>  kernel/trace/trace.h                               |   4 +
>  kernel/trace/trace_events_hist.c                   |  63 +++++++---
>  kernel/trace/trace_events_synth.c                  |  83
> ++++++++++++-
>  kernel/trace/trace_synth.h                         |   1 +
>  .../inter-event/trigger-synthetic-event-stack.tc   |  24 ++++
>  .../inter-event/trigger-synthetic-event-syntax.tc  |   6 +
>  8 files changed, 294 insertions(+), 18 deletions(-)
>  create mode 100644
> tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-
> synthetic-event-stack.tc


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-02-10 21:25 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-01-17 15:21 [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Steven Rostedt
2023-01-17 15:21 ` [PATCH 1/6 v2] tracing: Simplify calculating entry size using struct_size() Steven Rostedt
2023-01-17 15:21 ` [PATCH 2/6 v2] tracing: Allow stacktraces to be saved as histogram variables Steven Rostedt
2023-01-17 15:21 ` [PATCH 3/6 v2] tracing: Allow synthetic events to pass around stacktraces Steven Rostedt
2023-01-17 15:21 ` [PATCH 4/6 v2] tracing/histogram: Add stacktrace type Steven Rostedt
2023-01-17 15:21 ` [PATCH 5/6 v2] tracing/histogram: Document variable stacktrace Steven Rostedt
2023-01-17 15:21 ` [PATCH 6/6 v2] tracing/histogram: Add simple tests for stacktrace usage of synthetic events Steven Rostedt
2023-02-10 21:25 ` [PATCH 0/6 v2] tracing/histograms: Pass stacktrace from synthetic start event to end event Tom Zanussi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.