public inbox for linux-kernel@vger.kernel.org
From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH 1/2] ftrace/scripts: Add helper script to bisect function tracing problem functions
Date: Sun, 25 Sep 2016 10:47:15 -0400
Message-ID: <20160925144751.426276514@goodmis.org>
In-Reply-To: 20160925144714.728554031@goodmis.org

[-- Attachment #1: 0001-ftrace-scripts-Add-helper-script-to-bisect-function-.patch --]
[-- Type: text/plain, Size: 5444 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

Every so often, with a special config or an architecture change, running
function or function_graph tracing can cause the machine to hard reboot,
crash, or simply hard lock up. There are some functions in the function graph
tracer that cannot be traced, because tracing them causes the function tracer
to recurse before the recursion protection mechanisms are in place.

When this occurs, the dynamic ftrace feature that limits what actually gets
traced can be used to bisect down to the problem function. This adds a script
that helps with this process in the scripts/tracing directory, called
ftrace-bisect.sh.

The setup is to read all the functions that can be traced from
available_filter_functions into a file (full_file). Then run this script,
passing it the full_file and a "test_file" and "non_test_file", where the
contents of test_file will be added to set_ftrace_filter. What
ftrace-bisect.sh does is copy half of the functions in full_file into the
test_file and the other half into the non_test_file. This way, one can cat
the test_file into set_ftrace_filter and test only the functions that are in
that file. If tracing works, then the bad function is in the non_test_file,
which becomes the new full_file for the next iteration. If the system
crashes, then the bad function is in the test_file and, after a reboot, the
test_file becomes the new full_file in the next iteration.
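
As a rough illustration of one round (a sketch only, not the script added by
this patch), the split and the choice of the next full_file could look like
the following. The helper names split_half and next_full, and the "crashed"
and "ok" result strings, are made up for this example.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; they are not part of the patch.

# split_half FULL TEST NONTEST
# Write the first half of FULL to TEST and the remaining lines to NONTEST.
split_half() {
	local full=$1 test=$2 nontest=$3
	local total half
	total=$(wc -l < "$full")
	half=$((total / 2))
	head -n "$half" "$full" > "$test"
	tail -n "+$((half + 1))" "$full" > "$nontest"
}

# next_full RESULT FULL TEST NONTEST
# RESULT is "crashed" if tracing the functions in TEST took the machine down,
# anything else otherwise. The half that must contain the bad function
# replaces FULL, and the other half is removed.
next_full() {
	local result=$1 full=$2 test=$3 nontest=$4
	if [ "$result" = "crashed" ]; then
		mv "$test" "$full"
		rm -f "$nontest"
	else
		mv "$nontest" "$full"
		rm -f "$test"
	fi
}
```

After a round, one would call next_full with "crashed" (once the machine has
been rebooted) or "ok", and then split the new full_file again.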

When we get down to a single function in the full_file,
ftrace-bisect.sh will report that as the bad function.
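
Because each round halves the candidate list, the number of rounds grows
with the base-2 logarithm of the function count. A small sketch of that
arithmetic, assuming a single bad function that always lands in the larger
half; the helper name bisect_rounds and the count of 40000 functions are
made up for this example.

```shell
#!/bin/bash
# Hypothetical helper for illustration; it is not part of the patch.

# bisect_rounds N
# Print the worst-case number of rounds needed to narrow N candidate
# functions down to one, keeping the larger half each time.
bisect_rounds() {
	local n=$1 rounds=0
	while [ "$n" -gt 1 ]; do
		n=$(( (n + 1) / 2 ))
		rounds=$((rounds + 1))
	done
	echo "$rounds"
}

bisect_rounds 40000   # prints 16
```

Each of those rounds may need a reboot, so the total time is dominated by
how long the machine takes to come back up.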

Full documentation of how to use this simple script is within the script
file itself.

Link: http://lkml.kernel.org/r/20160920100716.131d3647@gandalf.local.home

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 scripts/tracing/ftrace-bisect.sh | 115 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)
 create mode 100755 scripts/tracing/ftrace-bisect.sh

diff --git a/scripts/tracing/ftrace-bisect.sh b/scripts/tracing/ftrace-bisect.sh
new file mode 100755
index 000000000000..9ff8ac5fc53c
--- /dev/null
+++ b/scripts/tracing/ftrace-bisect.sh
@@ -0,0 +1,115 @@
+#!/bin/bash
+#
+# Here's how to use this:
+#
+# This script helps find functions that, when traced by the function tracer
+# or the function graph tracer, cause the machine to reboot, hang, or crash.
+# Here are the steps to take.
+#
+# First, determine if function tracing is working with a single function:
+#
+#   (note, if this is a problem with function_graph tracing, then simply
+#    replace "function" with "function_graph" in the following steps).
+#
+#  # cd /sys/kernel/debug/tracing
+#  # echo schedule > set_ftrace_filter
+#  # echo function > current_tracer
+#
+# If this works, then we know that something is being traced that shouldn't be.
+#
+#  # echo nop > current_tracer
+#
+#  # cat available_filter_functions > ~/full-file
+#  # ftrace-bisect.sh ~/full-file ~/test-file ~/non-test-file
+#  # cat ~/test-file > set_ftrace_filter
+#
+# *** Note *** this will take several minutes. Setting multiple functions is
+# an O(n^2) operation, and we are dealing with thousands of functions. So go
+# have coffee, talk with your coworkers, read facebook. And eventually, this
+# operation will end.
+#
+#  # echo function > current_tracer
+#
+# If it crashes, we know that ~/test-file has a bad function.
+#
+#   Reboot back to test kernel.
+#
+#     # cd /sys/kernel/debug/tracing
+#     # mv ~/test-file ~/full-file
+#
+# If it didn't crash:
+#
+#     # echo nop > current_tracer
+#     # mv ~/non-test-file ~/full-file
+#
+# Get rid of the other test file from previous run (or save them off somewhere).
+#  # rm -f ~/test-file ~/non-test-file
+#
+# And start again:
+#
+#  # ftrace-bisect.sh ~/full-file ~/test-file ~/non-test-file
+#
+# The good thing is, because this cuts the number of functions in ~/test-file
+# by half, the cat of it into set_ftrace_filter takes half as long each
+# iteration, so don't talk so much at the water cooler the second time.
+#
+# Eventually, if you did this correctly, you will get down to the problem
+# function, and all we need to do is to notrace it.
+#
+# To check whether that function really is the problem, just do:
+#
+#  # echo <problem-function> > set_ftrace_notrace
+#  # echo > set_ftrace_filter
+#  # echo function > current_tracer
+#
+# And if it doesn't crash, we are done.
+#
+# If it does crash, do this again (there's more than one problem function)
+# but you need to echo the problem function(s) into set_ftrace_notrace before
+# enabling function tracing in the above steps. Or if you can compile the
+# kernel, annotate the problem functions with "notrace" and start again.
+#
+
+
+if [ $# -ne 3 ]; then
+  echo 'usage: ftrace-bisect.sh full-file test-file non-test-file'
+  exit 1
+fi
+
+full=$1
+test=$2
+nontest=$3
+
+if [ ! -f "$full" ]; then
+	echo "$full does not exist"
+	exit 1
+fi
+
+x=$(wc -l < "$full")
+if [ "$x" -eq 1 ]; then
+	echo "There's only one function left, must be the bad one"
+	cat "$full"
+	exit 0
+fi
+
+let x=$x/2
+let y=$x+1
+
+if [ -f "$test" ]; then
+	echo -n "$test exists, delete it? [y/N]"
+	read a
+	if [ "$a" != "y" ] && [ "$a" != "Y" ]; then
+		exit 1
+	fi
+fi
+
+if [ -f "$nontest" ]; then
+	echo -n "$nontest exists, delete it? [y/N]"
+	read a
+	if [ "$a" != "y" ] && [ "$a" != "Y" ]; then
+		exit 1
+	fi
+fi
+
+sed -ne "1,${x}p" "$full" > "$test"
+sed -ne "$y,\$p" "$full" > "$nontest"
-- 
2.8.1

Thread overview: 3+ messages
2016-09-25 14:47 [PATCH 0/2] tracing: Some more updates for 4.9 Steven Rostedt
2016-09-25 14:47 ` Steven Rostedt [this message]
2016-09-25 14:47 ` [PATCH 2/2] tracing: Call traceoff trigger after event is recorded Steven Rostedt
