public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org
Cc: "Ingo Molnar" <mingo@kernel.org>,
	"Frederic Weisbecker" <fweisbec@gmail.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Masami Hiramatsu" <masami.hiramatsu.pt@hitachi.com>,
	"Oleg Nesterov" <oleg@redhat.com>, "Jörn Engel" <joern@logfs.org>,
	"Dave Jones" <davej@redhat.com>,
	"Steve Hodgson" <steve@purestorage.com>
Subject: [for-next-3.11][PATCH 8/8] ftrace: Clear module traced functions on unload module
Date: Tue, 30 Jul 2013 07:27:26 -0400	[thread overview]
Message-ID: <20130730113153.375039705@goodmis.org> (raw)
In-Reply-To: 20130730112718.273490378@goodmis.org

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: 0008-ftrace-Clear-module-traced-functions-on-unload-modul.patch --]
[-- Type: text/plain, Size: 5844 bytes --]

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

There's been a nasty bug that would show up and not give much info.
The bug displayed the following warning:

 WARNING: at kernel/trace/ftrace.c:1529 __ftrace_hash_rec_update+0x1e3/0x230()
 Pid: 20903, comm: bash Tainted: G           O 3.6.11+ #38405.trunk
 Call Trace:
  [<ffffffff8103e5ff>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff8103e65a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff810c2ee3>] __ftrace_hash_rec_update+0x1e3/0x230
  [<ffffffff810c4f28>] ftrace_hash_move+0x28/0x1d0
  [<ffffffff811401cc>] ? kfree+0x2c/0x110
  [<ffffffff810c68ee>] ftrace_regex_release+0x8e/0x150
  [<ffffffff81149f1e>] __fput+0xae/0x220
  [<ffffffff8114a09e>] ____fput+0xe/0x10
  [<ffffffff8105fa22>] task_work_run+0x72/0x90
  [<ffffffff810028ec>] do_notify_resume+0x6c/0xc0
  [<ffffffff8126596e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
  [<ffffffff815c0f88>] int_signal+0x12/0x17
 ---[ end trace 793179526ee09b2c ]---

It was finally narrowed down to unloading a module that was being traced.

It was actually more than that. When functions are being traced, there's
a table of all functions that have a ref count of the number of active
tracers attached to that function. When a function trace callback is
registered to a function, the function's record ref count is incremented.
When it is unregistered, the function's record ref count is decremented.
If an inconsistency is detected (ref count goes below zero) the above
warning is shown and the function tracing is permanently disabled until
reboot.

When a module is unloaded, it frees the function records that represent
the module functions. These records exist on their own pages, that is
function records for one module will not exist on the same page as
function records for other modules or even the core kernel.

Now, when a callback is registered to a function, it holds a "hash" of
function addresses that it traces. When it unregisters, the hash is
examined and any record that exists in the hash will decrement the
functions record's ref count. Hashes that point to a module function
that was freed are simply ignored.

But what happens if the trace is still going on and you reload the same
module. If the module is allocated in the same location, the hashes
of the registered functions will still be mapped to the module functions
(if it was tracing the module function). The problem is that the module
would allocate new function record tables with their ref counts as zero.
When a callback is unregistered, if the hash matches the function record
of a reloaded module, it will dec the ref count, but as it was zero it
would trigger the above warning and disable function tracing.

With the help of Steve and Joern, we found a reproducer:

 Using uinput module and uinput_release function.

 modprobe uinput
 echo uinput_release > /sys/kernel/debug/tracing/set_ftrace_filter
 rmmod uinput
 modprobe uinput
 // check /proc/modules to see if loaded in same addr, otherwise try again
 echo > /sys/kernel/debug/tracing/set_ftrace_filter

 [BOOM]

The solution here is on module unload, look at all the registered funtion
callbacks and remove any hash entry that points to any function in the
module that is about to be removed.

Link: http://lkml.kernel.org/r/CADHUQJ58icP+d3QTXheiCnZwu6hrAqG2heMhhcuQvghXavHRFw@mail.gmail.com

Reported-by: Jörn Engel <joern@logfs.org>
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Steve Hodgson <steve@purestorage.com>
Tested-by: Steve Hodgson <steve@purestorage.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
 kernel/trace/ftrace.c |   60 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 92d3334..6de8cbd 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4061,6 +4061,63 @@ static int ftrace_process_locs(struct module *mod,
 
 #ifdef CONFIG_MODULES
 
+/*
+ * If the filter is cleared, then all functions may end up being
+ * traced. We need to pass that information down to update the
+ * records.
+ */
+static bool remove_rec_entry(struct ftrace_hash *hash, struct dyn_ftrace *rec)
+{
+	struct ftrace_func_entry *entry;
+
+	entry = ftrace_lookup_ip(hash, rec->ip);
+	if (!entry)
+		return false;
+
+	free_hash_entry(hash, entry);
+
+	/* If we cleared the hash, then we now trace all functions */
+	return ftrace_hash_empty(hash);
+}
+
+static void clear_recs(struct ftrace_ops *ops, struct ftrace_page *pg)
+{
+	struct dyn_ftrace *rec;
+	bool update = false;
+	int i;
+
+	for (i = 0; i < pg->index; i++) {
+		rec = &pg->records[i];
+
+		/* If the filter hash gets cleared, we trace all functions */
+		if (remove_rec_entry(ops->filter_hash, rec))
+			update = true;
+		remove_rec_entry(ops->notrace_hash, rec);
+	}
+
+	if (update) {
+		ftrace_hash_rec_enable(ops, 1);
+		if (ftrace_enabled)
+			ftrace_run_update_code(FTRACE_UPDATE_CALLS);
+	}
+}
+
+static bool ops_has_filter(struct ftrace_ops *ops)
+{
+	return !ftrace_hash_empty(ops->filter_hash) ||
+		!ftrace_hash_empty(ops->notrace_hash);
+}
+
+static void clear_hashes(struct ftrace_page *pg)
+{
+	struct ftrace_ops *ops;
+
+	for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
+		if ((ops->flags & FTRACE_OPS_FL_ENABLED) && ops_has_filter(ops))
+			clear_recs(ops, pg);
+	}
+}
+
 #define next_to_ftrace_page(p) container_of(p, struct ftrace_page, next)
 
 void ftrace_release_mod(struct module *mod)
@@ -4094,6 +4151,9 @@ void ftrace_release_mod(struct module *mod)
 			if (pg == ftrace_pages)
 				ftrace_pages = next_to_ftrace_page(last_pg);
 
+			/* Make sure no hashes reference this module record */
+			clear_hashes(pg);
+
 			*last_pg = pg->next;
 			order = get_count_order(pg->size / ENTRIES_PER_PAGE);
 			free_pages((unsigned long)pg->records, order);
-- 
1.7.10.4



  parent reply	other threads:[~2013-07-30 11:32 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-30 11:27 [for-next-3.11][PATCH 0/8] ftrace/tracing: Event file fixes and ftrace function hash fixes Steven Rostedt
2013-07-30 11:27 ` [for-next-3.11][PATCH 1/8] tracing: Turn event/id->i_private into call->event.type Steven Rostedt
2013-07-30 11:27 ` [for-next-3.11][PATCH 2/8] tracing: Change event_enable/disable_read() to verify i_private != NULL Steven Rostedt
2013-07-30 11:27 ` [for-next-3.11][PATCH 3/8] tracing: Change event_filter_read/write " Steven Rostedt
2013-07-30 11:27 ` [for-next-3.11][PATCH 4/8] tracing: Change f_start() to take event_mutex and " Steven Rostedt
2013-07-30 11:27 ` [for-next-3.11][PATCH 5/8] tracing: Introduce remove_event_file_dir() Steven Rostedt
2013-07-30 11:27 ` [for-next-3.11][PATCH 6/8] tracing: Change remove_event_file_dir() to clear "d_subdirs"->i_private Steven Rostedt
2013-07-30 11:27 ` [for-next-3.11][PATCH 7/8] ftrace: Consolidate some duplicate code for updating ftrace ops Steven Rostedt
2013-07-30 11:27 ` Steven Rostedt [this message]
2013-07-30 17:39   ` [for-next-3.11][PATCH 8/8] ftrace: Clear module traced functions on unload module Steven Rostedt
2013-07-31 11:47 ` [for-next-3.11][PATCH 0/8] ftrace/tracing: Event file fixes and ftrace function hash fixes Oleg Nesterov
2013-07-31 14:06   ` Steven Rostedt
2013-07-31 14:22     ` Oleg Nesterov
2013-07-31 14:29       ` Oleg Nesterov
2013-07-31 16:43       ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130730113153.375039705@goodmis.org \
    --to=rostedt@goodmis.org \
    --cc=akpm@linux-foundation.org \
    --cc=davej@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=joern@logfs.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=masami.hiramatsu.pt@hitachi.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=steve@purestorage.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox