stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, "Jörn Engel" <joern@logfs.org>,
	"Dave Jones" <davej@redhat.com>,
	"Steve Hodgson" <steve@purestorage.com>,
	"Steven Rostedt" <rostedt@goodmis.org>
Subject: [ 35/74] ftrace: Check module functions being traced on reload
Date: Mon, 26 Aug 2013 18:08:33 -0700	[thread overview]
Message-ID: <20130827010431.989598015@linuxfoundation.org> (raw)
In-Reply-To: <20130827010424.535365435@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>

commit 8c4f3c3fa9681dc549cd35419b259496082fef8b upstream.

There's been a nasty bug that would show up and not give much info.
The bug displayed the following warning:

 WARNING: at kernel/trace/ftrace.c:1529 __ftrace_hash_rec_update+0x1e3/0x230()
 Pid: 20903, comm: bash Tainted: G           O 3.6.11+ #38405.trunk
 Call Trace:
  [<ffffffff8103e5ff>] warn_slowpath_common+0x7f/0xc0
  [<ffffffff8103e65a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff810c2ee3>] __ftrace_hash_rec_update+0x1e3/0x230
  [<ffffffff810c4f28>] ftrace_hash_move+0x28/0x1d0
  [<ffffffff811401cc>] ? kfree+0x2c/0x110
  [<ffffffff810c68ee>] ftrace_regex_release+0x8e/0x150
  [<ffffffff81149f1e>] __fput+0xae/0x220
  [<ffffffff8114a09e>] ____fput+0xe/0x10
  [<ffffffff8105fa22>] task_work_run+0x72/0x90
  [<ffffffff810028ec>] do_notify_resume+0x6c/0xc0
  [<ffffffff8126596e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
  [<ffffffff815c0f88>] int_signal+0x12/0x17
 ---[ end trace 793179526ee09b2c ]---

It was finally narrowed down to unloading a module that was being traced.

It was actually more than that. When functions are being traced, there's
a table of all functions that have a ref count of the number of active
tracers attached to that function. When a function trace callback is
registered to a function, the function's record ref count is incremented.
When it is unregistered, the function's record ref count is decremented.
If an inconsistency is detected (ref count goes below zero) the above
warning is shown and the function tracing is permanently disabled until
reboot.

The ftrace callback ops holds a hash of functions that it filters on
(and/or filters off). If the hash is empty, the default means to filter
all functions (for the filter_hash) or to disable no functions (for the
notrace_hash).

When a module is unloaded, it frees the function records that represent
the module functions. These records exist on their own pages, that is
function records for one module will not exist on the same page as
function records for other modules or even the core kernel.

Now when a module unloads, the records that represents its functions are
freed. When the module is loaded again, the records are recreated with
a default ref count of zero (unless there's a callback that traces all
functions, then they will also be traced, and the ref count will be
incremented).

The problem is that if an ftrace callback hash includes functions of the
module being unloaded, those hash entries will not be removed. If the
module is reloaded in the same location, the hash entries still point
to the functions of the module but the module's ref counts do not reflect
that.

With the help of Steve and Joern, we found a reproducer:

 Using uinput module and uinput_release function.

 cd /sys/kernel/debug/tracing
 modprobe uinput
 echo uinput_release > set_ftrace_filter
 echo function > current_tracer
 rmmod uinput
 modprobe uinput
 # check /proc/modules to see if loaded in same addr, otherwise try again
 echo nop > current_tracer

 [BOOM]

The above loads the uinput module, which creates a table of functions that
can be traced within the module.

We add uinput_release to the filter_hash to trace just that function.

Enable function tracincg, which increments the ref count of the record
associated to uinput_release.

Remove uinput, which frees the records including the one that represents
uinput_release.

Load the uinput module again (and make sure it's at the same address).
This recreates the function records all with a ref count of zero,
including uinput_release.

Disable function tracing, which will decrement the ref count for uinput_release
which is now zero because of the module removal and reload, and we have
a mismatch (below zero ref count).

The solution is to check all currently tracing ftrace callbacks to see if any
are tracing any of the module's functions when a module is loaded (it already does
that with callbacks that trace all functions). If a callback happens to have
a module function being traced, it increments that records ref count and starts
tracing that function.

There may be a strange side effect with this, where tracing module functions
on unload and then reloading a new module may have that new module's functions
being traced. This may be something that confuses the user, but it's not
a big deal. Another approach is to disable all callback hashes on module unload,
but this leaves some ftrace callbacks that may not be registered, but can
still have hashes tracing the module's function where ftrace doesn't know about
it. That situation can cause the same bug. This solution solves that case too.
Another benefit of this solution, is it is possible to trace a module's
function on unload and load.

Link: http://lkml.kernel.org/r/20130705142629.GA325@redhat.com

Reported-by: Jörn Engel <joern@logfs.org>
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Steve Hodgson <steve@purestorage.com>
Tested-by: Steve Hodgson <steve@purestorage.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/trace/ftrace.c |   71 +++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 62 insertions(+), 9 deletions(-)

--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2144,12 +2144,57 @@ static cycle_t		ftrace_update_time;
 static unsigned long	ftrace_update_cnt;
 unsigned long		ftrace_update_tot_cnt;
 
-static int ops_traces_mod(struct ftrace_ops *ops)
+static inline int ops_traces_mod(struct ftrace_ops *ops)
 {
-	struct ftrace_hash *hash;
+	/*
+	 * Filter_hash being empty will default to trace module.
+	 * But notrace hash requires a test of individual module functions.
+	 */
+	return ftrace_hash_empty(ops->filter_hash) &&
+		ftrace_hash_empty(ops->notrace_hash);
+}
+
+/*
+ * Check if the current ops references the record.
+ *
+ * If the ops traces all functions, then it was already accounted for.
+ * If the ops does not trace the current record function, skip it.
+ * If the ops ignores the function via notrace filter, skip it.
+ */
+static inline bool
+ops_references_rec(struct ftrace_ops *ops, struct dyn_ftrace *rec)
+{
+	/* If ops isn't enabled, ignore it */
+	if (!(ops->flags & FTRACE_OPS_FL_ENABLED))
+		return 0;
+
+	/* If ops traces all mods, we already accounted for it */
+	if (ops_traces_mod(ops))
+		return 0;
+
+	/* The function must be in the filter */
+	if (!ftrace_hash_empty(ops->filter_hash) &&
+	    !ftrace_lookup_ip(ops->filter_hash, rec->ip))
+		return 0;
+
+	/* If in notrace hash, we ignore it too */
+	if (ftrace_lookup_ip(ops->notrace_hash, rec->ip))
+		return 0;
+
+	return 1;
+}
+
+static int referenced_filters(struct dyn_ftrace *rec)
+{
+	struct ftrace_ops *ops;
+	int cnt = 0;
 
-	hash = ops->filter_hash;
-	return ftrace_hash_empty(hash);
+	for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
+		if (ops_references_rec(ops, rec))
+		    cnt++;
+	}
+
+	return cnt;
 }
 
 static int ftrace_update_code(struct module *mod)
@@ -2158,6 +2203,7 @@ static int ftrace_update_code(struct mod
 	struct dyn_ftrace *p;
 	cycle_t start, stop;
 	unsigned long ref = 0;
+	bool test = false;
 	int i;
 
 	/*
@@ -2171,9 +2217,12 @@ static int ftrace_update_code(struct mod
 
 		for (ops = ftrace_ops_list;
 		     ops != &ftrace_list_end; ops = ops->next) {
-			if (ops->flags & FTRACE_OPS_FL_ENABLED &&
-			    ops_traces_mod(ops))
-				ref++;
+			if (ops->flags & FTRACE_OPS_FL_ENABLED) {
+				if (ops_traces_mod(ops))
+					ref++;
+				else
+					test = true;
+			}
 		}
 	}
 
@@ -2183,12 +2232,16 @@ static int ftrace_update_code(struct mod
 	for (pg = ftrace_new_pgs; pg; pg = pg->next) {
 
 		for (i = 0; i < pg->index; i++) {
+			int cnt = ref;
+
 			/* If something went wrong, bail without enabling anything */
 			if (unlikely(ftrace_disabled))
 				return -1;
 
 			p = &pg->records[i];
-			p->flags = ref;
+			if (test)
+				cnt += referenced_filters(p);
+			p->flags = cnt;
 
 			/*
 			 * Do the initial record conversion from mcount jump
@@ -2208,7 +2261,7 @@ static int ftrace_update_code(struct mod
 			 * conversion puts the module to the correct state, thus
 			 * passing the ftrace_make_call check.
 			 */
-			if (ftrace_start_up && ref) {
+			if (ftrace_start_up && cnt) {
 				int failed = __ftrace_replace_code(p, 1);
 				if (failed)
 					ftrace_bug(failed, p->ip);



  parent reply	other threads:[~2013-08-27  1:08 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-27  1:07 [ 00/74] 3.10.10-stable review Greg Kroah-Hartman
2013-08-27  1:07 ` [ 01/74] KVM: s390: move kvm_guest_enter,exit closer to sie Greg Kroah-Hartman
2013-08-27  1:08 ` [ 02/74] mac80211: dont wait for TX status forever Greg Kroah-Hartman
2013-08-27  1:08 ` [ 03/74] ACPI: add _STA evaluation at do_acpi_find_child() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 04/74] ACPI: Try harder to resolve _ADR collisions for bridges Greg Kroah-Hartman
2013-08-27  1:08 ` [ 05/74] ARC: gdbserver breakage in Big-Endian configuration #1 Greg Kroah-Hartman
2013-08-27  1:08 ` [ 06/74] ARC: gdbserver breakage in Big-Endian configuration #2 Greg Kroah-Hartman
2013-08-27  1:08 ` [ 07/74] ARM: at91: at91sam9x5 RTC is not compatible with at91rm9200 one Greg Kroah-Hartman
2013-08-27  1:08 ` [ 08/74] NFC: llcp: Fix non blocking sockets connections Greg Kroah-Hartman
2013-08-27  1:08 ` [ 09/74] iwlwifi: mvm: correctly configure MCAST in AP mode Greg Kroah-Hartman
2013-08-27  1:08 ` [ 10/74] iwlwifi: mvm: fix " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 11/74] iwlwifi: mvm: properly tell the fw that a STA is awake Greg Kroah-Hartman
2013-08-27  1:08 ` [ 12/74] iwlwifi: mvm: dont set the MCAST queue in STAs queue list Greg Kroah-Hartman
2013-08-27  1:08 ` [ 13/74] iwlwifi: mvm: take the seqno from packet if transmit failed Greg Kroah-Hartman
2013-08-27  1:08 ` [ 14/74] iwlwifi: mvm: unregister leds when registration failed Greg Kroah-Hartman
2013-08-27  1:08 ` [ 15/74] iwlwifi: bump required firmware API version for 3160/7260 Greg Kroah-Hartman
2013-08-27  1:08 ` [ 16/74] iwlwifi: mvm: adjust firmware D3 configuration API Greg Kroah-Hartman
2013-08-27  1:08 ` [ 17/74] tracing: Do not call kmem_cache_free() on allocation failure Greg Kroah-Hartman
2013-08-27  1:08 ` [ 18/74] tracing/kprobe: Wait for disabling all running kprobe handlers Greg Kroah-Hartman
2013-08-27  1:08 ` [ 19/74] tracing: Introduce trace_create_cpu_file() and tracing_get_cpu() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 20/74] tracing: Change tracing_pipe_fops() to rely on tracing_get_cpu() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 21/74] tracing: Change tracing_buffers_fops " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 22/74] tracing: Change tracing_stats_fops " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 23/74] tracing: Change tracing_entries_fops " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 24/74] tracing: Change tracing_fops/snapshot_fops " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 25/74] ftrace: Add check for NULL regs if ops has SAVE_REGS set Greg Kroah-Hartman
2013-08-27  1:08 ` [ 26/74] tracing: Turn event/id->i_private into call->event.type Greg Kroah-Hartman
2013-08-27  1:08 ` [ 27/74] tracing: Change event_enable/disable_read() to verify i_private != NULL Greg Kroah-Hartman
2013-08-27  1:08 ` [ 28/74] tracing: Change event_filter_read/write " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 29/74] tracing: Change f_start() to take event_mutex and " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 30/74] tracing: Introduce remove_event_file_dir() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 31/74] tracing: Change remove_event_file_dir() to clear "d_subdirs"->i_private Greg Kroah-Hartman
2013-08-27  1:08 ` [ 32/74] tracing: trace_remove_event_call() should fail if call/file is in use Greg Kroah-Hartman
2013-08-27  1:08 ` [ 33/74] tracing/kprobes: Fail to unregister if probe event files are " Greg Kroah-Hartman
2013-08-27  1:08 ` [ 34/74] tracing/uprobes: " Greg Kroah-Hartman
2013-08-27  1:08 ` Greg Kroah-Hartman [this message]
2013-08-27  1:08 ` [ 36/74] xen/smp: initialize IPI vectors before marking CPU online Greg Kroah-Hartman
2013-08-27  1:08 ` [ 37/74] ARC: [lib] strchr breakage in Big-endian configuration Greg Kroah-Hartman
2013-08-27  1:08 ` [ 38/74] zd1201: do not use stack as URB transfer_buffer Greg Kroah-Hartman
2013-08-27  1:08 ` [ 39/74] VFS: collect_mounts() should return an ERR_PTR Greg Kroah-Hartman
2013-08-27  1:08 ` [ 40/74] x86: Dont clear olpc_ofw_header when sentinel is detected Greg Kroah-Hartman
2013-08-27  1:08 ` [ 41/74] xen/events: initialize local per-cpu mask for all possible events Greg Kroah-Hartman
2013-08-27  1:08 ` [ 42/74] xen/events: mask events when changing their VCPU binding Greg Kroah-Hartman
2013-08-27  1:08 ` [ 43/74] ARM: davinci: nand: specify ecc strength Greg Kroah-Hartman
2013-08-27  1:08 ` [ 44/74] ARM: at91/DT: fix at91sam9n12ek memory node Greg Kroah-Hartman
2013-08-27  1:08 ` [ 45/74] arm64: perf: fix array out of bounds access in armpmu_map_hw_event() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 46/74] arm64: perf: fix event validation for software group leaders Greg Kroah-Hartman
2013-08-27  1:08 ` [ 47/74] ARM: 7816/1: CONFIG_KUSER_HELPERS: fix help text Greg Kroah-Hartman
2013-08-27  1:08 ` [ 48/74] staging: comedi: bug-fix NULL pointer dereference on failed attach Greg Kroah-Hartman
2013-08-27  1:08 ` [ 49/74] drm/radeon/r7xx: fix copy paste typo in golden register setup Greg Kroah-Hartman
2013-08-27  1:08 ` [ 50/74] drm/radeon: fix UVD message buffer validation Greg Kroah-Hartman
2013-08-27  1:08 ` [ 51/74] drm/radeon: fix WREG32_OR macro setting bits in a register Greg Kroah-Hartman
2013-08-27  1:08 ` [ 52/74] drm/i915: Invalidate TLBs for the rings after a reset Greg Kroah-Hartman
2013-08-27  1:08 ` [ 53/74] of: fdt: fix memory initialization for expanded DT Greg Kroah-Hartman
2013-08-27  1:08 ` [ 54/74] nilfs2: remove double bio_put() in nilfs_end_bio_write() for BIO_EOPNOTSUPP error Greg Kroah-Hartman
2013-08-27  1:08 ` [ 55/74] nilfs2: fix issue with counting number of bio requests for BIO_EOPNOTSUPP error detection Greg Kroah-Hartman
2013-08-27  1:08 ` [ 56/74] drivers/platform/olpc/olpc-ec.c: initialise earlier Greg Kroah-Hartman
2013-08-27  1:08 ` [ 57/74] usb: phy: fix build breakage Greg Kroah-Hartman
2013-08-27  1:08 ` [ 58/74] sata_fsl: save irqs while coalescing Greg Kroah-Hartman
2013-08-27  1:08 ` [ 59/74] Hostap: copying wrong data prism2_ioctl_giwaplist() Greg Kroah-Hartman
2013-08-27  1:08 ` [ 60/74] libata: apply behavioral quirks to sil3826 PMP Greg Kroah-Hartman
2013-08-27  1:08 ` [ 61/74] iwlwifi: dvm: fix calling ieee80211_chswitch_done() with NULL Greg Kroah-Hartman
2013-08-27  1:09 ` [ 62/74] iwlwifi: pcie: disable L1 Active after pci_enable_device Greg Kroah-Hartman
2013-08-27  1:09 ` [ 63/74] SCSI: zfcp: fix lock imbalance by reworking request queue locking Greg Kroah-Hartman
2013-08-27  1:09 ` [ 64/74] SCSI: zfcp: fix schedule-inside-lock in scsi_device list loops Greg Kroah-Hartman
2013-08-27  1:09 ` [ 65/74] SCSI: lpfc: Dont force CONFIG_GENERIC_CSUM on Greg Kroah-Hartman
2013-08-27  1:09 ` [ 66/74] SCSI: sg: Fix user memory corruption when SG_IO is interrupted by a signal Greg Kroah-Hartman
2013-08-27  1:09 ` [ 67/74] Revert "x86 get_unmapped_area(): use proper mmap base for bottom-up direction" Greg Kroah-Hartman
2013-08-27  1:09 ` [ 68/74] x86 get_unmapped_area: Access mmap_legacy_base through mm_struct member Greg Kroah-Hartman
2013-08-27  1:09 ` [ 69/74] x86/xen: do not identity map UNUSABLE regions in the machine E820 Greg Kroah-Hartman
2013-08-27  1:09 ` [ 70/74] mei: me: fix reset state machine Greg Kroah-Hartman
2013-08-27  1:09 ` [ 71/74] mei: dont have to clean the state on power up Greg Kroah-Hartman
2013-08-27  1:09 ` [ 72/74] mei: me: fix waiting for hw ready Greg Kroah-Hartman
2013-08-27  1:09 ` [ 73/74] md: bcache: io.c: fix a potential NULL pointer dereference Greg Kroah-Hartman
2013-08-27  1:09 ` [ 74/74] bcache: FUA fixes Greg Kroah-Hartman
2013-08-27  4:31 ` [ 00/74] 3.10.10-stable review Guenter Roeck
2013-08-27 22:27   ` Greg Kroah-Hartman
2013-08-27 20:41 ` Shuah Khan
2013-08-27 22:27   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130827010431.989598015@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davej@redhat.com \
    --cc=joern@logfs.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    --cc=steve@purestorage.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).