public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	ltt-dev@lists.casi.polymtl.ca,
	Peter Zijlstra <peterz@infradead.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Arjan van de Ven <arjan@infradead.org>,
	Pekka Paalanen <pq@iki.fi>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Martin Bligh <mbligh@google.com>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	Tom Zanussi <tzanussi@gmail.com>,
	Masami Hiramatsu <mhiramat@redhat.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Jason Baron <jbaron@redhat.com>,
	Christoph Hellwig <hch@infradead.org>,
	Jiaying Zhang <jiayingz@google.com>,
	Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>,
	mrubin@google.com, md@google.com
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Subject: [RFC patch 19/41] Sort module list by pointer address to get coherent sleepable seq_file iterators
Date: Thu, 05 Mar 2009 17:47:47 -0500	[thread overview]
Message-ID: <20090305225515.964678663@polymtl.ca> (raw)
In-Reply-To: 20090305224728.947235917@polymtl.ca

[-- Attachment #1: module.c-sort-module-list.patch --]
[-- Type: text/plain, Size: 5867 bytes --]

A race that appears both in /proc/modules and in kallsyms: if, between the
seq file reads, the process is put to sleep and at this moment a module is
or removed from the module list, the listing will skip an amount of
modules/symbols corresponding to the amount of elements present in the unloaded
module, but at the current position in the list if the iteration is located
after the removed module.

The cleanest way I found to deal with this problem is to sort the module list.
We can then keep the old struct module * as the old iterator, knowing the it may
be removed between the seq file reads, but we only use it as "get next". If it
is not present in the module list, the next pointer will be used.

By doing this, removing a given module will now only fuzz the output related to
this specific module, not any random module anymore. Since modprobe uses
/proc/modules, it might be important to make sure multiple concurrent running
modprobes won't interfere with each other.


Small test program for this:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define BUFSIZE 1024

int main()
{
	int fd = open("/proc/modules", O_RDONLY);
	char buf[BUFSIZE];
	ssize_t size;

	do {
		size = read(fd, buf, 1);
		printf("%c", buf[0]);
		usleep(100000);
	} while(size > 0);

	close(fd);
	return 0;
}


Actual test (kernel 2.6.23-rc3):

dijkstra:~# lsmod
Module                  Size  Used by
pl2303                 18564  0
usbserial              29032  1 pl2303
ppdev                   7844  0
sky2                   37476  0
skge                   36368  0
rtc                    10104  0
snd_hda_intel         265628  0

  (here, while we are printing the 2nd line, I rmmod pl2303)
compudj@dijkstra:~/test$ ./module
pl2303 18564 0 - Live 0xf886e000
usbserial 29032 1 pl2303, Live 0xf8865000
sky2 37476 0 - Live 0xf884f000
skge 36368 0 - Live 0xf8838000
rtc 10104 0 - Live 0xf8825000

We see the the 2nd line is garbage.

Now, with my patch applied:
  (here, while we are printing the rtc module, I rmmod rtc)
nd_hda_intel 268708 0 - Live 0xf8820000
ltt_control 2372 0 - Live 0xf8866000
rtc 10392 0 - Live 0xf886d000
skge 36768 0 - Live 0xf8871000
ltt_statedump 8516 0 - Live 0xf887b000

We see that since the rtc line was already in the buffer, it has been
printed completely.


  (here, while we are printing the skge module, I rmmod rtc)
snd_hda_intel 268708 0 - Live 0xf8820000
ltt_control 2372 0 - Live 0xf8866000
rtc 10392 0 - Live 0xf886d000
skge 36768 0 - Live 0xf8871000
ltt_statedump 8516 0 - Live 0xf887b000
sky2 38420 0 - Live 0xf88cd000

We see that the iteration continued at the same position even though the
rtc module, located in earlier addresses, was removed.

Changelog:

When reading the data by small chunks (i.e. byte by byte), the index (ppos) is
incremented by seq_read() directly and no "next" callback is called when going
to the next module.

Therefore, use ppos instead of m->private to deal with the fact that this index
is incremented directly to pass to the next module in seq_read() after the
buffer has been emptied.

Before fix, it prints the first module indefinitely. The patch fixes
this.

Changelog:
- Remove module_mutex usage: depend on functions implemented in module.c for
  that.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
---
 kernel/module.c |   27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

Index: linux-2.6-lttng/kernel/module.c
===================================================================
--- linux-2.6-lttng.orig/kernel/module.c	2009-03-05 15:29:39.000000000 -0500
+++ linux-2.6-lttng/kernel/module.c	2009-03-05 15:45:33.000000000 -0500
@@ -66,7 +66,9 @@
 #define INIT_OFFSET_MASK (1UL << (BITS_PER_LONG-1))
 
 /* List of modules, protected by module_mutex or preempt_disable
- * (delete uses stop_machine/add uses RCU list operations). */
+ * (delete uses stop_machine/add uses RCU list operations).
+ * Sorted by ascending list node address.
+ */
 static DEFINE_MUTEX(module_mutex);
 static LIST_HEAD(modules);
 
@@ -1877,6 +1879,7 @@ static noinline struct module *load_modu
 	void *percpu = NULL, *ptr = NULL; /* Stops spurious gcc warning */
 	unsigned long *mseg;
 	mm_segment_t old_fs;
+	struct module *iter;
 
 	DEBUGP("load_module: umod=%p, len=%lu, uargs=%p\n",
 	       umod, len, uargs);
@@ -2260,8 +2263,24 @@ static noinline struct module *load_modu
 	 * function to insert in a way safe to concurrent readers.
 	 * The mutex protects against concurrent writers.
 	 */
-	list_add_rcu(&mod->list, &modules);
 
+	/*
+	 * We sort the modules by struct module pointer address to permit
+	 * correct iteration over modules of, at least, kallsyms for preemptible
+	 * operations, such as read(). Sorting by struct module pointer address
+	 * is equivalent to sort by list node address.
+	 */
+	list_for_each_entry_reverse(iter, &modules, list) {
+		BUG_ON(iter == mod);	/* Should never be in the list twice */
+		if (iter < mod) {
+			/* We belong to the location right after iter. */
+			list_add_rcu(&mod->list, &iter->list);
+			goto module_added;
+		}
+	}
+	/* We should be added at the head of the list */
+	list_add_rcu(&mod->list, &modules);
+module_added:
 	err = parse_args(mod->name, mod->args, kp, num_kp, NULL);
 	if (err < 0)
 		goto unlink;
@@ -2618,12 +2637,12 @@ static char *module_flags(struct module 
 static void *m_start(struct seq_file *m, loff_t *pos)
 {
 	mutex_lock(&module_mutex);
-	return seq_list_start(&modules, *pos);
+	return seq_sorted_list_start(&modules, pos);
 }
 
 static void *m_next(struct seq_file *m, void *p, loff_t *pos)
 {
-	return seq_list_next(p, &modules, pos);
+	return seq_sorted_list_next(p, &modules, pos);
 }
 
 static void m_stop(struct seq_file *m, void *p)

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  parent reply	other threads:[~2009-03-05 23:29 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-05 22:47 [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9 Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 01/41] LTTng - core header Mathieu Desnoyers
2009-03-06 18:37   ` Steven Rostedt
2009-03-05 22:47 ` [RFC patch 02/41] LTTng - core data structures Mathieu Desnoyers
2009-03-06 18:41   ` Steven Rostedt
2009-03-05 22:47 ` [RFC patch 03/41] LTTng core x86 Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 04/41] LTTng core powerpc Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 05/41] LTTng relay buffer allocation, read, write Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 06/41] LTTng optimize write to page function Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 07/41] LTTng dynamic channels Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 08/41] LTTng - tracer header Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 09/41] LTTng optimize write to page function deal with unaligned access Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 10/41] lttng-optimize-write-to-page-function-remove-some-memcpy-calls Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 11/41] ltt-relay: cache pages address Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 12/41] x86 : export vmalloc_sync_all() Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 13/41] LTTng - tracer code Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 14/41] Splice and pipe : export pipe buf operations for GPL modules Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 15/41] Poll : add poll_wait_set_exclusive Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 16/41] LTTng Transport Locked Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 17/41] LTTng - serialization Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 18/41] Seq_file add support for sorted list Mathieu Desnoyers
2009-03-05 22:47 ` Mathieu Desnoyers [this message]
2009-03-05 22:47 ` [RFC patch 20/41] Linux Kernel Markers - Iterator Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 21/41] LTTng probes specialized tracepoints Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 22/41] LTTng marker control Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 23/41] Immediate Values Stub header Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 24/41] Linux Kernel Markers - Use Immediate Values Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 25/41] Markers Support for Proprierary Modules Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 26/41] Marers remove old comment Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 27/41] Markers use dynamic channels Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 28/41] LTT trace control Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 29/41] LTTng menus Mathieu Desnoyers
2009-03-05 23:35   ` Randy Dunlap
2009-03-05 23:47     ` Mathieu Desnoyers
2009-03-05 23:51       ` Randy Dunlap
2009-03-06  0:01         ` [ltt-dev] " Mathieu Desnoyers
2009-03-06  0:12           ` Randy Dunlap
2009-03-05 22:47 ` [RFC patch 30/41] LTTng build Mathieu Desnoyers
2009-03-05 22:47 ` [RFC patch 31/41] LTTng userspace event v2 Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 32/41] LTTng filter Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 33/41] LTTng dynamic tracing support with kprobes Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 34/41] Marker header API update Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 35/41] Marker " Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 36/41] kvm markers " Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 37/41] Markers : multi-probes test Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 38/41] Markers examples API update Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 39/41] SPUFS markers " Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 40/41] EXT4: instrumentation with tracepoints Mathieu Desnoyers
2009-03-05 22:48 ` [RFC patch 41/41] JBD2: use tracepoints for instrumentation Mathieu Desnoyers
2009-03-06 10:11 ` [RFC patch 00/41] LTTng 0.105 core for Linux 2.6.27-rc9 Ingo Molnar
2009-03-06 19:02   ` Mathieu Desnoyers
2009-03-11 18:32     ` Ingo Molnar
2009-03-13 16:18       ` Mathieu Desnoyers
2009-03-14 16:43         ` Ingo Molnar
2009-03-14 16:59           ` [ltt-dev] " Mathieu Desnoyers
2009-03-06 18:34 ` Steven Rostedt
2009-03-06 19:01   ` Frederic Weisbecker

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090305225515.964678663@polymtl.ca \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=acme@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=arjan@infradead.org \
    --cc=eduard.munteanu@linux360.ro \
    --cc=fche@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=hch@infradead.org \
    --cc=hpa@zytor.com \
    --cc=jbaron@redhat.com \
    --cc=jiayingz@google.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ltt-dev@lists.casi.polymtl.ca \
    --cc=mbligh@google.com \
    --cc=md@google.com \
    --cc=mhiramat@redhat.com \
    --cc=mingo@elte.hu \
    --cc=mrubin@google.com \
    --cc=peterz@infradead.org \
    --cc=pq@iki.fi \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    --cc=tzanussi@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox