* [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2
@ 2010-01-05 2:15 Andi Kleen
2010-01-05 2:15 ` [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2 Andi Kleen
` (8 more replies)
0 siblings, 9 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: ebiederm, paulmck, akpm, linux-kernel
v2 version that addresses all the earlier review comments
minus patches that were already merged.
I think this one is ready for merge now. Andrew, could you please
take it?
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
With BKL-less sysctls most of the writable string sysctls are racy. There
is no locking on the reader side, so a reader could see an inconsistent
string or worse miss the terminating null and walk of beyond it.
This patch kit adds a new "rcu string" variant to avoid these
problems and convers the racy users. One the writer side the strings are
always copied to new memory and the readers use rcu_read_lock()
to get a stable view. For readers who access the string over
sleeps the reader copies the string.
This is all hidden in a new generic "rcu_string" ADT which can be also
used for other purposes.
This finally implements all the letters in RCU, most other users
leave out the 'C'.
I left some obscure users in architectures (sparc, mips) alone and audited
all of the others. The sparc reboot_cmd one has references to asm files
which I didn't want to touch and the mips variant seemd just too obscure.
All the others are not racy.
-Andi
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
@ 2010-01-05 2:15 ` Andi Kleen
2010-01-05 5:32 ` Paul E. McKenney
` (2 more replies)
2010-01-05 2:15 ` [PATCH] [2/9] Add a kernel_address() that works for data too Andi Kleen
` (7 subsequent siblings)
8 siblings, 3 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: paulmck, ebiederm, paulmck, akpm, linux-kernel
Add a little ADT for RCU protected strings. RCU is a convenient
way to manage modifications to read-often-write-seldom strings.
Add some helper functions to make this more straight forward.
Used by follow-on patches to implement RCU protected sysctl strings.
* General rules:
* Reader has to use rcu_read_lock() and not sleep while accessing the string,
* or alternatively get a copy with access_rcu_string()
* Writer needs an own lock against each other.
* Each modification should allocate a new string first and free the old
* one with free_rcu_string()
* In writers use rcu_assign_pointer to publicize the updated string to
* global readers.
* The size passed to access_rcu_string() must be the same as passed
* to alloc_rcu_string() and be known in advance. Don't use strlen()!
*
* For sysctls also see proc_rcu_string() as a convenient wrapper
v2: Use rcu_dereference correctly
Cc: paulmck@linux.vnet.ibm.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/rcustring.h | 20 ++++++++
lib/Makefile | 3 -
lib/rcustring.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 137 insertions(+), 1 deletion(-)
Index: linux-2.6.33-rc2-ak/include/linux/rcustring.h
===================================================================
--- /dev/null
+++ linux-2.6.33-rc2-ak/include/linux/rcustring.h
@@ -0,0 +1,20 @@
+#ifndef _RCUSTRING_H
+#define _RCUSTRING_H 1
+
+#include <linux/gfp.h>
+#include <linux/rcupdate.h>
+
+/*
+ * Simple wrapper to manage strings by RCU.
+ */
+
+extern char *alloc_rcu_string(int size, gfp_t gfp);
+extern void free_rcu_string(const char *string);
+
+/*
+ * size must be the same as alloc_rcu_string, don't
+ * use strlen on str!
+ */
+extern char *access_rcu_string(char **str, int size, gfp_t gfp);
+
+#endif
Index: linux-2.6.33-rc2-ak/lib/rcustring.c
===================================================================
--- /dev/null
+++ linux-2.6.33-rc2-ak/lib/rcustring.c
@@ -0,0 +1,115 @@
+/*
+ * Manage strings by Read-Copy-Update. This is useful for global strings
+ * that change only very rarely, but are read often.
+ *
+ * Author: Andi Kleen
+ *
+ * General rules:
+ * Reader has to use rcu_read_lock() and not sleep while accessing the string,
+ * or alternatively get a copy with access_rcu_string()
+ * Writer needs an own lock against each other.
+ * Each modification should allocate a new string first and free the old
+ * one with free_rcu_string()
+ * In writers use rcu_assign_pointer to publicize the updated string to
+ * global readers.
+ * The size passed to access_rcu_string() must be the same as passed
+ * to alloc_rcu_string() and be known in advance. Don't use strlen()!
+ * The pointer passed to access_rcu_string must be to the variable modified
+ * by the writer.
+ *
+ * For sysctls also see proc_rcu_string() as a convenient wrapper
+ *
+ * Typical example:
+ * #define MAX_GLOBAL_SIZE ...
+ * char *global = "default";
+ *
+ * Rare writer:
+ * char *old, *new;
+ * DECLARE_MUTEX(writer_lock);
+ * mutex_lock(&writer_lock);
+ * new = alloc_rcu_string(MAX_GLOBAL_SIZE, GFP_KERNEL);
+ * if (!new) {
+ * mutex_unlock(&writer_lock);
+ * return -ENOMEM;
+ * }
+ * strlcpy(new, new_value, MAX_GLOBAL_SIZE);
+ * old = global;
+ * rcu_assign_pointer(global, new);
+ * mutex_unlock(&writer_lock);
+ * free_rcu_string(old);
+ *
+ * Sleepy reader:
+ * char *str = access_rcu_string(&global, MAX_GLOBAL_SIZE, GFP_KERNEL);
+ * if (!str)
+ * return -ENOMEM;
+ * ... use str while sleeping ...
+ * kfree(string);
+ *
+ * Non sleepy reader:
+ * rcu_read_lock();
+ * str = rcu_dereference(&global);
+ * ... use str without sleeping ...
+ * rcu_read_unlock();
+ *
+ * Note this code could be relatively easily generalized for other kinds
+ * of non-atomic data, but this initial version only handles strings.
+ * Only need to change the strlcpy() below to memcpy()
+ */
+#include <linux/kernel.h>
+#include <linux/rcustring.h>
+#include <linux/slab.h>
+#include <linux/rcupdate.h>
+#include <linux/module.h>
+#include <linux/string.h>
+
+struct rcu_string {
+ struct rcu_head rcu;
+ char str[0];
+};
+
+char *alloc_rcu_string(int size, gfp_t gfp)
+{
+ struct rcu_string *rs = kmalloc(sizeof(struct rcu_string) + size, gfp);
+ if (!rs)
+ return NULL;
+ return rs->str;
+}
+EXPORT_SYMBOL(alloc_rcu_string);
+
+static void do_free_rcu_string(struct rcu_head *h)
+{
+ kfree(container_of(h, struct rcu_string, rcu));
+}
+
+static inline struct rcu_string *str_to_rcustr(const char *str)
+{
+ /*
+ * Opencoded container_of because the strict type checking
+ * in normal container_of cannot deal with char str[0] vs char *str.
+ */
+ return (struct rcu_string *)(str - offsetof(struct rcu_string, str));
+}
+
+void free_rcu_string(const char *str)
+{
+ struct rcu_string *rs = str_to_rcustr(str);
+ call_rcu(&rs->rcu, do_free_rcu_string);
+}
+EXPORT_SYMBOL(free_rcu_string);
+
+/*
+ * Get a local private copy of a RCU protected string.
+ * Mostly useful to get a string that is stable while sleeping.
+ * Caller must free returned string.
+ */
+char *access_rcu_string(char **str, int size, gfp_t gfp)
+{
+ char *copy = kmalloc(size, gfp);
+ if (!str)
+ return NULL;
+ rcu_read_lock();
+ strlcpy(copy, rcu_dereference(*str), size);
+ rcu_read_unlock();
+ return copy;
+}
+EXPORT_SYMBOL(access_rcu_string);
Index: linux-2.6.33-rc2-ak/lib/Makefile
===================================================================
--- linux-2.6.33-rc2-ak.orig/lib/Makefile
+++ linux-2.6.33-rc2-ak/lib/Makefile
@@ -12,7 +12,8 @@ lib-y := ctype.o string.o vsprintf.o cmd
idr.o int_sqrt.o extable.o prio_tree.o \
sha1.o irq_regs.o reciprocal_div.o argv_split.o \
proportions.o prio_heap.o ratelimit.o show_mem.o \
- is_single_threaded.o plist.o decompress.o flex_array.o
+ is_single_threaded.o plist.o decompress.o flex_array.o \
+ rcustring.o
lib-$(CONFIG_MMU) += ioremap.o
lib-$(CONFIG_SMP) += cpumask.o
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] [2/9] Add a kernel_address() that works for data too
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
2010-01-05 2:15 ` [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2 Andi Kleen
@ 2010-01-05 2:15 ` Andi Kleen
2010-01-05 8:44 ` Russell King
2010-01-05 8:58 ` Sam Ravnborg
2010-01-05 2:15 ` [PATCH] [3/9] SYSCTL: Add proc_rcu_string to manage sysctls using rcu strings Andi Kleen
` (6 subsequent siblings)
8 siblings, 2 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: linux-arch, ebiederm, paulmck, akpm, linux-kernel
Add a variant of kernel_text_address() that includes kernel data.
Assumes kernel is _text ... _end - init section. True everywhere?
Cc: linux-arch@vger.kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/kernel.h | 1 +
kernel/extable.c | 18 ++++++++++++++++++
2 files changed, 19 insertions(+)
Index: linux-2.6.33-rc2-ak/include/linux/kernel.h
===================================================================
--- linux-2.6.33-rc2-ak.orig/include/linux/kernel.h
+++ linux-2.6.33-rc2-ak/include/linux/kernel.h
@@ -205,6 +205,7 @@ extern unsigned long long memparse(const
extern int core_kernel_text(unsigned long addr);
extern int __kernel_text_address(unsigned long addr);
extern int kernel_text_address(unsigned long addr);
+extern int kernel_address(unsigned long addr);
extern int func_ptr_is_kernel_text(void *ptr);
struct pid;
Index: linux-2.6.33-rc2-ak/kernel/extable.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/kernel/extable.c
+++ linux-2.6.33-rc2-ak/kernel/extable.c
@@ -72,6 +72,18 @@ int core_kernel_text(unsigned long addr)
return 0;
}
+static int core_kernel_address(unsigned long addr)
+{
+ if ((addr >= (unsigned long)_text &&
+ addr <= (unsigned long)_end)) {
+ if (addr >= (unsigned long)__init_begin &&
+ addr < (unsigned long)__init_end)
+ return system_state == SYSTEM_BOOTING;
+ return 1;
+ }
+ return 0;
+}
+
int __kernel_text_address(unsigned long addr)
{
if (core_kernel_text(addr))
@@ -98,6 +110,12 @@ int kernel_text_address(unsigned long ad
return is_module_text_address(addr);
}
+/* text or data in core kernel or module */
+int kernel_address(unsigned long addr)
+{
+ return core_kernel_address(addr) || is_module_address(addr);
+}
+
/*
* On some architectures (PPC64, IA64) function pointers
* are actually only tokens to some data that then holds the
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] [3/9] SYSCTL: Add proc_rcu_string to manage sysctls using rcu strings
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
2010-01-05 2:15 ` [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2 Andi Kleen
2010-01-05 2:15 ` [PATCH] [2/9] Add a kernel_address() that works for data too Andi Kleen
@ 2010-01-05 2:15 ` Andi Kleen
2010-01-05 2:15 ` [PATCH] [4/9] SYSCTL: Use RCU strings for core_pattern sysctl Andi Kleen
` (5 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: ebiederm, paulmck, akpm, linux-kernel
Add a helper to use the new rcu strings for managing access
to text sysctls. Conversions will be in follow-on patches.
An alternative would be to use seqlocks here, but RCU seemed
cleaner.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/sysctl.h | 2 +
kernel/sysctl.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++
kernel/sysctl_check.c | 1
3 files changed, 69 insertions(+)
Index: linux-2.6.33-rc2-ak/include/linux/sysctl.h
===================================================================
--- linux-2.6.33-rc2-ak.orig/include/linux/sysctl.h
+++ linux-2.6.33-rc2-ak/include/linux/sysctl.h
@@ -969,6 +969,8 @@ typedef int proc_handler (struct ctl_tab
extern int proc_dostring(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
+extern int proc_rcu_string(struct ctl_table *, int,
+ void __user *, size_t *, loff_t *);
extern int proc_dointvec(struct ctl_table *, int,
void __user *, size_t *, loff_t *);
extern int proc_dointvec_minmax(struct ctl_table *, int,
Index: linux-2.6.33-rc2-ak/kernel/sysctl.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/kernel/sysctl.c
+++ linux-2.6.33-rc2-ak/kernel/sysctl.c
@@ -50,6 +50,7 @@
#include <linux/ftrace.h>
#include <linux/slow-work.h>
#include <linux/perf_event.h>
+#include <linux/rcustring.h>
#include <asm/uaccess.h>
#include <asm/processor.h>
@@ -2016,6 +2017,60 @@ static int _proc_do_string(void* data, i
}
/**
+ * proc_rcu_string - sysctl string with rcu protection
+ * @table: the sysctl table
+ * @write: %TRUE if this is a write to the sysctl file
+ * @buffer: the user buffer
+ * @lenp: the size of the user buffer
+ * @ppos: file position
+ *
+ * Handle a string sysctl similar to proc_dostring.
+ * The main difference is that the data pointer in the table
+ * points to a pointer to a string. The string should be initially
+ * pointing to a statically allocated (as a C object, not on the heap)
+ * default. When it is replaced old uses will be protected by
+ * RCU. The reader should use rcu_read_lock()/unlock() or
+ * access_rcu_string().
+ */
+int proc_rcu_string(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ int ret;
+
+ if (write) {
+ /* protect writers against each other */
+ static DEFINE_MUTEX(rcu_string_mutex);
+ char *old;
+ char *new;
+
+ new = alloc_rcu_string(table->maxlen, GFP_KERNEL);
+ if (!new)
+ return -ENOMEM;
+ mutex_lock(&rcu_string_mutex);
+ old = *(char **)(table->data);
+ strcpy(new, old);
+ ret = _proc_do_string(new, table->maxlen, write, buffer, lenp, ppos);
+ rcu_assign_pointer(*(char **)(table->data), new);
+ /*
+ * For the first initialization allow constant strings.
+ */
+ if (!kernel_address((unsigned long)old))
+ free_rcu_string(old);
+ mutex_unlock(&rcu_string_mutex);
+ } else {
+ char *str;
+
+ str = access_rcu_string((char **)(table->data), table->maxlen,
+ GFP_KERNEL);
+ if (!str)
+ return -ENOMEM;
+ ret = _proc_do_string(str, table->maxlen, write, buffer, lenp, ppos);
+ kfree(str);
+ }
+ return ret;
+}
+
+/**
* proc_dostring - read a string sysctl
* @table: the sysctl table
* @write: %TRUE if this is a write to the sysctl file
@@ -2030,6 +2085,10 @@ static int _proc_do_string(void* data, i
* and a newline '\n' is added. It is truncated if the buffer is
* not large enough.
*
+ * WARNING: this should be only used for read only strings
+ * or when you have a wrapper with special locking. Otherwise
+ * use proc_rcu_string to avoid races with the consumer.
+ *
* Returns 0 on success.
*/
int proc_dostring(struct ctl_table *table, int write,
@@ -2614,6 +2673,12 @@ int proc_dostring(struct ctl_table *tabl
return -ENOSYS;
}
+int proc_rcu_string(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ return -ENOSYS;
+}
+
int proc_dointvec(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
@@ -2670,6 +2735,7 @@ EXPORT_SYMBOL(proc_dointvec_minmax);
EXPORT_SYMBOL(proc_dointvec_userhz_jiffies);
EXPORT_SYMBOL(proc_dointvec_ms_jiffies);
EXPORT_SYMBOL(proc_dostring);
+EXPORT_SYMBOL(proc_rcu_string);
EXPORT_SYMBOL(proc_doulongvec_minmax);
EXPORT_SYMBOL(proc_doulongvec_ms_jiffies_minmax);
EXPORT_SYMBOL(register_sysctl_table);
Index: linux-2.6.33-rc2-ak/kernel/sysctl_check.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/kernel/sysctl_check.c
+++ linux-2.6.33-rc2-ak/kernel/sysctl_check.c
@@ -131,6 +131,7 @@ int sysctl_check_table(struct nsproxy *n
set_fail(&fail, table, "Directory with extra2");
} else {
if ((table->proc_handler == proc_dostring) ||
+ (table->proc_handler == proc_rcu_string) ||
(table->proc_handler == proc_dointvec) ||
(table->proc_handler == proc_dointvec_minmax) ||
(table->proc_handler == proc_dointvec_jiffies) ||
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] [4/9] SYSCTL: Use RCU strings for core_pattern sysctl
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
` (2 preceding siblings ...)
2010-01-05 2:15 ` [PATCH] [3/9] SYSCTL: Add proc_rcu_string to manage sysctls using rcu strings Andi Kleen
@ 2010-01-05 2:15 ` Andi Kleen
2010-01-05 6:56 ` Eric W. Biederman
2010-01-05 2:15 ` [PATCH] [5/9] SYSCTL: Add call_usermodehelper_cleanup() Andi Kleen
` (4 subsequent siblings)
8 siblings, 1 reply; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: ebiederm, paulmck, akpm, linux-kernel
Also saves ~220 bytes in the data segment for default kernels.
As a bonus this removes one use of the BKL.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
fs/exec.c | 11 +++++------
kernel/sysctl.c | 6 +++---
2 files changed, 8 insertions(+), 9 deletions(-)
Index: linux-2.6.33-rc2-ak/fs/exec.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/fs/exec.c
+++ linux-2.6.33-rc2-ak/fs/exec.c
@@ -62,7 +62,7 @@
#include "internal.h"
int core_uses_pid;
-char core_pattern[CORENAME_MAX_SIZE] = "core";
+char *core_pattern = "core";
unsigned int core_pipe_limit;
int suid_dumpable = 0;
@@ -1421,7 +1421,7 @@ EXPORT_SYMBOL(set_binfmt);
static int format_corename(char *corename, long signr)
{
const struct cred *cred = current_cred();
- const char *pat_ptr = core_pattern;
+ const char *pat_ptr = rcu_dereference(core_pattern);
int ispipe = (*pat_ptr == '|');
char *out_ptr = corename;
char *const out_end = corename + CORENAME_MAX_SIZE;
@@ -1825,12 +1825,11 @@ void do_coredump(long signr, int exit_co
clear_thread_flag(TIF_SIGPENDING);
/*
- * lock_kernel() because format_corename() is controlled by sysctl, which
- * uses lock_kernel()
+ * Protect corename by RCU vs proc_rcu_string()
*/
- lock_kernel();
+ rcu_read_lock();
ispipe = format_corename(corename, signr);
- unlock_kernel();
+ rcu_read_unlock();
if ((!ispipe) && (cprm.limit < binfmt->min_coredump))
goto fail_unlock;
Index: linux-2.6.33-rc2-ak/kernel/sysctl.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/kernel/sysctl.c
+++ linux-2.6.33-rc2-ak/kernel/sysctl.c
@@ -75,7 +75,7 @@ extern int sysctl_oom_dump_tasks;
extern int max_threads;
extern int core_uses_pid;
extern int suid_dumpable;
-extern char core_pattern[];
+extern char *core_pattern;
extern unsigned int core_pipe_limit;
extern int pid_max;
extern int min_free_kbytes;
@@ -399,10 +399,10 @@ static struct ctl_table kern_table[] = {
},
{
.procname = "core_pattern",
- .data = core_pattern,
+ .data = &core_pattern,
.maxlen = CORENAME_MAX_SIZE,
.mode = 0644,
- .proc_handler = proc_dostring,
+ .proc_handler = proc_rcu_string,
},
{
.procname = "core_pipe_limit",
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] [5/9] SYSCTL: Add call_usermodehelper_cleanup()
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
` (3 preceding siblings ...)
2010-01-05 2:15 ` [PATCH] [4/9] SYSCTL: Use RCU strings for core_pattern sysctl Andi Kleen
@ 2010-01-05 2:15 ` Andi Kleen
2010-01-05 2:15 ` [PATCH] [6/9] SYSCTL: Convert modprobe_path to proc_rcu_string() Andi Kleen
` (3 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: ebiederm, paulmck, akpm, linux-kernel
This is the same as call_usermodehelper(), but with an cleanup callback.
Needed for some of the followon proc_rcu_string() users.
This avoids open coding this.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/kmod.h | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
Index: linux-2.6.33-rc2-ak/include/linux/kmod.h
===================================================================
--- linux-2.6.33-rc2-ak.orig/include/linux/kmod.h
+++ linux-2.6.33-rc2-ak/include/linux/kmod.h
@@ -72,7 +72,8 @@ int call_usermodehelper_exec(struct subp
void call_usermodehelper_freeinfo(struct subprocess_info *info);
static inline int
-call_usermodehelper(char *path, char **argv, char **envp, enum umh_wait wait)
+call_usermodehelper_cleanup(char *path, char **argv, char **envp, enum umh_wait wait,
+ void (*cleanup)(char **, char **))
{
struct subprocess_info *info;
gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
@@ -80,10 +81,18 @@ call_usermodehelper(char *path, char **a
info = call_usermodehelper_setup(path, argv, envp, gfp_mask);
if (info == NULL)
return -ENOMEM;
+ if (cleanup)
+ call_usermodehelper_setcleanup(info, cleanup);
return call_usermodehelper_exec(info, wait);
}
static inline int
+call_usermodehelper(char *path, char **argv, char **envp, enum umh_wait wait)
+{
+ return call_usermodehelper_cleanup(path, argv, envp, wait, NULL);
+}
+
+static inline int
call_usermodehelper_keys(char *path, char **argv, char **envp,
struct key *session_keyring, enum umh_wait wait)
{
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] [6/9] SYSCTL: Convert modprobe_path to proc_rcu_string()
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
` (4 preceding siblings ...)
2010-01-05 2:15 ` [PATCH] [5/9] SYSCTL: Add call_usermodehelper_cleanup() Andi Kleen
@ 2010-01-05 2:15 ` Andi Kleen
2010-01-05 2:15 ` [PATCH] [7/9] SYSCTL: Convert poweroff_command to proc_rcu_string Andi Kleen
` (2 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: ebiederm, paulmck, akpm, linux-kernel
Avoids races with lockless sysctl
Also saves ~220 bytes in the data segment for default kernels.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
kernel/kmod.c | 36 ++++++++++++++++++++++++++++--------
kernel/sysctl.c | 4 ++--
2 files changed, 30 insertions(+), 10 deletions(-)
Index: linux-2.6.33-rc2-ak/kernel/kmod.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/kernel/kmod.c
+++ linux-2.6.33-rc2-ak/kernel/kmod.c
@@ -35,6 +35,7 @@
#include <linux/resource.h>
#include <linux/notifier.h>
#include <linux/suspend.h>
+#include <linux/rcustring.h>
#include <asm/uaccess.h>
#include <trace/events/module.h>
@@ -48,7 +49,12 @@ static struct workqueue_struct *khelper_
/*
modprobe_path is set via /proc/sys.
*/
-char modprobe_path[KMOD_PATH_LEN] = "/sbin/modprobe";
+char *modprobe_path = "/sbin/modprobe";
+
+static void free_arg(char **argv, char **env)
+{
+ kfree(argv[0]);
+}
/**
* __request_module - try to load a kernel module
@@ -71,7 +77,8 @@ int __request_module(bool wait, const ch
char module_name[MODULE_NAME_LEN];
unsigned int max_modprobes;
int ret;
- char *argv[] = { modprobe_path, "-q", "--", module_name, NULL };
+ char *mp_copy;
+ char *argv[] = { NULL, "-q", "--", module_name, NULL };
static char *envp[] = { "HOME=/",
"TERM=linux",
"PATH=/sbin:/usr/sbin:/bin:/usr/bin",
@@ -80,15 +87,24 @@ int __request_module(bool wait, const ch
#define MAX_KMOD_CONCURRENT 50 /* Completely arbitrary value - KAO */
static int kmod_loop_msg;
+ /* Get a stable-over-sleeps private copy of modprobe_path */
+ mp_copy = access_rcu_string(&modprobe_path, KMOD_PATH_LEN,
+ wait ? GFP_KERNEL : GFP_ATOMIC);
+ if (!mp_copy)
+ return -ENOMEM;
+ argv[0] = mp_copy;
+
va_start(args, fmt);
ret = vsnprintf(module_name, MODULE_NAME_LEN, fmt, args);
va_end(args);
- if (ret >= MODULE_NAME_LEN)
- return -ENAMETOOLONG;
+ if (ret >= MODULE_NAME_LEN) {
+ ret = -ENAMETOOLONG;
+ goto error;
+ }
ret = security_kernel_module_request(module_name);
if (ret)
- return ret;
+ goto error;
/* If modprobe needs a service that is in a module, we get a recursive
* loop. Limit the number of running kmod threads to max_threads/2 or
@@ -111,14 +127,18 @@ int __request_module(bool wait, const ch
"request_module: runaway loop modprobe %s\n",
module_name);
atomic_dec(&kmod_concurrent);
- return -ENOMEM;
+ ret = -ENOMEM;
+ goto error;
}
trace_module_request(module_name, wait, _RET_IP_);
- ret = call_usermodehelper(modprobe_path, argv, envp,
- wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
+ ret = call_usermodehelper_cleanup(mp_copy, argv, envp,
+ wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC, free_arg);
+ mp_copy = NULL; /* free_arg frees */
atomic_dec(&kmod_concurrent);
+error:
+ kfree(mp_copy);
return ret;
}
EXPORT_SYMBOL(__request_module);
Index: linux-2.6.33-rc2-ak/kernel/sysctl.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/kernel/sysctl.c
+++ linux-2.6.33-rc2-ak/kernel/sysctl.c
@@ -121,7 +121,7 @@ static int min_percpu_pagelist_fract = 8
static int ngroups_max = NGROUPS_MAX;
#ifdef CONFIG_MODULES
-extern char modprobe_path[];
+extern char *modprobe_path;
extern int modules_disabled;
#endif
#ifdef CONFIG_CHR_DEV_SG
@@ -532,7 +532,7 @@ static struct ctl_table kern_table[] = {
.data = &modprobe_path,
.maxlen = KMOD_PATH_LEN,
.mode = 0644,
- .proc_handler = proc_dostring,
+ .proc_handler = proc_rcu_string,
},
{
.procname = "modules_disabled",
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] [7/9] SYSCTL: Convert poweroff_command to proc_rcu_string
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
` (5 preceding siblings ...)
2010-01-05 2:15 ` [PATCH] [6/9] SYSCTL: Convert modprobe_path to proc_rcu_string() Andi Kleen
@ 2010-01-05 2:15 ` Andi Kleen
2010-01-05 2:15 ` [PATCH] [8/9] SYSCTL: Convert hotplug helper string to proc_rcu_string() Andi Kleen
2010-01-05 2:15 ` [PATCH] [9/9] SYSCTL: Use RCU protected sysctl for ocfs group add helper Andi Kleen
8 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: ebiederm, paulmck, akpm, linux-kernel
Avoids races with lockless sysctl.
Also saves ~220 bytes in the data segment for default kernels.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/reboot.h | 2 +-
kernel/sys.c | 8 ++++++--
kernel/sysctl.c | 2 +-
3 files changed, 8 insertions(+), 4 deletions(-)
Index: linux-2.6.33-rc2-ak/kernel/sys.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/kernel/sys.c
+++ linux-2.6.33-rc2-ak/kernel/sys.c
@@ -1597,7 +1597,7 @@ SYSCALL_DEFINE3(getcpu, unsigned __user
return err ? -EFAULT : 0;
}
-char poweroff_cmd[POWEROFF_CMD_PATH_LEN] = "/sbin/poweroff";
+char *poweroff_cmd = "/sbin/poweroff";
static void argv_cleanup(char **argv, char **envp)
{
@@ -1614,7 +1614,7 @@ static void argv_cleanup(char **argv, ch
int orderly_poweroff(bool force)
{
int argc;
- char **argv = argv_split(GFP_ATOMIC, poweroff_cmd, &argc);
+ char **argv;
static char *envp[] = {
"HOME=/",
"PATH=/sbin:/bin:/usr/sbin:/usr/bin",
@@ -1623,6 +1623,10 @@ int orderly_poweroff(bool force)
int ret = -ENOMEM;
struct subprocess_info *info;
+ /* RCU protection for poweroff_cmd */
+ rcu_read_lock();
+ argv = argv_split(GFP_ATOMIC, rcu_dereference(poweroff_cmd), &argc);
+ rcu_read_unlock();
if (argv == NULL) {
printk(KERN_WARNING "%s failed to allocate memory for \"%s\"\n",
__func__, poweroff_cmd);
Index: linux-2.6.33-rc2-ak/include/linux/reboot.h
===================================================================
--- linux-2.6.33-rc2-ak.orig/include/linux/reboot.h
+++ linux-2.6.33-rc2-ak/include/linux/reboot.h
@@ -67,7 +67,7 @@ extern void kernel_power_off(void);
void ctrl_alt_del(void);
#define POWEROFF_CMD_PATH_LEN 256
-extern char poweroff_cmd[POWEROFF_CMD_PATH_LEN];
+extern char *poweroff_cmd;
extern int orderly_poweroff(bool force);
Index: linux-2.6.33-rc2-ak/kernel/sysctl.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/kernel/sysctl.c
+++ linux-2.6.33-rc2-ak/kernel/sysctl.c
@@ -871,7 +871,7 @@ static struct ctl_table kern_table[] = {
.data = &poweroff_cmd,
.maxlen = POWEROFF_CMD_PATH_LEN,
.mode = 0644,
- .proc_handler = proc_dostring,
+ .proc_handler = proc_rcu_string,
},
#ifdef CONFIG_KEYS
{
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] [8/9] SYSCTL: Convert hotplug helper string to proc_rcu_string()
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
` (6 preceding siblings ...)
2010-01-05 2:15 ` [PATCH] [7/9] SYSCTL: Convert poweroff_command to proc_rcu_string Andi Kleen
@ 2010-01-05 2:15 ` Andi Kleen
2010-01-05 2:15 ` [PATCH] [9/9] SYSCTL: Use RCU protected sysctl for ocfs group add helper Andi Kleen
8 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: ebiederm, paulmck, akpm, linux-kernel
This avoids races with lockless sysctl.
Also saves ~220 bytes in the data segment for default kernels.
I also moved the code into a separate function because the original
was very long.
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/kobject.h | 2 -
kernel/sysctl.c | 2 -
lib/kobject_uevent.c | 50 ++++++++++++++++++++++++++++++------------------
3 files changed, 34 insertions(+), 20 deletions(-)
Index: linux-2.6.33-rc2-ak/include/linux/kobject.h
===================================================================
--- linux-2.6.33-rc2-ak.orig/include/linux/kobject.h
+++ linux-2.6.33-rc2-ak/include/linux/kobject.h
@@ -31,7 +31,7 @@
#define UEVENT_BUFFER_SIZE 2048 /* buffer for the variables */
/* path to the userspace helper executed on an event */
-extern char uevent_helper[];
+extern char *uevent_helper;
/* counter to tag the uevent, read only except for the kobject core */
extern u64 uevent_seqnum;
Index: linux-2.6.33-rc2-ak/kernel/sysctl.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/kernel/sysctl.c
+++ linux-2.6.33-rc2-ak/kernel/sysctl.c
@@ -551,7 +551,7 @@ static struct ctl_table kern_table[] = {
.data = &uevent_helper,
.maxlen = UEVENT_HELPER_PATH_LEN,
.mode = 0644,
- .proc_handler = proc_dostring,
+ .proc_handler = proc_rcu_string,
},
#endif
#ifdef CONFIG_CHR_DEV_SG
Index: linux-2.6.33-rc2-ak/lib/kobject_uevent.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/lib/kobject_uevent.c
+++ linux-2.6.33-rc2-ak/lib/kobject_uevent.c
@@ -22,11 +22,12 @@
#include <linux/socket.h>
#include <linux/skbuff.h>
#include <linux/netlink.h>
+#include <linux/rcustring.h>
#include <net/sock.h>
u64 uevent_seqnum;
-char uevent_helper[UEVENT_HELPER_PATH_LEN] = CONFIG_UEVENT_HELPER_PATH;
+char *uevent_helper = CONFIG_UEVENT_HELPER_PATH;
static DEFINE_SPINLOCK(sequence_lock);
#if defined(CONFIG_NET)
static struct sock *uevent_sock;
@@ -76,6 +77,34 @@ out:
return ret;
}
+/* Call an external helper executable. */
+static int uevent_call_helper(const char *subsystem, struct kobj_uevent_env *env)
+{
+ char *argv[3];
+ char *helper;
+ int retval;
+
+ helper = access_rcu_string(&uevent_helper, UEVENT_HELPER_PATH_LEN, GFP_KERNEL);
+ if (!helper)
+ return -ENOMEM;
+
+ retval = -E2BIG;
+ argv[0] = helper;
+ argv[1] = (char *)subsystem;
+ argv[2] = NULL;
+ retval = add_uevent_var(env, "HOME=/");
+ if (retval)
+ goto error;
+ retval = add_uevent_var(env, "PATH=/sbin:/bin:/usr/sbin:/usr/bin");
+ if (retval)
+ goto error;
+
+ retval = call_usermodehelper(argv[0], argv, env->envp, UMH_WAIT_EXEC);
+error:
+ kfree(helper);
+ return retval;
+}
+
/**
* kobject_uevent_env - send an uevent with environmental data
*
@@ -243,23 +272,8 @@ int kobject_uevent_env(struct kobject *k
#endif
/* call uevent_helper, usually only enabled during early boot */
- if (uevent_helper[0]) {
- char *argv [3];
-
- argv [0] = uevent_helper;
- argv [1] = (char *)subsystem;
- argv [2] = NULL;
- retval = add_uevent_var(env, "HOME=/");
- if (retval)
- goto exit;
- retval = add_uevent_var(env,
- "PATH=/sbin:/bin:/usr/sbin:/usr/bin");
- if (retval)
- goto exit;
-
- retval = call_usermodehelper(argv[0], argv,
- env->envp, UMH_WAIT_EXEC);
- }
+ if (uevent_helper[0])
+ retval = uevent_call_helper(subsystem, env);
exit:
kfree(devpath);
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH] [9/9] SYSCTL: Use RCU protected sysctl for ocfs group add helper
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
` (7 preceding siblings ...)
2010-01-05 2:15 ` [PATCH] [8/9] SYSCTL: Convert hotplug helper string to proc_rcu_string() Andi Kleen
@ 2010-01-05 2:15 ` Andi Kleen
8 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 2:15 UTC (permalink / raw)
To: joel.becker, ebiederm, paulmck, akpm, linux-kernel
This avoids races with unlocked sysctl()
Also saves ~220 bytes in the data segment.
Cc: joel.becker@oracle.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
fs/ocfs2/stackglue.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
Index: linux-2.6.33-rc2-ak/fs/ocfs2/stackglue.c
===================================================================
--- linux-2.6.33-rc2-ak.orig/fs/ocfs2/stackglue.c
+++ linux-2.6.33-rc2-ak/fs/ocfs2/stackglue.c
@@ -27,6 +27,7 @@
#include <linux/kobject.h>
#include <linux/sysfs.h>
#include <linux/sysctl.h>
+#include <linux/rcustring.h>
#include "ocfs2_fs.h"
@@ -40,7 +41,7 @@ static struct ocfs2_locking_protocol *lp
static DEFINE_SPINLOCK(ocfs2_stack_lock);
static LIST_HEAD(ocfs2_stack_list);
static char cluster_stack_name[OCFS2_STACK_LABEL_LEN + 1];
-static char ocfs2_hb_ctl_path[OCFS2_MAX_HB_CTL_PATH] = "/sbin/ocfs2_hb_ctl";
+static char *ocfs2_hb_ctl_path = "/sbin/ocfs2_hb_ctl";
/*
* The stack currently in use. If not null, active_stack->sp_count > 0,
@@ -395,8 +396,15 @@ static void ocfs2_leave_group(const char
{
int ret;
char *argv[5], *envp[3];
+ char *helper;
- argv[0] = ocfs2_hb_ctl_path;
+ helper = access_rcu_string(&ocfs2_hb_ctl_path, OCFS2_MAX_HB_CTL_PATH, GFP_KERNEL);
+ if (!helper) {
+ printk(KERN_ERR "ocfs2_leave_group: no memory\n");
+ return;
+ }
+
+ argv[0] = helper;
argv[1] = "-K";
argv[2] = "-u";
argv[3] = (char *)group;
@@ -414,6 +422,7 @@ static void ocfs2_leave_group(const char
"\"%s %s %s %s\"\n",
ret, argv[0], argv[1], argv[2], argv[3]);
}
+ kfree(helper);
}
/*
@@ -621,10 +630,10 @@ error:
static ctl_table ocfs2_nm_table[] = {
{
.procname = "hb_ctl_path",
- .data = ocfs2_hb_ctl_path,
+ .data = &ocfs2_hb_ctl_path,
.maxlen = OCFS2_MAX_HB_CTL_PATH,
.mode = 0644,
- .proc_handler = proc_dostring,
+ .proc_handler = proc_rcu_string,
},
{ }
};
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2
2010-01-05 2:15 ` [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2 Andi Kleen
@ 2010-01-05 5:32 ` Paul E. McKenney
2010-01-05 10:47 ` Andi Kleen
2010-01-08 23:50 ` Andrew Morton
2010-01-11 12:12 ` Bert Wesarg
2 siblings, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2010-01-05 5:32 UTC (permalink / raw)
To: Andi Kleen; +Cc: ebiederm, akpm, linux-kernel
On Tue, Jan 05, 2010 at 03:15:26AM +0100, Andi Kleen wrote:
>
> Add a little ADT for RCU protected strings. RCU is a convenient
> way to manage modifications to read-often-write-seldom strings.
> Add some helper functions to make this more straight forward.
>
> Used by follow-on patches to implement RCU protected sysctl strings.
>
> * General rules:
> * Reader has to use rcu_read_lock() and not sleep while accessing the string,
> * or alternatively get a copy with access_rcu_string()
> * Writer needs an own lock against each other.
> * Each modification should allocate a new string first and free the old
> * one with free_rcu_string()
> * In writers use rcu_assign_pointer to publicize the updated string to
> * global readers.
> * The size passed to access_rcu_string() must be the same as passed
> * to alloc_rcu_string() and be known in advance. Don't use strlen()!
> *
> * For sysctls also see proc_rcu_string() as a convenient wrapper
>
> v2: Use rcu_dereference correctly
Very good, the extra level of indirection on the first argument to
access_rcu_string() fixes the problem I complained about last time.
One additional question below.
> Cc: paulmck@linux.vnet.ibm.com
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
>
> ---
> include/linux/rcustring.h | 20 ++++++++
> lib/Makefile | 3 -
> lib/rcustring.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 137 insertions(+), 1 deletion(-)
[ . . . ]
> +/*
> + * Get a local private copy of a RCU protected string.
> + * Mostly useful to get a string that is stable while sleeping.
> + * Caller must free returned string.
> + */
> +char *access_rcu_string(char **str, int size, gfp_t gfp)
> +{
> + char *copy = kmalloc(size, gfp);
> + if (!str)
> + return NULL;
> + rcu_read_lock();
> + strlcpy(copy, rcu_dereference(*str), size);
What if "str" is non-NULL, but "*str" is NULL? Or is that disallowed
somehow?
If it is not disallowed, then something like the following?
if (!str)
return NULL;
rcu_read_lock();
tmp = rcu_dereference(*str);
if (!tmp) {
rcu_read_unlock();
return NULL;
}
strlcpy(copy, tmp, size);
rcu_read_unlock();
return copy;
> + rcu_read_unlock();
> + return copy;
> +}
> +EXPORT_SYMBOL(access_rcu_string);
> Index: linux-2.6.33-rc2-ak/lib/Makefile
> ===================================================================
> --- linux-2.6.33-rc2-ak.orig/lib/Makefile
> +++ linux-2.6.33-rc2-ak/lib/Makefile
> @@ -12,7 +12,8 @@ lib-y := ctype.o string.o vsprintf.o cmd
> idr.o int_sqrt.o extable.o prio_tree.o \
> sha1.o irq_regs.o reciprocal_div.o argv_split.o \
> proportions.o prio_heap.o ratelimit.o show_mem.o \
> - is_single_threaded.o plist.o decompress.o flex_array.o
> + is_single_threaded.o plist.o decompress.o flex_array.o \
> + rcustring.o
>
> lib-$(CONFIG_MMU) += ioremap.o
> lib-$(CONFIG_SMP) += cpumask.o
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [4/9] SYSCTL: Use RCU strings for core_pattern sysctl
2010-01-05 2:15 ` [PATCH] [4/9] SYSCTL: Use RCU strings for core_pattern sysctl Andi Kleen
@ 2010-01-05 6:56 ` Eric W. Biederman
2010-01-05 11:49 ` Andi Kleen
0 siblings, 1 reply; 25+ messages in thread
From: Eric W. Biederman @ 2010-01-05 6:56 UTC (permalink / raw)
To: Andi Kleen; +Cc: paulmck, akpm, linux-kernel
Andi Kleen <andi@firstfloor.org> writes:
> Also saves ~220 bytes in the data segment for default kernels.
>
> As a bonus this removes one use of the BKL.
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
>
> ---
> fs/exec.c | 11 +++++------
> kernel/sysctl.c | 6 +++---
> 2 files changed, 8 insertions(+), 9 deletions(-)
>
> Index: linux-2.6.33-rc2-ak/fs/exec.c
> ===================================================================
> --- linux-2.6.33-rc2-ak.orig/fs/exec.c
> +++ linux-2.6.33-rc2-ak/fs/exec.c
> @@ -62,7 +62,7 @@
> #include "internal.h"
>
> int core_uses_pid;
> -char core_pattern[CORENAME_MAX_SIZE] = "core";
> +char *core_pattern = "core";
> unsigned int core_pipe_limit;
> int suid_dumpable = 0;
>
> @@ -1421,7 +1421,7 @@ EXPORT_SYMBOL(set_binfmt);
> static int format_corename(char *corename, long signr)
> {
> const struct cred *cred = current_cred();
> - const char *pat_ptr = core_pattern;
> + const char *pat_ptr = rcu_dereference(core_pattern);
rcu_dereference should aways be between rcu_read_lock()
and rcu_read_unlock();
> int ispipe = (*pat_ptr == '|');
> char *out_ptr = corename;
> char *const out_end = corename + CORENAME_MAX_SIZE;
> @@ -1825,12 +1825,11 @@ void do_coredump(long signr, int exit_co
> clear_thread_flag(TIF_SIGPENDING);
>
> /*
> - * lock_kernel() because format_corename() is controlled by sysctl, which
> - * uses lock_kernel()
> + * Protect corename by RCU vs proc_rcu_string()
> */
> - lock_kernel();
> + rcu_read_lock();
> ispipe = format_corename(corename, signr);
> - unlock_kernel();
> + rcu_read_unlock();
>
> if ((!ispipe) && (cprm.limit < binfmt->min_coredump))
> goto fail_unlock;
> Index: linux-2.6.33-rc2-ak/kernel/sysctl.c
> ===================================================================
> --- linux-2.6.33-rc2-ak.orig/kernel/sysctl.c
> +++ linux-2.6.33-rc2-ak/kernel/sysctl.c
> @@ -75,7 +75,7 @@ extern int sysctl_oom_dump_tasks;
> extern int max_threads;
> extern int core_uses_pid;
> extern int suid_dumpable;
> -extern char core_pattern[];
> +extern char *core_pattern;
> extern unsigned int core_pipe_limit;
> extern int pid_max;
> extern int min_free_kbytes;
> @@ -399,10 +399,10 @@ static struct ctl_table kern_table[] = {
> },
> {
> .procname = "core_pattern",
> - .data = core_pattern,
> + .data = &core_pattern,
> .maxlen = CORENAME_MAX_SIZE,
> .mode = 0644,
> - .proc_handler = proc_dostring,
> + .proc_handler = proc_rcu_string,
> },
> {
> .procname = "core_pipe_limit",
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [2/9] Add a kernel_address() that works for data too
2010-01-05 2:15 ` [PATCH] [2/9] Add a kernel_address() that works for data too Andi Kleen
@ 2010-01-05 8:44 ` Russell King
2010-01-05 8:58 ` Sam Ravnborg
1 sibling, 0 replies; 25+ messages in thread
From: Russell King @ 2010-01-05 8:44 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-arch, ebiederm, paulmck, akpm, linux-kernel
On Tue, Jan 05, 2010 at 03:15:27AM +0100, Andi Kleen wrote:
>
> Add a variant of kernel_text_address() that includes kernel data.
>
> Assumes kernel is _text ... _end - init section. True everywhere?
No. XIP kernels have the text and data/bss separated into two distinct
address regions.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [2/9] Add a kernel_address() that works for data too
2010-01-05 2:15 ` [PATCH] [2/9] Add a kernel_address() that works for data too Andi Kleen
2010-01-05 8:44 ` Russell King
@ 2010-01-05 8:58 ` Sam Ravnborg
2010-01-05 19:04 ` Russell King
1 sibling, 1 reply; 25+ messages in thread
From: Sam Ravnborg @ 2010-01-05 8:58 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-arch, ebiederm, paulmck, akpm, linux-kernel
On Tue, Jan 05, 2010 at 03:15:27AM +0100, Andi Kleen wrote:
>
> Add a variant of kernel_text_address() that includes kernel data.
>
> Assumes kernel is _text ... _end - init section. True everywhere?
Tim Abbott has done a great job lately to unify the various
linker scripts used by the different architectures.
Architectures are supposed to follow the skeleton outlined
in include/asm-generic/vmlinux.lds.h
But I think the skeleton needs a small update.
It describes that we mark start of .text with _stext.
But reality is that we use _text for this.
But you can trust _etext an almost all architectures.
It is a bug if it is missing.
So [_text, _etext] is the text section.
The data section may be placed before or after - it depends on the architecture.
But again - only some architectures define _sdata.
But all? define _edata.
Sam
>
> Cc: linux-arch@vger.kernel.org
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
>
> ---
> include/linux/kernel.h | 1 +
> kernel/extable.c | 18 ++++++++++++++++++
> 2 files changed, 19 insertions(+)
>
> Index: linux-2.6.33-rc2-ak/include/linux/kernel.h
> ===================================================================
> --- linux-2.6.33-rc2-ak.orig/include/linux/kernel.h
> +++ linux-2.6.33-rc2-ak/include/linux/kernel.h
> @@ -205,6 +205,7 @@ extern unsigned long long memparse(const
> extern int core_kernel_text(unsigned long addr);
> extern int __kernel_text_address(unsigned long addr);
> extern int kernel_text_address(unsigned long addr);
> +extern int kernel_address(unsigned long addr);
> extern int func_ptr_is_kernel_text(void *ptr);
>
> struct pid;
> Index: linux-2.6.33-rc2-ak/kernel/extable.c
> ===================================================================
> --- linux-2.6.33-rc2-ak.orig/kernel/extable.c
> +++ linux-2.6.33-rc2-ak/kernel/extable.c
> @@ -72,6 +72,18 @@ int core_kernel_text(unsigned long addr)
> return 0;
> }
>
> +static int core_kernel_address(unsigned long addr)
> +{
> + if ((addr >= (unsigned long)_text &&
> + addr <= (unsigned long)_end)) {
> + if (addr >= (unsigned long)__init_begin &&
> + addr < (unsigned long)__init_end)
> + return system_state == SYSTEM_BOOTING;
> + return 1;
> + }
> + return 0;
> +}
> +
> int __kernel_text_address(unsigned long addr)
> {
> if (core_kernel_text(addr))
> @@ -98,6 +110,12 @@ int kernel_text_address(unsigned long ad
> return is_module_text_address(addr);
> }
>
> +/* text or data in core kernel or module */
> +int kernel_address(unsigned long addr)
> +{
> + return core_kernel_address(addr) || is_module_address(addr);
> +}
> +
> /*
> * On some architectures (PPC64, IA64) function pointers
> * are actually only tokens to some data that then holds the
> --
> To unsubscribe from this list: send the line "unsubscribe linux-arch" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2
2010-01-05 5:32 ` Paul E. McKenney
@ 2010-01-05 10:47 ` Andi Kleen
2010-01-05 14:11 ` Paul E. McKenney
0 siblings, 1 reply; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 10:47 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: Andi Kleen, ebiederm, akpm, linux-kernel
On Mon, Jan 04, 2010 at 09:32:52PM -0800, Paul E. McKenney wrote:
> > +/*
> > + * Get a local private copy of a RCU protected string.
> > + * Mostly useful to get a string that is stable while sleeping.
> > + * Caller must free returned string.
> > + */
> > +char *access_rcu_string(char **str, int size, gfp_t gfp)
> > +{
> > + char *copy = kmalloc(size, gfp);
> > + if (!str)
> > + return NULL;
> > + rcu_read_lock();
> > + strlcpy(copy, rcu_dereference(*str), size);
>
> What if "str" is non-NULL, but "*str" is NULL? Or is that disallowed
> somehow?
I would consider it disallowed, all the strings have some default value.
Empty string would be ""
If we allowed it the caller couldn't easily distingush error from expected NULL.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [4/9] SYSCTL: Use RCU strings for core_pattern sysctl
2010-01-05 6:56 ` Eric W. Biederman
@ 2010-01-05 11:49 ` Andi Kleen
0 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 11:49 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: paulmck, akpm, linux-kernel
ebiederm@xmission.com (Eric W. Biederman) writes:
>>
>> @@ -1421,7 +1421,7 @@ EXPORT_SYMBOL(set_binfmt);
>> static int format_corename(char *corename, long signr)
>> {
>> const struct cred *cred = current_cred();
>> - const char *pat_ptr = core_pattern;
>> + const char *pat_ptr = rcu_dereference(core_pattern);
>
> rcu_dereference should aways be between rcu_read_lock()
> and rcu_read_unlock();
It is, see the call site below:
>> - * lock_kernel() because format_corename() is controlled by sysctl, which
>> - * uses lock_kernel()
>> + * Protect corename by RCU vs proc_rcu_string()
>> */
>> - lock_kernel();
>> + rcu_read_lock();
>> ispipe = format_corename(corename, signr);
>> - unlock_kernel();
>> + rcu_read_unlock();
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2
2010-01-05 10:47 ` Andi Kleen
@ 2010-01-05 14:11 ` Paul E. McKenney
2010-01-05 14:19 ` Andi Kleen
0 siblings, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2010-01-05 14:11 UTC (permalink / raw)
To: Andi Kleen; +Cc: ebiederm, akpm, linux-kernel
On Tue, Jan 05, 2010 at 11:47:46AM +0100, Andi Kleen wrote:
> On Mon, Jan 04, 2010 at 09:32:52PM -0800, Paul E. McKenney wrote:
> > > +/*
> > > + * Get a local private copy of a RCU protected string.
> > > + * Mostly useful to get a string that is stable while sleeping.
> > > + * Caller must free returned string.
> > > + */
> > > +char *access_rcu_string(char **str, int size, gfp_t gfp)
> > > +{
> > > + char *copy = kmalloc(size, gfp);
> > > + if (!str)
> > > + return NULL;
> > > + rcu_read_lock();
> > > + strlcpy(copy, rcu_dereference(*str), size);
> >
> > What if "str" is non-NULL, but "*str" is NULL? Or is that disallowed
> > somehow?
>
> I would consider it disallowed, all the strings have some default value.
> Empty string would be ""
>
> If we allowed it the caller couldn't easily distingush error from expected NULL.
OK, then:
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Thanx, Paul
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2
2010-01-05 14:11 ` Paul E. McKenney
@ 2010-01-05 14:19 ` Andi Kleen
0 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 14:19 UTC (permalink / raw)
To: paulmck; +Cc: ebiederm, akpm, linux-kernel
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
>>
>> I would consider it disallowed, all the strings have some default value.
>> Empty string would be ""
>>
>> If we allowed it the caller couldn't easily distingush error from expected NULL.
>
> OK, then:
Thanks. I actually changed my mind and added support for NULL strings
(although I suspect they are not too useful)
I also found one more minor problem (and there was one conversion
missing in v2), will do a repost later
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [2/9] Add a kernel_address() that works for data too
2010-01-05 8:58 ` Sam Ravnborg
@ 2010-01-05 19:04 ` Russell King
2010-01-05 19:15 ` Andi Kleen
0 siblings, 1 reply; 25+ messages in thread
From: Russell King @ 2010-01-05 19:04 UTC (permalink / raw)
To: Sam Ravnborg
Cc: Andi Kleen, linux-arch, ebiederm, paulmck, akpm, linux-kernel
On Tue, Jan 05, 2010 at 09:58:53AM +0100, Sam Ravnborg wrote:
> But you can trust _etext an almost all architectures.
> It is a bug if it is missing.
>
> So [_text, _etext] is the text section.
>
> The data section may be placed before or after - it depends on the architecture.
> But again - only some architectures define _sdata.
> But all? define _edata.
You can not guarantee that the data segment is after the text segment,
unless you want to outlaw XIP kernels. XIP kernels have the text
segment mapped at a completely different address to the data segment.
I'd suggest the only way to identify the data segment in a generic way
is to have everyone define _sdata, or more preferably _data (to be
consistent with _text), to be the start of the data segment.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [2/9] Add a kernel_address() that works for data too
@ 2010-01-05 19:15 ` Andi Kleen
0 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 19:15 UTC (permalink / raw)
To: Sam Ravnborg, Andi Kleen, linux-arch, ebiederm, paulmck, akpm,
linux-kern
> I'd suggest the only way to identify the data segment in a generic way
> is to have everyone define _sdata, or more preferably _data (to be
> consistent with _text), to be the start of the data segment.
I agree. I'll take a look at that.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [2/9] Add a kernel_address() that works for data too
@ 2010-01-05 19:15 ` Andi Kleen
0 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-05 19:15 UTC (permalink / raw)
To: Sam Ravnborg, Andi Kleen, linux-arch, ebiederm, paulmck, akpm,
linux-kernel
> I'd suggest the only way to identify the data segment in a generic way
> is to have everyone define _sdata, or more preferably _data (to be
> consistent with _text), to be the start of the data segment.
I agree. I'll take a look at that.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2
2010-01-05 2:15 ` [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2 Andi Kleen
2010-01-05 5:32 ` Paul E. McKenney
@ 2010-01-08 23:50 ` Andrew Morton
2010-01-11 12:12 ` Bert Wesarg
2 siblings, 0 replies; 25+ messages in thread
From: Andrew Morton @ 2010-01-08 23:50 UTC (permalink / raw)
To: Andi Kleen; +Cc: paulmck, ebiederm, linux-kernel
On Tue, 5 Jan 2010 03:15:26 +0100 (CET)
Andi Kleen <andi@firstfloor.org> wrote:
> +extern char *alloc_rcu_string(int size, gfp_t gfp);
> +extern void free_rcu_string(const char *string);
> +
> +/*
> + * size must be the same as alloc_rcu_string, don't
> + * use strlen on str!
> + */
> +extern char *access_rcu_string(char **str, int size, gfp_t gfp);
I think better names would be rcu_string_alloc(), rcu_string_free() and
rcu_string_access().
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [2/9] Add a kernel_address() that works for data too
2010-01-05 19:15 ` Andi Kleen
(?)
@ 2010-01-08 23:51 ` Andrew Morton
-1 siblings, 0 replies; 25+ messages in thread
From: Andrew Morton @ 2010-01-08 23:51 UTC (permalink / raw)
To: Andi Kleen; +Cc: Sam Ravnborg, linux-arch, ebiederm, paulmck, linux-kernel
On Tue, 5 Jan 2010 20:15:03 +0100
Andi Kleen <andi@firstfloor.org> wrote:
> > I'd suggest the only way to identify the data segment in a generic way
> > is to have everyone define _sdata, or more preferably _data (to be
> > consistent with _text), to be the start of the data segment.
>
> I agree. I'll take a look at that.
>
I'll merge this as-is for now. Please send updates when convenient.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2
2010-01-05 2:15 ` [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2 Andi Kleen
2010-01-05 5:32 ` Paul E. McKenney
2010-01-08 23:50 ` Andrew Morton
@ 2010-01-11 12:12 ` Bert Wesarg
2010-01-11 14:26 ` Andi Kleen
2 siblings, 1 reply; 25+ messages in thread
From: Bert Wesarg @ 2010-01-11 12:12 UTC (permalink / raw)
To: Andi Kleen; +Cc: paulmck, ebiederm, akpm, linux-kernel
On Tue, Jan 5, 2010 at 03:15, Andi Kleen <andi@firstfloor.org> wrote:
> Index: linux-2.6.33-rc2-ak/lib/rcustring.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6.33-rc2-ak/lib/rcustring.c
[ . . . ]
> +/*
> + * Get a local private copy of a RCU protected string.
> + * Mostly useful to get a string that is stable while sleeping.
> + * Caller must free returned string.
> + */
> +char *access_rcu_string(char **str, int size, gfp_t gfp)
> +{
> + char *copy = kmalloc(size, gfp);
No check of return value from kmalloc()?
> + if (!str)
> + return NULL;
> + rcu_read_lock();
> + strlcpy(copy, rcu_dereference(*str), size);
> + rcu_read_unlock();
> + return copy;
> +}
> +EXPORT_SYMBOL(access_rcu_string);
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2
2010-01-11 12:12 ` Bert Wesarg
@ 2010-01-11 14:26 ` Andi Kleen
0 siblings, 0 replies; 25+ messages in thread
From: Andi Kleen @ 2010-01-11 14:26 UTC (permalink / raw)
To: Bert Wesarg; +Cc: Andi Kleen, paulmck, ebiederm, akpm, linux-kernel
On Mon, Jan 11, 2010 at 01:12:41PM +0100, Bert Wesarg wrote:
> On Tue, Jan 5, 2010 at 03:15, Andi Kleen <andi@firstfloor.org> wrote:
> > Index: linux-2.6.33-rc2-ak/lib/rcustring.c
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6.33-rc2-ak/lib/rcustring.c
>
> [ . . . ]
>
> > +/*
> > + * Get a local private copy of a RCU protected string.
> > + * Mostly useful to get a string that is stable while sleeping.
> > + * Caller must free returned string.
> > + */
> > +char *access_rcu_string(char **str, int size, gfp_t gfp)
> > +{
> > + char *copy = kmalloc(size, gfp);
> No check of return value from kmalloc()?
>
> > + if (!str)
> > + return NULL;
The !str should be !copy right. I fixed that in my version.
Thanks for catching.
-Andi
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2010-01-11 14:27 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-05 2:15 [PATCH] [0/9] SYSCTL: Use RCU to avoid races with string sysctls v2 Andi Kleen
2010-01-05 2:15 ` [PATCH] [1/9] Add rcustring ADT for RCU protected strings v2 Andi Kleen
2010-01-05 5:32 ` Paul E. McKenney
2010-01-05 10:47 ` Andi Kleen
2010-01-05 14:11 ` Paul E. McKenney
2010-01-05 14:19 ` Andi Kleen
2010-01-08 23:50 ` Andrew Morton
2010-01-11 12:12 ` Bert Wesarg
2010-01-11 14:26 ` Andi Kleen
2010-01-05 2:15 ` [PATCH] [2/9] Add a kernel_address() that works for data too Andi Kleen
2010-01-05 8:44 ` Russell King
2010-01-05 8:58 ` Sam Ravnborg
2010-01-05 19:04 ` Russell King
2010-01-05 19:15 ` Andi Kleen
2010-01-05 19:15 ` Andi Kleen
2010-01-08 23:51 ` Andrew Morton
2010-01-05 2:15 ` [PATCH] [3/9] SYSCTL: Add proc_rcu_string to manage sysctls using rcu strings Andi Kleen
2010-01-05 2:15 ` [PATCH] [4/9] SYSCTL: Use RCU strings for core_pattern sysctl Andi Kleen
2010-01-05 6:56 ` Eric W. Biederman
2010-01-05 11:49 ` Andi Kleen
2010-01-05 2:15 ` [PATCH] [5/9] SYSCTL: Add call_usermodehelper_cleanup() Andi Kleen
2010-01-05 2:15 ` [PATCH] [6/9] SYSCTL: Convert modprobe_path to proc_rcu_string() Andi Kleen
2010-01-05 2:15 ` [PATCH] [7/9] SYSCTL: Convert poweroff_command to proc_rcu_string Andi Kleen
2010-01-05 2:15 ` [PATCH] [8/9] SYSCTL: Convert hotplug helper string to proc_rcu_string() Andi Kleen
2010-01-05 2:15 ` [PATCH] [9/9] SYSCTL: Use RCU protected sysctl for ocfs group add helper Andi Kleen
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.