From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: linux-kernel@vger.kernel.org, ebiederm@xmission.com
Subject: Re: [PATCH] [3/11] SYSCTL: Add proc_rcu_string to manage sysctls using rcu strings
Date: Mon, 21 Dec 2009 18:51:31 -0800
Message-ID: <20091222025131.GB9279@linux.vnet.ibm.com>
In-Reply-To: <20091221012024.A0828B158A@basil.firstfloor.org>
On Mon, Dec 21, 2009 at 02:20:24AM +0100, Andi Kleen wrote:
>
> Add a helper to use the new rcu strings for managing access
> to text sysctls. Conversions will be in follow-on patches.
>
> An alternative would be to use seqlocks here, but RCU seemed
> cleaner.
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
FYI, I am using the code below as an example of my concern about access_rcu_string().
> ---
> include/linux/sysctl.h | 2 +
> kernel/sysctl.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++
> kernel/sysctl_check.c | 1
> 3 files changed, 69 insertions(+)
>
> Index: linux-2.6.33-rc1-ak/include/linux/sysctl.h
> ===================================================================
> --- linux-2.6.33-rc1-ak.orig/include/linux/sysctl.h
> +++ linux-2.6.33-rc1-ak/include/linux/sysctl.h
> @@ -969,6 +969,8 @@ typedef int proc_handler (struct ctl_tab
>
> extern int proc_dostring(struct ctl_table *, int,
> void __user *, size_t *, loff_t *);
> +extern int proc_rcu_string(struct ctl_table *, int,
> + void __user *, size_t *, loff_t *);
> extern int proc_dointvec(struct ctl_table *, int,
> void __user *, size_t *, loff_t *);
> extern int proc_dointvec_minmax(struct ctl_table *, int,
> Index: linux-2.6.33-rc1-ak/kernel/sysctl.c
> ===================================================================
> --- linux-2.6.33-rc1-ak.orig/kernel/sysctl.c
> +++ linux-2.6.33-rc1-ak/kernel/sysctl.c
> @@ -50,6 +50,7 @@
> #include <linux/ftrace.h>
> #include <linux/slow-work.h>
> #include <linux/perf_event.h>
> +#include <linux/rcustring.h>
>
> #include <asm/uaccess.h>
> #include <asm/processor.h>
> @@ -2016,6 +2017,60 @@ static int _proc_do_string(void* data, i
> }
>
> /**
> + * proc_rcu_string - sysctl string with rcu protection
> + * @table: the sysctl table
> + * @write: %TRUE if this is a write to the sysctl file
> + * @buffer: the user buffer
> + * @lenp: the size of the user buffer
> + * @ppos: file position
> + *
> + * Handle a string sysctl similar to proc_dostring.
> + * The main difference is that the data pointer in the table
> + * points to a pointer to a string. The string should be initially
> + * pointing to a statically allocated (as a C object, not on the heap)
> + * default. When it is replaced old uses will be protected by
> + * RCU. The reader should use rcu_read_lock()/unlock() or
> + * access_rcu_string().
> + */
> +int proc_rcu_string(struct ctl_table *table, int write,
> + void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> + int ret;
> +
> + if (write) {
> + /* protect writers against each other */
> + static DEFINE_MUTEX(rcu_string_mutex);
> + char *old;
> + char *new;
> +
> + new = alloc_rcu_string(table->maxlen, GFP_KERNEL);
> + if (!new)
> + return -ENOMEM;
> + mutex_lock(&rcu_string_mutex);
> + old = *(char **)(table->data);
> + strcpy(new, old);
> + ret = _proc_do_string(new, table->maxlen, write, buffer, lenp, ppos);
> + rcu_assign_pointer(*(char **)(table->data), new);
> + /*
> + * For the first initialization allow constant strings.
> + */
> + if (!kernel_address((unsigned long)old))
> + free_rcu_string(old);
> + mutex_unlock(&rcu_string_mutex);
> + } else {
> + char *str;
> +
> + str = access_rcu_string(*(char **)(table->data), table->maxlen,
> + GFP_KERNEL);
So the above statement picks up the pointer stored at table->data before
access_rcu_string() is even called. Suppose that some other CPU then
executes the "write" side of this "if" statement, and that we get
preempted before access_rcu_string() enters its RCU read-side critical
section. A grace period elapses, the old string is freed, we resume,
and ... ouch!
One trick would be to make access_rcu_string() a macro that does the
first access to its first argument within an RCU read-side critical section.
Alternatively, pass in the address of the pointer, rather than the
pointer itself.
Or explain to me how I am confused.
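For illustration only, here is a minimal userspace sketch of the second alternative (passing the address of the pointer). The name access_rcu_string_ptr() is hypothetical and is not part of the patch, and rcu_read_lock(), rcu_read_unlock() and rcu_dereference_str() are trivial stand-ins for the kernel primitives so that the example compiles outside the kernel. They track nesting only and give no real protection. The point is just that the RCU-protected pointer is fetched after the read-side critical section has been entered.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace stand-ins for the kernel RCU primitives (assumption: they
 * only count nesting so the sketch can check where the fetch happens).
 */
static int rcu_nesting;

static void rcu_read_lock(void)   { rcu_nesting++; }
static void rcu_read_unlock(void) { rcu_nesting--; }

/* Stand-in for rcu_dereference(), restricted to string pointers. */
static char *rcu_dereference_str(char **p)
{
	assert(rcu_nesting > 0);	/* must be inside a read-side section */
	return *(char * volatile *)p;
}

/*
 * Hypothetical variant of access_rcu_string(): takes the ADDRESS of the
 * RCU-protected pointer, so the pointer itself is loaded only after
 * rcu_read_lock().  Returns a private copy that the caller frees.
 */
static char *access_rcu_string_ptr(char **strp, size_t size)
{
	char *copy;
	char *str;

	if (size == 0)
		return NULL;
	/* Allocate before entering the critical section: it may block. */
	copy = malloc(size);
	if (!copy)
		return NULL;

	rcu_read_lock();
	str = rcu_dereference_str(strp);	/* fetched under the lock */
	strncpy(copy, str, size - 1);
	copy[size - 1] = '\0';
	rcu_read_unlock();

	return copy;
}
```

With something like this, the read side of proc_rcu_string() would pass (char **)table->data rather than *(char **)(table->data), so the caller never holds the old pointer outside the critical section.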
> + if (!str)
> + return -ENOMEM;
> + ret = _proc_do_string(str, table->maxlen, write, buffer, lenp, ppos);
> + kfree(str);
> + }
> + return ret;
> +}
> +
> +/**
> * proc_dostring - read a string sysctl
> * @table: the sysctl table
> * @write: %TRUE if this is a write to the sysctl file
> @@ -2030,6 +2085,10 @@ static int _proc_do_string(void* data, i
> * and a newline '\n' is added. It is truncated if the buffer is
> * not large enough.
> *
> + * WARNING: this should be only used for read only strings
> + * or when you have a wrapper with special locking. Otherwise
> + * use proc_rcu_string to avoid races with the consumer.
> + *
> * Returns 0 on success.
> */
> int proc_dostring(struct ctl_table *table, int write,
> @@ -2614,6 +2673,12 @@ int proc_dostring(struct ctl_table *tabl
> return -ENOSYS;
> }
>
> +int proc_rcu_string(struct ctl_table *table, int write,
> + void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> + return -ENOSYS;
> +}
> +
> int proc_dointvec(struct ctl_table *table, int write,
> void __user *buffer, size_t *lenp, loff_t *ppos)
> {
> @@ -2670,6 +2735,7 @@ EXPORT_SYMBOL(proc_dointvec_minmax);
> EXPORT_SYMBOL(proc_dointvec_userhz_jiffies);
> EXPORT_SYMBOL(proc_dointvec_ms_jiffies);
> EXPORT_SYMBOL(proc_dostring);
> +EXPORT_SYMBOL(proc_rcu_string);
> EXPORT_SYMBOL(proc_doulongvec_minmax);
> EXPORT_SYMBOL(proc_doulongvec_ms_jiffies_minmax);
> EXPORT_SYMBOL(register_sysctl_table);
> Index: linux-2.6.33-rc1-ak/kernel/sysctl_check.c
> ===================================================================
> --- linux-2.6.33-rc1-ak.orig/kernel/sysctl_check.c
> +++ linux-2.6.33-rc1-ak/kernel/sysctl_check.c
> @@ -131,6 +131,7 @@ int sysctl_check_table(struct nsproxy *n
> set_fail(&fail, table, "Directory with extra2");
> } else {
> if ((table->proc_handler == proc_dostring) ||
> + (table->proc_handler == proc_rcu_string) ||
> (table->proc_handler == proc_dointvec) ||
> (table->proc_handler == proc_dointvec_minmax) ||
> (table->proc_handler == proc_dointvec_jiffies) ||