From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: linux-kernel@vger.kernel.org, ebiederm@xmission.com
Subject: Re: [PATCH] [3/11] SYSCTL: Add proc_rcu_string to manage sysctls using rcu strings
Date: Mon, 21 Dec 2009 18:51:31 -0800
Message-ID: <20091222025131.GB9279@linux.vnet.ibm.com>
In-Reply-To: <20091221012024.A0828B158A@basil.firstfloor.org>

On Mon, Dec 21, 2009 at 02:20:24AM +0100, Andi Kleen wrote:
> 
> Add a helper to use the new rcu strings for managing access
> to text sysctls. Conversions will be in follow-on patches.
> 
> An alternative would be to use seqlocks here, but RCU seemed
> cleaner.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>

FYI, I am using the code below as an example of my concern about
access_rcu_string().

> ---
>  include/linux/sysctl.h |    2 +
>  kernel/sysctl.c        |   66 +++++++++++++++++++++++++++++++++++++++++++++++++
>  kernel/sysctl_check.c  |    1 
>  3 files changed, 69 insertions(+)
> 
> Index: linux-2.6.33-rc1-ak/include/linux/sysctl.h
> ===================================================================
> --- linux-2.6.33-rc1-ak.orig/include/linux/sysctl.h
> +++ linux-2.6.33-rc1-ak/include/linux/sysctl.h
> @@ -969,6 +969,8 @@ typedef int proc_handler (struct ctl_tab
> 
>  extern int proc_dostring(struct ctl_table *, int,
>  			 void __user *, size_t *, loff_t *);
> +extern int proc_rcu_string(struct ctl_table *, int,
> +			 void __user *, size_t *, loff_t *);
>  extern int proc_dointvec(struct ctl_table *, int,
>  			 void __user *, size_t *, loff_t *);
>  extern int proc_dointvec_minmax(struct ctl_table *, int,
> Index: linux-2.6.33-rc1-ak/kernel/sysctl.c
> ===================================================================
> --- linux-2.6.33-rc1-ak.orig/kernel/sysctl.c
> +++ linux-2.6.33-rc1-ak/kernel/sysctl.c
> @@ -50,6 +50,7 @@
>  #include <linux/ftrace.h>
>  #include <linux/slow-work.h>
>  #include <linux/perf_event.h>
> +#include <linux/rcustring.h>
> 
>  #include <asm/uaccess.h>
>  #include <asm/processor.h>
> @@ -2016,6 +2017,60 @@ static int _proc_do_string(void* data, i
>  }
> 
>  /**
> + * proc_rcu_string - sysctl string with rcu protection
> + * @table: the sysctl table
> + * @write: %TRUE if this is a write to the sysctl file
> + * @buffer: the user buffer
> + * @lenp: the size of the user buffer
> + * @ppos: file position
> + *
> + * Handle a string sysctl similar to proc_dostring.
> + * The main difference is that the data pointer in the table
> + * points to a pointer to a string. The string should be initially
> + * pointing to a statically allocated (as a C object, not on the heap)
> + * default. When it is replaced old uses will be protected by
> + * RCU. The reader should use rcu_read_lock()/unlock() or
> + * access_rcu_string().
> + */
> +int proc_rcu_string(struct ctl_table *table, int write,
> +		void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> +	int ret;
> +
> +	if (write) {
> +		/* protect writers against each other */
> +		static DEFINE_MUTEX(rcu_string_mutex);
> +		char *old;
> +		char *new;
> +
> +		new = alloc_rcu_string(table->maxlen, GFP_KERNEL);
> +		if (!new)
> +			return -ENOMEM;
> +		mutex_lock(&rcu_string_mutex);
> +		old = *(char **)(table->data);
> +		strcpy(new, old);
> +		ret = _proc_do_string(new, table->maxlen, write, buffer, lenp, ppos);
> +		rcu_assign_pointer(*(char **)(table->data), new);
> +		/*
> +		 * For the first initialization allow constant strings.
> +		 */
> +		if (!kernel_address((unsigned long)old))
> +			free_rcu_string(old);
> +		mutex_unlock(&rcu_string_mutex);
> +	} else {
> +		char *str;
> +
> +		str = access_rcu_string(*(char **)(table->data), table->maxlen,
> +					GFP_KERNEL);

So the above statement picks up the pointer from table->data before
access_rcu_string() enters its RCU read-side critical section.  Suppose
we get preempted right after that fetch, some other CPU comes in and
executes the "write" side of this "if" statement, and the grace period
elapses.  When we resume, we are holding a pointer to a string that may
already have been freed, and ... ouch!

One trick would be to make access_rcu_string() a macro that does the
first access to its first argument inside an RCU read-side critical
section.  Alternatively, pass in the address of the pointer, rather
than the pointer itself.

Or explain to me how I am confused.
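
For concreteness, here is a minimal userspace sketch of the second
alternative (passing the address of the pointer).  It is not the code
from the rcustring patch: access_rcu_string_ptr() is a hypothetical
name, malloc() stands in for a GFP_KERNEL allocation, and the RCU
primitives are trivial stand-ins so that the example builds outside
the kernel.  The only point is that the shared pointer is fetched
after rcu_read_lock(), not before the call.

```c
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins (assumptions) for the kernel RCU primitives. */
static int rcu_nesting;	/* depth of read-side critical sections */
static void rcu_read_lock(void)   { rcu_nesting++; }
static void rcu_read_unlock(void) { rcu_nesting--; }
#define rcu_dereference(p) (*(char * volatile *)&(p))

/*
 * Hypothetical variant of access_rcu_string(): takes the ADDRESS of the
 * RCU-protected pointer, so the pointer itself is only fetched inside
 * the read-side critical section.  Returns a private copy that the
 * caller must free, or NULL on allocation failure.
 */
static char *access_rcu_string_ptr(char **strp, size_t maxlen)
{
	const char *cur;
	char *copy;

	if (!maxlen)
		return NULL;

	/* Allocate first: a sleeping allocation must not happen under RCU. */
	copy = malloc(maxlen);
	if (!copy)
		return NULL;

	rcu_read_lock();
	cur = rcu_dereference(*strp);	/* fetch under RCU protection */
	strncpy(copy, cur, maxlen - 1);
	copy[maxlen - 1] = '\0';
	rcu_read_unlock();

	return copy;
}
```

Under these assumptions, the read side of proc_rcu_string() would pass
(char **)table->data instead of *(char **)(table->data), so nothing
dereferences the shared pointer outside the critical section.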

> +		if (!str)
> +			return -ENOMEM;
> +		ret = _proc_do_string(str, table->maxlen, write, buffer, lenp, ppos);
> +		kfree(str);
> +	}
> +	return ret;
> +}
> +
> +/**
>   * proc_dostring - read a string sysctl
>   * @table: the sysctl table
>   * @write: %TRUE if this is a write to the sysctl file
> @@ -2030,6 +2085,10 @@ static int _proc_do_string(void* data, i
>   * and a newline '\n' is added. It is truncated if the buffer is
>   * not large enough.
>   *
> + * WARNING: this should be only used for read only strings
> + * or when you have a wrapper with special locking. Otherwise
> + * use proc_rcu_string to avoid races with the consumer.
> + *
>   * Returns 0 on success.
>   */
>  int proc_dostring(struct ctl_table *table, int write,
> @@ -2614,6 +2673,12 @@ int proc_dostring(struct ctl_table *tabl
>  	return -ENOSYS;
>  }
> 
> +int proc_rcu_string(struct ctl_table *table, int write,
> +		  void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> +	return -ENOSYS;
> +}
> +
>  int proc_dointvec(struct ctl_table *table, int write,
>  		  void __user *buffer, size_t *lenp, loff_t *ppos)
>  {
> @@ -2670,6 +2735,7 @@ EXPORT_SYMBOL(proc_dointvec_minmax);
>  EXPORT_SYMBOL(proc_dointvec_userhz_jiffies);
>  EXPORT_SYMBOL(proc_dointvec_ms_jiffies);
>  EXPORT_SYMBOL(proc_dostring);
> +EXPORT_SYMBOL(proc_rcu_string);
>  EXPORT_SYMBOL(proc_doulongvec_minmax);
>  EXPORT_SYMBOL(proc_doulongvec_ms_jiffies_minmax);
>  EXPORT_SYMBOL(register_sysctl_table);
> Index: linux-2.6.33-rc1-ak/kernel/sysctl_check.c
> ===================================================================
> --- linux-2.6.33-rc1-ak.orig/kernel/sysctl_check.c
> +++ linux-2.6.33-rc1-ak/kernel/sysctl_check.c
> @@ -131,6 +131,7 @@ int sysctl_check_table(struct nsproxy *n
>  				set_fail(&fail, table, "Directory with extra2");
>  		} else {
>  			if ((table->proc_handler == proc_dostring) ||
> +			    (table->proc_handler == proc_rcu_string) ||
>  			    (table->proc_handler == proc_dointvec) ||
>  			    (table->proc_handler == proc_dointvec_minmax) ||
>  			    (table->proc_handler == proc_dointvec_jiffies) ||
