From mboxrd@z Thu Jan 1 00:00:00 1970 From: ebiederm@xmission.com (Eric W. Biederman) Subject: Re: [REVIEW][PATCH] Making poll generally useful for sysctls Date: Sat, 24 Mar 2012 00:58:33 -0700 Message-ID: References: <20120313005855.GA24639@redhat.com> <20120318192755.GB6589@ZenIV.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: Linux Kernel , linux-fsdevel@vger.kernel.org To: Lucas De Marchi Return-path: In-Reply-To: (Lucas De Marchi's message of "Sat, 24 Mar 2012 03:20:53 -0300") Sender: linux-kernel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org Lucas De Marchi writes: >> =C2=A0/* A sysctl table is an array of struct ctl_table: */ >> =C2=A0struct ctl_table >> =C2=A0{ >> =C2=A0 =C2=A0 =C2=A0 =C2=A0const char *procname; =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 /* Text ID for /proc/sys, or zero */ >> =C2=A0 =C2=A0 =C2=A0 =C2=A0void *data; >> + =C2=A0 =C2=A0 =C2=A0 atomic_t event; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0int maxlen; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0umode_t mode; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct ctl_table *child; =C2=A0 =C2=A0 =C2= =A0 =C2=A0/* Deprecated */ >> =C2=A0 =C2=A0 =C2=A0 =C2=A0proc_handler *proc_handler; =C2=A0 =C2=A0= /* Callback for text formatting */ >> - =C2=A0 =C2=A0 =C2=A0 struct ctl_table_poll *poll; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0void *extra1; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0void *extra2; >> =C2=A0}; >> @@ -1042,6 +1025,7 @@ struct ctl_table_header >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0}; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0struct rcu_he= ad rcu; >> =C2=A0 =C2=A0 =C2=A0 =C2=A0}; >> + =C2=A0 =C2=A0 =C2=A0 wait_queue_head_t wait; > > If you have a waitqueue per table instead of per entry you will get > spurious notifications when other entries change. The easiest way to > test this is to poll /proc/sys/kernel/hostname and change your > domainname. You will get spurious wakeups but not spurious notifications to userspace since event is still per entry. =46or my money that seemed a nice compromise of code simplicity, and generality. We could of course do something much closer to what sysfs does and allocate and refcount something similar to your ctl_table_poll when we have a ctl_table opened. But that just looked like a pain. Of course we already have spurious notifications for hostname and domainname when multiple uts namespaces are involved, but that is a different problem. > I couldn't apply this patch to any tree (linus/master + my previous > patch, your master, 3.3 + my previous patch), so I couldn't test. On > top of your tree: How odd. It should have applied cleanly to my tree and it applies with just a two line offset top of Linus's latest with my tree merged in. Those two lines of offset coming from the two extra includes that came in through the merge. patch -p1 --dry-run < ~/tmp/sysctl-poll-test.patch=20 patching file fs/proc/proc_sysctl.c Hunk #1 succeeded at 18 (offset 2 lines). Hunk #2 succeeded at 173 (offset 2 lines). Hunk #3 succeeded at 245 (offset 2 lines). Hunk #4 succeeded at 512 (offset 2 lines). Hunk #5 succeeded at 542 (offset 2 lines). Hunk #6 succeeded at 561 (offset 2 lines). patching file include/linux/sysctl.h patching file kernel/utsname_sysctl.c > [lucas@vader kernel]$ git am /tmp/a.patch > Applying: Making poll generally useful for sysctls > error: patch failed: fs/proc/proc_sysctl.c:16 > error: fs/proc/proc_sysctl.c: patch does not apply > error: patch failed: include/linux/sysctl.h:992 > error: include/linux/sysctl.h: patch does not apply > Patch failed at 0001 Making poll generally useful for sysctls Here is rebased version of the patch just in case that helps. --- diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index 21d836f..739615c 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -18,13 +18,15 @@ static const struct inode_operations proc_sys_inode= _operations; static const struct file_operations proc_sys_dir_file_operations; static const struct inode_operations proc_sys_dir_operations; =20 -void proc_sys_poll_notify(struct ctl_table_poll *poll) +static inline void *proc_sys_poll_event(struct ctl_table *table) { - if (!poll) - return; + return (void *)(unsigned long)atomic_read(&table->event); +} =20 - atomic_inc(&poll->event); - wake_up_interruptible(&poll->wait); +void proc_sys_poll_notify(struct ctl_table_header *head, struct ctl_ta= ble *table) +{ + atomic_inc(&table->event); + wake_up_interruptible(&head->wait); } =20 static struct ctl_table root_table[] =3D { @@ -171,6 +173,7 @@ static void init_header(struct ctl_table_header *he= ad, for (entry =3D table; entry->procname; entry++, node++) { rb_init_node(&node->node); node->header =3D head; + atomic_set(&entry->event, 1); } } } @@ -242,6 +245,8 @@ static void start_unregistering(struct ctl_table_he= ader *p) /* anything non-NULL; we'll never dereference it */ p->unregistering =3D ERR_PTR(-EINVAL); } + /* Don't let poll sleep forever on deleted entries */ + wake_up_interruptible(&p->wait); /* * do not remove from the list until nobody holds it; walking the * list in do_sysctl() relies on that. @@ -507,6 +512,9 @@ static ssize_t proc_sys_call_handler(struct file *f= ilp, void __user *buf, error =3D table->proc_handler(table, write, buf, &res, ppos); if (!error) error =3D res; + + if (write) + proc_sys_poll_notify(head, table); out: sysctl_head_finish(head); =20 @@ -534,8 +542,7 @@ static int proc_sys_open(struct inode *inode, struc= t file *filp) if (IS_ERR(head)) return PTR_ERR(head); =20 - if (table->poll) - filp->private_data =3D proc_sys_poll_event(table->poll); + filp->private_data =3D proc_sys_poll_event(table); =20 sysctl_head_finish(head); =20 @@ -554,21 +561,14 @@ static unsigned int proc_sys_poll(struct file *fi= lp, poll_table *wait) if (IS_ERR(head)) return POLLERR | POLLHUP; =20 - if (!table->proc_handler) - goto out; - - if (!table->poll) - goto out; - event =3D (unsigned long)filp->private_data; - poll_wait(filp, &table->poll->wait, wait); + poll_wait(filp, &head->wait, wait); =20 - if (event !=3D atomic_read(&table->poll->event)) { - filp->private_data =3D proc_sys_poll_event(table->poll); + if (event !=3D atomic_read(&table->event)) { + filp->private_data =3D proc_sys_poll_event(table); ret =3D POLLIN | POLLRDNORM | POLLERR | POLLPRI; } =20 -out: sysctl_head_finish(head); =20 return ret; diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h index c34b4c8..8647ee0 100644 --- a/include/linux/sysctl.h +++ b/include/linux/sysctl.h @@ -992,34 +992,17 @@ extern int proc_do_large_bitmap(struct ctl_table = *, int, * cover common cases. */ =20 -/* Support for userspace poll() to watch for changes */ -struct ctl_table_poll { - atomic_t event; - wait_queue_head_t wait; -}; - -static inline void *proc_sys_poll_event(struct ctl_table_poll *poll) -{ - return (void *)(unsigned long)atomic_read(&poll->event); -} - -#define __CTL_TABLE_POLL_INITIALIZER(name) { \ - .event =3D ATOMIC_INIT(0), \ - .wait =3D __WAIT_QUEUE_HEAD_INITIALIZER(name.wait) } - -#define DEFINE_CTL_TABLE_POLL(name) \ - struct ctl_table_poll name =3D __CTL_TABLE_POLL_INITIALIZER(name) =20 /* A sysctl table is an array of struct ctl_table: */ struct ctl_table=20 { const char *procname; /* Text ID for /proc/sys, or zero */ void *data; + atomic_t event; int maxlen; umode_t mode; struct ctl_table *child; /* Deprecated */ proc_handler *proc_handler; /* Callback for text formatting */ - struct ctl_table_poll *poll; void *extra1; void *extra2; }; @@ -1042,6 +1025,7 @@ struct ctl_table_header }; struct rcu_head rcu; }; + wait_queue_head_t wait; struct completion *unregistering; struct ctl_table *ctl_table_arg; struct ctl_table_root *root; @@ -1076,7 +1060,7 @@ struct ctl_path { =20 #ifdef CONFIG_SYSCTL =20 -void proc_sys_poll_notify(struct ctl_table_poll *poll); +void proc_sys_poll_notify(struct ctl_table_header *head, struct ctl_ta= ble *table); =20 extern void setup_sysctl_set(struct ctl_table_set *p, struct ctl_table_root *root, diff --git a/kernel/utsname_sysctl.c b/kernel/utsname_sysctl.c index 63da38c..3e91a50 100644 --- a/kernel/utsname_sysctl.c +++ b/kernel/utsname_sysctl.c @@ -53,18 +53,12 @@ static int proc_do_uts_string(ctl_table *table, int= write, r =3D proc_dostring(&uts_table,write,buffer,lenp, ppos); put_uts(table, write, uts_table.data); =20 - if (write) - proc_sys_poll_notify(table->poll); - return r; } #else #define proc_do_uts_string NULL #endif =20 -static DEFINE_CTL_TABLE_POLL(hostname_poll); -static DEFINE_CTL_TABLE_POLL(domainname_poll); - static struct ctl_table uts_kern_table[] =3D { { .procname =3D "ostype", @@ -93,7 +87,6 @@ static struct ctl_table uts_kern_table[] =3D { .maxlen =3D sizeof(init_uts_ns.name.nodename), .mode =3D 0644, .proc_handler =3D proc_do_uts_string, - .poll =3D &hostname_poll, }, { .procname =3D "domainname", @@ -101,7 +94,6 @@ static struct ctl_table uts_kern_table[] =3D { .maxlen =3D sizeof(init_uts_ns.name.domainname), .mode =3D 0644, .proc_handler =3D proc_do_uts_string, - .poll =3D &domainname_poll, }, {} }; @@ -115,6 +107,8 @@ static struct ctl_table uts_root_table[] =3D { {} }; =20 +static struct ctl_table_header *uts_header; + #ifdef CONFIG_PROC_SYSCTL /* * Notify userspace about a change in a certain entry of uts_kern_tabl= e, @@ -124,13 +118,13 @@ void uts_proc_notify(enum uts_proc proc) { struct ctl_table *table =3D &uts_kern_table[proc]; =20 - proc_sys_poll_notify(table->poll); + proc_sys_poll_notify(uts_header, table); } #endif =20 static int __init utsname_sysctl_init(void) { - register_sysctl_table(uts_root_table); + uts_header =3D register_sysctl_table(uts_root_table); return 0; } =20