* [PATCH] poll/wait/md: allow module to safely support 'poll' on /proc files
@ 2014-03-12 2:36 NeilBrown
2014-03-12 3:03 ` Andrew Morton
0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2014-03-12 2:36 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Andrew Morton, Alexander Viro
Cc: linux-raid, linux-kernel, linux-fsdevel, majianpeng
[-- Attachment #1: Type: text/plain, Size: 5371 bytes --]
The md driver currently supports 'poll' on /proc/mdstat.
This is unsafe as if the md-mod module is removed while a 'poll'
or 'select' is outstanding on /proc/mdstat, an oops occurs
when the syscall completes.
poll_freewait() will call remove_wait_queue() on a wait_queue_head_t
which was local to the module which no-longer exists.
This problem is particular to /proc. Most filesystems do not
allow the module to be unloaded while any files are open on it.
/proc only blocks module unloading while a file_operations
call is currently active into the module, not while the file is open.
kernfs has this property too but kernfs allocates a wait_queue_head_t
in its internal data structures so the module doesn't need to provide
one.
(A previous patch to add a similar allocation to procfs was not
accepted).
This patch takes a different approach and allows a module to
disconnect the wait_queue_head_t that was passed to poll_wait()
from all the clients which are waiting on it. Thus after calling
proc_remove_entry("mdstat", NULL);
we simply call
wait_queue_purge(&md_event_waiters);
and then know that it is safe to remove the module.
rcu infrastructure is used to avoid races.
poll_freewait() checks if the purge has happened under rcu_read_lock()
to ensure that it never touches any freed memory. wait_queue_purge()
uses synchronize_rcu() to ensure no poll_freewait() could still be
looking at the wait_queue_head_t.
Reported-by: "majianpeng" <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 4ad5cc4e63e8..e28c9d2a1166 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8681,6 +8681,7 @@ static __exit void md_exit(void)
unregister_reboot_notifier(&md_notifier);
unregister_sysctl_table(raid_table_header);
remove_proc_entry("mdstat", NULL);
+ wait_queue_purge(&md_event_waiters);
for_each_mddev(mddev, tmp) {
export_array(mddev);
mddev->hold_active = 0;
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index af903128891c..a095312d01e2 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -521,7 +521,7 @@ static void ep_remove_wait_queue(struct eppoll_entry *pwq)
/* If it is cleared by POLLFREE, it should be rcu-safe */
whead = rcu_dereference(pwq->whead);
if (whead)
- remove_wait_queue(whead, &pwq->wait);
+ remove_wait_queue_purgeable(whead, &pwq->wait);
rcu_read_unlock();
}
diff --git a/fs/select.c b/fs/select.c
index 467bb1cb3ea5..7c35bcdbd94c 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -129,7 +129,7 @@ EXPORT_SYMBOL(poll_initwait);
static void free_poll_entry(struct poll_table_entry *entry)
{
- remove_wait_queue(entry->wait_address, &entry->wait);
+ remove_wait_queue_purgeable(entry->wait_address, &entry->wait);
fput(entry->filp);
}
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 559044c79232..18d0d2fbf3bd 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -106,6 +106,8 @@ static inline int waitqueue_active(wait_queue_head_t *q)
extern void add_wait_queue(wait_queue_head_t *q, wait_queue_t *wait);
extern void add_wait_queue_exclusive(wait_queue_head_t *q, wait_queue_t *wait);
extern void remove_wait_queue(wait_queue_head_t *q, wait_queue_t *wait);
+extern void wait_queue_purge(wait_queue_head_t *q);
+extern void remove_wait_queue_purgeable(wait_queue_head_t *q, wait_queue_t *wait);
static inline void __add_wait_queue(wait_queue_head_t *head, wait_queue_t *new)
{
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 7d50f794e248..12548730c6ed 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -52,6 +52,51 @@ void remove_wait_queue(wait_queue_head_t *q, wait_queue_t *wait)
EXPORT_SYMBOL(remove_wait_queue);
+/**
+ * wait_queue_purge - remove all waiter from a wait_queue
+ * @q: The queue to be purged
+ *
+ * Unlink all pending waiters from the queue.
+ * This can be used prior to freeing a queue providing all waiters are
+ * prepared for queue purging.
+ * Waiters must call remove_wait_queue_puregeable() rather than
+ * remove_wait_queue().
+ *
+ */
+void wait_queue_purge(wait_queue_head_t *q)
+{
+ spin_lock(&q->lock);
+ while (!list_empty(&q->task_list))
+ list_del_init(q->task_list.next);
+ spin_unlock(&q->lock);
+ synchronize_rcu();
+}
+EXPORT_SYMBOL(wait_queue_purge);
+
+/**
+ * remove_wait_queue_puregeable - remove_wait_queue if wait_queue_purge might be used.
+ * @q: the queue, which may already be purged, to remove from
+ * @wait: the waiter to remove
+ *
+ * Remove a waiter from a queue if it hasn't already been purged.
+ * If the queue has already been purged then task_list will be empty.
+ * If it isn't then it is still safe to lock the queue and remove
+ * the task.
+ */
+void remove_wait_queue_purgeable(wait_queue_head_t *q, wait_queue_t *wait)
+{
+ unsigned long flags;
+
+ rcu_read_lock();
+ if (!list_empty(&wait->task_list)) {
+ spin_lock_irqsave(&q->lock, flags);
+ list_del_init(&wait->task_list);
+ spin_unlock_irqrestore(&q->lock, flags);
+ }
+ rcu_read_unlock();
+}
+EXPORT_SYMBOL(remove_wait_queue_purgeable);
+
/*
* The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
* wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] poll/wait/md: allow module to safely support 'poll' on /proc files
2014-03-12 2:36 [PATCH] poll/wait/md: allow module to safely support 'poll' on /proc files NeilBrown
@ 2014-03-12 3:03 ` Andrew Morton
2014-03-12 3:10 ` NeilBrown
0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2014-03-12 3:03 UTC (permalink / raw)
To: NeilBrown
Cc: Ingo Molnar, Peter Zijlstra, Alexander Viro, linux-raid,
linux-kernel, linux-fsdevel, majianpeng
On Wed, 12 Mar 2014 13:36:38 +1100 NeilBrown <neilb@suse.de> wrote:
>
> The md driver currently supports 'poll' on /proc/mdstat.
> This is unsafe as if the md-mod module is removed while a 'poll'
> or 'select' is outstanding on /proc/mdstat, an oops occurs
> when the syscall completes.
> poll_freewait() will call remove_wait_queue() on a wait_queue_head_t
> which was local to the module which no-longer exists.
>
> This problem is particular to /proc. Most filesystems do not
> allow the module to be unloaded while any files are open on it.
> /proc only blocks module unloading while a file_operations
> call is currently active into the module, not while the file is open.
> kernfs has this property too but kernfs allocates a wait_queue_head_t
> in its internal data structures so the module doesn't need to provide
> one.
> (A previous patch to add a similar allocation to procfs was not
> accepted).
By who, me? I was hoping we could somehow keep the implementation
contained within md. I don't think I actually looked at it to any
significant extent!
Was hoping that viro would pipe up.
> This patch takes a different approach and allows a module to
> disconnect the wait_queue_head_t that was passed to poll_wait()
> from all the clients which are waiting on it. Thus after calling
> proc_remove_entry("mdstat", NULL);
> we simply call
> wait_queue_purge(&md_event_waiters);
>
> and then know that it is safe to remove the module.
>
> rcu infrastructure is used to avoid races.
> poll_freewait() checks if the purge has happened under rcu_read_lock()
> to ensure that it never touches any freed memory. wait_queue_purge()
> uses synchronize_rcu() to ensure no poll_freewait() could still be
> looking at the wait_queue_head_t.
>
> ...
>
> +/**
> + * wait_queue_purge - remove all waiter from a wait_queue
> + * @q: The queue to be purged
> + *
> + * Unlink all pending waiters from the queue.
> + * This can be used prior to freeing a queue providing all waiters are
> + * prepared for queue purging.
> + * Waiters must call remove_wait_queue_puregeable() rather than
> + * remove_wait_queue().
> + *
> + */
> +void wait_queue_purge(wait_queue_head_t *q)
> +{
> + spin_lock(&q->lock);
> + while (!list_empty(&q->task_list))
> + list_del_init(q->task_list.next);
> + spin_unlock(&q->lock);
> + synchronize_rcu();
> +}
> +EXPORT_SYMBOL(wait_queue_purge);
I don't get this. If a task is waiting on that wait_queue_head_t, how
does it get woken?
> +/**
> + * remove_wait_queue_puregeable - remove_wait_queue if wait_queue_purge might be used.
> + * @q: the queue, which may already be purged, to remove from
> + * @wait: the waiter to remove
> + *
> + * Remove a waiter from a queue if it hasn't already been purged.
> + * If the queue has already been purged then task_list will be empty.
> + * If it isn't then it is still safe to lock the queue and remove
> + * the task.
> + */
> +void remove_wait_queue_purgeable(wait_queue_head_t *q, wait_queue_t *wait)
> +{
> + unsigned long flags;
> +
> + rcu_read_lock();
> + if (!list_empty(&wait->task_list)) {
> + spin_lock_irqsave(&q->lock, flags);
Mixture of spin_lock_irqsave() here and spin_lock() in
wait_queue_purge() looks odd.
> + list_del_init(&wait->task_list);
> + spin_unlock_irqrestore(&q->lock, flags);
> + }
> + rcu_read_unlock();
> +}
> +EXPORT_SYMBOL(remove_wait_queue_purgeable);
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] poll/wait/md: allow module to safely support 'poll' on /proc files
2014-03-12 3:03 ` Andrew Morton
@ 2014-03-12 3:10 ` NeilBrown
2014-03-12 4:19 ` Andrew Morton
0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2014-03-12 3:10 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Peter Zijlstra, Alexander Viro, linux-raid,
linux-kernel, linux-fsdevel, majianpeng
[-- Attachment #1: Type: text/plain, Size: 4373 bytes --]
On Tue, 11 Mar 2014 20:03:31 -0700 Andrew Morton <akpm@linux-foundation.org>
wrote:
> On Wed, 12 Mar 2014 13:36:38 +1100 NeilBrown <neilb@suse.de> wrote:
>
> >
> > The md driver currently supports 'poll' on /proc/mdstat.
> > This is unsafe as if the md-mod module is removed while a 'poll'
> > or 'select' is outstanding on /proc/mdstat, an oops occurs
> > when the syscall completes.
> > poll_freewait() will call remove_wait_queue() on a wait_queue_head_t
> > which was local to the module which no-longer exists.
> >
> > This problem is particular to /proc. Most filesystems do not
> > allow the module to be unloaded while any files are open on it.
> > /proc only blocks module unloading while a file_operations
> > call is currently active into the module, not while the file is open.
> > kernfs has this property too but kernfs allocates a wait_queue_head_t
> > in its internal data structures so the module doesn't need to provide
> > one.
> > (A previous patch to add a similar allocation to procfs was not
> > accepted).
>
> By who, me? I was hoping we could somehow keep the implementation
> contained within md. I don't think I actually looked at it to any
> significant extent!
>
> Was hoping that viro would pipe up.
Was not accepted by anybody. Everybody is guilty :-)
You were at least nice enough to comment (thanks).
I think I like this version better so it might not be a problem that the
previous version was not accepted. Depends on what the scheduler guys think
though....
>
> > This patch takes a different approach and allows a module to
> > disconnect the wait_queue_head_t that was passed to poll_wait()
> > from all the clients which are waiting on it. Thus after calling
> > proc_remove_entry("mdstat", NULL);
> > we simply call
> > wait_queue_purge(&md_event_waiters);
> >
> > and then know that it is safe to remove the module.
> >
> > rcu infrastructure is used to avoid races.
> > poll_freewait() checks if the purge has happened under rcu_read_lock()
> > to ensure that it never touches any freed memory. wait_queue_purge()
> > uses synchronize_rcu() to ensure no poll_freewait() could still be
> > looking at the wait_queue_head_t.
> >
> > ...
> >
> > +/**
> > + * wait_queue_purge - remove all waiter from a wait_queue
> > + * @q: The queue to be purged
> > + *
> > + * Unlink all pending waiters from the queue.
> > + * This can be used prior to freeing a queue providing all waiters are
> > + * prepared for queue purging.
> > + * Waiters must call remove_wait_queue_puregeable() rather than
> > + * remove_wait_queue().
> > + *
> > + */
> > +void wait_queue_purge(wait_queue_head_t *q)
> > +{
> > + spin_lock(&q->lock);
> > + while (!list_empty(&q->task_list))
> > + list_del_init(q->task_list.next);
> > + spin_unlock(&q->lock);
> > + synchronize_rcu();
> > +}
> > +EXPORT_SYMBOL(wait_queue_purge);
>
> I don't get this. If a task is waiting on that wait_queue_head_t, how
> does it get woken?
This is only expected to be used by tasks which have some other means of
being woken up. Possibly a timeout passed to 'select' or 'poll', possibly a
signal.
>
> > +/**
> > + * remove_wait_queue_puregeable - remove_wait_queue if wait_queue_purge might be used.
> > + * @q: the queue, which may already be purged, to remove from
> > + * @wait: the waiter to remove
> > + *
> > + * Remove a waiter from a queue if it hasn't already been purged.
> > + * If the queue has already been purged then task_list will be empty.
> > + * If it isn't then it is still safe to lock the queue and remove
> > + * the task.
> > + */
> > +void remove_wait_queue_purgeable(wait_queue_head_t *q, wait_queue_t *wait)
> > +{
> > + unsigned long flags;
> > +
> > + rcu_read_lock();
> > + if (!list_empty(&wait->task_list)) {
> > + spin_lock_irqsave(&q->lock, flags);
>
> Mixture of spin_lock_irqsave() here and spin_lock() in
> wait_queue_purge() looks odd.
True - I was copying remove_wait_queue(). Maybe I should have just called
remove_wait_queue() directly (we don't actually need that _init on the
list_del).
>
> > + list_del_init(&wait->task_list);
> > + spin_unlock_irqrestore(&q->lock, flags);
> > + }
> > + rcu_read_unlock();
> > +}
> > +EXPORT_SYMBOL(remove_wait_queue_purgeable);
Thanks,
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] poll/wait/md: allow module to safely support 'poll' on /proc files
2014-03-12 3:10 ` NeilBrown
@ 2014-03-12 4:19 ` Andrew Morton
2014-03-12 4:37 ` NeilBrown
0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2014-03-12 4:19 UTC (permalink / raw)
To: NeilBrown
Cc: Ingo Molnar, Peter Zijlstra, Alexander Viro, linux-raid,
linux-kernel, linux-fsdevel, majianpeng
On Wed, 12 Mar 2014 14:10:25 +1100 NeilBrown <neilb@suse.de> wrote:
> On Tue, 11 Mar 2014 20:03:31 -0700 Andrew Morton <akpm@linux-foundation.org>
> wrote:
>
> > On Wed, 12 Mar 2014 13:36:38 +1100 NeilBrown <neilb@suse.de> wrote:
> >
> > >
> > > The md driver currently supports 'poll' on /proc/mdstat.
> > > This is unsafe as if the md-mod module is removed while a 'poll'
> > > or 'select' is outstanding on /proc/mdstat, an oops occurs
> > > when the syscall completes.
> > > poll_freewait() will call remove_wait_queue() on a wait_queue_head_t
> > > which was local to the module which no-longer exists.
> > >
> > > This problem is particular to /proc. Most filesystems do not
> > > allow the module to be unloaded while any files are open on it.
> > > /proc only blocks module unloading while a file_operations
> > > call is currently active into the module, not while the file is open.
> > > kernfs has this property too but kernfs allocates a wait_queue_head_t
> > > in its internal data structures so the module doesn't need to provide
> > > one.
> > > (A previous patch to add a similar allocation to procfs was not
> > > accepted).
> >
> > By who, me? I was hoping we could somehow keep the implementation
> > contained within md. I don't think I actually looked at it to any
> > significant extent!
> >
> > Was hoping that viro would pipe up.
>
> Was not accepted by anybody. Everybody is guilty :-)
> You were at least nice enough to comment (thanks).
>
> I think I like this version better so it might not be a problem that the
> previous version was not accepted. Depends on what the scheduler guys think
> though....
OK..
> > > ...
> > >
> > > +/**
> > > + * wait_queue_purge - remove all waiter from a wait_queue
> > > + * @q: The queue to be purged
> > > + *
> > > + * Unlink all pending waiters from the queue.
> > > + * This can be used prior to freeing a queue providing all waiters are
> > > + * prepared for queue purging.
> > > + * Waiters must call remove_wait_queue_puregeable() rather than
> > > + * remove_wait_queue().
> > > + *
> > > + */
> > > +void wait_queue_purge(wait_queue_head_t *q)
> > > +{
> > > + spin_lock(&q->lock);
> > > + while (!list_empty(&q->task_list))
> > > + list_del_init(q->task_list.next);
> > > + spin_unlock(&q->lock);
> > > + synchronize_rcu();
> > > +}
> > > +EXPORT_SYMBOL(wait_queue_purge);
> >
> > I don't get this. If a task is waiting on that wait_queue_head_t, how
> > does it get woken?
>
> This is only expected to be used by tasks which have some other means of
> being woken up. Possibly a timeout passed to 'select' or 'poll', possibly a
> signal.
Oh. So the caller is supposed to take the tasks off the queue via
wait_queue_purge(), then to wake them up (which implies the caller has
a second way of looking the tasks up).
And the tasks themselves ... do they need to know what happened? If
they run remove_wait_queue() then will still take
wait_queue_head_t.lock, so the calling task need to somehow keep the
wait_queue_head_t alive until everyone has woken and cleared off.
Complex! Could we please get that all fleshed out in the changelog and
kerneldoc?
> >
> > > +/**
> > > + * remove_wait_queue_puregeable - remove_wait_queue if wait_queue_purge might be used.
> > > + * @q: the queue, which may already be purged, to remove from
> > > + * @wait: the waiter to remove
> > > + *
> > > + * Remove a waiter from a queue if it hasn't already been purged.
> > > + * If the queue has already been purged then task_list will be empty.
> > > + * If it isn't then it is still safe to lock the queue and remove
> > > + * the task.
> > > + */
> > > +void remove_wait_queue_purgeable(wait_queue_head_t *q, wait_queue_t *wait)
> > > +{
> > > + unsigned long flags;
> > > +
> > > + rcu_read_lock();
> > > + if (!list_empty(&wait->task_list)) {
> > > + spin_lock_irqsave(&q->lock, flags);
> >
> > Mixture of spin_lock_irqsave() here and spin_lock() in
> > wait_queue_purge() looks odd.
>
> True - I was copying remove_wait_queue(). Maybe I should have just called
> remove_wait_queue() directly (we don't actually need that _init on the
> list_del).
lockdep should have complained about the spin_lock().
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH] poll/wait/md: allow module to safely support 'poll' on /proc files
2014-03-12 4:19 ` Andrew Morton
@ 2014-03-12 4:37 ` NeilBrown
0 siblings, 0 replies; 5+ messages in thread
From: NeilBrown @ 2014-03-12 4:37 UTC (permalink / raw)
To: Andrew Morton
Cc: Ingo Molnar, Peter Zijlstra, Alexander Viro, linux-raid,
linux-kernel, linux-fsdevel, majianpeng
[-- Attachment #1: Type: text/plain, Size: 6178 bytes --]
On Tue, 11 Mar 2014 21:19:24 -0700 Andrew Morton <akpm@linux-foundation.org>
wrote:
> On Wed, 12 Mar 2014 14:10:25 +1100 NeilBrown <neilb@suse.de> wrote:
>
> > On Tue, 11 Mar 2014 20:03:31 -0700 Andrew Morton <akpm@linux-foundation.org>
> > wrote:
> >
> > > On Wed, 12 Mar 2014 13:36:38 +1100 NeilBrown <neilb@suse.de> wrote:
> > >
> > > >
> > > > The md driver currently supports 'poll' on /proc/mdstat.
> > > > This is unsafe as if the md-mod module is removed while a 'poll'
> > > > or 'select' is outstanding on /proc/mdstat, an oops occurs
> > > > when the syscall completes.
> > > > poll_freewait() will call remove_wait_queue() on a wait_queue_head_t
> > > > which was local to the module which no-longer exists.
> > > >
> > > > This problem is particular to /proc. Most filesystems do not
> > > > allow the module to be unloaded while any files are open on it.
> > > > /proc only blocks module unloading while a file_operations
> > > > call is currently active into the module, not while the file is open.
> > > > kernfs has this property too but kernfs allocates a wait_queue_head_t
> > > > in its internal data structures so the module doesn't need to provide
> > > > one.
> > > > (A previous patch to add a similar allocation to procfs was not
> > > > accepted).
> > >
> > > By who, me? I was hoping we could somehow keep the implementation
> > > contained within md. I don't think I actually looked at it to any
> > > significant extent!
> > >
> > > Was hoping that viro would pipe up.
> >
> > Was not accepted by anybody. Everybody is guilty :-)
> > You were at least nice enough to comment (thanks).
> >
> > I think I like this version better so it might not be a problem that the
> > previous version was not accepted. Depends on what the scheduler guys think
> > though....
>
> OK..
>
> > > > ...
> > > >
> > > > +/**
> > > > + * wait_queue_purge - remove all waiter from a wait_queue
> > > > + * @q: The queue to be purged
> > > > + *
> > > > + * Unlink all pending waiters from the queue.
> > > > + * This can be used prior to freeing a queue providing all waiters are
> > > > + * prepared for queue purging.
> > > > + * Waiters must call remove_wait_queue_puregeable() rather than
> > > > + * remove_wait_queue().
> > > > + *
> > > > + */
> > > > +void wait_queue_purge(wait_queue_head_t *q)
> > > > +{
> > > > + spin_lock(&q->lock);
> > > > + while (!list_empty(&q->task_list))
> > > > + list_del_init(q->task_list.next);
> > > > + spin_unlock(&q->lock);
> > > > + synchronize_rcu();
> > > > +}
> > > > +EXPORT_SYMBOL(wait_queue_purge);
> > >
> > > I don't get this. If a task is waiting on that wait_queue_head_t, how
> > > does it get woken?
> >
> > This is only expected to be used by tasks which have some other means of
> > being woken up. Possibly a timeout passed to 'select' or 'poll', possibly a
> > signal.
>
> Oh. So the caller is supposed to take the tasks off the queue via
> wait_queue_purge(), then to wake them up (which implies the caller has
> a second way of looking the tasks up).
No. The caller of wait_queue_purge() doesn't care beyond purging the queue.
The process that are waiting will be in 'select' or 'poll' or similar and
there is no guarantee that anything will ever happen to wake them up.
They can just keep waiting for all I care.
I most cases they will have set their own timeout (or should have done).
I could wake_up() the wait_queue before purging it. Then the callers would
wake up, attempt IO, probably get EINVAL or something because the /proc file
has been de-registered and hopefully move on with life.
But that is really independent of the implementation of wait_queue_purge().
It is entirely possible for a process to be waiting on multiple events (And
in the select/poll case it is very common) and if one event wants to say
"this event is never ever going to happen so I'm detaching the queue", then
there is no particular reason to do anything but detach the queue.
>
> And the tasks themselves ... do they need to know what happened? If
> they run remove_wait_queue() then will still take
> wait_queue_head_t.lock, so the calling task need to somehow keep the
> wait_queue_head_t alive until everyone has woken and cleared off.
The tasks themselves mustn't run remove_wait_queue(). They must run
remove_wait_queue_purgeable(). This check if the remove has already happened
and avoids taking the spinlock (which no longer exists) in that case.
You cannot just call wait_queue_purge() on any old queue - only one that is
known to be potentially purged and waiters know to use
remove_wait_queue_purge().
Currently that is only wait_queues that are passed to poll_wait() inside a
->poll() method.
>
> Complex! Could we please get that all fleshed out in the changelog and
> kerneldoc?
I thought the important parts (the waiter having to call
remove_wait_queue_purgeable()) already was in the kerneldoc.
Thanks,
NeilBrown
>
> > >
> > > > +/**
> > > > + * remove_wait_queue_puregeable - remove_wait_queue if wait_queue_purge might be used.
> > > > + * @q: the queue, which may already be purged, to remove from
> > > > + * @wait: the waiter to remove
> > > > + *
> > > > + * Remove a waiter from a queue if it hasn't already been purged.
> > > > + * If the queue has already been purged then task_list will be empty.
> > > > + * If it isn't then it is still safe to lock the queue and remove
> > > > + * the task.
> > > > + */
> > > > +void remove_wait_queue_purgeable(wait_queue_head_t *q, wait_queue_t *wait)
> > > > +{
> > > > + unsigned long flags;
> > > > +
> > > > + rcu_read_lock();
> > > > + if (!list_empty(&wait->task_list)) {
> > > > + spin_lock_irqsave(&q->lock, flags);
> > >
> > > Mixture of spin_lock_irqsave() here and spin_lock() in
> > > wait_queue_purge() looks odd.
> >
> > True - I was copying remove_wait_queue(). Maybe I should have just called
> > remove_wait_queue() directly (we don't actually need that _init on the
> > list_del).
>
> lockdep should have complained about the spin_lock().
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2014-03-12 4:37 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-12 2:36 [PATCH] poll/wait/md: allow module to safely support 'poll' on /proc files NeilBrown
2014-03-12 3:03 ` Andrew Morton
2014-03-12 3:10 ` NeilBrown
2014-03-12 4:19 ` Andrew Morton
2014-03-12 4:37 ` NeilBrown
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).