From: Oleg Nesterov <oleg@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Davide Libenzi <davidel@xmailserver.org>,
Eric Dumazet <eric.dumazet@gmail.com>, Greg KH <greg@kroah.com>,
Jason Baron <jbaron@redhat.com>,
Roland McGrath <roland@hack.frob.com>,
Eugene Teo <eugeneteo@kernel.sg>,
Maxime Bizon <mbizon@freebox.fr>,
Denys Vlasenko <dvlasenk@redhat.com>,
linux-kernel@vger.kernel.org
Subject: [PATCH v2 1/2] epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()
Date: Fri, 24 Feb 2012 20:07:11 +0100 [thread overview]
Message-ID: <20120224190711.GB22287@redhat.com> (raw)
In-Reply-To: <20120224190651.GA22287@redhat.com>
This patch is intentionally incomplete to simplify the review.
It ignores ep_unregister_pollwait() which plays with the same wqh.
See the next change.
epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
f_op->poll() needs. In particular it assumes that the wait queue
can't go away until eventpoll_release(). This is not true in case
of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand
which is not connected to the file.
This patch adds the special event, POLLFREE, currently only for
epoll. It expects that init_poll_funcptr()'ed hook should do the
necessary cleanup. Perhaps it should be defined as EPOLLFREE in
eventpoll.
__cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
->signalfd_wqh is not empty, we add the new signalfd_cleanup()
helper.
ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
This make this poll entry inconsistent, but we don't care. If you
share epoll fd which contains our sigfd with another process you
should blame yourself. signalfd is "really special". I simply do
not know how we can define the "right" semantics if it used with
epoll.
The main problem is, epoll calls signalfd_poll() once to establish
the connection with the wait queue, after that signalfd_poll(NULL)
returns the different/inconsistent results depending on who does
EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
has nothing to do with the file, it works with the current thread.
In short: this patch is the hack which tries to fix the symptoms.
It also assumes that nobody can take tasklist_lock under epoll
locks, this seems to be true.
Note:
- we do not have wake_up_all_poll() but wake_up_poll()
is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.
- signalfd_cleanup() uses POLLHUP along with POLLFREE,
we need a couple of simple changes in eventpoll.c to
make sure it can't be "lost".
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Cc: <stable@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
fs/eventpoll.c | 4 ++++
fs/signalfd.c | 11 +++++++++++
include/asm-generic/poll.h | 2 ++
include/linux/signalfd.h | 5 ++++-
kernel/fork.c | 5 ++++-
5 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index aabdfc3..34bbfc6 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -842,6 +842,10 @@ static int ep_poll_callback(wait_queue_t *wait, unsigned mode, int sync, void *k
struct epitem *epi = ep_item_from_wait(wait);
struct eventpoll *ep = epi->ep;
+ /* the caller holds eppoll_entry->whead->lock */
+ if ((unsigned long)key & POLLFREE)
+ list_del_init(&wait->task_list);
+
spin_lock_irqsave(&ep->lock, flags);
/*
diff --git a/fs/signalfd.c b/fs/signalfd.c
index 492465b..79c1eea 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -30,6 +30,17 @@
#include <linux/signalfd.h>
#include <linux/syscalls.h>
+void signalfd_cleanup(struct sighand_struct *sighand)
+{
+ wait_queue_head_t *wqh = &sighand->signalfd_wqh;
+
+ if (likely(!waitqueue_active(wqh)))
+ return;
+
+ /* wait_queue_t->func(POLLFREE) should do remove_wait_queue() */
+ wake_up_poll(wqh, POLLHUP | POLLFREE);
+}
+
struct signalfd_ctx {
sigset_t sigmask;
};
diff --git a/include/asm-generic/poll.h b/include/asm-generic/poll.h
index 44bce83..9ce7f44 100644
--- a/include/asm-generic/poll.h
+++ b/include/asm-generic/poll.h
@@ -28,6 +28,8 @@
#define POLLRDHUP 0x2000
#endif
+#define POLLFREE 0x4000 /* currently only for epoll */
+
struct pollfd {
int fd;
short events;
diff --git a/include/linux/signalfd.h b/include/linux/signalfd.h
index 3ff4961..247399b 100644
--- a/include/linux/signalfd.h
+++ b/include/linux/signalfd.h
@@ -61,13 +61,16 @@ static inline void signalfd_notify(struct task_struct *tsk, int sig)
wake_up(&tsk->sighand->signalfd_wqh);
}
+extern void signalfd_cleanup(struct sighand_struct *sighand);
+
#else /* CONFIG_SIGNALFD */
static inline void signalfd_notify(struct task_struct *tsk, int sig) { }
+static inline void signalfd_cleanup(struct sighand_struct *sighand) { }
+
#endif /* CONFIG_SIGNALFD */
#endif /* __KERNEL__ */
#endif /* _LINUX_SIGNALFD_H */
-
diff --git a/kernel/fork.c b/kernel/fork.c
index b77fd55..e2cd3e2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -66,6 +66,7 @@
#include <linux/user-return-notifier.h>
#include <linux/oom.h>
#include <linux/khugepaged.h>
+#include <linux/signalfd.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -935,8 +936,10 @@ static int copy_sighand(unsigned long clone_flags, struct task_struct *tsk)
void __cleanup_sighand(struct sighand_struct *sighand)
{
- if (atomic_dec_and_test(&sighand->count))
+ if (atomic_dec_and_test(&sighand->count)) {
+ signalfd_cleanup(sighand);
kmem_cache_free(sighand_cachep, sighand);
+ }
}
--
1.5.5.1
next prev parent reply other threads:[~2012-02-24 19:14 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20120222173326.GA7139@redhat.com>
2012-02-22 17:33 ` [PATCH 1/4] signalfd: introduce signalfd_cleanup() Oleg Nesterov
2012-02-22 17:34 ` [PATCH 2/4] epoll: introduce POLLFREE for ep_poll_callback() Oleg Nesterov
2012-02-22 17:34 ` [PATCH 3/4] signalfd: signalfd_cleanup() can race with remove_wait_queue() Oleg Nesterov
2012-02-22 17:35 ` [PATCH 4/4] epoll: ep_unregister_pollwait() can use the freed pwq->whead Oleg Nesterov
2012-02-23 15:44 ` Oleg Nesterov
2012-02-23 22:17 ` Linus Torvalds
2012-02-24 19:06 ` [PATCH v2 0/2] signalfd/epoll fixes Oleg Nesterov
2012-02-24 19:07 ` Oleg Nesterov [this message]
2012-02-29 19:57 ` [PATCH v2 1/2] epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree() Andy Lutomirski
2012-02-29 20:06 ` Oleg Nesterov
2012-02-29 20:16 ` Andrew Lutomirski
2012-03-01 19:26 ` Oleg Nesterov
2012-02-24 19:07 ` [PATCH v2 2/2] epoll: ep_unregister_pollwait() can use the freed pwq->whead Oleg Nesterov
2012-02-24 20:23 ` [PATCH v2 0/2] signalfd/epoll fixes Linus Torvalds
2012-02-24 23:14 ` Linus Torvalds
2012-02-25 16:08 ` Oleg Nesterov
2012-02-25 19:00 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120224190711.GB22287@redhat.com \
--to=oleg@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=davidel@xmailserver.org \
--cc=dvlasenk@redhat.com \
--cc=eric.dumazet@gmail.com \
--cc=eugeneteo@kernel.sg \
--cc=greg@kroah.com \
--cc=jbaron@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mbizon@freebox.fr \
--cc=roland@hack.frob.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.