From: Greg KH <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
alan@lxorguk.ukuu.org.uk, Maxime Bizon <mbizon@freebox.fr>,
Oleg Nesterov <oleg@redhat.com>
Subject: [ 70/73] epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()
Date: Mon, 27 Feb 2012 17:03:13 -0800 [thread overview]
Message-ID: <20120228010215.974304968@linuxfoundation.org> (raw)
In-Reply-To: <20120228010246.GA24299@kroah.com>
3.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleg Nesterov <oleg@redhat.com>
commit d80e731ecab420ddcb79ee9d0ac427acbc187b4b upstream.
This patch is intentionally incomplete to simplify the review.
It ignores ep_unregister_pollwait() which plays with the same wqh.
See the next change.
epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
f_op->poll() needs. In particular it assumes that the wait queue
can't go away until eventpoll_release(). This is not true in case
of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand
which is not connected to the file.
This patch adds the special event, POLLFREE, currently only for
epoll. It expects that init_poll_funcptr()'ed hook should do the
necessary cleanup. Perhaps it should be defined as EPOLLFREE in
eventpoll.
__cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
->signalfd_wqh is not empty, we add the new signalfd_cleanup()
helper.
ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
This make this poll entry inconsistent, but we don't care. If you
share epoll fd which contains our sigfd with another process you
should blame yourself. signalfd is "really special". I simply do
not know how we can define the "right" semantics if it used with
epoll.
The main problem is, epoll calls signalfd_poll() once to establish
the connection with the wait queue, after that signalfd_poll(NULL)
returns the different/inconsistent results depending on who does
EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
has nothing to do with the file, it works with the current thread.
In short: this patch is the hack which tries to fix the symptoms.
It also assumes that nobody can take tasklist_lock under epoll
locks, this seems to be true.
Note:
- we do not have wake_up_all_poll() but wake_up_poll()
is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.
- signalfd_cleanup() uses POLLHUP along with POLLFREE,
we need a couple of simple changes in eventpoll.c to
make sure it can't be "lost".
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/eventpoll.c | 4 ++++
fs/signalfd.c | 11 +++++++++++
include/asm-generic/poll.h | 2 ++
include/linux/signalfd.h | 5 ++++-
kernel/fork.c | 5 ++++-
5 files changed, 25 insertions(+), 2 deletions(-)
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -827,6 +827,10 @@ static int ep_poll_callback(wait_queue_t
struct epitem *epi = ep_item_from_wait(wait);
struct eventpoll *ep = epi->ep;
+ /* the caller holds eppoll_entry->whead->lock */
+ if ((unsigned long)key & POLLFREE)
+ list_del_init(&wait->task_list);
+
spin_lock_irqsave(&ep->lock, flags);
/*
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -30,6 +30,17 @@
#include <linux/signalfd.h>
#include <linux/syscalls.h>
+void signalfd_cleanup(struct sighand_struct *sighand)
+{
+ wait_queue_head_t *wqh = &sighand->signalfd_wqh;
+
+ if (likely(!waitqueue_active(wqh)))
+ return;
+
+ /* wait_queue_t->func(POLLFREE) should do remove_wait_queue() */
+ wake_up_poll(wqh, POLLHUP | POLLFREE);
+}
+
struct signalfd_ctx {
sigset_t sigmask;
};
--- a/include/asm-generic/poll.h
+++ b/include/asm-generic/poll.h
@@ -28,6 +28,8 @@
#define POLLRDHUP 0x2000
#endif
+#define POLLFREE 0x4000 /* currently only for epoll */
+
struct pollfd {
int fd;
short events;
--- a/include/linux/signalfd.h
+++ b/include/linux/signalfd.h
@@ -61,13 +61,16 @@ static inline void signalfd_notify(struc
wake_up(&tsk->sighand->signalfd_wqh);
}
+extern void signalfd_cleanup(struct sighand_struct *sighand);
+
#else /* CONFIG_SIGNALFD */
static inline void signalfd_notify(struct task_struct *tsk, int sig) { }
+static inline void signalfd_cleanup(struct sighand_struct *sighand) { }
+
#endif /* CONFIG_SIGNALFD */
#endif /* __KERNEL__ */
#endif /* _LINUX_SIGNALFD_H */
-
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -67,6 +67,7 @@
#include <linux/user-return-notifier.h>
#include <linux/oom.h>
#include <linux/khugepaged.h>
+#include <linux/signalfd.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
@@ -917,8 +918,10 @@ static int copy_sighand(unsigned long cl
void __cleanup_sighand(struct sighand_struct *sighand)
{
- if (atomic_dec_and_test(&sighand->count))
+ if (atomic_dec_and_test(&sighand->count)) {
+ signalfd_cleanup(sighand);
kmem_cache_free(sighand_cachep, sighand);
+ }
}
next prev parent reply other threads:[~2012-02-28 1:10 UTC|newest]
Thread overview: 74+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-28 1:02 [ 00/73] 3.0.23-stable review Greg KH
2012-02-28 1:02 ` [ 01/73] ASoC: wm8962: Fix sidetone enumeration texts Greg KH
2012-02-28 1:02 ` [ 02/73] NOMMU: Lock i_mmap_mutex for access to the VMA prio list Greg KH
2012-02-28 1:02 ` [ 03/73] hwmon: (max6639) Fix FAN_FROM_REG calculation Greg KH
2012-02-28 1:02 ` [ 04/73] hwmon: (max6639) Fix PPR register initialization to set both channels Greg KH
2012-02-28 1:02 ` [ 05/73] hwmon: (ads1015) Fix file leak in probe function Greg KH
2012-02-28 1:02 ` [ 06/73] powerpc/perf: power_pmu_start restores incorrect values, breaking frequency events Greg KH
2012-02-28 1:02 ` [ 07/73] drm/radeon/kms: fix MSI re-arm on rv370+ Greg KH
2012-02-28 1:02 ` [ 08/73] PCI: workaround hard-wired bus number V2 Greg KH
2012-02-28 1:02 ` [ 09/73] mac80211: Fix a rwlock bad magic bug Greg KH
2012-02-28 1:02 ` [ 10/73] ipheth: Add iPhone 4S Greg KH
2012-02-28 1:02 ` [ 11/73] eCryptfs: Copy up lower inode attrs after setting lower xattr Greg KH
2012-02-28 1:02 ` [ 12/73] ALSA: hda - Fix redundant jack creations for cx5051 Greg KH
2012-02-28 1:02 ` [ 13/73] mmc: core: check for zero length ioctl data Greg KH
2012-02-28 1:02 ` [ 14/73] NFSv4: Ensure we throw out bad delegation stateids on NFS4ERR_BAD_STATEID Greg KH
2012-02-28 1:02 ` [ 15/73] ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR Greg KH
2012-02-28 1:02 ` [ 16/73] ARM: 7325/1: fix v7 boot with lockdep enabled Greg KH
2012-02-28 1:02 ` [ 17/73] net: Make qdisc_skb_cb upper size bound explicit Greg KH
2012-02-28 1:02 ` [ 18/73] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses Greg KH
2012-02-28 1:02 ` [ 19/73] gro: more generic L2 header check Greg KH
2012-02-28 1:02 ` [ 20/73] veth: Enforce minimum size of VETH_INFO_PEER Greg KH
2012-02-28 1:02 ` [ 21/73] 3c59x: shorten timer period for slave devices Greg KH
2012-02-28 1:02 ` [ 22/73] ipv6-multicast: Fix memory leak in input path Greg KH
2012-02-28 1:02 ` [ 23/73] ipv6-multicast: Fix memory leak in IPv6 multicast Greg KH
2012-02-28 1:02 ` [ 24/73] ipv4: fix for ip_options_rcv_srr() daddr update Greg KH
2012-02-28 1:02 ` [ 25/73] ipv4: Save nexthop address of LSRR/SSRR option to IPCB Greg KH
2012-02-28 1:02 ` [ 26/73] ipv4: Fix wrong order of ip_rt_get_source() and update iph->daddr Greg KH
2012-02-28 1:02 ` [ 27/73] ipv4: reset flowi parameters on route connect Greg KH
2012-02-28 1:02 ` [ 28/73] net: Dont proxy arp respond if iif == rt->dst.dev if private VLAN is disabled Greg KH
2012-02-28 1:02 ` [ 29/73] netpoll: netpoll_poll_dev() should access dev->flags Greg KH
2012-02-28 1:02 ` [ 30/73] net_sched: Bug in netem reordering Greg KH
2012-02-28 1:02 ` [ 31/73] via-velocity: S3 resume fix Greg KH
2012-02-28 1:02 ` [ 32/73] tcp_v4_send_reset: binding oif to iif in no sock case Greg KH
2012-02-28 1:02 ` [ 33/73] tcp: allow tcp_sacktag_one() to tag ranges not aligned with skbs Greg KH
2012-02-28 1:02 ` [ 34/73] tcp: fix range tcp_shifted_skb() passes to tcp_sacktag_one() Greg KH
2012-02-28 1:02 ` [ 35/73] tcp: fix tcp_shifted_skb() adjustment of lost_cnt_hint for FACK Greg KH
2012-02-28 1:02 ` [ 36/73] route: fix ICMP redirect validation Greg KH
2012-02-28 1:02 ` [ 37/73] ipv4: fix redirect handling Greg KH
2012-02-28 1:02 ` [ 38/73] USB: Added Kamstrup VID/PIDs to cp210x serial driver Greg KH
2012-02-28 1:02 ` [ 39/73] USB: option: cleanup zte 3g-dongles pid in option.c Greg KH
2012-02-28 1:02 ` [ 40/73] USB: Serial: ti_usb_3410_5052: Add Abbot Diabetes Care cable id Greg KH
2012-02-28 1:02 ` [ 41/73] USB: Remove duplicate USB 3.0 hub feature #defines Greg KH
2012-02-28 1:02 ` [ 42/73] USB: Fix handoff when BIOS disables host PCI device Greg KH
2012-02-28 1:02 ` [ 43/73] xhci: Fix oops caused by more USB2 ports than USB3 ports Greg KH
2012-02-28 1:02 ` [ 44/73] xhci: Fix encoding for HS bulk/control NAK rate Greg KH
2012-02-28 1:02 ` [ 45/73] USB: Set hub depth after USB3 hub reset Greg KH
2012-02-28 1:02 ` [ 46/73] i387: math_state_restore() isnt called from asm Greg KH
2012-02-28 1:02 ` [ 47/73] i387: make irq_fpu_usable() tests more robust Greg KH
2012-02-28 1:02 ` [ 48/73] i387: fix sense of sanity check Greg KH
2012-02-28 1:02 ` [ 49/73] i387: fix x86-64 preemption-unsafe user stack save/restore Greg KH
2012-02-28 1:02 ` [ 50/73] i387: move TS_USEDFPU clearing out of __save_init_fpu and into callers Greg KH
2012-02-28 1:02 ` [ 51/73] i387: dont ever touch TS_USEDFPU directly, use helper functions Greg KH
2012-02-28 1:02 ` [ 52/73] i387: do not preload FPU state at task switch time Greg KH
2012-02-28 1:02 ` [ 53/73] i387: move AMD K7/K8 fpu fxsave/fxrstor workaround from save to restore Greg KH
2012-02-28 1:02 ` [ 54/73] i387: move TS_USEDFPU flag from thread_info to task_struct Greg KH
2012-02-28 1:02 ` [ 55/73] i387: re-introduce FPU state preloading at context switch time Greg KH
2012-02-28 1:02 ` [ 56/73] usb-storage: fix freezing of the scanning thread Greg KH
2012-02-28 1:03 ` [ 57/73] USB: Dont fail USB3 probe on missing legacy PCI IRQ Greg KH
2012-02-28 1:03 ` [ 58/73] x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors Greg KH
2012-02-28 1:03 ` [ 59/73] ath9k: stop on rates with idx -1 in ath9k rate controls .tx_status Greg KH
2012-02-28 1:03 ` [ 60/73] genirq: Unmask oneshot irqs when thread was not woken Greg KH
2012-02-28 1:03 ` [ 61/73] genirq: Handle pending irqs in irq_startup() Greg KH
2012-02-28 1:03 ` [ 62/73] [SCSI] scsi_scan: Fix Poison overwritten warning caused by using freed shost Greg KH
2012-02-28 1:03 ` [ 63/73] [SCSI] scsi_pm: Fix bug in the SCSI power management handler Greg KH
2012-02-28 1:03 ` [ 64/73] ipvs: fix matching of fwmark templates during scheduling Greg KH
2012-02-28 1:03 ` [ 65/73] jme: Fix FIFO flush issue Greg KH
2012-02-28 1:03 ` [ 66/73] davinci_emac: Do not free all rx dma descriptors during init Greg KH
2012-02-28 1:03 ` [ 67/73] builddeb: Dont create files in /tmp with predictable names Greg KH
2012-02-28 1:03 ` [ 68/73] [media] hdpvr: fix race conditon during start of streaming Greg KH
2012-02-28 1:03 ` [ 69/73] hwmon: (f75375s) Fix register write order when setting fans to full speed Greg KH
2012-02-28 1:03 ` Greg KH [this message]
2012-02-28 1:03 ` [ 71/73] epoll: ep_unregister_pollwait() can use the freed pwq->whead Greg KH
2012-02-28 1:03 ` [ 72/73] epoll: limit paths Greg KH
2012-02-28 1:03 ` [ 73/73] cdrom: use copy_to_user() without the underscores Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120228010215.974304968@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=mbizon@freebox.fr \
--cc=oleg@redhat.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).