From: Greg KH <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
alan@lxorguk.ukuu.org.uk, Maxime Bizon <mbizon@freebox.fr>,
Oleg Nesterov <oleg@redhat.com>
Subject: [ 71/73] epoll: ep_unregister_pollwait() can use the freed pwq->whead
Date: Mon, 27 Feb 2012 17:03:14 -0800 [thread overview]
Message-ID: <20120228010216.168366820@linuxfoundation.org> (raw)
In-Reply-To: <20120228010246.GA24299@kroah.com>
3.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleg Nesterov <oleg@redhat.com>
commit 971316f0503a5c50633d07b83b6db2f15a3a5b00 upstream.
signalfd_cleanup() ensures that ->signalfd_wqh is not used, but
this is not enough. eppoll_entry->whead still points to the memory
we are going to free, ep_unregister_pollwait()->remove_wait_queue()
is obviously unsafe.
Change ep_poll_callback(POLLFREE) to set eppoll_entry->whead = NULL,
change ep_unregister_pollwait() to check pwq->whead != NULL under
rcu_read_lock() before remove_wait_queue(). We add the new helper,
ep_remove_wait_queue(), for this.
This works because sighand_cachep is SLAB_DESTROY_BY_RCU and because
->signalfd_wqh is initialized in sighand_ctor(), not in copy_sighand.
ep_unregister_pollwait()->remove_wait_queue() can play with already
freed and potentially reused ->sighand, but this is fine. This memory
must have the valid ->signalfd_wqh until rcu_read_unlock().
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/eventpoll.c | 30 +++++++++++++++++++++++++++---
fs/signalfd.c | 6 +++++-
2 files changed, 32 insertions(+), 4 deletions(-)
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -299,6 +299,11 @@ static inline int ep_is_linked(struct li
return !list_empty(p);
}
+static inline struct eppoll_entry *ep_pwq_from_wait(wait_queue_t *p)
+{
+ return container_of(p, struct eppoll_entry, wait);
+}
+
/* Get the "struct epitem" from a wait queue pointer */
static inline struct epitem *ep_item_from_wait(wait_queue_t *p)
{
@@ -446,6 +451,18 @@ static void ep_poll_safewake(wait_queue_
put_cpu();
}
+static void ep_remove_wait_queue(struct eppoll_entry *pwq)
+{
+ wait_queue_head_t *whead;
+
+ rcu_read_lock();
+ /* If it is cleared by POLLFREE, it should be rcu-safe */
+ whead = rcu_dereference(pwq->whead);
+ if (whead)
+ remove_wait_queue(whead, &pwq->wait);
+ rcu_read_unlock();
+}
+
/*
* This function unregisters poll callbacks from the associated file
* descriptor. Must be called with "mtx" held (or "epmutex" if called from
@@ -460,7 +477,7 @@ static void ep_unregister_pollwait(struc
pwq = list_first_entry(lsthead, struct eppoll_entry, llink);
list_del(&pwq->llink);
- remove_wait_queue(pwq->whead, &pwq->wait);
+ ep_remove_wait_queue(pwq);
kmem_cache_free(pwq_cache, pwq);
}
}
@@ -827,9 +844,16 @@ static int ep_poll_callback(wait_queue_t
struct epitem *epi = ep_item_from_wait(wait);
struct eventpoll *ep = epi->ep;
- /* the caller holds eppoll_entry->whead->lock */
- if ((unsigned long)key & POLLFREE)
+ if ((unsigned long)key & POLLFREE) {
+ ep_pwq_from_wait(wait)->whead = NULL;
+ /*
+ * whead = NULL above can race with ep_remove_wait_queue()
+ * which can do another remove_wait_queue() after us, so we
+ * can't use __remove_wait_queue(). whead->lock is held by
+ * the caller.
+ */
list_del_init(&wait->task_list);
+ }
spin_lock_irqsave(&ep->lock, flags);
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -33,7 +33,11 @@
void signalfd_cleanup(struct sighand_struct *sighand)
{
wait_queue_head_t *wqh = &sighand->signalfd_wqh;
-
+ /*
+ * The lockless check can race with remove_wait_queue() in progress,
+ * but in this case its caller should run under rcu_read_lock() and
+ * sighand_cachep is SLAB_DESTROY_BY_RCU, we can safely return.
+ */
if (likely(!waitqueue_active(wqh)))
return;
next prev parent reply other threads:[~2012-02-28 1:03 UTC|newest]
Thread overview: 74+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-28 1:02 [ 00/73] 3.0.23-stable review Greg KH
2012-02-28 1:02 ` [ 01/73] ASoC: wm8962: Fix sidetone enumeration texts Greg KH
2012-02-28 1:02 ` [ 02/73] NOMMU: Lock i_mmap_mutex for access to the VMA prio list Greg KH
2012-02-28 1:02 ` [ 03/73] hwmon: (max6639) Fix FAN_FROM_REG calculation Greg KH
2012-02-28 1:02 ` [ 04/73] hwmon: (max6639) Fix PPR register initialization to set both channels Greg KH
2012-02-28 1:02 ` [ 05/73] hwmon: (ads1015) Fix file leak in probe function Greg KH
2012-02-28 1:02 ` [ 06/73] powerpc/perf: power_pmu_start restores incorrect values, breaking frequency events Greg KH
2012-02-28 1:02 ` [ 07/73] drm/radeon/kms: fix MSI re-arm on rv370+ Greg KH
2012-02-28 1:02 ` [ 08/73] PCI: workaround hard-wired bus number V2 Greg KH
2012-02-28 1:02 ` [ 09/73] mac80211: Fix a rwlock bad magic bug Greg KH
2012-02-28 1:02 ` [ 10/73] ipheth: Add iPhone 4S Greg KH
2012-02-28 1:02 ` [ 11/73] eCryptfs: Copy up lower inode attrs after setting lower xattr Greg KH
2012-02-28 1:02 ` [ 12/73] ALSA: hda - Fix redundant jack creations for cx5051 Greg KH
2012-02-28 1:02 ` [ 13/73] mmc: core: check for zero length ioctl data Greg KH
2012-02-28 1:02 ` [ 14/73] NFSv4: Ensure we throw out bad delegation stateids on NFS4ERR_BAD_STATEID Greg KH
2012-02-28 1:02 ` [ 15/73] ARM: 7321/1: cache-v7: Disable preemption when reading CCSIDR Greg KH
2012-02-28 1:02 ` [ 16/73] ARM: 7325/1: fix v7 boot with lockdep enabled Greg KH
2012-02-28 1:02 ` [ 17/73] net: Make qdisc_skb_cb upper size bound explicit Greg KH
2012-02-28 1:02 ` [ 18/73] IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses Greg KH
2012-02-28 1:02 ` [ 19/73] gro: more generic L2 header check Greg KH
2012-02-28 1:02 ` [ 20/73] veth: Enforce minimum size of VETH_INFO_PEER Greg KH
2012-02-28 1:02 ` [ 21/73] 3c59x: shorten timer period for slave devices Greg KH
2012-02-28 1:02 ` [ 22/73] ipv6-multicast: Fix memory leak in input path Greg KH
2012-02-28 1:02 ` [ 23/73] ipv6-multicast: Fix memory leak in IPv6 multicast Greg KH
2012-02-28 1:02 ` [ 24/73] ipv4: fix for ip_options_rcv_srr() daddr update Greg KH
2012-02-28 1:02 ` [ 25/73] ipv4: Save nexthop address of LSRR/SSRR option to IPCB Greg KH
2012-02-28 1:02 ` [ 26/73] ipv4: Fix wrong order of ip_rt_get_source() and update iph->daddr Greg KH
2012-02-28 1:02 ` [ 27/73] ipv4: reset flowi parameters on route connect Greg KH
2012-02-28 1:02 ` [ 28/73] net: Dont proxy arp respond if iif == rt->dst.dev if private VLAN is disabled Greg KH
2012-02-28 1:02 ` [ 29/73] netpoll: netpoll_poll_dev() should access dev->flags Greg KH
2012-02-28 1:02 ` [ 30/73] net_sched: Bug in netem reordering Greg KH
2012-02-28 1:02 ` [ 31/73] via-velocity: S3 resume fix Greg KH
2012-02-28 1:02 ` [ 32/73] tcp_v4_send_reset: binding oif to iif in no sock case Greg KH
2012-02-28 1:02 ` [ 33/73] tcp: allow tcp_sacktag_one() to tag ranges not aligned with skbs Greg KH
2012-02-28 1:02 ` [ 34/73] tcp: fix range tcp_shifted_skb() passes to tcp_sacktag_one() Greg KH
2012-02-28 1:02 ` [ 35/73] tcp: fix tcp_shifted_skb() adjustment of lost_cnt_hint for FACK Greg KH
2012-02-28 1:02 ` [ 36/73] route: fix ICMP redirect validation Greg KH
2012-02-28 1:02 ` [ 37/73] ipv4: fix redirect handling Greg KH
2012-02-28 1:02 ` [ 38/73] USB: Added Kamstrup VID/PIDs to cp210x serial driver Greg KH
2012-02-28 1:02 ` [ 39/73] USB: option: cleanup zte 3g-dongles pid in option.c Greg KH
2012-02-28 1:02 ` [ 40/73] USB: Serial: ti_usb_3410_5052: Add Abbot Diabetes Care cable id Greg KH
2012-02-28 1:02 ` [ 41/73] USB: Remove duplicate USB 3.0 hub feature #defines Greg KH
2012-02-28 1:02 ` [ 42/73] USB: Fix handoff when BIOS disables host PCI device Greg KH
2012-02-28 1:02 ` [ 43/73] xhci: Fix oops caused by more USB2 ports than USB3 ports Greg KH
2012-02-28 1:02 ` [ 44/73] xhci: Fix encoding for HS bulk/control NAK rate Greg KH
2012-02-28 1:02 ` [ 45/73] USB: Set hub depth after USB3 hub reset Greg KH
2012-02-28 1:02 ` [ 46/73] i387: math_state_restore() isnt called from asm Greg KH
2012-02-28 1:02 ` [ 47/73] i387: make irq_fpu_usable() tests more robust Greg KH
2012-02-28 1:02 ` [ 48/73] i387: fix sense of sanity check Greg KH
2012-02-28 1:02 ` [ 49/73] i387: fix x86-64 preemption-unsafe user stack save/restore Greg KH
2012-02-28 1:02 ` [ 50/73] i387: move TS_USEDFPU clearing out of __save_init_fpu and into callers Greg KH
2012-02-28 1:02 ` [ 51/73] i387: dont ever touch TS_USEDFPU directly, use helper functions Greg KH
2012-02-28 1:02 ` [ 52/73] i387: do not preload FPU state at task switch time Greg KH
2012-02-28 1:02 ` [ 53/73] i387: move AMD K7/K8 fpu fxsave/fxrstor workaround from save to restore Greg KH
2012-02-28 1:02 ` [ 54/73] i387: move TS_USEDFPU flag from thread_info to task_struct Greg KH
2012-02-28 1:02 ` [ 55/73] i387: re-introduce FPU state preloading at context switch time Greg KH
2012-02-28 1:02 ` [ 56/73] usb-storage: fix freezing of the scanning thread Greg KH
2012-02-28 1:03 ` [ 57/73] USB: Dont fail USB3 probe on missing legacy PCI IRQ Greg KH
2012-02-28 1:03 ` [ 58/73] x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors Greg KH
2012-02-28 1:03 ` [ 59/73] ath9k: stop on rates with idx -1 in ath9k rate controls .tx_status Greg KH
2012-02-28 1:03 ` [ 60/73] genirq: Unmask oneshot irqs when thread was not woken Greg KH
2012-02-28 1:03 ` [ 61/73] genirq: Handle pending irqs in irq_startup() Greg KH
2012-02-28 1:03 ` [ 62/73] [SCSI] scsi_scan: Fix Poison overwritten warning caused by using freed shost Greg KH
2012-02-28 1:03 ` [ 63/73] [SCSI] scsi_pm: Fix bug in the SCSI power management handler Greg KH
2012-02-28 1:03 ` [ 64/73] ipvs: fix matching of fwmark templates during scheduling Greg KH
2012-02-28 1:03 ` [ 65/73] jme: Fix FIFO flush issue Greg KH
2012-02-28 1:03 ` [ 66/73] davinci_emac: Do not free all rx dma descriptors during init Greg KH
2012-02-28 1:03 ` [ 67/73] builddeb: Dont create files in /tmp with predictable names Greg KH
2012-02-28 1:03 ` [ 68/73] [media] hdpvr: fix race conditon during start of streaming Greg KH
2012-02-28 1:03 ` [ 69/73] hwmon: (f75375s) Fix register write order when setting fans to full speed Greg KH
2012-02-28 1:03 ` [ 70/73] epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree() Greg KH
2012-02-28 1:03 ` Greg KH [this message]
2012-02-28 1:03 ` [ 72/73] epoll: limit paths Greg KH
2012-02-28 1:03 ` [ 73/73] cdrom: use copy_to_user() without the underscores Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120228010216.168366820@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=mbizon@freebox.fr \
--cc=oleg@redhat.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).