From: Greg KH <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, Maxime Bizon <mbizon@freebox.fr>,
	Oleg Nesterov <oleg@redhat.com>
Subject: [ 70/72] epoll: ep_unregister_pollwait() can use the freed pwq->whead
Date: Mon, 27 Feb 2012 17:05:39 -0800
Message-ID: <20120228010435.562463016@linuxfoundation.org>
In-Reply-To: <20120228010511.GA8453@kroah.com>

3.2-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Oleg Nesterov <oleg@redhat.com>

commit 971316f0503a5c50633d07b83b6db2f15a3a5b00 upstream.

signalfd_cleanup() ensures that ->signalfd_wqh is not used, but
this is not enough. eppoll_entry->whead still points to the memory
we are going to free, so ep_unregister_pollwait()->remove_wait_queue()
is unsafe.

Change ep_poll_callback(POLLFREE) to set eppoll_entry->whead = NULL,
change ep_unregister_pollwait() to check pwq->whead != NULL under
rcu_read_lock() before remove_wait_queue(). We add the new helper,
ep_remove_wait_queue(), for this.

This works because sighand_cachep is SLAB_DESTROY_BY_RCU and because
->signalfd_wqh is initialized in sighand_ctor(), not in copy_sighand().
ep_unregister_pollwait()->remove_wait_queue() can play with an already
freed and potentially reused ->sighand, but this is fine: this memory
must have a valid ->signalfd_wqh until rcu_read_unlock().

Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
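Editorial note, not part of the commit: the handshake described in the
changelog can be modelled in plain userspace C. This is only a sketch
under stated assumptions. All names below (wq_head, wq_entry, wq_add,
wq_remove, wq_pollfree, entry_unregister) are hypothetical stand-ins,
not kernel APIs, and the model is single-threaded: it deliberately
leaves out whead->lock, rcu_read_lock() and SLAB_DESTROY_BY_RCU, which
are what make the real code safe against a concurrent free.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for wait_queue_head_t. */
struct wq_entry;
struct wq_head {
	struct wq_entry *first;
};

/* Hypothetical stand-in for struct eppoll_entry. */
struct wq_entry {
	struct wq_head *whead;	/* models eppoll_entry->whead */
	struct wq_entry *next;	/* models wait.task_list */
};

/* Link an entry onto a head and remember which head it is on. */
static void wq_add(struct wq_head *h, struct wq_entry *e)
{
	e->whead = h;
	e->next = h->first;
	h->first = e;
}

/* Models remove_wait_queue(): unlink e from h if it is still linked. */
static void wq_remove(struct wq_head *h, struct wq_entry *e)
{
	struct wq_entry **pp;

	for (pp = &h->first; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			e->next = NULL;
			return;
		}
	}
}

/*
 * Models the POLLFREE branch of ep_poll_callback(): the owner of the
 * head is about to free it, so every entry is unlinked and told, via
 * whead = NULL, that the head must not be touched again.
 */
static void wq_pollfree(struct wq_head *h)
{
	while (h->first) {
		struct wq_entry *e = h->first;

		h->first = e->next;
		e->next = NULL;
		e->whead = NULL;
	}
}

/*
 * Models ep_remove_wait_queue(): only touch the head if POLLFREE has
 * not cleared the pointer. Returns 1 if the head was used, 0 if not.
 */
static int entry_unregister(struct wq_entry *e)
{
	struct wq_head *h = e->whead;

	if (!h)
		return 0;
	wq_remove(h, e);
	return 1;
}
```

In this model, an entry that is unregistered before wq_pollfree() goes
through wq_remove() as before, while an entry unregistered afterwards
sees whead == NULL and never dereferences the head that is about to go
away.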
 fs/eventpoll.c |   30 +++++++++++++++++++++++++++---
 fs/signalfd.c  |    6 +++++-
 2 files changed, 32 insertions(+), 4 deletions(-)

--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -299,6 +299,11 @@ static inline int ep_is_linked(struct li
 	return !list_empty(p);
 }
 
+static inline struct eppoll_entry *ep_pwq_from_wait(wait_queue_t *p)
+{
+	return container_of(p, struct eppoll_entry, wait);
+}
+
 /* Get the "struct epitem" from a wait queue pointer */
 static inline struct epitem *ep_item_from_wait(wait_queue_t *p)
 {
@@ -446,6 +451,18 @@ static void ep_poll_safewake(wait_queue_
 	put_cpu();
 }
 
+static void ep_remove_wait_queue(struct eppoll_entry *pwq)
+{
+	wait_queue_head_t *whead;
+
+	rcu_read_lock();
+	/* If it is cleared by POLLFREE, it should be rcu-safe */
+	whead = rcu_dereference(pwq->whead);
+	if (whead)
+		remove_wait_queue(whead, &pwq->wait);
+	rcu_read_unlock();
+}
+
 /*
  * This function unregisters poll callbacks from the associated file
  * descriptor.  Must be called with "mtx" held (or "epmutex" if called from
@@ -460,7 +477,7 @@ static void ep_unregister_pollwait(struc
 		pwq = list_first_entry(lsthead, struct eppoll_entry, llink);
 
 		list_del(&pwq->llink);
-		remove_wait_queue(pwq->whead, &pwq->wait);
+		ep_remove_wait_queue(pwq);
 		kmem_cache_free(pwq_cache, pwq);
 	}
 }
@@ -827,9 +844,16 @@ static int ep_poll_callback(wait_queue_t
 	struct epitem *epi = ep_item_from_wait(wait);
 	struct eventpoll *ep = epi->ep;
 
-	/* the caller holds eppoll_entry->whead->lock */
-	if ((unsigned long)key & POLLFREE)
+	if ((unsigned long)key & POLLFREE) {
+		ep_pwq_from_wait(wait)->whead = NULL;
+		/*
+		 * whead = NULL above can race with ep_remove_wait_queue()
+		 * which can do another remove_wait_queue() after us, so we
+		 * can't use __remove_wait_queue(). whead->lock is held by
+		 * the caller.
+		 */
 		list_del_init(&wait->task_list);
+	}
 
 	spin_lock_irqsave(&ep->lock, flags);
 
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -33,7 +33,11 @@
 void signalfd_cleanup(struct sighand_struct *sighand)
 {
 	wait_queue_head_t *wqh = &sighand->signalfd_wqh;
-
+	/*
+	 * The lockless check can race with remove_wait_queue() in progress,
+	 * but in this case its caller should run under rcu_read_lock() and
+	 * sighand_cachep is SLAB_DESTROY_BY_RCU, we can safely return.
+	 */
 	if (likely(!waitqueue_active(wqh)))
 		return;
 



