From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Eric Biggers <ebiggers@google.com>
Subject: [PATCH 4.9 19/42] wait: add wake_up_pollfree()
Date: Mon, 13 Dec 2021 10:30:01 +0100
Message-ID: <20211213092927.203069108@linuxfoundation.org>
In-Reply-To: <20211213092926.578829548@linuxfoundation.org>
From: Eric Biggers <ebiggers@kernel.org>

commit 42288cb44c4b5fff7653bc392b583a2b8bd6a8c0 upstream.

Several ->poll() implementations are special in that they use a
waitqueue whose lifetime is the current task, rather than the struct
file as is normally the case.  This is okay for blocking polls, since a
blocking poll occurs within one task; however, non-blocking polls
require another solution.  This solution is for the queue to be cleared
before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'.

However, that has a bug: wake_up_poll() calls __wake_up() with
nr_exclusive=1.  Therefore, if there are multiple "exclusive" waiters,
and the wakeup function for the first one returns a positive value, only
that one will be called.  That's *not* what's needed for POLLFREE;
POLLFREE is special in that it really needs to wake up everyone.

Considering the three non-blocking poll systems:

- io_uring poll doesn't handle POLLFREE at all, so it is broken anyway.

- aio poll is unaffected, since it doesn't support exclusive waits.
  However, that's fragile, as someone could add this feature later.

- epoll doesn't appear to be broken by this, since its wakeup function
  returns 0 when it sees POLLFREE.  But this is fragile.

Although there is a workaround (see epoll), it's better to define a
function which always sends POLLFREE to all waiters.  Add such a
function.  Also make it verify that the queue really becomes empty after
all waiters have been woken up.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/wait.h | 26 ++++++++++++++++++++++++++
kernel/sched/wait.c | 8 ++++++++
2 files changed, 34 insertions(+)
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -202,6 +202,7 @@ void __wake_up_locked_key(wait_queue_hea
void __wake_up_sync_key(wait_queue_head_t *q, unsigned int mode, int nr, void *key);
void __wake_up_locked(wait_queue_head_t *q, unsigned int mode, int nr);
void __wake_up_sync(wait_queue_head_t *q, unsigned int mode, int nr);
+void __wake_up_pollfree(wait_queue_head_t *wq_head);
void __wake_up_bit(wait_queue_head_t *, void *, int);
int __wait_on_bit(wait_queue_head_t *, struct wait_bit_queue *, wait_bit_action_f *, unsigned);
int __wait_on_bit_lock(wait_queue_head_t *, struct wait_bit_queue *, wait_bit_action_f *, unsigned);
@@ -236,6 +237,31 @@ wait_queue_head_t *bit_waitqueue(void *,
#define wake_up_interruptible_sync_poll(x, m) \
__wake_up_sync_key((x), TASK_INTERRUPTIBLE, 1, (void *) (m))
+/**
+ * wake_up_pollfree - signal that a polled waitqueue is going away
+ * @wq_head: the wait queue head
+ *
+ * In the very rare cases where a ->poll() implementation uses a waitqueue whose
+ * lifetime is tied to a task rather than to the 'struct file' being polled,
+ * this function must be called before the waitqueue is freed so that
+ * non-blocking polls (e.g. epoll) are notified that the queue is going away.
+ *
+ * The caller must also RCU-delay the freeing of the wait_queue_head, e.g. via
+ * an explicit synchronize_rcu() or call_rcu(), or via SLAB_DESTROY_BY_RCU.
+ */
+static inline void wake_up_pollfree(wait_queue_head_t *wq_head)
+{
+	/*
+	 * For performance reasons, we don't always take the queue lock here.
+	 * Therefore, we might race with someone removing the last entry from
+	 * the queue, and proceed while they still hold the queue lock.
+	 * However, rcu_read_lock() is required to be held in such cases, so we
+	 * can safely proceed with an RCU-delayed free.
+	 */
+	if (waitqueue_active(wq_head))
+		__wake_up_pollfree(wq_head);
+}
+
#define ___wait_cond_timeout(condition) \
({ \
bool __cond = (condition); \
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -10,6 +10,7 @@
#include <linux/wait.h>
#include <linux/hash.h>
#include <linux/kthread.h>
+#include <linux/poll.h>
void __init_waitqueue_head(wait_queue_head_t *q, const char *name, struct lock_class_key *key)
{
@@ -156,6 +157,13 @@ void __wake_up_sync(wait_queue_head_t *q
}
EXPORT_SYMBOL_GPL(__wake_up_sync); /* For internal use only */
+void __wake_up_pollfree(wait_queue_head_t *wq_head)
+{
+	__wake_up(wq_head, TASK_NORMAL, 0, (void *)(POLLHUP | POLLFREE));
+	/* POLLFREE must have cleared the queue. */
+	WARN_ON_ONCE(waitqueue_active(wq_head));
+}
+
/*
* Note: we use "set_current_state()" _after_ the wait-queue add,
* because we need a memory barrier there on SMP, so that any
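
The nr_exclusive behaviour described in the commit message can be
illustrated with a small userspace model.  This is only a sketch and not
kernel code: struct model_waiter, model_wake_func(), model_wake_up(),
count_callbacks() and self_check() are hypothetical names invented for
the illustration, and the flag value is arbitrary.  The loop is assumed
to mirror the break condition in __wake_up_common(), where the walk stops
once nr_exclusive exclusive waiters have reported a successful wakeup.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value; only the meaning matches the kernel flag. */
#define WQ_FLAG_EXCLUSIVE 0x01

/* Hypothetical stand-in for a wait queue entry. */
struct model_waiter {
	unsigned int flags;
	int woken;
};

/* Hypothetical wake function: marks the waiter woken, reports success. */
static int model_wake_func(struct model_waiter *w)
{
	w->woken = 1;
	return 1;
}

/*
 * Walk the waiters in order and stop once nr_exclusive exclusive
 * waiters have reported a successful wakeup.  With nr_exclusive == 0
 * the decremented counter is never zero, so every waiter is visited.
 * Returns the number of wake functions that were called.
 */
static size_t model_wake_up(struct model_waiter *ws, size_t n,
			    int nr_exclusive)
{
	size_t called = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned int flags = ws[i].flags;

		called++;
		if (model_wake_func(&ws[i]) &&
		    (flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
			break;
	}
	return called;
}

/* Number of wake functions called for three exclusive waiters. */
static size_t count_callbacks(int nr_exclusive)
{
	struct model_waiter ws[3] = {
		{ WQ_FLAG_EXCLUSIVE, 0 },
		{ WQ_FLAG_EXCLUSIVE, 0 },
		{ WQ_FLAG_EXCLUSIVE, 0 },
	};

	return model_wake_up(ws, 3, nr_exclusive);
}

/* nr_exclusive=1 reaches one waiter; nr_exclusive=0 reaches all three. */
static void self_check(void)
{
	assert(count_callbacks(1) == 1);
	assert(count_callbacks(0) == 3);
}
```

In this model, nr_exclusive=1 corresponds to what wake_up_poll() passes
and leaves two of the three exclusive waiters unnotified, while
nr_exclusive=0 corresponds to what __wake_up_pollfree() passes and
reaches every waiter.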