From: Christian Brauner <brauner@kernel.org>
To: linux-fsdevel@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jens Axboe <axboe@kernel.dk>,
"Christian Brauner (Amutable)" <brauner@kernel.org>
Subject: [PATCH 14/17] eventpoll: wrap EP_UNACTIVE_PTR in typed sentinel helpers
Date: Fri, 24 Apr 2026 15:46:45 +0200 [thread overview]
Message-ID: <20260424-work-epoll-rework-v1-14-249ed00a20f3@kernel.org> (raw)
In-Reply-To: <20260424-work-epoll-rework-v1-0-249ed00a20f3@kernel.org>
ep->ovflist and epi->next both use EP_UNACTIVE_PTR (a cast to
(void *)-1) as a sentinel, with distinct meanings at each site:
ep->ovflist == EP_UNACTIVE_PTR no scan in progress
epi->next == EP_UNACTIVE_PTR epi not on ovflist
Call sites had to know the sentinel's value and, by convention, what
it meant in each context. Hide both behind inline helpers:
ep_is_scanning(ep) predicate for "scan in progress"
ep_enter_scan(ep) WRITE_ONCE flip to NULL (scan start)
ep_exit_scan(ep) WRITE_ONCE flip to sentinel (scan end)
epi_on_ovflist(epi) predicate for "epi is on ovflist"
epi_clear_ovflist(epi) clear epi's ovflist link slot
Convert ep_events_available(), ep_start_scan(), ep_done_scan(),
ep_poll_callback(), and ep_alloc_epitem() to use the wrappers. The
ovflist state-machine transitions are now named, not encoded in
sentinel comparisons, and the top-of-file "Ready-list state machine"
section is the single place that spells out the sentinel's meaning.
ep_alloc() keeps the raw "ep->ovflist = EP_UNACTIVE_PTR" init (no
concurrent access at that point) with an inline "not scanning"
comment, and the tfile_check_list sentinel is left alone -- it will
disappear entirely when the loop-check globals move into a
stack-allocated ep_ctl_ctx in a later commit.
Also rework ep_done_scan()'s for-loop: the combined initializer +
update clause that advanced nepi AND cleared epi->next in one step
was clever but hard to read; splitting the update into two
statements inside the body makes the epi_clear_ovflist() call
visible.
No functional change.
Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
fs/eventpoll.c | 73 +++++++++++++++++++++++++++++++++++++++++-----------------
1 file changed, 52 insertions(+), 21 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index d49457dc8c7f..4199ef8e42e5 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -541,6 +541,43 @@ static inline struct epitem *ep_item_from_wait(wait_queue_entry_t *p)
return container_of(p, struct eppoll_entry, wait)->base;
}
+/*
+ * Ready-list / ovflist state (see "Ready-list state machine" in the
+ * top-of-file banner for the full state machine). EP_UNACTIVE_PTR is
+ * the sentinel; these wrappers name each transition and each test so
+ * call sites do not need to know the sentinel's value.
+ */
+
+/* True iff @ep is between ep_enter_scan() and ep_exit_scan(). */
+static inline bool ep_is_scanning(struct eventpoll *ep)
+{
+ return READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR;
+}
+
+/* Called by ep_start_scan(): divert ep_poll_callback() to ovflist. */
+static inline void ep_enter_scan(struct eventpoll *ep)
+{
+ WRITE_ONCE(ep->ovflist, NULL);
+}
+
+/* Called by ep_done_scan(): redirect ep_poll_callback() back to rdllist. */
+static inline void ep_exit_scan(struct eventpoll *ep)
+{
+ WRITE_ONCE(ep->ovflist, EP_UNACTIVE_PTR);
+}
+
+/* True iff @epi is currently linked on its ep's ovflist. */
+static inline bool epi_on_ovflist(const struct epitem *epi)
+{
+ return epi->next != EP_UNACTIVE_PTR;
+}
+
+/* Mark @epi as not on any ovflist (init and post-drain). */
+static inline void epi_clear_ovflist(struct epitem *epi)
+{
+ epi->next = EP_UNACTIVE_PTR;
+}
+
/**
* ep_events_available - Checks if ready events might be available.
*
@@ -551,8 +588,7 @@ static inline struct epitem *ep_item_from_wait(wait_queue_entry_t *p)
*/
static inline int ep_events_available(struct eventpoll *ep)
{
- return !list_empty_careful(&ep->rdllist) ||
- READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR;
+ return !list_empty_careful(&ep->rdllist) || ep_is_scanning(ep);
}
#ifdef CONFIG_NET_RX_BUSY_POLL
@@ -910,7 +946,7 @@ static void ep_start_scan(struct eventpoll *ep, struct list_head *txlist)
lockdep_assert_irqs_enabled();
spin_lock_irq(&ep->lock);
list_splice_init(&ep->rdllist, txlist);
- WRITE_ONCE(ep->ovflist, NULL);
+ ep_enter_scan(ep);
spin_unlock_irq(&ep->lock);
}
@@ -925,29 +961,24 @@ static void ep_done_scan(struct eventpoll *ep,
* other events might have been queued by the poll callback.
* We re-insert them inside the main ready-list here.
*/
- for (nepi = READ_ONCE(ep->ovflist); (epi = nepi) != NULL;
- nepi = epi->next, epi->next = EP_UNACTIVE_PTR) {
+ for (nepi = READ_ONCE(ep->ovflist); (epi = nepi) != NULL; ) {
+ nepi = epi->next;
+ epi_clear_ovflist(epi);
/*
- * We need to check if the item is already in the list.
- * During the "sproc" callback execution time, items are
- * queued into ->ovflist but the "txlist" might already
- * contain them, and the list_splice() below takes care of them.
+ * Skip items that the caller already returned via @txlist
+ * -- the list_splice() below takes care of those.
*/
if (!ep_is_linked(epi)) {
/*
- * ->ovflist is LIFO, so we have to reverse it in order
- * to keep in FIFO.
+ * ovflist is LIFO; list_add() head-insert here
+ * reverses the iteration order into FIFO.
*/
list_add(&epi->rdllink, &ep->rdllist);
ep_pm_stay_awake(epi);
}
}
- /*
- * We need to set back ep->ovflist to EP_UNACTIVE_PTR, so that after
- * releasing the lock, events will be queued in the normal way inside
- * ep->rdllist.
- */
- WRITE_ONCE(ep->ovflist, EP_UNACTIVE_PTR);
+ /* Back out of scan mode; callbacks target ep->rdllist again. */
+ ep_exit_scan(ep);
/*
* Quickly re-inject items left on "txlist".
@@ -1376,7 +1407,7 @@ static int ep_alloc(struct eventpoll **pep)
init_waitqueue_head(&ep->poll_wait);
INIT_LIST_HEAD(&ep->rdllist);
ep->rbr = RB_ROOT_CACHED;
- ep->ovflist = EP_UNACTIVE_PTR;
+ ep->ovflist = EP_UNACTIVE_PTR; /* not scanning */
ep->user = get_current_user();
refcount_set(&ep->refcount, 1);
@@ -1456,8 +1487,8 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
* semantics). All the events that happen during that period of time are
* chained in ep->ovflist and requeued later on.
*/
- if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) {
- if (epi->next == EP_UNACTIVE_PTR) {
+ if (ep_is_scanning(ep)) {
+ if (!epi_on_ovflist(epi)) {
epi->next = READ_ONCE(ep->ovflist);
WRITE_ONCE(ep->ovflist, epi);
ep_pm_stay_awake_rcu(epi);
@@ -1771,7 +1802,7 @@ static struct epitem *ep_alloc_epitem(struct eventpoll *ep,
epi->ep = ep;
ep_set_ffd(&epi->ffd, tfile, fd);
epi->event = *event;
- epi->next = EP_UNACTIVE_PTR;
+ epi_clear_ovflist(epi);
return epi;
}
--
2.47.3
next prev parent reply other threads:[~2026-04-24 13:47 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-24 13:46 [PATCH 00/17] eventpoll: clarity refactor Christian Brauner
2026-04-24 13:46 ` [PATCH 01/17] eventpoll: expand top-of-file overview / locking doc Christian Brauner
2026-04-24 13:46 ` [PATCH 02/17] eventpoll: document loop-check / path-check globals Christian Brauner
2026-04-24 13:46 ` [PATCH 03/17] eventpoll: clarify POLLFREE handshake comments Christian Brauner
2026-04-24 13:46 ` [PATCH 04/17] eventpoll: refresh epi_fget() / ep_remove_file() comments Christian Brauner
2026-04-24 13:46 ` [PATCH 05/17] eventpoll: document ep_clear_and_put() two-pass pattern Christian Brauner
2026-04-24 13:46 ` [PATCH 06/17] eventpoll: rename ep_refcount_dec_and_test() to ep_put() Christian Brauner
2026-04-24 13:46 ` [PATCH 07/17] eventpoll: drop unused depth argument from epoll_mutex_lock() Christian Brauner
2026-04-24 13:46 ` [PATCH 08/17] eventpoll: rename attach_epitem() to ep_attach_file() Christian Brauner
2026-04-24 13:46 ` [PATCH 09/17] eventpoll: relocate KCMP helpers near compat syscalls Christian Brauner
2026-04-24 13:46 ` [PATCH 10/17] eventpoll: split ep_insert() into alloc + register stages Christian Brauner
2026-04-24 13:46 ` [PATCH 11/17] eventpoll: split ep_clear_and_put() into drain helpers Christian Brauner
2026-04-24 13:46 ` [PATCH 12/17] eventpoll: extract ep_deliver_event() from ep_send_events() Christian Brauner
2026-04-24 13:46 ` [PATCH 13/17] eventpoll: extract lock dance from do_epoll_ctl() into ep_ctl_lock() Christian Brauner
2026-04-24 13:46 ` Christian Brauner [this message]
2026-04-24 13:46 ` [PATCH 15/17] eventpoll: rename epi->next and txlist for clarity Christian Brauner
2026-04-24 16:06 ` Linus Torvalds
2026-04-24 13:46 ` [PATCH 16/17] eventpoll: use bool for predicate helpers Christian Brauner
2026-04-24 13:46 ` [PATCH 17/17] eventpoll: hoist CTL_ADD scratch state into struct ep_ctl_ctx Christian Brauner
2026-04-24 15:33 ` [PATCH 00/17] eventpoll: clarity refactor Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260424-work-epoll-rework-v1-14-249ed00a20f3@kernel.org \
--to=brauner@kernel.org \
--cc=axboe@kernel.dk \
--cc=jack@suse.cz \
--cc=linux-fsdevel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox