public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: linux-fsdevel@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	 Linus Torvalds <torvalds@linux-foundation.org>,
	 Jens Axboe <axboe@kernel.dk>,
	 "Christian Brauner (Amutable)" <brauner@kernel.org>
Subject: [PATCH 02/17] eventpoll: document loop-check / path-check globals
Date: Fri, 24 Apr 2026 15:46:33 +0200	[thread overview]
Message-ID: <20260424-work-epoll-rework-v1-2-249ed00a20f3@kernel.org> (raw)
In-Reply-To: <20260424-work-epoll-rework-v1-0-249ed00a20f3@kernel.org>

The globals that support EPOLL_CTL_ADD's cycle and path-length checks
are scattered: epnested_mutex, loop_check_gen, inserting_into, and
tfile_check_list sit at the top of the file; path_count[] and
path_limits[] are declared inline with the path-check code further
down. Their interaction -- the "ep->gen == loop_check_gen" trigger in
do_epoll_ctl(), the two loop_check_gen++ bumps that sandwich a check,
the EP_UNACTIVE_PTR sentinel on tfile_check_list, the -ELOOP back-edge
detection via inserting_into -- is not documented anywhere.

The area has had three recent fixes (CVE-2025-38349, the unbounded
recursion fix, and the overflow fix) whose logic depends on these
invariants. Collect the description in one block alongside the
declarations, cross-reference the path_count[] declaration that lives
with the path-check code, and name the fix commits so future readers
can find the context.

Also add a short comment on struct epitems_head describing its
dual use (wrapper for non-epoll file->f_ep versus pointing into
&ep->refs for the epoll-watches-epoll case), which the old comment
on tfile_check_list had accidentally attached to the struct.

Comment-only; no functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
 fs/eventpoll.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 50 insertions(+), 6 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 5896f705a3ac..477fcbc8e95e 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -372,12 +372,54 @@ struct ep_pqueue {
 /* Maximum number of epoll watched descriptors, per user */
 static long max_user_watches __read_mostly;
 
-/* Used for cycles detection */
+/*
+ * Cycle and path-length checks at EPOLL_CTL_ADD
+ * ---------------------------------------------
+ *
+ * When EPOLL_CTL_ADD creates a link that either targets an eventpoll
+ * file or extends an existing chain of eventpolls, two checks run:
+ *
+ *   1. no cycle is being formed -- ep_loop_check() walks downward
+ *      from the candidate target, and ep_get_upwards_depth_proc()
+ *      walks upward from the outer ep, both bounded by EP_MAX_NESTS.
+ *   2. no file accumulates more than path_limits[depth] wakeup paths
+ *      of a given length -- reverse_path_check().
+ *
+ * Both need a global view of the epoll topology and must be atomic
+ * with the insertion, so the scratch state below is all serialized by
+ * one global mutex, epnested_mutex. Non-nested inserts skip this
+ * machinery entirely and take only ep->mtx.
+ *
+ *   epnested_mutex     Serializes the whole check; also protects every
+ *                      other variable in this block plus path_count[]
+ *                      (declared with the path-check code further
+ *                      down).
+ *   loop_check_gen     Monotonic stamp, bumped once at the start of a
+ *                      check and once at the end. ep->gen caches the
+ *                      value under which ep was last visited by
+ *                      ep_loop_check_proc() or
+ *                      ep_get_upwards_depth_proc(); the post-check
+ *                      bump ensures those cached stamps can no longer
+ *                      equal loop_check_gen, so the
+ *                      "ep->gen == loop_check_gen" trigger in
+ *                      do_epoll_ctl() only fires while another check
+ *                      is in flight.
+ *   inserting_into     Outer eventpoll pointer for the lifetime of one
+ *                      ep_loop_check(); ep_loop_check_proc() fails
+ *                      with -ELOOP if the downward walk reaches it.
+ *   tfile_check_list   Singly-linked list of epitems_head objects
+ *                      collected by ep_loop_check_proc() during the
+ *                      walk, consumed by reverse_path_check()
+ *                      afterwards. Sentinel EP_UNACTIVE_PTR means no
+ *                      check is in flight.
+ *
+ * Commits fdcfce93073d ("eventpoll: Fix integer overflow in
+ * ep_loop_check_proc()") and f2e467a48287 ("eventpoll: Fix
+ * semi-unbounded recursion") hardened the walk; any refactor must
+ * preserve both bail-outs.
+ */
 static DEFINE_MUTEX(epnested_mutex);
-
 static u64 loop_check_gen = 0;
-
-/* Used to check for epoll file descriptor inclusion loops */
 static struct eventpoll *inserting_into;
 
 /* Slab cache used to allocate "struct epitem" */
@@ -387,8 +429,10 @@ static struct kmem_cache *epi_cache __ro_after_init;
 static struct kmem_cache *pwq_cache __ro_after_init;
 
 /*
- * List of files with newly added links, where we may need to limit the number
- * of emanating paths. Protected by the epnested_mutex.
+ * Wrapper anchor for file->f_ep when the watched file is not itself an
+ * eventpoll; for the epoll-watches-epoll case, file->f_ep points at
+ * &watched_ep->refs directly. The ->next field threads
+ * tfile_check_list during one EPOLL_CTL_ADD path check.
  */
 struct epitems_head {
 	struct hlist_head epitems;

-- 
2.47.3


  parent reply	other threads:[~2026-04-24 13:46 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-24 13:46 [PATCH 00/17] eventpoll: clarity refactor Christian Brauner
2026-04-24 13:46 ` [PATCH 01/17] eventpoll: expand top-of-file overview / locking doc Christian Brauner
2026-04-24 13:46 ` Christian Brauner [this message]
2026-04-24 13:46 ` [PATCH 03/17] eventpoll: clarify POLLFREE handshake comments Christian Brauner
2026-04-24 13:46 ` [PATCH 04/17] eventpoll: refresh epi_fget() / ep_remove_file() comments Christian Brauner
2026-04-24 13:46 ` [PATCH 05/17] eventpoll: document ep_clear_and_put() two-pass pattern Christian Brauner
2026-04-24 13:46 ` [PATCH 06/17] eventpoll: rename ep_refcount_dec_and_test() to ep_put() Christian Brauner
2026-04-24 13:46 ` [PATCH 07/17] eventpoll: drop unused depth argument from epoll_mutex_lock() Christian Brauner
2026-04-24 13:46 ` [PATCH 08/17] eventpoll: rename attach_epitem() to ep_attach_file() Christian Brauner
2026-04-24 13:46 ` [PATCH 09/17] eventpoll: relocate KCMP helpers near compat syscalls Christian Brauner
2026-04-24 13:46 ` [PATCH 10/17] eventpoll: split ep_insert() into alloc + register stages Christian Brauner
2026-04-24 13:46 ` [PATCH 11/17] eventpoll: split ep_clear_and_put() into drain helpers Christian Brauner
2026-04-24 13:46 ` [PATCH 12/17] eventpoll: extract ep_deliver_event() from ep_send_events() Christian Brauner
2026-04-24 13:46 ` [PATCH 13/17] eventpoll: extract lock dance from do_epoll_ctl() into ep_ctl_lock() Christian Brauner
2026-04-24 13:46 ` [PATCH 14/17] eventpoll: wrap EP_UNACTIVE_PTR in typed sentinel helpers Christian Brauner
2026-04-24 13:46 ` [PATCH 15/17] eventpoll: rename epi->next and txlist for clarity Christian Brauner
2026-04-24 16:06   ` Linus Torvalds
2026-04-24 13:46 ` [PATCH 16/17] eventpoll: use bool for predicate helpers Christian Brauner
2026-04-24 13:46 ` [PATCH 17/17] eventpoll: hoist CTL_ADD scratch state into struct ep_ctl_ctx Christian Brauner
2026-04-24 15:33 ` [PATCH 00/17] eventpoll: clarity refactor Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260424-work-epoll-rework-v1-2-249ed00a20f3@kernel.org \
    --to=brauner@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox