* [PATCH] eventpoll: Fix semi-unbounded recursion
@ 2025-07-11 16:33 Jann Horn
2025-07-11 16:38 ` Jann Horn
2025-07-31 22:31 ` Carlos Llamas
0 siblings, 2 replies; 4+ messages in thread
From: Jann Horn @ 2025-07-11 16:33 UTC (permalink / raw)
To: Alexander Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-kernel, stable, Jann Horn
Ensure that epoll instances can never form a graph deeper than
EP_MAX_NESTS+1 links.
Currently, ep_loop_check_proc() ensures that the graph is loop-free and
does some recursion depth checks, but those recursion depth checks don't
limit the depth of the resulting tree for two reasons:
- They don't look upwards in the tree.
- If there are multiple downwards paths of different lengths, only one of
the paths is actually considered for the depth check since commit
28d82dc1c4ed ("epoll: limit paths").
Essentially, the current recursion depth check in ep_loop_check_proc() just
serves to prevent it from recursing too deeply while checking for loops.
A more thorough check is done in reverse_path_check() after the new graph
edge has already been created; this checks, among other things, that no
paths going upwards from any non-epoll file with a length of more than 5
edges exist. However, this check does not apply to non-epoll files.
As a result, it is possible to recurse to a depth of at least roughly 500,
tested on v6.15. (I am unsure if deeper recursion is possible; and this may
have changed with commit 8c44dac8add7 ("eventpoll: Fix priority inversion
problem").)
To fix it:
1. In ep_loop_check_proc(), note the subtree depth of each visited node,
and use subtree depths for the total depth calculation even when a subtree
has already been visited.
2. Add ep_get_upwards_depth_proc() for similarly determining the maximum
depth of an upwards walk.
3. In ep_loop_check(), use these values to limit the total path length
between epoll nodes to EP_MAX_NESTS edges.
Fixes: 22bacca48a17 ("epoll: prevent creating circular epoll structures")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
---
fs/eventpoll.c | 60 ++++++++++++++++++++++++++++++++++++++++++++--------------
1 file changed, 46 insertions(+), 14 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index d4dbffdedd08..44648cc09250 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -218,6 +218,7 @@ struct eventpoll {
/* used to optimize loop detection check */
u64 gen;
struct hlist_head refs;
+ u8 loop_check_depth;
/*
* usage count, used together with epitem->dying to
@@ -2142,23 +2143,24 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
}
/**
- * ep_loop_check_proc - verify that adding an epoll file inside another
- * epoll structure does not violate the constraints, in
- * terms of closed loops, or too deep chains (which can
- * result in excessive stack usage).
+ * ep_loop_check_proc - verify that adding an epoll file @ep inside another
+ * epoll file does not create closed loops, and
+ * determine the depth of the subtree starting at @ep
*
* @ep: the &struct eventpoll to be currently checked.
* @depth: Current depth of the path being checked.
*
- * Return: %zero if adding the epoll @file inside current epoll
- * structure @ep does not violate the constraints, or %-1 otherwise.
+ * Return: depth of the subtree, or INT_MAX if we found a loop or went too deep.
*/
static int ep_loop_check_proc(struct eventpoll *ep, int depth)
{
- int error = 0;
+ int result = 0;
struct rb_node *rbp;
struct epitem *epi;
+ if (ep->gen == loop_check_gen)
+ return ep->loop_check_depth;
+
mutex_lock_nested(&ep->mtx, depth + 1);
ep->gen = loop_check_gen;
for (rbp = rb_first_cached(&ep->rbr); rbp; rbp = rb_next(rbp)) {
@@ -2166,13 +2168,11 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth)
if (unlikely(is_file_epoll(epi->ffd.file))) {
struct eventpoll *ep_tovisit;
ep_tovisit = epi->ffd.file->private_data;
- if (ep_tovisit->gen == loop_check_gen)
- continue;
if (ep_tovisit == inserting_into || depth > EP_MAX_NESTS)
- error = -1;
+ result = INT_MAX;
else
- error = ep_loop_check_proc(ep_tovisit, depth + 1);
- if (error != 0)
+ result = max(result, ep_loop_check_proc(ep_tovisit, depth + 1) + 1);
+ if (result > EP_MAX_NESTS)
break;
} else {
/*
@@ -2186,9 +2186,27 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth)
list_file(epi->ffd.file);
}
}
+ ep->loop_check_depth = result;
mutex_unlock(&ep->mtx);
- return error;
+ return result;
+}
+
+/**
+ * ep_get_upwards_depth_proc - determine depth of @ep when traversed upwards
+ */
+static int ep_get_upwards_depth_proc(struct eventpoll *ep, int depth)
+{
+ int result = 0;
+ struct epitem *epi;
+
+ if (ep->gen == loop_check_gen)
+ return ep->loop_check_depth;
+ hlist_for_each_entry_rcu(epi, &ep->refs, fllink)
+ result = max(result, ep_get_upwards_depth_proc(epi->ep, depth + 1) + 1);
+ ep->gen = loop_check_gen;
+ ep->loop_check_depth = result;
+ return result;
}
/**
@@ -2204,8 +2222,22 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth)
*/
static int ep_loop_check(struct eventpoll *ep, struct eventpoll *to)
{
+ int depth, upwards_depth;
+
inserting_into = ep;
- return ep_loop_check_proc(to, 0);
+ /*
+ * Check how deep down we can get from @to, and whether it is possible
+ * to loop up to @ep.
+ */
+ depth = ep_loop_check_proc(to, 0);
+ if (depth > EP_MAX_NESTS)
+ return -1;
+ /* Check how far up we can go from @ep. */
+ rcu_read_lock();
+ upwards_depth = ep_get_upwards_depth_proc(ep, 0);
+ rcu_read_unlock();
+
+ return (depth+1+upwards_depth > EP_MAX_NESTS) ? -1 : 0;
}
static void clear_tfile_check_list(void)
---
base-commit: 0ff41df1cb268fc69e703a08a57ee14ae967d0ca
change-id: 20250711-epoll-recursion-fix-fb0e336b2aeb
--
Jann Horn <jannh@google.com>
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] eventpoll: Fix semi-unbounded recursion
2025-07-11 16:33 [PATCH] eventpoll: Fix semi-unbounded recursion Jann Horn
@ 2025-07-11 16:38 ` Jann Horn
2025-07-31 22:31 ` Carlos Llamas
1 sibling, 0 replies; 4+ messages in thread
From: Jann Horn @ 2025-07-11 16:38 UTC (permalink / raw)
To: Alexander Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-kernel, stable
On Fri, Jul 11, 2025 at 6:33 PM Jann Horn <jannh@google.com> wrote:
> A more thorough check is done in reverse_path_check() after the new graph
> edge has already been created; this checks, among other things, that no
> paths going upwards from any non-epoll file with a length of more than 5
> edges exist. However, this check does not apply to non-epoll files.
... of course directly after sending this I notice that I put one too
many negations in this sentence.
s/does not apply to non-epoll files/does not apply to epoll files/
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] eventpoll: Fix semi-unbounded recursion
2025-07-11 16:33 [PATCH] eventpoll: Fix semi-unbounded recursion Jann Horn
2025-07-11 16:38 ` Jann Horn
@ 2025-07-31 22:31 ` Carlos Llamas
2025-08-04 14:55 ` Jann Horn
1 sibling, 1 reply; 4+ messages in thread
From: Carlos Llamas @ 2025-07-31 22:31 UTC (permalink / raw)
To: Jann Horn
Cc: Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel,
linux-kernel, stable
On Fri, Jul 11, 2025 at 06:33:36PM +0200, Jann Horn wrote:
> Ensure that epoll instances can never form a graph deeper than
> EP_MAX_NESTS+1 links.
>
> Currently, ep_loop_check_proc() ensures that the graph is loop-free and
> does some recursion depth checks, but those recursion depth checks don't
> limit the depth of the resulting tree for two reasons:
>
> - They don't look upwards in the tree.
> - If there are multiple downwards paths of different lengths, only one of
> the paths is actually considered for the depth check since commit
> 28d82dc1c4ed ("epoll: limit paths").
>
> Essentially, the current recursion depth check in ep_loop_check_proc() just
> serves to prevent it from recursing too deeply while checking for loops.
>
> A more thorough check is done in reverse_path_check() after the new graph
> edge has already been created; this checks, among other things, that no
> paths going upwards from any non-epoll file with a length of more than 5
> edges exist. However, this check does not apply to non-epoll files.
>
> As a result, it is possible to recurse to a depth of at least roughly 500,
> tested on v6.15. (I am unsure if deeper recursion is possible; and this may
> have changed with commit 8c44dac8add7 ("eventpoll: Fix priority inversion
> problem").)
>
> To fix it:
>
> 1. In ep_loop_check_proc(), note the subtree depth of each visited node,
> and use subtree depths for the total depth calculation even when a subtree
> has already been visited.
> 2. Add ep_get_upwards_depth_proc() for similarly determining the maximum
> depth of an upwards walk.
> 3. In ep_loop_check(), use these values to limit the total path length
> between epoll nodes to EP_MAX_NESTS edges.
>
> Fixes: 22bacca48a17 ("epoll: prevent creating circular epoll structures")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jann Horn <jannh@google.com>
> ---
Hey Jann,
I've bisected an LTP test failure to this commit and I can't find any
reports of this. The test is epoll_ctl04:
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/epoll_ctl/epoll_ctl04.c
This is what I get:
####################################################################3
root@debian:~# ./epoll_ctl04
tst_test.c:2004: TINFO: LTP version: 20250530-116-g91e6272fe
tst_test.c:2007: TINFO: Tested kernel: 6.16.0-rc1-00017-gf2e467a48287 #28 SMP PREEMPT Thu Jul 31 21:12:22 UTC 2025 aarch64
tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
tst_test.c:1825: TINFO: Overall timeout per run is 0h 00m 30s
epoll_ctl04.c:59: TFAIL: epoll_ctl(..., EPOLL_CTL_ADD, ...) with number of nesting is 5 expected EINVAL: ELOOP (40)
Summary:
passed 0
failed 1
broken 0
skipped 0
warnings 0
####################################################################3
I haven't looked much into this but it seems the test expects EINVAL at
nesting depth 5 and is instead getting ELOOP. Any chance there is an
off-by-one error in ep_loop_check() as it fails with depth=4 and
upwards_depth=0, which isn't correct?
---
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 44648cc09250..811960b2a74c 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2237,7 +2237,7 @@ static int ep_loop_check(struct eventpoll *ep, struct eventpoll *to)
upwards_depth = ep_get_upwards_depth_proc(ep, 0);
rcu_read_unlock();
- return (depth+1+upwards_depth > EP_MAX_NESTS) ? -1 : 0;
+ return (depth+upwards_depth > EP_MAX_NESTS) ? -1 : 0;
}
static void clear_tfile_check_list(void)
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] eventpoll: Fix semi-unbounded recursion
2025-07-31 22:31 ` Carlos Llamas
@ 2025-08-04 14:55 ` Jann Horn
0 siblings, 0 replies; 4+ messages in thread
From: Jann Horn @ 2025-08-04 14:55 UTC (permalink / raw)
To: Carlos Llamas
Cc: Alexander Viro, Christian Brauner, Jan Kara, linux-fsdevel,
linux-kernel, stable
On Fri, Aug 1, 2025 at 12:31 AM Carlos Llamas <cmllamas@google.com> wrote:
> On Fri, Jul 11, 2025 at 06:33:36PM +0200, Jann Horn wrote:
> > Ensure that epoll instances can never form a graph deeper than
> > EP_MAX_NESTS+1 links.
> >
> > Currently, ep_loop_check_proc() ensures that the graph is loop-free and
> > does some recursion depth checks, but those recursion depth checks don't
> > limit the depth of the resulting tree for two reasons:
> >
> > - They don't look upwards in the tree.
> > - If there are multiple downwards paths of different lengths, only one of
> > the paths is actually considered for the depth check since commit
> > 28d82dc1c4ed ("epoll: limit paths").
> >
> > Essentially, the current recursion depth check in ep_loop_check_proc() just
> > serves to prevent it from recursing too deeply while checking for loops.
> >
> > A more thorough check is done in reverse_path_check() after the new graph
> > edge has already been created; this checks, among other things, that no
> > paths going upwards from any non-epoll file with a length of more than 5
> > edges exist. However, this check does not apply to non-epoll files.
> >
> > As a result, it is possible to recurse to a depth of at least roughly 500,
> > tested on v6.15. (I am unsure if deeper recursion is possible; and this may
> > have changed with commit 8c44dac8add7 ("eventpoll: Fix priority inversion
> > problem").)
> >
> > To fix it:
> >
> > 1. In ep_loop_check_proc(), note the subtree depth of each visited node,
> > and use subtree depths for the total depth calculation even when a subtree
> > has already been visited.
> > 2. Add ep_get_upwards_depth_proc() for similarly determining the maximum
> > depth of an upwards walk.
> > 3. In ep_loop_check(), use these values to limit the total path length
> > between epoll nodes to EP_MAX_NESTS edges.
> >
> > Fixes: 22bacca48a17 ("epoll: prevent creating circular epoll structures")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Jann Horn <jannh@google.com>
> > ---
>
> Hey Jann,
>
> I've bisected an LTP test failure to this commit and I can't find any
> reports of this. The test is epoll_ctl04:
>
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/epoll_ctl/epoll_ctl04.c
>
> This is what I get:
> ####################################################################3
> root@debian:~# ./epoll_ctl04
> tst_test.c:2004: TINFO: LTP version: 20250530-116-g91e6272fe
> tst_test.c:2007: TINFO: Tested kernel: 6.16.0-rc1-00017-gf2e467a48287 #28 SMP PREEMPT Thu Jul 31 21:12:22 UTC 2025 aarch64
> tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz'
> tst_test.c:1825: TINFO: Overall timeout per run is 0h 00m 30s
> epoll_ctl04.c:59: TFAIL: epoll_ctl(..., EPOLL_CTL_ADD, ...) with number of nesting is 5 expected EINVAL: ELOOP (40)
>
> Summary:
> passed 0
> failed 1
> broken 0
> skipped 0
> warnings 0
> ####################################################################3
>
>
> I haven't looked much into this but it seems the test expects EINVAL at
> nesting depth 5 and is instead getting ELOOP. Any chance there is an
> off-by-one error in ep_loop_check() as it fails with depth=4 and
> upwards_depth=0, which isn't correct?
This is an area where the existing code is very inconsistent in how it
reports errors; limits on the structure of the epoll graph (in
particular limits on the graph depth) are enforced by both
ep_loop_check() (which fails with ELOOP) and reverse_path_check()
(which fails with EINVAL). The comments suggest that ep_loop_check()
is supposed to be doing depth checks, and reverse_path_check() is
mostly supposed to count wakeup paths, but reverse_path_check()
happens to be what you hit in that LTP testcase.
The manpage also says that ELOOP is for too-deep nesting, and does not
mention this case of EINVAL at all. (It doesn't even mention the
wakeup path count restriction in the ERRORS section...)
I think this is one of those cases where the existing semantics are
too convoluted and tainted with kernel implementation details for
userspace to have handled the existing error cases well enough to be
broken by this change; the existing behavior was something like (not
sure if I'm getting it exactly right) "ELOOP is for loops; EINVAL is
for hitting a depth limit when constructing a chain of epoll instances
with a file at the bottom; ELOOP is for hitting a depth limit when
constructing a chain of epoll instances with no file at the bottom and
you'd only get it depending on which way around you build the chain
and, for more complex structures, in what order the addresses of
kernel objects are"; and the implementation was different from what
the manpage says.
So my opinion is that the right fix is to change the testcase to also
accept ELOOP, though I can see how a bugfix that breaks a unit test is
going to raise eyebrows.
> ---
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index 44648cc09250..811960b2a74c 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -2237,7 +2237,7 @@ static int ep_loop_check(struct eventpoll *ep, struct eventpoll *to)
> upwards_depth = ep_get_upwards_depth_proc(ep, 0);
> rcu_read_unlock();
>
> - return (depth+1+upwards_depth > EP_MAX_NESTS) ? -1 : 0;
> + return (depth+upwards_depth > EP_MAX_NESTS) ? -1 : 0;
Here I am calculating: We want to create a new link between two nodes,
and want to know how long the longest resulting chain of epoll
instances will be. For that, I add the following numbers of links:
- The number of links going upwards from "ep".
- One for the new link we're adding.
- The number of links going downward from "to".
So I think this is correct.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-08-04 14:56 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-11 16:33 [PATCH] eventpoll: Fix semi-unbounded recursion Jann Horn
2025-07-11 16:38 ` Jann Horn
2025-07-31 22:31 ` Carlos Llamas
2025-08-04 14:55 ` Jann Horn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).