* [PATCH net-next v3 0/2] Fix dangling pointer at f6i->gc_link.
@ 2023-12-13 21:37 thinker.li
2023-12-13 21:37 ` [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree thinker.li
2023-12-13 21:37 ` [PATCH net-next v3 2/2] selftests: fib_tests: Add tests for toggling between w/ and w/o expires thinker.li
0 siblings, 2 replies; 10+ messages in thread
From: thinker.li @ 2023-12-13 21:37 UTC (permalink / raw)
To: netdev, martin.lau, kernel-team, davem, kuba, pabeni, dsahern,
edumazet
Cc: sinquersw, kuifeng, Kui-Feng Lee
From: Kui-Feng Lee <thinker.li@gmail.com>
According to a report [1], f6i->gc_link may point to a free block
causing use-after-free. According the stacktraces in the report, it is
very likely that a f6i was added to the GC list after being removed
from the tree of a fib6_table. The reason this can happen is the
current implementation determines if a f6i is on a tree in a wrong
way. It believes a f6i is on a tree if f6i->fib6_table is not NULL.
However, f6i->fib6_table is not reset when f6i is removed from a tree.
The new solution is to check if f6i->fib6_node is not NULL as well.
f6i->fib6_node is set/or reset when the f6i is added/or removed from
from a tree. It can be used to determines if a f6i is on a tree
reliably.
The other change is to determine if a f6i is on a GC list. The
current implementation relies on RTF_EXPIRES on fib6_flags. It needs
to consider if a f6i is on a tree as well. The new solution is
checking hlist_unhashed() on f6i->gc_link, a clear evidence, instead.
[1] https://lore.kernel.org/all/20231205173250.2982846-1-edumazet@google.com/
---
Major changes from v2:
- Ensure dependencies of checks in the test cases.
Major changes from v1:
- Split fib6_set_expires_locked() and fib6_clean_expires_locked()
- Use hlist_unhashed() on gc_link instead of checking RTF_EXPIRES to
determine if a f6i is on a GC list.
- Add tests on toggling routes between permanent and temporary.
v2: https://lore.kernel.org/all/20231208194523.312416-1-thinker.li@gmail.com/
v1: https://lore.kernel.org/all/20231207221627.746324-1-thinker.li@gmail.com/
Kui-Feng Lee (2):
net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table
tree.
selftests: fib_tests: Add tests for toggling between w/ and w/o
expires.
include/net/ip6_fib.h | 46 +++++++++----
net/ipv6/route.c | 6 +-
tools/testing/selftests/net/fib_tests.sh | 82 +++++++++++++++++++++++-
3 files changed, 118 insertions(+), 16 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree.
2023-12-13 21:37 [PATCH net-next v3 0/2] Fix dangling pointer at f6i->gc_link thinker.li
@ 2023-12-13 21:37 ` thinker.li
2023-12-14 6:11 ` David Ahern
2023-12-13 21:37 ` [PATCH net-next v3 2/2] selftests: fib_tests: Add tests for toggling between w/ and w/o expires thinker.li
1 sibling, 1 reply; 10+ messages in thread
From: thinker.li @ 2023-12-13 21:37 UTC (permalink / raw)
To: netdev, martin.lau, kernel-team, davem, kuba, pabeni, dsahern,
edumazet
Cc: sinquersw, kuifeng, Kui-Feng Lee, syzbot+c15aa445274af8674f41
From: Kui-Feng Lee <thinker.li@gmail.com>
Check f6i->fib6_node and hlist_unhashed(&f6i->gc_link) before inserting a
f6i (fib6_info) to tb6_gc_hlist.
The current implementation checks if f6i->fib6_table is not NULL to
determines if a f6i is on a tree, however it is not enough. When a f6i is
removed from a fib6_table, f6i->fib6_table is not reset. However, fib6_node
is always reset when a f6i is removed from a fib6_table and is set when a
f6i is added to a fib6_table. So, f6i->fib6_node is a reliable way to
determine if a f6i is on a tree.
The current implementation checks RTF_EXPIRES on f6i->fib6_flags to
determine if a f6i is on a GC list. It also consider if the f6i is on a
tree before making a conclusion. This is indirect and complicated. The new
solution is checking hlist_unhashed(&f6i->gc_link), a clear evidence.
Putting them together, these changes provide more reliable signals to
determines if a f6i should be added/or removed to a GC list.
Fixes: 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated list of routes.")
Reported-by: syzbot+c15aa445274af8674f41@syzkaller.appspotmail.com
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: dsahern@kernel.org
---
include/net/ip6_fib.h | 46 ++++++++++++++++++++++++++++++++-----------
net/ipv6/route.c | 6 +++---
2 files changed, 38 insertions(+), 14 deletions(-)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 1ba9f4ddf2f6..1213722c394f 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -500,21 +500,47 @@ void fib6_gc_cleanup(void);
int fib6_init(void);
-/* fib6_info must be locked by the caller, and fib6_info->fib6_table can be
- * NULL.
- */
-static inline void fib6_set_expires_locked(struct fib6_info *f6i,
- unsigned long expires)
+static inline void fib6_add_gc_list(struct fib6_info *f6i)
{
struct fib6_table *tb6;
tb6 = f6i->fib6_table;
- f6i->expires = expires;
- if (tb6 && !fib6_has_expires(f6i))
+ if (tb6 &&
+ rcu_dereference_protected(f6i->fib6_node,
+ lockdep_is_held(&tb6->tb6_lock)) &&
+ hlist_unhashed(&f6i->gc_link))
hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);
+}
+
+static inline void fib6_del_gc_list(struct fib6_info *f6i)
+{
+ if (!hlist_unhashed(&f6i->gc_link))
+ hlist_del_init(&f6i->gc_link);
+}
+
+static inline void __fib6_set_expires(struct fib6_info *f6i,
+ unsigned long expires)
+{
+ f6i->expires = expires;
f6i->fib6_flags |= RTF_EXPIRES;
}
+static inline void __fib6_clean_expires(struct fib6_info *f6i)
+{
+ f6i->fib6_flags &= ~RTF_EXPIRES;
+ f6i->expires = 0;
+}
+
+/* fib6_info must be locked by the caller, and fib6_info->fib6_table can be
+ * NULL.
+ */
+static inline void fib6_set_expires_locked(struct fib6_info *f6i,
+ unsigned long expires)
+{
+ __fib6_set_expires(f6i, expires);
+ fib6_add_gc_list(f6i);
+}
+
/* fib6_info must be locked by the caller, and fib6_info->fib6_table can be
* NULL. If fib6_table is NULL, the fib6_info will no be inserted into the
* list of GC candidates until it is inserted into a table.
@@ -529,10 +555,8 @@ static inline void fib6_set_expires(struct fib6_info *f6i,
static inline void fib6_clean_expires_locked(struct fib6_info *f6i)
{
- if (fib6_has_expires(f6i))
- hlist_del_init(&f6i->gc_link);
- f6i->fib6_flags &= ~RTF_EXPIRES;
- f6i->expires = 0;
+ fib6_del_gc_list(f6i);
+ __fib6_clean_expires(f6i);
}
static inline void fib6_clean_expires(struct fib6_info *f6i)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b132feae3393..dcaeb88d73aa 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3763,10 +3763,10 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
rt->dst_nocount = true;
if (cfg->fc_flags & RTF_EXPIRES)
- fib6_set_expires_locked(rt, jiffies +
- clock_t_to_jiffies(cfg->fc_expires));
+ __fib6_set_expires(rt, jiffies +
+ clock_t_to_jiffies(cfg->fc_expires));
else
- fib6_clean_expires_locked(rt);
+ __fib6_clean_expires(rt);
if (cfg->fc_protocol == RTPROT_UNSPEC)
cfg->fc_protocol = RTPROT_BOOT;
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH net-next v3 2/2] selftests: fib_tests: Add tests for toggling between w/ and w/o expires.
2023-12-13 21:37 [PATCH net-next v3 0/2] Fix dangling pointer at f6i->gc_link thinker.li
2023-12-13 21:37 ` [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree thinker.li
@ 2023-12-13 21:37 ` thinker.li
2023-12-14 3:32 ` Hangbin Liu
1 sibling, 1 reply; 10+ messages in thread
From: thinker.li @ 2023-12-13 21:37 UTC (permalink / raw)
To: netdev, martin.lau, kernel-team, davem, kuba, pabeni, dsahern,
edumazet
Cc: sinquersw, kuifeng, Kui-Feng Lee, Hangbin Liu
From: Kui-Feng Lee <thinker.li@gmail.com>
Make sure that toggling routes between w/ expires and w/o expires works
properly with GC list.
When a route with expires is replaced by a permanent route, the entry
should be removed from the gc list. When a permanent routes is replaced by
a temporary route, the new entry should be added to the gc list. The new
tests check if these basic operators work properly.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Cc: Hangbin Liu <liuhangbin@gmail.com>
---
tools/testing/selftests/net/fib_tests.sh | 82 +++++++++++++++++++++++-
1 file changed, 80 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
index 66d0db7a2614..337d0febd796 100755
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -785,6 +785,8 @@ fib6_gc_test()
ret=0
fi
+ log_test $ret 0 "ipv6 route garbage collection"
+
# Permanent routes
for i in $(seq 1 5000); do
$IP -6 route add 2001:30::$i \
@@ -806,9 +808,85 @@ fib6_gc_test()
ret=0
fi
- set +e
+ log_test $ret 0 "ipv6 route garbage collection (with permanent routes)"
- log_test $ret 0 "ipv6 route garbage collection"
+ # Delete permanent routes
+ for i in $(seq 1 5000); do
+ $IP -6 route del 2001:30::$i \
+ via 2001:10::2 dev dummy_10
+ done
+
+ # Permanent routes
+ for i in $(seq 1 100); do
+ # Expire route after $EXPIRE seconds
+ $IP -6 route add 2001:20::$i \
+ via 2001:10::2 dev dummy_10
+ done
+ # Replace with temporary routes
+ for i in $(seq 1 100); do
+ # Expire route after $EXPIRE seconds
+ $IP -6 route replace 2001:20::$i \
+ via 2001:10::2 dev dummy_10 expires $EXPIRE
+ done
+ N_EXP_SLEEP=$($IP -6 route list |grep expires|wc -l)
+ if [ $N_EXP_SLEEP -ne 100 ]; then
+ echo "FAIL: expected 100 routes with expires, got $N_EXP_SLEEP"
+ ret=1
+ else
+ ret=0
+ fi
+
+ if [ $ret -eq 0 ]; then
+ sleep $(($EXPIRE * 2 + 1))
+ N_EXP_SLEEP=$($IP -6 route list |grep expires|wc -l)
+ if [ $N_EXP_SLEEP -ne 0 ]; then
+ echo "FAIL: expected 0 routes with expires," \
+ "got $N_EXP_SLEEP"
+ ret=1
+ else
+ ret=0
+ fi
+ fi
+
+ if [ $ret -eq 0 ]; then
+ PERM_BASE=$($IP -6 route list |grep -v expires|wc -l)
+ # Temporary routes
+ for i in $(seq 1 100); do
+ # Expire route after $EXPIRE seconds
+ $IP -6 route add 2001:20::$i \
+ via 2001:10::2 dev dummy_10 expires $EXPIRE
+ done
+ # Replace with permanent routes
+ for i in $(seq 1 100); do
+ # Expire route after $EXPIRE seconds
+ $IP -6 route replace 2001:20::$i \
+ via 2001:10::2 dev dummy_10
+ done
+ N_EXP_SLEEP=$($IP -6 route list |grep expires|wc -l)
+ if [ $N_EXP_SLEEP -ne 0 ]; then
+ echo "FAIL: expected 0 routes with expires," \
+ "got $N_EXP_SLEEP"
+ ret=1
+ else
+ ret=0
+ fi
+ fi
+ if [ $ret -eq 0 ]; then
+ sleep $(($EXPIRE * 2 + 1))
+ N_EXP_PERM=$($IP -6 route list |grep -v expires|wc -l)
+ N_EXP_PERM=$(($N_EXP_PERM - $PERM_BASE))
+ if [ $N_EXP_PERM -ne 100 ]; then
+ echo "FAIL: expected 100 permanent routes," \
+ "got $N_EXP_PERM"
+ ret=1
+ else
+ ret=0
+ fi
+ fi
+
+ log_test $ret 0 "ipv6 route garbage collection (replace with permanent)"
+
+ set +e
cleanup &> /dev/null
}
--
2.34.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v3 2/2] selftests: fib_tests: Add tests for toggling between w/ and w/o expires.
2023-12-13 21:37 ` [PATCH net-next v3 2/2] selftests: fib_tests: Add tests for toggling between w/ and w/o expires thinker.li
@ 2023-12-14 3:32 ` Hangbin Liu
0 siblings, 0 replies; 10+ messages in thread
From: Hangbin Liu @ 2023-12-14 3:32 UTC (permalink / raw)
To: thinker.li
Cc: netdev, martin.lau, kernel-team, davem, kuba, pabeni, dsahern,
edumazet, sinquersw, kuifeng
Hi Kui-Feng,
On Wed, Dec 13, 2023 at 01:37:35PM -0800, thinker.li@gmail.com wrote:
> From: Kui-Feng Lee <thinker.li@gmail.com>
>
> Make sure that toggling routes between w/ expires and w/o expires works
> properly with GC list.
>
> When a route with expires is replaced by a permanent route, the entry
> should be removed from the gc list. When a permanent routes is replaced by
> a temporary route, the new entry should be added to the gc list. The new
> tests check if these basic operators work properly.
>
> Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
> Cc: Hangbin Liu <liuhangbin@gmail.com>
> ---
> tools/testing/selftests/net/fib_tests.sh | 82 +++++++++++++++++++++++-
> 1 file changed, 80 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh
> index 66d0db7a2614..337d0febd796 100755
> --- a/tools/testing/selftests/net/fib_tests.sh
> +++ b/tools/testing/selftests/net/fib_tests.sh
> @@ -785,6 +785,8 @@ fib6_gc_test()
> ret=0
> fi
>
> + log_test $ret 0 "ipv6 route garbage collection"
If the ret doesn't affect the later tests. You can simple do like:
if [ $N_EXP_SLEEP -ne 0 ]; then
log_test 1 0 "fib6_gc: expected 0 routes with expires, got $N_EXP_SLEEP"
fi
> +
> # Permanent routes
> for i in $(seq 1 5000); do
> $IP -6 route add 2001:30::$i \
> @@ -806,9 +808,85 @@ fib6_gc_test()
> ret=0
> fi
>
> - set +e
> + log_test $ret 0 "ipv6 route garbage collection (with permanent routes)"
>
> - log_test $ret 0 "ipv6 route garbage collection"
> + # Delete permanent routes
> + for i in $(seq 1 5000); do
> + $IP -6 route del 2001:30::$i \
> + via 2001:10::2 dev dummy_10
> + done
> +
> + # Permanent routes
> + for i in $(seq 1 100); do
> + # Expire route after $EXPIRE seconds
^^ These are permanent routes, no expires
> + $IP -6 route add 2001:20::$i \
> + via 2001:10::2 dev dummy_10
> + done
> + # Replace with temporary routes
> + for i in $(seq 1 100); do
> + # Expire route after $EXPIRE seconds
> + $IP -6 route replace 2001:20::$i \
> + via 2001:10::2 dev dummy_10 expires $EXPIRE
> + done
> + N_EXP_SLEEP=$($IP -6 route list |grep expires|wc -l)
> + if [ $N_EXP_SLEEP -ne 100 ]; then
> + echo "FAIL: expected 100 routes with expires, got $N_EXP_SLEEP"
> + ret=1
> + else
> + ret=0
> + fi
> +
> + if [ $ret -eq 0 ]; then
All these if/else ret setting/checking looks redundant. Either we can just return
when one test failed (so there is no need to check if $ret -eq 0).
if [ $N_EXP_SLEEP -ne 100 ]; then
log_test 1 0 "fib6_gc: replace permanent to temporary: expected 100 routes with expires, got $N_EXP_SLEEP"
cleanup &> /dev/null
return
fi
Or, use different subnet for testing. So the next one doesn't affect the
previous test. Then there is no need to call "cleanup && return" for every
failed check. e.g.
do temporary route test with 2001:20:: subnet
if [ $N_EXP_SLEEP -ne 0 ]; then
log_test 1 0 "some log info"
fi
do permanent route + temp route with 2001:30:: subnet
if [ $N_EXP_SLEEP -ne 0 ]; then
log_test 1 0 "some log info"
fi
(Here we'd better remove the 5000 permanent route :), or just del and re-add
the interface directly.)
do permanent route with replace to temp route with 2001:40:: subnet
if [ $N_EXP_SLEEP -ne 100 ]; then
log_test 1 0 "some log info"
else
sleep and recheck the route number
fi
etc.
> + sleep $(($EXPIRE * 2 + 1))
> + N_EXP_SLEEP=$($IP -6 route list |grep expires|wc -l)
> + if [ $N_EXP_SLEEP -ne 0 ]; then
> + echo "FAIL: expected 0 routes with expires," \
> + "got $N_EXP_SLEEP"
> + ret=1
> + else
> + ret=0
> + fi
> + fi
> +
> + if [ $ret -eq 0 ]; then
> + PERM_BASE=$($IP -6 route list |grep -v expires|wc -l)
> + # Temporary routes
> + for i in $(seq 1 100); do
> + # Expire route after $EXPIRE seconds
> + $IP -6 route add 2001:20::$i \
> + via 2001:10::2 dev dummy_10 expires $EXPIRE
> + done
> + # Replace with permanent routes
> + for i in $(seq 1 100); do
> + # Expire route after $EXPIRE seconds
^^ These are permanent routes.
Thanks
Hangbin
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree.
2023-12-13 21:37 ` [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree thinker.li
@ 2023-12-14 6:11 ` David Ahern
2023-12-14 23:43 ` Kui-Feng Lee
2023-12-15 19:12 ` Kui-Feng Lee
0 siblings, 2 replies; 10+ messages in thread
From: David Ahern @ 2023-12-14 6:11 UTC (permalink / raw)
To: thinker.li, netdev, martin.lau, kernel-team, davem, kuba, pabeni,
edumazet
Cc: sinquersw, kuifeng, syzbot+c15aa445274af8674f41
On 12/13/23 2:37 PM, thinker.li@gmail.com wrote:
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index b132feae3393..dcaeb88d73aa 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -3763,10 +3763,10 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
> rt->dst_nocount = true;
>
> if (cfg->fc_flags & RTF_EXPIRES)
> - fib6_set_expires_locked(rt, jiffies +
> - clock_t_to_jiffies(cfg->fc_expires));
> + __fib6_set_expires(rt, jiffies +
> + clock_t_to_jiffies(cfg->fc_expires));
> else
> - fib6_clean_expires_locked(rt);
> + __fib6_clean_expires(rt);
as Eric noted in a past comment, the clean is not needed in this
function since memory is initialized to 0 (expires is never set).
Also, this patch set does not fundamentally change the logic, so it
cannot fix the bug reported in
https://lore.kernel.org/all/20231205173250.2982846-1-edumazet@google.com/
please hold off future versions of this set until the problem in that
stack traced is fixed. I have tried a few things using RA's, but have
not been able to recreate UAF.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree.
2023-12-14 6:11 ` David Ahern
@ 2023-12-14 23:43 ` Kui-Feng Lee
2023-12-15 19:12 ` Kui-Feng Lee
1 sibling, 0 replies; 10+ messages in thread
From: Kui-Feng Lee @ 2023-12-14 23:43 UTC (permalink / raw)
To: David Ahern, thinker.li, netdev, martin.lau, kernel-team, davem,
kuba, pabeni, edumazet
Cc: kuifeng, syzbot+c15aa445274af8674f41
On 12/13/23 22:11, David Ahern wrote:
> On 12/13/23 2:37 PM, thinker.li@gmail.com wrote:
>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>> index b132feae3393..dcaeb88d73aa 100644
>> --- a/net/ipv6/route.c
>> +++ b/net/ipv6/route.c
>> @@ -3763,10 +3763,10 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
>> rt->dst_nocount = true;
>>
>> if (cfg->fc_flags & RTF_EXPIRES)
>> - fib6_set_expires_locked(rt, jiffies +
>> - clock_t_to_jiffies(cfg->fc_expires));
>> + __fib6_set_expires(rt, jiffies +
>> + clock_t_to_jiffies(cfg->fc_expires));
>> else
>> - fib6_clean_expires_locked(rt);
>> + __fib6_clean_expires(rt);
>
> as Eric noted in a past comment, the clean is not needed in this
> function since memory is initialized to 0 (expires is never set).
>
> Also, this patch set does not fundamentally change the logic, so it
> cannot fix the bug reported in
>
> https://lore.kernel.org/all/20231205173250.2982846-1-edumazet@google.com/
>
> please hold off future versions of this set until the problem in that
> stack traced is fixed. I have tried a few things using RA's, but have
> not been able to recreate UAF.
Sure
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree.
2023-12-14 6:11 ` David Ahern
2023-12-14 23:43 ` Kui-Feng Lee
@ 2023-12-15 19:12 ` Kui-Feng Lee
2023-12-16 18:36 ` David Ahern
2023-12-18 1:16 ` Kui-Feng Lee
1 sibling, 2 replies; 10+ messages in thread
From: Kui-Feng Lee @ 2023-12-15 19:12 UTC (permalink / raw)
To: David Ahern, thinker.li, netdev, martin.lau, kernel-team, davem,
kuba, pabeni, edumazet
Cc: kuifeng, syzbot+c15aa445274af8674f41
On 12/13/23 22:11, David Ahern wrote:
> On 12/13/23 2:37 PM, thinker.li@gmail.com wrote:
>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>> index b132feae3393..dcaeb88d73aa 100644
>> --- a/net/ipv6/route.c
>> +++ b/net/ipv6/route.c
>> @@ -3763,10 +3763,10 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
>> rt->dst_nocount = true;
>>
>> if (cfg->fc_flags & RTF_EXPIRES)
>> - fib6_set_expires_locked(rt, jiffies +
>> - clock_t_to_jiffies(cfg->fc_expires));
>> + __fib6_set_expires(rt, jiffies +
>> + clock_t_to_jiffies(cfg->fc_expires));
>> else
>> - fib6_clean_expires_locked(rt);
>> + __fib6_clean_expires(rt);
>
> as Eric noted in a past comment, the clean is not needed in this
> function since memory is initialized to 0 (expires is never set).
>
> Also, this patch set does not fundamentally change the logic, so it
> cannot fix the bug reported in
>
> https://lore.kernel.org/all/20231205173250.2982846-1-edumazet@google.com/
>
> please hold off future versions of this set until the problem in that
> stack traced is fixed. I have tried a few things using RA's, but have
> not been able to recreate UAF.
I tried to reproduce the issue yesterday, according to the hypothesis
behind the patch of this thread. The following is the instructions
to reproduce the UAF issue. However, this instruction doesn't create
a crash at the place since the memory is still available even it has
been free. But, the log shows a UAF.
The patch at the end of this message is required to reproduce and
show UAF. The most critical change in the patch is to insert
a 'mdelay(3000)' to sleep 3s in rt6_route_rcv(). That gives us
a chance to manipulate the kernel to reproduce the UAF.
Here is my conclusion. There is no protection between finding
a route and changing the route in rt6_route_rcv(), including inserting
the entry to the gc list. It is possible to insert an entry that will be
free later to the gc list, causing a UAF. There is more explanations
along with the following logs.
Instructions:
- Preparation
- install ipv6toolkit on the host.
- run qemu with a patched kernel as a guest through
with the host through qemubr0 a bridge.
- On the guest
- stop systemd-networkd.service & systemd-networkd.socket if
there are.
- sysctl -w net.ipv6.conf.enp0s3.accept_ra=2
- sysctl -w net.ipv6.conf.enp0s3.accept_ra_rt_info_max_plen=127
- Assume the address of qemubr0 in the host is
fe80::4ce9:92ff:fe27:75df.
- Test
- ra6 -i qemubr0 -d ff02::1 -R 'fe82::/64#1#300' -t 300 # On the
host
- sleep 2; ip -6 route del fe82::/64 # On the
guest
- ra6 -i qemubr0 -d ff02::1 -R 'fe82::/64#1#300' -t 300 # On the
host
- ra6 -i qemubr0 -d ff02::1 -R 'fe81::/64#1#300' -t 0 # On the
host
- The step 3 should start immediately after step 2 since we have
a gap of merely 3 seconds in the kernel.
The log generated by the test:
# this is the log triggered by step 1
qemu login: [ 4.673867] __ip6_ins_rt fe80::5054:ff:fe12:3456/128
[ 82.138139] rt = 0000000000000000
[ 82.138731] fib6_info_alloc: ffff888103950200
[ 82.139088] __ip6_ins_rt ::/0
[ 82.139993] fib6_add: add ffff888103950200 to gc list
ffff888103950238 pprev ffff888102878200 next 0000000000000000
[ 82.141719] rt = ffff888103950200
[ 82.141748] rt6_route_rcv
[ 82.141748] fib6_info_alloc: ffff88810093be00
[ 82.141748] __ip6_ins_rt fe82::/64
[ 82.141748] rt6_route_rcv: route info ffff88810093be00, sleep 3s
[ 85.121803] fib6_set_expires_locked: add ffff88810093be00 to gc list
ffff88810093be38 pprev ffff888102878200 next ffff888103950238
[ 85.146497] rt6_route_rcv: route info ffff88810093be00, after release
# This is the log triggered by step 2 & 3.
#
# The line containing fib6_clean_expires_locked is specifically
# triggered by step 2. Step 2 removes the entry from the tree and
# gc list. Step 3 free the entry by calling
# fib6_info_release() at the very end of rt6_route_rcv() since it is
# the last owner of that entry. fib6_info_destroy_rcu proves that.
#
# However, even the entry will be free later, rt6_route_rcv() still add
# the entry back to the gc list before freeing the entry. In other
# words, it create an entry (ffff88810093be38) that is free in
# a gc list.
#
# Keep an eye on ffff88810093be38 and ffff88810093be00. ffff88810093be00
# is the address of a fib6_info and ffff88810093be38 is the address of
# its gc_link.
#
# This log also complies with the log in
#
# https://lore.kernel.org/all/20231205173250.2982846-1-edumazet@google.com/
#
# The entry is free by the call of fib6_info_release() in
# rt6_route_rcv().
[ 105.158140] rt = ffff888103950200
[ 105.158590] rt6_route_rcv
[ 105.158924] rt6_route_rcv: route info ffff88810093be00, sleep 3s
[ 106.368875] fib6_clean_expires_locked: del ffff88810093be00 from gc
list pprev ffff888102878200 next ffff888103950238
[ 107.201815] fib6_set_expires_locked: add ffff88810093be00 to gc list
ffff88810093be38 pprev ffff888102878200 next ffff888103950238
[ 108.159815] rt6_route_rcv: route info ffff88810093be00, after release
[ 108.168807] fib6_info_destroy_rcu ffff88810093be00
# This is the log triggered by step 4.
# The line containing fib6_clean_expires_locked shows the free entry
# mentioned in the previous part is still in the gc list. (pprev
# ffff88810093be38)
# Since fib6_clean_expires_locked() calls hlist_del_init() to remove
# an entry from the gc list, it will change the value of *pprev
# (ffff88810093be38). It causes an UAF case.
[ 131.882130] rt = ffff888103950200
[ 131.882567] __ip6_del_rt ::/0
[ 131.882932] fib6_clean_expires_locked: del ffff888103950200 from gc
list pprev ffff88810093be38 next 0000000000000000
[ 131.883296] rt6_route_rcv
[ 131.883296] fib6_info_alloc: ffff88810093be00
[ 131.884517] __ip6_ins_rt fe81::/64
[ 131.884517] rt6_route_rcv: route info ffff88810093be00, sleep 3s
[ 134.305866] fib6_set_expires_locked: add ffff88810093be00 to gc list
ffff88810093be38 pprev ffff888102878200 next ffff88810093be38
[ 134.537866] rt6_route_rcv: route info ffff88810093be00, after release
[ 134.896801] fib6_info_destroy_rcu ffff888103950200
# The following log is the kernel errors that printed after 10s seconds.
[ 168.321812] watchdog: BUG: soft lockup - CPU#3 stuck for 26s!
[swapper/3:0]
[ 196.289823] watchdog: BUG: soft lockup - CPU#3 stuck for 52s!
[swapper/3:0]
[ 214.244784] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 214.245723] rcu: 3-....: (7 ticks this GP) idle=198c/0/0x1
softirq=2002/2003 fqs=12309
[ 214.245774] rcu: (detected by 2, t=60002 jiffies, g=1213, q=282
ncpus=4)
[ 240.321823] watchdog: BUG: soft lockup - CPU#3 stuck for 93s!
[swapper/3:0]
[ 268.353804] watchdog: BUG: soft lockup - CPU#3 stuck for 119s!
[swapper/3:0]
[ 296.388805] watchdog: BUG: soft lockup - CPU#3 stuck for 146s!
[swapper/3:0]
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 1ba9f4ddf2f6..6e059ba3d2d0 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -510,8 +510,11 @@ static inline void fib6_set_expires_locked(struct
fib6_info *f6i,
tb6 = f6i->fib6_table;
f6i->expires = expires;
- if (tb6 && !fib6_has_expires(f6i))
+ if (tb6 && !fib6_has_expires(f6i)) {
hlist_add_head(&f6i->gc_link, &tb6->tb6_gc_hlist);
+ printk(KERN_CRIT "fib6_set_expires_locked: add %px to gc list %px
pprev %px next %px\n",
+ f6i, &f6i->gc_link, f6i->gc_link.pprev, f6i->gc_link.next);
+ }
f6i->fib6_flags |= RTF_EXPIRES;
}
@@ -529,8 +532,11 @@ static inline void fib6_set_expires(struct
fib6_info *f6i,
static inline void fib6_clean_expires_locked(struct fib6_info *f6i)
{
- if (fib6_has_expires(f6i))
+ if (fib6_has_expires(f6i)) {
+ printk(KERN_CRIT "fib6_clean_expires_locked: del %px from gc list
pprev %px next %px\n",
+ f6i, f6i->gc_link.pprev, f6i->gc_link.next);
hlist_del_init(&f6i->gc_link);
+ }
f6i->fib6_flags &= ~RTF_EXPIRES;
f6i->expires = 0;
}
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 28b01a068412..b275b9798b5e 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -162,6 +162,8 @@ struct fib6_info *fib6_info_alloc(gfp_t gfp_flags,
bool with_fib6_nh)
INIT_HLIST_NODE(&f6i->gc_link);
+ printk(KERN_CRIT "fib6_info_alloc: %px\n", f6i);
+
return f6i;
}
@@ -178,6 +180,7 @@ void fib6_info_destroy_rcu(struct rcu_head *head)
ip_fib_metrics_put(f6i->fib6_metrics);
kfree(f6i);
+ printk(KERN_CRIT "fib6_info_destroy_rcu %px\n", f6i);
}
EXPORT_SYMBOL_GPL(fib6_info_destroy_rcu);
@@ -1486,8 +1489,10 @@ int fib6_add(struct fib6_node *root, struct
fib6_info *rt,
list_add(&rt->nh_list, &rt->nh->f6i_list);
__fib6_update_sernum_upto_root(rt, fib6_new_sernum(info->nl_net));
- if (fib6_has_expires(rt))
+ if (fib6_has_expires(rt)) {
hlist_add_head(&rt->gc_link, &table->tb6_gc_hlist);
+ printk(KERN_CRIT "fib6_add: add %px to gc list %px pprev %px next
%px\n", rt, &rt->gc_link, rt->gc_link.pprev, rt->gc_link.next);
+ }
fib6_start_gc(info->nl_net, rt);
}
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b132feae3393..52283a80f79e 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -935,6 +935,8 @@ int rt6_route_rcv(struct net_device *dev, u8 *opt,
int len,
unsigned long lifetime;
struct fib6_info *rt;
+ printk(KERN_CRIT "rt6_route_rcv\n");
+
if (len < sizeof(struct route_info)) {
return -EINVAL;
}
@@ -989,12 +991,15 @@ int rt6_route_rcv(struct net_device *dev, u8 *opt,
int len,
(rt->fib6_flags & ~RTF_PREF_MASK) | RTF_PREF(pref);
if (rt) {
+ printk(KERN_CRIT "rt6_route_rcv: route info %px, sleep 3s\n", rt);
+ mdelay(3000);
if (!addrconf_finite_timeout(lifetime))
fib6_clean_expires(rt);
else
fib6_set_expires(rt, jiffies + HZ * lifetime);
fib6_info_release(rt);
+ printk(KERN_CRIT "rt6_route_rcv: route info %px, after release\n", rt);
}
return 0;
}
@@ -1297,6 +1302,9 @@ static int __ip6_ins_rt(struct fib6_info *rt,
struct nl_info *info,
{
int err;
struct fib6_table *table;
+ if (rt)
+ printk(KERN_CRIT "__ip6_ins_rt %pI6c/%d\n",
+ &rt->fib6_dst.addr, rt->fib6_dst.plen);
table = rt->fib6_table;
spin_lock_bh(&table->tb6_lock);
@@ -3855,6 +3863,9 @@ static int __ip6_del_rt(struct fib6_info *rt,
struct nl_info *info)
struct net *net = info->nl_net;
struct fib6_table *table;
int err;
+ if (rt)
+ printk(KERN_CRIT "__ip6_del_rt %pI6c/%d\n",
+ &rt->fib6_dst.addr, rt->fib6_dst.plen);
if (rt == net->ipv6.fib6_null_entry) {
err = -ENOENT;
@@ -4345,8 +4356,10 @@ struct fib6_info *rt6_get_dflt_router(struct net
*net,
ipv6_addr_equal(&nh->fib_nh_gw6, addr))
break;
}
- if (rt && !fib6_info_hold_safe(rt))
+ if (rt && !fib6_info_hold_safe(rt)) {
rt = NULL;
+ }
+ printk(KERN_CRIT "rt = %px\n", rt);
rcu_read_unlock();
return rt;
}
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree.
2023-12-15 19:12 ` Kui-Feng Lee
@ 2023-12-16 18:36 ` David Ahern
2023-12-18 1:05 ` Kui-Feng Lee
2023-12-18 1:16 ` Kui-Feng Lee
1 sibling, 1 reply; 10+ messages in thread
From: David Ahern @ 2023-12-16 18:36 UTC (permalink / raw)
To: Kui-Feng Lee, thinker.li, netdev, martin.lau, kernel-team, davem,
kuba, pabeni, edumazet
Cc: kuifeng, syzbot+c15aa445274af8674f41
On 12/15/23 11:12 AM, Kui-Feng Lee wrote:
>
> I tried to reproduce the issue yesterday, according to the hypothesis
> behind the patch of this thread. The following is the instructions
> to reproduce the UAF issue. However, this instruction doesn't create
> a crash at the place since the memory is still available even it has
> been free. But, the log shows a UAF.
>
> The patch at the end of this message is required to reproduce and
> show UAF. The most critical change in the patch is to insert
> a 'mdelay(3000)' to sleep 3s in rt6_route_rcv(). That gives us
> a chance to manipulate the kernel to reproduce the UAF.
>
> Here is my conclusion. There is no protection between finding
> a route and changing the route in rt6_route_rcv(), including inserting
> the entry to the gc list. It is possible to insert an entry that will be
> free later to the gc list, causing a UAF. There is more explanations
> along with the following logs.
TL;DR: I think 3dec89b14d37 should be reverted for 6.6 and 6.7
(selftests should be valid with or without this change) and you try
again for 6.9 (6.7 dev cycle is about to close).
###
I was successful in triggering UAF twice yesterday, but a slightly
different code path that in Eric's trace:
This is the WARN_ON for !hlist_unhashed in fib6_info_release:
[ 46.926339] ------------[ cut here ]------------
[ 46.926368] WARNING: CPU: 3 PID: 0 at include/net/ip6_fib.h:332
fib6_info_release+0x2a/0x43
[ 46.926384] Modules linked in:
[ 46.926393] CPU: 3 PID: 0 Comm: swapper/3 Not tainted
6.7.0-rc4-debug+ #16
[ 46.926399] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.15.0-1 04/01/2014
[ 46.926404] RIP: 0010:fib6_info_release+0x2a/0x43
[ 46.926409] Code: 48 85 ff 74 3d 55 48 89 e5 53 48 89 fb 48 8d 7f 2c
e8 3e ff ff ff 84 c0 74 25 48 8d 7b 40 e8 33 11 8b ff 48 83 7b 40 00 74
02 <0f> 0b 48 8d bb a0 00 00 00 48 c7 c6 93 ea ae 81 e8 3f 00 66 ff 5b
[ 46.926416] RSP: 0018:ffffc900002a8390 EFLAGS: 00010282
[ 46.926422] RAX: 1ffff11002048b00 RBX: ffff888010245c00 RCX:
ffffffff81af8c9e
[ 46.926426] RDX: dffffc0000000000 RSI: 0000000000000004 RDI:
ffff888010245c40
[ 46.926431] RBP: ffffc900002a8398 R08: ffffed1002048b86 R09:
0000000000000001
[ 46.926435] R10: ffffffff81af8be4 R11: ffff888010245c2f R12:
ffff88800aeec000
[ 46.926439] R13: 0000000000000000 R14: ffff88801ab7c200 R15:
0000000000000020
[ 46.926444] FS: 0000000000000000(0000) GS:ffff88806c000000(0000)
knlGS:0000000000000000
[ 46.926448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 46.926452] CR2: 00007f897cc95fb0 CR3: 000000001d69d000 CR4:
0000000000750ef0
[ 46.926458] PKRU: 55555554
[ 46.926462] Call Trace:
[ 46.926466] <IRQ>
[ 46.926470] ? show_regs+0x5c/0x60
[ 46.926478] ? fib6_info_release+0x2a/0x43
[ 46.926483] ? __warn+0xcb/0x19c
[ 46.926489] ? fib6_info_release+0x2a/0x43
[ 46.926495] ? report_bug+0x114/0x186
[ 46.926504] ? handle_bug+0x40/0x6b
[ 46.926510] ? exc_invalid_op+0x19/0x41
[ 46.926515] ? asm_exc_invalid_op+0x1b/0x20
[ 46.926524] ? refcount_dec_and_test+0x15/0x43
[ 46.926530] ? fib6_info_release+0x23/0x43
[ 46.926536] ? fib6_info_release+0x2a/0x43
[ 46.926542] ndisc_router_discovery+0xf41/0xfa6
UAF here:
[ 47.941928]
==================================================================
[ 47.942548] BUG: KASAN: slab-use-after-free in
fib6_gc_all.constprop.0+0x19b/0x294
[ 47.943178] Read of size 8 at addr ffff888010245c38 by task swapper/3/0
[ 47.943866] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W
6.7.0-rc4-debug+ #16
[ 47.944548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.15.0-1 04/01/2014
[ 47.945204] Call Trace:
[ 47.945416] <IRQ>
[ 47.945595] dump_stack_lvl+0x5b/0x82
[ 47.945912] print_address_description.constprop.0+0x7a/0x2eb
[ 47.946391] print_report+0x106/0x1e0
[ 47.946703] ? virt_to_slab+0x9/0x1a
[ 47.947007] ? kmem_cache_debug_flags+0x13/0x1f
[ 47.947392] ? kasan_complete_mode_report_info+0x19e/0x1a1
[ 47.947850] ? fib6_gc_all.constprop.0+0x19b/0x294
[ 47.948254] kasan_report+0x99/0xc4
[ 47.948554] ? fib6_gc_all.constprop.0+0x19b/0x294
[ 47.948962] __asan_load8+0x77/0x79
[ 47.949262] fib6_gc_all.constprop.0+0x19b/0x294
for a route added here:
[ 47.970092] Allocated by task 0:
[ 47.970366] stack_trace_save+0x8d/0xbb
[ 47.970373] kasan_save_stack+0x26/0x46
[ 47.970379] kasan_set_track+0x25/0x2b
[ 47.970385] kasan_save_alloc_info+0x1e/0x21
[ 47.970392] ____kasan_kmalloc+0x6f/0x7b
[ 47.970398] __kasan_kmalloc+0x9/0xb
[ 47.970404] __kmalloc+0x91/0xbf
[ 47.970411] kzalloc+0xf/0x11
[ 47.970416] fib6_info_alloc+0x26/0xa1
[ 47.970422] ip6_route_info_create+0x266/0x6c5
[ 47.970428] ip6_route_add+0x14/0x46
[ 47.970433] rt6_add_dflt_router+0x123/0x1bd
[ 47.970439] ndisc_router_discovery+0x5a4/0xfa6
[ 47.970447] ndisc_rcv+0x1a2/0x1b5
This is just aggressive RA's and aggressive gc with a hack to the stack
to toggle lifetime or metric on every RA (something that can be scripted
with an ra command - set and reset lifetime in every other RA and change
the metric in every RA).
I was targeting this code path because I noticed rt6_add_dflt_router
sets RTF_EXPIRES but does not pass in the expires value. There are races
(like the one you found with a different code path) in the handling of
RTF_EXPIRES, setting the timer, running gc and adding the route entry to
the gc list.
In short there are a number of places in the RA path that need the
lifetime handling cleaned up first to make it all consistent with the
idea of using a linked list to track entries with an expires tag.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree.
2023-12-16 18:36 ` David Ahern
@ 2023-12-18 1:05 ` Kui-Feng Lee
0 siblings, 0 replies; 10+ messages in thread
From: Kui-Feng Lee @ 2023-12-18 1:05 UTC (permalink / raw)
To: David Ahern, thinker.li, netdev, martin.lau, kernel-team, davem,
kuba, pabeni, edumazet
Cc: kuifeng, syzbot+c15aa445274af8674f41
On 12/16/23 10:36, David Ahern wrote:
> On 12/15/23 11:12 AM, Kui-Feng Lee wrote:
>>
>> I tried to reproduce the issue yesterday, according to the hypothesis
>> behind the patch of this thread. The following is the instructions
>> to reproduce the UAF issue. However, this instruction doesn't create
>> a crash at the place since the memory is still available even it has
>> been free. But, the log shows a UAF.
>>
>> The patch at the end of this message is required to reproduce and
>> show UAF. The most critical change in the patch is to insert
>> a 'mdelay(3000)' to sleep 3s in rt6_route_rcv(). That gives us
>> a chance to manipulate the kernel to reproduce the UAF.
>>
>> Here is my conclusion. There is no protection between finding
>> a route and changing the route in rt6_route_rcv(), including inserting
>> the entry to the gc list. It is possible to insert an entry that will be
>> free later to the gc list, causing a UAF. There is more explanations
>> along with the following logs.
>
> TL;DR: I think 3dec89b14d37 should be reverted for 6.6 and 6.7
> (selftests should be valid with or without this change) and you try
> again for 6.9 (6.7 dev cycle is about to close).
>
> ###
>
> I was successful in triggering UAF twice yesterday, but a slightly
> different code path that in Eric's trace:
>
> This is the WARN_ON for !hlist_unhashed in fib6_info_release:
>
> [ 46.926339] ------------[ cut here ]------------
> [ 46.926368] WARNING: CPU: 3 PID: 0 at include/net/ip6_fib.h:332
> fib6_info_release+0x2a/0x43
> [ 46.926384] Modules linked in:
> [ 46.926393] CPU: 3 PID: 0 Comm: swapper/3 Not tainted
> 6.7.0-rc4-debug+ #16
> [ 46.926399] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 1.15.0-1 04/01/2014
> [ 46.926404] RIP: 0010:fib6_info_release+0x2a/0x43
> [ 46.926409] Code: 48 85 ff 74 3d 55 48 89 e5 53 48 89 fb 48 8d 7f 2c
> e8 3e ff ff ff 84 c0 74 25 48 8d 7b 40 e8 33 11 8b ff 48 83 7b 40 00 74
> 02 <0f> 0b 48 8d bb a0 00 00 00 48 c7 c6 93 ea ae 81 e8 3f 00 66 ff 5b
> [ 46.926416] RSP: 0018:ffffc900002a8390 EFLAGS: 00010282
> [ 46.926422] RAX: 1ffff11002048b00 RBX: ffff888010245c00 RCX:
> ffffffff81af8c9e
> [ 46.926426] RDX: dffffc0000000000 RSI: 0000000000000004 RDI:
> ffff888010245c40
> [ 46.926431] RBP: ffffc900002a8398 R08: ffffed1002048b86 R09:
> 0000000000000001
> [ 46.926435] R10: ffffffff81af8be4 R11: ffff888010245c2f R12:
> ffff88800aeec000
> [ 46.926439] R13: 0000000000000000 R14: ffff88801ab7c200 R15:
> 0000000000000020
> [ 46.926444] FS: 0000000000000000(0000) GS:ffff88806c000000(0000)
> knlGS:0000000000000000
> [ 46.926448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 46.926452] CR2: 00007f897cc95fb0 CR3: 000000001d69d000 CR4:
> 0000000000750ef0
> [ 46.926458] PKRU: 55555554
> [ 46.926462] Call Trace:
> [ 46.926466] <IRQ>
> [ 46.926470] ? show_regs+0x5c/0x60
> [ 46.926478] ? fib6_info_release+0x2a/0x43
> [ 46.926483] ? __warn+0xcb/0x19c
> [ 46.926489] ? fib6_info_release+0x2a/0x43
> [ 46.926495] ? report_bug+0x114/0x186
> [ 46.926504] ? handle_bug+0x40/0x6b
> [ 46.926510] ? exc_invalid_op+0x19/0x41
> [ 46.926515] ? asm_exc_invalid_op+0x1b/0x20
> [ 46.926524] ? refcount_dec_and_test+0x15/0x43
> [ 46.926530] ? fib6_info_release+0x23/0x43
> [ 46.926536] ? fib6_info_release+0x2a/0x43
> [ 46.926542] ndisc_router_discovery+0xf41/0xfa6
>
>
> UAF here:
>
> [ 47.941928]
> ==================================================================
> [ 47.942548] BUG: KASAN: slab-use-after-free in
> fib6_gc_all.constprop.0+0x19b/0x294
> [ 47.943178] Read of size 8 at addr ffff888010245c38 by task swapper/3/0
>
> [ 47.943866] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G W
> 6.7.0-rc4-debug+ #16
> [ 47.944548] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> 1.15.0-1 04/01/2014
> [ 47.945204] Call Trace:
> [ 47.945416] <IRQ>
> [ 47.945595] dump_stack_lvl+0x5b/0x82
> [ 47.945912] print_address_description.constprop.0+0x7a/0x2eb
> [ 47.946391] print_report+0x106/0x1e0
> [ 47.946703] ? virt_to_slab+0x9/0x1a
> [ 47.947007] ? kmem_cache_debug_flags+0x13/0x1f
> [ 47.947392] ? kasan_complete_mode_report_info+0x19e/0x1a1
> [ 47.947850] ? fib6_gc_all.constprop.0+0x19b/0x294
> [ 47.948254] kasan_report+0x99/0xc4
> [ 47.948554] ? fib6_gc_all.constprop.0+0x19b/0x294
> [ 47.948962] __asan_load8+0x77/0x79
> [ 47.949262] fib6_gc_all.constprop.0+0x19b/0x294
>
>
> for a route added here:
>
> [ 47.970092] Allocated by task 0:
> [ 47.970366] stack_trace_save+0x8d/0xbb
> [ 47.970373] kasan_save_stack+0x26/0x46
> [ 47.970379] kasan_set_track+0x25/0x2b
> [ 47.970385] kasan_save_alloc_info+0x1e/0x21
> [ 47.970392] ____kasan_kmalloc+0x6f/0x7b
> [ 47.970398] __kasan_kmalloc+0x9/0xb
> [ 47.970404] __kmalloc+0x91/0xbf
> [ 47.970411] kzalloc+0xf/0x11
> [ 47.970416] fib6_info_alloc+0x26/0xa1
> [ 47.970422] ip6_route_info_create+0x266/0x6c5
> [ 47.970428] ip6_route_add+0x14/0x46
> [ 47.970433] rt6_add_dflt_router+0x123/0x1bd
> [ 47.970439] ndisc_router_discovery+0x5a4/0xfa6
> [ 47.970447] ndisc_rcv+0x1a2/0x1b5
>
> This is just aggressive RA's and aggressive gc with a hack to the stack
> to toggle lifetime or metric on every RA (something that can be scripted
> with an ra command - set and reset lifetime in every other RA and change
> the metric in every RA).
>
> I was targeting this code path because I noticed rt6_add_dflt_router
> sets RTF_EXPIRES but does not pass in the expires value. There are races
> (like the one you found with a different code path) in the handling of
> RTF_EXPIRES, setting the timer, running gc and adding the route entry to
> the gc list.
I am not sure what you mean a race in rt6_add_dflt_router().
In this function, it calls ip6_route_add() to add a default route
with RTF_EXPIRES. In ip6_route_add(), it calls ip6_route_info_create()
to create a f6i, and calls __ip6_ins_rt() to add the entry to gc list.
Although there is a gap between ip6_route_info_create() and
__ip6_ins_rt() without any protection, it should not cause a race.
The newly created entry is not available to the rest of the system
until __ip6_ins_rt() adds it to the tree. And, adding the entry
to the gc list and adding the entry to the tree are performed
atomically, being protected by table->tb6_lock. So, it should not have
a race between adding to tree and adding to gc list if this is what
you mean.
By the way, do you happen to have a chance to try the patch here in
you tests? It fixed the issue for my test scenario. According to
your stacktraces, it should be the same issue happening in my test.
Somewhere adds a entry to gc list by faultily believing that
the entry is still on a tree. And, the patch here fixes this
faultily believe.
>
> In short there are a number of places in the RA path that need the
> lifetime handling cleaned up first to make it all consistent with the
> idea of using a linked list to track entries with an expires tag.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree.
2023-12-15 19:12 ` Kui-Feng Lee
2023-12-16 18:36 ` David Ahern
@ 2023-12-18 1:16 ` Kui-Feng Lee
1 sibling, 0 replies; 10+ messages in thread
From: Kui-Feng Lee @ 2023-12-18 1:16 UTC (permalink / raw)
To: David Ahern, thinker.li, netdev, martin.lau, kernel-team, davem,
kuba, pabeni, edumazet
Cc: kuifeng, syzbot+c15aa445274af8674f41
On 12/15/23 11:12, Kui-Feng Lee wrote:
>
>
> On 12/13/23 22:11, David Ahern wrote:
>> On 12/13/23 2:37 PM, thinker.li@gmail.com wrote:
>>> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
>>> index b132feae3393..dcaeb88d73aa 100644
>>> --- a/net/ipv6/route.c
>>> +++ b/net/ipv6/route.c
>>> @@ -3763,10 +3763,10 @@ static struct fib6_info
>>> *ip6_route_info_create(struct fib6_config *cfg,
>>> rt->dst_nocount = true;
>>> if (cfg->fc_flags & RTF_EXPIRES)
>>> - fib6_set_expires_locked(rt, jiffies +
>>> - clock_t_to_jiffies(cfg->fc_expires));
>>> + __fib6_set_expires(rt, jiffies +
>>> + clock_t_to_jiffies(cfg->fc_expires));
>>> else
>>> - fib6_clean_expires_locked(rt);
>>> + __fib6_clean_expires(rt);
>>
>> as Eric noted in a past comment, the clean is not needed in this
>> function since memory is initialized to 0 (expires is never set).
>>
>> Also, this patch set does not fundamentally change the logic, so it
>> cannot fix the bug reported in
>>
>> https://lore.kernel.org/all/20231205173250.2982846-1-edumazet@google.com/
>>
>> please hold off future versions of this set until the problem in that
>> stack traced is fixed. I have tried a few things using RA's, but have
>> not been able to recreate UAF.
>
> I tried to reproduce the issue yesterday, according to the hypothesis
> behind the patch of this thread. The following is the instructions
> to reproduce the UAF issue. However, this instruction doesn't create
> a crash at the place since the memory is still available even it has
> been free. But, the log shows a UAF.
>
> The patch at the end of this message is required to reproduce and
> show UAF. The most critical change in the patch is to insert
> a 'mdelay(3000)' to sleep 3s in rt6_route_rcv(). That gives us
> a chance to manipulate the kernel to reproduce the UAF.
>
> Here is my conclusion. There is no protection between finding
> a route and changing the route in rt6_route_rcv(), including inserting
> the entry to the gc list. It is possible to insert an entry that will be
> free later to the gc list, causing a UAF. There is more explanations
> along with the following logs.
>
> Instructions:
> - Preparation
> - install ipv6toolkit on the host.
> - run qemu with a patched kernel as a guest through
> with the host through qemubr0 a bridge.
> - On the guest
> - stop systemd-networkd.service & systemd-networkd.socket if
> there are.
> - sysctl -w net.ipv6.conf.enp0s3.accept_ra=2
> - sysctl -w net.ipv6.conf.enp0s3.accept_ra_rt_info_max_plen=127
> - Assume the address of qemubr0 in the host is
> fe80::4ce9:92ff:fe27:75df.
> - Test
> - ra6 -i qemubr0 -d ff02::1 -R 'fe82::/64#1#300' -t 300 # On the
> host
> - sleep 2; ip -6 route del fe82::/64 # On the
> guest
> - ra6 -i qemubr0 -d ff02::1 -R 'fe82::/64#1#300' -t 300 # On the
> host
> - ra6 -i qemubr0 -d ff02::1 -R 'fe81::/64#1#300' -t 0 # On the
> host
> - The step 3 should start immediately after step 2 since we have
> a gap of merely 3 seconds in the kernel.
>
I did the same test with the fix patch along with this thread.
The following is the log I got.
# This is the log printed by the step 1.
# ffff888108066600 was add to the gc list.
[ 4.596875] __ip6_ins_rt fe80::5054:ff:fe12:3456/128
[ 38.925482] rt = 0000000000000000
[ 38.925916] fib6_info_alloc: ffff888108066e00
[ 38.926441] __ip6_ins_rt ::/0
[ 38.926441] fib6_add: add ffff888108066e00 to gc list
ffff888108066e38 pprev ffff888102a02800 next 0000000000000000
[ 38.927084] rt = ffff888108066e00
[ 38.927084] rt6_route_rcv
[ 38.927084] fib6_info_alloc: ffff888108066600
[ 38.929333] __ip6_ins_rt fe82::/64
[ 38.929333] rt6_route_rcv: route info ffff888108066600, sleep 3s
[ 40.948831] fib6_set_expires_locked: add ffff888108066600 to gc list
ffff888108066638 pprev ffff888102a02800 next ffff888108066e38
[ 41.932501] rt6_route_rcv: route info ffff888108066600, after release
# This is the log printed by the step 2 & 3.
# The entry (ffff888108066600) removed from the gc list by
# fib6_clean_expires_locked was not added back to the gc list again.
[ 68.173441] rt = ffff888108066e00
[ 68.173883] rt6_route_rcv
[ 68.174245] rt6_route_rcv: route info ffff888108066600, sleep 3s
[ 69.389112] fib6_clean_expires_locked: del ffff888108066600 from gc
list pprev ffff888102a02800 next ffff888108066e38
[ 70.900831] rt6_route_rcv: route info ffff888108066600, after release
[ 71.182822] fib6_info_destroy_rcu ffff888108066600
# This is the log printed by the step 4.
[ 109.017446] rt = ffff888108066e00
[ 109.017893] __ip6_del_rt ::/0
[ 109.018288] fib6_clean_expires_locked: del ffff888108066e00 from gc
list pprev ffff888102a02800 next 0000000000000000
[ 109.019471] rt6_route_rcv
[ 109.019471] fib6_info_alloc: ffff888108066600
[ 109.019471] __ip6_ins_rt fe81::/64
[ 109.019471] rt6_route_rcv: route info ffff888108066600, sleep 3s
[ 111.028831] fib6_set_expires_locked: add ffff888108066600 to gc list
ffff888108066638 pprev ffff888102a02800 next 0000000000000000
[ 112.023607] rt6_route_rcv: route info ffff888108066600, after release
[ 112.031825] fib6_info_destroy_rcu ffff888108066e00
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2023-12-18 1:16 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-13 21:37 [PATCH net-next v3 0/2] Fix dangling pointer at f6i->gc_link thinker.li
2023-12-13 21:37 ` [PATCH net-next v3 1/2] net/ipv6: insert a f6i to a GC list only if the f6i is in a fib6_table tree thinker.li
2023-12-14 6:11 ` David Ahern
2023-12-14 23:43 ` Kui-Feng Lee
2023-12-15 19:12 ` Kui-Feng Lee
2023-12-16 18:36 ` David Ahern
2023-12-18 1:05 ` Kui-Feng Lee
2023-12-18 1:16 ` Kui-Feng Lee
2023-12-13 21:37 ` [PATCH net-next v3 2/2] selftests: fib_tests: Add tests for toggling between w/ and w/o expires thinker.li
2023-12-14 3:32 ` Hangbin Liu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).