[PATCH] net: sched: use-after-free in tcf_action

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] net: sched: use-after-free in tcf_action_destroy
@ 2024-08-16  1:53 yangzhuorao
  2024-08-16  4:06 ` Jamal Hadi Salim
  2024-08-16  7:20 ` Jiri Pirko
  0 siblings, 2 replies; 11+ messages in thread
From: yangzhuorao @ 2024-08-16  1:53 UTC (permalink / raw)
  To: netdev
  Cc: linux-kernel, jhs, xiyou.wangcong, jiri, davem, security,
	xkaneiki, hackerzheng666, yangzhuorao

There is a uaf bug in net/sched/act_api.c.
When Thread1 call [1] tcf_action_init_1 to alloc act which saves
in actions array. If allocation failed, it will go to err path.
Meanwhile thread2 call tcf_del_walker to delete action in idr.
So thread 1 in err path [3] tcf_action_destroy will cause
use-after-free read bug.

Thread1                            Thread2
 tcf_action_init
  for(i;i<TCA_ACT_MAX_PRIO;i++)
   act=tcf_action_init_1 //[1]
   if(IS_ERR(act))
    goto err
   actions[i] = act
                                   tcf_del_walker
                                    idr_for_each_entry_ul(idr,p,id)
                                     __tcf_idr_release(p,false,true)
                                      free_tcf(p) //[2]
  err:
   tcf_action_destroy
    a=actions[i]
    ops = a->ops //[3]

We add lock and unlock in tcf_action_init and tcf_del_walker function
to aviod race condition.

==================================================================
BUG: KASAN: use-after-free in tcf_action_destroy+0x138/0x150
Read of size 8 at addr ffff88806543e100 by task syz-executor156/295

CPU: 0 PID: 295 Comm: syz-executor156 Not tainted 4.19.311 #2
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xcd/0x110 lib/dump_stack.c:118
 print_address_description+0x60/0x224 mm/kasan/report.c:255
 kasan_report_error mm/kasan/report.c:353 [inline]
 kasan_report mm/kasan/report.c:411 [inline]
 kasan_report.cold+0x9e/0x1c6 mm/kasan/report.c:395
 tcf_action_destroy+0x138/0x150 net/sched/act_api.c:664
 tcf_action_init+0x252/0x330 net/sched/act_api.c:961
 tcf_action_add+0xdb/0x370 net/sched/act_api.c:1326
 tc_ctl_action+0x327/0x410 net/sched/act_api.c:1381
 rtnetlink_rcv_msg+0x79e/0xa40 net/core/rtnetlink.c:4793
 netlink_rcv_skb+0x156/0x420 net/netlink/af_netlink.c:2459
 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
 netlink_unicast+0x4d6/0x690 net/netlink/af_netlink.c:1357
 netlink_sendmsg+0x6ce/0xce0 net/netlink/af_netlink.c:1907
 sock_sendmsg_nosec net/socket.c:652 [inline]
 __sock_sendmsg+0x126/0x160 net/socket.c:663
 ___sys_sendmsg+0x7f2/0x920 net/socket.c:2258
 __sys_sendmsg+0xec/0x1b0 net/socket.c:2296
 do_syscall_64+0xbd/0x360 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x5c/0xc1
RIP: 0033:0x7fc19796b10d
RSP: 002b:00007fc197910d78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fc1979fe2e0 RCX: 00007fc19796b10d
RDX: 0000000000000000 RSI: 0000000020000480 RDI: 0000000000000004
RBP: 00007fc1979fe2e8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000246 R12: 00007fc1979fe2ec
R13: 00007fc1979fc010 R14: 5c56ebd45a42de31 R15: 00007fc1979cb008

Allocated by task 295:
 __kmalloc+0x89/0x1d0 mm/slub.c:3808
 kmalloc include/linux/slab.h:520 [inline]
 kzalloc include/linux/slab.h:709 [inline]
 tcf_idr_create+0x59/0x5e0 net/sched/act_api.c:361
 tcf_nat_init+0x4b7/0x850 net/sched/act_nat.c:63
 tcf_action_init_1+0x981/0xc90 net/sched/act_api.c:879
 tcf_action_init+0x216/0x330 net/sched/act_api.c:945
 tcf_action_add+0xdb/0x370 net/sched/act_api.c:1326
 tc_ctl_action+0x327/0x410 net/sched/act_api.c:1381
 rtnetlink_rcv_msg+0x79e/0xa40 net/core/rtnetlink.c:4793
 netlink_rcv_skb+0x156/0x420 net/netlink/af_netlink.c:2459
 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
 netlink_unicast+0x4d6/0x690 net/netlink/af_netlink.c:1357
 netlink_sendmsg+0x6ce/0xce0 net/netlink/af_netlink.c:1907
 sock_sendmsg_nosec net/socket.c:652 [inline]
 __sock_sendmsg+0x126/0x160 net/socket.c:663
 ___sys_sendmsg+0x7f2/0x920 net/socket.c:2258
 __sys_sendmsg+0xec/0x1b0 net/socket.c:2296
 do_syscall_64+0xbd/0x360 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x5c/0xc1

Freed by task 275:
 slab_free_hook mm/slub.c:1391 [inline]
 slab_free_freelist_hook mm/slub.c:1419 [inline]
 slab_free mm/slub.c:2998 [inline]
 kfree+0x8b/0x1a0 mm/slub.c:3963
 __tcf_action_put+0x114/0x160 net/sched/act_api.c:112
 __tcf_idr_release net/sched/act_api.c:142 [inline]
 __tcf_idr_release+0x52/0xe0 net/sched/act_api.c:122
 tcf_del_walker net/sched/act_api.c:266 [inline]
 tcf_generic_walker+0x66a/0x9c0 net/sched/act_api.c:292
 tca_action_flush net/sched/act_api.c:1154 [inline]
 tca_action_gd+0x8b6/0x15b0 net/sched/act_api.c:1260
 tc_ctl_action+0x26d/0x410 net/sched/act_api.c:1389
 rtnetlink_rcv_msg+0x79e/0xa40 net/core/rtnetlink.c:4793
 netlink_rcv_skb+0x156/0x420 net/netlink/af_netlink.c:2459
 netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
 netlink_unicast+0x4d6/0x690 net/netlink/af_netlink.c:1357
 netlink_sendmsg+0x6ce/0xce0 net/netlink/af_netlink.c:1907
 sock_sendmsg_nosec net/socket.c:652 [inline]
 __sock_sendmsg+0x126/0x160 net/socket.c:663
 ___sys_sendmsg+0x7f2/0x920 net/socket.c:2258
 __sys_sendmsg+0xec/0x1b0 net/socket.c:2296
 do_syscall_64+0xbd/0x360 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x5c/0xc1

The buggy address belongs to the object at ffff88806543e100
 which belongs to the cache kmalloc-192 of size 192
The buggy address is located 0 bytes inside of
 192-byte region [ffff88806543e100, ffff88806543e1c0)
The buggy address belongs to the page:
flags: 0x100000000000100(slab)
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88806543e000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88806543e080: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff88806543e100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff88806543e180: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
 ffff88806543e200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Signed-off-by: yangzhuorao <alex000young@gmail.com>
---
 net/sched/act_api.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index ad0773b20d83..d29ea69ba312 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -261,7 +261,7 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
 		goto nla_put_failure;
 	if (nla_put_string(skb, TCA_KIND, ops->kind))
 		goto nla_put_failure;
-
+	rcu_read_lock();
 	idr_for_each_entry_ul(idr, p, id) {
 		ret = __tcf_idr_release(p, false, true);
 		if (ret == ACT_P_DELETED) {
@@ -271,12 +271,14 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
 			goto nla_put_failure;
 		}
 	}
+	rcu_read_unlock();
 	if (nla_put_u32(skb, TCA_FCNT, n_i))
 		goto nla_put_failure;
 	nla_nest_end(skb, nest);
 
 	return n_i;
 nla_put_failure:
+	rcu_read_unlock();
 	nla_nest_cancel(skb, nest);
 	return ret;
 }
@@ -940,7 +942,7 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 	err = nla_parse_nested(tb, TCA_ACT_MAX_PRIO, nla, NULL, extack);
 	if (err < 0)
 		return err;
-
+	rcu_read_lock();
 	for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
 		act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind,
 					rtnl_held, extack);
@@ -953,11 +955,12 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
 		/* Start from index 0 */
 		actions[i - 1] = act;
 	}
-
+	rcu_read_unlock();
 	*attr_size = tcf_action_full_attrs_size(sz);
 	return i - 1;
 
 err:
+	rcu_read_lock();
 	tcf_action_destroy(actions, bind);
 	return err;
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-16  1:53 [PATCH] net: sched: use-after-free in tcf_action_destroy yangzhuorao
@ 2024-08-16  4:06 ` Jamal Hadi Salim
  2024-08-16  5:03   ` Willy Tarreau
  2024-08-16 15:04   ` Jamal Hadi Salim
  2024-08-16  7:20 ` Jiri Pirko
  1 sibling, 2 replies; 11+ messages in thread
From: Jamal Hadi Salim @ 2024-08-16  4:06 UTC (permalink / raw)
  To: yangzhuorao
  Cc: netdev, linux-kernel, xiyou.wangcong, jiri, davem, security,
	xkaneiki, hackerzheng666

On Thu, Aug 15, 2024 at 9:54 PM yangzhuorao <alex000young@gmail.com> wrote:
>
> There is a uaf bug in net/sched/act_api.c.
> When Thread1 call [1] tcf_action_init_1 to alloc act which saves
> in actions array. If allocation failed, it will go to err path.
> Meanwhile thread2 call tcf_del_walker to delete action in idr.
> So thread 1 in err path [3] tcf_action_destroy will cause
> use-after-free read bug.
>
> Thread1                            Thread2
>  tcf_action_init
>   for(i;i<TCA_ACT_MAX_PRIO;i++)
>    act=tcf_action_init_1 //[1]
>    if(IS_ERR(act))
>     goto err
>    actions[i] = act
>                                    tcf_del_walker
>                                     idr_for_each_entry_ul(idr,p,id)
>                                      __tcf_idr_release(p,false,true)
>                                       free_tcf(p) //[2]
>   err:
>    tcf_action_destroy
>     a=actions[i]
>     ops = a->ops //[3]
>
> We add lock and unlock in tcf_action_init and tcf_del_walker function
> to aviod race condition.
>
> ==================================================================
> BUG: KASAN: use-after-free in tcf_action_destroy+0x138/0x150
> Read of size 8 at addr ffff88806543e100 by task syz-executor156/295
>

Since this is syzkaller induced, do you have a repro?
Also what kernel (trying to see if it was before/after Eric's netlink changes).

cheers,
jamal

> CPU: 0 PID: 295 Comm: syz-executor156 Not tainted 4.19.311 #2
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0xcd/0x110 lib/dump_stack.c:118
>  print_address_description+0x60/0x224 mm/kasan/report.c:255
>  kasan_report_error mm/kasan/report.c:353 [inline]
>  kasan_report mm/kasan/report.c:411 [inline]
>  kasan_report.cold+0x9e/0x1c6 mm/kasan/report.c:395
>  tcf_action_destroy+0x138/0x150 net/sched/act_api.c:664
>  tcf_action_init+0x252/0x330 net/sched/act_api.c:961
>  tcf_action_add+0xdb/0x370 net/sched/act_api.c:1326
>  tc_ctl_action+0x327/0x410 net/sched/act_api.c:1381
>  rtnetlink_rcv_msg+0x79e/0xa40 net/core/rtnetlink.c:4793
>  netlink_rcv_skb+0x156/0x420 net/netlink/af_netlink.c:2459
>  netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
>  netlink_unicast+0x4d6/0x690 net/netlink/af_netlink.c:1357
>  netlink_sendmsg+0x6ce/0xce0 net/netlink/af_netlink.c:1907
>  sock_sendmsg_nosec net/socket.c:652 [inline]
>  __sock_sendmsg+0x126/0x160 net/socket.c:663
>  ___sys_sendmsg+0x7f2/0x920 net/socket.c:2258
>  __sys_sendmsg+0xec/0x1b0 net/socket.c:2296
>  do_syscall_64+0xbd/0x360 arch/x86/entry/common.c:293
>  entry_SYSCALL_64_after_hwframe+0x5c/0xc1
> RIP: 0033:0x7fc19796b10d
> RSP: 002b:00007fc197910d78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00007fc1979fe2e0 RCX: 00007fc19796b10d
> RDX: 0000000000000000 RSI: 0000000020000480 RDI: 0000000000000004
> RBP: 00007fc1979fe2e8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000002 R11: 0000000000000246 R12: 00007fc1979fe2ec
> R13: 00007fc1979fc010 R14: 5c56ebd45a42de31 R15: 00007fc1979cb008
>
> Allocated by task 295:
>  __kmalloc+0x89/0x1d0 mm/slub.c:3808
>  kmalloc include/linux/slab.h:520 [inline]
>  kzalloc include/linux/slab.h:709 [inline]
>  tcf_idr_create+0x59/0x5e0 net/sched/act_api.c:361
>  tcf_nat_init+0x4b7/0x850 net/sched/act_nat.c:63
>  tcf_action_init_1+0x981/0xc90 net/sched/act_api.c:879
>  tcf_action_init+0x216/0x330 net/sched/act_api.c:945
>  tcf_action_add+0xdb/0x370 net/sched/act_api.c:1326
>  tc_ctl_action+0x327/0x410 net/sched/act_api.c:1381
>  rtnetlink_rcv_msg+0x79e/0xa40 net/core/rtnetlink.c:4793
>  netlink_rcv_skb+0x156/0x420 net/netlink/af_netlink.c:2459
>  netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
>  netlink_unicast+0x4d6/0x690 net/netlink/af_netlink.c:1357
>  netlink_sendmsg+0x6ce/0xce0 net/netlink/af_netlink.c:1907
>  sock_sendmsg_nosec net/socket.c:652 [inline]
>  __sock_sendmsg+0x126/0x160 net/socket.c:663
>  ___sys_sendmsg+0x7f2/0x920 net/socket.c:2258
>  __sys_sendmsg+0xec/0x1b0 net/socket.c:2296
>  do_syscall_64+0xbd/0x360 arch/x86/entry/common.c:293
>  entry_SYSCALL_64_after_hwframe+0x5c/0xc1
>
> Freed by task 275:
>  slab_free_hook mm/slub.c:1391 [inline]
>  slab_free_freelist_hook mm/slub.c:1419 [inline]
>  slab_free mm/slub.c:2998 [inline]
>  kfree+0x8b/0x1a0 mm/slub.c:3963
>  __tcf_action_put+0x114/0x160 net/sched/act_api.c:112
>  __tcf_idr_release net/sched/act_api.c:142 [inline]
>  __tcf_idr_release+0x52/0xe0 net/sched/act_api.c:122
>  tcf_del_walker net/sched/act_api.c:266 [inline]
>  tcf_generic_walker+0x66a/0x9c0 net/sched/act_api.c:292
>  tca_action_flush net/sched/act_api.c:1154 [inline]
>  tca_action_gd+0x8b6/0x15b0 net/sched/act_api.c:1260
>  tc_ctl_action+0x26d/0x410 net/sched/act_api.c:1389
>  rtnetlink_rcv_msg+0x79e/0xa40 net/core/rtnetlink.c:4793
>  netlink_rcv_skb+0x156/0x420 net/netlink/af_netlink.c:2459
>  netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
>  netlink_unicast+0x4d6/0x690 net/netlink/af_netlink.c:1357
>  netlink_sendmsg+0x6ce/0xce0 net/netlink/af_netlink.c:1907
>  sock_sendmsg_nosec net/socket.c:652 [inline]
>  __sock_sendmsg+0x126/0x160 net/socket.c:663
>  ___sys_sendmsg+0x7f2/0x920 net/socket.c:2258
>  __sys_sendmsg+0xec/0x1b0 net/socket.c:2296
>  do_syscall_64+0xbd/0x360 arch/x86/entry/common.c:293
>  entry_SYSCALL_64_after_hwframe+0x5c/0xc1
>
> The buggy address belongs to the object at ffff88806543e100
>  which belongs to the cache kmalloc-192 of size 192
> The buggy address is located 0 bytes inside of
>  192-byte region [ffff88806543e100, ffff88806543e1c0)
> The buggy address belongs to the page:
> flags: 0x100000000000100(slab)
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffff88806543e000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff88806543e080: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> >ffff88806543e100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                    ^
>  ffff88806543e180: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>  ffff88806543e200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>
> Signed-off-by: yangzhuorao <alex000young@gmail.com>
> ---
>  net/sched/act_api.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index ad0773b20d83..d29ea69ba312 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -261,7 +261,7 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
>                 goto nla_put_failure;
>         if (nla_put_string(skb, TCA_KIND, ops->kind))
>                 goto nla_put_failure;
> -
> +       rcu_read_lock();
>         idr_for_each_entry_ul(idr, p, id) {
>                 ret = __tcf_idr_release(p, false, true);
>                 if (ret == ACT_P_DELETED) {
> @@ -271,12 +271,14 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
>                         goto nla_put_failure;
>                 }
>         }
> +       rcu_read_unlock();
>         if (nla_put_u32(skb, TCA_FCNT, n_i))
>                 goto nla_put_failure;
>         nla_nest_end(skb, nest);
>
>         return n_i;
>  nla_put_failure:
> +       rcu_read_unlock();
>         nla_nest_cancel(skb, nest);
>         return ret;
>  }
> @@ -940,7 +942,7 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
>         err = nla_parse_nested(tb, TCA_ACT_MAX_PRIO, nla, NULL, extack);
>         if (err < 0)
>                 return err;
> -
> +       rcu_read_lock();
>         for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
>                 act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind,
>                                         rtnl_held, extack);
> @@ -953,11 +955,12 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
>                 /* Start from index 0 */
>                 actions[i - 1] = act;
>         }
> -
> +       rcu_read_unlock();
>         *attr_size = tcf_action_full_attrs_size(sz);
>         return i - 1;
>
>  err:
> +       rcu_read_lock();
>         tcf_action_destroy(actions, bind);
>         return err;
>  }
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-16  4:06 ` Jamal Hadi Salim
@ 2024-08-16  5:03   ` Willy Tarreau
  2024-08-16 15:04   ` Jamal Hadi Salim
  1 sibling, 0 replies; 11+ messages in thread
From: Willy Tarreau @ 2024-08-16  5:03 UTC (permalink / raw)
  To: yangzhuorao
  Cc: Jamal Hadi Salim, netdev, linux-kernel, xiyou.wangcong, jiri,
	davem, xkaneiki, hackerzheng666

On Fri, Aug 16, 2024 at 12:06:25AM -0400, Jamal Hadi Salim wrote:
> On Thu, Aug 15, 2024 at 9:54 PM yangzhuorao <alex000young@gmail.com> wrote:
> >
> > There is a uaf bug in net/sched/act_api.c.
> > When Thread1 call [1] tcf_action_init_1 to alloc act which saves
> > in actions array. If allocation failed, it will go to err path.
> > Meanwhile thread2 call tcf_del_walker to delete action in idr.
> > So thread 1 in err path [3] tcf_action_destroy will cause
> > use-after-free read bug.
> >
> > Thread1                            Thread2
> >  tcf_action_init
> >   for(i;i<TCA_ACT_MAX_PRIO;i++)
> >    act=tcf_action_init_1 //[1]
> >    if(IS_ERR(act))
> >     goto err
> >    actions[i] = act
> >                                    tcf_del_walker
> >                                     idr_for_each_entry_ul(idr,p,id)
> >                                      __tcf_idr_release(p,false,true)
> >                                       free_tcf(p) //[2]
> >   err:
> >    tcf_action_destroy
> >     a=actions[i]
> >     ops = a->ops //[3]
> >
> > We add lock and unlock in tcf_action_init and tcf_del_walker function
> > to aviod race condition.
> >
> > ==================================================================
> > BUG: KASAN: use-after-free in tcf_action_destroy+0x138/0x150
> > Read of size 8 at addr ffff88806543e100 by task syz-executor156/295
> >
> 
> Since this is syzkaller induced, do you have a repro?
> Also what kernel (trying to see if it was before/after Eric's netlink changes).
> 
> cheers,
> jamal

In addition, please note that since this is public, there's no point in
CCing security@k.o (which I've dropped from the CC list) since there's
no coordination needed anymore.

Thanks!
Willy

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-16  1:53 [PATCH] net: sched: use-after-free in tcf_action_destroy yangzhuorao
  2024-08-16  4:06 ` Jamal Hadi Salim
@ 2024-08-16  7:20 ` Jiri Pirko
  1 sibling, 0 replies; 11+ messages in thread
From: Jiri Pirko @ 2024-08-16  7:20 UTC (permalink / raw)
  To: yangzhuorao
  Cc: netdev, linux-kernel, jhs, xiyou.wangcong, davem, security,
	xkaneiki, hackerzheng666

Fri, Aug 16, 2024 at 03:53:55AM CEST, alex000young@gmail.com wrote:
>There is a uaf bug in net/sched/act_api.c.
>When Thread1 call [1] tcf_action_init_1 to alloc act which saves
>in actions array. If allocation failed, it will go to err path.
>Meanwhile thread2 call tcf_del_walker to delete action in idr.
>So thread 1 in err path [3] tcf_action_destroy will cause
>use-after-free read bug.
>
>Thread1                            Thread2
> tcf_action_init
>  for(i;i<TCA_ACT_MAX_PRIO;i++)
>   act=tcf_action_init_1 //[1]
>   if(IS_ERR(act))
>    goto err
>   actions[i] = act
>                                   tcf_del_walker
>                                    idr_for_each_entry_ul(idr,p,id)
>                                     __tcf_idr_release(p,false,true)
>                                      free_tcf(p) //[2]
>  err:
>   tcf_action_destroy
>    a=actions[i]
>    ops = a->ops //[3]
>
>We add lock and unlock in tcf_action_init and tcf_del_walker function

Who's "we"? Be imperative, tell the codebase what to do in order to fix
this bug.


>to aviod race condition.
>
>==================================================================
>BUG: KASAN: use-after-free in tcf_action_destroy+0x138/0x150
>Read of size 8 at addr ffff88806543e100 by task syz-executor156/295
>
>CPU: 0 PID: 295 Comm: syz-executor156 Not tainted 4.19.311 #2
>Call Trace:
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0xcd/0x110 lib/dump_stack.c:118
> print_address_description+0x60/0x224 mm/kasan/report.c:255
> kasan_report_error mm/kasan/report.c:353 [inline]
> kasan_report mm/kasan/report.c:411 [inline]
> kasan_report.cold+0x9e/0x1c6 mm/kasan/report.c:395
> tcf_action_destroy+0x138/0x150 net/sched/act_api.c:664
> tcf_action_init+0x252/0x330 net/sched/act_api.c:961
> tcf_action_add+0xdb/0x370 net/sched/act_api.c:1326
> tc_ctl_action+0x327/0x410 net/sched/act_api.c:1381
> rtnetlink_rcv_msg+0x79e/0xa40 net/core/rtnetlink.c:4793
> netlink_rcv_skb+0x156/0x420 net/netlink/af_netlink.c:2459
> netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
> netlink_unicast+0x4d6/0x690 net/netlink/af_netlink.c:1357
> netlink_sendmsg+0x6ce/0xce0 net/netlink/af_netlink.c:1907
> sock_sendmsg_nosec net/socket.c:652 [inline]
> __sock_sendmsg+0x126/0x160 net/socket.c:663
> ___sys_sendmsg+0x7f2/0x920 net/socket.c:2258
> __sys_sendmsg+0xec/0x1b0 net/socket.c:2296
> do_syscall_64+0xbd/0x360 arch/x86/entry/common.c:293
> entry_SYSCALL_64_after_hwframe+0x5c/0xc1
>RIP: 0033:0x7fc19796b10d
>RSP: 002b:00007fc197910d78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>RAX: ffffffffffffffda RBX: 00007fc1979fe2e0 RCX: 00007fc19796b10d
>RDX: 0000000000000000 RSI: 0000000020000480 RDI: 0000000000000004
>RBP: 00007fc1979fe2e8 R08: 0000000000000000 R09: 0000000000000000
>R10: 0000000000000002 R11: 0000000000000246 R12: 00007fc1979fe2ec
>R13: 00007fc1979fc010 R14: 5c56ebd45a42de31 R15: 00007fc1979cb008
>
>Allocated by task 295:
> __kmalloc+0x89/0x1d0 mm/slub.c:3808
> kmalloc include/linux/slab.h:520 [inline]
> kzalloc include/linux/slab.h:709 [inline]
> tcf_idr_create+0x59/0x5e0 net/sched/act_api.c:361
> tcf_nat_init+0x4b7/0x850 net/sched/act_nat.c:63
> tcf_action_init_1+0x981/0xc90 net/sched/act_api.c:879
> tcf_action_init+0x216/0x330 net/sched/act_api.c:945
> tcf_action_add+0xdb/0x370 net/sched/act_api.c:1326
> tc_ctl_action+0x327/0x410 net/sched/act_api.c:1381
> rtnetlink_rcv_msg+0x79e/0xa40 net/core/rtnetlink.c:4793
> netlink_rcv_skb+0x156/0x420 net/netlink/af_netlink.c:2459
> netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
> netlink_unicast+0x4d6/0x690 net/netlink/af_netlink.c:1357
> netlink_sendmsg+0x6ce/0xce0 net/netlink/af_netlink.c:1907
> sock_sendmsg_nosec net/socket.c:652 [inline]
> __sock_sendmsg+0x126/0x160 net/socket.c:663
> ___sys_sendmsg+0x7f2/0x920 net/socket.c:2258
> __sys_sendmsg+0xec/0x1b0 net/socket.c:2296
> do_syscall_64+0xbd/0x360 arch/x86/entry/common.c:293
> entry_SYSCALL_64_after_hwframe+0x5c/0xc1
>
>Freed by task 275:
> slab_free_hook mm/slub.c:1391 [inline]
> slab_free_freelist_hook mm/slub.c:1419 [inline]
> slab_free mm/slub.c:2998 [inline]
> kfree+0x8b/0x1a0 mm/slub.c:3963
> __tcf_action_put+0x114/0x160 net/sched/act_api.c:112
> __tcf_idr_release net/sched/act_api.c:142 [inline]
> __tcf_idr_release+0x52/0xe0 net/sched/act_api.c:122
> tcf_del_walker net/sched/act_api.c:266 [inline]
> tcf_generic_walker+0x66a/0x9c0 net/sched/act_api.c:292
> tca_action_flush net/sched/act_api.c:1154 [inline]
> tca_action_gd+0x8b6/0x15b0 net/sched/act_api.c:1260
> tc_ctl_action+0x26d/0x410 net/sched/act_api.c:1389
> rtnetlink_rcv_msg+0x79e/0xa40 net/core/rtnetlink.c:4793
> netlink_rcv_skb+0x156/0x420 net/netlink/af_netlink.c:2459
> netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
> netlink_unicast+0x4d6/0x690 net/netlink/af_netlink.c:1357
> netlink_sendmsg+0x6ce/0xce0 net/netlink/af_netlink.c:1907
> sock_sendmsg_nosec net/socket.c:652 [inline]
> __sock_sendmsg+0x126/0x160 net/socket.c:663
> ___sys_sendmsg+0x7f2/0x920 net/socket.c:2258
> __sys_sendmsg+0xec/0x1b0 net/socket.c:2296
> do_syscall_64+0xbd/0x360 arch/x86/entry/common.c:293
> entry_SYSCALL_64_after_hwframe+0x5c/0xc1
>
>The buggy address belongs to the object at ffff88806543e100
> which belongs to the cache kmalloc-192 of size 192
>The buggy address is located 0 bytes inside of
> 192-byte region [ffff88806543e100, ffff88806543e1c0)
>The buggy address belongs to the page:
>flags: 0x100000000000100(slab)
>page dumped because: kasan: bad access detected
>
>Memory state around the buggy address:
> ffff88806543e000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88806543e080: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>>ffff88806543e100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                   ^
> ffff88806543e180: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> ffff88806543e200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>

You are missing tags. "Fixes" at least.


>Signed-off-by: yangzhuorao <alex000young@gmail.com>

Usually, name starts with capital letter and most often it is multiple
words, yours is different?


>---
> net/sched/act_api.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
>diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>index ad0773b20d83..d29ea69ba312 100644
>--- a/net/sched/act_api.c
>+++ b/net/sched/act_api.c
>@@ -261,7 +261,7 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
> 		goto nla_put_failure;
> 	if (nla_put_string(skb, TCA_KIND, ops->kind))
> 		goto nla_put_failure;
>-
>+	rcu_read_lock();
> 	idr_for_each_entry_ul(idr, p, id) {
> 		ret = __tcf_idr_release(p, false, true);
> 		if (ret == ACT_P_DELETED) {
>@@ -271,12 +271,14 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb,
> 			goto nla_put_failure;
> 		}
> 	}
>+	rcu_read_unlock();
> 	if (nla_put_u32(skb, TCA_FCNT, n_i))
> 		goto nla_put_failure;
> 	nla_nest_end(skb, nest);
> 
> 	return n_i;
> nla_put_failure:
>+	rcu_read_unlock();
> 	nla_nest_cancel(skb, nest);
> 	return ret;
> }
>@@ -940,7 +942,7 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
> 	err = nla_parse_nested(tb, TCA_ACT_MAX_PRIO, nla, NULL, extack);
> 	if (err < 0)
> 		return err;
>-
>+	rcu_read_lock();
> 	for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) {
> 		act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind,
> 					rtnl_held, extack);
>@@ -953,11 +955,12 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla,
> 		/* Start from index 0 */
> 		actions[i - 1] = act;
> 	}
>-
>+	rcu_read_unlock();


Can you please describe in details, how exactly you fix this issue. I'm
asking because the rcu_read_lock section here looks to me very
suspicious.



> 	*attr_size = tcf_action_full_attrs_size(sz);
> 	return i - 1;
> 
> err:
>+	rcu_read_lock();
> 	tcf_action_destroy(actions, bind);
> 	return err;
> }
>-- 
>2.25.1
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-16  4:06 ` Jamal Hadi Salim
  2024-08-16  5:03   ` Willy Tarreau
@ 2024-08-16 15:04   ` Jamal Hadi Salim
  2024-08-17  9:27     ` Alex Young
  1 sibling, 1 reply; 11+ messages in thread
From: Jamal Hadi Salim @ 2024-08-16 15:04 UTC (permalink / raw)
  To: yangzhuorao
  Cc: netdev, linux-kernel, xiyou.wangcong, jiri, davem, security,
	xkaneiki, hackerzheng666

On Fri, Aug 16, 2024 at 12:06 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On Thu, Aug 15, 2024 at 9:54 PM yangzhuorao <alex000young@gmail.com> wrote:
> >
> > There is a uaf bug in net/sched/act_api.c.
> > When Thread1 call [1] tcf_action_init_1 to alloc act which saves
> > in actions array. If allocation failed, it will go to err path.
> > Meanwhile thread2 call tcf_del_walker to delete action in idr.
> > So thread 1 in err path [3] tcf_action_destroy will cause
> > use-after-free read bug.
> >
> > Thread1                            Thread2
> >  tcf_action_init
> >   for(i;i<TCA_ACT_MAX_PRIO;i++)
> >    act=tcf_action_init_1 //[1]
> >    if(IS_ERR(act))
> >     goto err
> >    actions[i] = act
> >                                    tcf_del_walker
> >                                     idr_for_each_entry_ul(idr,p,id)
> >                                      __tcf_idr_release(p,false,true)
> >                                       free_tcf(p) //[2]
> >   err:
> >    tcf_action_destroy
> >     a=actions[i]
> >     ops = a->ops //[3]
> >
> > We add lock and unlock in tcf_action_init and tcf_del_walker function
> > to aviod race condition.
> >

Hi Alex,

Thanks for your valiant effort, unfortunately there's nothing to fix
here for the current kernels.
For your edification:

This may have been an issue on the 4.x kernels you ran but i dont see
a valid codepath that would create the kernel parallelism scenario you
mentioned above (currently or ever actually). Kernel entry is
syncronized from user space via the rtnetlink lock - meaning you cant
have two control plane threads (as you show in your nice diagram above
in your commit) entering from user space in parallel to trigger the
bug.

I believe the reason it happens in 4.x is there is likely a bug(hand
waving here) where within a short window upon a) creating a batch of
actions of the same kind b) followed by partial updates of said action
you then c) flush that action kind. Theory is the flush will trigger
the bug. IOW, it is not parallel but rather a single entry. I didnt
have time to look but whatever this bug is was most certainly fixed at
some point - perhaps nobody bothered to backport. If this fix is so
important to you please dig into newer kernels and backport.

There are other technical issues with your solution but I hope the above helps.
The repro doesnt cause any issues in newer kernels - but please verify
for yourself as well.

So from me, I am afraid, this is a nack

cheers,
jamal

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-16 15:04   ` Jamal Hadi Salim
@ 2024-08-17  9:27     ` Alex Young
  2024-08-17  9:35       ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Young @ 2024-08-17  9:27 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, linux-kernel, xiyou.wangcong, jiri, davem, security,
	xkaneiki, hackerzheng666

Hi Jamal,

Thanks your mention. I have reviewed the latest kernel code.
I understand why these two tc function threads can enter the kernel at the same
time. It's because the request_module[2] function in tcf_action_init_1. When the
tc_action_init_1 function to add a new action, it will load the action
module. It will
call rtnl_unlock to let the Thread2 into the kernel space.

Thread1                                                 Thread2
rtnetlink_rcv_msg                                   rtnetlink_rcv_msg
 rtnl_lock();
 tcf_action_init
  for(i;i<TCA_ACT_MAX_PRIO;i++)
   act=tcf_action_init_1 //[1]
        if (rtnl_held)
           rtnl_unlock(); //[2]
        request_module("act_%s", act_name);

                                                                tcf_del_walker

idr_for_each_entry_ul(idr,p,id)

__tcf_idr_release(p,false,true)

 free_tcf(p) //[3]
if (rtnl_held)
rtnl_lock();

   if(IS_ERR(act))
    goto err
   actions[i] = act

  err:
   tcf_action_destroy
    a=actions[i]
    ops = a->ops //[4]
I know this time window is small, but it can indeed cause the bug. And
in the latest
kernel, it have fixed the bug. But version 4.19.x is still a
maintenance version.
Is there anyone who can introduce this change into version 4.19.

Best regards,
Alex

Jamal Hadi Salim <jhs@mojatatu.com> 于2024年8月16日周五 23:04写道：

>
> On Fri, Aug 16, 2024 at 12:06 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> >
> > On Thu, Aug 15, 2024 at 9:54 PM yangzhuorao <alex000young@gmail.com> wrote:
> > >
> > > There is a uaf bug in net/sched/act_api.c.
> > > When Thread1 call [1] tcf_action_init_1 to alloc act which saves
> > > in actions array. If allocation failed, it will go to err path.
> > > Meanwhile thread2 call tcf_del_walker to delete action in idr.
> > > So thread 1 in err path [3] tcf_action_destroy will cause
> > > use-after-free read bug.
> > >
> > > Thread1                            Thread2
> > >  tcf_action_init
> > >   for(i;i<TCA_ACT_MAX_PRIO;i++)
> > >    act=tcf_action_init_1 //[1]
> > >    if(IS_ERR(act))
> > >     goto err
> > >    actions[i] = act
> > >                                    tcf_del_walker
> > >                                     idr_for_each_entry_ul(idr,p,id)
> > >                                      __tcf_idr_release(p,false,true)
> > >                                       free_tcf(p) //[2]
> > >   err:
> > >    tcf_action_destroy
> > >     a=actions[i]
> > >     ops = a->ops //[3]
> > >
> > > We add lock and unlock in tcf_action_init and tcf_del_walker function
> > > to aviod race condition.
> > >
>
> Hi Alex,
>
> Thanks for your valiant effort, unfortunately there's nothing to fix
> here for the current kernels.
> For your edification:
>
> This may have been an issue on the 4.x kernels you ran but i dont see
> a valid codepath that would create the kernel parallelism scenario you
> mentioned above (currently or ever actually). Kernel entry is
> syncronized from user space via the rtnetlink lock - meaning you cant
> have two control plane threads (as you show in your nice diagram above
> in your commit) entering from user space in parallel to trigger the
> bug.
>
> I believe the reason it happens in 4.x is there is likely a bug(hand
> waving here) where within a short window upon a) creating a batch of
> actions of the same kind b) followed by partial updates of said action
> you then c) flush that action kind. Theory is the flush will trigger
> the bug. IOW, it is not parallel but rather a single entry. I didnt
> have time to look but whatever this bug is was most certainly fixed at
> some point - perhaps nobody bothered to backport. If this fix is so
> important to you please dig into newer kernels and backport.
>
> There are other technical issues with your solution but I hope the above helps.
> The repro doesnt cause any issues in newer kernels - but please verify
> for yourself as well.
>
> So from me, I am afraid, this is a nack
>
> cheers,
> jamal

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-17  9:27     ` Alex Young
@ 2024-08-17  9:35       ` Greg KH
  2024-08-17 12:11         ` Jamal Hadi Salim
  0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2024-08-17  9:35 UTC (permalink / raw)
  To: Alex Young
  Cc: Jamal Hadi Salim, netdev, linux-kernel, xiyou.wangcong, jiri,
	davem, security, xkaneiki, hackerzheng666

On Sat, Aug 17, 2024 at 05:27:17PM +0800, Alex Young wrote:
> Hi Jamal,
> 
> Thanks your mention. I have reviewed the latest kernel code.
> I understand why these two tc function threads can enter the kernel at the same
> time. It's because the request_module[2] function in tcf_action_init_1. When the
> tc_action_init_1 function to add a new action, it will load the action
> module. It will
> call rtnl_unlock to let the Thread2 into the kernel space.
> 
> Thread1                                                 Thread2
> rtnetlink_rcv_msg                                   rtnetlink_rcv_msg
>  rtnl_lock();
>  tcf_action_init
>   for(i;i<TCA_ACT_MAX_PRIO;i++)
>    act=tcf_action_init_1 //[1]
>         if (rtnl_held)
>            rtnl_unlock(); //[2]
>         request_module("act_%s", act_name);
> 
>                                                                 tcf_del_walker
> 
> idr_for_each_entry_ul(idr,p,id)
> 
> __tcf_idr_release(p,false,true)
> 
>  free_tcf(p) //[3]
> if (rtnl_held)
> rtnl_lock();
> 
>    if(IS_ERR(act))
>     goto err
>    actions[i] = act
> 
>   err:
>    tcf_action_destroy
>     a=actions[i]
>     ops = a->ops //[4]
> I know this time window is small, but it can indeed cause the bug. And
> in the latest
> kernel, it have fixed the bug. But version 4.19.x is still a
> maintenance version.

4.19.y is only going to be alive for 4 more months, and anyone still
using it now really should have their plans to move off of it finished
already (or almost finished.)

If this is a request_module issue, and you care about 4.19.y kernels,
just add that module to the modprobe exclude list in userspace which
will prevent it from being loaded automatically.  Or load it at boot
time.

And what specific commit resolved this issue in the older kernels?  Have
you attempted to just backport that change to 4.19.y?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-17  9:35       ` Greg KH
@ 2024-08-17 12:11         ` Jamal Hadi Salim
  2024-08-18 10:40           ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: Jamal Hadi Salim @ 2024-08-17 12:11 UTC (permalink / raw)
  To: Greg KH
  Cc: Alex Young, netdev, linux-kernel, xiyou.wangcong, jiri, davem,
	security, xkaneiki, hackerzheng666

On Sat, Aug 17, 2024 at 5:35 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Sat, Aug 17, 2024 at 05:27:17PM +0800, Alex Young wrote:
> > Hi Jamal,
> >
> > Thanks your mention. I have reviewed the latest kernel code.
> > I understand why these two tc function threads can enter the kernel at the same
> > time. It's because the request_module[2] function in tcf_action_init_1. When the
> > tc_action_init_1 function to add a new action, it will load the action
> > module. It will
> > call rtnl_unlock to let the Thread2 into the kernel space.
> >
> > Thread1                                                 Thread2
> > rtnetlink_rcv_msg                                   rtnetlink_rcv_msg
> >  rtnl_lock();
> >  tcf_action_init
> >   for(i;i<TCA_ACT_MAX_PRIO;i++)
> >    act=tcf_action_init_1 //[1]
> >         if (rtnl_held)
> >            rtnl_unlock(); //[2]
> >         request_module("act_%s", act_name);
> >
> >                                                                 tcf_del_walker
> >
> > idr_for_each_entry_ul(idr,p,id)
> >
> > __tcf_idr_release(p,false,true)
> >
> >  free_tcf(p) //[3]
> > if (rtnl_held)
> > rtnl_lock();
> >
> >    if(IS_ERR(act))
> >     goto err
> >    actions[i] = act
> >
> >   err:
> >    tcf_action_destroy
> >     a=actions[i]
> >     ops = a->ops //[4]
> > I know this time window is small, but it can indeed cause the bug. And
> > in the latest
> > kernel, it have fixed the bug. But version 4.19.x is still a
> > maintenance version.
>
> 4.19.y is only going to be alive for 4 more months, and anyone still
> using it now really should have their plans to move off of it finished
> already (or almost finished.)
>
> If this is a request_module issue, and you care about 4.19.y kernels,
> just add that module to the modprobe exclude list in userspace which
> will prevent it from being loaded automatically.  Or load it at boot
> time.
>
> And what specific commit resolved this issue in the older kernels?  Have
> you attempted to just backport that change to 4.19.y?
>

And if you or anyone cares, here it is:
d349f997686887906b1183b5be96933c5452362a

cheers,
jamal

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-17 12:11         ` Jamal Hadi Salim
@ 2024-08-18 10:40           ` Greg KH
  2024-08-19  1:10             ` Alex Young
  0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2024-08-18 10:40 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Alex Young, netdev, linux-kernel, xiyou.wangcong, jiri, davem,
	security, xkaneiki, hackerzheng666

On Sat, Aug 17, 2024 at 08:11:50AM -0400, Jamal Hadi Salim wrote:
> On Sat, Aug 17, 2024 at 5:35 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Sat, Aug 17, 2024 at 05:27:17PM +0800, Alex Young wrote:
> > > Hi Jamal,
> > >
> > > Thanks your mention. I have reviewed the latest kernel code.
> > > I understand why these two tc function threads can enter the kernel at the same
> > > time. It's because the request_module[2] function in tcf_action_init_1. When the
> > > tc_action_init_1 function to add a new action, it will load the action
> > > module. It will
> > > call rtnl_unlock to let the Thread2 into the kernel space.
> > >
> > > Thread1                                                 Thread2
> > > rtnetlink_rcv_msg                                   rtnetlink_rcv_msg
> > >  rtnl_lock();
> > >  tcf_action_init
> > >   for(i;i<TCA_ACT_MAX_PRIO;i++)
> > >    act=tcf_action_init_1 //[1]
> > >         if (rtnl_held)
> > >            rtnl_unlock(); //[2]
> > >         request_module("act_%s", act_name);
> > >
> > >                                                                 tcf_del_walker
> > >
> > > idr_for_each_entry_ul(idr,p,id)
> > >
> > > __tcf_idr_release(p,false,true)
> > >
> > >  free_tcf(p) //[3]
> > > if (rtnl_held)
> > > rtnl_lock();
> > >
> > >    if(IS_ERR(act))
> > >     goto err
> > >    actions[i] = act
> > >
> > >   err:
> > >    tcf_action_destroy
> > >     a=actions[i]
> > >     ops = a->ops //[4]
> > > I know this time window is small, but it can indeed cause the bug. And
> > > in the latest
> > > kernel, it have fixed the bug. But version 4.19.x is still a
> > > maintenance version.
> >
> > 4.19.y is only going to be alive for 4 more months, and anyone still
> > using it now really should have their plans to move off of it finished
> > already (or almost finished.)
> >
> > If this is a request_module issue, and you care about 4.19.y kernels,
> > just add that module to the modprobe exclude list in userspace which
> > will prevent it from being loaded automatically.  Or load it at boot
> > time.
> >
> > And what specific commit resolved this issue in the older kernels?  Have
> > you attempted to just backport that change to 4.19.y?
> >
> 
> And if you or anyone cares, here it is:
> d349f997686887906b1183b5be96933c5452362a

Thanks for that.  Looks like it might be good to backport that to 5.4.y
if someone cares about this issue there as well.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-18 10:40           ` Greg KH
@ 2024-08-19  1:10             ` Alex Young
  2024-08-19  3:08               ` Greg KH
  0 siblings, 1 reply; 11+ messages in thread
From: Alex Young @ 2024-08-19  1:10 UTC (permalink / raw)
  To: Greg KH
  Cc: Jamal Hadi Salim, netdev, linux-kernel, xiyou.wangcong, jiri,
	davem, security, xkaneiki, hackerzheng666

Hi greg.
Thanks for your suggestion. I will try to use the new kernel.
By the way, the 5.4.y you mentioned does not fix this bug either.

Best regards,
Alex

Greg KH <gregkh@linuxfoundation.org> 于2024年8月18日周日 18:40写道：
>
> On Sat, Aug 17, 2024 at 08:11:50AM -0400, Jamal Hadi Salim wrote:
> > On Sat, Aug 17, 2024 at 5:35 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Sat, Aug 17, 2024 at 05:27:17PM +0800, Alex Young wrote:
> > > > Hi Jamal,
> > > >
> > > > Thanks your mention. I have reviewed the latest kernel code.
> > > > I understand why these two tc function threads can enter the kernel at the same
> > > > time. It's because the request_module[2] function in tcf_action_init_1. When the
> > > > tc_action_init_1 function to add a new action, it will load the action
> > > > module. It will
> > > > call rtnl_unlock to let the Thread2 into the kernel space.
> > > >
> > > > Thread1                                                 Thread2
> > > > rtnetlink_rcv_msg                                   rtnetlink_rcv_msg
> > > >  rtnl_lock();
> > > >  tcf_action_init
> > > >   for(i;i<TCA_ACT_MAX_PRIO;i++)
> > > >    act=tcf_action_init_1 //[1]
> > > >         if (rtnl_held)
> > > >            rtnl_unlock(); //[2]
> > > >         request_module("act_%s", act_name);
> > > >
> > > >                                                                 tcf_del_walker
> > > >
> > > > idr_for_each_entry_ul(idr,p,id)
> > > >
> > > > __tcf_idr_release(p,false,true)
> > > >
> > > >  free_tcf(p) //[3]
> > > > if (rtnl_held)
> > > > rtnl_lock();
> > > >
> > > >    if(IS_ERR(act))
> > > >     goto err
> > > >    actions[i] = act
> > > >
> > > >   err:
> > > >    tcf_action_destroy
> > > >     a=actions[i]
> > > >     ops = a->ops //[4]
> > > > I know this time window is small, but it can indeed cause the bug. And
> > > > in the latest
> > > > kernel, it have fixed the bug. But version 4.19.x is still a
> > > > maintenance version.
> > >
> > > 4.19.y is only going to be alive for 4 more months, and anyone still
> > > using it now really should have their plans to move off of it finished
> > > already (or almost finished.)
> > >
> > > If this is a request_module issue, and you care about 4.19.y kernels,
> > > just add that module to the modprobe exclude list in userspace which
> > > will prevent it from being loaded automatically.  Or load it at boot
> > > time.
> > >
> > > And what specific commit resolved this issue in the older kernels?  Have
> > > you attempted to just backport that change to 4.19.y?
> > >
> >
> > And if you or anyone cares, here it is:
> > d349f997686887906b1183b5be96933c5452362a
>
> Thanks for that.  Looks like it might be good to backport that to 5.4.y
> if someone cares about this issue there as well.
>
> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] net: sched: use-after-free in tcf_action_destroy
  2024-08-19  1:10             ` Alex Young
@ 2024-08-19  3:08               ` Greg KH
  0 siblings, 0 replies; 11+ messages in thread
From: Greg KH @ 2024-08-19  3:08 UTC (permalink / raw)
  To: Alex Young
  Cc: Jamal Hadi Salim, netdev, linux-kernel, xiyou.wangcong, jiri,
	davem, security, xkaneiki, hackerzheng666

On Mon, Aug 19, 2024 at 09:10:45AM +0800, Alex Young wrote:
> Hi greg.
> Thanks for your suggestion. I will try to use the new kernel.
> By the way, the 5.4.y you mentioned does not fix this bug either.

As was pointed out, this looks to be resolved in 5.10.y, not 5.4.y.  We
will gladly accept a working backport to 5.4.y of the commit to resolve
it there, please send it to stable@vger.kernel.org to be included.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-08-19  3:08 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-16  1:53 [PATCH] net: sched: use-after-free in tcf_action_destroy yangzhuorao
2024-08-16  4:06 ` Jamal Hadi Salim
2024-08-16  5:03   ` Willy Tarreau
2024-08-16 15:04   ` Jamal Hadi Salim
2024-08-17  9:27     ` Alex Young
2024-08-17  9:35       ` Greg KH
2024-08-17 12:11         ` Jamal Hadi Salim
2024-08-18 10:40           ` Greg KH
2024-08-19  1:10             ` Alex Young
2024-08-19  3:08               ` Greg KH
2024-08-16  7:20 ` Jiri Pirko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).