public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
@ 2024-04-30 17:11 Davide Caratti
  2024-04-30 17:58 ` Eric Dumazet
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Davide Caratti @ 2024-04-30 17:11 UTC (permalink / raw)
  To: Jamal Hadi Salim, Cong Wang, Jiri Pirko
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Naresh Kamboju, netdev

Naresh and Eric report several errors (corrupted elements in the dynamic
key hash list), when running tdc.py or syzbot. The error path of
qdisc_alloc() and qdisc_create() frees the qdisc memory, but it forgets
to unregister the lockdep key, thus causing use-after-free like the
following one:

 ==================================================================
 BUG: KASAN: slab-use-after-free in lockdep_register_key+0x5f2/0x700
 Read of size 8 at addr ffff88811236f2a8 by task ip/7925

 CPU: 26 PID: 7925 Comm: ip Kdump: loaded Not tainted 6.9.0-rc2+ #648
 Hardware name: Supermicro SYS-6027R-72RF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0  07/26/2013
 Call Trace:
  <TASK>
  dump_stack_lvl+0x7c/0xc0
  print_report+0xc9/0x610
  kasan_report+0x89/0xc0
  lockdep_register_key+0x5f2/0x700
  qdisc_alloc+0x21d/0xb60
  qdisc_create_dflt+0x63/0x3c0
  attach_one_default_qdisc.constprop.37+0x8e/0x170
  dev_activate+0x4bd/0xc30
  __dev_open+0x275/0x380
  __dev_change_flags+0x3f1/0x570
  dev_change_flags+0x7c/0x160
  do_setlink+0x1ea1/0x34b0
  __rtnl_newlink+0x8c9/0x1510
  rtnl_newlink+0x61/0x90
  rtnetlink_rcv_msg+0x2f0/0xbc0
  netlink_rcv_skb+0x120/0x380
  netlink_unicast+0x420/0x630
  netlink_sendmsg+0x732/0xbc0
  __sock_sendmsg+0x1ea/0x280
  ____sys_sendmsg+0x5a9/0x990
  ___sys_sendmsg+0xf1/0x180
  __sys_sendmsg+0xd3/0x180
  do_syscall_64+0x96/0x180
  entry_SYSCALL_64_after_hwframe+0x71/0x79
 RIP: 0033:0x7f9503f4fa07
 Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
 RSP: 002b:00007fff6c729068 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 000000006630c681 RCX: 00007f9503f4fa07
 RDX: 0000000000000000 RSI: 00007fff6c7290d0 RDI: 0000000000000003
 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000078
 R10: 000000000000009b R11: 0000000000000246 R12: 0000000000000001
 R13: 00007fff6c729180 R14: 0000000000000000 R15: 000055bf67dd9040
  </TASK>

 Allocated by task 7745:
  kasan_save_stack+0x1c/0x40
  kasan_save_track+0x10/0x30
  __kasan_kmalloc+0x7b/0x90
  __kmalloc_node+0x1ff/0x460
  qdisc_alloc+0xae/0xb60
  qdisc_create+0xdd/0xfb0
  tc_modify_qdisc+0x37e/0x1960
  rtnetlink_rcv_msg+0x2f0/0xbc0
  netlink_rcv_skb+0x120/0x380
  netlink_unicast+0x420/0x630
  netlink_sendmsg+0x732/0xbc0
  __sock_sendmsg+0x1ea/0x280
  ____sys_sendmsg+0x5a9/0x990
  ___sys_sendmsg+0xf1/0x180
  __sys_sendmsg+0xd3/0x180
  do_syscall_64+0x96/0x180
  entry_SYSCALL_64_after_hwframe+0x71/0x79

 Freed by task 7745:
  kasan_save_stack+0x1c/0x40
  kasan_save_track+0x10/0x30
  kasan_save_free_info+0x36/0x60
  __kasan_slab_free+0xfe/0x180
  kfree+0x113/0x380
  qdisc_create+0xafb/0xfb0
  tc_modify_qdisc+0x37e/0x1960
  rtnetlink_rcv_msg+0x2f0/0xbc0
  netlink_rcv_skb+0x120/0x380
  netlink_unicast+0x420/0x630
  netlink_sendmsg+0x732/0xbc0
  __sock_sendmsg+0x1ea/0x280
  ____sys_sendmsg+0x5a9/0x990
  ___sys_sendmsg+0xf1/0x180
  __sys_sendmsg+0xd3/0x180
  do_syscall_64+0x96/0x180
  entry_SYSCALL_64_after_hwframe+0x71/0x79

Fix this ensuring that lockdep_unregister_key() is called before the
qdisc struct is freed, also in the error path of qdisc_create() and
qdisc_alloc().

Fixes: af0cb3fa3f9e ("net/sched: fix false lockdep warning on qdisc root lock")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/netdev/20240429221706.1492418-1-naresh.kamboju@linaro.org/
CC: Naresh Kamboju <naresh.kamboju@linaro.org>
CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 net/sched/sch_api.c     | 1 +
 net/sched/sch_generic.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 60239378d43f..6292d6d73b72 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1389,6 +1389,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 		ops->destroy(sch);
 	qdisc_put_stab(rtnl_dereference(sch->stab));
 err_out3:
+	lockdep_unregister_key(&sch->root_lock_key);
 	netdev_put(dev, &sch->dev_tracker);
 	qdisc_free(sch);
 err_out2:
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 31dfd6c7405b..d3f6006b563c 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -982,6 +982,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 
 	return sch;
 errout1:
+	lockdep_unregister_key(&sch->root_lock_key);
 	kfree(sch);
 errout:
 	return ERR_PTR(err);
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-04-30 17:11 [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path Davide Caratti
@ 2024-04-30 17:58 ` Eric Dumazet
  2024-04-30 18:35   ` Davide Caratti
  2024-05-01  7:39 ` Eric Dumazet
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2024-04-30 17:58 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Naresh Kamboju, netdev

On Tue, Apr 30, 2024 at 7:11 PM Davide Caratti <dcaratti@redhat.com> wrote:
>
> Naresh and Eric report several errors (corrupted elements in the dynamic
> key hash list), when running tdc.py or syzbot. The error path of
> qdisc_alloc() and qdisc_create() frees the qdisc memory, but it forgets
> to unregister the lockdep key, thus causing use-after-free like the
> following one:
>
>  ==================================================================
>  BUG: KASAN: slab-use-after-free in lockdep_register_key+0x5f2/0x700
>  Read of size 8 at addr ffff88811236f2a8 by task ip/7925
>
>  CPU: 26 PID: 7925 Comm: ip Kdump: loaded Not tainted 6.9.0-rc2+ #648
>  Hardware name: Supermicro SYS-6027R-72RF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0  07/26/2013
>  Call Trace:
>   <TASK>
>   dump_stack_lvl+0x7c/0xc0
>   print_report+0xc9/0x610
>   kasan_report+0x89/0xc0
>   lockdep_register_key+0x5f2/0x700
>   qdisc_alloc+0x21d/0xb60
>   qdisc_create_dflt+0x63/0x3c0
>   attach_one_default_qdisc.constprop.37+0x8e/0x170
>   dev_activate+0x4bd/0xc30
>   __dev_open+0x275/0x380
>   __dev_change_flags+0x3f1/0x570
>   dev_change_flags+0x7c/0x160
>   do_setlink+0x1ea1/0x34b0
>   __rtnl_newlink+0x8c9/0x1510
>   rtnl_newlink+0x61/0x90
>   rtnetlink_rcv_msg+0x2f0/0xbc0
>   netlink_rcv_skb+0x120/0x380
>   netlink_unicast+0x420/0x630
>   netlink_sendmsg+0x732/0xbc0
>   __sock_sendmsg+0x1ea/0x280
>   ____sys_sendmsg+0x5a9/0x990
>   ___sys_sendmsg+0xf1/0x180
>   __sys_sendmsg+0xd3/0x180
>   do_syscall_64+0x96/0x180
>   entry_SYSCALL_64_after_hwframe+0x71/0x79
>  RIP: 0033:0x7f9503f4fa07
>  Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
>  RSP: 002b:00007fff6c729068 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>  RAX: ffffffffffffffda RBX: 000000006630c681 RCX: 00007f9503f4fa07
>  RDX: 0000000000000000 RSI: 00007fff6c7290d0 RDI: 0000000000000003
>  RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000078
>  R10: 000000000000009b R11: 0000000000000246 R12: 0000000000000001
>  R13: 00007fff6c729180 R14: 0000000000000000 R15: 000055bf67dd9040
>   </TASK>
>
>  Allocated by task 7745:
>   kasan_save_stack+0x1c/0x40
>   kasan_save_track+0x10/0x30
>   __kasan_kmalloc+0x7b/0x90
>   __kmalloc_node+0x1ff/0x460
>   qdisc_alloc+0xae/0xb60
>   qdisc_create+0xdd/0xfb0
>   tc_modify_qdisc+0x37e/0x1960
>   rtnetlink_rcv_msg+0x2f0/0xbc0
>   netlink_rcv_skb+0x120/0x380
>   netlink_unicast+0x420/0x630
>   netlink_sendmsg+0x732/0xbc0
>   __sock_sendmsg+0x1ea/0x280
>   ____sys_sendmsg+0x5a9/0x990
>   ___sys_sendmsg+0xf1/0x180
>   __sys_sendmsg+0xd3/0x180
>   do_syscall_64+0x96/0x180
>   entry_SYSCALL_64_after_hwframe+0x71/0x79
>
>  Freed by task 7745:
>   kasan_save_stack+0x1c/0x40
>   kasan_save_track+0x10/0x30
>   kasan_save_free_info+0x36/0x60
>   __kasan_slab_free+0xfe/0x180
>   kfree+0x113/0x380
>   qdisc_create+0xafb/0xfb0
>   tc_modify_qdisc+0x37e/0x1960
>   rtnetlink_rcv_msg+0x2f0/0xbc0
>   netlink_rcv_skb+0x120/0x380
>   netlink_unicast+0x420/0x630
>   netlink_sendmsg+0x732/0xbc0
>   __sock_sendmsg+0x1ea/0x280
>   ____sys_sendmsg+0x5a9/0x990
>   ___sys_sendmsg+0xf1/0x180
>   __sys_sendmsg+0xd3/0x180
>   do_syscall_64+0x96/0x180
>   entry_SYSCALL_64_after_hwframe+0x71/0x79
>
> Fix this ensuring that lockdep_unregister_key() is called before the
> qdisc struct is freed, also in the error path of qdisc_create() and
> qdisc_alloc().
>
> Fixes: af0cb3fa3f9e ("net/sched: fix false lockdep warning on qdisc root lock")
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> Closes: https://lore.kernel.org/netdev/20240429221706.1492418-1-naresh.kamboju@linaro.org/
> CC: Naresh Kamboju <naresh.kamboju@linaro.org>
> CC: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>  net/sched/sch_api.c     | 1 +
>  net/sched/sch_generic.c | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index 60239378d43f..6292d6d73b72 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -1389,6 +1389,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
>                 ops->destroy(sch);
>         qdisc_put_stab(rtnl_dereference(sch->stab));
>  err_out3:
> +       lockdep_unregister_key(&sch->root_lock_key);
>         netdev_put(dev, &sch->dev_tracker);
>         qdisc_free(sch);
>  err_out2:

For consistency with the other path, what about this instead ?

This would also  allow a qdisc goten from an rcu lookup to allow its
spinlock to be acquired.
(I am not saying this can happen, but who knows...)

Ie defer the  lockdep_unregister_key() right before the kfree()


diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 31dfd6c7405b01e22fe1b8c80944e2bed7d30ddc..edc9e4d240410d2f98dec04d3fac40983e808c28
100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1042,6 +1042,7 @@ void qdisc_free(struct Qdisc *qdisc)
                free_percpu(qdisc->cpu_qstats);
        }

+       lockdep_unregister_key(&qdisc->root_lock_key);
        kfree(qdisc);
 }

@@ -1070,7 +1071,6 @@ static void __qdisc_destroy(struct Qdisc *qdisc)
        if (ops->destroy)
                ops->destroy(qdisc);

-       lockdep_unregister_key(&qdisc->root_lock_key);
        module_put(ops->owner);
        netdev_put(dev, &qdisc->dev_tracker);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-04-30 17:58 ` Eric Dumazet
@ 2024-04-30 18:35   ` Davide Caratti
  2024-04-30 18:43     ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Davide Caratti @ 2024-04-30 18:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Naresh Kamboju, netdev

hi Eric, thanks for looking at this!

On Tue, Apr 30, 2024 at 07:58:14PM +0200, Eric Dumazet wrote:
> On Tue, Apr 30, 2024 at 7:11 PM Davide Caratti <dcaratti@redhat.com> wrote:
> >

[...]

> > @@ -1389,6 +1389,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
> >                 ops->destroy(sch);
> >         qdisc_put_stab(rtnl_dereference(sch->stab));
> >  err_out3:
> > +       lockdep_unregister_key(&sch->root_lock_key);
> >         netdev_put(dev, &sch->dev_tracker);
> >         qdisc_free(sch);
> >  err_out2:
> 
> For consistency with the other path, what about this instead ?
> 
> This would also  allow a qdisc goten from an rcu lookup to allow its
> spinlock to be acquired.
> (I am not saying this can happen, but who knows...)
> 
> Ie defer the  lockdep_unregister_key() right before the kfree()

the problem is, qdisc_free() is called also in a RCU callback. So, if we move
lockdep_unregister_key() inside the function, the non-error path is
going to splat like this:

 BUG: sleeping function called from invalid context at kernel/locking/lockdep.c:6450        
 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/6                                       
 preempt_count: 101, expected: 0                                                                                 
 RCU nest depth: 0, expected: 0                                                                                                                                                                                                                 
 1 lock held by swapper/6/0:                                                                                     
  #0: ffffffff83e03000 (rcu_callback){....}-{0:0}, at: rcu_do_batch+0x1d6/0x630                                                                                                                                                                 
 Preemption disabled at:                                                                                                                                                                                                                        
 [<0000000000000000>] 0x0                                                                                                                                                                                                                       
 CPU: 6 PID: 0 Comm: swapper/6 Not tainted 6.9.0-rc2+ #655                                                                                                                                                                                      
 Hardware name: Supermicro SYS-6027R-72RF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0  07/26/2013                                                                                                                                                             
 Call Trace:                                                                                                     
  <IRQ>                                                                                                                                                                                                                                         
  dump_stack_lvl+0xa9/0xc0                                                                                                                                                                                                                      
  __might_resched+0x1a6/0x2b0                                                                                    
  ? rcu_do_batch+0x208/0x630                                                                                                                                                                                                                    
  lockdep_unregister_key+0x28/0x290                                                                                                                                                                                                             
  qdisc_free+0x1b/0x40                                                                                           
  rcu_do_batch+0x20d/0x630                                                                                                                                                                                                                      
  ? lockdep_hardirqs_on+0x78/0x100                                                                                                                                                                                                              
  rcu_core+0x305/0x570                                                                                                                                                                                                                          
  __do_softirq+0xcd/0x484                                                                                                                                                                                                                       
  irq_exit_rcu+0xc9/0x110                                                                                                                                                                                                                       
  sysvec_apic_timer_interrupt+0x9e/0xc0                                                                                                                                                                                                         
  </IRQ>                                                                                                                                                                                                                                        
  <TASK>                                                                                                                                                                                                                                        
  asm_sysvec_apic_timer_interrupt+0x16/0x20                                                                                                                                                                                                     
 RIP: 0010:cpuidle_enter_state+0x104/0x5e0                                                                                                                                                                                                      
 Code: 00 89 c0 48 0f a3 05 3b 3e 1e 01 0f 82 4f 03 00 00 31 ff e8 5e 9e 33 ff 45 84 ff 0f 85 ff 02 00 00 e8 60 4b 47 ff fb 45 85 f6 <0f> 88 24 02 00 00 49 63 d6 48 2b 2c 24 48 8d 04 52 48 8d 04 82 49                                        
 RSP: 0018:ffffbe4ec22e7e78 EFLAGS: 00000202                                                                                                                                                                                                    
 RAX: 000000000002ec55 RBX: 0000000000000004 RCX: 0000000000000000                                                                                                                                                                              
 RDX: 0000000000000000 RSI: ffffffff838d1436 RDI: ffffffff8385cace                                                                                                                                                                              
 RBP: 0000001486cedd9b R08: 0000000000000001 R09: 0000000000000001                                                                                                                                                                              
 R10: 00004d0a32102e4a R11: ffffffff83a0bb08 R12: ffffde4abd602710                                               
 R13: ffffffff83f66f20 R14: 0000000000000004 R15: 0000000000000000         


6442 void lockdep_unregister_key(struct lock_class_key *key)
6443 {
6444         struct hlist_head *hash_head = keyhashentry(key);
6445         struct lock_class_key *k;
6446         struct pending_free *pf;
6447         unsigned long flags;
6448         bool found = false;
6449 
6450         might_sleep();

... because of the above line. That's why in the normal path, the dynamic key is
unregistered before scheduling the RCU callback.

-- 
davide



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-04-30 18:35   ` Davide Caratti
@ 2024-04-30 18:43     ` Eric Dumazet
  2024-05-03 12:44       ` Davide Caratti
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2024-04-30 18:43 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Naresh Kamboju, netdev

On Tue, Apr 30, 2024 at 8:35 PM Davide Caratti <dcaratti@redhat.com> wrote:
>
> hi Eric, thanks for looking at this!
>
> On Tue, Apr 30, 2024 at 07:58:14PM +0200, Eric Dumazet wrote:
> > On Tue, Apr 30, 2024 at 7:11 PM Davide Caratti <dcaratti@redhat.com> wrote:
> > >
>
> [...]
>
> > > @@ -1389,6 +1389,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
> > >                 ops->destroy(sch);
> > >         qdisc_put_stab(rtnl_dereference(sch->stab));
> > >  err_out3:
> > > +       lockdep_unregister_key(&sch->root_lock_key);
> > >         netdev_put(dev, &sch->dev_tracker);
> > >         qdisc_free(sch);
> > >  err_out2:
> >
> > For consistency with the other path, what about this instead ?
> >
> > This would also  allow a qdisc goten from an rcu lookup to allow its
> > spinlock to be acquired.
> > (I am not saying this can happen, but who knows...)
> >
> > Ie defer the  lockdep_unregister_key() right before the kfree()
>
> the problem is, qdisc_free() is called also in a RCU callback. So, if we move
> lockdep_unregister_key() inside the function, the non-error path is
> going to splat like this

Got it, but we do have ways of running a work queue after rcu grace period.

Let's use your patch, but I suspect we could have other issues.

Full disclosure, I have the following syzbot report:

WARNING: bad unlock balance detected!
6.9.0-rc5-syzkaller-01413-gdd1941f801bc #0 Not tainted
-------------------------------------
kworker/u8:6/2474 is trying to release lock (&sch->root_lock_key) at:
[<ffffffff897300c5>] spin_unlock_bh include/linux/spinlock.h:396 [inline]
[<ffffffff897300c5>] dev_reset_queue+0x145/0x1b0 net/sched/sch_generic.c:1304
but there are no more locks to release!

other info that might help us debug this:
5 locks held by kworker/u8:6/2474:
#0: ffff888015ecd948 ((wq_completion)netns){+.+.}-{0:0}, at:
process_one_work kernel/workqueue.c:3229 [inline]
#0: ffff888015ecd948 ((wq_completion)netns){+.+.}-{0:0}, at:
process_scheduled_works+0x8e0/0x17c0 kernel/workqueue.c:3335
#1: ffffc9000a3a7d00 (net_cleanup_work){+.+.}-{0:0}, at:
process_one_work kernel/workqueue.c:3230 [inline]
#1: ffffc9000a3a7d00 (net_cleanup_work){+.+.}-{0:0}, at:
process_scheduled_works+0x91b/0x17c0 kernel/workqueue.c:3335
#2: ffffffff8f59bd50 (pernet_ops_rwsem){++++}-{3:3}, at:
cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:591
#3: ffffffff8f5a8648 (rtnl_mutex){+.+.}-{3:3}, at:
cleanup_net+0x6af/0xcc0 net/core/net_namespace.c:627
#4: ffff88802cbce258 (dev->qdisc_tx_busylock ?:
&qdisc_tx_busylock#2){+...}-{2:2}, at: spin_lock_bh
include/linux/spinlock.h:356 [inline]
#4: ffff88802cbce258 (dev->qdisc_tx_busylock ?:
&qdisc_tx_busylock#2){+...}-{2:2}, at: dev_reset_queue+0x126/0x1b0
net/sched/sch_generic.c:1299

stack backtrace:
CPU: 1 PID: 2474 Comm: kworker/u8:6 Not tainted
6.9.0-rc5-syzkaller-01413-gdd1941f801bc #0
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 03/27/2024
Workqueue: netns cleanup_net
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_unlock_imbalance_bug+0x256/0x2c0 kernel/locking/lockdep.c:5194
__lock_release kernel/locking/lockdep.c:5431 [inline]
lock_release+0x599/0x9f0 kernel/locking/lockdep.c:5774
__raw_spin_unlock_bh include/linux/spinlock_api_smp.h:165 [inline]
_raw_spin_unlock_bh+0x1b/0x40 kernel/locking/spinlock.c:210
spin_unlock_bh include/linux/spinlock.h:396 [inline]
dev_reset_queue+0x145/0x1b0 net/sched/sch_generic.c:1304
netdev_for_each_tx_queue include/linux/netdevice.h:2503 [inline]
dev_deactivate_many+0x54a/0xb10 net/sched/sch_generic.c:1368
__dev_close_many+0x1a4/0x300 net/core/dev.c:1529
dev_close_many+0x24e/0x4c0 net/core/dev.c:1567
unregister_netdevice_many_notify+0x544/0x16e0 net/core/dev.c:11181
cleanup_net+0x75d/0xcc0 net/core/net_namespace.c:632
process_one_work kernel/workqueue.c:3254 [inline]
process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335
worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-04-30 17:11 [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path Davide Caratti
  2024-04-30 17:58 ` Eric Dumazet
@ 2024-05-01  7:39 ` Eric Dumazet
  2024-05-01 10:26 ` Ido Schimmel
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2024-05-01  7:39 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Naresh Kamboju, netdev

On Tue, Apr 30, 2024 at 7:11 PM Davide Caratti <dcaratti@redhat.com> wrote:
>
> Naresh and Eric report several errors (corrupted elements in the dynamic
> key hash list), when running tdc.py or syzbot. The error path of
> qdisc_alloc() and qdisc_create() frees the qdisc memory, but it forgets
> to unregister the lockdep key, thus causing use-after-free like the
> following one:
>
>  ==================================================================
>  BUG: KASAN: slab-use-after-free in lockdep_register_key+0x5f2/0x700
>  Read of size 8 at addr ffff88811236f2a8 by task ip/7925
>
>
> Fixes: af0cb3fa3f9e ("net/sched: fix false lockdep warning on qdisc root lock")
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> Closes: https://lore.kernel.org/netdev/20240429221706.1492418-1-naresh.kamboju@linaro.org/
> CC: Naresh Kamboju <naresh.kamboju@linaro.org>
> CC: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---

Reviewed-by: Eric Dumazet <edumazet@google.com>

I hope syzbot storm will be stopped with this fix.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-04-30 17:11 [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path Davide Caratti
  2024-04-30 17:58 ` Eric Dumazet
  2024-05-01  7:39 ` Eric Dumazet
@ 2024-05-01 10:26 ` Ido Schimmel
  2024-05-01 10:38   ` Naresh Kamboju
  2024-05-02  6:22 ` Ido Schimmel
  2024-05-02 14:20 ` patchwork-bot+netdevbpf
  4 siblings, 1 reply; 11+ messages in thread
From: Ido Schimmel @ 2024-05-01 10:26 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Naresh Kamboju, netdev

On Tue, Apr 30, 2024 at 07:11:13PM +0200, Davide Caratti wrote:
> Naresh and Eric report several errors (corrupted elements in the dynamic
> key hash list), when running tdc.py or syzbot. The error path of
> qdisc_alloc() and qdisc_create() frees the qdisc memory, but it forgets
> to unregister the lockdep key, thus causing use-after-free like the
> following one:

[...]

> Fix this ensuring that lockdep_unregister_key() is called before the
> qdisc struct is freed, also in the error path of qdisc_create() and
> qdisc_alloc().
> 
> Fixes: af0cb3fa3f9e ("net/sched: fix false lockdep warning on qdisc root lock")
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> Closes: https://lore.kernel.org/netdev/20240429221706.1492418-1-naresh.kamboju@linaro.org/
> CC: Naresh Kamboju <naresh.kamboju@linaro.org>
> CC: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>

Reviewed-by: Ido Schimmel <idosch@nvidia.com>

We've also hit the issue on two of our machines running debug kernels. I
started a run with the fix on both and will report tomorrow morning (not
saying you should wait).

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-05-01 10:26 ` Ido Schimmel
@ 2024-05-01 10:38   ` Naresh Kamboju
  0 siblings, 0 replies; 11+ messages in thread
From: Naresh Kamboju @ 2024-05-01 10:38 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Davide Caratti, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev

On Wed, 1 May 2024 at 15:56, Ido Schimmel <idosch@idosch.org> wrote:
>
> On Tue, Apr 30, 2024 at 07:11:13PM +0200, Davide Caratti wrote:
> > Naresh and Eric report several errors (corrupted elements in the dynamic
> > key hash list), when running tdc.py or syzbot. The error path of
> > qdisc_alloc() and qdisc_create() frees the qdisc memory, but it forgets
> > to unregister the lockdep key, thus causing use-after-free like the
> > following one:
>
> [...]
>
> > Fix this ensuring that lockdep_unregister_key() is called before the
> > qdisc struct is freed, also in the error path of qdisc_create() and
> > qdisc_alloc().
> >
> > Fixes: af0cb3fa3f9e ("net/sched: fix false lockdep warning on qdisc root lock")
> > Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> > Closes: https://lore.kernel.org/netdev/20240429221706.1492418-1-naresh.kamboju@linaro.org/
> > CC: Naresh Kamboju <naresh.kamboju@linaro.org>
> > CC: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Davide Caratti <dcaratti@redhat.com>
>
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>

Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>

I have applied this patch and tested.

>
> We've also hit the issue on two of our machines running debug kernels. I
> started a run with the fix on both and will report tomorrow morning (not
> saying you should wait).

- Naresh

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-04-30 17:11 [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path Davide Caratti
                   ` (2 preceding siblings ...)
  2024-05-01 10:26 ` Ido Schimmel
@ 2024-05-02  6:22 ` Ido Schimmel
  2024-05-02 14:20 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 11+ messages in thread
From: Ido Schimmel @ 2024-05-02  6:22 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Naresh Kamboju, netdev

On Tue, Apr 30, 2024 at 07:11:13PM +0200, Davide Caratti wrote:
> Naresh and Eric report several errors (corrupted elements in the dynamic
> key hash list), when running tdc.py or syzbot. The error path of
> qdisc_alloc() and qdisc_create() frees the qdisc memory, but it forgets
> to unregister the lockdep key, thus causing use-after-free like the
> following one:

[...]

> Fix this ensuring that lockdep_unregister_key() is called before the
> qdisc struct is freed, also in the error path of qdisc_create() and
> qdisc_alloc().
> 
> Fixes: af0cb3fa3f9e ("net/sched: fix false lockdep warning on qdisc root lock")
> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
> Closes: https://lore.kernel.org/netdev/20240429221706.1492418-1-naresh.kamboju@linaro.org/
> CC: Naresh Kamboju <naresh.kamboju@linaro.org>
> CC: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>

Tested-by: Ido Schimmel <idosch@nvidia.com>

Thanks!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-04-30 17:11 [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path Davide Caratti
                   ` (3 preceding siblings ...)
  2024-05-02  6:22 ` Ido Schimmel
@ 2024-05-02 14:20 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-05-02 14:20 UTC (permalink / raw)
  To: Davide Caratti
  Cc: jhs, xiyou.wangcong, jiri, davem, edumazet, kuba, pabeni,
	naresh.kamboju, netdev

Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 30 Apr 2024 19:11:13 +0200 you wrote:
> Naresh and Eric report several errors (corrupted elements in the dynamic
> key hash list), when running tdc.py or syzbot. The error path of
> qdisc_alloc() and qdisc_create() frees the qdisc memory, but it forgets
> to unregister the lockdep key, thus causing use-after-free like the
> following one:
> 
>  ==================================================================
>  BUG: KASAN: slab-use-after-free in lockdep_register_key+0x5f2/0x700
>  Read of size 8 at addr ffff88811236f2a8 by task ip/7925
> 
> [...]

Here is the summary with links:
  - [net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
    https://git.kernel.org/netdev/net-next/c/86735b57c905

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-04-30 18:43     ` Eric Dumazet
@ 2024-05-03 12:44       ` Davide Caratti
  2024-05-03 12:48         ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Davide Caratti @ 2024-05-03 12:44 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Naresh Kamboju, netdev

hello Eric,

On Tue, Apr 30, 2024 at 08:43:22PM +0200, Eric Dumazet wrote:
> On Tue, Apr 30, 2024 at 8:35 PM Davide Caratti <dcaratti@redhat.com> wrote:
> >

[...]

> > > For consistency with the other path, what about this instead ?
> > >
> > > This would also  allow a qdisc goten from an rcu lookup to allow its
> > > spinlock to be acquired.
> > > (I am not saying this can happen, but who knows...)
> > >
> > > Ie defer the  lockdep_unregister_key() right before the kfree()
> >
> > the problem is, qdisc_free() is called also in a RCU callback. So, if we move
> > lockdep_unregister_key() inside the function, the non-error path is
> > going to splat like this
> 
> Got it, but we do have ways of running a work queue after rcu grace period.

this would imply scheduling a work that does qdisc_free() + lockdep_unregister_key()
in qdisc_free_cb(). I can try that, but maybe the issue is different:

> Let's use your patch, but I suspect we could have other issues.
> 
> Full disclosure, I have the following syzbot report:
> 
> WARNING: bad unlock balance detected!
> 6.9.0-rc5-syzkaller-01413-gdd1941f801bc #0 Not tainted
> -------------------------------------
> kworker/u8:6/2474 is trying to release lock (&sch->root_lock_key) at:
> [<ffffffff897300c5>] spin_unlock_bh include/linux/spinlock.h:396 [inline]
> [<ffffffff897300c5>] dev_reset_queue+0x145/0x1b0 net/sched/sch_generic.c:1304
> but there are no more locks to release!

I don't understand how can this "imbalance" be caused by lockdep_unregister_key()
being called too early. I'm more inclined to think that this splat is due to UaF
similar to those that we saw a couples of days ago. Is syzbot still
generating report like the one above?

thanks,

-- 
davide


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path
  2024-05-03 12:44       ` Davide Caratti
@ 2024-05-03 12:48         ` Eric Dumazet
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2024-05-03 12:48 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Naresh Kamboju, netdev

On Fri, May 3, 2024 at 2:44 PM Davide Caratti <dcaratti@redhat.com> wrote:
>
> hello Eric,
>
> On Tue, Apr 30, 2024 at 08:43:22PM +0200, Eric Dumazet wrote:
> > On Tue, Apr 30, 2024 at 8:35 PM Davide Caratti <dcaratti@redhat.com> wrote:
> > >
>
> [...]
>
> > > > For consistency with the other path, what about this instead ?
> > > >
> > > > This would also  allow a qdisc goten from an rcu lookup to allow its
> > > > spinlock to be acquired.
> > > > (I am not saying this can happen, but who knows...)
> > > >
> > > > Ie defer the  lockdep_unregister_key() right before the kfree()
> > >
> > > the problem is, qdisc_free() is called also in a RCU callback. So, if we move
> > > lockdep_unregister_key() inside the function, the non-error path is
> > > going to splat like this
> >
> > Got it, but we do have ways of running a work queue after rcu grace period.
>
> this would imply scheduling a work that does qdisc_free() + lockdep_unregister_key()
> in qdisc_free_cb(). I can try that, but maybe the issue is different:
>
> > Let's use your patch, but I suspect we could have other issues.
> >
> > Full disclosure, I have the following syzbot report:
> >
> > WARNING: bad unlock balance detected!
> > 6.9.0-rc5-syzkaller-01413-gdd1941f801bc #0 Not tainted
> > -------------------------------------
> > kworker/u8:6/2474 is trying to release lock (&sch->root_lock_key) at:
> > [<ffffffff897300c5>] spin_unlock_bh include/linux/spinlock.h:396 [inline]
> > [<ffffffff897300c5>] dev_reset_queue+0x145/0x1b0 net/sched/sch_generic.c:1304
> > but there are no more locks to release!
>
> I don't understand how can this "imbalance" be caused by lockdep_unregister_key()
> being called too early. I'm more inclined to think that this splat is due to UaF
> similar to those that we saw a couples of days ago. Is syzbot still
> generating report like the one above?
>

 I had 22 other syzbot reports that I marked as a Duplicate of the original one.

Let's see if syzbot re-opens a new one after the fix.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-05-03 12:49 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-04-30 17:11 [PATCH net-next] net/sched: unregister lockdep keys in qdisc_create/qdisc_alloc error path Davide Caratti
2024-04-30 17:58 ` Eric Dumazet
2024-04-30 18:35   ` Davide Caratti
2024-04-30 18:43     ` Eric Dumazet
2024-05-03 12:44       ` Davide Caratti
2024-05-03 12:48         ` Eric Dumazet
2024-05-01  7:39 ` Eric Dumazet
2024-05-01 10:26 ` Ido Schimmel
2024-05-01 10:38   ` Naresh Kamboju
2024-05-02  6:22 ` Ido Schimmel
2024-05-02 14:20 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox