netdev.vger.kernel.org archive mirror
* [PATCH net] net: gro_cells: Provide lockdep class for gro_cell's bh_lock
@ 2025-11-04 11:12 Sebastian Andrzej Siewior
  2025-11-04 14:05 ` Jakub Kicinski
  2025-11-04 14:22 ` [syzbot ci] Re: net: gro_cells: Provide lockdep class for gro_cell's bh_lock syzbot ci
  0 siblings, 2 replies; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-11-04 11:12 UTC (permalink / raw)
  To: netdev
  Cc: Eric Dumazet, Gal Pressman, linux-rt-devel, David S. Miller,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Clark Williams,
	Steven Rostedt

One GRO-cell device's NAPI callback can nest into the GRO-cell of
another device if the underlying device is also using GRO-cell.
This is the case for IPsec over vxlan.
These two GRO-cells are separate devices. From lockdep's point of view
it is the same because each device is sharing the same lock class and so
it reports a possible deadlock assuming one device is nesting into
itself.

Provide a lock class for the bh_lock of each gro-cell device, allowing
lockdep to distinguish between individual devices.

Fixes: 25718fdcbdd2 ("net: gro_cells: Use nested-BH locking for gro_cell")
Reported-by: Gal Pressman <gal@nvidia.com>
Closes: https://lore.kernel.org/all/66664116-edb8-48dc-ad72-d5223696dd19@nvidia.com/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/net/gro_cells.h | 1 +
 net/core/gro_cells.c    | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/net/gro_cells.h b/include/net/gro_cells.h
index 596688b67a2a8..2453d0139c205 100644
--- a/include/net/gro_cells.h
+++ b/include/net/gro_cells.h
@@ -10,6 +10,7 @@ struct gro_cell;
 
 struct gro_cells {
 	struct gro_cell __percpu	*cells;
+	struct lock_class_key		cells_bh_key;
 };
 
 int gro_cells_receive(struct gro_cells *gcells, struct sk_buff *skb);
diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c
index fd57b845de333..a91fdc47e8096 100644
--- a/net/core/gro_cells.c
+++ b/net/core/gro_cells.c
@@ -88,6 +88,7 @@ int gro_cells_init(struct gro_cells *gcells, struct net_device *dev)
 
 		__skb_queue_head_init(&cell->napi_skbs);
 		local_lock_init(&cell->bh_lock);
+		lockdep_set_class(&cell->bh_lock, &gcells->cells_bh_key);
 
 		set_bit(NAPI_STATE_NO_BUSY_POLL, &cell->napi.state);
 
-- 
2.51.0
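
The false positive described in the commit message can be illustrated
with a small userspace model. This is only a sketch, not kernel code:
all toy_* names are hypothetical, and real lockdep tracks far more than
a stack of held class keys. It assumes only what the description
states, namely that a nested acquisition is judged by lock class rather
than by lock instance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical names): a lock only carries its class key. */
struct toy_key { char pad; };
struct toy_lock { const struct toy_key *key; };

#define TOY_MAX_HELD 8
static const struct toy_key *toy_held[TOY_MAX_HELD];
static size_t toy_depth;

/* Returns false if a lock of the same class is already held. */
static bool toy_acquire(const struct toy_lock *lock)
{
	for (size_t i = 0; i < toy_depth; i++)
		if (toy_held[i] == lock->key)
			return false;
	assert(toy_depth < TOY_MAX_HELD);
	toy_held[toy_depth++] = lock->key;
	return true;
}

/* Drops the most recently acquired lock. */
static void toy_release(void)
{
	assert(toy_depth > 0);
	toy_depth--;
}

/* Take the inner lock while the outer one is held; report the verdict. */
static bool toy_nest(const struct toy_lock *outer, const struct toy_lock *inner)
{
	bool ok;

	if (!toy_acquire(outer))
		return false;
	ok = toy_acquire(inner);
	if (ok)
		toy_release();
	toy_release();
	return ok;
}

/* Two distinct devices whose locks share one class: flagged. */
static bool toy_shared_class_ok(void)
{
	static struct toy_key shared;
	struct toy_lock dev_a = { .key = &shared };
	struct toy_lock dev_b = { .key = &shared };

	return toy_nest(&dev_a, &dev_b);
}

/* The same two devices with one class per device: accepted. */
static bool toy_per_device_class_ok(void)
{
	static struct toy_key key_a, key_b;
	struct toy_lock dev_a = { .key = &key_a };
	struct toy_lock dev_b = { .key = &key_b };

	return toy_nest(&dev_a, &dev_b);
}
```

In this model toy_shared_class_ok() reports a problem although dev_a
and dev_b are different locks, while toy_per_device_class_ok() does
not, which is the distinction a per-device class is meant to give
lockdep.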



* Re: [PATCH net] net: gro_cells: Provide lockdep class for gro_cell's bh_lock
  2025-11-04 11:12 [PATCH net] net: gro_cells: Provide lockdep class for gro_cell's bh_lock Sebastian Andrzej Siewior
@ 2025-11-04 14:05 ` Jakub Kicinski
  2025-11-04 15:34   ` [PATCH net v2] net: gro_cells: Reduce lock scope in gro_cell_poll Sebastian Andrzej Siewior
  2025-11-04 14:22 ` [syzbot ci] Re: net: gro_cells: Provide lockdep class for gro_cell's bh_lock syzbot ci
  1 sibling, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2025-11-04 14:05 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: netdev, Eric Dumazet, Gal Pressman, linux-rt-devel,
	David S. Miller, Paolo Abeni, Simon Horman, Clark Williams,
	Steven Rostedt

On Tue, 4 Nov 2025 12:12:01 +0100 Sebastian Andrzej Siewior wrote:
> One GRO-cell device's NAPI callback can nest into the GRO-cell of
> another device if the underlying device is also using GRO-cell.
> This is the case for IPsec over vxlan.
> These two GRO-cells are separate devices. From lockdep's point of view
> it is the same because each device is sharing the same lock class and so
> it reports a possible deadlock assuming one device is nesting into
> itself.
> 
> Provide a lock class for the bh_lock of each gro-cell device, allowing
> lockdep to distinguish between individual devices.
> 
> Fixes: 25718fdcbdd2 ("net: gro_cells: Use nested-BH locking for gro_cell")
> Reported-by: Gal Pressman <gal@nvidia.com>
> Closes: https://lore.kernel.org/all/66664116-edb8-48dc-ad72-d5223696dd19@nvidia.com/
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Breaks boot:

[    2.053035][    T1] netem: version 1.3
[    2.054087][    T1] ipip: IPv4 and MPLS over IPv4 tunneling driver
[    2.055273][    T1] BUG: key ffff888009041e10 has not been registered!
[    2.055683][    T1] ------------[ cut here ]------------
[    2.055863][    T1] DEBUG_LOCKS_WARN_ON(1)
[    2.055880][    T1] WARNING: CPU: 1 PID: 1 at kernel/locking/lockdep.c:4976 lockdep_init_map_type+0x24c/0x270
[    2.056328][    T1] Modules linked in:
[    2.056488][    T1] CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.0-rc3-virtme #1 PREEMPT(full) 
[    2.056792][    T1] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    2.057007][    T1] RIP: 0010:lockdep_init_map_type+0x24c/0x270
[    2.057220][    T1] Code: ff 4c 89 e6 48 c7 c7 50 97 83 b6 e8 ee c9 01 00 e9 3d ff ff ff 90 48 c7 c6 1a 2d 7d b6 48 c7 c7 67 29 7d b6 e8 65 6e e9 ff 90 <0f> 0b 90 90 e9 47 ff ff ff 90 48 c7 c6 56 2e 7d b6 48 c7 c7 67 29
[    2.057839][    T1] RSP: 0000:ffffc90000017960 EFLAGS: 00010286
[    2.058161][    T1] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    2.058417][    T1] RDX: 0000000000000002 RSI: 0000000000000004 RDI: 0000000000000001
[    2.058663][    T1] RBP: ffffe8ffffc01ce8 R08: 0000000000000000 R09: fffffbfff6e4090c
[    2.059045][    T1] R10: 0000000000000003 R11: 0000000000000004 R12: ffff888009041e10
[    2.059298][    T1] R13: 0000000000000000 R14: ffffe8ffffc01ad0 R15: ffffe8ffffc01a78
[    2.059657][    T1] FS:  0000000000000000(0000) GS:ffff8880ae587000(0000) knlGS:0000000000000000
[    2.059966][    T1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.060199][    T1] CR2: 0000000000000000 CR3: 0000000079349001 CR4: 0000000000772ef0
[    2.060559][    T1] PKRU: 55555554
[    2.060684][    T1] Call Trace:
[    2.060819][    T1]  <TASK>
[    2.060911][    T1]  gro_cells_init+0x252/0x3d0
[    2.061082][    T1]  ip_tunnel_init+0xef/0x5f0
[    2.061364][    T1]  register_netdevice+0x59f/0x17b0
[    2.061545][    T1]  ? unregister_netdevice_queue+0x410/0x410
[    2.061767][    T1]  ? alloc_netdev_mqs+0xdd7/0x1370
[    2.062035][    T1]  __ip_tunnel_create+0x326/0x440
[    2.062201][    T1]  ? ip_tunnel_add+0x180/0x180
[    2.062374][    T1]  ip_tunnel_init_net+0x16f/0x4e0
[    2.062539][    T1]  ? paint_ptr+0x3b/0x90
[    2.062666][    T1]  ? ip_tunnel_ctl+0x890/0x890
[    2.062953][    T1]  ? mark_held_locks+0x49/0x70
[    2.063125][    T1]  ? _raw_spin_unlock_irqrestore+0x59/0x70
[    2.063332][    T1]  ops_init+0x189/0x550
[    2.063469][    T1]  register_pernet_operations+0x31f/0x8b0
[    2.063747][    T1]  ? ops_undo_list+0x890/0x890
[    2.063918][    T1]  ? rwsem_down_write_slowpath+0xc60/0xc60
[    2.064134][    T1]  ? rng_is_initialized+0x20/0x20
[    2.064408][    T1]  ? __up_write+0x1ad/0x520
[    2.064584][    T1]  ? ip_mr_init+0x120/0x120
[    2.064777][    T1]  register_pernet_device+0x2a/0x60
[    2.064947][    T1]  ipip_init+0x23/0xe0
[    2.065072][    T1]  do_one_initcall+0x8c/0x1d0
[    2.065353][    T1]  ? trace_initcall_start+0x130/0x130
[    2.065535][    T1]  ? rcu_is_watching+0x12/0xb0
[    2.065700][    T1]  ? __kmalloc_noprof+0x313/0x820
[    2.065983][    T1]  ? rcu_is_watching+0x12/0xb0
[    2.066158][    T1]  do_initcalls+0x176/0x280
[    2.066383][    T1]  kernel_init_freeable+0x227/0x310
[    2.066565][    T1]  ? rest_init+0x260/0x260
[    2.066774][    T1]  kernel_init+0x20/0x1f0
[    2.066900][    T1]  ? rest_init+0x260/0x260
[    2.067062][    T1]  ? rest_init+0x260/0x260
[    2.067227][    T1]  ret_from_fork+0x1db/0x270
[    2.067397][    T1]  ? rest_init+0x260/0x260
[    2.067677][    T1]  ret_from_fork_asm+0x11/0x20
[    2.067858][    T1]  </TASK>
[    2.067995][    T1] irq event stamp: 337037
[    2.068127][    T1] hardirqs last  enabled at (337037): [<ffffffffb3bcfb47>] __up_console_sem+0x67/0x70
[    2.068525][    T1] hardirqs last disabled at (337036): [<ffffffffb3bcfb2c>] __up_console_sem+0x4c/0x70
[    2.068829][    T1] softirqs last  enabled at (336962): [<ffffffffb3a63822>] handle_softirqs+0x352/0x610
[    2.069231][    T1] softirqs last disabled at (336631): [<ffffffffb3a6408b>] irq_exit_rcu+0xab/0x100
[    2.069541][    T1] ---[ end trace 0000000000000000 ]---
[    2.073346][    T1] IPv4 over IPsec tunneling driver
[    2.077691][    T1] NET: Registered PF_INET6 protocol family
[    2.087079][    T1] Segment Routing with IPv6
[    2.087254][    T1] RPL Segment Routing with IPv6
[    2.087776][    T1] In-situ OAM (IOAM) with IPv6
[    2.092422][    T1] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[    2.100954][    T1] NET: Registered PF_PACKET protocol family
[    2.102004][    T1] 8021q: 802.1Q VLAN Support v1.8
[    2.102407][    T1] 9pnet: Installing 9P2000 support
-- 
pw-bot: cr
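
The "has not been registered" splat can be modelled in userspace as
follows. This is a hedged sketch with hypothetical toy_* names. It
assumes that the key embedded in struct gro_cells lives in dynamically
allocated memory, and that lockdep accepts such a key only after it has
been registered (in the kernel, via lockdep_register_key(), with
lockdep_unregister_key() before the memory is freed). Keys in static
storage, which lockdep accepts without registration, are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model (hypothetical names), not the kernel's data structures. */
struct toy_key { char pad; };
struct toy_gro_cells { struct toy_key cells_bh_key; };

#define TOY_MAX_KEYS 16
static const struct toy_key *toy_registered[TOY_MAX_KEYS];
static size_t toy_nr_registered;

/* Make a dynamically allocated key known to the checker. */
static void toy_register_key(const struct toy_key *key)
{
	assert(toy_nr_registered < TOY_MAX_KEYS);
	toy_registered[toy_nr_registered++] = key;
}

/* Forget a key; must happen before its memory is freed. */
static void toy_unregister_key(const struct toy_key *key)
{
	for (size_t i = 0; i < toy_nr_registered; i++) {
		if (toy_registered[i] == key) {
			toy_registered[i] = toy_registered[--toy_nr_registered];
			return;
		}
	}
}

/* Stand-in for the failing check: unknown keys are rejected. */
static bool toy_set_class(const struct toy_key *key)
{
	for (size_t i = 0; i < toy_nr_registered; i++)
		if (toy_registered[i] == key)
			return true;
	return false;
}

/* v1 pattern: the heap-allocated key is used without registration. */
static bool toy_init_unregistered(void)
{
	struct toy_gro_cells *gcells = calloc(1, sizeof(*gcells));
	bool ok;

	assert(gcells);
	ok = toy_set_class(&gcells->cells_bh_key);
	free(gcells);
	return ok;
}

/* Register first, use the key, unregister before freeing. */
static bool toy_init_registered(void)
{
	struct toy_gro_cells *gcells = calloc(1, sizeof(*gcells));
	bool ok;

	assert(gcells);
	toy_register_key(&gcells->cells_bh_key);
	ok = toy_set_class(&gcells->cells_bh_key);
	toy_unregister_key(&gcells->cells_bh_key);
	free(gcells);
	return ok;
}
```

Here toy_init_unregistered() fails the check in the same way as the
boot log above, while toy_init_registered() passes. The cost is a
register/unregister pair tied to the lifetime of every device.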


* [syzbot ci] Re: net: gro_cells: Provide lockdep class for gro_cell's bh_lock
  2025-11-04 11:12 [PATCH net] net: gro_cells: Provide lockdep class for gro_cell's bh_lock Sebastian Andrzej Siewior
  2025-11-04 14:05 ` Jakub Kicinski
@ 2025-11-04 14:22 ` syzbot ci
  1 sibling, 0 replies; 5+ messages in thread
From: syzbot ci @ 2025-11-04 14:22 UTC (permalink / raw)
  To: bigeasy, clrkwllms, davem, edumazet, gal, horms, kuba,
	linux-rt-devel, netdev, pabeni, rostedt
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v1] net: gro_cells: Provide lockdep class for gro_cell's bh_lock
https://lore.kernel.org/all/20251104111201.5eBxkOKb@linutronix.de
* [PATCH net] net: gro_cells: Provide lockdep class for gro_cell's bh_lock

and found the following issue:
BUG: key ADDR has not been registered!

Full report is available here:
https://ci.syzbot.org/series/487d8c1b-77a3-45c6-af6d-8195d5c60ad7

***

BUG: key ADDR has not been registered!

tree:      net
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net.git
base:      3d18a84eddde169d6dbf3c72cc5358b988c347d0
arch:      amd64
compiler:  Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config:    https://ci.syzbot.org/builds/45da8225-040a-4fda-9352-295e63030683/config

gb_gbphy: registered new driver usb
asus_wmi: ASUS WMI generic driver loaded
gnss: GNSS driver registered with major 494
usbcore: registered new interface driver gnss-usb
usbcore: registered new interface driver hdm_usb
usbcore: registered new interface driver snd-usb-audio
usbcore: registered new interface driver snd-ua101
usbcore: registered new interface driver snd-usb-usx2y
usbcore: registered new interface driver snd-usb-us122l
usbcore: registered new interface driver snd-usb-caiaq
usbcore: registered new interface driver snd-usb-6fire
usbcore: registered new interface driver snd-usb-hiface
usbcore: registered new interface driver snd-bcd2000
usbcore: registered new interface driver snd_usb_pod
usbcore: registered new interface driver snd_usb_podhd
usbcore: registered new interface driver snd_usb_toneport
usbcore: registered new interface driver snd_usb_variax
drop_monitor: Initializing network drop monitor service
NET: Registered PF_LLC protocol family
GACT probability on
Mirror/redirect action on
Simple TC action Loaded
netem: version 1.3
u32 classifier
    Performance counters on
    input device check on
    Actions configured
nf_conntrack_irc: failed to register helpers
nf_conntrack_sane: failed to register helpers
nf_conntrack_sip: failed to register helpers
xt_time: kernel timezone is -0000
IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
IPVS: Connection hash table configured (size=4096, memory=32Kbytes)
IPVS: ipvs loaded.
IPVS: [rr] scheduler registered.
IPVS: [wrr] scheduler registered.
IPVS: [lc] scheduler registered.
IPVS: [wlc] scheduler registered.
IPVS: [fo] scheduler registered.
IPVS: [ovf] scheduler registered.
IPVS: [lblc] scheduler registered.
IPVS: [lblcr] scheduler registered.
IPVS: [dh] scheduler registered.
IPVS: [sh] scheduler registered.
IPVS: [mh] scheduler registered.
IPVS: [sed] scheduler registered.
IPVS: [nq] scheduler registered.
IPVS: [twos] scheduler registered.
IPVS: [sip] pe registered.
ipip: IPv4 and MPLS over IPv4 tunneling driver
BUG: key ffff88816c216e68 has not been registered!
------------[ cut here ]------------
DEBUG_LOCKS_WARN_ON(1)
WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:4976 lockdep_init_map_type+0x241/0x380
Modules linked in:
CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:lockdep_init_map_type+0x241/0x380
Code: 75 cd 90 e8 21 dc e8 02 85 c0 74 22 83 3d 2a c9 df 0d 00 75 19 90 48 c7 c7 36 2e 8f 8d 48 c7 c6 93 c8 7e 8d e8 00 dd e5 ff 90 <0f> 0b 90 90 65 48 8b 05 93 fc d0 10 48 3b 44 24 08 0f 85 14 01 00
RSP: 0000:ffffc900000672a8 EFLAGS: 00010246
RAX: 40a598123e634300 RBX: ffffe8fee481dde8 RCX: ffff888102688000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002
RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000004
R10: dffffc0000000000 R11: fffffbfff1bba650 R12: 0000000000000000
R13: 0000000000000000 R14: ffff88816c216e68 R15: ffffe8fee481dde8
FS:  0000000000000000(0000) GS:ffff88818eb3c000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88823ffff000 CR3: 000000000dd38000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 gro_cells_init+0x26c/0x380
 ip_tunnel_init+0xc9/0x650
 register_netdevice+0x6bf/0x1ae0
 __ip_tunnel_create+0x3e7/0x560
 ip_tunnel_init_net+0x2ba/0x800
 ops_init+0x35c/0x5c0
 register_pernet_operations+0x336/0x800
 register_pernet_device+0x2a/0x80
 ipip_init+0x1d/0xd0
 do_one_initcall+0x236/0x820
 do_initcall_level+0x104/0x190
 do_initcalls+0x59/0xa0
 kernel_init_freeable+0x334/0x4b0
 kernel_init+0x1d/0x1d0
 ret_from_fork+0x4bc/0x870
 ret_from_fork_asm+0x1a/0x30
 </TASK>


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.


* [PATCH net v2] net: gro_cells: Reduce lock scope in gro_cell_poll
  2025-11-04 14:05 ` Jakub Kicinski
@ 2025-11-04 15:34   ` Sebastian Andrzej Siewior
  2025-11-06  1:50     ` patchwork-bot+netdevbpf
  0 siblings, 1 reply; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-11-04 15:34 UTC (permalink / raw)
  To: netdev
  Cc: Jakub Kicinski, Eric Dumazet, Gal Pressman, linux-rt-devel,
	David S. Miller, Paolo Abeni, Simon Horman, Clark Williams,
	Steven Rostedt

One GRO-cell device's NAPI callback can nest into the GRO-cell of
another device if the underlying device is also using GRO-cell.
This is the case for IPsec over vxlan.
These two GRO-cells are separate devices. From lockdep's point of view
it is the same because each device is sharing the same lock class and so
it reports a possible deadlock assuming one device is nesting into
itself.

Hold the bh_lock only while accessing gro_cell::napi_skbs in
gro_cell_poll(). This reduces the locking scope and avoids acquiring the
same lock class multiple times.

Fixes: 25718fdcbdd2 ("net: gro_cells: Use nested-BH locking for gro_cell")
Reported-by: Gal Pressman <gal@nvidia.com>
Closes: https://lore.kernel.org/all/66664116-edb8-48dc-ad72-d5223696dd19@nvidia.com/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
v1…v2:
  - Drop the lock and reacquire it in gro_cell_poll() instead of
    providing a lock class. The additional lock class would need to be
    registered and unregistered, and the latter must not happen from
    the RCU callback. This looks simpler.
    Reported by Jakub.

 net/core/gro_cells.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c
index fd57b845de333..a725d21159a6f 100644
--- a/net/core/gro_cells.c
+++ b/net/core/gro_cells.c
@@ -60,9 +60,10 @@ static int gro_cell_poll(struct napi_struct *napi, int budget)
 	struct sk_buff *skb;
 	int work_done = 0;
 
-	__local_lock_nested_bh(&cell->bh_lock);
 	while (work_done < budget) {
+		__local_lock_nested_bh(&cell->bh_lock);
 		skb = __skb_dequeue(&cell->napi_skbs);
+		__local_unlock_nested_bh(&cell->bh_lock);
 		if (!skb)
 			break;
 		napi_gro_receive(napi, skb);
@@ -71,7 +72,6 @@ static int gro_cell_poll(struct napi_struct *napi, int budget)
 
 	if (work_done < budget)
 		napi_complete_done(napi, work_done);
-	__local_unlock_nested_bh(&cell->bh_lock);
 	return work_done;
 }
 
-- 
2.51.0
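
The effect of the narrower lock scope can be sketched with a userspace
model. The toy_* names are hypothetical and the model is deliberately
crude: one shared lock class is represented by a depth counter, and
delivering a packet from the lower device is assumed to enqueue it into
the upper device's cell under that same class. The NAPI budget is not
modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical names): one lock class shared by all cells. */
static int toy_depth;		/* how often the class is currently held */
static int toy_violations;	/* same-class acquisitions while held */

static void toy_lock(void)
{
	if (toy_depth > 0)
		toy_violations++;
	toy_depth++;
}

static void toy_unlock(void)
{
	assert(toy_depth > 0);
	toy_depth--;
}

struct toy_cell {
	int queued;		/* packets waiting in this cell */
	struct toy_cell *upper;	/* device stacked on top, if any */
};

/* Enqueue into a cell; takes the lock around the queue access. */
static void toy_enqueue(struct toy_cell *cell)
{
	toy_lock();
	cell->queued++;
	toy_unlock();
}

/* Poll loop with either the old (wide) or the v2 (narrow) lock scope. */
static int toy_poll(struct toy_cell *cell, bool narrow_scope)
{
	int work_done = 0;

	if (!narrow_scope)
		toy_lock();
	for (;;) {
		bool have_skb;

		if (narrow_scope)
			toy_lock();
		have_skb = cell->queued > 0;
		if (have_skb)
			cell->queued--;
		if (narrow_scope)
			toy_unlock();
		if (!have_skb)
			break;
		/* Delivery reaches the upper device's cell. */
		if (cell->upper)
			toy_enqueue(cell->upper);
		work_done++;
	}
	if (!narrow_scope)
		toy_unlock();
	return work_done;
}

/* Poll a lower cell holding three packets; return the violation count. */
static int toy_run(bool narrow_scope)
{
	struct toy_cell upper = { .queued = 0, .upper = NULL };
	struct toy_cell lower = { .queued = 3, .upper = &upper };

	toy_depth = 0;
	toy_violations = 0;
	toy_poll(&lower, narrow_scope);
	return toy_violations;
}
```

With the wide scope every delivery takes the class while it is already
held, so toy_run(false) counts one violation per packet. With the
narrow scope the lock is dropped before delivery and toy_run(true)
counts none, without needing an additional lock class.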



* Re: [PATCH net v2] net: gro_cells: Reduce lock scope in gro_cell_poll
  2025-11-04 15:34   ` [PATCH net v2] net: gro_cells: Reduce lock scope in gro_cell_poll Sebastian Andrzej Siewior
@ 2025-11-06  1:50     ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-11-06  1:50 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: netdev, kuba, edumazet, gal, linux-rt-devel, davem, pabeni, horms,
	clrkwllms, rostedt

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 4 Nov 2025 16:34:35 +0100 you wrote:
> One GRO-cell device's NAPI callback can nest into the GRO-cell of
> another device if the underlying device is also using GRO-cell.
> This is the case for IPsec over vxlan.
> These two GRO-cells are separate devices. From lockdep's point of view
> it is the same because each device is sharing the same lock class and so
> it reports a possible deadlock assuming one device is nesting into
> itself.
> 
> [...]

Here is the summary with links:
  - [net,v2] net: gro_cells: Reduce lock scope in gro_cell_poll
    https://git.kernel.org/netdev/net/c/d917c217b612

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


