* Re: BUG: general protection fault in hrtimer_try_to_cancel
From: Thomas Gleixner @ 2024-08-25 19:38 UTC
To: Xingyu Li, anna-maria, frederic, linux-kernel
Cc: netdev, Eric Dumazet, Jakub Kicinski
On Sat, Aug 24 2024 at 21:50, Xingyu Li wrote:
Cc=: network folks.
> We found a bug in Linux 6.10. It is probably a null pointer dereference bug.
> The reason is probably that in line 1615 of kernel/time/hrtimer.c,
> before "seq = raw_read_seqcount_begin(&base->seq);", there is no null
> pointer check for 'base'.
So something in the network code invokes hrtimer_cancel() in a teardown
operation on a non-initialized hrtimer, and hrtimers, contrary to the timer
wheel, do not check for initialization. That's trivial to fix, see
below.
But that does not explain the actual root cause. That NULL pointer
dereference is just the messenger.
This is gro_cells related:
> hrtimer_cancel+0x12/0x50 kernel/time/hrtimer.c:1447
> napi_disable+0x1b6/0x210 net/core/dev.c:6648
> gro_cells_destroy+0x12a/0x3d0 net/core/gro_cells.c:116
gro_cells_init() invokes netif_napi_add() for each per cpu
cell. netif_napi_add() invokes hrtimer_init().
So how can gro_cells_destroy() have a non-initialized hrtimer?
I defer that question to the network people.
Thanks,
tglx
> The bug report is as follows, but unfortunately there is no generated
> syzkaller reproducer.
>
> bridge0: port 2(bridge_slave_1) entered disabled state
> bridge_slave_0: left allmulticast mode
> bridge_slave_0: left promiscuous mode
> bridge0: port 1(bridge_slave_0) entered disabled state
> Oops: general protection fault, probably for non-canonical address
> 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI
> KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
> CPU: 0 PID: 29 Comm: kworker/u4:2 Not tainted 6.10.0 #13
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Workqueue: netns cleanup_net
> RIP: 0010:__seqprop_raw_spinlock_sequence include/linux/seqlock.h:226 [inline]
> RIP: 0010:hrtimer_active kernel/time/hrtimer.c:1615 [inline]
> RIP: 0010:hrtimer_try_to_cancel+0x7c/0x410 kernel/time/hrtimer.c:1332
> Code: 2f 12 00 48 8b 5c 24 10 48 8b 44 24 08 42 80 3c 30 00 74 08 48
> 89 df e8 02 3b 75 00 4c 8b 3b 4d 8d 67 10 4c 89 e3 48 c1 eb 03 <42> 8a
> 04 33 84 c0 0f 85 f4 00 00 00 41 8b 2c 24 89 ee 83 e6 01 31
> RSP: 0018:ffffc9000080f7a0 EFLAGS: 00010202
> RAX: 1ffffd1ffff88a14 RBX: 0000000000000002 RCX: 1ffffd1ffff88a15
> RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffe8ffffc45070
> RBP: ffffe8ffffc45070 R08: ffffc9000080f867 R09: 1ffff92000101f0c
> R10: dffffc0000000000 R11: fffff52000101f0d R12: 0000000000000010
> R13: dffffc0000000000 R14: dffffc0000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff888063a00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f09772124b0 CR3: 000000001f978000 CR4: 0000000000350ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> hrtimer_cancel+0x12/0x50 kernel/time/hrtimer.c:1447
> napi_disable+0x1b6/0x210 net/core/dev.c:6648
> gro_cells_destroy+0x12a/0x3d0 net/core/gro_cells.c:116
> unregister_netdevice_many_notify+0x10a5/0x16d0 net/core/dev.c:11239
> cleanup_net+0x764/0xcd0 net/core/net_namespace.c:635
> process_one_work kernel/workqueue.c:3248 [inline]
> process_scheduled_works+0x977/0x1410 kernel/workqueue.c:3329
> worker_thread+0xaa0/0x1020 kernel/workqueue.c:3409
> kthread+0x2eb/0x380 kernel/kthread.c:389
> ret_from_fork+0x49/0x80 arch/x86/kernel/process.c:147
> ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244
> </TASK>
> Modules linked in:
> ---[ end trace 0000000000000000 ]---
> RIP: 0010:__seqprop_raw_spinlock_sequence include/linux/seqlock.h:226 [inline]
> RIP: 0010:hrtimer_active kernel/time/hrtimer.c:1615 [inline]
> RIP: 0010:hrtimer_try_to_cancel+0x7c/0x410 kernel/time/hrtimer.c:1332
> Code: 2f 12 00 48 8b 5c 24 10 48 8b 44 24 08 42 80 3c 30 00 74 08 48
> 89 df e8 02 3b 75 00 4c 8b 3b 4d 8d 67 10 4c 89 e3 48 c1 eb 03 <42> 8a
> 04 33 84 c0 0f 85 f4 00 00 00 41 8b 2c 24 89 ee 83 e6 01 31
> RSP: 0018:ffffc9000080f7a0 EFLAGS: 00010202
> RAX: 1ffffd1ffff88a14 RBX: 0000000000000002 RCX: 1ffffd1ffff88a15
> RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffe8ffffc45070
> RBP: ffffe8ffffc45070 R08: ffffc9000080f867 R09: 1ffff92000101f0c
> R10: dffffc0000000000 R11: fffff52000101f0d R12: 0000000000000010
> R13: dffffc0000000000 R14: dffffc0000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff888063a00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000563cd2cf2058 CR3: 000000001d166000 CR4: 0000000000350ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> ----------------
> Code disassembly (best guess), 1 bytes skipped:
> 0: 12 00 adc (%rax),%al
> 2: 48 8b 5c 24 10 mov 0x10(%rsp),%rbx
> 7: 48 8b 44 24 08 mov 0x8(%rsp),%rax
> c: 42 80 3c 30 00 cmpb $0x0,(%rax,%r14,1)
> 11: 74 08 je 0x1b
> 13: 48 89 df mov %rbx,%rdi
> 16: e8 02 3b 75 00 call 0x753b1d
> 1b: 4c 8b 3b mov (%rbx),%r15
> 1e: 4d 8d 67 10 lea 0x10(%r15),%r12
> 22: 4c 89 e3 mov %r12,%rbx
> 25: 48 c1 eb 03 shr $0x3,%rbx
> * 29: 42 8a 04 33 mov (%rbx,%r14,1),%al <-- trapping instruction
> 2d: 84 c0 test %al,%al
> 2f: 0f 85 f4 00 00 00 jne 0x129
> 35: 41 8b 2c 24 mov (%r12),%ebp
> 39: 89 ee mov %ebp,%esi
> 3b: 83 e6 01 and $0x1,%esi
> 3e: 31 .byte 0x31
---
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index a023946f8558..448bce5e6a05 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -485,6 +485,11 @@ static inline void debug_deactivate(struct hrtimer *timer)
 	trace_hrtimer_cancel(timer);
 }
 
+static inline bool hrtimer_initialized(const struct hrtimer *timer)
+{
+	return timer->base && timer->function;
+}
+
 static struct hrtimer_clock_base *
 __next_base(struct hrtimer_cpu_base *cpu_base, unsigned int *active)
 {
@@ -1285,7 +1290,7 @@ void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 	struct hrtimer_clock_base *base;
 	unsigned long flags;
 
-	if (WARN_ON_ONCE(!timer->function))
+	if (WARN_ON_ONCE(!hrtimer_initialized(timer)))
 		return;
 	/*
 	 * Check whether the HRTIMER_MODE_SOFT bit and hrtimer.is_soft
@@ -1612,6 +1617,9 @@ bool hrtimer_active(const struct hrtimer *timer)
 	struct hrtimer_clock_base *base;
 	unsigned int seq;
 
+	if (WARN_ON_ONCE(!hrtimer_initialized(timer)))
+		return false;
+
 	do {
 		base = READ_ONCE(timer->base);
 		seq = raw_read_seqcount_begin(&base->seq);
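The effect of the check in the patch above can be modelled in userspace. This is only a sketch under stated assumptions, not the kernel implementation: every model_* name is a hypothetical stand-in, the seqcount retry loop is left out, and a zero-filled struct plays the role of the non-initialized hrtimer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct hrtimer_clock_base. */
struct model_base {
	unsigned int seq;
	const void *running;	/* timer whose callback is executing, if any */
};

/* Hypothetical stand-in for struct hrtimer. */
struct model_timer {
	struct model_base *base;
	void (*function)(struct model_timer *);
	bool queued;
};

static void model_noop(struct model_timer *t)
{
	(void)t;
}

/* Same condition as hrtimer_initialized() in the patch. */
static bool model_timer_initialized(const struct model_timer *t)
{
	return t->base && t->function;
}

/* Plays the role of hrtimer_init(): sets both fields the check looks at. */
static void model_timer_init(struct model_timer *t, struct model_base *b)
{
	memset(t, 0, sizeof(*t));
	t->base = b;
	t->function = model_noop;
}

/*
 * Simplified hrtimer_active(): bail out before t->base is dereferenced
 * when the timer was never initialized. Without the guard, the
 * t->base->running access below is the NULL dereference from the report.
 */
static bool model_timer_active(const struct model_timer *t)
{
	if (!model_timer_initialized(t))
		return false;
	return t->queued || t->base->running == t;
}
```

With the guard in place, asking whether a zero-filled model_timer is active reports "not active" instead of faulting. As noted in the mail, this only silences the messenger; it does not explain why the timer was never initialized.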
* Re: BUG: general protection fault in hrtimer_try_to_cancel
From: Eric Dumazet @ 2024-08-26 12:26 UTC
To: Thomas Gleixner
Cc: Xingyu Li, anna-maria, frederic, linux-kernel, netdev,
Jakub Kicinski
On Sun, Aug 25, 2024 at 9:38 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Sat, Aug 24 2024 at 21:50, Xingyu Li wrote:
>
> Cc=: network folks.
>
> > We found a bug in Linux 6.10. It is probably a null pointer dereference bug.
> > The reason is probably that in line 1615 of kernel/time/hrtimer.c,
> > before "seq = raw_read_seqcount_begin(&base->seq);", there is no null
> > pointer check for 'base'.
>
> So something in the network code invokes hrtimer_cancel() in a teardown
> operation on a non-initialized hrtimer, and hrtimers, contrary to the timer
> wheel, do not check for initialization. That's trivial to fix, see
> below.
>
> But that does not explain the actual root cause. That NULL pointer
> dereference is just the messenger.
>
> This is gro_cells related:
>
> > hrtimer_cancel+0x12/0x50 kernel/time/hrtimer.c:1447
> > napi_disable+0x1b6/0x210 net/core/dev.c:6648
> > gro_cells_destroy+0x12a/0x3d0 net/core/gro_cells.c:116
>
> gro_cells_init() invokes netif_napi_add() for each per cpu
> cell. netif_napi_add() invokes hrtimer_init().
>
> So how can gro_cells_destroy() have a non-initialized hrtimer?
>
> I defer that question to the network people
>
> Thanks,
>
> tglx
>
> > The bug report is as follows, but unfortunately there is no generated
> > syzkaller reproducer.
> >
> > bridge0: port 2(bridge_slave_1) entered disabled state
> > bridge_slave_0: left allmulticast mode
> > bridge_slave_0: left promiscuous mode
> > bridge0: port 1(bridge_slave_0) entered disabled state
> > Oops: general protection fault, probably for non-canonical address
> > 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI
> > KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
> > CPU: 0 PID: 29 Comm: kworker/u4:2 Not tainted 6.10.0 #13
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> > Workqueue: netns cleanup_net
> > RIP: 0010:__seqprop_raw_spinlock_sequence include/linux/seqlock.h:226 [inline]
> > RIP: 0010:hrtimer_active kernel/time/hrtimer.c:1615 [inline]
> > RIP: 0010:hrtimer_try_to_cancel+0x7c/0x410 kernel/time/hrtimer.c:1332
> > Code: 2f 12 00 48 8b 5c 24 10 48 8b 44 24 08 42 80 3c 30 00 74 08 48
> > 89 df e8 02 3b 75 00 4c 8b 3b 4d 8d 67 10 4c 89 e3 48 c1 eb 03 <42> 8a
> > 04 33 84 c0 0f 85 f4 00 00 00 41 8b 2c 24 89 ee 83 e6 01 31
> > RSP: 0018:ffffc9000080f7a0 EFLAGS: 00010202
> > RAX: 1ffffd1ffff88a14 RBX: 0000000000000002 RCX: 1ffffd1ffff88a15
> > RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffe8ffffc45070
> > RBP: ffffe8ffffc45070 R08: ffffc9000080f867 R09: 1ffff92000101f0c
> > R10: dffffc0000000000 R11: fffff52000101f0d R12: 0000000000000010
> > R13: dffffc0000000000 R14: dffffc0000000000 R15: 0000000000000000
> > FS: 0000000000000000(0000) GS:ffff888063a00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f09772124b0 CR3: 000000001f978000 CR4: 0000000000350ef0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <TASK>
> > hrtimer_cancel+0x12/0x50 kernel/time/hrtimer.c:1447
> > napi_disable+0x1b6/0x210 net/core/dev.c:6648
> > gro_cells_destroy+0x12a/0x3d0 net/core/gro_cells.c:116
This looks like something that should not happen, unless something
mangled memory.
I had another syzbot report hinting at per-cpu variable corruption of
some sort:
bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
bond0 (unregistering): Released all slaves
Oops: general protection fault, probably for non-canonical address 0xf01ffbfd1fdfa34d: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0x80ffffe8fefd1a68-0x80ffffe8fefd1a6f]
CPU: 1 UID: 0 PID: 64 Comm: kworker/u32:3 Not tainted 6.11.0-rc1-syzkaller-00046-gc91a7dee0555 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Workqueue: netns cleanup_net
RIP: 0010:__skb_unlink include/linux/skbuff.h:2418 [inline]
RIP: 0010:__skb_dequeue include/linux/skbuff.h:2434 [inline]
RIP: 0010:__skb_queue_purge_reason include/linux/skbuff.h:3288 [inline]
RIP: 0010:__skb_queue_purge include/linux/skbuff.h:3294 [inline]
RIP: 0010:gro_cells_destroy net/core/gro_cells.c:118 [inline]
RIP: 0010:gro_cells_destroy+0x1a9/0x4d0 net/core/gro_cells.c:106
Code: e8 03 80 3c 28 00 0f 85 71 02 00 00 48 8d 7a 08 49 8b 47 08 49
c7 07 00 00 00 00 48 89 f9 49 c7 47 08 00 00 00 00 48 c1 e9 03 <80> 3c
29 00 0f 85 2b 02 00 00 48 89 c1 48 89 42 08 48 c1 e9 03 80
RSP: 0018:ffffc90000d17a50 EFLAGS: 00010a06
RAX: ffffffe8fefd1a60 RBX: ffffe8fefd1a6060 RCX: 101ffffd1fdfa34d
RDX: 80ffffe8fefd1a60 RSI: ffffffff89043b68 RDI: 80ffffe8fefd1a68
RBP: dffffc0000000000 R08: 0000000000000005 R09: 0000000000000008
R10: 0000000000000008 R11: 0000000000000000 R12: fffff91fdfa34c0e
R13: 0000000000000001 R14: ffffe8fefd1a6070 R15: ffffe8fefd1a6061
FS: 0000000000000000(0000) GS:ffff88806b100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbf176e2650 CR3: 000000003fb9c000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
ip6gre_dev_free+0x19/0x30 net/ipv6/ip6_gre.c:1444
netdev_run_todo+0x760/0x12d0 net/core/dev.c:10753
cleanup_net+0x591/0xbf0 net/core/net_namespace.c:636
process_one_work+0x9c5/0x1b40 kernel/workqueue.c:3231
process_scheduled_works kernel/workqueue.c:3312 [inline]
worker_thread+0x6c8/0xf20 kernel/workqueue.c:3390
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__skb_unlink include/linux/skbuff.h:2418 [inline]
RIP: 0010:__skb_dequeue include/linux/skbuff.h:2434 [inline]
RIP: 0010:__skb_queue_purge_reason include/linux/skbuff.h:3288 [inline]
RIP: 0010:__skb_queue_purge include/linux/skbuff.h:3294 [inline]
RIP: 0010:gro_cells_destroy net/core/gro_cells.c:118 [inline]
RIP: 0010:gro_cells_destroy+0x1a9/0x4d0 net/core/gro_cells.c:106
Code: e8 03 80 3c 28 00 0f 85 71 02 00 00 48 8d 7a 08 49 8b 47 08 49
c7 07 00 00 00 00 48 89 f9 49 c7 47 08 00 00 00 00 48 c1 e9 03 <80> 3c
29 00 0f 85 2b 02 00 00 48 89 c1 48 89 42 08 48 c1 e9 03 80
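The fault addresses in both reports are KASAN shadow addresses, so they can be cross-checked against the "KASAN:" range lines with a little arithmetic. The sketch below assumes the x86-64 generic KASAN layout: a shadow offset of 0xdffffc0000000000 (the value visible in R10/R13/R14 of the first report and RBP of the second) and a scale shift of 3. The model_* helper names are hypothetical and only mirror the idea of the kernel's address-to-shadow mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 generic KASAN constants. */
#define MODEL_KASAN_SHADOW_OFFSET	0xdffffc0000000000ULL
#define MODEL_KASAN_SHADOW_SCALE_SHIFT	3

/* Address the instrumentation reads to check an access to 'addr'. */
static uint64_t model_mem_to_shadow(uint64_t addr)
{
	return (addr >> MODEL_KASAN_SHADOW_SCALE_SHIFT) +
	       MODEL_KASAN_SHADOW_OFFSET;
}

/* First byte of the 8-byte granule that a shadow address describes. */
static uint64_t model_shadow_to_mem(uint64_t shadow)
{
	return (shadow - MODEL_KASAN_SHADOW_OFFSET) <<
	       MODEL_KASAN_SHADOW_SCALE_SHIFT;
}
```

Applied to the first report, model_shadow_to_mem(0xdffffc0000000002) gives 0x10, the start of the reported null-ptr-deref range, which matches the "lea 0x10(%r15),%r12" with R15 == 0 in the disassembly. Applied to the second report, model_shadow_to_mem(0xf01ffbfd1fdfa34d) gives 0x80ffffe8fefd1a68, the start of the reported wild-memory-access range and the value in RDI.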