From: Thomas Gleixner
To: Olle Lögdahl, "linux-kernel@vger.kernel.org"
Cc: "frederic@kernel.org", "anna-maria@linutronix.de"
Subject: Re: [BUG] hrtimer: null deref in hrtimer_next_event_without when entering idle
Date: Mon, 15 Dec 2025 12:13:59 +0100
Message-ID: <878qf4dmko.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Dec 13 2025 at 08:55, Olle Lögdahl wrote:
> I encountered a kernel panic with a null-pointer dereference in the
> hrtimer system on kernel 6.17.9-arch1-1 (x86_64) when entering idle.
> The crash occurred in __hrtimer_next_event_base+0x4c.
>
> [137017.825435] BUG: kernel NULL pointer dereference, address: 0000000000000018
> [137017.825450] #PF: supervisor read access in kernel mode
> [137017.825457] #PF: error_code(0x0000) - not-present page
> [137017.825464] PGD 1719cb067 P4D 1719cb067 PUD 17ca5a067 PMD 0
> [137017.825483] Oops: Oops: 0000 [#1] SMP NOPTI
> [137017.825495] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: P OE 6.17.9-arch1-1 #1 PREEMPT(full) 71adf6020e7d04ea315feaf360c679be0fb5cb04
> [137017.825510] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> [137017.825516] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4207 12/08/2018
> [137017.825523] RIP: 0010:__hrtimer_next_event_base+0x4c/0xb0
> [137017.825538] Code: 0f bc c9 89 cd 48 8d 45 01 48 c1 e0 06 4c 01 e0 74 32 ba 01 00 00 00 48 8b 40 28 d3 e2 f7 d2 21 d3 49 39 c7 74 43 48 c1 e5 06 <48> 8b 50 18 49 2b 54 2c 78 4c 39 ea 7d 08 4d 85 ff 74 1f 49 89 d5
> [137017.825546] RSP: 0018:ffffffffa3603d90 EFLAGS: 00010056
> [137017.825555] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [137017.825562] RDX: 00000000fffffffe RSI: ffff8a6b0ee216d8 RDI: ffff8a6b0ee21100
> [137017.825569] RBP: 0000000000000000 R08: ffffffffa3603d78 R09: 0000000000000018
> [137017.825575] R10: 00000000ffffffff R11: 000000000000012d R12: ffff8a6b0ee21100
> [137017.825582] R13: 7fffffffffffffff R14: 071c71c71c71c71c R15: ffff8a6b0ee216d8
> [137017.825589] FS: 0000000000000000(0000) GS:ffff8a6b6ab09000(0000) knlGS:0000000000000000
> [137017.825596] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [137017.825603] CR2: 0000000000000018 CR3: 000000026e066000 CR4: 00000000003506f0
> [137017.825610] Call Trace:
> [137017.825618]  <TASK>
> [137017.825631]  hrtimer_next_event_without+0x56/0x90
> [137017.825644]  tick_nohz_get_sleep_length+0x86/0xa0
> [137017.825659]  menu_select+0x391/0x680
> [137017.825677]  do_idle+0x18b/0x210
> [137017.825693]  cpu_startup_entry+0x29/0x30
> [137017.825704]  rest_init+0xcc/0xd0
> [137017.825718]  start_kernel+0x9a2/0x9b0
> [137017.825735]  x86_64_start_reservations+0x24/0x30
> [137017.825748]  x86_64_start_kernel+0xd1/0xe0
> [137017.825760]  common_startup_64+0x13e/0x141
> [137017.825783]  </TASK>
>
> Disassembling the code at RIP shows the faulting instruction is:
>   2a: 48 8b 50 18    mov rdx,QWORD PTR [rax+0x18]

This looks like reading hrtimer::_softexpires and the hrtimer pointer is
NULL.

> Looking at the preceding code, rax was loaded from another structure
> at offset 0x28:
>   17: 48 8b 40 28    mov rax,QWORD PTR [rax+0x28]

That's loading the next node from the clock base.

That means the clock base is marked active but has no timer queued. I
have no idea how that can happen as all related operations are holding
the relevant base lock.

> I have not been able to reproduce this yet. I'd be interested in
> working on a fix if guidance can be provided on the root cause.

No idea how this can be chased down unless you have a halfway reliable
reproducer which reproduces without that (whatever it is) module loaded:

> [137017.825510] Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE

Thanks,

        tglx
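
For illustration only, the inconsistent state described above (a clock
base whose active bit is set while nothing is queued on it) can be modelled
in a few lines of userspace C. This is a rough sketch, not kernel code: the
toy_* structures, their fields and both helper functions are hypothetical
stand-ins invented for this example, and they do not reproduce the layout or
the locking of the real hrtimer code.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_BASES 4

/* Hypothetical, simplified stand-ins; not the kernel's struct layouts. */
struct toy_timer {
	int64_t expires;
};

struct toy_clock_base {
	const struct toy_timer *first;	/* earliest queued timer, NULL if empty */
	int64_t offset;
};

struct toy_cpu_base {
	unsigned int active_bases;	/* bit i set: clock_base[i] claims queued timers */
	struct toy_clock_base clock_base[TOY_NR_BASES];
};

/* Count bases that violate "active bit set implies a queued timer". */
static int toy_count_stale_bases(const struct toy_cpu_base *cpu_base)
{
	int stale = 0;

	for (int i = 0; i < TOY_NR_BASES; i++) {
		int active = (cpu_base->active_bases >> i) & 1U;

		if (active && cpu_base->clock_base[i].first == NULL)
			stale++;
	}
	return stale;
}

/*
 * Earliest expiry over all bases marked active. This toy version skips
 * a stale base; a loop that trusts the bitmask would dereference NULL.
 */
static int64_t toy_next_event(const struct toy_cpu_base *cpu_base)
{
	int64_t earliest = INT64_MAX;

	for (int i = 0; i < TOY_NR_BASES; i++) {
		const struct toy_clock_base *base = &cpu_base->clock_base[i];
		int64_t expires;

		if (!((cpu_base->active_bases >> i) & 1U))
			continue;
		if (base->first == NULL)	/* active bit set, nothing queued */
			continue;

		expires = base->first->expires - base->offset;
		if (expires < earliest)
			earliest = expires;
	}
	return earliest;
}
```

A check along the lines of toy_count_stale_bases() is the kind of assertion
one might add as temporary debug instrumentation while hunting for a
reproducer. Skipping the stale base as toy_next_event() does would only hide
the symptom, not explain how the bitmask and the queue got out of sync.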