From: Thomas Gleixner
To: Nathan Chancellor
Cc: LKML, Anna-Maria Behnsen, John Stultz, Stephen Boyd, Daniel Lezcano,
 Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
 Ben Segall, Mel Gorman, Valentin Schneider, x86@kernel.org,
 Peter Zijlstra, Frederic Weisbecker, Eric Dumazet
Subject: Re: [patch 20/48] x86/apic: Enable TSC coupled programming mode
In-Reply-To: <20260303173809.GA1114907@ax162>
References: <20260224163022.795809588@kernel.org>
 <20260224163430.076565985@kernel.org> <20260303012905.GA978396@ax162>
 <87jyvtyo6o.ffs@tglx> <20260303173809.GA1114907@ax162>
Date: Tue, 03 Mar 2026 21:21:52 +0100
Message-ID: <87ecm0zmsf.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 03 2026 at 10:38, Nathan Chancellor wrote:
> On Tue, Mar 03, 2026 at 03:37:03PM +0100, Thomas Gleixner wrote:
>> On Mon, Mar 02 2026 at 18:29, Nathan Chancellor wrote:
>> >
>> > After this change landed in -next as commit f246ec3478cf ("x86/apic:
>> > Enable TSC coupled programming mode"), two of my Intel-based test
>> > machines fail to boot. Unfortunately, I do not think I have any serial
>> > access on these, so I have little introspective ability. Is there any
>> > information I can provide or patches I can test to try and help figure
>> > out what is going on here? I have attached the output of lscpu of both
>> > machines, in case there is some common thread there.
>>
>> Grmbl. I stared at it for a while and I have a suspicion. Can you try
>> the patch below and also provide from one of the machines the output of
>>
>>    dmesg | grep -i tsc
>
> This patch works on both machines, so your suspicion seemed spot on.
>
> Output of that dmesg command appears to be the same between
> 89f951a1e8ad and f246ec3478cf with that diff applied:
>
> [ 0.000000] tsc: Detected 2500.000 MHz processor
> [ 0.000000] tsc: Detected 2496.000 MHz TSC
> [ 0.008989] TSC deadline timer available
> [ 0.119139] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x23fa772cf26, max_idle_ns: 440795269835 ns
> [ 0.312141] clocksource: Switched to clocksource tsc-early
> [ 0.322686] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23fa772cf26, max_idle_ns: 440795269835 ns
> [ 0.322951] clocksource: Switched to clocksource tsc

Ha! That's exactly what I suspected. What happens is:

TSC-early is installed, which is neither valid for high resolution
timers nor for coupled mode. A bit later TSC is installed with the same
frequency as TSC-early. That means the shift/mult pair does not change,
so the update of maxns is never invoked. maxns simply stays 0, so the
timer is always armed for an event in the past and the machine dies from
a TSC deadline timer interrupt storm.

On all my test machines the TSC frequency is refined against HPET and
installed late, and that refinement always changes the shift/mult pair,
so I never ran into this situation and obviously did not think about it
either.

Let me write a proper change log and get this into the tip tree.

Thanks for testing!

       tglx
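
The failure sequence described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code and not the patch from this thread: the names `Clocksource`, `CoupledState`, `coupled_ok`, `install_buggy` and `install_fixed` are hypothetical, the mult/shift values are illustrative, and the "fixed" variant is just one way the update could be made unconditional. Only the max_cycles value is taken from the dmesg output quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clocksource:
    name: str
    mult: int          # cycles -> ns multiplier (illustrative value)
    shift: int         # cycles -> ns shift (illustrative value)
    max_cycles: int    # taken from the quoted dmesg output
    coupled_ok: bool   # hypothetical flag: usable for coupled mode


@dataclass
class CoupledState:
    """Hypothetical cache of conversion data used when arming the timer."""
    mult: int = 0
    shift: int = 0
    max_ns: int = 0    # 0 means the limit was never computed


def compute_max_ns(cs):
    # Longest interval that can be converted without overflow (simplified).
    return (cs.max_cycles * cs.mult) >> cs.shift


def install_buggy(state, cs):
    # Assumed structure of the bug: max_ns is refreshed only when the
    # shift/mult pair changes AND the clocksource is valid for coupled mode.
    pair_changed = (cs.mult, cs.shift) != (state.mult, state.shift)
    state.mult, state.shift = cs.mult, cs.shift
    if cs.coupled_ok and pair_changed:
        state.max_ns = compute_max_ns(cs)


def install_fixed(state, cs):
    # Hypothetical remedy: refresh max_ns whenever a valid clocksource is
    # installed, regardless of whether the pair changed.
    state.mult, state.shift = cs.mult, cs.shift
    if cs.coupled_ok:
        state.max_ns = compute_max_ns(cs)


def deadline_ns(state, now_ns, delta_ns):
    # The requested delta is clamped to max_ns before the timer is armed.
    return now_ns + min(delta_ns, state.max_ns)


def boot(install):
    # tsc-early first, then tsc with the same frequency (same mult/shift).
    state = CoupledState()
    early = Clocksource("tsc-early", 6721640, 24, 0x23FA772CF26, False)
    final = Clocksource("tsc", 6721640, 24, 0x23FA772CF26, True)
    install(state, early)
    install(state, final)
    return state


NOW_NS = 1_000_000
DELTA_NS = 4_000_000

buggy = boot(install_buggy)
fixed = boot(install_fixed)

print("buggy: max_ns =", buggy.max_ns,
      "deadline offset =", deadline_ns(buggy, NOW_NS, DELTA_NS) - NOW_NS)
print("fixed: max_ns =", fixed.max_ns,
      "deadline offset =", deadline_ns(fixed, NOW_NS, DELTA_NS) - NOW_NS)
```

In the buggy variant the second install sees an unchanged pair, so the clamp limit stays 0 and every deadline collapses to "now", which by the time it is programmed is already in the past. On a machine where late refinement changes the pair, the same code path would refresh the limit, which matches why the problem did not show up on such machines.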