From: Thomas Gleixner
To: Calvin Owens
Cc: Borislav Petkov, Petr Mladek, linux-kernel@vger.kernel.org,
    arighi@nvidia.com, yaozhenguo1@gmail.com, tj@kernel.org,
    feng.tang@linux.alibaba.com, lirongqing@baidu.com, realwujing@gmail.com,
    hu.shengming@zte.com.cn, dianders@chromium.org, joel.granados@kernel.org,
    Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Valentin Schneider, Frederic Weisbecker, Anna-Maria Behnsen,
    x86@kernel.org
Subject: Re: [PATCH] clockevents: Prevent timer interrupt starvation
References: <87v7ejetl1.ffs@tglx> <875x6a913n.ffs@tglx>
    <20260401163435.GGac1JG42tWmsCKL37@fat_crate.local>
    <87jyup70ka.ffs@tglx> <87bjg06r7t.ffs@tglx> <87zf3j6f88.ffs@tglx>
Date: Wed, 08 Apr 2026 10:52:13 +0200
Message-ID: <875x614ywy.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 03 2026 at 17:15, Calvin Owens wrote:
> On Friday 04/03 at 21:00 +0200, Thomas Gleixner wrote:
>> Btw, I'm really curious how you deduced the reproducer from systemd
>> code. I assume you figured somehow out which program triggered the
>> behaviour and then inspected the source to find something fishy. Can you
>> provide a pointer to the code in question? If they really do what your
>> reproducer does, then this code needs to be fixed too :)
>
> I pulled the text that was executing when the NMI fired out of the dump:
>
> 00 ba 38 03 00 00 48 8d 35 ce 40 18 00 48 8d 3d 16 41 18 00 e8 11 14
> e8 ff b8 f4 ff ff ff e9 6d ff ff ff 0f 1f 80 00 00 00 00 0f b6 4f 2f
> 48 8d 15 e5 5f 26 00 48 89 c8 83 e0 03 48 c1 e0 05 48
>
> ...and searched for it in systemd-networkd and all its libs.
> It appears in one spot in libsystemd-shared-259.so in path_hash_func(),
> so that must be where the userspace %ip was when the NMI fired.

Amazing.

> Unfortunately that has too many callers: I couldn't narrow it down
> meaningfully from there. Despite staring at a lot of timer code in
> systemd, I haven't yet found anything concrete that might cause buggy
> behavior.
>
> But, it stuck out at me that the detritus on the stack wasn't futex() or
> poll() or read() related. It seemed wildly improbable that the NMI
> would have just happened to catch systemd-networkd running like that, I
> guessed it was probably spinning around timerfd_settime() in userspace
> when the NMI fired (with calls to path_hash_func() somehow in-between).

Right, and there is an explicit timerfd_settime(... { 0, 1 }) in the
event management code.

> My initial guess was that the trigger was something about waiting on the
> timer in a different thread than it was set on. I started to write that
> out as a small reproducer, but almost jokingly thought, "well, I should
> just try setting them blindly first and see if that works", and then my
> head exploded when it actually did :)

:)

> I've tried overloading the machine, and triggering some unrealistically
> large time steps back and forth underneath it. But I can't get systemd
> to stick itself in any sort of loop like that, or even set a single
> timer expiry to an unreasonable value.
>
> I think I will set up a little BPF thing to force systemd-networkd to
> dump core if it makes timerfd_settime() calls too quickly or with
> abstime arguments in the past, hopefully from the core I can work out
> what was going on. But any better suggestions are welcome.

It just occurred to me that with the hrtimer changes, you might be able
to utilize the new hrtimer_start_expires tracepoint and enable user
stack traces to get down to the actual root cause.

Thanks,

        tglx