From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id CCE7B25A2C6
	for ; Tue, 7 Apr 2026 14:52:51 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1775573571; cv=none;
	b=TLzEsKiRJBo1jiVDQn7gymnzjKIQ7hM6nsaRpv9uMLiviqCNU6U+Kkdu93QKS3Bfcy39CmxF7acmh90BhV2T5RgGhpJt/kV4Da+SrnhowD0XPu7gqc4qyci/fA6MPc18Y8hPa6/l+iOq9XxDeo+xdIy5D7Dww3Hw4eiAsd6Wq6k=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1775573571; c=relaxed/simple;
	bh=HxK0Rx9Gb+ba2EIN2ER5TUbKQGckzc9XZb4m2s2IPAk=;
	h=Subject:To:Cc:From:Date:Message-ID:MIME-Version:Content-Type;
	b=Lxa5JqROWAe8u+jsTPymkYuiAUMJJSt6xolcAueUb32dbNrpnhxnaeGAByzSU7AvmeEbfwCfOwFmnD/ZxXefZcc9OtNpEX/wkmXhdmd6RABi8Cy3xM3NNX1uIywaDoRhNiSbDZ7YZrtHkYVJn+RSmarzoLSSNsv2B+tO2jtJ6yo=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=oU1xucni;
	arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="oU1xucni"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2D536C116C6;
	Tue, 7 Apr 2026 14:52:51 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
	s=korg; t=1775573571;
	bh=HxK0Rx9Gb+ba2EIN2ER5TUbKQGckzc9XZb4m2s2IPAk=;
	h=Subject:To:Cc:From:Date:From;
	b=oU1xucni34keDj34/AuI1At2qeVn+OHkicqts7oCk92hIgxI4AUCm5kygx0Pn1NAj
	 kfaDqTSJyH1Ib7FWd47KEvyl6JHI11+WqrF96tg5WjHOhIfpL2aNz5Mx+qFVPUH4dW
	 eKRuCUUa7X0mDVsz+/T7Uu83n5WWGj1gwaavu7tQ=
Subject: FAILED: patch "[PATCH] workqueue: Fix false positive stall reports" failed to
 apply to 5.15-stable tree
To: song@kernel.org,tj@kernel.org
Cc:
From:
Date: Tue, 07 Apr 2026 16:52:38 +0200
Message-ID: <2026040738-stack-private-1bad@gregkh>
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 8bit


The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to .

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x c7f27a8ab9f2f43570f0725256597a0d7abe2c5b
#
git commit -s
git send-email --to '' --in-reply-to '2026040738-stack-private-1bad@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..

Possible dependencies:



thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

>From c7f27a8ab9f2f43570f0725256597a0d7abe2c5b Mon Sep 17 00:00:00 2001
From: Song Liu
Date: Sat, 21 Mar 2026 20:30:45 -0700
Subject: [PATCH] workqueue: Fix false positive stall reports

On weakly ordered architectures (e.g., arm64), the lockless check in
wq_watchdog_timer_fn() can observe a reordering between the worklist
insertion and the last_progress_ts update. Specifically, the watchdog
can see a non-empty worklist (from a list_add) while reading a stale
last_progress_ts value, causing a false positive stall report.

This was confirmed by reading pool->last_progress_ts again after holding
pool->lock in wq_watchdog_timer_fn():

  workqueue watchdog: pool 7 false positive detected!
  lockless_ts=4784580465 locked_ts=4785033728
  diff=453263ms worklist_empty=0

To avoid slowing down the hot path (queue_work, etc.), recheck
last_progress_ts with pool->lock held. This will eliminate the false
positive with minimal overhead.
Remove two extra empty lines in wq_watchdog_timer_fn() as we are on it.

Fixes: 82607adcf9cd ("workqueue: implement lockup detector")
Cc: stable@vger.kernel.org # v4.5+
Assisted-by: claude-code:claude-opus-4-6
Signed-off-by: Song Liu
Signed-off-by: Tejun Heo

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index b77119d71641..ff97b705f25e 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7699,8 +7699,28 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
 		else
 			ts = touched;
 
-		/* did we stall? */
+		/*
+		 * Did we stall?
+		 *
+		 * Do a lockless check first. On weakly ordered
+		 * architectures, the lockless check can observe a
+		 * reordering between worklist insert_work() and
+		 * last_progress_ts update from __queue_work(). Since
+		 * __queue_work() is a much hotter path than the timer
+		 * function, we handle false positive here by reading
+		 * last_progress_ts again with pool->lock held.
+		 */
 		if (time_after(now, ts + thresh)) {
+			scoped_guard(raw_spinlock_irqsave, &pool->lock) {
+				pool_ts = pool->last_progress_ts;
+				if (time_after(pool_ts, touched))
+					ts = pool_ts;
+				else
+					ts = touched;
+			}
+			if (!time_after(now, ts + thresh))
+				continue;
+
 			lockup_detected = true;
 			stall_time = jiffies_to_msecs(now - pool_ts) / 1000;
 			max_stall_time = max(max_stall_time, stall_time);
@@ -7712,8 +7732,6 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
 			pr_cont_pool_info(pool);
 			pr_cont(" stuck for %us!\n", stall_time);
 		}
-
-
 	}
 
 	if (lockup_detected)
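
For anyone preparing the 5.15.y backport, the pattern the patch relies on
(a cheap lockless check, confirmed under the lock only when it reports a
stall) can be sketched in plain userspace C as below. This is only an
illustration and not kernel code: struct demo_pool, demo_pool_init(),
demo_time_after(), demo_latest() and demo_pool_stalled() are hypothetical
names, a C11 atomic_flag spinlock stands in for pool->lock, and the possibly
stale lockless read is passed in as a parameter so that the reordering can
be simulated.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct worker_pool. */
struct demo_pool {
	atomic_flag lock;                       /* stands in for pool->lock */
	_Atomic unsigned long last_progress_ts; /* written by the queueing side */
};

static void demo_pool_init(struct demo_pool *pool, unsigned long ts)
{
	atomic_flag_clear(&pool->lock);
	atomic_init(&pool->last_progress_ts, ts);
}

/* Wraparound-safe "a is after b", the same idea as the kernel's time_after(). */
static bool demo_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Pick the later of the pool timestamp and the "touched" timestamp. */
static unsigned long demo_latest(unsigned long pool_ts, unsigned long touched)
{
	return demo_time_after(pool_ts, touched) ? pool_ts : touched;
}

/*
 * lockless_ts is the value the caller read without the lock; it may be
 * stale. A stall is reported only if it is still visible after the
 * timestamp has been re-read with the lock held.
 */
static bool demo_pool_stalled(struct demo_pool *pool, unsigned long lockless_ts,
			      unsigned long touched, unsigned long now,
			      unsigned long thresh)
{
	unsigned long ts = demo_latest(lockless_ts, touched);

	assert(pool);

	/* Fast path: the lockless view shows no stall, so skip the lock. */
	if (!demo_time_after(now, ts + thresh))
		return false;

	/* Slow path: take the lock and read the timestamp again. */
	while (atomic_flag_test_and_set_explicit(&pool->lock, memory_order_acquire))
		; /* spin */
	ts = demo_latest(atomic_load_explicit(&pool->last_progress_ts,
					      memory_order_relaxed), touched);
	atomic_flag_clear_explicit(&pool->lock, memory_order_release);

	return demo_time_after(now, ts + thresh);
}
```

In this sketch the lock is taken only after the lockless check has already
flagged a stall, so the common no-stall case pays nothing extra, which is the
trade-off the commit message describes for the timer function versus
__queue_work().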