From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 85E16239E74; Sun, 1 Jun 2025 23:30:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748820611; cv=none; b=daH8V97z6yj7KCTqpweybPfD1kvsaxcYUHwzxzEkPGB04JTXmwaAge9pjJmSqvUpmGGAJ/A1H1HKQWPF/iGBW5E9+hEiRydy319uqgg6iAGTQ7lHeWN5Su/ZD38/OxrAA+VPtpHWbnWYY/mcNOQSnUXOP6hdrkkJMGMUQlWa1e4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748820611; c=relaxed/simple; bh=1W/GNfD6/8yRUIJ+eaRPM7gWNexMe8+FyA1ybtFBDrg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=HWzN/tkOWaFceFXQlDzzEzyaDHEkaXeuA87EYXcNbZuJ7XWLCeRqSdT11hVmLQX2lY5gOlp/MbE29vyzfZfTY50GwSRSKRCBJXhjzG1UchGVx2z1SfzieBYOBhujxjzDw18UjGN66wojbGEynU3CWGiFaaHGuYw7pqkk4GKrnVE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=s9SLFJk0; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="s9SLFJk0" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 91486C4CEF3; Sun, 1 Jun 2025 23:30:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1748820611; bh=1W/GNfD6/8yRUIJ+eaRPM7gWNexMe8+FyA1ybtFBDrg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=s9SLFJk05XQBjqeGJalxLBO5ddzyry7RJW/tn9xJH7BthXAYVwqg+eznD//3tLWB7 QsmMdqzWBISMvsVibrdmUwqQ4EljjXVS7rrkAuHphToEcxs9Ox73Xew5ByfKzJXZe6 uniN/NmE9MMulPDl/CRo7zjWotWdGpTZxMwa2hvE4hIH8zjeric8p13bdjywwJUHhu 08k7GXaLdd4++dcB6UIKmxeXSrJLrLKuksZBWcNz4MwqJ8H/DP/mzyNK9BtXFUNpaJ jrciYgTHaWjUbAUFN03tvyWZiohNxHVgbNOg0d5T0szM0wIKrzSA3rJa97wyodzL+j jWm7eLPPMPBaA== From: Sasha Levin To: patches@lists.linux.dev, stable@vger.kernel.org Cc: Jiayuan Chen , syzbot+01affb1491750534256d@syzkaller.appspotmail.com, Tejun Heo , Sasha Levin , linux-kernel@vger.kernel.org Subject: [PATCH AUTOSEL 6.14 012/102] workqueue: Fix race condition in wq->stats incrementation Date: Sun, 1 Jun 2025 19:28:04 -0400 Message-Id: <20250601232937.3510379-12-sashal@kernel.org> X-Mailer: git-send-email 2.39.5 In-Reply-To: <20250601232937.3510379-1-sashal@kernel.org> References: <20250601232937.3510379-1-sashal@kernel.org> Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 6.14.9 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit From: Jiayuan Chen [ Upstream commit 70e1683ca3a6474360af1d3a020a9a98c8492cc0 ] Fixed a race condition in incrementing wq->stats[PWQ_STAT_COMPLETED] by moving the operation under pool->lock. Reported-by: syzbot+01affb1491750534256d@syzkaller.appspotmail.com Signed-off-by: Jiayuan Chen Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin --- **YES** This commit should be backported to stable kernel trees. Here's my detailed analysis: ## Race Condition Analysis The commit fixes a **data race condition** in `pwq->stats[PWQ_STAT_COMPLETED]++` that is directly related to a previously fixed race condition for `pwq->stats[PWQ_STAT_STARTED]++` (commit fe48ba7daefe). **The Problem:** - Line 3244 in the current kernel shows `pwq->stats[PWQ_STAT_COMPLETED]++` is executed **outside** the `pool->lock` protection - Multiple worker threads can concurrently increment this counter, creating a race condition identical to the one fixed in commit fe48ba7daefe - The syzbot report confirms this is a real issue affecting production systems **The Fix:** The commit moves `pwq->stats[PWQ_STAT_COMPLETED]++` from line ~3244 (before `lock_map_release`) to after `raw_spin_lock_irq(&pool->lock)` at line ~3273, ensuring atomic access under lock protection. ## Backport Suitability Criteria **✅ Fixes Important Bug:** Yes - race conditions can lead to incorrect statistics and potential corruption **✅ Small and Contained:** The change is minimal - just moving one line of code to a different location **✅ No Side Effects:** Moving the stats increment under lock protection has no functional side effects beyond fixing the race **✅ No Architectural Changes:** This is purely a synchronization fix with no design changes **✅ Critical Subsystem:** Workqueue is a core kernel subsystem used throughout the kernel **✅ Clear Stable Tree Compliance:** - Important bugfix ✓ - Minimal risk ✓ - Confined to workqueue subsystem ✓ - Reported by automated testing (syzbot) ✓ ## Historical Precedent The commit directly parallels the **previously backported** commit fe48ba7daefe (Similar Commit #1), which fixed the same type of race condition for `PWQ_STAT_STARTED`. That commit was marked "Backport Status: YES" and demonstrates that: 1. KCSAN-detected race conditions in workqueue stats are considered backport-worthy 2. Moving stats increments under lock protection is an established fix pattern 3. The workqueue maintainer (Tejun Heo) accepts such fixes ## Risk Assessment **Minimal Risk:** The change only affects when the counter is incremented (under lock vs. outside lock). Since both locations are in the same execution path, there's no functional behavior change - only proper synchronization. **No Regression Potential:** The fix follows the exact same pattern as the successful previous fix, reducing the likelihood of introducing new issues. This commit represents a clear, low-risk bug fix that addresses a real concurrency issue in a critical kernel subsystem, making it an ideal candidate for stable tree backporting. kernel/workqueue.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index bfe030b443e27..2e7845fe47e47 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -3241,7 +3241,7 @@ __acquires(&pool->lock) * point will only record its address. */ trace_workqueue_execute_end(work, worker->current_func); - pwq->stats[PWQ_STAT_COMPLETED]++; + lock_map_release(&lockdep_map); if (!bh_draining) lock_map_release(pwq->wq->lockdep_map); @@ -3272,6 +3272,8 @@ __acquires(&pool->lock) raw_spin_lock_irq(&pool->lock); + pwq->stats[PWQ_STAT_COMPLETED]++; + /* * In addition to %WQ_CPU_INTENSIVE, @worker may also have been marked * CPU intensive by wq_worker_tick() if @work hogged CPU longer than -- 2.39.5