From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 29D042765D2;
	Mon, 6 Oct 2025 18:19:07 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1759774750; cv=none; b=my1gU1v4A4bU0dl+qfvSbFvG4LFLWpwc/sdOP9AHtoVmqm61Q/JudWaviISmxWU4bm0EErnE5f0kNMkKdr6K6z6bLilMyspp8oSV+sSXlq190Xj5AbM52JSYpspZ13nyEX8sI1sPoY1lBlQj50V3BUD1caRHqKbY9nRnzTsWvv8=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1759774750; c=relaxed/simple;
	bh=pi9+48WVTh/niaA89TWN/8nPZQAmwD4yjcnsT846FhE=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version:Content-Type; b=jvDhvYSiVUWEuVDBYd7R1bAlhHc0PUx3+zME7buaZX0FgPF0KvBxYCb1jDvwF1d3hluOngV9+1bZHIcrASXW6dOFD2FNelGAxi64QR+MnO1Sg1g7f4BFRSEcTAUxsG1bRAlOPCVu9qss+my8DZ78MPzXYY8r/L4p8l6UObYVj+I=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=mtrCUyd/; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="mtrCUyd/"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 54F5FC4CEF5;
	Mon, 6 Oct 2025 18:19:06 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1759774747;
	bh=pi9+48WVTh/niaA89TWN/8nPZQAmwD4yjcnsT846FhE=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=mtrCUyd/zapq93kkt8wbj7jFO5vmGwbIG1xVgQJn4lop0NefH1d1XkzWaR5EilopC
	 0mukCbSebRUMtMxSykKsthrCIdqz1iVw6aY+RG4A0UER/Sc+GaS4c8kdNTIxq4NEPf
	 vj6VX/kv58jtCs3SGXzX537TAtGP7Tw4LpCRFpX7wH0eKbWGZgj4qsSTw/x9hcdnVn
	 OX67EX7L5hLIFbaIuSmtpTzubAZ0Kz7/im7aqFbwELSUMhflWC8icENLRagnaaTgnS
	 X5C+dXwZIj7T+pxAuyJWGGMGkqKV+A7z1xE78yB0rBz21sSm22pKukAd992HFqG6tX
	 V/2K6k/qB0YdQ==
From: Sasha Levin
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Tejun Heo, Andrea Righi, Sasha Levin, mingo@redhat.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, sched-ext@lists.linux.dev,
	linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.17-6.16] sched_ext: Keep bypass on between enable failure and scx_disable_workfn()
Date: Mon, 6 Oct 2025 14:17:39 -0400
Message-ID: <20251006181835.1919496-7-sashal@kernel.org>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251006181835.1919496-1-sashal@kernel.org>
References: <20251006181835.1919496-1-sashal@kernel.org>
Precedence: bulk
X-Mailing-List: patches@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
X-stable-base: Linux 6.17.1
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Tejun Heo

[ Upstream commit 4a1d9d73aabc8f97f48c4f84f936de3b265ffd6f ]

scx_enable() turns on the bypass mode while enable is in progress. If
enabling fails, it turns off the bypass mode and then triggers scx_error().
scx_error() will trigger scx_disable_workfn() which will turn on the bypass
mode again and unload the failed scheduler.

This moves the system out of bypass mode between the enable error path and
the disable path, which is unnecessary and can be brittle - e.g. the thread
running scx_enable() may already be on the failed scheduler and can be
switched out before it triggers scx_error() leading to a stall. The watchdog
would eventually kick in, so the situation isn't critical but is still
suboptimal.

There is nothing to be gained by turning off the bypass mode between
scx_enable() failure and scx_disable_workfn(). Keep bypass on.

Signed-off-by: Tejun Heo
Acked-by: Andrea Righi
Signed-off-by: Sasha Levin
---

LLM Generated explanations, may be completely bogus:

Based on my comprehensive analysis, here is my determination:

## **YES**

This commit should be backported to stable kernel trees.

### Detailed Analysis:

#### **Context and Timing:**

- **sched_ext introduction**: This feature was first introduced in kernel
  **v6.12** (confirmed via git tag analysis)
- **Applicability**: This commit is only relevant for kernels **v6.12 and
  later** that include sched_ext
- **Current status**: The commit has **already been selected** for
  backporting to 6.17 (commit 2d013fc03f11a with Sasha Levin's sign-off)

#### **Bug Description and Impact:**

The code change addresses a **race condition in the error handling path**
during BPF scheduler enablement:

**Old behavior (buggy):**

1. Line 4580: `scx_bypass(true)` - Enable bypass mode during scheduler
   initialization
2. On error → Line 4708-4711: `scx_bypass(false)` - **Disable bypass mode**
3. Call `scx_error()` which triggers `scx_disable_workfn()`
4. Line 3873 in `scx_disable_workfn()`: `scx_bypass(true)` - Re-enable
   bypass mode

**Problem:** Between steps 2 and 4, the system is **out of bypass mode**
with a failed scheduler. The commit message explicitly states: *"the thread
running scx_enable() may already be on the failed scheduler and can be
switched out before it triggers scx_error() **leading to a stall**"*

**New behavior (fixed):** Simply **removes** the `scx_bypass(false)` call at
line 4710, keeping bypass mode continuously enabled from the failure point
through the entire disable sequence.

#### **Why This Should Be Backported:**

1. **Real Bug**: This fixes an actual stall condition (confirmed by author
   Tejun Heo and acked by Andrea Righi)
2. **User Impact**: While the watchdog eventually recovers, users experience
   **unnecessary stalls** when BPF schedulers fail to load - a real-world
   scenario
3. **Minimal Risk**:
   - **1-line change** (the `scx_bypass(false)` call is replaced by a
     comment)
   - Makes error path **more conservative** (keeps bypass on longer)
   - No new logic introduced
   - Only affects **error conditions**, not normal operation
4. **Stable Tree Criteria Met**:
   - ✅ Fixes important bug affecting users
   - ✅ Doesn't introduce new features
   - ✅ No architectural changes
   - ✅ Minimal regression risk
   - ✅ Confined to sched_ext subsystem
5. **Already Validated**: The autosel process has already selected this for
   6.17, indicating automated analysis confirms its suitability
6. **Active Subsystem**: Multiple sched_ext fixes show this is under active
   maintenance and bug discovery

#### **Code Change Details:**

```c
// kernel/sched/ext.c, line 4708-4712
err_disable_unlock_all:
    scx_cgroup_unlock();
    percpu_up_write(&scx_fork_rwsem);
-   scx_bypass(false);  // REMOVED
+   /* we'll soon enter disable path, keep bypass on */  // ADDED COMMENT
err_disable:
```

This single-line change prevents the problematic window where the system
exits bypass mode between error detection and cleanup, eliminating the
potential for stalls during scheduler enable failures.

 kernel/sched/ext.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 088ceff38c8a4..2ccc885a229d5 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5795,7 +5795,7 @@ static int scx_enable(struct sched_ext_ops *ops, struct bpf_link *link)
 err_disable_unlock_all:
 	scx_cgroup_unlock();
 	percpu_up_write(&scx_fork_rwsem);
-	scx_bypass(false);
+	/* we'll soon enter disable path, keep bypass on */
 err_disable:
 	mutex_unlock(&scx_enable_mutex);
 	/*
-- 
2.51.0