From: Tejun Heo
To: linux-kernel@vger.kernel.org, sched-ext@lists.linux.dev
Cc: void@manifault.com, arighi@nvidia.com, changwoo@igalia.com, emil@etsalapatis.com, hannes@cmpxchg.org, mkoutny@suse.com, cgroups@vger.kernel.org, Tejun Heo
Subject: [PATCH 20/34] sched_ext: Factor out scx_dispatch_sched()
Date: Tue, 24 Feb 2026 19:00:55 -1000
Message-ID: <20260225050109.1070059-21-tj@kernel.org>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260225050109.1070059-1-tj@kernel.org>
References: <20260225050109.1070059-1-tj@kernel.org>
X-Mailing-List: cgroups@vger.kernel.org

In preparation for multiple scheduler support, factor out
scx_dispatch_sched() from balance_one(). The function boundary makes
remembering $prev_on_scx and $prev_on_rq less useful. Open-code
$prev_on_scx in balance_one() and $prev_on_rq in both balance_one() and
scx_dispatch_sched().

No functional changes.
Signed-off-by: Tejun Heo
---
 kernel/sched/ext.c | 123 ++++++++++++++++++++++++---------------------
 1 file changed, 65 insertions(+), 58 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 2aab3ccbd3e3..99ef2a1cc3ac 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -2384,67 +2384,22 @@ static inline void maybe_queue_balance_callback(struct rq *rq)
 	rq->scx.flags &= ~SCX_RQ_BAL_CB_PENDING;
 }
 
-static int balance_one(struct rq *rq, struct task_struct *prev)
+static bool scx_dispatch_sched(struct scx_sched *sch, struct rq *rq,
+			       struct task_struct *prev)
 {
-	struct scx_sched *sch = scx_root;
 	struct scx_dsp_ctx *dspc = this_cpu_ptr(scx_dsp_ctx);
 	bool prev_on_scx = prev->sched_class == &ext_sched_class;
-	bool prev_on_rq = prev->scx.flags & SCX_TASK_QUEUED;
 	int nr_loops = SCX_DSP_MAX_LOOPS;
 	s32 cpu = cpu_of(rq);
 
-	lockdep_assert_rq_held(rq);
-	rq->scx.flags |= SCX_RQ_IN_BALANCE;
-	rq->scx.flags &= ~SCX_RQ_BAL_KEEP;
-
-	if ((sch->ops.flags & SCX_OPS_HAS_CPU_PREEMPT) &&
-	    unlikely(rq->scx.cpu_released)) {
-		/*
-		 * If the previous sched_class for the current CPU was not SCX,
-		 * notify the BPF scheduler that it again has control of the
-		 * core. This callback complements ->cpu_release(), which is
-		 * emitted in switch_class().
-		 */
-		if (SCX_HAS_OP(sch, cpu_acquire))
-			SCX_CALL_OP(sch, SCX_KF_REST, cpu_acquire, rq, cpu, NULL);
-		rq->scx.cpu_released = false;
-	}
-
-	if (prev_on_scx) {
-		update_curr_scx(rq);
-
-		/*
-		 * If @prev is runnable & has slice left, it has priority and
-		 * fetching more just increases latency for the fetched tasks.
-		 * Tell pick_task_scx() to keep running @prev. If the BPF
-		 * scheduler wants to handle this explicitly, it should
-		 * implement ->cpu_release().
-		 *
-		 * See scx_disable_workfn() for the explanation on the bypassing
-		 * test.
-		 */
-		if (prev_on_rq && prev->scx.slice && !scx_bypassing(sch, cpu)) {
-			rq->scx.flags |= SCX_RQ_BAL_KEEP;
-			goto has_tasks;
-		}
-	}
-
-	/* if there already are tasks to run, nothing to do */
-	if (rq->scx.local_dsq.nr)
-		goto has_tasks;
-
 	if (consume_global_dsq(sch, rq))
-		goto has_tasks;
+		return true;
 
-	if (scx_bypassing(sch, cpu)) {
-		if (consume_dispatch_q(sch, rq, bypass_dsq(sch, cpu)))
-			goto has_tasks;
-		else
-			goto no_tasks;
-	}
+	if (scx_bypassing(sch, cpu))
+		return consume_dispatch_q(sch, rq, bypass_dsq(sch, cpu));
 
 	if (unlikely(!SCX_HAS_OP(sch, dispatch)) || !scx_rq_online(rq))
-		goto no_tasks;
+		return false;
 
 	dspc->rq = rq;
 
@@ -2463,14 +2418,14 @@ static int balance_one(struct rq *rq, struct task_struct *prev)
 
 		flush_dispatch_buf(sch, rq);
 
-		if (prev_on_rq && prev->scx.slice) {
+		if ((prev->scx.flags & SCX_TASK_QUEUED) && prev->scx.slice) {
 			rq->scx.flags |= SCX_RQ_BAL_KEEP;
-			goto has_tasks;
+			return true;
 		}
 		if (rq->scx.local_dsq.nr)
-			goto has_tasks;
+			return true;
 		if (consume_global_dsq(sch, rq))
-			goto has_tasks;
+			return true;
 
 		/*
 		 * ops.dispatch() can trap us in this loop by repeatedly
@@ -2479,7 +2434,7 @@ static int balance_one(struct rq *rq, struct task_struct *prev)
 		 * balance(), we want to complete this scheduling cycle and then
 		 * start a new one. IOW, we want to call resched_curr() on the
 		 * next, most likely idle, task, not the current one. Use
-		 * scx_kick_cpu() for deferred kicking.
+		 * __scx_bpf_kick_cpu() for deferred kicking.
 		 */
 		if (unlikely(!--nr_loops)) {
 			scx_kick_cpu(sch, cpu, 0);
@@ -2487,12 +2442,64 @@ static int balance_one(struct rq *rq, struct task_struct *prev)
 		}
 	} while (dspc->nr_tasks);
 
-no_tasks:
+	return false;
+}
+
+static int balance_one(struct rq *rq, struct task_struct *prev)
+{
+	struct scx_sched *sch = scx_root;
+	s32 cpu = cpu_of(rq);
+
+	lockdep_assert_rq_held(rq);
+	rq->scx.flags |= SCX_RQ_IN_BALANCE;
+	rq->scx.flags &= ~SCX_RQ_BAL_KEEP;
+
+	if ((sch->ops.flags & SCX_OPS_HAS_CPU_PREEMPT) &&
+	    unlikely(rq->scx.cpu_released)) {
+		/*
+		 * If the previous sched_class for the current CPU was not SCX,
+		 * notify the BPF scheduler that it again has control of the
+		 * core. This callback complements ->cpu_release(), which is
+		 * emitted in switch_class().
+		 */
+		if (SCX_HAS_OP(sch, cpu_acquire))
+			SCX_CALL_OP(sch, SCX_KF_REST, cpu_acquire, rq, cpu, NULL);
+		rq->scx.cpu_released = false;
+	}
+
+	if (prev->sched_class == &ext_sched_class) {
+		update_curr_scx(rq);
+
+		/*
+		 * If @prev is runnable & has slice left, it has priority and
+		 * fetching more just increases latency for the fetched tasks.
+		 * Tell pick_task_scx() to keep running @prev. If the BPF
+		 * scheduler wants to handle this explicitly, it should
+		 * implement ->cpu_release().
+		 *
+		 * See scx_disable_workfn() for the explanation on the bypassing
+		 * test.
+		 */
+		if ((prev->scx.flags & SCX_TASK_QUEUED) && prev->scx.slice &&
+		    !scx_bypassing(sch, cpu)) {
+			rq->scx.flags |= SCX_RQ_BAL_KEEP;
+			goto has_tasks;
+		}
+	}
+
+	/* if there already are tasks to run, nothing to do */
+	if (rq->scx.local_dsq.nr)
+		goto has_tasks;
+
+	/* dispatch @sch */
+	if (scx_dispatch_sched(sch, rq, prev))
+		goto has_tasks;
+
 	/*
 	 * Didn't find another task to run. Keep running @prev unless
 	 * %SCX_OPS_ENQ_LAST is in effect.
 	 */
-	if (prev_on_rq &&
+	if ((prev->scx.flags & SCX_TASK_QUEUED) &&
 	    (!(sch->ops.flags & SCX_OPS_ENQ_LAST) || scx_bypassing(sch, cpu))) {
 		rq->scx.flags |= SCX_RQ_BAL_KEEP;
 		__scx_add_event(sch, SCX_EV_DISPATCH_KEEP_LAST, 1);
-- 
2.53.0