From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v5 13/29] sched/rt: Implement dl-server operations for rt-cgroups
Date: Thu, 30 Apr 2026 23:38:17 +0200
Message-ID: <20260430213835.62217-14-yurand2000@gmail.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260430213835.62217-1-yurand2000@gmail.com>
References: <20260430213835.62217-1-yurand2000@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Implement rt_server_pick, the callback that deadline servers use to pick the task to schedule.

rt_server_pick(): pick the next runnable rt task and tell the scheduler that it is going to be scheduled next.

Let enqueue_task_rt start the attached deadline server when the first task is enqueued on a given rq/server. The server is not symmetrically stopped in dequeue_task_rt: it is stopped when server_pick_task returns NULL (see deadline.c).

Change update_curr_rt to perform a deadline server update when the updated task is served by a non-root group.

Update inc/dec_dl_tasks to account for the number of active tasks in the local runqueue of rt-cgroup servers: their local runqueue is distinct from the global runqueue, so when an rt-group server is activated/deactivated, the number of served tasks must be added/removed as well. This uses nr_running to stay compatible with future dl-server interfaces.
Also account for the deadline server itself, so that it is picked for shutdown when its runqueue is empty (future patches will try to pull tasks before stopping it).

Update inc/dec_rt_prio_smp to change an rq's cpupri only if the rt_rq is the global runqueue, since cgroups are scheduled via their dl-server's priority.

Update inc/dec_rt_tasks to account for waking/sleeping tasks on the global runqueue when the task runs in the root cgroup or its local dl-server is active. The accounting is skipped while servers are throttled, as they add/subtract their number of running tasks when they get enqueued/dequeued. For rt-cgroups, account for the number of active tasks in the nr_running field of the local runqueue (add/sub_nr_running), as this number is used when a dl-server is enqueued/dequeued.

Update set_task_rq to record the dl_rq, tracking which deadline server manages a task, and to stop using the parent field, which is unused by this patchset's code. Remove the now-unused parent field from sched_rt_entity.
Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: luca abeni
Signed-off-by: luca abeni
Signed-off-by: Yuri Andriaccio
---
 include/linux/sched.h   |  1 -
 kernel/sched/deadline.c |  8 ++++++
 kernel/sched/rt.c       | 60 ++++++++++++++++++++++++++++++++++++++---
 kernel/sched/sched.h    |  8 +++++-
 4 files changed, 71 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index eb8b57f689b5..ea2e74598b93 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -630,7 +630,6 @@ struct sched_rt_entity {
 	struct sched_rt_entity		*back;
 
 #ifdef CONFIG_RT_GROUP_SCHED
-	struct sched_rt_entity		*parent;
 	/* rq on which this entity is (to be) queued: */
 	struct rt_rq			*rt_rq;
 	/* rq "owned" by this entity/group: */
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 084af1d375b5..c82810732106 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2093,6 +2093,10 @@ void inc_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
 
 	if (!dl_server(dl_se))
 		add_nr_running(rq_of_dl_rq(dl_rq), 1);
+	else if (rq_of_dl_se(dl_se) != dl_se->my_q) {
+		WARN_ON(dl_se->my_q->rt.rt_nr_running != dl_se->my_q->nr_running);
+		add_nr_running(rq_of_dl_rq(dl_rq), dl_se->my_q->nr_running + 1);
+	}
 
 	inc_dl_deadline(dl_rq, deadline);
 }
@@ -2105,6 +2109,10 @@ void dec_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
 
 	if (!dl_server(dl_se))
 		sub_nr_running(rq_of_dl_rq(dl_rq), 1);
+	else if (rq_of_dl_se(dl_se) != dl_se->my_q) {
+		WARN_ON(dl_se->my_q->rt.rt_nr_running != dl_se->my_q->nr_running);
+		sub_nr_running(rq_of_dl_rq(dl_rq), dl_se->my_q->nr_running - 1);
+	}
 
 	dec_dl_deadline(dl_rq, dl_se->deadline);
 }
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 3d7f2b2ebe60..defb812b0e48 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -144,9 +144,22 @@ void free_rt_sched_group(struct task_group *tg)
 	kfree(tg->dl_se);
 }
 
+static struct sched_rt_entity *pick_next_rt_entity(struct rt_rq *rt_rq);
+static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first);
+
 static struct task_struct *rt_server_pick(struct sched_dl_entity *dl_se, struct rq_flags *rf)
 {
-	return NULL;
+	struct rt_rq *rt_rq = &dl_se->my_q->rt;
+	struct rq *rq = rq_of_rt_rq(rt_rq);
+	struct task_struct *p;
+
+	if (!sched_rt_runnable(dl_se->my_q))
+		return NULL;
+
+	p = rt_task_of(pick_next_rt_entity(rt_rq));
+	set_next_task_rt(rq, p, true);
+
+	return p;
 }
 
 static inline void __rt_rq_free(struct rt_rq **rt_rq)
@@ -462,6 +475,7 @@ static inline int rt_se_prio(struct sched_rt_entity *rt_se)
 static void update_curr_rt(struct rq *rq)
 {
 	struct task_struct *donor = rq->donor;
+	struct rt_rq *rt_rq;
 	s64 delta_exec;
 
 	if (donor->sched_class != &rt_sched_class)
@@ -471,8 +485,18 @@ static void update_curr_rt(struct rq *rq)
 	if (unlikely(delta_exec <= 0))
 		return;
 
-	if (!rt_bandwidth_enabled())
+	if (!rt_group_sched_enabled())
 		return;
+
+	if (!dl_bandwidth_enabled())
+		return;
+
+	rt_rq = rt_rq_of_se(&donor->rt);
+	if (is_dl_group(rt_rq)) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		dl_server_update(dl_se, delta_exec);
+	}
 }
 
 static void
@@ -483,7 +507,7 @@ inc_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
 	/*
 	 * Change rq's cpupri only if rt_rq is the top queue.
 	 */
-	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && &rq->rt != rt_rq)
+	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq))
 		return;
 
 	if (rq->online && prio < prev_prio)
@@ -498,7 +522,7 @@ dec_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
 	/*
 	 * Change rq's cpupri only if rt_rq is the top queue.
 	 */
-	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && &rq->rt != rt_rq)
+	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq))
 		return;
 
 	if (rq->online && rt_rq->highest_prio.curr != prev_prio)
@@ -561,6 +585,16 @@ void inc_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq)
 	rt_rq->rr_nr_running += is_rr_task(rt_se);
 
 	inc_rt_prio(rt_rq, rt_se_prio(rt_se));
+
+	if (rt_group_sched_enabled() && is_dl_group(rt_rq)) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		if (!dl_se->dl_throttled)
+			add_nr_running(rq_of_rt_rq(rt_rq), 1);
+		add_nr_running(served_rq_of_rt_rq(rt_rq), 1);
+	} else {
+		add_nr_running(rq_of_rt_rq(rt_rq), 1);
+	}
 }
 
 static inline
@@ -571,6 +605,16 @@ void dec_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq)
 	rt_rq->rr_nr_running -= is_rr_task(rt_se);
 
 	dec_rt_prio(rt_rq, rt_se_prio(rt_se));
+
+	if (rt_group_sched_enabled() && is_dl_group(rt_rq)) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		if (!dl_se->dl_throttled)
+			sub_nr_running(rq_of_rt_rq(rt_rq), 1);
+		sub_nr_running(served_rq_of_rt_rq(rt_rq), 1);
+	} else {
+		sub_nr_running(rq_of_rt_rq(rt_rq), 1);
+	}
 }
 
 /*
@@ -752,6 +796,14 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags)
 	check_schedstat_required();
 	update_stats_wait_start_rt(rt_rq_of_se(rt_se), rt_se);
 
+	/* Task arriving in an idle group of tasks. */
+	if (rt_group_sched_enabled() &&
+	    is_dl_group(rt_rq) && rt_rq->rt_nr_running == 0) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		dl_server_start(dl_se);
+	}
+
 	enqueue_rt_entity(rt_se, flags);
 
 	if (task_is_blocked(p))
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index ca69d2132061..d949babfe16a 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2292,7 +2292,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 	if (!rt_group_sched_enabled())
 		tg = &root_task_group;
 	p->rt.rt_rq  = tg->rt_rq[cpu];
-	p->rt.parent = tg->rt_se[cpu];
+	p->dl.dl_rq  = &cpu_rq(cpu)->dl;
 #endif /* CONFIG_RT_GROUP_SCHED */
 }
 
@@ -2954,6 +2954,9 @@ static inline void add_nr_running(struct rq *rq, unsigned count)
 	unsigned prev_nr = rq->nr_running;
 	rq->nr_running = prev_nr + count;
 
+	if (rq != cpu_rq(rq->cpu))
+		return;
+
 	if (trace_sched_update_nr_running_tp_enabled()) {
 		call_trace_sched_update_nr_running(rq, count);
 	}
@@ -2967,6 +2970,9 @@ static inline void add_nr_running(struct rq *rq, unsigned count)
 static inline void sub_nr_running(struct rq *rq, unsigned count)
 {
 	rq->nr_running -= count;
+	if (rq != cpu_rq(rq->cpu))
+		return;
+
 	if (trace_sched_update_nr_running_tp_enabled()) {
 		call_trace_sched_update_nr_running(rq, -count);
 	}
-- 
2.53.0