From: Fernand Sieber
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
CC: Tejun Heo, David Vernet, Andrea Righi, Changwoo Min,
    Dietmar Eggemann, Ben Segall, Mel Gorman, Fahad Mubeen,
    "Hendrik Borghorst", David Woodhouse, Fernand Sieber
Subject: [PATCH 2/2] sched/ext: add cgroup_set_runtime ops callback
Date: Mon, 25 May 2026 21:36:22 +0200
Message-ID: <20260525193622.70282-3-sieberf@amazon.com>
In-Reply-To: <20260525193622.70282-1-sieberf@amazon.com>
References: <20260525193622.70282-1-sieberf@amazon.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Add a sched_ext_ops callback that is invoked when userspace writes to
cpu.max.runtime. This allows BPF schedulers to be notified when runtime
credits are injected into a cgroup, enabling SCX-side credit tracking.

The callback includes change detection (only fires when the value
changes) and caches the value in tg->scx.bw_runtime_us.

Signed-off-by: Fernand Sieber
---
 include/linux/sched/ext.h   |  1 +
 kernel/sched/core.c         |  2 ++
 kernel/sched/ext.c          | 17 +++++++++++++++++
 kernel/sched/ext.h          |  2 ++
 kernel/sched/ext_internal.h | 12 ++++++++++++
 5 files changed, 34 insertions(+)

diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
index 2129e18ad..591801a50 100644
--- a/include/linux/sched/ext.h
+++ b/include/linux/sched/ext.h
@@ -273,6 +273,7 @@ struct scx_task_group {
 	u64			bw_period_us;
 	u64			bw_quota_us;
 	u64			bw_burst_us;
+	u64			bw_runtime_us;
 	bool			idle;
 #endif
 };
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d92e5840b..369dd03d3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10165,6 +10165,8 @@ static int cpu_runtime_write_u64(struct cgroup_subsys_state *css,
 
 	cfs_b->runtime = (u64)runtime_us * NSEC_PER_USEC;
 	raw_spin_unlock_irq(&cfs_b->lock);
+
+	scx_group_set_runtime(tg, runtime_us);
 	return 0;
 }
 
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 827a96e39..2ce505ad8 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -4488,6 +4488,23 @@ void scx_group_set_bandwidth(struct task_group *tg,
 
 	percpu_up_read(&scx_cgroup_ops_rwsem);
 }
+
+void scx_group_set_runtime(struct task_group *tg, u64 runtime_us)
+{
+	struct scx_sched *sch;
+
+	percpu_down_read(&scx_cgroup_ops_rwsem);
+	sch = scx_root;
+
+	if (scx_cgroup_enabled && SCX_HAS_OP(sch, cgroup_set_runtime) &&
+	    tg->scx.bw_runtime_us != runtime_us)
+		SCX_CALL_OP(sch, cgroup_set_runtime, NULL,
+			    tg_cgrp(tg), runtime_us);
+
+	tg->scx.bw_runtime_us = runtime_us;
+
+	percpu_up_read(&scx_cgroup_ops_rwsem);
+}
 #endif /* CONFIG_EXT_GROUP_SCHED */
 
 #if defined(CONFIG_EXT_GROUP_SCHED) || defined(CONFIG_EXT_SUB_SCHED)
diff --git a/kernel/sched/ext.h b/kernel/sched/ext.h
index 0b7fc46ae..00103ec3d 100644
--- a/kernel/sched/ext.h
+++ b/kernel/sched/ext.h
@@ -81,6 +81,7 @@ void scx_cgroup_cancel_attach(struct cgroup_taskset *tset);
 void scx_group_set_weight(struct task_group *tg, unsigned long cgrp_weight);
 void scx_group_set_idle(struct task_group *tg, bool idle);
 void scx_group_set_bandwidth(struct task_group *tg, u64 period_us, u64 quota_us, u64 burst_us);
+void scx_group_set_runtime(struct task_group *tg, u64 runtime_us);
 #else /* CONFIG_EXT_GROUP_SCHED */
 static inline void scx_tg_init(struct task_group *tg) {}
 static inline int scx_tg_online(struct task_group *tg) { return 0; }
@@ -91,5 +92,6 @@ static inline void scx_cgroup_cancel_attach(struct cgroup_taskset *tset) {}
 static inline void scx_group_set_weight(struct task_group *tg, unsigned long cgrp_weight) {}
 static inline void scx_group_set_idle(struct task_group *tg, bool idle) {}
 static inline void scx_group_set_bandwidth(struct task_group *tg, u64 period_us, u64 quota_us, u64 burst_us) {}
+static inline void scx_group_set_runtime(struct task_group *tg, u64 runtime_us) {}
 #endif /* CONFIG_EXT_GROUP_SCHED */
 #endif /* CONFIG_CGROUP_SCHED */
diff --git a/kernel/sched/ext_internal.h b/kernel/sched/ext_internal.h
index a075732d4..21e6ab7af 100644
--- a/kernel/sched/ext_internal.h
+++ b/kernel/sched/ext_internal.h
@@ -739,6 +739,18 @@ struct sched_ext_ops {
 	 */
 	void (*cgroup_set_idle)(struct cgroup *cgrp, bool idle);
 
+	/**
+	 * @cgroup_set_runtime: A cgroup's runtime is being set directly
+	 * @cgrp: cgroup whose runtime is being set
+	 * @runtime_us: runtime in microseconds
+	 *
+	 * Update @cgrp's available runtime. This is from the cpu.max.runtime
+	 * cgroup interface. @runtime_us is the total runtime budget that the
+	 * cgroup may consume. The BPF scheduler should track this value and
+	 * throttle tasks in @cgrp once the budget is exhausted.
+	 */
+	void (*cgroup_set_runtime)(struct cgroup *cgrp, u64 runtime_us);
+
 #endif /* CONFIG_EXT_GROUP_SCHED */
 
 	/**
-- 
2.47.3
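
For readers who want to see how the change detection interacts with
scheduler-side credit tracking, here is a small userspace model in plain C.
It is only a sketch, not kernel or BPF code. struct model_group,
model_cgroup_set_runtime(), model_group_set_runtime() and model_charge() are
hypothetical names invented for illustration. The only part taken from the
patch is the control flow of scx_group_set_runtime(): notify when the value
differs from the cached one, then always update the cache. How a real BPF
scheduler would account consumed runtime is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one cgroup; none of these names exist in the kernel. */
struct model_group {
	uint64_t cached_runtime_us;	/* plays the role of tg->scx.bw_runtime_us */
	uint64_t budget_us;		/* scheduler-side remaining credit (assumed) */
	unsigned int nr_notified;	/* number of times the callback was invoked */
};

/* Stand-in for a scheduler's cgroup_set_runtime: adopt the new budget. */
static void model_cgroup_set_runtime(struct model_group *grp, uint64_t runtime_us)
{
	grp->budget_us = runtime_us;
	grp->nr_notified++;
}

/* Same control flow as scx_group_set_runtime(): notify on change, always cache. */
static void model_group_set_runtime(struct model_group *grp, bool has_op,
				    uint64_t runtime_us)
{
	assert(grp);
	if (has_op && grp->cached_runtime_us != runtime_us)
		model_cgroup_set_runtime(grp, runtime_us);
	grp->cached_runtime_us = runtime_us;
}

/* Charge consumed time; returns true once the modeled budget is exhausted. */
static bool model_charge(struct model_group *grp, uint64_t used_us)
{
	assert(grp);
	grp->budget_us = used_us >= grp->budget_us ? 0 : grp->budget_us - used_us;
	return grp->budget_us == 0;
}
```

In this model, writing 5000, charging 2000, and then writing 5000 again
leaves the budget at 3000, because the second write matches the cached value
and the callback is skipped. Writing a different value (say 7000) invokes
the callback again and replaces the budget.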