All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Changwoo Min <multics69@gmail.com>
Cc: tj@kernel.org, void@manifault.com, mingo@redhat.com,
	peterz@infradead.org, changwoo@igalia.com, kernel-dev@igalia.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 2/6] sched_ext: Implement scx_rq_clock_update/stale()
Date: Mon, 9 Dec 2024 10:40:33 +0100	[thread overview]
Message-ID: <Z1a7EfETQi3FSLJG@gpd3> (raw)
In-Reply-To: <20241209061531.257531-3-changwoo@igalia.com>

On Mon, Dec 09, 2024 at 03:15:27PM +0900, Changwoo Min wrote:
> scx_rq_clock_update() and scx_rq_clock_stale() manage the status of an
> rq clock when sched_ext is enabled. scx_rq_clock_update() keeps the rq
> clock in memory and its status valid. scx_rq_clock_stale() invalidates
> the current rq clock not to use the cached rq clock.
> 
> Signed-off-by: Changwoo Min <changwoo@igalia.com>
> ---
>  kernel/sched/sched.h | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 440ecedf871b..7e71d8685fcc 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -754,6 +754,7 @@ enum scx_rq_flags {
>  	SCX_RQ_BAL_PENDING	= 1 << 2, /* balance hasn't run yet */
>  	SCX_RQ_BAL_KEEP		= 1 << 3, /* balance decided to keep current */
>  	SCX_RQ_BYPASSING	= 1 << 4,
> +	SCX_RQ_CLK_VALID	= 1 << 5, /* RQ clock is fresh and valid */
>  
>  	SCX_RQ_IN_WAKEUP	= 1 << 16,
>  	SCX_RQ_IN_BALANCE	= 1 << 17,
> @@ -766,9 +767,11 @@ struct scx_rq {
>  	unsigned long		ops_qseq;
>  	u64			extra_enq_flags;	/* see move_task_to_local_dsq() */
>  	u32			nr_running;
> -	u32			flags;
>  	u32			cpuperf_target;		/* [0, SCHED_CAPACITY_SCALE] */
>  	bool			cpu_released;
> +	u32			flags;
> +	u64			clock;			/* current per-rq clock -- see scx_bpf_now_ns() */
> +	u64			prev_clock;		/* previous per-rq clock -- see scx_bpf_now_ns() */

Since we're reordering this struct, we may want to move cpu_released all
the way to the bottom to get rid of the 3-bytes hole (and still have
flags, clock and prev_clock in the same cacheline).

>  	cpumask_var_t		cpus_to_kick;
>  	cpumask_var_t		cpus_to_kick_if_idle;
>  	cpumask_var_t		cpus_to_preempt;
> @@ -1725,9 +1728,28 @@ DECLARE_STATIC_KEY_FALSE(__scx_switched_all);	/* all fair class tasks on SCX */
>  
>  #define scx_enabled()		static_branch_unlikely(&__scx_ops_enabled)
>  #define scx_switched_all()	static_branch_unlikely(&__scx_switched_all)
> +
> +static inline void scx_rq_clock_update(struct rq *rq, u64 clock)
> +{
> +	if (scx_enabled()) {
> +		rq->scx.prev_clock = rq->scx.clock;
> +		rq->scx.clock = clock;
> +		rq->scx.flags |= SCX_RQ_CLK_VALID;
> +	}
> +}

Nit, this is just personal preference (feel free to ignore it):

	if (!scx_enabled())
		return;
	rq->scx.prev_clock = rq->scx.clock;
	rq->scx.clock = clock;
	rq->scx.flags |= SCX_RQ_CLK_VALID;

> +
> +static inline void scx_rq_clock_stale(struct rq *rq)
> +{
> +	if (scx_enabled())
> +		rq->scx.flags &= ~SCX_RQ_CLK_VALID;
> +}

I'm wondering if we need to invalidate the clock on all rqs when we call
scx_ops_enable() to prevent getting stale information from a previous
scx scheduler.

Probably it's not an issue, since scx_ops_disable_workfn() should make
sure that all tasks are going through rq_unpin_lock() before unloading
the current scheduler, maybe it could be helpful to add comment about
this scenario in scx_bpf_now_ns() (PATCH 4/6)?

> +
>  #else /* !CONFIG_SCHED_CLASS_EXT */
>  #define scx_enabled()		false
>  #define scx_switched_all()	false
> +
> +static inline void scx_rq_clock_update(struct rq *rq, u64 clock) {}
> +static inline void scx_rq_clock_stale(struct rq *rq) {}
>  #endif /* !CONFIG_SCHED_CLASS_EXT */
>  
>  /*
> -- 
> 2.47.1
> 

Thanks,
-Andrea

  reply	other threads:[~2024-12-09  9:40 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-09  6:15 [PATCH v4 0/6] sched_ext: Support high-performance monotonically non-decreasing clock Changwoo Min
2024-12-09  6:15 ` [PATCH v4 1/6] sched_ext: Relocate scx_enabled() related code Changwoo Min
2024-12-11  7:27   ` Tejun Heo
2024-12-11  7:37     ` Tejun Heo
2024-12-09  6:15 ` [PATCH v4 2/6] sched_ext: Implement scx_rq_clock_update/stale() Changwoo Min
2024-12-09  9:40   ` Andrea Righi [this message]
2024-12-10  7:21     ` Changwoo Min
2024-12-11  7:43   ` Tejun Heo
2024-12-13  1:16     ` Changwoo Min
2024-12-09  6:15 ` [PATCH v4 3/6] sched_ext: Manage the validity of scx_rq_clock Changwoo Min
2024-12-09  6:15 ` [PATCH v4 4/6] sched_ext: Implement scx_bpf_now_ns() Changwoo Min
2024-12-11  8:14   ` Tejun Heo
2024-12-13  1:41     ` Changwoo Min
2024-12-11  9:32   ` Peter Zijlstra
2024-12-13  2:01     ` Changwoo Min
2024-12-09  6:15 ` [PATCH v4 5/6] sched_ext: Add scx_bpf_now_ns() for BPF scheduler Changwoo Min
2024-12-09  6:15 ` [PATCH v4 6/6] sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now_ns() Changwoo Min
2024-12-09  9:51 ` [PATCH v4 0/6] sched_ext: Support high-performance monotonically non-decreasing clock Andrea Righi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z1a7EfETQi3FSLJG@gpd3 \
    --to=arighi@nvidia.com \
    --cc=changwoo@igalia.com \
    --cc=kernel-dev@igalia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=multics69@gmail.com \
    --cc=peterz@infradead.org \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.