All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Shrikanth Hegde <sshegde@linux.ibm.com>,
	Valentin Schneider <vschneid@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: Re: [PATCH 5/5] sched/fair: Rename set_next_buddy() to set_next_pick()
Date: Tue, 9 Apr 2024 10:32:59 +0200	[thread overview]
Message-ID: <ZhT9O6yd868GuAxr@gmail.com> (raw)
In-Reply-To: <20240408091605.GE21904@noisy.programming.kicks-ass.net>


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Sun, Apr 07, 2024 at 10:43:19AM +0200, Ingo Molnar wrote:
> > This is a mechanism to set the next task_pick target,
> > 'buddy' is too ambiguous and refers to a historic feature we
> > don't have anymore.
> > 
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > ---
> >  kernel/sched/fair.c | 28 +++++++++++++---------------
> >  1 file changed, 13 insertions(+), 15 deletions(-)
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 93ea653065f5..fe730f232ffd 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -3200,7 +3200,16 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >  	hrtick_update(rq);
> >  }
> >  
> > -static void set_next_buddy(struct sched_entity *se);
> > +static void set_next_pick(struct sched_entity *se)
> > +{
> > +	for_each_sched_entity(se) {
> > +		if (SCHED_WARN_ON(!se->on_rq))
> > +			return;
> > +		if (se_is_idle(se))
> > +			return;
> > +		cfs_rq_of(se)->next = se;
> > +	}
> > +}
> >  
> >  /*
> >   * The dequeue_task method is called before nr_running is
> > @@ -3240,7 +3249,7 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >  			 * p is sleeping when it is within its sched_slice.
> >  			 */
> >  			if (task_sleep && se && !throttled_hierarchy(cfs_rq))
> > -				set_next_buddy(se);
> > +				set_next_pick(se);
> >  			break;
> >  		}
> >  		flags |= DEQUEUE_SLEEP;
> > @@ -4631,17 +4640,6 @@ balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
> >  static inline void set_task_max_allowed_capacity(struct task_struct *p) {}
> >  #endif /* CONFIG_SMP */
> >  
> > -static void set_next_buddy(struct sched_entity *se)
> > -{
> > -	for_each_sched_entity(se) {
> > -		if (SCHED_WARN_ON(!se->on_rq))
> > -			return;
> > -		if (se_is_idle(se))
> > -			return;
> > -		cfs_rq_of(se)->next = se;
> > -	}
> > -}
> > -
> 
> Hurmmm.. afaict the only actual user of cfs_rq->next left is task_hot(),
> no? Is that thing worth it?

Yeah, so:

1)

While you are correct in the context of my patch, I think that might be a 
bug - the yield_to() methods are intending to use ->next:

        /* Tell the scheduler that we'd really like se to run next. */
        set_next_buddy(se);

        yield_task_fair(rq);

... and yield_to() would rather fundamentally rely on ->next overriding the 
next-task-pick selection, but it won't due to NEXT_BUDDY being false:

static struct sched_entity *
pick_next_entity(struct cfs_rq *cfs_rq)
{               
        /*      
         * Enabling NEXT_BUDDY will affect latency but not fairness.
         */
        if (sched_feat(NEXT_BUDDY) &&
            cfs_rq->next && entity_eligible(cfs_rq, cfs_rq->next))
                return cfs_rq->next;



> That is, should we not totally nuke the thing?

I don't think we want to nuke it - there's 3 users:

 - yield()
 - CFS bandwidth
 - wakeup

I think the yield() and CFS bandwidth ones are genuine, but non-working due 
to NEXT_BUDDY at 0. Wakeup was the original intended NEXT_BUDDY logic, but 
it got turned off due to some performance or latency considerations that 
might or might not be valid & relevant today.

2)

Even the task_hot() use of ->next isn't spurious: if a task has been marked 
as run-next, then presumably the current task is descheduling and we should 
probably not tear its ->next away in load-balancing.

3)

Side note: a set rq->next should probably reduce a candidate runqueue's 
weight both in periodic load-balancing and in idle-balancing, by rq->curr's 
weight or so?

So what I think we should do is to keep ->next and fix all its intended 
uses, and make it all unconditional by removing both NEXT_BUDDY and 
CACHE_HOT_BUDDY. I can cook up a series if you agree in principle.

Thanks,

	Ingo

  reply	other threads:[~2024-04-09  8:33 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-07  8:43 [PATCH 0/5] sched: Split out kernel/sched/fair_balance.c, numa_balancing.c and syscalls.c, plus other updates Ingo Molnar
2024-04-07  8:43 ` [PATCH 1/5] sched: Split out kernel/sched/syscalls.c from kernel/sched/core.c Ingo Molnar
2024-04-07 19:09   ` kernel test robot
2024-05-27 12:05   ` [tip: sched/core] sched/syscalls: " tip-bot2 for Ingo Molnar
2024-04-07  8:43 ` [PATCH 2/5] sched: Split out kernel/sched/fair_balance.c from kernel/sched/fair.c Ingo Molnar
2024-04-07 10:04   ` kernel test robot
2024-04-07 10:15   ` kernel test robot
2024-04-07 20:21   ` kernel test robot
2024-04-07  8:43 ` [PATCH 3/5] sched: Split out kernel/sched/numa_balancing.c " Ingo Molnar
2024-04-07 18:49   ` kernel test robot
2024-04-07  8:43 ` [PATCH 4/5] sched/fair: Remove NEXT_BUDDY Ingo Molnar
2024-04-07  8:43 ` [PATCH 5/5] sched/fair: Rename set_next_buddy() to set_next_pick() Ingo Molnar
2024-04-08  9:16   ` Peter Zijlstra
2024-04-09  8:32     ` Ingo Molnar [this message]
2024-04-09  9:27       ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZhT9O6yd868GuAxr@gmail.com \
    --to=mingo@kernel.org \
    --cc=dietmar.eggemann@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=sshegde@linux.ibm.com \
    --cc=torvalds@linux-foundation.org \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.