public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Joel Fernandes <joel@joelfernandes.org>
To: Juri Lelli <juri.lelli@redhat.com>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Suleiman Souhlal <suleiman@google.com>,
	Aashish Sharma <shraash@google.com>,
	Shin Kawamura <kawasin@google.com>,
	Vineeth Remanan Pillai <vineeth@bitbyteword.org>
Subject: Re: [PATCH] dl_server: Reset DL server params when rd changes
Date: Mon, 4 Nov 2024 17:41:09 +0000	[thread overview]
Message-ID: <20241104174109.GA1044726@google.com> (raw)
In-Reply-To: <Zyin7P2WNZMM40tp@jlelli-thinkpadt14gen4.remote.csb>

On Mon, Nov 04, 2024 at 11:54:36AM +0100, Juri Lelli wrote:
> On 30/10/24 19:50, Joel Fernandes wrote:
> 
> ...
> 
> > With some prints [1] in the kernel, we can see on boot:
> > 
> > $ dmesg|grep appl
> > [    0.930337] dl_server_apply_params: cpu=0, runtime=50000000, period=1000000000, cpus=1, cap=1024, init=1
> > [    0.949025] dl_server_apply_params: cpu=1, runtime=50000000, period=1000000000, cpus=1, cap=1024, init=1
> > [    0.953026] dl_server_apply_params: cpu=2, runtime=50000000, period=1000000000, cpus=2, cap=2048, init=1
> > [    0.957024] dl_server_apply_params: cpu=3, runtime=50000000, period=1000000000, cpus=3, cap=3072, init=1
> > [    0.961023] dl_server_apply_params: cpu=4, runtime=50000000, period=1000000000, cpus=4, cap=4096, init=1
> > [    0.965030] dl_server_apply_params: cpu=5, runtime=50000000, period=1000000000, cpus=5, cap=5120, init=1
> > [    0.969024] dl_server_apply_params: cpu=6, runtime=50000000, period=1000000000, cpus=6, cap=6144, init=1
> > [    0.973024] dl_server_apply_params: cpu=7, runtime=50000000, period=1000000000, cpus=7, cap=7168, init=1
> > 
> > For the 8th apply_params, the 8th CPU is not considered. This is because
> > set_cpu_active() for the 8th CPU has not yet happened as mentioned in commit
> > message.
> > 
> > With the patch:
> > 
> > $ dmesg|grep appl
> > [    0.961169] dl_server_apply_params: cpu=0, runtime=50000000, period=1000000000, cpus=1, cap=1024, init=1
> > [    0.981936] dl_server_apply_params: cpu=1, runtime=50000000, period=1000000000, cpus=1, cap=1024, init=1
> > [    0.985836] dl_server_apply_params: cpu=2, runtime=50000000, period=1000000000, cpus=2, cap=2048, init=1
> > [    0.989835] dl_server_apply_params: cpu=3, runtime=50000000, period=1000000000, cpus=3, cap=3072, init=1
> > [    0.993840] dl_server_apply_params: cpu=4, runtime=50000000, period=1000000000, cpus=4, cap=4096, init=1
> > [    0.997835] dl_server_apply_params: cpu=5, runtime=50000000, period=1000000000, cpus=5, cap=5120, init=1
> > [    1.001838] dl_server_apply_params: cpu=6, runtime=50000000, period=1000000000, cpus=6, cap=6144, init=1
> > [    1.005834] dl_server_apply_params: cpu=7, runtime=50000000, period=1000000000, cpus=7, cap=7168, init=1
> > 
> >    [ ... here somewhere rd changes as topology init finishes, then all the
> >    params are replied, this time with the correct rd. ]
> > 
> > [    1.009903] dl_server_apply_params: cpu=0, runtime=50000000, period=1000000000, cpus=8, cap=8192, init=1
> > [    1.012409] dl_server_apply_params: cpu=1, runtime=50000000, period=1000000000, cpus=8, cap=8192, init=1
> > [    1.014269] dl_server_apply_params: cpu=2, runtime=50000000, period=1000000000, cpus=8, cap=8192, init=1
> > [    1.019865] dl_server_apply_params: cpu=3, runtime=50000000, period=1000000000, cpus=8, cap=8192, init=1
> > [    1.054908] dl_server_apply_params: cpu=4, runtime=50000000, period=1000000000, cpus=8, cap=8192, init=1
> > [    1.081865] dl_server_apply_params: cpu=5, runtime=50000000, period=1000000000, cpus=8, cap=8192, init=1
> > [    1.108861] dl_server_apply_params: cpu=6, runtime=50000000, period=1000000000, cpus=8, cap=8192, init=1
> > [    1.136944] dl_server_apply_params: cpu=7, runtime=50000000, period=1000000000, cpus=8, cap=8192, init=1
> > 
> > The -EBUSY happens for our 5.15 backport. I see dl_b->total_bw to be 0
> > without my patch. Even if the -EBUSY doesn't happen for you (perhaps due to
> > compiler or other differences), shouldn't we use the correct rd for
> > apply_params? The dl_bw is tied to the rd via  cpu_rq(cpu)->rd->dl_bw;
> 
> I think I am still seeing something different.
> 
> [    0.184629] dl_server_apply_params: cpu=0, runtime=50000000, period=1000000000, cpus=1, cap=1024, init=1, dl_b=ffffffffb837c080, dl_b->total_bw=0, dl_b->bw=996147
> [    0.185498] __dl_overflow: dl_b=ffffffffb837c080, dl_b->bw=996147, cap=1024, cap_scale=996147 dl_bw->total_bw=0, old_bw=0, new_bw=52428
> [    0.371531] dl_server_apply_params: cpu=1, runtime=50000000, period=1000000000, cpus=1, cap=1024, init=1, dl_b=ffffffffb837c080, dl_b->total_bw=52428, dl_b->bw=996147
> [    0.372498] __dl_overflow: dl_b=ffffffffb837c080, dl_b->bw=996147, cap=1024, cap_scale=996147 dl_bw->total_bw=52428, old_bw=0, new_bw=52428
> [    0.373541] dl_server_apply_params: cpu=2, runtime=50000000, period=1000000000, cpus=2, cap=2048, init=1, dl_b=ffffffffb837c080, dl_b->total_bw=104856, dl_b->bw=996147
> [    0.374498] __dl_overflow: dl_b=ffffffffb837c080, dl_b->bw=996147, cap=2048, cap_scale=1992294 dl_bw->total_bw=104856, old_bw=0, new_bw=52428
> [    0.375507] dl_server_apply_params: cpu=3, runtime=50000000, period=1000000000, cpus=3, cap=3072, init=1, dl_b=ffffffffb837c080, dl_b->total_bw=157284, dl_b->bw=996147
> [    0.376498] __dl_overflow: dl_b=ffffffffb837c080, dl_b->bw=996147, cap=3072, cap_scale=2988441 dl_bw->total_bw=157284, old_bw=0, new_bw=52428
> [    0.377507] dl_server_apply_params: cpu=4, runtime=50000000, period=1000000000, cpus=4, cap=4096, init=1, dl_b=ffffffffb837c080, dl_b->total_bw=209712, dl_b->bw=996147
> [    0.378498] __dl_overflow: dl_b=ffffffffb837c080, dl_b->bw=996147, cap=4096, cap_scale=3984588 dl_bw->total_bw=209712, old_bw=0, new_bw=52428
> [    0.379505] dl_server_apply_params: cpu=5, runtime=50000000, period=1000000000, cpus=5, cap=5120, init=1, dl_b=ffffffffb837c080, dl_b->total_bw=262140, dl_b->bw=996147
> [    0.380498] __dl_overflow: dl_b=ffffffffb837c080, dl_b->bw=996147, cap=5120, cap_scale=4980735 dl_bw->total_bw=262140, old_bw=0, new_bw=52428
> [    0.381504] dl_server_apply_params: cpu=6, runtime=50000000, period=1000000000, cpus=6, cap=6144, init=1, dl_b=ffffffffb837c080, dl_b->total_bw=314568, dl_b->bw=996147
> [    0.382498] __dl_overflow: dl_b=ffffffffb837c080, dl_b->bw=996147, cap=6144, cap_scale=5976882 dl_bw->total_bw=314568, old_bw=0, new_bw=52428
> [    0.384527] dl_server_apply_params: cpu=7, runtime=50000000, period=1000000000, cpus=7, cap=7168, init=1, dl_b=ffffffffb837c080, dl_b->total_bw=366996, dl_b->bw=996147
> [    0.385498] __dl_overflow: dl_b=ffffffffb837c080, dl_b->bw=996147, cap=7168, cap_scale=6973029 dl_bw->total_bw=366996, old_bw=0, new_bw=52428
>      ...
> [    0.388556] __dl_server_attach_root: cpu=0, new_bw=52428, dl_b=ffff949bc080e040, dl_b->total_bw=52428, dl_b->bw=996147
> [    0.389507] __dl_server_attach_root: cpu=1, new_bw=52428, dl_b=ffff949bc080e040, dl_b->total_bw=104856, dl_b->bw=996147
> [    0.390501] __dl_server_attach_root: cpu=2, new_bw=52428, dl_b=ffff949bc080e040, dl_b->total_bw=157284, dl_b->bw=996147
> [    0.391503] __dl_server_attach_root: cpu=3, new_bw=52428, dl_b=ffff949bc080e040, dl_b->total_bw=209712, dl_b->bw=996147
> [    0.392499] __dl_server_attach_root: cpu=4, new_bw=52428, dl_b=ffff949bc080e040, dl_b->total_bw=262140, dl_b->bw=996147
> [    0.393499] __dl_server_attach_root: cpu=5, new_bw=52428, dl_b=ffff949bc080e040, dl_b->total_bw=314568, dl_b->bw=996147
> [    0.394428] __dl_server_attach_root: cpu=6, new_bw=52428, dl_b=ffff949bc080e040, dl_b->total_bw=366996, dl_b->bw=996147
> [    0.394499] __dl_server_attach_root: cpu=7, new_bw=52428, dl_b=ffff949bc080e040, dl_b->total_bw=419424, dl_b->bw=996147
> 
> 
> I added a printk in __dl_server_attach_root which is called after the
> dynamic rd is built to transfer bandwidth to it.
> 
> __dl_server_attach_root came with d741f297bceaf ("sched/fair: Fair
> server interface"), do you have this change in your backport?

You nailed it! Our 5.15 backport appears to be slightly older and is missing
this from topology.c as you mentioned. Thanks for clarifying!


        /*
         * Because the rq is not a task, dl_add_task_root_domain() did not
         * move the fair server bw to the rd if it already started.
         * Add it now.
         */
        if (rq->fair_server.dl_server)
                __dl_server_attach_root(&rq->fair_server, rq);

> 
> > So if rd changes during boot initialization, the correct dl_bw has to be
> > updated AFAICS. Also if cpusets are used, the rd for a CPU may change.
> 
> cpusets changes are something that I still need to double check. Will
> do.
> 

Sounds good, that would be good to verify.

Thanks,

 - Joel


  reply	other threads:[~2024-11-04 17:41 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-29 22:51 [PATCH] dl_server: Reset DL server params when rd changes Joel Fernandes (Google)
2024-10-30 14:30 ` Juri Lelli
2024-10-30 19:50   ` Joel Fernandes
2024-11-04 10:54     ` Juri Lelli
2024-11-04 17:41       ` Joel Fernandes [this message]
2024-11-06 16:08         ` Juri Lelli
2024-11-06 18:05           ` Waiman Long
2024-11-08  4:40             ` Waiman Long
2024-11-08 10:40               ` Juri Lelli
2024-11-09  3:30               ` Waiman Long
2024-11-09 18:18                 ` Waiman Long
2024-11-11  9:37                   ` Juri Lelli
2024-11-11 12:24                     ` Juri Lelli
2024-10-30 15:54 ` kernel test robot
2024-10-30 17:06 ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241104174109.GA1044726@google.com \
    --to=joel@joelfernandes.org \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=kawasin@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=shraash@google.com \
    --cc=suleiman@google.com \
    --cc=vincent.guittot@linaro.org \
    --cc=vineeth@bitbyteword.org \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox