From: Aaron Lu <ziqianlu@bytedance.com>
To: Valentin Schneider <vschneid@redhat.com>
Cc: Ben Segall <bsegall@google.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Peter Zijlstra <peterz@infradead.org>,
Chengming Zhou <chengming.zhou@linux.dev>,
Josh Don <joshdon@google.com>, Ingo Molnar <mingo@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Xi Wang <xii@google.com>,
linux-kernel@vger.kernel.org, Juri Lelli <juri.lelli@redhat.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Mel Gorman <mgorman@suse.de>,
Chuyi Zhou <zhouchuyi@bytedance.com>,
Jan Kiszka <jan.kiszka@siemens.com>,
Florian Bezdeka <florian.bezdeka@siemens.com>,
Songtang Liu <liusongtang@bytedance.com>
Subject: Re: [PATCH v3 3/5] sched/fair: Switch to task based throttle model
Date: Fri, 15 Aug 2025 17:30:08 +0800
Message-ID: <20250815092910.GA33@bytedance>
In-Reply-To: <xhsmhh5y9j3ut.mognet@vschneid-thinkpadt14sgen2i.remote.csb>
On Thu, Aug 14, 2025 at 05:54:34PM +0200, Valentin Schneider wrote:
> On 12/08/25 16:48, Aaron Lu wrote:
> > On Fri, Aug 08, 2025 at 01:45:11PM +0200, Valentin Schneider wrote:
> >> On 08/08/25 18:13, Aaron Lu wrote:
> >> > Let me run some scheduler benchmark to see how it impacts performance.
> >> >
> >> > I'm thinking maybe running something like hackbench on server platforms,
> >> > first with quota not set and see if performance changes; then also test
> >> > with quota set and see how performance changes.
> >> >
> >> > Does this sound good to you? Or do you have any specific benchmark and
> >> > test methodology in mind?
> >> >
> >>
> >> Yeah hackbench is pretty good for stressing the EQ/DQ paths.
> >>
> >
> > Tested hackbench/pipe and netperf/UDP_RR on Intel EMR(2 sockets/240
> > cpus) and AMD Genoa(2 sockets/384 cpus), the tldr is: there is no clear
> > performance change between base and this patchset(head). Below is
> > detailed test data:
> > (turbo/boost disabled, cpuidle disabled, cpufreq set to performance)
> >
> > hackbench/pipe/loops=150000
> > (seconds, smaller is better)
> >
> > On Intel EMR:
> >
> > nr_group    base           head           change
> > 1           3.62±2.99%     3.61±10.42%    +0.28%
> > 8           8.06±1.58%     7.88±5.82%     +2.23%
> > 16          11.40±2.57%    11.25±3.72%    +1.32%
> >
> > For nr_group=16 case, configure a cgroup and set quota to half cpu and
> > then let hackbench run in this cgroup:
> >
> >             base           head           change
> > quota=50%   18.35±2.40%    18.78±1.97%    -2.34%
> >
> > On AMD Genoa:
> >
> > nr_group    base           head           change
> > 1           17.05±1.92%    16.99±2.81%    +0.35%
> > 8           16.54±0.71%    16.73±1.18%    -1.15%
> > 16          27.04±0.39%    26.72±2.37%    +1.18%
> >
> > For nr_group=16 case, configure a cgroup and set quota to half cpu and
> > then let hackbench run in this cgroup:
> >
> >             base           head           change
> > quota=50%   43.79±1.10%    44.65±0.37%    -1.96%
> >
> > Netperf/UDP_RR/testlen=30s
> > (throughput, higher is better)
> >
> > 25% means nr_clients set to 1/4 nr_cpu, 50% means nr_clients is 1/2
> > nr_cpu, etc.
> >
> > On Intel EMR:
> >
> > nr_clients  base           head           change
> > 25%         83,567±0.06%   84,298±0.23%   +0.87%
> > 50%         61,336±1.49%   60,816±0.63%   -0.85%
> > 75%         40,592±0.97%   40,461±0.14%   -0.32%
> > 100%        31,277±2.11%   30,948±1.84%   -1.05%
> >
> > For nr_clients=100% case, configure a cgroup and set quota to half cpu
> > and then let netperf run in this cgroup:
> >
> > nr_clients  base           head           change
> > 100%        25,532±0.56%   26,772±3.05%   +4.86%
> >
> > On AMD Genoa:
> >
> > nr_clients  base           head           change
> > 25%         12,443±0.40%   12,525±0.06%   +0.66%
> > 50%         11,403±0.35%   11,472±0.50%   +0.61%
> > 75%         10,070±0.19%   10,071±0.95%    0.00%
> > 100%         9,947±0.80%    9,881±0.58%   -0.66%
> >
> > For nr_clients=100% case, configure a cgroup and set quota to half cpu
> > and then let netperf run in this cgroup:
> >
> > nr_clients  base           head           change
> > 100%        4,954±0.24%    4,952±0.14%     0.00%
>
> Thank you for running these, looks like mostly slightly bigger variance on
> a few of these but that's about it.
>
> I would also suggest running similar benchmarks but with deeper
> hierarchies, to get an idea of how much worse unthrottle_cfs_rq() can get
> when tg_unthrottle_up() goes up a bigger tree.
No problem.
I suppose I can reuse the test script I shared previously:
https://lore.kernel.org/lkml/CANCG0GdOwS7WO0k5Fb+hMd8R-4J_exPTt2aS3-0fAMUC5pVD8g@mail.gmail.com/
There I used:
nr_level1=2
nr_level2=100
nr_level3=10
But I can tweak these numbers for this performance evaluation. I can make
the hierarchy 5 levels deep, place tasks in the leaf level cgroups, and
configure quota on the 1st level cgroups.
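
For illustration only, here is a rough sketch of what such a layout could
look like. It is not the shared script: it assumes cgroup v2 (where quota is
written to cpu.max as "<quota_us> <period_us>"), and the root path, the
directory naming scheme and the helper names build_hierarchy(), cpu_max()
and plan() are all hypothetical. It only computes a dry-run plan and does
not touch the filesystem; enabling the cpu controller through
cgroup.subtree_control at each non-leaf level is left out.

```python
from pathlib import PurePosixPath


def build_hierarchy(root, fanout):
    """Return (all_dirs, leaf_dirs) for a cgroup tree.

    fanout[i] is the number of children each cgroup at depth i gets,
    so len(fanout) is the depth of the leaf level. Directories are
    listed parents-first, which is the order mkdir needs.
    """
    parents = [PurePosixPath(root)]
    all_dirs = []
    for level, width in enumerate(fanout, start=1):
        # Hypothetical naming scheme: l<level>_<index>.
        children = [p / f"l{level}_{i}" for p in parents for i in range(width)]
        all_dirs.extend(children)
        parents = children
    return all_dirs, parents


def cpu_max(quota_cpus, period_us=100_000):
    """Format a cgroup v2 cpu.max value for a quota expressed in CPUs."""
    return f"{int(quota_cpus * period_us)} {period_us}"


def plan(root, fanout, quota_cpus):
    """Dry-run plan: create every cgroup, set quota on 1st level cgroups only."""
    all_dirs, leaves = build_hierarchy(root, fanout)
    first_level = all_dirs[:fanout[0]]
    steps = [("mkdir", str(d), None) for d in all_dirs]
    steps += [("write", str(d / "cpu.max"), cpu_max(quota_cpus))
              for d in first_level]
    return steps, leaves


if __name__ == "__main__":
    # Assumed example: 5 levels with a fan-out of 2, quota of half a CPU.
    steps, leaves = plan("/sys/fs/cgroup/test", [2, 2, 2, 2, 2], 0.5)
    print(len(leaves), "leaf cgroups to place tasks in")
    for step in steps[-2:]:
        print(step)
```

With the earlier numbers, build_hierarchy(root, [2, 100, 10]) gives 2000
leaf cgroups; the fan-out list is the knob to turn when trading width for
depth.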
I'll get back to you once I've finished collecting data. Feel free to let
me know if you have other ideas for testing this :)