From: Hagar Hemdan <hagarhem@amazon.com>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: <abuehaze@amazon.com>, <linux-kernel@vger.kernel.org>,
<hagarhem@amazon.com>, <wuchi.zero@gmail.com>
Subject: Re: BUG Report: Fork benchmark drop by 30% on aarch64
Date: Fri, 28 Feb 2025 19:39:45 +0000
Message-ID: <20250228193945.GA13237@amazon.com>
In-Reply-To: <5f92761b-c7d4-4b96-9398-183a5bf7556a@arm.com>
On Mon, Feb 17, 2025 at 11:51:45PM +0100, Dietmar Eggemann wrote:
> On 13/02/2025 19:55, Dietmar Eggemann wrote:
> > On 11/02/2025 22:40, Hagar Hemdan wrote:
> >> On Tue, Feb 11, 2025 at 05:27:47PM +0100, Dietmar Eggemann wrote:
> >>> On 10/02/2025 22:31, Hagar Hemdan wrote:
> >>>> On Mon, Feb 10, 2025 at 11:38:51AM +0100, Dietmar Eggemann wrote:
> >>>>> On 07/02/2025 12:07, Hagar Hemdan wrote:
> >>>>>> On Fri, Feb 07, 2025 at 10:14:54AM +0100, Dietmar Eggemann wrote:
> >>>>>>> Hi Hagar,
> >>>>>>>
> >>>>>>> On 05/02/2025 16:10, Hagar Hemdan wrote:
> >>>
> >>> [...]
> >>>
> >>>>> The 'spawn' tasks in sched_move_task() are 'running' and 'queued' so we
> >>>>> call dequeue_task(), put_prev_task(), enqueue_task() and
> >>>>> set_next_task().
> >>>>>
> >>>>> I guess what we need here is the cfs_rq->avg.load_avg (cpu_load() in
> >>>>> case of root tg) update in:
> >>>>>
> >>>>> task_change_group_fair() -> detach_task_cfs_rq() -> ...,
> >>>>> attach_task_cfs_rq() -> ...
> >>>>>
> >>>>> since this is used for WF_FORK, WF_EXEC handling in wakeup:
> >>>>>
> >>>>> select_task_rq_fair() -> sched_balance_find_dst_cpu() ->
> >>>>> sched_balance_find_dst_group_cpu()
> >>>>>
> >>>>> in form of 'least_loaded_cpu' and 'load = cpu_load(cpu_rq(i))'.
> >>>>>
> >>>>> You mentioned AutoGroups (AG). I don't see this issue on my Debian 12
> >>>>> Juno-r0 Arm64 board. When I run w/ AG, 'group' is '/' and
> >>>>> 'tsk->sched_task_group' is '/autogroup-x' so the condition 'if (group ==
> >>>>> tsk->sched_task_group)' isn't true in sched_move_task(). If I disable AG
> >>>>> then they match "/" == "/".
> >>>>>
> >>>>> I assume you run Ubuntu on your AWS instances? What kind of
> >>>>> 'cgroup/taskgroup' related setup are you using?
> >>>>
> >>>> I'm running AL2023 and use Vanilla kernel 6.13.1 on m6g.xlarge AWS instance.
> >>>> AL2023 uses cgroupv2 by default.
> >>>>>
> >>>>> Can you run w/ this debug snippet w/ and w/o AG enabled?
> >>>>
> >>>> I have run that and have attached the trace files to this email.
> >>>
> >>> Thanks!
> >>>
> >>> So w/ AG you see that 'group' and 'tsk->sched_task_group' are both
> >>> '/user.slice/user-1000.slice/session-1.scope' so we bail for those tasks
> >>> w/o doing the 'cfs_rq->avg.load_avg' update I described above.
> >>
> >> Yes, both groups are identical, so it returns from sched_move_task()
> >> without {de|en}queue and without calling task_change_group_fair().
> >
> > OK.
> >
> >>> You said that there is no issue w/o AG.
> >>
> >> To clarify: by "no regression when autogroup is disabled" I meant that
> >> the fork results w/o AG remain consistent with or without the commit
> >> "sched/core: Reduce cost of sched_move_task when config autogroup". However,
> >> the fork results are consistently lower when AG is disabled than when
> >> it's enabled (without the commit applied). This is illustrated in the tables
> >> provided in the report.
> >
> > OK, but I don't quite get yet why w/o AG the results are lower even w/o
> > eff6c8ce8d4d? Have to dig further I guess. Maybe there is more than this
> > p->se.avg.load_avg update when we go via task_change_group_fair()?
>
> './Run -c 4 spawn' on AWS instance (m7gd.16xlarge) with v6.13, 'mem=16G
> maxcpus=4 nr_cpus=4' and Ubuntu '22.04.5 LTS':
>
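> CFG_SCHED_AUTOGROUP | sched_ag_enabled | eff6c8ce8d4d | Fork (lps)
> --------------------+------------------+--------------+-----------------
> y                   | 1                | y            | 21005 (27120 **)
> y                   | 0                | y            | 21059 (27012 **)
> n                   | -                | y            | 21299
> y                   | 1                | n            | 27745 *
> y                   | 0                | n            | 27493 *
> n                   | -                | n            | 20928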
>
> (*) So here the higher numbers are only achieved when
> 'sched_autogroup_exit_task() -> sched_move_task() ->
> sched_change_group()' is called for the 'spawn' tasks.
>
> (**) When I apply the fix from
> https://lkml.kernel.org/r/4a9cc5ab-c538-4427-8a7c-99cb317a283f@arm.com.
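
To put the table in relative terms, here is a minimal sketch of the
arithmetic. pct_drop() is a hypothetical helper written for this note, not
kernel code. Fed with the Fork (lps) values for CFG_SCHED_AUTOGROUP=y and
sched_ag_enabled=1, it gives a drop of about 24% with eff6c8ce8d4d applied
(27745 -> 21005) and about 2% once the fix is applied on top
(27745 -> 27120).

```c
#include <assert.h>

/*
 * Relative drop of 'after' versus 'before', in percent.
 * Hypothetical helper for illustration only.
 */
double pct_drop(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

For example, pct_drop(27745, 21005) compares the unpatched baseline against
the result with eff6c8ce8d4d on this particular setup.
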

This is currently impacting our kernel. Do you have any concerns about
submitting this fix upstream?

Thanks,
Hagar

>
> These results support the story that we need:
>
> task_change_group_fair() -> detach_task_cfs_rq() -> ...,
> attach_task_cfs_rq() -> ...
>
> i.e. the related 'cfs_rq->avg.load_avg' update during do_exit() so that
> WF_FORK handling in wakeup:
>
> select_task_rq_fair() -> sched_balance_find_dst_cpu() ->
> sched_balance_find_dst_group_cpu()
>
> can use more recent 'load = cpu_load(cpu_rq(i))' values to get a better
> 'least_loaded_cpu'.
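
The effect described above can be sketched with a toy model. Everything
prefixed toy_ is hypothetical and heavily simplified: the real
cfs_rq->avg.load_avg is a decaying PELT average, and the real
sched_balance_find_dst_group_cpu() also looks at idle CPUs, which this
sketch ignores. It only shows that a CPU still carrying the load of an
exiting task loses the 'least loaded' comparison until that load is
detached.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the per-CPU value read via cpu_load(cpu_rq(i)). */
struct toy_rq {
	unsigned long load_avg;
};

/* Index of the CPU with the smallest load; ties go to the lowest index. */
int toy_least_loaded_cpu(const struct toy_rq *rqs, size_t nr_cpus)
{
	size_t i, best = 0;

	assert(rqs != NULL && nr_cpus > 0);
	for (i = 1; i < nr_cpus; i++) {
		if (rqs[i].load_avg < rqs[best].load_avg)
			best = i;
	}
	return (int)best;
}

/* Toy model of the detach step: drop an exiting task's contribution. */
void toy_detach(struct toy_rq *rq, unsigned long task_load)
{
	assert(rq != NULL);
	rq->load_avg = rq->load_avg > task_load ? rq->load_avg - task_load : 0;
}
```

With loads {300, 1024, 400, 500}, CPU 1 is never picked while it still
accounts for an exiting task of load 1024. After toy_detach() on CPU 1 it
becomes the preferred destination for the next forked task.
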
>
> The AWS instance runs systemd so shell and test run in a taskgroup other
> than root which trumps autogroups:
>
> task_wants_autogroup()
>
>     if (tg != &root_task_group)
>         return false;
>
>     ...
>
> That's why 'group == tsk->sched_task_group' in sched_move_task() is
> true, which is different on my Juno: the shell from which I launch the
> tests runs in '/' so that the test ends up in an autogroup, i.e. 'group
> != tsk->sched_task_group'.
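
The difference between the two setups can be modelled in a few lines. This
is a sketch under stated assumptions, not the kernel implementation: the
toy_ names are hypothetical, and the exit-time rule (the target 'group' is
the task's cgroup-assigned group) is taken from the traces quoted in this
thread rather than from the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy task group; only its identity (the pointer) matters here. */
struct toy_tg {
	const char *path;
};

struct toy_tg toy_root_tg = { "/" };

/* Mirrors the quoted check: a task outside the root group is never autogrouped. */
bool toy_wants_autogroup(const struct toy_tg *tg)
{
	return tg == &toy_root_tg;
}

/* Group the task is scheduled in while it runs. */
const struct toy_tg *toy_running_group(const struct toy_tg *tg,
				       const struct toy_tg *autogroup,
				       bool ag_enabled)
{
	assert(tg != NULL);
	if (ag_enabled && autogroup != NULL && toy_wants_autogroup(tg))
		return autogroup;
	return tg;
}

/*
 * Assumption from the quoted traces: at exit, the target 'group' is the
 * cgroup-assigned group. The move is skipped when that is already the
 * group the task is scheduled in.
 */
bool toy_exit_move_skipped(const struct toy_tg *tg,
			   const struct toy_tg *running_group)
{
	return tg == running_group;
}
```

A shell in '/' with AG enabled runs in the autogroup, so the exit-time move
is not skipped (the Juno case). A shell in
'/user.slice/user-1000.slice/session-1.scope' stays in that group, so the
move is skipped (the AWS case), as it is for '/' with AG disabled.
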
>
> [...]