From: Hagar Hemdan <hagarhem@amazon.com>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: <hagarhem@amazon.com>, <abuehaze@amazon.com>,
	<linux-kernel@vger.kernel.org>
Subject: Re: BUG Report: Fork benchmark drop by 30% on aarch64
Date: Mon, 3 Mar 2025 13:57:15 +0000
Message-ID: <20250303135715.GA21308@amazon.com>
In-Reply-To: <14a2aaac-05d5-4b2e-a8c1-617bb4411659@arm.com>

On Mon, Mar 03, 2025 at 11:05:01AM +0100, Dietmar Eggemann wrote:
> On 21/02/2025 07:44, Hagar Hemdan wrote:
> > On Mon, Feb 17, 2025 at 11:51:45PM +0100, Dietmar Eggemann wrote:
> >> On 13/02/2025 19:55, Dietmar Eggemann wrote:
> >>> On 11/02/2025 22:40, Hagar Hemdan wrote:
> >>>> On Tue, Feb 11, 2025 at 05:27:47PM +0100, Dietmar Eggemann wrote:
> >>>>> On 10/02/2025 22:31, Hagar Hemdan wrote:
> >>>>>> On Mon, Feb 10, 2025 at 11:38:51AM +0100, Dietmar Eggemann wrote:
> >>>>>>> On 07/02/2025 12:07, Hagar Hemdan wrote:
> >>>>>>>> On Fri, Feb 07, 2025 at 10:14:54AM +0100, Dietmar Eggemann wrote:
> >>>>>>>>> Hi Hagar,
> >>>>>>>>>
> >>>>>>>>> On 05/02/2025 16:10, Hagar Hemdan wrote:
> 
> [...]
> 
> >> './Run -c 4 spawn' on AWS instance (m7gd.16xlarge) with v6.13, 'mem=16G
> >> maxcpus=4 nr_cpus=4' and Ubuntu '22.04.5 LTS':
> >>
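> >> CFG_SCHED_AUTOGROUP | sched_ag_enabled | eff6c8ce8d4d | Fork (lps)
> >> --------------------+------------------+--------------+-----------------
> >>          y          |        1         |      y       | 21005 (27120 **)
> >>          y          |        0         |      y       | 21059 (27012 **)
> >>          n          |        -         |      y       | 21299
> >>          y          |        1         |      n       | 27745 *
> >>          y          |        0         |      n       | 27493 *
> >>          n          |        -         |      n       | 20928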
> >>
> >> (*) So here the higher numbers are only achieved when
> >> 'sched_autogroup_exit_task() -> sched_move_task() ->
> >> sched_change_group()' is called for the 'spawn' tasks.
> >>
> >> (**) When I apply the fix from
> >> https://lkml.kernel.org/r/4a9cc5ab-c538-4427-8a7c-99cb317a283f@arm.com.
> > Thanks!
> > Will you submit that fix upstream?
> 
> I will, I just had to understand in detail why this regression happens.
> 
> Looks like the issue is rather related to 'sgs->group_util' in
> group_is_overloaded() and group_has_capacity(). If we don't
> 'dequeue/detach + attach/enqueue' (1) the task in sched_move_task(), then
> sgs->group_util is ~900 (you run 4 CPUs flat in a single MC sched domain,
> so sgs->group_capacity = 1024), and this leads to group_is_overloaded()
> returning true and group_has_capacity() returning false much more often
> than if we did (1).
> 
> I.e. we have many more cases of 'group_is_overloaded' and
> 'group_fully_busy' in the WF_FORK wakeup path through
> sched_balance_find_dst_cpu(), which then (a) returns a CPU !=
> smp_processor_id() much more often (which isn't good for these extremely
> short-running tasks (FORK + EXIT)) and (b) involves calling
> sched_balance_find_dst_group_cpu() unnecessarily (since we deal with
> single-CPU sched domains).
> 
> select_task_rq_fair(..., wake_flags = WF_FORK)
> 
>   cpu = smp_processor_id()
> 
>   new_cpu = sched_balance_find_dst_group(..., cpu, ...)
> 
>     do {
> 
>       update_sg_wakeup_stats()
> 
>         sgs->group_type = group_classify()
>                                                       w/o patch    w/ patch
>           if group_is_overloaded() (*)
>             return group_overloaded /* 6 */             457,141         394
> 
>           if !group_has_capacity() (**)
>             return group_fully_busy /* 1 */             816,629         714
> 
>           return group_has_spare    /* 0 */           1,158,890   3,157,472
> 
>     } while group
> 
>     if local_sgs.group_type > idlest_sgs.group_type
>       return idlest                                     351,598         273
> 
>     case group_has_spare:
> 
>       if local_sgs.idle_cpus >= idlest_sgs.idle_cpus
>         return NULL                                     156,760     788,462
> 
> 
> (*)
> 
>   if (sgs->group_capacity * 100) <
>                 sgs->group_util * imbalance_pct         951,705         856
>     return true
> 
>   sgs->group_util ~ 900 and sgs->group_capacity = 1024 (1 CPU per sched group)
> 
> 
> (**)
> 
>   if (sgs->group_capacity * 100) >
>                 sgs->group_util * imbalance_pct
>     return true                                       1,087,555   3,163,152
> 
>   return false                                        1,332,974         882
> 
> 
> The counts for (*) and (**) cover both 'wakeup' and 'load-balance'
> callers, so they don't match the wakeup-only numbers above!

Thank you for the detailed explanation. We appreciate your effort and
will await the fix.
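
To make the arithmetic behind (*) and (**) concrete, here is a minimal
sketch of just the two utilization comparisons quoted above. It is not
the kernel code: struct sg_stats and the three function names are
hypothetical stand-ins, the other conditions in group_is_overloaded()
and group_has_capacity() (e.g. the checks on the number of running
tasks) are left out, and imbalance_pct = 117 is an assumption for an
MC-level sched domain that should be verified against the kernel in use.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct sg_lb_stats. */
struct sg_stats {
	unsigned long group_capacity;	/* summed CPU capacity of the group */
	unsigned long group_util;	/* summed CPU utilization of the group */
};

/* Assumed value for an MC-level sched domain; not taken from the mail. */
#define ASSUMED_IMBALANCE_PCT	117u

/* Only the utilization comparison marked (*) in the quoted mail. */
static bool util_says_overloaded(const struct sg_stats *sgs,
				 unsigned int imbalance_pct)
{
	return (sgs->group_capacity * 100) <
	       (sgs->group_util * imbalance_pct);
}

/* Only the utilization comparison marked (**) in the quoted mail. */
static bool util_says_has_capacity(const struct sg_stats *sgs,
				   unsigned int imbalance_pct)
{
	return (sgs->group_capacity * 100) >
	       (sgs->group_util * imbalance_pct);
}

/* Smallest group_util for which the (*) comparison becomes true. */
static unsigned long overload_util_threshold(unsigned long capacity,
					     unsigned int imbalance_pct)
{
	return (capacity * 100) / imbalance_pct + 1;
}
```

Under that assumption, with group_capacity = 1024 and group_util = 900
the (*) comparison is true and the (**) comparison is false (102400 vs.
105300), and any group_util of 876 or more trips (*). That is consistent
with a single-CPU group at ~900 being classified as overloaded or fully
busy rather than as having spare capacity.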
> 
> In this test run I got 608,092 new wakeups w/o and 789,572 (~+ 30%)
> w/ the patch when running './Run -c 4 -i 1 spawn' on AWS instance
> (m7gd.16xlarge) with v6.13, 'mem=16G maxcpus=4 nr_cpus=4' and
> Ubuntu '22.04.5 LTS'
> 
> > Do you think that this fix is the same as reverting commit eff6c8ce8d4d and
> > its follow up commit fa614b4feb5a? I mean what does commit eff6c8ce8d4d 
> > actually improve?
> 
> There are occurrences in which 'group == tsk->sched_task_group' and
> '!(tsk->flags & PF_EXITING)' so there the early bail might help w/o
> the negative impact on sched benchmarks.
ok, thanks!
