From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev,
Hazem Mohamed Abuelfotoh <abuehaze@amazon.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
Hagar Hemdan <hagarhem@amazon.com>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 6.6 72/77] Revert "sched/core: Reduce cost of sched_move_task when config autogroup"
Date: Tue, 25 Mar 2025 08:23:07 -0400
Message-ID: <20250325122146.301672233@linuxfoundation.org>
In-Reply-To: <20250325122144.259256924@linuxfoundation.org>
6.6-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
commit 76f970ce51c80f625eb6ddbb24e9cb51b977b598 upstream.
This reverts commit eff6c8ce8d4d7faef75f66614dd20bb50595d261.
Hazem reported a 30% drop in UnixBench spawn test with commit
eff6c8ce8d4d ("sched/core: Reduce cost of sched_move_task when config
autogroup") on a m6g.xlarge AWS EC2 instance with 4 vCPUs and 16 GiB RAM
(aarch64) (single level MC sched domain):
https://lkml.kernel.org/r/20250205151026.13061-1-hagarhem@amazon.com
There is an early bail from sched_move_task() if p->sched_task_group is
equal to p's 'cpu cgroup' (sched_get_task_group()). E.g. both are
pointing to taskgroup '/user.slice/user-1000.slice/session-1.scope'
(Ubuntu '22.04.5 LTS').
So in:
  do_exit()
    sched_autogroup_exit_task()
      sched_move_task()
        if sched_get_task_group(p) == p->sched_task_group
          return
        /* p is enqueued */
        dequeue_task()              \
        sched_change_group()        |
          task_change_group_fair()  |
            detach_task_cfs_rq()    |                              (1)
            set_task_rq()           |
            attach_task_cfs_rq()    |
        enqueue_task()              /
(1) isn't called for p anymore.
Turns out that the regression is related to sgs->group_util in
group_is_overloaded() and group_has_capacity(). If (1) isn't called for
all the 'spawn' tasks then sgs->group_util is ~900 and
sgs->group_capacity = 1024 (single CPU sched domain) and this leads to
group_is_overloaded() returning true (2) and group_has_capacity() false
(3) much more often compared to the case when (1) is called.
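To make the arithmetic above concrete, here is a minimal C sketch of the two utilization checks. It is not the kernel code: struct sg_stats and the sketch_* names are hypothetical, the group_runnable checks of the real helpers are left out, and the threshold of 117 is only an assumed imbalance_pct for an MC-level sched domain.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct sg_lb_stats. */
struct sg_stats {
	unsigned long group_capacity;	/* sum of CPU capacities in the group */
	unsigned long group_util;	/* sum of CPU utilization in the group */
	unsigned int sum_nr_running;	/* tasks running in the group */
	unsigned int group_weight;	/* number of CPUs in the group */
};

/* Assumption: imbalance_pct of the sched domain, taken as 117 here. */
#define SKETCH_IMBALANCE_PCT 117

/* Simplified: only the task-count and utilization checks are modeled. */
bool sketch_group_is_overloaded(const struct sg_stats *sgs)
{
	if (sgs->sum_nr_running <= sgs->group_weight)
		return false;

	return sgs->group_capacity * 100 <
	       sgs->group_util * SKETCH_IMBALANCE_PCT;
}

/* Simplified: only the task-count and utilization checks are modeled. */
bool sketch_group_has_capacity(const struct sg_stats *sgs)
{
	if (sgs->sum_nr_running < sgs->group_weight)
		return true;

	return sgs->group_capacity * 100 >
	       sgs->group_util * SKETCH_IMBALANCE_PCT;
}
```

Under these assumptions, a single-CPU group with group_capacity = 1024, group_util = 900 and two running tasks gives 1024 * 100 = 102400 < 900 * 117 = 105300, so the sketch reports the group as overloaded and without capacity. The same group at group_util = 800 (93600) is neither.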
I.e. there are many more cases of 'group_is_overloaded' and
'group_fully_busy' in WF_FORK wakeup sched_balance_find_dst_cpu(), which
then returns a CPU != smp_processor_id() (5) much more often.
This isn't good for these extremely short running tasks (FORK + EXIT)
and also involves calling sched_balance_find_dst_group_cpu() unnecessarily
(single CPU sched domain).
Instead if (1) is called for 'p->flags & PF_EXITING' then the path
(4),(6) is taken much more often.
  select_task_rq_fair(..., wake_flags = WF_FORK)
    cpu = smp_processor_id()
    new_cpu = sched_balance_find_dst_cpu(..., cpu, ...)
      group = sched_balance_find_dst_group(..., cpu)
        do {
          update_sg_wakeup_stats()
            sgs->group_type = group_classify()
              if group_is_overloaded()                             (2)
                return group_overloaded
              if !group_has_capacity()                             (3)
                return group_fully_busy
              return group_has_spare                               (4)
        } while group
        if local_sgs.group_type > idlest_sgs.group_type
          return idlest                                            (5)
        case group_has_spare:
          if local_sgs.idle_cpus >= idlest_sgs.idle_cpus
            return NULL                                            (6)
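The group selection at the end of the trace above can be sketched in C roughly as follows. This is a simplified model, not the kernel function: the sketch_* types and names are hypothetical, only three group types are modeled, and the load comparisons the scheduler performs for equally busy groups are left out.

```c
#include <stddef.h>

/* Hypothetical subset of group types, ordered from least to most busy. */
enum sketch_group_type {
	SKETCH_GROUP_HAS_SPARE = 0,
	SKETCH_GROUP_FULLY_BUSY,
	SKETCH_GROUP_OVERLOADED,
};

/* Hypothetical per-group summary used by the sketch. */
struct sketch_group {
	enum sketch_group_type group_type;
	unsigned int idle_cpus;
};

/*
 * Returns the remote group if the task should be placed there, or NULL
 * if it should stay in the local group.
 */
const struct sketch_group *
sketch_find_dst_group(const struct sketch_group *local,
		      const struct sketch_group *idlest)
{
	/* Local group is busier than the idlest remote group: path (5). */
	if (local->group_type > idlest->group_type)
		return idlest;

	/* Local group is less busy: stay local. */
	if (local->group_type < idlest->group_type)
		return NULL;

	/* Both have spare capacity: path (6) if local has no fewer idle CPUs. */
	if (local->group_type == SKETCH_GROUP_HAS_SPARE)
		return local->idle_cpus >= idlest->idle_cpus ? NULL : idlest;

	/* Equal busy types are decided by load, which is not modeled: stay local. */
	return NULL;
}
```

In this model, a local group classified as overloaded loses to any remote group with spare capacity, which corresponds to (5). Once the local group is classified as having spare capacity, the task stays local unless the remote group has more idle CPUs, which corresponds to (6).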
UnixBench tests ('./Run -c 4 spawn') on:
(a) VM AWS instance (m7gd.16xlarge) with v6.13 ('maxcpus=4 nr_cpus=4')
and Ubuntu 22.04.5 LTS (aarch64).
Shell & test run in '/user.slice/user-1000.slice/session-1.scope'.
    w/o patch    w/ patch
    21005        27120
(b) i7-13700K with tip/sched/core ('nosmt maxcpus=8 nr_cpus=8') and
Ubuntu 22.04.5 LTS (x86_64).
Shell & test run in '/A'.
    w/o patch    w/ patch
    67675        88806
CONFIG_SCHED_AUTOGROUP=y & /proc/sys/kernel/sched_autogroup_enabled equal
0 or 1.
Reported-by: Hazem Mohamed Abuelfotoh <abuehaze@amazon.com>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Hagar Hemdan <hagarhem@amazon.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250314151345.275739-1-dietmar.eggemann@arm.com
[Hagar: clean revert of eff6c8ce8d4d to make it work on 6.6]
Signed-off-by: Hagar Hemdan <hagarhem@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/sched/core.c | 22 +++-------------------
1 file changed, 3 insertions(+), 19 deletions(-)
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10494,7 +10494,7 @@ void sched_release_group(struct task_gro
 	spin_unlock_irqrestore(&task_group_lock, flags);
 }
 
-static struct task_group *sched_get_task_group(struct task_struct *tsk)
+static void sched_change_group(struct task_struct *tsk)
 {
 	struct task_group *tg;
 
@@ -10506,13 +10506,7 @@ static struct task_group *sched_get_task
 	tg = container_of(task_css_check(tsk, cpu_cgrp_id, true),
 			  struct task_group, css);
 	tg = autogroup_task_group(tsk, tg);
-
-	return tg;
-}
-
-static void sched_change_group(struct task_struct *tsk, struct task_group *group)
-{
-	tsk->sched_task_group = group;
+	tsk->sched_task_group = tg;
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	if (tsk->sched_class->task_change_group)
@@ -10533,19 +10527,10 @@ void sched_move_task(struct task_struct
 {
 	int queued, running, queue_flags =
 		DEQUEUE_SAVE | DEQUEUE_MOVE | DEQUEUE_NOCLOCK;
-	struct task_group *group;
 	struct rq_flags rf;
 	struct rq *rq;
 
 	rq = task_rq_lock(tsk, &rf);
-	/*
-	 * Esp. with SCHED_AUTOGROUP enabled it is possible to get superfluous
-	 * group changes.
-	 */
-	group = sched_get_task_group(tsk);
-	if (group == tsk->sched_task_group)
-		goto unlock;
-
 	update_rq_clock(rq);
 
 	running = task_current(rq, tsk);
@@ -10556,7 +10541,7 @@ void sched_move_task(struct task_struct
 	if (running)
 		put_prev_task(rq, tsk);
 
-	sched_change_group(tsk, group);
+	sched_change_group(tsk);
 
 	if (queued)
 		enqueue_task(rq, tsk, queue_flags);
@@ -10570,7 +10555,6 @@ void sched_move_task(struct task_struct
 		resched_curr(rq);
 	}
 
-unlock:
 	task_rq_unlock(rq, tsk, &rf);
 }
 