From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Matt Fleming <matt@codeblueprint.co.uk>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Mike Galbraith <efault@gmx.de>,
Mike Galbraith <umgwanakikbuti@gmail.com>,
Morten Rasmussen <morten.rasmussen@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Vincent Guittot <vincent.guittot@linaro.org>,
Ingo Molnar <mingo@kernel.org>
Subject: [PATCH 4.4 088/101] sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting
Date: Mon, 3 Jul 2017 15:35:28 +0200 [thread overview]
Message-ID: <20170703133349.059470829@linuxfoundation.org> (raw)
In-Reply-To: <20170703133334.237346187@linuxfoundation.org>
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matt Fleming <matt@codeblueprint.co.uk>
commit 6e5f32f7a43f45ee55c401c0b9585eb01f9629a8 upstream.
If we crossed a sample window while in NO_HZ we will add LOAD_FREQ to
the pending sample window time on exit, setting the next update not
one window into the future, but two.
This situation on exiting NO_HZ is described by:
this_rq->calc_load_update < jiffies < calc_load_update
In this scenario, what we should be doing is:
this_rq->calc_load_update = calc_load_update [ next window ]
But what we actually do is:
this_rq->calc_load_update = calc_load_update + LOAD_FREQ [ next+1 window ]
This has the effect of delaying load average updates for potentially
up to ~9seconds.
This can result in huge spikes in the load average values due to
per-cpu uninterruptible task counts being out of sync when accumulated
across all CPUs.
It's safe to update the per-cpu active count if we wake between sample
windows because any load that we left in 'calc_load_idle' will have
been zero'd when the idle load was folded in calc_global_load().
This issue is easy to reproduce before,
commit 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking")
just by forking short-lived process pipelines built from ps(1) and
grep(1) in a loop. I'm unable to reproduce the spikes after that
commit, but the bug still seems to be present from code review.
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Fixes: commit 5167e8d ("sched/nohz: Rewrite and fix load-avg computation -- again")
Link: http://lkml.kernel.org/r/20170217120731.11868-2-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/sched/loadavg.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -201,8 +201,9 @@ void calc_load_exit_idle(void)
struct rq *this_rq = this_rq();
/*
- * If we're still before the sample window, we're done.
+ * If we're still before the pending sample window, we're done.
*/
+ this_rq->calc_load_update = calc_load_update;
if (time_before(jiffies, this_rq->calc_load_update))
return;
@@ -211,7 +212,6 @@ void calc_load_exit_idle(void)
* accounted through the nohz accounting, so skip the entire deal and
* sync up for the next window.
*/
- this_rq->calc_load_update = calc_load_update;
if (time_before(jiffies, this_rq->calc_load_update + 10))
this_rq->calc_load_update += LOAD_FREQ;
}
next prev parent reply other threads:[~2017-07-03 13:40 UTC|newest]
Thread overview: 104+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-03 13:34 [PATCH 4.4 000/101] 4.4.76-stable review Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 001/101] ipv6: release dst on error in ip6_dst_lookup_tail Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 002/101] net: dont call strlen on non-terminated string in dev_set_alias() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 003/101] decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 004/101] net: Zero ifla_vf_info in rtnl_fill_vfinfo() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 005/101] af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 006/101] Fix an intermittent pr_emerg warning about lo becoming free Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 007/101] net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 008/101] igmp: acquire pmc lock for ip_mc_clear_src() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 009/101] igmp: add a missing spin_lock_init() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 010/101] ipv6: fix calling in6_ifa_hold incorrectly for dad work Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 011/101] net/mlx5: Wait for FW readiness before initializing command interface Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 012/101] decnet: always not take dst->__refcnt when inserting dst into hash table Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 013/101] net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 014/101] sfc: provide dummy definitions of vswitch functions Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 015/101] ipv6: Do not leak throw route references Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 016/101] rtnetlink: add IFLA_GROUP to ifla_policy Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 017/101] netfilter: xt_TCPMSS: add more sanity tests on tcph->doff Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 018/101] netfilter: synproxy: fix conntrackd interaction Greg Kroah-Hartman
2017-08-17 5:57 ` Stefan Bader
2017-08-17 16:47 ` Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 019/101] NFSv4: fix a reference leak caused WARNING messages Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 020/101] drm/ast: Handle configuration without P2A bridge Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 021/101] mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 022/101] MIPS: Avoid accidental raw backtrace Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 023/101] MIPS: pm-cps: Drop manual cache-line alignment of ready_count Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 024/101] MIPS: Fix IRQ tracing & lockdep when rescheduling Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 025/101] ALSA: hda - Fix endless loop of codec configure Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 026/101] ALSA: hda - set input_path bitmap to zero after moving it to new place Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 027/101] drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 028/101] usb: gadget: f_fs: Fix possibe deadlock Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 029/101] sysctl: enable strict writes Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 030/101] block: fix module reference leak on put_disk() call for cgroups throttle Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 031/101] mm: numa: avoid waiting on freed migrated pages Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 033/101] scsi: sd: Fix wrong DPOFUA disable in sd_read_cache_type Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 034/101] scsi: lpfc: Set elsiocb contexts to NULL after freeing it Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 035/101] qla2xxx: Fix erroneous invalid handle message Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 037/101] net: mvneta: Fix for_each_present_cpu usage Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 038/101] MIPS: ath79: fix regression in PCI window initialization Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 039/101] net: korina: Fix NAPI versus resources freeing Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 040/101] MIPS: ralink: MT7688 pinmux fixes Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 041/101] MIPS: ralink: fix USB frequency scaling Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 042/101] MIPS: ralink: Fix invalid assignment of SoC type Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 046/101] bgmac: fix a missing check for build_skb Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 048/101] bgmac: Fix reversed test of build_skb() return value Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 049/101] net: bgmac: Fix SOF bit checking Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 050/101] net: bgmac: Start transmit queue in bgmac_open Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 051/101] net: bgmac: Remove superflous netif_carrier_on() Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 052/101] powerpc/eeh: Enable IO path on permanent error Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 053/101] gianfar: Do not reuse pages from emergency reserve Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 054/101] Btrfs: fix truncate down when no_holes feature is enabled Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 055/101] virtio_console: fix a crash in config_work_handler Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 056/101] swiotlb-xen: update dev_addr after swapping pages Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 057/101] net: sctp: fix array overrun read on sctp_timer_tbl Greg Kroah-Hartman
2017-07-04 18:48 ` Ben Hutchings
2017-07-05 12:17 ` Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 058/101] xen-netfront: Fix Rx stall during network stress and OOM Greg Kroah-Hartman
2017-07-03 13:34 ` [PATCH 4.4 059/101] scsi: virtio_scsi: Reject commands when virtqueue is broken Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 060/101] platform/x86: ideapad-laptop: handle ACPI event 1 Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 061/101] amd-xgbe: Check xgbe_init() return code Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 062/101] net: dsa: Check return value of phy_connect_direct() Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 064/101] vfio/spapr: fail tce_iommu_attach_group() when iommu_data is null Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 065/101] virtio_net: fix PAGE_SIZE > 64k Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 066/101] vxlan: do not age static remote mac entries Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 067/101] ibmveth: Add a proper check for the availability of the checksum features Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 068/101] kernel/panic.c: add missing \n Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 069/101] HID: i2c-hid: Add sleep between POWER ON and RESET Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 070/101] scsi: lpfc: avoid double free of resource identifiers Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 071/101] spi: davinci: use dma_mapping_error() Greg Kroah-Hartman
2017-07-05 14:24 ` Ben Hutchings
2018-04-06 8:21 ` Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 072/101] arm64: assembler: make adr_l work in modules under KASLR Greg Kroah-Hartman
2017-07-04 9:24 ` Ard Biesheuvel
2017-07-04 9:29 ` Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 073/101] mac80211: initialize SMPS field in HT capabilities Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 074/101] x86/mpx: Use compatible types in comparison to fix sparse error Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 075/101] coredump: Ensure proper size of sparse core files Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 076/101] swiotlb: ensure that page-sized mappings are page-aligned Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 077/101] s390/ctl_reg: make __ctl_load a full memory barrier Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 078/101] be2net: fix status check in be_cmd_pmac_add() Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 079/101] perf probe: Fix to show correct locations for events on modules Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 080/101] net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 081/101] sctp: check af before verify address in sctp_addr_id2transport Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 082/101] ravb: Fix use-after-free on `ifconfig eth0 down` Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 083/101] jump label: fix passing kbuild_cflags when checking for asm goto support Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 084/101] xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 085/101] xfrm: NULL dereference on allocation failure Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 086/101] xfrm: Oops on error in pfkey_msg2xfrm_state() Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 087/101] watchdog: bcm281xx: Fix use of uninitialized spinlock Greg Kroah-Hartman
2017-07-03 13:35 ` Greg Kroah-Hartman [this message]
2017-07-03 13:35 ` [PATCH 4.4 089/101] ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 090/101] ARM: 8685/1: ensure memblock-limit is pmd-aligned Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 091/101] x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 092/101] x86/mm: Fix flush_tlb_page() on Xen Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 093/101] ocfs2: o2hb: revert hb threshold to keep compatible Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 094/101] iommu/vt-d: Dont over-free page table directories Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 095/101] iommu: Handle default domain attach failure Greg Kroah-Hartman
2017-07-05 18:52 ` Ben Hutchings
2017-07-03 13:35 ` [PATCH 4.4 096/101] iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid() Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 097/101] cpufreq: s3c2416: double free on driver init error path Greg Kroah-Hartman
2017-07-03 13:35 ` [PATCH 4.4 098/101] KVM: x86: fix emulation of RSM and IRET instructions Greg Kroah-Hartman
2017-07-03 19:37 ` [PATCH 4.4 000/101] 4.4.76-stable review Guenter Roeck
2017-07-04 8:00 ` Greg Kroah-Hartman
[not found] ` <595aa84b.4eec1c0a.8c9e2.e137@mx.google.com>
2017-07-04 8:02 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170703133349.059470829@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=efault@gmx.de \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=matt@codeblueprint.co.uk \
--cc=mingo@kernel.org \
--cc=morten.rasmussen@arm.com \
--cc=peterz@infradead.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=umgwanakikbuti@gmail.com \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox