From: Dave Jones <davej@codemonkey.org.uk>
To: Linux Kernel <linux-kernel@vger.kernel.org>
Cc: peterz@infradead.org, mgorman@techsingularity.net,
mingo@kernel.org, Linus Torvalds <torvalds@linux-foundation.org>
Subject: weird loadavg on idle machine post 5.7
Date: Thu, 2 Jul 2020 13:15:48 -0400
Message-ID: <20200702171548.GA11813@codemonkey.org.uk>
When I upgraded my firewall to 5.7-rc2, I noticed that on a mostly
idle machine (one that usually sees loadavg hover in the 0.xx range),
loadavg was consistently above 1.00 even when nothing was running.
All that perf showed was the kernel was spending time in the idle loop
(and running perf).
For the first hour or so after boot, everything seems fine, but over
time loadavg creeps up, and once it has established a new baseline, it
never seems to drop below that again.
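
To illustrate why a baseline that moves in whole-number steps is suspicious, here is a minimal sketch of the load-average arithmetic. It assumes the fixed-point constants and update rule from the kernel's loadavg header as I understand them (FSHIFT, EXP_1, calc_load). The simulate() helper and the idea of a permanently "stuck" task count are hypothetical, for illustration only, and not a diagnosis of this bug.

```python
# Fixed-point constants, assumed to match the kernel's loadavg header.
FSHIFT = 11
FIXED_1 = 1 << FSHIFT   # 1.0 in fixed point
EXP_1 = 1884            # 1/exp(5sec/1min) in fixed point
SAMPLE_SECS = 5         # the load is sampled roughly every 5 seconds


def calc_load(load, exp, active):
    """One update step: load = load*exp + active*(1 - exp)."""
    newload = load * exp + active * (FIXED_1 - exp)
    if active >= load:
        newload += FIXED_1 - 1  # round up while the load is rising
    return newload // FIXED_1


def simulate(active_tasks, seconds, load=0):
    """Hypothetical helper: 1-minute loadavg after `seconds` with a
    constant number of tasks counted as active."""
    active = active_tasks * FIXED_1
    for _ in range(seconds // SAMPLE_SECS):
        load = calc_load(load, EXP_1, active)
    return load / FIXED_1


if __name__ == "__main__":
    # A truly idle box versus one where the count is stuck at 1 or 7.
    for stuck in (0, 1, 7):
        print(stuck, simulate(stuck, 3600))
```

Under these assumptions, each task that is permanently counted as active raises the idle baseline by exactly 1.00, which is at least consistent with a baseline that steps up and never comes back down.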
One morning I woke up to find loadavg at '7.xx', after almost as many
hours of uptime, which makes me wonder if perhaps this is triggered
by something in cron. I have a bunch of scripts that fire off
every hour that involve thousands of shortlived runs of iptables/ipset,
but running them manually didn't seem to automatically trigger the bug.
Given it took a few hours of runtime to confirm good/bad, bisecting this
took the last two weeks. I ran the bisect four times; the first
produced bogus results from over-eager 'good' marks, but the last two
runs both implicated this commit:
commit c6e7bd7afaeb3af55ffac122828035f1c01d1d7b (refs/bisect/bad)
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Sun May 24 21:29:55 2020 +0100

    sched/core: Optimize ttwu() spinning on p->on_cpu

    Both Rik and Mel reported seeing ttwu() spend significant time on:

      smp_cond_load_acquire(&p->on_cpu, !VAL);

    Attempt to avoid this by queueing the wakeup on the CPU that owns the
    p->on_cpu value. This will then allow the ttwu() to complete without
    further waiting.

    Since we run schedule() with interrupts disabled, the IPI is
    guaranteed to happen after p->on_cpu is cleared, this is what makes it
    safe to queue early.

    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: Jirka Hladky <jhladky@redhat.com>
    Cc: Vincent Guittot <vincent.guittot@linaro.org>
    Cc: valentin.schneider@arm.com
    Cc: Hillf Danton <hdanton@sina.com>
    Cc: Rik van Riel <riel@surriel.com>
    Link: https://lore.kernel.org/r/20200524202956.27665-2-mgorman@techsingularity.net

Unfortunately it doesn't revert cleanly on top of rc3 so I haven't
confirmed 100% that it's the cause yet, but the two separate bisects
seem promising.
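
Since an over-eager 'good' is what spoiled the first bisect, here is a rough sketch of how the good/bad call for each step could be made less error-prone. It assumes the 1-minute value is the first field of /proc/loadavg and that samples are taken while the machine is idle. The helper names, the 1.00 floor, and the window sizes are all hypothetical choices, not something used for the bisects described above.

```python
def parse_loadavg(text):
    """Return (1min, 5min, 15min) from the contents of /proc/loadavg."""
    fields = text.split()
    return tuple(float(f) for f in fields[:3])


def verdict(samples, floor=1.0, window=6, min_samples=12):
    """Classify a boot from idle 1-minute loadavg samples.

    'bad'     : the last `window` samples never dropped below `floor`
    'unknown' : not bad yet, but too few samples to call it good
    'good'    : at least `min_samples` taken and no stuck baseline
    """
    recent = samples[-window:]
    if len(recent) == window and min(recent) >= floor:
        return "bad"
    if len(samples) < min_samples:
        return "unknown"
    return "good"


if __name__ == "__main__":
    with open("/proc/loadavg") as f:
        print(parse_loadavg(f.read()))
```

With one sample every half hour, for example, min_samples=12 would refuse to report 'good' before six hours of uptime, while 'bad' can be reported as soon as the baseline has stayed at or above the floor for a full window.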
I don't see any obvious correlation between what's changing there and
the symptoms (other than "scheduler magic") but maybe those closer to
this have ideas about what could be going awry?
Dave