From: Mel Gorman <mgorman@techsingularity.net>
To: Peter Zijlstra <peterz@infradead.org>, Will Deacon <will@kernel.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Subject: Loadavg accounting error on arm64
Date: Mon, 16 Nov 2020 09:10:54 +0000
Message-ID: <20201116091054.GL3371@techsingularity.net>
Hi,
I got cc'd on an internal bug report filed against 5.8 and 5.9 kernels
that loadavg was "exploding" on arm64 on machines acting as build
servers. It happened on at least two different arm64 variants. That setup
is complex to replicate but fortunately it can be reproduced by running
hackbench-process-pipes while heavily overcommitting a machine with 96
logical CPUs and then checking if loadavg drops afterwards. With an
MMTests clone, I reproduced it as follows:
./run-mmtests.sh --config configs/config-workload-hackbench-process-pipes --no-monitor testrun; \
for i in `seq 1 60`; do cat /proc/loadavg; sleep 60; done
Load should drop to 10 after about 10 minutes and it does on x86-64 but
remained at around 200+ on arm64.
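To make the "did loadavg drop?" check less manual, something like the
following could be used on each line printed by the loop above. It is only
a sketch: parse_loadavg() and load_has_settled() are hypothetical helper
names, and the threshold is an assumption the caller supplies (for example
the 10 mentioned above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse the 1, 5 and 15 minute load averages from one line of
 * /proc/loadavg, e.g. "200.13 198.02 150.40 3/1234 5678".
 * Returns 0 on success, -1 if the line does not start with three numbers.
 */
int parse_loadavg(const char *line, double avg[3])
{
	assert(line != NULL && avg != NULL);
	if (sscanf(line, "%lf %lf %lf", &avg[0], &avg[1], &avg[2]) != 3)
		return -1;
	return 0;
}

/*
 * Hypothetical check: is the 1-minute average at or below the level
 * expected once the workload has finished? Returns 0 on a parse error.
 */
int load_has_settled(const char *line, double threshold)
{
	double avg[3];

	if (parse_loadavg(line, avg) != 0)
		return 0;
	return avg[0] <= threshold;
}
```

With a threshold of 10.0, a line such as "200.13 198.02 150.40 3/1234 5678"
is reported as not settled, which is the arm64 behaviour described above.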
The reproduction case simply hammers the case where a task can be
descheduling while also being woken by another task at the same time. It
takes a long time to run but it makes the problem very obvious. The
expectation is that loadavg drops again once hackbench, which has been
running and saturating the machine for a long time, has finished.
Commit dbfb089d360b ("sched: Fix loadavg accounting race") fixed a loadavg
accounting race in the generic case. Later it was documented why the
ordering of when p->sched_contributes_to_load is read/updated relative
to p->on_cpu matters. This is critical when a task is descheduling at the
same time it is being activated on another CPU. While the loads/stores
happen under the RQ lock, the RQ lock on its own does not give any
guarantees on the task state.
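The ordering requirement can be illustrated with a userspace C11 analogue.
This is not the kernel code: struct toy_task, toy_deschedule() and
toy_try_wake() are made-up names, and where a real waker would wait for
on_cpu to clear, this sketch simply reports failure. It only shows the
release/acquire pairing that makes the flag safe to read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two task fields discussed above. */
struct toy_task {
	bool contributes_to_load;	/* plain field, like p->sched_contributes_to_load */
	atomic_int on_cpu;		/* like p->on_cpu: non-zero while still on the old CPU */
};

/* Descheduling side: write the flag first, then release on_cpu. */
void toy_deschedule(struct toy_task *t, bool uninterruptible)
{
	assert(t != NULL);
	t->contributes_to_load = uninterruptible;
	/* Release: the store above is visible to anyone who acquire-loads on_cpu == 0. */
	atomic_store_explicit(&t->on_cpu, 0, memory_order_release);
}

/*
 * Waking side: read the flag only after observing on_cpu == 0 with
 * acquire semantics. Returns false if the task is still on its CPU,
 * in which case the flag must not be trusted yet.
 */
bool toy_try_wake(struct toy_task *t, bool *contributed)
{
	assert(t != NULL && contributed != NULL);
	if (atomic_load_explicit(&t->on_cpu, memory_order_acquire) != 0)
		return false;
	*contributed = t->contributes_to_load;
	return true;
}
```

If either side dropped its ordering (a relaxed store or a relaxed load),
the waker could in principle see on_cpu == 0 and still read a stale flag,
which is the kind of accounting error being chased here.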
Over the weekend I convinced myself that it must be because the
implementations of smp_load_acquire and smp_store_release do not appear
to implement acquire/release semantics, because I didn't find anything
arm64-specific that was playing with p->state behind the scheduler's back
(I could have missed it if it was in an assembly portion as I can't
reliably read arm assembler). Similarly, it's not clear why the arm64
implementation does not call smp_acquire__after_ctrl_dep in the
smp_load_acquire implementation. Even when it was introduced, the arm64
implementation differed significantly from the arm implementation in
terms of what barriers it used, for non-obvious reasons.
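For reference, an acquire load can be built in two broadly equivalent
ways: as a single load that carries acquire semantics, or as a plain load
followed by a barrier. The C11 sketch below shows both forms in userspace.
The function names are hypothetical, these are not the kernel macros, and
the sketch makes no claim about which form any particular architecture
uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Form 1: a single load with acquire semantics. */
int load_acquire_direct(atomic_int *p)
{
	assert(p != NULL);
	return atomic_load_explicit(p, memory_order_acquire);
}

/* Form 2: a relaxed load followed by an acquire fence. */
int load_acquire_fenced(atomic_int *p)
{
	int v;

	assert(p != NULL);
	v = atomic_load_explicit(p, memory_order_relaxed);
	/* Accesses after this fence cannot be reordered before the load above. */
	atomic_thread_fence(memory_order_acquire);
	return v;
}
```

Both forms give the caller the same ordering guarantee for later accesses;
they differ only in how the barrier is expressed.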
Unfortunately, making that work similarly to the arch-independent version
did not help, and it doesn't help that I know nothing about the arm64
memory model.
I'll be looking again today to see if I can find a mistake in the ordering
of how sched_contributes_to_load is handled but again, the lack of
knowledge of the arm64 memory model means I'm a bit stuck and a second
set of eyes would be nice :(
--
Mel Gorman
SUSE Labs