From: "qi.fuli@fujitsu.com" <qi.fuli@fujitsu.com>
To: Will Deacon <will@kernel.org>,
"qi.fuli@fujitsu.com" <qi.fuli@fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>,
"indou.takao@fujitsu.com" <indou.takao@fujitsu.com>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"peterz@infradead.org" <peterz@infradead.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Jonathan Corbet <corbet@lwn.net>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH 0/2] arm64: Introduce boot parameter to disable TLB flush instruction within the same inner shareable domain
Date: Wed, 3 Jul 2019 02:45:43 +0000
Message-ID: <5999ed84-72d0-9d42-bf7d-b8d56eaa4d4a@jp.fujitsu.com>
In-Reply-To: <20190627102724.vif6zh6zfqktpmjx@willie-the-truck>
Hi Will,
Thanks for your comments.
On 6/27/19 7:27 PM, Will Deacon wrote:
> On Mon, Jun 24, 2019 at 10:34:02AM +0000, qi.fuli@fujitsu.com wrote:
>> On 6/18/19 2:03 AM, Will Deacon wrote:
>>> On Mon, Jun 17, 2019 at 11:32:53PM +0900, Takao Indoh wrote:
>>>> From: Takao Indoh <indou.takao@fujitsu.com>
>>>>
>>>> I found a performance issue related to the implementation of Linux's TLB
>>>> flush for arm64.
>>>>
>>>> When I run a single-threaded test program in a moderate environment, it
>>>> usually takes 39ms to finish its work. However, when I put a small
>>>> application, which just calls mprotect() continuously, on one of the
>>>> sibling cores and run it simultaneously, the test program slows down
>>>> significantly. It becomes 49ms (125%) on ThunderX2. I also detected the
>>>> same problem on ThunderX1 and Fujitsu A64FX.
>>> This is a problem for any applications that share hardware resources with
>>> each other, so I don't think it's something we should be too concerned about
>>> addressing unless there is a practical DoS scenario, which there doesn't
>>> appear to be in this case. It may be that the real answer is "don't call
>>> mprotect() in a loop".
>> I think there has been a misunderstanding; please let me explain.
>> This application is just an example used to reproduce the
>> performance issue we found.
>> Our original purpose with this series is to reduce OS jitter.
>> OS jitter on massively parallel processing systems has been known
>> and studied for many years.
>> An OS jitter of 2.5% can result in over a factor of 20 slowdown for the
>> same application [1].
> I think it's worth pointing out that the system in question was neither
> ARM-based nor running Linux, so I'd be cautious in applying the conclusions
> of that paper directly to our TLB invalidation code. Furthermore, the noise
> being generated in their experiments uses a timer interrupt, which has a
> /vastly/ different profile to a DVM message in terms of both system impact
> and frequency.
My original purpose in referencing this paper was to explain that OS jitter
is a vital issue for large-scale HPC environments.
Please allow me to describe the issue that occurred in our HPC environment.
We ran FWQ [1] on one node of our HPC environment. We expected the maximum
OS jitter to be tens of microseconds, but it was hundreds of microseconds,
which did not meet our requirement. We tried to find the cause using
ftrace, but we could not find any process that would cause the noise; we
could only see that the processing time was extended. We then checked the
CPU instruction count through the CPU PMU and did not find any change
there either. However, we found that the noise increased as the number of
TLB flush calls increased. From this we understood that the cause of the
issue is the implementation of Linux's TLB flush for arm64, in particular
the use of the TLBI-is instruction, which is broadcast to all processor
cores on the system. Therefore, we made this patch set to fix the issue.
After testing it several times, the noise was reduced and our original
goal was achieved, so we do think this patch makes sense.
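To make the measurement above concrete, here is a minimal sketch of a
fixed-work-quantum style sampler. It is not the FWQ benchmark source; the
function names (now_ns, fixed_work, fwq_jitter_ns) and the choice of work
loop are assumptions made for illustration only. It times identical work
quanta and reports the spread between the fastest and slowest one.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* One fixed quantum of work; volatile keeps the loop from being elided. */
void fixed_work(uint64_t iterations)
{
    volatile uint64_t sink = 0;

    for (uint64_t i = 0; i < iterations; i++)
        sink += i;
}

/*
 * Run `samples` identical work quanta, store each elapsed time in
 * out_ns[], and return (max - min) in nanoseconds. Returns 0 when
 * samples is 0. (Hypothetical helper, not part of FWQ.)
 */
uint64_t fwq_jitter_ns(uint64_t *out_ns, size_t samples, uint64_t iterations)
{
    uint64_t min = UINT64_MAX, max = 0;

    for (size_t s = 0; s < samples; s++) {
        uint64_t t0 = now_ns();

        fixed_work(iterations);
        uint64_t dt = now_ns() - t0;

        out_ns[s] = dt;
        if (dt < min)
            min = dt;
        if (dt > max)
            max = dt;
    }
    return samples ? max - min : 0;
}
```

Calling fwq_jitter_ns() on a computing CPU while another core issues
frequent TLB flushes, and again while it is idle, gives a rough comparison
of the noise in the two cases.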
As I mentioned, OS jitter is a vital issue for large-scale HPC
environments. We tried many things to reduce it. One of them is task
separation between the CPUs used for computing and the CPUs used for
maintenance. All of the daemon processes and I/O interrupts are bound to
the maintenance CPUs. Furthermore, we used nohz_full to avoid the noise
caused by interrupting the computing CPUs. However, all of the CPUs were
affected by the TLBI-is instruction, so the task separation did not work.
Therefore, with this patch set we would like the TLB flush to be performed
on the minimum set of CPUs, in order to reduce OS jitter.
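As an illustration of the task separation described above, the following
sketch pins the calling thread to a single CPU with sched_setaffinity(2).
The helper name pin_to_cpu is hypothetical, and this is only one of several
ways to bind tasks to CPUs; it is not taken from our setup or from the
patch set.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/*
 * Pin the calling thread to `cpu`. Returns 0 on success, or -1 with
 * errno set on failure. (Hypothetical helper for illustration.)
 */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE) {
        errno = EINVAL;
        return -1;
    }

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);

    /* A pid of 0 means the calling thread. */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Pinning only controls where the task is scheduled. It does not stop a
broadcast TLB invalidation issued on another CPU from reaching the pinned
CPU, which is the problem described above.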
[1] https://asc.llnl.gov/sequoia/benchmarks/FTQ_summary_v1.1.pdf
Thanks,
QI Fuli
>> Though it may be an extreme example, reducing OS jitter has been an
>> issue in HPC environments.
>>
>> [1] Ferreira, Kurt B., Patrick Bridges, and Ron Brightwell.
>> "Characterizing application sensitivity to OS interference using
>> kernel-level noise injection." Proceedings of the 2008 ACM/IEEE
>> conference on Supercomputing. IEEE Press, 2008.
>>
>>>> I suppose the root cause of this issue is the implementation of Linux's TLB
>>>> flush for arm64, especially the use of the TLBI-is instruction, which is
>>>> broadcast to all processor cores on the system. In the above situation,
>>>> TLBI-is is issued as a result of mprotect().
>>> On the flip side, Linux is providing the hardware with enough information
>>> not to broadcast to cores for which the remote TLBs don't have entries
>>> allocated for the ASID being invalidated. I would say that the root cause
>>> of the issue is that this filtering is not taking place.
>> Do you mean that the filter should be implemented in hardware?
> Yes. If you're building a large system and you care about "jitter", then
> you either need to partition it in such a way that sources of noise are
> contained, or you need to introduce filters to limit their scope. Rewriting
> the low-level memory-management parts of the operating system is a red
> herring and imposes a needless burden on everybody else without solving
> the real problem, which is that contended use of shared resources doesn't
> scale.
>
> Will
Thread overview: 18+ messages
2019-06-17 14:32 [PATCH 0/2] arm64: Introduce boot parameter to disable TLB flush instruction within the same inner shareable domain Takao Indoh
2019-06-17 14:32 ` [PATCH 1/2] arm64: mm: Restore mm_cpumask (revert commit 38d96287504a ("arm64: mm: kill mm_cpumask usage")) Takao Indoh
2019-07-23 11:55 ` Catalin Marinas
2019-06-17 14:32 ` [PATCH 2/2] arm64: tlb: Add boot parameter to disable TLB flush within the same inner shareable domain Takao Indoh
2019-07-23 12:11 ` Catalin Marinas
2019-06-17 17:03 ` [PATCH 0/2] arm64: Introduce boot parameter to disable TLB flush instruction " Will Deacon
2019-06-24 10:34 ` qi.fuli
2019-06-27 10:27 ` Will Deacon
2019-07-03 2:45 ` qi.fuli [this message]
2019-07-09 0:25 ` Jon Masters
2019-07-09 0:29 ` Jon Masters
2019-07-09 8:03 ` Will Deacon
2019-07-09 8:07 ` Will Deacon
2019-11-01 9:56 ` qi.fuli
2019-11-01 17:28 ` Will Deacon
2019-11-26 14:26 ` Matthias Brugger
2019-11-26 14:36 ` Will Deacon
2019-12-01 16:02 ` Jon Masters