From: Catalin Marinas <catalin.marinas@arm.com>
To: "Christoph Lameter (Ampere)" <cl@gentwo.org>
Cc: Yang Shi <yang@os.amperecomputing.com>,
will@kernel.org, anshuman.khandual@arm.com, david@redhat.com,
scott@os.amperecomputing.com,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [v5 PATCH] arm64: mm: force write fault for atomic RMW instructions
Date: Fri, 5 Jul 2024 19:24:56 +0100
Message-ID: <Zog6eFF1zDl4IRHX@arm.com>
In-Reply-To: <b0315df9-b122-46cd-12b2-7704d4a4392e@gentwo.org>

On Fri, Jul 05, 2024 at 10:05:29AM -0700, Christoph Lameter (Ampere) wrote:
> On Thu, 4 Jul 2024, Catalin Marinas wrote:
> > It could be worked around with a new flavour of get_user() that uses the
> > non-T LDR instruction and the user mapping is readable by the kernel
> > (that's the case with EPAN, prior to PIE and I think we can change this
> > for PIE configurations as well). But it adds to the complexity of this
> > patch when the kernel already offers a MADV_POPULATE_WRITE solution.
>
> The use of MADV_POPULATE_WRITE here is arch specific and not a general
> solution. It requires specialized knowledge and research before someone can
> figure out that this particular trick is required on Linux ARM64 processors.
> The builders need to detect this special situation in the build process and
> activate this workaround.
Not really, see this OpenJDK commit:
https://github.com/openjdk/jdk/commit/a65a89522d2f24b1767e1c74f6689a22ea32ca6a
There's nothing about arm64 in there and it looks like the code prefers
MADV_POPULATE_WRITE if THPs are enabled (which is the case in all
enterprise distros). I can't tell whether the change was made to work
around the arm64 behaviour; there's no commit log (it was contributed by
Ampere).
There's a separate thread with the mm folk on the THP behaviour for
pmd_none() vs pmd mapping the zero huge page but it is more portable for
OpenJDK to use madvise() than guess the kernel behaviour and touch small
pages or a single large page. Even if one claims that atomic_add(0) is
portable across operating systems, the OpenJDK code was already treating
Linux as a special case in the presence of THP.
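
To illustrate the madvise() approach (a minimal sketch only, not the
OpenJDK code): the helper name pretouch_for_write() is hypothetical, the
fallback loop uses the GCC/Clang __atomic_fetch_add() builtin as a
stand-in for atomic_add(0), and falling back only on EINVAL is my
assumption about how one would handle kernels without
MADV_POPULATE_WRITE (it was added in Linux 5.14).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23	/* value from the Linux uapi headers */
#endif

/*
 * Hypothetical helper: make [addr, addr + len) writable-populated.
 * addr must be page aligned. Returns 0 on success, -1 on error.
 */
int pretouch_for_write(void *addr, size_t len)
{
	/* Preferred path: let the kernel take the write faults for us. */
	if (madvise(addr, len, MADV_POPULATE_WRITE) == 0)
		return 0;

	/* Assumption: EINVAL means the kernel does not know this advice. */
	if (errno != EINVAL)
		return -1;

	/*
	 * Fallback: touch one byte per page with an atomic add of 0, which
	 * leaves the contents unchanged. On arm64 this can take a read
	 * fault followed by a write fault, the case discussed in this
	 * thread.
	 */
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	char *p = addr;

	for (size_t off = 0; off < len; off += page)
		__atomic_fetch_add(&p[off], 0, __ATOMIC_RELAXED);

	return 0;
}
```

A caller would mmap() an anonymous region and pass it to this helper
before handing the memory to worker threads; nothing in it is arm64
specific.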
> It would be much simpler to just merge the patch and be done with it.
> Otherwise this issue will continue to cause uncountably many hours of
> anguish for sysadmins and developers all over the Linux ecosystem trying to
> figure out what in the world is going on with ARM.
People will be happy until someone enables execute-only ELF text sections
in a distro, at which point all that opcode parsing will add considerable
overhead for many read faults (those with a writeable vma).
I'd also like to understand (probably have to re-read the older threads)
whether the overhead is caused mostly by the double fault or the actual
breaking of a THP. For the latter, the mm folk are willing to change the
behaviour so that pmd_none() and pmd to the zero huge page are treated
similarly (i.e. allocate a huge page on write fault). If that's good
enough, I'd rather not merge this patch (or some form of it) and wait
for a proper fix in hardware in the future.
Just to be clear, there are still potential issues to address (or
understand the impact of) in this patch with exec-only mappings and
the performance gain _after_ the THP behaviour changed in the mm code.
We can make a call once we have more data but, TBH, my inclination is
towards 'no' given that OpenJDK already supports madvise() and it's not
arm64 specific.
--
Catalin