From: Catalin Marinas <catalin.marinas@arm.com>
To: "Christoph Lameter (Ampere)" <cl@gentwo.org>
Cc: Yang Shi <yang@os.amperecomputing.com>,
will@kernel.org, anshuman.khandual@arm.com, david@redhat.com,
scott@os.amperecomputing.com,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [v5 PATCH] arm64: mm: force write fault for atomic RMW instructions
Date: Fri, 5 Jul 2024 19:24:56 +0100
Message-ID: <Zog6eFF1zDl4IRHX@arm.com>
In-Reply-To: <b0315df9-b122-46cd-12b2-7704d4a4392e@gentwo.org>
On Fri, Jul 05, 2024 at 10:05:29AM -0700, Christoph Lameter (Ampere) wrote:
> On Thu, 4 Jul 2024, Catalin Marinas wrote:
> > It could be worked around with a new flavour of get_user() that uses the
> > non-T LDR instruction and the user mapping is readable by the kernel
> > (that's the case with EPAN, prior to PIE and I think we can change this
> > for PIE configurations as well). But it adds to the complexity of this
> > patch when the kernel already offers a MADV_POPULATE_WRITE solution.
>
> The use of MADV_POPULATE_WRITE here is arch specific and not a general
> solution. It requires specialized knowledge and research before someone can
> figure out that this particular trick is required on Linux ARM64 processors.
> The builders need to detect this special situation in the build process and
> activate this workaround.
Not really, see this OpenJDK commit:
https://github.com/openjdk/jdk/commit/a65a89522d2f24b1767e1c74f6689a22ea32ca6a
There's nothing about arm64 in there and it looks like the code prefers
MADV_POPULATE_WRITE if THPs are enabled (which is the case in all
enterprise distros). I can't tell whether the change was made to work
around the arm64 behaviour; there's no commit log (it was contributed by
Ampere).
There's a separate thread with the mm folk on the THP behaviour for
pmd_none() vs pmd mapping the zero huge page but it is more portable for
OpenJDK to use madvise() than guess the kernel behaviour and touch small
pages or a single large page. Even if one claims that atomic_add(0) is
portable across operating systems, the OpenJDK code was already treating
Linux as a special case in the presence of THP.
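For illustration only, a user-space pre-touch along those lines might
look like the sketch below. This is not the OpenJDK code:
pretouch_region() is a hypothetical helper, the fallback loop is my
assumption of what "touch small pages" with an atomic add of 0 would
look like, and the fallback #define assumes the asm-generic uapi value
for MADV_POPULATE_WRITE (available since Linux 5.14).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23	/* asm-generic uapi value, Linux 5.14+ */
#endif

/*
 * Hypothetical helper: make [addr, addr + len) resident and writable.
 * Returns 0 if madvise(MADV_POPULATE_WRITE) did the work, 1 if the
 * per-page fallback was used instead.
 */
static int pretouch_region(void *addr, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t off;

	assert(page > 0);

	/* Preferred path: ask the kernel to take the write faults for us. */
	if (madvise(addr, len, MADV_POPULATE_WRITE) == 0)
		return 0;

	/*
	 * Fallback (e.g. older kernel): touch one byte per base page with
	 * an atomic add of 0, which leaves the contents unchanged. Whether
	 * this is a single write fault or a read fault followed by a write
	 * fault depends on the architecture.
	 */
	for (off = 0; off < len; off += (size_t)page)
		__atomic_fetch_add((volatile char *)addr + off, 0,
				   __ATOMIC_RELAXED);

	return 1;
}
```

A caller would mmap() an anonymous private region and pass the returned
address and length to pretouch_region() before handing the memory to
worker threads.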
> It would be much simpler to just merge the patch and be done with it.
> Otherwise this issue will continue to cause uncountably many hours of
> anguish for sysadmins and developers all over the Linux ecosystem trying to
> figure out what in the world is going on with ARM.
People will be happy until one enables execute-only ELF text sections in
a distro and all that opcode parsing will add considerable overhead for
many read faults (those with a writeable vma).
I'd also like to understand (probably have to re-read the older threads)
whether the overhead is caused mostly by the double fault or the actual
breaking of a THP. For the latter, the mm folk are willing to change the
behaviour so that pmd_none() and a pmd mapping the zero huge page are treated
similarly (i.e. allocate a huge page on write fault). If that's good
enough, I'd rather not merge this patch (or some form of it) and wait
for a proper fix in hardware in the future.
Just to be clear, there are still potential issues to address (or
understand the impact of) in this patch with exec-only mappings and
the performance gain _after_ the THP behaviour changed in the mm code.
We can make a call once we have more data but, TBH, my inclination is
towards 'no' given that OpenJDK already supports madvise() and it's not
arm64 specific.
--
Catalin