linux-arm-kernel.lists.infradead.org archive mirror
From: Catalin Marinas <catalin.marinas@arm.com>
To: "Christoph Lameter (Ampere)" <cl@gentwo.org>
Cc: Yang Shi <yang@os.amperecomputing.com>,
	will@kernel.org, anshuman.khandual@arm.com, david@redhat.com,
	scott@os.amperecomputing.com,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [v5 PATCH] arm64: mm: force write fault for atomic RMW instructions
Date: Fri, 5 Jul 2024 19:24:56 +0100
Message-ID: <Zog6eFF1zDl4IRHX@arm.com>
In-Reply-To: <b0315df9-b122-46cd-12b2-7704d4a4392e@gentwo.org>

On Fri, Jul 05, 2024 at 10:05:29AM -0700, Christoph Lameter (Ampere) wrote:
> On Thu, 4 Jul 2024, Catalin Marinas wrote:
> > It could be worked around with a new flavour of get_user() that uses the
> > non-T LDR instruction and the user mapping is readable by the kernel
> > (that's the case with EPAN, prior to PIE and I think we can change this
> > for PIE configurations as well). But it adds to the complexity of this
> > patch when the kernel already offers a MADV_POPULATE_WRITE solution.
> 
> The use of MADV_POPULATE_WRITE here is arch specific and not a general
> solution. It requires specialized knowledge and research before someone can
> figure out that this particular trick is required on Linux ARM64 processors.
> The builders need to detect this special situation in the build process and
> activate this workaround.

Not really, see this OpenJDK commit:

https://github.com/openjdk/jdk/commit/a65a89522d2f24b1767e1c74f6689a22ea32ca6a

There's nothing about arm64 in there and it looks like the code prefers
MADV_POPULATE_WRITE if THPs are enabled (which is the case in all
enterprise distros). I can't tell whether the change was made to work
around the arm64 behaviour; there's no commit log (it was contributed by
Ampere).

There's a separate thread with the mm folk on the THP behaviour for
pmd_none() vs a pmd mapping the huge zero page, but it is more portable
for OpenJDK to use madvise() than to guess the kernel behaviour and
touch either small pages or a single large page. Even if one claims that
atomic_add(0) is portable across operating systems, the OpenJDK code was
already treating Linux as a special case in the presence of THP.
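
For reference, the madvise() approach can be sketched as below. This is
only a minimal illustration, not the OpenJDK code: populate_for_write()
is a hypothetical helper name, and the fallback define assumes the Linux
uapi value for MADV_POPULATE_WRITE (available since 5.14).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23	/* Linux uapi value; older libc headers may lack it */
#endif

/*
 * Hypothetical helper: prefault the page-aligned range [addr, addr + len)
 * for writing. Returns 0 on success, or -1 with errno set by madvise().
 * EINVAL typically means the kernel predates MADV_POPULATE_WRITE.
 */
static int populate_for_write(void *addr, size_t len)
{
	/* madvise() requires a page-aligned start address. */
	assert((uintptr_t)addr % (uintptr_t)sysconf(_SC_PAGESIZE) == 0);
	return madvise(addr, len, MADV_POPULATE_WRITE);
}
```

On failure the caller would have to fall back to touching the pages
itself, which is where the small-page vs large-page guessing comes in.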

> It would be much simpler to just merge the patch and be done with it.
> Otherwise this issue will continue to cause uncountably many hours of
> anguish for sysadmins and developers all over the Linux ecosystem trying to
> figure out what in the world is going on with ARM.

People will be happy until one enables execute-only ELF text sections in
a distro and all that opcode parsing will add considerable overhead for
many read faults (those with a writeable vma).

I'd also like to understand (probably have to re-read the older threads)
whether the overhead is caused mostly by the double fault or the actual
breaking of a THP. For the latter, the mm folk are willing to change the
behaviour so that pmd_none() and a pmd mapping the huge zero page are
treated similarly (i.e. allocate a huge page on write fault). If that's
good enough, I'd rather not merge this patch (or some form of it) and
wait for a proper fix in hardware in the future.
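
As a rough way to look at the double-fault part from user space, one can
compare minor-fault counts for write-only touching against
read-then-write touching of fresh anonymous memory. The sketch below
assumes Linux and getrusage(); minor_faults() and faults_for_touch() are
hypothetical names, a plain read followed by a write stands in for the
read fault followed by a write fault discussed here, and the counts
depend on the THP configuration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: minor faults taken so far by this process. */
static long minor_faults(void)
{
	struct rusage ru;

	getrusage(RUSAGE_SELF, &ru);
	return ru.ru_minflt;
}

/*
 * Hypothetical probe: map len bytes of fresh anonymous memory and touch
 * every base page. With read_first set, each page is read before it is
 * written. Returns the minor faults taken by the touch loop, or -1 if
 * mmap() fails.
 */
static long faults_for_touch(size_t len, int read_first)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	char *p;
	volatile char *vp;
	long before, faults;

	assert(len % page == 0);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	vp = p;

	before = minor_faults();
	for (size_t off = 0; off < len; off += page) {
		if (read_first) {
			char c = vp[off];	/* read access: may map a zero page */
			(void)c;
		}
		vp[off] = 1;			/* write access: needs a private page */
	}
	faults = minor_faults() - before;

	munmap(p, len);
	return faults;
}
```

Comparing faults_for_touch(len, 0) with faults_for_touch(len, 1), with
THP enabled and disabled, would give a first idea of how much of the
cost is the extra fault and how much is the THP handling.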

Just to be clear, there are still potential issues to address (or
understand the impact of) in this patch with exec-only mappings and
the performance gain _after_ the THP behaviour is changed in the mm
code. We can make a call once we have more data but, TBH, my inclination
is towards 'no' given that OpenJDK already supports madvise() and it's
not arm64 specific.

-- 
Catalin


