Date: Sat, 6 Jul 2024 10:47:17 +0100
From: Catalin Marinas
To: "Christoph Lameter (Ampere)"
Cc: Yang Shi, will@kernel.org, anshuman.khandual@arm.com, david@redhat.com,
    scott@os.amperecomputing.com, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org
Subject: Re: [v5 PATCH] arm64: mm: force write fault for atomic RMW instructions

On Fri, Jul 05, 2024 at 11:51:33AM -0700, Christoph Lameter (Ampere) wrote:
> On Fri, 5 Jul 2024, Catalin Marinas wrote:
> > People will be happy until one enables execute-only ELF text sections in
> > a distro and all that opcode parsing will add considerable overhead for
> > many read faults (those with a writeable vma).
>
> The opcode is in the l1 cache since we just faulted on it. There is no
> "considerable" overhead.

This has nothing to do with caches (and many Arm implementations still
have separate I and D caches). With the right linker flags (e.g.
--execute-only for lld), one can generate a PROT_EXEC only (no
PROT_READ) ELF text section. On newer Arm CPUs with FEAT_EPAN, the
kernel no longer forces PROT_READ on PROT_EXEC only mappings. The
get_user() in this patch to read the opcode will fault.
So instead of the two faults you get now for an atomic instruction,
you'd get three (the additional one for opcode reading). What's worse,
it affects standard loads as well in the same way. Yang Shi did test
this scenario but for correctness only, not performance. It would be
good to recompile the benchmark with --execute-only (or --rosegment I
think for gnu ld) and post the results.

> > Just to be clear, there are still potential issues to address (or
> > understand the impact of) in this patch with exec-only mappings and
> > the performance gain _after_ the THP behaviour changed in the mm code.
> > We can make a call once we have more data but, TBH, my inclination is
> > towards 'no' given that OpenJDK already support madvise() and it's not
> > arm64 specific.
>
> It is arm64 specific. Other Linux architectures have optimizations for
> similar issues in their arch code as mentioned in the patch or the
> processors will not double fault.
>
> Is there a particular reason for ARM as processor manufacturer to oppose
> this patch? We have mostly hand waving and speculations coming from you
> here.

Arm Ltd has no involvement at all in this decision (and probably if you
ask the architects, they wouldn't see any problem). Even though I have
an arm.com email address, my opinions on the list are purely from a
kernel maintainer perspective.

There's no speculation but some real concerns here. Please see above.

> What the patch does is clearly beneficial and it is an established way of
> implementing read->write fault handling.

It is clearly beneficial for this specific case but, as I said, we still
need to address the execute-only mappings causing an additional fault on
the opcode reading. You may not find many such binaries in the field now
but there's a strong push from security people to enable it (it's a
user-space decision, the kernel simply provides PROT_EXEC only
mappings).

In addition, there's a 24% performance overhead in one of Yang Shi's
results. This has not been explained.

-- 
Catalin