Date: Mon, 06 May 2024 17:12:53 +0100
Message-ID: <86y18mq5q2.wl-maz@kernel.org>
From: Marc Zyngier
To: Sergio Lopez Pascual
Cc: Eric Curtin, Will Deacon, Hector Martin, Catalin Marinas,
	Mark Rutland, Zayd Qumsieh, Justin Lu, Ryan Houdek, Mark Brown,
	Ard Biesheuvel, Mateusz Guzik, Anshuman Khandual, Oliver Upton,
	Miguel Luis, Joey Gouly, Christoph Paasch, Kees Cook,
	Sami Tolvanen, Baoquan He, Joel Granados, Dawei Li,
	Andrew Morton, Florent Revest, David Hildenbrand, Stefan Roesch,
	Andy Chiu, Josh Triplett, Oleg Nesterov, Helge Deller, Zev Weiss,
	Ondrej Mosnacek, Miguel Ojeda,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Asahi Linux
Subject: Re: [PATCH 0/4] arm64: Support the TSO memory model
References: <20240411-tso-v1-0-754f11abfbff@marcan.st>
	<20240411132853.GA26481@willie-the-truck>
	<28ab55b3-e699-4487-b332-f1f20a6b22a1@marcan.st>
	<20240419165809.GA4020@willie-the-truck>

On Mon, 06 May 2024 12:21:40 +0100,
Sergio Lopez Pascual wrote:
>
> Eric Curtin writes:
>
> > On Fri, 19 Apr 2024 at 18:08, Will Deacon wrote:
> >>
> >> On Thu, Apr 11, 2024 at 11:19:13PM +0900, Hector Martin wrote:
> >> > On 2024/04/11 22:28, Will Deacon wrote:
> >> > >   * Some binaries in a distribution exhibit instability which goes away
> >> > >     in TSO mode, so a taskset-like program is used to run them with TSO
> >> > >     enabled.
> >> >
> >> > Since the flag is cleared on execve, this third one isn't generally
> >> > possible as far as I know.
> >>
> >> Ah ok, I'd missed that. Thanks.
> >>
> >> > > In all these cases, we end up with native arm64 applications that will
> >> > > either fail to load or will crash in subtle ways on CPUs without the TSO
> >> > > feature. Assuming that the application cannot be fixed, a better
> >> > > approach would be to recompile using stronger instructions (e.g.
> >> > > LDAR/STLR) so that at least the resulting binary is portable. Now, it's
> >> > > true that some existing CPUs are TSO by design (this is a perfectly
> >> > > valid implementation of the arm64 memory model), but I think there's a
> >> > > big difference between quietly providing more ordering guarantees than
> >> > > software may be relying on and providing a mechanism to discover,
> >> > > request and ultimately rely upon the stronger behaviour.
> >> >
> >> > The problem is "just" using stronger instructions is much more
> >> > expensive, as emulators have demonstrated. If TSO didn't serve a
> >> > practical purpose I wouldn't be submitting this, but it does. This is
> >> > basically non-negotiable for x86 emulation; if this is rejected
> >> > upstream, it will forever live as a downstream patch used by the entire
> >> > gaming-on-Mac-Linux ecosystem (and this is an ecosystem we are very
> >> > explicitly targeting, given our efforts with microVMs for 4K page size
> >> > support and the upcoming Vulkan drivers).
>
> In addition to the use case Hector exposed here, there's another,
> potentially larger one, which is running x86_64 containers on aarch64
> systems, using a combination of both Virtualization and emulation.
>
> In this scenario, both not being able to use TSO for emulation
> and having to enable it all the time for the whole VM have a very large
> impact on performance (~25% on some workloads).

Well, there is always a price to pay somewhere, and this is the usual
trade-off between performance and maintainability.

> I understand the concern about the risk of userspace fragmentation, but
> I was wondering if we could minimize it to an acceptable level by
> narrowing down the context. For instance, since both use cases we're
> bringing to the table imply the use of Virtualization, we should be able
> to restrict PR_SET_MEM_MODEL to only be accepted when running on EL1
> (and not in nVHE nor pKVM), returning EINVAL otherwise. This would
> heavily discourage users from relying on this feature for native
> applications that can run on arbitrary contexts, hence drastically
> reducing the fragmentation risk.

As I explained in another sub-thread[1], I am not prepared to allow
non architectural state to be exposed to a guest. I'm also not
prepared to make significant ABI differences between VHE, nVHE, hVHE,
with or without pKVM, because the job of the kernel is to abstract
those differences.

> We would still need a way to ensure the trap gets to the VMM and for
> the VMM to operate on the impdef ACTLR_EL12, but that should be dealt on
> a different series.

The VMM can't use ACTLR_EL12, by the very definition of this register
(the clue is in the name). You'd have to proxy the write in the kernel
and context-switch it, which means adding non-architectural state to
KVM, breaking VM migration and adding more kludges to the existing
Apple-specific host crap.

Also, let's realise that we are talking about making significant
changes to the arm64 ABI for a platform that is still not fully
supported in the upstream kernel. I have the feeling that changing the
memory model dynamically may not be of the utmost priority until then.

Thanks,

	M.

[1] https://lore.kernel.org/all/867cgcqrb9.wl-maz@kernel.org

-- 
Without deviation from the norm, progress is not possible.
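
Editorial illustration, not part of the original message: a minimal
sketch of the portable alternative Will mentions in the quoted text,
where ordering is requested explicitly instead of relying on TSO
hardware behaviour. It uses C11 acquire/release atomics, which arm64
compilers typically lower to STLR and LDAR (or LDAPR). The names
payload, ready, publish and try_consume are hypothetical and do not
come from the thread or the patch series.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state for a one-shot message-passing example. */
static int payload;
static atomic_bool ready; /* zero-initialised, i.e. false */

/* Producer: write the data, then publish it with a release store.
 * On arm64 this store is typically emitted as STLR. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: an acquire load (typically LDAR or LDAPR on arm64).
 * If it observes ready == true, the earlier write to payload is
 * guaranteed to be visible as well. Returns false if nothing has
 * been published yet. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

Code written this way behaves the same on TSO and non-TSO arm64 CPUs.
The cost Hector describes arises when an emulator has to apply this
kind of ordering to every translated x86 load and store.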