Date: Wed, 14 Aug 2024 16:38:14 -0700
From: Oliver Upton
To: Sean Christopherson
Cc: Jason Gunthorpe, Peter Xu, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Oscar Salvador, Axel Rasmussen,
    linux-arm-kernel@lists.infradead.org, x86@kernel.org, Will Deacon,
    Gavin Shan, Paolo Bonzini, Zi Yan, Andrew Morton, Catalin Marinas,
    Ingo Molnar, Alistair Popple, Borislav Petkov, David Hildenbrand,
    Thomas Gleixner, kvm@vger.kernel.org, Dave Hansen, Alex Williamson,
    Yan Zhao, Marc Zyngier
Subject: Re: [PATCH 00/19] mm: Support huge pfnmaps

On Wed, Aug 14, 2024 at 04:28:00PM -0700, Oliver Upton wrote:
> On Wed, Aug 14, 2024 at 01:54:04PM -0700, Sean Christopherson wrote:
> > TL;DR: it's probably worth looking at mmu_stress_test (was: max_guest_memory_test)
> > on arm64, specifically the mprotect() testcase[1], as performance is significantly
> > worse compared to x86,
>
> Sharing what we discussed offline:
>
> Sean was
> using a machine w/o FEAT_FWB for this test, so the increased
> runtime on arm64 is likely explained by the CMOs we're doing when
> creating or invalidating a stage-2 PTE.
>
> Using a machine w/ FEAT_FWB would be better for making these sort of
> cross-architecture comparisons. Beyond CMOs, we do have some

... some heavy barriers (e.g. DSB(ishst)) we use to ensure page table
updates are visible to the system. So there could still be some
arch-specific quirks that'll show up in the test.

> > and there might be bugs lurking the mmu_notifier flows.
>
> Impossible! :)
>
> > Jumping back to mmap_lock, adding a lock, vma_lookup(), and unlock in x86's page
> > fault path for valid VMAs does introduce a performance regression, but only ~30%,
> > not the ~6x jump from x86 to arm64. So that too makes it unlikely taking mmap_lock
> > is the main problem, though it's still good justification for avoid mmap_lock in
> > the page fault path.
>
> I'm curious how much of that 30% in a microbenchmark would translate to
> real world performance, since it isn't *that* egregious. We also have
> other uses for getting at the VMA beyond mapping granularity (MTE and
> the VFIO Normal-NC hint) that'd require some attention too.
>
> --
> Thanks,
> Oliver