From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C293CC77B75 for ; Mon, 22 May 2023 18:35:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:Cc:To:From:Subject:Message-ID: References:Mime-Version:In-Reply-To:Date:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Owner; bh=oDho7ZROrlIMtp4jVhXdKVs02GfvqbttVV4Ag9MyP44=; b=XrgWW8vwLbG6py6MnEeNhyqBCc h98ejgOLNgvAcPM0fnXCbZV4dh4FEvc/a8dvpt0tca4R0XCKdB4f5QyINPXpvDKQeM9/htWX7Y8T8 A2BhTTy7OOkFvRcRMH3nDm6n/Mgao4ekPuhYM75Ck00GZX79x8Q201mE5r5CdvsKlcjIi1sQU92E2 RkHl5xz3ht4AVOThTC65M55XINViSzlpYnhn4KZXb/jiA7j0G6Xxq7AbYafLN6P/tqCzNg9XGZ8qA fS5t7W7F6bH/PAzBzeDqpjKbDliIEDOv8pJQKVlGHbCKsILU9uqIC9rDeGCQr2ZHb7/5mKsOVhvzp k5UFa9yA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1q1AMo-007XJc-1u; Mon, 22 May 2023 18:34:46 +0000 Received: from mail-pl1-x649.google.com ([2607:f8b0:4864:20::649]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1q1AMl-007XIG-0X for linux-arm-kernel@lists.infradead.org; Mon, 22 May 2023 18:34:44 +0000 Received: by mail-pl1-x649.google.com with SMTP id d9443c01a7336-1ae777f993dso20878045ad.1 for ; Mon, 22 May 2023 11:34:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1684780480; x=1687372480; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ncyyvzCwaBiOGfdAK57eDSyGsVrCv17gynkqmLbIh7Q=; b=cGPQFY6F+JsGLicyWmQ7+nx3BT8FDTky0Wpm3XMm47C67teffHRLLERrhhaXB78/CL ntljpUrXT68Ppi5/Br2bFbLfK6HZXVLGLYn6zwBMkZqCqOHJE8c6nTf4PHZT3kOdALRD WYT++yPJw8eQcaZW/JCptRL2KmXZBG59VSW+dbF+1yGNjbzEvGw3WCB6+aR/o2UoFfMD 3Q48kdKs3/9fpMluVeCEMRk7evs7OTjc7hyrDlbNEi7WL2d2onFS7my6C8fDbmseK36M m4pArVT+6V0srD5bk1rZT2WYLS9kqpm+kH9gzo69MoI63Q2Bq2DN4xM3T0ZgHxCuj8pL CMdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684780480; x=1687372480; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ncyyvzCwaBiOGfdAK57eDSyGsVrCv17gynkqmLbIh7Q=; b=XmRTTLc1gLvBxBv4Jny+nQ+RUKegKc9WUG0xPCe5XMqPgCG7VouLAl2469UUe5ikJz 4x4I9juvnJBnWcQ6LkVCxdbZO2sA8XKkShe0HCmX3V68dsqc99bWJqq/9KYIChB8dsU/ C4GPwnQvq/c+VcTa4uavByVOxBOyt9znZj7YsjJkasMB2Mm21fQStCctmiMuL6l+yKih /TKrq+cmj277SY3aOcphBWOfm7eicTKIQ8gsUTj/vChCrjMOBRvXQA35VhfEfoxjg+KG 9wlFSAXafEZBdknGER6WpT3AaY+yZZs53M3RY6+LqHMhN1EvV0U2pAlJsIpvS/6a6c2r 4R1A== X-Gm-Message-State: AC+VfDyHHRSSzgXc0u5sJ1DPabSAVBVUpd4i/2QGhEpe0H4/HSoc5tFK fn1fW0HDjs8B08Iqi7FHnVXbBRrM7go= X-Google-Smtp-Source: ACHHUZ4vYpOScPsGTy+iBltm4E7gLuJQusAKoigKV11l60H7+6wz2QnxSdUf70tvT+CQhoqlL+09UUYI3KI= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:902:cad2:b0:1ad:ec0b:42f9 with SMTP id y18-20020a170902cad200b001adec0b42f9mr2629780pld.6.1684780479772; Mon, 22 May 2023 11:34:39 -0700 (PDT) Date: Mon, 22 May 2023 11:34:38 -0700 In-Reply-To: <20230522063725.284686-1-apopple@nvidia.com> Mime-Version: 1.0 References: <20230522063725.284686-1-apopple@nvidia.com> Message-ID: Subject: Re: [PATCH] mmu_notifiers: Notify on pte permission upgrades From: Sean Christopherson To: Alistair Popple Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, robin.murphy@arm.com, will@kernel.org, nicolinc@nvidia.com, linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, jgg@nvidia.com, John Hubbard X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230522_113443_222458_B0F2FC39 X-CRM114-Status: GOOD ( 35.85 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Mon, May 22, 2023, Alistair Popple wrote: > Some architectures, specifically ARM and perhaps Sparc and IA64, > require TLB invalidates when upgrading pte permission from read-only > to read-write. > > The current mmu_notifier implementation assumes that upgrades do not > need notifications. Typically though mmu_notifiers are used to > implement TLB invalidations for secondary MMUs that comply with the > main CPU architecture. > > Therefore if the main CPU architecture requires an invalidation for > permission upgrade the secondary MMU will as well and an mmu_notifier > should be sent for the upgrade. > > Currently CPU invalidations for permission upgrade occur in > ptep_set_access_flags(). Unfortunately MMU notifiers cannot be called > directly from this architecture specific code as the notifier > callbacks can sleep, and ptep_set_access_flags() is usually called > whilst holding the PTL spinlock. Therefore add the notifier calls > after the PTL is dropped and only if the PTE actually changed. This > will allow secondary MMUs to obtain an updated PTE with appropriate > permissions. > > This problem was discovered during testing of an ARM SMMU > implementation that does not support broadcast TLB maintenance > (BTM). In this case the SMMU driver uses notifiers to issue TLB > invalidates. For read-only to read-write pte upgrades the SMMU > continually returned a read-only PTE to the device, even though the > CPU had a read-write PTE installed. > > Sending a mmu notifier event to the SMMU driver fixes the problem by > flushing secondary TLB entries. A new notifier event type is added so > drivers may filter out these invalidations if not required. Note a > driver should never upgrade or install a PTE in response to this mmu > notifier event as it is not synchronised against other PTE operations. > > Signed-off-by: Alistair Popple > --- > include/linux/mmu_notifier.h | 6 +++++ > mm/hugetlb.c | 24 ++++++++++++++++++- > mm/memory.c | 45 ++++++++++++++++++++++++++++++++++-- > 3 files changed, 72 insertions(+), 3 deletions(-) > > diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h > index d6c06e140277..f14d68f119d8 100644 > --- a/include/linux/mmu_notifier.h > +++ b/include/linux/mmu_notifier.h > @@ -31,6 +31,11 @@ struct mmu_interval_notifier; > * pages in the range so to mirror those changes the user must inspect the CPU > * page table (from the end callback). > * > + * @MMU_NOTIFY_PROTECTION_UPGRAGE: update is due to a change from read-only to > + * read-write for pages in the range. This must not be used to upgrade > + * permissions on secondary PTEs, rather it should only be used to invalidate > + * caches such as secondary TLBs that may cache old read-only entries. This is a poor fit for invalidate_range_{start,end}(). All other uses bookend the primary MMU update, i.e. call start() _before_ changing PTEs. The comments are somewhat stale as they talk only about "unmapped", but the contract between the primary MMU and the secondary MMU is otherwise quite clear on when the primary MMU will invoke start() and end(). * invalidate_range_start() is called when all pages in the * range are still mapped and have at least a refcount of one. * * invalidate_range_end() is called when all pages in the * range have been unmapped and the pages have been freed by * the VM. I'm also confused as to how this actually fixes ARM's SMMU. Unless I'm looking at the wrong SMMU implementation, the SMMU implemenents only invalidate_range(), not the start()/end() variants. static const struct mmu_notifier_ops arm_smmu_mmu_notifier_ops = { .invalidate_range = arm_smmu_mm_invalidate_range, .release = arm_smmu_mm_release, .free_notifier = arm_smmu_mmu_notifier_free, }; Again from include/linux/mmu_notifier.h, not implementing the start()/end() hooks is perfectly valid. And AFAICT, the existing invalidate_range() hook is pretty much a perfect fit for what you want to achieve. * If invalidate_range() is used to manage a non-CPU TLB with * shared page-tables, it not necessary to implement the * invalidate_range_start()/end() notifiers, as * invalidate_range() already catches the points in time when an * external TLB range needs to be flushed. For more in depth * discussion on this see Documentation/mm/mmu_notifier.rst Even worse, this change may silently regress performance for secondary MMUs that haven't yet taken advantage of the event type, e.g. KVM will zap all of KVM's PTEs in response to the upgrade, instead of waiting until the guest actually tries to utilize the new protections. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel