From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 28 Apr 2021 18:07:08 +0100
From: Catalin Marinas
To: Steven Price
Cc: "Dr. David Alan Gilbert", qemu-devel@nongnu.org, Marc Zyngier,
	Juan Quintela, Richard Henderson, linux-kernel@vger.kernel.org,
	Dave Martin, linux-arm-kernel@lists.infradead.org, Thomas Gleixner,
	Will Deacon, kvmarm@lists.cs.columbia.edu
Subject: Re: [PATCH v11 2/6] arm64: kvm: Introduce MTE VM feature
Message-ID: <20210428170705.GB4022@arm.com>
References: <20210416154309.22129-1-steven.price@arm.com>
	<20210416154309.22129-3-steven.price@arm.com>
In-Reply-To: <20210416154309.22129-3-steven.price@arm.com>
User-Agent: Mutt/1.10.1 (2018-07-13)
List-Id: Where KVM/ARM decisions are made

On Fri, Apr 16, 2021 at 04:43:05PM +0100, Steven Price wrote:
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 77cb2d28f2a4..5f8e165ea053 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -879,6 +879,26 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	if (vma_pagesize == PAGE_SIZE && !force_pte)
>  		vma_pagesize = transparent_hugepage_adjust(memslot, hva,
>  							   &pfn, &fault_ipa);
> +
> +	if (fault_status != FSC_PERM && kvm_has_mte(kvm) && !device &&
> +	    pfn_valid(pfn)) {

In the current implementation, device == !pfn_valid(), so we could skip
the latter check.

> +		/*
> +		 * VM will be able to see the page's tags, so we must ensure
> +		 * they have been initialised. if PG_mte_tagged is set, tags
> +		 * have already been initialised.
> +		 */
> +		unsigned long i, nr_pages = vma_pagesize >> PAGE_SHIFT;
> +		struct page *page = pfn_to_online_page(pfn);
> +
> +		if (!page)
> +			return -EFAULT;

I think that's fine, though maybe adding a comment that otherwise it
would be mapped at stage 2 as Normal Cacheable and we cannot guarantee
that the memory supports MTE tags.

> +
> +		for (i = 0; i < nr_pages; i++, page++) {
> +			if (!test_and_set_bit(PG_mte_tagged, &page->flags))
> +				mte_clear_page_tags(page_address(page));
> +		}
> +	}
> +
>  	if (writable)
>  		prot |= KVM_PGTABLE_PROT_W;

I probably asked already but is the only way to map a standard RAM page
(not device) in stage 2 via the fault handler? One case I had in mind
was something like get_user_pages() but it looks like that one doesn't
call set_pte_at_notify(). There are a few other places where
set_pte_at_notify() is called and these may happen before we got a
chance to fault on stage 2, effectively populating the entry (IIUC). If
that's an issue, we could move the above loop and check closer to the
actual pte setting like kvm_pgtable_stage2_map().

While the set_pte_at() race on the page flags is somewhat clearer, we
may still have a race here with the VMM's set_pte_at() if the page is
mapped as tagged. KVM has its own mmu_lock but it wouldn't be held when
handling the VMM page tables (well, not always, see below).

gfn_to_pfn_prot() ends up calling get_user_pages*(). At least the slow
path (hva_to_pfn_slow()) ends up with FOLL_TOUCH in gup and the VMM pte
would be set, tags cleared (if PROT_MTE) before the stage 2 pte. I'm not
sure whether get_user_page_fast_only() does the same.

The race with an mprotect(PROT_MTE) in the VMM is fine I think as the
KVM mmu notifier is invoked before set_pte_at() and racing with another
user_mem_abort() is serialised by the KVM mmu_lock. The subsequent
set_pte_at() would see the PG_mte_tagged set either by the current CPU
or by the one it was racing with.
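
For illustration only, the "clear the tags once per page" idiom in the
quoted loop can be modelled in user space roughly as below. This is a
minimal sketch under stated assumptions, not the kernel code: every
sketch_* name and SKETCH_* constant is hypothetical, C11 atomics stand
in for test_and_set_bit(), and a memset() stands in for
mte_clear_page_tags(). It does not model mmu_lock or the mmu notifier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical user-space stand-ins; none of these names exist in the kernel. */
#define SKETCH_PG_MTE_TAGGED	(1UL << 0)
#define SKETCH_TAG_BYTES	128

struct sketch_page {
	atomic_ulong flags;			/* models page->flags */
	unsigned char tags[SKETCH_TAG_BYTES];	/* models the page's tag storage */
};

/*
 * Models test_and_set_bit(): atomically set the bits in mask and
 * report whether they were already set beforehand.
 */
bool sketch_test_and_set(atomic_ulong *flags, unsigned long mask)
{
	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/*
 * Models the loop in the patch: for each page not yet marked as
 * tagged, mark it and zero its tags. Returns how many pages were
 * cleared by this call.
 */
unsigned long sketch_init_tags(struct sketch_page *pages, unsigned long nr_pages)
{
	unsigned long i, cleared = 0;

	assert(pages != NULL);

	for (i = 0; i < nr_pages; i++) {
		if (!sketch_test_and_set(&pages[i].flags, SKETCH_PG_MTE_TAGGED)) {
			memset(pages[i].tags, 0, sizeof(pages[i].tags));
			cleared++;
		}
	}
	return cleared;
}
```

Calling sketch_init_tags() a second time on the same pages clears
nothing, since every flag is already set. Note that in this shape the
flag becomes visible before the tags are zeroed, so the sketch says
nothing about what a concurrent observer of the flag may assume.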
-- 
Catalin