Date: Fri, 01 Apr 2022 17:48:59 +0100
Message-ID: <87tubcbvgk.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: xieming <xieming@kylinos.cn>
Cc: linux@armlinux.org.uk, catalin.marinas@arm.com, will@kernel.org,
	alex.williamson@redhat.com, sashal@kernel.org,
	kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] kvm/arm64: fixed passthrough gpu into vm on arm64
In-Reply-To: <20220401090828.614167-1-xieming@kylinos.cn>
References: <20220401090828.614167-1-xieming@kylinos.cn>

Xieming,

This is the second time I fix email addresses for you. Next time, I
simply won't bother replying.

On Fri, 01 Apr 2022 10:08:28 +0100,
xieming wrote:
>
> when passthrough some pcie device, such as gpus(including
> Nvidia and AMD),kvm will report:"Unsupported FSC: EC=0x24
> xFSC=0x21 ESR_EL2=0x92000061" err.the main reason is vfio

I have asked you to describe how you get there, and you still haven't
bothered replying.

> ioremap vga memory type by DEVICE_nGnRnE, and kvm setting
> memory type to PAGE_S2_DEVICE(DEVICE_nGnRE), but in guestos,
> all of device io memory type when ioremapping (including gpu
> driver TTM memory type) is setting to MT_NORMAL_NC.
>
> according to ARM64 stage1&stage2 conbining rules.
> memory type attributes combining rules:
> Normal-WB DevicenGnRE
> Normal-WB is weakest,Device-nGnRnE is strongest.
>
> refferring to 'Arm Architecture Reference Manual Armv8,
> for Armv8-A architecture profile' pdf, chapter B2.8
> refferring to 'ARM System Memory Management Unit Architecture
> Specification SMMU architecture version 3.0 and version 3.1' pdf,
> chapter 13.1.5
>
> therefore, the I/O memory attribute of the VM is setting to
> DevicenGnRE maybe is a mistake. it causes all device memory
> accessing in the virtual machine must be aligned.
>
> To summarize: stage2 memory type cannot be stronger than stage1
> in arm64 archtechture.

You are plain wrong.
It can, and most of the time, it *must*.

>
> Signed-off-by: xieming
> ---
>  arch/arm/include/asm/kvm_mmu.h        |  3 ++-
>  arch/arm64/include/asm/kvm_mmu.h      |  3 ++-
>  arch/arm64/include/asm/memory.h       |  4 +++-
>  arch/arm64/include/asm/pgtable-prot.h |  2 +-
>  drivers/vfio/pci/vfio_pci.c           |  7 +++++++
>  virt/kvm/arm/mmu.c                    | 19 ++++++++++++++++---
>  virt/kvm/arm/vgic/vgic-v2.c           |  2 +-
>  7 files changed, 32 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 523c499e42db..5c7869d25b62 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h

This file has been removed from the tree *over two years ago*.

> @@ -64,7 +64,8 @@ void stage2_unmap_vm(struct kvm *kvm);
>  int kvm_alloc_stage2_pgd(struct kvm *kvm);
>  void kvm_free_stage2_pgd(struct kvm *kvm);
>  int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
> -			  phys_addr_t pa, unsigned long size, bool writable);
> +			  phys_addr_t pa, unsigned long size,
> +			  bool writable, bool writecombine);
>
>  int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run);
>
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index b2558447c67d..3f98286c7498 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -158,7 +158,8 @@ void stage2_unmap_vm(struct kvm *kvm);
>  int kvm_alloc_stage2_pgd(struct kvm *kvm);
>  void kvm_free_stage2_pgd(struct kvm *kvm);
>  int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
> -			  phys_addr_t pa, unsigned long size, bool writable);
> +			  phys_addr_t pa, unsigned long size,
> +			  bool writable, bool writecombine);

NAK. For a start, there is no such thing as 'write-combine' in the ARM
architecture, and I'm not convinced you can equate WC to Normal-NC.
See the previous discussion at [1].

[1] https://lore.kernel.org/r/20210429162906.32742-1-sdonthineni@nvidia.com

[...]

> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 51b791c750f1..6f66efb71743 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -1452,7 +1452,14 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
>  	}
>
>  	vma->vm_private_data = vdev;
> +#ifdef CONFIG_ARM64
> +	if (vfio_pci_is_vga(pdev))
> +		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
> +	else
> +		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

No. That's completely unacceptable. Who says that some VGA (who the
hell implements VGA these days?) implies any sort of attribute other
than device memory? This may work for your particular device under
your own circumstances. Can it be generalised? No.

And as Jason pointed out, this is likely to break userspace.

> +#else
>  	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> +#endif
>  	vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
>
>  	/*
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 11103b75c596..a46a58696834 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -206,6 +206,17 @@ static inline void kvm_pgd_populate(pgd_t *pgdp, pud_t *pudp)
>  	dsb(ishst);
>  }
>
> +/**
> + * is_vma_write_combine - check if VMA is mapped with writecombine or not
> + * Return true if VMA mapped with MT_NORMAL_NC otherwise fasle
> + */
> +static inline bool is_vma_write_combine(struct vm_area_struct *vma)
> +{
> +	pteval_t pteval = pgprot_val(vma->vm_page_prot);
> +
> +	return ((pteval & PTE_ATTRINDX_MASK) == PTE_ATTRINDX(MT_NORMAL_NC));
> +}

Again, you are making tons of assumptions here, none of which are
acceptable as is.

	M.

-- 
Without deviation from the norm, progress is not possible.
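
For readers following the stage 1 / stage 2 argument, here is a rough
sketch of the combining behaviour being discussed. It is not kernel
code: the enum, combine_s1_s2() and alignment_enforced() are
hypothetical names made up for illustration. It assumes stage 2 forced
write-back (FEAT_S2FWB) is not in use, and it flattens Normal memory
into a single cacheability level instead of tracking inner and outer
attributes separately. Under those assumptions the effective type is
the more restrictive of the two stages, so a stage 2 Device mapping
wins over a guest's stage 1 Normal-NC mapping, and unaligned guest
accesses then take an alignment fault.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical, simplified list of memory types, ordered from the
 * least restrictive ("weakest") to the most restrictive ("strongest").
 * These names are illustrative and do not exist in the kernel.
 */
enum mem_type {
	MEM_NORMAL_WB,
	MEM_NORMAL_WT,
	MEM_NORMAL_NC,
	MEM_DEVICE_GRE,
	MEM_DEVICE_nGRE,
	MEM_DEVICE_nGnRE,
	MEM_DEVICE_nGnRnE,
	MEM_TYPE_COUNT
};

/*
 * Combine the stage 1 (guest) and stage 2 (hypervisor) types.
 * Assumption: no FEAT_S2FWB, so the result is simply whichever of the
 * two types is more restrictive.
 */
static inline enum mem_type combine_s1_s2(enum mem_type s1, enum mem_type s2)
{
	assert(s1 < MEM_TYPE_COUNT && s2 < MEM_TYPE_COUNT);
	return s1 > s2 ? s1 : s2;
}

/* Unaligned accesses to any Device type generate an alignment fault. */
static inline bool alignment_enforced(enum mem_type t)
{
	return t >= MEM_DEVICE_GRE;
}
```

With this model, combine_s1_s2(MEM_NORMAL_NC, MEM_DEVICE_nGnRE) yields
MEM_DEVICE_nGnRE, which is the situation the patch description runs
into: stage 2 being stronger than stage 1 is the normal case for
device passthrough, not a violation of the architecture.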