From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by smtp.subspace.kernel.org (Postfix) with ESMTPS id DEA691401A;
    Mon, 8 Jan 2024 11:04:55 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id CA6B7C433C7;
    Mon, 8 Jan 2024 11:04:49 +0000 (UTC)
Date: Mon, 8 Jan 2024 11:04:47 +0000
From: Catalin Marinas
To: Oliver Upton
Cc: Lorenzo Pieralisi , Jason Gunthorpe , ankita@nvidia.com,
    maz@kernel.org, suzuki.poulose@arm.com, yuzenghui@huawei.com,
    will@kernel.org, alex.williamson@redhat.com, kevin.tian@intel.com,
    yi.l.liu@intel.com, ardb@kernel.org, akpm@linux-foundation.org,
    gshan@redhat.com, linux-mm@kvack.org, aniketa@nvidia.com,
    cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com,
    vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
    jhubbard@nvidia.com, danw@nvidia.com, mochs@nvidia.com,
    kvmarm@lists.linux.dev, kvm@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    james.morse@arm.com
Subject: Re: [PATCH v3 2/2] kvm: arm64: set io memory s2 pte as normalnc for vfio pci devices
Message-ID:
References: <20231208164709.23101-1-ankita@nvidia.com> <20231208164709.23101-3-ankita@nvidia.com> <20231212181156.GO3014157@nvidia.com>
Precedence: bulk
X-Mailing-List: kvm@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:

On Fri, Jan 05, 2024 at 08:42:31PM +0000, Oliver Upton wrote:
> On Thu, Dec 21, 2023 at 01:19:18PM +0000, Catalin Marinas wrote:
> > > Apologies, I didn't mean to question what's going on here from the
> > > hardware POV. My concern was more from the kernel + user interfaces POV,
> > > this all seems to work (specifically for PCI) by maintaining an
> > > intentional mismatch between the VFIO stage-1 and KVM stage-2 mappings.
> >
> > If you stare at it long enough, the mismatch starts to look fine ;).
> > Even if you have the VFIO stage 1 Normal NC, KVM stage 2 Normal NC, you
> > can still have the guest setting stage 1 to Device and introduce an
> > architectural mismatch. These aliases have some bad reputation but the
> > behaviour is constrained architecturally.
> >
> > IMHO we should move on from this attribute mismatch since we can't fully
> > solve it anyway and focus instead on what the device, system can
> > tolerate, who's responsible for deciding which MMIO ranges can be mapped
> > as Normal NC.
>
> Fair enough :) The other slightly unsavory part is that we're baking
> the mapping policy into KVM. I'd prefer it if this policy were kept in
> userspace somehow, but there's no actual usecase for userspace selecting
> memory attributes at this point.

If by policy you mean who's deciding the write-combining relaxation,
this series moved it to the vfio-pci host driver. KVM only picks the
appropriate memory type for stage 2 based on the vma flags. That's
Normal NC in the absence of anything better on arm64 and it does more
than just write-combining but we can describe what this new VM_* flag
allows.

If we want to keep this decision strictly in user space, we can do it
with some ioctl(). The downside is that the host kernel now puts more
trust in the user VMM, so my preference would be to keep this in the
vfio driver. Or we can do both, vfio-pci allows the relaxation, the VMM
tells KVM to go for a more relaxed stage 2 via an ioctl().

-- 
Catalin
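
For illustration only, the "do both" option could be sketched roughly as below. This is a standalone C model, not the actual KVM or vfio-pci code. The flag names, the enum, and pick_s2_memtype() are all hypothetical, invented here to show the shape of the decision: the host driver marks a vma as tolerating the relaxation, and the VMM separately opts in (for example via some ioctl()) before KVM uses Normal NC at stage 2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for vma flags; not real kernel definitions. */
#define VMA_FLAG_MMIO            (1UL << 0) /* assumed: mapping is device MMIO, not RAM */
#define VMA_FLAG_ALLOW_NORMAL_NC (1UL << 1) /* assumed: set by the host driver (e.g. vfio-pci) */

/* Simplified stage 2 memory types for this sketch. */
enum s2_memtype { S2_NORMAL_WB, S2_NORMAL_NC, S2_DEVICE };

/*
 * Pick the stage 2 memory type for a mapping.
 * - RAM is mapped Normal Write-Back.
 * - MMIO is relaxed to Normal NC only if the host driver allowed it via the
 *   vma flag AND the VMM opted in (vmm_opt_in models the ioctl() request).
 * - Anything else stays Device.
 */
static enum s2_memtype pick_s2_memtype(unsigned long vma_flags, bool vmm_opt_in)
{
    if (!(vma_flags & VMA_FLAG_MMIO))
        return S2_NORMAL_WB;
    if ((vma_flags & VMA_FLAG_ALLOW_NORMAL_NC) && vmm_opt_in)
        return S2_NORMAL_NC;
    return S2_DEVICE;
}

/* Human-readable name, handy when logging the decision. */
static const char *s2_memtype_name(enum s2_memtype t)
{
    switch (t) {
    case S2_NORMAL_WB: return "Normal WB";
    case S2_NORMAL_NC: return "Normal NC";
    case S2_DEVICE:    return "Device";
    }
    assert(0 && "unknown stage 2 memory type");
    return "?";
}
```

In this model the VMM opt-in alone never relaxes anything: without the driver-set flag the mapping stays Device, which is the point about not having to trust the user VMM. Dropping the vmm_opt_in check gives the driver-only variant preferred above.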