Date: Fri, 17 Jan 2025 18:52:39 +0000
From: Catalin Marinas
To: Jason Gunthorpe
Cc: Ankit Agrawal, David Hildenbrand, "maz@kernel.org",
    "oliver.upton@linux.dev", "joey.gouly@arm.com",
    "suzuki.poulose@arm.com", "yuzenghui@huawei.com", "will@kernel.org",
    "ryan.roberts@arm.com", "shahuang@redhat.com", "lpieralisi@kernel.org",
    Aniket Agashe, Neo Jia, Kirti Wankhede, "Tarun Gupta (SW-GPU)",
    Vikram Sethi, Andy Currid, Alistair Popple, John Hubbard,
    Dan Williams, Zhi Wang, Matt Ochs, Uday Dhoke, Dheeraj Nigam,
    "alex.williamson@redhat.com", "sebastianene@google.com",
    "coltonlewis@google.com", "kevin.tian@intel.com", "yi.l.liu@intel.com",
    "ardb@kernel.org", "akpm@linux-foundation.org", "gshan@redhat.com",
    "linux-mm@kvack.org", "kvmarm@lists.linux.dev", "kvm@vger.kernel.org",
    "linux-kernel@vger.kernel.org", "linux-arm-kernel@lists.infradead.org"
Subject: Re: [PATCH v2 1/1] KVM: arm64: Allow cacheable stage 2 mapping
    using VMA flags
References: <20250106165159.GJ5556@nvidia.com>
    <20250113162749.GN5556@nvidia.com>
    <0743193c-80a0-4ef8-9cd7-cb732f3761ab@redhat.com>
    <20250114133145.GA5556@nvidia.com>
    <20250115143213.GQ5556@nvidia.com>
    <20250117140050.GC5556@nvidia.com>
In-Reply-To: <20250117140050.GC5556@nvidia.com>

On Fri, Jan 17, 2025 at 10:00:50AM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 16, 2025 at 10:28:48PM +0000, Catalin Marinas wrote:
> > with FEAT_MTE_PERM (patches from Aneesh on the list). Or, a bigger
> > happen, disable MTE in guests (well, not that big, not many platforms
> > supporting MTE, especially in the enterprise space).
>
> As above, it seems we already effectively disable MTE in guests to use
> VFIO.

That's fine. Once the NoTagAccess feature gets in, we could allow MTE
here as well.

> > A second problem, similar to relaxing to Normal NC we merged last year,
> > we can't tell what allowing Stage 2 cacheable means (SError etc).
>
> That was a very different argument. On that series KVM was upgrading a
> VM with pgprot noncached to Normal NC, that upgrade was what triggered
> the discussions about SError.
>
> For this case the VMA is already pgprot cache. KVM is not changing
> anything. The KVM S2 will have the same Normal NC memory type as the
> VMA has in the S1. Thus KVM has no additional responsibility for
> safety here.

I agree this is safe. My point was more generic about not allowing
cacheable mappings without some sanity check. Ankit's patch relies on
the pgprot used on the S1 mapping to make this decision. Presumably the
pgprot is set by the host driver.

> > information. Checking vm_page_prot instead of a VM_* flag may work if
> > it's mapped in user space but this might not always be the case.
>
> For this series it is only about mapping VMAs. Some future FD based
> mapping for CC is going to also need similar metadata.. I have another
> thread about that :)

How soon would you need that and if you come up with a different
mechanism, shouldn't we unify them early rather than having two methods?

> > I don't see how VM_PFNMAP alone can tell us anything about the
> > access properties supported by a device address range. Either way,
> > it's the driver setting vm_page_prot or some VM_* flag. KVM has no
> > clue, it's just a memory slot.
>
> I think David's point about VM_PFNMAP was to avoid some of the
> pfn_valid() logic. If we get VM_PFNMAP we just assume it is non-struct
> page and follow the VMA's pgprot.

Ah, ok, thanks for the clarification.

> > A third aspect, more of a simplification when reasoning about this, was
> > to use FWB at Stage 2 to force cacheability and not care about cache
> > maintenance, especially when such range might be mapped both in user
> > space and in the guest.
>
> Yes, I thought we needed this anyhow as KVM can't cache invalidate
> non-struct page memory..

Looks good. I think Ankit should post a new series factoring out the
exec handling in a separate patch, dropping some of the pfn_valid()
assumptions and we take it from there.

I also think some sanity check should be done early in
kvm_arch_prepare_memory_region() like rejecting the slot if it's
cacheable and we don't have FWB. But I'll leave this decision to the KVM
maintainers. We are trying to relax some stuff here as well (see the
NoTagAccess thread).

--
Catalin
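
For illustration only, the early sanity check suggested above could look
roughly like the untested sketch below. It is a standalone model, not
kernel code: struct slot_info, its fields and check_slot() are
hypothetical stand-ins for whatever kvm_arch_prepare_memory_region()
would derive from the VMA, and limiting the check to VM_PFNMAP ranges is
an assumption taken from the discussion about following the VMA's pgprot
for non-struct-page memory.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical summary of the VMA backing a memory slot. These fields
 * are illustrative only and do not exist under these names in KVM.
 */
struct slot_info {
	bool vm_pfnmap;	/* VMA has VM_PFNMAP: assume non-struct-page memory */
	bool cacheable;	/* pgprot set by the host driver is Normal cacheable */
};

/*
 * Reject a cacheable VM_PFNMAP slot when Stage 2 FWB is unavailable,
 * since KVM cannot do cache maintenance on non-struct-page memory.
 * Returns 0 if the slot is acceptable, -EINVAL otherwise.
 */
static int check_slot(const struct slot_info *slot, bool has_fwb)
{
	if (slot->vm_pfnmap && slot->cacheable && !has_fwb)
		return -EINVAL;

	return 0;
}
```

With this shape, a non-cacheable VM_PFNMAP slot and ordinary struct-page
backed memory are accepted regardless of FWB; only the cacheable
VM_PFNMAP case without FWB is refused up front.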