linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>,
	Peter Collingbourne <pcc@google.com>
Cc: linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	Suzuki K Poulose <Suzuki.Poulose@arm.com>,
	Steven Price <steven.price@arm.com>,
	Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Oliver Upton <oliver.upton@linux.dev>,
	Joey Gouly <joey.gouly@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>
Subject: Re: [PATCH v2 5/7] KVM: arm64: MTE: Use stage-2 NoTagAccess memory attribute if supported
Date: Tue, 28 Jan 2025 16:01:18 +0530	[thread overview]
Message-ID: <yq5afrl3cjrd.fsf@kernel.org> (raw)
In-Reply-To: <Z4e04P1bQlFBDHo7@arm.com>

Catalin Marinas <catalin.marinas@arm.com> writes:

> On Mon, Jan 13, 2025 at 12:47:54PM -0800, Peter Collingbourne wrote:
>> On Mon, Jan 13, 2025 at 11:09 AM Catalin Marinas
>> <catalin.marinas@arm.com> wrote:
>> > On Sat, Jan 11, 2025 at 06:49:55PM +0530, Aneesh Kumar K.V wrote:
>> > > Catalin Marinas <catalin.marinas@arm.com> writes:
>> > > > On Fri, Jan 10, 2025 at 04:30:21PM +0530, Aneesh Kumar K.V (Arm) wrote:
>> > > >> Currently, the kernel won't start a guest if the MTE feature is enabled
>> > >
>> > > ...
>> > >
>> > > >> @@ -2152,7 +2162,8 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>> > > >>            if (!vma)
>> > > >>                    break;
>> > > >>
>> > > >> -          if (kvm_has_mte(kvm) && !kvm_vma_mte_allowed(vma)) {
>> > > >> +          if (kvm_has_mte(kvm) &&
>> > > >> +              !kvm_has_mte_perm(kvm) && !kvm_vma_mte_allowed(vma)) {
>> > > >>                    ret = -EINVAL;
>> > > >>                    break;
>> > > >>            }
>> > > >
>> > > > I don't think we should change this, or at least not how it's done above
>> > > > (Suzuki raised a related issue internally relaxing this for VM_PFNMAP).
>> > > >
>> > > > For standard memory slots, we want to reject them upfront rather than
>> > > > deferring to the fault handler. An example here is file mmap() passed as
>> > > > standard RAM to the VM. It's an unnecessary change in behaviour IMHO.
>> > > > I'd only relax this for VM_PFNMAP mappings further down in this
>> > > > function (and move the VM_PFNMAP check above; see Suzuki's internal
>> > > > patch, unless he posted it publicly already).
>> > >
>> > > But we want to handle memslots backed by pagecache pages for virtio-shm
>> > > here (virtiofs dax use case).
>> >
>> > Ah, I forgot about this use case. So with virtiofs DAX, does a host page
>> > cache page (host VMM mmap()) get mapped directly into the guest as a
>> > separate memory slot? In this case, the host vma would not have
>> > VM_MTE_ALLOWED set.
>> >
>> > > With MTE_PERM, we can essentially skip the
>> > > kvm_vma_mte_allowed(vma) check because we handle all types in the fault
>> > > handler.
>> >
>> > This was pretty much the early behaviour when we added KVM support for
>> > MTE, allow !VM_MTE_ALLOWED and trap them later. However, we disallowed
>> > VM_SHARED because of some non-trivial race. Commit d89585fbb308 ("KVM:
>> > arm64: unify the tests for VMAs in memslots when MTE is enabled")
>> > changed this behaviour and the VM_MTE_ALLOWED check happens upfront. A
>> > subsequent commit removed the VM_SHARED check.
>> >
>> > It's a minor ABI change but I'm trying to figure out why we needed this
>> > upfront check rather than simply dropping the VM_SHARED check. Adding
>> > Peter in case he remembers. I can't see any race if we simply skipped
>> > this check altogether, irrespective of FEAT_MTE_PERM.
>> 
>> I don't see a problem with removing the upfront check. The reason I
>> kept the check was IIRC just that there was already a check there and
>> its logic needed to be adjusted for my VM_SHARED changes.
>
> Prior to commit d89585fbb308, kvm_arch_prepare_memory_region() only
> rejected a memory slot if VM_SHARED was set. This commit unified the
> checking with user_mem_abort(), with slots being rejected if
> (!VM_MTE_ALLOWED || VM_SHARED). A subsequent commit dropped the
> VM_SHARED check, so we ended up with memory slots being rejected only if
> !VM_MTE_ALLOWED (of course, if kvm_has_mte()). This wasn't the case
> before the VM_SHARED relaxation.
>
> So if you don't remember any strong reason for this change, I think we
> should go back to the original behaviour of deferring the VM_MTE_ALLOWED
> check to user_mem_abort() (and still permitting VM_SHARED).
>

Something as below?

From 466237a6f0a165152c157ab4a73f34c400cffe34 Mon Sep 17 00:00:00 2001
From: "Aneesh Kumar K.V (Arm)" <aneesh.kumar@kernel.org>
Date: Tue, 28 Jan 2025 14:21:52 +0530
Subject: [PATCH] KVM: arm64: Drop mte_allowed check during memslot creation

Before commit d89585fbb308 ("KVM: arm64: unify the tests for VMAs in
memslots when MTE is enabled"), kvm_arch_prepare_memory_region() only
rejected a memory slot if VM_SHARED was set. This commit unified the
checking with user_mem_abort(), with slots being rejected if either
VM_MTE_ALLOWED is not set or VM_SHARED set. A subsequent commit
c911f0d46879 ("KVM: arm64: permit all VM_MTE_ALLOWED mappings with MTE
enabled") dropped the VM_SHARED check, so we ended up with memory slots
being rejected if VM_MTE_ALLOWED is not set. This wasn't the case before
the commit d89585fbb308. The rejection of the memory slot with VM_SHARED
set was done to avoid a race condition with the test/set of the
PG_mte_tagged flag. Before Commit d77e59a8fccd ("arm64: mte: Lock a page
for MTE tag initialization") the kernel avoided allowing MTE with shared
pages, thereby preventing two tasks sharing a page from setting up the
PG_mte_tagged flag racily.

Commit d77e59a8fccd ("arm64: mte: Lock a page for MTE tag
initialization") further updated the locking so that the kernel
allows VM_SHARED mapping with MTE. With this commit, we can enable
memslot creation with VM_SHARED VMA mapping.

This patch results in a minor tweak to the ABI. We now allow creating
memslots that don't have the VM_MTE_ALLOWED flag set. If the guest uses
such a memslot with Allocation Tags, the kernel will generate -EFAULT.
ie, instead of failing early, we now fail later during KVM_RUN.

This change is needed because, without it, users are not able to use MTE
with VFIO passthrough, as shown below (kvmtool VMM).

[  617.921030] vfio-pci 0000:01:00.0: resetting
[  618.024719] vfio-pci 0000:01:00.0: reset done
  Error: 0000:01:00.0: failed to register region with KVM
  Warning: [0abc:aced] Error activating emulation for BAR 0
  Error: 0000:01:00.0: failed to configure regions
  Warning: Failed init: vfio__init

  Fatal: Initialisation failed

Signed-off-by: Aneesh Kumar K.V (Arm) <aneesh.kumar@kernel.org>
---
 arch/arm64/kvm/mmu.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 007dda958eab..610becd8574e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -2146,11 +2146,6 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 		if (!vma)
 			break;
 
-		if (kvm_has_mte(kvm) && !kvm_vma_mte_allowed(vma)) {
-			ret = -EINVAL;
-			break;
-		}
-
 		if (vma->vm_flags & VM_PFNMAP) {
 			/* IO region dirty page logging not allowed */
 			if (new->flags & KVM_MEM_LOG_DIRTY_PAGES) {
-- 
2.43.0




  reply	other threads:[~2025-01-28 10:33 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-10 11:00 [PATCH v2 0/7] Add support for NoTagAccess memory attribute Aneesh Kumar K.V (Arm)
2025-01-10 11:00 ` [PATCH v2 1/7] arm64: Update the values to binary from hex Aneesh Kumar K.V (Arm)
2025-01-10 13:11   ` Catalin Marinas
2025-01-10 11:00 ` [PATCH v2 2/7] KVM: arm64: MTE: Update code comments Aneesh Kumar K.V (Arm)
2025-01-10 13:11   ` Catalin Marinas
2025-01-10 11:00 ` [PATCH v2 3/7] arm64: cpufeature: add Allocation Tag Access Permission (MTE_PERM) feature Aneesh Kumar K.V (Arm)
2025-01-10 13:15   ` Catalin Marinas
2025-01-10 11:00 ` [PATCH v2 4/7] KVM: arm64: MTE: Add KVM_CAP_ARM_MTE_PERM Aneesh Kumar K.V (Arm)
2025-01-10 11:00 ` [PATCH v2 5/7] KVM: arm64: MTE: Use stage-2 NoTagAccess memory attribute if supported Aneesh Kumar K.V (Arm)
2025-01-10 18:20   ` Catalin Marinas
2025-01-11 13:19     ` Aneesh Kumar K.V
2025-01-13 19:09       ` Catalin Marinas
2025-01-13 20:47         ` Peter Collingbourne
2025-01-14  9:55           ` Suzuki K Poulose
2025-01-15 13:15           ` Catalin Marinas
2025-01-28 10:31             ` Aneesh Kumar K.V [this message]
2025-01-29 14:38               ` Catalin Marinas
2025-01-10 11:00 ` [PATCH v2 6/7] KVM: arm64: MTE: Nested guest support Aneesh Kumar K.V (Arm)
2025-01-10 11:00 ` [PATCH v2 7/7] KVM: arm64: Split some of the kvm_pgtable_prot bits into separate defines Aneesh Kumar K.V (Arm)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=yq5afrl3cjrd.fsf@kernel.org \
    --to=aneesh.kumar@kernel.org \
    --cc=Suzuki.Poulose@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=joey.gouly@arm.com \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=oliver.upton@linux.dev \
    --cc=pcc@google.com \
    --cc=steven.price@arm.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).