From: Marc Zyngier <maz@kernel.org>
To: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
Oliver Upton <oliver.upton@linux.dev>,
Joey Gouly <joey.gouly@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>, Will Deacon <will@kernel.org>,
Suzuki K Poulose <Suzuki.Poulose@arm.com>,
Steven Price <steven.price@arm.com>,
Peter Collingbourne <pcc@google.com>
Subject: Re: [PATCH] KVM: arm64: Drop mte_allowed check during memslot creation
Date: Mon, 24 Feb 2025 17:23:38 +0000
Message-ID: <86ikozqmsl.wl-maz@kernel.org>
In-Reply-To: <yq5aseo3gund.fsf@kernel.org>

On Mon, 24 Feb 2025 16:44:06 +0000,
Aneesh Kumar K.V <aneesh.kumar@kernel.org> wrote:
>
> Marc Zyngier <maz@kernel.org> writes:
>
> > On Mon, 24 Feb 2025 14:39:16 +0000,
> > Catalin Marinas <catalin.marinas@arm.com> wrote:
> >>
> >> On Mon, Feb 24, 2025 at 12:24:14PM +0000, Marc Zyngier wrote:
> >> > On Mon, 24 Feb 2025 11:05:33 +0000,
> >> > Catalin Marinas <catalin.marinas@arm.com> wrote:
> >> > > On Mon, Feb 24, 2025 at 03:09:38PM +0530, Aneesh Kumar K.V (Arm) wrote:
> >> > > > This change is needed because, without it, users are not able to use MTE
> >> > > > with VFIO passthrough (currently the mapping is either Device or
> >> > > > NonCacheable, for which the tag access check is not applied), as shown
> >> > > > below (kvmtool VMM).
> >> > >
> >> > > Another nit: "users are not able to use VFIO passthrough when MTE is
> >> > > enabled". At a first read, the above sounded to me like one wants to
> >> > > enable MTE for VFIO passthrough mappings.
> >> >
> >> > What the commit message doesn't spell out is how MTE and VFIO are
> >> > interacting here. I also don't understand the reference to Device or
> >> > NC memory here.
> >>
> >> I guess it's saying that the guest cannot turn MTE on (Normal Tagged)
> >> for these ranges anyway since Stage 2 is Device or Normal NC. So we
> >> don't break any use-case specific to VFIO.
> >>
> >> > Isn't the issue that DMA doesn't check/update tags, and therefore it
> >> > makes little sense to prevent non-tagged memory being associated with
> >> > a memslot?
> >>
> >> The issue is that some MMIO memory range that does not support MTE
> >> (well, all MMIO) could be mapped by the guest as Normal Tagged and we
> >> have no clue what the hardware does as tag accesses, hence we currently
> >> prevent it altogether. It's not about DMA.
> >>
> >> This patch still prevents such MMIO+MTE mappings but moves the decision
> >> to user_mem_abort() and it's slightly more relaxed - only rejecting it
> >> if !VM_MTE_ALLOWED _and_ the Stage 2 is cacheable. The side-effect is
> >> that it allows device assignment into the guest since Stage 2 is not
> >> Normal Cacheable (at least for now, we have some patches from Ankit but they
> >> handle the MTE case).
> >
> > The other side effect is that it also allows non-tagged cacheable
> > memory to be given to the MTE-enabled guest, and the guest has no way
> > to distinguish between what is tagged and what's not.
> >
> >>
> >> > My other concern is that this gives pretty poor consistency to the
> >> > guest, which cannot know what can be tagged and what cannot, and
> >> > breaks a guarantee that the guest should be able to rely on.
> >>
> >> The guest should not set Normal Tagged on anything other than what it
> >> gets as standard RAM. We are not changing this here. KVM then needs to
> >> prevent a broken/malicious guest from setting MTE on other (physical)
> >> ranges that don't support MTE. Currently it can only do this by forcing
> >> Device or Normal NC (or disable MTE altogether). Later we'll add
> >> FEAT_MTE_PERM to permit Stage 2 Cacheable but trap on tag accesses.
> >>
> >> The ABI change is just for the VMM, the guest shouldn't be aware as
> >> long as it sticks to the typical recommendations for MTE - only enable
> >> on standard RAM.
> >
> > See above. You fall into the same trap with standard memory, since you
> > now allow userspace to mix things at will, and only realise something
> > has gone wrong on access (and -EFAULT is not very useful).
> >
> >>
> >> Does any VMM rely on the memory slot being rejected on registration if
> >> it does not support MTE? After this change, we'd get an exit to the VMM
> >> on guest access with MTE turned on (even if it's not mapped as such at
> >> Stage 1).
> >
> > I really don't know what userspace expects w.r.t. mixing tagged and
> > non-tagged memory. But I don't expect anything good to come out of it,
> > given that we provide zero information about the fault context.
> >
> > Honestly, if we are going to change this, then let's make sure we give
> > enough information for userspace to go and fix the mess. Not just "it
> > all went wrong".
> >
>
> What if we trigger a memory fault exit with the TAGACCESS flag, allowing
> the VMM to use the GPA to retrieve additional details and print extra
> information to aid in analysis? BTW, we will do this on the first fault
> in cacheable, non-tagged memory even if there is no tagaccess in that
> region. This can be further improved using the NoTagAccess series I
> posted earlier, which ensures the memory fault exit occurs only on
> actual tag access.
>
> Something like below?

Something like that, only with:

- a capability informing userspace of this behaviour
- a per-VM (or per-VMA) flag as a buy-in for that behaviour
- the relaxation is made conditional on the memslot not being memory
  (i.e. really MMIO-only).

and keep the current behaviour otherwise.
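
For concreteness, the conditions above can be modelled as a small
decision function. This is only a sketch under stated assumptions: the
names used (vm_model, memslot_model, check_memslot, relaxed_mte_optin)
are hypothetical and are not kernel APIs. The capability is assumed to
be how userspace discovers that the buy-in flag exists, and is not
modelled here.

```c
#include <stdbool.h>

/* Hypothetical model of the policy; none of these names exist in KVM. */
enum memslot_verdict {
	MEMSLOT_ACCEPT,
	MEMSLOT_REJECT,
};

struct vm_model {
	bool mte_enabled;	/* the guest has MTE enabled */
	bool relaxed_mte_optin;	/* assumed per-VM buy-in flag set by the VMM */
};

struct memslot_model {
	bool mte_allowed;	/* backing VMA has VM_MTE_ALLOWED */
	bool is_memory;		/* backed by memory rather than MMIO */
};

static enum memslot_verdict check_memslot(const struct vm_model *vm,
					  const struct memslot_model *slot)
{
	/* Nothing to check without MTE, or when the backing can be tagged. */
	if (!vm->mte_enabled || slot->mte_allowed)
		return MEMSLOT_ACCEPT;

	/* Relaxation: only with the buy-in, and only for MMIO-only slots. */
	if (vm->relaxed_mte_optin && !slot->is_memory)
		return MEMSLOT_ACCEPT;

	/* Otherwise keep the current behaviour: reject at registration. */
	return MEMSLOT_REJECT;
}
```

With this shape, a VMM that never sets the flag sees no change, and
non-tagged memory is still refused up front for an MTE-enabled guest
rather than faulting on access.
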
Thanks,
M.
--
Without deviation from the norm, progress is not possible.