From: Oliver Upton <oliver.upton@linux.dev>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Marc Zyngier <maz@kernel.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Ankit Agrawal <ankita@nvidia.com>,
"joey.gouly@arm.com" <joey.gouly@arm.com>,
"suzuki.poulose@arm.com" <suzuki.poulose@arm.com>,
"yuzenghui@huawei.com" <yuzenghui@huawei.com>,
"will@kernel.org" <will@kernel.org>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"shahuang@redhat.com" <shahuang@redhat.com>,
"lpieralisi@kernel.org" <lpieralisi@kernel.org>,
"david@redhat.com" <david@redhat.com>,
Aniket Agashe <aniketa@nvidia.com>, Neo Jia <cjia@nvidia.com>,
Kirti Wankhede <kwankhede@nvidia.com>,
"Tarun Gupta (SW-GPU)" <targupta@nvidia.com>,
Vikram Sethi <vsethi@nvidia.com>,
Andy Currid <acurrid@nvidia.com>,
Alistair Popple <apopple@nvidia.com>,
John Hubbard <jhubbard@nvidia.com>,
Dan Williams <danw@nvidia.com>, Zhi Wang <zhiw@nvidia.com>,
Matt Ochs <mochs@nvidia.com>, Uday Dhoke <udhoke@nvidia.com>,
Dheeraj Nigam <dnigam@nvidia.com>,
Krishnakant Jaju <kjaju@nvidia.com>,
"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
"sebastianene@google.com" <sebastianene@google.com>,
"coltonlewis@google.com" <coltonlewis@google.com>,
"kevin.tian@intel.com" <kevin.tian@intel.com>,
"yi.l.liu@intel.com" <yi.l.liu@intel.com>,
"ardb@kernel.org" <ardb@kernel.org>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"gshan@redhat.com" <gshan@redhat.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"ddutile@redhat.com" <ddutile@redhat.com>,
"tabba@google.com" <tabba@google.com>,
"qperret@google.com" <qperret@google.com>,
"seanjc@google.com" <seanjc@google.com>,
"kvmarm@lists.linux.dev" <kvmarm@lists.linux.dev>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v3 1/1] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags
Date: Wed, 19 Mar 2025 00:01:29 -0700
Message-ID: <Z9pryQwy2iwa2bpJ@linux.dev>
In-Reply-To: <20250318230909.GD9311@nvidia.com>

On Tue, Mar 18, 2025 at 08:09:09PM -0300, Jason Gunthorpe wrote:
> > It's far more problematic the other way around, e.g. the host knows that
> > something needs a Device-* attribute and the VM has done something
> > cacheable. The endpoint for that PA could, for example, fall over when
> > lines pulled in by the guest are written back, which of course can't
> > always be traced back to the offending VM.
> >
> > OTOH, if the host knows that a PA is cacheable and the guest does
> > something non-cacheable, you 'just' have to deal with the usual
> > mismatched attributes problem as laid out in the ARM ARM.
>
> I think the issue is that KVM doesn't do that usual stuff (ie cache
> flushing) for memory that doesn't have a struct page backing.

Indeed, I clearly paged that out. What I said is how we arrived at the
Device-* v. Normal-NC distinction.

> So nothing in the hypervisor does any cache flushing and I believe you
> end up with a situation where the VMM could have zero'd this cachable
> memory using cachable stores to sanitize it across VMs and then KVM
> can put that memory into the VM as uncached and the VM could then
> access stale non-zeroed data from a prior VM. Yes? This is a security
> problem.

Pedantic, but KVM only cares about cache maintenance in response to the
primary MM, not the VMM. After a stage-2 mapping has been established
userspace cannot expect KVM to do cache maintenance on its behalf.

You have a very good point that KVM is broken for cacheable PFNMAP'd
crap since we demote to something non-cacheable, and maybe that
deserves fixing first. Hopefully nobody notices that we've taken away
the toys...

> So I think the logic we want here in the fault handler is to:
>   Get the mm's PTE
>   If it is cachable:
>     Check if it has a struct page:
>       Yes - KVM flushes it and can use a non-FWB path
>       No - KVM either fails to install it, or installs it using FWB
>            to force cachability. KVM never allows degrading cachable
>            to non-cachable when it can't do flushing.
>   Not cachable:
>     Install it with Normal-NC as was previously discussed and merged

We still need to test the VMA flag here to select Normal-NC v. Device.
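
For illustration only (this is not taken from any posted patch), the
decision tree quoted above, together with the VMA-flag test for the
non-cacheable case, could be sketched as standalone C. Every name below
(the enum, the struct and its fields, the function) is hypothetical and
does not correspond to an existing KVM symbol; the inputs are assumed to
have been derived from the mm's PTE, the VMA and the CPU features
beforehand.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the stage-2 decision; not kernel names. */
enum s2_memtype {
	S2_FAIL,          /* refuse to install the mapping */
	S2_NORMAL_WB,     /* cacheable, KVM can do cache maintenance */
	S2_NORMAL_WB_FWB, /* cacheable, cacheability forced via FEAT_FWB */
	S2_NORMAL_NC,     /* Normal non-cacheable */
	S2_DEVICE,        /* Device-* */
};

/* Hypothetical summary of what the fault handler learned. */
struct fault_info {
	bool pte_cacheable;        /* the mm's PTE is cacheable */
	bool has_struct_page;      /* the PFN has struct page backing */
	bool have_fwb;             /* the CPU implements FEAT_FWB */
	bool vma_allows_normal_nc; /* the VMA flag permits Normal-NC */
};

static enum s2_memtype pick_s2_memtype(const struct fault_info *f)
{
	if (f->pte_cacheable) {
		/* struct page: KVM can flush, so no FWB is required. */
		if (f->has_struct_page)
			return S2_NORMAL_WB;
		/*
		 * No struct page: KVM cannot flush, so never degrade
		 * to non-cacheable. Force cacheability or fail.
		 */
		return f->have_fwb ? S2_NORMAL_WB_FWB : S2_FAIL;
	}

	/* Not cacheable: the VMA flag selects Normal-NC v. Device. */
	return f->vma_allows_normal_nc ? S2_NORMAL_NC : S2_DEVICE;
}
```

The property the sketch is meant to show is that a cacheable PTE
without struct page backing never yields a non-cacheable stage-2
mapping.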
> > Userspace should be stating intentions on the memslot with the sort of
> > mapping that it wants to create, and a memslot flag to say "I allow
> > cacheable mappings" seems to fit the bill.
>
> I'm not sure about this, I don't see that the userspace has any
> choice. As above, KVM has to follow whatever is in the PTEs, the
> userspace can't ask for something different here. At best you could
> make non-struct page cachable memory always fail unless the flag is
> given - but why?
>
> It seems sufficient for fast fail to check if the VMA has PFNMAP and
> pgprot cachable then !FEAT_FWB fails the memslot. There is no real
> recovery from this, the VMM is doing something that cannot be
> supported.

I'm less worried about recovery and more focused on userspace being
able to understand what happened. Otherwise we may get folks complaining
about the ioctl failing "randomly" on certain machines.

But we might need to just expose the FWB-ness of the MMU to userspace
since it can already encounter mismatched attributes when poking struct
page-backed guest memory.
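
As a rough sketch of the fast-fail check quoted above (PFNMAP plus a
cacheable pgprot on a machine without FEAT_FWB fails the memslot), the
test might look like the following standalone C. The struct, its fields
and the function name are hypothetical, and -EINVAL is only a
placeholder: which error code userspace would actually see is exactly
the open question here.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical summary of the VMA backing a memslot. */
struct vma_desc {
	bool pfnmap;           /* VMA is a PFNMAP mapping */
	bool pgprot_cacheable; /* VMA's pgprot is cacheable */
};

/*
 * Return 0 if the memslot can be supported, or a negative errno if
 * KVM would have to degrade a cacheable PFNMAP mapping without being
 * able to do cache maintenance on it.
 */
static int check_memslot_vma(const struct vma_desc *vma, bool have_fwb)
{
	if (vma->pfnmap && vma->pgprot_cacheable && !have_fwb)
		return -EINVAL; /* placeholder error code */

	return 0;
}
```

Only the combination of all three conditions fails; a cacheable PFNMAP
VMA on a machine with FEAT_FWB passes, as does anything that is not
PFNMAP or not cacheable.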
Thanks,
Oliver