From: Catalin Marinas <catalin.marinas@arm.com>
To: Marc Zyngier <maz@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
ankita@nvidia.com,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
oliver.upton@linux.dev, suzuki.poulose@arm.com,
yuzenghui@huawei.com, will@kernel.org, ardb@kernel.org,
akpm@linux-foundation.org, gshan@redhat.com, aniketa@nvidia.com,
cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com,
vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
jhubbard@nvidia.com, danw@nvidia.com, mochs@nvidia.com,
kvmarm@lists.linux.dev, kvm@vger.kernel.org,
lpieralisi@kernel.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 1/1] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory
Date: Wed, 6 Dec 2023 12:14:18 +0000 [thread overview]
Message-ID: <ZXBlmt88dKmZLCU9@arm.com> (raw)
In-Reply-To: <868r67blwo.wl-maz@kernel.org>
On Wed, Dec 06, 2023 at 11:39:03AM +0000, Marc Zyngier wrote:
> On Tue, 05 Dec 2023 18:40:42 +0000,
> Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Tue, Dec 05, 2023 at 05:50:27PM +0000, Marc Zyngier wrote:
> > > On Tue, 05 Dec 2023 17:33:01 +0000,
> > > Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > Ideally we should do this for vfio only but we don't have an easy
> > > > way to convey this to KVM.
> > >
> > > But if we want to limit this to PCIe, we'll have to find out. The
> > > initial proposal (a long while ago) had a flag conveying some
> > > information, and I'd definitely feel more confident having something
> > > like that.
> >
> > We can add a VM_PCI_IO in the high vma flags to be set by
> > vfio_pci_core_mmap(), though it limits it to 64-bit architectures. KVM
> > knows this is PCI and relaxes things a bit. It's not generic though if
> > we need this later for something else.
>
> Either that, or something actually describing the attributes that VFIO
> wants.
>
> And I very much want it to be a buy-in behaviour, not something that
> automagically happens and changes the default behaviour for everyone
> based on some hand-wavy assertions.
>
> If that means a userspace change, fine by me. The VMM better know what
> is happening.
Driving the attributes from a single point like the VFIO driver is
indeed better. The problem is that write-combining on Arm doesn't come
without speculative loads, otherwise we would have solved it by now. I
also recall the VFIO maintainer pushing back on relaxing the
pgprot_noncached() for the user mapping but I don't remember the
reasons.
We could do with a pgprot_maybewritecombine() or
pgprot_writecombinenospec() (similar to Jason's idea but without
changing the semantics of pgprot_device()). For the user mapping on
arm64 this would be Device (even _GRE) since it can't disable
speculation but stage 2 would leave the decision to the guest since the
speculative loads aren't much different from committed loads done
wrongly.
If we want the VMM to drive this entirely, we could add a new mmap()
flag like MAP_WRITECOMBINE or PROT_WRITECOMBINE. They do feel a bit
weird but there is precedent with PROT_MTE to describe a memory type.
One question is whether the VFIO driver still needs to have the
knowledge and sanitise the requests from the VMM within a single BAR. If
there are no security implications to such mappings, the VMM can map
parts of the BAR as pgprot_noncached(), other parts as
pgprot_writecombine() and KVM just follows them (similarly if we need a
cacheable mapping).
The latter has some benefits for DPDK but it's a lot more involved with
having to add device-specific knowledge into the VMM. The VMM would also
have to present the whole BAR contiguously to the guest even if there
are different mapping attributes within the range. So a lot of MAP_FIXED
uses. I'd rather leaving this decision with the guest than the VMM, it
looks like more hassle to create those mappings. The VMM or the VFIO
could only state write-combine and speculation allowed.
--
Catalin
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2023-12-06 12:14 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-05 3:30 [PATCH v2 1/1] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory ankita
2023-12-05 9:21 ` Marc Zyngier
2023-12-05 11:40 ` Catalin Marinas
2023-12-05 13:05 ` Jason Gunthorpe
2023-12-05 14:37 ` Lorenzo Pieralisi
2023-12-05 14:44 ` Jason Gunthorpe
2023-12-05 16:24 ` Catalin Marinas
2023-12-05 17:10 ` Jason Gunthorpe
2023-12-05 16:22 ` Catalin Marinas
2023-12-05 16:43 ` Jason Gunthorpe
2023-12-05 17:01 ` Marc Zyngier
2023-12-05 17:33 ` Catalin Marinas
2023-12-05 17:50 ` Marc Zyngier
2023-12-05 18:40 ` Catalin Marinas
2023-12-06 11:39 ` Marc Zyngier
2023-12-06 12:14 ` Catalin Marinas [this message]
2023-12-06 15:16 ` Jason Gunthorpe
2023-12-06 16:31 ` Catalin Marinas
2023-12-06 17:20 ` Jason Gunthorpe
2023-12-06 18:58 ` Catalin Marinas
2023-12-06 19:03 ` Jason Gunthorpe
2023-12-06 19:06 ` Catalin Marinas
2023-12-07 2:53 ` Ankit Agrawal
2023-12-06 11:52 ` Lorenzo Pieralisi
2023-12-05 19:24 ` Catalin Marinas
2023-12-05 19:48 ` Jason Gunthorpe
2023-12-06 14:49 ` Catalin Marinas
2023-12-06 15:05 ` Jason Gunthorpe
2023-12-06 15:18 ` Lorenzo Pieralisi
2023-12-06 15:38 ` Jason Gunthorpe
2023-12-06 16:23 ` Catalin Marinas
2023-12-06 16:48 ` Jason Gunthorpe
2023-12-07 10:13 ` Lorenzo Pieralisi
2023-12-07 13:38 ` Jason Gunthorpe
2023-12-07 14:50 ` Lorenzo Pieralisi
2023-12-05 13:28 ` Lorenzo Pieralisi
2023-12-05 14:16 ` Shameerali Kolothum Thodi
2023-12-06 8:17 ` Shameerali Kolothum Thodi
2023-12-05 11:48 ` Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZXBlmt88dKmZLCU9@arm.com \
--to=catalin.marinas@arm.com \
--cc=acurrid@nvidia.com \
--cc=akpm@linux-foundation.org \
--cc=aniketa@nvidia.com \
--cc=ankita@nvidia.com \
--cc=apopple@nvidia.com \
--cc=ardb@kernel.org \
--cc=cjia@nvidia.com \
--cc=danw@nvidia.com \
--cc=gshan@redhat.com \
--cc=jgg@nvidia.com \
--cc=jhubbard@nvidia.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=kwankhede@nvidia.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=lpieralisi@kernel.org \
--cc=maz@kernel.org \
--cc=mochs@nvidia.com \
--cc=oliver.upton@linux.dev \
--cc=shameerali.kolothum.thodi@huawei.com \
--cc=suzuki.poulose@arm.com \
--cc=targupta@nvidia.com \
--cc=vsethi@nvidia.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).