From: Jason Gunthorpe <jgg@nvidia.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>,
	ankita@nvidia.com,
	Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
	oliver.upton@linux.dev, suzuki.poulose@arm.com,
	yuzenghui@huawei.com, will@kernel.org, ardb@kernel.org,
	akpm@linux-foundation.org, gshan@redhat.com, aniketa@nvidia.com,
	cjia@nvidia.com, kwankhede@nvidia.com, targupta@nvidia.com,
	vsethi@nvidia.com, acurrid@nvidia.com, apopple@nvidia.com,
	jhubbard@nvidia.com, danw@nvidia.com, mochs@nvidia.com,
	kvmarm@lists.linux.dev, kvm@vger.kernel.org,
	lpieralisi@kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2 1/1] KVM: arm64: allow the VM to select DEVICE_* and NORMAL_NC for IO memory
Date: Tue, 5 Dec 2023 12:43:18 -0400	[thread overview]
Message-ID: <20231205164318.GG2692119@nvidia.com> (raw)
In-Reply-To: <ZW9OSe8Z9gAmM7My@arm.com>

On Tue, Dec 05, 2023 at 04:22:33PM +0000, Catalin Marinas wrote:
> Yeah, I made this argument in the past. But it's a fair question to ask
> since the Arm world is different from x86. Just reusing an existing
> driver in a different context may break its expectations. Does Normal NC
> access complete by the time a TLBI (for Stage 2) and DSB (DVMsync) is
> completed? It does reach some point of serialisation with subsequent
> accesses to the same address but not sure how it is ordered with an
> access to a different location like the config space used for reset.
> Maybe it's not a problem at all or it is safe only for PCIe but it would
> be good to get to the bottom of this.

IMHO, the answer is that you can't know architecturally. The specific
vfio-platform driver must do an analysis of its specific SoC and
determine exactly what is required to order the reset. The primary
purpose of the vfio-platform drivers is to provide this reset!

In most cases I would expect some reads from the device to be required
before the reset.

> > Remember, the feedback we got from the CPU architects was that even
> > DEVICE_* will experience an uncontained failure if the device triggers
> > an error response in shipping ARM IP.
> > 
> > The reason PCIe is safe is because the PCI bridge does not generate
> > errors in the first place!
> 
> That's an argument to restrict this feature to PCIe. It's really about
> fewer arguments on the behaviour of other devices. Marc did raise
> another issue with the GIC VCPU interface (does this even have a vma in
> the host VMM?). That's a class of devices where the mapping is
> context-switched, so the TLBI+DSB rules don't help.

I don't know enough about the GIC VCPU interface to give any comment,
unfortunately. Since it seems there is something to fix here, I would
appreciate some background.
 
When you say it is context-switched, do you mean KVM does a register
write on every VM entry to set the proper HW context for the vCPU?

Is the worry that this register write will possibly not be ordered
after the guest's NORMAL_NC accesses?

> > Thus, the way a platform device can actually be safe is if it too
> > never generates errors in the first place! Obviously this approach
> > works just as well with NORMAL_NC.
> > 
> > If a platform device does generate errors then we shouldn't expect
> > containment at all, and the memory type has no bearing on the
> > safety. The correct answer is to block these platform devices from
> > VFIO/KVM/etc because they can trigger uncontained failures.
> 
> Assuming the error containment is sorted, there are two other issues
> with other types of devices:
> 
> 1. Ordering guarantees on reclaim or context switch

Solved in VFIO
 
> 2. Unaligned accesses
> 
> On (2), I think PCIe is fairly clear on how the TLPs are generated, so I
> wouldn't expect additional errors here. But I have no idea what AMBA/AXI
> does here in general. Perhaps it's fine, I don't think we looked into it
> as the focus was mostly on PCIe.

I would expect AXI devices to throw errors in all sorts of odd
cases, e.g. I would not be surprised at all to see carelessly built AXI
devices error if ST64B is pointed at them. At least when I was
building AXI logic years ago I was that lazy :P

This is mostly my point - if the devices under vfio-platform were not
designed to have contained errors then, IMHO, it is hard to believe
that DEVICE_X/NORMAL_NC is the only issue in that HW.

> So, I think it would be easier to get this patch upstream if we limit
> the change to PCIe devices for now. We may relax this further in the
> future. Do you actually have a need for non-PCIe devices to support WC
> in the guest or it's more about the complexity of the logic to detect
> whether it's actually a PCIe BAR we are mapping into the guest? (I can
> see some Arm GPU folk asking for this but those devices are not easily
> virtualisable).

The complexity is my concern, and the disruption to the ecosystem with
some of the ideas given.

If there was a trivial way to convey in the VMA that it is safe then
sure, no objection from me.

My worry is that this has turned from fixing a real problem we have
today into a debate about theoretical issues that nobody may care
about and that would be very disruptive to solve.

I would turn it around and ask that we find a way to restrict platform
devices when someone comes along with a platform device that wants to
use secure KVM and has a single, well-defined HW problem that is
solved by this work.

What if we change vfio-pci to use pgprot_device(), as it really
already should, and say that pgprot_noncached() is enforced as
DEVICE_nGnRnE while pgprot_device() may be DEVICE_nGnRE or NORMAL_NC?
Would that be acceptable?

Jason
