public inbox for kvm@vger.kernel.org
From: Jason Gunthorpe <jgg@nvidia.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Ankit Agrawal <ankita@nvidia.com>,
	Yishai Hadas <yishaih@nvidia.com>,
	"shameerali.kolothum.thodi@huawei.com"
	<shameerali.kolothum.thodi@huawei.com>,
	"kevin.tian@intel.com" <kevin.tian@intel.com>,
	"eric.auger@redhat.com" <eric.auger@redhat.com>,
	"brett.creeley@amd.com" <brett.creeley@amd.com>,
	"horms@kernel.org" <horms@kernel.org>,
	Aniket Agashe <aniketa@nvidia.com>, Neo Jia <cjia@nvidia.com>,
	Kirti Wankhede <kwankhede@nvidia.com>,
	"Tarun Gupta (SW-GPU)" <targupta@nvidia.com>,
	Vikram Sethi <vsethi@nvidia.com>,
	Andy Currid <acurrid@nvidia.com>,
	Alistair Popple <apopple@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	Dan Williams <danw@nvidia.com>,
	"Anuj Aggarwal (SW-GPU)" <anuaggarwal@nvidia.com>,
	Matt Ochs <mochs@nvidia.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v15 1/1] vfio/nvgrace-gpu: Add vfio pci variant module for grace hopper
Date: Wed, 3 Jan 2024 20:40:18 -0400
Message-ID: <20240104004018.GW50406@nvidia.com>
In-Reply-To: <20240103172426.0a1f4ae6.alex.williamson@redhat.com>

On Wed, Jan 03, 2024 at 05:24:26PM -0700, Alex Williamson wrote:
> > Why does it need to do anything special? If the VM reads/writes from
> > memory that the master bit is disabled on it gets undefined
> > behavior. The system doesn't crash and it does something reasonable.
> 
> The behavior is actually defined (6.0.1 Table 7-4):
> 
>     Memory Space Enable - Controls a Function's response to Memory
>     Space accesses. When this bit is Clear, all received Memory Space
>     accesses are caused to be handled as Unsupported Requests. When
>     this bit is Set, the Function is enabled to decode the address and
>     further process Memory Space accesses.
> 
> From there we get into system error handling decisions where some
> platforms claim to protect data integrity by generating a fault before
> allowing drivers to consume the UR response and others are more lenient.

Sure, PCIe defines more detail, but the actual behavior the SW
experiences when triggering this corner is effectively undefined, as
"machine crash" is something that actually happens.

> AIUI, the address space enable bits are primarily to prevent the device
> from decoding accesses during BAR sizing operations or prior to BAR
> programming.  

Yes. It is not functionally relevant to devices like this that have a
fixed aperture, or to virtual devices that can't move the physical
aperture.

I think the layers have become a bit confused here. The vfio side
should care only about kernel self-protection from hostile
userspace, which is why we have to zap/etc.

However the VMM still controls the "address decoder" and if the memory
(or IO) enable is off then the VMM should already prevent the VM
address space from decoding into the VFIO regions at all. I.e. it should
unmap it from KVM for mmapable regions, and stop matching the address
for emulated regions.
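
A minimal sketch of the decoder rule described above, assuming a
hypothetical VMM-side model (struct vmm_dev, struct vmm_bar and
vmm_bar_decodes() are illustrative names, not taken from any real VMM
or from the patch). Only the two command register bit values come from
the PCI specification. With the matching enable bit clear, the
programmed BAR value is ignored and nothing decodes into the region:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Command register bits as defined by the PCI specification. */
#define CMD_IO_ENABLE     0x1u  /* I/O Space Enable */
#define CMD_MEMORY_ENABLE 0x2u  /* Memory Space Enable */

/* Hypothetical VMM-side view of one BAR of an assigned device. */
struct vmm_bar {
    bool     is_io;      /* true for an I/O BAR, false for a memory BAR */
    uint64_t guest_base; /* base address the guest programmed */
    uint64_t size;       /* aperture size in bytes, must be non-zero */
};

/* Hypothetical VMM-side virtual config state for the device. */
struct vmm_dev {
    uint16_t command;    /* virtual PCI command register */
};

/*
 * Decide whether a guest access at addr should reach the BAR.
 * With the matching enable bit clear the BAR value is ignored
 * entirely, so the access never reaches the VFIO region.
 */
static bool vmm_bar_decodes(const struct vmm_dev *dev,
                            const struct vmm_bar *bar, uint64_t addr)
{
    uint16_t need = bar->is_io ? CMD_IO_ENABLE : CMD_MEMORY_ENABLE;

    assert(bar->size != 0);
    if (!(dev->command & need))
        return false;
    return addr >= bar->guest_base && addr - bar->guest_base < bar->size;
}
```

In a real VMM the "false" case would correspond to the region being
unmapped from KVM (mmapable regions) or to the emulated-region lookup
simply not matching, rather than to a per-access check like this one.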

This is effectively necessary because the VM might choose to reprogram
the BAR registers and move the region. It can't do this atomically, so
we have to fully ignore the BAR value when the decoders are disabled.
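
To illustrate the non-atomic move, here is a rough sketch assuming a
hypothetical VMM model of one 64-bit memory BAR (struct vbar64 and the
vbar64_*() helpers are made-up names for illustration only). The guest
needs two 32-bit writes to move the BAR, so the half-updated value must
never be mapped. In this sketch the mapping is recomputed on every
write but only exists while Memory Space Enable is set:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CMD_MEMORY_ENABLE 0x2u  /* Memory Space Enable (PCI command register) */

/* Hypothetical VMM state for one 64-bit memory BAR. */
struct vbar64 {
    uint16_t command;     /* virtual command register */
    uint32_t lo, hi;      /* BAR halves as last written by the guest */
    bool     mapped;      /* is the region in the guest address space? */
    uint64_t mapped_base; /* where it is mapped, valid only if mapped */
};

/* Recompute the mapping: present only while decode is enabled. */
static void vbar64_sync(struct vbar64 *b)
{
    if (b->command & CMD_MEMORY_ENABLE) {
        b->mapped = true;
        b->mapped_base = ((uint64_t)b->hi << 32) | b->lo;
    } else {
        b->mapped = false; /* BAR value is ignored while disabled */
    }
    /* Invariant: a mapping exists only while decode is enabled. */
    assert(!b->mapped || (b->command & CMD_MEMORY_ENABLE));
}

/* Guest wrote one 32-bit half of the BAR. */
static void vbar64_write_half(struct vbar64 *b, bool high, uint32_t val)
{
    if (high)
        b->hi = val;
    else
        b->lo = val & ~0xfu; /* low 4 bits are read-only type bits */
    vbar64_sync(b);
}

/* Guest wrote the command register. */
static void vbar64_write_command(struct vbar64 *b, uint16_t val)
{
    b->command = val;
    vbar64_sync(b);
}
```

A guest that clears the enable bit, writes both halves, and then sets
the bit again only ever has the region mapped at the old or the new
base, never at the intermediate one.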

IOW the corner case of the memory enable being disabled and the VM
touching the memory is not something the kernel VFIO should be
emulating, and indeed, I think there is probably no reason to allow
the VM to manipulate the physical control either.

> unprogrammed BARs are ignored (ie. not exposed to userspace), so perhaps
> as long as it can be guaranteed that an access with the address space
> enable bit cleared cannot generate a system level fault, we actually
> have no strong argument to strictly enforce the address space bits.

This is what I think, yes.

> > I think that has just become too pedantic, accessing the regions with
> > the enable bits turned off is broadly undefined behavior. So long as
> > the platform doesn't crash, I think it is fine to behave in a simple
> > way.
> > 
> > There is no use case for providing stronger emulation of this.
> 
> As above, I think I can be convinced this is acceptable given that the
> platform and device are essentially one and the same here with
> understood lack of a system wide error response.

Right

> Now I'm wondering if we should do something different with
> virtio-vfio-pci.  As a VF, the memory space is effectively always
> enabled, governed by the SR-IOV MSE bit on the PF which is assumed to
> be static.  It doesn't make a lot of sense to track the IO enable bit
> for the emulated IO BAR when the memory BAR is always enabled.  It's a
> fairly trivial amount of code though, so it's not harmful either.

As above, it was probably unnecessary to put this into the VFIO kernel
side, but I don't think there is any functional harm in allowing it.

Jason
