public inbox for kvm@vger.kernel.org
From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Keith Busch <kbusch@meta.com>,
	kvm@vger.kernel.org, Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH rfc] vfio-pci: Allow write combining
Date: Tue, 6 Aug 2024 12:43:02 -0600	[thread overview]
Message-ID: <20240806124302.21e46cee.alex.williamson@redhat.com> (raw)
In-Reply-To: <20240806165312.GI676757@ziepe.ca>

On Tue, 6 Aug 2024 13:53:12 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> On Fri, Aug 02, 2024 at 11:05:06AM -0600, Alex Williamson wrote:
> 
> > > Well, again, it is not a region, it is just a record that this mmap
> > > cookie uses X region with Y mapping flags. The number of regions don't
> > > change. Critically from a driver perspective the number of regions and
> > > region indexes wouldn't change.  
> > 
> > Why is this critical?  
> 
> So we don't leak this too much into the drivers? Why should all the
> VFIO drivers have to be changed to alter how their region indexes work
> just to add a single flag?? 

I don't know how you're coming to this conclusion.  A driver that wants
this new mapping flag needs to do something different, but the existing
use case is absolutely unchanged.  Look for example at how the IGD code
in vfio adds several device-specific regions.  This doesn't affect
anything other than new code in userspace that wants to iterate these
regions.  It doesn't change the indexes of any of the statically
defined regions.
 
> > > Well, that is just the current implementation. What we did in RDMA
> > > when we switched from hard coded mmap cookies to dynamic ones is
> > > use an xarray (today this should be a maple tree) to dynamically
> > > allocate mmap cookies whenever the driver returns something to
> > > userspace. During the mmap fop the pgoff is fed back through the maple
> > > tree to get the description of what the cookie represents.  
> > 
> > Sure, we could do that too, the current implementation (not uAPI) just
> > uses some upper bits to create fixed region address spaces.  The only
> > thing we should need to keep consistent is the mapping of indexes to
> > device resources up through VFIO_PCI_NUM_REGIONS.  
> 
> I fear we might need to do this as there may not be room in the pgoff
> space (at least for 32 bit) to duplicate everything....

Doing this will root out userspace drivers that hard-code region
offsets, but otherwise it shouldn't be an issue.  If the collateral is
too large, the standard regions can keep their fixed offsets and
device-specific regions can use dynamic offsets.

> > > My point is to not confuse the pgoff encoding with the driver concept
> > > of a region. The region is a single piece of memory, the "mmap cookie"s
> > > are just handles to it. Adding more data to the handle is not the same
> > > as adding more regions.  
> > 
> > I don't get it.  Take for instance PCI config space.  Given the right
> > GPU, I can get to config space through an I/O port region, an MMIO
> > region (possibly multiple ways), and the config space region itself.
> > Therefore based on this hardware implementation there is no unique
> > mapping that says that config space is uniquely accessible via a single
> > region.    
> 
> That doesn't seem like this situation. Those are multiple different HW
> paths with different HW addresses, sure they can have different
> regions.
> 
> Here we are talking about the same HW path with the same HW
> addresses. It shouldn't be duplicated.

How does an "mmap cookie" not duplicate that a device range is
accessible through multiple offsets of the vfio device file?

> > BAR can only be accessed via a single region and we need to play games
> > with terminology to call it an mmap cookie rather than officially
> > creating a region with WC mmap semantics?  
> 
> Because if you keep adding more regions for what are attributes of a
> mapping we may end up with a combinatorial explosion of regions.
> 
> I already know there is interest in doing non-cache/cache mapping
> attributes too.

This sounds like variant driver space; we can't generically create
cacheable mappings to MMIO.  vfio-nvgrace-gpu already does this, but
it usurps the standard BAR region, so there's no longer uncached access.

> Approaching this as a fixed number of regions reflecting the HW
> addresses and a variable number of flags requested by the user is a lot
> more reasonable than trying to have a list of every permutation of
> every address for every combination of flags.

Well, first, we're not talking about a fixed number of additional
regions; we're talking about defining region identifiers for each BAR
with a WC mapping attribute, and at worst we'd only populate
implemented MMIO BARs.  We've also mentioned that a device
feature could be used to allow a userspace driver to selectively bring
these regions into existence.  In any case, an mmap cookie also consumes
address space from the vfio device file, so I'm still failing to see
how calling them a region vs just an mmap cookie makes a substantive
difference.  Thanks,

Alex


  reply	other threads:[~2024-08-06 18:43 UTC|newest]

Thread overview: 21+ messages
2024-07-31 15:53 [PATCH rfc] vfio-pci: Allow write combining Keith Busch
2024-08-01 14:19 ` Jason Gunthorpe
2024-08-01 15:41   ` Alex Williamson
2024-08-01 16:11     ` Jason Gunthorpe
2024-08-01 16:52       ` Alex Williamson
2024-08-01 17:13         ` Jason Gunthorpe
2024-08-01 17:33           ` Alex Williamson
2024-08-01 17:53             ` Jason Gunthorpe
2024-08-01 18:16               ` Alex Williamson
2024-08-02 11:53                 ` Jason Gunthorpe
2024-08-02 17:05                   ` Alex Williamson
2024-08-06 16:53                     ` Jason Gunthorpe
2024-08-06 18:43                       ` Alex Williamson [this message]
2024-08-07 14:19                         ` Jason Gunthorpe
2024-08-07 17:46                           ` Alex Williamson
2024-08-13 18:02                             ` Jason Gunthorpe
2024-08-02 14:24             ` Keith Busch
2024-08-02 14:33               ` Jason Gunthorpe
2024-08-06  7:19                 ` Tian, Kevin
2024-08-06 16:47                   ` Jason Gunthorpe
2024-08-15  5:05               ` Christoph Hellwig
