public inbox for kvm@vger.kernel.org
From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Keith Busch <kbusch@meta.com>,
	kvm@vger.kernel.org, Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH rfc] vfio-pci: Allow write combining
Date: Wed, 7 Aug 2024 11:46:43 -0600
Message-ID: <20240807114643.25f78652.alex.williamson@redhat.com>
In-Reply-To: <20240807141910.GG8473@ziepe.ca>

On Wed, 7 Aug 2024 11:19:10 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> On Tue, Aug 06, 2024 at 12:43:02PM -0600, Alex Williamson wrote:
> 
> > > So we don't leak this too much into the drivers? Why should all the
> > > VFIO drivers have to be changed to alter how their region indexes work
> > > just to add a single flag??  
> 
> > I don't know how you're coming to this conclusion.  
> 
> Ideally we'd want to support the WC option basically everywhere.
> 
> > > I fear we might need to do this as there may not be room in the pgoff
> > > space (at least for 32 bit) to duplicate everything....  
> 
> > We'll root out userspace drivers that hard code region offsets in doing
> > this, but otherwise it shouldn't be an issue.  
> 
> The issue is running out of pgoff bits on 32 bit. Maybe this isn't an
> issue for VFIO, but it was for RDMA. We needed tight optimal on-demand
> packing of actual requested mmaps. Allocating gigabytes of address
> space for possible mmaps ran out of pgoff bits. :\

If we only implemented WC for 64-bit, would anyone notice?
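
For a rough sense of the 32-bit constraint, the arithmetic can be sketched as
below. This is only a back-of-the-envelope model: it assumes 4 KiB pages and
the vfio-pci convention of placing the region index above bit 40 of the file
offset (VFIO_PCI_OFFSET_SHIFT). The helper names are made up for illustration.

```c
#include <stdint.h>

#define ASSUMED_PAGE_SHIFT     12  /* assumption: 4 KiB pages */
#define VFIO_PCI_OFFSET_SHIFT  40  /* vfio-pci: region index lives above bit 40 */

/*
 * Bits of file offset that an mmap() pgoff of the given width can express.
 * Capped at 64 for simplicity, since file offsets are 64-bit.
 */
static unsigned int mmap_offset_bits(unsigned int pgoff_bits)
{
    unsigned int bits = pgoff_bits + ASSUMED_PAGE_SHIFT;

    return bits > 64 ? 64 : bits;
}

/* Number of distinct region index slots reachable through mmap(). */
static uint64_t region_slots(unsigned int pgoff_bits)
{
    unsigned int bits = mmap_offset_bits(pgoff_bits);

    if (bits <= VFIO_PCI_OFFSET_SHIFT)
        return 0;
    return (uint64_t)1 << (bits - VFIO_PCI_OFFSET_SHIFT);
}
```

Under these assumptions a 32-bit pgoff leaves 4 index bits, so 16 slots, of
which vfio-pci already uses 9 for its fixed indexes. Duplicating every BAR
with a WC alias would not fit without a different packing.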
 
> > How does an "mmap cookie" not duplicate that a device range is
> > accessible through multiple offsets of the vfio device file?  
> 
> pgoff duplication is not really an issue; from an API perspective the
> driver would call a helper to convert the pgoff into a region index
> and mmap flags. It wouldn't matter to any driver how many duplicates
> there are.

Which is exactly my point, whether we call it a region or an mmap
cookie.  In one case we're trying to give a bare pgoff that effectively
aliases to a region with different mapping flags, in the other the
pgoff is exposed through a new region offset that does exactly the same
thing.
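
To make the equivalence concrete, here is a minimal sketch of the kind of
helper being discussed: it turns a file offset into a region index plus
mapping flags. The alias layout (WC_ALIAS_BASE, six WC aliases for BAR0-5),
the struct, and the function name are all hypothetical and not taken from any
posted patch. Only VFIO_PCI_OFFSET_SHIFT mirrors existing vfio-pci code.

```c
#include <stdbool.h>
#include <stdint.h>

#define VFIO_PCI_OFFSET_SHIFT 40
#define VFIO_PCI_OFFSET_MASK  (((uint64_t)1 << VFIO_PCI_OFFSET_SHIFT) - 1)

/* Hypothetical: offset indexes 9..14 alias BAR0..BAR5 with write combining. */
#define WC_ALIAS_BASE   9u
#define WC_ALIAS_COUNT  6u

struct mmap_target {
    unsigned int region_index;      /* hardware object, e.g. BAR number */
    bool         write_combine;     /* mapping attribute implied by the offset */
    uint64_t     offset_in_region;
};

/* Decode a device-file offset into the region it targets and how to map it. */
static struct mmap_target decode_vfio_offset(uint64_t file_offset)
{
    struct mmap_target t;
    unsigned int index = (unsigned int)(file_offset >> VFIO_PCI_OFFSET_SHIFT);

    t.offset_in_region = file_offset & VFIO_PCI_OFFSET_MASK;
    if (index >= WC_ALIAS_BASE && index < WC_ALIAS_BASE + WC_ALIAS_COUNT) {
        /* Alias slot: same BAR, different mapping flags. */
        t.region_index = index - WC_ALIAS_BASE;
        t.write_combine = true;
    } else {
        t.region_index = index;
        t.write_combine = false;
    }
    return t;
}
```

Whether the alias slot is advertised as a new region index or handed out as
an opaque cookie, the driver-side decode looks the same in this model.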

> > Well first, we're not talking about a fixed number of additional
> > regions, we're talking about defining region identifiers for each BAR
> > with a WC mapping attribute, but at worst we'd only populate
> > implemented MMIO BARs.  But then we've also mentioned that a device
> > feature could be used to allow a userspace driver to selectively bring
> > these regions into existence.  In any case, an mmap cookie also consumes
> > address space from the vfio device file, so I'm still failing to see
> > how calling them a region vs just an mmap cookie makes a substantive
> > difference.  
> 
> You'd only allocate the mmap cookie when userspace requests it.

I've suggested a mechanism using DEVICE_FEATURE that could do this for
regions.

> My original suggestion was to send a flag to REGION_INFO to
> specifically ask for the different behavior, that (and only that)
> would return new mmap cookies.

Which can't work because flags is only an output field.

> The alternative version of this might be to have a single
> 'GET_REGION_MMAP' that gives a new mmap cookie for a singular
> specified region index. Userspace would call REGION_INFO to learn the
> memory regions and then it could call GET_REGION_MMAP(REQ_WC) and will
> get back a single dynamic mmap cookie that links the WC flags.
> 
> No system call, no cookie allocation. Existing apps don't start seeing
> more regions from REGION_INFO. Drivers keep region indexes 1:1 with HW
> objects. The uAPI has room to add more mmap flags.

Please tell me how this is ultimately different from invoking a
DEVICE_FEATURE call to request that a new device specific region be
created with the desired mappings.  In the short term, if we run out of
pgoff, the user gets an -ENOSPC.  DEVICE_INFO is updated with the new
number of regions, existing region indexes are unchanged, the user
iterates new indexes with REGION_INFO to get the offset and identifies
them using the previously proposed region types.  Thanks,

Alex
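
For reference, the userspace side of "DEVICE_INFO is updated ... the user
iterates new indexes with REGION_INFO" could look roughly like the sketch
below. It uses only the existing VFIO_DEVICE_GET_INFO and
VFIO_DEVICE_GET_REGION_INFO ioctls; the DEVICE_FEATURE request that would
create the WC region is not shown because no such feature is defined yet. The
function name is illustrative, and device_fd is assumed to be an
already-opened VFIO device file descriptor.

```c
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Print every region the device currently reports.
 * Returns the region count, or -1 if the device info query fails.
 */
static int list_vfio_regions(int device_fd)
{
    struct vfio_device_info dev;

    memset(&dev, 0, sizeof(dev));
    dev.argsz = sizeof(dev);
    if (ioctl(device_fd, VFIO_DEVICE_GET_INFO, &dev) < 0)
        return -1;

    for (unsigned int i = 0; i < dev.num_regions; i++) {
        struct vfio_region_info reg;

        memset(&reg, 0, sizeof(reg));
        reg.argsz = sizeof(reg);
        reg.index = i;
        if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &reg) < 0)
            continue;   /* index not implemented on this device */

        printf("region %u: offset=0x%llx size=0x%llx mmap=%s caps=%s\n",
               i,
               (unsigned long long)reg.offset,
               (unsigned long long)reg.size,
               (reg.flags & VFIO_REGION_INFO_FLAG_MMAP) ? "yes" : "no",
               (reg.flags & VFIO_REGION_INFO_FLAG_CAPS) ? "yes" : "no");
    }
    return (int)dev.num_regions;
}
```

A caller would run this before and after requesting a new region and compare
the counts. Identifying a region by type additionally means re-issuing
REGION_INFO with a larger argsz and walking the capability chain, which is
omitted here to keep the sketch short.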

