From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Keith Busch <kbusch@meta.com>,
	kvm@vger.kernel.org, Keith Busch <kbusch@kernel.org>
Subject: Re: [PATCH rfc] vfio-pci: Allow write combining
Date: Thu, 1 Aug 2024 10:52:18 -0600
Message-ID: <20240801105218.7c297f9a.alex.williamson@redhat.com>
In-Reply-To: <20240801161130.GD3030761@ziepe.ca>

On Thu, 1 Aug 2024 13:11:30 -0300
Jason Gunthorpe <jgg@ziepe.ca> wrote:

> On Thu, Aug 01, 2024 at 09:41:23AM -0600, Alex Williamson wrote:
> > On Thu, 1 Aug 2024 11:19:14 -0300
> > Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >   
> > > On Wed, Jul 31, 2024 at 08:53:52AM -0700, Keith Busch wrote:  
> > > > From: Keith Busch <kbusch@kernel.org>
> > > > 
> > > > Write combining can provide a performance improvement for places that
> > > > can safely use this capability.
> > > > 
> > > > Previous discussions on the topic suggest a vfio user needs to
> > > > explicitly request such a mapping, and it sounds like a new vfio
> > > > specific ioctl to request this is one recommended way to do that.
> > > > This patch implements a new ioctl to achieve that so a user can request
> > > > write combining on prefetchable memory. A new ioctl seems a bit much for
> > > > just this purpose, so the implementation here provides a "flags" field
> > > > with only the write combine option defined. The rest of the bits are
> > > > reserved for future use.    
> > > 
> > > This is a neat hack for sure
> > > 
> > > But how about adding this flag to vfio_region_info ?
> > > 
> > > @@ -275,6 +289,7 @@ struct vfio_region_info {
> > >  #define VFIO_REGION_INFO_FLAG_WRITE    (1 << 1) /* Region supports write */
> > >  #define VFIO_REGION_INFO_FLAG_MMAP     (1 << 2) /* Region supports mmap */
> > >  #define VFIO_REGION_INFO_FLAG_CAPS     (1 << 3) /* Info supports caps */
> > > +#define VFIO_REGION_INFO_REQ_WC         (1 << 4) /* Request a write combining mapping*/
> > >         __u32   index;          /* Region index */
> > >         __u32   cap_offset;     /* Offset within info struct of first cap */
> > >         __aligned_u64   size;   /* Region size (bytes) */
> > > 
> > > 
> > > The user specifies REQ_WC when calling VFIO_DEVICE_GET_REGION_INFO.
> > > 
> > > The kernel will then return an offset value that yields a WC
> > > mapping. It doesn't displace the normal non-WC mapping?
> > > 
> > > Arguably we should fixup the kernel to put the mmap cookies into a
> > > maple tree so they can be dynamically allocated and more densely
> > > packed.  
> > 
> > vfio_region_info.flags is not currently tested for input, therefore this
> > proposal could lead to unexpected behavior for a caller that doesn't
> > currently zero this field.  It's intended as an output-only field.  
> 
> Perhaps a REGION_INFO2 then?
> 
> I still think per-request is better than a global flag

I don't understand why we'd need a REGION_INFO2; we already have
support for defining new regions.  We do this by increasing the
num_regions value from the VFIO_DEVICE_GET_INFO ioctl.  The user can
iterate those additional regions and for each index call
VFIO_DEVICE_GET_REGION_INFO.  The new regions expose a
VFIO_REGION_INFO_CAP_TYPE capability where we define new types as:

#define VFIO_REGION_TYPE_PCI_BAR0_WC	(4)
#define VFIO_REGION_TYPE_PCI_BAR1_WC	(5)
#define VFIO_REGION_TYPE_PCI_BAR2_WC	(6)
#define VFIO_REGION_TYPE_PCI_BAR3_WC	(7)
#define VFIO_REGION_TYPE_PCI_BAR4_WC	(8)
#define VFIO_REGION_TYPE_PCI_BAR5_WC	(9)

We'd populate these new regions only for BARs that support prefetch and
mmap, and we'd define that these regions may expose only the MMAP flag
and not the READ or WRITE flags, since those accesses can go through the
standard region.  Really the only difference from the static vfio-pci
defined region indexes is the O(N) search for the user to find the vfio
region index to BAR mapping.  Thanks,

Alex
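
A rough, untested sketch of the user-side lookup described above.  It
assumes the proposed (not merged) VFIO_REGION_TYPE_PCI_BARn_WC values
of 4..9; BAR_WC_TYPE(), region_cap_type() and find_bar_wc_region() are
hypothetical helper names used only for illustration.  The ioctls,
structures and flags come from the existing <linux/vfio.h>.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hypothetical: mirrors the proposed VFIO_REGION_TYPE_PCI_BARn_WC (4..9). */
#define BAR_WC_TYPE(bar) (4u + (unsigned int)(bar))

/*
 * Walk the capability chain of a filled-in region info buffer.
 * Returns 0 and stores the region type if a VFIO_REGION_INFO_CAP_TYPE
 * capability is present, -1 otherwise.
 */
int region_cap_type(const void *buf, size_t len, uint32_t *type)
{
	struct vfio_region_info info;
	struct vfio_info_cap_header hdr;
	struct vfio_region_info_cap_type cap;
	size_t off;

	if (len < sizeof(info))
		return -1;
	memcpy(&info, buf, sizeof(info));
	if (!(info.flags & VFIO_REGION_INFO_FLAG_CAPS))
		return -1;

	for (off = info.cap_offset; off != 0; off = hdr.next) {
		if (off > len || len - off < sizeof(hdr))
			return -1;
		memcpy(&hdr, (const char *)buf + off, sizeof(hdr));
		if (hdr.id == VFIO_REGION_INFO_CAP_TYPE) {
			if (len - off < sizeof(cap))
				return -1;
			memcpy(&cap, (const char *)buf + off, sizeof(cap));
			*type = cap.type;
			return 0;
		}
		/* Offsets must move forward, otherwise the chain is bogus. */
		if (hdr.next != 0 && hdr.next <= off)
			return -1;
	}
	return -1;
}

/*
 * O(N) search over the regions beyond the static vfio-pci indexes for
 * the mmap-capable region whose type matches the WC alias of 'bar'.
 * On success the fixed part of the region info is copied to 'out'.
 */
int find_bar_wc_region(int device_fd, int bar, struct vfio_region_info *out)
{
	struct vfio_device_info dev = { .argsz = sizeof(dev) };
	uint32_t i, type;

	if (bar < 0 || bar > 5 ||
	    ioctl(device_fd, VFIO_DEVICE_GET_INFO, &dev) < 0)
		return -1;

	for (i = VFIO_PCI_NUM_REGIONS; i < dev.num_regions; i++) {
		struct vfio_region_info probe = { .argsz = sizeof(probe), .index = i };
		struct vfio_region_info *info;
		int match;

		/* First call reports the argsz needed to hold the caps. */
		if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &probe) < 0 ||
		    probe.argsz <= sizeof(probe))
			continue;

		info = calloc(1, probe.argsz);
		if (!info)
			return -1;
		info->argsz = probe.argsz;
		info->index = i;
		if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, info) < 0) {
			free(info);
			continue;
		}

		match = region_cap_type(info, info->argsz, &type) == 0 &&
			type == BAR_WC_TYPE(bar) &&
			(info->flags & VFIO_REGION_INFO_FLAG_MMAP);
		if (match)
			*out = *info;
		free(info);
		if (match)
			return 0;
	}
	return -1;
}
```

With something like this, the user would mmap() the device fd at
out->offset for out->size bytes to get the WC mapping, and keep using
the standard BAR region index for read/write access.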

