From: David Gibson <david@gibson.dropbear.id.au>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>,
	kvm@vger.kernel.org, Yongji Xie <elohimes@gmail.com>,
	Eric Auger <eric.auger@redhat.com>
Subject: Re: [RFC PATCH kernel] vfio-pci: Allow mapping MSIX BAR
Date: Wed, 22 Nov 2017 15:44:55 +1100
Message-ID: <20171122044455.GR2380@umbus.fritz.box>
In-Reply-To: <20171121212846.4d567f2e@t450s.home>

On Tue, Nov 21, 2017 at 09:28:46PM -0700, Alex Williamson wrote:
> On Wed, 22 Nov 2017 15:09:32 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
> > By default VFIO disables mapping of MSIX BAR to the userspace as
> > the userspace may program it in a way allowing spurious interrupts;
> > instead the userspace uses the VFIO_DEVICE_SET_IRQS ioctl.
> > 
> > This works fine as long as the system page size equals to the MSIX
> > alignment requirement which is 4KB. However with a bigger page size
> > the existing code prohibits mapping non-MSIX parts of a page with MSIX
> > structures so these parts have to be emulated via slow reads/writes on
> > a VFIO device fd. If these emulated bits are accessed often, this has
> > serious impact on performance.
> > 
> > This adds an ioctl to the vfio-pci device which hides the sparse
> > capability and allows the userspace to map a BAR with MSIX structures.
> 
> So the user is in control of telling the kernel whether they're allowed
> to mmap the msi-x vector table.  That makes absolutely no sense.  If
> you're trying to figure out how userspace knows whether to implicitly
> avoid mmap'ing the msix region, I think there are far better ways in
> the existing region info ioctl.  We could use a flag, or maybe the
> existence of a capability chain pointer, or a new capability.  But
> absolutely not this.  The kernel needs to decide whether it's going to
> let the user do this, not the user.  Thanks,

No, the kernel doesn't need to decide this.  This is actually the
approach we discussed in Prague.

Remember that intercepting access to the MSI-X table is not a host
safety / security issue.  It's just that without the interception we
won't wire up the device's MSI-X vectors properly, so they won't work.

Basically the decision here is between

   A) Allow MSI-X configuration via standard PCI mechanisms, at the
      cost of making access slow for any registers sharing a page with
      the MSI-X table.

or

   B) Make access to BAR registers sharing a page with the MSI-X table
      fast, at the cost of requiring some alternative mechanism to
      configure MSI-X vectors.

And that is a tradeoff that it is reasonable for userspace to make.
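
To put rough numbers on the cost of (A), here is a minimal sketch of
the page arithmetic, not actual vfio-pci code.  The struct and function
names are hypothetical, the PBA is ignored, and the only hardware fact
assumed is that an MSI-X table entry is 16 bytes.  It computes which
page-aligned part of a BAR has to be trapped when the MSI-X table is
intercepted, and how many bytes of unrelated registers get slow as a
side effect.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of where the MSI-X table sits in a BAR. */
struct msix_layout {
	uint64_t bar_size;	/* total BAR size in bytes */
	uint64_t table_offset;	/* MSI-X table offset within the BAR */
	uint64_t table_size;	/* table size in bytes, 16 per vector */
};

/* Half-open byte range [start, end) within the BAR. */
struct bar_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Option (A): the table must be intercepted, so every page that
 * contains any part of it cannot be mmap()ed.  Round the table out
 * to page boundaries and clamp to the end of the BAR.
 */
struct bar_range msix_trapped_range(const struct msix_layout *l,
				    uint64_t page_size)
{
	struct bar_range r;

	/* page_size must be a non-zero power of two */
	assert(page_size && (page_size & (page_size - 1)) == 0);
	assert(l->table_size > 0);
	assert(l->table_offset + l->table_size <= l->bar_size);

	r.start = l->table_offset & ~(page_size - 1);
	r.end = (l->table_offset + l->table_size + page_size - 1)
		& ~(page_size - 1);
	if (r.end > l->bar_size)
		r.end = l->bar_size;
	return r;
}

/* Bytes of non-MSI-X registers that end up on the slow path under (A). */
uint64_t msix_collateral_bytes(const struct msix_layout *l,
			       uint64_t page_size)
{
	struct bar_range r = msix_trapped_range(l, page_size);

	return (r.end - r.start) - l->table_size;
}
```

For a made-up 64KiB BAR with an 8-vector table (128 bytes) at offset
0x2000, a 4KiB page size traps only [0x2000, 0x3000), whereas a 64KiB
page size traps the whole BAR.  Under (B) nothing is trapped, but then
MSI-X has to be configured some other way.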

In the case of KVM guests, the decision depends entirely on the
*guest* platform.  Usually we need (A) because the guest expects to be
able to poke the MSI-X table in the usual way.  However for PAPR
guests, there's an alternative mechanism via an RTAS call, which means
we can use (B).

The host kernel can't make this decision, because it doesn't know the
guest platform (well, KVM might, but VFIO doesn't).

A userspace VFIO program could also opt for (B), for example if it
cares about the performance of access to registers in the same BAR as
the MSI-X table but doesn't need MSI-X.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
