public inbox for kvm@vger.kernel.org
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Jan Kiszka <jan.kiszka@web.de>
Cc: Avi Kivity <avi@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	kvm@vger.kernel.org, jbaron@redhat.com
Subject: Re: [PATCH v2] kvm: Disable MSI/MSI-X in assigned device reset path
Date: Sun, 8 Apr 2012 23:35:48 +0300
Message-ID: <20120408203547.GB10916@redhat.com>
In-Reply-To: <4F81DB67.700@web.de>

On Sun, Apr 08, 2012 at 08:39:35PM +0200, Jan Kiszka wrote:
> On 2012-04-08 20:18, Michael S. Tsirkin wrote:
> > On Sun, Apr 08, 2012 at 07:37:57PM +0200, Jan Kiszka wrote:
> >> On 2012-04-08 18:08, Avi Kivity wrote:
> >>> On 04/08/2012 07:04 PM, Michael S. Tsirkin wrote:
> >>>> On Sun, Apr 08, 2012 at 06:50:27PM +0300, Avi Kivity wrote:
> >>>>> On 04/08/2012 06:46 PM, Michael S. Tsirkin wrote:
> >>>>>>>>>
> >>>>>>>>> I'm thinking about this flow:
> >>>>>>>>>
> >>>>>>>>>   FLR the device
> >>>>>>>>>   for each emulated register
> >>>>>>>>>      read it from the hardware
> >>>>>>>>>      if different from emulated register:
> >>>>>>>>>         update the internal model (for example, disabling MSI in kvm if
> >>>>>>>>> needed)
> >>>>>>>>
> >>>>>>>> If we do it this way we get back the problem this patch
> >>>>>>>> is trying to solve: MSIX assigned while device
> >>>>>>>> memory is disabled would cause unsupported request errors.
> >>>>>>>
> >>>>>>> Why is that?  FLR would presumably disable MSI in the device, and this
> >>>>>>> line would disable it in kvm as well.
> >>>>>>
> >>>>>> The bug is that device memory is disabled (FLR would do that)
> >>>>>> while MSI is enabled in kvm. The fix is to
> >>>>>> disable MSI in kvm first.
> >>>>>
> >>>>> Yes, no need to repeat.  My question is whether my pseudo-code does the
> >>>>> same
> >>>>
> >>>> It doesn't seem to: FLR (disabling memory) is followed
> >>>> by MSI disable in kvm instead of the reverse.
> >>>
> >>> Ah, so the problem is the ordering?  I see.
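
[Editorial sketch] The ordering point above can be illustrated with a toy
model. This is only a sketch under stated assumptions: struct sim_device and
every sim_*/reset_* function below are hypothetical stand-ins, not kernel,
KVM, or QEMU APIs, and the "unsupported request" counter is a deliberate
simplification of the failure described in the thread (MSI still enabled in
kvm at the moment FLR disables device memory).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an assigned device; not a real kernel structure. */
struct sim_device {
    bool hw_mem_enabled;       /* memory decode enabled in the physical device */
    bool hw_msi_enabled;       /* MSI/MSI-X enabled in the physical device */
    bool kvm_msi_enabled;      /* MSI/MSI-X still set up on the kvm side */
    int  unsupported_requests; /* modelled host-visible errors */
};

/* Modelled FLR: the device loses memory decode and its MSI enable bit. */
static void sim_flr(struct sim_device *dev)
{
    assert(dev != NULL);
    dev->hw_mem_enabled = false;
    dev->hw_msi_enabled = false;
    /* Assumption: kvm-side MSI outliving memory decode is what triggers
     * the unsupported request error discussed in the thread. */
    if (dev->kvm_msi_enabled)
        dev->unsupported_requests++;
}

/* Modelled teardown of MSI on the kvm side. */
static void sim_kvm_disable_msi(struct sim_device *dev)
{
    assert(dev != NULL);
    dev->kvm_msi_enabled = false;
}

/* The quoted pseudo-code flow: FLR first, then sync the model from hardware. */
static void reset_flr_then_sync(struct sim_device *dev)
{
    sim_flr(dev);
    if (dev->kvm_msi_enabled != dev->hw_msi_enabled)
        sim_kvm_disable_msi(dev);
}

/* The ordering the patch aims for: disable MSI in kvm, then FLR. */
static void reset_disable_then_flr(struct sim_device *dev)
{
    sim_kvm_disable_msi(dev);
    sim_flr(dev);
}
```

In this model both functions end with MSI disabled everywhere; only the
second avoids the window in which memory is disabled while kvm still has MSI
enabled.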
> >>>
> >>>>> and whether or not it is better (when applied to all emulated
> >>>>> config space).
> >>>>
> >>>> I'm not sure.
> >>>> I would like to see an example of a register that you have
> >>>> in mind.
> >>>
> >>> I went over the PCI registers and saw none that would be affected.
> >>>
> >>>>>>
> >>>>>> Yes. I'm talking about things like enabling memory, setting up irq register,
> >>>>>> etc though. Most of this setup is done by bios.
> >>>>>
> >>>>> I see.  So should we have a pci_reset_function() variant that limits
> >>>>> itself to restoring just those bits?
> >>>>
> >>>> We only need kernel to restore whatever qemu emulates, but
> >>>> kernel doesn't know what that is.
> >>>> What kind of interface do you have in mind?
> >>>>
> >>>
> >>> The same as pci_reset_function(), but leaves MSI clear.
> >>>
> >>> I guess it's not worth it if the ordering problem is there.
> >>
> >> The core problem is not the ordering. The problem is that the kernel is
> >> susceptible to ordering mistakes of userspace.
> >> And that is because the
> >> kernel panics on PCI errors of devices that are in user hands - a
> >> critical kernel bug IMHO.
> > 
> > I'm not sure. The pci sysfs interface
> > is by design not secured against malicious users,
> > isn't it?
> 
> That's surely true for devices outside of IOMMU protection. But do we
> really have to give up when we encapsulate and isolate them that way?
> Provided we moderate access to the sysfs resources via libvirt or some
> other management service.

We don't have to give up but we'd have to build such an
interface: /config attribute is not it.

> > 
> >> Proper reset of MSI or even the whole PCI
> >> config space is another issue, but one the kernel should not worry about
> >> - still, it should be fixed (therefore this patch).
> >> But even if we disallowed userland from disabling MMIO and PIO access to
> >> the device, we would not be able to exclude that there are secret channels
> >> in the device's interface having the same effect.
> > 
> > I'm not sure I agree here.  If there are secret channels to the device
> > that let it violate the PCI express spec, it can probably break the SRIOV
> > security model. And then you can do much more than just crash the host.
> 
> Maybe, but there are also other devices. And if a guest reprograms it
> (firmware update...) and makes it stop reacting to requests, we may get
> the same effect. That would also be some kind of a "secret channel".

Right. So it looks like SRIOV VF is the only type of device that
is safe to assign to a guest:

Presumably, SRIOV VFs don't let the driver program the firmware.
And I think SRIOV VFs don't have MMIO/PIO enable bits either,
and the BAR isn't programmable through the VF...

> > 
> >> So we likely need to
> >> enhance PCI error handling to catch and handle faults for certain
> >> devices differently - those we cannot trust to behave properly while
> >> they are under userland/guest control.
> >>
> >> Jan
> >>
> > 
> > 
> > I agree - forwarding errors to the guest would actually be very useful - but
> > I think we also need to analyse the problem carefully,
> > and prevent as many ways as we can for guest to cause trouble.
> 
> If possible, the protection should target userspace, which would
> automatically include guests. Only if that is not feasible with
> reasonable effort do we have to rely on QEMU to save the host.

Defence in depth is best, right?

> > 
> > And there is another issue here: unsupported request errors
> > should not cause kernel panics IMO.
> > 
> > There's also the issue that qemu lets the guest control the MMIO/PIO
> > bits in the command register.
> > 
> > So there are multiple bugs.
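
[Editorial sketch] To make the command register point concrete, here is one
possible way an emulator could keep the guest from clearing the I/O and
memory enable bits on the physical device. Nobody in the thread proposes
this exact scheme; it is an illustration only. struct cmd_shadow and
guest_write_command() are hypothetical names, and only the bit positions
(bit 0 for I/O space enable, bit 1 for memory space enable) come from the
PCI command register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_IO     0x0001u /* I/O space enable, bit 0 of the command register */
#define CMD_MEMORY 0x0002u /* memory space enable, bit 1 of the command register */

/* Hypothetical shadow of the command register kept by the emulator. */
struct cmd_shadow {
    uint16_t guest_view; /* value the guest reads back */
    uint16_t hw_value;   /* value actually forwarded to the device */
};

/* Handle a guest write: the guest sees what it wrote, but the decode
 * enable bits on the hardware side keep their current state. */
static void guest_write_command(struct cmd_shadow *s, uint16_t val)
{
    const uint16_t decode_bits = (uint16_t)(CMD_IO | CMD_MEMORY);

    assert(s != NULL);
    s->guest_view = val;
    s->hw_value = (uint16_t)((val & (uint16_t)~decode_bits) |
                             (s->hw_value & decode_bits));
}
```

A real implementation would also have to stop guest accesses to the BARs
while the guest-visible bits are clear; that part is left out of this sketch.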
> > 
> 
> Yep, that's true.
> 
> Jan
> 


