From: Casey Leedom <leedom@chelsio.com>
To: kvm@vger.kernel.org
Subject: Re: KVM vs. PCI-E Function Level Reset (FLR) ...
Date: Wed, 14 Jul 2010 11:01:29 -0700 [thread overview]
Message-ID: <201007141101.29804.leedom@chelsio.com> (raw)
In-Reply-To: <201007140853.33360.sheng@linux.intel.com>
| From: Sheng Yang <sheng@linux.intel.com>
| Date: Tuesday, July 13, 2010 05:53 pm
|
| On Wednesday 14 July 2010 04:41:01 Casey Leedom wrote:
| >
| > It looks like the Linux KVM kernel support code issues an FLR against
| > an Assigned Device when the device is assigned and when it's freed
| > but it's not clear when those actions are taken. For instance,
| > if a device is assigned to a VM and then the VM reboots itself, is
| > that counted as another assignment point? I.e. will KVM issue a
| > new pci_reset_function() on the device so it shows up reset in
| > the newly rebooted VM?
|
| The assignment/deassignment happened when guest was created/destroyed.
| Currently it wouldn't issue a FLR when guest reset.
Hhrrmmm, this seems like a semantic error. "Resetting" the Vm should be
morally equivalent to resetting a real physical machine. And when a real
physical machine is reset, all of its buses, etc. get reset which propagates to
Device Resets on those buses ... I think that Assigned Devices should be reset
for reboots and power off/on ...
| > And what happens if the VM OS/Driver attempts to write the PCI Pass
| > Through Device's PCI-E FLR bit? I assume that that write (and the
| > following polling reads) are trapped by the KVM code but I can't find the
| > code which implements the PCI Configuration Space emulation to see if the
| > FLR is implemented there. For instance, if I run Linux 2.6.30 in the VM
| > and my Device Driver calls pci_reset_function() in its "probe()" function
| > will that result in a Device FLR? it doesn't appear to be the case ...
|
| The PCI configuration space emulated is in QEmu rather than KVM. You can
| check qemu-kvm/hw/device-assignment.c. We didn't emulate FLR capability
| now. (OK, some other device specific reset method may involved, you can
| check pci_dev_reset())
Okay, I think that this is also going to be an issue for supporting Assigned
Devices. For PCI-E SR-IOV Virtual Functions which are assigned to a VM, they
need to be reset at reboot/power off/power on and the Configuration Space
emulation needs to support the Guest OS/VM Device Driver issuing an FLR ...
| > Note that it's impossible for a Device Driver to call
| > pci_reset_function() under Linux 2.6.31 and later
| > because a call to device_lock() was added to
| > pci_dev_reset() in chageset 8e9394ce on Feb
| > 17, 2010 by Greg Kroah-Hartman. This means
| > that a call to pci_reset_function() in a device
| > driver's "probe()" routine will result in an
| > immediate deadlock.
|
| What I saw the code is like this:
|
| static int pci_dev_reset(struct pci_dev *dev, int probe)
| {
| int rc;
|
| might_sleep();
|
| if (!probe) {
| pci_block_user_cfg_access(dev);
| /* block PM suspend, driver probe, etc. */
| device_lock(&dev->dev);
| }
| [...]
|
| So seems it's fine with _probe_ set, to use with probe() routine.
You're looking at a local routine in drivers/pci/pci.c. That routine is
called twice in pci_reset_function(). The "probe" parameter is used to indicate
whether the caller wants to "probe" for the ability to perform a PCI Function
Reset or to actually _do_ the reset. pci_reset_function() first calls
pci_dev_reset() is probe=1 and, if that returns an error code, it returns
immediately with the error. Otherwise it saves the PCI State of the device,
makes another call to pci_dev_reset() with probe=0, and then restores the
device's PCI State. Thus, this "probe" in pci_dev_reset() doesn't have anything
to do with the possibility that a device's own (pci_dev *)->driver->probe()
routine happens to be calling pci_reset_function(). Since, apparently, the
device's own ...->probe() routine is called with the device's (pci_dev *)->lock
held, a call to pci_reset_function() on itself will result in an immediate
deadlock from Linux 2.6.31 on ...
Casey
next prev parent reply other threads:[~2010-07-14 18:01 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-07-13 20:41 KVM vs. PCI-E Function Level Reset (FLR) Casey Leedom
2010-07-14 0:53 ` Sheng Yang
2010-07-14 18:01 ` Casey Leedom [this message]
2010-07-15 1:31 ` Sheng Yang
[not found] ` <201007150839.37130.leedom@chelsio.com>
2010-07-15 16:06 ` Casey Leedom
2010-07-16 0:56 ` Sheng Yang
2010-07-16 0:56 ` [Qemu-devel] " Sheng Yang
2010-07-16 17:29 ` Casey Leedom
2010-07-16 17:29 ` [Qemu-devel] " Casey Leedom
[not found] <201007150033.o6F0XUBj024880@stargate.chelsio.com>
2010-07-15 0:55 ` Casey Leedom
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201007141101.29804.leedom@chelsio.com \
--to=leedom@chelsio.com \
--cc=kvm@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.