From: Casey Leedom <leedom@chelsio.com>
To: kvm@vger.kernel.org
Subject: Re: KVM vs. PCI-E Function Level Reset (FLR) ...
Date: Wed, 14 Jul 2010 11:01:29 -0700
Message-ID: <201007141101.29804.leedom@chelsio.com>
In-Reply-To: <201007140853.33360.sheng@linux.intel.com>
| From: Sheng Yang <sheng@linux.intel.com>
| Date: Tuesday, July 13, 2010 05:53 pm
|
| On Wednesday 14 July 2010 04:41:01 Casey Leedom wrote:
| >
| > It looks like the Linux KVM kernel support code issues an FLR against
| > an Assigned Device when the device is assigned and when it's freed
| > but it's not clear when those actions are taken. For instance,
| > if a device is assigned to a VM and then the VM reboots itself, is
| > that counted as another assignment point? I.e. will KVM issue a
| > new pci_reset_function() on the device so it shows up reset in
| > the newly rebooted VM?
|
| The assignment/deassignment happened when guest was created/destroyed.
| Currently it wouldn't issue a FLR when guest reset.
Hhrrmmm, this seems like a semantic error. "Resetting" the VM should be
morally equivalent to resetting a real physical machine. And when a real
physical machine is reset, all of its buses, etc. get reset, which propagates to
Device Resets on those buses ... I think that Assigned Devices should be reset
for reboots and power off/on ...
| > And what happens if the VM OS/Driver attempts to write the PCI Pass
| > Through Device's PCI-E FLR bit? I assume that that write (and the
| > following polling reads) are trapped by the KVM code but I can't find the
| > code which implements the PCI Configuration Space emulation to see if the
| > FLR is implemented there. For instance, if I run Linux 2.6.30 in the VM
| > and my Device Driver calls pci_reset_function() in its "probe()" function
| > will that result in a Device FLR? it doesn't appear to be the case ...
|
| The PCI configuration space emulated is in QEmu rather than KVM. You can
| check qemu-kvm/hw/device-assignment.c. We didn't emulate FLR capability
| now. (OK, some other device specific reset method may involved, you can
| check pci_dev_reset())
Okay, I think that this is also going to be an issue for supporting Assigned
Devices. PCI-E SR-IOV Virtual Functions which are assigned to a VM need to be
reset at reboot/power off/power on, and the Configuration Space emulation
needs to support the Guest OS/VM Device Driver issuing an FLR ...
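
As a rough illustration of what supporting a guest-issued FLR would involve:
the guest sets the Initiate Function Level Reset bit (bit 15) in the PCI-E
Device Control register, so the emulated Configuration Space write path would
need to notice that bit and forward a reset to the host. The sketch below is a
hypothetical, self-contained model and not code from qemu-kvm; struct
assigned_dev, cfg_write16() and host_reset_function() are names invented here.
Only the two register constants come from the PCI-E spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values from the PCI Express spec (Linux carries the same in pci_regs.h). */
#define PCI_EXP_DEVCTL          0x08    /* Device Control, offset in PCI-E cap */
#define PCI_EXP_DEVCTL_BCR_FLR  0x8000  /* Initiate Function Level Reset */

/* Hypothetical model of an assigned device's emulated config space. */
struct assigned_dev {
    uint16_t pcie_cap;   /* offset of the PCI-E capability in config space */
    uint16_t devctl;     /* shadow copy of Device Control */
    unsigned flr_count;  /* how many resets were forwarded to the host */
};

/* Hypothetical hook: a real emulator would ask the host to reset the
 * function here. This model only counts the requests. */
void host_reset_function(struct assigned_dev *d)
{
    d->flr_count++;
}

/* Handle a 16-bit guest config write. Returns true if it triggered an FLR. */
bool cfg_write16(struct assigned_dev *d, uint16_t addr, uint16_t val)
{
    assert(d);
    if (addr != d->pcie_cap + PCI_EXP_DEVCTL)
        return false;    /* not Device Control, nothing to do in this model */

    /* Initiate FLR always reads back as 0, so it is never stored. */
    d->devctl = (uint16_t)(val & ~PCI_EXP_DEVCTL_BCR_FLR);

    if (val & PCI_EXP_DEVCTL_BCR_FLR) {
        host_reset_function(d);
        return true;
    }
    return false;
}
```

This does not model the register state after the reset or the guest's
follow-up polling reads. In a real emulator the hook would presumably have to
end up in something like pci_reset_function() on the host side; how that would
be plumbed is left open here.
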
| > Note that it's impossible for a Device Driver to call
| > pci_reset_function() under Linux 2.6.31 and later
| > because a call to device_lock() was added to
| > pci_dev_reset() in changeset 8e9394ce on Feb
| > 17, 2010 by Greg Kroah-Hartman. This means
| > that a call to pci_reset_function() in a device
| > driver's "probe()" routine will result in an
| > immediate deadlock.
|
| What I saw the code is like this:
|
| static int pci_dev_reset(struct pci_dev *dev, int probe)
| {
| int rc;
|
| might_sleep();
|
| if (!probe) {
| pci_block_user_cfg_access(dev);
| /* block PM suspend, driver probe, etc. */
| device_lock(&dev->dev);
| }
| [...]
|
| So seems it's fine with _probe_ set, to use with probe() routine.
You're looking at a local routine in drivers/pci/pci.c. That routine is
called twice in pci_reset_function(). The "probe" parameter is used to indicate
whether the caller wants to "probe" for the ability to perform a PCI Function
Reset or to actually _do_ the reset. pci_reset_function() first calls
pci_dev_reset() with probe=1 and, if that returns an error code, it returns
immediately with the error. Otherwise it saves the PCI State of the device,
makes another call to pci_dev_reset() with probe=0, and then restores the
device's PCI State. Thus, this "probe" in pci_dev_reset() doesn't have anything
to do with the possibility that a device's own (pci_dev *)->driver->probe()
routine happens to be calling pci_reset_function(). Since, apparently, the
device's own ...->probe() routine is called with the device's (pci_dev *)->lock
held, a call to pci_reset_function() on itself will result in an immediate
deadlock from Linux 2.6.31 on ...
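
To make the two meanings of "probe" concrete, here is a small userspace model
of the call flow described above. It is only a sketch: struct model_dev and
the model_*() functions are invented for illustration and are not kernel
APIs, the lock is reduced to a flag, and instead of actually hanging the
model returns -EDEADLK at the point where the real device_lock() would block
forever. The error codes are choices made for this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct pci_dev. */
struct model_dev {
    bool has_flr;    /* device supports some reset method */
    bool lock_held;  /* models the non-recursive device lock */
    int  resets;     /* number of resets actually performed */
};

/* Models pci_dev_reset(): probe=1 only asks "could this device be reset?" */
int model_dev_reset(struct model_dev *dev, int probe)
{
    assert(dev);
    if (!dev->has_flr)
        return -ENOTTY;
    if (probe)
        return 0;            /* no lock taken, nothing reset */
    if (dev->lock_held)
        return -EDEADLK;     /* the real device_lock() would never return */
    dev->lock_held = true;
    dev->resets++;
    dev->lock_held = false;
    return 0;
}

/* Models pci_reset_function(): probe first, then do the reset. */
int model_reset_function(struct model_dev *dev)
{
    int rc = model_dev_reset(dev, 1);
    if (rc)
        return rc;
    /* the kernel saves the device's PCI state here ... */
    rc = model_dev_reset(dev, 0);
    /* ... and restores it here */
    return rc;
}

/* Models the driver core calling a driver's probe() with the lock held. */
int model_driver_probe(struct model_dev *dev)
{
    int rc;

    dev->lock_held = true;
    rc = model_reset_function(dev);  /* what the driver's probe() would do */
    dev->lock_held = false;
    return rc;
}
```

Calling model_reset_function() directly on an unlocked device succeeds, while
going through model_driver_probe() gets past the probe=1 call and then fails
in the probe=0 call, which is the step that would deadlock in the kernel.
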
Casey