All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sheng Yang <sheng@linux.intel.com>
To: kvm@vger.kernel.org
Cc: Casey Leedom <leedom@chelsio.com>, linux-pci@vger.kernel.org
Subject: Re: KVM vs. PCI-E Function Level Reset (FLR) ...
Date: Thu, 15 Jul 2010 09:31:00 +0800	[thread overview]
Message-ID: <201007150931.00591.sheng@linux.intel.com> (raw)
In-Reply-To: <201007141101.29804.leedom@chelsio.com>

On Thursday 15 July 2010 02:01:29 Casey Leedom wrote:
> | From: Sheng Yang <sheng@linux.intel.com>
> | Date: Tuesday, July 13, 2010 05:53 pm

(Please use reply to all next time.)
> | 
> | On Wednesday 14 July 2010 04:41:01 Casey Leedom wrote:
> | >   It looks like the Linux KVM kernel support code issues an FLR against
> | > 
> | > an Assigned Device when the device is assigned and when it's freed
> | > but it's not clear when those actions are taken.  For instance,
> | > if a device is assigned to a VM and then the VM reboots itself, is
> | > that counted as another assignment point? I.e. will KVM issue a
> | > new pci_reset_function() on the device so it shows up reset in
> | > the newly rebooted VM?
> | 
> | The assignment/deassignment happened when guest was created/destroyed.
> | Currently it wouldn't issue a FLR when guest reset.
>   Hhrrmmm, this seems like a semantic error.  "Resetting" the Vm should be
> morally equivalent to resetting a real physical machine.  And when a real
> physical machine is reset, all of its buses, etc. get reset which
> propagates to Device Resets on those buses ...  I think that Assigned
> Devices should be reset for reboots and power off/on ...

Yes, it should be done when reset. And power on/off should be there, because that's 
means the create and destroy the guest.
> 
> | >   And what happens if the VM OS/Driver attempts to write the PCI Pass
> | > 
> | > Through Device's PCI-E FLR bit?  I assume that that write (and the
> | > following polling reads) are trapped by the KVM code but I can't find
> | > the code which implements the PCI Configuration Space emulation to see
> | > if the FLR is implemented there.  For instance, if I run Linux 2.6.30
> | > in the VM and my Device Driver calls pci_reset_function() in its
> | > "probe()" function will that result in a Device FLR? it doesn't appear
> | > to be the case ...
> | 
> | The PCI configuration space emulated is in QEmu rather than KVM. You can
> | check qemu-kvm/hw/device-assignment.c. We didn't emulate FLR capability
> | now. (OK, some other device specific reset method may involved, you can
> | check pci_dev_reset())
> 
>   Okay, I think that this is also going to be an issue for supporting
> Assigned Devices.  For PCI-E SR-IOV Virtual Functions which are assigned
> to a VM, they need to be reset at reboot/power off/power on and the
> Configuration Space emulation needs to support the Guest OS/VM Device
> Driver issuing an FLR ...

You can add the FLR support for your device if you need to call it in the guest. 
But I don't quite understand why you need to call it in the guest. KVM should has 
already done that, and it's not necessary for native device.

> 
> | >       Note that it's impossible for a Device Driver to call
> | >       pci_reset_function() under Linux 2.6.31 and later
> | >       because a call to device_lock() was added to
> | >       pci_dev_reset() in chageset 8e9394ce on Feb
> | >       17, 2010 by Greg Kroah-Hartman.  This means
> | >       that a call to pci_reset_function() in a device
> | >       driver's "probe()" routine will result in an
> | >       immediate deadlock.
> | 
> | What I saw the code is like this:
> | 
> | static int pci_dev_reset(struct pci_dev *dev, int probe)
> | {
> | 
> |         int rc;
> |         
> |         might_sleep();
> |         
> |         if (!probe) {
> |         
> |                 pci_block_user_cfg_access(dev);
> |                 /* block PM suspend, driver probe, etc. */
> |                 device_lock(&dev->dev);
> |         
> |         }
> | 
> | [...]
> | 
> | So seems it's fine with _probe_ set, to use with probe() routine.
> 
>   You're looking at a local routine in drivers/pci/pci.c.  That routine is
> called twice in pci_reset_function().  The "probe" parameter is used to
> indicate whether the caller wants to "probe" for the ability to perform a
> PCI Function Reset or to actually _do_ the reset.  pci_reset_function()
> first calls pci_dev_reset() is probe=1 and, if that returns an error code,
> it returns immediately with the error.  Otherwise it saves the PCI State
> of the device, makes another call to pci_dev_reset() with probe=0, and
> then restores the device's PCI State.  Thus, this "probe" in
> pci_dev_reset() doesn't have anything to do with the possibility that a
> device's own (pci_dev *)->driver->probe() routine happens to be calling
> pci_reset_function().  Since, apparently, the device's own ...->probe()
> routine is called with the device's (pci_dev *)->lock held, a call to
> pci_reset_function() on itself will result in an immediate deadlock from
> Linux 2.6.31 on ...

So you want to issue FLR(rather than probing the feature) when driver->probe() 
called? Current seems KVM is the only user of pci_reset_function(), so I think we 
can update it after your driver posted to the mailing list. Also I am not sure why 
you want to reset function when probing. KVM should cover them all I think.

--
regards
Yang, Sheng

> 
> Casey
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2010-07-15  1:31 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-13 20:41 KVM vs. PCI-E Function Level Reset (FLR) Casey Leedom
2010-07-14  0:53 ` Sheng Yang
2010-07-14 18:01   ` Casey Leedom
2010-07-15  1:31     ` Sheng Yang [this message]
     [not found]       ` <201007150839.37130.leedom@chelsio.com>
2010-07-15 16:06         ` Casey Leedom
2010-07-16  0:56         ` Sheng Yang
2010-07-16  0:56           ` [Qemu-devel] " Sheng Yang
2010-07-16 17:29           ` Casey Leedom
2010-07-16 17:29             ` [Qemu-devel] " Casey Leedom
     [not found] <201007150033.o6F0XUBj024880@stargate.chelsio.com>
2010-07-15  0:55 ` Casey Leedom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201007150931.00591.sheng@linux.intel.com \
    --to=sheng@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=leedom@chelsio.com \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.