From: Alex Williamson <alex.williamson@redhat.com>
To: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Cc: Chen Fan <chen.fan.fnst@cn.fujitsu.com>,
	izumi.taku@jp.fujitsu.com, fan.chen@easystack.cn,
	qemu-devel@nongnu.org, mst@redhat.com
Subject: Re: [Qemu-devel] [PATCH v9 05/11] vfio: add check host bus reset is support or not
Date: Wed, 31 Aug 2016 13:56:20 -0600
Message-ID: <20160831135620.1083b9a6@t450s.home>
In-Reply-To: <1468913909-21811-6-git-send-email-zhoujie2011@cn.fujitsu.com>

On Tue, 19 Jul 2016 15:38:23 +0800
Zhou Jie <zhoujie2011@cn.fujitsu.com> wrote:

> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> 
> When assigning a vfio device with AER enabled, we must check whether
> the device supports a host bus reset (ie. hot reset) as this may be
> used by the guest OS in order to recover the device from an AER
> error.  QEMU must therefore have the ability to perform a physical
> host bus reset using the existing vfio APIs in response to a virtual
> bus reset in the VM.  A physical bus reset affects all of the devices
> on the host bus, therefore we place a few simplifying configuration
> restriction on the VM:
> 
>  - All physical devices affected by a bus reset must be assigned to
>    the VM with AER enabled on each and be configured on the same
>    virtual bus in the VM.
> 
>  - No devices unaffected by the bus reset, be they physical, emulated,
>    or paravirtual may be configured on the same virtual bus as a
>    device supporting AER signaling through vfio.
> 
> In other words users wishing to enable AER on a multifunction device
> need to assign all functions of the device to the same virtual bus
> and enable AER support for each device.  The easiest way to
> accomplish this is to identity map the physical functions to virtual
> functions with multifunction enabled on the virtual device.
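
The two restrictions quoted above amount to a set-equality test: the devices on the virtual bus must be exactly the host devices affected by the bus reset, each with AER enabled. A minimal standalone sketch of that rule follows. It is a model only, not the patch's code: `struct host_addr`, `struct vdev_model` and `aer_bus_config_ok()` are hypothetical names introduced for illustration and do not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified host address; not QEMU's PCIHostDeviceAddress. */
struct host_addr {
    unsigned domain, bus, slot, function;
};

/* Hypothetical model of one device configured on a virtual bus. */
struct vdev_model {
    bool is_vfio;          /* false for emulated or paravirtual devices */
    bool aer;              /* aer=on was requested for this device */
    struct host_addr host; /* meaningful only when is_vfio is true */
};

static bool same_addr(const struct host_addr *a, const struct host_addr *b)
{
    return a->domain == b->domain && a->bus == b->bus &&
           a->slot == b->slot && a->function == b->function;
}

/*
 * Return true when the devices on one virtual bus are exactly the host
 * devices affected by the bus reset, each assigned with AER enabled.
 */
static bool aer_bus_config_ok(const struct host_addr *affected, size_t n_affected,
                              const struct vdev_model *vbus, size_t n_vbus)
{
    size_t i, j;

    assert(affected != NULL || n_affected == 0);
    assert(vbus != NULL || n_vbus == 0);

    /* Rule 1: every affected host device is on this bus with AER on. */
    for (i = 0; i < n_affected; i++) {
        bool found = false;

        for (j = 0; j < n_vbus; j++) {
            if (vbus[j].is_vfio && vbus[j].aer &&
                same_addr(&affected[i], &vbus[j].host)) {
                found = true;
                break;
            }
        }
        if (!found) {
            return false;
        }
    }

    /* Rule 2: no device unaffected by the reset shares the bus. */
    for (j = 0; j < n_vbus; j++) {
        bool found = false;

        if (!vbus[j].is_vfio) {
            return false;
        }
        for (i = 0; i < n_affected; i++) {
            if (same_addr(&affected[i], &vbus[j].host)) {
                found = true;
                break;
            }
        }
        if (!found) {
            return false;
        }
    }

    return true;
}
```

Note that this model says nothing about what kind of bus the devices sit on, which is the gap the question below is about.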

Why am I able to start the following VM with aer=on for the vfio-pci
devices?

# lspci -tv
-[0000:00]-+-00.0  Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
           +-01.0  Device 1234:1111
           +-1c.0-[01]--
           +-1d.0-[02]--+-01.0  Intel Corporation 82576 Gigabit Network Connection
           |            \-01.1  Intel Corporation 82576 Gigabit Network Connection
           ...

# lspci -vvv -s 1d.0
00:1d.0 PCI bridge: Red Hat, Inc. QEMU PCI-PCI bridge (prog-if 00 [Normal decode])

The devices are behind a PCIe-to-PCI bridge, so shouldn't specifying
aer=on for the vfio-pci devices cause a configuration error?
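
For illustration only, the kind of rejection I would have expected can be modeled as below. This is a standalone sketch, not QEMU code: `struct bus_model`, `struct dev_model` and `check_aer_bus()` are hypothetical names, and the assumption that the bus's upstream port must be PCI Express for aer=on to make sense is mine, not something the patch states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a virtual bus. */
struct bus_model {
    bool upstream_is_express; /* port above this bus has a PCIe capability */
};

/* Hypothetical stand-in for an assigned device. */
struct dev_model {
    bool aer;                    /* aer=on was requested */
    const struct bus_model *bus; /* virtual bus the device is plugged into */
};

/*
 * Return 0 when the configuration is acceptable, -EINVAL when AER is
 * requested for a device whose virtual bus is not below an express port
 * (assumed requirement, see the lead-in).
 */
static int check_aer_bus(const struct dev_model *dev)
{
    assert(dev != NULL && dev->bus != NULL);

    if (dev->aer && !dev->bus->upstream_is_express) {
        return -EINVAL;
    }

    return 0;
}
```

Under this model, both 82576 functions in the VM shown here would be rejected, since their bus sits below a conventional pci-bridge.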

commandline:

/home/alwillia/local/bin/qemu-system-x86_64 -name guest=rhel7-q35,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-11-rhel7-q35/master-key.aes -machine pc-q35-2.7,accel=kvm,usb=off,vmport=off -cpu IvyBridge -m 8192 -realtime mlock=off -smp 6,sockets=1,cores=6,threads=1 -uuid b20b28b4-9304-4e11-9ffa-0367aeb44afb -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-11-rhel7-q35/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device i82801b11-bridge,id=pci.1,bus=pcie.0,addr=0x1e -device pci-bridge,chassis_nr=2,id=pci.2,bus=pci.1,addr=0x1 -device pci-bridge,chassis_nr=3,id=pci.3,bus=pcie.0,addr=0x1d -device ioh3420,port=0xe0,chassis=4,id=pci.4,bus=pcie.0,addr=0x1c -device ich9-usb-ehci1,id=usb,bus=pci.2,addr=0x3.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.2,multifunction=on,addr=0x3 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.2,addr=0x3.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.2,addr=0x3.0x2 -drive file=/dev/rhel/rhel7-q35,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native -device virtio-blk-pci,scsi=off,bus=pci.2,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=28 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:50:ec:0d,bus=pci.2,addr=0x1 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 127.0.0.1:0 -device VGA,id=video0,vgamem_mb=16,bus=pcie.0,addr=0x1 -device intel-hda,id=sound0,bus=pci.2,addr=0x2 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -device vfio-pci,aer=on,host=07:00.0,id=hostdev0,bus=pci.3,multifunction=on,addr=0x1 -device vfio-pci,aer=on,host=07:00.1,id=hostdev1,bus=pci.3,addr=0x1.0x1 -msg timestamp=on

Thanks,
Alex

> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 278 +++++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  hw/vfio/pci.h |   1 +
>  2 files changed, 256 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 21fd801..242c1e4 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1693,6 +1693,42 @@ static void vfio_check_af_flr(VFIOPCIDevice *vdev, uint8_t pos)
>      }
>  }
>  
> +static int vfio_pci_name_to_addr(const char *name, PCIHostDeviceAddress *addr)
> +{
> +    if (strlen(name) != 12 ||
> +        sscanf(name, "%04x:%02x:%02x.%1x", &addr->domain,
> +               &addr->bus, &addr->slot, &addr->function) != 4) {
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
> +static bool vfio_pci_host_match(PCIHostDeviceAddress *addr, const char *name)
> +{
> +    PCIHostDeviceAddress tmp;
> +
> +    if (vfio_pci_name_to_addr(name, &tmp)) {
> +        return false;
> +    }
> +
> +    return (tmp.domain == addr->domain && tmp.bus == addr->bus &&
> +            tmp.slot == addr->slot && tmp.function == addr->function);
> +}
> +
> +static bool vfio_pci_host_match_slot(PCIHostDeviceAddress *addr,
> +                                     const char *name)
> +{
> +    PCIHostDeviceAddress tmp;
> +
> +    if (vfio_pci_name_to_addr(name, &tmp)) {
> +        return false;
> +    }
> +
> +    return (tmp.domain == addr->domain && tmp.bus == addr->bus &&
> +            tmp.slot == addr->slot);
> +}
> +
>  /*
>   * return negative with errno, return 0 on success.
>   * if success, the point of ret_info fill with the affected device reset info.
> @@ -1854,6 +1890,200 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos)
>      return 0;
>  }
>  
> +static int vfio_device_range_limit(PCIBus *bus)
> +{
> +    PCIDevice *br;
> +
> +    br = pci_bridge_get_device(bus);
> +    if (!br ||
> +        !pci_is_express(br) ||
> +        !(br->exp.exp_cap) ||
> +        pcie_cap_is_arifwd_enabled(br)) {
> +        return 255;
> +    }
> +
> +    return 8;
> +}
> +
> +static void vfio_check_hot_bus_reset(VFIOPCIDevice *vdev, Error **errp)
> +{
> +    PCIBus *bus = vdev->pdev.bus;
> +    struct vfio_pci_hot_reset_info *info = NULL;
> +    struct vfio_pci_dependent_device *devices;
> +    VFIOGroup *group;
> +    int ret, i, devfn, range_limit;
> +
> +    ret = vfio_get_hot_reset_info(vdev, &info);
> +    if (ret) {
> +        error_setg(errp, "vfio: Cannot enable AER for device %s,"
> +                   " device does not support hot reset.",
> +                   vdev->vbasedev.name);
> +        return;
> +    }
> +
> +    /* List all affected devices by bus reset */
> +    devices = &info->devices[0];
> +
> +    /* Verify that we have all the groups required */
> +    for (i = 0; i < info->count; i++) {
> +        PCIHostDeviceAddress host;
> +        VFIOPCIDevice *tmp;
> +        VFIODevice *vbasedev_iter;
> +        bool found = false;
> +
> +        host.domain = devices[i].segment;
> +        host.bus = devices[i].bus;
> +        host.slot = PCI_SLOT(devices[i].devfn);
> +        host.function = PCI_FUNC(devices[i].devfn);
> +
> +        /* Skip the current device */
> +        if (vfio_pci_host_match(&host, vdev->vbasedev.name)) {
> +            continue;
> +        }
> +
> +        /* Ensure we own the group of the affected device */
> +        QLIST_FOREACH(group, &vfio_group_list, next) {
> +            if (group->groupid == devices[i].group_id) {
> +                break;
> +            }
> +        }
> +
> +        if (!group) {
> +            error_setg(errp, "vfio: Cannot enable AER for device %s, "
> +                       "depends on group %d which is not owned.",
> +                       vdev->vbasedev.name, devices[i].group_id);
> +            goto out;
> +        }
> +
> +        /* Ensure affected devices for reset on the same bus */
> +        QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
> +            if (vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) {
> +                continue;
> +            }
> +            tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev);
> +            if (vfio_pci_host_match(&host, tmp->vbasedev.name)) {
> +                /*
> +                 * AER errors may be broadcast to all functions of a multi-
> +                 * function endpoint.  If any of those sibling functions are
> +                 * also assigned, they need to have AER enabled or else an
> +                 * error may continue to cause a vm_stop condition.  IOW,
> +                 * AER setup of this function would be pointless.
> +                 */
> +                if (vfio_pci_host_match_slot(&host, vdev->vbasedev.name) &&
> +                    !(tmp->features & VFIO_FEATURE_ENABLE_AER)) {
> +                    error_setg(errp, "vfio: Cannot enable AER for device %s,"
> +                               " on same slot the dependent device %s which"
> +                               " does not enable AER.",
> +                               vdev->vbasedev.name, tmp->vbasedev.name);
> +                    goto out;
> +                }
> +
> +                if (tmp->pdev.bus != bus) {
> +                    error_setg(errp, "vfio: Cannot enable AER for device %s, "
> +                               "the dependent device %s is not on the same bus",
> +                               vdev->vbasedev.name, tmp->vbasedev.name);
> +                    goto out;
> +                }
> +                found = true;
> +                break;
> +            }
> +        }
> +
> +        /* Ensure all affected devices assigned to VM */
> +        if (!found) {
> +            error_setg(errp, "vfio: Cannot enable AER for device %s, "
> +                       "the dependent device %04x:%02x:%02x.%x "
> +                       "is not assigned to VM.",
> +                       vdev->vbasedev.name, host.domain, host.bus,
> +                       host.slot, host.function);
> +            goto out;
> +        }
> +    }
> +
> +    /*
> +     * The above code verified that all devices affected by a bus reset
> +     * exist on the same bus in the VM.  To further simplify, we also
> +     * require that there are no additional devices beyond those existing on
> +     * the VM bus.
> +     */
> +    range_limit = vfio_device_range_limit(bus);
> +    for (devfn = 0; devfn < range_limit; devfn++) {
> +        VFIOPCIDevice *tmp;
> +        PCIDevice *dev;
> +        bool found = false;
> +
> +        dev = pci_find_device(bus, pci_bus_num(bus),
> +                  PCI_DEVFN(PCI_SLOT(vdev->pdev.devfn), devfn));
> +
> +        if (!dev) {
> +            continue;
> +        }
> +
> +        if (!object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
> +            error_setg(errp, "vfio: Cannot enable AER for device %s, device"
> +                             " %s: slot %d function%d cannot be configured"
> +                             " on the same virtual bus",
> +                             vdev->vbasedev.name, dev->name,
> +                             PCI_SLOT(dev->devfn), PCI_FUNC(dev->devfn));
> +            goto out;
> +        }
> +
> +        tmp = DO_UPCAST(VFIOPCIDevice, pdev, dev);
> +        for (i = 0; i < info->count; i++) {
> +            PCIHostDeviceAddress host;
> +
> +            host.domain = devices[i].segment;
> +            host.bus = devices[i].bus;
> +            host.slot = PCI_SLOT(devices[i].devfn);
> +            host.function = PCI_FUNC(devices[i].devfn);
> +
> +            if (vfio_pci_host_match(&host, tmp->vbasedev.name)) {
> +                found = true;
> +                break;
> +            }
> +        }
> +
> +        if (!found) {
> +            error_setg(errp, "vfio: Cannot enable AER for device %s, affected"
> +                             " device %s does not be configured on the same"
> +                             " virtual bus",
> +                       vdev->vbasedev.name, tmp->vbasedev.name);
> +            goto out;
> +        }
> +    }
> +
> +out:
> +    g_free(info);
> +    return;
> +}
> +
> +static void vfio_aer_check_host_bus_reset(Error **errp)
> +{
> +    VFIOGroup *group;
> +    VFIODevice *vbasedev;
> +    VFIOPCIDevice *vdev;
> +    Error *local_err = NULL;
> +
> +    /* Check All vfio-pci devices if have bus reset capability */
> +    QLIST_FOREACH(group, &vfio_group_list, next) {
> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
> +            if (vbasedev->type != VFIO_DEVICE_TYPE_PCI) {
> +                continue;
> +            }
> +            vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
> +            if (vdev->features & VFIO_FEATURE_ENABLE_AER) {
> +                vfio_check_hot_bus_reset(vdev, &local_err);
> +                if (local_err) {
> +                    error_propagate(errp, local_err);
> +                    return;
> +                }
> +            }
> +        }
> +    }
> +
> +    return;
> +}
> +
>  static int vfio_setup_aer(VFIOPCIDevice *vdev, uint8_t cap_ver,
>                            int pos, uint16_t size)
>  {
> @@ -2060,29 +2290,6 @@ static void vfio_pci_post_reset(VFIOPCIDevice *vdev)
>      vfio_intx_enable(vdev);
>  }
>  
> -static int vfio_pci_name_to_addr(const char *name, PCIHostDeviceAddress *addr)
> -{
> -    if (strlen(name) != 12 ||
> -        sscanf(name, "%04x:%02x:%02x.%1x", &addr->domain,
> -               &addr->bus, &addr->slot, &addr->function) != 4) {
> -        return -EINVAL;
> -    }
> -
> -    return 0;
> -}
> -
> -static bool vfio_pci_host_match(PCIHostDeviceAddress *addr, const char *name)
> -{
> -    PCIHostDeviceAddress tmp;
> -
> -    if (vfio_pci_name_to_addr(name, &tmp)) {
> -        return false;
> -    }
> -
> -    return (tmp.domain == addr->domain && tmp.bus == addr->bus &&
> -            tmp.slot == addr->slot && tmp.function == addr->function);
> -}
> -
>  static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
>  {
>      VFIOGroup *group;
> @@ -2608,6 +2815,22 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
>      vdev->req_enabled = false;
>  }
>  
> +static void vfio_pci_machine_done_notify(Notifier *notifier, void *unused)
> +{
> +    Error *local_err = NULL;
> +
> +    vfio_aer_check_host_bus_reset(&local_err);
> +    if (local_err) {
> +        fprintf(stderr, "%s\n", error_get_pretty(local_err));
> +        error_free(local_err);
> +        exit(1);
> +    }
> +}
> +
> +static Notifier machine_notifier = {
> +    .notify = vfio_pci_machine_done_notify,
> +};
> +
>  static int vfio_initfn(PCIDevice *pdev)
>  {
>      VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
> @@ -3000,6 +3223,15 @@ static const TypeInfo vfio_pci_dev_info = {
>  static void register_vfio_pci_dev_type(void)
>  {
>      type_register_static(&vfio_pci_dev_info);
> +
> +    /*
> +     * The AER configuration may depend on multiple devices, so we cannot
> +     * validate consistency after each device is initialized.  We can only
> +     * depend on function initialization order (function 0 last) for hotplug
> +     * devices, therefore a machine-init-done notifier is used to validate
> +     * the configuration after all cold-plug devices are processed.
> +     */
> +     qemu_add_machine_init_done_notifier(&machine_notifier);
>  }
>  
>  type_init(register_vfio_pci_dev_type)
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index 5483044..e2faffb 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -15,6 +15,7 @@
>  #include "qemu-common.h"
>  #include "exec/memory.h"
>  #include "hw/pci/pci.h"
> +#include "hw/pci/pci_bus.h"
>  #include "hw/pci/pci_bridge.h"
>  #include "hw/vfio/vfio-common.h"
>  #include "qemu/event_notifier.h"
