From: Eric Auger <eric.auger@linaro.org>
To: Alex Williamson <alex.williamson@redhat.com>, qemu-devel@nongnu.org
Cc: jike.song@intel.com, kevin.tian@intel.com, laine@redhat.com
Subject: Re: [Qemu-devel] [RFC PATCH] vfio: Add sysfsdev property for pci & platform
Date: Tue, 26 Jan 2016 16:03:28 +0100 [thread overview]
Message-ID: <56A78AC0.7080300@linaro.org> (raw)
In-Reply-To: <20160120180552.21926.99964.stgit@gimli.home>
Hi Alex,
I did a try with both legacy cmd line and new one and it works fine for
vfio platform too:
-device vfio-calxeda-xgmac,host="fff51000.ethernet"
-device
vfio-calxeda-xgmac,sysfsdev="/sys/bus/platform/devices/fff51000.ethernet"
Tested-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
just 1 question below.
Best Regards
Eric
On 01/20/2016 07:06 PM, Alex Williamson wrote:
> vfio-pci currently requires a host= parameter, which comes in the
> form of a PCI address in [domain:]<bus:slot.function> notation. We
> expect to find a matching entry in sysfs for that under
> /sys/bus/pci/devices/. vfio-platform takes a similar approach, but
> defines the host= parameter to be a string, which can be matched
> directly under /sys/bus/platform/devices/. On the PCI side, we have
> some interest in using vfio to expose vGPU devices. These are not
> actual discrete PCI devices, so they don't have a compatible host PCI
> bus address or a device link where QEMU wants to look for it. There's
> also really no requirement that vfio can only be used to expose
> physical devices, a new vfio bus and iommu driver could expose a
> completely emulated device. To fit within the vfio framework, it
> would need a kernel struct device and associated IOMMU group, but
> those are easy constraints to manage.
>
> To support such devices, which would include vGPUs, that honor the
> VFIO PCI programming API, but are not necessarily backed by a unique
> PCI address, add support for specifying any device in sysfs. The
> vfio API already has support for probing the device type to ensure
> compatibility with either vfio-pci or vfio-platform.
>
> With this, a vfio-pci device could either be specified as:
>
> -device vfio-pci,host=02:00.0
>
> or
>
> -device vfio-pci,sysfsdev=/sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0
>
> or even
>
> -device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:02:00.0
>
> When vGPU support comes along, this might look something more like:
>
> -device vfio-pci,sysfsdev=/sys/devices/virtual/intel-vgpu/vgpu0@0000:00:02.0
>
> NB - This is only a made up example path, but it should be noted that
> the device namespace is global for vfio, a virtual device cannot
> overlap with existing namespaces and should not create a name prone to
> conflict, such as a simple instance number.
>
> The same changes is made for vfio-platform but is only compile tested.
> In both cases, specifying sysfsdev has precedence over the old host
> option.
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
> hw/vfio/pci.c | 130 +++++++++++++++++------------------------
> hw/vfio/platform.c | 55 ++++++++++-------
> include/hw/vfio/vfio-common.h | 1
> 3 files changed, 86 insertions(+), 100 deletions(-)
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 1fb868c..bfe4215 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -880,12 +880,8 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
> if (vdev->pdev.romfile || !vdev->pdev.rom_bar) {
> /* Since pci handles romfile, just print a message and return */
> if (vfio_blacklist_opt_rom(vdev) && vdev->pdev.romfile) {
> - error_printf("Warning : Device at %04x:%02x:%02x.%x "
> - "is known to cause system instability issues during "
> - "option rom execution. "
> - "Proceeding anyway since user specified romfile\n",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> + error_printf("Warning : Device at %s is known to cause system instability issues during option rom execution. Proceeding anyway since user specified romfile\n",
> + vdev->vbasedev.name);
> }
> return;
> }
> @@ -898,9 +894,7 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
> pwrite(fd, &size, 4, offset) != 4 ||
> pread(fd, &size, 4, offset) != 4 ||
> pwrite(fd, &orig, 4, offset) != 4) {
> - error_report("%s(%04x:%02x:%02x.%x) failed: %m",
> - __func__, vdev->host.domain, vdev->host.bus,
> - vdev->host.slot, vdev->host.function);
> + error_report("%s(%s) failed: %m", __func__, vdev->vbasedev.name);
> return;
> }
>
> @@ -912,29 +906,18 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
>
> if (vfio_blacklist_opt_rom(vdev)) {
> if (dev->opts && qemu_opt_get(dev->opts, "rombar")) {
> - error_printf("Warning : Device at %04x:%02x:%02x.%x "
> - "is known to cause system instability issues during "
> - "option rom execution. "
> - "Proceeding anyway since user specified non zero value for "
> - "rombar\n",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> + error_printf("Warning : Device at %s is known to cause system instability issues during option rom execution. Proceeding anyway since user specified non zero value for rombar\n",
> + vdev->vbasedev.name);
> } else {
> - error_printf("Warning : Rom loading for device at "
> - "%04x:%02x:%02x.%x has been disabled due to "
> - "system instability issues. "
> - "Specify rombar=1 or romfile to force\n",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> + error_printf("Warning : Rom loading for device at %s has been disabled due to system instability issues. Specify rombar=1 or romfile to force\n",
> + vdev->vbasedev.name);
> return;
> }
> }
>
> trace_vfio_pci_size_rom(vdev->vbasedev.name, size);
>
> - snprintf(name, sizeof(name), "vfio[%04x:%02x:%02x.%x].rom",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> + snprintf(name, sizeof(name), "vfio[%s].rom", vdev->vbasedev.name);
>
> memory_region_init_io(&vdev->pdev.rom, OBJECT(vdev),
> &vfio_rom_ops, vdev, name, size);
> @@ -1048,9 +1031,8 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
> ret = pread(vdev->vbasedev.fd, &phys_val, len,
> vdev->config_offset + addr);
> if (ret != len) {
> - error_report("%s(%04x:%02x:%02x.%x, 0x%x, 0x%x) failed: %m",
> - __func__, vdev->host.domain, vdev->host.bus,
> - vdev->host.slot, vdev->host.function, addr, len);
> + error_report("%s(%s, 0x%x, 0x%x) failed: %m",
> + __func__, vdev->vbasedev.name, addr, len);
> return -errno;
> }
> phys_val = le32_to_cpu(phys_val);
> @@ -1074,9 +1056,8 @@ void vfio_pci_write_config(PCIDevice *pdev,
> /* Write everything to VFIO, let it filter out what we can't write */
> if (pwrite(vdev->vbasedev.fd, &val_le, len, vdev->config_offset + addr)
> != len) {
> - error_report("%s(%04x:%02x:%02x.%x, 0x%x, 0x%x, 0x%x) failed: %m",
> - __func__, vdev->host.domain, vdev->host.bus,
> - vdev->host.slot, vdev->host.function, addr, val, len);
> + error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %m",
> + __func__, vdev->vbasedev.name, addr, val, len);
> }
>
> /* MSI/MSI-X Enabling/Disabling */
> @@ -1347,9 +1328,7 @@ static void vfio_map_bar(VFIOPCIDevice *vdev, int nr)
> return;
> }
>
> - snprintf(name, sizeof(name), "VFIO %04x:%02x:%02x.%x BAR %d",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function, nr);
> + snprintf(name, sizeof(name), "VFIO %s BAR %d", vdev->vbasedev.name, nr);
>
> /* Determine what type of BAR this is for registration */
> ret = pread(vdev->vbasedev.fd, &pci_bar, sizeof(pci_bar),
> @@ -1719,9 +1698,8 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos)
> }
>
> if (ret < 0) {
> - error_report("vfio: %04x:%02x:%02x.%x Error adding PCI capability "
> - "0x%x[0x%x]@0x%x: %d", vdev->host.domain,
> - vdev->host.bus, vdev->host.slot, vdev->host.function,
> + error_report("vfio: %s Error adding PCI capability "
> + "0x%x[0x%x]@0x%x: %d", vdev->vbasedev.name,
> cap_id, size, pos, ret);
> return ret;
> }
> @@ -1783,11 +1761,14 @@ static void vfio_pci_post_reset(VFIOPCIDevice *vdev)
> vfio_intx_enable(vdev);
> }
>
> -static bool vfio_pci_host_match(PCIHostDeviceAddress *host1,
> - PCIHostDeviceAddress *host2)
> +static bool vfio_pci_host_match(PCIHostDeviceAddress *addr, const char *name)
> {
> - return (host1->domain == host2->domain && host1->bus == host2->bus &&
> - host1->slot == host2->slot && host1->function == host2->function);
> + char tmp[13];
> +
> + sprintf(tmp, "%04x:%02x:%02x.%1x", addr->domain,
> + addr->bus, addr->slot, addr->function);
> +
> + return (strcmp(tmp, name) == 0);
> }
>
> static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
> @@ -1812,9 +1793,8 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
> if (ret && errno != ENOSPC) {
> ret = -errno;
> if (!vdev->has_pm_reset) {
> - error_report("vfio: Cannot reset device %04x:%02x:%02x.%x, "
> - "no available reset mechanism.", vdev->host.domain,
> - vdev->host.bus, vdev->host.slot, vdev->host.function);
> + error_report("vfio: Cannot reset device %s, "
> + "no available reset mechanism.", vdev->vbasedev.name);
> }
> goto out_single;
> }
> @@ -1847,7 +1827,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
> trace_vfio_pci_hot_reset_dep_devices(host.domain,
> host.bus, host.slot, host.function, devices[i].group_id);
>
> - if (vfio_pci_host_match(&host, &vdev->host)) {
> + if (vfio_pci_host_match(&host, vdev->vbasedev.name)) {
> continue;
> }
>
> @@ -1873,7 +1853,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
> continue;
> }
> tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev);
> - if (vfio_pci_host_match(&host, &tmp->host)) {
> + if (vfio_pci_host_match(&host, tmp->vbasedev.name)) {
> if (single) {
> ret = -EINVAL;
> goto out_single;
> @@ -1935,7 +1915,7 @@ out:
> host.slot = PCI_SLOT(devices[i].devfn);
> host.function = PCI_FUNC(devices[i].devfn);
>
> - if (vfio_pci_host_match(&host, &vdev->host)) {
> + if (vfio_pci_host_match(&host, vdev->vbasedev.name)) {
> continue;
> }
>
> @@ -1954,7 +1934,7 @@ out:
> continue;
> }
> tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev);
> - if (vfio_pci_host_match(&host, &tmp->host)) {
> + if (vfio_pci_host_match(&host, tmp->vbasedev.name)) {
> vfio_pci_post_reset(tmp);
> break;
> }
> @@ -2160,10 +2140,7 @@ static void vfio_err_notifier_handler(void *opaque)
> * guest to contain the error.
> */
>
> - error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected. "
> - "Please collect any data possible and then kill the guest",
> - __func__, vdev->host.domain, vdev->host.bus,
> - vdev->host.slot, vdev->host.function);
> + error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
>
> vm_stop(RUN_STATE_INTERNAL_ERROR);
> }
> @@ -2344,42 +2321,43 @@ static int vfio_initfn(PCIDevice *pdev)
> VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
> VFIODevice *vbasedev_iter;
> VFIOGroup *group;
> - char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
> + char *tmp, group_path[PATH_MAX], *group_name;
> ssize_t len;
> struct stat st;
> int groupid;
> int ret;
>
> - /* Check that the host device exists */
> - snprintf(path, sizeof(path),
> - "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> - if (stat(path, &st) < 0) {
> - error_report("vfio: error: no such host device: %s", path);
> + if (!vdev->vbasedev.sysfsdev) {
> + vdev->vbasedev.sysfsdev =
> + g_strdup_printf("/sys/bus/pci/devices/%04x:%02x:%02x.%01x",
> + vdev->host.domain, vdev->host.bus,
> + vdev->host.slot, vdev->host.function);
> + }
> +
> + if (stat(vdev->vbasedev.sysfsdev, &st) < 0) {
> + error_report("vfio: error: no such host device: %s",
> + vdev->vbasedev.sysfsdev);
> return -errno;
> }
>
> + vdev->vbasedev.name = g_strdup(basename(vdev->vbasedev.sysfsdev));
> vdev->vbasedev.ops = &vfio_pci_ops;
> -
> vdev->vbasedev.type = VFIO_DEVICE_TYPE_PCI;
> - vdev->vbasedev.name = g_strdup_printf("%04x:%02x:%02x.%01x",
> - vdev->host.domain, vdev->host.bus,
> - vdev->host.slot, vdev->host.function);
>
> - strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);
> + tmp = g_strdup_printf("%s/iommu_group", vdev->vbasedev.sysfsdev);
> + len = readlink(tmp, group_path, sizeof(group_path));
> + g_free(tmp);
>
> - len = readlink(path, iommu_group_path, sizeof(path));
> - if (len <= 0 || len >= sizeof(path)) {
> + if (len <= 0 || len >= sizeof(group_path)) {
> error_report("vfio: error no iommu_group for device");
> return len < 0 ? -errno : -ENAMETOOLONG;
> }
>
> - iommu_group_path[len] = 0;
> - group_name = basename(iommu_group_path);
> + group_path[len] = 0;
>
> + group_name = basename(group_path);
> if (sscanf(group_name, "%d", &groupid) != 1) {
> - error_report("vfio: error reading %s: %m", path);
> + error_report("vfio: error reading %s: %m", group_path);
> return -errno;
> }
>
> @@ -2391,21 +2369,18 @@ static int vfio_initfn(PCIDevice *pdev)
> return -ENOENT;
> }
>
> - snprintf(path, sizeof(path), "%04x:%02x:%02x.%01x",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> -
> QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
> if (strcmp(vbasedev_iter->name, vdev->vbasedev.name) == 0) {
> - error_report("vfio: error: device %s is already attached", path);
> + error_report("vfio: error: device %s is already attached",
> + vdev->vbasedev.name);
> vfio_put_group(group);
> return -EBUSY;
> }
> }
>
> - ret = vfio_get_device(group, path, &vdev->vbasedev);
> + ret = vfio_get_device(group, vdev->vbasedev.name, &vdev->vbasedev);
> if (ret) {
> - error_report("vfio: failed to get device %s", path);
> + error_report("vfio: failed to get device %s", vdev->vbasedev.name);
> vfio_put_group(group);
> return ret;
> }
> @@ -2622,6 +2597,7 @@ static void vfio_instance_init(Object *obj)
>
> static Property vfio_pci_dev_properties[] = {
> DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
> + DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev),
> DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice,
> intx.mmap_timeout, 1100),
> DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
> diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
> index 289b498..99f0642 100644
> --- a/hw/vfio/platform.c
> +++ b/hw/vfio/platform.c
> @@ -559,38 +559,45 @@ static int vfio_base_device_init(VFIODevice *vbasedev)
> {
> VFIOGroup *group;
> VFIODevice *vbasedev_iter;
> - char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
> + char *tmp, group_path[PATH_MAX], *group_name;
> ssize_t len;
> struct stat st;
> int groupid;
> int ret;
>
> - /* name must be set prior to the call */
> - if (!vbasedev->name || strchr(vbasedev->name, '/')) {
> - return -EINVAL;
> - }
> + /* @sysfsdev takes precedence over @host */
> + if (vbasedev->sysfsdev) {
> + g_free(vbasedev->name);
> + vbasedev->name = g_strdup(basename(vbasedev->sysfsdev));
do we need the g_strdup here?
> + } else {
> + if (!vbasedev->name || strchr(vbasedev->name, '/')) {
> + return -EINVAL;
> + }
>
> - /* Check that the host device exists */
> - g_snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
> - vbasedev->name);
> + vbasedev->sysfsdev = g_strdup_printf("/sys/bus/platform/devices/%s",
> + vbasedev->name);
> + }
>
> - if (stat(path, &st) < 0) {
> - error_report("vfio: error: no such host device: %s", path);
> + if (stat(vbasedev->sysfsdev, &st) < 0) {
> + error_report("vfio: error: no such host device: %s",
> + vbasedev->sysfsdev);
> return -errno;
> }
>
> - g_strlcat(path, "iommu_group", sizeof(path));
> - len = readlink(path, iommu_group_path, sizeof(iommu_group_path));
> - if (len < 0 || len >= sizeof(iommu_group_path)) {
> + tmp = g_strdup_printf("%s/iommu_group", vbasedev->sysfsdev);
> + len = readlink(tmp, group_path, sizeof(group_path));
> + g_free(tmp);
> +
> + if (len < 0 || len >= sizeof(group_path)) {
> error_report("vfio: error no iommu_group for device");
> return len < 0 ? -errno : -ENAMETOOLONG;
> }
>
> - iommu_group_path[len] = 0;
> - group_name = basename(iommu_group_path);
> + group_path[len] = 0;
>
> + group_name = basename(group_path);
> if (sscanf(group_name, "%d", &groupid) != 1) {
> - error_report("vfio: error reading %s: %m", path);
> + error_report("vfio: error reading %s: %m", group_path);
> return -errno;
> }
>
> @@ -602,25 +609,24 @@ static int vfio_base_device_init(VFIODevice *vbasedev)
> return -ENOENT;
> }
>
> - g_snprintf(path, sizeof(path), "%s", vbasedev->name);
> -
> QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
> if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
> - error_report("vfio: error: device %s is already attached", path);
> + error_report("vfio: error: device %s is already attached",
> + vbasedev->name);
> vfio_put_group(group);
> return -EBUSY;
> }
> }
> - ret = vfio_get_device(group, path, vbasedev);
> + ret = vfio_get_device(group, vbasedev->name, vbasedev);
> if (ret) {
> - error_report("vfio: failed to get device %s", path);
> + error_report("vfio: failed to get device %s", vbasedev->name);
> vfio_put_group(group);
> return ret;
> }
>
> ret = vfio_populate_device(vbasedev);
> if (ret) {
> - error_report("vfio: failed to populate device %s", path);
> + error_report("vfio: failed to populate device %s", vbasedev->name);
> vfio_put_group(group);
> }
>
> @@ -680,7 +686,9 @@ static void vfio_platform_realize(DeviceState *dev, Error **errp)
> vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
> vbasedev->ops = &vfio_platform_ops;
>
> - trace_vfio_platform_realize(vbasedev->name, vdev->compat);
> + trace_vfio_platform_realize(vbasedev->sysfsdev ?
> + vbasedev->sysfsdev : vbasedev->name,
> + vdev->compat);
>
> ret = vfio_base_device_init(vbasedev);
> if (ret) {
> @@ -702,6 +710,7 @@ static const VMStateDescription vfio_platform_vmstate = {
>
> static Property vfio_platform_dev_properties[] = {
> DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
> + DEFINE_PROP_STRING("sysfsdev", VFIOPlatformDevice, vbasedev.sysfsdev),
> DEFINE_PROP_BOOL("x-no-mmap", VFIOPlatformDevice, vbasedev.no_mmap, false),
> DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
> mmap_timeout, 1100),
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index f037f3c..7e00ffc 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -89,6 +89,7 @@ typedef struct VFIODeviceOps VFIODeviceOps;
> typedef struct VFIODevice {
> QLIST_ENTRY(VFIODevice) next;
> struct VFIOGroup *group;
> + char *sysfsdev;
> char *name;
> int fd;
> int type;
>
next prev parent reply other threads:[~2016-01-26 15:03 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-20 18:06 [Qemu-devel] [RFC PATCH] vfio: Add sysfsdev property for pci & platform Alex Williamson
2016-01-20 18:11 ` Daniel P. Berrange
2016-01-20 18:28 ` Alex Williamson
2016-01-21 7:09 ` P J P
2016-01-21 7:51 ` Tian, Kevin
2016-01-21 15:14 ` Alex Williamson
2016-01-25 19:27 ` Tian, Kevin
2016-01-26 15:03 ` Eric Auger [this message]
2016-01-26 17:08 ` Alex Williamson
2016-02-01 17:32 ` Eric Auger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=56A78AC0.7080300@linaro.org \
--to=eric.auger@linaro.org \
--cc=alex.williamson@redhat.com \
--cc=jike.song@intel.com \
--cc=kevin.tian@intel.com \
--cc=laine@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.