From: Jerin Jacob <jerin.jacob@caviumnetworks.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: Alejandro Lucero <alejandro.lucero@netronome.com>,
dev@dpdk.org, "Burakov, Anatoly" <anatoly.burakov@intel.com>
Subject: Re: [PATCH] vfio: fix device unplug when several devices per vfio group
Date: Mon, 8 May 2017 20:50:07 +0530
Message-ID: <20170508152006.GA28180@jerin>
In-Reply-To: <1528500.cNnDXbOJL1@xps>
-----Original Message-----
> Date: Sun, 30 Apr 2017 19:29:49 +0200
> From: Thomas Monjalon <thomas@monjalon.net>
> To: Alejandro Lucero <alejandro.lucero@netronome.com>
> Cc: dev@dpdk.org, "Burakov, Anatoly" <anatoly.burakov@intel.com>
> Subject: Re: [dpdk-dev] [PATCH] vfio: fix device unplug when several
> devices per vfio group
>
> 28/04/2017 15:25, Burakov, Anatoly:
> > From: Alejandro Lucero [mailto:alejandro.lucero@netronome.com]
> > > VFIO allows a secure way of assigning devices to user space and those
> > > devices which can not be isolated from other ones are set in same VFIO
> > > group. Releasing or unplugging a device should be aware of remaining
> > > devices is the same group for avoiding to close such a group.
> > >
> > > Fixes: 94c0776b1bad ("vfio: support hotplug")
> > >
> > > Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
> >
> > I have tested this on my setup on an old kernel with multiple attach/detaches, and it works (whereas it fails without this patch).
> >
> > Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
> Applied, thanks
This patch causes an issue when a large number of PCIe devices are
connected to the system. I found it through git bisect.
The issue is that vfio_group_fd goes beyond 64 (VFIO_MAX_GROUPS), so the
following statement writes to the wrong memory and subsequently causes
problems in VFIO mapping and elsewhere:
vfio_cfg.vfio_groups[vfio_group_fd].devices++;
I can increase VFIO_MAX_GROUPS, but I don't think that is the correct fix,
as vfio_group_fd is generated by the open() system call.
I added some prints to the code for debugging. Please find the diff and the
output below.
Any thoughts from VFIO experts?
➜ [master]83xx [dpdk-master] $ git diff
diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
index d3eae20..2d8ee4c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
+++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
@@ -100,6 +100,7 @@ vfio_get_group_fd(int iommu_group_no)
snprintf(filename, sizeof(filename),
VFIO_GROUP_FMT, iommu_group_no);
vfio_group_fd = open(filename, O_RDWR);
+ printf("###### name %s vfio_group_fd %d\n", filename, vfio_group_fd);
if (vfio_group_fd < 0) {
/* if file not found, it's not an error */
if (errno != ENOENT) {
@@ -259,6 +260,8 @@ vfio_setup_device(const char *sysfs_base, const char *dev_addr,
if (vfio_group_fd < 0)
return -1;
+ printf("#### iommu_group_fd %d vfio_group_fd=%d\n", iommu_group_no, vfio_group_fd);
+
/* if group_fd == 0, that means the device isn't managed by VFIO */
if (vfio_group_fd == 0) {
RTE_LOG(WARNING, EAL, " %s not managed by VFIO driver, skipping\n",
@@ -266,6 +269,7 @@ vfio_setup_device(const char *sysfs_base, const char *dev_addr,
return 1;
}
/*
* at this point, we know that this group is viable (meaning,
* all devices
* are either bound to VFIO or not bound to anything)
@@ -359,6 +363,7 @@ vfio_setup_device(const char *sysfs_base, const char *dev_addr,
return -1;
}
vfio_cfg.vfio_groups[vfio_group_fd].devices++;
+ printf("vfio_group_fd %d device %d\n", vfio_group_fd, vfio_cfg.vfio_groups[vfio_group_fd].devices++);
return 0;
}
output log
----------
EAL: PCI device 0000:07:00.1 on NUMA socket 0
EAL: probe driver: 177d:a04b octeontx_ssovf
###### name /dev/vfio/114 vfio_group_fd 44
#### iommu_group_fd 114 vfio_group_fd=44
EAL: using IOMMU type 1 (Type 1)
vfio_group_fd 44 device 1
EAL: PCI device 0000:07:00.2 on NUMA socket 0
EAL: probe driver: 177d:a04b octeontx_ssovf
###### name /dev/vfio/115 vfio_group_fd 47
#### iommu_group_fd 115 vfio_group_fd=47
vfio_group_fd 47 device 1
EAL: PCI device 0000:07:00.3 on NUMA socket 0
EAL: probe driver: 177d:a04b octeontx_ssovf
###### name /dev/vfio/116 vfio_group_fd 50
#### iommu_group_fd 116 vfio_group_fd=50
vfio_group_fd 50 device 1
EAL: PCI device 0000:07:00.4 on NUMA socket 0
EAL: probe driver: 177d:a04b octeontx_ssovf
###### name /dev/vfio/117 vfio_group_fd 53
#### iommu_group_fd 117 vfio_group_fd=53
vfio_group_fd 53 device 1
EAL: PCI device 0000:07:00.5 on NUMA socket 0
EAL: probe driver: 177d:a04b octeontx_ssovf
###### name /dev/vfio/118 vfio_group_fd 56
#### iommu_group_fd 118 vfio_group_fd=56
vfio_group_fd 56 device 1
EAL: PCI device 0000:07:00.6 on NUMA socket 0
EAL: probe driver: 177d:a04b octeontx_ssovf
###### name /dev/vfio/119 vfio_group_fd 59
#### iommu_group_fd 119 vfio_group_fd=59
vfio_group_fd 59 device 1
EAL: PCI device 0000:07:00.7 on NUMA socket 0
EAL: probe driver: 177d:a04b octeontx_ssovf
###### name /dev/vfio/120 vfio_group_fd 62
#### iommu_group_fd 120 vfio_group_fd=62
vfio_group_fd 62 device 1
EAL: PCI device 0000:07:01.0 on NUMA socket 0
EAL: probe driver: 177d:a04b octeontx_ssovf
###### name /dev/vfio/121 vfio_group_fd 65
#### iommu_group_fd 121 vfio_group_fd=65
vfio_group_fd 65 device 1632632833
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^(memory corruption here)
EAL: PCI device 0000:08:00.1 on NUMA socket 0
EAL: probe driver: 177d:a04d octeontx_ssowvf
###### name /dev/vfio/122 vfio_group_fd 68
#### iommu_group_fd 122 vfio_group_fd=68
vfio_group_fd 68 device 1
EAL: PCI device 0000:08:00.2 on NUMA socket 0
EAL: probe driver: 177d:a04d octeontx_ssowvf
###### name /dev/vfio/123 vfio_group_fd 71
#### iommu_group_fd 123 vfio_group_fd=71
vfio_group_fd 71 device 99999941
EAL: PCI device 0000:08:00.3 on NUMA socket 0
EAL: probe driver: 177d:a04d octeontx_ssowvf
###### name /dev/vfio/124 vfio_group_fd 74
#### iommu_group_fd 124 vfio_group_fd=74
vfio_group_fd 74 device 1
EAL: PCI device 0000:08:00.4 on NUMA socket 0
EAL: probe driver: 177d:a04d octeontx_ssowvf
###### name /dev/vfio/125 vfio_group_fd 77
#### iommu_group_fd 125 vfio_group_fd=77
vfio_group_fd 77 device 1
EAL: PCI device 0000:08:00.5 on NUMA socket 0
EAL: probe driver: 177d:a04d octeontx_ssowvf
###### name /dev/vfio/126 vfio_group_fd 80
#### iommu_group_fd 126 vfio_group_fd=80
vfio_group_fd 80 device 1
EAL: PCI device 0000:08:00.6 on NUMA socket 0
EAL: probe driver: 177d:a04d octeontx_ssowvf
###### name /dev/vfio/127 vfio_group_fd 83
#### iommu_group_fd 127 vfio_group_fd=83
vfio_group_fd 83 device 1
EAL: PCI device 0000:08:00.7 on NUMA socket 0
EAL: probe driver: 177d:a04d octeontx_ssowvf
EAL: PCI device 0000:08:01.0 on NUMA socket 0
EAL: probe driver: 177d:a04d octeontx_ssowvf
EAL: PCI device 0001:01:00.1 on NUMA socket 0
EAL: probe driver: 177d:a034 net_thunderx
###### name /dev/vfio/64 vfio_group_fd 86
#### iommu_group_fd 64 vfio_group_fd=86
vfio_group_fd 86 device 1
Segmentation fault
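For illustration only, here is a minimal sketch of one way to avoid the
out-of-bounds write described above: keep the group fd as a field of each
table entry and search for it, instead of using the fd as the array index.
This is not taken from the DPDK tree. The struct layout and the helper names
(groups_init, group_find, group_add, group_device_added) are hypothetical;
only VFIO_MAX_GROUPS and the fd values come from this mail.

```c
#include <stddef.h>

#define VFIO_MAX_GROUPS 64

/* Hypothetical, simplified stand-in for the per-group bookkeeping. */
struct vfio_group {
	int group_no; /* IOMMU group number, e.g. 121 for /dev/vfio/121 */
	int fd;       /* fd returned by open() on the group file, -1 if free */
	int devices;  /* devices currently attached through this group */
};

static struct vfio_group groups[VFIO_MAX_GROUPS];

/* Mark every slot as free. */
static void groups_init(void)
{
	for (int i = 0; i < VFIO_MAX_GROUPS; i++) {
		groups[i].group_no = -1;
		groups[i].fd = -1;
		groups[i].devices = 0;
	}
}

/* Return the slot that tracks this fd, or NULL if the fd is unknown. */
static struct vfio_group *group_find(int fd)
{
	for (int i = 0; i < VFIO_MAX_GROUPS; i++)
		if (groups[i].fd == fd)
			return &groups[i];
	return NULL;
}

/* Claim the first free slot for a newly opened group, or NULL if full. */
static struct vfio_group *group_add(int group_no, int fd)
{
	for (int i = 0; i < VFIO_MAX_GROUPS; i++) {
		if (groups[i].fd == -1) {
			groups[i].group_no = group_no;
			groups[i].fd = fd;
			groups[i].devices = 0;
			return &groups[i];
		}
	}
	return NULL;
}

/*
 * Count one more device on the group behind fd.
 * Returns the new device count, or -1 if the table is full.
 * The fd value itself is never used as an index, so an fd of 65 or 86
 * (as seen in the log) cannot write past the end of the array.
 */
static int group_device_added(int group_no, int fd)
{
	struct vfio_group *g = group_find(fd);

	if (g == NULL)
		g = group_add(group_no, fd);
	if (g == NULL)
		return -1;
	return ++g->devices;
}
```

With this layout, calling groups_init() once and then
group_device_added(121, 65) stores the count in the first free slot rather
than at index 65. The limit becomes the number of distinct groups in use
rather than the numeric value of the fd, at the cost of a linear search over
at most VFIO_MAX_GROUPS entries.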