* [Qemu-devel] Assigning network devices to nested VMs results in driver errors in nested VMs
@ 2018-02-14 4:44 Jintack Lim
2018-02-14 5:36 ` Peter Xu
2018-02-14 19:55 ` Jintack Lim
0 siblings, 2 replies; 5+ messages in thread
From: Jintack Lim @ 2018-02-14 4:44 UTC
To: QEMU Devel Mailing List, vfio-users; +Cc: Peter Xu
Hi,

I'm trying to assign network devices to nested VMs on x86 using KVM,
but I get network device driver errors in the nested VMs. (I tried
this about a year ago, before the vIOMMU patches were upstreamed, and
got similar errors at that time.)

This could be a network driver issue, but I'd appreciate help from
anyone who has encountered something similar.

I'm using the v4.15.0 kernel and QEMU v2.11.0, and I followed this
guide [1]. I had no problem assigning devices to the first-level VMs
(L1 VMs), and I confirmed with lspci inside the nested VMs that the
devices were assigned to them. However, the network device drivers
failed to initialize the devices. I tried two network cards: an
Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection and
a Mellanox Technologies MT27500 Family.
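
For context, the setup in the guide [1] comes down to giving the L1 guest an emulated VT-d unit so that L1 can itself use VFIO. A minimal sketch of such an L1 invocation, built as a Python argument list, is below. The option names (intel-iommu, intremap, caching-mode, kernel-irqchip=split, vfio-pci) are QEMU options described in the guide; the memory size, the host address 0000:06:00.0 (borrowed from the host log in this report), and the helper name are illustrative assumptions, not the exact command used here.

```python
def l1_qemu_args(host_bdf, mem="4G"):
    """Build a QEMU argv for an L1 guest with an emulated VT-d IOMMU.

    Hypothetical helper for illustration only; disk, CPU, and network
    options are omitted.
    """
    return [
        "qemu-system-x86_64",
        # q35 with a split irqchip, as the guide asks for when
        # interrupt remapping is enabled.
        "-machine", "q35,accel=kvm,kernel-irqchip=split",
        "-m", mem,
        # The vIOMMU is listed before other devices; caching-mode=on is
        # what the guide requires for vfio-pci devices behind it.
        "-device", "intel-iommu,intremap=on,caching-mode=on",
        # Physical function or VF handed to L1; L1 can then bind it to
        # vfio-pci and pass it on to the nested (L2) guest.
        "-device", f"vfio-pci,host={host_bdf}",
    ]


if __name__ == "__main__":
    print(" ".join(l1_qemu_args("0000:06:00.0")))
```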

The Intel driver error in the nested VM looks like this:

[ 1.939552] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 5.1.0-k
[ 1.949796] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[ 2.210024] ixgbe 0000:00:04.0: HW Init failed: -12
[ 2.218144] ixgbe: probe of 0000:00:04.0 failed with error -12

I also saw lots of these messages in the host (L0) kernel log when
booting the nested VM:

[ 1557.404173] DMAR: DRHD: handling fault status reg 102
[ 1557.409813] DMAR: [DMA Read] Request device [06:00.0] fault addr 90000 [fault reason 06] PTE Read access is not set
[ 1561.383957] DMAR: DRHD: handling fault status reg 202
[ 1561.389598] DMAR: [DMA Read] Request device [06:00.0] fault addr 90000 [fault reason 06] PTE Read access is not set
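
To compare faults like these across runs, it can help to pull the requester ID, fault address, and reason out of each line. The Python sketch below is one way to do that. The regular expression is written against the two message layouts that appear in this thread (with and without the NO_PASID tag and 0x prefixes); it is an assumption about the format, not something taken from the kernel sources, and it expects each fault on a single line, so mail line wraps must be undone first. The function and field names are hypothetical.

```python
import re
from collections import Counter

# Matches, for example:
#   DMAR: [DMA Read] Request device [06:00.0] fault addr 90000 [fault reason 06] ...
#   DMAR: [DMA Read NO_PASID] Request device [ca:08.2] fault addr 0x104ead000 [fault reason 0x06] ...
FAULT_RE = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)[^\]]*\] "
    r"Request device \[(?P<bdf>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] "
    r"fault addr (?:0x)?(?P<addr>[0-9a-f]+) "
    r"\[fault reason (?P<reason>0x[0-9a-f]+|\d+)\] (?P<text>.*)"
)


def parse_dmar_faults(log_text):
    """Return one dict per DMAR fault line found in log_text."""
    faults = []
    for line in log_text.splitlines():
        m = FAULT_RE.search(line)
        if not m:
            continue
        reason = m["reason"]
        faults.append({
            "op": m["op"],
            "bdf": m["bdf"],
            # The address is treated as hex with or without a 0x prefix.
            "addr": int(m["addr"], 16),
            # Assumption: a reason without 0x is decimal, with 0x it is hex.
            "reason": int(reason, 16) if reason.startswith("0x") else int(reason),
            "text": m["text"].strip(),
        })
    return faults


def count_by_device_and_addr(faults):
    """Count how often each (device, fault address) pair occurs."""
    return Counter((f["bdf"], f["addr"]) for f in faults)
```

Grouping by device and address makes it easy to see whether a device keeps faulting on the same address, as it does in the log above.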

This is the Mellanox driver error in another nested VM:

[ 2.481694] mlx4_core: Initializing 0000:00:04.0
[ 3.519422] mlx4_core 0000:00:04.0: Installed FW has unsupported command interface revision 0
[ 3.537769] mlx4_core 0000:00:04.0: (Installed FW version is 0.0.000)
[ 3.551733] mlx4_core 0000:00:04.0: This driver version supports only revisions 2 to 3
[ 3.568758] mlx4_core 0000:00:04.0: QUERY_FW command failed, aborting
[ 3.582789] mlx4_core 0000:00:04.0: Failed to init fw, aborting.

The host showed messages similar to those above.

I wonder what could be causing these errors. Please let me know
if further information is needed.

[1] https://wiki.qemu.org/Features/VT-d

Thanks,
Jintack
* Re: [Qemu-devel] Assigning network devices to nested VMs results in driver errors in nested VMs
From: Peter Xu @ 2018-02-14 5:36 UTC
To: Jintack Lim; +Cc: QEMU Devel Mailing List, vfio-users
Hi, Jintack,

Thanks for reporting the problem.

I haven't played with nested assignment much recently (or even
before that), but I think I ran into a similar problem in the past.

I'll let you know if I make any progress, but it probably won't
happen in the next few days, since there's a week-long holiday
starting tomorrow (Chinese Spring Festival).

--
Peter Xu
* Re: [Qemu-devel] Assigning network devices to nested VMs results in driver errors in nested VMs
From: Jintack Lim @ 2018-02-14 13:57 UTC
To: Peter Xu; +Cc: QEMU Devel Mailing List, vfio-users
On Wed, Feb 14, 2018 at 12:36 AM, Peter Xu <peterx@redhat.com> wrote:
> Hi, Jintack,

Hi Peter,

> Thanks for reporting the problem.
>
> I haven't played with nested assignment much recently (or even
> before that), but I think I ran into a similar problem in the past.

Oh, that's good to hear :)

> I'll let you know if I make any progress, but it probably won't
> happen in the next few days, since there's a week-long holiday
> starting tomorrow (Chinese Spring Festival).

Thanks a lot. Enjoy the Lunar New Year!

>
> --
> Peter Xu
>
* Re: [Qemu-devel] Assigning network devices to nested VMs results in driver errors in nested VMs
From: Jintack Lim @ 2018-02-14 19:55 UTC
To: QEMU Devel Mailing List, vfio-users; +Cc: Peter Xu
On Tue, Feb 13, 2018 at 11:44 PM, Jintack Lim <jintack@cs.columbia.edu> wrote:
> The Intel driver error in the nested VM looks like this:
>
> [ 1.939552] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 5.1.0-k
> [ 1.949796] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
> [ 2.210024] ixgbe 0000:00:04.0: HW Init failed: -12
> [ 2.218144] ixgbe: probe of 0000:00:04.0 failed with error -12

So far I had been assigning the PF to both the L1 and L2 VMs; I guess
this is not the right way. So I tried assigning a VF to the L1 VM and
then assigning that same VF to the L2 VM. This time the device driver
in the L2 VM didn't show any errors, and I was able to configure the
network interface, but the network still didn't work.

I have only tried the Intel network device so far.
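
On the host, the relationship between a PF and its VFs can be read from sysfs, which is a quick way to confirm which function is actually being handed to the L1 VM. The sketch below is a minimal Python example, assuming the standard Linux layout in which each VF appears as a virtfnN symlink under the PF's directory in /sys/bus/pci/devices. The function name and the sysfs_root parameter (present so the code can be exercised against a fake tree) are hypothetical.

```python
import os
import re


def list_vfs(pf_bdf, sysfs_root="/sys/bus/pci/devices"):
    """Return {vf_index: vf_bdf} for the given PF, e.g. '0000:06:00.0'."""
    pf_dir = os.path.join(sysfs_root, pf_bdf)
    vfs = {}
    for name in os.listdir(pf_dir):
        m = re.fullmatch(r"virtfn(\d+)", name)
        if m:
            # Each virtfnN entry is a symlink to the VF's own device
            # directory; its basename is the VF's PCI address.
            target = os.readlink(os.path.join(pf_dir, name))
            vfs[int(m.group(1))] = os.path.basename(target)
    return dict(sorted(vfs.items()))


if __name__ == "__main__":
    # Address taken from the host log in this thread; adjust as needed.
    print(list_vfs("0000:06:00.0"))
```

If 0000:06:10.0 belongs to this PF, it should show up in the result, most likely as VF 0.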

> I also saw lots of these messages in the host (L0) kernel log when
> booting the nested VM:
>
> [ 1557.404173] DMAR: DRHD: handling fault status reg 102
> [ 1557.409813] DMAR: [DMA Read] Request device [06:00.0] fault addr 90000 [fault reason 06] PTE Read access is not set
> [ 1561.383957] DMAR: DRHD: handling fault status reg 202
> [ 1561.389598] DMAR: [DMA Read] Request device [06:00.0] fault addr 90000 [fault reason 06] PTE Read access is not set

I still see similar errors in the host kernel log, though the fault
addresses are different:

[ 3228.636485] ixgbe 0000:06:00.0 eth2: VF Reset msg received from vf 0
[ 3236.023683] DMAR: DRHD: handling fault status reg 2
[ 3236.029129] DMAR: [DMA Read] Request device [06:10.0] fault addr 354748000 [fault reason 06] PTE Read access is not set
[ 3236.371711] DMAR: DRHD: handling fault status reg 102
[ 3236.377353] DMAR: [DMA Read] Request device [06:10.0] fault addr 354748000 [fault reason 06] PTE Read access is not set
[ 3236.595667] DMAR: DRHD: handling fault status reg 202
[ 3236.601307] DMAR: [DMA Read] Request device [06:10.0] fault addr 354748000 [fault reason 06] PTE Read access is not set
[ 3236.831863] DMAR: DRHD: handling fault status reg 302
[ 3236.837503] DMAR: [DMA Read] Request device [06:10.0] fault addr 370b7c000 [fault reason 06] PTE Read access is not set
[ 3237.647806] vfio-pci 0000:06:10.0: timed out waiting for pending transaction; performing function level reset anyway
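
The "fault status reg" values in these logs (2, 102, 202, 302) can be read as bit fields. The Python sketch below decodes them under two assumptions: that the kernel prints the VT-d Fault Status Register in hex, and that the layout follows the VT-d specification, with bit 0 as Primary Fault Overflow, bit 1 as Primary Pending Fault, and bits 15:8 as the Fault Record Index. The function name is hypothetical, and the layout should be checked against the specification before relying on it.

```python
def decode_fault_status(hex_text):
    """Decode a 'fault status reg' value as printed in the DMAR log (hex)."""
    value = int(hex_text, 16)
    return {
        "primary_fault_overflow": bool(value & 0x1),  # bit 0 (PFO), assumed
        "primary_pending_fault": bool(value & 0x2),   # bit 1 (PPF), assumed
        "fault_record_index": (value >> 8) & 0xFF,    # bits 15:8 (FRI), assumed
    }


if __name__ == "__main__":
    for reg in ("2", "102", "202", "302"):
        print(reg, decode_fault_status(reg))
```

Read this way, the four values differ only in the fault record index (0 through 3), with a fault pending and no overflow flagged in each case.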
* Re: [Qemu-devel] Assigning network devices to nested VMs results in driver errors in nested VMs
From: ryan.ljlu @ 2024-10-14 10:51 UTC
To: jintack@cs.columbia.edu; +Cc: qemu-devel@nongnu.org
Hi Jintack,

I ran into the same issue you described in https://lists.nongnu.org/archive/html/qemu-devel/2018-02/msg03876.html

I am trying to pass through a Mellanox (MLNX) VF and an NVMe device to an L2 VM. Both PCI devices show up correctly in lspci, but both have problems with their drivers.

My environment:
Host Linux version: 5.15.131
QEMU: 8.2.0

Has there been any update on this issue since you reported it in 2018?

The L2 VM dmesg error logs when passing through the MLNX VF are below:
[Mon Oct 14 04:12:47 2024] pci 0000:00:06.0: [15b3:101e] type 00 class 0x020000
[Mon Oct 14 04:12:47 2024] pci 0000:00:06.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit pref]
[Mon Oct 14 04:12:47 2024] pci 0000:00:06.0: enabling Extended Tags
[Mon Oct 14 04:12:47 2024] pci 0000:00:06.0: 0.000 Gb/s available PCIe bandwidth, limited by Unknown x0 link at 0000:00:06.0 (capable of 126.024 Gb/s with 16.0 GT/s PCIe x8 link)
[Mon Oct 14 04:12:47 2024] pci 0000:00:06.0: BAR 0: assigned [mem 0x140000000-0x1400fffff 64bit pref]
[Mon Oct 14 04:12:48 2024] mlx5_core 0000:00:06.0: enabling device (0000 -> 0002)
[Mon Oct 14 04:12:48 2024] mlx5_core 0000:00:06.0: firmware version: 22.35.2000
[Mon Oct 14 04:13:50 2024] mlx5_core 0000:00:06.0: wait_func:1151:(pid 1193): ENABLE_HCA(0x104) timeout. Will cause a leak of a command resource
[Mon Oct 14 04:13:50 2024] mlx5_core 0000:00:06.0: mlx5_function_setup:1164:(pid 1193): enable hca failed
[Mon Oct 14 04:13:50 2024] mlx5_core 0000:00:06.0: probe_one:1770:(pid 1193): mlx5_init_one failed with error code -110
[Mon Oct 14 04:13:50 2024] mlx5_core: probe of 0000:00:06.0 failed with error -110
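
The negative number at the end of a probe failure like the one above is usually a Linux errno value. As a small illustration, the Python sketch below extracts that number from "probe of ... failed with error" lines and looks it up in a short hand-written table of Linux errno values. The table is deliberately partial and the function name is hypothetical. Some drivers print their own private error codes in other messages, so this lookup is only meaningful for values that really are errnos.

```python
import re

# Partial table of Linux errno values (asm-generic numbering).
LINUX_ERRNO = {
    5: ("EIO", "I/O error"),
    16: ("EBUSY", "Device or resource busy"),
    19: ("ENODEV", "No such device"),
    22: ("EINVAL", "Invalid argument"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

PROBE_RE = re.compile(r"probe of (?P<dev>\S+) failed with error -(?P<code>\d+)")


def explain_probe_failures(log_text):
    """Return (device, code, errno_name, description) for each probe failure."""
    results = []
    for m in PROBE_RE.finditer(log_text):
        code = int(m["code"])
        name, desc = LINUX_ERRNO.get(code, ("unknown", "not in this table"))
        results.append((m["dev"], code, name, desc))
    return results
```

For the mlx5_core line above this yields ETIMEDOUT for device 0000:00:06.0, which is consistent with the ENABLE_HCA timeout logged just before it.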

The host dmesg error logs when passing through the MLNX VF are below:
[Mon Oct 14 08:07:08 2024] DMAR: DRHD: handling fault status reg 2
[Mon Oct 14 08:07:08 2024] DMAR: [DMA Read NO_PASID] Request device [ca:08.2] fault addr 0x104ead000 [fault reason 0x06] PTE Read access is not set
Thanks,
Ryan