* LTP test related to virtio releasing and reassigning resource leads to guest hung
@ 2023-08-10 8:57 longguang.yue
2023-08-10 9:08 ` Michael Tokarev
0 siblings, 1 reply; 7+ messages in thread
From: longguang.yue @ 2023-08-10 8:57 UTC (permalink / raw)
To: qemu-devel, linux-kernel
Hi, all:
An LTP test causes the guest to hang (I/O hang). The test releases a virtio device's resource and then reassigns it.
I found that the device's 64-bit prefetchable memory resource (BAR 4) is moved to a different address.
ltp test: https://github.com/linux-test-project/ltp/blob/522d7fba4afc84e07b252aa4cd91b241e81d6613/testcases/kernel/device-drivers/pci/tpci_kernel/ltp_tpci.c#L428
Do you know what causes the problem?
Thanks very much.
--------------------------
ENV: kernel 5.10.0, qemu 6.2
———————————
[ 301.194813] ltp_tpci: name = 0000:00:05.0, flags = 1319436, start 0xfe004000, end 0xfe007fff
[ 301.194814] virtio-pci 0000:00:05.0: BAR 4: releasing [mem 0xfe004000-0xfe007fff 64bit pref]
[ 301.194816] virtio-pci 0000:00:05.0: BAR 4: assigned [mem 0x240004000-0x240007fff 64bit pref]
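The two BAR 4 lines above show the range keeping its size while moving from below the 4 GiB boundary to above it. A minimal user-space C sketch of that arithmetic follows; it uses only the addresses from the log, and the function names `range_size` and `ends_above_4gib` are made up for illustration (they are not kernel or LTP APIs).

```c
#include <stdbool.h>
#include <stdint.h>

/* Size in bytes of an inclusive [start, end] resource range. */
uint64_t range_size(uint64_t start, uint64_t end)
{
	return end - start + 1;
}

/* True if the range ends above the 4 GiB boundary, i.e. its end
 * address no longer fits in 32 bits. */
bool ends_above_4gib(uint64_t end)
{
	return end > UINT32_MAX;
}
```

With the values from the log, `range_size(0xfe004000, 0xfe007fff)` and `range_size(0x240004000, 0x240007fff)` are both 0x4000 bytes (16 KiB), and only the newly assigned range ends above 0xffffffff.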
----------------------
ps -aux| grep D
root 7 0.0 0.0 0 0 ? D 14:37 0:00 [kworker/u16:0+flush-253:0]
root 483 0.0 0.0 0 0 ? D 14:37 0:00 [jbd2/vda3-8]
---------- hung task panic ----------
[ 9585.419571][ T62] INFO: task jbd2/vda3-8:475 blocked for more than 122 seconds.
[ 9585.420274][ T62] Tainted: G W OE 5.10.0-60.67.0.96.ule3.x86_64 #1
[ 9585.421027][ T62] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9585.421854][ T62] task:jbd2/vda3-8 state:D stack: 0 pid: 475 ppid: 2 flags:0x00004000
[ 9585.422715][ T62] Call Trace:
[ 9585.423020][ T62] __schedule+0x3f6/0x7d0
[ 9585.423435][ T62] ? bit_wait+0x60/0x60
[ 9585.423825][ T62] schedule+0x46/0xb0
[ 9585.424194][ T62] io_schedule+0x42/0x70
[ 9585.424597][ T62] bit_wait_io+0xd/0x60
[ 9585.424977][ T62] __wait_on_bit+0x28/0x90
[ 9585.425383][ T62] out_of_line_wait_on_bit+0x92/0xb0
[ 9585.425879][ T62] ? var_wake_function+0x30/0x30
[ 9585.426341][ T62] jbd2_journal_commit_transaction+0xd02/0x15e0 [jbd2]
[ 9585.427005][ T62] kjournald2+0xab/0x270 [jbd2]
[ 9585.427455][ T62] ? wait_woken+0x80/0x80
[ 9585.427863][ T62] ? commit_timeout+0x10/0x10 [jbd2]
[ 9585.428355][ T62] kthread+0xfb/0x140
[ 9585.428733][ T62] ? kthread_park+0x90/0x90
[ 9585.429144][ T62] ret_from_fork+0x1f/0x30
[ 9585.429578][ T62] INFO: task kworker/u16:0:4392 blocked for more than 122 seconds.
[ 9585.430303][ T62] Tainted: G W OE 5.10.0-60.67.0.96.ule3.x86_64 #1
[ 9585.431062][ T62] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9585.431885][ T62] task:kworker/u16:0 state:D stack: 0 pid: 4392 ppid: 2 flags:0x00004080
[ 9585.432751][ T62] Workqueue: writeback wb_workfn (flush-253:0)
[ 9585.433328][ T62] Call Trace:
[ 9585.433642][ T62] __schedule+0x3f6/0x7d0
[ 9585.434035][ T62] ? bit_wait+0x60/0x60
[ 9585.434424][ T62] schedule+0x46/0xb0
[ 9585.434809][ T62] io_schedule+0x42/0x70
[ 9585.435201][ T62] bit_wait_io+0xd/0x60
[ 9585.435605][ T62] __wait_on_bit+0x28/0x90
[ 9585.436016][ T62] out_of_line_wait_on_bit+0x92/0xb0
[ 9585.436507][ T62] ? var_wake_function+0x30/0x30
[ 9585.437006][ T62] do_get_write_access+0x244/0x320 [jbd2]
[ 9585.437613][ T62] jbd2_journal_get_write_access+0x67/0x90 [jbd2]
[ 9585.438222][ T62] __ext4_journal_get_write_access+0x44/0x90 [ext4]
[ 9585.438859][ T62] ? ext4_get_inode_loc+0x3e/0xa0 [ext4]
[ 9585.439390][ T62] ext4_reserve_inode_write+0x83/0xc0 [ext4]
[ 9585.439956][ T62] __ext4_mark_inode_dirty+0x50/0x120 [ext4]
[ 9585.440528][ T62] ext4_ext_insert_extent+0x386/0x670 [ext4]
[ 9585.441099][ T62] ? ext4_mb_new_blocks+0x14c/0x540 [ext4]
[ 9585.441663][ T62] ext4_ext_map_blocks+0x30e/0x8f0 [ext4]
[ 9585.442187][ T62] ? find_get_pages_range_tag+0x1b0/0x220
[ 9585.442736][ T62] ext4_map_blocks+0x18e/0x5a0 [ext4]
[ 9585.443239][ T62] ? mpage_prepare_extent_to_map+0x2db/0x300 [ext4]
[ 9585.443866][ T62] mpage_map_one_extent+0x64/0x150 [ext4]
[ 9585.444401][ T62] mpage_map_and_submit_extent+0x77/0x210 [ext4]
[ 9585.444996][ T62] ext4_writepages+0x627/0x790 [ext4]
[ 9585.445499][ T62] ? update_nohz_stats+0x43/0x60
[ 9585.445967][ T62] do_writepages+0x31/0xc0
[ 9585.446378][ T62] __writeback_single_inode+0x39/0x200
[ 9585.446888][ T62] writeback_sb_inodes+0x20a/0x4f0
[ 9585.447358][ T62] __writeback_inodes_wb+0x4c/0xd0
[ 9585.447837][ T62] wb_writeback+0x1d8/0x2a0
[ 9585.448256][ T62] wb_check_old_data_flush+0xb6/0xc0
[ 9585.448764][ T62] wb_do_writeback+0xc1/0x180
[ 9585.449199][ T62] ? set_worker_desc+0xaa/0xc0
[ 9585.449652][ T62] wb_workfn+0x5a/0x180
[ 9585.450035][ T62] process_one_work+0x1ad/0x350
[ 9585.450490][ T62] worker_thread+0x49/0x310
[ 9585.450912][ T62] ? rescuer_thread+0x370/0x370
[ 9585.451364][ T62] kthread+0xfb/0x140
[ 9585.451738][ T62] ? kthread_park+0x90/0x90
[ 9585.452145][ T62] ret_from_fork+0x1f/0x30
[ 9585.452571][ T62] Kernel panic - not syncing: hung_task: blocked tasks
* Re: LTP test related to virtio releasing and reassigning resource leads to guest hung
2023-08-10 8:57 LTP test related to virtio releasing and reassigning resource leads to guest hung longguang.yue
@ 2023-08-10 9:08 ` Michael Tokarev
2023-08-10 10:35 ` longguang.yue
0 siblings, 1 reply; 7+ messages in thread
From: Michael Tokarev @ 2023-08-10 9:08 UTC (permalink / raw)
To: longguang.yue, qemu-devel, linux-kernel
10.08.2023 11:57, longguang.yue wrote:
> Hi, all:
> A ltp test leads to guest hung(io hung), the test releases virtio device resource and then reassign.
> I find device’s mem prefetchable resource 64-bit is changed.
>
> ltp
> test: https://github.com/linux-test-project/ltp/blob/522d7fba4afc84e07b252aa4cd91b241e81d6613/testcases/kernel/device-drivers/pci/tpci_kernel/ltp_tpci.c#L428
>
> Do you know what cause the problem?
>
> Thanks very much.
>
> --------------------------
> ENV: kernel 5.10.0, qemu 6.2
Current qemu is 8.1 (well, almost, to be released this month;
previous release is 8.0 anyway).
This might be interesting to test in a current version before
going any further.
Thanks,
/mjt
* Re: LTP test related to virtio releasing and reassigning resource leads to guest hung
2023-08-10 9:08 ` Michael Tokarev
@ 2023-08-10 10:35 ` longguang.yue
2023-08-10 14:13 ` Stefan Hajnoczi
0 siblings, 1 reply; 7+ messages in thread
From: longguang.yue @ 2023-08-10 10:35 UTC (permalink / raw)
To: Michael Tokarev, mst@redhat.com, stefanha@redhat.com
Cc: qemu-devel, linux-kernel
Could you please give me some tips for diagnosing this? I can run tests on qemu 8.0, but the production environment cannot be updated.
I tested several 5.10.0-X kernel versions; one behaves better, and the results suggest the problem is related to the host kernel rather than to qemu.
The test cases are different combinations of i440fx/q35, virtio/scsi, and kernel version.
Thanks
* Re: LTP test related to virtio releasing and reassigning resource leads to guest hung
2023-08-10 10:35 ` longguang.yue
@ 2023-08-10 14:13 ` Stefan Hajnoczi
2023-08-10 15:24 ` Stefan Hajnoczi
0 siblings, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2023-08-10 14:13 UTC (permalink / raw)
To: longguang.yue; +Cc: Michael Tokarev, mst@redhat.com, qemu-devel, linux-kernel
On Thu, Aug 10, 2023 at 06:35:32PM +0800, longguang.yue wrote:
> could you please give me some tips to diagnose? I could do tests on qemu 8.0, but product environment could not update.
> I test on different kernel version 5.10.0-X, one is better and results show problem is more about host kernel rather than qemu.
>
>
> test cases are different combination of i440fx/q35 and virtio/scsi and kernel.
Can you post the guest kernel messages (dmesg)? If the guest is hanging
then it may be easiest to configure a serial console so the kernel
messages are sent to the host where you can see them.
Does the hang occur during the LTP code you linked or afterwards when
the PCI device is bound to a virtio driver?
Which virtio device causes the problem?
Can you describe the hang in more detail: is the guest still responsive
(e.g. console or network)? Is the QEMU HMP/QMP monitor still responsive?
Thanks,
Stefan
>
>
>
>
> thanks
>
>
>
>
> ---- Replied Message ----
> | From | Michael Tokarev<mjt@tls.msk.ru> |
> | Date | 08/10/2023 17:08 |
> | To | longguang.yue<kvmluck@163.com> ,
> qemu-devel<qemu-devel@nongnu.org> ,
> linux-kernel<linux-kernel@vger.kernel.org> |
> | Subject | Re: LTP test related to virtio releasing and reassigning resource leads to guest hung |
> 10.08.2023 11:57, longguang.yue wrote:
> Hi, all:
> A ltp test leads to guest hung(io hung), the test releases virtio device resource and then reassign.
> I find device’s mem prefetchable resource 64-bit is changed.
>
> ltp
> test: https://github.com/linux-test-project/ltp/blob/522d7fba4afc84e07b252aa4cd91b241e81d6613/testcases/kernel/device-drivers/pci/tpci_kernel/ltp_tpci.c#L428
>
> Do you know what cause the problem?
>
> Thanks very much.
>
> --------------------------
> ENV: kernel 5.10.0, qemu 6.2
>
> Current qemu is 8.1 (well, almost, to be released this month;
> previous release is 8.0 anyway).
>
> This might be interesting to test in a current version before
> going any further.
>
> Thanks,
>
> /mjt
* Re: LTP test related to virtio releasing and reassigning resource leads to guest hung
2023-08-10 14:13 ` Stefan Hajnoczi
@ 2023-08-10 15:24 ` Stefan Hajnoczi
2023-08-11 2:26 ` longguang.yue
0 siblings, 1 reply; 7+ messages in thread
From: Stefan Hajnoczi @ 2023-08-10 15:24 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: longguang.yue, Michael Tokarev, mst@redhat.com, qemu-devel,
linux-kernel
On Thu, 10 Aug 2023 at 10:14, Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Thu, Aug 10, 2023 at 06:35:32PM +0800, longguang.yue wrote:
> > could you please give me some tips to diagnose? I could do tests on qemu 8.0, but product environment could not update.
> > I test on different kernel version 5.10.0-X, one is better and results show problem is more about host kernel rather than qemu.
> >
> >
> > test cases are different combination of i440fx/q35 and virtio/scsi and kernel.
>
> Can you post the guest kernel messages (dmesg)? If the guest is hanging
> then it may be easiest to configure a serial console so the kernel
> messages are sent to the host where you can see them.
>
> Does the hang occur during the LTP code you linked or afterwards when
> the PCI device is bound to a virtio driver?
I didn't see your original email so I missed the panic. I'd still like
to see the earlier kernel messages before the panic in order to
understand how the PCI device is bound.
Is the vda device with hung I/O the same device that was accessed by
the LTP test earlier? I guess the LTP test runs against the device and
then the virtio driver binds to the device again afterwards?
>
> Which virtio device causes the problem?
>
> Can you describe the hang in more detail: is the guest still responsive
> (e.g. console or network)? Is the QEMU HMP/QMP monitor still responsive?
>
> Thanks,
> Stefan
>
> >
> >
> >
> >
> > thanks
> >
> >
> >
> >
> > ---- Replied Message ----
> > | From | Michael Tokarev<mjt@tls.msk.ru> |
> > | Date | 08/10/2023 17:08 |
> > | To | longguang.yue<kvmluck@163.com> ,
> > qemu-devel<qemu-devel@nongnu.org> ,
> > linux-kernel<linux-kernel@vger.kernel.org> |
> > | Subject | Re: LTP test related to virtio releasing and reassigning resource leads to guest hung |
> > 10.08.2023 11:57, longguang.yue wrote:
> > Hi, all:
> > A ltp test leads to guest hung(io hung), the test releases virtio device resource and then reassign.
> > I find device’s mem prefetchable resource 64-bit is changed.
> >
> > ltp
> > test: https://github.com/linux-test-project/ltp/blob/522d7fba4afc84e07b252aa4cd91b241e81d6613/testcases/kernel/device-drivers/pci/tpci_kernel/ltp_tpci.c#L428
> >
> > Do you know what cause the problem?
> >
> > Thanks very much.
> >
> > --------------------------
> > ENV: kernel 5.10.0, qemu 6.2
> >
> > Current qemu is 8.1 (well, almost, to be released this month;
> > previous release is 8.0 anyway).
> >
> > This might be interesting to test in a current version before
> > going any further.
> >
> > Thanks,
> >
> > /mjt
* Re: LTP test related to virtio releasing and reassigning resource leads to guest hung
2023-08-10 15:24 ` Stefan Hajnoczi
@ 2023-08-11 2:26 ` longguang.yue
2023-08-15 15:18 ` Stefan Hajnoczi
0 siblings, 1 reply; 7+ messages in thread
From: longguang.yue @ 2023-08-11 2:26 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Michael Tokarev, mst@redhat.com, qemu-devel, linux-kernel
1)
> Can you post the guest kernel messages (dmesg)? If the guest is hanging
> then it may be easiest to configure a serial console so the kernel
> messages are sent to the host where you can see them.
> Does the hang occur during the LTP code you linked or afterwards when
> the PCI device is bound to a virtio driver?
I used the console. The hang occurred afterwards; dmesg shows that the tpci test finished without error.
LTP test case: https://github.com/linux-test-project/ltp/blob/522d7fba4afc84e07b252aa4cd91b241e81d6613/testcases/kernel/device-drivers/pci/tpci_kernel/ltp_tpci.c#L428
kernel 5.10, qemu 6.2
Tests with different guest configurations show different results. The guest does not crash if hung_task_panic=0; in my case I enabled hung_task_panic in order to get a trace.
test case 1:
xml machine pc, virtio disk, virtio net: guest I/O hangs and the network breaks down. The console is still available, but any I/O operation hangs.
# ps -aux | grep D
root 7 0.0 0.0 0 0 ? D 14:37 0:00 [kworker/u16:0+flush-253:0]
root 483 0.0 0.0 0 0 ? D 14:37 0:00 [jbd2/vda3-8]
test case 2:
xml machine q35,virtio/q35,scsi: the disk does not hang, but the network breaks down. ping fails even though everything else looks OK, with no crash and no kernel error.
2)
> I didn't see your original email so I missed the panic. I'd still like
> to see the earlier kernel messages before the panic in order to
> understand how the PCI device is bound.
> Is the vda device with hung I/O the same device that was accessed by
> the LTP test earlier? I guess the LTP test runs against the device and
> then the virtio driver binds to the device again afterwards?
The test is:
```
// iterate all devices
……
for (i = 0; i < 7; ++i) { // iterate current device's resources
	if (r->flags & IORESOURCE_MEM &&
	    r->flags & IORESOURCE_PREFETCH) {
		pci_release_resource(dev, i);
		ret = pci_assign_resource(dev, i);
		prk_info("assign resource to '%d', ret '%d'", i, ret);
		rc |= (ret < 0 && ret != -EBUSY) ? TFAIL : TPASS;
	}
}
```
The test does not unbind and rebind the virtio device.
The only change I noticed is the memory resource; see 'test-case 12':
-----------
[ 88.905705] ltp_tpci: test-case 12
[ 88.905706] ltp_tpci: assign resources
[ 88.905706] ltp_tpci: assign resource #0
[ 88.905707] ltp_tpci: name = 0000:00:07.0, flags = 262401, start 0xc080, end 0xc0ff
[ 88.905707] ltp_tpci: assign resource #1
[ 88.905708] ltp_tpci: name = 0000:00:07.0, flags = 262656, start 0xfebd4000, end 0xfebd4fff
[ 88.905709] ltp_tpci: assign resource #2
[ 88.905709] ltp_tpci: name = 0000:00:07.0, flags = 0, start 0x0, end 0x0
[ 88.905710] ltp_tpci: assign resource #3
[ 88.905710] ltp_tpci: name = 0000:00:07.0, flags = 0, start 0x0, end 0x0
[ 88.905711] ltp_tpci: assign resource #4
[ 88.905711] ltp_tpci: name = 0000:00:07.0, flags = 1319436, start 0xfe00c000, end 0xfe00ffff
[ 88.905713] virtio-pci 0000:00:07.0: BAR 4: releasing [mem 0xfe00c000-0xfe00ffff 64bit pref]
[ 88.905715] virtio-pci 0000:00:07.0: BAR 4: assigned [mem 0x24000c000-0x24000ffff 64bit pref]
[ 88.906693] ltp_tpci: assign resource to '4', ret '0'
[ 88.906694] ltp_tpci: assign resource #5
[ 88.906694] ltp_tpci: name = (null), flags = 0, start 0x0, end 0x0
[ 88.906695] ltp_tpci: assign resource #6
[ 88.906695] ltp_tpci: name = 0000:00:07.0, flags = 0, start 0x0, end 0x0
[ 88.906800] ltp_tpci: test-case 13
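The `flags` values in the log above are printed in decimal, which hides why only resource #4 is released and reassigned. The user-space C sketch below decodes them with the same condition the test loop uses. The `IORESOURCE_*` values are copied from the Linux kernel's `include/linux/ioport.h`; the helper name `is_prefetchable_mem` is hypothetical and exists only for this illustration.

```c
#include <stdbool.h>

/* Bit values as defined in the Linux kernel's include/linux/ioport.h. */
#define IORESOURCE_IO        0x00000100UL
#define IORESOURCE_MEM       0x00000200UL
#define IORESOURCE_PREFETCH  0x00002000UL
#define IORESOURCE_SIZEALIGN 0x00040000UL
#define IORESOURCE_MEM_64    0x00100000UL

/* Same condition as the quoted test loop: a resource is released and
 * reassigned only if it is a prefetchable memory resource. */
bool is_prefetchable_mem(unsigned long flags)
{
	return (flags & IORESOURCE_MEM) && (flags & IORESOURCE_PREFETCH);
}
```

Of the values in the log, 262401 (0x40101) is an I/O port resource and 262656 (0x40200) is non-prefetchable memory, so neither matches. Only 1319436 (0x14220c) has both `IORESOURCE_MEM` and `IORESOURCE_PREFETCH` set, together with `IORESOURCE_MEM_64`, which is consistent with the "64bit pref" annotation on BAR 4.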
* Re: LTP test related to virtio releasing and reassigning resource leads to guest hung
2023-08-11 2:26 ` longguang.yue
@ 2023-08-15 15:18 ` Stefan Hajnoczi
0 siblings, 0 replies; 7+ messages in thread
From: Stefan Hajnoczi @ 2023-08-15 15:18 UTC (permalink / raw)
To: longguang.yue; +Cc: Michael Tokarev, mst@redhat.com, qemu-devel, linux-kernel
On Thu, 10 Aug 2023 at 22:27, longguang.yue <kvmluck@163.com> wrote:
>
>
> 一)
> Can you post the guest kernel messages (dmesg)? If the guest is hanging
> then it may be easiest to configure a serial console so the kernel
> messages are sent to the host where you can see them.
>
> Does the hang occur during the LTP code you linked or afterwards when
> the PCI device is bound to a virtio driver?
>
>
> > I used conosle, the hang occurred afterwards. dmesg shows that tpci test is finished without error.
> LTP test case: https://github.com/linux-test-project/ltp/blob/522d7fba4afc84e07b252aa4cd91b241e81d6613/testcases/kernel/device-drivers/pci/tpci_kernel/ltp_tpci.c#L428
> kernel 5.10, qemu 6.2
>
> different guest-configuration tests show different results. guest did not crash if hung-task-panic=0, in my case i enable hung-task-panic in order to trace.
>
> test case 1:
> xml machine pc,virtio disk, virtio net —— guest's io hung, network broke down, though console is avilable but io operation hung.
>
> #ps -aux| grep D
> root 7 0.0 0.0 0 0 ? D 14:37 0:00 [kworker/u16:0+flush-253:0]
> root 483 0.0 0.0 0 0 ? D 14:37 0:00 [jbd2/vda3-8]
>
> test case 2:
> xml machine q35,virtio/q35,scsi ——disk did not hung but network broke down. ping errors though everything looks ok and no crash and no kernel error
>
>
>
> 二)
> I didn't see your original email so I missed the panic. I'd still like
> to see the earlier kernel messages before the panic in order to
> understand how the PCI device is bound.
>
> Is the vda device with hung I/O the same device that was accessed by
> the LTP test earlier? I guess the LTP test runs against the device and
> then the virtio driver binds to the device again afterwards?
>
> > the test is
> ```
> // iterate all devices
> ……
> for (i = 0; i < 7; ++i) { // iterate current device's resources
> if (r->flags & IORESOURCE_MEM &&
> r->flags & IORESOURCE_PREFETCH) {
> pci_release_resource(dev, i);
> ret = pci_assign_resource(dev, i);
> prk_info("assign resource to '%d', ret '%d'", i, ret);
> rc |= (ret < 0 && ret != -EBUSY) ? TFAIL : TPASS;
> }
> }
> ```
> test does not do virtio device unbind and bind.
> I only notice mem resource changed. see 'test-case 12'
>
> ———————————
> [ 88.905705] ltp_tpci: test-case 12
> [ 88.905706] ltp_tpci: assign resources
> [ 88.905706] ltp_tpci: assign resource #0
> [ 88.905707] ltp_tpci: name = 0000:00:07.0, flags = 262401, start 0xc080, end 0xc0ff
> [ 88.905707] ltp_tpci: assign resource #1
> [ 88.905708] ltp_tpci: name = 0000:00:07.0, flags = 262656, start 0xfebd4000, end 0xfebd4fff
> [ 88.905709] ltp_tpci: assign resource #2
> [ 88.905709] ltp_tpci: name = 0000:00:07.0, flags = 0, start 0x0, end 0x0
> [ 88.905710] ltp_tpci: assign resource #3
> [ 88.905710] ltp_tpci: name = 0000:00:07.0, flags = 0, start 0x0, end 0x0
> [ 88.905711] ltp_tpci: assign resource #4
> [ 88.905711] ltp_tpci: name = 0000:00:07.0, flags = 1319436, start 0xfe00c000, end 0xfe00ffff
> [ 88.905713] virtio-pci 0000:00:07.0: BAR 4: releasing [mem 0xfe00c000-0xfe00ffff 64bit pref]
> [ 88.905715] virtio-pci 0000:00:07.0: BAR 4: assigned [mem 0x24000c000-0x24000ffff 64bit pref]
> [ 88.906693] ltp_tpci: assign resource to '4', ret '0'
> [ 88.906694] ltp_tpci: assign resource #5
> [ 88.906694] ltp_tpci: name = (null), flags = 0, start 0x0, end 0x0
> [ 88.906695] ltp_tpci: assign resource #6
> [ 88.906695] ltp_tpci: name = 0000:00:07.0, flags = 0, start 0x0, end 0x0
>
> [ 88.906800] ltp_tpci: test-case 13
I don't know. Maybe the test case is leaving the device in a state
that conflicts with the virtio drivers that are bound after testing
finishes.
One approach is to trace the PCI BAR accesses after the test runs and
compare against a trace when the tpci driver hasn't been loaded. That
way you might be able to find out what is different.
Stefan
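A lighter-weight first step than full access tracing, and a different technique from the one suggested above, is to snapshot the guest's view of the BAR ranges before and after loading the test module and compare them. The C sketch below assumes the snapshots come from the device's sysfs `resource` file (for example `/sys/bus/pci/devices/0000:00:07.0/resource`), where each line holds start, end, and flags as hexadecimal numbers. The struct and function names are hypothetical, and the sample lines mentioned afterwards are constructed from the log values in this thread rather than captured from a real guest.

```c
#include <stdio.h>

/* One line of a sysfs "resource" file: start, end, flags. */
struct bar_range {
	unsigned long long start;
	unsigned long long end;
	unsigned long long flags;
};

/* Parse one "start end flags" line of hexadecimal numbers.
 * Returns 0 on success, -1 if the line does not contain three numbers. */
int parse_resource_line(const char *line, struct bar_range *out)
{
	if (sscanf(line, "%llx %llx %llx",
		   &out->start, &out->end, &out->flags) != 3)
		return -1;
	return 0;
}

/* Nonzero if the address range differs between two snapshots. */
int bar_moved(const struct bar_range *before, const struct bar_range *after)
{
	return before->start != after->start || before->end != after->end;
}
```

Parsing `0x00000000fe00c000 0x00000000fe00ffff 0x000000000014220c` as the "before" line and `0x000000024000c000 0x000000024000ffff 0x000000000014220c` as the "after" line would make `bar_moved` report a change. This only confirms what the kernel believes the assignment to be; it does not show whether the device itself was reprogrammed consistently, which is what tracing the BAR accesses would reveal.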
end of thread, other threads:[~2023-08-15 15:20 UTC | newest]
Thread overview: 7+ messages
2023-08-10 8:57 LTP test related to virtio releasing and reassigning resource leads to guest hung longguang.yue
2023-08-10 9:08 ` Michael Tokarev
2023-08-10 10:35 ` longguang.yue
2023-08-10 14:13 ` Stefan Hajnoczi
2023-08-10 15:24 ` Stefan Hajnoczi
2023-08-11 2:26 ` longguang.yue
2023-08-15 15:18 ` Stefan Hajnoczi