* [PATCH] hw/scsi/vmw_pvscsi: Remove assertion for kick after reset
@ 2020-03-15 13:24 Liran Alon
2020-03-15 13:48 ` no-reply
0 siblings, 1 reply; 5+ messages in thread
From: Liran Alon @ 2020-03-15 13:24 UTC (permalink / raw)
To: qemu-devel; +Cc: fam, pbonzini, dmitry.fleytman
From: Elazar Leibovich <elazar.leibovich@oracle.com>
When running an Ubuntu guest with the 3.13.0-65-generic kernel, QEMU
sometimes crashes during guest ACPI reset. The crash is the
assert(s->rings_info_valid) in pvscsi_process_io().
Analyzing the crash revealed that it happens when userspace issues
a sync while a reboot syscall is in progress.
Below are backtraces we gathered from the guests.
Guest backtrace when issuing PVSCSI_CMD_ADAPTER_RESET:
pci_device_shutdown
device_shutdown
init_pid_ns
init_pid_ns
kernel_power_off
SYSC_reboot
Guest backtrace when issuing PVSCSI_REG_OFFSET_KICK_RW_IO:
scsi_done
scsi_dispatch_cmd
blk_add_timer
scsi_request_fn
elv_rb_add
__blk_run_queue
queue_unplugged
blk_flush_plug_list
blk_finish_plug
ext4_writepages
set_next_entity
do_writepages
__filemap_fdatawrite_range
filemap_write_and_wait_range
ext4_sync_file
ext4_sync_file
do_fsync
sys_fsync
Since QEMU pvscsi should imitate VMware pvscsi device emulation,
we decided to imitate VMware's behavior in this case.
To check VMware's behavior, we wrote a kernel module that issues
a reset to the pvscsi device and then issues a kick. We ran it on
VMware ESXi 6.5, which appears to simply ignore the kick.
Hence, we decided to ignore the kick as well.
Signed-off-by: Elazar Leibovich <elazar.leibovich@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
---
hw/scsi/vmw_pvscsi.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index c91352cf46de..b2bb80449bba 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -719,7 +719,12 @@ pvscsi_process_io(PVSCSIState *s)
     PVSCSIRingReqDesc descr;
     hwaddr next_descr_pa;
 
-    assert(s->rings_info_valid);
+    if (!s->rings_info_valid) {
+        qemu_log("WARNING: PVSCSI: Cannot process I/O when "
+                 "rings are not valid.\n");
+        return;
+    }
+
     while ((next_descr_pa = pvscsi_ring_pop_req_descr(&s->rings)) != 0) {
 
         /* Only read after production index verification */
--
2.20.1
* [PATCH] hw/scsi/vmw_pvscsi: Remove assertion for kick after reset
@ 2020-03-15 13:26 Liran Alon
2020-03-15 14:27 ` no-reply
2020-03-18 13:32 ` Paolo Bonzini
0 siblings, 2 replies; 5+ messages in thread
From: Liran Alon @ 2020-03-15 13:26 UTC (permalink / raw)
To: qemu-devel; +Cc: fam, pbonzini, dmitry.fleytman, liran.alon, elazar
From: Elazar Leibovich <elazar.leibovich@oracle.com>
When running an Ubuntu guest with the 3.13.0-65-generic kernel, QEMU
sometimes crashes during guest ACPI reset. The crash is the
assert(s->rings_info_valid) in pvscsi_process_io().
Analyzing the crash revealed that it happens when userspace issues
a sync while a reboot syscall is in progress.
Below are backtraces we gathered from the guests.
Guest backtrace when issuing PVSCSI_CMD_ADAPTER_RESET:
pci_device_shutdown
device_shutdown
init_pid_ns
init_pid_ns
kernel_power_off
SYSC_reboot
Guest backtrace when issuing PVSCSI_REG_OFFSET_KICK_RW_IO:
scsi_done
scsi_dispatch_cmd
blk_add_timer
scsi_request_fn
elv_rb_add
__blk_run_queue
queue_unplugged
blk_flush_plug_list
blk_finish_plug
ext4_writepages
set_next_entity
do_writepages
__filemap_fdatawrite_range
filemap_write_and_wait_range
ext4_sync_file
ext4_sync_file
do_fsync
sys_fsync
Since QEMU pvscsi should imitate VMware pvscsi device emulation,
we decided to imitate VMware's behavior in this case.
To check VMware's behavior, we wrote a kernel module that issues
a reset to the pvscsi device and then issues a kick. We ran it on
VMware ESXi 6.5, which appears to simply ignore the kick.
Hence, we decided to ignore the kick as well.
Signed-off-by: Elazar Leibovich <elazar.leibovich@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
---
hw/scsi/vmw_pvscsi.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index c91352cf46de..b2bb80449bba 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -719,7 +719,12 @@ pvscsi_process_io(PVSCSIState *s)
     PVSCSIRingReqDesc descr;
     hwaddr next_descr_pa;
 
-    assert(s->rings_info_valid);
+    if (!s->rings_info_valid) {
+        qemu_log("WARNING: PVSCSI: Cannot process I/O when "
+                 "rings are not valid.\n");
+        return;
+    }
+
     while ((next_descr_pa = pvscsi_ring_pop_req_descr(&s->rings)) != 0) {
 
         /* Only read after production index verification */
--
2.20.1
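
For readers following the thread, the reset-then-kick sequence described in
the commit message can be replayed against a small toy model. This is only a
sketch under stated assumptions: the ToyPvscsi type, the toy_* functions, and
the TOY_* register and command values are hypothetical placeholders, not the
real PVSCSI definitions or QEMU code. The only element taken from the patch is
the rings_info_valid flag and the early return when it is false. The toy
counts ignored kicks instead of logging them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder register offsets and command codes (NOT the real PVSCSI values). */
enum { TOY_REG_COMMAND = 0x0, TOY_REG_KICK_RW_IO = 0x4 };
enum { TOY_CMD_ADAPTER_RESET = 1, TOY_CMD_SETUP_RINGS = 2 };

typedef struct ToyPvscsi {
    bool rings_info_valid;   /* mirrors the flag tested in pvscsi_process_io() */
    unsigned pending;        /* requests the guest has placed on the ring */
    unsigned processed;      /* requests the toy device has consumed */
    unsigned ignored_kicks;  /* kicks dropped because the rings were invalid */
} ToyPvscsi;

/* Toy counterpart of pvscsi_process_io(): return early instead of asserting. */
static void toy_process_io(ToyPvscsi *s)
{
    if (!s->rings_info_valid) {
        s->ignored_kicks++;
        return;
    }
    s->processed += s->pending;
    s->pending = 0;
}

/* Toy MMIO write handler: dispatch on the (placeholder) register offset. */
static void toy_mmio_write(ToyPvscsi *s, uint32_t reg, uint32_t val)
{
    switch (reg) {
    case TOY_REG_COMMAND:
        if (val == TOY_CMD_SETUP_RINGS) {
            s->rings_info_valid = true;
        } else if (val == TOY_CMD_ADAPTER_RESET) {
            s->rings_info_valid = false;
            s->pending = 0;
        }
        break;
    case TOY_REG_KICK_RW_IO:
        toy_process_io(s);
        break;
    default:
        break;
    }
}

/* Replays the guest sequence: normal I/O, adapter reset, then a late kick. */
static ToyPvscsi toy_reset_then_kick(void)
{
    ToyPvscsi s = {0};

    toy_mmio_write(&s, TOY_REG_COMMAND, TOY_CMD_SETUP_RINGS);
    s.pending = 2;
    toy_mmio_write(&s, TOY_REG_KICK_RW_IO, 0);   /* processed normally */
    toy_mmio_write(&s, TOY_REG_COMMAND, TOY_CMD_ADAPTER_RESET);
    toy_mmio_write(&s, TOY_REG_KICK_RW_IO, 0);   /* arrives after reset */
    return s;
}
```

Calling toy_reset_then_kick() from a test harness leaves the toy device with
two processed requests and one ignored kick. With an assert() in place of the
early return, the second kick would abort the process instead, which is the
crash the patch addresses.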
* Re: [PATCH] hw/scsi/vmw_pvscsi: Remove assertion for kick after reset
2020-03-15 13:24 Liran Alon
@ 2020-03-15 13:48 ` no-reply
0 siblings, 0 replies; 5+ messages in thread
From: no-reply @ 2020-03-15 13:48 UTC (permalink / raw)
To: liran.alon; +Cc: fam, pbonzini, dmitry.fleytman, qemu-devel
Patchew URL: https://patchew.org/QEMU/20200315132447.113131-1-liran.alon@oracle.com/
Hi,
This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.
=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===
AS pc-bios/optionrom/pvh.o
CC pc-bios/optionrom/pvh_main.o
/tmp/qemu-test/src/hw/scsi/vmw_pvscsi.c: In function 'pvscsi_process_io':
/tmp/qemu-test/src/hw/scsi/vmw_pvscsi.c:723:9: error: implicit declaration of function 'qemu_log'; did you mean 'qemu_fork'? [-Werror=implicit-function-declaration]
qemu_log("WARNING: PVSCSI: Cannot process I/O when "
^~~~~~~~
qemu_fork
/tmp/qemu-test/src/hw/scsi/vmw_pvscsi.c:723:9: error: nested extern declaration of 'qemu_log' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [/tmp/qemu-test/src/rules.mak:69: hw/scsi/vmw_pvscsi.o] Error 1
make: *** Waiting for unfinished jobs....
BUILD pc-bios/optionrom/multiboot.img
BUILD pc-bios/optionrom/linuxboot.img
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=00c554e6813f41a7bd68800814129c28', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-huuof_c9/src/docker-src.2020-03-15-09.46.08.12907:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=00c554e6813f41a7bd68800814129c28
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-huuof_c9/src'
make: *** [docker-run-test-mingw@fedora] Error 2
real 1m53.509s
user 0m7.720s
The full log is available at
http://patchew.org/logs/20200315132447.113131-1-liran.alon@oracle.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
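
The error above says that qemu_log() is called in vmw_pvscsi.c without a
visible declaration. In the QEMU tree the declaration comes from "qemu/log.h",
so including that header in the file is the likely fix (an assumption drawn
from the compiler message, not something stated in this thread). The sketch
below shows the same rule in isolation: toy_qemu_log() and
toy_warn_invalid_rings() are hypothetical stand-ins, and the prototype plays
the role that the header would play in QEMU.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Stand-in for the prototype a logging header would provide. Without a
 * declaration before the first call, -Werror=implicit-function-declaration
 * rejects the call, as in the build log above.
 */
static int toy_qemu_log(const char *fmt, ...);

/* Emits the warning text from the patch through the stand-in logger. */
static int toy_warn_invalid_rings(void)
{
    return toy_qemu_log("WARNING: PVSCSI: Cannot process I/O when "
                        "rings are not valid.\n");
}

/* Hypothetical logger: formats to stderr and returns the character count. */
static int toy_qemu_log(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}
```

Removing the prototype at the top while keeping the call reproduces the class
of error reported by the build test when compiled with the same warning flags.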
* Re: [PATCH] hw/scsi/vmw_pvscsi: Remove assertion for kick after reset
2020-03-15 13:26 [PATCH] hw/scsi/vmw_pvscsi: Remove assertion for kick after reset Liran Alon
@ 2020-03-15 14:27 ` no-reply
2020-03-18 13:32 ` Paolo Bonzini
1 sibling, 0 replies; 5+ messages in thread
From: no-reply @ 2020-03-15 14:27 UTC (permalink / raw)
To: liran.alon; +Cc: fam, dmitry.fleytman, elazar, qemu-devel, liran.alon, pbonzini
Patchew URL: https://patchew.org/QEMU/20200315132634.113632-1-liran.alon@oracle.com/
Hi,
This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.
=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===
CC replay/replay.o
CC replay/replay-internal.o
/tmp/qemu-test/src/hw/scsi/vmw_pvscsi.c: In function 'pvscsi_process_io':
/tmp/qemu-test/src/hw/scsi/vmw_pvscsi.c:723:9: error: implicit declaration of function 'qemu_log'; did you mean 'qemu_fork'? [-Werror=implicit-function-declaration]
qemu_log("WARNING: PVSCSI: Cannot process I/O when "
^~~~~~~~
qemu_fork
/tmp/qemu-test/src/hw/scsi/vmw_pvscsi.c:723:9: error: nested extern declaration of 'qemu_log' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [/tmp/qemu-test/src/rules.mak:69: hw/scsi/vmw_pvscsi.o] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
File "./tests/docker/docker.py", line 664, in <module>
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=b60fe0dffe2b447eb06a7f9f46d9f5aa', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-sakg1pkp/src/docker-src.2020-03-15-10.25.38.6677:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=b60fe0dffe2b447eb06a7f9f46d9f5aa
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-sakg1pkp/src'
make: *** [docker-run-test-mingw@fedora] Error 2
real 2m8.180s
user 0m8.937s
The full log is available at
http://patchew.org/logs/20200315132634.113632-1-liran.alon@oracle.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
* Re: [PATCH] hw/scsi/vmw_pvscsi: Remove assertion for kick after reset
2020-03-15 13:26 [PATCH] hw/scsi/vmw_pvscsi: Remove assertion for kick after reset Liran Alon
2020-03-15 14:27 ` no-reply
@ 2020-03-18 13:32 ` Paolo Bonzini
1 sibling, 0 replies; 5+ messages in thread
From: Paolo Bonzini @ 2020-03-18 13:32 UTC (permalink / raw)
To: Liran Alon, qemu-devel; +Cc: fam, elazar, dmitry.fleytman
On 15/03/20 14:26, Liran Alon wrote:
> From: Elazar Leibovich <elazar.leibovich@oracle.com>
>
> When running Ubuntu 3.13.0-65-generic guest, QEMU sometimes crashes
> during guest ACPI reset. It crashes on assert(s->rings_info_valid)
> in pvscsi_process_io().
>
> Analyzing the crash revealed that it happens when userspace issues
> a sync during a reboot syscall.
>
> Below are backtraces we gathered from the guests.
>
> Guest backtrace when issuing PVSCSI_CMD_ADAPTER_RESET:
> pci_device_shutdown
> device_shutdown
> init_pid_ns
> init_pid_ns
> kernel_power_off
> SYSC_reboot
>
> Guest backtrace when issuing PVSCSI_REG_OFFSET_KICK_RW_IO:
> scsi_done
> scsi_dispatch_cmd
> blk_add_timer
> scsi_request_fn
> elv_rb_add
> __blk_run_queue
> queue_unplugged
> blk_flush_plug_list
> blk_finish_plug
> ext4_writepages
> set_next_entity
> do_writepages
> __filemap_fdatawrite_range
> filemap_write_and_wait_range
> ext4_sync_file
> ext4_sync_file
> do_fsync
> sys_fsync
>
> Since QEMU pvscsi should imitate VMware pvscsi device emulation,
> we decided to imitate VMware's behavior in this case.
>
> To check VMware behavior, we wrote a kernel module that issues
> a reset to the pvscsi device and then issues a kick. We ran it on
> VMware ESXi 6.5 and it seems that it simply ignores the kick.
> Hence, we decided to ignore the kick as well.
>
> Signed-off-by: Elazar Leibovich <elazar.leibovich@oracle.com>
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> ---
> hw/scsi/vmw_pvscsi.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
> index c91352cf46de..b2bb80449bba 100644
> --- a/hw/scsi/vmw_pvscsi.c
> +++ b/hw/scsi/vmw_pvscsi.c
> @@ -719,7 +719,12 @@ pvscsi_process_io(PVSCSIState *s)
>      PVSCSIRingReqDesc descr;
>      hwaddr next_descr_pa;
>  
> -    assert(s->rings_info_valid);
> +    if (!s->rings_info_valid) {
> +        qemu_log("WARNING: PVSCSI: Cannot process I/O when "
> +                 "rings are not valid.\n");
> +        return;
> +    }
> +
>      while ((next_descr_pa = pvscsi_ring_pop_req_descr(&s->rings)) != 0) {
>  
>          /* Only read after production index verification */
>
Queued, with the qemu_log call removed as well.
Paolo