* [Qemu-devel] Broken aarch64 by qcow2: skip writing zero buffers to empty COW areas
@ 2019-08-21 11:08 Lukáš Doktor
0 siblings, 0 replies; only message in thread
From: Lukáš Doktor @ 2019-08-21 11:08 UTC (permalink / raw)
To: QEMU Developers, qemu-block, Anton Nefedov, Max Reitz,
Andrew Jones
[-- Attachment #1.1.1: Type: text/plain, Size: 12730 bytes --]
Hello guys,
I bisected a regression on aarch64 all the way to this commit: "qcow2: skip writing zero buffers to empty COW areas" c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a. Would you please have a look at it?
My reproducer is running kickstart installation of RHEL-8 from DVD on aarch64 gicv3 machine, which never finishes since this commit, where anaconda complains about package installation, occasionally there are also XFS metadata corruption messages on serial console:
2019-08-18 19:36:11: ^[[22;1HInstalling check.aarch64 (359/1180)^[[1;24r^[[23;1H[ 439.335155] XFS (dm-3): Metadata corruption detected at xfs_inode_buf_verify+0x174/0x190 [xfs], xfs_inode block 0xb1d960 xfs_inode_buf_verify
2019-08-18 19:36:11: [ 439.345295] XFS (dm-3): Unmount and run xfs_repair
2019-08-18 19:36:11: [ 439.349167] XFS (dm-3): First 128 bytes of corrupted metadata buffer:
2019-08-18 19:36:11: ^[[1;23r^[[23;80H
2019-08-18 19:36:11: ^[[22;1HInstalling libidn.aarch64 (360/1180)^[[1;24r^[[23;1H[ 439.354063] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.365513] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.371929] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.378175] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.384242] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.390603] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.396784] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.402838] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2019-08-18 19:36:11: [ 439.409314] XFS (dm-3): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0xb1d960 len 32 error 117
2019-08-18 19:36:11: [ 439.416866] XFS (dm-3): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
2019-08-18 19:36:12: [ 440.341560] systemd-journald[1550]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-18 19:36:12: ================================================================================^[[23;80H^[[1;23r^[[23;80H
2019-08-18 19:36:12:
2019-08-18 19:36:12: ^[[22;1H================================================================================^[[23;1HError^[[1;24r^[[23;6H[ 440.371152] systemd-journald[1550]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-18 19:36:12: [ 440.381579] systemd-journald[1550]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-18 19:36:12: ^[[1;23r^[[23;80H
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12:
2019-08-18 19:36:12: ^[[18;1H The following error occurred while installing. This is a fatal error and
2019-08-18 19:36:12: installation will be aborted.
2019-08-18 19:36:12:
2019-08-18 19:36:12: DNF error: Error unpacking rpm package libidn-1.34-5.el8.aarch64^[[23;1HPress ENTER to exit: ^[[1;24r^[[23;22H[ 440.401675] systemd-journald[1550]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
It reproduces 100%, usually at different stage (different RPM package). With Avocado-vt you can reproduce it by putting RHEL-8 DVD image to RHEL.7.devel iso location and executing something like this:
avocado --show all run --vt-guest-os RHEL.7.devel --vt-extra-params store_vm_register=no take_regular_screendumps=no monitor_type=human --vt-qemu-bin /home/jenkins/aarch64/qemu-master-build/bin/qemu-system-aarch64 -- unattended_install.cdrom.extra_cdrom_ks.default_install.aio_threads
Note: I'm still using RHEL.7.devel profile although the provided DVD is RHEL8, it doesn't really matter
Note: The "--vt-extra-params" are workarounds of current issue where frequent use of QMP monitor results in bad things on aarch64
But it should reproduce even without Avocado-vt. The used qemu cmd was:
MALLOC_PERTURB_=1 /home/jenkins/aarch64/qemu-master-build/bin/qemu-system-aarch64 \
-S \
-name 'avocado-vt-vm1' \
-sandbox off \
-drive file=/usr/share/AAVMF/AAVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on \
-drive file=/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-aarch64_AAVMF_VARS.fd,if=pflash,format=raw,unit=1 \
-machine virt,gic-version=host \
-nodefaults \
-device virtio-gpu-pci,bus=pcie.0,addr=0x1 \
-chardev socket,id=hmp_id_qmpmonitor1,path=/var/tmp/avocado_kurmsdn9/monitor-qmpmonitor1-20190821-065917-I0T8uXz4,server,nowait \
-mon chardev=hmp_id_qmpmonitor1,mode=readline \
-chardev socket,id=hmp_id_catch_monitor,path=/var/tmp/avocado_kurmsdn9/monitor-catch_monitor-20190821-065917-I0T8uXz4,server,nowait \
-mon chardev=hmp_id_catch_monitor,mode=readline \
-serial unix:'/var/tmp/avocado_kurmsdn9/serial-serial0-20190821-065917-I0T8uXz4',server,nowait \
-device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
-device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0-root-port-3,addr=0x0 \
-drive id=drive_image1,if=none,format=qcow2,file=/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-aarch64.qcow2 \
-device scsi-hd,id=image1,drive=drive_image1 \
-device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-device virtio-net-pci,mac=9a:96:87:89:70:c0,rombar=0,id=idfUqVrx,netdev=idNvVkD7,bus=pcie.0-root-port-4,addr=0x0 \
-netdev tap,id=idNvVkD7,fd=20 \
-m 4096 \
-smp 2,maxcpus=2,cores=1,threads=1,sockets=2 \
-cpu 'host' \
-drive id=drive_cd1,if=none,media=cdrom,file=/home/isos/RHEL-8.1.0-20190820.n.0-aarch64-dvd1.iso \
-device scsi-cd,id=cd1,drive=drive_cd1 \
-drive id=drive_unattended,if=none,media=cdrom,file=/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7-64/ks.iso \
-device scsi-cd,id=unattended,drive=drive_unattended \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-kernel '/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7-aarch64/vmlinuz' \
-append 'ks=cdrom inst.sshd nicdelay=60 earlyprintk=pl011,0x9000000 console=ttyAMA0 debug ignore_loglevel rootwait' \
-initrd '/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7-aarch64/initrd.img' \
-vnc :0 \
-rtc base=utc,clock=host \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
-device pcie-root-port,id=pcie_extra_root_port_1,slot=6,chassis=6,addr=0x6,bus=pcie.0
But you should find all the required information in the attached file. It's worth mentioning that I haven't seen this issue on other architectures (s390x, ppc64, ppc64le) and reproducibility is 100%.
Regards,
Lukáš
PS: During bisecting the metadata corruption was usually not there, so attaching excerpt of serial console obtained with the commit c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a:
2019-08-21 07:04:57: ^[[22;1HInstalling kernel-devel.aarch64 (311/1190)^[[1;24r^[[23;1H
2019-08-21 07:05:23: [ 346.780872] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:23: [ 346.788877] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:23: [ 346.796951] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:23: [ 346.805247] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.250148] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.258148] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.270263] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.287159] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.295778] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.304047] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: ================================================================================^[[23;80H[ 372.318162] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.332918] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: ^[[1;23r^[[23;80H
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49: ^[[15;1H================================================================================^[[16;1HError^[[18;1H The following error occurred while installing. This is a fatal error and
2019-08-21 07:05:49: installation will be aborted.
2019-08-21 07:05:49:
2019-08-21 07:05:49: DNF error: Error in POSTIN scriptlet in rpm package kernel-devel^[[1;24r^[[23;1H[ 372.341178] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [ 372.371747] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: Press ENTER to exit:
2019-08-21 07:05:53: [ 377.143065] systemd-coredump[2890]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:53: [ 377.150559] systemd-coredump[2890]: Failed to get EXE, ignoring: No such process
2019-08-21 07:05:53: [ 377.200294] systemd-coredump[2892]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:53: [ 377.207611] systemd-coredump[2892]: Failed to get EXE, ignoring: No such process
2019-08-21 07:05:54: [ 377.259821] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.267769] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.278325] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.286222] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.297440] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.307166] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.318449] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.326360] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.337852] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [ 377.346173] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:56: [ 379.428425] systemd-coredump[2909]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:56: [ 379.435753] systemd-coredump[2909]: Failed to get EXE, ignoring: No such process
2019-08-21 07:05:57: [ 380.880584] systemd-coredump[2919]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:57: [ 380.887926] systemd-coredump[2919]: Failed to get EXE, ignoring: No such process
2019-08-21 07:05:58: [ 381.526076] systemd-coredump[2922]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:58: [ 381.533709] systemd-coredump[2922]: Failed to get EXE, ignoring: No such process
[-- Attachment #1.1.2: 094-functional_git_RHEL-8_dvd io-github-autotest-qemu.unattended_install.cdrom.extra_cdrom_ks.default_install.aio_threads_-e630.zip --]
[-- Type: application/zip, Size: 89995 bytes --]
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2019-08-21 11:37 UTC | newest]
Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-21 11:08 [Qemu-devel] Broken aarch64 by qcow2: skip writing zero buffers to empty COW areas Lukáš Doktor
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).