qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Lukáš Doktor" <ldoktor@redhat.com>
To: QEMU Developers <qemu-devel@nongnu.org>,
	qemu-block@nongnu.org,
	Anton Nefedov <anton.nefedov@virtuozzo.com>,
	Max Reitz <mreitz@redhat.com>, Andrew Jones <drjones@redhat.com>
Subject: [Qemu-devel] Broken aarch64 by qcow2: skip writing zero buffers to empty COW areas
Date: Wed, 21 Aug 2019 13:08:34 +0200	[thread overview]
Message-ID: <3b46e268-f8a3-9409-d40a-978390df62c2@redhat.com> (raw)


[-- Attachment #1.1.1: Type: text/plain, Size: 12730 bytes --]

Hello guys,

I bisected a regression on aarch64 all the way to this commit: "qcow2: skip writing zero buffers to empty COW areas" c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a. Would you please have a look at it?

My reproducer is running kickstart installation of RHEL-8 from DVD on aarch64 gicv3 machine, which never finishes since this commit, where anaconda complains about package installation, occasionally there are also XFS metadata corruption messages on serial console:


2019-08-18 19:36:11: ^[[22;1HInstalling check.aarch64 (359/1180)^[[1;24r^[[23;1H[  439.335155] XFS (dm-3): Metadata corruption detected at xfs_inode_buf_verify+0x174/0x190 [xfs], xfs_inode block 0xb1d960 xfs_inode_buf_verify
2019-08-18 19:36:11: [  439.345295] XFS (dm-3): Unmount and run xfs_repair
2019-08-18 19:36:11: [  439.349167] XFS (dm-3): First 128 bytes of corrupted metadata buffer:
2019-08-18 19:36:11: ^[[1;23r^[[23;80H
2019-08-18 19:36:11: ^[[22;1HInstalling libidn.aarch64 (360/1180)^[[1;24r^[[23;1H[  439.354063] 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
2019-08-18 19:36:11: [  439.365513] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
2019-08-18 19:36:11: [  439.371929] 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
2019-08-18 19:36:11: [  439.378175] 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
2019-08-18 19:36:11: [  439.384242] 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
2019-08-18 19:36:11: [  439.390603] 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
2019-08-18 19:36:11: [  439.396784] 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
2019-08-18 19:36:11: [  439.402838] 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
2019-08-18 19:36:11: [  439.409314] XFS (dm-3): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0xb1d960 len 32 error 117
2019-08-18 19:36:11: [  439.416866] XFS (dm-3): xfs_imap_to_bp: xfs_trans_read_buf() returned error -117.
2019-08-18 19:36:12: [  440.341560] systemd-journald[1550]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-18 19:36:12: ================================================================================^[[23;80H^[[1;23r^[[23;80H
2019-08-18 19:36:12: 
2019-08-18 19:36:12: ^[[22;1H================================================================================^[[23;1HError^[[1;24r^[[23;6H[  440.371152] systemd-journald[1550]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-18 19:36:12: [  440.381579] systemd-journald[1550]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-18 19:36:12: ^[[1;23r^[[23;80H
2019-08-18 19:36:12: 
2019-08-18 19:36:12: 
2019-08-18 19:36:12: 
2019-08-18 19:36:12: 
2019-08-18 19:36:12: 
2019-08-18 19:36:12: 
2019-08-18 19:36:12: ^[[18;1H   The following error occurred while installing.  This is a fatal error and
2019-08-18 19:36:12:    installation will be aborted.
2019-08-18 19:36:12: 
2019-08-18 19:36:12:    DNF error: Error unpacking rpm package libidn-1.34-5.el8.aarch64^[[23;1HPress ENTER to exit: ^[[1;24r^[[23;22H[  440.401675] systemd-journald[1550]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B

It reproduces 100%, usually at different stage (different RPM package). With Avocado-vt you can reproduce it by putting RHEL-8 DVD image to RHEL.7.devel iso location and executing something like this:

avocado --show all run --vt-guest-os RHEL.7.devel --vt-extra-params store_vm_register=no take_regular_screendumps=no monitor_type=human --vt-qemu-bin /home/jenkins/aarch64/qemu-master-build/bin/qemu-system-aarch64 -- unattended_install.cdrom.extra_cdrom_ks.default_install.aio_threads

Note: I'm still using RHEL.7.devel profile although the provided DVD is RHEL8, it doesn't really matter
Note: The "--vt-extra-params" are workarounds of current issue where frequent use of QMP monitor results in bad things on aarch64

But it should reproduce even without Avocado-vt. The used qemu cmd was:

MALLOC_PERTURB_=1  /home/jenkins/aarch64/qemu-master-build/bin/qemu-system-aarch64 \
    -S  \
    -name 'avocado-vt-vm1'  \
    -sandbox off  \
    -drive file=/usr/share/AAVMF/AAVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on \
    -drive file=/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-aarch64_AAVMF_VARS.fd,if=pflash,format=raw,unit=1 \
    -machine virt,gic-version=host  \
    -nodefaults \
    -device virtio-gpu-pci,bus=pcie.0,addr=0x1  \
    -chardev socket,id=hmp_id_qmpmonitor1,path=/var/tmp/avocado_kurmsdn9/monitor-qmpmonitor1-20190821-065917-I0T8uXz4,server,nowait \
    -mon chardev=hmp_id_qmpmonitor1,mode=readline  \
    -chardev socket,id=hmp_id_catch_monitor,path=/var/tmp/avocado_kurmsdn9/monitor-catch_monitor-20190821-065917-I0T8uXz4,server,nowait \
    -mon chardev=hmp_id_catch_monitor,mode=readline  \
    -serial unix:'/var/tmp/avocado_kurmsdn9/serial-serial0-20190821-065917-I0T8uXz4',server,nowait \
    -device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
    -device qemu-xhci,id=usb1,bus=pcie.0-root-port-2,addr=0x0 \
    -device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie.0-root-port-3,addr=0x0 \
    -drive id=drive_image1,if=none,format=qcow2,file=/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7devel-aarch64.qcow2 \
    -device scsi-hd,id=image1,drive=drive_image1 \
    -device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
    -device virtio-net-pci,mac=9a:96:87:89:70:c0,rombar=0,id=idfUqVrx,netdev=idNvVkD7,bus=pcie.0-root-port-4,addr=0x0  \
    -netdev tap,id=idNvVkD7,fd=20 \
    -m 4096  \
    -smp 2,maxcpus=2,cores=1,threads=1,sockets=2  \
    -cpu 'host' \
    -drive id=drive_cd1,if=none,media=cdrom,file=/home/isos/RHEL-8.1.0-20190820.n.0-aarch64-dvd1.iso \
    -device scsi-cd,id=cd1,drive=drive_cd1 \
    -drive id=drive_unattended,if=none,media=cdrom,file=/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7-64/ks.iso \
    -device scsi-cd,id=unattended,drive=drive_unattended \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1  \
    -kernel '/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7-aarch64/vmlinuz'  \
    -append 'ks=cdrom inst.sshd nicdelay=60  earlyprintk=pl011,0x9000000 console=ttyAMA0 debug ignore_loglevel rootwait'  \
    -initrd '/var/lib/libvirt/images/avocado/avocado-vt/images/rhel7-aarch64/initrd.img'  \
    -vnc :0  \
    -rtc base=utc,clock=host \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
    -device pcie-root-port,id=pcie_extra_root_port_1,slot=6,chassis=6,addr=0x6,bus=pcie.0

But you should find all the required information in the attached file. It's worth mentioning that I haven't seen this issue on other architectures (s390x, ppc64, ppc64le) and reproducibility is 100%.

Regards,
Lukáš

PS: During bisecting the metadata corruption was usually not there, so attaching excerpt of serial console obtained with the commit c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a:

2019-08-21 07:04:57: ^[[22;1HInstalling kernel-devel.aarch64 (311/1190)^[[1;24r^[[23;1H
2019-08-21 07:05:23: [  346.780872] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:23: [  346.788877] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:23: [  346.796951] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:23: [  346.805247] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [  372.250148] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [  372.258148] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [  372.270263] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [  372.287159] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [  372.295778] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [  372.304047] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: ================================================================================^[[23;80H[  372.318162] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [  372.332918] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: ^[[1;23r^[[23;80H
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49:
2019-08-21 07:05:49: ^[[15;1H================================================================================^[[16;1HError^[[18;1H   The following error occurred while installing.  This is a fatal error and
2019-08-21 07:05:49:    installation will be aborted.
2019-08-21 07:05:49:
2019-08-21 07:05:49:    DNF error: Error in POSTIN scriptlet in rpm package kernel-devel^[[1;24r^[[23;1H[  372.341178] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: [  372.371747] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:49: Press ENTER to exit:
2019-08-21 07:05:53: [  377.143065] systemd-coredump[2890]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:53: [  377.150559] systemd-coredump[2890]: Failed to get EXE, ignoring: No such process
2019-08-21 07:05:53: [  377.200294] systemd-coredump[2892]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:53: [  377.207611] systemd-coredump[2892]: Failed to get EXE, ignoring: No such process
2019-08-21 07:05:54: [  377.259821] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [  377.267769] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [  377.278325] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [  377.286222] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [  377.297440] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [  377.307166] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [  377.318449] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [  377.326360] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [  377.337852] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:54: [  377.346173] systemd-journald[1558]: Journal effective settings seal=yes compress=yes compress_threshold_bytes=512B
2019-08-21 07:05:56: [  379.428425] systemd-coredump[2909]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:56: [  379.435753] systemd-coredump[2909]: Failed to get EXE, ignoring: No such process
2019-08-21 07:05:57: [  380.880584] systemd-coredump[2919]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:57: [  380.887926] systemd-coredump[2919]: Failed to get EXE, ignoring: No such process
2019-08-21 07:05:58: [  381.526076] systemd-coredump[2922]: Failed to get COMM, falling back to the command line: No such process
2019-08-21 07:05:58: [  381.533709] systemd-coredump[2922]: Failed to get EXE, ignoring: No such process

[-- Attachment #1.1.2: 094-functional_git_RHEL-8_dvd io-github-autotest-qemu.unattended_install.cdrom.extra_cdrom_ks.default_install.aio_threads_-e630.zip --]
[-- Type: application/zip, Size: 89995 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

                 reply	other threads:[~2019-08-21 11:37 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3b46e268-f8a3-9409-d40a-978390df62c2@redhat.com \
    --to=ldoktor@redhat.com \
    --cc=anton.nefedov@virtuozzo.com \
    --cc=drjones@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).