From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.3 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34F1EC3A5A1 for ; Thu, 22 Aug 2019 16:35:09 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0A77C23400 for ; Thu, 22 Aug 2019 16:35:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0A77C23400 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45666 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i0q3I-0008FE-2W for qemu-devel@archiver.kernel.org; Thu, 22 Aug 2019 12:35:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44633) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1i0q1S-0006JM-Tz for qemu-devel@nongnu.org; Thu, 22 Aug 2019 12:33:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1i0q1Q-0005uW-Sh for qemu-devel@nongnu.org; Thu, 22 Aug 2019 12:33:14 -0400 Received: from mx1.redhat.com ([209.132.183.28]:47410) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1i0q1M-0005rC-2x; Thu, 22 Aug 2019 12:33:08 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 344237FDC9; Thu, 22 Aug 2019 16:33:07 +0000 (UTC) Received: from dresden.str.redhat.com (unknown [10.40.205.206]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 60A245D6A7; Thu, 22 Aug 2019 16:33:00 +0000 (UTC) From: Max Reitz To: =?UTF-8?B?THVrw6HFoSBEb2t0b3I=?= , QEMU Developers , qemu-block@nongnu.org, Anton Nefedov , Andrew Jones , Paolo Bonzini , Stefan Hajnoczi References: <84dd318a-28d4-b2f3-b03b-d870a5f618d5@redhat.com> <80341376-ecbd-cbf4-8800-022fdd65c04d@redhat.com> <8e42ea8e-0629-2645-d14d-785722df6466@redhat.com> <9b09731a-1769-2cbe-4b6f-3d7787f74ebc@redhat.com> Openpgp: preference=signencrypt Autocrypt: addr=mreitz@redhat.com; prefer-encrypt=mutual; keydata= mQENBFXOJlcBCADEyyhOTsoa/2ujoTRAJj4MKA21dkxxELVj3cuILpLTmtachWj7QW+TVG8U /PsMCFbpwsQR7oEy8eHHZwuGQsNpEtNC2G/L8Yka0BIBzv7dEgrPzIu+W3anZXQW4702+uES U29G8TP/NGfXRRHGlbBIH9KNUnOSUD2vRtpOLXkWsV5CN6vQFYgQfFvmp5ZpPeUe6xNplu8V mcTw8OSEDW/ZnxJc8TekCKZSpdzYoxfzjm7xGmZqB18VFwgJZlIibt1HE0EB4w5GsD7x5ekh awIe3RwoZgZDLQMdOitJ1tUc8aqaxvgA4tz6J6st8D8pS//m1gAoYJWGwwIVj1DjTYLtABEB AAG0HU1heCBSZWl0eiA8bXJlaXR6QHJlZGhhdC5jb20+iQFTBBMBCAA9AhsDBQkSzAMABQsJ CAcCBhUICQoLAgQWAgMBAh4BAheABQJVzie5FRhoa3A6Ly9rZXlzLmdudXBnLm5ldAAKCRD0 B9sAYdXPQDcIB/9uNkbYEex1rHKz3mr12uxYMwLOOFY9fstP5aoVJQ1nWQVB6m2cfKGdcRe1 2/nFaHSNAzT0NnKz2MjhZVmcrpyd2Gp2QyISCfb1FbT82GMtXFj1wiHmPb3CixYmWGQUUh+I AvUqsevLA+WihgBUyaJq/vuDVM1/K9Un+w+Tz5vpeMidlIsTYhcsMhn0L9wlCjoucljvbDy/ 8C9L2DUdgi3XTa0ORKeflUhdL4gucWoAMrKX2nmPjBMKLgU7WLBc8AtV+84b9OWFML6NEyo4 4cP7cM/07VlJK53pqNg5cHtnWwjHcbpGkQvx6RUx6F1My3y52vM24rNUA3+ligVEgPYBuQEN BFXOJlcBCADAmcVUNTWT6yLWQHvxZ0o47KCP8OcLqD+67T0RCe6d0LP8GsWtrJdeDIQk+T+F xO7DolQPS6iQ6Ak2/lJaPX8L0BkEAiMuLCKFU6Bn3lFOkrQeKp3u05wCSV1iKnhg0UPji9V2 W5eNfy8F4ZQHpeGUGy+liGXlxqkeRVhLyevUqfU0WgNqAJpfhHSGpBgihUupmyUg7lfUPeRM DzAN1pIqoFuxnN+BRHdAecpsLcbR8sQddXmDg9BpSKozO/JyBmaS1RlquI8HERQoe6EynJhd 64aICHDfj61rp+/0jTIcevxIIAzW70IadoS/y3DVIkuhncgDBvGbF3aBtjrJVP+5ABEBAAGJ ASUEGAEIAA8FAlXOJlcCGwwFCRLMAwAACgkQ9AfbAGHVz0CbFwf9F/PXxQR9i4N0iipISYjU sxVdjJOM2TMut+ZZcQ6NSMvhZ0ogQxJ+iEQ5OjnIputKvPVd5U7WRh+4lF1lB/NQGrGZQ1ic alkj6ocscQyFwfib+xIe9w8TG1CVGkII7+TbS5pXHRxZH1niaRpoi/hYtgzkuOPp35jJyqT/ /ELbqQTDAWcqtJhzxKLE/ugcOMK520dJDeb6x2xVES+S5LXby0D4juZlvUj+1fwZu+7Io5+B bkhSVPb/QdOVTpnz7zWNyNw+OONo1aBUKkhq2UIByYXgORPFnbfMY7QWHcjpBVw9MgC4tGeF R4bv+1nAMMxKmb5VvQCExr0eFhJUAHAhVg== Message-ID: <1ac19336-e9f4-753c-ebd2-41156152eb9a@redhat.com> Date: Thu, 22 Aug 2019 18:32:59 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <9b09731a-1769-2cbe-4b6f-3d7787f74ebc@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Thu, 22 Aug 2019 16:33:07 +0000 (UTC) Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: Re: [Qemu-devel] Broken aarch64 by qcow2: skip writing zero buffers to empty COW areas [v2] X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" On 22.08.19 17:40, Max Reitz wrote: > On 22.08.19 17:25, Max Reitz wrote: >> On 22.08.19 14:09, Max Reitz wrote: >>> (CC-ing Paolo because of the XFS connection, and Stefan because why n= ot.) >>> >>> On 22.08.19 13:27, Luk=C3=A1=C5=A1 Doktor wrote: >>>> Dne 21. 08. 19 v 19:51 Max Reitz napsal(a): >>>>> On 21.08.19 16:14, Luk=C3=A1=C5=A1 Doktor wrote: >>>>>> Hello guys, >>>>>> >>>>>> First attempt was rejected due to zip attachment, let's try it aga= in with just Avocado-vt debug.log and serial console log files attached. >>>>>> >>>>>> I bisected a regression on aarch64 all the way to this commit: "qc= ow2: skip writing zero buffers to empty COW areas" c8bb23cbdbe32f5c326365= e0a82e1b0e68cdcd8a. Would you please have a look at it? >>>>> >>>>> I think I can see the issue on my x64 system (I don=E2=80=99t see t= he XFS >>>>> corruption, but the installation fails because of some segfaults). >>>>> >>>>> I haven=E2=80=99t found a simpler way to reproduce the problem yet,= though, >>>>> which is a pain... :-/ >>>>> >>>>> It looks like the problem disappears when I configure qemu with >>>>> =E2=80=9C--disable-xfsctl=E2=80=9D. Can you try that? >>>>> >>>>> Max >>>>> >>>> >>>> Hello Max, >>>> >>>> yes, I'm getting the same behavior. With "--disable-xfsctl" it works= well. Also looking at the option I understand why it only failed on aarc= h64 for me, I don't have libs installed on the other machines, therefor i= t was disabled by "./configure" there. Anyway I guess disabling it in my = builds won't really fix the issue, right? :-) >>> >>> Thanks! >>> >>> No, it won=E2=80=99t, but it means the actual root of the problem is = probably >>> rather in some XFS-related code (be it because qemu uses it the wrong >>> way or because of XFS kernel code) than in the pure qcow2 commit that >>> made the problem surface by exercising it heavily. (Or in an >>> interaction between the two.) >> >> OK, I got a simpler reproducer now: >> >> $ ./qemu-img create -f qcow2 test.qcow2 1M >> $ (for i in $(seq 15 -1 0); do \ >> echo "aio_write -P 42 $((i * 64 + 1))k 62k"; \ >> done) \ >> | ./qemu-io test.qcow2 >> $ for i in $(seq 0 15); do \ >> echo $i; \ >> ofs=3D$((i * 64)); \ >> ./qemu-io -c "read -P 0 ${ofs}k 1k" \ >> -c "read -P 42 $((ofs + 1))k 62k" \ >> -c "read -P 0 $((ofs + 63))k 1k" \ >> test.qcow2 \ >> | grep 'verification'; \ >> done >> >> On XFS with --enable-xfsctl, this basically always gives me some >> verification failure somewhere. (On tmpfs or with --disable-xfsctl, i= t >> never fails.) >> >> So it seems to be related to I/O from back to front. >> >> (You can also reproduce it with a plain =E2=80=9Cqemu-img bench=E2=80=9D= invocation, >> like =E2=80=9C./qemu-img bench -w --pattern=3D42 -o 1k -S 64k -s 62k t= est.qcow2=E2=80=9D >> (on, say, a 4 GB image), but then the failure appears much later in th= e >> image, because you have to wait from some requests to come in reverse >> (by chance) first.) >=20 > The problem is the ftruncate() in xfs_write_zeroes(). It is possible > for it to yield, then other requests come in, and the data they write > may get discarded once the ftruncate() settles. I=E2=80=99ve just sent a patch: =E2=80=9Cblock/file-posix: Fix xfs_write_= zeroes()=E2=80=9D, Message-ID <20190822162618.27670-1-mreitz@redhat.com>: https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg01148.html Max