From: Eric Blake
Date: Mon, 25 Apr 2016 13:35:12 -0600
To: "Denis V. Lunev", Kevin Wolf
Cc: qemu-devel@nongnu.org, Max Reitz
Subject: Re: [Qemu-devel] [PATCH for 2.7 1/1] qcow2: improve qcow2_co_write_zeroes()
Message-ID: <571E7170.1090305@redhat.com>
In-Reply-To: <571DEF7D.1010304@openvz.org>
References: <1461413127-2594-1-git-send-email-den@openvz.org> <20160425090553.GA5293@noname.str.redhat.com> <571DEF7D.1010304@openvz.org>

On 04/25/2016 04:20 AM, Denis V. Lunev wrote:
> On 04/25/2016 12:05 PM, Kevin Wolf wrote:
>> On 23.04.2016 at 14:05, Denis V. Lunev wrote:
>>> Unfortunately Linux kernel could send non-aligned requests to qemu-nbd
>>> if the caller is using O_DIRECT and does not align in-memory data to
>>> page. Thus qemu-nbd will call block layer with non-aligned requests.

At first glance, I'd argue that any caller using O_DIRECT without
obeying memory alignment restrictions is broken; why should qemu have to
work around such broken callers?
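
For comparison, a caller that does obey the alignment restriction could
allocate its buffer along these lines. This is only a sketch: the helper
names alloc_aligned() and is_aligned() are made up for illustration, and
any 4096-byte alignment passed to them is an assumed page size, not a
value queried from the system.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return 1 if ptr is a multiple of align.
 * align must be a power of two.  The outer parentheses around the
 * mask test matter, because unary ! binds tighter than binary &.
 */
int is_aligned(const void *ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)ptr & (align - 1)) == 0;
}

/*
 * Hypothetical helper: allocate size bytes whose address is a multiple
 * of align.  Returns NULL on failure; release the buffer with free().
 */
void *alloc_aligned(size_t align, size_t size)
{
    void *buf = NULL;

    if (posix_memalign(&buf, align, size) != 0) {
        return NULL;
    }
    return buf;
}
```

A buffer obtained with alloc_aligned(4096, 1024 * 1024) satisfies
is_aligned(buf, 4096), which is the property an O_DIRECT caller would
want to establish before issuing the write.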

> The program is 100% reproducible. The following sequence
> is performed:
>
> #define _GNU_SOURCE
>
> #include
> #include
> #include
> #include
> #include
>
> int main(int argc, char *argv[])
> {
>     char *buf;
>     int fd;
>
>     if (argc != 2) {
>         return -1;
>     }
>
>     fd = open(argv[1], O_WRONLY | O_DIRECT);
>
>     do {
>         buf = memalign(512, 1024 * 1024);
>     } while (!(unsigned long)buf & (4096 - 1));

In other words, you are INTENTIONALLY grabbing an unaligned buffer for
use with an O_DIRECT fd, when O_DIRECT demands that you should be using
at least page alignment (4096 or greater). Arguably, the bug is in your
program, not qemu.

That said, teaching qemu to split up a write_zeroes request into head,
tail, and aligned body, so that at least the aligned part might benefit
from optimizations, seems like it might be worthwhile. My pending NBD
series changed from blk_write_zeroes (cluster-aligned) to
blk_pwrite_zeroes (byte-aligned), and it is conceivable that we can
encounter a situation without O_DIRECT in the picture at all, where our
NBD server is connected to a client that specifically asks for the new
NBD_CMD_WRITE_ZEROES on any arbitrary byte alignment.

>     memset(buf, 0, 1024 * 1024);
>     write(fd, buf, 1024 * 1024);
>     return 0;
> }
>
> This program is compiled as a.out.
>
> Before the patch:
> hades ~/src/qemu $ qemu-img create -f qcow2 test.qcow2 64G
> Formatting 'test.qcow2', fmt=qcow2 size=68719476736 encryption=off
> cluster_size=65536 lazy_refcounts=off refcount_bits=16
> hades ~/src/qemu $ sudo ./qemu-nbd --connect=/dev/nbd3 test.qcow2
> --detect-zeroes=on --aio=native --cache=none
> hades ~/src/qemu $ sudo ./a.out /dev/nbd3

Here, I'd argue that the kernel NBD module is buggy, for allowing a
user-space app to pass an unaligned buffer to an O_DIRECT fd. But
that's outside the realm of qemu code.
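
The head/body/tail split suggested above can be sketched roughly as
follows. This is a standalone illustration, not qemu code: struct
zero_split and split_request() are hypothetical names, and the caller is
assumed to pass a power-of-two alignment such as the 65536-byte
cluster_size shown in the qemu-img output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: three byte ranges that together cover
 * [offset, offset + bytes).  Any of the lengths may be zero. */
struct zero_split {
    uint64_t head_off, head_len;   /* unaligned prefix */
    uint64_t body_off, body_len;   /* whole aligned units */
    uint64_t tail_off, tail_len;   /* unaligned suffix */
};

/* Split a byte-granular request around boundaries of size align
 * (a power of two).  If the request contains no whole aligned unit,
 * all of it is reported as head. */
struct zero_split split_request(uint64_t offset, uint64_t bytes,
                                uint64_t align)
{
    struct zero_split s = {0};
    uint64_t end, body_start, body_end;

    assert(align != 0 && (align & (align - 1)) == 0);
    assert(bytes <= UINT64_MAX - offset);          /* end must not wrap */
    assert(offset <= UINT64_MAX - (align - 1));    /* round-up must not wrap */

    end = offset + bytes;
    body_start = (offset + align - 1) & ~(align - 1);  /* round offset up */
    body_end = end & ~(align - 1);                     /* round end down */

    if (body_start >= body_end) {
        /* No complete aligned unit inside the request. */
        s.head_off = offset;
        s.head_len = bytes;
        return s;
    }

    s.head_off = offset;
    s.head_len = body_start - offset;
    s.body_off = body_start;
    s.body_len = body_end - body_start;
    s.tail_off = body_end;
    s.tail_len = end - body_end;
    return s;
}
```

Only the body range would be a candidate for the cluster-level zero
optimization; head and tail would still go through the ordinary write
path.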

But again, per my argument that you don't need to involve the kernel or
O_DIRECT to write a client that can attempt to make an NBD server do
unaligned operations, we can use this kernel bug as an easier way to
test any proposed fix on the qemu side of things, whether or not the
kernel module's behavior gets tightened down the road.

-- 
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org