Date: Wed, 13 May 2015 16:32:23 +0100
From: Stefan Hajnoczi
To: "Denis V. Lunev"
Cc: Paolo Bonzini, Stefan Hajnoczi, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v8 0/2] block: enforce minimal 4096 alignment in qemu_blockalign
Message-ID: <20150513153223.GA24352@stefanha-thinkpad.redhat.com>
In-Reply-To: <1431441056-26198-1-git-send-email-den@openvz.org>

On Tue, May 12, 2015 at 05:30:54PM +0300, Denis V.
Lunev wrote:
> I have used the following program to test
> #define _GNU_SOURCE
>
> #include <stdio.h>
> #include <malloc.h>
> #include <string.h>
> #include <fcntl.h>
> #include <stdlib.h>
> #include <unistd.h>
>
> int main(int argc, char *argv[])
> {
>     int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
>     void *buf;
>     int i = 0, align = atoi(argv[2]);
>
>     do {
>         buf = memalign(align, 4096);
>         if (align >= 4096)
>             break;
>         if ((unsigned long)buf & 4095)
>             break;
>         i++;
>     } while (1);
>     printf("%d %p\n", i, buf);
>
>     memset(buf, 0x11, 4096);
>
>     for (i = 0; i < 100000; i++) {
>         lseek(fd, SEEK_CUR, 4096);
>         write(fd, buf, 4096);
>     }
>
>     close(fd);
>     return 0;
> }
> for i in `seq 1 30` ; do a.out aa ; done
>
> The file was placed into an 8 GB partition on the HDD below to avoid speed
> changes due to different offsets on disk. Results are reliable:
> - 189 vs 180 seconds on Linux 3.16
>
> The following setups have been tested:
> 1) ext4 with block size equal to 1024 over 512/512 physical/logical
>    sector size SSD disk
> 2) ext4 with block size equal to 4096 over 512/512 physical/logical
>    sector size SSD disk
> 3) ext4 with block size equal to 4096 over 512/4096 physical/logical
>    sector size rotational disk (WDC WD20EZRX)
> 4) xfs with block size equal to 4096 over 512/512 physical/logical
>    sector size SSD disk
>
> The difference is quite reliable and the same 5%.
>   qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
> for image in qcow2 format is 1% faster.
>
> qemu-img is also affected. The difference in between
>   qemu-img create -f qcow2 1.img 64G
>   qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
>   time for i in `seq 1 30` ; do qemu-img convert 1.img -t none -O raw 2.img ; rm -rf 2.img ; done
> is around 126 vs 119 seconds.
>
> The justification of the performance improvement is quite interesting.
> From the kernel point of view each request to the disk was split
> by two.
> This could be seen by blktrace like this:
>   9,0 11 1 0.000000000 11151 Q WS 312737792 + 1023 [qemu-img]
>   9,0 11 2 0.000007938 11151 Q WS 312738815 + 8 [qemu-img]
>   9,0 11 3 0.000030735 11151 Q WS 312738823 + 1016 [qemu-img]
>   9,0 11 4 0.000032482 11151 Q WS 312739839 + 8 [qemu-img]
>   9,0 11 5 0.000041379 11151 Q WS 312739847 + 1016 [qemu-img]
>   9,0 11 6 0.000042818 11151 Q WS 312740863 + 8 [qemu-img]
>   9,0 11 7 0.000051236 11151 Q WS 312740871 + 1017 [qemu-img]
>   9,0 5 1 0.169071519 11151 Q WS 312741888 + 1023 [qemu-img]
> After the patch the pattern becomes normal:
>   9,0 6 1 0.000000000 12422 Q WS 314834944 + 1024 [qemu-img]
>   9,0 6 2 0.000038527 12422 Q WS 314835968 + 1024 [qemu-img]
>   9,0 6 3 0.000072849 12422 Q WS 314836992 + 1024 [qemu-img]
>   9,0 6 4 0.000106276 12422 Q WS 314838016 + 1024 [qemu-img]
> and the number of requests sent to disk (which can be calculated by
> counting the lines in the output of blktrace) is reduced about 2 times.
>
> Both qemu-img and qemu-io are affected while qemu-kvm is not. The guest
> does its job well and real requests come properly aligned (to page).
>
> Changes from v7:
> - make assignment from v6 unconditional (Kevin)
>
> Changes from v6:
> - explicitly assign opt_mem_alignment in raw-posix.c with
>   MAX(s->buf_align, getpagesize()) (Kevin)
>
> Changes from v5:
> - found justification from kernel point of view
> - fixed checkpatch warnings in the patch 2
>
> Changes from v4:
> - patches reordered
> - dropped conversion from 512 to BDRV_SECTOR_SIZE
> - getpagesize() is replaced with MAX(4096, getpagesize()) as suggested by
>   Kevin
>
> Changes from v3:
> - portable way to calculate system page size used
> - 512/4096 values are replaced with proper macros/values
>
> Changes from v2:
> - opt_mem_alignment is split to opt_mem_alignment for bounce buffering
>   and min_mem_alignment to check buffers coming from guest.
>
> Changes from v1:
> - enforces 4096 alignment in qemu_(try_)blockalign, avoid touching of
>   bdrv_qiov_is_aligned path not to enforce additional bounce buffering
>   as suggested by Paolo
> - reduces 10% to 5% in patch description to better fit 180 vs 189
>   difference
>
> Signed-off-by: Denis V. Lunev
> CC: Paolo Bonzini
> CC: Kevin Wolf
> CC: Stefan Hajnoczi
>
>

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan
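
The alignment rule named in the changelog, MAX(4096, getpagesize()), can be
sketched in C roughly as follows. This is an illustration only and not the
code from the patch: the helper names min_block_align() and
alloc_block_buffer() are hypothetical, and posix_memalign() is used here
simply as a portable stand-in for an aligned allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: at least 4096 bytes, or the page size if larger. */
static size_t min_block_align(void)
{
    long page = sysconf(_SC_PAGESIZE);   /* portable page size query */
    size_t align = 4096;

    if (page > 0 && (size_t)page > align) {
        align = (size_t)page;
    }
    /* posix_memalign() needs a power of two; page sizes satisfy this. */
    assert((align & (align - 1)) == 0);
    return align;
}

/* Hypothetical helper: allocate a buffer aligned to min_block_align().
 * Returns NULL on failure; the caller releases the buffer with free(). */
static void *alloc_block_buffer(size_t size)
{
    void *buf = NULL;

    if (posix_memalign(&buf, min_block_align(), size) != 0) {
        return NULL;
    }
    return buf;
}
```

A buffer obtained from alloc_block_buffer(4096) has its low 12 address bits
clear, so a check such as ((uintptr_t)buf & 4095) == 0 holds. That is the
property the test program in the cover letter probes for with memalign().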