From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36063)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1YMB51-0004xL-60
	for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:54:28 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1YMB4w-0004Jo-6J
	for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:54:27 -0500
Received: from mx2.parallels.com ([199.115.105.18]:39799)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1YMB4v-0004Bb-W4
	for qemu-devel@nongnu.org; Fri, 13 Feb 2015 02:54:22 -0500
Message-ID: <54DDAD49.1060607@openvz.org>
Date: Fri, 13 Feb 2015 10:52:41 +0300
From: "Denis V. Lunev"
MIME-Version: 1.0
References: <1423244272-24887-1-git-send-email-den@openvz.org>
In-Reply-To: <1423244272-24887-1-git-send-email-den@openvz.org>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v4 0/1] block: enforce minimal 4096
 alignment in qemu_blockalign
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Kevin Wolf, Paolo Bonzini, qemu-devel@nongnu.org

On 06/02/15 20:37, Denis V. Lunev wrote:
> The following sequence
>     int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
>     for (i = 0; i < 100000; i++)
>             write(fd, buf, 4096);
> performs 5% better if buf is aligned to 4096 bytes rather than to
> 512 bytes.
>
> I have used the following program to test
> #define _GNU_SOURCE
>
> #include <fcntl.h>
> #include <malloc.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <unistd.h>
>
> int main(int argc, char *argv[])
> {
>     int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
>     void *buf;
>     int i = 0, align = atoi(argv[2]);
>
>     do {
>         buf = memalign(align, 4096);
>         if (align >= 4096)
>             break;
>         if ((unsigned long)buf & 4095)
>             break;
>         i++;
>     } while (1);
>     printf("%d %p\n", i, buf);
>
>     memset(buf, 0x11, 4096);
>
>     for (i = 0; i < 100000; i++) {
>         lseek(fd, SEEK_CUR, 4096);
>         write(fd, buf, 4096);
>     }
>
>     close(fd);
>     return 0;
> }
> for i in `seq 1 30` ; do a.out aa ; done
>
> The file was placed on an 8 GB partition on the HDD below to avoid
> speed changes due to different offsets on disk. Results are reliable:
> - 189 vs 180 seconds on Linux 3.16
>
> The following setups have been tested:
> 1) ext4 with block size equal to 1024 over 512/512 physical/logical
>    sector size SSD disk
> 2) ext4 with block size equal to 4096 over 512/512 physical/logical
>    sector size SSD disk
> 3) ext4 with block size equal to 4096 over 512/4096 physical/logical
>    sector size rotational disk (WDC WD20EZRX)
> 4) xfs with block size equal to 4096 over 512/512 physical/logical
>    sector size SSD disk
>
> The difference is quite reliable and the same 5%.
>     qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
> for an image in qcow2 format is 1% faster.
>
> Changes from v3:
> - portable way to calculate system page size used
> - 512/4096 values are replaced with proper macros/values
>
> Changes from v2:
> - opt_mem_alignment is split into opt_mem_alignment for bounce
>   buffering and min_mem_alignment to check buffers coming from guest.
>
> Changes from v1:
> - enforces 4096 alignment in qemu_(try_)blockalign, avoids touching
>   the bdrv_qiov_is_aligned path so as not to enforce additional bounce
>   buffering, as suggested by Paolo
> - reduces 10% to 5% in patch description to better fit 180 vs 189
>   difference
>
> Signed-off-by: Denis V. Lunev
> CC: Paolo Bonzini
> CC: Kevin Wolf
>

ping
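
For anyone who wants to try the allocation side of this outside QEMU, here is a minimal sketch of the idea from the subject line: take the system page size in a portable way via sysconf(_SC_PAGESIZE) and never go below 4096 for I/O buffer alignment. The names SKETCH_MIN_ALIGN, sketch_io_alignment, sketch_blockalign and sketch_is_aligned are made up for illustration; they are not QEMU functions, and the patch itself may compute the value differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() and sysconf() */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical floor for I/O buffer alignment. 4096 is the value named in
 * the patch subject, not a constant copied from the QEMU tree. */
#define SKETCH_MIN_ALIGN 4096

/* Return max(system page size, SKETCH_MIN_ALIGN). */
size_t sketch_io_alignment(void)
{
    long page = sysconf(_SC_PAGESIZE);

    /* Also covers the -1 error return from sysconf(). */
    if (page < SKETCH_MIN_ALIGN) {
        return SKETCH_MIN_ALIGN;
    }
    return (size_t)page;
}

/* Allocate size bytes aligned to sketch_io_alignment().
 * Returns NULL on failure; release the buffer with free(). */
void *sketch_blockalign(size_t size)
{
    void *buf = NULL;

    if (posix_memalign(&buf, sketch_io_alignment(), size) != 0) {
        return NULL;
    }
    return buf;
}

/* Return nonzero if buf is aligned to align, which must be a power of two. */
int sketch_is_aligned(const void *buf, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)buf & (align - 1)) == 0;
}
```

A buffer returned by sketch_blockalign(4096) could be passed to write() on an O_DIRECT descriptor in place of the memalign() result in the test program above, and sketch_is_aligned(buf, 4096) gives a quick check of what the allocator actually handed back.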