From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:32965)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1YJmqJ-0001k5-7G for qemu-devel@nongnu.org;
	Fri, 06 Feb 2015 12:37:24 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1YJmqE-0006Hi-Lc for qemu-devel@nongnu.org;
	Fri, 06 Feb 2015 12:37:22 -0500
Received: from mailhub.sw.ru ([195.214.232.25]:6750 helo=relay.sw.ru)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1YJmqE-0006HP-AJ for qemu-devel@nongnu.org;
	Fri, 06 Feb 2015 12:37:18 -0500
From: "Denis V. Lunev"
Date: Fri, 6 Feb 2015 20:37:50 +0300
Message-Id: <1423244272-24887-1-git-send-email-den@openvz.org>
Subject: [Qemu-devel] [PATCH v4 0/1] block: enforce minimal 4096 alignment in qemu_blockalign
Cc: Kevin Wolf, "Denis V. Lunev", qemu-devel@nongnu.org, Paolo Bonzini

The following sequence

    int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
    for (i = 0; i < 100000; i++)
        write(fd, buf, 4096);

performs 5% better if buf is aligned to 4096 bytes rather than to 512 bytes.

I have used the following program to test:

    #define _GNU_SOURCE

    #include <stdio.h>      /* printf */
    #include <stdlib.h>     /* atoi */
    #include <string.h>     /* memset */
    #include <malloc.h>     /* memalign */
    #include <fcntl.h>      /* open, O_DIRECT */
    #include <unistd.h>     /* lseek, write, close */

    int main(int argc, char *argv[])
    {
        int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
        void *buf;
        int i = 0, align = atoi(argv[2]);

        /* For align < 4096, retry until the buffer is NOT 4096-aligned. */
        do {
            buf = memalign(align, 4096);
            if (align >= 4096)
                break;
            if ((unsigned long)buf & 4095)
                break;
            i++;
        } while (1);
        printf("%d %p\n", i, buf);

        memset(buf, 0x11, 4096);
        for (i = 0; i < 100000; i++) {
            /*
             * Note: lseek() takes (fd, offset, whence). As posted, the
             * last two arguments are swapped, so the call fails with
             * EINVAL and the writes stay sequential.
             */
            lseek(fd, SEEK_CUR, 4096);
            write(fd, buf, 4096);
        }

        close(fd);
        return 0;
    }

    for i in `seq 1 30` ; do a.out aa <align> ; done

The file was placed on an 8 GB partition of the HDD listed below to avoid
speed variation due to different offsets on disk.
Results are reliable:
- 189 vs 180 seconds on Linux 3.16 (512-byte vs 4096-byte aligned buffer)

The following setups have been tested:
1) ext4 with block size equal to 1024 over 512/512 physical/logical
   sector size SSD disk
2) ext4 with block size equal to 4096 over 512/512 physical/logical
   sector size SSD disk
3) ext4 with block size equal to 4096 over 512/4096 physical/logical
   sector size rotational disk (WDC WD20EZRX)
4) xfs with block size equal to 4096 over 512/512 physical/logical
   sector size SSD disk

The difference is consistent across these setups: the same 5%.

qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
for an image in qcow2 format is 1% faster.

Changes from v3:
- a portable way to calculate the system page size is used
- 512/4096 values are replaced with proper macros/values

Changes from v2:
- opt_mem_alignment is split into opt_mem_alignment for bounce buffering
  and min_mem_alignment to check buffers coming from the guest

Changes from v1:
- enforce 4096 alignment in qemu_(try_)blockalign and avoid touching the
  bdrv_qiov_is_aligned path, so as not to force additional bounce
  buffering, as suggested by Paolo
- reduce 10% to 5% in the patch description to better fit the 180 vs 189
  difference

Signed-off-by: Denis V. Lunev
CC: Paolo Bonzini
CC: Kevin Wolf
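
For readers who want to try the idea outside QEMU, here is a minimal
sketch of "enforce at least 4096-byte alignment, using a portable page
size query". It is not the code from this series: the names
MIN_BLOCK_ALIGN, min_block_alignment(), block_alloc_aligned() and
is_aligned() are hypothetical, and the use of sysconf(_SC_PAGESIZE) and
posix_memalign() is an assumption about one reasonable POSIX approach.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical floor, taken from the subject line of this series. */
#define MIN_BLOCK_ALIGN 4096

/* Return the larger of the 4096-byte floor and the system page size. */
size_t min_block_alignment(void)
{
    long page = sysconf(_SC_PAGESIZE);  /* POSIX query for the page size */
    size_t align = MIN_BLOCK_ALIGN;

    if (page > 0 && (size_t)page > align) {
        align = (size_t)page;
    }
    /* posix_memalign() requires a power-of-two alignment. */
    assert((align & (align - 1)) == 0);
    return align;
}

/*
 * Allocate size bytes aligned to at least min_block_alignment().
 * Returns NULL on failure; release the buffer with free().
 */
void *block_alloc_aligned(size_t size)
{
    void *buf = NULL;

    if (posix_memalign(&buf, min_block_alignment(), size) != 0) {
        return NULL;
    }
    return buf;
}

/* Same style of check as the test program: low address bits must be 0. */
int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

A caller would obtain a buffer with block_alloc_aligned(4096), pass it to
write() on an O_DIRECT descriptor, and free() it afterwards. Keeping the
alignment check separate from the allocator mirrors the split described
in the v2 changelog: one value for buffers we allocate ourselves, and a
weaker check for buffers that arrive from elsewhere.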