* [Qemu-devel] [PATCH v4 0/1] block: enforce minimal 4096 alignment in qemu_blockalign
@ 2015-02-06 17:37 Denis V. Lunev
  2015-02-06 17:37 ` [Qemu-devel] [PATCH 1/2] block, raw-posix: replace 512/4096 constants with proper macros/values Denis V. Lunev
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Denis V. Lunev @ 2015-02-06 17:37 UTC (permalink / raw)
  Cc: Kevin Wolf, Denis V. Lunev, qemu-devel, Paolo Bonzini

The following sequence
    int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
    for (i = 0; i < 100000; i++)
            write(fd, buf, 4096);
performs 5% better if buf is aligned to 4096 bytes rather than to
512 bytes.

I used the following program to test this:
#define _GNU_SOURCE

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <malloc.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    int fd = open(argv[1], O_RDWR | O_CREAT | O_DIRECT, 0644);
    void *buf;
    int i = 0, align = atoi(argv[2]);

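    /*
     * Allocate until we get a buffer that is aligned to 'align' but not
     * to 4096 bytes; with align >= 4096 the first buffer is used as is.
     * Rejected buffers are not freed, so each call returns a new address.
     */
    do {
        buf = memalign(align, 4096);
        if (align >= 4096)
            break;
        if ((unsigned long)buf & 4095)
            break;
        i++;
    } while (1);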
    printf("%d %p\n", i, buf);

    memset(buf, 0x11, 4096);

    for (i = 0; i < 100000; i++) {
        lseek(fd, SEEK_CUR, 4096);
        write(fd, buf, 4096);
    }

    close(fd);
    return 0;
}
for i in `seq 1 30` ; do a.out aa ; done

The file was placed on an 8 GB partition on the HDD listed below to avoid
speed changes due to different offsets on disk. Results are reliable:
- 189 vs 180 seconds on Linux 3.16

The following setups have been tested:
1) ext4 with block size equal to 1024 over 512/512 physical/logical
   sector size SSD disk
2) ext4 with block size equal to 4096 over 512/512 physical/logical
   sector size SSD disk
3) ext4 with block size equal to 4096 over 512/4096 physical/logical
   sector size rotational disk (WDC WD20EZRX)
4) xfs with block size equal to 4096 over 512/512 physical/logical
   sector size SSD disk

The difference is quite reliable and is the same 5% in all of them.
  qemu-io -n -c 'write -P 0xaa 0 1G' 1.img
for an image in qcow2 format is 1% faster.

Changes from v3:
- a portable way to calculate the system page size is used
- 512/4096 values are replaced with proper macros/values
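
For illustration only, one portable way to query the page size on POSIX
systems is sysconf(_SC_PAGESIZE). The helper name query_page_size() and the
4096 fallback are assumptions made for this sketch; this is not the code
from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the system page size in bytes.
 * Falls back to 4096 (an assumption for this sketch) when sysconf()
 * cannot report a value.
 */
static size_t query_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);

    if (sz <= 0) {
        return 4096;
    }
    /* The result is used as an alignment, so it must be a power of two. */
    assert((sz & (sz - 1)) == 0);
    return (size_t)sz;
}
```

The value can then be used in place of a hard-coded 4096 wherever an
alignment is needed.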

Changes from v2:
- opt_mem_alignment is split into opt_mem_alignment for bounce buffering
  and min_mem_alignment to check buffers coming from guest.
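
A minimal sketch of how the two values could be used, assuming a small
struct that carries both. struct mem_limits, buf_is_aligned() and
alloc_bounce() are hypothetical names invented for this example; only the
field names come from the description above, and this is not the QEMU
implementation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#endif

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for the two alignment values. */
struct mem_limits {
    size_t min_mem_alignment;   /* requirement for buffers coming from guest */
    size_t opt_mem_alignment;   /* alignment used for our own bounce buffers */
};

/* True if buf/len satisfy the minimal alignment, so no bounce is needed. */
static bool buf_is_aligned(const void *buf, size_t len,
                           const struct mem_limits *l)
{
    size_t a = l->min_mem_alignment;

    assert(a != 0 && (a & (a - 1)) == 0);   /* must be a power of two */
    return ((uintptr_t)buf & (a - 1)) == 0 && (len & (a - 1)) == 0;
}

/* Allocate a bounce buffer at the optimal alignment; NULL on failure. */
static void *alloc_bounce(size_t len, const struct mem_limits *l)
{
    void *p = NULL;

    if (posix_memalign(&p, l->opt_mem_alignment, len) != 0) {
        return NULL;
    }
    return p;
}
```

With min_mem_alignment = 512 and opt_mem_alignment = 4096, a guest buffer
aligned to 512 bytes is accepted as is, while any buffer allocated
internally gets the 4096-byte alignment that performed better in the test.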

Changes from v1:
- enforces 4096 alignment in qemu_(try_)blockalign and avoids touching the
  bdrv_qiov_is_aligned path, so as not to enforce additional bounce
  buffering, as suggested by Paolo
- reduces 10% to 5% in the patch description to better fit the 180 vs 189
  difference

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
