qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2] block: posix: Handle undetectable alignment
@ 2019-08-11 20:50 Nir Soffer
  2019-08-12 14:23 ` Kevin Wolf
  0 siblings, 1 reply; 5+ messages in thread
From: Nir Soffer @ 2019-08-11 20:50 UTC (permalink / raw)
  To: qemu-block; +Cc: Kevin Wolf, Nir Soffer, qemu-devel, Max Reitz

In some cases buf_align or request_alignment cannot be detected:

- With Gluster, buf_align cannot be detected since the actual I/O is
  done on Gluster server, and qemu buffer alignment does not matter.

- With local XFS filesystem, buf_align cannot be detected if reading
  from unallocated area.

- With Gluster backed by XFS, request_alignment cannot be detected if
  reading from unallocated area.

- With NFS, the server does not use direct I/O, so both buf_align cannot
  be detected.

These cases seems to work when storage sector size is 512 bytes, because
the current code starts checking align=512. If the check succeeds
because alignment cannot be detected we use 512. But this does not work
for storage with 4k sector size.

Practically the alignment requirements are the same for buffer
alignment, buffer length, and offset in file. So in case we cannot
detect buf_align, we can use request alignment. If we cannot detect
request alignment, we can fallback to a safe value.

With this change:

- Provisioning VM and copying disks on local XFS and Gluster with 4k
  sector size works, resolving bugs [1],[2].

- With NFS we fallback to buf_align and request_alignment of 4096
  instead of 512. This may cause unneeded data copying, but so far I see
  better performance with this change.

[1] https://bugzilla.redhat.com/1737256
[2] https://bugzilla.redhat.com/1738657

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
---

v1 was a minimal hack; this version is a more generic fix that works for
any storage without requiring users to allocate the first block of an
image. Allocting the first block of an image is still a good idea since
it allows detecting the right alignment in some cases.

v1 could also affect cases when we could detect buf_align to use
request_alignment instead; v2 will only affect cases when buf_align or
request_alignment cannot be detected.

v1 was hare:
https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00133.html

 block/file-posix.c | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index f33b542b33..511468f166 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -323,6 +323,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
     BDRVRawState *s = bs->opaque;
     char *buf;
     size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
+    size_t alignments[] = {1, 512, 1024, 2048, 4096};
 
     /* For SCSI generic devices the alignment is not really used.
        With buffered I/O, we don't have any restrictions. */
@@ -349,25 +350,42 @@ static void raw_probe_alignment(BlockDriverState *bs, int fd, Error **errp)
     }
 #endif
 
-    /* If we could not get the sizes so far, we can only guess them */
-    if (!s->buf_align) {
+    /*
+     * If we could not get the sizes so far, we can only guess them. First try
+     * to detect request alignment, since it is more likely to succeed. Then
+     * try to detect buf_align, which cannot be detected in some cases (e.g.
+     * Gluster). If buf_align cannot be detected, we fallback to the value of
+     * request_alignment.
+     */
+
+    if (!bs->bl.request_alignment) {
+        int i;
         size_t align;
-        buf = qemu_memalign(max_align, 2 * max_align);
-        for (align = 512; align <= max_align; align <<= 1) {
-            if (raw_is_io_aligned(fd, buf + align, max_align)) {
-                s->buf_align = align;
+        buf = qemu_memalign(max_align, max_align);
+        for (i = 0; i < ARRAY_SIZE(alignments); i++) {
+            align = alignments[i];
+            if (raw_is_io_aligned(fd, buf, align)) {
+                /* Fallback to safe value. */
+                bs->bl.request_alignment = (align != 1) ? align : max_align;
                 break;
             }
         }
         qemu_vfree(buf);
     }
 
-    if (!bs->bl.request_alignment) {
+    if (!s->buf_align) {
+        int i;
         size_t align;
-        buf = qemu_memalign(s->buf_align, max_align);
-        for (align = 512; align <= max_align; align <<= 1) {
-            if (raw_is_io_aligned(fd, buf, align)) {
-                bs->bl.request_alignment = align;
+        buf = qemu_memalign(max_align, 2 * max_align);
+        for (i = 0; i < ARRAY_SIZE(alignments); i++) {
+            align = alignments[i];
+            if (raw_is_io_aligned(fd, buf + align, max_align)) {
+                /* Fallback to request_aligment or safe value. */
+                s->buf_align = (align != 1)
+                    ? align
+                    : (bs->bl.request_alignment != 0)
+                        ? bs->bl.request_alignment
+                        : max_align;
                 break;
             }
         }
-- 
2.20.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-08-13 13:13 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-08-11 20:50 [Qemu-devel] [PATCH v2] block: posix: Handle undetectable alignment Nir Soffer
2019-08-12 14:23 ` Kevin Wolf
2019-08-13 10:45   ` Nir Soffer
2019-08-13 11:21     ` Kevin Wolf
2019-08-13 13:12       ` Nir Soffer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).