* [Qemu-devel] [PATCH for-2.6 v3 0/3] Bug fixes for gluster
@ 2016-04-19 14:16 Jeff Cody
  2016-04-19 14:16 ` [Qemu-devel] [PATCH for-2.6 v3 1/3] block/gluster: return correct error value Jeff Cody
From: Jeff Cody @ 2016-04-19 14:16 UTC
  To: qemu-block; +Cc: qemu-devel, kwolf, rwheeler, pkarampu, rgowdapp

Changes from v2:

Also check the error returned by the aio callback for
fsync. (thanks Kevin)


Bug fixes for gluster; the third patch prevents
potential data loss when trying to recover from an
otherwise recoverable error (such as ENOSPC).

The final patch closes the gluster fd and sets the
protocol drv to NULL on fsync failure in gluster;
we have no way of knowing which gluster versions
support retaining the fsync cache on error, so until
we do, the safest thing is to invalidate the drive.

Jeff Cody (3):
  block/gluster: return correct error value
  block/gluster: code movement of qemu_gluster_close()
  block/gluster: prevent data loss after i/o error

 block/gluster.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++----------
 configure       |  8 ++++++
 2 files changed, 72 insertions(+), 13 deletions(-)

-- 
1.9.3


* [Qemu-devel] [PATCH for-2.6 v3 1/3] block/gluster: return correct error value
  2016-04-19 14:16 [Qemu-devel] [PATCH for-2.6 v3 0/3] Bug fixes for gluster Jeff Cody
@ 2016-04-19 14:16 ` Jeff Cody
  2016-04-19 14:16 ` [Qemu-devel] [PATCH for-2.6 v3 2/3] block/gluster: code movement of qemu_gluster_close() Jeff Cody
From: Jeff Cody @ 2016-04-19 14:16 UTC
  To: qemu-block; +Cc: qemu-devel, kwolf, rwheeler, pkarampu, rgowdapp

Upon error, gluster will call the aio callback function with a
ret value of -1, with errno set to the proper error value.  If
we set acb->ret to that return value in the callback, every
error comes out as -EPERM (i.e. -1).  Instead, set it to the
proper error result, -errno.

Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/gluster.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
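
A note, not part of the patch: a minimal sketch of the callback
convention being fixed (the callback name and status variable are
hypothetical; the signature mirrors gluster_finish_aiocb below):

    #include <errno.h>
    #include <sys/types.h>

    struct glfs_fd;             /* opaque; from <glusterfs/api/glfs.h> */

    /* gluster completion callbacks get ret == -1 on error with the
     * real error code in errno, so record -errno; storing ret itself
     * would collapse every failure into -EPERM (-1). */
    static void example_finish_cbk(struct glfs_fd *fd, ssize_t ret, void *arg)
    {
        int *status = arg;

        if (ret < 0) {
            *status = -errno;   /* e.g. -ENOSPC */
        } else {
            *status = 0;        /* success */
        }
    }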

diff --git a/block/gluster.c b/block/gluster.c
index 51e154c..b0e2cc2 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -247,7 +247,7 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
     if (!ret || ret == acb->size) {
         acb->ret = 0; /* Success */
     } else if (ret < 0) {
-        acb->ret = ret; /* Read/Write failed */
+        acb->ret = -errno; /* Read/Write failed */
     } else {
         acb->ret = -EIO; /* Partial read/write - fail it */
     }
-- 
1.9.3


* [Qemu-devel] [PATCH for-2.6 v3 2/3] block/gluster: code movement of qemu_gluster_close()
  2016-04-19 14:16 [Qemu-devel] [PATCH for-2.6 v3 0/3] Bug fixes for gluster Jeff Cody
  2016-04-19 14:16 ` [Qemu-devel] [PATCH for-2.6 v3 1/3] block/gluster: return correct error value Jeff Cody
@ 2016-04-19 14:16 ` Jeff Cody
  2016-04-19 14:16 ` [Qemu-devel] [PATCH for-2.6 v3 3/3] block/gluster: prevent data loss after i/o error Jeff Cody
  2016-04-19 15:48 ` [Qemu-devel] [PATCH for-2.6 v3 0/3] Bug fixes for gluster Jeff Cody
From: Jeff Cody @ 2016-04-19 14:16 UTC
  To: qemu-block; +Cc: qemu-devel, kwolf, rwheeler, pkarampu, rgowdapp

Move qemu_gluster_close() further up in the file, in preparation
for the next patch, to avoid a forward declaration.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/gluster.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)
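
A note, not part of the patch: without this move, the next patch's
call to qemu_gluster_close() from the flush error path would need a
forward declaration near the top of block/gluster.c, along the lines
of:

    static void qemu_gluster_close(BlockDriverState *bs);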

diff --git a/block/gluster.c b/block/gluster.c
index b0e2cc2..d9aace6 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -589,6 +589,17 @@ static coroutine_fn int qemu_gluster_co_writev(BlockDriverState *bs,
     return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 1);
 }
 
+static void qemu_gluster_close(BlockDriverState *bs)
+{
+    BDRVGlusterState *s = bs->opaque;
+
+    if (s->fd) {
+        glfs_close(s->fd);
+        s->fd = NULL;
+    }
+    glfs_fini(s->glfs);
+}
+
 static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
 {
     int ret;
@@ -661,17 +672,6 @@ static int64_t qemu_gluster_allocated_file_size(BlockDriverState *bs)
     }
 }
 
-static void qemu_gluster_close(BlockDriverState *bs)
-{
-    BDRVGlusterState *s = bs->opaque;
-
-    if (s->fd) {
-        glfs_close(s->fd);
-        s->fd = NULL;
-    }
-    glfs_fini(s->glfs);
-}
-
 static int qemu_gluster_has_zero_init(BlockDriverState *bs)
 {
     /* GlusterFS volume could be backed by a block device */
-- 
1.9.3


* [Qemu-devel] [PATCH for-2.6 v3 3/3] block/gluster: prevent data loss after i/o error
  2016-04-19 14:16 [Qemu-devel] [PATCH for-2.6 v3 0/3] Bug fixes for gluster Jeff Cody
  2016-04-19 14:16 ` [Qemu-devel] [PATCH for-2.6 v3 1/3] block/gluster: return correct error value Jeff Cody
  2016-04-19 14:16 ` [Qemu-devel] [PATCH for-2.6 v3 2/3] block/gluster: code movement of qemu_gluster_close() Jeff Cody
@ 2016-04-19 14:16 ` Jeff Cody
  2016-04-19 15:48 ` [Qemu-devel] [PATCH for-2.6 v3 0/3] Bug fixes for gluster Jeff Cody
From: Jeff Cody @ 2016-04-19 14:16 UTC
  To: qemu-block; +Cc: qemu-devel, kwolf, rwheeler, pkarampu, rgowdapp

Upon receiving an I/O error after an fsync, by default gluster will
dump its cache.  However, QEMU will retry the fsync, which is especially
useful when encountering errors such as ENOSPC when using the werror=stop
option.  When using caching with gluster, however, the last written data
would be lost upon encountering ENOSPC.  Setting the write-behind
xlator option 'resync-failed-syncs-after-fsync' should cause gluster
to retain the cached data after a failed fsync, so that ENOSPC and other
transient errors are recoverable.

Unfortunately, we have no way of knowing if the
'resync-failed-syncs-after-fsync' xlator option is supported, so for now
close the fd and set the BDS driver to NULL upon fsync error.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/gluster.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 configure       |  8 ++++++++
 2 files changed, 60 insertions(+), 1 deletion(-)
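
A note, not part of the patch: a hypothetical invocation of the
scenario this protects (host, volume, and image name are
placeholders):

    qemu-system-x86_64 \
        -drive file=gluster://host/volume/disk.qcow2,format=qcow2,werror=stop

With werror=stop, the guest is paused on ENOSPC rather than seeing the
error; once space is freed on the volume, the guest can be resumed and
the failed request retried.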

diff --git a/block/gluster.c b/block/gluster.c
index d9aace6..a8aaacf 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -314,6 +314,23 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
         goto out;
     }
 
+#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
+    /* Without this, if fsync fails for a recoverable reason (for instance,
+     * ENOSPC), gluster will dump its cache, preventing retries.  This means
+     * almost certain data loss.  Not all gluster versions support the
+     * 'resync-failed-syncs-after-fsync' key value, but there is no way to
+     * discover at runtime whether it is supported (this API returns success
+     * for unknown key/value pairs) */
+    ret = glfs_set_xlator_option(s->glfs, "*-write-behind",
+                                          "resync-failed-syncs-after-fsync",
+                                          "on");
+    if (ret < 0) {
+        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
+        ret = -errno;
+        goto out;
+    }
+#endif
+
     qemu_gluster_parse_flags(bdrv_flags, &open_flags);
 
     s->fd = glfs_open(s->glfs, gconf->image, open_flags);
@@ -366,6 +383,16 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
         goto exit;
     }
 
+#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
+    ret = glfs_set_xlator_option(reop_s->glfs, "*-write-behind",
+                                 "resync-failed-syncs-after-fsync", "on");
+    if (ret < 0) {
+        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
+        ret = -errno;
+        goto exit;
+    }
+#endif
+
     reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
     if (reop_s->fd == NULL) {
         /* reops->glfs will be cleaned up in _abort */
@@ -613,11 +640,35 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
 
     ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
     if (ret < 0) {
-        return -errno;
+        ret = -errno;
+        goto error;
     }
 
     qemu_coroutine_yield();
+    if (acb.ret < 0) {
+        ret = acb.ret;
+        goto error;
+    }
+
     return acb.ret;
+
+error:
+    /* Some versions of Gluster (3.5.6 -> 3.5.8?) will not retain their cache
+     * after an fsync failure, so we have no way of allowing the guest to safely
+     * continue.  Gluster versions prior to 3.5.6 don't retain the cache
+     * either, but will invalidate the fd on error, so this is again our only
+     * option.
+     *
+     * The 'resync-failed-syncs-after-fsync' xlator option for the
+     * write-behind cache will cause later gluster versions to retain their
+     * cache after error, so long as the fd remains open.  However, we
+     * currently have no way of knowing if this option is supported.
+     *
+     * TODO: Once gluster provides a way for us to determine if the option
+     * is supported, skip closing the fd and setting drv to NULL.  */
+    qemu_gluster_close(bs);
+    bs->drv = NULL;
+    return ret;
 }
 
 #ifdef CONFIG_GLUSTERFS_DISCARD
diff --git a/configure b/configure
index f1c307b..ab54f3c 100755
--- a/configure
+++ b/configure
@@ -298,6 +298,7 @@ coroutine=""
 coroutine_pool=""
 seccomp=""
 glusterfs=""
+glusterfs_xlator_opt="no"
 glusterfs_discard="no"
 glusterfs_zerofill="no"
 archipelago="no"
@@ -3400,6 +3401,9 @@ if test "$glusterfs" != "no" ; then
     glusterfs="yes"
     glusterfs_cflags=`$pkg_config --cflags glusterfs-api`
     glusterfs_libs=`$pkg_config --libs glusterfs-api`
+    if $pkg_config --atleast-version=4 glusterfs-api; then
+      glusterfs_xlator_opt="yes"
+    fi
     if $pkg_config --atleast-version=5 glusterfs-api; then
       glusterfs_discard="yes"
     fi
@@ -5342,6 +5346,10 @@ if test "$glusterfs" = "yes" ; then
   echo "GLUSTERFS_LIBS=$glusterfs_libs" >> $config_host_mak
 fi
 
+if test "$glusterfs_xlator_opt" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_XLATOR_OPT=y" >> $config_host_mak
+fi
+
 if test "$glusterfs_discard" = "yes" ; then
   echo "CONFIG_GLUSTERFS_DISCARD=y" >> $config_host_mak
 fi
-- 
1.9.3


* Re: [Qemu-devel] [PATCH for-2.6 v3 0/3] Bug fixes for gluster
  2016-04-19 14:16 [Qemu-devel] [PATCH for-2.6 v3 0/3] Bug fixes for gluster Jeff Cody
  2016-04-19 14:16 ` [Qemu-devel] [PATCH for-2.6 v3 3/3] block/gluster: prevent data loss after i/o error Jeff Cody
@ 2016-04-19 15:48 ` Jeff Cody
From: Jeff Cody @ 2016-04-19 15:48 UTC
  To: qemu-block; +Cc: qemu-devel

On Tue, Apr 19, 2016 at 10:16:18AM -0400, Jeff Cody wrote:
> Changes from v2:
> 
> Also check the error returned by the aio callback for
> fsync. (thanks Kevin)
> 
> 
> Bug fixes for gluster; the third patch prevents
> potential data loss when trying to recover from an
> otherwise recoverable error (such as ENOSPC).
> 
> The final patch closes the gluster fd and sets the
> protocol drv to NULL on fsync failure in gluster;
> we have no way of knowing which gluster versions
> support retaining the fsync cache on error, so until
> we do, the safest thing is to invalidate the drive.
> 
> Jeff Cody (3):
>   block/gluster: return correct error value
>   block/gluster: code movement of qemu_gluster_close()
>   block/gluster: prevent data loss after i/o error
> 
>  block/gluster.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++----------
>  configure       |  8 ++++++
>  2 files changed, 72 insertions(+), 13 deletions(-)
> 
> -- 
> 1.9.3
> 

Thanks,

Applied to my block branch:

git://github.com/codyprime/qemu-kvm-jtc.git block

-Jeff

