[Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster

qemu-devel.nongnu.org archive mirror

* [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
@ 2016-04-19 12:07 Jeff Cody
  2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 1/3] block/gluster: return correct error value Jeff Cody
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Jeff Cody @ 2016-04-19 12:07 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, kwolf, rwheeler, pkarampu, rgowdapp, ndevos

Bug fixes for gluster; the third patch prevents
potential data loss when trying to recover from
a recoverable error (such as ENOSPC).

The final patch closes the gluster fd and sets the
protocol drv to NULL on fsync failure in gluster;
we have no way of knowing which gluster versions
support retaining the fsync cache on error, so until
we do, the safest thing to do is invalidate the
drive.

Jeff Cody (3):
  block/gluster: return correct error value
  block/gluster: code movement of qemu_gluster_close()
  block/gluster: prevent data loss after i/o error

 block/gluster.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++-----------
 configure       |  8 +++++++
 2 files changed, 62 insertions(+), 12 deletions(-)

-- 
1.9.3

* [Qemu-devel] [PATCH for-2.6 v2 1/3] block/gluster: return correct error value
  2016-04-19 12:07 [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster Jeff Cody
@ 2016-04-19 12:07 ` Jeff Cody
  2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 2/3] block/gluster: code movement of qemu_gluster_close() Jeff Cody
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Jeff Cody @ 2016-04-19 12:07 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, kwolf, rwheeler, pkarampu, rgowdapp, ndevos

Upon error, gluster will call the aio callback function with a
ret value of -1, with errno set to the proper error value.  If
we set the acb->ret value to the return value in the callback,
every error is reported as -EPERM (EPERM is 1, so -1 reads as
-EPERM).  Instead, set it to -errno, the proper error result.

Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/gluster.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/gluster.c b/block/gluster.c
index 51e154c..b0e2cc2 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -247,7 +247,7 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
     if (!ret || ret == acb->size) {
         acb->ret = 0; /* Success */
     } else if (ret < 0) {
-        acb->ret = ret; /* Read/Write failed */
+        acb->ret = -errno; /* Read/Write failed */
     } else {
         acb->ret = -EIO; /* Partial read/write - fail it */
     }
-- 
1.9.3
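
For illustration only, the result mapping described in this patch can be
sketched as a standalone function. The helper name completion_to_result()
and its saved_errno parameter are hypothetical and not part of
block/gluster.c; the sketch assumes the callback receives -1 on failure with
the real cause in errno.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of block/gluster.c): translate the value
 * passed to an AIO completion callback into a negative-errno result.
 * `ret` is the callback's return value, `size` the number of bytes
 * requested, and `saved_errno` the errno captured when the callback ran.
 */
static int completion_to_result(ssize_t ret, size_t size, int saved_errno)
{
    if (ret == 0 || (ret > 0 && (size_t)ret == size)) {
        return 0;               /* success */
    }
    if (ret < 0) {
        return -saved_errno;    /* real cause, not -1 (which reads as -EPERM) */
    }
    return -EIO;                /* partial read/write: fail it */
}
```

With this mapping, a failed write that hit ENOSPC yields -ENOSPC instead of
-1, so a caller that stops on ENOSPC can tell the two apart.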

* [Qemu-devel] [PATCH for-2.6 v2 2/3] block/gluster: code movement of qemu_gluster_close()
  2016-04-19 12:07 [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster Jeff Cody
  2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 1/3] block/gluster: return correct error value Jeff Cody
@ 2016-04-19 12:07 ` Jeff Cody
  2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error Jeff Cody
  2016-04-19 12:18 ` [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster Ric Wheeler
  3 siblings, 0 replies; 16+ messages in thread
From: Jeff Cody @ 2016-04-19 12:07 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, kwolf, rwheeler, pkarampu, rgowdapp, ndevos

Move qemu_gluster_close() further up in the file, in preparation
for the next patch, to avoid a forward declaration.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/gluster.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index b0e2cc2..d9aace6 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -589,6 +589,17 @@ static coroutine_fn int qemu_gluster_co_writev(BlockDriverState *bs,
     return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 1);
 }
 
+static void qemu_gluster_close(BlockDriverState *bs)
+{
+    BDRVGlusterState *s = bs->opaque;
+
+    if (s->fd) {
+        glfs_close(s->fd);
+        s->fd = NULL;
+    }
+    glfs_fini(s->glfs);
+}
+
 static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
 {
     int ret;
@@ -661,17 +672,6 @@ static int64_t qemu_gluster_allocated_file_size(BlockDriverState *bs)
     }
 }
 
-static void qemu_gluster_close(BlockDriverState *bs)
-{
-    BDRVGlusterState *s = bs->opaque;
-
-    if (s->fd) {
-        glfs_close(s->fd);
-        s->fd = NULL;
-    }
-    glfs_fini(s->glfs);
-}
-
 static int qemu_gluster_has_zero_init(BlockDriverState *bs)
 {
     /* GlusterFS volume could be backed by a block device */
-- 
1.9.3

* [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error
  2016-04-19 12:07 [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster Jeff Cody
  2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 1/3] block/gluster: return correct error value Jeff Cody
  2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 2/3] block/gluster: code movement of qemu_gluster_close() Jeff Cody
@ 2016-04-19 12:07 ` Jeff Cody
  2016-04-19 12:27   ` Kevin Wolf
  2016-04-19 12:18 ` [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster Ric Wheeler
  3 siblings, 1 reply; 16+ messages in thread
From: Jeff Cody @ 2016-04-19 12:07 UTC (permalink / raw)
  To: qemu-block; +Cc: qemu-devel, kwolf, rwheeler, pkarampu, rgowdapp, ndevos

Upon receiving an I/O error after an fsync, by default gluster will
dump its cache.  However, QEMU will retry the fsync, which is especially
useful when encountering errors such as ENOSPC when using the werror=stop
option.  When using caching with gluster, however, the last written data
will be lost upon encountering ENOSPC.  Using the write-behind-cache
xlator option of 'resync-failed-syncs-after-fsync' should cause gluster
to retain the cached data after a failed fsync, so that ENOSPC and other
transient errors are recoverable.

Unfortunately, we have no way of knowing if the
'resync-failed-syncs-after-fsync' xlator option is supported, so for now
close the fd and set the BDS driver to NULL upon fsync error.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block/gluster.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 configure       |  8 ++++++++
 2 files changed, 50 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index d9aace6..ba33488 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -314,6 +314,23 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
         goto out;
     }
 
+#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
+    /* Without this, if fsync fails for a recoverable reason (for instance,
+     * ENOSPC), gluster will dump its cache, preventing retries.  This means
+     * almost certain data loss.  Not all gluster versions support the
+     * 'resync-failed-syncs-after-fsync' key value, but there is no way to
+     * discover during runtime if it is supported (this api returns success for
+     * unknown key/value pairs) */
+    ret = glfs_set_xlator_option(s->glfs, "*-write-behind",
+                                          "resync-failed-syncs-after-fsync",
+                                          "on");
+    if (ret < 0) {
+        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
+        ret = -errno;
+        goto out;
+    }
+#endif
+
     qemu_gluster_parse_flags(bdrv_flags, &open_flags);
 
     s->fd = glfs_open(s->glfs, gconf->image, open_flags);
@@ -366,6 +383,16 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
         goto exit;
     }
 
+#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
+    ret = glfs_set_xlator_option(reop_s->glfs, "*-write-behind",
+                                 "resync-failed-syncs-after-fsync", "on");
+    if (ret < 0) {
+        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
+        ret = -errno;
+        goto exit;
+    }
+#endif
+
     reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
     if (reop_s->fd == NULL) {
         /* reops->glfs will be cleaned up in _abort */
@@ -613,6 +640,21 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
 
     ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
     if (ret < 0) {
+        /* Some versions of Gluster (3.5.6 -> 3.5.8?) will not retain its
+         * cache after a fsync failure, so we have no way of allowing the guest
+         * to safely continue.  Gluster versions prior to 3.5.6 don't retain
+         * the cache either, but will invalidate the fd on error, so this is
+         * again our only option.
+         *
+         * The 'resync-failed-syncs-after-fsync' xlator option for the
+         * write-behind cache will cause later gluster versions to retain
+         * its cache after error, so long as the fd remains open.  However,
+         * we currently have no way of knowing if this option is supported.
+         *
+         * TODO: Once gluster provides a way for us to determine if the option
+         *       is supported, bypass the closure and setting drv to NULL.  */
+        qemu_gluster_close(bs);
+        bs->drv = NULL;
         return -errno;
     }
 
diff --git a/configure b/configure
index f1c307b..ab54f3c 100755
--- a/configure
+++ b/configure
@@ -298,6 +298,7 @@ coroutine=""
 coroutine_pool=""
 seccomp=""
 glusterfs=""
+glusterfs_xlator_opt="no"
 glusterfs_discard="no"
 glusterfs_zerofill="no"
 archipelago="no"
@@ -3400,6 +3401,9 @@ if test "$glusterfs" != "no" ; then
     glusterfs="yes"
     glusterfs_cflags=`$pkg_config --cflags glusterfs-api`
     glusterfs_libs=`$pkg_config --libs glusterfs-api`
+    if $pkg_config --atleast-version=4 glusterfs-api; then
+      glusterfs_xlator_opt="yes"
+    fi
     if $pkg_config --atleast-version=5 glusterfs-api; then
       glusterfs_discard="yes"
     fi
@@ -5342,6 +5346,10 @@ if test "$glusterfs" = "yes" ; then
   echo "GLUSTERFS_LIBS=$glusterfs_libs" >> $config_host_mak
 fi
 
+if test "$glusterfs_xlator_opt" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_XLATOR_OPT=y" >> $config_host_mak
+fi
+
 if test "$glusterfs_discard" = "yes" ; then
   echo "CONFIG_GLUSTERFS_DISCARD=y" >> $config_host_mak
 fi
-- 
1.9.3

* Re: [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error
  2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error Jeff Cody
@ 2016-04-19 12:27   ` Kevin Wolf
  2016-04-19 12:29     ` Jeff Cody
  0 siblings, 1 reply; 16+ messages in thread
From: Kevin Wolf @ 2016-04-19 12:27 UTC (permalink / raw)
  To: Jeff Cody; +Cc: qemu-block, qemu-devel, rwheeler, pkarampu, rgowdapp, ndevos

Am 19.04.2016 um 14:07 hat Jeff Cody geschrieben:
> Upon receiving an I/O error after an fsync, by default gluster will
> dump its cache.  However, QEMU will retry the fsync, which is especially
> useful when encountering errors such as ENOSPC when using the werror=stop
> option.  When using caching with gluster, however, the last written data
> will be lost upon encountering ENOSPC.  Using the write-behind-cache
> xlator option of 'resync-failed-syncs-after-fsync' should cause gluster
> to retain the cached data after a failed fsync, so that ENOSPC and other
> transient errors are recoverable.
> 
> Unfortunately, we have no way of knowing if the
> 'resync-failed-syncs-after-fsync' xlator option is supported, so for now
> close the fd and set the BDS driver to NULL upon fsync error.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
>  block/gluster.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  configure       |  8 ++++++++
>  2 files changed, 50 insertions(+)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index d9aace6..ba33488 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -314,6 +314,23 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
>          goto out;
>      }
>  
> +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> +    /* Without this, if fsync fails for a recoverable reason (for instance,
> +     * ENOSPC), gluster will dump its cache, preventing retries.  This means
> +     * almost certain data loss.  Not all gluster versions support the
> +     * 'resync-failed-syncs-after-fsync' key value, but there is no way to
> +     * discover during runtime if it is supported (this api returns success for
> +     * unknown key/value pairs) */
> +    ret = glfs_set_xlator_option(s->glfs, "*-write-behind",
> +                                          "resync-failed-syncs-after-fsync",
> +                                          "on");
> +    if (ret < 0) {
> +        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> +        ret = -errno;
> +        goto out;
> +    }
> +#endif
> +
>      qemu_gluster_parse_flags(bdrv_flags, &open_flags);
>  
>      s->fd = glfs_open(s->glfs, gconf->image, open_flags);
> @@ -366,6 +383,16 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
>          goto exit;
>      }
>  
> +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> +    ret = glfs_set_xlator_option(reop_s->glfs, "*-write-behind",
> +                                 "resync-failed-syncs-after-fsync", "on");
> +    if (ret < 0) {
> +        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> +        ret = -errno;
> +        goto exit;
> +    }
> +#endif
> +
>      reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
>      if (reop_s->fd == NULL) {
>          /* reops->glfs will be cleaned up in _abort */
> @@ -613,6 +640,21 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
>  
>      ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
>      if (ret < 0) {
> +        /* Some versions of Gluster (3.5.6 -> 3.5.8?) will not retain its
> +         * cache after a fsync failure, so we have no way of allowing the guest
> +         * to safely continue.  Gluster versions prior to 3.5.6 don't retain
> +         * the cache either, but will invalidate the fd on error, so this is
> +         * again our only option.
> +         *
> +         * The 'resync-failed-syncs-after-fsync' xlator option for the
> +         * write-behind cache will cause later gluster versions to retain
> +         * its cache after error, so long as the fd remains open.  However,
> +         * we currently have no way of knowing if this option is supported.
> +         *
> +         * TODO: Once gluster provides a way for us to determine if the option
> +         *       is supported, bypass the closure and setting drv to NULL.  */
> +        qemu_gluster_close(bs);
> +        bs->drv = NULL;
>          return -errno;
>      }

More context:

        qemu_coroutine_yield();
        return acb.ret;
    }

I would guess that acb.ret containing an error is the more common case.
We should probably invalidate the BDS in both cases (immediate failure
and callback with error code).

Kevin
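
A rough sketch of what invalidating in both cases could look like, using
hypothetical stand-in types rather than the real BlockDriverState and
BDRVGlusterState. The names fake_bds, fake_invalidate() and fake_flush() are
made up for illustration. The sketch also captures the error code before
cleanup, on the assumption that cleanup calls may overwrite errno.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver state; not QEMU types. */
struct fake_bds {
    bool fd_open;      /* models s->fd != NULL */
    bool has_driver;   /* models bs->drv != NULL */
};

/* Models qemu_gluster_close() followed by "bs->drv = NULL". */
static void fake_invalidate(struct fake_bds *bs)
{
    bs->fd_open = false;
    bs->has_driver = false;
    errno = 0;  /* simulate a cleanup call overwriting errno */
}

/*
 * submit_ret:   return value of the async submit call (< 0 on immediate failure)
 * submit_errno: errno observed right after a failed submit
 * acb_ret:      result delivered by the completion callback (negative errno)
 */
static int fake_flush(struct fake_bds *bs, int submit_ret, int submit_errno,
                      int acb_ret)
{
    int ret;

    if (submit_ret < 0) {
        ret = -submit_errno;   /* capture before any cleanup runs */
        goto error;
    }

    /* ...the coroutine would yield here until the callback fires... */
    ret = acb_ret;
    if (ret < 0) {
        goto error;
    }
    return ret;

error:
    /* Both failure paths end up here, so the device is always invalidated. */
    fake_invalidate(bs);
    return ret;
}
```

Funnelling both paths through one error label keeps the invalidation in a
single place, so neither path can skip it.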

* Re: [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error
  2016-04-19 12:27   ` Kevin Wolf
@ 2016-04-19 12:29     ` Jeff Cody
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Cody @ 2016-04-19 12:29 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-block, qemu-devel, rwheeler, pkarampu, rgowdapp, ndevos

On Tue, Apr 19, 2016 at 02:27:56PM +0200, Kevin Wolf wrote:
> Am 19.04.2016 um 14:07 hat Jeff Cody geschrieben:
> > Upon receiving an I/O error after an fsync, by default gluster will
> > dump its cache.  However, QEMU will retry the fsync, which is especially
> > useful when encountering errors such as ENOSPC when using the werror=stop
> > option.  When using caching with gluster, however, the last written data
> > will be lost upon encountering ENOSPC.  Using the write-behind-cache
> > xlator option of 'resync-failed-syncs-after-fsync' should cause gluster
> > to retain the cached data after a failed fsync, so that ENOSPC and other
> > transient errors are recoverable.
> > 
> > Unfortunately, we have no way of knowing if the
> > 'resync-failed-syncs-after-fsync' xlator option is supported, so for now
> > close the fd and set the BDS driver to NULL upon fsync error.
> > 
> > Signed-off-by: Jeff Cody <jcody@redhat.com>
> > ---
> >  block/gluster.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> >  configure       |  8 ++++++++
> >  2 files changed, 50 insertions(+)
> > 
> > diff --git a/block/gluster.c b/block/gluster.c
> > index d9aace6..ba33488 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -314,6 +314,23 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
> >          goto out;
> >      }
> >  
> > +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> > +    /* Without this, if fsync fails for a recoverable reason (for instance,
> > +     * ENOSPC), gluster will dump its cache, preventing retries.  This means
> > +     * almost certain data loss.  Not all gluster versions support the
> > +     * 'resync-failed-syncs-after-fsync' key value, but there is no way to
> > +     * discover during runtime if it is supported (this api returns success for
> > +     * unknown key/value pairs) */
> > +    ret = glfs_set_xlator_option(s->glfs, "*-write-behind",
> > +                                          "resync-failed-syncs-after-fsync",
> > +                                          "on");
> > +    if (ret < 0) {
> > +        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> > +        ret = -errno;
> > +        goto out;
> > +    }
> > +#endif
> > +
> >      qemu_gluster_parse_flags(bdrv_flags, &open_flags);
> >  
> >      s->fd = glfs_open(s->glfs, gconf->image, open_flags);
> > @@ -366,6 +383,16 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
> >          goto exit;
> >      }
> >  
> > +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> > +    ret = glfs_set_xlator_option(reop_s->glfs, "*-write-behind",
> > +                                 "resync-failed-syncs-after-fsync", "on");
> > +    if (ret < 0) {
> > +        error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> > +        ret = -errno;
> > +        goto exit;
> > +    }
> > +#endif
> > +
> >      reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags);
> >      if (reop_s->fd == NULL) {
> >          /* reops->glfs will be cleaned up in _abort */
> > @@ -613,6 +640,21 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs)
> >  
> >      ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb);
> >      if (ret < 0) {
> > +        /* Some versions of Gluster (3.5.6 -> 3.5.8?) will not retain its
> > +         * cache after a fsync failure, so we have no way of allowing the guest
> > +         * to safely continue.  Gluster versions prior to 3.5.6 don't retain
> > +         * the cache either, but will invalidate the fd on error, so this is
> > +         * again our only option.
> > +         *
> > +         * The 'resync-failed-syncs-after-fsync' xlator option for the
> > +         * write-behind cache will cause later gluster versions to retain
> > +         * its cache after error, so long as the fd remains open.  However,
> > +         * we currently have no way of knowing if this option is supported.
> > +         *
> > +         * TODO: Once gluster provides a way for us to determine if the option
> > +         *       is supported, bypass the closure and setting drv to NULL.  */
> > +        qemu_gluster_close(bs);
> > +        bs->drv = NULL;
> >          return -errno;
> >      }
> 
> More context:
> 
>         qemu_coroutine_yield();
>         return acb.ret;
>     }
> 
> I would guess that acb.ret containing an error is the more common case.
> We should probably invalidate the BDS in both cases (immediate failure
> and callback with error code).
>

Ah yes, indeed.  I'll do that now.

* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
  2016-04-19 12:07 [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster Jeff Cody
                   ` (2 preceding siblings ...)
  2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error Jeff Cody
@ 2016-04-19 12:18 ` Ric Wheeler
  2016-04-19 14:09   ` Jeff Cody
  3 siblings, 1 reply; 16+ messages in thread
From: Ric Wheeler @ 2016-04-19 12:18 UTC (permalink / raw)
  To: Jeff Cody, qemu-block
  Cc: qemu-devel, kwolf, pkarampu, rgowdapp, ndevos, Rik van Riel

On 04/19/2016 08:07 AM, Jeff Cody wrote:
> Bug fixes for gluster; third patch is to prevent
> a potential data loss when trying to recover from
> a recoverable error (such as ENOSPC).

Hi Jeff,

Just a note: I have been talking to some of the disk drive people here at LSF
(the kernel summit for file and storage people) and got a non-public
confirmation that individual storage devices (SATA or SCSI drives) can also
dump cache state when a synchronize cache command fails.  I also followed up
with Rik van Riel: in the page cache in general, when we fail to write back
dirty pages, they are simply marked "clean" (which effectively means that they
get dropped).

This is a long-winded way of saying that I think this scenario is not unique to
gluster: any failed fsync() to a file (or block device) might be an indication
of permanent data loss.

Regards,

Ric
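
As a minimal sketch of what that implies for a caller, the hypothetical
helper below (write_and_sync() is an illustrative name, not an existing API)
reports an fsync() failure as final instead of calling fsync() again. It
assumes the caller still holds its own copy of the data and would have to
rewrite it to recover.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a buffer and flush it. On fsync() failure
 * the error is returned as-is; calling fsync() again is not treated as a
 * recovery path, because the dirty pages may already have been dropped.
 * Returns 0 on success or a negative errno value.
 */
static int write_and_sync(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted before writing: try again */
            }
            return -errno;
        }
        p += n;
        len -= (size_t)n;
    }

    if (fsync(fd) < 0) {
        /* Possible data loss: the caller must rewrite from its own copy. */
        return -errno;
    }
    return 0;
}
```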

>
> The final patch closes the gluster fd and sets the
> protocol drv to NULL on fsync failure in gluster;
> we have no way of knowing what gluster versions
> support retaining fsync cache on error, so until
> we do the safest thing to do is invalidate the
> drive.
>
> Jeff Cody (3):
>    block/gluster: return correct error value
>    block/gluster: code movement of qemu_gluster_close()
>    block/gluster: prevent data loss after i/o error
>
>   block/gluster.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++-----------
>   configure       |  8 +++++++
>   2 files changed, 62 insertions(+), 12 deletions(-)
>

* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
  2016-04-19 12:18 ` [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster Ric Wheeler
@ 2016-04-19 14:09   ` Jeff Cody
  2016-04-20  1:56     ` Ric Wheeler
  2016-04-20  5:15     ` Raghavendra Gowdappa
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff Cody @ 2016-04-19 14:09 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: qemu-block, qemu-devel, kwolf, pkarampu, rgowdapp, ndevos,
	Rik van Riel

On Tue, Apr 19, 2016 at 08:18:39AM -0400, Ric Wheeler wrote:
> On 04/19/2016 08:07 AM, Jeff Cody wrote:
> >Bug fixes for gluster; third patch is to prevent
> >a potential data loss when trying to recover from
> >a recoverable error (such as ENOSPC).
> 
> Hi Jeff,
> 
> Just a note, I have been talking to some of the disk drive people
> here at LSF (the kernel summit for file and storage people) and got
> a non-public confirmation that individual storage devices (s-ata
> drives or scsi) can also dump cache state when a synchronize cache
> command fails.  Also followed up with Rik van Riel - in the page
> cache in general, when we fail to write back dirty pages, they are
> simply marked "clean" (which means effectively that they get
> dropped).
> 
> Long winded way of saying that I think that this scenario is not
> unique to gluster - any failed fsync() to a file (or block device)
> might be an indication of permanent data loss.
>

Ric,

Thanks.

I think you are right; we likely do need to address how QEMU handles fsync
failures across the board at some point (2.7?).  Another point to consider
is that QEMU is cross-platform, so not only do we have different protocols
and filesystems, but also different underlying host OSes.  It is likely, as
you said, that there are other non-gluster scenarios where we have
non-recoverable data loss on fsync failure.

With Gluster specifically, if we look at just ENOSPC, does this mean that
even if Gluster retains its cache after fsync failure, we still won't know
that there was no permanent data loss?  If we hit ENOSPC during an fsync, I
presume that means Gluster itself may have encountered ENOSPC from an fsync to
the underlying storage.  In that case, does Gluster just pass the error up
the stack?

Jeff

> 
> >
> >The final patch closes the gluster fd and sets the
> >protocol drv to NULL on fsync failure in gluster;
> >we have no way of knowing what gluster versions
> >support retaining fsync cache on error, so until
> >we do the safest thing to do is invalidate the
> >drive.
> >
> >Jeff Cody (3):
> >   block/gluster: return correct error value
> >   block/gluster: code movement of qemu_gluster_close()
> >   block/gluster: prevent data loss after i/o error
> >
> >  block/gluster.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++-----------
> >  configure       |  8 +++++++
> >  2 files changed, 62 insertions(+), 12 deletions(-)
> >
> 

* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
  2016-04-19 14:09   ` Jeff Cody
@ 2016-04-20  1:56     ` Ric Wheeler
  2016-04-20  9:24       ` Kevin Wolf
  2016-04-20  5:15     ` Raghavendra Gowdappa
  1 sibling, 1 reply; 16+ messages in thread
From: Ric Wheeler @ 2016-04-20  1:56 UTC (permalink / raw)
  To: Jeff Cody
  Cc: qemu-block, qemu-devel, kwolf, pkarampu, rgowdapp, ndevos,
	Rik van Riel

On 04/19/2016 10:09 AM, Jeff Cody wrote:
> On Tue, Apr 19, 2016 at 08:18:39AM -0400, Ric Wheeler wrote:
>> On 04/19/2016 08:07 AM, Jeff Cody wrote:
>>> Bug fixes for gluster; third patch is to prevent
>>> a potential data loss when trying to recover from
>>> a recoverable error (such as ENOSPC).
>> Hi Jeff,
>>
>> Just a note, I have been talking to some of the disk drive people
>> here at LSF (the kernel summit for file and storage people) and got
>> a non-public confirmation that individual storage devices (s-ata
>> drives or scsi) can also dump cache state when a synchronize cache
>> command fails.  Also followed up with Rik van Riel - in the page
>> cache in general, when we fail to write back dirty pages, they are
>> simply marked "clean" (which means effectively that they get
>> dropped).
>>
>> Long winded way of saying that I think that this scenario is not
>> unique to gluster - any failed fsync() to a file (or block device)
>> might be an indication of permanent data loss.
>>
> Ric,
>
> Thanks.
>
> I think you are right, we likely do need to address how QEMU handles fsync
> failures across the board in QEMU at some point (2.7?).  Another point to
> consider is that QEMU is cross-platform - so not only do we have different
> protocols, and filesystems, but also different underlying host OSes as well.
> It is likely, like you said, that there are other non-gluster scenarios where
> we have non-recoverable data loss on fsync failure.
>
> With Gluster specifically, if we look at just ENOSPC, does this mean that
> even if Gluster retains its cache after fsync failure, we still won't know
> that there was no permanent data loss?  If we hit ENOSPC during an fsync, I
> presume that means Gluster itself may have encountered ENOSPC from a fsync to
> the underlying storage.  In that case, does Gluster just pass the error up
> the stack?
>
> Jeff

I still worry that in many non-gluster situations we will have permanent data
loss here. Specifically, given the way the page cache works, if we fail to write
back cached data *at any time*, a future fsync() will get a failure.

That failure could be because of a thinly provisioned backing store, but in the
interim, the page cache is free to drop the pages that had failed. In effect, we
end up with data loss in part or in whole without a way to detect which bits got
dropped.

Note that this is not a gluster issue; it applies to any file system on top of
thinly provisioned storage (e.g., we would see this with xfs or ext4 on thin
storage).  In effect, if gluster has written the data back to xfs and that is on
top of a thinly provisioned target, the kernel might drop that data before you
can try an fsync again. Even if you retry the fsync(), the pages are marked
clean, so they will not be pushed back to storage on that second fsync().

The same issue arises with link loss: if we lose the connection to a storage
target, it is likely to take time to detect that, and more time to reconnect. In
the interim, any page cache data is very likely to get dropped under memory
pressure.

In both of these cases, fsync() failure is effectively a signal that there is a
high chance data has already been lost. A retry will not save the day.

At LSF/MM today, we discussed an option that would allow the page cache to hang
on to data (for retryable errors only, for example) so that this would not
happen. The impact of this is also potentially huge (page cache/physical memory
could be exhausted while waiting for an admin to fix the issue), so it would
have to be a non-default option.

I think that we will need some discussions with the kernel memory management
team (and some storage kernel people) to see what seems reasonable here.

Regards,

Ric

>
>>> The final patch closes the gluster fd and sets the
>>> protocol drv to NULL on fsync failure in gluster;
>>> we have no way of knowing what gluster versions
>>> support retaining fsync cache on error, so until
>>> we do the safest thing to do is invalidate the
>>> drive.
>>>
>>> Jeff Cody (3):
>>>    block/gluster: return correct error value
>>>    block/gluster: code movement of qemu_gluster_close()
>>>    block/gluster: prevent data loss after i/o error
>>>
>>>   block/gluster.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++-----------
>>>   configure       |  8 +++++++
>>>   2 files changed, 62 insertions(+), 12 deletions(-)
>>>

* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
  2016-04-20  1:56     ` Ric Wheeler
@ 2016-04-20  9:24       ` Kevin Wolf
  2016-04-20 10:40         ` Ric Wheeler
  0 siblings, 1 reply; 16+ messages in thread
From: Kevin Wolf @ 2016-04-20  9:24 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: Jeff Cody, qemu-block, qemu-devel, pkarampu, rgowdapp, ndevos,
	Rik van Riel

Am 20.04.2016 um 03:56 hat Ric Wheeler geschrieben:
> On 04/19/2016 10:09 AM, Jeff Cody wrote:
> >On Tue, Apr 19, 2016 at 08:18:39AM -0400, Ric Wheeler wrote:
> >>On 04/19/2016 08:07 AM, Jeff Cody wrote:
> >>>Bug fixes for gluster; third patch is to prevent
> >>>a potential data loss when trying to recover from
> >>>a recoverable error (such as ENOSPC).
> >>Hi Jeff,
> >>
> >>Just a note, I have been talking to some of the disk drive people
> >>here at LSF (the kernel summit for file and storage people) and got
> >>a non-public confirmation that individual storage devices (s-ata
> >>drives or scsi) can also dump cache state when a synchronize cache
> >>command fails.  Also followed up with Rik van Riel - in the page
> >>cache in general, when we fail to write back dirty pages, they are
> >>simply marked "clean" (which means effectively that they get
> >>dropped).
> >>
> >>Long winded way of saying that I think that this scenario is not
> >>unique to gluster - any failed fsync() to a file (or block device)
> >>might be an indication of permanent data loss.
> >>
> >Ric,
> >
> >Thanks.
> >
> >I think you are right, we likely do need to address how QEMU handles fsync
> >failures across the board in QEMU at some point (2.7?).  Another point to
> >consider is that QEMU is cross-platform - so not only do we have different
> >protocols, and filesystems, but also different underlying host OSes as well.
> >It is likely, like you said, that there are other non-gluster scenarios where
> >we have non-recoverable data loss on fsync failure.
> >
> >With Gluster specifically, if we look at just ENOSPC, does this mean that
> >even if Gluster retains its cache after fsync failure, we still won't know
> >that there was no permanent data loss?  If we hit ENOSPC during an fsync, I
> >presume that means Gluster itself may have encountered ENOSPC from a fsync to
> >the underlying storage.  In that case, does Gluster just pass the error up
> >the stack?
> >
> >Jeff
> 
> I still worry that in many non-gluster situations we will have
> permanent data loss here. Specifically, the way the page cache
> works, if we fail to write back cached data *at any time*, a future
> fsync() will get a failure.

And this is actually what saves the semantic correctness. If you threw
away data, any following fsync() must fail. This is of course
inconvenient because you won't be able to resume a VM that is configured
to stop on errors, and it means some data loss, but it's safe because we
never tell the guest that the data is on disk when it really isn't.

gluster's behaviour (without resync-failed-syncs-after-fsync set) is
different, if I understand correctly. It will throw away the data and
then happily report success on the next fsync() call. And this is what
causes not only data loss, but corruption.

[ Hm, or having read what's below... Did I misunderstand and Linux
  returns failure only for a single fsync() and on the next one it
  returns success again? That would be bad. ]

> That failure could be because of a thinly provisioned backing store,
> but in the interim, the page cache is free to drop the pages that
> had failed. In effect, we end up with data loss in part or in whole
> without a way to detect which bits got dropped.
> 
> Note that this is not a gluster issue, this is for any file system
> on top of thinly provisioned storage (i.e., we would see this with
> xfs on thin storage or ext4 on thin storage).  In effect, if gluster
> has written the data back to xfs and that is on top of a thinly
> provisioned target, the kernel might drop that data before you can
> try an fsync again. Even if you retry the fsync(), the pages are
> marked clean so they will not be pushed back to storage on that
> second fsync().

I'm wondering... Marking the page clean means that it can be evicted
from the cache, right? Which happens whenever something more useful can
be done with the memory, i.e. possibly at any time. Does this mean that
two consecutive reads of the same block can return different data even
though no process has written to the file in between?

Also, O_DIRECT bypasses the problem, right? With O_DIRECT, the write
request itself would already fail, not only the fsync(). We recommend
that for production environments anyway.

> Same issue with link loss - if we lose connection to a storage
> target, it is likely to take time to detect that, more time to
> reconnect. In the interim, any page cache data is very likely to get
> dropped under memory pressure.
> 
> In both of these cases, fsync() failure is effectively a signal of a
> high chance of data that has been already lost. A retry will not
> save the day.
>
> At LSF/MM today, we discussed an option that would allow the page
> cache to hang on to data - for re-tryable errors only for example -
> so that this would not happen. The impact of this is also
> potentially huge (page cache/physical memory could be exhausted
> while waiting for an admin to fix the issue) so it would have to be
> a non-default option.

Is memory pressure the most common case, though?

The odd effect that I see is that calling fsync() could actually make
data less safe than it was if the call fails. With the kernel marking
the pages clean on failure, instead of evicting "really clean" pages, we
can now evict "dirty, but failed writeout" pages even without any real
memory pressure, just because they can't be distinguished any more. Or
maybe they aren't even evicted, but the admin fixes the problem and we
could now write them to the disk if only they were still marked dirty
and wouldn't be ignored in the writeout.

I'm sure there are solutions that are more intelligent than the extremes
of "mark clean on error" and "keep failed pages indefinitely" and that
cover a large part of use cases where qemu wants to resume a VM after a
failure (for local files perhaps most commonly resuming after ENOSPC).

Even just evicting pages immediately on a failure would probably be an
improvement because reads would then be consistent. And keeping the data
around until we *really* need memory might solve the problem for all
practical purposes. If we do eventually need the memory and throw away
data, fsync() consistently returning an error after throwing away data
is still safe, but we have a much better behaviour in the average case.

> I think that we will need some discussions with the kernel memory
> management team (and some storage kernel people) to see what seems
> reasonable here.

It's a good discussion to have, but for the network protocols (like with
gluster) we tend to use the native libraries and don't even go through
the kernel page cache. So I think we shouldn't stop discussing the
semantics of these protocols and APIs while talking about the kernel
page cache.

Network protocols are also where errors like "network is down" become
more relevant, so if anything, we want to have better error recovery
than on local files there.

Kevin


* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
From: Ric Wheeler @ 2016-04-20 10:40 UTC (permalink / raw)
  To: Kevin Wolf
  Cc: Jeff Cody, qemu-block, qemu-devel, pkarampu, rgowdapp, ndevos,
	Rik van Riel

On 04/20/2016 05:24 AM, Kevin Wolf wrote:
> On 20.04.2016 at 03:56, Ric Wheeler wrote:
>> On 04/19/2016 10:09 AM, Jeff Cody wrote:
>>> On Tue, Apr 19, 2016 at 08:18:39AM -0400, Ric Wheeler wrote:
>>>> On 04/19/2016 08:07 AM, Jeff Cody wrote:
>>>>> Bug fixes for gluster; third patch is to prevent
>>>>> a potential data loss when trying to recover from
>>>>> a recoverable error (such as ENOSPC).
>>>> Hi Jeff,
>>>>
>>>> Just a note, I have been talking to some of the disk drive people
>>>> here at LSF (the kernel summit for file and storage people) and got
>>>> a non-public confirmation that individual storage devices (s-ata
>>>> drives or scsi) can also dump cache state when a synchronize cache
>>>> command fails.  Also followed up with Rik van Riel - in the page
>>>> cache in general, when we fail to write back dirty pages, they are
>>>> simply marked "clean" (which means effectively that they get
>>>> dropped).
>>>>
>>>> Long winded way of saying that I think that this scenario is not
>>>> unique to gluster - any failed fsync() to a file (or block device)
>>>> might be an indication of permanent data loss.
>>>>
>>> Ric,
>>>
>>> Thanks.
>>>
>>> I think you are right, we likely do need to address how QEMU handles fsync
>>> failures across the board in QEMU at some point (2.7?).  Another point to
>>> consider is that QEMU is cross-platform - so not only do we have different
>>> protocols, and filesystems, but also different underlying host OSes as well.
>>> It is likely, like you said, that there are other non-gluster scenarios where
>>> we have non-recoverable data loss on fsync failure.
>>>
>>> With Gluster specifically, if we look at just ENOSPC, does this mean that
>>> even if Gluster retains its cache after fsync failure, we still won't know
>>> that there was no permanent data loss?  If we hit ENOSPC during an fsync, I
>>> presume that means Gluster itself may have encountered ENOSPC from a fsync to
>>> the underlying storage.  In that case, does Gluster just pass the error up
>>> the stack?
>>>
>>> Jeff
>> I still worry that in many non-gluster situations we will have
>> permanent data loss here. Specifically, the way the page cache
>> works, if we fail to write back cached data *at any time*, a future
>> fsync() will get a failure.
> And this is actually what saves the semantic correctness. If you threw
> away data, any following fsync() must fail. This is of course
> inconvenient because you won't be able to resume a VM that is configured
> to stop on errors, and it means some data loss, but it's safe because we
> never tell the guest that the data is on disk when it really isn't.
>
> gluster's behaviour (without resync-failed-syncs-after-fsync set) is
> different, if I understand correctly. It will throw away the data and
> then happily report success on the next fsync() call. And this is what
> causes not only data loss, but corruption.

Yes, that makes sense to me - the kernel will remember that it could not write 
data back from the page cache and the future fsync() will see an error.

>
> [ Hm, or having read what's below... Did I misunderstand and Linux
>    returns failure only for a single fsync() and on the next one it
>    returns success again? That would be bad. ]

I would need to think through that scenario with the memory management people to 
see if that could happen.

>> That failure could be because of a thinly provisioned backing store,
>> but in the interim, the page cache is free to drop the pages that
>> had failed. In effect, we end up with data loss in part or in whole
>> without a way to detect which bits got dropped.
>>
>> Note that this is not a gluster issue, this is for any file system
>> on top of thinly provisioned storage (i.e., we would see this with
>> xfs on thin storage or ext4 on thin storage).  In effect, if gluster
>> has written the data back to xfs and that is on top of a thinly
>> provisioned target, the kernel might drop that data before you can
>> try an fsync again. Even if you retry the fsync(), the pages are
>> marked clean so they will not be pushed back to storage on that
>> second fsync().
> I'm wondering... Marking the page clean means that it can be evicted
> from the cache, right? Which happens whenever something more useful can
> be done with the memory, i.e. possibly at any time. Does this mean that
> two consecutive reads of the same block can return different data even
> though no process has written to the file in between?

We should tease this out with a careful review of the behavior, but I think that
it could happen.

Specifically,

Time 0: File has pattern A at offset 0. Any reads at this point see pattern A

Time 1: Write pattern B to offset 0. Reads now see pattern B.

Time 2: Run out of space on the backing store (before the data has been written 
back)

Time 3: Do an fsync() *OR* have the page cache fail to write back that page

Time 4: Under memory pressure, the page which was marked clean, is dropped

Time 5: Read offset 0 again - do we now see pattern A again? Or an IO error?
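
The timeline above can be sketched as a toy model. This is not kernel
code: ToyPageCache, its fields and the disk_full flag are hypothetical
names, and the mark-clean-on-failure policy is simply the behaviour
described in this thread, assumed here rather than verified against any
kernel version. Within this model, the read at Time 5 falls through to
the disk and sees pattern A again.

```python
class ToyPageCache:
    """Toy model of one file behind a write-back cache (not kernel code)."""

    def __init__(self, disk):
        self.disk = dict(disk)   # offset -> data that is really on stable storage
        self.cache = {}          # offset -> (data, dirty flag)
        self.disk_full = False   # stands in for ENOSPC on a thin backing store

    def write(self, offset, data):
        # Buffered write: only the cache is updated and the page becomes dirty.
        self.cache[offset] = (data, True)

    def read(self, offset):
        # Serve from the cache if the page is present, else fall through to disk.
        if offset in self.cache:
            return self.cache[offset][0]
        return self.disk[offset]

    def fsync(self):
        # Try to write back every dirty page. On failure the page is marked
        # clean anyway (the assumed policy) and the caller gets an error.
        ok = True
        for offset, (data, dirty) in list(self.cache.items()):
            if not dirty:
                continue
            if self.disk_full:
                ok = False
            else:
                self.disk[offset] = data
            self.cache[offset] = (data, False)
        return ok

    def memory_pressure(self):
        # Clean pages may be evicted at any time; dirty pages are kept.
        self.cache = {o: p for o, p in self.cache.items() if p[1]}


def run_timeline():
    seen = []
    pc = ToyPageCache({0: "A"})
    seen.append(pc.read(0))   # Time 0: pattern A
    pc.write(0, "B")
    seen.append(pc.read(0))   # Time 1: pattern B from the cache
    pc.disk_full = True       # Time 2: backing store runs out of space
    fsync_ok = pc.fsync()     # Time 3: write-back fails, page marked clean
    pc.memory_pressure()      # Time 4: the "clean" page is dropped
    seen.append(pc.read(0))   # Time 5: read goes to disk
    return seen, fsync_ok
```

Whether a real kernel returns pattern A or an I/O error at Time 5 is
exactly the open question; the model only shows that pattern A is what
follows from "mark clean on error" plus eviction.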
>
> Also, O_DIRECT bypasses the problem, right? In that already the write
> request would fail there, not only the fsync(). We recommend that for
> production environments anyway.

O_DIRECT bypasses the page cache, but that data is allowed to be held in a 
volatile write cache (say in a disk's write cache) until the target device sees 
an fsync().

The safest (and horribly slow) way to be 100% safe is to write with O_DIRECT|O_SYNC,
which bypasses the page cache and effectively sends a cache flush after each IO.

I assume most applications call fsync() after O_DIRECT writes at more strategic
times (or don't know about this behavior).
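
A minimal sketch of the two strategies, in Python for brevity.
write_sync_each and write_then_fsync are hypothetical helper names;
os.O_SYNC and os.fsync() are the standard wrappers around the POSIX
calls. O_DIRECT is left out on purpose, because it additionally needs
aligned buffers and is not supported on every filesystem.

```python
import os


def write_sync_each(path, chunks):
    """Open with O_SYNC so each write() returns only after the data is flushed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o600)
    try:
        for chunk in chunks:
            os.write(fd, chunk)   # an I/O error surfaces here as OSError
    finally:
        os.close(fd)


def write_then_fsync(path, chunks):
    """Buffered writes followed by a single fsync() at a strategic point."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for chunk in chunks:
            os.write(fd, chunk)
        # A failure here (OSError) means some earlier write may already be
        # lost; per this thread, blindly retrying is not a safe recovery.
        os.fsync(fd)
    finally:
        os.close(fd)
```

The first variant pays for a flush on every write; the second reports
errors later, and then only says that something since the last
successful fsync() failed, not which write.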
>
>> Same issue with link loss - if we lose connection to a storage
>> target, it is likely to take time to detect that, more time to
>> reconnect. In the interim, any page cache data is very likely to get
>> dropped under memory pressure.
>>
>> In both of these cases, fsync() failure is effectively a signal of a
>> high chance of data that has been already lost. A retry will not
>> save the day.
>>
>> At LSF/MM today, we discussed an option that would allow the page
>> cache to hang on to data - for re-tryable errors only for example -
>> so that this would not happen. The impact of this is also
>> potentially huge (page cache/physical memory could be exhausted
>> while waiting for an admin to fix the issue) so it would have to be
>> a non-default option.
> Is memory pressure the most common case, though?

I think it really depends on the type of storage device we have under us.

>
> The odd effect that I see is that calling fsync() could actually make
> data less safe than it was if the call fails. With the kernel marking
> the pages clean on failure, instead of evicting "really clean" pages, we
> can now evict "dirty, but failed writeout" pages even without any real
> memory pressure, just because they can't be distinguished any more. Or
> maybe they aren't even evicted, but the admin fixes the problem and we
> could now write them to the disk if only they were still marked dirty
> and wouldn't be ignored in the writeout.

fsync() is just the messenger that something bad happened - it is always better 
to know that we lost data since the last fsync() call rather than not know, correct?

Keep in mind that data will have this issue any time memory pressure (or some other
algorithm) causes data to be written back from the page cache, even if the
application has not used an fsync().

Even if the admin "fixes" the issues (adds more storage, kicks a fibre channel 
switch, re-inserts a disk), IO might have been dropped forever from the page cache.

>
> I'm sure there are solutions that are more intelligent than the extremes
> of "mark clean on error" and "keep failed pages indefinitely" and that
> cover a large part of use cases where qemu wants to resume a VM after a
> failure (for local files perhaps most commonly resuming after ENOSPC).
>
> Even just evicting pages immediately on a failure would probably be an
> improvement because reads would then be consistent. And keeping the data
> around until we *really* need memory might solve the problem for all
> practical purposes. If we do eventually need the memory and throw away
> data, fsync() consistently returning an error after throwing away data
> is still safe, but we have a much better behaviour in the average case.
>
>> I think that we will need some discussions with the kernel memory
>> management team (and some storage kernel people) to see what seems
>> reasonable here.
> It's a good discussion to have, but for the network protocols (like with
> gluster) we tend to use the native libraries and don't even go through
> the kernel page cache. So I think we shouldn't stop discussing the
> semantics of these protocols and APIs while talking about the kernel
> page cache.
>
> Network protocols are also where error like "network is down" become
> more relevant, so if anything, we want to have better error recovery
> than on local files there.

I agree that with gluster we can try various schemes pretty easily when the 
error appears because of something internal to gluster (like a network error to 
a remote gluster server) but we cannot shield applications from data loss when 
we are just the messenger for an error on the storage server's local storage stack.

This is an important discussion to work through though - not just for qemu, I 
think it has a lot of value for everyone.

Thanks!

Ric


* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
From: Kevin Wolf @ 2016-04-20 11:46 UTC (permalink / raw)
  To: Ric Wheeler
  Cc: Jeff Cody, qemu-block, qemu-devel, pkarampu, rgowdapp, ndevos,
	Rik van Riel

On 20.04.2016 at 12:40, Ric Wheeler wrote:
> On 04/20/2016 05:24 AM, Kevin Wolf wrote:
> >On 20.04.2016 at 03:56, Ric Wheeler wrote:
> >>On 04/19/2016 10:09 AM, Jeff Cody wrote:
> >>>On Tue, Apr 19, 2016 at 08:18:39AM -0400, Ric Wheeler wrote:
> >>I still worry that in many non-gluster situations we will have
> >>permanent data loss here. Specifically, the way the page cache
> >>works, if we fail to write back cached data *at any time*, a future
> >>fsync() will get a failure.
> >And this is actually what saves the semantic correctness. If you threw
> >away data, any following fsync() must fail. This is of course
> >inconvenient because you won't be able to resume a VM that is configured
> >to stop on errors, and it means some data loss, but it's safe because we
> >never tell the guest that the data is on disk when it really isn't.
> >
> >gluster's behaviour (without resync-failed-syncs-after-fsync set) is
> >different, if I understand correctly. It will throw away the data and
> >then happily report success on the next fsync() call. And this is what
> >causes not only data loss, but corruption.
> 
> Yes, that makes sense to me - the kernel will remember that it could
> not write data back from the page cache and the future fsync() will
> see an error.
> 
> >
> >[ Hm, or having read what's below... Did I misunderstand and Linux
> >   returns failure only for a single fsync() and on the next one it
> >   returns success again? That would be bad. ]
> 
> I would need to think through that scenario with the memory
> management people to see if that could happen.

Okay, please do. This is the fundamental assumption we make: If an
fsync() succeeds, *all* successfully completed writes are on disk, no
matter whether another fsync() failed in between. If they can't be
written to the disk (e.g. because the data was thrown away), no
subsequent fsync() can succeed any more.
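
To make the assumption concrete, here is a toy comparison of two error
reporting policies. ToyBackend and stop_fix_retry are hypothetical
names; the "forgetful" mode models gluster's behaviour only as it is
described in this thread, and the sequence mirrors a VM that stops on
error and retries the flush after the admin has fixed the problem.

```python
class ToyBackend:
    """Hypothetical backend whose write-back cache drops data on a failed flush."""

    def __init__(self, sticky_error):
        self.sticky_error = sticky_error  # True: keep failing once data was dropped
        self.pending = []                 # writes acknowledged but not yet stable
        self.stable = []                  # writes that reached stable storage
        self.broken = False               # simulated temporary failure (e.g. ENOSPC)
        self.data_dropped = False

    def write(self, data):
        # The write completes successfully from the caller's point of view.
        self.pending.append(data)

    def fsync(self):
        if self.sticky_error and self.data_dropped:
            return False                  # never claim success after losing data
        if self.broken:
            if self.pending:
                self.pending.clear()      # cache contents are thrown away
                self.data_dropped = True
            return False
        self.stable.extend(self.pending)
        self.pending.clear()
        return True


def stop_fix_retry(backend):
    """Write, hit a flush failure, let the admin fix it, then retry the flush."""
    backend.write("guest data")
    backend.broken = True
    first = backend.fsync()               # fails; the VM would stop here
    backend.broken = False                # admin frees up space
    second = backend.fsync()              # retried when the VM resumes
    return first, second, "guest data" in backend.stable
```

With sticky_error=True the retry fails and the guest is never told the
data is safe. With sticky_error=False the retry reports success although
"guest data" never reached stable storage, which is the case that breaks
the assumption stated above.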

> >>That failure could be because of a thinly provisioned backing store,
> >>but in the interim, the page cache is free to drop the pages that
> >>had failed. In effect, we end up with data loss in part or in whole
> >>without a way to detect which bits got dropped.
> >>
> >>Note that this is not a gluster issue, this is for any file system
> >>on top of thinly provisioned storage (i.e., we would see this with
> >>xfs on thin storage or ext4 on thin storage).  In effect, if gluster
> >>has written the data back to xfs and that is on top of a thinly
> >>provisioned target, the kernel might drop that data before you can
> >>try an fsync again. Even if you retry the fsync(), the pages are
> >>marked clean so they will not be pushed back to storage on that
> >>second fsync().
> >I'm wondering... Marking the page clean means that it can be evicted
> >from the cache, right? Which happens whenever something more useful can
> >be done with the memory, i.e. possibly at any time. Does this mean that
> >two consecutive reads of the same block can return different data even
> >though no process has written to the file in between?
> 
> This we should tease out with a careful review of the behavior, but
> I think that might be able to happen.
> 
> Specifically,
> 
> Time 0: File has pattern A at offset 0. Any reads at this point see pattern A
> 
> Time 1: Write pattern B to offset 0. Reads now see pattern B.
> 
> Time 2: Run out of space on the backing store (before the data has
> been written back)
> 
> Time 3: Do an fsync() *OR* have the page cache fail to write back that page
> 
> Time 4: Under memory pressure, the page which was marked clean, is dropped
> 
> Time 5: Read offset 0 again - do we now see pattern A again? Or an IO error?

Seeing pattern A again would certainly be surprising for programs.
Probably worth checking what really happens.

> >Also, O_DIRECT bypasses the problem, right? In that already the write
> >request would fail there, not only the fsync(). We recommend that for
> >production environments anyway.
> 
> O_DIRECT bypasses the page cache, but that data is allowed to be
> held in a volatile write cache (say in a disk's write cache) until
> the target device sees an fsync().
> 
> The safest (and horribly slow way) to be 100% safe is to write
> O_DIRECT|O_SYNC which bypasses the page cache and sends effectively
> a cache flush after each IO.
> 
> Most applications use fsync() after O_DIRECT at more strategic times
> though I assume (or don't know about this behavior).

qemu can be configured to flush after each write request, but for
obvious reasons that's not something you want to use if you don't have
to.

Anyway, disks are yet another layer, and I would guess that flush
failures become less and less likely to be temporary and recoverable the
further you go down the stack. Failing for good when the disk is broken
is fine, as far as I am concerned. Doing the same because the network
had a hiccup for a few seconds is not.

> >>Same issue with link loss - if we lose connection to a storage
> >>target, it is likely to take time to detect that, more time to
> >>reconnect. In the interim, any page cache data is very likely to get
> >>dropped under memory pressure.
> >>
> >>In both of these cases, fsync() failure is effectively a signal of a
> >>high chance of data that has been already lost. A retry will not
> >>save the day.
> >>
> >>At LSF/MM today, we discussed an option that would allow the page
> >>cache to hang on to data - for re-tryable errors only for example -
> >>so that this would not happen. The impact of this is also
> >>potentially huge (page cache/physical memory could be exhausted
> >>while waiting for an admin to fix the issue) so it would have to be
> >>a non-default option.
> >Is memory pressure the most common case, though?
> 
> I think it really depends on the type of storage device we have under us.
> 
> >
> >The odd effect that I see is that calling fsync() could actually make
> >data less safe than it was if the call fails. With the kernel marking
> >the pages clean on failure, instead of evicting "really clean" pages, we
> >can now evict "dirty, but failed writeout" pages even without any real
> >memory pressure, just because they can't be distinguished any more. Or
> >maybe they aren't even evicted, but the admin fixes the problem and we
> >could now write them to the disk if only they were still marked dirty
> >and wouldn't be ignored in the writeout.
> 
> fsync() is just the messenger that something bad happened - it is
> always better to know that we lost data since the last fsync() call
> rather than not know, correct?
> 
> Keep in mind that data will have this issue any time memory pressure
> (or other algorithms) cause data to be written back from the page
> cache, even if the application has not used an fsync().
> 
> Even if the admin "fixes" the issues (adds more storage, kicks a
> fibre channel switch, re-inserts a disk), IO might have been dropped
> forever from the page cache.

Yes. We can't recover in 100% of the cases. In some cases, like when
write failure and memory pressure come together, we may have lost. We
should probably just accept that and concentrate on improving the
average case.

My point is just that if there is no memory pressure (>90%?), we
shouldn't make the situation worse than it was. In this case, fsync()
wasn't only the messenger that a write failed, but it is what caused the
write to happen at this specific time in the first place. If we hadn't
called it, and the issue were fixed before memory pressure caused the
page to be written back, we might not have suffered data loss.

In other words, calling fsync() was harmful in this situation. And that
certainly shouldn't be the case.

> >I'm sure there are solutions that are more intelligent than the extremes
> >of "mark clean on error" and "keep failed pages indefinitely" and that
> >cover a large part of use cases where qemu wants to resume a VM after a
> >failure (for local files perhaps most commonly resuming after ENOSPC).
> >
> >Even just evicting pages immediately on a failure would probably be an
> >improvement because reads would then be consistent. And keeping the data
> >around until we *really* need memory might solve the problem for all
> >practical purposes. If we do eventually need the memory and throw away
> >data, fsync() consistently returning an error after throwing away data
> >is still safe, but we have a much better behaviour in the average case.
> >
> >>I think that we will need some discussions with the kernel memory
> >>management team (and some storage kernel people) to see what seems
> >>reasonable here.
> >It's a good discussion to have, but for the network protocols (like with
> >gluster) we tend to use the native libraries and don't even go through
> >the kernel page cache. So I think we shouldn't stop discussing the
> >semantics of these protocols and APIs while talking about the kernel
> >page cache.
> >
> >Network protocols are also where error like "network is down" become
> >more relevant, so if anything, we want to have better error recovery
> >than on local files there.
> 
> I agree that with gluster we can try various schemes pretty easily
> when the error appears because of something internal to gluster
> (like a network error to a remote gluster server) but we cannot
> shield applications from data loss when we are just the messenger
> for an error on the storage servers local storage stack.

Yes. In some cases, we will always have to tell the user "sorry,
something went wrong, your data is gone". But I think in most cases we
can do better than that.

> This is an important discussion to work through though - not just
> for qemu, I think it has a lot of value for everyone.

I would think so, yes.

Kevin


* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
From: Rik van Riel @ 2016-04-20 18:38 UTC (permalink / raw)
  To: Kevin Wolf, Ric Wheeler
  Cc: Jeff Cody, qemu-block, qemu-devel, pkarampu, rgowdapp, ndevos

On Wed, 2016-04-20 at 13:46 +0200, Kevin Wolf wrote:
> On 20.04.2016 at 12:40, Ric Wheeler wrote:
> > On 04/20/2016 05:24 AM, Kevin Wolf wrote:
> > > On 20.04.2016 at 03:56, Ric Wheeler wrote:
> > > > On 04/19/2016 10:09 AM, Jeff Cody wrote:
> > > > > On Tue, Apr 19, 2016 at 08:18:39AM -0400, Ric Wheeler wrote:
> > > > I still worry that in many non-gluster situations we will have
> > > > permanent data loss here. Specifically, the way the page cache
> > > > works, if we fail to write back cached data *at any time*, a
> > > > future fsync() will get a failure.
> > > And this is actually what saves the semantic correctness. If you
> > > threw away data, any following fsync() must fail. This is of course
> > > inconvenient because you won't be able to resume a VM that is
> > > configured to stop on errors, and it means some data loss, but it's
> > > safe because we never tell the guest that the data is on disk when
> > > it really isn't.
> > >
> > > gluster's behaviour (without resync-failed-syncs-after-fsync set) is
> > > different, if I understand correctly. It will throw away the data
> > > and then happily report success on the next fsync() call. And this
> > > is what causes not only data loss, but corruption.
> > Yes, that makes sense to me - the kernel will remember that it could
> > not write data back from the page cache and the future fsync() will
> > see an error.
> >
> > > [ Hm, or having read what's below... Did I misunderstand and Linux
> > >   returns failure only for a single fsync() and on the next one it
> > >   returns success again? That would be bad. ]
> > I would need to think through that scenario with the memory
> > management people to see if that could happen.
> Okay, please do. This is the fundamental assumption we make: If an
> fsync() succeeds, *all* successfully completed writes are on disk, no
> matter whether another fsync() failed in between. If they can't be
> written to the disk (e.g. because the data was thrown away), no
> consequent fsync() can succeed any more.
> 

Is that actually desirable behaviour?

What would it take to make fsync succeed again
on that file at any point in the future?

Umount of the filesystem?

Reboot of the whole system?

Restore from backup?


* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
From: Kevin Wolf @ 2016-04-21  8:43 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Ric Wheeler, Jeff Cody, qemu-block, qemu-devel, pkarampu,
	rgowdapp, ndevos

On 20.04.2016 at 20:38, Rik van Riel wrote:
> On Wed, 2016-04-20 at 13:46 +0200, Kevin Wolf wrote:
> > On 20.04.2016 at 12:40, Ric Wheeler wrote:
> > > On 04/20/2016 05:24 AM, Kevin Wolf wrote:
> > > > On 20.04.2016 at 03:56, Ric Wheeler wrote:
> > > > > On 04/19/2016 10:09 AM, Jeff Cody wrote:
> > > > > > On Tue, Apr 19, 2016 at 08:18:39AM -0400, Ric Wheeler wrote:
> > > > > I still worry that in many non-gluster situations we will have
> > > > > permanent data loss here. Specifically, the way the page cache
> > > > > works, if we fail to write back cached data *at any time*, a
> > > > > future fsync() will get a failure.
> > > > And this is actually what saves the semantic correctness. If you
> > > > threw away data, any following fsync() must fail. This is of
> > > > course inconvenient because you won't be able to resume a VM that
> > > > is configured to stop on errors, and it means some data loss, but
> > > > it's safe because we never tell the guest that the data is on disk
> > > > when it really isn't.
> > > >
> > > > gluster's behaviour (without resync-failed-syncs-after-fsync set)
> > > > is different, if I understand correctly. It will throw away the
> > > > data and then happily report success on the next fsync() call. And
> > > > this is what causes not only data loss, but corruption.
> > > Yes, that makes sense to me - the kernel will remember that it could
> > > not write data back from the page cache and the future fsync() will
> > > see an error.
> > >
> > > > [ Hm, or having read what's below... Did I misunderstand and Linux
> > > >   returns failure only for a single fsync() and on the next one it
> > > >   returns success again? That would be bad. ]
> > > I would need to think through that scenario with the memory
> > > management people to see if that could happen.
> > Okay, please do. This is the fundamental assumption we make: If an
> > fsync() succeeds, *all* successfully completed writes are on disk, no
> > matter whether another fsync() failed in between. If they can't be
> > written to the disk (e.g. because the data was thrown away), no
> > consequent fsync() can succeed any more.
> > 
> 
> Is that actually desirable behaviour?
> 
> What would it take to make fsync succeed again on that file at any
> point in the future?
> 
> Umount of the filesystem?
> 
> Reboot of the whole system?
> 
> Restore from backup?

I would say at least a close()/open() pair.

In other words, I would mark all file descriptors that exist for the
file when dirty pages are dropped or marked clean without having them
written back successfully, so that fsync() will consistently return
failure for them if data written to the file descriptor is lost.

So for more precise wording, I should have said "If an fsync() succeeds,
all successfully completed writes *to that file descriptor* are on
disk". fsync() gets an fd, so I think that's the obvious thing.
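
To make the per-descriptor rule concrete, here is a minimal sketch in C.
It is a toy model, not kernel or QEMU code: struct model_fd and the
model_* helpers are hypothetical names, and the model assumes that a
failed writeback means the data written through that descriptor is gone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one open file descriptor (not a kernel structure). */
struct model_fd {
    bool data_lost;   /* set once data written through this fd was dropped */
    int  lost_errno;  /* errno of the writeback failure that caused the loss */
};

/* Called when the modelled cache drops, or marks clean, unwritten pages. */
static void model_mark_lost(struct model_fd *f, int err)
{
    assert(err > 0);              /* errno values are positive */
    f->data_lost = true;
    f->lost_errno = err;
}

/*
 * fsync() under the proposed rule. writeback_err is 0 if writeback worked,
 * otherwise the errno it failed with. Returns 0 or a negative errno.
 * Once data is lost, every later call on the same fd keeps failing.
 */
static int model_fsync(struct model_fd *f, int writeback_err)
{
    if (writeback_err != 0) {
        model_mark_lost(f, writeback_err);
    }
    return f->data_lost ? -f->lost_errno : 0;
}

/* A close()/open() pair yields a fresh descriptor with no sticky error. */
static void model_reopen(struct model_fd *f)
{
    f->data_lost = false;
    f->lost_errno = 0;
}
```

In this model a descriptor that hit ENOSPC once keeps returning -ENOSPC
from model_fsync() even when later writeback would work, and only
model_reopen() clears that state.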

And as I write this and think more about the reasons and implications, I
think we need to get rid of the behaviour that after a write error, the
page stays in the cache and is marked clean while it's inconsistent with
the actual disk contents.

If we were to ensure that the cache is consistent, we'd have the two
options I mentioned earlier in the thread:

1. On write error, don't mark the page clean, but remove it from the
   cache entirely. After a new open(), programs will see the state of
   the file as it is on the disk, with the data that is going to be lost
   already gone. (The other option is, obviously, that the read will
   return an I/O error because the disk is really broken.)

   fsync() can legitimately return success on new file descriptors
   because the program already reads what is on the disk, and there are
   no additional pages in the cache that need to be written back for
   this to be stable. On the other hand, even this works only if the
   problem was temporary because supposedly the program wrote something
   to its new file descriptor before calling fsync().

   On old file descriptors, it will always return failure because the
   data is already lost.

2. On write error, keep the pages in the cache as long as the memory
   isn't desperately needed elsewhere. When it is desperately needed,
   goto 1.

   This means that newly opened files would still see the cache contents
   which haven't been successfully written back yet. fsync() would retry
   writing them out, both from old and from new file descriptors. For the
   programs having new file descriptors, this is probably equivalent
   to 1 - fsync() succeeds if, and only if, the temporary problem has
   been resolved meanwhile.

   On old file descriptors, this preserves the data and if the failure
   was temporary, the program can recover the file descriptor and get
   back to working.

Implementing 1. would already improve the situation considerably, and I
guess 2. would be the best thing that we can possibly achieve. Both are
by far better than a situation where fsync() lies and returns success
in cases where the cache contents differ from the disk contents (which
is what I understand we have now).
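
As a rough illustration of how the two options differ from the
mark-clean behaviour, the following C sketch models a single cached page
under each policy. Everything in it is hypothetical (the struct, the
enum names, the choice of EIO as the error), and it ignores memory
pressure, so option 2 never falls back to option 1 here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum policy {
    MARK_CLEAN,  /* on write error: keep the page, mark it clean */
    DROP_PAGE,   /* option 1: remove the page, fail the old fd from now on */
    KEEP_DIRTY   /* option 2: keep the page dirty and retry on next fsync */
};

struct model {
    enum policy policy;
    char disk;       /* contents currently on disk */
    char cache;      /* cached contents, meaningful only if cached is set */
    bool cached;
    bool dirty;
    bool fd_failed;  /* the old file descriptor has lost data (option 1) */
    bool disk_ok;    /* whether the backing store currently accepts writes */
};

static void model_write(struct model *m, char data)
{
    m->cache = data;
    m->cached = true;
    m->dirty = true;
}

/* Returns 0 on success, -EIO on failure. */
static int model_fsync(struct model *m)
{
    if (m->fd_failed) {
        return -EIO;                 /* option 1: old fd fails forever */
    }
    if (!m->cached || !m->dirty) {
        /* With options 1 and 2, a clean cached page always matches disk. */
        assert(m->policy == MARK_CLEAN || !m->cached || m->cache == m->disk);
        return 0;
    }
    if (m->disk_ok) {
        m->disk = m->cache;          /* writeback succeeded */
        m->dirty = false;
        return 0;
    }
    switch (m->policy) {
    case MARK_CLEAN:
        m->dirty = false;            /* cache and disk now silently differ */
        break;
    case DROP_PAGE:
        m->cached = false;
        m->dirty = false;
        m->fd_failed = true;
        break;
    case KEEP_DIRTY:
        break;                       /* data stays dirty for a later retry */
    }
    return -EIO;
}

static char model_read(const struct model *m)
{
    return m->cached ? m->cache : m->disk;
}

/* close()/open(): a new descriptor carries no sticky failure. */
static void model_reopen(struct model *m)
{
    m->fd_failed = false;
}
```

With MARK_CLEAN, a failed model_fsync() followed by a second call
returns 0 while model_read() and the disk field disagree. With DROP_PAGE
the old descriptor keeps failing and reads show the on-disk data. With
KEEP_DIRTY the second call succeeds only once disk_ok is set again, and
then the disk really holds the new data.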

Kevin


* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
  2016-04-20 10:40         ` Ric Wheeler
  2016-04-20 11:46           ` Kevin Wolf
@ 2016-04-20 18:37           ` Rik van Riel
  1 sibling, 0 replies; 16+ messages in thread
From: Rik van Riel @ 2016-04-20 18:37 UTC (permalink / raw)
  To: Ric Wheeler, Kevin Wolf
  Cc: Jeff Cody, qemu-block, qemu-devel, pkarampu, rgowdapp, ndevos

On Wed, 2016-04-20 at 06:40 -0400, Ric Wheeler wrote:
> On 04/20/2016 05:24 AM, Kevin Wolf wrote:
> > 
> > Am 20.04.2016 um 03:56 hat Ric Wheeler geschrieben:
> > > 
> > > On 04/19/2016 10:09 AM, Jeff Cody wrote:
> > > > 
> > > > On Tue, Apr 19, 2016 at 08:18:39AM -0400, Ric Wheeler wrote:
> > > > > 
> > > > > On 04/19/2016 08:07 AM, Jeff Cody wrote:
> > > > > > 
> > > > > > Bug fixes for gluster; third patch is to prevent
> > > > > > a potential data loss when trying to recover from
> > > > > > a recoverable error (such as ENOSPC).
> > > > > Hi Jeff,
> > > > > 
> > > > > Just a note, I have been talking to some of the disk drive
> > > > > people
> > > > > here at LSF (the kernel summit for file and storage people)
> > > > > and got
> > > > > a non-public confirmation that individual storage devices (s-
> > > > > ata
> > > > > drives or scsi) can also dump cache state when a synchronize
> > > > > cache
> > > > > command fails.  Also followed up with Rik van Riel - in the
> > > > > page
> > > > > cache in general, when we fail to write back dirty pages,
> > > > > they are
> > > > > simply marked "clean" (which means effectively that they get
> > > > > dropped).
> > > > > 
> > > > > Long winded way of saying that I think that this scenario is
> > > > > not
> > > > > unique to gluster - any failed fsync() to a file (or block
> > > > > device)
> > > > > might be an indication of permanent data loss.
> > > > > 
> > > > Ric,
> > > > 
> > > > Thanks.
> > > > 
> > > > I think you are right, we likely do need to address how QEMU
> > > > handles fsync
> > > > failures across the board in QEMU at some point
> > > > (2.7?).  Another point to
> > > > consider is that QEMU is cross-platform - so not only do we
> > > > have different
> > > > protocols, and filesystems, but also different underlying host
> > > > OSes as well.
> > > > It is likely, like you said, that there are other non-gluster
> > > > scenarios where
> > > > we have non-recoverable data loss on fsync failure.
> > > > 
> > > > With Gluster specifically, if we look at just ENOSPC, does this
> > > > mean that
> > > > even if Gluster retains its cache after fsync failure, we still
> > > > won't know
> > > > that there was no permanent data loss?  If we hit ENOSPC during
> > > > an fsync, I
> > > > presume that means Gluster itself may have encountered ENOSPC
> > > > from a fsync to
> > > > the underlying storage.  In that case, does Gluster just pass
> > > > the error up
> > > > the stack?
> > > > 
> > > > Jeff
> > > I still worry that in many non-gluster situations we will have
> > > permanent data loss here. Specifically, the way the page cache
> > > works, if we fail to write back cached data *at any time*, a
> > > future
> > > fsync() will get a failure.
> > And this is actually what saves the semantic correctness. If you
> > threw
> > away data, any following fsync() must fail. This is of course
> > inconvenient because you won't be able to resume a VM that is
> > configured
> > to stop on errors, and it means some data loss, but it's safe
> > because we
> > never tell the guest that the data is on disk when it really isn't.
> > 
> > gluster's behaviour (without resync-failed-syncs-after-fsync set)
> > is
> > different, if I understand correctly. It will throw away the data
> > and
> > then happily report success on the next fsync() call. And this is
> > what
> > causes not only data loss, but corruption.
> Yes, that makes sense to me - the kernel will remember that it could
> not write 
> data back from the page cache and the future fsync() will see an
> error.
> 
> > 
> > 
> > [ Hm, or having read what's below... Did I misunderstand and Linux
> >    returns failure only for a single fsync() and on the next one it
> >    returns success again? That would be bad. ]
> I would need to think through that scenario with the memory
> management people to 
> see if that could happen.

It could definitely happen.

1) block on disk contains contents A
2) page cache gets contents B written to it
3) fsync fails
4) page with contents B gets evicted from memory
5) block with contents A gets read from disk
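
The five steps can be replayed with a small C model. This is only a
sketch: struct page_model, read_block() and run_scenario() are made-up
names, and step 3 assumes the behaviour described earlier in the thread,
where a page that fails writeback is simply marked clean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-block model: what is on disk and what is in the cache. */
struct page_model {
    char disk;
    char cache;
    bool cached;
    bool dirty;
};

/* Read through the cache, filling it from disk on a miss. */
static char read_block(struct page_model *p)
{
    if (!p->cached) {
        p->cache = p->disk;
        p->cached = true;
        p->dirty = false;
    }
    return p->cache;
}

/* Replays steps 1-5 and returns what a reader sees at the end. */
static char run_scenario(void)
{
    /* 1) block on disk contains contents A */
    struct page_model p = { .disk = 'A', .cached = false, .dirty = false };

    /* 2) page cache gets contents B written to it */
    p.cache = 'B';
    p.cached = true;
    p.dirty = true;

    /* 3) fsync fails: nothing reaches the disk, the page is marked clean */
    p.dirty = false;

    /* 4) page with contents B gets evicted; only clean pages can be
     *    dropped without writeback, which is why step 3 matters */
    assert(!p.dirty);
    p.cached = false;

    /* 5) block with contents A gets read from disk */
    return read_block(&p);
}
```

In this model run_scenario() ends up reading A, although B had been
written and was readable from the cache between steps 2 and 4.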


* Re: [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster
  2016-04-19 14:09   ` Jeff Cody
  2016-04-20  1:56     ` Ric Wheeler
@ 2016-04-20  5:15     ` Raghavendra Gowdappa
  1 sibling, 0 replies; 16+ messages in thread
From: Raghavendra Gowdappa @ 2016-04-20  5:15 UTC (permalink / raw)
  To: Jeff Cody
  Cc: Ric Wheeler, qemu-block, qemu-devel, kwolf, pkarampu, ndevos,
	Rik van Riel, Poornima Gurusiddaiah, Raghavendra Talur,
	Vijay Bellur



----- Original Message -----
> From: "Jeff Cody" <jcody@redhat.com>
> To: "Ric Wheeler" <rwheeler@redhat.com>
> Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org, kwolf@redhat.com, pkarampu@redhat.com, rgowdapp@redhat.com,
> ndevos@redhat.com, "Rik van Riel" <riel@redhat.com>
> Sent: Tuesday, April 19, 2016 7:39:17 PM
> Subject: Re: [PATCH for-2.6 v2 0/3] Bug fixes for gluster
> 
> On Tue, Apr 19, 2016 at 08:18:39AM -0400, Ric Wheeler wrote:
> > On 04/19/2016 08:07 AM, Jeff Cody wrote:
> > >Bug fixes for gluster; third patch is to prevent
> > >a potential data loss when trying to recover from
> > >a recoverable error (such as ENOSPC).
> > 
> > Hi Jeff,
> > 
> > Just a note, I have been talking to some of the disk drive people
> > here at LSF (the kernel summit for file and storage people) and got
> > a non-public confirmation that individual storage devices (s-ata
> > drives or scsi) can also dump cache state when a synchronize cache
> > command fails.  Also followed up with Rik van Riel - in the page
> > cache in general, when we fail to write back dirty pages, they are
> > simply marked "clean" (which means effectively that they get
> > dropped).

Yes. Thanks for confirming that. This was another source of confusion for us while deciding what Gluster's "reaction" to write-back failures should be, since the Linux kernel page cache itself doesn't retry. It was also one of the questions we raised about how QEMU handles the same scenario on different platforms. Nevertheless, it doesn't hurt glusterfs to retry until an fsync or flush.

> > 
> > Long winded way of saying that I think that this scenario is not
> > unique to gluster - any failed fsync() to a file (or block device)
> > might be an indication of permanent data loss.
> >
> 
> Ric,
> 
> Thanks.
> 
> I think you are right, we likely do need to address how QEMU handles fsync
> failures across the board in QEMU at some point (2.7?).  Another point to
> consider is that QEMU is cross-platform - so not only do we have different
> protocols, and filesystems, but also different underlying host OSes as well.
> It is likely, like you said, that there are other non-gluster scenarios where
> we have non-recoverable data loss on fsync failure.
> 
> With Gluster specifically, if we look at just ENOSPC, does this mean that
> even if Gluster retains its cache after fsync failure, we still won't know
> that there was no permanent data loss?  If we hit ENOSPC during an fsync, I
> presume that means Gluster itself may have encountered ENOSPC from a fsync to
> the underlying storage.  In that case, does Gluster just pass the error up
> the stack?

Yes. It passes errno up the stack. However, if the option "resync-failed-syncs-after-fsync" is set, Gluster retains the cache after a failed fsync to the backend, irrespective of the errno (including ENOSPC), until a flush. So there is no permanent data loss as long as the fd is not closed, or the backend store recovers from the error before the fd is closed.

To summarize consequences of the scenario you explained:

1. The application/kernel sees a failed fsync with the same errno as the one the backend storage returned.
2. Nevertheless, Glusterfs retains the writes cached before the fsync (even after the fsync failure) and retries them, if the performance.resync-failed-syncs-after-fsync option is set.

regards,
Raghavendra

> 
> Jeff
> 
> > 
> > >
> > >The final patch closes the gluster fd and sets the
> > >protocol drv to NULL on fsync failure in gluster;
> > >we have no way of knowing what gluster versions
> > >support retaining fysnc cache on error, so until
> > >we do the safest thing to do is invalidate the
> > >drive.
> > >
> > >Jeff Cody (3):
> > >   block/gluster: return correct error value
> > >   block/gluster: code movement of qemu_gluster_close()
> > >   block/gluster: prevent data loss after i/o error
> > >
> > >  block/gluster.c | 66
> > >  ++++++++++++++++++++++++++++++++++++++++++++++-----------
> > >  configure       |  8 +++++++
> > >  2 files changed, 62 insertions(+), 12 deletions(-)
> > >
> > 
> 


end of thread, other threads:[~2016-04-21  8:43 UTC | newest]

Thread overview: 16+ messages
2016-04-19 12:07 [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster Jeff Cody
2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 1/3] block/gluster: return correct error value Jeff Cody
2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 2/3] block/gluster: code movement of qemu_gluster_close() Jeff Cody
2016-04-19 12:07 ` [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error Jeff Cody
2016-04-19 12:27   ` Kevin Wolf
2016-04-19 12:29     ` Jeff Cody
2016-04-19 12:18 ` [Qemu-devel] [PATCH for-2.6 v2 0/3] Bug fixes for gluster Ric Wheeler
2016-04-19 14:09   ` Jeff Cody
2016-04-20  1:56     ` Ric Wheeler
2016-04-20  9:24       ` Kevin Wolf
2016-04-20 10:40         ` Ric Wheeler
2016-04-20 11:46           ` Kevin Wolf
2016-04-20 18:38             ` Rik van Riel
2016-04-21  8:43               ` Kevin Wolf
2016-04-20 18:37           ` Rik van Riel
2016-04-20  5:15     ` Raghavendra Gowdappa
