From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40511) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1asUlC-0008O8-Pq for qemu-devel@nongnu.org; Tue, 19 Apr 2016 08:28:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1asUlB-0000d9-MF for qemu-devel@nongnu.org; Tue, 19 Apr 2016 08:28:06 -0400 Date: Tue, 19 Apr 2016 14:27:56 +0200 From: Kevin Wolf Message-ID: <20160419122756.GB4498@noname.str.redhat.com> References: <3d0edbc7aafd7731496db768c55a8ff3d4ac1537.1461067248.git.jcody@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3d0edbc7aafd7731496db768c55a8ff3d4ac1537.1461067248.git.jcody@redhat.com> Subject: Re: [Qemu-devel] [PATCH for-2.6 v2 3/3] block/gluster: prevent data loss after i/o error List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jeff Cody Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org, rwheeler@redhat.com, pkarampu@redhat.com, rgowdapp@redhat.com, ndevos@redhat.com Am 19.04.2016 um 14:07 hat Jeff Cody geschrieben: > Upon receiving an I/O error after an fsync, by default gluster will > dump its cache. However, QEMU will retry the fsync, which is especially > useful when encountering errors such as ENOSPC when using the werror=stop > option. When using caching with gluster, however, the last written data > will be lost upon encountering ENOSPC. Using the write-behind-cache > xlator option of 'resync-failed-syncs-after-fsync' should cause gluster > to retain the cached data after a failed fsync, so that ENOSPC and other > transient errors are recoverable. > > Unfortunately, we have no way of knowing if the > 'resync-failed-syncs-after-fsync' xlator option is supported, so for now > close the fd and set the BDS driver to NULL upon fsync error. > > Signed-off-by: Jeff Cody > --- > block/gluster.c | 42 ++++++++++++++++++++++++++++++++++++++++++ > configure | 8 ++++++++ > 2 files changed, 50 insertions(+) > > diff --git a/block/gluster.c b/block/gluster.c > index d9aace6..ba33488 100644 > --- a/block/gluster.c > +++ b/block/gluster.c > @@ -314,6 +314,23 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options, > goto out; > } > > +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT > + /* Without this, if fsync fails for a recoverable reason (for instance, > + * ENOSPC), gluster will dump its cache, preventing retries. This means > + * almost certain data loss. Not all gluster versions support the > + * 'resync-failed-syncs-after-fsync' key value, but there is no way to > + * discover during runtime if it is supported (this api returns success for > + * unknown key/value pairs) */ > + ret = glfs_set_xlator_option(s->glfs, "*-write-behind", > + "resync-failed-syncs-after-fsync", > + "on"); > + if (ret < 0) { > + error_setg_errno(errp, errno, "Unable to set xlator key/value pair"); > + ret = -errno; > + goto out; > + } > +#endif > + > qemu_gluster_parse_flags(bdrv_flags, &open_flags); > > s->fd = glfs_open(s->glfs, gconf->image, open_flags); > @@ -366,6 +383,16 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state, > goto exit; > } > > +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT > + ret = glfs_set_xlator_option(reop_s->glfs, "*-write-behind", > + "resync-failed-syncs-after-fsync", "on"); > + if (ret < 0) { > + error_setg_errno(errp, errno, "Unable to set xlator key/value pair"); > + ret = -errno; > + goto exit; > + } > +#endif > + > reop_s->fd = glfs_open(reop_s->glfs, gconf->image, open_flags); > if (reop_s->fd == NULL) { > /* reops->glfs will be cleaned up in _abort */ > @@ -613,6 +640,21 @@ static coroutine_fn int qemu_gluster_co_flush_to_disk(BlockDriverState *bs) > > ret = glfs_fsync_async(s->fd, gluster_finish_aiocb, &acb); > if (ret < 0) { > + /* Some versions of Gluster (3.5.6 -> 3.5.8?) will not retain its > + * cache after a fsync failure, so we have no way of allowing the guest > + * to safely continue. Gluster versions prior to 3.5.6 don't retain > + * the cache either, but will invalidate the fd on error, so this is > + * again our only option. > + * > + * The 'resync-failed-syncs-after-fsync' xlator option for the > + * write-behind cache will cause later gluster versions to retain > + * its cache after error, so long as the fd remains open. However, > + * we currently have no way of knowing if this option is supported. > + * > + * TODO: Once gluster provides a way for us to determine if the option > + * is supported, bypass the closure and setting drv to NULL. */ > + qemu_gluster_close(bs); > + bs->drv = NULL; > return -errno; > } More context: qemu_coroutine_yield(); return acb.ret; } I would guess that acb.ret containing an error is the more common case. We should probably invalidate the BDS in both cases (immediate failure and callback with error code). Kevin