From: Jeff Cody <jcody@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org, pkarampu@redhat.com,
qemu-devel@nongnu.org, ndevos@redhat.com, rwheeler@redhat.com,
kdhananj@redhat.com
Subject: Re: [Qemu-devel] [PATCH for-2.6 2/2] block/gluster: prevent data loss after i/o error
Date: Wed, 6 Apr 2016 08:44:18 -0400 [thread overview]
Message-ID: <20160406124418.GA7329@localhost.localdomain> (raw)
In-Reply-To: <20160406110216.GE5098@noname.redhat.com>
On Wed, Apr 06, 2016 at 01:02:16PM +0200, Kevin Wolf wrote:
> [ Adding some CCs ]
>
> Am 06.04.2016 um 05:29 hat Jeff Cody geschrieben:
> > Upon receiving an I/O error after an fsync, by default gluster will
> > dump its cache. However, QEMU will retry the fsync, which is especially
> > useful when encountering errors such as ENOSPC when using the werror=stop
> > option. When using caching with gluster, however, the last written data
> > will be lost upon encountering ENOSPC. Using the cache xlator option of
> > 'resync-failed-syncs-after-fsync' should cause gluster to retain the
> > cached data after a failed fsync, so that ENOSPC and other transient
> > errors are recoverable.
> >
> > Signed-off-by: Jeff Cody <jcody@redhat.com>
> > ---
> > block/gluster.c | 27 +++++++++++++++++++++++++++
> > configure | 8 ++++++++
> > 2 files changed, 35 insertions(+)
> >
> > diff --git a/block/gluster.c b/block/gluster.c
> > index 30a827e..b1cf71b 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -330,6 +330,23 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
> > goto out;
> > }
> >
> > +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> > + /* Without this, if fsync fails for a recoverable reason (for instance,
> > + * ENOSPC), gluster will dump its cache, preventing retries. This means
> > + * almost certain data loss. Not all gluster versions support the
> > + * 'resync-failed-syncs-after-fsync' key value, but there is no way to
> > + * discover during runtime if it is supported (this api returns success for
> > + * unknown key/value pairs) */
>
> Honestly, this sucks. There is apparently no way to operate gluster so
> we can safely recover after a failed fsync. "We hope everything is fine,
> but depending on your gluster version, we may now corrupt your image"
> isn't very good.
>
> We need to consider very carefully if this is good enough to go on after
> an error. I'm currently leaning towards "no". That is, we should only
> enable this after Gluster provides us a way to make sure that the option
> is really set.
>
> > + ret = glfs_set_xlator_option (s->glfs, "*-write-behind",
> > + "resync-failed-syncs-after-fsync",
> > + "on");
> > + if (ret < 0) {
> > + error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> > + ret = -errno;
> > + goto out;
> > + }
> > +#endif
>
> We also need to consider the case without CONFIG_GLUSTERFS_XLATOR_OPT.
> In this case (as well as theoretically in the case that the option
> didn't take effect - if only we could know about it), a failed
> glfs_fsync_async() is fatal and we need to stop operating on the image,
> i.e. set bs->drv = NULL like when we detect corruption in qcow2 images.
> The guest will see a broken disk that fails all I/O requests, but that's
> better than corrupting data.
>
Gluster versions that don't support CONFIG_GLUSTERFS_XLATOR_OPT will
also not have the gluster patch that removes the file descriptor
invalidation upon error (unless that was a relatively new
bug/feature). But if that is the case, every operation on the file
descriptor in those versions will return error. But it is also rather
old versions that don't support glfs_set_xlator_option() I believe.
next prev parent reply other threads:[~2016-04-06 12:44 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-06 3:29 [Qemu-devel] [PATCH for-2.6 0/2] Bug fixes for gluster Jeff Cody
2016-04-06 3:29 ` [Qemu-devel] [PATCH for-2.6 1/2] block/gluster: return correct error value Jeff Cody
2016-04-06 15:00 ` Niels de Vos
2016-04-06 3:29 ` [Qemu-devel] [PATCH for-2.6 2/2] block/gluster: prevent data loss after i/o error Jeff Cody
2016-04-06 11:02 ` Kevin Wolf
2016-04-06 11:19 ` Ric Wheeler
2016-04-06 11:41 ` Kevin Wolf
2016-04-06 11:51 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2016-04-06 13:10 ` Jeff Cody
2016-04-06 13:20 ` Kevin Wolf
2016-04-07 7:48 ` Pranith Kumar Karampuri
2016-04-11 4:26 ` Raghavendra Gowdappa
2016-05-09 6:38 ` Raghavendra Talur
2016-05-09 8:02 ` Kevin Wolf
2016-04-06 12:44 ` Jeff Cody [this message]
2016-04-06 14:47 ` [Qemu-devel] " Niels de Vos
2016-04-06 16:05 ` Jeff Cody
2016-04-07 16:17 ` Jeff Cody
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160406124418.GA7329@localhost.localdomain \
--to=jcody@redhat.com \
--cc=kdhananj@redhat.com \
--cc=kwolf@redhat.com \
--cc=ndevos@redhat.com \
--cc=pkarampu@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=rwheeler@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).