From: John Snow <jsnow@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: qemu-devel <qemu-devel@nongnu.org>,
	qemu block <qemu-block@nongnu.org>, Jeff Cody <jcody@redhat.com>,
	Fam Zheng <famz@redhat.com>, "Denis V. Lunev" <den@openvz.org>
Subject: Re: [Qemu-devel] backup notifier fail policy
Date: Mon, 3 Oct 2016 14:07:34 -0400
Message-ID: <b51b707a-a8a2-0449-959d-a0d66ce9169e@redhat.com>
In-Reply-To: <20161003131151.GB31993@stefanha-x1.localdomain>

On 10/03/2016 09:11 AM, Stefan Hajnoczi wrote:
> On Fri, Sep 30, 2016 at 09:59:16PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> On 30.09.2016 20:11, Vladimir Sementsov-Ogievskiy wrote:
>>> Hi all!
>>>
>>> Can somebody please explain to me why we fail the guest request in case
>>> of an I/O error in the write notifier? I think guest consistency is more
>>> important than the success of an unfinished backup. Or what am I missing?
>>>
>>> I'm talking about this code:
>>>
>>> static int coroutine_fn backup_before_write_notify(
>>>         NotifierWithReturn *notifier,
>>>         void *opaque)
>>> {
>>>     BackupBlockJob *job = container_of(notifier, BackupBlockJob,
>>>                                        before_write);
>>>     BdrvTrackedRequest *req = opaque;
>>>     int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
>>>     int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
>>>
>>>     assert(req->bs == blk_bs(job->common.blk));
>>>     assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
>>>     assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
>>>
>>>     return backup_do_cow(job, sector_num, nb_sectors, NULL, true);
>>> }
>>>
>>> So, what about something like
>>>
>>>     ret = backup_do_cow(job, ...
>>>     if (ret < 0 && job->notif_ret == 0) {
>>>         job->notif_ret = ret;
>>>     }
>>>
>>>     return 0;
>>>
>>> and failing the block job if notif_ret < 0 elsewhere in the backup code?
>>>
>>
>> And a second question about notifiers in the backup block job. If the
>> block job is paused, the notifiers still work and can copy data. Is that
>> OK? The user thinks the job is paused, so he can do something with the
>> target disk. But really, this 'something' will race with the write
>> notifiers. So what assumptions may the user actually make about a paused
>> backup job? Are there any agreements? Also, on query-block-jobs we will
>> see job.busy = false when copy-on-write may actually be in flight.
>
> I agree that the job should fail and the guest should continue running.
>
> The backup job cannot do the usual ENOSPC stop/resume error handling
> since we lose snapshot consistency once guest writes are allowed to
> proceed. Backup errors need to be fatal; resuming is usually not
> possible. The user will have to retry the backup operation.
>
> Stefan
>

If we intercept the error for the backup write and HALT at that point,
why would we lose consistency? If the backup write failed before we
allowed the guest write to proceed, the old data should still be there
on the source disk, no?

I guess it is a little messier than the usual STOP case, but it doesn't
seem inherently impossible to me...

Eh, regardless: if we're not using a STOP policy, the right thing to do
is definitely to fail the backup instead of failing the guest write.

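To make the "fail the backup, not the write" idea concrete, here is a
minimal sketch in plain C. It follows the notif_ret proposal quoted
above, but it uses mock types rather than the real QEMU structures:
MockBackupJob, mock_do_cow(), mock_before_write_notify(),
mock_job_check() and the fail_next_cow knob are all hypothetical names
invented for illustration, not QEMU APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for BackupBlockJob; only what the sketch needs. */
typedef struct MockBackupJob {
    int notif_ret;      /* first copy-on-write error seen, 0 if none */
    int fail_next_cow;  /* test knob: errno value (e.g. EIO) for the next copy */
} MockBackupJob;

/* Stand-in for backup_do_cow(): returns 0 or a negative errno. */
static int mock_do_cow(MockBackupJob *job)
{
    assert(job != NULL);
    if (job->fail_next_cow != 0) {
        int err = -job->fail_next_cow;
        job->fail_next_cow = 0;
        return err;
    }
    return 0;
}

/*
 * Before-write hook: the guest write is never failed. A copy error is
 * only recorded, and only the first one is kept.
 */
static int mock_before_write_notify(MockBackupJob *job)
{
    int ret = mock_do_cow(job);

    if (ret < 0 && job->notif_ret == 0) {
        job->notif_ret = ret;
    }
    return 0;
}

/*
 * Meant to be polled from the job's own loop: a negative value means
 * the backup must be aborted because the snapshot is no longer complete.
 */
static int mock_job_check(const MockBackupJob *job)
{
    assert(job != NULL);
    return job->notif_ret;
}
```

With fail_next_cow set to EIO, mock_before_write_notify() still returns
0 to the guest path, while mock_job_check() reports -EIO so the job can
fail itself. Where exactly the real job would poll such a field is left
open here.
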
As for paused guarantees... good point. If you want to truly pause a
backup job, I think you necessarily begin accruing a backlog of data
that needs to get written back out. Maybe it's not easily possible to
truly pause a backup block job.

I'm not exactly sure what we should do about it. I do know that
eventually we want to replace write notifiers with block filters, but
even those would likely keep operating during a pause.

'busy' means something very specific within QEMU, but perhaps the query
function can be adjusted to return 'true' for busy as long as either the
job is running OR it has latent portions still running (write notifiers,
block filters, etc.).

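One possible shape for that adjusted query, sketched in plain C with
mock state. Everything here is an assumption for illustration:
MockJobState, its two fields, and the mock_* helpers are hypothetical
and do not correspond to existing QEMU fields or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical job state; not the real BlockJob layout. */
typedef struct MockJobState {
    bool job_running;        /* the job itself is actively doing work */
    unsigned cow_in_flight;  /* notifier/filter copies currently in progress */
} MockJobState;

/* Called when a write notifier (or filter) starts a copy-on-write. */
static void mock_cow_begin(MockJobState *s)
{
    assert(s != NULL);
    s->cow_in_flight++;
}

/* Called when that copy-on-write completes, successfully or not. */
static void mock_cow_end(MockJobState *s)
{
    assert(s != NULL && s->cow_in_flight > 0);
    s->cow_in_flight--;
}

/* Report busy if the job is running OR any latent portion is still active. */
static bool mock_query_busy(const MockJobState *s)
{
    assert(s != NULL);
    return s->job_running || s->cow_in_flight > 0;
}
```

A paused job with job_running == false would then still report busy
between mock_cow_begin() and mock_cow_end(), which is the window in
which touching the target disk would race with the copy.
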
--js