From: Paolo Bonzini <pbonzini@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Federico Simoncelli <fsimonce@redhat.com>,
Luiz Capitulino <lcapitulino@redhat.com>,
Eric Blake <eblake@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>,
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] Proposal for extensions of block job commands in QEMU 1.2
Date: Mon, 21 May 2012 13:02:46 +0200 [thread overview]
Message-ID: <4FBA20D6.10507@redhat.com> (raw)
In-Reply-To: <4FBA19D7.1000806@redhat.com>
Il 21/05/2012 12:32, Kevin Wolf ha scritto:
> Am 21.05.2012 12:02, schrieb Paolo Bonzini:
>> Il 21/05/2012 11:29, Kevin Wolf ha scritto:
>>>> * block-stream: I propose adding two options to the existing
>>>> block-stream command. If this is rejected, only mirroring will be able
>>>> to use rerror/werror.
>>>>
>>>> The new options are of course rerror/werror. They are enum options,
>>>> with the following possible values:
>>>
>>> Do we really need separate werror/rerror? For guest operations they
>>> really exist only for historical reasons: werror was there first, and
>>> when we wanted the same functionality, it seemed odd to overload werror
>>> to include reads as well.
>>>
>>> For block jobs, where there is no such option yet, we could go with a
>>> single error option, unless there is a use case for separate
>>> werror/rerror options.
>>
>> For mirroring rerror=source and werror=target. I'm not sure there is an
>> actual usecase, but at least it is more interesting than for devices...
>
> Hm. What if we add an active mirror? Then we can get some kind of COW,
> and rerror can happen on the target as well.
Errors during the read part of COW are always reported as werror.
> If source/target is really the distinction we want to have, should the
> available options be specific to the job type, so that you could have
> src_error and dst_error for mirroring?
Yes, that would make sense.
>>>> 'stop': The VM *and* the job will be paused---the VM is stopped even if
>>>> the block device has neither rerror=stop nor werror={stop,enospc}. The
>>>> error is recorded in the block device's iostatus (which can be examined
>>>> with query-block). However, a BLOCK_IO_ERROR event will _never_ pause a
>>>> job.
>>>>
>>>> Rationale: stopping all I/O seems to be the best choice in order
>>>> to limit the number of errors received. However, due to backwards-
>>>> compatibility with QEMU 1.1 we cannot pause the job when guest-
>>>> initiated I/O causes an error. We could do that if the block
>>>> device has rerror=stop/werror={stop,enospc}, but it seems more
>>>> complicated to just never do it.
>>>
>>> I don't agree with stopping the VM. Consider a case where the target is
>>> somewhere on the network and you lose the connection, but the primary
>>> image is local on the hard disk. You don't want to stop the VM just
>>> because continuing with the copy isn't possible for the moment.
>>
>> I think this is something that management should resolve.
>
> Management doesn't necessarily exist.
Even a human sitting at a console is management. (Though I don't plan
HMP to expose rerror/werror; so you can assume in some sense that
management exists).
>> For an error on the source, stopping the VM makes sense. I don't
>> think management cares about what caused an I/O error on a device.
>> Does it matter if streaming was active or rather the guest was
>> executing "dd if=/dev/sda of=/dev/null".
>
> Yes, there's a big difference: If it was a job, the guest can keep
> running without any problems. If it was a guest operation, we would have
> to return an error to the guest, which may offline the disk in response.
Ok, this makes sense.
>> Management may want to keep the VM stopped even for an error on the
>> target, as long as mirroring has finished the initial synchronization
>> step. The VM can perform large amounts of I/O while the job is paused,
>> and then completing the job can take a large amount of time.
>
> If management wants to limit the impact of this, it can decide to
> explicitly stop the VM when it receives the error event.
That can be too late.
Eric, is it a problem for libvirt if a pause or target error during
mirroring causes the job to exit steady state? That means that after a
target error the offset can go back from 100% to <100%.
>>> If the VM is stopped (including BLOCK_IO_ERROR), no I/O should be going
>>> on at all. Do we really keep running the jobs in 1.1? If so, this is a
>>> bug and should be fixed before the release.
>>
>> Yes, we do. Do you think it's a problem for migration (thinking more
>> about it: ouch, yes, it should be)?
>
> I'm pretty sure that it is a problem for migration. And it's likely a
> problem in more cases.
On the other hand, in other cases it can be desirable (qemu -S, run
streaming before the VM starts).
>> I'd rather make the extension of query-block-jobs more generic, with a
>> list "devices" instead of a member "target", and making up the device
>> name in the implementation (so you have "device": "target" for mirroring).
>
> Well, my idea for blockdev was something like (represented in a -drive
> syntax because I don't know what it will look like):
>
> (qemu) blockdev_add file=foo.img,id=src
> (qemu) device_add virtio-blk-pci,drive=src
> ...
> (qemu) blockdev_add file=bar.img,id=dst
> (qemu) blockdev_mirror foo bar
>
> Once QOM reaches the block layer, I guess we'll want to make all
> BlockDriverStates user visible anyway.
I don't disagree, but that's very different from what is done with
drive-mirror.
So for now I'll keep my proposed extension of query-block-jobs; later it
can be modified so that the target will have a name if you started the
mirroring with blockdev_mirror instead of drive_mirror.
Paolo
next prev parent reply other threads:[~2012-05-21 11:03 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-18 17:08 [Qemu-devel] Proposal for extensions of block job commands in QEMU 1.2 Paolo Bonzini
2012-05-21 9:29 ` Kevin Wolf
2012-05-21 10:02 ` Paolo Bonzini
2012-05-21 10:32 ` Kevin Wolf
2012-05-21 11:02 ` Paolo Bonzini [this message]
2012-05-21 13:07 ` Kevin Wolf
2012-05-21 15:18 ` Paolo Bonzini
2012-05-21 13:13 ` Eric Blake
2012-05-21 12:20 ` Stefan Hajnoczi
2012-05-21 13:59 ` Luiz Capitulino
2012-05-21 14:09 ` Kevin Wolf
2012-05-21 14:16 ` Anthony Liguori
2012-05-21 14:17 ` Luiz Capitulino
2012-05-21 14:10 ` Anthony Liguori
2012-05-21 14:16 ` Luiz Capitulino
2012-05-21 14:19 ` Anthony Liguori
2012-05-21 14:26 ` Paolo Bonzini
2012-05-21 14:40 ` Anthony Liguori
2012-05-21 14:47 ` Paolo Bonzini
2012-05-21 15:44 ` Anthony Liguori
2012-05-21 15:55 ` Paolo Bonzini
2012-05-21 14:17 ` Kevin Wolf
2012-05-21 14:39 ` Paolo Bonzini
2012-05-24 13:41 ` [Qemu-devel] Block job commands in QEMU 1.2 [v2, including support for replication] Paolo Bonzini
2012-05-24 14:00 ` Ori Mamluk
2012-05-24 14:19 ` Paolo Bonzini
2012-05-24 15:32 ` Dor Laor
2012-05-25 8:59 ` Paolo Bonzini
2012-05-24 16:57 ` Eric Blake
2012-05-25 8:48 ` Paolo Bonzini
2012-05-25 15:02 ` Eric Blake
2012-05-25 8:28 ` Stefan Hajnoczi
2012-05-25 8:42 ` Kevin Wolf
2012-05-25 9:43 ` Stefan Hajnoczi
2012-05-25 11:17 ` Paolo Bonzini
2012-05-25 12:09 ` Stefan Hajnoczi
2012-05-25 13:25 ` Paolo Bonzini
2012-05-25 16:57 ` Luiz Capitulino
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4FBA20D6.10507@redhat.com \
--to=pbonzini@redhat.com \
--cc=eblake@redhat.com \
--cc=fsimonce@redhat.com \
--cc=kwolf@redhat.com \
--cc=lcapitulino@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).