From: Stefan Hajnoczi <stefanha@gmail.com>
To: Eric Blake <eblake@redhat.com>
Cc: "Daniel P. Berrange" <berrange@redhat.com>,
Kashyap Chamarthy <kchamart@redhat.com>,
qemu-devel@nongnu.org, qemu-block@nongnu.org, mbooth@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] RFC: use case for adding QMP, block jobs & multiple exports to qemu-nbd ?
Date: Fri, 3 Nov 2017 10:04:57 +0000
Message-ID: <20171103100457.GF5078@stefanha-x1.localdomain>
In-Reply-To: <3c41f881-d773-85e1-f18a-ea1b6c77cf9c@redhat.com>

On Thu, Nov 02, 2017 at 12:50:39PM -0500, Eric Blake wrote:
> On 11/02/2017 12:04 PM, Daniel P. Berrange wrote:
>
> > vm-a-disk1.qcow2 open - it's just a regular backing file setup.
> >
> >>
> >>> | (format=qcow2, proto=file)
> >>> |
> >>> +- vm-a-disk1.qcow2 (qemu-system-XXX)
> >>>
> >>> The problem is that many VMs are wanting to use cache-disk1.qcow2 as
> >>> their disk's backing file, and only one process is permitted to be
> >>> writing to a disk backing file at any time.
> >>
> >> Can you explain a bit more about how many VMs are trying to write to
> >> the same backing file 'cache-disk1.qcow2'? I'd assume it's
> >> just the "immutable" local backing store (once the previous 'mirror' job
> >> is completed), based on which Nova creates a qcow2 overlay for each
> >> instance it boots.
> >
> > An arbitrary number of vm-*-disk1.qcow2 files could exist all using
> > the same cache-disk1.qcow2 image. It's only limited by how many VMs
> > you can fit on the host. By definition you can only ever have a single
> > process writing to a qcow2 file though, otherwise corruption will quickly
> > follow.
>
> So if I'm following, your argument is that the local qemu-nbd process is
> the only one writing to the file, while all other overlays are backed by
> the NBD process; and then as any one of the VMs reads, the qemu-nbd
> process pulls those sectors into the local storage as a result.
>
> >
> >> When I pointed this e-mail of yours to Matt Booth on Freenode Nova IRC
> >> channel, he said the intermediate image (cache-disk1.qcow2) is a COR
> >> (Copy-On-Read). I realize what COR is -- every time you read a cluster
> >> from the backing file, you write that locally, to avoid reading it
> >> again.
> >
> > qcow2 doesn't give you COR, only COW. So every read request would have a miss
> > in cache-disk1.qcow2 and thus have to be fetched from master-disk1.qcow2. The
> > use of drive-mirror to pull master-disk1.qcow2 contents into cache-disk1.qcow2
> > makes up for the lack of COR by populating cache-disk1.qcow2 in the background.
>
> Ah, but qcow2 (or more precisely, any protocol qemu BDS) DOES have
> copy-on-read, built in to the block layer. See qemu-iotest 197 for an
> example of it in use. If we use COR correctly, then every initial read
> request will miss in the cache, but the COR will populate the cache
> without having to have a background drive-mirror. A background
> drive-mirror may still be useful to populate the cache faster, but COR
> populates the parts you want now regardless of how fast the background
> task is running.

-drive copy-on-read=on and the stream block job were added exactly for
this provisioning use case. They can be used together.

I was a little surprised that the discussion has been about the mirror
job rather than the stream job.

One difference between stream and mirror is that stream doesn't pivot
the image file on completion. Instead it clears the backing file so the
link to the remote server no longer exists.

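
As a rough illustration of how the two pieces might be combined, the
sketch below builds a -drive option string with copy-on-read=on and the
QMP message that would start a stream job on the same drive. It is only a
sketch under stated assumptions: the drive id "cache0" and the helper
names drive_arg() and block_stream_cmd() are hypothetical, and the image
name is borrowed from the thread. It only constructs the strings; it does
not launch QEMU or open a QMP socket.

```python
import json


def drive_arg(path, drive_id="cache0", fmt="qcow2"):
    """Return a -drive option value with copy-on-read enabled.

    drive_id is a hypothetical name chosen for this example.
    """
    return f"file={path},format={fmt},if=none,id={drive_id},copy-on-read=on"


def block_stream_cmd(device, speed=None):
    """Return the QMP 'block-stream' command as a dict.

    'device' names the drive to stream into; 'speed' (bytes per second)
    is optional and omitted when not given.
    """
    arguments = {"device": device}
    if speed is not None:
        arguments["speed"] = speed
    return {"execute": "block-stream", "arguments": arguments}


if __name__ == "__main__":
    # Command-line fragment for the image that should fill up on reads.
    print("-drive", drive_arg("cache-disk1.qcow2"))
    # QMP message to send once the guest (or helper process) is running.
    print(json.dumps(block_stream_cmd("cache0")))
```

With this arrangement, guest reads would populate the image immediately
through copy-on-read, while the stream job copies the remaining data in
the background and drops the backing file link when it finishes.
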
Stefan