From: Eric Blake <eblake@redhat.com>
To: Michael Spradling <michael@os.amperecomputing.com>
Cc: "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] Loading snapshot with readonly qcow2 image
Date: Fri, 14 Dec 2018 14:28:37 -0600
Message-ID: <264e8cab-431e-68c6-96ca-711131481a14@redhat.com>
In-Reply-To: <20181214160328.GB20905@mswork1>
On 12/14/18 10:03 AM, Michael Spradling wrote:
>> Can you combine -s (create a writable temp file) with -l to get what you
>> want?
>>
>> /me tries:
>>
>> I can confirm that 'qemu-nbd -s a' lets me write data that is discarded on
>> disconnect (lsof says a temp file in /var/tmp/vl.XXXXXX was created); and
>> that 'qemu-nbd -l snap a' lets me read the snapshot data. But mixing the two
>> fails, and it would be a nice bug to fix.
>
> I briefly looked at the code and it seems to be using the same base
> functions as qemu does. So, if I get this working for the model it
> might also start working for qemu-nbd.
>>
> Ideally, I want to not modify old images or create new images with
> qemu-img, so I have not been modifying qemu-img, but qemu itself
> directly. My use case will have several snapshots in an image (say
> 100). I will then later resume each of these snapshots in a qemu
> session in parallel. This is why I have gone down the route of modifying
> the L1 and L2 tables of the temp snapshot file /var/tmp/vl.XXXXX. My
> understanding is that if these are updated and the cluster doesn't exist
> in the temp file, the code will then look for it in the backing file.
> Still researching this area.
Right now, the only thing that qemu reads from a backing file is a guest
cluster. L1/L2 clusters have to be local to the file that they are
describing (there is no way to make an L2 table fall back to the
contents of a different cluster in the backing file). It boils down to:

Reads:
Does the active layer have an L2 mapping for the current cluster being
read? Yes - read that cluster. No - ask the backing layer to provide
the contents of that cluster (and if copy-on-read is enabled, also write
those contents in a fresh allocation so that the current layer no longer
has to defer to the backing).

Writes:
Does the active layer have an L2 mapping for the current cluster
containing the data being written? Yes - modify that cluster in place.
No - allocate a new cluster, and if the write was for less than a full
cluster, also ask the backing layer to provide the contents of the rest
of the cluster for a copy-on-write action. After the write, the current
layer no longer has to defer to the backing.
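
As a rough illustration of the read and write rules above, here is a
minimal Python sketch. It is a toy model, not qemu's implementation:
the Layer class, the 4-byte CLUSTER_SIZE, and the plain dict standing
in for the L2 mapping are all invented for the example, and it assumes
that a cluster with no mapping and no backing layer reads as zeroes.

```python
CLUSTER_SIZE = 4  # bytes per cluster; tiny on purpose (hypothetical value)


class Layer:
    """Toy stand-in for one image layer; not qemu's real data structures."""

    def __init__(self, backing=None, copy_on_read=False):
        self.mapping = {}        # cluster index -> bytearray (plays the L2 mapping)
        self.backing = backing   # another Layer, or None
        self.copy_on_read = copy_on_read

    def read_cluster(self, index):
        # Mapped locally: read that cluster.
        if index in self.mapping:
            return bytes(self.mapping[index])
        # Not mapped and nothing to defer to: assume zeroes.
        if self.backing is None:
            return bytes(CLUSTER_SIZE)
        # Not mapped: ask the backing layer for the contents.
        data = self.backing.read_cluster(index)
        if self.copy_on_read:
            # Keep a local copy so later reads no longer defer to the backing.
            self.mapping[index] = bytearray(data)
        return data

    def write(self, index, offset, payload):
        if offset < 0 or offset + len(payload) > CLUSTER_SIZE:
            raise ValueError("write must stay inside one cluster")
        if index not in self.mapping:
            if len(payload) < CLUSTER_SIZE and self.backing is not None:
                # Partial write: copy-on-write the rest from the backing layer.
                base = self.backing.read_cluster(index)
            else:
                base = bytes(CLUSTER_SIZE)
            self.mapping[index] = bytearray(base)  # the new allocation
        # Mapped (possibly just now): modify the cluster in place.
        self.mapping[index][offset:offset + len(payload)] = payload


base = Layer()
base.write(0, 0, b"AAAA")
top = Layer(backing=base)
top.write(0, 1, b"zz")   # partial write triggers copy-on-write in `top`
```

After the partial write, `top` holds its own copy of cluster 0 and the
backing layer is untouched, which is the property that lets many
overlays share one read-only base.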

Creating an arbitrary qcow2 file on top of any arbitrary read-only
backing layer (including 'qemu-nbd -l snap image') should be doable, even
if verbose (since the "backing file" of a qcow2 BDS node can be any
other BDS). Providing some shorter command lines, like making 'qemu-nbd
-s -l snap image' work so that you don't have to provide your own manual
overlay, is thus not a high priority.
>
>>>
>>> I still don't have this working yet and I believe my problem area is
>>> qcow2_update_snapshot_refcount. Can anyone explain what this does
>>> exactly? It seems the function does three different things based on the
>>> value of addend, either -1, 0, or 1, but it's somewhat unclear.
>>
>> Every cluster of qcow2 is reference-counted, to track which portions of the
>> file are (supposed to be) in use according to following the metadata trails.
>> When internal snapshots are used, this is implemented by incrementing the
>> refcount for each cluster that is reachable both from the snapshot and from
>> the current L1 table (update_snapshot_refcount +1), then when writing to the
>> cluster we break the reference count by writing the new data to a new
>> allocation and decrementing the reference count of the old cluster. When
>> trimming clusters, we decrement the refcount, and if it goes to 0 the
>> cluster can be reused for something else.
>
> I think I understand this. That would cover addend being -1 or 1.
> I am still unclear why you would call the function with addend being 0.
An addend of 0 allows a couple of callers to temporarily have an
inconsistent image for the sake of optimizing a bulk allocation/freeing,
followed by informing the refcount table to match, with fewer changes to
the cluster containing the refcounts than if the algorithm had to
accurately use -1/+1 on a per-cluster basis.
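
For the +1/-1 cases discussed in the quoted explanation, a toy Python
model may make the bookkeeping easier to follow. Everything here
(ToyImage, its dicts, the allocation counter) is made up for
illustration and is not how qemu stores refcounts; the addend == 0 bulk
case is deliberately left out because this sketch has no on-disk
refcount table to batch updates against.

```python
from collections import Counter


class ToyImage:
    """Hypothetical model of per-cluster refcounts with internal snapshots."""

    def __init__(self):
        self.refcount = Counter()  # host cluster -> reference count
        self.active = {}           # guest cluster -> host cluster (current tables)
        self.snapshots = {}        # snapshot name -> frozen copy of the mapping
        self.data = {}             # host cluster -> payload
        self.next_free = 0         # naive allocator: never reuses clusters

    def _allocate(self, payload):
        host = self.next_free
        self.next_free += 1
        self.refcount[host] = 1
        self.data[host] = payload
        return host

    def _update_refcount(self, host, addend):
        self.refcount[host] += addend
        if self.refcount[host] == 0:
            # Refcount reached 0: the cluster is no longer in use.
            del self.refcount[host]
            del self.data[host]

    def snapshot(self, name):
        # Every cluster reachable from both the snapshot and the active
        # mapping gains one reference (the addend == +1 case).
        self.snapshots[name] = dict(self.active)
        for host in self.active.values():
            self._update_refcount(host, +1)

    def write(self, guest, payload):
        old = self.active.get(guest)
        if old is not None and self.refcount[old] == 1:
            self.data[old] = payload  # sole owner: modify in place
            return
        # Shared or unmapped: write to a new allocation, then drop the
        # active mapping's reference to the old cluster (addend == -1).
        self.active[guest] = self._allocate(payload)
        if old is not None:
            self._update_refcount(old, -1)

    def discard(self, guest):
        # Trimming a cluster also decrements its refcount.
        host = self.active.pop(guest, None)
        if host is not None:
            self._update_refcount(host, -1)


img = ToyImage()
img.write(0, b"v1")
img.snapshot("snap")
img.write(0, b"v2")  # breaks sharing; the snapshot keeps the old cluster
```

After the second write the snapshot still sees b"v1" through the old
cluster, whose refcount has dropped back to 1, while the active mapping
points at a fresh cluster holding b"v2".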
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org