From: John Snow <jsnow@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-devel@nongnu.org
Cc: famz@redhat.com, den@virtuozzo.com,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH RFC] external backup api
Date: Tue, 9 Feb 2016 11:51:57 -0500
Message-ID: <56BA192D.3050700@redhat.com>
In-Reply-To: <56BA0BA0.2060302@virtuozzo.com>



On 02/09/2016 10:54 AM, Vladimir Sementsov-Ogievskiy wrote:
> On 09.02.2016 00:14, John Snow wrote:
>>
>> On 02/06/2016 04:19 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> On 05.02.2016 22:48, John Snow wrote:
>>>> On 01/22/2016 12:07 PM, Vladimir Sementsov-Ogievskiy wrote:
>>>>> Hi all.
>>>>>
>>>>> This is an early start of the series, which aims to add an external
>>>>> backup API. This is needed to allow backup software to use our dirty
>>>>> bitmaps.
>>>>>
>>>>> VMware and Parallels Cloud Server have this feature.
>>>>>
>>>> Do you have a link to the equivalent feature that VMware exposes (or
>>>> Parallels Cloud Server)? I'm curious about what the API there looks
>>>> like.
>>> For VMware you need their Virtual Disk API Programming Guide:
>>> http://pubs.vmware.com/vsphere-60/topic/com.vmware.ICbase/PDF/vddk60_programming.pdf
>>>
>>>
>> Great, thanks!
>>
>>> Look at Changed Block Tracking (CBT), Backup and Restore.
>>>
>>> For PCS here is part of SDK header, related to the topic:
>>>
>>> ====================================
>>> /*
>>>   * Builds a map of the disk contents changes between 2 PITs.
>>>     Parameters
>>>     hDisk :       A handle of type PHT_VIRTUAL_DISK identifying
>>>                   the virtual disk.
>>>     sPit1Uuid :   Uuid of the older PIT.
>>>     sPit2Uuid :   Uuid of the later PIT.
>>>     phMap :       A pointer to a variable which receives the
>>>           result (a handle of type PHT_VIRTUAL_DISK_MAP).
>>>     Returns
>>>     PRL_RESULT.
>>> */
>>> PRL_METHOD_DECL( PARALLELS_API_VER_5,
>>>                  PrlDisk_GetChangesMap_Local, (
>>>          PRL_HANDLE hDisk,
>>>          PRL_CONST_STR sPit1Uuid,
>>>          PRL_CONST_STR sPit2Uuid,
>>>          PRL_HANDLE_PTR phMap) );
>>>
>> This effectively gives you a dirty bitmap diff between two snapshots,
>> something QEMU does not currently support.
> 
> Just start a dirty bitmap at point A and stop it at point B.
> 
>>
>>> /*
>>>   * Reports the number of significant bits in the map.
>>>     Parameters
>>>     hMap :        A handle of type PHT_VIRTUAL_DISK_MAP identifying
>>>                   the changes map.
>>>     pnSize :      A pointer to a variable which receives the
>>>           result.
>>>     Returns
>>>     PRL_RESULT.
>>> */
>>> PRL_METHOD_DECL( PARALLELS_API_VER_5,
>>>                  PrlDiskMap_GetSize, (
>>>          PRL_HANDLE hMap,
>>>          PRL_UINT32_PTR pnSize) );
>>>
>> I assume this is roughly the dirty bit count; for us, this would be
>> dirty clusters (or whatever granularity you specified, but usually
>> clusters).
>>
>>> /*
>>>   * Reports the size (in bytes) of a block mapped by a single bit
>>>   * in the map.
>>>     Parameters
>>>     hMap :        A handle of type PHT_VIRTUAL_DISK_MAP identifying
>>>                   the changes map.
>>>     pnSize :      A pointer to a variable which receives the
>>>           result.
>>>     Returns
>>>     PRL_RESULT.
>>> */
>>> PRL_METHOD_DECL( PARALLELS_API_VER_5,
>>>                  PrlDiskMap_GetGranularity, (
>>>          PRL_HANDLE hMap,
>>>          PRL_UINT32_PTR pnSize) );
>>>
>> Basically a granularity query.
>>
>>> /*
>>>   * Returns bits from the blocks map.
>>>     Parameters
>>>     hMap :        A handle of type PHT_VIRTUAL_DISK_MAP identifying
>>>                   the changes map.
>>>     pBuffer :     A pointer to a store.
>>>     pnCapacity :  A pointer to a variable holding the size
>>>           of the buffer and receiving the number of
>>>           bytes actually written.
>>>     Returns
>>>     PRL_RESULT.
>>> */
>>> PRL_METHOD_DECL( PARALLELS_API_VER_5,
>>>                  PrlDiskMap_Read, (
>>>          PRL_HANDLE hMap,
>>>          PRL_VOID_PTR pBuffer,
>>>          PRL_UINT32_PTR pnCapacity) );
>>>
>> And this would be a direct bitmap query.
>>
>> Is the expected usage here that the third-party client will use this
>> bitmap to read the source image itself? Or do you query for the data
>> through the API?
> 
> - Through the API.
> 
>>
>> I think block devs would lean toward the second option rather than
>> allowing clients to interface directly with the image files.
>>
>>> =======================================
>>>
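To make the map queries quoted above concrete: a client that has the map
bits and the granularity could turn them into byte ranges roughly like
this. This is only a minimal sketch. DirtyExtent and map_to_extents are
made-up names, and the bit order (bit i stored in byte i / 8, least
significant bit first) is an assumption; the header does not say how the
bits are packed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A contiguous changed byte range on the virtual disk (hypothetical type). */
typedef struct {
    uint64_t offset;
    uint64_t length;
} DirtyExtent;

/*
 * Walk a change map and coalesce runs of set bits into byte extents.
 * ASSUMPTION: bit i lives in byte i / 8 at position i % 8 (LSB first).
 * Returns the number of extents found; at most max_out are stored in out.
 */
size_t map_to_extents(const uint8_t *map, uint32_t nbits,
                      uint32_t granularity,
                      DirtyExtent *out, size_t max_out)
{
    size_t n = 0;
    uint32_t i = 0;

    assert(granularity > 0);
    while (i < nbits) {
        if (!((map[i / 8] >> (i % 8)) & 1)) {
            i++;                      /* clean block, skip it */
            continue;
        }
        uint32_t start = i;           /* first dirty block of this run */
        while (i < nbits && ((map[i / 8] >> (i % 8)) & 1)) {
            i++;
        }
        if (n < max_out) {
            out[n].offset = (uint64_t)start * granularity;
            out[n].length = (uint64_t)(i - start) * granularity;
        }
        n++;
    }
    return n;
}
```

With nbits taken from the size query and granularity from the
granularity query, each extent is a range the client would then request
through whatever data path the API offers.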
>>>
>>>>> There is only one patch here, about querying a dirty bitmap from QEMU
>>>>> via a QMP command. It is just an updated and trimmed version (HMP
>>>>> command removed) of my old patch
>>>>> "[PATCH RFC v3 01/14] qmp: add query-block-dirty-bitmap".
>>>>>
>>>>> Before writing the whole thing I'd like to discuss the details. Or
>>>>> maybe there are existing plans on this topic, or maybe someone is
>>>>> already working on it?
>>>>>
>>>>> I see it like this:
>>>>>
>>>>> =====
>>>>>
>>>>> - add qmp commands for dirty-bitmap functions: create_successor,
>>>>>   abdicate, reclaim.
>>>> Hm, why do we need such low-level control over splitting and merging
>>>> bitmaps from an external client?
>>>>
>>>>> - make create-successor command transaction-able
>>>>> - add query-block-dirty-bitmap qmp command
>>>>>
>>>>> then, external backup:
>>>>>
>>>>> qmp transaction {
>>>>>       external-snapshot
>>>>>       bitmap-create-successor
>>>>> }
>>>>>
>>>>> qmp query frozen bitmap, not acquiring aio context.
>>>>>
>>>>> do external backup, using snapshot and bitmap
>>>>>
>>>>> if (success backup)
>>>>>       qmp bitmap-abdicate
>>>>> else
>>>>>       qmp bitmap-reclaim
>>>>>
>>>>> qmp merge snapshot
>>>>> =====
>>>>>
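For readers following along, here is how I understand the
successor/abdicate/reclaim flow in the proposal above, as a toy model.
It is a sketch under assumptions, not QEMU's dirty-bitmap code: ToyBitmap
and the toy_* functions are made-up names, the bitmap is a single 64-bit
word, and the semantics assumed are that a successor collects new writes
while the parent is frozen, abdicate keeps only the successor, and
reclaim merges the successor back into the parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy bitmap: one bit per cluster, at most 64 clusters (hypothetical). */
typedef struct {
    uint64_t bits;       /* changes recorded up to the freeze point */
    uint64_t successor;  /* changes recorded while the bitmap is frozen */
    bool frozen;
} ToyBitmap;

/* Guest write: lands in the successor if the bitmap is frozen. */
void toy_mark_dirty(ToyBitmap *b, unsigned cluster)
{
    assert(cluster < 64);
    if (b->frozen) {
        b->successor |= UINT64_C(1) << cluster;
    } else {
        b->bits |= UINT64_C(1) << cluster;
    }
}

/* Freeze the bitmap so a backup can read a stable copy of it. */
void toy_create_successor(ToyBitmap *b)
{
    assert(!b->frozen);
    b->successor = 0;
    b->frozen = true;
}

/* Backup succeeded: the frozen contents are no longer needed. */
void toy_abdicate(ToyBitmap *b)
{
    assert(b->frozen);
    b->bits = b->successor;
    b->successor = 0;
    b->frozen = false;
}

/* Backup failed: keep the old changes and add the new ones. */
void toy_reclaim(ToyBitmap *b)
{
    assert(b->frozen);
    b->bits |= b->successor;
    b->successor = 0;
    b->frozen = false;
}
```

The point of the model is that no write is lost on either path: after a
failed backup the next attempt still sees every change since the last
successful one.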
>>>> Hm, I see -- so you're hoping to manage the backup *entirely*
>>>> externally, so you want to be able to reach inside of QEMU and control
>>>> some status conditions to guarantee it'll be safe.
>>>>
>>>> I'm not convinced QEMU can guarantee such things -- due to various
>>>> flush properties, race conditions on write, etc. QEMU handles all of
>>>> this internally in a non-public way at the moment.
>>> Hm, can you be more concrete? What operations are dangerous? We can do
>>> them in a paused state, for example.
>>>
>> I suppose if you're going to pause the VM, then it should be reasonably
>> safe, but recently there have been endeavors to augment the .qcow2
>> format to prohibit concurrent access, which might include a paused VM as
>> well; I'm not clear on the implementation.
>>
>> If you do it only while paused, then you also don't need to expose the
>> freeze/rollback mechanisms; the existing clear mechanism alone is
>> sufficient:
>>
>> (A) The frozen backup fails. Nothing new has been written, so we don't
>> need to adjust anything; we can just try again.
>> (B) The frozen backup succeeds. We can just clear the bitmap before
>> unfreezing.
> 
> We can't query the bitmap in a paused state; it may take too much time.
> 

And I think it is currently unsafe to fetch the data from disk while the
VM is running, so you'll have to solve one or the other problem...

>>
>> I definitely have reservations about using this as a live fleecing
>> mechanism -- the backup block job uses a write-notifier to make
>> just-in-time backups of data before it is altered, which makes it
>> (alongside mirror) the only "safe" live backup mechanism in QEMU
>> currently.
>>
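To spell out what the write-notifier buys you, here is a toy
copy-before-write sketch. It is not the backup job's actual code:
ToyBackup, toy_backup_cluster and toy_guest_write are made-up names, and
the "disk" is a small in-memory array. The only thing it is meant to show
is the ordering: the old contents of a cluster are saved before a guest
write is allowed to overwrite them.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { NB_CLUSTERS = 4, CLUSTER_SIZE = 8 };

/* Toy backup job state (hypothetical). */
typedef struct {
    char source[NB_CLUSTERS][CLUSTER_SIZE];  /* live disk */
    char target[NB_CLUSTERS][CLUSTER_SIZE];  /* point-in-time copy */
    bool copied[NB_CLUSTERS];                /* already saved? */
} ToyBackup;

/* Save the point-in-time contents of one cluster, at most once. */
void toy_backup_cluster(ToyBackup *job, unsigned c)
{
    assert(c < NB_CLUSTERS);
    if (!job->copied[c]) {
        memcpy(job->target[c], job->source[c], CLUSTER_SIZE);
        job->copied[c] = true;
    }
}

/*
 * Guest write path. The "notifier" step runs first, so the target always
 * receives the data as it was when the backup started.
 * data must point to CLUSTER_SIZE bytes.
 */
void toy_guest_write(ToyBackup *job, unsigned c, const char *data)
{
    toy_backup_cluster(job, c);               /* copy before write */
    memcpy(job->source[c], data, CLUSTER_SIZE);
}
```

An external reader that fetches clusters straight from the live image has
no equivalent of the first step in toy_guest_write, which is where my
concern about reading while the VM is running comes from.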
>> I actually have some patches from Fam to introduce a live fleecing
>> mechanism into QEMU (The idea being you create a point-in-time drive you
>> can get data from via NBD, then delete it when done) that might be more
>> appropriate, but I ran into a lot of problems with the patch. I'll post
>> the WIP for that patch to try to solicit comments on the best way
>> forward.
> 
> After adding
> =============
> --- a/block.c
> +++ b/block.c
> @@ -1276,6 +1276,9 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
>      /* Otherwise we won't be able to commit due to check in bdrv_commit */
>      bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_COMMIT_TARGET,
>                      bs->backing_blocker);
> +
> +    bdrv_op_unblock(backing_hd, BLOCK_OP_TYPE_BACKUP_SOURCE,
> +                    bs->backing_blocker);
>  out:
>      bdrv_refresh_limits(bs, NULL);
>  }
> ==============
> and a tiny fix for the qemu_io interface in the iotest,
> 
> Fam's "qemu-iotests: Image fleecing test case 089" works for me. Isn't
> it enough?
> 
> 
> 
>>
>> Otherwise, my biggest question here is:
>> "What does fleecing a backup externally provide as a benefit over
>> backing up to an NBD target?"
> 
> Look at our answers on v2 of this series:
> 
> On 05.02.2016 11:28, Denis V. Lunev wrote:
>> On 02/03/2016 11:14 AM, Fam Zheng wrote:
>>> On Sat, 01/30 13:56, Vladimir Sementsov-Ogievskiy wrote:
>>>> Hi all.
>>>>
>>>> This series aims to add an external backup API. This is needed to
>>>> allow backup software to use our dirty bitmaps.
>>>>
>>>> Vmware and Parallels Cloud Server have this feature.
>>> What is the advantage of this approach over "drive-backup
>>> sync=incremental ..."?
>>
>> This will allow third-party vendors to back up QEMU VMs into
>> their own formats or to the cloud etc.
> 
> 
>>
>> You can already perform incremental backups to an NBD target today to
>> copy the data out via an external mechanism. Is this not sufficient for
>> Parallels? If not, why?
>>
>>>>> In the following patch, query-bitmap acquires the aio context. This
>>>>> must, of course, be dropped for a frozen bitmap.
>>>>> But to do it properly, I think I should somehow check that this is
>>>>> not just a frozen bitmap, but a bitmap frozen by a qmp command, to
>>>>> avoid incorrectly querying a bitmap frozen by an internal backup (or
>>>>> another mechanism). Maybe it is not necessary.
>>>>>
>>>>>
>>>>>
>>>>
> 
> 

Thread overview: 11+ messages
2016-01-22 17:07 [Qemu-devel] [PATCH RFC] external backup api Vladimir Sementsov-Ogievskiy
2016-01-22 17:07 ` [Qemu-devel] [PATCH] qmp: add query-block-dirty-bitmap Vladimir Sementsov-Ogievskiy
2016-01-22 17:22   ` Denis V. Lunev
2016-01-22 17:28     ` Vladimir Sementsov-Ogievskiy
2016-01-22 18:28       ` Denis V. Lunev
2016-01-22 18:43   ` Eric Blake
2016-02-05 19:48 ` [Qemu-devel] [PATCH RFC] external backup api John Snow
2016-02-06  9:19   ` Vladimir Sementsov-Ogievskiy
2016-02-08 21:14     ` John Snow
2016-02-09 15:54       ` Vladimir Sementsov-Ogievskiy
2016-02-09 16:51         ` John Snow [this message]
