[Qemu-devel] block replication

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] block replication
@ 2017-08-09 14:11 Vladimir Sementsov-Ogievskiy
  2017-08-10 12:26 ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 3+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-08-09 14:11 UTC (permalink / raw)
  To: wency; +Cc: qemu-devel, qemu block, John Snow

Hi Wen!

I'm trying to understand block/replication code and have a question.

Why should we block the region from intersecting cow requests when read? 
If I understand correctly

regardless of writes to the secondary-disk we have consistent view of it 
through hidden-disk:

Even if we are intersecting with some writes to secondary-disk (and 
corresponding cow-requests), the

data in secondary disk will not be updated until backed up to 
hidden-disk, therefore, for read we have two

options:

1. read old data from secondary-disk (unallocated region in hidden-disk 
means data in secondary-disk is not updated yet)

2. read backed-up data from hidden-disk (data in secondary-disk may be 
already updated but we don't care)

(the whole region to read may consists of parts, corresponding to 1 or 
2, but this should be ok too)

Where am I wrong?


======

static coroutine_fn int replication_co_readv(BlockDriverState *bs,
                                              int64_t sector_num,
                                              int remaining_sectors,
                                              QEMUIOVector *qiov)
{
     BDRVReplicationState *s = bs->opaque;
     BdrvChild *child = s->secondary_disk;
     BlockJob *job = NULL;
     CowRequest req;
     int ret;

     if (s->mode == REPLICATION_MODE_PRIMARY) {
         /* We only use it to forward primary write requests */
         return -EIO;
}

     ret = replication_get_io_status(s);
     if (ret < 0) {
         return ret;
}

     if (child && child->bs) {
         job = child->bs->job;
}

     if (job) {
         uint64_t remaining_bytes = remaining_sectors * BDRV_SECTOR_SIZE;

backup_wait_for_overlapping_requests(child->bs->job,
                                              sector_num * 
BDRV_SECTOR_SIZE,
remaining_bytes);
         backup_cow_request_begin(&req, child->bs->job,
                                  sector_num * BDRV_SECTOR_SIZE,
remaining_bytes);
         ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors,
qiov);
backup_cow_request_end(&req);
         goto out;
}

     ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors, qiov);
out:
     return replication_return_value(s, ret);
}

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] block replication
  2017-08-09 14:11 [Qemu-devel] block replication Vladimir Sementsov-Ogievskiy
@ 2017-08-10 12:26 ` Vladimir Sementsov-Ogievskiy
  2017-08-11  1:51   ` Xie Changlong
  0 siblings, 1 reply; 3+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-08-10 12:26 UTC (permalink / raw)
  To: wency; +Cc: qemu-devel, qemu block, John Snow, Stefan Hajnoczi

09.08.2017 17:11, Vladimir Sementsov-Ogievskiy wrote:
> Hi Wen!
>
> I'm trying to understand block/replication code and have a question.
>
> Why should we block the region from intersecting cow requests when 
> read? If I understand correctly
>
> regardless of writes to the secondary-disk we have consistent view of 
> it through hidden-disk:
>
> Even if we are intersecting with some writes to secondary-disk (and 
> corresponding cow-requests), the
>
> data in secondary disk will not be updated until backed up to 
> hidden-disk, therefore, for read we have two
>
> options:
>
> 1. read old data from secondary-disk (unallocated region in 
> hidden-disk means data in secondary-disk is not updated yet)
>
> 2. read backed-up data from hidden-disk (data in secondary-disk may be 
> already updated but we don't care)
>
> (the whole region to read may consists of parts, corresponding to 1 or 
> 2, but this should be ok too)
>
> Where am I wrong?

Ok, now I think this is needed to prevent intersecting of writes and 
reads on hidden-disk. If it so, I think it is better to use serializing 
requests
mechanism (just serialize all requests on hidden-disk, and on write wait 
for all intersecting serializing requests, on read wait for intersecting 
serializing writes) - it may require additional
option for BlockDriverState, but it is more generic and more clear than 
export internal backup things to lock disk region. This also can be 
reused for image-fleecing scheme
(which is based on same pattern  [active-disk is backing for temp-disk, 
backup sync=none from active to temp, read from temp])

>
>
> ======
>
> static coroutine_fn int replication_co_readv(BlockDriverState *bs,
>                                              int64_t sector_num,
>                                              int remaining_sectors,
>                                              QEMUIOVector *qiov)
> {
>     BDRVReplicationState *s = bs->opaque;
>     BdrvChild *child = s->secondary_disk;
>     BlockJob *job = NULL;
>     CowRequest req;
>     int ret;
>
>     if (s->mode == REPLICATION_MODE_PRIMARY) {
>         /* We only use it to forward primary write requests */
>         return -EIO;
> }
>
>     ret = replication_get_io_status(s);
>     if (ret < 0) {
>         return ret;
> }
>
>     if (child && child->bs) {
>         job = child->bs->job;
> }
>
>     if (job) {
>         uint64_t remaining_bytes = remaining_sectors * BDRV_SECTOR_SIZE;
>
> backup_wait_for_overlapping_requests(child->bs->job,
>                                              sector_num * 
> BDRV_SECTOR_SIZE,
> remaining_bytes);
>         backup_cow_request_begin(&req, child->bs->job,
>                                  sector_num * BDRV_SECTOR_SIZE,
> remaining_bytes);
>         ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors,
> qiov);
> backup_cow_request_end(&req);
>         goto out;
> }
>
>     ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors, qiov);
> out:
>     return replication_return_value(s, ret);
> }
>

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Qemu-devel] block replication
  2017-08-10 12:26 ` Vladimir Sementsov-Ogievskiy
@ 2017-08-11  1:51   ` Xie Changlong
  0 siblings, 0 replies; 3+ messages in thread
From: Xie Changlong @ 2017-08-11  1:51 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Wen Congyang
  Cc: Stefan Hajnoczi, John Snow, qemu-devel, qemu block, Zhang Chen,
	Zhang Hailiang, Li Zhijian

在 8/10/2017 8:26 PM, Vladimir Sementsov-Ogievskiy 写道:
> 09.08.2017 17:11, Vladimir Sementsov-Ogievskiy wrote:
>> Hi Wen!
>>
>> I'm trying to understand block/replication code and have a question.
>>
>> Why should we block the region from intersecting cow requests when 
>> read? If I understand correctly
>>
>> regardless of writes to the secondary-disk we have consistent view of 
>> it through hidden-disk:
>>
>> Even if we are intersecting with some writes to secondary-disk (and 
>> corresponding cow-requests), the
>>
>> data in secondary disk will not be updated until backed up to 
>> hidden-disk, therefore, for read we have two
>>
>> options:
>>
>> 1. read old data from secondary-disk (unallocated region in 
>> hidden-disk means data in secondary-disk is not updated yet)
>>
>> 2. read backed-up data from hidden-disk (data in secondary-disk may be 
>> already updated but we don't care)
>>
>> (the whole region to read may consists of parts, corresponding to 1 or 
>> 2, but this should be ok too)
>>
>> Where am I wrong?
> 
> Ok, now I think this is needed to prevent intersecting of writes and 
> reads on hidden-disk. If it so, I think it is better to use serializing 

Hi, Vladimir. Pls refer 
https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg03025.html

BTW, wen's email has changed, also CC Zhang Hailiang, Zhang Chen, Li Zhijian

> requests
> mechanism (just serialize all requests on hidden-disk, and on write wait 
> for all intersecting serializing requests, on read wait for intersecting 
> serializing writes) - it may require additional
> option for BlockDriverState, but it is more generic and more clear than 
> export internal backup things to lock disk region. This also can be 
> reused for image-fleecing scheme
> (which is based on same pattern  [active-disk is backing for temp-disk, 
> backup sync=none from active to temp, read from temp])
> 
>>
>>
>> ======
>>
>> static coroutine_fn int replication_co_readv(BlockDriverState *bs,
>>                                              int64_t sector_num,
>>                                              int remaining_sectors,
>>                                              QEMUIOVector *qiov)
>> {
>>     BDRVReplicationState *s = bs->opaque;
>>     BdrvChild *child = s->secondary_disk;
>>     BlockJob *job = NULL;
>>     CowRequest req;
>>     int ret;
>>
>>     if (s->mode == REPLICATION_MODE_PRIMARY) {
>>         /* We only use it to forward primary write requests */
>>         return -EIO;
>> }
>>
>>     ret = replication_get_io_status(s);
>>     if (ret < 0) {
>>         return ret;
>> }
>>
>>     if (child && child->bs) {
>>         job = child->bs->job;
>> }
>>
>>     if (job) {
>>         uint64_t remaining_bytes = remaining_sectors * BDRV_SECTOR_SIZE;
>>
>> backup_wait_for_overlapping_requests(child->bs->job,
>>                                              sector_num * 
>> BDRV_SECTOR_SIZE,
>> remaining_bytes);
>>         backup_cow_request_begin(&req, child->bs->job,
>>                                  sector_num * BDRV_SECTOR_SIZE,
>> remaining_bytes);
>>         ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors,
>> qiov);
>> backup_cow_request_end(&req);
>>         goto out;
>> }
>>
>>     ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors, qiov);
>> out:
>>     return replication_return_value(s, ret);
>> }
>>
> 

-- 
Thanks
     -Xie

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2017-08-11  1:52 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-09 14:11 [Qemu-devel] block replication Vladimir Sementsov-Ogievskiy
2017-08-10 12:26 ` Vladimir Sementsov-Ogievskiy
2017-08-11  1:51   ` Xie Changlong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).