From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55190) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bRrdX-0001cj-W5 for qemu-devel@nongnu.org; Mon, 25 Jul 2016 21:58:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bRrdV-0001Ip-NH for qemu-devel@nongnu.org; Mon, 25 Jul 2016 21:58:22 -0400 Message-ID: <5796C4F2.5060904@cn.fujitsu.com> Date: Tue, 26 Jul 2016 10:03:30 +0800 From: Changlong Xie MIME-Version: 1.0 References: <1469182567-3114-1-git-send-email-wangww.fnst@cn.fujitsu.com> <1469182567-3114-4-git-send-email-wangww.fnst@cn.fujitsu.com> In-Reply-To: Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH v22 03/10] Backup: export interfaces for extra serialization List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Max Reitz , Wang WeiWei , qemu devel , qemu block , Stefan Hajnoczi , Fam Zheng , Kevin Wolf , Jeff Cody Cc: Paolo Bonzini , John Snow , Eric Blake , Markus Armbruster , "Dr. David Alan Gilbert" , Dong Eddie , Jiang Yunhong , zhanghailiang , Gonglei , Wen Congyang On 07/26/2016 05:50 AM, Max Reitz wrote: > On 22.07.2016 12:16, Wang WeiWei wrote: >> From: Changlong Xie >> >> Normal backup(sync='none') workflow: >> step 1. NBD peformance I/O write from client to server >> qcow2_co_writev >> bdrv_co_writev >> ... >> bdrv_aligned_pwritev >> notifier_with_return_list_notify -> backup_do_cow >> bdrv_driver_pwritev // write new contents >> >> step 2. drive-backup sync=none >> backup_do_cow >> { >> wait_for_overlapping_requests >> cow_request_begin >> for(; start < end; start++) { >> bdrv_co_readv_no_serialising //read old contents from Secondary disk >> bdrv_co_writev // write old contents to hidden-disk >> } >> cow_request_end >> } >> >> step 3. Then roll back to "step 1" to write new contents to Secondary disk. >> >> And for replication, we must make sure that we only read the old contents from >> Secondary disk in order to keep contents consistent. >> >> 1) Replication workflow of Secondary >> virtio-blk >> ^ >> -------> 1 NBD | >> || server 3 replication >> || ^ ^ >> || | backing backing | >> || Secondary disk 6<-------- hidden-disk 5 <-------- active-disk 4 >> || | ^ >> || '-------------------------' >> || drive-backup sync=none 2 >> >> Hence, we need these interfaces to implement coarse-grained serialization between >> COW of Secondary disk and the read operation of replication. >> >> Example codes about how to use them: >> >> *#include "block/block_backup.h" >> >> static coroutine_fn int xxx_co_readv() >> { >> CowRequest req; >> BlockJob *job = secondary_disk->bs->job; >> >> if (job) { >> backup_wait_for_overlapping_requests(job, start, end); >> backup_cow_request_begin(&req, job, start, end); >> ret = bdrv_co_readv(); >> backup_cow_request_end(&req); >> goto out; >> } >> ret = bdrv_co_readv(); >> out: >> return ret; >> } >> >> Signed-off-by: Changlong Xie >> Signed-off-by: Wen Congyang >> Signed-off-by: Wang WeiWei >> --- >> block/backup.c | 41 ++++++++++++++++++++++++++++++++++------- >> include/block/block_backup.h | 14 ++++++++++++++ >> 2 files changed, 48 insertions(+), 7 deletions(-) >> >> diff --git a/block/backup.c b/block/backup.c >> index 3bce416..919b63a 100644 >> --- a/block/backup.c >> +++ b/block/backup.c >> @@ -28,13 +28,6 @@ >> #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16) >> #define SLICE_TIME 100000000ULL /* ns */ >> >> -typedef struct CowRequest { >> - int64_t start; >> - int64_t end; >> - QLIST_ENTRY(CowRequest) list; >> - CoQueue wait_queue; /* coroutines blocked on this request */ >> -} CowRequest; >> - >> typedef struct BackupBlockJob { >> BlockJob common; >> BlockBackend *target; >> @@ -271,6 +264,40 @@ void backup_do_checkpoint(BlockJob *job, Error **errp) >> bitmap_zero(backup_job->done_bitmap, len); >> } >> >> +void backup_wait_for_overlapping_requests(BlockJob *job, int64_t sector_num, >> + int nb_sectors) >> +{ >> + BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common); >> + int64_t sectors_per_cluster = cluster_size_sectors(backup_job); > > It's not really ideal to use backup_job here... For COLO, Secondary disk always does backup job before failover. > >> + int64_t start, end; >> + >> + assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP); > > ...and then only here assert that it's actually a valid object pointer. > > Not catastrophic, though, since it's an assertion and thus deemed > impossible to go wrong anyway. This is just what I thought > > Max > >> + >> + start = sector_num / sectors_per_cluster; >> + end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster); >> + wait_for_overlapping_requests(backup_job, start, end); >> +} >> + >> +void backup_cow_request_begin(CowRequest *req, BlockJob *job, >> + int64_t sector_num, >> + int nb_sectors) >> +{ >> + BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common); >> + int64_t sectors_per_cluster = cluster_size_sectors(backup_job); >> + int64_t start, end; >> + >> + assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP); >> + >> + start = sector_num / sectors_per_cluster; >> + end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster); >> + cow_request_begin(req, backup_job, start, end); >> +} >> + >> +void backup_cow_request_end(CowRequest *req) >> +{ >> + cow_request_end(req); >> +} >> + >> static const BlockJobDriver backup_job_driver = { >> .instance_size = sizeof(BackupBlockJob), >> .job_type = BLOCK_JOB_TYPE_BACKUP, >> diff --git a/include/block/block_backup.h b/include/block/block_backup.h >> index 3753bcb..e0e7ce6 100644 >> --- a/include/block/block_backup.h >> +++ b/include/block/block_backup.h >> @@ -1,3 +1,17 @@ >> #include "block/block_int.h" >> >> +typedef struct CowRequest { >> + int64_t start; >> + int64_t end; >> + QLIST_ENTRY(CowRequest) list; >> + CoQueue wait_queue; /* coroutines blocked on this request */ >> +} CowRequest; >> + >> +void backup_wait_for_overlapping_requests(BlockJob *job, int64_t sector_num, >> + int nb_sectors); >> +void backup_cow_request_begin(CowRequest *req, BlockJob *job, >> + int64_t sector_num, >> + int nb_sectors); >> +void backup_cow_request_end(CowRequest *req); >> + >> void backup_do_checkpoint(BlockJob *job, Error **errp); >> > >