From: Changlong Xie
Date: Tue, 10 May 2016 17:49:08 +0800
To: Stefan Hajnoczi
Cc: qemu devel, Eric Blake, Alberto Garcia, Kevin Wolf, Max Reitz,
 Stefan Hajnoczi, qemu block, Jiang Yunhong, Dong Eddie,
 Markus Armbruster, "Dr. David Alan Gilbert"
Subject: Re: [Qemu-devel] [PATCH v18 7/8] Implement new driver for block replication
Message-ID: <5731AE94.7030605@cn.fujitsu.com>
In-Reply-To: <20160506154641.GA23075@stefanha-x1.localdomain>
References: <1460707838-13510-1-git-send-email-xiecl.fnst@cn.fujitsu.com>
 <1460707838-13510-8-git-send-email-xiecl.fnst@cn.fujitsu.com>
 <20160506154641.GA23075@stefanha-x1.localdomain>

On 05/06/2016 11:46 PM, Stefan Hajnoczi wrote:
> On Fri, Apr 15, 2016 at 04:10:37PM +0800, Changlong Xie wrote:
>> +static void replication_close(BlockDriverState *bs)
>> +{
>> +    BDRVReplicationState *s = bs->opaque;
>> +
>> +    if (s->mode == REPLICATION_MODE_SECONDARY) {
>> +        g_free(s->top_id);
>> +    }
>> +
>> +    if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
>> +        replication_stop(s->rs, false, NULL);
>> +    }
>
> There is a possible use-after-free with s->top_id. If we free it above,
> then replication_stop() must not call backup_job_cleanup(), but I think
> backup_job_cleanup() can be reached from replication_stop().
>
> It would be safer to call replication_stop() before freeing s->top_id.

Yes, you are right.
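For the next version I plan to simply swap the two blocks, so the job is
stopped while s->top_id is still valid. A minimal sketch of the reordered
function (untested; only the order changes):

static void replication_close(BlockDriverState *bs)
{
    BDRVReplicationState *s = bs->opaque;

    /* Stop replication (and the backup job) first ... */
    if (s->replication_state == BLOCK_REPLICATION_RUNNING) {
        replication_stop(s->rs, false, NULL);
    }

    /* ... and only then free s->top_id, which the job cleanup may use. */
    if (s->mode == REPLICATION_MODE_SECONDARY) {
        g_free(s->top_id);
    }
}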
>> +    top_bs = bdrv_lookup_bs(s->top_id, s->top_id, errp);
>
> Please check that bs is a child of top_bs. If it is not a child then
> strange things could happen; for example, the AioContexts might not
> match (meaning it's not thread-safe), so this should be forbidden.

Will fix in the next version.

>> +    if (!top_bs) {
>> +        aio_context_release(aio_context);
>> +        return;
>> +    }
>
> Error return paths after reopen_backing_file(s, true, &local_err)
> should undo the operation.

Will do.

>> +    bdrv_op_block_all(top_bs, s->blocker);
>> +    bdrv_op_unblock(top_bs, BLOCK_OP_TYPE_DATAPLANE, s->blocker);
>> +
>> +    /*
>> +     * Must protect backup target if backup job was stopped/cancelled
>> +     * unexpectedly
>> +     */
>> +    bdrv_ref(s->hidden_disk->bs);
>> +
>> +    backup_start(s->secondary_disk->bs, s->hidden_disk->bs, 0,
>> +                 MIRROR_SYNC_MODE_NONE, NULL, BLOCKDEV_ON_ERROR_REPORT,
>> +                 BLOCKDEV_ON_ERROR_REPORT, backup_job_completed,
>> +                 s, NULL, &local_err);
>
> Did you run stress tests where the primary is writing to the disk while
> the secondary reads from the same sectors?
>
> I thought about this some more and I'm wondering about the following
> scenario:
>
> NBD writes to secondary_disk and the guest reads from the disk at the
> same time. There is a coroutine mutex in qcow2.c that protects both
> read and write requests, but only until they perform the data I/O. It
> may be possible that the read request from the Secondary VM is started
> before the NBD write, but the preadv() syscall isn't entered because of
> CPU scheduling decisions. In the meantime the secondary_disk ->
> hidden_disk backup operation takes place. With some unlucky timing it
> may be possible for the Secondary VM to read the new contents from
> secondary_disk instead of the old contents that were backed up into
> hidden_disk.

Thanks for your catch. I'll think about this scenario carefully.

> Extra serialization would be needed.
> block/backup.c:wait_for_overlapping_requests() and
> block/io.c:mark_request_serialising() are good starting points for
> solving this.

>> +    cleanup_imgs();
>
> Please use qtest_add_abrt_handler() so cleanup happens even when
> SIGABRT is received.

Surely.
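Something like the following is what I have in mind; a minimal sketch,
where the abrt_handler() wrapper name and its (void *data) signature are
my assumption, untested:

static void abrt_handler(void *data)
{
    /* Runs on SIGABRT too, so the test images are always removed. */
    cleanup_imgs();
}

/* Register early in main(), before the images are created: */
qtest_add_abrt_handler(abrt_handler, NULL);

Thanks
	-Xie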