From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yoshiaki Tamura
Subject: Re: [Qemu-devel] Re: [PATCH 09/21] Introduce event-tap.
Date: Tue, 30 Nov 2010 19:35:54 +0900
Message-ID: <4CF4D38A.203@lab.ntt.co.jp>
References: <1290665220-26478-1-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
 <1290665220-26478-10-git-send-email-tamura.yoshiaki@lab.ntt.co.jp>
 <20101130011914.GA9015@amt.cnet>
 <20101130102538.GA20921@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: aliguori@us.ibm.com, ananth@in.ibm.com, kvm@vger.kernel.org,
 ohmura.kei@lab.ntt.co.jp, dlaor@redhat.com, qemu-devel@nongnu.org,
 vatsa@linux.vnet.ibm.com, avi@redhat.com, psuriset@linux.vnet.ibm.com,
 stefanha@linux.vnet.ibm.com
To: Marcelo Tosatti
Return-path:
Received: from tama500.ecl.ntt.co.jp ([129.60.39.148]:46653 "EHLO
 tama500.ecl.ntt.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1755541Ab0K3LIU (ORCPT );
 Tue, 30 Nov 2010 06:08:20 -0500
In-Reply-To: <20101130102538.GA20921@amt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID:

Marcelo Tosatti wrote:
> On Tue, Nov 30, 2010 at 06:28:55PM +0900, Yoshiaki Tamura wrote:
>> 2010/11/30 Marcelo Tosatti:
>>> On Thu, Nov 25, 2010 at 03:06:48PM +0900, Yoshiaki Tamura wrote:
>>>> event-tap controls when to start FT transaction, and provides proxy
>>>> functions to called from net/block devices. While FT transaction, it
>>>> queues up net/block requests, and flush them when the transaction gets
>>>> completed.
>>>>
>>>> Signed-off-by: Yoshiaki Tamura
>>>> Signed-off-by: OHMURA Kei
>>>
>>>> +static void event_tap_alloc_blk_req(EventTapBlkReq *blk_req,
>>>> +                                    BlockDriverState *bs, BlockRequest *reqs,
>>>> +                                    int num_reqs, BlockDriverCompletionFunc *cb,
>>>> +                                    void *opaque, bool is_multiwrite)
>>>> +{
>>>> +    int i;
>>>> +
>>>> +    blk_req->num_reqs = num_reqs;
>>>> +    blk_req->num_cbs = num_reqs;
>>>> +    blk_req->device_name = qemu_strdup(bs->device_name);
>>>> +    blk_req->is_multiwrite = is_multiwrite;
>>>> +
>>>> +    for (i = 0; i< num_reqs; i++) {
>>>> +        blk_req->reqs[i].sector = reqs[i].sector;
>>>> +        blk_req->reqs[i].nb_sectors = reqs[i].nb_sectors;
>>>> +        blk_req->reqs[i].qiov = reqs[i].qiov;
>>>> +        blk_req->reqs[i].cb = cb;
>>>> +        blk_req->reqs[i].opaque = opaque;
>>>> +        blk_req->cb[i] = reqs[i].cb;
>>>> +        blk_req->opaque[i] = reqs[i].opaque;
>>>> +    }
>>>> +}
>>>
>>> bdrv_aio_flush should also be logged, so that guest initiated flush is
>>> respected on replay.
>>
>> In the current implementation w/o flush logging, there might be
>> order inversion after replay?
>>
>> Yoshi
>
> Yes, since a vcpu is allowed to continue after synchronization is
> scheduled via a bh. For virtio-blk, for example:
>
> 1) bdrv_aio_write, event queued.
> 2) bdrv_aio_flush
> 3) bdrv_aio_write, event queued.
>
> On replay, there is no flush between the two writes.
>
> Why can't synchronization be done from event-tap itself, synchronously,
> to avoid this kind of problem?

Thanks. I will fix it.

> The way you hook synchronization into savevm seems unclean. Perhaps
> better separation between standard savevm path and FT savevm would make
> it cleaner.

I think you're referring to the changes in migration.c?

Yoshi
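
To make the ordering issue above concrete, here is a minimal standalone
sketch in C. It is not QEMU or event-tap code: EventKind, EventLog,
log_request(), replay_has_barrier() and guest_sequence() are all
hypothetical names invented for illustration, and the "log" is just an
array. It only models the point that if flushes are not recorded, the
replayed sequence has no flush between the two writes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds; these are NOT the real event-tap types. */
typedef enum { EV_WRITE, EV_FLUSH } EventKind;

#define LOG_MAX 16

typedef struct {
    EventKind kind[LOG_MAX];
    size_t len;
    int log_flush;              /* nonzero: flushes are recorded as well */
} EventLog;

static void log_init(EventLog *log, int log_flush)
{
    log->len = 0;
    log->log_flush = log_flush;
}

/* Record a guest request. Flushes are dropped unless log_flush is set. */
static void log_request(EventLog *log, EventKind kind)
{
    if (kind == EV_FLUSH && !log->log_flush) {
        return;
    }
    assert(log->len < LOG_MAX);
    log->kind[log->len++] = kind;
}

/*
 * Walk the log in replay order and return 1 if some flush sits between
 * two writes, 0 otherwise.
 */
static int replay_has_barrier(const EventLog *log)
{
    int seen_write = 0;
    int flush_after_write = 0;
    size_t i;

    for (i = 0; i < log->len; i++) {
        if (log->kind[i] == EV_WRITE) {
            if (flush_after_write) {
                return 1;
            }
            seen_write = 1;
        } else if (seen_write) {
            flush_after_write = 1;
        }
    }
    return 0;
}

/* The virtio-blk sequence from the mail: write, flush, write. */
static void guest_sequence(EventLog *log)
{
    log_request(log, EV_WRITE);
    log_request(log, EV_FLUSH);
    log_request(log, EV_WRITE);
}
```

Calling guest_sequence() on a log initialized with log_flush = 0 leaves
two entries and no barrier on replay; with log_flush = 1 the flush is
kept between the writes. The real fix would presumably hook
bdrv_aio_flush in event-tap, which this sketch does not attempt to show.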