From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 10 Mar 2016 09:51:54 +0800
From: Fam Zheng
Message-ID: <20160310015154.GD23632@ad.usersys.redhat.com>
References: <1455470231-5223-1-git-send-email-pbonzini@redhat.com>
 <1455470231-5223-6-git-send-email-pbonzini@redhat.com>
 <56E01544.6060305@de.ibm.com> <56E01D3F.1060204@redhat.com>
 <56E03333.5020601@de.ibm.com> <56E04C9B.7070801@redhat.com>
In-Reply-To: <56E04C9B.7070801@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 5/8] virtio-blk: fix "disabled data plane" mode
To: Paolo Bonzini
Cc: Christian Borntraeger, qemu-devel@nongnu.org

On Wed, 03/09 17:17, Paolo Bonzini wrote:
> 
> On 09/03/2016 15:29, Christian Borntraeger wrote:
> > FWIW, it seems that this patch triggers this error, the "tracked_request_begin"
> > that I reported yesterday and / or some early read issues from the bootloader
> > in a random fashion.
> > Using 2906cddfecff21af20eedab43288b485a679f9ac^ seems to work all the time,
> > moving around vblk->dataplane_started = true also triggers all 3 types
> > of bugs
> 
> In all likelihood, the bug is that virtio_blk_handle_output is being
> called in two threads.
> 
> It's not clear to me how that's possible, though.
The aio_poll() inside "blk_set_aio_context(s->conf->conf.blk, s->ctx)" looks
suspicious:

    main thread                                iothread
    ----------------------------------------------------------------------
    virtio_blk_handle_output()
      virtio_blk_data_plane_start()
        vblk->dataplane_started = true;
        blk_set_aio_context()
          bdrv_set_aio_context()
            bdrv_drain()
              aio_poll()
                virtio_blk_handle_output()
                  /* s->dataplane_started is true */
        !!! ->    virtio_blk_handle_request()
        event_notifier_set(ioeventfd)
                                               aio_poll()
                                                 virtio_blk_handle_request()

Christian, could you try the following patch?  The aio_poll above is replaced
with a "limited aio_poll" that doesn't dispatch ioeventfd.

(Note: perhaps moving "vblk->dataplane_started = true;" after
blk_set_aio_context() also *works around* this.)

---

diff --git a/block.c b/block.c
index ba24b8e..e37e8f7 100644
--- a/block.c
+++ b/block.c
@@ -4093,7 +4093,9 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
 
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
 {
-    bdrv_drain(bs); /* ensure there are no in-flight requests */
+    /* ensure there are no in-flight requests */
+    bdrv_drained_begin(bs);
+    bdrv_drained_end(bs);
 
     bdrv_detach_aio_context(bs);