From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:57537)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1YfrOw-0000Dd-5b for qemu-devel@nongnu.org;
    Wed, 08 Apr 2015 10:56:26 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1YfrOs-0007oN-6G for qemu-devel@nongnu.org;
    Wed, 08 Apr 2015 10:56:22 -0400
Date: Wed, 8 Apr 2015 16:56:14 +0200
From: Alberto Garcia
Message-ID: <20150408145614.GA2140@igalia.com>
References: <1428069921-2957-1-git-send-email-famz@redhat.com>
    <1428069921-2957-3-git-send-email-famz@redhat.com>
    <20150408103752.GH28835@stefanha-thinkpad.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150408103752.GH28835@stefanha-thinkpad.redhat.com>
Subject: Re: [Qemu-devel] [PATCH v2 2/4] block: Pause block jobs in bdrv_drain_all
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Stefan Hajnoczi
Cc: Kevin Wolf , Fam Zheng , qemu-block@nongnu.org, Jeff Cody ,
    qemu-devel@nongnu.org, pbonzini@redhat.com

On Wed, Apr 08, 2015 at 11:37:52AM +0100, Stefan Hajnoczi wrote:

> > +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> > +        AioContext *aio_context = bdrv_get_aio_context(bs);
> > +
> > +        aio_context_acquire(aio_context);
> > +        if (bs->job) {
> > +            block_job_pause(bs->job);
> > +        }
> > +        aio_context_release(aio_context);
> > +    }
> > +
> >      while (busy) {
> >          busy = false;
> >
> > @@ -2044,6 +2054,16 @@ void bdrv_drain_all(void)
> >              aio_context_release(aio_context);
> >          }
> >      }
> > +
> > +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> > +        AioContext *aio_context = bdrv_get_aio_context(bs);
> > +
> > +        aio_context_acquire(aio_context);
> > +        if (bs->job) {
> > +            block_job_resume(bs->job);
> > +        }
> > +        aio_context_release(aio_context);
> > +    }
> >  }
>
> There is a tiny chance that we pause a job (which actually just sets
> job->paused = true but there's no guarantee the job's coroutine
> reacts to this) right before it terminates.  Then aio_poll() enters
> the coroutine one last time and the job terminates.
>
> We then reach the resume portion of bdrv_drain_all() and the job no
> longer exists.  Hopefully nothing started a new job in the meantime.
> bs->job should either be the original block job or NULL.
>
> This code seems safe under current assumptions, but I just wanted to
> raise these issues in case someone sees problems that I've missed.

Is it possible that a new job is started in the meantime? If that's
the case, this will hit the assertion in block_job_resume().

Berto
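
To make the scenario above concrete, here is a minimal standalone
sketch in C. It is not QEMU code: ToyJob, ToyBDS and the toy_*
functions are made up for illustration, and it assumes that the resume
side checks that the job is currently paused, which is what the
assertion mentioned above suggests. It models a single device going
through the pause loop, the drain loop and the resume loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT the QEMU structures. */
typedef struct ToyJob {
    int pause_count;            /* assumed: > 0 while the job is paused */
} ToyJob;

typedef struct ToyBDS {
    ToyJob *job;                /* plays the role of bs->job */
} ToyBDS;

static void toy_job_pause(ToyJob *job)
{
    assert(job != NULL);
    job->pause_count++;
}

/*
 * Returns false in the situation where, under the assumption stated
 * in the lead-in, the real resume function would hit its assertion:
 * the job being resumed was never paused.
 */
static bool toy_job_resume(ToyJob *job)
{
    if (job->pause_count == 0) {
        return false;
    }
    job->pause_count--;
    return true;
}

/*
 * Models the three phases for one device.
 *   job_finishes:      the paused job terminates during the drain loop
 *   started_meanwhile: a job started after that, or NULL if there is none
 * Returns true if the resume phase is harmless, false if it would try
 * to resume a job that was never paused.
 */
static bool toy_drain_all(ToyBDS *bs, bool job_finishes,
                          ToyJob *started_meanwhile)
{
    /* Pause phase: pause whatever job is attached right now. */
    if (bs->job) {
        toy_job_pause(bs->job);
    }

    /* Drain phase: the original job may finish and be replaced. */
    if (job_finishes) {
        bs->job = started_meanwhile;
    }

    /* Resume phase: only looks at bs->job, not at what was paused. */
    if (bs->job) {
        return toy_job_resume(bs->job);
    }
    return true;
}
```

With a job attached, toy_drain_all(&bs, false, NULL) and
toy_drain_all(&bs, true, NULL) both return true, matching the "original
block job or NULL" cases. toy_drain_all(&bs, true, &other_job) with a
freshly zeroed other_job returns false, which is the case asked about
above.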