From: Stefan Hajnoczi
Date: Thu, 9 Apr 2015 11:34:04 +0100
To: Alberto Garcia
Cc: pbonzini@redhat.com, Fam Zheng, qemu-block@nongnu.org, Stefan Hajnoczi, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v2 2/4] block: Pause block jobs in bdrv_drain_all
Message-ID: <20150409103404.GA2783@stefanha-thinkpad.redhat.com>
In-Reply-To: <20150408145614.GA2140@igalia.com>
References: <1428069921-2957-1-git-send-email-famz@redhat.com> <1428069921-2957-3-git-send-email-famz@redhat.com> <20150408103752.GH28835@stefanha-thinkpad.redhat.com> <20150408145614.GA2140@igalia.com>

On Wed, Apr 08, 2015 at 04:56:14PM +0200, Alberto Garcia wrote:
> On Wed, Apr 08, 2015 at 11:37:52AM +0100, Stefan Hajnoczi wrote:
>
> > > +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> > > +        AioContext *aio_context = bdrv_get_aio_context(bs);
> > > +
> > > +        aio_context_acquire(aio_context);
> > > +        if (bs->job) {
> > > +            block_job_pause(bs->job);
> > > +        }
> > > +        aio_context_release(aio_context);
> > > +    }
> > > +
> > >      while (busy) {
> > >          busy = false;
> > >
> > > @@ -2044,6 +2054,16 @@ void bdrv_drain_all(void)
> > >              aio_context_release(aio_context);
> > >          }
> > >      }
> > > +
> > > +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> > > +        AioContext *aio_context = bdrv_get_aio_context(bs);
> > > +
> > > +        aio_context_acquire(aio_context);
> > > +        if (bs->job) {
> > > +            block_job_resume(bs->job);
> > > +        }
> > > +        aio_context_release(aio_context);
> > > +    }
> > >  }
> >
> > There is a tiny chance that we pause a job (which actually just sets
> > job->paused = true but there's no guarantee the job's coroutine
> > reacts to this) right before it terminates.  Then aio_poll() enters
> > the coroutine one last time and the job terminates.
> >
> > We then reach the resume portion of bdrv_drain_all() and the job no
> > longer exists.  Hopefully nothing started a new job in the meantime.
> > bs->job should either be the original block job or NULL.
> >
> > This code seems safe under current assumptions, but I just wanted to
> > raise these issues in case someone sees problems that I've missed.
>
> Is it possible that a new job is started in the meantime? If that's
> the case this will hit the assertion in block_job_resume().

That is currently not possible since the QEMU monitor does not run while
we're waiting in aio_poll().  Therefore no block job monitor commands
could spawn a new job.

If code is added that spawns a job based on an AioContext timer or due
to some other event, then this assumption no longer holds and there is a
problem because block_job_resume() is called on a job that never paused.

But for now there is no problem.
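
To make the hazard concrete, here is a minimal standalone sketch of one
way the resume loop could be made robust if that assumption ever stopped
holding.  It is not QEMU code and not something the patch does: ToyJob,
ToyDevice and the toy_* functions are hypothetical names, and modelling
the assertion in block_job_resume() as a pause counter check is an
assumption for illustration.  The idea is to remember which job was
paused on each device and to resume only if the device still has that
same job.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for BlockJob and BlockDriverState. */
typedef struct ToyJob {
    int pause_count;            /* outstanding pause requests */
} ToyJob;

typedef struct ToyDevice {
    ToyJob *job;                /* current job on this device, or NULL */
} ToyDevice;

static void toy_job_pause(ToyJob *job)
{
    job->pause_count++;
}

/* Assumed invariant: resuming a job that was never paused is a bug. */
static void toy_job_resume(ToyJob *job)
{
    assert(job->pause_count > 0);
    job->pause_count--;
}

/* Pause phase: record the job that was paused on each device. */
static void toy_pause_all(ToyDevice *devs, ToyJob **paused, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        paused[i] = devs[i].job;
        if (paused[i]) {
            toy_job_pause(paused[i]);
        }
    }
}

/*
 * Resume phase: skip devices whose job terminated (now NULL) or was
 * replaced by a different job during the drain.  Returns the number
 * of jobs actually resumed.
 */
static size_t toy_resume_all(ToyDevice *devs, ToyJob **paused, size_t n)
{
    size_t resumed = 0;

    for (size_t i = 0; i < n; i++) {
        if (paused[i] && devs[i].job == paused[i]) {
            toy_job_resume(paused[i]);
            resumed++;
        }
    }
    return resumed;
}
```

Note that a plain pointer comparison is only a sketch: if a terminated
job were freed and a new one allocated at the same address, the check
would be fooled, so real code would need a stronger identity or a
reference held across the drain.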

Stefan