From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:42661) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aP9VJ-0005BG-Fr for qemu-devel@nongnu.org; Fri, 29 Jan 2016 08:54:26 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aP9VI-0002Sp-EL for qemu-devel@nongnu.org; Fri, 29 Jan 2016 08:54:25 -0500 References: <1453917600-2663-1-git-send-email-mreitz@redhat.com> <1453917600-2663-15-git-send-email-mreitz@redhat.com> <20160128041736.GN7877@ad.usersys.redhat.com> From: Max Reitz Message-ID: <56AB6F04.8040604@redhat.com> Date: Fri, 29 Jan 2016 14:54:12 +0100 MIME-Version: 1.0 In-Reply-To: <20160128041736.GN7877@ad.usersys.redhat.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="bmcnU4AXQXNdJWStGO3aSq7FLvDjjNhNp" Subject: Re: [Qemu-devel] [PATCH v8 14/16] block: Rewrite bdrv_close_all() List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Fam Zheng Cc: Kevin Wolf , Alberto Garcia , qemu-block@nongnu.org, John Snow , qemu-devel@nongnu.org, Paolo Bonzini This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --bmcnU4AXQXNdJWStGO3aSq7FLvDjjNhNp Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable On 28.01.2016 05:17, Fam Zheng wrote: > On Wed, 01/27 18:59, Max Reitz wrote: >> This patch rewrites bdrv_close_all(): Until now, all root BDSs have be= en >> force-closed. This is bad because it can lead to cached data not being= >> flushed to disk. >> >> Instead, try to make all reference holders relinquish their reference >> voluntarily: >> >> 1. All BlockBackend users are handled by making all BBs simply eject >> their BDS tree. Since a BDS can never be on top of a BB, this will >> not cause any of the issues as seen with the force-closing of BDSs.= >> The references will be relinquished and any further access to the B= B >> will fail gracefully. >> 2. All BDSs which are owned by the monitor itself (because they do not= >> have a BB) are relinquished next. >> 3. Besides BBs and the monitor, block jobs and other BDSs are the only= >> things left that can hold a reference to BDSs. After every remainin= g >> block job has been canceled, there should not be any BDSs left (and= >> the loop added here will always terminate (as long as NDEBUG is not= >> defined), because either all_bdrv_states will be empty or there wil= l >> not be any block job left to cancel, failing the assertion). >> >> Signed-off-by: Max Reitz >> Reviewed-by: Kevin Wolf >> --- >> block.c | 45 +++++++++++++++++++++++++++++++++++++-------- >> 1 file changed, 37 insertions(+), 8 deletions(-) >> >> diff --git a/block.c b/block.c >> index f8dd4a3..478e0db 100644 >> --- a/block.c >> +++ b/block.c >> @@ -2145,9 +2145,7 @@ static void bdrv_close(BlockDriverState *bs) >> { >> BdrvAioNotifier *ban, *ban_next; >> =20 >> - if (bs->job) { >> - block_job_cancel_sync(bs->job); >> - } >> + assert(!bs->job); >> =20 >> /* Disable I/O limits and drain all pending throttled requests */= >> if (bs->throttle_state) { >> @@ -2213,13 +2211,44 @@ static void bdrv_close(BlockDriverState *bs) >> void bdrv_close_all(void) >> { >> BlockDriverState *bs; >> + AioContext *aio_context; >> + int original_refcount =3D 0; >> =20 >> - QTAILQ_FOREACH(bs, &bdrv_states, device_list) { >> - AioContext *aio_context =3D bdrv_get_aio_context(bs); >> + /* Drop references from requests still in flight, such as cancele= d block >> + * jobs whose AIO context has not been polled yet */ >> + bdrv_drain_all(); >> =20 >> - aio_context_acquire(aio_context); >> - bdrv_close(bs); >> - aio_context_release(aio_context); >> + blockdev_close_all_bdrv_states(); >> + blk_remove_all_bs(); >=20 > This (monitor before BB) doesn't match the order in the commit message = (BB > before monitor). Will ask random.org whether to change the order here or in the commit message. :-) >> + >> + /* Cancel all block jobs */ >> + while (!QTAILQ_EMPTY(&all_bdrv_states)) { >> + QTAILQ_FOREACH(bs, &all_bdrv_states, bs_list) { >> + aio_context =3D bdrv_get_aio_context(bs); >> + >> + aio_context_acquire(aio_context); >> + if (bs->job) { >> + /* So we can safely query the current refcount */ >> + bdrv_ref(bs); >> + original_refcount =3D bs->refcnt; >> + >> + block_job_cancel_sync(bs->job); >> + aio_context_release(aio_context); >> + break; >> + } >> + aio_context_release(aio_context); >> + } >> + >> + /* All the remaining BlockDriverStates are referenced directl= y or >> + * indirectly from block jobs, so there needs to be at least = one BDS >> + * directly used by a block job */ >> + assert(bs); >> + >> + /* Wait for the block job to release its reference */ >> + while (bs->refcnt >=3D original_refcount) { >> + aio_poll(aio_context, true); >=20 > Why is this safe without acquiring aio_context? But oh wait, completion= s of > block jobs are defered to main loop BH, so I think to release the refer= ence, > aio_poll(qemu_get_aio_context(), ...) is the right thing to do. Actually, I think, commit 94db6d2d30962cc0422a69c88c3b3e9981b33e50 made this loop completely unnecessary. Will investigate. Max > This is also the problem in block_job_cancel_sync, which can dead loop = waiting > for job->completed flag, without processing main loop BH. >=20 > Fam >=20 >> + } >> + bdrv_unref(bs); >> } >> } >> =20 >> --=20 >> 2.7.0 >> --bmcnU4AXQXNdJWStGO3aSq7FLvDjjNhNp Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAEBCAAGBQJWq28EAAoJEDuxQgLoOKytP50H/iYHaiwU9jmV/jn3M+uUYueh a8JFR5oYl/c47UJL8o0nW2Zk/baiVnkajnE8MCZdtKbByETHyM6NZSqfRxYJx99K ghBuokHHbwN9NDEFb0/46AmAoBTg/kbdgrO6Plmosed8j4DDIcwmBbB6O8UdCBF5 4cF2a4agvRqPoEXwPDDN1iy8OZwNeA3omgbYJwNDO2Elu+/PfrGmVZp8ZeyIUQ/o e/3iWLxas20dlgjfs5YQIy2hsztQD6xYweqYHvGp0woLgLcMgnbB/EczXom0DOPn Mhwd06wE7OQSXsL4aPCcCm2O8+trhH6VLS68AUbM6YS3UrhJlNiohbJgY3DIJSU= =WB5o -----END PGP SIGNATURE----- --bmcnU4AXQXNdJWStGO3aSq7FLvDjjNhNp--