From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40008) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1df5k2-0005SB-Po for qemu-devel@nongnu.org; Tue, 08 Aug 2017 10:44:20 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1df5k1-0007Sj-93 for qemu-devel@nongnu.org; Tue, 08 Aug 2017 10:44:18 -0400 References: <20170808142944.145159-1-vsementsov@virtuozzo.com> From: Eric Blake Message-ID: Date: Tue, 8 Aug 2017 09:44:05 -0500 MIME-Version: 1.0 In-Reply-To: <20170808142944.145159-1-vsementsov@virtuozzo.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="KhpGkggSrWFk0mvAMAmNurf5OmFNVn20E" Subject: Re: [Qemu-devel] [PATCH for 2.10] block/nbd-client: always return EIO on and after the first io channel error List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladimir Sementsov-Ogievskiy , qemu-block@nongnu.org, qemu-devel@nongnu.org Cc: mreitz@redhat.com, kwolf@redhat.com, pbonzini@redhat.com, den@openvz.org This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --KhpGkggSrWFk0mvAMAmNurf5OmFNVn20E From: Eric Blake To: Vladimir Sementsov-Ogievskiy , qemu-block@nongnu.org, qemu-devel@nongnu.org Cc: mreitz@redhat.com, kwolf@redhat.com, pbonzini@redhat.com, den@openvz.org Message-ID: Subject: Re: [PATCH for 2.10] block/nbd-client: always return EIO on and after the first io channel error References: <20170808142944.145159-1-vsementsov@virtuozzo.com> In-Reply-To: <20170808142944.145159-1-vsementsov@virtuozzo.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 08/08/2017 09:29 AM, Vladimir Sementsov-Ogievskiy wrote: > Do not communicate after the first error to avoid communicating through= t > broken channel. The only exclusion is try to send NBD_CMD_DISC anyway o= n > in nbd_client_close. >=20 > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- >=20 > Hi all. Here is a patch, fixing a problem noted in > [PATCH 06/17] block/nbd-client: fix nbd_read_reply_entry > and > [PATCH 17/17] block/nbd-client: always return EIO on and after the firs= t io channel error > and discussed on list. >=20 > If it will be applied to 2.10, then I'll rebase my 'nbd client refactor= ing and fixing' > on it (for 2.11). If not - I'll prefer not rebase the series, so, do no= t apply this > patch for 2.11. It may be possible to do something even smaller: >=20 > block/nbd-client.h | 1 + > block/nbd-client.c | 58 +++++++++++++++++++++++++++++++++++++---------= -------- > 2 files changed, 41 insertions(+), 18 deletions(-) >=20 > diff --git a/block/nbd-client.h b/block/nbd-client.h > index df80771357..28db9922c8 100644 > --- a/block/nbd-client.h > +++ b/block/nbd-client.h > @@ -29,6 +29,7 @@ typedef struct NBDClientSession { > =20 > Coroutine *recv_coroutine[MAX_NBD_REQUESTS]; > NBDReply reply; > + bool eio_to_all; > } NBDClientSession; > =20 > NBDClientSession *nbd_get_client_session(BlockDriverState *bs); > diff --git a/block/nbd-client.c b/block/nbd-client.c > index 25dd28406b..1282b2484e 100644 > --- a/block/nbd-client.c > +++ b/block/nbd-client.c > @@ -49,6 +49,8 @@ static void nbd_teardown_connection(BlockDriverState = *bs) > { > NBDClientSession *client =3D nbd_get_client_session(bs); > =20 > + client->eio_to_all =3D true; > + > if (!client->ioc) { /* Already closed */ > return; > } > @@ -74,12 +76,16 @@ static coroutine_fn void nbd_read_reply_entry(void = *opaque) > Error *local_err =3D NULL; > =20 > for (;;) { > + if (s->eio_to_all) { > + break; > + } > + How is this conditional ever reached? nbd_read_reply_entry() is our workhorse, that is supposed to run until the client is ready to disconnec= t. > assert(s->reply.handle =3D=3D 0); > ret =3D nbd_receive_reply(s->ioc, &s->reply, &local_err); > if (ret < 0) { > error_report_err(local_err); > } > - if (ret <=3D 0) { > + if (ret <=3D 0 || s->eio_to_all) { > break; > } This says we're now supposed to break out of the while loop. So unless something can set s->eio_to_all during aio_co_wake(s->recv_coroutine[i]), the next iteration of the loop won't hit the first conditional. Meanwhile, even if we skip the first conditional, this second conditional will still end the loop prior to acting on anything received from the server (the difference being whether we called nbd_receive_reply() one additional time, but I don't see that as getting in the way of a clean exit). But my question is whether we can just go with a simpler fix: if we ever break out of the workhorse loop of nbd_read_reply_entry(), then (and only then) is when we call nbd_recv_coroutines_enter_all() to mop up any pending transactions before tearing down the coroutines. Is there something we can do in nbd_recv_coroutines_enter_all() that will guarantee that our final entry into each coroutine will fail? I'm still playing with the idea locally. --=20 Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org --KhpGkggSrWFk0mvAMAmNurf5OmFNVn20E Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEzBAEBCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAlmJzjUACgkQp6FrSiUn Q2ohWwgAj7HS0EXhYRQAn7nAdxwhFv4hiqwQIjcp8xfllpKSgVsSwgiaQyFfCo94 2VoNHhTJnEb5hfplBpk0IvnYR3AWeKn+IWtAj9uBwYJHeaPEg7TeYSFSwzIlf4ph wtLtIrWDyq/9rchnsquzT0RDCEtujdae4Wd/3UMqh9hG8LBP1JZjLeR7zFKZvXvQ PfeXPUnYXq4zGV8ae/3TR85CVSDtTH7WJu6vhqcF/jguV0/e68ZgJ4QxApO4waBf 8yYBXneX6tEC+4G8lhdLo/mLvhqlOI2RXT113jYjC1AkZXE+Q9LOam8YsFnLjH1+ /hQMvadw8SPYYkHDvcMk8+fAsb8mtA== =8UBI -----END PGP SIGNATURE----- --KhpGkggSrWFk0mvAMAmNurf5OmFNVn20E--