Subject: Re: [Qemu-devel] NBD block device backend - 'improvements'
From: Nicholas Thomas
To: qemu-devel@nongnu.org
Date: Tue, 15 Feb 2011 21:26:33 +0000
Message-ID: <1297805193.12551.39.camel@den>
In-Reply-To: <4D5A5ECD.7060701@redhat.com>
References: <1297712422.12551.2.camel@den> <4D5A5ECD.7060701@redhat.com>

Hi Kevin, Stefan.

On Tue, 2011-02-15 at 12:09 +0100, Kevin Wolf wrote:
> Am 14.02.2011 21:32, schrieb Stefan Hajnoczi:
[...]
> > block/nbd.c needs to be made asynchronous in order for this change to
> > work.
>
> And even then it's not free of problems: for example, qemu_aio_flush()
> will hang. We're having all kinds of fun with NFS servers that go away
> and let requests hang indefinitely.
>
> So maybe what we should add is a timeout option which defaults to 0
> (fail immediately, like today)

Noted, so long as we can have -1 as "forever".

I'm currently reworking block/nbd.c to be asynchronous, following the
model in block/sheepdog.c. There does seem to be a lot of scope for code
duplication between the two (setting up the TCP connection, taking it
down, the mechanics of actually reading/writing bytes using the aio
interface, etc.), and presumably for rbd as well.

Reading http://www.mail-archive.com/qemu-devel@nongnu.org/msg36479.html
suggests it should be possible to have a "tcp" (+ "unix") protocol /
transport, which nbd and sheepdog could stack on top of (curl and rbd
seem to depend on their own libraries for managing the TCP part of the
connection). nbd and sheepdog would implement the actual wire protocol,
while the tcp/unix transports would hold the bits that would otherwise
be duplicated. I've not investigated it in code yet - it's possible I'm
just letting my appetite for abstraction run away with me. Thoughts?

> Unconditionally stopping the VM from a block driver sounds wrong to me.
> If you want to have this behaviour, the block driver should return an
> error and you should use werror=stop.

Unconditional? - if the socket manages to re-establish, the process
continues on its way (I guess we'd see the same behaviour with the
current code if a send/recv happened to take an unconscionably long
time).
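To make the asynchronous approach a bit more concrete, the model I'm
heading towards looks roughly like the sketch below: requests are queued
with a completion callback, the socket stays non-blocking, and replies
are consumed from a read handler registered with qemu's aio fd-handler
machinery. All the names here (nbd_aio_read, NBDRequest and so on) are
made up for illustration - this is the shape of the thing, not the patch:

/* Illustrative only - not qemu code.  One outstanding-request list,
 * a queueing function and a read handler that never blocks. */
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

typedef void NBDCompletionFunc(void *opaque, int ret);

typedef struct NBDRequest {
    uint8_t *buf;              /* where the reply payload should land */
    size_t len;                /* total bytes expected */
    size_t done;               /* bytes received so far */
    NBDCompletionFunc *cb;     /* called once the request completes */
    void *opaque;
    struct NBDRequest *next;   /* list of outstanding requests */
} NBDRequest;

static NBDRequest *outstanding;

/* Queue a read and return immediately; the real driver would also send
 * the NBD request header on the wire here, and match replies by handle
 * rather than assuming in-order completion. */
static void nbd_aio_read(uint8_t *buf, size_t len,
                         NBDCompletionFunc *cb, void *opaque)
{
    NBDRequest *req = calloc(1, sizeof(*req));
    req->buf = buf;
    req->len = len;
    req->cb = cb;
    req->opaque = opaque;
    req->next = outstanding;
    outstanding = req;
}

/* Read handler, invoked whenever the (non-blocking) socket becomes
 * readable.  Reads whatever is available and fires the completion
 * callback once the request at the head of the list is done. */
static void nbd_read_handler(int fd)
{
    NBDRequest *req = outstanding;
    ssize_t n;

    if (!req) {
        return;
    }
    n = read(fd, req->buf + req->done, req->len - req->done);
    if (n < 0) {
        if (errno == EAGAIN || errno == EINTR) {
            return;            /* wait until the fd is readable again */
        }
        outstanding = req->next;
        req->cb(req->opaque, -errno);
        free(req);
        return;
    }
    if (n == 0) {              /* server closed the connection */
        outstanding = req->next;
        req->cb(req->opaque, -EPIPE);
        free(req);
        return;
    }
    req->done += n;
    if (req->done == req->len) {
        outstanding = req->next;
        req->cb(req->opaque, 0);
        free(req);
    }
}

Nothing in that path ever blocks, and a timeout option (with -1 meaning
"wait forever") falls out fairly naturally by failing any request that
has been sitting in the list for too long.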
Making just the I/O hang until the network comes back, while keeping
guest execution and the qemu monitor working, is obviously better than
stopping the VM outright (although not /strictly/ necessary for our
particular use case), so I hope to be able to offer an AIO NBD patch
for review "soon".

> > IPv6 would be nice and if you can consolidate that in qemu_socket.h,
> > then that's a win for non-nbd socket users too.
>
> Agreed.

We'd get it for free with a unified TCP transport, as described above
(sheepdog already uses getaddrinfo and friends) - but if that's not
feasible, I'll be happy to supply a patch just for this. Much easier
than aio! :)

/Nick
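P.S. On the IPv6 point - what I have in mind is just the standard
getaddrinfo() loop, so the same path handles v4 and v6 and could be
shared via qemu_socket.h rather than living in each block driver.
Untested sketch, not against any particular tree, and tcp_connect() is
just a name I made up:

/* Connect to host:port, trying each address getaddrinfo() returns
 * (IPv6 and IPv4 alike) until one succeeds.  Returns a connected
 * socket fd, or -1 on failure. */
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static int tcp_connect(const char *host, const char *port)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* v4 or v6, whatever resolves */
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(host, port, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo(%s, %s): %s\n",
                host, port, gai_strerror(rc));
        return -1;
    }

    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) {
            continue;
        }
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) {
            break;                    /* connected */
        }
        close(fd);
        fd = -1;
    }

    freeaddrinfo(res);
    return fd;
}

The block drivers would then only have to mark the returned fd
non-blocking and hand it to whatever transport layer we end up with.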