From: Kevin Wolf <kwolf@redhat.com>
To: Nicholas Thomas <nick@bytemark.co.uk>
Cc: Stefan Hajnoczi <stefanha@gmail.com>, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] NBD block device backend - 'improvements'
Date: Wed, 16 Feb 2011 13:00:59 +0100 [thread overview]
Message-ID: <4D5BBC7B.9020807@redhat.com> (raw)
In-Reply-To: <1297805193.12551.39.camel@den>
Hi Nick,
[ Please use "reply to all" so that the CC list is kept intact,
re-adding Stefan ]
Am 15.02.2011 22:26, schrieb Nicholas Thomas:
> Hi Kevin, Stefan.
>
> On Tue, 2011-02-15 at 12:09 +0100, Kevin Wolf wrote:
>> Am 14.02.2011 21:32, schrieb Stefan Hajnoczi:
> [...]
>>> block/nbd.c needs to be made asynchronous in order for this change to
>>> work.
>>
>> And even then it's not free of problem: For example qemu_aio_flush()
>> will hang. We're having all kinds of fun with NFS servers that go away
>> and let requests hang indefinitely.
>>
>> So maybe what we should add is a timeout option which defaults to 0
>> (fail immediately, like today)
>
> Noted, so long as we can have -1 as "forever".
As long as it's optional, that's fine with me.
> I'm currently spending
> time reworking block/nbd.c to be asynchronous, following the model in
> block/sheepdog.c
>
> There does seem to be a lot of scope for code duplication (setting up
> the TCP connection, taking it down, the mechanics of actually reading /
> writing bytes using the aio interface, etc) between the two, and
> presumably for rbd as well.
>
> Reading http://www.mail-archive.com/qemu-devel@nongnu.org/msg36479.html
> suggests it should be possible to have a "tcp" (+ "unix") protocol /
> transport, which nbd+sheepdog could stack on top of (curl+rbd seem to
> depend on their own libraries for managing the TCP part of the
> connection).
>
> They would implement talking the actual protocol, while the tcp/unix
> transports would have the duplicatable bits.
>
> I've not investigated it in code yet - it's possible I'm just letting my
> appetite for abstraction get away with me. Thoughts?
I'm not sure about how much duplication there actually is, but if you
can take a closer look and think it's worthwhile, we should probably
consider it.
>> Unconditionally stopping the VM from a block driver sounds wrong to me.
>> If you want to have this behaviour, the block driver should return an
>> error and you should use werror=stop.
>
> Unconditional? - if the socket manages to re-establish, the process
> continues on its way (I guess we'd see the same behaviour if a send/recv
> happened to take an unconscionably long time with the current code).
>
> Making just the I/O hang until the network comes back, keeping guest
> execution and qemu monitor working, is obviously better than that
> (although not /strictly/ necessary for our particular use case), so I
> hope to be able to offer an AIO NBD patch for review "soon".
Maybe I wasn't very clear. I was talking about Stefan's suggestion to
completely stop the VM, like we already can do for I/O errors (see the
werror and rerror options for -drive). I think it's not what you're
looking for, you just need the timeout=-1 thing.
>>> IPv6 would be nice and if you can consolidate that in qemu_socket.h,
>>> then that's a win for non-nbd socket users too.
>>
>> Agreed.
>
> We'd get it for free with a unified TCP transport, as described above
> (sheepdog already uses getaddrinfo and friends) - but if that's not
> feasible, I'll be happy to supply a patch just for this. Much easier
> than aio! :)
Sure, I think it would be a good thing to have. And even if you
implemented this unified TCP transport (I'm not sure yet what it would
look like), I think the basic support could still be in qemu_socket.h,
so that users outside the block layer can benefit from it, too.
Kevin
next prev parent reply other threads:[~2011-02-16 11:59 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-02-14 19:40 [Qemu-devel] NBD block device backend - 'improvements' Nicholas Thomas
2011-02-14 20:32 ` Stefan Hajnoczi
2011-02-15 11:09 ` Kevin Wolf
2011-02-15 21:26 ` Nicholas Thomas
2011-02-16 12:00 ` Kevin Wolf [this message]
2011-02-17 16:27 ` [Qemu-devel] " Nicholas Thomas
2011-02-17 16:34 ` [Qemu-devel] [PATCH 1/3] NBD library: whitespace changes Nicholas Thomas
2011-02-17 16:34 ` [Qemu-devel] [PATCH 2/3] NBD library: add aio-compatible read/write function Nicholas Thomas
2011-02-17 16:34 ` [Qemu-devel] [PATCH 3/3] block/nbd: Make the NBD block device use the AIO interface Nicholas Thomas
2011-02-17 19:28 ` Nicholas Thomas
2011-02-18 12:16 ` [Qemu-devel] [PATCH 3/3 v2] " Nicholas Thomas
2011-02-18 12:23 ` Kevin Wolf
2011-02-18 12:55 ` Nicholas Thomas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4D5BBC7B.9020807@redhat.com \
--to=kwolf@redhat.com \
--cc=nick@bytemark.co.uk \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).