From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from [140.186.70.92] (port=48572 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1P3vNr-0007cp-Kt for qemu-devel@nongnu.org; Thu, 07 Oct 2010 14:43:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1P3vMl-0008AQ-HF for qemu-devel@nongnu.org; Thu, 07 Oct 2010 14:38:56 -0400 Received: from mail-gx0-f173.google.com ([209.85.161.173]:50035) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1P3vMl-0008AI-Dj for qemu-devel@nongnu.org; Thu, 07 Oct 2010 14:38:55 -0400 Received: by gxk22 with SMTP id 22so90943gxk.4 for ; Thu, 07 Oct 2010 11:38:54 -0700 (PDT) Message-ID: <4CAE13BA.70707@codemonkey.ws> Date: Thu, 07 Oct 2010 13:38:50 -0500 From: Anthony Liguori MIME-Version: 1.0 Subject: Re: [Qemu-devel] [PATCH] ceph/rbd block driver for qemu-kvm (v4) References: <20100802194631.GA4923@chb-desktop> <20100803201407.GD1475@chb-desktop> <4CADD567.9010606@codemonkey.ws> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Yehuda Sadeh Weinraub Cc: Kevin Wolf , kvm@vger.kernel.org, qemu-devel@nongnu.org, ceph-devel@vger.kernel.org, Christian Brunner On 10/07/2010 01:08 PM, Yehuda Sadeh Weinraub wrote: > On Thu, Oct 7, 2010 at 7:12 AM, Anthony Liguori wrote: > >> On 08/03/2010 03:14 PM, Christian Brunner wrote: >> >>> +#include "qemu-common.h" >>> +#include "qemu-error.h" >>> +#include >>> +#include >>> + >>> +#include >>> >>> >> This looks to be unnecessary. Generally, system includes shouldn't be >> required so all of these should go away except rado/librados.h >> > Removed. > > >> >>> + >>> +#include "rbd_types.h" >>> +#include "module.h" >>> +#include "block_int.h" >>> + >>> +#include >>> +#include >>> +#include >>> + >>> +#include >>> + >>> + >>> +int eventfd(unsigned int initval, int flags); >>> >>> >> This is not quite right. Depending on eventfd is curious but in the very >> least, you need to detect the presence of eventfd in configure and provide a >> wrapper that redefines it as necessary. >> > Can fix that, though please see my later remarks. > >>> +static int create_tmap_op(uint8_t op, const char *name, char **tmap_desc) >>> +{ >>> + uint32_t len = strlen(name); >>> + /* total_len = encoding op + name + empty buffer */ >>> + uint32_t total_len = 1 + (sizeof(uint32_t) + len) + sizeof(uint32_t); >>> + char *desc = NULL; >>> >>> >> char is the wrong type to use here as it may be signed or unsigned. That >> can have weird effects with binary data when you're directly manipulating >> it. >> > Well, I can change it to uint8_t, so that it matches the op type, but > that'll require adding some other castings. In any case, you usually > get such a weird behavior when you cast to types of different sizes > and have the sign bit padded which is not the case in here. > > >> >>> + >>> + desc = qemu_malloc(total_len); >>> + >>> + *tmap_desc = desc; >>> + >>> + *desc = op; >>> + desc++; >>> + memcpy(desc,&len, sizeof(len)); >>> + desc += sizeof(len); >>> + memcpy(desc, name, len); >>> + desc += len; >>> + len = 0; >>> + memcpy(desc,&len, sizeof(len)); >>> + desc += sizeof(len); >>> >>> >> Shouldn't endianness be a concern? >> > Right. Fixed that. > > >> >>> + >>> + return desc - *tmap_desc; >>> +} >>> + >>> +static void free_tmap_op(char *tmap_desc) >>> +{ >>> + qemu_free(tmap_desc); >>> +} >>> + >>> +static int rbd_register_image(rados_pool_t pool, const char *name) >>> +{ >>> + char *tmap_desc; >>> + const char *dir = RBD_DIRECTORY; >>> + int ret; >>> + >>> + ret = create_tmap_op(CEPH_OSD_TMAP_SET, name,&tmap_desc); >>> + if (ret< 0) { >>> + return ret; >>> + } >>> + >>> + ret = rados_tmap_update(pool, dir, tmap_desc, ret); >>> + free_tmap_op(tmap_desc); >>> + >>> + return ret; >>> +} >>> >>> >> This ops are all synchronous? IOW, rados_tmap_update() call blocks until >> the operation is completed? >> > Yeah. And this is only called from the rbd_create() callback. > > >>> + header_snap += strlen(header_snap) + 1; >>> + if (header_snap> end) >>> + error_report("bad header, snapshot list broken"); >>> >>> >> Missing curly braces here. >> > Fixed. > > >>> + if (strncmp(hbuf + 68, RBD_HEADER_VERSION, 8)) { >>> + error_report("Unknown image version %s", hbuf + 68); >>> + r = -EMEDIUMTYPE; >>> + goto failed; >>> + } >>> + >>> + RbdHeader1 *header; >>> >>> >>> >> Don't mix variable definitions with code. >> > Fixed. > > >>> + s->efd = eventfd(0, 0); >>> + if (s->efd< 0) { >>> + error_report("error opening eventfd"); >>> + goto failed; >>> + } >>> + fcntl(s->efd, F_SETFL, O_NONBLOCK); >>> + qemu_aio_set_fd_handler(s->efd, rbd_aio_completion_cb, NULL, >>> + rbd_aio_flush_cb, NULL, s); >>> >>> >> It looks like you just use the eventfd to signal aio completion callbacks. >> A better way to do this would be to schedule a bottom half. eventfds are >> Linux specific and specific to recent kernels. >> > Digging back why we introduced the eventfd, it was due to some issues > seen with do_savevm() hangs on qemu_aio_flush(). The reason seemed > that we had no fd associated with the block device, which seemed to > not work well with the qemu aio model. If that assumption is wrong, > we'd be happy to change it. In any case, there are other more portable > ways to generate fds, so if it's needed we can do that. > There's no fd at all? How do you get notifications about an asynchronous event completion? Regards, Anthony Liguori