From: Anthony Liguori <anthony@codemonkey.ws>
To: Yehuda Sadeh Weinraub <yehudasa@gmail.com>
Cc: Christian Brunner <chb@muc.de>, malc <av1474@comtv.ru>,
kvm@vger.kernel.org, qemu-devel@nongnu.org,
Kevin Wolf <kwolf@redhat.com>,
ceph-devel@vger.kernel.org
Subject: Re: [Qemu-devel] [PATCH] ceph/rbd block driver for qemu-kvm (v4)
Date: Thu, 07 Oct 2010 14:51:33 -0500 [thread overview]
Message-ID: <4CAE24C5.8030007@codemonkey.ws> (raw)
In-Reply-To: <AANLkTim2qfA+MncyF9Cx1oGYEzgW2TugG9sFTLP8T=J-@mail.gmail.com>
On 10/07/2010 01:41 PM, Yehuda Sadeh Weinraub wrote:
> On Thu, Oct 7, 2010 at 11:38 AM, Anthony Liguori<anthony@codemonkey.ws> wrote:
>
>> On 10/07/2010 01:08 PM, Yehuda Sadeh Weinraub wrote:
>>
>>> On Thu, Oct 7, 2010 at 7:12 AM, Anthony Liguori<anthony@codemonkey.ws>
>>> wrote:
>>>
>>>
>>>> On 08/03/2010 03:14 PM, Christian Brunner wrote:
>>>>
>>>>
>>>>> +#include "qemu-common.h"
>>>>> +#include "qemu-error.h"
>>>>> +#include<sys/types.h>
>>>>> +#include<stdbool.h>
>>>>> +
>>>>> +#include<qemu-common.h>
>>>>>
>>>>>
>>>>>
>>>> This looks to be unnecessary. Generally, system includes shouldn't be
>>>> required so all of these should go away except rado/librados.h
>>>>
>>>>
>>> Removed.
>>>
>>>
>>>
>>>>
>>>>
>>>>> +
>>>>> +#include "rbd_types.h"
>>>>> +#include "module.h"
>>>>> +#include "block_int.h"
>>>>> +
>>>>> +#include<stdio.h>
>>>>> +#include<stdlib.h>
>>>>> +#include<rados/librados.h>
>>>>> +
>>>>> +#include<signal.h>
>>>>> +
>>>>> +
>>>>> +int eventfd(unsigned int initval, int flags);
>>>>>
>>>>>
>>>>>
>>>> This is not quite right. Depending on eventfd is curious but in the very
>>>> least, you need to detect the presence of eventfd in configure and
>>>> provide a
>>>> wrapper that redefines it as necessary.
>>>>
>>>>
>>> Can fix that, though please see my later remarks.
>>>
>>>
>>>>> +static int create_tmap_op(uint8_t op, const char *name, char
>>>>> **tmap_desc)
>>>>> +{
>>>>> + uint32_t len = strlen(name);
>>>>> + /* total_len = encoding op + name + empty buffer */
>>>>> + uint32_t total_len = 1 + (sizeof(uint32_t) + len) +
>>>>> sizeof(uint32_t);
>>>>> + char *desc = NULL;
>>>>>
>>>>>
>>>>>
>>>> char is the wrong type to use here as it may be signed or unsigned. That
>>>> can have weird effects with binary data when you're directly manipulating
>>>> it.
>>>>
>>>>
>>> Well, I can change it to uint8_t, so that it matches the op type, but
>>> that'll require adding some other castings. In any case, you usually
>>> get such a weird behavior when you cast to types of different sizes
>>> and have the sign bit padded which is not the case in here.
>>>
>>>
>>>
>>>>
>>>>
>>>>> +
>>>>> + desc = qemu_malloc(total_len);
>>>>> +
>>>>> + *tmap_desc = desc;
>>>>> +
>>>>> + *desc = op;
>>>>> + desc++;
>>>>> + memcpy(desc,&len, sizeof(len));
>>>>> + desc += sizeof(len);
>>>>> + memcpy(desc, name, len);
>>>>> + desc += len;
>>>>> + len = 0;
>>>>> + memcpy(desc,&len, sizeof(len));
>>>>> + desc += sizeof(len);
>>>>>
>>>>>
>>>>>
>>>> Shouldn't endianness be a concern?
>>>>
>>>>
>>> Right. Fixed that.
>>>
>>>
>>>
>>>>
>>>>
>>>>> +
>>>>> + return desc - *tmap_desc;
>>>>> +}
>>>>> +
>>>>> +static void free_tmap_op(char *tmap_desc)
>>>>> +{
>>>>> + qemu_free(tmap_desc);
>>>>> +}
>>>>> +
>>>>> +static int rbd_register_image(rados_pool_t pool, const char *name)
>>>>> +{
>>>>> + char *tmap_desc;
>>>>> + const char *dir = RBD_DIRECTORY;
>>>>> + int ret;
>>>>> +
>>>>> + ret = create_tmap_op(CEPH_OSD_TMAP_SET, name,&tmap_desc);
>>>>> + if (ret< 0) {
>>>>> + return ret;
>>>>> + }
>>>>> +
>>>>> + ret = rados_tmap_update(pool, dir, tmap_desc, ret);
>>>>> + free_tmap_op(tmap_desc);
>>>>> +
>>>>> + return ret;
>>>>> +}
>>>>>
>>>>>
>>>>>
>>>> This ops are all synchronous? IOW, rados_tmap_update() call blocks until
>>>> the operation is completed?
>>>>
>>>>
>>> Yeah. And this is only called from the rbd_create() callback.
>>>
>>>
>>>
>>>>> + header_snap += strlen(header_snap) + 1;
>>>>> + if (header_snap> end)
>>>>> + error_report("bad header, snapshot list broken");
>>>>>
>>>>>
>>>>>
>>>> Missing curly braces here.
>>>>
>>>>
>>> Fixed.
>>>
>>>
>>>
>>>>> + if (strncmp(hbuf + 68, RBD_HEADER_VERSION, 8)) {
>>>>> + error_report("Unknown image version %s", hbuf + 68);
>>>>> + r = -EMEDIUMTYPE;
>>>>> + goto failed;
>>>>> + }
>>>>> +
>>>>> + RbdHeader1 *header;
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Don't mix variable definitions with code.
>>>>
>>>>
>>> Fixed.
>>>
>>>
>>>
>>>>> + s->efd = eventfd(0, 0);
>>>>> + if (s->efd< 0) {
>>>>> + error_report("error opening eventfd");
>>>>> + goto failed;
>>>>> + }
>>>>> + fcntl(s->efd, F_SETFL, O_NONBLOCK);
>>>>> + qemu_aio_set_fd_handler(s->efd, rbd_aio_completion_cb, NULL,
>>>>> + rbd_aio_flush_cb, NULL, s);
>>>>>
>>>>>
>>>>>
>>>> It looks like you just use the eventfd to signal aio completion
>>>> callbacks.
>>>> A better way to do this would be to schedule a bottom half. eventfds
>>>> are
>>>> Linux specific and specific to recent kernels.
>>>>
>>>>
>>> Digging back why we introduced the eventfd, it was due to some issues
>>> seen with do_savevm() hangs on qemu_aio_flush(). The reason seemed
>>> that we had no fd associated with the block device, which seemed to
>>> not work well with the qemu aio model. If that assumption is wrong,
>>> we'd be happy to change it. In any case, there are other more portable
>>> ways to generate fds, so if it's needed we can do that.
>>>
>>>
>> There's no fd at all? How do you get notifications about an asynchronous
>> event completion?
>>
>> Regards,
>>
>> Anthony Liguori
>>
>>
> (resending to list, sorry)
>
> The fd is hidden deep under in librados. We get callback notifications
> for events completion.
>
How is that possible? Are the callbacks delivered in the context of a
different thread? If so, don't you need locking?
Regards,
Anthony Liguori
> Thanks,
> Yehuda
>
WARNING: multiple messages have this Message-ID (diff)
From: Anthony Liguori <anthony@codemonkey.ws>
To: Yehuda Sadeh Weinraub <yehudasa@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
kvm@vger.kernel.org, qemu-devel@nongnu.org,
ceph-devel@vger.kernel.org, Christian Brunner <chb@muc.de>
Subject: Re: [Qemu-devel] [PATCH] ceph/rbd block driver for qemu-kvm (v4)
Date: Thu, 07 Oct 2010 14:51:33 -0500 [thread overview]
Message-ID: <4CAE24C5.8030007@codemonkey.ws> (raw)
In-Reply-To: <AANLkTim2qfA+MncyF9Cx1oGYEzgW2TugG9sFTLP8T=J-@mail.gmail.com>
On 10/07/2010 01:41 PM, Yehuda Sadeh Weinraub wrote:
> On Thu, Oct 7, 2010 at 11:38 AM, Anthony Liguori<anthony@codemonkey.ws> wrote:
>
>> On 10/07/2010 01:08 PM, Yehuda Sadeh Weinraub wrote:
>>
>>> On Thu, Oct 7, 2010 at 7:12 AM, Anthony Liguori<anthony@codemonkey.ws>
>>> wrote:
>>>
>>>
>>>> On 08/03/2010 03:14 PM, Christian Brunner wrote:
>>>>
>>>>
>>>>> +#include "qemu-common.h"
>>>>> +#include "qemu-error.h"
>>>>> +#include<sys/types.h>
>>>>> +#include<stdbool.h>
>>>>> +
>>>>> +#include<qemu-common.h>
>>>>>
>>>>>
>>>>>
>>>> This looks to be unnecessary. Generally, system includes shouldn't be
>>>> required so all of these should go away except rado/librados.h
>>>>
>>>>
>>> Removed.
>>>
>>>
>>>
>>>>
>>>>
>>>>> +
>>>>> +#include "rbd_types.h"
>>>>> +#include "module.h"
>>>>> +#include "block_int.h"
>>>>> +
>>>>> +#include<stdio.h>
>>>>> +#include<stdlib.h>
>>>>> +#include<rados/librados.h>
>>>>> +
>>>>> +#include<signal.h>
>>>>> +
>>>>> +
>>>>> +int eventfd(unsigned int initval, int flags);
>>>>>
>>>>>
>>>>>
>>>> This is not quite right. Depending on eventfd is curious but in the very
>>>> least, you need to detect the presence of eventfd in configure and
>>>> provide a
>>>> wrapper that redefines it as necessary.
>>>>
>>>>
>>> Can fix that, though please see my later remarks.
>>>
>>>
>>>>> +static int create_tmap_op(uint8_t op, const char *name, char
>>>>> **tmap_desc)
>>>>> +{
>>>>> + uint32_t len = strlen(name);
>>>>> + /* total_len = encoding op + name + empty buffer */
>>>>> + uint32_t total_len = 1 + (sizeof(uint32_t) + len) +
>>>>> sizeof(uint32_t);
>>>>> + char *desc = NULL;
>>>>>
>>>>>
>>>>>
>>>> char is the wrong type to use here as it may be signed or unsigned. That
>>>> can have weird effects with binary data when you're directly manipulating
>>>> it.
>>>>
>>>>
>>> Well, I can change it to uint8_t, so that it matches the op type, but
>>> that'll require adding some other castings. In any case, you usually
>>> get such a weird behavior when you cast to types of different sizes
>>> and have the sign bit padded which is not the case in here.
>>>
>>>
>>>
>>>>
>>>>
>>>>> +
>>>>> + desc = qemu_malloc(total_len);
>>>>> +
>>>>> + *tmap_desc = desc;
>>>>> +
>>>>> + *desc = op;
>>>>> + desc++;
>>>>> + memcpy(desc,&len, sizeof(len));
>>>>> + desc += sizeof(len);
>>>>> + memcpy(desc, name, len);
>>>>> + desc += len;
>>>>> + len = 0;
>>>>> + memcpy(desc,&len, sizeof(len));
>>>>> + desc += sizeof(len);
>>>>>
>>>>>
>>>>>
>>>> Shouldn't endianness be a concern?
>>>>
>>>>
>>> Right. Fixed that.
>>>
>>>
>>>
>>>>
>>>>
>>>>> +
>>>>> + return desc - *tmap_desc;
>>>>> +}
>>>>> +
>>>>> +static void free_tmap_op(char *tmap_desc)
>>>>> +{
>>>>> + qemu_free(tmap_desc);
>>>>> +}
>>>>> +
>>>>> +static int rbd_register_image(rados_pool_t pool, const char *name)
>>>>> +{
>>>>> + char *tmap_desc;
>>>>> + const char *dir = RBD_DIRECTORY;
>>>>> + int ret;
>>>>> +
>>>>> + ret = create_tmap_op(CEPH_OSD_TMAP_SET, name,&tmap_desc);
>>>>> + if (ret< 0) {
>>>>> + return ret;
>>>>> + }
>>>>> +
>>>>> + ret = rados_tmap_update(pool, dir, tmap_desc, ret);
>>>>> + free_tmap_op(tmap_desc);
>>>>> +
>>>>> + return ret;
>>>>> +}
>>>>>
>>>>>
>>>>>
>>>> This ops are all synchronous? IOW, rados_tmap_update() call blocks until
>>>> the operation is completed?
>>>>
>>>>
>>> Yeah. And this is only called from the rbd_create() callback.
>>>
>>>
>>>
>>>>> + header_snap += strlen(header_snap) + 1;
>>>>> + if (header_snap> end)
>>>>> + error_report("bad header, snapshot list broken");
>>>>>
>>>>>
>>>>>
>>>> Missing curly braces here.
>>>>
>>>>
>>> Fixed.
>>>
>>>
>>>
>>>>> + if (strncmp(hbuf + 68, RBD_HEADER_VERSION, 8)) {
>>>>> + error_report("Unknown image version %s", hbuf + 68);
>>>>> + r = -EMEDIUMTYPE;
>>>>> + goto failed;
>>>>> + }
>>>>> +
>>>>> + RbdHeader1 *header;
>>>>>
>>>>>
>>>>>
>>>>>
>>>> Don't mix variable definitions with code.
>>>>
>>>>
>>> Fixed.
>>>
>>>
>>>
>>>>> + s->efd = eventfd(0, 0);
>>>>> + if (s->efd< 0) {
>>>>> + error_report("error opening eventfd");
>>>>> + goto failed;
>>>>> + }
>>>>> + fcntl(s->efd, F_SETFL, O_NONBLOCK);
>>>>> + qemu_aio_set_fd_handler(s->efd, rbd_aio_completion_cb, NULL,
>>>>> + rbd_aio_flush_cb, NULL, s);
>>>>>
>>>>>
>>>>>
>>>> It looks like you just use the eventfd to signal aio completion
>>>> callbacks.
>>>> A better way to do this would be to schedule a bottom half. eventfds
>>>> are
>>>> Linux specific and specific to recent kernels.
>>>>
>>>>
>>> Digging back why we introduced the eventfd, it was due to some issues
>>> seen with do_savevm() hangs on qemu_aio_flush(). The reason seemed
>>> that we had no fd associated with the block device, which seemed to
>>> not work well with the qemu aio model. If that assumption is wrong,
>>> we'd be happy to change it. In any case, there are other more portable
>>> ways to generate fds, so if it's needed we can do that.
>>>
>>>
>> There's no fd at all? How do you get notifications about an asynchronous
>> event completion?
>>
>> Regards,
>>
>> Anthony Liguori
>>
>>
> (resending to list, sorry)
>
> The fd is hidden deep under in librados. We get callback notifications
> for events completion.
>
How is that possible? Are the callbacks delivered in the context of a
different thread? If so, don't you need locking?
Regards,
Anthony Liguori
> Thanks,
> Yehuda
>
next prev parent reply other threads:[~2010-10-07 19:51 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-08-02 19:46 [PATCH] ceph/rbd block driver for qemu-kvm (v4) Christian Brunner
2010-08-02 19:46 ` [Qemu-devel] " Christian Brunner
2010-08-02 20:37 ` malc
2010-08-02 20:37 ` malc
2010-08-03 20:14 ` Christian Brunner
2010-08-03 20:14 ` Christian Brunner
2010-09-23 2:21 ` Yehuda Sadeh Weinraub
2010-09-23 2:21 ` Yehuda Sadeh Weinraub
2010-10-07 12:23 ` Kevin Wolf
2010-10-07 12:23 ` Kevin Wolf
2010-10-07 14:12 ` Anthony Liguori
2010-10-07 14:12 ` Anthony Liguori
2010-10-07 18:08 ` Yehuda Sadeh Weinraub
2010-10-07 18:08 ` Yehuda Sadeh Weinraub
2010-10-07 18:38 ` Anthony Liguori
2010-10-07 18:38 ` Anthony Liguori
2010-10-07 18:41 ` Yehuda Sadeh Weinraub
2010-10-07 18:41 ` Yehuda Sadeh Weinraub
2010-10-07 19:51 ` Anthony Liguori [this message]
2010-10-07 19:51 ` Anthony Liguori
2010-10-07 20:47 ` Yehuda Sadeh Weinraub
2010-10-07 20:47 ` Yehuda Sadeh Weinraub
2010-10-07 21:04 ` Anthony Liguori
2010-10-07 21:04 ` Anthony Liguori
2010-10-07 21:49 ` Yehuda Sadeh Weinraub
2010-10-07 21:49 ` Yehuda Sadeh Weinraub
2010-10-07 21:55 ` Anthony Liguori
2010-10-07 21:55 ` Anthony Liguori
2010-10-07 22:45 ` Sage Weil
2010-10-07 22:45 ` Sage Weil
2010-10-08 14:06 ` Anthony Liguori
2010-10-08 14:06 ` Anthony Liguori
2010-10-08 15:50 ` Yehuda Sadeh Weinraub
2010-10-08 15:50 ` Yehuda Sadeh Weinraub
2010-10-08 16:05 ` Anthony Liguori
2010-10-08 16:05 ` Anthony Liguori
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4CAE24C5.8030007@codemonkey.ws \
--to=anthony@codemonkey.ws \
--cc=av1474@comtv.ru \
--cc=ceph-devel@vger.kernel.org \
--cc=chb@muc.de \
--cc=kvm@vger.kernel.org \
--cc=kwolf@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=yehudasa@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.