From: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
To: Max Reitz <mreitz@redhat.com>, qemu devel <qemu-devel@nongnu.org>,
Eric Blake <eblake@redhat.com>, Alberto Garcia <berto@igalia.com>,
Kevin Wolf <kwolf@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu block <qemu-block@nongnu.org>,
Jiang Yunhong <yunhong.jiang@intel.com>,
Dong Eddie <eddie.dong@intel.com>,
Markus Armbruster <armbru@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
Gonglei <arei.gonglei@huawei.com>,
zhanghailiang <zhang.zhanghailiang@huawei.com>
Subject: Re: [Qemu-devel] [PATCH v10 2/3] quorum: implement bdrv_add_child() and bdrv_del_child()
Date: Mon, 7 Mar 2016 17:13:26 +0800
Message-ID: <56DD4636.40700@cn.fujitsu.com>
In-Reply-To: <56DB21B7.7050104@redhat.com>

On 03/06/2016 02:13 AM, Max Reitz wrote:
> On 16.02.2016 10:37, Changlong Xie wrote:
>> From: Wen Congyang <wency@cn.fujitsu.com>
>>
>> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
>> ---
>> block.c | 8 ++--
>> block/quorum.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++++-
>> include/block/block.h | 4 ++
>> 3 files changed, 128 insertions(+), 6 deletions(-)
>>
>> diff --git a/block.c b/block.c
>> index 08aa979..c3c9dc0 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -1198,10 +1198,10 @@ static int bdrv_fill_options(QDict **options, const char *filename,
>> return 0;
>> }
>>
>> -static BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
>> - BlockDriverState *child_bs,
>> - const char *child_name,
>> - const BdrvChildRole *child_role)
>> +BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
>> + BlockDriverState *child_bs,
>> + const char *child_name,
>> + const BdrvChildRole *child_role)
>> {
>> BdrvChild *child = g_new(BdrvChild, 1);
>> *child = (BdrvChild) {
>> diff --git a/block/quorum.c b/block/quorum.c
>> index a5ae4b8..e5a7e4f 100644
>> --- a/block/quorum.c
>> +++ b/block/quorum.c
>> @@ -24,6 +24,7 @@
>> #include "qapi/qmp/qstring.h"
>> #include "qapi-event.h"
>> #include "crypto/hash.h"
>> +#include "qemu/bitmap.h"
>>
>> #define HASH_LENGTH 32
>>
>> @@ -81,6 +82,8 @@ typedef struct BDRVQuorumState {
>> bool rewrite_corrupted;/* true if the driver must rewrite-on-read corrupted
>> * block if Quorum is reached.
>> */
>> + unsigned long *index_bitmap;
>> + int bsize;
>>
>> QuorumReadPattern read_pattern;
>> } BDRVQuorumState;
>> @@ -876,9 +879,9 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
>> ret = -EINVAL;
>> goto exit;
>> }
>> - if (s->num_children < 2) {
>> + if (s->num_children < 1) {
>> error_setg(&local_err,
>> - "Number of provided children must be greater than 1");
>> + "Number of provided children must be 1 or more");
>
> Side note: Actually, we could work with 0 children, too. Quorum would
> then need to implement bdrv_is_inserted() and return false if there are
> no children.
>
> But that is something that can be implemented later on if the need arises.
Hi Max
Thanks for pointing it out.
>
>> ret = -EINVAL;
>> goto exit;
>> }
>> @@ -927,6 +930,7 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
>> /* allocate the children array */
>> s->children = g_new0(BdrvChild *, s->num_children);
>> opened = g_new0(bool, s->num_children);
>> + s->index_bitmap = bitmap_new(s->num_children);
>>
>> for (i = 0; i < s->num_children; i++) {
>> char indexstr[32];
>> @@ -942,6 +946,8 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
>>
>> opened[i] = true;
>> }
>> + bitmap_set(s->index_bitmap, 0, s->num_children);
>> + s->bsize = s->num_children;
>>
>> g_free(opened);
>> goto exit;
>> @@ -998,6 +1004,115 @@ static void quorum_attach_aio_context(BlockDriverState *bs,
>> }
>> }
>>
>> +static int get_new_child_index(BDRVQuorumState *s)
>> +{
>> + int index;
>> +
>> + index = find_next_zero_bit(s->index_bitmap, s->bsize, 0);
>> + if (index < s->bsize) {
>> + return index;
>> + }
>> +
>> + if ((s->bsize % BITS_PER_LONG) == 0) {
>> + s->index_bitmap = bitmap_zero_extend(s->index_bitmap, s->bsize,
>> + s->bsize + 1);
>
> I think this function needs to be called unconditionally. Looking into
> its implementation, its call to g_realloc() will not do anything (and it
> will probably be pretty quick at that), but the following bitmap_clear()
Yes. If "BITS_TO_LONGS(new_nbits) == BITS_TO_LONGS(old_nbits)",
g_realloc will do nothing.
> will only clear the bits from old_nbits (s->bsize) to new_nbits
> (s->bsize + 1).
>
> Thus, if you only call this function every 32nd/64th child, only that
> child's bit will be initialized to zero. All the rest is undefined.
>
> You probably didn't notice because bitmap_new() returns a
> zero-initialized bitmap, and thus you'd have to create around 64
> children (on an x64 machine) to notice
Ooh, that's a *BIG* catch. I'll remove the incorrect "if" condition in
the next version. *Thanks*
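
To illustrate the point above, here is a minimal standalone sketch. It does
not use QEMU's bitmap code: sketch_zero_extend() is a hypothetical stand-in
that only mimics the behaviour described in the review (reallocate, then
clear the bits from old_nbits to new_nbits), and the 0xff fill is an
assumption used to simulate stale memory deterministically.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in: grow the allocation if needed, then clear only
 * the bits in [old_nbits, new_nbits). Earlier bits are left untouched. */
static unsigned long *sketch_zero_extend(unsigned long *map,
                                         size_t old_nbits, size_t new_nbits)
{
    size_t i;
    unsigned long *tmp = realloc(map,
                                 BITS_TO_LONGS(new_nbits) * sizeof(*map));

    if (!tmp) {
        abort();
    }
    for (i = old_nbits; i < new_nbits; i++) {
        tmp[i / BITS_PER_LONG] &= ~(1UL << (i % BITS_PER_LONG));
    }
    return tmp;
}

/* Grow the bitmap by one bit, either only on word boundaries (as in the
 * posted patch) or on every call (as suggested in the review). */
static unsigned long *grow_by_one(unsigned long *map, size_t *nbits,
                                  int only_on_word_boundary)
{
    if (!only_on_word_boundary || (*nbits % BITS_PER_LONG) == 0) {
        map = sketch_zero_extend(map, *nbits, *nbits + 1);
    }
    (*nbits)++;
    return map;
}

/* Start from one word of simulated stale memory, grow from 0 to 2 bits,
 * and report the value of bit 1. */
int stale_bit_after_growth(int only_on_word_boundary)
{
    size_t nbits = 0;
    unsigned long *map = malloc(sizeof(*map));
    int bit;

    if (!map) {
        abort();
    }
    memset(map, 0xff, sizeof(*map)); /* assumption: simulated stale bits */
    map = grow_by_one(map, &nbits, only_on_word_boundary);
    map = grow_by_one(map, &nbits, only_on_word_boundary);
    bit = (int)((map[0] >> 1) & 1UL);
    free(map);
    return bit;
}
```

With the conditional call, bit 1 keeps its stale value; with the
unconditional call, every newly exposed bit is cleared.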
>
>> + }
>> +
>> + return s->bsize++;
>> +}
>> +
>> +static void remove_child_index(BDRVQuorumState *s, int index)
>> +{
>> + int last_index;
>> + long new_len;
>
> size_t would be the more appropriate type.
okay
>
>> +
>> + assert(index < s->bsize);
>> +
>> + clear_bit(index, s->index_bitmap);
>> + if (index < s->bsize - 1) {
>> + /*
>> + * The last bit is always set, and we don't clear
>
> s/don't/didn't/
I'm going to remove "and we don't clear the last bit" here.
>
>> + * the last bit.
>> + */
>> + return;
>> + }
>> +
>> + last_index = find_last_bit(s->index_bitmap, s->bsize);
>
> An assert(last_index < s->bsize); here wouldn't hurt.
>
okay.
> (last_index == s->bsize would be the case if no bit is set in
> s->index_bitmap anymore, which should be impossible.)
>
>> + s->bsize = last_index + 1;
>> + if (BITS_TO_LONGS(last_index + 1) == BITS_TO_LONGS(s->bsize)) {
Correcting myself here: the condition should be
"BITS_TO_LONGS(old_bsize) == BITS_TO_LONGS(s->bsize)".
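
A small sketch of why the posted comparison can never detect a shrink,
written as standalone C rather than QEMU code. The function names are
hypothetical, and it assumes old_bsize means the bitmap size before it is
overwritten with last_index + 1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* As posted: bsize is overwritten first, so both sides of the comparison
 * are computed from last_index + 1 and are always equal. */
int needs_shrink_as_posted(size_t bsize, size_t last_index)
{
    bsize = last_index + 1;
    return BITS_TO_LONGS(last_index + 1) != BITS_TO_LONGS(bsize);
}

/* Corrected: remember the old size before overwriting it, then compare
 * the number of longs needed before and after. */
int needs_shrink_corrected(size_t bsize, size_t last_index)
{
    size_t old_bsize = bsize;

    bsize = last_index + 1;
    return BITS_TO_LONGS(old_bsize) != BITS_TO_LONGS(bsize);
}
```

For example, shrinking from 200 bits to 1 bit needs fewer longs, but only
the corrected variant reports it.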
>> + return;
>> + }
>> +
>> + new_len = BITS_TO_LONGS(last_index + 1) * sizeof(unsigned long);
>
> s/last_index + 1/s->bsize/ looks better to me.
okay.
>
>> + s->index_bitmap = g_realloc(s->index_bitmap, new_len);
>> +}
>> +
>> +static void quorum_add_child(BlockDriverState *bs, BlockDriverState *child_bs,
>> + Error **errp)
>> +{
>> + BDRVQuorumState *s = bs->opaque;
>> + BdrvChild *child;
>> + char indexstr[32];
>> + int index, ret;
>> +
>> + index = get_new_child_index(s);
>> + ret = snprintf(indexstr, 32, "children.%d", index);
>> + if (ret < 0 || ret >= 32) {
>> + error_setg(errp, "cannot generate child name");
>> + return;
>> + }
>> +
>> + bdrv_drain(bs);
>> +
>> + assert(s->num_children <= INT_MAX / sizeof(BdrvChild *));
>> + if (s->num_children == INT_MAX / sizeof(BdrvChild *)) {
>> + error_setg(errp, "Too many children");
>> + return;
>> + }
>> + s->children = g_renew(BdrvChild *, s->children, s->num_children + 1);
>> +
>> + bdrv_ref(child_bs);
>> + child = bdrv_attach_child(bs, child_bs, indexstr, &child_format);
>> + s->children[s->num_children++] = child;
>> + set_bit(index, s->index_bitmap);
>> +}
>> +
>> +static void quorum_del_child(BlockDriverState *bs, BlockDriverState *child_bs,
>> + Error **errp)
>> +{
>> + BDRVQuorumState *s = bs->opaque;
>> + BdrvChild *child;
>> + int i, index;
>> +
>> + for (i = 0; i < s->num_children; i++) {
>> + if (s->children[i]->bs == child_bs) {
>> + break;
>> + }
>> + }
>> +
>> + /* we have checked it in bdrv_del_child() */
>> + assert(i < s->num_children);
>> + child = s->children[i];
>> +
>> + if (s->num_children <= s->threshold) {
>> + error_setg(errp,
>> + "The number of children cannot be lower than the vote threshold %d",
>> + s->threshold);
>> + return;
>> + }
>> +
>> + /* child->name is "children.%d" */
>
> Optional: assert(!strncmp(child->name, "children.", 9));
>
>> + index = atoi(child->name + 9);
>
> Optional: Assert absence of an error:
>
> unsigned long index;
> char *endptr;
>
> index = strtoul(child->name + 9, &endptr, 10);
> assert(index >= 0 && !*endptr);
Really useful, but since we strictly control the format of 'child->name'
in quorum_add_child, let's just keep the original one.
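
For reference, a standalone sketch of the stricter parsing discussed above.
parse_child_index() is a hypothetical helper, not part of the patch. It
returns -1 instead of asserting, and it checks the first character
explicitly because strtoul() accepts leading whitespace and a sign, and
because an "index >= 0" test is always true for an unsigned long. atoi(),
by contrast, returns 0 for malformed input, which cannot be told apart
from a real index 0.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the index encoded in "children.<decimal>",
 * or -1 if the name does not have exactly that form. */
int parse_child_index(const char *name)
{
    static const char prefix[] = "children.";
    const size_t prefix_len = sizeof(prefix) - 1;
    const char *digits;
    char *endptr;
    unsigned long value;

    if (strncmp(name, prefix, prefix_len) != 0) {
        return -1;
    }
    digits = name + prefix_len;

    /* Require a digit first: rejects "", "-1", "+1" and leading spaces. */
    if (*digits < '0' || *digits > '9') {
        return -1;
    }

    errno = 0;
    value = strtoul(digits, &endptr, 10);
    if (errno != 0 || *endptr != '\0' || value > INT_MAX) {
        return -1; /* overflow, trailing garbage, or too large for int */
    }
    return (int)value;
}
```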
Thanks
-Xie
>
> Max
>
>> +
>> + bdrv_drain(bs);
>> + /* We can safely remove this child now */
>> + memmove(&s->children[i], &s->children[i + 1],
>> + (s->num_children - i - 1) * sizeof(void *));
>> + s->children = g_renew(BdrvChild *, s->children, --s->num_children);
>> + remove_child_index(s, index);
>> + bdrv_unref_child(bs, child);
>> +}
>> +
>> static void quorum_refresh_filename(BlockDriverState *bs, QDict *options)
>> {
>> BDRVQuorumState *s = bs->opaque;
>> @@ -1053,6 +1168,9 @@ static BlockDriver bdrv_quorum = {
>> .bdrv_detach_aio_context = quorum_detach_aio_context,
>> .bdrv_attach_aio_context = quorum_attach_aio_context,
>>
>> + .bdrv_add_child = quorum_add_child,
>> + .bdrv_del_child = quorum_del_child,
>> +
>> .is_filter = true,
>> .bdrv_recurse_is_first_non_filter = quorum_recurse_is_first_non_filter,
>> };
>> diff --git a/include/block/block.h b/include/block/block.h
>> index ecde190..4b787d2 100644
>> --- a/include/block/block.h
>> +++ b/include/block/block.h
>> @@ -517,6 +517,10 @@ void bdrv_disable_copy_on_read(BlockDriverState *bs);
>> void bdrv_ref(BlockDriverState *bs);
>> void bdrv_unref(BlockDriverState *bs);
>> void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child);
>> +BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
>> + BlockDriverState *child_bs,
>> + const char *child_name,
>> + const BdrvChildRole *child_role);
>>
>> bool bdrv_op_is_blocked(BlockDriverState *bs, BlockOpType op, Error **errp);
>> void bdrv_op_block(BlockDriverState *bs, BlockOpType op, Error *reason);
>>
>
>