From: Liu Bo <liubo2009@cn.fujitsu.com>
To: dave@jikos.cz
Cc: linux-btrfs@vger.kernel.org, chris.mason@oracle.com, josef@redhat.com
Subject: Re: [RFC PATCH v2 1/3] Btrfs: add the Probabilistic Skiplist
Date: Wed, 11 Jan 2012 10:06:47 +0800 [thread overview]
Message-ID: <4F0CEEB7.2080809@cn.fujitsu.com> (raw)
In-Reply-To: <20120111003700.GT7322@twin.jikos.cz>
On 01/11/2012 08:37 AM, David Sterba wrote:
> Hi, a few thoughts and comments below:
>
> On Tue, Jan 10, 2012 at 03:31:34PM +0800, Liu Bo wrote:
>> c) The level limit may need to be adjusted.
>> I know it is a magic number, but now for simplicity we just keep it at 16,
>> and then each skiplist is able to contain (2^32-1)/3 nodes at most.
>
> (2^32-1)/3 = 1,431,655,765 that's a lot, I wonder what an average member
> count of a skiplist would be and whether eg. maxlevel = 12 is not enough
> (5,592,405 members).
>
hmm, sorry, I found I've made a mistake here,
let me correct it here (changelog will also be updated later):
As I set the probability to 1/4, the members linked on N+1 level list will be 1/4 of
those linked on N level list.
And what's more, in skiplist a node can be linked on multi levels, eg. a node with N+1 level will
also be linked on N level list.
So before the node count reaches to 4^(maxlevel - 1), the skiplist can maintain O(lgn),
and after that, it will be no more O(lgn) although we can still insert nodes into the skiplist.
That's the difference.
> or you can set the maxlevel during skiplist creation, or predefine a
> "small" skiplist with compile-time-set level to whatever < 16.
>
> this can be tuned later of course.
>
Yes, I do set the maxlevel to 16 at the creation of a skiplist.
Here 4^(16 - 1) is 2^30, I don't think this is enough for some severe workloads which build large
amount of fragments.
Maybe we should make the maxlevel self-update.
>> --- /dev/null
>> +++ b/fs/btrfs/skiplist.c
>> @@ -0,0 +1,98 @@
>> +inline int sl_fill_node(struct sl_node *node, int level, gfp_t mask)
>
> I suggest to pick the full prefix "skiplist_" instead of just "sl_",
> it'll be IMHO more readable and googlable. (Out of curiosity I grepped
> for the sl_ prefix and it's used by drivers/net/slip/slip.c).
>
I did hesitate for a while between "skiplist_" and "sl_"...
and I just wanna make it be similar to "rb_".
Anyway, I'm ok with "skiplist_".
>> +{
>> + struct sl_node **p;
>> + struct sl_node **q;
>> + int num;
>> +
>> + BUG_ON(level > MAXLEVEL);
>> +
>> + num = level + 1;
>> + p = kmalloc(sizeof(*p) * num, mask);
>> + BUG_ON(!p);
>
> you can drop the BUG_ON
>
>> + if (!p)
> ^^^^^^^
>
>> + return -ENOMEM;
>> + q = kmalloc(sizeof(*q) * num, mask);
>> + BUG_ON(!q);
> ^^^^^^^^^^
>
ok, just in case.
>> + if (!q) {
>> + kfree(p);
>> + return -ENOMEM;
>> + }
>> +
>> + node->next = p;
>> + node->prev = q;
>> + node->level = level;
>> + return 0;
>> +}
>> +
>> diff --git a/fs/btrfs/skiplist.h b/fs/btrfs/skiplist.h
>> new file mode 100644
>> index 0000000..3e414b5
>> --- /dev/null
>> +++ b/fs/btrfs/skiplist.h
>> +
>> +#define MAXLEVEL 16
>> +/* double p = 0.25; */
>> +
>> +struct sl_node {
>> + struct sl_node **next;
>> + struct sl_node **prev;
>> + unsigned int level;
>> + unsigned int head:1;
>
> the bitfield will use another sizeof(int) bytes, but the level is at
> most 16, you can reduce it's size eg to unsigned short.
> on the other hand, the structure has to start at address aligned to
> sizeof(void*) and the bytes after 'head' up to next sizeof(void*)
> boundary will be left unusable anyway. then, 'head' could be a full int
> or bool so the compiler is not restricted and forced to keep state of
> the single bit. if access to these items is exptected to be frequent,
> the diffenence could be mesurable.
>
I see. Thanks a lot for your advice!
thanks,
liubo
>> +};
>
>
> david
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2012-01-11 2:06 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-01-10 7:31 [RFC PATCH v2 0/3] Btrfs: apply the Probabilistic Skiplist on btrfs Liu Bo
2012-01-10 7:31 ` [RFC PATCH v2 1/3] Btrfs: add the Probabilistic Skiplist Liu Bo
2012-01-11 0:37 ` David Sterba
2012-01-11 2:06 ` Liu Bo [this message]
2012-01-11 14:41 ` David Sterba
2012-01-11 1:36 ` David Sterba
2012-01-10 7:31 ` [RFC PATCH v2 2/3] Btrfs: rebuild extent_map based on skiplist Liu Bo
2012-01-10 7:31 ` [RFC PATCH v2 3/3] Btrfs: convert rwlock to RCU for extent_map Liu Bo
2012-01-12 21:28 ` [RFC PATCH v2 0/3] Btrfs: apply the Probabilistic Skiplist on btrfs Andi Kleen
2012-01-13 2:18 ` Liu Bo
2012-01-13 4:10 ` Dave Chinner
2012-01-13 9:01 ` Andi Kleen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F0CEEB7.2080809@cn.fujitsu.com \
--to=liubo2009@cn.fujitsu.com \
--cc=chris.mason@oracle.com \
--cc=dave@jikos.cz \
--cc=josef@redhat.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.