Re: [Qemu-devel] [PATCH 37/47] add hierarchical bitmap data type and test cases

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Paolo Bonzini <pbonzini@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: kwolf@redhat.com, jcody@redhat.com, qemu-devel@nongnu.org,
	stefanha@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [PATCH 37/47] add hierarchical bitmap data type and test cases
Date: Mon, 30 Jul 2012 15:39:16 +0200	[thread overview]
Message-ID: <50168E84.7050101@redhat.com> (raw)
In-Reply-To: <5013E89C.7080903@redhat.com>

Il 28/07/2012 15:26, Eric Blake ha scritto:
> On 07/24/2012 05:04 AM, Paolo Bonzini wrote:
>> HBitmaps provide an array of bits.  The bits are stored as usual in an
>> array of unsigned longs, but HBitmap is also optimized to provide fast
>> iteration over set bits; going from one bit to the next is O(logB n)
>> worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
>> that the number of levels is in fact fixed.
>>
> 
>> +++ b/hbitmap.c
>> + * So it is easy to move up simply by shifting the index right by
>> + * log2(BITS_PER_LONG) bits.  To move down, you shift the index left
>> + * similarly, and add the word index within the group.  Iteration uses
>> + * ffs (find first set bit) to find the next word to examine; this
>> + * operation can be done in constant time in most current architectures.
> 
> Technically, ffs and friends can be done in constant time in ALL
> architectures.  There are some well-known bit-twiddling algorithms for
> computing it in straightline C code with no conditionals (and therefore
> in constant time, just with a larger constant than on platforms that
> lack the builtin single instruction). 

Technically, those are O(log2 BITS_PER_LONG). :)

>> +
>> +struct HBitmap {
>> +    /* Number of total bits in the bottom level.  */
>> +    uint64_t size;
>> +
>> +    /* Number of set bits in the bottom level.  */
>> +    uint64_t count;
>> +
>> +    /* A scaling factor.  When setting or resetting bits, the bitmap will
>> +     * scale bit numbers right by this amount of bits.  When iterating,
>> +     * the bitmap will scale bit numbers left by this amonut of bits.
> 
> s/amonut/amount/
> 
>> +     * Example of operations in a size-16, granularity-1 HBitmap:
>> +     *
>> +     *    initial state            00000000
>> +     *    set(start=0, count=9)    11111000 (iter: 0, 2, 4, 6, 8)
>> +     *    reset(start=1, count=3)  00111000 (iter: 4, 6, 8)
>> +     *    set(start=9, count=2)    00111100 (iter: 4, 6, 8, 10)
>> +     *    reset(start=5, count=5)  00000000
> 
> Thanks.  That helps me understand tremendously over the previous version
> of this patch.  Basically, you are stating that rather than storing 16
> bits, you compress and only store information on 8 pages, and each
> operation on a range of bits first rounds the bits to determine which
> page they land in then affect the entire page; while iteration only
> visits the first bit of each page.  A granularity of 1 means pages are
> 1<<1 or every 2 bit numbers.

Yes.

> Typical values of granularity will
> probably then be 0 (every bit is important) or 12 (operating on 4k
> pages, where touching any byte in the page is sufficient to track that
> entire page as affected in the bitmap).

In our case, bit indices are sector numbers, so a common value of the
granularity will be for example 7=16-9 (for a 64k-coarse dirty bitmap).

>> +     */
>> +    int granularity;
>> +
>> +    /* A number of progressively less coarse bitmaps (i.e. level 0 is the
>> +     * coarsest).  Each bit in level N represents a word in level N+1 that
>> +     * has a set bit, except the last level where each bit represents the
>> +     * actual bitmap.
>> +     */
>> +    unsigned long *levels[HBITMAP_LEVELS];
> 
> That is, even allocating a 1-bit bitmap will still allocate
> HBITMAP_LEVELS arrays, rather than trying to dynamically optimize and
> reduce the number of levels necessary to hold the requested size.  Fair
> enough.

Also because the HBitmapIter would become even more complex than it
already is.

>> +
>> +static int hbi_count_towards(HBitmapIter *hbi, uint64_t last)
>> +{
>> +    uint64_t next = hbi_next_internal(hbi);
>> +    int n;
>> +
>> +    /* Take it easy with the last few bits.  */
>> +    if (next >= (last & -BITS_PER_LONG)) {
>> +        return (next > last ? 0 : 1);
> 
> You could write this as:
>  return next <= last;
> but probably not worth the obfuscation.

You probably guessed why I wrote it that way.  It's at the top of the
function, and such a return may give the wrong idea that the function
returns a boolean.

>> +void hbitmap_iter_init(HBitmapIter *hbi, HBitmap *hb, uint64_t first)
>> +{
>> +    int i, bit;
>> +    size_t pos;
>> +
>> +    hbi->hb = hb;
>> +    pos = first;
>> +    for (i = HBITMAP_LEVELS; --i >= 0; ) {
>> +        bit = pos & (BITS_PER_LONG - 1);
>> +        pos >>= BITS_PER_LEVEL;
>> +
>> +        /* Drop bits representing items before first.  */
>> +        hbi->cur[i] = hb->levels[i][pos] & ~((1UL << bit) - 1);
>> +
>> +        /* We have already added level i+1, so the lowest set bit has
>> +         * been processed.  Clear it.
>> +         */
>> +        if (i != HBITMAP_LEVELS - 1) {
>> +            hbi->cur[i] &= ~(1UL << bit);
>> +        }
>> +    }
>> +
>> +    hbi->pos = first >> BITS_PER_LEVEL;
>> +    hbi->granularity = hb->granularity;
> 
> Do we really need hbi->granularity, or can the code get by with
> hbi->hb->granularity?

It's probably prematurely optimized---this way the fast path does not
need to access hb at all.  But it's also a little bit helpful in gdb,
and it is practically free, so I'd rather leave it.

>> +HBitmap *hbitmap_alloc(uint64_t size, int granularity)
>> +{
>> +    HBitmap *hb = g_malloc0(sizeof(struct HBitmap));
>> +    int i;
>> +
>> +    assert(granularity >= 0 && granularity < 64);
> 
> Shouldn't this be granularity < BITS_PER_LONG?

Yep, thanks.

>> +++ b/hbitmap.h
> 
>> +#define BITS_PER_LEVEL         (BITS_PER_LONG == 32 ? 5 : 6)
>> +
>> +/* For 32-bit, the largest that fits in a 4 GiB address space.
>> + * For 64-bit, the number of sectors in 1 PiB.  Good luck, in
>> + * either case... :)
>> + */
>> +#define HBITMAP_LOG_MAX_SIZE   (BITS_PER_LONG == 32 ? 34 : 41)
>> +
>> +/* Leave an extra bit for a sentinel.  */
>> +#define HBITMAP_LEVELS         ((HBITMAP_LOG_MAX_SIZE / BITS_PER_LEVEL) + 1)
> 
> Interesting that this picks 7 levels for both 32-bit and 64-bit long
> (hmm, that's why you capped HBITMAP_LOG_MAX_SIZE to the number of
> sectors in 1 PiB, rather than covering the entire address space :)
> 
>> +
>> +struct HBitmapIter {
>> +    HBitmap *hb;
>> +    size_t pos;
>> +    int granularity;
> 
> Again, do you really need granularity here?
> 
>> +    unsigned long cur[HBITMAP_LEVELS];
>> +};
>> +
> 
> I did a much closer read of the code this time around, and I'm happy
> that your implementation looks sound by inspection as well as has an
> accompanying testsuite.

Yeah, no way to be confident about this without a testsuite.

Thanks for the review!

Paolo

>> +++ b/tests/test-hbitmap.c
> 
>> +static void test_hbitmap_iter_granularity(TestHBitmapData *data,
>> +                                          const void *unused)
>> +{
>> +    HBitmapIter hbi;
>> +
>> +    /* Note that hbitmap_test_check has to be invoked manually in this test.  */
>> +    hbitmap_test_init(data, 131072 << 7, 7);
>> +    hbitmap_iter_init(&hbi, data->hb, 0);
>> +    g_assert_cmpint(hbitmap_iter_next(&hbi), <, 0);
>> +    hbitmap_test_set(data, ((L2 + L1 + 1) << 7) + 8, 8);
>> +    hbitmap_iter_init(&hbi, data->hb, 0);
>> +    g_assert_cmpint(hbitmap_iter_next(&hbi), ==, (L2 + L1 + 1) << 7);
> 
> Misleading comment, since you didn't call hbitmap_test_check.
>

next prev parent reply	other threads:[~2012-07-30 13:39 UTC|newest]

Thread overview: 136+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-24 11:03 [Qemu-devel] [PATCH 00/47] Block job improvements for 1.2 Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 01/47] qapi: generalize documentation of streaming commands Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 02/47] qerror/block: introduce QERR_BLOCK_JOB_NOT_ACTIVE Paolo Bonzini
2012-07-26 15:26   ` Kevin Wolf
2012-07-26 15:41     ` Paolo Bonzini
2012-07-26 16:49       ` Luiz Capitulino
2012-07-26 16:59         ` Paolo Bonzini
2012-07-26 17:02           ` Luiz Capitulino
2012-07-24 11:03 ` [Qemu-devel] [PATCH 03/47] block: move job APIs to separate files Paolo Bonzini
2012-07-26 15:50   ` Kevin Wolf
2012-07-24 11:03 ` [Qemu-devel] [PATCH 04/47] block: add block_job_query Paolo Bonzini
2012-07-30 14:47   ` Kevin Wolf
2012-07-30 15:05     ` Paolo Bonzini
2012-07-31  8:47       ` Kevin Wolf
2012-07-31  8:50         ` Paolo Bonzini
2012-08-02 19:28           ` Jeff Cody
2012-07-24 11:03 ` [Qemu-devel] [PATCH 05/47] block: add support for job pause/resume Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 06/47] qmp: add block-job-pause and block-job-resume Paolo Bonzini
2012-08-01  7:42   ` Kevin Wolf
2012-07-24 11:03 ` [Qemu-devel] [PATCH 07/47] qemu-iotests: add test for pausing a streaming operation Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 08/47] block: rename block_job_complete to block_job_completed Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 09/47] block: rename BlockErrorAction, BlockQMPEventAction Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 10/47] block: move BlockdevOnError declaration to QAPI Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 11/47] block: reorganize io error code Paolo Bonzini
2012-08-01  9:30   ` Kevin Wolf
2012-08-01  9:46     ` Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 12/47] block: sort BlockDeviceIoStatus errors by severity Paolo Bonzini
2012-08-01  9:44   ` Paolo Bonzini
2012-08-01  9:44   ` Kevin Wolf
2012-07-24 11:03 ` [Qemu-devel] [PATCH 13/47] block: introduce block job error Paolo Bonzini
2012-07-25 17:40   ` Eric Blake
2012-08-01 10:14   ` Kevin Wolf
2012-08-01 11:17     ` Paolo Bonzini
2012-08-01 11:49       ` Kevin Wolf
2012-08-01 12:09         ` Paolo Bonzini
2012-08-01 12:23           ` Kevin Wolf
2012-08-01 12:30             ` Paolo Bonzini
2012-08-01 13:09               ` Kevin Wolf
2012-08-01 13:21                 ` Paolo Bonzini
2012-08-01 14:01                   ` Kevin Wolf
2012-08-01 14:34                     ` Paolo Bonzini
2012-08-01 14:59                       ` Kevin Wolf
2012-08-01 15:15                         ` Paolo Bonzini
2012-08-06  9:29                           ` Kevin Wolf
2012-08-06  9:44                             ` Paolo Bonzini
2012-08-06 10:45                               ` Kevin Wolf
2012-08-06 10:58                                 ` Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 14/47] stream: add on-error argument Paolo Bonzini
2012-07-31 18:40   ` Eric Blake
2012-08-01 10:29   ` Kevin Wolf
2012-08-01 11:11     ` Paolo Bonzini
2012-08-01 11:45       ` Kevin Wolf
2012-07-24 11:03 ` [Qemu-devel] [PATCH 15/47] blkdebug: process all set_state rules in the old state Paolo Bonzini
2012-07-24 20:06   ` Blue Swirl
2012-07-24 11:03 ` [Qemu-devel] [PATCH 16/47] qemu-iotests: map underscore to dash in QMP argument names Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 17/47] qemu-iotests: add tests for streaming error handling Paolo Bonzini
2012-08-01 10:43   ` Kevin Wolf
2012-08-01 11:09     ` Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 18/47] block: live snapshot documentation tweaks Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 19/47] block: add bdrv_query_info Paolo Bonzini
2012-09-11 13:07   ` Kevin Wolf
2012-09-11 13:12     ` Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 20/47] block: add bdrv_query_stats Paolo Bonzini
2012-07-24 11:03 ` [Qemu-devel] [PATCH 21/47] block: add bdrv_ensure_backing_file Paolo Bonzini
2012-09-11 13:32   ` Kevin Wolf
2012-09-11 13:46     ` Paolo Bonzini
2012-09-11 13:58       ` Kevin Wolf
2012-09-11 14:10         ` Paolo Bonzini
2012-09-11 15:38           ` Kevin Wolf
2012-07-24 11:04 ` [Qemu-devel] [PATCH 22/47] block: make device optional in BlockInfo Paolo Bonzini
2012-09-11 13:38   ` Kevin Wolf
2012-09-11 13:49     ` Paolo Bonzini
2012-09-11 14:02       ` Kevin Wolf
2012-09-11 14:14         ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 23/47] block: add target info to QMP query-blockjobs command Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 24/47] block: introduce new dirty bitmap functionality Paolo Bonzini
2012-09-11 14:57   ` Kevin Wolf
2012-09-11 16:17     ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 25/47] block: add block-job-complete Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 26/47] block: introduce BLOCK_JOB_READY event Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 27/47] block: introduce mirror job Paolo Bonzini
2012-07-25 23:02   ` Eric Blake
2012-09-13 12:54   ` Kevin Wolf
2012-09-13 14:07     ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 28/47] qmp: add drive-mirror command Paolo Bonzini
2012-07-26 23:42   ` Eric Blake
2012-07-27  7:04     ` Paolo Bonzini
2012-07-31  9:26   ` Kevin Wolf
2012-07-31  9:33     ` Paolo Bonzini
2012-07-31  9:46       ` Kevin Wolf
2012-07-31 10:02         ` Paolo Bonzini
2012-07-31 10:25           ` Kevin Wolf
2012-07-31 10:51             ` Paolo Bonzini
2012-07-31 11:13               ` Kevin Wolf
2012-07-31 11:25                 ` Paolo Bonzini
2012-07-31 12:17                   ` Kevin Wolf
2012-07-31 12:52                     ` Paolo Bonzini
2012-09-13 13:15   ` Kevin Wolf
2012-09-13 13:24     ` Paolo Bonzini
2012-09-13 13:26       ` Kevin Wolf
2012-09-13 13:38         ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 29/47] mirror: support querying target file Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 30/47] mirror: implement completion Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 31/47] qemu-iotests: add mirroring test case Paolo Bonzini
2012-07-26 23:46   ` Eric Blake
2012-07-27  7:04     ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 32/47] block: forward bdrv_iostatus_reset to block job Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 33/47] mirror: add support for on-source-error/on-target-error Paolo Bonzini
2012-07-27 15:26   ` Eric Blake
2012-07-30 13:29     ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 34/47] qmp: add pull_event function Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 35/47] qemu-iotests: add testcases for mirroring on-source-error/on-target-error Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 36/47] host-utils: add ffsl and flsl Paolo Bonzini
2012-07-27 16:05   ` Eric Blake
2012-07-30 13:30     ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 37/47] add hierarchical bitmap data type and test cases Paolo Bonzini
2012-07-28 13:26   ` Eric Blake
2012-07-30 13:39     ` Paolo Bonzini [this message]
2012-07-30 14:18       ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 38/47] block: implement dirty bitmap using HBitmap Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 39/47] block: make round_to_clusters public Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 40/47] mirror: perform COW if the cluster size is bigger than the granularity Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 41/47] block: return count of dirty sectors, not chunks Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 42/47] block: allow customizing the granularity of the dirty bitmap Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 43/47] mirror: allow customizing the granularity Paolo Bonzini
2012-07-28 13:43   ` Eric Blake
2012-07-30 13:40     ` Paolo Bonzini
2012-07-30 13:53       ` Eric Blake
2012-07-30 14:03         ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 44/47] mirror: switch mirror_iteration to AIO Paolo Bonzini
2012-07-28 13:46   ` Eric Blake
2012-07-30 13:41     ` Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 45/47] mirror: add buf-size argument to drive-mirror Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 46/47] mirror: support more than one in-flight AIO operation Paolo Bonzini
2012-07-24 11:04 ` [Qemu-devel] [PATCH 47/47] mirror: support arbitrarily-sized iterations Paolo Bonzini
2012-07-28 13:51 ` [Qemu-devel] [PATCH 00/47] Block job improvements for 1.2 Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50168E84.7050101@redhat.com \
    --to=pbonzini@redhat.com \
    --cc=eblake@redhat.com \
    --cc=jcody@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).