From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <54B516C6.7010403@parallels.com>
Date: Tue, 13 Jan 2015 15:59:50 +0300
From: Vladimir Sementsov-Ogievskiy
MIME-Version: 1.0
References: <1418307457-25996-1-git-send-email-vsementsov@parallels.com>
 <1418307457-25996-5-git-send-email-vsementsov@parallels.com>
 <54AEF4E8.5070106@redhat.com>
In-Reply-To: <54AEF4E8.5070106@redhat.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 4/9] hbitmap: store / restore
To: John Snow, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, den@openvz.org, stefanha@redhat.com,
 "pbonzini >> Paolo Bonzini"

Best regards,
Vladimir

On 09.01.2015 00:21, John Snow wrote:
>
>
> On 12/11/2014 09:17 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Functions to store / restore HBitmap. HBitmap should be saved to linear
>> bitmap format independently of endianess.
>>
>> Because of restoring in several steps, every step writes only the last
>> level of the bitmap. All other levels are restored by
>> hbitmap_restore_finish as a last step of restoring. So, HBitmap is
>> inconsistent while restoring.
.....
>
> I guess by nature of how bitmap migration has been implemented, we
> cannot help but restore parts of the bitmap piece by piece, which
> requires us to force the bitmap into an inconsistent state.
>
> Yuck.
>
> It might be "nice" to turn on a disable bit inside the hbitmap that
> disallows its use until it is made consistent again, but I don't know
> what the performance hit of the extra conditionals everywhere would be
> like.
>

Another approach is to repair the bitmap after each piece is set. It is
a bit slower, of course, but then the bitmap stays consistent after
every hbitmap_restore_data() call.
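Something like this, for example (only hbitmap_restore_data() and
hbitmap_restore_finish() are from this patch; the wrapper function and
its name are made up just for the illustration):

#include "qemu/hbitmap.h"

/* Deserialize one received piece and immediately repair the upper
 * levels, so the bitmap is never observed in an inconsistent state
 * between pieces. */
static void bitmap_load_piece(HBitmap *hb, uint8_t *buf,
                              uint64_t first_bit, uint64_t nr_bits)
{
    hbitmap_restore_data(hb, buf, first_bit, nr_bits);
    hbitmap_restore_finish(hb); /* slower, but hb stays consistent */
}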
>> +/**
>> + * hbitmap_restore_finish
>> + * @hb: HBitmap to operate on.
>> + *
>> + * Repair HBitmap after calling hbitmap_restore_data. Actuall all HBitmap
>> + * layers are restore here.
>> + */
>> +void hbitmap_restore_finish(HBitmap *hb);
>> +
>> +/**
>>   * hbitmap_free:
>>   * @hb: HBitmap to operate on.
>>   *
>
> These are just biased opinions:
>
> - It might be nice to name the store/restore functions "serialize" and
> "deserialize," to describe exactly what they are doing.
>
> - I might refer to "restore_finish" as "post_load" or "post_restore"
> or something similar to mimic how device migration functions are
> named. I think hbitmap_restore_data_finalize would also be fine; the
> key part here is clearly naming it relative to "restore_data."
>

Hmm. OK, what about the following set:

    hbitmap_serialize()
    hbitmap_deserialize_part()
    hbitmap_deserialize_finish()

>> diff --git a/util/hbitmap.c b/util/hbitmap.c
>> index 8aa7406..ac0323f 100644
>> --- a/util/hbitmap.c
>> +++ b/util/hbitmap.c
>> @@ -362,6 +362,90 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item)
>>       return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] & bit) != 0;
>>   }
>>
>> +uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count)
>> +{
>> +    uint64_t size;
>> +
>> +    if (count == 0) {
>> +        return 0;
>> +    }
>> +
>> +    size = (((count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1;
>> +
>> +    return size * sizeof(unsigned long);
>> +}
>> +
>
> This seems flawed to me: number of bits without an offset can't be
> mapped to a number of real bytes, because "two bits" may or may not
> cross a granularity boundary, depending on *WHERE* you start counting.
>
> e.g.
>
> granularity = 1 (i.e. 2^1 = 2 virtual bits per 1 real bit)
> virtual: 001100
> real:    0 1 0
>
> The amount of space required to hold "two bits" here could be as
> little as one bit, if the offset is k={0,2,4} but it could be as much
> as two bits if the offset is k={1,3}.
>
> You may never use the function in this way, but mapping virtual bits
> to an implementation byte-size seems like it is inviting an
> inconsistency.

I dislike this function too... But unfortunately we need the size in
bytes used for serialization.

Hmm. OK, without loss of generality, let the offset be less than one
granularity chunk, i.e. offset < (1 << hb->granularity). Then the
precise formula is

    size = (((offset + count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1;

so, taking the worst case offset = (1 << hb->granularity) - 1,

    size = ((((1 << hb->granularity) + count - 2) >> hb->granularity)
            >> BITS_PER_LEVEL) + 1;

would be enough in any case. Ok?
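In other words, hbitmap_data_size() in util/hbitmap.c could become
something like this (just the formula above restated as code, not the
final v2 version):

uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count)
{
    uint64_t size;

    if (count == 0) {
        return 0;
    }

    /* Enough longs for @count bits starting anywhere within a
     * granularity chunk: assume the worst case,
     * offset == (1 << hb->granularity) - 1. */
    size = ((((1UL << hb->granularity) + count - 2)
             >> hb->granularity) >> BITS_PER_LEVEL) + 1;

    return size * sizeof(unsigned long);
}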
>
>> +void hbitmap_store_data(const HBitmap *hb, uint8_t *buf,
>> +                        uint64_t start, uint64_t count)
>> +{
>> +    uint64_t last = start + count - 1;
>> +    unsigned long *out = (unsigned long *)buf;
>> +
>> +    if (count == 0) {
>> +        return;
>> +    }
>> +
>> +    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
>> +    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
>> +    count = last - start + 1;
>> +
>> +#ifdef __BIG_ENDIAN_BITFIELD
>> +    for (i = start; i <= last; ++i) {
>> +        unsigned long el = hb->levels[HBITMAP_LEVELS - 1][i];
>> +        out[i] = (BITS_PER_LONG == 32 ? cpu_to_le32(el) : cpu_to_le64(el));
>> +    }
>> +#else
>> +    memcpy(out, &hb->levels[HBITMAP_LEVELS - 1][start],
>> +           count * sizeof(unsigned long));
>> +#endif
>> +}
>> +
>
> I suppose the ifdefs are trying to take an optimization by using
> memcpy if at all possible, without a conversion.
>
> Why are the conversions to little endian, though? Shouldn't we be
> serializing to a Big Endian format?

As Paolo already explained, it is for portability: the LE bitmap
representation is independent of the size of the bitmap array elements,
which may be 32 bit, 64 bit, or anything else, as long as the LE format
is used.

>
>> +void hbitmap_restore_data(HBitmap *hb, uint8_t *buf,
>> +                          uint64_t start, uint64_t count)
>> +{
>> +    uint64_t last = start + count - 1;
>> +    unsigned long *in = (unsigned long *)buf;
>> +
>> +    if (count == 0) {
>> +        return;
>> +    }
>> +
>> +    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
>> +    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
>> +    count = last - start + 1;
>> +
>> +#ifdef __BIG_ENDIAN_BITFIELD
>> +    for (i = start; i <= last; ++i) {
>> +        hb->levels[HBITMAP_LEVELS - 1][i] =
>> +            (BITS_PER_LONG == 32 ? be32_to_cpu(in[i]) : be64_to_cpu(in[i]));
>> +    }
>> +#else
>> +    memcpy(&hb->levels[HBITMAP_LEVELS - 1][start], in,
>> +           count * sizeof(unsigned long));
>> +#endif
>> +}
>> +
>
> ...? We're storing as LE but restoring from BE? I'm confused.

Oh, it's a mistake, thanks. I missed it when testing because the
#ifdef __BIG_ENDIAN_BITFIELD branch was never compiled.

>
> I'm also not clear on the __BIG_ENDIAN_BITFIELD macro. Why do we want
> to pack differently based on how C-bitfields are packed by the
> compiler? I don't think that has any impact on how longs are stored
> (always in the host native format.)
>
> I would rather see the serialize/deserialize always convert to and
> from BE explicitly with cpu_to_be[32|64] and be[32|64]_to_cpu. I think
> trying to optimize the pack/unpack in the no-op condition with a
> #define and a memcpy is inviting portability problems.

OK, I'll leave only the branch with the for (...) loop in v2.

>
>> +void hbitmap_restore_finish(HBitmap *bitmap)
>> +{
>> +    int64_t i, size;
>> +    int lev, j;
>> +
>> +    /* restore levels starting from penultimate to zero level, assuming
>> +     * that the last one is ok */
>> +    size = MAX((bitmap->size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
>> +    for (lev = HBITMAP_LEVELS - 1; lev-- > 0; ) {
>> +        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
>> +        for (i = 0; i < size; ++i) {
>> +            bitmap->levels[lev][i] = 0;
>> +            for (j = 0; j < BITS_PER_LONG; ++j) {
>> +                if (bitmap->levels[lev+1][(i << BITS_PER_LEVEL) + j]) {
>> +                    bitmap->levels[lev][i] |= (1 << j);
>> +                }
>> +            }
>
> Just a musing: Should we cache the size of each level? I know we can
> keep recalculating it, but... why bother? We recalculate it in so many
> places now.

Interesting point. However, there is another real bug here: there may
be no longs on the next level for the last bits of the current one, so
the inner loop can read past the end of levels[lev + 1]. The new
version of this function will be in v2; a rough sketch of the idea is
at the end of this mail.

>
>> +    }
>> +}
>> +
>>   void hbitmap_free(HBitmap *hb)
>>   {
>>       unsigned i;
>>
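To illustrate the "no longs on the next level" problem mentioned above,
the fix could look roughly like this (only a sketch of the idea, not
the actual v2 code; the structure follows the patch, while prev_size
and j_end are introduced here just for the example):

void hbitmap_restore_finish(HBitmap *bitmap)
{
    int64_t i, j, j_end, size, prev_size;
    int lev;

    /* Rebuild the levels from the penultimate one down to level zero,
     * assuming the last level was already filled in by
     * hbitmap_restore_data(). */
    size = MAX((bitmap->size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
    for (lev = HBITMAP_LEVELS - 1; lev-- > 0; ) {
        prev_size = size; /* number of longs on level lev + 1 */
        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);

        for (i = 0; i < size; ++i) {
            /* The last long of this level may describe fewer than
             * BITS_PER_LONG longs of level lev + 1; do not read past
             * the end of that level. */
            j_end = MIN(BITS_PER_LONG, prev_size - (i << BITS_PER_LEVEL));

            bitmap->levels[lev][i] = 0;
            for (j = 0; j < j_end; ++j) {
                if (bitmap->levels[lev + 1][(i << BITS_PER_LEVEL) + j]) {
                    bitmap->levels[lev][i] |= 1UL << j;
                }
            }
        }
    }
}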