From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:45210) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YB4xR-0000NH-QG for qemu-devel@nongnu.org; Tue, 13 Jan 2015 12:08:47 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YB4xP-0007gU-1h for qemu-devel@nongnu.org; Tue, 13 Jan 2015 12:08:45 -0500 Received: from mx1.redhat.com ([209.132.183.28]:54515) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YB4xO-0007gH-HC for qemu-devel@nongnu.org; Tue, 13 Jan 2015 12:08:42 -0500 Message-ID: <54B55118.1020607@redhat.com> Date: Tue, 13 Jan 2015 12:08:40 -0500 From: John Snow MIME-Version: 1.0 References: <1418307457-25996-1-git-send-email-vsementsov@parallels.com> <1418307457-25996-5-git-send-email-vsementsov@parallels.com> <54AEF4E8.5070106@redhat.com> <54B516C6.7010403@parallels.com> In-Reply-To: <54B516C6.7010403@parallels.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH 4/9] hbitmap: store / restore List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Vladimir Sementsov-Ogievskiy , qemu-devel@nongnu.org Cc: kwolf@redhat.com, den@openvz.org, stefanha@redhat.com, "pbonzini >> Paolo Bonzini" On 01/13/2015 07:59 AM, Vladimir Sementsov-Ogievskiy wrote: > > Best regards, > Vladimir > > On 09.01.2015 00:21, John Snow wrote: >> >> >> On 12/11/2014 09:17 AM, Vladimir Sementsov-Ogievskiy wrote: >>> Functions to store / restore HBitmap. HBitmap should be saved to linear >>> bitmap format independently of endianess. >>> >>> Because of restoring in several steps, every step writes only the last >>> level of the bitmap. All other levels are restored by >>> hbitmap_restore_finish as a last step of restoring. So, HBitmap is >>> inconsistent while restoring. > ..... >> >> I guess by nature of how bitmap migration has been implemented, we >> cannot help but restore parts of the bitmap piece by piece, which >> requires us to force the bitmap into an inconsistent state. >> >> Yuck. >> >> It might be "nice" to turn on a disable bit inside the hbitmap that >> disallows its use until it is made consistent again, but I don't know >> what the performance hit of the extra conditionals everywhere would be >> like. >> > > Another approach is to restore the bitmap after each piece set. It is a > bit slower of course but the bitmap will be consistent after > hbitmap_restore_data() > Yeah, that's not great either. The approach you have already taken is probably the best. >>> +/** >>> + * hbitmap_restore_finish >>> + * @hb: HBitmap to operate on. >>> + * >>> + * Repair HBitmap after calling hbitmap_restore_data. Actuall all >>> HBitmap >>> + * layers are restore here. >>> + */ >>> +void hbitmap_restore_finish(HBitmap *hb); >>> + >>> +/** >>> * hbitmap_free: >>> * @hb: HBitmap to operate on. >>> * >> >> These are just biased opinions: >> >> - It might be nice to name the store/restore functions "serialize" and >> "deserialize," to describe exactly what they are doing. >> >> - I might refer to "restore_finish" as "post_load" or "post_restore" >> or something similar to mimic how device migration functions are >> named. I think hbitmap_restore_data_finalize would also be fine; the >> key part here is clearly naming it relative to "restore_data." >> > > Hmm. Ok, what about the following set: > hbitmap_serialize() > hbitmap_deserialize_part() > hbitmap_deserialize_finish() > Looks good to me! >>> diff --git a/util/hbitmap.c b/util/hbitmap.c >>> index 8aa7406..ac0323f 100644 >>> --- a/util/hbitmap.c >>> +++ b/util/hbitmap.c >>> @@ -362,6 +362,90 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item) >>> return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] & >>> bit) != 0; >>> } >>> >>> +uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count) >>> +{ >>> + uint64_t size; >>> + >>> + if (count == 0) { >>> + return 0; >>> + } >>> + >>> + size = (((count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1; >>> + >>> + return size * sizeof(unsigned long); >>> +} >>> + >> >> This seems flawed to me: number of bits without an offset can't be >> mapped to a number of real bytes, because "two bits" may or may not >> cross a granularity boundary, depending on *WHERE* you start counting. >> >> e.g. >> >> granularity = 1 (i.e. 2^1 = 2 virtual bits per 1 real bit) >> virtual: 001100 >> real: 0 1 0 >> >> The amount of space required to hold "two bits" here could be as >> little as one bit, if the offset is k={0,2,4} but it could be as much >> as two bits if the offset is k={1,3}. >> >> You may never use the function in this way, but mapping virtual bits >> to an implementation byte-size seems like it is inviting an >> inconsistency. > I dislike this function too.. But unfortunately we need the size in > bytes used for serialization. > > Hmm. Ok, without loss of generality, let offset be less than > granularity. The precise formula should look like: > > size = (((offset+count-1) >> hb->granularity) >> BITS_PER_LEVEL); > > So, > size = ((((1 << hb->granularity) + count - 2) >> hb->granularity) >> > BITS_PER_LEVEL) + 1; > would be enough in any case. Ok? > I think so, as long as when you deserialize the object it does so correctly, regardless of whether or not you start on an even multiple of the granularity. > > >> >>> +void hbitmap_store_data(const HBitmap *hb, uint8_t *buf, >>> + uint64_t start, uint64_t count) >>> +{ >>> + uint64_t last = start + count - 1; >>> + unsigned long *out = (unsigned long *)buf; >>> + >>> + if (count == 0) { >>> + return; >>> + } >>> + >>> + start = (start >> hb->granularity) >> BITS_PER_LEVEL; >>> + last = (last >> hb->granularity) >> BITS_PER_LEVEL; >>> + count = last - start + 1; >>> + >>> +#ifdef __BIG_ENDIAN_BITFIELD >>> + for (i = start; i <= last; ++i) { >>> + unsigned long el = hb->levels[HBITMAP_LEVELS - 1][i]; >>> + out[i] = (BITS_PER_LONG == 32 ? cpu_to_le32(el) : >>> cpu_to_le64(el)); >>> + } >>> +#else >>> + memcpy(out, &hb->levels[HBITMAP_LEVELS - 1][start], >>> + count * sizeof(unsigned long)); >>> +#endif >>> +} >>> + >> >> I suppose the ifdefs are trying to take an optimization by using >> memcpy if at all possible, without a conversion. >> >> Why are the conversions to little endian, though? Shouldn't we be >> serializing to a Big Endian format? > As Paolo already explained it is for portability. LE bitmap > representation is independent of size of bitmap array elements, which > may be 32bit/64bit, or whatever else with LE format. Yes, that makes sense. :) >> >>> +void hbitmap_restore_data(HBitmap *hb, uint8_t *buf, >>> + uint64_t start, uint64_t count) >>> +{ >>> + uint64_t last = start + count - 1; >>> + unsigned long *in = (unsigned long *)buf; >>> + >>> + if (count == 0) { >>> + return; >>> + } >>> + >>> + start = (start >> hb->granularity) >> BITS_PER_LEVEL; >>> + last = (last >> hb->granularity) >> BITS_PER_LEVEL; >>> + count = last - start + 1; >>> + >>> +#ifdef __BIG_ENDIAN_BITFIELD >>> + for (i = start; i <= last; ++i) { >>> + hb->levels[HBITMAP_LEVELS - 1][i] = >>> + (BITS_PER_LONG == 32 ? be32_to_cpu(in[i]) : >>> be64_to_cpu(in[i])); >>> + } >>> +#else >>> + memcpy(&hb->levels[HBITMAP_LEVELS - 1][start], in, >>> + count * sizeof(unsigned long)); >>> +#endif >>> +} >>> + >> >> ...? We're storing as LE but restoring from BE? I'm confused. > Oh, it's a mistake. Thanks. I've missed it when testing because of > {#ifdef __BIG_ENDIAN_BITFIELD } branch was never compiled. Glad I am not crazy :) >> >> I'm also not clear on the __BIG_ENDIAN_BITFIELD macro. Why do we want >> to pack differently based on how C-bitfields are packed by the >> compiler? I don't think that has any impact on how longs are stored >> (always in the host native format.) >> >> I would rather see the serialize/deserialize always convert to and >> from BE explicitly with cpu_to_be[32|64] and be[32|64]_to_cpu. I think >> trying to optimize the pack/unpack in the no-op condition with a # >> define and a memcpy is inviting portability problems. > Ok. I'll leave only branch with for(...) in v2. You can correct me if I am mistaken, but I seem to recall determining endianness in a platform, compiler and libc independent way is very difficult during the pre-processor stage. If there _IS_ some way to do this, the optimization is fine, but I do not think it will work as written. The un-optimized version is the safest. >> >>> +void hbitmap_restore_finish(HBitmap *bitmap) >>> +{ >>> + int64_t i, size; >>> + int lev, j; >>> + >>> + /* restore levels starting from penultimate to zero level, assuming >>> + * that the last one is ok */ >>> + size = MAX((bitmap->size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, >>> 1); >>> + for (lev = HBITMAP_LEVELS - 1; lev-- > 0; ) { >>> + size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1); >>> + for (i = 0; i < size; ++i) { >>> + bitmap->levels[lev][i] = 0; >>> + for (j = 0; j < BITS_PER_LONG; ++j) { >>> + if (bitmap->levels[lev+1][(i << BITS_PER_LEVEL) + j]) { >>> + bitmap->levels[lev][i] |= (1 << j); >>> + } >>> + } >> >> Just a musing: Should we cache the size of each level? I know we can >> keep recalculating it, but... why bother? We recalculate it in so many >> places now. > Interesting point. > > However, there is another real bug: there are may be no longs on the > next level for last bits of the current one. The new version of this > function will be in v2. Great :) >> >>> + } >>> + } >>> +} >>> + >>> void hbitmap_free(HBitmap *hb) >>> { >>> unsigned i; >>> > Thank you! --js