From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Rast
Subject: Re: [PATCH v3 10/21] pack-bitmap: add support for bitmap indexes
Date: Fri, 29 Nov 2013 22:21:04 +0100
Message-ID: <87siuedhvj.fsf@thomasrast.ch>
References: <20131114124157.GA23784@sigill.intra.peff.net>
<20131114124432.GJ10757@sigill.intra.peff.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: git@vger.kernel.org,
Vicent =?utf-8?Q?Mart=C3=AD?=
To: Jeff King
X-From: git-owner@vger.kernel.org Fri Nov 29 22:22:43 2013
Return-path:
Envelope-to: gcvg-git-2@plane.gmane.org
Received: from vger.kernel.org ([209.132.180.67])
by plane.gmane.org with esmtp (Exim 4.69)
(envelope-from )
id 1VmVWN-0001y9-53
for gcvg-git-2@plane.gmane.org; Fri, 29 Nov 2013 22:22:43 +0100
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
id S1754241Ab3K2VWj convert rfc822-to-quoted-printable (ORCPT
); Fri, 29 Nov 2013 16:22:39 -0500
Received: from psi.thgersdorf.net ([176.9.98.78]:36208 "EHLO mail.psioc.net"
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
id S1752606Ab3K2VWi convert rfc822-to-8bit (ORCPT
); Fri, 29 Nov 2013 16:22:38 -0500
Received: from localhost (localhost [127.0.0.1])
by localhost.psioc.net (Postfix) with ESMTP id 95A854D64E4;
Fri, 29 Nov 2013 22:22:33 +0100 (CET)
X-Virus-Scanned: amavisd-new at psioc.net
Received: from mail.psioc.net ([127.0.0.1])
by localhost (mail.psioc.net [127.0.0.1]) (amavisd-new, port 10024)
with LMTP id Y5OeIeN2NZah; Fri, 29 Nov 2013 22:22:23 +0100 (CET)
Received: from linux-1gf2.thomasrast.ch (unknown [89.204.130.229])
(using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits))
(Client did not present a certificate)
by mail.psioc.net (Postfix) with ESMTPSA id 6CE0E4D64DE;
Fri, 29 Nov 2013 22:22:22 +0100 (CET)
In-Reply-To: <20131114124432.GJ10757@sigill.intra.peff.net> (Jeff King's
message of "Thu, 14 Nov 2013 07:44:32 -0500")
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)
Sender: git-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
Archived-At:
TLDR: nitpicks. Thanks for a very nice read.
I do think it's worth fixing the syntax pedantry at the end so that we
can keep supporting arcane compilers, but otherwise, meh.
> +static int open_pack_bitmap_1(struct packed_git *packfile)
This goes somewhat against the naming convention (if you can call it
that) used elsewhere in git. Usually foo_1() is an implementation
detail of foo(), used because it is convenient to wrap the main part in
another function, e.g. so that it can consistently free resources or
some such. But this one operates on one pack file, so in the terms of
the rest of git, it should probably be called open_pack_bitmap_one().
> +static void show_object(struct object *object, const struct name_pat=
h *path,
> + const char *last, void *data)
> +{
> + struct bitmap *base =3D data;
> + int bitmap_pos;
> +
> + bitmap_pos =3D bitmap_position(object->sha1);
> +
> + if (bitmap_pos < 0) {
> + char *name =3D path_name(path, last);
> + bitmap_pos =3D ext_index_add_object(object, name);
> + free(name);
> + }
> +
> + bitmap_set(base, bitmap_pos);
> +}
> +
> +static void show_commit(struct commit *commit, void *data)
> +{
> +}
A bit unfortunate that you inherit the strange show_* naming from
builtin/pack-objects.c, which seems to have stolen some code from
builtin/rev-list.c at some point without worrying about better naming..=
=2E
> +static void show_objects_for_type(
> + struct bitmap *objects,
> + struct ewah_bitmap *type_filter,
> + enum object_type object_type,
> + show_reachable_fn show_reach)
> +{
[...]
> + while (i < objects->word_alloc && ewah_iterator_next(&filter, &it))=
{
> + eword_t word =3D objects->words[i] & filter;
> +
> + for (offset =3D 0; offset < BITS_IN_WORD; ++offset) {
> + const unsigned char *sha1;
> + struct revindex_entry *entry;
> + uint32_t hash =3D 0;
> +
> + if ((word >> offset) =3D=3D 0)
> + break;
> +
> + offset +=3D ewah_bit_ctz64(word >> offset);
> +
> + if (pos + offset < bitmap_git.reuse_objects)
> + continue;
> +
> + entry =3D &bitmap_git.reverse_index->revindex[pos + offset];
> + sha1 =3D nth_packed_object_sha1(bitmap_git.pack, entry->nr);
> +
> + show_reach(sha1, object_type, 0, hash, bitmap_git.pack, entry->of=
fset);
> + }
You have a very nice bitmap_each_bit() function in ewah/bitmap.c, why
not use it here?
> +int reuse_partial_packfile_from_bitmap(struct packed_git **packfile,
> + uint32_t *entries,
> + off_t *up_to)
> +{
> + /*
> + * Reuse the packfile content if we need more than
> + * 90% of its objects
> + */
> + static const double REUSE_PERCENT =3D 0.9;
Curious: is this based on some measurements or just a guess?
> diff --git a/pack-bitmap.h b/pack-bitmap.h
[...]
> +static const char BITMAP_IDX_SIGNATURE[] =3D {'B', 'I', 'T', 'M'};;
There's a stray ; at the end of the line that is technically not
permitted:
pack-bitmap.h:22:65: warning: ISO C does not allow extra =E2=80=98;=E2=80=
=99 outside of a function [-Wpedantic]
> +enum pack_bitmap_opts {
> + BITMAP_OPT_FULL_DAG =3D 1,
And I think this trailing comma on the last enum item is also strictly
speaking not allowed, even though it is very nice to have:
pack-bitmap.h:28:27: warning: comma at end of enumerator list [-Wpedant=
ic]
--=20
Thomas Rast
tr@thomasrast.ch