git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Thomas Rast <tr@thomasrast.ch>
Cc: git@vger.kernel.org, "Vicent Martí" <vicent@github.com>
Subject: Re: [PATCH v3 10/21] pack-bitmap: add support for bitmap indexes
Date: Mon, 2 Dec 2013 11:12:08 -0500
Message-ID: <20131202161208.GB24202@sigill.intra.peff.net>
In-Reply-To: <87siuedhvj.fsf@thomasrast.ch>

On Fri, Nov 29, 2013 at 10:21:04PM +0100, Thomas Rast wrote:

> I do think it's worth fixing the syntax pedantry at the end so that we
> can keep supporting arcane compilers, but otherwise, meh.

Agreed. I've picked up those changes in my tree.

> > +static int open_pack_bitmap_1(struct packed_git *packfile)
> 
> This goes somewhat against the naming convention (if you can call it
> that) used elsewhere in git.  Usually foo_1() is an implementation
> detail of foo(), used because it is convenient to wrap the main part in
> another function, e.g. so that it can consistently free resources or
> some such.  But this one operates on one pack file, so in the terms of
> the rest of git, it should probably be called open_pack_bitmap_one().

Hmm. I see your point, but I think that my (and Vicent's) mental model
was that it _was_ a helper for open_pack_bitmap. It just happens to also
fill the role of open_pack_bitmap_one(), but you would not want the
latter. We only support a single bitmap at a time; by calling the
helper, you would miss out on the assert which would catch the error.

So I don't care much, but I have a slight preference to leave it, as it
signals "you should not be calling this directly" more clearly.
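To make the distinction concrete, here is a minimal sketch of the
wrapper/helper split. It is not the code from the patch: struct pack,
the "loaded" pointer, and the open_bitmap names are all hypothetical
stand-ins, and the only property carried over is that the assert lives
in the wrapper, so a caller that goes straight to the _1 helper skips it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pack; not git's struct packed_git. */
struct pack {
	const char *name;
	int has_bitmap;
};

/* At most one bitmap is supported at a time. */
static const struct pack *loaded;

/*
 * Helper for open_bitmap(): try a single pack. It does not check the
 * "nothing loaded yet" precondition, so it should not be called directly.
 */
static int open_bitmap_1(const struct pack *p)
{
	if (!p->has_bitmap)
		return -1;
	if (loaded)
		return -1; /* a bitmap is already in use; ignore this one */
	loaded = p;
	return 0;
}

/* Entry point: check the precondition once, then try every pack. */
static int open_bitmap(const struct pack *packs, size_t nr)
{
	size_t i;
	int ret = -1;

	assert(!loaded);
	for (i = 0; i < nr; i++)
		if (!open_bitmap_1(&packs[i]))
			ret = 0;
	return ret;
}
```

Calling open_bitmap() a second time trips the assert; calling
open_bitmap_1() a second time just returns -1, which is the silent
failure the naming is meant to steer callers away from.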

> A bit unfortunate that you inherit the strange show_* naming from
> builtin/pack-objects.c, which seems to have stolen some code from
> builtin/rev-list.c at some point without worrying about better naming...

Yes, I agree they're not very descriptive. Let's leave it for now to
stay consistent with pack-objects, and I'd be happy to see a patch
giving all of them better names come later.

> > +	while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) {
> > +		eword_t word = objects->words[i] & filter;
> > +
> > +		for (offset = 0; offset < BITS_IN_WORD; ++offset) {
> > +			const unsigned char *sha1;
> > +			struct revindex_entry *entry;
> > +			uint32_t hash = 0;
> > +
> > +			if ((word >> offset) == 0)
> > +				break;
> > +
> > +			offset += ewah_bit_ctz64(word >> offset);
> > +
> > +			if (pos + offset < bitmap_git.reuse_objects)
> > +				continue;
> > +
> > +			entry = &bitmap_git.reverse_index->revindex[pos + offset];
> > +			sha1 = nth_packed_object_sha1(bitmap_git.pack, entry->nr);
> > +
> > +			show_reach(sha1, object_type, 0, hash, bitmap_git.pack, entry->offset);
> > +		}
> 
> You have a very nice bitmap_each_bit() function in ewah/bitmap.c, why
> not use it here?

We are bitwise-ANDing against an on-disk ewah bitmap to filter out
objects which do not match the desired type. bitmap_each_bit would make
this more complicated, because we wouldn't be able to move the
ewah_iterator in single-word lockstep. And it would probably be slower
(if you did it naively), because we'd end up checking each bit in the
ewah, rather than AND-ing whole words.

The right, reusable way to do it would probably be to bitmap_and_ewah
the original and the filter together, and then bitmap_each_bit the
result. But you would have to write bitmap_and_ewah first. :)
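For illustration, the word-at-a-time idea looks roughly like the sketch
below. It makes a big simplifying assumption: both operands are plain
uncompressed arrays of 64-bit words, whereas the real filter is an
on-disk ewah that has to be walked with an iterator. The names
and_set_bits and ctz64 are made up for this example and are not part of
ewah/bitmap.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_IN_WORD 64

/* Count trailing zero bits; the word must be non-zero. */
static int ctz64(uint64_t word)
{
	int n = 0;

	assert(word != 0);
	while (!(word & 1)) {
		word >>= 1;
		n++;
	}
	return n;
}

/*
 * AND two bitmaps one whole word at a time and record the position of
 * every bit set in both. Returns the number of positions written to
 * "out" (at most out_alloc).
 */
static size_t and_set_bits(const uint64_t *objects, const uint64_t *filter,
			   size_t nr_words, size_t *out, size_t out_alloc)
{
	size_t i, nr = 0;

	for (i = 0; i < nr_words; i++) {
		/* One AND filters 64 candidate bits at once. */
		uint64_t word = objects[i] & filter[i];

		/* Only the surviving bits are visited individually. */
		while (word && nr < out_alloc) {
			out[nr++] = i * BITS_IN_WORD + ctz64(word);
			word &= word - 1; /* clear the lowest set bit */
		}
	}
	return nr;
}
```

The per-bit work is proportional to the number of bits that survive the
AND, not to the size of either input, which is the property a naive
bit-by-bit walk over the filter would lose.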

> > +	/*
> > +	 * Reuse the packfile content if we need more than
> > +	 * 90% of its objects
> > +	 */
> > +	static const double REUSE_PERCENT = 0.9;
> 
> Curious: is this based on some measurements or just a guess?

I think it's mostly a guess.

> > +enum pack_bitmap_opts {
> > +	BITMAP_OPT_FULL_DAG = 1,
> 
> And I think this trailing comma on the last enum item is also strictly
> speaking not allowed, even though it is very nice to have:
> 
> pack-bitmap.h:28:27: warning: comma at end of enumerator list [-Wpedantic]

It's allowed in C99, but was not in C89.  I've fixed this site for
consistency with the rest of git. But I wonder how relevant it still is.
The only data points I know of are:

  http://article.gmane.org/gmane.comp.version-control.git/145739

and

  http://article.gmane.org/gmane.comp.version-control.git/145739

It sounds like an ancient IBM VisualAge is the only reported problem.
And according to IBM, they stopped supporting it 10 years ago (well,
technically we have a few more weeks to hit the 10-year mark):

  http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=an&subtype=ca&supplier=897&appname=IBMLinkRedirect&letternum=ENUS903-227

I do wonder if at some point we should revisit our "do not use any
C99-isms" philosophy. It was very good advice in 2005. I don't know how
good it is over 8 years later (it seems like even ancient systems should
be able to get gcc compiled as a last resort, but maybe there really are
people for whom that is a burden).
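For reference, the trailing-comma difference discussed above comes down
to the following. The enum and enumerator names here are hypothetical,
chosen only for the example; they are not the ones in pack-bitmap.h.

```c
/*
 * Accepted by C89 and C99 alike: no comma after the last enumerator.
 * This is the form used elsewhere in git.
 */
enum example_opts_c89 {
	EXAMPLE_C89_A = 1,
	EXAMPLE_C89_B = 2
};

/*
 * Valid in C99 and later only. A compiler in pedantic C89 mode reports
 * "comma at end of enumerator list" for the final comma.
 */
enum example_opts_c99 {
	EXAMPLE_C99_A = 1,
	EXAMPLE_C99_B = 2,
};
```

The C99 form keeps diffs smaller when an enumerator is appended, which
is the "very nice to have" part.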

-Peff
