git.vger.kernel.org archive mirror
From: Jeff King <peff@peff.net>
To: Lawrence Siebert <lawrencesiebert@gmail.com>
Cc: John Keeping <john@keeping.me.uk>,
	Johannes Schindelin <johannes.schindelin@gmx.de>,
	git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	tanoku@gmail.com
Subject: Re: [PATCH] --count feature for git shortlog
Date: Wed, 1 Jul 2015 07:50:37 -0400
Message-ID: <20150701115036.GA31158@peff.net>
In-Reply-To: <CAKDoJU42kDs3QXYjo7rJ-vLMJtUdv9AwttJLHnya+toG6cSatQ@mail.gmail.com>

On Tue, Jun 30, 2015 at 08:00:53PM -0700, Lawrence Siebert wrote:

> The following doesn't currently run:  `git rev-list --count
> --use-bitmap-index HEAD`
> 
> This is an optional parameter for rev-list from commit
> aa32939fea9c8934b41efce56015732fa12b8247 which can't currently be used
> because of changes in parameter parsing, but which modifies `--count`
> and which may be faster. I've gotten it working again, both by
> changing the current repo code to make it work, and also by building
> from that commit, and when I tested it on the whole repo, it seems
> like it's less variable in speed than `git rev-list --count HEAD`, but
> not necessarily consistently faster like tests suggested it was when
> it was committed. Obviously I'm not testing on the same system as the
> original committer, or with the same compiler, or even using the same
> version of the linux kernel repo, so those may be a factor.  It may
> also work better in a circumstance that I haven't accounted for, like
> an older repo, on a per file basis when getting per file commit counts
> for all files, or something like that.

Can you give more details?

In a copy of linux.git with bitmaps:

  $ git log -1 --oneline
  64fb1d0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

  $ ls -l .git/objects/pack/
  total 892792
  -r--r--r-- 1 peff peff  24498374 May 21 13:39 pack-182149ca37c3f2d8fa190e4add772ae08af0e9d2.bitmap
  -r--r--r-- 1 peff peff 115283036 May 21 13:39 pack-182149ca37c3f2d8fa190e4add772ae08af0e9d2.idx
  -r--r--r-- 1 peff peff 774420808 May 21 13:39 pack-182149ca37c3f2d8fa190e4add772ae08af0e9d2.pack

The packfiles were created with "git repack -adb". With the bitmap in
place, I see a big speedup for this exact operation:

  $ git version
  git version 2.5.0.rc0

  $ time git rev-list --count HEAD
  515406

  real    0m9.500s
  user    0m9.424s
  sys     0m0.092s

  $ time git rev-list --use-bitmap-index --count HEAD
  515406

  real    0m0.392s
  user    0m0.328s
  sys     0m0.064s
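
To check the two code paths against each other on another repository, a
minimal sketch along the following lines may help. The helper name
`compare_counts` is hypothetical (it is not part of git); it assumes only
`git rev-list --count` and `--use-bitmap-index` as used above, and it
assumes the repository has already been repacked with "git repack -adb"
if the bitmap path is actually meant to be exercised.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print the commit count from the
# regular traversal and from the bitmap-assisted one, and fail if they differ.
# Usage: compare_counts <repo-path> [<rev>]
compare_counts() {
    repo=$1
    rev=${2:-HEAD}

    # Regular revision walk.
    plain=$(git -C "$repo" rev-list --count "$rev") || return 1

    # Same count, asking rev-list to use the pack bitmap if one exists.
    bitmap=$(git -C "$repo" rev-list --use-bitmap-index --count "$rev") || return 1

    printf 'plain=%s bitmap=%s\n' "$plain" "$bitmap"
    [ "$plain" = "$bitmap" ]
}
```

Running `compare_counts /path/to/linux` and wrapping each call in `time`
should show whether the bitmap is being picked up on a given setup; the
two counts themselves are expected to match.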

Note that this would not work with, say:

  git rev-list --use-bitmap-index --count HEAD -- Makefile

as the bitmap index does not have enough information to do path limiting
(we should probably disallow this or fall back to the non-bitmap code
path, but right now we just ignore the path limiter).
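
Until something like that is done inside rev-list itself, a caller can do
the fallback by hand. The wrapper below is only a sketch: the name
`count_commits` is hypothetical, and it assumes the path limiter is always
introduced by an explicit `--` (a bare pathspec such as `HEAD Makefile`
would not be detected).

```shell
#!/bin/sh
# Hypothetical wrapper (not part of git): count commits, using the bitmap
# index only when no path limiter is given.
# Usage: count_commits <rev-list args...>
count_commits() {
    for arg in "$@"; do
        if [ "$arg" = "--" ]; then
            # Path limiting requested: bitmaps cannot answer this,
            # so use the regular traversal.
            git rev-list --count "$@"
            return
        fi
    done

    # No explicit path limiter: safe to ask for the bitmap-assisted count.
    git rev-list --use-bitmap-index --count "$@"
}
```

With this, `count_commits HEAD` takes the bitmap path while
`count_commits HEAD -- Makefile` takes the regular one.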

> I'm thinking I could submit a patch that makes it work again, and
> leave it to the user to decide whether to use it or not.   There is
> also a --test-bitmap option which compares the regular count with the
> bitmap count. I'm not sure if the implication there was regression
> testing or that --use-bitmap-index might give the wrong results in
> certain circumstances.  Vincent, could you clarify?

Yes, `--test-bitmap` is just for internal testing; you should always get
the same results.

-Peff
