From: Thomas Rast <trast@inf.ethz.ch>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Ramkumar Ramachandra" <artagnon@gmail.com>,
	"Alex Bennée" <kernel-hacker@bennee.com>,
	"Antoine Pelisse" <apelisse@gmail.com>,
	"John Keeping" <john@keeping.me.uk>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [PATCH 2/2] lookup_commit_reference_gently: do not read non-{tag,commit}
Date: Fri, 31 May 2013 10:08:06 +0200
Message-ID: <87sj138tcp.fsf@linux-k42r.v.cablecom.net>
In-Reply-To: <20130530212223.GA2135@sigill.intra.peff.net> (Jeff King's message of "Thu, 30 May 2013 17:22:23 -0400")

Jeff King <peff@peff.net> writes:

> On Thu, May 30, 2013 at 10:00:23PM +0200, Thomas Rast wrote:
>
>> lookup_commit_reference_gently unconditionally parses the object given
>> to it.  This slows down git-describe a lot if you have a repository
>> with large tagged blobs in it: parse_object() will read the entire
>> blob and verify that its sha1 matches, only to then throw it away.
>> 
>> Speed it up by checking the type with sha1_object_info() prior to
>> unpacking.
>
> This would speed up the case where we do not end up looking at the
> object at all, but it will slow down the (presumably common) case where
> we will in fact find a commit and end up parsing the object anyway.
>
> Have you measured the impact of this on normal operations? During a
> traversal, we spend a measurable amount of time looking up commits in
> packfiles, and this would presumably double it.

I don't think so, but admittedly I didn't measure it.

The reason why it's unlikely is that this is specific to
lookup_commit_reference_gently, which according to some grepping is
usually done on refs or values that refs might have; e.g. on the old and new
sides of a fetch in remote.c, or in many places in the callback of some
variant of for_each_ref.

Of course if you have a ridiculously large number of refs (and I gather
_you_ do), this will hurt somewhat in the usual case, but speed up the
case where there is a ref (usually a lightweight tag) directly pointing
at a large blob.

I'm not sure this can be fixed without the change you outline here:

> This is not the first time I have seen this tradeoff in git.  It would
> be nice if our object access was structured to do incremental
> examination of the objects (i.e., store the packfile index lookup or
> partial unpack of a loose object header, and then use that to complete
> the next step of actually getting the contents).

But in any case I see the point; I should try to gather some
performance numbers.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
