All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Rast <trast@inf.ethz.ch>
To: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
	"Alex Bennée" <kernel-hacker@bennee.com>,
	"Antoine Pelisse" <apelisse@gmail.com>,
	"John Keeping" <john@keeping.me.uk>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [PATCH 2/2] lookup_commit_reference_gently: do not read non-{tag,commit}
Date: Fri, 31 May 2013 10:16:44 +0200	[thread overview]
Message-ID: <87fvx37edv.fsf@linux-k42r.v.cablecom.net> (raw)
In-Reply-To: <CALkWK0nL4hPio74Hm+ctObNNFg9+=9brKXFK2ymGB=sPTAk1Hg@mail.gmail.com> (Ramkumar Ramachandra's message of "Fri, 31 May 2013 12:13:00 +0530")

Ramkumar Ramachandra <artagnon@gmail.com> writes:

> Thomas Rast wrote:
>> +       struct object *obj;
>> +       int type = sha1_object_info(sha1, NULL);
>> +       /* If it's neither tag nor commit, parsing the object is wasted effort */
>> +       if (type != OBJ_TAG && type != OBJ_COMMIT)
>> +               return NULL;
>> +       obj = deref_tag(parse_object(sha1), NULL, 0);
[...]
> In contrast, parse_object() first calls lookup_object() to look it up
> in some hashtable to get the type -- the packfile idx, presumably?
> Why don't you also do that instead of sha1_object_info()?  Or, why
> don't you wrap parse_object() in an API that doesn't go beyond the
> first blob check (and not execute parse_object_buffer())?
>
> Also, does this patch fix the bug Alex reported?
>
> Apologies if I've misunderstood something horribly (which does seem to
> be the case).

Yes, it does fix the bug.  (It's not really buggy, just slow.)

However, you implicitly point out an important point: If we have the
object, and it was already parsed (obj->parsed is set), parse_object()
is essentially free.  But sha1_object_info is not, it will in particular
unconditionally dig through long delta chains to discover the base type
of an object that has already been unpacked.


As for your original questions: lookup_object() is "do we have it in our
big object hashtable?" -- the one that holds many[1] objects, that Peff
recently sped up.

sha1_object_info() and read_object() are in many ways parallel functions
that do approximately the following:

  check all pack indexes for this object
  if we found a hit:
    attempt to unpack by recursively going through deltas
    (for _info, no need to unpack, but we still go through the delta
    chain because the type of object is determined by the innermost
    delta base)
  try to load it as a loose object
  it could have been repacked and pruned while we were looking, so:
    reload pack index information
    try the packs again (search indexes, then unpack)
  complain


[1]  blobs in particular are frequently not stored in that hash table,
because it is an insert-only table

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

  reply	other threads:[~2013-05-31  8:16 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-30 10:38 Poor performance of git describe in big repos Alex Bennée
2013-05-30 11:33 ` Ramkumar Ramachandra
2013-05-30 13:09   ` Alex Bennée
2013-05-30 14:32     ` Ramkumar Ramachandra
2013-05-30 15:01       ` Alex Bennée
2013-05-30 15:17         ` Ramkumar Ramachandra
2013-05-30 15:33     ` Thomas Rast
2013-05-30 16:01       ` Alex Bennée
2013-05-30 16:21         ` Thomas Rast
2013-05-30 16:44           ` Thomas Rast
2013-05-30 19:01             ` Antoine Pelisse
2013-05-30 20:00             ` [PATCH 1/2] sha1_file: silence sha1_loose_object_info Thomas Rast
2013-05-30 20:00               ` [PATCH 2/2] lookup_commit_reference_gently: do not read non-{tag,commit} Thomas Rast
2013-05-30 21:22                 ` Jeff King
2013-05-31  0:52                   ` Duy Nguyen
2013-05-31  8:08                   ` Thomas Rast
2013-05-31 16:00                     ` Jeff King
2013-05-31  6:43                 ` Ramkumar Ramachandra
2013-05-31  8:16                   ` Thomas Rast [this message]
2013-05-30 19:30           ` Poor performance of git describe in big repos John Keeping
2013-05-31  8:14             ` Alex Bennée
2013-05-31  8:24               ` Thomas Rast
2013-05-31  8:40                 ` Alex Bennée
2013-05-31  8:46                   ` Thomas Rast
2013-05-31  9:57                     ` Alex Bennée
2013-06-03  8:02                       ` Alex Bennée
2013-06-03 16:32                         ` Junio C Hamano
2013-06-03 17:48                           ` Junio C Hamano
2013-05-31 10:27                     ` Thomas Rast
2013-05-31 16:17                       ` Jeff King
2013-06-03  8:39                         ` Alex Bennée
2013-06-03 14:49                           ` Jeff King
2013-05-31  8:32               ` John Keeping
2013-05-31  8:49                 ` Alex Bennée
2013-05-31  8:59                   ` John Keeping
2013-05-30 11:48 ` John Keeping
2013-05-30 12:29   ` Alex Bennée
2013-05-30 13:20     ` Duy Nguyen
     [not found]       ` <CAJ-05NPacjAEC99Ntd9eMnTD9_PMMYFob-_tAx5CeSB79TkRSg@mail.gmail.com>
2013-05-30 13:45         ` Duy Nguyen
2013-05-30 14:02           ` Alex Bennée
2013-05-30 13:16   ` Alex Bennée

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fvx37edv.fsf@linux-k42r.v.cablecom.net \
    --to=trast@inf.ethz.ch \
    --cc=apelisse@gmail.com \
    --cc=artagnon@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=john@keeping.me.uk \
    --cc=kernel-hacker@bennee.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.