Re: [PATCH 1/2] upload-pack: avoid parsing objects during ref advertisement

From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, git-dev@github.com
Subject: Re: [PATCH 1/2] upload-pack: avoid parsing objects during ref advertisement
Date: Thu, 24 Jan 2013 02:50:08 -0500
Message-ID: <20130124075008.GA3249@sigill.intra.peff.net>
In-Reply-To: <7vehhiozkb.fsf@alter.siamese.dyndns.org>

On Fri, Jan 18, 2013 at 03:12:52PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > When we advertise a ref, the first thing we do is parse the
> > pointed-to object. This gives us two things:
> >
> >   1. a "struct object" we can use to store flags
> >
> >   2. the type of the object, so we know whether we need to
> >      dereference it as a tag
> >
> > Instead, we can just use lookup_unknown_object to get an
> > object struct, and then fill in just the type field using
> > sha1_object_info (which, in the case of packed files, can
> > find the information without actually inflating the object
> > data).
> >
> > This can save time if you have a large number of refs, and
> > the client isn't actually going to request those refs (e.g.,
> > because most of them are already up-to-date).
> 
> This is an old news, but do you recall why this patch did not update
> the code in mark_our_ref() that gets "struct object *o" from parse_object()
> the same way and mark them with OUR_REF flag?
> 
> I was wondering about code consolidation like this.

It was just because I did my measuring on raw upload-pack, so I didn't
notice that mark_our_ref was doing the same potentially slow thing. We
only call mark_our_ref during the second half of the stateless-rpc
conversation, and I did not measure that (and it would be a pain to do
so in isolation).

But it should be able to get the exact same speedups that we get from
send_ref. It probably matters less in the long run, because the
advertising phase is going to be called more frequently (e.g., for every
no-op fetch), and once we are calling mark_our_ref, we are presumably
about to do actual packing work. However, there's no reason not to
get what speed we can there, too.

> diff --git a/upload-pack.c b/upload-pack.c
> index 95d8313..609cd6c 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -722,15 +722,18 @@ static void receive_needs(void)
>  	free(shallows.objects);
>  }
>  
> +static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data);
> +
>  static int send_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
>  {
>  	static const char *capabilities = "multi_ack thin-pack side-band"
>  		" side-band-64k ofs-delta shallow no-progress"
>  		" include-tag multi_ack_detailed";
> -	struct object *o = lookup_unknown_object(sha1);
>  	const char *refname_nons = strip_namespace(refname);
>  	unsigned char peeled[20];
>  
> +	mark_our_ref(refname, sha1, flag, cb_data);
> +
>  	if (capabilities)
>  		packet_write(1, "%s %s%c%s%s agent=%s\n",
>  			     sha1_to_hex(sha1), refname_nons,
> @@ -740,10 +743,6 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
>  	else
>  		packet_write(1, "%s %s\n", sha1_to_hex(sha1), refname_nons);
>  	capabilities = NULL;
> -	if (!(o->flags & OUR_REF)) {
> -		o->flags |= OUR_REF;
> -		nr_our_refs++;
> -	}
>  	if (!peel_ref(refname, peeled))
>  		packet_write(1, "%s %s^{}\n", sha1_to_hex(peeled), refname_nons);
>  	return 0;

Right, I think this is a nice cleanup.

> @@ -751,7 +750,7 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
>  
>  static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag, void *cb_data)
>  {
> -	struct object *o = parse_object(sha1);
> +	struct object *o = parse_object(sha1); /* lookup-unknown??? */
>  	if (!o)
>  		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));
>  	if (!(o->flags & OUR_REF)) {

And yeah, this should use lookup_unknown_object. That extends the
optimization to mark_our_ref, and since send_ref now calls mark_our_ref,
it also avoids losing the optimization in the ref-advertisement case.
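
To make the idea concrete, here is a rough, self-contained sketch. It is
not the actual patch: the toy_* names, the fixed-size table, and the
nr_expensive_parses counter are made up for illustration and merely stand
in for git's object hash and for parse_object/lookup_unknown_object. Only
the OUR_REF flag and the nr_our_refs counter come from the code quoted
above. It just shows that flagging a ref needs the object struct, not the
object's contents.

```c
#include <string.h>

#define OUR_REF (1u << 0)
#define TOY_MAX_OBJECTS 64

/* Toy stand-in for git's "struct object"; only the fields needed here. */
struct toy_object {
	unsigned char sha1[20];
	unsigned flags;
	int parsed;	/* set only by the expensive path */
	int in_use;
};

static struct toy_object toy_table[TOY_MAX_OBJECTS];
static int nr_our_refs;
static int nr_expensive_parses;	/* hypothetical counter, for illustration */

/*
 * Cheap path: find or allocate the struct for this sha1 without looking
 * at the object's contents. Returns NULL only if the toy table is full.
 */
static struct toy_object *toy_lookup_unknown_object(const unsigned char *sha1)
{
	int i;

	for (i = 0; i < TOY_MAX_OBJECTS; i++)
		if (toy_table[i].in_use && !memcmp(toy_table[i].sha1, sha1, 20))
			return &toy_table[i];
	for (i = 0; i < TOY_MAX_OBJECTS; i++) {
		if (!toy_table[i].in_use) {
			memcpy(toy_table[i].sha1, sha1, 20);
			toy_table[i].flags = 0;
			toy_table[i].parsed = 0;
			toy_table[i].in_use = 1;
			return &toy_table[i];
		}
	}
	return NULL;
}

/* Expensive path: pretend we read and inflated the object data. */
static struct toy_object *toy_parse_object(const unsigned char *sha1)
{
	struct toy_object *o = toy_lookup_unknown_object(sha1);

	if (o && !o->parsed) {
		o->parsed = 1;
		nr_expensive_parses++;
	}
	return o;
}

/* Mark a ref tip as ours using only the cheap lookup. */
static int toy_mark_our_ref(const unsigned char *sha1)
{
	struct toy_object *o = toy_lookup_unknown_object(sha1);

	if (!o)
		return -1;
	if (!(o->flags & OUR_REF)) {
		o->flags |= OUR_REF;
		nr_our_refs++;
	}
	return 0;
}
```

Marking the same sha1 twice only bumps nr_our_refs once, and
nr_expensive_parses stays at zero until something actually calls
toy_parse_object. Note that the toy says nothing about how the real code
should report a ref pointing at a missing object; that would still need
thought.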

-Peff
