All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: git@jeffhostetler.com
Cc: git@vger.kernel.org, peff@peff.net,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH v3] unpack-trees: avoid duplicate ODB lookups during checkout
Date: Thu, 13 Apr 2017 16:11:31 -0700	[thread overview]
Message-ID: <xmqqtw5roqcc.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <20170413155852.4281-2-git@jeffhostetler.com> (git@jeffhostetler.com's message of "Thu, 13 Apr 2017 15:58:52 +0000")

git@jeffhostetler.com writes:

> +	/*
> +	 * Fetch the tree from the ODB for each peer directory in the
> +	 * n commits.
> +	 *
> +	 * For 2- and 3-way traversals, we try to avoid hitting the
> +	 * ODB twice for the same OID.  This should yield a nice speed
> +	 * up in checkouts and merges when the commits are similar.
> +	 *
> +	 * We don't bother doing the full O(n^2) search for larger n,
> +	 * because wider traversals don't happen that often and we
> +	 * avoid the search setup.
> +	 *
> +	 * When 2 peer OIDs are the same, we just copy the tree
> +	 * descriptor data.  This implicitly borrows the buffer
> +	 * data from the earlier cell.

This "borrowing" made me worried, but it turns out that this is
perfectly OK.

fill_tree_descriptor() uses read_sha1_file() to give a tree_desc its
own copy of the tree object data, so the code that calls into the
tree traversal API is responsible for freeing the buffer returned
from fill_tree_descriptor().  The updated code avoids double freeing
by setting buf[i] to NULL for borrowing [i].

> +	 */
>  	for (i = 0; i < n; i++, dirmask >>= 1) {
> -		const unsigned char *sha1 = NULL;
> -		if (dirmask & 1)
> -			sha1 = names[i].oid->hash;
> -		buf[i] = fill_tree_descriptor(t+i, sha1);
> +		if (i > 0 && are_same_oid(&names[i], &names[i - 1])) {
> +			t[i] = t[i - 1];
> +			buf[i] = NULL;
> +		} else if (i > 1 && are_same_oid(&names[i], &names[i - 2])) {
> +			t[i] = t[i - 2];
> +			buf[i] = NULL;
> +		} else {
> +			const unsigned char *sha1 = NULL;
> +			if (dirmask & 1)
> +				sha1 = names[i].oid->hash;
> +			buf[i] = fill_tree_descriptor(t+i, sha1);
> +		}
>  	}
>  
>  	bottom = switch_cache_bottom(&newinfo);

  reply	other threads:[~2017-04-13 23:11 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-13 15:58 [PATCH v3] unpack-trees: avoid duplicate ODB lookups during checkout git
2017-04-13 15:58 ` git
2017-04-13 23:11   ` Junio C Hamano [this message]
2017-04-14  0:59     ` Jeff King
2017-04-14 19:32       ` Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqtw5roqcc.fsf@gitster.mtv.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=jeffhost@microsoft.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.