From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] pack-objects: do not reuse packfiles without --delta-base-offset
Date: Wed, 02 Apr 2014 10:39:13 -0700
Message-ID: <xmqqfvlvvdfi.fsf@gitster.dls.corp.google.com>
In-Reply-To: <20140402063916.GA1437@sigill.intra.peff.net> (Jeff King's message of "Wed, 2 Apr 2014 02:39:17 -0400")

Jeff King <peff@peff.net> writes:

> When we are sending a packfile to a remote, we currently try
> to reuse a whole chunk of packfile without bothering to look
> at the individual objects. This can make things like initial
> clones much lighter on the server, as we can just dump the
> packfile bytes.
>
> However, it's possible that the other side cannot read our
> packfile verbatim. For example, we may have objects stored
> as OFS_DELTA, but the client is an antique version of git
> that only understands REF_DELTA. We negotiate this
> capability over the fetch protocol. A normal pack-objects
> run will convert OFS_DELTA into REF_DELTA on the fly, but
> the "reuse pack" code path never even looks at the objects.

The above makes it sound like the "reuse pack" codepath is broken. Is
it too much hassle to peek at the initial bytes of each object to see
how they are encoded? Would it be possible to convert OFS_DELTA to
REF_DELTA on the fly in that codepath as well, instead of disabling
the reuse altogether?
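
To make the question concrete, here is a rough sketch of what "peeking
at the initial bytes" of a pack entry would involve. It is not code from
git.git; the helper names peek_entry_header() and
decode_ofs_delta_base() are made up for illustration, and the layout is
the one described in the pack format documentation (3-bit type and
variable-length size in the entry header, then a variable-length
backward offset for OFS_DELTA entries). Overflow checks on the offset
are left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Object type codes as stored in a packfile entry header. */
enum pack_obj_type {
	OBJ_COMMIT = 1,
	OBJ_TREE = 2,
	OBJ_BLOB = 3,
	OBJ_TAG = 4,
	OBJ_OFS_DELTA = 6,
	OBJ_REF_DELTA = 7
};

/*
 * Hypothetical helper: parse the variable-length entry header at buf.
 * The first byte holds a continuation bit, a 3-bit type and the low
 * 4 bits of the inflated size; later bytes add 7 size bits each.
 * Returns the number of header bytes consumed, or 0 on bad input.
 */
static size_t peek_entry_header(const unsigned char *buf, size_t len,
				int *type, uint64_t *size)
{
	size_t used = 0;
	unsigned shift = 4;
	unsigned char c;

	if (!len)
		return 0;
	c = buf[used++];
	*type = (c >> 4) & 7;
	*size = c & 15;
	while (c & 0x80) {
		if (used >= len || shift >= 64)
			return 0;
		c = buf[used++];
		*size |= (uint64_t)(c & 0x7f) << shift;
		shift += 7;
	}
	return used;
}

/*
 * Hypothetical helper: decode the backward offset that follows the
 * entry header of an OBJ_OFS_DELTA entry. Each continuation step adds
 * one before shifting, so that multi-byte encodings do not overlap
 * the values already representable with fewer bytes.
 * Returns the number of bytes consumed, or 0 if the input is truncated.
 */
static size_t decode_ofs_delta_base(const unsigned char *buf, size_t len,
				    uint64_t *ofs)
{
	size_t used = 0;
	unsigned char c;

	if (!len)
		return 0;
	c = buf[used++];
	*ofs = c & 0x7f;
	while (c & 0x80) {
		if (used >= len)
			return 0;
		c = buf[used++];
		*ofs = ((*ofs + 1) << 7) + (c & 0x7f);
	}
	return used;
}
```

A converting reuse path would still have to map "entry offset minus
decoded offset" back to the object name of the base and write that out
after a REF_DELTA header, which is the per-object work the question is
asking about.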

> This patch disables packfile reuse if the other side is
> missing any capabilities that we might have used in the
> on-disk pack. Right now the only one is OFS_DELTA, but we
> may need to expand in the future (e.g., if packv4 introduces
> new object types).
>
> We could be more thorough and only disable reuse in this
> case when we actually have an OFS_DELTA to send, but:
>
>   1. We almost always will have one, since we prefer
>      OFS_DELTA to REF_DELTA when possible. So this case
>      would almost never come up.
>
>   2. Looking through the objects defeats the purpose of the
>      optimization, which is to do as little work as possible
>      to get the bytes to the remote.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I happened to be fooling around with git v1.4.0 today, and noticed a
> problem fetching from GitHub. Pre-OFS_DELTA git versions are ancient by
> today's standard, but it's quite easy to remain compatible here, so I
> don't see why not. And in theory, alternate implementations might not
> understand OFS_DELTA, though in practice I would consider such an
> implementation to be pretty crappy.
>
>  builtin/pack-objects.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 7950c43..1503632 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -2439,12 +2439,23 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
>  	}
>  }
>  
> +/*
> + * This tracks any options which a reader of the pack might
> + * not understand, and which would therefore prevent blind reuse
> + * of what we have on disk.
> + */
> +static int pack_options_allow_reuse(void)
> +{
> +	return allow_ofs_delta;
> +}
> +
>  static int get_object_list_from_bitmap(struct rev_info *revs)
>  {
>  	if (prepare_bitmap_walk(revs) < 0)
>  		return -1;
>  
> -	if (!reuse_partial_packfile_from_bitmap(
> +	if (pack_options_allow_reuse() &&
> +	    !reuse_partial_packfile_from_bitmap(
>  			&reuse_packfile,
>  			&reuse_packfile_objects,
>  			&reuse_packfile_offset)) {

