git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Josef Wolf <jw@raven.inka.de>
To: git@vger.kernel.org
Subject: Re: Re-Transmission of blobs?
Date: Tue, 24 Sep 2013 22:36:51 +0200	[thread overview]
Message-ID: <20130924203651.GH14259@raven.wolf.lan> (raw)
In-Reply-To: <20130924073613.GC7257@sigill.intra.peff.net>

On Tue, Sep 24, 2013 at 03:36:13AM -0400, Jeff King wrote:
> On Fri, Sep 20, 2013 at 11:27:15AM +0200, Josef Wolf wrote:

> > Even without asking, we can assume with great probability that
> > origin/somebranch is available at origin.
> Bear in mind that the transfer process does not know about
> cherry-picking at all.

It dosn't need to know.

> It only sees the other side's tips and traverses.

The sender side knows with high probability that origin/somebranch is avalable
at the receivig side (unless it was deleted). And since the file in question
is part of the tree at the tip of origin/somebranch, we can deduce that the
file is available on the other side (unless it was deleted).

> > And the file in question happens to reside in the tree at the very tip
> > of origin/somebranch, not in some of its ancestors. In this case,
> > there's no need to search the history at all. And even in this pretty
> > simple case, the algorithm seems to fail for some reason.
> 
> Correct. And in the current code, we should be looking at the tip tree
> for your case.  However, the usual reason to do so is to mark those
> objects as a "preferred base" in pack-objects for doing deltas. I wonder
> if we are not correctly noticing the case that an object is both
> requested to be sent and marked as a preferred base (in which case we
> should drop it from our sending list).

Further, it seems that the marking as preferred base had no effect, since the
delta should have been zero in this case. Or is this mechanism deactivated for
binary data (/dev/zero in this case)?

> If that's the problem, it should be easy to fix cheaply. It would not
> work in the general case, but it would for your specific example. But
> since it costs nothing, there's no reason not to.
> 
> I'll see if I can investigate using the example script you posted.

Thanks!

> I meant "we do the optimization during history traversal that avoids
> going into sub-trees we have already seen". We do _not_ do the full
> history traversal for a partial push.

OK. I see. Maybe a config option to request a full traversal would be a
reasonable compromise? That way CPU could be traded against bandwidth for
repositories that happen to have slow/unreliable/expensive connections.

> Yes, that would be nice. However, in the common cases it would make
> things much worse. A clone of linux.git has ~3.5M objects.

Of course, if there's nothing you can drop, any attempt to drop objects will
add to overhead. That's similar to compressing compressed files. This will
enlarge the original file. Would that be a reasonable argument to get rid of
all attempts to compress files?

-- 
Josef Wolf
jw@raven.inka.de

  reply	other threads:[~2013-09-24 20:40 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-09-10 13:08 Re-Transmission of blobs? Josef Wolf
2013-09-10 17:51 ` Junio C Hamano
2013-09-11 11:27   ` Josef Wolf
2013-09-11 17:14     ` Junio C Hamano
2013-09-12  7:42       ` Josef Wolf
2013-09-12  9:23         ` Jeff King
2013-09-12 10:35           ` Josef Wolf
2013-09-12 19:44             ` Jeff King
2013-09-13 10:09               ` Josef Wolf
2013-09-16 21:55                 ` Jeff King
2013-09-20  9:27                   ` Josef Wolf
2013-09-24  7:36                     ` Jeff King
2013-09-24 20:36                       ` Josef Wolf [this message]
2013-09-12 12:45           ` Pyeron, Jason J CTR (US)
2013-09-12 19:56             ` Jeff King
2013-09-12 20:06               ` Pyeron, Jason J CTR (US)
2013-09-13 10:23                 ` Josef Wolf
2013-09-13 11:51                   ` Jason Pyeron
2013-09-13 12:16                 ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130924203651.GH14259@raven.wolf.lan \
    --to=jw@raven.inka.de \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).