All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Hommey <mh@glandium.org>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: suboptimal behavior of fast-import in some cases with "from"
Date: Thu, 9 Jul 2015 14:03:15 +0900	[thread overview]
Message-ID: <20150709050314.GA6013@glandium.org> (raw)
In-Reply-To: <xmqq4mlgzyl0.fsf@gitster.dls.corp.google.com>

On Mon, Jul 06, 2015 at 03:54:35PM -0700, Junio C Hamano wrote:
> Mike Hommey <mh@glandium.org> writes:
> 
> > One of the first things parse_from does is unconditionally throw away
> > the tree for the given branch, and then the "from" tree is loaded. So
> > when the "from" commit is the current head of the branch, that make
> > fast-import do more work than necessary.
> 
> If it is very common that the next commit the input stream wants to
> create is often on top of the commit that was just created, and if
> it is very common that the input stream producer knows what the
> commit object name of the commit that was just created, then
> optimising for that case does not sound too bad.  It really depends
> on two factors:
> 
>  - How likely is it that other people make the same mistake?

Looks like the answer is: very. Assuming my quick glance at the code is
not mistaken, the same mistake is made in at least git-p4.py (yes, the
one that comes with git), felipec's git-remote-hg, and hg-fast-export,
and that's 100% of the sample I looked at.

I won't claim to know what fast-import is doing, not having looked at
more than the parse_from* functions and the commit message for 4cabf858,
but it seems plausible this also skips making tree deltas for those
trees.

>  - How bad the damage to parse_from() would be if we wanted to
>    optimize for this case?

I /think/ it would look like this (untested), which doesn't seem too
damaging:

diff --git a/fast-import.c b/fast-import.c
index e78ca10..cb232e0 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2588,14 +2588,12 @@ static int parse_from(struct branch *b)
 {
        const char *from;
        struct branch *s;
+       unsigned char sha1[20];
 
        if (!skip_prefix(command_buf.buf, "from ", &from))
                return 0;
 
-       if (b->branch_tree.tree) {
-               release_tree_content_recursive(b->branch_tree.tree);
-               b->branch_tree.tree = NULL;
-       }
+       hashcpy(sha1, b->branch_tree.versions[1].sha1);
 
        s = lookup_branch(from);
        if (b == s)
@@ -2626,6 +2624,11 @@ static int parse_from(struct branch *b)
        else
                die("Invalid ref name or SHA1 expression: %s", from);
 
+       if (b->branch_tree.tree && hashcmp(sha1, b->branch_tree.versions[1].sha1)) {
+               release_tree_content_recursive(b->branch_tree.tree);
+               b->branch_tree.tree = NULL;
+       }
+
        read_next_command();
        return 1;
 }

Mike

  reply	other threads:[~2015-07-09  5:03 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-06 22:07 suboptimal behavior of fast-import in some cases with "from" Mike Hommey
2015-07-06 22:54 ` Junio C Hamano
2015-07-09  5:03   ` Mike Hommey [this message]
2015-07-09  5:52     ` Mike Hommey
2015-07-09  6:50       ` [PATCH] fast-import: Do less work when given "from" matches current branch head Mike Hommey
2015-07-09 20:37         ` Junio C Hamano
2015-07-09 22:30           ` Mike Hommey
2015-07-09 22:44             ` Junio C Hamano
2015-07-09 22:47               ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150709050314.GA6013@glandium.org \
    --to=mh@glandium.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.