git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Keeping <john@keeping.me.uk>
To: Dave Abrahams <dave@boostpro.com>
Cc: git@vger.kernel.org, Jonathan Nieder <jrnieder@gmail.com>
Subject: Re: fast-import bug?
Date: Sun, 23 Jun 2013 12:09:33 +0100	[thread overview]
Message-ID: <20130623110933.GG4676@serenity.lan> (raw)
In-Reply-To: <m2txkp1shb.fsf@cube.gateway.2wire.net>

On Sat, Jun 22, 2013 at 07:16:48PM -0700, Dave Abrahams wrote:
> 
> on Sat Jun 22 2013, John Keeping <john-AT-keeping.me.uk> wrote:
> 
> > On Fri, Jun 21, 2013 at 02:21:47AM -0700, Dave Abrahams wrote:
> >> The docs for fast-import seem to imply that I can use "ls" to get the
> >> SHA1 of a commit for which I have a mark:
> >> 
> >>        Reading from a named tree
> >>            The <dataref> can be a mark reference (:<idnum>) or the full 40-byte
> >
> >>            SHA-1 of a Git tag, commit, or tree object, preexisting or waiting to
> >>            be written. The path is relative to the top level of the tree named by
> >>            <dataref>.
> >> 
> >>                        'ls' SP <dataref> SP <path> LF
> >> 
> >>        See filemodify above for a detailed description of <path>.
> >> 
> >>        Output uses the same format as git ls-tree <tree> -- <path>:
> >> 
> >>            <mode> SP ('blob' | 'tree' | 'commit') SP <dataref> HT <path> LF
> >> 
> >>        The <dataref> represents the blob, tree, or commit object at <path> and
> >>                                                    ^^^^^^
> >>        can be used in later cat-blob, filemodify, or ls commands.
> >> 
> >> but I can't get it to work.  It's not entirely clear it's supposed to
> >> work.  What path would I pass?  Passing an empty path simply causes git
> >> to report "missing ".
> >
> > Which version of Git are you using?  
> 
> ,----[ git --version ]
> | git version 1.8.3.1
> `----
> 
> > I just tried this and get the error
> > "fatal: Empty path component found in input", 
> 
> I get that too.
> 
> > which seems to be from commit 178e1de (fast-import: don't allow 'ls'
> > of path with empty components, 2012-03-09), which is included in Git
> > 1.7.9.5.
> 
> Yes, that's at least part of the issue.  I notice git-fast-import
> rejects the root path "" for other commands, e.g. when used as the
> source of a filecopy we get the same issue.  I also note that the docs
> don't make it clear that quoting the path is mandatory if it might turn
> out to be empty.

Interesting.  There are two places that can produce this error message,
tree_content_get and tree_content_set, but I wonder if this means that
tree_content_get should not be doing this check.  The two places that
call it are:

1) "parse_ls" as discussed here
2) "file_change_cr" which deals with file copy and rename.

My patch in the previous message only changes the behaviour for the
parse_ls case, but it seems that you have a valid use case for removing
this check in the file_change_cr case as well.

>                                              I also note that the docs
> don't make it clear that quoting the path is mandatory if it might turn
> out to be empty.

That's not quite the case.  It looks to me like quoting the path is
mandatory if no "<dataref>" is given, and indeed the documentation says:

   Reading from the active commit
       This form can only be used in the middle of a commit. The path
       names a directory entry within fast-import’s active commit. The
       path must be quoted in this case.

               'ls' SP <path> LF

> > It seems to be slightly more complicated than that though, because after
> > allowing empty trees I get the "missing" message for the root tree.
> 
> Yeah, I've tried to patch Git to solve this but ran into that problem
> and gave up.
> 
> > This seems to be because its mode is 0 and not S_IFDIR.
> 
> Aha.
> 
> > With the patch below, things are working as I expect 
> 
> Awesome; works for me, too!
> 
> > but I don't understand why the mode of the root is not set correctly
> > at this point.  Perhaps someone more familiar with fast-import will
> > have some insight...
> 
> Yeah... there's no bug tracker for Git, right?  So if nobody pays
> attention to this thread, the problem will persist?

Yes, but I don't see that happening particularly often.  In the worst
case issues are normally documented by a failing test case.

In this case, I think I do now understand why the mode is 0: in parse_ls
a new tree object is created and the SHA1 of the original is copied in
but the mode is left blank; clearly this should be set to S_IFDIR when
the SHA1 is non-null.

I think the patch I now have is correct (and addresses the "copy from
root" scenario), but I need to spend some time understanding t9300 so
that I can add suitable test cases.

-- >8 --
diff --git a/fast-import.c b/fast-import.c
index 23f625f..e2c9d50 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -1629,7 +1629,8 @@ del_entry:
 static int tree_content_get(
 	struct tree_entry *root,
 	const char *p,
-	struct tree_entry *leaf)
+	struct tree_entry *leaf,
+	int allow_root)
 {
 	struct tree_content *t;
 	const char *slash1;
@@ -1641,31 +1642,39 @@ static int tree_content_get(
 		n = slash1 - p;
 	else
 		n = strlen(p);
-	if (!n)
+	if (!n && !allow_root)
 		die("Empty path component found in input");
 
 	if (!root->tree)
 		load_tree(root);
+
+	if (!n) {
+		e = root;
+		goto found_entry;
+	}
+
 	t = root->tree;
 	for (i = 0; i < t->entry_count; i++) {
 		e = t->entries[i];
 		if (e->name->str_len == n && !strncmp_icase(p, e->name->str_dat, n)) {
-			if (!slash1) {
-				memcpy(leaf, e, sizeof(*leaf));
-				if (e->tree && is_null_sha1(e->versions[1].sha1))
-					leaf->tree = dup_tree_content(e->tree);
-				else
-					leaf->tree = NULL;
-				return 1;
-			}
+			if (!slash1)
+				goto found_entry;
 			if (!S_ISDIR(e->versions[1].mode))
 				return 0;
 			if (!e->tree)
 				load_tree(e);
-			return tree_content_get(e, slash1 + 1, leaf);
+			return tree_content_get(e, slash1 + 1, leaf, 0);
 		}
 	}
 	return 0;
+
+found_entry:
+	memcpy(leaf, e, sizeof(*leaf));
+	if (e->tree && is_null_sha1(e->versions[1].sha1))
+		leaf->tree = dup_tree_content(e->tree);
+	else
+		leaf->tree = NULL;
+	return 1;
 }
 
 static int update_branch(struct branch *b)
@@ -2415,7 +2424,7 @@ static void file_change_cr(struct branch *b, int rename)
 	if (rename)
 		tree_content_remove(&b->branch_tree, s, &leaf);
 	else
-		tree_content_get(&b->branch_tree, s, &leaf);
+		tree_content_get(&b->branch_tree, s, &leaf, 1);
 	if (!leaf.versions[1].mode)
 		die("Path %s not in branch", s);
 	if (!*d) {	/* C "path/to/subdir" "" */
@@ -3051,6 +3060,8 @@ static void parse_ls(struct branch *b)
 		struct object_entry *e = parse_treeish_dataref(&p);
 		root = new_tree_entry();
 		hashcpy(root->versions[1].sha1, e->idx.sha1);
+		if (!is_null_sha1(root->versions[1].sha1))
+			root->versions[1].mode = S_IFDIR;
 		load_tree(root);
 		if (*p++ != ' ')
 			die("Missing space after tree-ish: %s", command_buf.buf);
@@ -3065,7 +3076,7 @@ static void parse_ls(struct branch *b)
 			die("Garbage after path in: %s", command_buf.buf);
 		p = uq.buf;
 	}
-	tree_content_get(root, p, &leaf);
+	tree_content_get(root, p, &leaf, 1);
 	/*
 	 * A directory in preparation would have a sha1 of zero
 	 * until it is saved.  Save, for simplicity.

  reply	other threads:[~2013-06-23 11:10 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-21  9:21 fast-import bug? Dave Abrahams
2013-06-22 10:21 ` John Keeping
2013-06-23  2:16   ` Dave Abrahams
2013-06-23 11:09     ` John Keeping [this message]
2013-06-23 14:19       ` Dave Abrahams
2013-06-23 14:55         ` John Keeping

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130623110933.GG4676@serenity.lan \
    --to=john@keeping.me.uk \
    --cc=dave@boostpro.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).