git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3] fast-import: add 'ls' command
@ 2011-02-11 22:43 Jonathan Nieder
  2011-02-11 22:47 ` Sverre Rabbelier
  2011-02-12  0:40 ` Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: Jonathan Nieder @ 2011-02-11 22:43 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, David Barr, Ramkumar Ramachandra,
	Sverre Rabbelier

From: David Barr <david.barr@cordelta.com>
Date: Thu, 2 Dec 2010 21:40:20 +1100

Introduce an "ls" command to read directory entries from the active
commit or a named commit.  This allows printing a blob from the active
commit or copying a blob or tree from a previous commit for use in the
current one.

There are two forms of the 'ls' command: the two-argument form prints
the entry at <path> for the tree underlying the tree, commit, or tag
named by <dataref>:

	'ls' SP <dataref> SP <path> LF

The one-argument form prints the entry at <path> in fast-import's
active commit.

	'ls' SP <path> LF

Dirty hack: for now, git fast-import will treat missing paths as empty
subtrees and print them as

 040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904	path/to/nowhere

to avoid confusing frontends that inserted such a path before.  But
an alternate output format

	'missing' SP <path> LF

is also allowed, so other backends (and perhaps git some day) can
convey the difference between an empty subtree and no subtree at all.

Proposed updates to svn-fe use the "ls" command to delegate
responsibility for knowledge about paths to git.  Reading such
information from the target repository instead of a data structure
maintained by the fast-import frontend simplifies the frontend a great
deal and means support for resuming an import in a separate
fast-import run (i.e., incremental import) is basically free.

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Last seen at [1].

Changes from v2:
 - clearer documentation
 - has a corresponding 'feature'
 - tiny clarification to parse_ls
 - rebased against master (also applies on maint fwiw).

The vcs-fast-import-devs don't seem to dislike it, at least. :)

Thanks for the help and patience.

[1] http://thread.gmane.org/gmane.comp.version-control.git/162698/focus=165553

 Documentation/git-fast-import.txt |   65 ++++++++++++++-
 fast-import.c                     |  164 ++++++++++++++++++++++++++++++++++++-
 t/t9300-fast-import.sh            |  112 +++++++++++++++++++++++--
 3 files changed, 327 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index c3a2766..4f92954 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -196,7 +196,8 @@ especially when a higher level language such as Perl, Python or
 Ruby is being used.
 
 fast-import is very strict about its input.  Where we say SP below we mean
-*exactly* one space.  Likewise LF means one (and only one) linefeed.
+*exactly* one space.  Likewise LF means one (and only one) linefeed
+and HT one (and only one) horizontal tab.
 Supplying additional whitespace characters will cause unexpected
 results, such as branch names or file names with leading or trailing
 spaces in their name, or early termination of fast-import when it encounters
@@ -334,6 +335,11 @@ and control the current import process.  More detailed discussion
 	format to the file descriptor set with `--cat-blob-fd` or
 	`stdout` if unspecified.
 
+`ls`::
+	Causes fast-import to print a directory entry in 'ls-tree'
+	format to the file descriptor set with `--cat-blob-fd` or
+	`stdout` if unspecified.
+
 `feature`::
 	Require that fast-import supports the specified feature, or
 	abort if it does not.
@@ -919,6 +925,57 @@ This command can be used anywhere in the stream that comments are
 accepted.  In particular, the `cat-blob` command can be used in the
 middle of a commit but not in the middle of a `data` command.
 
+`ls`
+~~~~
+Prints a directory entry to a file descriptor previously arranged with
+the `--cat-blob-fd` argument.  This allows printing a blob from the
+active commit (with `cat-blob`) or copying a blob or tree from a
+previous commit for use in the current one (with `filemodify`).
+
+The `ls` command can be used anywhere in the stream that comments are
+accepted, including the middle of a commit.
+
+Reading from the active commit::
+	This form can only be used in the middle of a `commit`.
+	The path names a directory entry within fast-import's
+	active commit.  The path must be quoted in this case.
++
+....
+	'ls' SP <path> LF
+....
+
+Reading from a named tree::
+	The `<dataref>` can be a mark reference (`:<idnum>`) or the
+	full 40-byte SHA-1 of a Git tag, commit, or tree object,
+	preexisting or waiting to be written.
+	The path is relative to the top level of the tree
+	named by `<dataref>`.
++
+....
+	'ls' SP <dataref> SP <path> LF
+....
+
+See `filemodify` above for a detailed description of `<path>`.
+
+Output uses the same format as `git ls-tree <tree> -- <path>`:
+
+====
+	<mode> SP ('blob' | 'tree' | 'commit') SP <dataref> HT <path> LF
+====
+
+The <dataref> represents the blob, tree, or commit object at <path>
+and can be used in later 'cat-blob', 'filemodify', or 'ls' commands.
+
+Git repositories do not distinguish between missing paths and empty
+subtrees.  If a path is not found, 'git fast-import' will report it as
+an empty tree.  Backends that do have a notion of empty trees may write
+
+====
+	missing SP <path> LF
+====
+
+for paths that do not correspond to a blob or subtree.
+
 `feature`
 ~~~~~~~~~
 Require that fast-import supports the specified feature, or abort if
@@ -946,8 +1003,10 @@ import-marks::
 	any "feature import-marks" command in the stream.
 
 cat-blob::
-	Ignored.  Versions of fast-import not supporting the
-	"cat-blob" command will exit with a message indicating so.
+ls::
+	Require that the backend support the 'cat-blob' or 'ls' command.
+	Versions of fast-import not supporting the specified command
+	will exit with a message indicating so.
 	This lets the import error out early with a clear message,
 	rather than wasting time on the early part of an import
 	before the unsupported command is detected.
diff --git a/fast-import.c b/fast-import.c
index 3886a1b..cef2b8c 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -24,10 +24,12 @@ Format of STDIN stream:
     commit_msg
     ('from' sp committish lf)?
     ('merge' sp committish lf)*
-    file_change*
+    (file_change | ls)*
     lf?;
   commit_msg ::= data;
 
+  ls ::= 'ls' sp '"' quoted(path) '"' lf;
+
   file_change ::= file_clr
     | file_del
     | file_rnm
@@ -132,7 +134,7 @@ Format of STDIN stream:
   ts    ::= # time since the epoch in seconds, ascii base10 notation;
   tz    ::= # GIT style timezone;
 
-     # note: comments and cat requests may appear anywhere
+     # note: comments, ls and cat requests may appear anywhere
      # in the input, except within a data command.  Any form
      # of the data command always escapes the related input
      # from comment processing.
@@ -141,7 +143,9 @@ Format of STDIN stream:
      # must be the first character on that line (an lf
      # preceded it).
      #
+
   cat_blob ::= 'cat-blob' sp (hexsha1 | idnum) lf;
+  ls_tree  ::= 'ls' sp (hexsha1 | idnum) sp path_str lf;
 
   comment ::= '#' not_lf* lf;
   not_lf  ::= # Any byte that is not ASCII newline (LF);
@@ -374,6 +378,7 @@ static int cat_blob_fd = STDOUT_FILENO;
 
 static void parse_argv(void);
 static void parse_cat_blob(void);
+static void parse_ls(struct branch *b);
 
 static void write_branch_report(FILE *rpt, struct branch *b)
 {
@@ -2614,6 +2619,8 @@ static void parse_new_commit(void)
 			note_change_n(b, prev_fanout);
 		else if (!strcmp("deleteall", command_buf.buf))
 			file_change_deleteall(b);
+		else if (!prefixcmp(command_buf.buf, "ls "))
+			parse_ls(b);
 		else {
 			unread_command_buf = 1;
 			break;
@@ -2837,6 +2844,155 @@ static void parse_cat_blob(void)
 	cat_blob(oe, sha1);
 }
 
+static struct object_entry *dereference(struct object_entry *oe,
+					unsigned char sha1[20])
+{
+	unsigned long size;
+	void *buf = NULL;
+	if (!oe) {
+		enum object_type type = sha1_object_info(sha1, NULL);
+		if (type < 0)
+			die("object not found: %s", sha1_to_hex(sha1));
+		/* cache it! */
+		oe = insert_object(sha1);
+		oe->type = type;
+		oe->pack_id = MAX_PACK_ID;
+		oe->idx.offset = 1;
+	}
+	switch (oe->type) {
+	case OBJ_TREE:	/* easy case. */
+		return oe;
+	case OBJ_COMMIT:
+	case OBJ_TAG:
+		break;
+	default:
+		die("Not a treeish: %s", command_buf.buf);
+	}
+
+	if (oe->pack_id != MAX_PACK_ID) {	/* in a pack being written */
+		buf = gfi_unpack_entry(oe, &size);
+	} else {
+		enum object_type unused;
+		buf = read_sha1_file(sha1, &unused, &size);
+	}
+	if (!buf)
+		die("Can't load object %s", sha1_to_hex(sha1));
+
+	/* Peel one layer. */
+	switch (oe->type) {
+	case OBJ_TAG:
+		if (size < 40 + strlen("object ") ||
+		    get_sha1_hex(buf + strlen("object "), sha1))
+			die("Invalid SHA1 in tag: %s", command_buf.buf);
+		break;
+	case OBJ_COMMIT:
+		if (size < 40 + strlen("tree ") ||
+		    get_sha1_hex(buf + strlen("tree "), sha1))
+			die("Invalid SHA1 in commit: %s", command_buf.buf);
+	}
+
+	free(buf);
+	return find_object(sha1);
+}
+
+static struct object_entry *parse_treeish_dataref(const char **p)
+{
+	unsigned char sha1[20];
+	struct object_entry *e;
+
+	if (**p == ':') {	/* <mark> */
+		char *endptr;
+		e = find_mark(strtoumax(*p + 1, &endptr, 10));
+		if (endptr == *p + 1)
+			die("Invalid mark: %s", command_buf.buf);
+		if (!e)
+			die("Unknown mark: %s", command_buf.buf);
+		*p = endptr;
+		hashcpy(sha1, e->idx.sha1);
+	} else {	/* <sha1> */
+		if (get_sha1_hex(*p, sha1))
+			die("Invalid SHA1: %s", command_buf.buf);
+		e = find_object(sha1);
+		*p += 40;
+	}
+
+	while (!e || e->type != OBJ_TREE)
+		e = dereference(e, sha1);
+	return e;
+}
+
+static void print_ls(int mode, const unsigned char *sha1, const char *path)
+{
+	static struct strbuf line = STRBUF_INIT;
+
+	/* See show_tree(). */
+	const char *type =
+		S_ISGITLINK(mode) ? commit_type :
+		S_ISDIR(mode) ? tree_type :
+		blob_type;
+
+	/* mode SP type SP object_name TAB path LF */
+	strbuf_reset(&line);
+	strbuf_addf(&line, "%06o %s %s\t",
+			mode, type, sha1_to_hex(sha1));
+	quote_c_style(path, &line, NULL, 0);
+	strbuf_addch(&line, '\n');
+	cat_blob_write(line.buf, line.len);
+}
+
+static void parse_ls(struct branch *b)
+{
+	const char *p;
+	struct tree_entry *root = NULL;
+	struct tree_entry leaf = {0};
+
+	/* ls SP (<treeish> SP)? <path> */
+	p = command_buf.buf + strlen("ls ");
+	if (*p == '"') {
+		if (!b)
+			die("Not in a commit: %s", command_buf.buf);
+		root = &b->branch_tree;
+	} else {
+		struct object_entry *e = parse_treeish_dataref(&p);
+		root = new_tree_entry();
+		hashcpy(root->versions[1].sha1, e->idx.sha1);
+		load_tree(root);
+		if (*p++ != ' ')
+			die("Missing space after tree-ish: %s", command_buf.buf);
+	}
+	if (*p == '"') {
+		static struct strbuf uq = STRBUF_INIT;
+		const char *endp;
+		strbuf_reset(&uq);
+		if (unquote_c_style(&uq, p, &endp))
+			die("Invalid path: %s", command_buf.buf);
+		if (*endp)
+			die("Garbage after path in: %s", command_buf.buf);
+		p = uq.buf;
+	}
+	tree_content_get(root, p, &leaf);
+	/*
+	 * A directory in preparation would have a sha1 of zero
+	 * until it is saved.  Save, for simplicity.
+	 */
+	if (S_ISDIR(leaf.versions[1].mode))
+		store_tree(&leaf);
+
+	if (!leaf.versions[1].mode) {
+		/*
+		 * Missing path?  Must be an empty subtree!
+		 *
+		 * When git learns to track empty directories, we can report
+		 * this by saying 'missing "path/to/directory"' instead.
+		 */
+		print_ls(S_IFDIR, (const unsigned char *) EMPTY_TREE_SHA1_BIN, p);
+	} else {
+		print_ls(leaf.versions[1].mode, leaf.versions[1].sha1, p);
+	}
+	if (!b || root != &b->branch_tree)
+		release_tree_entry(root);
+}
+
 static void checkpoint(void)
 {
 	checkpoint_requested = 0;
@@ -3001,7 +3157,7 @@ static int parse_one_feature(const char *feature, int from_stream)
 		relative_marks_paths = 0;
 	} else if (!prefixcmp(feature, "force")) {
 		force_update = 1;
-	} else if (!strcmp(feature, "notes")) {
+	} else if (!strcmp(feature, "notes") || !strcmp(feature, "ls")) {
 		; /* do nothing; we have the feature */
 	} else {
 		return 0;
@@ -3142,6 +3298,8 @@ int main(int argc, const char **argv)
 	while (read_next_command() != EOF) {
 		if (!strcmp("blob", command_buf.buf))
 			parse_new_blob();
+		else if (!prefixcmp(command_buf.buf, "ls "))
+			parse_ls(NULL);
 		else if (!prefixcmp(command_buf.buf, "commit "))
 			parse_new_commit();
 		else if (!prefixcmp(command_buf.buf, "tag "))
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 52ac0e5..f70d2a9 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -42,6 +42,14 @@ echo "$@"'
 
 >empty
 
+test_expect_success 'setup: have pipes?' '
+	rm -f frob &&
+	if mkfifo frob
+	then
+		test_set_prereq PIPE
+	fi
+'
+
 ###
 ### series A
 ###
@@ -898,6 +906,97 @@ test_expect_success \
 	 git diff-tree -C --find-copies-harder -r N4^ N4 >actual &&
 	 compare_diff_raw expect actual'
 
+test_expect_success PIPE 'N: read and copy directory' '
+	cat >expect <<-\EOF
+	:100755 100755 f1fb5da718392694d0076d677d6d0e364c79b0bc f1fb5da718392694d0076d677d6d0e364c79b0bc C100	file2/newf	file3/newf
+	:100644 100644 7123f7f44e39be127c5eb701e5968176ee9d78b1 7123f7f44e39be127c5eb701e5968176ee9d78b1 C100	file2/oldf	file3/oldf
+	EOF
+	git update-ref -d refs/heads/N4 &&
+	rm -f backflow &&
+	mkfifo backflow &&
+	(
+		exec <backflow &&
+		cat <<-EOF &&
+		commit refs/heads/N4
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		copy by tree hash, part 2
+		COMMIT
+
+		from refs/heads/branch^0
+		ls "file2"
+		EOF
+		read mode type tree filename &&
+		echo "M 040000 $tree file3"
+	) |
+	git fast-import --cat-blob-fd=3 3>backflow &&
+	git diff-tree -C --find-copies-harder -r N4^ N4 >actual &&
+	compare_diff_raw expect actual
+'
+
+test_expect_success PIPE 'N: read and copy "empty" directory' '
+	cat <<-\EOF >expect &&
+	OBJNAME
+	:000000 100644 OBJNAME OBJNAME A	greeting
+	OBJNAME
+	:100644 000000 OBJNAME OBJNAME D	unrelated
+	OBJNAME
+	:000000 100644 OBJNAME OBJNAME A	unrelated
+	EOF
+	git update-ref -d refs/heads/copy-empty &&
+	rm -f backflow &&
+	mkfifo backflow &&
+	(
+		exec <backflow &&
+		cat <<-EOF &&
+		commit refs/heads/copy-empty
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		copy "empty" (missing) directory
+		COMMIT
+
+		M 100644 inline src/greeting
+		data <<BLOB
+		hello
+		BLOB
+		C src/greeting dst1/non-greeting
+		C src/greeting unrelated
+		# leave behind "empty" src directory
+		D src/greeting
+		ls "src"
+		EOF
+		read mode type tree filename &&
+		sed -e "s/X\$//" <<-EOF
+		M $mode $tree dst1
+		M $mode $tree dst2
+
+		commit refs/heads/copy-empty
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		copy empty directory to root
+		COMMIT
+
+		M $mode $tree X
+
+		commit refs/heads/copy-empty
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		add another file
+		COMMIT
+
+		M 100644 inline greeting
+		data <<BLOB
+		hello
+		BLOB
+		EOF
+	) |
+	git fast-import --cat-blob-fd=3 3>backflow &&
+	git rev-list copy-empty |
+	git diff-tree -r --root --stdin |
+	sed "s/$_x40/OBJNAME/g" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success \
 	'N: copy root directory by tree hash' \
 	'cat >expect <<-\EOF &&
@@ -1861,6 +1960,11 @@ test_expect_success 'R: feature no-relative-marks should be honoured' '
     test_cmp marks.new non-relative.out
 '
 
+test_expect_success 'R: feature ls supported' '
+	echo "feature ls" |
+	git fast-import
+'
+
 test_expect_success 'R: feature cat-blob supported' '
 	echo "feature cat-blob" |
 	git fast-import
@@ -1986,14 +2090,6 @@ test_expect_success 'R: print two blobs to stdout' '
 	test_cmp expect actual
 '
 
-test_expect_success 'setup: have pipes?' '
-	rm -f frob &&
-	if mkfifo frob
-	then
-		test_set_prereq PIPE
-	fi
-'
-
 test_expect_success PIPE 'R: copy using cat-file' '
 	expect_id=$(git hash-object big) &&
 	expect_len=$(wc -c <big) &&
-- 
1.7.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] fast-import: add 'ls' command
  2011-02-11 22:43 [PATCH v3] fast-import: add 'ls' command Jonathan Nieder
@ 2011-02-11 22:47 ` Sverre Rabbelier
  2011-02-11 22:59   ` Jonathan Nieder
  2011-02-12  0:40 ` Junio C Hamano
  1 sibling, 1 reply; 7+ messages in thread
From: Sverre Rabbelier @ 2011-02-11 22:47 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Junio C Hamano, David Barr, Ramkumar Ramachandra

Heya,

On Fri, Feb 11, 2011 at 23:43, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Introduce an "ls" command to read directory entries from the active
> commit or a named commit.  This allows printing a blob from the active
> commit or copying a blob or tree from a previous commit for use in the
> current one.

Useful addition I think. What happens if you ask it for a path that is
not a directory?

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] fast-import: add 'ls' command
  2011-02-11 22:47 ` Sverre Rabbelier
@ 2011-02-11 22:59   ` Jonathan Nieder
  2011-02-11 23:51     ` Sverre Rabbelier
  0 siblings, 1 reply; 7+ messages in thread
From: Jonathan Nieder @ 2011-02-11 22:59 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: git, Junio C Hamano, David Barr, Ramkumar Ramachandra

Sverre Rabbelier wrote:
> On Fri, Feb 11, 2011 at 23:43, Jonathan Nieder <jrnieder@gmail.com> wrote:

>> Introduce an "ls" command to read directory entries from the active
>> commit or a named commit.  This allows printing a blob from the active
>> commit or copying a blob or tree from a previous commit for use in the
>> current one.
>
> Useful addition I think. What happens if you ask it for a path that is
> not a directory?

Hmm, the documentation was not at all clear, then.  Good catch.

The answer is that it always prints a single dirent, whether the path
supplied names a file or a directory.

	FE> ls :1 Documentation
	gfi> 040000 tree 9e6c2b599341d28a2a375f8207507e0a2a627fe9	Documentation
	FE> ls :1 git.c
	gfi> 100644 blob 23610aa0366ebe36e65e78fb8c5fba3f2d0b8f77	git.c
	FE> ls :1 thisfiledoesnotexist
	gfi> 040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904	thisfiledoesnotexist
	FE> ls :1 RelNotes
	gfi> 120000 blob b942e499449d97aeb50c73ca2bdc1c6e6d528743	RelNotes
	FE> ls 9e6c2b599341d28a2a375f8207507e0a2a627fe9 git-fast-import.txt
	gfi> 100644 blob 4f92954396e3f0f97e75b6838a5635b583708870	git-fast-import.txt
	FE> cat-blob b942e499449d97aeb50c73ca2bdc1c6e6d528743
	gfi> b942e499449d97aeb50c73ca2bdc1c6e6d528743 blob 32
	gfi> Documentation/RelNotes/1.7.4.txt

Maybe this patch (for squashing) would help.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index 4f92954..495e01f 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -336,9 +336,9 @@ and control the current import process.  More detailed discussion
 	`stdout` if unspecified.
 
 `ls`::
-	Causes fast-import to print a directory entry in 'ls-tree'
-	format to the file descriptor set with `--cat-blob-fd` or
-	`stdout` if unspecified.
+	Causes fast-import to print a line describing a directory
+	entry in 'ls-tree' format to the file descriptor set with
+	`--cat-blob-fd` or `stdout` if unspecified.
 
 `feature`::
 	Require that fast-import supports the specified feature, or
@@ -927,10 +927,11 @@ middle of a commit but not in the middle of a `data` command.
 
 `ls`
 ~~~~
-Prints a directory entry to a file descriptor previously arranged with
-the `--cat-blob-fd` argument.  This allows printing a blob from the
-active commit (with `cat-blob`) or copying a blob or tree from a
-previous commit for use in the current one (with `filemodify`).
+Prints information about the object at a path to a file descriptor
+previously arranged with the `--cat-blob-fd` argument.  This allows
+printing a blob from the active commit (with `cat-blob`) or copying a
+blob or tree from a previous commit for use in the current one (with
+`filemodify`).
 
 The `ls` command can be used anywhere in the stream that comments are
 accepted, including the middle of a commit.
@@ -957,7 +958,7 @@ Reading from a named tree::
 
 See `filemodify` above for a detailed description of `<path>`.
 
-Output uses the same format as `git ls-tree <tree> -- <path>`:
+Output uses the same format as `git ls-tree <tree> {litdd} <path>`:
 
 ====
 	<mode> SP ('blob' | 'tree' | 'commit') SP <dataref> HT <path> LF

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] fast-import: add 'ls' command
  2011-02-11 22:59   ` Jonathan Nieder
@ 2011-02-11 23:51     ` Sverre Rabbelier
  0 siblings, 0 replies; 7+ messages in thread
From: Sverre Rabbelier @ 2011-02-11 23:51 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Junio C Hamano, David Barr, Ramkumar Ramachandra

Heya,

On Fri, Feb 11, 2011 at 23:59, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hmm, the documentation was not at all clear, then.  Good catch.
>
> The answer is that it always prints a single dirent, whether the path
> supplied names a file or a directory.

Ok, makes sense :).

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] fast-import: add 'ls' command
  2011-02-11 22:43 [PATCH v3] fast-import: add 'ls' command Jonathan Nieder
  2011-02-11 22:47 ` Sverre Rabbelier
@ 2011-02-12  0:40 ` Junio C Hamano
  2011-02-12  1:21   ` Jonathan Nieder
  1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2011-02-12  0:40 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, David Barr, Ramkumar Ramachandra, Sverre Rabbelier

Jonathan Nieder <jrnieder@gmail.com> writes:

> From: David Barr <david.barr@cordelta.com>
> Date: Thu, 2 Dec 2010 21:40:20 +1100
>
> Introduce an "ls" command to read directory entries from the active
> commit or a named commit.  This allows printing a blob from the active
> commit or copying a blob or tree from a previous commit for use in the
> current one.
>
> There are two forms of the 'ls' command: the two-argument form prints
> the entry at <path> for the tree underlying the tree, commit, or tag
> named by <dataref>:
>
> 	'ls' SP <dataref> SP <path> LF
>
> The one-argument form prints the entry at <path> in fast-import's
> active commit.
>
> 	'ls' SP <path> LF

Is this really "ls"?

Obviously, an extended SHA-1 that is accepted by the normal git does not
have a notion of "marks", but modulo that, the first one looks to me very
similar to "rev-parse <ref>:<path>" in spirit, which suggests that "the
path in the current one" might be better spelled as "rev-parse :<path>" to
make the syntax and the concept more consistent across parts of the
system.  If it makes sense to allow arbitary <committish> (or <treeish>
for that matter) for the <dataref> part of the parameter, this observation
becomes even more true, no?

Having said that, I do not deeply care about the token "ls" itself.  I
just reacted to "<dataref> SP <path>" part.

> Dirty hack: for now, git fast-import will treat missing paths as empty
> subtrees and print them as
>
>  040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904	path/to/nowhere
>
> to avoid confusing frontends that inserted such a path before.

Sorry, but I am not quite sure what this paragraph is trying to say.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] fast-import: add 'ls' command
  2011-02-12  0:40 ` Junio C Hamano
@ 2011-02-12  1:21   ` Jonathan Nieder
  2011-02-12  1:36     ` Jonathan Nieder
  0 siblings, 1 reply; 7+ messages in thread
From: Jonathan Nieder @ 2011-02-12  1:21 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, David Barr, Ramkumar Ramachandra, Sverre Rabbelier,
	vcs-fast-import-devs

(+cc: vcs-fast-import-devs)

Junio C Hamano wrote:

>> There are two forms of the 'ls' command: the two-argument form prints
>> the entry at <path> for the tree underlying the tree, commit, or tag
>> named by <dataref>:
>>
>>       'ls' SP <dataref> SP <path> LF
[...]
> Is this really "ls"?
>
> Obviously, an extended SHA-1 that is accepted by the normal git does not
> have a notion of "marks", but modulo that, the first one looks to me very
> similar to "rev-parse <ref>:<path>" in spirit, which suggests that "the
> path in the current one" might be better spelled as "rev-parse :<path>" to
> make the syntax and the concept more consistent across parts of the
> system.  If it makes sense to allow arbitary <committish> (or <treeish>
> for that matter) for the <dataref> part of the parameter, this observation
> becomes even more true, no?

I suppose so.  It is inspired by "git ls-tree <ref> -- <path>", and
it is convenient that it prints the mode, too (otherwise how do we
tell a regular file from a symlink?).

>> Dirty hack: for now, git fast-import will treat missing paths as empty
>> subtrees and print them as
>>
>>  040000 tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904	path/to/nowhere
>>
>> to avoid confusing frontends that inserted such a path before.
>
> Sorry, but I am not quite sure what this paragraph is trying to say.

Where $empty is the empty tree, I might expect:

	M 040000 $emptytree some/subdirectory
	ls "some/subdirectory"

to result in

	040000 tree $empty	some/subdirectory

On the other hand, I also would expect

	D some/subdirectory
	ls "some/subdirectory"

to result in

	missing some/subdirectory

Since git fast-import (like git itself) does not track empty
subdirectories, the state being reported in both these cases is the
same.  How can it tell the difference between the two?  So the patch
arbitrarily uses the former behavior (reporting nonexistent dirents as
though they were the empty subtree), which seems to work well in
practice.

It is probably be less confusing to use "missing" and let frontends
handle that rather shielding the frontend from reality.  Like so:

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Documentation/git-fast-import.txt |    7 +----
 fast-import.c                     |   32 ++++++++++++--------------
 t/t9300-fast-import.sh            |   44 ++++++++++--------------------------
 3 files changed, 29 insertions(+), 54 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index 495e01f..e1b7a0f 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -967,16 +967,13 @@ Output uses the same format as `git ls-tree <tree> {litdd} <path>`:
 The <dataref> represents the blob, tree, or commit object at <path>
 and can be used in later 'cat-blob', 'filemodify', or 'ls' commands.
 
-Git repositories do not distinguish between missing paths and empty
-subtrees.  If a path is not found, 'git fast-import' will report it as
-an empty tree.  Backends that do have a notion of empty trees may write
+If there is no file or subtree at that path, 'git fast-import' will
+instead report
 
 ====
 	missing SP <path> LF
 ====
 
-for paths that do not correspond to a blob or subtree.
-
 `feature`
 ~~~~~~~~~
 Require that fast-import supports the specified feature, or abort if
diff --git a/fast-import.c b/fast-import.c
index cef2b8c..6c37b84 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -2931,12 +2931,20 @@ static void print_ls(int mode, const unsigned char *sha1, const char *path)
 		S_ISDIR(mode) ? tree_type :
 		blob_type;
 
-	/* mode SP type SP object_name TAB path LF */
-	strbuf_reset(&line);
-	strbuf_addf(&line, "%06o %s %s\t",
-			mode, type, sha1_to_hex(sha1));
-	quote_c_style(path, &line, NULL, 0);
-	strbuf_addch(&line, '\n');
+	if (!mode) {
+		/* missing SP path LF */
+		strbuf_reset(&line);
+		strbuf_addstr(&line, "missing ");
+		quote_c_style(path, &line, NULL, 0);
+		strbuf_addch(&line, '\n');
+	} else {
+		/* mode SP type SP object_name TAB path LF */
+		strbuf_reset(&line);
+		strbuf_addf(&line, "%06o %s %s\t",
+				mode, type, sha1_to_hex(sha1));
+		quote_c_style(path, &line, NULL, 0);
+		strbuf_addch(&line, '\n');
+	}
 	cat_blob_write(line.buf, line.len);
 }
 
@@ -2978,17 +2986,7 @@ static void parse_ls(struct branch *b)
 	if (S_ISDIR(leaf.versions[1].mode))
 		store_tree(&leaf);
 
-	if (!leaf.versions[1].mode) {
-		/*
-		 * Missing path?  Must be an empty subtree!
-		 *
-		 * When git learns to track empty directories, we can report
-		 * this by saying 'missing "path/to/directory"' instead.
-		 */
-		print_ls(S_IFDIR, (const unsigned char *) EMPTY_TREE_SHA1_BIN, p);
-	} else {
-		print_ls(leaf.versions[1].mode, leaf.versions[1].sha1, p);
-	}
+	print_ls(leaf.versions[1].mode, leaf.versions[1].sha1, p);
 	if (!b || root != &b->branch_tree)
 		release_tree_entry(root);
 }
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index f70d2a9..6b1ba6c 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -934,25 +934,22 @@ test_expect_success PIPE 'N: read and copy directory' '
 	compare_diff_raw expect actual
 '
 
-test_expect_success PIPE 'N: read and copy "empty" directory' '
+test_expect_success PIPE 'N: empty directory reads as missing' '
 	cat <<-\EOF >expect &&
 	OBJNAME
-	:000000 100644 OBJNAME OBJNAME A	greeting
-	OBJNAME
-	:100644 000000 OBJNAME OBJNAME D	unrelated
-	OBJNAME
 	:000000 100644 OBJNAME OBJNAME A	unrelated
 	EOF
-	git update-ref -d refs/heads/copy-empty &&
+	echo "missing src" >expect.response &&
+	git update-ref -d refs/heads/read-empty &&
 	rm -f backflow &&
 	mkfifo backflow &&
 	(
 		exec <backflow &&
 		cat <<-EOF &&
-		commit refs/heads/copy-empty
+		commit refs/heads/read-empty
 		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
 		data <<COMMIT
-		copy "empty" (missing) directory
+		read "empty" (missing) directory
 		COMMIT
 
 		M 100644 inline src/greeting
@@ -965,33 +962,16 @@ test_expect_success PIPE 'N: read and copy "empty" directory' '
 		D src/greeting
 		ls "src"
 		EOF
-		read mode type tree filename &&
-		sed -e "s/X\$//" <<-EOF
-		M $mode $tree dst1
-		M $mode $tree dst2
-
-		commit refs/heads/copy-empty
-		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
-		data <<COMMIT
-		copy empty directory to root
-		COMMIT
-
-		M $mode $tree X
-
-		commit refs/heads/copy-empty
-		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
-		data <<COMMIT
-		add another file
-		COMMIT
-
-		M 100644 inline greeting
-		data <<BLOB
-		hello
-		BLOB
+		read -r line &&
+		printf "%s\n" "$line" >response &&
+		cat <<-\EOF
+		D dst1
+		D dst2
 		EOF
 	) |
 	git fast-import --cat-blob-fd=3 3>backflow &&
-	git rev-list copy-empty |
+	test_cmp expect.response response &&
+	git rev-list read-empty |
 	git diff-tree -r --root --stdin |
 	sed "s/$_x40/OBJNAME/g" >actual &&
 	test_cmp expect actual
-- 
1.7.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v3] fast-import: add 'ls' command
  2011-02-12  1:21   ` Jonathan Nieder
@ 2011-02-12  1:36     ` Jonathan Nieder
  0 siblings, 0 replies; 7+ messages in thread
From: Jonathan Nieder @ 2011-02-12  1:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, David Barr, Ramkumar Ramachandra, Sverre Rabbelier,
	vcs-fast-import-devs

Jonathan Nieder wrote:

> It is probably be less confusing to use "missing" and let frontends
> handle that rather shielding the frontend from reality.  Like so:
                    ^

This should say "rather than", as in: "It is confusing to lie to the
frontend; let's not do that."

Side note: this adds some complication to frontends.  For example,
to copy some/path from the commit called :5, without this change,
one could do

	echo 'ls :5 some/path'
	read -r mode type dataref path
	echo "M $mode $dataref $path"

but now one would have to do:

	echo 'ls :5 some/path'
	read -r mode rest
	case $mode in
	missing)
		echo "D some/path"
		;;
	*)
		dataref=${rest#* }
		echo "M $mode $dataref some/path"
		;;
	esac

The former is simpler; the latter is more explicit and probably less
confusing.

Sorry for the nonsense.
Jonathan

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2011-02-12  1:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-02-11 22:43 [PATCH v3] fast-import: add 'ls' command Jonathan Nieder
2011-02-11 22:47 ` Sverre Rabbelier
2011-02-11 22:59   ` Jonathan Nieder
2011-02-11 23:51     ` Sverre Rabbelier
2011-02-12  0:40 ` Junio C Hamano
2011-02-12  1:21   ` Jonathan Nieder
2011-02-12  1:36     ` Jonathan Nieder

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).