All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Nieder <jrnieder@gmail.com>
To: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: git@vger.kernel.org, "Shawn O. Pearce" <spearce@spearce.org>,
	Sverre Rabbelier <srabbelier@gmail.com>,
	David Barr <david.barr@cordelta.com>, Sam Vilain <sam@vilain.net>
Subject: [PATCH 3/3] fast-import: Let importers retrieve the objects being written
Date: Sat, 4 Sep 2010 22:41:17 -0500	[thread overview]
Message-ID: <20100905034116.GD2344@burratino> (raw)
In-Reply-To: <20100905031528.GA2344@burratino>

As the description of the "progress" option in git-fast-import.1
hints, there is no convenient way to immediately access the objects
written to a new repository through fast-import.  Until a checkpoint
has been started and finishes writing the pack index, any new blobs,
trees, and commits will not be accessible using standard git tools.

So introduce another way: a "cat" command introduced in the command
stream requests for fast-import to print an object to the same
report-fd stream used to report commits being written.

The output uses the same format as "git cat-file --batch".

Like cat-file --batch, this does not provide an option to dereference
objects to a type of the requestor's choosing.  Tags are presented
as tags, commits as commits, and trees as trees.

Objects can be specified by path within a tree as well, using a

 cat TREE "PATH"

syntax.  With this syntax, also, the tree can only be specified by
:n marker or 40-digit tree id.

Cc: Shawn O. Pearce <spearce@spearce.org>
Cc: David Barr <david.barr@cordelta.com>
Cc: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
That's the end of the series.  As with patch 2, I'm not thrilled
with the interface, but I hope it can get the job done for now.

 Documentation/git-fast-import.txt |   34 +++++++++++-
 fast-import.c                     |  108 +++++++++++++++++++++++++++++++++++
 t/t9300-fast-import.sh            |  114 +++++++++++++++++++++++++++++++++++++
 3 files changed, 255 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt
index e217635..2cf48f5 100644
--- a/Documentation/git-fast-import.txt
+++ b/Documentation/git-fast-import.txt
@@ -102,7 +102,8 @@ OPTIONS
 +
 The described objects are not necessarily accessible
 using standard git plumbing tools until a little while
-after the next checkpoint.
+after the next checkpoint.  To request access to the
+objects before then, use `cat` lines in the command stream.
 
 --export-pack-edges=<file>::
 	After creating a packfile, print a line of data to
@@ -332,6 +333,10 @@ and control the current import process.  More detailed discussion
 	standard output.  This command is optional and is not needed
 	to perform an import.
 
+`cat`::
+	Causes fast-import to print an object in 'cat-file --batch'
+	format to the file descriptor set with "feature report-fd".
+
 `feature`::
 	Require that fast-import supports the specified feature, or
 	abort if it does not.
@@ -888,6 +893,33 @@ Placing a `progress` command immediately after a `checkpoint` will
 inform the reader when the `checkpoint` has been completed and it
 can safely access the refs that fast-import updated.
 
+`cat`
+~~~~~
+Causes fast-import to print an object to a file descriptor
+previously arranged with the `--report-fd` option.  The command
+otherwise has no impact on the current import; its main purpose is to
+retrieve objects that may be in fast-import's memory but not
+accessible from the target repository a little quicker than by the
+method suggested by the description of the `progress` option.
+
+....
+	'cat' SP <dataref> LF
+....
+
+The `<dataref>` can be either a mark reference (`:<idnum>`)
+set previously, or a full 40-byte SHA-1 of any Git object,
+preexisting or ready to be written.
+
+If `<dataref>` refers to a tree object, it may be followed by
+a path within that tree to retrieve a subtree or blob.
+
+....
+	'cat' SP <treeref> SP <path> LF
+....
+
+A `<path>` string should be surrounded with quotation marks and
+use C-style escaping.
+
 `feature`
 ~~~~~~~~~
 Require that fast-import supports the specified feature, or abort if
diff --git a/fast-import.c b/fast-import.c
index ef0cee7..b7fa9ae 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -55,6 +55,9 @@ Format of STDIN stream:
     ('from' sp committish lf)?
     lf?;
 
+  cat_request ::= 'cat' sp (hexsha1 | idnum) lf
+    | 'cat' sp hexsha1 sp path_str lf;
+
   checkpoint ::= 'checkpoint' lf
     lf?;
 
@@ -2688,6 +2691,109 @@ static void parse_reset_branch(void)
 		unread_command_buf = 1;
 }
 
+static void quoted_path_sha1(unsigned char sha1[20], struct tree_entry *root,
+				const char *path, const char *line)
+{
+	struct strbuf uq = STRBUF_INIT;
+	struct tree_entry leaf = {0};
+	const char *x;
+
+	if (unquote_c_style(&uq, path, &x))
+		die("Invalid path: %s", line);
+	if (*x)
+		die("Garbage after path: %s", line);
+	tree_content_get(root, uq.buf, &leaf);
+	if (!leaf.versions[1].mode)
+		die("Path %s not in branch", uq.buf);
+	hashcpy(sha1, leaf.versions[1].sha1);
+}
+
+static void sendreport(const char *buf, unsigned long size)
+{
+	if (write_in_full(report_fd, buf, size) != size)
+		die_errno("Write to frontend failed");
+}
+
+static void cat_object(struct object_entry *oe, unsigned char sha1[20])
+{
+	struct strbuf line = STRBUF_INIT;
+	unsigned long size;
+	enum object_type type = 0;
+	char *buf;
+
+	if (report_fd < 0)
+		die("Internal error: bad report_fd %d", report_fd);
+	if (oe && oe->pack_id != MAX_PACK_ID) {
+		type = oe->type;
+		buf = gfi_unpack_entry(oe, &size);
+	} else {
+		buf = read_sha1_file(sha1, &type, &size);
+	}
+	if (!buf)
+		die("Can't read object %s", sha1_to_hex(sha1));
+
+	/*
+	 * Output based on batch_one_object() from cat-file.c.
+	 */
+	if (type <= 0) {
+		strbuf_reset(&line);
+		strbuf_addf(&line, "%s missing\n", sha1_to_hex(sha1));
+		if (write_in_full(report_fd, line.buf, line.len) != line.len)
+			die_errno("Write to frontend failed 1");
+	}
+	strbuf_reset(&line);
+	strbuf_addf(&line, "%s %s %lu\n", sha1_to_hex(sha1),
+						typename(type), size);
+	sendreport(line.buf, line.len);
+	sendreport(buf, size);
+	sendreport("\n", 1);
+	free(buf);
+}
+
+
+static void parse_cat_request(void)
+{
+	const char *p;
+	struct object_entry *oe = oe;
+	unsigned char sha1[20];
+	struct tree_entry root = {0};
+
+	/* cat SP <object> */
+	p = command_buf.buf + strlen("cat ");
+	if (report_fd < 0)
+		die("The cat command features the report-fd feature.");
+	if (*p == ':') {
+		char *x;
+		oe = find_mark(strtoumax(p + 1, &x, 10));
+		if (x == p + 1)
+			die("Invalid mark: %s", command_buf.buf);
+		if (!oe)
+			die("Unknown mark: %s", command_buf.buf);
+		p = x;
+		hashcpy(sha1, oe->idx.sha1);
+	} else {
+		if (get_sha1_hex(p, sha1))
+			die("Invalid SHA1: %s", command_buf.buf);
+		p += 40;
+		if (!*p)
+			oe = find_object(sha1);
+	}
+
+	/* [ SP "<path>" ] */
+	if (*p) {
+		if (*p++ != ' ')
+			die("Missing space after SHA1: %s", command_buf.buf);
+
+		/* cat <tree> "<path>" form. */
+		hashcpy(root.versions[1].sha1, sha1);
+		load_tree(&root);
+		quoted_path_sha1(sha1, &root, p, command_buf.buf);
+		oe = find_object(sha1);
+	}
+
+	cat_object(oe, sha1);
+}
+
 static void parse_checkpoint(void)
 {
 	if (object_count) {
@@ -2971,6 +3077,8 @@ int main(int argc, const char **argv)
 			parse_new_tag();
 		else if (!prefixcmp(command_buf.buf, "reset "))
 			parse_reset_branch();
+		else if (!prefixcmp(command_buf.buf, "cat "))
+			parse_cat_request();
 		else if (!strcmp("checkpoint", command_buf.buf))
 			parse_checkpoint();
 		else if (!prefixcmp(command_buf.buf, "progress "))
diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 610e7a5..82e03e8 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -1705,6 +1705,120 @@ test_expect_success PIPE 'R: feature report-fd is honoured' '
 	test_cmp real received
 '
 
+test_expect_success PIPE 'R: report-fd: can feed back printed tree' '
+	cat >frontend <<-\FRONTEND_END &&
+		#!/bin/sh
+		cat <<EOF &&
+		feature report-fd=3
+		commit refs/heads/printed
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		to be printed
+		COMMIT
+
+		from refs/heads/master
+		D file3
+
+		EOF
+
+		read commit_id <&3 &&
+		echo "$commit_id" >printed &&
+		echo "$commit_id commit" >expect.response &&
+		echo "cat $commit_id" &&
+		read cid2 type size <&3 &&
+		echo "$cid2 $type" >response &&
+		dd if=/dev/stdin of=commit bs=1 count=$size <&3 &&
+		read newline <&3 &&
+		read tree tree_id <commit &&
+
+		cat <<EOF &&
+		commit refs/heads/printed
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		to be printed
+		COMMIT
+
+		from refs/heads/printed^0
+		M 040000 $tree_id old
+
+		EOF
+		read cid <&3
+	FRONTEND_END
+	chmod +x frontend &&
+
+	mkfifo commits &&
+	test_when_finished "rm -f commits" &&
+	(
+		{
+			sh frontend 3<commits ||
+			exit
+		} |
+		git fast-import 3>commits
+	) &&
+	git rev-parse printed^ >expect.printed &&
+	git cat-file commit printed^ >expect.commit &&
+
+	test_cmp expect.printed printed &&
+	test_cmp expect.response response &&
+	test_cmp expect.commit commit
+'
+
+test_expect_success PIPE 'R: report-fd: can feed back printed blob' '
+	cat >expect <<-EOF &&
+	:100755 100644 $file6_id $file6_id C100	newdir/exec.sh	file6
+	EOF
+
+	cat >frontend <<-\FRONTEND_END &&
+		#!/bin/sh
+
+		branch=$(git rev-parse --verify refs/heads/branch) &&
+		cat <<EOF &&
+		feature report-fd=3
+		cat $branch
+		EOF
+
+		read commit_id type size <&3 &&
+		dd if=/dev/stdin of=commit bs=1 count=$size <&3 &&
+		read newline <&3 &&
+		read tree tree_id <commit &&
+
+		echo "cat $tree_id \"newdir/exec.sh\"" &&
+		read blob_id type size <&3 &&
+		dd if=/dev/stdin of=blob bs=1 count=$size <&3 &&
+		read newline <&3 &&
+
+		cat <<EOF &&
+		commit refs/heads/copyblob
+		committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+		data <<COMMIT
+		copy file6 to top level
+		COMMIT
+
+		from refs/heads/branch^0
+		M 644 inline "file6"
+		data $size
+		EOF
+		cat blob &&
+		echo &&
+		echo &&
+
+		read cid <&3
+	FRONTEND_END
+	chmod +x frontend &&
+
+	mkfifo commits &&
+	test_when_finished "rm -f commits" &&
+	(
+		{
+			sh frontend 3<commits ||
+			exit
+		} |
+		git fast-import 3>commits
+	) &&
+	git diff-tree -C --find-copies-harder -r copyblob^ copyblob >actual &&
+	compare_diff_raw expect actual
+'
+
 test_expect_success 'R: quiet option results in no stats being output' '
 	>empty &&
 	cat >input <<-\EOF &&
-- 
1.7.2.3

  parent reply	other threads:[~2010-09-05  3:43 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-01  3:18 [PATCH/RFC] Teach fast-import to import subtrees named by tree id Jonathan Nieder
2010-07-01  5:48 ` [WIP/PATCH] Teach fast-import to print the id of each imported commit Jonathan Nieder
2010-07-02  3:16   ` Sverre Rabbelier
2010-07-02  3:41     ` Jonathan Nieder
2010-07-02  4:29       ` Sverre Rabbelier
2010-07-02  5:12   ` Jonathan Nieder
2010-07-02 14:55     ` Sverre Rabbelier
2010-07-02 15:40       ` Jonathan Nieder
2010-07-02 15:48         ` Sverre Rabbelier
2010-07-04  0:02         ` Sam Vilain
2010-07-04  0:35           ` Jonathan Nieder
2010-07-04  3:44             ` Sam Vilain
2010-07-04  7:22               ` Jonathan Nieder
2010-08-17 17:02   ` Ramkumar Ramachandra
2010-09-05  3:15     ` [RFC/PATCH 0/3] fast-import: give importers access to the object store Jonathan Nieder
2010-09-05  3:22       ` [PATCH 1/3] t9300 (fast-import): style tweaks Jonathan Nieder
2010-09-24  6:59         ` [PATCH/RFC 00/24] " Jonathan Nieder
2010-09-24  7:04           ` [PATCH 01/24] t9300 (fast-import): avoid exiting early on failure Jonathan Nieder
2010-09-24  7:05           ` [PATCH 02/24] t9300 (fast-import): avoid hard-coded object names Jonathan Nieder
2010-09-24  7:09           ` [PATCH 03/24] t9300 (fast-import): guard "export large marks" test setup Jonathan Nieder
2010-09-24  9:38             ` Ramkumar Ramachandra
2010-09-24 10:56               ` Raja R Harinath
2010-09-24 10:34             ` Ramkumar Ramachandra
2010-09-24 11:01               ` Raja R Harinath
2010-09-24  7:11           ` [PATCH 04/24] t9300 (fast-import): check exit status from upstream of pipes Jonathan Nieder
2010-09-24  7:11           ` [PATCH 05/24] t9300 (fast-import): check exit status from command substitutions Jonathan Nieder
2010-09-24  7:12           ` [PATCH 06/24] t9300 (fast-import): use test_cmp in place of test $(foo) = $(bar) Jonathan Nieder
2010-09-24  7:13           ` [PATCH 07/24] t9300 (fast-import): use tabs to indent Jonathan Nieder
2010-09-24  8:54             ` Ramkumar Ramachandra
2010-09-24  9:21               ` Jonathan Nieder
2010-09-24  7:16           ` [PATCH 08/24] t9300 (fast-import), series A: re-indent Jonathan Nieder
2010-09-24  7:22             ` Sverre Rabbelier
2010-09-24  7:35               ` Jonathan Nieder
2010-09-24  7:18           ` [PATCH 09/24] t9300 (fast-import), series B: re-indent Jonathan Nieder
2010-09-24  7:19           ` [PATCH 10/24] t9300 (fast-import), series C: re-indent Jonathan Nieder
2010-09-24  7:19           ` [PATCH 11/24] t9300 (fast-import), series D: re-indent Jonathan Nieder
2010-09-24  7:21           ` [PATCH 12/24] t9300 (fast-import), series E: re-indent Jonathan Nieder
2010-09-24  7:22           ` [PATCH 13/24] t9300 (fast-import), series F: re-indent Jonathan Nieder
2010-09-24  7:22           ` [PATCH 14/24] t9300 (fast-import), series H: re-indent Jonathan Nieder
2010-09-24  7:23           ` [PATCH 15/24] t9300 (fast-import), series I: re-indent Jonathan Nieder
2010-09-24  7:24           ` [PATCH 16/24] t9300 (fast-import), series J: re-indent Jonathan Nieder
2010-09-24  7:25           ` [PATCH 17/24] t9300 (fast-import), series K: re-indent Jonathan Nieder
2010-09-24  7:25           ` [PATCH 18/24] t9300 (fast-import), series L: re-indent Jonathan Nieder
2010-09-24  7:26           ` [PATCH 19/24] t9300 (fast-import), series M: re-indent Jonathan Nieder
2010-09-24  7:26           ` [PATCH 20/24] t9300 (fast-import), series N: re-indent Jonathan Nieder
2010-09-24  7:27           ` [PATCH 21/24] t9300 (fast-import), series O: re-indent Jonathan Nieder
2010-09-24  7:27           ` [PATCH 22/24] t9300 (fast-import), series P: re-indent Jonathan Nieder
2010-09-24  7:28           ` [PATCH 23/24] t9300 (fast-import), series Q: re-indent Jonathan Nieder
2010-09-24  7:30           ` [PATCH 24/24] t9300 (fast-import), series R: re-indent Jonathan Nieder
2010-09-25  5:19           ` svn-fe status Jonathan Nieder
2010-09-25 10:25             ` Sverre Rabbelier
2010-09-27  2:54               ` Jonathan Nieder
2010-09-27  9:15                 ` Sverre Rabbelier
2010-09-05  3:29       ` [PATCH 2/3] Teach fast-import to print the id of each imported commit Jonathan Nieder
2010-09-05  3:41       ` Jonathan Nieder [this message]
2010-09-05  6:08       ` [RFC/PATCH 0/3] fast-import: give importers access to the object store Ramkumar Ramachandra
2010-09-05  6:28         ` Sverre Rabbelier
2010-09-05  8:47           ` Ramkumar Ramachandra
2010-09-05 16:20             ` Sverre Rabbelier
2010-09-05 17:31         ` Jonathan Nieder
2010-09-08  3:13       ` [PATCH 4/3] fast-import: typofix Jonathan Nieder
2010-09-08  3:17       ` [PATCH 5/3] fast-import: allow cat command with empty path Jonathan Nieder
2010-09-08  3:27       ` [PATCH 6/3] fast-import: Allow cat requests at arbitrary points in stream Jonathan Nieder
2010-09-08  3:38         ` Sverre Rabbelier
2010-09-08  3:57           ` Jonathan Nieder
2010-09-08 10:16         ` Ramkumar Ramachandra
2010-09-16  0:14       ` [RFC/PATCH 0/3] fast-import: give importers access to the object store Sam Vilain
2010-09-17 23:24         ` Sverre Rabbelier
2010-09-24 19:43         ` Jonathan Nieder
2010-09-24 23:44           ` Sverre Rabbelier
2010-09-25  0:01             ` Jonathan Nieder
2010-09-25  0:17               ` Sverre Rabbelier
2010-07-02  3:20 ` [PATCH/RFC] Teach fast-import to import subtrees named by tree id Sverre Rabbelier
2010-07-02  4:42   ` Jonathan Nieder
2010-07-02 12:44 ` Ramkumar Ramachandra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100905034116.GD2344@burratino \
    --to=jrnieder@gmail.com \
    --cc=artagnon@gmail.com \
    --cc=david.barr@cordelta.com \
    --cc=git@vger.kernel.org \
    --cc=sam@vilain.net \
    --cc=spearce@spearce.org \
    --cc=srabbelier@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.