* Feature wish: Cloning without history
@ 2006-05-18 19:21 Sven Ekman
  2006-05-18 19:35 ` Jakub Narebski
  2006-05-19  3:03 ` [PATCH] built-in tar-tree and remote tar-tree Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: Sven Ekman @ 2006-05-18 19:21 UTC
  To: git

Hello,

Would it be possible to add an option to git-clone to
skip the complete history? The result should be a
repository which contains the current head only (or
maybe a specified tag) and has that commit id added to
.git/info/grafts. For the fetch process, this would
certainly have to imply the --no-tags flag.

From a user's point of view I'd imagine something like
this:

git-clone --no-history=v2.6.16 \
    git://git.kernel.org/.../linux-2.6.git

The background: I'm regularly building kernels for a
handful of machines, and while I am happy to use the
blessings of git to get updates from the -stable
releases, I see no point in wasting space for a copy
of the complete kernel history on every single
machine. In practice this works pretty well, once I
have manually created such a castrated repository.

Sven


* Re: Feature wish: Cloning without history
  2006-05-18 19:21 Feature wish: Cloning without history Sven Ekman
@ 2006-05-18 19:35 ` Jakub Narebski
  2006-05-19  3:03 ` [PATCH] built-in tar-tree and remote tar-tree Junio C Hamano
  1 sibling, 0 replies; 7+ messages in thread
From: Jakub Narebski @ 2006-05-18 19:35 UTC
  To: git

Sven Ekman wrote:

> Would it be possible to add an option to git-clone to
> skip the complete history? The result should be a
> repository which contains the current head only (or
> maybe a specified tag) and has that commit id added to
> .git/info/grafts. For the fetch process, this would
> certainly have to imply the --no-tags flag.

It is certainly a recurring request.

Search for "shallow clone" in the git mailing list archives.

-- 
Jakub Narebski
Warsaw, Poland


* [PATCH] built-in tar-tree and remote tar-tree
  2006-05-18 19:21 Feature wish: Cloning without history Sven Ekman
  2006-05-18 19:35 ` Jakub Narebski
@ 2006-05-19  3:03 ` Junio C Hamano
  2006-05-19  5:50   ` Junio C Hamano
  1 sibling, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2006-05-19  3:03 UTC
  To: git

This makes tar-tree a built-in.  As an added bonus, you can now
say:

	git tar-tree --remote=remote-repository <ent> [<base>]

This does not work with git-daemon yet, but should work with
localhost and git over ssh transports.

Signed-off-by: Junio C Hamano <junkio@cox.net>

---

 * It comes up every once in a while that somebody says he wants
   a shallow clone, but when asked "what for?" it turns out that
   what is wanted is the ability to download a tarball for a
   specific revision for building that version, so here is one.

   If people think this is a useful thing, we would need to
   teach git-daemon about git-upload-tar, and perhaps we would
   also want to compress the tar image while it goes over the
   wire, which would be more work.

 Makefile                         |    8 +++--
 tar-tree.c => builtin-tar-tree.c |   62 +++++++++++++++++++++++++++++++++++-
 builtin-upload-tar.c             |   66 ++++++++++++++++++++++++++++++++++++++
 builtin.h                        |    2 +
 git.c                            |    2 +
 5 files changed, 135 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index 2149fb8..caf1e70 100644
--- a/Makefile
+++ b/Makefile
@@ -161,7 +161,7 @@ PROGRAMS = \
 	git-receive-pack$X git-rev-parse$X \
 	git-send-pack$X git-show-branch$X git-shell$X \
 	git-show-index$X git-ssh-fetch$X \
-	git-ssh-upload$X git-tar-tree$X git-unpack-file$X \
+	git-ssh-upload$X git-unpack-file$X \
 	git-unpack-objects$X git-update-index$X git-update-server-info$X \
 	git-upload-pack$X git-verify-pack$X git-write-tree$X \
 	git-update-ref$X git-symbolic-ref$X \
@@ -170,7 +170,8 @@ PROGRAMS = \
 
 BUILT_INS = git-log$X git-whatchanged$X git-show$X \
 	git-count-objects$X git-diff$X git-push$X \
-	git-grep$X git-rev-list$X git-check-ref-format$X
+	git-grep$X git-rev-list$X git-check-ref-format$X \
+	git-tar-tree$X git-upload-tar$X
 
 # what 'all' will build and 'install' will install, in gitexecdir
 ALL_PROGRAMS = $(PROGRAMS) $(SIMPLE_PROGRAMS) $(SCRIPTS)
@@ -218,7 +219,8 @@ LIB_OBJS = \
 
 BUILTIN_OBJS = \
 	builtin-log.o builtin-help.o builtin-count.o builtin-diff.o builtin-push.o \
-	builtin-grep.o builtin-rev-list.o builtin-check-ref-format.o
+	builtin-grep.o builtin-rev-list.o builtin-check-ref-format.o \
+	builtin-tar-tree.o builtin-upload-tar.o
 
 GITLIBS = $(LIB_FILE) $(XDIFF_LIB)
 LIBS = $(GITLIBS) -lz
diff --git a/tar-tree.c b/builtin-tar-tree.c
similarity index 86%
rename from tar-tree.c
rename to builtin-tar-tree.c
index 3308736..e97e0af 100644
--- a/tar-tree.c
+++ b/builtin-tar-tree.c
@@ -7,11 +7,14 @@ #include "tree-walk.h"
 #include "commit.h"
 #include "strbuf.h"
 #include "tar.h"
+#include "builtin.h"
+#include "pkt-line.h"
 
 #define RECORDSIZE	(512)
 #define BLOCKSIZE	(RECORDSIZE * 20)
 
-static const char tar_tree_usage[] = "git-tar-tree <key> [basedir]";
+static const char tar_tree_usage[] =
+"git-tar-tree [--remote=<repo>] <ent> [basedir]";
 
 static char block[BLOCKSIZE];
 static unsigned long offset;
@@ -301,7 +304,7 @@ static void traverse_tree(struct tree_de
 	}
 }
 
-int main(int argc, char **argv)
+int generate_tar(int argc, const char **argv)
 {
 	unsigned char sha1[20], tree_sha1[20];
 	struct commit *commit;
@@ -348,3 +351,58 @@ int main(int argc, char **argv)
 	free(current_path.buf);
 	return 0;
 }
+
+static const char *exec = "git-upload-tar";
+
+static int remote_tar(int argc, const char **argv)
+{
+	int fd[2], ret, len;
+	pid_t pid;
+	char buf[1024];
+	char *url;
+
+	if (argc < 3 || 4 < argc)
+		usage(tar_tree_usage);
+
+	/* --remote=<repo> */
+	url = strdup(argv[1]+9);
+	pid = git_connect(fd, url, exec);
+	if (pid < 0)
+		return 1;
+
+	packet_write(fd[1], "want %s\n", argv[2]);
+	if (argv[3])
+		packet_write(fd[1], "base %s\n", argv[3]);
+	packet_flush(fd[1]);
+
+	len = packet_read_line(fd[0], buf, sizeof(buf));
+	if (!len)
+		die("git-tar-tree: expected ACK/NAK, got EOF");
+	if (buf[len-1] == '\n')
+		buf[--len] = 0;
+	if (strcmp(buf, "ACK")) {
+		if (5 < len && !strncmp(buf, "NACK ", 5))
+			die("git-tar-tree: NACK %s", buf + 5);
+		die("git-tar-tree: protocol error");
+	}
+	/* expect a flush */
+	len = packet_read_line(fd[0], buf, sizeof(buf));
+	if (len)
+		die("git-tar-tree: expected a flush");
+
+	/* Now, start reading from fd[0] and spit it out to stdout */
+	ret = copy_fd(fd[0], 1);
+	close(fd[0]);
+
+	ret |= finish_connect(pid);
+	return !!ret;
+}
+
+int cmd_tar_tree(int argc, const char **argv, char **envp)
+{
+	if (argc < 2)
+		usage(tar_tree_usage);
+	if (!strncmp("--remote=", argv[1], 9))
+		return remote_tar(argc, argv);
+	return generate_tar(argc, argv);
+}
diff --git a/builtin-upload-tar.c b/builtin-upload-tar.c
new file mode 100644
index 0000000..883b5aa
--- /dev/null
+++ b/builtin-upload-tar.c
@@ -0,0 +1,66 @@
+/*
+ * Copyright (c) 2006 Junio C Hamano
+ */
+#include "cache.h"
+#include "pkt-line.h"
+#include "exec_cmd.h"
+#include "builtin.h"
+
+static const char upload_tar_usage[] = "git-upload-tar <repo>";
+
+static int nack(const char *reason)
+{
+	packet_write(1, "NACK %s\n", reason);
+	packet_flush(1);
+	return 1;
+}
+
+int cmd_upload_tar(int argc, const char **argv, char **envp)
+{
+	int len;
+	char *dir = argv[1];
+	char buf[8129];
+	unsigned char sha1[20];
+	char *base = NULL;
+	char hex[41];
+	int ac;
+	const char *av[4];
+
+	if (argc != 2)
+		usage(upload_tar_usage);
+	if (!enter_repo(dir, 0))
+		return nak("not a git archive", dir);
+	len = packet_read_line(0, buf, sizeof(buf));
+	if (len < 5 || strncmp("want ", buf, 5))
+		return nak("expected want");
+	if (buf[len-1] == '\n')
+		buf[--len] = 0;
+	if (get_sha1(buf + 5, sha1))
+		return nak("expected sha1");
+	strcpy(hex, sha1_to_hex(sha1));
+
+	len = packet_read_line(0, buf, sizeof(buf));
+	if (len) {
+		if (len < 5 || strncmp("base ", buf, 5))
+			return nak("expected (optional) base");
+		if (buf[len-1] == '\n')
+			buf[--len] = 0;
+		base = strdup(buf + 5);
+		len = packet_read_line(0, buf, sizeof(buf));
+	}
+	if (len)
+		return nak("expected flush");
+
+	packet_write(1, "ACK\n");
+	packet_flush(1);
+
+	ac = 0;
+	av[ac++] = "tar-tree";
+	av[ac++] = hex;
+	if (base)
+		av[ac++] = base;
+	av[ac++] = NULL;
+	execv_git_cmd(av);
+	/* should it return that is an error */
+	return 1;
+}
diff --git a/builtin.h b/builtin.h
index ff559de..3ed5d65 100644
--- a/builtin.h
+++ b/builtin.h
@@ -26,5 +26,7 @@ extern int cmd_push(int argc, const char
 extern int cmd_grep(int argc, const char **argv, char **envp);
 extern int cmd_rev_list(int argc, const char **argv, char **envp);
 extern int cmd_check_ref_format(int argc, const char **argv, char **envp);
+extern int cmd_tar_tree(int argc, const char **argv, char **envp);
+extern int cmd_upload_tar(int argc, const char **argv, char **envp);
 
 #endif
diff --git a/git.c b/git.c
index d0650bb..79d81b1 100644
--- a/git.c
+++ b/git.c
@@ -50,6 +50,8 @@ static void handle_internal_command(int 
 		{ "count-objects", cmd_count_objects },
 		{ "diff", cmd_diff },
 		{ "grep", cmd_grep },
+		{ "tar-tree", cmd_tar_tree },
+		{ "upload-tar", cmd_upload_tar },
 		{ "rev-list", cmd_rev_list },
 		{ "check-ref-format", cmd_check_ref_format }
 	};
-- 
1.3.3.gfad60
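
For readers following the wire exchange in the patch above: the client sends "want <ent>", an optional "base <base>", and then a flush. A minimal sketch of how such a request could be framed, assuming the usual pkt-line layout (four hex digits giving the total length including the header, then the payload, with "0000" as the flush packet). The helper names pkt_append, pkt_flush and build_tar_request are made up for illustration and are not functions from git's pkt-line.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append one pkt-line at buf + *off: four hex digits holding the total
 * length (header included), then the payload.  Hypothetical helper.
 * Returns 0 on success, -1 if the packet would not fit.
 */
static int pkt_append(char *buf, size_t cap, size_t *off, const char *payload)
{
	size_t n = strlen(payload) + 4;

	if (n > 0xffff || *off + n + 1 > cap)
		return -1;
	snprintf(buf + *off, cap - *off, "%04zx%s", n, payload);
	*off += n;
	return 0;
}

/* Append a flush packet, which is the literal string "0000". */
static int pkt_flush(char *buf, size_t cap, size_t *off)
{
	if (*off + 4 + 1 > cap)
		return -1;
	memcpy(buf + *off, "0000", 5);	/* copies the trailing NUL as well */
	*off += 4;
	return 0;
}

/*
 * Build the request a remote tar-tree client would send: "want <ent>",
 * an optional "base <base>", then a flush.  Returns the number of bytes
 * written (not counting the NUL), or -1 on overflow.
 */
static int build_tar_request(char *buf, size_t cap,
			     const char *ent, const char *base)
{
	char line[256];
	size_t off = 0;
	int len;

	assert(buf != NULL && ent != NULL);

	len = snprintf(line, sizeof(line), "want %s\n", ent);
	if (len < 0 || (size_t)len >= sizeof(line))
		return -1;
	if (pkt_append(buf, cap, &off, line))
		return -1;

	if (base) {
		len = snprintf(line, sizeof(line), "base %s\n", base);
		if (len < 0 || (size_t)len >= sizeof(line))
			return -1;
		if (pkt_append(buf, cap, &off, line))
			return -1;
	}

	if (pkt_flush(buf, cap, &off))
		return -1;
	return (int)off;
}
```

Calling build_tar_request(buf, sizeof(buf), "v2.6.16", NULL) fills buf with a single "want" packet followed by the flush. The server side in the patch reads the same two or three packets back with packet_read_line().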


* Re: [PATCH] built-in tar-tree and remote tar-tree
  2006-05-19  3:03 ` [PATCH] built-in tar-tree and remote tar-tree Junio C Hamano
@ 2006-05-19  5:50   ` Junio C Hamano
  2006-05-19 21:43     ` Sven Ekman
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2006-05-19  5:50 UTC
  To: git

Sorry for sending a crapoid that does not even compile.  I ran
format-patch while on the wrong branch.

Tonight's "pu" will have a fixed-up one for people who are
interested in playing with it.


* Re: [PATCH] built-in tar-tree and remote tar-tree
  2006-05-19  5:50   ` Junio C Hamano
@ 2006-05-19 21:43     ` Sven Ekman
  2006-05-19 21:56       ` Jakub Narebski
  2006-05-19 22:56       ` Junio C Hamano
  0 siblings, 2 replies; 7+ messages in thread
From: Sven Ekman @ 2006-05-19 21:43 UTC
  To: Junio C Hamano; +Cc: git

Junio C Hamano <junkio@cox.net> wrote:

> Sorry for sending a crapoid that does not even
> compile.  I ran format-patch while on a wrong 
> branch.
> 
> Tonight's "pu" will have a fixed up one for 
> people who are interested to play with.

Hi Junio,

Thanks for your answer. I'm looking forward to trying the
remote tar-tree at the weekend.  While this will
definitely make it much easier to grab single
revisions out of a git tree, it only solves a part of
the issue I was trying to address. If one wanted a
simple snapshot of the source, it's usually easier to
download a tarball.

The great thing about having the kernel source in a
git repository is that it lets me upgrade the kernel
source tree in place with a single simple command. No
firing up a browser, downloading and applying a patch.
It is also braindead easy to maintain a set of local
patches across different kernel versions. Git makes
all this quick and easy.

Is there a simple way to retrieve a single object or a
list of objects _without_ any of their parents? If so,
one could retrieve the wanted commit and the
corresponding tree and parse it on the client side to
retrieve its descendants and commits.  In that case the
number of roundtrips would be roughly proportional to
the depth of the trees, which would probably still be
acceptable.

Greetings, Sven


* Re: [PATCH] built-in tar-tree and remote tar-tree
  2006-05-19 21:43     ` Sven Ekman
@ 2006-05-19 21:56       ` Jakub Narebski
  2006-05-19 22:56       ` Junio C Hamano
  1 sibling, 0 replies; 7+ messages in thread
From: Jakub Narebski @ 2006-05-19 21:56 UTC
  To: git

Sven Ekman wrote:

> Is there a simple way to retrieve a single object or a
> list of objects _without_ any of their parents? If so
> one could retrieve the wanted commit and the
> corresponding tree and parse it on the client side to
> retrieve its descendents and commits.  If so, the
> number of roundtrips would be roughly proportional to
> the depth of the trees, which would probably still be
> acceptable.

Perhaps an alternates file which points to a _remote_ git repository,
leaving all unused objects in the remote repository, at the cost of
needing constant net access to it for almost all operations?

-- 
Jakub Narebski
Warsaw, Poland


* Re: [PATCH] built-in tar-tree and remote tar-tree
  2006-05-19 21:43     ` Sven Ekman
  2006-05-19 21:56       ` Jakub Narebski
@ 2006-05-19 22:56       ` Junio C Hamano
  1 sibling, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2006-05-19 22:56 UTC
  To: Sven Ekman; +Cc: git

Sven Ekman <svekman@yahoo.se> writes:

> Thanks for your answer. I'm looking forward to try the
> remote tar-tree at the weekend.

A word of caution.  This obviously needs the updated stuff on
both ends.  The downloader needs to have updated tar-tree, and
the other end needs the new command upload-tar.

> Is there a simple way to retrieve a single object or a
> list of objects _without_ any of their parents?

The thing is, I do not think that is what you really want.

If you do not have the necessary parents, many of the benefits
you list as "why do I want kernel from git repo" would not work.
The next fetch will try to see where the common ancestry commit
is, in order to download only from that one, for example. For
that you would need well-formed repositories on both ends.

Obviously bisect and anything that deals with the traversal of
the ancestry chain would break.  You might say "I accept that
some things may not work", but their failure modes do not even
consider that the user might start from such an incomplete
repository to begin with.  So one thing is that the user would
be very confused, and another is that I would not be surprised
if some operations further "corrupt" such an already incomplete
repository (fsck, prune and probably bisect when it tries to
rewind the bisection branch -- your branch head may point at
nowhere and the user might need to do manual update-ref instead
of "git checkout master" to recover from it), causing further
grief.

In other words, to support such partial/incomplete repositories
properly, you are talking about a major major surgery.  I just
do not want to think about it right now.

On the other hand, the primary point of my patch is that the
result does _not_ pretend it is a proper git repository, so we
do not have to worry about all the above issues.
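
To make the ancestry-traversal point concrete, here is a toy sketch of a walk over a commit chain in which each commit records its parent by id. It is not git's revision-walking code. The struct, the single-parent simplification, and the names lookup and walk_history are all assumptions made for illustration. The walk succeeds on a complete store and fails as soon as a recorded parent is absent from it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy commit: one parent only, identified by name (an assumption). */
struct commit {
	const char *id;
	const char *parent_id;	/* NULL for a true root commit */
};

/* Find a commit by id in a flat array standing in for the object store. */
static const struct commit *lookup(const struct commit *store, size_t n,
				   const char *id)
{
	for (size_t i = 0; i < n; i++)
		if (!strcmp(store[i].id, id))
			return &store[i];
	return NULL;
}

/*
 * Walk from tip_id towards the root.  Returns the number of commits
 * visited, or -1 if some commit names a parent that is not in the
 * store, which is the situation an incomplete repository creates.
 */
static int walk_history(const struct commit *store, size_t n,
			const char *tip_id)
{
	const char *id = tip_id;
	int count = 0;

	assert(store != NULL || n == 0);

	while (id) {
		const struct commit *c = lookup(store, n, id);

		if (!c)
			return -1;	/* parent recorded, object missing */
		count++;
		id = c->parent_id;
	}
	return count;
}
```

With a store holding c, b and a (c's parent is b, b's parent is a), walk_history() from "c" visits three commits. With a store holding only c, the same call returns -1 because b cannot be found. A graft entry, as suggested at the start of the thread, would correspond to recording c with a NULL parent so that the walk stops cleanly at c.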
