* [PATCH] git fast-export: add --no-data option
@ 2009-07-25 13:45 Geoffrey Irving
From: Geoffrey Irving @ 2009-07-25 13:45 UTC
  To: git, Junio C Hamano

When using git fast-export and git fast-import to rewrite the history
of a repository with large binary files, almost all of the time is
spent dealing with blobs.  This is extremely inefficient if all we want
to do is rewrite the commits and tree structure.  --no-data skips the
output of blobs and writes SHA-1s instead of marks, which provides a
massive speedup.

Signed-off-by: Geoffrey Irving <irving@naml.us>
---

I've already done all I need with this change (for now, at least), but
here it is in case it proves useful to others.  Amusingly, rewriting
history with

    git fast-export --no-data <branch> | <python-script> | git fast-import

is now much, much faster than the equivalent

    git filter-branch --prune-empty --msg-filter ...

I haven't investigated why.
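
For readers who want to try the pipeline above, here is a minimal sketch of
what the <python-script> stage could look like. It is not the script I
actually used; the names rewrite_messages, strip_trailing_whitespace and
main are hypothetical. It assumes the input comes from
git fast-export --no-data, so the stream contains no blob commands and every
"data" payload is a commit or tag message, and it assumes only the
exact-byte-count form "data <n>" appears.

```python
import sys


def rewrite_messages(src, dst, transform):
    """Copy a fast-export stream from src to dst, rewriting data payloads.

    Assumption: the stream was produced with --no-data, so each `data`
    command carries a commit or tag message rather than file contents.
    """
    while True:
        line = src.readline()
        if not line:
            break  # end of stream
        if line.startswith(b"data "):
            # `data <n>` is followed by exactly n raw bytes.
            size = int(line[len(b"data "):])
            payload = src.read(size)
            new_payload = transform(payload)
            # Re-emit the header with the new length, then the payload.
            dst.write(b"data %d\n" % len(new_payload))
            dst.write(new_payload)
        else:
            # Everything else (commit, mark, author, from, M <mode> <sha1>
            # <path>, ...) passes through unchanged.
            dst.write(line)


def strip_trailing_whitespace(message):
    # Hypothetical example transform: drop trailing whitespace from the
    # message and end it with a single newline.
    return message.rstrip() + b"\n"


def main():
    # Binary stdin/stdout, since the stream is byte-counted.
    rewrite_messages(sys.stdin.buffer, sys.stdout.buffer,
                     strip_trailing_whitespace)
```

Calling main() at the bottom of the script turns it into the middle stage of
the pipeline. Because the "M" lines carry SHA-1s rather than marks, the
filter never has to read or copy file contents.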

 Documentation/git-fast-export.txt |    7 +++++++
 builtin-fast-export.c             |    8 +++++++-
 2 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-fast-export.txt b/Documentation/git-fast-export.txt
index 0c9eb56..47a96dd 100644
--- a/Documentation/git-fast-export.txt
+++ b/Documentation/git-fast-export.txt
@@ -71,6 +71,13 @@ marks the same across runs.
 	allow that.  So fake a tagger to be able to fast-import the
 	output.

+--no-data::
+	Skip output of blob objects and instead refer to blobs via
+	their original SHA-1 hash.  This is useful when rewriting the
+	directory structure or history of a repository without
+	touching the contents of individual files.  Note that the
+	resulting stream can only be used by a repository which
+	already contains the necessary objects.

 EXAMPLES
 --------
diff --git a/builtin-fast-export.c b/builtin-fast-export.c
index 9a8a6fc..ac72791 100644
--- a/builtin-fast-export.c
+++ b/builtin-fast-export.c
@@ -25,6 +25,7 @@ static const char *fast_export_usage[] = {
 static int progress;
 static enum { VERBATIM, WARN, STRIP, ABORT } signed_tag_mode = ABORT;
 static int fake_missing_tagger;
+static int no_data;

 static int parse_opt_signed_tag_mode(const struct option *opt,
 				     const char *arg, int unset)
@@ -101,6 +102,9 @@ static void handle_object(const unsigned char *sha1)
 	char *buf;
 	struct object *object;

+	if (no_data)
+		return;
+
 	if (is_null_sha1(sha1))
 		return;

@@ -158,7 +162,7 @@ static void show_filemodify(struct diff_queue_struct *q,
 			 * Links refer to objects in another repositories;
 			 * output the SHA-1 verbatim.
 			 */
-			if (S_ISGITLINK(spec->mode))
+			if (no_data || S_ISGITLINK(spec->mode))
 				printf("M %06o %s %s\n", spec->mode,
 				       sha1_to_hex(spec->sha1), spec->path);
 			else {
@@ -504,6 +508,8 @@ int cmd_fast_export(int argc, const char **argv, const char *prefix)
 			     "Import marks from this file"),
 		OPT_BOOLEAN(0, "fake-missing-tagger", &fake_missing_tagger,
 			     "Fake a tagger when tags lack one"),
+		OPT_BOOLEAN(0, "no-data", &no_data,
+			     "Skip output of blob data"),
 		OPT_END()
 	};

-- 
1.6.3.1
