git.vger.kernel.org archive mirror
* [PATCH] builtin-fast-export: Remove double spaces in author line
@ 2008-05-30 10:31 Pieter de Bie
  2008-05-30 20:27 ` Junio C Hamano
  0 siblings, 1 reply; 2+ messages in thread
From: Pieter de Bie @ 2008-05-30 10:31 UTC (permalink / raw)
  To: git; +Cc: Pieter de Bie

At least the Samba repository has an extra space after the author's email
in some commits. builtin-fast-export passes that space through, but the
fast-import format allows only a single space there.

This removes double spaces in committer and author lines.
---

This is an ugly hack and obviously not meant for application (which is why
I didn't sign it off). 

The Samba thing can be seen by running:

  git cat-file commit 65968b294351d2612d1bf94236d1fcbf853c494e

It produces 

	"author Samba Release Account <samba-bugs@samba.org>  831196245 +0000"

The git-fast-import syntax says there can only be a single space after the >
sign. This Samba commit breaks bzr-fast-import, for example.
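
As a rough illustration of that rule, here is a minimal sketch of a check
for it. The helper name single_space_after_email() is hypothetical and not
part of Git, and the sketch assumes the ident line is available as a
NUL-terminated string.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of Git: returns 1 if exactly one space
 * separates the closing '>' of the email from the timestamp, 0 otherwise.
 * Assumes `line` is a NUL-terminated author/committer line.
 */
static int single_space_after_email(const char *line)
{
	const char *gt = strrchr(line, '>');

	if (!gt)
		return 0;
	/* One space, then something that is neither a space nor the end. */
	return gt[1] == ' ' && gt[2] != ' ' && gt[2] != '\0';
}
```

Run on the Samba line above, this check would report the line as
non-conforming because of the second space.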

I'm not sure how to properly fix the problem, as I'm not very deep into Git's
code. I included this patch in case anyone else has the same problem and wants
a quick fix. I'm also not sure if it's a fast-export problem, or if the Samba
repository is just invalid :)

- Pieter

diff --git a/builtin-fast-export.c b/builtin-fast-export.c
index 1dfc01e..8218199 100755
--- a/builtin-fast-export.c
+++ b/builtin-fast-export.c
@@ -144,6 +144,24 @@ static const char *find_encoding(const char *begin, const char *end)
 	return bol;
 }
 
+static void parse_line(char **begin, char **end)
+{
+	char *b = *begin;
+	char *e = *end;
+
+	char *line = xmalloc((e - b) * sizeof(char));
+	*begin = line;
+	char prev = 0;
+
+	for (; b < e; b++) {
+		if (!(prev == ' ' && *b == ' '))
+			*line++ = *b;
+		prev = *b;
+	}
+
+	*end = line;
+}
+
 static void handle_commit(struct commit *commit, struct rev_info *rev)
 {
 	int saved_output_format = rev->diffopt.output_format;
@@ -163,11 +181,14 @@ static void handle_commit(struct commit *commit, struct rev_info *rev)
 	author++;
 	author_end = strchrnul(author, '\n');
 	committer = strstr(author_end, "\ncommitter ");
+	parse_line(&author, &author_end);
+
 	if (!committer)
 		die ("Could not find committer in commit %s",
 		     sha1_to_hex(commit->object.sha1));
 	committer++;
 	committer_end = strchrnul(committer, '\n');
 	message = strstr(committer_end, "\n\n");
 	encoding = find_encoding(committer_end, message);
+	parse_line(&committer, &committer_end);
 	if (message)
@@ -214,6 +235,8 @@ static void handle_commit(struct commit *commit, struct rev_info *rev)
 
 	printf("\n");
 
+	free(committer);
+	free(author);
 	show_progress();
 }
 
-- 
1.5.6.rc0.163.g26db5e.dirty


* Re: [PATCH] builtin-fast-export: Remove double spaces in author line
  2008-05-30 10:31 [PATCH] builtin-fast-export: Remove double spaces in author line Pieter de Bie
@ 2008-05-30 20:27 ` Junio C Hamano
  0 siblings, 0 replies; 2+ messages in thread
From: Junio C Hamano @ 2008-05-30 20:27 UTC (permalink / raw)
  To: Pieter de Bie; +Cc: git

Pieter de Bie <pdebie@ai.rug.nl> writes:

> It produces 
>
> 	"author Samba Release Account <samba-bugs@samba.org>  831196245 +0000"
>
> The git-fast-import syntax says there can only be a single space after the >
> sign. This Samba commit breaks bzr-fast-import, for example.
>
> I'm not sure how to properly fix the problem, as I'm not very deep into Git's
> code. I included this patch in case anyone else has the same problem and wants
> a quick fix. I'm also not sure if it's a fast-export problem, or if the Samba
> repository is just invalid :)

You can call that repository broken if you want, but we can try to be
liberal when receiving and be strict when generating.  IOW, fast-import
could accept such a minor deviation and generate a commit after fixing it.
The same thing can be said about fast-export --- read, fix and generate.

By the way, your quick hack would squash multiple SPs anywhere on the
line, not just the ones between '>' and the timestamp, wouldn't it?
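
A narrower variant that touches only the gap between '>' and the timestamp
might look like the sketch below. This is not the code from the patch: the
name squash_spaces_after_email() is hypothetical, and the sketch assumes a
writable, NUL-terminated copy of the line, whereas the patch works on
begin/end pointers into the commit buffer.

```c
#include <string.h>

/*
 * Hypothetical helper, not Git code: collapse the run of spaces that
 * follows the last '>' on an ident line into a single space, in place.
 * Everything up to and including the '>' is left untouched, so double
 * spaces inside the name survive.
 */
static void squash_spaces_after_email(char *line)
{
	char *gt = strrchr(line, '>');
	char *rest;

	if (!gt || gt[1] != ' ')
		return;
	rest = gt + 2;
	while (*rest == ' ')
		rest++;
	/* Move the timestamp (and the trailing NUL) up behind one space. */
	memmove(gt + 2, rest, strlen(rest) + 1);
}
```

Using the last '>' on the line keeps the sketch simple; it relies on the
timestamp and timezone never containing that character.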
