git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Corrupted (?) commit 6e6db85e confusing gitk
@ 2007-12-02 16:06 Steffen Prohaska
  2007-12-02 16:12 ` Wincent Colaiuta
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Steffen Prohaska @ 2007-12-02 16:06 UTC (permalink / raw)
  To: Git Mailing List

When I run

    gitk 6e6db85ea9423eea755cf5acf7a563c0d9559063

gitk complaints with 'Error: expected integer but got "Hamano"'.

I tracked the problem down to the raw content of the commit object.
The author line is lacking time and timezone information:

     $ git-cat-file -p 6e6db85ea9423eea755cf5acf7a563c0d9559063
     tree 5265f13d094e7c453a06f097add25eaefb843a79
     parent d25430c5f88c7e7b4ce24c1b08e409f4345c4eb9
     author Junio C Hamano <gitster@pobox.com>
     committer Junio C Hamano <gitster@pobox.com> 1196466497 -0800

     Run the specified perl in Documentation/

     Makefile uses $(PERL_PATH) but Documentation/Makefile uses "perl";
     that means the two Makefiles actually use two different
     Perl installations.

     Teach Documentation/Makefile to use PERL_PATH that is exported  
from the
     toplevel Makefile, and give a sane fallback for people who run  
"make"
     from Documentation directory.

     Signed-off-by: Junio C Hamano <gitster@pobox.com>

gitk fails to parse this because it expects the time to be
the second item from the end of a line (look for "set audate"
in function parsecommit of gitk).  For the commit above, gitk
finds "Hamano" instead of the correct time.

I'm pretty convinced that the original commit is reported
correctly.  I verified that with two different versions of git
(1.5.3.7.949.g2221a6 on mac and 1.5.3.6.1889.g98603 on mingw).
Both report the raw commit without time and timezone.

I'd like to conclude with some questions:
  - Is this commit corrupted?
  - How was the commit created?
  - Should "git fsck" detect such corruption?
  - Should gitk more gracefully handle corrupted commits?

I do not have solutions yet.

	Steffen

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 16:06 Corrupted (?) commit 6e6db85e confusing gitk Steffen Prohaska
@ 2007-12-02 16:12 ` Wincent Colaiuta
  2007-12-02 16:36 ` [PATCH] gitk: Add workaround to handle corrupted author date Steffen Prohaska
  2007-12-02 18:53 ` Corrupted (?) commit 6e6db85e confusing gitk Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Wincent Colaiuta @ 2007-12-02 16:12 UTC (permalink / raw)
  To: Steffen Prohaska; +Cc: Git Mailing List

El 2/12/2007, a las 17:06, Steffen Prohaska escribió:

> When I run
>
>   gitk 6e6db85ea9423eea755cf5acf7a563c0d9559063
>
> gitk complaints with 'Error: expected integer but got "Hamano"'.
>
> I tracked the problem down to the raw content of the commit object.
> The author line is lacking time and timezone information:
>
>    $ git-cat-file -p 6e6db85ea9423eea755cf5acf7a563c0d9559063
>    tree 5265f13d094e7c453a06f097add25eaefb843a79
>    parent d25430c5f88c7e7b4ce24c1b08e409f4345c4eb9
>    author Junio C Hamano <gitster@pobox.com>
>    committer Junio C Hamano <gitster@pobox.com> 1196466497 -0800

Yeah, and:

$ git show 6e6db85ea9423eea755cf5acf7a563c0d9559063 | head -3
commit 6e6db85ea9423eea755cf5acf7a563c0d9559063
Author: Junio C Hamano <gitster@pobox.com>
Date:   Thu Jan 1 00:00:00 1970 +0000

Cheers,
Wincent

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH] gitk: Add workaround to handle corrupted author date
  2007-12-02 16:06 Corrupted (?) commit 6e6db85e confusing gitk Steffen Prohaska
  2007-12-02 16:12 ` Wincent Colaiuta
@ 2007-12-02 16:36 ` Steffen Prohaska
  2007-12-02 18:53 ` Corrupted (?) commit 6e6db85e confusing gitk Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Steffen Prohaska @ 2007-12-02 16:36 UTC (permalink / raw)
  To: git; +Cc: Steffen Prohaska

6e6db85ea9423eea755cf5acf7a563c0d9559063 contains a corrupted
author line, which is lacking the time and timezone information.

This commit adds a workaround to handle this situation.  If the
time cannot be parsed, it is assumed to be 0 and the full line
is assumed to be the author's name.
---
 gitk |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

This works around the issue for me.  However, I don't think
this patch should be applied.

The best was if such a corrupted commit wouldn't enter the
repository in the first place.  But once it is there, I think git
should verify the format of a commit and report an approriate
error.  gitk could continue to assume well formed commits.

    Steffen

diff --git a/gitk b/gitk
index 1da0b0a..873766c 100755
--- a/gitk
+++ b/gitk
@@ -439,7 +439,12 @@ proc parsecommit {id contents listed} {
 	set tag [lindex $line 0]
 	if {$tag == "author"} {
 	    set audate [lindex $line end-1]
-	    set auname [lrange $line 1 end-2]
+	    if {[catch {formatdate $audate}]} {
+		set audate 0
+		set auname [lrange $line 1 end]
+	    } else {
+		set auname [lrange $line 1 end-2]
+	    }
 	} elseif {$tag == "committer"} {
 	    set comdate [lindex $line end-1]
 	    set comname [lrange $line 1 end-2]
-- 
1.5.3.7.949.g2221a6

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 16:06 Corrupted (?) commit 6e6db85e confusing gitk Steffen Prohaska
  2007-12-02 16:12 ` Wincent Colaiuta
  2007-12-02 16:36 ` [PATCH] gitk: Add workaround to handle corrupted author date Steffen Prohaska
@ 2007-12-02 18:53 ` Junio C Hamano
  2007-12-02 19:39   ` Brian Downing
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2007-12-02 18:53 UTC (permalink / raw)
  To: Steffen Prohaska; +Cc: Git Mailing List

Steffen Prohaska <prohaska@zib.de> writes:

> I'd like to conclude with some questions:
>  - Is this commit corrupted?
>  - How was the commit created?
>  - Should "git fsck" detect such corruption?
>  - Should gitk more gracefully handle corrupted commits?

Yeah, I was wondering what that commit that records the change older
than git or myself come to life ;-)

I did rewrite the commit a few times, and it was some interaction
between the built-in commit series, git-rebase -i and git-am, but I do
not have the details, sorry.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 18:53 ` Corrupted (?) commit 6e6db85e confusing gitk Junio C Hamano
@ 2007-12-02 19:39   ` Brian Downing
  2007-12-02 20:34     ` Linus Torvalds
                       ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Brian Downing @ 2007-12-02 19:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Steffen Prohaska, Git Mailing List

On Sun, Dec 02, 2007 at 10:53:33AM -0800, Junio C Hamano wrote:
> Yeah, I was wondering what that commit that records the change older
> than git or myself come to life ;-)
> 
> I did rewrite the commit a few times, and it was some interaction
> between the built-in commit series, git-rebase -i and git-am, but I do
> not have the details, sorry.

It looks like the "guilty" commit that allowed this behavior was:

commit 13208572fbe8838fd8835548d7502202d1f7b21d
Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Date:   Sun Nov 11 17:35:58 2007 +0000

    builtin-commit: fix --signoff

    The Signed-off-by: line contained a spurious timestamp.  The reason was
    a call to git_committer_info(1), which automatically added the
    timestamp.

    Instead, fmt_ident() was taught to interpret an empty string for the
    date (as opposed to NULL, which still triggers the default behavior)
    as "do not bother with the timestamp", and builtin-commit.c uses it.

    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>

With the above, something like:

echo msg | GIT_AUTHOR_DATE='' git commit-tree sha1

will produce a broken commit without a timestamp, since fmt_ident is
also used for the committer and author lines.

Personally, I think if the date_str is not NULL, it should die() on
anything that can't successfully be parsed as a date, rather than simply
falling back to the current time.  But maybe that's a bit extreme.

-bcd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 19:39   ` Brian Downing
@ 2007-12-02 20:34     ` Linus Torvalds
  2007-12-02 20:48     ` Junio C Hamano
  2007-12-02 21:34     ` Johannes Schindelin
  2 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2007-12-02 20:34 UTC (permalink / raw)
  To: Brian Downing; +Cc: Junio C Hamano, Steffen Prohaska, Git Mailing List



On Sun, 2 Dec 2007, Brian Downing wrote:
>
> With the above, something like:
> 
> echo msg | GIT_AUTHOR_DATE='' git commit-tree sha1
> 
> will produce a broken commit without a timestamp, since fmt_ident is
> also used for the committer and author lines.

Ouch. And I notice that fsck doesn't even warn about the resulting broken 
commit. Partly because I was lazy, but partly because originally I was 
thinking that maybe we'll have more header lines, so fsck basically just 
checks the ones that git *really* cares about (parenthood and tree), and 
the rest is not really even looked at (well, it does check that the next 
line starts with "author", but that's it).

I guess the breakage is pretty benign, but this is still very wrong.

Junio: that broken commit seems to be in "pu" only - we should make sure 
that it never makes it into next or master, so that it will eventually get 
pruned out of history.

		Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 19:39   ` Brian Downing
  2007-12-02 20:34     ` Linus Torvalds
@ 2007-12-02 20:48     ` Junio C Hamano
  2007-12-02 21:25       ` Linus Torvalds
                         ` (2 more replies)
  2007-12-02 21:34     ` Johannes Schindelin
  2 siblings, 3 replies; 18+ messages in thread
From: Junio C Hamano @ 2007-12-02 20:48 UTC (permalink / raw)
  To: Brian Downing; +Cc: Steffen Prohaska, Git Mailing List

bdowning@lavos.net (Brian Downing) writes:

> It looks like the "guilty" commit that allowed this behavior was:
>
> commit 13208572fbe8838fd8835548d7502202d1f7b21d
> Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> Date:   Sun Nov 11 17:35:58 2007 +0000
>
>     builtin-commit: fix --signoff
>
>     The Signed-off-by: line contained a spurious timestamp.  The reason was
>     a call to git_committer_info(1), which automatically added the
>     timestamp.
>
>     Instead, fmt_ident() was taught to interpret an empty string for the
>     date (as opposed to NULL, which still triggers the default behavior)
>     as "do not bother with the timestamp", and builtin-commit.c uses it.
>
>     Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>     Signed-off-by: Junio C Hamano <gitster@pobox.com>
>
> With the above, something like:
>
> echo msg | GIT_AUTHOR_DATE='' git commit-tree sha1
>
> will produce a broken commit without a timestamp, since fmt_ident is
> also used for the committer and author lines.
>
> Personally, I think if the date_str is not NULL, it should die() on
> anything that can't successfully be parsed as a date, rather than simply
> falling back to the current time.  But maybe that's a bit extreme.

Yeah, that change does look like a hack now we look at it again.  It
would have been much cleaner to make the caller accept the default
behaviour of fmt_ident() and strip out the part it does not want from
the result.  That way, the damage would have been much contained.

The next issue would be to find who could pass an empty GIT_AUTHOR_DATE
without noticing...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 20:48     ` Junio C Hamano
@ 2007-12-02 21:25       ` Linus Torvalds
  2007-12-02 21:47         ` Junio C Hamano
  2007-12-02 21:43       ` Fix --signoff in builtin-commit differently Junio C Hamano
  2007-12-02 22:59       ` Corrupted (?) commit 6e6db85e confusing gitk Michael Gebetsroither
  2 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2007-12-02 21:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brian Downing, Steffen Prohaska, Git Mailing List



On Sun, 2 Dec 2007, Junio C Hamano wrote:
>
> The next issue would be to find who could pass an empty GIT_AUTHOR_DATE
> without noticing...

In the meantime, here's a not-very-well-tested patch to fsck to at least 
notice this.

Of course, in the name of containment it would probably be even better if 
parse_commit() did it, because then people would be unable to pull from 
such a corrupt repository! But this would seem to be at least a slight 
step in the right direction.

		Linus

---
 builtin-fsck.c |   57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/builtin-fsck.c b/builtin-fsck.c
index e4874f6..309212c 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -351,8 +351,48 @@ static int fsck_tree(struct tree *item)
 	return retval;
 }
 
+static int parse_commit_line(struct commit *commit, const char *expect, const char *buffer)
+{
+	char *end;
+	const char *p;
+	int len = strlen(expect);
+	int saw_lt = 0;
+
+	if (memcmp(buffer, expect, len))
+		goto bad;
+	p = (char *)buffer + len;
+	if (*p != ' ')
+		goto bad;
+	while (*++p != '>') {
+		if (*p == '<')
+			saw_lt++;
+		if (!*p)
+			goto bad;
+	}
+	if (saw_lt != 1)
+		goto bad;
+	if (*++p != ' ')
+		goto bad;
+
+	/* Date in seconds since the epoch (UTC) */
+	if (strtoul(p, &end, 10) == ULONG_MAX)
+		goto bad;
+	if (*end++ != ' ')
+		goto bad;
+
+	/* TZ that date was done in */
+	if (strtoul(end, &end, 10) == ULONG_MAX)
+		goto bad;
+	if (*end++ != '\n')
+		goto bad;
+	return end - buffer;
+bad:
+	return objerror(&commit->object, "invalid format - missing or corrupt '%s'", expect);
+}
+
 static int fsck_commit(struct commit *commit)
 {
+	int len;
 	char *buffer = commit->buffer;
 	unsigned char tree_sha1[20], sha1[20];
 
@@ -370,8 +410,21 @@ static int fsck_commit(struct commit *commit)
 			return objerror(&commit->object, "invalid 'parent' line format - bad sha1");
 		buffer += 48;
 	}
-	if (memcmp(buffer, "author ", 7))
-		return objerror(&commit->object, "invalid format - expected 'author' line");
+
+	/*
+	 * We check the author/committer lines for completeness.
+	 * But errors here aren't fatal to the rest of the parsing.
+	 */
+	len = parse_commit_line(commit, "author", buffer);
+	if (len >= 0) {
+		buffer += len;
+		len = parse_commit_line(commit, "committer", buffer);
+		if (len >= 0) {
+			buffer += len;
+			if (*buffer != '\n')
+				objerror(&commit->object, "invalid format - missing or corrupt end-of-headers");
+		}
+	}
 	free(commit->buffer);
 	commit->buffer = NULL;
 	if (!commit->tree)

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 19:39   ` Brian Downing
  2007-12-02 20:34     ` Linus Torvalds
  2007-12-02 20:48     ` Junio C Hamano
@ 2007-12-02 21:34     ` Johannes Schindelin
  2007-12-02 21:49       ` Linus Torvalds
  2007-12-02 22:14       ` Junio C Hamano
  2 siblings, 2 replies; 18+ messages in thread
From: Johannes Schindelin @ 2007-12-02 21:34 UTC (permalink / raw)
  To: Brian Downing; +Cc: Junio C Hamano, Steffen Prohaska, Git Mailing List

Hi,

On Sun, 2 Dec 2007, Brian Downing wrote:

> On Sun, Dec 02, 2007 at 10:53:33AM -0800, Junio C Hamano wrote:
> > Yeah, I was wondering what that commit that records the change older
> > than git or myself come to life ;-)
> > 
> > I did rewrite the commit a few times, and it was some interaction
> > between the built-in commit series, git-rebase -i and git-am, but I do
> > not have the details, sorry.
> 
> It looks like the "guilty" commit that allowed this behavior was:
> 
> commit 13208572fbe8838fd8835548d7502202d1f7b21d
> Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
> Date:   Sun Nov 11 17:35:58 2007 +0000
> 
>     builtin-commit: fix --signoff
> 
>     The Signed-off-by: line contained a spurious timestamp.  The reason was
>     a call to git_committer_info(1), which automatically added the
>     timestamp.
> 
>     Instead, fmt_ident() was taught to interpret an empty string for the
>     date (as opposed to NULL, which still triggers the default behavior)
>     as "do not bother with the timestamp", and builtin-commit.c uses it.
> 
>     Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>     Signed-off-by: Junio C Hamano <gitster@pobox.com>
> 
> With the above, something like:
> 
> echo msg | GIT_AUTHOR_DATE='' git commit-tree sha1

Darn.  But when can "GIT_AUTHOR_DATE" be set to the empty string?  I mean, 
I understand unset'ing it.  But setting it to ""?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Fix --signoff in builtin-commit differently.
  2007-12-02 20:48     ` Junio C Hamano
  2007-12-02 21:25       ` Linus Torvalds
@ 2007-12-02 21:43       ` Junio C Hamano
  2007-12-02 22:31         ` Johannes Schindelin
  2007-12-02 22:59       ` Corrupted (?) commit 6e6db85e confusing gitk Michael Gebetsroither
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2007-12-02 21:43 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Brian Downing, Steffen Prohaska, Git Mailing List

Introduce fmt_name() specifically meant for formatting the name and
email pair, to add signed-off-by value.  This reverts parts of
13208572fbe8838fd8835548d7502202d1f7b21d (builtin-commit: fix --signoff)
so that an empty datestamp string given to fmt_ident() by mistake will
error out as before.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 Junio C Hamano <gitster@pobox.com> writes:

 >> Personally, I think if the date_str is not NULL, it should die() on
 >> anything that can't successfully be parsed as a date, rather than simply
 >> falling back to the current time.  But maybe that's a bit extreme.
 >
 > Yeah, that change does look like a hack now we look at it again.  It
 > would have been much cleaner to make the caller accept the default
 > behaviour of fmt_ident() and strip out the part it does not want from
 > the result.  That way, the damage would have been much contained.
 >
 > The next issue would be to find who could pass an empty GIT_AUTHOR_DATE
 > without noticing...

 Perhaps like this...

 builtin-commit.c |    6 ++----
 cache.h          |    1 +
 ident.c          |   34 ++++++++++++++++++++++++----------
 3 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/builtin-commit.c b/builtin-commit.c
index 96cb544..2319cc1 100644
--- a/builtin-commit.c
+++ b/builtin-commit.c
@@ -346,11 +346,9 @@ static int prepare_log_message(const char *index_file, const char *prefix)
 
 		strbuf_init(&sob, 0);
 		strbuf_addstr(&sob, sign_off_header);
-		strbuf_addstr(&sob, fmt_ident(getenv("GIT_COMMITTER_NAME"),
-					      getenv("GIT_COMMITTER_EMAIL"),
-					      "", 1));
+		strbuf_addstr(&sob, fmt_name(getenv("GIT_COMMITTER_NAME"),
+					     getenv("GIT_COMMITTER_EMAIL")));
 		strbuf_addch(&sob, '\n');
-
 		for (i = sb.len - 1; i > 0 && sb.buf[i - 1] != '\n'; i--)
 			; /* do nothing */
 		if (prefixcmp(sb.buf + i, sob.buf)) {
diff --git a/cache.h b/cache.h
index cf0bdc6..43cfebb 100644
--- a/cache.h
+++ b/cache.h
@@ -444,6 +444,7 @@ enum date_mode parse_date_format(const char *format);
 extern const char *git_author_info(int);
 extern const char *git_committer_info(int);
 extern const char *fmt_ident(const char *name, const char *email, const char *date_str, int);
+extern const char *fmt_name(const char *name, const char *email);
 
 struct checkout {
 	const char *base_dir;
diff --git a/ident.c b/ident.c
index 5be7533..021d79b 100644
--- a/ident.c
+++ b/ident.c
@@ -192,12 +192,14 @@ static const char *env_hint =
 "Omit --global to set the identity only in this repository.\n"
 "\n";
 
-const char *fmt_ident(const char *name, const char *email,
-		      const char *date_str, int error_on_no_name)
+static const char *fmt_ident_1(const char *name, const char *email,
+			       const char *date_str, int flag)
 {
 	static char buffer[1000];
 	char date[50];
 	int i;
+	int error_on_no_name = !!(flag & 01);
+	int name_addr_only = !!(flag & 02);
 
 	setup_ident();
 	if (!name)
@@ -224,24 +226,36 @@ const char *fmt_ident(const char *name, const char *email,
 	}
 
 	strcpy(date, git_default_date);
-	if (date_str) {
-		if (*date_str)
-			parse_date(date_str, date, sizeof(date));
-		else
-			date[0] = '\0';
-	}
+	if (!name_addr_only && date_str)
+		parse_date(date_str, date, sizeof(date));
 
 	i = copy(buffer, sizeof(buffer), 0, name);
 	i = add_raw(buffer, sizeof(buffer), i, " <");
 	i = copy(buffer, sizeof(buffer), i, email);
-	i = add_raw(buffer, sizeof(buffer), i, date[0] ? "> " : ">");
-	i = copy(buffer, sizeof(buffer), i, date);
+	if (!name_addr_only) {
+		i = add_raw(buffer, sizeof(buffer), i,  "> ");
+		i = copy(buffer, sizeof(buffer), i, date);
+	} else {
+		i = add_raw(buffer, sizeof(buffer), i, ">");
+	}
 	if (i >= sizeof(buffer))
 		die("Impossibly long personal identifier");
 	buffer[i] = 0;
 	return buffer;
 }
 
+const char *fmt_ident(const char *name, const char *email,
+		      const char *date_str, int error_on_no_name)
+{
+	int flag = (error_on_no_name ? 01 : 0);
+	return fmt_ident_1(name, email, date_str, flag);
+}
+
+const char *fmt_name(const char *name, const char *email)
+{
+	return fmt_ident_1(name, email, NULL, 03);
+}
+
 const char *git_author_info(int error_on_no_name)
 {
 	return fmt_ident(getenv("GIT_AUTHOR_NAME"),

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 21:25       ` Linus Torvalds
@ 2007-12-02 21:47         ` Junio C Hamano
  2007-12-02 22:43           ` Linus Torvalds
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2007-12-02 21:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Brian Downing, Steffen Prohaska, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Sun, 2 Dec 2007, Junio C Hamano wrote:
>>
>> The next issue would be to find who could pass an empty GIT_AUTHOR_DATE
>> without noticing...
>
> In the meantime, here's a not-very-well-tested patch to fsck to at least 
> notice this.

Thanks.

I recall that the very initial git did not use the current format for
timestamp but ctime() return value, and this will also notice them (and
convert-objects will be there for us).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 21:34     ` Johannes Schindelin
@ 2007-12-02 21:49       ` Linus Torvalds
  2007-12-02 22:36         ` Junio C Hamano
  2007-12-02 22:14       ` Junio C Hamano
  1 sibling, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2007-12-02 21:49 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Brian Downing, Junio C Hamano, Steffen Prohaska, Git Mailing List



On Sun, 2 Dec 2007, Johannes Schindelin wrote:
> 
> Darn.  But when can "GIT_AUTHOR_DATE" be set to the empty string?  I mean, 
> I understand unset'ing it.  But setting it to ""?

Well, regardless, I think we should make sure that git-commit-tree never 
writes out an invalid commit - no matter *how* insane input it gets. 
Making it complain loudly (with a 'die("Oh, no, you don't!")') would be a 
good idea.

		Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 21:34     ` Johannes Schindelin
  2007-12-02 21:49       ` Linus Torvalds
@ 2007-12-02 22:14       ` Junio C Hamano
  1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2007-12-02 22:14 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Brian Downing, Steffen Prohaska, Git Mailing List

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> With the above, something like:
>> 
>> echo msg | GIT_AUTHOR_DATE='' git commit-tree sha1
>
> Darn.  But when can "GIT_AUTHOR_DATE" be set to the empty string?  I mean, 
> I understand unset'ing it.  But setting it to ""?

Maybe something like this would catch such a breakage earlier, but with
the re-fix for --signoff I just sent to make fmt_ident() safer, I think
this patch would fall into belt-and-suspender category.

---

 git-am.sh |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/git-am.sh b/git-am.sh
index 4126f0e..bab6f68 100755
--- a/git-am.sh
+++ b/git-am.sh
@@ -307,9 +307,11 @@ do
 	GIT_AUTHOR_EMAIL="$(sed -n '/^Email/ s/Email: //p' "$dotest/info")"
 	GIT_AUTHOR_DATE="$(sed -n '/^Date/ s/Date: //p' "$dotest/info")"
 
-	if test -z "$GIT_AUTHOR_EMAIL"
+	if test -z "$GIT_AUTHOR_EMAIL" ||
+	   test -z "$GIT_AUTHOR_NAME" ||
+	   test -z "$GIT_AUTHOR_DATE"
 	then
-		echo "Patch does not have a valid e-mail address."
+		echo "Patch does not have a valid authorship information."
 		stop_here $this
 	fi
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Fix --signoff in builtin-commit differently.
  2007-12-02 21:43       ` Fix --signoff in builtin-commit differently Junio C Hamano
@ 2007-12-02 22:31         ` Johannes Schindelin
  0 siblings, 0 replies; 18+ messages in thread
From: Johannes Schindelin @ 2007-12-02 22:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brian Downing, Steffen Prohaska, Git Mailing List

Hi,

On Sun, 2 Dec 2007, Junio C Hamano wrote:

> Introduce fmt_name() specifically meant for formatting the name and
> email pair, to add signed-off-by value.  This reverts parts of
> 13208572fbe8838fd8835548d7502202d1f7b21d (builtin-commit: fix --signoff)
> so that an empty datestamp string given to fmt_ident() by mistake will
> error out as before.

>From a quick glance, looks good to me.

Sorry for the breakage,
Dscho

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 21:49       ` Linus Torvalds
@ 2007-12-02 22:36         ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2007-12-02 22:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Brian Downing, Steffen Prohaska,
	Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Sun, 2 Dec 2007, Johannes Schindelin wrote:
>> 
>> Darn.  But when can "GIT_AUTHOR_DATE" be set to the empty string?  I mean, 
>> I understand unset'ing it.  But setting it to ""?
>
> Well, regardless, I think we should make sure that git-commit-tree never 
> writes out an invalid commit - no matter *how* insane input it gets. 
> Making it complain loudly (with a 'die("Oh, no, you don't!")') would be a 
> good idea.

FWIW, fmt_ident() records the current time (before the kh/commit series,
and after the fix I sent on top of kh/commit series), which may be a
reasonable alternative.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 21:47         ` Junio C Hamano
@ 2007-12-02 22:43           ` Linus Torvalds
  0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2007-12-02 22:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brian Downing, Steffen Prohaska, Git Mailing List



On Sun, 2 Dec 2007, Junio C Hamano wrote:
> 
> I recall that the very initial git did not use the current format for
> timestamp but ctime() return value

Yes, but..

> and this will also notice them (and convert-objects will be there for 
> us).

.. I don't think we have actually accepted the ctime-string format since 
switchng over, so no existing git repositories will have them. The commit 
date parsing just does a "strtoul()" in parse_commit_date().

So I wouldn't expect convert-objects to be needed - or rather, it was 
needed 2.5 _years_ ago, and we've not supported those early broken formats 
since, afaik.

			Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 20:48     ` Junio C Hamano
  2007-12-02 21:25       ` Linus Torvalds
  2007-12-02 21:43       ` Fix --signoff in builtin-commit differently Junio C Hamano
@ 2007-12-02 22:59       ` Michael Gebetsroither
  2007-12-02 23:05         ` Junio C Hamano
  2 siblings, 1 reply; 18+ messages in thread
From: Michael Gebetsroither @ 2007-12-02 22:59 UTC (permalink / raw)
  To: git

* Junio C Hamano <gitster@pobox.com> wrote:

> Yeah, that change does look like a hack now we look at it again.  It
> would have been much cleaner to make the caller accept the default
> behaviour of fmt_ident() and strip out the part it does not want from
> the result.  That way, the damage would have been much contained.

Last hg2git from repo.or.cz does it (it uses git-fast-import).

Just tried to convert lastes mercurial repos to git and the resulting
git repos can't be displayed with gitk.

git fsck --full gives many "bad commit date in xxxx"

% git cat-file commit f7900d59930d796c4739452cf68ca9c08a921b5d | sed
% '/^$/,//d'
tree 4fb3636148433dcd368fc2dfa20247cb281e2ff8
parent 6ea0491f4ebb13003d15c6fc478d92cbe1201902
author Mathieu Clabaut <mathieu.clabaut@gmail.com> <"Mathieu Clabaut
<mathieu.clabaut@gmail.com>"> 1153937514 +0200
committer Mathieu Clabaut <mathieu.clabaut@gmail.com> <"Mathieu Clabaut
<mathieu.clabaut@gmail.com>"> 1153937514 +0200

But the new hg2git is _amazingly_ fast, converts the whole mercurial
repos got git in just 1min :).

cu,
michael
-- 
It's already too late!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Corrupted (?) commit 6e6db85e confusing gitk
  2007-12-02 22:59       ` Corrupted (?) commit 6e6db85e confusing gitk Michael Gebetsroither
@ 2007-12-02 23:05         ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2007-12-02 23:05 UTC (permalink / raw)
  To: Michael Gebetsroither; +Cc: git

Michael Gebetsroither <gebi@sbox.tugraz.at> writes:

> Last hg2git from repo.or.cz does it (it uses git-fast-import).
>
> Just tried to convert lastes mercurial repos to git and the resulting
> git repos can't be displayed with gitk.
>
> author Mathieu Clabaut <mathieu.clabaut@gmail.com> <"Mathieu Clabaut
> <mathieu.clabaut@gmail.com>"> 1153937514 +0200
> committer Mathieu Clabaut <mathieu.clabaut@gmail.com> <"Mathieu Clabaut
> <mathieu.clabaut@gmail.com>"> 1153937514 +0200

That's totally broken author/committer information from a broken
hg2git, with what appears to be correct timestamps.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-12-02 23:05 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-12-02 16:06 Corrupted (?) commit 6e6db85e confusing gitk Steffen Prohaska
2007-12-02 16:12 ` Wincent Colaiuta
2007-12-02 16:36 ` [PATCH] gitk: Add workaround to handle corrupted author date Steffen Prohaska
2007-12-02 18:53 ` Corrupted (?) commit 6e6db85e confusing gitk Junio C Hamano
2007-12-02 19:39   ` Brian Downing
2007-12-02 20:34     ` Linus Torvalds
2007-12-02 20:48     ` Junio C Hamano
2007-12-02 21:25       ` Linus Torvalds
2007-12-02 21:47         ` Junio C Hamano
2007-12-02 22:43           ` Linus Torvalds
2007-12-02 21:43       ` Fix --signoff in builtin-commit differently Junio C Hamano
2007-12-02 22:31         ` Johannes Schindelin
2007-12-02 22:59       ` Corrupted (?) commit 6e6db85e confusing gitk Michael Gebetsroither
2007-12-02 23:05         ` Junio C Hamano
2007-12-02 21:34     ` Johannes Schindelin
2007-12-02 21:49       ` Linus Torvalds
2007-12-02 22:36         ` Junio C Hamano
2007-12-02 22:14       ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).