* git log - crash and core dump @ 2013-04-16 16:55 Ivan Lyapunov 2013-04-16 17:29 ` Antoine Pelisse 2013-04-16 18:09 ` René Scharfe 0 siblings, 2 replies; 28+ messages in thread From: Ivan Lyapunov @ 2013-04-16 16:55 UTC (permalink / raw) To: git git version 1.8.2.1 crashes on my ArchLinux x86_64 on git log command gdb bt is attached git log | less does not crash in same repo I cannot share a repo for a debug purposes since it's private repo of my employer but I can perform any suitable tests on repo to help this bug to be fixed #0 0x00007ffff722b3e6 in ____strtoull_l_internal () from /usr/lib/libc.so.6 #1 0x00000000004b31d4 in pp_user_info (pp=pp@entry=0x7fffffffd310, what=what@entry=0x521379 "Author", sb=sb@entry=0x7fffffffd290, line=line@entry=0x7b3a45 "Ivan Lyapunov <ilyapunov@trueconf.ru>- <> 1354083115 +0400\ncommitter Ivan Lyapunov <ilyapunov@trueconf.ru> 1354083115 +0400\n\n- small merge fixes", encoding=encoding@entry=0x505400 "UTF-8") at pretty.c:441 #2 0x00000000004b533a in pp_header (sb=0x7fffffffd290, msg_p=0x7fffffffd228, commit=0x7c1e10, encoding=0x505400 "UTF-8", pp=0x7fffffffd310) at pretty.c:1415 #3 pretty_print_commit (pp=pp@entry=0x7fffffffd310, commit=commit@entry=0x7c1e10, sb=sb@entry=0x7fffffffd290) at pretty.c:1545 #4 0x00000000004a0b45 in show_log (opt=opt@entry=0x7fffffffd4d0) at log-tree.c:683 #5 0x00000000004a1616 in log_tree_commit (opt=opt@entry=0x7fffffffd4d0, commit=commit@entry=0x7c1e10) at log-tree.c:859 #6 0x0000000000438b03 in cmd_log_walk (rev=rev@entry=0x7fffffffd4d0) at builtin/log.c:310 #7 0x00000000004395dd in cmd_log (argc=1, argv=0x7fffffffdd30, prefix=0x0) at builtin/log.c:582 #8 0x000000000040562d in run_builtin (argv=0x7fffffffdd30, argc=1, p=0x754d18 <commands.21404+1080>) at git.c:282 #9 handle_internal_command (argc=1, argv=0x7fffffffdd30) at git.c:444 #10 0x0000000000404a6f in run_argv (argv=0x7fffffffdbd0, argcp=0x7fffffffdbdc) at git.c:490 #11 main (argc=1, argv=0x7fffffffdd30) at git.c:565 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-16 16:55 git log - crash and core dump Ivan Lyapunov @ 2013-04-16 17:29 ` Antoine Pelisse 2013-04-16 18:09 ` René Scharfe 1 sibling, 0 replies; 28+ messages in thread From: Antoine Pelisse @ 2013-04-16 17:29 UTC (permalink / raw) To: Ivan Lyapunov; +Cc: git@vger.kernel.org Hey > #0 0x00007ffff722b3e6 in ____strtoull_l_internal () from /usr/lib/libc.so.6 > #1 0x00000000004b31d4 in pp_user_info (pp=pp@entry=0x7fffffffd310, > what=what@entry=0x521379 "Author", sb=sb@entry=0x7fffffffd290, > line=line@entry=0x7b3a45 "Ivan Lyapunov <ilyapunov@trueconf.ru>- > <> 1354083115 +0400\ncommitter Ivan Lyapunov <ilyapunov@trueconf.ru> > 1354083115 +0400\n\n- small merge fixes", > encoding=encoding@entry=0x505400 "UTF-8") at pretty.c:441 Clearly the author line is messed up after name and email. It means we won't be able to parse the time, and return a null pointer to it (which we run through strtoll after, with a crash). I thought that bug was already fixed though and we're now checking for null time also ? I think I can submit a fix for that when I'm back home. Thanks for reporting ! Cheers, Antoine ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-16 16:55 git log - crash and core dump Ivan Lyapunov 2013-04-16 17:29 ` Antoine Pelisse @ 2013-04-16 18:09 ` René Scharfe 2013-04-16 19:45 ` Junio C Hamano 1 sibling, 1 reply; 28+ messages in thread From: René Scharfe @ 2013-04-16 18:09 UTC (permalink / raw) To: Ivan Lyapunov; +Cc: git, Antoine Pelisse Am 16.04.2013 18:55, schrieb Ivan Lyapunov: > git version 1.8.2.1 crashes on my ArchLinux x86_64 on git log command > gdb bt is attached > > git log | less > does not crash in same repo > > I cannot share a repo for a debug purposes since it's private repo of > my employer > but I can perform any suitable tests on repo to help this bug to be fixed > > #0 0x00007ffff722b3e6 in ____strtoull_l_internal () from /usr/lib/libc.so.6 > #1 0x00000000004b31d4 in pp_user_info (pp=pp@entry=0x7fffffffd310, > what=what@entry=0x521379 "Author", sb=sb@entry=0x7fffffffd290, > line=line@entry=0x7b3a45 "Ivan Lyapunov <ilyapunov@trueconf.ru>- > <> 1354083115 +0400\ncommitter Ivan Lyapunov <ilyapunov@trueconf.ru> So this is the author information, correct? Ivan Lyapunov <ilyapunov@trueconf.ru>-<> 1354083115 +0400 |author name| |---author email----| ^^^ |--time--| |tz-| How did you manage to add the "-<>" after the email address? What does git log in version 1.8.1 or earlier show for this commit? > 1354083115 +0400\n\n- small merge fixes", > encoding=encoding@entry=0x505400 "UTF-8") at pretty.c:441 > #2 0x00000000004b533a in pp_header (sb=0x7fffffffd290, > msg_p=0x7fffffffd228, commit=0x7c1e10, encoding=0x505400 "UTF-8", > pp=0x7fffffffd310) at pretty.c:1415 > #3 pretty_print_commit (pp=pp@entry=0x7fffffffd310, > commit=commit@entry=0x7c1e10, sb=sb@entry=0x7fffffffd290) at > pretty.c:1545 > #4 0x00000000004a0b45 in show_log (opt=opt@entry=0x7fffffffd4d0) at > log-tree.c:683 > #5 0x00000000004a1616 in log_tree_commit > (opt=opt@entry=0x7fffffffd4d0, commit=commit@entry=0x7c1e10) at > log-tree.c:859 > #6 0x0000000000438b03 in cmd_log_walk (rev=rev@entry=0x7fffffffd4d0) > at builtin/log.c:310 > #7 0x00000000004395dd in cmd_log (argc=1, argv=0x7fffffffdd30, > prefix=0x0) at builtin/log.c:582 > #8 0x000000000040562d in run_builtin (argv=0x7fffffffdd30, argc=1, > p=0x754d18 <commands.21404+1080>) at git.c:282 > #9 handle_internal_command (argc=1, argv=0x7fffffffdd30) at git.c:444 > #10 0x0000000000404a6f in run_argv (argv=0x7fffffffdbd0, > argcp=0x7fffffffdbdc) at git.c:490 > #11 main (argc=1, argv=0x7fffffffdd30) at git.c:565 Does this patch help? pretty.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/pretty.c b/pretty.c index d3a82d2..713eefc 100644 --- a/pretty.c +++ b/pretty.c @@ -405,8 +405,8 @@ void pp_user_info(const struct pretty_print_context *pp, const char *mailbuf, *namebuf; size_t namelen, maillen; int max_length = 78; /* per rfc2822 */ - unsigned long time; - int tz; + unsigned long time = 0; + int tz = 0; if (pp->fmt == CMIT_FMT_ONELINE) return; @@ -438,8 +438,10 @@ void pp_user_info(const struct pretty_print_context *pp, strbuf_add(&name, namebuf, namelen); namelen = name.len + mail.len + 3; /* ' ' + '<' + '>' */ - time = strtoul(ident.date_begin, &date, 10); - tz = strtol(date, NULL, 10); + if (ident.date_begin) { + time = strtoul(ident.date_begin, &date, 10); + tz = strtol(date, NULL, 10); + } if (pp->fmt == CMIT_FMT_EMAIL) { strbuf_addstr(sb, "From: "); ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-16 18:09 ` René Scharfe @ 2013-04-16 19:45 ` Junio C Hamano 2013-04-16 21:10 ` René Scharfe ` (2 more replies) 0 siblings, 3 replies; 28+ messages in thread From: Junio C Hamano @ 2013-04-16 19:45 UTC (permalink / raw) To: René Scharfe; +Cc: Ivan Lyapunov, git, Antoine Pelisse René Scharfe <rene.scharfe@lsrfire.ath.cx> writes: > Does this patch help? > > pretty.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/pretty.c b/pretty.c > index d3a82d2..713eefc 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -405,8 +405,8 @@ void pp_user_info(const struct pretty_print_context *pp, > const char *mailbuf, *namebuf; > size_t namelen, maillen; > int max_length = 78; /* per rfc2822 */ > - unsigned long time; > - int tz; > + unsigned long time = 0; > + int tz = 0; > > if (pp->fmt == CMIT_FMT_ONELINE) > return; > @@ -438,8 +438,10 @@ void pp_user_info(const struct pretty_print_context *pp, > strbuf_add(&name, namebuf, namelen); > > namelen = name.len + mail.len + 3; /* ' ' + '<' + '>' */ > - time = strtoul(ident.date_begin, &date, 10); > - tz = strtol(date, NULL, 10); > + if (ident.date_begin) { > + time = strtoul(ident.date_begin, &date, 10); > + tz = strtol(date, NULL, 10); > + } > > if (pp->fmt == CMIT_FMT_EMAIL) { > strbuf_addstr(sb, "From: "); Looks like a sensible change. split_ident_line() decided that the given input was mangled and decided there is no valid date (the input had <> where the timestamp string was required), so the updated code leaves the time/tz unspecified. It still is curious how a malformed line was created in the first place. I wouldn't worry if a private tool used hash-object to create such a commit, but if it is something that is commonly used (e.g. "git commit"), others may suffer from the same and the tool needs to be tightened a bit. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-16 19:45 ` Junio C Hamano @ 2013-04-16 21:10 ` René Scharfe 2013-04-16 22:21 ` Junio C Hamano 2013-04-16 21:24 ` git log - crash and core dump Antoine Pelisse 2013-04-16 21:34 ` Jeff King 2 siblings, 1 reply; 28+ messages in thread From: René Scharfe @ 2013-04-16 21:10 UTC (permalink / raw) To: Junio C Hamano; +Cc: Ivan Lyapunov, git, Antoine Pelisse First, lest I forget again: Thank you, Ivan, for the very useful bug report! Am 16.04.2013 21:45, schrieb Junio C Hamano: > René Scharfe <rene.scharfe@lsrfire.ath.cx> writes: > >> Does this patch help? >> >> pretty.c | 10 ++++++---- >> 1 file changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/pretty.c b/pretty.c >> index d3a82d2..713eefc 100644 >> --- a/pretty.c >> +++ b/pretty.c >> @@ -405,8 +405,8 @@ void pp_user_info(const struct pretty_print_context *pp, >> const char *mailbuf, *namebuf; >> size_t namelen, maillen; >> int max_length = 78; /* per rfc2822 */ >> - unsigned long time; >> - int tz; >> + unsigned long time = 0; >> + int tz = 0; >> >> if (pp->fmt == CMIT_FMT_ONELINE) >> return; >> @@ -438,8 +438,10 @@ void pp_user_info(const struct pretty_print_context *pp, >> strbuf_add(&name, namebuf, namelen); >> >> namelen = name.len + mail.len + 3; /* ' ' + '<' + '>' */ >> - time = strtoul(ident.date_begin, &date, 10); >> - tz = strtol(date, NULL, 10); >> + if (ident.date_begin) { >> + time = strtoul(ident.date_begin, &date, 10); >> + tz = strtol(date, NULL, 10); >> + } >> >> if (pp->fmt == CMIT_FMT_EMAIL) { >> strbuf_addstr(sb, "From: "); > > Looks like a sensible change. split_ident_line() decided that the > given input was mangled and decided there is no valid date (the > input had <> where the timestamp string was required), so the > updated code leaves the time/tz unspecified. We'd need update pretty.c::format_person_part() and builtin/blame.c::get_ac_line() as well, though. How about making split_ident_line() a bit friendlier be letting it provide the epoch as default time stamp instead of NULL? We shouldn't do that if we'd like to be able to tell a missing/broken time stamp apart from a commit that was actually made back in 1970 (e.g. an imported one). Or if we'd like to not show a time stamp in git log output at all in that case. -- >8 -- Subject: ident: let split_ident_line() provide a default time stamp If a commit has a broken time stamp, split_ident_line() sets date_begin, date_end, tz_begin and tz_end to NULL. Not all callers are prepared to handle that case and segfault. Instead of fixing them and having to be careful while implementing the next caller, provide a string consisting of the number zero as default value, representing the UNIX epoch. That's the value that git log showed before it was converted to use split_ident_line(). Reported-by: Ivan Lyapunov <dront78@gmail.com> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> --- ident.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/ident.c b/ident.c index 1c123e6..ee840f4 100644 --- a/ident.c +++ b/ident.c @@ -191,6 +191,8 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src) sb->buf[sb->len] = '\0'; } +static const char zero_string[] = "0"; + /* * Reverse of fmt_ident(); given an ident line, split the fields * to allow the caller to parse it. @@ -254,10 +256,10 @@ int split_ident_line(struct ident_split *split, const char *line, int len) return 0; person_only: - split->date_begin = NULL; - split->date_end = NULL; - split->tz_begin = NULL; - split->tz_end = NULL; + split->date_begin = zero_string; + split->date_end = zero_string + strlen(zero_string); + split->tz_begin = zero_string; + split->tz_end = zero_string + strlen(zero_string); return 0; } -- 1.8.2.1 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-16 21:10 ` René Scharfe @ 2013-04-16 22:21 ` Junio C Hamano 2013-04-17 2:50 ` Ivan Lyapunov 2013-04-17 5:26 ` Junio C Hamano 0 siblings, 2 replies; 28+ messages in thread From: Junio C Hamano @ 2013-04-16 22:21 UTC (permalink / raw) To: René Scharfe; +Cc: Ivan Lyapunov, git, Antoine Pelisse René Scharfe <rene.scharfe@lsrfire.ath.cx> writes: > How about making split_ident_line() a bit friendlier be letting it > provide the epoch as default time stamp instead of NULL? Two knee-jerk concerns I have without going back to the callers: * Would that "0" ever be given to the approxidate parser, which rejects ancient dates in numbers-since-epoch format without @ prefix? * Does any existing caller use the NULL as a sign to see the input was without date and act on that information? > -- >8 -- > Subject: ident: let split_ident_line() provide a default time stamp > > If a commit has a broken time stamp, split_ident_line() sets > date_begin, date_end, tz_begin and tz_end to NULL. Not all callers > are prepared to handle that case and segfault. > > Instead of fixing them and having to be careful while implementing > the next caller, provide a string consisting of the number zero as > default value, representing the UNIX epoch. That's the value that > git log showed before it was converted to use split_ident_line(). > > Reported-by: Ivan Lyapunov <dront78@gmail.com> > Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> > --- > ident.c | 10 ++++++---- > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/ident.c b/ident.c > index 1c123e6..ee840f4 100644 > --- a/ident.c > +++ b/ident.c > @@ -191,6 +191,8 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src) > sb->buf[sb->len] = '\0'; > } > > +static const char zero_string[] = "0"; > + > /* > * Reverse of fmt_ident(); given an ident line, split the fields > * to allow the caller to parse it. > @@ -254,10 +256,10 @@ int split_ident_line(struct ident_split *split, const char *line, int len) > return 0; > > person_only: > - split->date_begin = NULL; > - split->date_end = NULL; > - split->tz_begin = NULL; > - split->tz_end = NULL; > + split->date_begin = zero_string; > + split->date_end = zero_string + strlen(zero_string); > + split->tz_begin = zero_string; > + split->tz_end = zero_string + strlen(zero_string); > return 0; > } ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-16 22:21 ` Junio C Hamano @ 2013-04-17 2:50 ` Ivan Lyapunov 2013-04-17 5:22 ` Ivan Lyapunov 2013-04-17 5:26 ` Junio C Hamano 1 sibling, 1 reply; 28+ messages in thread From: Ivan Lyapunov @ 2013-04-17 2:50 UTC (permalink / raw) To: git Looks like I missed a lot of fun I'm sleeping ;) The current repo is a product of our core crossplatform team, working on Linux/MacOSX/Win32 environment. The Linux environment also separates on Ubuntu/ArchLinux distros, and probably most of repo bugs coming from ArchLinux, since other environments are stable in time. I can't say exact git verison the bugs was introduced, but we have a commits, dated by 0 secs since the Epoch and invalid Author fields also. Also git fsck --full got the following stats on repo Checking object directories: 100% (256/256), done. error in commit 7eeb541987af8a589c6ebee53346c48a13142233: invalid author/committer line - missing space before date error in commit c23b0a487143e5d5d96cdc5354975e95114241ee: invalid author/committer line - missing space before date error in commit c7fa421863e073996b5d1ba6beb6001b9d146cba: invalid author/committer line - missing space before date error in commit 131155bd75c588bcd251b719c483d1d5bcb78504: invalid author/committer line - missing space before date error in commit 0888e7ffe6ae0aaf1b6d1ba67d05715487f88a52: invalid author/committer line - missing space before date error in commit 3cdeddd15c251a13fb3e79844ed3ea0e02cb611a: invalid author/committer line - missing space before date error in commit 21f9fe1565d89da845fc7080495c922103bacf24: invalid author/committer line - missing space before date error in commit f557d3427ba1bb33a1c1fd2c7936efa7e7c70281: invalid author/committer line - missing space before date error in commit c625943779c72b41b08b41730e56126b89cbb7b4: invalid author/committer line - missing space before date error in commit a83fc863991aae2bdad148a5897ed4315792dd82: invalid author/committer line - missing space before date error in commit 207321f773e695b2ae88884c34620bc663383f90: invalid author/committer line - missing space before date error in commit 67368e9eda9892acd6c6ebf03dd6f22b6de2db8a: invalid author/committer line - missing space before date error in commit 525a5d508a7f466a1339752e921517f4db8c4af6: invalid author/committer line - missing space before date error in commit 38215e27f74caa342e3353c4cd548fcf8c1df3dc: invalid author/committer line - missing space before date error in commit 1ee0167194eb34caca2c20ce5c74d062fc898718: invalid author/committer line - missing space before date error in commit 3274c469b981285f9a4d0b0a62afbb8f4d3e93ae: invalid author/committer line - missing space before date error in commit f37ab83f71ca93f42256e05efdd4244eb321efaf: invalid author/committer line - missing space before date error in commit 9bb6f7d63bb5a37b8afc3ae090bd6f34deb68633: invalid author/committer line - missing space before date Checking objects: 100% (9576/9576), done. And I haven't find a way to fix without loosing a commits history, so we left them as it is. I will check the approved patches and writeback a little bit later. Thanks a lot for looking this Ivan 2013/4/17 Junio C Hamano <gitster@pobox.com>: > René Scharfe <rene.scharfe@lsrfire.ath.cx> writes: > >> How about making split_ident_line() a bit friendlier be letting it >> provide the epoch as default time stamp instead of NULL? > > Two knee-jerk concerns I have without going back to the callers: > > * Would that "0" ever be given to the approxidate parser, which > rejects ancient dates in numbers-since-epoch format without @ > prefix? > > * Does any existing caller use the NULL as a sign to see the input > was without date and act on that information? > > >> -- >8 -- >> Subject: ident: let split_ident_line() provide a default time stamp >> >> If a commit has a broken time stamp, split_ident_line() sets >> date_begin, date_end, tz_begin and tz_end to NULL. Not all callers >> are prepared to handle that case and segfault. >> >> Instead of fixing them and having to be careful while implementing >> the next caller, provide a string consisting of the number zero as >> default value, representing the UNIX epoch. That's the value that >> git log showed before it was converted to use split_ident_line(). >> >> Reported-by: Ivan Lyapunov <dront78@gmail.com> >> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> >> --- >> ident.c | 10 ++++++---- >> 1 file changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/ident.c b/ident.c >> index 1c123e6..ee840f4 100644 >> --- a/ident.c >> +++ b/ident.c >> @@ -191,6 +191,8 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src) >> sb->buf[sb->len] = '\0'; >> } >> >> +static const char zero_string[] = "0"; >> + >> /* >> * Reverse of fmt_ident(); given an ident line, split the fields >> * to allow the caller to parse it. >> @@ -254,10 +256,10 @@ int split_ident_line(struct ident_split *split, const char *line, int len) >> return 0; >> >> person_only: >> - split->date_begin = NULL; >> - split->date_end = NULL; >> - split->tz_begin = NULL; >> - split->tz_end = NULL; >> + split->date_begin = zero_string; >> + split->date_end = zero_string + strlen(zero_string); >> + split->tz_begin = zero_string; >> + split->tz_end = zero_string + strlen(zero_string); >> return 0; >> } ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-17 2:50 ` Ivan Lyapunov @ 2013-04-17 5:22 ` Ivan Lyapunov 2013-04-17 8:27 ` John Keeping 0 siblings, 1 reply; 28+ messages in thread From: Ivan Lyapunov @ 2013-04-17 5:22 UTC (permalink / raw) To: git I checked René Scharfe's patch and it works - no more git log crash. Also I checked a broken commits by git show and the most of them created by me. I can confirm I never used anyting else except console git commit or netbeans gui to commit, which is just git gui wrapper tool. Several commits is a just branch merge commits with no changes at all. The modification time across broken commits is around from near end of Oct to end of Nov 2012 and there are good commits between broken maded on the same machine. I hope it helps to reproduce the problem. Also it would be good if there are some way to fix this errors in repo, since we found the same bugs in our Eclipse-based Android project repo Ivan 2013/4/17 Ivan Lyapunov <dront78@gmail.com>: > Looks like I missed a lot of fun I'm sleeping ;) > > The current repo is a product of our core crossplatform team, working > on Linux/MacOSX/Win32 environment. The Linux environment also > separates on Ubuntu/ArchLinux distros, and probably most of repo bugs > coming from ArchLinux, since other environments are stable in time. I > can't say exact git verison the bugs was introduced, but we have a > commits, dated by 0 secs since the Epoch and invalid Author fields > also. Also git fsck --full got the following stats on repo > > Checking object directories: 100% (256/256), done. > error in commit 7eeb541987af8a589c6ebee53346c48a13142233: invalid > author/committer line - missing space before date > error in commit c23b0a487143e5d5d96cdc5354975e95114241ee: invalid > author/committer line - missing space before date > error in commit c7fa421863e073996b5d1ba6beb6001b9d146cba: invalid > author/committer line - missing space before date > error in commit 131155bd75c588bcd251b719c483d1d5bcb78504: invalid > author/committer line - missing space before date > error in commit 0888e7ffe6ae0aaf1b6d1ba67d05715487f88a52: invalid > author/committer line - missing space before date > error in commit 3cdeddd15c251a13fb3e79844ed3ea0e02cb611a: invalid > author/committer line - missing space before date > error in commit 21f9fe1565d89da845fc7080495c922103bacf24: invalid > author/committer line - missing space before date > error in commit f557d3427ba1bb33a1c1fd2c7936efa7e7c70281: invalid > author/committer line - missing space before date > error in commit c625943779c72b41b08b41730e56126b89cbb7b4: invalid > author/committer line - missing space before date > error in commit a83fc863991aae2bdad148a5897ed4315792dd82: invalid > author/committer line - missing space before date > error in commit 207321f773e695b2ae88884c34620bc663383f90: invalid > author/committer line - missing space before date > error in commit 67368e9eda9892acd6c6ebf03dd6f22b6de2db8a: invalid > author/committer line - missing space before date > error in commit 525a5d508a7f466a1339752e921517f4db8c4af6: invalid > author/committer line - missing space before date > error in commit 38215e27f74caa342e3353c4cd548fcf8c1df3dc: invalid > author/committer line - missing space before date > error in commit 1ee0167194eb34caca2c20ce5c74d062fc898718: invalid > author/committer line - missing space before date > error in commit 3274c469b981285f9a4d0b0a62afbb8f4d3e93ae: invalid > author/committer line - missing space before date > error in commit f37ab83f71ca93f42256e05efdd4244eb321efaf: invalid > author/committer line - missing space before date > error in commit 9bb6f7d63bb5a37b8afc3ae090bd6f34deb68633: invalid > author/committer line - missing space before date > Checking objects: 100% (9576/9576), done. > > And I haven't find a way to fix without loosing a commits history, so > we left them as it is. I will check the approved patches and writeback > a little bit later. Thanks a lot for looking this > > Ivan > > 2013/4/17 Junio C Hamano <gitster@pobox.com>: >> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes: >> >>> How about making split_ident_line() a bit friendlier be letting it >>> provide the epoch as default time stamp instead of NULL? >> >> Two knee-jerk concerns I have without going back to the callers: >> >> * Would that "0" ever be given to the approxidate parser, which >> rejects ancient dates in numbers-since-epoch format without @ >> prefix? >> >> * Does any existing caller use the NULL as a sign to see the input >> was without date and act on that information? >> >> >>> -- >8 -- >>> Subject: ident: let split_ident_line() provide a default time stamp >>> >>> If a commit has a broken time stamp, split_ident_line() sets >>> date_begin, date_end, tz_begin and tz_end to NULL. Not all callers >>> are prepared to handle that case and segfault. >>> >>> Instead of fixing them and having to be careful while implementing >>> the next caller, provide a string consisting of the number zero as >>> default value, representing the UNIX epoch. That's the value that >>> git log showed before it was converted to use split_ident_line(). >>> >>> Reported-by: Ivan Lyapunov <dront78@gmail.com> >>> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> >>> --- >>> ident.c | 10 ++++++---- >>> 1 file changed, 6 insertions(+), 4 deletions(-) >>> >>> diff --git a/ident.c b/ident.c >>> index 1c123e6..ee840f4 100644 >>> --- a/ident.c >>> +++ b/ident.c >>> @@ -191,6 +191,8 @@ static void strbuf_addstr_without_crud(struct strbuf *sb, const char *src) >>> sb->buf[sb->len] = '\0'; >>> } >>> >>> +static const char zero_string[] = "0"; >>> + >>> /* >>> * Reverse of fmt_ident(); given an ident line, split the fields >>> * to allow the caller to parse it. >>> @@ -254,10 +256,10 @@ int split_ident_line(struct ident_split *split, const char *line, int len) >>> return 0; >>> >>> person_only: >>> - split->date_begin = NULL; >>> - split->date_end = NULL; >>> - split->tz_begin = NULL; >>> - split->tz_end = NULL; >>> + split->date_begin = zero_string; >>> + split->date_end = zero_string + strlen(zero_string); >>> + split->tz_begin = zero_string; >>> + split->tz_end = zero_string + strlen(zero_string); >>> return 0; >>> } ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-17 5:22 ` Ivan Lyapunov @ 2013-04-17 8:27 ` John Keeping 2013-04-17 9:14 ` Ivan Lyapunov 0 siblings, 1 reply; 28+ messages in thread From: John Keeping @ 2013-04-17 8:27 UTC (permalink / raw) To: Ivan Lyapunov; +Cc: git On Wed, Apr 17, 2013 at 09:22:01AM +0400, Ivan Lyapunov wrote: > I checked René Scharfe's patch and it works - no more git log crash. > Also I checked a broken commits by git show and the most of them > created by me. I can confirm I never used anyting else except console > git commit or netbeans gui to commit, which is just git gui wrapper > tool. Doesn't NetBeans use JGit[1]? That makes it a bit more than a wrapper for the Git command line tools. [1] http://eclipse.org/jgit/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-17 8:27 ` John Keeping @ 2013-04-17 9:14 ` Ivan Lyapunov 2013-04-17 9:43 ` Konstantin Khomoutov 0 siblings, 1 reply; 28+ messages in thread From: Ivan Lyapunov @ 2013-04-17 9:14 UTC (permalink / raw) To: John Keeping, git netbeans use some integrated git wrapper and I don't know about JGit source base or not. In Eclipse we use Egit. Also all broken commits limited to november 2012, but we still continue to use the same development environments without any problems Ivan 2013/4/17 John Keeping <john@keeping.me.uk>: > On Wed, Apr 17, 2013 at 09:22:01AM +0400, Ivan Lyapunov wrote: >> I checked René Scharfe's patch and it works - no more git log crash. >> Also I checked a broken commits by git show and the most of them >> created by me. I can confirm I never used anyting else except console >> git commit or netbeans gui to commit, which is just git gui wrapper >> tool. > > Doesn't NetBeans use JGit[1]? That makes it a bit more than a wrapper > for the Git command line tools. > > [1] http://eclipse.org/jgit/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-17 9:14 ` Ivan Lyapunov @ 2013-04-17 9:43 ` Konstantin Khomoutov [not found] ` <CANKwXW1heci+D5ZO3aF+dMN9davRawuZuKz0bf2n3iRiMjjgHg@mail.gmail.com> 0 siblings, 1 reply; 28+ messages in thread From: Konstantin Khomoutov @ 2013-04-17 9:43 UTC (permalink / raw) To: Ivan Lyapunov; +Cc: John Keeping, git On Wed, 17 Apr 2013 13:14:48 +0400 Ivan Lyapunov <dront78@gmail.com> wrote: > netbeans use some integrated git wrapper and I don't know about JGit > source base or not. In Eclipse we use Egit. Also all broken commits > limited to november 2012, but we still continue to use the same > development environments without any problems Does "the same" also mean "of the same version"? I mean that if, say, you managed to update netbeans after November, 2012 that would explain a lot as the update might just silently pull a fix for the Git-interfacing code. The same might apply to Git itself as well. ^ permalink raw reply [flat|nested] 28+ messages in thread
[parent not found: <CANKwXW1heci+D5ZO3aF+dMN9davRawuZuKz0bf2n3iRiMjjgHg@mail.gmail.com>]
* Re: git log - crash and core dump [not found] ` <CANKwXW1heci+D5ZO3aF+dMN9davRawuZuKz0bf2n3iRiMjjgHg@mail.gmail.com> @ 2013-04-17 10:23 ` Ivan Lyapunov 0 siblings, 0 replies; 28+ messages in thread From: Ivan Lyapunov @ 2013-04-17 10:23 UTC (permalink / raw) To: git missed a list sorry for dup I ment the same because git changes the version too. Also netbeans/eclipse upgrade are not synced, because of different products. So the same ment only names, not versions. But in deep search another repos I found the broken commits in Eclipse repo dated by 13 march 2013. Can them produced because of previous broken commits? And probably you can be right about java git wrappers can produce this issues, I'll try to make a broken repo from scratch later. Ivan 2013/4/17 Ivan Lyapunov <dront78@gmail.com>: > I ment the same because git changes the version too. Also > netbeans/eclipse upgrade are not synced, because of different > products. So the same ment only names, not versions. But in deep > search another repos I found the broken commits in Eclipse repo dated > by 13 march 2013. Can them produced because of previous broken > commits? And probably you can be right about java git wrappers can > produce this issues, I'll try to make a broken repo from scratch > later. > Ivan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-16 22:21 ` Junio C Hamano 2013-04-17 2:50 ` Ivan Lyapunov @ 2013-04-17 5:26 ` Junio C Hamano 2013-04-17 6:39 ` Jeff King 1 sibling, 1 reply; 28+ messages in thread From: Junio C Hamano @ 2013-04-17 5:26 UTC (permalink / raw) To: René Scharfe; +Cc: Ivan Lyapunov, git, Antoine Pelisse Junio C Hamano <gitster@pobox.com> writes: > René Scharfe <rene.scharfe@lsrfire.ath.cx> writes: > >> How about making split_ident_line() a bit friendlier be letting it >> provide the epoch as default time stamp instead of NULL? > > Two knee-jerk concerns I have without going back to the callers: > > * Would that "0" ever be given to the approxidate parser, which > rejects ancient dates in numbers-since-epoch format without @ > prefix? > > * Does any existing caller use the NULL as a sign to see the input > was without date and act on that information? I looked at all the callers (there aren't that many), and none of them did "Do this on a person-only ident, and do that on an ident with timestamp". So for the callers that ignore timestamp, your patch will be a no-op, and for others that assume there is a timestamp, it will turn a crash/segv into output with funny timestamp. So I think the patch is a right thing to do (we would need in-code comment to warn new callers about the semantics, though). Thanks. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-17 5:26 ` Junio C Hamano @ 2013-04-17 6:39 ` Jeff King 2013-04-17 17:51 ` Junio C Hamano 2013-04-17 17:59 ` René Scharfe 0 siblings, 2 replies; 28+ messages in thread From: Jeff King @ 2013-04-17 6:39 UTC (permalink / raw) To: Junio C Hamano; +Cc: René Scharfe, Ivan Lyapunov, git, Antoine Pelisse On Tue, Apr 16, 2013 at 10:26:40PM -0700, Junio C Hamano wrote: > Junio C Hamano <gitster@pobox.com> writes: > > > René Scharfe <rene.scharfe@lsrfire.ath.cx> writes: > > > >> How about making split_ident_line() a bit friendlier be letting it > >> provide the epoch as default time stamp instead of NULL? > > > > Two knee-jerk concerns I have without going back to the callers: > > > > * Would that "0" ever be given to the approxidate parser, which > > rejects ancient dates in numbers-since-epoch format without @ > > prefix? > > > > * Does any existing caller use the NULL as a sign to see the input > > was without date and act on that information? > > I looked at all the callers (there aren't that many), and none of > them did "Do this on a person-only ident, and do that on an ident > with timestamp". So for the callers that ignore timestamp, your > patch will be a no-op, and for others that assume there is a > timestamp, it will turn a crash/segv into output with funny > timestamp. What about sane_ident_split in builtin/commit.c? It explicitly rejects a NULL date. The logic in determine_author_info is a little hard to follow (it assembles the ident line with fmt_ident and then reparses it), but I believe it should be catching a bogus line from "commit -c", or from GIT_AUTHOR_DATE in the environment. As a side note, when determine_author_info sees a bogus ident, it appears to just silently ignore it, which is probably a bad thing. Shouldn't we by complaining? Or am I mis-reading the code? -Peff ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-17 6:39 ` Jeff King @ 2013-04-17 17:51 ` Junio C Hamano 2013-04-17 17:59 ` René Scharfe 1 sibling, 0 replies; 28+ messages in thread From: Junio C Hamano @ 2013-04-17 17:51 UTC (permalink / raw) To: Jeff King; +Cc: René Scharfe, Ivan Lyapunov, git, Antoine Pelisse Jeff King <peff@peff.net> writes: > What about sane_ident_split in builtin/commit.c? It explicitly rejects a > NULL date. The logic in determine_author_info is a little hard to follow > (it assembles the ident line with fmt_ident and then reparses it), but I > believe it should be catching a bogus line from "commit -c", or from > GIT_AUTHOR_DATE in the environment. Yeah, you are of course right. I noticed that "fmt_ident then parse" sequence a bit funny, too. It seems to manually parse to prepare what it feeds fmt_ident. > As a side note, when determine_author_info sees a bogus ident, it > appears to just silently ignore it, which is probably a bad thing. True, too. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-17 6:39 ` Jeff King 2013-04-17 17:51 ` Junio C Hamano @ 2013-04-17 17:59 ` René Scharfe 2013-04-17 18:02 ` Jeff King ` (2 more replies) 1 sibling, 3 replies; 28+ messages in thread From: René Scharfe @ 2013-04-17 17:59 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Ivan Lyapunov, git, Antoine Pelisse Am 17.04.2013 08:39, schrieb Jeff King: > On Tue, Apr 16, 2013 at 10:26:40PM -0700, Junio C Hamano wrote: > >> Junio C Hamano <gitster@pobox.com> writes: >> >>> René Scharfe <rene.scharfe@lsrfire.ath.cx> writes: >>> >>>> How about making split_ident_line() a bit friendlier be letting it >>>> provide the epoch as default time stamp instead of NULL? >>> >>> Two knee-jerk concerns I have without going back to the callers: >>> >>> * Would that "0" ever be given to the approxidate parser, which >>> rejects ancient dates in numbers-since-epoch format without @ >>> prefix? >>> >>> * Does any existing caller use the NULL as a sign to see the input >>> was without date and act on that information? >> >> I looked at all the callers (there aren't that many), and none of >> them did "Do this on a person-only ident, and do that on an ident >> with timestamp". So for the callers that ignore timestamp, your >> patch will be a no-op, and for others that assume there is a >> timestamp, it will turn a crash/segv into output with funny >> timestamp. > > What about sane_ident_split in builtin/commit.c? It explicitly rejects a > NULL date. The logic in determine_author_info is a little hard to follow > (it assembles the ident line with fmt_ident and then reparses it), but I > believe it should be catching a bogus line from "commit -c", or from > GIT_AUTHOR_DATE in the environment. Right, so let's keep the NULLs and fix the individual cases. A quick "git grep -W -e date_begin -e date_end -e tz_begin -e tz_end" reveals that there are only the ones we talked about: blame, pretty, commit and -- of course -- ident. And only the first two need fixing. > As a side note, when determine_author_info sees a bogus ident, it > appears to just silently ignore it, which is probably a bad thing. > Shouldn't we by complaining? Or am I mis-reading the code? The code looks complicated, but I just tried it: fmt_ident() dies if you give it an invalid date. René ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-17 17:59 ` René Scharfe @ 2013-04-17 18:02 ` Jeff King 2013-04-17 19:06 ` René Scharfe 2013-04-17 18:33 ` [PATCH] pretty: handle broken commit headers gracefully René Scharfe 2013-04-17 18:33 ` [PATCH] blame: " René Scharfe 2 siblings, 1 reply; 28+ messages in thread From: Jeff King @ 2013-04-17 18:02 UTC (permalink / raw) To: René Scharfe; +Cc: Junio C Hamano, Ivan Lyapunov, git, Antoine Pelisse On Wed, Apr 17, 2013 at 07:59:28PM +0200, René Scharfe wrote: > >What about sane_ident_split in builtin/commit.c? It explicitly rejects a > >NULL date. The logic in determine_author_info is a little hard to follow > >(it assembles the ident line with fmt_ident and then reparses it), but I > >believe it should be catching a bogus line from "commit -c", or from > >GIT_AUTHOR_DATE in the environment. > > Right, so let's keep the NULLs and fix the individual cases. A quick > "git grep -W -e date_begin -e date_end -e tz_begin -e tz_end" reveals > that there are only the ones we talked about: blame, pretty, commit > and -- of course -- ident. And only the first two need fixing. Yes, that matches the list I came up with in February. I think we also need to do something about "git cat-file -p", which does not use the split_ident_line parser (but has its own problems with the home-grown parser). > >As a side note, when determine_author_info sees a bogus ident, it > >appears to just silently ignore it, which is probably a bad thing. > >Shouldn't we by complaining? Or am I mis-reading the code? > > The code looks complicated, but I just tried it: fmt_ident() dies if > you give it an invalid date. It does seem like determine_author_info can be greatly simplified, but I haven't looked all that closely. -Peff ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-17 18:02 ` Jeff King @ 2013-04-17 19:06 ` René Scharfe 2013-04-17 21:00 ` [PATCH] cat-file: print tags raw for "cat-file -p" Jeff King 0 siblings, 1 reply; 28+ messages in thread From: René Scharfe @ 2013-04-17 19:06 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Ivan Lyapunov, git, Antoine Pelisse Am 17.04.2013 20:02, schrieb Jeff King: > I think we also need to do something about "git cat-file -p", which does > not use the split_ident_line parser (but has its own problems with the > home-grown parser). Ah, while it prints commit object contents verbatim, it formats the date of tags. And it does it without help from tag.c (or ident.c), which in turn does its own parsing as well. So it looks like we have two more candidates for conversion to split_ident_line() here. René ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH] cat-file: print tags raw for "cat-file -p" 2013-04-17 19:06 ` René Scharfe @ 2013-04-17 21:00 ` Jeff King 2013-04-19 3:03 ` Eric Sunshine 0 siblings, 1 reply; 28+ messages in thread From: Jeff King @ 2013-04-17 21:00 UTC (permalink / raw) To: René Scharfe; +Cc: Junio C Hamano, Ivan Lyapunov, git, Antoine Pelisse On Wed, Apr 17, 2013 at 09:06:21PM +0200, René Scharfe wrote: > Am 17.04.2013 20:02, schrieb Jeff King: > >I think we also need to do something about "git cat-file -p", which does > >not use the split_ident_line parser (but has its own problems with the > >home-grown parser). > > Ah, while it prints commit object contents verbatim, it formats the date > of tags. And it does it without help from tag.c (or ident.c), which in > turn does its own parsing as well. So it looks like we have two more > candidates for conversion to split_ident_line() here. I think we should apply the patch below to just drop the date formatting from cat-file, along with your two patches. This is the 4/4 from the series I posted in February: http://thread.gmane.org/gmane.comp.version-control.git/216870/focus=217081 but there I claimed that "git tag -v" might be affected. Upon looking closer, it is not; we accidentally dropped the pretty-printing of the date from it many years ago (and nobody seemed to care). The other patches from that series aren't necessary. The 1/4 is replaced by your patches (which do roughly the same thing, but add nice tests and seem to refactor a bit more). The 2/4 and 3/4 patches were about adding new fsck checks for tags, but I think there is some refactoring necessary there. They can wait for now. -- >8 -- Subject: [PATCH] cat-file: print tags raw for "cat-file -p" When "cat-file -p" prints commits, it shows them in their raw format, since git's format is already human-readable. For tags, however, we print the whole thing raw except for one thing: we convert the timestamp on the tagger line into a human-readable date. This dates all the way back to a0f15fa (Pretty-print tagger dates, 2006-03-01). At that time there was no other way to pretty-print a tag. These days, however, neither of those matters much. The normal way to pretty-print a tag is with "git show", which is much more flexible than "cat-file -p". Commit a0f15fa also built "verify-tag --verbose" (and subsequently "tag -v") around the "cat-file -p" output. However, that behavior was lost in commit 62e09ce (Make git tag a builtin, 2007-07-20), and we went back to printing the raw tag contents. Nobody seems to have noticed the bug since then (and it is arguably a saner behavior anyway, as it shows the actual bytes for which we verified the signature). Let's drop the tagger-date formatting for "cat-file -p". It makes us more consistent with cat-file's commit pretty-printer, and as a bonus, we can drop the hand-rolled tag parsing code in cat-file (which happened to behave inconsistently with the tag pretty-printing code elsewhere). This is a change of output format, so it's possible that some callers could considered this a regression. However, the original behavior was arguably a bug (due to the inconsistency with commits), likely nobody was relying on it (even we do not use it ourselves these days), and anyone relying on the "-p" pretty-printer should be able to expect a change in the output format (i.e., while "cat-file" is plumbing, the output format of "-p" was never guaranteed to be stable). Signed-off-by: Jeff King <peff@peff.net> --- builtin/cat-file.c | 71 ----------------------------------------------------- t/t1006-cat-file.sh | 5 +--- 2 files changed, 1 insertion(+), 75 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 40f87b4..045cee7 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -16,73 +16,6 @@ #define BATCH 1 #define BATCH_CHECK 2 -static void pprint_tag(const unsigned char *sha1, const char *buf, unsigned long size) -{ - /* the parser in tag.c is useless here. */ - const char *endp = buf + size; - const char *cp = buf; - - while (cp < endp) { - char c = *cp++; - if (c != '\n') - continue; - if (7 <= endp - cp && !memcmp("tagger ", cp, 7)) { - const char *tagger = cp; - - /* Found the tagger line. Copy out the contents - * of the buffer so far. - */ - write_or_die(1, buf, cp - buf); - - /* - * Do something intelligent, like pretty-printing - * the date. - */ - while (cp < endp) { - if (*cp++ == '\n') { - /* tagger to cp is a line - * that has ident and time. - */ - const char *sp = tagger; - char *ep; - unsigned long date; - long tz; - while (sp < cp && *sp != '>') - sp++; - if (sp == cp) { - /* give up */ - write_or_die(1, tagger, - cp - tagger); - break; - } - while (sp < cp && - !('0' <= *sp && *sp <= '9')) - sp++; - write_or_die(1, tagger, sp - tagger); - date = strtoul(sp, &ep, 10); - tz = strtol(ep, NULL, 10); - sp = show_date(date, tz, 0); - write_or_die(1, sp, strlen(sp)); - xwrite(1, "\n", 1); - break; - } - } - break; - } - if (cp < endp && *cp == '\n') - /* end of header */ - break; - } - /* At this point, we have copied out the header up to the end of - * the tagger line and cp points at one past \n. It could be the - * next header line after the tagger line, or it could be another - * \n that marks the end of the headers. We need to copy out the - * remainder as is. - */ - if (cp < endp) - write_or_die(1, cp, endp - cp); -} - static int cat_one_file(int opt, const char *exp_type, const char *obj_name) { unsigned char sha1[20]; @@ -133,10 +66,6 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name) buf = read_sha1_file(sha1, &type, &size); if (!buf) die("Cannot read object %s", obj_name); - if (type == OBJ_TAG) { - pprint_tag(sha1, buf, size); - return 0; - } /* otherwise just spit out the data */ break; diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index 9820f70..9cc5c6b 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -135,14 +135,11 @@ tag_size=$(strlen "$tag_content") tag_content="$tag_header_without_timestamp 0000000000 +0000 $tag_description" -tag_pretty_content="$tag_header_without_timestamp Thu Jan 1 00:00:00 1970 +0000 - -$tag_description" tag_sha1=$(echo_without_newline "$tag_content" | git mktag) tag_size=$(strlen "$tag_content") -run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_pretty_content" 1 +run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_content" 1 test_expect_success \ "Reach a blob from a tag pointing to it" \ -- 1.8.2.11.g42401f0 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH] cat-file: print tags raw for "cat-file -p" 2013-04-17 21:00 ` [PATCH] cat-file: print tags raw for "cat-file -p" Jeff King @ 2013-04-19 3:03 ` Eric Sunshine 0 siblings, 0 replies; 28+ messages in thread From: Eric Sunshine @ 2013-04-19 3:03 UTC (permalink / raw) To: Jeff King Cc: René Scharfe, Junio C Hamano, Ivan Lyapunov, Git List, Antoine Pelisse On Wed, Apr 17, 2013 at 5:00 PM, Jeff King <peff@peff.net> wrote: > Subject: [PATCH] cat-file: print tags raw for "cat-file -p" > > When "cat-file -p" prints commits, it shows them in their > raw format, since git's format is already human-readable. > For tags, however, we print the whole thing raw except for > one thing: we convert the timestamp on the tagger line into a > human-readable date. > [...] > Let's drop the tagger-date formatting for "cat-file -p". It > makes us more consistent with cat-file's commit > pretty-printer, and as a bonus, we can drop the hand-rolled > tag parsing code in cat-file (which happened to behave > inconsistently with the tag pretty-printing code elsewhere). > > This is a change of output format, so it's possible that > some callers could considered this a regression. However, s/considered/consider/ > the original behavior was arguably a bug (due to the > inconsistency with commits), likely nobody was relying on it > (even we do not use it ourselves these days), and anyone > relying on the "-p" pretty-printer should be able to expect > a change in the output format (i.e., while "cat-file" is > plumbing, the output format of "-p" was never guaranteed to > be stable). > > Signed-off-by: Jeff King <peff@peff.net> ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH] pretty: handle broken commit headers gracefully 2013-04-17 17:59 ` René Scharfe 2013-04-17 18:02 ` Jeff King @ 2013-04-17 18:33 ` René Scharfe 2013-04-17 18:33 ` [PATCH] blame: " René Scharfe 2 siblings, 0 replies; 28+ messages in thread From: René Scharfe @ 2013-04-17 18:33 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Ivan Lyapunov, Antoine Pelisse Centralize the parsing of the date and time zone strings in the new helper function show_ident_date() and make sure it checks the pointers provided by split_ident_line() for NULL before use. Reported-by: Ivan Lyapunov <dront78@gmail.com> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> --- The first test case passes on v1.8.1, i.e. it also showed the epoch. The second one passes on v1.8.1 and on master because there already is a NULL check in format_person_part(). pretty.c | 45 ++++++++++++++++++++++++--------------------- t/t4211-log-corrupt.sh | 42 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 66 insertions(+), 21 deletions(-) create mode 100755 t/t4211-log-corrupt.sh diff --git a/pretty.c b/pretty.c index d3a82d2..c116248 100644 --- a/pretty.c +++ b/pretty.c @@ -393,6 +393,19 @@ static void add_rfc2047(struct strbuf *sb, const char *line, size_t len, strbuf_addstr(sb, "?="); } +static const char *show_ident_date(const struct ident_split *ident, + enum date_mode mode) +{ + unsigned long date = 0; + int tz = 0; + + if (ident->date_begin && ident->date_end) + date = strtoul(ident->date_begin, NULL, 10); + if (ident->tz_begin && ident->tz_end) + tz = strtol(ident->tz_begin, NULL, 10); + return show_date(date, tz, mode); +} + void pp_user_info(const struct pretty_print_context *pp, const char *what, struct strbuf *sb, const char *line, const char *encoding) @@ -401,12 +414,10 @@ void pp_user_info(const struct pretty_print_context *pp, struct strbuf mail; struct ident_split ident; int linelen; - char *line_end, *date; + char *line_end; const char *mailbuf, *namebuf; size_t namelen, maillen; int max_length = 78; /* per rfc2822 */ - unsigned long time; - int tz; if (pp->fmt == CMIT_FMT_ONELINE) return; @@ -438,8 +449,6 @@ void pp_user_info(const struct pretty_print_context *pp, strbuf_add(&name, namebuf, namelen); namelen = name.len + mail.len + 3; /* ' ' + '<' + '>' */ - time = strtoul(ident.date_begin, &date, 10); - tz = strtol(date, NULL, 10); if (pp->fmt == CMIT_FMT_EMAIL) { strbuf_addstr(sb, "From: "); @@ -472,13 +481,16 @@ void pp_user_info(const struct pretty_print_context *pp, switch (pp->fmt) { case CMIT_FMT_MEDIUM: - strbuf_addf(sb, "Date: %s\n", show_date(time, tz, pp->date_mode)); + strbuf_addf(sb, "Date: %s\n", + show_ident_date(&ident, pp->date_mode)); break; case CMIT_FMT_EMAIL: - strbuf_addf(sb, "Date: %s\n", show_date(time, tz, DATE_RFC2822)); + strbuf_addf(sb, "Date: %s\n", + show_ident_date(&ident, DATE_RFC2822)); break; case CMIT_FMT_FULLER: - strbuf_addf(sb, "%sDate: %s\n", what, show_date(time, tz, pp->date_mode)); + strbuf_addf(sb, "%sDate: %s\n", what, + show_ident_date(&ident, pp->date_mode)); break; default: /* notin' */ @@ -688,8 +700,6 @@ static size_t format_person_part(struct strbuf *sb, char part, { /* currently all placeholders have same length */ const int placeholder_len = 2; - int tz; - unsigned long date = 0; struct ident_split s; const char *name, *mail; size_t maillen, namelen; @@ -716,30 +726,23 @@ static size_t format_person_part(struct strbuf *sb, char part, if (!s.date_begin) goto skip; - date = strtoul(s.date_begin, NULL, 10); - if (part == 't') { /* date, UNIX timestamp */ strbuf_add(sb, s.date_begin, s.date_end - s.date_begin); return placeholder_len; } - /* parse tz */ - tz = strtoul(s.tz_begin + 1, NULL, 10); - if (*s.tz_begin == '-') - tz = -tz; - switch (part) { case 'd': /* date */ - strbuf_addstr(sb, show_date(date, tz, dmode)); + strbuf_addstr(sb, show_ident_date(&s, dmode)); return placeholder_len; case 'D': /* date, RFC2822 style */ - strbuf_addstr(sb, show_date(date, tz, DATE_RFC2822)); + strbuf_addstr(sb, show_ident_date(&s, DATE_RFC2822)); return placeholder_len; case 'r': /* date, relative */ - strbuf_addstr(sb, show_date(date, tz, DATE_RELATIVE)); + strbuf_addstr(sb, show_ident_date(&s, DATE_RELATIVE)); return placeholder_len; case 'i': /* date, ISO 8601 */ - strbuf_addstr(sb, show_date(date, tz, DATE_ISO8601)); + strbuf_addstr(sb, show_ident_date(&s, DATE_ISO8601)); return placeholder_len; } diff --git a/t/t4211-log-corrupt.sh b/t/t4211-log-corrupt.sh new file mode 100755 index 0000000..ec5099b --- /dev/null +++ b/t/t4211-log-corrupt.sh @@ -0,0 +1,42 @@ +#!/bin/sh + +test_description='git log with invalid commit headers' + +. ./test-lib.sh + +test_expect_success 'setup' ' + test_commit foo && + + git cat-file commit HEAD | + sed "/^author /s/>/>-<>/" >broken_email.commit && + git hash-object -w -t commit broken_email.commit >broken_email.hash && + git update-ref refs/heads/broken_email $(cat broken_email.hash) +' + +test_expect_success 'git log with broken author email' ' + { + echo commit $(cat broken_email.hash) + echo "Author: A U Thor <author@example.com>" + echo "Date: Thu Jan 1 00:00:00 1970 +0000" + echo + echo " foo" + } >expect.out && + : >expect.err && + + git log broken_email >actual.out 2>actual.err && + + test_cmp expect.out actual.out && + test_cmp expect.err actual.err +' + +test_expect_success 'git log --format with broken author email' ' + echo "A U Thor+author@example.com+" >expect.out && + : >expect.err && + + git log --format="%an+%ae+%ad" broken_email >actual.out 2>actual.err && + + test_cmp expect.out actual.out && + test_cmp expect.err actual.err +' + +test_done -- 1.8.2.1 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH] blame: handle broken commit headers gracefully 2013-04-17 17:59 ` René Scharfe 2013-04-17 18:02 ` Jeff King 2013-04-17 18:33 ` [PATCH] pretty: handle broken commit headers gracefully René Scharfe @ 2013-04-17 18:33 ` René Scharfe 2013-04-17 21:07 ` Jeff King 2 siblings, 1 reply; 28+ messages in thread From: René Scharfe @ 2013-04-17 18:33 UTC (permalink / raw) To: git; +Cc: Jeff King, Junio C Hamano, Ivan Lyapunov, Antoine Pelisse split_ident_line() can leave us with the pointers date_begin, date_end, tz_begin and tz_end all set to NULL. Check them before use and supply the same fallback values as in the case of a negative return code from split_ident_line(). The "(unknown)" is not actually shown in the output, though, because it will be converted to a number (zero) eventually. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> --- Minimal patch, test case missing. It's a bit sad that the old commit parser of blame handled Ivan's specific corruption (extra "-<>" after email) gracefully because it used the spaces as cutting points instead of "<" and ">". builtin/blame.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/builtin/blame.c b/builtin/blame.c index 86100e9..7770781 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -1375,10 +1375,15 @@ static void get_ac_line(const char *inbuf, const char *what, maillen = ident.mail_end - ident.mail_begin; mailbuf = ident.mail_begin; - *time = strtoul(ident.date_begin, NULL, 10); + if (ident.date_begin && ident.date_end) + *time = strtoul(ident.date_begin, NULL, 10); + else + *time = 0; - len = ident.tz_end - ident.tz_begin; - strbuf_add(tz, ident.tz_begin, len); + if (ident.tz_begin && ident.tz_end) + strbuf_add(tz, ident.tz_begin, ident.tz_end - ident.tz_begin); + else + strbuf_addstr(tz, "(unknown)"); /* * Now, convert both name and e-mail using mailmap -- 1.8.2.1 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH] blame: handle broken commit headers gracefully 2013-04-17 18:33 ` [PATCH] blame: " René Scharfe @ 2013-04-17 21:07 ` Jeff King 2013-04-17 21:22 ` René Scharfe 2013-04-17 21:55 ` Junio C Hamano 0 siblings, 2 replies; 28+ messages in thread From: Jeff King @ 2013-04-17 21:07 UTC (permalink / raw) To: René Scharfe; +Cc: git, Junio C Hamano, Ivan Lyapunov, Antoine Pelisse On Wed, Apr 17, 2013 at 08:33:54PM +0200, René Scharfe wrote: > Minimal patch, test case missing. It's a bit sad that the old commit > parser of blame handled Ivan's specific corruption (extra "-<>" after > email) gracefully because it used the spaces as cutting points instead > of "<" and ">". That may mean there is room for improvement in split_ident_line to be more resilient in removing cruft. With something like: Name <email@host>-<> 123456789 -0000 it would obviously be nice to find the date timestamp there, but I wonder what the "email" field should return? The full broken string, or just "email@host"? One way is convenient for overlooking problems in broken commits, but I would worry about code paths that are using split_ident_line to verify the quality of the string (like determine_author_info). It's possible we would need a strict and a forgiving mode. -Peff ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] blame: handle broken commit headers gracefully 2013-04-17 21:07 ` Jeff King @ 2013-04-17 21:22 ` René Scharfe 2013-04-17 21:55 ` Junio C Hamano 1 sibling, 0 replies; 28+ messages in thread From: René Scharfe @ 2013-04-17 21:22 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano, Ivan Lyapunov, Antoine Pelisse Am 17.04.2013 23:07, schrieb Jeff King: > On Wed, Apr 17, 2013 at 08:33:54PM +0200, René Scharfe wrote: > >> Minimal patch, test case missing. It's a bit sad that the old commit >> parser of blame handled Ivan's specific corruption (extra "-<>" after >> email) gracefully because it used the spaces as cutting points instead >> of "<" and ">". > > That may mean there is room for improvement in split_ident_line to > be more resilient in removing cruft. With something like: > > Name <email@host>-<> 123456789 -0000 > > it would obviously be nice to find the date timestamp there, but I > wonder what the "email" field should return? The full broken string, or > just "email@host"? One way is convenient for overlooking problems in > broken commits, but I would worry about code paths that are using > split_ident_line to verify the quality of the string (like > determine_author_info). It's possible we would need a strict and a > forgiving mode. You can have both; the necessary data is in the struct ident_split: Just check that *mail_end == '>' and mail_end + 1 == date_begin etc. René ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] blame: handle broken commit headers gracefully 2013-04-17 21:07 ` Jeff King 2013-04-17 21:22 ` René Scharfe @ 2013-04-17 21:55 ` Junio C Hamano 2013-04-18 16:56 ` Jeff King 1 sibling, 1 reply; 28+ messages in thread From: Junio C Hamano @ 2013-04-17 21:55 UTC (permalink / raw) To: Jeff King; +Cc: René Scharfe, git, Ivan Lyapunov, Antoine Pelisse Jeff King <peff@peff.net> writes: > On Wed, Apr 17, 2013 at 08:33:54PM +0200, René Scharfe wrote: > >> Minimal patch, test case missing. It's a bit sad that the old commit >> parser of blame handled Ivan's specific corruption (extra "-<>" after >> email) gracefully because it used the spaces as cutting points instead >> of "<" and ">". > > That may mean there is room for improvement in split_ident_line to > be more resilient in removing cruft. With something like: > > Name <email@host>-<> 123456789 -0000 > > it would obviously be nice to find the date timestamp there, but I > wonder what the "email" field should return? The full broken string, or > just "email@host"? Or you can imagine nastier input strings, like Name <>-<email@host> 123456789 -0000 Name <ema>-<il@host> 123456789 -0000 Name <email@host~ 1234>56789 -0000 I am afraid that at some point "we should salvage as much as we can", which is a worthy goal, becomes a losing proposition. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH] blame: handle broken commit headers gracefully 2013-04-17 21:55 ` Junio C Hamano @ 2013-04-18 16:56 ` Jeff King 0 siblings, 0 replies; 28+ messages in thread From: Jeff King @ 2013-04-18 16:56 UTC (permalink / raw) To: Junio C Hamano; +Cc: René Scharfe, git, Ivan Lyapunov, Antoine Pelisse On Wed, Apr 17, 2013 at 02:55:29PM -0700, Junio C Hamano wrote: > Or you can imagine nastier input strings, like > > Name <>-<email@host> 123456789 -0000 > Name <ema>-<il@host> 123456789 -0000 > Name <email@host~ 1234>56789 -0000 > > I am afraid that at some point "we should salvage as much as we > can", which is a worthy goal, becomes a losing proposition. Good point. In the worst cases, even if you cleaned things up, you might even need to allocate a new string (like your middle one), which would make calling split_ident_line a lot more annoying. Probably not worth the effort. -Peff ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-16 19:45 ` Junio C Hamano 2013-04-16 21:10 ` René Scharfe @ 2013-04-16 21:24 ` Antoine Pelisse 2013-04-16 21:34 ` Jeff King 2 siblings, 0 replies; 28+ messages in thread From: Antoine Pelisse @ 2013-04-16 21:24 UTC (permalink / raw) To: Junio C Hamano; +Cc: René Scharfe, Ivan Lyapunov, git On Tue, Apr 16, 2013 at 9:45 PM, Junio C Hamano <gitster@pobox.com> wrote: > It still is curious how a malformed line was created in the first > place. I wouldn't worry if a private tool used hash-object to > create such a commit, but if it is something that is commonly used > (e.g. "git commit"), others may suffer from the same and the tool > needs to be tightened a bit. I already happened to see one like that, and it was clearly imported through remote-hg. I've not been able to reproduce though, and the parser in git-fast-import seemed already robust enough to me to not allow this kind of messed-up line. I will see if I can find some time to reproduce/investigate this deeper. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: git log - crash and core dump 2013-04-16 19:45 ` Junio C Hamano 2013-04-16 21:10 ` René Scharfe 2013-04-16 21:24 ` git log - crash and core dump Antoine Pelisse @ 2013-04-16 21:34 ` Jeff King 2 siblings, 0 replies; 28+ messages in thread From: Jeff King @ 2013-04-16 21:34 UTC (permalink / raw) To: Junio C Hamano; +Cc: René Scharfe, Ivan Lyapunov, git, Antoine Pelisse On Tue, Apr 16, 2013 at 12:45:03PM -0700, Junio C Hamano wrote: > René Scharfe <rene.scharfe@lsrfire.ath.cx> writes: > > > Does this patch help? > > > > pretty.c | 10 ++++++---- > > 1 file changed, 6 insertions(+), 4 deletions(-) > > > > diff --git a/pretty.c b/pretty.c > > index d3a82d2..713eefc 100644 > > --- a/pretty.c > > +++ b/pretty.c > > @@ -405,8 +405,8 @@ void pp_user_info(const struct pretty_print_context *pp, > > const char *mailbuf, *namebuf; > > size_t namelen, maillen; > > int max_length = 78; /* per rfc2822 */ > > - unsigned long time; > > - int tz; > > + unsigned long time = 0; > > + int tz = 0; > > > > if (pp->fmt == CMIT_FMT_ONELINE) > > return; > > @@ -438,8 +438,10 @@ void pp_user_info(const struct pretty_print_context *pp, > > strbuf_add(&name, namebuf, namelen); > > > > namelen = name.len + mail.len + 3; /* ' ' + '<' + '>' */ > > - time = strtoul(ident.date_begin, &date, 10); > > - tz = strtol(date, NULL, 10); > > + if (ident.date_begin) { > > + time = strtoul(ident.date_begin, &date, 10); > > + tz = strtol(date, NULL, 10); > > + } > > > > if (pp->fmt == CMIT_FMT_EMAIL) { > > strbuf_addstr(sb, "From: "); > > Looks like a sensible change. split_ident_line() decided that the > given input was mangled and decided there is no valid date (the > input had <> where the timestamp string was required), so the > updated code leaves the time/tz unspecified. Hmm. This seemed oddly familiar to me, and indeed: http://thread.gmane.org/gmane.comp.version-control.git/216870/focus=217077 We need to fix blame, too, and there is a question of how "cat-file -p" behaves. -Peff ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2013-04-19 3:03 UTC | newest] Thread overview: 28+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-04-16 16:55 git log - crash and core dump Ivan Lyapunov 2013-04-16 17:29 ` Antoine Pelisse 2013-04-16 18:09 ` René Scharfe 2013-04-16 19:45 ` Junio C Hamano 2013-04-16 21:10 ` René Scharfe 2013-04-16 22:21 ` Junio C Hamano 2013-04-17 2:50 ` Ivan Lyapunov 2013-04-17 5:22 ` Ivan Lyapunov 2013-04-17 8:27 ` John Keeping 2013-04-17 9:14 ` Ivan Lyapunov 2013-04-17 9:43 ` Konstantin Khomoutov [not found] ` <CANKwXW1heci+D5ZO3aF+dMN9davRawuZuKz0bf2n3iRiMjjgHg@mail.gmail.com> 2013-04-17 10:23 ` Ivan Lyapunov 2013-04-17 5:26 ` Junio C Hamano 2013-04-17 6:39 ` Jeff King 2013-04-17 17:51 ` Junio C Hamano 2013-04-17 17:59 ` René Scharfe 2013-04-17 18:02 ` Jeff King 2013-04-17 19:06 ` René Scharfe 2013-04-17 21:00 ` [PATCH] cat-file: print tags raw for "cat-file -p" Jeff King 2013-04-19 3:03 ` Eric Sunshine 2013-04-17 18:33 ` [PATCH] pretty: handle broken commit headers gracefully René Scharfe 2013-04-17 18:33 ` [PATCH] blame: " René Scharfe 2013-04-17 21:07 ` Jeff King 2013-04-17 21:22 ` René Scharfe 2013-04-17 21:55 ` Junio C Hamano 2013-04-18 16:56 ` Jeff King 2013-04-16 21:24 ` git log - crash and core dump Antoine Pelisse 2013-04-16 21:34 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).