* [PATCH] fix: prevent date underflow when using positive timezone offset @ 2024-05-27 9:17 darcy via GitGitGadget 2024-05-28 14:05 ` Patrick Steinhardt 2024-06-02 23:06 ` [PATCH v2] date: detect underflow when parsing dates with " darcy via GitGitGadget 0 siblings, 2 replies; 36+ messages in thread From: darcy via GitGitGadget @ 2024-05-27 9:17 UTC (permalink / raw) To: git; +Cc: darcy, darcy From: darcy <acednes@gmail.com> Overriding the date of a commit to be `1970-01-01` with a large enough timezone for the equivalent GMT time to before 1970 is no longer accepted. Example: `GIT_COMMITTER_DATE='1970-01-01T00:00:00+10' git commit` would previously be accepted, only to unexpectedly fail in other parts of the code, such as `git push`. The timestamp is now checked against postitive timezone values. Signed-off-by: darcy <acednes@gmail.com> --- fix: prevent date underflow when using positive timezone offset Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1726%2Fdxrcy%2Fmaster-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1726/dxrcy/master-v1 Pull-Request: https://github.com/git/git/pull/1726 date.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/date.c b/date.c index 7365a4ad24f..8388629f267 100644 --- a/date.c +++ b/date.c @@ -908,7 +908,7 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) match = match_alpha(date, &tm, offset); else if (isdigit(c)) match = match_digit(date, &tm, offset, &tm_gmt); - else if ((c == '-' || c == '+') && isdigit(date[1])) + else if ((c == '-' || c == '+') && isdigit(date[1]) && tm.tm_hour != -1) match = match_tz(date, offset); if (!match) { @@ -937,8 +937,13 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) } } - if (!tm_gmt) + if (!tm_gmt) { + if (*offset > 0 && *offset * 60 > *timestamp) { + return -1; + } *timestamp -= *offset * 60; + } + return 0; /* success */ } base-commit: b9cfe4845cb2562584837bc0101c0ab76490a239 -- gitgitgadget ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH] fix: prevent date underflow when using positive timezone offset 2024-05-27 9:17 [PATCH] fix: prevent date underflow when using positive timezone offset darcy via GitGitGadget @ 2024-05-28 14:05 ` Patrick Steinhardt 2024-05-28 14:49 ` Phillip Wood 2024-05-28 17:06 ` Junio C Hamano 2024-06-02 23:06 ` [PATCH v2] date: detect underflow when parsing dates with " darcy via GitGitGadget 1 sibling, 2 replies; 36+ messages in thread From: Patrick Steinhardt @ 2024-05-28 14:05 UTC (permalink / raw) To: darcy via GitGitGadget; +Cc: git, darcy [-- Attachment #1: Type: text/plain, Size: 2550 bytes --] On Mon, May 27, 2024 at 09:17:06AM +0000, darcy via GitGitGadget wrote: > From: darcy <acednes@gmail.com> The commit message should start with the subsystem that you're touching, which in this case would be "date", e.g.: date: detect underflow when parsing dates with positive timezone offset > Overriding the date of a commit to be `1970-01-01` with a large enough > timezone for the equivalent GMT time to before 1970 is no longer > accepted. Okay. > Example: `GIT_COMMITTER_DATE='1970-01-01T00:00:00+10' git commit` would > previously be accepted, only to unexpectedly fail in other parts of the > code, such as `git push`. The timestamp is now checked against postitive > timezone values. How exactly does the failure look like before and after? > Signed-off-by: darcy <acednes@gmail.com> > --- > fix: prevent date underflow when using positive timezone offset > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1726%2Fdxrcy%2Fmaster-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1726/dxrcy/master-v1 > Pull-Request: https://github.com/git/git/pull/1726 > > date.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/date.c b/date.c > index 7365a4ad24f..8388629f267 100644 > --- a/date.c > +++ b/date.c > @@ -908,7 +908,7 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) > match = match_alpha(date, &tm, offset); > else if (isdigit(c)) > match = match_digit(date, &tm, offset, &tm_gmt); > - else if ((c == '-' || c == '+') && isdigit(date[1])) > + else if ((c == '-' || c == '+') && isdigit(date[1]) && tm.tm_hour != -1) > match = match_tz(date, offset); Without having a deep understanding of the code I don't quite see the connection between this change and the problem description. Is it necessary? If so, it might help to explain why it's needed in the commit message or in the code. > if (!match) { > @@ -937,8 +937,13 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) > } > } > > - if (!tm_gmt) > + if (!tm_gmt) { > + if (*offset > 0 && *offset * 60 > *timestamp) { > + return -1; > + } Nit: we don't add curly braces around one-line conditional bodies. This change here is the meat of it and looks like I'd expect. > *timestamp -= *offset * 60; > + } > + > return 0; /* success */ > } You should also add at least one test. Thanks for your contribution! Patrick [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH] fix: prevent date underflow when using positive timezone offset 2024-05-28 14:05 ` Patrick Steinhardt @ 2024-05-28 14:49 ` Phillip Wood 2024-05-28 17:06 ` Junio C Hamano 1 sibling, 0 replies; 36+ messages in thread From: Phillip Wood @ 2024-05-28 14:49 UTC (permalink / raw) To: Patrick Steinhardt, darcy via GitGitGadget; +Cc: git, darcy On 28/05/2024 15:05, Patrick Steinhardt wrote: > On Mon, May 27, 2024 at 09:17:06AM +0000, darcy via GitGitGadget wrote: >> From: darcy <acednes@gmail.com> >> diff --git a/date.c b/date.c >> index 7365a4ad24f..8388629f267 100644 >> --- a/date.c >> +++ b/date.c >> @@ -908,7 +908,7 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) >> match = match_alpha(date, &tm, offset); >> else if (isdigit(c)) >> match = match_digit(date, &tm, offset, &tm_gmt); >> - else if ((c == '-' || c == '+') && isdigit(date[1])) >> + else if ((c == '-' || c == '+') && isdigit(date[1]) && tm.tm_hour != -1) >> match = match_tz(date, offset); > > Without having a deep understanding of the code I don't quite see the > connection between this change and the problem description. Is it > necessary? If so, it might help to explain why it's needed in the commit > message or in the code. I was wondering about this change too >> if (!match) { >> @@ -937,8 +937,13 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) >> } >> } >> >> - if (!tm_gmt) >> + if (!tm_gmt) { >> + if (*offset > 0 && *offset * 60 > *timestamp) { >> + return -1; >> + } > > Nit: we don't add curly braces around one-line conditional bodies. > > This change here is the meat of it and looks like I'd expect. > >> *timestamp -= *offset * 60; Do we also want to check for overflows in the other direction (a large timestamp with a negative timezone offset)? Best Wishes Phillip ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH] fix: prevent date underflow when using positive timezone offset 2024-05-28 14:05 ` Patrick Steinhardt 2024-05-28 14:49 ` Phillip Wood @ 2024-05-28 17:06 ` Junio C Hamano 1 sibling, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2024-05-28 17:06 UTC (permalink / raw) To: Patrick Steinhardt; +Cc: darcy via GitGitGadget, git, darcy Patrick Steinhardt <ps@pks.im> writes: > On Mon, May 27, 2024 at 09:17:06AM +0000, darcy via GitGitGadget wrote: >> From: darcy <acednes@gmail.com> > > The commit message should start with the subsystem that you're touching, > which in this case would be "date", e.g.: > > date: detect underflow when parsing dates with positive timezone offset > >> Overriding the date of a commit to be `1970-01-01` with a large enough >> timezone for the equivalent GMT time to before 1970 is no longer >> accepted. > > Okay. "is no longer accepted" made me read the sentence three times to get what the author wants to say. Initially I thought the author wanted to report a regression where we used to accept but with a recent change we stopped accepting. In our convention, we present the status quo, point out why it is awkard/incorrect/bad, and then propose a new behaviour. Overriding ... before 1970 BEHAVES THIS WAY. This leads to BAD BEHAVIOUR FOR SUCH AND SUCH REASONS. Instead check the timezone offset and fail if the resulting time becomes before the epoch, "1970-01-01T00:00:00Z", when parsing. with the blanks filled in appropriately would have been much easier to see. >> Example: `GIT_COMMITTER_DATE='1970-01-01T00:00:00+10' git commit` would >> previously be accepted, only to unexpectedly fail in other parts of the >> code, such as `git push`. The timestamp is now checked against postitive >> timezone values. > > How exactly does the failure look like before and after? Yes, good question. >> Signed-off-by: darcy <acednes@gmail.com> >> --- I cannot offhand tell if Documentation/SubmittingPatches:real-name is followed here or ignored, so just to double check... Everything else in your review made sense to me. I guess that checking for tm_hour is assuming that TZ offset should always come before the values necessary to compute the timestamp comes, but it smells like an unwarranted assumption and not explaining the change in the proposed log message is a bad sign. Thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-05-27 9:17 [PATCH] fix: prevent date underflow when using positive timezone offset darcy via GitGitGadget 2024-05-28 14:05 ` Patrick Steinhardt @ 2024-06-02 23:06 ` darcy via GitGitGadget 2024-06-03 11:13 ` Junio C Hamano 2024-06-07 0:17 ` [PATCH v3] date: detect underflow/overflow when parsing dates with " darcy via GitGitGadget 1 sibling, 2 replies; 36+ messages in thread From: darcy via GitGitGadget @ 2024-06-02 23:06 UTC (permalink / raw) To: git; +Cc: darcy, darcy From: darcy <acednes@gmail.com> Overriding the date of a commit to be close to "1970-01-01 00:00:00" with a large enough timezone for the equivelant GMT time to be before the epoch is considered valid by `parse_date_basic`. This leads to an integer underflow in the commit timestamp, which is not caught by `git-commit`, but will cause other services to fail, such as `git-fsck`, which reports "badDateOverflow: invalid author/committer line - date causes integer overflow". Instead check the timezone offset and fail if the resulting time comes before the epoch, "1970-01-01T00:00:00Z", when parsing. Signed-off-by: Darcy Burke <acednes@gmail.com> --- fix: prevent date underflow when using positive timezone offset cc: Patrick Steinhardt ps@pks.im cc: Phillip Wood phillip.wood123@gmail.com Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1726%2Fdxrcy%2Fmaster-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1726/dxrcy/master-v2 Pull-Request: https://github.com/git/git/pull/1726 Range-diff vs v1: 1: 4542d984aab ! 1: db508b2f533 fix: prevent date underflow when using positive timezone offset @@ Metadata Author: darcy <acednes@gmail.com> ## Commit message ## - fix: prevent date underflow when using positive timezone offset + date: detect underflow when parsing dates with positive timezone offset - Overriding the date of a commit to be `1970-01-01` with a large enough - timezone for the equivalent GMT time to before 1970 is no longer - accepted. + Overriding the date of a commit to be close to "1970-01-01 00:00:00" + with a large enough timezone for the equivelant GMT time to be before + the epoch is considered valid by `parse_date_basic`. - Example: `GIT_COMMITTER_DATE='1970-01-01T00:00:00+10' git commit` would - previously be accepted, only to unexpectedly fail in other parts of the - code, such as `git push`. The timestamp is now checked against postitive - timezone values. + This leads to an integer underflow in the commit timestamp, which is not + caught by `git-commit`, but will cause other services to fail, such as + `git-fsck`, which reports "badDateOverflow: invalid author/committer + line - date causes integer overflow". - Signed-off-by: darcy <acednes@gmail.com> + Instead check the timezone offset and fail if the resulting time comes + before the epoch, "1970-01-01T00:00:00Z", when parsing. + + Signed-off-by: Darcy Burke <acednes@gmail.com> ## date.c ## -@@ date.c: int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) - match = match_alpha(date, &tm, offset); - else if (isdigit(c)) - match = match_digit(date, &tm, offset, &tm_gmt); -- else if ((c == '-' || c == '+') && isdigit(date[1])) -+ else if ((c == '-' || c == '+') && isdigit(date[1]) && tm.tm_hour != -1) - match = match_tz(date, offset); - - if (!match) { @@ date.c: int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) } } - if (!tm_gmt) + if (!tm_gmt) { -+ if (*offset > 0 && *offset * 60 > *timestamp) { ++ if (*offset > 0 && *offset * 60 > *timestamp) + return -1; -+ } *timestamp -= *offset * 60; + } + return 0; /* success */ } + + ## t/t0006-date.sh ## +@@ t/t0006-date.sh: check_parse '2008-02-14 20:30:45 -:30' '2008-02-14 20:30:45 +0000' + check_parse '2008-02-14 20:30:45 -05:00' '2008-02-14 20:30:45 -0500' + check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 -0500' EST5 + check_parse 'Thu, 7 Apr 2005 15:14:13 -0700' '2005-04-07 15:14:13 -0700' ++check_parse '1970-01-01 00:00:00' '1970-01-01 00:00:00 +0000' ++check_parse '1970-01-01 00:00:00 +00' '1970-01-01 00:00:00 +0000' ++check_parse '1970-01-01 00:00:00 Z' '1970-01-01 00:00:00 +0000' ++check_parse '1970-01-01 00:00:00 -01' '1970-01-01 00:00:00 -0100' ++check_parse '1970-01-01 00:00:00 +01' bad ++check_parse '1970-01-01 00:00:00 +11' bad ++check_parse '1970-01-01 00:59:59 +01' bad ++check_parse '1970-01-01 01:00:00 +01' '1970-01-01 01:00:00 +0100' ++check_parse '1970-01-01 01:00:00 +11' bad ++check_parse '1970-01-02 00:00:00 +11' '1970-01-02 00:00:00 +1100' ++check_parse '1969-12-31 23:59:59' bad ++check_parse '1969-12-31 23:59:59 +00' bad ++check_parse '1969-12-31 23:59:59 Z' bad ++check_parse '1969-12-31 23:59:59 +11' bad ++check_parse '1969-12-31 23:59:59 -11' bad + + check_approxidate() { + echo "$1 -> $2 +0000" >expect date.c | 6 +++++- t/t0006-date.sh | 15 +++++++++++++++ 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/date.c b/date.c index 7365a4ad24f..8e3ec1bcb00 100644 --- a/date.c +++ b/date.c @@ -937,8 +937,12 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) } } - if (!tm_gmt) + if (!tm_gmt) { + if (*offset > 0 && *offset * 60 > *timestamp) + return -1; *timestamp -= *offset * 60; + } + return 0; /* success */ } diff --git a/t/t0006-date.sh b/t/t0006-date.sh index 3031256d143..cdbb40bec01 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -116,6 +116,21 @@ check_parse '2008-02-14 20:30:45 -:30' '2008-02-14 20:30:45 +0000' check_parse '2008-02-14 20:30:45 -05:00' '2008-02-14 20:30:45 -0500' check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 -0500' EST5 check_parse 'Thu, 7 Apr 2005 15:14:13 -0700' '2005-04-07 15:14:13 -0700' +check_parse '1970-01-01 00:00:00' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 +00' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 Z' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 -01' '1970-01-01 00:00:00 -0100' +check_parse '1970-01-01 00:00:00 +01' bad +check_parse '1970-01-01 00:00:00 +11' bad +check_parse '1970-01-01 00:59:59 +01' bad +check_parse '1970-01-01 01:00:00 +01' '1970-01-01 01:00:00 +0100' +check_parse '1970-01-01 01:00:00 +11' bad +check_parse '1970-01-02 00:00:00 +11' '1970-01-02 00:00:00 +1100' +check_parse '1969-12-31 23:59:59' bad +check_parse '1969-12-31 23:59:59 +00' bad +check_parse '1969-12-31 23:59:59 Z' bad +check_parse '1969-12-31 23:59:59 +11' bad +check_parse '1969-12-31 23:59:59 -11' bad check_approxidate() { echo "$1 -> $2 +0000" >expect base-commit: 9eaef5822cd76bbeb53b6479ce0ddaad34ee2b14 -- gitgitgadget ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-06-02 23:06 ` [PATCH v2] date: detect underflow when parsing dates with " darcy via GitGitGadget @ 2024-06-03 11:13 ` Junio C Hamano 2024-06-03 11:44 ` darcy 2024-06-07 0:17 ` [PATCH v3] date: detect underflow/overflow when parsing dates with " darcy via GitGitGadget 1 sibling, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2024-06-03 11:13 UTC (permalink / raw) To: darcy via GitGitGadget; +Cc: git, darcy "darcy via GitGitGadget" <gitgitgadget@gmail.com> writes: > fix: prevent date underflow when using positive timezone offset > > cc: Patrick Steinhardt ps@pks.im cc: Phillip Wood > phillip.wood123@gmail.com You're expected to respond to review comments before you send in updated patches. I didn't see the review comments responded to in the thread: https://lore.kernel.org/git/pull.1726.git.git.1716801427015.gitgitgadget@gmail.com/ Please see "A typical life cycle of a patch series" section of the SubmittingPatches document. https://git.github.io/htmldocs/SubmittingPatches.html#patch-flow Step #3 and Step #4 are distinct. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-06-03 11:13 ` Junio C Hamano @ 2024-06-03 11:44 ` darcy 2024-06-03 14:13 ` Phillip Wood 0 siblings, 1 reply; 36+ messages in thread From: darcy @ 2024-06-03 11:44 UTC (permalink / raw) To: Junio C Hamano, darcy via GitGitGadget; +Cc: git > Without having a deep understanding of the code I don't quite see the > connection between this change and the problem description. Is it > necessary? If so, it might help to explain why it's needed in the commit > message or in the code. > ...I guess that > checking for tm_hour is assuming that TZ offset should always come > before the values necessary to compute the timestamp comes, but it > smells like an unwarranted assumption and not explaining the change > in the proposed log message is a bad sign. This line has been reverted. The point was it would only parse the timezone offset if it occurs after the time part of the date, but I have realized that this is unrelated to the purpose of this change. > You should also add at least one test. Yep, thanks, added now to `t0006-date.sh`. > Do we also want to check for overflows in the other direction (a large timestamp with a negative timezone offset)? Is this something people want added? I am happy to implement this if so, though it wasn't my original intention. Issues with the commit message should also be resolved. Thank you everyone for your patience :) On 3/6/24 21:13, Junio C Hamano wrote: > "darcy via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> fix: prevent date underflow when using positive timezone offset >> >> cc: Patrick Steinhardt ps@pks.im cc: Phillip Wood >> phillip.wood123@gmail.com > You're expected to respond to review comments before you send in > updated patches. I didn't see the review comments responded to in > the thread: > > https://lore.kernel.org/git/pull.1726.git.git.1716801427015.gitgitgadget@gmail.com/ > > Please see "A typical life cycle of a patch series" section of the > SubmittingPatches document. > > https://git.github.io/htmldocs/SubmittingPatches.html#patch-flow > > Step #3 and Step #4 are distinct. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-06-03 11:44 ` darcy @ 2024-06-03 14:13 ` Phillip Wood 2024-06-04 8:48 ` darcy 0 siblings, 1 reply; 36+ messages in thread From: Phillip Wood @ 2024-06-03 14:13 UTC (permalink / raw) To: darcy, Junio C Hamano, darcy via GitGitGadget; +Cc: git, Patrick Steinhardt Hi darcy On 03/06/2024 12:44, darcy wrote: >> Do we also want to check for overflows in the other direction (a large >> timestamp with a negative timezone offset)? > > Is this something people want added? I am happy to implement this if > so, though it wasn't my original intention. I think if we're worried about the date overflowing in one direction it makes sense to fix overflows in the other direction at the same time especially as I think that the other case involves a signed integer overflow which is undefined behavior in C. Best Wishes Phillip ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-06-03 14:13 ` Phillip Wood @ 2024-06-04 8:48 ` darcy 2024-06-04 9:33 ` Jeff King 2024-06-05 13:10 ` Phillip Wood 0 siblings, 2 replies; 36+ messages in thread From: darcy @ 2024-06-04 8:48 UTC (permalink / raw) To: phillip.wood, Junio C Hamano, darcy via GitGitGadget Cc: git, Patrick Steinhardt On 4/6/24 00:13, Phillip Wood wrote: > Hi darcy > On 03/06/2024 12:44, darcy wrote: >>> Do we also want to check for overflows in the other direction (a large >>> timestamp with a negative timezone offset)? >> Is this something people want added? I am happy to implement this if >> so, though it wasn't my original intention. > I think if we're worried about the date overflowing in one direction it > makes sense to fix overflows in the other direction at the same time > especially as I think that the other case involves a signed integer > overflow which is undefined behavior in C. That makes sense. Though I am reading the `tm_to_time_t` code now and it only allows years up to 2099. > if (year < 0 || year > 129) /* algo only works for 1970-2099 */ > return -1; I can of course add a check here for dates close to the end of 2099, but it seems that the bigger issue is that some day people will want to use Git after 2099... Should I see if I can extend this range? I'm not sure where that specific year comes from, it doesn't seem to be based on a limit of the size of `time_t`, and the comment or git logs don't seem to provide a reason. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-06-04 8:48 ` darcy @ 2024-06-04 9:33 ` Jeff King 2024-06-05 6:52 ` darcy 2024-06-05 13:10 ` Phillip Wood 1 sibling, 1 reply; 36+ messages in thread From: Jeff King @ 2024-06-04 9:33 UTC (permalink / raw) To: darcy Cc: phillip.wood, Junio C Hamano, darcy via GitGitGadget, git, Patrick Steinhardt On Tue, Jun 04, 2024 at 06:48:59PM +1000, darcy wrote: > Though I am reading the `tm_to_time_t` code now and it only allows years > up to 2099. > > > if (year < 0 || year > 129) /* algo only works for 1970-2099 */ > > return -1; > > I can of course add a check here for dates close to the end of 2099, but > it seems that the bigger issue is that some day people will want to use > Git after 2099... Should I see if I can extend this range? I'm not sure > where that specific year comes from, it doesn't seem to be based on a > limit of the size of `time_t`, and the comment or git logs don't seem to > provide a reason. I think the 2099 limitation is there because 2100 is not a leap year, and the code has over-simplified computation there (it counts every 4th year as a leap year, which is not generally true for century boundaries, except for 2000 because it is divisible by 400). From the log message of ecee9d9e79 ([PATCH] Do date parsing by hand..., 2005-04-30), I think the 1970 limitation may be about avoiding non-60-minute DST (though it is not clear to me how the function cares about dst either way, as it operates on UTC). You can't represent times before 1970 in our code base anyway, because our epoch timestamp is unsigned. So yes, we could do better than that function. I worked on a series to handle negative timestamps a while ago, and as part of that, I had to replace this function. You can see the full series at the "jk/time-negative-wip" branch of https://github.com/peff/git, but I'll reproduce the commit in question below for the list archive. The resulting series worked fine for me, but IIRC there was some issue on Windows. I think its gmtime() did not understand negative timestamps, maybe? And we'd need the reverse equivalent of the patch below (which we could take from the same source). There are some overflow checks already in date.c (grep for "overflow") that might cover us, but I won't be surprised if there's a case that's missing (although we store the timestamps as uintmax_t, so you're unlikely to overflow that in practice). Underflow is more likely, though if we finally support negative timestamps, then it becomes a non-issue (and we'd actually want to revert the patch you're working on). If you're interested in picking up the negative timestamp work, I'd be happy to discuss it. It's been on my todo list for a long time, but I never quite get around to it. Anyway, here's the tm_to_time_t() cleanup. -- >8 -- date: make tm_to_time_t() less restricted We've been using a hand-rolled tm_to_time_t() conversion since ecee9d9e79 ([PATCH] Do date parsing by hand..., 2005-04-30). It works fine for recent dates, but it doesn't work before 1970 nor after 2099. Let's lift those restrictions by using a more robust algorithm. The days_from_civil() code here is lifted from Howard Hinnant's date library, which can be found at: https://github.com/HowardHinnant/date The code is in the year_month_day::to_days() function, but I also relied on his excellent explanation in: http://howardhinnant.github.io/date_algorithms.html The library is under the MIT license, which is GPL compatible. I've also left a comment in the source file marking the licensing (the code is substantially similar to the original; I only expanded the variable names to make it easier to follow). Ideally we could just rely on a system function to do this, but there isn't one. POSIX specifies mktime(), but it uses the local timezone. There's a timegm() function available in glibc and some BSD systems, as well as a similar function on Windows (with a different name). However, it's likely that there are still going to be some platforms without it, so we'd need a fallback anyway. We may as well just use that fallback everywhere to make sure that Git behaves consistently (and get more complete test coverage). Note that some callers, like parse_date_basic(), rely on tm_to_time_t() complaining when we have negative values in any field. So we'll continue to detect and complain about this case (the alternative would be to turn "month -1" into "month 11" of the previous year, and so on). Signed-off-by: Jeff King <peff@peff.net> --- date.c | 58 ++++++++++++++++++++++++++++++++++++++++++---------------- 1 file changed, 42 insertions(+), 16 deletions(-) diff --git a/date.c b/date.c index 4337e18abd..df6c414c78 100644 --- a/date.c +++ b/date.c @@ -10,29 +10,55 @@ #include "pager.h" #include "strbuf.h" +/* + * Convert a year-month-day time into a number of days since 1970 (possibly + * negative). Note that "year" is the full year (not offset from 1900), and + * "month" is in the range 1-12 (unlike a "struct tm" 0-11). + * + * Working in time_t is overkill (since it usually represents seconds), but + * this makes sure we don't hit any integer range issues. + * + * This function is taken from https://github.com/HowardHinnant/date, + * which is released under an MIT license. + */ +static time_t days_from_civil(int year, int month, int day) +{ + time_t era; + unsigned year_of_era, day_of_year, day_of_era; + + year -= month <= 2; + era = (year >= 0 ? year : year - 399) / 400; + year_of_era = year - era * 400; + day_of_year = (153 * (month + (month > 2 ? -3 : 9)) + 2) / 5 + day - 1; + day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + + day_of_year; + return era * 146097 + (time_t)day_of_era - 719468; +} + /* * This is like mktime, but without normalization of tm_wday and tm_yday. * It also always operates in UTC, not the local timezone. */ time_t tm_to_time_t(const struct tm *tm) { - static const int mdays[] = { - 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 - }; - int year = tm->tm_year - 70; - int month = tm->tm_mon; - int day = tm->tm_mday; - - if (year < 0 || year > 129) /* algo only works for 1970-2099 */ - return -1; - if (month < 0 || month > 11) /* array bounds */ - return -1; - if (month < 2 || (year + 2) % 4) - day--; - if (tm->tm_hour < 0 || tm->tm_min < 0 || tm->tm_sec < 0) + time_t days, hours, minutes, seconds; + + if (tm->tm_year < 0 || + tm->tm_mon < 0 || + tm->tm_mday < 0 || + tm->tm_hour < 0 || + tm->tm_min < 0 || + tm->tm_sec < 0) return -1; - return (year * 365 + (year + 1) / 4 + mdays[month] + day) * 24*60*60UL + - tm->tm_hour * 60*60 + tm->tm_min * 60 + tm->tm_sec; + + days = days_from_civil(tm->tm_year + 1900, + tm->tm_mon + 1, + tm->tm_mday); + hours = 24 * days + tm->tm_hour; + minutes = 60 * hours + tm->tm_min; + seconds = 60 * minutes + tm->tm_sec; + + return seconds; } static const char *month_names[] = { ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-06-04 9:33 ` Jeff King @ 2024-06-05 6:52 ` darcy 0 siblings, 0 replies; 36+ messages in thread From: darcy @ 2024-06-05 6:52 UTC (permalink / raw) To: Jeff King Cc: phillip.wood, Junio C Hamano, darcy via GitGitGadget, git, Patrick Steinhardt On 4/6/24 19:33, Jeff King wrote: > I think the 2099 limitation is there because 2100 is not a leap year, > and the code has over-simplified computation there (it counts every 4th > year as a leap year, which is not generally true for century boundaries, > except for 2000 because it is divisible by 400). Ah, that makes sense, thanks. > If you're interested in picking up the negative timestamp work, I'd be > happy to discuss it. It's been on my todo list for a long time, but I > never quite get around to it. I may look at this in the future, but this looks a little out of my depth at the moment, as I am still unfamiliar with the codebase. For the moment, I will just add the check at year 2099, and possibly will revisit this when I have more time on my hands. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-06-04 8:48 ` darcy 2024-06-04 9:33 ` Jeff King @ 2024-06-05 13:10 ` Phillip Wood 2024-06-05 17:27 ` Junio C Hamano 1 sibling, 1 reply; 36+ messages in thread From: Phillip Wood @ 2024-06-05 13:10 UTC (permalink / raw) To: darcy, phillip.wood, Junio C Hamano, darcy via GitGitGadget Cc: git, Patrick Steinhardt, Jeff King On 04/06/2024 09:48, darcy wrote: > On 4/6/24 00:13, Phillip Wood wrote: > >> Hi darcy >> On 03/06/2024 12:44, darcy wrote: >>>> Do we also want to check for overflows in the other direction (a large >>>> timestamp with a negative timezone offset)? >>> Is this something people want added? I am happy to implement this if >>> so, though it wasn't my original intention. >> I think if we're worried about the date overflowing in one direction it >> makes sense to fix overflows in the other direction at the same time >> especially as I think that the other case involves a signed integer >> overflow which is undefined behavior in C. > > That makes sense. > > Though I am reading the `tm_to_time_t` code now and it only allows years > up to 2099. > >> if (year < 0 || year > 129) /* algo only works for 1970-2099 */ >> return -1; Thanks for taking a closer look, I wasn't aware of that. Reading Peff's reply it seems like timestamp is actually unsigned and as we limit the maximum year to 2099 it seems unlikely we'll overflow from too larger date after all. Best Wishes Phillip > I can of course add a check here for dates close to the end of 2099, but > it seems that the bigger issue is that some day people will want to use > Git after 2099... Should I see if I can extend this range? I'm not sure > where that specific year comes from, it doesn't seem to be based on a > limit of the size of `time_t`, and the comment or git logs don't seem to > provide a reason. > > ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-06-05 13:10 ` Phillip Wood @ 2024-06-05 17:27 ` Junio C Hamano 2024-06-06 4:56 ` darcy 0 siblings, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2024-06-05 17:27 UTC (permalink / raw) To: Phillip Wood Cc: darcy, phillip.wood, darcy via GitGitGadget, git, Patrick Steinhardt, Jeff King Phillip Wood <phillip.wood123@gmail.com> writes: > Thanks for taking a closer look, I wasn't aware of that. Reading > Peff's reply it seems like timestamp is actually unsigned and as we > limit the maximum year to 2099 it seems unlikely we'll overflow from > too larger date after all. Thanks all. I haven't seen one question I posed answered, though. Has https://git.github.io/htmldocs/SubmittingPatches.html#real-name been followed in this patch when signing off the patch? ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2] date: detect underflow when parsing dates with positive timezone offset 2024-06-05 17:27 ` Junio C Hamano @ 2024-06-06 4:56 ` darcy 0 siblings, 0 replies; 36+ messages in thread From: darcy @ 2024-06-06 4:56 UTC (permalink / raw) To: Junio C Hamano, Phillip Wood Cc: phillip.wood, darcy via GitGitGadget, git, Patrick Steinhardt, Jeff King On 6/6/24 03:27, Junio C Hamano wrote: > Phillip Wood <phillip.wood123@gmail.com> writes: > >> Thanks for taking a closer look, I wasn't aware of that. Reading >> Peff's reply it seems like timestamp is actually unsigned and as we >> limit the maximum year to 2099 it seems unlikely we'll overflow from >> too larger date after all. > Thanks all. > > I haven't seen one question I posed answered, though. Has > https://git.github.io/htmldocs/SubmittingPatches.html#real-name > been followed in this patch when signing off the patch? I have changed the line in the commit to properly follow that rule. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-02 23:06 ` [PATCH v2] date: detect underflow when parsing dates with " darcy via GitGitGadget 2024-06-03 11:13 ` Junio C Hamano @ 2024-06-07 0:17 ` darcy via GitGitGadget 2024-06-07 17:40 ` Junio C Hamano 2024-06-12 9:49 ` Karthik Nayak 1 sibling, 2 replies; 36+ messages in thread From: darcy via GitGitGadget @ 2024-06-07 0:17 UTC (permalink / raw) To: git; +Cc: darcy, darcy From: darcy <acednes@gmail.com> Overriding the date of a commit to be close to "1970-01-01 00:00:00" with a large enough positive timezone for the equivelant GMT time to be before the epoch is considered valid by `parse_date_basic`. Similar behaviour occurs when using a date close to "2099-12-31 23:59:59" (the maximum date allowed by `tm_to_time_t`) with a large enough negative timezone offset. This leads to an integer underflow or underflow respectively in the commit timestamp, which is not caught by `git-commit`, but will cause other services to fail, such as `git-fsck`, which, for the first case, reports "badDateOverflow: invalid author/committer line - date causes integer overflow". Instead check the timezone offset and fail if the resulting time comes before the epoch "1970-01-01T00:00:00Z" or after the maximum date "2099-12-31T23:59:59Z". Signed-off-by: Darcy Burke <acednes@gmail.com> --- fix: prevent date underflow when using positive timezone offset cc: Patrick Steinhardt ps@pks.im cc: Phillip Wood phillip.wood123@gmail.com cc: darcy acednes@gmail.com cc: Jeff King peff@peff.net Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1726%2Fdxrcy%2Fmaster-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1726/dxrcy/master-v3 Pull-Request: https://github.com/git/git/pull/1726 Range-diff vs v2: 1: db508b2f533 ! 1: f0de3a2c543 date: detect underflow when parsing dates with positive timezone offset @@ Metadata Author: darcy <acednes@gmail.com> ## Commit message ## - date: detect underflow when parsing dates with positive timezone offset + date: detect underflow/overflow when parsing dates with timezone offset Overriding the date of a commit to be close to "1970-01-01 00:00:00" - with a large enough timezone for the equivelant GMT time to be before - the epoch is considered valid by `parse_date_basic`. + with a large enough positive timezone for the equivelant GMT time to be + before the epoch is considered valid by `parse_date_basic`. Similar + behaviour occurs when using a date close to "2099-12-31 23:59:59" (the + maximum date allowed by `tm_to_time_t`) with a large enough negative + timezone offset. - This leads to an integer underflow in the commit timestamp, which is not - caught by `git-commit`, but will cause other services to fail, such as - `git-fsck`, which reports "badDateOverflow: invalid author/committer - line - date causes integer overflow". + This leads to an integer underflow or underflow respectively in the + commit timestamp, which is not caught by `git-commit`, but will cause + other services to fail, such as `git-fsck`, which, for the first case, + reports "badDateOverflow: invalid author/committer line - date causes + integer overflow". Instead check the timezone offset and fail if the resulting time comes - before the epoch, "1970-01-01T00:00:00Z", when parsing. + before the epoch "1970-01-01T00:00:00Z" or after the maximum date + "2099-12-31T23:59:59Z". Signed-off-by: Darcy Burke <acednes@gmail.com> ## date.c ## +@@ date.c: static int match_object_header_date(const char *date, timestamp_t *timestamp, in + return 0; + } + ++ ++/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ ++static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; ++ + /* Gr. strptime is crap for this; it doesn't have a way to require RFC2822 + (i.e. English) day/month names, and it doesn't work correctly with %z. */ + int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) @@ date.c: int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) } } @@ date.c: int parse_date_basic(const char *date, timestamp_t *timestamp, int *offs - if (!tm_gmt) + if (!tm_gmt) { + if (*offset > 0 && *offset * 60 > *timestamp) ++ return -1; ++ if (*offset < 0 && -*offset * 60 > timestamp_max - *timestamp) + return -1; *timestamp -= *offset * 60; + } @@ date.c: int parse_date_basic(const char *date, timestamp_t *timestamp, int *offs ## t/t0006-date.sh ## -@@ t/t0006-date.sh: check_parse '2008-02-14 20:30:45 -:30' '2008-02-14 20:30:45 +0000' - check_parse '2008-02-14 20:30:45 -05:00' '2008-02-14 20:30:45 -0500' +@@ t/t0006-date.sh: check_parse '2008-02-14 20:30:45 -05:00' '2008-02-14 20:30:45 -0500' check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 -0500' EST5 check_parse 'Thu, 7 Apr 2005 15:14:13 -0700' '2005-04-07 15:14:13 -0700' + +check_parse '1970-01-01 00:00:00' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 +00' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 Z' '1970-01-01 00:00:00 +0000' @@ t/t0006-date.sh: check_parse '2008-02-14 20:30:45 -:30' '2008-02-14 20:30:45 +00 +check_parse '1969-12-31 23:59:59 Z' bad +check_parse '1969-12-31 23:59:59 +11' bad +check_parse '1969-12-31 23:59:59 -11' bad - ++ ++check_parse '2099-12-31 23:59:59' '2099-12-31 23:59:59 +0000' ++check_parse '2099-12-31 23:59:59 +00' '2099-12-31 23:59:59 +0000' ++check_parse '2099-12-31 23:59:59 Z' '2099-12-31 23:59:59 +0000' ++check_parse '2099-12-31 23:59:59 +01' '2099-12-31 23:59:59 +0100' ++check_parse '2099-12-31 23:59:59 -01' bad ++check_parse '2099-12-31 23:59:59 -11' bad ++check_parse '2099-12-31 23:00:00 -01' bad ++check_parse '2099-12-31 22:59:59 -01' '2099-12-31 22:59:59 -0100' ++check_parse '2100-00-00 00:00:00' bad ++check_parse '2099-12-30 00:00:00 -11' '2099-12-30 00:00:00 -1100' ++check_parse '2100-00-00 00:00:00 +00' bad ++check_parse '2100-00-00 00:00:00 Z' bad ++check_parse '2100-00-00 00:00:00 -11' bad ++check_parse '2100-00-00 00:00:00 +11' bad ++ check_approxidate() { echo "$1 -> $2 +0000" >expect + test_expect_${3:-success} "parse approxidate ($1)" " date.c | 12 +++++++++++- t/t0006-date.sh | 31 +++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+), 1 deletion(-) diff --git a/date.c b/date.c index 7365a4ad24f..95776c8a92f 100644 --- a/date.c +++ b/date.c @@ -868,6 +868,10 @@ static int match_object_header_date(const char *date, timestamp_t *timestamp, in return 0; } + +/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ +static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; + /* Gr. strptime is crap for this; it doesn't have a way to require RFC2822 (i.e. English) day/month names, and it doesn't work correctly with %z. */ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) @@ -937,8 +941,14 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) } } - if (!tm_gmt) + if (!tm_gmt) { + if (*offset > 0 && *offset * 60 > *timestamp) + return -1; + if (*offset < 0 && -*offset * 60 > timestamp_max - *timestamp) + return -1; *timestamp -= *offset * 60; + } + return 0; /* success */ } diff --git a/t/t0006-date.sh b/t/t0006-date.sh index 3031256d143..e8fdf361ad4 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -117,6 +117,37 @@ check_parse '2008-02-14 20:30:45 -05:00' '2008-02-14 20:30:45 -0500' check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 -0500' EST5 check_parse 'Thu, 7 Apr 2005 15:14:13 -0700' '2005-04-07 15:14:13 -0700' +check_parse '1970-01-01 00:00:00' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 +00' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 Z' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 -01' '1970-01-01 00:00:00 -0100' +check_parse '1970-01-01 00:00:00 +01' bad +check_parse '1970-01-01 00:00:00 +11' bad +check_parse '1970-01-01 00:59:59 +01' bad +check_parse '1970-01-01 01:00:00 +01' '1970-01-01 01:00:00 +0100' +check_parse '1970-01-01 01:00:00 +11' bad +check_parse '1970-01-02 00:00:00 +11' '1970-01-02 00:00:00 +1100' +check_parse '1969-12-31 23:59:59' bad +check_parse '1969-12-31 23:59:59 +00' bad +check_parse '1969-12-31 23:59:59 Z' bad +check_parse '1969-12-31 23:59:59 +11' bad +check_parse '1969-12-31 23:59:59 -11' bad + +check_parse '2099-12-31 23:59:59' '2099-12-31 23:59:59 +0000' +check_parse '2099-12-31 23:59:59 +00' '2099-12-31 23:59:59 +0000' +check_parse '2099-12-31 23:59:59 Z' '2099-12-31 23:59:59 +0000' +check_parse '2099-12-31 23:59:59 +01' '2099-12-31 23:59:59 +0100' +check_parse '2099-12-31 23:59:59 -01' bad +check_parse '2099-12-31 23:59:59 -11' bad +check_parse '2099-12-31 23:00:00 -01' bad +check_parse '2099-12-31 22:59:59 -01' '2099-12-31 22:59:59 -0100' +check_parse '2100-00-00 00:00:00' bad +check_parse '2099-12-30 00:00:00 -11' '2099-12-30 00:00:00 -1100' +check_parse '2100-00-00 00:00:00 +00' bad +check_parse '2100-00-00 00:00:00 Z' bad +check_parse '2100-00-00 00:00:00 -11' bad +check_parse '2100-00-00 00:00:00 +11' bad + check_approxidate() { echo "$1 -> $2 +0000" >expect test_expect_${3:-success} "parse approxidate ($1)" " base-commit: cd77e87115eab5e80e6b161e7b84a79ba1a12cdc -- gitgitgadget ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-07 0:17 ` [PATCH v3] date: detect underflow/overflow when parsing dates with " darcy via GitGitGadget @ 2024-06-07 17:40 ` Junio C Hamano 2024-06-08 18:58 ` Junio C Hamano ` (2 more replies) 2024-06-12 9:49 ` Karthik Nayak 1 sibling, 3 replies; 36+ messages in thread From: Junio C Hamano @ 2024-06-07 17:40 UTC (permalink / raw) To: darcy via GitGitGadget; +Cc: git, darcy "darcy via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: darcy <acednes@gmail.com> This ident should match what is used on "Signed-off-by:" line. > Overriding the date of a commit to be close to "1970-01-01 00:00:00" > with a large enough positive timezone for the equivelant GMT time to be > before the epoch is considered valid by `parse_date_basic`. Similar > behaviour occurs when using a date close to "2099-12-31 23:59:59" (the > maximum date allowed by `tm_to_time_t`) with a large enough negative > timezone offset. > > This leads to an integer underflow or underflow respectively in the "underflow or underflow respectively"? > commit timestamp, which is not caught by `git-commit`, but will cause > other services to fail, such as `git-fsck`, which, for the first case, > reports "badDateOverflow: invalid author/committer line - date causes > integer overflow". > > Instead check the timezone offset and fail if the resulting time comes > before the epoch "1970-01-01T00:00:00Z" or after the maximum date > "2099-12-31T23:59:59Z". Nicely described otherwise. > + > +/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ > +static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; I wonder if this should be of timestamp_t type instead, as the check is done against *timestamp in parse_date_basic() where *timestamp is of type timestamp_t to match? > int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) > @@ -937,8 +941,14 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) > } > } > > - if (!tm_gmt) > + if (!tm_gmt) { > + if (*offset > 0 && *offset * 60 > *timestamp) > + return -1; > + if (*offset < 0 && -*offset * 60 > timestamp_max - *timestamp) > + return -1; > *timestamp -= *offset * 60; > + } > + > return 0; /* success */ > } Thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-07 17:40 ` Junio C Hamano @ 2024-06-08 18:58 ` Junio C Hamano 2024-06-14 1:20 ` Junio C Hamano 2024-06-11 23:30 ` Junio C Hamano 2024-06-12 9:07 ` [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset Phillip Wood 2 siblings, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2024-06-08 18:58 UTC (permalink / raw) To: darcy via GitGitGadget; +Cc: git, darcy Junio C Hamano <gitster@pobox.com> writes: >> +/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ >> +static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; > > I wonder if this should be of timestamp_t type instead, as the check > is done against *timestamp in parse_date_basic() where *timestamp is > of type timestamp_t to match? The CI build on Windows tells me that my worry was warranted. https://github.com/git/git/actions/runs/9424299208/job/25964281907#step:4:643 (GitHub seems to show the breakage details only to those who are logged in, so you'd need to be logged in to visit that link) Error: date.c:873:75: integer overflow in expression of type 'long int' results in '-192522496' [-Werror=overflow] 873 | static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; | ^ Error: date.c:873:1: overflow in constant expression [-Werror=overflow] ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-08 18:58 ` Junio C Hamano @ 2024-06-14 1:20 ` Junio C Hamano 2024-06-15 11:47 ` Karthik Nayak 0 siblings, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2024-06-14 1:20 UTC (permalink / raw) To: darcy via GitGitGadget; +Cc: git, darcy Junio C Hamano <gitster@pobox.com> writes: > The CI build on Windows tells me that my worry was warranted. > > https://github.com/git/git/actions/runs/9424299208/job/25964281907#step:4:643 > > (GitHub seems to show the breakage details only to those who are > logged in, so you'd need to be logged in to visit that link) So here is what I accumulated in SQUASH??? patches on top of your topic while waiting for an updated version to unbreak the CI. * The "end of git time" timestamp does not fit in time_t on 32-bit systems, so I updated it to use timestamp_t at least for now. * t0006 has two tests that use TIME_IS_64BIT,TIME_T_IS_64BIT prerequisites; I introduced HAVE_64BIT_TIME to simplify them. * nobody passes $4 to check_parse to tell it to expect a failure, so I removed it. It always expects success. * check_parse now honors a global variable REQUIRE_64BIT_TIME that is used as the prerequisite to run its test_expect_success; the "near the end of git time" tests you added use the mechanism to pass HAVE_64BIT_TIME prerequisite. The last one is a bit questionable, as it only "punts" on 32-bit systems, instead of making sure we get the expected error messages. I think it is OK to punt here and have a separate test that checks timestamp around year 2040 for that condition. date.c | 2 +- t/t0006-date.sh | 20 ++++++++++++++------ 2 files changed, 15 insertions(+), 7 deletions(-) diff --git a/date.c b/date.c index 95776c8a92..bee9fe8f10 100644 --- a/date.c +++ b/date.c @@ -870,7 +870,7 @@ static int match_object_header_date(const char *date, timestamp_t *timestamp, in /* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ -static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; +static const timestamp_t timestamp_max = (((timestamp_t)2100 - 1970) * 365 + 32) * 24 * 60 * 60 - 1; /* Gr. strptime is crap for this; it doesn't have a way to require RFC2822 (i.e. English) day/month names, and it doesn't work correctly with %z. */ diff --git a/t/t0006-date.sh b/t/t0006-date.sh index e8fdf361ad..fd373e1b39 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -8,6 +8,11 @@ TEST_PASSES_SANITIZE_LEAK=true # arbitrary reference time: 2009-08-30 19:20:00 GIT_TEST_DATE_NOW=1251660000; export GIT_TEST_DATE_NOW +if test_have_prereq TIME_IS_64BIT,TIME_T_IS_64BIT +then + test_set_prereq HAVE_64BIT_TIME +fi + check_relative() { t=$(($GIT_TEST_DATE_NOW - $1)) echo "$t -> $2" >expect @@ -80,14 +85,15 @@ check_show raw "$TIME" '1466000000 -0200' # arbitrary time absurdly far in the future FUTURE="5758122296 -0400" -check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT,TIME_T_IS_64BIT -check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +0000" TIME_IS_64BIT,TIME_T_IS_64BIT +check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" HAVE_64BIT_TIME +check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +0000" HAVE_64BIT_TIME -check_parse() { +REQUIRE_64BIT_TIME= +check_parse () { echo "$1 -> $2" >expect - test_expect_${4:-success} "parse date ($1${3:+ TZ=$3})" " - TZ=${3:-$TZ} test-tool date parse '$1' >actual && - test_cmp expect actual + test_expect_success $REQUIRE_64BIT_TIME "parse date ($1${3:+ TZ=$3}) -> $2" " + TZ=${3:-$TZ} test-tool date parse '$1' >actual && + test_cmp expect actual " } @@ -133,6 +139,7 @@ check_parse '1969-12-31 23:59:59 Z' bad check_parse '1969-12-31 23:59:59 +11' bad check_parse '1969-12-31 23:59:59 -11' bad +REQUIRE_64BIT_TIME=HAVE_64BIT_TIME check_parse '2099-12-31 23:59:59' '2099-12-31 23:59:59 +0000' check_parse '2099-12-31 23:59:59 +00' '2099-12-31 23:59:59 +0000' check_parse '2099-12-31 23:59:59 Z' '2099-12-31 23:59:59 +0000' @@ -147,6 +154,7 @@ check_parse '2100-00-00 00:00:00 +00' bad check_parse '2100-00-00 00:00:00 Z' bad check_parse '2100-00-00 00:00:00 -11' bad check_parse '2100-00-00 00:00:00 +11' bad +REQUIRE_64BIT_TIME= check_approxidate() { echo "$1 -> $2 +0000" >expect ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-14 1:20 ` Junio C Hamano @ 2024-06-15 11:47 ` Karthik Nayak 0 siblings, 0 replies; 36+ messages in thread From: Karthik Nayak @ 2024-06-15 11:47 UTC (permalink / raw) To: Junio C Hamano, darcy via GitGitGadget; +Cc: git, darcy [-- Attachment #1: Type: text/plain, Size: 4650 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> The CI build on Windows tells me that my worry was warranted. >> >> https://github.com/git/git/actions/runs/9424299208/job/25964281907#step:4:643 >> >> (GitHub seems to show the breakage details only to those who are >> logged in, so you'd need to be logged in to visit that link) > > So here is what I accumulated in SQUASH??? patches on top of your > topic while waiting for an updated version to unbreak the CI. > > * The "end of git time" timestamp does not fit in time_t on 32-bit > systems, so I updated it to use timestamp_t at least for now. > > * t0006 has two tests that use TIME_IS_64BIT,TIME_T_IS_64BIT > prerequisites; I introduced HAVE_64BIT_TIME to simplify them. > > * nobody passes $4 to check_parse to tell it to expect a failure, > so I removed it. It always expects success. > > * check_parse now honors a global variable REQUIRE_64BIT_TIME that > is used as the prerequisite to run its test_expect_success; the > "near the end of git time" tests you added use the mechanism to > pass HAVE_64BIT_TIME prerequisite. > > The last one is a bit questionable, as it only "punts" on 32-bit > systems, instead of making sure we get the expected error messages. > I think it is OK to punt here and have a separate test that checks > timestamp around year 2040 for that condition. > > date.c | 2 +- > t/t0006-date.sh | 20 ++++++++++++++------ > 2 files changed, 15 insertions(+), 7 deletions(-) > > diff --git a/date.c b/date.c > index 95776c8a92..bee9fe8f10 100644 > --- a/date.c > +++ b/date.c > @@ -870,7 +870,7 @@ static int match_object_header_date(const char *date, timestamp_t *timestamp, in > > > /* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ > -static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; > +static const timestamp_t timestamp_max = (((timestamp_t)2100 - 1970) * 365 + 32) * 24 * 60 * 60 - 1; > > /* Gr. strptime is crap for this; it doesn't have a way to require RFC2822 > (i.e. English) day/month names, and it doesn't work correctly with %z. */ > diff --git a/t/t0006-date.sh b/t/t0006-date.sh > index e8fdf361ad..fd373e1b39 100755 > --- a/t/t0006-date.sh > +++ b/t/t0006-date.sh > @@ -8,6 +8,11 @@ TEST_PASSES_SANITIZE_LEAK=true > # arbitrary reference time: 2009-08-30 19:20:00 > GIT_TEST_DATE_NOW=1251660000; export GIT_TEST_DATE_NOW > > +if test_have_prereq TIME_IS_64BIT,TIME_T_IS_64BIT > +then > + test_set_prereq HAVE_64BIT_TIME > +fi > + This does make sense, I did check and noticed that the two are always used together everywhere, so outside of these patches, it perhaps would also make sense to combine them altogether. > check_relative() { > t=$(($GIT_TEST_DATE_NOW - $1)) > echo "$t -> $2" >expect > @@ -80,14 +85,15 @@ check_show raw "$TIME" '1466000000 -0200' > > # arbitrary time absurdly far in the future > FUTURE="5758122296 -0400" > -check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT,TIME_T_IS_64BIT > -check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +0000" TIME_IS_64BIT,TIME_T_IS_64BIT > +check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" HAVE_64BIT_TIME > +check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +0000" HAVE_64BIT_TIME > > -check_parse() { > +REQUIRE_64BIT_TIME= > +check_parse () { > echo "$1 -> $2" >expect > - test_expect_${4:-success} "parse date ($1${3:+ TZ=$3})" " > - TZ=${3:-$TZ} test-tool date parse '$1' >actual && > - test_cmp expect actual > + test_expect_success $REQUIRE_64BIT_TIME "parse date ($1${3:+ TZ=$3}) -> $2" " > + TZ=${3:-$TZ} test-tool date parse '$1' >actual && > + test_cmp expect actual > " > } > > @@ -133,6 +139,7 @@ check_parse '1969-12-31 23:59:59 Z' bad > check_parse '1969-12-31 23:59:59 +11' bad > check_parse '1969-12-31 23:59:59 -11' bad > > +REQUIRE_64BIT_TIME=HAVE_64BIT_TIME > check_parse '2099-12-31 23:59:59' '2099-12-31 23:59:59 +0000' > check_parse '2099-12-31 23:59:59 +00' '2099-12-31 23:59:59 +0000' > check_parse '2099-12-31 23:59:59 Z' '2099-12-31 23:59:59 +0000' > @@ -147,6 +154,7 @@ check_parse '2100-00-00 00:00:00 +00' bad > check_parse '2100-00-00 00:00:00 Z' bad > check_parse '2100-00-00 00:00:00 -11' bad > check_parse '2100-00-00 00:00:00 +11' bad > +REQUIRE_64BIT_TIME= > > check_approxidate() { > echo "$1 -> $2 +0000" >expect I think this patch looks good. I also think we should probably look into fixing the 2099 limit overall. I think Jeff already posted a patch to do the same already: https://lore.kernel.org/r/20240604093345.GA1279521@coredump.intra.peff.net [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-07 17:40 ` Junio C Hamano 2024-06-08 18:58 ` Junio C Hamano @ 2024-06-11 23:30 ` Junio C Hamano 2024-06-11 23:49 ` rsbecker 2024-06-25 23:12 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Junio C Hamano 2024-06-12 9:07 ` [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset Phillip Wood 2 siblings, 2 replies; 36+ messages in thread From: Junio C Hamano @ 2024-06-11 23:30 UTC (permalink / raw) To: darcy via GitGitGadget; +Cc: git, darcy Junio C Hamano <gitster@pobox.com> writes: > I wonder if this should be of timestamp_t type instead, as the check > is done against *timestamp in parse_date_basic() where *timestamp is > of type timestamp_t to match? Also, as you can see at one of the GitHub CI jobs, e.g., https://github.com/git/git/actions/runs/9455916669/job/26046731619#step:6:1915 you'd need to either exclude some "these are too large timestamps for the system" tests from 32-bit systems or expect output on them to be different from 64-bit systems. As you are actively detecting the condition and giving an error message "too large for _this_ system", I think it is a good idea to actually do the latter, i.e. on 64-bit system make sure parsing is done correctly, and on 32-bit system make sure you get that "too large for this system" error. Thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-11 23:30 ` Junio C Hamano @ 2024-06-11 23:49 ` rsbecker 2024-06-11 23:52 ` Junio C Hamano 2024-06-25 23:12 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Junio C Hamano 1 sibling, 1 reply; 36+ messages in thread From: rsbecker @ 2024-06-11 23:49 UTC (permalink / raw) To: 'Junio C Hamano', 'darcy via GitGitGadget' Cc: git, 'darcy' On Tuesday, June 11, 2024 7:30 PM, Junio C Hamano wrote: >Junio C Hamano <gitster@pobox.com> writes: > >> I wonder if this should be of timestamp_t type instead, as the check >> is done against *timestamp in parse_date_basic() where *timestamp is >> of type timestamp_t to match? > >Also, as you can see at one of the GitHub CI jobs, e.g., > > >https://github.com/git/git/actions/runs/9455916669/job/26046731619#step:6: >1915 > >you'd need to either exclude some "these are too large timestamps for the system" >tests from 32-bit systems or expect output on them to be different from 64-bit >systems. > >As you are actively detecting the condition and giving an error message "too large >for _this_ system", I think it is a good idea to actually do the latter, i.e. on 64-bit >system make sure parsing is done correctly, and on 32-bit system make sure you get >that "too large for this system" error. Does this imply that timestamp tests will fail on 32-bit systems? I am unsure how to interpret this. Can you please clarify? Thanks, Randall ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-11 23:49 ` rsbecker @ 2024-06-11 23:52 ` Junio C Hamano 0 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2024-06-11 23:52 UTC (permalink / raw) To: rsbecker; +Cc: 'darcy via GitGitGadget', git, 'darcy' <rsbecker@nexbridge.com> writes: >>Also, as you can see at one of the GitHub CI jobs, e.g., >> >> >> https://github.com/git/git/actions/runs/9455916669/job/26046731619#step:6:1915 >> > Does this imply that timestamp tests will fail on 32-bit systems? I am > unsure how to interpret this. Can you please clarify? Sorry, but I have nothing more to clarify than what appears in the quoted CI log. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll @ 2024-06-25 23:12 ` Junio C Hamano 2024-06-25 23:12 ` [PATCH v4 1/2] t0006: simplify prerequisites Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 36+ messages in thread From: Junio C Hamano @ 2024-06-25 23:12 UTC (permalink / raw) To: git; +Cc: Darcy Burke, Karthik Nayak, Randall S . Becker, Phillip Wood So it has been some time since we discussed this topic. Let's clean up the messy "SQUASH???" patches I had to queue on top of the main patch to keep the CI working and make them into a preliminary patch. The tree at the end of the series is identical to what has been queued in 'seen' for the past few weeks. The only difference is that we first lay groundwork to skip certain time-parsing tests on 32-bit systems first, and then use Darcy's patch with minimum adjustments for 32-bit systems. Darcy Burke (1): date: detect underflow/overflow when parsing dates with timezone offset Junio C Hamano (1): t0006: simplify prerequisites date.c | 12 +++++++++++- t/t0006-date.sh | 51 +++++++++++++++++++++++++++++++++++++++++++------ 2 files changed, 56 insertions(+), 7 deletions(-) -- 2.45.2-796-g2ef7a3d713 ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v4 1/2] t0006: simplify prerequisites 2024-06-25 23:12 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Junio C Hamano @ 2024-06-25 23:12 ` Junio C Hamano 2024-06-25 23:30 ` Eric Sunshine 2024-06-25 23:12 ` [PATCH v4 2/2] date: detect underflow/overflow when parsing dates with timezone offset Junio C Hamano 2024-06-26 15:21 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Phillip Wood 2 siblings, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2024-06-25 23:12 UTC (permalink / raw) To: git The system must support 64-bit time and its time_t must be 64-bit wide to pass these tests. Combine these two prerequisites together to simplify the tests. In theory, they could be fulfilled independently and tests could require only one without the other, but in practice, but in practice these must come hand-in-hand. Update the "check_parse" test helper to pay attention to the REQUIRE_64BIT_TIME variable, which can be set to the HAVE_64BIT_TIME prerequisite so that a parse test can be skipped on 32-bit systems. This will be used in the next step to skip tests for timestamps near the end of year 2099, as 32-bit systems will not be able to express a timestamp beyond 2038 anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- t/t0006-date.sh | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/t/t0006-date.sh b/t/t0006-date.sh index 3031256d14..24e8647f26 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -8,6 +8,11 @@ TEST_PASSES_SANITIZE_LEAK=true # arbitrary reference time: 2009-08-30 19:20:00 GIT_TEST_DATE_NOW=1251660000; export GIT_TEST_DATE_NOW +if test_have_prereq TIME_IS_64BIT,TIME_T_IS_64BIT +then + test_set_prereq HAVE_64BIT_TIME +fi + check_relative() { t=$(($GIT_TEST_DATE_NOW - $1)) echo "$t -> $2" >expect @@ -80,14 +85,15 @@ check_show raw "$TIME" '1466000000 -0200' # arbitrary time absurdly far in the future FUTURE="5758122296 -0400" -check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT,TIME_T_IS_64BIT -check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +0000" TIME_IS_64BIT,TIME_T_IS_64BIT +check_show iso "$FUTURE" "2152-06-19 18:24:56 -0400" HAVE_64BIT_TIME +check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +0000" HAVE_64BIT_TIME -check_parse() { +REQUIRE_64BIT_TIME= +check_parse () { echo "$1 -> $2" >expect - test_expect_${4:-success} "parse date ($1${3:+ TZ=$3})" " - TZ=${3:-$TZ} test-tool date parse '$1' >actual && - test_cmp expect actual + test_expect_success $REQUIRE_64BIT_TIME "parse date ($1${3:+ TZ=$3}) -> $2" " + TZ=${3:-$TZ} test-tool date parse '$1' >actual && + test_cmp expect actual " } -- 2.45.2-796-g2ef7a3d713 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v4 1/2] t0006: simplify prerequisites 2024-06-25 23:12 ` [PATCH v4 1/2] t0006: simplify prerequisites Junio C Hamano @ 2024-06-25 23:30 ` Eric Sunshine 2024-06-26 0:04 ` Junio C Hamano 0 siblings, 1 reply; 36+ messages in thread From: Eric Sunshine @ 2024-06-25 23:30 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Tue, Jun 25, 2024 at 7:13 PM Junio C Hamano <gitster@pobox.com> wrote: > The system must support 64-bit time and its time_t must be 64-bit > wide to pass these tests. Combine these two prerequisites together > to simplify the tests. In theory, they could be fulfilled > independently and tests could require only one without the other, > but in practice, but in practice these must come hand-in-hand. s/but in practice, but in practice/but in practice/ > Update the "check_parse" test helper to pay attention to the > REQUIRE_64BIT_TIME variable, which can be set to the HAVE_64BIT_TIME > prerequisite so that a parse test can be skipped on 32-bit systems. > This will be used in the next step to skip tests for timestamps near > the end of year 2099, as 32-bit systems will not be able to express > a timestamp beyond 2038 anyway. > > Signed-off-by: Junio C Hamano <gitster@pobox.com> ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 1/2] t0006: simplify prerequisites 2024-06-25 23:30 ` Eric Sunshine @ 2024-06-26 0:04 ` Junio C Hamano 0 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2024-06-26 0:04 UTC (permalink / raw) To: Eric Sunshine; +Cc: git Eric Sunshine <sunshine@sunshineco.com> writes: > On Tue, Jun 25, 2024 at 7:13 PM Junio C Hamano <gitster@pobox.com> wrote: >> The system must support 64-bit time and its time_t must be 64-bit >> wide to pass these tests. Combine these two prerequisites together >> to simplify the tests. In theory, they could be fulfilled >> independently and tests could require only one without the other, >> but in practice, but in practice these must come hand-in-hand. > > s/but in practice, but in practice/but in practice/ Thanks, always, for your sharp eyes. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v4 2/2] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-25 23:12 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Junio C Hamano 2024-06-25 23:12 ` [PATCH v4 1/2] t0006: simplify prerequisites Junio C Hamano @ 2024-06-25 23:12 ` Junio C Hamano 2024-06-26 15:21 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Phillip Wood 2 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2024-06-25 23:12 UTC (permalink / raw) To: git; +Cc: Darcy Burke From: Darcy Burke <acednes@gmail.com> Overriding the date of a commit to be close to "1970-01-01 00:00:00" with a large enough positive timezone for the equivelant GMT time to be before the epoch is considered valid by `parse_date_basic`. Similar behaviour occurs when using a date close to "2099-12-31 23:59:59" (the maximum date allowed by `tm_to_time_t`) with a large enough negative timezone offset. This leads to an integer underflow or underflow respectively in the commit timestamp, which is not caught by `git-commit`, but will cause other services to fail, such as `git-fsck`, which, for the first case, reports "badDateOverflow: invalid author/committer line - date causes integer overflow". Instead check the timezone offset and fail if the resulting time comes before the epoch "1970-01-01T00:00:00Z" or after the maximum date "2099-12-31T23:59:59Z". Using the REQUIRE_64BIT_TIME prerequisite, make sure that the tests near the end of Git time (aka end of year 2099) are not attempted on purely 32-bit systems, as they cannot express timestamp beyond 2038 anyway. Signed-off-by: Darcy Burke <acednes@gmail.com> [jc: fixups for 32-bit platforms] Signed-off-by: Junio C Hamano <gitster@pobox.com> --- date.c | 12 +++++++++++- t/t0006-date.sh | 33 +++++++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+), 1 deletion(-) diff --git a/date.c b/date.c index 7365a4ad24..bee9fe8f10 100644 --- a/date.c +++ b/date.c @@ -868,6 +868,10 @@ static int match_object_header_date(const char *date, timestamp_t *timestamp, in return 0; } + +/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ +static const timestamp_t timestamp_max = (((timestamp_t)2100 - 1970) * 365 + 32) * 24 * 60 * 60 - 1; + /* Gr. strptime is crap for this; it doesn't have a way to require RFC2822 (i.e. English) day/month names, and it doesn't work correctly with %z. */ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) @@ -937,8 +941,14 @@ int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset) } } - if (!tm_gmt) + if (!tm_gmt) { + if (*offset > 0 && *offset * 60 > *timestamp) + return -1; + if (*offset < 0 && -*offset * 60 > timestamp_max - *timestamp) + return -1; *timestamp -= *offset * 60; + } + return 0; /* success */ } diff --git a/t/t0006-date.sh b/t/t0006-date.sh index 24e8647f26..fd373e1b39 100755 --- a/t/t0006-date.sh +++ b/t/t0006-date.sh @@ -123,6 +123,39 @@ check_parse '2008-02-14 20:30:45 -05:00' '2008-02-14 20:30:45 -0500' check_parse '2008-02-14 20:30:45' '2008-02-14 20:30:45 -0500' EST5 check_parse 'Thu, 7 Apr 2005 15:14:13 -0700' '2005-04-07 15:14:13 -0700' +check_parse '1970-01-01 00:00:00' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 +00' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 Z' '1970-01-01 00:00:00 +0000' +check_parse '1970-01-01 00:00:00 -01' '1970-01-01 00:00:00 -0100' +check_parse '1970-01-01 00:00:00 +01' bad +check_parse '1970-01-01 00:00:00 +11' bad +check_parse '1970-01-01 00:59:59 +01' bad +check_parse '1970-01-01 01:00:00 +01' '1970-01-01 01:00:00 +0100' +check_parse '1970-01-01 01:00:00 +11' bad +check_parse '1970-01-02 00:00:00 +11' '1970-01-02 00:00:00 +1100' +check_parse '1969-12-31 23:59:59' bad +check_parse '1969-12-31 23:59:59 +00' bad +check_parse '1969-12-31 23:59:59 Z' bad +check_parse '1969-12-31 23:59:59 +11' bad +check_parse '1969-12-31 23:59:59 -11' bad + +REQUIRE_64BIT_TIME=HAVE_64BIT_TIME +check_parse '2099-12-31 23:59:59' '2099-12-31 23:59:59 +0000' +check_parse '2099-12-31 23:59:59 +00' '2099-12-31 23:59:59 +0000' +check_parse '2099-12-31 23:59:59 Z' '2099-12-31 23:59:59 +0000' +check_parse '2099-12-31 23:59:59 +01' '2099-12-31 23:59:59 +0100' +check_parse '2099-12-31 23:59:59 -01' bad +check_parse '2099-12-31 23:59:59 -11' bad +check_parse '2099-12-31 23:00:00 -01' bad +check_parse '2099-12-31 22:59:59 -01' '2099-12-31 22:59:59 -0100' +check_parse '2100-00-00 00:00:00' bad +check_parse '2099-12-30 00:00:00 -11' '2099-12-30 00:00:00 -1100' +check_parse '2100-00-00 00:00:00 +00' bad +check_parse '2100-00-00 00:00:00 Z' bad +check_parse '2100-00-00 00:00:00 -11' bad +check_parse '2100-00-00 00:00:00 +11' bad +REQUIRE_64BIT_TIME= + check_approxidate() { echo "$1 -> $2 +0000" >expect test_expect_${3:-success} "parse approxidate ($1)" " -- 2.45.2-796-g2ef7a3d713 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll 2024-06-25 23:12 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Junio C Hamano 2024-06-25 23:12 ` [PATCH v4 1/2] t0006: simplify prerequisites Junio C Hamano 2024-06-25 23:12 ` [PATCH v4 2/2] date: detect underflow/overflow when parsing dates with timezone offset Junio C Hamano @ 2024-06-26 15:21 ` Phillip Wood 2024-06-26 18:32 ` Junio C Hamano 2 siblings, 1 reply; 36+ messages in thread From: Phillip Wood @ 2024-06-26 15:21 UTC (permalink / raw) To: Junio C Hamano, git Cc: Darcy Burke, Karthik Nayak, Randall S . Becker, Phillip Wood Hi Junio On 26/06/2024 00:12, Junio C Hamano wrote: > So it has been some time since we discussed this topic. Let's clean > up the messy "SQUASH???" patches I had to queue on top of the main > patch to keep the CI working and make them into a preliminary patch. > > The tree at the end of the series is identical to what has been > queued in 'seen' for the past few weeks. The only difference is > that we first lay groundwork to skip certain time-parsing tests on > 32-bit systems first, and then use Darcy's patch with minimum > adjustments for 32-bit systems. I've had a read through this verison and I don't have anything to add to Eric's comments. Best Wishes Phillip > Darcy Burke (1): > date: detect underflow/overflow when parsing dates with timezone offset > > Junio C Hamano (1): > t0006: simplify prerequisites > > date.c | 12 +++++++++++- > t/t0006-date.sh | 51 +++++++++++++++++++++++++++++++++++++++++++------ > 2 files changed, 56 insertions(+), 7 deletions(-) > ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll 2024-06-26 15:21 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Phillip Wood @ 2024-06-26 18:32 ` Junio C Hamano 0 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2024-06-26 18:32 UTC (permalink / raw) To: Phillip Wood Cc: git, Darcy Burke, Karthik Nayak, Randall S . Becker, Phillip Wood Phillip Wood <phillip.wood123@gmail.com> writes: > Hi Junio > > On 26/06/2024 00:12, Junio C Hamano wrote: >> So it has been some time since we discussed this topic. Let's clean >> up the messy "SQUASH???" patches I had to queue on top of the main >> patch to keep the CI working and make them into a preliminary patch. >> The tree at the end of the series is identical to what has been >> queued in 'seen' for the past few weeks. The only difference is >> that we first lay groundwork to skip certain time-parsing tests on >> 32-bit systems first, and then use Darcy's patch with minimum >> adjustments for 32-bit systems. > > I've had a read through this verison and I don't have anything to add > to Eric's comments. > > Best Wishes Thanks, let's mark it for 'next' then. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-07 17:40 ` Junio C Hamano 2024-06-08 18:58 ` Junio C Hamano 2024-06-11 23:30 ` Junio C Hamano @ 2024-06-12 9:07 ` Phillip Wood 2 siblings, 0 replies; 36+ messages in thread From: Phillip Wood @ 2024-06-12 9:07 UTC (permalink / raw) To: Junio C Hamano, darcy via GitGitGadget; +Cc: git, darcy On 07/06/2024 18:40, Junio C Hamano wrote: > "darcy via GitGitGadget" <gitgitgadget@gmail.com> writes: > >> +/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ >> +static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; > > I wonder if this should be of timestamp_t type instead, as the check > is done against *timestamp in parse_date_basic() where *timestamp is > of type timestamp_t to match? I think that would make sense. We'd want to cast one of the numbers to timestamp_t to ensure that sum uses a wide enough integer as well. Using 2100L is probably OK but an explicit cast would be clearer I think. Best Wishes Phillip ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-07 0:17 ` [PATCH v3] date: detect underflow/overflow when parsing dates with " darcy via GitGitGadget 2024-06-07 17:40 ` Junio C Hamano @ 2024-06-12 9:49 ` Karthik Nayak 2024-06-13 13:31 ` Phillip Wood 1 sibling, 1 reply; 36+ messages in thread From: Karthik Nayak @ 2024-06-12 9:49 UTC (permalink / raw) To: darcy via GitGitGadget, git; +Cc: darcy [-- Attachment #1: Type: text/plain, Size: 818 bytes --] "darcy via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: darcy <acednes@gmail.com> > > Overriding the date of a commit to be close to "1970-01-01 00:00:00" > with a large enough positive timezone for the equivelant GMT time to be s/equivelant/equivalent [snip] > diff --git a/date.c b/date.c > index 7365a4ad24f..95776c8a92f 100644 > --- a/date.c > +++ b/date.c > @@ -868,6 +868,10 @@ static int match_object_header_date(const char *date, timestamp_t *timestamp, in > return 0; > } > > + > +/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ > +static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; > Nit: but since we're calculating the number of years here (2100L - 1970), shouldn't we also be calculating the number of leap days instead of hardcoding it? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-12 9:49 ` Karthik Nayak @ 2024-06-13 13:31 ` Phillip Wood 2024-06-13 16:16 ` Junio C Hamano 0 siblings, 1 reply; 36+ messages in thread From: Phillip Wood @ 2024-06-13 13:31 UTC (permalink / raw) To: Karthik Nayak, darcy via GitGitGadget, git; +Cc: darcy Hi Karthik On 12/06/2024 10:49, Karthik Nayak wrote: > "darcy via GitGitGadget" <gitgitgadget@gmail.com> writes: >> + >> +/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ >> +static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; >> > > Nit: but since we're calculating the number of years here (2100L - > 1970), shouldn't we also be calculating the number of leap days instead > of hardcoding it? I'm happy with a hard coded constant for the number of leap days - I think it is probably easier to check that (which I have done) than it would be to check the calculation as I'm not sure off the top of my head if is it safe to do (2100-1970)/4 or whether we need something more complicated. Best Wishes Phillip ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-13 13:31 ` Phillip Wood @ 2024-06-13 16:16 ` Junio C Hamano 2024-06-14 20:09 ` Karthik Nayak 0 siblings, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2024-06-13 16:16 UTC (permalink / raw) To: Phillip Wood; +Cc: Karthik Nayak, darcy via GitGitGadget, git, darcy Phillip Wood <phillip.wood123@gmail.com> writes: >>> +/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ >>> +static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; >>> >> Nit: but since we're calculating the number of years here (2100L - >> 1970), shouldn't we also be calculating the number of leap days instead >> of hardcoding it? > > I'm happy with a hard coded constant for the number of leap days - I > think it is probably easier to check that (which I have done) than it > would be to check the calculation as I'm not sure off the top of my > head if is it safe to do (2100-1970)/4 or whether we need something > more complicated. It's even OK to use a hard coded constant for the number of days since the epoch to the git-end-of-time ;-) The timestamp of the git-end-of-time would not fit in time_t on 32-bit systems, I would presume? If our tests are trying to see if timestamps around the beginning of year 2100 are handled "correctly", the definition of the correctness needs to be consitional on the platform. On systems with TIME_T_IS_64BIT, we'd want to see such a timestamp to be represented fine. On systems without, we'd want to see the "Timestamp too large for this system" error when we feed such a timestamp to be parsed. Thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-13 16:16 ` Junio C Hamano @ 2024-06-14 20:09 ` Karthik Nayak 2024-06-14 21:02 ` Junio C Hamano 0 siblings, 1 reply; 36+ messages in thread From: Karthik Nayak @ 2024-06-14 20:09 UTC (permalink / raw) To: Junio C Hamano, Phillip Wood; +Cc: darcy via GitGitGadget, git, darcy [-- Attachment #1: Type: text/plain, Size: 1549 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Phillip Wood <phillip.wood123@gmail.com> writes: > >>>> +/* timestamp of 2099-12-31T23:59:59Z, including 32 leap days */ >>>> +static const time_t timestamp_max = ((2100L - 1970) * 365 + 32) * 24 * 60 * 60 - 1; >>>> >>> Nit: but since we're calculating the number of years here (2100L - >>> 1970), shouldn't we also be calculating the number of leap days instead >>> of hardcoding it? >> >> I'm happy with a hard coded constant for the number of leap days - I >> think it is probably easier to check that (which I have done) than it >> would be to check the calculation as I'm not sure off the top of my >> head if is it safe to do (2100-1970)/4 or whether we need something >> more complicated. > > It's even OK to use a hard coded constant for the number of days > since the epoch to the git-end-of-time ;-) That's why I noted it as a _Nit_, mostly because it wasn't anything big. But I found that part of it being dynamic and part of it being static was inconsistent. > The timestamp of the git-end-of-time would not fit in time_t on > 32-bit systems, I would presume? If our tests are trying to see if > timestamps around the beginning of year 2100 are handled > "correctly", the definition of the correctness needs to be > consitional on the platform. > > On systems with TIME_T_IS_64BIT, we'd want to see such a timestamp > to be represented fine. On systems without, we'd want to see the > "Timestamp too large for this system" error when we feed such a > timestamp to be parsed. > > Thanks. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-14 20:09 ` Karthik Nayak @ 2024-06-14 21:02 ` Junio C Hamano 2024-06-15 11:49 ` Karthik Nayak 0 siblings, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2024-06-14 21:02 UTC (permalink / raw) To: Karthik Nayak; +Cc: Phillip Wood, darcy via GitGitGadget, git, darcy Karthik Nayak <karthik.188@gmail.com> writes: >> It's even OK to use a hard coded constant for the number of days >> since the epoch to the git-end-of-time ;-) > > That's why I noted it as a _Nit_, mostly because it wasn't anything big. > But I found that part of it being dynamic and part of it being static > was inconsistent. Sure, but it is so tiny thing, we shouldn't waste more time than we spend getting the tests right even on 32-bit systems. We seem to be doing the opposite by talking about this part even more, which is a bit sad. Any comments on the actual patch I sent as a follow-up? >> The timestamp of the git-end-of-time would not fit in time_t on >> 32-bit systems, I would presume? If our tests are trying to see if >> timestamps around the beginning of year 2100 are handled >> "correctly", the definition of the correctness needs to be >> consitional on the platform. >> >> On systems with TIME_T_IS_64BIT, we'd want to see such a timestamp >> to be represented fine. On systems without, we'd want to see the >> "Timestamp too large for this system" error when we feed such a >> timestamp to be parsed. >> >> Thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset 2024-06-14 21:02 ` Junio C Hamano @ 2024-06-15 11:49 ` Karthik Nayak 0 siblings, 0 replies; 36+ messages in thread From: Karthik Nayak @ 2024-06-15 11:49 UTC (permalink / raw) To: Junio C Hamano; +Cc: Phillip Wood, darcy via GitGitGadget, git, darcy [-- Attachment #1: Type: text/plain, Size: 737 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Karthik Nayak <karthik.188@gmail.com> writes: > >>> It's even OK to use a hard coded constant for the number of days >>> since the epoch to the git-end-of-time ;-) >> >> That's why I noted it as a _Nit_, mostly because it wasn't anything big. >> But I found that part of it being dynamic and part of it being static >> was inconsistent. > > Sure, but it is so tiny thing, we shouldn't waste more time than we > spend getting the tests right even on 32-bit systems. We seem to be > doing the opposite by talking about this part even more, which is a > bit sad. Any comments on the actual patch I sent as a follow-up? > Agreed! I looked at your patch and it looks good to me, thanks Junio. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2024-06-26 18:32 UTC | newest] Thread overview: 36+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-05-27 9:17 [PATCH] fix: prevent date underflow when using positive timezone offset darcy via GitGitGadget 2024-05-28 14:05 ` Patrick Steinhardt 2024-05-28 14:49 ` Phillip Wood 2024-05-28 17:06 ` Junio C Hamano 2024-06-02 23:06 ` [PATCH v2] date: detect underflow when parsing dates with " darcy via GitGitGadget 2024-06-03 11:13 ` Junio C Hamano 2024-06-03 11:44 ` darcy 2024-06-03 14:13 ` Phillip Wood 2024-06-04 8:48 ` darcy 2024-06-04 9:33 ` Jeff King 2024-06-05 6:52 ` darcy 2024-06-05 13:10 ` Phillip Wood 2024-06-05 17:27 ` Junio C Hamano 2024-06-06 4:56 ` darcy 2024-06-07 0:17 ` [PATCH v3] date: detect underflow/overflow when parsing dates with " darcy via GitGitGadget 2024-06-07 17:40 ` Junio C Hamano 2024-06-08 18:58 ` Junio C Hamano 2024-06-14 1:20 ` Junio C Hamano 2024-06-15 11:47 ` Karthik Nayak 2024-06-11 23:30 ` Junio C Hamano 2024-06-11 23:49 ` rsbecker 2024-06-11 23:52 ` Junio C Hamano 2024-06-25 23:12 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Junio C Hamano 2024-06-25 23:12 ` [PATCH v4 1/2] t0006: simplify prerequisites Junio C Hamano 2024-06-25 23:30 ` Eric Sunshine 2024-06-26 0:04 ` Junio C Hamano 2024-06-25 23:12 ` [PATCH v4 2/2] date: detect underflow/overflow when parsing dates with timezone offset Junio C Hamano 2024-06-26 15:21 ` [PATCH v4 0/2] Darcy's "date underflow fix" topic, final reroll Phillip Wood 2024-06-26 18:32 ` Junio C Hamano 2024-06-12 9:07 ` [PATCH v3] date: detect underflow/overflow when parsing dates with timezone offset Phillip Wood 2024-06-12 9:49 ` Karthik Nayak 2024-06-13 13:31 ` Phillip Wood 2024-06-13 16:16 ` Junio C Hamano 2024-06-14 20:09 ` Karthik Nayak 2024-06-14 21:02 ` Junio C Hamano 2024-06-15 11:49 ` Karthik Nayak
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).