git.vger.kernel.org archive mirror
* [PATCH] Update SVN.pm
@ 2014-04-16 14:16 Stepan Kasal
  2014-04-16 19:13 ` Junio C Hamano
  2014-04-17 18:01 ` Junio C Hamano
  0 siblings, 2 replies; 6+ messages in thread
From: Stepan Kasal @ 2014-04-16 14:16 UTC
  To: git

From: RomanBelinsky <belinsky.roman@gmail.com>
Date: Tue, 11 Feb 2014 18:23:02 +0200

fix parsing error for dates like:
2014-01-07T5:58:36.048176Z
previous regex can parse only:
2014-01-07T05:58:36.048176Z
reproduced in my svn repository during conversion.

Signed-off-by: Stepan Kasal <kasal@ucw.cz>
---
 perl/Git/SVN.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index a59564f..09cff13 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -1321,7 +1321,7 @@ sub get_untracked {
 sub parse_svn_date {
 	my $date = shift || return '+0000 1970-01-01 00:00:00';
 	my ($Y,$m,$d,$H,$M,$S) = ($date =~ /^(\d{4})\-(\d\d)\-(\d\d)T
-	                                    (\d\d)\:(\d\d)\:(\d\d)\.\d*Z$/x) or
+	                                    (\d\d?)\:(\d\d)\:(\d\d)\.\d*Z$/x) or
 	                                 croak "Unable to parse date: $date\n";
 	my $parsed_date;    # Set next.
 
-- 
1.9.2.msysgit.0.154.g978f18d
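
The effect of the one-character change can be checked outside of
git-svn.  The sketch below is a Python transliteration of the two Perl
patterns from the diff, for illustration only; the names OLD, NEW and
accepts() are made up for this example and do not exist in SVN.pm.

```python
import re

# Python transliteration of the Perl patterns in the patch (illustration only).
# OLD requires a two-digit hour; NEW also accepts a single-digit hour.
OLD = re.compile(r"^(\d{4})-(\d\d)-(\d\d)T(\d\d):(\d\d):(\d\d)\.\d*Z$")
NEW = re.compile(r"^(\d{4})-(\d\d)-(\d\d)T(\d\d?):(\d\d):(\d\d)\.\d*Z$")


def accepts(pattern, date):
    """Return True if the whole date string matches the pattern."""
    return pattern.match(date) is not None


padded = "2014-01-07T05:58:36.048176Z"    # well-formed timestamp
unpadded = "2014-01-07T5:58:36.048176Z"   # form reported in the patch

for name, pattern in (("old", OLD), ("new", NEW)):
    print(name, accepts(pattern, padded), accepts(pattern, unpadded))
```

Under these assumptions the old pattern rejects only the unpadded
form, while the new one accepts both.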


* Re: [PATCH] Update SVN.pm
  2014-04-16 14:16 [PATCH] Update SVN.pm Stepan Kasal
@ 2014-04-16 19:13 ` Junio C Hamano
  2014-04-17  5:24   ` Stepan Kasal
  2014-04-17 18:01 ` Junio C Hamano
  1 sibling, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2014-04-16 19:13 UTC
  To: Stepan Kasal; +Cc: git

Stepan Kasal <kasal@ucw.cz> writes:

> From: RomanBelinsky <belinsky.roman@gmail.com>
> Date: Tue, 11 Feb 2014 18:23:02 +0200
>
> fix parsing error for dates like:
> 2014-01-07T5:58:36.048176Z
> previous regex can parse only:
> 2014-01-07T05:58:36.048176Z
> reproduced in my svn repository during conversion.

Interesting.  What other strange forms can they record in their
repositories, I have to wonder.  Can they do

    2014-01-07T5:8:6.048176Z

for example?  I am wondering if it is simpler and less error prone
to turn all these "we only accept two digits" into "\d+" not only
for the hour part but also minute and second parts.
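
To make the alternatives concrete, here is a rough Python sketch (not
the Perl code in SVN.pm) comparing three variants of the time part:
the original two-digit pattern, the hour-only relaxation from the
patch, and the "\d+" everywhere idea.  The variant names and the
accepted_by() helper are hypothetical, chosen only for this example.

```python
import re

# Shared date prefix and fractional-seconds suffix, transliterated from
# the Perl regexp in the patch.
DATE = r"^(\d{4})-(\d\d)-(\d\d)T"
TAIL = r"\.\d*Z$"

# Hypothetical labels for the three candidate time patterns.
VARIANTS = {
    "strict": re.compile(DATE + r"(\d\d):(\d\d):(\d\d)" + TAIL),
    "hour-only": re.compile(DATE + r"(\d\d?):(\d\d):(\d\d)" + TAIL),
    "lenient": re.compile(DATE + r"(\d+):(\d+):(\d+)" + TAIL),
}


def accepted_by(date):
    """Return the names of the variants that match the given date string."""
    return [name for name, pattern in VARIANTS.items() if pattern.match(date)]


for sample in (
    "2014-01-07T05:58:36.048176Z",
    "2014-01-07T5:58:36.048176Z",
    "2014-01-07T5:8:6.048176Z",
):
    print(sample, accepted_by(sample))
```

Note that "\d+" is looser than "one or two digits": it would also
accept an hour such as "123".  Whether that matters depends on how the
captured values are used afterwards.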



* Re: [PATCH] Update SVN.pm
  2014-04-16 19:13 ` Junio C Hamano
@ 2014-04-17  5:24   ` Stepan Kasal
  2014-04-17 17:39     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Stepan Kasal @ 2014-04-17  5:24 UTC
  To: git

Hello,

On Wed, Apr 16, 2014 at 12:13:21PM -0700, Junio C Hamano wrote:
> Interesting.  What other strange forms can they record in their
> repositories, I have to wonder.  Can they do
>     2014-01-07T5:8:6.048176Z
> for example?

Roman Belinsky, the author of this fix, witnessed after large scale
conversion that the problem happens with the hour part only.
(SVN commits from the same origin did this with hours but not with
minutes.)  Recorded here:
https://github.com/msysgit/git/pull/126#discussion_r9661916

> I am wondering if it is simpler and less error prone
> to turn all these "we only accept two digits" into "\d+" not only
> for the hour part but also minute and second parts.

But Roman's proposed regexp nicely shows 1) what the standard is and
2) what the deviation is.

Have a nice day,
  Stepan Kasal



* Re: [PATCH] Update SVN.pm
  2014-04-17  5:24   ` Stepan Kasal
@ 2014-04-17 17:39     ` Junio C Hamano
  2014-04-18  6:48       ` Stepan Kasal
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2014-04-17 17:39 UTC
  To: Stepan Kasal; +Cc: git

Stepan Kasal <kasal@ucw.cz> writes:

> On Wed, Apr 16, 2014 at 12:13:21PM -0700, Junio C Hamano wrote:
>> Interesting.  What other strange forms can they record in their
>> repositories, I have to wonder.  Can they do
>>     2014-01-07T5:8:6.048176Z
>> for example?
>
> Roman Belinsky, the author of this fix, witnessed after large scale
> conversion that the problem happens with the hour part only.

Is this "large scale conversion" done from a SVN repository that is
created by bog standard SVN, or something else?  How certain are we
that this "hour part is broken" is the only kind of breakage in
timestamps we would encounter?

What I am trying to get at is that "we didn't see any breakage at
positions other than hour part after checking 2 million commits" is
different from "there will be no breakage at positions other than hour
part", and by being slightly more lenient than necessary to cover
one observed case that triggered the patch, we can cover SVN
repositories broken in a similar but slightly different way.

Especially given that this regexp matching is not used for finding a
timestamp from random places but to parse out the datum we find at a
place where we expect to see a timestamp (check the callers), I
think loosening to allow single-digit minutes and seconds in the
same commit that allows single-digit hours would be such "slightly
more lenient than necessary" change without additional risk of
mistaking something that is not a timestamp as a timestamp.

Thanks.
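
The point about anchoring can be illustrated with a small Python
sketch.  It assumes, as described above, that the value handed to the
parser is a field expected to hold nothing but a timestamp.  The
names ANCHORED, UNANCHORED and parse_field() are invented for this
example; the real code is the Perl sub parse_svn_date.

```python
import re

# Lenient time fields (\d+), but anchored at both ends as in the patch.
ANCHORED = re.compile(r"^(\d{4})-(\d\d)-(\d\d)T(\d+):(\d+):(\d+)\.\d*Z$")
# Same pattern without anchors, shown only for contrast.
UNANCHORED = re.compile(r"(\d{4})-(\d\d)-(\d\d)T(\d+):(\d+):(\d+)\.\d*Z")


def parse_field(value):
    """Parse a field that should contain only a timestamp.

    Raises ValueError, loosely mirroring the croak in the Perl code,
    when the whole field is not a timestamp.
    """
    match = ANCHORED.match(value)
    if match is None:
        raise ValueError(f"Unable to parse date: {value}")
    return tuple(int(group) for group in match.groups())


print(parse_field("2014-01-07T5:8:6.048176Z"))

noisy = "see 2014-01-07T5:8:6.048176Z in log"
# The unanchored pattern finds a timestamp inside arbitrary text ...
print(UNANCHORED.search(noisy) is not None)
# ... while the anchored parser refuses the same input.
try:
    parse_field(noisy)
except ValueError as err:
    print(err)
```

Because the anchors force the whole field to match, loosening the
digit counts widens what counts as a timestamp without letting the
pattern pick a timestamp out of surrounding text.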


* Re: [PATCH] Update SVN.pm
  2014-04-16 14:16 [PATCH] Update SVN.pm Stepan Kasal
  2014-04-16 19:13 ` Junio C Hamano
@ 2014-04-17 18:01 ` Junio C Hamano
  1 sibling, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2014-04-17 18:01 UTC
  To: Stepan Kasal; +Cc: git

Stepan Kasal <kasal@ucw.cz> writes:

> From: RomanBelinsky <belinsky.roman@gmail.com>
> Date: Tue, 11 Feb 2014 18:23:02 +0200
>
> fix parsing error for dates like:
> 2014-01-07T5:58:36.048176Z
> previous regex can parse only:
> 2014-01-07T05:58:36.048176Z
> reproduced in my svn repository during conversion.
>
> Signed-off-by: Stepan Kasal <kasal@ucw.cz>
> ---

Two niggles.

 - The "Subject" line is not descriptive enough to let readers of "git
   shortlog" know what this change is about.

 - Can we have the patch signed-off by the author?


For the first point, I'd suggest rewriting the proposed commit
message like this (this is what I came up with after reading that
msysgit discussion page you referred to in the other message):

------------------------------------------------------
SVN.pm::parse_svn_date: allow timestamps with a single-digit hour

Some broken subversion server gives timestamps with only one digit
in the hour part, like this:

    2014-01-07T5:58:36.048176Z

Loosen the regexp that expected to see two-digit hour, minute and
second parts to accept a single-digit hour (but not minute or
second).

Signed-off-by: Stepan Kasal <kasal@ucw.cz>
------------------------------------------------------




* Re: [PATCH] Update SVN.pm
  2014-04-17 17:39     ` Junio C Hamano
@ 2014-04-18  6:48       ` Stepan Kasal
  0 siblings, 0 replies; 6+ messages in thread
From: Stepan Kasal @ 2014-04-18  6:48 UTC
  To: git; +Cc: Roman Belinsky

Hello,

cc'ing Roman, the original author.  (I should have done that
in the first post, sorry.  I have also forwarded him another
mail from this thread, asking him for author's sign off.)

On Thu, Apr 17, 2014 at 10:39:49AM -0700, Junio C Hamano wrote:
> Stepan Kasal <kasal@ucw.cz> writes:
> 
> > On Wed, Apr 16, 2014 at 12:13:21PM -0700, Junio C Hamano wrote:
> >> Interesting.  What other strange forms can they record in their
> >> repositories, I have to wonder.  Can they do
> >>     2014-01-07T5:8:6.048176Z
> >> for example?
> >
> > Roman Belinsky, the author of this fix, witnessed after large scale
> > conversion that the problem happens with the hour part only.
> 
> Is this "large scale conversion" done from a SVN repository that is
> created by bog standard SVN, or something else?

I don't know.  Roman?

> How certain are we that this "hour part is broken" is the only kind
> of breakage in timestamps we would encounter?

I would say we can be certain, as Roman said that the same PC
that inserts the timestamp with one-digit hours does not misformat
minutes.  (Still cited from the same discussion
https://github.com/msysgit/git/pull/126#discussion_r9661916 )

We do not have code review for that bug, as far as I know, but this
is a natural bug:  a reasonable-looking time "5:08:09.048176" is
used in the format "%sT%s".
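
A purely hypothetical sketch of how such a bug could arise, written
in Python for illustration (nobody in this thread has seen the code
that produced these timestamps): if the time string is built with an
unpadded hour and then joined to the date with "%sT%s", only the hour
loses its leading zero.  The function names and the trailing "Z" are
assumptions made for this example.

```python
def format_time_unpadded_hour(hour, minute, second, micro):
    """Hypothetical buggy formatter: the hour is not zero-padded."""
    return "%d:%02d:%02d.%06d" % (hour, minute, second, micro)


def format_time_padded(hour, minute, second, micro):
    """Formatter that zero-pads every field."""
    return "%02d:%02d:%02d.%06d" % (hour, minute, second, micro)


def join_timestamp(date, time):
    """Join date and time as in the "%sT%s" guess; "Z" is added here."""
    return "%sT%sZ" % (date, time)


buggy = join_timestamp("2014-01-07", format_time_unpadded_hour(5, 8, 9, 48176))
good = join_timestamp("2014-01-07", format_time_padded(5, 8, 9, 48176))
print(buggy)
print(good)
```

In this sketch minutes and seconds keep their padding, which would be
consistent with the observation that only the hour part was affected.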

> [...] and by being slightly more lenient than necessary to cover
> one observed case that triggered the patch, we can cover SVN
> repositories broken in a similar but slightly different way.

I second that, in general.
But my guess is that this particular "similar but slightly
different" breakage will never appear, so the self-documenting
original fix wins for me.

> Especially given that this regexp matching is not used for finding a
> timestamp from random places [...]

I agree that the broader regexp is not dangerous in this context.  So
it seems to be no big issue either way.

Thanks for looking at this so carefully,
	Stepan
