git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] git-remote-mediawiki: Fix a bug in a regexp
@ 2013-06-08 13:35 Célestin Matte
  2013-06-08 18:38 ` Matthieu Moy
  2013-06-09  5:11 ` Eric Sunshine
  0 siblings, 2 replies; 6+ messages in thread
From: Célestin Matte @ 2013-06-08 13:35 UTC (permalink / raw)
  To: git; +Cc: benoit.person, matthieu.moy, Célestin Matte

In Perl, '\n' is not a newline, but instead a literal backslash followed by an
"n". As the output of "rev-list --first-parent" is line-oriented, what we want
here is a newline.

Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
---
 contrib/mw-to-git/git-remote-mediawiki.perl |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
index 7af202f..a06bc31 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -1190,7 +1190,7 @@ sub mw_push_revision {
 		# history (linearized with --first-parent)
 		print STDERR "Warning: no common ancestor, pushing complete history\n";
 		my $history = run_git("rev-list --first-parent --children $local");
-		my @history = split('\n', $history);
+		my @history = split(/\n/, $history);
 		@history = @history[1..$#history];
 		foreach my $line (reverse @history) {
 			my @commit_info_split = split(/ |\n/, $line);
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] git-remote-mediawiki: Fix a bug in a regexp
  2013-06-08 13:35 [PATCH] git-remote-mediawiki: Fix a bug in a regexp Célestin Matte
@ 2013-06-08 18:38 ` Matthieu Moy
  2013-06-08 20:08   ` Célestin Matte
  2013-06-09  2:57   ` Jeff King
  2013-06-09  5:11 ` Eric Sunshine
  1 sibling, 2 replies; 6+ messages in thread
From: Matthieu Moy @ 2013-06-08 18:38 UTC (permalink / raw)
  To: Célestin Matte; +Cc: git, benoit.person

Célestin Matte <celestin.matte@ensimag.fr> writes:

> In Perl, '\n' is not a newline, but instead a literal backslash followed by an
> "n". As the output of "rev-list --first-parent" is line-oriented, what we want
> here is a newline.

This is right, but the code actually worked the way it was. I'm not
sure, but my understanding is that '\n' is the string "backslash
followed by n", but interpreted as a regexp, it is a newline.

The new code looks better than the old one, but the log message may be
improved.

In any case, Acked-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] git-remote-mediawiki: Fix a bug in a regexp
  2013-06-08 18:38 ` Matthieu Moy
@ 2013-06-08 20:08   ` Célestin Matte
  2013-06-09  2:57   ` Jeff King
  1 sibling, 0 replies; 6+ messages in thread
From: Célestin Matte @ 2013-06-08 20:08 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Célestin Matte, git, benoit.person

Le 08/06/2013 20:38, Matthieu Moy a écrit :> This is right, but the code
actually worked the way it was. I'm not
> sure, but my understanding is that '\n' is the string "backslash
> followed by n", but interpreted as a regexp, it is a newline.
>
> The new code looks better than the old one, but the log message may be
> improved.

Is this better?

"
In Perl, '\n' is not a newline, but instead the string composed of a
backslash followed by an "n". To match newlines, one has to use the /\n/
regexp. As the output of "rev-list --first-parent" is line-oriented,
what we want here is to match newlines, and not the "\n" string.
"

-- 
Célestin Matte

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] git-remote-mediawiki: Fix a bug in a regexp
  2013-06-08 18:38 ` Matthieu Moy
  2013-06-08 20:08   ` Célestin Matte
@ 2013-06-09  2:57   ` Jeff King
  2013-06-09  5:21     ` Eric Sunshine
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff King @ 2013-06-09  2:57 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Célestin Matte, git, benoit.person

On Sat, Jun 08, 2013 at 08:38:56PM +0200, Matthieu Moy wrote:

> Célestin Matte <celestin.matte@ensimag.fr> writes:
> 
> > In Perl, '\n' is not a newline, but instead a literal backslash followed by an
> > "n". As the output of "rev-list --first-parent" is line-oriented, what we want
> > here is a newline.
> 
> This is right, but the code actually worked the way it was. I'm not
> sure, but my understanding is that '\n' is the string "backslash
> followed by n", but interpreted as a regexp, it is a newline.

Yes, the relevant doc (from "perldoc -f split") is:

  The pattern "/PATTERN/" may be replaced with an expression to specify
  patterns that vary at runtime.  (To do runtime compilation only once,
  use "/$variable/o".)

So it is treating "\n" as an expression and compiling the regex each
time through (though I think modern perl may be smart enough to realize
it is a constant expression and compile the regex only once). You would
get the same behavior with this:

  split $arg, $data;

if $arg contained '\n'. Of course, you _also_ get the same thing if you
use a literal newline (either "\n" or if $arg contained a literal
newline), because they function the same in a regex. In other words, it
does not matter which you use because perl's interpolation of "\n" and
the regex expansion of "\n" are identical: t hey both mean a newline.

A more subtle example that shows what is going on is this:

  split '.', $data;

If you feed that "foo.bar.baz", it does not split it into three words;
each character is a delimiter, because the dot is compiled to a regex.

> The new code looks better than the old one, but the log message may be
> improved.

Agreed. I think the best explanation is something like:

  Perl's split function takes a regex pattern argument. You can also
  feed it an expression, which is then compiled into a regex at runtime.
  It therefore works to pass your pattern via single quotes, but it is
  much less obvious to a reader that the argument is meant to be a
  regex, not a static string. Using the traditional slash-delimiters
  makes this easier to read.

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] git-remote-mediawiki: Fix a bug in a regexp
  2013-06-08 13:35 [PATCH] git-remote-mediawiki: Fix a bug in a regexp Célestin Matte
  2013-06-08 18:38 ` Matthieu Moy
@ 2013-06-09  5:11 ` Eric Sunshine
  1 sibling, 0 replies; 6+ messages in thread
From: Eric Sunshine @ 2013-06-09  5:11 UTC (permalink / raw)
  To: Célestin Matte; +Cc: Git List, benoit.person, Matthieu Moy

On Sat, Jun 8, 2013 at 9:35 AM, Célestin Matte
<celestin.matte@ensimag.fr> wrote:
> In Perl, '\n' is not a newline, but instead a literal backslash followed by an
> "n". As the output of "rev-list --first-parent" is line-oriented, what we want
> here is a newline.
>
> Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr>
> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
> ---
>  contrib/mw-to-git/git-remote-mediawiki.perl |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
> index 7af202f..a06bc31 100755
> --- a/contrib/mw-to-git/git-remote-mediawiki.perl
> +++ b/contrib/mw-to-git/git-remote-mediawiki.perl
> @@ -1190,7 +1190,7 @@ sub mw_push_revision {
>                 # history (linearized with --first-parent)
>                 print STDERR "Warning: no common ancestor, pushing complete history\n";
>                 my $history = run_git("rev-list --first-parent --children $local");
> -               my @history = split('\n', $history);
> +               my @history = split(/\n/, $history);
>                 @history = @history[1..$#history];
>                 foreach my $line (reverse @history) {
>                         my @commit_info_split = split(/ |\n/, $line);

It would be quite acceptable to include this patch in your existing
patch series.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] git-remote-mediawiki: Fix a bug in a regexp
  2013-06-09  2:57   ` Jeff King
@ 2013-06-09  5:21     ` Eric Sunshine
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Sunshine @ 2013-06-09  5:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Matthieu Moy, Célestin Matte, Git List, benoit.person

On Sat, Jun 8, 2013 at 10:57 PM, Jeff King <peff@peff.net> wrote:
> On Sat, Jun 08, 2013 at 08:38:56PM +0200, Matthieu Moy wrote:
>
>> Célestin Matte <celestin.matte@ensimag.fr> writes:
>>
>> > In Perl, '\n' is not a newline, but instead a literal backslash followed by an
>> > "n". As the output of "rev-list --first-parent" is line-oriented, what we want
>> > here is a newline.
>>
>> This is right, but the code actually worked the way it was. I'm not
>> sure, but my understanding is that '\n' is the string "backslash
>> followed by n", but interpreted as a regexp, it is a newline.
>
> Yes, the relevant doc (from "perldoc -f split") is:
>
>   The pattern "/PATTERN/" may be replaced with an expression to specify
>   patterns that vary at runtime.  (To do runtime compilation only once,
>   use "/$variable/o".)
>
> So it is treating "\n" as an expression and compiling the regex each
> time through ...

I read this as saying only that static /PATTERN/ can also be a
composed /$PATTERN/. It does not indicate how string 'PATTERN' is
treated, nor does any other part of "perldoc -f split" make special
mention of string 'PATTERN'. In fact, the only explanation I found
regarding string 'PATTERN' is in my Camel book (3rd edition, page 796)
in a parenthesized comment:

    (... if you supply a string instead of a regular expression, it'll be
    interpreted as a regular expression anyway.)

>> The new code looks better than the old one, but the log message may be
>> improved.
>
> Agreed. I think the best explanation is something like:
>
>   Perl's split function takes a regex pattern argument. You can also
>   feed it an expression, which is then compiled into a regex at runtime.
>   It therefore works to pass your pattern via single quotes, but it is
>   much less obvious to a reader that the argument is meant to be a
>   regex, not a static string. Using the traditional slash-delimiters
>   makes this easier to read.

Sounds good to me.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-06-09  5:21 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-06-08 13:35 [PATCH] git-remote-mediawiki: Fix a bug in a regexp Célestin Matte
2013-06-08 18:38 ` Matthieu Moy
2013-06-08 20:08   ` Célestin Matte
2013-06-09  2:57   ` Jeff King
2013-06-09  5:21     ` Eric Sunshine
2013-06-09  5:11 ` Eric Sunshine

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).