* [PATCH] userdiff: update Ada patterns
@ 2014-02-02 10:51 Adrian Johnson
2014-02-02 23:35 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Adrian Johnson @ 2014-02-02 10:51 UTC (permalink / raw)
To: git
- Allow extra space in "is new" and "is separate"
- Fix bug in word regex for numbers
Signed-off-by: Adrian Johnson <ajohnson@redneon.com>
---
t/t4034/ada/expect | 2 +-
userdiff.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/t/t4034/ada/expect b/t/t4034/ada/expect
index be2376e..a682d28 100644
--- a/t/t4034/ada/expect
+++ b/t/t4034/ada/expect
@@ -4,7 +4,7 @@
<BOLD>+++ b/post<RESET>
<CYAN>@@ -1,13 +1,13 @@<RESET>
Ada.Text_IO.Put_Line("Hello World<RED>!<RESET><GREEN>?<RESET>");
-1 1e<RED>-<RESET>10 16#FE12#E2 3.141_592 '<RED>x<RESET><GREEN>y<RESET>'
+1 <RED>1e-10<RESET><GREEN>1e10<RESET> 16#FE12#E2 3.141_592 '<RED>x<RESET><GREEN>y<RESET>'
<RED>a<RESET><GREEN>x<RESET>+<RED>b a<RESET><GREEN>y x<RESET>-<RED>b<RESET>
<RED>a<RESET><GREEN>y<RESET>
<GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>/<RED>b<RESET>
diff --git a/userdiff.c b/userdiff.c
index ea43a03..e8915bf 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -15,13 +15,13 @@ static int drivers_alloc;
word_regex "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+" }
static struct userdiff_driver builtin_drivers[] = {
IPATTERN("ada",
- "!^(.*[ \t])?(is new|renames|is separate)([ \t].*)?$\n"
+ "!^(.*[ \t])?(is[ \t]+new|renames|is[ \t]+separate)([ \t].*)?$\n"
"!^[ \t]*with[ \t].*$\n"
"^[ \t]*((procedure|function)[ \t]+.*)$\n"
"^[ \t]*((package|protected|task)[ \t]+.*)$",
/* -- */
"[a-zA-Z][a-zA-Z0-9_]*"
- "|[0-9][-+0-9#_.eE]"
+ "|[-+0-9#_.eE]+"
"|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"),
IPATTERN("fortran",
"!^([C*]|[ \t]*!)\n"
--
1.8.3.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] userdiff: update Ada patterns
2014-02-02 10:51 [PATCH] userdiff: update Ada patterns Adrian Johnson
@ 2014-02-02 23:35 ` Jeff King
2014-02-03 11:30 ` Adrian Johnson
2014-02-03 11:33 ` [PATCH v2] " Adrian Johnson
0 siblings, 2 replies; 8+ messages in thread
From: Jeff King @ 2014-02-02 23:35 UTC (permalink / raw)
To: Adrian Johnson; +Cc: git
On Sun, Feb 02, 2014 at 09:21:56PM +1030, Adrian Johnson wrote:
> - Allow extra space in "is new" and "is separate"
> [...]
> - "!^(.*[ \t])?(is new|renames|is separate)([ \t].*)?$\n"
> + "!^(.*[ \t])?(is[ \t]+new|renames|is[ \t]+separate)([ \t].*)?$\n"
I do not know anything about Ada syntax, but this change looks obviously
sensible.
> - Fix bug in word regex for numbers
> - "|[0-9][-+0-9#_.eE]"
> + "|[-+0-9#_.eE]+"
This makes "E" or "_" a number. Is that right?
I think the intent of the original was "starts with a digit, and then
has one or more of these other things after it". You do not describe the
bug, but I guess it would be that it does not match obvious things like
"5". Should it be "zero or more" instead, like:
[0-9][-+0-9#_.eE]*
? Also, should -/+ be hoisted to the front?
[-+]?[0-9][0-9#_.eE]*
Again, I am just guessing, as I am not familiar enough with Ada.
-Peff
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] userdiff: update Ada patterns
2014-02-02 23:35 ` Jeff King
@ 2014-02-03 11:30 ` Adrian Johnson
2014-02-03 11:33 ` [PATCH v2] " Adrian Johnson
1 sibling, 0 replies; 8+ messages in thread
From: Adrian Johnson @ 2014-02-03 11:30 UTC (permalink / raw)
To: Jeff King; +Cc: git
On 03/02/14 10:05, Jeff King wrote:
> On Sun, Feb 02, 2014 at 09:21:56PM +1030, Adrian Johnson wrote:
>> - Fix bug in word regex for numbers
>> - "|[0-9][-+0-9#_.eE]"
>> + "|[-+0-9#_.eE]+"
>
> This makes "E" or "_" a number. Is that right?
>
> I think the intent of the original was "starts with a digit, and then
> has one or more of these other things after it". You do not describe the
> bug, but I guess it would be that it does not match obvious things like
> "5". Should it be "zero or more" instead, like:
>
> [0-9][-+0-9#_.eE]*
Yes, the original was missing the '*' at the end.
I changed it to be similar to the number regexes used by the other builtin
patterns which are of the form '[-+0-9#_.eE]+'.
>
> ? Also, should -/+ be hoisted to the front?
>
> [-+]?[0-9][0-9#_.eE]*
The other builtins don't do this. But it is probably better to have
[-+]?[0-9] at the front.
> Again, I am just guessing, as I am not familiar enough with Ada.
Ada numbers have the form:
- integers and reals eg 123, 1.23, 1e-2 ('.' can not be first)
- a '_' may be used between digits to improve readability eg 123_456
- base n (2 <= n <= 16) is of the form n#digits# eg 16#FFEF#
- base n numbers can include a radix point and/or an exponent
eg 16#FF12.8#e-2
- Ada is case insensitive
After having another look I noticed it was missing the hex characters.
The new number regex I am proposing is:
[-+]?[0-9][0-9#_.aAbBcCdDeEfF]*([eE][+-]?[0-9_]+)?
I kept exponents containing a +/- sign separate from the digits
to prevent things like '1+2' from matching. I'll send an updated patch.
>
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2] userdiff: update Ada patterns
2014-02-02 23:35 ` Jeff King
2014-02-03 11:30 ` Adrian Johnson
@ 2014-02-03 11:33 ` Adrian Johnson
2014-02-03 20:00 ` Junio C Hamano
1 sibling, 1 reply; 8+ messages in thread
From: Adrian Johnson @ 2014-02-03 11:33 UTC (permalink / raw)
To: git
- Allow extra space in "is new" and "is separate"
- Fix bug in word regex for numbers
Signed-off-by: Adrian Johnson <ajohnson@redneon.com>
---
t/t4034/ada/expect | 2 +-
userdiff.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/t/t4034/ada/expect b/t/t4034/ada/expect
index be2376e..a682d28 100644
--- a/t/t4034/ada/expect
+++ b/t/t4034/ada/expect
@@ -4,7 +4,7 @@
<BOLD>+++ b/post<RESET>
<CYAN>@@ -1,13 +1,13 @@<RESET>
Ada.Text_IO.Put_Line("Hello World<RED>!<RESET><GREEN>?<RESET>");
-1 1e<RED>-<RESET>10 16#FE12#E2 3.141_592 '<RED>x<RESET><GREEN>y<RESET>'
+1 <RED>1e-10<RESET><GREEN>1e10<RESET> 16#FE12#E2 3.141_592 '<RED>x<RESET><GREEN>y<RESET>'
<RED>a<RESET><GREEN>x<RESET>+<RED>b a<RESET><GREEN>y x<RESET>-<RED>b<RESET>
<RED>a<RESET><GREEN>y<RESET>
<GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>/<RED>b<RESET>
diff --git a/userdiff.c b/userdiff.c
index ea43a03..10b61ec 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -15,13 +15,13 @@ static int drivers_alloc;
word_regex "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+" }
static struct userdiff_driver builtin_drivers[] = {
IPATTERN("ada",
- "!^(.*[ \t])?(is new|renames|is separate)([ \t].*)?$\n"
+ "!^(.*[ \t])?(is[ \t]+new|renames|is[ \t]+separate)([ \t].*)?$\n"
"!^[ \t]*with[ \t].*$\n"
"^[ \t]*((procedure|function)[ \t]+.*)$\n"
"^[ \t]*((package|protected|task)[ \t]+.*)$",
/* -- */
"[a-zA-Z][a-zA-Z0-9_]*"
- "|[0-9][-+0-9#_.eE]"
+ "|[-+]?[0-9][0-9#_.aAbBcCdDeEfF]*([eE][+-]?[0-9_]+)?"
"|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"),
IPATTERN("fortran",
"!^([C*]|[ \t]*!)\n"
--
1.8.3.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] userdiff: update Ada patterns
2014-02-03 11:33 ` [PATCH v2] " Adrian Johnson
@ 2014-02-03 20:00 ` Junio C Hamano
2014-02-05 10:44 ` Adrian Johnson
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2014-02-03 20:00 UTC (permalink / raw)
To: Adrian Johnson; +Cc: git, Jeff King
Adrian Johnson <ajohnson@redneon.com> writes:
> - Allow extra space in "is new" and "is separate"
> - Fix bug in word regex for numbers
>
> Signed-off-by: Adrian Johnson <ajohnson@redneon.com>
> ---
> t/t4034/ada/expect | 2 +-
> userdiff.c | 4 ++--
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/t/t4034/ada/expect b/t/t4034/ada/expect
> index be2376e..a682d28 100644
> --- a/t/t4034/ada/expect
> +++ b/t/t4034/ada/expect
> @@ -4,7 +4,7 @@
> <BOLD>+++ b/post<RESET>
> <CYAN>@@ -1,13 +1,13 @@<RESET>
> Ada.Text_IO.Put_Line("Hello World<RED>!<RESET><GREEN>?<RESET>");
> -1 1e<RED>-<RESET>10 16#FE12#E2 3.141_592 '<RED>x<RESET><GREEN>y<RESET>'
> +1 <RED>1e-10<RESET><GREEN>1e10<RESET> 16#FE12#E2 3.141_592 '<RED>x<RESET><GREEN>y<RESET>'
> <RED>a<RESET><GREEN>x<RESET>+<RED>b a<RESET><GREEN>y x<RESET>-<RED>b<RESET>
> <RED>a<RESET><GREEN>y<RESET>
> <GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>/<RED>b<RESET>
> diff --git a/userdiff.c b/userdiff.c
> index ea43a03..10b61ec 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -15,13 +15,13 @@ static int drivers_alloc;
> word_regex "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+" }
> static struct userdiff_driver builtin_drivers[] = {
> IPATTERN("ada",
> - "!^(.*[ \t])?(is new|renames|is separate)([ \t].*)?$\n"
> + "!^(.*[ \t])?(is[ \t]+new|renames|is[ \t]+separate)([ \t].*)?$\n"
> "!^[ \t]*with[ \t].*$\n"
> "^[ \t]*((procedure|function)[ \t]+.*)$\n"
> "^[ \t]*((package|protected|task)[ \t]+.*)$",
> /* -- */
> "[a-zA-Z][a-zA-Z0-9_]*"
> - "|[0-9][-+0-9#_.eE]"
> + "|[-+]?[0-9][0-9#_.aAbBcCdDeEfF]*([eE][+-]?[0-9_]+)?"
This would match a lot wider than what I read you said you wanted to
match in your previous message. Does "-04##4_3_2Ee-9" count as a
number, for example, or can we just ignore such syntactically
incorrect sequence?
> "|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"),
> IPATTERN("fortran",
> "!^([C*]|[ \t]*!)\n"
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] userdiff: update Ada patterns
2014-02-03 20:00 ` Junio C Hamano
@ 2014-02-05 10:44 ` Adrian Johnson
2014-02-05 17:17 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Adrian Johnson @ 2014-02-05 10:44 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Jeff King
On 04/02/14 06:30, Junio C Hamano wrote:
> Adrian Johnson <ajohnson@redneon.com> writes:
>
>> - Allow extra space in "is new" and "is separate"
>> - Fix bug in word regex for numbers
>>
>> Signed-off-by: Adrian Johnson <ajohnson@redneon.com>
>> ---
>> t/t4034/ada/expect | 2 +-
>> userdiff.c | 4 ++--
>> 2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/t/t4034/ada/expect b/t/t4034/ada/expect
>> index be2376e..a682d28 100644
>> --- a/t/t4034/ada/expect
>> +++ b/t/t4034/ada/expect
>> @@ -4,7 +4,7 @@
>> <BOLD>+++ b/post<RESET>
>> <CYAN>@@ -1,13 +1,13 @@<RESET>
>> Ada.Text_IO.Put_Line("Hello World<RED>!<RESET><GREEN>?<RESET>");
>> -1 1e<RED>-<RESET>10 16#FE12#E2 3.141_592 '<RED>x<RESET><GREEN>y<RESET>'
>> +1 <RED>1e-10<RESET><GREEN>1e10<RESET> 16#FE12#E2 3.141_592 '<RED>x<RESET><GREEN>y<RESET>'
>> <RED>a<RESET><GREEN>x<RESET>+<RED>b a<RESET><GREEN>y x<RESET>-<RED>b<RESET>
>> <RED>a<RESET><GREEN>y<RESET>
>> <GREEN>x<RESET>*<RED>b a<RESET><GREEN>y x<RESET>/<RED>b<RESET>
>> diff --git a/userdiff.c b/userdiff.c
>> index ea43a03..10b61ec 100644
>> --- a/userdiff.c
>> +++ b/userdiff.c
>> @@ -15,13 +15,13 @@ static int drivers_alloc;
>> word_regex "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+" }
>> static struct userdiff_driver builtin_drivers[] = {
>> IPATTERN("ada",
>> - "!^(.*[ \t])?(is new|renames|is separate)([ \t].*)?$\n"
>> + "!^(.*[ \t])?(is[ \t]+new|renames|is[ \t]+separate)([ \t].*)?$\n"
>> "!^[ \t]*with[ \t].*$\n"
>> "^[ \t]*((procedure|function)[ \t]+.*)$\n"
>> "^[ \t]*((package|protected|task)[ \t]+.*)$",
>> /* -- */
>> "[a-zA-Z][a-zA-Z0-9_]*"
>> - "|[0-9][-+0-9#_.eE]"
>> + "|[-+]?[0-9][0-9#_.aAbBcCdDeEfF]*([eE][+-]?[0-9_]+)?"
>
> This would match a lot wider than what I read you said you wanted to
> match in your previous message. Does "-04##4_3_2Ee-9" count as a
> number, for example, or can we just ignore such syntactically
> incorrect sequence?
Maybe I am misunderstanding the purpose of the word diff regexes. I
thought the purpose of the word regex is to split lines into words, not
determine what is syntactically correct.
For example decimal number regex for pascal is: [-+0-9.e]+
and for cpp: [-+0-9.e]+[fFlL]?
These will obviously match stuff that is not a number.
>
>> "|=>|\\.\\.|\\*\\*|:=|/=|>=|<=|<<|>>|<>"),
>> IPATTERN("fortran",
>> "!^([C*]|[ \t]*!)\n"
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] userdiff: update Ada patterns
2014-02-05 10:44 ` Adrian Johnson
@ 2014-02-05 17:17 ` Junio C Hamano
2014-02-05 17:28 ` Jeff King
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2014-02-05 17:17 UTC (permalink / raw)
To: Adrian Johnson; +Cc: git, Jeff King
Adrian Johnson <ajohnson@redneon.com> writes:
>>> - "|[0-9][-+0-9#_.eE]"
>>> + "|[-+]?[0-9][0-9#_.aAbBcCdDeEfF]*([eE][+-]?[0-9_]+)?"
>>
>> This would match a lot wider than what I read you said you wanted to
>> match in your previous message. Does "-04##4_3_2Ee-9" count as a
>> number, for example, or can we just ignore such syntactically
>> incorrect sequence?
>
> Maybe I am misunderstanding the purpose of the word diff regexes. I
> thought the purpose of the word regex is to split lines into words, not
> determine what is syntactically correct.
I agree that the purpose is former---So you could have just said
"the latter" ;-).
Any other nitpick, anybody? Otherwise I'll queue this version.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] userdiff: update Ada patterns
2014-02-05 17:17 ` Junio C Hamano
@ 2014-02-05 17:28 ` Jeff King
0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2014-02-05 17:28 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Adrian Johnson, git
On Wed, Feb 05, 2014 at 09:17:47AM -0800, Junio C Hamano wrote:
> Adrian Johnson <ajohnson@redneon.com> writes:
>
> >>> - "|[0-9][-+0-9#_.eE]"
> >>> + "|[-+]?[0-9][0-9#_.aAbBcCdDeEfF]*([eE][+-]?[0-9_]+)?"
> >>
> >> This would match a lot wider than what I read you said you wanted to
> >> match in your previous message. Does "-04##4_3_2Ee-9" count as a
> >> number, for example, or can we just ignore such syntactically
> >> incorrect sequence?
> >
> > Maybe I am misunderstanding the purpose of the word diff regexes. I
> > thought the purpose of the word regex is to split lines into words, not
> > determine what is syntactically correct.
>
> I agree that the purpose is former---So you could have just said
> "the latter" ;-).
>
> Any other nitpick, anybody? Otherwise I'll queue this version.
No nitpick here, I had the same thought as Adrian while reading the
thread (and if somebody comes up with a real case where the output looks
bad, we can iterate on it).
-Peff
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2014-02-05 17:28 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-02-02 10:51 [PATCH] userdiff: update Ada patterns Adrian Johnson
2014-02-02 23:35 ` Jeff King
2014-02-03 11:30 ` Adrian Johnson
2014-02-03 11:33 ` [PATCH v2] " Adrian Johnson
2014-02-03 20:00 ` Junio C Hamano
2014-02-05 10:44 ` Adrian Johnson
2014-02-05 17:17 ` Junio C Hamano
2014-02-05 17:28 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).