From: Junio C Hamano <gitster@pobox.com>
To: Paolo Bonzini <bonzini@gnu.org>
Cc: <git@vger.kernel.org>, peff@peff.net
Subject: Re: [PATCH] avoid exponential regex match for java and objc function names
Date: Wed, 17 Jun 2009 09:42:43 -0700
Message-ID: <7vab46rev0.fsf@alter.siamese.dyndns.org>
In-Reply-To: <1245248766-14867-1-git-send-email-bonzini@gnu.org> (Paolo Bonzini's message of "Wed\, 17 Jun 2009 16\:26\:06 +0200")

Paolo Bonzini <bonzini@gnu.org> writes:

> In the old regex
>
> ^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\([^;]*)$
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> you can backtrack arbitrarily from [A-Za-z_0-9]* into [A-Za-z_], thus
> causing an exponential number of backtracks. Ironically it also causes
> the regex not to work as intended; for example "catch" can match the
> underlined part of the regex, the first repetition matching "c" and
> the second matching "atch".
>
> The replacement regex avoids this problem, because it makes sure that
> at least a space/tab is eaten on each repetition. In other words,
> a suffix of a repetition can never be a prefix of the next repetition.
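
The ambiguity described above can be checked with a small sketch. It rests on some assumptions: Python's backtracking `re` module stands in for the regex library git links against, the helper names `is_funcname` and `old_group_splits` are made up for illustration, and the negated "!" keyword line from userdiff.c is deliberately left out so that only the second pattern is exercised. It does not measure run time; it only counts how many ways the old repeated group can carve up a single identifier, which is the source of the backtracking.

```python
import re
from itertools import combinations

# Patterns copied from the patch, with the C string escaping ("\\(") undone.
OLD = r"^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\([^;]*)$"
NEW = r"^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\([^;]*)$"

# One repetition of the old group, ignoring its optional leading blanks.
IDENT = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def is_funcname(pattern, line):
    """Return True if the line matches the given funcname pattern."""
    return re.search(pattern, line) is not None


def old_group_splits(word):
    """Count the ways the old group can split one word into 2+ repetitions.

    Brute force: try every set of cut points and keep the splits in which
    every piece is itself a valid identifier.
    """
    n = len(word)
    count = 0
    for k in range(1, n):  # k cut points give k + 1 pieces
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            pieces = [word[a:b] for a, b in zip(bounds, bounds[1:])]
            if all(IDENT.fullmatch(p) for p in pieces):
                count += 1
    return count


if __name__ == "__main__":
    line = "catch (Exception e) {"
    print("old matches:", is_funcname(OLD, line))
    print("new matches:", is_funcname(NEW, line))
    for word in ("catch", "instanceof"):
        print(word, old_group_splits(word))
```

Under these assumptions the old pattern accepts the "catch" line because "catch" alone satisfies the {2,} group, while the new pattern rejects it and still accepts an ordinary declaration such as "void foo(int x) {". The split count roughly doubles with each extra letter in the word, which is the growth a backtracking matcher may have to walk through before giving up on a non-matching line.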

Thanks; nicely done.

Should I remove the "/* -- */", or should I keep it for better
readability?

> Signed-off-by: Paolo Bonzini <bonzini@gnu.org>
> ---
> userdiff.c | 5 +++--
> 1 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/userdiff.c b/userdiff.c
> index d556da9..57529ae 100644
> --- a/userdiff.c
> +++ b/userdiff.c
> @@ -13,7 +13,8 @@ PATTERNS("html", "^[ \t]*(<[Hh][1-6][ \t].*>.*)$",
> "[^<>= \t]+|[^[:space:]]|[\x80-\xff]+"),
> PATTERNS("java",
> "!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
> - "^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\\([^;]*)$",
> + "^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",
> + /* -- */
> "[a-zA-Z_][a-zA-Z0-9_]*"
> "|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?"
> "|[-+*/<>%&^|=!]="
> @@ -25,7 +26,7 @@ PATTERNS("objc",
> /* Objective-C methods */
> "^[ \t]*([-+][ \t]*\\([ \t]*[A-Za-z_][A-Za-z_0-9* \t]*\\)[ \t]*[A-Za-z_].*)$\n"
> /* C functions */
> - "^[ \t]*(([ \t]*[A-Za-z_][A-Za-z_0-9]*){2,}[ \t]*\\([^;]*)$\n"
> + "^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$\n"
> /* Objective-C class/protocol definitions */
> "^(@(implementation|interface|protocol)[ \t].*)$",
> /* -- */
> --
> 1.6.0.3