From: Brian Norris <computersforpeace@gmail.com>
To: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>, linux-kernel@vger.kernel.org
Subject: Re: [BUG] checkpatch: false positive for commits with quote characters
Date: Thu, 3 Dec 2015 17:39:48 -0800 [thread overview]
Message-ID: <20151204013948.GB116589@google.com> (raw)
In-Reply-To: <1449189388.17296.20.camel@perches.com>
On Thu, Dec 03, 2015 at 04:36:28PM -0800, Joe Perches wrote:
> On Thu, 2015-12-03 at 16:29 -0800, Joe Perches wrote:
> > On Thu, 2015-12-03 at 16:13 -0800, Brian Norris wrote:
> > > Ping? I've hit some different false positives today on the same rule.
> > > I'll stop bothering to report them if no one cares.
> >
> > Perhaps this:
Aside from the non-breaking spaces your copy of Evolution inserted, it's
an improvement. (It doesn't give a false positive for the patch I
reported.) But it's still got some holes.
> (minus the debugging this time...)
> ---
> scripts/checkpatch.pl | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index d4960f7..b23dff8 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -2397,20 +2397,20 @@ sub process {
> $long = 1 if ($line =~ /\bcommit\s+[0-9a-f]{41,}/i);
> $space = 0 if ($line =~ /\bcommit [0-9a-f]/i);
> $case = 0 if ($line =~ /\b[Cc]ommit\s+[0-9a-f]{5,40}[^A-F]/);
> - if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)"\)/i) {
> + if ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("(.*)"\)/i) {
You're got the reverse problem now. This will match too broadly. For
instance, if there is another instance of terminating quote +
parentheses on the same line, then we'll get a false postive. e.g., if I
wrote a patch that included something like this:
Commit 6dc0dcde406b ("parisc: Use platform_device_register_simple("rtc-generic")") is cool (as in "cool")
Or even if it's wrapped like this:
Commit 6dc0dcde406b ("parisc: Use
platform_device_register_simple("rtc-generic")") is cool (as in "cool")
Then the regexes will think the description was:
parisc: Use platform_device_register_simple("rtc-generic")") is cool (as in "cool
A hacky workaround for that one: only check for (loosely, not proper
regex syntax):
parsed_description =~ description . ")";
rather than:
description == parsed_description
Not sure how far you want to go on chasing false positives...
> $orig_desc = $1;
> $hasparens = 1;
> } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s*$/i &&
> defined $rawlines[$linenr] &&
> - $rawlines[$linenr] =~ /^\s*\("([^"]+)"\)/) {
> + $rawlines[$linenr] =~ /^\s*\("(.*)"\)/) {
> $orig_desc = $1;
> $hasparens = 1;
> } elsif ($line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("[^"]+$/i &&
> defined $rawlines[$linenr] &&
> - $rawlines[$linenr] =~ /^\s*[^"]+"\)/) {
> + $rawlines[$linenr] =~ /^\s*.*"\)/) {
> $line =~ /\bcommit\s+[0-9a-f]{5,}\s+\("([^"]+)$/i;
> $orig_desc = $1;
> - $rawlines[$linenr] =~ /^\s*([^"]+)"\)/;
> + $rawlines[$linenr] =~ /^\s*(.*)"\)/;
> $orig_desc .= " " . $1;
> $hasparens = 1;
> }
BTW, another false positive: just including this text in a commit
triggers a different one:
http://lkml.kernel.org/g/20151113220039.GA74382@google.com
A simple hack (appended, in addition to yours) would be to assume that
when people are trying to include badly-formatted commit hashes, they
will be preceding them with whitespace, the beginning of a line, or
encapsulating punctuation. Or use a URL parser.
Brian
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index d2993f19df3f..e7110ba3242b 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2349,7 +2349,7 @@ sub process {
# Check for git id commit length and improperly formed commit descriptions
if ($in_commit_log && !$commit_log_possible_stack_dump &&
($line =~ /\bcommit\s+[0-9a-f]{5,}\b/i ||
- ($line =~ /\b[0-9a-f]{12,40}\b/i &&
+ ($line =~ /\b[\s^\("][0-9a-f]{12,40}\b/i &&
$line !~ /[\<\[][0-9a-f]{12,40}[\>\]]/i &&
$line !~ /\bfixes:\s*[0-9a-f]{12,40}/i))) {
my $init_char = "c";
prev parent reply other threads:[~2015-12-04 1:39 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-16 22:43 [BUG] checkpatch: false positive for commits with quote characters Brian Norris
2015-11-17 17:48 ` Joe Perches
2015-11-17 18:03 ` Brian Norris
2015-12-04 0:13 ` Brian Norris
2015-12-04 0:29 ` Joe Perches
2015-12-04 0:36 ` Joe Perches
2015-12-04 1:39 ` Brian Norris [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151204013948.GB116589@google.com \
--to=computersforpeace@gmail.com \
--cc=apw@canonical.com \
--cc=joe@perches.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.