public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance
@ 2023-12-14 18:15 Carlos Llamas
  2023-12-14 18:15 ` [PATCH 5.10 1/2] checkpatch: add new exception to repeated word check Carlos Llamas
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Carlos Llamas @ 2023-12-14 18:15 UTC (permalink / raw)
  To: stable; +Cc: kernel-team, Carlos Llamas

The checkpatch.pl in v5.10.y still triggers lots of false positives for
REPEATED_WORD warnings, particularly for commit logs. Can we please
backport these two fixes?

Aditya Srivastava (1):
  checkpatch: fix false positives in REPEATED_WORD warning

Dwaipayan Ray (1):
  checkpatch: add new exception to repeated word check

 scripts/checkpatch.pl | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)


base-commit: b50306f77190155d2c14a72be5d2e02254d17dbd
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 5.10 1/2] checkpatch: add new exception to repeated word check
  2023-12-14 18:15 [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance Carlos Llamas
@ 2023-12-14 18:15 ` Carlos Llamas
  2023-12-14 18:31   ` Joe Perches
  2023-12-14 18:15 ` [PATCH 5.10 2/2] checkpatch: fix false positives in REPEATED_WORD warning Carlos Llamas
  2023-12-14 18:23 ` [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance Greg KH
  2 siblings, 1 reply; 9+ messages in thread
From: Carlos Llamas @ 2023-12-14 18:15 UTC (permalink / raw)
  To: stable, Andy Whitcroft, Joe Perches
  Cc: kernel-team, Dwaipayan Ray, Lukas Bulwahn, Aditya Srivastava,
	Andrew Morton, Linus Torvalds, Carlos Llamas

From: Dwaipayan Ray <dwaipayanray1@gmail.com>

commit 1db81a682a2f2a664489c4e94f3b945f70a43a13 upstream.

Recently, commit 4f6ad8aa1eac ("checkpatch: move repeated word test")
moved the repeated word test to check for more file types. But after
this, if checkpatch.pl is run on MAINTAINERS, it generates several
new warnings of the type:

  WARNING: Possible repeated word: 'git'

For example:

  WARNING: Possible repeated word: 'git'
  +T:	git git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml.git

So, the pattern "git git://..." is a false positive in this case.

There are several other combinations which may produce a wrong warning
message, such as "@size size", ":Begin begin", etc.

Extend repeated word check to compare the characters before and after
the word matches.

If there is a non whitespace character before the first word or a non
whitespace character excluding punctuation characters after the second
word, then the check is skipped and the warning is avoided.

Also add case insensitive word matching to the repeated word check.

Link: https://lore.kernel.org/linux-kernel-mentees/81b6a0bb2c7b9256361573f7a13201ebcd4876f1.camel@perches.com/
Link: https://lkml.kernel.org/r/20201017162732.152351-1-dwaipayanray1@gmail.com
Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
Suggested-by: Joe Perches <joe@perches.com>
Suggested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Joe Perches <joe@perches.com>
Cc: Aditya Srivastava <yashsri421@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 scripts/checkpatch.pl | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 0ad235ee96f9..a83e5f0088bb 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3050,19 +3050,30 @@ sub process {
 
 # check for repeated words separated by a single space
 		if ($rawline =~ /^\+/ || $in_commit_log) {
+			pos($rawline) = 1 if (!$in_commit_log);
 			while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
 
 				my $first = $1;
 				my $second = $2;
-
+				my $start_pos = $-[1];
+				my $end_pos = $+[2];
 				if ($first =~ /(?:struct|union|enum)/) {
 					pos($rawline) += length($first) + length($second) + 1;
 					next;
 				}
 
-				next if ($first ne $second);
+				next if (lc($first) ne lc($second));
 				next if ($first eq 'long');
 
+				# check for character before and after the word matches
+				my $start_char = '';
+				my $end_char = '';
+				$start_char = substr($rawline, $start_pos - 1, 1) if ($start_pos > ($in_commit_log ? 0 : 1));
+				$end_char = substr($rawline, $end_pos, 1) if ($end_pos < length($rawline));
+
+				next if ($start_char =~ /^\S$/);
+				next if (index(" \t.,;?!", $end_char) == -1);
+
 				if (WARN("REPEATED_WORD",
 					 "Possible repeated word: '$first'\n" . $herecurr) &&
 				    $fix) {
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 5.10 2/2] checkpatch: fix false positives in REPEATED_WORD warning
  2023-12-14 18:15 [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance Carlos Llamas
  2023-12-14 18:15 ` [PATCH 5.10 1/2] checkpatch: add new exception to repeated word check Carlos Llamas
@ 2023-12-14 18:15 ` Carlos Llamas
  2023-12-14 18:23 ` [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance Greg KH
  2 siblings, 0 replies; 9+ messages in thread
From: Carlos Llamas @ 2023-12-14 18:15 UTC (permalink / raw)
  To: stable, Andy Whitcroft, Joe Perches
  Cc: kernel-team, Aditya Srivastava, Dwaipayan Ray, Lukas Bulwahn,
	Andrew Morton, Linus Torvalds, Carlos Llamas

From: Aditya Srivastava <yashsri421@gmail.com>

commit 8d0325cc74a31d517b5b4307c8d895c6e81076b7 upstream.

Presence of hexadecimal address or symbol results in false warning
message by checkpatch.pl.

For example, running checkpatch on commit b8ad540dd4e4 ("mptcp: fix
memory leak in mptcp_subflow_create_socket()") results in warning:

  WARNING:REPEATED_WORD: Possible repeated word: 'ff'
      00 00 00 00 00 00 00 00 00 2f 30 0a 81 88 ff ff  ........./0.....

Similarly, the presence of list command output in commit results in
an unnecessary warning.

For example, running checkpatch on commit 899e5ffbf246 ("perf record:
Introduce --switch-output-event") gives:

  WARNING:REPEATED_WORD: Possible repeated word: 'root'
    dr-xr-x---. 12 root root    4096 Apr 27 17:46 ..

Here, it reports 'ff' and 'root' to be repeated, but it is in fact part
of some address or code, where it has to be repeated.

In these cases, the intent of the warning to find stylistic issues in
commit messages is not met and the warning is just completely wrong in
this case.

To avoid these warnings, add an additional regex check for the directory
permission pattern and avoid checking the line for this class of
warning.  Similarly, to avoid hex pattern, check if the word consists of
hex symbols and skip this warning if it is not among the common english
words formed using hex letters.

A quick evaluation on v5.6..v5.8 showed that this fix reduces
REPEATED_WORD warnings by the frequency of 1890.

A quick manual check found all cases are related to hex output or list
command outputs in commit messages.

Link: https://lkml.kernel.org/r/20201024102253.13614-1-yashsri421@gmail.com
Signed-off-by: Aditya Srivastava <yashsri421@gmail.com>
Acked-by: Joe Perches <joe@perches.com>
Cc: Dwaipayan Ray <dwaipayanray1@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 scripts/checkpatch.pl | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index a83e5f0088bb..c2704af497ba 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -853,6 +853,13 @@ our $declaration_macros = qr{(?x:
 	(?:SKCIPHER_REQUEST|SHASH_DESC|AHASH_REQUEST)_ON_STACK\s*\(
 )};
 
+our %allow_repeated_words = (
+	add => '',
+	added => '',
+	bad => '',
+	be => '',
+);
+
 sub deparenthesize {
 	my ($string) = @_;
 	return "" if (!defined($string));
@@ -3049,7 +3056,9 @@ sub process {
 		}
 
 # check for repeated words separated by a single space
-		if ($rawline =~ /^\+/ || $in_commit_log) {
+# avoid false positive from list command eg, '-rw-r--r-- 1 root root'
+		if (($rawline =~ /^\+/ || $in_commit_log) &&
+		    $rawline !~ /[bcCdDlMnpPs\?-][rwxsStT-]{9}/) {
 			pos($rawline) = 1 if (!$in_commit_log);
 			while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
 
@@ -3074,6 +3083,11 @@ sub process {
 				next if ($start_char =~ /^\S$/);
 				next if (index(" \t.,;?!", $end_char) == -1);
 
+                                # avoid repeating hex occurrences like 'ff ff fe 09 ...'
+                                if ($first =~ /\b[0-9a-f]{2,}\b/i) {
+                                        next if (!exists($allow_repeated_words{lc($first)}));
+                                }
+
 				if (WARN("REPEATED_WORD",
 					 "Possible repeated word: '$first'\n" . $herecurr) &&
 				    $fix) {
-- 
2.43.0.472.g3155946c3a-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance
  2023-12-14 18:15 [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance Carlos Llamas
  2023-12-14 18:15 ` [PATCH 5.10 1/2] checkpatch: add new exception to repeated word check Carlos Llamas
  2023-12-14 18:15 ` [PATCH 5.10 2/2] checkpatch: fix false positives in REPEATED_WORD warning Carlos Llamas
@ 2023-12-14 18:23 ` Greg KH
  2023-12-14 18:37   ` Carlos Llamas
  2 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2023-12-14 18:23 UTC (permalink / raw)
  To: Carlos Llamas; +Cc: stable, kernel-team

On Thu, Dec 14, 2023 at 06:15:02PM +0000, Carlos Llamas wrote:
> The checkpatch.pl in v5.10.y still triggers lots of false positives for
> REPEATED_WORD warnings, particularly for commit logs. Can we please
> backport these two fixes?

Why is older versions of checkpatch being used?  Why not always use the
latest version, much like perf is handled?

No new code should be written against older kernels, so who is using
this old tool?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 5.10 1/2] checkpatch: add new exception to repeated word check
  2023-12-14 18:15 ` [PATCH 5.10 1/2] checkpatch: add new exception to repeated word check Carlos Llamas
@ 2023-12-14 18:31   ` Joe Perches
  2023-12-14 18:52     ` Carlos Llamas
  0 siblings, 1 reply; 9+ messages in thread
From: Joe Perches @ 2023-12-14 18:31 UTC (permalink / raw)
  To: Carlos Llamas, stable, Andy Whitcroft
  Cc: kernel-team, Dwaipayan Ray, Lukas Bulwahn, Aditya Srivastava,
	Andrew Morton, Linus Torvalds

On Thu, 2023-12-14 at 18:15 +0000, Carlos Llamas wrote:
> From: Dwaipayan Ray <dwaipayanray1@gmail.com>
> 
> commit 1db81a682a2f2a664489c4e94f3b945f70a43a13 upstream.
> 
> Recently, commit 4f6ad8aa1eac ("checkpatch: move repeated word test")
> moved the repeated word test to check for more file types. But after
> this, if checkpatch.pl is run on MAINTAINERS, it generates several
> new warnings of the type:
> 
>   WARNING: Possible repeated word: 'git'

Why should this be backported?  Who cares?


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance
  2023-12-14 18:23 ` [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance Greg KH
@ 2023-12-14 18:37   ` Carlos Llamas
  2023-12-15  7:15     ` Greg KH
  0 siblings, 1 reply; 9+ messages in thread
From: Carlos Llamas @ 2023-12-14 18:37 UTC (permalink / raw)
  To: Greg KH; +Cc: stable, kernel-team

On Thu, Dec 14, 2023 at 07:23:28PM +0100, Greg KH wrote:
> On Thu, Dec 14, 2023 at 06:15:02PM +0000, Carlos Llamas wrote:
> > The checkpatch.pl in v5.10.y still triggers lots of false positives for
> > REPEATED_WORD warnings, particularly for commit logs. Can we please
> > backport these two fixes?
> 
> Why is older versions of checkpatch being used?  Why not always use the
> latest version, much like perf is handled?
> 
> No new code should be written against older kernels, so who is using
> this old tool?

This is a minor annoyance when working directly with the v5.10 stable
tree and doing e.g ./scripts/checkpatch.pl -g HEAD. I suppose it makes
sense to always prefer the top-of-tree scripts. However, this could be
inconvenient for some scenarios were master needs to be pulled
separately.

--
Carlos Llamas

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 5.10 1/2] checkpatch: add new exception to repeated word check
  2023-12-14 18:31   ` Joe Perches
@ 2023-12-14 18:52     ` Carlos Llamas
  0 siblings, 0 replies; 9+ messages in thread
From: Carlos Llamas @ 2023-12-14 18:52 UTC (permalink / raw)
  To: Joe Perches
  Cc: stable, Andy Whitcroft, kernel-team, Dwaipayan Ray, Lukas Bulwahn,
	Aditya Srivastava, Andrew Morton, Linus Torvalds

On Thu, Dec 14, 2023 at 10:31:02AM -0800, Joe Perches wrote:
> On Thu, 2023-12-14 at 18:15 +0000, Carlos Llamas wrote:
> > From: Dwaipayan Ray <dwaipayanray1@gmail.com>
> > 
> > commit 1db81a682a2f2a664489c4e94f3b945f70a43a13 upstream.
> > 
> > Recently, commit 4f6ad8aa1eac ("checkpatch: move repeated word test")
> > moved the repeated word test to check for more file types. But after
> > this, if checkpatch.pl is run on MAINTAINERS, it generates several
> > new warnings of the type:
> > 
> >   WARNING: Possible repeated word: 'git'
> 
> Why should this be backported?  Who cares?
> 

Our CI for stable cares. However, it seems it would be best to run the
version directly from master instead. I'll try that. Sorry for the
noise.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance
  2023-12-14 18:37   ` Carlos Llamas
@ 2023-12-15  7:15     ` Greg KH
  2023-12-15 16:46       ` Carlos Llamas
  0 siblings, 1 reply; 9+ messages in thread
From: Greg KH @ 2023-12-15  7:15 UTC (permalink / raw)
  To: Carlos Llamas; +Cc: stable, kernel-team

On Thu, Dec 14, 2023 at 06:37:43PM +0000, Carlos Llamas wrote:
> On Thu, Dec 14, 2023 at 07:23:28PM +0100, Greg KH wrote:
> > On Thu, Dec 14, 2023 at 06:15:02PM +0000, Carlos Llamas wrote:
> > > The checkpatch.pl in v5.10.y still triggers lots of false positives for
> > > REPEATED_WORD warnings, particularly for commit logs. Can we please
> > > backport these two fixes?
> > 
> > Why is older versions of checkpatch being used?  Why not always use the
> > latest version, much like perf is handled?
> > 
> > No new code should be written against older kernels, so who is using
> > this old tool?
> 
> This is a minor annoyance when working directly with the v5.10 stable
> tree and doing e.g ./scripts/checkpatch.pl -g HEAD. I suppose it makes
> sense to always prefer the top-of-tree scripts. However, this could be
> inconvenient for some scenarios were master needs to be pulled
> separately.

It makes more sense to use the newer version of the tool, especially as
you are probably having it review backports of newer patches, which
obviously, should follow the newer checkpatch settings, not the older
ones :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance
  2023-12-15  7:15     ` Greg KH
@ 2023-12-15 16:46       ` Carlos Llamas
  0 siblings, 0 replies; 9+ messages in thread
From: Carlos Llamas @ 2023-12-15 16:46 UTC (permalink / raw)
  To: Greg KH; +Cc: stable, kernel-team

On Fri, Dec 15, 2023 at 08:15:01AM +0100, Greg KH wrote:
> On Thu, Dec 14, 2023 at 06:37:43PM +0000, Carlos Llamas wrote:
> > On Thu, Dec 14, 2023 at 07:23:28PM +0100, Greg KH wrote:
> > > On Thu, Dec 14, 2023 at 06:15:02PM +0000, Carlos Llamas wrote:
> > > > The checkpatch.pl in v5.10.y still triggers lots of false positives for
> > > > REPEATED_WORD warnings, particularly for commit logs. Can we please
> > > > backport these two fixes?
> > > 
> > > Why is older versions of checkpatch being used?  Why not always use the
> > > latest version, much like perf is handled?
> > > 
> > > No new code should be written against older kernels, so who is using
> > > this old tool?
> > 
> > This is a minor annoyance when working directly with the v5.10 stable
> > tree and doing e.g ./scripts/checkpatch.pl -g HEAD. I suppose it makes
> > sense to always prefer the top-of-tree scripts. However, this could be
> > inconvenient for some scenarios were master needs to be pulled
> > separately.
> 
> It makes more sense to use the newer version of the tool, especially as
> you are probably having it review backports of newer patches, which
> obviously, should follow the newer checkpatch settings, not the older
> ones :)

Yes, that is the use-case we have. We'll switch to the latest version so
please ignore these patches then. Sorry for the noise.

--
Carlos Llamas

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-12-15 16:46 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-14 18:15 [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance Carlos Llamas
2023-12-14 18:15 ` [PATCH 5.10 1/2] checkpatch: add new exception to repeated word check Carlos Llamas
2023-12-14 18:31   ` Joe Perches
2023-12-14 18:52     ` Carlos Llamas
2023-12-14 18:15 ` [PATCH 5.10 2/2] checkpatch: fix false positives in REPEATED_WORD warning Carlos Llamas
2023-12-14 18:23 ` [PATCH 5.10 0/2] checkpatch: fix repeated word annoyance Greg KH
2023-12-14 18:37   ` Carlos Llamas
2023-12-15  7:15     ` Greg KH
2023-12-15 16:46       ` Carlos Llamas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox