qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] scripts/checkpatch: fix SPDX-License-Identifier detection
@ 2025-05-08 17:00 Daniel P. Berrangé
  2025-05-08 17:00 ` [PATCH 1/2] Revert "scripts: mandate that new files have SPDX-License-Identifier" Daniel P. Berrangé
  2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
  0 siblings, 2 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2025-05-08 17:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Daniel P. Berrangé



Daniel P. Berrangé (2):
  Revert "scripts: mandate that new files have SPDX-License-Identifier"
  scripts/checkpatch: reimplement SPDX-License-Identifier detection

 scripts/checkpatch.pl | 54 ++++++++++++++++++++++++++++++-------------
 1 file changed, 38 insertions(+), 16 deletions(-)

-- 
2.49.0



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] Revert "scripts: mandate that new files have SPDX-License-Identifier"
  2025-05-08 17:00 [PATCH 0/2] scripts/checkpatch: fix SPDX-License-Identifier detection Daniel P. Berrangé
@ 2025-05-08 17:00 ` Daniel P. Berrangé
  2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
  1 sibling, 0 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2025-05-08 17:00 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Daniel P. Berrangé

This reverts commit fa4d79c64dae03ffa269e42e21822453856618b7.

The logic in this commit was flawed in two critical ways

 * It always failed to report SPDX validation on the last newly
   added file. IOW, it only worked if at least 2 new files were
   added in a commit

 * If an existing file change, followed a new file change, in
   the commit and the existing file context/changed lines
   included SPDX-License-Identifier, it would incorrectly
   associate this with the previous newly added file.

Simply reverting this commit will make it significantly easier to
understand the improved logic in the following commit.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 scripts/checkpatch.pl | 30 ------------------------------
 1 file changed, 30 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 365892de04..d355c0dad5 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1442,8 +1442,6 @@ sub process {
 	my $in_imported_file = 0;
 	my $in_no_imported_file = 0;
 	my $non_utf8_charset = 0;
-	my $expect_spdx = 0;
-	my $expect_spdx_file;
 
 	our @report = ();
 	our $cnt_lines = 0;
@@ -1681,34 +1679,6 @@ sub process {
 			WARN("added, moved or deleted file(s), does MAINTAINERS need updating?\n" . $herecurr);
 		}
 
-# All new files should have a SPDX-License-Identifier tag
-		if ($line =~ /^new file mode\s*\d+\s*$/) {
-		    if ($expect_spdx) {
-			if ($expect_spdx_file =~
-			    /\.(c|h|py|pl|sh|json|inc|Makefile)$/) {
-			    # source code files MUST have SPDX license declared
-			    ERROR("New file '$expect_spdx_file' requires " .
-				  "'SPDX-License-Identifier'");
-			} else {
-			    # Other files MAY have SPDX license if appropriate
-			    WARN("Does new file '$expect_spdx_file' need " .
-				 "'SPDX-License-Identifier'?");
-			}
-		    }
-		    $expect_spdx = 1;
-		    $expect_spdx_file = undef;
-		} elsif ($expect_spdx) {
-		    $expect_spdx_file = $realfile unless
-			defined $expect_spdx_file;
-
-		    # SPDX tags may occurr in comments which were
-		    # stripped from '$line', so use '$rawline'
-		    if ($rawline =~ /SPDX-License-Identifier/) {
-			$expect_spdx = 0;
-			$expect_spdx_file = undef;
-		    }
-		}
-
 # Check SPDX-License-Identifier references a permitted license
 		if ($rawline =~ m,SPDX-License-Identifier: (.*?)(\*/)?\s*$,) {
 		    &checkspdx($realfile, $1);
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection
  2025-05-08 17:00 [PATCH 0/2] scripts/checkpatch: fix SPDX-License-Identifier detection Daniel P. Berrangé
  2025-05-08 17:00 ` [PATCH 1/2] Revert "scripts: mandate that new files have SPDX-License-Identifier" Daniel P. Berrangé
@ 2025-05-08 17:01 ` Daniel P. Berrangé
  2025-05-09 13:01   ` Peter Maydell
  2025-05-09 13:07   ` Peter Maydell
  1 sibling, 2 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2025-05-08 17:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Daniel P. Berrangé

The new attempt at detecting missing SPDX-License-Identifier in
new files is using the following logic

 * When seeing a line starting 'diff --git ...' it indicates
   the start of a file in the patch. This must trigger reporting
   of violations in the previous file (if any).

   It must reset the validation state, since this may now be a
   pre-existing file being changed. This will be resolved by
   the next rule.

 * When seeing a line starting 'new file mode...' it indicates
   a newly created file and must enable SPDX validation.

 * When seeing EOF, it must trigger reporting of violations in
   the last new file in the patch, if any.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 scripts/checkpatch.pl | 56 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 54 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index d355c0dad5..5da0f85e08 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1353,7 +1353,22 @@ sub checkfilename {
 	}
 }
 
-sub checkspdx {
+sub check_spdx_present {
+    my $expect_spdx_file = shift;
+
+    if ($expect_spdx_file =~
+	/\.(c|h|py|pl|sh|json|inc|Makefile)$/) {
+	# source code files MUST have SPDX license declared
+	ERROR("New file '$expect_spdx_file' requires " .
+	      "'SPDX-License-Identifier'");
+    } else {
+	# Other files MAY have SPDX license if appropriate
+	WARN("Does new file '$expect_spdx_file' need " .
+	     "'SPDX-License-Identifier'?");
+    }
+}
+
+sub check_spdx_expression {
     my ($file, $expr) = @_;
 
     # Imported Linux headers probably have SPDX tags, but if they
@@ -1442,6 +1457,8 @@ sub process {
 	my $in_imported_file = 0;
 	my $in_no_imported_file = 0;
 	my $non_utf8_charset = 0;
+	my $expect_spdx = 0;
+	my $expect_spdx_file;
 
 	our @report = ();
 	our $cnt_lines = 0;
@@ -1679,9 +1696,38 @@ sub process {
 			WARN("added, moved or deleted file(s), does MAINTAINERS need updating?\n" . $herecurr);
 		}
 
+# All new files should have a SPDX-License-Identifier tag
+		if ($line =~ /^diff --git/) {
+		    # Start of file diff marker, report last file if it failed
+		    # SPDX validation
+		    if (defined $expect_spdx_file) {
+			&check_spdx_present($expect_spdx_file);
+		    }
+
+		    # Reset state ready to find new file
+		    $expect_spdx = 0;
+		    $expect_spdx_file = undef;
+		} elsif ($line =~ /^new file mode\s*\d+\s*$/) {
+		    # This diff block is a new file, so we must
+		    # mandate a SPDX tag
+		    $expect_spdx = 1;
+		} elsif ($expect_spdx) {
+		    # Capture filename if don't already have it
+		    $expect_spdx_file = $realfile unless
+			defined $expect_spdx_file;
+
+		    # SPDX tags may occurr in comments which were
+		    # stripped from '$line', so use '$rawline'. If
+		    # we see one we pass validation
+		    if ($rawline =~ /SPDX-License-Identifier/) {
+			$expect_spdx = 0;
+			$expect_spdx_file = undef;
+		    }
+		}
+
 # Check SPDX-License-Identifier references a permitted license
 		if ($rawline =~ m,SPDX-License-Identifier: (.*?)(\*/)?\s*$,) {
-		    &checkspdx($realfile, $1);
+		    &check_spdx_expression($realfile, $1);
 		}
 
 		if ($rawline =~ m,(SPDX-[a-zA-Z0-9-_]+):,) {
@@ -3213,6 +3259,12 @@ sub process {
 		}
 	}
 
+	# End of diff, report last file block if it failed
+	# SPDX validation
+	if (defined $expect_spdx_file) {
+	    &check_spdx_present($expect_spdx_file);
+	}
+
 	if ($is_patch && $chk_signoff && $signoff == 0) {
 		ERROR("Missing Signed-off-by: line(s)\n");
 	}
-- 
2.49.0



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection
  2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
@ 2025-05-09 13:01   ` Peter Maydell
  2025-05-09 14:17     ` Daniel P. Berrangé
  2025-05-09 13:07   ` Peter Maydell
  1 sibling, 1 reply; 6+ messages in thread
From: Peter Maydell @ 2025-05-09 13:01 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel

On Thu, 8 May 2025 at 18:01, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> The new attempt at detecting missing SPDX-License-Identifier in
> new files is using the following logic
>
>  * When seeing a line starting 'diff --git ...' it indicates
>    the start of a file in the patch. This must trigger reporting
>    of violations in the previous file (if any).
>
>    It must reset the validation state, since this may now be a
>    pre-existing file being changed. This will be resolved by
>    the next rule.
>
>  * When seeing a line starting 'new file mode...' it indicates
>    a newly created file and must enable SPDX validation.
>
>  * When seeing EOF, it must trigger reporting of violations in
>    the last new file in the patch, if any.

> @@ -1442,6 +1457,8 @@ sub process {
>         my $in_imported_file = 0;
>         my $in_no_imported_file = 0;
>         my $non_utf8_charset = 0;
> +       my $expect_spdx = 0;
> +       my $expect_spdx_file;
>
>         our @report = ();
>         our $cnt_lines = 0;
> @@ -1679,9 +1696,38 @@ sub process {
>                         WARN("added, moved or deleted file(s), does MAINTAINERS need updating?\n" . $herecurr);
>                 }
>
> +# All new files should have a SPDX-License-Identifier tag
> +               if ($line =~ /^diff --git/) {
> +                   # Start of file diff marker, report last file if it failed
> +                   # SPDX validation
> +                   if (defined $expect_spdx_file) {
> +                       &check_spdx_present($expect_spdx_file);
> +                   }
> +
> +                   # Reset state ready to find new file
> +                   $expect_spdx = 0;
> +                   $expect_spdx_file = undef;


We already have a point in the code where we say "ah, this looks
like the start of a new file" (under the comment "extract the
filename as it passes"), and it looks for two possible regexes,
not just "diff --git". Maybe we should combine these so that
we have something like

                if ($line =~ /^diff --git.*?(\S+)$/) {
                        handle_end_of_file($realfile) if $realfile ne '';
                        $realfile = $1;
                        $realfile =~ s@^([^/]*)/@@ if (!$file);
                        handle_start_of_file($realfile);
                } elsif ($line =~ /^\+\+\+\s+(\S+)/) {
                        handle_end_of_file($realfile) if $realfile ne '';
                        $realfile = $1;
                        $realfile =~ s@^([^/]*)/@@ if (!$file);
                        handle_start_of_file($realfile);

                        $p1_prefix = $1;
                        if (!$file && $tree && $p1_prefix ne '' &&
                            -e "$root/$p1_prefix") {
                                WARN("patch prefix '$p1_prefix'
exists, appears to be a -p0 patch\n");
                        }

                        next;

(side note: seems odd that we have 'next' here but not in the
previous half of this if()...)

                }

and move checkfilename() to inside handle_start_of_file(),
and have the spdx check handling done inside
handle_start_of_file() and handle_end_of_file() ?

> +       # End of diff, report last file block if it failed

and we would call
          handle_end_of_file($realfile) if $realfile ne '';

here too.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection
  2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
  2025-05-09 13:01   ` Peter Maydell
@ 2025-05-09 13:07   ` Peter Maydell
  1 sibling, 0 replies; 6+ messages in thread
From: Peter Maydell @ 2025-05-09 13:07 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel

On Thu, 8 May 2025 at 18:01, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> The new attempt at detecting missing SPDX-License-Identifier in
> new files is using the following logic
>
>  * When seeing a line starting 'diff --git ...' it indicates
>    the start of a file in the patch. This must trigger reporting
>    of violations in the previous file (if any).
>
>    It must reset the validation state, since this may now be a
>    pre-existing file being changed. This will be resolved by
>    the next rule.
>
>  * When seeing a line starting 'new file mode...' it indicates
>    a newly created file and must enable SPDX validation.
>
>  * When seeing EOF, it must trigger reporting of violations in
>    the last new file in the patch, if any.


> +# All new files should have a SPDX-License-Identifier tag
> +               if ($line =~ /^diff --git/) {
> +                   # Start of file diff marker, report last file if it failed
> +                   # SPDX validation
> +                   if (defined $expect_spdx_file) {
> +                       &check_spdx_present($expect_spdx_file);
> +                   }
> +

Forgot to mention, but our perl coding style in this
file seems to be generally not to use the & sigil
when making function calls.

-- PMM


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection
  2025-05-09 13:01   ` Peter Maydell
@ 2025-05-09 14:17     ` Daniel P. Berrangé
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2025-05-09 14:17 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel

On Fri, May 09, 2025 at 02:01:32PM +0100, Peter Maydell wrote:
> On Thu, 8 May 2025 at 18:01, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > The new attempt at detecting missing SPDX-License-Identifier in
> > new files is using the following logic
> >
> >  * When seeing a line starting 'diff --git ...' it indicates
> >    the start of a file in the patch. This must trigger reporting
> >    of violations in the previous file (if any).
> >
> >    It must reset the validation state, since this may now be a
> >    pre-existing file being changed. This will be resolved by
> >    the next rule.
> >
> >  * When seeing a line starting 'new file mode...' it indicates
> >    a newly created file and must enable SPDX validation.
> >
> >  * When seeing EOF, it must trigger reporting of violations in
> >    the last new file in the patch, if any.
> 
> > @@ -1442,6 +1457,8 @@ sub process {
> >         my $in_imported_file = 0;
> >         my $in_no_imported_file = 0;
> >         my $non_utf8_charset = 0;
> > +       my $expect_spdx = 0;
> > +       my $expect_spdx_file;
> >
> >         our @report = ();
> >         our $cnt_lines = 0;
> > @@ -1679,9 +1696,38 @@ sub process {
> >                         WARN("added, moved or deleted file(s), does MAINTAINERS need updating?\n" . $herecurr);
> >                 }
> >
> > +# All new files should have a SPDX-License-Identifier tag
> > +               if ($line =~ /^diff --git/) {
> > +                   # Start of file diff marker, report last file if it failed
> > +                   # SPDX validation
> > +                   if (defined $expect_spdx_file) {
> > +                       &check_spdx_present($expect_spdx_file);
> > +                   }
> > +
> > +                   # Reset state ready to find new file
> > +                   $expect_spdx = 0;
> > +                   $expect_spdx_file = undef;
> 
> 
> We already have a point in the code where we say "ah, this looks
> like the start of a new file" (under the comment "extract the
> filename as it passes"), and it looks for two possible regexes,
> not just "diff --git". Maybe we should combine these so that
> we have something like
> 
>                 if ($line =~ /^diff --git.*?(\S+)$/) {
>                         handle_end_of_file($realfile) if $realfile ne '';
>                         $realfile = $1;
>                         $realfile =~ s@^([^/]*)/@@ if (!$file);
>                         handle_start_of_file($realfile);
>                 } elsif ($line =~ /^\+\+\+\s+(\S+)/) {
>                         handle_end_of_file($realfile) if $realfile ne '';
>                         $realfile = $1;
>                         $realfile =~ s@^([^/]*)/@@ if (!$file);
>                         handle_start_of_file($realfile);
> 
>                         $p1_prefix = $1;
>                         if (!$file && $tree && $p1_prefix ne '' &&
>                             -e "$root/$p1_prefix") {
>                                 WARN("patch prefix '$p1_prefix'
> exists, appears to be a -p0 patch\n");
>                         }
> 
>                         next;
> 
> (side note: seems odd that we have 'next' here but not in the
> previous half of this if()...)
> 
>                 }
> 
> and move checkfilename() to inside handle_start_of_file(),
> and have the spdx check handling done inside
> handle_start_of_file() and handle_end_of_file() ?
> 
> > +       # End of diff, report last file block if it failed
> 
> and we would call
>           handle_end_of_file($realfile) if $realfile ne '';

Yeah, having standlone handle_start_of_file/handle_end_of_file
methods would make it easier to understand what's going on, as
this method with all the check rules is insanely long and hard
to follow.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-05-09 14:17 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-08 17:00 [PATCH 0/2] scripts/checkpatch: fix SPDX-License-Identifier detection Daniel P. Berrangé
2025-05-08 17:00 ` [PATCH 1/2] Revert "scripts: mandate that new files have SPDX-License-Identifier" Daniel P. Berrangé
2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
2025-05-09 13:01   ` Peter Maydell
2025-05-09 14:17     ` Daniel P. Berrangé
2025-05-09 13:07   ` Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).