* [PATCH 0/2] scripts/checkpatch: fix SPDX-License-Identifier detection
@ 2025-05-08 17:00 Daniel P. Berrangé
2025-05-08 17:00 ` [PATCH 1/2] Revert "scripts: mandate that new files have SPDX-License-Identifier" Daniel P. Berrangé
2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
0 siblings, 2 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2025-05-08 17:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Peter Maydell, Daniel P. Berrangé
Daniel P. Berrangé (2):
Revert "scripts: mandate that new files have SPDX-License-Identifier"
scripts/checkpatch: reimplement SPDX-License-Identifier detection
scripts/checkpatch.pl | 54 ++++++++++++++++++++++++++++++-------------
1 file changed, 38 insertions(+), 16 deletions(-)
--
2.49.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] Revert "scripts: mandate that new files have SPDX-License-Identifier"
2025-05-08 17:00 [PATCH 0/2] scripts/checkpatch: fix SPDX-License-Identifier detection Daniel P. Berrangé
@ 2025-05-08 17:00 ` Daniel P. Berrangé
2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
1 sibling, 0 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2025-05-08 17:00 UTC (permalink / raw)
To: qemu-devel; +Cc: Peter Maydell, Daniel P. Berrangé
This reverts commit fa4d79c64dae03ffa269e42e21822453856618b7.
The logic in this commit was flawed in two critical ways
* It always failed to report SPDX validation on the last newly
added file. IOW, it only worked if at least 2 new files were
added in a commit
* If an existing file change, followed a new file change, in
the commit and the existing file context/changed lines
included SPDX-License-Identifier, it would incorrectly
associate this with the previous newly added file.
Simply reverting this commit will make it significantly easier to
understand the improved logic in the following commit.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
scripts/checkpatch.pl | 30 ------------------------------
1 file changed, 30 deletions(-)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 365892de04..d355c0dad5 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1442,8 +1442,6 @@ sub process {
my $in_imported_file = 0;
my $in_no_imported_file = 0;
my $non_utf8_charset = 0;
- my $expect_spdx = 0;
- my $expect_spdx_file;
our @report = ();
our $cnt_lines = 0;
@@ -1681,34 +1679,6 @@ sub process {
WARN("added, moved or deleted file(s), does MAINTAINERS need updating?\n" . $herecurr);
}
-# All new files should have a SPDX-License-Identifier tag
- if ($line =~ /^new file mode\s*\d+\s*$/) {
- if ($expect_spdx) {
- if ($expect_spdx_file =~
- /\.(c|h|py|pl|sh|json|inc|Makefile)$/) {
- # source code files MUST have SPDX license declared
- ERROR("New file '$expect_spdx_file' requires " .
- "'SPDX-License-Identifier'");
- } else {
- # Other files MAY have SPDX license if appropriate
- WARN("Does new file '$expect_spdx_file' need " .
- "'SPDX-License-Identifier'?");
- }
- }
- $expect_spdx = 1;
- $expect_spdx_file = undef;
- } elsif ($expect_spdx) {
- $expect_spdx_file = $realfile unless
- defined $expect_spdx_file;
-
- # SPDX tags may occurr in comments which were
- # stripped from '$line', so use '$rawline'
- if ($rawline =~ /SPDX-License-Identifier/) {
- $expect_spdx = 0;
- $expect_spdx_file = undef;
- }
- }
-
# Check SPDX-License-Identifier references a permitted license
if ($rawline =~ m,SPDX-License-Identifier: (.*?)(\*/)?\s*$,) {
&checkspdx($realfile, $1);
--
2.49.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection
2025-05-08 17:00 [PATCH 0/2] scripts/checkpatch: fix SPDX-License-Identifier detection Daniel P. Berrangé
2025-05-08 17:00 ` [PATCH 1/2] Revert "scripts: mandate that new files have SPDX-License-Identifier" Daniel P. Berrangé
@ 2025-05-08 17:01 ` Daniel P. Berrangé
2025-05-09 13:01 ` Peter Maydell
2025-05-09 13:07 ` Peter Maydell
1 sibling, 2 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2025-05-08 17:01 UTC (permalink / raw)
To: qemu-devel; +Cc: Peter Maydell, Daniel P. Berrangé
The new attempt at detecting missing SPDX-License-Identifier in
new files is using the following logic
* When seeing a line starting 'diff --git ...' it indicates
the start of a file in the patch. This must trigger reporting
of violations in the previous file (if any).
It must reset the validation state, since this may now be a
pre-existing file being changed. This will be resolved by
the next rule.
* When seeing a line starting 'new file mode...' it indicates
a newly created file and must enable SPDX validation.
* When seeing EOF, it must trigger reporting of violations in
the last new file in the patch, if any.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
scripts/checkpatch.pl | 56 +++++++++++++++++++++++++++++++++++++++++--
1 file changed, 54 insertions(+), 2 deletions(-)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index d355c0dad5..5da0f85e08 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1353,7 +1353,22 @@ sub checkfilename {
}
}
-sub checkspdx {
+sub check_spdx_present {
+ my $expect_spdx_file = shift;
+
+ if ($expect_spdx_file =~
+ /\.(c|h|py|pl|sh|json|inc|Makefile)$/) {
+ # source code files MUST have SPDX license declared
+ ERROR("New file '$expect_spdx_file' requires " .
+ "'SPDX-License-Identifier'");
+ } else {
+ # Other files MAY have SPDX license if appropriate
+ WARN("Does new file '$expect_spdx_file' need " .
+ "'SPDX-License-Identifier'?");
+ }
+}
+
+sub check_spdx_expression {
my ($file, $expr) = @_;
# Imported Linux headers probably have SPDX tags, but if they
@@ -1442,6 +1457,8 @@ sub process {
my $in_imported_file = 0;
my $in_no_imported_file = 0;
my $non_utf8_charset = 0;
+ my $expect_spdx = 0;
+ my $expect_spdx_file;
our @report = ();
our $cnt_lines = 0;
@@ -1679,9 +1696,38 @@ sub process {
WARN("added, moved or deleted file(s), does MAINTAINERS need updating?\n" . $herecurr);
}
+# All new files should have a SPDX-License-Identifier tag
+ if ($line =~ /^diff --git/) {
+ # Start of file diff marker, report last file if it failed
+ # SPDX validation
+ if (defined $expect_spdx_file) {
+ &check_spdx_present($expect_spdx_file);
+ }
+
+ # Reset state ready to find new file
+ $expect_spdx = 0;
+ $expect_spdx_file = undef;
+ } elsif ($line =~ /^new file mode\s*\d+\s*$/) {
+ # This diff block is a new file, so we must
+ # mandate a SPDX tag
+ $expect_spdx = 1;
+ } elsif ($expect_spdx) {
+ # Capture filename if don't already have it
+ $expect_spdx_file = $realfile unless
+ defined $expect_spdx_file;
+
+ # SPDX tags may occurr in comments which were
+ # stripped from '$line', so use '$rawline'. If
+ # we see one we pass validation
+ if ($rawline =~ /SPDX-License-Identifier/) {
+ $expect_spdx = 0;
+ $expect_spdx_file = undef;
+ }
+ }
+
# Check SPDX-License-Identifier references a permitted license
if ($rawline =~ m,SPDX-License-Identifier: (.*?)(\*/)?\s*$,) {
- &checkspdx($realfile, $1);
+ &check_spdx_expression($realfile, $1);
}
if ($rawline =~ m,(SPDX-[a-zA-Z0-9-_]+):,) {
@@ -3213,6 +3259,12 @@ sub process {
}
}
+ # End of diff, report last file block if it failed
+ # SPDX validation
+ if (defined $expect_spdx_file) {
+ &check_spdx_present($expect_spdx_file);
+ }
+
if ($is_patch && $chk_signoff && $signoff == 0) {
ERROR("Missing Signed-off-by: line(s)\n");
}
--
2.49.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection
2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
@ 2025-05-09 13:01 ` Peter Maydell
2025-05-09 14:17 ` Daniel P. Berrangé
2025-05-09 13:07 ` Peter Maydell
1 sibling, 1 reply; 6+ messages in thread
From: Peter Maydell @ 2025-05-09 13:01 UTC (permalink / raw)
To: Daniel P. Berrangé; +Cc: qemu-devel
On Thu, 8 May 2025 at 18:01, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> The new attempt at detecting missing SPDX-License-Identifier in
> new files is using the following logic
>
> * When seeing a line starting 'diff --git ...' it indicates
> the start of a file in the patch. This must trigger reporting
> of violations in the previous file (if any).
>
> It must reset the validation state, since this may now be a
> pre-existing file being changed. This will be resolved by
> the next rule.
>
> * When seeing a line starting 'new file mode...' it indicates
> a newly created file and must enable SPDX validation.
>
> * When seeing EOF, it must trigger reporting of violations in
> the last new file in the patch, if any.
> @@ -1442,6 +1457,8 @@ sub process {
> my $in_imported_file = 0;
> my $in_no_imported_file = 0;
> my $non_utf8_charset = 0;
> + my $expect_spdx = 0;
> + my $expect_spdx_file;
>
> our @report = ();
> our $cnt_lines = 0;
> @@ -1679,9 +1696,38 @@ sub process {
> WARN("added, moved or deleted file(s), does MAINTAINERS need updating?\n" . $herecurr);
> }
>
> +# All new files should have a SPDX-License-Identifier tag
> + if ($line =~ /^diff --git/) {
> + # Start of file diff marker, report last file if it failed
> + # SPDX validation
> + if (defined $expect_spdx_file) {
> + &check_spdx_present($expect_spdx_file);
> + }
> +
> + # Reset state ready to find new file
> + $expect_spdx = 0;
> + $expect_spdx_file = undef;
We already have a point in the code where we say "ah, this looks
like the start of a new file" (under the comment "extract the
filename as it passes"), and it looks for two possible regexes,
not just "diff --git". Maybe we should combine these so that
we have something like
if ($line =~ /^diff --git.*?(\S+)$/) {
handle_end_of_file($realfile) if $realfile ne '';
$realfile = $1;
$realfile =~ s@^([^/]*)/@@ if (!$file);
handle_start_of_file($realfile);
} elsif ($line =~ /^\+\+\+\s+(\S+)/) {
handle_end_of_file($realfile) if $realfile ne '';
$realfile = $1;
$realfile =~ s@^([^/]*)/@@ if (!$file);
handle_start_of_file($realfile);
$p1_prefix = $1;
if (!$file && $tree && $p1_prefix ne '' &&
-e "$root/$p1_prefix") {
WARN("patch prefix '$p1_prefix'
exists, appears to be a -p0 patch\n");
}
next;
(side note: seems odd that we have 'next' here but not in the
previous half of this if()...)
}
and move checkfilename() to inside handle_start_of_file(),
and have the spdx check handling done inside
handle_start_of_file() and handle_end_of_file() ?
> + # End of diff, report last file block if it failed
and we would call
handle_end_of_file($realfile) if $realfile ne '';
here too.
thanks
-- PMM
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection
2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
2025-05-09 13:01 ` Peter Maydell
@ 2025-05-09 13:07 ` Peter Maydell
1 sibling, 0 replies; 6+ messages in thread
From: Peter Maydell @ 2025-05-09 13:07 UTC (permalink / raw)
To: Daniel P. Berrangé; +Cc: qemu-devel
On Thu, 8 May 2025 at 18:01, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> The new attempt at detecting missing SPDX-License-Identifier in
> new files is using the following logic
>
> * When seeing a line starting 'diff --git ...' it indicates
> the start of a file in the patch. This must trigger reporting
> of violations in the previous file (if any).
>
> It must reset the validation state, since this may now be a
> pre-existing file being changed. This will be resolved by
> the next rule.
>
> * When seeing a line starting 'new file mode...' it indicates
> a newly created file and must enable SPDX validation.
>
> * When seeing EOF, it must trigger reporting of violations in
> the last new file in the patch, if any.
> +# All new files should have a SPDX-License-Identifier tag
> + if ($line =~ /^diff --git/) {
> + # Start of file diff marker, report last file if it failed
> + # SPDX validation
> + if (defined $expect_spdx_file) {
> + &check_spdx_present($expect_spdx_file);
> + }
> +
Forgot to mention, but our perl coding style in this
file seems to be generally not to use the & sigil
when making function calls.
-- PMM
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection
2025-05-09 13:01 ` Peter Maydell
@ 2025-05-09 14:17 ` Daniel P. Berrangé
0 siblings, 0 replies; 6+ messages in thread
From: Daniel P. Berrangé @ 2025-05-09 14:17 UTC (permalink / raw)
To: Peter Maydell; +Cc: qemu-devel
On Fri, May 09, 2025 at 02:01:32PM +0100, Peter Maydell wrote:
> On Thu, 8 May 2025 at 18:01, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > The new attempt at detecting missing SPDX-License-Identifier in
> > new files is using the following logic
> >
> > * When seeing a line starting 'diff --git ...' it indicates
> > the start of a file in the patch. This must trigger reporting
> > of violations in the previous file (if any).
> >
> > It must reset the validation state, since this may now be a
> > pre-existing file being changed. This will be resolved by
> > the next rule.
> >
> > * When seeing a line starting 'new file mode...' it indicates
> > a newly created file and must enable SPDX validation.
> >
> > * When seeing EOF, it must trigger reporting of violations in
> > the last new file in the patch, if any.
>
> > @@ -1442,6 +1457,8 @@ sub process {
> > my $in_imported_file = 0;
> > my $in_no_imported_file = 0;
> > my $non_utf8_charset = 0;
> > + my $expect_spdx = 0;
> > + my $expect_spdx_file;
> >
> > our @report = ();
> > our $cnt_lines = 0;
> > @@ -1679,9 +1696,38 @@ sub process {
> > WARN("added, moved or deleted file(s), does MAINTAINERS need updating?\n" . $herecurr);
> > }
> >
> > +# All new files should have a SPDX-License-Identifier tag
> > + if ($line =~ /^diff --git/) {
> > + # Start of file diff marker, report last file if it failed
> > + # SPDX validation
> > + if (defined $expect_spdx_file) {
> > + &check_spdx_present($expect_spdx_file);
> > + }
> > +
> > + # Reset state ready to find new file
> > + $expect_spdx = 0;
> > + $expect_spdx_file = undef;
>
>
> We already have a point in the code where we say "ah, this looks
> like the start of a new file" (under the comment "extract the
> filename as it passes"), and it looks for two possible regexes,
> not just "diff --git". Maybe we should combine these so that
> we have something like
>
> if ($line =~ /^diff --git.*?(\S+)$/) {
> handle_end_of_file($realfile) if $realfile ne '';
> $realfile = $1;
> $realfile =~ s@^([^/]*)/@@ if (!$file);
> handle_start_of_file($realfile);
> } elsif ($line =~ /^\+\+\+\s+(\S+)/) {
> handle_end_of_file($realfile) if $realfile ne '';
> $realfile = $1;
> $realfile =~ s@^([^/]*)/@@ if (!$file);
> handle_start_of_file($realfile);
>
> $p1_prefix = $1;
> if (!$file && $tree && $p1_prefix ne '' &&
> -e "$root/$p1_prefix") {
> WARN("patch prefix '$p1_prefix'
> exists, appears to be a -p0 patch\n");
> }
>
> next;
>
> (side note: seems odd that we have 'next' here but not in the
> previous half of this if()...)
>
> }
>
> and move checkfilename() to inside handle_start_of_file(),
> and have the spdx check handling done inside
> handle_start_of_file() and handle_end_of_file() ?
>
> > + # End of diff, report last file block if it failed
>
> and we would call
> handle_end_of_file($realfile) if $realfile ne '';
Yeah, having standlone handle_start_of_file/handle_end_of_file
methods would make it easier to understand what's going on, as
this method with all the check rules is insanely long and hard
to follow.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-05-09 14:17 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-08 17:00 [PATCH 0/2] scripts/checkpatch: fix SPDX-License-Identifier detection Daniel P. Berrangé
2025-05-08 17:00 ` [PATCH 1/2] Revert "scripts: mandate that new files have SPDX-License-Identifier" Daniel P. Berrangé
2025-05-08 17:01 ` [PATCH 2/2] scripts/checkpatch: reimplement SPDX-License-Identifier detection Daniel P. Berrangé
2025-05-09 13:01 ` Peter Maydell
2025-05-09 14:17 ` Daniel P. Berrangé
2025-05-09 13:07 ` Peter Maydell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).