public inbox for linux-spdx@vger.kernel.org
* [PATCH] scripts/spdxcheck: Limit the scope of git.Repo
@ 2025-02-25 13:10 Ricardo Ribalda
  2025-04-03 21:34 ` Ricardo Ribalda
  2025-04-07 14:38 ` Duje Mihanović
  0 siblings, 2 replies; 15+ messages in thread
From: Ricardo Ribalda @ 2025-02-25 13:10 UTC (permalink / raw)
  To: Thomas Gleixner, Greg Kroah-Hartman
  Cc: linux-spdx, linux-kernel, Ricardo Ribalda

If the git.Repo object's scope extends to the Python interpreter's
shutdown phase, its destructor may fail due to the interpreter's state.

Exception ignored in: <function Git.AutoInterrupt.__del__ at 0x7f1941dd5620>
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 565, in __del__
  File "/usr/lib/python3/dist-packages/git/cmd.py", line 546, in _terminate
  File "/usr/lib/python3.13/subprocess.py", line 2227, in terminate
ImportError: sys.meta_path is None, Python is likely shutting down

Use the `with` statement to limit the scope of git.Repo and ensure
proper resource management.

Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
 scripts/spdxcheck.py | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/scripts/spdxcheck.py b/scripts/spdxcheck.py
index 8d608f61bf37..eba808cbaeeb 100755
--- a/scripts/spdxcheck.py
+++ b/scripts/spdxcheck.py
@@ -349,11 +349,11 @@ if __name__ == '__main__':
 
     try:
         # Use git to get the valid license expressions
-        repo = git.Repo(os.getcwd())
-        assert not repo.bare
+        with git.Repo(os.getcwd()) as repo:
+            assert not repo.bare
 
-        # Initialize SPDX data
-        spdx = read_spdxdata(repo)
+            # Initialize SPDX data
+            spdx = read_spdxdata(repo)
 
         # Initialize the parser
         parser = id_parser(spdx)

---
base-commit: d082ecbc71e9e0bf49883ee4afd435a77a5101b6
change-id: 20250225-spx-382cf543370e

Best regards,
-- 
Ricardo Ribalda <ribalda@chromium.org>
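
The scoping behaviour the patch relies on can be sketched without GitPython
installed. This is an illustration only, not code from spdxcheck.py:
`FakeRepo` and `load_data` are hypothetical names, and `FakeRepo` is assumed
to release its resources in close() when the `with` block exits. Note that
the name bound by `as` is still in scope after the block ends; only the
resource has been released.

```python
class FakeRepo:
    """Hypothetical stand-in for git.Repo; not part of GitPython."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Release resources deterministically when the block ends,
        # instead of waiting for a destructor at interpreter shutdown.
        self.close()
        return False  # do not swallow exceptions

    def close(self):
        self.closed = True


def load_data(path):
    with FakeRepo(path) as repo:
        closed_inside = repo.closed      # still open inside the block
        data = {"path": repo.path}
    # The name 'repo' is still bound here, but the object is closed.
    return data, closed_inside, repo.closed
```

Anything that needs the open resource has to run inside the `with` block;
code that touches `repo` afterwards sees an already closed object.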



* Re: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo
  2025-02-25 13:10 [PATCH] scripts/spdxcheck: Limit the scope of git.Repo Ricardo Ribalda
@ 2025-04-03 21:34 ` Ricardo Ribalda
  2025-04-04  6:21   ` Greg Kroah-Hartman
  2025-04-07 14:38 ` Duje Mihanović
  1 sibling, 1 reply; 15+ messages in thread
From: Ricardo Ribalda @ 2025-04-03 21:34 UTC (permalink / raw)
  To: Thomas Gleixner, Greg Kroah-Hartman; +Cc: linux-spdx, linux-kernel

Friendly ping


* Re: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo
  2025-04-03 21:34 ` Ricardo Ribalda
@ 2025-04-04  6:21   ` Greg Kroah-Hartman
  2025-04-04  6:29     ` Ricardo Ribalda
  0 siblings, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-04-04  6:21 UTC (permalink / raw)
  To: Ricardo Ribalda; +Cc: Thomas Gleixner, linux-spdx, linux-kernel

On Thu, Apr 03, 2025 at 11:34:14PM +0200, Ricardo Ribalda wrote:
> Friendly ping

Empty pings provide no context at all :(


* Re: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo
  2025-04-04  6:21   ` Greg Kroah-Hartman
@ 2025-04-04  6:29     ` Ricardo Ribalda
  2025-04-04  8:06       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 15+ messages in thread
From: Ricardo Ribalda @ 2025-04-04  6:29 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Thomas Gleixner, linux-spdx, linux-kernel

On Fri, 4 Apr 2025 at 08:22, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Thu, Apr 03, 2025 at 11:34:14PM +0200, Ricardo Ribalda wrote:
> > Friendly ping
>
> Empty pings provide no context at all :(

Do you mean that I should have left the whole patch as context, or that I
should provide a reason for the ping?

Let me try again:

Is there any change needed for
https://lore.kernel.org/linux-spdx/2025040417-aspire-relenting-5462@gregkh/T/#t

that was sent for review over a month ago?

Regards!


-- 
Ricardo Ribalda


* Re: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo
  2025-04-04  6:29     ` Ricardo Ribalda
@ 2025-04-04  8:06       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-04-04  8:06 UTC (permalink / raw)
  To: Ricardo Ribalda; +Cc: Thomas Gleixner, linux-spdx, linux-kernel

On Fri, Apr 04, 2025 at 08:29:21AM +0200, Ricardo Ribalda wrote:
> On Fri, 4 Apr 2025 at 08:22, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Apr 03, 2025 at 11:34:14PM +0200, Ricardo Ribalda wrote:
> > > Friendly ping
> >
> > Empty pings provide no context at all :(
> 
> Do you mean that I should have left the whole patch as context, or that I
> should provide a reason for the ping?

Both, as it is, I have no idea of what you are asking for here with a
one line blank email.

> Let me try again:
> 
> Is there any change needed for
> https://lore.kernel.org/linux-spdx/2025040417-aspire-relenting-5462@gregkh/T/#t
> 
> that was sent for review over a month ago?

It's in the very-long review queue on my end, sorry.  Give us some
chance to catch up.  While waiting, please feel free to help out in
reviewing changes from other people so that your changes bubble up to
the top.

thanks,

greg k-h


* Re: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo
  2025-02-25 13:10 [PATCH] scripts/spdxcheck: Limit the scope of git.Repo Ricardo Ribalda
  2025-04-03 21:34 ` Ricardo Ribalda
@ 2025-04-07 14:38 ` Duje Mihanović
  2025-04-08  8:39   ` Gon Solo
  1 sibling, 1 reply; 15+ messages in thread
From: Duje Mihanović @ 2025-04-07 14:38 UTC (permalink / raw)
  To: Thomas Gleixner, Greg Kroah-Hartman, Ricardo Ribalda
  Cc: linux-spdx, linux-kernel

On Tuesday, 25 February 2025 14:10:41 Central European Summer Time Ricardo 
Ribalda wrote:
> If the git.Repo object's scope extends to the Python interpreter's
> shutdown phase, its destructor may fail due to the interpreter's state.
> 
> Exception ignored in: <function Git.AutoInterrupt.__del__ at 0x7f1941dd5620>
> Traceback (most recent call last):
>   File "/usr/lib/python3/dist-packages/git/cmd.py", line 565, in __del__
>   File "/usr/lib/python3/dist-packages/git/cmd.py", line 546, in _terminate
>   File "/usr/lib/python3.13/subprocess.py", line 2227, in terminate
> ImportError: sys.meta_path is None, Python is likely shutting down
> 
> Use the `with` statement to limit the scope of git.Repo and ensure
> proper resource management.
> 
> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> ---

checkpatch suddenly broke for me with the same error as shown here and the 
patch fixed it.

Tested-by: Duje Mihanović <duje.mihanovic@skole.hr>

Regards,
-- 
Duje





* Re: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo
  2025-04-07 14:38 ` Duje Mihanović
@ 2025-04-08  8:39   ` Gon Solo
  2025-04-08  9:33     ` Gon Solo
  0 siblings, 1 reply; 15+ messages in thread
From: Gon Solo @ 2025-04-08  8:39 UTC (permalink / raw)
  To: Duje Mihanović
  Cc: Thomas Gleixner, Greg Kroah-Hartman, Ricardo Ribalda, linux-spdx,
	linux-kernel

> checkpatch suddenly broke for me with the same error as shown here and the 
> patch fixed it.
> 
> Tested-by: Duje Mihanović <duje.mihanovic@skole.hr>

Same for me.

Tested-by: Andreas Wendleder <gonsolo@gmail.com>



* Re: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo
  2025-04-08  8:39   ` Gon Solo
@ 2025-04-08  9:33     ` Gon Solo
  2025-04-08 10:36       ` Gon Solo
  0 siblings, 1 reply; 15+ messages in thread
From: Gon Solo @ 2025-04-08  9:33 UTC (permalink / raw)
  To: Duje Mihanović
  Cc: Thomas Gleixner, Greg Kroah-Hartman, Ricardo Ribalda, linux-spdx,
	linux-kernel

> > checkpatch suddenly broke for me with the same error as shown here and the 
> > patch fixed it.
> > 
> > Tested-by: Duje Mihanović <duje.mihanovic@skole.hr>

Turns out, it was not enough; the variable is used later.
How about the following patch?

From 763f25c8ca2e29f343bfd109a17501de71b38d43 Mon Sep 17 00:00:00 2001
From: Andreas Wendleder <gonsolo@gmail.com>
Date: Tue, 8 Apr 2025 11:21:17 +0200
Subject: [PATCH] Fix spdxcheck.py.

As explained in Ricardo Ribalda's patch:

If the git.Repo object's scope extends to the Python interpreter's
shutdown phase, its destructor may fail due to the interpreter's state.

Exception ignored in: <function Git.AutoInterrupt.__del__ at 0x76e6b0148040>
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/git/cmd.py", line 790, in __del__
  File "/usr/lib/python3.13/site-packages/git/cmd.py", line 781, in _terminate
  File "/usr/lib/python3.13/subprocess.py", line 2227, in terminate
ImportError: sys.meta_path is None, Python is likely shutting down

Unfortunately, repo is used later at lines 392 and 399, so we have to
keep it and manually delete it before exiting. This can be checked by
testing a directory instead of a file.

Signed-off-by: Andreas Wendleder <gonsolo@gmail.com>
---
 scripts/spdxcheck.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/spdxcheck.py b/scripts/spdxcheck.py
index 8d608f61bf37..6a89e2b2faba 100755
--- a/scripts/spdxcheck.py
+++ b/scripts/spdxcheck.py
@@ -448,6 +448,7 @@ if __name__ == '__main__':
                     for f in sorted(di.files):
                         sys.stderr.write('    %s\n' %f)
 
+            del repo
             sys.exit(0)
 
     except Exception as ex:
-- 
2.49.0
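
Why an explicit `del` before exiting helps can be sketched as follows. This
is a minimal illustration under stated assumptions, not code from
spdxcheck.py: `Handle` is a hypothetical class standing in for an object
with a destructor, and the ordering shown relies on CPython's reference
counting, where dropping the last reference runs `__del__` immediately.

```python
events = []


class Handle:
    """Hypothetical resource with a finalizer (not a GitPython class)."""

    def __del__(self):
        events.append("finalized")


def run():
    handle = Handle()
    events.append("working")
    # On CPython the reference count drops to zero here, so __del__
    # runs now, while the interpreter is still fully functional.
    del handle
    events.append("exiting")


run()
```

`del` only removes one name. If another reference to the object is still
alive, the finalizer does not run at that point.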



* Re: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo
  2025-04-08  9:33     ` Gon Solo
@ 2025-04-08 10:36       ` Gon Solo
  2025-04-08 17:34         ` spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo) Bird, Tim
  0 siblings, 1 reply; 15+ messages in thread
From: Gon Solo @ 2025-04-08 10:36 UTC (permalink / raw)
  To: Duje Mihanović
  Cc: Thomas Gleixner, Greg Kroah-Hartman, Ricardo Ribalda, linux-spdx,
	linux-kernel

It's a known problem:
https://github.com/gitpython-developers/GitPython/issues/2003
https://github.com/python/cpython/issues/118761#issuecomment-2661504264



* spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo)
  2025-04-08 10:36       ` Gon Solo
@ 2025-04-08 17:34         ` Bird, Tim
  2025-04-08 18:10           ` Greg Kroah-Hartman
  2025-04-08 23:41           ` Thomas Gleixner
  0 siblings, 2 replies; 15+ messages in thread
From: Bird, Tim @ 2025-04-08 17:34 UTC (permalink / raw)
  To: Gon Solo, Duje Mihanović
  Cc: Thomas Gleixner, Greg Kroah-Hartman, Ricardo Ribalda,
	linux-spdx@vger.kernel.org, linux-kernel@vger.kernel.org

> -----Original Message-----
> From: Gon Solo <gonsolo@gmail.com>
> It's a known problem:
> https://github.com/gitpython-developers/GitPython/issues/2003
> https://github.com/python/cpython/issues/118761#issuecomment-2661504264
> 

For what it's worth, I've always been a bit skeptical of the use of the python git module
in spdxcheck.py.  Its use makes it impossible to use spdxcheck on a kernel source tree
from a tarball (ie, on source not inside a git repo).  Also, from what I can see in spdxcheck.py,
the way it's used is just to get the top directories for either the LICENSES dir,
the top dir of the kernel source tree, or the directory to scan passed on the
spdxcheck.py command line, and then to use the repo.traverse() function on said directory.

This ends up excluding any files in the source directory tree that are not checked
into git yet, silently skipping them (which I've run into before when using the tool).

I think the code could be relatively easily refactored to eliminate the use of the git
module, to overcome these issues.  I'm not sure if removing the module would
eliminate the yield operation (used inside repo.traverse()), which seems to be causing the
problem found here.  IMHO, in my experience when using python it is helpful
to use as few non-core modules as possible, because they tend to break like this
occasionally.

Let me know if anyone objects to me working up a refactoring of spdxcheck.py
eliminating the use of the python 'git' module, and submitting it for review.

Thanks,
 -- Tim



* Re: spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo)
  2025-04-08 17:34         ` spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo) Bird, Tim
@ 2025-04-08 18:10           ` Greg Kroah-Hartman
  2025-04-08 21:39             ` Ricardo Ribalda
  2025-04-08 23:41           ` Thomas Gleixner
  1 sibling, 1 reply; 15+ messages in thread
From: Greg Kroah-Hartman @ 2025-04-08 18:10 UTC (permalink / raw)
  To: Bird, Tim
  Cc: Gon Solo, Duje Mihanović, Thomas Gleixner, Ricardo Ribalda,
	linux-spdx@vger.kernel.org, linux-kernel@vger.kernel.org

On Tue, Apr 08, 2025 at 05:34:20PM +0000, Bird, Tim wrote:
> > -----Original Message-----
> > From: Gon Solo <gonsolo@gmail.com>
> > It's a known problem:
> > https://github.com/gitpython-developers/GitPython/issues/2003
> > https://github.com/python/cpython/issues/118761#issuecomment-2661504264
> > 
> 
> For what it's worth, I've always been a bit skeptical of the use of the python git module
> in spdxcheck.py.  Its use makes it impossible to use spdxcheck on a kernel source tree
> from a tarball (ie, on source not inside a git repo).  Also, from what I can see in spdxcheck.py,
> the way it's used is just to get the top directories for either the LICENSES dir,
> the top dir of the kernel source tree, or the directory to scan passed on the
> spdxcheck.py command line, and then to use the repo.traverse() function on said directory.
> 
> This ends up excluding any files in the source directory tree that are not checked
> into git yet, silently skipping them (which I've run into before when using the tool).
> 
> I think the code could be relatively easily refactored to eliminate the use of the git
> module, to overcome these issues.  I'm not sure if removing the module would
> eliminate the yield operation (used inside repo.traverse()), which seems to be causing the
> problem found here.  IMHO, in my experience when using python it is helpful
> to use as few non-core modules as possible, because they tend to break like this
> occasionally.
> 
> Let me know if anyone objects to me working up a refactoring of spdxcheck.py
> eliminating the use of the python 'git' module, and submitting it for review.

No objection from me!


* Re: spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo)
  2025-04-08 18:10           ` Greg Kroah-Hartman
@ 2025-04-08 21:39             ` Ricardo Ribalda
  0 siblings, 0 replies; 15+ messages in thread
From: Ricardo Ribalda @ 2025-04-08 21:39 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Bird, Tim, Gon Solo, Duje Mihanović, Thomas Gleixner,
	linux-spdx@vger.kernel.org, linux-kernel@vger.kernel.org

Hi Tim

On Tue, 8 Apr 2025 at 20:12, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:

> > Let me know if anyone objects to me working up a refactoring of spdxcheck.py
> > eliminating the use of the python 'git' module, and submitting it for review.
>
> No objection from me!

SGTM. Depending on how much time you need to implement it, we could
land something with `del` as Gon proposed. I can send a v2 if needed.

Let me know what you think



-- 
Ricardo Ribalda


* Re: spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo)
  2025-04-08 17:34         ` spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo) Bird, Tim
  2025-04-08 18:10           ` Greg Kroah-Hartman
@ 2025-04-08 23:41           ` Thomas Gleixner
  2025-04-09 17:44             ` Bird, Tim
  1 sibling, 1 reply; 15+ messages in thread
From: Thomas Gleixner @ 2025-04-08 23:41 UTC (permalink / raw)
  To: Bird, Tim, Gon Solo, Duje Mihanović
  Cc: Greg Kroah-Hartman, Ricardo Ribalda, linux-spdx@vger.kernel.org,
	linux-kernel@vger.kernel.org

On Tue, Apr 08 2025 at 17:34, Tim Bird wrote:
>> -----Original Message-----
> For what it's worth, I've always been a bit skeptical of the use of the python git module
> in spdxcheck.py.  Its use makes it impossible to use spdxcheck on a kernel source tree
> from a tarball (ie, on source not inside a git repo).  Also, from what I can see in spdxcheck.py,
> the way it's used is just to get the top directories for either the LICENSES dir,
> the top dir of the kernel source tree, or the directory to scan passed on the
> spdxcheck.py command line, and then to use the repo.traverse() function on said directory.
>
> This ends up excluding any files in the source directory tree that are not checked
> into git yet, silently skipping them (which I've run into before when
> using the tool).

Exactly the same problem exists the other way round. Run an
unconstrained version of spdxcheck on a dirty source tree with lots of
leftovers, then it scans nonsense all the way instead of skipping some
not yet git tracked files.

The easiest way for me to achieve that was using git to exclude all of
the irrelevant noise, which I still consider to be a reasonable design
decision.

And yes, it ignores not yet tracked files, but if you want to check
them, then it's easy enough to commit them temporarily or provide a
dedicated file target to the tools, which ignores git.

> I think the code could be relatively easily refactored to eliminate the use of the git
> module, to overcome these issues.  I'm not sure if removing the module would
> eliminate the yield operation (used inside repo.traverse()), which seems to be causing the
> problem found here.  IMHO, in my experience when using python it is helpful
> to use as few non-core modules as possible, because they tend to break like this
> occasionally.
>
> Let me know if anyone objects to me working up a refactoring of spdxcheck.py
> eliminating the use of the python 'git' module, and submitting it for review.

I have no objections at all as long as it gives the same result of not
trying to scan random artifacts which might sit around in a source tree.

But not for the price that I have to create a tarball or a pristine
checked out tree first to run it. That'd be a usability regression to
begin with.

Good luck for coming up with a clever and clean solution for that!

Just for the record: I rather wish that people would contribute to
eliminate the remaining 17% (15397 files) which do not have SPDX
identifiers than complaining about the trivial to solve short-comings of
the tool, which was written to help this effort and to make sure that it
does not degrade.

Thanks,

        tglx
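
One way to keep the "only look at tracked files" behaviour without the
Python git module is to ask the git command line for the file list. This is
only a sketch and was not proposed in this thread: `parse_ls_files` and
`tracked_files` are hypothetical helper names, and the approach is assumed
to run inside a git checkout with the `git` binary on PATH, so it does not
help with a tarball.

```python
import subprocess


def parse_ls_files(output: bytes) -> list[str]:
    """Split NUL-separated `git ls-files -z` output into paths."""
    return [
        entry.decode("utf-8", "surrogateescape")
        for entry in output.split(b"\0")
        if entry
    ]


def tracked_files(directory: str) -> list[str]:
    """Return git-tracked paths below `directory`, relative to it.

    Untracked build leftovers are not listed by `git ls-files`.
    """
    result = subprocess.run(
        ["git", "-C", directory, "ls-files", "-z"],
        check=True,
        stdout=subprocess.PIPE,
    )
    return parse_ls_files(result.stdout)
```

The `-z` option keeps unusual file names intact, since the output is
NUL-separated rather than quoted.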


* RE: spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo)
  2025-04-08 23:41           ` Thomas Gleixner
@ 2025-04-09 17:44             ` Bird, Tim
  2025-04-09 20:25               ` Thomas Gleixner
  0 siblings, 1 reply; 15+ messages in thread
From: Bird, Tim @ 2025-04-09 17:44 UTC (permalink / raw)
  To: Thomas Gleixner, Gon Solo, Duje Mihanović
  Cc: Greg Kroah-Hartman, Ricardo Ribalda, linux-spdx@vger.kernel.org,
	linux-kernel@vger.kernel.org

> -----Original Message-----
> From: Thomas Gleixner <tglx@linutronix.de>
> On Tue, Apr 08 2025 at 17:34, Tim Bird wrote:
> >> -----Original Message-----
> > For what it's worth, I've always been a bit skeptical of the use of the python git module
> > in spdxcheck.py.  Its use makes it impossible to use spdxcheck on a kernel source tree
> > from a tarball (ie, on source not inside a git repo).  Also, from what I can see in spdxcheck.py,
> > the way it's used is just to get the top directories for either the LICENSES dir,
> > the top dir of the kernel source tree, or the directory to scan passed on the
> > spdxcheck.py command line, and then to use the repo.traverse() function on said directory.
> >
> > This ends up excluding any files in the source directory tree that are not checked
> > into git yet, silently skipping them (which I've run into before when
> > using the tool).
> 
> Exactly the same problem exists the other way round. Run an
> unconstrained version of spdxcheck on a dirty source tree with lots of
> leftovers, then it scans nonsense all the way instead of skipping some
> not yet git tracked files.

Yeah.  I thought about this overnight, and came to the same conclusion.
I forgot that most people build the kernel in a way
that the build results end up in the source tree.  (Crazy, right?)
I almost always use KBUILD_OUTPUT, and I always use it when I'm doing
embedded and spdx-related work, so I don't often run into build
contamination of the source tree.

> 
> The easiest way for me to achieve that was using git to exclude all of
> the irrelevant noise, which I still consider to be a reasonable design
> decision.

Agreed.  Given common build practices, this is a reasonable design decision.
I thought there might be a good reason for this design choice, and
was hoping you would respond, Thomas.  Thanks for the quick feedback.

> 
> And yes, it ignores not yet tracked files, but if you want to check
> them, then it's easy enough to commit them temporarily or provide a
> dedicated file target to the tools, which ignores git.

OK.  Yes. That's an easy workaround.

> 
> > I think the code could be relatively easily refactored to eliminate the use of the git
> > module, to overcome these issues.  I'm not sure if removing the module would
> > eliminate the yield operation (used inside repo.traverse()), which seems to be causing the
> > problem found here.  IMHO, in my experience when using python it is helpful
> > to use as few non-core modules as possible, because they tend to break like this
> > occasionally.
> >
> > Let me know if anyone objects to me working up a refactoring of spdxcheck.py
> > eliminating the use of the python 'git' module, and submitting it for review.
> 
> I have no objections at all as long as it gives the same result of not
> trying to scan random artifacts which might sit around in a source tree.
> 
> But not for the price that I have to create a tarball or a pristine
> checked out tree first to run it. That'd be a usability regression to
> begin with.

Agreed.
> 
> Good luck for coming up with a clever and clean solution for that!

I thought about various solutions for this, but each one I came up
with had other drawbacks.  If it was just a matter of separating 
*.[chS] files from ELF object files, that would be easy to deal with.
But we put SPDX headers on all kinds of files, and there are lots
of other types of files generated during a build that are not just
ELF objects.  And build rules change over time.  So even if I made
a comprehensive system today to catch build-generated outliers,
the solution would probably need constant updating and tweaking, which
IMHO makes it a no-go.

> 
> Just for the record: I rather wish that people would contribute to
> eliminate the remaining 17% (15397 files) which do not have SPDX
> identifiers than complaining about the trivial to solve short-comings of
> the tool, which was written to help this effort and to make sure that it
> does not degrade.

I agree with this.  Analyzing where the headers are missing is interesting.
But it's more important to just fix the missing ones.
I'll spend more of my time working on missing headers,
rather than on tools to analyze and report them.

Thanks and regards,
 -- Tim



* RE: spdxcheck: python git module considered harmful (was RE: [PATCH] scripts/spdxcheck: Limit the scope of git.Repo)
  2025-04-09 17:44             ` Bird, Tim
@ 2025-04-09 20:25               ` Thomas Gleixner
  0 siblings, 0 replies; 15+ messages in thread
From: Thomas Gleixner @ 2025-04-09 20:25 UTC (permalink / raw)
  To: Bird, Tim, Gon Solo, Duje Mihanović
  Cc: Greg Kroah-Hartman, Ricardo Ribalda, linux-spdx@vger.kernel.org,
	linux-kernel@vger.kernel.org

Tim!

On Wed, Apr 09 2025 at 17:44, Tim Bird wrote:
>> From: Thomas Gleixner <tglx@linutronix.de>
>> On Tue, Apr 08 2025 at 17:34, Tim Bird wrote:
>> And yes, it ignores not yet tracked files, but if you want to check
>> them, then it's easy enough to commit them temporarily or provide a
>> dedicated file target to the tools, which ignores git.
>
> OK.  Yes. That's an easy workaround.

Actually spdxcheck supports that already:

   scripts/spdxcheck.py path/to/file

>> Good luck for coming up with a clever and clean solution for that!
>
> I thought about various solutions for this, but each one I came up
> with had other drawbacks.  If it was just a matter of separating 
> *.[chS] files from ELF object files, that would be easy to deal with.
> But we put SPDX headers on all kinds of files, and there are lots
> of other types of files generated during a build that are not just
> ELF objects.  And build rules change over time.  So even if I made
> a comprehensive system today to catch build-generated outliers,
> the solution would probably need constant updating and tweaking, which
> IMHO makes it a no-go.

I'm glad that I'm not the only one who came to this conclusion :)

>> Just for the record: I rather wish that people would contribute to
>> eliminate the remaining 17% (15397 files) which do not have SPDX
>> identifiers than complaining about the trivial to solve short-comings of
>> the tool, which was written to help this effort and to make sure that it
>> does not degrade.
>
> I agree with this.  Analyzing where the headers are missing is interesting.
> But it's more important to just fix the missing ones.
> I'll spend more of my time working on missing headers,
> rather than on tools to analyze and report them.

Very appreciated.

Thanks,

        tglx
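
For readers looking for files with missing headers, the presence check can
be approximated as below. This is a deliberately simplified sketch and not
how spdxcheck.py works: the real tool also parses the license expression
and validates it against the LICENSES directory. `find_spdx_expression` is
a hypothetical helper, and looking only at the first two lines is an
assumption based on the usual placement of the tag.

```python
import re

# Matches the tag and captures the rest of the line.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def find_spdx_expression(lines, max_lines=2):
    """Return the license expression from the leading lines, or None."""
    for line in lines[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            expression = match.group(1).strip()
            # Drop the closing marker of a /* ... */ style comment.
            if expression.endswith("*/"):
                expression = expression[:-2].strip()
            return expression
    return None
```

A file for which this returns None within its first lines is a candidate
for the missing-header list; it says nothing about whether a found
expression is valid.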

