From: Thomas Gleixner <tglx@linutronix.de>
To: Max Mehl <max.mehl@fsfe.org>, LKML <linux-kernel@vger.kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Christoph Hellwig <hch@lst.de>,
	linux-spdx@vger.kernel.org
Subject: Re: [patch 0/9] scripts/spdxcheck: Better statistics and exclude handling
Date: Tue, 17 May 2022 23:43:49 +0200
Message-ID: <8735h7ltre.ffs@tglx>
In-Reply-To: <1652775347.3cr9dmk5qv.2220@fsfe.org>

On Tue, May 17 2022 at 10:25, Max Mehl wrote:
> ~ Thomas Gleixner [2022-05-16 20:59 +0200]:
>> There is also an argument to be made whether we really need to have SPDX
>> identifiers on trivial files:
>>
>> #include <someheader.h>
>> <EOF>
>>
>> Such files are not copyrightable by any means. So what's the value of
>> doubling the line count to add an SPDX identifier? Just to make nice
>> statistics?
>
> We agree that such files are not copyrightable. But where is the
> threshold? Lines of code? Creativity? Number of used functions? And how
> to embed this threshold in tooling? So instead of fuzzy exclusion of
> such files in tools like spdxcheck or REUSE, it makes sense to treat
> them as every other file with the cost of adding two comment lines.
>
> This clear-cut rule eases maintaining and growing the effort you and
> others did because developers would know exactly what to add to a new
> file (license + copyright) without requiring looking up the thresholds
> or a manual review by maintainers who can interpret them.

Seriously no. I'm outright refusing to add my copyright to a trivial
file with one or two includes or a silly comment like '/* empty because */'.

There is nothing copyrightable there.

I'm not going to make a fool of myself just to make tools happy, which can
figure out on their own whether there is reasonable content in the vast
majority of cases.

Also you need some exclude rules in any case. Why?

  - How do you tell a tool that a file is generated, e.g. in the kernel
    the default configuration files?

    Yes, the file content depends on human input to the generator tool,
    but I'm looking forward to the explanation of how this is
    copyrightable, especially with multiple people updating this file
    over time, where some of the updates are just done by invoking the
    generator tool itself.

  - How do you tell a tool that a file contains licensing documentation?

    Go and look what license scanners make out of all the various
    license-rules.rst files.

  - ....

Do all scanners have to grow heuristics for ignoring the content past
the topmost SPDX License identifier in certain files or for figuring
out what might be generated content?

You also might need to add information about binary blobs, which
obviously cannot be part of the binary blobs themselves.

The exclude rules I added are lazy and mostly focussed on spdxcheck, but
I'm happy to make them more useful and let them carry information about
the nature of the exclude or morph them into a general scanner info
which also contains binary blob info and other helpful information. But
that needs a larger discussion about the format and rules for such a
file.

That said, I'm all for clear-cut rules, but rules just for the rules'
sake are almost as bad as no rules at all.

As always you have to apply common sense and look at the bigger picture
and come up with solutions which are practicable, enforceable and useful
for the larger ecosystem.

Your goal of having SPDX ids and copyright notices in every file of a
project is honorable, but impractical for various reasons.

See above.

Aside from that, you cannot replace a full-blown license scanner by REUSE
even if your project is SPDX and Copyright notice clean at the top level
of a file. You still need to verify that there is no other information
in a 'clean' file which might be contradicting or supplemental. You
cannot add all of this functionality to REUSE or whatever.

Thanks,

        tglx