* [Buildroot] [RFC 0/2] script to find package licenses
From: Rahul Bedarkar @ 2016-08-04 14:16 UTC
To: buildroot
Hi,
Legal information is something we can't fully automate, but we want it
to be correct when a new package is added or a version is bumped.
This patch set adds a script that finds license information in package
source files, to help verify or correct the legal info of Buildroot packages.
Legal information may get outdated with version bumps, or may be wrong
in the first place if the source package does not provide any license files.
In such cases, we need to look at file headers to get that information,
which can be very tedious when there are many source files.
The find-licenses script scans package source files for known licenses
to find out under which license the package is released. It aggregates
the license information of all source files found in a package.
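A minimal sketch of the scanning step, assuming Python 3. The extension
tuple is copied from the proposed script; `find_source_files` is a
hypothetical helper name and not part of the patch.

```python
import os

# Extensions copied from EXTENSIONS in the proposed find-licenses script.
EXTENSIONS = (".c", ".cc", ".h", ".cpp", ".sh", ".py", ".lua")

def find_source_files(src_dir):
    """Return the sorted paths, relative to src_dir, of all candidate source files."""
    paths = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            # str.endswith accepts a tuple of suffixes
            if name.endswith(EXTENSIONS):
                full = os.path.join(root, name)
                paths.append(os.path.relpath(full, src_dir))
    return sorted(paths)
```

Files with other extensions (a README, a Makefile) are simply ignored,
so license text that lives only in such files is not seen by this step.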
To find a license, we rely on each file's license header. Most packages
use standard license headers, which helps us detect their licenses.
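As a rough illustration of the header-based approach, the Python 3 sketch
below collects a file's leading run of single-line comments and matches it
against one regex. The pattern is copied from the script's Apache-2.0
entry; `leading_comment` and `detect` are hypothetical names, and the
sketch ignores block comments, which the real script handles separately.

```python
import re

# Pattern taken from OTHER_LICENSES["Apache-2.0"] in the proposed script.
APACHE_2 = re.compile(r"Licensed under the Apache License, Version 2.0",
                      re.IGNORECASE)

def leading_comment(text):
    """Return the leading '#', '//' or '--' comment lines joined into one string."""
    words = []
    for line in text.splitlines():
        stripped = line.strip()
        for marker in ("//", "#", "--"):
            if stripped.startswith(marker):
                # drop the comment characters, keep the comment text
                words.append(stripped.lstrip(marker[0]).strip())
                break
        else:
            break  # first non-comment line ends the header
    return " ".join(w for w in words if w)

def detect(text):
    """Return ['Apache-2.0'] if the leading comment matches, else an empty list."""
    return ["Apache-2.0"] if APACHE_2.search(leading_comment(text)) else []
```

Joining the comment lines with single spaces is what lets a pattern
written on one line match a notice that is wrapped over several lines.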
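Currently it supports a set of common licenses (the BSD 2-, 3- and
4-clause family, GPL/LGPL, MIT, ISC, OpenBSD, Apache-2.0, AFLv2.1 and
public domain); other licenses can be added later as regexes.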
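Adding a license would amount to adding one more name/regex pair to a
table. A sketch in Python 3, assuming a table shaped like the script's
OTHER_LICENSES: the first two entries are copied from the patch, while
the MPL-2.0 entry and the `match_other_licenses` helper are hypothetical
additions for illustration only.

```python
import re

# Two entries copied from the script's OTHER_LICENSES table.
OTHER_LICENSES = {
    "AFLv2.1": r"Academic Free License version 2.1",
    "Apache-2.0": r"Licensed under the Apache License, Version 2.0",
}

# Hypothetical addition: the standard MPL 2.0 per-file notice. The dots
# are escaped so that "2.0" cannot match arbitrary characters.
OTHER_LICENSES["MPL-2.0"] = (
    r"This Source Code Form is subject to the terms of the "
    r"Mozilla Public License, v\. 2\.0"
)

def match_other_licenses(header):
    """Return the sorted names of all table entries found in header."""
    data = " ".join(header.split())  # collapse newlines and runs of spaces
    return sorted(name for name, pattern in OTHER_LICENSES.items()
                  if re.search(pattern, data, re.IGNORECASE))
```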
The script prints the licenses found on standard output per file, per
directory, and as a final aggregation of all licenses found. It also
lists files that have no license header. The per-directory listing is
useful when different components are released under different licenses.
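The per-directory view is a set union over the files of each directory.
A sketch in Python 3, assuming the per-file results are already available
as a dict from relative path to license list; `licenses_by_directory` is
a hypothetical name, not a function of the patch.

```python
import posixpath
from collections import defaultdict

def licenses_by_directory(file_licenses):
    """Map each directory to the sorted union of its files' licenses."""
    per_dir = defaultdict(set)
    for path, licenses in file_licenses.items():
        # a file without licenses still registers its directory
        per_dir[posixpath.dirname(path)].update(licenses)
    return {d: sorted(found) for d, found in per_dir.items()}
```

For example, a `lua` directory holding an unlicensed `test.lua` and an
LGPLv2.1 `ubus.c` is reported as `['LGPLv2.1']`, which is why the
separate list of files without a header remains necessary.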
Since the final license list is just an aggregation of the licenses
found in all source files, we cannot tell for sure whether a package is
dual- or multi-licensed, or whether different components are released
under different licenses. That's why we can't use the final license list
directly in our package .mk file, but it at least helps us find or
verify license information quickly.
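The "surety" flag reported by the script follows from that limitation:
the result is only treated as certain when at most one license was found
and no file lacks a header. A Python 3 sketch of that rule, with the
hypothetical name `summarize` and the same dict input as assumed above:

```python
def summarize(file_licenses):
    """Return (final_licenses, files_without_license, sure) for a package."""
    final = set()
    missing = []
    for path, licenses in sorted(file_licenses.items()):
        if not licenses:
            missing.append(path)
        final.update(licenses)
    # Same rule as the script: at most one license overall and no
    # file without a recognised header.
    sure = len(final) <= 1 and not missing
    return sorted(final), missing, sure
```

With the ubus results shown below, a single license is found but four
files have no header, so the flag is False.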
e.g.
$ make ubus-find-licenses
/home/rahul.bedarkar/buildroot/support/scripts/find-licenses ubus /home/rahul.bedarkar/buildroot/output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/
ubus: Licenses file-wise:
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/CMakeFiles/3.5.2/CompilerIdC/CMakeCCompilerId.c: []
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/CMakeFiles/feature_tests.c: []
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/examples/client.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/examples/count.h: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/examples/server.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/examples/count.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubus_common.h: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/libubus.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd_proto.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/libubus.h: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/libubus-io.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/libubus-req.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/libubus-internal.h: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd_id.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd.h: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd_acl.h: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd_acl.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusmsg.h: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/libubus-obj.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd_event.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/libubus-acl.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd_monitor.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/libubus-sub.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/cli.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd_obj.c: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd_obj.h: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/ubusd_id.h: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/lua/test.lua: []
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/lua/test_client.lua: []
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/lua/ubus.c: ['LGPLv2.1']
ubus: Licenses directory-wise:
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/CMakeFiles/3.5.2/CompilerIdC: []
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/CMakeFiles: []
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/examples: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63: ['LGPLv2.1']
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/lua: ['LGPLv2.1']
ubus: Can not find license header in following files: 4
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/CMakeFiles/3.5.2/CompilerIdC/CMakeCCompilerId.c
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/CMakeFiles/feature_tests.c
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/lua/test.lua
output/build/ubus-259450f414d8c9ee41896e8e6d6bc57ec00e2b63/lua/test_client.lua
ubus: Surety of licenses: False
ubus: Final licenses: 1
['LGPLv2.1']
Rahul Bedarkar (2):
scripts: add a script to find licenses of package
new make target <PKG>-find-licenses
Makefile | 1 +
package/pkg-generic.mk | 4 +
support/scripts/find-licenses | 249 ++++++++++++++++++++++++++++++++++++++++++
3 files changed, 254 insertions(+)
create mode 100755 support/scripts/find-licenses
--
2.6.2
* [Buildroot] [RFC 1/2] scripts: add a script to find licenses of package
From: Rahul Bedarkar @ 2016-08-04 14:16 UTC
To: buildroot

Legal information may get outdated with version bumps, or may be wrong
in the first place if the source package does not provide any license
files. In such cases, we need to look at file headers to get that
information, which can be very tedious when there are many source files.

The find-licenses script scans package source files for known licenses
to find out under which license the package is released. It aggregates
the license information of all source files found in a package.

To find a license, we rely on each file's license header. Most packages
use standard license headers, which helps us detect their licenses.

Currently it supports a set of common licenses; other licenses can be
added later as regexes.

The script prints the licenses found on standard output per file, per
directory, and as a final aggregation of all licenses found. It also
lists files that have no license header.

Since the final license list is just an aggregation of the licenses
found in all source files, we cannot tell for sure whether a package is
dual- or multi-licensed, or whether different components are released
under different licenses. That's why we can't use the final license list
directly in our package .mk file, but it at least helps us find or
verify license information quickly.

Signed-off-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
---
 support/scripts/find-licenses | 249 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 249 insertions(+)
 create mode 100755 support/scripts/find-licenses

diff --git a/support/scripts/find-licenses b/support/scripts/find-licenses
new file mode 100755
index 0000000..e5d5fb9
--- /dev/null
+++ b/support/scripts/find-licenses
@@ -0,0 +1,249 @@
+#!/usr/bin/python
+#
+# Usage:
+# ./support/scripts/find-licenses <package-name> <package-source-dir>
+#
+# Limitations:
+# * We can only list licenses found by scanning each source file and
+# can not say if package is dual or multi-licensed.
+#
+# Author: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
+#
+# Copyright (C) 2016, Imagination Technologies
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# of the License, or (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
+
+import os
+import argparse
+import mmap
+import re
+import contextlib
+
+EXTENSIONS = (".c", ".cc", ".h", ".cpp", ".sh", ".py", ".lua")
+
+BSD_FAMILY_LICENSES = {
+    "BSD-4c": r"Redistribution and use in source and binary forms, with or without modification, "\
+              r"are permitted provided that the following conditions are met: " \
+              r"(1)?(.)?\s*Redistributions of source code must retain the above copyright "\
+              r"notice, this list of conditions and the following disclaimer. "\
+              r"(2)?(.)?\s*Redistributions in binary form must reproduce the above copyright "\
+              r"notice, this list of conditions and the following disclaimer in the documentation "\
+              r"and/or other materials provided with the distribution. "\
+              r"(3)?(.)?\s*All advertising materials mentioning features or use of this software "\
+              r"must display the following acknowledgement: "\
+              r"This product includes software developed by the.*"\
+              r"(4)?(.)?.*to endorse or promote products derived from this software without "\
+              r"specific prior written permission.",
+    "BSD-3c": r"Redistribution and use in source and binary forms, with or without modification, "\
+              r"are permitted provided that the following conditions are met: " \
+              r"(1)?(.)?\s*Redistributions of source code must retain the above copyright "\
+              r"notice, this list of conditions and the following disclaimer. "\
+              r"(2)?(.)?\s*Redistributions in binary form must reproduce the above copyright "\
+              r"notice, this list of conditions and the following disclaimer in the documentation "\
+              r"and/or other materials provided with the distribution. "\
+              r"(3)?(.)?.*to endorse or promote products derived from this software without "\
+              r"specific prior written permission.",
+    "BSD-2c": r"Redistribution and use in source and binary forms, with or without modification, "\
+              r"are permitted provided that the following conditions are met: " \
+              r"(1)?(.)?\s*Redistributions of source code must retain the above copyright "\
+              r"notice, this list of conditions and the following disclaimer. "\
+              r"(2)?(.)?\s*Redistributions in binary form must reproduce the above copyright "\
+              r"notice, this list of conditions and the following disclaimer in the documentation "\
+              r"and/or other materials provided with the distribution.",
+}
+
+GPL_FAMILY_LICENSES = {
+    "GPL": r"the GNU (Lesser|Library)?\s*General Public License(?:,)? version (2|2.1|3)(?:,)? "\
+           r"as published by the Free Software Foundation",
+    "GPL+": r"the GNU (Lesser|Library)?\s*General Public License as published by the Free "\
+            r"Software Foundation[;,] either version (2|2.1|3)\s*(?:of the License)?\s*, "\
+            r"or \(at your option\) any later version",
+}
+
+OTHER_LICENSES = {
+    "AFLv2.1": r"Academic Free License version 2.1",
+    "Public domain": r"Public domain",
+    "MIT": r"Permission is hereby granted, free of charge, to any person "\
+           r"obtaining a copy of this software and associated documentation files "\
+           r"\(the (\"|``)Software(\"|'')\), to deal in the Software without restriction, "\
+           r"including without limitation the rights to use, copy, modify, merge, "\
+           r"publish, distribute, sublicense, and/or sell copies of the Software, "\
+           r"and to permit persons to whom the Software is furnished to do so, "\
+           r"subject to the following conditions: "\
+           r"The above copyright notice and this permission notice shall be "\
+           r"included in all copies or substantial portions of the Software.",
+    "ISC": r"Permission to use, copy, modify, and/or distribute this software for any purpose "\
+           r"with or without fee is hereby granted, provided that the above copyright notice "\
+           r"and this permission notice appear in all copies.",
+    "OpenBSD": r"Permission to use, copy, modify, and distribute this software for any purpose "\
+               r"with or without fee is hereby granted, provided that the above copyright notice "\
+               r"and this permission notice appear in all copies.",
+    "Apache-2.0": r"Licensed under the Apache License, Version 2.0",
+}
+
+def search_for_licenses(string):
+    """Search for different known license headers in given string.
+
+    :param string str: string in which license headers are searched
+    :returns: list of licenses found, empty list if license header is not found
+    :rtype: list
+    """
+    license_list = []
+    data = " ".join(line for line in [line.strip() for line in string.splitlines() if line] if line)
+    # BSD family licenses have common clauses and one supersedes others.
+    # So first check for license which has more clauses.
+    for license in sorted(BSD_FAMILY_LICENSES, reverse=True):
+        found = re.search(BSD_FAMILY_LICENSES[license], data, re.MULTILINE | re.IGNORECASE)
+        if found:
+            license_list.append(license)
+            break
+    for license in GPL_FAMILY_LICENSES:
+        it = re.finditer(GPL_FAMILY_LICENSES[license], data, re.MULTILINE | re.IGNORECASE)
+        for found in it:
+            new_license = "GPLv"
+            if found.group(1):
+                new_license = "LGPLv"
+            new_license += found.group(2)
+            if license == "GPL+":
+                new_license += "+"
+            license_list.append(new_license)
+    for license in OTHER_LICENSES:
+        found = re.search(OTHER_LICENSES[license], data, re.MULTILINE | re.IGNORECASE)
+        if found:
+            license_list.append(license)
+    return license_list
+
+def get_file_licenses(path):
+    """Get list of licenses for a given file
+
+    :param path str: name of file
+    :returns: list of licenses found, empty list if license header is not found
+    :rtype: list
+    """
+    license_list = []
+    with open(path, "r") as srcfile:
+        try:
+            with contextlib.closing(mmap.mmap(srcfile.fileno(), 0, access=mmap.ACCESS_READ)) as data:
+                all_lines = ""
+                # check for single line comments
+                for line in iter(data.readline, ""):
+                    if line.startswith("//"):
+                        all_lines += line.lstrip("/")
+                    elif line.startswith("#"):
+                        all_lines += line.lstrip("#")
+                    elif line.startswith("--"):
+                        all_lines += line.lstrip("-")
+                    else:
+                        break
+
+                if all_lines:
+                    license_list = search_for_licenses(all_lines)
+
+                data.seek(0, 0)
+                # check for multiline comment block
+                pattern = re.compile(r"(/\*(.)*?\*/|--\[\[(.)*?--\]\])", re.DOTALL)
+                for match in pattern.finditer(data):
+                    license_list += search_for_licenses(match.group(0).replace("*", ""))
+        except ValueError: # if input file is empty
+            pass
+    return license_list
+
+def find_pkg_licenses(name, src_dir):
+    """Find licenses of given package
+
+    License information is printed on standard output.
+    :param name str: name of package
+    :param src_dir str: source directory of package
+    """
+    struct = get_pkg_structure(name, src_dir)
+    for root in struct[name]:
+        for src_file in struct[name][root]:
+            struct[name][root][src_file] = get_file_licenses(os.path.join(root, src_file))
+    process_pkg_license_info(name, struct)
+
+def get_pkg_structure(name, src_dir):
+    """Get suitable package structure to fill-in license information per file per directory
+
+    Package source directory is scanned for C, C++ source files and empty dictionary
+    is prepared per file per sub directory.
+    :param name str: name of package
+    :param src_dir str: source directory of package
+    :returns: Dictionary of package structure
+    :rtype: dictionary
+    """
+    struct = {}
+    struct[name] = {}
+    for root, dirs, files in os.walk(src_dir):
+        root = os.path.relpath(root, os.getcwd())
+        for src_file in files:
+            if src_file.endswith(EXTENSIONS):
+                if not root in struct[name]:
+                    struct[name][root] = {}
+                struct[name][root][src_file] = []
+    return struct
+
+def process_pkg_license_info(name, struct):
+    """Processes package license information in given structure
+
+    Aggregate license information per sub directory per package and prints license info
+    on standard output.
+    :param name str: name of package
+    :param struct dictionary: package structure with license info per file per directory
+    """
+    intermediate_license_info = {}
+    sure = False
+    files_without_license = []
+    for root in struct[name]:
+        if not root in intermediate_license_info:
+            intermediate_license_info[root] = set()
+        for src_file in struct[name][root]:
+            if not struct[name][root][src_file]:
+                files_without_license.append(os.path.join(root, src_file))
+            intermediate_license_info[root] |= set(struct[name][root][src_file])
+    final_licenses = set()
+    for root in intermediate_license_info:
+        final_licenses |= intermediate_license_info[root]
+    if len(list(final_licenses)) <= 1 and len(files_without_license) == 0:
+        sure = True
+
+    print "{}: Licenses file-wise:".format(name)
+    for root in struct[name]:
+        for src_file in struct[name][root]:
+            print "{}: {}".format(os.path.join(root, src_file), struct[name][root][src_file])
+
+    print "{}: Licenses directory-wise:".format(name)
+    for key in intermediate_license_info:
+        print "{}: {}".format(key, list(intermediate_license_info[key]))
+
+    print "{}: Can not find license header in following files: {}".\
+        format(name, len(files_without_license))
+    for src_file in files_without_license:
+        print src_file
+
+    print "{}: Surety of licenses: {}".format(name, sure)
+
+    print "{}: Final licenses: {}".format(name, len(list(final_licenses)))
+    print list(final_licenses)
+
+def main():
+    parser = argparse.ArgumentParser("Find licenses of package")
+    parser.add_argument("name", help="name of a package")
+    parser.add_argument("src_dir", help="source directory of a package")
+    args = parser.parse_args()
+    find_pkg_licenses(args.name, args.src_dir)
+
+
+if __name__ == "__main__":
+    main()
-- 
2.6.2
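The way `search_for_licenses()` derives names such as `LGPLv2.1` or
`GPLv2+` from the two GPL-family patterns can be tried in isolation.
The sketch below is a Python 3 rewrite for illustration, not the patch
itself (which targets Python 2); the two patterns are copied from
GPL_FAMILY_LICENSES, and `gpl_names` is a hypothetical name.

```python
import re

# Patterns copied from GPL_FAMILY_LICENSES in the patch.
GPL_ONLY = (r"the GNU (Lesser|Library)?\s*General Public License(?:,)? "
            r"version (2|2.1|3)(?:,)? "
            r"as published by the Free Software Foundation")
GPL_OR_LATER = (r"the GNU (Lesser|Library)?\s*General Public License as published "
                r"by the Free Software Foundation[;,] either version (2|2.1|3)"
                r"\s*(?:of the License)?\s*, "
                r"or \(at your option\) any later version")

def gpl_names(header):
    """Return names like 'GPLv2+' or 'LGPLv2.1' for each notice found in header."""
    data = " ".join(header.split())  # join wrapped lines with single spaces
    names = []
    for pattern, suffix in ((GPL_ONLY, ""), (GPL_OR_LATER, "+")):
        for found in re.finditer(pattern, data, re.IGNORECASE):
            # group 1 is set only for "Lesser"/"Library", group 2 is the version
            prefix = "LGPLv" if found.group(1) else "GPLv"
            names.append(prefix + found.group(2) + suffix)
    return names
```

Note that the dot in `2.1` is unescaped in both patterns, so it matches
any character; this is harmless for well-formed notices but worth
knowing when extending the table.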
* [Buildroot] [RFC 2/2] new make target <PKG>-find-licenses
From: Rahul Bedarkar @ 2016-08-04 14:16 UTC
To: buildroot

Add a make target that runs the find-licenses script for a given
package, using its build directory as the package source directory.

Signed-off-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
---
 Makefile               | 1 +
 package/pkg-generic.mk | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/Makefile b/Makefile
index 027f21c..24fec89 100644
--- a/Makefile
+++ b/Makefile
@@ -949,6 +949,7 @@ help:
 	@echo ' <pkg>-dirclean - Remove <pkg> build directory'
 	@echo ' <pkg>-reconfigure - Restart the build from the configure step'
 	@echo ' <pkg>-rebuild - Restart the build from the build step'
+	@echo ' <pkg>-find-licenses - Find licenses of <pkg> from sources'
 	$(foreach p,$(HELP_PACKAGES), \
 		@echo $(sep) \
 		@echo '$($(p)_NAME):' $(sep) \
diff --git a/package/pkg-generic.mk b/package/pkg-generic.mk
index 68ead3d..c4aa84b 100644
--- a/package/pkg-generic.mk
+++ b/package/pkg-generic.mk
@@ -698,6 +698,9 @@ $(1)-show-version:
 $(1)-show-depends:
 	@echo $$($(2)_FINAL_ALL_DEPENDENCIES)
 
+$(1)-find-licenses:
+	$$(TOPDIR)/support/scripts/find-licenses $(1) $$($(2)_BUILDDIR)
+
 $(1)-graph-depends: graph-depends-requirements
 	@$$(INSTALL) -d $$(GRAPHS_DIR)
 	@cd "$$(CONFIG_DIR)"; \
@@ -922,6 +925,7 @@ endif
 	$(1)-dirclean \
 	$(1)-external-deps \
 	$(1)-extract \
+	$(1)-find-licenses \
 	$(1)-graph-depends \
 	$(1)-install \
 	$(1)-install-host \
-- 
2.6.2
* [Buildroot] [RFC 0/2] script to find package licenses
From: Thomas Petazzoni @ 2016-08-04 16:33 UTC
To: buildroot

Hello,

On Thu, 4 Aug 2016 19:46:02 +0530, Rahul Bedarkar wrote:

> Legal information is a kind of thing that we can't automate completely.
> But we want it to be correct when new package is added or version bumps.
>
> This patch set attempts to add a script to find license information from
> package source files to verify or correct legal info for buildroot packages.
>
> Legal information may get outdated with version bumps or even may not get
> correct in first place if source package does not provide any license files.
> In such cases, we need to look into file header to get that information.
> But it could be very difficult if there are number of source files.
>
> find-licenses script scans package source files for known licenses to
> find under which license package is released. It aggregates license
> information for all source files found in a package.
>
> For finding license, we rely on file's license header. Generally
> most of packages use standard license headers which helps us to detect
> license of packages.
>
> Currently it supports notable licenses. But we can later add other
> licenses based on regex.
>
> Script outputs licenses found on standard output file-wise, directory-
> wise and final aggregation of all licenses found. It also lists files
> which don't have license header. Directory-wise license listing will be
> useful when different components are licensed under different license.
>
> Since final license list is just aggregation of licenses found for all
> source files, we can not surely say if package is dual or
> multi-licensed or different components are licensed under different
> license. That's why we can't use final license list directly in our
> package .mk file, but it at least helps us to find or verify license
> information quickly.

Thanks for this proposal. However, there are already some tools that do
the same thing I believe. I'm thinking especially of the tools used by
the Fossology project (https://www.fossology.org/). It is surely more
complicated to install and use than your Python script, but it is also
a lot more complete, and even more importantly: maintained by other
people.

Best regards,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
* [Buildroot] [RFC 0/2] script to find package licenses
From: Khem Raj @ 2016-08-05 2:03 UTC
To: buildroot

On 8/4/16 9:33 AM, Thomas Petazzoni wrote:
> Hello,
>
> On Thu, 4 Aug 2016 19:46:02 +0530, Rahul Bedarkar wrote:
>
>> Legal information is a kind of thing that we can't automate completely.
>> But we want it to be correct when new package is added or version bumps.
>>
>> This patch set attempts to add a script to find license information from
>> package source files to verify or correct legal info for buildroot packages.
>>
>> Legal information may get outdated with version bumps or even may not get
>> correct in first place if source package does not provide any license files.
>> In such cases, we need to look into file header to get that information.
>> But it could be very difficult if there are number of source files.
>>
>> find-licenses script scans package source files for known licenses to
>> find under which license package is released. It aggregates license
>> information for all source files found in a package.
>>
>> For finding license, we rely on file's license header. Generally
>> most of packages use standard license headers which helps us to detect
>> license of packages.
>>
>> Currently it supports notable licenses. But we can later add other
>> licenses based on regex.
>>
>> Script outputs licenses found on standard output file-wise, directory-
>> wise and final aggregation of all licenses found. It also lists files
>> which don't have license header. Directory-wise license listing will be
>> useful when different components are licensed under different license.
>>
>> Since final license list is just aggregation of licenses found for all
>> source files, we can not surely say if package is dual or
>> multi-licensed or different components are licensed under different
>> license. That's why we can't use final license list directly in our
>> package .mk file, but it at least helps us to find or verify license
>> information quickly.
>
> Thanks for this proposal. However, there are already some tools that do
> the same thing I believe. I'm thinking especially of the tools used by
> the Fossology project (https://www.fossology.org/). It is surely more
> complicated to install and use than your Python script, but it is also
> a lot more complete, and even more importantly: maintained by other
> people.

And SPDX. Something like this

https://spdx.org/tools/community/fossologyspdx

would be quite apt.

>
> Best regards,
>
> Thomas
>
* [Buildroot] [RFC 0/2] script to find package licenses
From: Rahul Bedarkar @ 2016-08-05 7:42 UTC
To: buildroot

Hi Thomas,

On Thursday 04 August 2016 10:03 PM, Thomas Petazzoni wrote:
> Hello,
>
> Thanks for this proposal. However, there are already some tools that do
> the same thing I believe. I'm thinking especially of the tools used by
> the Fossology project (https://www.fossology.org/). It is surely more
> complicated to install and use than your Python script, but it is also
> a lot more complete, and even more importantly: maintained by other
> people.
>

The intention of the script is to help us verify or correct the legal
info that we add in the .mk file. It could be a handy tool for anyone
doing a version bump or adding a new package. The complex tools that are
available are generally used by upstream package providers for Open
Source compliance and provide a lot more information than just file
licenses, and integrating such tools in Buildroot might be difficult.
But in Buildroot, where we just need the license of a package, the
script could be useful as a starting point.

Regards,
Rahul
* [Buildroot] [RFC 0/2] script to find package licenses
From: Thomas Petazzoni @ 2016-08-05 7:53 UTC
To: buildroot

Hello,

On Fri, 5 Aug 2016 13:12:49 +0530, Rahul Bedarkar wrote:

> Intention of script is to help us to verify or correct legal info that
> we add in .mk file. This could be a handy tool that can be used by
> anyone when we do version bump or add new package. The complex tools
> that are available are generally used by upstream package providers for
> Open Source Compliance which provide lot more information than just file
> license. And integrating such tools in Buildroot might be difficult. But
> in Buildroot where we just need license of a package, script could be
> useful as a starting point.

I'm sorry, but I still don't see why we should merge a script that we
would have to maintain, while there are some existing, actively
developed and more powerful tools doing the same work.

Moreover, I believe that the cases that can be detected automatically
by a script (such as a clear GPL, LGPL, BSD or MIT license) are clearly
not the ones for which it is difficult to write the <pkg>_LICENSE
string.

The ones for which it is difficult are the ones that a script will
never handle as it can't recognize any pattern.

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com
* [Buildroot] [RFC 0/2] script to find package licenses
From: Yann E. MORIN @ 2016-08-08 17:42 UTC
To: buildroot

Rahul, All,

On 2016-08-05 09:53 +0200, Thomas Petazzoni spake thusly:
> On Fri, 5 Aug 2016 13:12:49 +0530, Rahul Bedarkar wrote:
> > Intention of script is to help us to verify or correct legal info that
> > we add in .mk file. This could be a handy tool that can be used by
> > anyone when we do version bump or add new package. The complex tools
> > that are available are generally used by upstream package providers for
> > Open Source Compliance which provide lot more information than just file
> > license. And integrating such tools in Buildroot might be difficult. But
> > in Buildroot where we just need license of a package, script could be
> > useful as a starting point.
>
> I'm sorry, but I still don't see why we should merge a script that we
> would have to maintain, while there are some existing, actively
> developed and more powerful tools doing the same work.
>
> Moreover, I believe that the cases that can be detected automatically
> by a script (such as a clear GPL, LGPL, BSD or MIT license) are clearly
> not the ones for which it is difficult to write the <pkg>_LICENSE
> string.
>
> The ones for which it is difficult are the ones that a script will
> never handle as it can't recognize any pattern.

I concur with Thomas here. The obvious licenses we can find pretty
easily, so those are not the ones we must look for. On the other hand,
the ones for which we would need an automated solution are not easy to
find automatically. Hence this is a catch-22 situation.

However, I think we could rely on an external solution to find licenses.
For example, Fossology and SPDX have both been mentioned already. It
would be nice to see how we could interface to either to get a list of
potential licenses for a package.

AFAICS, SPDX does not provide a means to extract free-form licensing in
source code; the licensing information has to be specially encoded with
specific headers. If that is the case, then we could use the SPDX
scripts to extract SPDX licensing information.

As for Fossology, they have a publicly-available instance, but it is
only meant as a test-bed; it is neither supposed to be always available
nor supposed to be reliable. One can install Fossology locally, but I
haven't seen where one may download the database.

All in all, if we were to add support for automatically extracting
licensing information from a package's source code, I firmly believe
this should be done with existing tools, not ones we invent ourselves.

I'll be marking those two patches as rejected in our patchwork. However,
we would *really* welcome a similar addition that would make use of
existing infrastructures like SPDX or Fossology (or others).

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'
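For reference, the "specially encoded" headers mentioned above usually
take the form of a one-line `SPDX-License-Identifier:` tag in a comment.
The Python 3 sketch below only illustrates how such tags could be
collected from file contents; it is not the SPDX project's tooling, and
`spdx_identifiers` is a hypothetical name.

```python
import re

# Everything after the tag up to the end of the line is the expression.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")

def spdx_identifiers(text):
    """Return the license expressions of all SPDX tags found in text."""
    found = []
    for match in SPDX_TAG.finditer(text):
        expr = match.group(1).strip()
        # drop a trailing C comment terminator, if any
        if expr.endswith("*/"):
            expr = expr[:-2].strip()
        found.append(expr)
    return found
```

A file without any such tag yields an empty list, which is exactly the
free-form case that would still need a header-matching tool.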