Openembedded Core Discussions
From: Peter Kjellerstedt <peter.kjellerstedt@axis.com>
To: Saul Wold <Saul.Wold@windriver.com>,
	"openembedded-core@lists.openembedded.org"
	<openembedded-core@lists.openembedded.org>,
	"JPEWhacker@gmail.com" <JPEWhacker@gmail.com>
Subject: RE: [OE-core] [PATCH] create-spdx: Get SPDX-License-Identifier from source
Date: Wed, 2 Feb 2022 11:32:18 +0000
Message-ID: <f6d4658eae34413b840508a184e21f05@axis.com>
In-Reply-To: <4fbd26c0-e5f8-e9f1-5933-4e2c0d7c3da6@windriver.com>

> -----Original Message-----
> From: Saul Wold <Saul.Wold@windriver.com>
> Sent: den 2 februari 2022 05:07
> To: Peter Kjellerstedt <peter.kjellerstedt@axis.com>;
> openembedded-core@lists.openembedded.org; JPEWhacker@gmail.com
> Subject: Re: [OE-core] [PATCH] create-spdx: Get SPDX-License-Identifier from source
> 
> On 2/1/22 19:21, Peter Kjellerstedt wrote:
> >> -----Original Message-----
> >> From: openembedded-core@lists.openembedded.org <openembedded-core@lists.openembedded.org> On Behalf Of Saul Wold
> >> Sent: den 2 februari 2022 01:02
> >> To: openembedded-core@lists.openembedded.org; JPEWhacker@gmail.com
> >> Cc: Saul Wold <saul.wold@windriver.com>
> >> Subject: [OE-core] [PATCH] create-spdx: Get SPDX-License-Identifier from source
> >>
> >> This patch will read the beginning of source files and try to find
> >> the SPDX-License-Identifier to populate the licenseInfoInFiles
> >> field for each source file. This does not populate licenseConculed
> >
> > I assume that should be "licenseConcluded".
> 
> Well that depends on if "we" want to take some "ownership" of the
> conclusion as the "preparer".  How would we handle the case of 2
> SPDX-License-Identifiers tags in a file, is it an "AND" or an "OR"?
> Simple example.
> 
> The description of licenseConcluded is:
> 
> "License expression for licenseConcluded.  The licensing that the
> preparer of this SPDX document has concluded, based on the evidence,
> actually applies to the package."
> 
> At some point we might be able to fill in that field, but for now I think
> we leave it as NOASSERTION.
> 
> Sau!

Sorry, you misunderstood. Since I do not know the specification, I could 
only assume that the field you intended to refer to was actually named 
"licenseConcluded" rather than "licenseConculed".

//Peter

> >> at this time, nor rolls it up to package level.
> >>
> >> We read as binary to since some source code seem to have some
> >
> > to -> too
> >
> >> binary characters, the license is then converted to ascii strings.
> >>
> >> Signed-off-by: Saul Wold <saul.wold@windriver.com>
> >> ---
> >> Merge after Joshua's patch (spdx: Add set helper for list properties)
> >> merges
> >>
> >>   meta/classes/create-spdx.bbclass | 23 +++++++++++++++++++++++
> >>   1 file changed, 23 insertions(+)
> >>
> >> diff --git a/meta/classes/create-spdx.bbclass b/meta/classes/create-spdx.bbclass
> >> index 8b4203fdb5d..588489cc2b0 100644
> >> --- a/meta/classes/create-spdx.bbclass
> >> +++ b/meta/classes/create-spdx.bbclass
> >> @@ -37,6 +37,24 @@ SPDX_SUPPLIER[doc] = "The SPDX PackageSupplier field for SPDX packages created f
> >>
> >>   do_image_complete[depends] = "virtual/kernel:do_create_spdx"
> >>
> >> +def extract_licenses(filename):
> >> +    import re
> >> +    import oe.spdx
> >
> > You do not use oe.spdx in this function.
> >
> >> +
> >> +    lic_regex = re.compile(b'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)[ |\n|\r\n]*?')
> >
> > I assume you meant:
> >
> >      lic_regex = re.compile(b'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)(?: |\n|\r\n)*?')
> >
> > Not that it really matters though, as it will yield the same result as:
> >
> >      lic_regex = re.compile(b'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)')
> >
> > However, neither of the expressions above will correctly match all the
> > SPDX-License-Identifier examples at https://spdx.dev/ids/#how.
> >
> > Use this instead:
> >
> >      lic_regex = re.compile(b'^\W*SPDX-License-Identifier:\s*([ \w\d.()+-]+?)(?:\s+\W*)?$', re.MULTILINE)
> >
> >> +
> >> +    try:
> >> +        with open(filename, 'rb') as f:
> >> +            size = min(15000, os.stat(filename).st_size)
> >> +            txt = f.read(size)
> >> +            licenses = re.findall(lic_regex, txt)
> >> +            if licenses:
> >> +                ascii_licenses = [lic.decode('ascii') for lic in licenses]
> >> +                return ascii_licenses
> >> +    except Exception as e:
> >> +        bb.warn(f"Exception reading {filename}: {e}")
> >> +    return None
> >> +
> >>   def get_doc_namespace(d, doc):
> >>       import uuid
> >>       namespace_uuid = uuid.uuid5(uuid.NAMESPACE_DNS, d.getVar("SPDX_UUID_NAMESPACE"))
> >> @@ -232,6 +250,11 @@ def add_package_files(d, doc, spdx_pkg, topdir, get_spdxid, get_types, *, archiv
> >>                           checksumValue=bb.utils.sha256_file(filepath),
> >>                       ))
> >>
> >> +                if "SOURCE" in spdx_file.fileTypes:
> >> +                    extracted_lics = extract_licenses(filepath)
> >> +                    if extracted_lics:
> >> +                        spdx_file.licenseInfoInFiles = extracted_lics
> >> +
> >>                   doc.files.append(spdx_file)
> >>                   doc.add_relationship(spdx_pkg, "CONTAINS", spdx_file)
> >>                   spdx_pkg.hasFiles.append(spdx_file.SPDXID)
> >> --
> >> 2.31.1
> >
> > //Peter
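
To make the difference between the patch's regular expression and the one
suggested in the review concrete, here is a small standalone sketch that runs
both over a few comment styles. The sample lines are made up for illustration
and are not copied from the SPDX site; the names patch_regex, review_regex and
samples are hypothetical. Raw bytes literals (rb'...') are used instead of the
plain b'...' in the patch; the patterns should match the same text.

```python
import re

# Pattern from the original patch (as quoted in this thread).
patch_regex = re.compile(
    rb'SPDX-License-Identifier:\s+([-A-Za-z\d. ]+)[ |\n|\r\n]*?')

# Pattern suggested in the review (as quoted in this thread).
review_regex = re.compile(
    rb'^\W*SPDX-License-Identifier:\s*([ \w\d.()+-]+?)(?:\s+\W*)?$',
    re.MULTILINE)

# Illustrative input only: four comment styles, one identifier per line.
samples = (
    b"// SPDX-License-Identifier: MIT\n"
    b"/* SPDX-License-Identifier: MIT OR Apache-2.0 */\n"
    b"# SPDX-License-Identifier: GPL-2.0+\n"
    b"# SPDX-License-Identifier: (GPL-2.0-only OR MIT)\n"
)

patch_hits = patch_regex.findall(samples)
review_hits = review_regex.findall(samples)

# The patch pattern keeps the space before "*/", drops the "+" suffix,
# and misses the parenthesized expression entirely.
print(patch_hits)

# The review pattern returns all four expressions without trailing
# whitespace or comment terminators.
print(review_hits)
```

With this input the patch pattern finds three of the four identifiers, two of
them in a slightly wrong form, while the review pattern finds all four.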
> >
> 
> --
> Sau!
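
For reference, the approach described in the commit message (read the start of
a file as binary, find the identifiers, convert them to ASCII strings) can be
sketched outside of BitBake roughly as below. This is only a sketch under
assumptions, not the code that was merged: extract_license_ids and
MAX_SCAN_BYTES are hypothetical names, print() stands in for bb.warn(), and an
empty list is returned instead of None. The 15000 byte limit is the one used in
the patch.

```python
import os
import re
import tempfile

# Pattern suggested in the review; with a bytes pattern \w only matches
# ASCII characters, so every captured group can be decoded as ASCII.
LIC_REGEX = re.compile(
    rb'^\W*SPDX-License-Identifier:\s*([ \w\d.()+-]+?)(?:\s+\W*)?$',
    re.MULTILINE)

MAX_SCAN_BYTES = 15000  # same limit as in the patch


def extract_license_ids(filename, limit=MAX_SCAN_BYTES):
    """Return the SPDX license expressions found near the top of a file."""
    try:
        with open(filename, 'rb') as f:
            # read() returns at most `limit` bytes, so the file size
            # does not have to be looked up first.
            head = f.read(limit)
    except OSError as e:
        # Stand-in for bb.warn() in this standalone sketch.
        print(f"Could not read {filename}: {e}")
        return []
    return [lic.decode('ascii') for lic in LIC_REGEX.findall(head)]


# Usage: a temporary source file that also contains non-ASCII bytes.
with tempfile.NamedTemporaryFile('wb', suffix='.c', delete=False) as tmp:
    tmp.write(b'/* SPDX-License-Identifier: GPL-2.0-only */\n')
    tmp.write(b'static const char blob[] = "\xff\xfe";\n')
    path = tmp.name
try:
    ids = extract_license_ids(path)
finally:
    os.unlink(path)
print(ids)
```

Reading in binary mode means the non-ASCII bytes in the second line cannot
cause a decoding error; only the matched identifier is decoded.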


Thread overview: 4+ messages
2022-02-02  0:01 [PATCH] create-spdx: Get SPDX-License-Identifier from source Saul Wold
2022-02-02  3:21 ` [OE-core] " Peter Kjellerstedt
2022-02-02  4:07   ` Saul Wold
2022-02-02 11:32     ` Peter Kjellerstedt [this message]
