Openembedded Core Discussions
From: <Mikko.Rapeli@bmw.de>
To: <dl9pf@gmx.de>
Cc: <openembedded-core@lists.openembedded.org>,
	<jpew.hacker@gmail.com>, <scott.murray@konsulko.com>
Subject: Re: [OE-core] [PATCH v2] create-spdx: Get SPDX-License-Identifier from source
Date: Tue, 8 Feb 2022 13:35:01 +0000
Message-ID: <YgJxhJCQZLrkpgON@korppu>
In-Reply-To: <2518421.NRruQZ00Rg@monster>

Hi,

On Tue, Feb 08, 2022 at 02:19:51PM +0100, Jan-Simon Möller wrote:
> Hi all
> 
> > > Can you give an overview of what meta-spdxscanner does? I'm not quite
> > > clear what extra processing would be required here.
> >
> > Jan-Simon can talk to it better, as he's done some dev work on the layer
> > and done tests with it against AGL (and the subsequent Fossology instance
> > experimentation), but AFAIK for the actual scanning scancode-toolkit
> > does pattern matching based license detection, so in theory it'll catch
> > excerpts of or slightly modified versions of the licenses in its
> > database, as opposed to just searching for SPDX-License-Identifier
> > declarations.  If everyone else is happy with the latter, I'm willing to
> > believe I'm offbase in my concerns, but either way I do think the
> > limitations are going to need to be documented so users (and their
> > lawyers) are aware of them.
> 
> TLDR: meta-spdxscanner integrates with scanning tools. Either with fossology
> or scancode-tk. An upload to blackduck is also possible meanwhile.
> 
> Let's focus on fossology and scancode-tk.
> 
> a) fossology
> 
> Here we essentially integrate in the task chain and archive the sources after
> patching to upload them to a fossology instance. All the scanning/processing
> happens then on the server and after some time (a lot ! ;) ) we get a SPDX
> report back that we store alongside the package. This is a result of a scan,
> so it might catch licenses of files deep in the source tree that may not be
> declared in the recipe and so on.
> 
> Also, fossology then offers a web interface for manual inspection and review.
> So this is a thorough but quite manual process. More for release work than
> daily or occasional stuff.
> 
> 
> b) scancode-tk
> scancode on the contrary will run on your host during the build and gather the
> data.  It will write the spdx file out as well.
> 
> 
> I think for us the interesting part would be to compare e.g. the scancode-tk
> scan from b) with what we have declared in the recipe.

I guess reports from both will be a superset of the licenses actually used (and
possibly of the copyright statements too), since neither tool knows which source
files are actually compiled.

Currently, the sources of recipes that carry multiple licenses, including problematic
ones, are not cleaned up before the license compliance scan. E.g. GPLv3-licensed source
code is not deleted at do_patch() time. Thus the reports need to be adjusted manually.
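
To illustrate the kind of comparison discussed above, here is a minimal sketch
in Python. It assumes the scanner result has already been reduced to a set of
SPDX identifiers and that the recipe LICENSE value uses the usual "&", "|" and
parentheses. The function names are hypothetical and not part of create-spdx,
meta-spdxscanner or scancode-toolkit.

```python
import re


def declared_licenses(license_field):
    """Split a recipe LICENSE expression into a set of license names.

    Operators ("&", "|") and parentheses are dropped, so the result only
    says which licenses are mentioned, not how they are combined.
    """
    return {tok for tok in re.split(r"[&|()\s]+", license_field) if tok}


def undeclared_licenses(license_field, detected):
    """Return the detected licenses that the recipe does not declare.

    `detected` is assumed to be an iterable of identifiers taken from a
    scan of the whole source tree, so it may include files that are never
    compiled. The result is a list of candidates for manual review.
    """
    return sorted(set(detected) - declared_licenses(license_field))
```

For example, with LICENSE = "MIT & (GPL-2.0-only | BSD-3-Clause)" and a scan
that found MIT, BSD-3-Clause and GPL-3.0-or-later, undeclared_licenses() would
return only GPL-3.0-or-later, which someone then has to check by hand.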

Cheers,

-Mikko

Thread overview: 9+ messages
2022-02-07 19:29 [PATCH v2] create-spdx: Get SPDX-License-Identifier from source Saul Wold
2022-02-07 20:33 ` [OE-core] " Scott Murray
2022-02-07 20:35   ` Joshua Watt
2022-02-07 20:59     ` Scott Murray
2022-02-08 12:50       ` Robert Berger
2022-02-08 13:19       ` Jan-Simon Moeller
2022-02-08 13:35         ` Mikko.Rapeli [this message]
2022-02-08 13:56           ` Jan-Simon Moeller
2022-02-08 14:16         ` Joshua Watt
