From: stondo@gmail.com
To: openembedded-core@lists.openembedded.org
Cc: JPEWhacker@gmail.com, Stefano Tondo <stefano.tondo.ext@siemens.com>
Subject: [OE-core][PATCH v9 4/7] spdx30: Enrich source downloads with version and PURL
Date: Thu, 12 Mar 2026 16:38:42 +0100
Message-ID: <20260312153845.164369-5-stondo@gmail.com>
In-Reply-To: <20260312153845.164369-1-stondo@gmail.com>
From: Stefano Tondo <stefano.tondo.ext@siemens.com>

Add version extraction, PURL generation, and external references
to source download packages in SPDX 3.0 SBOMs:

- Extract version from SRCREV for Git sources (full SHA-1)
- Generate PURLs for Git sources on github.com by default
- Support custom mappings via SPDX_GIT_PURL_MAPPINGS variable
  (format: "domain:purl_type", split(':', 1) for parsing)
- Use ecosystem PURLs from SPDX_PACKAGE_URLS for non-Git
  sources
- Add VCS external references for Git downloads
- Add distribution external references for tarball downloads
- Parse Git URLs using urllib.parse
- Extract logic into _generate_git_purl() and
  _enrich_source_package() helpers

For non-Git sources, version is not set from PV since the recipe
version does not necessarily reflect the version of individual
downloaded files. Ecosystem PURLs (which include version) from
SPDX_PACKAGE_URLS are still used when available.

The SPDX_GIT_PURL_MAPPINGS variable allows configuring PURL
generation for self-hosted Git services (e.g., GitLab).
github.com is always mapped to pkg:github by default.

Signed-off-by: Stefano Tondo <stefano.tondo.ext@siemens.com>
---
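Note (not part of the commit message): the following is a minimal,
standalone sketch of how a "domain:purl_type" mapping string is meant to
turn into a PURL. It assumes a download location of the form
git+https://host/owner/repo.git@rev. The helper names parse_mappings()
and git_purl() are hypothetical and do not exist in oe.spdx30_tasks.
Unlike the patch, the sketch silently skips malformed entries instead of
calling bb.warn().

```python
import re
import urllib.parse


def parse_mappings(value):
    """Parse space-separated 'domain:purl_type' entries (hypothetical helper)."""
    # github.com is always present, mirroring the default described above.
    handlers = {"github.com": "pkg:github"}
    for entry in value.split():
        # Split on the first ':' only, since the PURL type itself contains ':'.
        domain, sep, purl_type = entry.partition(":")
        if sep and domain and purl_type:
            handlers[domain] = purl_type
    return handlers


def git_purl(download_location, srcrev, handlers):
    """Build '<purl_type>/<owner>/<repo>@<srcrev>' or return None."""
    if not download_location.startswith("git+"):
        return None
    parsed = urllib.parse.urlparse(download_location[4:])
    purl_type = handlers.get(parsed.hostname)
    if purl_type is None:
        return None
    parts = parsed.path.strip("/").split("/")
    if len(parts) < 2:
        return None
    # Assumption: the location may carry an '@rev' suffix; drop it, then
    # strip a trailing '.git' from the repository name.
    repo = re.sub(r"\.git$", "", parts[1].split("@", 1)[0])
    return f"{purl_type}/{parts[0]}/{repo}@{srcrev}"


handlers = parse_mappings("gitlab.example.com:pkg:gitlab")
print(git_purl("git+https://gitlab.example.com/grp/proj.git", "0123abc", handlers))
```

Splitting on the first colon only is what lets a value such as
"gitlab.example.com:pkg:gitlab" keep "pkg:gitlab" intact as the PURL type.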
meta/classes/create-spdx-3.0.bbclass | 7 ++
meta/lib/oe/spdx30_tasks.py | 117 +++++++++++++++++++++++++++
2 files changed, 124 insertions(+)
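
Note (not part of the patch): a small sketch of the external reference
selection described in the commit message, using plain strings in place
of the oe.spdx30.ExternalRefType members so that it runs outside
BitBake. The function name classify_external_ref() is hypothetical.

```python
def classify_external_ref(download_location):
    """Return (ref_type, locator) for a download location, or None."""
    if not isinstance(download_location, str):
        return None
    if download_location.startswith("git+"):
        # VCS reference: drop the 'git+' prefix and anything after '@'.
        url = download_location[4:]
        return ("vcs", url.split("@")[0])
    if download_location.startswith(("http://", "https://", "ftp://")):
        # Distribution reference for tarball/archive downloads.
        return ("altDownloadLocation", download_location)
    return None


print(classify_external_ref("git+https://github.com/owner/repo.git@0123abc"))
print(classify_external_ref("https://example.com/foo-1.0.tar.gz"))
```

Locations that match neither branch (for example file:// URLs) get no
external reference in this sketch.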
diff --git a/meta/classes/create-spdx-3.0.bbclass b/meta/classes/create-spdx-3.0.bbclass
index def2dacbc3..9e912b34e1 100644
--- a/meta/classes/create-spdx-3.0.bbclass
+++ b/meta/classes/create-spdx-3.0.bbclass
@@ -152,6 +152,13 @@ SPDX_PACKAGE_URLS[doc] = "A space separated list of Package URLs (purls) for \
Override this variable to replace the default, otherwise append or prepend \
to add additional purls."
+SPDX_GIT_PURL_MAPPINGS ??= ""
+SPDX_GIT_PURL_MAPPINGS[doc] = "A space separated list of domain:purl_type \
+ mappings to configure PURL generation for Git source downloads. \
+ For example, "gitlab.example.com:pkg:gitlab" maps repositories hosted \
+ on gitlab.example.com to the pkg:gitlab PURL type. \
+ github.com is always mapped to pkg:github by default."
+
IMAGE_CLASSES:append = " create-spdx-image-3.0"
SDK_CLASSES += "create-spdx-sdk-3.0"
diff --git a/meta/lib/oe/spdx30_tasks.py b/meta/lib/oe/spdx30_tasks.py
index 8aaafea616..5639137520 100644
--- a/meta/lib/oe/spdx30_tasks.py
+++ b/meta/lib/oe/spdx30_tasks.py
@@ -14,6 +14,7 @@ import oe.spdx_common
import oe.sdk
import os
import re
+import urllib.parse
from contextlib import contextmanager
from datetime import datetime, timezone
@@ -378,6 +379,120 @@ def collect_dep_sources(dep_objsets, dest):
index_sources_by_hash(e.to, dest)
+def _generate_git_purl(d, download_location, srcrev):
+    """Generate a Package URL for a Git source from its download location.
+
+    Parses the Git URL to identify the hosting service and generates the
+    appropriate PURL type. Supports github.com by default and custom
+    mappings via SPDX_GIT_PURL_MAPPINGS.
+
+    Returns the PURL string or None if no mapping matches.
+    """
+    if not download_location or not download_location.startswith('git+'):
+        return None
+
+    git_url = download_location[4:]  # Remove 'git+' prefix
+
+    # Default handler: github.com
+    git_purl_handlers = {
+        'github.com': 'pkg:github',
+    }
+
+    # Custom PURL mappings from SPDX_GIT_PURL_MAPPINGS
+    # Format: "domain1:purl_type1 domain2:purl_type2"
+    custom_mappings = d.getVar('SPDX_GIT_PURL_MAPPINGS')
+    if custom_mappings:
+        for mapping in custom_mappings.split():
+            parts = mapping.split(':', 1)
+            if len(parts) == 2:
+                git_purl_handlers[parts[0]] = parts[1]
+                bb.debug(2, f"Added custom Git PURL mapping: {parts[0]} -> {parts[1]}")
+            else:
+                bb.warn(f"Invalid SPDX_GIT_PURL_MAPPINGS entry: {mapping} (expected format: domain:purl_type)")
+
+    try:
+        parsed = urllib.parse.urlparse(git_url)
+    except Exception:
+        return None
+
+    hostname = parsed.hostname
+    if not hostname:
+        return None
+
+    for domain, purl_type in git_purl_handlers.items():
+        if hostname == domain:
+            path = parsed.path.strip('/')
+            path_parts = path.split('/')
+            if len(path_parts) >= 2:
+                owner = path_parts[0]
+                repo = re.sub(r'\.git$', '', path_parts[1].split('@', 1)[0])  # drop '@rev', then a trailing '.git'
+                return f"{purl_type}/{owner}/{repo}@{srcrev}"
+            break
+
+    return None
+
+
+def _enrich_source_package(d, dl, fd, file_name, primary_purpose):
+    """Enrich a source download package with version, PURL, and external refs.
+
+    Extracts version from SRCREV for Git sources, generates PURLs for
+    known hosting services, and adds external references for VCS
+    and distribution URLs.
+    """
+    version = None
+    purl = None
+
+    if fd.type == "git":
+        # Use full SHA-1 from fd.revision
+        srcrev = getattr(fd, 'revision', None)
+        if srcrev and srcrev not in {'${AUTOREV}', 'AUTOINC', 'INVALID'}:
+            version = srcrev
+
+        # Generate PURL for Git hosting services
+        download_location = getattr(dl, 'software_downloadLocation', None)
+        if version and download_location:
+            purl = _generate_git_purl(d, download_location, version)
+    else:
+        # Use ecosystem PURL from SPDX_PACKAGE_URLS if available
+        package_urls = (d.getVar('SPDX_PACKAGE_URLS') or '').split()
+        for url in package_urls:
+            if not url.startswith('pkg:yocto'):
+                purl = url
+                break
+
+    if version:
+        dl.software_packageVersion = version
+
+    if purl:
+        dl.software_packageUrl = purl
+
+    # Add external references
+    download_location = getattr(dl, 'software_downloadLocation', None)
+    if download_location and isinstance(download_location, str):
+        dl.externalRef = dl.externalRef or []
+
+        if download_location.startswith('git+'):
+            # VCS reference for Git repositories
+            git_url = download_location[4:]
+            if '@' in git_url:
+                git_url = git_url.split('@')[0]
+
+            dl.externalRef.append(
+                oe.spdx30.ExternalRef(
+                    externalRefType=oe.spdx30.ExternalRefType.vcs,
+                    locator=[git_url],
+                )
+            )
+        elif download_location.startswith(('http://', 'https://', 'ftp://')):
+            # Distribution reference for tarball/archive downloads
+            dl.externalRef.append(
+                oe.spdx30.ExternalRef(
+                    externalRefType=oe.spdx30.ExternalRefType.altDownloadLocation,
+                    locator=[download_location],
+                )
+            )
+
+
def add_download_files(d, objset):
inputs = set()
@@ -441,6 +556,8 @@ def add_download_files(d, objset):
)
)
+ _enrich_source_package(d, dl, fd, file_name, primary_purpose)
+
if fd.method.supports_checksum(fd):
# TODO Need something better than hard coding this
for checksum_id in ["sha256", "sha1"]:
--
2.53.0