public inbox for openembedded-core@lists.openembedded.org
From: Stefano Tondo <stondo@gmail.com>
To: openembedded-core@lists.openembedded.org
Cc: stefano.tondo.ext@siemens.com, adrian.freihofer@siemens.com,
	Peter.Marko@siemens.com, jpewhacker@gmail.com,
	Ross.Burton@arm.com
Subject: [PATCH v2 03/18] spdx30: Add ecosystem-specific PURL generation
Date: Sat, 21 Feb 2026 06:09:51 +0100
Message-ID: <20260221051006.335141-4-stondo@gmail.com>
In-Reply-To: <20260221051006.335141-1-stondo@gmail.com>

From: Stefano Tondo <stefano.tondo.ext@siemens.com>

Add a function that identifies ecosystem-specific PURLs (cargo, golang,
pypi, npm, cpan, nuget, maven) for dependency packages, working alongside
oe.purl.get_base_purl() which provides pkg:yocto PURLs.

Key design decision: the function never returns a pkg:generic fallback.
This ensures:
- No overlap with the base pkg:yocto generation
- Packages get both PURLs: pkg:yocto/layer/pkg@ver and pkg:cargo/pkg@ver
- Compliance tools can trace a package by either identifier

Detects ecosystems via:
- Unambiguous file extensions (.crate for Rust)
- Go module archive names that start with a known hosting domain
- Recipe inheritance (pypi, npm, cpan, nuget/dotnet, maven classes)
- BitBake variables (GO_IMPORT, PYPI_PACKAGE, MAVEN_GROUP_ID,
  MAVEN_ARTIFACT_ID)

Signed-off-by: Stefano Tondo <stefano.tondo.ext@siemens.com>
---
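Illustration only, not part of the commit: a minimal standalone sketch of
the two detection steps that need no BitBake datastore, namely .crate file
name parsing and PyPI name normalization. The regular expressions are copied
from the patch; the helper names crate_purl() and pypi_purl() are
hypothetical and do not exist in meta/lib/oe.

```python
import re

# Same pattern as the .crate branch of extract_dependency_metadata().
CRATE_RE = re.compile(
    r'^(.+?)-(\d+\.\d+\.\d+(?:\.\d+)?(?:[-+][\w.]+)?)\.crate$'
)


def crate_purl(file_name):
    """Return a pkg:cargo PURL for a .crate file name, or None."""
    match = CRATE_RE.match(file_name)
    if not match:
        return None
    return f"pkg:cargo/{match.group(1)}@{match.group(2)}"


def pypi_purl(pypi_package, version):
    """Return a pkg:pypi PURL with the name normalized per PEP 503."""
    # Runs of '-', '_' and '.' collapse to a single '-', then lowercase.
    name = re.sub(r"[-_.]+", "-", pypi_package).lower()
    return f"pkg:pypi/{name}@{version}"


print(crate_purl("proc-macro2-1.0.70.crate"))   # pkg:cargo/proc-macro2@1.0.70
print(crate_purl("zlib-1.3.tar.gz"))            # None
print(pypi_purl("Typing_Extensions", "4.9.0"))  # pkg:pypi/typing-extensions@4.9.0
```

The lazy name group is what lets crate names that contain hyphens or digits
(proc-macro2) split at the last hyphen that is followed by a full version.
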
 meta/lib/oe/spdx30_tasks.py | 113 ++++++++++++++++++++++++++++++++++++
 1 file changed, 113 insertions(+)

diff --git a/meta/lib/oe/spdx30_tasks.py b/meta/lib/oe/spdx30_tasks.py
index 789b39bd93..0ee39ffcd5 100644
--- a/meta/lib/oe/spdx30_tasks.py
+++ b/meta/lib/oe/spdx30_tasks.py
@@ -13,12 +13,125 @@ import oe.spdx30
 import oe.spdx_common
 import oe.sdk
 import os
+import re
 
 from contextlib import contextmanager
 from datetime import datetime, timezone
 from pathlib import Path
 
 
+
+def extract_dependency_metadata(d, file_name):
+    """Extract ecosystem-specific PURL for dependency packages.
+
+    Uses recipe metadata to identify ecosystem PURLs (cargo, golang, pypi,
+    npm, cpan, nuget, maven). Returns (version, purl); purl is None when no
+    ecosystem matches (never pkg:generic; pkg:yocto is get_base_purl()'s job).
+    """
+
+    pv = d.getVar("PV")
+    version = pv if pv else None
+    purl = None
+
+    # Rust crate (.crate extension is unambiguous)
+    if file_name.endswith('.crate'):
+        crate_match = re.match(r'^(.+?)-(\d+\.\d+\.\d+(?:\.\d+)?(?:[-+][\w.]+)?)\.crate$', file_name)
+        if crate_match:
+            name = crate_match.group(1)
+            version = crate_match.group(2)
+            purl = f"pkg:cargo/{name}@{version}"
+            return (version, purl)
+
+    # Go module via GO_IMPORT variable
+    go_import = d.getVar("GO_IMPORT")
+    if go_import and version:
+        purl = f"pkg:golang/{go_import}@{version}"
+        return (version, purl)
+
+    # Go module from filename with explicit hosting domain
+    go_match = re.match(
+        r'^((?:github|gitlab|gopkg|golang|go\.googlesource)\.com)\.([\w.]+(?:\.[\w-]+)*?)-(v?\d+\.\d+\.\d+(?:[-+][\w.]+)?)\.',
+        file_name
+    )
+    if go_match:
+        # Keep the hosting domain intact; only the dots in the remainder of
+        # the flattened name are turned back into path separators. Path
+        # components that themselves contain a dot cannot be recovered.
+        domain = go_match.group(1)
+        path = go_match.group(2).replace('.', '/')
+        module_path = f"{domain}/{path}"
+
+        version = go_match.group(3)
+        purl = f"pkg:golang/{module_path}@{version}"
+        return (version, purl)
+
+    # PyPI package
+    if bb.data.inherits_class("pypi", d) and version:
+        pypi_package = d.getVar("PYPI_PACKAGE")
+        if pypi_package:
+            # Normalize per PEP 503
+            name = re.sub(r"[-_.]+", "-", pypi_package).lower()
+            purl = f"pkg:pypi/{name}@{version}"
+            return (version, purl)
+
+    # NPM package
+    if bb.data.inherits_class("npm", d) and version:
+        bpn = d.getVar("BPN")
+        if bpn:
+            name = bpn[4:] if bpn.startswith('npm-') else bpn
+            purl = f"pkg:npm/{name}@{version}"
+            return (version, purl)
+
+    # CPAN package
+    if bb.data.inherits_class("cpan", d) and version:
+        bpn = d.getVar("BPN")
+        if bpn:
+            if bpn.startswith('perl-'):
+                name = bpn[5:]
+            elif bpn.startswith('libperl-'):
+                name = bpn[8:]
+            else:
+                name = bpn
+            purl = f"pkg:cpan/{name}@{version}"
+            return (version, purl)
+
+    # NuGet package
+    if (bb.data.inherits_class("nuget", d) or bb.data.inherits_class("dotnet", d)) and version:
+        bpn = d.getVar("BPN")
+        if bpn:
+            if bpn.startswith('dotnet-'):
+                name = bpn[7:]
+            elif bpn.startswith('nuget-'):
+                name = bpn[6:]
+            else:
+                name = bpn
+            purl = f"pkg:nuget/{name}@{version}"
+            return (version, purl)
+
+    # Maven package
+    if bb.data.inherits_class("maven", d) and version:
+        group_id = d.getVar("MAVEN_GROUP_ID")
+        artifact_id = d.getVar("MAVEN_ARTIFACT_ID")
+
+        if group_id and artifact_id:
+            purl = f"pkg:maven/{group_id}/{artifact_id}@{version}"
+            return (version, purl)
+        else:
+            bpn = d.getVar("BPN")
+            if bpn:
+                if bpn.startswith('maven-'):
+                    name = bpn[6:]
+                elif bpn.startswith('java-'):
+                    name = bpn[5:]
+                else:
+                    name = bpn
+                purl = f"pkg:maven/{name}@{version}"
+                return (version, purl)
+
+    # Base pkg:yocto PURL is handled by oe.purl.get_base_purl()
+    return (version, None)
+
+
 def walk_error(err):
     bb.error(f"ERROR walking {err.filename}: {err}")
 
-- 
2.53.0


