* [OE-core] [RR V3][0/3] SPDX 3.0: Reduce redundant spdxid-hash symlinks to save inode on host
@ 2024-11-25 8:14 Hongxu Jia
2024-11-25 8:14 ` [oe-core][PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias Hongxu Jia
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Hongxu Jia @ 2024-11-25 8:14 UTC (permalink / raw)
To: openembedded-core, JPEWhacker
Summary: [OE-core] [RR V3][0/3] SPDX 3.0: Reduce redundant spdxid-hash symlinks to save inode on host
Changed in V2:
- Add link prefix and name to namespace of spdxId and alias, create one symlink for one jsonld file
Changed in v3:
- Rebase to fix conflict with commit [lib/oe/sbom30: Prefix aliases with "http://spdx.org/spdxdocs/"] [1]
[1] https://github.com/openembedded/openembedded-core/commit/5e0ff36e025f5e842fa90b8219b53257d65ea66a
* Git logs
[oe-core]
commit 36b18e383a3feb9c15add223f0a065915d45f584
Author: Hongxu Jia <hongxu.jia@windriver.com>
Date: Sat Nov 9 17:16:31 2024 +0800
oeqa/selftest: Add SPDX 3.0 include source cases for core_image_minimal build
$ oe-selftest -r spdx.SPDX30Check.test_core_image_minimal_include_source
2024-11-09 09:17:54,600 - oe-selftest - INFO - Adding layer libraries:
2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta/lib
2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta-yocto-bsp/lib
2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta-selftest/lib
2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/meta-openembedded/meta-oe/lib
2024-11-09 09:17:54,602 - oe-selftest - INFO - Checking base configuration is valid/parsable
2024-11-09 09:17:56,653 - oe-selftest - INFO - Adding: "include selftest.inc" in path-to/build_spdx3-st/conf/local.conf
2024-11-09 09:17:56,653 - oe-selftest - INFO - Adding: "include bblayers.inc" in bblayers.conf
2024-11-09 09:17:56,653 - oe-selftest - INFO - test_core_image_minimal_include_source (spdx.SPDX30Check.test_core_image_minimal_include_source)
2024-11-09 10:41:16,654 - oe-selftest - INFO - Keepalive message
2024-11-09 11:37:53,091 - oe-selftest - INFO - ... ok
2024-11-09 11:55:18,638 - oe-selftest - INFO - ----------------------------------------------------------------------
2024-11-09 11:55:18,638 - oe-selftest - INFO - Ran 1 test in 9442.187s
2024-11-09 11:55:18,638 - oe-selftest - INFO - OK
2024-11-09 11:55:35,453 - oe-selftest - INFO - RESULTS:
2024-11-09 11:55:35,453 - oe-selftest - INFO - RESULTS - spdx.SPDX30Check.test_core_image_minimal_include_source: PASSED (8396.65s)
2024-11-09 11:55:35,490 - oe-selftest - INFO - SUMMARY:
2024-11-09 11:55:35,490 - oe-selftest - INFO - oe-selftest () - Ran 1 test in 9442.187s
2024-11-09 11:55:35,490 - oe-selftest - INFO - oe-selftest - OK - All required tests passed (successes=1, skipped=0, failures=0, errors=0)
Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
commit d666b562d27d7f8166e032fa10c7e0f703eb14d5
Author: Hongxu Jia <hongxu.jia@windriver.com>
Date: Sat Nov 9 14:18:26 2024 +0800
sbom30.py: reduce redundant spdxid symlinks to save inode on host
In order to support all in-scope SPDX data within a single
JSON-LD file for SPDX 3.0.1, Yocto's SBOM:
- In native/target/nativesdk recipes, creates a spdxid-hash symlink
for each element that points to the JSON-LD file containing the
element's details;
- In image recipes, uses the spdxid-hash symlinks to collect element
details from various JSON-LD files
When SPDX_INCLUDE_SOURCES = "1", it adds sources to the JSON-LD file
and creates 2N+ spdxid-hash symlinks for N source files
(N for software_File, N for hasDeclaredLicense's Relationship).
For large numbers of source files, each extra symlink -> real file
occupies one more inode (per file), which needs a slot in
the OS's inode cache. In this situation, disk performance is slow
and inodes are used up quickly.
After commit [sbom30/spdx30: add link prefix and name to namespace
of spdxId and alias] is applied, the namespaces of spdxId and alias in
recipe and package jsonld differ. Use the namespace to create a symlink
to the jsonld; take recipe shadow, package shadow and package shadow-src
for example:
For recipe jsonld tmp/deploy/spdx/3.0.1/core2-64/recipes/shadow.spdx.json
spdxId: http://spdx.org/spdxdocs/recipe-shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/recipe-shadow/UNIHASH/...
link-name: recipe-shadow
symlink: tmp/deploy/spdx/3.0.1/core2-64/by-spdxid-link/recipe-shadow.spdx.json -> ../recipes/shadow.spdx.json
For package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow.spdx.json
spdxId: http://spdx.org/spdxdocs/package-shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/package-shadow/UNIHASH/...
link-name: package-shadow
symlink: tmp/deploy/spdx/3.0.1/core2-64/by-spdxid-link/package-shadow.spdx.json -> ../packages/shadow.spdx.json
In package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow-src.spdx.json
spdxId: http://spdx.org/spdxdocs/package-shadow-src-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/package-shadow-src/UNIHASH/...
link-name: package-shadow-src
symlink: tmp/deploy/spdx/3.0.1/core2-64/by-spdxid-link/package-shadow-src.spdx.json -> ../packages/shadow-src.spdx.json
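The link-name derivation in the examples above can be sketched in standalone
Python. This is only an illustration, not the code from the patch: the names
link_name_from_id and NAMESPACE_PREFIX are hypothetical, the prefix is assumed
to be http://spdx.org/spdxdocs (the patch reads it from SPDX_NAMESPACE_PREFIX),
and a plain ValueError stands in for bb.fatal.

```python
import re
import uuid

# Assumed value; the patch reads this from the SPDX_NAMESPACE_PREFIX variable.
NAMESPACE_PREFIX = "http://spdx.org/spdxdocs"

# A canonical UUID string is 36 characters long.
UUID_LEN = len(str(uuid.NAMESPACE_DNS))


def link_name_from_id(_id, prefix=NAMESPACE_PREFIX):
    """Return the link name (e.g. "recipe-shadow") for an spdxId or an alias."""
    escaped = re.escape(prefix)

    # Alias form: <prefix>/openembedded-alias/<link-name>/UNIHASH/...
    m = re.match(rf"^{escaped}/openembedded-alias/([^/]+)/UNIHASH/", _id)
    if m:
        return m.group(1)

    # spdxId form: <prefix>/<link-name>-<uuid>/...
    m = re.match(rf"^{escaped}/([^/]+)/", _id)
    if m:
        # Drop the trailing "-<uuid>" to recover the link name.
        return m.group(1)[: -(UUID_LEN + 1)]

    raise ValueError("Invalid id %s, neither SPDX ID nor alias" % _id)
```

Both forms of ID for the same jsonld file map to the same name, so one symlink
named after it is enough for every element in that file.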
Build core-image-minimal with and without this commit and compare the
number of spdxid links: 7 281 824 -> 6 043
echo 'SPDX_INCLUDE_SOURCES = "1"' >> local.conf
Without this commit:
$ time bitbake core-image-minimal
real 100m17.769s
user 0m24.516s
sys 0m4.334s
$ find tmp/deploy/spdx/3.0.1/*/by-spdxid-hash -name "*.json" |wc -l
7281824
With this commit:
$ time bitbake core-image-minimal
real 85m12.994s
user 0m20.423s
sys 0m4.228s
$ find tmp/deploy/spdx/3.0.1/*/by-spdxid-link -name "*.json" |wc -l
6043
Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
commit 0523e946e618537edf13789c0a4e10741e59fb38
Author: Hongxu Jia <hongxu.jia@windriver.com>
Date: Sun Nov 24 21:44:07 2024 -0800
sbom30/spdx30: add link prefix to the namespace of spdxId and alias
To reference an SPDX ID simply, instead of making a jsonld hash path for
each element, create only one symlink per file and reference it multiple
times. To do so, add a link prefix and name to the namespace of spdxId and
alias, replacing ${PN}, to avoid namespace conflicts between recipes,
packages and images.
Take recipe shadow, package shadow and package shadow-src for example:
Without this commit, spdxId and alias in recipe and package jsonld have the same
namespace
spdxId: http://spdx.org/spdxdocs/shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/shadow/UNIHASH/...
After applying this commit, the namespaces of spdxId in recipe and package jsonld differ:
In recipe jsonld tmp/deploy/spdx/3.0.1/core2-64/recipes/shadow.spdx.json
spdxId: http://spdx.org/spdxdocs/recipe-shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/recipe-shadow/UNIHASH/...
In package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow.spdx.json
spdxId: http://spdx.org/spdxdocs/package-shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/package-shadow/UNIHASH/...
In package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow-src.spdx.json
spdxId: http://spdx.org/spdxdocs/package-shadow-src-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/package-shadow-src/UNIHASH/...
The namespace of spdxId and alias is then used to create the link for the
jsonld file: one symlink per jsonld file, referenced by elements multiple times.
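How the namespace name is chosen can be sketched as follows. This is a minimal
illustration assuming the behaviour described above; namespace_name and
alias_namespace are hypothetical helper names, not functions in sbom30.py.

```python
def namespace_name(pn, name=None, link_prefix=None):
    """Use "<link_prefix>-<name>" when both are set, otherwise fall back to PN."""
    if link_prefix and name:
        return "%s-%s" % (link_prefix, name)
    return pn


def alias_namespace(namespace_prefix, pn, name=None, link_prefix=None):
    """Build the alias namespace that replaces the real namespace in an alias."""
    return "%s/openembedded-alias/%s" % (
        namespace_prefix,
        namespace_name(pn, name, link_prefix),
    )
```

With link_prefix="recipe" and name="shadow" this gives "recipe-shadow", while
the packages built from the same recipe get "package-shadow" and
"package-shadow-src", so the three jsonld files no longer share a namespace.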
Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
====== Testing ======
* Commands
Build core-image-minimal with and without this commit and compare the
number of spdxid links: 7 281 824 -> 6 043
echo 'SPDX_INCLUDE_SOURCES = "1"' >> local.conf
Without this commit:
$ time bitbake core-image-minimal
real 100m17.769s
user 0m24.516s
sys 0m4.334s
$ find tmp/deploy/spdx/3.0.1/*/by-spdxid-hash -name "*.json" |wc -l
7281824
With this commit:
$ time bitbake core-image-minimal
real 85m12.994s
user 0m20.423s
sys 0m4.228s
$ find tmp/deploy/spdx/3.0.1/*/by-spdxid-link -name "*.json" |wc -l
6043
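For a sense of scale, the figures from the two runs above can be reduced to
ratios. The short Python sketch below only redoes the arithmetic on the numbers
already quoted; the variable names are illustrative.

```python
# Symlink counts reported by the two `find ... | wc -l` runs above.
links_before = 7_281_824
links_after = 6_043

# Wall-clock ("real") build times from the two `time bitbake` runs, in seconds.
real_before = 100 * 60 + 17.769
real_after = 85 * 60 + 12.994

link_ratio = links_before / links_after          # roughly 1205x fewer symlinks
seconds_saved = real_before - real_after         # about 905 seconds
percent_saved = seconds_saved / real_before * 100  # about 15% shorter build

print("symlinks reduced %.0fx" % link_ratio)
print("build time reduced by %.1fs (%.1f%%)" % (seconds_saved, percent_saved))
```

These are single runs, so the timing difference should be read as indicative
only.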
$ oe-selftest -r spdx.SPDX30Check.test_core_image_minimal_include_source
2024-11-20 03:29:24,850 - oe-selftest - INFO - Adding layer libraries:
2024-11-20 03:29:24,850 - oe-selftest - INFO - path-to/poky/meta/lib
2024-11-20 03:29:24,850 - oe-selftest - INFO - path-to/poky/meta-yocto-bsp/lib
2024-11-20 03:29:24,850 - oe-selftest - INFO - path-to/poky/meta-selftest/lib
2024-11-20 03:29:24,850 - oe-selftest - INFO - path-to/meta-openembedded/meta-oe/lib
2024-11-20 03:29:24,868 - oe-selftest - INFO - Checking base configuration is valid/parsable
2024-11-20 03:29:27,317 - oe-selftest - INFO - Adding: "include selftest.inc" in path-to/build_spdx3-st/conf/local.conf
2024-11-20 03:29:27,317 - oe-selftest - INFO - Adding: "include bblayers.inc" in bblayers.conf
2024-11-20 03:29:27,317 - oe-selftest - INFO - test_core_image_minimal_include_source (spdx.SPDX30Check.test_core_image_minimal_include_source)
2024-11-20 04:52:47,318 - oe-selftest - INFO - Keepalive message
2024-11-20 05:14:08,115 - oe-selftest - INFO - ... ok
2024-11-20 05:21:12,798 - oe-selftest - INFO - ----------------------------------------------------------------------
2024-11-20 05:21:12,798 - oe-selftest - INFO - Ran 1 test in 6706.271s
2024-11-20 05:21:12,798 - oe-selftest - INFO - OK
2024-11-20 05:21:32,026 - oe-selftest - INFO - RESULTS:
2024-11-20 05:21:32,026 - oe-selftest - INFO - RESULTS - spdx.SPDX30Check.test_core_image_minimal_include_source: PASSED (6280.82s)
2024-11-20 05:21:32,027 - oe-selftest - INFO - SUMMARY:
2024-11-20 05:21:32,027 - oe-selftest - INFO - oe-selftest () - Ran 1 test in 6706.272s
2024-11-20 05:21:32,027 - oe-selftest - INFO - oe-selftest - OK - All required tests passed (successes=1, skipped=0, failures=0, errors=0)
* Expected Results
All tests pass
^ permalink raw reply [flat|nested] 9+ messages in thread
* [oe-core][PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias
2024-11-25 8:14 [OE-core] [RR V3][0/3] SPDX 3.0: Reduce redundant spdxid-hash symlinks to save inode on host Hongxu Jia
@ 2024-11-25 8:14 ` Hongxu Jia
2024-11-25 8:14 ` [oe-core][PATCH V3 2/3] sbom30.py: reduce redundant spdxid symlinks to save inode on host Hongxu Jia
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Hongxu Jia @ 2024-11-25 8:14 UTC (permalink / raw)
To: openembedded-core, JPEWhacker

To reference an SPDX ID simply, instead of making a jsonld hash path for
each element, create only one symlink per file and reference it multiple
times. To do so, add a link prefix and name to the namespace of spdxId and
alias, replacing ${PN}, to avoid namespace conflicts between recipes,
packages and images.

Take recipe shadow, package shadow and package shadow-src for example:

Without this commit, spdxId and alias in recipe and package jsonld have the same
namespace
spdxId: http://spdx.org/spdxdocs/shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/shadow/UNIHASH/...

After applying this commit, the namespaces of spdxId in recipe and package jsonld differ:
In recipe jsonld tmp/deploy/spdx/3.0.1/core2-64/recipes/shadow.spdx.json
spdxId: http://spdx.org/spdxdocs/recipe-shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/recipe-shadow/UNIHASH/...

In package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow.spdx.json
spdxId: http://spdx.org/spdxdocs/package-shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/package-shadow/UNIHASH/...

In package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow-src.spdx.json
spdxId: http://spdx.org/spdxdocs/package-shadow-src-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/package-shadow-src/UNIHASH/...

The namespace of spdxId and alias is then used to create the link for the
jsonld file: one symlink per jsonld file, referenced by elements multiple times.

Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
---
 meta/lib/oe/sbom30.py       | 29 ++++++++++++++++++-----------
 meta/lib/oe/spdx30_tasks.py | 13 +++++++------
 2 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py
index 0a7b4c05fb..28d251a7ac 100644
--- a/meta/lib/oe/sbom30.py
+++ b/meta/lib/oe/sbom30.py
@@ -217,9 +217,11 @@ def to_list(l):
 
 
 class ObjectSet(oe.spdx30.SHACLObjectSet):
-    def __init__(self, d):
+    def __init__(self, d, name=None, link_prefix=None):
         super().__init__()
         self.d = d
+        self.name = name
+        self.link_prefix = link_prefix
 
     def create_index(self):
         self.by_sha256_hash = {}
@@ -322,6 +324,8 @@ class ObjectSet(oe.spdx30.SHACLObjectSet):
             uuid.NAMESPACE_DNS, self.d.getVar("SPDX_UUID_NAMESPACE")
         )
         pn = self.d.getVar("PN")
+        if self.link_prefix and self.name:
+            pn = "%s-%s" % (self.link_prefix, self.name)
         return "%s/%s-%s" % (
             self.d.getVar("SPDX_NAMESPACE_PREFIX"),
             pn,
@@ -341,12 +345,15 @@ class ObjectSet(oe.spdx30.SHACLObjectSet):
             elif namespace not in e._id:
                 bb.warn(f"Namespace {namespace} not found in {e._id}")
             else:
+                pn = self.d.getVar("PN")
+                if self.link_prefix and self.name:
+                    pn = "%s-%s" % (self.link_prefix, self.name)
                 alias_ext = set_alias(
                     e,
                     e._id.replace(unihash, "UNIHASH").replace(
                         namespace,
-                        "http://spdx.org/spdxdocs/openembedded-alias/"
-                        + self.d.getVar("PN"),
+                        f"{self.d.getVar('SPDX_NAMESPACE_PREFIX')}/openembedded-alias/"
+                        + pn,
                     ),
                 )
 
@@ -805,8 +812,8 @@ class ObjectSet(oe.spdx30.SHACLObjectSet):
         )
 
     @classmethod
-    def new_objset(cls, d, name, copy_from_bitbake_doc=True):
-        objset = cls(d)
+    def new_objset(cls, d, name, copy_from_bitbake_doc=True, link_prefix=None):
+        objset = cls(d, name=name, link_prefix=link_prefix)
 
         document = oe.spdx30.SpdxDocument(
             _id=objset.new_spdxid("document", name),
@@ -887,9 +894,9 @@ class ObjectSet(oe.spdx30.SHACLObjectSet):
         return missing_spdxids
 
 
-def load_jsonld(d, path, required=False):
+def load_jsonld(d, path, required=False, name=None, link_prefix=None):
     deserializer = oe.spdx30.JSONLDDeserializer()
-    objset = ObjectSet(d)
+    objset = ObjectSet(d, name=name, link_prefix=link_prefix)
     try:
         with path.open("rb") as f:
             deserializer.read(f, objset)
@@ -918,9 +925,9 @@ def jsonld_hash_path(_id):
     return Path("by-spdxid-hash") / h[:2], h
 
 
-def load_jsonld_by_arch(d, arch, subdir, name, *, required=False):
+def load_jsonld_by_arch(d, arch, subdir, name, *, required=False, link_prefix=None):
     path = jsonld_arch_path(d, arch, subdir, name)
-    objset = load_jsonld(d, path, required=required)
+    objset = load_jsonld(d, path, required=required, name=name, link_prefix=link_prefix)
     if objset is not None:
         return (objset, path)
     return (None, None)
@@ -1049,8 +1056,8 @@ def find_root_obj_in_jsonld(d, subdir, fn_name, obj_type, **attr_filter):
     return spdx_obj, objset
 
 
-def load_obj_in_jsonld(d, arch, subdir, fn_name, obj_type, **attr_filter):
-    objset, fn = load_jsonld_by_arch(d, arch, subdir, fn_name, required=True)
+def load_obj_in_jsonld(d, arch, subdir, fn_name, obj_type, link_prefix=None, **attr_filter):
+    objset, fn = load_jsonld_by_arch(d, arch, subdir, fn_name, required=True, link_prefix=link_prefix)
     spdx_obj = objset.find_filter(obj_type, **attr_filter)
 
     if not spdx_obj:
diff --git a/meta/lib/oe/spdx30_tasks.py b/meta/lib/oe/spdx30_tasks.py
index 5aeed5cd6f..ef829fbbf1 100644
--- a/meta/lib/oe/spdx30_tasks.py
+++ b/meta/lib/oe/spdx30_tasks.py
@@ -461,7 +461,7 @@ def create_spdx(d):
     if not include_vex in ("none", "current", "all"):
         bb.fatal("SPDX_INCLUDE_VEX must be one of 'none', 'current', 'all'")
 
-    build_objset = oe.sbom30.ObjectSet.new_objset(d, d.getVar("PN"))
+    build_objset = oe.sbom30.ObjectSet.new_objset(d, d.getVar("PN"), link_prefix="recipe")
 
     build = build_objset.new_task_build("recipe", "recipe")
     build_objset.set_element_alias(build)
@@ -574,7 +574,7 @@ def create_spdx(d):
 
             bb.debug(1, "Creating SPDX for package %s" % pkg_name)
 
-            pkg_objset = oe.sbom30.ObjectSet.new_objset(d, pkg_name)
+            pkg_objset = oe.sbom30.ObjectSet.new_objset(d, pkg_name, link_prefix="package")
 
             spdx_package = pkg_objset.add_root(
                 oe.spdx30.software_Package(
@@ -793,7 +793,7 @@ def create_package_spdx(d):
     # Any element common to all packages that need to be referenced by ID
     # should be written into this objset set
     common_objset = oe.sbom30.ObjectSet.new_objset(
-        d, "%s-package-common" % d.getVar("PN")
+        d, "%s-package-common" % d.getVar("PN"), link_prefix="package"
     )
 
     pkgdest = Path(d.getVar("PKGDEST"))
@@ -812,6 +812,7 @@ def create_package_spdx(d):
             "packages-staging",
             pkg_name,
             oe.spdx30.software_Package,
+            link_prefix="package",
             software_primaryPurpose=oe.spdx30.software_SoftwarePurpose.install,
         )
 
@@ -1002,7 +1003,7 @@ def create_rootfs_spdx(d):
     with root_packages_file.open("r") as f:
         packages = json.load(f)
 
-    objset = oe.sbom30.ObjectSet.new_objset(d, "%s-%s" % (image_basename, machine))
+    objset = oe.sbom30.ObjectSet.new_objset(d, "%s-%s" % (image_basename, machine), link_prefix="rootfs")
 
     rootfs = objset.add_root(
         oe.spdx30.software_Package(
@@ -1037,7 +1038,7 @@ def create_image_spdx(d):
     image_basename = d.getVar("IMAGE_BASENAME")
     machine = d.getVar("MACHINE")
 
-    objset = oe.sbom30.ObjectSet.new_objset(d, "%s-%s" % (image_basename, machine))
+    objset = oe.sbom30.ObjectSet.new_objset(d, "%s-%s" % (image_basename, machine), link_prefix="image")
 
     with manifest_path.open("r") as f:
         manifest = json.load(f)
@@ -1150,7 +1151,7 @@ def sdk_create_spdx(d, sdk_type, spdx_work_dir, toolchain_outputname):
     sdk_name = toolchain_outputname + "-" + sdk_type
     sdk_packages = oe.sdk.sdk_list_installed_packages(d, sdk_type == "target")
 
-    objset = oe.sbom30.ObjectSet.new_objset(d, sdk_name)
+    objset = oe.sbom30.ObjectSet.new_objset(d, sdk_name, link_prefix="sdk")
 
     sdk_rootfs = objset.add_root(
         oe.spdx30.software_Package(
-- 
2.25.1

^ permalink raw reply related [flat|nested] 9+ messages in thread
* [oe-core][PATCH V3 2/3] sbom30.py: reduce redundant spdxid symlinks to save inode on host
2024-11-25 8:14 [OE-core] [RR V3][0/3] SPDX 3.0: Reduce redundant spdxid-hash symlinks to save inode on host Hongxu Jia
2024-11-25 8:14 ` [oe-core][PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias Hongxu Jia
@ 2024-11-25 8:14 ` Hongxu Jia
2024-11-25 8:15 ` [oe-core][PATCH 3/3] oeqa/selftest: Add SPDX 3.0 include source cases for core_image_minimal build Hongxu Jia
[not found] ` <180B2809D3D6ACA7.7301@lists.openembedded.org>
3 siblings, 0 replies; 9+ messages in thread
From: Hongxu Jia @ 2024-11-25 8:14 UTC (permalink / raw)
To: openembedded-core, JPEWhacker

In order to support all in-scope SPDX data within a single
JSON-LD file for SPDX 3.0.1, Yocto's SBOM:
- In native/target/nativesdk recipes, creates a spdxid-hash symlink
for each element that points to the JSON-LD file containing the
element's details;
- In image recipes, uses the spdxid-hash symlinks to collect element
details from various JSON-LD files

When SPDX_INCLUDE_SOURCES = "1", it adds sources to the JSON-LD file
and creates 2N+ spdxid-hash symlinks for N source files
(N for software_File, N for hasDeclaredLicense's Relationship).

For large numbers of source files, each extra symlink -> real file
occupies one more inode (per file), which needs a slot in
the OS's inode cache. In this situation, disk performance is slow
and inodes are used up quickly.

After commit [sbom30/spdx30: add link prefix and name to namespace
of spdxId and alias] is applied, the namespaces of spdxId and alias in
recipe and package jsonld differ. Use the namespace to create a symlink
to the jsonld; take recipe shadow, package shadow and package shadow-src
for example:

For recipe jsonld tmp/deploy/spdx/3.0.1/core2-64/recipes/shadow.spdx.json
spdxId: http://spdx.org/spdxdocs/recipe-shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/recipe-shadow/UNIHASH/...
link-name: recipe-shadow
symlink: tmp/deploy/spdx/3.0.1/core2-64/by-spdxid-link/recipe-shadow.spdx.json -> ../recipes/shadow.spdx.json

For package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow.spdx.json
spdxId: http://spdx.org/spdxdocs/package-shadow-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/package-shadow/UNIHASH/...
link-name: package-shadow
symlink: tmp/deploy/spdx/3.0.1/core2-64/by-spdxid-link/package-shadow.spdx.json -> ../packages/shadow.spdx.json

In package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow-src.spdx.json
spdxId: http://spdx.org/spdxdocs/package-shadow-src-xxx/...
alias: http://spdx.org/spdxdocs/openembedded-alias/package-shadow-src/UNIHASH/...
link-name: package-shadow-src
symlink: tmp/deploy/spdx/3.0.1/core2-64/by-spdxid-link/package-shadow-src.spdx.json -> ../packages/shadow-src.spdx.json

Build core-image-minimal with and without this commit and compare the
number of spdxid links: 7 281 824 -> 6 043

echo 'SPDX_INCLUDE_SOURCES = "1"' >> local.conf

Without this commit:
$ time bitbake core-image-minimal
real 100m17.769s
user 0m24.516s
sys 0m4.334s
$ find tmp/deploy/spdx/3.0.1/*/by-spdxid-hash -name "*.json" |wc -l
7281824

With this commit:
$ time bitbake core-image-minimal
real 85m12.994s
user 0m20.423s
sys 0m4.228s
$ find tmp/deploy/spdx/3.0.1/*/by-spdxid-link -name "*.json" |wc -l
6043

Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
---
 meta/lib/oe/sbom30.py | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py
index 28d251a7ac..014049e56d 100644
--- a/meta/lib/oe/sbom30.py
+++ b/meta/lib/oe/sbom30.py
@@ -919,10 +919,23 @@ def jsonld_arch_path(d, arch, subdir, name, deploydir=None):
     return deploydir / arch / subdir / (name + ".spdx.json")
 
 
-def jsonld_hash_path(_id):
-    h = hashlib.sha256(_id.encode("utf-8")).hexdigest()
+def jsonld_link_path(_id, d):
+    spdx_namespace_prefix = d.getVar("SPDX_NAMESPACE_PREFIX")
+    m = re.match(f"^{spdx_namespace_prefix}/openembedded-alias/([^/]+)/UNIHASH/", _id)
+    if m:
+        # Parse alias
+        # http://spdx.org/spdxdocs/openembedded-alias/recipe-shadow/UNIHASH/license/3_24_0/BSD-3-Clause -> recipe-shadow
+        link_path = m.group(1)
+    else:
+        m = re.match(f"^{spdx_namespace_prefix}/([^/]+)/", _id)
+        if m:
+            # Parse spdxId
+            # http://spdx.org/spdxdocs/recipe-shadow-10e66933-65cf-5a2d-9a1d-99b12a405441/55a7286167e0c1a871d49da1af6070709d52370a5b52fdea03d248452f919aaa/source/4 -> recipe-shadow
+            link_path = m.group(1)[0:-len(str(uuid.NAMESPACE_DNS))-1]
+        else:
+            bb.fatal("Invalid id %s, neither SPDX ID or alias" % _id)
 
-    return Path("by-spdxid-hash") / h[:2], h
+    return Path("by-spdxid-link"), link_path
 
 
 def load_jsonld_by_arch(d, arch, subdir, name, *, required=False, link_prefix=None):
@@ -993,7 +1006,7 @@ def write_recipe_jsonld_doc(
     dest = jsonld_arch_path(d, pkg_arch, subdir, objset.doc.name, deploydir=deploydir)
 
     def link_id(_id):
-        hash_path = jsonld_hash_path(_id)
+        hash_path = jsonld_link_path(_id, d)
 
         link_name = jsonld_arch_path(
             d,
@@ -1001,6 +1014,11 @@ def write_recipe_jsonld_doc(
             *hash_path,
             deploydir=deploydir,
         )
+
+        # Return if expected symlink exists
+        if link_name.is_symlink() and link_name.resolve() == dest:
+            return hash_path[-1]
+
         try:
             link_name.parent.mkdir(exist_ok=True, parents=True)
             link_name.symlink_to(os.path.relpath(dest, link_name.parent))
@@ -1067,7 +1085,7 @@ def load_obj_in_jsonld(d, arch, subdir, fn_name, obj_type, link_prefix=None, **a
 
 
 def find_by_spdxid(d, spdxid, *, required=False):
-    return find_jsonld(d, *jsonld_hash_path(spdxid), required=required)
+    return find_jsonld(d, *jsonld_link_path(spdxid, d), required=required)
 
 
 def create_sbom(d, name, root_elements, add_objectsets=[]):
-- 
2.25.1

^ permalink raw reply related [flat|nested] 9+ messages in thread
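The early return added to link_id in this patch makes symlink creation
idempotent: many elements of one jsonld file ask for the same link, and only the
first request has to create it. A minimal standalone sketch of that pattern is
shown below; ensure_symlink is a hypothetical helper, not a function in
sbom30.py, and it leaves out the error handling of the real code.

```python
import os
from pathlib import Path


def ensure_symlink(link_name: Path, dest: Path) -> bool:
    """Create a relative symlink link_name -> dest.

    Returns True if the link was created, False if the expected link
    already existed and nothing had to be done.
    """
    # Skip the work if the link is already in place and points at dest.
    if link_name.is_symlink() and link_name.resolve() == dest.resolve():
        return False

    link_name.parent.mkdir(exist_ok=True, parents=True)
    # A relative target keeps the deploy directory relocatable.
    link_name.symlink_to(os.path.relpath(dest, link_name.parent))
    return True
```

Calling it twice with the same arguments creates the link once and reports
False the second time, which is the behaviour the repeated link_id calls rely
on.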
* [oe-core][PATCH 3/3] oeqa/selftest: Add SPDX 3.0 include source cases for core_image_minimal build
2024-11-25 8:14 [OE-core] [RR V3][0/3] SPDX 3.0: Reduce redundant spdxid-hash symlinks to save inode on host Hongxu Jia
2024-11-25 8:14 ` [oe-core][PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias Hongxu Jia
2024-11-25 8:14 ` [oe-core][PATCH V3 2/3] sbom30.py: reduce redundant spdxid symlinks to save inode on host Hongxu Jia
@ 2024-11-25 8:15 ` Hongxu Jia
2024-12-04 19:24 ` Joshua Watt
[not found] ` <180B2809D3D6ACA7.7301@lists.openembedded.org>
3 siblings, 1 reply; 9+ messages in thread
From: Hongxu Jia @ 2024-11-25 8:15 UTC (permalink / raw)
To: openembedded-core, JPEWhacker

$ oe-selftest -r spdx.SPDX30Check.test_core_image_minimal_include_source
2024-11-09 09:17:54,600 - oe-selftest - INFO - Adding layer libraries:
2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta/lib
2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta-yocto-bsp/lib
2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta-selftest/lib
2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/meta-openembedded/meta-oe/lib
2024-11-09 09:17:54,602 - oe-selftest - INFO - Checking base configuration is valid/parsable
2024-11-09 09:17:56,653 - oe-selftest - INFO - Adding: "include selftest.inc" in path-to/build_spdx3-st/conf/local.conf
2024-11-09 09:17:56,653 - oe-selftest - INFO - Adding: "include bblayers.inc" in bblayers.conf
2024-11-09 09:17:56,653 - oe-selftest - INFO - test_core_image_minimal_include_source (spdx.SPDX30Check.test_core_image_minimal_include_source)
2024-11-09 10:41:16,654 - oe-selftest - INFO - Keepalive message
2024-11-09 11:37:53,091 - oe-selftest - INFO - ... ok
2024-11-09 11:55:18,638 - oe-selftest - INFO - ----------------------------------------------------------------------
2024-11-09 11:55:18,638 - oe-selftest - INFO - Ran 1 test in 9442.187s
2024-11-09 11:55:18,638 - oe-selftest - INFO - OK
2024-11-09 11:55:35,453 - oe-selftest - INFO - RESULTS:
2024-11-09 11:55:35,453 - oe-selftest - INFO - RESULTS - spdx.SPDX30Check.test_core_image_minimal_include_source: PASSED (8396.65s)
2024-11-09 11:55:35,490 - oe-selftest - INFO - SUMMARY:
2024-11-09 11:55:35,490 - oe-selftest - INFO - oe-selftest () - Ran 1 test in 9442.187s
2024-11-09 11:55:35,490 - oe-selftest - INFO - oe-selftest - OK - All required tests passed (successes=1, skipped=0, failures=0, errors=0)

Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
---
 meta/lib/oeqa/selftest/cases/spdx.py | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/meta/lib/oeqa/selftest/cases/spdx.py b/meta/lib/oeqa/selftest/cases/spdx.py
index 8384070219..c785f5445f 100644
--- a/meta/lib/oeqa/selftest/cases/spdx.py
+++ b/meta/lib/oeqa/selftest/cases/spdx.py
@@ -174,6 +174,20 @@ class SPDX30Check(SPDX3CheckBase, OESelftestTestCase):
         # Document should be fully linked
         self.check_objset_missing_ids(objset)
 
+    def test_core_image_minimal_include_source(self):
+        objset = self.check_recipe_spdx(
+            "core-image-minimal",
+            "{DEPLOY_DIR_IMAGE}/core-image-minimal-{MACHINE}.rootfs.spdx.json",
+            extraconf=textwrap.dedent(
+                """\
+                SPDX_INCLUDE_SOURCES = "1"
+                """
+            ),
+        )
+
+        # Document should be fully linked
+        self.check_objset_missing_ids(objset)
+
     def test_core_image_minimal_sdk(self):
         objset = self.check_recipe_spdx(
             "core-image-minimal",
-- 
2.25.1

^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [oe-core][PATCH 3/3] oeqa/selftest: Add SPDX 3.0 include source cases for core_image_minimal build
2024-11-25 8:15 ` [oe-core][PATCH 3/3] oeqa/selftest: Add SPDX 3.0 include source cases for core_image_minimal build Hongxu Jia
@ 2024-12-04 19:24 ` Joshua Watt
2024-12-11 3:04 ` Hongxu Jia
0 siblings, 1 reply; 9+ messages in thread
From: Joshua Watt @ 2024-12-04 19:24 UTC (permalink / raw)
To: Hongxu Jia; +Cc: openembedded-core

Is there some advantage to this patch over the test_gcc_include_source test?

On Mon, Nov 25, 2024 at 1:15 AM Hongxu Jia <hongxu.jia@windriver.com> wrote:
>
> $ oe-selftest -r spdx.SPDX30Check.test_core_image_minimal_include_source
> 2024-11-09 09:17:54,600 - oe-selftest - INFO - Adding layer libraries:
> 2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta/lib
> 2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta-yocto-bsp/lib
> 2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta-selftest/lib
> 2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/meta-openembedded/meta-oe/lib
> 2024-11-09 09:17:54,602 - oe-selftest - INFO - Checking base configuration is valid/parsable
> 2024-11-09 09:17:56,653 - oe-selftest - INFO - Adding: "include selftest.inc" in path-to/build_spdx3-st/conf/local.conf
> 2024-11-09 09:17:56,653 - oe-selftest - INFO - Adding: "include bblayers.inc" in bblayers.conf
> 2024-11-09 09:17:56,653 - oe-selftest - INFO - test_core_image_minimal_include_source (spdx.SPDX30Check.test_core_image_minimal_include_source)
> 2024-11-09 10:41:16,654 - oe-selftest - INFO - Keepalive message
> 2024-11-09 11:37:53,091 - oe-selftest - INFO - ... ok
> 2024-11-09 11:55:18,638 - oe-selftest - INFO - ----------------------------------------------------------------------
> 2024-11-09 11:55:18,638 - oe-selftest - INFO - Ran 1 test in 9442.187s
> 2024-11-09 11:55:18,638 - oe-selftest - INFO - OK
> 2024-11-09 11:55:35,453 - oe-selftest - INFO - RESULTS:
> 2024-11-09 11:55:35,453 - oe-selftest - INFO - RESULTS - spdx.SPDX30Check.test_core_image_minimal_include_source: PASSED (8396.65s)
> 2024-11-09 11:55:35,490 - oe-selftest - INFO - SUMMARY:
> 2024-11-09 11:55:35,490 - oe-selftest - INFO - oe-selftest () - Ran 1 test in 9442.187s
> 2024-11-09 11:55:35,490 - oe-selftest - INFO - oe-selftest - OK - All required tests passed (successes=1, skipped=0, failures=0, errors=0)
>
> Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
> ---
>  meta/lib/oeqa/selftest/cases/spdx.py | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/meta/lib/oeqa/selftest/cases/spdx.py b/meta/lib/oeqa/selftest/cases/spdx.py
> index 8384070219..c785f5445f 100644
> --- a/meta/lib/oeqa/selftest/cases/spdx.py
> +++ b/meta/lib/oeqa/selftest/cases/spdx.py
> @@ -174,6 +174,20 @@ class SPDX30Check(SPDX3CheckBase, OESelftestTestCase):
>          # Document should be fully linked
>          self.check_objset_missing_ids(objset)
>
> +    def test_core_image_minimal_include_source(self):
> +        objset = self.check_recipe_spdx(
> +            "core-image-minimal",
> +            "{DEPLOY_DIR_IMAGE}/core-image-minimal-{MACHINE}.rootfs.spdx.json",
> +            extraconf=textwrap.dedent(
> +                """\
> +                SPDX_INCLUDE_SOURCES = "1"
> +                """
> +            ),
> +        )
> +
> +        # Document should be fully linked
> +        self.check_objset_missing_ids(objset)
> +
>      def test_core_image_minimal_sdk(self):
>          objset = self.check_recipe_spdx(
>              "core-image-minimal",
> --
> 2.25.1
>

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [oe-core][PATCH 3/3] oeqa/selftest: Add SPDX 3.0 include source cases for core_image_minimal build
2024-12-04 19:24 ` Joshua Watt
@ 2024-12-11 3:04 ` Hongxu Jia
0 siblings, 0 replies; 9+ messages in thread
From: Hongxu Jia @ 2024-12-11 3:04 UTC (permalink / raw)
To: Joshua Watt; +Cc: openembedded-core

On 12/5/24 03:24, Joshua Watt wrote:
> Is there some advantage to this patch over the test_gcc_include_source test?

It can check whether any objset IDs are missing while SPDX_INCLUDE_SOURCES = "1" is set.

//Hongxu

> On Mon, Nov 25, 2024 at 1:15 AM Hongxu Jia <hongxu.jia@windriver.com> wrote:
>> $ oe-selftest -r spdx.SPDX30Check.test_core_image_minimal_include_source
>> 2024-11-09 09:17:54,600 - oe-selftest - INFO - Adding layer libraries:
>> 2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta/lib
>> 2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta-yocto-bsp/lib
>> 2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/poky/meta-selftest/lib
>> 2024-11-09 09:17:54,601 - oe-selftest - INFO - path-to/meta-openembedded/meta-oe/lib
>> 2024-11-09 09:17:54,602 - oe-selftest - INFO - Checking base configuration is valid/parsable
>> 2024-11-09 09:17:56,653 - oe-selftest - INFO - Adding: "include selftest.inc" in path-to/build_spdx3-st/conf/local.conf
>> 2024-11-09 09:17:56,653 - oe-selftest - INFO - Adding: "include bblayers.inc" in bblayers.conf
>> 2024-11-09 09:17:56,653 - oe-selftest - INFO - test_core_image_minimal_include_source (spdx.SPDX30Check.test_core_image_minimal_include_source)
>> 2024-11-09 10:41:16,654 - oe-selftest - INFO - Keepalive message
>> 2024-11-09 11:37:53,091 - oe-selftest - INFO - ... ok
>> 2024-11-09 11:55:18,638 - oe-selftest - INFO - ----------------------------------------------------------------------
>> 2024-11-09 11:55:18,638 - oe-selftest - INFO - Ran 1 test in 9442.187s
>> 2024-11-09 11:55:18,638 - oe-selftest - INFO - OK
>> 2024-11-09 11:55:35,453 - oe-selftest - INFO - RESULTS:
>> 2024-11-09 11:55:35,453 - oe-selftest - INFO - RESULTS - spdx.SPDX30Check.test_core_image_minimal_include_source: PASSED (8396.65s)
>> 2024-11-09 11:55:35,490 - oe-selftest - INFO - SUMMARY:
>> 2024-11-09 11:55:35,490 - oe-selftest - INFO - oe-selftest () - Ran 1 test in 9442.187s
>> 2024-11-09 11:55:35,490 - oe-selftest - INFO - oe-selftest - OK - All required tests passed (successes=1, skipped=0, failures=0, errors=0)
>>
>> Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
>> ---
>>  meta/lib/oeqa/selftest/cases/spdx.py | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>>
>> diff --git a/meta/lib/oeqa/selftest/cases/spdx.py b/meta/lib/oeqa/selftest/cases/spdx.py
>> index 8384070219..c785f5445f 100644
>> --- a/meta/lib/oeqa/selftest/cases/spdx.py
>> +++ b/meta/lib/oeqa/selftest/cases/spdx.py
>> @@ -174,6 +174,20 @@ class SPDX30Check(SPDX3CheckBase, OESelftestTestCase):
>>          # Document should be fully linked
>>          self.check_objset_missing_ids(objset)
>>
>> +    def test_core_image_minimal_include_source(self):
>> +        objset = self.check_recipe_spdx(
>> +            "core-image-minimal",
>> +            "{DEPLOY_DIR_IMAGE}/core-image-minimal-{MACHINE}.rootfs.spdx.json",
>> +            extraconf=textwrap.dedent(
>> +                """\
>> +                SPDX_INCLUDE_SOURCES = "1"
>> +                """
>> +            ),
>> +        )
>> +
>> +        # Document should be fully linked
>> +        self.check_objset_missing_ids(objset)
>> +
>>      def test_core_image_minimal_sdk(self):
>>          objset = self.check_recipe_spdx(
>>              "core-image-minimal",
>> --
>> 2.25.1
>>

^ permalink raw reply [flat|nested] 9+ messages in thread
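
For readers unfamiliar with what "fully linked" means in the test above, here is a minimal sketch, not the oe-core implementation. It assumes a document is a plain list of JSON-LD objects and that references live under the SPDX 3 property names `from`, `to`, `element` and `rootElement`. The helper name `find_missing_ids` and the `REF_KEYS` tuple are hypothetical; the real check is `check_objset_missing_ids` in meta/lib/oeqa/selftest/cases/spdx.py and may work differently.

```python
# Hypothetical sketch: report IDs that are referenced but never defined.
# REF_KEYS is an assumption for illustration, not the list oe-core uses.
REF_KEYS = ("from", "to", "element", "rootElement")


def find_missing_ids(graph):
    """Return the set of referenced IDs that no object in `graph` defines."""
    defined = {obj["spdxId"] for obj in graph if "spdxId" in obj}
    referenced = set()

    def walk(value, key=None):
        # Recurse through nested dicts and lists, remembering the property
        # name under which each string value was found.
        if isinstance(value, dict):
            for k, v in value.items():
                walk(v, k)
        elif isinstance(value, list):
            for v in value:
                walk(v, key)
        elif isinstance(value, str) and key in REF_KEYS:
            referenced.add(value)

    walk(graph)
    return referenced - defined
```

A document is "fully linked" in this sense when the returned set is empty. With SPDX_INCLUDE_SOURCES = "1" many more file elements are emitted, so there are more references that could dangle.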
[parent not found: <180B2809D3D6ACA7.7301@lists.openembedded.org>]
* Re: [oe-core][PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias
[not found] ` <180B2809D3D6ACA7.7301@lists.openembedded.org>
@ 2024-11-29 7:00 ` Hongxu Jia
2024-12-02 19:30 ` Joshua Watt
0 siblings, 1 reply; 9+ messages in thread
From: Hongxu Jia @ 2024-11-29 7:00 UTC (permalink / raw)
To: openembedded-core, JPEWhacker

Ping Joshua,

Let me add some additional explanation.

The spdxId consists of a "namespace" plus a "user defined suffix" [1].
Within each jsonld file, multiple elements share the same "namespace" and
differ in their "user defined suffix".

This commit makes the "namespace" differ between jsonld files by using a
link prefix and name instead of ${PN}, so that each jsonld file has a unique
"namespace".

After applying the commit, all elements in the same jsonld file have the same
"namespace" and pick up the same link-name from that "namespace".

Take shadow for example:

In tmp/deploy/spdx/3.0.1/core2-64/recipes/shadow.spdx.json, the link-name is "recipe-shadow"
...
{
  "type": "simplelicensing_LicenseExpression",
  "spdxId": "http://spdx.org/spdxdocs/recipe-shadow-6845d95c-0853-56dd-b976-caae5a99461e/bdf9bac970ab5868fda8a811581a46956790f1744837b22af9726173428059b7/license/3_24_0/Unlicense",
  "creationInfo": "_:CreationInfo1",
  "extension": [
    {
      "type": "https://rdf.openembedded.org/spdx/3.0/id-alias",
      "https://rdf.openembedded.org/spdx/3.0/alias": "http://spdx.org/spdxdocs/openembedded-alias/recipe-shadow/UNIHASH/license/3_24_0/Unlicense",
      "https://rdf.openembedded.org/spdx/3.0/link-name": "recipe-shadow"
    },
    {
      "type": "https://rdf.openembedded.org/spdx/3.0/link-extension",
      "https://rdf.openembedded.org/spdx/3.0/link-spdx-id": true,
      "https://rdf.openembedded.org/spdx/3.0/link-name": "recipe-shadow"
    }
  ],
  "simplelicensing_licenseExpression": "Unlicense",
  "simplelicensing_licenseListVersion": "3.24.0"
},
...

In tmp/deploy/spdx/3.0.1/core2-64/packages/shadow-src.spdx.json, the link-name is "package-shadow-src"
...
{
  "type": "software_Package",
  "spdxId": "http://spdx.org/spdxdocs/package-shadow-src-ccffa1b0-6952-53bb-bc7f-0631476873ed/bdf9bac970ab5868fda8a811581a46956790f1744837b22af9726173428059b7/package/shadow-src",
  "creationInfo": "_:CreationInfo1",
  "description": "Tools to change and administer password and group data This package contains sources for debugging purposes.",
  "extension": [
    {
      "type": "https://rdf.openembedded.org/spdx/3.0/id-alias",
      "https://rdf.openembedded.org/spdx/3.0/alias": "http://spdx.org/spdxdocs/openembedded-alias/package-shadow-src/UNIHASH/package/shadow-src",
      "https://rdf.openembedded.org/spdx/3.0/link-name": "package-shadow-src"
    },
    {
      "type": "https://rdf.openembedded.org/spdx/3.0/link-extension",
      "https://rdf.openembedded.org/spdx/3.0/link-spdx-id": true,
      "https://rdf.openembedded.org/spdx/3.0/link-name": "package-shadow-src"
    }
  ],
...

A unique link is then created for each jsonld file (recipe-shadow.spdx.json
and package-shadow-src.spdx.json): one link per jsonld file, shared by all
elements in that file, rather than a unique link hash path for each element.

tmp/deploy/spdx/3.0.1/core2-64/by-spdxid-link/recipe-shadow.spdx.json -> ../recipes/shadow.spdx.json
tmp/deploy/spdx/3.0.1/core2-64/by-spdxid-link/package-shadow-src.spdx.json -> ../packages/shadow-src.spdx.json

[1] https://github.com/openembedded/openembedded-core/blob/master/meta/lib/oe/sbom30.py#L353

//Hongxu

On 11/25/24 16:14, hongxu via lists.openembedded.org wrote:
> In order to simple reference the SPDX ID to instead of making jsonld hash
> path for each element, only creating one symlink for one file and referencing
> it multiple times, add link prefix and name to the namespace of spdxId and alias
> to replace ${PN} to avoid namespace conflict between recipe, packages and images.
> > Take recipe shadow, package shadow and package shadow-src for example: > Without this commit, spdxId and alias in recipe and package jsonld have the same > namespace > > spdxId:http://spdx.org/spdxdocs/shadow-xxx/... > alias:http://spdx.org/spdxdocs/openembedded-alias/shadow/UNIHASH/... > > After apply this commit, the namespace of spdxId in recipe and package jsonld differs: > In recipe jsonld tmp/deploy/spdx/3.0.1/core2-64/recipes/shadow.spdx.json > > spdxId:http://spdx.org/spdxdocs/recipe-shadow-xxx/... > alias:http://spdx.org/spdxdocs/openembedded-alias/recipe-shadow/UNIHASH/... > > In package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow.spdx.json > > spdxId:http://spdx.org/spdxdocs/package-shadow-xxx/... > alias:http://spdx.org/spdxdocs/openembedded-alias/package-shadow/UNIHASH/... > > In package jsonld tmp/deploy/spdx/3.0.1/core2-64/packages/shadow-src.spdx.json > > spdxId:http://spdx.org/spdxdocs/package-shadow-src-xxx/... > alias:http://spdx.org/spdxdocs/openembedded-alias/package-shadow-src/UNIHASH/... 
> > Then will use namespace of spdxId and alias to create link for jsonld file, > one symlink for one jsonld file, referenced by elements multiple times > > Signed-off-by: Hongxu Jia<hongxu.jia@windriver.com> > --- > meta/lib/oe/sbom30.py | 29 ++++++++++++++++++----------- > meta/lib/oe/spdx30_tasks.py | 13 +++++++------ > 2 files changed, 25 insertions(+), 17 deletions(-) > > diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py > index 0a7b4c05fb..28d251a7ac 100644 > --- a/meta/lib/oe/sbom30.py > +++ b/meta/lib/oe/sbom30.py > @@ -217,9 +217,11 @@ def to_list(l): > > > class ObjectSet(oe.spdx30.SHACLObjectSet): > - def __init__(self, d): > + def __init__(self, d, name=None, link_prefix=None): > super().__init__() > self.d = d > + self.name = name > + self.link_prefix = link_prefix > > def create_index(self): > self.by_sha256_hash = {} > @@ -322,6 +324,8 @@ class ObjectSet(oe.spdx30.SHACLObjectSet): > uuid.NAMESPACE_DNS, self.d.getVar("SPDX_UUID_NAMESPACE") > ) > pn = self.d.getVar("PN") > + if self.link_prefix and self.name: > + pn = "%s-%s" % (self.link_prefix, self.name) > return "%s/%s-%s" % ( > self.d.getVar("SPDX_NAMESPACE_PREFIX"), > pn, > @@ -341,12 +345,15 @@ class ObjectSet(oe.spdx30.SHACLObjectSet): > elif namespace not in e._id: > bb.warn(f"Namespace {namespace} not found in {e._id}") > else: > + pn = self.d.getVar("PN") > + if self.link_prefix and self.name: > + pn = "%s-%s" % (self.link_prefix, self.name) > alias_ext = set_alias( > e, > e._id.replace(unihash, "UNIHASH").replace( > namespace, > -"http://spdx.org/spdxdocs/openembedded-alias/" > - + self.d.getVar("PN"), > + f"{self.d.getVar('SPDX_NAMESPACE_PREFIX')}/openembedded-alias/" > + + pn, > ), > ) > > @@ -805,8 +812,8 @@ class ObjectSet(oe.spdx30.SHACLObjectSet): > ) > > @classmethod > - def new_objset(cls, d, name, copy_from_bitbake_doc=True): > - objset = cls(d) > + def new_objset(cls, d, name, copy_from_bitbake_doc=True, link_prefix=None): > + objset = cls(d, name=name, 
link_prefix=link_prefix) > > document = oe.spdx30.SpdxDocument( > _id=objset.new_spdxid("document", name), > @@ -887,9 +894,9 @@ class ObjectSet(oe.spdx30.SHACLObjectSet): > return missing_spdxids > > > -def load_jsonld(d, path, required=False): > +def load_jsonld(d, path, required=False, name=None, link_prefix=None): > deserializer = oe.spdx30.JSONLDDeserializer() > - objset = ObjectSet(d) > + objset = ObjectSet(d, name=name, link_prefix=link_prefix) > try: > with path.open("rb") as f: > deserializer.read(f, objset) > @@ -918,9 +925,9 @@ def jsonld_hash_path(_id): > return Path("by-spdxid-hash") / h[:2], h > > > -def load_jsonld_by_arch(d, arch, subdir, name, *, required=False): > +def load_jsonld_by_arch(d, arch, subdir, name, *, required=False, link_prefix=None): > path = jsonld_arch_path(d, arch, subdir, name) > - objset = load_jsonld(d, path, required=required) > + objset = load_jsonld(d, path, required=required, name=name, link_prefix=link_prefix) > if objset is not None: > return (objset, path) > return (None, None) > @@ -1049,8 +1056,8 @@ def find_root_obj_in_jsonld(d, subdir, fn_name, obj_type, **attr_filter): > return spdx_obj, objset > > > -def load_obj_in_jsonld(d, arch, subdir, fn_name, obj_type, **attr_filter): > - objset, fn = load_jsonld_by_arch(d, arch, subdir, fn_name, required=True) > +def load_obj_in_jsonld(d, arch, subdir, fn_name, obj_type, link_prefix=None, **attr_filter): > + objset, fn = load_jsonld_by_arch(d, arch, subdir, fn_name, required=True, link_prefix=link_prefix) > > spdx_obj = objset.find_filter(obj_type, **attr_filter) > if not spdx_obj: > diff --git a/meta/lib/oe/spdx30_tasks.py b/meta/lib/oe/spdx30_tasks.py > index 5aeed5cd6f..ef829fbbf1 100644 > --- a/meta/lib/oe/spdx30_tasks.py > +++ b/meta/lib/oe/spdx30_tasks.py > @@ -461,7 +461,7 @@ def create_spdx(d): > if not include_vex in ("none", "current", "all"): > bb.fatal("SPDX_INCLUDE_VEX must be one of 'none', 'current', 'all'") > > - build_objset = 
oe.sbom30.ObjectSet.new_objset(d, d.getVar("PN")) > + build_objset = oe.sbom30.ObjectSet.new_objset(d, d.getVar("PN"), link_prefix="recipe") > > build = build_objset.new_task_build("recipe", "recipe") > build_objset.set_element_alias(build) > @@ -574,7 +574,7 @@ def create_spdx(d): > > bb.debug(1, "Creating SPDX for package %s" % pkg_name) > > - pkg_objset = oe.sbom30.ObjectSet.new_objset(d, pkg_name) > + pkg_objset = oe.sbom30.ObjectSet.new_objset(d, pkg_name, link_prefix="package") > > spdx_package = pkg_objset.add_root( > oe.spdx30.software_Package( > @@ -793,7 +793,7 @@ def create_package_spdx(d): > # Any element common to all packages that need to be referenced by ID > # should be written into this objset set > common_objset = oe.sbom30.ObjectSet.new_objset( > - d, "%s-package-common" % d.getVar("PN") > + d, "%s-package-common" % d.getVar("PN"), link_prefix="package" > ) > > pkgdest = Path(d.getVar("PKGDEST")) > @@ -812,6 +812,7 @@ def create_package_spdx(d): > "packages-staging", > pkg_name, > oe.spdx30.software_Package, > + link_prefix="package", > software_primaryPurpose=oe.spdx30.software_SoftwarePurpose.install, > ) > > @@ -1002,7 +1003,7 @@ def create_rootfs_spdx(d): > with root_packages_file.open("r") as f: > packages = json.load(f) > > - objset = oe.sbom30.ObjectSet.new_objset(d, "%s-%s" % (image_basename, machine)) > + objset = oe.sbom30.ObjectSet.new_objset(d, "%s-%s" % (image_basename, machine), link_prefix="rootfs") > > rootfs = objset.add_root( > oe.spdx30.software_Package( > @@ -1037,7 +1038,7 @@ def create_image_spdx(d): > image_basename = d.getVar("IMAGE_BASENAME") > machine = d.getVar("MACHINE") > > - objset = oe.sbom30.ObjectSet.new_objset(d, "%s-%s" % (image_basename, machine)) > + objset = oe.sbom30.ObjectSet.new_objset(d, "%s-%s" % (image_basename, machine), link_prefix="image") > > with manifest_path.open("r") as f: > manifest = json.load(f) > @@ -1150,7 +1151,7 @@ def sdk_create_spdx(d, sdk_type, spdx_work_dir, toolchain_outputname): > 
sdk_name = toolchain_outputname + "-" + sdk_type > sdk_packages = oe.sdk.sdk_list_installed_packages(d, sdk_type == "target") > > - objset = oe.sbom30.ObjectSet.new_objset(d, sdk_name) > + objset = oe.sbom30.ObjectSet.new_objset(d, sdk_name, link_prefix="sdk") > > sdk_rootfs = objset.add_root( > oe.spdx30.software_Package( > > -=-=-=-=-=-=-=-=-=-=-=- > Links: You receive all messages sent to this group. > View/Reply Online (#207723):https://lists.openembedded.org/g/openembedded-core/message/207723 > Mute This Topic:https://lists.openembedded.org/mt/109768078/3617049 > Group Owner:openembedded-core+owner@lists.openembedded.org > Unsubscribe:https://lists.openembedded.org/g/openembedded-core/unsub [hongxu.jia@eng.windriver.com] > -=-=-=-=-=-=-=-=-=-=-=- > [-- Attachment #2: Type: text/html, Size: 20819 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
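
The namespace and symlink scheme described in the 2024-11-29 message above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the series: the function names (`link_name`, `namespace`, `link_paths`, `create_link`) are hypothetical, the UUID suffix is assumed to be a `uuid5` of the prefixed name, and the `by-spdxid-link` layout is taken from the example paths in the message rather than from patch 2/3, which is not quoted here.

```python
import uuid
from pathlib import Path


def link_name(link_prefix, name):
    # "recipe" + "shadow" -> "recipe-shadow", as in the link-name examples.
    return "%s-%s" % (link_prefix, name)


def namespace(namespace_prefix, uuid_namespace, link_prefix, name):
    # Per-file namespace: the prefixed name replaces ${PN}.
    # Assumption: the trailing UUID is uuid5(uuid5(DNS, uuid_namespace), pn).
    pn = link_name(link_prefix, name)
    ns_uuid = uuid.uuid5(uuid.NAMESPACE_DNS, uuid_namespace)
    return "%s/%s-%s" % (namespace_prefix, pn, uuid.uuid5(ns_uuid, pn))


def link_paths(arch_dir, subdir, name, link_prefix):
    # One symlink per jsonld file, with a relative target so the deploy
    # directory stays relocatable.
    link = Path(arch_dir) / "by-spdxid-link" / (link_name(link_prefix, name) + ".spdx.json")
    target = Path("..") / subdir / (name + ".spdx.json")
    return link, target


def create_link(arch_dir, subdir, name, link_prefix):
    # Create the symlink if it is not already present.
    link, target = link_paths(arch_dir, subdir, name, link_prefix)
    link.parent.mkdir(parents=True, exist_ok=True)
    if not link.is_symlink():
        link.symlink_to(target)
    return link
```

Because every element in a file shares the file's namespace, a reader that sees any spdxId from `recipe-shadow-...` only needs the single `recipe-shadow.spdx.json` link to find the file, instead of one hashed link per element.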
* Re: [oe-core][PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias
2024-11-29 7:00 ` [oe-core][PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias Hongxu Jia
@ 2024-12-02 19:30 ` Joshua Watt
2024-12-03 9:42 ` [PATCH " hongxu
0 siblings, 1 reply; 9+ messages in thread
From: Joshua Watt @ 2024-12-02 19:30 UTC (permalink / raw)
To: Hongxu Jia; +Cc: openembedded-core

OK, thanks. I did get some time to sit down and read through this, and
I understand the problem that needs to be solved. I think you're on
the right track, but I think we can use the rework to remove some code
that is no longer necessary. My basic thought is now:

1. Set an OEIdAliasExtension on the SpdxDocument that is created with
an ObjectSet. It should have a prefix like:
"http://spdx.org/spdxdocs/openembedded-alias/document/..."

2. When an alias is set on another element, it should use a prefix of
"http://spdx.org/spdxdocs/openembedded-alias/by-doc-hash/HASH/...",
where HASH is the hash of the SpdxDocument alias set in #1

3. When a document is written out, a symbolic link is created using
the hash of the SpdxDocument alias, in a similar manner to what the code
does today. Importantly, no links are made for any other elements.

4. When an alias needs to be resolved, the code checks for the prefix
"http://spdx.org/spdxdocs/openembedded-alias/by-doc-hash/", then
extracts the hash and opens the document with that hash by its
symbolic link (created in #3), then finds the object with the exact
alias in that document.

5. Remove OELinkExtension since it is no longer necessary

I'm willing to work on this if you want, but with the YP summit this
week, it probably won't get done until next week.
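
Steps 1 to 4 of the proposal above could look roughly like the following sketch. It is only an illustration of the idea, not the rework that was later written: all function names are hypothetical, SHA-256 is an assumed choice of hash, and the `by-spdxid-hash/<first two hex digits>/<hash>.spdx.json` layout is an assumption modelled on the existing `jsonld_hash_path` helper visible in the quoted diff.

```python
import hashlib
from pathlib import Path

ALIAS_PREFIX = "http://spdx.org/spdxdocs/openembedded-alias"
DOC_PREFIX = ALIAS_PREFIX + "/document/"
BY_DOC_HASH_PREFIX = ALIAS_PREFIX + "/by-doc-hash/"


def doc_alias(name):
    # Step 1: alias set on the SpdxDocument itself.
    return DOC_PREFIX + name


def doc_hash(document_alias):
    # Assumption: SHA-256 of the document alias string.
    return hashlib.sha256(document_alias.encode("utf-8")).hexdigest()


def element_alias(document_alias, suffix):
    # Step 2: every other element embeds the hash of its document's alias.
    return BY_DOC_HASH_PREFIX + doc_hash(document_alias) + "/" + suffix


def doc_link_path(document_alias):
    # Step 3: exactly one symlink per document, keyed by that hash.
    h = doc_hash(document_alias)
    return Path("by-spdxid-hash") / h[:2] / (h + ".spdx.json")


def resolve_alias(alias):
    # Step 4: recover the document link from an element alias alone.
    # Returns None for anything that is not a by-doc-hash alias.
    if not alias.startswith(BY_DOC_HASH_PREFIX):
        return None
    h = alias[len(BY_DOC_HASH_PREFIX):].split("/", 1)[0]
    return Path("by-spdxid-hash") / h[:2] / (h + ".spdx.json")
```

The point of the design is that the element alias carries enough information to locate its document, so the per-element links (and the OELinkExtension from step 5) become unnecessary; the caller would still have to open the linked document and search it for the object with the exact alias.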
* Re: [PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias
2024-12-02 19:30 ` Joshua Watt
@ 2024-12-03 9:42 ` hongxu
0 siblings, 0 replies; 9+ messages in thread
From: hongxu @ 2024-12-03 9:42 UTC (permalink / raw)
To: openembedded-core

Sure, thanks for the reply. I am glad to wait for your rework patch. Thanks again.

//Hongxu

^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-12-11 3:04 UTC | newest]
Thread overview: 9+ messages
2024-11-25 8:14 [OE-core] [RR V3][0/3] SPDX 3.0: Reduce redundant spdxid-hash symlinks to save inode on host Hongxu Jia
2024-11-25 8:14 ` [oe-core][PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias Hongxu Jia
2024-11-25 8:14 ` [oe-core][PATCH V3 2/3] sbom30.py: reduce redundant spdxid symlinks to save inode on host Hongxu Jia
2024-11-25 8:15 ` [oe-core][PATCH 3/3] oeqa/selftest: Add SPDX 3.0 include source cases for core_image_minimal build Hongxu Jia
2024-12-04 19:24 ` Joshua Watt
2024-12-11 3:04 ` Hongxu Jia
[not found] ` <180B2809D3D6ACA7.7301@lists.openembedded.org>
2024-11-29 7:00 ` [oe-core][PATCH V3 1/3] sbom30/spdx30: add link prefix to the namespace of spdxId and alias Hongxu Jia
2024-12-02 19:30 ` Joshua Watt
2024-12-03 9:42 ` [PATCH " hongxu