Openbmc Archive on lore.kernel.org
* Proposal: Utilizing Container Registry for Shared BitBake Sstate-Cache
@ 2026-06-04 14:15 Yash Patel
  2026-06-08 16:10 ` Patrick Williams
  2026-06-10 16:46 ` Ed Tanous
  0 siblings, 2 replies; 4+ messages in thread
From: Yash Patel @ 2026-06-04 14:15 UTC (permalink / raw)
  To: openbmc

Hello Team,

My name is Yash Patel and I'm a summer intern at IBM working with
Andrew Geissler. My goal is to start storing the sstate for our
bitbake builds into containers and upload them to a container registry
that the openbmc CI process can utilize as well as other openbmc
developers. This will make it much easier to bring new build nodes
online and to reset bad ones. It will also let openbmc developers
quickly spin up a container and run bitbake for a target machine with
the sstate already pre-loaded.

There would be a container per machine type. The default machines
supported would be what we run CI for up at
https://jenkins.openbmc.org/job/ci-openbmc/. Supporting only a single
machine per container will keep the size of the container down and
most use cases are just building a single machine. Also, a lot of the
free opensource container registries have size limits on the
containers.

We are thinking that the
https://jenkins.openbmc.org/job/latest-master/ job will be what
generates and uploads the containers. Currently this job runs once a
day and builds whatever is in master at the time (we could tweak this
schedule if needed). We would then update
https://github.com/openbmc/openbmc-build-scripts/blob/master/build-setup.sh
(script used by openbmc CI) to look for an available container and use
it if available, otherwise just default to the standard flow.
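
The lookup-then-fallback step could be sketched roughly as below. This is
only an illustration, not the actual build-setup.sh change: the function
name select_sstate_image and the example image reference are hypothetical,
and using `docker manifest inspect` as the availability probe is an
assumption.

```shell
#!/bin/bash
# Hypothetical helper: print the sstate image to use, or nothing if the
# registry does not have it.
# $1             = candidate image reference
# remaining args = command used to test availability; defaults to
#                  `docker manifest inspect`, which queries the registry
#                  without pulling the image (assumed probe).
select_sstate_image() {
    local candidate="$1"
    shift
    local -a probe=("$@")
    if [ "${#probe[@]}" -eq 0 ]; then
        probe=(docker manifest inspect)
    fi
    if "${probe[@]}" "$candidate" >/dev/null 2>&1; then
        printf '%s\n' "$candidate"
    fi
}

# Usage inside a build script (image name is illustrative only):
#   image="$(select_sstate_image "ghcr.io/example/sstate:p10bmc-x86-master")"
#   if [ -n "$image" ]; then echo "using $image"; else echo "standard flow"; fi
```

An empty result means "no container found", so the caller can simply carry
on with the standard flow.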

We would generate containers for both x86 and arm as they will have
different sstates.

We've done some research and it appears that GitHub provides a free
container registry for open source projects so this is the direction
we're thinking.

Any thoughts or comments appreciated!
Yash



* Re: Proposal: Utilizing Container Registry for Shared BitBake Sstate-Cache
  2026-06-04 14:15 Proposal: Utilizing Container Registry for Shared BitBake Sstate-Cache Yash Patel
@ 2026-06-08 16:10 ` Patrick Williams
  2026-06-10 16:09   ` Yash Patel
  2026-06-10 16:46 ` Ed Tanous
  1 sibling, 1 reply; 4+ messages in thread
From: Patrick Williams @ 2026-06-08 16:10 UTC (permalink / raw)
  To: Yash Patel; +Cc: openbmc


On Thu, Jun 04, 2026 at 09:15:42AM -0500, Yash Patel wrote:
> Hello Team,
> 
> My name is Yash Patel and I'm a summer intern at IBM working with
> Andrew Geissler. 

Welcome Yash.

> My goal is to start storing the sstate for our
> bitbake builds into containers and upload them to a container registry
> that the openbmc CI process can utilize as well as other openbmc
> developers. This will make it much easier to bring new build nodes
> online and to reset bad ones. It will also let openbmc developers
> quickly spin up a container and run bitbake for a target machine with
> the sstate already pre-loaded.

Sounds like a good idea.

One of the issues we have right now is that we do not have a good way to
clean up the sstate/downloads directories out of each Jenkins node.
Eventually they run out of space.  Managing it through containers will
hopefully make it easier to prune old state and keep the nodes from
running out of space.

> There would be a container per machine type. The default machines
> supported would be what we run CI for up at
> https://jenkins.openbmc.org/job/ci-openbmc/. Supporting only a single
> machine per container will keep the size of the container down and
> most use cases are just building a single machine. Also, a lot of the
> free opensource container registries have size limits on the
> containers.

In addition to per-platform containers, you'll also need the branch
included there.  We should make sure to build this for Wrynose since
that is the Yocto LTS branch and we've committed to supporting that for
a few years.

> 
> We are thinking that the
> https://jenkins.openbmc.org/job/latest-master/ job will be what
> generates and uploads the containers. Currently this job runs once a
> day and builds whatever is in master at the time (we could tweak this
> schedule if needed). 

I would suggest a separate job for two reasons:
    - Ideally we need to trigger this more often than once per day if
      you want other Jenkins nodes to use that for a starting point of
      their sstate / downloads cache.

    - You are going to want to create sub-jobs per machine / branch (and
      like I mentioned, we should do this at least for the Wrynose
      branch in addition to master).

> We would then update
> https://github.com/openbmc/openbmc-build-scripts/blob/master/build-setup.sh
> (script used by openbmc CI) to look for an available container and use
> it if available, otherwise just default to the standard flow.

When you create these containers, ideally you'd use the
most-recent-previous container's sstate and downloads directory as a
mirror for bitbake.  This will let you do incremental rebuilds of these
sstate containers much faster.  If you try to build it fresh all the
time, you're going to consume hours of CI time per platform we're
trying to build (plus the x86/arm duplication).

Make sure when you do this that you don't use 'FROM <old container>'
because that will just create a chain of container subsets that will
grow monstrously large over time.  You have to create a build
container and a final container.  The build container uses the 'FROM' to
pull the previous container, but then the final container just takes the
resulting final sstate/downloads as a directory.
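
A minimal sketch of the "final container" half of that split, assuming the
build step has already left the refreshed sstate and downloads directories
in the Docker build context. The function name, the base image argument,
and the /var/lib/openbmc/downloads path are hypothetical; only the
/var/lib/openbmc/sstate-cache path appears elsewhere in this thread.

```shell
#!/bin/bash
# Print a Dockerfile for the *final* image. It starts from a plain base
# image (never from the previous sstate image) and copies in the cache
# directories produced by the build step, so the number of layers stays
# the same from one rebuild to the next.
# $1 = base image (hypothetical choice)
# $2 = sstate directory, relative to the build context
# $3 = downloads directory, relative to the build context
render_final_dockerfile() {
    local base="$1" sstate_dir="$2" downloads_dir="$3"
    cat <<EOF
FROM ${base}
COPY ${sstate_dir}/ /var/lib/openbmc/sstate-cache/
COPY ${downloads_dir}/ /var/lib/openbmc/downloads/
EOF
}

# Usage (tag and base image are illustrative only):
#   render_final_dockerfile ubuntu:24.04 sstate-cache downloads \
#       | docker build -f - -t example/sstate:p10bmc-x86-master .
```

Because the only FROM line names the base image, the previous sstate image
is used purely as an input to the build step and never becomes a parent
layer of the result.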

One tricky thing in the `build-setup` is that you're going to have to
figure out what the "latest best container" is to use.  You're going to
have to traverse the git history of what you're trying to build looking
for a docker tag that exists somewhere in the history as your starting
point.  That could be the previous commit or it could be dozens of
commits old.
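
One way the history walk could look, as a rough sketch only. It assumes
images are tagged with the full commit hash appended to a fixed prefix,
which is a hypothetical convention, and that `docker manifest inspect` is
an acceptable way to test whether a tag exists. The function name and the
50-commit limit are likewise made up for illustration.

```shell
#!/bin/bash
# Read commit hashes (newest first) on stdin and print the first one for
# which the probe command succeeds. Returns 1 if no commit matches.
# $1             = image prefix, e.g. "registry/name:machine-arch-"
#                  (hypothetical tag layout)
# remaining args = probe command; defaults to `docker manifest inspect`.
find_cached_commit() {
    local prefix="$1"
    shift
    local -a probe=("$@")
    if [ "${#probe[@]}" -eq 0 ]; then
        probe=(docker manifest inspect)
    fi
    local commit
    while read -r commit; do
        # </dev/null keeps the probe from consuming the commit list.
        if "${probe[@]}" "${prefix}${commit}" </dev/null >/dev/null 2>&1; then
            printf '%s\n' "$commit"
            return 0
        fi
    done
    return 1
}

# Usage: check at most 50 ancestors of HEAD (the limit is arbitrary):
#   git rev-list --max-count=50 HEAD \
#       | find_cached_commit "ghcr.io/example/sstate:p10bmc-x86-"
```

Each probe is a registry round trip, so a real implementation would
probably want to cap the walk, as the usage line does, to stay clear of any
rate limits.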

> We would generate containers for both x86 and arm as they will have
> different sstates.
> 
> We've done some research and it appears that GitHub provides a free
> container registry for open source projects so this is the direction
> we're thinking.

Have you done any experimentation on if the GHCR will rate limit us in a
way to make this non-useful?  Do we have enough space on the Jenkins
server to run a container repository there?  There are a few smaller
ones that are self-contained in containers like how we run
Jenkins/Gerrit.

We might want a Jenkins job on each node that is continuously pulling
the latest images from the container repository so that each node is
already "up to date" when a CI job kicks off.
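
A rough sketch of what such a refresh job might run on each node. The
function name and the idea of feeding it a list of image references on
stdin are assumptions for illustration; `docker pull` is the only real
command involved.

```shell
#!/bin/bash
# Pull every image named on stdin (one per line) and keep going if one
# pull fails. Prints the number of failed pulls and returns nonzero if
# any pull failed.
# args = pull command; defaults to `docker pull`.
refresh_images() {
    local -a pull=("$@")
    if [ "${#pull[@]}" -eq 0 ]; then
        pull=(docker pull)
    fi
    local image failed=0
    while read -r image; do
        # Skip blank lines in the list.
        [ -n "$image" ] || continue
        # </dev/null keeps the pull command from consuming the image list.
        if ! "${pull[@]}" "$image" </dev/null >/dev/null 2>&1; then
            failed=$((failed + 1))
        fi
    done
    printf '%s\n' "$failed"
    [ "$failed" -eq 0 ]
}

# Usage (image names are illustrative only):
#   printf '%s\n' \
#       ghcr.io/example/sstate:p10bmc-x86-master \
#       ghcr.io/example/sstate:romulus-x86-master \
#       | refresh_images
```

Continuing past a failed pull means one missing or rate-limited image does
not stop the rest of the node's cache from being refreshed.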

> Any thoughts or comments appreciated!
> Yash
> 

-- 
Patrick Williams



* Re: Proposal: Utilizing Container Registry for Shared BitBake Sstate-Cache
  2026-06-08 16:10 ` Patrick Williams
@ 2026-06-10 16:09   ` Yash Patel
  0 siblings, 0 replies; 4+ messages in thread
From: Yash Patel @ 2026-06-10 16:09 UTC (permalink / raw)
  To: Patrick Williams; +Cc: openbmc

On Mon, Jun 8, 2026 at 11:10 AM Patrick Williams <patrick@stwcx.xyz> wrote:
> Welcome Yash.

Thanks for the feedback and the welcome!

> One of the issues we have right now is that we do not have a good way to
> clean up the sstate/downloads directories out of each Jenkins node.
> Eventually they run out of space.  Managing it through containers will
> hopefully make it easier to prune old state and keep the nodes from
> running out of space.

I agree, and managing the cache entirely within disposable container
layers should help address this.

> In addition to per-platform containers, you'll also need the branch
> included there.  We should make sure to build this for Wrynose since
> that is the Yocto LTS branch and we've committed to supporting that for
> a few years.

We will add the branch name to the image tagging convention.
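
As a sketch of what such a convention might look like: the
"<machine>-<arch>-<branch>" layout below is an assumption for illustration,
not a settled scheme, and the function name is hypothetical. Branch names
can contain characters that are not valid in image tags (such as "/"), so
the sketch replaces them with "_".

```shell
#!/bin/bash
# Compose an image tag from machine, host architecture and branch.
# The "<machine>-<arch>-<branch>" layout is a hypothetical convention.
# Characters outside [A-Za-z0-9_.-] become "_"; the trailing newline
# added by printf is preserved by listing \n in the kept set.
sstate_tag() {
    local machine="$1" arch="$2" branch="$3"
    printf '%s-%s-%s\n' "$machine" "$arch" "$branch" \
        | tr -c 'A-Za-z0-9_.\n-' '_'
}

# Usage:
#   sstate_tag p10bmc x86 master
#   sstate_tag romulus arm feature/foo
```

This does not enforce the registry's tag length limit, so very long branch
names would need extra handling.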

> I would suggest a separate job for two reasons:
>     - Ideally we need to trigger this more often than once per day if
>       you want other Jenkins nodes to use that for a starting point of
>       their sstate / downloads cache.
>
>     - You are going to want to create sub-jobs per machine / branch (and
>       like I mentioned, we should do this at least for the Wrynose
>       branch in addition to master).

Agreed. Our current plan is to create a dedicated Jenkins job with
sub-jobs per platform and branch.

> Make sure when you do this that you don't use 'FROM <old container>'
> because that will just create a chain of container subsets that will
> grow monstrously large over time.  You have to create a build
> container and a final container.  The build container uses the 'FROM' to
> pull the previous container, but then the final container just takes the
> resulting final sstate/downloads as a directory.

The current implementation avoids this by copying the resulting
sstate/downloads into a fresh final image rather than building a chain
of container layers.

> Have you done any experimentation on if the GHCR will rate limit us in a
> way to make this non-useful?  Do we have enough space on the Jenkins
> server to run a container repository there?  There are a few smaller
> ones that are self-contained in containers like how we run
> Jenkins/Gerrit.

We have not done any extensive testing around GHCR rate limits yet.
Our initial focus has been validating the container workflow itself.
Evaluating GHCR limits and alternative hosting options is something we
plan to investigate once we have the Jenkins generation and
build-setup.sh integration in place.

We have a generic container available for testing. To try it:

git reset --hard 71085e902b0d22bcd854ae0db4c46c6fc666d17a

docker run -it --rm --network=host --pids-limit=-1 \
  -e LOCAL_UID="$(id -u)" \
  -e LOCAL_GID="$(id -g)" \
  -e LOCAL_USER="$(id -un)" \
  -v "$PWD:$PWD" \
  -w "$PWD" \
  ghcr.io/pebbleeee/sstate:p10bmc-x86-generic \
  bash

. setup p10bmc

echo 'SSTATE_MIRRORS = "file://.* file:///var/lib/openbmc/sstate-cache/PATH"' >> conf/local.conf
echo 'BB_SIGNATURE_HANDLER = "OEBasicHash"' >> conf/local.conf

MACHINE=p10bmc bitbake obmc-phosphor-image

Our immediate next steps are to generate the containers through a
Jenkins job and update build-setup.sh to automatically detect and
utilize them.

Thanks for the feedback.



* Re: Proposal: Utilizing Container Registry for Shared BitBake Sstate-Cache
  2026-06-04 14:15 Proposal: Utilizing Container Registry for Shared BitBake Sstate-Cache Yash Patel
  2026-06-08 16:10 ` Patrick Williams
@ 2026-06-10 16:46 ` Ed Tanous
  1 sibling, 0 replies; 4+ messages in thread
From: Ed Tanous @ 2026-06-10 16:46 UTC (permalink / raw)
  To: Yash Patel; +Cc: openbmc

On Thu, Jun 4, 2026 at 7:16 AM Yash Patel <yash.pateltx@gmail.com> wrote:
>
> Hello Team,

Welcome!

>
> It will also let openbmc developers quickly spin up a container and
> run bitbake for a target machine with the sstate already pre-loaded.
>

I'm not an expert in our CI process, but anything that improves the
cache hit rate and reduces the amount of code we repeatedly build
seems like a welcome improvement.

I'm looking forward to seeing what you do here.


