public inbox for docs@lists.yoctoproject.org
 help / color / mirror / Atom feed
From: "Antonin Godard" <antonin.godard@bootlin.com>
To: "Quentin Schulz" <quentin.schulz@cherry.de>,
	<docs@lists.yoctoproject.org>
Cc: "Thomas Petazzoni" <thomas.petazzoni@bootlin.com>
Subject: Re: [docs] [yocto-docs PATCH v2] Add scripts to build the docs in containers
Date: Tue, 10 Dec 2024 13:46:01 +0100	[thread overview]
Message-ID: <D6814L7Y2U7C.1PWWUILXKF7H3@bootlin.com> (raw)
In-Reply-To: <85b846b7-09ec-49dc-b739-813d52dffb81@cherry.de>

Hi Quentin,

On Tue Dec 10, 2024 at 12:08 PM CET, Quentin Schulz wrote:
> Hi Antonin,
>
> On 12/10/24 11:33 AM, Antonin Godard via lists.yoctoproject.org wrote:
>> Hi Quentin,
>> 
>> On Fri Dec 6, 2024 at 3:23 PM CET, Quentin Schulz wrote:
>>> Hi Antonin,
>>>
>>> On 12/5/24 12:06 PM, Antonin Godard wrote:
>>>> Add two scripts for building a container image and building the
>>>> documentation in that image: build-docs-container and
>>>> container-build-docs.
>>>>
>>>
>>> Yet there's only one left now :) (also in your commit title).
>> 
>> Oops, will update in v3 :)
>> 
>>>> For now, the documentation/tools/dockerfiles directory contains a
>>>> Dockerfile for building with Ubuntu or Debian, but we could extend this
>>>> to the supported distros.
>>>>
>>>> It should be possible to build the full documentation with two commands:
>>>>
>>>>     ./documentation/tools/build-docs-container ubuntu:24.04
>>>>
>>>> CONTAINERCMD can be replaced by "podman" to build with podman.
>>>>
>>>
>>> I assume installing podman-docker package could help with that too. Not
>>> sure it's recommended, but I do have it installed :)
>> 
>> Didn't know about that one. But it shouldn't be required to use podman or docker
>> independently I think?
>> 
>
> It's just that when you run `which docker` you would get podman. But we 
> shouldn't require the user to install podman-docker for the script to 
> work. Just wanted to mention it.
>
> [...]
>
>>>> +SCRIPT_DIR=$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" &>/dev/null && pwd)
>>>> +CONTAINERCMD=${CONTAINERCMD:-docker}
>>>> +DOCS_DIR="$SCRIPT_DIR/../.."
>>>> +POKY_YAML_IN="$SCRIPT_DIR/../poky.yaml.in"
>>>> +
>>>> +# This lists the different images we can build and the keys we pass to yq to
>>>> +# find packages in poky.yaml.in.
>>>> +# The keys should be in the form of "<image>:<version>", as this will be passed
>>>> +# to FROM in the selected Dockerfile below.
>>>> +# These are common yq keys used for multiple distros.
>>>> +_UBUNTU_DEBIAN_YQ_KEYS=".UBUNTU_DEBIAN_HOST_PACKAGES_DOC .UBUNTU_DEBIAN_HOST_PACKAGES_DOC_PDF"
>>>> +declare -A YQ_KEYS=(
>>>> +  [ubuntu:22.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>> +  [ubuntu:24.04]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>> +  [debian:12]="$_UBUNTU_DEBIAN_YQ_KEYS"
>>>> +)
>>>> +
>>>> +# This lists the dockerfile to use for each of the distro listed in YQ_KEYS
>>>> +# above. There should be a 1 to 1 match between the keys listed in YQ_KEYS above
>>>> +# and the keys listed in DOCKERFILES below.
>>>> +declare -A DOCKERFILES=(
>>>> +  [ubuntu:22.04]="Dockerfile.ubuntu-debian"
>>>> +  [ubuntu:24.04]="Dockerfile.ubuntu-debian"
>>>> +  [debian:12]="Dockerfile.ubuntu-debian"
>>>> +)
>>>> +
>>>> +main ()
>>>> +{
>>>> +  if [ "$#" -lt 1 ]; then
>>>> +    echo "No image provided. Provide one of: ${!YQ_KEYS[*]}"
>>>> +    exit 1
>>>> +  elif [ "$1" = "list" ]; then
>>>> +    echo -e "Available container images:\n\n${!YQ_KEYS[*]}"
>>>> +    exit 0
>>>> +  fi
>>>> +
>>>> +  local image="$1"
>>>> +  shift
>>>> +  local make_targets="${*:-publish}"
>>>> +
>>>> +  for cmd in $CONTAINERCMD yq; do
>>>> +    if ! which "$cmd" >/dev/null 2>&1; then
>>>> +      echo "The $cmd command was not found. Make sure you have $cmd installed."
>>>> +      exit 1
>>>> +    fi
>>>> +  done
>>>> +
>>>> +  # Get the appropriate dockerfile from DOCKERFILES
>>>> +  dockerfile="${DOCKERFILES[$image]}"
>>>> +
>>>> +  local temporary_dep_file="$SCRIPT_DIR/dockerfiles/deps"
>>>> +  echo -n > "$temporary_dep_file" # empty the file
>>>> +  for dep_key in ${YQ_KEYS[$image]}; do
>>>> +    yq --raw-output --join-output "$dep_key" "$POKY_YAML_IN" >> "$temporary_dep_file"
>>>> +    # add a white space after last element of yq command
>>>> +    echo -n " " >> "$temporary_dep_file"
>>>> +  done
>>>> +
>>>
>>> Just use a temporary file, this would allow to run builds from two
>>> different distros at the same time. mktemp should help you with that.
>> 
>> The problem is that the source in COPY must be part of the build context, so I
>> don't having a temp file in /tmp is possible. See
>> https://docs.docker.com/reference/dockerfile/#adding-files-from-the-build-context
>> 
>> I could use "mktemp --tmpdir=$SCRIPT_DIR/dockerfiles" but I don't really see the
>> added value.
>> 
>
> "this would allow to run builds from two different distros at the same time"
>
> Also, it's guaranteed that an old file wouldn't be used in case the 
> script is somehow broken.

I see what you mean. However, since mktemp create a random name it creates a
new layer each time, invalidating the following commands in the container file.

Instead I propose to move this above:

>>>> +  # docker build doesn't accept 2 colons, so "sanitize" the name
>>>> +  local sanitized_dockername
>>>> +  sanitized_dockername=$(echo "$image" | tr ':.' '-')

And do this:

  temporary_dep_file="$SCRIPT_DIR/dockerfiles/$sanitized_dockername.deps"

So we avoid the distro clash and we still use cached layers.

>>>> +
>>>> +  $CONTAINERCMD build \
>>>> +    --tag yocto-docs-$sanitized_dockername:latest \
>>>> +    --build-arg ARG_FROM="$image" \
>>>> +    --file "$SCRIPT_DIR/dockerfiles/$dockerfile" \
>>>> +    "$SCRIPT_DIR/dockerfiles"
>>>> +
>>>> +  # We can remove the deps file, we no longer need it
>>>> +  rm -f "$temporary_dep_file"
>>>> +
>>>> +  local -a args_run=(
>>>> +    --rm
>>>> +    --interactive
>>>> +    --tty
>>>> +    --volume="$DOCS_DIR:/docs:rw"
>>>> +    --workdir=/docs
>>>> +    --security-opt label=disable
>>>> +  )
>>>> +
>>>> +  if [ "$CONTAINERCMD" = "docker" ]; then
>>>> +    args_run+=(
>>>> +      --user="$(id -u)":"$(id -g)"
>>>> +    )
>>>> +  elif [ "$CONTAINERCMD" = "podman" ]; then
>>>> +    # we need net access to fetch bitbake terms
>>>> +    args_run+=(
>>>> +      --cap-add=NET_RAW
>>>> +      --userns=keep-id
>>>> +    )
>>>> +  fi
>>>> +
>>>> +  $CONTAINERCMD run \
>>>> +    "${args_run[@]}" \
>>>> +    yocto-docs-$sanitized_dockername \
>>>> +    make -C documentation $make_targets
>>>> +}
>>>> +
>>>> +set -eu -o pipefail
>>>> +
>>>> +main "$@"
>>>> diff --git a/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>>> new file mode 100644
>>>> index 0000000000000000000000000000000000000000..3c543dc4b0e96dc9c00b201558e2ed00847339fa
>>>> --- /dev/null
>>>> +++ b/documentation/tools/dockerfiles/Dockerfile.ubuntu-debian
>>>> @@ -0,0 +1,18 @@
>>>> +ARG ARG_FROM=debian:12 # default value to avoid warning
>>>> +FROM $ARG_FROM
>>>> +
>>>> +ENV DEBIAN_FRONTEND=noninteractive
>>>> +
>>>> +# relative to the location of the dockerfile
>>>> +COPY deps /deps
>>>> +
>>>> +RUN apt-get update \
>>>> + && apt-get --yes --no-install-recommends install $(cat /deps) \
>>>> + && apt-get --yes autoremove \
>>>> + && apt-get clean \
>>>> + && rm /deps
>>>> +
>>>> +RUN echo "en_US.UTF-8 UTF-8" >> /etc/locale.gen && locale-gen
>>>> +
>>>> +RUN git config --global --add safe.directory /docs
>>>> +
>>>>
>>>
>>> So... I've had another idea.
>>>
>>> Parsing yaml in shell, not fun.
>>>
>>> But I don't think we **need** the instructions to be in the YAML file.
>>>
>>> So I think we could drastically simplify all this by storing the setup
>>> instructions in shell scripts which are included in sphinx AND use those
>>> shell scripts in the Containerfile. We actually do this for our product
>>> user manuals :)
>>>
>>> Essentially, we would have:
>>>
>>> $ cat documentation/host-packages-ubuntu.sh
>>> sudo apt install gawk wget git diffstat unzip texinfo gcc \ [...]
>>>
>>> In Sphinx:
>>>
>>> .. literalinclude:: 50.literalinclude.apt.sh
>>>      :language: bash
>>>
>>> In the appropriate place.
>>>
>>> How we generate the Containerfile internally:
>>> """
>>> cat <<EOF > $WORKDIR/merged.sh
>>> #!/bin/bash
>>> set -eux
>>> export DEBIAN_FRONTEND=noninteractive
>>> apt-get update
>>> apt-get -y install sudo
>>> EOF
>>>
>>> # Merge bash snippets we want to run (in the right order)
>>> cat ../source/50.literalinclude.{apt,atf,uboot,linux,debos-prepare,\
>>> debos-build-bookworm}.sh >> $WORKDIR/merged.sh
>>>
>>> chmod +x $WORKDIR/merged.sh
>>>
>>> podman run --rm --tmpfs=/tmp -v=$WORKDIR:$WORKDIR --security-opt
>>> label=disable \  --annotation -w $WORKDIR debian:bookworm ./merged.sh
>>> """
>>>
>>> I assume we could simply use COPY in the Containerfile and include the
>>> shell script in there and run the shell script in a single RUN. You
>>> could then do the cleanup step in a separate layer, not optimal since
>>> the layer will be unnecessarily big but I assume the end image is going
>>> to be smaller anyway?
>>>
>>> The benefit is that you test **exactly** what's in the documentation.
>>> And also you don't have to parse YAML. And it should be relatively easy
>>> to add support for new distros.
>>>
>>> What do you think?
>> 
>> I like this idea! Only downside I see is to have to re-run "apt get install" for
>> each run command, which is not really ideal IMO. I prefer caching what's already
>
> No? I have one RUN command which is "execute this shell script". 
> Whatever is executed as part of this shell script is part of one layer. 
> You could even have one RUN command which is apt-get update && 
> my-shell-script && apt-get cache remove whatever to make this a smaller 
> layer.

So let me know if I got this wrong, but running "podman run ..." multiple times
is going to make use of a cached layer automagically? I.e. merged.sh is not
re-run each time? Or you are implying that there is still a Containerfile in the
process that caches this data?

>> been downloaded so we can re-build the doc swiftly (I imagined this script to be
>> used for more than verifying that what we list in the dep list is enough to
>> build, and used by anyone to test their changes to the docs easily).
>> 
>
> You can have as many scripts as you want? And have them do as many or 
> little things as you want, I'm not sure to follow what you have in mind 
> here?

That wasn't clear maybe. I simply meant to say that I want to publish this
script not only for verifying that listed dependencies are correct, but also
because it's convenient for anyone to build the documentation. It's unrelated to
your solution.

Thanks,
Antonin

-- 
Antonin Godard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


  reply	other threads:[~2024-12-10 12:46 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-05 11:06 [yocto-docs PATCH v2] Add scripts to build the docs in containers Antonin Godard
2024-12-06 14:23 ` Quentin Schulz
2024-12-10 10:33   ` Antonin Godard
2024-12-10 11:08     ` [docs] " Quentin Schulz
2024-12-10 12:46       ` Antonin Godard [this message]
2024-12-10 13:48         ` Quentin Schulz
2024-12-10 14:48           ` Antonin Godard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D6814L7Y2U7C.1PWWUILXKF7H3@bootlin.com \
    --to=antonin.godard@bootlin.com \
    --cc=docs@lists.yoctoproject.org \
    --cc=quentin.schulz@cherry.de \
    --cc=thomas.petazzoni@bootlin.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox