public inbox for kdevops@lists.linux.dev
* [RFC PATCH 0/3] Build once, test everywhere
@ 2025-04-22 15:49 cel
  2025-04-22 15:49 ` [RFC PATCH 1/3] Add a guest/instance for building the test kernel cel
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: cel @ 2025-04-22 15:49 UTC (permalink / raw)
  To: Luis Chamberlain, Daniel Gomez, Scott Mayhew; +Cc: kdevops, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

With apologies to the Java folks.

I've been looking for ways to make our workflow runs more efficient
because my lab resources are limited, and moving to the cloud costs
money per CPU second.

One thing that seems like an obvious win would be to build the test
kernel one time (for example, after a merge/pull request), then
re-use that kernel binary for all the workflows we want to run on
it.

For libvirt, the usual way to do this is to use the 9p option. There
are a few cases that are important to me where that option does not
work:

1. The control host is running late-model Fedora, and the test
   runners want to run old LTS kernels
2. The control host or test runners run a RHEL-based kernel, which
   does not support 9p, last I checked
3. Cloud, where there is no way for cloud-based test runners to
   mount the control host with 9p

That leaves me with building the test kernel in each test runner.
That's a lot of kernel builds; and in the cloud, the kernel source
code has to be cloned onto each runner from kernel.org, because we
don't have repo mirroring there.

Ted pointed out another use case where build-once is great: when
you want to under-provision the test runners to exercise code
paths that handle resource exhaustion. Building the kernel on
such test runners takes forever.

Cloud providers throw another curve: there is typically a per-
tenant or per-availability zone limit on the number of CPU cores
you can provision at once. So a workflow that starts several
instances has to limit each instance to only one or two cores.
Building the kernel on those instances also takes forever.

I really want something more like KOTD, where test kernels are
built once, packaged, and placed somewhere that the runners can
find and install them. I'm not real interested in setting up a
persistent yum repo and builder, though; I'd like kdevops to
manage all that for me.

So I got out my whittling knife and built this proof of concept
where kdevops brings up a kernel builder node that is tailored for
a fast kernel build (e.g., it is one guest with 16 vCPUs). The test
kernel is packaged (.rpm or .deb) and fetched to the control host,
then the builder node is destroyed.

I added a top-level "make" target that uploads the test kernel
packages to each test runner, which then installs them.

This is not part of the patch series, but shows how to run it:

$ make mrproper defconfig-kernel-builder
$ make && make bringup && make linux-packages && make destroy
$ make defconfig-workflow-one
$ make && make bringup && make linux-artifacts && make destroy
$ make defconfig-workflow-two
$ make && make bringup && make linux-artifacts && make destroy
$ make defconfig-workflow-three
$ make && make bringup && make linux-artifacts && make destroy
$ make mrproper

All three workflows use the same kernel, built just once, for all
of their test runners.
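
The same sequence can be wrapped in a small script. This is only a
rough sketch, not part of the series: the function names
(build_once, run_workflow, build_and_test) and the MAKE override
are hypothetical, and the defconfig names are the placeholders
used above. Setting MAKE=echo gives a dry run that prints the
targets instead of invoking them.

```shell
#!/bin/sh
# Sketch only: build the test kernel once, then run each workflow
# against the packaged result. All function names are hypothetical.
# MAKE may be overridden (for example MAKE=echo) for a dry run.
MAKE="${MAKE:-make}"

# Bring up the builder node, package the kernel, tear the node down.
build_once() {
    $MAKE mrproper defconfig-kernel-builder &&
    $MAKE && $MAKE bringup && $MAKE linux-packages && $MAKE destroy
}

# Run one workflow; $1 is the defconfig suffix, e.g. "workflow-one".
run_workflow() {
    $MAKE "defconfig-$1" &&
    $MAKE && $MAKE bringup && $MAKE linux-artifacts && $MAKE destroy
}

# Build once, then run every workflow named on the command line.
# Stops at the first failure so a broken build is not tested.
build_and_test() {
    build_once || return 1
    for wf in "$@"; do
        run_workflow "$wf" || return 1
    done
    $MAKE mrproper
}
```

With that sourced, "build_and_test workflow-one workflow-two
workflow-three" would be equivalent to the command list above.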

This is still band-aids and chewing gum, but it seems to work on
both libvirt and cloud configurations. There is more than one way
to skin this cat, though.

Chuck Lever (3):
  Add a guest/instance for building the test kernel
  playbooks: Add a build_linux role
  Experimental: Add a separate install_linux role

 .gitignore                                    |   2 +
 playbooks/build_linux.yml                     |   4 +
 playbooks/install_linux.yml                   |   4 +
 playbooks/roles/build_linux/README.md         |  74 +++++
 playbooks/roles/build_linux/defaults/main.yml |  38 +++
 .../tasks/install-deps/debian/main.yml        |  46 +++
 .../tasks/install-deps/redhat/main.yml        | 102 ++++++
 .../tasks/install-deps/suse/main.yml          |  31 ++
 playbooks/roles/build_linux/tasks/main.yml    | 295 ++++++++++++++++++
 playbooks/roles/gen_hosts/defaults/main.yml   |   2 +
 playbooks/roles/gen_hosts/tasks/main.yml      |  12 +
 .../roles/gen_hosts/templates/builder.j2      |  13 +
 playbooks/roles/gen_nodes/defaults/main.yml   |   2 +
 playbooks/roles/gen_nodes/tasks/main.yml      |  22 ++
 playbooks/roles/install_linux/README.md       | 136 ++++++++
 .../roles/install_linux/defaults/main.yml     |  43 +++
 .../tasks/install-deps/debian/main.yml        |  44 +++
 .../tasks/install-deps/redhat/main.yml        |  76 +++++
 .../tasks/install-deps/suse/main.yml          |  31 ++
 playbooks/roles/install_linux/tasks/main.yml  | 142 +++++++++
 .../tasks/update-grub/debian.yml              |   8 +
 .../tasks/update-grub/install.yml             | 196 ++++++++++++
 .../install_linux/tasks/update-grub/main.yml  |  15 +
 .../tasks/update-grub/redhat.yml              |  36 +++
 .../install_linux/tasks/update-grub/suse.yml  |  11 +
 workflows/linux/Kconfig                       |  10 +
 workflows/linux/Makefile                      |  19 ++
 27 files changed, 1414 insertions(+)
 create mode 100644 playbooks/build_linux.yml
 create mode 100644 playbooks/install_linux.yml
 create mode 100644 playbooks/roles/build_linux/README.md
 create mode 100644 playbooks/roles/build_linux/defaults/main.yml
 create mode 100644 playbooks/roles/build_linux/tasks/install-deps/debian/main.yml
 create mode 100644 playbooks/roles/build_linux/tasks/install-deps/redhat/main.yml
 create mode 100644 playbooks/roles/build_linux/tasks/install-deps/suse/main.yml
 create mode 100644 playbooks/roles/build_linux/tasks/main.yml
 create mode 100644 playbooks/roles/gen_hosts/templates/builder.j2
 create mode 100644 playbooks/roles/install_linux/README.md
 create mode 100644 playbooks/roles/install_linux/defaults/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/install-deps/debian/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/install-deps/redhat/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/install-deps/suse/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/debian.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/install.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/redhat.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/suse.yml

-- 
2.49.0



* [RFC PATCH 1/3] Add a guest/instance for building the test kernel
  2025-04-22 15:49 [RFC PATCH 0/3] Build once, test everywhere cel
@ 2025-04-22 15:49 ` cel
  2025-04-22 15:49 ` [RFC PATCH 2/3] playbooks: Add a build_linux role cel
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: cel @ 2025-04-22 15:49 UTC (permalink / raw)
  To: Luis Chamberlain, Daniel Gomez, Scott Mayhew; +Cc: kdevops, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Provision a separate guest/instance for the purpose of building
the Linux kernel under test. It's not used yet.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 playbooks/roles/gen_hosts/defaults/main.yml   |  2 ++
 playbooks/roles/gen_hosts/tasks/main.yml      | 12 ++++++++++
 .../roles/gen_hosts/templates/builder.j2      | 13 +++++++++++
 playbooks/roles/gen_nodes/defaults/main.yml   |  2 ++
 playbooks/roles/gen_nodes/tasks/main.yml      | 22 +++++++++++++++++++
 workflows/linux/Kconfig                       | 10 +++++++++
 workflows/linux/Makefile                      |  4 ++++
 7 files changed, 65 insertions(+)
 create mode 100644 playbooks/roles/gen_hosts/templates/builder.j2

diff --git a/playbooks/roles/gen_hosts/defaults/main.yml b/playbooks/roles/gen_hosts/defaults/main.yml
index f6ab9bcce166..85afb35040e3 100644
--- a/playbooks/roles/gen_hosts/defaults/main.yml
+++ b/playbooks/roles/gen_hosts/defaults/main.yml
@@ -29,6 +29,8 @@ kdevops_workflow_enable_ltp: False
 kdevops_workflow_enable_nfstest: false
 kdevops_workflow_enable_sysbench: false
 
+bootlinux_builder: false
+
 is_fstests: False
 fstests_fstyp: "bogus"
 fs_config_role_path: "/dev/null"
diff --git a/playbooks/roles/gen_hosts/tasks/main.yml b/playbooks/roles/gen_hosts/tasks/main.yml
index a50355f72160..9182ac46f8c1 100644
--- a/playbooks/roles/gen_hosts/tasks/main.yml
+++ b/playbooks/roles/gen_hosts/tasks/main.yml
@@ -56,6 +56,18 @@
   when:
     - is_fstests
 
+- name: Generate the Ansible hosts file for a Linux kernel build
+  tags: [ 'hosts' ]
+  template:
+    src: "{{ kdevops_hosts_template }}"
+    dest: "{{ topdir_path }}/{{ kdevops_hosts }}"
+    force: yes
+    trim_blocks: True
+    lstrip_blocks: True
+  when:
+    - bootlinux_builder
+    - ansible_hosts_template.stat.exists
+
 - name: Generate the Ansible hosts file
   tags: [ 'hosts' ]
   template:
diff --git a/playbooks/roles/gen_hosts/templates/builder.j2 b/playbooks/roles/gen_hosts/templates/builder.j2
new file mode 100644
index 000000000000..0c9ba1e8e01a
--- /dev/null
+++ b/playbooks/roles/gen_hosts/templates/builder.j2
@@ -0,0 +1,13 @@
+[all]
+{{ kdevops_hosts_prefix }}-builder
+[all:vars]
+ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
+
+[baseline]
+{{ kdevops_hosts_prefix }}-builder
+[baseline:vars]
+ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
+
+[dev]
+[dev:vars]
+ansible_python_interpreter =  "{{ kdevops_python_interpreter }}"
diff --git a/playbooks/roles/gen_nodes/defaults/main.yml b/playbooks/roles/gen_nodes/defaults/main.yml
index 8ff9b87993a7..d5fa444c995f 100644
--- a/playbooks/roles/gen_nodes/defaults/main.yml
+++ b/playbooks/roles/gen_nodes/defaults/main.yml
@@ -93,6 +93,8 @@ fstests_fstyp: "bogus"
 fs_config_role_path: "/dev/null"
 fs_config_data: "[section_1]"
 
+bootlinux_builder: false
+builder_nodes: []
 bootlinux_9p: False
 bootlinux_9p_host_path: "/dev/null"
 bootlinux_9p_msize: 0
diff --git a/playbooks/roles/gen_nodes/tasks/main.yml b/playbooks/roles/gen_nodes/tasks/main.yml
index d541dcbf1f54..e3d0127d1f65 100644
--- a/playbooks/roles/gen_nodes/tasks/main.yml
+++ b/playbooks/roles/gen_nodes/tasks/main.yml
@@ -43,6 +43,14 @@
   when:
     - kdevops_baseline_and_dev
 
+- name: Set builder nodes array
+  tags: vars
+  set_fact:
+    builder_nodes:
+      - "{{ kdevops_host_prefix + '-builder' }}"
+  when:
+    - bootlinux_builder
+
 - name: Set iscsi_nodes list
   ansible.builtin.set_fact:
     iscsi_nodes: "{{ [kdevops_host_prefix + '-iscsi'] }}"
@@ -138,6 +146,20 @@
     - not kdevops_workflows_dedicated_workflow
     - ansible_nodes_template.stat.exists
 
+- name: Generate the builder kdevops nodes file using {{ kdevops_nodes_template }} as jinja2 source template
+  tags: [ 'nodes' ]
+  vars:
+    node_template: "{{ kdevops_nodes_template | basename }}"
+    all_generic_nodes: "{{ builder_nodes }}"
+    nodes: "{{ all_generic_nodes }}"
+  template:
+    src: "{{ node_template }}"
+    dest: "{{ topdir_path }}/{{ kdevops_nodes }}"
+    force: yes
+  when:
+    - bootlinux_builder
+    - ansible_nodes_template.stat.exists
+
 - name: Generate the pynfs kdevops nodes file using {{ kdevops_nodes_template }} as jinja2 source template
   tags: [ 'nodes' ]
   vars:
diff --git a/workflows/linux/Kconfig b/workflows/linux/Kconfig
index 797469e60d20..8a26eb2e2897 100644
--- a/workflows/linux/Kconfig
+++ b/workflows/linux/Kconfig
@@ -36,6 +36,16 @@ config BOOTLINUX_PURE_IOMAP
 
 endif # HAVE_SUPPORTS_PURE_IOMAP
 
+config BOOTLINUX_BUILDER
+	bool "Build once, test many"
+	output yaml
+	default n
+	help
+	  Enabling this option creates a separate guest/instance
+	  where the Linux kernel is built. kdevops passes the
+	  artifacts of this build to the test runner nodes to be
+	  installed before each workflow runs.
+
 config BOOTLINUX_9P
 	bool "Use 9p to build Linux"
 	depends on LIBVIRT && !GUESTFS_LACKS_9P
diff --git a/workflows/linux/Makefile b/workflows/linux/Makefile
index ecce273a4f67..65bbb8ae9a90 100644
--- a/workflows/linux/Makefile
+++ b/workflows/linux/Makefile
@@ -13,6 +13,10 @@ ifeq (y,$(CONFIG_BOOTLINUX_PURE_IOMAP))
 TREE_CONFIG:=config-$(TREE_REF)-pure-iomap
 endif
 
+ifeq (y,$(CONFIG_BOOTLINUX_BUILDER))
+KDEVOPS_HOSTS_TEMPLATE=builder.j2
+endif
+
 # Describes the Linux clone
 BOOTLINUX_ARGS	+= target_linux_git=$(TREE_URL)
 # ifeq (y,$(CONFIG_BOOTLINUX_TREE_CUSTOM_NAME))
-- 
2.49.0



* [RFC PATCH 2/3] playbooks: Add a build_linux role
  2025-04-22 15:49 [RFC PATCH 0/3] Build once, test everywhere cel
  2025-04-22 15:49 ` [RFC PATCH 1/3] Add a guest/instance for building the test kernel cel
@ 2025-04-22 15:49 ` cel
  2025-04-22 15:49 ` [RFC PATCH 3/3] Experimental: Add a separate install_linux role cel
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: cel @ 2025-04-22 15:49 UTC (permalink / raw)
  To: Luis Chamberlain, Daniel Gomez, Scott Mayhew; +Cc: kdevops, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Currently, for cloud configurations or when 9p is disabled, kdevops
builds the test kernel on each test runner. This is inefficient,
and gets worse as the test matrix for a single kernel version scales
out.

Instead we want to build the test kernel once and make the build
artifacts available for test runners to install. This would be
similar to what KOTD does now, except it does not require setting up
a separate yum repo.

For this experiment, I've created a stripped down version of the
bootlinux role that has only the steps needed to build the kernel on
a target node. It adds some steps to create kernel packages and then
fetch them to the control host.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 .gitignore                                    |   2 +
 playbooks/build_linux.yml                     |   4 +
 playbooks/roles/build_linux/README.md         |  74 +++++
 playbooks/roles/build_linux/defaults/main.yml |  38 +++
 .../tasks/install-deps/debian/main.yml        |  46 +++
 .../tasks/install-deps/redhat/main.yml        | 102 ++++++
 .../tasks/install-deps/suse/main.yml          |  31 ++
 playbooks/roles/build_linux/tasks/main.yml    | 295 ++++++++++++++++++
 workflows/linux/Makefile                      |   7 +
 9 files changed, 599 insertions(+)
 create mode 100644 playbooks/build_linux.yml
 create mode 100644 playbooks/roles/build_linux/README.md
 create mode 100644 playbooks/roles/build_linux/defaults/main.yml
 create mode 100644 playbooks/roles/build_linux/tasks/install-deps/debian/main.yml
 create mode 100644 playbooks/roles/build_linux/tasks/install-deps/redhat/main.yml
 create mode 100644 playbooks/roles/build_linux/tasks/install-deps/suse/main.yml
 create mode 100644 playbooks/roles/build_linux/tasks/main.yml

diff --git a/.gitignore b/.gitignore
index 45113a669390..f51213a59ad9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -86,6 +86,8 @@ playbooks/roles/linux-mirror/linux-mirror-systemd/mirrors.yaml
 #   yet.
 workflows/selftests/results/
 
+workflows/linux/artifacts/
+
 workflows/linux/refs/default/Kconfig.linus
 workflows/linux/refs/default/Kconfig.next
 workflows/linux/refs/default/Kconfig.stable
diff --git a/playbooks/build_linux.yml b/playbooks/build_linux.yml
new file mode 100644
index 000000000000..094441d1befd
--- /dev/null
+++ b/playbooks/build_linux.yml
@@ -0,0 +1,4 @@
+---
+- hosts: all
+  roles:
+    - role: build_linux
diff --git a/playbooks/roles/build_linux/README.md b/playbooks/roles/build_linux/README.md
new file mode 100644
index 000000000000..cbf9755ce5e0
--- /dev/null
+++ b/playbooks/roles/build_linux/README.md
@@ -0,0 +1,74 @@
+build_linux
+===========
+
+The build_linux role downloads and builds the Linux kernel.  It also
+lets you apply custom patches, remove kernels, etc.; anything you
+have to do with regards to generic kernel development.
+
+By default, it tracks one of the latest stable kernels that are
+still supported using the linux stable git tree.
+
+Requirements
+------------
+
+A separate block device is required on which to create the file
+system where the test kernel is built.
+
+Role Variables
+--------------
+
+  * infer_uid_and_group: defaults to False, if set to True, then we will ignore
+    the passed on data_user and data_group and instead try to infer this by
+    inspecting the `whoami` and getent on the logged in target system we are
+    provisioning. So if user sam is running ansible on a host, targeting a system
+    called foofighter and logging into that system using username pincho,
+    then the data_user will be overwritten and set to pincho. We will then
+    also look up pincho's default group id and use that for data_group.
+    This is useful if you are targeting a slew of systems and don't really
+    want to deal with the complexities of the username and group, and the
+    default target username you use to ssh into a system suffices to use as
+    a base. This is set to False to remain compatible with old users of
+    this role.
+  * data_path: where to place the git trees we clone under
+  * data_user: the user to assign permissions to
+  * data_group: the group to assign permissions to
+
+  * data_device: the target device to use for the data partition
+  * data_fstype: the filesystem to store the data partition under
+  * data_label: the label to use
+  * data_fs_opts: the filesystem options to use, you want to ensure to add the
+    label
+
+  * target_linux_admin_name: your developer name
+  * target_linux_admin_email: your email
+  * target_linux_git: the git tree to clone, by default this is the linux-stable
+    tree
+  * target_linux_tree: the name of the tree
+  * target_linux_dir_path: where to place the tree on the target system
+
+  * target_linux_ref : the actual tag as used on linux, so v4.19.62
+  * target_linux_extra_patch: if defined an extra patch to apply with git
+     am prior to compilation
+  * target_linux_config: the configuration file to use
+
+Dependencies
+------------
+
+None.
+
+Example Playbook
+----------------
+
+Below is an example playbook, say a build_linux.yml file:
+
+```
+---
+- hosts: all
+  roles:
+    - role: build_linux
+```
+
+License
+-------
+
+copyleft-next-0.3.1
diff --git a/playbooks/roles/build_linux/defaults/main.yml b/playbooks/roles/build_linux/defaults/main.yml
new file mode 100644
index 000000000000..5ff5ece28b1e
--- /dev/null
+++ b/playbooks/roles/build_linux/defaults/main.yml
@@ -0,0 +1,38 @@
+# SPDX-License-Identifier: copyleft-next-0.3.1
+---
+kdevops_bootlinux: false
+infer_uid_and_group: false
+
+data_path: "/data"
+data_user: "kdevops"
+data_group: "kdevops"
+data_device: "/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_kdevops0"
+data_fstype: "xfs"
+data_label: "data"
+data_fs_opts: "-L {{ disk_setup_label }}"
+
+target_linux_admin_name: "Hacker Amanda"
+target_linux_admin_email: "devnull@kernel.org"
+target_linux_git: "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git"
+target_linux_shallow_depth: 0
+target_linux_tree: "linux-stable"
+target_linux_dir_path: "{{ data_path }}/{{ target_linux_tree }}"
+kdevops_baseline_and_dev: false
+
+target_linux_ref: "v4.19.133"
+target_linux_delta_file:
+target_linux_config: "config-{{ target_linux_ref }}"
+make: "make"
+target_linux_make_cmd: "{{ make }} -j{{ ansible_processor_nproc }}"
+target_linux_make_install_cmd: "{{ target_linux_make_cmd }} modules_install install"
+
+build_artifacts_dir: "{{ topdir_path }}/workflows/linux/artifacts/"
+
+uninstall_kernel_enable: false
+
+bootlinux_b4_am_this_host: false
+
+kdevops_workflow_enable_cxl: false
+bootlinux_cxl_test: false
+
+bootlinux_tree_set_by_cli: false
diff --git a/playbooks/roles/build_linux/tasks/install-deps/debian/main.yml b/playbooks/roles/build_linux/tasks/install-deps/debian/main.yml
new file mode 100644
index 000000000000..ef57dca756ec
--- /dev/null
+++ b/playbooks/roles/build_linux/tasks/install-deps/debian/main.yml
@@ -0,0 +1,46 @@
+---
+# Install dependencies for building linux on Debian
+
+- name: Update apt cache
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.apt:
+    update_cache: true
+  tags:
+    - linux
+
+# apt-get build-dep does not capture all requirements
+- name: Install Linux kernel build dependencies
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.apt:
+    name:
+      - bison
+      - flex
+      - git
+      - gcc
+      - make
+      - gawk
+      - bc
+      - dump
+      - indent
+      - sed
+      - libssl-dev
+      - libelf-dev
+      - liburcu-dev
+      - xfsprogs
+      - e2fsprogs
+      - btrfs-progs
+      - ntfs-3g
+      - mdadm
+      - rpcbind
+      - portmap
+      - hwinfo
+      - open-iscsi
+      - python3-pip
+      - zstd
+      - libncurses-dev
+      - b4
+    state: present
+  tags:
+    - linux
diff --git a/playbooks/roles/build_linux/tasks/install-deps/redhat/main.yml b/playbooks/roles/build_linux/tasks/install-deps/redhat/main.yml
new file mode 100644
index 000000000000..371e7f7f11ee
--- /dev/null
+++ b/playbooks/roles/build_linux/tasks/install-deps/redhat/main.yml
@@ -0,0 +1,102 @@
+---
+- name: Enable installation of packages from EPEL
+  ansible.builtin.include_role:
+    name: epel-release
+  when:
+    - ansible_distribution != "Fedora"
+
+- name: Install packages for building the Linux kernel
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.dnf:
+    name: "{{ packages }}"
+    state: present
+    update_cache: true
+  retries: 3
+  delay: 5
+  register: result
+  until: result is succeeded
+  vars:
+    packages:
+      - bison
+      - flex
+      - git-core
+      - e2fsprogs
+      - xfsprogs
+      - xfsdump
+      - lvm2
+      - gcc
+      - make
+      - gawk
+      - bc
+      - dump
+      - libtool
+      - psmisc
+      - sed
+      - vim
+      - fio
+      - libaio-devel
+      - diffutils
+      - net-tools
+      - ncurses-devel
+      - xfsprogs
+      - e2fsprogs
+      - elfutils-libelf-devel
+      - ntfs-3g
+      - mdadm
+      - rpcbind
+      - portmap
+      - hwinfo
+      - iscsi-initiator-utils
+      - openssl
+      - openssl-devel
+      - dwarves
+      - userspace-rcu
+      - zstd
+
+- name: Install btrfs-progs
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.dnf:
+    name:
+      - btrfs-progs
+    state: present
+    update_cache: true
+  retries: 3
+  delay: 5
+  register: btrfs_result
+  until: btrfs_result is succeeded
+  when:
+    - ansible_distribution == "Fedora"
+
+- name: Install rpmbuild
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.dnf:
+    name:
+      - rpmbuild
+    state: present
+    update_cache: true
+  retries: 3
+  delay: 5
+  register: rpmbuild_result
+  until: rpmbuild_result is succeeded
+  when:
+    - ansible_distribution != "Fedora"
+
+- name: Install rpmbuild
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.dnf:
+    name:
+      - perl-core
+      - rpm-build
+      - rsync
+    state: present
+    update_cache: true
+  retries: 3
+  delay: 5
+  register: rpmbuild_result
+  until: rpmbuild_result is succeeded
+  when:
+    - ansible_distribution == "Fedora"
diff --git a/playbooks/roles/build_linux/tasks/install-deps/suse/main.yml b/playbooks/roles/build_linux/tasks/install-deps/suse/main.yml
new file mode 100644
index 000000000000..0260da5a912d
--- /dev/null
+++ b/playbooks/roles/build_linux/tasks/install-deps/suse/main.yml
@@ -0,0 +1,31 @@
+---
+- name: Install Linux kernel build dependencies for SUSE
+  become: true
+  become_method: ansible.builtin.sudo
+  community.general.zypper:
+    name:
+      - bison
+      - flex
+      - git-core
+      - gcc
+      - make
+      - gawk
+      - bc
+      - dump
+      - sed
+      - libopenssl-devel
+      - libelf-devel
+      - liburcu8
+      - diffutils
+      - net-tools
+      - ncurses-devel
+      - xfsprogs
+      - e2fsprogs
+      - btrfsprogs
+      - ntfs-3g
+      - mdadm
+      - rpcbind
+      - portmap
+      - hwinfo
+      - open-iscsi
+    disable_recommends: false
diff --git a/playbooks/roles/build_linux/tasks/main.yml b/playbooks/roles/build_linux/tasks/main.yml
new file mode 100644
index 000000000000..a95b5ee98bb3
--- /dev/null
+++ b/playbooks/roles/build_linux/tasks/main.yml
@@ -0,0 +1,295 @@
+---
+- name: Include optional extra_vars
+  ansible.builtin.include_vars:
+    file: "{{ item }}"
+  with_first_found:
+    - files:
+        - "../extra_vars.yml"
+        - "../extra_vars.yaml"
+        - "../extra_vars.json"
+      skip: true
+  failed_when: false
+
+- name: Debian-specific set up
+  ansible.builtin.import_tasks: install-deps/debian/main.yml
+  when:
+    - ansible_os_family == "Debian"
+
+- name: Red Hat-specific set up
+  ansible.builtin.import_tasks: install-deps/redhat/main.yml
+  when:
+    - ansible_os_family == "RedHat"
+
+- name: Suse-specific set up
+  ansible.builtin.import_tasks: install-deps/suse/main.yml
+  when:
+    - ansible_os_family == "Suse"
+
+- name: Install b4
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.pip:
+    name:
+      - b4
+  when:
+    - target_linux_install_b4 is defined
+    - target_linux_install_b4
+    - ansible_os_family == "Debian"
+
+- name: Create the /data partition
+  ansible.builtin.include_role:
+    name: create_data_partition
+
+- name: Wipe the build directory
+  ansible.builtin.file:
+    path: "{{ target_linux_dir_path }}"
+    state: absent
+
+- name: Clone {{ target_linux_tree }}
+  ansible.builtin.git:
+    repo: "{{ target_linux_git }}"
+    dest: "{{ target_linux_dir_path }}"
+    update: true
+    depth: "{{ target_linux_shallow_depth }}"
+    version: "{{ target_linux_ref }}"
+  retries: 3
+  delay: 5
+  register: git_result
+  until: not git_result.failed
+
+- name: Copy the kernel delta to the builder
+  ansible.builtin.template:
+    src: "{{ target_linux_extra_patch }}"
+    dest: "{{ target_linux_dir_path }}/{{ target_linux_extra_patch }}"
+    owner: "{{ data_user }}"
+    group: "{{ data_group }}"
+    mode: "u=rw,g=r,o=r"
+  when:
+    - target_linux_extra_patch is defined
+
+- name: Apply the kernel delta on the builder
+  ansible.builtin.command:
+    cmd: "git am {{ target_linux_extra_patch }}"  # noqa: command-instead-of-module
+    chdir: "{{ target_linux_dir_path }}"
+  register: git_am
+  changed_when: not git_am.failed
+  when:
+    - target_linux_extra_patch is defined
+
+- name: Set git user name and email
+  ansible.builtin.shell: |
+    if ! git config --get user.email > /dev/null ; then
+      git config --global user.email user@example.com
+    fi
+    if ! git config --get user.name > /dev/null ; then
+      git config --global user.name user
+    fi
+  register: git_config
+  changed_when: not git_config.failed
+  when:
+    - target_linux_apply_patch_message_id is defined
+    - target_linux_apply_patch_message_id | length > 0
+    - bootlinux_b4_am_this_host|bool
+
+- name: Apply a message patch set
+  ansible.builtin.shell:
+    cmd: |
+      set -o pipefail
+      b4 am -o - {{ target_linux_apply_patch_message_id }} | git am
+    chdir: "{{ target_linux_dir_path }}"
+  register: b4_am
+  changed_when: not b4_am.failed
+  when:
+    - target_linux_apply_patch_message_id is defined
+    - target_linux_apply_patch_message_id | length > 0
+    - bootlinux_b4_am_this_host|bool
+
+- name: Set the pathname of the .config templates directory
+  ansible.builtin.set_fact:
+    template_path: "{{ topdir_path }}/playbooks/roles/bootlinux/templates"
+
+- name: Check whether config-kdevops exists
+  delegate_to: localhost
+  ansible.builtin.stat:
+    path: "{{ template_path }}/config-kdevops"
+  register: config_kdevops
+
+- name: Found config-kdevops, using it for template
+  ansible.builtin.set_fact:
+    linux_config: "config-kdevops"
+  when: config_kdevops.stat.exists
+
+- name: No config-kdevops, looking for {{ target_linux_config }}
+  ansible.builtin.set_fact:
+    linux_config: "{{ target_linux_config }}"
+  when: not config_kdevops.stat.exists
+
+- name: Check whether specific kernel config exists for {{ target_linux_ref }}
+  delegate_to: localhost
+  ansible.builtin.stat:
+    path: "{{ template_path }}/{{ target_linux_config }}"
+  register: kernel_config
+
+- name: Find all linux-next configs
+  delegate_to: localhost
+  ansible.builtin.find:
+    paths: "{{ template_path }}"
+    patterns: "config-next*"
+    file_type: file
+    recurse: false
+  register: found_configs
+  when:
+    - not config_kdevops.stat.exists
+    - not kernel_config.stat.exists
+
+- name: Extract the date from the filenames
+  ansible.builtin.set_fact:
+    configs_with_dates: "{{ configs_with_dates | default([]) + [{'file': item.path, 'date': (item.path | regex_search('config-next-(\\d{8})')).split('-')[-1]}] }}"
+  loop: "{{ found_configs.files }}"
+  no_log: true
+  when:
+    - not config_kdevops.stat.exists
+    - not kernel_config.stat.exists
+    - item.path is search('config-next-(\\d{8})')
+
+- name: Sort configs based on date extracted from filename
+  ansible.builtin.set_fact:
+    sorted_configs: "{{ configs_with_dates | selectattr('date', 'defined') | sort(attribute='date', reverse=True) | map(attribute='file') | list }}"
+  when:
+    - not config_kdevops.stat.exists
+    - not kernel_config.stat.exists
+    - configs_with_dates | length > 0
+
+- name: Set latest linux-next config
+  ansible.builtin.set_fact:
+    latest_linux_next_config: "{{ sorted_configs[0] }}"
+  when:
+    - not config_kdevops.stat.exists and not kernel_config.stat.exists
+    - sorted_configs | length > 0
+
+- name: Use the specific kernel config or fallback to the latest linux-next
+  ansible.builtin.set_fact:
+    linux_config: "{{ target_linux_config | default('') if kernel_config.stat.exists else (latest_linux_next_config | default('') | basename) }}"
+  when:
+    - not config_kdevops.stat.exists
+    - not kernel_config.stat.exists
+    - latest_linux_next_config is defined
+
+- name: Verify that the Linux configuration file exists
+  delegate_to: localhost
+  ansible.builtin.stat:
+    path: "{{ template_path }}/{{ linux_config }}"
+  register: config_stat
+  when: linux_config is defined
+
+- name: Fail if the configuration file does not exist
+  ansible.builtin.fail:
+    msg: "The configuration file {{ template_path }}/{{ linux_config }} does not exist."
+  when: not config_stat.stat.exists
+
+- name: Copy configuration for Linux {{ target_linux_tree }}
+  ansible.builtin.template:
+    src: "{{ template_path }}/{{ linux_config }}"
+    dest: "{{ target_linux_dir_path }}/.config"
+    owner: "{{ data_user }}"
+    group: "{{ data_group }}"
+    mode: "u=rw,g=r,o=r"
+
+- name: Set the kernel localversion
+  ansible.builtin.lineinfile:
+    path: "{{ target_linux_dir_path }}/localversion"
+    line: "{{ target_linux_localversion }}"
+    mode: "u=rw,g=r,o=r"
+    create: true
+  when:
+    - target_linux_localversion is defined and target_linux_localversion != ""
+
+- name: Configure Linux {{ target_linux_tree }}
+  community.general.make:
+    chdir: "{{ target_linux_dir_path }}"
+    target: "olddefconfig"
+
+- name: Build {{ target_linux_tree }}
+  community.general.make:
+    chdir: "{{ target_linux_dir_path }}"
+    target: "all"
+    jobs: "{{ ansible_processor_nproc }}"
+
+- name: Remove the artifacts directory
+  delegate_to: localhost
+  ansible.builtin.file:
+    path: "{{ build_artifacts_dir }}"
+    state: absent
+
+- name: Create an empty artifacts directory
+  delegate_to: localhost
+  ansible.builtin.file:
+    path: "{{ build_artifacts_dir }}"
+    state: directory
+    mode: "u=rwx,g=rx,o=rx"
+
+- name: Build kernel .deb packages
+  when:
+    - ansible_os_family == "Debian"
+  block:
+    - name: Make the bindeb-pkg target
+      community.general.make:
+        chdir: "{{ target_linux_dir_path }}"
+        target: "bindeb-pkg"
+
+    - name: Find the build artifacts
+      ansible.builtin.find:
+        paths: "{{ target_linux_dir_path }}"
+        patterns: "*.deb"
+        file_type: file
+        recurse: true
+      register: found_debs
+
+    - name: Fetch the build artifacts to the control host
+      ansible.builtin.fetch:
+        src: "{{ item.path }}"
+        dest: "{{ build_artifacts_dir }}"
+        flat: true
+      loop: "{{ found_debs.files }}"
+      loop_control:
+        label: "Fetching {{ item.path }}"
+
+- name: Build kernel .rpm packages
+  when:
+    - ansible_os_family == "RedHat"
+  block:
+    - name: Make the binrpm-pkg target
+      community.general.make:
+        chdir: "{{ target_linux_dir_path }}"
+        target: "binrpm-pkg"
+
+    - name: Find the build artifacts
+      ansible.builtin.find:
+        paths: "{{ target_linux_dir_path }}/rpmbuild/RPMS"
+        patterns: "*.rpm"
+        file_type: file
+        recurse: true
+      register: found_rpms
+
+    - name: Fetch the build artifacts to the control host
+      ansible.builtin.fetch:
+        src: "{{ item.path }}"
+        dest: "{{ build_artifacts_dir }}"
+        flat: true
+      loop: "{{ found_rpms.files }}"
+      loop_control:
+        label: "Fetching {{ item.path }}"
+
+- name: Extract the release information of the built kernel
+  community.general.make:
+    chdir: "{{ target_linux_dir_path }}"
+    target: "kernelrelease"
+  register: kernelrelease
+
+- name: Store the kernel release information with the build artifacts
+  delegate_to: localhost
+  ansible.builtin.lineinfile:
+    create: true
+    line: "{{ kernelrelease.stdout }}"
+    mode: "u=rw,g=r,o=r"
+    path: "{{ build_artifacts_dir }}/kernel.release"
diff --git a/workflows/linux/Makefile b/workflows/linux/Makefile
index 65bbb8ae9a90..bb7441e71fda 100644
--- a/workflows/linux/Makefile
+++ b/workflows/linux/Makefile
@@ -84,6 +84,7 @@ linux-help-menu:
 	@echo "linux-clone        - Only clones Linux"
 	@echo "linux-grub-setup   - Ensures the appropriate target kernel is set to boot"
 	@echo "linux-reboot       - Reboot guests"
+	@echo "linux-packages     - Clones, builds, and packages a Linux kernel"
 	@echo "uname              - Prints current running kernel"
 
 PHONY += linux-help-end
@@ -159,6 +160,12 @@ linux-reboot:
 		$(KDEVOPS_HOSTFILE) $(KDEVOPS_PLAYBOOKS_DIR)/bootlinux.yml \
 		--extra-vars="$(BOOTLINUX_ARGS)" $(LIMIT_HOSTS) --tags vars,reboot
 
+PHONY += linux-packages
+linux-packages:
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) -i $(KDEVOPS_HOSTFILE) \
+		$(KDEVOPS_PLAYBOOKS_DIR)/build_linux.yml \
+		--extra-vars="$(BOOTLINUX_ARGS)" $(LIMIT_HOSTS)
+
 PHONY += uname
 uname:
 	$(Q)ansible all -i hosts -b -m command -a "uname -r" -o \
-- 
2.49.0



* [RFC PATCH 3/3] Experimental: Add a separate install_linux role
  2025-04-22 15:49 [RFC PATCH 0/3] Build once, test everywhere cel
  2025-04-22 15:49 ` [RFC PATCH 1/3] Add a guest/instance for building the test kernel cel
  2025-04-22 15:49 ` [RFC PATCH 2/3] playbooks: Add a build_linux role cel
@ 2025-04-22 15:49 ` cel
  2025-04-23  5:27 ` [RFC PATCH 0/3] Build once, test everywhere Luis Chamberlain
  2025-04-23 12:34 ` Daniel Gomez
  4 siblings, 0 replies; 9+ messages in thread
From: cel @ 2025-04-22 15:49 UTC (permalink / raw)
  To: Luis Chamberlain, Daniel Gomez, Scott Mayhew; +Cc: kdevops, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Add a role that can grab the artifacts in workflows/linux/artifacts
and install them on all guests/instances.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 playbooks/install_linux.yml                   |   4 +
 playbooks/roles/install_linux/README.md       | 136 ++++++++++++
 .../roles/install_linux/defaults/main.yml     |  43 ++++
 .../tasks/install-deps/debian/main.yml        |  44 ++++
 .../tasks/install-deps/redhat/main.yml        |  76 +++++++
 .../tasks/install-deps/suse/main.yml          |  31 +++
 playbooks/roles/install_linux/tasks/main.yml  | 142 +++++++++++++
 .../tasks/update-grub/debian.yml              |   8 +
 .../tasks/update-grub/install.yml             | 196 ++++++++++++++++++
 .../install_linux/tasks/update-grub/main.yml  |  15 ++
 .../tasks/update-grub/redhat.yml              |  36 ++++
 .../install_linux/tasks/update-grub/suse.yml  |  11 +
 workflows/linux/Makefile                      |   8 +
 13 files changed, 750 insertions(+)
 create mode 100644 playbooks/install_linux.yml
 create mode 100644 playbooks/roles/install_linux/README.md
 create mode 100644 playbooks/roles/install_linux/defaults/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/install-deps/debian/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/install-deps/redhat/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/install-deps/suse/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/debian.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/install.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/main.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/redhat.yml
 create mode 100644 playbooks/roles/install_linux/tasks/update-grub/suse.yml

diff --git a/playbooks/install_linux.yml b/playbooks/install_linux.yml
new file mode 100644
index 000000000000..273ef4512f81
--- /dev/null
+++ b/playbooks/install_linux.yml
@@ -0,0 +1,4 @@
+---
+- hosts: all
+  roles:
+    - role: install_linux
diff --git a/playbooks/roles/install_linux/README.md b/playbooks/roles/install_linux/README.md
new file mode 100644
index 000000000000..0b440d2130b0
--- /dev/null
+++ b/playbooks/roles/install_linux/README.md
@@ -0,0 +1,136 @@
+bootlinux
+=========
+
+The ansible bootlinux role lets you get, build and install Linux. It also lets
+you apply custom patches, remove kernels, and so on: anything you need for
+generic kernel development. By default it tracks one of the latest stable
+kernels that are still supported, using the linux-stable git tree.
+
+Requirements
+------------
+
+You are expected to have an extra partition.
+
+Role Variables
+--------------
+
+  * infer_uid_and_group: defaults to False. If set to True, we ignore the
+    data_user and data_group passed in and instead infer them by running
+    `whoami` and getent on the target system we are provisioning, as the
+    logged-in user. So if user sam is running ansible on a host, targeting a
+    system called foofighter and logging into that system as username pincho,
+    then data_user will be overwritten and set to pincho. We will then
+    also look up pincho's default group id and use that for data_group.
+    This is useful if you are targeting a slew of systems and don't really
+    want to deal with the complexities of the username and group, and the
+    default target username you use to ssh into a system suffices to use as
+    a base. This is set to False to remain compatible with old users of
+    this role.
+  * data_path: where to place the git trees we clone under
+  * data_user: the user to assign permissions to
+  * data_group: the group to assign permissions to
+
+  * data_device: the target device to use for the data partition
+  * data_fstype: the filesystem to store the data partition under
+  * data_label: the label to use
+  * data_fs_opts: the filesystem options to use, you want to ensure to add the
+    label
+
+  * target_linux_admin_name: your developer name
+  * target_linux_admin_email: your email
+  * target_linux_git: the git tree to clone, by default this is the linux-stable
+    tree
+  * target_linux_tree: the name of the tree
+  * target_linux_dir_path: where to place the tree on the target system
+
+  * target_linux_ref: the actual tag as used on linux, e.g. v4.19.62
+  * target_linux_extra_patch: if defined, an extra patch to apply with git
+     am prior to compilation
+  * target_linux_config: the configuration file to use
+  * make: the make command to use
+  * target_linux_make_cmd: the actual full make command and its arguments
+  * target_linux_make_install_cmd: the install command
+
+Dependencies
+------------
+
+None.
+
+Example Playbook
+----------------
+
+Below is an example playbook, say a bootlinux.yml file:
+
+```
+---
+- hosts: all
+  roles:
+    - role: bootlinux
+```
+
+Custom runs
+===========
+
+Say you want to compile and boot a vanilla kernel and you have created a new
+section under the hosts file called [dev], with a subset of the [all] section.
+You can compile, say, a vanilla kernel v4.19.58 with an extra set of patches
+we'd `git am` for you on top by using the following:
+
+```
+cd ansible
+ansible-playbook -i hosts -l dev --extra-vars "target_linux_extra_patch=pend-v4.19.58-fixes-20190716-v2.patch" bootlinux.yml
+```
+
+You'd place the `pend-v4.19.58-fixes-20190716-v2.patch` file in the directory
+`ansible/roles/bootlinux/templates/`.
+
+Now say you wanted to be explicit about the tag of Linux you want to use:
+
+```
+ansible-playbook -i hosts -l dev --extra-vars "target_linux_ref=v4.19.21 target_linux_extra_patch=try-v4.19.20-fixes-20190716-v1.patch" bootlinux.yml
+```
+
+To uninstall a kernel:
+
+```
+ansible-playbook -i hosts -l dev --tags uninstall-linux --extra-vars "uninstall_kernel_ver=4.19.58+" bootlinux.yml
+```
+
+To ensure you can get the grub prompt:
+
+```bash
+ansible-playbook -i hosts --tags console,vars,manual-update-grub playbooks/bootlinux.yml
+```
+
+The ansible bootlinux role relies on the create_partition role to create a data
+partition where we can stuff code, and compile it. To test that aspect of
+the bootlinux role you can run:
+
+```
+ansible-playbook -i hosts -l baseline --tags data_partition,partition bootlinux.yml
+```
+
+To reboot all hosts:
+
+```bash
+ansible-playbook -i hosts bootlinux.yml --tags reboot
+```
+
+For further examples refer to one of this role's users, the
+[kdevops](https://github.com/mcgrof/kdevops) project or the
+[oscheck](https://github.com/mcgrof/oscheck) project, where
+this code originally came from.
+
+# TODO
+
+## Avoiding carrying linux-next configs
+
+It seems a waste of space to be adding configurations for linux-next for all
+tags. It seems easier to just look for the latest linux-next and try that.
+We just symlink linux-next files when we really need to, and when something
+really needs a new config, we then just add a new file.
+
+License
+-------
+
+copyleft-next-0.3.1
diff --git a/playbooks/roles/install_linux/defaults/main.yml b/playbooks/roles/install_linux/defaults/main.yml
new file mode 100644
index 000000000000..43edb3cafd79
--- /dev/null
+++ b/playbooks/roles/install_linux/defaults/main.yml
@@ -0,0 +1,43 @@
+# SPDX-License-Identifier: copyleft-next-0.3.1
+---
+kdevops_bootlinux: False
+infer_uid_and_group: False
+
+data_path: "/data"
+data_user: "vagrant"
+data_group: "vagrant"
+
+data_device: "/dev/disk/by-id/nvme-QEMU_NVMe_Ctrl_kdevops0"
+data_fstype: "xfs"
+data_label: "data"
+data_fs_opts: "-L {{ disk_setup_label }}"
+
+# Linux target defaults
+target_linux_admin_name: "Hacker Amanda"
+target_linux_admin_email: "devnull@kernel.org"
+target_linux_git: "https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git"
+target_linux_shallow_depth: 0
+target_linux_tree: "linux-stable"
+target_linux_dir_path: "{{ data_path }}/{{ target_linux_tree }}"
+kdevops_baseline_and_dev: False
+
+target_linux_ref: "v4.19.133"
+target_linux_delta_file:
+target_linux_config: "config-{{ target_linux_ref }}"
+make: "make"
+# Once ansible v2.10 becomes available we can move on to using
+# ansible_processor_nproc but that was merged in 2020:
+# The commit is 34db57a47f875d11c4068567b9ec7ace174ec4cf
+# introduce fact "ansible_processor_nproc": number of usable vcpus #66569
+# https://github.com/ansible/ansible/pull/66569
+target_linux_make_cmd: "{{ make }} -j{{ ansible_processor_vcpus }}"
+target_linux_make_install_cmd: "{{ target_linux_make_cmd }} modules_install install"
+
+uninstall_kernel_enable: False
+
+build_artifacts_dir: "{{ topdir_path }}/workflows/linux/artifacts/"
+
+kdevops_workflow_enable_cxl: False
+
+bootlinux_cxl_test: False
+bootlinux_tree_set_by_cli: False
diff --git a/playbooks/roles/install_linux/tasks/install-deps/debian/main.yml b/playbooks/roles/install_linux/tasks/install-deps/debian/main.yml
new file mode 100644
index 000000000000..51b216e47b06
--- /dev/null
+++ b/playbooks/roles/install_linux/tasks/install-deps/debian/main.yml
@@ -0,0 +1,44 @@
+---
+# Install dependencies for building linux on Debian
+
+- name: Update apt cache
+  become: yes
+  become_method: sudo
+  apt:
+    update_cache: yes
+  tags: linux
+
+# apt-get build-dep does not capture all requirements
+- name: Install Linux kernel build dependencies
+  become: yes
+  become_method: sudo
+  apt:
+    name:
+      - bison
+      - flex
+      - git
+      - gcc
+      - make
+      - gawk
+      - bc
+      - dump
+      - indent
+      - sed
+      - libssl-dev
+      - libelf-dev
+      - liburcu-dev
+      - xfsprogs
+      - e2fsprogs
+      - btrfs-progs
+      - ntfs-3g
+      - mdadm
+      - rpcbind
+      - portmap
+      - hwinfo
+      - open-iscsi
+      - python3-pip
+      - zstd
+      - libncurses-dev
+      - b4
+    state: present
+  tags: linux
diff --git a/playbooks/roles/install_linux/tasks/install-deps/redhat/main.yml b/playbooks/roles/install_linux/tasks/install-deps/redhat/main.yml
new file mode 100644
index 000000000000..57a340979fcd
--- /dev/null
+++ b/playbooks/roles/install_linux/tasks/install-deps/redhat/main.yml
@@ -0,0 +1,76 @@
+---
+- name: Enable installation of packages from EPEL
+  ansible.builtin.include_role:
+    name: epel-release
+  when:
+    - ansible_distribution != "Fedora"
+
+- name: Install packages we care about
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.dnf:
+    update_cache: true
+    name: "{{ packages }}"
+  retries: 3
+  delay: 5
+  register: result
+  until: result is succeeded
+  vars:
+    packages:
+      - bison
+      - flex
+      - git-core
+      - e2fsprogs
+      - xfsprogs
+      - xfsdump
+      - lvm2
+      - gcc
+      - make
+      - gawk
+      - bc
+      - dump
+      - libtool
+      - psmisc
+      - sed
+      - vim
+      - fio
+      - libaio-devel
+      - diffutils
+      - net-tools
+      - ncurses-devel
+      - xfsprogs
+      - e2fsprogs
+      - elfutils-libelf-devel
+      - ntfs-3g
+      - mdadm
+      - rpcbind
+      - portmap
+      - hwinfo
+      - iscsi-initiator-utils
+      - openssl
+      - openssl-devel
+      - dwarves
+      - userspace-rcu
+      - zstd
+
+- name: Install btrfs-progs
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.dnf:
+    update_cache: true
+    name: "{{ packages }}"
+  retries: 3
+  delay: 5
+  register: result
+  until: result is succeeded
+  vars:
+    packages:
+      - btrfs-progs
+  when: ansible_distribution == 'Fedora'
+
+- name: Remove packages that mess with initramfs
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.dnf:
+    state: absent
+    name: dracut-config-generic
diff --git a/playbooks/roles/install_linux/tasks/install-deps/suse/main.yml b/playbooks/roles/install_linux/tasks/install-deps/suse/main.yml
new file mode 100644
index 000000000000..204a181b8237
--- /dev/null
+++ b/playbooks/roles/install_linux/tasks/install-deps/suse/main.yml
@@ -0,0 +1,31 @@
+---
+- name: Install Linux kernel build dependencies for SUSE sources
+  become: yes
+  become_method: sudo
+  zypper:
+    name:
+      - bison
+      - flex
+      - git-core
+      - gcc
+      - make
+      - gawk
+      - bc
+      - dump
+      - sed
+      - libopenssl-devel
+      - libelf-devel
+      - liburcu8
+      - diffutils
+      - net-tools
+      - ncurses-devel
+      - xfsprogs
+      - e2fsprogs
+      - btrfsprogs
+      - ntfs-3g
+      - mdadm
+      - rpcbind
+      - portmap
+      - hwinfo
+      - open-iscsi
+    disable_recommends: no
diff --git a/playbooks/roles/install_linux/tasks/main.yml b/playbooks/roles/install_linux/tasks/main.yml
new file mode 100644
index 000000000000..a0da4c110f6e
--- /dev/null
+++ b/playbooks/roles/install_linux/tasks/main.yml
@@ -0,0 +1,142 @@
+---
+- name: Include optional extra_vars
+  ansible.builtin.include_vars:
+    file: "{{ item }}"
+  with_first_found:
+    - files:
+        - "../extra_vars.yml"
+        - "../extra_vars.yaml"
+        - "../extra_vars.json"
+      skip: true
+  failed_when: false
+  tags: vars
+
+- name: Debian-specific set up
+  ansible.builtin.import_tasks: install-deps/debian/main.yml
+  when:
+    - ansible_os_family == "Debian"
+
+- name: Suse-specific set up
+  ansible.builtin.import_tasks: install-deps/suse/main.yml
+  when:
+    - ansible_os_family == "Suse"
+
+- name: Red Hat-specific set up
+  ansible.builtin.import_tasks: install-deps/redhat/main.yml
+  when:
+    - ansible_os_family == "RedHat"
+
+# We use "console serial" so that real consoles are preferred
+# first, with serial as the fallback. This lets us work with
+# hardware serial consoles, say on IPMIs, and with virtual
+# guests ('virsh console').
+- name: Ensure we can get the GRUB prompt on reboot
+  become: true
+  become_flags: 'su - -c'
+  become_method: ansible.builtin.sudo
+  ansible.builtin.lineinfile:
+    path: /etc/default/grub
+    regexp: '^GRUB_TERMINAL='
+    line: GRUB_TERMINAL="console serial"
+  tags:
+    - linux
+    - git
+    - config
+    - console
+
+- name: Update the boot GRUB file
+  ansible.builtin.import_tasks: update-grub/main.yml
+  tags:
+    - linux
+    - uninstall-linux
+    - manual-update-grub
+    - console
+
+- name: Ensure DEFAULTDEBUG is set
+  become: true
+  become_flags: 'su - -c'
+  become_method: ansible.builtin.sudo
+  register: grub_default_saved_cmd
+  ansible.builtin.lineinfile:
+    path: /etc/sysconfig/kernel
+    regexp: '^DEFAULTDEBUG='
+    line: DEFAULTDEBUG=yes
+  when:
+    - ansible_os_family == "RedHat"
+  tags:
+    - linux
+    - git
+    - config
+    - saved
+
+- name: Install the built kernel RPMs on the target nodes
+  when:
+    - ansible_os_family == "RedHat"
+  tags:
+    - linux
+    - install-linux
+  block:
+    - name: Find the kernel build artifacts on the control host
+      delegate_to: localhost
+      ansible.builtin.find:
+        paths: "{{ build_artifacts_dir }}"
+        patterns: "*.rpm"
+        file_type: file
+        recurse: true
+      register: found_rpms
+
+    - name: Upload the kernel build artifacts to the target nodes
+      ansible.builtin.copy:
+        src: "{{ item.path }}"
+        dest: "/tmp"
+        mode: "u=rw,g=r,o=r"
+      loop: "{{ found_rpms.files }}"
+      loop_control:
+        label: "Uploading {{ item.path }}"
+
+    - name: Initialize list of packages to install
+      ansible.builtin.set_fact:
+        packages: []
+
+    - name: Build a list of packages to install
+      ansible.builtin.set_fact:
+        packages: "{{ packages + ['/tmp/' + item.path | basename ] }}"
+      loop: "{{ found_rpms.files }}"
+      loop_control:
+        label: "Adding {{ item.path }}"
+
+    - name: Install the kernel build artifacts on the target nodes
+      become: true
+      become_method: ansible.builtin.sudo
+      ansible.builtin.dnf:
+        name: "{{ packages }}"
+        state: present
+        disable_gpg_check: true
+
+- name: Set the default kernel on the target nodes
+  ansible.builtin.import_tasks: update-grub/install.yml
+  tags:
+    - linux
+    - git
+    - config
+    - saved
+
+- name: Reboot the target nodes into Linux {{ target_linux_tree }}
+  become: true
+  become_method: ansible.builtin.sudo
+  ansible.builtin.reboot:
+  tags:
+    - linux
+    - reboot
+
+- name: Refresh facts
+  ansible.builtin.gather_facts:
+
+- name: Check the uname on the target nodes
+  ansible.builtin.debug:
+    msg: "Target kernel {{ target_linux_ref }}; Running kernel {{ ansible_kernel }}"
+  tags:
+    - linux
+    - git
+    - config
+    - uname
diff --git a/playbooks/roles/install_linux/tasks/update-grub/debian.yml b/playbooks/roles/install_linux/tasks/update-grub/debian.yml
new file mode 100644
index 000000000000..3c7deea2161a
--- /dev/null
+++ b/playbooks/roles/install_linux/tasks/update-grub/debian.yml
@@ -0,0 +1,8 @@
+- name: Run update-grub
+  become: yes
+  become_flags: 'su - -c'
+  become_method: sudo
+  command: "update-grub"
+  register: grub_update
+  changed_when: "grub_update.rc == 0"
+  tags: [ 'linux', 'manual-update-grub', 'console' ]
diff --git a/playbooks/roles/install_linux/tasks/update-grub/install.yml b/playbooks/roles/install_linux/tasks/update-grub/install.yml
new file mode 100644
index 000000000000..17966af58210
--- /dev/null
+++ b/playbooks/roles/install_linux/tasks/update-grub/install.yml
@@ -0,0 +1,196 @@
+# The user experience is slightly confusing and the documentation incomplete
+# about the requirements for using grub-set-default, given that most Linux
+# distributions use submenus. You need to use GRUB_DEFAULT=saved, and there
+# are a few caveats with its use which are not well documented anywhere;
+# I'm pretty sure tons of people are running into these issues.
+#
+# I'll document them here for posterity and to justify the approach used
+# in kdevops to ensure we do boot the correct kernel.
+#
+# Some users erroneously claim that you also need GRUB_SAVEDEFAULT=true when
+# using GRUB_DEFAULT=saved but this is not true. What causes confusion with
+# GRUB_DEFAULT=saved is that most distributions today use submenus, folks
+# do not take these into account when using grub-set-default, and the
+# documentation about grub-set-default is not clear about this
+# requirement.
+#
+# Sadly, if you use a bogus kernel grub-set-default will not complain. For
+# example since most distributions use submenus, if you install a new kernel you
+# may end up in a situation as follows:
+#
+# menuentry 'Debian GNU/Linux' ... {
+#   ...
+# }
+# submenu 'Advanced options for Debian GNU/Linux' ... {
+#   menuentry 'Debian GNU/Linux, with Linux 5.16.0-4-amd64' ... {
+#     ...
+#   }
+#   menuentry 'Debian GNU/Linux, with Linux 5.16.0-4-amd64 (recovery mode)' ... {
+#     ...
+#   }
+#   menuentry 'Debian GNU/Linux, with Linux 5.10.105' ... {
+#     ...
+#   }
+#   ... etc ...
+# }
+#
+# So under this scheme the 5.10.105 kernel is actually "1>2" and so if
+# you used:
+#
+#   grub-set-default 3
+#
+# This would not return an error and you would expect it to work. This
+# is a bug in grub-set-default, it should return an error. The correct
+# way to set this with submenus would be:
+#
+#   grub-set-default "1>2"
+#
+# However doing the reverse mapping is something which can get complicated
+# and there is no upstream GRUB2 support to do this for you. We can simplify
+# this problem instead by disabling the submenus, with GRUB_DISABLE_SUBMENU=y,
+# making the menu flat and then just querying for the linear mapping from
+# ansible using awk, grep and head.
+#
+# So for instance, using GRUB_DISABLE_SUBMENU=y results in the following
+# options:
+#
+# vagrant@kdevops-xfs-nocrc ~ $ awk -F\' '/menuentry / {print $2}' /boot/grub/grub.cfg |  awk '{print NR-1" ... "$0}'
+# 0 ... Debian GNU/Linux, with Linux 5.16.0-4-amd64
+# 1 ... Debian GNU/Linux, with Linux 5.16.0-4-amd64 (recovery mode)
+# 2 ... Debian GNU/Linux, with Linux 5.10.105
+# 3 ... Debian GNU/Linux, with Linux 5.10.105 (recovery mode)
+# 4 ... Debian GNU/Linux, with Linux 5.10.0-5-amd64
+# 5 ... Debian GNU/Linux, with Linux 5.10.0-5-amd64 (recovery mode)
+#
+# We have a higher degree of confidence with this structure when looking
+# for "5.10.105" that its respective boot entry 2 is the correct one. So we'd
+# now just use:
+#
+#   grub-set-default 2
+- name: Ensure we have GRUB_DEFAULT=saved
+  become: true
+  become_flags: 'su - -c'
+  become_method: ansible.builtin.sudo
+  ansible.builtin.lineinfile:
+    path: /etc/default/grub
+    regexp: '^GRUB_DEFAULT='
+    line: GRUB_DEFAULT=saved
+  register: grub_default_saved_cmd
+  tags:
+    - linux
+    - git
+    - config
+    - saved
+
+- name: Use GRUB_DISABLE_SUBMENU=y to enable grub-set-default use with one digit
+  become: true
+  become_flags: 'su - -c'
+  become_method: ansible.builtin.sudo
+  register: grub_disable_submenu_cmd
+  ansible.builtin.lineinfile:
+    path: /etc/default/grub
+    regexp: '^GRUB_DISABLE_SUBMENU='
+    line: GRUB_DISABLE_SUBMENU=y
+  tags:
+    - linux
+    - git
+    - config
+    - saved
+
+- name: Update your boot GRUB file if necessary to ensure GRUB flat earth
+  ansible.builtin.import_tasks: update-grub/main.yml
+  tags:
+    - linux
+    - uninstall-linux
+    - manual-update-grub
+    - console
+
+- name: Read the artifacts release file
+  delegate_to: localhost
+  vars:
+    release_file: "{{ topdir_path }}/workflows/linux/artifacts/kernel.release"
+  ansible.builtin.set_fact:
+    kernelrelease: "{{ lookup('file', release_file) }}"
+
+- name: Show kernel release information
+  ansible.builtin.debug:
+    var: kernelrelease
+
+- name: Construct command line to determine default kernel ID
+  ansible.builtin.set_fact:
+    determine_default_kernel_id: >-
+      awk -F\' '/menuentry / {print $2}'
+      /boot/grub/grub.cfg | awk '{print NR-1" ... "$0}' |
+      grep {{ kernelrelease }} | head -1 | awk '{print $1}'
+  when:
+    - ansible_os_family != 'RedHat' or ansible_distribution_major_version | int < 8
+
+- name: Construct command line to determine default kernel ID for RHEL >= 8
+  ansible.builtin.set_fact:
+    determine_default_kernel_id: >-
+      for f in $(ls -1 /boot/loader/entries/*.conf); do
+      cat $f;
+      done | grep title | awk '{ gsub("title ", "", $0); print }' | grep '{{ kernelrelease }}';
+  when:
+    - ansible_os_family == "RedHat" and ansible_distribution_major_version | int >= 8
+
+# If this fails then grub-set-default won't be run, and the assumption here
+# is either you do the work to enhance the heuristic or live happy with the
+# assumption that grub2's default of picking the latest kernel is the best
+# option.
+- name: Try to find your target kernel's GRUB boot entry number now that the menu is flattened for {{ target_linux_ref }} using inferred KERNELRELEASE {{ kernelrelease }}
+  become: true
+  become_flags: 'su - -c'
+  become_method: ansible.builtin.sudo
+  vars:
+    target_kernel: "{{ target_linux_ref | replace('v', '') }}"
+  ansible.builtin.shell: " {{ determine_default_kernel_id }} "
+  register: grub_boot_number_cmd
+  tags:
+    - linux
+    - git
+    - config
+    - saved
+
+- name: Obtain command to set default kernel to boot
+  ansible.builtin.set_fact:
+    grub_set_default_boot_kernel: grub-set-default
+  when:
+    - ansible_os_family != "RedHat" or ansible_distribution_major_version | int < 8
+
+- name: Obtain command to set default kernel to boot for RHEL >= 8
+  ansible.builtin.set_fact:
+    grub_set_default_boot_kernel: grub2-set-default
+  when:
+    - ansible_os_family == "RedHat" and ansible_distribution_major_version | int >= 8
+
+- name: Set the target kernel to be booted by default
+  become: true
+  become_flags: 'su - -c'
+  become_method: ansible.builtin.sudo
+  vars:
+    target_boot_entry: "{{ grub_boot_number_cmd.stdout_lines.0 }}"
+  ansible.builtin.command: "{{ grub_set_default_boot_kernel }} \"{{ target_boot_entry }}\""
+  when:
+    - grub_boot_number_cmd.rc == 0
+    - grub_boot_number_cmd.stdout != ""
+  tags:
+    - linux
+    - git
+    - config
+    - saved
+
+- name: Itemize kernel and GRUB entry we just selected
+  vars:
+    target_kernel: "{{ target_linux_ref | replace('v', '') }}"
+    target_boot_entry: "{{ grub_boot_number_cmd.stdout_lines.0 }}"
+  ansible.builtin.debug:
+    msg: "{{ target_kernel }} determined to be {{ target_boot_entry }} on the GRUB2 flat menu. Ran: grub-set-default {{ target_boot_entry }}"
+  when:
+    - grub_boot_number_cmd.rc == 0
+    - grub_boot_number_cmd.stdout != ""
+  tags:
+    - linux
+    - git
+    - config
+    - saved
diff --git a/playbooks/roles/install_linux/tasks/update-grub/main.yml b/playbooks/roles/install_linux/tasks/update-grub/main.yml
new file mode 100644
index 000000000000..a565e0ac26ac
--- /dev/null
+++ b/playbooks/roles/install_linux/tasks/update-grub/main.yml
@@ -0,0 +1,15 @@
+---
+- name: Debian-specific grub update
+  ansible.builtin.import_tasks: debian.yml
+  when:
+    - ansible_os_family == "Debian"
+
+- name: Red Hat-specific grub update
+  ansible.builtin.import_tasks: redhat.yml
+  when:
+    - ansible_os_family == "RedHat"
+
+- name: Suse-specific grub update
+  ansible.builtin.import_tasks: suse.yml
+  when:
+    - ansible_os_family == "Suse"
diff --git a/playbooks/roles/install_linux/tasks/update-grub/redhat.yml b/playbooks/roles/install_linux/tasks/update-grub/redhat.yml
new file mode 100644
index 000000000000..11a92f34bab6
--- /dev/null
+++ b/playbooks/roles/install_linux/tasks/update-grub/redhat.yml
@@ -0,0 +1,36 @@
+- name: Disable Grub menu auto-hide
+  become: true
+  become_flags: 'su - -c'
+  become_method: ansible.builtin.sudo
+  ansible.builtin.command: "grub2-editenv - unset menu_auto_hide"
+  register: grub_edit
+  changed_when: "grub_edit.rc == 0"
+
+- name: Determine if system was booted using UEFI
+  ansible.builtin.stat:
+    path: "/sys/firmware/efi/efivars"
+  register: efi_boot
+
+- name: Use /etc/grub2.cfg as the grub configuration file
+  ansible.builtin.set_fact:
+    grub_config_file: "/etc/grub2.cfg"
+  when:
+    - not efi_boot.stat.exists
+
+- name: Use /etc/grub2-efi.cfg as the configuration file
+  ansible.builtin.set_fact:
+    grub_config_file: "/etc/grub2-efi.cfg"
+  when:
+    - efi_boot.stat.exists
+
+- name: Run update-grub
+  become: true
+  become_flags: 'su - -c'
+  become_method: ansible.builtin.sudo
+  ansible.builtin.command: "grub2-mkconfig -o {{ grub_config_file }}"
+  register: grub_update
+  changed_when: "grub_update.rc == 0"
+  tags:
+    - linux
+    - manual-update-grub
+    - console
diff --git a/playbooks/roles/install_linux/tasks/update-grub/suse.yml b/playbooks/roles/install_linux/tasks/update-grub/suse.yml
new file mode 100644
index 000000000000..1cb1fdc55f58
--- /dev/null
+++ b/playbooks/roles/install_linux/tasks/update-grub/suse.yml
@@ -0,0 +1,11 @@
+- name: Run update-grub
+  become: true
+  become_flags: 'su - -c'
+  become_method: ansible.builtin.sudo
+  ansible.builtin.command: "update-bootloader --refresh"
+  register: grub_update
+  changed_when: "grub_update.rc == 0"
+  tags:
+    - linux
+    - manual-update-grub
+    - console
diff --git a/workflows/linux/Makefile b/workflows/linux/Makefile
index bb7441e71fda..06722d5903b3 100644
--- a/workflows/linux/Makefile
+++ b/workflows/linux/Makefile
@@ -85,6 +85,7 @@ linux-help-menu:
 	@echo "linux-grub-setup   - Ensures the appropriate target kernel is set to boot"
 	@echo "linux-reboot       - Reboot guests"
 	@echo "linux-packages     - Clones, builds, and packages a Linux kernel"
+	@echo "linux-artifacts    - Installs artifacts generated by 'linux-packages'"
 	@echo "uname              - Prints current running kernel"
 
 PHONY += linux-help-end
@@ -166,6 +167,13 @@ linux-packages:
 		$(KDEVOPS_PLAYBOOKS_DIR)/build_linux.yml \
 		--extra-vars="$(BOOTLINUX_ARGS)" $(LIMIT_HOSTS)
 
+PHONY += linux-artifacts
+linux-artifacts:
+	$(Q)ansible-playbook $(ANSIBLE_VERBOSE) \
+		-i $(KDEVOPS_HOSTFILE) \
+		$(KDEVOPS_PLAYBOOKS_DIR)/install_linux.yml \
+		--extra-vars="$(BOOTLINUX_ARGS)" $(LIMIT_HOSTS)
+
 PHONY += uname
 uname:
 	$(Q)ansible all -i hosts -b -m command -a "uname -r" -o \
-- 
2.49.0



* Re: [RFC PATCH 0/3] Build once, test everywhere
  2025-04-22 15:49 [RFC PATCH 0/3] Build once, test everywhere cel
                   ` (2 preceding siblings ...)
  2025-04-22 15:49 ` [RFC PATCH 3/3] Experimental: Add a separate install_linux role cel
@ 2025-04-23  5:27 ` Luis Chamberlain
  2025-04-23 12:34 ` Daniel Gomez
  4 siblings, 0 replies; 9+ messages in thread
From: Luis Chamberlain @ 2025-04-23  5:27 UTC (permalink / raw)
  To: cel; +Cc: Daniel Gomez, Scott Mayhew, kdevops, Chuck Lever

On Tue, Apr 22, 2025 at 11:49:03AM -0400, cel@kernel.org wrote:
> So I got out my whittling knife and built this proof of concept
> where kdevops brings up a kernel builder node that is tailored for
> a fast kernel build (eg, it is one guest with 16 vCPUs). The test
> kernel is packaged (.rpm or .deb) and fetched to the control host,
> then the builder node is destroyed.
> 
> I added a top-level "make" target that uploads the test kernel
> packages to each test runner, and they install it.
> 
> This is not part of the patch series, but shows how to run it:
> 
> $ make mrproper defconfig-kernel-builder
> $ make && make bringup && make linux-packages && make destroy
> $ make defconfig-workflow-one
> $ make && make bringup && make linux-artifacts && make destroy
> $ make defconfig-workflow-two
> $ make && make bringup && make linux-artifacts && make destroy
> $ make defconfig-workflow-three
> $ make && make bringup && make linux-artifacts && make destroy
> $ make mrproper
> 
> All three workflows use the same kernel, built just once, for all
> of their test runners.
> 
> This is still band-aids and chewing gum, but it seems to work on
> both libvirt and cloud configurations. There is more than one way
> to skin this cat, though.

This all looks good to me. We also have a slew of CIs which could
take advantage of daily linux and linux-next builds when that is the
target kernel. Perhaps we could leverage the openSUSE Build Service for
that, since I think it can produce both rpm and deb binaries for us.

But yeah all this work in this series makes perfect sense to me!
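
For reference, the quoted command sequence could be driven by a small
loop. This is only a sketch and not part of the series: the function
names and the RUN variable are made up for illustration, and the
defconfig names are the placeholders from the quoted example. RUN
defaults to "echo" so the script just prints what it would do.

```shell
#!/bin/sh
# Sketch only. RUN defaults to "echo" (dry run); set RUN="" to really
# execute the make targets from the top of a kdevops tree.
RUN="${RUN-echo}"

# Build and package the test kernel once on the builder node.
build_once() {
    $RUN make mrproper defconfig-kernel-builder &&
    $RUN make && $RUN make bringup &&
    $RUN make linux-packages && $RUN make destroy
}

# Bring up one workflow's runners and install the prebuilt packages.
test_workflow() {
    $RUN make "defconfig-$1" &&
    $RUN make && $RUN make bringup &&
    $RUN make linux-artifacts && $RUN make destroy
}

# One kernel build, then every workflow named on the command line.
run_all() {
    build_once || return 1
    for wf in "$@"; do
        test_workflow "$wf" || return 1
    done
    $RUN make mrproper
}

run_all workflow-one workflow-two workflow-three
```

Stopping at the first failing step keeps a broken build from being
pushed to the later workflows.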

 Luis


* Re: [RFC PATCH 0/3] Build once, test everywhere
  2025-04-22 15:49 [RFC PATCH 0/3] Build once, test everywhere cel
                   ` (3 preceding siblings ...)
  2025-04-23  5:27 ` [RFC PATCH 0/3] Build once, test everywhere Luis Chamberlain
@ 2025-04-23 12:34 ` Daniel Gomez
  2025-04-23 13:36   ` Chuck Lever
  4 siblings, 1 reply; 9+ messages in thread
From: Daniel Gomez @ 2025-04-23 12:34 UTC (permalink / raw)
  To: cel; +Cc: Luis Chamberlain, Scott Mayhew, kdevops, Chuck Lever

On Tue, Apr 22, 2025 at 11:49:03AM +0100, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> With apologies to the Java folks.
> 
> I've been looking for ways to make our workflow runs more efficient
> because my lab resources are limited, and moving to the cloud costs
> money per CPU second.
> 
> One thing that seems like an obvious win would be to build the test
> kernel one time (for example, after a merge/pull request), then
> re-use that kernel binary for all the workflows we want to run on
> it.

What about rebuilds between iterations?
I think an extra step to improve that build time could be adding support for
ccache in our kernel builds. Would it make sense to extend this in the future to
enable a redis node?

https://github.com/ccache/ccache/wiki/Redis-storage

With the workflow suggestion below, I'd then do:

1. Bring up a redis node guest
2. Bring up a kernel builder node that pulls from redis node ccache
3. Push redis artifacts to control host (for later sync up) and destroy redis guest
4. Push kernel package to control host
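
A rough sketch of what step 2 could look like on the builder node,
assuming ccache 4.7 or later built with redis support. The host name,
port, and the kernel_build_cmd helper are hypothetical; only the
CCACHE_REMOTE_STORAGE variable and the redis:// URL form come from
ccache itself. The build command is printed, not run.

```shell
#!/bin/sh
# Hypothetical address of the redis guest; override via the environment.
REDIS_HOST="${REDIS_HOST:-redis-node.example}"
REDIS_PORT="${REDIS_PORT:-6379}"

# ccache 4.7+ reads its remote backend from CCACHE_REMOTE_STORAGE
# (ccache 4.4 through 4.6 used the name CCACHE_SECONDARY_STORAGE).
CCACHE_REMOTE_STORAGE="redis://${REDIS_HOST}:${REDIS_PORT}"
export CCACHE_REMOTE_STORAGE

# Print the kernel build command instead of running it. CC is
# overridden so that every compile goes through ccache.
kernel_build_cmd() {
    printf 'make CC="ccache gcc" -j%s\n' "${JOBS:-16}"
}

kernel_build_cmd
```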

That said, all this new workflow makes sense to me. I guess instead of
duplicating the bootlinux role into build_linux I'd like to see a linux
role that handles build location (localhost/specific guest), output package
generation or not, deployment, etc. Does that make sense to you too?


* Re: [RFC PATCH 0/3] Build once, test everywhere
  2025-04-23 12:34 ` Daniel Gomez
@ 2025-04-23 13:36   ` Chuck Lever
  2025-04-23 17:28     ` Daniel Gomez
  0 siblings, 1 reply; 9+ messages in thread
From: Chuck Lever @ 2025-04-23 13:36 UTC (permalink / raw)
  To: Daniel Gomez; +Cc: Luis Chamberlain, Scott Mayhew, kdevops, Chuck Lever

On 4/23/25 8:34 AM, Daniel Gomez wrote:
> On Tue, Apr 22, 2025 at 11:49:03AM +0100, cel@kernel.org wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> With apologies to the Java folks.
>>
>> I've been looking for ways to make our workflow runs more efficient
>> because my lab resources are limited, and moving to the cloud costs
>> money per CPU second.
>>
>> One thing that seems like an obvious win would be to build the test
>> kernel one time (for example, after a merge/pull request), then
>> re-use that kernel binary for all the workflows we want to run on
>> it.
> 
> What about rebuilds between iterations?
> I think an extra step to improve that build time could be adding support for
> ccache in our kernel builds. Would it make sense to extend this in the future to
> enable a redis node?
> 
> https://github.com/ccache/ccache/wiki/Redis-storage
> 
> With the workflow suggestion below, I'd then do:
> 
> 1. Bring up a redis node guest
> 2. Bring up a kernel builder node that pulls from redis node ccache
> 3. Push redis artifacts to control host (for later sync up) and destroy redis guest
> 4. Push kernel package to control host

So I've asked our internal testing folks about using ccache to speed up
kernel builds. There were various reasons not to do this, one of which
is we want to have a clean, trustworthy, and repeatable kernel build to
test with.

I'm not sure I have a specific technical argument against using ccache
except that our builder node is ephemeral, making its ccache usable for
only a single build. A Redis storage node might solve that issue.
However I wonder if the time it takes to bring up the node, reload its
storage, then after the build save the redis artifacts and destroy it
might wipe out any benefit.

With libvirt guests we can save several minutes by doing the guest bring
up in parallel (via Ansible) instead of serially via a shell script.
I've got some ideas about that.


> That said, all this new workflow makes sense to me. I guess instead of
> duplicating the bootlinux role into build_linux I'd like to see a linux
> role that handles build location (localhost/specific guest), output package
> generation or not, deployment, etc. Does that make sense to you too?

The bootlinux role has grown a slew of special cases over the years, so
I'm in the mood to help reorganize it a little.

At least split it into a role that handles the kernel build, and one
that handles the minutiae of installing the builds on test runners.

I'm not quite following your suggestion, but I'm interested in hearing
more detail.


-- 
Chuck Lever


* Re: [RFC PATCH 0/3] Build once, test everywhere
  2025-04-23 13:36   ` Chuck Lever
@ 2025-04-23 17:28     ` Daniel Gomez
  2025-04-24 13:51       ` Chuck Lever
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Gomez @ 2025-04-23 17:28 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Luis Chamberlain, Scott Mayhew, kdevops, Chuck Lever

On Wed, Apr 23, 2025 at 09:36:37AM +0100, Chuck Lever wrote:
> On 4/23/25 8:34 AM, Daniel Gomez wrote:
> > On Tue, Apr 22, 2025 at 11:49:03AM +0100, cel@kernel.org wrote:
> >> From: Chuck Lever <chuck.lever@oracle.com>
> >>
> >> With apologies to the Java folks.
> >>
> >> I've been looking for ways to make our workflow runs more efficient
> >> because my lab resources are limited, and moving to the cloud costs
> >> money per CPU second.
> >>
> >> One thing that seems like an obvious win would be to build the test
> >> kernel one time (for example, after a merge/pull request), then
> >> re-use that kernel binary for all the workflows we want to run on
> >> it.
> > 
> > What about rebuilds between iterations?
> > I think an extra step to improve that build time could be adding support for
> > ccache in our kernel builds. Would it make sense to extend this in the future to
> > enable a redis node?
> > 
> > https://github.com/ccache/ccache/wiki/Redis-storage
> > 
> > With the workflow suggestion below, I'd then do:
> > 
> > 1. Bring up a redis node guest
> > 2. Bring up a kernel builder node that pulls from redis node ccache
> > 3. Push redis artifacts to control host (for later sync up) and destroy redis guest
> > 4. Push kernel package to control host
> 
> So I've asked our internal testing folks about using ccache to speed up
> kernel builds. There were various reasons not to do this, one of which
> is we want to have a clean trustworthy and repeatable kernel build to
> test with.
> 
> I'm not sure I have a specific technical argument against using ccache
> except that our builder node is ephemeral, making its ccache usable for
> only a single build. A Redis storage node might solve that issue.

Build artifacts can also be shared between build node instances, as long
as the builds are identical, following the reproducible builds guide [1].

[1]
https://docs.kernel.org/kbuild/reproducible-builds.html#reproducible-builds
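
As a sketch of what that could mean in practice: the document above
names KBUILD_BUILD_TIMESTAMP, KBUILD_BUILD_USER and KBUILD_BUILD_HOST
as values to pin. The values below are placeholders, not anything
kdevops sets today. Any fixed values should do, as long as every
builder uses the same ones.

```shell
#!/bin/sh
# Placeholder values; keep any value already set in the environment.
KBUILD_BUILD_TIMESTAMP="${KBUILD_BUILD_TIMESTAMP:-2025-04-22 00:00:00 UTC}"
KBUILD_BUILD_USER="${KBUILD_BUILD_USER:-kdevops}"
KBUILD_BUILD_HOST="${KBUILD_BUILD_HOST:-kernel-builder}"
export KBUILD_BUILD_TIMESTAMP KBUILD_BUILD_USER KBUILD_BUILD_HOST

# Show what a subsequent kernel "make" would inherit.
env | grep '^KBUILD_BUILD_' | sort
```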

But all right, I understand this is a separate topic and having ccache/redis/
reproducible build features is something we can work with and extend as opt-in
features in kdevops and/or this new workflow.

Thanks for the feedback.

> However I wonder if the time it takes to bring up the node, reload its
> storage, then after the build save the redis artifacts and destroy it
> might wipe out any benefit.

I guess the benefits will depend on each user, the kernel config, and how
the cache is exposed.

> 
> With libvirt guests we can save several minutes by doing the guest bring
> up in parallel (via Ansible) instead of serially via a shell script.
> I've got some ideas about that.
> 
> 
> > That said, all this new workflow makes sense to me. I guess instead of
> > duplicating the bootlinux role into build_linux I'd like to see a linux
> > role that handles build location (localhost/specific guest), output package
> > generation or not, deployment, etc. Does that make sense to you too?
> 
> The bootlinux role has grown a slew of special cases over the years, so
> I'm in the mood to help reorganize it a little.

That would be great.

> 
> At least split it into a role that handles the kernel build, and one
> that handles the minutiae of installing the builds on test runners.
> 
> I'm not quite following your suggestion, but I'm interested in hearing
> more detail.

I'm usually more in favor of having standalone playbooks. But I see here
the potential of merging the features rather than having them separate. My
suggestion is to refactor the bootlinux role into separate
tasks/<task name/feature>, where the features are build_linux, artifacts,
package creation, etc.

> 
> 
> -- 
> Chuck Lever


* Re: [RFC PATCH 0/3] Build once, test everywhere
  2025-04-23 17:28     ` Daniel Gomez
@ 2025-04-24 13:51       ` Chuck Lever
  0 siblings, 0 replies; 9+ messages in thread
From: Chuck Lever @ 2025-04-24 13:51 UTC (permalink / raw)
  To: Daniel Gomez; +Cc: Luis Chamberlain, Scott Mayhew, kdevops, Chuck Lever

On 4/23/25 1:28 PM, Daniel Gomez wrote:
> On Wed, Apr 23, 2025 at 09:36:37AM +0100, Chuck Lever wrote:
>> On 4/23/25 8:34 AM, Daniel Gomez wrote:
>>> On Tue, Apr 22, 2025 at 11:49:03AM +0100, cel@kernel.org wrote:
>>>> From: Chuck Lever <chuck.lever@oracle.com>
>>>>
>>>> With apologies to the Java folks.
>>>>
>>>> I've been looking for ways to make our workflow runs more efficient
>>>> because my lab resources are limited, and moving to the cloud costs
>>>> money per CPU second.
>>>>
>>>> One thing that seems like an obvious win would be to build the test
>>>> kernel one time (for example, after a merge/pull request), then
>>>> re-use that kernel binary for all the workflows we want to run on
>>>> it.
>>>
>>> What about rebuilds between iterations?
>>> I think an extra step to improve that build time could be adding support for
>>> ccache in our kernel builds. Would it make sense to extend this in the future to
>>> enable a redis node?
>>>
>>> https://github.com/ccache/ccache/wiki/Redis-storage
>>>
>>> With the workflow suggestion below, I'd then do:
>>>
>>> 1. Bring up a redis node guest
>>> 2. Bring up a kernel builder node that pulls from redis node ccache
>>> 3. Push redis artifacts to control host (for later sync up) and destroy redis guest
>>> 4. Push kernel package to control host
>>
>> So I've asked our internal testing folks about using ccache to speed up
>> kernel builds. There were various reasons not to do this, one of which
>> is we want to have a clean trustworthy and repeatable kernel build to
>> test with.
>>
>> I'm not sure I have a specific technical argument against using ccache
>> except that our builder node is ephemeral, making its ccache usable for
>> only a single build. A Redis storage node might solve that issue.
> 
> Build artifacts can also be shared between the build nodes instances as long as
> they are the same following reproducible builds [1]
> 
> [1]
> https://docs.kernel.org/kbuild/reproducible-builds.html#reproducible-builds
> 
> But all right, I understand this is a separate topic and having ccache/redis/
> reproducible build features is something we can work with and extend as opt-in
> features in kdevops and/or this new workflow.
> 
> Thanks for the feedback.
> 
>> However I wonder if the time it takes to bring up the node, reload its
>> storage, then after the build save the redis artifacts and destroy it
>> might wipe out any benefit.
> 
> I guess the benefits will depend on every user, kernel config and how the cache
> is exposed.
> 
>>
>> With libvirt guests we can save several minutes by doing the guest bring
>> up in parallel (via Ansible) instead of serially via a shell script.
>> I've got some ideas about that.
>>
>>
>>> That said, all this new workflow makes sense to me. I guess instead of
>>> duplicating the bootlinux role into build_linux I'd like to see a linux
>>> role that handles build location (localhost/specific guest), output package
>>> generation or not, deployment, etc. Does that make sense to you too?
>>
>> The bootlinux role has grown a slew of special cases over the years, so
>> I'm in the mood to help reorganize it a little.
> 
> That would be great.
> 
>>
>> At least split it into a role that handles the kernel build, and one
>> that handles the minutiae of installing the builds on test runners.
>>
>> I'm not quite following your suggestion, but I'm interested in hearing
>> more detail.
> 
> I'm usually more in favor of having standalone playbooks. But I see here
> the potential of merging the features rather than having them separate. My
> suggestion is to refactor bootlinux role into separate tasks/<task name/
> feature>. Where the features are build_linux, artifacts, package creation, etc.

I did something similar to playbooks/roles/gen_nodes. I can experiment
and post some ideas.


-- 
Chuck Lever


Thread overview: 9+ messages
2025-04-22 15:49 [RFC PATCH 0/3] Build once, test everywhere cel
2025-04-22 15:49 ` [RFC PATCH 1/3] Add a guest/instance for building the test kernel cel
2025-04-22 15:49 ` [RFC PATCH 2/3] playbooks: Add a build_linux role cel
2025-04-22 15:49 ` [RFC PATCH 3/3] Experimental: Add a separate install_linux role cel
2025-04-23  5:27 ` [RFC PATCH 0/3] Build once, test everywhere Luis Chamberlain
2025-04-23 12:34 ` Daniel Gomez
2025-04-23 13:36   ` Chuck Lever
2025-04-23 17:28     ` Daniel Gomez
2025-04-24 13:51       ` Chuck Lever
