* [RFT] bootlinux: add bisection support
@ 2025-04-24 0:09 Luis Chamberlain
2025-04-24 5:47 ` Luis Chamberlain
0 siblings, 1 reply; 14+ messages in thread
From: Luis Chamberlain @ 2025-04-24 0:09 UTC (permalink / raw)
To: Chuck Lever, Daniel Gomez, kdevops; +Cc: Luis Chamberlain
This adds automatic bisection support, first through the CLI using
simple command line options. The first goal is very simple: check
whether the kernel boots or not. There is no need for a custom workflow
for this. We just let the user set the values through the command line
interface like this:
git clone git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git \
--reference /mirror/linux-next.git linux
make defconfig-bisection \
GOOD=v6.15-rc2 \
BAD=next-20250422 \
KDEVOPS_HOSTS_PREFIX="bisect" \
LINUX_TREE="git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git" -j128
make -j128
make bringup
make linux # Boots to good at first
make linux-bisect # Runs the bisection script
For now confine this feature to BOOTLINUX_9P as it's not clear what to do
about terraform yet. We can add that support later.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
Request For Testing, as I've only gotten this to run up to the script,
and am debugging now to see if it actually does what I hope it should. The
other bug is that the first 'make bringup' boots into linux-next, not the
good tag, but I can't see why yet. And so just two things are left here:
1) figure out why make bringup fails to use the good tag
2) see if we need to redo the simple boot script
I'm hoping we can use this to bisect why linux-next tag next-20250422
was hosed on boot.
defconfigs/bisection | 9 +++
playbooks/roles/bootlinux/defaults/main.yml | 6 ++
playbooks/roles/bootlinux/tasks/main.yml | 45 +++++++++++
.../bootlinux/templates/bisect-boot.sh.j2 | 7 ++
workflows/linux/Kconfig | 80 ++++++++++++++++++-
workflows/linux/Makefile | 15 ++++
6 files changed, 161 insertions(+), 1 deletion(-)
create mode 100644 defconfigs/bisection
create mode 100755 playbooks/roles/bootlinux/templates/bisect-boot.sh.j2
diff --git a/defconfigs/bisection b/defconfigs/bisection
new file mode 100644
index 000000000000..1c3112a0670e
--- /dev/null
+++ b/defconfigs/bisection
@@ -0,0 +1,9 @@
+CONFIG_GUESTFS=y
+CONFIG_LIBVIRT=y
+
+CONFIG_WORKFLOW_LINUX_CUSTOM=y
+CONFIG_BOOTLINUX=y
+CONFIG_BOOTLINUX_9P=y
+CONFIG_BOOTLINUX_BISECT_ENABLE=y
+
+CONFIG_DEVCONFIG_ENABLE_SYSTEMD_JOURNAL_REMOTE=y
diff --git a/playbooks/roles/bootlinux/defaults/main.yml b/playbooks/roles/bootlinux/defaults/main.yml
index fd5674b5fc19..245ae43c5f0d 100644
--- a/playbooks/roles/bootlinux/defaults/main.yml
+++ b/playbooks/roles/bootlinux/defaults/main.yml
@@ -52,3 +52,9 @@ kdevops_workflow_enable_cxl: False
bootlinux_cxl_test: False
bootlinux_tree_set_by_cli: False
+bootlinux_bisect_enable: False
+bootlinux_bisect_enable_cli: False
+bootlinux_bisect_ref_good_cli: False
+bootlinux_bisect_ref_bad_cli: False
+bootlinux_bisect_script_boot: False
+bootlinux_bisect_script_custom: False
diff --git a/playbooks/roles/bootlinux/tasks/main.yml b/playbooks/roles/bootlinux/tasks/main.yml
index 9ad675b3f278..afa533c011f8 100644
--- a/playbooks/roles/bootlinux/tasks/main.yml
+++ b/playbooks/roles/bootlinux/tasks/main.yml
@@ -650,3 +650,48 @@
vars:
running_kernel: "{{ uname_cmd.stdout_lines.0 }}"
tags: [ 'linux', 'git', 'config', 'uname' ]
+
+- name: Copy git bisection script over
+ template:
+ src: "{{ bootlinux_bisect_script }}.j2"
+ dest: "{{ bootlinux_9p_host_path }}/{{ bootlinux_bisect_script }}"
+ mode: 0644
+ tags: [ 'bisect' ]
+ when:
+ - bootlinux_bisect_enable|bool
+ run_once: true
+ delegate_to: localhost
+
+- name: Set up bisection for git
+ command: "git bisect start {{ bootlinux_bisect_ref_bad }} {{ bootlinux_bisect_ref_good }}"
+ register: build
+ changed_when: "build.rc == 0"
+ args:
+ chdir: "{{ bootlinux_9p_host_path }}"
+ tags: [ 'bisect' ]
+ when:
+ - bootlinux_bisect_enable|bool
+ run_once: true
+ delegate_to: localhost
+
+- name: Ensure bisection script is executable
+ file:
+ path: "{{ bootlinux_9p_host_path }}/{{ bootlinux_bisect_script }}"
+ mode: '0755'
+ state: file
+ when: bootlinux_bisect_enable | bool
+ delegate_to: localhost
+ run_once: true
+ tags: [ 'bisect' ]
+
+- name: Run the bisection script
+ command: "git bisect run {{ bootlinux_9p_host_path }}/{{ bootlinux_bisect_script }}"
+ register: build
+ changed_when: "build.rc == 0"
+ args:
+ chdir: "{{ bootlinux_9p_host_path }}"
+ tags: [ 'bisect' ]
+ when:
+ - bootlinux_bisect_enable|bool
+ run_once: true
+ delegate_to: localhost
diff --git a/playbooks/roles/bootlinux/templates/bisect-boot.sh.j2 b/playbooks/roles/bootlinux/templates/bisect-boot.sh.j2
new file mode 100755
index 000000000000..10dbacd307da
--- /dev/null
+++ b/playbooks/roles/bootlinux/templates/bisect-boot.sh.j2
@@ -0,0 +1,7 @@
+#!/bin/bash
+# SPDX-License-Identifier: copyleft-next-0.3.1
+#
+# If make linux-deploy fails for reasons unrelated to the bug we're tracking
+# (e.g., a build error) we bail with exit code 125 to allow git to skip that
+# commit.
+make linux-deploy || exit 125
diff --git a/workflows/linux/Kconfig b/workflows/linux/Kconfig
index 797469e60d20..b30187c80e6e 100644
--- a/workflows/linux/Kconfig
+++ b/workflows/linux/Kconfig
@@ -10,6 +10,21 @@ config BOOTLINUX_TREE_REF_SET_BY_CLI
output yaml
default $(shell, scripts/check-cli-set-var.sh LINUX_TREE_REF)
+config BOOTLINUX_BISECT_ENABLE_CLI
+ bool
+ output yaml
+ default $(shell, scripts/check-cli-set-var.sh LINUX_BISECT)
+
+config BOOTLINUX_BISECT_REF_GOOD_CLI
+ bool
+ output yaml
+ default $(shell, scripts/check-cli-set-var.sh GOOD)
+
+config BOOTLINUX_BISECT_REF_BAD_CLI
+ bool
+ output yaml
+ default $(shell, scripts/check-cli-set-var.sh BAD)
+
config BOOTLINUX_HAS_PURE_IOMAP_CONFIG
bool
@@ -173,7 +188,8 @@ config BOOTLINUX_TREE_CUSTOM_URL
config BOOTLINUX_TREE_CUSTOM_REF
string "Custom Linux kernel tag or branch to use"
- default $(shell, ./scripts/append-makefile-vars.sh $(LINUX_TREE_REF)) if BOOTLINUX_TREE_REF_SET_BY_CLI
+ default $(shell, ./scripts/append-makefile-vars.sh $(GOOD)) if BOOTLINUX_BISECT_REF_GOOD_CLI && !BOOTLINUX_TREE_REF_SET_BY_CLI
+ default $(shell, ./scripts/append-makefile-vars.sh $(LINUX_TREE_REF)) if BOOTLINUX_TREE_REF_SET_BY_CLI && !BOOTLINUX_BISECT_REF_GOOD_CLI
default "master" if !BOOTLINUX_TREE_REF_SET_BY_CLI
help
The git ID or branch name to check out to compile linux.
@@ -278,9 +294,71 @@ config BOOTLINUX_TREE_LOCALVERSION
help
The Linux local version to use (for uname).
+config BOOTLINUX_BISECT_ENABLE
+ bool "Do you want to bisect a broken kernel?"
+ default y
+ depends on BOOTLINUX_9P
+ output yaml
+ help
+ Do you need to automate bisecting some broken kernel?
+
+if BOOTLINUX_BISECT_ENABLE
+
+choice
+ prompt "Bisection script to use"
+ default BOOTLINUX_BISECT_SCRIPT_BOOT
+
+config BOOTLINUX_BISECT_SCRIPT_BOOT
+ bool "Ensures we can at least boot"
+ help
+ This helps ensure we can at least boot into the host. That's it.
+
+config BOOTLINUX_BISECT_SCRIPT_CUSTOM
+ bool "You will provide your own bisection script"
+ help
+ If you want to test a new bisection script you can use this.
+
+endchoice
+
+config BOOTLINUX_BISECT_SCRIPT_CUSTOM_PATH
+ string "Custom path to git bisection script to use"
+ depends on BOOTLINUX_BISECT_SCRIPT_CUSTOM
+ default ""
+ output yaml
+ help
+ The custom path to the bisect script we will use. Instead of building
+ the kernel and booting it, 'make linux' will do the bisection
+ automatically for you based on the script.
+
+config BOOTLINUX_BISECT_SCRIPT
+ string
+ output yaml
+ default "bisect-boot.sh" if BOOTLINUX_BISECT_SCRIPT_BOOT
+ default BOOTLINUX_BISECT_SCRIPT_CUSTOM_PATH if BOOTLINUX_BISECT_SCRIPT_CUSTOM
+
+
+config BOOTLINUX_BISECT_REF_GOOD
+ string "The last known good commit"
+ default BOOTLINUX_TREE_REF if !BOOTLINUX_BISECT_REF_GOOD_CLI
+ default $(shell, ./scripts/append-makefile-vars.sh $(GOOD)) if BOOTLINUX_BISECT_REF_GOOD_CLI
+ output yaml
+ help
+ The kernel commit known to be good.
+
+config BOOTLINUX_BISECT_REF_BAD
+ string "The known broken commit"
+ default BOOTLINUX_TREE_STABLE_REF if !BOOTLINUX_BISECT_REF_BAD_CLI
+ default $(shell, ./scripts/append-makefile-vars.sh $(BAD)) if BOOTLINUX_BISECT_REF_BAD_CLI
+ output yaml
+ help
+ The first broken tag.
+
+endif # BOOTLINUX_BISECT_ENABLE
+
config BOOTLINUX_SHALLOW_CLONE
bool "Shallow git clone"
default y
+ depends on !BOOTLINUX_BISECT_ENABLE
help
If enabled the git tree cloned with be cloned using a shallow tree
with history truncated. You want to enable this if you really don't
diff --git a/workflows/linux/Makefile b/workflows/linux/Makefile
index ecce273a4f67..00ec13db74ca 100644
--- a/workflows/linux/Makefile
+++ b/workflows/linux/Makefile
@@ -104,6 +104,13 @@ linux-mount:
--tags vars,9p_mount \
--extra-vars="$(BOOTLINUX_ARGS)" $(LIMIT_HOSTS)
+PHONY += linux-bisect
+linux-bisect:
+ $(Q)ansible-playbook $(ANSIBLE_VERBOSE) -i \
+ $(KDEVOPS_HOSTFILE) $(KDEVOPS_PLAYBOOKS_DIR)/bootlinux.yml \
+ --tags vars,bisect \
+ --extra-vars="$(BOOTLINUX_ARGS)" $(LIMIT_HOSTS)
+
PHONY += linux-deploy
linux-deploy:
$(Q)ansible-playbook $(ANSIBLE_VERBOSE) -i \
@@ -176,6 +183,14 @@ linux-help-cxl:
LINUX_HELP_EXTRA += linux-help-cxl
endif
+ifeq (y,$(CONFIG_BOOTLINUX_BISECT_ENABLE))
+PHONY += linux-help-bisection
+linux-help-bisection:
+ @echo "linux-bisect - Bisects the kernel automatically for you"
+
+LINUX_HELP_EXTRA += linux-help-bisection
+endif
+
HELP_TARGETS+=linux-help-menu
HELP_TARGETS+=$(LINUX_HELP_EXTRA)
HELP_TARGETS+=linux-help-end
--
2.45.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 0:09 [RFT] bootlinux: add bisection support Luis Chamberlain
@ 2025-04-24 5:47 ` Luis Chamberlain
2025-04-24 6:40 ` Daniel Gomez
0 siblings, 1 reply; 14+ messages in thread
From: Luis Chamberlain @ 2025-04-24 5:47 UTC (permalink / raw)
To: Chuck Lever, Daniel Gomez, kdevops
On Wed, Apr 23, 2025 at 05:09:41PM -0700, Luis Chamberlain wrote:
> 2) see if we need to redo the simple boot script
This needs just one adjustment: the top of the script needs to
change directory to the kdevops dir, so the script needs to be:
#!/bin/bash
# SPDX-License-Identifier: copyleft-next-0.3.1
#
# If make linux-deploy fails for reasons unrelated to the bug we're
# tracking (e.g., a build error) we bail with exit code 125 to allow git
# to skip that commit.
cd {{ topdir_path }}
make linux-deploy || exit 125
That seems to be doing what I expected it would do! So just one puzzle
is left: why the GOOD ref was not used at 'make linux'.
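
For context, git bisect run decides what to do with each commit purely
from the script's exit status: 0 means good, 125 means skip, any other
value from 1 to 127 means bad, and anything else aborts the bisection.
The helper below is only an illustrative sketch of that mapping;
bisect_verdict is a made-up name and is not part of kdevops or git.

```shell
#!/bin/bash
# Hypothetical helper: report how `git bisect run` would interpret
# a given exit status from the bisection script.
bisect_verdict() {
    local rc=$1
    if [ "$rc" -eq 0 ]; then
        echo good      # commit is marked good
    elif [ "$rc" -eq 125 ]; then
        echo skip      # commit cannot be tested, git skips it
    elif [ "$rc" -ge 1 ] && [ "$rc" -le 127 ]; then
        echo bad       # commit is marked bad
    else
        echo abort     # bisection stops
    fi
}
```

So the value after `exit` in the script matters: only 125 asks git to
skip the commit, every other non-zero status up to 127 marks it bad.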
Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 5:47 ` Luis Chamberlain
@ 2025-04-24 6:40 ` Daniel Gomez
2025-04-24 7:25 ` Daniel Gomez
2025-04-24 17:08 ` Luis Chamberlain
0 siblings, 2 replies; 14+ messages in thread
From: Daniel Gomez @ 2025-04-24 6:40 UTC (permalink / raw)
To: Luis Chamberlain; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Wed, Apr 23, 2025 at 10:47:16PM +0100, Luis Chamberlain wrote:
> On Wed, Apr 23, 2025 at 05:09:41PM -0700, Luis Chamberlain wrote:
> > 2) see if we need to redo the simple boot script
>
> this needs just one adjustment, the top of the script needs to
> change directory to the kdevops dir so the script needs to be:
>
> #!/bin/bash
> # SPDX-License-Identifier: copyleft-next-0.3.1
> #
> # If make linux-deploy fails for reasons unrelated to the bug we're
> # tracking (e.g., a build error) we bail with exit code 125 to allow git
> # to skip that commit.
> cd {{ topdir_path }}
> make linux-deploy || exit 125
>
> That seems to be doing what I expected it would do! So just puzzle left,
> why the GOOD ref was not used at 'make linux'.
If make linux-deploy installs a "bad" ref kernel with a boot problem, the VM
will always boot into the "bad" kernel. How is the script going to boot the new
kernel for testing if the new ref is "bad"/"good"?
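
One way to frame the concern is that a bisect step has three distinct
outcomes: the commit could not be built or installed (skip), it was
installed but the guest did not come up (bad), or it booted (good). The
sketch below only illustrates that separation. bisect_step and the two
commands passed to it are hypothetical names, and it does not by itself
solve how to recover a guest that is stuck on an unbootable kernel.

```shell
#!/bin/bash
# Hypothetical sketch: bisect_step BUILD_CMD CHECK_CMD
# BUILD_CMD and CHECK_CMD are names of commands or shell functions
# that take no arguments and return 0 on success.
bisect_step() {
    local build_cmd=$1 check_cmd=$2

    # Build/install failed for reasons unrelated to the bug:
    # ask git bisect to skip this commit.
    "$build_cmd" || return 125

    # The kernel was installed but the boot check failed:
    # report the commit as bad.
    "$check_cmd" || return 1

    # Built, installed and booted: the commit is good.
    return 0
}

# Example wiring (assumptions, not run here):
#   build_kernel() { make linux-deploy; }
#   guest_is_up()  { some check that the guest answers, with a timeout; }
#   bisect_step build_kernel guest_is_up; exit $?
```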
>
> Luis
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 6:40 ` Daniel Gomez
@ 2025-04-24 7:25 ` Daniel Gomez
2025-04-24 13:36 ` Daniel Gomez
2025-04-24 17:08 ` Luis Chamberlain
1 sibling, 1 reply; 14+ messages in thread
From: Daniel Gomez @ 2025-04-24 7:25 UTC (permalink / raw)
To: Luis Chamberlain; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Thu, Apr 24, 2025 at 08:40:54AM +0100, Daniel Gomez wrote:
> On Wed, Apr 23, 2025 at 10:47:16PM +0100, Luis Chamberlain wrote:
> > On Wed, Apr 23, 2025 at 05:09:41PM -0700, Luis Chamberlain wrote:
> > > 2) see if we need to redo the simple boot script
> >
> > this needs just one adjustment, the top of the script needs to
> > change directory to the kdevops dir so the script needs to be:
> >
> > #!/bin/bash
> > # SPDX-License-Identifier: copyleft-next-0.3.1
> > #
> > # If make linux-deploy fails for reasons unrelated to the bug we're
> > # tracking (e.g., a build error) we bail with exit code 125 to allow git
> > # to skip that commit.
> > cd {{ topdir_path }}
> > make linux-deploy || exit 125
> >
> > That seems to be doing what I expected it would do! So just puzzle left,
> > why the GOOD ref was not used at 'make linux'.
>
> If make linux-deploy installs a "bad" ref kernel with a boot problem, the VM
> will always boot into the "bad" kernel. How is the script going to boot the new
> kernel for testing if the new ref is "bad"/"good"?
I was testing this and ran into the linux-next upstream issue:
[ 0.222812] ------------[ cut here ]------------
[ 0.222812] WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:918 switch_mm_irqs_off+0x114/0x490
[ 0.222812] Modules linked in:
[ 0.222812] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.15.0-rc3-next-20250422 #1 PREEMPT(full)
[ 0.222812] Tainted: [W]=WARN
[ 0.222812] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2025.02-6 04/08/2025
[ 0.222812] RIP: 0010:switch_mm_irqs_off+0x114/0x490
[ 0.222812] Code: 65 4c 89 25 86 26 09 02 65 48 c7 05 72 26 09 02 01 00 00 00 48 81 fd 40 dc f0 91 74 0f 44 89 f8 48 0f a3 85 80 05 00 00 72 02 <0f> 0b 48 81 fb 40 dc f0 91 74 16 48 8d 83 80 05 00 00 4c 0f a3 bb
[ 0.222812] RSP: 0000:ffffffff91e03de8 EFLAGS: 00010202
[ 0.222812] RAX: 0000000000000000 RBX: ffffffff91f0dc40 RCX: 0000000000000002
[ 0.222812] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff916f810d
[ 0.222812] RBP: ffffffff91f7e980 R08: ffffffff91e03da4 R09: 000000007f7ec018
[ 0.222812] R10: 000000007f8e1860 R11: 000000006c09c2a4 R12: 0000000000000001
[ 0.222812] R13: ffffffff91e0c580 R14: 0000000000000000 R15: 0000000000000000
[ 0.222812] FS: 0000000000000000(0000) GS:ffff9a2aa94c6000(0000) knlGS:0000000000000000
[ 0.222812] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.222812] CR2: ffff9a2a21e01000 CR3: 0000000100352002 CR4: 0000000000770ef0
[ 0.222812] PKRU: 55555554
[ 0.222812] Call Trace:
[ 0.222812] <TASK>
[ 0.222812] unuse_temporary_mm+0x34/0x60
[ 0.222812] efi_set_virtual_address_map+0x15d/0x190
[ 0.222812] efi_enter_virtual_mode+0x3df/0x440
[ 0.222812] start_kernel+0x907/0x980
[ 0.222812] x86_64_start_reservations+0x20/0x20
[ 0.222812] x86_64_start_kernel+0x71/0x80
[ 0.222812] common_startup_64+0x13e/0x148
[ 0.222812] </TASK>
[ 0.222812] ---[ end trace 0000000000000000 ]---
I see this reported here:
https://lore.kernel.org/all/6807a0a8.050a0220.380c13.014e.GAE@google.com/
Regarding my question above, I think we first need to add support for
direct kernel boot (-kernel) in QEMU/libvirt. It's something I wanted to
explore some time ago, so I'll give it a try now for this use case.
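
As a rough illustration of what direct kernel boot looks like at the
QEMU level, the helper below only prints a command line, it does not
start a guest. direct_boot_cmd is a hypothetical name, and the memory
size, disk format and example arguments are assumptions; only the QEMU
options themselves (-machine, -m, -nographic, -kernel, -append, -drive)
are real.

```shell
#!/bin/bash
# Hypothetical helper: print a QEMU invocation that boots KERNEL
# directly, bypassing the bootloader installed in the guest image.
#   $1: path to the kernel image (e.g. a bzImage built on the host)
#   $2: path to the guest disk image
#   $3: kernel command line
direct_boot_cmd() {
    local kernel=$1 disk=$2 cmdline=$3
    printf 'qemu-system-x86_64 -machine q35 -m 4096 -nographic'
    printf ' -kernel %s' "$kernel"
    printf ' -append "%s"' "$cmdline"
    printf ' -drive file=%s,if=virtio,format=qcow2\n' "$disk"
}

# Example (paths and root device are assumptions):
#   direct_boot_cmd linux/arch/x86/boot/bzImage guest.qcow2 \
#       'root=/dev/vda1 console=ttyS0'
```

Since the kernel comes from the host on every start, a commit that
fails to boot does not leave the guest stuck on it.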
>
> >
> > Luis
> >
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 7:25 ` Daniel Gomez
@ 2025-04-24 13:36 ` Daniel Gomez
2025-04-24 17:10 ` Luis Chamberlain
0 siblings, 1 reply; 14+ messages in thread
From: Daniel Gomez @ 2025-04-24 13:36 UTC (permalink / raw)
To: Luis Chamberlain; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Thu, Apr 24, 2025 at 09:25:54AM +0100, Daniel Gomez wrote:
> On Thu, Apr 24, 2025 at 08:40:54AM +0100, Daniel Gomez wrote:
> > On Wed, Apr 23, 2025 at 10:47:16PM +0100, Luis Chamberlain wrote:
> > > On Wed, Apr 23, 2025 at 05:09:41PM -0700, Luis Chamberlain wrote:
> > > > 2) see if we need to redo the simple boot script
> > >
> > > this needs just one adjustment, the top of the script needs to
> > > change directory to the kdevops dir so the script needs to be:
> > >
> > > #!/bin/bash
> > > # SPDX-License-Identifier: copyleft-next-0.3.1
> > > #
> > > # If make linux-deploy fails for reasons unrelated to the bug we're
> > > # tracking (e.g., a build error) we bail with exit code 125 to allow git
> > > # to skip that commit.
> > > cd {{ topdir_path }}
> > > make linux-deploy || exit 125
> > >
> > > That seems to be doing what I expected it would do! So just puzzle left,
> > > why the GOOD ref was not used at 'make linux'.
> >
> > If make linux-deploy installs a "bad" ref kernel with a boot problem, the VM
> > will always boot into the "bad" kernel. How is the script going to boot the new
> > kernel for testing if the new ref is "bad"/"good"?
>
> I was testing this and run into the linux-next upstream issue:
>
> [ 0.222812] ------------[ cut here ]------------
> [ 0.222812] WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:918 switch_mm_irqs_off+0x114/0x490
> [ 0.222812] Modules linked in:
> [ 0.222812] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.15.0-rc3-next-20250422 #1 PREEMPT(full)
> [ 0.222812] Tainted: [W]=WARN
> [ 0.222812] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2025.02-6 04/08/2025
> [ 0.222812] RIP: 0010:switch_mm_irqs_off+0x114/0x490
> [ 0.222812] Code: 65 4c 89 25 86 26 09 02 65 48 c7 05 72 26 09 02 01 00 00 00 48 81 fd 40 dc f0 91 74 0f 44 89 f8 48 0f a3 85 80 05 00 00 72 02 <0f> 0b 48 81 fb 40 dc f0 91 74 16 48 8d 83 80 05 00 00 4c 0f a3 bb
> [ 0.222812] RSP: 0000:ffffffff91e03de8 EFLAGS: 00010202
> [ 0.222812] RAX: 0000000000000000 RBX: ffffffff91f0dc40 RCX: 0000000000000002
> [ 0.222812] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff916f810d
> [ 0.222812] RBP: ffffffff91f7e980 R08: ffffffff91e03da4 R09: 000000007f7ec018
> [ 0.222812] R10: 000000007f8e1860 R11: 000000006c09c2a4 R12: 0000000000000001
> [ 0.222812] R13: ffffffff91e0c580 R14: 0000000000000000 R15: 0000000000000000
> [ 0.222812] FS: 0000000000000000(0000) GS:ffff9a2aa94c6000(0000) knlGS:0000000000000000
> [ 0.222812] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.222812] CR2: ffff9a2a21e01000 CR3: 0000000100352002 CR4: 0000000000770ef0
> [ 0.222812] PKRU: 55555554
> [ 0.222812] Call Trace:
> [ 0.222812] <TASK>
> [ 0.222812] unuse_temporary_mm+0x34/0x60
> [ 0.222812] efi_set_virtual_address_map+0x15d/0x190
> [ 0.222812] efi_enter_virtual_mode+0x3df/0x440
> [ 0.222812] start_kernel+0x907/0x980
> [ 0.222812] x86_64_start_reservations+0x20/0x20
> [ 0.222812] x86_64_start_kernel+0x71/0x80
> [ 0.222812] common_startup_64+0x13e/0x148
> [ 0.222812] </TASK>
> [ 0.222812] ---[ end trace 0000000000000000 ]--
>
>
> I see this reported here:
> https://lore.kernel.org/all/6807a0a8.050a0220.380c13.014e.GAE@google.com/
>
> Regarding my question above, I think we need to add support for -kernel first in
> QEMU/libvirt. Something I wanted to explore some time ago, so I'll give it a try
> now for this use case.
This seems to boot fine if I don't use the kdevops kernel build process.
For that, we need to work on the cleanup/refactor we talked about in the
other thread with Chuck. But let me know what you think.
diff --git a/playbooks/roles/gen_nodes/templates/guestfs_q35.j2.xml b/playbooks/roles/gen_nodes/templates/guestfs_q35.j2.xml
index adaba91..9bf32ff 100644
--- a/playbooks/roles/gen_nodes/templates/guestfs_q35.j2.xml
+++ b/playbooks/roles/gen_nodes/templates/guestfs_q35.j2.xml
@@ -3,7 +3,13 @@
<memory unit='MiB'>{{ libvirt_mem_mb }}</memory>
<currentMemory unit='MiB'>{{ libvirt_mem_mb }}</currentMemory>
<vcpu placement='static'>{{ libvirt_vcpus_count }}</vcpu>
-{% if guestfs_requires_uefi %}
+{% if bootlinux_direct %}
+ <os>
+ <type arch='x86_64' machine='q35'>hvm</type>
+ <kernel>{{ target_linux_dir_path }}/arch/x86_64/boot/bzImage</kernel>
+ <cmdline>{{ bootlinux_direct_cmdline }}</cmdline>
+ </os>
+{% elif guestfs_requires_uefi %}
<os firmware='efi'>
<type arch='x86_64' machine='q35'>hvm</type>
<loader readonly='yes' secure='no'/>
diff --git a/workflows/linux/Kconfig b/workflows/linux/Kconfig
index 797469e..788e571 100644
--- a/workflows/linux/Kconfig
+++ b/workflows/linux/Kconfig
@@ -36,10 +36,13 @@ config BOOTLINUX_PURE_IOMAP
endif # HAVE_SUPPORTS_PURE_IOMAP
+choice
+ prompt "Type of kernel boot"
+ default BOOTLINUX_9P
+
config BOOTLINUX_9P
bool "Use 9p to build Linux"
depends on LIBVIRT && !GUESTFS_LACKS_9P
- default LIBVIRT
help
This will let you choose use 9p to build Linux. What this does is
use your localhost to git clone Linux under the assumption your
@@ -53,6 +56,16 @@ config BOOTLINUX_9P
where your localhost path for your git tree is. You should keep
the other settings as-is unless you know what you are doing.
+config BOOTLINUX_DIRECT
+ bool "Use direct kernel boot"
+ output yaml
+ help
+ This will let you choose to use direct kernel boot.
+
+ https://libvirt.org/formatdomain.html#direct-kernel-boot
+
+endchoice
+
if BOOTLINUX_9P
menu "Modify default 9p configuration"
@@ -111,6 +124,23 @@ endmenu
endif # BOOTLINUX_9P
+if BOOTLINUX_DIRECT
+
+menu "Modify default direct kernel boot configuration"
+
+config BOOTLINUX_DIRECT_CMDLINE
+ string "cmdline"
+ output yaml
+ default "root=/dev/vda3 console=ttyS0,115200 audit=0"
+ help
+ This sets the kernel boot command line.
+
+ https://libvirt.org/formatdomain.html#direct-kernel-boot
+
+endmenu
+
+endif # BOOTLINUX_DIRECT
+
choice
prompt "Type of development version of Linux to use"
default BOOTLINUX_LINUS if !BOOTLINUX_TREE_SET_BY_CLI
diff --git a/workflows/linux/Makefile b/workflows/linux/Makefile
index ecce273..bc1ac41 100644
--- a/workflows/linux/Makefile
+++ b/workflows/linux/Makefile
@@ -75,6 +75,7 @@ linux-help-menu:
echo "linux-mount - Mounts 9p path on targets" ;\
fi
@echo "linux-deploy - Builds, installs, updates GRUB and reboots - useful for rapid development"
+ @echo "linux-build - Builds kernel"
@echo "linux-install - Only builds and installs Linux"
@echo "linux-uninstall - Remove a kernel you can pass arguments for the version such as KVER=6.5.0-rc7-next-20230825"
@echo "linux-clone - Only clones Linux"
@@ -111,6 +112,14 @@ linux-deploy:
--tags vars,build-linux,install-linux,manual-update-grub,saved,vars,reboot \
--extra-vars="$(BOOTLINUX_ARGS)" $(LIMIT_HOSTS)
+PHONY += linux-build
+linux-build:
+ $(Q)ansible-playbook $(ANSIBLE_VERBOSE) --connection=local \
+ --inventory localhost, \
+ $(KDEVOPS_PLAYBOOKS_DIR)/bootlinux.yml \
+ --tags vars,build-linux,saved,vars \
+ --extra-vars="$(BOOTLINUX_ARGS)" $(LIMIT_HOSTS)
+
PHONY += linux-install
linux-install:
$(Q)ansible-playbook $(ANSIBLE_VERBOSE) -i \
>
> >
> > >
> > > Luis
> > >
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 13:36 ` Daniel Gomez
@ 2025-04-24 17:10 ` Luis Chamberlain
2025-04-24 18:37 ` Daniel Gomez
0 siblings, 1 reply; 14+ messages in thread
From: Luis Chamberlain @ 2025-04-24 17:10 UTC (permalink / raw)
To: Daniel Gomez; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Thu, Apr 24, 2025 at 03:36:47PM +0200, Daniel Gomez wrote:
> On Thu, Apr 24, 2025 at 09:25:54AM +0100, Daniel Gomez wrote:
> > On Thu, Apr 24, 2025 at 08:40:54AM +0100, Daniel Gomez wrote:
> > > On Wed, Apr 23, 2025 at 10:47:16PM +0100, Luis Chamberlain wrote:
> > > > On Wed, Apr 23, 2025 at 05:09:41PM -0700, Luis Chamberlain wrote:
> > > > > 2) see if we need to redo the simple boot script
> > > >
> > > > this needs just one adjustment, the top of the script needs to
> > > > change directory to the kdevops dir so the script needs to be:
> > > >
> > > > #!/bin/bash
> > > > # SPDX-License-Identifier: copyleft-next-0.3.1
> > > > #
> > > > # If make linux-deploy fails for reasons unrelated to the bug we're
> > > > # tracking (e.g., a build error) we bail with exit code 125 to allow git
> > > > # to skip that commit.
> > > > cd {{ topdir_path }}
> > > > make linux-deploy || exit 125
> > > >
> > > > That seems to be doing what I expected it would do! So just puzzle left,
> > > > why the GOOD ref was not used at 'make linux'.
> > >
> > > If make linux-deploy installs a "bad" ref kernel with a boot problem, the VM
> > > will always boot into the "bad" kernel. How is the script going to boot the new
> > > kernel for testing if the new ref is "bad"/"good"?
> >
> > I was testing this and run into the linux-next upstream issue:
> >
> > [ 0.222812] ------------[ cut here ]------------
> > [ 0.222812] WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:918 switch_mm_irqs_off+0x114/0x490
> > [ 0.222812] Modules linked in:
> > [ 0.222812] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.15.0-rc3-next-20250422 #1 PREEMPT(full)
> > [ 0.222812] Tainted: [W]=WARN
> > [ 0.222812] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2025.02-6 04/08/2025
> > [ 0.222812] RIP: 0010:switch_mm_irqs_off+0x114/0x490
> > [ 0.222812] Code: 65 4c 89 25 86 26 09 02 65 48 c7 05 72 26 09 02 01 00 00 00 48 81 fd 40 dc f0 91 74 0f 44 89 f8 48 0f a3 85 80 05 00 00 72 02 <0f> 0b 48 81 fb 40 dc f0 91 74 16 48 8d 83 80 05 00 00 4c 0f a3 bb
> > [ 0.222812] RSP: 0000:ffffffff91e03de8 EFLAGS: 00010202
> > [ 0.222812] RAX: 0000000000000000 RBX: ffffffff91f0dc40 RCX: 0000000000000002
> > [ 0.222812] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff916f810d
> > [ 0.222812] RBP: ffffffff91f7e980 R08: ffffffff91e03da4 R09: 000000007f7ec018
> > [ 0.222812] R10: 000000007f8e1860 R11: 000000006c09c2a4 R12: 0000000000000001
> > [ 0.222812] R13: ffffffff91e0c580 R14: 0000000000000000 R15: 0000000000000000
> > [ 0.222812] FS: 0000000000000000(0000) GS:ffff9a2aa94c6000(0000) knlGS:0000000000000000
> > [ 0.222812] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 0.222812] CR2: ffff9a2a21e01000 CR3: 0000000100352002 CR4: 0000000000770ef0
> > [ 0.222812] PKRU: 55555554
> > [ 0.222812] Call Trace:
> > [ 0.222812] <TASK>
> > [ 0.222812] unuse_temporary_mm+0x34/0x60
> > [ 0.222812] efi_set_virtual_address_map+0x15d/0x190
> > [ 0.222812] efi_enter_virtual_mode+0x3df/0x440
> > [ 0.222812] start_kernel+0x907/0x980
> > [ 0.222812] x86_64_start_reservations+0x20/0x20
> > [ 0.222812] x86_64_start_kernel+0x71/0x80
> > [ 0.222812] common_startup_64+0x13e/0x148
> > [ 0.222812] </TASK>
> > [ 0.222812] ---[ end trace 0000000000000000 ]--
> >
> >
> > I see this reported here:
> > https://lore.kernel.org/all/6807a0a8.050a0220.380c13.014e.GAE@google.com/
> >
> > Regarding my question above, I think we need to add support for -kernel first in
> > QEMU/libvirt. Something I wanted to explore some time ago, so I'll give it a try
> > now for this use case.
>
> This seems to boot fine if I don't use kdevops kernel build process. For that,
> we need to work on the cleanup/refactor we talked in the other thread with
> Chuck. But let me know what you think.
Direct mode is certainly a nice feature! But it lacks module support,
and so the tests are limited to the non-modular world. Fortunately we
have an outlet, as I just replied to your previous email, so now we just
gotta codify it. I think supporting both makes sense. Let folks pick.
Clearly if we want to help debug a regression with modules we need them.
Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 17:10 ` Luis Chamberlain
@ 2025-04-24 18:37 ` Daniel Gomez
0 siblings, 0 replies; 14+ messages in thread
From: Daniel Gomez @ 2025-04-24 18:37 UTC (permalink / raw)
To: Luis Chamberlain; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Thu, Apr 24, 2025 at 10:10:24AM +0100, Luis Chamberlain wrote:
> On Thu, Apr 24, 2025 at 03:36:47PM +0200, Daniel Gomez wrote:
> > On Thu, Apr 24, 2025 at 09:25:54AM +0100, Daniel Gomez wrote:
> > > On Thu, Apr 24, 2025 at 08:40:54AM +0100, Daniel Gomez wrote:
> > > > On Wed, Apr 23, 2025 at 10:47:16PM +0100, Luis Chamberlain wrote:
> > > > > On Wed, Apr 23, 2025 at 05:09:41PM -0700, Luis Chamberlain wrote:
> > > > > > 2) see if we need to redo the simple boot script
> > > > >
> > > > > this needs just one adjustment, the top of the script needs to
> > > > > change directory to the kdevops dir so the script needs to be:
> > > > >
> > > > > #!/bin/bash
> > > > > # SPDX-License-Identifier: copyleft-next-0.3.1
> > > > > #
> > > > > # If make linux-deploy fails for reasons unrelated to the bug we're
> > > > > # tracking (e.g., a build error) we bail with exit code 125 to allow git
> > > > > # to skip that commit.
> > > > > cd {{ topdir_path }}
> > > > > make linux-deploy || exit 125
> > > > >
> > > > > That seems to be doing what I expected it would do! So just puzzle left,
> > > > > why the GOOD ref was not used at 'make linux'.
> > > >
> > > > If make linux-deploy installs a "bad" ref kernel with a boot problem, the VM
> > > > will always boot into the "bad" kernel. How is the script going to boot the new
> > > > kernel for testing if the new ref is "bad"/"good"?
> > >
> > > I was testing this and run into the linux-next upstream issue:
> > >
> > > [ 0.222812] ------------[ cut here ]------------
> > > [ 0.222812] WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:918 switch_mm_irqs_off+0x114/0x490
> > > [ 0.222812] Modules linked in:
> > > [ 0.222812] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.15.0-rc3-next-20250422 #1 PREEMPT(full)
> > > [ 0.222812] Tainted: [W]=WARN
> > > [ 0.222812] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2025.02-6 04/08/2025
> > > [ 0.222812] RIP: 0010:switch_mm_irqs_off+0x114/0x490
> > > [ 0.222812] Code: 65 4c 89 25 86 26 09 02 65 48 c7 05 72 26 09 02 01 00 00 00 48 81 fd 40 dc f0 91 74 0f 44 89 f8 48 0f a3 85 80 05 00 00 72 02 <0f> 0b 48 81 fb 40 dc f0 91 74 16 48 8d 83 80 05 00 00 4c 0f a3 bb
> > > [ 0.222812] RSP: 0000:ffffffff91e03de8 EFLAGS: 00010202
> > > [ 0.222812] RAX: 0000000000000000 RBX: ffffffff91f0dc40 RCX: 0000000000000002
> > > [ 0.222812] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff916f810d
> > > [ 0.222812] RBP: ffffffff91f7e980 R08: ffffffff91e03da4 R09: 000000007f7ec018
> > > [ 0.222812] R10: 000000007f8e1860 R11: 000000006c09c2a4 R12: 0000000000000001
> > > [ 0.222812] R13: ffffffff91e0c580 R14: 0000000000000000 R15: 0000000000000000
> > > [ 0.222812] FS: 0000000000000000(0000) GS:ffff9a2aa94c6000(0000) knlGS:0000000000000000
> > > [ 0.222812] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 0.222812] CR2: ffff9a2a21e01000 CR3: 0000000100352002 CR4: 0000000000770ef0
> > > [ 0.222812] PKRU: 55555554
> > > [ 0.222812] Call Trace:
> > > [ 0.222812] <TASK>
> > > [ 0.222812] unuse_temporary_mm+0x34/0x60
> > > [ 0.222812] efi_set_virtual_address_map+0x15d/0x190
> > > [ 0.222812] efi_enter_virtual_mode+0x3df/0x440
> > > [ 0.222812] start_kernel+0x907/0x980
> > > [ 0.222812] x86_64_start_reservations+0x20/0x20
> > > [ 0.222812] x86_64_start_kernel+0x71/0x80
> > > [ 0.222812] common_startup_64+0x13e/0x148
> > > [ 0.222812] </TASK>
> > > [ 0.222812] ---[ end trace 0000000000000000 ]--
> > >
> > >
> > > I see this reported here:
> > > https://lore.kernel.org/all/6807a0a8.050a0220.380c13.014e.GAE@google.com/
> > >
> > > Regarding my question above, I think we need to add support for -kernel first in
> > > QEMU/libvirt. Something I wanted to explore some time ago, so I'll give it a try
> > > now for this use case.
> >
> > This seems to boot fine if I don't use kdevops kernel build process. For that,
> > we need to work on the cleanup/refactor we talked in the other thread with
> > Chuck. But let me know what you think.
>
> Direct mode is certainly a nice feature! But it lacks modules support,
> and so the tests are limited to non-modular world. Fortunately we have
You are right. If module support is needed and network boot is not part of the
bisect issue, we can mount a shared directory with 9p or virtiofs after the
kernel boots and the network becomes available (as long as 9p/virtiofs is
built in, of course). I use vmctl [1], which integrates Omar's script [2] [3]
to handle this.
Are you aware of any other methods/alternatives for doing this?
[1] https://github.com/SamsungDS/vmctl
[2]
https://github.com/SamsungDS/vmctl/blob/master/contrib/generate-cloud-config-seed.sh#L41
[3]
https://github.com/osandov/osandov-linux/blob/main/scripts/vm-modules-mounter.service
> an outlet as I just replied to you previous email -- so now we just
> gotta codify it. I think supporting both makes sense. Let folks pick.
I'll take a look.
> Clearly if we want to help debug a regression with modules we need them.
I agree. My suggestion with direct mode has quite a few conditions to work and
may not be suitable for everyone.
>
> Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 6:40 ` Daniel Gomez
2025-04-24 7:25 ` Daniel Gomez
@ 2025-04-24 17:08 ` Luis Chamberlain
2025-04-24 17:12 ` Luis Chamberlain
1 sibling, 1 reply; 14+ messages in thread
From: Luis Chamberlain @ 2025-04-24 17:08 UTC (permalink / raw)
To: Daniel Gomez; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Thu, Apr 24, 2025 at 08:40:52AM +0200, Daniel Gomez wrote:
> On Wed, Apr 23, 2025 at 10:47:16PM +0100, Luis Chamberlain wrote:
> > On Wed, Apr 23, 2025 at 05:09:41PM -0700, Luis Chamberlain wrote:
> > > 2) see if we need to redo the simple boot script
> >
> > this needs just one adjustment, the top of the script needs to
> > change directory to the kdevops dir so the script needs to be:
> >
> > #!/bin/bash
> > # SPDX-License-Identifier: copyleft-next-0.3.1
> > #
> > # If make linux-deploy fails for reasons unrelated to the bug we're
> > # tracking (e.g., a build error) we bail with exit code 125 to allow git
> > # to skip that commit.
> > cd {{ topdir_path }}
> > make linux-deploy || exit 125
> >
> > That seems to be doing what I expected it would do! So just puzzle left,
> > why the GOOD ref was not used at 'make linux'.
>
> If make linux-deploy installs a "bad" ref kernel with a boot problem, the VM
> will always boot into the "bad" kernel. How is the script going to boot the new
> kernel for testing if the new ref is "bad"/"good"?
Good point.
On the guest:
kdevops@bisect ~ $ awk -F\' '/menuentry / {print $2}' \
/boot/grub/grub.cfg | awk '{print NR-1 " ... " $0}'
0 ... Debian GNU/Linux, with Linux 6.15.0-rc3-next-20250423
1 ... Debian GNU/Linux, with Linux 6.15.0-rc3-next-20250423 (recovery mode)
2 ... Debian GNU/Linux, with Linux 6.15.0-rc3-05015-g2ecdc8872668
3 ... Debian GNU/Linux, with Linux 6.15.0-rc3-05015-g2ecdc8872668 (recovery mode)
4 ... Debian GNU/Linux, with Linux 6.15.0-rc3-04291-g400331cd3ff4
5 ... Debian GNU/Linux, with Linux 6.15.0-rc3-04291-g400331cd3ff4 (recovery mode)
6 ... Debian GNU/Linux, with Linux 6.15.0-rc3-02800-g3686e26dd260
7 ... Debian GNU/Linux, with Linux 6.15.0-rc3-02800-g3686e26dd260 (recovery mode)
8 ... Debian GNU/Linux, with Linux 6.12.22-amd64
9 ... Debian GNU/Linux, with Linux 6.12.22-amd64 (recovery mode)
10 ... Debian GNU/Linux, with Linux 6.12.20-amd64
11 ... Debian GNU/Linux, with Linux 6.12.20-amd64 (recovery mode)
12 ... Debian GNU/Linux, with Linux 6.12.19-amd64
13 ... Debian GNU/Linux, with Linux 6.12.19-amd64 (recovery mode)
14 ... UEFI Firmware Settings
Right now it fails to boot. On the host:
GUEST=bisect
sudo virsh destroy $GUEST
DISK=$(sudo virsh dumpxml $GUEST | xmllint --xpath "string(//devices/disk[@device='disk']/source/@file)" -)
virt-customize --format raw -a $DISK --run-command 'grub-set-default 8'
[ 0.0] Examining the guest ...
[ 8.7] Setting a random seed
[ 8.7] Running: grub-set-default 8
[ 8.8] SELinux relabelling
[ 8.8] Finishing off
ssh bisect uname -r
6.12.22-amd64
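The index 8 above only holds for this particular menu, so it shifts every time a new kernel is installed. A minimal sketch of looking the index up by kernel release instead, reusing the same awk parsing as the listing above. The function name `grub_index_for` is hypothetical, and the sketch assumes a flat menu (no submenus), which is what the numbering above implies.

```shell
#!/bin/bash
# Hypothetical helper: print the 0-based index of the first GRUB
# menuentry whose title ends with "Linux <release>". This skips the
# "(recovery mode)" variants. Reads grub.cfg-style text on stdin and
# returns 1 if no entry matches.
# Assumption: a flat menu, submenu nesting is not handled.
grub_index_for() {
    awk -F\' -v rel="$1" '
        BEGIN { n = 0; found = 0; want = "Linux " rel }
        /menuentry / {
            title = $2
            # Plain suffix comparison, so dots in the release are literal.
            if (!found && length(title) >= length(want) &&
                substr(title, length(title) - length(want) + 1) == want) {
                idx = n
                found = 1
            }
            n++
        }
        END { if (!found) exit 1; print idx }
    '
}
```

With the guest shut off, something like `virt-cat -a "$DISK" /boot/grub/grub.cfg | grub_index_for 6.12.22-amd64` could feed the result to `grub-set-default` through `virt-customize`, instead of reading the index off the list by hand.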
Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 17:08 ` Luis Chamberlain
@ 2025-04-24 17:12 ` Luis Chamberlain
2025-04-24 18:41 ` Daniel Gomez
2025-05-01 22:21 ` Daniel Gomez
0 siblings, 2 replies; 14+ messages in thread
From: Luis Chamberlain @ 2025-04-24 17:12 UTC (permalink / raw)
To: Daniel Gomez; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Thu, Apr 24, 2025 at 10:08:09AM -0700, Luis Chamberlain wrote:
> On Thu, Apr 24, 2025 at 08:40:52AM +0200, Daniel Gomez wrote:
> > If make linux-deploy installs a "bad" ref kernel with a boot problem, the VM
> > will always boot into the "bad" kernel. How is the script going to boot the new
> > kernel for testing if the new ref is "bad"/"good"?
>
> Good point.
>
> On the guest:
>
> kdevops@bisect ~ $ awk -F\' '/menuentry / {print $2}'
> /boot/grub/grub.cfg | awk '{print NR-1 " ... " $0}'
> 0 ... Debian GNU/Linux, with Linux 6.15.0-rc3-next-20250423
> 1 ... Debian GNU/Linux, with Linux 6.15.0-rc3-next-20250423 (recovery mode)
> 2 ... Debian GNU/Linux, with Linux 6.15.0-rc3-05015-g2ecdc8872668
> 3 ... Debian GNU/Linux, with Linux 6.15.0-rc3-05015-g2ecdc8872668 (recovery mode)
> 4 ... Debian GNU/Linux, with Linux 6.15.0-rc3-04291-g400331cd3ff4
> 5 ... Debian GNU/Linux, with Linux 6.15.0-rc3-04291-g400331cd3ff4 (recovery mode)
> 6 ... Debian GNU/Linux, with Linux 6.15.0-rc3-02800-g3686e26dd260
> 7 ... Debian GNU/Linux, with Linux 6.15.0-rc3-02800-g3686e26dd260 (recovery mode)
> 8 ... Debian GNU/Linux, with Linux 6.12.22-amd64
> 9 ... Debian GNU/Linux, with Linux 6.12.22-amd64 (recovery mode)
> 10 ... Debian GNU/Linux, with Linux 6.12.20-amd64
> 11 ... Debian GNU/Linux, with Linux 6.12.20-amd64 (recovery mode)
> 12 ... Debian GNU/Linux, with Linux 6.12.19-amd64
> 13 ... Debian GNU/Linux, with Linux 6.12.19-amd64 (recovery mode)
> 14 ... UEFI Firmware Settings
>
> Right now it fails to boot. On the host:
>
> GUEST=bisect
> sudo virsh destroy $GUEST
> DISK=$(sudo virsh dumpxml $GUEST | xmllint --xpath "string(//devices/disk[@device='disk']/source/@file)" -)
>
> virt-customize --format raw -a $DISK --run-command 'grub-set-default 8'
> [ 0.0] Examining the guest ...
> [ 8.7] Setting a random seed
> [ 8.7] Running: grub-set-default 8
> [ 8.8] SELinux relabelling
> [ 8.8] Finishing off
I forgot to mention here I obviously followed up with
sudo virsh start bisect
> ssh bisect uname -r
> 6.12.22-amd64
Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 17:12 ` Luis Chamberlain
@ 2025-04-24 18:41 ` Daniel Gomez
2025-05-01 22:21 ` Daniel Gomez
1 sibling, 0 replies; 14+ messages in thread
From: Daniel Gomez @ 2025-04-24 18:41 UTC (permalink / raw)
To: Luis Chamberlain; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Thu, Apr 24, 2025 at 10:12:06AM +0100, Luis Chamberlain wrote:
> On Thu, Apr 24, 2025 at 10:08:09AM -0700, Luis Chamberlain wrote:
> > On Thu, Apr 24, 2025 at 08:40:52AM +0200, Daniel Gomez wrote:
> > > If make linux-deploy installs a "bad" ref kernel with a boot problem, the VM
> > > will always boot into the "bad" kernel. How is the script going to boot the new
> > > kernel for testing if the new ref is "bad"/"good"?
> >
> > Good point.
> >
> > On the guest:
> >
> > kdevops@bisect ~ $ awk -F\' '/menuentry / {print $2}'
> > /boot/grub/grub.cfg | awk '{print NR-1 " ... " $0}'
> > 0 ... Debian GNU/Linux, with Linux 6.15.0-rc3-next-20250423
> > 1 ... Debian GNU/Linux, with Linux 6.15.0-rc3-next-20250423 (recovery mode)
> > 2 ... Debian GNU/Linux, with Linux 6.15.0-rc3-05015-g2ecdc8872668
> > 3 ... Debian GNU/Linux, with Linux 6.15.0-rc3-05015-g2ecdc8872668 (recovery mode)
> > 4 ... Debian GNU/Linux, with Linux 6.15.0-rc3-04291-g400331cd3ff4
> > 5 ... Debian GNU/Linux, with Linux 6.15.0-rc3-04291-g400331cd3ff4 (recovery mode)
> > 6 ... Debian GNU/Linux, with Linux 6.15.0-rc3-02800-g3686e26dd260
> > 7 ... Debian GNU/Linux, with Linux 6.15.0-rc3-02800-g3686e26dd260 (recovery mode)
> > 8 ... Debian GNU/Linux, with Linux 6.12.22-amd64
> > 9 ... Debian GNU/Linux, with Linux 6.12.22-amd64 (recovery mode)
> > 10 ... Debian GNU/Linux, with Linux 6.12.20-amd64
> > 11 ... Debian GNU/Linux, with Linux 6.12.20-amd64 (recovery mode)
> > 12 ... Debian GNU/Linux, with Linux 6.12.19-amd64
> > 13 ... Debian GNU/Linux, with Linux 6.12.19-amd64 (recovery mode)
> > 14 ... UEFI Firmware Settings
> >
> > Right now it fails to boot. On the host:
> >
> > GUEST=bisect
> > sudo virsh destroy $GUEST
> > DISK=$(sudo virsh dumpxml $GUEST | xmllint --xpath "string(//devices/disk[@device='disk']/source/@file)" -)
> >
> > virt-customize --format raw -a $DISK --run-command 'grub-set-default 8'
> > [ 0.0] Examining the guest ...
> > [ 8.7] Setting a random seed
> > [ 8.7] Running: grub-set-default 8
> > [ 8.8] SELinux relabelling
> > [ 8.8] Finishing off
>
> I forgot to mention here I obviously followed up with
>
> sudo virsh start bisect
>
> > ssh bisect uname -r
> > 6.12.22-amd64
Nice trick!
>
> Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-04-24 17:12 ` Luis Chamberlain
2025-04-24 18:41 ` Daniel Gomez
@ 2025-05-01 22:21 ` Daniel Gomez
2025-05-05 19:25 ` Luis Chamberlain
1 sibling, 1 reply; 14+ messages in thread
From: Daniel Gomez @ 2025-05-01 22:21 UTC (permalink / raw)
To: Luis Chamberlain; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Thu, Apr 24, 2025 at 10:12:06AM +0100, Luis Chamberlain wrote:
> On Thu, Apr 24, 2025 at 10:08:09AM -0700, Luis Chamberlain wrote:
> > On Thu, Apr 24, 2025 at 08:40:52AM +0200, Daniel Gomez wrote:
> > > If make linux-deploy installs a "bad" ref kernel with a boot problem, the VM
> > > will always boot into the "bad" kernel. How is the script going to boot the new
> > > kernel for testing if the new ref is "bad"/"good"?
> >
> > Good point.
> >
> > On the guest:
> >
> > kdevops@bisect ~ $ awk -F\' '/menuentry / {print $2}'
> > /boot/grub/grub.cfg | awk '{print NR-1 " ... " $0}'
> > 0 ... Debian GNU/Linux, with Linux 6.15.0-rc3-next-20250423
> > 1 ... Debian GNU/Linux, with Linux 6.15.0-rc3-next-20250423 (recovery mode)
> > 2 ... Debian GNU/Linux, with Linux 6.15.0-rc3-05015-g2ecdc8872668
> > 3 ... Debian GNU/Linux, with Linux 6.15.0-rc3-05015-g2ecdc8872668 (recovery mode)
> > 4 ... Debian GNU/Linux, with Linux 6.15.0-rc3-04291-g400331cd3ff4
> > 5 ... Debian GNU/Linux, with Linux 6.15.0-rc3-04291-g400331cd3ff4 (recovery mode)
> > 6 ... Debian GNU/Linux, with Linux 6.15.0-rc3-02800-g3686e26dd260
> > 7 ... Debian GNU/Linux, with Linux 6.15.0-rc3-02800-g3686e26dd260 (recovery mode)
> > 8 ... Debian GNU/Linux, with Linux 6.12.22-amd64
> > 9 ... Debian GNU/Linux, with Linux 6.12.22-amd64 (recovery mode)
> > 10 ... Debian GNU/Linux, with Linux 6.12.20-amd64
> > 11 ... Debian GNU/Linux, with Linux 6.12.20-amd64 (recovery mode)
> > 12 ... Debian GNU/Linux, with Linux 6.12.19-amd64
> > 13 ... Debian GNU/Linux, with Linux 6.12.19-amd64 (recovery mode)
> > 14 ... UEFI Firmware Settings
> >
> > Right now it fails to boot. On the host:
> >
> > GUEST=bisect
> > sudo virsh destroy $GUEST
> > DISK=$(sudo virsh dumpxml $GUEST | xmllint --xpath "string(//devices/disk[@device='disk']/source/@file)" -)
> >
> > virt-customize --format raw -a $DISK --run-command 'grub-set-default 8'
> > [ 0.0] Examining the guest ...
> > [ 8.7] Setting a random seed
> > [ 8.7] Running: grub-set-default 8
> > [ 8.8] SELinux relabelling
> > [ 8.8] Finishing off
>
> I forgot to mention here I obviously followed up with
>
> sudo virsh start bisect
>
> > ssh bisect uname -r
> > 6.12.22-amd64
I've had some time to think this over again, and I'm not sure the current loop
approach feels right to me. IIUC, the proposal works like this:

make -> ansible-playbook bootlinux.yml (bisect) -> task: bisect-script.sh ->
make -> ansible-playbook bootlinux (deploy)

So, I think it would be better to avoid calling an Ansible playbook from a
script that is itself executed in an Ansible play.

However, I was trying to find a convenient way to loop in Ansible and I haven't
been able to find one. I was thinking of:

a. bisect-start target:
   1. Ensures VM is booted
   2. git bisect start

b. bisect-loop target:
   1. Capture GRUB list and ask user for index
   2. Build, install kernel
   3. Reboot VM
   4. Wait until online
      4.a git bisect bad if timeout or other custom condition
      4.b git bisect good if online (and if custom condition succeeds)
   6. Shutdown VM (regardless of good/bad)
   7. Reconfigure GRUB to boot from user option
   8. Start VM and goto 2.

I was reading about Ansible loops and blocks [1] and I think Ansible limits
loops to the task level, and include_tasks only loops over a list. I don't see
a better/cleaner way than running our own script and looping over b. there.

Thoughts?

[1]
https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_loops.html#loops
https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_blocks.html
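As a rough illustration of steps 2 to 4 in target b., here is a minimal sketch of how one iteration's outcome could be turned into a verdict. It uses the exit-code convention of `git bisect run` (0 is good, 125 is skip, any other code from 1 to 127 is bad). The function names `wait_online` and `bisect_exit_code`, the 300 second default timeout and the 5 second poll interval are all assumptions for illustration, not part of kdevops.

```shell
#!/bin/bash
# Hypothetical sketch, not kdevops code.

# Poll a guest over ssh until it answers or the timeout (seconds) expires.
# $1: ssh host name, $2: timeout in seconds (assumed default: 300)
wait_online() {
    local host=$1 timeout=${2:-300} deadline
    deadline=$((SECONDS + timeout))
    while [ "$SECONDS" -lt "$deadline" ]; do
        ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null && return 0
        sleep 5
    done
    return 1
}

# Map one iteration's outcome onto a "git bisect run" exit code.
# $1: exit status of the build/install step
# $2: exit status of the "wait until online" step
bisect_exit_code() {
    local build_rc=$1 online_rc=$2
    if [ "$build_rc" -ne 0 ]; then
        echo 125    # commit cannot be tested, let git skip it
    elif [ "$online_rc" -ne 0 ]; then
        echo 1      # installed but never came back online: bad
    else
        echo 0      # booted and reachable: good
    fi
}
```

A bisect script could then end with `exit "$(bisect_exit_code "$build_rc" "$online_rc")"`, which keeps the good/bad/skip decision in one place when a custom condition is added later.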
>
> Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-05-01 22:21 ` Daniel Gomez
@ 2025-05-05 19:25 ` Luis Chamberlain
2025-05-07 10:38 ` Daniel Gomez
0 siblings, 1 reply; 14+ messages in thread
From: Luis Chamberlain @ 2025-05-05 19:25 UTC (permalink / raw)
To: Daniel Gomez; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Fri, May 02, 2025 at 12:21:51AM +0200, Daniel Gomez wrote:
> However, I was trying to find a convenient way to loop in Ansible and I haven't
> been able to find one. I was thinking in:
It's the same chicken-and-egg problem with running fstests check for a long
time, and having a watchdog run to support a CI. We've now detached the CI
stuff as part of the CI stuff, but similar questions came up long ago
when I tried to see if we could do both in one shot.

What I ended up finding was the async Ansible stuff, but I never ended up
mucking with it:

https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_async.html

Not sure if it helps here.
Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-05-05 19:25 ` Luis Chamberlain
@ 2025-05-07 10:38 ` Daniel Gomez
2025-05-07 18:52 ` Luis Chamberlain
0 siblings, 1 reply; 14+ messages in thread
From: Daniel Gomez @ 2025-05-07 10:38 UTC (permalink / raw)
To: Luis Chamberlain; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Mon, May 05, 2025 at 12:25:34PM +0100, Luis Chamberlain wrote:
> On Fri, May 02, 2025 at 12:21:51AM +0200, Daniel Gomez wrote:
> > However, I was trying to find a convenient way to loop in Ansible and I haven't
> > been able to find one. I was thinking in:
>
> Its the same chicken and egg problem with running fstests check for a long time,
> and having a watchdog run to support a CI. We've now detatched the CI
> stuff as part of the CI stuff, but similar questions came up long ago
> when I tried to see if we could do both in one shot.
I see.
>
> What I ended up finding was the async ansible stuff but I never ended up
> mucking with it:
>
> https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_async.html
>
> Not sure if it helps here.
>
>
> Luis
I think the Ansible async task feature might be useful for step 4 (Wait until
online), but it doesn't solve the overall problem.

Going back to the workflow proposal:

a. bisect-start target:
   1. Ensures VM is booted
   2. git bisect start

b. bisect-loop target:
   1. Capture GRUB list and ask user for index
   2. Build, install kernel
   3. Reboot VM
   4. Wait until online
      4.a git bisect bad if timeout or other custom condition
      4.b git bisect good if online (and if custom condition succeeds)
   6. Shutdown VM (regardless of good/bad)
   7. Reconfigure GRUB to boot from user option
   8. Start VM and goto 2.
I think the CI loop approach is a better fit for this. What about something
like this?

bisect:
	ansible-playbook bisect.yml --tags bisect-init
	while ! [ -f .bisect-loop-finished ]; do \
		ansible-playbook bisect.yml; \
	done

This should loop over b. until Ansible detects that the bisect has finished.

For the GRUB part, we may be able to save the distro kernel GRUB configuration
under the bisect-init tag, and use that when b. fails to boot the VM.
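One thing to watch with a sentinel-file loop is that it spins forever if the playbook fails before the sentinel is created. A minimal sketch of the same loop with a fail-stop and an iteration cap follows. The helper name `loop_until_sentinel` and its return codes are hypothetical. Only `.bisect-loop-finished` and `bisect.yml` come from the proposal above.

```shell
#!/bin/bash
# Hypothetical helper: run a command repeatedly until a sentinel file
# appears, the command fails, or an iteration cap is reached.
# $1: sentinel path, $2: max iterations, remaining args: command to run
# Returns 0 when the sentinel exists, 1 if the command failed,
# 2 if the cap was hit first.
loop_until_sentinel() {
    local sentinel=$1 max=$2 i=0
    shift 2
    while [ ! -f "$sentinel" ]; do
        if [ "$i" -ge "$max" ]; then
            echo "giving up after $max iterations" >&2
            return 2
        fi
        "$@" || return 1
        i=$((i + 1))
    done
    return 0
}
```

For example, `loop_until_sentinel .bisect-loop-finished 20 ansible-playbook bisect.yml`, where the cap of 20 is an arbitrary illustration. A bisection needs roughly log2(N) steps for N candidate commits, so the cap can be sized from the GOOD..BAD range.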
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFT] bootlinux: add bisection support
2025-05-07 10:38 ` Daniel Gomez
@ 2025-05-07 18:52 ` Luis Chamberlain
0 siblings, 0 replies; 14+ messages in thread
From: Luis Chamberlain @ 2025-05-07 18:52 UTC (permalink / raw)
To: Daniel Gomez; +Cc: Chuck Lever, Daniel Gomez, kdevops
On Wed, May 07, 2025 at 12:38:25PM +0200, Daniel Gomez wrote:
> On Mon, May 05, 2025 at 12:25:34PM +0100, Luis Chamberlain wrote:
> > On Fri, May 02, 2025 at 12:21:51AM +0200, Daniel Gomez wrote:
> > > However, I was trying to find a convenient way to loop in Ansible and I haven't
> > > been able to find one. I was thinking in:
> >
> > Its the same chicken and egg problem with running fstests check for a long time,
> > and having a watchdog run to support a CI. We've now detatched the CI
> > stuff as part of the CI stuff, but similar questions came up long ago
> > when I tried to see if we could do both in one shot.
>
> I see.
>
> >
> > What I ended up finding was the async ansible stuff but I never ended up
> > mucking with it:
> >
> > https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_async.html
> >
> > Not sure if it helps here.
> >
> >
> > Luis
>
> I think the Ansible async task feature might be useful for step 4 (Wait until
> online), but it doesn't solve the overall problem.
>
> Going back to the workflow proposal:
>
> a. bisect-start target:
> 1. Ensures VM is booted
> 2. git bisect start
>
> b. bisect-loop target:
> 1. Capture GRUB list and ask user for index
> 2. Build, install kernel
> 3. Reboot VM
> 4. Wait until online
> 4.a git bisect bad if timeout or other custom condition
> 4.b git bisect good if online (and if custom condition succeeds)
> 6. Shutdown vm (regarless of good/bad)
> 7. Reconfigure grub to boot from user option
> 8. Start VM and goto 2.
>
> I think the CI loop approach suits better for this. What about something like
> this?
>
> bisect:
> ansible-playbook bisect.yml --tags bisect-init
> while ! [ -f .bisect-loop-finished ]; do \
> ansible-playbook bisect.yml
> done
Looks good!
Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
Thread overview: 14+ messages
2025-04-24 0:09 [RFT] bootlinux: add bisection support Luis Chamberlain
2025-04-24 5:47 ` Luis Chamberlain
2025-04-24 6:40 ` Daniel Gomez
2025-04-24 7:25 ` Daniel Gomez
2025-04-24 13:36 ` Daniel Gomez
2025-04-24 17:10 ` Luis Chamberlain
2025-04-24 18:37 ` Daniel Gomez
2025-04-24 17:08 ` Luis Chamberlain
2025-04-24 17:12 ` Luis Chamberlain
2025-04-24 18:41 ` Daniel Gomez
2025-05-01 22:21 ` Daniel Gomez
2025-05-05 19:25 ` Luis Chamberlain
2025-05-07 10:38 ` Daniel Gomez
2025-05-07 18:52 ` Luis Chamberlain