Openembedded Core Discussions
From: Mikko Rapeli <mikko.rapeli@linaro.org>
To: Mathieu Dubois-Briand <mathieu.dubois-briand@bootlin.com>,
	openembedded-core@lists.openembedded.org
Subject: Re: pseudo aborts on aarch64 ( Re: [OE-core] [PATCH v4 7/9] image_types_wic.bbclass: capture verbose wic output by default )
Date: Mon, 28 Apr 2025 12:26:51 +0300
Message-ID: <aA9J25K043YATV4K@nuoska>
In-Reply-To: <1839881BF86B7FC2.2292@lists.openembedded.org>

Hi,

On Fri, Apr 25, 2025 at 01:12:59PM +0300, Mikko Rapeli via lists.openembedded.org wrote:
> On Fri, Apr 25, 2025 at 12:34:40PM +0300, Mikko Rapeli via lists.openembedded.org wrote:
> > On Fri, Apr 25, 2025 at 11:03:54AM +0200, Mathieu Dubois-Briand wrote:
> > > On Tue Apr 22, 2025 at 4:34 PM CEST, Mikko Rapeli via lists.openembedded.org wrote:
> > > > Call wic with --debug to capture logs from wic internals
> > > > so that it's clear which partitions get created and which
> > > > files get copied where. wic plugins contain for example
> > > > race conditions which don't install files at all and thus
> > > > images fail to boot and it's not possible to debug these without
> > > > something in wic task logs.
> > > >
> > > > For example core-image-initramfs-boot do_image_wic
> > > > log is now 576 lines which is not excessive but very
> > > > important when debugging problems, especially race
> > > > conditions which are only hit in some builds in CI.
> > > >
> > > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
> > > > ---
> > > >  meta/classes-recipe/image_types_wic.bbclass | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/meta/classes-recipe/image_types_wic.bbclass b/meta/classes-recipe/image_types_wic.bbclass
> > > > index 1b422b6280..10888bc12b 100644
> > > > --- a/meta/classes-recipe/image_types_wic.bbclass
> > > > +++ b/meta/classes-recipe/image_types_wic.bbclass
> > > > @@ -72,7 +72,7 @@ IMAGE_CMD:wic () {
> > > >  	if [ -z "$wks" ]; then
> > > >  		bbfatal "No kickstart files from WKS_FILES were found: ${WKS_FILES}. Please set WKS_FILE or WKS_FILES appropriately."
> > > >  	fi
> > > > -	BUILDDIR="${TOPDIR}" PSEUDO_UNLOAD=1 wic create "$wks" --vars "${STAGING_DIR}/${MACHINE}/imgdata/" -e "${IMAGE_BASENAME}" -o "$build_wic/" -w "$tmp_wic" ${WIC_CREATE_EXTRA_ARGS}
> > > > +	BUILDDIR="${TOPDIR}" PSEUDO_UNLOAD=1 wic create --debug "$wks" --vars "${STAGING_DIR}/${MACHINE}/imgdata/" -e "${IMAGE_BASENAME}" -o "$build_wic/" -w "$tmp_wic" ${WIC_CREATE_EXTRA_ARGS}
> > > >  
> > > >  	# look to see if the user specifies a custom imager
> > > >  	IMAGER=direct
> > > 
> > > Hi Mikko,
> > > 
> > > As we dropped the "oeqa wic.py: clean image build dir before rebuild in
> > > test_permissions" patch, we again have an issue with this one.
> > > 
> > > 2025-04-24 16:54:36,535 - oe-selftest - INFO - wic.Wic.test_permissions (subunit.RemotedTestCase)
> > > 2025-04-24 16:54:36,536 - oe-selftest - INFO -  ... FAIL
> > > ...
> > > | DEBUG: Python function extend_recipe_sysroot finished
> > > | DEBUG: Executing python function set_image_size
> > > | DEBUG: 23394.800000 = 17996 * 1.300000
> > > | DEBUG: 23394.800000 = max(23394.800000, 8192)[23394.800000] + 0
> > > | DEBUG: 23395.000000 = int(23394.800000)
> > > | DEBUG: 23395 = aligned(23395)
> > > | DEBUG: returning 23395
> > > | DEBUG: Python function set_image_size finished
> > > | DEBUG: Executing shell function do_image_wic
> > > | abort()ing pseudo client by server request. See https://wiki.yoctoproject.org/wiki/Pseudo_Abort for more details on this.
> > > | Check logfile: /srv/pokybuild/yocto-worker/oe-selftest-armhost/build/build-st-1239956/tmp/work/qemuarm64-poky-linux/core-image-minimal/1.0/pseudo//pseudo.log
> > > | Aborted (core dumped)
> > > | WARNING: exit code 134 from a shell command.
> > > NOTE: recipe core-image-minimal-1.0-r0: task do_image_wic: Failed
> > > ERROR: Task (/srv/pokybuild/yocto-worker/oe-selftest-armhost/build/meta/recipes-core/images/core-image-minimal.bb:do_image_wic) failed with exit code '1'
> > > Pseudo log:
> > > path mismatch [2 links]: ino 157047752 db '/srv/pokybuild/yocto-worker/oe-selftest-armhost/build/build-st-1239956/tmp/work/qemuarm64-poky-linux/core-image-minimal/1.0/rootfs/var/log' req '/srv/pokybuild/yocto-worker/oe-selftest-armhost/build/build-st-1239956/tmp/work/qemuarm64-poky-linux/core-image-minimal/1.0/tmp-wic/rootfs1/var/log'.
> > > Setup complete, sending SIGUSR1 to pid 346075.
> > > 
> > > https://autobuilder.yoctoproject.org/valkyrie/#/builders/23/builds/1507
> > > 
> > > This can be reproduced locally:
> > > 
> > > Get https://web.git.yoctoproject.org/poky-ci-archive/tag/?h=autobuilder.yoctoproject.org/valkyrie/a-full-1456
> > > and run 'oe-selftest -r wic.Wic.test_permissions'
> > 
> > Yes. This pseudo issue needs to be root caused and fixed. Will need to get
> > into that.
> > 
> > FWIW, on aarch64 build host in bitbake devshell I see vim sometimes crashing with
> > pseudo aborts when opening files, sometimes also when closing, and sometimes
> > it works. These may be related.
> > 
> > $ bitbake -c devshell lttng-modules
> > ...
> > root@ledge:~/src/base/repo/poky/build_test/tmp/work/genericarm64-poky-linux/lttng-modules/2.13.18/lttng-modules-2.13.18# vi ../../../../../work/genericarm64-poky-linux/linux-yocto/6.12.23+git/linux-genericarm64-standard-build/.config
> > Vim: Caught deadly signal ABRT
> > Vim: Finished.
> > Aborted
> 
> # tail -1 ../pseudo/pseudo.log
> path mismatch [1 link]: ino 36752721 db '/home/mcfrisk/src/base/repo/poky/build_test/tmp/work-shared/genericarm64/kernel-source/.Makefile.swp' req '/home/mcfrisk/src/base/repo/poky/build_test/tmp/work/genericarm64-poky-linux/linux-yocto/6.12.23+git/linux-genericarm64-standard-build/.config.swp'.
> 
> So these swap files opened and closed by vim confuse pseudo. Disabling
> them with 'vi -n' fixes this.
> 
> Richard mentioned yesterday in the patch review call that the fast opening
> and closing of files and inode reuse is triggering this. The accounting
> done by pseudo breaks somehow on arm64/aarch64 but works on x86_64
> build hosts.

After re-reading https://wiki.yoctoproject.org/wiki/Pseudo_Abort, I don't
think this is a bug, just a very annoying limitation. A user can't use the
vim editor both inside and outside of pseudo/"bitbake -c devshell". vim
opens temp files in various locations and possibly deletes them, and
pseudo gets confused and starts aborting.
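
To illustrate the kind of bookkeeping problem described above, here is a
minimal toy model in Python. It is only a sketch and not pseudo's real
implementation: the class name InodeDb, the inode number and the shortened
paths are hypothetical. It assumes a database that maps inode numbers to
paths, and shows how a delete the database never hears about, followed by
the filesystem reusing the inode, could produce a "path mismatch".

```python
# Toy model only: NOT pseudo's real data structures or logic.
class InodeDb:
    """Remembers which path was last seen for each inode number."""

    def __init__(self):
        self.path_by_inode = {}

    def track_create(self, inode, path):
        # A file creation that the database is told about.
        self.path_by_inode[inode] = path

    def track_unlink(self, inode):
        # A file deletion that the database is told about.
        self.path_by_inode.pop(inode, None)

    def check(self, inode, path):
        # Return a mismatch message if the inode is known under another path.
        known = self.path_by_inode.get(inode)
        if known is not None and known != path:
            return f"path mismatch: ino {inode} db '{known}' req '{path}'"
        self.path_by_inode[inode] = path
        return None


# Case 1: the swap file is deleted behind the database's back and the
# (hypothetical) inode 1001 is reused for a different swap file.
db = InodeDb()
db.track_create(1001, "/work-shared/kernel-source/.Makefile.swp")
untracked_result = db.check(1001, "/work/linux-build/.config.swp")

# Case 2: the same sequence, but the deletion is tracked.
db2 = InodeDb()
db2.track_create(1001, "/work-shared/kernel-source/.Makefile.swp")
db2.track_unlink(1001)
tracked_result = db2.check(1001, "/work/linux-build/.config.swp")
```

In this model only the first case reports a mismatch; when the deletion is
tracked, the reused inode is simply recorded under its new path.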

I don't think this can be fixed. The workaround is, well, not to edit
anything under devshell. I need to find new ways to create patches for
various recipes. I'm used to opening devshell after do_install to test
applying patches, and then manually running the do_configure, do_compile
and do_install tasks to test things out before doing full recipe and
image builds. It would be nice if the pseudo checks only applied to files
inside the recipe workspace, but I guess that filtering is tricky.

As for this wic selftest regression: since there was opposition to
enabling more verbose logs, I will just drop this patch. There are real
bugs in wic which for some reason only get exposed by this verbose flag.

wic generates the bootloader config files without pseudo, and thus wic
and bitbake builds differ. This is true for systemd-boot, which is this
failing case, and also for grub when EFI_LOADER = "grub-efi", which
aborts with:

path mismatch [2 links]: ino 33909680 db '/home/mcfrisk/src/base/repo/poky/build_test-st/tmp/work/genericarm64-poky-linux/core-image-minimal/1.0/rootfs/boot/Image' req '/home/mcfrisk/src/base/repo/poky/build_test-st/tmp/work/genericarm64-poky-linux/core-image-minimal/1.0/tmp-wic/rootfs1/boot/Image'.

To me the difference is calling "wic" as a normal user vs calling
"wic" as root under the pseudo fakeroot shell inside the bitbake env.

I don't think the output of both can ever be the same.
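
One way a "[2 links]" mismatch like the one above could arise is sketched
below in Python. This is an assumption for illustration, not a trace of
what wic or pseudo actually do: the helper find_mismatch and the dict used
as a database are hypothetical, and the directory names only mimic the
log. A file is recorded under its rootfs path, then a hard link to it is
created without the database being told, and a later lookup through the
new path finds the same inode under a different name.

```python
import os
import tempfile


def find_mismatch(db, path):
    """Record path for its inode, or report a conflict with a known path.

    Hypothetical helper: db is a plain dict mapping inode -> path.
    """
    st = os.stat(path)
    known = db.get(st.st_ino)
    if known is not None and known != path:
        return (st.st_ino, st.st_nlink, known, path)
    db[st.st_ino] = path
    return None


with tempfile.TemporaryDirectory() as top:
    rootfs_file = os.path.join(top, "rootfs", "boot", "Image")
    wic_file = os.path.join(top, "tmp-wic", "rootfs1", "boot", "Image")
    os.makedirs(os.path.dirname(rootfs_file))
    os.makedirs(os.path.dirname(wic_file))
    with open(rootfs_file, "w") as f:
        f.write("kernel")

    db = {}
    # Tracked access: the database learns the rootfs path for this inode.
    first = find_mismatch(db, rootfs_file)
    # Untracked operation: a hard link made without updating the database.
    os.link(rootfs_file, wic_file)
    # Tracked access through the new name hits the stale database entry.
    mismatch = find_mismatch(db, wic_file)

if mismatch:
    ino, links, known, req = mismatch
    print(f"path mismatch [{links} links]: ino {ino} db '{known}' req '{req}'.")
```

If the hard link were also recorded in the database, the second lookup
would have nothing to conflict with in this model.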

The failing sequence is:

 * modify wks file
 * call "wic" to build the image
 * build the same image with bitbake

The failure happens in the bitbake image build. If the image is built
with bitbake before wic is called, the test passes. The test also passes
if the sysroot and pseudo databases are cleaned before building the image.

A lot of the wic plugins call "cp" and write directly to config files
without passing through pseudo. All of these break pseudo.
A fix in wic would need to always use pseudo when creating
files and directories and when copying files. This is currently not
the case and a lot of code would need to be refactored.
I'm not willing to do this now, sorry.

Cheers,

-Mikko

