* Re: [OE-core] [PATCH] psplash: fix typo in psplash-systemd.service
From: Mikko Rapeli @ 2025-02-27 9:09 UTC
To: Richard Purdie; +Cc: openembedded-core, Jon Mason, Ross Burton
Hi,
On Tue, Feb 25, 2025 at 11:42:10PM +0000, Richard Purdie wrote:
> Hi Mikko,
>
> On Thu, 2025-02-20 at 10:25 +0200, Mikko Rapeli via lists.openembedded.org wrote:
> > systemd ignores the typo and continues but startup fails later due to
> > missing fifo file. Fixes:
> >
> > systemd[1]: /usr/lib/systemd/system/psplash-systemd.service:8: Unknown key 'ConditionFileExists' in section [Unit], ignoring.
> >
> > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
> > ---
> >  meta/recipes-core/psplash/files/psplash-systemd.service | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/meta/recipes-core/psplash/files/psplash-systemd.service b/meta/recipes-core/psplash/files/psplash-systemd.service
> > index f9aaa2db3d..b618be1ba6 100644
> > --- a/meta/recipes-core/psplash/files/psplash-systemd.service
> > +++ b/meta/recipes-core/psplash/files/psplash-systemd.service
> > @@ -5,7 +5,7 @@ After=psplash-start@fb0.service
> >  Requires=psplash-start@fb0.service
> >  RequiresMountsFor=/run
> >  ConditionFileIsExecutable=/usr/bin/psplash
> > -ConditionFileExists=/run/psplash_fifo
> > +ConditionPathExists=/run/psplash_fifo
> >
> >  [Service]
> >  ExecStart=/usr/bin/psplash-systemd
>
> With the systemctl patch dropped, it exposed another psplash failure in
> meta-arm again in my last master-next build:
>
> https://autobuilder.yoctoproject.org/valkyrie/#/builders/75/builds/1036
>
> So I'm not sure the previous issues were fully resolved :/
  File "/srv/pokybuild/yocto-worker/meta-arm/build/meta/lib/oeqa/core/decorator/__init__.py", line 35, in wrapped_f
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/srv/pokybuild/yocto-worker/meta-arm/build/meta/lib/oeqa/runtime/cases/systemd.py", line 100, in test_systemd_failed
    output += self.systemctl('status --full --failed')
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/pokybuild/yocto-worker/meta-arm/build/meta/lib/oeqa/runtime/cases/systemd.py", line 26, in systemctl
    self.assertEqual(status, expected, message)
AssertionError: 3 != 0 : SYSTEMD_BUS_TIMEOUT=240s systemctl status --full --failed
x psplash-systemd.service - Start psplash-systemd progress communication helper
     Loaded: loaded (/usr/lib/systemd/system/psplash-systemd.service; static)
     Active: failed (Result: exit-code) since Tue 2025-02-25 17:25:22 UTC; 12min ago
   Duration: 528ms
 Invocation: aa5767bf98e343ad9747b82b323d1ddb
   Main PID: 265 (code=exited, status=1/FAILURE)
   Mem peak: 1.1M
        CPU: 163ms
Feb 25 17:25:21 sbsa-ref systemd[1]: Started Start psplash-systemd progress communication helper.
Feb 25 17:25:22 sbsa-ref psplash-systemd[265]: Error unable to open fifo
Feb 25 17:25:22 sbsa-ref systemd[1]: psplash-systemd.service: Main process exited, code=exited, status=1/FAILURE
Feb 25 17:25:22 sbsa-ref systemd[1]: psplash-systemd.service: Failed with result 'exit-code'.
I tried to reproduce this over tens of test runs, looping over the tests.
I started a VNC server as on the autobuilder:
$ vncserver -kill :1
$ vncserver
and looped over the tests with
$ DISPLAY=:1 bitbake -c testimage core-image-minimal core-image-weston core-image-sato
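
Such a loop can also be scripted so that failing runs are counted. The
following is only a minimal sketch: the helper name loop_command and the
run count are hypothetical, and only the bitbake command line and the
DISPLAY value are taken from the commands above.

```python
import os
import subprocess
import sys


def loop_command(cmd, runs, extra_env=None):
    """Run cmd `runs` times; return the 1-based indices of the failed runs."""
    env = dict(os.environ, **(extra_env or {}))
    failed = []
    for i in range(1, runs + 1):
        result = subprocess.run(cmd, env=env)
        if result.returncode != 0:
            print(f"run {i}/{runs} failed with {result.returncode}",
                  file=sys.stderr)
            failed.append(i)
    return failed


# Hypothetical usage (the run count of 20 is an assumption):
#
#   images = ["core-image-minimal", "core-image-weston", "core-image-sato"]
#   failed = loop_command(["bitbake", "-c", "testimage", *images],
#                         runs=20, extra_env={"DISPLAY": ":1"})
```

Collecting the indices of failed runs instead of stopping at the first
failure makes it easier to see how rare an intermittent failure is.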
psplash has two services: psplash-start@fb0.service and psplash-systemd.service.
Both get started by udev when /dev/fb0 is detected. psplash-systemd.service is always
started after psplash-start@fb0.service, which creates the fifo in /run. Both services
should be started after /run has been mounted. psplash-start@fb0.service uses sd-notify
to tell systemd that it has started once the fifo file has been created. So the missing
fifo error from psplash-systemd.service means either that psplash-start@fb0.service
failed to start, which it did not, or that it exited without reporting errors. Or
/dev/fb0 disappeared or stopped working, which could possibly be caused by the vncserver
failing. If the vncserver fails, basically disconnecting a "display" from the qemu
machine, then I think this is a test setup problem. The vncserver comes from the build
host, not from the Yocto build.
If vncserver is not running when qemu is started with graphics, the target does
not boot far enough to provide working SSH. If qemu was booted with graphics to the
vncserver, then stopping the vncserver hangs the whole qemu machine. So I think fatal
issues with vncserver should be visible as a different kind of error.
The psplash-systemd.c startup is really simple:
int main()
{
  sd_event *event;
  sd_event_source *event_source = NULL;
  int r;
  sigset_t ss;
  usec_t time_now;
  char *rundir;

  /* Open pipe for psplash */
  rundir = getenv("PSPLASH_FIFO_DIR");
  if (!rundir)
    rundir = "/run";
  chdir(rundir);
  if ((pipe_fd = open (PSPLASH_FIFO,O_WRONLY|O_NONBLOCK)) == -1) {
    fprintf(stderr, "Error unable to open fifo");
    exit(EXIT_FAILURE);
  }
But this does not get started unless psplash-start@fb0.service started correctly,
so it must have. In addition, psplash-systemd.service only starts if the fifo file
is in /run:
After=psplash-start@fb0.service
Requires=psplash-start@fb0.service
RequiresMountsFor=/run
ConditionFileIsExecutable=/usr/bin/psplash
ConditionPathExists=/run/psplash_fifo
Now psplash-start@fb0.service can fail or exit, and thus unlink() the fifo file,
if there are errors and its main loop exits. This can happen at any time
due to /dev/fb0 errors, signals or when something writes "QUIT" to the fifo.
These errors are not captured and the return value is zero.
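
To illustrate the window between the ConditionPathExists= check and the
open() call, here is a small sketch in Python that mimics the
open(PSPLASH_FIFO, O_WRONLY|O_NONBLOCK) call from psplash-systemd.c. It
assumes Linux/POSIX fifo semantics; the function names and the temporary
directory are hypothetical, and how psplash itself opens the fifo is not
shown in this mail.

```python
import errno
import os
import tempfile


def try_open_fifo(path):
    """Non-blocking write-open of path; return (fd, None) or (None, errno name)."""
    try:
        return os.open(path, os.O_WRONLY | os.O_NONBLOCK), None
    except OSError as exc:
        return None, errno.errorcode.get(exc.errno, str(exc.errno))


def demo():
    """Collect the open() outcome for three states of the fifo."""
    results = {}
    with tempfile.TemporaryDirectory() as rundir:
        fifo = os.path.join(rundir, "psplash_fifo")
        os.mkfifo(fifo)

        # The fifo exists but no process has it open for reading.
        _, results["no_reader"] = try_open_fifo(fifo)

        # A reader is attached, so the write-open succeeds.
        reader = os.open(fifo, os.O_RDONLY | os.O_NONBLOCK)
        fd, results["with_reader"] = try_open_fifo(fifo)
        os.close(fd)
        os.close(reader)

        # The path check passes, then the fifo is unlinked before open().
        results["check_passed"] = os.path.exists(fifo)
        os.unlink(fifo)
        _, results["unlinked_after_check"] = try_open_fifo(fifo)
    return results
```

On Linux the open fails with ENXIO when the fifo exists without a reader
and with ENOENT once the fifo has been unlinked. The C code prints the
same fixed message in both cases, so the log line alone does not tell
them apart.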
I propose ignoring these errors in psplash-systemd.service. It tries
really hard to start only when /dev/fb0 and the fifo are there, but it can't handle
all errors from /dev/fb0 going missing or signals being sent to psplash-start@fb0.service.
The progress bar is effectively "best effort".
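
As an illustration of what "best effort" means for the exit status, here
is a small Python sketch of the policy. It is not the proposed change
itself, which would be made in the service file; the function name and
the set of ignorable errno values are assumptions for illustration only.

```python
import errno
import os

# Assumption: these errno values mean "the splash screen is gone".
IGNORABLE = {errno.ENOENT, errno.ENXIO}


def helper_exit_status(fifo_path):
    """Return 0 if the fifo opens or is merely unavailable, 1 otherwise."""
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        # A missing fifo or a missing reader is not treated as a failure.
        return 0 if exc.errno in IGNORABLE else 1
    os.close(fd)
    return 0
```

With such a policy a vanished fifo would no longer show up in
"systemctl status --full --failed", while unexpected errors would still
be reported.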
Cheers,
-Mikko