Openembedded Core Discussions
From: Mikko Rapeli <mikko.rapeli@linaro.org>
To: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: openembedded-core@lists.openembedded.org,
	Jon Mason <Jon.Mason@arm.com>, Ross Burton <ross.burton@arm.com>
Subject: Re: [OE-core] [PATCH] psplash: fix typo in psplash-systemd.service
Date: Thu, 27 Feb 2025 11:09:24 +0200
Message-ID: <Z8ArxMPj19hAs3dZ@nuoska>
In-Reply-To: <11af77e0e16064cb97ab14593d5a931e38694275.camel@linuxfoundation.org>

Hi,

On Tue, Feb 25, 2025 at 11:42:10PM +0000, Richard Purdie wrote:
> Hi Mikko,
> 
> On Thu, 2025-02-20 at 10:25 +0200, Mikko Rapeli via lists.openembedded.org wrote:
> > systemd ignores the typo and continues but startup fails later due to
> > missing fifo file. Fixes:
> > 
> > systemd[1]: /usr/lib/systemd/system/psplash-systemd.service:8: Unknown key 'ConditionFileExists' in section [Unit], ignoring.
> > 
> > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
> > ---
> >  meta/recipes-core/psplash/files/psplash-systemd.service | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/meta/recipes-core/psplash/files/psplash-systemd.service b/meta/recipes-core/psplash/files/psplash-systemd.service
> > index f9aaa2db3d..b618be1ba6 100644
> > --- a/meta/recipes-core/psplash/files/psplash-systemd.service
> > +++ b/meta/recipes-core/psplash/files/psplash-systemd.service
> > @@ -5,7 +5,7 @@ After=psplash-start@fb0.service
> >  Requires=psplash-start@fb0.service
> >  RequiresMountsFor=/run
> >  ConditionFileIsExecutable=/usr/bin/psplash
> > -ConditionFileExists=/run/psplash_fifo
> > +ConditionPathExists=/run/psplash_fifo
> >  
> >  [Service]
> >  ExecStart=/usr/bin/psplash-systemd
> 
> With the systemctl patch dropped, it exposed another psplash failure in
> meta-arm again in my last master-next build:
> 
> https://autobuilder.yoctoproject.org/valkyrie/#/builders/75/builds/1036
> 
> So I'm not sure the previous issues were fully resolved :/

  File "/srv/pokybuild/yocto-worker/meta-arm/build/meta/lib/oeqa/core/decorator/__init__.py", line 35, in wrapped_f
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/srv/pokybuild/yocto-worker/meta-arm/build/meta/lib/oeqa/runtime/cases/systemd.py", line 100, in test_systemd_failed
    output += self.systemctl('status --full --failed')
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/pokybuild/yocto-worker/meta-arm/build/meta/lib/oeqa/runtime/cases/systemd.py", line 26, in systemctl
    self.assertEqual(status, expected, message)
AssertionError: 3 != 0 : SYSTEMD_BUS_TIMEOUT=240s systemctl status --full --failed 
x psplash-systemd.service - Start psplash-systemd progress communication helper
     Loaded: loaded (/usr/lib/systemd/system/psplash-systemd.service; static)
     Active: failed (Result: exit-code) since Tue 2025-02-25 17:25:22 UTC; 12min ago
   Duration: 528ms
 Invocation: aa5767bf98e343ad9747b82b323d1ddb
   Main PID: 265 (code=exited, status=1/FAILURE)
   Mem peak: 1.1M
        CPU: 163ms

Feb 25 17:25:21 sbsa-ref systemd[1]: Started Start psplash-systemd progress communication helper.
Feb 25 17:25:22 sbsa-ref psplash-systemd[265]: Error unable to open fifo
Feb 25 17:25:22 sbsa-ref systemd[1]: psplash-systemd.service: Main process exited, code=exited, status=1/FAILURE
Feb 25 17:25:22 sbsa-ref systemd[1]: psplash-systemd.service: Failed with result 'exit-code'.

I tried to reproduce this with tens of test runs, looping over the tests.
I started a VNC server like the autobuilder does:

$ vncserver -kill :1
$ vncserver

and looped over the tests with

$ DISPLAY=:1 bitbake -c testimage core-image-minimal core-image-weston core-image-sato

psplash has two services: psplash-start@fb0.service and psplash-systemd.service.
Both are started by udev when /dev/fb0 is detected. psplash-systemd.service is always
started after psplash-start@fb0.service, which creates the fifo in /run. Both services should
be started only after /run has been mounted. psplash-start@fb0.service uses sd-notify to tell
systemd that it has started, and it does so after the fifo file has been created. The missing
fifo error from psplash-systemd.service therefore means either that psplash-start@fb0.service
failed, which it did not, or that it exited without reporting errors. Alternatively, /dev/fb0
disappeared or stopped working, which could be caused by the vncserver failing. If the
vncserver fails, in effect disconnecting a "display" from the qemu machine, then I think this
is a test setup problem. The vncserver comes from the build host, not from the yocto build.

If vncserver is not running when qemu is started with graphics, the target does
not boot to a working SSH. If qemu was booted with graphics to the vncserver,
then stopping the vncserver hangs the whole qemu machine. So I think fatal issues
with vncserver would be visible as a different kind of error.

The psplash-systemd.c startup is really simple:

int main()
{
        sd_event *event;
        sd_event_source *event_source = NULL;
        int r;
        sigset_t ss;
        usec_t time_now;
        char *rundir;

        /* Open pipe for psplash */
        rundir = getenv("PSPLASH_FIFO_DIR");

        if (!rundir)
                rundir = "/run";

        chdir(rundir);

        if ((pipe_fd = open (PSPLASH_FIFO,O_WRONLY|O_NONBLOCK)) == -1) {
                fprintf(stderr, "Error unable to open fifo");
                exit(EXIT_FAILURE);
        }
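
The single error message above does not say why open() failed. As a rough
sketch only (open_fifo_nonblock() is a hypothetical helper, not psplash code),
errno can tell a missing fifo (ENOENT) apart from a fifo that exists but has
no reader (ENXIO), which is what a non-blocking write-only open of a fifo
returns when nothing holds the read end:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of psplash: open a fifo for writing
 * without blocking and report why the open failed.
 * Returns the file descriptor on success, or -errno on failure. */
int open_fifo_nonblock(const char *path)
{
        int fd = open(path, O_WRONLY | O_NONBLOCK);

        if (fd == -1) {
                int err = errno; /* save before stdio can change it */

                if (err == ENOENT)
                        fprintf(stderr, "%s: fifo is missing\n", path);
                else if (err == ENXIO)
                        fprintf(stderr, "%s: fifo has no reader\n", path);
                else
                        fprintf(stderr, "%s: open failed: %s\n",
                                path, strerror(err));
                return -err;
        }

        return fd;
}
```

With something like this in the log it would be possible to say whether the
fifo was already unlinked or whether psplash had only stopped reading it.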

But this does not get started if psplash-start@fb0.service did not start correctly,
so it must have started. Also, psplash-systemd.service only starts if the fifo file
is in /run:

After=psplash-start@fb0.service
Requires=psplash-start@fb0.service
RequiresMountsFor=/run
ConditionFileIsExecutable=/usr/bin/psplash
ConditionPathExists=/run/psplash_fifo

Now psplash-start@fb0.service can fail or exit, and thus unlink() the fifo file,
if there are errors and its main loop exits. This can happen at any time
due to /dev/fb0 errors, signals, or when something writes "QUIT" to the fifo.
These errors are not captured and the return value is zero.

I propose ignoring these errors in psplash-systemd.service. It tries
really hard to start only when /dev/fb0 and the fifo are there, but it can't handle all
errors from /dev/fb0 going missing or signals being sent to psplash-start@fb0.service.

The progress bar is effectively "best effort".
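
One possible shape for "best effort", assuming the change were made in the C
helper rather than in the unit file, is sketched below. best_effort_open() is
a hypothetical name and this is not a tested psplash patch. A fifo that is
gone (ENOENT) or has no reader (ENXIO) is treated as "psplash has already
exited" and mapped to a successful exit status, while any other error still
counts as a failure:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not psplash code: open the fifo for writing and,
 * on failure, choose the exit status a best-effort helper should use.
 * Returns the fd on success; returns -1 and sets *exit_status otherwise. */
int best_effort_open(const char *path, int *exit_status)
{
        int fd = open(path, O_WRONLY | O_NONBLOCK);
        int err;

        if (fd != -1)
                return fd;

        err = errno; /* save before stdio can change it */
        if (err == ENOENT || err == ENXIO) {
                /* fifo removed or no reader: nothing to update, not an error */
                fprintf(stderr, "psplash fifo unavailable (%s), skipping\n",
                        strerror(err));
                *exit_status = EXIT_SUCCESS;
        } else {
                fprintf(stderr, "Error unable to open fifo: %s\n",
                        strerror(err));
                *exit_status = EXIT_FAILURE;
        }

        return -1;
}
```

The caller would then call exit() with the chosen status instead of always
using EXIT_FAILURE, so systemd would not record the unit as failed when
psplash has simply gone away.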

Cheers,

-Mikko

