qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* avocado test: log shows initial chunk of terminal output is missing
@ 2023-07-17 15:42 Peter Maydell
  2023-07-17 15:54 ` Daniel P. Berrangé
  0 siblings, 1 reply; 2+ messages in thread
From: Peter Maydell @ 2023-07-17 15:42 UTC (permalink / raw)
  To: QEMU Developers
  Cc: Cleber Rosa, Phil Mathieu-Daudé, Wainer dos Santos Moschetta,
	Beraldo Leal, John Snow

I'm looking at an avocado test that's failing intermittently
on s390 host:
machine_aarch64_sbsaref.py:Aarch64SbsarefMachine.test_sbsaref_edk2_firmware

This test looks for various things on the terminal output,
and it's failing because it hasn't seen the first thing it looks
for. The avocado log seems to show that the guest has booted
past that point and has produced a later chunk of the output,
and avocado appears to have simply lost the earlier output that
the test case is trying to match on.

Does this "some of the terminal output is missing"
sound familiar as a bug we already know about ?

2023-07-17 15:16:15,461 protocol         L0481 DEBUG| Transitioning
from 'Runstate.CONNECTING' to 'Runstate.RUNNING'.
2023-07-17 15:16:19,897 __init__         L0153 DEBUG| 3h
2023-07-17 15:16:19,929 __init__         L0153 DEBUG|
2023-07-17 15:16:19,936 __init__         L0153 DEBUG| QEMU SBSA-REF
Machinearm-virt2.00 GHz1.01024 MB RAM
2023-07-17 15:16:19,945 __init__         L0153 DEBUG|
2023-07-17 15:16:19,956 __init__         L0153 DEBUG|
2023-07-17 15:16:19,971 __init__         L0153 DEBUG| <Standard
English>            Select Language
                                    >Device Manager
                       >Boot Manager
          >Boot Maintenance Manager
                                                      Continue
                                        Reset
2023-07-17 15:16:19,974 __init__         L0153 DEBUG|
                                                               ^v=Move
Highlight                               <Enter>=Select Entry      This
is the option
2023-07-17 15:16:19,975 __init__         L0153 DEBUG| one adjusts to change
2023-07-17 15:16:19,975 __init__         L0153 DEBUG| the language for the
2023-07-17 15:16:19,975 __init__         L0153 DEBUG| current system
2023-07-17 15:16:19,976 __init__         L0153 DEBUG|
2023-07-17 15:16:19,976 __init__         L0153 DEBUG|
2023-07-17 15:16:19,976 __init__         L0153 DEBUG|
2023-07-17 15:16:19,977 __init__         L0153 DEBUG|
2023-07-17 15:16:19,977 __init__         L0153 DEBUG|
2023-07-17 15:16:19,977 __init__         L0153 DEBUG|
2023-07-17 15:16:19,978 __init__         L0153 DEBUG|
2023-07-17 15:16:19,978 __init__         L0153 DEBUG|
2023-07-17 15:19:11,248 stacktrace       L0039 ERROR|
2023-07-17 15:19:11,248 stacktrace       L0041 ERROR| Reproduced
traceback from:
/home/linux1/qemu/build/all/tests/venv/lib/python3.8/site-packages/avocado/core/test.py:770
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR| Traceback (most
recent call last):
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|   File
"/home/linux1/qemu/build/all/tests/avocado/machine_aarch64_sbsaref.py",
line 95, in test_sbsaref_edk2_firmware
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|
wait_for_console_pattern(self, "Booting Trusted Firmware")
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|   File
"/home/linux1/qemu/build/all/tests/avocado/avocado_qemu/__init__.py",
line 199, in wait_for_console_pattern
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|
_console_interaction(test, success_message, failure_message, None,
vm=vm)
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|   File
"/home/linux1/qemu/build/all/tests/avocado/avocado_qemu/__init__.py",
line 148, in _console_interaction
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|     msg =
console.readline().decode().strip()
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|   File
"/usr/lib/python3.8/socket.py", line 669, in readinto
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|     return
self._sock.recv_into(b)
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|   File
"/home/linux1/qemu/build/all/tests/venv/lib/python3.8/site-packages/avocado/plugins/runner.py",
line 77, in sigterm_handler
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR|     raise
RuntimeError("Test interrupted by SIGTERM")
2023-07-17 15:19:11,249 stacktrace       L0045 ERROR| RuntimeError:
Test interrupted by SIGTERM


You can tell data has been lost because that '3h' is part
of the trailing end of an ANSI escape sequence, I think.

Full job.log at
https://people.linaro.org/~peter.maydell/job.log

The failure is intermittent: when it works the relevant part
of the job.log looks like this:

2023-07-17 15:37:56,648 protocol         L0481 DEBUG| Transitioning
from 'Runstate.CONNECTING' to 'Runstate.RUNNING'.
2023-07-17 15:37:56,650 __init__         L0153 DEBUG| NOTICE:  Booting
Trusted Firmware
2023-07-17 15:37:56,650 __init__         L0153 DEBUG| NOTICE:  BL1:
v2.9(release):v2.9.0-51-gc0d8ee386
2023-07-17 15:37:56,650 __init__         L0153 DEBUG| NOTICE:  BL1:
Built : 16:44:16, May 30 2023
2023-07-17 15:37:56,653 __init__         L0153 DEBUG| NOTICE:  BL1: Booting BL2
2023-07-17 15:37:56,656 __init__         L0153 DEBUG| NOTICE:  BL2:
v2.9(release):v2.9.0-51-gc0d8ee386
2023-07-17 15:37:56,656 __init__         L0153 DEBUG| NOTICE:  BL2:
Built : 16:44:16, May 30 2023
2023-07-17 15:37:56,667 __init__         L0153 DEBUG| NOTICE:  BL1: Booting BL31
2023-07-17 15:37:56,668 __init__         L0153 DEBUG| NOTICE:  BL31:
v2.9(release):v2.9.0-51-gc0d8ee386
2023-07-17 15:37:56,668 __init__         L0153 DEBUG| NOTICE:  BL31:
Built : 16:44:16, May 30 2023
2023-07-17 15:37:56,671 __init__         L0153 DEBUG| UEFI firmware
(version 1.0 built at 17:14:57 on Mar 21 2023)
2023-07-17 15:37:57,593 __init__         L0153 DEBUG|
ESC[2JESC[04DESC[=3hESC[2JESC[09DESC[2JESC[04DESC[0mESC[30mESC[40m
2023-07-17 15:37:57,610 __init__         L0153 DEBUG|
ESC[01;01HESC[0mESC[34mESC[47m
2023-07-17 15:37:57,616 __init__         L0153 DEBUG|
ESC[01;01HESC[02;02HQEMU SBSA-REF MachineESC[03;02Harm-virtESC[44C2.00
GHzESC[04;02H1.0ESC[49C1024 MB RAM
ESC[05;02HESC[52CESC[06;02HESC[52CESC[0mESC[37mESC[40mESC[21;01H
2023-07-17 15:37:57,616 machine          L0617 DEBUG| Shutting down VM
appliance; timeout=30
2023-07-17 15:37:57,616 machine          L0541 DEBUG| Attempting
graceful termination
2023-07-17 15:37:57,616 machine          L0513 DEBUG| Closing console socket
2023-07-17 15:37:57,616 machine          L0551 DEBUG| Politely asking
QEMU to terminate

(the test is looking for some of those BL1: BL2: etc lines before it starts
looking for the "QEMU SBSA-REF Machine" text.)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: avocado test: log shows initial chunk of terminal output is missing
  2023-07-17 15:42 avocado test: log shows initial chunk of terminal output is missing Peter Maydell
@ 2023-07-17 15:54 ` Daniel P. Berrangé
  0 siblings, 0 replies; 2+ messages in thread
From: Daniel P. Berrangé @ 2023-07-17 15:54 UTC (permalink / raw)
  To: Peter Maydell
  Cc: QEMU Developers, Cleber Rosa, Phil Mathieu-Daudé,
	Wainer dos Santos Moschetta, Beraldo Leal, John Snow

On Mon, Jul 17, 2023 at 04:42:30PM +0100, Peter Maydell wrote:
> I'm looking at an avocado test that's failing intermittently
> on s390 host:
> machine_aarch64_sbsaref.py:Aarch64SbsarefMachine.test_sbsaref_edk2_firmware
> 
> This test looks for various things on the terminal output,
> and it's failing because it hasn't seen the first thing it looks
> for. The avocado log seems to show that the guest has booted
> past that point and has produced a later chunk of the output,
> and avocado appears to have simply lost the earlier output that
> the test case is trying to match on.

Having had a quick look at the test code the test is calling
wait_for_console_pattern() to match console output. This
method calls _console_interaction(), which accesses the property
vm.console_socket. The code in this method does a lazy create
of the socket:

    @property
    def console_socket(self) -> socket.socket:
        """
        Returns a socket connected to the console
        """
        if self._console_socket is None:
            self._console_socket = console_socket.ConsoleSocket(
                self._console_address,
                file=self._console_log_path,
                drain=self._drain_console)
        return self._console_socket

We're configuring the VM with a UNIX socket for the console. Even
if we moved the console_socket.ConsoleSocket() call directly into
the 'launch()' method this would still be inherantly racy.

The solution is the same as we prviously did for the QMP socket.
the QEMUMachine class needs to pre-create the UNIX socket and
pass the pre-opened FD into QEMU, which will guarantee no race
condition.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-07-17 15:54 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-17 15:42 avocado test: log shows initial chunk of terminal output is missing Peter Maydell
2023-07-17 15:54 ` Daniel P. Berrangé

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).