Openembedded Core Discussions
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: Khem Raj <raj.khem@gmail.com>
Cc: openembedded-core@lists.openembedded.org,
	richard.purdie@linuxfoundation.org,
	 alexandre.belloni@bootlin.com,
	Mikko Rapeli <mikko.rapeli@linaro.org>
Subject: Re: [OE-core] [RFC PATCH v2] qemurunner.py: continue to drain stdout after login:
Date: Tue, 12 Dec 2023 18:51:08 +0000	[thread overview]
Message-ID: <87fs07b6g3.fsf@draig.linaro.org> (raw)
In-Reply-To: <CAMKF1spkcbVQj=Mvv37TRnMWEcF5BuEjUYYnOmHmfrQkOqzo1g@mail.gmail.com> (Khem Raj's message of "Tue, 12 Dec 2023 10:26:31 -0800")

Khem Raj <raj.khem@gmail.com> writes:

> this version hangs the run forever.

Do you have a reproducer? In my "make test" case it works fine.

Alternatively you could gdb attach to the hanging python and use py-bt
to generate a python backtrace.

>
> On Mon, Dec 11, 2023 at 7:57 AM Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> If qemurunner doesn't continuously drain stdout we will eventually
>> cause QEMU to block while trying to write to the pipe. This can
>> manifest itself if the guest has for example configured its serial
>> ports to output via stdio even if the test itself is using a TCP
>> console or SSH to run things.
>>
>> To do this:
>>
>>   - always create a logging thread regardless of serial_ports
>>   - use a semaphore between main and logging threads
>>   - move the login detection into the logging thread
>>   - wait until the second acquire before continuing
>>
>> This doesn't address a potential overflow of stderr although generally
>> stderr from QEMU will be a lot less likely to block due to the volume
>> of data.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> Cc: Mikko Rapeli <mikko.rapeli@linaro.org>
>> ---
>>  meta/lib/oeqa/utils/qemurunner.py | 128 ++++++++++++++++++------------
>>  1 file changed, 78 insertions(+), 50 deletions(-)
>>
>> diff --git a/meta/lib/oeqa/utils/qemurunner.py b/meta/lib/oeqa/utils/qemurunner.py
>> index 29fe271976..b768c08f04 100644
>> --- a/meta/lib/oeqa/utils/qemurunner.py
>> +++ b/meta/lib/oeqa/utils/qemurunner.py
>> @@ -207,8 +207,7 @@ class QemuRunner:
>>          self.logger.info("QMP Available for connection at %s" % (qmp_port2))
>>
>>          try:
>> -            if self.serial_ports >= 2:
>> -                self.threadsock, threadport = self.create_socket()
>> +            self.threadsock, threadport = self.create_socket()
>>              self.server_socket, self.serverport = self.create_socket()
>>          except socket.error as msg:
>>              self.logger.error("Failed to create listening socket: %s" % msg[1])
>> @@ -243,6 +242,7 @@ class QemuRunner:
>>          # to be a proper fix but this will suffice for now.
>>          self.runqemu = subprocess.Popen(launch_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, stdin=subprocess.PIPE, preexec_fn=os.setpgrp, env=env, cwd=self.tmpdir)
>>          output = self.runqemu.stdout
>> +        output_drain = output
>>          launch_time = time.time()
>>
>>          #
>> @@ -431,21 +431,30 @@ class QemuRunner:
>>          self.logger.debug("Target IP: %s" % self.ip)
>>          self.logger.debug("Server IP: %s" % self.server_ip)
>>
>> -        if self.serial_ports >= 2:
>> -            self.thread = LoggingThread(self.log, self.threadsock, self.logger)
>> -            self.thread.start()
>> -            if not self.thread.connection_established.wait(self.boottime):
>> -                self.logger.error("Didn't receive a console connection from qemu. "
>> -                             "Here is the qemu command line used:\n%s\nand "
>> -                             "output from runqemu:\n%s" % (cmdline, out))
>> -                self.stop_thread()
>> -                return False
>> +        # Create and hold onto the login semaphore, this will block
>> +        # the LoggingThread until we are ready
>> +        login_semaphore = threading.Semaphore()
>> +        login_semaphore.acquire()
>> +
>> +        self.thread = LoggingThread(self.log, self.threadsock,
>> +                                    self.runqemu.stdout, self.boot_patterns['search_reached_prompt'],
>> +                                    self.logger, login_semaphore)
>> +
>> +        self.thread.start()
>> +        login_semaphore.release()
>> +
>> +        # if not self.thread.connection_established.wait(self.boottime):
>> +        #     self.logger.error("Didn't receive a console connection from qemu. "
>> +        #                       "Here is the qemu command line used:\n%s\nand "
>> +        #                       "output from runqemu:\n%s" % (cmdline, out))
>> +        #     self.stop_thread()
>> +        #     return False
>>
>>          self.logger.debug("Output from runqemu:\n%s", out)
>>          self.logger.debug("Waiting at most %d seconds for login banner (%s)" %
>>                            (self.boottime, time.strftime("%D %H:%M:%S")))
>>          endtime = time.time() + self.boottime
>> -        filelist = [self.server_socket, self.runqemu.stdout]
>> +        filelist = [self.server_socket]
>>          reachedlogin = False
>>          stopread = False
>>          qemusock = None
>> @@ -464,46 +473,19 @@ class QemuRunner:
>>                      filelist.remove(self.server_socket)
>>                      self.logger.debug("Connection from %s:%s" % addr)
>>                  else:
>> -                    # try to avoid reading only a single character at a time
>> -                    time.sleep(0.1)
>> -                    if hasattr(file, 'read'):
>> -                        read = file.read(1024)
>> -                    elif hasattr(file, 'recv'):
>> -                        read = file.recv(1024)
>> -                    else:
>> -                        self.logger.error('Invalid file type: %s\n%s' % (file))
>> -                        read = b''
>> -
>> -                    self.logger.debug2('Partial boot log:\n%s' % (read.decode('utf-8', errors='backslashreplace')))
>> -                    data = data + read
>> -                    if data:
>> -                        bootlog += data
>> -                        self.log(data, extension = ".2")
>> -                        data = b''
>> -
>> -                        if bytes(self.boot_patterns['search_reached_prompt'], 'utf-8') in bootlog:
>> -                            self.server_socket.close()
>> -                            self.server_socket = qemusock
>> -                            stopread = True
>> -                            reachedlogin = True
>> -                            self.logger.debug("Reached login banner in %.2f seconds (%s)" %
>> -                                              (time.time() - (endtime - self.boottime),
>> -                                              time.strftime("%D %H:%M:%S")))
>> -                    else:
>> -                        # no need to check if reachedlogin unless we support multiple connections
>> -                        self.logger.debug("QEMU socket disconnected before login banner reached. (%s)" %
>> -                                          time.strftime("%D %H:%M:%S"))
>> -                        filelist.remove(file)
>> -                        file.close()
>> +                    if login_semaphore.acquire(blocking=False):
>> +                        self.server_socket.close()
>> +                        self.server_socket = qemusock
>>                          stopread = True
>> +                        reachedlogin = True
>> +
>> +        self.logger.debug("continuing on....")
>>
>>          if not reachedlogin:
>>              if time.time() >= endtime:
>>                  self.logger.warning("Target didn't reach login banner in %d seconds (%s)" %
>>                                    (self.boottime, time.strftime("%D %H:%M:%S")))
>>              tail = lambda l: "\n".join(l.splitlines()[-25:])
>> -            bootlog = self.decode_qemulog(bootlog)
>> -            self.logger.warning("Last 25 lines of login console (%d):\n%s" % (len(bootlog), tail(bootlog)))
>>              self.logger.warning("Last 25 lines of all logging (%d):\n%s" % (len(self.msg), tail(self.msg)))
>>              self.logger.warning("Check full boot log: %s" % self.logfile)
>>              self.stop()
>> @@ -539,6 +521,7 @@ class QemuRunner:
>>                  self.logger.warning("The output:\n%s" % output)
>>          except:
>>              self.logger.warning("Serial console failed while trying to login")
>> +
>>          return True
>>
>>      def stop(self):
>> @@ -696,14 +679,16 @@ class QemuRunner:
>>                      status = 1
>>          return (status, str(data))
>>
>> -# This class is for reading data from a socket and passing it to logfunc
>> -# to be processed. It's completely event driven and has a straightforward
>> -# event loop. The mechanism for stopping the thread is a simple pipe which
>> -# will wake up the poll and allow for tearing everything down.
>> +# This class is for reading data from sockets and QEMU's stdio output
>> +# and passing it to logfunc to be processed. It's completely event
>> +# driven and has a straightforward event loop. The mechanism for
>> +# stopping the thread is a simple pipe which will wake up the poll and
>> +# allow for tearing everything down.
>>  class LoggingThread(threading.Thread):
>> -    def __init__(self, logfunc, sock, logger):
>> +    def __init__(self, logfunc, sock, qemu_stdio, prompt, logger, semaphore):
>>          self.connection_established = threading.Event()
>>          self.serversock = sock
>> +        self.qemu_stdio = qemu_stdio
>>          self.logfunc = logfunc
>>          self.logger = logger
>>          self.readsock = None
>> @@ -713,9 +698,19 @@ class LoggingThread(threading.Thread):
>>          self.errorevents = select.POLLERR | select.POLLHUP | select.POLLNVAL
>>          self.readevents = select.POLLIN | select.POLLPRI
>>
>> +        # tracking until we see prompt
>> +        self.prompt = prompt
>> +        self.prompt_seen = False
>> +        self.boot_log = b''
>> +        self.semaphore = semaphore
>> +        self.boottime = time.time()
>> +
>>          threading.Thread.__init__(self, target=self.threadtarget)
>>
>>      def threadtarget(self):
>> +        # we acquire until we see the boot prompt
>> +        self.semaphore.acquire()
>> +
>>          try:
>>              self.eventloop()
>>          finally:
>> @@ -750,6 +745,7 @@ class LoggingThread(threading.Thread):
>>          event_read_mask = self.errorevents | self.readevents
>>          poll.register(self.serversock.fileno())
>>          poll.register(self.readpipe, event_read_mask)
>> +        poll.register(self.qemu_stdio, event_read_mask)
>>
>>          breakout = False
>>          self.running = True
>> @@ -757,8 +753,12 @@ class LoggingThread(threading.Thread):
>>          while not breakout:
>>              events = poll.poll()
>>              for event in events:
>> +
>> +                self.logger.debug(f"event is {event}")
>> +
>>                  # An error occurred, bail out
>>                  if event[1] & self.errorevents:
>> +                    self.logger.debug("error event")
>>                      raise Exception(self.stringify_event(event[1]))
>>
>>                  # Event to stop the thread
>> @@ -767,6 +767,10 @@ class LoggingThread(threading.Thread):
>>                      breakout = True
>>                      break
>>
>> +                # stdio data to be logged
>> +                elif self.qemu_stdio.fileno() == event[0]:
>> +                    self.consume_qemu_stdio()
>> +
>>                  # A connection request was received
>>                  elif self.serversock.fileno() == event[0]:
>>                      self.logger.debug("Connection request received")
>> @@ -783,6 +787,30 @@ class LoggingThread(threading.Thread):
>>                      data = self.recv(1024)
>>                      self.logfunc(data)
>>
>> +
>> +    # Consume QEMU's stdio output, checking for login strings
>> +    def consume_qemu_stdio(self):
>> +        # try to avoid reading only a single character at a time
>> +        time.sleep(0.1)
>> +        read = self.qemu_stdio.read(1024)
>> +
>> +        # log what we have seen
>> +        decoded_data = read.decode('utf-8', errors='backslashreplace')
>> +        if self.prompt_seen:
>> +            self.logger.debug2('Post login log:\n%s' % decoded_data)
>> +        else:
>> +            self.logger.debug2('Pre login log:\n%s' % decoded_data)
>> +
>> +        if not self.prompt_seen and read:
>> +            self.boot_log += read
>> +
>> +            if bytes(self.prompt, 'utf-8') in self.boot_log:
>> +                time_to_login = time.time() - self.boottime
>> +                self.logger.debug("Reached login banner in %.2f seconds (%s)" %
>> +                                  (time_to_login,  time.strftime("%D %H:%M:%S")))
>> +                self.semaphore.release()
>> +                self.prompt_seen = True
>> +
>>      # Since the socket is non-blocking make sure to honor EAGAIN
>>      # and EWOULDBLOCK.
>>      def recv(self, count):
>> --
>> 2.39.2
>>
>>
>> -=-=-=-=-=-=-=-=-=-=-=-
>> Links: You receive all messages sent to this group.
>> View/Reply Online (#192177): https://lists.openembedded.org/g/openembedded-core/message/192177
>> Mute This Topic: https://lists.openembedded.org/mt/103111135/1997914
>> Group Owner: openembedded-core+owner@lists.openembedded.org
>> Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [raj.khem@gmail.com]
>> -=-=-=-=-=-=-=-=-=-=-=-
>>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


  reply	other threads:[~2023-12-12 18:51 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-11 15:57 [RFC PATCH v2] qemurunner.py: continue to drain stdout after login: Alex Bennée
2023-12-12 18:26 ` [OE-core] " Khem Raj
2023-12-12 18:51   ` Alex Bennée [this message]
2023-12-12 19:43     ` Khem Raj
2023-12-12 22:57       ` Alex Bennée
2023-12-18 11:23 ` Richard Purdie
     [not found] ` <17A1E9406CB748A4.24236@lists.openembedded.org>
2023-12-18 17:39   ` [OE-core] " Richard Purdie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fs07b6g3.fsf@draig.linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=alexandre.belloni@bootlin.com \
    --cc=mikko.rapeli@linaro.org \
    --cc=openembedded-core@lists.openembedded.org \
    --cc=raj.khem@gmail.com \
    --cc=richard.purdie@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox