All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] oeqa,ssh: Handle SSHCall timeout error code
@ 2023-11-08 11:25 Luca Fancellu
  2023-11-09  9:35 ` [OE-core] " Alexandre Belloni
  0 siblings, 1 reply; 4+ messages in thread
From: Luca Fancellu @ 2023-11-08 11:25 UTC (permalink / raw)
  To: openembedded-core; +Cc: diego.sueiro, rahul.singh

The current code in ssh.py is terminating the ssh process that
does not finish its computation in a given timeout (when timeout
is passed), the SSHCall function is returning the process error
code.

The Openssl ssh before version 8.6_p1 is returning 0 when it is
terminated, from commit 8a9520836e71830f4fccca066dba73fea3d16bda
onwards (version >= 8.6_p1) ssh is returning 255 instead.

So for version of ssh older than 8.6_p1 when the SSHCall time out,
the return code will be 0, meaning success, which is wrong.

Fix this issue checking if the process has timeout (hence it's been
terminated) and checking if the returned code is 0, in that case
set it to 255 to advertise that an error occurred.

Add a test case exercising the timeout in the SSHTest, test_ssh
test function.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
---
 meta/lib/oeqa/core/target/ssh.py   | 21 ++++++++++++++++++++-
 meta/lib/oeqa/runtime/cases/ssh.py |  3 +++
 2 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py
index f22836d390ac..4fe763aaefba 100644
--- a/meta/lib/oeqa/core/target/ssh.py
+++ b/meta/lib/oeqa/core/target/ssh.py
@@ -224,6 +224,11 @@ class OESSHTarget(OETarget):
                 remoteDir = os.path.join(remotePath, tmpDir.lstrip("/"))
                 self.deleteDir(remoteDir)
 
+
+class SSHCallTimeout(Exception):
+    pass
+
+
 def SSHCall(command, logger, timeout=None, **opts):
 
     def run():
@@ -232,9 +237,9 @@ def SSHCall(command, logger, timeout=None, **opts):
         output_raw = b''
         starttime = time.time()
         process = subprocess.Popen(command, **options)
+        eof = False
         if timeout:
             endtime = starttime + timeout
-            eof = False
             os.set_blocking(process.stdout.fileno(), False)
             while time.time() < endtime and not eof:
                 try:
@@ -293,6 +298,11 @@ def SSHCall(command, logger, timeout=None, **opts):
                     pass
                 process.wait()
 
+        # Process has timed out, raise exception here so that the process at
+        # this point is already terminated/killed
+        if not eof:
+            raise SSHCallTimeout
+
     options = {
         "stdout": subprocess.PIPE,
         "stderr": subprocess.STDOUT,
@@ -313,6 +323,15 @@ def SSHCall(command, logger, timeout=None, **opts):
 
     try:
         run()
+    except SSHCallTimeout:
+        # Version of openssh before 8.6_p1 returns error code 0 when killed by
+        # a signal, when the timeout occurs we will receive a 0 error code
+        # because the process is been terminated and it's wrong because that
+        # value means success, but the process timed out.
+        # Afterwards, from version 8.6_p1 onwards, the returned code is 255.
+        # Fix this behaviour by checking the return code
+        if process.returncode == 0:
+            process.returncode = 255
     except:
         # Need to guard against a SystemExit or other exception ocurring
         # whilst running and ensure we don't leave a process behind.
diff --git a/meta/lib/oeqa/runtime/cases/ssh.py b/meta/lib/oeqa/runtime/cases/ssh.py
index 13aac5439698..cdbef595008c 100644
--- a/meta/lib/oeqa/runtime/cases/ssh.py
+++ b/meta/lib/oeqa/runtime/cases/ssh.py
@@ -13,6 +13,9 @@ class SSHTest(OERuntimeTestCase):
     @OETestDepends(['ping.PingTest.test_ping'])
     @OEHasPackage(['dropbear', 'openssh-sshd'])
     def test_ssh(self):
+        (status, output) = self.target.run('sleep 20', timeout=2)
+        msg='run() timed out but return code was zero.'
+        self.assertNotEqual(status, 0, msg=msg)
         (status, output) = self.target.run('uname -a')
         self.assertEqual(status, 0, msg='SSH Test failed: %s' % output)
         (status, output) = self.target.run('cat /etc/controllerimage')

base-commit: 6806bd23499aa66942c2b6b8fbc52dbec8ff8483
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [OE-core] [PATCH] oeqa,ssh: Handle SSHCall timeout error code
  2023-11-08 11:08 luca fancellu
@ 2023-11-08 13:22 ` Luca Fancellu
  0 siblings, 0 replies; 4+ messages in thread
From: Luca Fancellu @ 2023-11-08 13:22 UTC (permalink / raw)
  To: openembedded-core@lists.openembedded.org; +Cc: Diego Sueiro, Rahul Singh



> On 8 Nov 2023, at 11:08, luca fancellu via lists.openembedded.org <luca.fancellu=arm.com@lists.openembedded.org> wrote:
> 
> The current code in ssh.py is terminating the ssh process that
> does not finish its computation in a given timeout (when timeout
> is passed), the SSHCall function is returning the process error
> code.
> 
> The Openssl ssh before version 8.6_p1 is returning 0 when it is
> terminated, from commit 8a9520836e71830f4fccca066dba73fea3d16bda
> onwards (version >= 8.6_p1) ssh is returning 255 instead.
> 
> So for version of ssh older than 8.6_p1 when the SSHCall time out,
> the return code will be 0, meaning success, which is wrong.
> 
> Fix this issue checking if the process has timeout (hence it's been
> terminated) and checking if the returned code is 0, in that case
> set it to 255 to advertise that an error occurred.
> 
> Add a test case exercising the timeout in the SSHTest, test_ssh
> test function.
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---

Sorry, seems that I pushed twice this patch, ignore this.

Cheers,
Luca



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [OE-core] [PATCH] oeqa,ssh: Handle SSHCall timeout error code
  2023-11-08 11:25 [PATCH] oeqa,ssh: Handle SSHCall timeout error code Luca Fancellu
@ 2023-11-09  9:35 ` Alexandre Belloni
  2023-11-09 12:23   ` Luca Fancellu
  0 siblings, 1 reply; 4+ messages in thread
From: Alexandre Belloni @ 2023-11-09  9:35 UTC (permalink / raw)
  To: luca fancellu; +Cc: openembedded-core, diego.sueiro, rahul.singh

On 08/11/2023 11:25:47+0000, luca fancellu wrote:
> The current code in ssh.py is terminating the ssh process that
> does not finish its computation in a given timeout (when timeout
> is passed), the SSHCall function is returning the process error
> code.
> 
> The Openssl ssh before version 8.6_p1 is returning 0 when it is
> terminated, from commit 8a9520836e71830f4fccca066dba73fea3d16bda
> onwards (version >= 8.6_p1) ssh is returning 255 instead.
> 
> So for version of ssh older than 8.6_p1 when the SSHCall time out,
> the return code will be 0, meaning success, which is wrong.
> 
> Fix this issue checking if the process has timeout (hence it's been
> terminated) and checking if the returned code is 0, in that case
> set it to 255 to advertise that an error occurred.
> 
> Add a test case exercising the timeout in the SSHTest, test_ssh
> test function.
> 
> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
> ---
>  meta/lib/oeqa/core/target/ssh.py   | 21 ++++++++++++++++++++-
>  meta/lib/oeqa/runtime/cases/ssh.py |  3 +++
>  2 files changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py
> index f22836d390ac..4fe763aaefba 100644
> --- a/meta/lib/oeqa/core/target/ssh.py
> +++ b/meta/lib/oeqa/core/target/ssh.py
> @@ -224,6 +224,11 @@ class OESSHTarget(OETarget):
>                  remoteDir = os.path.join(remotePath, tmpDir.lstrip("/"))
>                  self.deleteDir(remoteDir)
>  
> +
> +class SSHCallTimeout(Exception):
> +    pass
> +
> +
>  def SSHCall(command, logger, timeout=None, **opts):
>  
>      def run():
> @@ -232,9 +237,9 @@ def SSHCall(command, logger, timeout=None, **opts):
>          output_raw = b''
>          starttime = time.time()
>          process = subprocess.Popen(command, **options)
> +        eof = False
>          if timeout:
>              endtime = starttime + timeout
> -            eof = False
>              os.set_blocking(process.stdout.fileno(), False)
>              while time.time() < endtime and not eof:
>                  try:
> @@ -293,6 +298,11 @@ def SSHCall(command, logger, timeout=None, **opts):
>                      pass
>                  process.wait()
>  
> +        # Process has timed out, raise exception here so that the process at
> +        # this point is already terminated/killed
> +        if not eof:
> +            raise SSHCallTimeout
> +
>      options = {
>          "stdout": subprocess.PIPE,
>          "stderr": subprocess.STDOUT,
> @@ -313,6 +323,15 @@ def SSHCall(command, logger, timeout=None, **opts):
>  
>      try:
>          run()
> +    except SSHCallTimeout:
> +        # Version of openssh before 8.6_p1 returns error code 0 when killed by
> +        # a signal, when the timeout occurs we will receive a 0 error code
> +        # because the process is been terminated and it's wrong because that
> +        # value means success, but the process timed out.
> +        # Afterwards, from version 8.6_p1 onwards, the returned code is 255.
> +        # Fix this behaviour by checking the return code
> +        if process.returncode == 0:
> +            process.returncode = 255

This breaks most of our tests:

https://autobuilder.yoctoproject.org/typhoon/#/builders/40/builds/8103/steps/13/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/76/builds/8054/steps/21/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/87/builds/6037/steps/14/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/8106/steps/15/logs/stdio

and more...

>      except:
>          # Need to guard against a SystemExit or other exception ocurring
>          # whilst running and ensure we don't leave a process behind.
> diff --git a/meta/lib/oeqa/runtime/cases/ssh.py b/meta/lib/oeqa/runtime/cases/ssh.py
> index 13aac5439698..cdbef595008c 100644
> --- a/meta/lib/oeqa/runtime/cases/ssh.py
> +++ b/meta/lib/oeqa/runtime/cases/ssh.py
> @@ -13,6 +13,9 @@ class SSHTest(OERuntimeTestCase):
>      @OETestDepends(['ping.PingTest.test_ping'])
>      @OEHasPackage(['dropbear', 'openssh-sshd'])
>      def test_ssh(self):
> +        (status, output) = self.target.run('sleep 20', timeout=2)
> +        msg='run() timed out but return code was zero.'
> +        self.assertNotEqual(status, 0, msg=msg)
>          (status, output) = self.target.run('uname -a')
>          self.assertEqual(status, 0, msg='SSH Test failed: %s' % output)
>          (status, output) = self.target.run('cat /etc/controllerimage')
> 
> base-commit: 6806bd23499aa66942c2b6b8fbc52dbec8ff8483
> -- 
> 2.34.1
> 

> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#190319): https://lists.openembedded.org/g/openembedded-core/message/190319
> Mute This Topic: https://lists.openembedded.org/mt/102461629/3617179
> Group Owner: openembedded-core+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub [alexandre.belloni@bootlin.com]
> -=-=-=-=-=-=-=-=-=-=-=-
> 


-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [OE-core] [PATCH] oeqa,ssh: Handle SSHCall timeout error code
  2023-11-09  9:35 ` [OE-core] " Alexandre Belloni
@ 2023-11-09 12:23   ` Luca Fancellu
  0 siblings, 0 replies; 4+ messages in thread
From: Luca Fancellu @ 2023-11-09 12:23 UTC (permalink / raw)
  To: Alexandre Belloni
  Cc: openembedded-core@lists.openembedded.org, Diego Sueiro,
	Rahul Singh


>> +    except SSHCallTimeout:
>> +        # Version of openssh before 8.6_p1 returns error code 0 when killed by
>> +        # a signal, when the timeout occurs we will receive a 0 error code
>> +        # because the process is been terminated and it's wrong because that
>> +        # value means success, but the process timed out.
>> +        # Afterwards, from version 8.6_p1 onwards, the returned code is 255.
>> +        # Fix this behaviour by checking the return code
>> +        if process.returncode == 0:
>> +            process.returncode = 255
> 
> This breaks most of our tests:
> 
> https://autobuilder.yoctoproject.org/typhoon/#/builders/40/builds/8103/steps/13/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/76/builds/8054/steps/21/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/87/builds/6037/steps/14/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/72/builds/8106/steps/15/logs/stdio
> 
> and more...


Hi Alexandre,

Sorry about that, I will push a v2 that fixes the issues

Cheers,
Luca




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-11-09 12:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-08 11:25 [PATCH] oeqa,ssh: Handle SSHCall timeout error code Luca Fancellu
2023-11-09  9:35 ` [OE-core] " Alexandre Belloni
2023-11-09 12:23   ` Luca Fancellu
  -- strict thread matches above, loose matches on Subject: below --
2023-11-08 11:08 luca fancellu
2023-11-08 13:22 ` [OE-core] " Luca Fancellu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.