qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] tests/functional: more asset download robustness fixes
@ 2025-09-18 12:57 Daniel P. Berrangé
  2025-09-18 12:57 ` [PATCH 1/2] tests/functional: retry when seeing ConnectionError exception Daniel P. Berrangé
  2025-09-18 12:57 ` [PATCH 2/2] tests/functional: treat unknown exceptions as transient faults Daniel P. Berrangé
  0 siblings, 2 replies; 5+ messages in thread
From: Daniel P. Berrangé @ 2025-09-18 12:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Philippe Mathieu-Daudé, Daniel P. Berrangé



Daniel P. Berrangé (2):
  tests/functional: retry when seeing ConnectionError exception
  tests/functional: treat unknown exceptions as transient faults

 tests/functional/qemu_test/asset.py | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.50.1



^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] tests/functional: retry when seeing ConnectionError exception
  2025-09-18 12:57 [PATCH 0/2] tests/functional: more asset download robustness fixes Daniel P. Berrangé
@ 2025-09-18 12:57 ` Daniel P. Berrangé
  2025-09-18 12:59   ` Thomas Huth
  2025-09-18 12:57 ` [PATCH 2/2] tests/functional: treat unknown exceptions as transient faults Daniel P. Berrangé
  1 sibling, 1 reply; 5+ messages in thread
From: Daniel P. Berrangé @ 2025-09-18 12:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Philippe Mathieu-Daudé, Daniel P. Berrangé

This base class is used for many different socket connection
errors, corresponding to ECONNRESET, ECONNREFUSED, ECONNABORTED
and more. Most of these are things you might expect to see every
now and then as transient flaws. We should thus retry the asset
download when seeing them.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/functional/qemu_test/asset.py | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/tests/functional/qemu_test/asset.py b/tests/functional/qemu_test/asset.py
index 2dd32bf28d..f8b87d2153 100644
--- a/tests/functional/qemu_test/asset.py
+++ b/tests/functional/qemu_test/asset.py
@@ -179,6 +179,13 @@ def fetch(self):
                                self.url, e.reason)
                 raise AssetError(self, "Unable to download: URL error %s" %
                                  e.reason, transient=True)
+            except ConnectionError as e:
+                # A socket connection failure, such as dropped conn
+                # or refused conn
+                tmp_cache_file.unlink()
+                self.log.error("Unable to download %s: Connection error %s",
+                               self.url, e)
+                continue
             except Exception as e:
                 tmp_cache_file.unlink()
                 raise AssetError(self, "Unable to download: %s" % e)
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] tests/functional: treat unknown exceptions as transient faults
  2025-09-18 12:57 [PATCH 0/2] tests/functional: more asset download robustness fixes Daniel P. Berrangé
  2025-09-18 12:57 ` [PATCH 1/2] tests/functional: retry when seeing ConnectionError exception Daniel P. Berrangé
@ 2025-09-18 12:57 ` Daniel P. Berrangé
  2025-09-18 13:01   ` Thomas Huth
  1 sibling, 1 reply; 5+ messages in thread
From: Daniel P. Berrangé @ 2025-09-18 12:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Thomas Huth, Philippe Mathieu-Daudé, Daniel P. Berrangé

To maximise the robustness of the functional tests we want to treat most
asset download failures as non-fatal to the test suite. Instead it
should just skip the tests which need that particular asset. The only
time aim to make it fatal is for 404 errors which are highly likely to
reflect genuine problems to be fixed.

We catch certain exception classes and handle them as transient errors,
but unfortunately it is proving difficult to predict what exception
classes urlopen() is capable of raising, with new possibilities being
discovered.

To provide a fail-safe, treat the generic Exception class as being a
transient error too. This may well mask certain genuine bugs, but it is
preferrable to prioritize running the test suite to the greatest extent
practical.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/functional/qemu_test/asset.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/functional/qemu_test/asset.py b/tests/functional/qemu_test/asset.py
index f8b87d2153..2971a989d1 100644
--- a/tests/functional/qemu_test/asset.py
+++ b/tests/functional/qemu_test/asset.py
@@ -188,7 +188,8 @@ def fetch(self):
                 continue
             except Exception as e:
                 tmp_cache_file.unlink()
-                raise AssetError(self, "Unable to download: %s" % e)
+                raise AssetError(self, "Unable to download: %s" % e,
+                                 transient=True)
 
         if not os.path.exists(tmp_cache_file):
             raise AssetError(self, "Download retries exceeded", transient=True)
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/2] tests/functional: retry when seeing ConnectionError exception
  2025-09-18 12:57 ` [PATCH 1/2] tests/functional: retry when seeing ConnectionError exception Daniel P. Berrangé
@ 2025-09-18 12:59   ` Thomas Huth
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Huth @ 2025-09-18 12:59 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel; +Cc: Philippe Mathieu-Daudé

On 18/09/2025 14.57, Daniel P. Berrangé wrote:
> This base class is used for many different socket connection
> errors, corresponding to ECONNRESET, ECONNREFUSED, ECONNABORTED
> and more. Most of these are things you might expect to see every
> now and then as transient flaws. We should thus retry the asset
> download when seeing them.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>   tests/functional/qemu_test/asset.py | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/tests/functional/qemu_test/asset.py b/tests/functional/qemu_test/asset.py
> index 2dd32bf28d..f8b87d2153 100644
> --- a/tests/functional/qemu_test/asset.py
> +++ b/tests/functional/qemu_test/asset.py
> @@ -179,6 +179,13 @@ def fetch(self):
>                                  self.url, e.reason)
>                   raise AssetError(self, "Unable to download: URL error %s" %
>                                    e.reason, transient=True)
> +            except ConnectionError as e:
> +                # A socket connection failure, such as dropped conn
> +                # or refused conn
> +                tmp_cache_file.unlink()
> +                self.log.error("Unable to download %s: Connection error %s",
> +                               self.url, e)
> +                continue
>               except Exception as e:
>                   tmp_cache_file.unlink()
>                   raise AssetError(self, "Unable to download: %s" % e)

Reviewed-by: Thomas Huth <thuth@redhat.com>



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] tests/functional: treat unknown exceptions as transient faults
  2025-09-18 12:57 ` [PATCH 2/2] tests/functional: treat unknown exceptions as transient faults Daniel P. Berrangé
@ 2025-09-18 13:01   ` Thomas Huth
  0 siblings, 0 replies; 5+ messages in thread
From: Thomas Huth @ 2025-09-18 13:01 UTC (permalink / raw)
  To: Daniel P. Berrangé, qemu-devel

On 18/09/2025 14.57, Daniel P. Berrangé wrote:
> To maximise the robustness of the functional tests we want to treat most
> asset download failures as non-fatal to the test suite. Instead it
> should just skip the tests which need that particular asset. The only
> time aim to make it fatal is for 404 errors which are highly likely to
> reflect genuine problems to be fixed.
> 
> We catch certain exception classes and handle them as transient errors,
> but unfortunately it is proving difficult to predict what exception
> classes urlopen() is capable of raising, with new possibilities being
> discovered.
> 
> To provide a fail-safe, treat the generic Exception class as being a
> transient error too. This may well mask certain genuine bugs, but it is
> preferrable to prioritize running the test suite to the greatest extent
> practical.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>   tests/functional/qemu_test/asset.py | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/functional/qemu_test/asset.py b/tests/functional/qemu_test/asset.py
> index f8b87d2153..2971a989d1 100644
> --- a/tests/functional/qemu_test/asset.py
> +++ b/tests/functional/qemu_test/asset.py
> @@ -188,7 +188,8 @@ def fetch(self):
>                   continue
>               except Exception as e:
>                   tmp_cache_file.unlink()
> -                raise AssetError(self, "Unable to download: %s" % e)
> +                raise AssetError(self, "Unable to download: %s" % e,
> +                                 transient=True)
>   
>           if not os.path.exists(tmp_cache_file):
>               raise AssetError(self, "Download retries exceeded", transient=True)

Reviewed-by: Thomas Huth <thuth@redhat.com>



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-09-18 13:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-18 12:57 [PATCH 0/2] tests/functional: more asset download robustness fixes Daniel P. Berrangé
2025-09-18 12:57 ` [PATCH 1/2] tests/functional: retry when seeing ConnectionError exception Daniel P. Berrangé
2025-09-18 12:59   ` Thomas Huth
2025-09-18 12:57 ` [PATCH 2/2] tests/functional: treat unknown exceptions as transient faults Daniel P. Berrangé
2025-09-18 13:01   ` Thomas Huth

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).