linux-s390.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [kvm-unit-tests PATCH 0/3] Improvements for the s390x panic-loop tests
@ 2025-07-24 13:30 Thomas Huth
  2025-07-24 13:30 ` [kvm-unit-tests PATCH 1/3] s390x: Fix unreliable " Thomas Huth
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Thomas Huth @ 2025-07-24 13:30 UTC (permalink / raw)
  To: kvm, Nico Böhr
  Cc: Claudio Imbrenda, Janosch Frank, David Hildenbrand, linux-s390

The panic-loop tests sometimes fail in our downstream CI. The first
patch fixes this issue. While we're at it, add the tests to the
upstream CI, too (second patch). The third patch is a RFC: Is it ok
to drop our dependecy on "jq" for these tests by replacing it with "grep"?

Thomas Huth (3):
  s390x: Fix unreliable panic-loop tests
  .gitlab-ci.yml: Add the s390x panic-loop tests to the CI
  scripts/arch-run.bash: Drop the dependency on "jq"

 .gitlab-ci.yml        |  2 ++
 scripts/arch-run.bash | 27 +++++----------------------
 2 files changed, 7 insertions(+), 22 deletions(-)

-- 
2.50.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [kvm-unit-tests PATCH 1/3] s390x: Fix unreliable panic-loop tests
  2025-07-24 13:30 [kvm-unit-tests PATCH 0/3] Improvements for the s390x panic-loop tests Thomas Huth
@ 2025-07-24 13:30 ` Thomas Huth
  2025-07-28 14:03   ` Claudio Imbrenda
  2025-07-24 13:30 ` [kvm-unit-tests PATCH 2/3] .gitlab-ci.yml: Add the s390x panic-loop tests to the CI Thomas Huth
  2025-07-24 13:30 ` [kvm-unit-tests RFC PATCH 3/3] scripts/arch-run.bash: Drop the dependency on "jq" Thomas Huth
  2 siblings, 1 reply; 7+ messages in thread
From: Thomas Huth @ 2025-07-24 13:30 UTC (permalink / raw)
  To: kvm, Nico Böhr
  Cc: Claudio Imbrenda, Janosch Frank, David Hildenbrand, linux-s390

From: Thomas Huth <thuth@redhat.com>

In our CI, the s390x panic-loop-extint and panic-loop-pgm tests
are sometimes failing. Having a closer look, this seems to be caused
by ncat sometimes complaining about "Connection reset by peer" on stderr,
likely because QEMU terminated (due to the panic) before ncat could
properly tear down the connection. But having some output on stderr is
interpreted as test failure in qemu_fixup_return_code(), so the test is
marked as failed, even though the panic event occurred as expected.

To fix it, drop the usage of ncat here and simply handle the QMP
input and output via normal fifos instead. This has also the advantage
that we do not need an additional program for these tests anymore
that might not be available in the installation.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 scripts/arch-run.bash | 21 +++++----------------
 1 file changed, 5 insertions(+), 16 deletions(-)

diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
index c440f216..58e4f93f 100644
--- a/scripts/arch-run.bash
+++ b/scripts/arch-run.bash
@@ -63,14 +63,6 @@ qmp ()
 	echo '{ "execute": "qmp_capabilities" }{ "execute":' "$2" '}' | ncat -U $1
 }
 
-qmp_events ()
-{
-	while ! test -S "$1"; do sleep 0.1; done
-	echo '{ "execute": "qmp_capabilities" }{ "execute": "cont" }' |
-		ncat --no-shutdown -U $1 |
-		jq -c 'select(has("event"))'
-}
-
 filter_quiet_msgs ()
 {
 	grep -v "Now migrate the VM (quiet)" |
@@ -295,26 +287,23 @@ do_migration ()
 
 run_panic ()
 {
-	if ! command -v ncat >/dev/null 2>&1; then
-		echo "${FUNCNAME[0]} needs ncat (netcat)" >&2
-		return 77
-	fi
-
 	if ! command -v jq >/dev/null 2>&1; then
 		echo "${FUNCNAME[0]} needs jq" >&2
 		return 77
 	fi
 
 	trap 'trap - TERM ; kill 0 ; exit 2' INT TERM
-	trap 'rm -f ${qmp}' RETURN EXIT
+	trap 'rm -f ${qmp}.in ${qmp}.out' RETURN EXIT
 
 	qmp=$(mktemp -u -t panic-qmp.XXXXXXXXXX)
+	mkfifo ${qmp}.in ${qmp}.out
 
 	# start VM stopped so we don't miss any events
-	"$@" -chardev socket,id=mon,path=${qmp},server=on,wait=off \
+	"$@" -chardev pipe,id=mon,path=${qmp} \
 		-mon chardev=mon,mode=control -S &
+	echo '{ "execute": "qmp_capabilities" }{ "execute": "cont" }' > ${qmp}.in
 
-	panic_event_count=$(qmp_events ${qmp} | jq -c 'select(.event == "GUEST_PANICKED")' | wc -l)
+	panic_event_count=$(jq -c 'select(.event == "GUEST_PANICKED")' < ${qmp}.out | wc -l)
 	if [ "$panic_event_count" -lt 1 ]; then
 		echo "FAIL: guest did not panic"
 		ret=3
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [kvm-unit-tests PATCH 2/3] .gitlab-ci.yml: Add the s390x panic-loop tests to the CI
  2025-07-24 13:30 [kvm-unit-tests PATCH 0/3] Improvements for the s390x panic-loop tests Thomas Huth
  2025-07-24 13:30 ` [kvm-unit-tests PATCH 1/3] s390x: Fix unreliable " Thomas Huth
@ 2025-07-24 13:30 ` Thomas Huth
  2025-07-28 14:00   ` Claudio Imbrenda
  2025-07-24 13:30 ` [kvm-unit-tests RFC PATCH 3/3] scripts/arch-run.bash: Drop the dependency on "jq" Thomas Huth
  2 siblings, 1 reply; 7+ messages in thread
From: Thomas Huth @ 2025-07-24 13:30 UTC (permalink / raw)
  To: kvm, Nico Böhr
  Cc: Claudio Imbrenda, Janosch Frank, David Hildenbrand, linux-s390

From: Thomas Huth <thuth@redhat.com>

Now that these tests should work reliable, we should also run them
in our CI.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 .gitlab-ci.yml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index aa69ca59..7e3d4661 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -517,6 +517,8 @@ s390x-kvm:
       migration-sck
       migration-skey-parallel
       migration-skey-sequential
+      panic-loop-extint
+      panic-loop-pgm
       pfmf
       sclp-1g
       sclp-3g
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [kvm-unit-tests RFC PATCH 3/3] scripts/arch-run.bash: Drop the dependency on "jq"
  2025-07-24 13:30 [kvm-unit-tests PATCH 0/3] Improvements for the s390x panic-loop tests Thomas Huth
  2025-07-24 13:30 ` [kvm-unit-tests PATCH 1/3] s390x: Fix unreliable " Thomas Huth
  2025-07-24 13:30 ` [kvm-unit-tests PATCH 2/3] .gitlab-ci.yml: Add the s390x panic-loop tests to the CI Thomas Huth
@ 2025-07-24 13:30 ` Thomas Huth
  2025-07-28 13:59   ` Claudio Imbrenda
  2 siblings, 1 reply; 7+ messages in thread
From: Thomas Huth @ 2025-07-24 13:30 UTC (permalink / raw)
  To: kvm, Nico Böhr
  Cc: Claudio Imbrenda, Janosch Frank, David Hildenbrand, linux-s390

From: Thomas Huth <thuth@redhat.com>

For checking whether a panic event occurred, a simple "grep"
for the related text in the output is enough - it's very unlikely
that the output of QEMU will change. This way we can drop the
dependency on the program "jq" which might not be installed on
some systems.

Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 Marked as "RFC" since I'm a little bit torn here - on the one side,
 it's great to get rid of a dependency, on the other side, using
 grep might be a little bit less robust in case QEMU ever changes
 the layout of it's QMP output...

 scripts/arch-run.bash | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
index 58e4f93f..5abf2626 100644
--- a/scripts/arch-run.bash
+++ b/scripts/arch-run.bash
@@ -287,11 +287,6 @@ do_migration ()
 
 run_panic ()
 {
-	if ! command -v jq >/dev/null 2>&1; then
-		echo "${FUNCNAME[0]} needs jq" >&2
-		return 77
-	fi
-
 	trap 'trap - TERM ; kill 0 ; exit 2' INT TERM
 	trap 'rm -f ${qmp}.in ${qmp}.out' RETURN EXIT
 
@@ -303,8 +298,7 @@ run_panic ()
 		-mon chardev=mon,mode=control -S &
 	echo '{ "execute": "qmp_capabilities" }{ "execute": "cont" }' > ${qmp}.in
 
-	panic_event_count=$(jq -c 'select(.event == "GUEST_PANICKED")' < ${qmp}.out | wc -l)
-	if [ "$panic_event_count" -lt 1 ]; then
+	if ! grep -q '"event": "GUEST_PANICKED"' ${qmp}.out ; then
 		echo "FAIL: guest did not panic"
 		ret=3
 	else
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests RFC PATCH 3/3] scripts/arch-run.bash: Drop the dependency on "jq"
  2025-07-24 13:30 ` [kvm-unit-tests RFC PATCH 3/3] scripts/arch-run.bash: Drop the dependency on "jq" Thomas Huth
@ 2025-07-28 13:59   ` Claudio Imbrenda
  0 siblings, 0 replies; 7+ messages in thread
From: Claudio Imbrenda @ 2025-07-28 13:59 UTC (permalink / raw)
  To: Thomas Huth
  Cc: kvm, Nico Böhr, Janosch Frank, David Hildenbrand, linux-s390

On Thu, 24 Jul 2025 15:30:51 +0200
Thomas Huth <thuth@redhat.com> wrote:

> From: Thomas Huth <thuth@redhat.com>
> 
> For checking whether a panic event occurred, a simple "grep"
> for the related text in the output is enough - it's very unlikely
> that the output of QEMU will change. This way we can drop the
> dependency on the program "jq" which might not be installed on
> some systems.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  Marked as "RFC" since I'm a little bit torn here - on the one side,
>  it's great to get rid of a dependency, on the other side, using
>  grep might be a little bit less robust in case QEMU ever changes
>  the layout of it's QMP output...

I share your concerns, but I'm also in favour of removing a relatively
uncommon dependency

> 
>  scripts/arch-run.bash | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
> index 58e4f93f..5abf2626 100644
> --- a/scripts/arch-run.bash
> +++ b/scripts/arch-run.bash
> @@ -287,11 +287,6 @@ do_migration ()
>  
>  run_panic ()
>  {
> -	if ! command -v jq >/dev/null 2>&1; then
> -		echo "${FUNCNAME[0]} needs jq" >&2
> -		return 77
> -	fi
> -
>  	trap 'trap - TERM ; kill 0 ; exit 2' INT TERM
>  	trap 'rm -f ${qmp}.in ${qmp}.out' RETURN EXIT
>  
> @@ -303,8 +298,7 @@ run_panic ()
>  		-mon chardev=mon,mode=control -S &
>  	echo '{ "execute": "qmp_capabilities" }{ "execute": "cont" }' > ${qmp}.in
>  
> -	panic_event_count=$(jq -c 'select(.event == "GUEST_PANICKED")' < ${qmp}.out | wc -l)
> -	if [ "$panic_event_count" -lt 1 ]; then
> +	if ! grep -q '"event": "GUEST_PANICKED"' ${qmp}.out ; then

maybe:
	grep -E -q '"event"[[:blank:]]*:[[:blank:]]*"GUEST PANICKED"' ...

>  		echo "FAIL: guest did not panic"
>  		ret=3
>  	else


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH 2/3] .gitlab-ci.yml: Add the s390x panic-loop tests to the CI
  2025-07-24 13:30 ` [kvm-unit-tests PATCH 2/3] .gitlab-ci.yml: Add the s390x panic-loop tests to the CI Thomas Huth
@ 2025-07-28 14:00   ` Claudio Imbrenda
  0 siblings, 0 replies; 7+ messages in thread
From: Claudio Imbrenda @ 2025-07-28 14:00 UTC (permalink / raw)
  To: Thomas Huth
  Cc: kvm, Nico Böhr, Janosch Frank, David Hildenbrand, linux-s390

On Thu, 24 Jul 2025 15:30:50 +0200
Thomas Huth <thuth@redhat.com> wrote:

> From: Thomas Huth <thuth@redhat.com>
> 
> Now that these tests should work reliable, we should also run them
> in our CI.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>

Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com>

> ---
>  .gitlab-ci.yml | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index aa69ca59..7e3d4661 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -517,6 +517,8 @@ s390x-kvm:
>        migration-sck
>        migration-skey-parallel
>        migration-skey-sequential
> +      panic-loop-extint
> +      panic-loop-pgm
>        pfmf
>        sclp-1g
>        sclp-3g


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [kvm-unit-tests PATCH 1/3] s390x: Fix unreliable panic-loop tests
  2025-07-24 13:30 ` [kvm-unit-tests PATCH 1/3] s390x: Fix unreliable " Thomas Huth
@ 2025-07-28 14:03   ` Claudio Imbrenda
  0 siblings, 0 replies; 7+ messages in thread
From: Claudio Imbrenda @ 2025-07-28 14:03 UTC (permalink / raw)
  To: Thomas Huth
  Cc: kvm, Nico Böhr, Janosch Frank, David Hildenbrand, linux-s390

On Thu, 24 Jul 2025 15:30:49 +0200
Thomas Huth <thuth@redhat.com> wrote:

> From: Thomas Huth <thuth@redhat.com>
> 
> In our CI, the s390x panic-loop-extint and panic-loop-pgm tests
> are sometimes failing. Having a closer look, this seems to be caused
> by ncat sometimes complaining about "Connection reset by peer" on stderr,
> likely because QEMU terminated (due to the panic) before ncat could
> properly tear down the connection. But having some output on stderr is
> interpreted as test failure in qemu_fixup_return_code(), so the test is
> marked as failed, even though the panic event occurred as expected.
> 
> To fix it, drop the usage of ncat here and simply handle the QMP
> input and output via normal fifos instead. This has also the advantage
> that we do not need an additional program for these tests anymore
> that might not be available in the installation.
> 
> Signed-off-by: Thomas Huth <thuth@redhat.com>

Acked-by: Claudio Imbrenda <imbrenda@linux.ibm.com>

> ---
>  scripts/arch-run.bash | 21 +++++----------------
>  1 file changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/scripts/arch-run.bash b/scripts/arch-run.bash
> index c440f216..58e4f93f 100644
> --- a/scripts/arch-run.bash
> +++ b/scripts/arch-run.bash
> @@ -63,14 +63,6 @@ qmp ()
>  	echo '{ "execute": "qmp_capabilities" }{ "execute":' "$2" '}' | ncat -U $1
>  }
>  
> -qmp_events ()
> -{
> -	while ! test -S "$1"; do sleep 0.1; done
> -	echo '{ "execute": "qmp_capabilities" }{ "execute": "cont" }' |
> -		ncat --no-shutdown -U $1 |
> -		jq -c 'select(has("event"))'
> -}
> -
>  filter_quiet_msgs ()
>  {
>  	grep -v "Now migrate the VM (quiet)" |
> @@ -295,26 +287,23 @@ do_migration ()
>  
>  run_panic ()
>  {
> -	if ! command -v ncat >/dev/null 2>&1; then
> -		echo "${FUNCNAME[0]} needs ncat (netcat)" >&2
> -		return 77
> -	fi
> -
>  	if ! command -v jq >/dev/null 2>&1; then
>  		echo "${FUNCNAME[0]} needs jq" >&2
>  		return 77
>  	fi
>  
>  	trap 'trap - TERM ; kill 0 ; exit 2' INT TERM
> -	trap 'rm -f ${qmp}' RETURN EXIT
> +	trap 'rm -f ${qmp}.in ${qmp}.out' RETURN EXIT
>  
>  	qmp=$(mktemp -u -t panic-qmp.XXXXXXXXXX)
> +	mkfifo ${qmp}.in ${qmp}.out
>  
>  	# start VM stopped so we don't miss any events
> -	"$@" -chardev socket,id=mon,path=${qmp},server=on,wait=off \
> +	"$@" -chardev pipe,id=mon,path=${qmp} \
>  		-mon chardev=mon,mode=control -S &
> +	echo '{ "execute": "qmp_capabilities" }{ "execute": "cont" }' > ${qmp}.in
>  
> -	panic_event_count=$(qmp_events ${qmp} | jq -c 'select(.event == "GUEST_PANICKED")' | wc -l)
> +	panic_event_count=$(jq -c 'select(.event == "GUEST_PANICKED")' < ${qmp}.out | wc -l)
>  	if [ "$panic_event_count" -lt 1 ]; then
>  		echo "FAIL: guest did not panic"
>  		ret=3


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-07-28 14:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-24 13:30 [kvm-unit-tests PATCH 0/3] Improvements for the s390x panic-loop tests Thomas Huth
2025-07-24 13:30 ` [kvm-unit-tests PATCH 1/3] s390x: Fix unreliable " Thomas Huth
2025-07-28 14:03   ` Claudio Imbrenda
2025-07-24 13:30 ` [kvm-unit-tests PATCH 2/3] .gitlab-ci.yml: Add the s390x panic-loop tests to the CI Thomas Huth
2025-07-28 14:00   ` Claudio Imbrenda
2025-07-24 13:30 ` [kvm-unit-tests RFC PATCH 3/3] scripts/arch-run.bash: Drop the dependency on "jq" Thomas Huth
2025-07-28 13:59   ` Claudio Imbrenda

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).