qemu-devel.nongnu.org archive mirror
* [PATCH] tests/qtest: Poll on waitpid() for a while before sending SIGKILL
@ 2023-01-11 22:30 Stefan Berger
  2023-01-12  8:53 ` Daniel P. Berrangé
  2023-01-12  9:18 ` Philippe Mathieu-Daudé
  0 siblings, 2 replies; 5+ messages in thread
From: Stefan Berger @ 2023-01-11 22:30 UTC
  To: qemu-devel; +Cc: marcandre.lureau, peter.maydell, berrange, Stefan Berger

To avoid getting stuck in waitpid() when the target process does
not terminate on SIGTERM, poll waitpid() for 10s and, if the target
process has not changed state by then, send it a SIGKILL.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
---
 tests/qtest/libqtest.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index 2fbc3b88f3..362b1f724f 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -202,8 +202,24 @@ void qtest_wait_qemu(QTestState *s)
 {
 #ifndef _WIN32
     pid_t pid;
+    uint64_t end;
+
+    /* poll for 10s until sending SIGKILL */
+    end = g_get_monotonic_time() + 10 * G_TIME_SPAN_SECOND;
+
+    do {
+        pid = waitpid(s->qemu_pid, &s->wstatus, WNOHANG);
+        if (pid != 0) {
+            break;
+        }
+        g_usleep(100 * 1000);
+    } while (g_get_monotonic_time() < end);
+
+    if (pid == 0) {
+        kill(s->qemu_pid, SIGKILL);
+        TFR(pid = waitpid(s->qemu_pid, &s->wstatus, 0));
+    }
 
-    TFR(pid = waitpid(s->qemu_pid, &s->wstatus, 0));
     assert(pid == s->qemu_pid);
 #else
     DWORD ret;
-- 
2.39.0
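
For readers who want to try the poll-then-kill pattern outside the QEMU tree, here is a minimal standalone sketch. It is not the patch itself: `wait_or_kill()` and `monotonic_us()` are hypothetical names, plain POSIX `clock_gettime()` and `nanosleep()` stand in for `g_get_monotonic_time()` and `g_usleep()`, and an explicit EINTR retry loop stands in for the `TFR()` macro. Unlike the patch, it also returns early if `waitpid()` reports an error.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in microseconds; stands in for g_get_monotonic_time(). */
static int64_t monotonic_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/*
 * Hypothetical helper: poll waitpid() every 100 ms for up to timeout_s
 * seconds. If the child has not changed state by then, send SIGKILL and
 * block until it is reaped. Returns the reaped pid, or -1 on error.
 */
static pid_t wait_or_kill(pid_t child, int *wstatus, int timeout_s)
{
    const struct timespec delay = { 0, 100 * 1000 * 1000 }; /* 100 ms */
    int64_t end;
    pid_t pid;

    assert(timeout_s >= 0);
    end = monotonic_us() + (int64_t)timeout_s * 1000000;

    do {
        pid = waitpid(child, wstatus, WNOHANG);
        if (pid > 0 || (pid < 0 && errno != EINTR)) {
            return pid; /* child reaped, or a real error */
        }
        nanosleep(&delay, NULL);
    } while (monotonic_us() < end);

    /* Timed out: SIGKILL cannot be caught or ignored. */
    kill(child, SIGKILL);
    do {
        pid = waitpid(child, wstatus, 0);
    } while (pid < 0 && errno == EINTR);

    return pid;
}
```

A caller would typically send SIGTERM first and then call `wait_or_kill(child, &status, 10)`; `WIFSIGNALED(status)` with `WTERMSIG(status) == SIGKILL` then indicates that the timeout path was taken.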




* Re: [PATCH] tests/qtest: Poll on waitpid() for a while before sending SIGKILL
  2023-01-11 22:30 [PATCH] tests/qtest: Poll on waitpid() for a while before sending SIGKILL Stefan Berger
@ 2023-01-12  8:53 ` Daniel P. Berrangé
  2023-01-12  9:18 ` Philippe Mathieu-Daudé
  1 sibling, 0 replies; 5+ messages in thread
From: Daniel P. Berrangé @ 2023-01-12  8:53 UTC
  To: Stefan Berger; +Cc: qemu-devel, marcandre.lureau, peter.maydell

On Wed, Jan 11, 2023 at 05:30:18PM -0500, Stefan Berger wrote:
> To avoid getting stuck in waitpid() when the target process does
> not terminate on SIGTERM, poll waitpid() for 10s and, if the target
> process has not changed state by then, send it a SIGKILL.
> 
> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
> ---
>  tests/qtest/libqtest.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)

Since this is a test suite and we know our CI system gets very
heavily loaded, I think we should wait more than 10 secs, to
ensure QEMU has time to finish, in particular to flush pending
I/O, which is most likely to delay things. If you bump the time
to 30 secs then

  Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

> 
> diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
> index 2fbc3b88f3..362b1f724f 100644
> --- a/tests/qtest/libqtest.c
> +++ b/tests/qtest/libqtest.c
> @@ -202,8 +202,24 @@ void qtest_wait_qemu(QTestState *s)
>  {
>  #ifndef _WIN32
>      pid_t pid;
> +    uint64_t end;
> +
> +    /* poll for 10s until sending SIGKILL */
> +    end = g_get_monotonic_time() + 10 * G_TIME_SPAN_SECOND;
> +
> +    do {
> +        pid = waitpid(s->qemu_pid, &s->wstatus, WNOHANG);
> +        if (pid != 0) {
> +            break;
> +        }
> +        g_usleep(100 * 1000);
> +    } while (g_get_monotonic_time() < end);
> +
> +    if (pid == 0) {
> +        kill(s->qemu_pid, SIGKILL);
> +        TFR(pid = waitpid(s->qemu_pid, &s->wstatus, 0));
> +    }
>  
> -    TFR(pid = waitpid(s->qemu_pid, &s->wstatus, 0));
>      assert(pid == s->qemu_pid);
>  #else
>      DWORD ret;
> -- 
> 2.39.0
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] tests/qtest: Poll on waitpid() for a while before sending SIGKILL
  2023-01-11 22:30 [PATCH] tests/qtest: Poll on waitpid() for a while before sending SIGKILL Stefan Berger
  2023-01-12  8:53 ` Daniel P. Berrangé
@ 2023-01-12  9:18 ` Philippe Mathieu-Daudé
  2023-01-12  9:54   ` Daniel P. Berrangé
  1 sibling, 1 reply; 5+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-01-12  9:18 UTC
  To: Stefan Berger, qemu-devel; +Cc: marcandre.lureau, peter.maydell, berrange

On 11/1/23 23:30, Stefan Berger wrote:
> To avoid getting stuck in waitpid() when the target process does
> not terminate on SIGTERM, poll waitpid() for 10s and, if the target
> process has not changed state by then, send it a SIGKILL.
> 
> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
> ---
>   tests/qtest/libqtest.c | 18 +++++++++++++++++-
>   1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
> index 2fbc3b88f3..362b1f724f 100644
> --- a/tests/qtest/libqtest.c
> +++ b/tests/qtest/libqtest.c
> @@ -202,8 +202,24 @@ void qtest_wait_qemu(QTestState *s)
>   {
>   #ifndef _WIN32
>       pid_t pid;
> +    uint64_t end;
> +
> +    /* poll for 10s until sending SIGKILL */
> +    end = g_get_monotonic_time() + 10 * G_TIME_SPAN_SECOND;

Maybe we could use getenv() to allow tuning / using a different value?

> +    do {
> +        pid = waitpid(s->qemu_pid, &s->wstatus, WNOHANG);
> +        if (pid != 0) {
> +            break;
> +        }
> +        g_usleep(100 * 1000);
> +    } while (g_get_monotonic_time() < end);
> +
> +    if (pid == 0) {
> +        kill(s->qemu_pid, SIGKILL);
> +        TFR(pid = waitpid(s->qemu_pid, &s->wstatus, 0));
> +    }
>   
> -    TFR(pid = waitpid(s->qemu_pid, &s->wstatus, 0));
>       assert(pid == s->qemu_pid);
>   #else
>       DWORD ret;
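
A rough sketch of what such a getenv()-based override could look like, for illustration only. The variable name `QTEST_KILL_TIMEOUT_S`, the helper `kill_timeout_seconds()`, and the 30 second fallback are all assumptions made for this example; none of them come from the patch or from QEMU.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Fallback used when the variable is unset or invalid (assumed value). */
#define DEFAULT_KILL_TIMEOUT_S 30

/*
 * Hypothetical helper: read the timeout in seconds from the environment.
 * Anything that is not a positive decimal integer yields the default.
 */
static int kill_timeout_seconds(void)
{
    const char *val = getenv("QTEST_KILL_TIMEOUT_S"); /* hypothetical name */
    char *endp;
    long n;

    if (val == NULL || *val == '\0') {
        return DEFAULT_KILL_TIMEOUT_S;
    }

    errno = 0;
    n = strtol(val, &endp, 10);
    if (errno != 0 || *endp != '\0' || n <= 0 || n > INT_MAX) {
        return DEFAULT_KILL_TIMEOUT_S;
    }

    return (int)n;
}
```

The result would replace the hard-coded 10 in the computation of `end`. The cost is one more knob that has to be documented and that CI could get wrong.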




* Re: [PATCH] tests/qtest: Poll on waitpid() for a while before sending SIGKILL
  2023-01-12  9:18 ` Philippe Mathieu-Daudé
@ 2023-01-12  9:54   ` Daniel P. Berrangé
  2023-01-12 10:28     ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel P. Berrangé @ 2023-01-12  9:54 UTC
  To: Philippe Mathieu-Daudé
  Cc: Stefan Berger, qemu-devel, marcandre.lureau, peter.maydell

On Thu, Jan 12, 2023 at 10:18:01AM +0100, Philippe Mathieu-Daudé wrote:
> On 11/1/23 23:30, Stefan Berger wrote:
> > To avoid getting stuck in waitpid() when the target process does
> > not terminate on SIGTERM, poll waitpid() for 10s and, if the target
> > process has not changed state by then, send it a SIGKILL.
> > 
> > Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
> > ---
> >   tests/qtest/libqtest.c | 18 +++++++++++++++++-
> >   1 file changed, 17 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
> > index 2fbc3b88f3..362b1f724f 100644
> > --- a/tests/qtest/libqtest.c
> > +++ b/tests/qtest/libqtest.c
> > @@ -202,8 +202,24 @@ void qtest_wait_qemu(QTestState *s)
> >   {
> >   #ifndef _WIN32
> >       pid_t pid;
> > +    uint64_t end;
> > +
> > +    /* poll for 10s until sending SIGKILL */
> > +    end = g_get_monotonic_time() + 10 * G_TIME_SPAN_SECOND;
> 
> Maybe we could use getenv() to allow tuning / using a different value?

I'd rather we picked a value large enough that it will work
reliably out of the box for all scenarios with no magic
env required. We're just trying to prevent infinite waits if
something unexpected happens. We don't need to use an
aggressively short value, as most users will never hit this
scenario. I think 30 seconds is large enough to be reliable
but we could easily go higher to 60/120 if we want to be
really really sure.


With regards,
Daniel




* Re: [PATCH] tests/qtest: Poll on waitpid() for a while before sending SIGKILL
  2023-01-12  9:54   ` Daniel P. Berrangé
@ 2023-01-12 10:28     ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 5+ messages in thread
From: Philippe Mathieu-Daudé @ 2023-01-12 10:28 UTC
  To: Daniel P. Berrangé
  Cc: Stefan Berger, qemu-devel, marcandre.lureau, peter.maydell

On 12/1/23 10:54, Daniel P. Berrangé wrote:
> On Thu, Jan 12, 2023 at 10:18:01AM +0100, Philippe Mathieu-Daudé wrote:
>> On 11/1/23 23:30, Stefan Berger wrote:
>>> To avoid getting stuck in waitpid() when the target process does
>>> not terminate on SIGTERM, poll waitpid() for 10s and, if the target
>>> process has not changed state by then, send it a SIGKILL.
>>>
>>> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
>>> ---
>>>    tests/qtest/libqtest.c | 18 +++++++++++++++++-
>>>    1 file changed, 17 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
>>> index 2fbc3b88f3..362b1f724f 100644
>>> --- a/tests/qtest/libqtest.c
>>> +++ b/tests/qtest/libqtest.c
>>> @@ -202,8 +202,24 @@ void qtest_wait_qemu(QTestState *s)
>>>    {
>>>    #ifndef _WIN32
>>>        pid_t pid;
>>> +    uint64_t end;
>>> +
>>> +    /* poll for 10s until sending SIGKILL */
>>> +    end = g_get_monotonic_time() + 10 * G_TIME_SPAN_SECOND;
>>
>> Maybe we could use getenv() to allow tuning / using a different value?
> 
> I'd rather we picked a value large enough that it will work
> reliably out of the box for all scenarios with no magic
> env required. We're just trying to prevent infinite waits if
> something unexpected happens. We don't need to use an
> aggressively short value, as most users will never hit this
> scenario. I think 30 seconds is large enough to be reliable
> but we could easily go higher to 60/120 if we want to be
> really really sure.

I read your other comment later and I agree with you.


