From: Petri Latvala <petri.latvala@intel.com>
To: igt-dev@lists.freedesktop.org
Subject: Re: [igt-dev] [PATCH i-g-t 1/2] runner: Refactor timeouting
Date: Tue, 18 Feb 2020 11:08:33 +0200
Message-ID: <20200218090833.GN25209@platvala-desk.ger.corp.intel.com>
In-Reply-To: <20200217145042.829-1-petri.latvala@intel.com>

On Mon, Feb 17, 2020 at 04:50:41PM +0200, Petri Latvala wrote:
> Instead of aiming for inactivity_timeout and splitting that into
> suitable intervals for watchdog pinging, replace the whole logic with
> one-second select() timeouts and checking if we're reaching a timeout
> condition based on current time and the time passed since a particular
> event, be it the last activity or the time of signaling the child
> processes.
> 
> With the refactoring, we gain a couple of new features for free:
> 
> - use-watchdog now makes sense even without
> inactivity-timeout. Previously use-watchdog was silently ignored if
> inactivity-timeout was not set. Now, watchdogs will be used always if
> configured so, effectively ensuring the device gets rebooted if
> userspace dies without other timeout tracking.
> 
> - Killing tests early on kernel taint now happens even
> earlier. Previously on an inactive system we possibly waited for some
> tens of seconds before checking kernel taints.
> 
> Signed-off-by: Petri Latvala <petri.latvala@intel.com>
> ---
>  runner/executor.c | 224 +++++++++++++++++++++++-----------------------
>  1 file changed, 113 insertions(+), 111 deletions(-)
> 
> diff --git a/runner/executor.c b/runner/executor.c
> index 3ea5d167..33610c9e 100644
> --- a/runner/executor.c
> +++ b/runner/executor.c
> @@ -93,7 +93,7 @@ static void init_watchdogs(struct settings *settings)
>  
>  	memset(&watchdogs, 0, sizeof(watchdogs));
>  
> -	if (!settings->use_watchdog || settings->inactivity_timeout <= 0)
> +	if (!settings->use_watchdog)
>  		return;
>  
>  	if (settings->log_level >= LOG_LEVEL_VERBOSE) {
> @@ -672,6 +672,69 @@ static void show_kernel_task_state(void)
>  	sysrq('t');
>  }
>  
> +static const char *need_to_timeout(struct settings *settings,
> +				   int killed,
> +				   unsigned long taints,
> +				   double time_since_activity,
> +				   double time_since_kill)
> +{
> +	if (killed) {
> +		/*
> +		 * Timeout after being killed is a hardcoded amount
> +		 * depending on which signal we already used. The
> +		 * exception is SIGKILL which just immediately bails
> +		 * out if the kernel is tainted, because there's
> +		 * little to no hope of the process dying gracefully
> +		 * or at all.
> +		 *
> +		 * Note that if killed == SIGKILL, the caller needs
> +		 * special handling anyway and should ignore the
> +		 * actual string returned.
> +		 */
> +		const double kill_timeout = killed == SIGKILL ? 20.0 : 120.0;


Executing this code in my head a few times, I realized that before this
patch, although we used the exact same timeout values, we would wait
forever for a killed test to die as long as it (or the kernel) kept
producing output within that time. Now the timeout is counted from the
time of signaling, so we don't. I consider that a bugfix.
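
To spell out the difference, here is a minimal sketch. It is not the
actual runner code: the helper names are made up for illustration, the
"old" check is only a model of the behaviour described above, and the
strict ">" comparison is an assumption. Only the 20/120 second values
come from the quoted hunk.

```c
#include <signal.h>
#include <stdbool.h>

/* Timeout values as in the quoted hunk; the helper name is hypothetical. */
static double kill_timeout_for(int killed)
{
	return killed == SIGKILL ? 20.0 : 120.0;
}

/*
 * Model of the old behaviour: the clock effectively restarted on
 * any output, so only time since last activity mattered.
 */
static bool old_gives_up(int killed, double time_since_activity)
{
	return time_since_activity > kill_timeout_for(killed);
}

/*
 * Model of the new behaviour: the clock runs from the moment the
 * signal was sent, regardless of output produced since then.
 */
static bool new_gives_up(int killed, double time_since_kill)
{
	return time_since_kill > kill_timeout_for(killed);
}
```

With a test that was sent SIGTERM 121 seconds ago but printed something
one second ago, old_gives_up(SIGTERM, 1.0) is false while
new_gives_up(SIGTERM, 121.0) is true.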


-- 
Petri Latvala
_______________________________________________
igt-dev mailing list
igt-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/igt-dev
