public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Guenter Roeck <linux@roeck-us.net>
To: Pratyush Anand <panand@redhat.com>,
	fu.wei@linaro.org, Suravee.Suthikulpanit@amd.com,
	timur@codeaurora.org, wim@iguana.be
Cc: linux-arm-kernel@lists.infradead.org,
	linux-watchdog@vger.kernel.org,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC] Watchdog: sbsa_gwdt: Enhance timeout range
Date: Tue, 3 May 2016 06:29:39 -0700	[thread overview]
Message-ID: <5728A7C3.4010001@roeck-us.net> (raw)
In-Reply-To: <20da73bb9bdf27993514c1da80fead13dc92932d.1462262900.git.panand@redhat.com>

On 05/03/2016 01:20 AM, Pratyush Anand wrote:
> Currently only WOR is used to program both first and second stage which
> provided very limited range of timeout.
>
> This patch uses WCV as well to achieve higher range of timeout. This patch
> programs max_timeout as 255, but that can be increased further as well.
>
> Following testing shows that we can happily achieve 40 second default timeout.
>
>   # modprobe sbsa_gwdt action=1
>   [  131.187562] sbsa-gwdt sbsa-gwdt.0: Initialized with 40s timeout @ 250000000 Hz, action=1.
>   # cd /sys/class/watchdog/watchdog0/
>   # cat state
>   inactive
>   # cat /dev/watchdog0
>   cat: /dev/watchdog0: Invalid argument
>   [  161.710593] watchdog: watchdog0: watchdog did not stop!
>   # cat state
>   active
>   # cat timeout
>   40
>   # cat timeleft
>   38
>   # cat timeleft
>   25
>   # cat /dev/watchdog0
>   cat: /dev/watchdog0: Invalid argument
>   [  184.931030] watchdog: watchdog0: watchdog did not stop!
>   # cat timeleft
>   37
>   # cat timeleft
>   21
>   ...
>   ...
>   # cat timeleft
>   1
>
> panic() is called upon timeout of 40s. See timestamp of last kick (cat) and
> next panic() message.
>
>   [  224.939065] Kernel panic - not syncing: SBSA Watchdog timeout
>
> Signed-off-by: Pratyush Anand <panand@redhat.com>

You could also use the new infrastructure (specify max_hw_heartbeat_ms instead
of max_timeout), and not depend on the correct implementation of WCV.

Overall this adds a lot of complexity for something that could by now easily
be handled by the infrastructure. Is this really necessary ?

Guenter

> ---
>   drivers/watchdog/sbsa_gwdt.c | 83 +++++++++++++++++++++++++++++++++-----------
>   1 file changed, 62 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/watchdog/sbsa_gwdt.c b/drivers/watchdog/sbsa_gwdt.c
> index ad383f6f15fc..529dd5e99fcd 100644
> --- a/drivers/watchdog/sbsa_gwdt.c
> +++ b/drivers/watchdog/sbsa_gwdt.c
> @@ -35,17 +35,23 @@
>    *
>    * SBSA GWDT:
>    * if action is 1 (the two stages mode):
> - * |--------WOR-------WS0--------WOR-------WS1
> + * |--------WCV-------WS0--------WCV-------WS1
>    * |----timeout-----(panic)----timeout-----reset
>    *
>    * if action is 0 (the single stage mode):
> - * |------WOR-----WS0(ignored)-----WOR------WS1
> + * |------WCV-----WS0(ignored)-----WOR------WS1
>    * |--------------timeout-------------------reset
>    *
> - * Note: Since this watchdog timer has two stages, and each stage is determined
> - * by WOR, in the single stage mode, the timeout is (WOR * 2); in the two
> - * stages mode, the timeout is WOR. The maximum timeout in the two stages mode
> - * is half of that in the single stage mode.
> + * Note: This watchdog timer has two stages. If action is 0, first stage is
> + * determined by directly programming WCV and second by WOR. When first
> + * timeout is reached, WS0 is triggered and WCV is reloaded with value in
> + * WOR. WS0 interrupt will be ignored, then the second watch period starts;
> + * when second timeout is reached, then WS1 is triggered, system resets. WCV
> + * and WOR are programmed in such a way that total time corresponding to
> + * WCV+WOR becomes equivalent to user programmed "timeout".
> + * If action is 1, then we expect to call panic() at user programmed
> + * "timeout". Therefore, we program both first and second stage using WCV
> + * only.
>    *
>    */
>
> @@ -95,7 +101,17 @@ struct sbsa_gwdt {
>   	void __iomem		*control_base;
>   };
>
> -#define DEFAULT_TIMEOUT		10 /* seconds */
> +/*
> + * Max Timeout Can be in days, but 255 seconds seems reasonable for all use
> + * cases
> + */
> +#define MAX_TIMEOUT		255

We don't usually define such arbitrary limits.

> +
> +/* Default timeout is 40 seconds, which is the 1st + 2nd watch periods when
> + * action is 0. When action is 1 then both 1st and 2nd watch periods will
> + * be of 40 seconds.
> + */
> +#define DEFAULT_TIMEOUT		40 /* seconds */
>
>   static unsigned int timeout;
>   module_param(timeout, uint, 0);
> @@ -127,20 +143,21 @@ static int sbsa_gwdt_set_timeout(struct watchdog_device *wdd,
>   				 unsigned int timeout)
>   {
>   	struct sbsa_gwdt *gwdt = watchdog_get_drvdata(wdd);
> +	u64 timeout_1, timeout_2;
>
>   	wdd->timeout = timeout;
>
>   	if (action)
> -		writel(gwdt->clk * timeout,
> -		       gwdt->control_base + SBSA_GWDT_WOR);
> +		timeout_1 = (u64)gwdt->clk * timeout;
>   	else
> -		/*
> -		 * In the single stage mode, The first signal (WS0) is ignored,
> -		 * the timeout is (WOR * 2), so the WOR should be configured
> -		 * to half value of timeout.
> -		 */
> -		writel(gwdt->clk / 2 * timeout,
> -		       gwdt->control_base + SBSA_GWDT_WOR);
> +		timeout_1 = (u64)gwdt->clk * (timeout - wdd->min_timeout);
> +
> +	/* when action=1, timeout_2 will be overwritten in ISR */
> +	timeout_2 = (u64)gwdt->clk * wdd->min_timeout;
> +
Why min_timeout ? Also, where is it overwritten in the interrupt handler,
and to which value ?

> +	writel(timeout_2, gwdt->control_base + SBSA_GWDT_WOR);
> +	writeq(timeout_1 + arch_counter_get_cntvct(),
> +		gwdt->control_base + SBSA_GWDT_WCV);
>
>   	return 0;
>   }
> @@ -172,12 +189,17 @@ static int sbsa_gwdt_keepalive(struct watchdog_device *wdd)
>   	struct sbsa_gwdt *gwdt = watchdog_get_drvdata(wdd);
>
>   	/*
> -	 * Writing WRR for an explicit watchdog refresh.
> -	 * You can write anyting (like 0).
> +	 * play safe: program WOR with max value so that we have sufficient
> +	 * time to overwrite them after explicit refresh
>   	 */
> +	writel(U32_MAX, gwdt->control_base + SBSA_GWDT_WOR);
> +	/*
> +	* Writing WRR for an explicit watchdog refresh.
> +	* You can write anyting (like 0).
> +	*/

Please stick with standard multi-line comments.

>   	writel(0, gwdt->refresh_base + SBSA_GWDT_WRR);
>
> -	return 0;
> +	return sbsa_gwdt_set_timeout(wdd, wdd->timeout);;
>   }
>
>   static unsigned int sbsa_gwdt_status(struct watchdog_device *wdd)
> @@ -193,10 +215,15 @@ static int sbsa_gwdt_start(struct watchdog_device *wdd)
>   {
>   	struct sbsa_gwdt *gwdt = watchdog_get_drvdata(wdd);
>
> +	/*
> +	 * play safe: program WOR with max value so that we have sufficient
> +	 * time to overwrite them after explicit refresh
> +	 */
> +	writel(U32_MAX, gwdt->control_base + SBSA_GWDT_WOR);
>   	/* writing WCS will cause an explicit watchdog refresh */
>   	writel(SBSA_GWDT_WCS_EN, gwdt->control_base + SBSA_GWDT_WCS);
>
> -	return 0;
> +	return sbsa_gwdt_set_timeout(wdd, wdd->timeout);;
>   }
>
>   static int sbsa_gwdt_stop(struct watchdog_device *wdd)
> @@ -211,6 +238,20 @@ static int sbsa_gwdt_stop(struct watchdog_device *wdd)
>
>   static irqreturn_t sbsa_gwdt_interrupt(int irq, void *dev_id)
>   {
> +	struct sbsa_gwdt *gwdt = (struct sbsa_gwdt *)dev_id;
> +	struct watchdog_device *wdd = &gwdt->wdd;
> +	u64 timeout_2 = (u64)gwdt->clk * wdd->timeout;
> +
> +	/*
> +	 * Since we can not trust system at this moment, therefore re-write
> +	 * WCV only if wdd->timeout <= MAX_TIMEOUT to avoid a corner
> +	 * scenario when we might have corrupted wdd->timeout values at
> +	 * this point.
> +	 */

This is quite vague. What is this corner scenario where wdd->timeout
would be corrupted ? How can wdd->timeout ever be larger than MAX_TIMEOUT ?

> +	if (wdd->timeout <= MAX_TIMEOUT)
> +		writeq(timeout_2 + arch_counter_get_cntvct(),
> +			gwdt->control_base + SBSA_GWDT_WCV);
> +
>   	panic(WATCHDOG_NAME " timeout");
>
>   	return IRQ_HANDLED;
> @@ -273,7 +314,7 @@ static int sbsa_gwdt_probe(struct platform_device *pdev)
>   	wdd->info = &sbsa_gwdt_info;
>   	wdd->ops = &sbsa_gwdt_ops;
>   	wdd->min_timeout = 1;
> -	wdd->max_timeout = U32_MAX / gwdt->clk;
> +	wdd->max_timeout = MAX_TIMEOUT;
>   	wdd->timeout = DEFAULT_TIMEOUT;
>   	watchdog_set_drvdata(wdd, gwdt);
>   	watchdog_set_nowayout(wdd, nowayout);
>

  parent reply	other threads:[~2016-05-03 13:30 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-03  8:20 [PATCH RFC] Watchdog: sbsa_gwdt: Enhance timeout range Pratyush Anand
2016-05-03 12:12 ` Timur Tabi
2016-05-03 13:24   ` Pratyush Anand
2016-05-03 13:47     ` Guenter Roeck
2016-05-03 14:17       ` Pratyush Anand
2016-05-03 14:46         ` Guenter Roeck
2016-05-03 15:04           ` Timur Tabi
2016-05-03 13:29 ` Guenter Roeck [this message]
2016-05-03 14:38   ` Pratyush Anand
2016-05-03 15:07     ` Timur Tabi
2016-05-03 15:51       ` Pratyush Anand
2016-05-03 17:16         ` Guenter Roeck
2016-05-04 14:14           ` Pratyush Anand
2016-05-04 14:21             ` Timur Tabi
2016-05-04 15:59               ` Pratyush Anand
2016-05-04 16:17                 ` Timur Tabi
2016-05-05 16:43                   ` Guenter Roeck
2016-05-05 18:20                     ` Pratyush Anand
2016-05-05 18:22                       ` Timur Tabi
2016-05-05 23:36                         ` Guenter Roeck
2016-05-05 23:38                           ` Timur Tabi
2016-05-05 23:45                             ` Timur Tabi
2016-05-06  0:18                               ` Guenter Roeck
2016-05-05 23:51                             ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5728A7C3.4010001@roeck-us.net \
    --to=linux@roeck-us.net \
    --cc=Suravee.Suthikulpanit@amd.com \
    --cc=fu.wei@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-watchdog@vger.kernel.org \
    --cc=panand@redhat.com \
    --cc=timur@codeaurora.org \
    --cc=wim@iguana.be \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox