From: Pratyush Anand <panand@redhat.com>
To: Guenter Roeck <linux@roeck-us.net>
Cc: fu.wei@linaro.org, Suravee.Suthikulpanit@amd.com,
	timur@codeaurora.org, wim@iguana.be,
	linux-arm-kernel@lists.infradead.org,
	linux-watchdog@vger.kernel.org,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC] Watchdog: sbsa_gwdt: Enhance timeout range
Date: Tue, 3 May 2016 20:08:56 +0530
Message-ID: <20160503143856.GE13045@dhcppc6.redhat.com>
In-Reply-To: <5728A7C3.4010001@roeck-us.net>

Hi Guenter,

On 03/05/2016:06:29:39 AM, Guenter Roeck wrote:
> On 05/03/2016 01:20 AM, Pratyush Anand wrote:
> >Currently only WOR is used to program both first and second stage which
> >provided very limited range of timeout.
> >
> >This patch uses WCV as well to achieve higher range of timeout. This patch
> >programs max_timeout as 255, but that can be increased further as well.
> >
> >Following testing shows that we can happily achieve 40 second default timeout.
> >
> >  # modprobe sbsa_gwdt action=1
> >  [  131.187562] sbsa-gwdt sbsa-gwdt.0: Initialized with 40s timeout @ 250000000 Hz, action=1.
> >  # cd /sys/class/watchdog/watchdog0/
> >  # cat state
> >  inactive
> >  # cat /dev/watchdog0
> >  cat: /dev/watchdog0: Invalid argument
> >  [  161.710593] watchdog: watchdog0: watchdog did not stop!
> >  # cat state
> >  active
> >  # cat timeout
> >  40
> >  # cat timeleft
> >  38
> >  # cat timeleft
> >  25
> >  # cat /dev/watchdog0
> >  cat: /dev/watchdog0: Invalid argument
> >  [  184.931030] watchdog: watchdog0: watchdog did not stop!
> >  # cat timeleft
> >  37
> >  # cat timeleft
> >  21
> >  ...
> >  ...
> >  # cat timeleft
> >  1
> >
> >panic() is called upon timeout of 40s. See timestamp of last kick (cat) and
> >next panic() message.
> >
> >  [  224.939065] Kernel panic - not syncing: SBSA Watchdog timeout
> >
> >Signed-off-by: Pratyush Anand <panand@redhat.com>
> 
> You could also use the new infrastructure (specify max_hw_heartbeat_ms instead
> of max_timeout), and not depend on the correct implementation of WCV.

Thanks for pointing me to max_hw_heartbeat_ms. I have just gone through it.
It would certainly help, and some parts of this patch will go away.

In fact, once max_hw_heartbeat_ms is supported, there should be no functional
change for action=0. However, we would still need some changes for action=1.

When action=1, the ISR is called, which calls panic(). Calling panic() will in
turn trigger the dump-saving mechanism, which can cause a secondary kernel to
be executed. With the limited timeout (max_hw_heartbeat_ms) programmed by the
first kernel, we might hit a reset before the secondary kernel can start
kicking the watchdog again or finish saving the dump.

So, in my opinion:
(1) We should use max_hw_heartbeat_ms.
(2) Then we should overwrite WCV in the ISR so that the hardware reset happens
only after the user-programmed "timeout" value.

~Pratyush
