From: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
To: Guenter Roeck <linux@roeck-us.net>
Cc: linux-watchdog@vger.kernel.org, Wim Van Sebroeck <wim@iguana.be>,
linux-kernel@vger.kernel.org,
Timo Kokkonen <timo.kokkonen@offcode.fi>,
linux-doc@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>
Subject: Re: [PATCH v5 1/8] watchdog: Introduce hardware maximum timeout in watchdog core
Date: Mon, 23 Nov 2015 19:26:08 +0100 [thread overview]
Message-ID: <20151123182608.GA19888@pengutronix.de> (raw)
In-Reply-To: <56533B80.4030400@roeck-us.net>
Hello Guenter,
On Mon, Nov 23, 2015 at 08:14:56AM -0800, Guenter Roeck wrote:
> On 11/22/2015 11:53 PM, Uwe Kleine-König wrote:
> >On Sun, Nov 22, 2015 at 07:20:58PM -0800, Guenter Roeck wrote:
> >>@@ -160,7 +176,11 @@ they are supported. These optional routines/operations are:
> >> and -EIO for "could not write value to the watchdog". On success this
> >> routine should set the timeout value of the watchdog_device to the
> >> achieved timeout value (which may be different from the requested one
> >>- because the watchdog does not necessarily has a 1 second resolution).
> >>+ because the watchdog does not necessarily have a 1 second resolution).
> >>+ Drivers implementing hw_max_timeout_ms set the hardware watchdog timeout
> >>+ to the minimum of timeout and hw_max_timeout_ms. Those drivers set the
> >
> >Actually this is something that the wdg core could abstract for drivers.
> >Oh well, apart from hw_max_timeout_ms having ms accuracy.
> >
> Not that sure. about the abstraction. The actual timeout to set depends on
> the hardware, and may have an unknown granularity. The watchdog core can
What I wanted to say is that the driver core can do (ignoring scaling
due to different units):
if (timeout > wdd->hw_max_timeout_ms)
wdd->set_timeout(wdd->hw_max_timeout_ms);
else
wdd->set_timeout(timeout)
So that the device specific driver can assume that timeout passed to its
set_timeout callback is always less than or equal to hw_max_timeout_ms.
And so it doesn't need to compare against hw_max_timeout_ms again.
> not predict what that granularity would be. We can play with it, and try
> if it can be done, but I really would like this to be a separate patch.
>
> hw_max_timeout is in ms because some watchdogs have a very low maximum
> HW timeout.
>
> >>+ timeout value of the watchdog_device either to the requested timeout value
> >>+ (if it is larger than hw_max_timeout_ms), or to the achieved timeout value.
> >> (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
> >> watchdog's info structure).
> >> * get_timeleft: this routines returns the time that's left before a reset.
> >>diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> >>index 56a649e66eb2..1dba3f57dba3 100644
> >>--- a/drivers/watchdog/watchdog_dev.c
> >>+++ b/drivers/watchdog/watchdog_dev.c
> >>[...]
> >>+static long watchdog_next_keepalive(struct watchdog_device *wdd)
> >>+{
> >>+ unsigned int timeout_ms = wdd->timeout * 1000;
> >>+ unsigned long keepalive_interval;
> >>+ unsigned long last_heartbeat;
> >>+ unsigned long virt_timeout;
> >>+ unsigned int hw_timeout_ms;
> >>+
> >>+ virt_timeout = wdd->last_keepalive + msecs_to_jiffies(timeout_ms);
> >
> >I think it's sensible to store
> >
> > last_keepalive + timeout
> >
> >(i.e. the expected expiration time) in struct watchdog_device instead of
> >last_keepalive. This moves the (maybe expensive) calculation to a
> >context that has userspace interaction anyhow. On the other hand this
> >complicates the set_timeout call. Hmm.
> >
>
> I would rather keep the code simple. It is not as if this is called
> all the time. Also, I need last_keepalive later in the series
> to determine if the minimum timeout between hardware pings has elapsed,
> so we would need both.
I'm not sure my suggestion improves the situation, but I will keep my
idea in mind and check once the dust settled.
> >>+ hw_timeout_ms = min(timeout_ms, wdd->max_hw_timeout_ms);
> >>+ keepalive_interval = msecs_to_jiffies(hw_timeout_ms / 2);
> >>+
> >>+ /*
> >>+ * To ensure that the watchdog times out wdd->timeout seconds
> >>+ * after the most recent ping from userspace, the last
> >>+ * worker ping has to come in hw_timeout_ms before this timeout.
> >>+ */
> >>+ last_heartbeat = virt_timeout - msecs_to_jiffies(hw_timeout_ms);
> >>+ return min_t(long, last_heartbeat - jiffies, keepalive_interval);
> >>+}
> >>[...]
> >>@@ -61,26 +137,25 @@ static struct watchdog_device *old_wdd;
> >>
> >> static int watchdog_ping(struct watchdog_device *wdd)
> >> {
> >>- int err = 0;
> >>+ int err;
> >>
> >> mutex_lock(&wdd->lock);
> >>+ wdd->last_keepalive = jiffies;
> >>+ err = _watchdog_ping(wdd);
> >>+ mutex_unlock(&wdd->lock);
> >>
> >>- if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
> >>- err = -ENODEV;
> >>- goto out_ping;
> >>- }
> >>+ return err;
> >>+}
> >>
> >>- if (!watchdog_active(wdd))
> >>- goto out_ping;
> >>+static void watchdog_ping_work(struct work_struct *work)
> >>+{
> >>+ struct watchdog_device *wdd;
> >>
> >>- if (wdd->ops->ping)
> >>- err = wdd->ops->ping(wdd); /* ping the watchdog */
> >>- else
> >>- err = wdd->ops->start(wdd); /* restart watchdog */
> >>+ wdd = container_of(to_delayed_work(work), struct watchdog_device, work);
> >>
> >>-out_ping:
> >>+ mutex_lock(&wdd->lock);
> >>+ _watchdog_ping(wdd);
> >> mutex_unlock(&wdd->lock);
> >>- return err;
> >
> >Calling this function might come after last_keepalive + timeout in which
> >case the watchdog shouldn't be pinged.
> >
> Unless the code is wrong, the last time this function is called should be
> at (timeout - max_hw_timeout), ie well before the timeout elapsed.
> Given that, I don't think this is something to be concerned about.
Depends on max_hw_timeout and the load of the machine if that can
happen. And thinking again, if the ping didn't come in until
last_keepalive + timeout, the machine is reset anyhow ...
> >> }
> >>
> >> /*
> >>@@ -107,8 +182,11 @@ static int watchdog_start(struct watchdog_device *wdd)
> >> goto out_start;
> >>
> >> err = wdd->ops->start(wdd);
> >>- if (err == 0)
> >>+ if (err == 0) {
> >> set_bit(WDOG_ACTIVE, &wdd->status);
> >>+ wdd->last_keepalive = jiffies;
> >>+ watchdog_update_worker(wdd, true);
> >>+ }
> >
> >I think it's more correct to sample jiffies before calling .start.
> >Something like:
> >
> > unsigned long started_at = jiffies;
> >
> > err = wdd->ops->start(wdd);
> > if (err == 0)
> > wdd->last_keepalive = jiffies;
> >
> I assume you mean
> wdd->last_keepalive = started_at;
right.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
next prev parent reply other threads:[~2015-11-23 18:26 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-23 3:20 [PATCH v5 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
2015-11-23 3:20 ` [PATCH v5 1/8] watchdog: Introduce hardware maximum timeout in watchdog core Guenter Roeck
2015-11-23 7:53 ` Uwe Kleine-König
2015-11-23 16:14 ` Guenter Roeck
2015-11-23 18:26 ` Uwe Kleine-König [this message]
2015-11-23 21:55 ` Guenter Roeck
2015-11-24 7:16 ` Uwe Kleine-König
2015-11-24 15:03 ` Guenter Roeck
2015-11-24 16:11 ` Uwe Kleine-König
2015-11-24 16:45 ` Guenter Roeck
2015-11-24 21:14 ` Uwe Kleine-König
2015-11-23 3:20 ` [PATCH v5 2/8] watchdog: Introduce WDOG_RUNNING flag Guenter Roeck
2015-11-23 16:30 ` Alexander Stein
2015-11-23 16:42 ` Guenter Roeck
2015-11-23 19:26 ` Uwe Kleine-König
2015-11-23 21:48 ` Guenter Roeck
2015-11-23 3:21 ` [PATCH v5 3/8] watchdog: Make set_timeout function optional Guenter Roeck
2015-11-23 3:21 ` [PATCH v5 4/8] watchdog: Add support for minimum time between heartbeats Guenter Roeck
2015-11-23 3:21 ` [PATCH v5 5/8] watchdog: Simplify update_worker Guenter Roeck
2015-11-24 7:13 ` Uwe Kleine-König
2015-11-24 7:25 ` Guenter Roeck
2015-11-23 3:21 ` [RFT PATCH v5 6/8] watchdog: imx2: Convert to use infrastructure triggered keepalives Guenter Roeck
2015-11-23 3:21 ` [RFT PATCH v5 7/8] watchdog: retu: " Guenter Roeck
2015-11-23 3:21 ` [RFT PATCH v5 8/8] watchdog: at91sam9: " Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151123182608.GA19888@pengutronix.de \
--to=u.kleine-koenig@pengutronix.de \
--cc=corbet@lwn.net \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-watchdog@vger.kernel.org \
--cc=linux@roeck-us.net \
--cc=timo.kokkonen@offcode.fi \
--cc=wim@iguana.be \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox