From: Don Zickus <dzickus@redhat.com>
To: Guenter Roeck <linux@roeck-us.net>
Cc: Dave Young <dyoung@redhat.com>,
linux-watchdog@vger.kernel.org, kexec@lists.infradead.org,
wim@iguana.be, LKML <linux-kernel@vger.kernel.org>,
vgoyal@redhat.com
Subject: Re: [RFC PATCH] watchdog: Add hook for kicking in kdump path
Date: Wed, 10 Apr 2013 10:20:55 -0400
Message-ID: <20130410142055.GW79013@redhat.com>
In-Reply-To: <20130410135123.GB15456@roeck-us.net>

On Wed, Apr 10, 2013 at 06:51:23AM -0700, Guenter Roeck wrote:
> On Wed, Apr 10, 2013 at 09:40:39AM -0400, Don Zickus wrote:
> > On Tue, Apr 09, 2013 at 09:07:58AM -0700, Guenter Roeck wrote:
> > > > > Just look for the use of mod_timer in the watchdog directory.
> > > >
> > > > So looking at the mod_timer logic in various drivers, it seems that,
> > > > regardless of whether the /dev/watchdog device is opened, a running
> > > > watchdog gets kicked automatically.
> > > >
> > > yes
> > >
> > > > This seems to mean we can avoid pulling in userspace pieces for this.
> > > > Just load the driver and the hardware starts getting kicked.
> > > >
> > > Only if it is already running. Also, you don't want to rely on it, because you
> > > lose protection against user space issues.
> >
> > IOW if something goes wrong with a runaway userspace app, the kernel
> > blindly continues to kick the watchdog, which masks the problem, right?
> >
> That would be wrong if any of the drivers does that. The kernel should stop
> kicking after the software timeout expires.
>
> For example, if the HW needs to be kicked every second, and the high level
> timeout is set to one minute, the driver should keep kicking the hardware
> watchdog for one minute and then stop doing it if /dev/watchdog was opened
> and userspace is silent.
Ah ok.
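
To check my reading of that rule, here is a minimal user-space sketch. It
is not taken from any driver; struct wd_sim and both helpers are
hypothetical names, and time is just a counter in seconds. It assumes the
timer fires every hw_interval seconds and that the driver keeps kicking
while the device is unopened, as discussed above.

```c
#include <stdbool.h>

/* Hypothetical state for one watchdog; not taken from any real driver. */
struct wd_sim {
	unsigned int hw_interval;    /* seconds between hardware kicks (nonzero) */
	unsigned int sw_timeout;     /* seconds userspace may stay silent */
	unsigned int last_user_ping; /* time of the last userspace ping */
	bool opened;                 /* /dev/watchdog has been opened */
};

/*
 * Decide, from the periodic timer, whether to kick the hardware now.
 * Assumes now >= last_user_ping.
 */
static bool wd_sim_should_kick(const struct wd_sim *wd, unsigned int now)
{
	if (!wd->opened)
		return true; /* nobody opened the device: keep the hardware alive */
	/* Opened: kick only while userspace pinged within the software timeout. */
	return now - wd->last_user_ping < wd->sw_timeout;
}

/* Count the hardware kicks issued from time 0 up to (not including) 'until'. */
static unsigned int wd_sim_run(const struct wd_sim *wd, unsigned int until)
{
	unsigned int kicks = 0;

	for (unsigned int now = 0; now < until; now += wd->hw_interval)
		if (wd_sim_should_kick(wd, now))
			kicks++;
	return kicks;
}
```

With hw_interval = 1, sw_timeout = 60, and the device opened at time 0
with no further pings, the kicks stop after one minute, and the hardware
is then free to reboot the machine.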
>
> > >
> > > A second use is if the hw watchdog needs to be pinged more often than user
> > > space can provide. Some of the HW watchdogs need a ping in one-second intervals
> > > or even faster.
> > >
> > > > Is that true? And if so, do all drivers detect if the hardware is already
> > > > running during their init? Or is it based on the first device open?
> > > >
> > > It is usually done in the probe function.
> >
> > Ok. Thanks for explaining how the softdog stuff works.
> >
> > However, we still have the problem that if the machine panics and we want
> > to jump into the kdump kernel, we need to 'kick' the watchdog one more
> > time. This provides us a sane sync point for determining how long we have
> > to load the watchdog driver in the second kernel before the hardware
> > reboots us. Otherwise the reboots are pretty random and nothing is
> > guaranteed.
> >
> > Hence the need for some sort of patch resembling the one I posted.
> >
> > Soooooooo, any thoughts about that patch and what changes I should make?
> > :-)
> >
> The FIXME is a problem, and I think the name and scope would have to be
> more generic (watchdog_kick ?). Also, it doesn't solve the problem
> of having multiple open watchdogs (my system has three, for example),
> and it doesn't check if the watchdog is running.
Ok. I didn't know the watchdog subsystem well enough, so I just took
stabs in the dark about how things should work. I appreciate the
feedback.
I could make the name more generic. I wasn't sure if the watchdog
community would frown on that. The FIXME is a problem: I am not sure how
to handle the 'fail' scenario (can't get the mutex with trylock). And I
have no idea how to even find out if multiple watchdogs are open on the
system. Is there a list I could walk? And with regard to 'watchdog is
running', I thought 'watchdog_active' would do that. But again, I could
be misreading the code.
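
To make the questions concrete, here is a rough user-space sketch of the
shape I have in mind. Everything in it is an assumption: struct wd_dev,
the linked list, and wd_kick_all() are hypothetical names, not watchdog
core API, and whether the core has a list to walk is exactly what I am
asking. An atomic flag stands in for the mutex, and a busy device is
simply skipped, which is only one possible answer to the 'fail' case.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; not the real watchdog core structure. */
struct wd_dev {
	atomic_flag busy;    /* stand-in for the per-device mutex */
	bool running;        /* hardware timer is currently armed */
	unsigned int pings;  /* number of kicks delivered so far */
	struct wd_dev *next; /* assumed list of registered devices */
};

/* Try to take the device without blocking; true on success. */
static bool wd_trylock(struct wd_dev *wd)
{
	return !atomic_flag_test_and_set(&wd->busy);
}

static void wd_unlock(struct wd_dev *wd)
{
	atomic_flag_clear(&wd->busy);
}

/*
 * Kick every running watchdog once, never blocking.
 * Returns the number of devices that were actually kicked.
 */
static unsigned int wd_kick_all(struct wd_dev *head)
{
	unsigned int kicked = 0;

	for (struct wd_dev *wd = head; wd != NULL; wd = wd->next) {
		if (!wd_trylock(wd))
			continue; /* lock held elsewhere: skip, do not wait */
		if (wd->running) {
			wd->pings++; /* a real driver would ping the hardware here */
			kicked++;
		}
		wd_unlock(wd);
	}
	return kicked;
}
```

Skipping a busy device keeps the panic path from blocking, at the cost
of leaving that one watchdog unkicked.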
Thanks for the feedback.
Cheers,
Don
>
> Guenter