All of lore.kernel.org
 help / color / mirror / Atom feed
From: Don Zickus <dzickus@redhat.com>
To: Corey Minyard <minyard@acm.org>
Cc: openipmi-developer@lists.sourceforge.net, linux-watchdog@vger.kernel.org
Subject: Re: ipmi watchdog questions
Date: Fri, 2 May 2014 11:10:05 -0400	[thread overview]
Message-ID: <20140502151005.GG198341@redhat.com> (raw)
In-Reply-To: <5362E8FA.9050700@acm.org>

On Thu, May 01, 2014 at 07:38:18PM -0500, Corey Minyard wrote:
> On 05/01/2014 08:58 AM, Don Zickus wrote:
> > Hi Corey,
> >
> > I stumbled upon an issue with a partner of ours, where they booted their
> > machine and tried to load the ipmi_watchdog module by hand and it failed.
> >
> > The reason it failed was that the iTCO watchdog driver was already loaded
> > and it registered the misc device /dev/watchdog first.
> >
> > I looked at the ipmi watchdog driver and realized it was never converted
> > to the new watchdog framework where the watchdog_core module manages the
> > '/dev/watchdog' misc device.
> >
> > So being naive and not knowing much about IPMI, I decided to follow the
> > helpful document Documentation/watchdog/convert_drivers_to_kernel_api.txt
> > and convert the ipmi_watchdog to use the new watchdog framework.
> >
> > I ran into a few issues and then realized the driver itself never really
> > binds to any hardware, so it makes the conversion process a little more
> > challenging.
> >
> > So a few questions to you before I waste my time in this area:
> >
> > - Is there any prior history about why the ipmi_watchdog was never
> >   converted to the new watchdog framework?  Lack of interest? Technical
> > hurdles?
> 
> Mostly lack of interest, but there are some technical hurdles.

Hi Corey,

Thanks for all the responses.

> 
> It would be hard to implement some things.  The watchdog framework has
> no concept of pretimeouts.  And IPMI is message based, you send a
> message to a controller to do anything, and you have to wait for the
> response.  That doesn't work very well with the watchdog interface,
> which assumes you can do everything immediately.

I will defer this conversation to Guenter's expertise.  I am willing to hack
up any suggestions the both of you come up with here to see if everything
works well (same goes for the fasync/poll stuff).

> 
> >
> > - Is there a reason why the ipmi_watchdog is a seperate module as opposed
> >   to being called by ipmi_si?  It seems there shouldn't be an issue with
> > the watchdog always loaded, it just won't do anything until someone opens
> > it (from my understanding).  Also you would gain the ability to use the
> > shutdown/remove routines properly instead of the reboot/panic notifiers.
> 
> I'm not sure I understand this.  Why would you want it as part of
> ipmi_si?  ipmi_msghandler would be a little more logical, but IMHO still
> doesn't make sense.  It uses the IPMI interface, and the interface is
> designed to have multiple users.  Better to keep it separate because
> it's a separate function.
> 
> I also don't understand the comment about shutdown/remove instead of
> reboot/panic.  Can you elaborate on that?

So part of the problem with ipmi_watchdog is that it can't load
automatically (like a normal driver).  The reason is that it doesn't
attach to any device.  My suggestion to roll it into ipmi_si (or
ipmi_msghandler works too), was to help with the autoloading part.

To counter-argue the argument that a customer may not want the watchdog
running, I would argue that it does nothing until someone opens it the
first time.  So autoloading shouldn't have a big downside to it.

The second part I was trying to change is to remove the panic/reboot
notifiers and instead use proper shutdown/remove functions.  This would
make it easier to use the 'struct watchdog_device' pointer as the
pointer could be embedded in the per device struct.  Though on the other
hand, there is only ever one ipmi device per system, so maybe having it
global isn't a big deal.  I was trying to think of scalability issues.

> 
> >   In addition, passing the pointer to the 'struct watchdog_device' would be
> > easier if some of those extra pieces were not there (as opposed to making
> > it a global reference).
> >
> > - What does the fasync and poll calls do for a watchdog?
> 
> The IPMI watchdog has the ability to report a pretimeout at a specific
> amount of time before the final timeout, presumably to take some action
> before the system reboots.  the fasync and poll (and read) calls let
> this be reported to the user.
> 
> >
> > I'll start with that for now.
> >
> > I appreciate any feedback.  Currently we just implemented blacklisting the
> > iTCO watchdog driver to workaround this problem.  I thought we could do
> > better, hence my motivation to do work in this area.
> 
> It would be nice, yes.  I'm afraid to get all the functionality would be
> a lot of hacking on the watchdog framework or removal of function from
> the driver.

We can take it step by step and see how we can get there.  Again my goal
was to help a partner of ours get the ipmi_watchdog to play nice in their
system without resorting to blacklisting other watchdogs.  In addition, I
thought it would help with out of the box configuration if the
ipmi_watchdog could autoload if the ipmi pieces were loaded in the system
too instead of having to add an entry to /etc/modprobe.d.

Thanks for the help so far!

Cheers,
Don

      parent reply	other threads:[~2014-05-02 15:18 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-01 13:58 ipmi watchdog questions Don Zickus
2014-05-02  0:38 ` Corey Minyard
2014-05-02  1:11   ` Guenter Roeck
2014-05-02  4:38     ` Corey Minyard
2014-05-02 13:17       ` Guenter Roeck
2014-05-02 16:44         ` Don Zickus
2014-05-02 17:18           ` Guenter Roeck
2014-05-02 17:46             ` Don Zickus
2014-05-02 21:52               ` Corey Minyard
2014-05-02 23:20                 ` Guenter Roeck
2014-05-03  2:10                 ` Don Zickus
2014-05-03  2:51                   ` Corey Minyard
2014-05-03  4:23                   ` Guenter Roeck
2014-05-02 15:10   ` Don Zickus [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140502151005.GG198341@redhat.com \
    --to=dzickus@redhat.com \
    --cc=linux-watchdog@vger.kernel.org \
    --cc=minyard@acm.org \
    --cc=openipmi-developer@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.