public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Christer Weinigel <wingel@acolyte.hack.org>
To: Marcelo Tosatti <marcelo@conectiva.com.br>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [DRIVER][RFC] SC1200 Watchdog driver
Date: Tue, 26 Feb 2002 02:37:35 +0100 (CET)	[thread overview]
Message-ID: <20020226013735.0D309F5B@acolyte.hack.org> (raw)
In-Reply-To: <20020226012245.95555F5B@acolyte.hack.org> (message from Christer Weinigel on Tue, 26 Feb 2002 02:22:45 +0100 (CET))
In-Reply-To: <Pine.LNX.4.33L2.0202251413100.11464-100000@dragon.pdx.osdl.net> <20020226012245.95555F5B@acolyte.hack.org>

Hi,

this is a patch trying to document the Watchdog API in the kernel.
I'd suggest that it gets into the mainline kernel so that people read
it and (hopefully) also update it.

  /Christer

diff -urN linux.orig/Documentation/watchdog-api.txt linux/Documentation/watchdog-api.txt
--- /dev/null	Wed Dec 31 16:00:00 1969
+++ linux/Documentation/watchdog-api.txt	Tue Feb 26 01:16:58 2002
@@ -0,0 +1,365 @@
+The Linux Watchdog driver API.
+
+Copyright 2002 Christer Weingel <wingel@nano-system.com>
+
+Some parts of this document are copied verbatim from the sbc60xxwdt
+driver which is (c) Copyright 2000 Jakob Oestergaard <jakob@ostenfeld.dk>
+
+This document describes the state of the Linux 2.4.18 kernel.
+
+Introduction:
+
+A Watchdog Timer (WDT) is a hardware circuit that can reset the
+computer system in case of a software fault.  You probably knew that
+already.
+
+Usually a userspace daemon will notify the kernel watchdog driver via the
+/dev/watchdog special device file that userspace is still alive, at
+regular intervals.  When such a notification occurs, the driver will
+usually tell the hardware watchdog that everything is in order, and
+that the watchdog should wait for yet another little while to reset
+the system.  If userspace fails (RAM error, kernel bug, whatever), the
+notifications cease to occur, and the hardware watchdog will reset the
+system (causing a reboot) after the timeout occurs.
+
+The Linux watchdog API is a rather AD hoc construction and different
+drivers implement different, and sometimes incompatible, parts of it.
+This file is an attempt to document the existing usage and allow
+future driver writers to use it as a reference.
+
+The simplest API:
+
+All drivers support the basic mode of operation, where the watchdog
+activates as soon as /dev/watchdog is opened and will reboot unless
+the watchdog is pinged within a certain time, this time is called the
+timeout or margin.  The simplest way to ping the watchdog is to write
+some data to the device.  So a very simple watchdog daemon would look
+like this:
+
+int main(int argc, const char *argv[]) {
+	int fd=open("/dev/watchdog",O_WRONLY);
+	if (fd==-1) {
+		perror("watchdog");
+		exit(1);
+	}
+	while(1) {
+		write(fd, "\0", 1);
+		sleep(10);
+	}
+}
+
+A more advanced driver could for example check that a HTTP server is
+still responding before doing the write call to ping the watchdog.
+
+When the device is closed, the watchdog is disabled.  This is not
+always such a good idea, since if there is a bug in the watchdog
+daemon and it crashes the system will not reboot.  Because of this,
+some of the drivers support the configuration option "Disable watchdog
+shutdown on close", CONFIG_WATCHDOG_NOWAYOUT.  If it is set to Y when
+compiling the kernel, there is no way of disabling the watchdog once
+it has been started.  So, if the watchdog dameon crashes, the system
+will reboot after the timeout has passed.
+
+Some other drivers will not disable the watchdog, unless a specific
+magic character 'V' has been sent /dev/watchdog just before closing
+the file.  If the userspace daemon closes the file without sending
+this special character, the driver will assume that the daemon (and
+userspace in general) died, and will stop pinging the watchdog without
+disabling it first.  This will then cause a reboot.
+
+The ioctl API:
+
+Some drivers also support an ioctl API.  I belive this API was first
+used in the Berkshire PC Watchdog driver and has been adopted by the
+other drivers.  
+
+Pinging the watchdog using an ioctl:
+
+All drivers that have an ioctl interface support at least one ioctl,
+KEEPALIVE.  This ioctl does exactly the same thing as a write to the
+watchdog device, so the main loop in the above program could be
+replaced with:
+
+	while (1) {
+		ioctl(fd, WDIOC_KEEPALIVE, 0);
+		sleep(10);
+	}
+
+the argument to the ioctl is ignored.
+
+Setting and getting the timeout:
+
+For some drivers it is possible to modify the watchdog timeout on the
+fly with the SETTIMEOUT ioctl, those drivers have the WDIOF_SETTIMEOUT
+flag set in their option field.  The argument is an integer
+representing the timeout in seconds.  The driver returns the real
+timeout used in the same variable, and this timeout might differ from
+the requested one due to limitation of the hardware.
+
+    int timeout = 45;
+    ioctl(fd, WDIOC_SETTIMEOUT, &timeout);
+    printf("The timeout was set to %d seconds\n", timeout);
+
+This example might actually print "The timeout was set to 60 seconds"
+if the device has a granularity of minutes for its timeout.
+
+Starting with the Linux 2.4.18 kernel, it is possible to query the
+current timeout using the GETTIMEOUT ioctl.
+
+    ioctl(fd, WDIOC_GETTIMEOUT, &timeout);
+    printf("The timeout was is %d seconds\n", timeout);
+
+Envinronmental monitoring:
+
+Some watchdog drivers can return more information about the system,
+some do temperature, fan and power level monitoring, some can tell you
+the reason for the last reebot of the system.  The GETSUPPORT ioctl is
+available to ask what the device can do:
+
+	struct watchdog_info ident;
+	ioctl(fd, WDIOC_GETSUPPORT, &ident);
+
+the fields returned in the ident struct are:
+
+        identity		a string identifying the watchdog driver
+	firmware_version	the firmware version of the card if available
+	options			a flags describing what the device supports
+
+the options field can have the following bits set, and describes what
+kind of information that the GET_STATUS and GET_BOOT_STATUS ioctls can
+return.   [FIXME -- Is this correct?]
+
+	WDIOF_OVERHEAT		Reset due to CPU overheat
+	WDIOF_FANFAULT		Fan failed
+	WDIOF_EXTERN1		External relay 1
+	WDIOF_EXTERN2		External relay 2
+	WDIOF_POWERUNDER	Power bad/power fault
+	WDIOF_CARDRESET		Card previously reset the CPU
+	WDIOF_POWEROVER		Power over voltage
+	WDIOF_KEEPALIVEPING	Keep alive ping reply
+	WDIOF_SETTIMEOUT	Can set/get the timeout
+
+[FIXME -- Can somebody explain what the flags mean in more detail,
+especially KEEPALIVEPING?]
+
+For those drivers that return any bits set in the option field, the
+GETSTATUS and GETBOOTSTATUS ioctls can be used to ask for the current
+status, and the status at the last reboot, respectively.  
+
+    int flags;
+    ioctl(fd, WDIOC_GETSTATUS, &flags);
+
+    or
+
+    ioctl(fd, WDIOC_GETBOOTSTATUS, &flags);
+
+Note that not all devices support these two calls, and some only
+support the GETBOOTSTATUS call.
+
+Some drivers can measure the temperature using the GETTEMP ioctl.  The
+returned value is the temperature in degrees farenheit.
+
+    int temperature;
+    ioctl(fd, WDIOC_GETTEMP, &temperature);
+
+Finally the SETOPTIONS ioctl can be used to control some aspects of
+the cards operation; right now the pcwd driver is the only one
+supporting thiss ioctl.
+
+    int options = 0;
+    ioctl(fd, WDIOC_SETOPTIONS, options);
+
+The following options are available:
+
+	WDIOS_DISABLECARD	Turn off the watchdog timer
+	WDIOS_ENABLECARD	Turn on the watchdog timer
+	WDIOS_TEMPPANIC		Kernel panic on temperature trip
+
+[FIXME -- better explanations]
+
+Implementations in the current drivers in the kernel tree:
+
+Here I have tried to summarize what the different drivers support and
+where they do strange things compared to the other drivers.
+
+acquirewdt.c -- Acquire Single Board Computer
+
+	This driver has a hardcoded timeout of 1 minute
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	GETSUPPORT returns KEEPALIVEPING.  GETSTATUS will return 1 if
+	the device is open, 0 if not.  [FIXME -- isn't this rather
+	silly?  To be able to use the ioctl, the device must be open
+	and so GETSTATUS will always return 1].
+
+advantechwdt.c -- Advantech Single Board Computer
+
+	Timeout that defaults to 60 seconds, supports SETTIMEOUT.
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT.
+	The GETSTATUS call returns if the device is open or not.
+	[FIXME -- silliness again?]
+	
+eurotechwdt.c -- Eurotech CPU-1220/1410
+
+	The timeout can be set using the SETTIMEOUT ioctl and defaults
+	to 60 seconds.
+
+	Also has a module parameter "ev", event type which controls
+	what should happen on a timeout, the string "int" or anything
+	else that causes a reboot.  [FIXME -- better description]
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	GETSUPPORT returns CARDRESET and WDIOF_SETTIMEOUT but
+	GETSTATUS is not supported and GETBOOTSTATUS just returns 0.
+
+i810-tco.c -- Intel 810 chipset
+
+	Also has support for a lot of other i810 stuff, but the
+	watchdog is one of the things.
+
+	The timeout is set using the module parameter "i810_margin",
+	which is in steps of 0.6 seconds where 2<i810_margin<64.  The
+	driver supports the SETTIMEOUT ioctl.
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT.
+
+	GETSUPPORT returns WDIOF_SETTIMEOUT.  The GETSTATUS call
+	returns some kind of timer value which ist not compatible with
+	the other drivers.  GETBOOT status returns some kind of
+	hardware specific boot status.  [FIXME -- describe this]
+
+ib700wdt.c -- IB700 Single Board Computer
+
+	Default timeout of 30 seconds and the timeout is settable
+	using the SETTIMEOUT ioctl.  Note that only a few timeout
+	values are supported.
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	GETSUPPORT returns WDIOF_KEEPALIVEPING and WDIOF_SETTIMEOUT.
+	The GETSTATUS call returns if the device is open or not.
+	[FIXME -- silliness again?]
+
+machzwd.c -- MachZ ZF-Logic
+
+	Hardcoded timeout of 10 seconds
+
+	Has a module parameter "action" that controls what happens
+	when the timeout runs out which can be 0 = RESET (default), 
+	1 = SMI, 2 = NMI, 3 = SCI.
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT and the the magic character
+	'V' close handling.
+
+	GETSUPPORT returns WDIOF_KEEPALIVEPING, and the GETSTATUS call
+	returns if the device is open or not.  [FIXME -- silliness
+	again?]
+
+mixcomwd.c -- MixCom Watchdog
+
+	[FIXME -- I'm unable to tell what the timeout is]
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	GETSUPPORT returns WDIOF_KEEPALIVEPING, GETSTATUS returns if
+	the device is opened or not [FIXME -- I'm not really sure how
+	this works, there seems to be some magic connected to
+	CONFIG_WATCHDOG_NOWAYOUT]
+
+pcwd.c -- Berkshire PC Watchdog
+
+	Hardcoded timeout of 1.5 seconds
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	GETSUPPORT returns WDIOF_OVERHEAT|WDIOF_CARDRESET and both
+	GETSTATUS and GETBOOTSTATUS return something useful.
+
+	The SETOPTIONS call can be used to enable and disable the card
+	and to ask the driver to call panic if the system overheats.
+
+sbc60xxwdt.c -- 60xx Single Board Computer
+
+	Hardcoded timeout of 10 seconds
+
+	Does not support CONFIG_WATCHDOG_NOWAYOUT, but has the magic
+	character 'V' close handling.
+
+	No bits set in GETSUPPORT
+
+shwdt.c -- SuperH 3/4 processors
+
+	[FIXME -- I'm unable to tell what the timeout is]
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	GETSUPPORT returns WDIOF_KEEPALIVEPING, and the GETSTATUS call
+	returns if the device is open or not.  [FIXME -- silliness
+	again?]
+
+softdog.c -- Software watchdog
+
+	The timeout is set with the module parameter "soft_margin"
+	which defaults to 60 seconds, the timeout is also settable
+	using the SETTIMEOUT ioctl.
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	WDIOF_SETTIMEOUT bit set in GETSUPPORT
+
+w83877f_wdt.c -- W83877F Computer
+
+	Hardcoded timeout of 30 seconds
+
+	Does not support CONFIG_WATCHDOG_NOWAYOUT, but has the magic
+	character 'V' close handling.
+
+	No bits set in GETSUPPORT
+
+wdt.c -- ICS WDT500/501 ISA and
+wdt_pci.c -- ICS WDT500/501 PCI
+
+	Default timeout of 60 seconds.  The timeout is also settable
+        using the SETTIMEOUT ioctl.
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	GETSUPPORT returns with a lot of bits set, it seems that the
+	driver can check for overheating, power failure, fan failure
+	and a two external relay inputs.  GETSTATUS can be used to
+	read these flags.
+
+wdt285.c -- Footbridge watchdog
+
+	The timeout is set with the module parameter "soft_margin"
+	which defaults to 60 seconds.  The timeout is also settable
+	using the SETTIMEOUT ioctl.
+
+	Does not support CONFIG_WATCHDOG_NOWAYOUT
+
+	WDIOF_SETTIMEOUT bit set in GETSUPPORT
+
+wdt977.c -- Netwinder W83977AF chip
+
+	Hardcoded timeout of 3 minutes
+
+	Supports CONFIG_WATCHDOG_NOWAYOUT
+
+	Does not support any ioctls at all.
+
+scx200.c -- National SCx200 CPUs
+
+	Not in the kernel yet.
+
+	The timeout is set using a module parameter "margin" which
+	defaults to 60 seconds.  The timeout can also be set using
+	SETTIMEOUT and read using GETTIMEOUT.
+
+	Supports a module parameter "nowayout" that is initialized
+	with the value of CONFIG_WATCHDOG_NOWAYOUT.  Also supports the
+	magic character 'V' handling.


-- 
"Just how much can I get away with and still go to heaven?"

  reply	other threads:[~2002-02-26  1:38 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2002-02-20 13:32 SC1200 support? Roy Sigurd Karlsbakk
2002-02-20 15:14 ` Alan Cox
2002-02-21  5:54   ` [DRIVER][RFC] SC1200 Watchdog driver Zwane Mwaikambo
2002-02-21  9:39     ` Jeff Garzik
2002-02-21  9:48       ` Zwane Mwaikambo
2002-02-21 10:15         ` Jeff Garzik
2002-02-21 10:10           ` Zwane Mwaikambo
2002-02-21 11:19           ` Christer Weinigel
2002-02-21 11:59             ` Christer Weinigel
2002-02-21 12:07               ` Zwane Mwaikambo
2002-02-21 13:37                 ` Alan Cox
2002-02-21 14:27                   ` Zwane Mwaikambo
2002-02-21 14:54                     ` Dave Jones
2002-02-21 15:16                     ` Alan Cox
2002-02-21 12:22               ` Jeff Garzik
2002-02-21 12:32                 ` Zwane Mwaikambo
2002-02-21 12:46                   ` Jeff Garzik
2002-02-21 12:57                 ` Christer Weinigel
2002-02-21 13:20                   ` Jeff Garzik
2002-02-22 19:57                     ` Christer Weinigel
     [not found]                       ` <20020222210107.A6828@fafner.intra.cogenit.fr>
2002-02-22 20:48                         ` Christer Weinigel
2002-02-22 22:56                           ` Joel Becker
2002-02-25 22:20                           ` Randy.Dunlap
2002-02-26  1:22                             ` Christer Weinigel
2002-02-26  1:37                               ` Christer Weinigel [this message]
2002-02-26  1:59                                 ` Jakob Østergaard
2002-02-26  2:42                                 ` Alan Cox
2002-02-21 15:53                   ` Joel Becker
2002-02-24 17:30         ` Roy Sigurd Karlsbakk
2002-02-25  8:22           ` Zwane Mwaikambo
2002-02-25 21:01             ` Roy Sigurd Karlsbakk
2002-02-21  0:02 ` SC1200 support? Christer Weinigel
2002-02-21  0:27   ` Keith Owens
2002-02-21  6:19   ` Zwane Mwaikambo
2002-02-21  6:35     ` nick
2002-02-21  6:31       ` Zwane Mwaikambo
2002-02-21 12:05     ` Christer Weinigel
2002-02-21 10:56   ` Christer Weinigel
2002-02-21 11:14     ` Keith Owens
2002-02-21 11:24   ` David Woodhouse
  -- strict thread matches above, loose matches on Subject: below --
2002-02-22 10:15 [DRIVER][RFC] SC1200 Watchdog driver Zwane Mwaikambo
2002-02-22 19:32 ` Roy Sigurd Karlsbakk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20020226013735.0D309F5B@acolyte.hack.org \
    --to=wingel@acolyte.hack.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marcelo@conectiva.com.br \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox