linux-kernel.vger.kernel.org archive mirror
* Re: 3.8.3 and 3.9git occasional watchdog oops
       [not found] <201303142154.20501.arekm@maven.pl>
@ 2013-04-04 22:23 ` Arkadiusz Miskiewicz
  2013-04-05  1:59   ` Guenter Roeck
  0 siblings, 1 reply; 3+ messages in thread
From: Arkadiusz Miskiewicz @ 2013-04-04 22:23 UTC
  To: Wim Van Sebroeck; +Cc: linux-watchdog, linux-kernel

On Thursday 14 of March 2013, Arkadiusz Miśkiewicz wrote:
> Hi.
> 
> Just hit watchdog related oops in 3.8.3 kernel. Unfortunately photos only.
> 
> http://ixion.pld-linux.org/~arekm/watchdog-oops-3.8.3/IMG_8942.JPG
> http://ixion.pld-linux.org/~arekm/watchdog-oops-3.8.3/IMG_8941.JPG

3.9git from today isn't any better unfortunately:

http://ixion.pld-linux.org/~arekm/watchdog-oops-3.9git.jpg

> 
> oops started after I enabled systemd watchdog functionality. Cannot
> reproduce easily.
> 
> watchdog here (thinkpad t400) is:
>  iTCO_wdt: Found a ICH9M-E TCO device (Version=2, TCOBASE=0x1060)


-- 
Arkadiusz Miśkiewicz, arekm / maven.pl


* Re: 3.8.3 and 3.9git occasional watchdog oops
  2013-04-04 22:23 ` 3.8.3 and 3.9git occasional watchdog oops Arkadiusz Miskiewicz
@ 2013-04-05  1:59   ` Guenter Roeck
  2013-04-06  3:47     ` Guenter Roeck
  0 siblings, 1 reply; 3+ messages in thread
From: Guenter Roeck @ 2013-04-05  1:59 UTC
  To: Arkadiusz Miskiewicz; +Cc: Wim Van Sebroeck, linux-watchdog, linux-kernel

On Fri, Apr 05, 2013 at 12:23:30AM +0200, Arkadiusz Miskiewicz wrote:
> On Thursday 14 of March 2013, Arkadiusz Miśkiewicz wrote:
> > Hi.
> > 
> > Just hit watchdog related oops in 3.8.3 kernel. Unfortunately photos only.
> > 
> > http://ixion.pld-linux.org/~arekm/watchdog-oops-3.8.3/IMG_8942.JPG
> > http://ixion.pld-linux.org/~arekm/watchdog-oops-3.8.3/IMG_8941.JPG
> 
> 3.9git from today isn't any better unfortunately:
> 
> http://ixion.pld-linux.org/~arekm/watchdog-oops-3.9git.jpg
> 
> > 
> > oops started after I enabled systemd watchdog functionality. Cannot
> > reproduce easily.
> > 
> > watchdog here (thinkpad t400) is:
> >  iTCO_wdt: Found a ICH9M-E TCO device (Version=2, TCOBASE=0x1060)
> 
> 
Wonder if there is a race condition in the watchdog driver: The watchdog device
is opened before watchdog_register_device returns. I suspect systemd waits for
a udev event, or by some other means detects that /dev/watchdog was created,
and opens it immediately.

I just have no idea where exactly the race condition, if there is one, is
hiding. Or maybe I am completely off track.

Guenter


* Re: 3.8.3 and 3.9git occasional watchdog oops
  2013-04-05  1:59   ` Guenter Roeck
@ 2013-04-06  3:47     ` Guenter Roeck
  0 siblings, 0 replies; 3+ messages in thread
From: Guenter Roeck @ 2013-04-06  3:47 UTC
  To: Arkadiusz Miskiewicz; +Cc: Wim Van Sebroeck, linux-watchdog, linux-kernel

On Thu, Apr 04, 2013 at 06:59:59PM -0700, Guenter Roeck wrote:
> On Fri, Apr 05, 2013 at 12:23:30AM +0200, Arkadiusz Miskiewicz wrote:
> > On Thursday 14 of March 2013, Arkadiusz Miśkiewicz wrote:
> > > Hi.
> > > 
> > > Just hit watchdog related oops in 3.8.3 kernel. Unfortunately photos only.
> > > 
> > > http://ixion.pld-linux.org/~arekm/watchdog-oops-3.8.3/IMG_8942.JPG
> > > http://ixion.pld-linux.org/~arekm/watchdog-oops-3.8.3/IMG_8941.JPG
> > 
> > 3.9git from today isn't any better unfortunately:
> > 
> > http://ixion.pld-linux.org/~arekm/watchdog-oops-3.9git.jpg
> > 
> > > 
> > > oops started after I enabled systemd watchdog functionality. Cannot
> > > reproduce easily.
> > > 
> > > watchdog here (thinkpad t400) is:
> > >  iTCO_wdt: Found a ICH9M-E TCO device (Version=2, TCOBASE=0x1060)
> > 
> > 
> Wonder if there is a race condition in the watchdog driver: The watchdog device
> is opened before watchdog_register_device returns. I suspect systemd waits for
> a udev event, or by some other means detects that /dev/watchdog was created,
> and opens it immediately.
> 
> I just have no idea where exactly the race condition, if there is one, is
> hiding. Or maybe I am completely off track.
> 
I _think_ I understand the sequence of events.

- The driver is the first watchdog driver to register.
- watchdog_dev_register() gets called and creates the watchdog misc device
  by calling misc_register().
  At that time, the matching character device (/dev/watchdog0) does not yet
  exist, and old_wdd is not set either.
- Userspace gets an event and opens /dev/watchdog.
- watchdog_open() is called and sets wdd = old_wdd, which is still NULL,
  and tries to dereference it. Bang.

If this is the problem, a simple fix would be to set old_wdd before calling
misc_register().
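
The suspected ordering problem can be modelled in a few lines of plain C.
This is only a userspace sketch under stated assumptions, not the actual
watchdog core code: struct wdd_model, model_open(), model_misc_register()
and the two register_*() helpers are hypothetical names, and the "userspace
opens the node immediately" step is simulated by calling the open routine
from inside the registration stand-in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct watchdog_device. */
struct wdd_model {
    int id;
};

/* Models the pointer that the legacy /dev/watchdog open path reads. */
static struct wdd_model *old_wdd;

/*
 * Models watchdog_open(): returns 0 on success, or -1 at the point where
 * the real code would have dereferenced a NULL pointer.
 */
static int model_open(void)
{
    struct wdd_model *wdd = old_wdd;

    if (wdd == NULL)
        return -1;
    return 0;
}

/*
 * Models misc_register(): assumption for this sketch is that the device
 * node becomes visible here and userspace opens it right away, before
 * registration has returned.
 */
static int model_misc_register(void)
{
    return model_open();
}

/* Ordering suspected in the thread: publish first, set old_wdd afterwards. */
static int register_publish_first(struct wdd_model *wdd)
{
    int open_result;

    assert(wdd != NULL);
    old_wdd = NULL;                      /* first driver: nothing set yet */
    open_result = model_misc_register(); /* early open sees NULL */
    old_wdd = wdd;                       /* too late for that open */
    return open_result;
}

/* Proposed ordering: set old_wdd first, then publish the device. */
static int register_pointer_first(struct wdd_model *wdd)
{
    assert(wdd != NULL);
    old_wdd = wdd;
    return model_misc_register();        /* early open sees a valid pointer */
}
```

In this model, register_publish_first() reports the failed early open (-1),
while register_pointer_first() lets the same early open succeed (0). Whether
the real driver core behaves the same way still needs to be confirmed by
testing.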

Can you test a patch?

Guenter

