From: Don Zickus <dzickus@redhat.com>
To: "Pádraig Brady" <P@draigBrady.com>
Cc: kexec@lists.infradead.org, linux-watchdog@vger.kernel.org,
vgoyal@redhat.com, amwang@redhat.com
Subject: Re: watchdogs and kdump
Date: Fri, 28 Oct 2011 09:39:54 -0400
Message-ID: <20111028133954.GS3452@redhat.com>
In-Reply-To: <4EA9D09E.800@draigBrady.com>
On Thu, Oct 27, 2011 at 10:43:58PM +0100, Pádraig Brady wrote:
> On 10/27/2011 09:30 PM, Don Zickus wrote:
> > Hi,
> >
> > I was assisting a customer the other day debugging a kdump[1] problem, when we
> > noticed the real problem was the hardware watchdog was firing and
> > rebooting the box.
> >
> > Of course, this can be inconvenient if the panic happens right before the
> > watchdog is supposed to be kicked, leading to a spontaneous reboot before
> > the second kernel finishes booting and loading the watchdog module.
> >
> > I was trying to think of a way to solve this and thought, one way to
> > minimize the problem is to kick the watchdog before we jump into the kdump
> > kernel. Another way is to disable the watchdog entirely, but that doesn't
> > work on all hardware I believe.
> >
> > Anyway, I was posting on the watchdog mailing list to see if anyone had any
> > ideas that might help. And if my above idea to kick the watchdog before
> > jumping into the kdump kernel seems ok, then an api would need to be
> > developed.
> >
> > I am willing to do any coding and testing necessary, but before I did, I
> > wanted help to get a direction to go in first.
> >
> > Thoughts?
>
> Seems like the appropriate thing to do is to call all the
> reboot notifiers that each watchdog registers.
> Since one is not doing a full SYS_RESTART (SYS_DOWN) though,
> i.e. not running through the BIOS code again,
> it might be worth having a different SYS_JUMP code in notifier.h
> that would allow you to kick rather than stop the watchdogs
> as the reboot notifiers generally do at the moment.

That is an interesting idea. I am not sure whether calling a blocking
notifier in the kdump path would be acceptable to the kexec folks. Then
again, using the reboot notifier in the panic path may not be a good idea
either; it might lead to false expectations. :-/

> I think it would be important not to stop the watchdog if possible,
> given the large amount of logic that's going to be executed
> after the jump.

I agree. Especially since kdump is still not 100% reliable.

Thanks for the feedback!

Cheers,
Don
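
The SYS_JUMP idea discussed above can be sketched as a toy userspace model
in Python. This is not kernel code and not a proposed patch. The Watchdog
and NotifierChain classes are hypothetical stand-ins, and the SYS_JUMP value
is made up for illustration; only SYS_DOWN, SYS_RESTART, SYS_HALT and
SYS_POWER_OFF mirror codes the kernel already defines. The model assumes a
driver callback that stops the watchdog on SYS_DOWN or SYS_HALT, as many
drivers do, and shows how an extra code could let it kick instead.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Event codes modelled on the kernel's existing reboot notifier codes.
SYS_DOWN = 0x0001
SYS_RESTART = SYS_DOWN
SYS_HALT = 0x0002
SYS_POWER_OFF = 0x0003
# Hypothetical: the SYS_JUMP code suggested in this thread (value is made up).
SYS_JUMP = 0x0004


@dataclass
class Watchdog:
    """Stand-in for a hardware watchdog with a countdown timer."""
    timeout: int = 60     # seconds granted by each kick
    remaining: int = 60   # seconds left before the watchdog would fire
    running: bool = True

    def kick(self) -> None:
        # Reload the countdown so the full timeout is available again.
        self.remaining = self.timeout

    def stop(self) -> None:
        self.running = False

    def notify(self, event: int) -> None:
        """Reboot-notifier style callback."""
        if event == SYS_JUMP:
            # Jumping into the kdump kernel: keep the watchdog armed,
            # but give the second kernel a full timeout to boot.
            self.kick()
        elif event in (SYS_DOWN, SYS_HALT):
            # Ordinary shutdown path: stop the watchdog.
            self.stop()


@dataclass
class NotifierChain:
    """Minimal notifier chain: call every registered callback in order."""
    callbacks: List[Callable[[int], None]] = field(default_factory=list)

    def register(self, callback: Callable[[int], None]) -> None:
        self.callbacks.append(callback)

    def call(self, event: int) -> None:
        for callback in self.callbacks:
            callback(event)


# Usage: a panic happens two seconds before the next scheduled kick.
wdt = Watchdog(timeout=60, remaining=2)
chain = NotifierChain()
chain.register(wdt.notify)
chain.call(SYS_JUMP)
print(wdt.running, wdt.remaining)
```

The model deliberately leaves out the open question raised in the reply:
whether a blocking notifier chain is safe to run in the panic/kdump path
at all.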