From: Steffen Trumtrar <s.trumtrar@pengutronix.de>
To: Julia Cartwright <julia@ni.com>
Cc: Guenter Roeck <linux@roeck-us.net>,
	"linux-watchdog@vger.kernel.org" <linux-watchdog@vger.kernel.org>,
	Wim Van Sebroeck <wim@linux-watchdog.org>,
	Christophe Leroy <christophe.leroy@c-s.fr>,
	"linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>
Subject: Re: [BUG] dw_wdt watchdog on linux-rt 4.18.5-rt4 not triggering
Date: Mon, 24 Sep 2018 09:24:52 +0200
Message-ID: <73in2vl5mj.fsf@pengutronix.de>
In-Reply-To: <20180920204843.GY23084@jcartwri.amer.corp.natinst.com>


Hi!

Julia Cartwright <julia@ni.com> writes:

> Hello all-
>
> On Wed, Sep 19, 2018 at 12:43:03PM -0700, Guenter Roeck wrote:
>> On Wed, Sep 19, 2018 at 08:46:19AM +0200, Steffen Trumtrar wrote:
>> > On Tue, Sep 18, 2018 at 06:46:15AM -0700, Guenter Roeck wrote:
> [..]
>> > The problem I observe is that the watchdog is triggered because it
>> > doesn't get pinged. The ksoftirqd seems to be blocked although it
>> > runs at a much higher priority than the blocking userspace task.
>> >
>> Are you sure about that? The other email seemed to suggest that the
>> userspace task is running at higher priority.
>
> Also: ksoftirqd is irrelevant on RT for the kernel watchdog thread.  The
> relevant thread is ktimersoftd, which is the thread responsible for
> invoking hrtimer expiry functions, like what's being used for watchdogd.
>
> [..]
>> Overall, we have a number of possibilities to consider:
>>
>> - The kernel watchdog timer thread is not triggered at all under some
>>   circumstances, meaning it is not set properly. So far we have no real
>>   indication that this is the case (since the code works fine unless some
>>   userspace task takes all available CPU time).
>
> What do you mean by "not triggered"?  Do you mean woken-up/activated
> from a scheduling perspective?  In the case I identified in my other
> email, the watchdogd thread wakeup doesn't even occur, even when the
> periodic ping timer expires, because ktimersoftd has been starved.
>
> I suspect that's what's going on for Steffen, but am not yet sure.
>
>> - The watchdog device is closed. The kernel watchdog timer thread is
>>   starved and does not get to run. The question is what to do in this
>>   situation. In a real time system, this is almost always a fatal
>>   condition. Should the system really be kept alive in this situation?
>
> Sometimes it's the right decision, sometimes it's not.  The only sensible
> thing to do is to allow the user to make the decision that's right for
> their application needs by allowing the relative prioritization of
> watchdogd and their application threads.
>
> ...which they can do now, but it's not effective on RT because of the
> timer deferral through ktimersoftd.
>
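
For readers following along: the "relative prioritization" mentioned
here is ordinary scheduler policy, so it can be set from userspace. The
sketch below is a minimal illustration, not part of the patch under
discussion. It moves a thread to SCHED_FIFO at a chosen priority, which
is roughly what chrt(1) does. The helper names fifo_priority_valid()
and set_fifo_priority() are made up for this example, and the PID of
the watchdogd kthread is assumed to have been looked up separately.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper: 1 if prio is a legal SCHED_FIFO priority here. */
int fifo_priority_valid(int prio)
{
	return prio >= sched_get_priority_min(SCHED_FIFO) &&
	       prio <= sched_get_priority_max(SCHED_FIFO);
}

/*
 * Hypothetical helper: switch the thread identified by pid to SCHED_FIFO
 * at the given priority (pid 0 means the calling thread).
 * Returns 0 on success or a negative errno value, e.g. -EPERM when the
 * caller lacks the privilege to set a real-time policy.
 */
int set_fifo_priority(pid_t pid, int prio)
{
	struct sched_param sp = { .sched_priority = prio };

	/* Reject out-of-range values before asking the kernel. */
	if (!fifo_priority_valid(prio))
		return -EINVAL;

	if (sched_setscheduler(pid, SCHED_FIFO, &sp) == -1)
		return -errno;

	return 0;
}
```

Calling set_fifo_priority(pid, prio) with the PID of watchdogd and a
priority above the application threads expresses the policy decision.
As the quoted text says, on RT this alone does not help while the
wakeup itself still depends on ktimersoftd getting CPU time.
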
> The solution, in my mind, and like I mentioned in my other email, is to
> opt out of the ktimersoftd-deferral mechanism.  This requires some
> tweaking with the kthread_worker bits to ensure safety in hardirq
> context, but that seems straightforward.  See the below.
>
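
To make the starvation chain concrete, here is a toy single-CPU model
of strict priority scheduling. It is only a sketch and not kernel code.
The priority values (application 50, ktimersoftd 1, watchdogd 99), the
tick granularity, and the name simulate_pings() are all assumptions
chosen for illustration. With deferred expiry the timer only makes
ktimersoftd runnable, and ktimersoftd has to be scheduled before it can
wake watchdogd. With hard expiry the wakeup happens directly, as it
would from hardirq context.

```c
#include <stdbool.h>

enum { TASK_APP, TASK_KTIMERSOFTD, TASK_WATCHDOGD, NR_TASKS };

struct sim_task {
	int prio;	/* higher value preempts lower value */
	bool runnable;
};

/*
 * Simulate `ticks` scheduler ticks with a ping timer that expires every
 * `period` ticks. Returns how often watchdogd got to run, i.e. how many
 * pings were sent.
 */
int simulate_pings(int ticks, int period, bool hard_expiry)
{
	/* Assumed priorities; the application never sleeps. */
	struct sim_task t[NR_TASKS] = {
		[TASK_APP]         = { .prio = 50, .runnable = true  },
		[TASK_KTIMERSOFTD] = { .prio = 1,  .runnable = false },
		[TASK_WATCHDOGD]   = { .prio = 99, .runnable = false },
	};
	int pings = 0;

	for (int tick = 1; tick <= ticks; tick++) {
		/* Timer expiry: wake watchdogd directly or via ktimersoftd. */
		if (tick % period == 0) {
			if (hard_expiry)
				t[TASK_WATCHDOGD].runnable = true;
			else
				t[TASK_KTIMERSOFTD].runnable = true;
		}

		/* Pick the highest-priority runnable task for this tick. */
		int cur = -1;
		for (int i = 0; i < NR_TASKS; i++) {
			if (t[i].runnable && (cur < 0 || t[i].prio > t[cur].prio))
				cur = i;
		}

		switch (cur) {
		case TASK_KTIMERSOFTD:
			/* Runs the deferred expiry, which wakes watchdogd. */
			t[TASK_WATCHDOGD].runnable = true;
			t[TASK_KTIMERSOFTD].runnable = false;
			break;
		case TASK_WATCHDOGD:
			/* Pings the hardware and goes back to sleep. */
			pings++;
			t[TASK_WATCHDOGD].runnable = false;
			break;
		default:
			/* TASK_APP keeps spinning. */
			break;
		}
	}

	return pings;
}
```

In this model simulate_pings(100, 10, false) never pings, because
ktimersoftd at priority 1 never preempts the spinning application, even
though watchdogd itself has the highest priority. With hard expiry,
simulate_pings(100, 10, true), every timer expiry leads to a ping.
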

I just tested your patch and it works for me \o/


Thanks,
Steffen

-- 
Pengutronix e.K.                          | Steffen Trumtrar            |
Industrial Linux Solutions                | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany| Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686          | Fax:   +49-5121-206917-5555 |
