From: Prarit Bhargava <prarit@redhat.com>
To: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: virtualization@lists.osdl.org,
Rick Lindsley <ricklind@us.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Martin Schwidefsky <schwidefsky@de.ibm.com>,
john stultz <johnstul@us.ibm.com>, Ingo Molnar <mingo@elte.hu>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Paul Mackerras <paulus@samba.org>
Subject: Re: [patch 1/2] Ignore stolen time in the softlockup watchdog
Date: Tue, 27 Mar 2007 13:20:52 -0400
Message-ID: <46095274.7050502@redhat.com>
In-Reply-To: <46095006.2000306@xensource.com>
Jeremy Fitzhardinge wrote:
> Prarit Bhargava wrote:
>> Jeremy Fitzhardinge wrote:
>>> Prarit Bhargava wrote:
>>>> I'd like to see this patch implement/fix touch_cpu_softlockup_watchdog
>>>> and touch_softlockup_watchdog to mimic touch_nmi_watchdog's behaviour.
>>>>
>>> Why? Is that more correct? It seems to me that you're interested in
>>> whether a specific CPU has gone and locked up. If touching the watchdog
>>> makes it update all CPU timestamps, then you'll hide the fact that other
>>> CPUs have locked up, won't it?
>>>
>> In case of misuse, yes. But there are cases where we know that all CPUs
>> will have softlockup issues, such as when doing a "big" sysrq-t dump.
>> When doing the sysrq-t we take the tasklist_lock which prevents all
>> other CPUs from scheduling -- this leads to bogus softlockup messages,
>> so we need to reset everyone's watchdog just before releasing the
>> tasklist_lock.
>>
>> Another question -- are you going to expose disable/enable_watchdog to
>> other subsystems? Or are you going to expose touch_softlockup_watchdog?
>>
>
> Well, it depends on who turns up.
>
> My first thought is to export both the global enable/disable interfaces
> and touch_softlockup_watchdog. But on second thoughts maybe
> touch_softlockup_watchdog is completely redundant, since you'd only do
>
IMO, if you export enable/disable you should drop touch_softlockup_watchdog.
> it if you're holding off timer interrupts, but the lockup only gets
> reported if timer interrupts are enabled (in other words, the best it
> can tell you is "you locked up for a while there", which isn't terribly
> useful).
I like to think of the softlockup watchdog as letting me know that a CPU
hasn't scheduled in a long time.
> So perhaps this can just be dropped. I haven't looked at the
> users to see what they're really trying to achieve.
>
I've looked through much of that code for my previous patch ;)
AFAICT the uses appear to be cases where we _know_ that we've gone away
for a while and need to reset the timer.
But there were some exceptions: touch_nmi_watchdog erroneously calls
touch_softlockup_watchdog. In fact, touch_nmi_watchdog is trying to
touch every CPU's softlockup watchdog, not just one.
IIRC, there was also an extra call to touch_softlockup_watchdog which
wasn't necessary...
Look at my previous patch, where I replaced touch_softlockup_watchdog
with touch_cpu_softlockup_watchdog ...
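
To make the one-CPU versus all-CPUs distinction concrete, here is a minimal
userspace sketch. It is not the kernel code and not the patch under
discussion: the array touch_timestamp, the constants NR_CPUS and
LOCKUP_THRESHOLD, and all function names below are assumptions made up for
illustration. It only shows that touching a single CPU leaves the other
CPUs' stale timestamps in place, while touching all of them clears every
pending report.

```c
#include <assert.h>

#define NR_CPUS          4   /* hypothetical CPU count for this sketch */
#define LOCKUP_THRESHOLD 10  /* hypothetical threshold, in seconds */

/* One "last touched" timestamp per CPU; a stand-in for per-CPU data. */
unsigned long touch_timestamp[NR_CPUS];

/* Reset the watchdog timestamp of a single CPU. */
void touch_cpu_watchdog(int cpu, unsigned long now)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    touch_timestamp[cpu] = now;
}

/* Reset the watchdog timestamp of every CPU. */
void touch_all_watchdogs(unsigned long now)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        touch_cpu_watchdog(cpu, now);
}

/* Return 1 if the CPU was last touched more than the threshold ago. */
int cpu_looks_stuck(int cpu, unsigned long now)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return now - touch_timestamp[cpu] > LOCKUP_THRESHOLD;
}

/* Count how many CPUs would currently be reported as stuck. */
int count_stuck_cpus(unsigned long now)
{
    int stuck = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        stuck += cpu_looks_stuck(cpu, now);
    return stuck;
}
```

In this model, if every CPU was held up by a long operation, calling
touch_cpu_watchdog() for one CPU still leaves the others looking stuck,
whereas touch_all_watchdogs() resets them all. The same property is what
would hide a genuinely stuck CPU if the all-CPUs variant were misused.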
> The enable/disable interfaces are more generally useful in that you can
> say "I *know* I'm going to go away for a while, so don't bother
> reporting it".
>
> J
>
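
The enable/disable idea quoted above can be sketched in the same userspace
style. Again, this is an illustration under stated assumptions, not the
interface from the patch series: struct cpu_watchdog, the watchdogs array
and the watchdog_* function names are hypothetical. The sketch assumes that
a disabled watchdog never reports, and that re-enabling restarts the
measurement from the current time so the disabled period is not counted.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS          4   /* hypothetical CPU count for this sketch */
#define LOCKUP_THRESHOLD 10  /* hypothetical threshold, in seconds */

/* Hypothetical per-CPU watchdog state: a timestamp plus an enable flag. */
struct cpu_watchdog {
    unsigned long touch_timestamp;
    bool enabled;
};

struct cpu_watchdog watchdogs[NR_CPUS];

/* Enable every CPU's watchdog and start measuring from 'now'. */
void watchdog_init(unsigned long now)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        watchdogs[cpu].touch_timestamp = now;
        watchdogs[cpu].enabled = true;
    }
}

/* "I know I'm going away for a while, so don't bother reporting it." */
void watchdog_disable(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    watchdogs[cpu].enabled = false;
}

/* Re-enable and restart the measurement so the gap is not reported. */
void watchdog_enable(int cpu, unsigned long now)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    watchdogs[cpu].touch_timestamp = now;
    watchdogs[cpu].enabled = true;
}

/* Report only when enabled and the threshold has been exceeded. */
bool watchdog_should_report(int cpu, unsigned long now)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (!watchdogs[cpu].enabled)
        return false;
    return now - watchdogs[cpu].touch_timestamp > LOCKUP_THRESHOLD;
}
```

With this shape a caller brackets a known long operation with
watchdog_disable() and watchdog_enable(), rather than touching the
timestamp after the fact.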