From: Chai Wen <chaiw.fnst@cn.fujitsu.com>
To: Don Zickus <dzickus@redhat.com>
Cc: <akpm@linux-foundation.org>, <mingo@redhat.com>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] softlockup: make detector be aware of task switch of processes hogging cpu
Date: Wed, 27 Aug 2014 09:33:33 +0800
Message-ID: <53FD356D.6050507@cn.fujitsu.com>
In-Reply-To: <20140826142214.GN49576@redhat.com>
On 08/26/2014 10:22 PM, Don Zickus wrote:
> On Tue, Aug 26, 2014 at 08:51:30PM +0800, Chai Wen wrote:
>> On 08/22/2014 09:58 AM, Don Zickus wrote:
>>
>>> On Thu, Aug 21, 2014 at 01:42:22PM +0800, chai wen wrote:
>>>> Currently, the soft lockup detector warns only once per soft lockup episode.
>>>> But the 'watchdog/n' thread may not get the cpu in the window between the
>>>> task switch of two processes hogging that cpu, so soft_watchdog_warn is
>>>> never reset.
>>>>
>>>> An example would be two processes hogging the cpu. Process A causes the
>>>> softlockup warning and is killed manually by a user. Process B immediately
>>>> becomes the new process hogging the cpu preventing the softlockup code from
>>>> resetting the soft_watchdog_warn variable.
>>>>
>>>> This case is a false negative of "warn only once for a process", as there may
>>>> be a different process that is going to hog the cpu. Resolve this by
>>>> saving/checking the task pointer of the hogging process and using that to
>>>> reset soft_watchdog_warn too.
>>>>
>>>> Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
>>>> Signed-off-by: Don Zickus <dzickus@redhat.com>
>>>
>>> Acked-by: Don Zickus <dzickus@redhat.com>
>>>
>>
>>
>> Hi Andrew
>>
>> Sorry to disturb you.
>> Could you please review and pick up this small improvement patch?
>>
>> I am not sure which maintainer I should talk to, but the original version of
>> this patch was queued to the -mm tree by you, so I assume it is in your charge.
>>
>>
>> thanks
>> chai wen
>
> Hi Chai,
>
> Sorry about that. Ingo asked me privately to pick this up and re-post
> with my signoff. I was converting to a new test env and was going to use this
> patch as an excuse to exercise it. That is the delay. Let me get this
> out today.
>
OK. It is kind of you to do that; thanks for your work. :)
thanks
chai wen
> Cheers,
> Don
>
>>
>>>> ---
>>>> kernel/watchdog.c | 16 +++++++++++++++-
>>>> 1 files changed, 15 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>>>> index 0037db6..2e55620 100644
>>>> --- a/kernel/watchdog.c
>>>> +++ b/kernel/watchdog.c
>>>> @@ -42,6 +42,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
>>>>  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
>>>>  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
>>>>  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
>>>> +static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
>>>>  #ifdef CONFIG_HARDLOCKUP_DETECTOR
>>>>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
>>>>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
>>>> @@ -328,8 +329,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>>>>  			return HRTIMER_RESTART;
>>>>
>>>>  		/* only warn once */
>>>> -		if (__this_cpu_read(soft_watchdog_warn) == true)
>>>> +		if (__this_cpu_read(soft_watchdog_warn) == true) {
>>>> +			/*
>>>> +			 * Handle the case where multiple processes are
>>>> +			 * causing softlockups but the duration is small
>>>> +			 * enough, the softlockup detector can not reset
>>>> +			 * itself in time. Use task pointers to detect this.
>>>> +			 */
>>>> +			if (__this_cpu_read(softlockup_task_ptr_saved) !=
>>>> +			    current) {
>>>> +				__this_cpu_write(soft_watchdog_warn, false);
>>>> +				__touch_watchdog();
>>>> +			}
>>>>  			return HRTIMER_RESTART;
>>>> +		}
>>>>
>>>>  		if (softlockup_all_cpu_backtrace) {
>>>>  			/* Prevent multiple soft-lockup reports if one cpu is already
>>>> @@ -345,6 +358,7 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>>>>  		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
>>>>  			smp_processor_id(), duration,
>>>>  			current->comm, task_pid_nr(current));
>>>> +		__this_cpu_write(softlockup_task_ptr_saved, current);
>>>>  		print_modules();
>>>>  		print_irqtrace_events(current);
>>>>  		if (regs)
>>>> --
>>>> 1.7.1
>>>>
>>> .
>>>
>>
>>
>>
>> --
>> Regards
>>
>> Chai Wen
> .
>
--
Regards
Chai Wen
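
For readers following the quoted patch, the re-arm logic it describes can be sketched in user space roughly as below. This is only a simplified simulation under stated assumptions, not the kernel code: struct task, struct cpu_state, THRESHOLD_TICKS, watchdog_thread_ran() and watchdog_tick() are hypothetical names made up for illustration, a tick counter stands in for the watchdog timestamp, and a single struct stands in for the per-CPU variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: ticks without a watchdog touch before we call it a lockup. */
#define THRESHOLD_TICKS 3

/* Hypothetical stand-in for the kernel's struct task_struct. */
struct task {
	int pid;
};

/* Stand-in for the per-CPU variables of one simulated CPU. */
struct cpu_state {
	int ticks_since_touch;          /* models time since the last touch */
	bool soft_watchdog_warn;        /* a warning was already emitted */
	const struct task *saved_task;  /* task that triggered that warning */
};

/* Models the 'watchdog/n' thread getting the cpu and touching the watchdog. */
void watchdog_thread_ran(struct cpu_state *cs)
{
	assert(cs != NULL);
	cs->ticks_since_touch = 0;
}

/*
 * Models one timer callback while 'current' is running.
 * Returns true when a new soft lockup warning would be emitted.
 */
bool watchdog_tick(struct cpu_state *cs, const struct task *current)
{
	assert(cs != NULL && current != NULL);

	cs->ticks_since_touch++;
	if (cs->ticks_since_touch <= THRESHOLD_TICKS) {
		/* No lockup detected: the usual reset path. */
		cs->soft_watchdog_warn = false;
		return false;
	}

	if (cs->soft_watchdog_warn) {
		/* Already warned; re-arm only if a different task is hogging. */
		if (cs->saved_task != current) {
			cs->soft_watchdog_warn = false;
			cs->ticks_since_touch = 0; /* plays the role of __touch_watchdog() */
		}
		return false;
	}

	/* First warning for this hog: remember which task caused it. */
	cs->saved_task = current;
	cs->soft_watchdog_warn = true;
	return true;
}
```

In this sketch, a second hogging task does not warn on the very tick it is first seen; the state is re-armed and the threshold has to elapse again, which mirrors the reset-and-touch step in the patch.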