All of lore.kernel.org
 help / color / mirror / Atom feed
From: Don Zickus <dzickus@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
	chai wen <chaiw.fnst@cn.fujitsu.com>
Subject: Re: [PATCH] softlockup: Make detector be aware of task switch of processes hogging cpu
Date: Thu, 28 Aug 2014 21:27:05 -0400	[thread overview]
Message-ID: <20140829012705.GC49576@redhat.com> (raw)
In-Reply-To: <20140828160723.b174b4e03f2854d0a5146199@linux-foundation.org>

On Thu, Aug 28, 2014 at 04:07:23PM -0700, Andrew Morton wrote:
> On Thu, 28 Aug 2014 00:52:24 -0400 Don Zickus <dzickus@redhat.com> wrote:
> 
> > From: chai wen <chaiw.fnst@cn.fujitsu.com>
> > 
> > For now, soft lockup detector warns once for each case of process softlockup.
> > But the thread 'watchdog/n' may not always get the cpu at the time slot between
> > the task switch of two processes hogging that cpu to reset soft_watchdog_warn.
> > 
> > An example would be two processes hogging the cpu.  Process A causes the
> > softlockup warning and is killed manually by a user.  Process B immediately
> > becomes the new process hogging the cpu preventing the softlockup code from
> > resetting the soft_watchdog_warn variable.
> > 
> > This case is a false negative of "warn only once for a process", as there may
> > be a different process that is going to hog the cpu.  Resolve this by
> > saving/checking the task pointer of the hogging process and use that to reset
> > soft_watchdog_warn too.
> > 
> 
> OK, this should address the PID uniqueness issue which Ingo identified.
> 
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -47,6 +47,7 @@ static DEFINE_PER_CPU(bool, softlockup_touch_sync);
> >  static DEFINE_PER_CPU(bool, soft_watchdog_warn);
> >  static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts);
> >  static DEFINE_PER_CPU(unsigned long, soft_lockup_hrtimer_cnt);
> > +static DEFINE_PER_CPU(struct task_struct *, softlockup_task_ptr_saved);
> >  #ifdef CONFIG_HARDLOCKUP_DETECTOR
> >  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
> >  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
> > @@ -331,8 +332,20 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> >  			return HRTIMER_RESTART;
> >  
> >  		/* only warn once */
> > -		if (__this_cpu_read(soft_watchdog_warn) == true)
> > +		if (__this_cpu_read(soft_watchdog_warn) == true) {
> > +			/*
> > +			 * Handle the case where multiple processes are
> > +			 * causing softlockups but the duration is small
> > +			 * enough, the softlockup detector can not reset
> > +			 * itself in time.  Use task pointers to detect this.
> > +			 */
> 
> This comment is rather hard to follow ("the duration" of what?).  Can
> you think of some words which are a bit more complete/clear?

Agreed.  Does this work better?

"
/*
 * When multiple processes are causing softlockups
 * the softlockup detector only warns on the first
 * one because the code relies on a full quiet cycle
 * to re-arm.  The second process prevents the
 * quiet cycle and never gets reported.  Use task
 * pointers to detect this.
 */

Cheers,
Don


  reply	other threads:[~2014-08-29  1:27 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-28  4:52 [PATCH] softlockup: Make detector be aware of task switch of processes hogging cpu Don Zickus
2014-08-28 23:07 ` Andrew Morton
2014-08-29  1:27   ` Don Zickus [this message]
  -- strict thread matches above, loose matches on Subject: below --
2014-08-21  2:30 [PATCH 2/5] softlockup: make " Don Zickus
2014-08-21  5:42 ` [PATCH] " chai wen
2014-08-22  1:12   ` Chai Wen
2014-08-22  1:58   ` Don Zickus
2014-08-26 12:51     ` Chai Wen
2014-08-26 14:22       ` Don Zickus
2014-08-27  1:33         ` Chai Wen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140829012705.GC49576@redhat.com \
    --to=dzickus@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=chaiw.fnst@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.