public inbox for linux-kernel@vger.kernel.org
From: Mandeep Singh Baines <msb@chromium.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Mandeep Singh Baines <msb@chromium.org>,
	Don Zickus <dzickus@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.cz>
Subject: Re: [PATCH] watchdog: Make sure the watchdog thread gets CPU on loaded system
Date: Thu, 15 Mar 2012 10:04:34 -0700
Message-ID: <20120315170434.GW27051@google.com>
In-Reply-To: <1331828203.18960.200.camel@twins>

Peter Zijlstra (peterz@infradead.org) wrote:
> On Thu, 2012-03-15 at 17:11 +0100, Peter Zijlstra wrote:
> > On Thu, 2012-03-15 at 17:10 +0100, Peter Zijlstra wrote:
> > > On Thu, 2012-03-15 at 08:39 -0700, Mandeep Singh Baines wrote:
> > > > Its a good tool for catching problems of scale. As we move to more and
> > > > more cores you'll uncover bugs where data structures start to blow up.
> > > > Hash tables get huge, when you have 100000s of processes or millions
> > > > of TCP flows, or cgroups or namespace. That critical section (spinlock,
> > > > spinlock_bh, or preempt_disable) that used to be OK might no longer
> > > > be.
> > > 
> > > Or you run with the preempt latency tracer.
> > 
> > Or for that matter run cyclictest...
> 
> Thing is, if you want a latency detector, call it that and stop
> pretending its a useful debug feature. Also, if you want that, set the
> interval in the 0.1-0.5 seconds range and dump stack on every new max.
> 
> 

But the preempt latency tracer has non-negligible overhead, while the
softlockup detector's overhead is negligible. Softlockup is a great tool for
detecting temporary, long-duration lockups that can occur when data
structures blow up. Because of the overhead, you probably wouldn't enable
preempt latency tracing in production. If the problem is happening often
enough, you might temporarily turn it on for a few machines to get a stack
trace. But you might not have the luxury of being able to do that.

I can't predict what my users are going to do. They will do things I never
expected. So I can't test these cases in the lab, which rules out the preempt
latency tracer. With softlockup, I can find out about problems I never even
knew I had.

In addition, softlockup is also a great tool for finding permanent lockups.

One idea for reducing the preempt latency tracer's overhead would be to use
the same approach that the HW counters use. Instead of examining every
preempt enable/disable, only examine 1 in 1000 (some configurable number).
That way you could turn it on in production. Maybe a simple per_cpu
counter.
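
To make the sampling idea concrete, here is a rough userspace sketch, not
kernel code. Every name in it (latency_sampler, section_enter, section_exit,
SAMPLE_PERIOD) is made up for illustration, and it assumes one sampler
instance per CPU (or per thread), so no locking is shown. Nested sections are
ignored for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical: time only 1 in SAMPLE_PERIOD critical sections. */
#define SAMPLE_PERIOD 1000u

/* Assumed to exist once per CPU (or thread), so it is never shared. */
struct latency_sampler {
	unsigned int  countdown; /* sections left until the next timed one */
	int           sampling;  /* non-zero while the current section is timed */
	uint64_t      start_ns;  /* entry timestamp of the timed section */
	uint64_t      max_ns;    /* longest timed section seen so far */
	unsigned long sampled;   /* number of sections actually timed */
};

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Call at the point where preemption would be disabled. */
static void section_enter(struct latency_sampler *s)
{
	if (--s->countdown != 0)
		return; /* fast path: one decrement and one branch */

	s->countdown = SAMPLE_PERIOD;
	s->sampling = 1;
	s->start_ns = now_ns();
}

/* Call at the point where preemption would be re-enabled. */
static void section_exit(struct latency_sampler *s)
{
	uint64_t delta;

	if (!s->sampling)
		return; /* this section was not selected for timing */

	s->sampling = 0;
	delta = now_ns() - s->start_ns;
	s->sampled++;
	if (delta > s->max_ns)
		s->max_ns = delta; /* a real tracer might dump a stack here */
}
```

The sampler must be initialized with countdown set to SAMPLE_PERIOD. The
trade-off is that an unsampled section costs only the counter update, but a
long section that happens rarely can be missed until one of its occurrences
lands on a sampled slot.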

Regards,
Mandeep

