All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <jbacik@fb.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: <bmaurer@fb.com>, <rkroll@fb.com>, <kernel-team@fb.com>,
	<mingo@redhat.com>, <linux-kernel@vger.kernel.org>,
	<umgwanakikbuti@gmail.com>
Subject: Re: [PATCH] sched/fair: change where we report sched stats V2
Date: Mon, 15 Dec 2014 10:37:09 -0500	[thread overview]
Message-ID: <548F0025.4040203@fb.com> (raw)
In-Reply-To: <20141215101625.GW29390@twins.programming.kicks-ass.net>

On 12/15/2014 05:16 AM, Peter Zijlstra wrote:
> On Thu, Dec 11, 2014 at 10:59:55AM -0500, Josef Bacik wrote:
>> The schedule stats currently spit out once the entity is being queued, which
>> means if we have stack traces enabled we will get the stack trace of the waker,
>> not of the task being woken.  This makes the backtrace completely useless when
>> trying to track down latency spikes, as we want to know why we were put to sleep
>> for as long as we were.
>>
>> This patch moves the trace_sched_stat* calls to after the task has been put back
>> on the CPU so that the backtrace we get is the one that put us to sleep.  Mike
>> Galbraith suggested that I leave the actual stat collection stuff where it is so
>> that we don't add the scheduler latency to the sleep stats, so I've added 3 new
>> fields in the scheduler stats to keep track of the last sleep time so that if
>> any of these are set we know to trigger the trace points.  Thanks,
>>
>
> Yeah, so I don't like this, it adds overhead for everyone.
>

Only if SCHEDSTATS is enabled tho, and it's no more overhead in the 
SCHEDSTATS case than before.  Would it be more acceptable to move the 
entire callback under SCHEDSTATS?

> Also, I'm very sure this issue has been raised before, and I'm pretty
> sure the answer back then was why don't you look at the
> trace_sched_switch() site matching this wakeup?
>

This is fine for discrete problems, but when trying to find a random 
latency spike in a production workload it's impossible. If I do

trace-cmd record -e sched:sched_switch -T sleep 5

on just one of our random web servers I end up with this

du -h trace.dat
62M     trace.dat

thats 62 megs in 5 seconds.  I ran the following command for almost 2 
hours when searching for a latency spike

trace-cmd record -B latency -e sched:sched_stat_blocked -f \"delay >= 
100000\" -T -o /root/latency.dat

and got the following .dat file

du -h latency.dat
48M     latency.dat

That right there is a huge reason for this change, so I'd really like to 
do whatever it takes to get this fixed.  I'd be happy to move the 
callback under CONFIG_SCHEDSTATS to reduce any possible overhead you 
would be concerned about.  Thanks,

Josef

  reply	other threads:[~2014-12-15 15:37 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-11 15:59 [PATCH] sched/fair: change where we report sched stats V2 Josef Bacik
2014-12-15 10:16 ` Peter Zijlstra
2014-12-15 15:37   ` Josef Bacik [this message]
2014-12-15 17:21     ` Peter Zijlstra
2014-12-15 17:30       ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=548F0025.4040203@fb.com \
    --to=jbacik@fb.com \
    --cc=bmaurer@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rkroll@fb.com \
    --cc=umgwanakikbuti@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.