From: Frederic Weisbecker <fweisbec@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>, LKML <linux-kernel@vger.kernel.org>,
Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] tracing: Add task activate/deactivate tracepoints
Date: Mon, 31 May 2010 16:48:23 +0200
Message-ID: <20100531144820.GB5157@nowhere>
In-Reply-To: <1275317013.27810.23019.camel@twins>
On Mon, May 31, 2010 at 04:43:33PM +0200, Peter Zijlstra wrote:
> On Mon, 2010-05-31 at 16:36 +0200, Frederic Weisbecker wrote:
> > On Mon, May 31, 2010 at 10:54:59AM +0200, Peter Zijlstra wrote:
> > > On Mon, 2010-05-31 at 10:12 +0200, Peter Zijlstra wrote:
> > > > On Mon, 2010-05-31 at 10:00 +0200, Ingo Molnar wrote:
> > > > > >
> > > > > > NAK, aside from a few corner cases wakeup and sleep are the important
> > > > > > points.
> > > > > >
> > > > > > The activate and deactivate functions are implementation details.
> > > > >
> > > > > Frederic, can you show us a concrete example of where we don't know what is
> > > > > going on due to inadequate instrumentation? Can we fix that by extending the
> > > > > existing tracepoints?
> > > >
> > > > Right, so a few of those corner cases I mentioned above are things like
> > > > re-nice, PI-boosts etc. Those use deactivate, modify task-state,
> > > > activate cycles, so if you want to see those, we can add an explicit
> > > > tracepoint for those actions.
> > > >
> > > > An explicit nice/PI-boost tracepoint is much clearer than trying to
> > > > figure out wth the deactivate/activate cycle was for.
> > >
> > > Another advantage of explicit tracepoints is that you'd see them even
> > > for non-running tasks, because we only do the deactivate/activate thingy
> > > for runnable tasks.
> >
> >
> > Yeah. So I agree with you that activate/deactivate are too
> > implementation-related; they don't even make much sense, as we
> > don't know the cause of the event: it could be a simple renice, or
> > it could be a sleep.
> >
> > So agreed, this sucks.
> >
> > For the corner cases like re-nice and PI-boost or so, we can indeed plug
> > some higher level tracepoints there.
> >
> > But there is one more important problem these tracepoints were solving and
> > that still needs something:
> >
> > We don't know when a task goes to sleep. We have two wait tracepoints:
> > sched_wait_task(), to wait for a task to unschedule, and sched_process_wait(),
> > which is a hook for the waitid and wait4 syscalls. So we are missing all
> > the event waiting from inside the kernel. But even with that, wait and sleep
> > don't mean the same thing. Sleeping doesn't always involve using the waiting
> > API.
> >
> > I think we need a tracepoint like this:
> >
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index 8c0b90d..5f67c04 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -3628,8 +3628,10 @@ need_resched_nonpreemptible:
> >          if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
> >                  if (unlikely(signal_pending_state(prev->state, prev)))
> >                          prev->state = TASK_RUNNING;
> > -                else
> > +                else {
> > +                        trace_sched_task_sleep(prev);
> >                          deactivate_task(rq, prev, DEQUEUE_SLEEP);
> > +                }
> >                  switch_count = &prev->nvcsw;
> >          }
>
> > And concerning the task waking up, if it is not migrated, it means it stays
> > on its orig cpu. This is something that can be dealt with in post-processing.
>
> Hurm... I was thinking trace_sched_switch(.prev_state != TASK_RUNNING)
> would be enough, but it's not for preemptible kernels.
>
> Should we maybe cure this and rely on sched_switch() to detect sleeps?
> It seems natural since only the current task can go to sleep, it's just
> that the whole preempt state gets a bit iffy.
Sounds good. We have the preempt depth in the common tracepoint headers, so I'll
try to rebuild a reliable cpu runqueue from post-processing and see if all that
is enough.
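
To make the idea concrete, here is a rough userspace sketch of the check a
post-processing tool could apply to each sched_switch event. It mirrors the
condition from the hunk quoted above: the previous task only counts as
sleeping if its state is not TASK_RUNNING and the switch was not a
preemption. Everything in it is an assumption made for illustration: the
struct switch_event layout, the preempt_active flag (and whether the trace
data can actually provide it), and the helper names are hypothetical, not
existing kernel or perf code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TASK_RUNNING 0	/* same value as the kernel's TASK_RUNNING */

/*
 * Hypothetical decoded sched_switch record. The preempt_active flag is an
 * assumption: it stands for "PREEMPT_ACTIVE was set at switch time", which
 * the tool would have to derive from the trace somehow.
 */
struct switch_event {
	int cpu;
	int prev_pid;
	long prev_state;	/* prev->state as reported by the event */
	bool preempt_active;
	int next_pid;
};

/* True if the task switched out by this event left the runqueue to sleep. */
bool prev_went_to_sleep(const struct switch_event *ev)
{
	return ev->prev_state != TASK_RUNNING && !ev->preempt_active;
}

/* Count the events in a trace where the previous task went to sleep. */
size_t count_sleeps(const struct switch_event *evs, size_t n)
{
	size_t i, sleeps = 0;

	assert(evs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (prev_went_to_sleep(&evs[i]))
			sleeps++;
	return sleeps;
}

/* Small made-up trace: one real sleep, one plain switch, one preemption. */
size_t example_count_sleeps(void)
{
	const struct switch_event evs[] = {
		/* state set and not preempted: a real sleep */
		{ .cpu = 0, .prev_pid = 100, .prev_state = 1,
		  .preempt_active = false, .next_pid = 101 },
		/* still TASK_RUNNING: stays on the runqueue */
		{ .cpu = 0, .prev_pid = 101, .prev_state = 0,
		  .preempt_active = false, .next_pid = 102 },
		/* state set but preempted: still on the runqueue */
		{ .cpu = 1, .prev_pid = 102, .prev_state = 1,
		  .preempt_active = true, .next_pid = 103 },
	};

	return count_sleeps(evs, sizeof(evs) / sizeof(evs[0]));
}
```

With the three made-up events above, only the first one is classified as a
sleep. The third is the preemptible-kernel case: the task has already set its
state but is preempted before it reaches schedule() on its own, so it must
still be treated as runnable.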
Thanks.
Thread overview: 15+ messages
2010-05-28 14:26 [PATCH] tracing: Add task activate/deactivate tracepoints Frederic Weisbecker
2010-05-28 15:15 ` Peter Zijlstra
2010-05-31 8:00 ` Ingo Molnar
2010-05-31 8:12 ` Peter Zijlstra
2010-05-31 8:54 ` Peter Zijlstra
2010-05-31 14:36 ` Frederic Weisbecker
2010-05-31 14:43 ` Peter Zijlstra
2010-05-31 14:48 ` Frederic Weisbecker [this message]
2010-05-31 16:18 ` Peter Zijlstra
2010-05-31 16:37 ` Steven Rostedt
2010-05-31 18:28 ` Peter Zijlstra
2010-05-31 19:14 ` Steven Rostedt
2010-05-31 19:16 ` Steven Rostedt
2010-05-31 16:51 ` Frederic Weisbecker
2010-06-01 9:13 ` [tip:sched/urgent] sched, trace: Fix sched_switch() prev_state argument tip-bot for Peter Zijlstra