From: Peter Zijlstra <peterz@infradead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: akpm@linux-foundation.org, Ingo Molnar <mingo@elte.hu>,
linux-kernel@vger.kernel.org,
Masami Hiramatsu <mhiramat@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
"Frank Ch. Eigler" <fche@redhat.com>,
Hideo AOKI <haoki@redhat.com>,
Takashi Nishiie <t-nishiie@np.css.fujitsu.com>,
Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Subject: Re: [patch 06/17] LTTng instrumentation - scheduler
Date: Wed, 16 Jul 2008 10:30:58 +0200 [thread overview]
Message-ID: <1216197058.5232.17.camel@twins> (raw)
In-Reply-To: <20080715222747.290133980@polymtl.ca>
On Tue, 2008-07-15 at 18:26 -0400, Mathieu Desnoyers wrote:
> plain text document attachment (lttng-instrumentation-scheduler.patch)
> Instrument the scheduler activity (sched_switch, migration, wakeups, wait for a
> task, signal delivery) and process/thread creation/destruction (fork, exit,
> kthread stop). Actually, kthread creation is not instrumented in this patch
> because it is architecture dependent. It allows to connect tracers such as
> ftrace which detects scheduling latencies, good/bad scheduler decisions. Tools
> like LTTng can export this scheduler information along with instrumentation of
> the rest of the kernel activity to perform post-mortem analysis on the scheduler
> activity.
>
> About the performance impact of tracepoints (which is comparable to markers),
> even without immediate values optimizations, tests done by Hideo Aoki on ia64
> show no regression. His test case was using hackbench on a kernel where
> scheduler instrumentation (about 5 events in code scheduler code) was added.
> See the "Tracepoints" patch header for performance result detail.
>
> Changelog :
> - Change instrumentation location and parameter to match ftrace instrumentation,
> previously done with kernel markers.
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> CC: 'Peter Zijlstra' <peterz@infradead.org>
> CC: 'Steven Rostedt' <rostedt@goodmis.org>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Masami Hiramatsu <mhiramat@redhat.com>
> CC: "Frank Ch. Eigler" <fche@redhat.com>
> CC: 'Ingo Molnar' <mingo@elte.hu>
> CC: 'Hideo AOKI' <haoki@redhat.com>
> CC: Takashi Nishiie <t-nishiie@np.css.fujitsu.com>
> CC: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> ---
> include/trace/sched.h | 45 +++++++++++++++++++++++++++++++++++++++++++++
> kernel/exit.c | 6 ++++++
> kernel/fork.c | 3 +++
> kernel/kthread.c | 5 +++++
> kernel/sched.c | 17 ++++++-----------
> kernel/signal.c | 3 +++
> 6 files changed, 68 insertions(+), 11 deletions(-)
>
> Index: linux-2.6-lttng/kernel/kthread.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/kthread.c 2008-07-15 14:51:49.000000000 -0400
> +++ linux-2.6-lttng/kernel/kthread.c 2008-07-15 15:12:54.000000000 -0400
> @@ -13,6 +13,7 @@
> #include <linux/file.h>
> #include <linux/module.h>
> #include <linux/mutex.h>
> +#include <trace/sched.h>
>
> #define KTHREAD_NICE_LEVEL (-5)
>
> @@ -187,6 +188,8 @@ int kthread_stop(struct task_struct *k)
> /* It could exit after stop_info.k set, but before wake_up_process. */
> get_task_struct(k);
>
> + trace_sched_kthread_stop(k);
> +
> /* Must init completion *before* thread sees kthread_stop_info.k */
> init_completion(&kthread_stop_info.done);
> smp_wmb();
> @@ -202,6 +205,8 @@ int kthread_stop(struct task_struct *k)
> ret = kthread_stop_info.err;
> mutex_unlock(&kthread_stop_lock);
>
> + trace_sched_kthread_stop_ret(ret);
> +
> return ret;
> }
> EXPORT_SYMBOL(kthread_stop);
Why do we need two trace points in this function?
> Index: linux-2.6-lttng/kernel/sched.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/sched.c 2008-07-15 14:51:50.000000000 -0400
> +++ linux-2.6-lttng/kernel/sched.c 2008-07-15 15:13:49.000000000 -0400
> @@ -71,6 +71,7 @@
> #include <linux/debugfs.h>
> #include <linux/ctype.h>
> #include <linux/ftrace.h>
> +#include <trace/sched.h>
>
> #include <asm/tlb.h>
> #include <asm/irq_regs.h>
> @@ -1987,6 +1988,7 @@ void wait_task_inactive(struct task_stru
> * just go back and repeat.
> */
> rq = task_rq_lock(p, &flags);
> + trace_sched_wait_task(rq, p);
> running = task_running(rq, p);
> on_rq = p->se.on_rq;
> task_rq_unlock(rq, &flags);
> @@ -2337,9 +2339,7 @@ out_activate:
> success = 1;
>
> out_running:
> - trace_mark(kernel_sched_wakeup,
> - "pid %d state %ld ## rq %p task %p rq->curr %p",
> - p->pid, p->state, rq, p, rq->curr);
> + trace_sched_wakeup(rq, p);
> check_preempt_curr(rq, p);
>
> p->state = TASK_RUNNING;
> @@ -2472,9 +2472,7 @@ void wake_up_new_task(struct task_struct
> p->sched_class->task_new(rq, p);
> inc_nr_running(rq);
> }
> - trace_mark(kernel_sched_wakeup_new,
> - "pid %d state %ld ## rq %p task %p rq->curr %p",
> - p->pid, p->state, rq, p, rq->curr);
> + trace_sched_wakeup_new(rq, p);
> check_preempt_curr(rq, p);
> #ifdef CONFIG_SMP
> if (p->sched_class->task_wake_up)
> @@ -2647,11 +2645,7 @@ context_switch(struct rq *rq, struct tas
> struct mm_struct *mm, *oldmm;
>
> prepare_task_switch(rq, prev, next);
> - trace_mark(kernel_sched_schedule,
> - "prev_pid %d next_pid %d prev_state %ld "
> - "## rq %p prev %p next %p",
> - prev->pid, next->pid, prev->state,
> - rq, prev, next);
> + trace_sched_switch(rq, prev, next);
> mm = next->mm;
> oldmm = prev->active_mm;
> /*
> @@ -2884,6 +2878,7 @@ static void sched_migrate_task(struct ta
> || unlikely(cpu_is_offline(dest_cpu)))
> goto out;
>
> + trace_sched_migrate_task(rq, p, dest_cpu);
> /* force the process onto the specified CPU */
> if (migrate_task(p, dest_cpu, &req)) {
> /* Need to wait for migration thread (might exit: take ref). */
> Index: linux-2.6-lttng/kernel/exit.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/exit.c 2008-07-15 14:51:49.000000000 -0400
> +++ linux-2.6-lttng/kernel/exit.c 2008-07-15 15:12:54.000000000 -0400
> @@ -46,6 +46,7 @@
> #include <linux/resource.h>
> #include <linux/blkdev.h>
> #include <linux/task_io_accounting_ops.h>
> +#include <trace/sched.h>
>
> #include <asm/uaccess.h>
> #include <asm/unistd.h>
> @@ -149,6 +150,7 @@ static void __exit_signal(struct task_st
>
> static void delayed_put_task_struct(struct rcu_head *rhp)
> {
> + trace_sched_process_free(container_of(rhp, struct task_struct, rcu));
> put_task_struct(container_of(rhp, struct task_struct, rcu));
> }
It might make sense to write it like:
static void delayed_put_task_struct(struct rcu_head *rhp)
{
struct task_struct *tsk = container_of(rhp, struct task_struct, rcu);
trace_sched_process_free(tsk);
put_task_struct(tsk);
}
> @@ -1040,6 +1042,8 @@ NORET_TYPE void do_exit(long code)
>
> if (group_dead)
> acct_process();
> + trace_sched_process_exit(tsk);
> +
> exit_sem(tsk);
> exit_files(tsk);
> exit_fs(tsk);
> @@ -1524,6 +1528,8 @@ static long do_wait(enum pid_type type,
> struct task_struct *tsk;
> int flag, retval;
>
> + trace_sched_process_wait(pid);
> +
> add_wait_queue(¤t->signal->wait_chldexit,&wait);
> repeat:
> /* If there is nothing that can match our critier just get out */
> Index: linux-2.6-lttng/kernel/fork.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/fork.c 2008-07-15 14:51:49.000000000 -0400
> +++ linux-2.6-lttng/kernel/fork.c 2008-07-15 15:14:23.000000000 -0400
> @@ -56,6 +56,7 @@
> #include <linux/proc_fs.h>
> #include <linux/blkdev.h>
> #include <linux/magic.h>
> +#include <trace/sched.h>
>
> #include <asm/pgtable.h>
> #include <asm/pgalloc.h>
> @@ -1362,6 +1363,8 @@ long do_fork(unsigned long clone_flags,
> if (!IS_ERR(p)) {
> struct completion vfork;
>
> + trace_sched_process_fork(current, p);
> +
> nr = task_pid_vnr(p);
>
> if (clone_flags & CLONE_PARENT_SETTID)
> Index: linux-2.6-lttng/kernel/signal.c
> ===================================================================
> --- linux-2.6-lttng.orig/kernel/signal.c 2008-07-15 14:49:14.000000000 -0400
> +++ linux-2.6-lttng/kernel/signal.c 2008-07-15 15:12:54.000000000 -0400
> @@ -26,6 +26,7 @@
> #include <linux/freezer.h>
> #include <linux/pid_namespace.h>
> #include <linux/nsproxy.h>
> +#include <trace/sched.h>
>
> #include <asm/param.h>
> #include <asm/uaccess.h>
> @@ -807,6 +808,8 @@ static int send_signal(int sig, struct s
> struct sigpending *pending;
> struct sigqueue *q;
>
> + trace_sched_signal_send(sig, t);
> +
> assert_spin_locked(&t->sighand->siglock);
> if (!prepare_signal(sig, t))
> return 0;
Would it make sense to also put a trace point on receiveing a signal?
/me utterly clueless about the whole signal stuff.
> Index: linux-2.6-lttng/include/trace/sched.h
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6-lttng/include/trace/sched.h 2008-07-15 15:12:54.000000000 -0400
> @@ -0,0 +1,45 @@
> +#ifndef _TRACE_SCHED_H
> +#define _TRACE_SCHED_H
> +
> +#include <linux/sched.h>
> +#include <linux/tracepoint.h>
> +
> +DEFINE_TRACE(sched_kthread_stop,
> + TPPROTO(struct task_struct *t),
> + TPARGS(t));
> +DEFINE_TRACE(sched_kthread_stop_ret,
> + TPPROTO(int ret),
> + TPARGS(ret));
> +DEFINE_TRACE(sched_wait_task,
> + TPPROTO(struct rq *rq, struct task_struct *p),
> + TPARGS(rq, p));
> +DEFINE_TRACE(sched_wakeup,
> + TPPROTO(struct rq *rq, struct task_struct *p),
> + TPARGS(rq, p));
> +DEFINE_TRACE(sched_wakeup_new,
> + TPPROTO(struct rq *rq, struct task_struct *p),
> + TPARGS(rq, p));
> +DEFINE_TRACE(sched_switch,
> + TPPROTO(struct rq *rq, struct task_struct *prev,
> + struct task_struct *next),
> + TPARGS(rq, prev, next));
> +DEFINE_TRACE(sched_migrate_task,
> + TPPROTO(struct rq *rq, struct task_struct *p, int dest_cpu),
> + TPARGS(rq, p, dest_cpu));
> +DEFINE_TRACE(sched_process_free,
> + TPPROTO(struct task_struct *p),
> + TPARGS(p));
> +DEFINE_TRACE(sched_process_exit,
> + TPPROTO(struct task_struct *p),
> + TPARGS(p));
> +DEFINE_TRACE(sched_process_wait,
> + TPPROTO(struct pid *pid),
> + TPARGS(pid));
> +DEFINE_TRACE(sched_process_fork,
> + TPPROTO(struct task_struct *parent, struct task_struct *child),
> + TPARGS(parent, child));
> +DEFINE_TRACE(sched_signal_send,
> + TPPROTO(int sig, struct task_struct *p),
> + TPARGS(sig, p));
> +
> +#endif
>
next prev parent reply other threads:[~2008-07-16 8:30 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-15 22:26 [patch 00/17] Tracepoints v4 for linux-next Mathieu Desnoyers
2008-07-15 22:26 ` [patch 01/17] RCU read sched Mathieu Desnoyers
2008-08-01 21:10 ` Paul E. McKenney
2008-08-01 23:04 ` Peter Zijlstra
2008-07-15 22:26 ` [patch 02/17] Kernel Tracepoints Mathieu Desnoyers
2008-07-24 15:08 ` Steven Rostedt
2008-07-24 20:18 ` Mathieu Desnoyers
2008-08-01 21:10 ` Paul E. McKenney
2008-08-04 15:17 ` Mathieu Desnoyers
2008-07-24 15:34 ` Steven Rostedt
2008-07-24 20:30 ` Mathieu Desnoyers
2008-07-24 22:22 ` Steven Rostedt
2008-07-24 15:39 ` Steven Rostedt
2008-07-24 20:37 ` [PATCH] Tracepoints use TABLE_SIZE macro Mathieu Desnoyers
2008-07-15 22:26 ` [patch 03/17] Tracepoints Documentation Mathieu Desnoyers
2008-07-15 22:26 ` [patch 04/17] Tracepoints Samples Mathieu Desnoyers
2008-07-15 22:26 ` [patch 05/17] LTTng instrumentation - irq Mathieu Desnoyers
2008-07-15 22:26 ` [patch 06/17] LTTng instrumentation - scheduler Mathieu Desnoyers
2008-07-16 8:30 ` Peter Zijlstra [this message]
2008-07-16 14:18 ` Mathieu Desnoyers
2008-07-15 22:26 ` [patch 07/17] LTTng instrumentation - timer Mathieu Desnoyers
2008-07-16 8:34 ` Peter Zijlstra
2008-07-16 14:34 ` Mathieu Desnoyers
2008-07-15 22:26 ` [patch 08/17] LTTng instrumentation - kernel Mathieu Desnoyers
2008-07-24 13:57 ` Steven Rostedt
2008-07-24 14:30 ` Mathieu Desnoyers
2008-07-24 15:13 ` Steven Rostedt
2008-07-15 22:26 ` [patch 09/17] LTTng instrumentation - filemap Mathieu Desnoyers
2008-07-16 8:35 ` Peter Zijlstra
2008-07-16 14:37 ` Mathieu Desnoyers
2008-07-17 6:25 ` Nick Piggin
2008-07-17 7:02 ` Mathieu Desnoyers
2008-07-17 7:11 ` Nick Piggin
2008-07-15 22:26 ` [patch 10/17] LTTng instrumentation - swap Mathieu Desnoyers
2008-07-16 8:39 ` Peter Zijlstra
2008-07-16 14:40 ` Mathieu Desnoyers
2008-07-16 14:47 ` Peter Zijlstra
2008-07-16 15:00 ` Mathieu Desnoyers
2008-07-16 15:50 ` KOSAKI Motohiro
2008-07-16 16:17 ` Mathieu Desnoyers
2008-07-15 22:26 ` [patch 11/17] LTTng instrumentation - memory page faults Mathieu Desnoyers
2008-07-15 22:26 ` [patch 12/17] LTTng instrumentation - page Mathieu Desnoyers
2008-07-16 8:41 ` Peter Zijlstra
2008-07-16 15:03 ` Mathieu Desnoyers
2008-07-15 22:26 ` [patch 13/17] LTTng instrumentation - hugetlb Mathieu Desnoyers
2008-07-15 22:26 ` [patch 14/17] LTTng instrumentation - net Mathieu Desnoyers
2008-07-15 22:26 ` [patch 15/17] LTTng instrumentation - ipv4 Mathieu Desnoyers
2008-07-15 22:26 ` [patch 16/17] LTTng instrumentation - ipv6 Mathieu Desnoyers
2008-07-15 22:26 ` [patch 17/17] ftrace port to tracepoints Mathieu Desnoyers
2008-07-16 8:51 ` [patch 00/17] Tracepoints v4 for linux-next Peter Zijlstra
2008-07-18 15:41 ` Masami Hiramatsu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1216197058.5232.17.camel@twins \
--to=peterz@infradead.org \
--cc=akpm@linux-foundation.org \
--cc=eduard.munteanu@linux360.ro \
--cc=fche@redhat.com \
--cc=haoki@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@polymtl.ca \
--cc=mhiramat@redhat.com \
--cc=mingo@elte.hu \
--cc=rostedt@goodmis.org \
--cc=t-nishiie@np.css.fujitsu.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox