From: Alexei Starovoitov <ast@plumgrid.com>
To: Daniel Wagner <daniel.wagner@bmw-carit.de>,
Steven Rostedt <rostedt@goodmis.org>,
Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Wang Nan <wangnan0@huawei.com>
Subject: Re: latency histogram with BPF
Date: Thu, 11 Jun 2015 15:08:23 -0700 [thread overview]
Message-ID: <557A06D7.7030304@plumgrid.com> (raw)
In-Reply-To: <557937D8.90606@bmw-carit.de>
On 6/11/15 12:25 AM, Daniel Wagner wrote:
>
> In both cases BPF or based on Tom's 'hist' triggers' patches, there is
> some trickery necessary to get it working. While the first approach
> has more flexibility what you want to measure or how you want to
> present it, I suspect it will be harder to get it working/accepted.
why? Out of your patch only 2 lines are for kernel, which I wanted
to add anyway. The sample code also looks good.
> Obviously, adding kprobes to trace_preempt_[on|off] is
> asking for problems:
thanks for reporting.
I reproduced it with hackbench, but my stack trace looks completely
different. Looks like some memory corruption is happening.
Not clear where. I'm still debugging.
> CPU 3
> latency : count distribution
> 1 -> 1 : 0 | |
> 2 -> 3 : 0 | |
> 4 -> 7 : 0 | |
> 8 -> 15 : 25182 |************************************************* |
> 16 -> 31 : 1675 |** |
> 32 -> 63 : 0 | |
> 64 -> 127 : 0 | |
> 128 -> 255 : 0 | |
> 256 -> 511 : 0 | |
> 512 -> 1023 : 0 | |
>
> The numbers look a bit too nice and low. I suspect something is going
> wrong.
I see similar numbers. 25k over 5 sec = 5k preempt on/off per second
which sounds about right for idle.
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 2d56ce5..e4d3e76 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -172,6 +172,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
> return &bpf_probe_read_proto;
> case BPF_FUNC_ktime_get_ns:
> return &bpf_ktime_get_ns_proto;
> + case BPF_FUNC_get_smp_processor_id:
> + return &bpf_get_smp_processor_id_proto;
This part I'll take into my set of patches.
> --- a/samples/bpf/bpf_helpers.h
> +++ b/samples/bpf/bpf_helpers.h
> @@ -21,6 +21,8 @@ static unsigned long long (*bpf_ktime_get_ns)(void) =
> (void *) BPF_FUNC_ktime_get_ns;
> static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
> (void *) BPF_FUNC_trace_printk;
> +static unsigned int (*bpf_get_smp_processor_id)(void) =
> + (void *) BPF_FUNC_get_smp_processor_id;
this part is already there in net-next.
> diff --git a/samples/bpf/hwlathist_kern.c b/samples/bpf/hwlathist_kern.c
> new file mode 100644
> index 0000000..7a97e7e
> --- /dev/null
> +++ b/samples/bpf/hwlathist_kern.c
looks good and useful, but I would like to find and fix the bug first
before exposing this bit as a 'sample', since right now it's
'crash me if hackbench is running' test :)
If you have any suggestions on where to look, I'm all ears.
My stack traces look like:
Running with 10*40 (== 400) tasks.
[ 12.032571] kernel BUG at ../mm/slub.c:3413!
[ 12.036004] [<ffffffff810c1913>] rcu_process_callbacks+0x283/0x680
[ 12.036004] [<ffffffff810632d0>] __do_softirq+0xf0/0x2b0
[ 12.036004] [<ffffffff8107dff9>] ? kthread+0xc9/0xe0
[ 12.036004] [<ffffffff810634b1>] run_ksoftirqd+0x21/0x50
or
[ 25.788074] kernel tried to execute NX-protected page - exploit
attempt? (uid: 0)
[ 25.788801] BUG: unable to handle kernel paging request at
ffff88000b1ea2a0
[ 25.800085] [<ffffffff810c18ba>] ? rcu_process_callbacks+0x22a/0x680
[ 25.800085] [<ffffffff810632d0>] __do_softirq+0xf0/0x2b0
[ 25.800085] [<ffffffff816d6d7c>] do_softirq_own_stack+0x1c/0x30
[ 25.800085] <EOI>
[ 25.800085] [<ffffffff8106351d>] do_softirq+0x3d/0x40
[ 25.800085] [<ffffffff810635ba>] __local_bh_enable_ip+0x9a/0xb0
[ 25.800085] [<ffffffff816d4c1b>] _raw_spin_unlock_bh+0x1b/0x20
[ 25.800085] [<ffffffff811f3e42>] bdi_writeback_workfn+0x1b2/0x470
next prev parent reply other threads:[~2015-06-11 22:08 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-06-11 7:25 latency histogram with BPF Daniel Wagner
2015-06-11 22:08 ` Alexei Starovoitov [this message]
2015-06-12 6:12 ` Daniel Wagner
2015-06-12 14:33 ` Daniel Wagner
2015-06-12 17:17 ` Alexei Starovoitov
2015-06-15 9:03 ` Daniel Wagner
2015-06-12 6:58 ` Wangnan (F)
2015-06-12 17:21 ` Alexei Starovoitov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=557A06D7.7030304@plumgrid.com \
--to=ast@plumgrid.com \
--cc=daniel.wagner@bmw-carit.de \
--cc=linux-kernel@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=tom.zanussi@linux.intel.com \
--cc=wangnan0@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox