Re: latency histogram with BPF

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Daniel Wagner <daniel.wagner@bmw-carit.de>
To: Alexei Starovoitov <ast@plumgrid.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Wang Nan <wangnan0@huawei.com>
Subject: Re: latency histogram with BPF
Date: Fri, 12 Jun 2015 08:12:12 +0200	[thread overview]
Message-ID: <557A783C.5080803@bmw-carit.de> (raw)
In-Reply-To: <557A06D7.7030304@plumgrid.com>

On 06/12/2015 12:08 AM, Alexei Starovoitov wrote:
> On 6/11/15 12:25 AM, Daniel Wagner wrote:
>> In both cases BPF or based on Tom's 'hist' triggers' patches, there is
>> some trickery necessary to get it working. While the first approach
>> has more flexibility what you want to measure or how you want to
>> present it, I suspect it will be harder to get it working/accepted.
> 
> why? Out of your patch only 2 lines are for kernel, which I wanted
> to add anyway. The sample code also looks good.

I had the impression that attaching kprobes to trace_preempt_[on|off] is
not going to work. But you seem confident to get this working :)

>> Obviously, adding kprobes to trace_preempt_[on|off] is
>> asking for problems:
> 
> thanks for reporting.
> I reproduced it with hackbench, but my stack trace looks completely
> different. Looks like some memory corruption is happening.
> Not clear where. I'm still debugging.

I have noticed that also network load triggers a panic. Most of the time
softirq is involved somehow. So far I haven't spend time looking into it
because I though that can't work anyway. I'll stare a bit more at it now.

>> CPU 3
>>        latency        : count     distribution
>>         1 -> 1        : 0       
>> |                                                  |
>>         2 -> 3        : 0       
>> |                                                  |
>>         4 -> 7        : 0       
>> |                                                  |
>>         8 -> 15       : 25182   
>> |************************************************* |
>>        16 -> 31       : 1675    
>> |**                                                |
>>        32 -> 63       : 0       
>> |                                                  |
>>        64 -> 127      : 0       
>> |                                                  |
>>       128 -> 255      : 0       
>> |                                                  |
>>       256 -> 511      : 0       
>> |                                                  |
>>       512 -> 1023     : 0       
>> |                                                  |
>>
>> The numbers look a bit too nice and low. I suspect something is going
>> wrong.
> 
> I see similar numbers. 25k over 5 sec = 5k preempt on/off per second
> which sounds about right for idle.

I expected to see some higher latencies it stays pretty low the whole
time. With Tom patches I got a histogram which looked like this:

https://lkml.org/lkml/2015/4/30/196

# trigger info:
hist:keys=latency.bucket:vals=:sort=latency.bucket:size=2048 [active]

latency:          0 hitcount:     558648
latency:        256 hitcount:      70163
latency:        512 hitcount:      24601
latency:        768 hitcount:       9649
latency:       1024 hitcount:       4578
latency:       1280 hitcount:       2623
latency:       1536 hitcount:       1642
latency:       1792 hitcount:       1281
latency:       2048 hitcount:       1203
latency:       2304 hitcount:        942
latency:       2560 hitcount:        713
latency:       2816 hitcount:        513
latency:       3072 hitcount:        436
latency:       3328 hitcount:        347
latency:       3584 hitcount:        312
latency:       3840 hitcount:        279
latency:       4096 hitcount:        269
latency:       4352 hitcount:        257
latency:       4608 hitcount:        236
latency:       4864 hitcount:        214
latency:       5120 hitcount:        190
latency:       5376 hitcount:        129
latency:       5632 hitcount:        159
latency:       5888 hitcount:        141
latency:       6144 hitcount:        118
latency:       6400 hitcount:        107
latency:       6656 hitcount:         99
latency:       6912 hitcount:         73
latency:       7168 hitcount:         70
latency:       7424 hitcount:         80
latency:       7680 hitcount:         58
latency:       7936 hitcount:         65
latency:       8192 hitcount:         57
latency:       8448 hitcount:         59
latency:       8704 hitcount:         67
latency:       8960 hitcount:         64
latency:       9216 hitcount:         82
latency:       9472 hitcount:         89
latency:       9728 hitcount:         74
latency:       9984 hitcount:         89
latency:      10240 hitcount:         65

This distribution seems to be consistent with the latency-hist.patch
from -rt. That patch has been used for a while and it looks like it
reports the right values.

https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.14-rt&id=0eba3c128347cc82fff093af1c61b2211985b4c9

>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>> index 2d56ce5..e4d3e76 100644
>> --- a/kernel/trace/bpf_trace.c
>> +++ b/kernel/trace/bpf_trace.c
>> @@ -172,6 +172,8 @@ static const struct bpf_func_proto
>> *kprobe_prog_func_proto(enum bpf_func_id func
>>           return &bpf_probe_read_proto;
>>       case BPF_FUNC_ktime_get_ns:
>>           return &bpf_ktime_get_ns_proto;
>> +    case BPF_FUNC_get_smp_processor_id:
>> +        return &bpf_get_smp_processor_id_proto;
> 
> This part I'll take into my set of patches.

Great.

>> --- a/samples/bpf/bpf_helpers.h
>> +++ b/samples/bpf/bpf_helpers.h
>> @@ -21,6 +21,8 @@ static unsigned long long (*bpf_ktime_get_ns)(void) =
>>       (void *) BPF_FUNC_ktime_get_ns;
>>   static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
>>       (void *) BPF_FUNC_trace_printk;
>> +static unsigned int (*bpf_get_smp_processor_id)(void) =
>> +    (void *) BPF_FUNC_get_smp_processor_id;
> 
> this part is already there in net-next.

Ah good to know. FWIW, I had to pick some other snippets from other
trees to get it working. All those patches will hit mainline in some
near future.

>> diff --git a/samples/bpf/hwlathist_kern.c b/samples/bpf/hwlathist_kern.c
>> new file mode 100644
>> index 0000000..7a97e7e
>> --- /dev/null
>> +++ b/samples/bpf/hwlathist_kern.c
> 
> looks good and useful, but I would like to find and fix the bug first
> before exposing this bit as a 'sample', since right now it's
> 'crash me if hackbench is running' test :)

Agreed. :)

> If you have any suggestions on where to look, I'm all ears.
> My stack traces look like:
> Running with 10*40 (== 400) tasks.
> [   12.032571] kernel BUG at ../mm/slub.c:3413!
> [   12.036004]  [<ffffffff810c1913>] rcu_process_callbacks+0x283/0x680
> [   12.036004]  [<ffffffff810632d0>] __do_softirq+0xf0/0x2b0
> [   12.036004]  [<ffffffff8107dff9>] ? kthread+0xc9/0xe0
> [   12.036004]  [<ffffffff810634b1>] run_ksoftirqd+0x21/0x50
> or
> [   25.788074] kernel tried to execute NX-protected page - exploit
> attempt? (uid: 0)
> [   25.788801] BUG: unable to handle kernel paging request at
> ffff88000b1ea2a0
> [   25.800085]  [<ffffffff810c18ba>] ? rcu_process_callbacks+0x22a/0x680
> [   25.800085]  [<ffffffff810632d0>] __do_softirq+0xf0/0x2b0
> [   25.800085]  [<ffffffff816d6d7c>] do_softirq_own_stack+0x1c/0x30
> [   25.800085]  <EOI>
> [   25.800085]  [<ffffffff8106351d>] do_softirq+0x3d/0x40
> [   25.800085]  [<ffffffff810635ba>] __local_bh_enable_ip+0x9a/0xb0
> [   25.800085]  [<ffffffff816d4c1b>] _raw_spin_unlock_bh+0x1b/0x20
> [   25.800085]  [<ffffffff811f3e42>] bdi_writeback_workfn+0x1b2/0x470

I see crashes as well with do_softirq() and RCU in it as well.

cheers,
daniel

next prev parent reply	other threads:[~2015-06-12  6:12 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-11  7:25 latency histogram with BPF Daniel Wagner
2015-06-11 22:08 ` Alexei Starovoitov
2015-06-12  6:12   ` Daniel Wagner [this message]
2015-06-12 14:33     ` Daniel Wagner
2015-06-12 17:17       ` Alexei Starovoitov
2015-06-15  9:03         ` Daniel Wagner
2015-06-12  6:58 ` Wangnan (F)
2015-06-12 17:21   ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=557A783C.5080803@bmw-carit.de \
    --to=daniel.wagner@bmw-carit.de \
    --cc=ast@plumgrid.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tom.zanussi@linux.intel.com \
    --cc=wangnan0@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.