linux-perf-users.vger.kernel.org archive mirror
From: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
To: sahil aggarwal <sahil.agg15@gmail.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: Sampling on sched:sched_switch
Date: Tue, 14 Apr 2015 11:22:26 -0300	[thread overview]
Message-ID: <20150414142226.GL16027@kernel.org> (raw)
In-Reply-To: <CAGAANTV1gQFd0_WZbGTG_+ASZZdDmfAecr_ddchnWGB+dMyVDA@mail.gmail.com>

Em Tue, Apr 14, 2015 at 07:00:43PM +0530, sahil aggarwal escreveu:
> > This is incomplete: "multiple events for all CPUS" ok, but for a specific
> > thread? Or for all of them?
 
> If i have 2 CPU's, i made it run 2 threads each per CPU. Both threads
> have different streams for same tracepoints polling on them. Well, had
> to go with this approach since enabling tracepoints on all CPU when
> inherit was not working for me :(

My question stands: "Are you monitoring a specific thread? Or all
threads in the system"?
 
> > See this, look for the inherit flag, then look for the CPU arg to
> > sys_perf_event_open, many tracepoints, a process that creates a process that
> > creates a process that makes a networking call that hits net:*skb* tracepoints,
> > is something like that that you want?
 
> This is exactly what i want. But things seem to be working in unexpected way.

Read below
 
> > [root@zoo ~]# perf stat -vv -e sched:* -e skb:* time time ping -c 1 127.0.0.1
> > <SNIP>
> > ------------------------------------------------------------
> > perf_event_attr:
> >   type                             2
> >   size                             112
> >   config                           10b
> >   { sample_period, sample_freq }   1
> >   sample_type                      TIME|CPU|PERIOD|RAW
> >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> >   disabled                         1
> >   inherit                          1
> >   enable_on_exec                   1
> >   exclude_guest                    1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 9944  cpu -1  group_fd -1  flags 0x8
> > ------------------------------------------------------------
> 
> I think i should figure out why mmap is failing when inherit=1, seeing
> this it doesn't make sense. Will get back to you if i find something.

'perf stat' doesn't use mmap, it is just counting events. Let me try
mmapping that same workload:

[root@zoo ~]# trace -o /tmp/trace.output -vv --ev sched:* --ev skb:* time time ping -c 1 127.0.0.1
<SNIP>
------------------------------------------------------------
perf_event_attr:
  type                             2
  size                             112
  config                           10b
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  read_format                      ID
  disabled                         1
  inherit                          1
  mmap                             1
  comm                             1
  enable_on_exec                   1
  task                             1
  sample_id_all                    1
  exclude_guest                    1
  mmap2                            1
  comm_exec                        1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 19784  cpu 0  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 1  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 2  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 3  group_fd -1  flags 0x8
------------------------------------------------------------
perf_event_attr:
  type                             2
  size                             112
  config                           10a
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  read_format                      ID
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  sample_id_all                    1
  exclude_guest                    1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 19784  cpu 0  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 1  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 2  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 3  group_fd -1  flags 0x8
------------------------------------------------------------
perf_event_attr:
  type                             2
  size                             112
  config                           109
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  read_format                      ID
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  sample_id_all                    1
  exclude_guest                    1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 19784  cpu 0  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 1  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 2  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 3  group_fd -1  flags 0x8
------------------------------------------------------------
<SNIP>
mmap size 528384B
perf event ring buffer mmapped per cpu
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.050 ms

--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.050/0.050/0.050/0.000 ms
0.00user 0.00system 0:00.00elapsed 50%CPU (0avgtext+0avgdata 2112maxresident)k
0inputs+0outputs (0major+100minor)pagefaults 0swaps
0.00user 0.00system 0:00.00elapsed 60%CPU (0avgtext+0avgdata 2112maxresident)k
0inputs+0outputs (0major+180minor)pagefaults 0swaps
[root@zoo ~]# 

Ok, it works, but notice that it will create one file descriptor per event per
CPU (this machine has 4 CPUs), and then it will use an ioctl to ask the kernel
to send the samples for all the events on a given CPU to the same ring buffer,
so we end up with just 4 ring buffers (perf mmaps), one per CPU:

From tools/perf/util/evlist.c, function perf_evlist__mmap_per_evsel():

  if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)

That is why I asked what you are monitoring; so far that is still not clear
to me.

Above, we are not asking the kernel for cpu == -1 and thread == -1: that
combination results in that -EINVAL, a restriction that is there for
scalability reasons.

System-wide monitoring is instead done with cpu = N and thread = -1, with one
mmap per CPU.

- Arnaldo

Thread overview: 10+ messages
2015-04-13 13:04 Sampling on sched:sched_switch sahil aggarwal
2015-04-13 15:11 ` Arnaldo Carvalho de Melo
2015-04-14  5:42   ` sahil aggarwal
2015-04-14 12:15     ` Arnaldo Carvalho de Melo
2015-04-14 12:30       ` sahil aggarwal
2015-04-14 12:59         ` Arnaldo Carvalho de Melo
2015-04-14 13:30           ` sahil aggarwal
2015-04-14 14:22             ` Arnaldo Carvalho de Melo [this message]
2015-04-14 14:55               ` sahil aggarwal
2015-04-15  6:17                 ` sahil aggarwal