From: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
To: sahil aggarwal <sahil.agg15@gmail.com>
Cc: linux-perf-users@vger.kernel.org
Subject: Re: Sampling on sched:sched_switch
Date: Tue, 14 Apr 2015 11:22:26 -0300
Message-ID: <20150414142226.GL16027@kernel.org>
In-Reply-To: <CAGAANTV1gQFd0_WZbGTG_+ASZZdDmfAecr_ddchnWGB+dMyVDA@mail.gmail.com>

On Tue, Apr 14, 2015 at 07:00:43PM +0530, sahil aggarwal wrote:
> > This is incomplete: "multiple events for all CPUS" ok, but for a specific
> > thread? Or for all of them?
 
> If i have 2 CPU's, i made it run 2 threads each per CPU. Both threads
> have different streams for same tracepoints polling on them. Well, had
> to go with this approach since enabling tracepoints on all CPU when
> inherit was not working for me :(

My question stands: are you monitoring a specific thread, or all
threads in the system?
 
> > See this, look for the inherit flag, then look for the CPU arg to
> > sys_perf_event_open, many tracepoints, a process that creates a process that
> > creates a process that makes a networking call that hits net:*skb* tracepoints,
> > is something like that that you want?
 
> This is exactly what i want. But things seem to be working in unexpected way.

Read on.
 
> > [root@zoo ~]# perf stat -vv -e sched:* -e skb:* time time ping -c 1 127.0.0.1
> > <SNIP>
> > ------------------------------------------------------------
> > perf_event_attr:
> >   type                             2
> >   size                             112
> >   config                           10b
> >   { sample_period, sample_freq }   1
> >   sample_type                      TIME|CPU|PERIOD|RAW
> >   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> >   disabled                         1
> >   inherit                          1
> >   enable_on_exec                   1
> >   exclude_guest                    1
> > ------------------------------------------------------------
> > sys_perf_event_open: pid 9944  cpu -1  group_fd -1  flags 0x8
> > ------------------------------------------------------------
> 
> I think i should figure out why mmap is failing when inherit=1, seeing
> this it doesn't make sense. Will get back to you if i find something.

'perf stat' doesn't use mmap, it's just counting events. Let me try
mmapping that same workload:

[root@zoo ~]# trace -o /tmp/trace.output -vv --ev sched:* --ev skb:* time time ping -c 1 127.0.0.1
<SNIP>
------------------------------------------------------------
perf_event_attr:
  type                             2
  size                             112
  config                           10b
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  read_format                      ID
  disabled                         1
  inherit                          1
  mmap                             1
  comm                             1
  enable_on_exec                   1
  task                             1
  sample_id_all                    1
  exclude_guest                    1
  mmap2                            1
  comm_exec                        1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 19784  cpu 0  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 1  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 2  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 3  group_fd -1  flags 0x8
------------------------------------------------------------
perf_event_attr:
  type                             2
  size                             112
  config                           10a
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  read_format                      ID
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  sample_id_all                    1
  exclude_guest                    1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 19784  cpu 0  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 1  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 2  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 3  group_fd -1  flags 0x8
------------------------------------------------------------
perf_event_attr:
  type                             2
  size                             112
  config                           109
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|ID|CPU|PERIOD|RAW
  read_format                      ID
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  sample_id_all                    1
  exclude_guest                    1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 19784  cpu 0  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 1  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 2  group_fd -1  flags 0x8
sys_perf_event_open: pid 19784  cpu 3  group_fd -1  flags 0x8
------------------------------------------------------------
<SNIP>
mmap size 528384B
perf event ring buffer mmapped per cpu
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.050 ms

--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.050/0.050/0.050/0.000 ms
0.00user 0.00system 0:00.00elapsed 50%CPU (0avgtext+0avgdata 2112maxresident)k
0inputs+0outputs (0major+100minor)pagefaults 0swaps
0.00user 0.00system 0:00.00elapsed 60%CPU (0avgtext+0avgdata 2112maxresident)k
0inputs+0outputs (0major+180minor)pagefaults 0swaps
[root@zoo ~]# 

OK, it works, but notice that it creates one file descriptor per event per
CPU (this machine has 4 CPUs) and then uses an ioctl to ask the kernel to
send all the events on a given CPU to the same ring buffer, so we end up
with just 4 ring buffers (perf mmaps), one per CPU:

From tools/perf/util/evlist.c, function perf_evlist__mmap_per_evsel():

  if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
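
To make the descriptor and ring-buffer bookkeeping described above concrete,
here is a minimal sketch in C. It is a model only: `struct fd_plan` and
`plan_per_cpu_output()` are hypothetical names made up for this illustration,
not perf's own data structures, and no perf_event_open(), mmap() or ioctl()
call is actually issued. It just records, for each (event, CPU) pair, which
descriptor's ring buffer the samples would be redirected to, assuming the
first event opened on each CPU is the one that gets mmapped.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for illustration; not perf's data structures. */
struct fd_plan {
    int event;      /* index of the tracepoint in the event list */
    int cpu;        /* CPU this descriptor is bound to */
    int output_to;  /* plan[] index of the descriptor whose ring buffer
                       receives this descriptor's samples */
    int owns_mmap;  /* 1 if this descriptor is the one that is mmapped */
};

/*
 * Fill plan[] for nr_events * nr_cpus descriptors, laid out CPU by CPU.
 * The first event on each CPU owns the ring buffer; every other event on
 * that CPU is where a PERF_EVENT_IOC_SET_OUTPUT redirect would be issued.
 * The caller must provide room for nr_events * nr_cpus entries.
 * Returns the number of ring buffers (mmaps) needed.
 */
int plan_per_cpu_output(struct fd_plan *plan, int nr_events, int nr_cpus)
{
    int nr_mmaps = 0;

    for (int cpu = 0; cpu < nr_cpus; cpu++) {
        int leader = cpu * nr_events;  /* first descriptor on this CPU */

        for (int ev = 0; ev < nr_events; ev++) {
            struct fd_plan *p = &plan[leader + ev];

            p->event = ev;
            p->cpu = cpu;
            p->output_to = leader;
            p->owns_mmap = (ev == 0);
            nr_mmaps += p->owns_mmap;
        }
    }
    return nr_mmaps;
}
```

With the three tracepoints shown above and 4 CPUs this gives 12 descriptors
and 4 ring buffers; the real event list is longer, since the output was
snipped.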

That is why I asked what you are monitoring; so far that is not clear
to me.

Note that above we are not asking the kernel for cpu == -1 and thread == -1;
that combination is what results in the -EINVAL you saw, a restriction that
is there for scalability reasons.

System-wide monitoring is done with CPU = N and thread = -1, with one mmap
per CPU.
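
The constraints on the cpu and thread arguments can be summarized in a small
model. This is a sketch, not kernel code: `model_open()` and `model_mmap()`
are hypothetical helpers invented here. They assume two rules: the
perf_event_open(2) man page documents pid == -1 together with cpu == -1 as
invalid, and, as far as I can tell, the kernel's mmap handler refuses an
inherited counter that is not bound to a CPU. The real checks involve more
conditions than these.

```c
#include <errno.h>

/*
 * Simplified model of the open-time rule: pid == -1 (all threads) combined
 * with cpu == -1 (any CPU) is rejected. Hypothetical helper, not a kernel API.
 */
int model_open(int pid, int cpu)
{
    return (pid == -1 && cpu == -1) ? -EINVAL : 0;
}

/*
 * Simplified model of the mmap-time rule: an inherited event with cpu == -1
 * cannot be mmapped, since all children would write to one ring buffer.
 * Hypothetical helper, not a kernel API.
 */
int model_mmap(int cpu, int inherit)
{
    return (cpu == -1 && inherit) ? -EINVAL : 0;
}
```

Under these assumptions the 'perf stat' case (pid 9944, cpu -1, inherit 1)
opens fine but could not be mmapped, while the per-CPU descriptors from the
trace run (pid 19784, cpu 0..3, inherit 1) pass both checks.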

- Arnaldo

