From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>,
linux-perf-users@vger.kernel.org
Subject: Re: Is it possible to trace events and its call stack?
Date: Thu, 12 Jan 2017 17:41:53 -0300 [thread overview]
Message-ID: <20170112204153.GD20003@kernel.org> (raw)
In-Reply-To: <20170112101658.GA3470@naverao1-tp.localdomain>
On Thu, Jan 12, 2017 at 03:46:58PM +0530, Naveen N. Rao wrote:
> On 2017/01/12 03:49PM, Qu Wenruo wrote:
> > Hi,
> >
> > Is it possible to use perf/ftrace to trace events and their call stacks?
> >
> > [Background]
> > It's one structure in btrfs, btrfs_bio, that I'm tracing.
> > That structure is allocated and freed fairly frequently, and its size is
> > not fixed, so no SLAB/SLUB cache is used.
> >
> > I added trace events (or tracepoints, anyway, just in
> > include/trace/events/btrfs.h) to trace the allocation and freeing.
> > They output the pointer address of that structure, so I can pair them up,
> > along with other info.
> >
> > Things went well until I found that some structures are allocated but never
> > freed (no corresponding tracepoint is triggered for a given address).
> >
> > It's possible that btrfs just forgets to free it, or that btrfs is just
> > holding it for some purpose.
> > The kernel memleak detector won't catch the latter case.
> >
> > That is to say, along with the tracepoint data, I still need the call stack
> > of each call, to determine the code that leaks or holds the pointer.
> >
> > Is it possible to do it using perf or ftrace?
>
> Yes, use the -g option with 'perf record'. In fact, I don't think you even
> need to add a new tracepoint - you should be able to use kprobes (perf
> probe) at the structure allocation/free points.
Yes, with 'perf record -g', as suggested above, or directly with 'perf trace',
if the volume is not too big or if you're OK with using a strace-like workflow,
for example:
[root@jouet ~]# perf probe -m btrfs -F btrfs_bio*
btrfs_bio_alloc
btrfs_bio_clone
btrfs_bio_counter_inc_blocked
btrfs_bio_counter_inc_noblocked
btrfs_bio_counter_sub
btrfs_bio_wq_end_io
[root@jouet ~]# perf probe -m btrfs btrfs_bio_alloc
Added new event:
probe:btrfs_bio_alloc (on btrfs_bio_alloc in btrfs)
You can now use it in all perf tools, such as:
perf record -e probe:btrfs_bio_alloc -aR sleep 1
[root@jouet ~]# #perf trace -e write,read,probe:btrfs*
[root@jouet ~]# mount | grep btrfs
/var/lib/machines.raw on /var/lib/machines type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/)
[root@jouet ~]# perf trace --no-syscalls -e probe:btrfs*/max-stack=4/
0.000 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
write_one_eb ([btrfs])
btree_write_cache_pages ([btrfs])
btree_writepages ([btrfs])
13.112 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
__extent_writepage_io ([btrfs])
__extent_writepage ([btrfs])
extent_write_cache_pages.isra.43.constprop.60 ([btrfs])
13.285 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
__extent_writepage_io ([btrfs])
__extent_writepage ([btrfs])
extent_write_cache_pages.isra.43.constprop.60 ([btrfs])
13.434 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
write_one_eb ([btrfs])
btree_write_cache_pages ([btrfs])
btree_writepages ([btrfs])
13.454 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
write_one_eb ([btrfs])
btree_write_cache_pages ([btrfs])
btree_writepages ([btrfs])
^C[root@jouet ~]#
[root@jouet ~]# perf probe -l
probe:btrfs_bio_alloc (on __start_delalloc_inodes+624@git/linux/fs/btrfs/inode.c in btrfs)
[root@jouet ~]#
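For the 'perf record -g' route suggested above, a minimal sketch, reusing the
kprobe added earlier (with your own tracepoints it would be
'-e btrfs:<your_alloc_event>' instead), could be:

  # record system wide with call graphs while the workload runs
  perf record -g -e probe:btrfs_bio_alloc -a sleep 10
  # dump every hit with its callchain, then pair alloc/free offline
  perf script

Note that pairing alloc/free by pointer needs the address in the event payload,
which your own tracepoints already provide; with plain kprobes you would have
to capture it explicitly (e.g. a return probe with $retval on the allocator).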
That 'perf trace' session was a system-wide record; you could do it just for a
set of threads, or for work taking place on a specific CPU, etc.
I.e. you could try to isolate a set of CPUs, make sure that the work you want
to trace takes place there, and then trace just those CPUs, etc.
Use 'perf trace -h topic' to see options related to a topic, e.g.:
[root@jouet ~]# perf trace -h cpu
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-a, --all-cpus system-wide collection from all CPUs
-C, --cpu <cpu> list of cpus to monitor
[root@jouet ~]#
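E.g., just as an illustration (the CPU numbers and the workload placeholder are
arbitrary), pin the work to a couple of CPUs and trace only those:

  # pin the workload to CPUs 2-3, then trace just those CPUs
  taskset -c 2,3 <your_workload> &
  perf trace --no-syscalls -C 2,3 -e probe:btrfs*/max-stack=4/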
Remove that --no-syscalls to see strace-like output for the syscalls (enter +
exit, time it takes, only syscalls taking more than N
milliseconds.microseconds, etc.).
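For instance, a sketch (the 1.5 ms threshold is arbitrary):

  # strace-like output for read/write plus the btrfs probes, showing
  # only syscalls that took longer than 1.5 ms
  perf trace --duration 1.5 -e write,read,probe:btrfs*/max-stack=4/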
> A more efficient way would probably be to use an eBPF program with stackmaps
> to track the stack traces.
Yes, if you want to do the aggregation inside the kernel.
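As a rough sketch of that direction, assuming bpftrace is available on the box
(not something used in this thread), counting allocation stacks in an in-kernel
map could look like:

  # aggregate kernel stacks leading to btrfs_bio_alloc inside the kernel,
  # counts are printed when the tracer exits
  bpftrace -e 'kprobe:btrfs_bio_alloc { @stacks[kstack] = count(); }'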
- Arnaldo
> - Naveen
>