From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>,
linux-perf-users@vger.kernel.org
Subject: Re: Is it possible to trace events and its call stack?
Date: Thu, 12 Jan 2017 17:41:53 -0300 [thread overview]
Message-ID: <20170112204153.GD20003@kernel.org> (raw)
In-Reply-To: <20170112101658.GA3470@naverao1-tp.localdomain>
On Thu, Jan 12, 2017 at 03:46:58PM +0530, Naveen N. Rao wrote:
> On 2017/01/12 03:49PM, Qu Wenruo wrote:
> > Hi,
> >
> > Is it possible to use perf/ftrace to trace events and their call stacks?
> >
> > [Background]
> > It's one structure in btrfs, btrfs_bio, that I'm tracing.
> > That structure is allocated and freed fairly frequently, and its size is
> > not fixed, so no SLAB/SLUB cache is used.
> >
> > I added trace events (or tracepoints, anyway, just in
> > include/trace/events/btrfs.h) to trace the allocation and freeing.
> > They output the pointer address of that structure, so I can pair them up,
> > along with other info.
> >
> > Things went well until I found that some structures are allocated but never
> > freed (no corresponding tracepoint is triggered for a given address).
> >
> > It's possible that btrfs just forgets to free it, or that btrfs is just
> > holding it for some purpose.
> > The kernel memleak detector won't catch the latter case.
> >
> > That is to say, along with the tracepoint data, I still need the call stack
> > of each call, to determine the code that leaks or holds the pointer.
> >
> > Is it possible to do it using perf or ftrace?
>
> Yes, use the -g option with 'perf record'. In fact, I don't think you even
> need to add a new tracepoint - you should be able to use kprobes (perf
> probe) at the structure allocation/free points.
Yes, with 'perf record -g', as suggested above, or directly with 'perf trace',
if the volume is not too big or if you're OK with using a strace-like workflow,
for example:
[root@jouet ~]# perf probe -m btrfs -F btrfs_bio*
btrfs_bio_alloc
btrfs_bio_clone
btrfs_bio_counter_inc_blocked
btrfs_bio_counter_inc_noblocked
btrfs_bio_counter_sub
btrfs_bio_wq_end_io
[root@jouet ~]# perf probe -m btrfs btrfs_bio_alloc
Added new event:
probe:btrfs_bio_alloc (on btrfs_bio_alloc in btrfs)
You can now use it in all perf tools, such as:
perf record -e probe:btrfs_bio_alloc -aR sleep 1
[root@jouet ~]# #perf trace -e write,read,probe:btrfs*
[root@jouet ~]# mount | grep btrfs
/var/lib/machines.raw on /var/lib/machines type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/)
[root@jouet ~]# perf trace --no-syscalls -e probe:btrfs*/max-stack=4/
0.000 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
write_one_eb ([btrfs])
btree_write_cache_pages ([btrfs])
btree_writepages ([btrfs])
13.112 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
__extent_writepage_io ([btrfs])
__extent_writepage ([btrfs])
extent_write_cache_pages.isra.43.constprop.60 ([btrfs])
13.285 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
__extent_writepage_io ([btrfs])
__extent_writepage ([btrfs])
extent_write_cache_pages.isra.43.constprop.60 ([btrfs])
13.434 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
write_one_eb ([btrfs])
btree_write_cache_pages ([btrfs])
btree_writepages ([btrfs])
13.454 probe:btrfs_bio_alloc:(ffffffffc0aae110))
btrfs_bio_alloc ([btrfs])
write_one_eb ([btrfs])
btree_write_cache_pages ([btrfs])
btree_writepages ([btrfs])
^C[root@jouet ~]#
[root@jouet ~]# perf probe -l
probe:btrfs_bio_alloc (on __start_delalloc_inodes+624@git/linux/fs/btrfs/inode.c in btrfs)
[root@jouet ~]#
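For the 'perf record -g' route suggested above, a minimal sketch, reusing the
kprobe added earlier (with your own tracepoints it would be
'-e btrfs:<your_alloc_event>' instead), could be:

  # record system wide with call graphs while the workload runs
  perf record -g -e probe:btrfs_bio_alloc -a sleep 10
  # dump every hit with its callchain, then pair alloc/free offline
  perf script

Note that pairing alloc/free by pointer needs the address in the event payload,
which your own tracepoints already provide; with plain kprobes you would have
to capture it explicitly (e.g. a return probe with $retval on the allocator).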
That 'perf trace' session was a system-wide record; you could do it just for a
set of threads, or for work taking place on a specific CPU, etc.
I.e. you could try to isolate a set of CPUs, make sure that the work you want
to trace takes place there, and then trace just those CPUs, etc.
Use 'perf trace -h topic' to see options related to a topic, e.g.:
[root@jouet ~]# perf trace -h cpu
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-a, --all-cpus system-wide collection from all CPUs
-C, --cpu <cpu> list of cpus to monitor
[root@jouet ~]#
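E.g., just as an illustration (the CPU numbers and the workload placeholder are
arbitrary), pin the work to a couple of CPUs and trace only those:

  # pin the workload to CPUs 2-3, then trace just those CPUs
  taskset -c 2,3 <your_workload> &
  perf trace --no-syscalls -C 2,3 -e probe:btrfs*/max-stack=4/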
Remove that --no-syscalls to see strace-like output for the syscalls (enter +
exit, time it takes, only syscalls taking more than N
milliseconds.microseconds, etc.).
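For instance, a sketch (the 1.5 ms threshold is arbitrary):

  # strace-like output for read/write plus the btrfs probes, showing
  # only syscalls that took longer than 1.5 ms
  perf trace --duration 1.5 -e write,read,probe:btrfs*/max-stack=4/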
> A more efficient way would probably be to use an eBPF program with stackmaps
> to track the stack traces.
Yes, if you want to do the aggregation inside the kernel.
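As a rough sketch of that direction, assuming bpftrace is available on the box
(not something used in this thread), counting allocation stacks in an in-kernel
map could look like:

  # aggregate kernel stacks leading to btrfs_bio_alloc inside the kernel,
  # counts are printed when the tracer exits
  bpftrace -e 'kprobe:btrfs_bio_alloc { @stacks[kstack] = count(); }'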
- Arnaldo
> - Naveen
>