Re: Is it possible to trace events and its call stack?


From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Qu Wenruo <quwenruo@cn.fujitsu.com>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>,
	linux-perf-users@vger.kernel.org
Subject: Re: Is it possible to trace events and its call stack?
Date: Thu, 12 Jan 2017 17:41:53 -0300
Message-ID: <20170112204153.GD20003@kernel.org>
In-Reply-To: <20170112101658.GA3470@naverao1-tp.localdomain>

Em Thu, Jan 12, 2017 at 03:46:58PM +0530, Naveen N. Rao escreveu:
> On 2017/01/12 03:49PM, Qu Wenruo wrote:
> > Hi,
> > 
> > Is it possible to use perf/ftrace to trace events and its call stack?
> > 
> > [Background]
> > I'm tracing one structure in btrfs, btrfs_bio.
> > That structure is allocated and freed fairly frequently, and its size is
> > not fixed, so no SLAB/SLUB cache is used.
> > 
> > I added trace events (tracepoints, defined in
> > include/trace/events/btrfs.h) to trace the allocation and freeing.
> > They output the pointer address of the structure, along with other info,
> > so I can pair them.
> > 
> > Things went well until I found that some structures are allocated but not freed
> > (no corresponding tracepoint is triggered for the given address).
> > 
> > It's possible that btrfs just forgets to free them, or that btrfs is holding
> > them for some purpose.
> > The kernel memleak detector won't catch the latter case.
> > 
> > That is to say, along with the tracepoint data, I still need the call stack of
> > each call, to determine which code leaks or holds the pointer.
> > 
> > Is it possible to do it using perf or ftrace?
> 
> Yes, use -g option with 'perf record'. In fact, I don't think you even 
> need to add a new tracepoint - you should be able to use kprobes (perf 
> probe) at structure allocation/free points.

Yes, with 'perf record -g', as suggested above, or directly with 'perf trace',
if the volume is not big or if you're OK with an strace-like workflow,
for example:

[root@jouet ~]# perf probe -m btrfs -F btrfs_bio* 
btrfs_bio_alloc
btrfs_bio_clone
btrfs_bio_counter_inc_blocked
btrfs_bio_counter_inc_noblocked
btrfs_bio_counter_sub
btrfs_bio_wq_end_io
[root@jouet ~]# perf probe -m btrfs btrfs_bio_alloc
Added new event:
  probe:btrfs_bio_alloc (on btrfs_bio_alloc in btrfs)

You can now use it in all perf tools, such as:

	perf record -e probe:btrfs_bio_alloc -aR sleep 1

[root@jouet ~]# #perf trace -e write,read,probe:btrfs* 
[root@jouet ~]# mount | grep btrfs
/var/lib/machines.raw on /var/lib/machines type btrfs (rw,relatime,seclabel,space_cache,subvolid=5,subvol=/)
[root@jouet ~]# perf trace --no-syscalls -e probe:btrfs*/max-stack=4/
     0.000 probe:btrfs_bio_alloc:(ffffffffc0aae110))
                                       btrfs_bio_alloc ([btrfs])
                                       write_one_eb ([btrfs])
                                       btree_write_cache_pages ([btrfs])
                                       btree_writepages ([btrfs])
    13.112 probe:btrfs_bio_alloc:(ffffffffc0aae110))
                                       btrfs_bio_alloc ([btrfs])
                                       __extent_writepage_io ([btrfs])
                                       __extent_writepage ([btrfs])
                                       extent_write_cache_pages.isra.43.constprop.60 ([btrfs])
    13.285 probe:btrfs_bio_alloc:(ffffffffc0aae110))
                                       btrfs_bio_alloc ([btrfs])
                                       __extent_writepage_io ([btrfs])
                                       __extent_writepage ([btrfs])
                                       extent_write_cache_pages.isra.43.constprop.60 ([btrfs])
    13.434 probe:btrfs_bio_alloc:(ffffffffc0aae110))
                                       btrfs_bio_alloc ([btrfs])
                                       write_one_eb ([btrfs])
                                       btree_write_cache_pages ([btrfs])
                                       btree_writepages ([btrfs])
    13.454 probe:btrfs_bio_alloc:(ffffffffc0aae110))
                                       btrfs_bio_alloc ([btrfs])
                                       write_one_eb ([btrfs])
                                       btree_write_cache_pages ([btrfs])
                                       btree_writepages ([btrfs])
^C[root@jouet ~]#
[root@jouet ~]# perf probe -l
  probe:btrfs_bio_alloc (on __start_delalloc_inodes+624@git/linux/fs/btrfs/inode.c in btrfs)
[root@jouet ~]# 

This was a system-wide trace; you could also restrict it to a set of threads,
or to work taking place on a specific CPU, etc.

I.e. you could try to isolate a set of CPUs, make sure that the work you
want to trace takes place there, and then trace just those CPUs.
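
A rough sketch of that, assuming CPU 3 has been set aside and ./my-workload
stands in for whatever drives the btrfs I/O (both are placeholders, not taken
from this thread). The commands are only printed here, since running them
needs root and the probe added earlier:

```shell
CPU=3                      # assumption: a CPU reserved for this test
WORKLOAD=./my-workload     # hypothetical command that triggers the btrfs I/O

# Trace only that CPU, keeping four stack entries per event.
trace_cmd="perf trace --no-syscalls -C $CPU -e probe:btrfs_bio_alloc/max-stack=4/"
# Pin the workload to the same CPU so its events land where we are looking.
work_cmd="taskset -c $CPU $WORKLOAD"

echo "$trace_cmd"
echo "$work_cmd"
```

Pinning the userspace workload does not by itself guarantee that kernel worker
threads doing the writeback stay on that CPU, so check where the events
actually show up before trusting a per-CPU trace.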

Use 'perf trace -h topic' to see options related to a topic, e.g.:

[root@jouet ~]# perf trace -h cpu

 Usage: perf trace [<options>] [<command>]
    or: perf trace [<options>] -- <command> [<options>]
    or: perf trace record [<options>] [<command>]
    or: perf trace record [<options>] -- <command> [<options>]

    -a, --all-cpus        system-wide collection from all CPUs
    -C, --cpu <cpu>       list of cpus to monitor

[root@jouet ~]#

Remove that --no-syscalls to see strace-like output for syscalls (enter + exit,
time taken, only syscalls lasting more than N milliseconds.microseconds, etc)
 
> A more efficient way would probably be to use an eBPF program with stack maps
> to track the stack traces.

If wanting to do aggregation inside the kernel, yes.
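
When aggregating in userspace is acceptable, the pairing of allocations and
frees can be done on the recorded events afterwards. A minimal sketch, assuming
the events have already been reduced to lines of the form "alloc <address>" or
"free <address>" (a made-up format, not what perf prints by default); the
function name unpaired_allocs is hypothetical as well:

```shell
# Print addresses that were allocated but never freed.
# Input (hypothetical format): one event per line, "alloc <addr>" or "free <addr>".
unpaired_allocs() {
    awk '
        # count outstanding allocations per address
        $1 == "alloc" { live[$2]++ }
        # match a free against an earlier allocation of the same address
        $1 == "free"  { if (live[$2] > 0) live[$2]-- }
        # whatever is still outstanding at the end was not freed
        END { for (a in live) if (live[a] > 0) print a }
    ' "$@" | sort
}

# Example with made-up addresses:
printf '%s\n' 'alloc 0xffff8800aa' 'alloc 0xffff8800bb' 'free 0xffff8800aa' \
    | unpaired_allocs
```

Counting instead of keeping a simple set matters because the same address can
be reused by a later allocation after it has been freed.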

- Arnaldo
 
> - Naveen
