public inbox for linux-kernel@vger.kernel.org
From: Cong Wang <xiyou.wangcong@gmail.com>
To: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com>,
	Lin Ming <mlin@ss.pku.edu.cn>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mingo@elte.hu" <mingo@elte.hu>,
	"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>,
	"a.p.zijlstra@chello.nl" <a.p.zijlstra@chello.nl>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	"Zuo, Jiao" <jiao.zuo@intel.com>
Subject: Re: [RFC 1/2] kernel patch for dump user space stack tool
Date: Thu, 19 Apr 2012 14:13:19 +0800
Message-ID: <4F8FACFF.9070107@gmail.com>
In-Reply-To: <1334812653.14538.29.camel@ymzhang.sh.intel.com>

On 04/19/2012 01:17 PM, Yanmin Zhang wrote:
> On Thu, 2012-04-19 at 11:50 +0800, Cong Wang wrote:
>> On 04/17/2012 10:37 PM, Tu, Xiaobing wrote:
>>> Resend the patch because of the log is too long on a single line.
>>>
>>> From: xiaobing tu<xiaobing.tu@intel.com>
>>>
>>> Here is the kernel patch for this tool, The idea is to output user space stack call-chain from
>>> /proc/xxx/stack, currently, /proc/xxx/stack only output kernel stack call chain. We extend
>>> it to output user space call chain in hex format
>>>
>>
>> Can you teach me why we still need this as we have pstack?
> Cong,
>
> Sorry for replying so late. Xiaobing told me you sent him email and I
> didn't receive the 1st one you sent out.


Judging by the length of your reply compared with the patch description,
you left a lot of information out of your patch description.

>
> I tried pstack and it does work. It means developers in the world wanted
> the tool long long ago.
>
> Although not checking the source codes of pstack (sorry, I'm busy in debugging
> many critical issues), I think pstack is based on ptrace interface, which means:
> 1) It needs to trap into the system many times to collect the call frames of one
> task.
> 2) It needs to send a signal to the ptraced process to stop it. Such behavior
> might have some impact if the ptraced process also processes many signals.
> 3) The data parsing to get symbols might not be split from data collection.
> I mean, it collects call frames of one process, then parses it; then collects the 2nd
> task's. If there are many processes, it couldn't collect the data just at the monitor
> time point.
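
Point 3 above is about separating data collection from parsing. A minimal
sketch of the "collect everything first, parse later" pattern follows. It
assumes the raw per-task data can be read from files under /proc/<pid>/
(here "stack" and "maps"); the function name snapshot_all and its parameters
are hypothetical, and user space frames would only show up in
/proc/<pid>/stack with the proposed patch applied.

```python
import os
from typing import Dict, Tuple


def snapshot_all(proc_root: str = "/proc",
                 files: Tuple[str, ...] = ("stack", "maps")) -> Dict[int, Dict[str, bytes]]:
    """Read the raw bytes of the given per-task files for every PID.

    No parsing or symbol lookup happens here, so the time spent per task
    is only the time needed to read the files.
    """
    snapshot: Dict[int, Dict[str, bytes]] = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "self" or "meminfo"
        raw: Dict[str, bytes] = {}
        for name in files:
            try:
                with open(os.path.join(proc_root, entry, name), "rb") as f:
                    raw[name] = f.read()
            except OSError:
                # The task may have exited, or we may lack permission.
                continue
        if raw:
            snapshot[int(entry)] = raw
    return snapshot
```

The returned dictionary can be written to disk as-is and parsed later, for
example after a reboot. Reading /proc/<pid>/stack normally requires root.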


Yet another one who wants to "fix" ptrace. ;-)

>
> Why do we work out the tools? The original requirement is from real work.
> We are enabling Android on Medfield. One typical error of Android is ANR.
> When a process couldn't respond in 5 seconds, Android reports an ANR error,
> and dumps JAVA call stack. However, it couldn't dump userspace lib (such as
> bionic, written in C or C++). In addition, Android just dumps the stack of
> the non-responding process. It doesn't dump stack of others. As binder is basic
> framework in Android, processes communicate by binder in the model of client/server.
> When one process is not responding quickly, maybe another process blocks it. We
> need dump that process status.
>
> Many teams complained it's hard to debug such ANR issues, especially the ones which
> are triggered at MTBF testing. Sometimes, an ANR happens after MTBF testing runs
> for one week. Developers ask us to implement such tool over and over again.
>
> Besides ANR, sometimes, system might not respond to any user operation. Usually,
> kernel or firmware would reset system. At that time, we also need get the call
> chains of all the user space processes before system is reset.


I am not familiar with Android at all, so a quick question: if this is
only for Android, why introduce it for everyone? IOW, why not put it
behind a Kconfig option?

BTW, I am sure you need to put the above paragraphs into your patch 
description, to make it clear why the patch is needed.

>
> With our tool,
> 1) We could collect the HEX-format call chain data and /proc/XXX/maps
> of all the processes quickly, then parse them either after rebooting, or
> after the issue is reported. It could catch the scene just at the time point
> when the error happens. Our experiments show the tool could collect the data
> of all processes within 200ms.
> 2) The new tool won't stop the processes and have less impact on them.
> Considering a scenario of performance bottleneck investigation, statistics collection
> shouldn't have big impact on running processes.
> 3) It could support both i386 and x86-64. I tried pstack and it doesn't work
> with x86-64.
> 4) It follows /proc/XXX/stack interface and it's easy to use it.
>
> Besides this tool, we are considering to extend it to collect user space
> call chain of current process from kernel when kernel detects some other
> abnormal behavior.
>
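
The offline step described in point 1 (matching hex call-chain addresses
against /proc/XXX/maps) could look roughly like the sketch below. It assumes
the addresses have already been extracted from the dump as integers, since
the exact output format of the patch is not shown here. The names Mapping,
parse_maps and resolve are hypothetical. The result is a (path, file offset)
pair that a tool such as addr2line could then turn into a symbol.

```python
import bisect
from typing import List, NamedTuple, Optional, Tuple


class Mapping(NamedTuple):
    start: int    # first address of the mapping
    end: int      # one past the last address
    perms: str    # e.g. "r-xp"
    offset: int   # offset of the mapping within the backing file
    path: str     # backing file, or "" for anonymous mappings


def parse_maps(text: str) -> List[Mapping]:
    """Parse the text of /proc/<pid>/maps into a list sorted by start address."""
    mappings = []
    for line in text.splitlines():
        # Fields: address-range perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        mappings.append(Mapping(int(start_s, 16), int(end_s, 16),
                                fields[1], int(fields[2], 16), path))
    mappings.sort(key=lambda m: m.start)
    return mappings


def resolve(mappings: List[Mapping], addr: int) -> Optional[Tuple[str, int]]:
    """Return (path, file offset) for addr, or None if no mapping covers it."""
    starts = [m.start for m in mappings]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    m = mappings[i]
    if addr >= m.end:
        return None
    return (m.path, addr - m.start + m.offset)
```

Because this step needs only the saved maps text and the saved addresses, it
can run on another machine, long after the processes are gone.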

In my previous reply, I ran 'pstack' on my x86-64 machine, so I don't
understand why you said it doesn't work with x86-64. I guess pstack
supports more than just x86, as ptrace is available on other architectures too.

Thanks.
