All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mao Han <han_mao@c-sky.com>
To: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH V3 0/3] riscv: Add perf callchain support
Date: Mon, 19 Aug 2019 16:18:01 +0800	[thread overview]
Message-ID: <20190819081758.GA15999@vmh-VirtualBox> (raw)
In-Reply-To: <alpine.DEB.2.21.9999.1908161008450.18249@viisi.sifive.com>

[-- Attachment #1: Type: text/plain, Size: 4535 bytes --]

Hi Paul,
On Fri, Aug 16, 2019 at 10:14:01AM -0700, Paul Walmsley wrote:
> Hello Mao Han,
> 
> On Fri, 17 May 2019, Mao Han wrote:
> 
> > This patch set add perf callchain(FP/DWARF) support for RISC-V.
> > It comes from the csky version callchain support with some
> > slight modifications. The patchset base on Linux 5.1.
> > 
> > CC: Palmer Dabbelt <palmer@sifive.com>
> > CC: linux-riscv <linux-riscv@lists.infradead.org>
> > CC: Christoph Hellwig <hch@lst.de>
> > CC: Guo Ren <guoren@kernel.org>
> 
> I tried these patches on v5.3-rc4, both on the HiFive Unleashed board 
> with a Debian-based rootfs and QEMU rv64 with a Fedora-based rootfs.  For 
> QEMU, I used defconfig, and for the HiFive Unleashed, I added a few more 
> Kconfig directives; and on both, I enabled CONFIG_PERF_EVENTS.  I built 
> the perf tools from the kernel tree.
> 
> Upon running "/root/bin/perf record -e cpu-clock --call-graph fp 
> /bin/ls", I see the backtraces below.  The first is on the HiFive 
> Unleashed, the second is on QEMU.  
> 
> Could you take a look and tell me if you see similar issues?  And if not, 
> could you please walk me through your process for testing these patches on 
> rv64, so I can reproduce it here?
>

I'v tried the command line above and got similar issues with probability.
unwind_frame_kernel can not stop unwind when fp is a quite large
value(like 0x70aac93ff0eff584) which can pass the simple stack check.
        if (kstack_end((void *)frame->fp))
                return -EPERM;
        if (frame->fp & 0x3 || frame->fp < TASK_SIZE)
                return -EPERM;
handle_exception from arch/riscv/kernel/entry.S will use s0(fp) as temp
register. The context for this frame is unpredictable. We may add more
strict check in unwind_frame_kernel or keep s0 always 0 in handle_exception
to fix this issue.

Breakpoint 1, unwind_frame_kernel (frame=0xffffffe0000057a0 <rcu_init+118>)
    at arch/riscv/kernel/perf_callchain.c:14
14      {
1: /x *frame = {fp = 0xffffffe000005ee0, ra = 0xffffffe000124c90}
(gdb)
Continuing.

Breakpoint 1, unwind_frame_kernel (frame=0xffffffe0000057a0 <rcu_init+118>)
    at arch/riscv/kernel/perf_callchain.c:14
14      {
1: /x *frame = {fp = 0xffffffe000124c74, ra = 0xffffffe000036cba}
(gdb)
Continuing.

Breakpoint 1, unwind_frame_kernel (frame=0xffffffe0000057a0 <rcu_init+118>)
    at arch/riscv/kernel/perf_callchain.c:14
14      {
1: /x *frame = {fp = 0x70aac57ff0eff584, ra = 0x8082614d64ea740a}
(gdb)

Some time perf record can output perf.data correct. something like:
# perf record -e cpu-clock --call-graph fp ls

perf.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data (102 samples) ]
# 
# perf report
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 102  of event 'cpu-clock'
# Event count (approx.): 25500000
#
# Children      Self  Command  Shared Object      Symbol                           
# ........  ........  .......  .................  .................................
#
    30.39%     0.00%  ls       [kernel.kallsyms]  [k] ret_from_exception
            |
            ---ret_from_exception
               |          
               |--16.67%--do_page_fault
               |          handle_mm_fault
               |          __handle_mm_fault
               |          |          
               |          |--9.80%--filemap_map_pages
               |          |          |          
               |          |          |--3.92%--alloc_set_pte
               |          |          |          page_add_file_rmap
               |          |          |          
               |          |           --3.92%--filemap_map_pages
               |          |          
               |           --0.98%--__do_fault
               |          
               |--8.82%--schedule
               |          __sched_text_start
               |          finish_task_switch
               |          
                --4.90%--__kprobes_text_start
                          irq_exit
                          irq_exit

    28.43%     0.00%  ls       [kernel.kallsyms]  [k] ret_from_syscall
            |
            ---ret_from_syscall
               |          
               |--15.69%--__se_sys_execve
               |          __do_execve_file
               |          search_binary_handler.part.7

I previous tested these patch with a program(callchain_test.c) mostly
running in userspace, so I didn't got this issues.

Thanks,
Mao Han

[-- Attachment #2: callchain_test.c --]
[-- Type: text/x-csrc, Size: 431 bytes --]

#include <stdio.h>

void test_4(void)
{
  volatile int i, j;

  for(i = 0; i < 10000000; i++) 
    j=i; 
}

void test_3(void)
{
  volatile int i, j;
  test_4();
  for(i = 0; i < 3000; i++)
    j=i;
}

void test_2(void)
{
  volatile int i, j;
  test_3();
  for(i = 0; i < 3000; i++)
    j=i;
}

void test_1(void)
{
  volatile int i, j;
  test_2();
  for(i = 0; i < 3000; i++)
    j=i;

}

int main(void)
{
  test_1();
  return 0;
}

[-- Attachment #3: Type: text/plain, Size: 161 bytes --]

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

WARNING: multiple messages have this Message-ID (diff)
From: Mao Han <han_mao@c-sky.com>
To: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org
Subject: Re: [PATCH V3 0/3] riscv: Add perf callchain support
Date: Mon, 19 Aug 2019 16:18:01 +0800	[thread overview]
Message-ID: <20190819081758.GA15999@vmh-VirtualBox> (raw)
In-Reply-To: <alpine.DEB.2.21.9999.1908161008450.18249@viisi.sifive.com>

[-- Attachment #1: Type: text/plain, Size: 4535 bytes --]

Hi Paul,
On Fri, Aug 16, 2019 at 10:14:01AM -0700, Paul Walmsley wrote:
> Hello Mao Han,
> 
> On Fri, 17 May 2019, Mao Han wrote:
> 
> > This patch set add perf callchain(FP/DWARF) support for RISC-V.
> > It comes from the csky version callchain support with some
> > slight modifications. The patchset base on Linux 5.1.
> > 
> > CC: Palmer Dabbelt <palmer@sifive.com>
> > CC: linux-riscv <linux-riscv@lists.infradead.org>
> > CC: Christoph Hellwig <hch@lst.de>
> > CC: Guo Ren <guoren@kernel.org>
> 
> I tried these patches on v5.3-rc4, both on the HiFive Unleashed board 
> with a Debian-based rootfs and QEMU rv64 with a Fedora-based rootfs.  For 
> QEMU, I used defconfig, and for the HiFive Unleashed, I added a few more 
> Kconfig directives; and on both, I enabled CONFIG_PERF_EVENTS.  I built 
> the perf tools from the kernel tree.
> 
> Upon running "/root/bin/perf record -e cpu-clock --call-graph fp 
> /bin/ls", I see the backtraces below.  The first is on the HiFive 
> Unleashed, the second is on QEMU.  
> 
> Could you take a look and tell me if you see similar issues?  And if not, 
> could you please walk me through your process for testing these patches on 
> rv64, so I can reproduce it here?
>

I'v tried the command line above and got similar issues with probability.
unwind_frame_kernel can not stop unwind when fp is a quite large
value(like 0x70aac93ff0eff584) which can pass the simple stack check.
        if (kstack_end((void *)frame->fp))
                return -EPERM;
        if (frame->fp & 0x3 || frame->fp < TASK_SIZE)
                return -EPERM;
handle_exception from arch/riscv/kernel/entry.S will use s0(fp) as temp
register. The context for this frame is unpredictable. We may add more
strict check in unwind_frame_kernel or keep s0 always 0 in handle_exception
to fix this issue.

Breakpoint 1, unwind_frame_kernel (frame=0xffffffe0000057a0 <rcu_init+118>)
    at arch/riscv/kernel/perf_callchain.c:14
14      {
1: /x *frame = {fp = 0xffffffe000005ee0, ra = 0xffffffe000124c90}
(gdb)
Continuing.

Breakpoint 1, unwind_frame_kernel (frame=0xffffffe0000057a0 <rcu_init+118>)
    at arch/riscv/kernel/perf_callchain.c:14
14      {
1: /x *frame = {fp = 0xffffffe000124c74, ra = 0xffffffe000036cba}
(gdb)
Continuing.

Breakpoint 1, unwind_frame_kernel (frame=0xffffffe0000057a0 <rcu_init+118>)
    at arch/riscv/kernel/perf_callchain.c:14
14      {
1: /x *frame = {fp = 0x70aac57ff0eff584, ra = 0x8082614d64ea740a}
(gdb)

Some time perf record can output perf.data correct. something like:
# perf record -e cpu-clock --call-graph fp ls

perf.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data (102 samples) ]
# 
# perf report
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 102  of event 'cpu-clock'
# Event count (approx.): 25500000
#
# Children      Self  Command  Shared Object      Symbol                           
# ........  ........  .......  .................  .................................
#
    30.39%     0.00%  ls       [kernel.kallsyms]  [k] ret_from_exception
            |
            ---ret_from_exception
               |          
               |--16.67%--do_page_fault
               |          handle_mm_fault
               |          __handle_mm_fault
               |          |          
               |          |--9.80%--filemap_map_pages
               |          |          |          
               |          |          |--3.92%--alloc_set_pte
               |          |          |          page_add_file_rmap
               |          |          |          
               |          |           --3.92%--filemap_map_pages
               |          |          
               |           --0.98%--__do_fault
               |          
               |--8.82%--schedule
               |          __sched_text_start
               |          finish_task_switch
               |          
                --4.90%--__kprobes_text_start
                          irq_exit
                          irq_exit

    28.43%     0.00%  ls       [kernel.kallsyms]  [k] ret_from_syscall
            |
            ---ret_from_syscall
               |          
               |--15.69%--__se_sys_execve
               |          __do_execve_file
               |          search_binary_handler.part.7

I previous tested these patch with a program(callchain_test.c) mostly
running in userspace, so I didn't got this issues.

Thanks,
Mao Han

[-- Attachment #2: callchain_test.c --]
[-- Type: text/x-csrc, Size: 431 bytes --]

#include <stdio.h>

void test_4(void)
{
  volatile int i, j;

  for(i = 0; i < 10000000; i++) 
    j=i; 
}

void test_3(void)
{
  volatile int i, j;
  test_4();
  for(i = 0; i < 3000; i++)
    j=i;
}

void test_2(void)
{
  volatile int i, j;
  test_3();
  for(i = 0; i < 3000; i++)
    j=i;
}

void test_1(void)
{
  volatile int i, j;
  test_2();
  for(i = 0; i < 3000; i++)
    j=i;

}

int main(void)
{
  test_1();
  return 0;
}

  reply	other threads:[~2019-08-19  8:18 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-17  8:43 [PATCH V3 0/3] riscv: Add perf callchain support Mao Han
2019-05-17  8:43 ` Mao Han
2019-05-17  8:43 ` [PATCH V3 1/3] " Mao Han
2019-05-17  8:43   ` Mao Han
2019-05-17  8:43 ` [PATCH V3 2/3] riscv: Add support for perf registers sampling Mao Han
2019-05-17  8:43   ` Mao Han
2019-05-17  8:43 ` [PATCH V3 3/3] riscv: Add support for libdw Mao Han
2019-05-17  8:43   ` Mao Han
2019-06-03  8:33 ` [PATCH V3 0/3] riscv: Add perf callchain support Mao Han
2019-06-03  8:33   ` Mao Han
2019-08-16 17:14 ` Paul Walmsley
2019-08-16 17:14   ` Paul Walmsley
2019-08-19  8:18   ` Mao Han [this message]
2019-08-19  8:18     ` Mao Han
2019-08-19 10:56     ` Mao Han
2019-08-19 10:56       ` Mao Han
2019-08-24  2:12       ` Paul Walmsley
2019-08-24  2:12         ` Paul Walmsley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190819081758.GA15999@vmh-VirtualBox \
    --to=han_mao@c-sky.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=paul.walmsley@sifive.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.