From: wangnan0@huawei.com (Wangnan (F))
To: linux-arm-kernel@lists.infradead.org
Subject: [BUG REPORT]: ARM64: perf: System hung in perf test
Date: Mon, 21 Dec 2015 20:56:03 +0800
Message-ID: <5677F6E3.9050902@huawei.com>

A system hang can be reproduced on qemu and real hardware using:

  # perf test -v signal

If qemu is started with '-smp 1', the system hangs. On real hardware and
in qemu with smp > 1, the result is:

  # /perf test -v signal
  17: Test breakpoint overflow signal handler                  :
  --- start ---
  test child forked, pid 792
  count1 11, count2 11, overflow 11
  failed: RF EFLAG recursion issue detected
  failed: wrong overflow hit
  failed: wrong count for bp2
  test child finished with -1
  ---- end ----
  Test breakpoint overflow signal handler: FAILED!

It looks like something similar to [1] is required for ARM64.

Some analysis done with qemu:

This testcase tests the interaction between breakpoints, perf_event
and signal handling. It installs a breakpoint at the entry of a
function and makes the corresponding perf_event generate SIGIO when
the event fires.

When an async perf_event is triggered:

         if (*perf_event_fasync(event) && event->pending_kill) {
                 event->pending_wakeup = 1;
                 irq_work_queue(&event->pending);
         }

it calls irq_work_queue(&event->pending), which is used to fire a
poll event and SIGIO. Later, when the perf_event is closed,
_free_event() calls irq_work_sync(&event->pending) to ensure all
irq_work is done. On ARM64, with only 1 cpu, the system hangs in
irq_work_sync().

With gdb attached, I see:
  1. IRQ is not disabled. Inside irq_work_sync(), the result of
     arch_local_save_flags() is 0x140.

  2. hrtimer_interrupt() is still generated. The system is not dead.

  3. In irq_work_tick(), we have a chance to process irq_work. However,
     llist_empty(raised) is false but arch_irq_work_has_interrupt()
     is true, so the kernel only processes lazy_list.

  4. handle_IPI() is never called, so I guess the IPI is disabled by the
     breakpoint and not restored in this case.

[1] http://lkml.kernel.org/r/1362940871-24486-1-git-send-email-jolsa@redhat.com
