linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [BUG REPORT]: ARM64: perf: System hung in perf test
@ 2015-12-21 12:56 Wangnan (F)
  2015-12-21 14:01 ` Will Deacon
  0 siblings, 1 reply; 3+ messages in thread
From: Wangnan (F) @ 2015-12-21 12:56 UTC (permalink / raw)
  To: linux-arm-kernel

System hung can be reproduced on qemu and real hardware using:

  # perf test -v signal

If qemu is started with '-smp 1', system hung. In real hardware and in
qemu with smp > 1, the result is:

  # /perf test -v signal
  17: Test breakpoint overflow signal handler                  :
  --- start ---
  test child forked, pid 792
  count1 11, count2 11, overflow 11
  failed: RF EFLAG recursion issue detected
  failed: wrong overflow hit
  failed: wrong count for bp2
  test child finished with -1
  ---- end ----
  Test breakpoint overflow signal handler: FAILED!

Looks like something like [1] is required for ARM64.

Some analysis is done with qemu:

This testcase tests the intertaction between breakpoint, perf_event
and signal handling. It installs a breakpoint at the enter of a
function and makes the corresponding perf_event generate SIGIO when
the event raise.

When perf_event on a async perf_event is triggered:

         if (*perf_event_fasync(event) && event->pending_kill) {
                 event->pending_wakeup = 1;
                 irq_work_queue(&event->pending);
         }

it calls irq_work_queue(&event->pending), which is used to fire a
poll event and SIGIO. Later when perf_event is closed, in _free_event
irq_work_sync(&event->pending) is called to ensure all irq_work is done.
On ARM64, if we have only 1 cpu, the system hung at irq_work_sync().

Using gdb attached, I see:
  1. IRQ is not disabled. Inside irq_work_sync, result of 
arch_local_save_flags()
     is 0x140.

  2. hrtimer_interrupt() is still generated. The system is not dead.

  3. In irq_work_tick, we have a chance to process irq_work. However,
     llist_empty(raised) is false but arch_irq_work_has_interrupt()
     is true, so kernel only process lazy_list.

  4. handle_IPI() is never called, so I guess the IPI is disabled by 
breakpoint
     and not restored in this case.

[1] 
http://lkml.kernel.org/r/1362940871-24486-1-git-send-email-jolsa at redhat.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [BUG REPORT]: ARM64: perf: System hung in perf test
  2015-12-21 12:56 [BUG REPORT]: ARM64: perf: System hung in perf test Wangnan (F)
@ 2015-12-21 14:01 ` Will Deacon
  2015-12-22 14:16   ` Wangnan (F)
  0 siblings, 1 reply; 3+ messages in thread
From: Will Deacon @ 2015-12-21 14:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 21, 2015 at 08:56:03PM +0800, Wangnan (F) wrote:
> System hung can be reproduced on qemu and real hardware using:
> 
>  # perf test -v signal
> 
> If qemu is started with '-smp 1', system hung. In real hardware and in
> qemu with smp > 1, the result is:

That sounds like a qemu issue...

>  # /perf test -v signal
>  17: Test breakpoint overflow signal handler                  :
>  --- start ---
>  test child forked, pid 792
>  count1 11, count2 11, overflow 11
>  failed: RF EFLAG recursion issue detected
>  failed: wrong overflow hit
>  failed: wrong count for bp2
>  test child finished with -1
>  ---- end ----
>  Test breakpoint overflow signal handler: FAILED!

... and this sounds like the perf tool expecting to single-step over
signal handlers, whereas (on arm64) we deliberately single-step into
them, so you can run into a loop for this case.

Will

^ permalink raw reply	[flat|nested] 3+ messages in thread

* [BUG REPORT]: ARM64: perf: System hung in perf test
  2015-12-21 14:01 ` Will Deacon
@ 2015-12-22 14:16   ` Wangnan (F)
  0 siblings, 0 replies; 3+ messages in thread
From: Wangnan (F) @ 2015-12-22 14:16 UTC (permalink / raw)
  To: linux-arm-kernel



On 2015/12/21 22:01, Will Deacon wrote:
> On Mon, Dec 21, 2015 at 08:56:03PM +0800, Wangnan (F) wrote:
>
[SNIP]
>>   # /perf test -v signal
>>   17: Test breakpoint overflow signal handler                  :
>>   --- start ---
>>   test child forked, pid 792
>>   count1 11, count2 11, overflow 11
>>   failed: RF EFLAG recursion issue detected
>>   failed: wrong overflow hit
>>   failed: wrong count for bp2
>>   test child finished with -1
>>   ---- end ----
>>   Test breakpoint overflow signal handler: FAILED!
> ... and this sounds like the perf tool expecting to single-step over
> signal handlers, whereas (on arm64) we deliberately single-step into
> them, so you can run into a loop for this case.

Could you please provide me more information? Do you think it is
a kernel bug and should be fixed? Or it is an invalid testcase for
aarch64?

Thank you.

> Will

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2015-12-22 14:16 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-21 12:56 [BUG REPORT]: ARM64: perf: System hung in perf test Wangnan (F)
2015-12-21 14:01 ` Will Deacon
2015-12-22 14:16   ` Wangnan (F)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).