* [BUG] perf, arm64, acpi: sleeping function called from invalid context
From: Jan Glauber @ 2018-01-29 17:30 UTC
  To: linux-arm-kernel
Hi Will & Mark,
I'm seeing the following warning with 4.15 (and earlier) using ACPI & perf:
[   34.823577] BUG: sleeping function called from invalid context at mm/slab.h:419
[   34.830881] in_atomic(): 0, irqs_disabled(): 128, pid: 14, name: cpuhp/0
[   34.837574] 1 lock held by cpuhp/0/14:
[   34.841314]  #0:  (cpuhp_state-up){....}, at: [<00000000f44ba116>] cpuhp_thread_fun+0x148/0x290
[   34.850032] CPU: 0 PID: 14 Comm: cpuhp/0 Not tainted 4.15.0-rc9-jang+ #13
[   34.856810] Hardware name: Default string Cavium ThunderX2/Default string, BIOS 5.13 12/18/2017
[   34.865499] Call trace:
[   34.867941]  dump_backtrace+0x0/0x160
[   34.871595]  show_stack+0x24/0x30
[   34.874905]  dump_stack+0x9c/0xd0
[   34.878214]  ___might_sleep+0x140/0x1a0
[   34.882042]  __might_sleep+0x58/0x90
[   34.885610]  kmem_cache_alloc_trace+0x2c4/0x320
[   34.890134]  armpmu_alloc+0x38/0x1b0
[   34.893701]  arm_pmu_acpi_cpu_starting+0x10c/0x138
[   34.898484]  cpuhp_invoke_callback+0x120/0xaa8
[   34.902920]  cpuhp_thread_fun+0xec/0x290
[   34.906834]  smpboot_thread_fn+0x21c/0x2b8
[   34.910923]  kthread+0x10c/0x138
[   34.914143]  ret_from_fork+0x10/0x18
Changing the allocations in armpmu_alloc() to GFP_ATOMIC didn't help,
as the interrupt request is also not happy in this context.
Would it be possible to init the PMUs later?
--Jan
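For illustration only (this is not code from the thread or from the kernel): a small userspace C model of why the warning above fires. It assumes that the "starting" hotplug callback runs with IRQs disabled, as the `irqs_disabled(): 128` line in the trace suggests, and that the allocation in question may sleep. All names (`model_pmu`, `model_alloc_may_sleep`, `model_cpu_starting_alloc`, `model_prealloc_pmus`, `model_cpu_starting_prealloc`) are hypothetical. The second half shows one generic way to keep a sleeping allocation out of that context, by allocating beforehand; it is a sketch, not the fix the maintainers settled on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for this model */

/* Hypothetical stand-in for per-CPU PMU state; not the kernel's struct. */
struct model_pmu {
	int cpu;
	bool bound;
};

/* Models local IRQ state: sleeping is illegal while this is set. */
bool model_irqs_disabled;
/* Counts sleeping allocations that were attempted with IRQs off. */
int model_sleep_violations;

struct model_pmu *model_pmus[MODEL_NR_CPUS];

/* Models an allocation that may sleep. Where the kernel would print the
 * "sleeping function called from invalid context" splat, this only counts. */
void *model_alloc_may_sleep(size_t size)
{
	if (model_irqs_disabled)
		model_sleep_violations++;
	return calloc(1, size);
}

/* Pattern from the trace: allocate inside the callback, IRQs disabled. */
int model_cpu_starting_alloc(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_irqs_disabled = true;
	free(model_pmus[cpu]);
	model_pmus[cpu] = model_alloc_may_sleep(sizeof(*model_pmus[cpu]));
	model_irqs_disabled = false;
	if (!model_pmus[cpu])
		return -1;
	model_pmus[cpu]->cpu = cpu;
	model_pmus[cpu]->bound = true;
	return 0;
}

/* Alternative: allocate for every CPU up front, where sleeping is allowed. */
int model_prealloc_pmus(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (model_pmus[cpu])
			continue;
		model_pmus[cpu] = model_alloc_may_sleep(sizeof(*model_pmus[cpu]));
		if (!model_pmus[cpu])
			return -1;
		model_pmus[cpu]->cpu = cpu;
	}
	return 0;
}

/* The callback then only binds the preallocated object; it never allocates. */
int model_cpu_starting_prealloc(int cpu)
{
	int ret = -1;

	model_irqs_disabled = true;
	if (cpu >= 0 && cpu < MODEL_NR_CPUS && model_pmus[cpu]) {
		model_pmus[cpu]->bound = true;
		ret = 0;
	}
	model_irqs_disabled = false;
	return ret;
}
```

In this model, `model_cpu_starting_alloc()` bumps `model_sleep_violations` on every call, while calling `model_prealloc_pmus()` first and `model_cpu_starting_prealloc()` afterwards leaves it untouched. The model does not cover the interrupt request, which the message above says is also unhappy in this context.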
From: Will Deacon @ 2018-01-30 13:48 UTC
  To: linux-arm-kernel
Hi Jan, [+LinusW, Lee]
On Mon, Jan 29, 2018 at 06:30:54PM +0100, Jan Glauber wrote:
> I'm seeing the following warning with 4.15 (and earlier) using ACPI & perf:
> 
> [   34.823577] BUG: sleeping function called from invalid context at mm/slab.h:419
> [   34.830881] in_atomic(): 0, irqs_disabled(): 128, pid: 14, name: cpuhp/0
> [   34.837574] 1 lock held by cpuhp/0/14:
> [   34.841314]  #0:  (cpuhp_state-up){....}, at: [<00000000f44ba116>] cpuhp_thread_fun+0x148/0x290
> [   34.850032] CPU: 0 PID: 14 Comm: cpuhp/0 Not tainted 4.15.0-rc9-jang+ #13
> [   34.856810] Hardware name: Default string Cavium ThunderX2/Default string, BIOS 5.13 12/18/2017
> [   34.865499] Call trace:
> [   34.867941]  dump_backtrace+0x0/0x160
> [   34.871595]  show_stack+0x24/0x30
> [   34.874905]  dump_stack+0x9c/0xd0
> [   34.878214]  ___might_sleep+0x140/0x1a0
> [   34.882042]  __might_sleep+0x58/0x90
> [   34.885610]  kmem_cache_alloc_trace+0x2c4/0x320
> [   34.890134]  armpmu_alloc+0x38/0x1b0
> [   34.893701]  arm_pmu_acpi_cpu_starting+0x10c/0x138
> [   34.898484]  cpuhp_invoke_callback+0x120/0xaa8
> [   34.902920]  cpuhp_thread_fun+0xec/0x290
> [   34.906834]  smpboot_thread_fn+0x21c/0x2b8
> [   34.910923]  kthread+0x10c/0x138
> [   34.914143]  ret_from_fork+0x10/0x18
> 
> Changing the allocations in armpmu_alloc() to GFP_ATOMIC didn't help,
> as the interrupt request is also not happy in this context.
> 
> Would it be possible to init the PMUs later?
I know that Mark's had a good go at fixing this, but we ran into problems
having the fix co-exist with the IRQ bouncing workaround we perform for the
PMU on U8500 platforms. Frustratingly, those platforms don't appear to be
available any more, so we're being held up by something that we're unable
to test and might be considered dead.
Linus, Lee: do we still need to support PMU interrupts on U8500? It's
causing us real headaches with ACPI-based arm64 systems. [the answer might
be "yes", but I have to ask!]
Cheers,
Will
From: Linus Walleij @ 2018-01-30 15:34 UTC
  To: linux-arm-kernel
On Tue, Jan 30, 2018 at 2:48 PM, Will Deacon <will.deacon@arm.com> wrote:
>> Changing the allocations in armpmu_alloc() to GFP_ATOMIC didn't help,
>> as the interrupt request is also not happy in this context.
>>
>> Would it be possible to init the PMUs later?
>
> I know that Mark's had a good go at fixing this, but we ran into problems
> having the fix co-exist with the IRQ bouncing workaround we perform for the
> PMU on U8500 platforms. Frustratingly, those platforms don't appear to be
> available any more, so we're being held up by something that we're unable
> to test and might be considered dead.
Not any more dead than the OMAP3 based Nokia n900 or the
Samsung S3C stuff I would say. Just these don't have weird PMU
counter interrupts :D
The Ux500 Galaxy S Advance phone from Samsung was picked up by
hobbyists from PostmarketOS as a hacking target, maybe that needs to
become more accessible using just upstream, mea culpa.
I am willing to test any patches on the reference designs though.
> Linus, Lee: do we still need to support PMU interrupts on U8500? It's
> causing us real headaches with ACPI-based arm64 systems. [the answer might
> be "yes", but I have to ask!]
Can we still use perf without the IRQs? I.e. I guess it will fall back to
some sampling method that is "good enough"? We don't do a lot of performance
testing using perf admittedly.
I do think it is unfair to have to support a bunch of weird
workarounds just because ST Micro decided to connect these two IRQs
to the same GIC line using an OR gate. IIRC that was the issue.
Yours,
Linus Walleij
From: Will Deacon @ 2018-01-31 16:17 UTC
  To: linux-arm-kernel
Hi Linus,
On Tue, Jan 30, 2018 at 04:34:07PM +0100, Linus Walleij wrote:
> On Tue, Jan 30, 2018 at 2:48 PM, Will Deacon <will.deacon@arm.com> wrote:
> 
> >> Changing the allocations in armpmu_alloc() to GFP_ATOMIC didn't help,
> >> as the interrupt request is also not happy in this context.
> >>
> >> Would it be possible to init the PMUs later?
> >
> > I know that Mark's had a good go at fixing this, but we ran into problems
> > having the fix co-exist with the IRQ bouncing workaround we perform for the
> > PMU on U8500 platforms. Frustratingly, those platforms don't appear to be
> > available any more, so we're being held up by something that we're unable
> > to test and might be considered dead.
> 
> Not any more dead than the OMAP3 based Nokia n900 or the
> Samsung S3C stuff I would say. Just these don't have weird PMU
> counter interrupts :D
> 
> The Ux500 Galaxy S Advance phone from Samsung was picked up by
> hobbyists from PostmarketOS as a hacking target, maybe that needs to
> become more accessible using just upstream, mea culpa.
> 
> I am willing to test any patches on the reference designs though.
Thanks, we might take you up on that kind offer!
> > Linus, Lee: do we still need to support PMU interrupts on U8500? It's
> > causing us real headaches with ACPI-based arm64 systems. [the answer might
> > be "yes", but I have to ask!]
> 
> Can we still use perf without the IRQs? I.e. I guess it will fall back to
> some sampling method that is "good enough"? We don't do a lot of performance
> testing using perf admittedly.
Right, so perf will still work in counting mode, but sampling mode would be
disabled. This is the case for all other platforms with borked IRQs (e.g.
Raspberry Pi and i.MX6), so U8500 would be handled the same way.
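To make the counting/sampling distinction concrete, here is a minimal userspace C model (an illustration, not kernel code). It assumes that sampling depends on a counter overflow interrupt arriving every `period` events, whereas counting only needs the running total to be read back. The struct and function names (`model_counter`, `model_counter_init`, `model_counter_tick`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one event counter; not a kernel structure. */
struct model_counter {
	uint64_t count;        /* total events observed */
	uint64_t period;       /* sample every `period` events; 0 = counting only */
	uint64_t until_sample; /* events left until the next overflow */
	uint64_t samples;      /* samples taken by the modelled overflow IRQ */
};

/* Set up a counter. A sampling request (period != 0) is refused when no
 * usable overflow interrupt exists; counting works either way. */
int model_counter_init(struct model_counter *c, uint64_t period,
		       bool irq_available)
{
	assert(c != NULL);
	if (period != 0 && !irq_available)
		return -1;
	c->count = 0;
	c->period = period;
	c->until_sample = period;
	c->samples = 0;
	return 0;
}

/* Feed `events` events into the counter. In sampling mode each overflow
 * stands for one interrupt that would record a sample. */
void model_counter_tick(struct model_counter *c, uint64_t events)
{
	assert(c != NULL);
	c->count += events;
	if (c->period == 0)
		return;
	while (events >= c->until_sample) {
		events -= c->until_sample;
		c->until_sample = c->period;
		c->samples++;
	}
	c->until_sample -= events;
}
```

With `irq_available` set to false, `model_counter_init(&c, 0, false)` succeeds and totals can still be read from `c.count`, while `model_counter_init(&c, 100, false)` is refused. That mirrors the behaviour described above for platforms without a working PMU interrupt.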
> I do think it is unfair to have to support a bunch of weird
> workarounds just because ST Micro decided to connect these two IRQs
> to the same GIC line using an OR gate. IIRC that was the issue.
Yes, that's right and the current workaround of bouncing the IRQ around makes
it impossible to use per-cpu indirection for the IRQ dispatch code, which we
need in order to avoid atomic memory allocation issues with ACPI systems.
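As a rough illustration of what per-cpu indirection in the dispatch path could look like, here is a userspace C sketch. It is not the kernel implementation; `sketch_pmu`, `sketch_cpu_pmu`, `sketch_bind_pmu` and `sketch_pmu_irq` are hypothetical names. The assumption is that each CPU has a statically sized slot pointing at its PMU, so bringing a CPU up only stores a pointer, and the handler finds its PMU from the CPU the interrupt fired on.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* Hypothetical per-PMU state; not the kernel's struct arm_pmu. */
struct sketch_pmu {
	unsigned long irqs_handled;
};

/* One slot per CPU, sized at build time: binding a PMU to a CPU is a
 * pointer store and needs no memory allocation. */
struct sketch_pmu *sketch_cpu_pmu[SKETCH_NR_CPUS];

/* Called when a CPU comes up: record which PMU serves it. */
void sketch_bind_pmu(int cpu, struct sketch_pmu *pmu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_cpu_pmu[cpu] = pmu;
}

/* Dispatch for an interrupt taken on `cpu`: look the PMU up through the
 * per-CPU slot. Returns 1 if handled, 0 if no PMU is bound to that CPU. */
int sketch_pmu_irq(int cpu)
{
	struct sketch_pmu *pmu;

	if (cpu < 0 || cpu >= SKETCH_NR_CPUS)
		return 0;
	pmu = sketch_cpu_pmu[cpu];
	if (!pmu)
		return 0;
	pmu->irqs_handled++;
	return 1;
}
```

The lookup assumes the interrupt is taken on the CPU that owns the slot. A workaround that moves the interrupt between CPUs, as described above for U8500, would break that assumption in this sketch.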
I think we'll go ahead and remove the workaround so that we can fix the ACPI
systems. If somebody complains that it breaks for them, then we should
strongly consider looking at falling back on a timer IRQ in the perf core
code when a sampling IRQ is not available.
Sound reasonable as a starting point?
Will