* pci-aardvark ath9k arm64 issues
@ 2018-01-12 23:22 Marek Behun
2018-01-16 10:09 ` Lorenzo Pieralisi
2018-01-16 10:21 ` Thomas Petazzoni
0 siblings, 2 replies; 3+ messages in thread
From: Marek Behun @ 2018-01-12 23:22 UTC
To: Lorenzo Pieralisi; +Cc: Bjorn Helgaas, linux-pci, Thomas Petazzoni
Hello,
we are seeing a CPU stall with the ath9k driver on pci-aardvark
(Marvell Armada 3720, arm64).
The ath9k driver loads correctly, the interface connects, and it works
for some time, but then the CPU stalls and the kernel reports a
self-detected rcu_sched stall.
I don't know whether this is an issue with aardvark or with arm64 as a
whole (someone had a similar problem, see
https://www.spinics.net/lists/linux-wireless/msg157038.html ), but
ath10k doesn't have this problem.
I am attaching the rcu_sched stall detection backtrace (although I
don't know whether it contains the needed information).
Can you please point me to how to debug or solve this issue?
Thank you.
Marek Behun
[-- Attachment #2: dmesg-aardvark.txt --]
[-- Type: text/plain, Size: 4627 bytes --]
[ 265.283187] INFO: rcu_sched self-detected stall on CPU
[ 265.288346] 0-...: (2100 ticks this GP) idle=c1e/2/0 softirq=3408/3408 fqs=0
[ 265.295896] (t=2100 jiffies g=846 c=845 q=8)
[ 265.300492] rcu_sched kthread starved for 2100 jiffies! g846 c845 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=0
[ 265.310932] rcu_sched I 0 8 2 0x00000000
[ 265.316333] Call trace:
[ 265.319134] [<ffffff8008084e94>] __switch_to+0x84/0x98
[ 265.324168] [<ffffff8008687948>] __schedule+0x1f0/0x4c0
[ 265.329565] [<ffffff8008687c44>] schedule+0x2c/0x88
[ 265.334694] [<ffffff800868aed4>] schedule_timeout+0x134/0x288
[ 265.340454] [<ffffff80080f7154>] rcu_gp_kthread+0x434/0x728
[ 265.346303] [<ffffff80080be874>] kthread+0xfc/0x128
[ 265.351434] [<ffffff80080842f0>] ret_from_fork+0x10/0x18
[ 265.356927] Task dump for CPU 0:
[ 265.360340] swapper/0 R running task 0 0 0 0x00000002
[ 265.367450] Call trace:
[ 265.369885] [<ffffff8008086fd0>] dump_backtrace+0x0/0x380
[ 265.375820] [<ffffff8008087364>] show_stack+0x14/0x20
[ 265.380954] [<ffffff80080c861c>] sched_show_task+0x13c/0x160
[ 265.386892] [<ffffff80080c9028>] dump_cpu_task+0x40/0x50
[ 265.392024] [<ffffff80080f84b0>] rcu_dump_cpu_stacks+0x98/0xd8
[ 265.398500] [<ffffff80080f7e6c>] rcu_check_callbacks+0x61c/0x7e0
[ 265.404628] [<ffffff80080fafe4>] update_process_times+0x2c/0x58
[ 265.410481] [<ffffff800810986c>] tick_sched_handle.isra.5+0x34/0x50
[ 265.417316] [<ffffff80081098c8>] tick_sched_timer+0x40/0x90
[ 265.422723] [<ffffff80080fba40>] __hrtimer_run_queues+0xe8/0x160
[ 265.428934] [<ffffff80080fbcd0>] hrtimer_interrupt+0xa0/0x220
[ 265.435142] [<ffffff8008532548>] arch_timer_handler_phys+0x30/0x40
[ 265.441173] [<ffffff80080ec590>] handle_percpu_devid_irq+0x78/0x128
[ 265.447922] [<ffffff80080e6ec4>] generic_handle_irq+0x24/0x38
[ 265.453863] [<ffffff80080e754c>] __handle_domain_irq+0x5c/0xb8
[ 265.460074] [<ffffff8008080dfc>] gic_handle_irq+0xfc/0x1c4
[ 265.465660] Exception stack(0xffffff8008003d80 to 0xffffff8008003ec0)
[ 265.472324] 3d80: 0000000000000000 ffffff80089b7c00 00000000ffffea38 000000401776b000
[ 265.480066] 3da0: 000000000000001f ffffff8008a10000 000000002974debf 0000000000000000
[ 265.488342] 3dc0: 0000000000000040 ffffff8008943e80 0000000000000880 0000000000000000
[ 265.496529] 3de0: 0000000000000001 0000000000000000 0000000000000000 0000000000000010
[ 265.504718] 3e00: ffffff80081a5708 0000007f9431e328 000000000000001c ffffff8008945000
[ 265.512728] 3e20: fffffffffffffff8 ffffff80088584b0 0000000000000001 ffffff80089b7c00
[ 265.520649] 3e40: 000000000000003d ffffff8008857000 ffffff8008857000 0000000000000040
[ 265.529015] 3e60: ffffff8008950800 ffffff8008003ec0 ffffff80080a73f8 ffffff8008003ec0
[ 265.537111] 3e80: ffffff8008080f74 0000000040000145 0000000000000001 ffffffc01e9e8600
[ 265.545120] 3ea0: 0000008000000000 0000000000000001 ffffff8008003ec0 ffffff8008080f74
[ 265.553310] [<ffffff80080828f0>] el1_irq+0xb0/0x140
[ 265.558264] [<ffffff8008080f74>] __do_softirq+0xac/0x208
[ 265.563665] [<ffffff80080a73f8>] irq_exit+0xc8/0x100
[ 265.568704] [<ffffff80080e7550>] __handle_domain_irq+0x60/0xb8
[ 265.574911] [<ffffff8008080dfc>] gic_handle_irq+0xfc/0x1c4
[ 265.580759] Exception stack(0xffffff8008943dd0 to 0xffffff8008943f10)
[ 265.587153] 3dc0: 0000000000000000 0000000000000000
[ 265.595431] 3de0: 0000000000000001 0000000000000000 ffffff8008943f10 000000401776b000
[ 265.603083] 3e00: 0000000000000001 ffffff80089484e0 0000000000000000 ffffff8008943e80
[ 265.611270] 3e20: 0000000000000880 0000000000000000 0000000000000001 0000000000000000
[ 265.619815] 3e40: 0000000000000000 0000000000000010 ffffff80081a5708 0000007f9431e328
[ 265.627651] 3e60: 000000000000001c ffffff8008857000 ffffff8008949930 ffffff8008949000
[ 265.636017] 3e80: ffffff800885ea88 0000000000000000 0000000000000000 ffffff8008950800
[ 265.644030] 3ea0: 0000000000000000 000000001ff26364 0000000000820018 ffffff8008943f10
[ 265.651953] 3ec0: ffffff8008084a64 ffffff8008943f10 ffffff8008084a68 0000000060000145
[ 265.660054] 3ee0: ffffffc01ffffb00 ffffff800884f028 ffffffffffffffff 0000000000000000
[ 265.668420] 3f00: ffffff8008943f10 ffffff8008084a68
[ 265.673020] [<ffffff80080828f0>] el1_irq+0xb0/0x140
[ 265.678238] [<ffffff8008084a68>] arch_cpu_idle+0x10/0x18
[ 265.683640] [<ffffff80080da43c>] do_idle+0x10c/0x1a0
[ 265.688678] [<ffffff80080da64c>] cpu_startup_entry+0x24/0x28
[ 265.694887] [<ffffff800868719c>] rest_init+0xac/0xb8
[ 265.700107] [<ffffff8008820b98>] start_kernel+0x390/0x3a4
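
For readers working through a trace like the one above, here is a minimal Python sketch that pulls the symbol names out of the "Call trace" frames. The helper name `parse_frames` and the regular expression are illustrative assumptions, not part of any kernel tooling; the sample lines are copied from the attachment.

```python
import re

# Matches frames such as "[<ffffff8008084a68>] arch_cpu_idle+0x10/0x18".
# Register dump lines ("3d80: 0000...") carry no "[<addr>]" and are skipped.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(log_text):
    """Return (symbol, offset, size) for every stack frame in log_text."""
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(
                (m.group("sym"), int(m.group("off"), 16), int(m.group("size"), 16))
            )
    return frames


SAMPLE = """\
[ 265.553310] [<ffffff80080828f0>] el1_irq+0xb0/0x140
[ 265.558264] [<ffffff8008080f74>] __do_softirq+0xac/0x208
[ 265.563665] [<ffffff80080a73f8>] irq_exit+0xc8/0x100
[ 265.410481] [<ffffff800810986c>] tick_sched_handle.isra.5+0x34/0x50
"""

if __name__ == "__main__":
    for sym, off, size in parse_frames(SAMPLE):
        print(f"{sym} (+{off:#x} of {size:#x})")
```

Read this way, the frames in the attachment show the timer interrupt arriving while CPU 0 was in `__do_softirq`, called from `irq_exit`. That may be a useful starting point, though the trace alone does not identify which softirq was running.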
* Re: pci-aardvark ath9k arm64 issues
From: Lorenzo Pieralisi @ 2018-01-16 10:09 UTC
To: Marek Behun; +Cc: Bjorn Helgaas, linux-pci, Thomas Petazzoni, marc.zyngier
[+cc Marc, for his information]
On Sat, Jan 13, 2018 at 12:22:34AM +0100, Marek Behun wrote:
> Hello,
>
> we are seeing a CPU stall with the ath9k driver on pci-aardvark
> (Marvell Armada 3720, arm64).
>
> The ath9k driver loads correctly, the interface connects, and it works
> for some time, but then the CPU stalls and the kernel reports a
> self-detected rcu_sched stall.
>
> I don't know whether this is an issue with aardvark or with arm64 as a
> whole (someone had a similar problem, see
> https://www.spinics.net/lists/linux-wireless/msg157038.html ), but
> ath10k doesn't have this problem.
>
> I am attaching the rcu_sched stall detection backtrace (although I
> don't know whether it contains the needed information).
>
> Can you please point me to how to debug or solve this issue?
I do not think this is arm64 related - IMO it is host bridge related.
I have neither the HW to test this nor the Marvell Armada 3720
datasheets.
I think pci-aardvark is another host bridge that needs IRQ handling
updates according to:
https://marc.info/?l=linux-pci&m=151517416712010&w=2
The sooner we convert the host bridges to use the right API the
better; at least we will be able to fix these bugs more quickly.
I hope Thomas can help you look into this.
Thanks,
Lorenzo
> Thank you.
>
> Marek Behun
* Re: pci-aardvark ath9k arm64 issues
From: Thomas Petazzoni @ 2018-01-16 10:21 UTC
To: Marek Behun
Cc: Lorenzo Pieralisi, Bjorn Helgaas, linux-pci, Grégory Clement,
Antoine Ténart, Miquèl Raynal
Hello Marek,
Thanks for your bug report!
On Sat, 13 Jan 2018 00:22:34 +0100, Marek Behun wrote:
> we are seeing a CPU stall with the ath9k driver on pci-aardvark
> (Marvell Armada 3720, arm64).
>
> The ath9k driver loads correctly, the interface connects, and it works
> for some time, but then the CPU stalls and the kernel reports a
> self-detected rcu_sched stall.
>
> I don't know whether this is an issue with aardvark or with arm64 as a
> whole (someone had a similar problem, see
> https://www.spinics.net/lists/linux-wireless/msg157038.html ), but
> ath10k doesn't have this problem.
>
> I am attaching the rcu_sched stall detection backtrace (although I
> don't know whether it contains the needed information).
>
> Can you please point me to how to debug or solve this issue?
Could you try applying the following patches?
https://patchwork.ozlabs.org/patch/819586/
https://patchwork.ozlabs.org/patch/819589/
https://patchwork.ozlabs.org/patch/819587/
https://patchwork.ozlabs.org/patch/819592/
https://patchwork.ozlabs.org/patch/819590/
https://patchwork.ozlabs.org/patch/819591/
https://patchwork.ozlabs.org/patch/819588/
Please let us know whether they help. If they do, could you then try
with only https://patchwork.ozlabs.org/patch/819592/ applied?
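
As a convenience for anyone repeating this test, the following Python sketch turns the patch pages listed above into commands that could be pasted into a kernel tree. It rests on two assumptions that are not stated in this thread: that Patchwork serves the raw patch at the page URL plus `/mbox/`, and that the patches should be applied in the order listed. The helper names are hypothetical, and the script only prints commands; it does not run them.

```python
# Patch IDs in the order they are listed in the message above.
PATCH_IDS = [819586, 819589, 819587, 819592, 819590, 819591, 819588]
BASE = "https://patchwork.ozlabs.org/patch"


def mbox_url(patch_id):
    """Build the raw-patch URL (assumes Patchwork's '/mbox/' suffix)."""
    return f"{BASE}/{patch_id}/mbox/"


def apply_commands(patch_ids):
    """Return one 'curl | git am' shell command per patch, in list order."""
    return [f"curl -sL {mbox_url(pid)} | git am" for pid in patch_ids]


if __name__ == "__main__":
    # First pass: the whole series.
    for cmd in apply_commands(PATCH_IDS):
        print(cmd)
    # Second pass, on a clean tree: only the single patch asked about.
    print(apply_commands([819592])[0])
```

If a patch fails to apply, `git am --abort` restores the tree so the next attempt starts clean.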
Thanks!
Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com