* AHCI bug?: a lockup in ahci_interrupt with fbs enabled pmp
@ 2013-06-06 6:31 Yu Liu
From: Yu Liu @ 2013-06-06 6:31 UTC
To: linux-ide
Hi all,
I hit a lockup while running an I/O test on disks connected
through an FBS-enabled PMP board to an AHCI host.
It looks like the lockup is caused by the following call chain:
ahci_interrupt()
  | spin_lock(&host->lock); // get host->lock
  | ahci_port_intr()
    | ahci_error_intr() // status & PORT_IRQ_ERROR
      | ata_link_online() // if fbs_enabled
        | sata_scr_read()
          | sata_pmp_scr_read() // using pmp
            | ata_exec_internal()
              | ata_exec_internal_sg()
                | spin_lock_irqsave(ap->lock, flags);
Since ap->lock == &host->lock, the inner spin_lock_irqsave() tries to
take a lock that the same CPU already holds, so it spins forever.
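For illustration only, here is a minimal user-space sketch of that pattern. Every toy_* name is made up for this example and none of it is libata code. It assumes a non-recursive spinlock, and it bounds the spin count so the demo terminates; a real spin_lock() would simply never return.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy non-recursive spinlock; a stand-in for spinlock_t. */
struct toy_spinlock { atomic_flag held; };

/* Hypothetical stand-ins for struct ata_host and struct ata_port. */
struct toy_host { struct toy_spinlock lock; };
struct toy_port { struct toy_spinlock *lock; };

/* Spin at most max_spins times; a real spin_lock() has no such limit. */
static bool toy_spin_lock_bounded(struct toy_spinlock *l, unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; i++)
        if (!atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
            return true;
    return false; /* gave up: the lock never became free */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Models ata_exec_internal_sg(): needs ap->lock for itself. */
static bool toy_exec_internal(struct toy_port *ap)
{
    if (!toy_spin_lock_bounded(ap->lock, 1000))
        return false;
    toy_spin_unlock(ap->lock);
    return true;
}

/* Models ahci_interrupt(): holds host->lock across the whole call chain. */
bool toy_interrupt(struct toy_host *host, struct toy_port *ap)
{
    bool ok;

    if (!toy_spin_lock_bounded(&host->lock, 1000))
        return false;
    ok = toy_exec_internal(ap); /* inner acquire happens with the lock held */
    toy_spin_unlock(&host->lock);
    return ok;
}

/* share_lock == true mirrors ap->lock == &host->lock. */
bool toy_demo(bool share_lock)
{
    struct toy_host host;
    struct toy_spinlock own;
    struct toy_port port;

    atomic_flag_clear(&host.lock.held);
    atomic_flag_clear(&own.held);
    port.lock = share_lock ? &host.lock : &own;
    return toy_interrupt(&host, &port);
}
```

With share_lock set, the inner acquire can never succeed because the only holder is the caller itself; with a separate lock the same call chain completes.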
Did I miss anything? Can someone confirm the issue?
my dump info is listed below:
---
RIP: 0010:[<ffffffff814c867f>] [<ffffffff814c867f>] _spin_lock_irqsave+0x2f/0x40
RSP: 0000:ffff880001e43c38 EFLAGS: 00000097
RAX: 000000000000b685 RBX: ffff88007654de48 RCX: 000000000000b684
RDX: 0000000000000082 RSI: ffff880001e43d88 RDI: ffff88007930c158
RBP: ffff880001e43c38 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000001ff8 R11: 0000000000000246 R12: ffff88007654c000
R13: ffff88007654de48 R14: ffff88007654dce0 R15: 0000000000000000
FS: 00007fc12e0eb700(0000) GS:ffff880001e40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f3b55311cc1 CR3: 0000000078f7a000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process gzip (pid: 15126, threadinfo ffff880067a86000, task ffff880037d02a70)
Stack:
ffff880001e43ce8 ffffffff813558f7 ffff880000000003 0000000000000000
<0> ffffffff81328537 ffff880001e43cc0 000000008134743b ffff880001e43cc0
<0> e4ffffff81262545 ffff880001e43d88 0000000000000000 ffff880001e43cc0
Call Trace:
<IRQ>
[<ffffffff813558f7>] ata_exec_internal_sg+0x67/0x570
[<ffffffff81328537>] ? put_device+0x17/0x20
[<ffffffff81355e79>] ata_exec_internal+0x79/0xb0
[<ffffffff8134699f>] ? scsi_run_queue+0xcf/0x380
[<ffffffff813406c0>] ? __scsi_put_command+0x60/0xa0
[<ffffffff8136740f>] sata_pmp_read+0x7f/0xb0
[<ffffffff8101adf5>] ? native_sched_clock+0x15/0x70
[<ffffffff81367505>] sata_pmp_scr_read+0x35/0xb0
[<ffffffff81353096>] sata_scr_read+0x26/0x60
[<ffffffff81353708>] ata_phys_link_online+0x18/0x30
[<ffffffff81353750>] ata_link_online+0x30/0x70
[<ffffffffa066b964>] ahci_interrupt+0x684/0x790 [ahci]
[<ffffffff810d7750>] handle_IRQ_event+0x60/0x170
[<ffffffff81073983>] ? __do_softirq+0x113/0x1d0
[<ffffffff810d9e46>] handle_edge_irq+0xc6/0x160
[<ffffffff81015fb9>] handle_irq+0x49/0xa0
[<ffffffff814cd62c>] do_IRQ+0x6c/0xf0
[<ffffffff81013ad3>] ret_from_intr+0x0/0x11
<EOI>
Thanks,
Yu
* AHCI bug: a lockup in ahci_interrupt with fbs enabled pmp
@ 2013-06-06 6:33 Yu Liu
2013-06-06 21:47 ` Tejun Heo
From: Yu Liu @ 2013-06-06 6:33 UTC
To: linux-ide
Hi all,
I hit a lockup while running an I/O test on disks connected
through an FBS-enabled PMP board to an AHCI host.
It looks like the lockup is caused by the following call chain:
ahci_interrupt()
  | spin_lock(&host->lock); // get host->lock
  | ahci_port_intr()
    | ahci_error_intr() // status & PORT_IRQ_ERROR
      | ata_link_online() // if fbs_enabled
        | sata_scr_read()
          | sata_pmp_scr_read() // using pmp
            | ata_exec_internal()
              | ata_exec_internal_sg()
                | spin_lock_irqsave(ap->lock, flags);
Since ap->lock == &host->lock, the inner spin_lock_irqsave() tries to
take a lock that the same CPU already holds, so it spins forever.
Can someone confirm the issue? Did I miss anything?
my dump info is listed below:
---
RIP: 0010:[<ffffffff814c867f>] [<ffffffff814c867f>] _spin_lock_irqsave+0x2f/0x40
RSP: 0000:ffff880001e43c38 EFLAGS: 00000097
RAX: 000000000000b685 RBX: ffff88007654de48 RCX: 000000000000b684
RDX: 0000000000000082 RSI: ffff880001e43d88 RDI: ffff88007930c158
RBP: ffff880001e43c38 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000001ff8 R11: 0000000000000246 R12: ffff88007654c000
R13: ffff88007654de48 R14: ffff88007654dce0 R15: 0000000000000000
FS: 00007fc12e0eb700(0000) GS:ffff880001e40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f3b55311cc1 CR3: 0000000078f7a000 CR4: 00000000000026e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process gzip (pid: 15126, threadinfo ffff880067a86000, task ffff880037d02a70)
Stack:
ffff880001e43ce8 ffffffff813558f7 ffff880000000003 0000000000000000
<0> ffffffff81328537 ffff880001e43cc0 000000008134743b ffff880001e43cc0
<0> e4ffffff81262545 ffff880001e43d88 0000000000000000 ffff880001e43cc0
Call Trace:
<IRQ>
[<ffffffff813558f7>] ata_exec_internal_sg+0x67/0x570
[<ffffffff81328537>] ? put_device+0x17/0x20
[<ffffffff81355e79>] ata_exec_internal+0x79/0xb0
[<ffffffff8134699f>] ? scsi_run_queue+0xcf/0x380
[<ffffffff813406c0>] ? __scsi_put_command+0x60/0xa0
[<ffffffff8136740f>] sata_pmp_read+0x7f/0xb0
[<ffffffff8101adf5>] ? native_sched_clock+0x15/0x70
[<ffffffff81367505>] sata_pmp_scr_read+0x35/0xb0
[<ffffffff81353096>] sata_scr_read+0x26/0x60
[<ffffffff81353708>] ata_phys_link_online+0x18/0x30
[<ffffffff81353750>] ata_link_online+0x30/0x70
[<ffffffffa066b964>] ahci_interrupt+0x684/0x790 [ahci]
[<ffffffff810d7750>] handle_IRQ_event+0x60/0x170
[<ffffffff81073983>] ? __do_softirq+0x113/0x1d0
[<ffffffff810d9e46>] handle_edge_irq+0xc6/0x160
[<ffffffff81015fb9>] handle_irq+0x49/0xa0
[<ffffffff814cd62c>] do_IRQ+0x6c/0xf0
[<ffffffff81013ad3>] ret_from_intr+0x0/0x11
<EOI>
* Re: AHCI bug: a lockup in ahci_interrupt with fbs enabled pmp
2013-06-06 6:33 AHCI bug: a lockup in ahci_interrupt with fbs enabled pmp Yu Liu
@ 2013-06-06 21:47 ` Tejun Heo
2013-06-07 4:29 ` Huang, Shane
From: Tejun Heo @ 2013-06-06 21:47 UTC
To: Yu Liu; +Cc: linux-ide, Shane Huang
Cc'ing Shane.
On Thu, Jun 06, 2013 at 02:33:20PM +0800, Yu Liu wrote:
> Hi all,
>
> I hit a lockup while running an I/O test on disks connected
> through an FBS-enabled PMP board to an AHCI host.
>
> It looks like the lockup is caused by the following call chain:
> ahci_interrupt()
>   | spin_lock(&host->lock); // get host->lock
>   | ahci_port_intr()
>     | ahci_error_intr() // status & PORT_IRQ_ERROR
>       | ata_link_online() // if fbs_enabled
>         | sata_scr_read()
>           | sata_pmp_scr_read() // using pmp
>             | ata_exec_internal()
>               | ata_exec_internal_sg()
>                 | spin_lock_irqsave(ap->lock, flags);
> Since ap->lock == &host->lock, the inner spin_lock_irqsave() tries to
> take a lock that the same CPU already holds, so it spins forever.
>
> Can someone confirm the issue? Did I miss anything?
Yeah, it's a bug. ata_link_online() can't be called from interrupt
handlers. Shane? Can you please look into it? What's the purpose of
ata_link_online() in ahci_error_intr()?
Thanks.
--
tejun
* RE: AHCI bug: a lockup in ahci_interrupt with fbs enabled pmp
2013-06-06 21:47 ` Tejun Heo
@ 2013-06-07 4:29 ` Huang, Shane
2013-06-07 22:56 ` Tejun Heo
From: Huang, Shane @ 2013-06-07 4:29 UTC
To: Tejun Heo, Yu Liu; +Cc: linux-ide@vger.kernel.org, Huang, Shane
> Yeah, it's a bug. ata_link_online() can't be called from interrupt
> handlers. Shane? Can you please look into it? What's the purpose
> of ata_link_online() in ahci_error_intr()?
ata_link_online() was used to check that the PMP link is active...
Should it be replaced by ata_link_active()?
Thanks,
Shane
* Re: AHCI bug: a lockup in ahci_interrupt with fbs enabled pmp
2013-06-07 4:29 ` Huang, Shane
@ 2013-06-07 22:56 ` Tejun Heo
2013-06-08 5:59 ` Huang, Shane
From: Tejun Heo @ 2013-06-07 22:56 UTC
To: Huang, Shane; +Cc: Yu Liu, linux-ide@vger.kernel.org
On Fri, Jun 07, 2013 at 04:29:47AM +0000, Huang, Shane wrote:
> > Yeah, it's a bug. ata_link_online() can't be called from interrupt
> > handlers. Shane? Can you please look into it? What's the purpose
> > of ata_link_online() in ahci_error_intr()?
>
> ata_link_online() was used to check that the PMP link is active...
> Should it be replaced by ata_link_active()?
ata_link_active() asks whether there are commands in progress. I
don't think that fits in there. Can't it just bounce to EH for actual
error handling? Why is the link online check necessary?
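As a rough illustration of the distinction above (a toy model, not libata code): an "are commands in progress?" check only reads driver state already in memory, while an "is the link online?" check on a PMP link needs a command round trip, which belongs in EH rather than in the interrupt path. All toy_* names, the field layout, and the query_online callback are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical marker for "no non-NCQ command in flight". */
#define TOY_TAG_NONE UINT32_MAX

/* Toy stand-in for struct ata_link; the field set is an assumption. */
struct toy_link {
    uint32_t active_tag;  /* tag of the in-flight non-NCQ command */
    uint32_t sactive;     /* bitmap of in-flight NCQ tags */
    uint32_t err_mask;    /* errors recorded by the interrupt path */
    bool eh_scheduled;    /* set in interrupt context, consumed by EH */
};

/* "Are commands in progress?" Pure memory reads, fine under the lock. */
bool toy_link_active(const struct toy_link *link)
{
    return link->active_tag != TOY_TAG_NONE || link->sactive != 0;
}

/* Interrupt side: record the error and defer; issue no commands here. */
void toy_error_intr(struct toy_link *link, uint32_t err)
{
    link->err_mask |= err;
    link->eh_scheduled = true;
}

/* Sample callback standing in for a blocking link-status query. */
bool toy_always_online(void)
{
    return true;
}

/*
 * EH side, process context: blocking is allowed, so the link can be
 * queried here. Returns true if recovery ran on an online link.
 */
bool toy_eh_run(struct toy_link *link, bool (*query_online)(void))
{
    if (!link->eh_scheduled)
        return false;
    link->eh_scheduled = false;
    if (!query_online())
        return false;     /* link is gone, nothing to recover */
    link->err_mask = 0;   /* pretend recovery succeeded */
    return true;
}
```

The point of the split is that toy_error_intr() never needs a lock beyond the one its caller already holds, so it cannot re-enter the locking path the way the online check does.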
Thanks.
--
tejun
* RE: AHCI bug: a lockup in ahci_interrupt with fbs enabled pmp
2013-06-07 22:56 ` Tejun Heo
@ 2013-06-08 5:59 ` Huang, Shane
2013-06-08 6:01 ` Tejun Heo
From: Huang, Shane @ 2013-06-08 5:59 UTC
To: Tejun Heo; +Cc: Yu Liu, linux-ide@vger.kernel.org, Huang, Shane
Tejun,
> ata_link_active() asks whether there are commands in progress. I
> don't think that fits in there. Can't it just bounce to EH for actual
> error handling? Why is the link online check necessary?
I tried hard to recall why I put the ata_link_online() check there. It
turns out you suggested it in 2009 when you reviewed v2 :-)
http://marc.info/?l=linux-ide&m=125170571525422&w=2
Should I submit a patch to remove the online check, or will you handle it?
Thanks,
Shane
* Re: AHCI bug: a lockup in ahci_interrupt with fbs enabled pmp
2013-06-08 5:59 ` Huang, Shane
@ 2013-06-08 6:01 ` Tejun Heo
From: Tejun Heo @ 2013-06-08 6:01 UTC
To: Huang, Shane; +Cc: Yu Liu, linux-ide@vger.kernel.org
Hello,
On Sat, Jun 08, 2013 at 05:59:27AM +0000, Huang, Shane wrote:
> > ata_link_active() asks whether there are commands in progress. I
> > don't think that fits in there. Can't it just bounce to EH for actual
> > error handling? Why is the link online check necessary?
>
> I tried hard to recall why I put the ata_link_online() check there. It
> turns out you suggested it in 2009 when you reviewed v2 :-)
>
> http://marc.info/?l=linux-ide&m=125170571525422&w=2
Heh, so it's my own stupidity. :)
> Should I submit a patch to remove the online check, or will you handle it?
It'd be great if you can submit a patch.
Thanks a lot!
--
tejun