From: James Smart <James.Smart@Emulex.Com>
To: Linas Vepstas <linas@austin.ibm.com>
Cc: linux-scsi@vger.kernel.org, rlary@us.ibm.com,
James Smart <James.Smart@Emulex.Com>
Subject: Re: PCI error recovery for the Emulex LPFC
Date: Tue, 31 Oct 2006 08:51:08 -0500
Message-ID: <454754CC.8040308@emulex.com>
In-Reply-To: <20061030222047.GN6360@austin.ibm.com>
Linas,
I don't know of anything in this area.
I also need a deeper understanding of what the error was and how
it was injected. That plays into it.
Also, PCI error recovery is not a simple task. There are many
aspects of the adapter messaging interface, and of the effects of
the PCI error recovery scheme, that have to be looked at closely. DMA
errors can be fatal, even if the PCI bus survives. In many
cases, the only safe recovery is a hard adapter reset (with little
to no interaction with the adapter to clean up). We can discuss
this more offline if you'd like.
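
A minimal userspace sketch of the "hard reset, minimal adapter
interaction" policy described above. It is an illustration under
assumptions, not lpfc code: the sketch_* enums are local stand-ins that
mirror the kernel's pci_channel_state_t and pci_ers_result_t values,
and struct sketch_adapter and both functions are hypothetical.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel's channel states; defined here only
 * so the sketch builds outside the kernel. */
enum sketch_channel_state {
    SKETCH_IO_NORMAL,
    SKETCH_IO_FROZEN,
    SKETCH_IO_PERM_FAILURE
};

/* Local stand-ins for the results an error_detected callback can give. */
enum sketch_ers_result {
    SKETCH_RESULT_NEED_RESET,
    SKETCH_RESULT_DISCONNECT
};

/* Hypothetical adapter state; a real driver tracks far more than this. */
struct sketch_adapter {
    bool io_blocked;    /* true while no new commands may be issued */
};

/* Conservative policy: do not touch the adapter after an error.
 * Block further I/O and ask the platform to reset the slot, or give
 * up if the platform reports a permanent failure. */
enum sketch_ers_result sketch_error_detected(struct sketch_adapter *a,
                                             enum sketch_channel_state state)
{
    a->io_blocked = true;
    if (state == SKETCH_IO_PERM_FAILURE)
        return SKETCH_RESULT_DISCONNECT;
    return SKETCH_RESULT_NEED_RESET;
}

/* Runs once the platform has reset the slot. A real driver would
 * reinitialise the adapter from scratch here before unblocking I/O. */
void sketch_slot_reset_done(struct sketch_adapter *a)
{
    a->io_blocked = false;
}
```

The point of the design is that nothing in the error path reads or
writes adapter registers; all cleanup is deferred until after the reset.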
-- james s
Linas Vepstas wrote:
> Hi James,
>
> I recently started fiddling with the emulex lpfc driver
> with the idea of adding PCI error recovery support to
> the driver. I'm trying to figure out how to proceed.
>
> Some background: In IBM pSeries, and now newer PCI-E
> based systems, things like parity errors, etc. on the
> PCI bus are detected by the PCI bridge chip, which
> then freezes all further traffic to the adapter.
> When an error condition is detected, there's a
> handful of callbacks made to the device driver, which
> can then try to recover from the error, and move
> forward.
>
> When I/O is frozen, MMIO reads return all ones (0xffff...).
> I injected an error on the lpfc, and the (so far,
> completely unmodified) driver promptly crashed on me:
>
> 0:mon> excp
> cpu 0x0: Vector: 300 (Data Access) at [c0000003fbed3890]
> pc: d000000000aa23c0: .lpfc_dev_loss_tmo_callbk+0x68/0x238 [lpfc]
> lr: c0000000002e9dac: .fc_starget_delete+0x90/0x17c
> sp: c0000003fbed3b10
> msr: 9000000000009032
> dar: 6b6b6b6b6b6b7753
> dsisr: 40000000
> current = 0xc0000003fa4ac7f0
> paca = 0xc000000000523300
> pid = 4714, comm = fc_wq_1
>
> 0:mon> t
> [c0000003fbed3bf0] c0000000002e9dac .fc_starget_delete+0x90/0x17c
> [c0000003fbed3c80] c0000000002ebc5c .fc_rport_final_delete+0x80/0x124
> [c0000003fbed3d20] c000000000067268 .run_workqueue+0xdc/0x168
> [c0000003fbed3dc0] c000000000067d0c .worker_thread+0x140/0x1b0
> [c0000003fbed3ee0] c00000000006c24c .kthread+0x124/0x174
> [c0000003fbed3f90] c000000000024d20 .kernel_thread+0x4c/0x68
>
> This is on 2.6.19-rc1-git11 -- I'll try to track this down
> further, but thought I'd mention it now. Does such a crash
> look familiar?
>
> -- Linas Vepstas
>
>
>
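
The quoted message notes that MMIO reads return all ones while the slot
is frozen. A small userspace sketch of how a driver might treat that
value as a hint that the device is isolated; both function names are
hypothetical, and sim_readl only simulates a register read (it is not
the kernel's readl).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: an all-ones value suggests the read never
 * reached the device. It is only a hint, since a register may
 * legitimately hold all ones. */
bool mmio_value_looks_frozen(uint32_t val)
{
    return val == UINT32_MAX;
}

/* Simulated 32-bit register read for the demo: returns all ones while
 * the (simulated) slot is frozen, otherwise the backing value. */
uint32_t sim_readl(const uint32_t *reg, bool slot_frozen)
{
    return slot_frozen ? UINT32_MAX : *reg;
}
```

A caller that sees the hint should stop and avoid acting on the value,
for example not using it as a length, index, or pointer offset.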