From: bugzilla-daemon@bugzilla.kernel.org
To: linux-scsi@vger.kernel.org
Subject: [Bug 206123] aacraid ( PM8068) and iommu=nobypass Frozen PHB error on ppc64
Date: Wed, 06 May 2020 08:21:50 +0000
Message-ID: <bug-206123-11613-FxV5I6y4ak@https.bugzilla.kernel.org/>
In-Reply-To: <bug-206123-11613@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=206123
--- Comment #6 from gyakovlev@gentoo.org ---
Tried Linux 5.6.10. The error now happens right at boot, but at least the
controller reset seems to work; before, a reboot was needed to access the disks again.
[May 6 01:10] PowerNV: IOMMU bypass window disabled.
...
[ 24.609683] Adaptec aacraid driver 1.2.1[50983]-custom
[ 24.609784] aacraid 0002:01:00.0: enabling device (0140 -> 0142)
[ 24.628036] aacraid: Comm Interface type3 enabled
...
[ 25.661962] EEH: Recovering PHB#2-PE#fd
[ 25.662010] EEH: PE location: UOPWR.A100034-Node0-Builtin SAS, PHB location: N/A
[ 25.662097] EEH: Frozen PHB#2-PE#fd detected
[ 25.662145] EEH: Call Trace:
[ 25.662186] EEH: [(____ptrval____)] __eeh_send_failure_event+0x60/0x110
[ 25.662282] EEH: [(____ptrval____)] eeh_dev_check_failure+0x360/0x5f0
[ 25.662373] EEH: [(____ptrval____)] eeh_check_failure+0x98/0x100
[ 25.666794] EEH: [(____ptrval____)] aac_src_check_health+0x8c/0xc0
[ 25.669770] EEH: [(____ptrval____)] aac_command_thread+0x718/0x930
[ 25.672745] EEH: [(____ptrval____)] kthread+0x180/0x190
[ 25.675719] EEH: [(____ptrval____)] ret_from_kernel_thread+0x5c/0x6c
[ 25.678722] EEH: This PCI device has failed 1 times in the last hour and will be permanently disabled after 5 failures.
[ 25.681822] EEH: Notify device drivers to shutdown
[ 25.684910] EEH: Beginning: 'error_detected(IO frozen)'
[ 25.688007] PCI 0002:01:00.0#00fd: EEH: Invoking aacraid->error_detected(IO frozen)
[ 25.688011] aacraid 0002:01:00.0: aacraid: PCI error detected 2
[ 25.695317] PCI 0002:01:00.0#00fd: EEH: aacraid driver reports: 'need reset'
[ 25.695320] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset'
[ 25.695325] EEH: Collect temporary log
[ 25.695354] EEH: of node=0002:01:00.0
[ 25.695358] EEH: PCI device/vendor: 028d9005
[ 25.695361] EEH: PCI cmd/status register: 00100146
[ 25.695362] EEH: PCI-E capabilities and status follow:
[ 25.695376] EEH: PCI-E 00: 00020010 000081a2 00002950 00437083
[ 25.695387] EEH: PCI-E 10: 10820000 00000000 00000000 00000000
[ 25.695389] EEH: PCI-E 20: 00000000
[ 25.695391] EEH: PCI-E AER capability register set follows:
[ 25.695404] EEH: PCI-E AER 00: 30020001 00000000 00400000 00462030
[ 25.695415] EEH: PCI-E AER 10: 00000000 0000e000 000001e0 00000000
[ 25.695426] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
[ 25.695430] EEH: PCI-E AER 30: 00000000 00000000
[ 25.695432] PHB4 PHB#2 Diag-data (Version: 1)
[ 25.695434] brdgCtl: 00000002
[ 25.695436] RootSts: 00000040 00402000 e0820008 00100107 00000000
[ 25.695438] PhbSts: 0000001c00000000 0000001c00000000
[ 25.695440] Lem: 0000000000000080 0000000000000000 0000000000000080
[ 25.695443] PhbErr: 0000020000000000 0000020000000000 2148000098000240 a008400000000000
[ 25.695445] RxeTceErr: 6000000000000000 2000000000000000 40000000000000fd 0000000000000000
[ 25.695450] PE[0fd] A/B: 8000b03800000000 8000000000000000
[ 25.695453] EEH: Reset without hotplug activity
...
aacraid 0002:01:00.0: enabling device (0140 -> 0142)
[ 1392.284584276,3] PHB#0002[0:2]: brdgCtl = 00000002
[ 1392.284685636,3] PHB#0002[0:2]: deviceStatus = 00000040
[ 1392.284739080,3] PHB#0002[0:2]: slotStatus = 00402000
[ 1392.284804382,3] PHB#0002[0:2]: linkStatus = e0820008
[ 1392.284857805,3] PHB#0002[0:2]: devCmdStatus = 00100107
[ 1392.284899389,3] PHB#0002[0:2]: devSecStatus = 00000000
[ 1392.284948786,3] PHB#0002[0:2]: rootErrorStatus = 00000000
[ 1392.285006352,3] PHB#0002[0:2]: corrErrorStatus = 00000000
[ 1392.285055882,3] PHB#0002[0:2]: uncorrErrorStatus = 00000000
[ 1392.285113499,3] PHB#0002[0:2]: devctl = 00000040
[ 1392.285162880,3] PHB#0002[0:2]: devStat = 00000000
[ 1392.285224300,3] PHB#0002[0:2]: tlpHdr1 = 00000000
[ 1392.285285888,3] PHB#0002[0:2]: tlpHdr2 = 00000000
[ 1392.285355027,3] PHB#0002[0:2]: tlpHdr3 = 00000000
[ 1392.285404499,3] PHB#0002[0:2]: tlpHdr4 = 00000000
[ 1392.285473783,3] PHB#0002[0:2]: sourceId = 00000000
[ 1392.285523293,3] PHB#0002[0:2]: nFir = 0000000000000000
[ 1392.285599065,3] PHB#0002[0:2]: nFirMask = 0030001c00000000
[ 1392.285658870,3] PHB#0002[0:2]: nFirWOF = 0000000000000000
[ 1392.285718721,3] PHB#0002[0:2]: phbPlssr = 0000001c00000000
[ 1392.285778426,3] PHB#0002[0:2]: phbCsr = 0000001c00000000
[ 1392.285834260,3] PHB#0002[0:2]: lemFir = 0000000000000080
[ 1392.285894227,3] PHB#0002[0:2]: lemErrorMask = 0000000000000000
[ 1392.285954146,3] PHB#0002[0:2]: lemWOF = 0000000000000080
[ 1392.286017988,3] PHB#0002[0:2]: phbErrorStatus = 0000020000000000
[ 1392.286085562,3] PHB#0002[0:2]: phbFirstErrorStatus = 0000020000000000
[ 1392.286145499,3] PHB#0002[0:2]: phbErrorLog0 = 2148000098000240
[ 1392.286205500,3] PHB#0002[0:2]: phbErrorLog1 = a008400000000000
[ 1392.286265282,3] PHB#0002[0:2]: phbTxeErrorStatus = 0000000000000000
[ 1392.286328808,3] PHB#0002[0:2]: phbTxeFirstErrorStatus = 0000000000000000
[ 1392.286388242,3] PHB#0002[0:2]: phbTxeErrorLog0 = 0000000000000000
[ 1392.286448308,3] PHB#0002[0:2]: phbTxeErrorLog1 = 0000000000000000
[ 1392.286508132,3] PHB#0002[0:2]: phbRxeArbErrorStatus = 0000000000000000
[ 1392.286568068,3] PHB#0002[0:2]: phbRxeArbFrstErrorStatus = 0000000000000000
[ 1392.286623656,3] PHB#0002[0:2]: phbRxeArbErrorLog0 = 0000000000000000
[ 1392.286683206,3] PHB#0002[0:2]: phbRxeArbErrorLog1 = 0000000000000000
[ 1392.286743009,3] PHB#0002[0:2]: phbRxeMrgErrorStatus = 0000000000000000
[ 1392.286802898,3] PHB#0002[0:2]: phbRxeMrgFrstErrorStatus = 0000000000000000
[ 1392.286862689,3] PHB#0002[0:2]: phbRxeMrgErrorLog0 = 0000000000000000
[ 1392.286922435,3] PHB#0002[0:2]: phbRxeMrgErrorLog1 = 0000000000000000
[ 1392.286982236,3] PHB#0002[0:2]: phbRxeTceErrorStatus = 6000000000000000
[ 1392.287042233,3] PHB#0002[0:2]: phbRxeTceFrstErrorStatus = 2000000000000000
[ 1392.287101957,3] PHB#0002[0:2]: phbRxeTceErrorLog0 = 40000000000000fd
[ 1392.287161569,3] PHB#0002[0:2]: phbRxeTceErrorLog1 = 0000000000000000
[ 1392.287221038,3] PHB#0002[0:2]: phbPblErrorStatus = 0000000000000000
[ 1392.287280741,3] PHB#0002[0:2]: phbPblFirstErrorStatus = 0000000000000000
[ 1392.287336316,3] PHB#0002[0:2]: phbPblErrorLog0 = 0000000000000000
[ 1392.287407731,3] PHB#0002[0:2]: phbPblErrorLog1 = 0000000000000000
[ 1392.287479365,3] PHB#0002[0:2]: phbPcieDlpErrorLog1 = 0000000000000000
[ 1392.287550878,3] PHB#0002[0:2]: phbPcieDlpErrorLog2 = 0000000000000000
[ 1392.287622331,3] PHB#0002[0:2]: phbPcieDlpErrorStatus = 0000000000000000
[ 1392.287682208,3] PHB#0002[0:2]: phbRegbErrorStatus = 0040000000000000
[ 1392.287741819,3] PHB#0002[0:2]: phbRegbFirstErrorStatus = 0000000000000000
[ 1392.287801590,3] PHB#0002[0:2]: phbRegbErrorLog0 = 4800003c00000000
[ 1392.287861285,3] PHB#0002[0:2]: phbRegbErrorLog1 = 0000000000000200
[ 1392.287921850,3] PHB#0002[0:2]: PEST[0fd] = 8000b03800000000 8000000000000000
EEH: Beginning: 'slot_reset'
PCI 0002:01:00.0#00fd: EEH: Invoking aacraid->slot_reset()
aacraid 0002:01:00.0: aacraid: PCI error - slot reset
PCI 0002:01:00.0#00fd: EEH: aacraid driver reports: 'recovered'
EEH: Finished:'slot_reset' with aggregate recovery state:'recovered'
EEH: Notify device driver to resume
EEH: Beginning: 'resume'
PCI 0002:01:00.0#00fd: EEH: Invoking aacraid->resume()
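
A few values in the dump above can be read with plain bit arithmetic. The following is a minimal sketch, not taken from the kernel or OPAL sources. The helper name `split_device_vendor` is hypothetical. Treating the low byte of the third RxeTceErr word as a PE number is only an assumption, made because it happens to equal the frozen PE (0xfd) in this log; it is not a statement about the documented register layout.

```python
def split_device_vendor(dword_hex):
    """Split PCI config dword 0 into (vendor_id, device_id).

    The PCI configuration header stores the vendor ID in the low 16 bits
    and the device ID in the high 16 bits of the first dword.
    """
    value = int(dword_hex, 16)
    vendor_id = value & 0xFFFF
    device_id = (value >> 16) & 0xFFFF
    return vendor_id, device_id


# "EEH: PCI device/vendor: 028d9005" from the log above
vendor, device = split_device_vendor("028d9005")
print(f"vendor=0x{vendor:04x} device=0x{device:04x}")

# "enabling device (0140 -> 0142)": which command-register bit changed?
changed_bits = 0x0140 ^ 0x0142
print(f"changed command bits=0x{changed_bits:04x}")  # bit 1 is Memory Space Enable

# Third RxeTceErr word (phbRxeTceErrorLog0) from the log above.
# ASSUMPTION: the low byte is compared with the PE number only because
# the two values coincide here; the real field layout is not confirmed.
tce_log0 = int("40000000000000fd", 16)
low_byte = tce_log0 & 0xFF
print(f"low byte of phbRxeTceErrorLog0=0x{low_byte:02x}")
```

Run as-is, this prints vendor 0x9005 (Adaptec's PCI vendor ID) with device 0x028d, a changed command bit of 0x0002, and a low byte of 0xfd.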
--
You are receiving this mail because:
You are watching the assignee of the bug.