From: Anton Blanchard <anton@samba.org>
To: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: linuxppc-dev@ozlabs.org
Subject: Re: [PATCH v5 00/21] EEH reorganization
Date: Fri, 13 Apr 2012 12:03:46 +1000
Message-ID: <20120413120346.42e01402@kryten>
In-Reply-To: <20120413073931.0c36169b@kryten>

Hi,

> I just hit this on mainline from today (3.4.0-rc2-00065-gf549e08).
> Haven't had a chance to narrow it down yet.

Looking closer, it was caused by an EEH error at boot. It looks like
the Mellanox infiniband card gets an error when probed by their
firmware tool (mstmread), but only if the kernel driver is not loaded.

I see this EEH error back on 3.0, so it's not new.

The question now is why we oops in the EEH code on mainline.

Anton

------------[ cut here ]------------
WARNING: at arch/powerpc/platforms/pseries/eeh.c:492
Modules linked in:
NIP: c000000000056cc4 LR: c000000000056cc0 CTR: c00000000051dd60
REGS: c000001f3953f6a0 TRAP: 0700 Not tainted (3.4.0-rc2-00065-gf549e08-dirty)
MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 28004482 XER: 0000000f
SOFTE: 0
CFAR: c00000000074ea30
TASK = c000001f39685040[19058] 'mstmread' THREAD: c000001f3953c000 CPU: 38
GPR00: c000000000056cc0 c000001f3953f920 c000000000bd3a28 0000000000000021
GPR04: 0000000000000000 ffffffffffffffff 00000000000323f7 0000000000000000
GPR08: 000000006365203c c000000000b10a20 0000000000020000 c000000000a74cc0
GPR12: 0000000024004422 c00000000eda8500 000000003a58582e 00000000583a5858
GPR16: 000000002f585858 0000000069636573 000000002f646576 0000000010003b48
GPR20: 00000fffc7a3d17c 0000000000000058 0000000000000004 c000001f3953fb90
GPR24: 0000000000000000 0000000000000000 c000000000c77088 c000003e6fffeee8
GPR28: c000000000d82680 0000000000000000 c000000000c770d0 0000000000000000
NIP [c000000000056cc4] .eeh_dn_check_failure+0x304/0x320
LR [c000000000056cc0] .eeh_dn_check_failure+0x300/0x320
Call Trace:
[c000001f3953f920] [c000000000056cc0] .eeh_dn_check_failure+0x300/0x320 (unreliable)
[c000001f3953f9d0] [c00000000002717c] .rtas_read_config+0x13c/0x1b0
[c000001f3953fa70] [c0000000003d543c] .pci_user_read_config_dword+0xcc/0x150
[c000001f3953fb20] [c0000000003e19d8] .pci_read_config+0xe8/0x2a0
[c000001f3953fc00] [c00000000022d330] .read+0x130/0x210
[c000001f3953fce0] [c0000000001a723c] .vfs_read+0xec/0x1e0
[c000001f3953fd80] [c0000000001a73ec] .SyS_pread64+0xbc/0xd0
[c000001f3953fe30] [c000000000009780] syscall_exit+0x0/0x7c
Instruction dump:
7f83e378 48001909 60000000 2fbf0000 419e002c e89f00d8 2fa40000 409e0008
e89f0098 e8629fb8 486f7d39 60000000 <0fe00000> 3b200001 4bfffdb4 e8829fa8
---[ end trace a6e6d788c9869e00 ]---
EEH: Detected PCI bus error on device 0006:01:00.0
EEH: This PCI device has failed 1 times in the last hour:
EEH: Bus location=U78AB.001.WZSGRFL-P1-C4-T1 driver= pci addr=0006:01:00.0
EEH: Device location=U78AB.001.WZSGRFL-P1-C4-T1 driver= pci addr=0006:01:00.0
EEH: of node=/pci@800000020000203/pci1014,415@0
EEH: PCI device/vendor: 673c15b3
EEH: PCI cmd/status register: 00100140
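
For readers decoding the last two EEH lines above, here is a minimal sketch. It assumes (this is not stated in the message) that the kernel prints the raw 32-bit config-space dwords at offsets 0x00 and 0x04, so the low half is the vendor ID or command register and the high half is the device ID or status register. The helper names and the bit tables are illustrative only; the tables cover a subset of the standard PCI command and status bits.

```python
# Hypothetical helpers for splitting the two dwords printed in the EEH log.
# Assumption: the values are raw little-endian config dwords at 0x00 and 0x04.

def split_id_dword(value):
    """Config offset 0x00: vendor ID in bits 0-15, device ID in bits 16-31."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF


def split_cmd_status(value):
    """Config offset 0x04: command in bits 0-15, status in bits 16-31."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF


# Subset of the standard PCI command register bits.
COMMAND_BITS = {
    0: "I/O space",
    1: "memory space",
    2: "bus master",
    6: "parity error response",
    8: "SERR# enable",
    10: "INTx disable",
}

# Subset of the standard PCI status register bits.
STATUS_BITS = {
    4: "capabilities list",
    8: "master data parity error",
    11: "signaled target abort",
    12: "received target abort",
    13: "received master abort",
    14: "signaled system error",
    15: "detected parity error",
}


def set_bits(value, names):
    """Return the names of the bits that are set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


if __name__ == "__main__":
    vendor, device = split_id_dword(0x673C15B3)
    command, status = split_cmd_status(0x00100140)
    print(f"vendor={vendor:#06x} device={device:#06x}")
    print(f"command={command:#06x}: {set_bits(command, COMMAND_BITS)}")
    print(f"status={status:#06x}: {set_bits(status, STATUS_BITS)}")
```

Under that assumption the vendor ID comes out as 0x15b3, which is the Mellanox PCI vendor ID, and the device ID as 0x673c. The command register (0x0140) has parity error response and SERR# enabled but neither memory space nor bus mastering, which would be consistent with no kernel driver being bound to the device, and the status register (0x0010) shows only the capabilities-list bit with none of the error bits from the table set.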