* [bug report] WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100
@ 2024-03-19 7:34 Changhui Zhong
2024-03-19 16:23 ` Bjorn Helgaas
0 siblings, 1 reply; 7+ messages in thread
From: Changhui Zhong @ 2024-03-19 7:34 UTC (permalink / raw)
To: linux-pci
Hello,
repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
branch: master
commit HEAD:b3603fcb79b1036acae10602bffc4855a4b9af80
dmesg log:
Rebooting.
[ 292.644951] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
[ 292.644955] {1}[Hardware Error]: event severity: fatal
[ 292.644958] {1}[Hardware Error]: Error 0, type: fatal
[ 292.644959] {1}[Hardware Error]: section_type: PCIe error
[ 292.644960] {1}[Hardware Error]: port_type: 0, PCIe end point
[ 292.644962] {1}[Hardware Error]: version: 3.0
[ 292.644963] {1}[Hardware Error]: command: 0x0002, status: 0x0010
[ 292.644964] {1}[Hardware Error]: device_id: 0000:01:00.1
[ 292.644966] {1}[Hardware Error]: slot: 0
[ 292.644967] {1}[Hardware Error]: secondary_bus: 0x00
[ 292.644968] {1}[Hardware Error]: vendor_id: 0x14e4, device_id: 0x165f
[ 292.644969] {1}[Hardware Error]: class_code: 020000
[ 292.644971] {1}[Hardware Error]: aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00010000
[ 292.644972] {1}[Hardware Error]: aer_uncor_severity: 0x000ef030
[ 292.644973] {1}[Hardware Error]: TLP Header: 40000001 0000020f 90028090 00000000
[ 292.644976] Kernel panic - not syncing: Fatal hardware error!
[ 292.644978] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.8.0+ #1
[ 292.644981] Hardware name: Dell Inc. PowerEdge R640/0X45NX, BIOS 2.19.1 06/04/2023
[ 292.644982] Call Trace:
[ 292.644984] <NMI>
[ 292.644985] panic+0x32b/0x350
[ 292.644995] __ghes_panic+0x69/0x70
[ 292.645000] ghes_in_nmi_queue_one_entry.constprop.0+0x1d9/0x2b0
[ 292.645005] ghes_notify_nmi+0x59/0xd0
[ 292.645007] nmi_handle+0x5b/0x150
[ 292.645014] default_do_nmi+0x40/0x100
[ 292.645017] exc_nmi+0x100/0x180
[ 292.645019] end_repeat_nmi+0xf/0x53
[ 292.645023] RIP: 0010:intel_idle+0x59/0xa0
[ 292.645028] Code: d2 48 89 d1 65 48 8b 05 55 21 73 70 0f 01 c8 48 8b 00 a8 08 75 14 66 90 0f 00 2d 2e 00 43 00 b9 01 00 00 00 48 89 f0 0f 01 c9 <65> 48 8b 05 2f 21 73 70 f0 80 60 02 df f0 83 44 24 fc 00 48 8b 00
[ 292.645030] RSP: 0018:ffffffff90403e48 EFLAGS: 00000046
[ 292.645032] RAX: 0000000000000001 RBX: 0000000000000002 RCX: 0000000000000001
[ 292.645034] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff93d22fa3ffa0
[ 292.645035] RBP: ffff93d22fa3ffa0 R08: 0000000000000002 R09: 00000000fffffffd
[ 292.645036] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff908bbf60
[ 292.645037] R13: ffffffff908bc048 R14: 0000000000000002 R15: 0000000000000000
[ 292.645040] ? intel_idle+0x59/0xa0
[ 292.645043] ? intel_idle+0x59/0xa0
[ 292.645046] </NMI>
[ 292.645046] <TASK>
[ 292.645047] cpuidle_enter_state+0x7d/0x410
[ 292.645050] cpuidle_enter+0x29/0x40
[ 292.645054] cpuidle_idle_call+0xf8/0x160
[ 292.645060] do_idle+0x7a/0xe0
[ 292.645062] cpu_startup_entry+0x25/0x30
[ 292.645065] rest_init+0xcc/0xd0
[ 292.645068] start_kernel+0x325/0x400
[ 292.645072] x86_64_start_reservations+0x14/0x30
[ 292.645076] x86_64_start_kernel+0xed/0xf0
[ 292.645079] common_startup_64+0x13e/0x141
[ 292.645084] </TASK>
[ 292.645101] Kernel Offset: 0xdc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
# lspci -nn -s 01:00.1
01:00.1 Ethernet controller [0200]: Broadcom Inc. and subsidiaries
NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
# lspci -vvv -s 01:00.1
01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme
BCM5720 Gigabit Ethernet PCIe
DeviceName: NIC4
Subsystem: Broadcom Inc. and subsidiaries Device 4160
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 17
NUMA node: 0
Region 0: Memory at 92900000 (64-bit, prefetchable) [size=64K]
Region 2: Memory at 92910000 (64-bit, prefetchable) [size=64K]
Region 4: Memory at 92920000 (64-bit, prefetchable) [size=64K]
Expansion ROM at 90040000 [disabled] [size=256K]
Capabilities: [48] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] Vital Product Data
Product Name: Broadcom NetXtreme Gigabit Ethernet
Read-only fields:
[PN] Part number: BCM95720
[MN] Manufacture ID: 1028
[V0] Vendor specific: FFV22.61.8
[V1] Vendor specific: DSV1028VPDR.VER1.0
[V2] Vendor specific: NPY2
[V3] Vendor specific: PMT1
[V4] Vendor specific: NMVBroadcom Corp
[V5] Vendor specific: DTINIC
[V6] Vendor specific: DCM3001008d454101008d45
[RV] Reserved: checksum good, 233 byte(s) reserved
End
Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [a0] MSI-X: Enable+ Count=17 Masked-
Vector table: BAR=4 offset=00000000
PBA: BAR=4 offset=00001000
Capabilities: [ac] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
<4us, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+
FLReset+ SlotPowerLimit 25.000W
DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd- ExtTag- PhantFunc- AuxPwr+ NoSnoop- FLReset-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq-
AuxPwr+ TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM not supported
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s (ok), Width x2 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
NROPrPrP- LTR-
10BitTagComp- 10BitTagReq- OBFF Not
Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported,
EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 65ms to 210ms,
TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkSta2: Current De-emphasis Level: -6dB,
EqualizationComplete- EqualizationPhase1-
EqualizationPhase2- EqualizationPhase3-
LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt+
UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
AdvNonFatalErr+
CEMsk: RxErr- BadTLP+ BadDLLP+ Rollover+ Timeout+
AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+
ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 40000001 0000020f 90028090 00000000
Capabilities: [13c v1] Device Serial Number 00-00-e4-3d-1a-3c-8b-bb
Capabilities: [150 v1] Power Budgeting <?>
Capabilities: [160 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Kernel driver in use: tg3
Kernel modules: tg3
Thanks,
* Re: [bug report] WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100
2024-03-19 7:34 [bug report] WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100 Changhui Zhong
@ 2024-03-19 16:23 ` Bjorn Helgaas
2024-03-20 2:16 ` Changhui Zhong
0 siblings, 1 reply; 7+ messages in thread
From: Bjorn Helgaas @ 2024-03-19 16:23 UTC (permalink / raw)
To: Changhui Zhong; +Cc: linux-pci
On Tue, Mar 19, 2024 at 03:34:56PM +0800, Changhui Zhong wrote:
> Hello,
>
> repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> branch: master
> commit HEAD:b3603fcb79b1036acae10602bffc4855a4b9af80
Where's the rest of this? I don't see "WARNING: CPU: 0 PID: 226 at
drivers/pci/pci.c:2236" in the snippet below. Please include or post
the complete dmesg log.
Is this reproducible? If so, how? And is it a regression?
> dmesg log:
> Rebooting.
> [ 292.644951] {1}[Hardware Error]: Hardware error from APEI Generic
> Hardware Error Source: 5
> [ 292.644955] {1}[Hardware Error]: event severity: fatal
> [ 292.644958] {1}[Hardware Error]: Error 0, type: fatal
> [ 292.644959] {1}[Hardware Error]: section_type: PCIe error
> [ 292.644960] {1}[Hardware Error]: port_type: 0, PCIe end point
> [ 292.644962] {1}[Hardware Error]: version: 3.0
> [ 292.644963] {1}[Hardware Error]: command: 0x0002, status: 0x0010
> [ 292.644964] {1}[Hardware Error]: device_id: 0000:01:00.1
> [ 292.644966] {1}[Hardware Error]: slot: 0
> [ 292.644967] {1}[Hardware Error]: secondary_bus: 0x00
> [ 292.644968] {1}[Hardware Error]: vendor_id: 0x14e4, device_id: 0x165f
> [ 292.644969] {1}[Hardware Error]: class_code: 020000
> [ 292.644971] {1}[Hardware Error]: aer_uncor_status: 0x00100000,
> aer_uncor_mask: 0x00010000
> [ 292.644972] {1}[Hardware Error]: aer_uncor_severity: 0x000ef030
> [ 292.644973] {1}[Hardware Error]: TLP Header: 40000001 0000020f
> 90028090 00000000
aer_uncor_status 0x00100000 looks like bit 20, Unsupported Request.
If I decoded it correctly, the TLP log says:
  40000001: 0100 ... 0001
    Fmt            010      3 DW header with data (PCIe r6.0, sec 2.2.1.1)
    Type           0 0000   Memory Write
    Length         1        1 DW
  0000020f (sec 2.2.7.1)
    Requester ID   0000
    Tag            2
    First DW BE    f        32-bit write
  90028090
    Address        90028090
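
The decode above can be cross-checked with a few shifts and masks. This is only a minimal sketch in Python, assuming the standard 3 DW memory request header layout; the function and field names are made up for illustration and are not kernel or pciutils APIs.

```python
def decode_mem_tlp_header(dw0, dw1, dw2):
    """Split a 3 DW memory request TLP header into its fields.

    Hypothetical helper for illustration; the bit positions follow the
    3 DW memory request header layout in the PCIe base specification.
    """
    return {
        "fmt": (dw0 >> 29) & 0x7,          # 0b010 = 3 DW header with data
        "type": (dw0 >> 24) & 0x1F,        # 0b00000 = memory request
        "length_dw": dw0 & 0x3FF,          # payload length in DWs
        "requester_id": (dw1 >> 16) & 0xFFFF,
        "tag": (dw1 >> 8) & 0xFF,
        "last_dw_be": (dw1 >> 4) & 0xF,
        "first_dw_be": dw1 & 0xF,          # 0xf = all four bytes enabled
        "address": dw2 & 0xFFFFFFFC,       # low two bits are not address bits
    }


def set_bits(value):
    """Return the positions of the bits set in a 32-bit register value."""
    return [bit for bit in range(32) if value & (1 << bit)]


# Values copied from the "TLP Header" and "aer_uncor_status" lines of the log.
hdr = decode_mem_tlp_header(0x40000001, 0x0000020F, 0x90028090)
uncor_status_bits = set_bits(0x00100000)

for name, value in hdr.items():
    print(f"{name:>13}: {value:#x}")
print("aer_uncor_status bits set:", uncor_status_bits)
```

Fmt 010 with Type 00000 is a 32-bit-address Memory Write, and the only status bit set is bit 20, which matches the reading above.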
I don't see 0x90028090 as a BAR value in the lspci output below,
although we don't have any information about possible address
translation (this would be in the dmesg log or "lspci -b" output).
But it *looks* like an MMIO write that got routed to 01:00.1 (the
bridge window configuration that would be in the dmesg log would show
this), and 01:00.1 said "I don't know about this address" (it doesn't
match any of my BARs) and logged a UR error.
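
The "not in any BAR" observation can be checked mechanically. The sketch below is an illustration only: the region list is copied by hand from the "lspci -vvv -s 01:00.1" output quoted in this thread, `find_region` is a hypothetical helper, and it assumes that CPU and bus addresses are identical (no address translation), which the thread has not confirmed.

```python
# (name, base, size) taken from the lspci output for 01:00.1 in this thread.
REGIONS = [
    ("Region 0", 0x92900000, 64 * 1024),
    ("Region 2", 0x92910000, 64 * 1024),
    ("Region 4", 0x92920000, 64 * 1024),
    ("Expansion ROM (disabled)", 0x90040000, 256 * 1024),
]


def find_region(addr, regions):
    """Return the name of the region containing addr, or None if none does."""
    for name, base, size in regions:
        if base <= addr < base + size:
            return name
    return None


# Address from the logged TLP header (third DW).
match = find_region(0x90028090, REGIONS)
print("0x90028090 ->", match)
```

With these values the lookup finds no matching region, including the disabled expansion ROM window, which is consistent with the device answering the write with an Unsupported Request.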
> [ 292.644976] Kernel panic - not syncing: Fatal hardware error!
> [ 292.644978] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.8.0+ #1
> [ 292.644981] Hardware name: Dell Inc. PowerEdge R640/0X45NX, BIOS
> 2.19.1 06/04/2023
> [ 292.644982] Call Trace:
> [ 292.644984] <NMI>
> [ 292.644985] panic+0x32b/0x350
> [ 292.644995] __ghes_panic+0x69/0x70
> [ 292.645000] ghes_in_nmi_queue_one_entry.constprop.0+0x1d9/0x2b0
> [ 292.645005] ghes_notify_nmi+0x59/0xd0
> [ 292.645007] nmi_handle+0x5b/0x150
> [ 292.645014] default_do_nmi+0x40/0x100
> [ 292.645017] exc_nmi+0x100/0x180
> [ 292.645019] end_repeat_nmi+0xf/0x53
> [ 292.645023] RIP: 0010:intel_idle+0x59/0xa0
> [ 292.645028] Code: d2 48 89 d1 65 48 8b 05 55 21 73 70 0f 01 c8 48
> 8b 00 a8 08 75 14 66 90 0f 00 2d 2e 00 43 00 b9 01 00 00 00 48 89 f0
> 0f 01 c9 <65> 48 8b 05 2f 21 73 70 f0 80 60 02 df f0 83 44 24 fc 00 48
> 8b 00
> [ 292.645030] RSP: 0018:ffffffff90403e48 EFLAGS: 00000046
> [ 292.645032] RAX: 0000000000000001 RBX: 0000000000000002 RCX: 0000000000000001
> [ 292.645034] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff93d22fa3ffa0
> [ 292.645035] RBP: ffff93d22fa3ffa0 R08: 0000000000000002 R09: 00000000fffffffd
> [ 292.645036] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff908bbf60
> [ 292.645037] R13: ffffffff908bc048 R14: 0000000000000002 R15: 0000000000000000
> [ 292.645040] ? intel_idle+0x59/0xa0
> [ 292.645043] ? intel_idle+0x59/0xa0
> [ 292.645046] </NMI>
> [ 292.645046] <TASK>
> [ 292.645047] cpuidle_enter_state+0x7d/0x410
> [ 292.645050] cpuidle_enter+0x29/0x40
> [ 292.645054] cpuidle_idle_call+0xf8/0x160
> [ 292.645060] do_idle+0x7a/0xe0
> [ 292.645062] cpu_startup_entry+0x25/0x30
> [ 292.645065] rest_init+0xcc/0xd0
> [ 292.645068] start_kernel+0x325/0x400
> [ 292.645072] x86_64_start_reservations+0x14/0x30
> [ 292.645076] x86_64_start_kernel+0xed/0xf0
> [ 292.645079] common_startup_64+0x13e/0x141
> [ 292.645084] </TASK>
> [ 292.645101] Kernel Offset: 0xdc00000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>
>
> # lspci -nn -s 01:00.1
> 01:00.1 Ethernet controller [0200]: Broadcom Inc. and subsidiaries
> NetXtreme BCM5720 Gigabit Ethernet PCIe [14e4:165f]
>
> # lspci -vvv -s 01:00.1
> 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme
> BCM5720 Gigabit Ethernet PCIe
> DeviceName: NIC4
> Subsystem: Broadcom Inc. and subsidiaries Device 4160
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0
> Interrupt: pin B routed to IRQ 17
> NUMA node: 0
> Region 0: Memory at 92900000 (64-bit, prefetchable) [size=64K]
> Region 2: Memory at 92910000 (64-bit, prefetchable) [size=64K]
> Region 4: Memory at 92920000 (64-bit, prefetchable) [size=64K]
> Expansion ROM at 90040000 [disabled] [size=256K]
> Capabilities: [48] Power Management version 3
> Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
> Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
> Capabilities: [50] Vital Product Data
> Product Name: Broadcom NetXtreme Gigabit Ethernet
> Read-only fields:
> [PN] Part number: BCM95720
> [MN] Manufacture ID: 1028
> [V0] Vendor specific: FFV22.61.8
> [V1] Vendor specific: DSV1028VPDR.VER1.0
> [V2] Vendor specific: NPY2
> [V3] Vendor specific: PMT1
> [V4] Vendor specific: NMVBroadcom Corp
> [V5] Vendor specific: DTINIC
> [V6] Vendor specific: DCM3001008d454101008d45
> [RV] Reserved: checksum good, 233 byte(s) reserved
> End
> Capabilities: [58] MSI: Enable- Count=1/8 Maskable- 64bit+
> Address: 0000000000000000 Data: 0000
> Capabilities: [a0] MSI-X: Enable+ Count=17 Masked-
> Vector table: BAR=4 offset=00000000
> PBA: BAR=4 offset=00001000
> Capabilities: [ac] Express (v2) Endpoint, MSI 00
> DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
> <4us, L1 <64us
> ExtTag- AttnBtn- AttnInd- PwrInd- RBE+
> FLReset+ SlotPowerLimit 25.000W
> DevCtl: CorrErr- NonFatalErr+ FatalErr+ UnsupReq+
> RlxdOrd- ExtTag- PhantFunc- AuxPwr+ NoSnoop- FLReset-
> MaxPayload 128 bytes, MaxReadReq 512 bytes
> DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq-
> AuxPwr+ TransPend-
> LnkCap: Port #0, Speed 5GT/s, Width x2, ASPM not supported
> ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
> LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
> ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
> LnkSta: Speed 5GT/s (ok), Width x2 (ok)
> TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> DevCap2: Completion Timeout: Range ABCD, TimeoutDis+
> NROPrPrP- LTR-
> 10BitTagComp- 10BitTagReq- OBFF Not
> Supported, ExtFmt- EETLPPrefix-
> EmergencyPowerReduction Not Supported,
> EmergencyPowerReductionInit-
> FRS- TPHComp- ExtTPHComp-
> AtomicOpsCap: 32bit- 64bit- 128bitCAS-
> DevCtl2: Completion Timeout: 65ms to 210ms,
> TimeoutDis- LTR- OBFF Disabled,
> AtomicOpsCtl: ReqEn-
> LnkSta2: Current De-emphasis Level: -6dB,
> EqualizationComplete- EqualizationPhase1-
> EqualizationPhase2- EqualizationPhase3-
> LinkEqualizationRequest-
> Retimer- 2Retimers- CrosslinkRes: unsupported
> Capabilities: [100 v1] Advanced Error Reporting
> UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
> UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt+
> UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
> CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> AdvNonFatalErr+
> CEMsk: RxErr- BadTLP+ BadDLLP+ Rollover+ Timeout+
> AdvNonFatalErr+
> AERCap: First Error Pointer: 00, ECRCGenCap+
> ECRCGenEn- ECRCChkCap+ ECRCChkEn-
> MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
> HeaderLog: 40000001 0000020f 90028090 00000000
> Capabilities: [13c v1] Device Serial Number 00-00-e4-3d-1a-3c-8b-bb
> Capabilities: [150 v1] Power Budgeting <?>
> Capabilities: [160 v1] Virtual Channel
> Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
> Arb: Fixed- WRR32- WRR64- WRR128-
> Ctrl: ArbSelect=Fixed
> Status: InProgress-
> VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
> Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
> Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
> Status: NegoPending- InProgress-
> Kernel driver in use: tg3
> Kernel modules: tg3
>
> Thanks,
>
* Re: [bug report] WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100
2024-03-19 16:23 ` Bjorn Helgaas
@ 2024-03-20 2:16 ` Changhui Zhong
2024-03-20 2:46 ` Bjorn Helgaas
0 siblings, 1 reply; 7+ messages in thread
From: Changhui Zhong @ 2024-03-20 2:16 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: linux-pci
Hi Bjorn,
On Wed, Mar 20, 2024 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Tue, Mar 19, 2024 at 03:34:56PM +0800, Changhui Zhong wrote:
> > Hello,
> >
> > repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > branch: master
> > commit HEAD:b3603fcb79b1036acae10602bffc4855a4b9af80
>
> Where's the rest of this? I don't see "WARNING: CPU: 0 PID: 226 at
> drivers/pci/pci.c:2236" in the snippet below. Please include or post
> the complete dmesg log.
>
> Is this reproducible? If so, how? And is it a regression?
>
It is reproducible; I can trigger it every time on my server, but I'm not
sure whether it is a regression.
Here is the dmesg log from my other server:
```
System Reboot
.
[ 248.433904] watchdog: watchdog0: watchdog did not stop!
[ 258.459553] systemd-shutdown[1]: Waiting for process: 4506 (sleep), 4491 (rhts-reboot)
[ 338.521745] watchdog: watchdog0: watchdog did not stop!
[ 338.556096] dracut Warning: Killing all remaining processes
dracut Warning: Killing all remaining processes
[ 338.589595] dracut Warning: Unmounted /oldroot.
dracut Warning: Unmounted /oldroot.
Rebooting.
[ 339.651690] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
[ 339.659948] {1}[Hardware Error]: event severity: recoverable
[ 339.665606] {1}[Hardware Error]: Error 0, type: fatal
[ 339.670743] {1}[Hardware Error]: section_type: PCIe error
[ 339.676310] {1}[Hardware Error]: port_type: 0, PCIe end point
[ 339.682228] {1}[Hardware Error]: version: 3.0
[ 339.686761] {1}[Hardware Error]: command: 0x0002, status: 0x0010
[ 339.692939] {1}[Hardware Error]: device_id: 0000:04:00.0
[ 339.698427] {1}[Hardware Error]: slot: 0
[ 339.702525] {1}[Hardware Error]: secondary_bus: 0x00
[ 339.707664] {1}[Hardware Error]: vendor_id: 0x14e4, device_id: 0x165f
[ 339.714278] {1}[Hardware Error]: class_code: 020000
[ 339.719331] {1}[Hardware Error]: aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00010000
[ 339.727678] {1}[Hardware Error]: aer_uncor_severity: 0x000ef030
[ 339.733769] {1}[Hardware Error]: TLP Header: 40000001 0000020f 90028090 00000000
[ 339.741353] tg3 0000:04:00.0: AER: aer_status: 0x00100000, aer_mask: 0x00010000
[ 339.748662] tg3 0000:04:00.0: [20] UnsupReq (First)
[ 339.755014] tg3 0000:04:00.0: AER: aer_layer=Transaction Layer, aer_agent=Requester ID
[ 339.762924] tg3 0000:04:00.0: AER: aer_uncor_severity: 0x000ef030
[ 339.769018] tg3 0000:04:00.0: AER: TLP Header: 40000001 0000020f 90028090 00000000
[ 339.776761] ------------[ cut here ]------------
[ 339.781378] tg3 0000:04:00.0: disabling already-disabled device
[ 339.781386] WARNING: CPU: 0 PID: 358 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100
[ 339.795737] Modules linked in: raid1 rpcsec_gss_krb5 auth_rpcgss
nfsv4 dns_resolver nfs lockd grace netfs rfkill sunrpc ipmi_ssif
intel_rapl_msr intel_rapl_common intel_uncore_frequency
intel_uncore_frequency_common i10nm_edac nfit libnvdimm
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat
mgag200 rapl dax_hmem iTCO_wdt i2c_algo_bit cxl_acpi
iTCO_vendor_support drm_shmem_helper intel_cstate acpi_ipmi ipmi_si
mei_me cxl_core i2c_i801 dell_smbios isst_if_mmio isst_if_mbox_pci
drm_kms_helper ipmi_devintf intel_uncore dcdbas mei einj
intel_pch_thermal intel_vsec isst_if_common wmi_bmof
dell_wmi_descriptor pcspkr i2c_smbus ipmi_msghandler acpi_power_meter
drm fuse xfs libcrc32c sd_mod t10_pi sg crct10dif_pclmul ahci
crc32_pclmul libahci crc32c_intel libata tg3 ghash_clmulni_intel wmi
dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug]
[ 339.872243] CPU: 0 PID: 358 Comm: kworker/0:3 Not tainted 6.8.0+ #1
[ 339.878505] Hardware name: Dell Inc. PowerEdge R650xs/0PPTY2, BIOS 1.4.4 10/07/2021
[ 339.886157] Workqueue: events aer_recover_work_func
[ 339.891037] RIP: 0010:pci_disable_device+0xf4/0x100
[ 339.895917] Code: 4d 85 e4 75 07 4c 8b a3 c8 00 00 00 48 8d bb c8 00 00 00 e8 9e c7 17 00 4c 89 e2 48 c7 c7 50 92 21 91 48 89 c6 e8 ac 94 a1 ff <0f> 0b e9 3b ff ff ff e8 80 36 60 00 90 90 90 90 90 90 90 90 90 90
[ 339.914664] RSP: 0018:ff56179a82883d10 EFLAGS: 00010286
[ 339.919888] RAX: 0000000000000000 RBX: ff2f7c9b44e58000 RCX: ffffffff9171e4a8
[ 339.927022] RDX: 0000000000000000 RSI: 00000000ffff7fff RDI: 0000000000000001
[ 339.934154] RBP: ff2f7c9b65860000 R08: 0000000000000000 R09: ff56179a82883bc0
[ 339.941289] R10: ff56179a82883bb8 R11: ffffffff917de4e8 R12: ff2f7c9b445fa4e0
[ 339.948421] R13: 0000000000000002 R14: ff2f7c9b44e58148 R15: ff2f7c9b44e5d000
[ 339.955552] FS: 0000000000000000(0000) GS:ff2f7c9eaf600000(0000) knlGS:0000000000000000
[ 339.963640] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 339.969385] CR2: 00007f7577713838 CR3: 0000000300a20003 CR4: 0000000000771ef0
[ 339.976519] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 339.983651] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 339.990782] PKRU: 55555554
[ 339.993494] Call Trace:
[ 339.995949] <TASK>
[ 339.998054] ? __warn+0x7f/0x130
[ 340.001286] ? pci_disable_device+0xf4/0x100
[ 340.005560] ? report_bug+0x18a/0x1a0
[ 340.009227] ? handle_bug+0x3c/0x70
[ 340.012719] ? exc_invalid_op+0x14/0x70
[ 340.016559] ? asm_exc_invalid_op+0x16/0x20
[ 340.020745] ? pci_disable_device+0xf4/0x100
[ 340.025017] ? __pfx_report_frozen_detected+0x10/0x10
[ 340.030069] tg3_io_error_detected+0x1f5/0x2b0 [tg3]
[ 340.035044] ? __pfx_report_frozen_detected+0x10/0x10
[ 340.040098] report_error_detected+0xc7/0x1c0
[ 340.044456] ? __pfx_report_frozen_detected+0x10/0x10
[ 340.049509] __pci_walk_bus+0x6b/0xb0
[ 340.053176] ? __pfx_aer_root_reset+0x10/0x10
[ 340.057535] pcie_do_recovery+0x2b4/0x3c0
[ 340.061548] aer_recover_work_func+0x106/0x110
[ 340.065992] process_one_work+0x193/0x3d0
[ 340.070005] worker_thread+0x2fc/0x410
[ 340.073758] ? __pfx_worker_thread+0x10/0x10
[ 340.078032] kthread+0xdc/0x110
[ 340.081179] ? __pfx_kthread+0x10/0x10
[ 340.084930] ret_from_fork+0x2d/0x50
[ 340.088510] ? __pfx_kthread+0x10/0x10
[ 340.092263] ret_from_fork_asm+0x1a/0x30
[ 340.096190] </TASK>
[ 340.098380] ---[ end trace 0000000000000000 ]---
[ 340.103083] reboot: Restarting system
[-- MARK -- Tue Mar 19 14:05:00 2024]
```
* Re: [bug report] WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100
2024-03-20 2:16 ` Changhui Zhong
@ 2024-03-20 2:46 ` Bjorn Helgaas
2024-03-20 3:13 ` Changhui Zhong
2024-03-21 10:11 ` Changhui Zhong
0 siblings, 2 replies; 7+ messages in thread
From: Bjorn Helgaas @ 2024-03-20 2:46 UTC (permalink / raw)
To: Changhui Zhong; +Cc: linux-pci
On Wed, Mar 20, 2024 at 10:16:06AM +0800, Changhui Zhong wrote:
> On Wed, Mar 20, 2024 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Mar 19, 2024 at 03:34:56PM +0800, Changhui Zhong wrote:
> > > repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > > branch: master
> > > commit HEAD:b3603fcb79b1036acae10602bffc4855a4b9af80
> >
> > Where's the rest of this? I don't see "WARNING: CPU: 0 PID: 226 at
> > drivers/pci/pci.c:2236" in the snippet below. Please include or post
> > the complete dmesg log.
> >
> > Is this reproducible? If so, how? And is it a regression?
>
> It is reproducible; I can trigger it every time on my server, but I'm not
> sure whether it is a regression.
Great, it's always easier if it's easily reproducible. Can you please
try an older kernel, e.g., v6.8?
> dmesg log on my other server:
Please include or post the *complete* dmesg log all the way from the
very beginning of boot, not just the snippet you included below. The
complete log contains useful information that we need to investigate
this problem.
> ```
> System Reboot
> .
> [ 248.433904] watchdog: watchdog0: watchdog did not stop!
> [ 258.459553] systemd-shutdown[1]: Waiting for process: 4506 (sleep),
> 4491 (rhts-reboot)
> [ 338.521745] watchdog: watchdog0: watchdog did not stop!
> [ 338.556096] dracut Warning: Killing all remaining processes
> dracut Warning: Killing all remaining processes
> [ 338.589595] dracut Warning: Unmounted /oldroot.
> dracut Warning: Unmounted /oldroot.
> Rebooting.
> [ 339.651690] {1}[Hardware Error]: Hardware error from APEI Generic
> Hardware Error Source: 5
> [ 339.659948] {1}[Hardware Error]: event severity: recoverable
> [ 339.665606] {1}[Hardware Error]: Error 0, type: fatal
> [ 339.670743] {1}[Hardware Error]: section_type: PCIe error
> [ 339.676310] {1}[Hardware Error]: port_type: 0, PCIe end point
> [ 339.682228] {1}[Hardware Error]: version: 3.0
> [ 339.686761] {1}[Hardware Error]: command: 0x0002, status: 0x0010
> [ 339.692939] {1}[Hardware Error]: device_id: 0000:04:00.0
> [ 339.698427] {1}[Hardware Error]: slot: 0
> [ 339.702525] {1}[Hardware Error]: secondary_bus: 0x00
> [ 339.707664] {1}[Hardware Error]: vendor_id: 0x14e4, device_id: 0x165f
> [ 339.714278] {1}[Hardware Error]: class_code: 020000
> [ 339.719331] {1}[Hardware Error]: aer_uncor_status: 0x00100000,
> aer_uncor_mask: 0x00010000
> [ 339.727678] {1}[Hardware Error]: aer_uncor_severity: 0x000ef030
> [ 339.733769] {1}[Hardware Error]: TLP Header: 40000001 0000020f
> 90028090 00000000
> [ 339.741353] tg3 0000:04:00.0: AER: aer_status: 0x00100000,
> aer_mask: 0x00010000
> [ 339.748662] tg3 0000:04:00.0: [20] UnsupReq (First)
> [ 339.755014] tg3 0000:04:00.0: AER: aer_layer=Transaction Layer,
> aer_agent=Requester ID
> [ 339.762924] tg3 0000:04:00.0: AER: aer_uncor_severity: 0x000ef030
> [ 339.769018] tg3 0000:04:00.0: AER: TLP Header: 40000001 0000020f
> 90028090 00000000
> [ 339.776761] ------------[ cut here ]------------
> [ 339.781378] tg3 0000:04:00.0: disabling already-disabled device
> [ 339.781386] WARNING: CPU: 0 PID: 358 at drivers/pci/pci.c:2236
> pci_disable_device+0xf4/0x100
> [ 339.795737] Modules linked in: raid1 rpcsec_gss_krb5 auth_rpcgss
> nfsv4 dns_resolver nfs lockd grace netfs rfkill sunrpc ipmi_ssif
> intel_rapl_msr intel_rapl_common intel_uncore_frequency
> intel_uncore_frequency_common i10nm_edac nfit libnvdimm
> x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm vfat fat
> mgag200 rapl dax_hmem iTCO_wdt i2c_algo_bit cxl_acpi
> iTCO_vendor_support drm_shmem_helper intel_cstate acpi_ipmi ipmi_si
> mei_me cxl_core i2c_i801 dell_smbios isst_if_mmio isst_if_mbox_pci
> drm_kms_helper ipmi_devintf intel_uncore dcdbas mei einj
> intel_pch_thermal intel_vsec isst_if_common wmi_bmof
> dell_wmi_descriptor pcspkr i2c_smbus ipmi_msghandler acpi_power_meter
> drm fuse xfs libcrc32c sd_mod t10_pi sg crct10dif_pclmul ahci
> crc32_pclmul libahci crc32c_intel libata tg3 ghash_clmulni_intel wmi
> dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug]
> [ 339.872243] CPU: 0 PID: 358 Comm: kworker/0:3 Not tainted 6.8.0+ #1
> [ 339.878505] Hardware name: Dell Inc. PowerEdge R650xs/0PPTY2, BIOS
> 1.4.4 10/07/2021
> [ 339.886157] Workqueue: events aer_recover_work_func
> [ 339.891037] RIP: 0010:pci_disable_device+0xf4/0x100
> [ 339.895917] Code: 4d 85 e4 75 07 4c 8b a3 c8 00 00 00 48 8d bb c8
> 00 00 00 e8 9e c7 17 00 4c 89 e2 48 c7 c7 50 92 21 91 48 89 c6 e8 ac
> 94 a1 ff <0f> 0b e9 3b ff ff ff e8 80 36 60 00 90 90 90 90 90 90 90 90
> 90 90
> [ 339.914664] RSP: 0018:ff56179a82883d10 EFLAGS: 00010286
> [ 339.919888] RAX: 0000000000000000 RBX: ff2f7c9b44e58000 RCX: ffffffff9171e4a8
> [ 339.927022] RDX: 0000000000000000 RSI: 00000000ffff7fff RDI: 0000000000000001
> [ 339.934154] RBP: ff2f7c9b65860000 R08: 0000000000000000 R09: ff56179a82883bc0
> [ 339.941289] R10: ff56179a82883bb8 R11: ffffffff917de4e8 R12: ff2f7c9b445fa4e0
> [ 339.948421] R13: 0000000000000002 R14: ff2f7c9b44e58148 R15: ff2f7c9b44e5d000
> [ 339.955552] FS: 0000000000000000(0000) GS:ff2f7c9eaf600000(0000)
> knlGS:0000000000000000
> [ 339.963640] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 339.969385] CR2: 00007f7577713838 CR3: 0000000300a20003 CR4: 0000000000771ef0
> [ 339.976519] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 339.983651] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 339.990782] PKRU: 55555554
> [ 339.993494] Call Trace:
> [ 339.995949] <TASK>
> [ 339.998054] ? __warn+0x7f/0x130
> [ 340.001286] ? pci_disable_device+0xf4/0x100
> [ 340.005560] ? report_bug+0x18a/0x1a0
> [ 340.009227] ? handle_bug+0x3c/0x70
> [ 340.012719] ? exc_invalid_op+0x14/0x70
> [ 340.016559] ? asm_exc_invalid_op+0x16/0x20
> [ 340.020745] ? pci_disable_device+0xf4/0x100
> [ 340.025017] ? __pfx_report_frozen_detected+0x10/0x10
> [ 340.030069] tg3_io_error_detected+0x1f5/0x2b0 [tg3]
> [ 340.035044] ? __pfx_report_frozen_detected+0x10/0x10
> [ 340.040098] report_error_detected+0xc7/0x1c0
> [ 340.044456] ? __pfx_report_frozen_detected+0x10/0x10
> [ 340.049509] __pci_walk_bus+0x6b/0xb0
> [ 340.053176] ? __pfx_aer_root_reset+0x10/0x10
> [ 340.057535] pcie_do_recovery+0x2b4/0x3c0
> [ 340.061548] aer_recover_work_func+0x106/0x110
> [ 340.065992] process_one_work+0x193/0x3d0
> [ 340.070005] worker_thread+0x2fc/0x410
> [ 340.073758] ? __pfx_worker_thread+0x10/0x10
> [ 340.078032] kthread+0xdc/0x110
> [ 340.081179] ? __pfx_kthread+0x10/0x10
> [ 340.084930] ret_from_fork+0x2d/0x50
> [ 340.088510] ? __pfx_kthread+0x10/0x10
> [ 340.092263] ret_from_fork_asm+0x1a/0x30
> [ 340.096190] </TASK>
> [ 340.098380] ---[ end trace 0000000000000000 ]---
> [ 340.103083] reboot: Restarting system
> [-- MARK -- Tue Mar 19 14:05:00 2024]
> ```
>
* Re: [bug report] WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100
2024-03-20 2:46 ` Bjorn Helgaas
@ 2024-03-20 3:13 ` Changhui Zhong
2024-03-21 10:11 ` Changhui Zhong
1 sibling, 0 replies; 7+ messages in thread
From: Changhui Zhong @ 2024-03-20 3:13 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: linux-pci
On Wed, Mar 20, 2024 at 10:46 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Wed, Mar 20, 2024 at 10:16:06AM +0800, Changhui Zhong wrote:
> > On Wed, Mar 20, 2024 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Mar 19, 2024 at 03:34:56PM +0800, Changhui Zhong wrote:
> > > > repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > > > branch: master
> > > > commit HEAD:b3603fcb79b1036acae10602bffc4855a4b9af80
> > >
> > > Where's the rest of this? I don't see "WARNING: CPU: 0 PID: 226 at
> > > drivers/pci/pci.c:2236" in the snippet below. Please include or post
> > > the complete dmesg log.
> > >
> > > Is this reproducible? If so, how? And is it a regression?
> >
> > > It is reproducible; I can trigger it every time on my server, but I'm not
> > > sure whether it is a regression.
>
> Great, it's always easier if it's easily reproducible. Can you please
> try an older kernel, e.g., v6.8?
>
Yeah, I can try it; I will report the test results later.
> > dmesg log on my other server:
>
> Please include or post the *complete* dmesg log all the way from the
> very beginning of boot, not just the snippet you included below. The
> complete log contains useful information that we need to investigate
> this problem.
>
The complete dmesg log is too large to include inline. I captured a log from boot to reboot;
please check https://pastebin.com/0iuYhFCR
* Re: [bug report] WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100
2024-03-20 2:46 ` Bjorn Helgaas
2024-03-20 3:13 ` Changhui Zhong
@ 2024-03-21 10:11 ` Changhui Zhong
2024-03-21 12:44 ` Bjorn Helgaas
1 sibling, 1 reply; 7+ messages in thread
From: Changhui Zhong @ 2024-03-21 10:11 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: linux-pci
On Wed, Mar 20, 2024 at 10:46 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Wed, Mar 20, 2024 at 10:16:06AM +0800, Changhui Zhong wrote:
> > On Wed, Mar 20, 2024 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Mar 19, 2024 at 03:34:56PM +0800, Changhui Zhong wrote:
> > > > repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > > > branch: master
> > > > commit HEAD:b3603fcb79b1036acae10602bffc4855a4b9af80
> > >
> > > Where's the rest of this? I don't see "WARNING: CPU: 0 PID: 226 at
> > > drivers/pci/pci.c:2236" in the snippet below. Please include or post
> > > the complete dmesg log.
> > >
> > > Is this reproducible? If so, how? And is it a regression?
> >
> > It is reproducible; I can trigger it every time on my server, but I'm not
> > sure if it is a regression.
>
> Great, it's always easier if it's easily reproducible. Can you please
> try an older kernel, e.g., v6.8?
>
I tested v6.8 and v6.7, and both triggered this issue.
v6.6 did not trigger it.
* Re: [bug report] WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100
2024-03-21 10:11 ` Changhui Zhong
@ 2024-03-21 12:44 ` Bjorn Helgaas
0 siblings, 0 replies; 7+ messages in thread
From: Bjorn Helgaas @ 2024-03-21 12:44 UTC (permalink / raw)
To: Changhui Zhong; +Cc: linux-pci
On Thu, Mar 21, 2024 at 06:11:46PM +0800, Changhui Zhong wrote:
> On Wed, Mar 20, 2024 at 10:46 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Wed, Mar 20, 2024 at 10:16:06AM +0800, Changhui Zhong wrote:
> > > On Wed, Mar 20, 2024 at 12:30 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Tue, Mar 19, 2024 at 03:34:56PM +0800, Changhui Zhong wrote:
> > > > > repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > > > > branch: master
> > > > > commit HEAD:b3603fcb79b1036acae10602bffc4855a4b9af80
> > > >
> > > > Where's the rest of this? I don't see "WARNING: CPU: 0 PID: 226 at
> > > > drivers/pci/pci.c:2236" in the snippet below. Please include or post
> > > > the complete dmesg log.
> > > >
> > > > Is this reproducible? If so, how? And is it a regression?
> > >
> > > It is reproducible; I can trigger it every time on my server, but I'm not
> > > sure if it is a regression.
> >
> > Great, it's always easier if it's easily reproducible. Can you please
> > try an older kernel, e.g., v6.8?
>
> I tested v6.8 and v6.7, and both triggered this issue.
> v6.6 did not trigger it.
Bisecting between v6.6 and v6.7 might be the quickest way to find it,
but it's a fair bit of work on your end.
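
To illustrate why bisecting tends to be quick: it is a binary search between a known-good and a known-bad point, so the number of kernels to build and test grows only with the logarithm of the number of commits. The following is a minimal Python sketch of that idea, not the actual `git bisect` implementation. The commit names, the history size of 16,000, and the `is_bad` test are all hypothetical stand-ins for real commits and a real reproduction step.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str],
              is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad commit, number of tests run).

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid   # the culprit is at mid or earlier
        else:
            lo = mid   # the culprit is after mid
    return commits[hi], tests


# Hypothetical linear history: c0 is good, c16000 is bad,
# and the (made-up) culprit is c12345.
commits = [f"c{i}" for i in range(16001)]
culprit = 12345
found, tests = first_bad(commits, lambda c: int(c[1:]) >= culprit)
print(found, tests)
```

In practice the search is driven by `git bisect start`, `git bisect good v6.6`, and `git bisect bad v6.7`, followed by building, booting, and marking each suggested commit as good or bad.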
How do you trigger the problem?
It looks like you're capturing console output, and most of the kernel
messages don't appear on the console. The kernel messages (dmesg) I'm
interested in should be captured somewhere like /var/log/dmesg or
similar (I don't know the exact filename for Red Hat).
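
One quick way to check whether a saved log actually contains the warning from the subject line is to search it for the WARNING header. The sketch below is only an illustration: the regular expression is an assumption based on the header format shown in the subject of this thread, and the sample text is that subject line rather than a real log file.

```python
import re

# Assumed header format, taken from the subject of this thread:
# "WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100"
WARN_RE = re.compile(r"WARNING: CPU: (\d+) PID: (\d+) at (\S+):(\d+) (\S+)")


def find_warnings(log_text):
    """Return a (file, line, symbol) tuple for each WARNING header in log_text."""
    return [(m.group(3), int(m.group(4)), m.group(5))
            for m in WARN_RE.finditer(log_text)]


# Sample input: the subject line of this thread, not a captured log.
sample = ("WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 "
          "pci_disable_device+0xf4/0x100\n")
warnings_found = find_warnings(sample)
print(warnings_found)
```

An empty result for a captured log would suggest that the capture only holds console output and that the kernel messages need to be collected another way.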
Bjorn
end of thread, other threads:[~2024-03-21 12:44 UTC | newest]
Thread overview: 7+ messages
2024-03-19 7:34 [bug report] WARNING: CPU: 0 PID: 226 at drivers/pci/pci.c:2236 pci_disable_device+0xf4/0x100 Changhui Zhong
2024-03-19 16:23 ` Bjorn Helgaas
2024-03-20 2:16 ` Changhui Zhong
2024-03-20 2:46 ` Bjorn Helgaas
2024-03-20 3:13 ` Changhui Zhong
2024-03-21 10:11 ` Changhui Zhong
2024-03-21 12:44 ` Bjorn Helgaas