From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <7ef2300b-adb2-40d8-95b0-995aaf8d7436@linux.ibm.com>
Date: Fri, 27 Sep 2024 11:40:05 +0530
MIME-Version: 1.0
Subject: Re: nvme: machine check when running nvme subsystem-reset /dev/nvme0 against direct attach via PCIE slot
To: Laurence Oberman, "busch, keith", linux-nvme@lists.infradead.org, Keith Busch
From: Nilay Shroff
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
List-Id: linux-nvme@lists.infradead.org

On 9/27/24 02:41, Laurence Oberman wrote:
> Hi Keith
> Hope all is well
>
> Quick question (expected or not)
>
> It was reported to Red Hat, seeing
> issues with using a
> "nvme subsystem-reset /dev/nvme0" command to test resets.
>
> On multiple servers I tested on two types of nvme attached devices
> These are not the rootfs devices
>
> 1. The front slot (hotplug) devices in a 2.5in format
> reset and after some time recover (what is expected)
>
> Example of one working
>
> Does not trap and land up as a machine-check
>
> [ 2215.440468] pcieport 0000:10:01.1: AER: Multiple Uncorrected (Non-Fatal) error received: 0000:12:13.0
> [ 2215.440532] pcieport 0000:12:13.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
> [ 2215.440536] pcieport 0000:12:13.0: device [10b5:8748] error status/mask=00100000/00000000
> [ 2215.440540] pcieport 0000:12:13.0: [20] UnsupReq (First)
> [ 2215.440544] pcieport 0000:12:13.0: AER: TLP Header: 40009001 1000000f e9211000 12000000
> [ 2215.441813] systemd-journald[2173]: Sent WATCHDOG=1 notification.
> [ 2216.937498] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 4
> [ 2216.937505] {1}[Hardware Error]: event severity: info
> [ 2216.937508] {1}[Hardware Error]: Error 0, type: fatal
> [ 2216.937511] {1}[Hardware Error]: fru_text: PcieError
> [ 2216.937514] {1}[Hardware Error]: section_type: PCIe error
> [ 2216.937515] {1}[Hardware Error]: port_type: 4, root port
> [ 2216.937517] {1}[Hardware Error]: version: 0.2
> [ 2216.937519] {1}[Hardware Error]: command: 0x0407, status: 0x0010
> [ 2216.937522] {1}[Hardware Error]: device_id: 0000:10:01.1
> [ 2216.937524] {1}[Hardware Error]: slot: 3
> [ 2216.937525] {1}[Hardware Error]: secondary_bus: 0x11
> [ 2216.937526] {1}[Hardware Error]: vendor_id: 0x1022, device_id: 0x1453
> [ 2216.937528] {1}[Hardware Error]: class_code: 060400
> [ 2216.937529] {1}[Hardware Error]: bridge: secondary_status: 0x2000, control: 0x0012
> [ 2216.937530] {1}[Hardware Error]: aer_uncor_status: 0x00000000, aer_uncor_mask: 0x04500000
> [ 2216.937532] {1}[Hardware Error]: aer_uncor_severity: 0x004e2030
> [ 2216.937532] {1}[Hardware Error]: TLP Header: 00000000 00000000 00000000 00000000
> [ 2216.937629] pcieport 0000:10:01.1: AER: aer_status: 0x00000000, aer_mask: 0x04500000
> [ 2216.937634] pcieport 0000:10:01.1: AER: aer_layer=Transaction Layer, aer_agent=Receiver ID
> [ 2216.937638] pcieport 0000:10:01.1: AER: aer_uncor_severity: 0x004e2030
> [ 2216.937645] nvme nvme4: frozen state error detected, reset controller
> [ 2217.071095] nvme nvme10: frozen state error detected, reset controller
> [ 2217.096928] nvme nvme0: frozen state error detected, reset controller
> [ 2217.118947] nvme nvme18: frozen state error detected, reset controller
> [ 2217.138945] nvme nvme6: frozen state error detected, reset controller
> [ 2217.164918] nvme nvme14: frozen state error detected, reset controller
> [ 2217.186902] nvme nvme20: frozen state error detected, reset controller
> [ 2279.420266] nvme 0000:1a:00.0: Unable to change power state from D3cold to D0, device inaccessible
> [ 2279.420329] nvme nvme22: Disabling device after reset failure: -19
> [ 2279.464727] pcieport 0000:12:13.0: AER: device recovery failed
> [ 2279.464823] pcieport 0000:12:13.0: pciehp: pcie_do_write_cmd: no response from device
>
> Port resets and recovers
>
> [ 2279.593196] pcieport 0000:10:01.1: AER: Root Port link has been reset (0)
> [ 2279.593699] nvme nvme4: restart after slot reset
> [ 2279.593949] nvme nvme10: restart after slot reset
> [ 2279.594222] nvme nvme0: restart after slot reset
> [ 2279.594453] nvme nvme18: restart after slot reset
> [ 2279.594728] nvme nvme6: restart after slot reset
> [ 2279.594984] nvme nvme14: restart after slot reset
> [ 2279.595226] nvme nvme20: restart after slot reset
> [ 2279.595435] pcieport 0000:12:13.0: pciehp: Slot(19): Card present
> [ 2279.595441] pcieport 0000:12:13.0: pciehp: Slot(19): Link Up
> [ 2279.609081] nvme nvme4: Shutdown timeout set to 8 seconds
> [ 2279.617532] nvme nvme0: Shutdown timeout set to 8 seconds
> [ 2279.617533] nvme nvme14: Shutdown timeout set to 8 seconds
> [ 2279.618028] nvme nvme6: Shutdown timeout set to 8 seconds
> [ 2279.618207] nvme nvme18: Shutdown timeout set to 8 seconds
> [ 2279.618290] nvme nvme10: Shutdown timeout set to 8 seconds
> [ 2279.618308] nvme nvme20: Shutdown timeout set to 8 seconds
> [ 2279.631961] nvme nvme4: 32/0/0 default/read/poll queues
> [ 2279.643293] nvme nvme14: 32/0/0 default/read/poll queues
> [ 2279.643372] nvme nvme0: 32/0/0 default/read/poll queues
> [ 2279.644881] nvme nvme6: 32/0/0 default/read/poll queues
> [ 2279.644966] nvme nvme10: 32/0/0 default/read/poll queues
> [ 2279.645030] nvme nvme18: 32/0/0 default/read/poll queues
> [ 2279.645132] nvme nvme20: 32/0/0 default/read/poll queues
> [ 2279.645202] pcieport 0000:10:01.1: AER: device recovery successful
>
> 2. Any kernel upstream latest 6.11, RHEL8 or RHEL9 causes
> a machine check and panics the box when its against a nvme in a
> PCIE slot
>
> [ 263.862919] mce: [Hardware Error]: CPU 12: Machine Check Exception: 5 Bank 6: ba00000000000e0b
> [ 263.862924] mce: [Hardware Error]: RIP !INEXACT! 10: {intel_idle+0x54/0x90}
> [ 263.862931] mce: [Hardware Error]: TSC 7a47d8d62ba6dd MISC 83100000
> [ 263.862933] mce: [Hardware Error]: PROCESSOR 0:606a6 TIME 1727384194 SOCKET 1 APIC 40 microcode d0003a5
> [ 263.862936] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
> [ 263.885254] mce: [Hardware Error]: Machine check: Processor context corrupt
> [ 263.885259] Kernel panic - not syncing: Fatal machine check
>
> Hardware event. This is not a software error.
> CPU 0 BANK 0 TSC 7a47d8d62ba6dd
> RIP !INEXACT! 10:ffffffff8571dce4
> TIME 1727384194 Thu Sep 26 16:56:34 2024
> MCG status:
> MCi status:
> Machine check not valid
> Corrected error
> MCA: No Error
> STATUS 0 MCGSTATUS 0
> CPUID Vendor Intel Family 6 Model 106 Step 6
> RIP: {intel_idle+0x54/0x90}
> SOCKET 1 APIC 40 microcode d0003a5
> Run the above through 'mcelog --ascii'
> Machine check: Processor context corrupt
>
> Regards
> Laurence
>

I think Keith's email address is not correct, so I'm adding his correct address here.

BTW, Keith recently helped fix an issue in kernel v6.11 with the nvme subsystem-reset command to ensure that we recover the NVMe disk on PPC. On the PPC architecture we use EEH to recover the disk after a subsystem-reset, but your machine is Intel, which uses AER for recovery. So I'm not sure whether that same commit, 210b1f6576e8 ("nvme-pci: do not directly handle subsys reset fallout"), which was merged in kernel v6.11, is causing a side effect on the Intel machine. Would you please revert the above commit and see if that fixes the observed symptom on your Intel machine?

Thanks,
--Nilay
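P.S. The two outcomes in Laurence's report (AER-driven recovery vs. a fatal machine check) can be told apart from the kernel log text alone, which is handy when scripting the revert-and-retest. A minimal sketch of such a check, matched against the exact messages quoted above (the function name and patterns are my own, not part of any tool):

```python
import re


def classify_reset_outcome(log_text: str) -> str:
    """Heuristically classify nvme subsystem-reset fallout from kernel log text.

    Returns "machine-check" if a fatal MCE is reported, "recovered" if AER
    reports successful device recovery, and "unknown" otherwise.
    """
    # A fatal MCE panics the box, so check for it first.
    if re.search(r"Machine Check Exception|Fatal machine check", log_text):
        return "machine-check"
    # The good case ends with AER declaring the device recovered.
    if "AER: device recovery successful" in log_text:
        return "recovered"
    return "unknown"


# Example against a line quoted in the report above:
print(classify_reset_outcome(
    "[ 2279.645202] pcieport 0000:10:01.1: AER: device recovery successful"))
# → recovered
```

One could feed it the output of `dmesg` after each `nvme subsystem-reset /dev/nvme0` run on the patched and unpatched kernels to compare outcomes.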