From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28AD8C678D4 for ; Mon, 16 Jan 2023 16:52:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234003AbjAPQwR (ORCPT ); Mon, 16 Jan 2023 11:52:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44416 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234005AbjAPQvy (ORCPT ); Mon, 16 Jan 2023 11:51:54 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 681B84B775 for ; Mon, 16 Jan 2023 08:37:16 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4B7DA61083 for ; Mon, 16 Jan 2023 16:37:15 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5DC2BC433F0; Mon, 16 Jan 2023 16:37:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1673887034; bh=829ReEx9kwuWay6Tp+HCEI3eVnwwgKf+v3BoEfkhc8Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PBSBxskWIP47Or8FK7eMnAgeKXxjHxO6vWxR14A+2WcAHtNWyBm88JSG0ly5S6YiF Mwunli+/IfErpTI5ueFvEy1vuqUx1E00Slu06LWdQ9kYTqXTMXqnsoFvRGPA1bQ7dU 0x0NbYSIhEh/ITaleXXod0XwQcaLPq2TTPqn9Alw= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Mahesh Salgaonkar , Michael Ellerman Subject: [PATCH 5.4 656/658] pseries/eeh: Fix the kdump kernel crash during eeh_pseries_init Date: Mon, 16 Jan 2023 16:52:24 +0100 Message-Id: <20230116154939.466910375@linuxfoundation.org> X-Mailer: git-send-email 2.39.0 In-Reply-To: <20230116154909.645460653@linuxfoundation.org> References: <20230116154909.645460653@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Mahesh Salgaonkar commit eb8257a12192f43ffd41bd90932c39dade958042 upstream. On pseries LPAR when an empty slot is assigned to partition OR in single LPAR mode, kdump kernel crashes during issuing PHB reset. In the kdump scenario, we traverse all PHBs and issue reset using the pe_config_addr of the first child device present under each PHB. However the code assumes that none of the PHB slots can be empty and uses list_first_entry() to get the first child device under the PHB. Since list_first_entry() expects the list to be non-empty, it returns an invalid pci_dn entry and ends up accessing NULL phb pointer under pci_dn->phb causing kdump kernel crash. This patch fixes the below kdump kernel crash by skipping empty slots: audit: initializing netlink subsys (disabled) thermal_sys: Registered thermal governor 'fair_share' thermal_sys: Registered thermal governor 'step_wise' cpuidle: using governor menu pstore: Registered nvram as persistent store backend Issue PHB reset ... audit: type=2000 audit(1631267818.000:1): state=initialized audit_enabled=0 res=1 BUG: Kernel NULL pointer dereference on read at 0x00000268 Faulting instruction address: 0xc000000008101fb0 Oops: Kernel access of bad area, sig: 7 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 7 PID: 1 Comm: swapper/7 Not tainted 5.14.0 #1 NIP: c000000008101fb0 LR: c000000009284ccc CTR: c000000008029d70 REGS: c00000001161b840 TRAP: 0300 Not tainted (5.14.0) MSR: 8000000002009033 CR: 28000224 XER: 20040002 CFAR: c000000008101f0c DAR: 0000000000000268 DSISR: 00080000 IRQMASK: 0 ... NIP pseries_eeh_get_pe_config_addr+0x100/0x1b0 LR __machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350 Call Trace: 0xc00000001161bb80 (unreliable) __machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350 do_one_initcall+0x60/0x2d0 kernel_init_freeable+0x350/0x3f8 kernel_init+0x3c/0x17c ret_from_kernel_thread+0x5c/0x64 Fixes: 5a090f7c363fd ("powerpc/pseries: PCIE PHB reset") Signed-off-by: Mahesh Salgaonkar [mpe: Tweak wording and trim oops] Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/163215558252.413351.8600189949820258982.stgit@jupiter Signed-off-by: Greg Kroah-Hartman --- arch/powerpc/platforms/pseries/eeh_pseries.c | 4 ++++ 1 file changed, 4 insertions(+) --- a/arch/powerpc/platforms/pseries/eeh_pseries.c +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c @@ -879,6 +879,10 @@ static int __init eeh_pseries_init(void) if (is_kdump_kernel() || reset_devices) { pr_info("Issue PHB reset ...\n"); list_for_each_entry(phb, &hose_list, list_node) { + // Skip if the slot is empty + if (list_empty(&PCI_DN(phb->dn)->child_list)) + continue; + pdn = list_first_entry(&PCI_DN(phb->dn)->child_list, struct pci_dn, list); addr = (pdn->busno << 16) | (pdn->devfn << 8); config_addr = pseries_eeh_get_config_addr(phb, addr);