From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C523CC28D13 for ; Thu, 25 Aug 2022 13:45:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=k8RBY4SLhyBCR1pWhB/7yt8gvtSGvO61HCld8dhBYE8=; b=le6yB5J9v0cFZU AMorz+FCraF4Md4XeOBqQv9SgTQV7wm6fqRXlPdSGRxXPzOkliKbR7cWIq2OT31VG9oKYLoiYpZUC Mat4eQxatgGCvmJHMIuCIbWXH/SQZr1xpSKbsKwDTgYda+Dsdplf35gbaOtKkFL6pIrhu1GGJz3UT o6peOKLVKEoBnUR2VF29m2aE2XhvcwBcUPqSZ4rl7rB8SyYOXjQ77OWsNqEeBSYqscIzzZEyt52lo 6IR2/foSevm6ZHNKDQygNcNxJWAu6CqEo/+Bm74bcqktkR3Qh4OgRqlRFFMkco/6plJDtt85GnXLt 2taV+9m/iWrQP4SKltgg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1oRD9i-00EyZD-SG; Thu, 25 Aug 2022 13:44:23 +0000 Received: from dfw.source.kernel.org ([139.178.84.217]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1oRD90-00Ey1o-AE for linux-arm-kernel@lists.infradead.org; Thu, 25 Aug 2022 13:43:40 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 9FAE661D20; Thu, 25 Aug 2022 13:43:36 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 236F8C433C1; Thu, 25 Aug 2022 13:43:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1661435016; bh=5JLuURgLlHoOj/DTpUcWL0Zj4664Y5H4qqY395xv84I=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=eN8HSXcwzOCSExYnG2xLUQnzQDZFDjDENE81beqrGQ8goxw8Iz6pu4s0WSHK2xCku cxgbjCymkVE1AG1slxA3/gYaTsoTMAxM74p1gyOkFSqnzipl0qvzfqayBb6DczfHc8 NBljR2eDw4ooez0L3PDy0+qMGFwvoS2LprzspSAgh66tugm55TUSZc1IKOSVQPsOuI zQiZo9XIeKOyuXFkSzMGbzM6sjpErYlAsnAoxttp+vhLEF6I/TPQ56Vhd/dE1h7gG+ 1T6qO4/WvPJ2Pgr+cMIdCMbwn38BLMR3/1nbUemRHF2pCmdsIupX6+vqaUfWyGBj5d djaQzxQDvhSOA== Date: Thu, 25 Aug 2022 15:43:28 +0200 From: Lorenzo Pieralisi To: Pali =?iso-8859-1?Q?Roh=E1r?= Cc: Bjorn Helgaas , Thomas Petazzoni , Rob Herring , Krzysztof =?utf-8?Q?Wilczy=C5=84ski?= , Bjorn Helgaas , Hajo Noerenberg , linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, maz@kernel.org Subject: Re: [PATCH] PCI: mvebu: Dispose INTx irqs prior to removing INTx domain Message-ID: References: <20220808184418.brjntz26kalathig@pali> <20220809020042.GA1260418@bhelgaas> <20220809133911.hqi7eyskcq2sojia@pali> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20220809133911.hqi7eyskcq2sojia@pali> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220825_064338_537898_6F5F12CB X-CRM114-Status: GOOD ( 57.24 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org [+Marc] On Tue, Aug 09, 2022 at 03:39:11PM +0200, Pali Roh=E1r wrote: [...] > > Then we'll need either a strong argument that this is safe even if > > drivers for endpoints below fail to dispose of their IRQ info > > correctly, or an argument that the benefit of making the PCI > > controller drivers removable outweighs that risk. > > = > > I haven't been paying close attention recently, so I may have missed > > the details here, but my superficial opinion is that removability of > > PCI controller drivers is primarily useful to developers, not to end > > users. > = > For developers it is strong need. Hajo needed this patch for debugging > SATA related issue. > = > For end-users in more cases it is also needed because cards, card > drivers or controller drivers can be buggy and calling rmmod && modprobe > is the way how to reload PCIe bus with controller without need to reboot > whole machine. This approach is lot of times suggested by admins and > also on user forums, because it really helps. > = > In my opinion, if hotplug capable kernel driver (for what ever reasons) > leaks resources on unbind procedure, it is an big issue. It basically > disallow proper next bind procedure; which is required for hotplug > capable drivers and buses, which PCIe is. memory and interrupts are just > example of resources which drivers use lot of. > = > So I consider a bug if driver does not release resource properly and > "parent" driver should not expect that something like it is normal. > = > This particular patch was tested at least by two people that is really > fixes resource-free issue with more endpoint drivers. > = > I think that in this case for pci-mvebu.c it is safe because: At the > first step of unbind procedure is called unregistraction of PCIe bus > with all devices bound on it. This ensures that all PCIe endpoint > drivers are unbind, devices removed and no new driver or device and > appear. After that there should not be any remaining usage of PCIe > resources (if there is then whole PCIe hotplug code is broken and we > have other and bigger issue...). Next pci-mvebu.c manually disposes all > remaining legacy interrupts (which PCI core code does not because legacy > interrupts are shared and it does not know if they are used or not). > This is safe because at these stage there are no PCI drivers bound, > there is no PCI device for that controller registered. And after that is > removed IRQ domain (which has finally disposed all interrupts). > If you see there some logical issue, please let me know. > = > Note that allowing doing device unbind and disallowing rmmod is useless > as the whole issue happens during unbind, not during rmmod. So the > discussion should have been about device unbinding, not rmmoding. > = > I see very big benefit for both developers and end users to allow doing > unbind procedure as explained above. > = > But if you decide that unbind should be disallowed, then doing it > properly should probably imply to also disallow doing PCIe hog-unplug. > As unplugging device means to unbind endpoint card drivers. And I think > hot-plug and hot-unplug is a PCIe feature which could be supported. But > well, this is decision for you as maintainer. I think that this is the right thing to do - CC'ed Marc here in case I am missing something - I think this ought to be fixed, for this controller and all others that are missing the irq_dispose_mapping() call. Lorenzo > > Bjorn > > = > > > On Saturday 09 July 2022 18:18:58 Pali Roh=E1r wrote: > > > > Documentation for irq_domain_remove() says that all mapping within = the > > > > domain must be disposed prior to domain remove. > > > > = > > > > Currently INTx irqs are not disposed in pci-mvebu.c device unbind c= allback > > > > which cause that kernel crashes after unloading driver and trying t= o read > > > > /sys/kernel/debug/irq/irqs/ or /proc/interrupts. > > > > = > > > > Fixes: ec075262648f ("PCI: mvebu: Implement support for legacy INTx= interrupts") > > > > Reported-by: Hajo Noerenberg > > > > Signed-off-by: Pali Roh=E1r > > > > --- > > > > Depends on patch: > > > > https://lore.kernel.org/linux-pci/20220524122817.7199-1-pali@kernel= .org/ > > > > = > > > > Here is the captured kernel crash which happens without this patch: > > > > = > > > > $ cat /sys/kernel/debug/irq/irqs/64 > > > > [ 301.571370] 8<--- cut here --- > > > > [ 301.574496] Unable to handle kernel paging request at virtual ad= dress 0a00002a > > > > [ 301.581736] [0a00002a] *pgd=3D00000000 > > > > [ 301.585323] Internal error: Oops: 80000005 [#1] SMP ARM > > > > [ 301.590560] Modules linked in: > > > > [ 301.593621] CPU: 1 PID: 4641 Comm: cat Not tainted 5.16.0-rc1+ #= 192 > > > > [ 301.599905] Hardware name: Marvell Armada 380/385 (Device Tree) > > > > [ 301.605836] PC is at 0xa00002a > > > > [ 301.608896] LR is at irq_debug_show+0x210/0x2d4 > > > > [ 301.613440] pc : [<0a00002a>] lr : [] psr: 20000= 0b3 > > > > [ 301.619721] sp : c797fdd8 ip : 0000000b fp : 0a00002b > > > > [ 301.624957] r10: c0d9a364 r9 : 00000001 r8 : 00000000 > > > > [ 301.630192] r7 : c18fee18 r6 : c0da2a74 r5 : c18fee00 r4 : c6= 6ec050 > > > > [ 301.636734] r3 : 00000001 r2 : c18fee18 r1 : 00000000 r0 : c6= 6ec050 > > > > [ 301.643275] Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA Thu= mb Segment none > > > > [ 301.650689] Control: 10c5387d Table: 0790c04a DAC: 00000051 > > > > [ 301.656446] Register r0 information: slab seq_file start c66ec05= 0 pointer offset 0 > > > > [ 301.664040] Register r1 information: NULL pointer > > > > [ 301.668755] Register r2 information: slab kmalloc-256 start c18f= ee00 pointer offset 24 size 256 > > > > [ 301.677480] Register r3 information: non-paged memory > > > > [ 301.682543] Register r4 information: slab seq_file start c66ec05= 0 pointer offset 0 > > > > [ 301.690133] Register r5 information: slab kmalloc-256 start c18f= ee00 pointer offset 0 size 256 > > > > [ 301.698770] Register r6 information: non-slab/vmalloc memory > > > > [ 301.704442] Register r7 information: slab kmalloc-256 start c18f= ee00 pointer offset 24 size 256 > > > > [ 301.713165] Register r8 information: NULL pointer > > > > [ 301.717879] Register r9 information: non-paged memory > > > > [ 301.722941] Register r10 information: non-slab/vmalloc memory > > > > [ 301.728699] Register r11 information: non-paged memory > > > > [ 301.733848] Register r12 information: non-paged memory > > > > [ 301.738997] Process cat (pid: 4641, stack limit =3D 0xf591166e) > > > > [ 301.744756] Stack: (0xc797fdd8 to 0xc7980000) > > > > [ 301.749123] fdc0: = 0000000a 830d3f3e > > > > [ 301.757321] fde0: c1004f48 c0d9a374 c7a9cc10 c66ec050 00000000 c= 88af900 c797fe80 7ffff000 > > > > [ 301.765518] fe00: 00400cc0 c66ec068 00000001 c02c5cb8 00000000 0= 0000000 c66ec078 c797fe68 > > > > [ 301.773715] fe20: c1cdf6c0 c7a9cc10 ffffffea c88af900 00000010 0= 0000000 00000000 c88af900 > > > > [ 301.781911] fe40: c1004f48 c797ff78 00001000 00004004 c03efcb8 c= 02c6100 00001000 00000000 > > > > [ 301.790108] fe60: bec73e04 00001000 00000000 00000000 00001000 c= 797fe60 00000001 00000000 > > > > [ 301.798304] fe80: c88af900 00000000 00000000 00000000 00000000 0= 0000000 00000000 40040000 > > > > [ 301.806501] fea0: 00000000 00000000 c1004f48 830d3f3e c88af900 c= 02c6018 c1c7a770 bec73e04 > > > > [ 301.814697] fec0: 00001000 c797ff78 00000001 c03efd0c 00001000 c= 88af900 00000000 bec73e04 > > > > [ 301.822894] fee0: c1004f48 c797ff78 00000001 c029c728 c887ca20 0= 1100cca 0000004f 0045f000 > > > > [ 301.831091] ff00: 00000254 c790c010 c790c010 00000000 00000000 0= 0000000 c5f6117c eeece9b8 > > > > [ 301.839288] ff20: 00000000 830d3f3e 00000000 c797ffb0 c79fc000 8= 0000007 0045f5b8 00000254 > > > > [ 301.847484] ff40: c79fc040 00000004 c887ca20 830d3f3e 00000000 c= 1004f48 c88af900 00000000 > > > > [ 301.855681] ff60: 00000000 c88af900 bec73e04 00001000 00000000 c= 029cd68 00000000 00000000 > > > > [ 301.863877] ff80: 00000000 830d3f3e 00000000 00000000 01000000 0= 0000003 c0100284 c1b8abc0 > > > > [ 301.872074] ffa0: 00000003 c0100060 00000000 00000000 00000003 b= ec73e04 00001000 00000000 > > > > [ 301.880270] ffc0: 00000000 00000000 01000000 00000003 00000003 0= 0000001 00000001 00000000 > > > > [ 301.888468] ffe0: bec73d98 bec73d88 b6f81f88 b6f81410 60000010 0= 0000003 00000000 00000000 > > > > [ 301.896666] [] (irq_debug_show) from [] (seq= _read_iter+0x1a4/0x504) > > > > [ 301.904700] [] (seq_read_iter) from [] (seq_= read+0xe8/0x12c) > > > > [ 301.912117] [] (seq_read) from [] (full_prox= y_read+0x54/0x70) > > > > [ 301.919623] [] (full_proxy_read) from [] (vf= s_read+0xa0/0x2c8) > > > > [ 301.927214] [] (vfs_read) from [] (ksys_read= +0x58/0xd0) > > > > [ 301.934195] [] (ksys_read) from [] (ret_fast= _syscall+0x0/0x54) > > > > [ 301.941785] Exception stack(0xc797ffa8 to 0xc797fff0) > > > > [ 301.946849] ffa0: 00000000 00000000 00000003 b= ec73e04 00001000 00000000 > > > > [ 301.955045] ffc0: 00000000 00000000 01000000 00000003 00000003 0= 0000001 00000001 00000000 > > > > [ 301.963241] ffe0: bec73d98 bec73d88 b6f81f88 b6f81410 > > > > [ 301.968304] Code: bad PC value > > > > [ 301.971365] ---[ end trace fe25fd26d042b605 ]--- > > > > [ 301.975992] Kernel panic - not syncing: Fatal exception > > > > [ 301.981229] CPU0: stopping > > > > [ 301.983946] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G D = 5.16.0-rc1+ #192 > > > > [ 301.991884] Hardware name: Marvell Armada 380/385 (Device Tree) > > > > [ 301.997817] [] (unwind_backtrace) from [] (s= how_stack+0x10/0x14) > > > > [ 302.005587] [] (show_stack) from [] (dump_st= ack_lvl+0x40/0x4c) > > > > [ 302.013179] [] (dump_stack_lvl) from [] (do_= handle_IPI+0xf4/0x128) > > > > [ 302.021117] [] (do_handle_IPI) from [] (ipi_= handler+0x18/0x20) > > > > [ 302.028707] [] (ipi_handler) from [] (handle= _percpu_devid_irq+0x78/0x124) > > > > [ 302.037256] [] (handle_percpu_devid_irq) from [] (generic_handle_domain_irq+0x44/0x88) > > > > [ 302.046938] [] (generic_handle_domain_irq) from [] (gic_handle_irq+0x74/0x88) > > > > [ 302.055839] [] (gic_handle_irq) from [] (gen= eric_handle_arch_irq+0x34/0x44) > > > > [ 302.064564] [] (generic_handle_arch_irq) from [] (__irq_svc+0x50/0x68) > > > > [ 302.072851] Exception stack(0xc1001f00 to 0xc1001f48) > > > > [ 302.077916] 1f00: 000d6830 00000000 00000001 c0116be0 c1004f90 c= 1004fd4 00000001 00000000 > > > > [ 302.086114] 1f20: c1004f48 c0f5d2a8 c1009e80 00000000 00000000 c= 1001f50 c01076f4 c01076f8 > > > > [ 302.094309] 1f40: 60000013 ffffffff > > > > [ 302.097804] [] (__irq_svc) from [] (arch_cpu= _idle+0x38/0x3c) > > > > [ 302.105223] [] (arch_cpu_idle) from [] (defa= ult_idle_call+0x1c/0x2c) > > > > [ 302.113338] [] (default_idle_call) from [] (= do_idle+0x1c8/0x218) > > > > [ 302.121106] [] (do_idle) from [] (cpu_startu= p_entry+0x18/0x20) > > > > [ 302.128697] [] (cpu_startup_entry) from [] (= start_kernel+0x650/0x694) > > > > [ 302.136901] Rebooting in 3 seconds.. > > > > --- > > > > drivers/pci/controller/pci-mvebu.c | 9 ++++++++- > > > > 1 file changed, 8 insertions(+), 1 deletion(-) > > > > = > > > > diff --git a/drivers/pci/controller/pci-mvebu.c b/drivers/pci/contr= oller/pci-mvebu.c > > > > index 31f53a019b8f..951030052358 100644 > > > > --- a/drivers/pci/controller/pci-mvebu.c > > > > +++ b/drivers/pci/controller/pci-mvebu.c > > > > @@ -1713,8 +1713,15 @@ static int mvebu_pcie_remove(struct platform= _device *pdev) > > > > mvebu_writel(port, ~PCIE_INT_ALL_MASK, PCIE_INT_CAUSE_OFF); > > > > = > > > > /* Remove IRQ domains. */ > > > > - if (port->intx_irq_domain) > > > > + if (port->intx_irq_domain) { > > > > + int virq, j; > > > > + for (j =3D 0; j < PCI_NUM_INTX; j++) { > > > > + virq =3D irq_find_mapping(port->intx_irq_domain, j); > > > > + if (virq > 0) > > > > + irq_dispose_mapping(virq); > > > > + } > > > > irq_domain_remove(port->intx_irq_domain); > > > > + } > > > > = > > > > /* Free config space for emulated root bridge. */ > > > > pci_bridge_emul_cleanup(&port->bridge); > > > > -- = > > > > 2.20.1 > > > > = > > > = > > > _______________________________________________ > > > linux-arm-kernel mailing list > > > linux-arm-kernel@lists.infradead.org > > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel