Message-ID:
<1583462174.12083.67.camel@mtkswgap22>
Subject: Re: [PATCH] xhci-mtk: Fix NULL pointer dereference with xhci_irq() for shared_hcd
From: Macpaul Lin
To: Greg Kroah-Hartman
Date: Fri, 6 Mar 2020 10:36:14 +0800
In-Reply-To: <20200305183202.GA2107395@kroah.com>
References: <1579246910-22736-1-git-send-email-macpaul.lin@mediatek.com>
 <08f69bab-2ada-d6ab-7bf7-d960e9f148a0@linux.intel.com>
 <1580556039.10835.3.camel@mtkswgap22>
 <39ec1610-1686-6509-02ac-6e73d8be2453@linux.intel.com>
 <1583291775.12083.59.camel@mtkswgap22>
 <1583377126.12083.63.camel@mtkswgap22>
 <20200305183202.GA2107395@kroah.com>
Cc: Sriharsha Allenki, Mathias Nyman, wsd_upstream, Mathias Nyman,
 linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org,
 Chunfeng Yun (云春峰), linux-mediatek@lists.infradead.org,
 Matthias Brugger, linux-arm-kernel@lists.infradead.org

On Thu, 2020-03-05 at 19:32 +0100, Greg Kroah-Hartman wrote:
> On Thu, Mar 05, 2020 at 10:58:46AM +0800, Macpaul Lin wrote:
> > On Wed, 2020-03-04 at 16:39 +0200, Mathias Nyman wrote:
> > > On 4.3.2020 5.16, Macpaul Lin wrote:
> > > > On Tue, 2020-02-04 at 17:44 +0800, Mathias Nyman wrote:
> > > >> On 1.2.2020 13.20, Macpaul Lin wrote:
> > > >>> On Fri, 2020-01-31 at 16:50 +0200, Mathias Nyman wrote:
> > > >>>> On 17.1.2020 9.41, Macpaul Lin wrote:
> > > >>>>> According to NULL pointer fix:
> > > >>>>> https://tinyurl.com/uqft5ra
> > > >>>>> xhci: Fix NULL pointer dereference with xhci_irq() for shared_hcd
> > > >>>>> The similar issue has also been found in QC activities in Mediatek.
> > > >>>>>
> > > >>>>> Here quote the description from the referenced patch as follows.
> > > >>>>> "Commit ("f068090426ea xhci: Fix leaking USB3 shared_hcd
> > > >>>>> at xhci removal") sets xhci_shared_hcd to NULL without
> > > >>>>> stopping xhci host. This results into a race condition
> > > >>>>> where shared_hcd (super speed roothub) related interrupts
> > > >>>>> are being handled with xhci_irq happens when the
> > > >>>>> xhci_plat_remove is called and shared_hcd is set to NULL.
> > > >>>>> Fix this by setting the shared_hcd to NULL only after the
> > > >>>>> controller is halted and no interrupts are generated."
> > > >>>>>
> > > >>>>> Signed-off-by: Sriharsha Allenki
> > > >>>>> Signed-off-by: Macpaul Lin
> > > >>>>> ---
> > > >>>>>  drivers/usb/host/xhci-mtk.c | 2 +-
> > > >>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >>>>>
> > > >>>>> diff --git a/drivers/usb/host/xhci-mtk.c b/drivers/usb/host/xhci-mtk.c
> > > >>>>> index b18a6baef204..c227c67f5dc5 100644
> > > >>>>> --- a/drivers/usb/host/xhci-mtk.c
> > > >>>>> +++ b/drivers/usb/host/xhci-mtk.c
> > > >>>>> @@ -593,11 +593,11 @@ static int xhci_mtk_remove(struct platform_device *dev)
> > > >>>>>  	struct usb_hcd *shared_hcd = xhci->shared_hcd;
> > > >>>>>
> > > >>>>>  	usb_remove_hcd(shared_hcd);
> > > >>>>> -	xhci->shared_hcd = NULL;
> > > >>>>>  	device_init_wakeup(&dev->dev, false);
> > > >>>>>
> > > >>>>>  	usb_remove_hcd(hcd);
> > > >>>>>  	usb_put_hcd(shared_hcd);
> > > >>>>> +	xhci->shared_hcd = NULL;
> > > >>>>>  	usb_put_hcd(hcd);
> > > >>>>>  	xhci_mtk_sch_exit(mtk);
> > > >>>>>  	xhci_mtk_clks_disable(mtk);
> > > >>>>>
> > > >>>>
> > > >>>> Could you share details of the NULL pointer dereference, (backtrace).
> > > >>>
> > > >>> This bug was found by our QA staff while doing 500 times plug-in and
> > > >>> plug-out devices. The backtrace I have was recorded by QA and I didn't
> > > >>> reproduce this issue on my own environment. However, after applied this
> > > >>> patch the issue seems resolve. Here is the backtrace:
> > > >>>
> > > >>> Exception Class: Kernel (KE)
> > > >>> PC is at [] xhci_irq+0x728/0x2364
> > > >>> LR is at [] xhci_irq+0x2f0/0x2364
> > > >>>
> > > >>> Current Executing Process:
> > > >>> [iptables, 859][netdagent, 770]
> > > >>>
> > > >>> Backtrace:
> > > >>> [] __atomic_notifier_call_chain+0xa8/0x130
> > > >>> [] notify_die+0x84/0xac
> > > >>> [] die+0x1d8/0x3b8
> > > >>> [] __do_kernel_fault+0x178/0x188
> > > >>> [] do_page_fault+0x44/0x3b0
> > > >>> [] do_translation_fault+0x44/0x98
> > > >>> [] do_mem_abort+0x4c/0x128
> > > >>> [] el1_da+0x24/0x3c
> > > >>> [] xhci_irq+0x728/0x2364
> > > >>> [] usb_hcd_irq+0x2c/0x44
> > > >>> [] __handle_irq_event_percpu+0x26c/0x4a4
> > > >>> [] handle_irq_event+0x5c/0xd0
> > > >>> [] handle_fasteoi_irq+0x10c/0x1e0
> > > >>> [] __handle_domain_irq+0x32c/0x738
> > > >>> [] gic_handle_irq+0x174/0x1c4
> > > >>> [] el0_irq_naked+0x50/0x5c
> > > >>> [] 0xffffffffffffffff
> > > >>>
> > > >>
> > > >> Thanks,
> > > >> Could you help me find out which line of code xhci_irq+0x728 is in your case.
> > > >>
> > > >> As Guenter pointed out there is a risk of turning the NULL pointer dereference
> > > >> into a use after free if we just solve this by setting xhci->shared_hcd = NULL
> > > >> later.
> > > >>
> > > >> If you still have that kernel around, and xhci is compiled in:
> > > >> gdb vmlinux
> > > >> gdb li *(xhci_irq+0x728)
> > > >>
> > > >
> > > > Sorry that I couldn't get back to you soon. The internal code version
> > > > for this issue was really old and a little bit difficult to rewind to
> > > > that version.
> > > > However, I think the following dump might be correct for the code base.
> > > >
> > > > (gdb) li *(xhci_irq+0x728)
> > > > 0xffffff8008cc8634 is in xhci_irq (*stripped*
> > > > kernel-4.14/drivers/usb/host/xhci.h:1694).
> > > > 1689     */
> > > > 1690    #define XHCI_MAX_REXIT_TIMEOUT_MS       20
> > > > 1691
> > > > 1692    static inline unsigned int hcd_index(struct usb_hcd *hcd)
> > > > 1693    {
> > > > 1694            if (hcd->speed >= HCD_USB3)
> > > > 1695                    return 0;
> > > > 1696            else
> > > > 1697                    return 1;
> > > > 1698    }
> > > > (gdb)
> > > >
> > > > Thanks
> > > > Macpaul Lin
> > > >
> > >
> > > Ah, it was a 4.14 kernel.
> > > This should be fixed in 4.20 with patch:
> > > 1245374e9b83 xhci: handle port status events for removed USB3 hcd
> > >
> > > Port arrays/structures were changed completely in 4.18
> > >
> > > Something like the below should work for 4.14:
> > >
> > > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> > > index 61fa3007a74a..e7367b9f19c5 100644
> > > --- a/drivers/usb/host/xhci-ring.c
> > > +++ b/drivers/usb/host/xhci-ring.c
> > > @@ -1640,6 +1640,12 @@ static void handle_port_status(struct xhci_hcd *xhci,
> > >  	if ((major_revision == 0x03) != (hcd->speed >= HCD_USB3))
> > >  		hcd = xhci->shared_hcd;
> > >
> > > +	if (!hcd) {
> > > +		xhci_dbg(xhci, "No hcd found for port %u event\n", port_id);
> > > +		bogus_port_status = true;
> > > +		goto cleanup;
> > > +	}
> > > +
> > >  	if (major_revision == 0) {
> > >  		xhci_warn(xhci, "Event for port %u not in "
> > >  				"Extended Capabilities, ignoring.\n",
> >
> > Thanks for this suggestion, this is much better! I am sorry that we're
> > using android kernel that some reported issue might be out of date. I
> > will update the suggestion into our code base. Thanks!
>
> Should I backport this to 4.14 and older kernels to prevent this issue
> from showing up in newer Android devices that are using these older
> kernels?
>
> thanks,
>
> greg k-h

If this could be backported to older kernels, that would be great for newer
Android devices.
Some of the shipping devices will require a kernel upgrade, so it would be
great if you could backport this patch. Thanks!

Regards,
Macpaul Lin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
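
For readers following the thread, the guard suggested for 4.14 can be
modelled in user space. This is only a rough sketch under stated
assumptions: the struct layouts, the HCD_USB2/HCD_USB3 values and the
handle_port_event() helper are simplified stand-ins invented for
illustration, not the kernel's definitions. It shows why hcd_index()
faults once shared_hcd is NULL, and how checking the pointer first turns
the fault into a dropped event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative speed constants (assumed values, not taken from the thread). */
enum { HCD_USB2 = 0x20, HCD_USB3 = 0x40 };

/* Simplified stand-in for the kernel's struct usb_hcd. */
struct usb_hcd {
	int speed;
};

/* Simplified stand-in for struct xhci_hcd: one primary and one shared hcd. */
struct xhci_hcd {
	struct usb_hcd *main_hcd;   /* USB2 roothub */
	struct usb_hcd *shared_hcd; /* USB3 roothub, NULL once it is removed */
};

/* Same shape as hcd_index() in the gdb listing: dereferences hcd unconditionally. */
static unsigned int hcd_index(const struct usb_hcd *hcd)
{
	if (hcd->speed >= HCD_USB3)
		return 0;
	else
		return 1;
}

/*
 * Hypothetical model of the port status path. Returns true and stores the
 * roothub index when an hcd exists for the event, or false when the event
 * is dropped because the matching hcd has already been removed.
 */
static bool handle_port_event(const struct xhci_hcd *xhci, int major_revision,
			      unsigned int *index_out)
{
	const struct usb_hcd *hcd;

	assert(xhci != NULL && xhci->main_hcd != NULL && index_out != NULL);

	hcd = xhci->main_hcd;
	if ((major_revision == 0x03) != (hcd->speed >= HCD_USB3))
		hcd = xhci->shared_hcd;

	/* The guard: without it, hcd_index(NULL) below would fault. */
	if (!hcd)
		return false;

	*index_out = hcd_index(hcd);
	return true;
}
```

A caller would treat a false return the way the proposed patch treats
bogus_port_status: skip the event and carry on. Note that this only
models the NULL check; it does not address the use-after-free concern
raised earlier in the thread.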