From: Cathy Cai
Subject: [RFC PATCH] net: stmmac: Fix the problem about interrupt storm
Date: Wed, 27 Mar 2024 19:01:42 +0800
Message-ID: <20240327110142.159851-1-cathy.cai@unisoc.com>

After running the MSR (monkey sleep reboot) test on Android for seven
days, I hit the netdev watchdog timeout shown below. The Tx queue times
out and the adapter is reset. During that reset there is a chance that
an interrupt storm occurs and the system crashes.

During the MSR test, the following NETDEV WATCHDOG warning is reported:

[ 117.885804] ------------[ cut here ]------------
[ 117.885818] NETDEV WATCHDOG: eth0 (stmmaceth): transmit queue 0 timed out
[ 117.885873] WARNING: CPU: 1 PID: 4169 at net/sched/sch_generic.c:473 dev_watchdog+0x2fc/0x41c
[ 117.886070] sprd_systimer sprd_sip_svc sprd_wdt_fiq sprd_wdt_pon
[ 117.886082] CPU: 1 PID: 4169 Comm: RenderThread Tainted: G S C O 5.4.147-ab41313 #1
[ 117.886085] Hardware name: Spreadtrum UIS6780 SoC (DT)
[ 117.886090] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 117.886094] pc : dev_watchdog+0x2fc/0x41c
[ 117.886098] lr : dev_watchdog+0x2fc/0x41c
[ 117.886100] sp : ffffffc01000bcf0
[ 117.886103] x29: ffffffc01000bcf0 x28: ffffffc011eafe28
[ 117.886107] x27: ffffff80f97a5c40 x26: 00000000ffffffff
[ 117.886111] x25: 0000000000000001 x24: 0000000000000008
[ 117.886114] x23: ffffffc011ea6000 x22: ffffffc011e73020
[ 117.886118] x21: 0000000000000000 x20: ffffff80f434841c
[ 117.886122] x19: ffffff80f4348000 x18: ffffffc01000d048
[ 117.886127] x17: ffffffc012050044 x16: 00000000000508d0
[ 117.886130] x15: 0000000000000006 x14: 0000000000000058
[ 117.886134] x13: 0000000000000008 x12: 0000000042d7d11b
[ 117.886138] x11: 0000000000000015 x10: 0000000000000001
[ 117.886141] x9 : a6fe08b7d867fd00 x8 : a6fe08b7d867fd00
[ 117.886145] x7 : 0000000000000000 x6 : ffffffc0120a0899
[ 117.886149] x5 : 0000000000000058 x4 : 0000000000000002
[ 117.886152] x3 : ffffffc01000b980 x2 : 0000000000000007
[ 117.886156] x1 : 0000000000000006 x0 : 000000000000003d
[ 117.886164]
[ 117.887028]
[ 117.887030] Call trace:
[ 117.887035] dev_watchdog+0x2fc/0x41c
[ 117.887043] call_timer_fn+0x5c/0x274
[ 117.887046] expire_timers+0x74/0x1b4
[ 117.887050] __run_timers+0x250/0x2b0
[ 117.887054] run_timer_softirq+0x28/0x4c
[ 117.887061] __do_softirq+0x128/0x4dc
[ 117.887067] irq_exit+0xf8/0xfc
[ 117.887072] __handle_domain_irq+0xb0/0x108
[ 117.887076] gic_handle_irq+0x6c/0x124
[ 117.887081] el0_irq_naked+0x64/0x74
[ 117.887084] ---[ end trace 1308772835db89f6 ]---
[ 117.887188] stmmaceth 32600000.ethernet eth0: Reset adapter.

The Tx queue times out and the adapter is reset. When resetting the
adapter, the stmmac driver sets the state to STMMAC_DOWN and then calls
dev_close(). An interrupt can be triggered in the window after
STMMAC_DOWN is set and before dev_close() is called. The sequence is as
follows:

stmmac_reset_subtask()
    set_bit(STMMAC_DOWN, &priv->state);
        ---> interrupt
             stmmac_interrupt()
                 return IRQ_HANDLED
    dev_close(priv->dev);

In that case stmmac_interrupt() sees that the state is STMMAC_DOWN and
returns IRQ_HANDLED immediately. It does no further processing, so the
interrupt status is never cleared and the interrupt keeps firing.
To avoid this, set STMMAC_DOWN after dev_close().

Signed-off-by: Cathy Cai
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 24cd80490d19..61690b68b6ad 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -7167,8 +7167,8 @@ static void stmmac_reset_subtask(struct stmmac_priv *priv)
 	while (test_and_set_bit(STMMAC_RESETING, &priv->state))
 		usleep_range(1000, 2000);
 
-	set_bit(STMMAC_DOWN, &priv->state);
 	dev_close(priv->dev);
+	set_bit(STMMAC_DOWN, &priv->state);
 	dev_open(priv->dev, NULL);
 	clear_bit(STMMAC_DOWN, &priv->state);
 	clear_bit(STMMAC_RESETING, &priv->state);
-- 
2.34.1
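
For reviewers who want to reason about the ordering without the hardware, the race described in the commit message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not driver code: struct model_dev, model_interrupt(), model_deliver(), model_close() and the two reset_*_order() helpers are hypothetical names, the interrupt is assumed to be level-triggered (it stays pending until the handler clears the status), and model_close() is assumed to simply stop delivery.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real stmmac_priv. */
struct model_dev {
	bool down;        /* models the STMMAC_DOWN bit */
	bool irq_pending; /* models a level-triggered interrupt status bit */
	bool irq_enabled; /* models the IRQ still being deliverable */
};

/*
 * Models the early return described above: while DOWN is set the handler
 * reports "handled" without clearing the interrupt status.
 */
static void model_interrupt(struct model_dev *d)
{
	if (d->down)
		return;
	d->irq_pending = false;
}

/*
 * Re-deliver the interrupt while it is still pending, capped at max_tries
 * so the model terminates. Returns how many times the handler ran.
 */
static int model_deliver(struct model_dev *d, int max_tries)
{
	int runs = 0;

	while (d->irq_enabled && d->irq_pending && runs < max_tries) {
		model_interrupt(d);
		runs++;
	}
	return runs;
}

/* Assumption: closing the device stops further interrupt delivery. */
static void model_close(struct model_dev *d)
{
	d->irq_enabled = false;
}

/* Ordering before the patch: DOWN is set, then the interrupt arrives. */
static int reset_old_order(int max_tries)
{
	struct model_dev d = { .down = false, .irq_pending = false, .irq_enabled = true };
	int runs;

	d.down = true;        /* set_bit(STMMAC_DOWN, ...) */
	d.irq_pending = true; /* interrupt fires inside the window */
	runs = model_deliver(&d, max_tries);
	model_close(&d);      /* dev_close() */
	return runs;
}

/* Ordering after the patch: the same interrupt arrives before DOWN is set. */
static int reset_new_order(int max_tries)
{
	struct model_dev d = { .down = false, .irq_pending = false, .irq_enabled = true };
	int runs;

	d.irq_pending = true; /* same interrupt, DOWN not yet set */
	runs = model_deliver(&d, max_tries);
	model_close(&d);      /* dev_close() */
	d.down = true;        /* set_bit(STMMAC_DOWN, ...) */
	return runs;
}
```

In this model reset_old_order(1000) runs the handler until the cap is reached because the status is never cleared, while reset_new_order(1000) runs it once. It says nothing about other paths that may rely on STMMAC_DOWN being set during dev_close(), which is part of why this is sent as an RFC.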