Subject: Re: [RFC PATCH] arm64: defconfig: Disable fine-grained task level IRQ time accounting
From: Robin Murphy
To: Vladimir Oltean, Kurt Kanzenbach
Date: Mon, 3 Aug 2020 10:51:32 +0100
In-Reply-To: <20200803081625.czdfwcpw5emcd4ls@skbuf>
Cc: Marc Zyngier, paulmck@kernel.org, catalin.marinas@arm.com, Alison Wang, linux-kernel@vger.kernel.org, leoyang.li@nxp.com,
 will@kernel.org, vladimir.oltean@nxp.com, Thomas Gleixner, mw@semihalf.com, Anna-Maria Gleixner, Valentin Schneider, linux-arm-kernel@lists.infradead.org

On 2020-08-03 09:16, Vladimir Oltean wrote:
> On Mon, Aug 03, 2020 at 10:04:01AM +0200, Kurt Kanzenbach wrote:
>> On Thu Jul 30 2020, Vladimir Oltean wrote:
>>> On Thu, Jul 30, 2020 at 09:23:44AM +0200, Kurt Kanzenbach wrote:
>>>> On Wed Jul 29 2020, Vladimir Oltean wrote:
>>>>> For more context, here is my original report of the issue:
>>>>> https://lkml.org/lkml/2020/6/4/1062
>>>>>
>>>>> Just like you, I could not reproduce the RCU stalls and system hang on a
>>>>> 5.6-rt kernel, just on mainline and derivatives, using the plain
>>>>> defconfig.
>>>>>
>>>>> The issue is not specific to Layerscape or i.MX8, but rather I was able
>>>>> to see the same behavior on Marvell Armada 37xx as well as Qualcomm
>>>>> MSM8976.
>>>>>
>>>>> So, while of course I agree that disabling IRQ time accounting for arm64
>>>>> isn't a real solution, it isn't by far an exaggerated proposal either.
>>>>> Nonetheless, the patch is just a RFC and should be treated as such. We
>>>>> are at a loss when it comes to debugging this any further and we would
>>>>> appreciate some pointers.
>>>>
>>>> Yeah, sure. I'll try to reproduce this issue first. So it triggers with:
>>>>
>>>> * arm64
>>>> * mainline, not -rt kernel
>>>> * opened serial console
>>>> * irq accounting enabled
>>>>
>>>> Anything else?
>>>>
>>>> Thanks,
>>>> Kurt
>>>
>>> Thanks for giving a helping hand, Kurt. The defconfig should be enough.
>>> In the interest of full disclosure, the only arm64 device on which we
>>> didn't reproduce this was the 16-core LX2160A. But we did reproduce on
>>> that with maxcpus=1 though. And also on msm8976 with all 8 cores booted.
>>> Just mentioning this in case you're testing on a 16-core system, you
>>> might want to reduce the number a bit.
>>
>> OK. I've reproduced it on a Marvell Armada SoC with v5.6 mainline. See
>> splats below. Running with irq time accounting enabled, kills the
>> machine immediately. However, I'm not getting the possible deadlock
>> warnings in 8250 as you did. So that might be unrelated.
>>
>
> Yes, the console lockdep warnings are unrelated. They are discussed
> here:
> https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/
>
>> Unfortunately I have no idea what to debug here.
>>
>> Thanks,
>> Kurt
>
> So, this means we could submit a formal version of this patch? :)

Defconfig is absolutely not the appropriate way to work around bugs - it's
merely a starting point for users and distros to set up their own kernel,
and if they can still enable this option and render their system unusable
then patching some other config that they aren't using is pointless. To
usefully mitigate a problem you'd need to make sure the offending option
cannot be selected at all (i.e. prohibit HAVE_IRQ_TIME_ACCOUNTING as well).

Having glanced across another thread that mentions IRQ accounting
recently[1], I wonder if the underlying bug here might have something to
do with the stuff that Marc's trying to clean up.

Robin.

[1] https://lore.kernel.org/linux-arm-kernel/20200624195811.435857-16-maz@kernel.org/
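
As an aside to the defconfig argument above, here is a minimal sketch (not part of the original mail) of why changing the defconfig does not protect anyone who builds from their own .config. It assumes CONFIG_IRQ_TIME_ACCOUNTING is the symbol the RFC patch turns off; the helper name option_state and both config fragments are hypothetical and exist only for illustration.

```python
import re

# Assumed symbol behind "fine-grained task level IRQ time accounting".
OPTION = "CONFIG_IRQ_TIME_ACCOUNTING"


def option_state(config_text, option=OPTION):
    """Return the value assigned to `option`, or "n" if it is unset or absent."""
    assign = re.compile(r"^%s=(.*)$" % re.escape(option))
    for line in config_text.splitlines():
        match = assign.match(line.strip())
        if match:
            return match.group(1)
    # Lines such as "# CONFIG_FOO is not set" are comments and never match.
    return "n"


# Hypothetical fragment of a defconfig with the option turned off.
patched_defconfig = "# CONFIG_IRQ_TIME_ACCOUNTING is not set\n"

# Hypothetical fragment of a distro or user .config that never used the defconfig.
user_config = "CONFIG_ARM64=y\nCONFIG_IRQ_TIME_ACCOUNTING=y\n"

for name, text in (("defconfig", patched_defconfig), ("user .config", user_config)):
    print(name, "->", option_state(text))
```

The second fragment still reports the option as enabled, which is the situation described above: only making the option unselectable in Kconfig would change that result for every configuration.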