Subject: Re: [PATCHv2 09/11] arm64: entry: fix non-NMI kernel<->kernel transitions
To: Mark Rutland
References: <20201130115950.22492-1-mark.rutland@arm.com> <20201130115950.22492-10-mark.rutland@arm.com> <20210426092139.GA16287@C02TD0UTHF1T.local>
From: Zenghui Yu
Message-ID: <0e1143a5-8e1a-c04e-e4fd-5c57f8354f61@huawei.com>
Date: Mon, 26 Apr 2021 21:39:04 +0800
In-Reply-To: <20210426092139.GA16287@C02TD0UTHF1T.local>

Hi Mark,

On 2021/4/26 17:21, Mark Rutland wrote:
> On Sun, Apr 25, 2021 at 01:29:31PM +0800, Zenghui Yu wrote:
>> Hi Mark,
>
> Hi Zenghui,

[...]

>> Booting a lockdep-enabled kernel with "irqchip.gicv3_pseudo_nmi=1" would
>> result in splats as below:
>>
>> | DEBUG_LOCKS_WARN_ON(!irqs_disabled())
>> | WARNING: CPU: 3 PID: 125 at kernel/locking/lockdep.c:4258 lockdep_hardirqs_off+0xd4/0xe8
>> | Modules linked in:
>> | CPU: 3 PID: 125 Comm: modprobe Tainted: G W 5.12.0-rc8+ #463
>> | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
>> | pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO BTYPE=--)
>> | pc : lockdep_hardirqs_off+0xd4/0xe8
>> | lr : lockdep_hardirqs_off+0xd4/0xe8
>> | sp : ffff80002a39bad0
>> | pmr_save: 000000e0
>> | x29: ffff80002a39bad0 x28: ffff0000de214bc0
>> | x27: ffff0000de1c0400 x26: 000000000049b328
>> | x25: 0000000000406f30 x24: ffff0000de1c00a0
>> | x23: 0000000020400005 x22: ffff8000105f747c
>> | x21: 0000000096000044 x20: 0000000000498ef9
>> | x19: ffff80002a39bc88 x18: ffffffffffffffff
>> | x17: 0000000000000000 x16: ffff800011c61eb0
>> | x15: ffff800011700a88 x14: 0720072007200720
>> | x13: 0720072007200720 x12: 0720072007200720
>> | x11: 0720072007200720 x10: 0720072007200720
>> | x9 : ffff80002a39bad0 x8 : ffff80002a39bad0
>> | x7 : ffff8000119f0800 x6 : c0000000ffff7fff
>> | x5 : ffff8000119f07a8 x4 : 0000000000000001
>> | x3 : 9bcdab23f2432800 x2 : ffff800011730538
>> | x1 : 9bcdab23f2432800 x0 : 0000000000000000
>> | Call trace:
>> |  lockdep_hardirqs_off+0xd4/0xe8
>> |  enter_from_kernel_mode.isra.5+0x7c/0xa8
>> |  el1_abort+0x24/0x100
>> |  el1_sync_handler+0x80/0xd0
>> |  el1_sync+0x6c/0x100
>> |  __arch_clear_user+0xc/0x90
>> |  load_elf_binary+0x9fc/0x1450
>> |  bprm_execve+0x404/0x880
>> |  kernel_execve+0x180/0x188
>> |  call_usermodehelper_exec_async+0xdc/0x158
>> |  ret_from_fork+0x10/0x18
>>
>> The code that triggers the splat is lockdep_hardirqs_off+0xd4/0xe8:
>>
>> |	/*
>> |	 * So we're supposed to get called after you mask local IRQs, but for
>> |	 * some reason the hardware doesn't quite think you did a proper job.
>> |	 */
>> |	if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
>> |		return;
>>
>> which looks like a false positive as DAIF are all masked on taking
>> a synchronous exception and hardirqs are therefore disabled. With
>> pseudo NMI used, irqs_disabled() takes the value of ICC_PMR_EL1 as
>> the interrupt enable state, which is GIC_PRIO_IRQON (0xe0) in this
>> case and doesn't help much. I have not dug further, though.
>
> Thanks for this report. I think I understand the problem.
>
> In some paths (e.g. el1_dbg, el0_svc) we update the PMR with
> (GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET) before we notify lockdep, but in
> others (e.g. el1_abort) we do not. The cases where we do not are where
> lockdep will warn, since IRQs will be masked by DAIF but not the PMR, as
> you describe above.
>
> With the current PMR management scheme, we'll need to consistently
> update the PMR earlier in the entry code. Does the below diff help?

[...]

> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 6acfc5e6b5e0..7d46c74a8706 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -292,6 +292,8 @@ alternative_else_nop_endif
>  alternative_if ARM64_HAS_IRQ_PRIO_MASKING
>  	mrs_s	x20, SYS_ICC_PMR_EL1
>  	str	x20, [sp, #S_PMR_SAVE]
> +	orr	x20, x20, #GIC_PRIO_PSR_I_SET
> +	msr_s	SYS_ICC_PMR_EL1, x20
>  alternative_else_nop_endif

While this does fix the lockdep part, it breaks something else. The
sleep-in-atomic splat below stands out (that is, I've seen other splats
triggered with this diff too), where irqs_disabled() in do_mem_abort()
now gets confused by the updated PMR (GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET).

Please have a look.

Thanks,
Zenghui

| BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:582
| in_atomic(): 0, irqs_disabled(): 16, non_block: 0, pid: 512, name: sh
| CPU: 2 PID: 512 Comm: sh Tainted: G S W 5.12.0+
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| Call trace:
|  dump_backtrace+0x0/0x210
|  show_stack+0x2c/0x38
|  dump_stack+0x150/0x1c4
|  ___might_sleep+0x154/0x250
|  __might_sleep+0x58/0x90
|  do_page_fault+0x24c/0x498
|  do_mem_abort+0x50/0xc0
|  el1_abort+0x50/0x100
|  el1_sync_handler+0x80/0xd0
|  el1_sync+0x74/0x100
|  schedule_tail+0xa0/0xd0
|  ret_from_fork+0x4/0x18
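
To make the two failure modes in this thread easier to compare, here is a
rough user-space model in C. It is only a sketch and not the kernel's code:
the names model_irqs_disabled(), model_el1_entry() and struct cpu_state are
made up for illustration. GIC_PRIO_IRQON (0xe0) is taken from the report;
the value 0x10 for GIC_PRIO_PSR_I_SET and the XOR comparison against
GIC_PRIO_IRQON are assumptions, chosen because together they reproduce the
"irqs_disabled(): 16" printed in the splat.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 0xe0 is the value quoted in the report above. */
#define GIC_PRIO_IRQON		0xe0u
/* ASSUMPTION: bit 4; consistent with "irqs_disabled(): 16" in the splat. */
#define GIC_PRIO_PSR_I_SET	0x10u

/* Hypothetical model of the CPU state relevant to IRQ masking. */
struct cpu_state {
	uint32_t pmr;		/* models ICC_PMR_EL1 */
	bool pstate_i;		/* models the I bit of DAIF */
};

/*
 * ASSUMPTION: with pseudo-NMI, irqs_disabled() looks only at the PMR and
 * returns non-zero whenever it differs from GIC_PRIO_IRQON. PSTATE.I is
 * deliberately ignored, which is the root of the lockdep warning.
 */
static inline unsigned int model_irqs_disabled(struct cpu_state s)
{
	return s.pmr ^ GIC_PRIO_IRQON;
}

/*
 * Model of taking a synchronous exception from EL1: the hardware sets
 * PSTATE.I, and the PMR is left untouched unless the proposed entry.S
 * diff (orr ..., #GIC_PRIO_PSR_I_SET) is applied.
 */
static inline struct cpu_state model_el1_entry(uint32_t pmr, bool with_diff)
{
	struct cpu_state s = { .pmr = pmr, .pstate_i = true };

	if (with_diff)
		s.pmr |= GIC_PRIO_PSR_I_SET;
	return s;
}
```

Under these assumptions, model_irqs_disabled(model_el1_entry(GIC_PRIO_IRQON,
false)) is 0 even though PSTATE.I is set, which corresponds to the
DEBUG_LOCKS_WARN_ON(!irqs_disabled()) case. With the diff applied it becomes
16, which satisfies lockdep but also makes the might_sleep() check in the
fault path believe IRQs are disabled.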