From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1968AC433EF for ; Tue, 5 Oct 2021 07:05:11 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C81E561503 for ; Tue, 5 Oct 2021 07:05:10 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C81E561503 Authentication-Results: mail.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=suse.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=sgC9ThnERsV4GnxTgyy+VzdNweC77P9kv6hHtt1LjRs=; b=KvNFus/YS731LG c7VC2gmOVuTI4muGPH35iLx9fMHYBNQzk5lLkD5CAwjUbIs154PX59i30d3x0IF/OUYmcKS/C8v+2 ih8aoRBDCJGexO1BOYfc3UxmCd0g9W2GD2Imeu3vM0fE4c4df2Ajwd0x5K1CeueCeoiJH4YNn52g8 1naJP13K25W0/9ueGl+Sq3voEOdVcU3pHQG91LetajLUHt7/UaAjsZbU+9waFTzKYXrMjM1QaXeLR MUrTwuGsrUfMvuHD62u2U9n/FuRbIqxoXL+U4UC8p/DEYP155OqRUc0gLr5vbLGfij9IQnqLtxLoH Wj3K2MIpW4ZDmQj+GQZg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1mXeU3-009FOc-4k; Tue, 05 Oct 2021 07:03:27 +0000 Received: from smtp-out2.suse.de ([195.135.220.29]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1mXeTy-009FNK-RH for linux-arm-kernel@lists.infradead.org; Tue, 05 Oct 2021 07:03:24 +0000 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 0571D1FE37; Tue, 5 Oct 2021 07:03:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1633417401; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=oJzanskWVV3j7+XNGt4NLM5ATeulaEMUdbY78QfwQk8=; b=ar2P+TAzqqXOqAYxlhmM4uGh7nC/AD/G42oeMi1XBuJHP7RXIeVOaKe3AjX/kuQ5yCZsLq ecUvVYwv+5gA4fMCpfVY3lxCuVZZykjc4wt+NhAXtwX7MAoul7QZam4r/w53pZjghwVz4N xw9H6HcWyTzG0L2uek/C66v27fCvvKs= Received: from suse.cz (unknown [10.100.224.162]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id A611EA3B84; Tue, 5 Oct 2021 07:03:20 +0000 (UTC) Date: Tue, 5 Oct 2021 09:03:17 +0200 From: Petr Mladek To: Pingfan Liu Cc: linux-kernel@vger.kernel.org, Sumit Garg , Catalin Marinas , Will Deacon , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Marc Zyngier , Kees Cook , Masahiro Yamada , Sami Tolvanen , Andrew Morton , Wang Qing , "Peter Zijlstra (Intel)" , Santosh Sivaraj , linux-arm-kernel@lists.infradead.org Subject: Re: [PATCHv2 3/4] kernel/watchdog: adapt the watchdog_hld interface for async model Message-ID: References: <20210923140951.35902-1-kernelfans@gmail.com> <20210923140951.35902-4-kernelfans@gmail.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20210923140951.35902-4-kernelfans@gmail.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211005_000323_067002_228B4846 X-CRM114-Status: GOOD ( 33.35 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Thu 2021-09-23 22:09:50, Pingfan Liu wrote: > When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready > yet. E.g. on arm64, PMU is not ready until > device_initcall(armv8_pmu_driver_init). And it is deeply integrated > with the driver model and cpuhp. Hence it is hard to push this > initialization before smp_init(). > > But it is easy to take an opposite approach by enabling watchdog_hld to > get the capability of PMU async. > > The async model is achieved by expanding watchdog_nmi_probe() with > -EBUSY, and a re-initializing work_struct which waits on a wait_queue_head. > > Signed-off-by: Pingfan Liu > Cc: Sumit Garg > Cc: Catalin Marinas > Cc: Will Deacon > Cc: Ingo Molnar > Cc: Arnaldo Carvalho de Melo > Cc: Mark Rutland > Cc: Alexander Shishkin > Cc: Jiri Olsa > Cc: Namhyung Kim > Cc: Marc Zyngier > Cc: Kees Cook > Cc: Masahiro Yamada > Cc: Sami Tolvanen > Cc: Petr Mladek > Cc: Andrew Morton > Cc: Wang Qing > Cc: "Peter Zijlstra (Intel)" > Cc: Santosh Sivaraj > Cc: linux-arm-kernel@lists.infradead.org > To: linux-kernel@vger.kernel.org > --- > include/linux/nmi.h | 3 +++ > kernel/watchdog.c | 37 +++++++++++++++++++++++++++++++++++-- > 2 files changed, 38 insertions(+), 2 deletions(-) > > diff --git a/include/linux/nmi.h b/include/linux/nmi.h > index b7bcd63c36b4..270d440fe4b7 100644 > --- a/include/linux/nmi.h > +++ b/include/linux/nmi.h > @@ -118,6 +118,9 @@ static inline int hardlockup_detector_perf_init(void) { return 0; } > > void watchdog_nmi_stop(void); > void watchdog_nmi_start(void); > + > +extern bool hld_detector_delay_initialized; > +extern struct wait_queue_head hld_detector_wait; > int watchdog_nmi_probe(void); > void watchdog_nmi_enable(unsigned int cpu); > void watchdog_nmi_disable(unsigned int cpu); > diff --git a/kernel/watchdog.c b/kernel/watchdog.c > index 6e6dd5f0bc3e..bd4ae1839b72 100644 > --- a/kernel/watchdog.c > +++ b/kernel/watchdog.c > @@ -103,7 +103,10 @@ void __weak watchdog_nmi_disable(unsigned int cpu) > hardlockup_detector_perf_disable(); > } > > -/* Return 0, if a NMI watchdog is available. Error code otherwise */ > +/* > + * Return 0, if a NMI watchdog is available. -EBUSY if not ready. > + * Other negative value if not support. > + */ > int __weak __init watchdog_nmi_probe(void) > { > return hardlockup_detector_perf_init(); > @@ -739,15 +742,45 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write, > } > #endif /* CONFIG_SYSCTL */ > > +static void lockup_detector_delay_init(struct work_struct *work); > +bool hld_detector_delay_initialized __initdata; > + > +struct wait_queue_head hld_detector_wait __initdata = > + __WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait); > + > +static struct work_struct detector_work __initdata = > + __WORK_INITIALIZER(detector_work, lockup_detector_delay_init); > + > +static void __init lockup_detector_delay_init(struct work_struct *work) > +{ > + int ret; > + > + wait_event(hld_detector_wait, hld_detector_delay_initialized); > + ret = watchdog_nmi_probe(); > + if (!ret) { > + nmi_watchdog_available = true; > + lockup_detector_setup(); Is it really safe to call the entire lockup_detector_setup() later? It manipulates also softlockup detector. And more importantly, the original call is before smp_init(). It means that it was running when only single CPU was on. It seems that x86 has some problem with hardlockup detector as well. It later manipulates only the hardlockup detector. Also it uses cpus_read_lock() to prevent races with CPU hotplug, see fixup_ht_bug(). > + } else { > + WARN_ON(ret == -EBUSY); > + pr_info("Perf NMI watchdog permanently disabled\n"); > + } > +} > + > void __init lockup_detector_init(void) > { > + int ret; > + > if (tick_nohz_full_enabled()) > pr_info("Disabling watchdog on nohz_full cores by default\n"); > > cpumask_copy(&watchdog_cpumask, > housekeeping_cpumask(HK_FLAG_TIMER)); > > - if (!watchdog_nmi_probe()) > + ret = watchdog_nmi_probe(); > + if (!ret) > nmi_watchdog_available = true; > + else if (ret == -EBUSY) > + queue_work_on(smp_processor_id(), system_wq, &detector_work); IMHO, this is not acceptable. It will block one worker until someone wakes it. Only arm64 will have a code to wake up the work and only when pmu is successfully initialized. In all other cases, the worker will stay blocked forever. The right solution is to do it the other way. Queue the work from arm64-specific code when armv8_pmu_driver_init() succeeded. Also I suggest to flush the work to make sure that it is finished before __init code gets removed. The open question is what code the work will call. As mentioned above, I am not sure that lockup_detector_delay_init() is safe. IMHO, we need to manipulate only hardlockup detector and we have to serialize it against CPU hotplug. Best Regards, Petr _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel