From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9B00C433EF for ; Wed, 9 Feb 2022 00:32:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239328AbiBIAcU (ORCPT ); Tue, 8 Feb 2022 19:32:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38934 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238991AbiBIAcU (ORCPT ); Tue, 8 Feb 2022 19:32:20 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 853A7C06157B for ; Tue, 8 Feb 2022 16:32:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644366737; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=sB5IseLJBNqjNOsMz7OQ46XvJJFOFs9bsWn3jXjDKHA=; b=eL9Gk0xhtVsIz0MGIRkxe61YAeCqtFdchRRbLOVPXP+nH1hwwcdPgF3WpEI8tPSz8EClNP v8m/5TVu4Q18hxVPHjbZvwANYM4TTClVNTrfcAszvBHOELQJeR5wL2OKEiEok8caCk3K9n QnteTcMMbNQmwZE9zK2A7Rt/EXK7TDo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-613-SjNaxTdpPPeoX9hYgKnVJg-1; Tue, 08 Feb 2022 19:32:14 -0500 X-MC-Unique: SjNaxTdpPPeoX9hYgKnVJg-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6CC7E1006AA4; Wed, 9 Feb 2022 00:32:11 +0000 (UTC) Received: from localhost (ovpn-12-114.pek2.redhat.com [10.72.12.114]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 229E162D48; Wed, 9 Feb 2022 00:31:53 +0000 (UTC) Date: Wed, 9 Feb 2022 08:31:51 +0800 From: "bhe@redhat.com" To: "Guilherme G. Piccoli" Cc: Petr Mladek , "dyoung@redhat.com" , "vgoyal@redhat.com" , "d.hatayama@fujitsu.com" , "kexec@lists.infradead.org" , "linux-kernel@vger.kernel.org" , "linux-doc@vger.kernel.org" , "stern@rowland.harvard.edu" , "akpm@linux-foundation.org" , "andriy.shevchenko@linux.intel.com" , "corbet@lwn.net" , "halves@canonical.com" , "kernel@gpiccoli.net" , mhiramat@kernel.org, d.hatayama@jp.fujitsu.com Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter Message-ID: References: <20220108153451.195121-1-gpiccoli@igalia.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Scanned-By: MIMEDefang 2.79 on 10.5.11.11 Precedence: bulk List-ID: X-Mailing-List: linux-doc@vger.kernel.org On 02/08/22 at 03:51pm, Guilherme G. Piccoli wrote: > On 28/01/2022 10:38, Petr Mladek wrote: > > [...] On Thu 2022-01-27 14:16:20, Guilherme G. Piccoli wrote: > > First, I am sorry for the very long mail. But the problem is really > > complicated. I did my best to describe it a clean way. > > > > I have discussed these problems with a colleague and he had some good > > points. And my view evolved even further. > > Thanks Petr for the very comprehensive and detailed email - this helps a > lot in shaping the future of panic notifier(s)! > > > > [...] > > I think about the following solution: > > > > + split the notifiers into three lists: > > > > + info: stop watchdogs, provide extra info > > + hypervisor: poke hypervisor > > + reboot: actions needed only when crash dump did not happen > > > > + allow to call hypervisor notifiers before or after kdump > > > > + stop CPUs before kdump when either hypervisor notifiers or > > kmsg_dump is enabled > > > > Note that it still allows to call kdump as the first action when > > hypervisor notifiers are called after kdump and no kmsg dumper > > is registered. > > > > > > void panic(void) > > { > > [...] > > > > if (crash_kexec_post_hypervisor || panic_print || enabled_kmsg_dump()) { > > /* > > * Stop CPUs when some extra action is required before > > * crash dump. We will need architecture dependent extra > > * works in addition to stopping other CPUs. > > */ > > crash_smp_send_stop(); > > cpus_stopped = true; > > } > > > > if (crash_kexec_post_hypervisor) { > > /* Tell hypervisor about the panic */ > > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf); > > } > > > > if (enabled_kmsg_dump) { > > /* > > * Print extra info by notifiers. > > * Prevent rumors, for example, by stopping watchdogs. > > */ > > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf); > > } > > > > /* Optional extra info */ > > panic_printk_sys_info(); > > > > /* No dumper by default */ > > kmsg_dump(); > > > > /* Used only when crash kernel loaded */ > > __crash_kexec(NULL); > > > > if (!cpus_stopped) { > > /* > > * Note smp_send_stop is the usual smp shutdown function, which > > * unfortunately means it may not be hardened to work in a > > * panic situation. > > */ > > smp_send_stop(); > > } > > > > if (!crash_kexec_post_hypervisor) { > > /* Tell hypervisor about the panic */ > > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf); > > } > > > > if (!enabled_kmsg_dump) { > > /* > > * Print extra info by notifiers. > > * Prevent rumors, for example, by stopping watchdogs. > > */ > > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf); > > } > > > > /* > > * Help to reboot a safe way. > > */ > > atomic_notifier_call_chain(&panic_reboot_notifier_list, 0, buf); > > > > [...] > > } > > > > Any opinion? > > Do the notifier list names make sense? > > > > This was exposed very clearly, thanks. I agree with you, it's a good > approach, and we can evolve that during the implementation phase, like > "function A is not good in the hypervisor list because of this and > that", so we move it to the reboot list. Also, name of the lists is not > so relevant, might evolve in the implementation phase - I personally > liked them, specially the "info" and "hypervisor" ones (reboot seems > good but not great heh). > > So, what are the opinions from kdump maintainers about this idea? > Baoquan / Vivek / Dave, does it make sense to you? Do you have any > suggestions/concerns to add on top of Petr draft? Yeah, it's reasonable. As I replied to Michael in another thread, I think splitting the current notifier list is a good idea. At least the code to archieve hyper-V's goal with panic_notifier is a little odd and should be taken out and execute w/o conditional before kdump, and maybe some others Petr has combed out. For those which will be switched on with the need of adding panic_notifier or panic_print into cmdline, the heavy users like HATAYAMA and Masa can help check. For Petr's draft code, does it mean hyper-V need another knob to trigger the needed notifiers? Will you go with the draft direclty? Hyper-V now runs panic notifiers by default, just a reminder. > > I prefer this refactor than the filter, certainly. If nobody else > working on that, I can try implementing that - it's very interesting. > The only thing I'd like to have first is an ACK from the kdump > maintainers about the general idea. > > Cheers, > > > Guilherme >