From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 26 Jun 2026 16:47:12 +0200
From: Petr Mladek
To: Bradley Morgan
Cc: Andrew Morton, Feng Tang, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Madhavan Srinivasan, Douglas Anderson,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	stable@vger.kernel.org
Subject: Re: [PATCH v3 4/4] panic: use sys_info_with_filter() to avoid duplicate backtraces
References: <20260625152558.7450-1-include@grrlz.net>
	<20260625152558.7450-5-include@grrlz.net>
	<85F6E30C-EB1B-4BAF-9204-5174FD066EE0@grrlz.net>
	<4CF5AE3F-D7ED-47F8-A920-61D0AA078CF9@grrlz.net>
	<688433ED-A478-43F7-9103-995398A6BF63@grrlz.net>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <688433ED-A478-43F7-9103-995398A6BF63@grrlz.net>

On Fri 2026-06-26 15:35:19, Bradley Morgan wrote:
> On June 26, 2026 3:26:11 PM GMT+01:00, Petr Mladek wrote:
> >On Fri 2026-06-26 13:32:38, Bradley Morgan wrote:
> >> On June 26, 2026 1:17:13 PM GMT+01:00, Bradley Morgan wrote:
> >> >On June 26, 2026 1:14:14 PM GMT+01:00, Petr Mladek wrote:
> >> >>On Fri 2026-06-26 12:23:50, Petr Mladek wrote:
> >> >>> On Thu 2026-06-25 15:25:58, Bradley Morgan wrote:
> >> >>> But it all becomes very hairy. We have several levels:
> >> >>>
> >> >>>   + watchdog-all_bt-specific option, e.g.
> >> >>>     sysctl_hardlockup_all_cpu_backtrace
> >> >>>
> >> >>>   + watchdog-specific si_info preferences, e.g.
> >> >>>     hardlockup_si_mask
> >> >>>
> >> >>>   + panic-specific si_info: panic_print
> >> >>>
> >> >>>   + universal fallback for any layer: kernel_si_info
> >> >>>
> >> >>> Now, we try to check all these variables back and forth to
> >> >>> trigger all backtraces or to avoid triggering them.
> >> >>> And it clearly does not work well and the code is more and more
> >> >>> hairy.
> >> >>>
> >> >>> I think about another approach. The word "waterfall" comes to my mind.
> >> >>> Instead of checking all the settings back and forth, let's process
> >> >>> each setting one by one and just remember what has been done and
> >> >>> skip this in the next level.
> >> >>>
> >> >>> All the si_info actions seem to dump a global system state.
> >> >>> So, it would make sense to remember the state in a global variable
> >> >>> even when it might be modified by more CPUs in parallel.
> >> >>>
> >> Hmm.. new idea
> >>
> >> kernel/dump_filter.c ?
> >>
> >> What this file could do is to handle a generic lockup state machine
> >> so any subsystem can log what it already dumped?
> >>
> >> I know it may bloat, but it's better than cramming fixes in.
> >
> >I am not sure what exactly you would like to achieve but it sounds
> >a bit scary ;-)
> >
> >Anyway, we should not synchronize the watchdog reports against
> >each other, definitely. They are running in non-compatible contexts
> >(task vs interrupt vs NMI). Also we should not add any locking
> >because they usually print something when the system has enough
> >troubles.
> >
> >Also I think that it is not worth preventing duplicated backtraces
> >or reports from a single CPU. IMHO, it is not a big problem
> >in practice.
> >
> >So, we are down to large reports, like backtraces from all CPUs,
> >timers, locks, ... which are handled by sys_info(). So, I think
> >that it should be enough to handle this inside the sys_info() API.
> >
> >I do not want to say that my proposal was the best solution.
> >I am sure that there are better ones.
> >But we need to consider the gain vs. complexity.
> >
> >Honestly, I am already a bit scared by the complexity which
> >the sys_info() API added. And it is hard to imagine that
> >adding another API would make it easier. But I might be wrong.
> >
> >Instead, it might make sense to integrate the conflicting
> >subsystem-specific calls under the sys_info() API.
> >I mean that, for example, watchdog_hardlockup_check() won't
> >call trigger_allbutcpu_cpu_backtrace() directly but
> >it would call it via the sys_info() API so that sys_info()
> >could keep track of it. Something like:
> >
> >void sys_info_allbutcpu_bt(int cpu)
> >{
> >	trigger_allbutcpu_cpu_backtrace(cpu);
> >	/*
> >	 * The caller likely printed backtrace of the given @cpu
> >	 * on its own. Prevent duplicate backtraces from all
> >	 * CPUs with potential next sys_info() call.
> >	 */
> >	sys_info_done(SYS_INFO_ALL_BT);
> >}
> >
> >But I am not sure if it is really easier to follow
> >than calling sys_info_done() from the watchdog code.
> >
> >Some watchdogs try to optimize the output and print backtraces
> >only from CPUs which are relevant for the given lockup.
> >We should keep the logic for selecting the set of CPUs
> >in the watchdog code. We just need to solve how to elegantly
> >make sys_info() aware of it or at least about the more massive
> >reports.
> >
> >Anyway, I would prefer to keep it simple until we see some problems
> >in practice.
> >
> >Best Regards,
> >Petr
> >
>
> I understand it's scary to make a new file in the first place.
>
> But I was a bit vague about what I wanted, and I'm sorry.
>
> So, the reason why I'd suggest a new file is that if any subsystem
> theoretically bypasses sys_info() to log a lockup, this completely misses
> the filter and duplicates the dump.
>
> My file would act as a generic lockless state machine that any
> subsystem can update regardless of how they dump logs.
>
> If you have any questions, feel absolutely free to ask!
> :)
>
> Discussion is a way to make everyone happy!

Honestly, I am more and more wondering whether you are a real person
or an AI bot.

Best Regards,
Petr
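
For readers following the thread, the "waterfall" idea quoted above (process each level in turn, remember in a global variable what has already been dumped, and skip it at the next level) can be sketched roughly as below. This is only a user-space illustration built on C11 atomics, not kernel code and not the patch under review. The names si_done_mask, si_done(), si_claim(), si_report() and the SI_* bit values are all hypothetical; real kernel code would use the existing SYS_INFO_* flags and the kernel's own atomic helpers.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical report bits, standing in for the kernel's SYS_INFO_* flags. */
#define SI_ALL_BT	(1UL << 0)
#define SI_TIMERS	(1UL << 1)
#define SI_LOCKS	(1UL << 2)

/* Global record of the reports already printed. No lock is taken. */
static atomic_ulong si_done_mask;

/* Record reports printed outside this API, e.g. by a watchdog itself. */
static void si_done(unsigned long mask)
{
	atomic_fetch_or(&si_done_mask, mask);
}

/*
 * Atomically mark @requested as done and return only the bits that
 * were not set before. If two callers race, each bit is handed to
 * exactly one of them.
 */
static unsigned long si_claim(unsigned long requested)
{
	unsigned long old = atomic_fetch_or(&si_done_mask, requested);

	return requested & ~old;
}

/* One level of the waterfall: print only what is still missing. */
static unsigned long si_report(const char *level, unsigned long requested)
{
	unsigned long todo = si_claim(requested);

	if (todo & SI_ALL_BT)
		printf("%s: backtraces from all CPUs\n", level);
	if (todo & SI_TIMERS)
		printf("%s: timers\n", level);
	if (todo & SI_LOCKS)
		printf("%s: locks\n", level);

	return todo;
}
```

In this sketch, a watchdog that already triggered the all-CPU backtraces would call si_done(SI_ALL_BT), and a later si_report("panic", SI_ALL_BT | SI_TIMERS) would then print only the timers. A single fetch-or keeps the bookkeeping usable from task, interrupt, or NMI-like contexts without locking, at the price that a bit is marked done as soon as it is claimed, even if the report itself is later cut short.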