Date: Sun, 16 Nov 2025 15:58:32 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/4] hung_task: Add hung_task_sys_info sysctl to dump sys info on task-hung
To: Feng Tang
References: <20251113111039.22701-1-feng.tang@linux.alibaba.com> <20251113111039.22701-3-feng.tang@linux.alibaba.com>
From: Lance Yang
Cc: Petr Mladek, Andrew Morton, Steven Rostedt, Lance Yang, linux-kernel@vger.kernel.org, Jonathan Corbet, paulmck@kernel.org, lirongqing@baidu.com, leonylgao@tencent.com
In-Reply-To: <20251113111039.22701-3-feng.tang@linux.alibaba.com>

On 2025/11/13 19:10, Feng Tang wrote:
> When task-hung happens, developers may need different kinds of system
> information (call-stacks, memory info, locks, etc.) to help debugging.
>
> Add 'hung_task_sys_info' sysctl knob to take human readable string like
> "tasks,mem,timers,locks,ftrace,...", and when task-hung happens, all
> requested information will be dumped. (refer kernel/sys_info.c for more
> details).
>
> Meanwhile, the newly introduced sys_info() call is used to unify some
> existing info-dumping knobs.
>
> Suggested-by: Petr Mladek
> Signed-off-by: Feng Tang
> ---
>  Documentation/admin-guide/sysctl/kernel.rst |  5 ++
>  kernel/hung_task.c                          | 62 +++++++++++++--------
>  2 files changed, 43 insertions(+), 24 deletions(-)
>
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index a397eeccaea7..45b4408dad31 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst

[...]

> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 5ac0e66a1361..5b3a7785d3a2 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -24,6 +24,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>
> @@ -59,12 +60,17 @@ static unsigned long __read_mostly sysctl_hung_task_check_interval_secs;
>  static int __read_mostly sysctl_hung_task_warnings = 10;
>
>  static int __read_mostly did_panic;
> -static bool hung_task_show_lock;
>  static bool hung_task_call_panic;
> -static bool hung_task_show_all_bt;
>
>  static struct task_struct *watchdog_task;
>
> +/*
> + * A bitmask to control what kinds of system info to be printed when
> + * a hung task is detected, it could be task, memory, lock etc. Refer
> + * include/linux/sys_info.h for detailed bit definition.
> + */
> +static unsigned long hung_task_si_mask;
> +
>  #ifdef CONFIG_SMP
>  /*
>   * Should we dump all CPUs backtraces in a hung task event?
> @@ -217,11 +223,8 @@ static inline void debug_show_blocker(struct task_struct *task, unsigned long ti
>  }
>  #endif
>
> -static void check_hung_task(struct task_struct *t, unsigned long timeout,
> -		unsigned long prev_detect_count)
> +static void check_hung_task(struct task_struct *t, unsigned long timeout)
>  {
> -	unsigned long total_hung_task;
> -
>  	if (!task_is_hung(t, timeout))
>  		return;
>
> @@ -231,20 +234,13 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout,
>  	 */
>  	sysctl_hung_task_detect_count++;
>
> -	total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
>  	trace_sched_process_hang(t);
>
> -	if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
> -		console_verbose();
> -		hung_task_show_lock = true;
> -		hung_task_call_panic = true;
> -	}
> -
>  	/*
>  	 * Ok, the task did not get scheduled for more than 2 minutes,
>  	 * complain:
>  	 */
> -	if (sysctl_hung_task_warnings || hung_task_call_panic) {
> +	if (sysctl_hung_task_warnings) {

It seems like the behavior changes when sysctl_hung_task_warnings is 0
but a panic is about to be triggered ...

Looking at the history:

1) Commit ("hung_task: ignore hung_task_warnings when hung_task_panic is
   enabled")[1] ensured that hung task information is always dumped when
   a panic is configured, even if the warning counter is exhausted.

2) Later, commit ("hung_task: panic when there are more than N hung tasks
   at the same time")[2] refined the logic to trigger a panic based on
   the number of hung tasks found in a single scan.

To stay consistent with the established behavior, I think we should
continue to dump the information for hung tasks as long as
sysctl_hung_task_panic is enabled :)

[1] https://lore.kernel.org/all/20240613033159.3446265-1-leonylgao@gmail.com
[2] https://lore.kernel.org/all/20251015063615.2632-1-lirongqing@baidu.com

[...]

Cheers,
Lance
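
To make the concern concrete, here is a minimal userspace sketch of the two
conditions. It is not kernel code and not a proposed patch: struct
hung_task_cfg and the three functions are hypothetical names that only model
sysctl_hung_task_warnings and sysctl_hung_task_panic, and the second
predicate is one assumed reading of "dump as long as sysctl_hung_task_panic
is enabled".

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-ins for the two sysctls discussed above.
 * These are not the kernel's variables, only a model of the condition.
 */
struct hung_task_cfg {
	int warnings;        /* models sysctl_hung_task_warnings; 0 = exhausted */
	unsigned int panic;  /* models sysctl_hung_task_panic; 0 = disabled */
};

/* Condition as it reads in the quoted hunk: only the warning budget counts. */
static bool dump_in_quoted_hunk(const struct hung_task_cfg *cfg)
{
	return cfg->warnings != 0;
}

/* Assumed reading of the suggestion: also dump whenever panic is enabled. */
static bool dump_as_suggested(const struct hung_task_cfg *cfg)
{
	return cfg->warnings != 0 || cfg->panic != 0;
}

/* Walks the case raised in the review: budget exhausted, panic enabled. */
static void check_exhausted_budget_case(void)
{
	const struct hung_task_cfg cfg = { .warnings = 0, .panic = 1 };

	assert(!dump_in_quoted_hunk(&cfg));  /* report would be skipped */
	assert(dump_as_suggested(&cfg));     /* report is still printed */
}
```

Calling check_exhausted_budget_case() from a small main() exercises the one
case where the two predicates disagree. In every other combination of the two
fields they return the same result.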