Date: Sun, 16 Nov 2025 21:22:43 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/4] hung_task: Add hung_task_sys_info sysctl to dump sys info on task-hung
To: Feng Tang
Cc: Petr Mladek, Andrew Morton, Steven Rostedt, Lance Yang,
 linux-kernel@vger.kernel.org, Jonathan Corbet, paulmck@kernel.org,
 lirongqing@baidu.com, leonylgao@tencent.com
References: <20251113111039.22701-1-feng.tang@linux.alibaba.com>
 <20251113111039.22701-3-feng.tang@linux.alibaba.com>
From: Lance Yang

On 2025/11/16 17:11, Feng Tang wrote:
> On Sun, Nov 16, 2025 at 03:58:32PM +0800, Lance Yang wrote:
>>
>>
>> On 2025/11/13 19:10, Feng Tang wrote:
>>> When task-hung happens, developers may need different kinds of system
>>> information (call-stacks, memory info, locks, etc.) to help debugging.
>>>
>>> Add 'hung_task_sys_info' sysctl knob to take human readable string like
>>> "tasks,mem,timers,locks,ftrace,...", and when task-hung happens, all
>>> requested information will be dumped. (refer kernel/sys_info.c for more
>>> details).
>>>
>>> Meanwhile, the newly introduced sys_info() call is used to unify some
>>> existing info-dumping knobs.
>>>
>>> Suggested-by: Petr Mladek
>>> Signed-off-by: Feng Tang
>>> ---
>>>  Documentation/admin-guide/sysctl/kernel.rst |  5 ++
>>>  kernel/hung_task.c                          | 62 +++++++++++++--------
>>>  2 files changed, 43 insertions(+), 24 deletions(-)
>>>  	 * Ok, the task did not get scheduled for more than 2 minutes,
>>>  	 * complain:
>>>  	 */
>>> -	if (sysctl_hung_task_warnings || hung_task_call_panic) {
>>> +	if (sysctl_hung_task_warnings) {
>>
>> It seems like the behavior changes when sysctl_hung_task_warnings is
>> 0 but a panic is about to be triggered ...
>>
>> Looking at the history:
>>
>> 1) Commit ("hung_task: ignore hung_task_warnings when hung_task_panic
>> is enabled")[1] ensured that hung task information is always dumped
>> when a panic is configured, even if the warning counter is exhausted.
>>
>> 2) Later, commit ("hung_task: panic when there are more than N hung
>> tasks at the same time")[2] refined the logic to trigger a panic based
>> on the number of hung tasks found in a single scan.
>>
>> To stay consistent with the established behavior, I think we should
>> continue to dump the information for hung tasks as long as
>> sysctl_hung_task_panic is enabled :)
>>
>> [1] https://lore.kernel.org/all/20240613033159.3446265-1-leonylgao@gmail.com
>> [2] https://lore.kernel.org/all/20251015063615.2632-1-lirongqing@baidu.com
>> [...]
>
> Aha, Petr asked similar question during his review. Thanks for the catch!
>
> How about following fixup patch to restore that part of logic?
>
> Thanks,
> Feng
>
> ---
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index 5b3a7785d3a2..d2254c91450b 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -223,8 +223,11 @@ static inline void debug_show_blocker(struct task_struct *task, unsigned long ti
>  }
>  #endif
>
> -static void check_hung_task(struct task_struct *t, unsigned long timeout)
> +static void check_hung_task(struct task_struct *t, unsigned long timeout,
> +		unsigned long prev_detect_count)
>  {
> +	unsigned long total_hung_task;
> +
>  	if (!task_is_hung(t, timeout))
>  		return;
>
> @@ -234,13 +237,19 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
>  	 */
>  	sysctl_hung_task_detect_count++;
>
> +	total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
>  	trace_sched_process_hang(t);
>
> +	if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
> +		console_verbose();
> +		hung_task_call_panic = true;
> +	}
> +
>  	/*
>  	 * Ok, the task did not get scheduled for more than 2 minutes,
>  	 * complain:
>  	 */
> -	if (sysctl_hung_task_warnings) {
> +	if (sysctl_hung_task_warnings || hung_task_call_panic) {
>  		if (sysctl_hung_task_warnings > 0)
>  			sysctl_hung_task_warnings--;
>  		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> @@ -295,7 +304,6 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  {
>  	int max_count = sysctl_hung_task_check_count;
>  	unsigned long last_break = jiffies;
> -	unsigned long total_hung_task;
>  	struct task_struct *g, *t;
>  	unsigned long prev_detect_count = sysctl_hung_task_detect_count;
>  	int need_warning = sysctl_hung_task_warnings;
> @@ -320,20 +328,14 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
>  			last_break = jiffies;
>  		}
>
> -		check_hung_task(t, timeout);
> +		check_hung_task(t, timeout, prev_detect_count);
>  	}
>   unlock:
>  	rcu_read_unlock();
>
> -	total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
> -	if (!total_hung_task)
> +	if (!(sysctl_hung_task_detect_count - prev_detect_count))
>  		return;
>
> -	if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) {
> -		console_verbose();
> -		hung_task_call_panic = true;
> -	}
> -
>  	if (need_warning || hung_task_call_panic) {
>  		si_mask |= SYS_INFO_LOCKS;

Looks good to me now! I assume v3 would be expected, can you post a new version?

Cheers,
Lance