Message-ID: <8c753996-a649-4e43-8b26-cac4780bbcd0@linux.dev>
Date: Thu, 15 Jan 2026 11:06:16 +0800
Subject: Re: [v6 PATCH 2/2] hung_task: Enable runtime reset of hung_task_detect_count
To: Aaron Tomlin
Cc: sean@ashe.io, linux-kernel@vger.kernel.org, pmladek@suse.com, gregkh@linuxfoundation.org, mhiramat@kernel.org, akpm@linux-foundation.org, joel.granados@kernel.org
References: <20260115023229.3028462-1-atomlin@atomlin.com> <20260115023229.3028462-3-atomlin@atomlin.com>
From: Lance Yang
In-Reply-To: <20260115023229.3028462-3-atomlin@atomlin.com>

On 2026/1/15 10:32, Aaron Tomlin wrote:
> Currently, the hung_task_detect_count sysctl provides a cumulative count
> of hung tasks since boot. In long-running, high-availability
> environments, this counter may lose its utility if it cannot be reset
> once an incident has been resolved. Furthermore, the previous
> implementation relied upon implicit ordering, which could not strictly
> guarantee that diagnostic metadata published by one CPU was visible to
> the panic logic on another.
>
> This patch introduces the capability to reset the detection count by
> writing "0" to the hung_task_detect_count sysctl. The proc_handler logic
> has been updated to validate this input and atomically reset the
> counter.
>
> The synchronisation of sysctl_hung_task_detect_count relies upon a
> transactional model to ensure the integrity of the detection counter
> against concurrent resets from userspace. The application of
> atomic_long_read_acquire() and atomic_long_cmpxchg_release() is correct
> and provides the following guarantees:
>
>     1. Prevention of Load-Store Reordering via Acquire Semantics By
>        utilising atomic_long_read_acquire() to snapshot the counter
>        before initiating the task traversal, we establish a strict
>        memory barrier. This prevents the compiler or hardware from
>        reordering the initial load to a point later in the scan. Without
>        this "acquire" barrier, a delayed load could potentially read a
>        "0" value resulting from a userspace reset that occurred
>        mid-scan. This would lead to the subsequent cmpxchg succeeding
>        erroneously, thereby overwriting the user's reset with stale
>        increment data.
>
>     2. Atomicity of the "Commit" Phase via Release Semantics The
>        atomic_long_cmpxchg_release() serves as the transaction's commit
>        point. The "release" barrier ensures that all diagnostic
>        recordings and task-state observations made during the scan are
>        globally visible before the counter is incremented.
>
>     3. Race Condition Resolution This pairing effectively detects any
>        "out-of-band" reset of the counter. If
>        sysctl_hung_task_detect_count is modified via the procfs
>        interface during the scan, the final cmpxchg will detect the
>        discrepancy between the current value and the "acquire" snapshot.
>        Consequently, the update will fail, ensuring that a reset command
>        from the administrator is prioritised over a scan that may have
>        been invalidated by that very reset.
>
> Signed-off-by: Aaron Tomlin
> ---
>  Documentation/admin-guide/sysctl/kernel.rst |   3 +-
>  kernel/hung_task.c                          | 109 +++++++++++++-------
>  2 files changed, 75 insertions(+), 37 deletions(-)
>
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index 239da22c4e28..68da4235225a 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -418,7 +418,8 @@ hung_task_detect_count
>  ======================
>
>  Indicates the total number of tasks that have been detected as hung since
> -the system boot.
> +the system boot or since the counter was reset. The counter is zeroed when
> +a value of 0 is written.
>
>  This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index b5ad7a755eb5..2eb9c861bdcc 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -224,24 +224,43 @@ static inline void debug_show_blocker(struct task_struct *task, unsigned long ti
>  }
>  #endif
>
> -static void check_hung_task(struct task_struct *t, unsigned long timeout,
> -			    unsigned long prev_detect_count)
> +/**
> + * hung_task_diagnostics - Print structured diagnostic info for a hung task.
> + * @t: Pointer to the detected hung task.
> + *
> + * This function consolidates the printing of core diagnostic information
> + * for a task found to be blocked.
> + */
> +static inline void hung_task_diagnostics(struct task_struct *t)
>  {
> -	unsigned long total_hung_task, cur_detect_count;
> -
> -	if (!task_is_hung(t, timeout))
> -		return;
> -
> -	/*
> -	 * This counter tracks the total number of tasks detected as hung
> -	 * since boot.
> -	 */
> -	cur_detect_count = atomic_long_inc_return_relaxed(&sysctl_hung_task_detect_count);
> -	total_hung_task = cur_detect_count - prev_detect_count;
> +	unsigned long blocked_secs = (jiffies - t->last_switch_time) / HZ;
> +
> +	pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
> +	       t->comm, t->pid, blocked_secs);
> +	pr_err(" %s %s %.*s\n",
> +	       print_tainted(), init_utsname()->release,
> +	       (int)strcspn(init_utsname()->version, " "),
> +	       init_utsname()->version);
> +	if (t->flags & PF_POSTCOREDUMP)
> +		pr_err(" Blocked by coredump.\n");
> +	pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\" disables this message.\n");
> +}

I see hung_task_diagnostics() is still in this patch. I thought we'd
concluded that[1] the refactoring wasn't really necessary for a single-use
block?

[1] https://lore.kernel.org/all/noze3vhqjbsuulvvoaw4h5yeinggpwfslrit5vsd2dllfo4ath@qgmp22hoibgn/
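
For anyone following the quoted commit message, the snapshot-then-commit
scheme it describes can be sketched roughly in userspace C11. This is only
an illustration under stated assumptions, not the code from the patch: the
names detect_count, scan_begin(), scan_commit() and reset_count() are
hypothetical, and the C11 <stdatomic.h> operations merely stand in for the
kernel's atomic_long_read_acquire() and atomic_long_cmpxchg_release().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for sysctl_hung_task_detect_count. */
static atomic_long detect_count;

/* Start of a scan: snapshot the counter with acquire ordering. */
static long scan_begin(void)
{
	return atomic_load_explicit(&detect_count, memory_order_acquire);
}

/*
 * End of a scan: publish `found` new detections only if the counter still
 * holds the snapshot value. Returns false when the counter changed in the
 * meantime (for example, a reset), in which case nothing is written.
 */
static bool scan_commit(long snapshot, long found)
{
	long expected = snapshot;

	if (!found)
		return true;	/* nothing to publish */

	return atomic_compare_exchange_strong_explicit(&detect_count,
						       &expected,
						       snapshot + found,
						       memory_order_release,
						       memory_order_relaxed);
}

/* Models the administrator writing "0" to the sysctl. */
static void reset_count(void)
{
	atomic_store_explicit(&detect_count, 0, memory_order_relaxed);
}
```

In this sketch, a scan that snapshots a non-zero value, is then interrupted
by reset_count(), and finally calls scan_commit() gets false back and
leaves the counter at zero, so the reset wins. A scan that starts after the
reset snapshots zero and commits normally.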