From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-wm1-f54.google.com (mail-wm1-f54.google.com [209.85.128.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7BB483587AA for ; Mon, 12 Jan 2026 13:13:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.54 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1768223626; cv=none; b=O4XwQuU17s9W0AzGKuUTVt8lhzFMYIlNhiBH7nELED7Opc45fRc0GE9ILObhpwLU49MhmopHtMpka4EyRAnduE40MbJ1fpt6vXlWxbjquu+SKbEV3m94j6h/4mvEFTiO3PopyZgQcafwA4ccouV1JCR9x8onJsEODJG0NIV6E2k= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1768223626; c=relaxed/simple; bh=CDWelwusMgknfsmaq5uKBLj/vMhq8QDH6UV5q8LFDeQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=u8ErC1fZQ19fmpvknFtiPp4yVZQwh8XG0zEAFtfSG7F7cZur8DI22IBh/tmsaZolLNa6uWk7M9TFXxxuRqfvR11bdW2Pg3LjJKQfY+pk9rAPK/Lg+7yFkA0mqYU0fRwd+HSH8/j5xMHT51+l+YLTOr+lVl26gWtjW6mURS6uGks= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com; spf=pass smtp.mailfrom=suse.com; dkim=pass (2048-bit key) header.d=suse.com header.i=@suse.com header.b=QgvmT4qs; arc=none smtp.client-ip=209.85.128.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=suse.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=suse.com header.i=@suse.com header.b="QgvmT4qs" Received: by mail-wm1-f54.google.com with SMTP id 5b1f17b1804b1-47aa03d3326so50938875e9.3 for ; Mon, 12 Jan 2026 05:13:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1768223623; x=1768828423; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=9LE7R7RaOLfGOZ+LUXlKUcDeeBakcV2L6ptiuZ1LHFU=; b=QgvmT4qs63IrirVo89ZGGbYwqebT5hwlV0DQVblphrJofHQ/56B6kcJJs4l+gtBOE4 UBn7GKyuc/9Uaspje7n7rzTCoVLPPL4h8EKhitH+zxje9uYjQguLXT9Sqj2UjqyHdAe+ uc0ueOPJvC9jzE7PQ+hw20NpuDL54YFJzPwamg7+89rr4WsA+REw5eIkNJJbGfEUjamB Ou716XFs4sxoTpI80ibwSl6sevVPwgnrSHNq0aLiQRBHorz4VYzOQ0Sprliye9ZzP2+f p/nPZ61LUwbyqJn3qDvUh39t2ezplddpYMxgS+rEKQ5D/3zlcJUWZ7yrMVNgtToPRcGp l7Eg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1768223623; x=1768828423; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-gg:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=9LE7R7RaOLfGOZ+LUXlKUcDeeBakcV2L6ptiuZ1LHFU=; b=dzmuTkAGRGBYtFBG1nERYVqZoet7W9P9l9RXGS1aZ1VE3zoQTQnDPNjgWhI8pY00z4 1y2qp6YqG2kfdj6+ygPUMO8V1qrT66wyDgQsNsZ5alwV4sQxSD2IW6xb6VSiGmxyj7lb YFSdGocZdQP1xgEU9DYyMntX9rMivEFAxYosxu9O3N7bPrZrpBjeH5pApnv3+wk3h89y RGWTf+/7qDkUL5AQyWxzz4OAlAAqrl3qBZYcobHRzgxI9Wq7rKJD8dhI5houmYaiaN7x J4gnS/09QA3VsssNg7k19vHIiDUmIaDeHU0TSI/uREukf/oTdJwbdQkvfgRTdnAEr4UF gfsw== X-Forwarded-Encrypted: i=1; AJvYcCXTdjtLN8wzAZNglO0WYGNMOiNVaDR8kVof4q2pyzTvhSfRZY34xYyfmFpf9O4OMQsRShW6xLZaAVjRK6Y=@vger.kernel.org X-Gm-Message-State: AOJu0YwooHOyS6AeFdIjzxF4/ka8bAayYJaUebLiu5QWXQLSQdJnPAND U1usFfqIB+IFPGC8Tx7WujN7RfxCcTpOsUVdiFK8WuMa/1wowrzZvrAVTFJ/PW9eUyk= X-Gm-Gg: AY/fxX6iloVjQ92LfrZhANyO/MR0l4beJZGsPGkFanjj4B/sdk6ioFcb7YcVNxLe5QA aCp2bcJysLZpnK9beykbXyZdZw2fnl0+hrn3SVix57UGgBC/2ivn1UNggvUvhHYUi0qvgrOC65k TD3XzH/5cFmBCnwhf6IGBIlARrOqnYZvFDmc8joVXxcL0VjUTjraWx8NuGWpGT8R54iNfimJXXO JHS222TCZNhbENW+NV07W0bPfPvFzmbjo81tRkOgYBoLAcCUx5d3AOdj2vKJ8+58F4LjC0KbQpj JpCaB5r3FbAquUjMBPlznUxWOKs0iTjZyB0hsf0IVSK7tWpEXJgWKfRDUmJiRSSXn07XplrGsCt nDCEMQpbrEUgr7lqw4HCs4ptsHt29PiJgLMAD9sf+HNKIL/1iZp+M6YLWf6frgnhm6SFsUm6JER cXd5xErx/QEnPDHA== X-Google-Smtp-Source: AGHT+IFCNpRBnJSG3TQ7OuMy4G/iNQi9+FCwzsa7WtT/UiOAiOrGAHu/K/Q9pfidAi3+5UphFbGY2g== X-Received: by 2002:a05:600c:1e24:b0:475:de14:db1e with SMTP id 5b1f17b1804b1-47d84b36a1emr217204515e9.24.1768223622784; Mon, 12 Jan 2026 05:13:42 -0800 (PST) Received: from pathway.suse.cz ([176.114.240.130]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-47d7f418538sm354719505e9.5.2026.01.12.05.13.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 12 Jan 2026 05:13:42 -0800 (PST) Date: Mon, 12 Jan 2026 14:13:40 +0100 From: Petr Mladek To: Lance Yang Cc: Aaron Tomlin , akpm@linux-foundation.org, mhiramat@kernel.org, gregkh@linuxfoundation.org, joel.granados@kernel.org, sean@ashe.io, linux-kernel@vger.kernel.org Subject: Re: [v5 PATCH 2/2] hung_task: Enable runtime reset of hung_task_detect_count Message-ID: References: <20251231004125.2380105-1-atomlin@atomlin.com> <20251231004125.2380105-3-atomlin@atomlin.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: On Fri 2026-01-09 21:50:20, Lance Yang wrote: > > > On 2026/1/8 22:41, Petr Mladek wrote: > > On Tue 2025-12-30 19:41:25, Aaron Tomlin wrote: > > > Introduce support for writing to /proc/sys/kernel/hung_task_detect_count. > > > > > > Writing a value of zero to this file atomically resets the counter of > > > detected hung tasks. This grants system administrators the ability to > > > clear the cumulative diagnostic history after resolving an incident, > > > simplifying monitoring without requiring a system restart. > > > > > --- a/kernel/hung_task.c > > > +++ b/kernel/hung_task.c > > > @@ -346,7 +355,14 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout) > > > unlock: > > > rcu_read_unlock(); > > > - if (!(sysctl_hung_task_detect_count - prev_detect_count)) > > > + /* Ensures we see all hang details recorded during the scan. */ > > > + cur_detect_count = atomic_long_read_acquire(&sysctl_hung_task_detect_count); > > > > This value is read at the end of the scan => _release > > semantic/barrier should be here. > > Seems like _acquire is still correct here, because it is a load. > > _release semantics apply to stores, while _acquire on a load > ensures subsequent memory accesses are not reordered before it. Right! > Or smp_mb()? > > In the same thread, atomic operations on the same variable are not > reordered with respect to each other, even the _relaxed variant > preserves program order for that variable, IIRC. > > So the increment will always complete before the final read in > program order, and the read will see the updated value (unless > another CPU resets it concurrently, which is a logical race, not > a reordering issue). > > So, it would be: > > prev = atomic_long_read_acquire(&counter); // scan start > ... > cur = atomic_long_inc_return_relaxed(&counter); // during scan > ... > cur = atomic_long_read_acquire(&counter); // scan end > > The first _acquire ensures no task-checking code is reordered > before the start read, the middle increment is just atomic > without extra barriers, and the final _acquire makes sure we > observe all hang details before computing the delta. The acquire/relaxed/acquire semantic looks weird. The problem is that we do not use the counter as a lock. I thought about a sane approach and the following came to my mind: >From c28b74c35d653f527aa9017c32630ad08180fb4e Mon Sep 17 00:00:00 2001 From: Petr Mladek Date: Mon, 12 Jan 2026 14:00:52 +0100 Subject: [POC] hung_task: Update the global counter using a proper acquire/release semantic The global counter of hung tasks might get reset when the check is in progress. Also the number of hung tasks detected in the current round is important to decide whether panic() is needed or not. Handle races by: 1. Remember the total counter at the beginnning of the check. 2. Count the current round in a local variable. 3. Udpate the total counter only when the value has not been modified during the check. Note that this is only compile tested. Signed-off-by: Petr Mladek --- kernel/hung_task.c | 53 +++++++++++++++++----------------------------- 1 file changed, 20 insertions(+), 33 deletions(-) diff --git a/kernel/hung_task.c b/kernel/hung_task.c index 3bc72a4e4032..c939cd3d8a2c 100644 --- a/kernel/hung_task.c +++ b/kernel/hung_task.c @@ -246,30 +246,12 @@ static inline void hung_task_diagnostics(struct task_struct *t) pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\" disables this message.\n"); } -static void check_hung_task(struct task_struct *t, unsigned long timeout, - unsigned long prev_detect_count) +static void hung_task_info(struct task_struct *t, unsigned long timeout, + unsigned long this_round_count) { - unsigned long total_hung_task, cur_detect_count; - - if (!task_is_hung(t, timeout)) - return; - - /* - * This counter tracks the total number of tasks detected as hung - * since boot. If a reset occurred during the scan, we treat the - * current count as the new delta to avoid an underflow error. - * Ensure hang details are globally visible before the counter - * update. - */ - cur_detect_count = atomic_long_inc_return_release(&sysctl_hung_task_detect_count); - if (cur_detect_count >= prev_detect_count) - total_hung_task = cur_detect_count - prev_detect_count; - else - total_hung_task = cur_detect_count; - trace_sched_process_hang(t); - if (sysctl_hung_task_panic && total_hung_task >= sysctl_hung_task_panic) { + if (sysctl_hung_task_panic && this_round_count >= sysctl_hung_task_panic) { console_verbose(); hung_task_call_panic = true; } @@ -325,12 +307,13 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout) int max_count = sysctl_hung_task_check_count; unsigned long last_break = jiffies; struct task_struct *g, *t; - unsigned long cur_detect_count, prev_detect_count, delta; + unsigned long total_count, this_round_count; int need_warning = sysctl_hung_task_warnings; unsigned long si_mask = hung_task_si_mask; - /* Acquire prevents reordering task checks before this point. */ - prev_detect_count = atomic_long_read_acquire(&sysctl_hung_task_detect_count); + /* The counter might get reset. Remember the initial value. */ + total_count = atomic_long_read_acquire(&sysctl_hung_task_detect_count); + /* * If the system crashed already then all bets are off, * do not report extra hung tasks: @@ -339,6 +322,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout) return; + this_round_count = 0UL; rcu_read_lock(); for_each_process_thread(g, t) { @@ -350,21 +334,24 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout) last_break = jiffies; } - check_hung_task(t, timeout, prev_detect_count); + if (task_is_hung(t, timeout)) { + this_round_count++; + hung_task_info(t, timeout, this_round_count); + } } unlock: rcu_read_unlock(); - /* Ensures we see all hang details recorded during the scan. */ - cur_detect_count = atomic_long_read_acquire(&sysctl_hung_task_detect_count); - if (cur_detect_count < prev_detect_count) - delta = cur_detect_count; - else - delta = cur_detect_count - prev_detect_count; - - if (!delta) + if (!this_round_count) return; + /* + * Do not count this round when the global counter has been reset + * during this check. + */ + atomic_long_cmpxchg_release(&sysctl_hung_task_detect_count, total_count, + total_count + this_round_count); + if (need_warning || hung_task_call_panic) { si_mask |= SYS_INFO_LOCKS; -- 2.52.0 It is just a POC. Feel free to do the refactoring another way. Best Regards, Petr