From: Lance Yang
Date: Thu, 16 Oct 2025 13:07:23 +0800
Subject: Re: [PATCH][v4] hung_task: Panic when there are more than N hung tasks at the same time
To: lirongqing, Andrew Morton
Cc: linux-doc@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-aspeed@lists.ozlabs.org, wireguard@lists.zx2c4.com,
 netdev@vger.kernel.org, linux-kselftest@vger.kernel.org,
 Masami Hiramatsu, Andrew Jeffery, Anshuman Khandual, Arnd Bergmann,
 David Hildenbrand, Florian Wesphal, Jakub Kacinski,
 "Jason A . Donenfeld", Joel Granados, Joel Stanley, Jonathan Corbet,
 Kees Cook, Liam Howlett, Lorenzo Stoakes, "Paul E . McKenney",
 Pawan Gupta, Petr Mladek, Phil Auld, Randy Dunlap, Russell King,
 Shuah Khan, Simon Horman, Stanislav Fomichev, Steven Rostedt,
 linux-kernel@vger.kernel.org
Message-ID: <4db3bd26-1f74-4096-84fd-f652ec9a4d27@linux.dev>
In-Reply-To: <20251015063615.2632-1-lirongqing@baidu.com>

LGTM. It works as expected, thanks!

On 2025/10/15 14:36, lirongqing wrote:
> From: Li RongQing

For the commit message, I'd suggest the following for better clarity:

```
The hung_task_panic sysctl is currently a blunt instrument: it's all or
nothing.

Panicking on a single hung task can be an overreaction to a transient glitch.
A more reliable indicator of a systemic problem is when multiple tasks
hang simultaneously.

Extend hung_task_panic to accept an integer threshold, allowing the
kernel to panic only when N hung tasks are detected in a single scan.
This provides finer control to distinguish between isolated incidents
and system-wide failures.

The accepted values are:
- 0: Don't panic (unchanged)
- 1: Panic on the first hung task (unchanged)
- N > 1: Panic after N hung tasks are detected in a single scan

The original behavior is preserved for values 0 and 1, maintaining full
backward compatibility.
```

If you agree, likely no need to resend - Andrew could pick it up
directly when applying :)

>
> Currently, when 'hung_task_panic' is enabled, the kernel panics
> immediately upon detecting the first hung task. However, some hung
> tasks are transient and allow system recovery, while persistent hangs
> should trigger a panic when accumulating beyond a threshold.
>
> Extend the 'hung_task_panic' sysctl to accept a threshold value
> specifying the number of hung tasks that must be detected before
> triggering a kernel panic. This provides finer control for environments
> where transient hangs may occur but persistent hangs should be fatal.
>
> The sysctl now accepts:
> - 0: don't panic (maintains original behavior)
> - 1: panic on first hung task (maintains original behavior)
> - N > 1: panic after N hung tasks are detected in a single scan
>
> This maintains backward compatibility while providing flexibility for
> different hang scenarios.
>
> Signed-off-by: Li RongQing
> Cc: Andrew Jeffery
> Cc: Anshuman Khandual
> Cc: Arnd Bergmann
> Cc: David Hildenbrand
> Cc: Florian Wesphal
> Cc: Jakub Kacinski
> Cc: Jason A. Donenfeld
> Cc: Joel Granados
> Cc: Joel Stanley
> Cc: Jonathan Corbet
> Cc: Kees Cook
> Cc: Lance Yang
> Cc: Liam Howlett
> Cc: Lorenzo Stoakes
> Cc: "Masami Hiramatsu (Google)"
> Cc: "Paul E . McKenney"
> Cc: Pawan Gupta
> Cc: Petr Mladek
> Cc: Phil Auld
> Cc: Randy Dunlap
> Cc: Russell King
> Cc: Shuah Khan
> Cc: Simon Horman
> Cc: Stanislav Fomichev
> Cc: Steven Rostedt
> ---

So:

Reviewed-by: Lance Yang
Tested-by: Lance Yang

Cheers,
Lance
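
Illustrative note on the threshold semantics discussed in this thread
(0, 1, N > 1): a minimal userspace sketch in C, assuming the decision is
made once per scan from the count of hung tasks found in that scan. It is
not the code from the patch; should_panic(), threshold and hung_in_scan
are hypothetical names chosen for this example.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from the patch: decide whether a single
 * scan should end in a panic. `threshold` stands in for the
 * hung_task_panic value, `hung_in_scan` for the number of hung tasks
 * detected during that scan.
 */
static bool should_panic(unsigned int threshold, unsigned int hung_in_scan)
{
	/* 0: never panic, regardless of how many tasks hang. */
	if (threshold == 0)
		return false;

	/*
	 * 1: panic on the first hung task.
	 * N > 1: panic only once N hung tasks show up in the same scan.
	 */
	return hung_in_scan >= threshold;
}
```

Under these assumptions a single comparison covers both the legacy
values and the new threshold, which is why 0 and 1 keep their old
meaning.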