Date: Wed, 19 Nov 2025 13:31:06 +0100
From: Petr Mladek
To: Lance Yang
Cc: Andrew Morton, Feng Tang, Steven Rostedt, Lance Yang,
	linux-kernel@vger.kernel.org, Jonathan Corbet, paulmck@kernel.org,
	lirongqing@baidu.com, leonylgao@tencent.com
Subject: Re: [PATCH v2 2/4] hung_task: Add hung_task_sys_info sysctl to dump sys info on task-hung
References: <20251113111039.22701-1-feng.tang@linux.alibaba.com>
	<20251113111039.22701-3-feng.tang@linux.alibaba.com>
	<20251117095352.8dfb46ec468ba5a69a829031@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 2025-11-19 01:57:36, Lance Yang wrote:
> On 2025/11/18 23:20, Petr Mladek wrote:
> > Well, the behavior is still not ideal. It would be better when
> > we printed backtraces from _all_ "hung" tasks before panicking.
> > But it prints the backtraces only when sysctl_hung_task_panic
> > limit is reached.
> >
> > I mean, for example, let's have:
> >
> >    + sysctl_hung_task_warnings = 2;
> >    + sysctl_hung_task_panic = 5;
> >    + and detect 6 hung tasks.
> >
> > The code will report 1st and 2nd hung tasks. It will skip 3rd and 4th
> > because sysctl_hung_task_warnings reached 0. It will report 5th and
> > 6th tasks because (total_hung_task >= 5).
> >
> > It is better than nothing. But it might be confusing.
>
> Right, I can see how it might be confusing.
>
> IMHO, sysctl_hung_task_warnings is a user-configured limit on verbosity.
> It makes sense that reports are suppressed after the limit is exhausted,
> except when the sysctl_hung_task_panic threshold is reached ;)
>
> > I am not sure how to fix it. A minimalist solution would be to print
> > a warning. Something like:
> >
> > 	if (sysctl_hung_task_panic > 1 &&
> > 	    (total_hung_task == sysctl_hung_task_panic) &&
> > 	    !sysctl_hung_task_warnings) {
> > 		pr_err("INFO: %d blocked tasks might have been skipped because reached hung_task_warnings limit\n",
> > 			sysctl_hung_task_panic - 1);
> >
> > Or we could print the "total_hung_task" counter somewhere, for
> > example,
> >
> > 	pr_err("INFO[%lu]: task %s:%d blocked for more than %ld seconds.\n",
> > 		total_hung_task, ...
> >
> > Or we could restart the for_each_process_thread() cycle and make sure
> > that all hung tasks will get reported.
> >
> > Or we could ignore it until anyone complains.
>
> It looks like we already inform the user when that happens. When
> sysctl_hung_task_warnings is finally decremented to zero, the code prints:
>
> ```
> if (!sysctl_hung_task_warnings)
> 	pr_info("Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings\n");
> ```
>
> Given that this explicit warning is already in place, perhaps the current
> behavior is sufficient and clear enough?

The warning might get lost, or it might appear a long time before the
critical stall, so people might miss it.

But you are right. There is a warning. And my worries are rather
theoretical. Let's keep the code simple until anyone complains.

Best Regards,
Petr