Date: Sun, 09 Oct 2022 03:13:29 +0100
Message-ID: <87tu4dhhh2.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: Zhang Xincheng <zhangxincheng@uniontech.com>
Cc: tglx <tglx@linutronix.de>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	oleksandr <oleksandr@natalenko.name>,
	Hans de Goede <hdegoede@redhat.com>,
	bigeasy <bigeasy@linutronix.de>,
	mark.rutland <mark.rutland@arm.com>,
	michael <michael@walle.cc>
Subject: Re: [PATCH] interrupt: discover and disable very frequent interrupts
References: <20220930064042.14564-1-zhangxincheng@uniontech.com>
	<86bkqx6wrd.wl-maz@kernel.org>
	<868rm16tbu.wl-maz@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 09 Oct 2022 02:31:36 +0100,
Zhang Xincheng wrote:
> 
> > Again: what makes you think that it is better to kill the interrupt
> > than suffering a RCU stall? Yes, that's a lot of interrupts. But
> > killing it and risking the whole system isn't an acceptable outcome.
> 
> It's really not good to kill interrupts directly.

I'm glad you finally agree (202210081220.9da0a329-yujie.liu@intel.com
has a good example of a perfectly working machine that your patch
kills for no reason).

> Perhaps a better way is
> to report it and let the system administrator decide what to do with it.
> 
> +	if((desc->gap_count & 0xffff0000) == 0)
> +		desc->gap_time = get_jiffies_64();
> +
> +	desc->gap_count ++;
> +
> +	if((desc->gap_count & 0x0000ffff) >= 2000) {
> +		if((get_jiffies_64() - desc->gap_time) < HZ) {
> +			desc->gap_count += 0x00010000;
> +			desc->gap_count &= 0xffff0000;
> +		} else {
> +			desc->gap_count = 0;
> +		}
> +
> +		if((desc->gap_count >> 16) > 30) {
> +			__report_bad_irq(desc, action_ret, KERN_ERR "irq %d: triggered too frequently\n");
> +		}
> +	}
> +

I don't think this is much better. You hardcode values that only make
sense on your HW, and for nobody else. And what can the user do with
this message? Nothing at all. The message itself only contributes to
the problem.

As it is, this patch is only a nuisance. As I said before, this would
be much better as a rate-limiter, with configurable limits, and behind
a debug option.

	M.

-- 
Without deviation from the norm, progress is not possible.
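
For illustration only, the "rate-limiter with configurable limits" suggested above could look roughly like the userspace model below. This is a minimal sketch, not code from this thread or from the kernel: the struct irq_rate_limit type and the irq_rate_limit_init() and irq_rate_limit_hit() helpers are hypothetical names, and the window length and threshold are parameters rather than the hardcoded 2000 and 30 of the quoted patch. An in-kernel version would presumably take its limits from a tunable, sit behind a debug config option, and could lean on the existing <linux/ratelimit.h> helpers for the reporting side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-interrupt accounting state. Both limits are supplied
 * by the caller instead of being compiled in.
 */
struct irq_rate_limit {
	uint64_t window_ticks;   /* length of one accounting window, in ticks */
	uint32_t max_per_window; /* events tolerated per window before a report */
	uint64_t window_start;   /* tick at which the current window began */
	uint32_t count;          /* events seen in the current window */
	bool reported;           /* a report was already issued in this window */
};

static void irq_rate_limit_init(struct irq_rate_limit *rl,
				uint64_t window_ticks,
				uint32_t max_per_window)
{
	/* A zero-length window or a zero threshold would be meaningless. */
	assert(window_ticks > 0 && max_per_window > 0);

	rl->window_ticks = window_ticks;
	rl->max_per_window = max_per_window;
	rl->window_start = 0;
	rl->count = 0;
	rl->reported = false;
}

/*
 * Account for one event at time 'now' (in ticks). Returns true at most
 * once per window, when the configured threshold is first exceeded, so
 * the report itself cannot add to the flood. The event source is never
 * disabled; the caller only learns that a report is due.
 */
static bool irq_rate_limit_hit(struct irq_rate_limit *rl, uint64_t now)
{
	/* Open a fresh window once the current one has expired. */
	if (now - rl->window_start >= rl->window_ticks) {
		rl->window_start = now;
		rl->count = 0;
		rl->reported = false;
	}

	/* Saturate rather than wrap if the window is extremely busy. */
	if (rl->count < UINT32_MAX)
		rl->count++;

	if (rl->count > rl->max_per_window && !rl->reported) {
		rl->reported = true;
		return true;
	}

	return false;
}
```

With a window of 100 ticks and a threshold of 3, the fourth event inside a window triggers a single report, further events in that window stay silent, and the counter starts again once the window expires.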