From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by aws-us-west-2-korg-lkml-1.web.codeaurora.org (Postfix) with ESMTP id 5B9D5C5CFC0 for ; Fri, 15 Jun 2018 02:20:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1681B20891 for ; Fri, 15 Jun 2018 02:20:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1681B20891 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S965427AbeFOCUL (ORCPT ); Thu, 14 Jun 2018 22:20:11 -0400 Received: from mga18.intel.com ([134.134.136.126]:54854 "EHLO mga18.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965001AbeFOCUJ (ORCPT ); Thu, 14 Jun 2018 22:20:09 -0400 X-Amp-Result: UNSCANNABLE X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 14 Jun 2018 19:20:08 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,225,1526367600"; d="scan'208";a="50051074" Received: from voyager.sc.intel.com (HELO voyager) ([10.3.52.149]) by orsmga006.jf.intel.com with ESMTP; 14 Jun 2018 19:20:06 -0700 Date: Thu, 14 Jun 2018 19:16:29 -0700 From: Ricardo Neri To: Thomas Gleixner Cc: Ingo Molnar , "H. Peter Anvin" , Andi Kleen , Ashok Raj , Borislav Petkov , Tony Luck , "Ravi V. Shankar" , x86@kernel.org, sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Jacob Pan , "Rafael J. Wysocki" , Don Zickus , Nicholas Piggin , Michael Ellerman , Frederic Weisbecker , Alexei Starovoitov , Babu Moger , Mathieu Desnoyers , Masami Hiramatsu , Peter Zijlstra , Andrew Morton , Philippe Ombredanne , Colin Ian King , Byungchul Park , "Paul E. McKenney" , "Luis R. Rodriguez" , Waiman Long , Josh Poimboeuf , Randy Dunlap , Davidlohr Bueso , Christoffer Dall , Marc Zyngier , Kai-Heng Feng , Konrad Rzeszutek Wilk , David Rientjes , iommu@lists.linux-foundation.org Subject: Re: [RFC PATCH 20/23] watchdog/hardlockup/hpet: Rotate interrupt among all monitored CPUs Message-ID: <20180615021629.GD11625@voyager> References: <1528851463-21140-1-git-send-email-ricardo.neri-calderon@linux.intel.com> <1528851463-21140-21-git-send-email-ricardo.neri-calderon@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 13, 2018 at 11:48:09AM +0200, Thomas Gleixner wrote: > On Tue, 12 Jun 2018, Ricardo Neri wrote: > > + /* There are no CPUs to monitor. */ > > + if (!cpumask_weight(&hdata->monitored_mask)) > > + return NMI_HANDLED; > > + > > inspect_for_hardlockups(regs); > > > > + /* > > + * Target a new CPU. Keep trying until we find a monitored CPU. CPUs > > + * are addded and removed to this mask at cpu_up() and cpu_down(), > > + * respectively. Thus, the interrupt should be able to be moved to > > + * the next monitored CPU. > > + */ > > + spin_lock(&hld_data->lock); > > Yuck. Taking a spinlock from NMI ... I am sorry. I will look into other options for locking. Do you think rcu_lock would help in this case? I need this locking because the CPUs being monitored changes as CPUs come online and offline. > > > + for_each_cpu_wrap(cpu, &hdata->monitored_mask, smp_processor_id() + 1) { > > + if (!irq_set_affinity(hld_data->irq, cpumask_of(cpu))) > > + break; > > ... and then calling into generic interrupt code which will take even more > locks is completely broken. I will into reworking how the destination of the interrupt is set. Thanks and BR, Ricardo