Received: from mga07.intel.com (mga07.intel.com [134.134.136.100])
        (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
        (No client certificate requested)
        by lists.ozlabs.org (Postfix) with ESMTPS id 416PMt5dBlzF1BL
        for ; Fri, 15 Jun 2018 12:20:10 +1000 (AEST)
Date: Thu, 14 Jun 2018 19:16:29 -0700
From: Ricardo Neri
To: Thomas Gleixner
Cc: Ingo Molnar, "H. Peter Anvin", Andi Kleen, Ashok Raj, Borislav Petkov,
        Tony Luck, "Ravi V. Shankar", x86@kernel.org,
        sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
        linux-kernel@vger.kernel.org, Jacob Pan, "Rafael J. Wysocki",
        Don Zickus, Nicholas Piggin, Michael Ellerman, Frederic Weisbecker,
        Alexei Starovoitov, Babu Moger, Mathieu Desnoyers, Masami Hiramatsu,
        Peter Zijlstra, Andrew Morton, Philippe Ombredanne, Colin Ian King,
        Byungchul Park, "Paul E. McKenney", "Luis R. Rodriguez", Waiman Long,
        Josh Poimboeuf, Randy Dunlap, Davidlohr Bueso, Christoffer Dall,
        Marc Zyngier, Kai-Heng Feng, Konrad Rzeszutek Wilk, David Rientjes,
        iommu@lists.linux-foundation.org
Subject: Re: [RFC PATCH 20/23] watchdog/hardlockup/hpet: Rotate interrupt among all monitored CPUs
Message-ID: <20180615021629.GD11625@voyager>
References: <1528851463-21140-1-git-send-email-ricardo.neri-calderon@linux.intel.com>
        <1528851463-21140-21-git-send-email-ricardo.neri-calderon@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
List-Id: Linux on PowerPC Developers Mail List

On Wed, Jun 13, 2018 at 11:48:09AM +0200, Thomas Gleixner wrote:
> On Tue, 12 Jun 2018, Ricardo Neri wrote:
> > +        /* There are no CPUs to monitor. */
> > +        if (!cpumask_weight(&hdata->monitored_mask))
> > +                return NMI_HANDLED;
> > +
> >          inspect_for_hardlockups(regs);
> >
> > +        /*
> > +         * Target a new CPU. Keep trying until we find a monitored CPU. CPUs
> > +         * are added and removed to this mask at cpu_up() and cpu_down(),
> > +         * respectively. Thus, the interrupt should be able to be moved to
> > +         * the next monitored CPU.
> > +         */
> > +        spin_lock(&hld_data->lock);
>
> Yuck. Taking a spinlock from NMI ...

I am sorry. I will look into other options for locking. Do you think rcu_lock
would help in this case? I need this locking because the set of CPUs being
monitored changes as CPUs come online and offline.

>
> > +        for_each_cpu_wrap(cpu, &hdata->monitored_mask, smp_processor_id() + 1) {
> > +                if (!irq_set_affinity(hld_data->irq, cpumask_of(cpu)))
> > +                        break;
>
> ... and then calling into generic interrupt code which will take even more
> locks is completely broken.

I will look into reworking how the destination of the interrupt is set.

Thanks and BR,
Ricardo
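
For readers following the quoted hunk: the loop relies on a wrap-around
search that starts at the CPU after the current one and picks the first CPU
still set in the monitored mask. A minimal userspace sketch of that search
order is below. It is not kernel code and not the patch itself:
next_monitored_cpu() and NR_CPUS_SKETCH are hypothetical names, the mask is
a plain 32-bit integer standing in for a cpumask, and the sketch does not
address the locking problem raised above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU count for this sketch; a real cpumask is sized by the kernel. */
#define NR_CPUS_SKETCH 8

/*
 * Return the first CPU whose bit is set in 'mask', searching from 'start'
 * and wrapping past the last CPU back to CPU 0. Every CPU is examined at
 * most once, so the CPU just before 'start' is checked last.
 * Returns -1 if no bit is set in the mask.
 */
static int next_monitored_cpu(uint32_t mask, int start)
{
	assert(start >= 0);

	for (int i = 0; i < NR_CPUS_SKETCH; i++) {
		int cpu = (start + i) % NR_CPUS_SKETCH;

		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	}

	return -1;
}
```

With CPUs 1, 4 and 6 in the mask (0x52), calling next_monitored_cpu(0x52, 2)
from CPU 1 selects CPU 4, and calling it with start 7 from CPU 6 wraps around
to CPU 1. If the current CPU is the only one monitored, the search returns
that same CPU.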