From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1426357AbcBRKjK (ORCPT ); Thu, 18 Feb 2016 05:39:10 -0500
Received: from mga11.intel.com ([192.55.52.93]:60632 "EHLO mga11.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1425611AbcBRKjG (ORCPT ); Thu, 18 Feb 2016 05:39:06 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.22,464,1449561600"; d="scan'208";a="918368327"
Message-ID: <1455791946.9851.24.camel@linux.intel.com>
Subject: Re: [Intel-gfx] [PATCH] [RFC] kernel/cpu: Use lockref for online
	CPU reference counting
From: Joonas Lahtinen
To: Peter Zijlstra
Cc: Oleg Nesterov ,
	Intel graphics driver community testing & development ,
	Linux kernel development ,
	Ingo Molnar ,
	David Hildenbrand ,
	"Paul E. McKenney" ,
	"Gautham R. Shenoy" ,
	Chris Wilson
Date: Thu, 18 Feb 2016 12:39:06 +0200
In-Reply-To: <20160217163747.GJ6357@twins.programming.kicks-ass.net>
References: <20160215170618.GL6375@twins.programming.kicks-ass.net>
	<1455612576.4977.11.camel@linux.intel.com>
	<20160216091440.GT6357@twins.programming.kicks-ass.net>
	<1455619863.4977.29.camel@linux.intel.com>
	<20160216110732.GU6357@twins.programming.kicks-ass.net>
	<1455713251.5622.9.camel@linux.intel.com>
	<20160217142005.GD6357@twins.programming.kicks-ass.net>
	<20160217161320.GL32705@phenom.ffwll.local>
	<20160217161457.GH6357@twins.programming.kicks-ass.net>
	<20160217163351.GP32705@phenom.ffwll.local>
	<20160217163747.GJ6357@twins.programming.kicks-ass.net>
Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.18.4 (3.18.4-1.fc23)
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2016-02-17 at 17:37 +0100, Peter Zijlstra wrote:
> On Wed, Feb 17, 2016 at 05:33:51PM +0100, Daniel Vetter wrote:
> > On Wed, Feb 17, 2016 at 05:14:57PM +0100, Peter Zijlstra wrote:
> > > On Wed, Feb 17, 2016 at 05:13:21PM +0100, Daniel Vetter wrote:
> > > > And for context we're hitting this on CI in a bunch of our machines, which
> > >
> > > What's CI ?
> >
> > Continuous integration, aka our own farm of machines dedicated to running
> > i915.ko testcases. Kinda like 0day (it does pre-merge testing on the m-l
> > and also post-merge on our own little integration tree), but for just the
> > graphics team and our needs.
>
> So what patch triggered this new issue? Did cpufreq change or what?

It appeared right after enabling lockdep debugging on the continuous
integration system. So we do not have a history of it not being there.

Taking another look at my code, it could indeed end up in a
double-wait-looping scenario if suspend and initialization were
performed simultaneously (it had a couple of other bugs too, fixed in
v2). Strange thing is, I think that should have been caught by
cpuhp_lock_* lockdep tracking.

So I'll move the discussion to the linux-pm list to change the CPUfreq
code.

Thanks for the comments.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation