From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932919Ab0BEJjX (ORCPT ); Fri, 5 Feb 2010 04:39:23 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:39788 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932748Ab0BEJjV convert rfc822-to-8bit (ORCPT ); Fri, 5 Feb 2010 04:39:21 -0500 To: Xiaotian Feng Cc: Amerigo Wang , linux-kernel@vger.kernel.org, Tejun Heo , Greg Kroah-Hartman , Peter Zijlstra , Miles Lane , Heiko Carstens , Benjamin Herrenschmidt , Larry Finger , akpm@linux-foundation.org Subject: Re: [Patch v2] sysfs: add lockdep class support to s_active References: <20100205064622.4141.72867.sendpatchset@localhost.localdomain> <7b6bb4a51002042309v3ca1dd0p60b9dbcacdafaee6@mail.gmail.com> From: ebiederm@xmission.com (Eric W. Biederman) Date: Fri, 05 Feb 2010 01:39:07 -0800 In-Reply-To: <7b6bb4a51002042309v3ca1dd0p60b9dbcacdafaee6@mail.gmail.com> (Xiaotian Feng's message of "Fri\, 5 Feb 2010 15\:09\:49 +0800") Message-ID: User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT X-XM-SPF: eid=;;;mid=;;;hst=in02.mta.xmission.com;;;ip=76.21.114.89;;;frm=ebiederm@xmission.com;;;spf=neutral X-SA-Exim-Connect-IP: 76.21.114.89 X-SA-Exim-Mail-From: ebiederm@xmission.com X-SA-Exim-Scanned: No (on in02.mta.xmission.com); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Xiaotian Feng writes: > On Fri, Feb 5, 2010 at 2:42 PM, Amerigo Wang wrote: >> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug. >> As reported by several people, it is something like: >> >> [ 6967.926563] ACPI: Preparing to enter system sleep state S3 >> [ 6967.956156] Disabling non-boot CPUs ... >> [ 6967.970401] >> [ 6967.970408] ============================================= >> [ 6967.970419] [ INFO: possible recursive locking detected ] >> [ 6967.970431] 2.6.33-rc2-git6 #27 >> [ 6967.970439] --------------------------------------------- >> [ 6967.970450] pm-suspend/22147 is trying to acquire lock: >> [ 6967.970460]  (s_active){++++.+}, at: [] >> sysfs_hash_and_remove+0x3d/0x4f >> [ 6967.970493] >> [ 6967.970497] but task is already holding lock: >> [ 6967.970506]  (s_active){++++.+}, at: [] >> sysfs_get_active_two+0x16/0x36 >> [...] >> >> Eric already provides a patch for this[1], but it still can't fix the >> problem. Based on his work and Peter's suggestion, I write this patch, >> hopefully we can fix the warning completely. >> >> This patch put sysfs s_active into two classes, one is for PM, the other >> is for the rest, so lockdep will distinguish them. > > I think this patch does not hit the root cause, we have a similiar > warning which is not related with PM. The root cause is that our locking is crazy complicated. No lockdep changes are going to fix that. What we can do and what the patch does is teach lockdep to treat some of the sysfs files as a different group (subclass) from other sysfs files. Which keeps us from overgeneralizing too much and having a better signal to noise ratio. As for the block device problem goes, I can't easily say that the block layer is correct. I expect it is because changing the scheduler is unlikely to delete block devices. If the block layer has bugs then adding another subclass as Amerigo suggests should simply make lockdep warnings harder to trigger and more accurate so that sounds like a path worth walking. In general I recommend that pieces of code that need to do a lot of work in a sysfs attribute consider using a work queue or a kernel thread, as that can be easier to analyze. Eric