From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932322Ab0BHDKc (ORCPT ); Sun, 7 Feb 2010 22:10:32 -0500 Received: from mx1.redhat.com ([209.132.183.28]:15410 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932174Ab0BHDKa (ORCPT ); Sun, 7 Feb 2010 22:10:30 -0500 Message-ID: <4B6F8137.9070905@redhat.com> Date: Mon, 08 Feb 2010 11:12:55 +0800 From: Cong Wang User-Agent: Thunderbird 2.0.0.23 (X11/20091001) MIME-Version: 1.0 To: Xiaotian Feng CC: "Eric W. Biederman" , linux-kernel@vger.kernel.org, Tejun Heo , Greg Kroah-Hartman , Peter Zijlstra , Miles Lane , Heiko Carstens , Benjamin Herrenschmidt , Larry Finger , akpm@linux-foundation.org Subject: Re: [Patch v2] sysfs: add lockdep class support to s_active References: <20100205064622.4141.72867.sendpatchset@localhost.localdomain> <7b6bb4a51002042309v3ca1dd0p60b9dbcacdafaee6@mail.gmail.com> <7b6bb4a51002050200m669ac1dv8b525cc843e10d60@mail.gmail.com> In-Reply-To: <7b6bb4a51002050200m669ac1dv8b525cc843e10d60@mail.gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Xiaotian Feng wrote: > On Fri, Feb 5, 2010 at 5:39 PM, Eric W. Biederman wrote: >> Xiaotian Feng writes: >> >>> On Fri, Feb 5, 2010 at 2:42 PM, Amerigo Wang wrote: >>>> Recently we met a lockdep warning from sysfs during s2ram or cpu hotplug. >>>> As reported by several people, it is something like: >>>> >>>> [ 6967.926563] ACPI: Preparing to enter system sleep state S3 >>>> [ 6967.956156] Disabling non-boot CPUs ... >>>> [ 6967.970401] >>>> [ 6967.970408] ============================================= >>>> [ 6967.970419] [ INFO: possible recursive locking detected ] >>>> [ 6967.970431] 2.6.33-rc2-git6 #27 >>>> [ 6967.970439] --------------------------------------------- >>>> [ 6967.970450] pm-suspend/22147 is trying to acquire lock: >>>> [ 6967.970460] (s_active){++++.+}, at: [] >>>> sysfs_hash_and_remove+0x3d/0x4f >>>> [ 6967.970493] >>>> [ 6967.970497] but task is already holding lock: >>>> [ 6967.970506] (s_active){++++.+}, at: [] >>>> sysfs_get_active_two+0x16/0x36 >>>> [...] >>>> >>>> Eric already provides a patch for this[1], but it still can't fix the >>>> problem. Based on his work and Peter's suggestion, I write this patch, >>>> hopefully we can fix the warning completely. >>>> >>>> This patch put sysfs s_active into two classes, one is for PM, the other >>>> is for the rest, so lockdep will distinguish them. >>> I think this patch does not hit the root cause, we have a similiar >>> warning which is not related with PM. >> The root cause is that our locking is crazy complicated. No lockdep >> changes are going to fix that. >> >> What we can do and what the patch does is teach lockdep to treat some >> of the sysfs files as a different group (subclass) from other sysfs >> files. Which keeps us from overgeneralizing too much and having >> a better signal to noise ratio. >> >> As for the block device problem goes, I can't easily say that >> the block layer is correct. I expect it is because changing >> the scheduler is unlikely to delete block devices. If the block layer >> has bugs then adding another subclass as Amerigo suggests should simply >> make lockdep warnings harder to trigger and more accurate so that >> sounds like a path worth walking. >> >> In general I recommend that pieces of code that need to do a lot of >> work in a sysfs attribute consider using a work queue or a kernel >> thread, as that can be easier to analyze. > > PM case > store /sys/devices/system/cpu1/online > remove /sys/devices/system/cpu1/cache/ > > iosched case > store /sys/block/sdx/queue/scheduler > remove /sys/block/sdx/queue/iosched/ > > So it looks like this is from sysfs layer .... > Right, and both locks are s_active, so I think they are the same problem, but I haven't check the iosched case carefully. ;)