From: ebiederm@xmission.com (Eric W. Biederman)
To: James Bottomley
Cc: Tejun Heo, Andrew Morton, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Cornelia Huck, linux-fsdevel@vger.kernel.org, Kay Sievers, Greg KH, "Eric W. Biederman"
Subject: Re: [PATCH 04/24] sysfs: Normalize removing sysfs directories.
Date: Sat, 30 May 2009 14:20:35 -0700

James Bottomley writes:

>> >> My take is simply that a correct user has to wait until no one else
>> >> can find the kobject before calling kobject_del. At which point
>> >> races are impossible, and it doesn't matter if sysfs_mutex is held
>> >> across the entire operation.
>> >
>> > I'm afraid this one isn't a valid assumption. If you look in SCSI,
>> > you'll see we do get objects after they've been removed from visibility.
>> > We use it as part of the state model for how our objects work (objects
>> > removed from visibility are dying, but we still need them to be findable
>> > (and gettable).
>>
>> I was not precise enough. It appears I overlooked the fact that
>> kobject_del is not always called from kobject_put by way of
>> kobject_release.
>
> OK ... just so you understand, I'm thinking about the device model
> rather than kobjects. device_del() can't be called from release methods
> because they're often called from interrupt context and the mutex
> requirements in device_del() mean it needs user context.

Makes sense.

>> Strictly the requirement is that after kobject_del we don't add,
>> remove or otherwise manipulate sysfs attributes.
>> That is we don't
>> call any of:
>>
>>     sysfs_add_file
>>     sysfs_create_file
>>     sysfs_create_bin_file
>>     sysfs_remove_file
>>     sysfs_remove_bin_file
>>     sysfs_create_link
>>     sysfs_remove_link
>>     sysfs_create_group
>>     sysfs_remove_group
>>     sysfs_create_subdir
>>     sysfs_remove_subdir
>>
>> Those all either oops or BUG today if you try it. So I can't see how
>> a subsystem could depend on those working.
>
> It doesn't; you've altered your requirement. We can fully buy into this
> new relaxed one.

My apologies for misstating it earlier. Sometimes translating what is
happening in sysfs up to the device model can be a bit of a challenge.

At the sysfs layer the requirement is the same as ever: don't mess
with a directory while or after you delete it.

To recap, the change that Tejun has a problem with is simply that I
have refactored sysfs_remove_dir so that, if there are directory
entries present, a very fast observer in the kernel or in user space
can see each directory entry being deleted individually before I
delete the directory itself. This is because I now drop and reacquire
the sysfs_mutex between each delete.

As the upper layers must already avoid messing with the attributes of
a sysfs directory from the time we call kobject_del, I don't see that
this makes any difference to them.

>> Also there is sysfs_remove_dir (on a subdirectory) aka kobject_del on
>> a child object after kobject_del on the parent object.
>>
>> As best I can tell that only works by fluke today.
>
> Yes, that's an artifact of the fact that the reference counted lifecycle
> is on release ... del just happens at a certain point in it. We don't
> hold any counters that tell us what the visibility of our children are,
> so it's possible to make a parent invisible by calling del simply
> because you don't know.

Strictly speaking, my changes don't affect this part either, except to
issue a warning that something unexpected is going on.

Eric
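
For illustration only, the drop-and-reacquire pattern described above can be sketched in user-space C. This is not the sysfs code: struct toy_dir, struct toy_entry and the toy_* functions are hypothetical stand-ins, a pthread mutex plays the role of sysfs_mutex, and a late add simply returns -1 where the real functions would oops or BUG.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real sysfs structures. */
struct toy_entry {
	char name[32];
	struct toy_entry *next;
};

struct toy_dir {
	pthread_mutex_t lock;		/* plays the role of sysfs_mutex */
	struct toy_entry *children;	/* singly linked list of entries */
	int removed;			/* set once the directory itself is gone */
};

void toy_dir_init(struct toy_dir *dir)
{
	pthread_mutex_init(&dir->lock, NULL);
	dir->children = NULL;
	dir->removed = 0;
}

/* Add an entry; returns 0 on success, -1 if the directory is already
 * removed or allocation fails. */
int toy_add_entry(struct toy_dir *dir, const char *name)
{
	struct toy_entry *e = malloc(sizeof(*e));
	int ret = -1;

	if (!e)
		return -1;
	strncpy(e->name, name, sizeof(e->name) - 1);
	e->name[sizeof(e->name) - 1] = '\0';

	pthread_mutex_lock(&dir->lock);
	if (!dir->removed) {
		e->next = dir->children;
		dir->children = e;
		ret = 0;
	}
	pthread_mutex_unlock(&dir->lock);

	if (ret)
		free(e);
	return ret;
}

/* What an observer sees at one instant: the number of entries left. */
int toy_count_entries(struct toy_dir *dir)
{
	struct toy_entry *e;
	int n = 0;

	pthread_mutex_lock(&dir->lock);
	for (e = dir->children; e; e = e->next)
		n++;
	pthread_mutex_unlock(&dir->lock);
	return n;
}

/* Delete entries one at a time, dropping the lock between deletes, then
 * mark the directory removed. Returns the number of entries deleted. */
int toy_remove_dir(struct toy_dir *dir)
{
	int deleted = 0;

	for (;;) {
		struct toy_entry *victim;

		pthread_mutex_lock(&dir->lock);
		victim = dir->children;
		if (victim)
			dir->children = victim->next;
		else
			dir->removed = 1;	/* empty: delete the directory itself */
		pthread_mutex_unlock(&dir->lock);
		/* Lock dropped: a fast observer can see the partial state here. */

		if (!victim)
			break;
		free(victim);
		deleted++;
	}
	return deleted;
}
```

A caller would run toy_dir_init(), add a few entries with toy_add_entry(), and then call toy_remove_dir(). A concurrent thread calling toy_count_entries() during the removal may see any count between the starting number and zero, which is the only externally visible difference from holding the lock across the whole removal.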