From: ebiederm@xmission.com (Eric W. Biederman)
Subject: Re: [PATCH review 11/11] mnt: Honor MNT_LOCKED when detaching mounts
Date: Fri, 16 Jan 2015 12:29:39 -0600
Message-ID: <87vbk6zii4.fsf@x220.int.ebiederm.org>
References: <1420490787-14387-11-git-send-email-ebiederm@xmission.com> <20150107184334.GZ22149@ZenIV.linux.org.uk> <87h9w2gzht.fsf@x220.int.ebiederm.org> <20150107205239.GB22149@ZenIV.linux.org.uk> <87iogi8dka.fsf@x220.int.ebiederm.org> <20150108002227.GC22149@ZenIV.linux.org.uk> <20150108223212.GF22149@ZenIV.linux.org.uk> <20150109203126.GI22149@ZenIV.linux.org.uk> <87h9vzryio.fsf@x220.int.ebiederm.org> <20150110055148.GY22149@ZenIV.linux.org.uk> <20150111020030.GF22149@ZenIV.linux.org.uk>
Mime-Version: 1.0
Content-Type: text/plain
To: Al Viro
Cc: Linux Containers, linux-fsdevel@vger.kernel.org, "Serge E. Hallyn", Andy Lutomirski, Chen Hanxiao, Richard Weinberger, Andrey Vagin, Linus Torvalds
In-Reply-To: <20150111020030.GF22149@ZenIV.linux.org.uk> (Al Viro's message of "Sun, 11 Jan 2015 02:00:30 +0000")

Al Viro writes:

> On Sat, Jan 10, 2015 at 05:51:48AM +0000, Al Viro wrote:
>> On Fri, Jan 09, 2015 at 11:32:47PM -0600, Eric W. Biederman wrote:
>>
>> > I don't believe rcu anything in this function itself buys you anything,
>> > but structuring this primitive so that it can be called from an rcu list
>> > traversal seems interesting.
>>
>> ???
>>
>> Without RCU, what would prevent it being freed right under us?
>>
>> The whole point is to avoid pinning it down - as it is, we can have
>> several processes call ->kill() on the same object.
>> The first one
>> would end up doing cleanup, the rest would wait *without* *affecting*
>> *fs_pin* *lifetime*.
>>
>> Note that I'm using autoremove there for wait.func(), then in the wait
>> loop I check (without locks) wait.task_list being empty. It is racy;
>> deliberately so. All I really care about in there is checking that
>> wait.func has not been called until after rcu_read_lock(). If that is
>> true, we know that p->wait hadn't been woken until that point, i.e.
>> p hadn't reached rcu delay on the way to being freed until after our
>> rcu_read_lock(). Ergo, it can't get freed until we do rcu_read_unlock()
>> and we can safely take p->wait.lock.
>>
>> RCU is very much relevant there.
>
> FWIW, I've just pushed a completely untested tree in #experimental-fs_pin;
> it definitely will be reordered, etc., probably with quite a few of the
> patches from the beginning of your series mixed in, but the current tree
> in there should show at least what I'm aiming at.

I have merged the work you have been doing with what I have been doing,
and posted the result to the #for-testing branch of my user-namespace.git
tree.

And yes, I managed to make the core of the pin primitive not care about
rcu, and I think I will need that property to clean up some of the
weirdness that I still see with using fs_pin.

pin_insert does not wind up being a clean primitive: adding to both lists
at the same time does not yield particularly clean or obvious locking
rules, nor a clean locking implementation. Still, the code works and is a
good starting point for further discussion and thinking.

I am posting the code while I go off to see if I can spot better ways to
clean some of these things up.

Eric
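[For readers following the thread: the waiter-side pattern Al describes above can be sketched roughly as below. This is an illustrative kernel-style sketch only, not the actual fs_pin code; the names p, p->wait, and wait are taken from the quoted discussion.]

```c
/*
 * Sketch of the lockless "has wait.func run yet?" check Al describes.
 * The wait entry uses an autoremove wake function, so once the waiter
 * has been woken, wait.task_list is empty.
 */
rcu_read_lock();
/*
 * Deliberately racy, lockless check.  All that matters is whether
 * wait.func had already been called by the time rcu_read_lock()
 * above was taken.
 */
if (list_empty(&wait.task_list)) {
	/* We were woken; cleanup is done and p may be on its way
	 * to being freed.  Do not touch p. */
	rcu_read_unlock();
	return;
}
/*
 * wait.func had not been called when we took rcu_read_lock(), so p
 * had not yet reached its RCU delay on the way to being freed.  It
 * therefore cannot be freed before rcu_read_unlock(), and it is
 * safe to take p->wait.lock and recheck state under the lock.
 */
spin_lock_irq(&p->wait.lock);
/* ... recheck, possibly go back to sleep ... */
spin_unlock_irq(&p->wait.lock);
rcu_read_unlock();
```

The key point of the sketch is that the RCU read-side critical section, not a reference count, is what keeps p alive across the locked recheck.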