From mboxrd@z Thu Jan 1 00:00:00 1970 From: Al Viro Subject: Re: [2.6.38] Deadlock between rename_lock and vfsmount_lock. Date: Fri, 18 Mar 2011 12:07:48 +0000 Message-ID: <20110318120748.GH22723@ZenIV.linux.org.uk> References: <201103160854.p2G8sR6c077737@www262.sakura.ne.jp> <201103170501.p2H51PjU052428@www262.sakura.ne.jp> <201103181959.CHC73937.SQOLJtVHOMFFOF@I-love.SAKURA.ne.jp> <20110318110603.GG22723@ZenIV.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: npiggin@kernel.dk, linux-fsdevel@vger.kernel.org To: Tetsuo Handa Return-path: Received: from zeniv.linux.org.uk ([195.92.253.2]:58911 "EHLO ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754064Ab1CRMHu (ORCPT ); Fri, 18 Mar 2011 08:07:50 -0400 Content-Disposition: inline In-Reply-To: <20110318110603.GG22723@ZenIV.linux.org.uk> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Fri, Mar 18, 2011 at 11:06:03AM +0000, Al Viro wrote: > That's incredibly ugly. I agree that the deadlock exists and needs to be > dealt with, but not that way. _Strongly_ NAKed. I'll see what I can come > up with, but that variant is not an option. Actually, why do we hold vfsmount_lock over that loop at all? We already hold namespace_sem, so ->mnt_parent is protected... FWIW, there's a thing I really don't like about that area - I would really prefer to have namespace_sem nest _inside_ i_mutex and have no fs operations whatsoever done under it. Note that we already take care (mostly) to keep actual fs shutdowns outside of it. The real obstacle is follow_down() we do under namespace_sem on several paths; otherwise we'd be able to grab i_mutex first and be done with that. Moreover, we have extra ugliness in follow_down(); I really wonder what is the only instance trying to do here. Look: we pass mounting_here == true to autofs4_d_manage(). It sees the flag, then checks if we have DCACHE_MOUNTED set and returns either -EISDIR or 0. -EISDIR leads to follow_down() returning 0 immediately. 0 leads to trying to cross the mountpoint. First of all, we could as well return 0 regardless - if DCACHE_MOUNTED is not set, follow_down() will see that and return 0 since there's nothing left to do. What's more, we have a funny situation here - we might as well replace if (managed & DCACHE_MANAGE_TRANSIT) { with if (!mounting_here && managed & DCACHE_MANAGE_TRANSIT) { in follow_down() and lose that argument of ->d_manage(). So let's do this: lock_mount(path, follow) retry: lock path->dentry->d_inode if unlikely(can't mount) unlock fail grab namespace_sem if !follow || likely(lookup_mnt() returns NULL) we are done drop namespace_sem unlock drop path replace it with result of lookup_mnt() (and its ->mnt_root) goto retry; and use that (local to fs/namespace.c) in do_add_mount()/do_move_mount()/ do_loopback() (with follow = 1) and pivot_root() (follow = 0). BTW, the lack of following in do_loopback() looks like a bug... Also in do_loopback(): take release_mounts() after unlocking everything. graft_tree() doesn't grab i_mutex - callers take it. follow_down() loses mounting_here argument and so does ->d_manage(). cant_mount() calls are all merged into one in lock_mount(). Comments?