* Re: Linux 3.0-rc6 .. [not found] <CA+55aFxQw4T07hxz8JSm12x3FOH_Dcf=G5mvLrxiTuLxjbw+Mg@mail.gmail.com> @ 2011-07-11 6:03 ` Al Viro 2011-07-12 23:48 ` ->d_lock FUBAR (was Re: Linux 3.0-rc6) Al Viro 0 siblings, 1 reply; 12+ messages in thread From: Al Viro @ 2011-07-11 6:03 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-fsdevel On Mon, Jul 04, 2011 at 04:09:29PM -0700, Linus Torvalds wrote: > Other than the isci driver, the rest really is just lots of random > small stuff. It's getting to the point where I'm thinking I should > just release 3.0, because it's been pretty quiet, and the fixes > haven't been earth-shakingly exciting. Some drm (radeon and intel) > fixes might be noticeable to more people, the rest would tend to be > pretty esoteric. Sigh... Looks like we have serious problems around ->d_parent handling. First of all, __d_unalias() is fscked - calling d_ancestor() is not going to do us any good before we made sure that tree topology won't change right under us. Used to be protected by dcache_lock, but not anymore. Moreover, there's a similar problem with __d_materialise_dentry() side of things; there we don't check for loop creation at all and with NFS we just might try to attach a root of disconnected subtree *inside* that subtree. No check and no locking either... Another piece of PITA - cifs_get_root() will cheerfully call d_materialise_unique() on dentry it got from d_lookup() if it happens to be negative to start with *and* directory had been created on server in the meanwhile. BUG_ON() triggered in d_materialise_unique()... The lack of i_mutex on parent also doesn't help. cifs_get_root() mess is from this cycle, BTW. btrfs get_default_root() doesn't grab i_mutex either. It should, since it calls d_splice_alias(). I'm really not fond of the code in dentry_lock_for_move(); we _might_ manage to avoid deadlocks since i_mutex serialization might prevent contention on d_lock, but I'm still not convinced, especially due to missing i_mutex in several callers... I mean, this * If there is an ancestor relationship: * dentry->d_parent->...->d_parent->d_lock * ... * dentry->d_parent->d_lock * dentry->d_lock * * If no ancestor relationship: * if (dentry1 < dentry2) * dentry1->d_lock * dentry2->d_lock is no good: suppose A is ancestor of B and C is unrelated to either. With B sitting at lower address than C and A at higher one. We have A before B, since it's an ancestor; C before A since they are unrelated and addresses compare that way; B before C (ditto). Loops in lock ordering are generally bad; we _might_ get away with that in this case since we serialize d_move() callers to hell and back, but... Al, going through ->d_parent code review and not happy about the state of that animal... ^ permalink raw reply [flat|nested] 12+ messages in thread
* ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-11 6:03 ` Linux 3.0-rc6 Al Viro @ 2011-07-12 23:48 ` Al Viro 2011-07-13 0:04 ` Linus Torvalds 0 siblings, 1 reply; 12+ messages in thread From: Al Viro @ 2011-07-12 23:48 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-fsdevel, npiggin On Mon, Jul 11, 2011 at 07:03:15AM +0100, Al Viro wrote: > * If there is an ancestor relationship: > * dentry->d_parent->...->d_parent->d_lock > * ... > * dentry->d_parent->d_lock > * dentry->d_lock > * > * If no ancestor relationship: > * if (dentry1 < dentry2) > * dentry1->d_lock > * dentry2->d_lock > > is no good: suppose A is ancestor of B and C is unrelated to either. > With B sitting at lower address than C and A at higher one. We have > A before B, since it's an ancestor; C before A since they are unrelated > and addresses compare that way; B before C (ditto). Loops in lock > ordering are generally bad; we _might_ get away with that in this case > since we serialize d_move() callers to hell and back, but... But it's not enough. Look: getting from rcu pathwalk to normal one involves * grabbing d_lock on parent * grabbing d_lock on child * checking that child hadn't been moved elsewhere in the meanwhile All flakiness of the locking "order" aside, here we simply lock two dentries that might be nowhere near each other by now. Hell, by that point the parent might've been moved under (what used to be) child. Or it might have address greater than that of child and be not an ancestor anymore. Note that no i_mutex, etc. is held at that point, so there's no external serialization to save our arses... ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-12 23:48 ` ->d_lock FUBAR (was Re: Linux 3.0-rc6) Al Viro @ 2011-07-13 0:04 ` Linus Torvalds 2011-07-13 0:56 ` Al Viro 0 siblings, 1 reply; 12+ messages in thread From: Linus Torvalds @ 2011-07-13 0:04 UTC (permalink / raw) To: Al Viro; +Cc: linux-fsdevel, npiggin On Tue, Jul 12, 2011 at 4:48 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > All flakiness of the locking "order" aside, here we simply lock two dentries > that might be nowhere near each other by now. Hell, by that point the parent > might've been moved under (what used to be) child. Or it might have address > greater than that of child and be not an ancestor anymore. Note that no > i_mutex, etc. is held at that point, so there's no external serialization > to save our arses... So why wouldn't it be sufficient to just take the s_vfs_rename_mutex in the caller? We're only talking d_materialise_unique(), no? That said, looking at NFS (nfs_prime_dcache), we do seem to hold the parent directory inode semaphore. So I'm not seeing any renames happening wrt the entry we found and the parent. So the relationship there should be safe regardless, no? I didn't check the other users of d_materialise_unique() (ciph and ceph) but at least the cifs use is in "readdir.c", which implies similar directory inode mutex issues. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-13 0:04 ` Linus Torvalds @ 2011-07-13 0:56 ` Al Viro 2011-07-13 1:39 ` Al Viro 2011-07-13 1:48 ` Linus Torvalds 0 siblings, 2 replies; 12+ messages in thread From: Al Viro @ 2011-07-13 0:56 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-fsdevel, npiggin On Tue, Jul 12, 2011 at 05:04:55PM -0700, Linus Torvalds wrote: > > All flakiness of the locking "order" aside, here we simply lock two dentries > > that might be nowhere near each other by now. ?Hell, by that point the parent > > might've been moved under (what used to be) child. ?Or it might have address > > greater than that of child and be not an ancestor anymore. ?Note that no > > i_mutex, etc. is held at that point, so there's no external serialization > > to save our arses... > > So why wouldn't it be sufficient to just take the s_vfs_rename_mutex > in the caller? We're only talking d_materialise_unique(), no? Alas, no. d_materialize_unique() aside (we have a couple of bugs in there), ->d_lock handling is fscked up. Basically, it's unlazy_walk() taking ->d_lock instances without anything to guarantee that it's doing that in order consistent with d_move() (and any other users, for that matter). *And* the "locking order" in question is not transitive, but that's a separate story. And no, we obviously are not going to serialize the switch from RCU to normal pathwalk on a per-fs mutex... I'm crawling through the d_lock users right now (>400 places ;-/), trying to put together reasonable locking rules. FWIW, we used to have this for d_move() et.al.: * all places changing ->d_parent hold i_mutex on parent(s) * if parents differ, we have ->s_vfs_rename_mutex as well * if old_dentry was not attached to the tree, it was positive and a subdirectory of new_dentry->d_parent. Said new_dentry->d_parent was been locked by caller. Unfortunately, current tree has these rules violated in several places. And these rules have nothing to say about ->d_lock ;-/ Nick, could you please describe the locking rules you had in mind for ->d_lock? unlazy_walk() (aka nameidata_dentry_drop_rcu()) can probably be dealt with by checking d_seq twice, once before locking the child. Then we could be sure that it's still a child of parent and will stay so as long as parent's ->d_lock is held, and thus the ordering would stay stable... ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-13 0:56 ` Al Viro @ 2011-07-13 1:39 ` Al Viro 2011-07-13 2:40 ` Linus Torvalds 2011-07-13 1:48 ` Linus Torvalds 1 sibling, 1 reply; 12+ messages in thread From: Al Viro @ 2011-07-13 1:39 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-fsdevel, npiggin On Wed, Jul 13, 2011 at 01:56:34AM +0100, Al Viro wrote: > Nick, could you please describe the locking rules you had in mind for > ->d_lock? unlazy_walk() (aka nameidata_dentry_drop_rcu()) can probably > be dealt with by checking d_seq twice, once before locking the child. > Then we could be sure that it's still a child of parent and will stay > so as long as parent's ->d_lock is held, and thus the ordering would > stay stable... As the matter of fact, can we ever get there with IS_ROOT(dentry)? AFAICS, that should be impossible - dentry->d_seq would have to be changed by whatever had torn it off the tree and we would have buggered off on __d_rcu_to_refcount() failing... AFAICS, the only way to get there would be with mountpoint crossing returning a symlink with symlink already killed by rename() somehow (call in walk_component()). The first part should be impossible - symlinks can't be mounted/bound on anything (and if it would be possible, we'd trigger that BUG_ON() if symlink was still alive, anyway). So here's what I want to do to unlazy_walk(); it'll almost certainly leave other problems with ->d_lock, but at least it'll take care of that one: Make sure that child is still a child of parent before nested locking of child->d_lock in unlazy_walk(); otherwise we are risking a violation of locking order and deadlocks. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> --- diff --git a/fs/namei.c b/fs/namei.c index 0223c41..5c867dd 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -433,6 +433,8 @@ static int unlazy_walk(struct nameidata *nd, struct dentry *dentry) goto err_parent; BUG_ON(nd->inode != parent->d_inode); } else { + if (dentry->d_parent != parent) + goto err_parent; spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED); if (!__d_rcu_to_refcount(dentry, nd->seq)) goto err_child; ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-13 1:39 ` Al Viro @ 2011-07-13 2:40 ` Linus Torvalds 2011-07-13 2:59 ` Al Viro 0 siblings, 1 reply; 12+ messages in thread From: Linus Torvalds @ 2011-07-13 2:40 UTC (permalink / raw) To: Al Viro; +Cc: linux-fsdevel, npiggin On Tue, Jul 12, 2011 at 6:39 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > So here's what I want to do to unlazy_walk(); it'll almost certainly > leave other problems with ->d_lock, but at least it'll take care of > that one: That patch seems pretty clearly safe. If the parent has changed, we'd want to exit anyway: as you point out the __d_rcu_to_refcount() is there to catch that case. So exiting early and thus making that direct parenthood requirement explicit in that d_lock case seems to be a good thing regardless. It's dentry_lock_for_move() that makes me really nervous. Not only does it lock up to four dentries, but it mixes the whole parenthood vs pointer ordering. Or course, it does have those BUG_ON() checks, so it should never cause any circular dependencies, but still.. The actual main protection to get lookups correct in the presence of concurrent moves largely depends on the sequence numbers (ie d_lookup() retrying if it hits a rename), which is why I also find it unlikely that we really should need to hold all those d_lock cases all at the same time. So does d_move() really need to get all the locks at the same time and then do all the operations inside that "super-locked" region? Or could we get the locks in sequence and do individual parts of the rename operations under individual locks? Are there any other d_lock cases that depend on the pointer ordering? Most everything else seems to be about direct parenthood, no? Linus ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-13 2:40 ` Linus Torvalds @ 2011-07-13 2:59 ` Al Viro 2011-07-13 3:22 ` Al Viro 0 siblings, 1 reply; 12+ messages in thread From: Al Viro @ 2011-07-13 2:59 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-fsdevel, npiggin On Tue, Jul 12, 2011 at 07:40:01PM -0700, Linus Torvalds wrote: > It's dentry_lock_for_move() that makes me really nervous. Not only > does it lock up to four dentries, but it mixes the whole parenthood vs > pointer ordering. Or course, it does have those BUG_ON() checks, so > it should never cause any circular dependencies, but still.. Me too, obviously. > The actual main protection to get lookups correct in the presence of > concurrent moves largely depends on the sequence numbers (ie > d_lookup() retrying if it hits a rename), which is why I also find it > unlikely that we really should need to hold all those d_lock cases all > at the same time. > > So does d_move() really need to get all the locks at the same time and > then do all the operations inside that "super-locked" region? Or could > we get the locks in sequence and do individual parts of the rename > operations under individual locks? > > Are there any other d_lock cases that depend on the pointer ordering? > Most everything else seems to be about direct parenthood, no? It's not that easy. We want ->d_lock on parents - not only because there's code iterating through the list of children, but because ordering on direct parenthood bloody depends on children not moving out while we hold ->d_lock on their parent. Otherwise we are looking for nightmares in shrink_dcache_parent() et.al. I'm not sure how much do we care about stability of x->d_parent when x->d_lock is held. ->d_compare() is the most obvious potential area of trouble in that respect, but there might be more. I'm still not finished reviewing ->d_lock uses; about a couple of hundreds is left to wade through. I would really, *REALLY* appreciate explicitly defined locking rules from Nick (it's his changes, mostly). As in "this, this and that field is protected by ->d_lock on..." Note that ->d_parent is stable when i_mutex is held on parent, which makes most of the users of ->d_parent safe and fine (->lookup(), etc. are all called with directory locked). I've not finished reviewing ->d_parent users either, but IMO ->d_lock review is more important, so it got bumped in front of queue... Back to GrepTFS and stripping the paint off the walls... ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-13 2:59 ` Al Viro @ 2011-07-13 3:22 ` Al Viro 2011-07-13 3:56 ` Linus Torvalds 0 siblings, 1 reply; 12+ messages in thread From: Al Viro @ 2011-07-13 3:22 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-fsdevel, npiggin On Wed, Jul 13, 2011 at 03:59:18AM +0100, Al Viro wrote: > I'm not sure how much do we care about stability of x->d_parent when > x->d_lock is held. ->d_compare() is the most obvious potential area > of trouble in that respect, but there might be more. Oh, and another fun area is per-chain locks, of course ;-/ Look at __d_drop(); reengineering the callers of d_drop() to make sure that fscker's parent stays stable would be very painful and turning that into loop-based variant is going to be interesting. Doable, but not fun... ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-13 3:22 ` Al Viro @ 2011-07-13 3:56 ` Linus Torvalds 2011-07-14 7:21 ` Al Viro 0 siblings, 1 reply; 12+ messages in thread From: Linus Torvalds @ 2011-07-13 3:56 UTC (permalink / raw) To: Al Viro; +Cc: linux-fsdevel, npiggin On Tue, Jul 12, 2011 at 8:22 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > Oh, and another fun area is per-chain locks, of course ;-/ Look at > __d_drop(); reengineering the callers of d_drop() to make sure that > fscker's parent stays stable would be very painful and turning that > into loop-based variant is going to be interesting. Doable, but not > fun... So I'm almost convinced. The reason I say "almost" is that I wonder if we couldn't just turn the d_move() into basically a write_seqlock(rename_lock); write_seqcount_begin(dentry); write_seqcount_begin(target); unhash(dentry); /* Takes and releases dentry/dentry-parent locks */ unhash(target); /* Takes and releases target/target-parent locks */ tname = target->name; dname = dentry->name; rehash(dentry, tname); /* takes/releases locks as above.. */ rehash(target, dname); /* takes/releases locks as above.. */ write_seqcount_end(target); write_seqcount_end(dentry); write_sequnlock(rename_lock); where the seqlock and sequence counts mean that any lookups (RCU or not) while the thing was unhashed get automatically re-done, so the fact that it's not at "atomic" move in between would not important. See what I'm saying? Sure, we want to hold both the dentry and parent locks when messing with dentry->d_parent: but does that really mean that we want to hold all FOUR locks at the same time, and in particular hold those *independent* locks (which is why we do that pointer value comparison, to give ordering *outside* of the parenthood issue)? IOW, maybe we could just hold and release them pairwise, and at any one point in time we'd only hold one pair of dentry/parent d_lock? No dentry-pointer-based ordering required - but we'd still protect each d_parent pointer _individually_. But I didn't really look closely at the code. The above suggestion is purely based on my high-level gut feel, and it's entirely possible that there's some particular practical reason why it's a totally broken idea, and I'm just a complete moron. I'm realizing, for example, that maybe we can't even do the dentry seqcount without holding d_lock? It would take the locks a few more times, but it would avoid the nasty lock ordering issues exactly because it drops them in between rather than try to hold them all at once. And d_move() is *not* a performance critical area, afaik, so the point would be to make the locking more straightforward and avoid the horrible subtlety. Crazy? Stupid? What am I missing? It's probably something obvious. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-13 3:56 ` Linus Torvalds @ 2011-07-14 7:21 ` Al Viro 2011-07-15 4:58 ` Al Viro 0 siblings, 1 reply; 12+ messages in thread From: Al Viro @ 2011-07-14 7:21 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-fsdevel, npiggin On Tue, Jul 12, 2011 at 08:56:35PM -0700, Linus Torvalds wrote: > It would take the locks a few more times, but it would avoid the nasty > lock ordering issues exactly because it drops them in between rather > than try to hold them all at once. And d_move() is *not* a performance > critical area, afaik, so the point would be to make the locking more > straightforward and avoid the horrible subtlety. > > Crazy? Stupid? What am I missing? It's probably something obvious. It's loops over d_subdirs of given dentry. IOW, dentry can be found not just via the hash chains and those iterators (ones outside of fs/dcache.c) don't play with rename seqlock. They just rely on ->d_lock on dentry being enough to stabilize the list of children. It's not all that horrible, since i_mutex on parent (held in quite a few of those) is enough to prevent d_move() altogether. Or iterator sits on fs that never does d_move/d_materialise_unique at all. There are exceptions. The most generic one is __fsnotify_update_child_dentry_flags(), but that's not all. E.g. coda ->d_revalidate(), possibly some ceph code... Overall, it might be doable, but it'll take some massage in strange places. Hell knows; I think at that point in release cycle we don't want to go there. After looking through ->d_lock users I'm mostly convinced that unlazy_walk() fix + making damn sure we don't try to create loops in d_materialise_unique() + adding missing ->i_mutex where needed is the way to go for now. We have several places where ->d_lock overlap is not parent-to-child, but these really can't lead to deadlocks - fs is specialized enough to prevent them. However, looking through these loops over ->d_subdirs shows *another* pile of bugs, like this one: mutex_lock(&bus->d_inode->i_mutex); list_for_each_entry(dev, &bus->d_subdirs, d_u.d_child) if (dev->d_inode) update_dev(dev); mutex_unlock(&bus->d_inode->i_mutex); in drivers/usb/core/inode.c:update_bus(). See the problem? Sure, it's ramfs-like: all dentries are pinned down until unlink/rmdir, which is blocked by i_mutex on parent. And yes, d_move() is also not a problem (ditto). However, if you open a file and unlink it, dentry will be unpinned, but stay around. It will be unhashed, but it'll be on ->d_subdir. Close the file and dentry will be killed, without i_mutex on parent. Note that ->d_locked will be taken - both on victim and parent. But the quoted sucker takes no ->d_lock at all. Bummer... IOW, iterators relying _purely_ on i_mutex on parent are currently fucked. We have more such cases. Another one is debugfs_remove_recursive(). I wonder if we simply ought to evict the sucker from d_subdirs/d_child in simple_unlink()... It would require verifying that users of that animal are OK with that, but I suspect that most of them will be happy. It might simplify life for readdir() and llseek() on such directories, while we are at it... The "hunt for dentry leaks" logics on umount might become slightly less useful for those filesystems (we'll get "parent is misteriously busy" instead of "child is unlinked, but somehow busy on fs shutdown"), but I doubt it's that serious. Hell knows - I never had to hunt dentry leaks on debugfs/usbfs/etc. BTW, usbfs tree iterators should've been killed a long time ago - we simply need ->permission() and ->getattr() added for usbfs to DTRT instead of this "iterate over all inodes, patch new i_uid/i_gid/i_mode when remount asks to do that" crap. BTW^2: in clk_debugfs_register_one() in assorted arch/arm/mach-*/clock.c d = c->dent; list_for_each_entry_safe(child, child_tmp, &d->d_subdirs, d_u.d_child) debugfs_remove(child); debugfs_remove(c->dent); should be debugfs_remove_recursive(). Here we don't even bother with i_mutex... BTW^3: sel_remove_classes() plays very unsafe games - no ->i_mutex at all, no ->d_lock until we get leaf directories. Root-only, but... ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-14 7:21 ` Al Viro @ 2011-07-15 4:58 ` Al Viro 0 siblings, 0 replies; 12+ messages in thread From: Al Viro @ 2011-07-15 4:58 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-fsdevel, npiggin On Thu, Jul 14, 2011 at 08:21:20AM +0100, Al Viro wrote: > Overall, it might be doable, but it'll take some massage in strange places. > Hell knows; I think at that point in release cycle we don't want to go > there. After looking through ->d_lock users I'm mostly convinced that > unlazy_walk() fix + making damn sure we don't try to create loops in > d_materialise_unique() + adding missing ->i_mutex where needed is the > way to go for now. We have several places where ->d_lock overlap is > not parent-to-child, but these really can't lead to deadlocks - fs is > specialized enough to prevent them. OK, hopefully that should take care of the ->d_lock for now; missing ->i_mutex is *not* dealt with yet, since I'm yet to finish ->d_parent code review. dentry_lock_for_move() is serialized by rename_lock now (call in __dentry_materialise_unique() used to be done without it; fixed in this series), so we can't have more than one of those in any potential deadlock set. Everything else either does lock child after immediate parent or happens on filesystems that see neither d_move() nor d_materialise_unique() and have serialization of their own. IOW, I think that should be enough to avoid deadlocks for in-tree filesystems for the time being; anything more dramatic will have to wait for the next merge window - it's too late in the cycle to do anything daring in that area... Usual place, git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6.git/ for-linus Shortlog: Al Viro (2): Fix ->d_lock locking order in unlazy_walk() fix loop checks in d_materialise_unique() Diffstat: fs/dcache.c | 51 ++++++++++++++++++++++++++++++++++----------------- fs/namei.c | 2 ++ 2 files changed, 36 insertions(+), 17 deletions(-) ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: ->d_lock FUBAR (was Re: Linux 3.0-rc6) 2011-07-13 0:56 ` Al Viro 2011-07-13 1:39 ` Al Viro @ 2011-07-13 1:48 ` Linus Torvalds 1 sibling, 0 replies; 12+ messages in thread From: Linus Torvalds @ 2011-07-13 1:48 UTC (permalink / raw) To: Al Viro; +Cc: linux-fsdevel, npiggin On Tue, Jul 12, 2011 at 5:56 PM, Al Viro <viro@zeniv.linux.org.uk> wrote: > > Alas, no. d_materialize_unique() aside (we have a couple of bugs in > there), ->d_lock handling is fscked up. Ok, I misread your previous email, and read it as if the problem you were discussing was __d_unalias(), and then mis-read the newer email as a result. I now see what you're saying. Without having thought very deeply about this, I would suggest that we aim for entirely getting rid of anything that holds two d_lock's at the same time. There may be cases where we do end up having both "dentry" and the immediate parent, and in that case maybe we can have that direct relationship act as a ordering requirement, but yes, we should definitely aim to get rid of the whole "if ancestor do one ordering, otherwise base it on pointer ordering". Some of the things that take both locks make me go "do we really need to hold those locks at the same time?" IOW, maybe the locking could be simply sequential. Some of that pain may be simply unnecessary. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2011-07-15 4:59 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <CA+55aFxQw4T07hxz8JSm12x3FOH_Dcf=G5mvLrxiTuLxjbw+Mg@mail.gmail.com> 2011-07-11 6:03 ` Linux 3.0-rc6 Al Viro 2011-07-12 23:48 ` ->d_lock FUBAR (was Re: Linux 3.0-rc6) Al Viro 2011-07-13 0:04 ` Linus Torvalds 2011-07-13 0:56 ` Al Viro 2011-07-13 1:39 ` Al Viro 2011-07-13 2:40 ` Linus Torvalds 2011-07-13 2:59 ` Al Viro 2011-07-13 3:22 ` Al Viro 2011-07-13 3:56 ` Linus Torvalds 2011-07-14 7:21 ` Al Viro 2011-07-15 4:58 ` Al Viro 2011-07-13 1:48 ` Linus Torvalds
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).