* [PATCH] mnt: Fix a memory stomp in umount
@ 2014-12-18 16:57 Eric W. Biederman
[not found] ` <CA+55aFz6grtss0SXqOizXMOOF4sxT3FC4GC4NCiMF2Huy1vE4A@mail.gmail.com>
2014-12-18 20:01 ` Al Viro
0 siblings, 2 replies; 14+ messages in thread
From: Eric W. Biederman @ 2014-12-18 16:57 UTC (permalink / raw)
To: Al Viro; +Cc: Linus Torvalds, linux-fsdevel
While reviewing the code of umount_tree I realized that when we append
to a preexisting unmounted list we do not change pprev of the former
first item in the list.
Which means later in namespace_unlock hlist_del_init(&mnt->mnt_hash) on
the former first item of the list will stomp unmounted.first leaving
it set to some random mount point which we are likely to free soon.
This isn't likely to hit, but if it does I don't know how anyone could
track it down.
Fixes: 38129a13e6e71f666e0468e99fdd932a687b4d7e switch mnt_hash to hlist
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
Al do you want to take this one, or would you like me to make certain it
makes it Linus?
fs/namespace.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/namespace.c b/fs/namespace.c
index fe1c77145a78..6afbd7bb79f3 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1370,6 +1370,8 @@ void umount_tree(struct mount *mnt, int how)
}
if (last) {
last->mnt_hash.next = unmounted.first;
+ if (unmounted.first)
+ unmounted.first->pprev = &last->mnt_hash.next;
unmounted.first = tmp_list.first;
unmounted.first->pprev = &unmounted.first;
}
--
2.1.3
^ permalink raw reply related [flat|nested] 14+ messages in thread[parent not found: <CA+55aFz6grtss0SXqOizXMOOF4sxT3FC4GC4NCiMF2Huy1vE4A@mail.gmail.com>]
* Re: [PATCH] mnt: Fix a memory stomp in umount [not found] ` <CA+55aFz6grtss0SXqOizXMOOF4sxT3FC4GC4NCiMF2Huy1vE4A@mail.gmail.com> @ 2014-12-18 18:41 ` Linus Torvalds 2014-12-18 19:24 ` Eric W. Biederman 0 siblings, 1 reply; 14+ messages in thread From: Linus Torvalds @ 2014-12-18 18:41 UTC (permalink / raw) To: Eric W. Biederman; +Cc: linux-fsdevel, Al Viro On Thu, Dec 18, 2014 at 9:07 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > Why is this piece of code using its own made up and buggy list handling in > the first place? We have list functions for these things, exactly so that > people shouldn't write buggy stuff by hand. Oh. Ok, I see what's going on. We have "list_splice()", but we don't have the equivalent "hlist_splice()". So it's doing that by hand, and did it badly. Al, this is your bug. I guess I can take the "manual hlist_splice" fix from Eric, but I'm not really happy with it. There's a few other places in that same commit where the list splice operation has been open-coded. Mind taking a look? Linus ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-18 18:41 ` Linus Torvalds @ 2014-12-18 19:24 ` Eric W. Biederman 2014-12-18 19:34 ` Linus Torvalds 2014-12-18 21:05 ` Al Viro 0 siblings, 2 replies; 14+ messages in thread From: Eric W. Biederman @ 2014-12-18 19:24 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-fsdevel, Al Viro Linus Torvalds <torvalds@linux-foundation.org> writes: > On Thu, Dec 18, 2014 at 9:07 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> Why is this piece of code using its own made up and buggy list handling in >> the first place? We have list functions for these things, exactly so that >> people shouldn't write buggy stuff by hand. > > Oh. Ok, I see what's going on. We have "list_splice()", but we don't > have the equivalent "hlist_splice()". So it's doing that by hand, and > did it badly. > > Al, this is your bug. I guess I can take the "manual hlist_splice" fix > from Eric, but I'm not really happy with it. There's a few other > places in that same commit where the list splice operation has been > open-coded. > > Mind taking a look? It looks like we can pretty easily use mnt_list instead of mnt_hash, see below (note: the code is only compile tested). While converting this to ordinary list helpers I found something strange. In __propagate_umount we currently add the child to be unmounted in a different location in the list then we did before the conversion of mnt_hash to a hlist for rcu's accesses benefit. Now maybe propagate_next handles this (I still need to read and understand that code) if not it looks like I may have found another bug, as it looks like today we can add a node to our list without propogating the unmount from the node. From: "Eric W. Biederman" <ebiederm@xmission.com> Date: Thu, 18 Dec 2014 13:10:48 -0600 Subject: [PATCH] mnt: In umount_tree reuse mnt_list instead of mnt_hash umount_tree builds a list of mounts that need to be unmounted. Utilize mnt_list for this purpose instead of mnt_hash as mnt_list is an ordianry list_head, allowing the use of list_splice and list_move instead of rolling our own. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> --- fs/namespace.c | 45 +++++++++++++++++++-------------------------- fs/pnode.c | 7 ++++--- fs/pnode.h | 2 +- 3 files changed, 24 insertions(+), 30 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index 6afbd7bb79f3..2f21a973d7bc 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -1285,23 +1285,22 @@ int may_umount(struct vfsmount *mnt) EXPORT_SYMBOL(may_umount); -static HLIST_HEAD(unmounted); /* protected by namespace_sem */ +static LIST_HEAD(unmounted); /* protected by namespace_sem */ static void namespace_unlock(void) { struct mount *mnt; - struct hlist_head head = unmounted; + LIST_HEAD(head); - if (likely(hlist_empty(&head))) { + if (likely(list_empty(&unmounted))) { up_write(&namespace_sem); return; } - head.first->pprev = &head.first; - INIT_HLIST_HEAD(&unmounted); + list_splice_init(&unmounted, &head); /* undo decrements we'd done in umount_tree() */ - hlist_for_each_entry(mnt, &head, mnt_hash) + list_for_each_entry(mnt, &head, mnt_list) if (mnt->mnt_ex_mountpoint.mnt) mntget(mnt->mnt_ex_mountpoint.mnt); @@ -1309,9 +1308,9 @@ static void namespace_unlock(void) synchronize_rcu(); - while (!hlist_empty(&head)) { - mnt = hlist_entry(head.first, struct mount, mnt_hash); - hlist_del_init(&mnt->mnt_hash); + while (!list_empty(&head)) { + mnt = list_first_entry(&head, struct mount, mnt_list); + list_del_init(&mnt->mnt_list); if (mnt->mnt_ex_mountpoint.mnt) path_put(&mnt->mnt_ex_mountpoint); mntput(&mnt->mnt); @@ -1332,24 +1331,25 @@ static inline void namespace_lock(void) */ void umount_tree(struct mount *mnt, int how) { - HLIST_HEAD(tmp_list); + LIST_HEAD(tmp_list); struct mount *p; - struct mount *last = NULL; - for (p = mnt; p; p = next_mnt(p, mnt)) { - hlist_del_init_rcu(&p->mnt_hash); - hlist_add_head(&p->mnt_hash, &tmp_list); - } + /* Gather the mounts to umount */ + for (p = mnt; p; p = next_mnt(p, mnt)) + list_move(&p->mnt_list, &tmp_list); - hlist_for_each_entry(p, &tmp_list, mnt_hash) + /* Hide the mounts from lookup_mnt and mnt_mounts */ + list_for_each_entry(p, &tmp_list, mnt_list) { + hlist_del_init_rcu(&p->mnt_hash); list_del_init(&p->mnt_child); + } + /* Add propogated mounts to the tmp_list */ if (how) propagate_umount(&tmp_list); - hlist_for_each_entry(p, &tmp_list, mnt_hash) { + list_for_each_entry(p, &tmp_list, mnt_list) { list_del_init(&p->mnt_expire); - list_del_init(&p->mnt_list); __touch_mnt_namespace(p->mnt_ns); p->mnt_ns = NULL; if (how < 2) @@ -1366,15 +1366,8 @@ void umount_tree(struct mount *mnt, int how) p->mnt_mp = NULL; } change_mnt_propagation(p, MS_PRIVATE); - last = p; - } - if (last) { - last->mnt_hash.next = unmounted.first; - if (unmounted.first) - unmounted.first->pprev = &last->mnt_hash.next; - unmounted.first = tmp_list.first; - unmounted.first->pprev = &unmounted.first; } + list_splice(&tmp_list, &unmounted); } static void shrink_submounts(struct mount *mnt); diff --git a/fs/pnode.c b/fs/pnode.c index 260ac8f898a4..c4319520d884 100644 --- a/fs/pnode.c +++ b/fs/pnode.c @@ -384,7 +384,8 @@ static void __propagate_umount(struct mount *mnt) if (child && list_empty(&child->mnt_mounts)) { list_del_init(&child->mnt_child); hlist_del_init_rcu(&child->mnt_hash); - hlist_add_before_rcu(&child->mnt_hash, &mnt->mnt_hash); + /* list_move or list_move_tail? */ + list_move(&child->mnt_list, &mnt->mnt_list); } } } @@ -396,11 +397,11 @@ static void __propagate_umount(struct mount *mnt) * * vfsmount lock must be held for write */ -int propagate_umount(struct hlist_head *list) +int propagate_umount(struct list_head *list) { struct mount *mnt; - hlist_for_each_entry(mnt, list, mnt_hash) + list_for_each_entry(mnt, list, mnt_list) __propagate_umount(mnt); return 0; } diff --git a/fs/pnode.h b/fs/pnode.h index 4a246358b031..997596a5d31b 100644 --- a/fs/pnode.h +++ b/fs/pnode.h @@ -40,7 +40,7 @@ static inline void set_mnt_shared(struct mount *mnt) void change_mnt_propagation(struct mount *, int); int propagate_mnt(struct mount *, struct mountpoint *, struct mount *, struct hlist_head *); -int propagate_umount(struct hlist_head *); +int propagate_umount(struct list_head *); int propagate_mount_busy(struct mount *, int); void mnt_release_group_id(struct mount *); int get_dominating_id(struct mount *mnt, const struct path *root); -- 2.1.3 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-18 19:24 ` Eric W. Biederman @ 2014-12-18 19:34 ` Linus Torvalds 2014-12-18 21:05 ` Al Viro 1 sibling, 0 replies; 14+ messages in thread From: Linus Torvalds @ 2014-12-18 19:34 UTC (permalink / raw) To: Eric W. Biederman; +Cc: linux-fsdevel, Al Viro On Thu, Dec 18, 2014 at 11:24 AM, Eric W. Biederman <ebiederm@xmission.com> wrote: > > It looks like we can pretty easily use mnt_list instead of mnt_hash, > see below (note: the code is only compile tested). Ok. That sounds good. In the meantime, I just applied your patch. I'm not a huge fan of it, but the bug is real, and for backporting that patch is the unquestionably the right thing to do. I just want to at least have this discussion about better long-term solutions, and I think your patch may be it. Thanks. Al? Linus ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-18 19:24 ` Eric W. Biederman 2014-12-18 19:34 ` Linus Torvalds @ 2014-12-18 21:05 ` Al Viro 2014-12-18 21:18 ` Eric W. Biederman 1 sibling, 1 reply; 14+ messages in thread From: Al Viro @ 2014-12-18 21:05 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Linus Torvalds, linux-fsdevel On Thu, Dec 18, 2014 at 01:24:26PM -0600, Eric W. Biederman wrote: > Linus Torvalds <torvalds@linux-foundation.org> writes: > > > On Thu, Dec 18, 2014 at 9:07 AM, Linus Torvalds > > <torvalds@linux-foundation.org> wrote: > >> Why is this piece of code using its own made up and buggy list handling in > >> the first place? We have list functions for these things, exactly so that > >> people shouldn't write buggy stuff by hand. > > > > Oh. Ok, I see what's going on. We have "list_splice()", but we don't > > have the equivalent "hlist_splice()". So it's doing that by hand, and > > did it badly. > > > > Al, this is your bug. I guess I can take the "manual hlist_splice" fix > > from Eric, but I'm not really happy with it. There's a few other > > places in that same commit where the list splice operation has been > > open-coded. > > > > Mind taking a look? > > It looks like we can pretty easily use mnt_list instead of mnt_hash, > see below (note: the code is only compile tested). > > While converting this to ordinary list helpers I found something > strange. > > In __propagate_umount we currently add the child to be unmounted in a > different location in the list then we did before the conversion of > mnt_hash to a hlist for rcu's accesses benefit. > > Now maybe propagate_next handles this (I still need to read and > understand that code) if not it looks like I may have found another bug, > as it looks like today we can add a node to our list without propogating > the unmount from the node. Er... Why would we want to reprocess it? The loop goes through all nodes getting events from ours; everything that gets events from _them_ included at the same time they are. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-18 21:05 ` Al Viro @ 2014-12-18 21:18 ` Eric W. Biederman 2014-12-19 0:02 ` Al Viro 0 siblings, 1 reply; 14+ messages in thread From: Eric W. Biederman @ 2014-12-18 21:18 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, linux-fsdevel Al Viro <viro@ZenIV.linux.org.uk> writes: > On Thu, Dec 18, 2014 at 01:24:26PM -0600, Eric W. Biederman wrote: >> Linus Torvalds <torvalds@linux-foundation.org> writes: >> >> > On Thu, Dec 18, 2014 at 9:07 AM, Linus Torvalds >> > <torvalds@linux-foundation.org> wrote: >> >> Why is this piece of code using its own made up and buggy list handling in >> >> the first place? We have list functions for these things, exactly so that >> >> people shouldn't write buggy stuff by hand. >> > >> > Oh. Ok, I see what's going on. We have "list_splice()", but we don't >> > have the equivalent "hlist_splice()". So it's doing that by hand, and >> > did it badly. >> > >> > Al, this is your bug. I guess I can take the "manual hlist_splice" fix >> > from Eric, but I'm not really happy with it. There's a few other >> > places in that same commit where the list splice operation has been >> > open-coded. >> > >> > Mind taking a look? >> >> It looks like we can pretty easily use mnt_list instead of mnt_hash, >> see below (note: the code is only compile tested). >> >> While converting this to ordinary list helpers I found something >> strange. >> >> In __propagate_umount we currently add the child to be unmounted in a >> different location in the list then we did before the conversion of >> mnt_hash to a hlist for rcu's accesses benefit. >> >> Now maybe propagate_next handles this (I still need to read and >> understand that code) if not it looks like I may have found another bug, >> as it looks like today we can add a node to our list without propogating >> the unmount from the node. > > Er... Why would we want to reprocess it? The loop goes through all nodes > getting events from ours; everything that gets events from _them_ included > at the same time they are. We might have wanted to because before your change to an hlist that is what the code did. Having read through the code of propagate_next it does look like it iterates through the entire propagation hierarchy so there shouldn't be a need to visit mounts that we have placed on the propagation list. Any thoughts on using mnt_list instead of mnt_hash to allow the use of list_splice and list_move? Eric ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-18 21:18 ` Eric W. Biederman @ 2014-12-19 0:02 ` Al Viro 2014-12-19 0:03 ` Al Viro 2015-01-02 21:06 ` Eric W. Biederman 0 siblings, 2 replies; 14+ messages in thread From: Al Viro @ 2014-12-19 0:02 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Linus Torvalds, linux-fsdevel On Thu, Dec 18, 2014 at 03:18:49PM -0600, Eric W. Biederman wrote: > We might have wanted to because before your change to an hlist that is > what the code did. > > Having read through the code of propagate_next it does look like it > iterates through the entire propagation hierarchy so there shouldn't be > a need to visit mounts that we have placed on the propagation list. > > Any thoughts on using mnt_list instead of mnt_hash to allow the use of > list_splice and list_move? Frankly, I don't see much benefit in that. What's wrong with actually adding __hlist_splice(struct hlist_node *head, struct hlist_node *tail, struct hlist_node **where) { next = *where; *where = head; head->pprev = where; if (next) { tail->next = next; next->pprev = tail; } } and use it as __hlist_splice(tmp_list.first, &last->mnt_hash, &unmounted.first); (and I'd start with if (unlikely(!mnt)) return; /* idiot caller */) Another primitive would be something like hlist_transplant(struct hlist_head **from, struct hlist_head **to) { *to = *from; if (!hlist_empty(*to)) { to->first->pprev = to; INIT_HLIST_HEAD(from); } } with namespace_unlock() starting with hlist_transplant(&head, &unmounted); if (likely(hlist_empty(&head))) { up_write(&namespace_sem); return; } /* undo decrements we'd done in umount_tree() */ ... Linus, would that work for you, or would you prefer something more fancy? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-19 0:02 ` Al Viro @ 2014-12-19 0:03 ` Al Viro 2015-01-02 21:06 ` Eric W. Biederman 1 sibling, 0 replies; 14+ messages in thread From: Al Viro @ 2014-12-19 0:03 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Linus Torvalds, linux-fsdevel On Fri, Dec 19, 2014 at 12:02:15AM +0000, Al Viro wrote: > On Thu, Dec 18, 2014 at 03:18:49PM -0600, Eric W. Biederman wrote: > > > We might have wanted to because before your change to an hlist that is > > what the code did. > > > > Having read through the code of propagate_next it does look like it > > iterates through the entire propagation hierarchy so there shouldn't be > > a need to visit mounts that we have placed on the propagation list. > > > > Any thoughts on using mnt_list instead of mnt_hash to allow the use of > > list_splice and list_move? > > Frankly, I don't see much benefit in that. What's wrong with actually > adding > __hlist_splice(struct hlist_node *head, > struct hlist_node *tail, > struct hlist_node **where) > { > next = *where; > *where = head; > head->pprev = where; > if (next) { > tail->next = next; > next->pprev = tail; > } > } > > and use it as > __hlist_splice(tmp_list.first, &last->mnt_hash, &unmounted.first); > (and I'd start with if (unlikely(!mnt)) return; /* idiot caller */) > > Another primitive would be something like > hlist_transplant(struct hlist_head **from, struct hlist_head **to) hlist_transplant(struct hlist_head **to, struct hlist_head **from), that is... > { > *to = *from; > if (!hlist_empty(*to)) { > to->first->pprev = to; > INIT_HLIST_HEAD(from); > } > } > > with namespace_unlock() starting with > hlist_transplant(&head, &unmounted); > if (likely(hlist_empty(&head))) { > up_write(&namespace_sem); > return; > } > /* undo decrements we'd done in umount_tree() */ > ... > > Linus, would that work for you, or would you prefer something more fancy? > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-19 0:02 ` Al Viro 2014-12-19 0:03 ` Al Viro @ 2015-01-02 21:06 ` Eric W. Biederman 2015-01-02 21:13 ` Al Viro 1 sibling, 1 reply; 14+ messages in thread From: Eric W. Biederman @ 2015-01-02 21:06 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, linux-fsdevel Al Viro <viro@ZenIV.linux.org.uk> writes: > On Thu, Dec 18, 2014 at 03:18:49PM -0600, Eric W. Biederman wrote: > >> We might have wanted to because before your change to an hlist that is >> what the code did. >> >> Having read through the code of propagate_next it does look like it >> iterates through the entire propagation hierarchy so there shouldn't be >> a need to visit mounts that we have placed on the propagation list. >> >> Any thoughts on using mnt_list instead of mnt_hash to allow the use of >> list_splice and list_move? > > Frankly, I don't see much benefit in that. What's wrong with actually > adding > __hlist_splice(struct hlist_node *head, > struct hlist_node *tail, > struct hlist_node **where) > { > next = *where; > *where = head; > head->pprev = where; > if (next) { > tail->next = next; > next->pprev = tail; > } > } > > and use it as > __hlist_splice(tmp_list.first, &last->mnt_hash, &unmounted.first); > (and I'd start with if (unlikely(!mnt)) return; /* idiot caller */) I have a couple of reasons for not wanting to go that way. - It requires keeping track of the tail of the list which is half of the error prone complexity. - I have a fix for a bad interaction between MNT_LOCKED and MNT_DETACH where I need to leave mounts on the mount hash, until the final mntput. For that I need not to reuse the the mnt_hash lists. Patches to follow shortly. Of course after having gone through this code a time or two and tested things I am very annoyed by the fact that "hlist_add_before_rcu()" and "list_move_tail" perform the same operation. That is down right confusing. > Another primitive would be something like > hlist_transplant(struct hlist_head **from, struct hlist_head **to) > { > *to = *from; > if (!hlist_empty(*to)) { > to->first->pprev = to; > INIT_HLIST_HEAD(from); > } > } > > with namespace_unlock() starting with > hlist_transplant(&head, &unmounted); > if (likely(hlist_empty(&head))) { > up_write(&namespace_sem); > return; > } > /* undo decrements we'd done in umount_tree() */ > ... > > Linus, would that work for you, or would you prefer something more fancy? Eric ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2015-01-02 21:06 ` Eric W. Biederman @ 2015-01-02 21:13 ` Al Viro 2015-01-02 21:56 ` Eric W. Biederman 0 siblings, 1 reply; 14+ messages in thread From: Al Viro @ 2015-01-02 21:13 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Linus Torvalds, linux-fsdevel On Fri, Jan 02, 2015 at 03:06:06PM -0600, Eric W. Biederman wrote: > - It requires keeping track of the tail of the list which is half of the > error prone complexity. Er... In umount_tree() it comes pretty much for free, actually. > - I have a fix for a bad interaction between MNT_LOCKED and MNT_DETACH > where I need to leave mounts on the mount hash, until the final > mntput. For that I need not to reuse the the mnt_hash lists. Details, please. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2015-01-02 21:13 ` Al Viro @ 2015-01-02 21:56 ` Eric W. Biederman 0 siblings, 0 replies; 14+ messages in thread From: Eric W. Biederman @ 2015-01-02 21:56 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, linux-fsdevel Al Viro <viro@ZenIV.linux.org.uk> writes: > On Fri, Jan 02, 2015 at 03:06:06PM -0600, Eric W. Biederman wrote: > >> - It requires keeping track of the tail of the list which is half of the >> error prone complexity. > > Er... In umount_tree() it comes pretty much for free, actually. The runtime complexity is basically free. The conceptual complexity is not. It is yet another detail to pay attention to. Having gotten it wrong my first round while trying to do something similar while developing my patches I can attest to this. The hlist version is more error prone to use and develop against. (Even if appropriate helpers are available). >> - I have a fix for a bad interaction between MNT_LOCKED and MNT_DETACH >> where I need to leave mounts on the mount hash, until the final >> mntput. For that I need not to reuse the the mnt_hash lists. > > Details, please. The code doing that has been sent out for review. Eric ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-18 16:57 [PATCH] mnt: Fix a memory stomp in umount Eric W. Biederman [not found] ` <CA+55aFz6grtss0SXqOizXMOOF4sxT3FC4GC4NCiMF2Huy1vE4A@mail.gmail.com> @ 2014-12-18 20:01 ` Al Viro 2014-12-18 20:15 ` Al Viro 1 sibling, 1 reply; 14+ messages in thread From: Al Viro @ 2014-12-18 20:01 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Linus Torvalds, linux-fsdevel On Thu, Dec 18, 2014 at 10:57:19AM -0600, Eric W. Biederman wrote: > > While reviewing the code of umount_tree I realized that when we append > to a preexisting unmounted list we do not change pprev of the former > first item in the list. > > Which means later in namespace_unlock hlist_del_init(&mnt->mnt_hash) on > the former first item of the list will stomp unmounted.first leaving > it set to some random mount point which we are likely to free soon. > > This isn't likely to hit, but if it does I don't know how anyone could > track it down. All you need for that kind of loop to work is correct -next on all elements. Seriously. Even correct first->pprev is not needed. Look: struct hlist_head head = unmounted; if (likely(hlist_empty(&head))) // no dereferences of ppre *or* next sod off head.first->pprev = &head.first; // pprev of the first is valid now INIT_HLIST_HEAD(&unmounted); hlist_for_each_entry(mnt, &head, mnt_hash) if (mnt->mnt_ex_mountpoint.mnt) mntget(mnt->mnt_ex_mountpoint.mnt); // only needed ->next for that loop ... while (!hlist_empty(&head)) { mnt = hlist_entry(head.first, struct mount, mnt_hash); hlist_del_init(&mnt->mnt_hash); ... } Now, we certainly need pprev of the first to be correct for hlist_del_init() to work. What we do not need is correctness of pprev of the second as we call hlist_del_init(). And _after_ hlist_del_init() pprev of what used to be the second (the first, now) is healed. So we actually are OK here. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-18 20:01 ` Al Viro @ 2014-12-18 20:15 ` Al Viro 2014-12-18 20:40 ` Eric W. Biederman 0 siblings, 1 reply; 14+ messages in thread From: Al Viro @ 2014-12-18 20:15 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Linus Torvalds, linux-fsdevel On Thu, Dec 18, 2014 at 08:01:10PM +0000, Al Viro wrote: > On Thu, Dec 18, 2014 at 10:57:19AM -0600, Eric W. Biederman wrote: > > > > While reviewing the code of umount_tree I realized that when we append > > to a preexisting unmounted list we do not change pprev of the former > > first item in the list. > > > > Which means later in namespace_unlock hlist_del_init(&mnt->mnt_hash) on > > the former first item of the list will stomp unmounted.first leaving > > it set to some random mount point which we are likely to free soon. > > > > This isn't likely to hit, but if it does I don't know how anyone could > > track it down. > > All you need for that kind of loop to work is correct -next on all elements. Mind you, I'm not saying that this isn't worth sanitizing. Just that analysis in your commit message is incorrect - we actually do _not_ stomp on memory there. It's brittle and it's only saved by having only very few things done to that hlist, but the actual memory corruption does not happen. Again, I'm no defending that code; it's certaily better off making a properly-spliced hlist rather than relying on precise sequence of operations in namespace_unlock(). ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] mnt: Fix a memory stomp in umount 2014-12-18 20:15 ` Al Viro @ 2014-12-18 20:40 ` Eric W. Biederman 0 siblings, 0 replies; 14+ messages in thread From: Eric W. Biederman @ 2014-12-18 20:40 UTC (permalink / raw) To: Al Viro; +Cc: Linus Torvalds, linux-fsdevel Al Viro <viro@ZenIV.linux.org.uk> writes: > On Thu, Dec 18, 2014 at 08:01:10PM +0000, Al Viro wrote: >> On Thu, Dec 18, 2014 at 10:57:19AM -0600, Eric W. Biederman wrote: >> > >> > While reviewing the code of umount_tree I realized that when we append >> > to a preexisting unmounted list we do not change pprev of the former >> > first item in the list. >> > >> > Which means later in namespace_unlock hlist_del_init(&mnt->mnt_hash) on >> > the former first item of the list will stomp unmounted.first leaving >> > it set to some random mount point which we are likely to free soon. >> > >> > This isn't likely to hit, but if it does I don't know how anyone could >> > track it down. >> >> All you need for that kind of loop to work is correct -next on all elements. > > Mind you, I'm not saying that this isn't worth sanitizing. Just that analysis > in your commit message is incorrect - we actually do _not_ stomp on memory > there. It's brittle and it's only saved by having only very few things done > to that hlist, but the actual memory corruption does not happen. Again, I'm > no defending that code; it's certaily better off making a properly-spliced > hlist rather than relying on precise sequence of operations in > namespace_unlock(). Agreed. That code wanted to be a memory stomp but we got lucky and it couldn't figure out how to be. Which I guess in the grand scheme of things I am glad of. When I sent the patch I knew there was an issue and I figured I had better send a patch before my mind moved on and I forgot about the incorrectly spliced list. Eric ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2015-01-02 21:59 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-12-18 16:57 [PATCH] mnt: Fix a memory stomp in umount Eric W. Biederman
[not found] ` <CA+55aFz6grtss0SXqOizXMOOF4sxT3FC4GC4NCiMF2Huy1vE4A@mail.gmail.com>
2014-12-18 18:41 ` Linus Torvalds
2014-12-18 19:24 ` Eric W. Biederman
2014-12-18 19:34 ` Linus Torvalds
2014-12-18 21:05 ` Al Viro
2014-12-18 21:18 ` Eric W. Biederman
2014-12-19 0:02 ` Al Viro
2014-12-19 0:03 ` Al Viro
2015-01-02 21:06 ` Eric W. Biederman
2015-01-02 21:13 ` Al Viro
2015-01-02 21:56 ` Eric W. Biederman
2014-12-18 20:01 ` Al Viro
2014-12-18 20:15 ` Al Viro
2014-12-18 20:40 ` Eric W. Biederman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox