From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: kamezawa.hiroyu@jp.fujitsu.com
Cc: linux-mm@kvack.org, balbir@linux.vnet.ibm.com, xemul@openvz.org,
hugh@veritas.com
Subject: Re: Re: Re: [PATCH 7/7] memcg: freeing page_cgroup at suitable chance
Date: Thu, 20 Mar 2008 17:09:50 +0100 [thread overview]
Message-ID: <1206029390.8514.403.camel@twins> (raw)
In-Reply-To: <2248236.1206029072224.kamezawa.hiroyu@jp.fujitsu.com>
On Fri, 2008-03-21 at 01:04 +0900, kamezawa.hiroyu@jp.fujitsu.com wrote:
> >>On Thu, 2008-03-20 at 14:07 +0900, KAMEZAWA Hiroyuki wrote:
> >>> On Wed, 19 Mar 2008 22:33:19 +0100
> >>> Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> >>>
> >>> > > + if (spin_trylock(&root->tree_lock)) {
> >>> > > + /* radix tree is freed by RCU. so they will not call
> >>> > > + free_pages() right now.*/
> >>> > > + radix_tree_delete(&root->root_node, idx);
> >>> > > + spin_unlock(&root->tree_lock);
> >>> > > + /* We can free this in lazy fashion .*/
> >>> > > + free_page_cgroup(head);
> >>> >
> >>> > No RCU based freeing? I'd expected a call_rcu(), otherwise we race with
> >>> > lookups.
> >>> >
> >>> SLAB itself is SLAB_DESTROY_BY_RCU. I'll add comments here.
> >>
> >>SLAB_DESTROYED_BY_RCU is not enough, that will just ensure the slab does
> >>not get invalid, but it does not guarantee the objects will not be
> >>re-used. So you still have a race here, the lookup can see another
> >>object than it went for.
> >>
> >>Consider:
> >>
> >>
> >> A B
> >>
> >>rcu_read_lock()
> >>obj = radix_tree_lookup(&tree, idx)
> >> spin_lock(&tree_lock)
> >> obj = radix_tree_remove(&tree, idx)
> >> spin_unlock(&tree_lock)
> >>
> >> kmem_cache_free(s, obj)
> >>
> >>rcu_read_unlock()
> >>
> >> obj2 = kmem_cache_alloc(s)
> >> spin_lock(&tree_lock)
> >> radix_tree_insert(&tree_lock, idx2)
> >> spin_unlock(&tree_lock)
> >>
> >>
> >>return obj->data
> >>
> >>
> >>If B's obj2 == obj (possible), then A will return the object for idx2,
> >>while it asked for idx.
> >>
> >>So A needs an object validate and retry loop, or you need to RCU-free
> >>objects and extend the rcu_read_lock() range to cover obj's usage.
> >>
> >ok. thank you for pointing out and explaining kindly.
> >
> Hmm, I have to add texts for myself...
>
> I assume that page-cgroup-lookup against *free* pages never occur.
>
> When we free entries of [pfn, pfn + 1 << order), [pfn, pfn + 1 << order)
> is in freelist and we have zone->lock. Lookup against [pfn, pfn + 1 << order)
> cannot happen against freed pages.
> Then, looking up and freeing an idx cannot happen at the same time.
>
> radix_tree_lookup() can look up wrong entry in this case ?
You're right, my bad.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-03-20 16:09 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-03-14 9:59 [PATCH 0/7] memcg: radix-tree page_cgroup KAMEZAWA Hiroyuki
2008-03-14 10:03 ` [PATCH 1/7] re-define page_cgroup KAMEZAWA Hiroyuki
2008-03-16 14:15 ` Balbir Singh
2008-03-18 1:10 ` KAMEZAWA Hiroyuki
2008-03-17 0:21 ` Li Zefan
2008-03-18 1:12 ` KAMEZAWA Hiroyuki
2008-03-17 2:07 ` Li Zefan
2008-03-18 1:11 ` KAMEZAWA Hiroyuki
2008-03-14 10:06 ` [PATCH 2/7] charge/uncharge KAMEZAWA Hiroyuki
2008-03-17 1:46 ` Balbir Singh
2008-03-18 1:14 ` KAMEZAWA Hiroyuki
2008-03-17 2:26 ` Li Zefan
2008-03-18 1:15 ` KAMEZAWA Hiroyuki
2008-03-14 10:07 ` [PATCH 3/7] memcg: move_lists KAMEZAWA Hiroyuki
2008-03-18 16:44 ` Balbir Singh
2008-03-19 2:34 ` KAMEZAWA Hiroyuki
2008-03-14 10:15 ` [PATCH 4/7] memcg: page migration KAMEZAWA Hiroyuki
2008-03-17 2:36 ` Li Zefan
2008-03-18 1:17 ` KAMEZAWA Hiroyuki
2008-03-18 18:11 ` Balbir Singh
2008-03-19 2:44 ` KAMEZAWA Hiroyuki
2008-03-14 10:17 ` [PATCH 5/7] radix-tree page cgroup KAMEZAWA Hiroyuki
2008-03-17 2:56 ` Li Zefan
2008-03-17 3:26 ` Li Zefan
2008-03-18 1:18 ` KAMEZAWA Hiroyuki
2008-03-18 1:23 ` KAMEZAWA Hiroyuki
2008-03-19 2:05 ` Balbir Singh
2008-03-19 2:51 ` KAMEZAWA Hiroyuki
2008-03-19 3:14 ` Balbir Singh
2008-03-19 3:24 ` KAMEZAWA Hiroyuki
2008-03-19 21:11 ` Peter Zijlstra
2008-03-20 4:45 ` KAMEZAWA Hiroyuki
2008-03-20 5:09 ` KAMEZAWA Hiroyuki
2008-03-14 10:18 ` [PATCH 6/7] memcg: speed up by percpu KAMEZAWA Hiroyuki
2008-03-17 3:03 ` Li Zefan
2008-03-18 1:25 ` KAMEZAWA Hiroyuki
2008-03-18 23:55 ` Li Zefan
2008-03-19 2:51 ` KAMEZAWA Hiroyuki
2008-03-19 21:19 ` Peter Zijlstra
2008-03-19 21:41 ` Peter Zijlstra
2008-03-20 9:08 ` Andy Whitcroft
2008-03-20 4:46 ` KAMEZAWA Hiroyuki
2008-03-14 10:22 ` [PATCH 7/7] memcg: freeing page_cgroup at suitable chance KAMEZAWA Hiroyuki
2008-03-17 3:10 ` Li Zefan
2008-03-18 1:30 ` KAMEZAWA Hiroyuki
2008-03-19 21:33 ` Peter Zijlstra
2008-03-20 5:07 ` KAMEZAWA Hiroyuki
2008-03-20 7:55 ` Peter Zijlstra
2008-03-20 14:49 ` kamezawa.hiroyu
2008-03-20 16:04 ` kamezawa.hiroyu
2008-03-20 16:09 ` Peter Zijlstra [this message]
2008-03-20 16:15 ` kamezawa.hiroyu
2008-03-15 6:15 ` [PATCH 0/7] memcg: radix-tree page_cgroup Balbir Singh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1206029390.8514.403.camel@twins \
--to=a.p.zijlstra@chello.nl \
--cc=balbir@linux.vnet.ibm.com \
--cc=hugh@veritas.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-mm@kvack.org \
--cc=xemul@openvz.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.