From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: kamezawa.hiroyu@jp.fujitsu.com
Cc: linux-mm@kvack.org, balbir@linux.vnet.ibm.com, xemul@openvz.org,
hugh@veritas.com
Subject: Re: Re: Re: [PATCH 7/7] memcg: freeing page_cgroup at suitable chance
Date: Thu, 20 Mar 2008 17:09:50 +0100 [thread overview]
Message-ID: <1206029390.8514.403.camel@twins> (raw)
In-Reply-To: <2248236.1206029072224.kamezawa.hiroyu@jp.fujitsu.com>
On Fri, 2008-03-21 at 01:04 +0900, kamezawa.hiroyu@jp.fujitsu.com wrote:
> >>On Thu, 2008-03-20 at 14:07 +0900, KAMEZAWA Hiroyuki wrote:
> >>> On Wed, 19 Mar 2008 22:33:19 +0100
> >>> Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> >>>
> >>> > > + if (spin_trylock(&root->tree_lock)) {
> >>> > > + /* radix tree is freed by RCU. so they will not call
> >>> > > + free_pages() right now.*/
> >>> > > + radix_tree_delete(&root->root_node, idx);
> >>> > > + spin_unlock(&root->tree_lock);
> >>> > > + /* We can free this in lazy fashion .*/
> >>> > > + free_page_cgroup(head);
> >>> >
> >>> > No RCU based freeing? I'd expected a call_rcu(), otherwise we race with
> >>> > lookups.
> >>> >
> >>> SLAB itself is SLAB_DESTROY_BY_RCU. I'll add comments here.
> >>
> >>SLAB_DESTROYED_BY_RCU is not enough, that will just ensure the slab does
> >>not get invalid, but it does not guarantee the objects will not be
> >>re-used. So you still have a race here, the lookup can see another
> >>object than it went for.
> >>
> >>Consider:
> >>
> >>
> >> A B
> >>
> >>rcu_read_lock()
> >>obj = radix_tree_lookup(&tree, idx)
> >> spin_lock(&tree_lock)
> >> obj = radix_tree_remove(&tree, idx)
> >> spin_unlock(&tree_lock)
> >>
> >> kmem_cache_free(s, obj)
> >>
> >>rcu_read_unlock()
> >>
> >> obj2 = kmem_cache_alloc(s)
> >> spin_lock(&tree_lock)
> >> radix_tree_insert(&tree_lock, idx2)
> >> spin_unlock(&tree_lock)
> >>
> >>
> >>return obj->data
> >>
> >>
> >>If B's obj2 == obj (possible), then A will return the object for idx2,
> >>while it asked for idx.
> >>
> >>So A needs an object validate and retry loop, or you need to RCU-free
> >>objects and extend the rcu_read_lock() range to cover obj's usage.
> >>
> >ok. thank you for pointing out and explaining kindly.
> >
> Hmm, I have to add texts for myself...
>
> I assume that page-cgroup-lookup against *free* pages never occur.
>
> When we free entries of [pfn, pfn + 1 << order), [pfn, pfn + 1 << order)
> is in freelist and we have zone->lock. Lookup against [pfn, pfn + 1 << order)
> cannot happen against freed pages.
> Then, looking up and freeing an idx cannot happen at the same time.
>
> radix_tree_lookup() can look up wrong entry in this case ?
You're right, my bad.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-03-20 16:09 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-03-14 9:59 [PATCH 0/7] memcg: radix-tree page_cgroup KAMEZAWA Hiroyuki
2008-03-14 10:03 ` [PATCH 1/7] re-define page_cgroup KAMEZAWA Hiroyuki
2008-03-16 14:15 ` Balbir Singh
2008-03-18 1:10 ` KAMEZAWA Hiroyuki
2008-03-17 0:21 ` Li Zefan
2008-03-18 1:12 ` KAMEZAWA Hiroyuki
2008-03-17 2:07 ` Li Zefan
2008-03-18 1:11 ` KAMEZAWA Hiroyuki
2008-03-14 10:06 ` [PATCH 2/7] charge/uncharge KAMEZAWA Hiroyuki
2008-03-17 1:46 ` Balbir Singh
2008-03-18 1:14 ` KAMEZAWA Hiroyuki
2008-03-17 2:26 ` Li Zefan
2008-03-18 1:15 ` KAMEZAWA Hiroyuki
2008-03-14 10:07 ` [PATCH 3/7] memcg: move_lists KAMEZAWA Hiroyuki
2008-03-18 16:44 ` Balbir Singh
2008-03-19 2:34 ` KAMEZAWA Hiroyuki
2008-03-14 10:15 ` [PATCH 4/7] memcg: page migration KAMEZAWA Hiroyuki
2008-03-17 2:36 ` Li Zefan
2008-03-18 1:17 ` KAMEZAWA Hiroyuki
2008-03-18 18:11 ` Balbir Singh
2008-03-19 2:44 ` KAMEZAWA Hiroyuki
2008-03-14 10:17 ` [PATCH 5/7] radix-tree page cgroup KAMEZAWA Hiroyuki
2008-03-17 2:56 ` Li Zefan
2008-03-17 3:26 ` Li Zefan
2008-03-18 1:18 ` KAMEZAWA Hiroyuki
2008-03-18 1:23 ` KAMEZAWA Hiroyuki
2008-03-19 2:05 ` Balbir Singh
2008-03-19 2:51 ` KAMEZAWA Hiroyuki
2008-03-19 3:14 ` Balbir Singh
2008-03-19 3:24 ` KAMEZAWA Hiroyuki
2008-03-19 21:11 ` Peter Zijlstra
2008-03-20 4:45 ` KAMEZAWA Hiroyuki
2008-03-20 5:09 ` KAMEZAWA Hiroyuki
2008-03-14 10:18 ` [PATCH 6/7] memcg: speed up by percpu KAMEZAWA Hiroyuki
2008-03-17 3:03 ` Li Zefan
2008-03-18 1:25 ` KAMEZAWA Hiroyuki
2008-03-18 23:55 ` Li Zefan
2008-03-19 2:51 ` KAMEZAWA Hiroyuki
2008-03-19 21:19 ` Peter Zijlstra
2008-03-19 21:41 ` Peter Zijlstra
2008-03-20 9:08 ` Andy Whitcroft
2008-03-20 4:46 ` KAMEZAWA Hiroyuki
2008-03-14 10:22 ` [PATCH 7/7] memcg: freeing page_cgroup at suitable chance KAMEZAWA Hiroyuki
2008-03-17 3:10 ` Li Zefan
2008-03-18 1:30 ` KAMEZAWA Hiroyuki
2008-03-19 21:33 ` Peter Zijlstra
2008-03-20 5:07 ` KAMEZAWA Hiroyuki
2008-03-20 7:55 ` Peter Zijlstra
2008-03-20 14:49 ` kamezawa.hiroyu
2008-03-20 16:04 ` kamezawa.hiroyu
2008-03-20 16:09 ` Peter Zijlstra [this message]
2008-03-20 16:15 ` kamezawa.hiroyu
2008-03-15 6:15 ` [PATCH 0/7] memcg: radix-tree page_cgroup Balbir Singh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1206029390.8514.403.camel@twins \
--to=a.p.zijlstra@chello.nl \
--cc=balbir@linux.vnet.ibm.com \
--cc=hugh@veritas.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-mm@kvack.org \
--cc=xemul@openvz.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).