* [PATCH v2] mm: simplify find_vma_prev
From: kosaki.motohiro @ 2011-12-09 21:23 UTC
To: linux-mm, linux-kernel
Cc: KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

commit 297c5eee37 (mm: make the vma list be doubly linked) added the
vm_prev member to vm_area_struct. Therefore we can simplify
find_vma_prev() by using it. Also, this change helps to improve
page fault performance because it has strong locality of reference.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/mmap.c |   40 +++++++++++-----------------------------
 1 files changed, 11 insertions(+), 29 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index eae90af..a84539b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1603,39 +1603,21 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
 
 EXPORT_SYMBOL(find_vma);
 
-/* Same as find_vma, but also return a pointer to the previous VMA in *pprev. */
+/*
+ * Same as find_vma, but also return a pointer to the previous VMA in *pprev.
+ * Note: pprev is set to NULL when return value is NULL.
+ */
 struct vm_area_struct *
-find_vma_prev(struct mm_struct *mm, unsigned long addr,
-			struct vm_area_struct **pprev)
+find_vma_prev(struct mm_struct *mm, unsigned long addr, struct vm_area_struct **pprev)
 {
-	struct vm_area_struct *vma = NULL, *prev = NULL;
-	struct rb_node *rb_node;
-	if (!mm)
-		goto out;
-
-	/* Guard against addr being lower than the first VMA */
-	vma = mm->mmap;
-
-	/* Go through the RB tree quickly. */
-	rb_node = mm->mm_rb.rb_node;
-
-	while (rb_node) {
-		struct vm_area_struct *vma_tmp;
-		vma_tmp = rb_entry(rb_node, struct vm_area_struct, vm_rb);
+	struct vm_area_struct *vma;
 
-		if (addr < vma_tmp->vm_end) {
-			rb_node = rb_node->rb_left;
-		} else {
-			prev = vma_tmp;
-			if (!prev->vm_next || (addr < prev->vm_next->vm_end))
-				break;
-			rb_node = rb_node->rb_right;
-		}
-	}
+	*pprev = NULL;
+	vma = find_vma(mm, addr);
+	if (vma)
+		*pprev = vma->vm_prev;
 
-out:
-	*pprev = prev;
-	return prev ? prev->vm_next : vma;
+	return vma;
 }
 
 /*
-- 
1.7.1

* Re: [PATCH v2] mm: simplify find_vma_prev
From: Joe Perches @ 2011-12-09 21:35 UTC
To: kosaki.motohiro
Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

On Fri, 2011-12-09 at 16:23 -0500, kosaki.motohiro@gmail.com wrote:
> commit 297c5eee37 (mm: make the vma list be doubly linked) added the
> vm_prev member to vm_area_struct. Therefore we can simplify
> find_vma_prev() by using it. Also, this change helps to improve
> page fault performance because it has strong locality of reference.

trivia:

> diff --git a/mm/mmap.c b/mm/mmap.c
[]
> @@ -1603,39 +1603,21 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
>
>  EXPORT_SYMBOL(find_vma);
>
> -/* Same as find_vma, but also return a pointer to the previous VMA in *pprev. */
> +/*
> + * Same as find_vma, but also return a pointer to the previous VMA in *pprev.
> + * Note: pprev is set to NULL when return value is NULL.
> + */
>  struct vm_area_struct *
> -find_vma_prev(struct mm_struct *mm, unsigned long addr,
> -			struct vm_area_struct **pprev)
> +find_vma_prev(struct mm_struct *mm, unsigned long addr, struct vm_area_struct **pprev)

eh. This declaration change seems gratuitous
and it exceeds 80 columns.

> +	*pprev = NULL;
> +	vma = find_vma(mm, addr);
> +	if (vma)
> +		*pprev = vma->vm_prev;

There's no need to possibly set *pprev twice.

Maybe
{
	struct vm_area_struct *vma = find_vma(mm, addr);

	*pprev = vma ? vma->vm_prev : NULL;
or
	if (vma)
		*pprev = vma->vm_prev;
	else
		*pprev = NULL;

	return vma;
}

> -out:
> -	*pprev = prev;
> -	return prev ? prev->vm_next : vma;
> +	return vma;
>  }
>
>  /*

* [PATCH v3] mm: simplify find_vma_prev
From: kosaki.motohiro @ 2011-12-09 22:48 UTC
To: linux-mm, linux-kernel
Cc: KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

commit 297c5eee37 (mm: make the vma list be doubly linked) added the
vm_prev member to vm_area_struct. Therefore we can simplify
find_vma_prev() by using it. Also, this change helps to improve
page fault performance because it has strong locality of reference.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/mmap.c |   36 ++++++++----------------------------
 1 files changed, 8 insertions(+), 28 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index eae90af..b9c0241 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1603,39 +1603,19 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
 
 EXPORT_SYMBOL(find_vma);
 
-/* Same as find_vma, but also return a pointer to the previous VMA in *pprev. */
+/*
+ * Same as find_vma, but also return a pointer to the previous VMA in *pprev.
+ * Note: pprev is set to NULL when return value is NULL.
+ */
 struct vm_area_struct *
 find_vma_prev(struct mm_struct *mm, unsigned long addr,
 			struct vm_area_struct **pprev)
 {
-	struct vm_area_struct *vma = NULL, *prev = NULL;
-	struct rb_node *rb_node;
-	if (!mm)
-		goto out;
-
-	/* Guard against addr being lower than the first VMA */
-	vma = mm->mmap;
-
-	/* Go through the RB tree quickly. */
-	rb_node = mm->mm_rb.rb_node;
-
-	while (rb_node) {
-		struct vm_area_struct *vma_tmp;
-		vma_tmp = rb_entry(rb_node, struct vm_area_struct, vm_rb);
-
-		if (addr < vma_tmp->vm_end) {
-			rb_node = rb_node->rb_left;
-		} else {
-			prev = vma_tmp;
-			if (!prev->vm_next || (addr < prev->vm_next->vm_end))
-				break;
-			rb_node = rb_node->rb_right;
-		}
-	}
+	struct vm_area_struct *vma;
 
-out:
-	*pprev = prev;
-	return prev ? prev->vm_next : vma;
+	vma = find_vma(mm, addr);
+	*pprev = vma ? vma->vm_prev : NULL;
+	return vma;
 }
 
 /*
-- 
1.7.1

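For readers who want to poke at the v3 logic outside the kernel, here is a minimal userspace sketch. It is not kernel code: the two structs are stripped-down stand-ins, find_vma() is replaced by a linear walk instead of the rb-tree plus mmap_cache lookup, and link_vmas() is a hypothetical helper that exists only to build a test list. Only the body of find_vma_prev() follows the patch.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct vm_area_struct {
	unsigned long vm_start, vm_end;		/* covers [vm_start, vm_end) */
	struct vm_area_struct *vm_next, *vm_prev;
};

struct mm_struct {
	struct vm_area_struct *mmap;		/* head of the address-sorted list */
};

/* Linear stand-in for find_vma(): first VMA with addr < vm_end, else NULL. */
static struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
{
	struct vm_area_struct *vma;

	for (vma = mm ? mm->mmap : NULL; vma; vma = vma->vm_next)
		if (addr < vma->vm_end)
			return vma;
	return NULL;
}

/* Same shape as the v3 patch: derive prev from the VMA that was found. */
static struct vm_area_struct *
find_vma_prev(struct mm_struct *mm, unsigned long addr,
	      struct vm_area_struct **pprev)
{
	struct vm_area_struct *vma = find_vma(mm, addr);

	*pprev = vma ? vma->vm_prev : NULL;
	return vma;
}

/* Hypothetical helper: link an address-sorted array into a doubly linked list. */
static void link_vmas(struct mm_struct *mm, struct vm_area_struct *v, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		v[i].vm_prev = i ? &v[i - 1] : NULL;
		v[i].vm_next = i + 1 < n ? &v[i + 1] : NULL;
	}
	mm->mmap = n ? &v[0] : NULL;
}
```

With three VMAs linked this way, an address inside or just below the second VMA yields the second VMA with *pprev pointing at the first, and an address above every VMA yields NULL with *pprev also NULL, which is what the new comment in the patch documents.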
* Re: [PATCH v3] mm: simplify find_vma_prev
From: KAMEZAWA Hiroyuki @ 2011-12-12 0:49 UTC
To: kosaki.motohiro
Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

On Fri, 9 Dec 2011 17:48:40 -0500
kosaki.motohiro@gmail.com wrote:

> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
> commit 297c5eee37 (mm: make the vma list be doubly linked) added the
> vm_prev member to vm_area_struct. Therefore we can simplify
> find_vma_prev() by using it. Also, this change helps to improve
> page fault performance because it has strong locality of reference.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

* Re: [PATCH v3] mm: simplify find_vma_prev
From: KAMEZAWA Hiroyuki @ 2011-12-12 9:27 UTC
To: KAMEZAWA Hiroyuki
Cc: kosaki.motohiro, linux-mm, linux-kernel, KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

On Mon, 12 Dec 2011 09:49:30 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Fri, 9 Dec 2011 17:48:40 -0500
> kosaki.motohiro@gmail.com wrote:
>
> > From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> >
> > commit 297c5eee37 (mm: make the vma list be doubly linked) added the
> > vm_prev member to vm_area_struct. Therefore we can simplify
> > find_vma_prev() by using it. Also, this change helps to improve
> > page fault performance because it has strong locality of reference.
> >
> > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>

Hmm, your work reminds me of a patch I tried in the past.
Here is a refreshed one... what do you think?

==
From c0261936fc01322d06425731d33f38b2021e8067 Mon Sep 17 00:00:00 2001
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Date: Mon, 12 Dec 2011 18:31:19 +0900
Subject: [PATCH] per thread vma cache.

This is a toy patch. What do you think?

This is a patch for a per-thread mmap_cache without heavy atomic ops.

I'm sure the overhead of find_vma() is pretty small in usual applications,
so this will not show a big improvement. But I think that if we need
to have a cache of vmas, it should be per thread rather than per mm.

This patch adds thread->mmap_cache, a pointer to a vm_area_struct,
and updates it appropriately. Because we have no refcnt on vm_area_struct,
thread->mmap_cache may be a stale pointer. This patch detects a stale
pointer by checking

  - thread->mmap_cache is one of the SLAB objects in vm_area_cachep.
  - thread->mmap_cache->vm_mm == mm.

vma->vm_mm will be cleared before kmem_cache_free() by this patch.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Note:
  Kosaki's work will merge find_vma_prev() and find_vma(). Then we'll
  cover most cases just by modifying find_vma().
---
 include/linux/mm_types.h |    2 +
 include/linux/sched.h    |    1 +
 include/linux/slab_def.h |   13 ++++++++++
 include/linux/slub_def.h |   12 +++++++++
 init/Kconfig             |    5 ++++
 kernel/fork.c            |    3 +-
 mm/mmap.c                |   61 +++++++++++++++++++++++++++++++++++++++-------
 mm/nommu.c               |    4 +-
 8 files changed, 89 insertions(+), 12 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 81a56df..8a9be1a 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -255,6 +255,8 @@ struct vm_area_struct {
 #endif
 };
 
+extern void free_vma(struct vm_area_struct *vma);
+
 struct core_thread {
 	struct task_struct *task;
 	struct core_thread *next;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index cbb5d3e..a161c2b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1294,6 +1294,7 @@ struct task_struct {
 #endif
 
 	struct mm_struct *mm, *active_mm;
+	struct vm_area_struct *mmap_cache;
 #ifdef CONFIG_COMPAT_BRK
 	unsigned brk_randomized:1;
 #endif
diff --git a/include/linux/slab_def.h b/include/linux/slab_def.h
index d00e0ba..763c1d9 100644
--- a/include/linux/slab_def.h
+++ b/include/linux/slab_def.h
@@ -214,4 +214,17 @@ found:
 
 #endif	/* CONFIG_NUMA */
 
+/*
+ * Check the object is under specified kmem_cache.
+ */
+static inline bool is_kmem_cache(void *data, struct kmem_cache *s)
+{
+	struct page *page;
+
+	page = virt_to_head_page(data);
+	if (PageSlab(page) && page->lru.prev == s)
+		return true;
+	return false;
+}
+
 #endif	/* _LINUX_SLAB_DEF_H */
diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index a32bcfd..9eba7e7 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -10,6 +10,7 @@
 #include <linux/gfp.h>
 #include <linux/workqueue.h>
 #include <linux/kobject.h>
+#include <linux/mm.h>
 
 #include <linux/kmemleak.h>
 
@@ -313,4 +314,15 @@ static __always_inline void *kmalloc_node(size_t size, gfp_t flags, int node)
 }
 #endif
 
+/*
+ * Check the object is under specified kmem cache.
+ */
+static inline bool is_kmem_cache(void *data, struct kmem_cache *s)
+{
+	struct page *page = virt_to_head_page(data);
+
+	if (PageSlab(page) && page->slab == s)
+		return true;
+	return false;
+}
 #endif /* _LINUX_SLUB_DEF_H */
diff --git a/init/Kconfig b/init/Kconfig
index 6dfc8c3..7fcfffd 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1266,6 +1266,11 @@ config SLOB
 
 endchoice
 
+config PER_THREAD_MMAP_CACHE
+	bool
+	default y
+	depends on SLAB || SLUB
+
 config MMAP_ALLOW_UNINITIALIZED
 	bool "Allow mmapped anonymous memory to be uninitialized"
 	depends on EXPERT && !MMU
diff --git a/kernel/fork.c b/kernel/fork.c
index e20518d..18d73c2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -432,7 +432,7 @@ out:
 fail_nomem_anon_vma_fork:
 	mpol_put(pol);
 fail_nomem_policy:
-	kmem_cache_free(vm_area_cachep, tmp);
+	free_vma(tmp);
 fail_nomem:
 	retval = -ENOMEM;
 	vm_unacct_memory(charge);
@@ -825,6 +825,7 @@ good_mm:
 
 	tsk->mm = mm;
 	tsk->active_mm = mm;
+	tsk->mmap_cache = NULL;
 	return 0;
 
 fail_nomem:
diff --git a/mm/mmap.c b/mm/mmap.c
index 83813fa..7b86e05 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -238,7 +238,7 @@ static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
 		removed_exe_file_vma(vma->vm_mm);
 	}
 	mpol_put(vma_policy(vma));
-	kmem_cache_free(vm_area_cachep, vma);
+	free_vma(vma);
 	return next;
 }
 
@@ -478,8 +478,11 @@ __vma_unlink(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (next)
 		next->vm_prev = prev;
 	rb_erase(&vma->vm_rb, &mm->mm_rb);
-	if (mm->mmap_cache == vma)
+	if (mm->mmap_cache == vma) {
 		mm->mmap_cache = prev;
+		if (current->mm == mm)
+			current->mmap_cache = prev;
+	}
 }
 
 /*
@@ -642,7 +645,7 @@ again:			remove_next = 1 + (end > next->vm_end);
 		anon_vma_merge(vma, next);
 		mm->map_count--;
 		mpol_put(vma_policy(next));
-		kmem_cache_free(vm_area_cachep, next);
+		free_vma(next);
 		/*
 		 * In mprotect's case 6 (see comments on vma_merge),
 		 * we must remove another next too. It would clutter
@@ -1364,7 +1367,7 @@ unmap_and_free_vma:
 	unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);
 	charged = 0;
 free_vma:
-	kmem_cache_free(vm_area_cachep, vma);
+	free_vma(vma);
 unacct_error:
 	if (charged)
 		vm_unacct_memory(charged);
@@ -1588,10 +1591,42 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 
 EXPORT_SYMBOL(get_unmapped_area);
 
+#ifdef CONFIG_PER_THREAD_MMAP_CACHE
+static struct vm_area_struct *thread_mmap_cache(struct mm_struct *mm)
+{
+	struct vm_area_struct *vma = current->mmap_cache;
+
+	if (!vma || current->mm != mm)
+		return NULL;
+
+	if ((vma->vm_mm != mm) || !is_kmem_cache(vma, vm_area_cachep))
+		return NULL;
+
+	return vma;
+}
+
+static void set_thread_mmap_cache(struct mm_struct *mm,
+		struct vm_area_struct *vma)
+{
+	if (current->mm == mm)
+		current->mmap_cache = vma;
+}
+#else
+static struct vm_area_struct *thread_mmap_cache(struct mm_struct *mm)
+{
+	return NULL;
+}
+
+static void set_thread_mmap_cache(struct mm_struct *mm,
+		struct vm_area_struct *vma)
+{
+}
+#endif
+
 /* Look up the first VMA which satisfies addr < vm_end, NULL if none. */
 struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
 {
-	struct vm_area_struct *vma = NULL;
+	struct vm_area_struct *vma = thread_mmap_cache(mm);
 
 	if (mm) {
 		/* Check the cache first. */
@@ -1617,8 +1652,10 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
 				} else
 					rb_node = rb_node->rb_right;
 			}
-			if (vma)
+			if (vma) {
 				mm->mmap_cache = vma;
+				set_thread_mmap_cache(mm, vma);
+			}
 		}
 	}
 	return vma;
@@ -2017,7 +2054,7 @@ static int __split_vma(struct mm_struct * mm, struct vm_area_struct * vma,
 out_free_mpol:
 	mpol_put(pol);
 out_free_vma:
-	kmem_cache_free(vm_area_cachep, new);
+	free_vma(new);
 out_err:
 	return err;
 }
@@ -2400,7 +2437,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 out_free_mempol:
 	mpol_put(pol);
 out_free_vma:
-	kmem_cache_free(vm_area_cachep, new_vma);
+	free_vma(new_vma);
 	return NULL;
 }
 
@@ -2506,7 +2543,7 @@ int install_special_mapping(struct mm_struct *mm,
 	return 0;
 
 out:
-	kmem_cache_free(vm_area_cachep, vma);
+	free_vma(vma);
 	return ret;
 }
 
@@ -2675,6 +2712,12 @@ void mm_drop_all_locks(struct mm_struct *mm)
 	mutex_unlock(&mm_all_locks_mutex);
 }
 
+void free_vma(struct vm_area_struct *vma)
+{
+	vma->vm_mm = NULL;
+	kmem_cache_free(vm_area_cachep, vma);
+}
+
 /*
  * initialise the VMA slab
  */
diff --git a/mm/nommu.c b/mm/nommu.c
index b982290..3c98fd5 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -793,7 +793,7 @@ static void delete_vma(struct mm_struct *mm, struct vm_area_struct *vma)
 		removed_exe_file_vma(mm);
 	}
 	put_nommu_region(vma->vm_region);
-	kmem_cache_free(vm_area_cachep, vma);
+	free_vma(vma);
 }
 
 /*
@@ -1443,7 +1443,7 @@ error:
 		fput(vma->vm_file);
 	if (vma->vm_flags & VM_EXECUTABLE)
 		removed_exe_file_vma(vma->vm_mm);
-	kmem_cache_free(vm_area_cachep, vma);
+	free_vma(vma);
 	kleave(" = %d", ret);
 	return ret;
 
-- 
1.7.4.1

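The stale-pointer check described in the patch above can be modelled in userspace roughly as follows. This is only a sketch under loud assumptions: a fixed array (vma_pool) stands in for the vm_area_cachep slab, in_vma_pool() stands in for is_kmem_cache(), struct thread_model stands in for the new task_struct field, and alloc_vma() is a hypothetical first-fit allocator. Unlike the posted hunk, the sketch tests pool membership before it dereferences the cached pointer; that ordering is a choice made here, not something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mm_struct { int id; };

struct vm_area_struct {
	struct mm_struct *vm_mm;	/* owner; cleared by free_vma() */
	unsigned long vm_start, vm_end;
	bool in_use;
};

/* Fixed pool standing in for the vm_area_cachep slab cache (assumption). */
#define POOL_SIZE 4
static struct vm_area_struct vma_pool[POOL_SIZE];

/* Hypothetical first-fit allocator: freed slots are handed out again. */
static struct vm_area_struct *alloc_vma(struct mm_struct *mm,
					unsigned long start, unsigned long end)
{
	size_t i;

	for (i = 0; i < POOL_SIZE; i++) {
		struct vm_area_struct *vma = &vma_pool[i];

		if (vma->in_use)
			continue;
		vma->in_use = true;
		vma->vm_mm = mm;
		vma->vm_start = start;
		vma->vm_end = end;
		return vma;
	}
	return NULL;
}

/* As in the patch: clear the owner before the object goes back to the pool. */
static void free_vma(struct vm_area_struct *vma)
{
	vma->vm_mm = NULL;
	vma->in_use = false;
}

/* Analogue of is_kmem_cache(): does p point at an object slot in the pool? */
static bool in_vma_pool(const void *p)
{
	uintptr_t a = (uintptr_t)p, base = (uintptr_t)vma_pool;

	return a >= base && a < base + sizeof(vma_pool) &&
	       (a - base) % sizeof(vma_pool[0]) == 0;
}

/* Stand-in for the task_struct fields the patch touches. */
struct thread_model {
	struct mm_struct *mm;
	struct vm_area_struct *mmap_cache;	/* no refcount, may be stale */
};

static struct vm_area_struct *thread_mmap_cache(struct thread_model *t,
						struct mm_struct *mm)
{
	struct vm_area_struct *vma = t->mmap_cache;

	if (!vma || t->mm != mm)
		return NULL;
	/* Only dereference once the pointer is known to lie in the pool. */
	if (!in_vma_pool(vma) || vma->vm_mm != mm)
		return NULL;
	return vma;
}
```

In this model a freed slot is rejected (vm_mm is NULL), and a slot recycled by another mm is rejected as well. A slot recycled by the same mm passes both checks even though it now describes a different range, so the caller still has to compare the address against vm_start and vm_end before trusting the hit.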
* Re: [PATCH v3] mm: simplify find_vma_prev
From: KOSAKI Motohiro @ 2011-12-12 15:31 UTC
To: KAMEZAWA Hiroyuki
Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

(12/12/11 4:27 AM), KAMEZAWA Hiroyuki wrote:
> On Mon, 12 Dec 2011 09:49:30 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
>> On Fri, 9 Dec 2011 17:48:40 -0500
>> kosaki.motohiro@gmail.com wrote:
>>
>>> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>>
>>> commit 297c5eee37 (mm: make the vma list be doubly linked) added the
>>> vm_prev member to vm_area_struct. Therefore we can simplify
>>> find_vma_prev() by using it. Also, this change helps to improve
>>> page fault performance because it has strong locality of reference.
>>>
>>> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>
>> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>>
>
> Hmm, your work reminds me of a patch I tried in the past.
> Here is a refreshed one... what do you think?
>
> ==
> From c0261936fc01322d06425731d33f38b2021e8067 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Mon, 12 Dec 2011 18:31:19 +0900
> Subject: [PATCH] per thread vma cache.
>
> This is a toy patch. What do you think?
>
> This is a patch for a per-thread mmap_cache without heavy atomic ops.
>
> I'm sure the overhead of find_vma() is pretty small in usual applications,
> so this will not show a big improvement. But I think that if we need
> to have a cache of vmas, it should be per thread rather than per mm.

Agreed. per-thread is better.

> This patch adds thread->mmap_cache, a pointer to a vm_area_struct,
> and updates it appropriately. Because we have no refcnt on vm_area_struct,
> thread->mmap_cache may be a stale pointer. This patch detects a stale
> pointer by checking
>
>   - thread->mmap_cache is one of the SLAB objects in vm_area_cachep.
>   - thread->mmap_cache->vm_mm == mm.
>
> vma->vm_mm will be cleared before kmem_cache_free() by this patch.

Do you mean the cache can produce a false hit on an unrelated vma when a
freed vma is reused?
If so, that is the trickiest part of this patch; I strongly hope you will
add more comments.

Thank you.

* Re: [PATCH v3] mm: simplify find_vma_prev
From: KAMEZAWA Hiroyuki @ 2011-12-13 4:25 UTC
To: KOSAKI Motohiro
Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

On Mon, 12 Dec 2011 10:31:57 -0500
KOSAKI Motohiro <kosaki.motohiro@gmail.com> wrote:

> (12/12/11 4:27 AM), KAMEZAWA Hiroyuki wrote:
> > On Mon, 12 Dec 2011 09:49:30 +0900
> > KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> >
> >> On Fri, 9 Dec 2011 17:48:40 -0500
> >> kosaki.motohiro@gmail.com wrote:
> >>
> >>> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> >>>
> >>> commit 297c5eee37 (mm: make the vma list be doubly linked) added the
> >>> vm_prev member to vm_area_struct. Therefore we can simplify
> >>> find_vma_prev() by using it. Also, this change helps to improve
> >>> page fault performance because it has strong locality of reference.
> >>>
> >>> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> >>
> >> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> >>
> >
> > Hmm, your work reminds me of a patch I tried in the past.
> > Here is a refreshed one... what do you think?
> >
> > ==
> > From c0261936fc01322d06425731d33f38b2021e8067 Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Date: Mon, 12 Dec 2011 18:31:19 +0900
> > Subject: [PATCH] per thread vma cache.
> >
> > This is a toy patch. What do you think?
> >
> > This is a patch for a per-thread mmap_cache without heavy atomic ops.
> >
> > I'm sure the overhead of find_vma() is pretty small in usual applications,
> > so this will not show a big improvement. But I think that if we need
> > to have a cache of vmas, it should be per thread rather than per mm.
>
> Agreed. per-thread is better.
>
> > This patch adds thread->mmap_cache, a pointer to a vm_area_struct,
> > and updates it appropriately. Because we have no refcnt on vm_area_struct,
> > thread->mmap_cache may be a stale pointer. This patch detects a stale
> > pointer by checking
> >
> >   - thread->mmap_cache is one of the SLAB objects in vm_area_cachep.
> >   - thread->mmap_cache->vm_mm == mm.
> >
> > vma->vm_mm will be cleared before kmem_cache_free() by this patch.
>
> Do you mean the cache can produce a false hit on an unrelated vma when a
> freed vma is reused?

yes.

> If so, that is the trickiest part of this patch; I strongly hope you will
> add more comments.
>

Sure.

-Kame

* Re: [PATCH v3] mm: simplify find_vma_prev
From: Michal Hocko @ 2011-12-12 13:26 UTC
To: kosaki.motohiro
Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

On Fri 09-12-11 17:48:40, kosaki.motohiro@gmail.com wrote:
> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>
> commit 297c5eee37 (mm: make the vma list be doubly linked) added the
> vm_prev member to vm_area_struct. Therefore we can simplify
> find_vma_prev() by using it. Also, this change helps to improve
> page fault performance because it has strong locality of reference.
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
>  mm/mmap.c |   36 ++++++++----------------------------
>  1 files changed, 8 insertions(+), 28 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index eae90af..b9c0241 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1603,39 +1603,19 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
>
>  EXPORT_SYMBOL(find_vma);
>
> -/* Same as find_vma, but also return a pointer to the previous VMA in *pprev. */
> +/*
> + * Same as find_vma, but also return a pointer to the previous VMA in *pprev.
> + * Note: pprev is set to NULL when return value is NULL.
> + */
>  struct vm_area_struct *
>  find_vma_prev(struct mm_struct *mm, unsigned long addr,
>  			struct vm_area_struct **pprev)
>  {
> -	struct vm_area_struct *vma = NULL, *prev = NULL;
> -	struct rb_node *rb_node;
> -	if (!mm)
> -		goto out;
> -
> -	/* Guard against addr being lower than the first VMA */
> -	vma = mm->mmap;

Why have you removed this guard? Previously we had pprev==NULL and
returned mm->mmap.
This seems like a semantic change without any explanation. Could you
clarify?

> -
> -	/* Go through the RB tree quickly. */
> -	rb_node = mm->mm_rb.rb_node;
> -
> -	while (rb_node) {
> -		struct vm_area_struct *vma_tmp;
> -		vma_tmp = rb_entry(rb_node, struct vm_area_struct, vm_rb);
> -
> -		if (addr < vma_tmp->vm_end) {
> -			rb_node = rb_node->rb_left;
> -		} else {
> -			prev = vma_tmp;
> -			if (!prev->vm_next || (addr < prev->vm_next->vm_end))
> -				break;
> -			rb_node = rb_node->rb_right;
> -		}
> -	}
> +	struct vm_area_struct *vma;
>
> -out:
> -	*pprev = prev;
> -	return prev ? prev->vm_next : vma;
> +	vma = find_vma(mm, addr);
> +	*pprev = vma ? vma->vm_prev : NULL;
> +	return vma;
>  }
>
>  /*
> --
> 1.7.1

-- 
Michal Hocko
SUSE Labs
SUSE LINUX s.r.o.
Lihovarska 1060/12
190 00 Praha 9
Czech Republic

* Re: [PATCH v3] mm: simplify find_vma_prev
From: KOSAKI Motohiro @ 2011-12-12 14:49 UTC
To: Michal Hocko
Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

(12/12/11 8:26 AM), Michal Hocko wrote:
> On Fri 09-12-11 17:48:40, kosaki.motohiro@gmail.com wrote:
>> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>>
>> commit 297c5eee37 (mm: make the vma list be doubly linked) added the
>> vm_prev member to vm_area_struct. Therefore we can simplify
>> find_vma_prev() by using it. Also, this change helps to improve
>> page fault performance because it has strong locality of reference.
>>
>> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>> ---
>>  mm/mmap.c |   36 ++++++++----------------------------
>>  1 files changed, 8 insertions(+), 28 deletions(-)
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index eae90af..b9c0241 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -1603,39 +1603,19 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
>>
>>  EXPORT_SYMBOL(find_vma);
>>
>> -/* Same as find_vma, but also return a pointer to the previous VMA in *pprev. */
>> +/*
>> + * Same as find_vma, but also return a pointer to the previous VMA in *pprev.
>> + * Note: pprev is set to NULL when return value is NULL.
>> + */
>>  struct vm_area_struct *
>>  find_vma_prev(struct mm_struct *mm, unsigned long addr,
>>  			struct vm_area_struct **pprev)
>>  {
>> -	struct vm_area_struct *vma = NULL, *prev = NULL;
>> -	struct rb_node *rb_node;
>> -	if (!mm)
>> -		goto out;
>> -
>> -	/* Guard against addr being lower than the first VMA */
>> -	vma = mm->mmap;
>
> Why have you removed this guard? Previously we had pprev==NULL and
> returned mm->mmap.
> This seems like a semantic change without any explanation. Could you
> clarify?

IIUC, find_vma_prev() is not exported to modules, and none of the known
callers use pprev==NULL. Thus, I thought it could be simplified as well.
Am I missing something?

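The guard question can be checked against a small userspace model. The sketch below is an assumption-laden illustration, not kernel code: old_find_vma_prev() replaces the removed rb-tree walk with a list walk that computes the same prev (the last VMA lying entirely below addr), new_find_vma_prev() follows the v3 body on top of a linear find_vma() stand-in, and the structs, the old_/new_ names and link_vmas() are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct vm_area_struct {
	unsigned long vm_start, vm_end;		/* covers [vm_start, vm_end) */
	struct vm_area_struct *vm_next, *vm_prev;
};

struct mm_struct {
	struct vm_area_struct *mmap;		/* head of the address-sorted list */
};

/* List-based model of the removed code (the original walked the rb-tree). */
static struct vm_area_struct *
old_find_vma_prev(struct mm_struct *mm, unsigned long addr,
		  struct vm_area_struct **pprev)
{
	struct vm_area_struct *vma, *prev = NULL;

	if (!mm) {
		*pprev = NULL;
		return NULL;
	}
	/* prev ends up as the last VMA that lies entirely below addr. */
	for (vma = mm->mmap; vma && vma->vm_end <= addr; vma = vma->vm_next)
		prev = vma;

	*pprev = prev;
	return prev ? prev->vm_next : mm->mmap;
}

/* Linear stand-in for find_vma(): first VMA with addr < vm_end, else NULL. */
static struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
{
	struct vm_area_struct *vma;

	for (vma = mm ? mm->mmap : NULL; vma; vma = vma->vm_next)
		if (addr < vma->vm_end)
			return vma;
	return NULL;
}

/* Model of the v3 version. */
static struct vm_area_struct *
new_find_vma_prev(struct mm_struct *mm, unsigned long addr,
		  struct vm_area_struct **pprev)
{
	struct vm_area_struct *vma = find_vma(mm, addr);

	*pprev = vma ? vma->vm_prev : NULL;
	return vma;
}

/* Hypothetical helper: link an address-sorted array into a doubly linked list. */
static void link_vmas(struct mm_struct *mm, struct vm_area_struct *v, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		v[i].vm_prev = i ? &v[i - 1] : NULL;
		v[i].vm_next = i + 1 < n ? &v[i + 1] : NULL;
	}
	mm->mmap = n ? &v[0] : NULL;
}
```

In this model both versions return the first VMA with *pprev == NULL for an address below every VMA, so the removed guard is covered by find_vma() itself. The two differ only for an address above every VMA: both return NULL, but the old version leaves the last VMA in *pprev while the new one stores NULL, which matches the "Note:" line added to the function comment.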
* Re: [PATCH v3] mm: simplify find_vma_prev
From: Michal Hocko @ 2011-12-12 19:15 UTC
To: kosaki.motohiro
Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

On Mon 12-12-11 14:26:16, Michal Hocko wrote:
> On Fri 09-12-11 17:48:40, kosaki.motohiro@gmail.com wrote:
> > From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> >
> > commit 297c5eee37 (mm: make the vma list be doubly linked) added the
> > vm_prev member to vm_area_struct. Therefore we can simplify
> > find_vma_prev() by using it. Also, this change helps to improve
> > page fault performance because it has strong locality of reference.
> >
> > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > ---
> >  mm/mmap.c |   36 ++++++++----------------------------
> >  1 files changed, 8 insertions(+), 28 deletions(-)
> >
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index eae90af..b9c0241 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1603,39 +1603,19 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
> >
> >  EXPORT_SYMBOL(find_vma);
> >
> > -/* Same as find_vma, but also return a pointer to the previous VMA in *pprev. */
> > +/*
> > + * Same as find_vma, but also return a pointer to the previous VMA in *pprev.
> > + * Note: pprev is set to NULL when return value is NULL.
> > + */
> >  struct vm_area_struct *
> >  find_vma_prev(struct mm_struct *mm, unsigned long addr,
> >  			struct vm_area_struct **pprev)
> >  {
> > -	struct vm_area_struct *vma = NULL, *prev = NULL;
> > -	struct rb_node *rb_node;
> > -	if (!mm)
> > -		goto out;
> > -
> > -	/* Guard against addr being lower than the first VMA */
> > -	vma = mm->mmap;
>
> Why have you removed this guard? Previously we had pprev==NULL and
> returned mm->mmap.
> This seems like a semantic change without any explanation. Could you
> clarify?

Scratch that. I have misread the code. find_vma will return mm->mmap if
the given address is below all vmas. Sorry about the noise.

The only concern left would be the caching. Are you sure this will not
break some workloads which benefit from mmap_cache usage and would
interfere with find_vma_prev callers now? Anyway this could be fixed
trivially.

Thanks
-- 
Michal Hocko
SUSE Labs

* Re: [PATCH v3] mm: simplify find_vma_prev
From: KOSAKI Motohiro @ 2011-12-12 19:24 UTC
To: Michal Hocko
Cc: linux-mm, linux-kernel, Andrew Morton, Hugh Dickins, Peter Zijlstra, Shaohua Li

>> Why have you removed this guard? Previously we had pprev==NULL and
>> returned mm->mmap.
>> This seems like a semantic change without any explanation. Could you
>> clarify?
>
> Scratch that. I have misread the code. find_vma will return mm->mmap if
> the given address is below all vmas. Sorry about the noise.
>
> The only concern left would be the caching. Are you sure this will not
> break some workloads which benefit from mmap_cache usage and would
> interfere with find_vma_prev callers now? Anyway this could be fixed
> trivially.

Here is the list of callers.

find_vma_prev  115 arch/ia64/mm/fault.c         vma = find_vma_prev(mm, address, &prev_vma);
find_vma_prev  183 arch/parisc/mm/fault.c       vma = find_vma_prev(mm, address, &prev_vma);
find_vma_prev  229 arch/tile/mm/hugetlbpage.c   vma = find_vma_prev(mm, addr, &prev_vma);
find_vma_prev  336 arch/x86/mm/hugetlbpage.c    if (!(vma = find_vma_prev(mm, addr, &prev_vma)))
find_vma_prev  388 mm/madvise.c                 vma = find_vma_prev(current->mm, start, &prev);
find_vma_prev  642 mm/mempolicy.c               vma = find_vma_prev(mm, start, &prev);
find_vma_prev  388 mm/mlock.c                   vma = find_vma_prev(current->mm, start, &prev);
find_vma_prev  265 mm/mprotect.c                vma = find_vma_prev(current->mm, start, &prev);

In short, find_vma_prev() is only used by page fault, madvise, mbind, mlock
and mprotect. Page fault is the only performance-sensitive call site, because
the others are not used frequently in regular workloads.

So I wouldn't say this patch has zero negative impact, but I think it is
small enough and the benefit is large enough.

Thanks.

* Re: [PATCH v3] mm: simplify find_vma_prev
  2011-12-12 19:24 ` KOSAKI Motohiro
@ 2011-12-12 19:34 ` KOSAKI Motohiro
  0 siblings, 0 replies; 13+ messages in thread
From: KOSAKI Motohiro @ 2011-12-12 19:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, linux-kernel, Andrew Morton, Hugh Dickins,
      Peter Zijlstra, Shaohua Li

(12/12/11 2:24 PM), KOSAKI Motohiro wrote:
>>> Why have you removed this guard? Previously we had pprev==NULL and
>>> returned mm->mmap.
>>> This seems like a semantic change without any explanation. Could you
>>> clarify?
>>
>> Scratch that. I have misread the code. find_vma will return mm->mmap if
>> the given address is below all vmas. Sorry about the noise.
>>
>> The only concern left would be the caching. Are you sure this will not
>> break some workloads which benefit from mmap_cache usage and would
>> interfere with find_vma_prev callers now? Anyway this could be fixed
>> trivially.
>
> Here is the list of callers.
>
> find_vma_prev  115  arch/ia64/mm/fault.c        vma = find_vma_prev(mm, address, &prev_vma);
> find_vma_prev  183  arch/parisc/mm/fault.c      vma = find_vma_prev(mm, address, &prev_vma);
> find_vma_prev  229  arch/tile/mm/hugetlbpage.c  vma = find_vma_prev(mm, addr, &prev_vma);
> find_vma_prev  336  arch/x86/mm/hugetlbpage.c   if (!(vma = find_vma_prev(mm, addr, &prev_vma)))
> find_vma_prev  388  mm/madvise.c                vma = find_vma_prev(current->mm, start, &prev);
> find_vma_prev  642  mm/mempolicy.c              vma = find_vma_prev(mm, start, &prev);
> find_vma_prev  388  mm/mlock.c                  vma = find_vma_prev(current->mm, start, &prev);
> find_vma_prev  265  mm/mprotect.c               vma = find_vma_prev(current->mm, start, &prev);
>
> In short, find_vma_prev() is only used from page fault, madvise, mbind,
> mlock and mprotect. And page fault is the only performance-sensitive call
> site, because the others are not used frequently in regular workloads.
>
> So I wouldn't say this patch has zero negative impact, but I think the
> impact is small enough and the benefit is large enough.

In addition, the other call sites (i.e. madvise, mbind, mlock and
mprotect) are reached from syscalls, so the optimal behavior depends on
the syscall arguments. IOW, neither we nor the kernel can know it ahead
of time. Therefore, this change may improve the performance of some
applications a bit and may degrade that of others. I can reasonably
guess that the former outnumber the latter, because many apps have
locality, but I can't prove it.

Anyway, I think the impact is small enough. These calls are rarer than
page faults.
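The caching trade-off discussed above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the structures are cut-down stand-ins for the kernel ones, a linear list walk replaces the rb-tree walk, `slow_walks` is a hypothetical counter that does not exist in the kernel, and `find_vma_prev_uncached()` models only the fact that the old function bypassed mmap_cache, not its exact return semantics. The function `count_slow_walks()` alternates a fault-style find_vma() in one VMA with a find_vma_prev() call at `prev_addr`.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (illustration only). */
struct vm_area_struct {
	unsigned long vm_start, vm_end;
	struct vm_area_struct *vm_next, *vm_prev;
};

struct mm_struct {
	struct vm_area_struct *mmap;       /* address-sorted VMA list */
	struct vm_area_struct *mmap_cache; /* last VMA found by find_vma() */
	unsigned long slow_walks;          /* hypothetical counter, not a kernel field */
};

/* Uncached lookup; a list walk stands in for the rb-tree walk. */
static struct vm_area_struct *lookup_slow(struct mm_struct *mm,
					  unsigned long addr)
{
	struct vm_area_struct *vma;

	mm->slow_walks++;
	for (vma = mm->mmap; vma; vma = vma->vm_next)
		if (addr < vma->vm_end)
			return vma;
	return NULL;
}

/* find_vma() modelled with a one-entry cache checked before the walk. */
static struct vm_area_struct *find_vma(struct mm_struct *mm,
				       unsigned long addr)
{
	struct vm_area_struct *vma = mm->mmap_cache;

	if (vma && vma->vm_start <= addr && addr < vma->vm_end)
		return vma;
	vma = lookup_slow(mm, addr);
	if (vma)
		mm->mmap_cache = vma;
	return vma;
}

/* Models the pre-patch behaviour: never reads or updates the cache. */
static struct vm_area_struct *
find_vma_prev_uncached(struct mm_struct *mm, unsigned long addr,
		       struct vm_area_struct **pprev)
{
	struct vm_area_struct *vma = lookup_slow(mm, addr);

	*pprev = vma ? vma->vm_prev : NULL;
	return vma;
}

/* Models the patched behaviour: goes through find_vma() and its cache. */
static struct vm_area_struct *
find_vma_prev_cached(struct mm_struct *mm, unsigned long addr,
		     struct vm_area_struct **pprev)
{
	struct vm_area_struct *vma = find_vma(mm, addr);

	*pprev = vma ? vma->vm_prev : NULL;
	return vma;
}

/* Alternate a fault-style lookup with find_vma_prev(); return slow walks. */
unsigned long count_slow_walks(int use_cached_prev, unsigned long prev_addr,
			       int rounds)
{
	struct vm_area_struct a = { 0x1000, 0x2000, NULL, NULL };
	struct vm_area_struct b = { 0x3000, 0x4000, NULL, NULL };
	struct mm_struct mm = { &a, NULL, 0 };
	struct vm_area_struct *prev;
	int i;

	assert(rounds >= 0);
	a.vm_next = &b;
	b.vm_prev = &a;

	for (i = 0; i < rounds; i++) {
		find_vma(&mm, 0x1800);	/* fault-path lookup inside VMA a */
		if (use_cached_prev)
			find_vma_prev_cached(&mm, prev_addr, &prev);
		else
			find_vma_prev_uncached(&mm, prev_addr, &prev);
	}
	return mm.slow_walks;
}
```

In this model, four rounds with `prev_addr` in the same VMA as the faults need 1 slow walk with the cached variant and 5 with the uncached one, while four rounds with `prev_addr` in the other VMA need 8 and 5 respectively. That matches the argument in the thread: callers with locality gain, callers without it can evict the entry the fault path was using.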
* Re: [PATCH v2] mm: simplify find_vma_prev
  2011-12-09 21:35 ` Joe Perches
  2011-12-09 22:48 ` [PATCH v3] " kosaki.motohiro
@ 2011-12-09 22:49 ` KOSAKI Motohiro
  1 sibling, 0 replies; 13+ messages in thread
From: KOSAKI Motohiro @ 2011-12-09 22:49 UTC (permalink / raw)
  To: Joe Perches
  Cc: linux-mm, linux-kernel, Andrew Morton, Hugh Dickins,
      Peter Zijlstra, Shaohua Li

>> diff --git a/mm/mmap.c b/mm/mmap.c
> []
>> @@ -1603,39 +1603,21 @@ struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
>>
>>  EXPORT_SYMBOL(find_vma);
>>
>> -/* Same as find_vma, but also return a pointer to the previous VMA in *pprev. */
>> +/*
>> + * Same as find_vma, but also return a pointer to the previous VMA in *pprev.
>> + * Note: pprev is set to NULL when return value is NULL.
>> + */
>>  struct vm_area_struct *
>> -find_vma_prev(struct mm_struct *mm, unsigned long addr,
>> -			struct vm_area_struct **pprev)
>
>> +find_vma_prev(struct mm_struct *mm, unsigned long addr, struct vm_area_struct **pprev)
>
> eh. This declaration change seems gratuitous and it exceeds 80 columns.
>
>> +	*pprev = NULL;
>> +	vma = find_vma(mm, addr);
>> +	if (vma)
>> +		*pprev = vma->vm_prev;
>
> There's no need to possibly set *pprev twice.
>
> Maybe
> {
> 	struct vm_area_struct *vma = find_vma(mm, addr);
>
> 	*pprev = vma ? vma->vm_prev : NULL;
> or
> 	if (vma)
> 		*pprev = vma->vm_prev;
> 	else
> 		*pprev = NULL;
>
> 	return vma;

Thank you for reviewing. Updated.
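For reference, the form suggested in the review can be exercised outside the kernel with a minimal sketch like the one below. The structures are reduced stand-ins for the kernel types, the find_vma() here is a plain list walk rather than the real cached rb-tree lookup, and `check_find_vma_prev()` is a hypothetical helper added only to walk through the three address cases (below all VMAs, in a gap, above all VMAs).

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the kernel types; only the fields used here. */
struct vm_area_struct {
	unsigned long vm_start;	/* first address inside the VMA */
	unsigned long vm_end;	/* first address after the VMA */
	struct vm_area_struct *vm_next, *vm_prev;
};

struct mm_struct {
	struct vm_area_struct *mmap;	/* head of the address-sorted list */
};

/* Mock find_vma(): first VMA with addr < vm_end, or NULL (list walk). */
static struct vm_area_struct *find_vma(struct mm_struct *mm,
				       unsigned long addr)
{
	struct vm_area_struct *vma;

	for (vma = mm->mmap; vma; vma = vma->vm_next)
		if (addr < vma->vm_end)
			return vma;
	return NULL;
}

/* find_vma_prev() in the single-assignment form suggested in review. */
static struct vm_area_struct *
find_vma_prev(struct mm_struct *mm, unsigned long addr,
	      struct vm_area_struct **pprev)
{
	struct vm_area_struct *vma = find_vma(mm, addr);

	*pprev = vma ? vma->vm_prev : NULL;
	return vma;
}

/* Hypothetical helper: checks the three address cases, returns 0. */
int check_find_vma_prev(void)
{
	struct vm_area_struct a = { 0x1000, 0x2000, NULL, NULL };
	struct vm_area_struct b = { 0x3000, 0x4000, NULL, NULL };
	struct mm_struct mm = { &a };
	struct vm_area_struct *vma, *prev;

	a.vm_next = &b;
	b.vm_prev = &a;

	/* Below every VMA: the first VMA is returned, with no predecessor. */
	vma = find_vma_prev(&mm, 0x0800, &prev);
	assert(vma == &a && prev == NULL);

	/* In the gap between a and b: b is returned, a is its predecessor. */
	vma = find_vma_prev(&mm, 0x2800, &prev);
	assert(vma == &b && prev == &a);

	/* Above every VMA: NULL is returned and *pprev is NULL as well. */
	vma = find_vma_prev(&mm, 0x5000, &prev);
	assert(vma == NULL && prev == NULL);

	return 0;
}
```

The last case is the one covered by the new comment in the patch ("pprev is set to NULL when return value is NULL"). It differs from the removed code, which stored the last VMA it visited in *pprev even when it returned NULL.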
end of thread, other threads:[~2011-12-13 4:26 UTC | newest]

Thread overview: 13+ messages
2011-12-09 21:23 [PATCH v2] mm: simplify find_vma_prev kosaki.motohiro
2011-12-09 21:35 ` Joe Perches
2011-12-09 22:48 ` [PATCH v3] " kosaki.motohiro
2011-12-12 0:49 ` KAMEZAWA Hiroyuki
2011-12-12 9:27 ` KAMEZAWA Hiroyuki
2011-12-12 15:31 ` KOSAKI Motohiro
2011-12-13 4:25 ` KAMEZAWA Hiroyuki
2011-12-12 13:26 ` Michal Hocko
2011-12-12 14:49 ` KOSAKI Motohiro
2011-12-12 19:15 ` Michal Hocko
2011-12-12 19:24 ` KOSAKI Motohiro
2011-12-12 19:34 ` KOSAKI Motohiro
2011-12-09 22:49 ` [PATCH v2] " KOSAKI Motohiro