* [PATCH] mm/vmalloc: Keep a separate lazy-free list
From: Chris Wilson @ 2016-04-12  6:57 UTC
To: intel-gfx
Cc: Chris Wilson, Joonas Lahtinen, Tvrtko Ursulin, Daniel Vetter,
    Andrew Morton, David Rientjes, Joonsoo Kim, Roman Pen, Mel Gorman,
    Toshi Kani, Shawn Lin, linux-mm, linux-kernel

When mixing lots of vmallocs and set_memory_*() (which calls
vm_unmap_aliases()) I encountered situations where the performance
degraded severely due to walking the entire vmap_area list on each
invocation. One simple improvement is to add the lazily freed vmap_area
to a separate lockless free list, so that we avoid having to walk the
full list on each purge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Roman Pen <r.peniaev@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Shawn Lin <shawn.lin@rock-chips.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/vmalloc.h |  3 ++-
 mm/vmalloc.c            | 29 ++++++++++++++---------------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 8b51df3ab334..3d9d786a943c 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -4,6 +4,7 @@
 #include <linux/spinlock.h>
 #include <linux/init.h>
 #include <linux/list.h>
+#include <linux/llist.h>
 #include <asm/page.h>           /* pgprot_t */
 #include <linux/rbtree.h>

@@ -45,7 +46,7 @@ struct vmap_area {
        unsigned long flags;
        struct rb_node rb_node;         /* address sorted rbtree */
        struct list_head list;          /* address sorted list */
-       struct list_head purge_list;    /* "lazy purge" list */
+       struct llist_node purge_list;   /* "lazy purge" list */
        struct vm_struct *vm;
        struct rcu_head rcu_head;
 };
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 293889d7f482..5388bf64dc32 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -21,6 +21,7 @@
 #include <linux/debugobjects.h>
 #include <linux/kallsyms.h>
 #include <linux/list.h>
+#include <linux/llist.h>
 #include <linux/notifier.h>
 #include <linux/rbtree.h>
 #include <linux/radix-tree.h>
@@ -282,6 +283,7 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 static DEFINE_SPINLOCK(vmap_area_lock);
 /* Export for kexec only */
 LIST_HEAD(vmap_area_list);
+static LLIST_HEAD(vmap_purge_list);
 static struct rb_root vmap_area_root = RB_ROOT;

 /* The vmap cache globals are protected by vmap_area_lock */
@@ -628,7 +630,7 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
                                        int sync, int force_flush)
 {
        static DEFINE_SPINLOCK(purge_lock);
-       LIST_HEAD(valist);
+       struct llist_node *valist;
        struct vmap_area *va;
        struct vmap_area *n_va;
        int nr = 0;
@@ -647,20 +649,15 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
        if (sync)
                purge_fragmented_blocks_allcpus();

-       rcu_read_lock();
-       list_for_each_entry_rcu(va, &vmap_area_list, list) {
-               if (va->flags & VM_LAZY_FREE) {
-                       if (va->va_start < *start)
-                               *start = va->va_start;
-                       if (va->va_end > *end)
-                               *end = va->va_end;
-                       nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
-                       list_add_tail(&va->purge_list, &valist);
-                       va->flags |= VM_LAZY_FREEING;
-                       va->flags &= ~VM_LAZY_FREE;
-               }
+       valist = llist_del_all(&vmap_purge_list);
+       llist_for_each_entry(va, valist, purge_list) {
+               if (va->va_start < *start)
+                       *start = va->va_start;
+               if (va->va_end > *end)
+                       *end = va->va_end;
+               nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
+               va->flags |= VM_LAZY_FREEING;
        }
-       rcu_read_unlock();

        if (nr)
                atomic_sub(nr, &vmap_lazy_nr);
@@ -670,7 +667,7 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,

        if (nr) {
                spin_lock(&vmap_area_lock);
-               list_for_each_entry_safe(va, n_va, &valist, purge_list)
+               llist_for_each_entry_safe(va, n_va, valist, purge_list)
                        __free_vmap_area(va);
                spin_unlock(&vmap_area_lock);
        }
@@ -706,6 +703,8 @@ static void purge_vmap_area_lazy(void)
 static void free_vmap_area_noflush(struct vmap_area *va)
 {
        va->flags |= VM_LAZY_FREE;
+       llist_add(&va->purge_list, &vmap_purge_list);
+
        atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
        if (unlikely(atomic_read(&vmap_lazy_nr) > lazy_max_pages()))
                try_purge_vmap_area_lazy();
--
2.8.0.rc3
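Aside: the heavy lifting above is done by <linux/llist.h>. Its two
halves, llist_add() (a compare-and-swap push usable from any context)
and llist_del_all() (which detaches the entire list with one atomic
exchange), are what let the purge path see only the lazily freed areas
without taking a lock. Below is a minimal user-space sketch of that
pattern in C11; the names, node layout and the hard-coded page shift
are illustrative stand-ins, not the kernel API.

        #include <stdatomic.h>
        #include <stdio.h>
        #include <stdlib.h>

        struct node {
                struct node *next;
                unsigned long start, end;  /* stand-ins for va_start/va_end */
        };

        static _Atomic(struct node *) purge_list;  /* like vmap_purge_list */

        /* like llist_add(): lock-free push via compare-and-swap */
        static void push(struct node *n)
        {
                struct node *first = atomic_load(&purge_list);

                do {
                        n->next = first;  /* first is refreshed on CAS failure */
                } while (!atomic_compare_exchange_weak(&purge_list, &first, n));
        }

        /* like llist_del_all(): detach the whole list in one exchange */
        static struct node *del_all(void)
        {
                return atomic_exchange(&purge_list, NULL);
        }

        int main(void)
        {
                for (unsigned long i = 0; i < 4; i++) {
                        struct node *n = malloc(sizeof(*n));

                        n->start = i << 12;
                        n->end = n->start + (1ul << 12);
                        push(n);
                }

                /* the "purge" side walks only what was lazily freed */
                for (struct node *n = del_all(); n;) {
                        struct node *next = n->next;

                        printf("purge [%#lx, %#lx)\n", n->start, n->end);
                        free(n);
                        n = next;
                }
                return 0;
        }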
* Re: [PATCH] mm/vmalloc: Keep a separate lazy-free list
From: Roman Peniaev @ 2016-04-14 13:13 UTC
To: Chris Wilson
Cc: intel-gfx, Joonas Lahtinen, Tvrtko Ursulin, Daniel Vetter,
    Andrew Morton, David Rientjes, Joonsoo Kim, Mel Gorman, Toshi Kani,
    Shawn Lin, linux-mm, linux-kernel@vger.kernel.org

Hi, Chris.

Is it made on purpose not to drop the VM_LAZY_FREE flag in
__purge_vmap_area_lazy()? With your patch va->flags will have two bits
set: VM_LAZY_FREE | VM_LAZY_FREEING. It is not that bad, because all
the other code paths do not care, but the change is still unclear.

Also, did you consider avoiding the static purge_lock in
__purge_vmap_area_lazy()? With your change it seems that you can avoid
taking this lock at all. Just be careful when you observe the llist as
empty, i.e. nr == 0.

And one comment is below:

On Tue, Apr 12, 2016 at 8:57 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> When mixing lots of vmallocs and set_memory_*() (which calls
> vm_unmap_aliases()) I encountered situations where the performance
> degraded severely due to walking the entire vmap_area list on each
> invocation. One simple improvement is to add the lazily freed
> vmap_area to a separate lockless free list, so that we avoid having
> to walk the full list on each purge.

[...]

> @@ -706,6 +703,8 @@ static void purge_vmap_area_lazy(void)
>  static void free_vmap_area_noflush(struct vmap_area *va)
>  {
>         va->flags |= VM_LAZY_FREE;
> +       llist_add(&va->purge_list, &vmap_purge_list);
> +
>         atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);

It seems to me that this is a very long-standing problem: when you mark
va->flags as VM_LAZY_FREE, va can be immediately freed from another
CPU. If so, the line:

    atomic_add((va->va_end - va->va_start) ...

is a use-after-free access.

So I would also fix it with careful line reordering and a barrier
(the barrier is probably redundant here, because llist_add() implies a
cmpxchg, but I simply want to be explicit and show that marking va as
VM_LAZY_FREE and adding it to the list should come last):

-       va->flags |= VM_LAZY_FREE;
        atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
+       smp_mb__after_atomic();
+       va->flags |= VM_LAZY_FREE;
+       llist_add(&va->purge_list, &vmap_purge_list);

What do you think?

--
Roman
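A timeline makes the use-after-free Roman points out concrete. With
the v1 ordering in free_vmap_area_noflush(), where llist_add() comes
before the accounting, nothing stops a concurrent purge from freeing
the vmap_area in the window between publication and the atomic_add()
that still dereferences it. The interleaving, sketched from the posted
code:

        CPU 0: free_vmap_area_noflush(va)    CPU 1: __purge_vmap_area_lazy()
        ---------------------------------    -------------------------------
        va->flags |= VM_LAZY_FREE;
        llist_add(&va->purge_list,
                  &vmap_purge_list);
                                             valist = llist_del_all(...);
                                             ...
                                             __free_vmap_area(va);
                                             /* va is now freed */
        atomic_add((va->va_end - va->va_start)
                   >> PAGE_SHIFT, ...);
        /* dereferences freed va */

Hence the fix that follows in this thread: finish every read of *va
before publishing it on the purge list.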
* Re: [PATCH] mm/vmalloc: Keep a separate lazy-free list
From: Chris Wilson @ 2016-04-14 13:49 UTC
To: Roman Peniaev
Cc: intel-gfx, Joonas Lahtinen, Tvrtko Ursulin, Daniel Vetter,
    Andrew Morton, David Rientjes, Joonsoo Kim, Mel Gorman, Toshi Kani,
    Shawn Lin, linux-mm, linux-kernel@vger.kernel.org

On Thu, Apr 14, 2016 at 03:13:26PM +0200, Roman Peniaev wrote:
> Hi, Chris.
>
> Is it made on purpose not to drop the VM_LAZY_FREE flag in
> __purge_vmap_area_lazy()? With your patch va->flags will have two bits
> set: VM_LAZY_FREE | VM_LAZY_FREEING. It is not that bad, because all
> the other code paths do not care, but the change is still unclear.

Oh, that was just a bad deletion.

> Also, did you consider avoiding the static purge_lock in
> __purge_vmap_area_lazy()? With your change it seems that you can avoid
> taking this lock at all. Just be careful when you observe the llist as
> empty, i.e. nr == 0.

I admit I only briefly looked at the lock. I will be honest and say I
do not fully understand the requirements of the sync/force_flush
parameters.

purge_fragmented_blocks() manages per-cpu lists, so that looks safe
under its own rcu_read_lock.

Yes, it looks feasible to remove the purge_lock if we can relax sync.

> > @@ -706,6 +703,8 @@ static void purge_vmap_area_lazy(void)
> >  static void free_vmap_area_noflush(struct vmap_area *va)
> >  {
> >         va->flags |= VM_LAZY_FREE;
> > +       llist_add(&va->purge_list, &vmap_purge_list);
> > +
> >         atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
>
> It seems to me that this is a very long-standing problem: when you mark
> va->flags as VM_LAZY_FREE, va can be immediately freed from another
> CPU. If so, the line:
>
>     atomic_add((va->va_end - va->va_start) ...
>
> is a use-after-free access.
>
> So I would also fix it with careful line reordering and a barrier:
>
> -       va->flags |= VM_LAZY_FREE;
>         atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
> +       smp_mb__after_atomic();
> +       va->flags |= VM_LAZY_FREE;
> +       llist_add(&va->purge_list, &vmap_purge_list);
>
> What do you think?

Yup, it is racy. We can drop the modification of LAZY_FREE/LAZY_FREEING
to ease one headache, since those bits are not inspected anywhere
afaict.

Would not using atomic_add_return() be even clearer with respect to
ordering:

        nr_lazy = atomic_add_return((va->va_end - va->va_start) >> PAGE_SHIFT,
                                    &vmap_lazy_nr);
        llist_add(&va->purge_list, &vmap_purge_list);

        if (unlikely(nr_lazy > lazy_max_pages()))
                try_purge_vmap_area_lazy();

Since it doesn't matter that much if we make an extra call to
try_purge_vmap_area_lazy() when we are on the boundary.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH] mm/vmalloc: Keep a separate lazy-free list
From: Roman Peniaev @ 2016-04-14 14:44 UTC
To: Chris Wilson, Roman Peniaev, intel-gfx, Joonas Lahtinen,
    Tvrtko Ursulin, Daniel Vetter, Andrew Morton, David Rientjes,
    Joonsoo Kim, Mel Gorman, Toshi Kani, Shawn Lin, linux-mm,
    linux-kernel@vger.kernel.org

On Thu, Apr 14, 2016 at 3:49 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Thu, Apr 14, 2016 at 03:13:26PM +0200, Roman Peniaev wrote:
>> Also, did you consider avoiding the static purge_lock in
>> __purge_vmap_area_lazy()? With your change it seems that you can
>> avoid taking this lock at all. Just be careful when you observe the
>> llist as empty, i.e. nr == 0.
>
> I admit I only briefly looked at the lock. I will be honest and say I
> do not fully understand the requirements of the sync/force_flush
> parameters.

if sync:
  o I can wait for another purge in progress
    (do not care if purge_lock is dropped)
  o purge fragmented blocks

if force_flush:
  o even if there is nothing to purge, flush the TLB, which is costly
    (again, sync-like behaviour is implied)

> purge_fragmented_blocks() manages per-cpu lists, so that looks safe
> under its own rcu_read_lock.
>
> Yes, it looks feasible to remove the purge_lock if we can relax sync.

What is still left is waiting on vmap_area_lock in !sync mode,
but that is probably not that bad.

> Yup, it is racy. We can drop the modification of LAZY_FREE/LAZY_FREEING
> to ease one headache, since those bits are not inspected anywhere
> afaict.

Yes, those flags can be completely dropped.

> Would not using atomic_add_return() be even clearer with respect to
> ordering:
>
>         nr_lazy = atomic_add_return((va->va_end - va->va_start) >> PAGE_SHIFT,
>                                     &vmap_lazy_nr);
>         llist_add(&va->purge_list, &vmap_purge_list);
>
>         if (unlikely(nr_lazy > lazy_max_pages()))
>                 try_purge_vmap_area_lazy();
>
> Since it doesn't matter that much if we make an extra call to
> try_purge_vmap_area_lazy() when we are on the boundary.

Nice.

--
Roman
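For reference, the sync/force_flush combinations Roman describes map
onto the callers in the 4.6-era mm/vmalloc.c roughly as follows. This
is quoted from memory, so treat it as a sketch of the call sites
rather than authoritative source:

        /* opportunistic purge: neither sync nor force_flush */
        static void try_purge_vmap_area_lazy(void)
        {
                unsigned long start = ULONG_MAX, end = 0;

                __purge_vmap_area_lazy(&start, &end, 0, 0);
        }

        /* synchronous purge, e.g. when a vmap allocation needs space */
        static void purge_vmap_area_lazy(void)
        {
                unsigned long start = ULONG_MAX, end = 0;

                __purge_vmap_area_lazy(&start, &end, 1, 0);
        }

        /* vm_unmap_aliases() ends with sync and, if anything was
         * dirty, a forced TLB flush:
         */
        __purge_vmap_area_lazy(&start, &end, 1, flush);

So any purge_lock removal has to keep the "wait for a purge in
progress" behaviour for the sync callers, while the !sync path can go
straight to the llist.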
* [PATCH v2] mm/vmalloc: Keep a separate lazy-free list
From: Chris Wilson @ 2016-04-15 11:07 UTC
To: intel-gfx
Cc: Chris Wilson, Joonas Lahtinen, Tvrtko Ursulin, Daniel Vetter,
    Andrew Morton, David Rientjes, Joonsoo Kim, Roman Pen, Mel Gorman,
    Toshi Kani, Shawn Lin, linux-mm, linux-kernel

When mixing lots of vmallocs and set_memory_*() (which calls
vm_unmap_aliases()) I encountered situations where the performance
degraded severely due to walking the entire vmap_area list on each
invocation. One simple improvement is to add the lazily freed vmap_area
to a separate lockless free list, so that we avoid having to walk the
full list on each purge.

v2: Remove the unused VM_LAZY_FREE and VM_LAZY_FREEING flags and
reorder the accesses to the vmap_area during its addition to the lazy
free list to avoid a use-after-free (Roman).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Roman Pen <r.peniaev@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: Shawn Lin <shawn.lin@rock-chips.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/vmalloc.h |  3 ++-
 mm/vmalloc.c            | 40 ++++++++++++++++++++--------------------
 2 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 8b51df3ab334..3d9d786a943c 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -4,6 +4,7 @@
 #include <linux/spinlock.h>
 #include <linux/init.h>
 #include <linux/list.h>
+#include <linux/llist.h>
 #include <asm/page.h>           /* pgprot_t */
 #include <linux/rbtree.h>

@@ -45,7 +46,7 @@ struct vmap_area {
        unsigned long flags;
        struct rb_node rb_node;         /* address sorted rbtree */
        struct list_head list;          /* address sorted list */
-       struct list_head purge_list;    /* "lazy purge" list */
+       struct llist_node purge_list;   /* "lazy purge" list */
        struct vm_struct *vm;
        struct rcu_head rcu_head;
 };
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 293889d7f482..70f942832164 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -21,6 +21,7 @@
 #include <linux/debugobjects.h>
 #include <linux/kallsyms.h>
 #include <linux/list.h>
+#include <linux/llist.h>
 #include <linux/notifier.h>
 #include <linux/rbtree.h>
 #include <linux/radix-tree.h>
@@ -275,13 +276,12 @@ EXPORT_SYMBOL(vmalloc_to_pfn);

 /*** Global kva allocator ***/

-#define VM_LAZY_FREE   0x01
-#define VM_LAZY_FREEING        0x02
 #define VM_VM_AREA     0x04

 static DEFINE_SPINLOCK(vmap_area_lock);
 /* Export for kexec only */
 LIST_HEAD(vmap_area_list);
+static LLIST_HEAD(vmap_purge_list);
 static struct rb_root vmap_area_root = RB_ROOT;

 /* The vmap cache globals are protected by vmap_area_lock */
@@ -628,7 +628,7 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
                                        int sync, int force_flush)
 {
        static DEFINE_SPINLOCK(purge_lock);
-       LIST_HEAD(valist);
+       struct llist_node *valist;
        struct vmap_area *va;
        struct vmap_area *n_va;
        int nr = 0;
@@ -647,20 +647,14 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,
        if (sync)
                purge_fragmented_blocks_allcpus();

-       rcu_read_lock();
-       list_for_each_entry_rcu(va, &vmap_area_list, list) {
-               if (va->flags & VM_LAZY_FREE) {
-                       if (va->va_start < *start)
-                               *start = va->va_start;
-                       if (va->va_end > *end)
-                               *end = va->va_end;
-                       nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
-                       list_add_tail(&va->purge_list, &valist);
-                       va->flags |= VM_LAZY_FREEING;
-                       va->flags &= ~VM_LAZY_FREE;
-               }
+       valist = llist_del_all(&vmap_purge_list);
+       llist_for_each_entry(va, valist, purge_list) {
+               if (va->va_start < *start)
+                       *start = va->va_start;
+               if (va->va_end > *end)
+                       *end = va->va_end;
+               nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
        }
-       rcu_read_unlock();

        if (nr)
                atomic_sub(nr, &vmap_lazy_nr);
@@ -670,7 +664,7 @@ static void __purge_vmap_area_lazy(unsigned long *start, unsigned long *end,

        if (nr) {
                spin_lock(&vmap_area_lock);
-               list_for_each_entry_safe(va, n_va, &valist, purge_list)
+               llist_for_each_entry_safe(va, n_va, valist, purge_list)
                        __free_vmap_area(va);
                spin_unlock(&vmap_area_lock);
        }
@@ -705,9 +699,15 @@ static void purge_vmap_area_lazy(void)
  */
 static void free_vmap_area_noflush(struct vmap_area *va)
 {
-       va->flags |= VM_LAZY_FREE;
-       atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
-       if (unlikely(atomic_read(&vmap_lazy_nr) > lazy_max_pages()))
+       int nr_lazy;
+
+       nr_lazy = atomic_add_return((va->va_end - va->va_start) >> PAGE_SHIFT,
+                                   &vmap_lazy_nr);
+
+       /* After this point, we may free va at any time */
+       llist_add(&va->purge_list, &vmap_purge_list);
+
+       if (unlikely(nr_lazy > lazy_max_pages()))
                try_purge_vmap_area_lazy();
 }
--
2.8.0.rc3
* Re: [PATCH v2] mm/vmalloc: Keep a separate lazy-free list
From: Roman Peniaev @ 2016-04-15 11:54 UTC
To: Chris Wilson
Cc: intel-gfx, Joonas Lahtinen, Tvrtko Ursulin, Daniel Vetter,
    Andrew Morton, David Rientjes, Joonsoo Kim, Mel Gorman, Toshi Kani,
    Shawn Lin, linux-mm, linux-kernel@vger.kernel.org

On Fri, Apr 15, 2016 at 1:07 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> When mixing lots of vmallocs and set_memory_*() (which calls
> vm_unmap_aliases()) I encountered situations where the performance
> degraded severely due to walking the entire vmap_area list on each
> invocation. One simple improvement is to add the lazily freed
> vmap_area to a separate lockless free list, so that we avoid having
> to walk the full list on each purge.
>
> v2: Remove the unused VM_LAZY_FREE and VM_LAZY_FREEING flags and
> reorder the accesses to the vmap_area during its addition to the lazy
> free list to avoid a use-after-free (Roman).

[...]

Reviewed-by: Roman Pen <r.peniaev@gmail.com>

Thanks.

--
Roman
* Re: [PATCH] mm/vmalloc: Keep a separate lazy-free list
From: Chris Wilson @ 2016-04-15 11:14 UTC
To: Roman Peniaev
Cc: intel-gfx, Joonas Lahtinen, Tvrtko Ursulin, Daniel Vetter,
    Andrew Morton, David Rientjes, Joonsoo Kim, Mel Gorman, Toshi Kani,
    Shawn Lin, linux-mm, linux-kernel@vger.kernel.org

On Thu, Apr 14, 2016 at 04:44:48PM +0200, Roman Peniaev wrote:
> On Thu, Apr 14, 2016 at 3:49 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
[...]
> > Yes, it looks feasible to remove the purge_lock if we can relax sync.
>
> What is still left is waiting on vmap_area_lock in !sync mode,
> but that is probably not that bad.

Ok, that's a bit beyond my comfort zone for a patch that changes the
free list handling. I'll chicken out for the time being; atm I am more
concerned that i915.ko may call set_page_wb() frequently on individual
pages.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
* Re: [PATCH] mm/vmalloc: Keep a separate lazy-free list
From: Andrew Morton @ 2016-04-22 21:49 UTC
To: Chris Wilson
Cc: Roman Peniaev, intel-gfx, Joonas Lahtinen, Tvrtko Ursulin,
    Daniel Vetter, David Rientjes, Joonsoo Kim, Mel Gorman, Toshi Kani,
    Shawn Lin, linux-mm, linux-kernel@vger.kernel.org

On Fri, 15 Apr 2016 12:14:31 +0100 Chris Wilson <chris@chris-wilson.co.uk> wrote:
[...]
> Ok, that's a bit beyond my comfort zone for a patch that changes the
> free list handling. I'll chicken out for the time being; atm I am more
> concerned that i915.ko may call set_page_wb() frequently on individual
> pages.

Nick Piggin's vmap rewrite. 20x (or more) faster.
https://lwn.net/Articles/285341/

10 years ago, never finished.
* Re: [PATCH] mm/vmalloc: Keep a separate lazy-free list
From: Roman Peniaev @ 2016-04-23 11:21 UTC
To: Andrew Morton
Cc: Chris Wilson, intel-gfx, Joonas Lahtinen, Tvrtko Ursulin,
    Daniel Vetter, David Rientjes, Joonsoo Kim, Mel Gorman, Toshi Kani,
    Shawn Lin, linux-mm, linux-kernel@vger.kernel.org

On Fri, Apr 22, 2016 at 11:49 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
[...]
> Nick Piggin's vmap rewrite. 20x (or more) faster.
> https://lwn.net/Articles/285341/
>
> 10 years ago, never finished.

But that's exactly what we are changing, making it 20.5x faster :)

--
Roman