linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention
@ 2014-05-29  6:22 Joonsoo Kim
  2014-05-29 12:56 ` Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Joonsoo Kim @ 2014-05-29  6:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Zhang Yanfei, Johannes Weiner, Andi Kleen, linux-mm, linux-kernel,
	Richard Yao, Joonsoo Kim

Richard Yao reported a month ago that his system have a trouble
with vmap_area_lock contention during performance analysis
by /proc/meminfo. Andrew asked why his analysis checks /proc/meminfo
stressfully, but he didn't answer it.

https://lkml.org/lkml/2014/4/10/416

Although I'm not sure that this is right usage or not, there is a solution
reducing vmap_area_lock contention with no side-effect. That is just
to use rcu list iterator in get_vmalloc_info(). This function only needs
values on vmap_area structure, so we don't need to grab a spinlock.

Reported-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index f64632b..fdbb116 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2690,14 +2690,14 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
 
 	prev_end = VMALLOC_START;
 
-	spin_lock(&vmap_area_lock);
+	rcu_read_lock();
 
 	if (list_empty(&vmap_area_list)) {
 		vmi->largest_chunk = VMALLOC_TOTAL;
 		goto out;
 	}
 
-	list_for_each_entry(va, &vmap_area_list, list) {
+	list_for_each_entry_rcu(va, &vmap_area_list, list) {
 		unsigned long addr = va->va_start;
 
 		/*
@@ -2724,7 +2724,7 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
 		vmi->largest_chunk = VMALLOC_END - prev_end;
 
 out:
-	spin_unlock(&vmap_area_lock);
+	rcu_read_unlock();
 }
 #endif
 
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention
  2014-05-29  6:22 [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention Joonsoo Kim
@ 2014-05-29 12:56 ` Eric Dumazet
  2014-05-29 13:17 ` Richard Yao
  2014-05-29 20:05 ` Andrew Morton
  2 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2014-05-29 12:56 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Zhang Yanfei, Johannes Weiner, Andi Kleen,
	linux-mm, linux-kernel, Richard Yao

On Thu, 2014-05-29 at 15:22 +0900, Joonsoo Kim wrote:
> Richard Yao reported a month ago that his system have a trouble
> with vmap_area_lock contention during performance analysis
> by /proc/meminfo. Andrew asked why his analysis checks /proc/meminfo
> stressfully, but he didn't answer it.
> 
> https://lkml.org/lkml/2014/4/10/416
> 
> Although I'm not sure that this is right usage or not, there is a solution
> reducing vmap_area_lock contention with no side-effect. That is just
> to use rcu list iterator in get_vmalloc_info(). This function only needs
> values on vmap_area structure, so we don't need to grab a spinlock.
> 
> Reported-by: Richard Yao <ryao@gentoo.org>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Nice, I was considering the same

Acked-by: Eric Dumazet <edumazet@google.com>



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention
  2014-05-29  6:22 [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention Joonsoo Kim
  2014-05-29 12:56 ` Eric Dumazet
@ 2014-05-29 13:17 ` Richard Yao
  2014-05-29 20:05 ` Andrew Morton
  2 siblings, 0 replies; 6+ messages in thread
From: Richard Yao @ 2014-05-29 13:17 UTC (permalink / raw)
  Cc: Andrew Morton, Zhang Yanfei, Johannes Weiner, Andi Kleen,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org, Joonsoo Kim

I do not have a way of tracing it. I meant to reply when I did, but that has not changed. That being said, I like this patch.

On May 29, 2014, at 2:22 AM, Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:

> Richard Yao reported a month ago that his system have a trouble
> with vmap_area_lock contention during performance analysis
> by /proc/meminfo. Andrew asked why his analysis checks /proc/meminfo
> stressfully, but he didn't answer it.
> 
> https://lkml.org/lkml/2014/4/10/416
> 
> Although I'm not sure that this is right usage or not, there is a solution
> reducing vmap_area_lock contention with no side-effect. That is just
> to use rcu list iterator in get_vmalloc_info(). This function only needs
> values on vmap_area structure, so we don't need to grab a spinlock.
> 
> Reported-by: Richard Yao <ryao@gentoo.org>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index f64632b..fdbb116 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2690,14 +2690,14 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
> 
>    prev_end = VMALLOC_START;
> 
> -    spin_lock(&vmap_area_lock);
> +    rcu_read_lock();
> 
>    if (list_empty(&vmap_area_list)) {
>        vmi->largest_chunk = VMALLOC_TOTAL;
>        goto out;
>    }
> 
> -    list_for_each_entry(va, &vmap_area_list, list) {
> +    list_for_each_entry_rcu(va, &vmap_area_list, list) {
>        unsigned long addr = va->va_start;
> 
>        /*
> @@ -2724,7 +2724,7 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
>        vmi->largest_chunk = VMALLOC_END - prev_end;
> 
> out:
> -    spin_unlock(&vmap_area_lock);
> +    rcu_read_unlock();
> }
> #endif
> 
> -- 
> 1.7.9.5
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention
  2014-05-29  6:22 [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention Joonsoo Kim
  2014-05-29 12:56 ` Eric Dumazet
  2014-05-29 13:17 ` Richard Yao
@ 2014-05-29 20:05 ` Andrew Morton
  2014-05-29 21:23   ` Eric Dumazet
  2 siblings, 1 reply; 6+ messages in thread
From: Andrew Morton @ 2014-05-29 20:05 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Zhang Yanfei, Johannes Weiner, Andi Kleen, linux-mm, linux-kernel,
	Richard Yao

On Thu, 29 May 2014 15:22:34 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:

> Richard Yao reported a month ago that his system have a trouble
> with vmap_area_lock contention during performance analysis
> by /proc/meminfo. Andrew asked why his analysis checks /proc/meminfo
> stressfully, but he didn't answer it.
> 
> https://lkml.org/lkml/2014/4/10/416
> 
> Although I'm not sure that this is right usage or not, there is a solution
> reducing vmap_area_lock contention with no side-effect. That is just
> to use rcu list iterator in get_vmalloc_info(). This function only needs
> values on vmap_area structure, so we don't need to grab a spinlock.

The mixture of rcu protection and spinlock protection for
vmap_area_list is pretty confusing.  Are you able to describe the
overall design here?  When and why do we use one versus the other?


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention
  2014-05-29 20:05 ` Andrew Morton
@ 2014-05-29 21:23   ` Eric Dumazet
  2014-05-30  0:43     ` Joonsoo Kim
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2014-05-29 21:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Joonsoo Kim, Zhang Yanfei, Johannes Weiner, Andi Kleen, linux-mm,
	linux-kernel, Richard Yao

On Thu, 2014-05-29 at 13:05 -0700, Andrew Morton wrote:
> On Thu, 29 May 2014 15:22:34 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
> 
> > Richard Yao reported a month ago that his system have a trouble
> > with vmap_area_lock contention during performance analysis
> > by /proc/meminfo. Andrew asked why his analysis checks /proc/meminfo
> > stressfully, but he didn't answer it.
> > 
> > https://lkml.org/lkml/2014/4/10/416
> > 
> > Although I'm not sure that this is right usage or not, there is a solution
> > reducing vmap_area_lock contention with no side-effect. That is just
> > to use rcu list iterator in get_vmalloc_info(). This function only needs
> > values on vmap_area structure, so we don't need to grab a spinlock.
> 
> The mixture of rcu protection and spinlock protection for
> vmap_area_list is pretty confusing.  Are you able to describe the
> overall design here?  When and why do we use one versus the other?

The spinlock protects writers.

rcu can be used in this function because all RCU protocol is already
respected by writers, since Nick Piggin commit db64fe02258f1507e13fe5
("mm: rewrite vmap layer") back in linux-2.6.28

Specifically :
   insertions use list_add_rcu(), 
   deletions use list_del_rcu() and kfree_rcu().

Note the rb tree is not used from rcu reader (it would not be safe),
only the vmap_area_list has full RCU protection.

Note that __purge_vmap_area_lazy() already uses this rcu protection.

        rcu_read_lock();
        list_for_each_entry_rcu(va, &vmap_area_list, list) {
                if (va->flags & VM_LAZY_FREE) {
                        if (va->va_start < *start)
                                *start = va->va_start;
                        if (va->va_end > *end)
                                *end = va->va_end;
                        nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
                        list_add_tail(&va->purge_list, &valist);
                        va->flags |= VM_LAZY_FREEING;
                        va->flags &= ~VM_LAZY_FREE;
                }
        }
        rcu_read_unlock();


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention
  2014-05-29 21:23   ` Eric Dumazet
@ 2014-05-30  0:43     ` Joonsoo Kim
  0 siblings, 0 replies; 6+ messages in thread
From: Joonsoo Kim @ 2014-05-30  0:43 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrew Morton, Zhang Yanfei, Johannes Weiner, Andi Kleen,
	linux-mm, linux-kernel, Richard Yao

On Thu, May 29, 2014 at 02:23:08PM -0700, Eric Dumazet wrote:
> On Thu, 2014-05-29 at 13:05 -0700, Andrew Morton wrote:
> > On Thu, 29 May 2014 15:22:34 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
> > 
> > > Richard Yao reported a month ago that his system have a trouble
> > > with vmap_area_lock contention during performance analysis
> > > by /proc/meminfo. Andrew asked why his analysis checks /proc/meminfo
> > > stressfully, but he didn't answer it.
> > > 
> > > https://lkml.org/lkml/2014/4/10/416
> > > 
> > > Although I'm not sure that this is right usage or not, there is a solution
> > > reducing vmap_area_lock contention with no side-effect. That is just
> > > to use rcu list iterator in get_vmalloc_info(). This function only needs
> > > values on vmap_area structure, so we don't need to grab a spinlock.
> > 
> > The mixture of rcu protection and spinlock protection for
> > vmap_area_list is pretty confusing.  Are you able to describe the
> > overall design here?  When and why do we use one versus the other?
> 
> The spinlock protects writers.
> 
> rcu can be used in this function because all RCU protocol is already
> respected by writers, since Nick Piggin commit db64fe02258f1507e13fe5
> ("mm: rewrite vmap layer") back in linux-2.6.28
> 
> Specifically :
>    insertions use list_add_rcu(), 
>    deletions use list_del_rcu() and kfree_rcu().
> 
> Note the rb tree is not used from rcu reader (it would not be safe),
> only the vmap_area_list has full RCU protection.
> 
> Note that __purge_vmap_area_lazy() already uses this rcu protection.
> 
>         rcu_read_lock();
>         list_for_each_entry_rcu(va, &vmap_area_list, list) {
>                 if (va->flags & VM_LAZY_FREE) {
>                         if (va->va_start < *start)
>                                 *start = va->va_start;
>                         if (va->va_end > *end)
>                                 *end = va->va_end;
>                         nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
>                         list_add_tail(&va->purge_list, &valist);
>                         va->flags |= VM_LAZY_FREEING;
>                         va->flags &= ~VM_LAZY_FREE;
>                 }
>         }
>         rcu_read_unlock();
> 

Thanks Eric.
I will add more.

Although it is really complicated, I try to demonstrate overall design
how I understood.

There are five things we have to know, vm_struct structure, vmap_area
structure, rbtree rooted from vmap_area_root, vmap_area_list
and vmap_area_lock.

vmap_area is main structure representing virtual address range for this area.
vm_struct is the structure to keep information about mapped pages or phys_addr
for this vmap_area.
rbtree is used for finding target area or vacant area rapidly and is protected
by vmap_area_lock on all insert/remove/find operations.

vmap_area_list links the vmap_area in ascending order in virtaul address.
Manipulation of this list is protected by vmap_area_lock and RCU. When we
insert/remove vmap_area, we need to hold the vmap_area_lock so no concurrent
user can insert/remove different vmap_area. And when insert/remove, we use
list_add_rcu() and list_del_rcu(), so we can iterate the vmap_area_list safely
if we hold rcu_read_lock().

Another things vmap_area_lock is protecting are va->vm, that is, pointer to
vm_struct and VM_VM_AREA value on vmap_area's flag. We set/unset these values
with holding the vmap_area_lock to serialize access to this values. So when
we need to access these values, we should hold the lock.

In conclusion, get_vmalloc_info() needs to iterate vmap_area_list, but,
it doesn't access va->vm so we don't need to grab the vmap_area_lock.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2014-05-30  0:40 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-29  6:22 [PATCH] vmalloc: use rcu list iterator to reduce vmap_area_lock contention Joonsoo Kim
2014-05-29 12:56 ` Eric Dumazet
2014-05-29 13:17 ` Richard Yao
2014-05-29 20:05 ` Andrew Morton
2014-05-29 21:23   ` Eric Dumazet
2014-05-30  0:43     ` Joonsoo Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).