From: Peter Hurley <peter@hurleysoftware.com>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Zhang Yanfei <zhangyanfei.yes@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Andi Kleen <andi@firstfloor.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Richard Yao <ryao@gentoo.org>,
	Eric Dumazet <eric.dumazet@gmail.com>
Subject: Re: [PATCH v2] vmalloc: use rcu list iterator to reduce vmap_area_lock contention
Date: Tue, 10 Jun 2014 23:32:19 -0400
Message-ID: <5397CDC3.1050809@hurleysoftware.com>
In-Reply-To: <1402453146-10057-1-git-send-email-iamjoonsoo.kim@lge.com>

On 06/10/2014 10:19 PM, Joonsoo Kim wrote:
> Richard Yao reported a month ago that his system has trouble
> with vmap_area_lock contention during performance analysis
> using /proc/meminfo. Andrew asked why his analysis reads /proc/meminfo
> so heavily, but he didn't answer.
>
> https://lkml.org/lkml/2014/4/10/416
>
> Although I'm not sure whether this is the right usage, there is a solution
> that reduces vmap_area_lock contention with no side effects: just
> use the rcu list iterator in get_vmalloc_info().
>
> rcu can be used in this function because the RCU protocol is already
> respected by all writers, since Nick Piggin's commit db64fe02258f1507e13fe5
> ("mm: rewrite vmap layer") back in linux-2.6.28.

While rcu list traversal over the vmap_area_list is safe, this may
arrive at different results than the spinlocked version. The rcu list
traversal version will not be a 'snapshot' of a single, valid instant
of the entire vmap_area_list, but rather a potential amalgam of
different list states.

This is because the vmap_area_list can continue to change during
list traversal.
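
To make the 'amalgam' concrete, here is a minimal userspace sketch of one
possible interleaving. It is not kernel code: struct area, locked_total()
and torn_walk() are hypothetical names, the list is a plain singly linked
list rather than vmap_area_list, and the writer's steps are replayed by
hand in a single thread instead of racing on another CPU. Memory barriers
and grace periods are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vmap_area: one [start, end) range. */
struct area {
	unsigned long start, end;
	struct area *next;
};

/* Sum the sizes of all areas reachable from head: a fully locked walk. */
unsigned long locked_total(const struct area *head)
{
	unsigned long sum = 0;

	for (; head; head = head->next)
		sum += head->end - head->start;
	return sum;
}

struct walk_result {
	unsigned long before;	/* total in the initial list state */
	unsigned long middle;	/* total after the deletion */
	unsigned long after;	/* total after the insertion */
	unsigned long seen;	/* total the unlocked reader computed */
};

/*
 * Single-threaded replay of one possible interleaving. The "writer"
 * steps run between the reader's steps, as they could under
 * rcu_read_lock() but could not under spin_lock(&vmap_area_lock).
 */
struct walk_result torn_walk(void)
{
	struct area d = { 60, 80, NULL };
	struct area c = { 40, 50, NULL };
	struct area b = { 20, 30, &c };
	struct area a = {  0, 10, &b };
	struct area *head = &a;
	const struct area *pos;
	struct walk_result r;

	r.before = locked_total(head);

	/* Reader visits 'a'. */
	pos = head;
	r.seen = pos->end - pos->start;

	/*
	 * Writer unlinks 'a'. As with list_del_rcu(), a.next is left
	 * intact so the reader standing on 'a' can still move forward.
	 */
	head = a.next;
	r.middle = locked_total(head);

	/* Reader visits 'b'. */
	pos = pos->next;
	r.seen += pos->end - pos->start;

	/* Writer appends 'd' at the tail. */
	c.next = &d;
	r.after = locked_total(head);

	/* Reader finishes: it sees 'c' and the newly added 'd'. */
	for (pos = pos->next; pos; pos = pos->next)
		r.seen += pos->end - pos->start;

	/* The reader's total matches none of the three real list states. */
	assert(r.seen != r.before && r.seen != r.middle && r.seen != r.after);
	return r;
}
```

In this replay the reader counts 'a', which is already gone by the time it
finishes, as well as 'd', which did not exist when it started, so its total
corresponds to no state the list was ever in. Whether that matters for
/proc/meminfo is a separate question; the sketch only shows that the
result can differ from what the spinlocked version would report.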

Regards,
Peter Hurley

> Specifically :
>     insertions use list_add_rcu(),
>     deletions use list_del_rcu() and kfree_rcu().
>
> Note that the rb tree is not used from the rcu reader (it would not be
> safe); only the vmap_area_list has full RCU protection.
>
> Note that __purge_vmap_area_lazy() already uses this rcu protection.
>
>          rcu_read_lock();
>          list_for_each_entry_rcu(va, &vmap_area_list, list) {
>                  if (va->flags & VM_LAZY_FREE) {
>                          if (va->va_start < *start)
>                                  *start = va->va_start;
>                          if (va->va_end > *end)
>                                  *end = va->va_end;
>                          nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
>                          list_add_tail(&va->purge_list, &valist);
>                          va->flags |= VM_LAZY_FREEING;
>                          va->flags &= ~VM_LAZY_FREE;
>                  }
>          }
>          rcu_read_unlock();
>
> v2: add more commit description from Eric
>
> [edumazet@google.com: add more commit description]
> Reported-by: Richard Yao <ryao@gentoo.org>
> Acked-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index f64632b..fdbb116 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2690,14 +2690,14 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
>
>   	prev_end = VMALLOC_START;
>
> -	spin_lock(&vmap_area_lock);
> +	rcu_read_lock();
>
>   	if (list_empty(&vmap_area_list)) {
>   		vmi->largest_chunk = VMALLOC_TOTAL;
>   		goto out;
>   	}
>
> -	list_for_each_entry(va, &vmap_area_list, list) {
> +	list_for_each_entry_rcu(va, &vmap_area_list, list) {
>   		unsigned long addr = va->va_start;
>
>   		/*
> @@ -2724,7 +2724,7 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
>   		vmi->largest_chunk = VMALLOC_END - prev_end;
>
>   out:
> -	spin_unlock(&vmap_area_lock);
> +	rcu_read_unlock();
>   }
>   #endif
>
>

