From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933741AbbA1Bo0 (ORCPT ); Tue, 27 Jan 2015 20:44:26 -0500 Received: from mail.linuxfoundation.org ([140.211.169.12]:41426 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933369AbbA1Bii (ORCPT ); Tue, 27 Jan 2015 20:38:38 -0500 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Joonsoo Kim , Richard Yao , Eric Dumazet , Peter Hurley , Zhang Yanfei , Johannes Weiner , Andi Kleen , Andrew Morton , Linus Torvalds , Mel Gorman Subject: [PATCH 3.14 68/77] vmalloc: use rcu list iterator to reduce vmap_area_lock contention Date: Tue, 27 Jan 2015 17:27:46 -0800 Message-Id: <20150128012748.004727776@linuxfoundation.org> X-Mailer: git-send-email 2.2.2 In-Reply-To: <20150128012745.971137091@linuxfoundation.org> References: <20150128012745.971137091@linuxfoundation.org> User-Agent: quilt/0.63-1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 3.14-stable review patch. If anyone has any objections, please let me know. ------------------ From: Joonsoo Kim commit 474750aba88817c53f39424e5567b8e4acc4b39b upstream. Richard Yao reported a month ago that his system have a trouble with vmap_area_lock contention during performance analysis by /proc/meminfo. Andrew asked why his analysis checks /proc/meminfo stressfully, but he didn't answer it. https://lkml.org/lkml/2014/4/10/416 Although I'm not sure that this is right usage or not, there is a solution reducing vmap_area_lock contention with no side-effect. That is just to use rcu list iterator in get_vmalloc_info(). rcu can be used in this function because all RCU protocol is already respected by writers, since Nick Piggin commit db64fe02258f1 ("mm: rewrite vmap layer") back in linux-2.6.28 Specifically : insertions use list_add_rcu(), deletions use list_del_rcu() and kfree_rcu(). Note the rb tree is not used from rcu reader (it would not be safe), only the vmap_area_list has full RCU protection. Note that __purge_vmap_area_lazy() already uses this rcu protection. rcu_read_lock(); list_for_each_entry_rcu(va, &vmap_area_list, list) { if (va->flags & VM_LAZY_FREE) { if (va->va_start < *start) *start = va->va_start; if (va->va_end > *end) *end = va->va_end; nr += (va->va_end - va->va_start) >> PAGE_SHIFT; list_add_tail(&va->purge_list, &valist); va->flags |= VM_LAZY_FREEING; va->flags &= ~VM_LAZY_FREE; } } rcu_read_unlock(); Peter: : While rcu list traversal over the vmap_area_list is safe, this may : arrive at different results than the spinlocked version. The rcu list : traversal version will not be a 'snapshot' of a single, valid instant : of the entire vmap_area_list, but rather a potential amalgam of : different list states. Joonsoo: : Yes, you are right, but I don't think that we should be strict here. : Meminfo is already not a 'snapshot' at specific time. While we try to get : certain stats, the other stats can change. And, although we may arrive at : different results than the spinlocked version, the difference would not be : large and would not make serious side-effect. [edumazet@google.com: add more commit description] Signed-off-by: Joonsoo Kim Reported-by: Richard Yao Acked-by: Eric Dumazet Cc: Peter Hurley Cc: Zhang Yanfei Cc: Johannes Weiner Cc: Andi Kleen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Mel Gorman Signed-off-by: Greg Kroah-Hartman --- mm/vmalloc.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2681,14 +2681,14 @@ void get_vmalloc_info(struct vmalloc_inf prev_end = VMALLOC_START; - spin_lock(&vmap_area_lock); + rcu_read_lock(); if (list_empty(&vmap_area_list)) { vmi->largest_chunk = VMALLOC_TOTAL; goto out; } - list_for_each_entry(va, &vmap_area_list, list) { + list_for_each_entry_rcu(va, &vmap_area_list, list) { unsigned long addr = va->va_start; /* @@ -2715,7 +2715,7 @@ void get_vmalloc_info(struct vmalloc_inf vmi->largest_chunk = VMALLOC_END - prev_end; out: - spin_unlock(&vmap_area_lock); + rcu_read_unlock(); } #endif