public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: George Spelvin <linux@horizon.com>
Cc: dave@sr71.net, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux@rasmusvillemoes.dk, peterz@infradead.org, riel@redhat.com,
	rientjes@google.com, torvalds@linux-foundation.org
Subject: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info
Date: Sun, 23 Aug 2015 10:17:51 +0200	[thread overview]
Message-ID: <20150823081750.GA28349@gmail.com> (raw)
In-Reply-To: <20150823064603.14050.qmail@ns.horizon.com>


* George Spelvin <linux@horizon.com> wrote:

> Ingo Molnar <mingo@kernel.org> wrote:
> > I think this is too complex.
> > 
> > How about something simple like the patch below (on top of the third patch)?
> 
> > It makes the vmalloc info transactional - /proc/meminfo will always print a 
> > consistent set of numbers. (Not that we really care about races there, but it 
> > looks really simple to solve so why not.)
> 
> Looks like a huge simplification!
> 
> It needs a comment about the approximate nature of the locking and
> the obvious race conditions:
> 1) The first caller to get_vmalloc_info() clears vmap_info_changed
>    before updating vmap_info_cache, so a second caller is likely to
>    get stale data for the duration of a calc_vmalloc_info call.
> 2) Although unlikely, it's possible for two threads to race calling
>    calc_vmalloc_info, and the one that computes fresher data updates
>    the cache first, so the later write leaves stale data.
> 
> Other issues:
> 3) Me, I'd make vmap_info_changed a bool, for documentation more than
>    any space saving.
> 4) I wish there were a trylock version of write_seqlock, so we could
>    avoid blocking entirely.  (You *could* hand-roll it, but that eats
>    into the simplicity.)

Ok, fair enough - so how about the attached approach instead, which uses a 64-bit 
generation counter to track changes to the vmalloc state.

This is still very simple, but should not suffer from stale data being returned 
indefinitely in /proc/meminfo. We might race - but that was true before as well 
due to the lock-less RCU list walk - but we'll always return a correct and 
consistent version of the information.

Lightly tested. This is a replacement patch to make it easier to read via email.

I also made sure there's no extra overhead in the !CONFIG_PROC_FS case.

Note that there's an even simpler variant possible I think: we could use just the 
two generation counters and barriers to remove the seqlock.

Thanks,

	Ingo

==============================>
>From f9fd770e75e2edb4143f32ced0b53d7a77969c94 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@kernel.org>
Date: Sat, 22 Aug 2015 12:28:01 +0200
Subject: [PATCH] mm/vmalloc: Cache the vmalloc memory info

Linus reported that glibc (rather stupidly) reads /proc/meminfo
for every sysinfo() call, which causes the Git build to use
a surprising amount of CPU time, mostly due to the overhead
of get_vmalloc_info() - which walks a long list to do its
statistics.

Modify Linus's jiffies based patch to use generation counters
to cache the vmalloc info: vmap_unlock() increases the generation
counter, and the get_vmalloc_info() reads it and compares it
against a cached generation counter.

Also use a seqlock to make sure we always print a consistent
set of vmalloc statistics.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 mm/vmalloc.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 56 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 605138083880..d72b23436906 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -276,7 +276,21 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define VM_LAZY_FREEING	0x02
 #define VM_VM_AREA	0x04
 
-static DEFINE_SPINLOCK(vmap_area_lock);
+static __cacheline_aligned_in_smp DEFINE_SPINLOCK(vmap_area_lock);
+
+#ifdef CONFIG_PROC_FS
+/*
+ * A seqlock and two generation counters for a simple cache of the
+ * vmalloc allocation statistics info printed in /proc/meminfo.
+ *
+ * ( The assumption of the optimization is that it's read frequently, but
+ *   modified infrequently. )
+ */
+static DEFINE_SEQLOCK(vmap_info_lock);
+static u64 vmap_info_gen;
+static u64 vmap_info_cache_gen;
+static struct vmalloc_info vmap_info_cache;
+#endif
 
 static inline void vmap_lock(void)
 {
@@ -285,6 +299,9 @@ static inline void vmap_lock(void)
 
 static inline void vmap_unlock(void)
 {
+#ifdef CONFIG_PROC_FS
+	WRITE_ONCE(vmap_info_gen, vmap_info_gen+1);
+#endif
 	spin_unlock(&vmap_area_lock);
 }
 
@@ -2699,7 +2716,7 @@ static int __init proc_vmalloc_init(void)
 }
 module_init(proc_vmalloc_init);
 
-void get_vmalloc_info(struct vmalloc_info *vmi)
+static void calc_vmalloc_info(struct vmalloc_info *vmi)
 {
 	struct vmap_area *va;
 	unsigned long free_area_size;
@@ -2746,5 +2763,41 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
 out:
 	rcu_read_unlock();
 }
-#endif
 
+/*
+ * Return a consistent snapshot of the current vmalloc allocation
+ * statistics, for /proc/meminfo:
+ */
+void get_vmalloc_info(struct vmalloc_info *vmi)
+{
+	u64 gen = READ_ONCE(vmap_info_gen);
+
+	/*
+	 * If the generation counter of the cache matches that of
+	 * the vmalloc generation counter then return the cache:
+	 */
+	if (READ_ONCE(vmap_info_cache_gen) == gen) {
+		unsigned int seq;
+
+		do {
+			seq = read_seqbegin(&vmap_info_lock);
+			*vmi = vmap_info_cache;
+		} while (read_seqretry(&vmap_info_lock, seq));
+
+		return;
+	}
+
+	calc_vmalloc_info(vmi);
+
+	/*
+	 * If are racing with a new vmalloc() then we might write
+	 * the old generation counter here - and the next call to
+	 * get_vmalloc_info() will fix things up:
+	 */
+	write_seqlock(&vmap_info_lock);
+	vmap_info_cache = *vmi;
+	WRITE_ONCE(vmap_info_cache_gen, gen);
+	write_sequnlock(&vmap_info_lock);
+}
+
+#endif /* CONFIG_PROC_FS */

  reply	other threads:[~2015-08-23  8:17 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-23  4:48 [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics George Spelvin
2015-08-23  6:04 ` Ingo Molnar
2015-08-23  6:46   ` George Spelvin
2015-08-23  8:17     ` Ingo Molnar [this message]
2015-08-23 20:53       ` [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info Rasmus Villemoes
2015-08-24  6:58         ` Ingo Molnar
2015-08-24  8:39           ` Rasmus Villemoes
2015-08-23 21:56       ` Rasmus Villemoes
2015-08-24  7:00         ` Ingo Molnar
2015-08-25 16:39         ` Linus Torvalds
2015-08-25 17:03           ` Linus Torvalds
2015-08-24  1:04       ` George Spelvin
2015-08-24  7:34         ` [PATCH 3/3 v4] " Ingo Molnar
2015-08-24  7:47           ` Ingo Molnar
2015-08-24  7:50             ` [PATCH 3/3 v5] " Ingo Molnar
2015-08-24 12:54               ` George Spelvin
2015-08-25  9:56                 ` [PATCH 3/3 v6] " Ingo Molnar
2015-08-25 10:36                   ` George Spelvin
2015-08-25 12:59                   ` Peter Zijlstra
2015-08-25 14:19                   ` Rasmus Villemoes
2015-08-25 15:11                     ` George Spelvin
2015-08-24 13:11           ` [PATCH 3/3 v4] " John Stoffel
2015-08-24 15:11             ` George Spelvin
2015-08-24 15:55               ` John Stoffel
2015-08-25 12:46       ` [PATCH 3/3 v3] " Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150823081750.GA28349@gmail.com \
    --to=mingo@kernel.org \
    --cc=dave@sr71.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@horizon.com \
    --cc=linux@rasmusvillemoes.dk \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox