From: Ingo Molnar <mingo@kernel.org>
To: George Spelvin <linux@horizon.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	torvalds@linux-foundation.org, Dave Hansen <dave@sr71.net>,
	Peter Zijlstra <peterz@infradead.org>,
	David Rientjes <rientjes@google.com>,
	Rik van Riel <riel@redhat.com>,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>
Subject: Re: [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics
Date: Sun, 23 Aug 2015 08:04:43 +0200
Message-ID: <20150823060443.GA9882@gmail.com>
In-Reply-To: <20150823044839.5727.qmail@ns.horizon.com>


* George Spelvin <linux@horizon.com> wrote:

> Linus wrote:
> > I don't think any of this can be called "correct", in that the
> > unlocked accesses to the cached state are clearly racy, but I think
> > it's very much "acceptable".
> 
> I'd think you could easily fix that with a seqlock-like system.
> 
> What makes it so simple is that you can always fall back to
> calc_vmalloc_info if there's any problem, rather than looping or blocking.
> 
> The basic idea is that you have a seqlock counter, but unless the
> two lsbits are 10, the cached information is stale.
> 
> Basically, you need a seqlock and a spinlock.  The seqlock does
> most of the work, and the spinlock ensures that there's only one
> updater of the cache.
> 
> vmap_unlock() does set_bit(0, &seq->sequence).  This marks the information
> as stale.
> 
> get_vmalloc_info reads the seqlock.  There are two cases:
> - If the two lsbits are 10, the cached information is valid.
>   Copy it out, re-check the seqlock, and loop if the sequence
>   number changes.
> - In any other case, the cached information is
>   not valid.
>   - Try to obtain the spinlock, without blocking.
>     - If it's unavailable, carry on without it.
>     - If the lock is acquired:
>       - Set the sequence to (sequence | 3) + 1 (we're the only writer)
>       - This bumps the sequence number and leaves the lsbits at 00 (invalid)
>       - Memory barrier TBD.  Do the RCU ops in calc_vmalloc_info do it for us?
>   - Call calc_vmalloc_info
>   - If we obtained the spinlock earlier:
>     - Copy our vmi to cached_info
>     - smp_wmb()
>     - set_bit(1, &seq->sequence).  This marks the information as valid,
>       as long as bit 0 is still clear.
>     - Release the spinlock.
> 
> Basically, bit 0 says "vmalloc info has changed", and bit 1 says
> "vmalloc cache has been updated".  This clears bit 0 before
> starting the update so that an update during calc_vmalloc_info
> will force a new update.
> 
> So the four cases are basically:
> 00 - calc_vmalloc_info() in progress
> 01 - vmap_unlock() during calc_vmalloc_info()
> 10 - cached_info is valid
> 11 - vmap_unlock has invalidated cached_info, awaiting refresh
> 
> Logically, the sequence number should be initialized to ...01,
> but the code above handles 00 okay.
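
For reference, the quoted scheme can be sketched in userspace C11 roughly as follows. This is only one reading of it, not kernel code: struct vmi_sketch, live_info, slow_path_calls and every *_sketch name are hypothetical stand-ins, an atomic_flag plays the role of the trylock'ed spinlock, and sequentially consistent atomics are used throughout because the proposal leaves the barrier question open. On a changed sequence number the reader falls back to the slow path instead of looping, which the proposal also allows.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct vmalloc_info. */
struct vmi_sketch {
	unsigned long used;
	unsigned long largest_chunk;
};

static atomic_uint seq = 1;			/* low bits 01: cache is stale */
static atomic_flag updater = ATOMIC_FLAG_INIT;	/* stand-in for a trylock'ed spinlock */
static atomic_ulong cached_used, cached_largest;
static struct vmi_sketch live_info;		/* stand-in for the vmap area list */
static unsigned int slow_path_calls;		/* instrumentation only, not thread-safe */

/* Stand-in for calc_vmalloc_info(): the expensive walk. */
static void calc_info_sketch(struct vmi_sketch *vmi)
{
	slow_path_calls++;
	*vmi = live_info;
}

/* Bit 0 set: "vmalloc info has changed". */
static void vmap_unlock_sketch(void)
{
	atomic_fetch_or(&seq, 1u);
}

static void get_info_sketch(struct vmi_sketch *vmi)
{
	unsigned int s = atomic_load(&seq);
	bool locked;

	if ((s & 3u) == 2u) {
		/* Low bits 10: copy the cache out, then re-check the sequence. */
		vmi->used = atomic_load(&cached_used);
		vmi->largest_chunk = atomic_load(&cached_largest);
		if (atomic_load(&seq) == s)
			return;
		/* Sequence moved under us: fall back to the slow path. */
	}

	/* Try to become the single cache updater; never block. */
	locked = !atomic_flag_test_and_set(&updater);
	if (locked) {
		/* (seq | 3) + 1 bumps the counter and leaves the low bits at 00. */
		s = atomic_load(&seq);
		atomic_store(&seq, (s | 3u) + 1u);
		/* Only the updater ever sets bit 1, so it stays clear for now. */
		assert((atomic_load(&seq) & 2u) == 0);
	}

	calc_info_sketch(vmi);

	if (locked) {
		atomic_store(&cached_used, vmi->used);
		atomic_store(&cached_largest, vmi->largest_chunk);
		/* Bit 1 set: cache is valid, provided bit 0 is still clear. */
		atomic_fetch_or(&seq, 2u);
		atomic_flag_clear(&updater);
	}
}
```

Starting from ...01, the first get_info_sketch() call takes the slow path and leaves the low bits at 10; later calls are served from the cache until vmap_unlock_sketch() sets bit 0 again.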

I think this is too complex.

How about something simple like the patch below (on top of the third patch)?

It makes the vmalloc info transactional - /proc/meminfo will always print a 
consistent set of numbers. (Not that we really care about races there, but it 
looks really simple to solve so why not.)

( I also moved the function-static cache next to the flag and seqlock - this
  should further compress the cache footprint. )

Have I missed anything? Very lightly tested: booted in a VM.

Thanks,

	Ingo

=========================>

 mm/vmalloc.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ef48e557df5a..66726f41e726 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -278,7 +278,15 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 
 static __cacheline_aligned_in_smp DEFINE_SPINLOCK(vmap_area_lock);
 
+/*
+ * Seqlock and flag for the vmalloc info cache printed in /proc/meminfo.
+ *
+ * The assumption of the optimization is that it's read frequently, but
+ * modified infrequently.
+ */
+static DEFINE_SEQLOCK(vmap_info_lock);
 static int vmap_info_changed;
+static struct vmalloc_info vmap_info_cache;
 
 static inline void vmap_lock(void)
 {
@@ -2752,10 +2760,14 @@ static void calc_vmalloc_info(struct vmalloc_info *vmi)
 
 void get_vmalloc_info(struct vmalloc_info *vmi)
 {
-	static struct vmalloc_info cached_info;
+	if (!READ_ONCE(vmap_info_changed)) {
+		unsigned int seq;
+
+		do {
+			seq = read_seqbegin(&vmap_info_lock);
+			*vmi = vmap_info_cache;
+		} while (read_seqretry(&vmap_info_lock, seq));
 
-	if (!vmap_info_changed) {
-		*vmi = cached_info;
 		return;
 	}
 
@@ -2764,8 +2776,9 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
 
 	calc_vmalloc_info(vmi);
 
-	barrier();
-	cached_info = *vmi;
+	write_seqlock(&vmap_info_lock);
+	vmap_info_cache = *vmi;
+	write_sequnlock(&vmap_info_lock);
 }
 
 #endif
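
For readers unfamiliar with the read_seqbegin()/read_seqretry() loop used in the patch, here is a minimal userspace sketch of the same retry pattern in C11 atomics. It is an illustration under assumptions, not the kernel's seqlock implementation: all *_sketch names, seqcount, cache_used and cache_largest are hypothetical, and a single writer is assumed (the kernel's write_seqlock() additionally takes a spinlock to serialize writers).

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct vmalloc_info. */
struct info_sketch {
	unsigned long used;
	unsigned long largest_chunk;
};

static atomic_uint seqcount;	/* even: stable, odd: write in progress */
static atomic_ulong cache_used, cache_largest;

/* Wait for an even sequence number and return it. */
static unsigned int read_begin_sketch(void)
{
	unsigned int s;

	while ((s = atomic_load_explicit(&seqcount, memory_order_acquire)) & 1u)
		;	/* a writer is mid-update: spin */
	return s;
}

/* Nonzero if a writer ran since read_begin_sketch() returned 'start'. */
static int read_retry_sketch(unsigned int start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&seqcount, memory_order_relaxed) != start;
}

/* Single-writer update: odd while writing, even again when done. */
static void cache_write_sketch(const struct info_sketch *vmi)
{
	unsigned int s = atomic_load_explicit(&seqcount, memory_order_relaxed);

	atomic_store_explicit(&seqcount, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&cache_used, vmi->used, memory_order_relaxed);
	atomic_store_explicit(&cache_largest, vmi->largest_chunk,
			      memory_order_relaxed);
	atomic_store_explicit(&seqcount, s + 2, memory_order_release);
}

/* Transactional read: retry until both fields come from the same update. */
static void cache_read_sketch(struct info_sketch *vmi)
{
	unsigned int s;

	do {
		s = read_begin_sketch();
		vmi->used = atomic_load_explicit(&cache_used,
						 memory_order_relaxed);
		vmi->largest_chunk = atomic_load_explicit(&cache_largest,
							  memory_order_relaxed);
	} while (read_retry_sketch(s));
}
```

The reader never blocks a writer; it only pays for a retry when an update overlaps its copy, which fits the "read frequently, modified infrequently" assumption stated in the patch comment.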


