From: Ingo Molnar <mingo@kernel.org>
To: George Spelvin <linux@horizon.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
torvalds@linux-foundation.org, Dave Hansen <dave@sr71.net>,
Peter Zijlstra <peterz@infradead.org>,
David Rientjes <rientjes@google.com>,
Rik van Riel <riel@redhat.com>,
Rasmus Villemoes <linux@rasmusvillemoes.dk>
Subject: Re: [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics
Date: Sun, 23 Aug 2015 08:04:43 +0200 [thread overview]
Message-ID: <20150823060443.GA9882@gmail.com> (raw)
In-Reply-To: <20150823044839.5727.qmail@ns.horizon.com>
* George Spelvin <linux@horizon.com> wrote:
> Linus wrote:
> > I don't think any of this can be called "correct", in that the
> > unlocked accesses to the cached state are clearly racy, but I think
> > it's very much "acceptable".
>
> I'd think you could easily fix that with a seqlock-like system.
>
> What makes it so simple is that you can always fall back to
> calc_vmalloc_info if there's any problem, rather than looping or blocking.
>
> The basic idea is that you have a seqlock counter, but if either of
> the two lsbits are set, the cached information is stale.
>
> Basically, you need a seqlock and a spinlock. The seqlock does
> most of the work, and the spinlock ensures that there's only one
> updater of the cache.
>
> vmap_unlock() does set_bit(0, &seq->sequence). This marks the information
> as stale.
>
> get_vmalloc_info reads the seqlock. There are two case:
> - If the two lsbits are 10, the cached information is valid.
> Copy it out, re-check the seqlock, and loop if the sequence
> number changes.
> - In any other case, the cached information is
> not valid.
> - Try to obtain the spinlock. Do not block if it's unavailable.
> - If unavailable, do not block.
> - If the lock is acquired:
> - Set the sequence to (sequence | 3) + 1 (we're the only writer)
> - This bumps the sequence number and leaves the lsbits at 00 (invalid)
> - Memory barrier TBD. Do the RCU ops in calc_vmalloc_info do it for us?
> - Call calc_vmalloc_info
> - If we obtained the spinlock earlier:
> - Copy our vmi to cached_info
> - smp_wmb()
> - set_bit(1, &seq->sequence). This marks the information as valid,
> as long as bit 0 is still clear.
> - Release the spinlock.
>
> Basically, bit 0 says "vmalloc info has changed", and bit 1 says
> "vmalloc cache has been updated". This clears bit 0 before
> starting the update so that an update during calc_vmalloc_info
> will force a new update.
>
> So the three case are basically:
> 00 - calc_vmalloc_info() in progress
> 01 - vmap_unlock() during calc_vmalloc_info()
> 10 - cached_info is valid
> 11 - vmap_unlock has invalidated cached_info, awaiting refresh
>
> Logically, the sequence number should be initialized to ...01,
> but the code above handles 00 okay.
I think this is too complex.
How about something simple like the patch below (on top of the third patch)?
It makes the vmalloc info transactional - /proc/meminfo will always print a
consistent set of numbers. (Not that we really care about races there, but it
looks really simple to solve so why not.)
( I also moved the function-static cache next to the flag and seqlock - this
should further compress the cache footprint. )
Have I missed anything? Very lightly tested: booted in a VM.
Thanks,
Ingo
=========================>
mm/vmalloc.c | 23 ++++++++++++++++++-----
1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ef48e557df5a..66726f41e726 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -278,7 +278,15 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
static __cacheline_aligned_in_smp DEFINE_SPINLOCK(vmap_area_lock);
+/*
+ * Seqlock and flag for the vmalloc info cache printed in /proc/meminfo.
+ *
+ * The assumption of the optimization is that it's read frequently, but
+ * modified infrequently.
+ */
+static DEFINE_SEQLOCK(vmap_info_lock);
static int vmap_info_changed;
+static struct vmalloc_info vmap_info_cache;
static inline void vmap_lock(void)
{
@@ -2752,10 +2760,14 @@ static void calc_vmalloc_info(struct vmalloc_info *vmi)
void get_vmalloc_info(struct vmalloc_info *vmi)
{
- static struct vmalloc_info cached_info;
+ if (!READ_ONCE(vmap_info_changed)) {
+ unsigned int seq;
+
+ do {
+ seq = read_seqbegin(&vmap_info_lock);
+ *vmi = vmap_info_cache;
+ } while (read_seqretry(&vmap_info_lock, seq));
- if (!vmap_info_changed) {
- *vmi = cached_info;
return;
}
@@ -2764,8 +2776,9 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
calc_vmalloc_info(vmi);
- barrier();
- cached_info = *vmi;
+ write_seqlock(&vmap_info_lock);
+ vmap_info_cache = *vmi;
+ write_sequnlock(&vmap_info_lock);
}
#endif
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-08-23 6:04 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-23 4:48 [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics George Spelvin
2015-08-23 6:04 ` Ingo Molnar [this message]
2015-08-23 6:46 ` George Spelvin
2015-08-23 8:17 ` [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info Ingo Molnar
2015-08-23 20:53 ` Rasmus Villemoes
2015-08-24 6:58 ` Ingo Molnar
2015-08-24 8:39 ` Rasmus Villemoes
2015-08-23 21:56 ` Rasmus Villemoes
2015-08-24 7:00 ` Ingo Molnar
2015-08-25 16:39 ` Linus Torvalds
2015-08-25 17:03 ` Linus Torvalds
2015-08-24 1:04 ` George Spelvin
2015-08-24 7:34 ` [PATCH 3/3 v4] " Ingo Molnar
2015-08-24 7:47 ` Ingo Molnar
2015-08-24 7:50 ` [PATCH 3/3 v5] " Ingo Molnar
2015-08-24 12:54 ` George Spelvin
2015-08-25 9:56 ` [PATCH 3/3 v6] " Ingo Molnar
2015-08-25 10:36 ` George Spelvin
2015-08-25 12:59 ` Peter Zijlstra
2015-08-25 14:19 ` Rasmus Villemoes
2015-08-25 15:11 ` George Spelvin
2015-08-24 13:11 ` [PATCH 3/3 v4] " John Stoffel
2015-08-24 15:11 ` George Spelvin
2015-08-24 15:55 ` John Stoffel
2015-08-25 12:46 ` [PATCH 3/3 v3] " Peter Zijlstra
-- strict thread matches above, loose matches on Subject: below --
2015-08-22 10:44 [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics Ingo Molnar
2015-08-22 14:36 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150823060443.GA9882@gmail.com \
--to=mingo@kernel.org \
--cc=dave@sr71.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux@horizon.com \
--cc=linux@rasmusvillemoes.dk \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
--cc=rientjes@google.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).