* Re: [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics
@ 2015-08-23 4:48 George Spelvin
2015-08-23 6:04 ` Ingo Molnar
0 siblings, 1 reply; 25+ messages in thread
From: George Spelvin @ 2015-08-23 4:48 UTC (permalink / raw)
To: mingo; +Cc: linux, linux-kernel, linux-mm, torvalds
Linus wrote:
> I don't think any of this can be called "correct", in that the
> unlocked accesses to the cached state are clearly racy, but I think
> it's very much "acceptable".
I'd think you could easily fix that with a seqlock-like system.
What makes it so simple is that you can always fall back to
calc_vmalloc_info if there's any problem, rather than looping or blocking.
The basic idea is that you have a seqlock counter, but if either of
the two lsbits are set, the cached information is stale.
Basically, you need a seqlock and a spinlock. The seqlock does
most of the work, and the spinlock ensures that there's only one
updater of the cache.
vmap_unlock() does set_bit(0, &seq->sequence). This marks the information
as stale.
get_vmalloc_info reads the seqlock. There are two case:
- If the two lsbits are 10, the cached information is valid.
Copy it out, re-check the seqlock, and loop if the sequence
number changes.
- In any other case, the cached information is
not valid.
- Try to obtain the spinlock. Do not block if it's unavailable.
- If unavailable, do not block.
- If the lock is acquired:
- Set the sequence to (sequence | 3) + 1 (we're the only writer)
- This bumps the sequence number and leaves the lsbits at 00 (invalid)
- Memory barrier TBD. Do the RCU ops in calc_vmalloc_info do it for us?
- Call calc_vmalloc_info
- If we obtained the spinlock earlier:
- Copy our vmi to cached_info
- smp_wmb()
- set_bit(1, &seq->sequence). This marks the information as valid,
as long as bit 0 is still clear.
- Release the spinlock.
Basically, bit 0 says "vmalloc info has changed", and bit 1 says
"vmalloc cache has been updated". This clears bit 0 before
starting the update so that an update during calc_vmalloc_info
will force a new update.
So the three case are basically:
00 - calc_vmalloc_info() in progress
01 - vmap_unlock() during calc_vmalloc_info()
10 - cached_info is valid
11 - vmap_unlock has invalidated cached_info, awaiting refresh
Logically, the sequence number should be initialized to ...01,
but the code above handles 00 okay.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics 2015-08-23 4:48 [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics George Spelvin @ 2015-08-23 6:04 ` Ingo Molnar 2015-08-23 6:46 ` George Spelvin 0 siblings, 1 reply; 25+ messages in thread From: Ingo Molnar @ 2015-08-23 6:04 UTC (permalink / raw) To: George Spelvin Cc: linux-kernel, linux-mm, torvalds, Dave Hansen, Peter Zijlstra, David Rientjes, Rik van Riel, Rasmus Villemoes * George Spelvin <linux@horizon.com> wrote: > Linus wrote: > > I don't think any of this can be called "correct", in that the > > unlocked accesses to the cached state are clearly racy, but I think > > it's very much "acceptable". > > I'd think you could easily fix that with a seqlock-like system. > > What makes it so simple is that you can always fall back to > calc_vmalloc_info if there's any problem, rather than looping or blocking. > > The basic idea is that you have a seqlock counter, but if either of > the two lsbits are set, the cached information is stale. > > Basically, you need a seqlock and a spinlock. The seqlock does > most of the work, and the spinlock ensures that there's only one > updater of the cache. > > vmap_unlock() does set_bit(0, &seq->sequence). This marks the information > as stale. > > get_vmalloc_info reads the seqlock. There are two case: > - If the two lsbits are 10, the cached information is valid. > Copy it out, re-check the seqlock, and loop if the sequence > number changes. > - In any other case, the cached information is > not valid. > - Try to obtain the spinlock. Do not block if it's unavailable. > - If unavailable, do not block. > - If the lock is acquired: > - Set the sequence to (sequence | 3) + 1 (we're the only writer) > - This bumps the sequence number and leaves the lsbits at 00 (invalid) > - Memory barrier TBD. Do the RCU ops in calc_vmalloc_info do it for us? > - Call calc_vmalloc_info > - If we obtained the spinlock earlier: > - Copy our vmi to cached_info > - smp_wmb() > - set_bit(1, &seq->sequence). This marks the information as valid, > as long as bit 0 is still clear. > - Release the spinlock. > > Basically, bit 0 says "vmalloc info has changed", and bit 1 says > "vmalloc cache has been updated". This clears bit 0 before > starting the update so that an update during calc_vmalloc_info > will force a new update. > > So the three case are basically: > 00 - calc_vmalloc_info() in progress > 01 - vmap_unlock() during calc_vmalloc_info() > 10 - cached_info is valid > 11 - vmap_unlock has invalidated cached_info, awaiting refresh > > Logically, the sequence number should be initialized to ...01, > but the code above handles 00 okay. I think this is too complex. How about something simple like the patch below (on top of the third patch)? It makes the vmalloc info transactional - /proc/meminfo will always print a consistent set of numbers. (Not that we really care about races there, but it looks really simple to solve so why not.) ( I also moved the function-static cache next to the flag and seqlock - this should further compress the cache footprint. ) Have I missed anything? Very lightly tested: booted in a VM. Thanks, Ingo =========================> mm/vmalloc.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index ef48e557df5a..66726f41e726 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -278,7 +278,15 @@ EXPORT_SYMBOL(vmalloc_to_pfn); static __cacheline_aligned_in_smp DEFINE_SPINLOCK(vmap_area_lock); +/* + * Seqlock and flag for the vmalloc info cache printed in /proc/meminfo. + * + * The assumption of the optimization is that it's read frequently, but + * modified infrequently. + */ +static DEFINE_SEQLOCK(vmap_info_lock); static int vmap_info_changed; +static struct vmalloc_info vmap_info_cache; static inline void vmap_lock(void) { @@ -2752,10 +2760,14 @@ static void calc_vmalloc_info(struct vmalloc_info *vmi) void get_vmalloc_info(struct vmalloc_info *vmi) { - static struct vmalloc_info cached_info; + if (!READ_ONCE(vmap_info_changed)) { + unsigned int seq; + + do { + seq = read_seqbegin(&vmap_info_lock); + *vmi = vmap_info_cache; + } while (read_seqretry(&vmap_info_lock, seq)); - if (!vmap_info_changed) { - *vmi = cached_info; return; } @@ -2764,8 +2776,9 @@ void get_vmalloc_info(struct vmalloc_info *vmi) calc_vmalloc_info(vmi); - barrier(); - cached_info = *vmi; + write_seqlock(&vmap_info_lock); + vmap_info_cache = *vmi; + write_sequnlock(&vmap_info_lock); } #endif -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics 2015-08-23 6:04 ` Ingo Molnar @ 2015-08-23 6:46 ` George Spelvin 2015-08-23 8:17 ` [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info Ingo Molnar 0 siblings, 1 reply; 25+ messages in thread From: George Spelvin @ 2015-08-23 6:46 UTC (permalink / raw) To: linux, mingo Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds Ingo Molnar <mingo@kernel.org> wrote: > I think this is too complex. > > How about something simple like the patch below (on top of the third patch)? > It makes the vmalloc info transactional - /proc/meminfo will always print a > consistent set of numbers. (Not that we really care about races there, but it > looks really simple to solve so why not.) Looks like a huge simplification! It needs a comment about the approximate nature of the locking and the obvious race conditions: 1) The first caller to get_vmalloc_info() clears vmap_info_changed before updating vmap_info_cache, so a second caller is likely to get stale data for the duration of a calc_vmalloc_info call. 2) Although unlikely, it's possible for two threads to race calling calc_vmalloc_info, and the one that computes fresher data updates the cache first, so the later write leaves stale data. Other issues: 3) Me, I'd make vmap_info_changed a bool, for documentation more than any space saving. 4) I wish there were a trylock version of write_seqlock, so we could avoid blocking entirely. (You *could* hand-roll it, but that eats into the simplicity.) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-23 6:46 ` George Spelvin @ 2015-08-23 8:17 ` Ingo Molnar 2015-08-23 20:53 ` Rasmus Villemoes ` (3 more replies) 0 siblings, 4 replies; 25+ messages in thread From: Ingo Molnar @ 2015-08-23 8:17 UTC (permalink / raw) To: George Spelvin Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds * George Spelvin <linux@horizon.com> wrote: > Ingo Molnar <mingo@kernel.org> wrote: > > I think this is too complex. > > > > How about something simple like the patch below (on top of the third patch)? > > > It makes the vmalloc info transactional - /proc/meminfo will always print a > > consistent set of numbers. (Not that we really care about races there, but it > > looks really simple to solve so why not.) > > Looks like a huge simplification! > > It needs a comment about the approximate nature of the locking and > the obvious race conditions: > 1) The first caller to get_vmalloc_info() clears vmap_info_changed > before updating vmap_info_cache, so a second caller is likely to > get stale data for the duration of a calc_vmalloc_info call. > 2) Although unlikely, it's possible for two threads to race calling > calc_vmalloc_info, and the one that computes fresher data updates > the cache first, so the later write leaves stale data. > > Other issues: > 3) Me, I'd make vmap_info_changed a bool, for documentation more than > any space saving. > 4) I wish there were a trylock version of write_seqlock, so we could > avoid blocking entirely. (You *could* hand-roll it, but that eats > into the simplicity.) Ok, fair enough - so how about the attached approach instead, which uses a 64-bit generation counter to track changes to the vmalloc state. This is still very simple, but should not suffer from stale data being returned indefinitely in /proc/meminfo. We might race - but that was true before as well due to the lock-less RCU list walk - but we'll always return a correct and consistent version of the information. Lightly tested. This is a replacement patch to make it easier to read via email. I also made sure there's no extra overhead in the !CONFIG_PROC_FS case. Note that there's an even simpler variant possible I think: we could use just the two generation counters and barriers to remove the seqlock. Thanks, Ingo ==============================> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-23 8:17 ` [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info Ingo Molnar @ 2015-08-23 20:53 ` Rasmus Villemoes 2015-08-24 6:58 ` Ingo Molnar 2015-08-23 21:56 ` Rasmus Villemoes ` (2 subsequent siblings) 3 siblings, 1 reply; 25+ messages in thread From: Rasmus Villemoes @ 2015-08-23 20:53 UTC (permalink / raw) To: Ingo Molnar Cc: George Spelvin, dave, linux-kernel, linux-mm, peterz, riel, rientjes, torvalds On Sun, Aug 23 2015, Ingo Molnar <mingo@kernel.org> wrote: > Ok, fair enough - so how about the attached approach instead, which > uses a 64-bit generation counter to track changes to the vmalloc > state. How does this invalidation approach compare to the jiffies approach? In other words, how often does the vmalloc info actually change (or rather, in this approximation, how often is vmap_area_lock taken)? In particular, does it also solve the problem with git's test suite and similar situations with lots of short-lived processes? > ==============================> > From f9fd770e75e2edb4143f32ced0b53d7a77969c94 Mon Sep 17 00:00:00 2001 > From: Ingo Molnar <mingo@kernel.org> > Date: Sat, 22 Aug 2015 12:28:01 +0200 > Subject: [PATCH] mm/vmalloc: Cache the vmalloc memory info > > Linus reported that glibc (rather stupidly) reads /proc/meminfo > for every sysinfo() call, Not quite: It is done by the two functions get_{av,}phys_pages functions; and get_phys_pages is called (once per process) by glibc's qsort implementation. In fact, sysinfo() is (at least part of) the cure, not the disease. Whether qsort should care about the total amount of memory is another discussion. <http://thread.gmane.org/gmane.comp.lib.glibc.alpha/54342/focus=54558> Rasmus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-23 20:53 ` Rasmus Villemoes @ 2015-08-24 6:58 ` Ingo Molnar 2015-08-24 8:39 ` Rasmus Villemoes 0 siblings, 1 reply; 25+ messages in thread From: Ingo Molnar @ 2015-08-24 6:58 UTC (permalink / raw) To: Rasmus Villemoes Cc: George Spelvin, dave, linux-kernel, linux-mm, peterz, riel, rientjes, torvalds * Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > On Sun, Aug 23 2015, Ingo Molnar <mingo@kernel.org> wrote: > > > Ok, fair enough - so how about the attached approach instead, which > > uses a 64-bit generation counter to track changes to the vmalloc > > state. > > How does this invalidation approach compare to the jiffies approach? In > other words, how often does the vmalloc info actually change (or rather, > in this approximation, how often is vmap_area_lock taken)? In > particular, does it also solve the problem with git's test suite and > similar situations with lots of short-lived processes? The two approaches are pretty similar, and in a typical distro with typical workload vmalloc() is mostly a boot time affair. But vmalloc() can be used more often in certain corner cases - neither of the patches makes that in any way slower, just the optimization won't trigger that often. Since vmalloc() use is suboptimal for several reasons (it does not use large pages for kernel space allocations, etc.), this is all pretty OK IMHO. > > ==============================> > > From f9fd770e75e2edb4143f32ced0b53d7a77969c94 Mon Sep 17 00:00:00 2001 > > From: Ingo Molnar <mingo@kernel.org> > > Date: Sat, 22 Aug 2015 12:28:01 +0200 > > Subject: [PATCH] mm/vmalloc: Cache the vmalloc memory info > > > > Linus reported that glibc (rather stupidly) reads /proc/meminfo > > for every sysinfo() call, > > Not quite: It is done by the two functions get_{av,}phys_pages > functions; and get_phys_pages is called (once per process) by glibc's > qsort implementation. In fact, sysinfo() is (at least part of) the cure, > not the disease. Whether qsort should care about the total amount of > memory is another discussion. > > <http://thread.gmane.org/gmane.comp.lib.glibc.alpha/54342/focus=54558> Thanks, is the fixed up changelog below better? Ingo ===============> mm/vmalloc: Cache the vmalloc memory info Linus reported that for scripting-intense workloads such as the Git build, glibc's qsort will read /proc/meminfo for every process created (by way of get_phys_pages()), which causes the Git build to generate a surprising amount of kernel overhead. A fair chunk of the overhead is due to get_vmalloc_info() - which walks a potentially long list to do its statistics. Modify Linus's jiffies based patch to use generation counters to cache the vmalloc info: vmap_unlock() increases the generation counter, and the get_vmalloc_info() reads it and compares it against a cached generation counter. Also use a seqlock to make sure we always print a consistent set of vmalloc statistics. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Signed-off-by: Ingo Molnar <mingo@kernel.org> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-24 6:58 ` Ingo Molnar @ 2015-08-24 8:39 ` Rasmus Villemoes 0 siblings, 0 replies; 25+ messages in thread From: Rasmus Villemoes @ 2015-08-24 8:39 UTC (permalink / raw) To: Ingo Molnar Cc: George Spelvin, dave, linux-kernel, linux-mm, peterz, riel, rientjes, torvalds On Mon, Aug 24 2015, Ingo Molnar <mingo@kernel.org> wrote: > > Thanks, is the fixed up changelog below better? > Yes, though Linus specifically referred to "make test" (but I guess one could/should consider that part of the build process). Rasmus > mm/vmalloc: Cache the vmalloc memory info > > Linus reported that for scripting-intense workloads such as the > Git build, glibc's qsort will read /proc/meminfo for every process > created (by way of get_phys_pages()), which causes the Git build > to generate a surprising amount of kernel overhead. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-23 8:17 ` [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info Ingo Molnar 2015-08-23 20:53 ` Rasmus Villemoes @ 2015-08-23 21:56 ` Rasmus Villemoes 2015-08-24 7:00 ` Ingo Molnar 2015-08-25 16:39 ` Linus Torvalds 2015-08-24 1:04 ` George Spelvin 2015-08-25 12:46 ` [PATCH 3/3 v3] " Peter Zijlstra 3 siblings, 2 replies; 25+ messages in thread From: Rasmus Villemoes @ 2015-08-23 21:56 UTC (permalink / raw) To: Ingo Molnar Cc: George Spelvin, dave, linux-kernel, linux-mm, peterz, riel, rientjes, torvalds I was curious why these fields were ever added to /proc/meminfo, and dug up this: commit d262ee3ee6ba4f5f6125571d93d9d63191d2ef76 Author: Andrew Morton <akpm@digeo.com> Date: Sat Apr 12 12:59:04 2003 -0700 [PATCH] vmalloc stats in /proc/meminfo From: Matt Porter <porter@cox.net> There was a thread a while back on lkml where Dave Hansen proposed this simple vmalloc usage reporting patch. The thread pretty much died out as most people seemed focused on what VM loading type bugs it could solve. I had posted that this type of information was really valuable in debugging embedded Linux board ports. A common example is where people do arch specific setup that limits there vmalloc space and then they find modules won't load. ;) Having the Vmalloc* info readily available is real useful in helping folks to fix their kernel ports. That thread is at <http://thread.gmane.org/gmane.linux.kernel/53360>. [Maybe one could just remove the fields and see if anybody actually notices/cares any longer. Or, if they are only used by kernel developers, put them in their own file.] Rasmus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-23 21:56 ` Rasmus Villemoes @ 2015-08-24 7:00 ` Ingo Molnar 2015-08-25 16:39 ` Linus Torvalds 1 sibling, 0 replies; 25+ messages in thread From: Ingo Molnar @ 2015-08-24 7:00 UTC (permalink / raw) To: Rasmus Villemoes Cc: George Spelvin, dave, linux-kernel, linux-mm, peterz, riel, rientjes, torvalds * Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > I was curious why these fields were ever added to /proc/meminfo, and dug > up this: > > commit d262ee3ee6ba4f5f6125571d93d9d63191d2ef76 > Author: Andrew Morton <akpm@digeo.com> > Date: Sat Apr 12 12:59:04 2003 -0700 > > [PATCH] vmalloc stats in /proc/meminfo > > From: Matt Porter <porter@cox.net> > > There was a thread a while back on lkml where Dave Hansen proposed this > simple vmalloc usage reporting patch. The thread pretty much died out as > most people seemed focused on what VM loading type bugs it could solve. I > had posted that this type of information was really valuable in debugging > embedded Linux board ports. A common example is where people do arch > specific setup that limits there vmalloc space and then they find modules > won't load. ;) Having the Vmalloc* info readily available is real useful in > helping folks to fix their kernel ports. > > That thread is at <http://thread.gmane.org/gmane.linux.kernel/53360>. > > [Maybe one could just remove the fields and see if anybody actually > notices/cares any longer. Or, if they are only used by kernel > developers, put them in their own file.] So instead of removing the fields (which I'm quite sure is an ABI breaker as it could break less robust /proc/meminfo parsers and scripts), we could just report '0' all the time - and have the real info somewhere else? Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-23 21:56 ` Rasmus Villemoes 2015-08-24 7:00 ` Ingo Molnar @ 2015-08-25 16:39 ` Linus Torvalds 2015-08-25 17:03 ` Linus Torvalds 1 sibling, 1 reply; 25+ messages in thread From: Linus Torvalds @ 2015-08-25 16:39 UTC (permalink / raw) To: Rasmus Villemoes Cc: Ingo Molnar, George Spelvin, Dave Hansen, Linux Kernel Mailing List, linux-mm, Peter Zijlstra, Rik van Riel, David Rientjes On Sun, Aug 23, 2015 at 2:56 PM, Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote: > > [Maybe one could just remove the fields and see if anybody actually > notices/cares any longer. Or, if they are only used by kernel > developers, put them in their own file.] I'm actually inclined to try exactly that for 4.3, and then take Ingo's patch as a fallback in case somebody actually notices. I'm not convinced anybody actually uses those values, and they are getting *less* relevant rather than more (on 64-bit, those values really don't matter, since the vmalloc space isn't really a limitation), so let's try removing them and seeing what happens. And then we know what we can do if somebody does actually notice. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-25 16:39 ` Linus Torvalds @ 2015-08-25 17:03 ` Linus Torvalds 0 siblings, 0 replies; 25+ messages in thread From: Linus Torvalds @ 2015-08-25 17:03 UTC (permalink / raw) To: Rasmus Villemoes Cc: Ingo Molnar, George Spelvin, Dave Hansen, Linux Kernel Mailing List, linux-mm, Peter Zijlstra, Rik van Riel, David Rientjes On Tue, Aug 25, 2015 at 9:39 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > I'm not convinced anybody actually uses those values, and they are > getting *less* relevant rather than more (on 64-bit, those values > really don't matter, since the vmalloc space isn't really a > limitation) Side note: the people who actually care about "my vmalloc area is too full, what's up?" would use /proc/vmallocinfo anyway, since that's what shows things like fragmentation etc. So I'm just talking about removing the /proc/meminfo part. First try to remove it *all*, and if there is some script that hollers because it wants to parse them, print out the values as zero. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-23 8:17 ` [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info Ingo Molnar 2015-08-23 20:53 ` Rasmus Villemoes 2015-08-23 21:56 ` Rasmus Villemoes @ 2015-08-24 1:04 ` George Spelvin 2015-08-24 7:34 ` [PATCH 3/3 v4] " Ingo Molnar 2015-08-25 12:46 ` [PATCH 3/3 v3] " Peter Zijlstra 3 siblings, 1 reply; 25+ messages in thread From: George Spelvin @ 2015-08-24 1:04 UTC (permalink / raw) To: linux, mingo Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds First, an actual, albeit minor, bug: initializing both vmap_info_gen and vmap_info_cache_gen to 0 marks the cache as valid, which it's not. vmap_info_gen should be initialized to 1 to force an initial cache update. Second, I don't see why you need a 64-bit counter. Seqlocks consider 32 bits (31 bits, actually, the lsbit means "update in progress") quite a strong enough guarantee. Third, it seems as though vmap_info_cache_gen is basically a duplicate of vmap_info_lock.sequence. It should be possible to make one variable serve both purposes. You just need a kludge to handle the case of multiple vamp_info updates between cache updates. There are two simple ones: 1) Avoid bumping vmap_info_gen unnecessarily. In vmap_unlock(), do vmap_info_gen = (vmap_info_lock.sequence | 1) + 1; 2) - Make vmap_info_gen a seqcount_t - In vmap_unlock(), do write_seqcount_barrier(&vmap_info_gen) - In get_vmalloc_info, inside the seqlock critical section, do vmap_info_lock.seqcount.sequence = vmap_info_gen.sequence - 1; (Using the vmap_info_gen.sequence read while validating the cache in the first place.) I should try to write an actual patch illustrating this. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 3/3 v4] mm/vmalloc: Cache the vmalloc memory info 2015-08-24 1:04 ` George Spelvin @ 2015-08-24 7:34 ` Ingo Molnar 2015-08-24 7:47 ` Ingo Molnar 2015-08-24 13:11 ` [PATCH 3/3 v4] " John Stoffel 0 siblings, 2 replies; 25+ messages in thread From: Ingo Molnar @ 2015-08-24 7:34 UTC (permalink / raw) To: George Spelvin Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds * George Spelvin <linux@horizon.com> wrote: > First, an actual, albeit minor, bug: initializing both vmap_info_gen > and vmap_info_cache_gen to 0 marks the cache as valid, which it's not. Ha! :-) Fixed. > vmap_info_gen should be initialized to 1 to force an initial > cache update. Yeah. > Second, I don't see why you need a 64-bit counter. Seqlocks consider > 32 bits (31 bits, actually, the lsbit means "update in progress") quite > a strong enough guarantee. Just out of general paranoia - but you are right, and this would lower the overhead on 32-bit SMP platforms a bit, plus it avoids 64-bit word tearing artifacts on 32 bit platforms as well. I modified it to u32. > Third, it seems as though vmap_info_cache_gen is basically a duplicate > of vmap_info_lock.sequence. It should be possible to make one variable > serve both purposes. Correct, I alluded to that in my description: > > Note that there's an even simpler variant possible I think: we could use just > > the two generation counters and barriers to remove the seqlock. > You just need a kludge to handle the case of multiple vamp_info updates > between cache updates. > > There are two simple ones: > > 1) Avoid bumping vmap_info_gen unnecessarily. In vmap_unlock(), do > vmap_info_gen = (vmap_info_lock.sequence | 1) + 1; > 2) - Make vmap_info_gen a seqcount_t > - In vmap_unlock(), do write_seqcount_barrier(&vmap_info_gen) > - In get_vmalloc_info, inside the seqlock critical section, do > vmap_info_lock.seqcount.sequence = vmap_info_gen.sequence - 1; > (Using the vmap_info_gen.sequence read while validating the > cache in the first place.) > > I should try to write an actual patch illustrating this. So I think something like the patch below is even simpler than trying to kludge generation counter semantics into seqcounts. I used two generation counters and a spinlock. The fast path is completely lockless and lightweight on modern SMP platforms. (where smp_rmb() is a no-op or very cheap.) There's not even a seqlock retry loop, instead an invalid cache causes us to fall back to the old behavior - and the freshest result is guaranteed to end up in the cache. The linecount got a bit larger: but half of it is comments. Note that the generation counters are signed integers so that this comparison can be done: + if (gen-vmap_info_cache_gen > 0) { Thanks, Ingo ======================> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v4] mm/vmalloc: Cache the vmalloc memory info 2015-08-24 7:34 ` [PATCH 3/3 v4] " Ingo Molnar @ 2015-08-24 7:47 ` Ingo Molnar 2015-08-24 7:50 ` [PATCH 3/3 v5] " Ingo Molnar 2015-08-24 13:11 ` [PATCH 3/3 v4] " John Stoffel 1 sibling, 1 reply; 25+ messages in thread From: Ingo Molnar @ 2015-08-24 7:47 UTC (permalink / raw) To: George Spelvin Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds * Ingo Molnar <mingo@kernel.org> wrote: > +/* > + * Return a consistent snapshot of the current vmalloc allocation > + * statistics, for /proc/meminfo: > + */ > +void get_vmalloc_info(struct vmalloc_info *vmi) > +{ > + int gen = READ_ONCE(vmap_info_gen); > + > + /* > + * If the generation counter of the cache matches that of > + * the vmalloc generation counter then return the cache: > + */ > + if (READ_ONCE(vmap_info_cache_gen) == gen) { > + int gen_after; > + > + /* > + * The two read barriers make sure that we read > + * 'gen', 'vmap_info_cache' and 'gen_after' in > + * precisely that order: > + */ > + smp_rmb(); > + *vmi = vmap_info_cache; > + > + smp_rmb(); > + gen_after = READ_ONCE(vmap_info_gen); > + > + /* The cache is still valid: */ > + if (gen == gen_after) > + return; > + > + /* Ok, the cache got invalidated just now, regenerate it */ > + gen = gen_after; > + } One more detail: I just realized that with the read barriers, the READ_ONCE() accesses are not needed anymore - the barriers and the control dependencies are enough. This will further simplify the code. Thanks, Ingo -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 3/3 v5] mm/vmalloc: Cache the vmalloc memory info 2015-08-24 7:47 ` Ingo Molnar @ 2015-08-24 7:50 ` Ingo Molnar 2015-08-24 12:54 ` George Spelvin 0 siblings, 1 reply; 25+ messages in thread From: Ingo Molnar @ 2015-08-24 7:50 UTC (permalink / raw) To: George Spelvin Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds * Ingo Molnar <mingo@kernel.org> wrote: > One more detail: I just realized that with the read barriers, the READ_ONCE() > accesses are not needed anymore - the barriers and the control dependencies are > enough. > > This will further simplify the code. I.e. something like the updated patch below. (We still need the WRITE_ONCE() for vmap_info_gen update.) Thanks, Ingo ========================> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v5] mm/vmalloc: Cache the vmalloc memory info 2015-08-24 7:50 ` [PATCH 3/3 v5] " Ingo Molnar @ 2015-08-24 12:54 ` George Spelvin 2015-08-25 9:56 ` [PATCH 3/3 v6] " Ingo Molnar 0 siblings, 1 reply; 25+ messages in thread From: George Spelvin @ 2015-08-24 12:54 UTC (permalink / raw) To: linux, mingo Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds (I hope I'm not annoying you by bikeshedding this too much, although I think this is improving.) You've sort of re-invented spinlocks, but after thinking a bit, it all works. Rather than using a single word, which is incremented to an odd number at the start of an update and an even number at the end, there are two. An update is in progress when they're unequal. vmap_info_gen is incremented early, when the cache needs updating, and read late (after the cache is copied). vmap_info_cache_gen is incremented after the cache is updated, and read early (before the cache is copied). This is logically equivalent to my complicated scheme with atomic updates to various bits in a single generation word, but greatly simplified by having two separate words. In particular, there's no longer a need to distinguish "vmap has updated list" from "calc_vmalloc_info in progress". I particularly like the "gen - vmap_info_cache_gen > 0" test. You *must* test for inequality to prevent tearing of a valid cache (...grr...English heteronyms...), and given that, might as well require it be fresher. Anyway, suggested changes for v6 (sigh...): First: you do a second read of vmap_info_gen to optimize out the copy of vmalloc_info if it's easily seen as pointless, but given how small vmalloc_info is (two words!), i'd be inclined to omit that optimization. Copy always, *then* see if it's worth keeping. Smaller code, faster fast path, and is barely noticeable on the slow path. Second, and this is up to you, I'd be inclined to go fully non-blocking and only spin_trylock(). If that fails, just skip the cache update. Third, ANSI C rules allow a compiler to assume that signed integer overflow does not occur. That means that gcc is allowed to optimize "if (x - y > 0)" to "if (x > y)". Given that gcc has annoyed us by using this optimization in other contexts, It might be safer to make them unsigned (which is required to wrap properly) and cast to integer after subtraction. Basically, the following (untested, but pretty damn simple): +/* + * Return a consistent snapshot of the current vmalloc allocation + * statistics, for /proc/meminfo: + */ +void get_vmalloc_info(struct vmalloc_info *vmi) +{ + unsigned gen, cache_gen = READ_ONCE(vmap_info_cache_gen); + + /* + * The two read barriers make sure that we read + * 'cache_gen', 'vmap_info_cache' and 'gen' in + * precisely that order: + */ + smp_rmb(); + *vmi = vmap_info_cache; + + smp_rmb(); + gen = READ_ONCE(vmap_info_gen); + + /* + * If the generation counter of the cache matches that of + * the vmalloc generation counter then return the cache: + */ + if (gen == cache_gen) + return; + + /* Make sure 'gen' is read before the vmalloc info */ + smp_rmb(); + calc_vmalloc_info(vmi); + + /* + * All updates to vmap_info_cache_gen go through this spinlock, + * so when the cache got invalidated, we'll only mark it valid + * again if we first fully write the new vmap_info_cache. + * + * This ensures that partial results won't be used. + */ + if (spin_trylock(&vmap_info_lock)) { + if ((int)(gen - vmap_info_cache_gen) > 0) { + vmap_info_cache = *vmi; + /* + * Make sure the new cached data is visible before + * the generation counter update: + */ + smp_wmb(); + WRITE_ONCE(vmap_info_cache_gen, gen); + } + spin_unlock(&vmap_info_lock); + } +} + +#endif /* CONFIG_PROC_FS */ The only remaining *very small* nit is that this function is a mix of "return early" and "wrap it in an if()" style. If you want to make that "if (!spin_trylock(...)) return;", I leave that you your esthetic judgement. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 3/3 v6] mm/vmalloc: Cache the vmalloc memory info 2015-08-24 12:54 ` George Spelvin @ 2015-08-25 9:56 ` Ingo Molnar 2015-08-25 10:36 ` George Spelvin ` (2 more replies) 0 siblings, 3 replies; 25+ messages in thread From: Ingo Molnar @ 2015-08-25 9:56 UTC (permalink / raw) To: George Spelvin Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds * George Spelvin <linux@horizon.com> wrote: > (I hope I'm not annoying you by bikeshedding this too much, although I > think this is improving.) [ I don't mind, although I wish other, more critical parts of the kernel got this much attention as well ;-) ] > Anyway, suggested changes for v6 (sigh...): > > First: you do a second read of vmap_info_gen to optimize out the copy > of vmalloc_info if it's easily seen as pointless, but given how small > vmalloc_info is (two words!), i'd be inclined to omit that optimization. > > Copy always, *then* see if it's worth keeping. Smaller code, faster > fast path, and is barely noticeable on the slow path. Ok, done. > Second, and this is up to you, I'd be inclined to go fully non-blocking and > only spin_trylock(). If that fails, just skip the cache update. So I'm not sure about this one: we have no guarantee of the order every updater reaches the spinlock, and we want the 'freshest' updater to do the update. The trylock might cause us to drop the 'freshest' update erroneously - so this change would introduce a 'stale data' bug I think. > Third, ANSI C rules allow a compiler to assume that signed integer > overflow does not occur. That means that gcc is allowed to optimize > "if (x - y > 0)" to "if (x > y)". That's annoying ... > Given that gcc has annoyed us by using this optimization in other > contexts, It might be safer to make them unsigned (which is required to > wrap properly) and cast to integer after subtraction. Ok, done. > Basically, the following (untested, but pretty damn simple): I've attached v6 which applies your first and last suggestion, but not the trylock one. I also removed _ONCE() accesses from the places that didn't need them. I added your Reviewed-by optimistically, saving a v7 submission hopefully ;-) Lightly tested. Thanks, Ingo ==============================> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v6] mm/vmalloc: Cache the vmalloc memory info 2015-08-25 9:56 ` [PATCH 3/3 v6] " Ingo Molnar @ 2015-08-25 10:36 ` George Spelvin 2015-08-25 12:59 ` Peter Zijlstra 2015-08-25 14:19 ` Rasmus Villemoes 2 siblings, 0 replies; 25+ messages in thread From: George Spelvin @ 2015-08-25 10:36 UTC (permalink / raw) To: linux, mingo Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds >> Second, and this is up to you, I'd be inclined to go fully non-blocking >> and only spin_trylock(). If that fails, just skip the cache update. > So I'm not sure about this one: we have no guarantee of the order every > updater reaches the spinlock, and we want the 'freshest' updater to do > the update. The trylock might cause us to drop the 'freshest' update > erroneously - so this change would introduce a 'stale data' bug I think. Er, no it wouldn't. If someone leaves the cache stale, they'll leave vmap_info_cache_gen != vmap_info_gen and the next reader to come along will see the cache is stale and refresh it. If there's lock contention, there's a risk of more work, because the callers fall back to calc_vmalloc_info rather than waiting. But it's not a big risk. With the blocking code, if two readers arrive simultaneously and see a stale cache, they'll both call calc_vmalloc_info() and then line up to update the cache. The second will get the lock but then *not* update the cache. With trylock(), the second will just skip the update faster. Same number of calc_vmalloc_info() calls. The only inefficient case is if two readers arrive far enough apart that vmap_info_gen is updated between them, yet close enough together than the second arrives at the lock while the first is updating the cache. In that case, the second reader will not update the cache and (assuming no more changes to vmap_info_gen) some future third reader will have to duplicate the effort of calling calc_vmalloc_info(). But that's such a tiny window I give preference to my fondness for non-blocking code. > I added your Reviewed-by optimistically, saving a v7 submission hopefully ;-) You did right. As I said, the non-blocking part is your preference. I've done so much nasty stuff in interrupt handlers ($DAY_JOB more than kernel) that I go for the non-blocking algorithm whenever possible. Reviewed-by: George Spelvin <linux@horizon.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v6] mm/vmalloc: Cache the vmalloc memory info 2015-08-25 9:56 ` [PATCH 3/3 v6] " Ingo Molnar 2015-08-25 10:36 ` George Spelvin @ 2015-08-25 12:59 ` Peter Zijlstra 2015-08-25 14:19 ` Rasmus Villemoes 2 siblings, 0 replies; 25+ messages in thread From: Peter Zijlstra @ 2015-08-25 12:59 UTC (permalink / raw) To: Ingo Molnar Cc: George Spelvin, dave, linux-kernel, linux-mm, linux, riel, rientjes, torvalds On Tue, Aug 25, 2015 at 11:56:38AM +0200, Ingo Molnar wrote: > +void get_vmalloc_info(struct vmalloc_info *vmi) > +{ > + unsigned int cache_gen, gen; I see you dropped the u64 thing, good, ignore that previous email. > + > + /* > + * The common case is that the cache is valid, so first > + * read it, then check its validity. > + * > + * The two read barriers make sure that we read > + * 'cache_gen', 'vmap_info_cache' and 'gen' in > + * precisely that order: > + */ > + cache_gen = vmap_info_cache_gen; > + smp_rmb(); > + *vmi = vmap_info_cache; > + smp_rmb(); > + gen = vmap_info_gen; > + > + /* > + * If the generation counter of the cache matches that of > + * the vmalloc generation counter then return the cache: > + */ > + if (cache_gen == gen) > + return; There is one aspect of READ_ONCE() that is not replaced with smp_rmb(), and that is that READ_ONCE() should avoid split loads. Without READ_ONCE() the compiler is free to turn the loads into separate and out of order byte loads just because its insane, thereby also making the WRITE_ONCE() pointless. Now I'm fairly sure it all doesn't matter much, the info can change the moment we've copied it, so the read is inherently racy. And by that same argument I feel this is all somewhat over engineered. > + > + /* Make sure 'gen' is read before the vmalloc info: */ > + smp_rmb(); > + calc_vmalloc_info(vmi); > + > + /* > + * All updates to vmap_info_cache_gen go through this spinlock, > + * so when the cache got invalidated, we'll only mark it valid > + * again if we first fully write the new vmap_info_cache. > + * > + * This ensures that partial results won't be used and that the > + * vmalloc info belonging to the freshest update is used: > + */ > + spin_lock(&vmap_info_lock); > + if ((int)(gen-vmap_info_cache_gen) > 0) { > + vmap_info_cache = *vmi; > + /* > + * Make sure the new cached data is visible before > + * the generation counter update: > + */ > + smp_wmb(); > + vmap_info_cache_gen = gen; > + } > + spin_unlock(&vmap_info_lock); > +} > + > +#endif /* CONFIG_PROC_FS */ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v6] mm/vmalloc: Cache the vmalloc memory info 2015-08-25 9:56 ` [PATCH 3/3 v6] " Ingo Molnar 2015-08-25 10:36 ` George Spelvin 2015-08-25 12:59 ` Peter Zijlstra @ 2015-08-25 14:19 ` Rasmus Villemoes 2015-08-25 15:11 ` George Spelvin 2 siblings, 1 reply; 25+ messages in thread From: Rasmus Villemoes @ 2015-08-25 14:19 UTC (permalink / raw) To: Ingo Molnar Cc: George Spelvin, dave, linux-kernel, linux-mm, peterz, riel, rientjes, torvalds On Tue, Aug 25 2015, Ingo Molnar <mingo@kernel.org> wrote: > * George Spelvin <linux@horizon.com> wrote: > >> (I hope I'm not annoying you by bikeshedding this too much, although I >> think this is improving.) > > [ I don't mind, although I wish other, more critical parts of the kernel got this > much attention as well ;-) ] > Since we're beating dead horses, let me point out one possibly unintentional side-effect of initializing just one of vmap_info{,_cache}_gen: $ nm -n vmlinux | grep -E 'vmap_info(_cache)?_gen' ffffffff81e4e5e0 d vmap_info_gen ffffffff820d5700 b vmap_info_cache_gen [Up-thread, you wrote "I also moved the function-static cache next to the flag and seqlock - this should further compress the cache footprint."] One should probably ensure that they end up in the same cacheline if one wants the fast-path to be as fast as possible - the easiest way to ensure that is to put them in a small struct, and that might as well contain the spinlock and the cache itself as well. It's been fun seeing this evolve, but overall, I tend to agree with Peter: It's a lot of complexity for little gain. If we're not going to just kill the Vmalloc* fields (which is probably too controversial) I'd prefer Linus' simpler version. Rasmus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v6] mm/vmalloc: Cache the vmalloc memory info 2015-08-25 14:19 ` Rasmus Villemoes @ 2015-08-25 15:11 ` George Spelvin 0 siblings, 0 replies; 25+ messages in thread From: George Spelvin @ 2015-08-25 15:11 UTC (permalink / raw) To: linux, mingo Cc: dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds >>> (I hope I'm not annoying you by bikeshedding this too much, although I >>> think this is improving.) >> >> [ I don't mind, although I wish other, more critical parts of the kernel got this >> much attention as well ;-) ] That's the problem with small, understandable problems: people *aren't* scared to mess with them. > It's been fun seeing this evolve, but overall, I tend to agree with > Peter: It's a lot of complexity for little gain. If we're not going to > just kill the Vmalloc* fields (which is probably too controversial) > I'd prefer Linus' simpler version. Are you sure you're not being affected by the number of iterations? The final version is not actually a lot of code (although yes, more than Linus's), and offers the advantage of peace of mind: there's not some nasty-smelling code you can't entirely trust left behind. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v4] mm/vmalloc: Cache the vmalloc memory info 2015-08-24 7:34 ` [PATCH 3/3 v4] " Ingo Molnar 2015-08-24 7:47 ` Ingo Molnar @ 2015-08-24 13:11 ` John Stoffel 2015-08-24 15:11 ` George Spelvin 1 sibling, 1 reply; 25+ messages in thread From: John Stoffel @ 2015-08-24 13:11 UTC (permalink / raw) To: Ingo Molnar Cc: George Spelvin, dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds >>>>> "Ingo" == Ingo Molnar <mingo@kernel.org> writes: Ingo> * George Spelvin <linux@horizon.com> wrote: >> First, an actual, albeit minor, bug: initializing both vmap_info_gen >> and vmap_info_cache_gen to 0 marks the cache as valid, which it's not. Ingo> Ha! :-) Fixed. >> vmap_info_gen should be initialized to 1 to force an initial >> cache update. Blech, it should be initialized with a proper #define VMAP_CACHE_NEEDS_UPDATE 1, instead of more magic numbers. Ingo> + */ Ingo> +static DEFINE_SPINLOCK(vmap_info_lock); Ingo> +static int vmap_info_gen = 1; static int vmap_info_gen = VMAP_CACHE_NEEDS_UPDATE; Ingo> +static int vmap_info_cache_gen; Ingo> +static struct vmalloc_info vmap_info_cache; Ingo> +#endif This will help keep bugs like this out in the future... I hope! -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v4] mm/vmalloc: Cache the vmalloc memory info 2015-08-24 13:11 ` [PATCH 3/3 v4] " John Stoffel @ 2015-08-24 15:11 ` George Spelvin 2015-08-24 15:55 ` John Stoffel 0 siblings, 1 reply; 25+ messages in thread From: George Spelvin @ 2015-08-24 15:11 UTC (permalink / raw) To: john, mingo Cc: dave, linux-kernel, linux-mm, linux, linux, peterz, riel, rientjes, torvalds John Stoffel <john@stoffel.org> wrote: >> vmap_info_gen should be initialized to 1 to force an initial >> cache update. > Blech, it should be initialized with a proper #define > VMAP_CACHE_NEEDS_UPDATE 1, instead of more magic numbers. Er... this is a joke, right? First, this number is used exactly once, and it's not part of a collection of similar numbers. And the definition would be adjacent to the use. We have easier ways of accomplishing that, called "comments". Second, your proposed name is misleading. "needs update" is defined as vmap_info_gen != vmap_info_cache_gen. There is no particular value of either that has this meaning. For example, initializing vmap_info_cache_gen to -1 would do just as well. (I actually considered that before deciding that +1 was "simpler" than -1.) For some versions of the code, an *arbitrary* difference is okay. You could set one ot 0xDEADBEEF and the other to 0xFEEDFACE. For other versions, the magnitude matters, but not *too* much. Initializing it to 42 would be perfectly correct, but waste time doing 42 cache updates before settling down. Singling out the value 1 as VMAP_CACHE_NEEDS_UPDATE is actively misleading. > This will help keep bugs like this out in the future... I hope! And this is the punchline, right? The problem was not realizing that non-default initialization was required; what we *call* the non-default value is irrelevant. I doubt it would ever have been a real (i.e. noticeable) bug, actually; the first bit of vmap activity in very early boot would have invalidated the cache. (John, my apologies if I went over the top and am contributing to LKML's reputation for flaming. I *did* actually laugh, and *do* think it's a dumb idea, but my annoyance is really directed at unpleasant memories of mindless application of coding style guidelines. In this case, I suspect you just posted before reading carefully enough to see the subtle logic.) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v4] mm/vmalloc: Cache the vmalloc memory info 2015-08-24 15:11 ` George Spelvin @ 2015-08-24 15:55 ` John Stoffel 0 siblings, 0 replies; 25+ messages in thread From: John Stoffel @ 2015-08-24 15:55 UTC (permalink / raw) To: George Spelvin Cc: john, mingo, dave, linux-kernel, linux-mm, linux, peterz, riel, rientjes, torvalds George> John Stoffel <john@stoffel.org> wrote: >>> vmap_info_gen should be initialized to 1 to force an initial >>> cache update. >> Blech, it should be initialized with a proper #define >> VMAP_CACHE_NEEDS_UPDATE 1, instead of more magic numbers. George> Er... this is a joke, right? Not really. The comment made before was that by setting this variable to zero, it wasn't properly initialized. Which implies that either the API is wrong... or we should be documenting it better. I just went in the direction of the #define instead of a comment. George> First, this number is used exactly once, and it's not part of George> a collection of similar numbers. And the definition would be George> adjacent to the use. George> We have easier ways of accomplishing that, called "comments". Sure, that would be the better solution in this case. George> Second, your proposed name is misleading. "needs update" is defined George> as vmap_info_gen != vmap_info_cache_gen. There is no particular value George> of either that has this meaning. George> For example, initializing vmap_info_cache_gen to -1 would do just as well. George> (I actually considered that before deciding that +1 was "simpler" than -1.) See, I just threw out a dumb suggestion without reading the patch properly. My fault. George> (John, my apologies if I went over the top and am contributing to LKML's George> reputation for flaming. I *did* actually laugh, and *do* think it's a George> dumb idea, but my annoyance is really directed at unpleasant memories of George> mindless application of coding style guidelines. In this case, I suspect George> you just posted before reading carefully enough to see the subtle logic.) Nope, I'm in the wrong here. And your comment here is wonderful, I really do appreciate how you handled my ham fisted attempt to contribute. But I've got thick skin and I'll keep trying in my free time to comment on patches when I can. John -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info 2015-08-23 8:17 ` [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info Ingo Molnar ` (2 preceding siblings ...) 2015-08-24 1:04 ` George Spelvin @ 2015-08-25 12:46 ` Peter Zijlstra 3 siblings, 0 replies; 25+ messages in thread From: Peter Zijlstra @ 2015-08-25 12:46 UTC (permalink / raw) To: Ingo Molnar Cc: George Spelvin, dave, linux-kernel, linux-mm, linux, riel, rientjes, torvalds On Sun, Aug 23, 2015 at 10:17:51AM +0200, Ingo Molnar wrote: > +static u64 vmap_info_gen; > +static u64 vmap_info_cache_gen; > +void get_vmalloc_info(struct vmalloc_info *vmi) > +{ > + u64 gen = READ_ONCE(vmap_info_gen); > + > + /* > + * If the generation counter of the cache matches that of > + * the vmalloc generation counter then return the cache: > + */ > + if (READ_ONCE(vmap_info_cache_gen) == gen) { Why are those things u64? It has the obvious down-side that you still get split loads on 32bit machines. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2015-08-25 17:03 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-08-23 4:48 [PATCH 0/3] mm/vmalloc: Cache the /proc/meminfo vmalloc statistics George Spelvin 2015-08-23 6:04 ` Ingo Molnar 2015-08-23 6:46 ` George Spelvin 2015-08-23 8:17 ` [PATCH 3/3 v3] mm/vmalloc: Cache the vmalloc memory info Ingo Molnar 2015-08-23 20:53 ` Rasmus Villemoes 2015-08-24 6:58 ` Ingo Molnar 2015-08-24 8:39 ` Rasmus Villemoes 2015-08-23 21:56 ` Rasmus Villemoes 2015-08-24 7:00 ` Ingo Molnar 2015-08-25 16:39 ` Linus Torvalds 2015-08-25 17:03 ` Linus Torvalds 2015-08-24 1:04 ` George Spelvin 2015-08-24 7:34 ` [PATCH 3/3 v4] " Ingo Molnar 2015-08-24 7:47 ` Ingo Molnar 2015-08-24 7:50 ` [PATCH 3/3 v5] " Ingo Molnar 2015-08-24 12:54 ` George Spelvin 2015-08-25 9:56 ` [PATCH 3/3 v6] " Ingo Molnar 2015-08-25 10:36 ` George Spelvin 2015-08-25 12:59 ` Peter Zijlstra 2015-08-25 14:19 ` Rasmus Villemoes 2015-08-25 15:11 ` George Spelvin 2015-08-24 13:11 ` [PATCH 3/3 v4] " John Stoffel 2015-08-24 15:11 ` George Spelvin 2015-08-24 15:55 ` John Stoffel 2015-08-25 12:46 ` [PATCH 3/3 v3] " Peter Zijlstra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).