* [patch] mm, show_mem: suppress page counts in non-blockable contexts
@ 2013-02-27 0:46 David Rientjes
2013-02-27 10:00 ` Michal Hocko
2013-02-27 15:51 ` Dave Hansen
0 siblings, 2 replies; 5+ messages in thread
From: David Rientjes @ 2013-02-27 0:46 UTC (permalink / raw)
To: Andrew Morton; +Cc: Mel Gorman, linux-arch, linux-kernel, linux-mm
On large systems with a lot of memory, walking all RAM to determine page
types may take a half second or even more.
In non-blockable contexts, the page allocator will emit a page allocation
failure warning unless __GFP_NOWARN is specified. In such contexts, irqs
are typically disabled and such a lengthy delay may result in soft
lockups.
To fix this, suppress the page walk in such contexts when printing the
page allocation failure warning.
Signed-off-by: David Rientjes <rientjes@google.com>
---
arch/arm/mm/init.c | 3 +++
arch/ia64/mm/contig.c | 2 ++
arch/ia64/mm/discontig.c | 2 ++
arch/parisc/mm/init.c | 2 ++
arch/unicore32/mm/init.c | 3 +++
include/linux/mm.h | 3 ++-
lib/show_mem.c | 3 +++
mm/page_alloc.c | 7 +++++++
8 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -99,6 +99,9 @@ void show_mem(unsigned int filter)
printk("Mem-info:\n");
show_free_areas(filter);
+ if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
+ return;
+
for_each_bank (i, mi) {
struct membank *bank = &mi->bank[i];
unsigned int pfn1, pfn2;
diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
--- a/arch/ia64/mm/contig.c
+++ b/arch/ia64/mm/contig.c
@@ -47,6 +47,8 @@ void show_mem(unsigned int filter)
printk(KERN_INFO "Mem-info:\n");
show_free_areas(filter);
printk(KERN_INFO "Node memory in pages:\n");
+ if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
+ return;
for_each_online_pgdat(pgdat) {
unsigned long present;
unsigned long flags;
diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
--- a/arch/ia64/mm/discontig.c
+++ b/arch/ia64/mm/discontig.c
@@ -623,6 +623,8 @@ void show_mem(unsigned int filter)
printk(KERN_INFO "Mem-info:\n");
show_free_areas(filter);
+ if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
+ return;
printk(KERN_INFO "Node memory in pages:\n");
for_each_online_pgdat(pgdat) {
unsigned long present;
diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -697,6 +697,8 @@ void show_mem(unsigned int filter)
printk(KERN_INFO "Mem-info:\n");
show_free_areas(filter);
+ if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
+ return;
#ifndef CONFIG_DISCONTIGMEM
i = max_mapnr;
while (i-- > 0) {
diff --git a/arch/unicore32/mm/init.c b/arch/unicore32/mm/init.c
--- a/arch/unicore32/mm/init.c
+++ b/arch/unicore32/mm/init.c
@@ -66,6 +66,9 @@ void show_mem(unsigned int filter)
printk(KERN_DEFAULT "Mem-info:\n");
show_free_areas(filter);
+ if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
+ return;
+
for_each_bank(i, mi) {
struct membank *bank = &mi->bank[i];
unsigned int pfn1, pfn2;
diff --git a/include/linux/mm.h b/include/linux/mm.h
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -898,7 +898,8 @@ extern void pagefault_out_of_memory(void);
* Flags passed to show_mem() and show_free_areas() to suppress output in
* various contexts.
*/
-#define SHOW_MEM_FILTER_NODES (0x0001u) /* filter disallowed nodes */
+#define SHOW_MEM_FILTER_NODES (0x0001u) /* disallowed nodes */
+#define SHOW_MEM_FILTER_PAGE_COUNT (0x0002u) /* page type count */
extern void show_free_areas(unsigned int flags);
extern bool skip_free_areas_node(unsigned int flags, int nid);
diff --git a/lib/show_mem.c b/lib/show_mem.c
--- a/lib/show_mem.c
+++ b/lib/show_mem.c
@@ -18,6 +18,9 @@ void show_mem(unsigned int filter)
printk("Mem-Info:\n");
show_free_areas(filter);
+ if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
+ return;
+
for_each_online_pgdat(pgdat) {
unsigned long i, flags;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2009,6 +2009,13 @@ void warn_alloc_failed(gfp_t gfp_mask, int order, const char *fmt, ...)
return;
/*
+ * Walking all memory to count page types is very expensive and should
+ * be inhibited in non-blockable contexts.
+ */
+ if (!(gfp_mask & __GFP_WAIT))
+ filter |= SHOW_MEM_FILTER_PAGE_COUNT;
+
+ /*
* This documents exceptions given to allocations in certain
* contexts that are allowed to allocate outside current's set
* of allowed nodes.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 5+ messages in thread
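The gating in the patch above can be modelled in userspace as a minimal sketch. The two SHOW_MEM_FILTER_* values are copied from the include/linux/mm.h hunk. Everything prefixed with model_ or MODEL_ is a hypothetical stand-in, including the bit value used for __GFP_WAIT and the initial filter value, which are assumptions rather than the kernel's definitions.

```c
#include <stdio.h>

/* Flag values copied from the patch's include/linux/mm.h hunk. */
#define SHOW_MEM_FILTER_NODES      (0x0001u) /* disallowed nodes */
#define SHOW_MEM_FILTER_PAGE_COUNT (0x0002u) /* page type count */

/* Hypothetical stand-in for __GFP_WAIT; not the kernel's bit value. */
#define MODEL_GFP_WAIT (0x10u)

/* Counts how often the expensive walk would have run. */
static unsigned int page_walks;

/*
 * Models the tail of show_mem(): the cheap summary is always printed,
 * the per-page walk is skipped when the filter bit is set.
 */
static void model_show_mem(unsigned int filter)
{
	printf("Mem-Info:\n");
	/* show_free_areas(filter) would run here unconditionally. */
	if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
		return;
	page_walks++; /* placeholder for the walk over every page */
}

/*
 * Models the filter setup added to warn_alloc_failed(). Starting from 0
 * is an assumption made to keep the sketch small.
 */
static unsigned int model_warn_filter(unsigned int gfp_mask)
{
	unsigned int filter = 0;

	if (!(gfp_mask & MODEL_GFP_WAIT))
		filter |= SHOW_MEM_FILTER_PAGE_COUNT;
	return filter;
}
```

Calling model_show_mem(model_warn_filter(0)) leaves page_walks untouched, while passing MODEL_GFP_WAIT lets the walk run. Callers that pass a filter of 0 directly keep the full output.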
* Re: [patch] mm, show_mem: suppress page counts in non-blockable contexts
2013-02-27 0:46 [patch] mm, show_mem: suppress page counts in non-blockable contexts David Rientjes
@ 2013-02-27 10:00 ` Michal Hocko
2013-02-27 22:16 ` David Rientjes
2013-02-27 15:51 ` Dave Hansen
1 sibling, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2013-02-27 10:00 UTC (permalink / raw)
To: David Rientjes
Cc: Andrew Morton, Mel Gorman, linux-arch, linux-kernel, linux-mm
On Tue 26-02-13 16:46:08, David Rientjes wrote:
> On large systems with a lot of memory, walking all RAM to determine page
> types may take a half second or even more.
>
> In non-blockable contexts, the page allocator will emit a page allocation
> failure warning unless __GFP_NOWARN is specified. In such contexts, irqs
> are typically disabled and such a lengthy delay may result in soft
> lockups.
But we already try to prevent soft lockups by calling
touch_nmi_watchdog every now and then when iterating over pages, so the
lockup detector shouldn't trigger.
Anyway, I think that the additional information (which can be really
costly as you are describing) is not that useful. Most of the useful
information is already printed by show_free_areas. Or does it help when
we know how much memory is shared/reserved/etc. when the allocation
fails?
So I do agree with dropping the additional information for the
allocation failure path (sysrq+m might still show it) but I fail to see
how the lockup detector plays any role here. Can we just drop it because
it is not that interesting and it is costly so it is not worth
bothering?
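The point about touch_nmi_watchdog can be illustrated with a userspace sketch of a long scan that resets a watchdog at a fixed interval. model_touch_watchdog(), MODEL_TOUCH_INTERVAL and the reserved[] array are hypothetical stand-ins; the interval the kernel actually uses is not stated in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-in for touch_nmi_watchdog(); it only counts calls. */
static unsigned long watchdog_touches;

static void model_touch_watchdog(void)
{
	watchdog_touches++;
}

/* Hypothetical interval: pages scanned between watchdog touches. */
#define MODEL_TOUCH_INTERVAL 1024UL

/*
 * Counts the entries marked reserved, touching the watchdog once per
 * interval so a lockup detector would not fire during the scan.
 */
static unsigned long model_count_reserved(const unsigned char *reserved,
					  unsigned long nr_pages)
{
	unsigned long i, count = 0;

	for (i = 0; i < nr_pages; i++) {
		if ((i % MODEL_TOUCH_INTERVAL) == 0)
			model_touch_watchdog();
		if (reserved[i])
			count++;
	}
	return count;
}
```

Touching the watchdog only keeps the detector quiet. The scan still visits every page, so the total time spent in the loop is unchanged.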
> To fix this, suppress the page walk in such contexts when printing the
> page allocation failure warning.
>
> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
> arch/arm/mm/init.c | 3 +++
> arch/ia64/mm/contig.c | 2 ++
> arch/ia64/mm/discontig.c | 2 ++
> arch/parisc/mm/init.c | 2 ++
> arch/unicore32/mm/init.c | 3 +++
> include/linux/mm.h | 3 ++-
> lib/show_mem.c | 3 +++
> mm/page_alloc.c | 7 +++++++
> 8 files changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -99,6 +99,9 @@ void show_mem(unsigned int filter)
> printk("Mem-info:\n");
> show_free_areas(filter);
>
> + if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
> + return;
> +
> for_each_bank (i, mi) {
> struct membank *bank = &mi->bank[i];
> unsigned int pfn1, pfn2;
> diff --git a/arch/ia64/mm/contig.c b/arch/ia64/mm/contig.c
> --- a/arch/ia64/mm/contig.c
> +++ b/arch/ia64/mm/contig.c
> @@ -47,6 +47,8 @@ void show_mem(unsigned int filter)
> printk(KERN_INFO "Mem-info:\n");
> show_free_areas(filter);
> printk(KERN_INFO "Node memory in pages:\n");
> + if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
> + return;
> for_each_online_pgdat(pgdat) {
> unsigned long present;
> unsigned long flags;
> diff --git a/arch/ia64/mm/discontig.c b/arch/ia64/mm/discontig.c
> --- a/arch/ia64/mm/discontig.c
> +++ b/arch/ia64/mm/discontig.c
> @@ -623,6 +623,8 @@ void show_mem(unsigned int filter)
>
> printk(KERN_INFO "Mem-info:\n");
> show_free_areas(filter);
> + if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
> + return;
> printk(KERN_INFO "Node memory in pages:\n");
> for_each_online_pgdat(pgdat) {
> unsigned long present;
> diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
> --- a/arch/parisc/mm/init.c
> +++ b/arch/parisc/mm/init.c
> @@ -697,6 +697,8 @@ void show_mem(unsigned int filter)
>
> printk(KERN_INFO "Mem-info:\n");
> show_free_areas(filter);
> + if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
> + return;
> #ifndef CONFIG_DISCONTIGMEM
> i = max_mapnr;
> while (i-- > 0) {
> diff --git a/arch/unicore32/mm/init.c b/arch/unicore32/mm/init.c
> --- a/arch/unicore32/mm/init.c
> +++ b/arch/unicore32/mm/init.c
> @@ -66,6 +66,9 @@ void show_mem(unsigned int filter)
> printk(KERN_DEFAULT "Mem-info:\n");
> show_free_areas(filter);
>
> + if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
> + return;
> +
> for_each_bank(i, mi) {
> struct membank *bank = &mi->bank[i];
> unsigned int pfn1, pfn2;
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -898,7 +898,8 @@ extern void pagefault_out_of_memory(void);
> * Flags passed to show_mem() and show_free_areas() to suppress output in
> * various contexts.
> */
> -#define SHOW_MEM_FILTER_NODES (0x0001u) /* filter disallowed nodes */
> +#define SHOW_MEM_FILTER_NODES (0x0001u) /* disallowed nodes */
> +#define SHOW_MEM_FILTER_PAGE_COUNT (0x0002u) /* page type count */
>
> extern void show_free_areas(unsigned int flags);
> extern bool skip_free_areas_node(unsigned int flags, int nid);
> diff --git a/lib/show_mem.c b/lib/show_mem.c
> --- a/lib/show_mem.c
> +++ b/lib/show_mem.c
> @@ -18,6 +18,9 @@ void show_mem(unsigned int filter)
> printk("Mem-Info:\n");
> show_free_areas(filter);
>
> + if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
> + return;
> +
> for_each_online_pgdat(pgdat) {
> unsigned long i, flags;
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2009,6 +2009,13 @@ void warn_alloc_failed(gfp_t gfp_mask, int order, const char *fmt, ...)
> return;
>
> /*
> + * Walking all memory to count page types is very expensive and should
> + * be inhibited in non-blockable contexts.
> + */
> + if (!(gfp_mask & __GFP_WAIT))
> + filter |= SHOW_MEM_FILTER_PAGE_COUNT;
> +
> + /*
> * This documents exceptions given to allocations in certain
> * contexts that are allowed to allocate outside current's set
> * of allowed nodes.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Michal Hocko
SUSE Labs
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch] mm, show_mem: suppress page counts in non-blockable contexts
2013-02-27 0:46 [patch] mm, show_mem: suppress page counts in non-blockable contexts David Rientjes
2013-02-27 10:00 ` Michal Hocko
@ 2013-02-27 15:51 ` Dave Hansen
2013-02-27 22:13 ` David Rientjes
1 sibling, 1 reply; 5+ messages in thread
From: Dave Hansen @ 2013-02-27 15:51 UTC (permalink / raw)
To: David Rientjes
Cc: Andrew Morton, Mel Gorman, linux-arch, linux-kernel, linux-mm
On 02/26/2013 04:46 PM, David Rientjes wrote:
> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -99,6 +99,9 @@ void show_mem(unsigned int filter)
> printk("Mem-info:\n");
> show_free_areas(filter);
>
> + if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
> + return;
> +
Won't this just look like a funky truncated warning to the end user?
Seems like we should at least dump out a little message for this stuff
to say that it's intentionally truncated?
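The suggestion above could look roughly like the following userspace sketch. It is not part of the posted patch: model_show_mem_with_note() and the wording of the note are hypothetical, and only the flag value comes from the patch.

```c
#include <stdio.h>

/* Flag value copied from the patch's include/linux/mm.h hunk. */
#define SHOW_MEM_FILTER_PAGE_COUNT (0x0002u) /* page type count */

/*
 * Prints the summary header, then either a short note saying the page
 * counts were skipped on purpose or a placeholder for the counts.
 * Returns 1 if the counts would have been printed, 0 if suppressed.
 */
static int model_show_mem_with_note(unsigned int filter, FILE *out)
{
	fprintf(out, "Mem-Info:\n");
	if (filter & SHOW_MEM_FILTER_PAGE_COUNT) {
		/* Hypothetical wording, not taken from any kernel version. */
		fprintf(out, "page counts suppressed in this context\n");
		return 0;
	}
	fprintf(out, "(page counts would follow)\n");
	return 1;
}
```

The trade-off is one extra line in every atomic allocation failure report in exchange for making the omission explicit.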
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch] mm, show_mem: suppress page counts in non-blockable contexts
2013-02-27 15:51 ` Dave Hansen
@ 2013-02-27 22:13 ` David Rientjes
0 siblings, 0 replies; 5+ messages in thread
From: David Rientjes @ 2013-02-27 22:13 UTC (permalink / raw)
To: Dave Hansen; +Cc: Andrew Morton, Mel Gorman, linux-arch, linux-kernel, linux-mm
On Wed, 27 Feb 2013, Dave Hansen wrote:
> On 02/26/2013 04:46 PM, David Rientjes wrote:
> > diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> > --- a/arch/arm/mm/init.c
> > +++ b/arch/arm/mm/init.c
> > @@ -99,6 +99,9 @@ void show_mem(unsigned int filter)
> > printk("Mem-info:\n");
> > show_free_areas(filter);
> >
> > + if (filter & SHOW_MEM_FILTER_PAGE_COUNT)
> > + return;
> > +
>
> Won't this just look like a funky truncated warning to the end user?
>
No, because of the uninhibited call to show_free_areas() above. This
still dumps the pcp state, global and per-node page type breakdown, and
free pages at each order. The only things suppressed are the total
pages, pages reserved, pages shared, and pages non-shared counts. These
are quite expensive to determine because counting them walks all memory
while irqs are disabled, and the cost increases with the amount of RAM a
system has.
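How the cost of the walk scales with RAM can be sketched with simple arithmetic. Both helpers are hypothetical, and the per-page cost is an input to be chosen by the reader, not a measurement from this thread.

```c
#include <stdint.h>

/* Number of page frames backing ram_bytes of memory. */
static uint64_t model_nr_pages(uint64_t ram_bytes, uint64_t page_size)
{
	return ram_bytes / page_size;
}

/*
 * Whole milliseconds needed to visit nr_pages pages at an assumed cost
 * of ns_per_page nanoseconds each (rounded down).
 */
static uint64_t model_walk_ms(uint64_t nr_pages, uint64_t ns_per_page)
{
	return nr_pages * ns_per_page / 1000000u;
}
```

For example, 1 TiB of RAM with 4 KiB pages gives 268435456 pages. At an illustrative 2 ns per page, model_walk_ms() returns 536, which is the same order of magnitude as the half second mentioned in the patch description.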
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [patch] mm, show_mem: suppress page counts in non-blockable contexts
2013-02-27 10:00 ` Michal Hocko
@ 2013-02-27 22:16 ` David Rientjes
0 siblings, 0 replies; 5+ messages in thread
From: David Rientjes @ 2013-02-27 22:16 UTC (permalink / raw)
To: Michal Hocko
Cc: Andrew Morton, Mel Gorman, linux-arch, linux-kernel, linux-mm
On Wed, 27 Feb 2013, Michal Hocko wrote:
> But we already try to prevent soft lockups by calling
> touch_nmi_watchdog every now and then when iterating over pages, so the
> lockup detector shouldn't trigger.
>
> Anyway, I think that the additional information (which can be really
> costly as you are describing) is not that useful. Most of the useful
> information is already printed by show_free_areas. Or does it help when
> we know how much memory is shared/reserved/etc. when the allocation
> fails?
>
I do not think it is helpful since show_free_areas() already shows all
pertinent information, and hence I'm suppressing it in atomic contexts in
this patch.
> So I do agree with dropping the additional information for the
> allocation failure path (sysrq+m might still show it) but I fail to see
> how the lockup detector plays any role here. Can we just drop it because
> it is not that interesting and it is costly so it is not worth
> bothering?
>
I would agree it is not interesting for debugging VM issues and is
obviously very expensive.
^ permalink raw reply [flat|nested] 5+ messages in thread