[PATCH] mm: Warn about costly page allocation

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] mm: Warn about costly page allocation
@ 2012-07-09  2:38 Minchan Kim
  2012-07-09  8:22 ` Mel Gorman
  0 siblings, 1 reply; 11+ messages in thread
From: Minchan Kim @ 2012-07-09  2:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Rik van Riel, KOSAKI Motohiro, Mel Gorman,
	Johannes Weiner, Kamezawa Hiroyuki, Minchan Kim

Since lumpy reclaim was introduced at 2.6.23, it helped higher
order allocation.
Recently, we removed it at 3.4 and we didn't enable compaction
forcingly[1]. The reason makes sense that compaction.o + migration.o
isn't trivial for system doesn't use higher order allocation.
But the problem is that we have to enable compaction explicitly
while lumpy reclaim enabled unconditionally.

Normally, admin doesn't know his system have used higher order
allocation and even lumpy reclaim have helped it.
Admin in embdded system have a tendency to minimise code size so that
they can disable compaction. In this case, we can see page allocation
failure we can never see in the past. It's critical on embedded side
because...

Let's think this scenario.

There is QA team in embedded company and they have tested their product.
In test scenario, they can allocate 100 high order allocation.
(they don't matter how many high order allocations in kernel are needed
during test. their concern is just only working well or fail of their
middleware/application) High order allocation will be serviced well
by natural buddy allocation without lumpy's help. So they released
the product and sold out all over the world.
Unfortunately, in real practice, sometime, 105 high order allocation was
needed rarely and fortunately, lumpy reclaim could help it so the product
doesn't have a problem until now.

If they use latest kernel, they will see the new config CONFIG_COMPACTION
which is very poor documentation, and they can't know it's replacement of
lumpy reclaim(even, they don't know lumpy reclaim) so they simply disable
that option for size optimization. Of course, QA team still test it but they
can't find the problem if they don't do test stronger than old.
It ends up release the product and sold out all over the world, again.
But in this time, we don't have both lumpy and compaction so the problem
would happen in real practice. A poor enginner from Korea have to flight
to the USA for the fix a ton of products. Otherwise, should recall products
from all over the world. Maybe he can lose a job. :(

This patch adds warning for notice. If the system try to allocate
PAGE_ALLOC_COSTLY_ORDER above page and system enters reclaim path,
it emits the warning. At least, it gives a chance to look into their
system before the relase.

This patch avoids false positive by alloc_large_system_hash which
allocates with GFP_ATOMIC and a fallback mechanism so it can make
this warning useless.

[1] c53919ad(mm: vmscan: remove lumpy reclaim)

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/page_alloc.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a4d3a19..1155e00 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2276,6 +2276,20 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
 	return alloc_flags;
 }

+#if defined(CONFIG_DEBUG_VM) && !defined(CONFIG_COMPACTION)
+static inline void check_page_alloc_costly_order(unsigned int order)
+{
+	if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
+		printk_once("WARNING: You are tring to allocate %d-order page."
+		" You might need to turn on CONFIG_COMPACTION\n", order);
+	}
+}
+#else
+static inline void check_page_alloc_costly_order(unsigned int order)
+{
+}
+#endif
+
 static inline struct page *
 __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	struct zonelist *zonelist, enum zone_type high_zoneidx,
@@ -2353,6 +2367,8 @@ rebalance:
 	if (!wait)
 		goto nopage;

+	check_page_alloc_costly_order(order);
+
 	/* Avoid recursion of direct reclaim */
 	if (current->flags & PF_MEMALLOC)
 		goto nopage;
-- 
1.7.9.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09  2:38 [PATCH] mm: Warn about costly page allocation Minchan Kim
@ 2012-07-09  8:22 ` Mel Gorman
  2012-07-09  8:46   ` Minchan Kim
  0 siblings, 1 reply; 11+ messages in thread
From: Mel Gorman @ 2012-07-09  8:22 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-kernel, linux-mm, Rik van Riel,
	KOSAKI Motohiro, Johannes Weiner, Kamezawa Hiroyuki

On Mon, Jul 09, 2012 at 11:38:20AM +0900, Minchan Kim wrote:
> Since lumpy reclaim was introduced at 2.6.23, it helped higher
> order allocation.
> Recently, we removed it at 3.4 and we didn't enable compaction
> forcingly[1]. The reason makes sense that compaction.o + migration.o
> isn't trivial for system doesn't use higher order allocation.
> But the problem is that we have to enable compaction explicitly
> while lumpy reclaim enabled unconditionally.
> 
> Normally, admin doesn't know his system have used higher order
> allocation and even lumpy reclaim have helped it.
> Admin in embdded system have a tendency to minimise code size so that
> they can disable compaction. In this case, we can see page allocation
> failure we can never see in the past. It's critical on embedded side
> because...
> 
> Let's think this scenario.
> 
> There is QA team in embedded company and they have tested their product.
> In test scenario, they can allocate 100 high order allocation.
> (they don't matter how many high order allocations in kernel are needed
> during test. their concern is just only working well or fail of their
> middleware/application) High order allocation will be serviced well
> by natural buddy allocation without lumpy's help. So they released
> the product and sold out all over the world.
> Unfortunately, in real practice, sometime, 105 high order allocation was
> needed rarely and fortunately, lumpy reclaim could help it so the product
> doesn't have a problem until now.
> 
> If they use latest kernel, they will see the new config CONFIG_COMPACTION
> which is very poor documentation, and they can't know it's replacement of
> lumpy reclaim(even, they don't know lumpy reclaim) so they simply disable

Depending on lumpy reclaim or compaction for high-order kernel allocations
is dangerous. Both depend on being able to move MIGRATE_MOVABLE allocations
to satisy the high-order allocation. If used regularly for high-order kernel
allocations and they are long-lived, the system will eventually be unable
to grant these allocations, with or without compaction or lumpy reclaim.

Be also aware that lumpy reclaim was very aggressive when reclaiming pages
to satisfy an allocation. Compaction is not and compaction can be temporarily
disabled if an allocation attempt fails. If lumpy reclaim was being depended
upon to satisfy high-order allocations, there is no guarantee, particularly
with 3.4, that compaction will succeed as it does not reclaim aggressively.

> that option for size optimization. Of course, QA team still test it but they
> can't find the problem if they don't do test stronger than old.
> It ends up release the product and sold out all over the world, again.
> But in this time, we don't have both lumpy and compaction so the problem
> would happen in real practice. A poor enginner from Korea have to flight
> to the USA for the fix a ton of products. Otherwise, should recall products
> from all over the world. Maybe he can lose a job. :(
> 
> This patch adds warning for notice. If the system try to allocate
> PAGE_ALLOC_COSTLY_ORDER above page and system enters reclaim path,
> it emits the warning. At least, it gives a chance to look into their
> system before the relase.
> 
> This patch avoids false positive by alloc_large_system_hash which
> allocates with GFP_ATOMIC and a fallback mechanism so it can make
> this warning useless.
> 
> [1] c53919ad(mm: vmscan: remove lumpy reclaim)
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/page_alloc.c |   16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index a4d3a19..1155e00 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2276,6 +2276,20 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  	return alloc_flags;
>  }
>  
> +#if defined(CONFIG_DEBUG_VM) && !defined(CONFIG_COMPACTION)
> +static inline void check_page_alloc_costly_order(unsigned int order)
> +{
> +	if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
> +		printk_once("WARNING: You are tring to allocate %d-order page."
> +		" You might need to turn on CONFIG_COMPACTION\n", order);
> +	}

WARN_ON_ONCE would tell you what is trying to satisfy the allocation.

It should further check if this is a GFP_MOVABLE allocation or not and if
not, then it should either be documented that compaction may only delay
allocation failures and that they may need to consider reserving the memory
in advance or doing something like forcing MIGRATE_RESERVE to only be used
for high-order allocations.

> +}
> +#else
> +static inline void check_page_alloc_costly_order(unsigned int order)
> +{
> +}
> +#endif
> +
>  static inline struct page *
>  __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>  	struct zonelist *zonelist, enum zone_type high_zoneidx,
> @@ -2353,6 +2367,8 @@ rebalance:
>  	if (!wait)
>  		goto nopage;
>  
> +	check_page_alloc_costly_order(order);
> +
>  	/* Avoid recursion of direct reclaim */
>  	if (current->flags & PF_MEMALLOC)
>  		goto nopage;
> -- 
> 1.7.9.5
> 

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09  8:22 ` Mel Gorman
@ 2012-07-09  8:46   ` Minchan Kim
  2012-07-09  9:12     ` Mel Gorman
  2012-07-09 12:53     ` Cong Wang
  0 siblings, 2 replies; 11+ messages in thread
From: Minchan Kim @ 2012-07-09  8:46 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, linux-kernel, linux-mm, Rik van Riel,
	KOSAKI Motohiro, Johannes Weiner, Kamezawa Hiroyuki

Hi Mel,

On Mon, Jul 09, 2012 at 09:22:00AM +0100, Mel Gorman wrote:
> On Mon, Jul 09, 2012 at 11:38:20AM +0900, Minchan Kim wrote:
> > Since lumpy reclaim was introduced at 2.6.23, it helped higher
> > order allocation.
> > Recently, we removed it at 3.4 and we didn't enable compaction
> > forcingly[1]. The reason makes sense that compaction.o + migration.o
> > isn't trivial for system doesn't use higher order allocation.
> > But the problem is that we have to enable compaction explicitly
> > while lumpy reclaim enabled unconditionally.
> > 
> > Normally, admin doesn't know his system have used higher order
> > allocation and even lumpy reclaim have helped it.
> > Admin in embdded system have a tendency to minimise code size so that
> > they can disable compaction. In this case, we can see page allocation
> > failure we can never see in the past. It's critical on embedded side
> > because...
> > 
> > Let's think this scenario.
> > 
> > There is QA team in embedded company and they have tested their product.
> > In test scenario, they can allocate 100 high order allocation.
> > (they don't matter how many high order allocations in kernel are needed
> > during test. their concern is just only working well or fail of their
> > middleware/application) High order allocation will be serviced well
> > by natural buddy allocation without lumpy's help. So they released
> > the product and sold out all over the world.
> > Unfortunately, in real practice, sometime, 105 high order allocation was
> > needed rarely and fortunately, lumpy reclaim could help it so the product
> > doesn't have a problem until now.
> > 
> > If they use latest kernel, they will see the new config CONFIG_COMPACTION
> > which is very poor documentation, and they can't know it's replacement of
> > lumpy reclaim(even, they don't know lumpy reclaim) so they simply disable
> 
> Depending on lumpy reclaim or compaction for high-order kernel allocations
> is dangerous. Both depend on being able to move MIGRATE_MOVABLE allocations
> to satisy the high-order allocation. If used regularly for high-order kernel
> allocations and they are long-lived, the system will eventually be unable
> to grant these allocations, with or without compaction or lumpy reclaim.

Indeed.

> 
> Be also aware that lumpy reclaim was very aggressive when reclaiming pages
> to satisfy an allocation. Compaction is not and compaction can be temporarily
> disabled if an allocation attempt fails. If lumpy reclaim was being depended
> upon to satisfy high-order allocations, there is no guarantee, particularly
> with 3.4, that compaction will succeed as it does not reclaim aggressively.

It's good explanation and let's add it in description.

> 
> > that option for size optimization. Of course, QA team still test it but they
> > can't find the problem if they don't do test stronger than old.
> > It ends up release the product and sold out all over the world, again.
> > But in this time, we don't have both lumpy and compaction so the problem
> > would happen in real practice. A poor enginner from Korea have to flight
> > to the USA for the fix a ton of products. Otherwise, should recall products
> > from all over the world. Maybe he can lose a job. :(
> > 
> > This patch adds warning for notice. If the system try to allocate
> > PAGE_ALLOC_COSTLY_ORDER above page and system enters reclaim path,
> > it emits the warning. At least, it gives a chance to look into their
> > system before the relase.
> > 
> > This patch avoids false positive by alloc_large_system_hash which
> > allocates with GFP_ATOMIC and a fallback mechanism so it can make
> > this warning useless.
> > 
> > [1] c53919ad(mm: vmscan: remove lumpy reclaim)
> > 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  mm/page_alloc.c |   16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index a4d3a19..1155e00 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2276,6 +2276,20 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> >  	return alloc_flags;
> >  }
> >  
> > +#if defined(CONFIG_DEBUG_VM) && !defined(CONFIG_COMPACTION)
> > +static inline void check_page_alloc_costly_order(unsigned int order)
> > +{
> > +	if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
> > +		printk_once("WARNING: You are tring to allocate %d-order page."
> > +		" You might need to turn on CONFIG_COMPACTION\n", order);
> > +	}
> 
> WARN_ON_ONCE would tell you what is trying to satisfy the allocation.

Do you mean that it would be better to use WARN_ON_ONCE rather than raw printk?
If so, I would like to insist raw printk because WARN_ON_ONCE could be disabled
by !CONFIG_BUG.
If I miss something, could you elaborate it more?

> 
> It should further check if this is a GFP_MOVABLE allocation or not and if
> not, then it should either be documented that compaction may only delay
> allocation failures and that they may need to consider reserving the memory
> in advance or doing something like forcing MIGRATE_RESERVE to only be used
> for high-order allocations.

Okay. but I got confused you want to add above description in code directly
like below or write it down in comment of check_page_alloc_costly_order?

static inline void check_page_alloc_costly_order(unsigned int order, gfp_t gfp_flags)
{
       if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
               printk_once("WARNING: You are tring to allocate %d-order page."
               " You might need to turn on CONFIG_COMPACTION\n", order);
                if (gfp_flags is not GFP_MOVABLE)
                        printk_once("Compaction doesn't make sure .....\n");
       }

}

Thanks for the comment, Mel.

> 
> > +}
> > +#else
> > +static inline void check_page_alloc_costly_order(unsigned int order)
> > +{
> > +}
> > +#endif
> > +
> >  static inline struct page *
> >  __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >  	struct zonelist *zonelist, enum zone_type high_zoneidx,
> > @@ -2353,6 +2367,8 @@ rebalance:
> >  	if (!wait)
> >  		goto nopage;
> >  
> > +	check_page_alloc_costly_order(order);
> > +
> >  	/* Avoid recursion of direct reclaim */
> >  	if (current->flags & PF_MEMALLOC)
> >  		goto nopage;
> > -- 
> > 1.7.9.5
> > 
> 
> -- 
> Mel Gorman
> SUSE Labs
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09  8:46   ` Minchan Kim
@ 2012-07-09  9:12     ` Mel Gorman
  2012-07-09 12:50       ` Minchan Kim
  2012-07-09 12:53     ` Cong Wang
  1 sibling, 1 reply; 11+ messages in thread
From: Mel Gorman @ 2012-07-09  9:12 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-kernel, linux-mm, Rik van Riel,
	KOSAKI Motohiro, Johannes Weiner, Kamezawa Hiroyuki

On Mon, Jul 09, 2012 at 05:46:57PM +0900, Minchan Kim wrote:
> > > <SNIP>
> > > +#if defined(CONFIG_DEBUG_VM) && !defined(CONFIG_COMPACTION)
> > > +static inline void check_page_alloc_costly_order(unsigned int order)
> > > +{
> > > +	if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
> > > +		printk_once("WARNING: You are tring to allocate %d-order page."
> > > +		" You might need to turn on CONFIG_COMPACTION\n", order);
> > > +	}
> > 
> > WARN_ON_ONCE would tell you what is trying to satisfy the allocation.
> 
> Do you mean that it would be better to use WARN_ON_ONCE rather than raw printk?

Yes.

> If so, I would like to insist raw printk because WARN_ON_ONCE could be disabled
> by !CONFIG_BUG.
> If I miss something, could you elaborate it more?
> 

Ok, but all this will tell you is that *something* tried a high-order
allocation. It will not tell you who and because it's a printk_once, it
will also not tell you how often it's happening. You could add a
dump_stack to capture that information.

> > 
> > It should further check if this is a GFP_MOVABLE allocation or not and if
> > not, then it should either be documented that compaction may only delay
> > allocation failures and that they may need to consider reserving the memory
> > in advance or doing something like forcing MIGRATE_RESERVE to only be used
> > for high-order allocations.
> 
> Okay. but I got confused you want to add above description in code directly
> like below or write it down in comment of check_page_alloc_costly_order?
> 

You're aiming this at embedded QA people according to your changelog so
do whatever you think is going to be the most effective. It's already
"known" that high-order kernel allocations are meant to be unreliable and
apparently this is being ignored. The in-code warning could look
something like

if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
	printk_once("%s: page allocation high-order stupidity: order:%d, mode:0x%x\n",
                   current->comm, order, gfp_mask);
	if (gfp_flags & __GFP_MOVABLE) {
		printk_once("Enable compaction or whatever\n");
		dump_stack();
	} else {
		printk_once("Regular high-order kernel allocations like this will eventually start failing.");
		dump_stack();
	}
}

There should be a comment above it giving more information if you think
the embedded people will actually read it. Of course, if this warning
triggers during driver initialisation then it might be a completely useless.
You could rate limit the warning (printk_ratelimit()) instead to be more
effective. As I don't know what sort of device drivers you are seeing this
problem with I can't judge what the best style of warning would be.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09  9:12     ` Mel Gorman
@ 2012-07-09 12:50       ` Minchan Kim
  2012-07-09 13:05         ` Mel Gorman
  0 siblings, 1 reply; 11+ messages in thread
From: Minchan Kim @ 2012-07-09 12:50 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Minchan Kim, Andrew Morton, linux-kernel, linux-mm, Rik van Riel,
	KOSAKI Motohiro, Johannes Weiner, Kamezawa Hiroyuki

On Mon, Jul 09, 2012 at 10:12:03AM +0100, Mel Gorman wrote:
> On Mon, Jul 09, 2012 at 05:46:57PM +0900, Minchan Kim wrote:
> > > > <SNIP>
> > > > +#if defined(CONFIG_DEBUG_VM) && !defined(CONFIG_COMPACTION)
> > > > +static inline void check_page_alloc_costly_order(unsigned int order)
> > > > +{
> > > > +	if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
> > > > +		printk_once("WARNING: You are tring to allocate %d-order page."
> > > > +		" You might need to turn on CONFIG_COMPACTION\n", order);
> > > > +	}
> > > 
> > > WARN_ON_ONCE would tell you what is trying to satisfy the allocation.
> > 
> > Do you mean that it would be better to use WARN_ON_ONCE rather than raw printk?
> 
> Yes.
> 
> > If so, I would like to insist raw printk because WARN_ON_ONCE could be disabled
> > by !CONFIG_BUG.
> > If I miss something, could you elaborate it more?
> > 
> 
> Ok, but all this will tell you is that *something* tried a high-order
> allocation. It will not tell you who and because it's a printk_once, it
> will also not tell you how often it's happening. You could add a
> dump_stack to capture that information.

That's a good idea.

> 
> > > 
> > > It should further check if this is a GFP_MOVABLE allocation or not and if
> > > not, then it should either be documented that compaction may only delay
> > > allocation failures and that they may need to consider reserving the memory
> > > in advance or doing something like forcing MIGRATE_RESERVE to only be used
> > > for high-order allocations.
> > 
> > Okay. but I got confused you want to add above description in code directly
> > like below or write it down in comment of check_page_alloc_costly_order?
> > 
> 
> You're aiming this at embedded QA people according to your changelog so
> do whatever you think is going to be the most effective. It's already
> "known" that high-order kernel allocations are meant to be unreliable and
> apparently this is being ignored. The in-code warning could look
> something like
> 
> if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
> 	printk_once("%s: page allocation high-order stupidity: order:%d, mode:0x%x\n",
>                    current->comm, order, gfp_mask);
> 	if (gfp_flags & __GFP_MOVABLE) {
> 		printk_once("Enable compaction or whatever\n");
> 		dump_stack();
> 	} else {
> 		printk_once("Regular high-order kernel allocations like this will eventually start failing.");
> 		dump_stack();
> 	}
> }

I'm not sure we have to check further for __GFP_MOVABLE because I have not seen driver
uses __GFP_MOVABLE for high order allocation. Although it uses the flag, it's never
compactable since it's out of LRU list. So I think it's rather overkill.

> 
> There should be a comment above it giving more information if you think
> the embedded people will actually read it. Of course, if this warning
> triggers during driver initialisation then it might be a completely useless.
> You could rate limit the warning (printk_ratelimit()) instead to be more
> effective. As I don't know what sort of device drivers you are seeing this
> problem with I can't judge what the best style of warning would be.

Okay.
I will send patch like below tomorrow if there isn't any objection.

if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
	if (printk_ratelimit()) {
		printk("%s: page allocation high-order stupidity: order:%d, mode:0x%x\n",
			current->comm, order, gfp_mask);
		printk_once("Enable compaction or whatever\n");
		printk_once("Regular high-order kernel allocations like this will eventually start failing.\n");
		dump_stack();
	}
}

> 
> -- 
> Mel Gorman
> SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09 12:50       ` Minchan Kim
@ 2012-07-09 13:05         ` Mel Gorman
  2012-07-09 13:19           ` Minchan Kim
  0 siblings, 1 reply; 11+ messages in thread
From: Mel Gorman @ 2012-07-09 13:05 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-kernel, linux-mm, Rik van Riel,
	KOSAKI Motohiro, Johannes Weiner, Kamezawa Hiroyuki

On Mon, Jul 09, 2012 at 09:50:48PM +0900, Minchan Kim wrote:
> > <SNIP>
> > 
> > You're aiming this at embedded QA people according to your changelog so
> > do whatever you think is going to be the most effective. It's already
> > "known" that high-order kernel allocations are meant to be unreliable and
> > apparently this is being ignored. The in-code warning could look
> > something like
> > 
> > if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
> > 	printk_once("%s: page allocation high-order stupidity: order:%d, mode:0x%x\n",
> >                    current->comm, order, gfp_mask);
> > 	if (gfp_flags & __GFP_MOVABLE) {
> > 		printk_once("Enable compaction or whatever\n");
> > 		dump_stack();
> > 	} else {
> > 		printk_once("Regular high-order kernel allocations like this will eventually start failing.");
> > 		dump_stack();
> > 	}
> > }
> 
> I'm not sure we have to check further for __GFP_MOVABLE because I have not seen driver
> uses __GFP_MOVABLE for high order allocation. Although it uses the flag, it's never
> compactable since it's out of LRU list. So I think it's rather overkill.
> 

Then I would have considered it even more important to warn them that
their specific usage is going to break eventually, with or without
compaction. However, you know the target audience for this warning so it's
your call.

> > 
> > There should be a comment above it giving more information if you think
> > the embedded people will actually read it. Of course, if this warning
> > triggers during driver initialisation then it might be a completely useless.
> > You could rate limit the warning (printk_ratelimit()) instead to be more
> > effective. As I don't know what sort of device drivers you are seeing this
> > problem with I can't judge what the best style of warning would be.
> 
> Okay.
> I will send patch like below tomorrow if there isn't any objection.
> 
> if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
> 	if (printk_ratelimit()) {
> 		printk("%s: page allocation high-order stupidity: order:%d, mode:0x%x\n",
> 			current->comm, order, gfp_mask);
> 		printk_once("Enable compaction or whatever\n");
> 		printk_once("Regular high-order kernel allocations like this will eventually start failing.\n");
> 		dump_stack();
> 	}
> }

The warning message could be improved. I did not expect you to use "Enable
compaction or whatever" verbatim. I was just illustrating what type of
warnings I thought might be useful. I expected you would change it to
something that embedded driver authors would pay attention to :)

As you are using printk_ratelimit(), you can also use pr_warning to
annotate this as KERN_WARNING.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09 13:05         ` Mel Gorman
@ 2012-07-09 13:19           ` Minchan Kim
  2012-07-09 20:53             ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Minchan Kim @ 2012-07-09 13:19 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Minchan Kim, Andrew Morton, linux-kernel, linux-mm, Rik van Riel,
	KOSAKI Motohiro, Johannes Weiner, Kamezawa Hiroyuki

On Mon, Jul 09, 2012 at 02:05:51PM +0100, Mel Gorman wrote:
> On Mon, Jul 09, 2012 at 09:50:48PM +0900, Minchan Kim wrote:
> > > <SNIP>
> > > 
> > > You're aiming this at embedded QA people according to your changelog so
> > > do whatever you think is going to be the most effective. It's already
> > > "known" that high-order kernel allocations are meant to be unreliable and
> > > apparently this is being ignored. The in-code warning could look
> > > something like
> > > 
> > > if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
> > > 	printk_once("%s: page allocation high-order stupidity: order:%d, mode:0x%x\n",
> > >                    current->comm, order, gfp_mask);
> > > 	if (gfp_flags & __GFP_MOVABLE) {
> > > 		printk_once("Enable compaction or whatever\n");
> > > 		dump_stack();
> > > 	} else {
> > > 		printk_once("Regular high-order kernel allocations like this will eventually start failing.");
> > > 		dump_stack();
> > > 	}
> > > }
> > 
> > I'm not sure we have to check further for __GFP_MOVABLE because I have not seen driver
> > uses __GFP_MOVABLE for high order allocation. Although it uses the flag, it's never
> > compactable since it's out of LRU list. So I think it's rather overkill.
> > 
> 
> Then I would have considered it even more important to warn them that
> their specific usage is going to break eventually, with or without
> compaction. However, you know the target audience for this warning so it's
> your call.
> 
> > > 
> > > There should be a comment above it giving more information if you think
> > > the embedded people will actually read it. Of course, if this warning
> > > triggers during driver initialisation then it might be a completely useless.
> > > You could rate limit the warning (printk_ratelimit()) instead to be more
> > > effective. As I don't know what sort of device drivers you are seeing this
> > > problem with I can't judge what the best style of warning would be.
> > 
> > Okay.
> > I will send patch like below tomorrow if there isn't any objection.
> > 
> > if (unlikely(order > PAGE_ALLOC_COSTLY_ORDER)) {
> > 	if (printk_ratelimit()) {
> > 		printk("%s: page allocation high-order stupidity: order:%d, mode:0x%x\n",
> > 			current->comm, order, gfp_mask);
> > 		printk_once("Enable compaction or whatever\n");
> > 		printk_once("Regular high-order kernel allocations like this will eventually start failing.\n");

s/printk_once/printk/g
Copy&Paste should go away. :(

> > 		dump_stack();
> > 	}
> > }
> 
> The warning message could be improved. I did not expect you to use "Enable
> compaction or whatever" verbatim. I was just illustrating what type of
> warnings I thought might be useful. I expected you would change it to
> something that embedded driver authors would pay attention to :)

Okay.

> 
> As you are using printk_ratelimit(), you can also use pr_warning to
> annotate this as KERN_WARNING.

Will do.
Thanks, Mel.

> 
> -- 
> Mel Gorman
> SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09 13:19           ` Minchan Kim
@ 2012-07-09 20:53             ` Andrew Morton
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Morton @ 2012-07-09 20:53 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Mel Gorman, linux-kernel, linux-mm, Rik van Riel, KOSAKI Motohiro,
	Johannes Weiner, Kamezawa Hiroyuki

On Mon, 9 Jul 2012 22:19:42 +0900
Minchan Kim <minchan@kernel.org> wrote:

> > As you are using printk_ratelimit()

include/linux/printk.h sayeth

/*
 * Please don't use printk_ratelimit(), because it shares ratelimiting state
 * with all other unrelated printk_ratelimit() callsites.  Instead use
 * printk_ratelimited() or plain old __ratelimit().
 */

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09  8:46   ` Minchan Kim
  2012-07-09  9:12     ` Mel Gorman
@ 2012-07-09 12:53     ` Cong Wang
  2012-07-09 14:12       ` Minchan Kim
  1 sibling, 1 reply; 11+ messages in thread
From: Cong Wang @ 2012-07-09 12:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-mm

On Mon, 09 Jul 2012 at 08:46 GMT, Minchan Kim <minchan@kernel.org> wrote:
>> 
>> WARN_ON_ONCE would tell you what is trying to satisfy the allocation.
>
> Do you mean that it would be better to use WARN_ON_ONCE rather than raw printk?
> If so, I would like to insist raw printk because WARN_ON_ONCE could be disabled
> by !CONFIG_BUG.
> If I miss something, could you elaborate it more?
>

Raw printk could be disabled by !CONFIG_PRINTK too, and given that:

config PRINTK
        default y
        bool "Enable support for printk" if EXPERT
		    
config BUG
        bool "BUG() support" if EXPERT
        default y

they are both configurable only when ERPERT, so we don't need to
worry much. :)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09 12:53     ` Cong Wang
@ 2012-07-09 14:12       ` Minchan Kim
  2012-07-11  2:45         ` Cong Wang
  0 siblings, 1 reply; 11+ messages in thread
From: Minchan Kim @ 2012-07-09 14:12 UTC (permalink / raw)
  To: Cong Wang; +Cc: linux-kernel, linux-mm

Hi Cong,

On Mon, Jul 09, 2012 at 12:53:22PM +0000, Cong Wang wrote:
> On Mon, 09 Jul 2012 at 08:46 GMT, Minchan Kim <minchan@kernel.org> wrote:
> >> 
> >> WARN_ON_ONCE would tell you what is trying to satisfy the allocation.
> >
> > Do you mean that it would be better to use WARN_ON_ONCE rather than raw printk?
> > If so, I would like to insist raw printk because WARN_ON_ONCE could be disabled
> > by !CONFIG_BUG.
> > If I miss something, could you elaborate it more?
> >
> 
> Raw printk could be disabled by !CONFIG_PRINTK too, and given that:

Yes.
In such case, It is very hard to diagnose the system so at least
we enables CONFIG_PRINTK.

> 
> config PRINTK
>         default y
>         bool "Enable support for printk" if EXPERT
> 		    
> config BUG
>         bool "BUG() support" if EXPERT
>         default y
> 
> they are both configurable only when ERPERT, so we don't need to
> worry much. :)

Embedded can use CONFIG_PRINTK and !CONFIG_BUG for size optimization
and printk(pr_xxx) + dump_stack is common technic used in all over kernel
sources. Do you have any reason you don't like it?


> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm: Warn about costly page allocation
  2012-07-09 14:12       ` Minchan Kim
@ 2012-07-11  2:45         ` Cong Wang
  0 siblings, 0 replies; 11+ messages in thread
From: Cong Wang @ 2012-07-11  2:45 UTC (permalink / raw)
  To: Minchan Kim; +Cc: linux-kernel, linux-mm

On Mon, Jul 9, 2012 at 10:12 PM, Minchan Kim <minchan@kernel.org> wrote:
>
> Embedded can use CONFIG_PRINTK and !CONFIG_BUG for size optimization
> and printk(pr_xxx) + dump_stack is common technic used in all over kernel
> sources. Do you have any reason you don't like it?
>

No, I am just feeling like it is a kind of dup. No objections from me.

Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2012-07-11  2:45 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-07-09  2:38 [PATCH] mm: Warn about costly page allocation Minchan Kim
2012-07-09  8:22 ` Mel Gorman
2012-07-09  8:46   ` Minchan Kim
2012-07-09  9:12     ` Mel Gorman
2012-07-09 12:50       ` Minchan Kim
2012-07-09 13:05         ` Mel Gorman
2012-07-09 13:19           ` Minchan Kim
2012-07-09 20:53             ` Andrew Morton
2012-07-09 12:53     ` Cong Wang
2012-07-09 14:12       ` Minchan Kim
2012-07-11  2:45         ` Cong Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).