Re: [PATCH 02/22] Do not sanity check order in the fast path

From: Mel Gorman <mel@csn.ul.ie>
To: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	Nick Piggin <npiggin@suse.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Lin Ming <ming.m.lin@intel.com>,
	Zhang Yanmin <yanmin_zhang@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 02/22] Do not sanity check order in the fast path
Date: Wed, 22 Apr 2009 18:11:51 +0100
Message-ID: <20090422171151.GF15367@csn.ul.ie>
In-Reply-To: <1240416791.10627.78.camel@nimitz>

On Wed, Apr 22, 2009 at 09:13:11AM -0700, Dave Hansen wrote:
> On Wed, 2009-04-22 at 14:53 +0100, Mel Gorman wrote:
> > No user of the allocator API should be passing in an order >= MAX_ORDER
> > but we check for it on each and every allocation. Delete this check and
> > make it a VM_BUG_ON check further down the call path.
> 
> Should we get the check re-added to some of the upper-level functions,
> then?  Perhaps __get_free_pages() or things like alloc_pages_exact()? 
> 

I don't think so, no. It just moves the source of the text bloat, and it
does so only for the few callers that are asking for something that will
never succeed.

> I'm selfishly thinking of what I did in profile_init().  Can I slab
> alloc it?  Nope.  Page allocator?  Nope.  Oh, well, try vmalloc():
> 
>         prof_buffer = kzalloc(buffer_bytes, GFP_KERNEL);
>         if (prof_buffer)
>                 return 0;
> 
>         prof_buffer = alloc_pages_exact(buffer_bytes, GFP_KERNEL|__GFP_ZERO);
>         if (prof_buffer)
>                 return 0;
> 
>         prof_buffer = vmalloc(buffer_bytes);
>         if (prof_buffer)
>                 return 0;
> 
>         free_cpumask_var(prof_cpu_mask);
>         return -ENOMEM;
> 

Can this ever actually be asking for an order of MAX_ORDER or larger,
though? If so, you're condemning it to always behave poorly.

> Same thing in __kmalloc_section_memmap():
> 
>         page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
>         if (page)
>                 goto got_map_page;
> 
>         ret = vmalloc(memmap_size);
>         if (ret)
>                 goto got_map_ptr;
> 

If I'm reading that right, the order will never be a stupid order. It can fail
for higher orders, in which case it falls back to vmalloc(). For example, to
hit that limit on a kernel with 4K pages and a maximum usable order of 10,
the section size would need to exceed 256MB (assuming a struct page size of
64 bytes). I don't think it's ever that size and, if it were, the allocation
would always be sub-optimal, which is a poor choice to make.
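
The arithmetic can be checked with a small userspace sketch. None of this is
kernel code: order_for_size() and memmap_bytes() are hypothetical helpers, and
the constants are just the assumptions from the example (4K pages, orders 0
to 10 usable, 64-byte struct page).

```c
#include <stddef.h>

/* Assumed values from the example above, not read from any real config. */
#define PAGE_SHIFT_4K      12     /* 4K pages */
#define MAX_ORDER_ASSUMED  11     /* orders 0..10 are usable */
#define STRUCT_PAGE_BYTES  64UL   /* assumed sizeof(struct page) */

/* Smallest order whose block of pages covers `size` bytes (size > 0). */
static unsigned int order_for_size(unsigned long size)
{
	unsigned long pages = (size + (1UL << PAGE_SHIFT_4K) - 1) >> PAGE_SHIFT_4K;
	unsigned int order = 0;

	while ((1UL << order) < pages)
		order++;
	return order;
}

/* Bytes of memmap needed to describe one section of `section_bytes`. */
static unsigned long memmap_bytes(unsigned long section_bytes)
{
	return (section_bytes >> PAGE_SHIFT_4K) * STRUCT_PAGE_BYTES;
}
```

Under these assumptions a 256MB section needs a 4MB memmap, which is exactly
an order-10 request, the largest usable one. Only a section bigger than that
would push the request to order 11, i.e. MAX_ORDER_ASSUMED.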

> I depend on the allocator to tell me when I've fed it too high of an
> order.  If we really need this, perhaps we should do an audit and then
> add a WARN_ON() for a few releases to catch the stragglers.
> 

I consider it buggy to ask for something so large that you always end up
with the worst option - vmalloc(). How about leaving it as a VM_BUG_ON
to get as many reports as possible on who is depending on this odd
behaviour?

If there are users with good reasons, then we could convert this to a WARN_ON
while the callers get fixed up. I suspect that the allocator can already cope
with receiving a stupid order, silently but slowly. It should go all the way
to the bottom, never find anything useful and return NULL. zone_watermark_ok
is the most dangerous-looking part, but even it should never get to MAX_ORDER
because it should always find there are not enough free pages and return
before it overruns.
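
To illustrate why an oversized order should simply fail rather than blow up,
here is a toy model of a buddy-style free-list walk. It is only loosely
modelled on the idea and is not the kernel's implementation: struct toy_zone,
toy_alloc() and MAX_ORDER_ASSUMED are made up for this sketch, and the
watermark check is not modelled at all.

```c
#include <stddef.h>

#define MAX_ORDER_ASSUMED 11	/* assumed: orders 0..10 have free lists */

/* Toy zone: only counts how many free blocks exist at each order. */
struct toy_zone {
	unsigned long nr_free[MAX_ORDER_ASSUMED];
};

/*
 * Try to take a block of the requested order, splitting a larger block
 * if needed. Returns 1 on success and 0 on failure (standing in for a
 * NULL page pointer).
 */
static int toy_alloc(struct toy_zone *zone, unsigned int order)
{
	unsigned int current_order;

	/*
	 * If order >= MAX_ORDER_ASSUMED the loop body never runs, so the
	 * free lists are never indexed out of bounds and we fall through
	 * to the failure return.
	 */
	for (current_order = order; current_order < MAX_ORDER_ASSUMED;
	     current_order++) {
		if (zone->nr_free[current_order] == 0)
			continue;

		zone->nr_free[current_order]--;

		/* Each split leaves one free buddy at the next lower order. */
		while (current_order > order) {
			current_order--;
			zone->nr_free[current_order]++;
		}
		return 1;
	}
	return 0;
}
```

In this model a request for order 11 or above returns 0 without touching any
list, while a request for order 8 against a zone holding a single order-10
block succeeds and leaves one free block each at orders 9 and 8.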

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
