From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1760984AbZEKXUu (ORCPT ); Mon, 11 May 2009 19:20:50 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1758079AbZEKXUk (ORCPT ); Mon, 11 May 2009 19:20:40 -0400
Received: from smtp1.linux-foundation.org ([140.211.169.13]:60372 "EHLO
	smtp1.linux-foundation.org" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1751392AbZEKXUk (ORCPT );
	Mon, 11 May 2009 19:20:40 -0400
Date: Mon, 11 May 2009 16:14:46 -0700
From: Andrew Morton
To: David Rientjes
Cc: gregkh@suse.de, npiggin@suse.de, mel@csn.ul.ie, a.p.zijlstra@chello.nl,
	cl@linux-foundation.org, dave@linux.vnet.ibm.com, san@android.com,
	arve@android.com, linux-kernel@vger.kernel.org
Subject: Re: [patch 08/11 -mmotm] oom: invoke oom killer for __GFP_NOFAIL
Message-Id: <20090511161446.4d2a32a5.akpm@linux-foundation.org>
In-Reply-To:
References: <20090511142936.dd68005b.akpm@linux-foundation.org>
	<20090511151130.9a949cb7.akpm@linux-foundation.org>
	<20090511154603.0fb0acbf.akpm@linux-foundation.org>
X-Mailer: Sylpheed version 2.2.4 (GTK+ 2.8.20; i486-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 11 May 2009 16:00:58 -0700 (PDT)
David Rientjes wrote:

> On Mon, 11 May 2009, Andrew Morton wrote:
>
> > oh, well that was pretty useless then.  I was trying to find a handy
> > spot where we can avoid adding fastpath cycles.
> >
> > How about we sneak it into the order>0 leg inside buffered_rmqueue()?
> >
>
> Wouldn't it be easier after my patch is merged to just check the oom
> killer stack traces for such allocations and people complain about
> unnecessary oom killing when memory is available but too fragmented?  The
> gfp_flags and order are shown in the oom killer header.

That assumes that the oom-killer is triggered - in the typical kernel
developer testing, that won't happen.

I think what we should do here is to prevent people even attempting to use
__GFP_NOFAIL with higher-order allocations.  Are you aware of any callsite
which is presently using __GFP_NOFAIL on order>0 allocations?

I expect slub might cause this to happen due to its habit of using
larger-than-needed orders for small objects.  For example, cxgb3 is passing
__GFP_NOFAIL into alloc_skb().

> >
> > --- a/mm/page_alloc.c~page-allocator-warn-if-__gfp_nofail-is-used-for-a-large-allocation
> > +++ a/mm/page_alloc.c
> > @@ -1130,6 +1130,20 @@ again:
> >  		list_del(&page->lru);
> >  		pcp->count--;
> >  	} else {
> > +		if (unlikely(gfp_mask & __GFP_NOFAIL)) {
> > +			/*
> > +			 * __GFP_NOFAIL is not to be used in new code.
> > +			 *
> > +			 * All __GFP_NOFAIL callers should be fixed so that they
> > +			 * properly detect and handle allocation failures.
> > +			 *
> > +			 * We most definitely don't want callers attempting to
> > +			 * allocate greater than single-page units with
> > +			 * __GFP_NOFAIL.
> > +			 */
> > +			WARN_ON_ONCE(order > 0);
> > +			return 0;
> > +		}
> >  		spin_lock_irqsave(&zone->lock, flags);
> >  		page = __rmqueue(zone, order, migratetype);
> >  		__mod_zone_page_state(zone, NR_FREE_PAGES, -(1 << order));
>
> That "return 0" definitely needs to be removed, though :)

The inventor of copy-n-paste has a lot to answer for.
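
For illustration only, here is a rough userspace sketch of how the check
would behave once the stray "return 0" is dropped: warn once on a
__GFP_NOFAIL request with order>0, then carry on into the normal allocation
path.  Nothing below is kernel code.  SKETCH_GFP_NOFAIL,
sketch_warn_on_once() and sketch_buffered_rmqueue() are made-up names, the
flag value is arbitrary, and the "allocation" just returns a page count.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's __GFP_NOFAIL bit; value is arbitrary. */
#define SKETCH_GFP_NOFAIL 0x800u

/* Number of warnings printed so far, so the warn-once behaviour is observable. */
static unsigned int sketch_warn_count;

/*
 * Userspace imitation of WARN_ON_ONCE(): complain the first time the
 * condition holds, stay quiet afterwards, and hand the condition back.
 */
static bool sketch_warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		sketch_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Models only the control flow of the patched function.  Returns the
 * number of pages "allocated" (1 << order); callers are assumed to pass
 * small orders.  The order-0 fast path never looks at the flag, so it
 * gains no extra cycles.
 */
static unsigned int sketch_buffered_rmqueue(unsigned int gfp_mask,
					    unsigned int order)
{
	static bool warned;

	if (order == 0)
		return 1u;	/* fast path: no __GFP_NOFAIL check here */

	if (gfp_mask & SKETCH_GFP_NOFAIL) {
		/* Warn, but do not bail out: no "return 0" in this leg. */
		sketch_warn_on_once(order > 0, &warned,
				    "__GFP_NOFAIL used with order > 0");
	}

	return 1u << order;	/* stands in for the __rmqueue() path */
}
```

In this sketch, calling sketch_buffered_rmqueue(SKETCH_GFP_NOFAIL, 2) twice
prints one warning and hands back four pages both times, while order-0
callers never reach the check.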