Date: Wed, 11 Dec 2013 09:26:48 +0000
From: Mel Gorman
To: David Rientjes
Cc: Andrew Morton, Michal Hocko, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch] mm, page_alloc: allow __GFP_NOFAIL to allocate below watermarks after reclaim
Message-ID: <20131211092648.GW11295@suse.de>
References: <20131210075059.GA11295@suse.de>

On Tue, Dec 10, 2013 at 03:03:39PM -0800, David Rientjes wrote:
> On Tue, 10 Dec 2013, Mel Gorman wrote:
>
> > > If direct reclaim has failed to free memory, __GFP_NOFAIL allocations
> > > can potentially loop forever in the page allocator. In this case, it's
> > > better to give them the ability to access below watermarks so that they
> > > may allocate similar to the same privilege given to GFP_ATOMIC
> > > allocations.
> > >
> > > We're careful to ensure this is only done after direct reclaim has had
> > > the chance to free memory, however.
> > >
> > > Signed-off-by: David Rientjes
> >
> > The main problem with doing something like this is that it just smacks
> > into the adjusted watermark if there are a number of __GFP_NOFAIL. Who
> > was the user of __GFP_NOFAIL that was fixed by this patch?
> >
>
> Nobody, it comes out of a memcg discussion where __GFP_NOFAIL were
> recently given the ability to bypass charges to the root memcg when the
> memcg has hit its limit since we disallow the oom killer to kill a process
> (for the same reason that the vast majority of __GFP_NOFAIL users, those
> that do GFP_NOFS | __GFP_NOFAIL, disallow the oom killer in the page
> allocator).
>
> Without some other thread freeing memory, these allocations simply loop
> forever. We probably don't want to reconsider the choice that prevents
> calling the oom killer in !__GFP_FS contexts since it will allow
> unnecessary oom killing when memory can actually be freed by another
> thread.
>
> Since there are comments in both gfp.h and page_alloc.c that say no new
> users will be added, it seems legitimate to ensure that the allocation
> will at least have a chance of succeeding, but not the point of depleting
> memory reserves entirely.
>

Which __GFP_NOFAIL on its own does not guarantee if they just smack into
that barrier and cannot do anything. It changes the timing, not fixes the
problem.

> > There are enough bad users of __GFP_NOFAIL that I really question how
> > good an idea it is to allow emergency reserves to be used when they are
> > potentially leaked to other !__GFP_NOFAIL users via the slab allocator
> > shortly afterwards.
> >
>
> You could make the same argument for GFP_ATOMIC which can also allow
> access to memory reserves.

The critical difference being that GFP_ATOMIC callers typically can handle
NULL being returned to them. GFP_ATOMIC storms may starve !GFP_ATOMIC
requests but it does not cause the same types of problems that
__GFP_NOFAIL using reserves would.

-- 
Mel Gorman
SUSE Labs
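
The "changes the timing" point can be illustrated with a small user-space
model. This is only a sketch, not kernel code: struct toy_zone,
toy_alloc_page() and toy_alloc_nofail() are hypothetical names invented
here, the "half the min watermark" floor is an arbitrary stand-in for the
adjusted watermark, and the model assumes nothing ever frees memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a zone; this is not the kernel's struct zone. */
struct toy_zone {
	long free_pages;	/* pages currently free */
	long min_wmark;		/* floor for ordinary requests */
};

/*
 * Take one page if doing so keeps the zone at or above the applicable floor.
 * use_reserves models a request that may dip below the min watermark; the
 * lowered floor of min_wmark / 2 is an assumption chosen for illustration.
 */
static bool toy_alloc_page(struct toy_zone *z, bool use_reserves)
{
	assert(z != NULL);

	long floor = use_reserves ? z->min_wmark / 2 : z->min_wmark;

	if (z->free_pages <= floor)
		return false;	/* a caller that handles failure stops here */

	z->free_pages--;
	return true;
}

/*
 * Model of a must-not-fail request that is allowed into the reserves.
 * max_spins is a stand-in for "loops forever": in this model no other
 * thread frees memory, so once the lowered floor is hit every retry fails.
 */
static bool toy_alloc_nofail(struct toy_zone *z, int max_spins)
{
	for (int i = 0; i < max_spins; i++) {
		if (toy_alloc_page(z, true))
			return true;
		/* Real code would retry reclaim here; the model frees nothing. */
	}
	return false;
}
```

Starting from free_pages == min_wmark == 100, an ordinary request fails at
once, a run of toy_alloc_nofail() calls succeeds until the lowered floor is
reached, and after that the caller spins exactly as it would have without
the reserves. A caller that tolerates failure simply sees false from
toy_alloc_page() at that point and backs off.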