From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from mgwym02.jp.fujitsu.com ([211.128.242.41]:13875 "EHLO
	mgwym02.jp.fujitsu.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756350AbcBQGGE (ORCPT );
	Wed, 17 Feb 2016 01:06:04 -0500
Received: from g01jpfmpwyt01.exch.g01.fujitsu.local
	(g01jpfmpwyt01.exch.g01.fujitsu.local [10.128.193.38])
	by yt-mxoi2.gw.nic.fujitsu.com (Postfix) with ESMTP id 843B8AC02D1
	for ; Wed, 17 Feb 2016 14:55:37 +0900 (JST)
Subject: Re: [PATCH] btrfs: Avoid BUG_ON()s because of ENOMEM caused by kmalloc() failure
To: , "linux-btrfs@vger.kernel.org"
References: <56C16441.5030000@jp.fujitsu.com> <20160215175333.GO4374@twin.jikos.cz>
From: Satoru Takeuchi
Message-ID: <56C40B0F.3090209@jp.fujitsu.com>
Date: Wed, 17 Feb 2016 14:54:23 +0900
MIME-Version: 1.0
In-Reply-To: <20160215175333.GO4374@twin.jikos.cz>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID: 

On 2016/02/16 2:53, David Sterba wrote:
> On Mon, Feb 15, 2016 at 02:38:09PM +0900, Satoru Takeuchi wrote:
>> There are some BUG_ON()'s after kmalloc() as follows.
>>
>> =====
>> foo = kmalloc();
>> BUG_ON(!foo); /* -ENOMEM case */
>> =====
>>
>> A Docker + memory cgroup user hit these BUG_ON()s.
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=112101
>>
>> Since it's very hard to handle these ENOMEMs properly,
>> preventing these kmalloc() failures to avoid these
>> BUG_ON()s for now, are a bit better than the current
>> implementation anyway.
>
> Beware that the NOFAIL semantics is can cause deadlocks if it's on the
> critical writeback path or and can be reentered from itself through the
> reclaim. Unless you're sure that this is not the case, please do not add
> them just because it would seemingly fix the allocation failures.

In all the cases I changed, kmalloc() can block because
gfp_flags_allow_blocking() is true. So no locks are held at those
points and deadlocks don't happen.
Am I missing something?

>
> In the docker example, the memory is limited by cgroups so the NOFAIL
> mode can exhaust all reserves and just loop endlessly waiting for the
> OOM killer to get some memory or just waiting without any chance to
> progress.

I think triggering the OOM killer and killing processes in a cgroup
is better than killing the whole system.

As for the possibility of an endless loop, the same problem already
exists in many places across the kernel, and Btrfs is no exception:

==========================================
$ grep -rnH __GFP_NOFAIL fs/btrfs/
fs/btrfs/extent-tree.c:5970: GFP_NOFS | __GFP_NOFAIL);
fs/btrfs/extent-tree.c:6043: bytenr + num_bytes - 1, GFP_NOFS | __GFP_NOFAIL);
fs/btrfs/extent_io.c:4643: eb = kmem_cache_zalloc(extent_buffer_cache, GFP_NOFS|__GFP_NOFAIL);
fs/btrfs/extent_io.c:4909: p = find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL);
==========================================

I understand that fixing these problems in cooperation with the
memory cgroup developers is the best approach in the long run.
However, I think working around the problem for now is better than
the current implementation.

Thanks,
Satoru