From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Piggin Subject: Re: [PATCH] vmalloc: add warning in __vmalloc Date: Thu, 3 May 2012 16:30:56 +1000 Message-ID: References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> <4FA1D93C.9000306@kernel.org> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: Minchan Kim , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , Artem Bityutskiy , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , linux-fsdevel To: Sage Weil Return-path: In-Reply-To: Sender: owner-linux-mm@kvack.org List-Id: linux-fsdevel.vger.kernel.org On 3 May 2012 15:46, Sage Weil wrote: > On Thu, 3 May 2012, Minchan Kim wrote: >> On 05/03/2012 04:46 AM, Andrew Morton wrote: >> > Well. =C2=A0What are we actually doing here? =C2=A0Causing the kernel = to spew a >> > warning due to known-buggy callsites, so that users will report the >> > warnings, eventually goading maintainers into fixing their stuff. >> > >> > This isn't very efficient :( >> >> >> Yes. I hope maintainers fix it before merging this. >> >> > >> > It would be better to fix that stuff first, then add the warning to >> > prevent reoccurrences. =C2=A0Yes, maintainers are very naughty and pro= bably >> > do need cattle prods^W^W warnings to motivate them to fix stuff, but w= e >> > should first make an effort to get these things fixed without >> > irritating and alarming our users. >> > >> > Where are these offending callsites? > > Okay, maybe this is a stupid question, but: if an fs can't call vmalloc > with GFP_NOFS without risking deadlock, calling with GFP_KERNEL instead > doesn't fix anything (besides being more honest). =C2=A0This really means= that > vmalloc is effectively off-limits for file systems in any > writeback-related path, right? Anywhere it cannot reenter the filesystem, yes. GFP_NOFS is effectively GFP_KERNEL when calling vmalloc. Note that in writeback paths, a "good citizen" filesystem should not requir= e any allocations, or at least it should be able to tolerate allocation failu= res. So fixing that would be a good idea anyway. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Artem Bityutskiy Subject: Re: [PATCH] vmalloc: add warning in __vmalloc Date: Thu, 03 May 2012 10:13:26 +0300 Message-ID: <1336029206.13013.11.camel@sauron.fi.intel.com> References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> <4FA1D93C.9000306@kernel.org> Reply-To: dedekind1@gmail.com Mime-Version: 1.0 Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; boundary="=-IhXpZohLmfOVPioqd9U9" Cc: Sage Weil , Minchan Kim , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , David Woodhouse , Theodore Ts'o , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , linux-fsdevel To: Nick Piggin Return-path: Received: from mga11.intel.com ([192.55.52.93]:45262 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752627Ab2ECHKJ (ORCPT ); Thu, 3 May 2012 03:10:09 -0400 In-Reply-To: Sender: linux-fsdevel-owner@vger.kernel.org List-ID: --=-IhXpZohLmfOVPioqd9U9 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, 2012-05-03 at 16:30 +1000, Nick Piggin wrote: > Note that in writeback paths, a "good citizen" filesystem should not requ= ire > any allocations, or at least it should be able to tolerate allocation fai= lures. > So fixing that would be a good idea anyway. This is a good point, but UBIFS kmallocs(GFP_NOFS) when doing I/O because it needs to compress/decompress. But I agree that if kmalloc fails, we should have a fall-back reserve buffer protected by a mutex for memory pressure situations. --=20 Best Regards, Artem Bityutskiy --=-IhXpZohLmfOVPioqd9U9 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJPojAWAAoJECmIfjd9wqK0uyoQAIHwcFt6wlnSB89vJSJSz7kh O+NcHb04TACpGZCPzaST5tWBN3I03h9fyZC4Cyvy5APcLJkMZaV3znP1lBsXTrRT tUmoxoSirX89ILYyY9Vafc/+TWfe59DV41/XZY13dWqA0X4m5mlDnaZoaadHCO3Z chcmUSHqMMgpyBTGzOhxlICDY8e/PBKunzvogc2THrHKjYimL+e8MaPJ3suTe3Vm OhY/zMaaGMANuUsCVJNXpVGMaPolVkxslZbjljgZDCQ7Iwj9W96Km5E+31w8fz5F edM03z2EH9Cc1yQKqK+pe8URa4V/rJcGMQsh9KFkfSPDLYdUmr/4isNyTzLzp98c QA4sue8F3/x2hlssCQKEof9MnHSXA45uhf14gL3va8bZ2y0EQPQ02c6g8OsPKb4E As9To54l9Nb7fSWXnrRpt+V0W7PrDgMLxDdVmWLtrRFfg9MJogbLRiNcwPqWuM3P +uE1egbi7MXfepH6vx7gpdwnd8YjqwtZZAbXlBaIz6yPbt5cx908lJirftJxOgKs xNBMsvX+gTZz6/ObxlMFQa86yiPC8pGatT4siKojCgKj+p+DehqK0FP4HHNTzDuH xeeMC2bhtpZmxX52mlu5KcehRSjJFIrqfPknJQ6i9A158QzIUre3jMUiDrkD0glI gva38IN4fRPD+21Ct/ly =luMf -----END PGP SIGNATURE----- --=-IhXpZohLmfOVPioqd9U9-- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nick Piggin Subject: Re: [PATCH] vmalloc: add warning in __vmalloc Date: Thu, 3 May 2012 17:14:16 +1000 Message-ID: References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> <4FA1D93C.9000306@kernel.org> <1336029206.13013.11.camel@sauron.fi.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Cc: Sage Weil , Minchan Kim , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , linux-fsdevel To: dedekind1@gmail.com Return-path: In-Reply-To: <1336029206.13013.11.camel@sauron.fi.intel.com> Sender: owner-linux-mm@kvack.org List-Id: linux-fsdevel.vger.kernel.org On 3 May 2012 17:13, Artem Bityutskiy wrote: > On Thu, 2012-05-03 at 16:30 +1000, Nick Piggin wrote: >> Note that in writeback paths, a "good citizen" filesystem should not require >> any allocations, or at least it should be able to tolerate allocation failures. >> So fixing that would be a good idea anyway. > > This is a good point, but UBIFS kmallocs(GFP_NOFS) when doing I/O > because it needs to compress/decompress. But I agree that if kmalloc > fails, we should have a fall-back reserve buffer protected by a mutex > for memory pressure situations. AKA, a mempool :) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Steven Whitehouse Subject: Re: [PATCH] vmalloc: add warning in __vmalloc Date: Thu, 03 May 2012 14:48:36 +0100 Message-ID: <1336052916.7030.7.camel@menhir> References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> <4FA1D93C.9000306@kernel.org> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit Cc: Sage Weil , Minchan Kim , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , Artem Bityutskiy , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , "David S. Miller" , James Morris , Alexander Viro , linux-fsdevel To: Nick Piggin Return-path: In-Reply-To: Sender: owner-linux-mm@kvack.org List-Id: linux-fsdevel.vger.kernel.org Hi, On Thu, 2012-05-03 at 16:30 +1000, Nick Piggin wrote: > On 3 May 2012 15:46, Sage Weil wrote: > > On Thu, 3 May 2012, Minchan Kim wrote: > >> On 05/03/2012 04:46 AM, Andrew Morton wrote: > >> > Well. What are we actually doing here? Causing the kernel to spew a > >> > warning due to known-buggy callsites, so that users will report the > >> > warnings, eventually goading maintainers into fixing their stuff. > >> > > >> > This isn't very efficient :( > >> > >> > >> Yes. I hope maintainers fix it before merging this. > >> > >> > > >> > It would be better to fix that stuff first, then add the warning to > >> > prevent reoccurrences. Yes, maintainers are very naughty and probably > >> > do need cattle prods^W^W warnings to motivate them to fix stuff, but we > >> > should first make an effort to get these things fixed without > >> > irritating and alarming our users. > >> > > >> > Where are these offending callsites? > > > > Okay, maybe this is a stupid question, but: if an fs can't call vmalloc > > with GFP_NOFS without risking deadlock, calling with GFP_KERNEL instead > > doesn't fix anything (besides being more honest). This really means that > > vmalloc is effectively off-limits for file systems in any > > writeback-related path, right? > > Anywhere it cannot reenter the filesystem, yes. GFP_NOFS is effectively > GFP_KERNEL when calling vmalloc. > > Note that in writeback paths, a "good citizen" filesystem should not require > any allocations, or at least it should be able to tolerate allocation failures. > So fixing that would be a good idea anyway. For cluster filesystems, there is an additional issue. When we allocate memory with GFP_KERNEL we might land up pushing inodes out of cache, which can also mean deallocating them. That process involves taking cluster locks, and so it is not valid to do this while holding another cluster lock (since the locks may be taken in random order). In the GFS2 use case for vmalloc, this is being done if kmalloc fails and also if the memory required is too large for kmalloc (very unlikely, but possible with very large directories). Also, it is being done under a cluster lock (shared mode). I recently looked back at the thread which resulted in that particular vmalloc call being left there: http://www.redhat.com/archives/cluster-devel/2010-July/msg00021.html http://www.redhat.com/archives/cluster-devel/2010-July/msg00022.html http://www.redhat.com/archives/cluster-devel/2010-July/msg00023.html which reminded me of the problem. So this might not be so easy to resolve... Steve. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx158.postini.com [74.125.245.158]) by kanga.kvack.org (Postfix) with SMTP id DC4A66B004D for ; Wed, 2 May 2012 00:28:36 -0400 (EDT) From: Minchan Kim Subject: [PATCH] vmalloc: add warning in __vmalloc Date: Wed, 2 May 2012 13:28:09 +0900 Message-Id: <1335932890-25294-1-git-send-email-minchan@kernel.org> Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Minchan Kim , Neil Brown , Artem Bityutskiy , David Woodhouse , Theodore Ts'o , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , Sage Weil Now there are several places to use __vmalloc with GFP_ATOMIC, GFP_NOIO, GFP_NOFS but unfortunately __vmalloc calls map_vm_area which calls alloc_pages with GFP_KERNEL to allocate page tables. It means it's possible to happen deadlock. I don't know why it doesn't have reported until now. Firstly, I tried passing gfp_t to lower functions to support __vmalloc with such flags but other mm guys don't want and decided that all of caller should be fixed. http://marc.info/?l=linux-kernel&m=133517143616544&w=2 To begin with, let's listen other's opinion whether they can fix it by other approach without calling __vmalloc with such flags. So this patch adds warning in __vmalloc_node_range to detect it and to be fixed hopely. __vmalloc_node_range isn't random chocie because all caller which has gfp_mask of map_vm_area use it through __vmalloc_area_node. And __vmalloc_area_node is current static function and is called by only __vmalloc_node_range. So warning in __vmalloc_node_range would cover all vmalloc functions which have gfp_t argument. I Cced related maintainers. If I miss someone, please Cced them. * Changelog * Replace WARN_ON with WARN_ON_ONCE - by Andrew Morton, Nick Piggin. Cc: Neil Brown Cc: Artem Bityutskiy Cc: David Woodhouse Cc: "Theodore Ts'o" Cc: Adrian Hunter Cc: Steven Whitehouse Cc: "David S. Miller" Cc: James Morris Cc: Alexander Viro Cc: Sage Weil Signed-off-by: Minchan Kim --- mm/vmalloc.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index c28b0b9..def5943 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -1648,6 +1648,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, void *addr; unsigned long real_size = size; + WARN_ON_ONCE(!(gfp_mask & __GFP_WAIT) || + !(gfp_mask & __GFP_IO) || + !(gfp_mask & __GFP_FS)); + size = PAGE_ALIGN(size); if (!size || (size >> PAGE_SHIFT) > totalram_pages) goto fail; -- 1.7.9.5 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx161.postini.com [74.125.245.161]) by kanga.kvack.org (Postfix) with SMTP id 271506B0081 for ; Wed, 2 May 2012 15:46:13 -0400 (EDT) Date: Wed, 2 May 2012 12:46:10 -0700 From: Andrew Morton Subject: Re: [PATCH] vmalloc: add warning in __vmalloc Message-Id: <20120502124610.175e099c.akpm@linux-foundation.org> In-Reply-To: <1335932890-25294-1-git-send-email-minchan@kernel.org> References: <1335932890-25294-1-git-send-email-minchan@kernel.org> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: Minchan Kim Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , Artem Bityutskiy , David Woodhouse , Theodore Ts'o , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , Sage Weil On Wed, 2 May 2012 13:28:09 +0900 Minchan Kim wrote: > Now there are several places to use __vmalloc with GFP_ATOMIC, > GFP_NOIO, GFP_NOFS but unfortunately __vmalloc calls map_vm_area > which calls alloc_pages with GFP_KERNEL to allocate page tables. > It means it's possible to happen deadlock. > I don't know why it doesn't have reported until now. > > Firstly, I tried passing gfp_t to lower functions to support __vmalloc > with such flags but other mm guys don't want and decided that > all of caller should be fixed. > > http://marc.info/?l=linux-kernel&m=133517143616544&w=2 > > To begin with, let's listen other's opinion whether they can fix it > by other approach without calling __vmalloc with such flags. > > So this patch adds warning in __vmalloc_node_range to detect it and > to be fixed hopely. __vmalloc_node_range isn't random chocie because > all caller which has gfp_mask of map_vm_area use it through __vmalloc_area_node. > And __vmalloc_area_node is current static function and is called by only > __vmalloc_node_range. So warning in __vmalloc_node_range would cover all > vmalloc functions which have gfp_t argument. > > I Cced related maintainers. > If I miss someone, please Cced them. > > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -1648,6 +1648,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > void *addr; > unsigned long real_size = size; > > + WARN_ON_ONCE(!(gfp_mask & __GFP_WAIT) || > + !(gfp_mask & __GFP_IO) || > + !(gfp_mask & __GFP_FS)); > + > size = PAGE_ALIGN(size); > if (!size || (size >> PAGE_SHIFT) > totalram_pages) > goto fail; Well. What are we actually doing here? Causing the kernel to spew a warning due to known-buggy callsites, so that users will report the warnings, eventually goading maintainers into fixing their stuff. This isn't very efficient :( It would be better to fix that stuff first, then add the warning to prevent reoccurrences. Yes, maintainers are very naughty and probably do need cattle prods^W^W warnings to motivate them to fix stuff, but we should first make an effort to get these things fixed without irritating and alarming our users. Where are these offending callsites? -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx162.postini.com [74.125.245.162]) by kanga.kvack.org (Postfix) with SMTP id 8335E6B0081 for ; Wed, 2 May 2012 21:03:07 -0400 (EDT) Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1SPkRj-0001Os-AF for linux-mm@kvack.org; Thu, 03 May 2012 03:03:03 +0200 Received: from 121.50.20.41 ([121.50.20.41]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 03 May 2012 03:03:03 +0200 Received: from minchan by 121.50.20.41 with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 03 May 2012 03:03:03 +0200 From: Minchan Kim Subject: Re: [PATCH] vmalloc: add warning in __vmalloc Date: Thu, 03 May 2012 10:02:52 +0900 Message-ID: <4FA1D93C.9000306@kernel.org> References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit In-Reply-To: <20120502124610.175e099c.akpm@linux-foundation.org> Sender: owner-linux-mm@kvack.org List-ID: To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org On 05/03/2012 04:46 AM, Andrew Morton wrote: > On Wed, 2 May 2012 13:28:09 +0900 > Minchan Kim wrote: > >> Now there are several places to use __vmalloc with GFP_ATOMIC, >> GFP_NOIO, GFP_NOFS but unfortunately __vmalloc calls map_vm_area >> which calls alloc_pages with GFP_KERNEL to allocate page tables. >> It means it's possible to happen deadlock. >> I don't know why it doesn't have reported until now. >> >> Firstly, I tried passing gfp_t to lower functions to support __vmalloc >> with such flags but other mm guys don't want and decided that >> all of caller should be fixed. >> >> http://marc.info/?l=linux-kernel&m=133517143616544&w=2 >> >> To begin with, let's listen other's opinion whether they can fix it >> by other approach without calling __vmalloc with such flags. >> >> So this patch adds warning in __vmalloc_node_range to detect it and >> to be fixed hopely. __vmalloc_node_range isn't random chocie because >> all caller which has gfp_mask of map_vm_area use it through __vmalloc_area_node. >> And __vmalloc_area_node is current static function and is called by only >> __vmalloc_node_range. So warning in __vmalloc_node_range would cover all >> vmalloc functions which have gfp_t argument. >> >> I Cced related maintainers. >> If I miss someone, please Cced them. >> >> --- a/mm/vmalloc.c >> +++ b/mm/vmalloc.c >> @@ -1648,6 +1648,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, >> void *addr; >> unsigned long real_size = size; >> >> + WARN_ON_ONCE(!(gfp_mask & __GFP_WAIT) || >> + !(gfp_mask & __GFP_IO) || >> + !(gfp_mask & __GFP_FS)); >> + >> size = PAGE_ALIGN(size); >> if (!size || (size >> PAGE_SHIFT) > totalram_pages) >> goto fail; > > Well. What are we actually doing here? Causing the kernel to spew a > warning due to known-buggy callsites, so that users will report the > warnings, eventually goading maintainers into fixing their stuff. > > This isn't very efficient :( Yes. I hope maintainers fix it before merging this. > > It would be better to fix that stuff first, then add the warning to > prevent reoccurrences. Yes, maintainers are very naughty and probably > do need cattle prods^W^W warnings to motivate them to fix stuff, but we > should first make an effort to get these things fixed without > irritating and alarming our users. > > Where are these offending callsites? dm: __alloc_buffer_wait_no_callback ubi: ubi_dbg_check_write ubi_dbg_check_all_ff ext4 : ext4_kvmalloc gfs2 : gfs2_alloc_sort_buffer ntfs : __ntfs_malloc ubifs : dbg_dump_leb scan_check_cb dump_lpt_leb dbg_check_ltab_lnum dbg_scan_orphans mm : alloc_large_system_hash ceph : fill_inode ceph_setxattr ceph_removexattr ceph_x_build_authorizer ceph_decode_buffer ceph_alloc_middle > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ > Don't email: email@kvack.org > -- Kind regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx175.postini.com [74.125.245.175]) by kanga.kvack.org (Postfix) with SMTP id CC6086B004D for ; Thu, 3 May 2012 01:46:14 -0400 (EDT) Date: Wed, 2 May 2012 22:46:12 -0700 (PDT) From: Sage Weil Subject: Re: [PATCH] vmalloc: add warning in __vmalloc In-Reply-To: <4FA1D93C.9000306@kernel.org> Message-ID: References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> <4FA1D93C.9000306@kernel.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: Minchan Kim Cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , Artem Bityutskiy , David Woodhouse , Theodore Ts'o , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro On Thu, 3 May 2012, Minchan Kim wrote: > On 05/03/2012 04:46 AM, Andrew Morton wrote: > > Well. What are we actually doing here? Causing the kernel to spew a > > warning due to known-buggy callsites, so that users will report the > > warnings, eventually goading maintainers into fixing their stuff. > > > > This isn't very efficient :( > > > Yes. I hope maintainers fix it before merging this. > > > > > It would be better to fix that stuff first, then add the warning to > > prevent reoccurrences. Yes, maintainers are very naughty and probably > > do need cattle prods^W^W warnings to motivate them to fix stuff, but we > > should first make an effort to get these things fixed without > > irritating and alarming our users. > > > > Where are these offending callsites? Okay, maybe this is a stupid question, but: if an fs can't call vmalloc with GFP_NOFS without risking deadlock, calling with GFP_KERNEL instead doesn't fix anything (besides being more honest). This really means that vmalloc is effectively off-limits for file systems in any writeback-related path, right? sage > > > dm: > __alloc_buffer_wait_no_callback > > ubi: > ubi_dbg_check_write > ubi_dbg_check_all_ff > > ext4 : > ext4_kvmalloc > > gfs2 : > gfs2_alloc_sort_buffer > > ntfs : > __ntfs_malloc > > ubifs : > dbg_dump_leb > scan_check_cb > dump_lpt_leb > dbg_check_ltab_lnum > dbg_scan_orphans > > mm : > alloc_large_system_hash > > ceph : > fill_inode > ceph_setxattr > ceph_removexattr > ceph_x_build_authorizer > ceph_decode_buffer > ceph_alloc_middle > > > > > > > -- > > To unsubscribe, send a message with 'unsubscribe linux-mm' in > > the body to majordomo@kvack.org. For more info on Linux MM, > > see: http://www.linux-mm.org/ . > > Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ > > Don't email: email@kvack.org > > > > > > -- > Kind regards, > Minchan Kim > > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx122.postini.com [74.125.245.122]) by kanga.kvack.org (Postfix) with SMTP id 22B3C6B004D for ; Thu, 3 May 2012 01:55:58 -0400 (EDT) Received: by lbjn8 with SMTP id n8so1423127lbj.14 for ; Wed, 02 May 2012 22:55:56 -0700 (PDT) Message-ID: <1336024538.2056.1.camel@koala> Subject: Re: [PATCH] vmalloc: add warning in __vmalloc From: Artem Bityutskiy Date: Thu, 03 May 2012 08:55:38 +0300 In-Reply-To: <20120502124610.175e099c.akpm@linux-foundation.org> References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; boundary="=-eNgnivd47g30r683VV7y" Mime-Version: 1.0 Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Minchan Kim , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , David Woodhouse , Theodore Ts'o , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , Sage Weil --=-eNgnivd47g30r683VV7y Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, 2012-05-02 at 12:46 -0700, Andrew Morton wrote: > Well. What are we actually doing here? Causing the kernel to spew a > warning due to known-buggy callsites, so that users will report the > warnings, eventually goading maintainers into fixing their stuff. >=20 > This isn't very efficient :( I'll look at UBIFS and UBI - they use vmalloc and probably some of them may be in write-back paths. --=20 Best Regards, Artem Bityutskiy --=-eNgnivd47g30r683VV7y Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJPoh3aAAoJECmIfjd9wqK0DugQAIW8EaTY4ZaVIylSG+FHCsTH HxRyp9vBXs1a0bl7K0qBLoJbUR0k6WuNGjJVX7hHGtYs3p9Q3CwzRoQrmJ4r377V KKW8d/oDuzCOEycIa9LJqPh5SEcdxRCL1gtaKJ3tFPclntUj9k7MC2if61GdXQuY fhfJIgCeZXu0e7oft8szQMJCi2tmt3QKeZ++KHToy86BHSR5uoG6Mbp+vYYppPhV pkQXK/UhHYsNBEj5Vz7mZx5fNcOpz9EeM6Z7Xo79xi3u+LEn0ugYOABrLfB5zMZG 3ckZEJvOE1egE2/QOv3wD/55VvPTDzD7zpsnMbNQIgUFYwix6G72WTn+ZDxV+4Kc CqiNpJ+4F9s4JVT+GvLS0tgLvJzcvA05x7TGzi7d/WyVsgWJsVVr4Ku4kHcCdWcS bpRFCk3PotHIqbugriW0GRd8hBw0sGtbNzol2ye2tqE2mvvkDXlUSnLKEL/42APE UheXtHFq6w6W4Vt+lNZbuW+JWH0/ga6cI/Zwn2BCwnM2LDpcx3LhAKeaM4vgzwHy ryk8/RlOqgHBOfd3snfaEzk/3v4VM2r3PYDxyV244dOXRO5p2qDBt/KVJ0/6UXb0 VEGX5JXelBnXr9tBU6SHPRDx1AyaNRx9vw2D8oEmYJ/fERyrbQQSUfjEc2WxEKlk 2ok7x2nrfnqnBjjf6OEC =UZEI -----END PGP SIGNATURE----- --=-eNgnivd47g30r683VV7y-- -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx155.postini.com [74.125.245.155]) by kanga.kvack.org (Postfix) with SMTP id 577306B00E7 for ; Thu, 3 May 2012 07:08:52 -0400 (EDT) Message-ID: <1336043475.13013.47.camel@sauron.fi.intel.com> Subject: Re: [PATCH] vmalloc: add warning in __vmalloc From: Artem Bityutskiy Reply-To: dedekind1@gmail.com Date: Thu, 03 May 2012 14:11:15 +0300 In-Reply-To: <20120502124610.175e099c.akpm@linux-foundation.org> References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; boundary="=-57Lh1wuEjZihfs+vd2pa" Mime-Version: 1.0 Sender: owner-linux-mm@kvack.org List-ID: To: Andrew Morton Cc: Minchan Kim , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , David Woodhouse , Theodore Ts'o , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , Sage Weil --=-57Lh1wuEjZihfs+vd2pa Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, 2012-05-02 at 12:46 -0700, Andrew Morton wrote: > On Wed, 2 May 2012 13:28:09 +0900 > Minchan Kim wrote: >=20 > > Now there are several places to use __vmalloc with GFP_ATOMIC, > > GFP_NOIO, GFP_NOFS but unfortunately __vmalloc calls map_vm_area > > which calls alloc_pages with GFP_KERNEL to allocate page tables. > > It means it's possible to happen deadlock. > > I don't know why it doesn't have reported until now. > >=20 > > Firstly, I tried passing gfp_t to lower functions to support __vmalloc > > with such flags but other mm guys don't want and decided that > > all of caller should be fixed. > >=20 > > http://marc.info/?l=3Dlinux-kernel&m=3D133517143616544&w=3D2 > >=20 > > To begin with, let's listen other's opinion whether they can fix it > > by other approach without calling __vmalloc with such flags. > >=20 > > So this patch adds warning in __vmalloc_node_range to detect it and > > to be fixed hopely. __vmalloc_node_range isn't random chocie because > > all caller which has gfp_mask of map_vm_area use it through __vmalloc_a= rea_node. > > And __vmalloc_area_node is current static function and is called by onl= y > > __vmalloc_node_range. So warning in __vmalloc_node_range would cover al= l > > vmalloc functions which have gfp_t argument. > > > > I Cced related maintainers. > > If I miss someone, please Cced them. > >=20 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -1648,6 +1648,10 @@ void *__vmalloc_node_range(unsigned long size, u= nsigned long align, > > void *addr; > > unsigned long real_size =3D size; > > =20 > > + WARN_ON_ONCE(!(gfp_mask & __GFP_WAIT) || > > + !(gfp_mask & __GFP_IO) || > > + !(gfp_mask & __GFP_FS)); > > + > > size =3D PAGE_ALIGN(size); > > if (!size || (size >> PAGE_SHIFT) > totalram_pages) > > goto fail; >=20 > Well. What are we actually doing here? Causing the kernel to spew a > warning due to known-buggy callsites, so that users will report the > warnings, eventually goading maintainers into fixing their stuff. >=20 > This isn't very efficient :( >=20 > It would be better to fix that stuff first, then add the warning to > prevent reoccurrences. Yes, maintainers are very naughty and probably > do need cattle prods^W^W warnings to motivate them to fix stuff, but we > should first make an effort to get these things fixed without > irritating and alarming our users. =20 >=20 > Where are these offending callsites? OK, I checked my part - both UBI and UBIFS call __vmalloc() with GFP_NOFS in several places of the _debugging_ code, and this is why we do not see any issues - the debugging code is used very rarely for validating purposes. All the places look fixable, I'll fix them a bit later. WARN_ON_ONCE() looks like a good first step. An I think it is better if maintainers fix their areas rather than if someone who does not know how the subsystem works starts trying to do that. --=20 Best Regards, Artem Bityutskiy --=-57Lh1wuEjZihfs+vd2pa Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJPomfTAAoJECmIfjd9wqK0Le0P/jVKUGL5CYTbVVuaczEr0YHG 3u1H1H5PdbSh+nZh1k+B4954w5tS96bPRrJi2HsFGTSdAvWudF0w5jOaXZqxE9y9 JRIqsYO1n4yJcYM6vNFOTOxk0gUCap4hR+AvqBGEB8C/fiC3f5E/FxISwSrzLBFx xMaScsYuaJJv0IzOD+MuPEDW2YuX3dyiKssqh8yWIcPNtS/o8LQ/im06HZugYvXo cuslXW3/vtcV7npEvyXNxRioKlsouWfuEh2ukrIvHRN7hGgAkUeW87ht2Z10065U oK+kWKd8PjrT7j0tzjnbGWZNkgUnIzo+4p91RzszfdVo/pZRQRogNAT4U6zRwbMZ MoW9prWmwT57iI4MIu9eCvqMug+nQesCvMlok7Bh8bvESxl6+CzTGGu0VwjKkUty xgPH3XRDGjrPPGwOUDL4Rv9ugCh8IVQE4punSIjGxwVrjtEDpLGObF7bj2Zzl46Q 7ye17JDT3p4HwtNgcDvNCUlyZ7xT2hReSQ8+GOMGuH94Wz9okMAg/eF2UHGsT1+d pbcy2a2YbikOooG1hVMcKf3YjJHeMwVNH/Eag8zpcKg5+Hl7Gf/BxrbzetubAow2 /i01lJ7nYieCXTp+AzJiRsaBIiQ7YXhcY6pK4NCu5EeQJfl9ugxyoUBGr7h/6YlG h5DV5VzMJrgsGpxoKonm =U6kO -----END PGP SIGNATURE----- --=-57Lh1wuEjZihfs+vd2pa-- -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752069Ab2EBE2i (ORCPT ); Wed, 2 May 2012 00:28:38 -0400 Received: from LGEMRELSE1Q.lge.com ([156.147.1.111]:60885 "EHLO LGEMRELSE1Q.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751229Ab2EBE2h (ORCPT ); Wed, 2 May 2012 00:28:37 -0400 X-AuditID: 9c93016f-b7b1bae0000044da-7f-4fa0b7ed5194 From: Minchan Kim To: Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Minchan Kim , Neil Brown , Artem Bityutskiy , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , Sage Weil Subject: [PATCH] vmalloc: add warning in __vmalloc Date: Wed, 2 May 2012 13:28:09 +0900 Message-Id: <1335932890-25294-1-git-send-email-minchan@kernel.org> X-Mailer: git-send-email 1.7.9.5 X-Brightmail-Tracker: AAAAAA== Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Now there are several places to use __vmalloc with GFP_ATOMIC, GFP_NOIO, GFP_NOFS but unfortunately __vmalloc calls map_vm_area which calls alloc_pages with GFP_KERNEL to allocate page tables. It means it's possible to happen deadlock. I don't know why it doesn't have reported until now. Firstly, I tried passing gfp_t to lower functions to support __vmalloc with such flags but other mm guys don't want and decided that all of caller should be fixed. http://marc.info/?l=linux-kernel&m=133517143616544&w=2 To begin with, let's listen other's opinion whether they can fix it by other approach without calling __vmalloc with such flags. So this patch adds warning in __vmalloc_node_range to detect it and to be fixed hopely. __vmalloc_node_range isn't random chocie because all caller which has gfp_mask of map_vm_area use it through __vmalloc_area_node. And __vmalloc_area_node is current static function and is called by only __vmalloc_node_range. So warning in __vmalloc_node_range would cover all vmalloc functions which have gfp_t argument. I Cced related maintainers. If I miss someone, please Cced them. * Changelog * Replace WARN_ON with WARN_ON_ONCE - by Andrew Morton, Nick Piggin. Cc: Neil Brown Cc: Artem Bityutskiy Cc: David Woodhouse Cc: "Theodore Ts'o" Cc: Adrian Hunter Cc: Steven Whitehouse Cc: "David S. Miller" Cc: James Morris Cc: Alexander Viro Cc: Sage Weil Signed-off-by: Minchan Kim --- mm/vmalloc.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index c28b0b9..def5943 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -1648,6 +1648,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, void *addr; unsigned long real_size = size; + WARN_ON_ONCE(!(gfp_mask & __GFP_WAIT) || + !(gfp_mask & __GFP_IO) || + !(gfp_mask & __GFP_FS)); + size = PAGE_ALIGN(size); if (!size || (size >> PAGE_SHIFT) > totalram_pages) goto fail; -- 1.7.9.5 From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756858Ab2EBTqO (ORCPT ); Wed, 2 May 2012 15:46:14 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:42582 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756209Ab2EBTqM (ORCPT ); Wed, 2 May 2012 15:46:12 -0400 Date: Wed, 2 May 2012 12:46:10 -0700 From: Andrew Morton To: Minchan Kim Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , Artem Bityutskiy , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , Sage Weil Subject: Re: [PATCH] vmalloc: add warning in __vmalloc Message-Id: <20120502124610.175e099c.akpm@linux-foundation.org> In-Reply-To: <1335932890-25294-1-git-send-email-minchan@kernel.org> References: <1335932890-25294-1-git-send-email-minchan@kernel.org> X-Mailer: Sylpheed 3.0.2 (GTK+ 2.20.1; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 2 May 2012 13:28:09 +0900 Minchan Kim wrote: > Now there are several places to use __vmalloc with GFP_ATOMIC, > GFP_NOIO, GFP_NOFS but unfortunately __vmalloc calls map_vm_area > which calls alloc_pages with GFP_KERNEL to allocate page tables. > It means it's possible to happen deadlock. > I don't know why it doesn't have reported until now. > > Firstly, I tried passing gfp_t to lower functions to support __vmalloc > with such flags but other mm guys don't want and decided that > all of caller should be fixed. > > http://marc.info/?l=linux-kernel&m=133517143616544&w=2 > > To begin with, let's listen other's opinion whether they can fix it > by other approach without calling __vmalloc with such flags. > > So this patch adds warning in __vmalloc_node_range to detect it and > to be fixed hopely. __vmalloc_node_range isn't random chocie because > all caller which has gfp_mask of map_vm_area use it through __vmalloc_area_node. > And __vmalloc_area_node is current static function and is called by only > __vmalloc_node_range. So warning in __vmalloc_node_range would cover all > vmalloc functions which have gfp_t argument. > > I Cced related maintainers. > If I miss someone, please Cced them. > > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -1648,6 +1648,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, > void *addr; > unsigned long real_size = size; > > + WARN_ON_ONCE(!(gfp_mask & __GFP_WAIT) || > + !(gfp_mask & __GFP_IO) || > + !(gfp_mask & __GFP_FS)); > + > size = PAGE_ALIGN(size); > if (!size || (size >> PAGE_SHIFT) > totalram_pages) > goto fail; Well. What are we actually doing here? Causing the kernel to spew a warning due to known-buggy callsites, so that users will report the warnings, eventually goading maintainers into fixing their stuff. This isn't very efficient :( It would be better to fix that stuff first, then add the warning to prevent reoccurrences. Yes, maintainers are very naughty and probably do need cattle prods^W^W warnings to motivate them to fix stuff, but we should first make an effort to get these things fixed without irritating and alarming our users. Where are these offending callsites? From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754312Ab2ECBDN (ORCPT ); Wed, 2 May 2012 21:03:13 -0400 Received: from LGEMRELSE1Q.lge.com ([156.147.1.111]:47812 "EHLO LGEMRELSE1Q.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751939Ab2ECBDL (ORCPT ); Wed, 2 May 2012 21:03:11 -0400 X-AuditID: 9c93016f-b7c99ae000007c38-fa-4fa1d9406c58 Message-ID: <4FA1D93C.9000306@kernel.org> Date: Thu, 03 May 2012 10:02:52 +0900 From: Minchan Kim User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:11.0) Gecko/20120410 Thunderbird/11.0.1 MIME-Version: 1.0 Newsgroups: gmane.linux.kernel.mm,gmane.linux.kernel To: Andrew Morton CC: linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , Artem Bityutskiy , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , Sage Weil Subject: Re: [PATCH] vmalloc: add warning in __vmalloc References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> In-Reply-To: <20120502124610.175e099c.akpm@linux-foundation.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Brightmail-Tracker: AAAAAA== Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 05/03/2012 04:46 AM, Andrew Morton wrote: > On Wed, 2 May 2012 13:28:09 +0900 > Minchan Kim wrote: > >> Now there are several places to use __vmalloc with GFP_ATOMIC, >> GFP_NOIO, GFP_NOFS but unfortunately __vmalloc calls map_vm_area >> which calls alloc_pages with GFP_KERNEL to allocate page tables. >> It means it's possible to happen deadlock. >> I don't know why it doesn't have reported until now. >> >> Firstly, I tried passing gfp_t to lower functions to support __vmalloc >> with such flags but other mm guys don't want and decided that >> all of caller should be fixed. >> >> http://marc.info/?l=linux-kernel&m=133517143616544&w=2 >> >> To begin with, let's listen other's opinion whether they can fix it >> by other approach without calling __vmalloc with such flags. >> >> So this patch adds warning in __vmalloc_node_range to detect it and >> to be fixed hopely. __vmalloc_node_range isn't random chocie because >> all caller which has gfp_mask of map_vm_area use it through __vmalloc_area_node. >> And __vmalloc_area_node is current static function and is called by only >> __vmalloc_node_range. So warning in __vmalloc_node_range would cover all >> vmalloc functions which have gfp_t argument. >> >> I Cced related maintainers. >> If I miss someone, please Cced them. >> >> --- a/mm/vmalloc.c >> +++ b/mm/vmalloc.c >> @@ -1648,6 +1648,10 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align, >> void *addr; >> unsigned long real_size = size; >> >> + WARN_ON_ONCE(!(gfp_mask & __GFP_WAIT) || >> + !(gfp_mask & __GFP_IO) || >> + !(gfp_mask & __GFP_FS)); >> + >> size = PAGE_ALIGN(size); >> if (!size || (size >> PAGE_SHIFT) > totalram_pages) >> goto fail; > > Well. What are we actually doing here? Causing the kernel to spew a > warning due to known-buggy callsites, so that users will report the > warnings, eventually goading maintainers into fixing their stuff. > > This isn't very efficient :( Yes. I hope maintainers fix it before merging this. > > It would be better to fix that stuff first, then add the warning to > prevent reoccurrences. Yes, maintainers are very naughty and probably > do need cattle prods^W^W warnings to motivate them to fix stuff, but we > should first make an effort to get these things fixed without > irritating and alarming our users. > > Where are these offending callsites? dm: __alloc_buffer_wait_no_callback ubi: ubi_dbg_check_write ubi_dbg_check_all_ff ext4 : ext4_kvmalloc gfs2 : gfs2_alloc_sort_buffer ntfs : __ntfs_malloc ubifs : dbg_dump_leb scan_check_cb dump_lpt_leb dbg_check_ltab_lnum dbg_scan_orphans mm : alloc_large_system_hash ceph : fill_inode ceph_setxattr ceph_removexattr ceph_x_build_authorizer ceph_decode_buffer ceph_alloc_middle > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ > Don't email: email@kvack.org > -- Kind regards, Minchan Kim From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752641Ab2ECFqP (ORCPT ); Thu, 3 May 2012 01:46:15 -0400 Received: from cobra.newdream.net ([66.33.216.30]:48219 "EHLO cobra.newdream.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752083Ab2ECFqO (ORCPT ); Thu, 3 May 2012 01:46:14 -0400 DomainKey-Signature: a=rsa-sha1; c=nofws; d=newdream.net; h=date:from:to:cc :subject:in-reply-to:message-id:references:mime-version: content-type; q=dns; s=newdream.net; b=FUSM2UFk8RU+n19L3/w6WoUKd PzVs9VKYa/cal9DDWAb8z704I+SXQjNud0zBnT/lkYQU3rM0LjKbDICBR5Ye8QnQ 97eiK+SQrPr+XYMJox5WTgUIL9joFU5O6G0ZNIQBVZLKLfs8ZyOypFB3IAjuZgu0 q18MwwCgD/uE3tN5bE= Date: Wed, 2 May 2012 22:46:12 -0700 (PDT) From: Sage Weil To: Minchan Kim cc: Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , Artem Bityutskiy , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro Subject: Re: [PATCH] vmalloc: add warning in __vmalloc In-Reply-To: <4FA1D93C.9000306@kernel.org> Message-ID: References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> <4FA1D93C.9000306@kernel.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 3 May 2012, Minchan Kim wrote: > On 05/03/2012 04:46 AM, Andrew Morton wrote: > > Well. What are we actually doing here? Causing the kernel to spew a > > warning due to known-buggy callsites, so that users will report the > > warnings, eventually goading maintainers into fixing their stuff. > > > > This isn't very efficient :( > > > Yes. I hope maintainers fix it before merging this. > > > > > It would be better to fix that stuff first, then add the warning to > > prevent reoccurrences. Yes, maintainers are very naughty and probably > > do need cattle prods^W^W warnings to motivate them to fix stuff, but we > > should first make an effort to get these things fixed without > > irritating and alarming our users. > > > > Where are these offending callsites? Okay, maybe this is a stupid question, but: if an fs can't call vmalloc with GFP_NOFS without risking deadlock, calling with GFP_KERNEL instead doesn't fix anything (besides being more honest). This really means that vmalloc is effectively off-limits for file systems in any writeback-related path, right? sage > > > dm: > __alloc_buffer_wait_no_callback > > ubi: > ubi_dbg_check_write > ubi_dbg_check_all_ff > > ext4 : > ext4_kvmalloc > > gfs2 : > gfs2_alloc_sort_buffer > > ntfs : > __ntfs_malloc > > ubifs : > dbg_dump_leb > scan_check_cb > dump_lpt_leb > dbg_check_ltab_lnum > dbg_scan_orphans > > mm : > alloc_large_system_hash > > ceph : > fill_inode > ceph_setxattr > ceph_removexattr > ceph_x_build_authorizer > ceph_decode_buffer > ceph_alloc_middle > > > > > > > -- > > To unsubscribe, send a message with 'unsubscribe linux-mm' in > > the body to majordomo@kvack.org. For more info on Linux MM, > > see: http://www.linux-mm.org/ . > > Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ > > Don't email: email@kvack.org > > > > > > -- > Kind regards, > Minchan Kim > > From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752866Ab2ECGa6 (ORCPT ); Thu, 3 May 2012 02:30:58 -0400 Received: from mail-ob0-f174.google.com ([209.85.214.174]:61245 "EHLO mail-ob0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751118Ab2ECGa5 convert rfc822-to-8bit (ORCPT ); Thu, 3 May 2012 02:30:57 -0400 MIME-Version: 1.0 In-Reply-To: References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> <4FA1D93C.9000306@kernel.org> Date: Thu, 3 May 2012 16:30:56 +1000 Message-ID: Subject: Re: [PATCH] vmalloc: add warning in __vmalloc From: Nick Piggin To: Sage Weil Cc: Minchan Kim , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , Artem Bityutskiy , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , linux-fsdevel Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 3 May 2012 15:46, Sage Weil wrote: > On Thu, 3 May 2012, Minchan Kim wrote: >> On 05/03/2012 04:46 AM, Andrew Morton wrote: >> > Well.  What are we actually doing here?  Causing the kernel to spew a >> > warning due to known-buggy callsites, so that users will report the >> > warnings, eventually goading maintainers into fixing their stuff. >> > >> > This isn't very efficient :( >> >> >> Yes. I hope maintainers fix it before merging this. >> >> > >> > It would be better to fix that stuff first, then add the warning to >> > prevent reoccurrences.  Yes, maintainers are very naughty and probably >> > do need cattle prods^W^W warnings to motivate them to fix stuff, but we >> > should first make an effort to get these things fixed without >> > irritating and alarming our users. >> > >> > Where are these offending callsites? > > Okay, maybe this is a stupid question, but: if an fs can't call vmalloc > with GFP_NOFS without risking deadlock, calling with GFP_KERNEL instead > doesn't fix anything (besides being more honest).  This really means that > vmalloc is effectively off-limits for file systems in any > writeback-related path, right? Anywhere it cannot reenter the filesystem, yes. GFP_NOFS is effectively GFP_KERNEL when calling vmalloc. Note that in writeback paths, a "good citizen" filesystem should not require any allocations, or at least it should be able to tolerate allocation failures. So fixing that would be a good idea anyway. From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754091Ab2ECHOS (ORCPT ); Thu, 3 May 2012 03:14:18 -0400 Received: from mail-ob0-f174.google.com ([209.85.214.174]:57013 "EHLO mail-ob0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751374Ab2ECHOR (ORCPT ); Thu, 3 May 2012 03:14:17 -0400 MIME-Version: 1.0 In-Reply-To: <1336029206.13013.11.camel@sauron.fi.intel.com> References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> <4FA1D93C.9000306@kernel.org> <1336029206.13013.11.camel@sauron.fi.intel.com> Date: Thu, 3 May 2012 17:14:16 +1000 Message-ID: Subject: Re: [PATCH] vmalloc: add warning in __vmalloc From: Nick Piggin To: dedekind1@gmail.com Cc: Sage Weil , Minchan Kim , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , Steven Whitehouse , "David S. Miller" , James Morris , Alexander Viro , linux-fsdevel Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 3 May 2012 17:13, Artem Bityutskiy wrote: > On Thu, 2012-05-03 at 16:30 +1000, Nick Piggin wrote: >> Note that in writeback paths, a "good citizen" filesystem should not require >> any allocations, or at least it should be able to tolerate allocation failures. >> So fixing that would be a good idea anyway. > > This is a good point, but UBIFS kmallocs(GFP_NOFS) when doing I/O > because it needs to compress/decompress. But I agree that if kmalloc > fails, we should have a fall-back reserve buffer protected by a mutex > for memory pressure situations. AKA, a mempool :) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756012Ab2ECNud (ORCPT ); Thu, 3 May 2012 09:50:33 -0400 Received: from mx1.redhat.com ([209.132.183.28]:3936 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755939Ab2ECNuc (ORCPT ); Thu, 3 May 2012 09:50:32 -0400 Subject: Re: [PATCH] vmalloc: add warning in __vmalloc From: Steven Whitehouse To: Nick Piggin Cc: Sage Weil , Minchan Kim , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org, kosaki.motohiro@gmail.com, rientjes@google.com, Neil Brown , Artem Bityutskiy , David Woodhouse , "Theodore Ts'o" , Adrian Hunter , "David S. Miller" , James Morris , Alexander Viro , linux-fsdevel In-Reply-To: References: <1335932890-25294-1-git-send-email-minchan@kernel.org> <20120502124610.175e099c.akpm@linux-foundation.org> <4FA1D93C.9000306@kernel.org> Content-Type: text/plain; charset="UTF-8" Organization: Red Hat UK Ltd Date: Thu, 03 May 2012 14:48:36 +0100 Message-ID: <1336052916.7030.7.camel@menhir> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On Thu, 2012-05-03 at 16:30 +1000, Nick Piggin wrote: > On 3 May 2012 15:46, Sage Weil wrote: > > On Thu, 3 May 2012, Minchan Kim wrote: > >> On 05/03/2012 04:46 AM, Andrew Morton wrote: > >> > Well. What are we actually doing here? Causing the kernel to spew a > >> > warning due to known-buggy callsites, so that users will report the > >> > warnings, eventually goading maintainers into fixing their stuff. > >> > > >> > This isn't very efficient :( > >> > >> > >> Yes. I hope maintainers fix it before merging this. > >> > >> > > >> > It would be better to fix that stuff first, then add the warning to > >> > prevent reoccurrences. Yes, maintainers are very naughty and probably > >> > do need cattle prods^W^W warnings to motivate them to fix stuff, but we > >> > should first make an effort to get these things fixed without > >> > irritating and alarming our users. > >> > > >> > Where are these offending callsites? > > > > Okay, maybe this is a stupid question, but: if an fs can't call vmalloc > > with GFP_NOFS without risking deadlock, calling with GFP_KERNEL instead > > doesn't fix anything (besides being more honest). This really means that > > vmalloc is effectively off-limits for file systems in any > > writeback-related path, right? > > Anywhere it cannot reenter the filesystem, yes. GFP_NOFS is effectively > GFP_KERNEL when calling vmalloc. > > Note that in writeback paths, a "good citizen" filesystem should not require > any allocations, or at least it should be able to tolerate allocation failures. > So fixing that would be a good idea anyway. For cluster filesystems, there is an additional issue. When we allocate memory with GFP_KERNEL we might land up pushing inodes out of cache, which can also mean deallocating them. That process involves taking cluster locks, and so it is not valid to do this while holding another cluster lock (since the locks may be taken in random order). In the GFS2 use case for vmalloc, this is being done if kmalloc fails and also if the memory required is too large for kmalloc (very unlikely, but possible with very large directories). Also, it is being done under a cluster lock (shared mode). I recently looked back at the thread which resulted in that particular vmalloc call being left there: http://www.redhat.com/archives/cluster-devel/2010-July/msg00021.html http://www.redhat.com/archives/cluster-devel/2010-July/msg00022.html http://www.redhat.com/archives/cluster-devel/2010-July/msg00023.html which reminded me of the problem. So this might not be so easy to resolve... Steve.