* [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Marek Szyprowski @ 2014-01-17  8:46 UTC
To: linux-kernel, devel
Cc: Marek Szyprowski, Greg Kroah-Hartman, Andreas Dilger, Peng Tao

GFP_ATOMIC is not a single gfp flag, but a macro which expands to the other
flags and LACK of __GFP_WAIT flag. To check if caller wanted to perform an
atomic allocation, the code must test __GFP_WAIT flag presence.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
---
 .../lustre/include/linux/libcfs/libcfs_private.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
index d0d942c..dddccca1 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
@@ -120,7 +120,7 @@ do { \
 do { \
 	LASSERT(!in_interrupt() ||				\
 		((size) <= LIBCFS_VMALLOC_SIZE &&		\
-		 ((mask) & GFP_ATOMIC)) != 0);			\
+		 ((mask) & __GFP_WAIT) == 0));			\
 } while (0)

 #define LIBCFS_ALLOC_POST(ptr, size)				\
--
1.7.9.5
* Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Greg Kroah-Hartman @ 2014-01-17 14:33 UTC
To: Marek Szyprowski
Cc: linux-kernel, devel, Andreas Dilger, Peng Tao

On Fri, Jan 17, 2014 at 09:46:56AM +0100, Marek Szyprowski wrote:
> GFP_ATOMIC is not a single gfp flag, but a macro which expands to the other
> flags and LACK of __GFP_WAIT flag. To check if caller wanted to perform an
> atomic allocation, the code must test __GFP_WAIT flag presence.
>
> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
> ---
>  .../lustre/include/linux/libcfs/libcfs_private.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
> index d0d942c..dddccca1 100644
> --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
> +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
> @@ -120,7 +120,7 @@ do { \
>  do { \
>  	LASSERT(!in_interrupt() ||				\
>  		((size) <= LIBCFS_VMALLOC_SIZE &&		\
> -		 ((mask) & GFP_ATOMIC)) != 0);			\
> +		 ((mask) & __GFP_WAIT) == 0));			\
>  } while (0)

What a horrible assert, can't we just remove this entirely?
in_interrupt() usually should never be checked, if so, the code is
doing something wrong.  And __GFP flags shouldn't be used on their own.

thanks,

greg k-h
* Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Dan Carpenter @ 2014-01-17 14:51 UTC
To: Greg Kroah-Hartman
Cc: Marek Szyprowski, devel, Peng Tao, Andreas Dilger, linux-kernel

We will want to get rid of lustre's custom allocator before this gets
out of staging.

But one feature that the lustre allocator has which is pretty neat is
that it lets you debug how much memory the filesystem is using.  Is
there a standard way to find this information?

regards,
dan carpenter
* Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Greg Kroah-Hartman @ 2014-01-17 15:17 UTC
To: Dan Carpenter
Cc: devel, Peng Tao, Andreas Dilger, linux-kernel, Marek Szyprowski

On Fri, Jan 17, 2014 at 05:51:28PM +0300, Dan Carpenter wrote:
> We will want to get rid of lustre's custom allocator before this gets
> out of staging.
>
> But one feature that the lustre allocator has which is pretty neat is
> that it lets you debug how much memory the filesystem is using.  Is
> there a standard way to find this information?

Create your own mempool/slab/whatever_it's_called and look in the
debugfs or proc files for the allocator usage, depending on the memory
allocator the kernel is using.

That's how the rest of the kernel does it, no reason lustre should be
any different.

thanks,

greg k-h
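For readers who have not set one up, the pattern Greg is pointing at looks roughly like the sketch below. This is kernel-side code that does not build outside a kernel tree, and `foo`/`foo_cache` are placeholders, not real lustre names:

```c
/* Kernel-side sketch only: a dedicated cache whose usage then shows
 * up under its own name in /proc/slabinfo (or /sys/kernel/slab/ with
 * SLUB), with no custom accounting needed. */
#include <linux/module.h>
#include <linux/slab.h>

struct foo {
	int whatever;	/* placeholder payload */
};

static struct kmem_cache *foo_cache;

static int __init foo_init(void)
{
	foo_cache = kmem_cache_create("foo_cache", sizeof(struct foo),
				      0, SLAB_HWCACHE_ALIGN, NULL);
	return foo_cache ? 0 : -ENOMEM;
}

static void __exit foo_exit(void)
{
	kmem_cache_destroy(foo_cache);
}

module_init(foo_init);
module_exit(foo_exit);
MODULE_LICENSE("GPL");
```

Once loaded, `grep foo_cache /proc/slabinfo` shows live object and page counts for just that structure.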
* Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Dilger, Andreas @ 2014-01-21 20:02 UTC
To: Greg Kroah-Hartman, Dan Carpenter
Cc: devel@driverdev.osuosl.org, Peng Tao, linux-kernel@vger.kernel.org, Marek Szyprowski, Drokin, Oleg

On 2014/01/17, 8:17 AM, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org> wrote:
> On Fri, Jan 17, 2014 at 05:51:28PM +0300, Dan Carpenter wrote:
>> We will want to get rid of lustre's custom allocator before this gets
>> out of staging.
>>
>> But one feature that the lustre allocator has which is pretty neat is
>> that it lets you debug how much memory the filesystem is using.  Is
>> there a standard way to find this information?
>
> Create your own mempool/slab/whatever_it's_called and look in the
> debugfs or proc files for the allocator usage, depending on the memory
> allocator the kernel is using.
>
> That's how the rest of the kernel does it, no reason lustre should be
> any different.

The Lustre allocation macros track the memory usage across the whole
filesystem, not just of a single structure as a mempool/slab/whatever
would do.  This is useful to know for debugging purposes (e.g. a user
complains about not having enough RAM for their highly-tuned
application, or to check for leaks at unmount).

It can also log the alloc/free calls and post-process them to find
leaks easily, or to find pieces of code that allocate too much memory
and are not using dedicated slabs.  This also works if you encounter a
system with a lot of allocated memory: enable "free" logging, and then
unmount the filesystem.  The logs will show which structures are being
freed (assuming they are not leaked completely) and point you to
whatever is not being shrunk properly.

I don't know if there is any way to track this with regular kmalloc(),
and creating separate slabs for every data structure would be ugly.
The generic /proc/meminfo data doesn't really tell you what is using
all the memory, and the size-NNNN slabs give some information, but are
used all over the kernel.

I'm pretty much resigned to losing all of this functionality, but it
definitely has been very useful for finding problems.

Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel High Performance Data Division
* Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Dan Carpenter @ 2014-01-21 20:16 UTC
To: Dilger, Andreas
Cc: Greg Kroah-Hartman, devel@driverdev.osuosl.org, Drokin, Oleg, Peng Tao, linux-kernel@vger.kernel.org, Marek Szyprowski

Greg meant mempools, not slab.  You should look at it.  It does what
you need for debugging and solves a couple of other problems as well.

We have a leak checker in the kernel but most people disable it.  I
forget the config name.  There are a bunch of useful debug configs.

regards,
dan carpenter
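The leak checker Dan can't name is almost certainly kmemleak, enabled with CONFIG_DEBUG_KMEMLEAK. Its runtime interface lives in debugfs; roughly (root required, and only on a kernel built with the option, so treat this as a sketch):

```shell
# kernel built with CONFIG_DEBUG_KMEMLEAK=y
mount -t debugfs nodev /sys/kernel/debug   # if not already mounted
echo scan  > /sys/kernel/debug/kmemleak    # trigger an immediate scan
cat          /sys/kernel/debug/kmemleak    # dump suspected leaks
echo clear > /sys/kernel/debug/kmemleak    # forget current suspects
```

This matches Oleg's description in the follow-up: it catches allocations regardless of which wrapper macro was used, at the cost of significant memory overhead and false positives.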
* Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Drokin, Oleg @ 2014-01-22  1:31 UTC
To: Dan Carpenter
Cc: Dilger, Andreas, Greg Kroah-Hartman, devel@driverdev.osuosl.org, Peng Tao, linux-kernel@vger.kernel.org, Marek Szyprowski

Hello!

On Jan 21, 2014, at 3:16 PM, Dan Carpenter wrote:
> We have a leak checker in the kernel but most people disable it.  I
> forget the config name.  There are a bunch of useful debug configs.

I actually use it at times too and it's useful (e.g. it works even if
you did not wrap the allocation in the lustre macro, and this did
happen before), but it comes with its own problems.  E.g. it gobbles
up memory like there's no tomorrow (at least in my case), it has a
horrible false-failure ratio with zfs in place (and probably that's
why it uses even more memory then), and it has an otherwise quite
significant failure ratio too.

But yes, it is useful and I am glad it's there.  It does not solve
some other use cases outlined by Andreas, though.

Bye,
Oleg
* Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Dave Hansen @ 2014-01-21 21:15 UTC
To: Dilger, Andreas, Greg Kroah-Hartman, Dan Carpenter
Cc: devel@driverdev.osuosl.org, Peng Tao, linux-kernel@vger.kernel.org, Marek Szyprowski, Drokin, Oleg

On 01/21/2014 12:02 PM, Dilger, Andreas wrote:
> The Lustre allocation macros track the memory usage across the whole
> filesystem, not just of a single structure that a mempool/slab/whatever
> would do.  This is useful to know for debugging purposes (e.g. user
> complains about not having enough RAM for their highly-tuned
> application, or to check for leaks at unmount).

Urg, it does this with a global variable.  If we did this kind of thing
generically, we'd get eaten alive by all of the cacheline bouncing from
that atomic.  It's also a 32-bit atomic.  Guess it's never had more
than 4GB of memory tracked in there. :)

This also doesn't track overhead from things that *are* in slabs like
the inodes, or the 14 kmem_caches that lustre has, so it's far from a
complete picture of how lustre is using memory.

> It can also log the alloc/free calls and post-process them to find
> leaks easily, or find pieces of code that are allocating too much
> memory and are not using dedicated slabs.  This also works if you
> encounter a system with a lot of allocated memory: enable "free"
> logging, and then unmount the filesystem.  The logs will show which
> structures are being freed (assuming they are not leaked completely)
> and point you to whatever is not being shrunk properly.

This isn't perfect, but it does cover most of the ext4 call sites in my
kernel.  It would work better for a module, I'd imagine:

	cd /sys/kernel/debug/tracing/events/kmem
	echo -n 'call_site < 0xffffffff81e0af00 && call_site >= 0xffffffff81229cf0' > kmalloc/filter
	echo 1 > kmalloc/enable
	cat /sys/kernel/debug/tracing/trace_pipe

It will essentially log all the kmalloc() calls from the ext4 code.  I
got the call site locations from grepping System.map.  It would be
_really_ nice if we were able to do something like:

	echo 'call_site =~ *ext4*'

but there's no way to do that which I know of.  You could probably rig
something up with the function graph tracer by triggering only on entry
to the lustre code.

> I don't know if there is any way to track this with regular kmalloc(),
> and creating separate slabs for every data structure would be ugly.
> The generic /proc/meminfo data doesn't really tell you what is using
> all the memory, and the size-NNNN slabs give some information, but
> are used all over the kernel.
>
> I'm pretty much resigned to losing all of this functionality, but it
> definitely has been very useful for finding problems.

Yeah, it is hard to find out who is responsible for leaking pages or
kmalloc()s, especially after the fact.  But, we seem to limp along just
fine.  If lustre is that bad of a memory leaker that it *NEEDS* this
feature, we have bigger problems on our hands. :)
* Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Drokin, Oleg @ 2014-01-22  1:52 UTC
To: Hansen, Dave
Cc: Dilger, Andreas, Greg Kroah-Hartman, Dan Carpenter, devel@driverdev.osuosl.org, Peng Tao, linux-kernel@vger.kernel.org, Marek Szyprowski

Hello!

On Jan 21, 2014, at 4:15 PM, Dave Hansen wrote:
> On 01/21/2014 12:02 PM, Dilger, Andreas wrote:
>> The Lustre allocation macros track the memory usage across the whole
>> filesystem, not just of a single structure that a mempool/slab/whatever
>> would do.  This is useful to know for debugging purposes (e.g. user
>> complains about not having enough RAM for their highly-tuned
>> application, or to check for leaks at unmount).
>
> Urg, it does this with a global variable.  If we did this kind of thing
> generically, we'd get eaten alive by all of the cacheline bouncing from
> that atomic.  It's also a 32-bit atomic.  Guess it's never had more
> than 4GB of memory tracked in there. :)

No, hopefully we'll never get to be this memory hungry on a single
node.  Good point about the cacheline, I guess.

> This also doesn't track overhead from things that *are* in slabs like
> the inodes, or the 14 kmem_caches that lustre has, so it's far from a
> complete picture of how lustre is using memory.

The inodes are per filesystem, so we do have a complete picture there.
dentries and some other structures are shared, but e.g. on a client
lustre is frequently the only active FS in use and as such it's easy.

>> It can also log the alloc/free calls and post-process them to find
>> leaks easily, or find pieces of code that are allocating too much
>> memory and are not using dedicated slabs.  This also works if you
>> encounter a system with a lot of allocated memory: enable "free"
>> logging, and then unmount the filesystem.  The logs will show which
>> structures are being freed (assuming they are not leaked completely)
>> and point you to whatever is not being shrunk properly.
>
> This isn't perfect, but it does cover most of the ext4 call sites in my
> kernel.  It would work better for a module, I'd imagine:
>
>	cd /sys/kernel/debug/tracing/events/kmem
>	echo -n 'call_site < 0xffffffff81e0af00 && call_site >= 0xffffffff81229cf0' > kmalloc/filter
>	echo 1 > kmalloc/enable
>	cat /sys/kernel/debug/tracing/trace_pipe

That's a neat trick.

> It will essentially log all the kmalloc() calls from the ext4 code.  I
> got the call site locations from grepping System.map.  It would be
> _really_ nice if we were able to do something like:
>
>	echo 'call_site =~ *ext4*'

So basically module address plus len from /proc/modules should do for
modular features?  Should be easy to script.

Is there any neat way to enable this after the module is loaded, but
before it gets any control, so that there's a full track of all
allocations and deallocations (obviously that's only going to be used
for debugging)?

> Yeah, it is hard to find out who is responsible for leaking pages or
> kmalloc()s, especially after the fact.  But, we seem to limp along just
> fine.  If lustre is that bad of a memory leaker that it *NEEDS* this
> feature, we have bigger problems on our hands. :)

It's one of those things that you don't need often, but that does come
in very handy once the need arises ;)
It's also nice that it warns you on module unload that "hey, you left
this much stuff behind, bad boy!".  But we can live without it, that's
true too.

Bye,
Oleg
* Re: [PATCH] staging: lustre: fix GFP_ATOMIC macro usage
From: Marek Szyprowski @ 2014-01-20  8:18 UTC
To: Greg Kroah-Hartman
Cc: linux-kernel, devel, Andreas Dilger, Peng Tao

Hello,

On 2014-01-17 15:33, Greg Kroah-Hartman wrote:
> On Fri, Jan 17, 2014 at 09:46:56AM +0100, Marek Szyprowski wrote:
>> GFP_ATOMIC is not a single gfp flag, but a macro which expands to the other
>> flags and LACK of __GFP_WAIT flag. To check if caller wanted to perform an
>> atomic allocation, the code must test __GFP_WAIT flag presence.
>>
>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> ---
>>  .../lustre/include/linux/libcfs/libcfs_private.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
>> index d0d942c..dddccca1 100644
>> --- a/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
>> +++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_private.h
>> @@ -120,7 +120,7 @@ do { \
>>  do { \
>>  	LASSERT(!in_interrupt() ||				\
>>  		((size) <= LIBCFS_VMALLOC_SIZE &&		\
>> -		 ((mask) & GFP_ATOMIC)) != 0);			\
>> +		 ((mask) & __GFP_WAIT) == 0));			\
>>  } while (0)
>
> What a horrible assert, can't we just remove this entirely?
> in_interrupt() usually should never be checked, if so, the code is
> doing something wrong.  And __GFP flags shouldn't be used on their own.

Well, I prepared this patch while checking kernel sources for incorrect
GFP_ATOMIC usage.  I agree that drivers should use generic memory
allocation methods instead of inventing their own stuff.  Feel free to
ignore my patch in favor of removing this custom allocator altogether.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland