* Re: [RFC] free_pages stuff
From: Michal Hocko @ 2016-01-05 13:59 UTC
To: Al Viro
Cc: Geert Uytterhoeven, Linus Torvalds, Linux Kernel Mailing List, linux-mm

[CCing linux-mm]

On Tue 22-12-15 21:04:35, Al Viro wrote:
[...]
> Documentation/which-allocator-should-I-use might be a good idea...  Notes
> below are just a skeleton - a lot of details need to be added; in particular,
> there should be a part on "I have this kind of address and I want that;
> when and how should that be done?", completely missing here.  And there
> should be a big scary warning along the lines of "this is NOT an invitation
> for a flood of checkpatch-inspired patches"...
>
> Comments, corrections and additions would be very welcome.

FWIW I think this is a very good idea. The current form is good enough
IMHO.

> 1) Most of the time kmalloc() is the right thing to use.
> Limitations: alignment is no better than word, not available very early in
> bootstrap, allocated memory is physically contiguous, so large allocations
> are best avoided.
>
> 2) kmem_cache_alloc() allows specifying the alignment at cache creation
> time.  Otherwise it's similar to kmalloc().  Normally it's used for
> situations where we have a lot of instances of some type and want dynamic
> allocation of those.
>
> 3) vmalloc() is for large allocations.  They will be page-aligned,
> but *not* physically contiguous.  OTOH, large physically contiguous
> allocations are generally a bad idea.  Unlike other allocators, there's
> no variant that could be used in interrupt; freeing is possible there,
> but allocation is not.  Note that non-blocking variant *does* exist -
> __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL) can be used in atomic
> contexts; it's the interrupt ones that are no-go.

It is also hardcoded GFP_KERNEL context so a usage from NOFS context
needs a special treatment.

> 4) if it's very early in bootstrap, alloc_bootmem() and friends
> may be the only option.  Rule of thumb: if it's already printed
>     Memory: ...../..... available.....
> you shouldn't be using that one.  Allocations are physically contiguous
> and at that point large physically contiguous allocations are still OK.
>
> 5) if you need to allocate memory for DMA, use dma_alloc_coherent()
> and friends.  They'll give you both the virtual address for your use
> and DMA address referring to the same memory for use by device; do *NOT*
> try to derive the latter from the former; use of virt_to_bus() et al.
> is a Bloody Bad Idea(tm).
>
> 6) if you need a reference to struct page, use alloc_page/alloc_pages.
>
> 7) in some cases (page tables, for the most obvious example), __get_free_page()
> and friends might be the right answer.  In principle, it's case (6), but
> it returns page_address(page) instead of the page itself.  Historically that
> was the first API introduced, so a _lot_ of places that should've been using
> something else ended up using that.  Do not assume that being lower level
> makes it faster than e.g. kmalloc() - this is simply not true.

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
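
The seven rules quoted in the message above can be read as a decision list. What follows is only a minimal userspace sketch of that list, not kernel code: the enum, the struct, its fields and pick_allocator() are all hypothetical names invented for illustration, and the order in which the conditions are checked is an assumption, since the thread does not rank the rules against each other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: none of these names exist in the kernel. */
enum allocator {
	ALLOC_KMALLOC,		/* 1) kmalloc(), the default choice */
	ALLOC_KMEM_CACHE,	/* 2) kmem_cache_alloc() */
	ALLOC_VMALLOC,		/* 3) vmalloc() */
	ALLOC_BOOTMEM,		/* 4) alloc_bootmem() and friends */
	ALLOC_DMA_COHERENT,	/* 5) dma_alloc_coherent() and friends */
	ALLOC_PAGES,		/* 6) alloc_page()/alloc_pages() */
	ALLOC_GET_FREE_PAGE,	/* 7) __get_free_page() and friends */
};

/* What the caller needs; field names are illustrative assumptions. */
struct alloc_request {
	bool very_early_boot;	/* "Memory: ... available" not printed yet */
	bool for_dma;		/* a device will access the memory */
	bool need_struct_page;	/* caller wants the struct page itself */
	bool page_table_like;	/* caller wants page_address(page) */
	bool large;		/* large, need not be physically contiguous */
	bool many_same_type;	/* many objects of one type, or custom alignment */
};

/* Walk the rules from most to least specific (ordering is assumed). */
enum allocator pick_allocator(const struct alloc_request *req)
{
	assert(req != NULL);

	if (req->very_early_boot)
		return ALLOC_BOOTMEM;
	if (req->for_dma)
		return ALLOC_DMA_COHERENT;
	if (req->need_struct_page)
		return ALLOC_PAGES;
	if (req->page_table_like)
		return ALLOC_GET_FREE_PAGE;
	if (req->large)
		return ALLOC_VMALLOC;
	if (req->many_same_type)
		return ALLOC_KMEM_CACHE;
	return ALLOC_KMALLOC;
}
```

A request with no special needs falls through to kmalloc(), matching rule 1; every other outcome needs a positive reason.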
* Re: [RFC] free_pages stuff
From: Al Viro @ 2016-01-05 15:26 UTC
To: Michal Hocko
Cc: Geert Uytterhoeven, Linus Torvalds, Linux Kernel Mailing List, linux-mm

On Tue, Jan 05, 2016 at 02:59:03PM +0100, Michal Hocko wrote:

> > 3) vmalloc() is for large allocations.  They will be page-aligned,
> > but *not* physically contiguous.  OTOH, large physically contiguous
> > allocations are generally a bad idea.  Unlike other allocators, there's
> > no variant that could be used in interrupt; freeing is possible there,
> > but allocation is not.  Note that non-blocking variant *does* exist -
> > __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL) can be used in atomic
> > contexts; it's the interrupt ones that are no-go.

The last sentence I'd put into that part was complete crap...

> It is also hardcoded GFP_KERNEL context so a usage from NOFS context
> needs a special treatment.

... in part because of this.  GFP_ATOMIC __vmalloc() will be anything but,
and the only caller passing that is almost certainly bogus.  As for NOFS/NOIO,
I wonder if we should apply that special treatment inside __vmalloc_area_node
rather than in callers; see the current thread on linux-mm for details...

Another interesting issue is __GFP_HIGHMEM meaning for kmalloc and __vmalloc
resp. (should never be passed to kmalloc, should almost always be passed
to __vmalloc - the former needs pages mapped in kernel space, the latter
probably never needs a separate kernel alias for the data pages, to such
degree that I'm not sure if we shouldn't _force_ __GFP_HIGHMEM for data pages
allocation in __vmalloc_area_node())

> > 4) if it's very early in bootstrap, alloc_bootmem() and friends
> > may be the only option.  Rule of thumb: if it's already printed
> >     Memory: ...../..... available.....
> > you shouldn't be using that one.  Allocations are physically contiguous
> > and at that point large physically contiguous allocations are still OK.

Probably needs at least some discussion of memblock vs. bootmem APIs.
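
The __GFP_HIGHMEM remark in the message above comes down to two rules: never pass the flag to kmalloc(), and almost always pass it for the data pages behind __vmalloc(). The following is a toy userspace model of those two rules, assuming placeholder bit values that are NOT the kernel's real gfp encoding; the type and function names are hypothetical. The second helper models the forcing that the message only floats as a possibility, not anything the kernel is stated to do.

```c
#include <stdbool.h>

/* Placeholder flag bits for illustration only; not the kernel's values. */
typedef unsigned int model_gfp_t;
#define MODEL_GFP_HIGHMEM	0x02u
#define MODEL_GFP_KERNEL	0x10u

/*
 * kmalloc() returns a directly usable kernel address, so its pages must
 * already be mapped in kernel space: a highmem request makes no sense.
 */
bool model_kmalloc_flags_ok(model_gfp_t flags)
{
	return (flags & MODEL_GFP_HIGHMEM) == 0;
}

/*
 * The vmalloc family maps its data pages itself, so they probably never
 * need a separate kernel alias; model the idea of always adding the
 * highmem bit for those pages, whatever the caller passed.
 */
model_gfp_t model_vmalloc_data_page_flags(model_gfp_t flags)
{
	return flags | MODEL_GFP_HIGHMEM;
}
```

Putting the decision in one helper mirrors the argument made for NOFS/NOIO: handle the special case once inside the allocator rather than in every caller.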
* Re: [RFC] free_pages stuff
From: Michal Hocko @ 2016-01-05 15:42 UTC
To: Al Viro
Cc: Geert Uytterhoeven, Linus Torvalds, Linux Kernel Mailing List, linux-mm

On Tue 05-01-16 15:26:02, Al Viro wrote:
> On Tue, Jan 05, 2016 at 02:59:03PM +0100, Michal Hocko wrote:
>
> > > 3) vmalloc() is for large allocations.  They will be page-aligned,
> > > but *not* physically contiguous.  OTOH, large physically contiguous
> > > allocations are generally a bad idea.  Unlike other allocators, there's
> > > no variant that could be used in interrupt; freeing is possible there,
> > > but allocation is not.  Note that non-blocking variant *does* exist -
> > > __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL) can be used in atomic
> > > contexts; it's the interrupt ones that are no-go.
>
> The last sentence I'd put into that part was complete crap...
>
> > It is also hardcoded GFP_KERNEL context so a usage from NOFS context
> > needs a special treatment.
>
> ... in part because of this.  GFP_ATOMIC __vmalloc() will be anything but,
> and the only caller passing that is almost certainly bogus.

Agreed as just replied in the other email thread which I have noticed
only now.

> As for NOFS/NOIO,
> I wonder if we should apply that special treatment inside __vmalloc_area_node
> rather than in callers; see the current thread on linux-mm for details...

That would make a lot of sense to me. Spreading the _special_ treatment
all over the kernel is certainly worse.

> Another interesting issue is __GFP_HIGHMEM meaning for kmalloc and __vmalloc
> resp. (should never be passed to kmalloc, should almost always be passed
> to __vmalloc - the former needs pages mapped in kernel space, the latter
> probably never needs a separate kernel alias for the data pages, to such
> degree that I'm not sure if we shouldn't _force_ __GFP_HIGHMEM for data pages
> allocation in __vmalloc_area_node())

I would have to think about this one some more. Let's not fragment the
discussion and continue in that email thread:
http://lkml.kernel.org/r/20160103071246.GK9938%40ZenIV.linux.org.uk

-- 
Michal Hocko
SUSE Labs