* Re: [RFC] free_pages stuff
[not found] ` <20151222210435.GB20997@ZenIV.linux.org.uk>
@ 2016-01-05 13:59 ` Michal Hocko
2016-01-05 15:26 ` Al Viro
0 siblings, 1 reply; 3+ messages in thread
From: Michal Hocko @ 2016-01-05 13:59 UTC (permalink / raw)
To: Al Viro
Cc: Geert Uytterhoeven, Linus Torvalds, Linux Kernel Mailing List,
linux-mm
[CCing linux-mm]
On Tue 22-12-15 21:04:35, Al Viro wrote:
[...]
> Documentation/which-allocator-should-I-use might be a good idea... Notes
> below are just a skeleton - a lot of details need to be added; in particular,
> there should be a part on "I have this kind of address and I want that;
> when and how should that be done?", completely missing here. And there
> should be a big scary warning along the lines of "this is NOT an invitation
> for a flood of checkpatch-inspired patches"...
>
> Comments, corrections and additions would be very welcome.
FWIW I think this is a very good idea. The current form is good enough
IMHO.
> 1) Most of the time kmalloc() is the right thing to use.
> Limitations: alignment is no better than word, not available very early in
> bootstrap, allocated memory is physically contiguous, so large allocations
> are best avoided.
>
> 2) kmem_cache_alloc() lets you specify the alignment at cache creation
> time. Otherwise it's similar to kmalloc(). Normally it's used for
> situations where we have a lot of instances of some type and want dynamic
> allocation of those.
>
> 3) vmalloc() is for large allocations. They will be page-aligned,
> but *not* physically contiguous. OTOH, large physically contiguous
> allocations are generally a bad idea. Unlike other allocators, there's
> no variant that could be used in interrupt; freeing is possible there,
> but allocation is not. Note that non-blocking variant *does* exist -
> __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL) can be used in atomic
> contexts; it's the interrupt ones that are no-go.
vmalloc() also hardcodes a GFP_KERNEL context, so using it from a NOFS
context needs special treatment.
> 4) if it's very early in bootstrap, alloc_bootmem() and friends
> may be the only option. Rule of thumb: if the kernel has already printed
> Memory: ...../..... available.....
> you shouldn't be using that one. Allocations are physically contiguous
> and at that point large physically contiguous allocations are still OK.
>
> 5) if you need to allocate memory for DMA, use dma_alloc_coherent()
> and friends. They'll give you both the virtual address for your use
> and DMA address referring to the same memory for use by device; do *NOT*
> try to derive the latter from the former; use of virt_to_bus() et al.
> is a Bloody Bad Idea(tm).
>
> 6) if you need a reference to struct page, use alloc_page/alloc_pages.
>
> 7) in some cases (page tables, for the most obvious example), __get_free_page()
> and friends might be the right answer. In principle, it's case (6), but
> it returns page_address(page) instead of the page itself. Historically that
> was the first API introduced, so a _lot_ of places that should've been using
> something else ended up using that. Do not assume that being lower level
> makes it faster than e.g. kmalloc() - this is simply not true.
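The list above can be read as a rough decision procedure. What follows is a minimal userspace sketch of that reading, not kernel code: struct alloc_request, pick_allocator(), the enum names, the order in which the rules are checked and the LARGE_THRESHOLD value are all assumptions made for illustration. Case (7) is left out because the list treats it as a variant of case (6).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of what the caller needs; not a kernel type. */
struct alloc_request {
    size_t size;            /* bytes requested */
    bool early_boot;        /* "Memory: .../... available" not printed yet */
    bool for_dma;           /* a device will access the memory */
    bool need_struct_page;  /* caller wants a struct page reference */
    bool many_same_type;    /* many instances of one type, or special alignment */
    bool in_interrupt;      /* running in interrupt context */
};

enum allocator {
    USE_BOOTMEM,       /* alloc_bootmem() and friends */
    USE_DMA_COHERENT,  /* dma_alloc_coherent() and friends */
    USE_ALLOC_PAGES,   /* alloc_page()/alloc_pages() */
    USE_KMEM_CACHE,    /* kmem_cache_alloc() */
    USE_VMALLOC,       /* vmalloc() */
    USE_KMALLOC        /* kmalloc(), the default */
};

/* Assumed cut-off for "large"; the list above gives no number. */
#define LARGE_THRESHOLD ((size_t)128 * 1024)

static enum allocator pick_allocator(const struct alloc_request *req)
{
    if (req->early_boot)
        return USE_BOOTMEM;         /* rule 4 */
    if (req->for_dma)
        return USE_DMA_COHERENT;    /* rule 5 */
    if (req->need_struct_page)
        return USE_ALLOC_PAGES;     /* rule 6 */
    if (req->many_same_type)
        return USE_KMEM_CACHE;      /* rule 2 */
    /* Rule 3: vmalloc() has no variant usable in interrupt context. */
    if (req->size >= LARGE_THRESHOLD && !req->in_interrupt)
        return USE_VMALLOC;
    return USE_KMALLOC;             /* rule 1 */
}
```

A large request made from interrupt context falls through to kmalloc() here only because the sketch has to return something; the list itself says that large physically contiguous allocations are best avoided.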
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [RFC] free_pages stuff
2016-01-05 13:59 ` [RFC] free_pages stuff Michal Hocko
@ 2016-01-05 15:26 ` Al Viro
2016-01-05 15:42 ` Michal Hocko
0 siblings, 1 reply; 3+ messages in thread
From: Al Viro @ 2016-01-05 15:26 UTC (permalink / raw)
To: Michal Hocko
Cc: Geert Uytterhoeven, Linus Torvalds, Linux Kernel Mailing List,
linux-mm
On Tue, Jan 05, 2016 at 02:59:03PM +0100, Michal Hocko wrote:
> > 3) vmalloc() is for large allocations. They will be page-aligned,
> > but *not* physically contiguous. OTOH, large physically contiguous
> > allocations are generally a bad idea. Unlike other allocators, there's
> > no variant that could be used in interrupt; freeing is possible there,
> > but allocation is not. Note that non-blocking variant *does* exist -
> > __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL) can be used in atomic
> > contexts; it's the interrupt ones that are no-go.
The last sentence I'd put into that part was complete crap...
> vmalloc() also hardcodes a GFP_KERNEL context, so using it from a NOFS
> context needs special treatment.
... in part because of this. GFP_ATOMIC __vmalloc() will be anything but,
and the only caller passing that is almost certainly bogus. As for NOFS/NOIO,
I wonder if we should apply that special treatment inside __vmalloc_area_node
rather than in callers; see the current thread on linux-mm for details...
Another interesting issue is what __GFP_HIGHMEM means for kmalloc and
__vmalloc respectively. It should never be passed to kmalloc and should
almost always be passed to __vmalloc: the former needs pages mapped in kernel
space, while the latter probably never needs a separate kernel alias for the
data pages, to such a degree that I'm not sure we shouldn't _force_
__GFP_HIGHMEM for data page allocation in __vmalloc_area_node().
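The asymmetry described above can be modelled in a few lines. This is a userspace sketch only: gfp_model_t, the MODEL_GFP_* bit values and both helper functions are hypothetical names invented for illustration, and nothing here reflects the kernel's real flag encoding or the actual body of __vmalloc_area_node().

```c
#include <stdbool.h>

/* Hypothetical stand-in for gfp_t; bit values are NOT the kernel's. */
typedef unsigned int gfp_model_t;

#define MODEL_GFP_HIGHMEM 0x1u  /* models __GFP_HIGHMEM */
#define MODEL_GFP_ZERO    0x2u  /* models some unrelated flag */

/*
 * kmalloc-style allocations hand back a kernel-mapped address, so a
 * highmem request makes no sense for them: reject it.
 */
static bool kmalloc_flags_valid(gfp_model_t flags)
{
    return (flags & MODEL_GFP_HIGHMEM) == 0;
}

/*
 * vmalloc-style allocations map their data pages themselves, so the
 * pages need no separate kernel alias. This models the idea of forcing
 * the highmem bit for data pages regardless of what the caller passed.
 */
static gfp_model_t vmalloc_data_page_flags(gfp_model_t flags)
{
    return flags | MODEL_GFP_HIGHMEM;
}
```

Forcing the bit in one place, rather than relying on each caller to pass it, is the same argument as for handling NOFS/NOIO inside the allocator instead of in the callers.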
> > 4) if it's very early in bootstrap, alloc_bootmem() and friends
> > may be the only option. Rule of thumb: if the kernel has already printed
> > Memory: ...../..... available.....
> > you shouldn't be using that one. Allocations are physically contiguous
> > and at that point large physically contiguous allocations are still OK.
Probably needs at least some discussion of memblock vs. bootmem APIs.
* Re: [RFC] free_pages stuff
2016-01-05 15:26 ` Al Viro
@ 2016-01-05 15:42 ` Michal Hocko
0 siblings, 0 replies; 3+ messages in thread
From: Michal Hocko @ 2016-01-05 15:42 UTC (permalink / raw)
To: Al Viro
Cc: Geert Uytterhoeven, Linus Torvalds, Linux Kernel Mailing List,
linux-mm
On Tue 05-01-16 15:26:02, Al Viro wrote:
> On Tue, Jan 05, 2016 at 02:59:03PM +0100, Michal Hocko wrote:
>
> > > 3) vmalloc() is for large allocations. They will be page-aligned,
> > > but *not* physically contiguous. OTOH, large physically contiguous
> > > allocations are generally a bad idea. Unlike other allocators, there's
> > > no variant that could be used in interrupt; freeing is possible there,
> > > but allocation is not. Note that non-blocking variant *does* exist -
> > > __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL) can be used in atomic
> > > contexts; it's the interrupt ones that are no-go.
>
> The last sentence I'd put into that part was complete crap...
>
> > vmalloc() also hardcodes a GFP_KERNEL context, so using it from a NOFS
> > context needs special treatment.
>
> ... in part because of this. GFP_ATOMIC __vmalloc() will be anything but,
> and the only caller passing that is almost certainly bogus.
Agreed, as I just replied in the other email thread, which I noticed only
now.
> As for NOFS/NOIO,
> I wonder if we should apply that special treatment inside __vmalloc_area_node
> rather than in callers; see the current thread on linux-mm for details...
That would make a lot of sense to me. Spreading the _special_ treatment
all over the kernel is certainly worse.
> Another interesting issue is __GFP_HIGHMEM meaning for kmalloc and __vmalloc
> resp. (should never be passed to kmalloc, should almost always be passed
> to __vmalloc - the former needs pages mapped in kernel space, the latter
> probably never needs a separate kernel alias for the data pages, to such
> degree that I'm not sure if we shouldn't _force_ __GFP_HIGHMEM for data pages
> allocation in __vmalloc_area_node())
I would have to think about this one some more. Let's not fragment the
discussion and continue in that email thread:
http://lkml.kernel.org/r/20160103071246.GK9938%40ZenIV.linux.org.uk
--
Michal Hocko
SUSE Labs