linux-mm.kvack.org archive mirror
* Re: [RFC] free_pages stuff
       [not found]           ` <20151222210435.GB20997@ZenIV.linux.org.uk>
@ 2016-01-05 13:59             ` Michal Hocko
  2016-01-05 15:26               ` Al Viro
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Hocko @ 2016-01-05 13:59 UTC
  To: Al Viro
  Cc: Geert Uytterhoeven, Linus Torvalds, Linux Kernel Mailing List,
	linux-mm

[CCing linux-mm]

On Tue 22-12-15 21:04:35, Al Viro wrote:
[...]
> Documentation/which-allocator-should-I-use might be a good idea...  Notes
> below are just a skeleton - a lot of details need to be added; in particular,
> there should be a part on "I have this kind of address and I want that;
> when and how should that be done?", completely missing here.  And there
> should be a big scary warning along the lines of "this is NOT an invitation
> for a flood of checkpatch-inspired patches"...
> 
> Comments, corrections and additions would be very welcome.

FWIW I think this is a very good idea. The current form is good enough
IMHO.

> 1) Most of the time kmalloc() is the right thing to use.
> Limitations: alignment is no better than word, not available very early in
> bootstrap, allocated memory is physically contiguous, so large allocations
> are best avoided.
> 
> 2) kmem_cache_alloc() lets you specify the alignment at cache creation
> time.  Otherwise it's similar to kmalloc().  Normally it's used for
> situations where we have a lot of instances of some type and want dynamic
> allocation of those.
> 
> 3) vmalloc() is for large allocations.  They will be page-aligned,
> but *not* physically contiguous.  OTOH, large physically contiguous
> allocations are generally a bad idea.  Unlike other allocators, there's
> no variant that could be used in interrupt; freeing is possible there,
> but allocation is not.  Note that non-blocking variant *does* exist -
> __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL) can be used in atomic
> contexts; it's the interrupt ones that are no-go.

It also has a hardcoded GFP_KERNEL context, so use from a NOFS context
needs special treatment.

> 4) if it's very early in bootstrap, alloc_bootmem() and friends
> may be the only option.  Rule of thumb: if it's already printed
> Memory: ...../..... available.....
> you shouldn't be using that one.  Allocations are physically contiguous
> and at that point large physically contiguous allocations are still OK.
> 
> 5) if you need to allocate memory for DMA, use dma_alloc_coherent()
> and friends.  They'll give you both the virtual address for your use
> and DMA address referring to the same memory for use by device; do *NOT*
> try to derive the latter from the former; use of virt_to_bus() et al.
> is a Bloody Bad Idea(tm).
> 
> 6) if you need a reference to struct page, use alloc_page/alloc_pages.
> 
> 7) in some cases (page tables, for the most obvious example), __get_free_page()
> and friends might be the right answer.  In principle, it's case (6), but
> it returns page_address(page) instead of the page itself.  Historically that
> was the first API introduced, so a _lot_ of places that should've been using
> something else ended up using that.  Do not assume that being lower level
> makes it faster than e.g. kmalloc() - this is simply not true.
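
To make the rules of thumb above a bit more concrete, here is a rough
userspace sketch of the decision order.  It is only an illustration: the
enum, the struct, pick_allocator() and the 64 KiB "large" cutoff are all
made up for this example and are not kernel interfaces, and the order of
the checks is an assumption.  Since the follow-up in this thread points out
that a GFP_ATOMIC __vmalloc() is not really atomic, the sketch only picks
vmalloc() when the caller can sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Everything below is a hypothetical illustration, not a kernel API. */

enum allocator {
	USE_BOOTMEM,            /* (4) very early in bootstrap */
	USE_DMA_ALLOC_COHERENT, /* (5) memory a device will access via DMA */
	USE_ALLOC_PAGES,        /* (6) caller needs a struct page reference */
	USE_KMEM_CACHE_ALLOC,   /* (2) many objects of one type, or explicit alignment */
	USE_VMALLOC,            /* (3) large, page-aligned, not physically contiguous */
	USE_KMALLOC,            /* (1) the default */
};

struct alloc_request {
	size_t size;            /* bytes wanted */
	bool very_early_boot;   /* before "Memory: .../... available" is printed */
	bool for_dma;           /* a device needs a DMA address for this memory */
	bool need_struct_page;  /* caller wants the page, not just an address */
	bool many_same_type;    /* lots of instances of one type */
	bool can_sleep;         /* false in interrupt or other atomic context */
};

/* Arbitrary cutoff for "large"; an assumption made for this sketch only. */
#define SKETCH_LARGE_SIZE ((size_t)64 * 1024)

enum allocator pick_allocator(const struct alloc_request *req)
{
	assert(req != NULL && req->size > 0);

	if (req->very_early_boot)
		return USE_BOOTMEM;
	if (req->for_dma)
		return USE_DMA_ALLOC_COHERENT;
	if (req->need_struct_page)
		return USE_ALLOC_PAGES;
	if (req->many_same_type)
		return USE_KMEM_CACHE_ALLOC;
	/* Large requests go to vmalloc(), but only where sleeping is allowed. */
	if (req->size >= SKETCH_LARGE_SIZE && req->can_sleep)
		return USE_VMALLOC;
	/* Most of the time kmalloc() is the right thing to use. */
	return USE_KMALLOC;
}
```

A small request with no special needs falls through to USE_KMALLOC, and
a large request from a context that cannot sleep also lands there, which
is exactly the case where the size itself deserves a second look.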

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

* Re: [RFC] free_pages stuff
  2016-01-05 13:59             ` [RFC] free_pages stuff Michal Hocko
@ 2016-01-05 15:26               ` Al Viro
  2016-01-05 15:42                 ` Michal Hocko
  0 siblings, 1 reply; 3+ messages in thread
From: Al Viro @ 2016-01-05 15:26 UTC
  To: Michal Hocko
  Cc: Geert Uytterhoeven, Linus Torvalds, Linux Kernel Mailing List,
	linux-mm

On Tue, Jan 05, 2016 at 02:59:03PM +0100, Michal Hocko wrote:

> > 3) vmalloc() is for large allocations.  They will be page-aligned,
> > but *not* physically contiguous.  OTOH, large physically contiguous
> > allocations are generally a bad idea.  Unlike other allocators, there's
> > no variant that could be used in interrupt; freeing is possible there,
> > but allocation is not.  Note that non-blocking variant *does* exist -
> > __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL) can be used in atomic
> > contexts; it's the interrupt ones that are no-go.

The last sentence I'd put into that part was complete crap...

> It also has a hardcoded GFP_KERNEL context, so use from a NOFS context
> needs special treatment.

... in part because of this.  GFP_ATOMIC __vmalloc() will be anything but,
and the only caller passing that is almost certainly bogus.  As for NOFS/NOIO,
I wonder if we should apply that special treatment inside __vmalloc_area_node
rather than in callers; see the current thread on linux-mm for details...

Another interesting issue is __GFP_HIGHMEM meaning for kmalloc and __vmalloc
resp. (should never be passed to kmalloc, should almost always be passed
to __vmalloc - the former needs pages mapped in kernel space, the latter
probably never needs a separate kernel alias for the data pages, to such
degree that I'm not sure if we shouldn't _force_ __GFP_HIGHMEM for data pages
allocation in __vmalloc_area_node())
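
A toy model of that flag rule, in plain userspace C.  The bit value and
both helper names are invented for illustration (the real flag definitions
live in the kernel's gfp headers), and forcing the flag for __vmalloc data
pages is only the idea floated above, not something the kernel is claimed
to do:

```c
#include <stdbool.h>

/* Illustrative stand-in for the kernel's __GFP_HIGHMEM bit; not the real definition. */
#define SKETCH_GFP_HIGHMEM 0x02u

/*
 * kmalloc() hands back memory that must already be mapped in kernel
 * space, so a highmem request is never valid there.
 */
bool sketch_kmalloc_flags_valid(unsigned int gfp)
{
	return (gfp & SKETCH_GFP_HIGHMEM) == 0;
}

/*
 * vmalloc sets up its own mapping for the data pages, so they do not
 * need a separate kernel alias; this models forcing the highmem bit on.
 */
unsigned int sketch_vmalloc_data_page_flags(unsigned int gfp)
{
	return gfp | SKETCH_GFP_HIGHMEM;
}
```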

> > 4) if it's very early in bootstrap, alloc_bootmem() and friends
> > may be the only option.  Rule of thumb: if it's already printed
> > Memory: ...../..... available.....
> > you shouldn't be using that one.  Allocations are physically contiguous
> > and at that point large physically contiguous allocations are still OK.

Probably needs at least some discussion of memblock vs. bootmem APIs.


* Re: [RFC] free_pages stuff
  2016-01-05 15:26               ` Al Viro
@ 2016-01-05 15:42                 ` Michal Hocko
  0 siblings, 0 replies; 3+ messages in thread
From: Michal Hocko @ 2016-01-05 15:42 UTC
  To: Al Viro
  Cc: Geert Uytterhoeven, Linus Torvalds, Linux Kernel Mailing List,
	linux-mm

On Tue 05-01-16 15:26:02, Al Viro wrote:
> On Tue, Jan 05, 2016 at 02:59:03PM +0100, Michal Hocko wrote:
> 
> > > 3) vmalloc() is for large allocations.  They will be page-aligned,
> > > but *not* physically contiguous.  OTOH, large physically contiguous
> > > allocations are generally a bad idea.  Unlike other allocators, there's
> > > no variant that could be used in interrupt; freeing is possible there,
> > > but allocation is not.  Note that non-blocking variant *does* exist -
> > > __vmalloc(size, GFP_ATOMIC, PAGE_KERNEL) can be used in atomic
> > > contexts; it's the interrupt ones that are no-go.
> 
> The last sentence I'd put into that part was complete crap...
> 
> > It also has a hardcoded GFP_KERNEL context, so use from a NOFS context
> > needs special treatment.
> 
> ... in part because of this.  GFP_ATOMIC __vmalloc() will be anything but,
> and the only caller passing that is almost certainly bogus.

Agreed, as I just replied in the other email thread, which I noticed
only now.

> As for NOFS/NOIO,
> I wonder if we should apply that special treatment inside __vmalloc_area_node
> rather than in callers; see the current thread on linux-mm for details...

That would make a lot of sense to me. Spreading the _special_ treatment
all over the kernel is certainly worse.
 
> Another interesting issue is __GFP_HIGHMEM meaning for kmalloc and __vmalloc
> resp. (should never be passed to kmalloc, should almost always be passed
> to __vmalloc - the former needs pages mapped in kernel space, the latter
> probably never needs a separate kernel alias for the data pages, to such
> degree that I'm not sure if we shouldn't _force_ __GFP_HIGHMEM for data pages
> allocation in __vmalloc_area_node())

I would have to think about this one some more. Let's not fragment the
discussion and continue in that email thread:
http://lkml.kernel.org/r/20160103071246.GK9938%40ZenIV.linux.org.uk

-- 
Michal Hocko
SUSE Labs

