* [PATCH 0/9] mm: generic adaptive large memory allocation APIs
From: Changli Gao @ 2010-05-13 9:49 UTC
To: akpm
Cc: Hoang-Nam Nguyen, Christoph Raisch, Roland Dreier, Sean Hefty,
Hal Rosenstock, Divy Le Ray, James E.J. Bottomley,
Theodore Ts'o, Andreas Dilger, Alexander Viro, Paul Menage,
Li Zefan, linux-rdma, linux-kernel, netdev, linux-scsi,
linux-ext4, linux-fsdevel, linux-mm, containers, Changli Gao
generic adaptive large memory allocation APIs
The kv*alloc functions allocate large contiguous memory for users that don't
mind whether the memory is physically or virtually contiguous. The allocator
always tries its best to allocate physically contiguous memory first.
This patch set introduces the following APIs: kvmalloc(), kvzalloc(),
kvcalloc(), kvrealloc(), kvfree() and kvfree_inatomic().
Some code is converted to use the new generic APIs.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 drivers/infiniband/hw/ehca/ipz_pt_fn.c |   22 +-----
 drivers/net/cxgb3/cxgb3_defs.h         |    2
 drivers/net/cxgb3/cxgb3_offload.c      |   31 ---------
 drivers/net/cxgb3/l2t.c                |    4 -
 drivers/net/cxgb4/cxgb4.h              |    3
 drivers/net/cxgb4/cxgb4_main.c         |   37 +----------
 drivers/net/cxgb4/l2t.c                |    2
 drivers/scsi/cxgb3i/cxgb3i_ddp.c       |   12 +--
 drivers/scsi/cxgb3i/cxgb3i_ddp.h       |   26 -------
 drivers/scsi/cxgb3i/cxgb3i_offload.c   |    6 -
 fs/ext4/super.c                        |   21 +-----
 fs/file.c                              |  109 ++++-----------------------------
 include/linux/mm.h                     |   31 +++++++++
 include/linux/vmalloc.h                |    1
 kernel/cgroup.c                        |   47 +-------------
 kernel/relay.c                         |   35 ----------
 mm/nommu.c                             |    6 +
 mm/util.c                              |  104 +++++++++++++++++++++++++++++++
 mm/vmalloc.c                           |   14 ++++
 19 files changed, 207 insertions(+), 306 deletions(-)
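The fallback order described in the cover letter (physically contiguous first, virtually contiguous second) can be illustrated with a minimal userspace sketch. This is not the code from the patch set: every `sim_*` name, the 4096-byte limit and the header that records where a buffer came from are assumptions made for illustration, and plain `malloc()` stands in for both kernel allocators.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; NOT the kernel's kmalloc()/vmalloc(). */
#define SIM_KMALLOC_LIMIT 4096u /* assumed: larger "physical" requests fail */

enum sim_origin { SIM_FROM_KMALLOC, SIM_FROM_VMALLOC };

/* Header stored in front of each buffer so tests can see which path served it. */
union sim_hdr {
    struct {
        enum sim_origin origin;
        size_t size;
    } info;
    max_align_t align; /* keeps the user pointer suitably aligned */
};

static void *sim_alloc(size_t size, enum sim_origin origin)
{
    union sim_hdr *h = malloc(sizeof(*h) + size);

    if (!h)
        return NULL;
    h->info.origin = origin;
    h->info.size = size;
    return h + 1; /* user data starts right after the header */
}

/* Simulated physically contiguous allocator: refuses large requests. */
void *sim_kmalloc(size_t size)
{
    if (size > SIM_KMALLOC_LIMIT)
        return NULL;
    return sim_alloc(size, SIM_FROM_KMALLOC);
}

/* Simulated virtually contiguous allocator: accepts any size malloc() can serve. */
void *sim_vmalloc(size_t size)
{
    return sim_alloc(size, SIM_FROM_VMALLOC);
}

/* The policy from the cover letter: try contiguous first, then fall back. */
void *sim_kvmalloc(size_t size)
{
    void *p = sim_kmalloc(size);

    return p ? p : sim_vmalloc(size);
}

/* Reports which simulated allocator produced the buffer. */
enum sim_origin sim_origin_of(const void *p)
{
    return ((const union sim_hdr *)p - 1)->info.origin;
}

/* One free routine for both origins; NULL is accepted like free(). */
void sim_kvfree(void *p)
{
    if (p)
        free((union sim_hdr *)p - 1);
}
```

In this sketch a 128-byte request is served by the simulated kmalloc path and a 1 MiB request falls through to the simulated vmalloc path, while the caller frees both through the same `sim_kvfree()`.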
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [PATCH 0/9] mm: generic adaptive large memory allocation APIs
From: James Bottomley @ 2010-05-13 15:04 UTC
To: Changli Gao
Cc: akpm, Hoang-Nam Nguyen, Christoph Raisch, Roland Dreier,
Sean Hefty, Hal Rosenstock, Divy Le Ray, Theodore Ts'o,
Andreas Dilger, Alexander Viro, Paul Menage, Li Zefan, linux-rdma,
linux-kernel, netdev, linux-scsi, linux-ext4, linux-fsdevel,
linux-mm, containers
On Thu, 2010-05-13 at 17:49 +0800, Changli Gao wrote:
> generic adaptive large memory allocation APIs
>
> The kv*alloc functions allocate large contiguous memory for users that don't
> mind whether the memory is physically or virtually contiguous. The allocator
> always tries its best to allocate physically contiguous memory first.
This isn't necessarily true ... most drivers and filesystems have to
know what type they're getting. Often they have to do extra tricks to
process vmalloc areas. Conversely, large kmalloc areas are a very
precious commodity: if a driver or filesystem can handle vmalloc for
large allocations, it should: it's easier for us to expand the vmalloc
area than to try to make page reclaim keep large contiguous areas ... I
notice your proposed API does the exact opposite of this ... tries
kmalloc first and then does vmalloc.
Given this policy problem, isn't it easier simply to hand craft the
vmalloc fall back to kmalloc (or vice versa) in the driver than add this
whole massive raft of APIs for it?
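A hand-crafted fallback of the kind suggested here, with the opposite preference (vmalloc first, kmalloc only if that fails), might look like the following userspace sketch. All `drv_*` names are hypothetical, `malloc()` stands in for both kernel allocators, and `drv_vmalloc_fails` is a test knob rather than anything a real driver would have. The buffer records which allocator served it, reflecting the point that callers often need to know what type of memory they received.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Test knob (assumption): pretend the vmalloc area is exhausted. */
bool drv_vmalloc_fails;

/* Driver-private buffer that remembers which allocator served it. */
struct drv_buf {
    void *mem;
    size_t size;
    bool is_vmalloc;
};

/* Hypothetical stand-ins for vmalloc() and kmalloc(). */
void *drv_sim_vmalloc(size_t size)
{
    return drv_vmalloc_fails ? NULL : malloc(size);
}

void *drv_sim_kmalloc(size_t size)
{
    return malloc(size);
}

/* vmalloc first, kmalloc only as a fallback: the reverse of kv*alloc. */
bool drv_buf_alloc(struct drv_buf *b, size_t size)
{
    b->size = size;
    b->mem = drv_sim_vmalloc(size);
    b->is_vmalloc = (b->mem != NULL);
    if (!b->mem)
        b->mem = drv_sim_kmalloc(size);
    return b->mem != NULL;
}

void drv_buf_free(struct drv_buf *b)
{
    /* A real driver would pick vfree() or kfree() based on is_vmalloc. */
    free(b->mem);
    b->mem = NULL;
}
```

The whole policy fits in one small function, which is the trade-off being weighed against adding a generic API family.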
> This patch set introduces the following APIs: kvmalloc(), kvzalloc(),
> kvcalloc(), kvrealloc(), kvfree() and kvfree_inatomic().
>
> Some code is converted to use the new generic APIs.
>
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
James
* Re: [PATCH 0/9] mm: generic adaptive large memory allocation APIs
From: Andreas Dilger @ 2010-05-13 21:56 UTC
To: James Bottomley
Cc: Changli Gao, Andrew Morton, Hoang-Nam Nguyen, Christoph Raisch,
Roland Dreier, Sean Hefty, Hal Rosenstock, Divy Le Ray,
Theodore Ts'o, Alexander Viro, Paul Menage, Li Zefan,
linux-rdma, linux-kernel@vger.kernel.org Mailinglist, netdev,
linux-scsi, linux-ext4@vger.kernel.org development, linux-fsdevel,
linux-mm, containers
On 2010-05-13, at 09:04, James Bottomley wrote:
> This isn't necessarily true ... most drivers and filesystems have to
> know what type they're getting. Often they have to do extra tricks to
> process vmalloc areas. Conversely, large kmalloc areas are a very
> precious commodity: if a driver or filesystem can handle vmalloc for
> large allocations, it should: it's easier for us to expand the vmalloc
> area than to try to make page reclaim keep large contiguous areas ... I
> notice your proposed API does the exact opposite of this ... tries
> kmalloc first and then does vmalloc.
>
> Given this policy problem, isn't it easier simply to hand craft the
> vmalloc fall back to kmalloc (or vice versa) in the driver than add this
> whole massive raft of APIs for it?
I know we wouldn't mind using large vmalloc allocations for e.g.
per-group arrays in ext4 (allocated once per mount), but I'd always
understood that using vmalloc for general purpose uses can have a
significant impact because the vmalloc() engine has (had?) serious
performance problems. That means it is better performance-wise to have
a wrapper function like this to switch between kmalloc() and vmalloc()
based on the allocation size, but it makes the code ugly. Having the
wrapper in the kernel would at least identify the different places that
are using this kind of workaround.

If the performance of vmalloc() has been improved in the last few years,
then I'd be happy to just use vmalloc() all the time. That said, vmalloc
still isn't suitable for sub-page allocations. If you have a
variable-sized allocation that may be very small or very large, the
small allocations will waste a whole page and a wrapper is still needed,
or vmalloc should be changed to call kmalloc/kfree for the sub-page
allocations.
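The sub-page cost and the size-based switch can be made concrete with a small arithmetic sketch. The 4096-byte page size and the one-page threshold are assumptions for illustration (page size varies by architecture, and the thread does not name a threshold), and all `sketch_*` names are hypothetical.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

enum sketch_choice { SKETCH_USE_KMALLOC, SKETCH_USE_VMALLOC };

/* Bytes a page-granular allocator reserves for a request of `size` bytes.
 * (No overflow handling: fine for a sketch, not for production code.) */
size_t sketch_page_rounded(size_t size)
{
    return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE * SKETCH_PAGE_SIZE;
}

/* Bytes lost to rounding when a page-granular allocator serves the request. */
size_t sketch_waste(size_t size)
{
    return sketch_page_rounded(size) - size;
}

/* Size-based switch: sub-page requests go to the slab-style allocator. */
enum sketch_choice sketch_choose(size_t size)
{
    return size < SKETCH_PAGE_SIZE ? SKETCH_USE_KMALLOC : SKETCH_USE_VMALLOC;
}
```

Under these assumptions a 100-byte request served page-granularly leaves 3996 bytes unused, which is why a wrapper (or a vmalloc that defers to kmalloc for small sizes) is still needed for variable-sized allocations.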
Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.