* [RFC 0/4] Intermix Lowmem and vmalloc
@ 2013-11-11 23:26 Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 1/4] arm: mm: Add iotable_init_novmreserve Laura Abbott
` (4 more replies)
0 siblings, 5 replies; 16+ messages in thread
From: Laura Abbott @ 2013-11-11 23:26 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
This is an RFC for a feature to allow lowmem and vmalloc virtual address space
to be intermixed. This has currently only been tested on a narrow set of ARM
chips.
Currently on 32-bit systems we have
Virtual Physical
PAGE_OFFSET +--------------+ PHYS_OFFSET +------------+
| | | |
| | | |
| | | |
| lowmem | | direct |
| | | mapped |
| | | |
| | | |
| | | |
+--------------+------------------>x------------>
| | | |
| | | |
| | | not-direct|
| | | mapped |
| vmalloc | | |
| | | |
| | | |
| | | |
+--------------+ +------------+
Where part of the virtual space above PHYS_OFFSET is reserved for direct
mapped lowmem and part of the virtual address space is reserved for vmalloc.
Obviously, we want to optimize for having as much direct mapped memory as
possible since there is a penalty for mapping/unmapping highmem. Unfortunately
system constraints often give memory layouts such as
Virtual Physical
PAGE_OFFSET +--------------+ PHYS_OFFSET +------------+
| | | |
| | | |
| | |xxxxxxxxxxxx|
| lowmem | |xxxxxxxxxxxx|
| | |xxxxxxxxxxxx|
| | |xxxxxxxxxxxx|
| | | |
| | | |
+--------------+------------------>x------------>
| | | |
| | | |
| | | not-direct|
| | | mapped |
| vmalloc | | |
| | | |
| | | |
| | | |
+--------------+ +------------+
(x = Linux cannot touch this memory)
where part of the physical region that would be direct mapped as lowmem is not
actually in use by Linux.
This means that even though the system is not actually accessing the memory,
we are still losing that portion of the direct mapped lowmem space. What this
series does is treat the virtual address space that would have been taken up
by the lowmem memory as vmalloc space, allowing more lowmem to be mapped:
Virtual Physical
PAGE_OFFSET +--------------+ PHYS_OFFSET +------------+
| | | |
| lowmem | | |
<----------------------------------+xxxxxxxxxxxx|
| | |xxxxxxxxxxxx|
| vmalloc | |xxxxxxxxxxxx|
<----------------------------------+xxxxxxxxxxxx|
| | | |
| lowmem | | |
| | | |
| | | |
| | | |
| | | |
+----------------------------------------------->
| vmalloc | | |
| | | not-direct|
| | | mapped |
| | | |
+--------------+ +------------+
The goal here is to map as much lowmem as if the block of memory had not been
reserved from the physical lowmem region. Previously, we had been
hacking up the direct virt <-> phys translation to ignore a large region of
memory. That approach did not scale to multiple holes of memory, however.
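The per-bank bookkeeping that makes multiple holes tractable can be sketched in plain C. This is an illustrative userspace model with invented bank addresses, not kernel code: each directly mapped bank records its own physical base and the virtual address it was mapped at, so translation works for any number of holes instead of relying on a single hard-coded offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative userspace model, not kernel code; all addresses below
 * are invented for the example. */
struct bank {
	uintptr_t phys_start;
	uintptr_t virt_start;
	size_t size;
};

/* Two 256 MB banks with a 256 MB physical hole between them; their
 * virtual addresses are packed, leaving the hole's virtual space free
 * for vmalloc. */
static const struct bank banks[] = {
	{ 0x80000000UL, 0xc0000000UL, 0x10000000 },
	{ 0xa0000000UL, 0xd0000000UL, 0x10000000 },
};

/* Return the direct-mapped virtual address for a physical address,
 * or 0 if it falls in a hole (memory Linux cannot touch). */
uintptr_t model_phys_to_virt(uintptr_t pa)
{
	size_t i;

	for (i = 0; i < sizeof(banks) / sizeof(banks[0]); i++)
		if (pa >= banks[i].phys_start &&
		    pa < banks[i].phys_start + banks[i].size)
			return banks[i].virt_start +
				(pa - banks[i].phys_start);
	return 0;
}
```

With a table like this, adding a third bank or a second hole is just another entry, whereas a single virt/phys delta can describe at most one contiguous region.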
Open issues:
- vmalloc=<size> will account for all vmalloc now. This may have the
side effect of shrinking 'traditional' vmalloc too much for regular
static mappings. We were debating whether this is just part of finding the
correct size for vmalloc or whether there is a need for vmalloc_upper=
- People who like bike shedding more than I do can suggest better
config names if there is sufficient interest in the series.
Comments or suggestions on other ways to accomplish the same thing are welcome.
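The core idea of tracking every region in vmalloc's structures can likewise be sketched as a simple range walk. This is an illustrative userspace model with invented addresses; the series itself walks the vmap_area red-black tree under vmap_area_lock rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative userspace sketch: once lowmem is also tracked, "is this
 * a vmalloc address?" becomes a membership test over the tracked areas
 * rather than a compare against one fixed
 * [VMALLOC_START, VMALLOC_END) window. */
struct tracked_area {
	uintptr_t va_start;
	uintptr_t va_end;	/* exclusive */
};

/* Vmalloc regions interleaved around direct-mapped lowmem banks. */
static const struct tracked_area vmalloc_areas[] = {
	{ 0xc8000000UL, 0xd0000000UL },	/* hole reused as vmalloc space */
	{ 0xe0000000UL, 0xff000000UL },	/* 'traditional' upper vmalloc */
};

int model_is_vmalloc_addr(uintptr_t x)
{
	size_t i;

	for (i = 0; i < sizeof(vmalloc_areas) / sizeof(vmalloc_areas[0]);
	     i++)
		if (x >= vmalloc_areas[i].va_start &&
		    x < vmalloc_areas[i].va_end)
			return 1;
	return 0;
}
```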
arch/arm/Kconfig | 3 +
arch/arm/include/asm/mach/map.h | 2 +
arch/arm/mm/dma-mapping.c | 2 +-
arch/arm/mm/init.c | 104 +++++++++++++++++++++++++++------------
arch/arm/mm/ioremap.c | 5 +-
arch/arm/mm/mm.h | 3 +-
arch/arm/mm/mmu.c | 40 ++++++++++++++-
include/linux/mm.h | 6 ++
include/linux/vmalloc.h | 1 +
mm/Kconfig | 11 ++++
mm/vmalloc.c | 37 ++++++++++++++
11 files changed, 175 insertions(+), 39 deletions(-)
Thanks,
Laura
^ permalink raw reply [flat|nested] 16+ messages in thread
* [RFC PATCH 1/4] arm: mm: Add iotable_init_novmreserve
2013-11-11 23:26 [RFC 0/4] Intermix Lowmem and vmalloc Laura Abbott
@ 2013-11-11 23:26 ` Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 2/4] arm: mm: Track lowmem in vmalloc Laura Abbott
` (3 subsequent siblings)
4 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2013-11-11 23:26 UTC (permalink / raw)
To: linux-arm-kernel
iotable_init is currently used by dma_contiguous_remap to remap
CMA memory appropriately. This has the side effect of reserving
the area of CMA in the vmalloc tracking structures. This is fine
under normal circumstances but it creates conflicts if we want
to track lowmem in vmalloc. Since dma_contiguous_remap is only
really concerned with the remapping, introduce iotable_init_novmreserve
to allow remapping of pages without reserving the virtual address
in vmalloc space.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
---
arch/arm/include/asm/mach/map.h | 2 ++
arch/arm/mm/dma-mapping.c | 2 +-
arch/arm/mm/ioremap.c | 5 +++--
arch/arm/mm/mm.h | 2 +-
arch/arm/mm/mmu.c | 17 ++++++++++++++---
5 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/arch/arm/include/asm/mach/map.h b/arch/arm/include/asm/mach/map.h
index 2fe141f..02e3509 100644
--- a/arch/arm/include/asm/mach/map.h
+++ b/arch/arm/include/asm/mach/map.h
@@ -37,6 +37,7 @@ struct map_desc {
#ifdef CONFIG_MMU
extern void iotable_init(struct map_desc *, int);
+extern void iotable_init_novmreserve(struct map_desc *, int);
extern void vm_reserve_area_early(unsigned long addr, unsigned long size,
void *caller);
@@ -56,6 +57,7 @@ extern int ioremap_page(unsigned long virt, unsigned long phys,
const struct mem_type *mtype);
#else
#define iotable_init(map,num) do { } while (0)
+#define iotable_init_novmreserve(map,num) do { } while(0)
#define vm_reserve_area_early(a,s,c) do { } while (0)
#endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 7f9b179..bf80e43 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -435,7 +435,7 @@ void __init dma_contiguous_remap(void)
addr += PMD_SIZE)
pmd_clear(pmd_off_k(addr));
- iotable_init(&map, 1);
+ iotable_init_novmreserve(&map, 1);
}
}
diff --git a/arch/arm/mm/ioremap.c b/arch/arm/mm/ioremap.c
index f123d6e..ad92d4f 100644
--- a/arch/arm/mm/ioremap.c
+++ b/arch/arm/mm/ioremap.c
@@ -84,14 +84,15 @@ struct static_vm *find_static_vm_vaddr(void *vaddr)
return NULL;
}
-void __init add_static_vm_early(struct static_vm *svm)
+void __init add_static_vm_early(struct static_vm *svm, bool add_to_vm)
{
struct static_vm *curr_svm;
struct vm_struct *vm;
void *vaddr;
vm = &svm->vm;
- vm_area_add_early(vm);
+ if (add_to_vm)
+ vm_area_add_early(vm);
vaddr = vm->addr;
list_for_each_entry(curr_svm, &static_vmlist, list) {
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index d5a4e9a..27a3680 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -75,7 +75,7 @@ struct static_vm {
extern struct list_head static_vmlist;
extern struct static_vm *find_static_vm_vaddr(void *vaddr);
-extern __init void add_static_vm_early(struct static_vm *svm);
+extern __init void add_static_vm_early(struct static_vm *svm, bool add_to_vm);
#endif
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 53cdbd3..b83ed88 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -817,7 +817,8 @@ static void __init create_mapping(struct map_desc *md)
/*
* Create the architecture specific mappings
*/
-void __init iotable_init(struct map_desc *io_desc, int nr)
+static void __init __iotable_init(struct map_desc *io_desc, int nr,
+ bool add_to_vm)
{
struct map_desc *md;
struct vm_struct *vm;
@@ -838,10 +839,20 @@ void __init iotable_init(struct map_desc *io_desc, int nr)
vm->flags = VM_IOREMAP | VM_ARM_STATIC_MAPPING;
vm->flags |= VM_ARM_MTYPE(md->type);
vm->caller = iotable_init;
- add_static_vm_early(svm++);
+ add_static_vm_early(svm++, add_to_vm);
}
}
+void __init iotable_init(struct map_desc *io_desc, int nr)
+{
+ return __iotable_init(io_desc, nr, true);
+}
+
+void __init iotable_init_novmreserve(struct map_desc *io_desc, int nr)
+{
+ return __iotable_init(io_desc, nr, false);
+}
+
void __init vm_reserve_area_early(unsigned long addr, unsigned long size,
void *caller)
{
@@ -855,7 +866,7 @@ void __init vm_reserve_area_early(unsigned long addr, unsigned long size,
vm->size = size;
vm->flags = VM_IOREMAP | VM_ARM_EMPTY_MAPPING;
vm->caller = caller;
- add_static_vm_early(svm);
+ add_static_vm_early(svm, true);
}
#ifndef CONFIG_ARM_LPAE
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 2/4] arm: mm: Track lowmem in vmalloc
2013-11-11 23:26 [RFC 0/4] Intermix Lowmem and vmalloc Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 1/4] arm: mm: Add iotable_init_novmreserve Laura Abbott
@ 2013-11-11 23:26 ` Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked " Laura Abbott
` (2 subsequent siblings)
4 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2013-11-11 23:26 UTC (permalink / raw)
To: linux-arm-kernel
Rather than always keeping lowmem and vmalloc separate, we can
now allow the two to be mixed. This means that all lowmem areas
need to be explicitly tracked in vmalloc to avoid over-allocating.
Additionally, adjust the vmalloc reserve to account for the fact
that there may be a hole in the middle consisting of vmalloc.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
---
arch/arm/Kconfig | 3 +
arch/arm/mm/init.c | 104 ++++++++++++++++++++++++++++++++++++----------------
arch/arm/mm/mm.h | 1 +
arch/arm/mm/mmu.c | 23 +++++++++++
4 files changed, 99 insertions(+), 32 deletions(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 051fce4..1f36664 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -270,6 +270,9 @@ config GENERIC_BUG
def_bool y
depends on BUG
+config ARCH_TRACKS_VMALLOC
+ bool
+
source "init/Kconfig"
source "kernel/Kconfig.freezer"
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 15225d8..c9ca316 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -576,6 +576,46 @@ static void __init free_highpages(void)
#endif
}
+#define MLK(b, t) b, t, ((t) - (b)) >> 10
+#define MLM(b, t) b, t, ((t) - (b)) >> 20
+#define MLK_ROUNDUP(b, t) b, t, DIV_ROUND_UP(((t) - (b)), SZ_1K)
+
+#ifdef CONFIG_ENABLE_VMALLOC_SAVING
+void print_vmalloc_lowmem_info(void)
+{
+ int i;
+ void *va_start, *va_end;
+
+ printk(KERN_NOTICE
+ " vmalloc : 0x%08lx - 0x%08lx (%4ld MB)\n",
+ MLM(VMALLOC_START, VMALLOC_END));
+
+ for (i = meminfo.nr_banks - 1; i >= 0; i--) {
+ if (!meminfo.bank[i].highmem) {
+ va_start = __va(meminfo.bank[i].start);
+ va_end = __va(meminfo.bank[i].start +
+ meminfo.bank[i].size);
+ printk(KERN_NOTICE
+ " lowmem : 0x%08lx - 0x%08lx (%4ld MB)\n",
+ MLM((unsigned long)va_start, (unsigned long)va_end));
+ }
+ if (i && ((meminfo.bank[i-1].start + meminfo.bank[i-1].size) !=
+ meminfo.bank[i].start)) {
+ if (meminfo.bank[i-1].start + meminfo.bank[i-1].size
+ <= MAX_HOLE_ADDRESS) {
+ va_start = __va(meminfo.bank[i-1].start
+ + meminfo.bank[i-1].size);
+ va_end = __va(meminfo.bank[i].start);
+ printk(KERN_NOTICE
+ " vmalloc : 0x%08lx - 0x%08lx (%4ld MB)\n",
+ MLM((unsigned long)va_start,
+ (unsigned long)va_end));
+ }
+ }
+ }
+}
+#endif
+
/*
* mem_init() marks the free areas in the mem_map and tells us how much
* memory is free. This is done after various parts of the system have
@@ -604,55 +644,52 @@ void __init mem_init(void)
mem_init_print_info(NULL);
-#define MLK(b, t) b, t, ((t) - (b)) >> 10
-#define MLM(b, t) b, t, ((t) - (b)) >> 20
-#define MLK_ROUNDUP(b, t) b, t, DIV_ROUND_UP(((t) - (b)), SZ_1K)
-
printk(KERN_NOTICE "Virtual kernel memory layout:\n"
" vector : 0x%08lx - 0x%08lx (%4ld kB)\n"
#ifdef CONFIG_HAVE_TCM
" DTCM : 0x%08lx - 0x%08lx (%4ld kB)\n"
" ITCM : 0x%08lx - 0x%08lx (%4ld kB)\n"
#endif
- " fixmap : 0x%08lx - 0x%08lx (%4ld kB)\n"
- " vmalloc : 0x%08lx - 0x%08lx (%4ld MB)\n"
- " lowmem : 0x%08lx - 0x%08lx (%4ld MB)\n"
-#ifdef CONFIG_HIGHMEM
- " pkmap : 0x%08lx - 0x%08lx (%4ld MB)\n"
-#endif
-#ifdef CONFIG_MODULES
- " modules : 0x%08lx - 0x%08lx (%4ld MB)\n"
-#endif
- " .text : 0x%p" " - 0x%p" " (%4d kB)\n"
- " .init : 0x%p" " - 0x%p" " (%4d kB)\n"
- " .data : 0x%p" " - 0x%p" " (%4d kB)\n"
- " .bss : 0x%p" " - 0x%p" " (%4d kB)\n",
-
+ " fixmap : 0x%08lx - 0x%08lx (%4ld kB)\n",
MLK(UL(CONFIG_VECTORS_BASE), UL(CONFIG_VECTORS_BASE) +
(PAGE_SIZE)),
#ifdef CONFIG_HAVE_TCM
MLK(DTCM_OFFSET, (unsigned long) dtcm_end),
MLK(ITCM_OFFSET, (unsigned long) itcm_end),
#endif
- MLK(FIXADDR_START, FIXADDR_TOP),
- MLM(VMALLOC_START, VMALLOC_END),
- MLM(PAGE_OFFSET, (unsigned long)high_memory),
+ MLK(FIXADDR_START, FIXADDR_TOP));
+#ifdef CONFIG_ENABLE_VMALLOC_SAVING
+ print_vmalloc_lowmem_info();
+#else
+ printk(KERN_NOTICE
+ " vmalloc : 0x%08lx - 0x%08lx (%4ld MB)\n"
+ " lowmem : 0x%08lx - 0x%08lx (%4ld MB)\n",
+ MLM(VMALLOC_START, VMALLOC_END),
+ MLM(PAGE_OFFSET, (unsigned long)high_memory));
+#endif
#ifdef CONFIG_HIGHMEM
- MLM(PKMAP_BASE, (PKMAP_BASE) + (LAST_PKMAP) *
+ printk(KERN_NOTICE
+ " pkmap : 0x%08lx - 0x%08lx (%4ld MB)\n"
+#endif
+#ifdef CONFIG_MODULES
+ " modules : 0x%08lx - 0x%08lx (%4ld MB)\n"
+#endif
+ " .text : 0x%p" " - 0x%p" " (%4d kB)\n"
+ " .init : 0x%p" " - 0x%p" " (%4d kB)\n"
+ " .data : 0x%p" " - 0x%p" " (%4d kB)\n"
+ " .bss : 0x%p" " - 0x%p" " (%4d kB)\n",
+#ifdef CONFIG_HIGHMEM
+ MLM(PKMAP_BASE, (PKMAP_BASE) + (LAST_PKMAP) *
(PAGE_SIZE)),
#endif
#ifdef CONFIG_MODULES
- MLM(MODULES_VADDR, MODULES_END),
+ MLM(MODULES_VADDR, MODULES_END),
#endif
- MLK_ROUNDUP(_text, _etext),
- MLK_ROUNDUP(__init_begin, __init_end),
- MLK_ROUNDUP(_sdata, _edata),
- MLK_ROUNDUP(__bss_start, __bss_stop));
-
-#undef MLK
-#undef MLM
-#undef MLK_ROUNDUP
+ MLK_ROUNDUP(_text, _etext),
+ MLK_ROUNDUP(__init_begin, __init_end),
+ MLK_ROUNDUP(_sdata, _edata),
+ MLK_ROUNDUP(__bss_start, __bss_stop));
/*
* Check boundaries twice: Some fundamental inconsistencies can
@@ -660,7 +697,7 @@ void __init mem_init(void)
*/
#ifdef CONFIG_MMU
BUILD_BUG_ON(TASK_SIZE > MODULES_VADDR);
- BUG_ON(TASK_SIZE > MODULES_VADDR);
+ BUG_ON(TASK_SIZE > MODULES_VADDR);
#endif
#ifdef CONFIG_HIGHMEM
@@ -679,6 +716,9 @@ void __init mem_init(void)
}
}
+#undef MLK
+#undef MLM
+#undef MLK_ROUNDUP
void free_initmem(void)
{
#ifdef CONFIG_HAVE_TCM
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index 27a3680..f484e52 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -85,6 +85,7 @@ extern phys_addr_t arm_dma_limit;
#define arm_dma_limit ((phys_addr_t)~0)
#endif
+#define MAX_HOLE_ADDRESS (PHYS_OFFSET + 0x10000000)
extern phys_addr_t arm_lowmem_limit;
void __init bootmem_init(void);
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index b83ed88..ed2a4fa 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1004,6 +1004,19 @@ void __init sanity_check_meminfo(void)
int i, j, highmem = 0;
phys_addr_t vmalloc_limit = __pa(vmalloc_min - 1) + 1;
+#ifdef CONFIG_ARCH_TRACKS_VMALLOC
+ unsigned long hole_start;
+ for (i = 0; i < (meminfo.nr_banks - 1); i++) {
+ hole_start = meminfo.bank[i].start + meminfo.bank[i].size;
+ if (hole_start != meminfo.bank[i+1].start) {
+ if (hole_start <= MAX_HOLE_ADDRESS) {
+ vmalloc_min = (void *) (vmalloc_min +
+ (meminfo.bank[i+1].start - hole_start));
+ }
+ }
+ }
+#endif
+
for (i = 0, j = 0; i < meminfo.nr_banks; i++) {
struct membank *bank = &meminfo.bank[j];
phys_addr_t size_limit;
@@ -1311,6 +1324,7 @@ static void __init map_lowmem(void)
phys_addr_t start = reg->base;
phys_addr_t end = start + reg->size;
struct map_desc map;
+ struct vm_struct *vm;
if (end > arm_lowmem_limit)
end = arm_lowmem_limit;
@@ -1323,6 +1337,15 @@ static void __init map_lowmem(void)
map.type = MT_MEMORY;
create_mapping(&map);
+
+#ifdef CONFIG_ARCH_TRACKS_VMALLOC
+ vm = early_alloc_aligned(sizeof(*vm), __alignof__(*vm));
+ vm->addr = (void *)map.virtual;
+ vm->size = end - start;
+ vm->flags = VM_LOWMEM;
+ vm->caller = map_lowmem;
+ vm_area_add_early(vm);
+#endif
}
}
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked in vmalloc
2013-11-11 23:26 [RFC 0/4] Intermix Lowmem and vmalloc Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 1/4] arm: mm: Add iotable_init_novmreserve Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 2/4] arm: mm: Track lowmem in vmalloc Laura Abbott
@ 2013-11-11 23:26 ` Laura Abbott
2013-11-11 23:37 ` Kyungmin Park
2013-11-14 17:45 ` Dave Hansen
2013-11-11 23:26 ` [RFC PATCH 4/4] mm/vmalloc.c: Treat the entire kernel virtual space as vmalloc Laura Abbott
2013-11-12 0:13 ` [RFC 0/4] Intermix Lowmem and vmalloc Russell King - ARM Linux
4 siblings, 2 replies; 16+ messages in thread
From: Laura Abbott @ 2013-11-11 23:26 UTC (permalink / raw)
To: linux-arm-kernel
vmalloc is currently assumed to be a completely separate address space
from the lowmem region. While this may be true in the general case,
there are some instances where lowmem and virtual space intermixing
provides gains. One example is needing to steal a large chunk of physical
lowmem for another purpose outside the system's usage. Rather than
waste the precious lowmem space on a 32-bit system, we can allow the
virtual holes created by the physical holes to be used by vmalloc
for virtual addressing. Track lowmem allocations in vmalloc to
allow mixing of lowmem and vmalloc.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
---
include/linux/mm.h | 6 ++++++
include/linux/vmalloc.h | 1 +
mm/Kconfig | 11 +++++++++++
mm/vmalloc.c | 26 ++++++++++++++++++++++++++
4 files changed, 44 insertions(+), 0 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index f022460..76df50d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -308,6 +308,10 @@ unsigned long vmalloc_to_pfn(const void *addr);
* On nommu, vmalloc/vfree wrap through kmalloc/kfree directly, so there
* is no special casing required.
*/
+
+#ifdef CONFIG_VMALLOC_SAVING
+extern int is_vmalloc_addr(const void *x)
+#else
static inline int is_vmalloc_addr(const void *x)
{
#ifdef CONFIG_MMU
@@ -318,6 +322,8 @@ static inline int is_vmalloc_addr(const void *x)
return 0;
#endif
}
+#endif
+
#ifdef CONFIG_MMU
extern int is_vmalloc_or_module_addr(const void *x);
#else
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 4b8a891..e0c8c49 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -16,6 +16,7 @@ struct vm_area_struct; /* vma defining user mapping in mm_types.h */
#define VM_USERMAP 0x00000008 /* suitable for remap_vmalloc_range */
#define VM_VPAGES 0x00000010 /* buffer for pages was vmalloc'ed */
#define VM_UNINITIALIZED 0x00000020 /* vm_struct is not fully initialized */
+#define VM_LOWMEM 0x00000040 /* Tracking of direct mapped lowmem */
/* bits [20..32] reserved for arch specific ioremap internals */
/*
diff --git a/mm/Kconfig b/mm/Kconfig
index 8028dcc..b3c459d 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -519,3 +519,14 @@ config MEM_SOFT_DIRTY
it can be cleared by hands.
See Documentation/vm/soft-dirty.txt for more details.
+
+config ENABLE_VMALLOC_SAVING
+ bool "Intermix lowmem and vmalloc virtual space"
+ depends on ARCH_TRACKS_VMALLOC
+ help
+ Some memory layouts on embedded systems steal large amounts
+ of lowmem physical memory for purposes outside of the kernel.
+ Rather than waste the physical and virtual space, allow the
+ kernel to use the virtual space as vmalloc space.
+
+ If unsure, say N.
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 13a5495..c7b138b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -204,6 +204,29 @@ static int vmap_page_range(unsigned long start, unsigned long end,
return ret;
}
+#ifdef ENABLE_VMALLOC_SAVING
+int is_vmalloc_addr(const void *x)
+{
+ struct rb_node *n;
+ struct vmap_area *va;
+ int ret = 0;
+
+ spin_lock(&vmap_area_lock);
+
+ for (n = rb_first(vmap_area_root); n; rb_next(n)) {
+ va = rb_entry(n, struct vmap_area, rb_node);
+ if (x >= va->va_start && x < va->va_end) {
+ ret = 1;
+ break;
+ }
+ }
+
+ spin_unlock(&vmap_area_lock);
+ return ret;
+}
+EXPORT_SYMBOL(is_vmalloc_addr);
+#endif
+
int is_vmalloc_or_module_addr(const void *x)
{
/*
@@ -2628,6 +2651,9 @@ static int s_show(struct seq_file *m, void *p)
if (v->flags & VM_VPAGES)
seq_printf(m, " vpages");
+ if (v->flags & VM_LOWMEM)
+ seq_printf(m, " lowmem");
+
show_numa_info(m, v);
seq_putc(m, '\n');
return 0;
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 4/4] mm/vmalloc.c: Treat the entire kernel virtual space as vmalloc
2013-11-11 23:26 [RFC 0/4] Intermix Lowmem and vmalloc Laura Abbott
` (2 preceding siblings ...)
2013-11-11 23:26 ` [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked " Laura Abbott
@ 2013-11-11 23:26 ` Laura Abbott
2013-11-14 17:26 ` Dave Hansen
2013-11-12 0:13 ` [RFC 0/4] Intermix Lowmem and vmalloc Russell King - ARM Linux
4 siblings, 1 reply; 16+ messages in thread
From: Laura Abbott @ 2013-11-11 23:26 UTC (permalink / raw)
To: linux-arm-kernel
With CONFIG_ENABLE_VMALLOC_SAVING, all lowmem is tracked in
vmalloc. This means that all the kernel virtual address space
can be treated as part of the vmalloc region. Allow vm areas
to be allocated from the full kernel address range.
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
---
mm/vmalloc.c | 11 +++++++++++
1 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c7b138b..181247d 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1385,16 +1385,27 @@ struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags,
*/
struct vm_struct *get_vm_area(unsigned long size, unsigned long flags)
{
+#ifdef CONFIG_ENABLE_VMALLOC_SAVING
+ return __get_vm_area_node(size, 1, flags, PAGE_OFFSET, VMALLOC_END,
+ NUMA_NO_NODE, GFP_KERNEL,
+ __builtin_return_address(0));
+#else
return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
NUMA_NO_NODE, GFP_KERNEL,
__builtin_return_address(0));
+#endif
}
struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags,
const void *caller)
{
+#ifdef CONFIG_ENABLE_VMALLOC_SAVING
+ return __get_vm_area_node(size, 1, flags, PAGE_OFFSET, VMALLOC_END,
+ NUMA_NO_NODE, GFP_KERNEL, caller);
+#else
return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
NUMA_NO_NODE, GFP_KERNEL, caller);
+#endif
}
/**
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked in vmalloc
2013-11-11 23:26 ` [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked " Laura Abbott
@ 2013-11-11 23:37 ` Kyungmin Park
2013-11-12 1:23 ` Laura Abbott
2013-11-14 17:45 ` Dave Hansen
1 sibling, 1 reply; 16+ messages in thread
From: Kyungmin Park @ 2013-11-11 23:37 UTC (permalink / raw)
To: linux-arm-kernel
Hi Laura,
On Tue, Nov 12, 2013 at 8:26 AM, Laura Abbott <lauraa@codeaurora.org> wrote:
> vmalloc is currently assumed to be a completely separate address space
> from the lowmem region. While this may be true in the general case,
> there are some instances where lowmem and virtual space intermixing
> provides gains. One example is needing to steal a large chunk of physical
> lowmem for another purpose outside the system's usage. Rather than
> waste the precious lowmem space on a 32-bit system, we can allow the
> virtual holes created by the physical holes to be used by vmalloc
> for virtual addressing. Track lowmem allocations in vmalloc to
> allow mixing of lowmem and vmalloc.
>
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> Signed-off-by: Neeti Desai <neetid@codeaurora.org>
> ---
> include/linux/mm.h | 6 ++++++
> include/linux/vmalloc.h | 1 +
> mm/Kconfig | 11 +++++++++++
> mm/vmalloc.c | 26 ++++++++++++++++++++++++++
> 4 files changed, 44 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index f022460..76df50d 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -308,6 +308,10 @@ unsigned long vmalloc_to_pfn(const void *addr);
> * On nommu, vmalloc/vfree wrap through kmalloc/kfree directly, so there
> * is no special casing required.
> */
> +
> +#ifdef CONFIG_VMALLOC_SAVING
mismatch below Kconfig. CONFIG_ENABLE_VMALLOC_SAVING?
> +extern int is_vmalloc_addr(const void *x)
> +#else
> static inline int is_vmalloc_addr(const void *x)
> {
> #ifdef CONFIG_MMU
> @@ -318,6 +322,8 @@ static inline int is_vmalloc_addr(const void *x)
> return 0;
> #endif
> }
> +#endif
> +
> #ifdef CONFIG_MMU
> extern int is_vmalloc_or_module_addr(const void *x);
> #else
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 4b8a891..e0c8c49 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -16,6 +16,7 @@ struct vm_area_struct; /* vma defining user mapping in mm_types.h */
> #define VM_USERMAP 0x00000008 /* suitable for remap_vmalloc_range */
> #define VM_VPAGES 0x00000010 /* buffer for pages was vmalloc'ed */
> #define VM_UNINITIALIZED 0x00000020 /* vm_struct is not fully initialized */
> +#define VM_LOWMEM 0x00000040 /* Tracking of direct mapped lowmem */
> /* bits [20..32] reserved for arch specific ioremap internals */
>
> /*
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 8028dcc..b3c459d 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -519,3 +519,14 @@ config MEM_SOFT_DIRTY
> it can be cleared by hands.
>
> See Documentation/vm/soft-dirty.txt for more details.
> +
> +config ENABLE_VMALLOC_SAVING
> + bool "Intermix lowmem and vmalloc virtual space"
> + depends on ARCH_TRACKS_VMALLOC
> + help
> + Some memory layouts on embedded systems steal large amounts
> + of lowmem physical memory for purposes outside of the kernel.
> + Rather than waste the physical and virtual space, allow the
> + kernel to use the virtual space as vmalloc space.
> +
> + If unsure, say N.
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 13a5495..c7b138b 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -204,6 +204,29 @@ static int vmap_page_range(unsigned long start, unsigned long end,
> return ret;
> }
>
> +#ifdef ENABLE_VMALLOC_SAVING
missing "CONFIG_"
Thank you,
Kyungmin Park
> +int is_vmalloc_addr(const void *x)
> +{
> + struct rb_node *n;
> + struct vmap_area *va;
> + int ret = 0;
> +
> + spin_lock(&vmap_area_lock);
> +
> + for (n = rb_first(vmap_area_root); n; rb_next(n)) {
> + va = rb_entry(n, struct vmap_area, rb_node);
> + if (x >= va->va_start && x < va->va_end) {
> + ret = 1;
> + break;
> + }
> + }
> +
> + spin_unlock(&vmap_area_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL(is_vmalloc_addr);
> +#endif
> +
> int is_vmalloc_or_module_addr(const void *x)
> {
> /*
> @@ -2628,6 +2651,9 @@ static int s_show(struct seq_file *m, void *p)
> if (v->flags & VM_VPAGES)
> seq_printf(m, " vpages");
>
> + if (v->flags & VM_LOWMEM)
> + seq_printf(m, " lowmem");
> +
> show_numa_info(m, v);
> seq_putc(m, '\n');
> return 0;
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> hosted by The Linux Foundation
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [RFC 0/4] Intermix Lowmem and vmalloc
2013-11-11 23:26 [RFC 0/4] Intermix Lowmem and vmalloc Laura Abbott
` (3 preceding siblings ...)
2013-11-11 23:26 ` [RFC PATCH 4/4] mm/vmalloc.c: Treat the entire kernel virtual space as vmalloc Laura Abbott
@ 2013-11-12 0:13 ` Russell King - ARM Linux
2013-11-12 1:24 ` Laura Abbott
4 siblings, 1 reply; 16+ messages in thread
From: Russell King - ARM Linux @ 2013-11-12 0:13 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 11, 2013 at 03:26:48PM -0800, Laura Abbott wrote:
> Hi,
>
> This is an RFC for a feature to allow lowmem and vmalloc virtual address space
> to be intermixed. This has currently only been tested on a narrow set of ARM
> chips.
>
> Currently on 32-bit systems we have
>
>
> Virtual Physical
>
> PAGE_OFFSET +--------------+ PHYS_OFFSET +------------+
> | | | |
> | | | |
> | | | |
> | lowmem | | direct |
> | | | mapped |
> | | | |
> | | | |
> | | | |
> +--------------+------------------>x------------>
> | | | |
> | | | |
> | | | not-direct|
> | | | mapped |
> | vmalloc | | |
> | | | |
> | | | |
> | | | |
> +--------------+ +------------+
>
> Where part of the virtual space above PHYS_OFFSET is reserved for direct
> mapped lowmem and part of the virtual address space is reserved for vmalloc.
Minor nit...
ITYM PAGE_OFFSET here. vmalloc space doesn't exist in physical memory.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked in vmalloc
2013-11-11 23:37 ` Kyungmin Park
@ 2013-11-12 1:23 ` Laura Abbott
0 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2013-11-12 1:23 UTC (permalink / raw)
To: linux-arm-kernel
On 11/11/2013 3:37 PM, Kyungmin Park wrote:
> Hi Laura,
>
> On Tue, Nov 12, 2013 at 8:26 AM, Laura Abbott <lauraa@codeaurora.org> wrote:
>> vmalloc is currently assumed to be a completely separate address space
>> from the lowmem region. While this may be true in the general case,
>> there are some instances where lowmem and virtual space intermixing
>> provides gains. One example is needing to steal a large chunk of physical
>> lowmem for another purpose outside the system's usage. Rather than
>> waste the precious lowmem space on a 32-bit system, we can allow the
>> virtual holes created by the physical holes to be used by vmalloc
>> for virtual addressing. Track lowmem allocations in vmalloc to
>> allow mixing of lowmem and vmalloc.
>>
>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>> Signed-off-by: Neeti Desai <neetid@codeaurora.org>
>> ---
>> include/linux/mm.h | 6 ++++++
>> include/linux/vmalloc.h | 1 +
>> mm/Kconfig | 11 +++++++++++
>> mm/vmalloc.c | 26 ++++++++++++++++++++++++++
>> 4 files changed, 44 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index f022460..76df50d 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -308,6 +308,10 @@ unsigned long vmalloc_to_pfn(const void *addr);
>> * On nommu, vmalloc/vfree wrap through kmalloc/kfree directly, so there
>> * is no special casing required.
>> */
>> +
>> +#ifdef CONFIG_VMALLOC_SAVING
> mismatch below Kconfig. CONFIG_ENABLE_VMALLOC_SAVING?
Argh, I folded in a wrong patch when integrating. I'll fix it.
>> +extern int is_vmalloc_addr(const void *x)
>> +#else
>> static inline int is_vmalloc_addr(const void *x)
>> {
>> #ifdef CONFIG_MMU
>> @@ -318,6 +322,8 @@ static inline int is_vmalloc_addr(const void *x)
>> return 0;
>> #endif
>> }
>> +#endif
>> +
>> #ifdef CONFIG_MMU
>> extern int is_vmalloc_or_module_addr(const void *x);
>> #else
>> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
>> index 4b8a891..e0c8c49 100644
>> --- a/include/linux/vmalloc.h
>> +++ b/include/linux/vmalloc.h
>> @@ -16,6 +16,7 @@ struct vm_area_struct; /* vma defining user mapping in mm_types.h */
>> #define VM_USERMAP 0x00000008 /* suitable for remap_vmalloc_range */
>> #define VM_VPAGES 0x00000010 /* buffer for pages was vmalloc'ed */
>> #define VM_UNINITIALIZED 0x00000020 /* vm_struct is not fully initialized */
>> +#define VM_LOWMEM 0x00000040 /* Tracking of direct mapped lowmem */
>> /* bits [20..32] reserved for arch specific ioremap internals */
>>
>> /*
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index 8028dcc..b3c459d 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -519,3 +519,14 @@ config MEM_SOFT_DIRTY
>> it can be cleared by hands.
>>
>> See Documentation/vm/soft-dirty.txt for more details.
>> +
>> +config ENABLE_VMALLOC_SAVING
>> + bool "Intermix lowmem and vmalloc virtual space"
>> + depends on ARCH_TRACKS_VMALLOC
>> + help
>> + Some memory layouts on embedded systems steal large amounts
>> + of lowmem physical memory for purposes outside of the kernel.
>> + Rather than waste the physical and virtual space, allow the
>> + kernel to use the virtual space as vmalloc space.
>> +
>> + If unsure, say N.
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index 13a5495..c7b138b 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -204,6 +204,29 @@ static int vmap_page_range(unsigned long start, unsigned long end,
>> return ret;
>> }
>>
>> +#ifdef ENABLE_VMALLOC_SAVING
> missing "CONFIG_"
>
Yes, this is a mess and needs to be cleaned up.
> Thank you,
> Kyungmin Park
Thanks,
Laura
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
^ permalink raw reply [flat|nested] 16+ messages in thread
* [RFC 0/4] Intermix Lowmem and vmalloc
2013-11-12 0:13 ` [RFC 0/4] Intermix Lowmem and vmalloc Russell King - ARM Linux
@ 2013-11-12 1:24 ` Laura Abbott
0 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2013-11-12 1:24 UTC (permalink / raw)
To: linux-arm-kernel
On 11/11/2013 4:13 PM, Russell King - ARM Linux wrote:
> On Mon, Nov 11, 2013 at 03:26:48PM -0800, Laura Abbott wrote:
>> Hi,
>>
>> This is an RFC for a feature to allow lowmem and vmalloc virtual address space
>> to be intermixed. This has currently only been tested on a narrow set of ARM
>> chips.
>>
>> Currently on 32-bit systems we have
>>
>>
>> Virtual Physical
>>
>> PAGE_OFFSET +--------------+ PHYS_OFFSET +------------+
>> | | | |
>> | | | |
>> | | | |
>> | lowmem | | direct |
>> | | | mapped |
>> | | | |
>> | | | |
>> | | | |
>> +--------------+------------------>x------------>
>> | | | |
>> | | | |
>> | | | not-direct|
>> | | | mapped |
>> | vmalloc | | |
>> | | | |
>> | | | |
>> | | | |
>> +--------------+ +------------+
>>
>> Where part of the virtual spaced above PHYS_OFFSET is reserved for direct
>> mapped lowmem and part of the virtual address space is reserved for vmalloc.
>
> Minor nit...
>
> ITYM PAGE_OFFSET here. vmalloc space doesn't exist in physical memory.
>
Yes, that is a typo.
Thanks,
Laura
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
* [RFC PATCH 4/4] mm/vmalloc.c: Treat the entire kernel virtual space as vmalloc
2013-11-11 23:26 ` [RFC PATCH 4/4] mm/vmalloc.c: Treat the entire kernel virtual space as vmalloc Laura Abbott
@ 2013-11-14 17:26 ` Dave Hansen
2013-11-15 5:34 ` Laura Abbott
0 siblings, 1 reply; 16+ messages in thread
From: Dave Hansen @ 2013-11-14 17:26 UTC (permalink / raw)
To: linux-arm-kernel
On 11/11/2013 03:26 PM, Laura Abbott wrote:
> With CONFIG_ENABLE_VMALLOC_SAVINGS, all lowmem is tracked in
> vmalloc. This means that all the kernel virtual address space
> can be treated as part of the vmalloc region. Allow vm areas
> to be allocated from the full kernel address range.
>
> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
> Signed-off-by: Neeti Desai <neetid@codeaurora.org>
> ---
> mm/vmalloc.c | 11 +++++++++++
> 1 files changed, 11 insertions(+), 0 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index c7b138b..181247d 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1385,16 +1385,27 @@ struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags,
> */
> struct vm_struct *get_vm_area(unsigned long size, unsigned long flags)
> {
> +#ifdef CONFIG_ENABLE_VMALLOC_SAVING
> + return __get_vm_area_node(size, 1, flags, PAGE_OFFSET, VMALLOC_END,
> + NUMA_NO_NODE, GFP_KERNEL,
> + __builtin_return_address(0));
> +#else
> return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
> NUMA_NO_NODE, GFP_KERNEL,
> __builtin_return_address(0));
> +#endif
> }
>
> struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags,
> const void *caller)
> {
> +#ifdef CONFIG_ENABLE_VMALLOC_SAVING
> + return __get_vm_area_node(size, 1, flags, PAGE_OFFSET, VMALLOC_END,
> + NUMA_NO_NODE, GFP_KERNEL, caller);
> +#else
> return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
> NUMA_NO_NODE, GFP_KERNEL, caller);
> +#endif
> }
Couple of nits: first of all, there's no reason to copy, paste, and
#ifdef this much code. This just invites one of the copies to bitrot.
I'd much rather see this:
#ifdef CONFIG_ENABLE_VMALLOC_SAVING
#define LOWEST_VMALLOC_VADDR PAGE_OFFSET
#else
#define LOWEST_VMALLOC_VADDR VMALLOC_START
#endif
Then just replace the PAGE_OFFSET in the function arguments with
LOWEST_VMALLOC_VADDR.
Have you done any audits to make sure that the rest of the code that
deals with vmalloc addresses in the kernel is using is_vmalloc_addr()?
I'd be a bit worried that we might have picked up an assumption or two
that *all* vmalloc addresses are _above_ VMALLOC_START.
The percpu.c code looks like it might do this, and maybe the kcore code.
The vmalloc.c code itself has this in get_vmalloc_info():
> /*
> * Some archs keep another range for modules in vmalloc space
> */
> if (addr < VMALLOC_START)
> continue;
Seems like that would break as well.
With this patch, VMALLOC_START loses enough of its meaning that I wonder
if we should even keep it around. It's the start of the _dedicated_
vmalloc space, but it's mostly useless and obscure enough that maybe we
should get rid of its use in common code.
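The LOWEST_VMALLOC_VADDR suggestion above can be sketched as a small compilable toy (the layout constants are invented stand-ins for the kernel's, and `lowest_vmalloc_vaddr()` is a hypothetical probe standing in for the `__get_vm_area_node()` call sites, not a kernel function):

```c
#include <stdint.h>

/* Invented stand-ins for the kernel's layout constants. */
#define PAGE_OFFSET   0xc0000000u
#define VMALLOC_START 0xf0000000u

/* The suggested macro: pick the vmalloc lower bound once, not per call site. */
#ifdef CONFIG_ENABLE_VMALLOC_SAVING
#define LOWEST_VMALLOC_VADDR PAGE_OFFSET
#else
#define LOWEST_VMALLOC_VADDR VMALLOC_START
#endif

/* Hypothetical probe: both get_vm_area() variants would collapse into one
 * body that passes this single value, removing the #ifdef'd duplication. */
static uint32_t lowest_vmalloc_vaddr(void)
{
	return LOWEST_VMALLOC_VADDR;
}
```

With the config symbol left undefined, the probe reports VMALLOC_START; the per-function #ifdef pairs disappear entirely.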
* [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked in vmalloc
2013-11-11 23:26 ` [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked " Laura Abbott
2013-11-11 23:37 ` Kyungmin Park
@ 2013-11-14 17:45 ` Dave Hansen
2013-11-15 4:52 ` Laura Abbott
1 sibling, 1 reply; 16+ messages in thread
From: Dave Hansen @ 2013-11-14 17:45 UTC (permalink / raw)
To: linux-arm-kernel
On 11/11/2013 03:26 PM, Laura Abbott wrote:
> +config ENABLE_VMALLOC_SAVING
> + bool "Intermix lowmem and vmalloc virtual space"
> + depends on ARCH_TRACKS_VMALLOC
> + help
> + Some memory layouts on embedded systems steal large amounts
> + of lowmem physical memory for purposes outside of the kernel.
> + Rather than waste the physical and virtual space, allow the
> + kernel to use the virtual space as vmalloc space.
I really don't think this needs to be exposed with help text and so
forth. How about just defining a 'def_bool n' with some comments and
let the architecture 'select' it?
> +#ifdef ENABLE_VMALLOC_SAVING
> +int is_vmalloc_addr(const void *x)
> +{
> + struct rb_node *n;
> + struct vmap_area *va;
> + int ret = 0;
> +
> + spin_lock(&vmap_area_lock);
> +
> + for (n = rb_first(&vmap_area_root); n; n = rb_next(n)) {
> + va = rb_entry(n, struct vmap_area, rb_node);
> + if ((unsigned long)x >= va->va_start && (unsigned long)x < va->va_end) {
> + ret = 1;
> + break;
> + }
> + }
> +
> + spin_unlock(&vmap_area_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL(is_vmalloc_addr);
> +#endif
It's probably worth noting that this makes is_vmalloc_addr() a *LOT*
more expensive than it was before. There are a couple dozen of these in
the tree in kinda weird places (ext4, netlink, tcp). You didn't
mention it here, but you probably want to at least make sure you're not
adding a spinlock and a tree walk in some critical path.
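The cost difference can be made concrete with a userspace toy (all addresses and bounds below are invented for illustration): the stock check is two comparisons against fixed constants, while the patched form must search every tracked area, and in the kernel it does so under vmap_area_lock.

```c
#include <stdint.h>

/* Invented layout constants for a 32-bit toy address space. */
#define VMALLOC_START 0xf0000000u
#define VMALLOC_END   0xff000000u

/* Stock form: O(1), lock-free -- just a range check. */
static int is_vmalloc_addr_fast(uint32_t addr)
{
	return addr >= VMALLOC_START && addr < VMALLOC_END;
}

/* Patched form, reduced to its essence: scan every tracked area.
 * (The real patch walks an rb-tree of vmap_areas under a spinlock.) */
struct vmap_area { uint32_t va_start, va_end; };

static int is_vmalloc_addr_slow(const struct vmap_area *areas, int n,
				uint32_t addr)
{
	for (int i = 0; i < n; i++)
		if (addr >= areas[i].va_start && addr < areas[i].va_end)
			return 1;
	return 0;
}
```

A caller on a hot free path pays the scan (or tree walk) on every lookup, even for addresses that were never vmalloc'ed, which is exactly the fast path the original range check provided.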
* [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked in vmalloc
2013-11-14 17:45 ` Dave Hansen
@ 2013-11-15 4:52 ` Laura Abbott
2013-11-15 15:53 ` Dave Hansen
2013-11-26 22:45 ` Andrew Morton
0 siblings, 2 replies; 16+ messages in thread
From: Laura Abbott @ 2013-11-15 4:52 UTC (permalink / raw)
To: linux-arm-kernel
On 11/14/2013 9:45 AM, Dave Hansen wrote:
> On 11/11/2013 03:26 PM, Laura Abbott wrote:
>> +config ENABLE_VMALLOC_SAVING
>> + bool "Intermix lowmem and vmalloc virtual space"
>> + depends on ARCH_TRACKS_VMALLOC
>> + help
>> + Some memory layouts on embedded systems steal large amounts
>> + of lowmem physical memory for purposes outside of the kernel.
>> + Rather than waste the physical and virtual space, allow the
>> + kernel to use the virtual space as vmalloc space.
>
> I really don't think this needs to be exposed with help text and so
> forth. How about just defining a 'def_bool n' with some comments and
> let the architecture 'select' it?
>
>> +#ifdef ENABLE_VMALLOC_SAVING
>> +int is_vmalloc_addr(const void *x)
>> +{
>> + struct rb_node *n;
>> + struct vmap_area *va;
>> + int ret = 0;
>> +
>> + spin_lock(&vmap_area_lock);
>> +
>> + for (n = rb_first(&vmap_area_root); n; n = rb_next(n)) {
>> + va = rb_entry(n, struct vmap_area, rb_node);
>> + if ((unsigned long)x >= va->va_start && (unsigned long)x < va->va_end) {
>> + ret = 1;
>> + break;
>> + }
>> + }
>> +
>> + spin_unlock(&vmap_area_lock);
>> + return ret;
>> +}
>> +EXPORT_SYMBOL(is_vmalloc_addr);
>> +#endif
>
> It's probably worth noting that this makes is_vmalloc_addr() a *LOT*
> more expensive than it was before. There are a couple dozen of these in
> the tree in kinda weird places (ext4, netlink, tcp). You didn't
> mention it here, but you probably want to at least make sure you're not
> adding a spinlock and a tree walk in some critical path.
>
Yes, that was a concern I had as well. If is_vmalloc_addr returned true
the spinlock/tree walk would happen anyway so essentially this is
getting rid of the fast path. This is typically used in the idiom
alloc(size) {
if (size > some metric)
vmalloc
else
kmalloc
}
free (ptr) {
if (is_vmalloc_addr(ptr))
vfree
else
kfree
}
so my hypothesis would be that any path would have to be willing to take
the penalty of vmalloc anyway. The actual cost would depend on the
vmalloc / kmalloc ratio. I haven't had a chance to get profiling data
yet to see the performance difference.
Thanks,
Laura
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
* [RFC PATCH 4/4] mm/vmalloc.c: Treat the entire kernel virtual space as vmalloc
2013-11-14 17:26 ` Dave Hansen
@ 2013-11-15 5:34 ` Laura Abbott
0 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2013-11-15 5:34 UTC (permalink / raw)
To: linux-arm-kernel
On 11/14/2013 9:26 AM, Dave Hansen wrote:
> On 11/11/2013 03:26 PM, Laura Abbott wrote:
>> With CONFIG_ENABLE_VMALLOC_SAVINGS, all lowmem is tracked in
>> vmalloc. This means that all the kernel virtual address space
>> can be treated as part of the vmalloc region. Allow vm areas
>> to be allocated from the full kernel address range.
>>
>> Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
>> Signed-off-by: Neeti Desai <neetid@codeaurora.org>
>> ---
>> mm/vmalloc.c | 11 +++++++++++
>> 1 files changed, 11 insertions(+), 0 deletions(-)
>>
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index c7b138b..181247d 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -1385,16 +1385,27 @@ struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags,
>> */
>> struct vm_struct *get_vm_area(unsigned long size, unsigned long flags)
>> {
>> +#ifdef CONFIG_ENABLE_VMALLOC_SAVING
>> + return __get_vm_area_node(size, 1, flags, PAGE_OFFSET, VMALLOC_END,
>> + NUMA_NO_NODE, GFP_KERNEL,
>> + __builtin_return_address(0));
>> +#else
>> return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
>> NUMA_NO_NODE, GFP_KERNEL,
>> __builtin_return_address(0));
>> +#endif
>> }
>>
>> struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags,
>> const void *caller)
>> {
>> +#ifdef CONFIG_ENABLE_VMALLOC_SAVING
>> + return __get_vm_area_node(size, 1, flags, PAGE_OFFSET, VMALLOC_END,
>> + NUMA_NO_NODE, GFP_KERNEL, caller);
>> +#else
>> return __get_vm_area_node(size, 1, flags, VMALLOC_START, VMALLOC_END,
>> NUMA_NO_NODE, GFP_KERNEL, caller);
>> +#endif
>> }
>
> Couple of nits: first of all, there's no reason to copy, paste, and
> #ifdef this much code. This just invites one of the copies to bitrot.
> I'd much rather see this:
>
> #ifdef CONFIG_ENABLE_VMALLOC_SAVING
> #define LOWEST_VMALLOC_VADDR PAGE_OFFSET
> #else
> #define LOWEST_VMALLOC_VADDR VMALLOC_START
> #endif
>
> Then just replace the PAGE_OFFSET in the function arguments with
> LOWEST_VMALLOC_VADDR.
>
Good point.
> Have you done any audits to make sure that the rest of the code that
> deals with vmalloc addresses in the kernel is using is_vmalloc_addr()?
> I'd be a bit worried that we might have picked up an assumption or two
> that *all* vmalloc addresses are _above_ VMALLOC_START.
>
> The percpu.c code looks like it might do this, and maybe the kcore code.
> The vmalloc.c code itself has this in get_vmalloc_info():
>
>> /*
>> * Some archs keep another range for modules in vmalloc space
>> */
>> if (addr < VMALLOC_START)
>> continue;
>
> Seems like that would break as well.
>
> With this patch, VMALLOC_START loses enough of its meaning that I wonder
> if we should even keep it around. It's the start of the _dedicated_
> vmalloc space, but it's mostly useless and obscure enough that maybe we
> should get rid of its use in common code.
>
Yes, there are plenty of clients who are using VMALLOC_START. There
might still be a use for VMALLOC_START as marking 'no more direct mapped
memory above this address'. To start making some of the cleanup easier,
it would help to have an already calculated total amount of vmalloc for
the clients who are trying to work off a vmalloc percentage.
Thanks,
Laura
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
* [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked in vmalloc
2013-11-15 4:52 ` Laura Abbott
@ 2013-11-15 15:53 ` Dave Hansen
2013-11-26 22:45 ` Andrew Morton
1 sibling, 0 replies; 16+ messages in thread
From: Dave Hansen @ 2013-11-15 15:53 UTC (permalink / raw)
To: linux-arm-kernel
On 11/14/2013 08:52 PM, Laura Abbott wrote:
> free (ptr) {
> if (is_vmalloc_addr(ptr))
> vfree
> else
> kfree
> }
>
> so my hypothesis would be that any path would have to be willing to take
> the penalty of vmalloc anyway. The actual cost would depend on the
> vmalloc / kmalloc ratio. I haven't had a chance to get profiling data
> yet to see the performance difference.
Well, either that, or these kinds of things where it is a fallback:
> hc = kmalloc(hsize, GFP_NOFS | __GFP_NOWARN);
> if (hc == NULL)
> hc = __vmalloc(hsize, GFP_NOFS, PAGE_KERNEL);
* [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked in vmalloc
2013-11-15 4:52 ` Laura Abbott
2013-11-15 15:53 ` Dave Hansen
@ 2013-11-26 22:45 ` Andrew Morton
2013-12-03 4:59 ` Laura Abbott
1 sibling, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2013-11-26 22:45 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 14 Nov 2013 20:52:38 -0800 Laura Abbott <lauraa@codeaurora.org> wrote:
> If is_vmalloc_addr returned true
> the spinlock/tree walk would happen anyway so essentially this is
> getting rid of the fast path. This is typically used in the idiom
>
> alloc(size) {
> if (size > some metric)
> vmalloc
> else
> kmalloc
> }
A better form is
if (kmalloc(..., __GFP_NOWARN) == NULL)
vmalloc
> free (ptr) {
> if (is_vmalloc_addr(ptr))
> vfree
> else
> kfree
> }
>
> so my hypothesis would be that any path would have to be willing to take
> the penalty of vmalloc anyway. The actual cost would depend on the
> vmalloc / kmalloc ratio. I haven't had a chance to get profiling data
> yet to see the performance difference.
I've resisted adding the above helper functions simply to discourage
the use of vmalloc() - it *is* slow, and one day we might hit
vmalloc-arena fragmentation issues.
That being said, I might one day give up, because adding such helpers
would be a significant cleanup. And once they are added, their use
will proliferate and is_vmalloc_addr() will take quite a beating.
So yes, it would be prudent to be worried about is_vmalloc_addr()
performance at the outset.
Couldn't is_vmalloc_addr() just be done with a plain old bitmap? It
would consume 128kbytes to manage a 4G address space, and 1/8th of a meg
isn't much.
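The bitmap idea sketches out easily in plain C (constants and names below are illustrative, not from any patch): one bit per 4 KiB page of a 32-bit address space is 2^20 bits, exactly 128 KiB, and the lookup collapses to a single bit test with no lock and no tree walk.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
/* 4 GiB address space => 2^20 pages => 2^20 bits => 128 KiB of bitmap. */
#define NUM_PAGES (1UL << (32 - PAGE_SHIFT))

static uint8_t vmalloc_bitmap[NUM_PAGES / 8];	/* 128 KiB */

/* Mark the page-aligned range [start, end) as vmalloc space.
 * In the kernel this would run when a vmap_area is inserted/removed. */
static void mark_vmalloc_range(uint32_t start, uint32_t end)
{
	for (uint32_t pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); pfn++)
		vmalloc_bitmap[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

/* O(1) lookup: a single bit test replaces the rb-tree walk. */
static int is_vmalloc_addr_bitmap(uint32_t addr)
{
	uint32_t pfn = addr >> PAGE_SHIFT;

	return (vmalloc_bitmap[pfn / 8] >> (pfn % 8)) & 1;
}
```

The trade is the fixed 128 KiB of memory plus the bookkeeping at map/unmap time, in exchange for a constant-time, lock-free check on every is_vmalloc_addr() call.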
* [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked in vmalloc
2013-11-26 22:45 ` Andrew Morton
@ 2013-12-03 4:59 ` Laura Abbott
0 siblings, 0 replies; 16+ messages in thread
From: Laura Abbott @ 2013-12-03 4:59 UTC (permalink / raw)
To: linux-arm-kernel
On 11/26/2013 2:45 PM, Andrew Morton wrote:
> So yes, it would be prudent to be worried about is_vmalloc_addr()
> performance at the outset.
>
> Couldn't is_vmalloc_addr() just be done with a plain old bitmap? It
> would consume 128kbytes to manage a 4G address space, and 1/8th of a meg
> isn't much.
>
Yes, I came to the same conclusion after realizing I needed something
similar to fix up proc/kcore.c . I plan to go with the bitmap for the
next patch version.
Laura
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
end of thread, other threads:[~2013-12-03 4:59 UTC | newest]
Thread overview: 16+ messages
2013-11-11 23:26 [RFC 0/4] Intermix Lowmem and vmalloc Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 1/4] arm: mm: Add iotable_init_novmreserve Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 2/4] arm: mm: Track lowmem in vmalloc Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 3/4] mm/vmalloc.c: Allow lowmem to be tracked " Laura Abbott
2013-11-11 23:37 ` Kyungmin Park
2013-11-12 1:23 ` Laura Abbott
2013-11-14 17:45 ` Dave Hansen
2013-11-15 4:52 ` Laura Abbott
2013-11-15 15:53 ` Dave Hansen
2013-11-26 22:45 ` Andrew Morton
2013-12-03 4:59 ` Laura Abbott
2013-11-11 23:26 ` [RFC PATCH 4/4] mm/vmalloc.c: Treat the entire kernel virtual space as vmalloc Laura Abbott
2013-11-14 17:26 ` Dave Hansen
2013-11-15 5:34 ` Laura Abbott
2013-11-12 0:13 ` [RFC 0/4] Intermix Lowmem and vmalloc Russell King - ARM Linux
2013-11-12 1:24 ` Laura Abbott