* [RFC 0/7] Enable ZONE_DEVICE on POWER
@ 2016-05-03 6:29 Anshuman Khandual
2016-05-03 6:29 ` [RFC 1/7] powerpc/mm: Make vmemmap_populate accommodate ZONE_DEVICE memory Anshuman Khandual
` (6 more replies)
0 siblings, 7 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-05-03 6:29 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aneesh.kumar, mpe, mikey, oohall
This series enables ZONE_DEVICE support on POWER. The TEST patches are
just example usage of this newly enabled zone. Here are some of the logs
from the example device driver. This demonstrates how struct pages can
be allocated inside the device memory range itself and how usable memory
will be less than the total memory mapped in this way.
[ 0.462493] RMEM: Driver loaded
[ 0.462502] RMEM: Reserved memory sections
[ 0.462504] RMEM: Base b10000000 Size: 10000000 Node: 0
[ 0.462507] RMEM: Base b20000000 Size: 10000000 Node: 0
[ 0.462510] RMEM: Base b30000000 Size: 10000000 Node: 0
[ 0.462512] RMEM: Base b40000000 Size: 10000000 Node: 0
[ 0.462515] RMEM: Base b50000000 Size: 10000000 Node: 0
[ 0.462517] RMEM: Base b60000000 Size: 10000000 Node: 0
[ 0.462520] RMEM: Base b70000000 Size: 10000000 Node: 0
[ 0.462522] RMEM: Base b80000000 Size: 10000000 Node: 0
[ 0.462525] RMEM: Base b90000000 Size: 10000000 Node: 0
[ 0.462527] RMEM: Base ba0000000 Size: 10000000 Node: 0
[ 0.462530] RMEM: Base bb0000000 Size: 10000000 Node: 0
[ 0.462532] RMEM: Base bc0000000 Size: 10000000 Node: 0
[ 0.462535] RMEM: Base bd0000000 Size: 10000000 Node: 0
[ 0.462537] RMEM: Base be0000000 Size: 10000000 Node: 0
[ 0.462540] RMEM: Base bf0000000 Size: 10000000 Node: 0
[ 0.462611] RMEM: vmemmap backing (f000000001000000 40b000000)
[ 0.462617] RMEM: vmemmap backing (f000000000000000 40c000000)
[ 0.462800] RMEM: vmemmap backing (f000000002000000 b10000000)
[ 0.462804] RMEM: vmemmap backing (f000000001000000 40b000000)
[ 0.462807] RMEM: vmemmap backing (f000000000000000 40c000000)
[ 0.462810] RMEM: Read access complete (c000000b10000000 8000000)
[ 0.462813] RMEM: altmap->base_pfn 724992
[ 0.462815] RMEM: altmap->reserve 0
[ 0.462817] RMEM: altmap->free 256
[ 0.462818] RMEM: altmap->align 0
[ 0.462820] RMEM: altmap->alloc 256
[ 0.462822] RMEM: pagemap (c000000002060018)
[ 0.462824] RMEM: dev_pagemap (c000000002060060)
[ 0.462827] RMEM: pfn range (b1100 b1800)
[ 1.264550] RMEM: Driver now owns PFN(b1100....b1800)
[ 1.264552] RMEM: Data integrity test successful
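The figures in the log above can be cross-checked with a little arithmetic. The sketch below is a userspace model, not kernel code. It assumes 64K pages (a page shift of 16, inferred from base b10000000 corresponding to base_pfn 724992, i.e. 0xb1000), and the struct and helper names are hypothetical, chosen only for illustration.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 16	/* assumption: 64K pages, inferred from the log */

/* Hypothetical stand-in for the altmap fields printed in the log. */
struct altmap_model {
	unsigned long long base_pfn;	/* first PFN of the device range */
	unsigned long long reserve;	/* pages left untouched at the start */
	unsigned long long free;	/* pages set aside to back struct pages */
};

/* First PFN left for the driver once the vmemmap pages are carved out. */
static unsigned long long usable_first_pfn(const struct altmap_model *a)
{
	return a->base_pfn + a->reserve + a->free;
}

/* One-past-the-last PFN of a mapping of 'size' bytes starting at base_pfn. */
static unsigned long long range_end_pfn(const struct altmap_model *a,
					unsigned long long size)
{
	unsigned long long nr_pages = size >> SKETCH_PAGE_SHIFT;

	/* The carve-out cannot be larger than the mapping itself. */
	assert(a->reserve + a->free <= nr_pages);
	return a->base_pfn + nr_pages;
}
```

With base_pfn 0xb1000, reserve 0 and free 0x100, usable_first_pfn() gives 0xb1100, and range_end_pfn() for a 0x8000000 byte mapping gives 0xb1800. That matches the "pfn range (b1100 b1800)" line: of the 0x800 pages mapped, 0x100 back the struct pages and 0x700 remain usable.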
Thoughts, suggestions, inputs and comments welcome. Thank you.
Anshuman Khandual (5):
powerpc/mm: Make vmemmap_populate accommodate ZONE_DEVICE memory
powerpc/mm: Enable support for ZONE_DEVICE on PPC_BOOK3S_64 platforms
mm/memremap: Export pfn_first, pfn_end, find_pagemap functions
TEST: Reserve system memory to be emulated as device memory
TEST: Driver to test device memory through ZONE_DEVICE
Oliver O'Halloran (2):
powerpc/mm: Define TOP_ZONE as a constant
powerpc/mm: Set MAX_ZONE_PFN to 0 for all zones beyond TOP_ZONE
arch/powerpc/Kconfig | 4 +
arch/powerpc/include/asm/prom.h | 15 +++
arch/powerpc/kernel/prom.c | 11 ++
arch/powerpc/mm/init_64.c | 18 +++-
arch/powerpc/mm/mem.c | 28 ++---
arch/powerpc/platforms/pseries/Makefile | 2 +-
arch/powerpc/platforms/pseries/rmem.c | 186 ++++++++++++++++++++++++++++++++
include/linux/memremap.h | 18 ++++
kernel/memremap.c | 15 ++-
mm/Kconfig | 2 +-
10 files changed, 280 insertions(+), 19 deletions(-)
create mode 100644 arch/powerpc/platforms/pseries/rmem.c
--
1.8.3.1
* [RFC 1/7] powerpc/mm: Make vmemmap_populate accommodate ZONE_DEVICE memory
2016-05-03 6:29 [RFC 0/7] Enable ZONE_DEVICE on POWER Anshuman Khandual
@ 2016-05-03 6:29 ` Anshuman Khandual
2016-05-03 8:04 ` Balbir Singh
2016-05-03 6:29 ` [RFC 2/7] powerpc/mm: Enable support for ZONE_DEVICE on PPC_BOOK3S_64 platforms Anshuman Khandual
` (5 subsequent siblings)
6 siblings, 1 reply; 13+ messages in thread
From: Anshuman Khandual @ 2016-05-03 6:29 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aneesh.kumar, mpe, mikey, oohall
Change the vmemmap_populate function to detect device memory through
to_vmem_altmap and then call the generic __vmemmap_alloc_block_buf
function instead of vmemmap_alloc_block, as the former can allocate
physical memory from the device range instead of the system RAM.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/mm/init_64.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index ba65566..db73708 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -42,6 +42,7 @@
#include <linux/memblock.h>
#include <linux/hugetlb.h>
#include <linux/slab.h>
+#include <linux/memremap.h>
#include <asm/pgalloc.h>
#include <asm/page.h>
@@ -312,6 +313,7 @@ static __meminit void vmemmap_list_populate(unsigned long phys,
int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
{
unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
+ unsigned long orig = start;
/* Align to the page size of the linear mapping. */
start = _ALIGN_DOWN(start, page_size);
@@ -319,13 +321,15 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
pr_debug("vmemmap_populate %lx..%lx, node %d\n", start, end, node);
for (; start < end; start += page_size) {
+ struct vmem_altmap *altmap;
void *p;
int rc;
if (vmemmap_populated(start, page_size))
continue;
- p = vmemmap_alloc_block(page_size, node);
+ altmap = to_vmem_altmap((unsigned long) orig);
+ p = __vmemmap_alloc_block_buf(page_size, node, altmap);
if (!p)
return -ENOMEM;
--
1.8.3.1
* [RFC 2/7] powerpc/mm: Enable support for ZONE_DEVICE on PPC_BOOK3S_64 platforms
2016-05-03 6:29 [RFC 0/7] Enable ZONE_DEVICE on POWER Anshuman Khandual
2016-05-03 6:29 ` [RFC 1/7] powerpc/mm: Make vmemmap_populate accommodate ZONE_DEVICE memory Anshuman Khandual
@ 2016-05-03 6:29 ` Anshuman Khandual
2016-05-03 6:29 ` [RFC 3/7] powerpc/mm: Define TOP_ZONE as a constant Anshuman Khandual
` (4 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-05-03 6:29 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aneesh.kumar, mpe, mikey, oohall
This enables support for the new ZONE_DEVICE on PPC_BOOK3S_64 platforms,
which now accommodate device memory during memory hotplug operations.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/Kconfig | 4 ++++
mm/Kconfig | 2 +-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 7cd32c0..9ac5c6f 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -1084,6 +1084,10 @@ endif
config ARCH_RANDOM
def_bool n
+config ZONE_DEVICE
+ bool
+ default y if PPC_BOOK3S_64
+
source "net/Kconfig"
source "drivers/Kconfig"
diff --git a/mm/Kconfig b/mm/Kconfig
index 989f8f3..8ecd869 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -654,7 +654,7 @@ config ZONE_DEVICE
depends on MEMORY_HOTPLUG
depends on MEMORY_HOTREMOVE
depends on SPARSEMEM_VMEMMAP
- depends on X86_64 #arch_add_memory() comprehends device memory
+ depends on (X86_64 || PPC_BOOK3S_64) #arch_add_memory() comprehends device memory
help
Device memory hotplug support allows for establishing pmem,
--
1.8.3.1
* [RFC 3/7] powerpc/mm: Define TOP_ZONE as a constant
2016-05-03 6:29 [RFC 0/7] Enable ZONE_DEVICE on POWER Anshuman Khandual
2016-05-03 6:29 ` [RFC 1/7] powerpc/mm: Make vmemmap_populate accommodate ZONE_DEVICE memory Anshuman Khandual
2016-05-03 6:29 ` [RFC 2/7] powerpc/mm: Enable support for ZONE_DEVICE on PPC_BOOK3S_64 platforms Anshuman Khandual
@ 2016-05-03 6:29 ` Anshuman Khandual
2016-05-03 8:12 ` Balbir Singh
2016-05-03 6:29 ` [RFC 4/7] powerpc/mm: Set MAX_ZONE_PFN to 0 for all zones beyond TOP_ZONE Anshuman Khandual
` (3 subsequent siblings)
6 siblings, 1 reply; 13+ messages in thread
From: Anshuman Khandual @ 2016-05-03 6:29 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aneesh.kumar, mpe, mikey, oohall
From: Oliver O'Halloran <oohall@gmail.com>
The zone that contains the top of the memory will be either ZONE_NORMAL
or ZONE_HIGHMEM depending on the kernel config. There are two functions
in there which require this information and both of them use an #ifdef
to set a local variable (top_zone). This is a little nuts, so let's just
make it a constant.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
arch/powerpc/mm/mem.c | 20 ++++++++------------
1 file changed, 8 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index ac79dbd..bcaede4 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -61,6 +61,12 @@
#define CPU_FTR_NOEXECUTE 0
#endif
+#ifdef CONFIG_HIGHMEM
+#define TOP_ZONE ZONE_HIGHMEM
+#else
+#define TOP_ZONE ZONE_NORMAL
+#endif
+
unsigned long long memory_limit;
#ifdef CONFIG_HIGHMEM
@@ -267,14 +273,9 @@ void __init limit_zone_pfn(enum zone_type zone, unsigned long pfn_limit)
*/
int dma_pfn_limit_to_zone(u64 pfn_limit)
{
- enum zone_type top_zone = ZONE_NORMAL;
int i;
-#ifdef CONFIG_HIGHMEM
- top_zone = ZONE_HIGHMEM;
-#endif
-
- for (i = top_zone; i >= 0; i--) {
+ for (i = TOP_ZONE; i >= 0; i--) {
if (max_zone_pfns[i] <= pfn_limit)
return i;
}
@@ -289,7 +290,6 @@ void __init paging_init(void)
{
unsigned long long total_ram = memblock_phys_mem_size();
phys_addr_t top_of_ram = memblock_end_of_DRAM();
- enum zone_type top_zone;
#ifdef CONFIG_PPC32
unsigned long v = __fix_to_virt(__end_of_fixed_addresses - 1);
@@ -313,13 +313,9 @@ void __init paging_init(void)
(long int)((top_of_ram - total_ram) >> 20));
#ifdef CONFIG_HIGHMEM
- top_zone = ZONE_HIGHMEM;
limit_zone_pfn(ZONE_NORMAL, lowmem_end_addr >> PAGE_SHIFT);
-#else
- top_zone = ZONE_NORMAL;
#endif
-
- limit_zone_pfn(top_zone, top_of_ram >> PAGE_SHIFT);
+ limit_zone_pfn(TOP_ZONE, top_of_ram >> PAGE_SHIFT);
zone_limits_final = true;
free_area_init_nodes(max_zone_pfns);
--
1.8.3.1
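The effect of the constant can be tried out in userspace. The sketch below is a model under stated assumptions: the zone enum is a mock, SKETCH_HIGHMEM stands in for CONFIG_HIGHMEM, and zone_for_pfn_limit() is a hypothetical helper that mirrors the loop shape of dma_pfn_limit_to_zone() shown in the diff.

```c
#include <assert.h>

/* Mock zone list; the real enum zone_type is defined by the kernel. */
enum zone_model {
	ZM_DMA,
	ZM_NORMAL,
#ifdef SKETCH_HIGHMEM
	ZM_HIGHMEM,
#endif
	ZM_MOVABLE,
	ZM_NR_ZONES
};

/* Resolved once at compile time instead of per-function #ifdef blocks. */
#ifdef SKETCH_HIGHMEM
#define TOP_ZONE_MODEL ZM_HIGHMEM
#else
#define TOP_ZONE_MODEL ZM_NORMAL
#endif

/* Highest zone, at or below the top zone, whose limit fits under pfn_limit. */
static int zone_for_pfn_limit(const unsigned long *max_pfns,
			      unsigned long pfn_limit)
{
	int i;

	assert(max_pfns);
	for (i = TOP_ZONE_MODEL; i >= 0; i--) {
		if (max_pfns[i] <= pfn_limit)
			return i;
	}
	return -1;	/* no zone satisfies the limit */
}
```

Building with -DSKETCH_HIGHMEM changes only the value of TOP_ZONE_MODEL; the function body stays free of conditional compilation, which is the point of the patch.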
* [RFC 4/7] powerpc/mm: Set MAX_ZONE_PFN to 0 for all zones beyond TOP_ZONE
2016-05-03 6:29 [RFC 0/7] Enable ZONE_DEVICE on POWER Anshuman Khandual
` (2 preceding siblings ...)
2016-05-03 6:29 ` [RFC 3/7] powerpc/mm: Define TOP_ZONE as a constant Anshuman Khandual
@ 2016-05-03 6:29 ` Anshuman Khandual
2016-05-03 8:13 ` Balbir Singh
2016-05-03 6:29 ` [RFC 5/7] mm/memremap: Export pfn_first, pfn_end, find_pagemap functions Anshuman Khandual
` (2 subsequent siblings)
6 siblings, 1 reply; 13+ messages in thread
From: Anshuman Khandual @ 2016-05-03 6:29 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aneesh.kumar, mpe, mikey, oohall
From: Oliver O'Halloran <oohall@gmail.com>
All the memory zones past TOP_ZONE are managed by generic mm code. Zone
max PFN should be set to 0 instead of ~0UL since that's what the generic
mm code expects. Without this, the kernel assigns all pages to
ZONE_DEVICE, which is not part of the buddy allocator, so the kernel
cannot allocate any memory and fails to boot.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
arch/powerpc/mm/mem.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index bcaede4..19d2c62 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -242,8 +242,14 @@ static int __init mark_nonram_nosave(void)
static bool zone_limits_final;
+/*
+ * The memory zones past TOP_ZONE are managed by generic mm code.
+ * Zone max PFN should be set to 0 instead of ~0UL since that's
+ * what the generic mm code expects.
+ */
static unsigned long max_zone_pfns[MAX_NR_ZONES] = {
- [0 ... MAX_NR_ZONES - 1] = ~0UL
+ [0 ... TOP_ZONE] = ~0UL,
+ [TOP_ZONE + 1 ... MAX_NR_ZONES - 1] = 0
};
/*
--
1.8.3.1
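The commit message describes a split in which zones up to and including TOP_ZONE start out unlimited and every zone past it starts at 0. The sketch below shows one way to express that split with GNU C range designators (a GCC/Clang extension). The enum and all names are mock values for illustration, not the kernel's definitions.

```c
#include <assert.h>

/* Mock zone list with a device zone at the end. */
enum zone_model { ZM_DMA, ZM_NORMAL, ZM_MOVABLE, ZM_DEVICE, ZM_NR_ZONES };

#define TOP_ZONE_MODEL ZM_NORMAL

/* Zones up to TOP_ZONE_MODEL are unlimited; later zones start empty. */
static unsigned long max_pfns_model[ZM_NR_ZONES] = {
	[0 ... TOP_ZONE_MODEL] = ~0UL,
	[TOP_ZONE_MODEL + 1 ... ZM_NR_ZONES - 1] = 0
};

/* Count the zones that start out unlimited. */
static int nr_unlimited_zones(void)
{
	int i, n = 0;

	/* The second range designator needs at least one zone past the top. */
	assert(TOP_ZONE_MODEL + 1 < ZM_NR_ZONES);
	for (i = 0; i < ZM_NR_ZONES; i++) {
		if (max_pfns_model[i] == ~0UL)
			n++;
	}
	return n;
}
```

In this model only ZM_DMA and ZM_NORMAL can later be clamped to the top of RAM, while ZM_MOVABLE and ZM_DEVICE stay at 0 until generic code sets them up.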
* [RFC 5/7] mm/memremap: Export pfn_first, pfn_end, find_pagemap functions
2016-05-03 6:29 [RFC 0/7] Enable ZONE_DEVICE on POWER Anshuman Khandual
` (3 preceding siblings ...)
2016-05-03 6:29 ` [RFC 4/7] powerpc/mm: Set MAX_ZONE_PFN to 0 for all zones beyond TOP_ZONE Anshuman Khandual
@ 2016-05-03 6:29 ` Anshuman Khandual
2016-05-03 6:29 ` [RFC 6/7] TEST: Reserve system memory to be emulated as device memory Anshuman Khandual
2016-05-03 6:29 ` [RFC 7/7] TEST: Driver to test device memory through ZONE_DEVICE Anshuman Khandual
6 siblings, 0 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-05-03 6:29 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aneesh.kumar, mpe, mikey, oohall
This change exports two existing functions, pfn_first and pfn_end, and
also defines a new one called find_pagemap. These functions are required
to navigate through the memory which is hotplugged into ZONE_DEVICE as
device memory.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
include/linux/memremap.h | 18 ++++++++++++++++++
kernel/memremap.c | 15 +++++++++++++--
2 files changed, 31 insertions(+), 2 deletions(-)
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index bcaa634..b193044 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -28,11 +28,29 @@ void vmem_altmap_free(struct vmem_altmap *altmap, unsigned long nr_pfns);
#if defined(CONFIG_SPARSEMEM_VMEMMAP) && defined(CONFIG_ZONE_DEVICE)
struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start);
+struct page_map *find_pagemap(resource_size_t phys);
+unsigned long pfn_first(struct page_map *pgmap);
+unsigned long pfn_end(struct page_map *pgmap);
#else
static inline struct vmem_altmap *to_vmem_altmap(unsigned long memmap_start)
{
return NULL;
}
+
+static inline struct page_map *find_pagemap(resource_size_t phys)
+{
+ return NULL;
+}
+
+static inline unsigned long pfn_first(struct page_map *pgmap)
+{
+ return 0;
+}
+
+static inline unsigned long pfn_end(struct page_map *pgmap)
+{
+ return 0;
+}
#endif
/**
diff --git a/kernel/memremap.c b/kernel/memremap.c
index a6d3823..3c21051 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -207,7 +207,7 @@ static void pgmap_radix_release(struct resource *res)
mutex_unlock(&pgmap_lock);
}
-static unsigned long pfn_first(struct page_map *page_map)
+unsigned long pfn_first(struct page_map *page_map)
{
struct dev_pagemap *pgmap = &page_map->pgmap;
const struct resource *res = &page_map->res;
@@ -220,7 +220,7 @@ static unsigned long pfn_first(struct page_map *page_map)
return pfn;
}
-static unsigned long pfn_end(struct page_map *page_map)
+unsigned long pfn_end(struct page_map *page_map)
{
const struct resource *res = &page_map->res;
@@ -262,6 +262,17 @@ struct dev_pagemap *find_dev_pagemap(resource_size_t phys)
return page_map ? &page_map->pgmap : NULL;
}
+struct page_map *find_pagemap(resource_size_t phys)
+{
+ struct page_map *page_map;
+
+ rcu_read_lock();
+ page_map = radix_tree_lookup(&pgmap_radix, phys >> PA_SECTION_SHIFT);
+ rcu_read_unlock();
+
+ return page_map;
+}
+
/**
* devm_memremap_pages - remap and provide memmap backing for the given resource
* @dev: hosting device for @res
--
1.8.3.1
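The lookup in find_pagemap keys on the section index, phys >> PA_SECTION_SHIFT, so any address inside a registered section resolves to the same page_map. The sketch below models that with a flat array instead of a radix tree. It assumes a section shift of 24 (16MB sections, inferred from the 0x8000000 byte, eight-section mapping in the cover letter log) and one entry per section; all type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SECTION_SHIFT 24	/* assumption: 16MB sections */

/* Hypothetical stand-in for struct page_map. */
struct page_map_model {
	unsigned long long start;
	unsigned long long end;
};

/* One slot per registered section, standing in for a radix tree entry. */
struct section_slot {
	unsigned long long index;
	struct page_map_model *map;
};

/* Return the map covering 'phys', or NULL if its section is not registered. */
static struct page_map_model *find_pagemap_model(const struct section_slot *slots,
						  size_t nr,
						  unsigned long long phys)
{
	unsigned long long key = phys >> SKETCH_SECTION_SHIFT;
	size_t i;

	assert(slots || nr == 0);
	for (i = 0; i < nr; i++) {
		if (slots[i].index == key)
			return slots[i].map;
	}
	return NULL;
}
```

For a mapping starting at 0xb10000000 the keys run from 0xb10 to 0xb17, so 0xb17ffffff still resolves to the map while 0xb18000000 does not.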
* [RFC 6/7] TEST: Reserve system memory to be emulated as device memory
2016-05-03 6:29 [RFC 0/7] Enable ZONE_DEVICE on POWER Anshuman Khandual
` (4 preceding siblings ...)
2016-05-03 6:29 ` [RFC 5/7] mm/memremap: Export pfn_first, pfn_end, find_pagemap functions Anshuman Khandual
@ 2016-05-03 6:29 ` Anshuman Khandual
2016-05-03 6:29 ` [RFC 7/7] TEST: Driver to test device memory through ZONE_DEVICE Anshuman Khandual
6 siblings, 0 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-05-03 6:29 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aneesh.kumar, mpe, mikey, oohall
While booting, cap system memory at SYSRAM_END and start recording
the memory from DEVRAM_START as reserved memory.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/prom.h | 15 +++++++++++++++
arch/powerpc/kernel/prom.c | 11 +++++++++++
2 files changed, 26 insertions(+)
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 7f436ba..a4ec240e4 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -165,5 +165,20 @@ struct of_drconf_cell {
*/
extern unsigned char ibm_architecture_vec[];
+#define SYSRAM_END 0x400000000
+#define DEVRAM_START 0xb00000000
+#define NR_RESERVE 500
+
+enum resmem_elements {
+ MEM_BASE = 0,
+ MEM_SIZE = 1,
+ MEM_NODE = 2,
+ MEM_MAX = 3
+};
+
+struct resmem {
+ u64 mem[NR_RESERVE][MEM_MAX];
+ u64 nr;
+};
#endif /* __KERNEL__ */
#endif /* _POWERPC_PROM_H */
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index a15fe1d..9b3bf2e 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -438,6 +438,8 @@ static int __init early_init_dt_scan_chosen_ppc(unsigned long node,
}
#ifdef CONFIG_PPC_PSERIES
+struct resmem rmem;
+EXPORT_SYMBOL(rmem);
/*
* Interpret the ibm,dynamic-memory property in the
* /ibm,dynamic-reconfiguration-memory node.
@@ -471,6 +473,7 @@ static int __init early_init_dt_scan_drconf_memory(unsigned long node)
if (usm != NULL)
is_kexec_kdump = 1;
+ memset(&rmem, 0, sizeof(struct resmem));
for (; n != 0; --n) {
base = dt_mem_next_cell(dt_root_addr_cells, &dm);
flags = of_read_number(&dm[3], 1);
@@ -508,6 +511,14 @@ static int __init early_init_dt_scan_drconf_memory(unsigned long node)
if ((base + size) > 0x80000000ul)
size = 0x80000000ul - base;
}
+ if (base > DEVRAM_START) {
+ rmem.mem[rmem.nr][MEM_BASE] = base;
+ rmem.mem[rmem.nr][MEM_SIZE] = size;
+ rmem.nr++;
+ continue;
+ }
+ if (base > SYSRAM_END)
+ continue;
memblock_add(base, size);
} while (--rngs);
}
--
1.8.3.1
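The boot-time decision in the hunk above sorts each range into one of three outcomes. The sketch below reproduces that decision in userspace using the SYSRAM_END and DEVRAM_START values from the patch; the enum and the classify_range() helper are hypothetical names introduced for illustration.

```c
#include <assert.h>

#define SYSRAM_END   0x400000000ULL	/* value from the patch */
#define DEVRAM_START 0xb00000000ULL	/* value from the patch */

enum range_fate {
	RANGE_SYSTEM,	/* handed to memblock_add() */
	RANGE_SKIPPED,	/* neither added nor recorded */
	RANGE_RESERVED	/* recorded as emulated device memory */
};

/* Same comparisons, in the same order, as the patched loop body. */
static enum range_fate classify_range(unsigned long long base)
{
	assert(SYSRAM_END < DEVRAM_START);
	if (base > DEVRAM_START)
		return RANGE_RESERVED;
	if (base > SYSRAM_END)
		return RANGE_SKIPPED;
	return RANGE_SYSTEM;
}
```

Both comparisons are strict, so a range starting exactly at DEVRAM_START is skipped rather than reserved. That is consistent with the cover letter log, where the first reserved base is b10000000.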
* [RFC 7/7] TEST: Driver to test device memory through ZONE_DEVICE
2016-05-03 6:29 [RFC 0/7] Enable ZONE_DEVICE on POWER Anshuman Khandual
` (5 preceding siblings ...)
2016-05-03 6:29 ` [RFC 6/7] TEST: Reserve system memory to be emulated as device memory Anshuman Khandual
@ 2016-05-03 6:29 ` Anshuman Khandual
2016-05-03 8:21 ` Balbir Singh
6 siblings, 1 reply; 13+ messages in thread
From: Anshuman Khandual @ 2016-05-03 6:29 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aneesh.kumar, mpe, mikey, oohall
This is an example driver with a little bit of kernel hacking to test
ZONE_DEVICE based device memory management on POWER.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/mm/init_64.c | 12 ++-
arch/powerpc/platforms/pseries/Makefile | 2 +-
arch/powerpc/platforms/pseries/rmem.c | 186 ++++++++++++++++++++++++++++++++
3 files changed, 198 insertions(+), 2 deletions(-)
create mode 100644 arch/powerpc/platforms/pseries/rmem.c
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index db73708..5cc286d 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -351,6 +351,17 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
}
#ifdef CONFIG_MEMORY_HOTPLUG
+void dump_vmemmap(void)
+{
+ struct vmemmap_backing *vmem_back = vmemmap_list;
+
+ for (; vmem_back; vmem_back = vmem_back->list) {
+ printk("RMEM: vmemmap backing (%lx %lx)\n",
+ vmem_back->virt_addr, vmem_back->phys);
+ }
+}
+EXPORT_SYMBOL(dump_vmemmap);
+
static unsigned long vmemmap_list_free(unsigned long start)
{
struct vmemmap_backing *vmem_back, *vmem_back_prev;
@@ -482,5 +493,4 @@ struct page *realmode_pfn_to_page(unsigned long pfn)
return page;
}
EXPORT_SYMBOL_GPL(realmode_pfn_to_page);
-
#endif /* CONFIG_SPARSEMEM_VMEMMAP/CONFIG_FLATMEM */
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index fedc2ccf0..20a5e23 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -5,7 +5,7 @@ obj-y := lpar.o hvCall.o nvram.o reconfig.o \
of_helpers.o \
setup.o iommu.o event_sources.o ras.o \
firmware.o power.o dlpar.o mobility.o rng.o \
- pci.o pci_dlpar.o eeh_pseries.o msi.o
+ pci.o pci_dlpar.o eeh_pseries.o msi.o rmem.o
obj-$(CONFIG_SMP) += smp.o
obj-$(CONFIG_SCANLOG) += scanlog.o
obj-$(CONFIG_KEXEC) += kexec.o
diff --git a/arch/powerpc/platforms/pseries/rmem.c b/arch/powerpc/platforms/pseries/rmem.c
new file mode 100644
index 0000000..8c2287a
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/rmem.c
@@ -0,0 +1,186 @@
+/*
+ * Copyright (C) 2016 Anshuman Khandual (khandual@linux.vnet.ibm.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/gfp.h>
+#include <linux/kthread.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/memremap.h>
+
+#include <asm/firmware.h>
+#include <asm/hvcall.h>
+#include <asm/mmu.h>
+#include <asm/pgalloc.h>
+#include <asm/uaccess.h>
+#include <linux/memory.h>
+#include <asm/plpar_wrappers.h>
+#include <asm/prom.h>
+
+#define DEVM_MAP_SIZE ((1UL << PA_SECTION_SHIFT) * 8)
+
+extern void dump_vmemmap(void);
+extern struct resmem rmem;
+
+unsigned long devmem_start;
+unsigned long devmem_end;
+
+void driver_test_devmem(void)
+{
+ unsigned long i;
+ unsigned long start = devmem_start;
+ unsigned long end = devmem_end;
+
+ for(i = start; i < end; i++)
+ *(unsigned long *)__va(i << PAGE_SHIFT) = (char)i;
+
+ for(i = start; i < end; i++) {
+ if (*(unsigned long *)__va(i << PAGE_SHIFT) != (char)i)
+ printk("RMEM: Error data miscompare at %lx\n", i);
+ }
+ printk("RMEM: Data integrity test successful\n");
+}
+
+void driver_memory(unsigned long start_pfn, unsigned long end_pfn)
+{
+ printk("RMEM: Driver now owns PFN(%lx....%lx)\n", start_pfn, end_pfn);
+
+ devmem_start = start_pfn;
+ devmem_end = end_pfn;
+ driver_test_devmem();
+}
+
+static void dump_reserved(void)
+{
+ unsigned long i;
+
+ printk("RMEM: Reserved memory sections\n");
+ for (i = 0; i < rmem.nr; i++) {
+ printk("RMEM: Base %llx Size: %llx Node: %llu\n",
+ rmem.mem[i][MEM_BASE], rmem.mem[i][MEM_SIZE],
+ rmem.mem[i][MEM_NODE]);
+ }
+}
+
+static void dump_devmap(resource_size_t start)
+{
+ struct vmem_altmap *altmap;
+ struct dev_pagemap *pgmap;
+ struct page_map *pmap;
+ struct page *page;
+ unsigned long pfn;
+
+ altmap = to_vmem_altmap((unsigned long)pfn_to_page(start >> PAGE_SHIFT));
+ if (altmap) {
+ printk("RMEM: altmap->base_pfn %lu\n", altmap->base_pfn);
+ printk("RMEM: altmap->reserve %lu\n", altmap->reserve);
+ printk("RMEM: altmap->free %lu\n", altmap->free);
+ printk("RMEM: altmap->align %lu\n", altmap->align);
+ printk("RMEM: altmap->alloc %lu\n", altmap->alloc);
+ }
+ pmap = find_pagemap(start);
+ rcu_read_lock();
+ pgmap = find_dev_pagemap(start);
+ rcu_read_unlock();
+ printk("RMEM: pagemap (%lx)\n", (unsigned long)pmap);
+ printk("RMEM: dev_pagemap (%lx)\n", (unsigned long)pgmap);
+ printk("RMEM: pfn range (%lx %lx)\n", pfn_first(pmap), pfn_end(pmap));
+
+ for (pfn = pfn_first(pmap); pfn < pfn_end(pmap); pfn++) {
+ page = pfn_to_page(pfn);
+ printk("DEVM: pfn(%lx) page(%lx) pagemap(%lx) flags(%lx)\n",
+ pfn, (unsigned long)page, (unsigned long)page->pgmap,
+ page->flags);
+ }
+ driver_memory(pfn_first(pmap), pfn_end(pmap));
+}
+
+static void simple_translation_test(void __pmem *vaddr)
+{
+ unsigned long i;
+
+ if (vaddr) {
+ unsigned long tmp;
+
+ for (i = 0; i < DEVM_MAP_SIZE / sizeof(unsigned long); i++)
+ tmp = *((unsigned long *)vaddr + i);
+
+ printk("RMEM: Read access complete (%lx %lx)\n",
+ (unsigned long)vaddr, DEVM_MAP_SIZE);
+ }
+}
+
+static int rmem_init(void)
+{
+ struct class *class;
+ struct device *dev;
+ struct resource *res;
+ struct percpu_ref *ref;
+ void __pmem *vaddr;
+ struct vmem_altmap *altmap;
+ struct vmem_altmap __altmap = {
+ .base_pfn = rmem.mem[0][0] >> PAGE_SHIFT,
+ .reserve = 0,
+ .free = 0x100,
+ .alloc = 0,
+ .align = 0,
+ };
+
+ printk("RMEM: Driver loaded\n");
+ dump_reserved();
+
+ class = class_create(THIS_MODULE, "rmem");
+ if (IS_ERR(class)) {
+ printk("RMEM: class_create() failed\n");
+ goto out;
+ }
+
+ dev = device_create(class, NULL, MKDEV(100, 100), NULL, "rmem");
+ if (IS_ERR(dev)) {
+ printk("RMEM: device_create() failed\n");
+ goto out;
+ }
+
+ res = devm_kzalloc(dev, sizeof(*res), GFP_KERNEL);
+ if (!res) {
+ printk("RMEM: devm_kzalloc() failed\n");
+ goto out;
+ }
+
+ ref = devm_kzalloc(dev, sizeof(*ref), GFP_KERNEL);
+ if (!ref) {
+ printk("RMEM: devm_kzalloc() failed\n");
+ goto out;
+ }
+
+ dump_vmemmap();
+ altmap = &__altmap;
+ res->start = rmem.mem[0][0];
+ res->end = rmem.mem[0][0] + DEVM_MAP_SIZE;
+ vaddr = devm_memremap_pages(dev, res, ref, altmap);
+ dump_vmemmap();
+
+ simple_translation_test(vaddr);
+ dump_devmap(res->start);
+ return 0;
+out:
+ return -1;
+}
+
+static void rmem_exit(void)
+{
+ printk("RMEM: rmem driver unloaded\n");
+}
+
+module_init(rmem_init);
+module_exit(rmem_exit);
+
+MODULE_AUTHOR("Anshuman Khandual <khandual@linux.vnet.ibm.com>");
+MODULE_DESCRIPTION("Test driver for device memory");
+MODULE_LICENSE("GPL");
--
1.8.3.1
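The driver's integrity check follows a write-then-verify pattern over the pages it owns. The same pattern can be exercised in userspace on an ordinary buffer, as in the sketch below. This is a model only: the buffer stands in for device memory, the function names are hypothetical, and memcpy is used so the sketch makes no alignment assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Store the page index in the first word of every page. */
static void pattern_fill(unsigned char *buf, size_t nr_pages, size_t page_size)
{
	size_t i;

	assert(buf && page_size >= sizeof(unsigned long));
	for (i = 0; i < nr_pages; i++) {
		unsigned long word = (unsigned long)i;

		memcpy(buf + i * page_size, &word, sizeof(word));
	}
}

/* Read each first word back; return how many pages do not match. */
static size_t pattern_verify(const unsigned char *buf, size_t nr_pages,
			     size_t page_size)
{
	size_t i, bad = 0;

	assert(buf && page_size >= sizeof(unsigned long));
	for (i = 0; i < nr_pages; i++) {
		unsigned long word;

		memcpy(&word, buf + i * page_size, sizeof(word));
		if (word != (unsigned long)i)
			bad++;
	}
	return bad;
}
```

Returning a mismatch count, rather than only printing, lets the caller report success only when the count is zero.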
* Re: [RFC 1/7] powerpc/mm: Make vmemmap_populate accommodate ZONE_DEVICE memory
2016-05-03 6:29 ` [RFC 1/7] powerpc/mm: Make vmemmap_populate accommodate ZONE_DEVICE memory Anshuman Khandual
@ 2016-05-03 8:04 ` Balbir Singh
2016-05-03 8:25 ` Anshuman Khandual
0 siblings, 1 reply; 13+ messages in thread
From: Balbir Singh @ 2016-05-03 8:04 UTC (permalink / raw)
To: Anshuman Khandual, linuxppc-dev; +Cc: mikey, oohall, aneesh.kumar
On 03/05/16 16:29, Anshuman Khandual wrote:
> Change the vmemmap_populate function to detect device memory through
> to_vmem_altmap and then call the generic __vmemmap_alloc_block_buf
> function instead of vmemmap_alloc_block, as the former can allocate
> physical memory from the device range instead of the system RAM.
>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
> arch/powerpc/mm/init_64.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index ba65566..db73708 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -42,6 +42,7 @@
> #include <linux/memblock.h>
> #include <linux/hugetlb.h>
> #include <linux/slab.h>
> +#include <linux/memremap.h>
>
> #include <asm/pgalloc.h>
> #include <asm/page.h>
> @@ -312,6 +313,7 @@ static __meminit void vmemmap_list_populate(unsigned long phys,
> int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
> {
> unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
> + unsigned long orig = start;
I would much rather do struct vmem_altmap *altmap = to_vmem_altmap(start);
>
> /* Align to the page size of the linear mapping. */
> start = _ALIGN_DOWN(start, page_size);
> @@ -319,13 +321,15 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
> pr_debug("vmemmap_populate %lx..%lx, node %d\n", start, end, node);
>
> for (; start < end; start += page_size) {
> + struct vmem_altmap *altmap;
> void *p;
> int rc;
>
> if (vmemmap_populated(start, page_size))
> continue;
>
> - p = vmemmap_alloc_block(page_size, node);
> + altmap = to_vmem_altmap((unsigned long) orig);
> + p = __vmemmap_alloc_block_buf(page_size, node, altmap);
> if (!p)
> return -ENOMEM;
>
>
* Re: [RFC 3/7] powerpc/mm: Define TOP_ZONE as a constant
2016-05-03 6:29 ` [RFC 3/7] powerpc/mm: Define TOP_ZONE as a constant Anshuman Khandual
@ 2016-05-03 8:12 ` Balbir Singh
0 siblings, 0 replies; 13+ messages in thread
From: Balbir Singh @ 2016-05-03 8:12 UTC (permalink / raw)
To: Anshuman Khandual, linuxppc-dev; +Cc: mikey, oohall, aneesh.kumar
On 03/05/16 16:29, Anshuman Khandual wrote:
> From: Oliver O'Halloran <oohall@gmail.com>
>
> The zone that contains the top of the memory will be either ZONE_NORMAL
> or ZONE_HIGHMEM depending on the kernel config. There are two functions
> in there which require this information and both of them use an #ifdef
> to set a local variable (top_zone). This is a little nuts, so let's just
> make it a constant.
>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
Looks good
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
* Re: [RFC 4/7] powerpc/mm: Set MAX_ZONE_PFN to 0 for all zones beyond TOP_ZONE
2016-05-03 6:29 ` [RFC 4/7] powerpc/mm: Set MAX_ZONE_PFN to 0 for all zones beyond TOP_ZONE Anshuman Khandual
@ 2016-05-03 8:13 ` Balbir Singh
0 siblings, 0 replies; 13+ messages in thread
From: Balbir Singh @ 2016-05-03 8:13 UTC (permalink / raw)
To: Anshuman Khandual, linuxppc-dev; +Cc: mikey, oohall, aneesh.kumar
On 03/05/16 16:29, Anshuman Khandual wrote:
> From: Oliver O'Halloran <oohall@gmail.com>
>
> All the memory zones past TOP_ZONE are managed by generic mm code. Zone
> max PFN should be set to 0 instead of ~0UL since that's what the generic
> mm code expects. Without this, the kernel assigns all pages to
> ZONE_DEVICE, which is not part of the buddy allocator, so the kernel
> cannot allocate any memory and fails to boot.
>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
* Re: [RFC 7/7] TEST: Driver to test device memory through ZONE_DEVICE
2016-05-03 6:29 ` [RFC 7/7] TEST: Driver to test device memory through ZONE_DEVICE Anshuman Khandual
@ 2016-05-03 8:21 ` Balbir Singh
0 siblings, 0 replies; 13+ messages in thread
From: Balbir Singh @ 2016-05-03 8:21 UTC (permalink / raw)
To: Anshuman Khandual, linuxppc-dev; +Cc: mikey, oohall, aneesh.kumar
On 03/05/16 16:29, Anshuman Khandual wrote:
> This is an example driver with a little bit of kernel hacking to test
> ZONE_DEVICE based device memory management on POWER.
>
I think this should be under CONFIG_DEBUG/CONFIG_DEBUG_VM or something
No?
If you are exporting the functions from the previous patch just for the
test driver, we might want to refactor the code
Balbir
* Re: [RFC 1/7] powerpc/mm: Make vmemmap_populate accommodate ZONE_DEVICE memory
2016-05-03 8:04 ` Balbir Singh
@ 2016-05-03 8:25 ` Anshuman Khandual
0 siblings, 0 replies; 13+ messages in thread
From: Anshuman Khandual @ 2016-05-03 8:25 UTC (permalink / raw)
To: Balbir Singh, linuxppc-dev; +Cc: mikey, oohall, aneesh.kumar
On 05/03/2016 01:34 PM, Balbir Singh wrote:
>
>
> On 03/05/16 16:29, Anshuman Khandual wrote:
>> Change the vmemmap_populate function to detect device memory through
>> to_vmem_altmap and then call the generic __vmemmap_alloc_block_buf
>> function instead of vmemmap_alloc_block, as the former can allocate
>> physical memory from the device range instead of the system RAM.
>>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/mm/init_64.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
>> index ba65566..db73708 100644
>> --- a/arch/powerpc/mm/init_64.c
>> +++ b/arch/powerpc/mm/init_64.c
>> @@ -42,6 +42,7 @@
>> #include <linux/memblock.h>
>> #include <linux/hugetlb.h>
>> #include <linux/slab.h>
>> +#include <linux/memremap.h>
>>
>> #include <asm/pgalloc.h>
>> #include <asm/page.h>
>> @@ -312,6 +313,7 @@ static __meminit void vmemmap_list_populate(unsigned long phys,
>> int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
>> {
>> unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
>> + unsigned long orig = start;
>
> I would much rather do struct vmem_altmap *altmap = to_vmem_altmap(start);
Sure, makes sense.
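The suggestion amounts to resolving the altmap once, before the loop, and passing the result to each allocation. The sketch below is a userspace model of that shape, not the kernel code. The struct, both function names, and the rule that allocation fails once the altmap budget is used up are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the altmap accounting fields. */
struct altmap_model {
	unsigned long free;	/* blocks set aside in the device range */
	unsigned long alloc;	/* blocks handed out so far */
};

/* Take a block from the altmap if one is given, else from system RAM. */
static int alloc_block_model(struct altmap_model *altmap,
			     unsigned long *sys_used)
{
	if (altmap) {
		if (altmap->alloc >= altmap->free)
			return -1;	/* device-side budget exhausted */
		altmap->alloc++;
		return 0;
	}
	(*sys_used)++;
	return 0;
}

/* The caller resolves 'altmap' once; the loop only reuses it. */
static int populate_model(struct altmap_model *altmap, unsigned long nr_blocks,
			  unsigned long *sys_used)
{
	unsigned long i;

	assert(sys_used);
	for (i = 0; i < nr_blocks; i++) {
		if (alloc_block_model(altmap, sys_used))
			return -1;
	}
	return 0;
}
```

Passing NULL models ordinary system memory, where every block is charged to sys_used; passing an altmap keeps the struct page backing inside the device range.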