linuxppc-dev.lists.ozlabs.org archive mirror
* [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section
@ 2016-02-17 12:12 Anshuman Khandual
  2016-02-17 12:12 ` [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout Anshuman Khandual
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Anshuman Khandual @ 2016-02-17 12:12 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: aneesh.kumar, mpe

The commit (16a05bff1: powerpc: start loop at section start of
start in vmemmap_populated()) reused the 'start' variable to compute
the starting address of the memory section where the given address
belongs. Then the same variable is used to iterate over the starting
addresses of all memory sections before reaching the 'end' address.
Renaming it to 'section_start' makes the logic clearer.

Fixes: 16a05bff1 ("powerpc: start loop at section start of start in vmemmap_populated()")
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/mm/init_64.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 379a6a9..d6b9b4d 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -170,11 +170,15 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
  */
 static int __meminit vmemmap_populated(unsigned long start, int page_size)
 {
-	unsigned long end = start + page_size;
-	start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
+	unsigned long end, section_start;
 
-	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
-		if (pfn_valid(page_to_pfn((struct page *)start)))
+	end = start + page_size;
+	section_start = (unsigned long)(pfn_to_page
+					(vmemmap_section_start(start)));
+
+	for (; section_start < end; section_start
+				+= (PAGES_PER_SECTION * sizeof(struct page)))
+		if (pfn_valid(page_to_pfn((struct page *)section_start)))
 			return 1;
 
 	return 0;
-- 
2.1.0
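
For reference, the vmemmap_section_start() helper the loop above leans on
converts a vmemmap virtual address into the pfn at the start of the memory
section containing it. Roughly (a sketch of the in-tree helper from
arch/powerpc/mm/init_64.c, not a verbatim quote):

	/* Done by hand because the given address may not be aligned to
	 * sizeof(struct page), and subtracting non-aligned pointers is
	 * undefined. */
	static unsigned long __meminit vmemmap_section_start(unsigned long page)
	{
		unsigned long offset = page - ((unsigned long)(vmemmap));

		/* Return the pfn of the start of the section. */
		return (offset / sizeof(struct page)) & PAGE_SECTION_MASK;
	}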

* [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout
  2016-02-17 12:12 [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section Anshuman Khandual
@ 2016-02-17 12:12 ` Anshuman Khandual
  2016-02-17 14:49   ` Aneesh Kumar K.V
  2016-02-18 14:22   ` Michael Ellerman
  2016-02-17 12:12 ` [RFC 3/4] powerpc/mm: Rename the vmemmap_backing struct and its elements Anshuman Khandual
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 15+ messages in thread
From: Anshuman Khandual @ 2016-02-17 12:12 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: aneesh.kumar, mpe

Add some explanation of the layout of the vmemmap virtual address
space and how the physical page mapping is only used for valid PFNs
present at any point on the system.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 41 ++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 8d1c41d..9db4a86 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -26,6 +26,47 @@
 #define IOREMAP_BASE	(PHB_IO_END)
 #define IOREMAP_END	(KERN_VIRT_START + KERN_VIRT_SIZE)
 
+/*
+ * Starting address of the virtual address space where all page structs
+ * for the system physical memory are stored under the vmemmap sparse
+ * memory model. All possible struct pages are logically stored in a
+ * sequence in this virtual address space irrespective of the fact
+ * whether any given PFN is valid or even the memory section is valid
+ * or not. During boot and memory hotplug add operation when new memory
+ * sections are added, real physical allocation and hash table bolting
+ * will be performed. This saves precious physical memory when the system
+ * really does not have valid PFNs in some address ranges.
+ *
+ *  vmemmap +--------------+
+ *     +    |  page struct +----------+  PFN is valid
+ *     |    +--------------+          |
+ *     |    |  page struct |          |  PFN is invalid
+ *     |    +--------------+          |
+ *     |    |  page struct +------+   |
+ *     |    +--------------+      |   |
+ *     |    |  page struct |      |   |
+ *     |    +--------------+      |   |
+ *     |    |  page struct |      |   |
+ *     |    +--------------+      |   |
+ *     |    |  page struct +--+   |   |
+ *     |    +--------------+  |   |   |
+ *     |    |  page struct |  |   |   |       +-------------+
+ *     |    +--------------+  |   |   +-----> |     PFN     |
+ *     |    |  page struct |  |   |           +-------------+
+ *     |    +--------------+  |   +---------> |     PFN     |
+ *     |    |  page struct |  |               +-------------+
+ *     |    +--------------+  +-------------> |     PFN     |
+ *     |    |  page struct |                  +-------------+
+ *     |    +--------------+           +----> |     PFN     |
+ *     |    |  page struct |           |      +-------------+
+ *     |    +--------------+           |    Bolted in hash table
+ *     |    |  page struct +-----------+
+ *     v    +--------------+
+ *
+ * The VMEMMAP_BASE (0xf000000000000000) region has a total range of 64TB,
+ * but only NR_MEM_SECTIONS * PAGES_PER_SECTION * sizeof(page struct) of
+ * that virtual memory is actually used.
+ */
 #define vmemmap			((struct page *)VMEMMAP_BASE)
 
 /* Advertise special mapping type for AGP */
-- 
2.1.0
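
A back-of-the-envelope check of that sizing claim, using illustrative
values for a ppc64 configuration of this era (MAX_PHYSMEM_BITS = 46,
SECTION_SIZE_BITS = 24, 64K pages, sizeof(struct page) = 64; these are
assumptions, not values taken from the patch):

	NR_MEM_SECTIONS   = 2^(46 - 24)        = 2^22 sections
	PAGES_PER_SECTION = 2^(24 - 16)        = 256 pages
	vmemmap usage     = 2^22 * 256 * 64    = 2^36 bytes = 64GB

which is only a small fraction of the 64TB VMEMMAP region.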

* [RFC 3/4] powerpc/mm: Rename the vmemmap_backing struct and its elements
  2016-02-17 12:12 [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section Anshuman Khandual
  2016-02-17 12:12 ` [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout Anshuman Khandual
@ 2016-02-17 12:12 ` Anshuman Khandual
  2016-02-17 14:52   ` Aneesh Kumar K.V
  2016-02-17 12:12 ` [RFC 4/4] powerpc/mm: Rename global tracker for virtual to physical mapping Anshuman Khandual
  2016-02-18 14:34 ` [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section Michael Ellerman
  3 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2016-02-17 12:12 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: aneesh.kumar, mpe

The structure that tracks a single virtual to physical mapping has
been renamed from vmemmap_backing to vmemmap_hw_map, which sounds
more appropriate. It forms a single entry of the global linked
list tracking all of the vmemmap physical mappings. The changes
are as follows.

	vmemmap_backing.list -> vmemmap_hw_map.link
	vmemmap_backing.phys -> vmemmap_hw_map.paddr
	vmemmap_backing.virt_addr -> vmemmap_hw_map.vaddr

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pgalloc-64.h | 16 +++++++---
 arch/powerpc/kernel/machine_kexec.c   |  8 ++---
 arch/powerpc/mm/init_64.c             | 58 +++++++++++++++++------------------
 3 files changed, 44 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index 69ef28a..e03b41c 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -11,12 +11,18 @@
 #include <linux/cpumask.h>
 #include <linux/percpu.h>
 
-struct vmemmap_backing {
-	struct vmemmap_backing *list;
-	unsigned long phys;
-	unsigned long virt_addr;
+/*
+ * This structure tracks a single virtual page mapping from
+ * the vmemmap address space. This element is required to
+ * track the virtual to physical mapping of page structures
+ * in the absence of a page table at boot time.
+ */
+struct vmemmap_hw_map {
+	struct vmemmap_hw_map *link;
+	unsigned long paddr;
+	unsigned long vaddr;
 };
-extern struct vmemmap_backing *vmemmap_list;
+extern struct vmemmap_hw_map *vmemmap_list;
 
 /*
  * Functions that deal with pagetables that could be at any level of
diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index 015ae55..0d90798 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -80,10 +80,10 @@ void arch_crash_save_vmcoreinfo(void)
 	VMCOREINFO_SYMBOL(vmemmap_list);
 	VMCOREINFO_SYMBOL(mmu_vmemmap_psize);
 	VMCOREINFO_SYMBOL(mmu_psize_defs);
-	VMCOREINFO_STRUCT_SIZE(vmemmap_backing);
-	VMCOREINFO_OFFSET(vmemmap_backing, list);
-	VMCOREINFO_OFFSET(vmemmap_backing, phys);
-	VMCOREINFO_OFFSET(vmemmap_backing, virt_addr);
+	VMCOREINFO_STRUCT_SIZE(vmemmap_hw_map);
+	VMCOREINFO_OFFSET(vmemmap_hw_map, link);
+	VMCOREINFO_OFFSET(vmemmap_hw_map, paddr);
+	VMCOREINFO_OFFSET(vmemmap_hw_map, vaddr);
 	VMCOREINFO_STRUCT_SIZE(mmu_psize_def);
 	VMCOREINFO_OFFSET(mmu_psize_def, shift);
 #endif
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index d6b9b4d..9b5dea3 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -194,7 +194,7 @@ static int __meminit vmemmap_populated(unsigned long start, int page_size)
 #ifdef CONFIG_PPC_BOOK3E
 static void __meminit vmemmap_create_mapping(unsigned long start,
 					     unsigned long page_size,
-					     unsigned long phys)
+					     unsigned long paddr)
 {
 	/* Create a PTE encoding without page size */
 	unsigned long i, flags = _PAGE_PRESENT | _PAGE_ACCESSED |
@@ -207,11 +207,11 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
 	flags |= mmu_psize_defs[mmu_vmemmap_psize].enc << 8;
 
 	/* For each PTE for that area, map things. Note that we don't
-	 * increment phys because all PTEs are of the large size and
+	 * increment paddr because all PTEs are of the large size and
 	 * thus must have the low bits clear
 	 */
 	for (i = 0; i < page_size; i += PAGE_SIZE)
-		BUG_ON(map_kernel_page(start + i, phys, flags));
+		BUG_ON(map_kernel_page(start + i, paddr, flags));
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -223,9 +223,9 @@ static void vmemmap_remove_mapping(unsigned long start,
 #else /* CONFIG_PPC_BOOK3E */
 static void __meminit vmemmap_create_mapping(unsigned long start,
 					     unsigned long page_size,
-					     unsigned long phys)
+					     unsigned long paddr)
 {
-	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
+	int  mapped = htab_bolt_mapping(start, start + page_size, paddr,
 					pgprot_val(PAGE_KERNEL),
 					mmu_vmemmap_psize,
 					mmu_kernel_ssize);
@@ -245,19 +245,19 @@ static void vmemmap_remove_mapping(unsigned long start,
 
 #endif /* CONFIG_PPC_BOOK3E */
 
-struct vmemmap_backing *vmemmap_list;
-static struct vmemmap_backing *next;
+struct vmemmap_hw_map *vmemmap_list;
+static struct vmemmap_hw_map *next;
 static int num_left;
 static int num_freed;
 
-static __meminit struct vmemmap_backing * vmemmap_list_alloc(int node)
+static __meminit struct vmemmap_hw_map * vmemmap_list_alloc(int node)
 {
-	struct vmemmap_backing *vmem_back;
+	struct vmemmap_hw_map *vmem_back;
 	/* get from freed entries first */
 	if (num_freed) {
 		num_freed--;
 		vmem_back = next;
-		next = next->list;
+		next = next->link;
 
 		return vmem_back;
 	}
@@ -269,7 +269,7 @@ static __meminit struct vmemmap_backing * vmemmap_list_alloc(int node)
 			WARN_ON(1);
 			return NULL;
 		}
-		num_left = PAGE_SIZE / sizeof(struct vmemmap_backing);
+		num_left = PAGE_SIZE / sizeof(struct vmemmap_hw_map);
 	}
 
 	num_left--;
@@ -277,11 +277,11 @@ static __meminit struct vmemmap_backing * vmemmap_list_alloc(int node)
 	return next++;
 }
 
-static __meminit void vmemmap_list_populate(unsigned long phys,
+static __meminit void vmemmap_list_populate(unsigned long paddr,
 					    unsigned long start,
 					    int node)
 {
-	struct vmemmap_backing *vmem_back;
+	struct vmemmap_hw_map *vmem_back;
 
 	vmem_back = vmemmap_list_alloc(node);
 	if (unlikely(!vmem_back)) {
@@ -289,9 +289,9 @@ static __meminit void vmemmap_list_populate(unsigned long phys,
 		return;
 	}
 
-	vmem_back->phys = phys;
-	vmem_back->virt_addr = start;
-	vmem_back->list = vmemmap_list;
+	vmem_back->paddr = paddr;
+	vmem_back->vaddr = start;
+	vmem_back->link = vmemmap_list;
 
 	vmemmap_list = vmem_back;
 }
@@ -329,13 +329,13 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 #ifdef CONFIG_MEMORY_HOTPLUG
 static unsigned long vmemmap_list_free(unsigned long start)
 {
-	struct vmemmap_backing *vmem_back, *vmem_back_prev;
+	struct vmemmap_hw_map *vmem_back, *vmem_back_prev;
 
 	vmem_back_prev = vmem_back = vmemmap_list;
 
 	/* look for it with prev pointer recorded */
-	for (; vmem_back; vmem_back = vmem_back->list) {
-		if (vmem_back->virt_addr == start)
+	for (; vmem_back; vmem_back = vmem_back->link) {
+		if (vmem_back->vaddr == start)
 			break;
 		vmem_back_prev = vmem_back;
 	}
@@ -347,16 +347,16 @@ static unsigned long vmemmap_list_free(unsigned long start)
 
 	/* remove it from vmemmap_list */
 	if (vmem_back == vmemmap_list) /* remove head */
-		vmemmap_list = vmem_back->list;
+		vmemmap_list = vmem_back->link;
 	else
-		vmem_back_prev->list = vmem_back->list;
+		vmem_back_prev->link = vmem_back->link;
 
 	/* next point to this freed entry */
-	vmem_back->list = next;
+	vmem_back->link = next;
 	next = vmem_back;
 	num_freed++;
 
-	return vmem_back->phys;
+	return vmem_back->paddr;
 }
 
 void __ref vmemmap_free(unsigned long start, unsigned long end)
@@ -427,20 +427,20 @@ void register_page_bootmem_memmap(unsigned long section_nr,
  */
 struct page *realmode_pfn_to_page(unsigned long pfn)
 {
-	struct vmemmap_backing *vmem_back;
+	struct vmemmap_hw_map *vmem_back;
 	struct page *page;
 	unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
 	unsigned long pg_va = (unsigned long) pfn_to_page(pfn);
 
-	for (vmem_back = vmemmap_list; vmem_back; vmem_back = vmem_back->list) {
-		if (pg_va < vmem_back->virt_addr)
+	for (vmem_back = vmemmap_list; vmem_back; vmem_back = vmem_back->link) {
+		if (pg_va < vmem_back->vaddr)
 			continue;
 
 		/* After vmemmap_list entry free is possible, need check all */
 		if ((pg_va + sizeof(struct page)) <=
-				(vmem_back->virt_addr + page_size)) {
-			page = (struct page *) (vmem_back->phys + pg_va -
-				vmem_back->virt_addr);
+				(vmem_back->vaddr + page_size)) {
+			page = (struct page *) (vmem_back->paddr + pg_va -
+				vmem_back->vaddr);
 			return page;
 		}
 	}
-- 
2.1.0

* [RFC 4/4] powerpc/mm: Rename global tracker for virtual to physical mapping
  2016-02-17 12:12 [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section Anshuman Khandual
  2016-02-17 12:12 ` [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout Anshuman Khandual
  2016-02-17 12:12 ` [RFC 3/4] powerpc/mm: Rename the vmemmap_backing struct and its elements Anshuman Khandual
@ 2016-02-17 12:12 ` Anshuman Khandual
  2016-02-18 14:37   ` Michael Ellerman
  2016-02-18 14:34 ` [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section Michael Ellerman
  3 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2016-02-17 12:12 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: aneesh.kumar, mpe

This renames the global list which tracks all the virtual to physical
mappings and also the global list which tracks all the available unused
vmemmap_hw_map node structures. It also attempts to explain the purpose
of these global linked lists and points out a possible race condition.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pgalloc-64.h |  2 +-
 arch/powerpc/kernel/machine_kexec.c   |  2 +-
 arch/powerpc/mm/init_64.c             | 82 +++++++++++++++++++++--------------
 3 files changed, 52 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index e03b41c..6e21a2a 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -22,7 +22,7 @@ struct vmemmap_hw_map {
 	unsigned long paddr;
 	unsigned long vaddr;
 };
-extern struct vmemmap_hw_map *vmemmap_list;
+extern struct vmemmap_hw_map *vmemmap_global;
 
 /*
  * Functions that deal with pagetables that could be at any level of
diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index 0d90798..eb6876c 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -77,7 +77,7 @@ void arch_crash_save_vmcoreinfo(void)
 	VMCOREINFO_SYMBOL(contig_page_data);
 #endif
 #if defined(CONFIG_PPC64) && defined(CONFIG_SPARSEMEM_VMEMMAP)
-	VMCOREINFO_SYMBOL(vmemmap_list);
+	VMCOREINFO_SYMBOL(vmemmap_global);
 	VMCOREINFO_SYMBOL(mmu_vmemmap_psize);
 	VMCOREINFO_SYMBOL(mmu_psize_defs);
 	VMCOREINFO_STRUCT_SIZE(vmemmap_hw_map);
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 9b5dea3..d998f3f 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -245,45 +245,63 @@ static void vmemmap_remove_mapping(unsigned long start,
 
 #endif /* CONFIG_PPC_BOOK3E */
 
-struct vmemmap_hw_map *vmemmap_list;
-static struct vmemmap_hw_map *next;
-static int num_left;
-static int num_freed;
+/*
+ * The vmemmap virtual address space has no page table to track its
+ * existing physical mappings. The vmemmap_global list maintains the
+ * physical mappings at all times, whereas the vmemmap_avail list
+ * maintains the available vmemmap_hw_map structures which got deleted
+ * from the vmemmap_global list during system runtime (memory hotplug
+ * remove operation, for example). These freed structures are reused
+ * later when new requests come in, without allocating fresh memory.
+ * This pointer also tracks the allocated vmemmap_hw_map structures,
+ * as we allocate one full page of memory at a time when we have none.
+ */
+struct vmemmap_hw_map *vmemmap_global;
+static struct vmemmap_hw_map *vmemmap_avail;
+
+/* XXX: The same pointer vmemmap_avail tracks individual chunks inside
+ * the allocated full page during boot time and again tracks the
+ * freed nodes during runtime. This is racy, but the race never occurs
+ * because the two uses are separated by the boot process. It would be
+ * a problem if we somehow got a memory hotplug operation during boot!
+ */
+static int free_chunk;		/* Allocated chunks available */
+static int free_node;		/* Freed nodes available */
 
-static __meminit struct vmemmap_hw_map * vmemmap_list_alloc(int node)
+static __meminit struct vmemmap_hw_map * vmemmap_global_alloc(int node)
 {
 	struct vmemmap_hw_map *vmem_back;
 	/* get from freed entries first */
-	if (num_freed) {
-		num_freed--;
-		vmem_back = next;
-		next = next->link;
+	if (free_node) {
+		free_node--;
+		vmem_back = vmemmap_avail;
+		vmemmap_avail = vmemmap_avail->link;
 
 		return vmem_back;
 	}
 
 	/* allocate a page when required and hand out chunks */
-	if (!num_left) {
-		next = vmemmap_alloc_block(PAGE_SIZE, node);
-		if (unlikely(!next)) {
+	if (!free_chunk) {
+		vmemmap_avail = vmemmap_alloc_block(PAGE_SIZE, node);
+		if (unlikely(!vmemmap_avail)) {
 			WARN_ON(1);
 			return NULL;
 		}
-		num_left = PAGE_SIZE / sizeof(struct vmemmap_hw_map);
+		free_chunk = PAGE_SIZE / sizeof(struct vmemmap_hw_map);
 	}
 
-	num_left--;
+	free_chunk--;
 
-	return next++;
+	return vmemmap_avail++;
 }
 
-static __meminit void vmemmap_list_populate(unsigned long paddr,
+static __meminit void vmemmap_global_populate(unsigned long paddr,
 					    unsigned long start,
 					    int node)
 {
 	struct vmemmap_hw_map *vmem_back;
 
-	vmem_back = vmemmap_list_alloc(node);
+	vmem_back = vmemmap_global_alloc(node);
 	if (unlikely(!vmem_back)) {
 		WARN_ON(1);
 		return;
@@ -291,9 +309,9 @@ static __meminit void vmemmap_list_populate(unsigned long paddr,
 
 	vmem_back->paddr = paddr;
 	vmem_back->vaddr = start;
-	vmem_back->link = vmemmap_list;
+	vmem_back->link = vmemmap_global;
 
-	vmemmap_list = vmem_back;
+	vmemmap_global = vmem_back;
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
@@ -315,7 +333,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 		if (!p)
 			return -ENOMEM;
 
-		vmemmap_list_populate(__pa(p), start, node);
+		vmemmap_global_populate(__pa(p), start, node);
 
 		pr_debug("      * %016lx..%016lx allocated at %p\n",
 			 start, start + page_size, p);
@@ -327,11 +345,11 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-static unsigned long vmemmap_list_free(unsigned long start)
+static unsigned long vmemmap_global_free(unsigned long start)
 {
 	struct vmemmap_hw_map *vmem_back, *vmem_back_prev;
 
-	vmem_back_prev = vmem_back = vmemmap_list;
+	vmem_back_prev = vmem_back = vmemmap_global;
 
 	/* look for it with prev pointer recorded */
 	for (; vmem_back; vmem_back = vmem_back->link) {
@@ -345,16 +363,16 @@ static unsigned long vmemmap_list_free(unsigned long start)
 		return 0;
 	}
 
-	/* remove it from vmemmap_list */
-	if (vmem_back == vmemmap_list) /* remove head */
-		vmemmap_list = vmem_back->link;
+	/* remove it from vmemmap_global */
+	if (vmem_back == vmemmap_global) /* remove head */
+		vmemmap_global = vmem_back->link;
 	else
 		vmem_back_prev->link = vmem_back->link;
 
-	/* next point to this freed entry */
-	vmem_back->link = next;
-	next = vmem_back;
-	num_freed++;
+	/* vmemmap_avail points to this freed entry */
+	vmem_back->link = vmemmap_avail;
+	vmemmap_avail = vmem_back;
+	free_node++;
 
 	return vmem_back->paddr;
 }
@@ -378,7 +396,7 @@ void __ref vmemmap_free(unsigned long start, unsigned long end)
 		if (vmemmap_populated(start, page_size))
 			continue;
 
-		addr = vmemmap_list_free(start);
+		addr = vmemmap_global_free(start);
 		if (addr) {
 			struct page *page = pfn_to_page(addr >> PAGE_SHIFT);
 
@@ -432,11 +450,11 @@ struct page *realmode_pfn_to_page(unsigned long pfn)
 	unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
 	unsigned long pg_va = (unsigned long) pfn_to_page(pfn);
 
-	for (vmem_back = vmemmap_list; vmem_back; vmem_back = vmem_back->link) {
+	for (vmem_back = vmemmap_global; vmem_back; vmem_back = vmem_back->link) {
 		if (pg_va < vmem_back->vaddr)
 			continue;
 
-		/* After vmemmap_list entry free is possible, need check all */
+		/* After vmemmap_global entry free is possible, need check all */
 		if ((pg_va + sizeof(struct page)) <=
 				(vmem_back->vaddr + page_size)) {
 			page = (struct page *) (vmem_back->paddr + pg_va -
-- 
2.1.0
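
To summarise the recycling scheme the comments above describe (an
abstracted walkthrough of the patch, not additional code):

	/*
	 * Lifecycle of a vmemmap_hw_map entry under this scheme:
	 *
	 * 1. vmemmap_global_alloc() bump-allocates entries out of a page
	 *    obtained from vmemmap_alloc_block(); free_chunk counts what
	 *    is left of that page.
	 * 2. vmemmap_global_populate() fills in paddr/vaddr and pushes
	 *    the entry onto the vmemmap_global list.
	 * 3. vmemmap_global_free() unlinks the entry and pushes it onto
	 *    the vmemmap_avail free list; free_node counts these.
	 * 4. A later vmemmap_global_alloc() pops from vmemmap_avail
	 *    before carving any new chunks.
	 */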

* Re: [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout
  2016-02-17 12:12 ` [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout Anshuman Khandual
@ 2016-02-17 14:49   ` Aneesh Kumar K.V
  2016-02-18 14:22   ` Michael Ellerman
  1 sibling, 0 replies; 15+ messages in thread
From: Aneesh Kumar K.V @ 2016-02-17 14:49 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev

Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> Add some explanation of the layout of the vmemmap virtual address
> space and how the physical page mapping is only used for valid PFNs
> present at any point on the system.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>


> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 41 ++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 8d1c41d..9db4a86 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -26,6 +26,47 @@
>  #define IOREMAP_BASE	(PHB_IO_END)
>  #define IOREMAP_END	(KERN_VIRT_START + KERN_VIRT_SIZE)
>
> +/*
> + * Starting address of the virtual address space where all page structs
> + * for the system physical memory are stored under the vmemmap sparse
> + * memory model. All possible struct pages are logically stored in a
> + * sequence in this virtual address space irrespective of the fact
> + * whether any given PFN is valid or even the memory section is valid
> + * or not. During boot and memory hotplug add operation when new memory
> + * sections are added, real physical allocation and hash table bolting
> + * will be performed. This saves precious physical memory when the system
> + * really does not have valid PFNs in some address ranges.
> + *
> + *  vmemmap +--------------+
> + *     +    |  page struct +----------+  PFN is valid
> + *     |    +--------------+          |
> + *     |    |  page struct |          |  PFN is invalid
> + *     |    +--------------+          |
> + *     |    |  page struct +------+   |
> + *     |    +--------------+      |   |
> + *     |    |  page struct |      |   |
> + *     |    +--------------+      |   |
> + *     |    |  page struct |      |   |
> + *     |    +--------------+      |   |
> + *     |    |  page struct +--+   |   |
> + *     |    +--------------+  |   |   |
> + *     |    |  page struct |  |   |   |       +-------------+
> + *     |    +--------------+  |   |   +-----> |     PFN     |
> + *     |    |  page struct |  |   |           +-------------+
> + *     |    +--------------+  |   +---------> |     PFN     |
> + *     |    |  page struct |  |               +-------------+
> + *     |    +--------------+  +-------------> |     PFN     |
> + *     |    |  page struct |                  +-------------+
> + *     |    +--------------+           +----> |     PFN     |
> + *     |    |  page struct |           |      +-------------+
> + *     |    +--------------+           |    Bolted in hash table
> + *     |    |  page struct +-----------+
> + *     v    +--------------+
> + *
> + * The VMEMMAP_BASE (0xf000000000000000) region has a total range of 64TB,
> + * but only NR_MEM_SECTIONS * PAGES_PER_SECTION * sizeof(page struct) of
> + * that virtual memory is actually used.
> + */
>  #define vmemmap			((struct page *)VMEMMAP_BASE)
>
>  /* Advertise special mapping type for AGP */
> -- 
> 2.1.0

* Re: [RFC 3/4] powerpc/mm: Rename the vmemmap_backing struct and its elements
  2016-02-17 12:12 ` [RFC 3/4] powerpc/mm: Rename the vmemmap_backing struct and its elements Anshuman Khandual
@ 2016-02-17 14:52   ` Aneesh Kumar K.V
  2016-02-18 14:23     ` Michael Ellerman
  0 siblings, 1 reply; 15+ messages in thread
From: Aneesh Kumar K.V @ 2016-02-17 14:52 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev

Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> The structure that tracks a single virtual to physical mapping has
> been renamed from vmemmap_backing to vmemmap_hw_map, which sounds
> more appropriate. It forms a single entry of the global linked
> list tracking all of the vmemmap physical mappings. The changes
> are as follows.
>
> 	vmemmap_backing.list -> vmemmap_hw_map.link
> 	vmemmap_backing.phys -> vmemmap_hw_map.paddr
> 	vmemmap_backing.virt_addr -> vmemmap_hw_map.vaddr
>

I am not sure this helps. If we are going to take these renames, can you
wait till the book3s p9 preparation patches [1] hit upstream?

[1] http://mid.gmane.org/1454923241-6681-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com
-aneesh

* Re: [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout
  2016-02-17 12:12 ` [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout Anshuman Khandual
  2016-02-17 14:49   ` Aneesh Kumar K.V
@ 2016-02-18 14:22   ` Michael Ellerman
  2016-02-19  5:15     ` Anshuman Khandual
  1 sibling, 1 reply; 15+ messages in thread
From: Michael Ellerman @ 2016-02-18 14:22 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev; +Cc: aneesh.kumar

On Wed, 2016-02-17 at 17:42 +0530, Anshuman Khandual wrote:

> Add some explanation of the layout of the vmemmap virtual address
> space and how the physical page mapping is only used for valid PFNs
> present at any point on the system.
> 
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 41 ++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 8d1c41d..9db4a86 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -26,6 +26,47 @@
>  #define IOREMAP_BASE	(PHB_IO_END)
>  #define IOREMAP_END	(KERN_VIRT_START + KERN_VIRT_SIZE)
>  
> +/*
> + * Starting address of the virtual address space where all page structs

This is so far from the variable it's referring to that it's not clear what it
refers to. So you should say "vmemmap is the starting ..."

> + * for the system physical memory are stored under the vmemmap sparse
                                              ^
					      , when using the SPARSEMEM_VMEMMAP
> + * memory model. All possible struct pages are logically stored in a
> + * sequence in this virtual address space irrespective of the fact
> + * whether any given PFN is valid or even the memory section is valid
> + * or not.

I know what you mean but I think that could be worded better. But it's too late
for me to reword it :)

The key point is that we allocate space for a page struct for each PFN that
could be present in the system, including holes in the address space (hence
sparse). That has the nice property of meaning there is a constant relationship
between the address of a struct page and its PFN.
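
For reference, the generic SPARSEMEM_VMEMMAP helpers in
include/asm-generic/memory_model.h encode that constant relationship
directly:

	/* memmap is virtually contiguous.  */
	#define __pfn_to_page(pfn)	(vmemmap + (pfn))
	#define __page_to_pfn(page)	(unsigned long)((page) - vmemmap)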

> + * During boot and memory hotplug add operation when new memory
                  ^                               ^
		  or				  ,
> + * sections are added, real physical allocation and hash table bolting
                                                  ^
						  of struct pages

> + * will be performed. This saves precious physical memory when the system
> + * really does not have valid PFNs in some address ranges.


> + *
> + *  vmemmap +--------------+
> + *     +    |  page struct +----------+  PFN is valid
> + *     |    +--------------+          |
> + *     |    |  page struct |          |  PFN is invalid
> + *     |    +--------------+          |
> + *     |    |  page struct +------+   |
> + *     |    +--------------+      |   |
> + *     |    |  page struct |      |   |
> + *     |    +--------------+      |   |
> + *     |    |  page struct |      |   |
> + *     |    +--------------+      |   |
> + *     |    |  page struct +--+   |   |
> + *     |    +--------------+  |   |   |
> + *     |    |  page struct |  |   |   |       +-------------+
> + *     |    +--------------+  |   |   +-----> |     PFN     |
> + *     |    |  page struct |  |   |           +-------------+
> + *     |    +--------------+  |   +---------> |     PFN     |
> + *     |    |  page struct |  |               +-------------+
> + *     |    +--------------+  +-------------> |     PFN     |
> + *     |    |  page struct |                  +-------------+
> + *     |    +--------------+           +----> |     PFN     |
> + *     |    |  page struct |           |      +-------------+
> + *     |    +--------------+           |    Bolted in hash table
> + *     |    |  page struct +-----------+
> + *     v    +--------------+


The things on the right are not PFNs, they're struct pages. Each one
corresponds to a PFN, but that relationship is derived from the vmemmap layout,
not the physical layout.

I think it's more like:

	f000000000000000		  c000000000000000 (and also 0x0)
vmemmap +--------------+                  +--------------+
   +    |  page struct | +--------------> |  page struct |
   |    +--------------+                  +--------------+
   |    |  page struct | +--------------> |  page struct |
   |    +--------------+ |                +--------------+
   |    |  page struct | +       +------> |  page struct |
   |    +--------------+         |        +--------------+
   |    |  page struct |         |   +--> |  page struct |
   |    +--------------+         |   |    +--------------+
   |    |  page struct |         |   |
   |    +--------------+         |   |
   |    |  page struct |         |   |
   |    +--------------+         |   |
   |    |  page struct |         |   |
   |    +--------------+         |   |
   |    |  page struct |         |   |
   |    +--------------+         |   |
   |    |  page struct | +-------+   |
   |    +--------------+             |
   |    |  page struct | +-----------+
   |    +--------------+
   |    |  page struct | No mapping
   |    +--------------+
   |    |  page struct | No mapping
   v    +--------------+



Then there's the relationship between struct pages and PFNs:


                           page_to_pfn
                           +--------->
  vmemmap +--------------+             +-------------+
     +    |  page struct | +---------> |     PFN     |
     |    +--------------+             +-------------+
     |    |  page struct | +---------> |     PFN     |
     |    +--------------+             +-------------+
     |    |  page struct | +---------> |     PFN     |
     |    +--------------+             +-------------+
     |    |  page struct | +---------> |     PFN     |
     |    +--------------+             +-------------+
     |    |              |
     |    +--------------+
     |    |              |
     |    +--------------+
     |    |              |
     |    +--------------+             +-------------+
     |    |  page struct | +---------> |     PFN     |
     |    +--------------+             +-------------+
     |    |              |
     |    +--------------+
     |    |              |
     |    +--------------+             +-------------+
     |    |  page struct | +---------> |     PFN     |
     |    +--------------+             +-------------+
     |    |  page struct | +---------> |     PFN     |
     v    +--------------+             +-------------+



cheers

* Re: [RFC 3/4] powerpc/mm: Rename the vmemmap_backing struct and its elements
  2016-02-17 14:52   ` Aneesh Kumar K.V
@ 2016-02-18 14:23     ` Michael Ellerman
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Ellerman @ 2016-02-18 14:23 UTC (permalink / raw)
  To: Aneesh Kumar K.V, Anshuman Khandual, linuxppc-dev

On Wed, 2016-02-17 at 20:22 +0530, Aneesh Kumar K.V wrote:

> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
>

> > The structure that tracks a single virtual to physical mapping has
> > been renamed from vmemmap_backing to vmemmap_hw_map, which sounds
> > more appropriate. It forms a single entry of the global linked
> > list tracking all of the vmemmap physical mappings. The changes
> > are as follows.
> >
> > 	vmemmap_backing.list -> vmemmap_hw_map.link
> > 	vmemmap_backing.phys -> vmemmap_hw_map.paddr
> > 	vmemmap_backing.virt_addr -> vmemmap_hw_map.vaddr
>
> I am not sure this helps. If we are going to take these renames, can you
> wait till the book3s p9 preparation patches [1] hit upstream?

I don't see why the new names are any better, and it's a lot of churn for
minimal if any benefit. So I'm not going to take this one.

cheers

* Re: [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section
  2016-02-17 12:12 [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section Anshuman Khandual
                   ` (2 preceding siblings ...)
  2016-02-17 12:12 ` [RFC 4/4] powerpc/mm: Rename global tracker for virtual to physical mapping Anshuman Khandual
@ 2016-02-18 14:34 ` Michael Ellerman
  2016-02-19  4:50   ` Anshuman Khandual
  3 siblings, 1 reply; 15+ messages in thread
From: Michael Ellerman @ 2016-02-18 14:34 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev; +Cc: aneesh.kumar

On Wed, 2016-02-17 at 17:42 +0530, Anshuman Khandual wrote:

> The commit (16a05bff1: powerpc: start loop at section start of
> start in vmemmap_populated()) reused the 'start' variable to compute
> the starting address of the memory section where the given address
> belongs. Then the same variable is used to iterate over the starting
> addresses of all memory sections before reaching the 'end' address.
> Renaming it to 'section_start' makes the logic clearer.
> 
> Fixes: 16a05bff1 ("powerpc: start loop at section start of start in vmemmap_populated()")

It's not a fix, just a cleanup. Fixes lines should be reserved for actual bug
fixes.

> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/init_64.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index 379a6a9..d6b9b4d 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -170,11 +170,15 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
>   */
>  static int __meminit vmemmap_populated(unsigned long start, int page_size)
>  {
> -	unsigned long end = start + page_size;
> -	start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
> +	unsigned long end, section_start;
>  
> -	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
> -		if (pfn_valid(page_to_pfn((struct page *)start)))
> +	end = start + page_size;
> +	section_start = (unsigned long)(pfn_to_page
> +					(vmemmap_section_start(start)));
> +
> +	for (; section_start < end; section_start
> +				+= (PAGES_PER_SECTION * sizeof(struct page)))
> +		if (pfn_valid(page_to_pfn((struct page *)section_start)))
>  			return 1;
>  
>  	return 0;

That's not a big improvement.

But I think this code could be improved. There are a lot of casts, and it seems
to be confused about whether it's iterating over addresses or struct pages.
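
For illustration, a rework that converts to pfn space once and then
stays there could look something like this (an untested sketch, not
the patch under review):

	static int __meminit vmemmap_populated(unsigned long start, int page_size)
	{
		/* section-start pfns for the first and last byte of the range */
		unsigned long pfn = vmemmap_section_start(start);
		unsigned long last_pfn = vmemmap_section_start(start + page_size - 1);

		for (; pfn <= last_pfn; pfn += PAGES_PER_SECTION)
			if (pfn_valid(pfn))
				return 1;

		return 0;
	}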

cheers

* Re: [RFC 4/4] powerpc/mm: Rename global tracker for virtual to physical mapping
  2016-02-17 12:12 ` [RFC 4/4] powerpc/mm: Rename global tracker for virtual to physical mapping Anshuman Khandual
@ 2016-02-18 14:37   ` Michael Ellerman
  2016-02-19  4:54     ` Anshuman Khandual
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Ellerman @ 2016-02-18 14:37 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev; +Cc: aneesh.kumar

On Wed, 2016-02-17 at 17:42 +0530, Anshuman Khandual wrote:

> This renames the global list which tracks all the virtual to physical
> mappings and also the global list which tracks all the available unused
> vmemmap_hw_map node structures.

But why? Why are the new names *so* much better that we would want to go
through all this churn?

> It also attempts to explain the purpose
> of these global linked lists and points out a possible race condition.

I'm happy to take the comments.

cheers

* Re: [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section
  2016-02-18 14:34 ` [RFC 1/4] powerpc/mm: Rename variable to reflect start address of a section Michael Ellerman
@ 2016-02-19  4:50   ` Anshuman Khandual
  0 siblings, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2016-02-19  4:50 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: aneesh.kumar

On 02/18/2016 08:04 PM, Michael Ellerman wrote:
> On Wed, 2016-02-17 at 17:42 +0530, Anshuman Khandual wrote:
> 
>> The commit (16a05bff1: powerpc: start loop at section start of
>> start in vmemmap_populated()) reused the 'start' variable to compute
>> the starting address of the memory section where the given address
>> belongs. Then the same variable is used to iterate over the starting
>> addresses of all memory sections before reaching the 'end' address.
>> Renaming it to 'section_start' makes the logic clearer.
>>
>> Fixes: 16a05bff1 ("powerpc: start loop at section start of start in vmemmap_populated()")
> 
> It's not a fix, just a cleanup. Fixes lines should be reserved for actual bug
> fixes.

Sure, got it.

> 
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/mm/init_64.c | 12 ++++++++----
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
>> index 379a6a9..d6b9b4d 100644
>> --- a/arch/powerpc/mm/init_64.c
>> +++ b/arch/powerpc/mm/init_64.c
>> @@ -170,11 +170,15 @@ static unsigned long __meminit vmemmap_section_start(unsigned long page)
>>   */
>>  static int __meminit vmemmap_populated(unsigned long start, int page_size)
>>  {
>> -	unsigned long end = start + page_size;
>> -	start = (unsigned long)(pfn_to_page(vmemmap_section_start(start)));
>> +	unsigned long end, section_start;
>>  
>> -	for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
>> -		if (pfn_valid(page_to_pfn((struct page *)start)))
>> +	end = start + page_size;
>> +	section_start = (unsigned long)(pfn_to_page
>> +					(vmemmap_section_start(start)));
>> +
>> +	for (; section_start < end; section_start
>> +				+= (PAGES_PER_SECTION * sizeof(struct page)))
>> +		if (pfn_valid(page_to_pfn((struct page *)section_start)))
>>  			return 1;
>>  
>>  	return 0;
> 
> That's not a big improvement.
> 
> But I think this code could be improved. There are a lot of casts, and it seems
> to be confused about whether it's iterating over addresses or struct pages.

Right, this patch just tries to clear up such confusion. That's all.

* Re: [RFC 4/4] powerpc/mm: Rename global tracker for virtual to physical mapping
  2016-02-18 14:37   ` Michael Ellerman
@ 2016-02-19  4:54     ` Anshuman Khandual
  2016-02-19 10:30       ` Michael Ellerman
  0 siblings, 1 reply; 15+ messages in thread
From: Anshuman Khandual @ 2016-02-19  4:54 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: aneesh.kumar

On 02/18/2016 08:07 PM, Michael Ellerman wrote:
> On Wed, 2016-02-17 at 17:42 +0530, Anshuman Khandual wrote:
> 
>> This renames the global list which tracks all the virtual to physical
>> mappings and also the global list which tracks all the available unused
>> vmemmap_hw_map node structures.
> 
> But why? Why are the new names *so* much better that we would want to go
> through all this churn?

Hmm, okay. It's kind of subjective but then it's up to you.

> 
>> It also attempts to explain the purpose
>> of these global linked lists and points out a possible race condition.
> 
> I'm happy to take the comments.

Sure, will send across next time around separately.

* Re: [RFC 2/4] powerpc/mm: Add comments to the vmemmap layout
  2016-02-18 14:22   ` Michael Ellerman
@ 2016-02-19  5:15     ` Anshuman Khandual
  0 siblings, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2016-02-19  5:15 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: aneesh.kumar

On 02/18/2016 07:52 PM, Michael Ellerman wrote:
> On Wed, 2016-02-17 at 17:42 +0530, Anshuman Khandual wrote:
> 
>> Add some explanation of the layout of the vmemmap virtual address
>> space and how the physical page mapping is only used for valid PFNs
>> present at any point on the system.
>>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/include/asm/book3s/64/pgtable.h | 41 ++++++++++++++++++++++++++++
>>  1 file changed, 41 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index 8d1c41d..9db4a86 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -26,6 +26,47 @@
>>  #define IOREMAP_BASE	(PHB_IO_END)
>>  #define IOREMAP_END	(KERN_VIRT_START + KERN_VIRT_SIZE)
>>  
>> +/*
>> + * Starting address of the virtual address space where all page structs
> 
> This is so far from the variable it's referring to that it's not clear what it
> refers to. So you should say "vmemmap is the starting ..."
> 
>> + * for the system physical memory are stored under the vmemmap sparse
>                                               ^
> 					      , when using the SPARSEMEM_VMEMMAP
>> + * memory model. All possible struct pages are logically stored in a
>> + * sequence in this virtual address space irrespective of the fact
>> + * whether any given PFN is valid or even the memory section is valid
>> + * or not.
> 
> I know what you mean but I think that could be worded better. But it's too late
> for me to reword it :)
> 
> The key point is that we allocate space for a page struct for each PFN that
> could be present in the system, including holes in the address space (hence
> sparse). That has the nice property of meaning there is a constant relationship
> between the address of a struct page and its PFN.
> 
>> + * During boot and memory hotplug add operation when new memory
>                   ^                               ^
> 		  or				  ,
>> + * sections are added, real physical allocation and hash table bolting
>                                                   ^
> 						  of struct pages
> 
>> + * will be performed. This saves precious physical memory when the system
>> + * really does not have valid PFNs in some address ranges.
> 
> 
>> + *
>> + *  vmemmap +--------------+
>> + *     +    |  page struct +----------+  PFN is valid
>> + *     |    +--------------+          |
>> + *     |    |  page struct |          |  PFN is invalid
>> + *     |    +--------------+          |
>> + *     |    |  page struct +------+   |
>> + *     |    +--------------+      |   |
>> + *     |    |  page struct |      |   |
>> + *     |    +--------------+      |   |
>> + *     |    |  page struct |      |   |
>> + *     |    +--------------+      |   |
>> + *     |    |  page struct +--+   |   |
>> + *     |    +--------------+  |   |   |
>> + *     |    |  page struct |  |   |   |       +-------------+
>> + *     |    +--------------+  |   |   +-----> |     PFN     |
>> + *     |    |  page struct |  |   |           +-------------+
>> + *     |    +--------------+  |   +---------> |     PFN     |
>> + *     |    |  page struct |  |               +-------------+
>> + *     |    +--------------+  +-------------> |     PFN     |
>> + *     |    |  page struct |                  +-------------+
>> + *     |    +--------------+           +----> |     PFN     |
>> + *     |    |  page struct |           |      +-------------+
>> + *     |    +--------------+           |    Bolted in hash table
>> + *     |    |  page struct +-----------+
>> + *     v    +--------------+
> 
> 
> The things on the right are not PFNs, they're struct pages. Each one
> corresponds to a PFN, but that relationship is derived from the vmemmap layout,
> not the physical layout.
> 
> I think it's more like:
> 
> 	f000000000000000		  c000000000000000 (and also 0x0)
> vmemmap +--------------+                  +--------------+
>    +    |  page struct | +--------------> |  page struct |
>    |    +--------------+                  +--------------+
>    |    |  page struct | +--------------> |  page struct |
>    |    +--------------+ |                +--------------+
>    |    |  page struct | +       +------> |  page struct |
>    |    +--------------+         |        +--------------+
>    |    |  page struct |         |   +--> |  page struct |
>    |    +--------------+         |   |    +--------------+
>    |    |  page struct |         |   |
>    |    +--------------+         |   |
>    |    |  page struct |         |   |
>    |    +--------------+         |   |
>    |    |  page struct |         |   |
>    |    +--------------+         |   |
>    |    |  page struct |         |   |
>    |    +--------------+         |   |
>    |    |  page struct | +-------+   |
>    |    +--------------+             |
>    |    |  page struct | +-----------+
>    |    +--------------+
>    |    |  page struct | No mapping
>    |    +--------------+
>    |    |  page struct | No mapping
>    v    +--------------+
> 
> 
> 
> Then there's the relationship between struct pages and PFNs:
> 
> 
>                            page_to_pfn
>                            +--------->
>   vmemmap +--------------+             +-------------+
>      +    |  page struct | +---------> |     PFN     |
>      |    +--------------+             +-------------+
>      |    |  page struct | +---------> |     PFN     |
>      |    +--------------+             +-------------+
>      |    |  page struct | +---------> |     PFN     |
>      |    +--------------+             +-------------+
>      |    |  page struct | +---------> |     PFN     |
>      |    +--------------+             +-------------+
>      |    |              |
>      |    +--------------+
>      |    |              |
>      |    +--------------+
>      |    |              |
>      |    +--------------+             +-------------+
>      |    |  page struct | +---------> |     PFN     |
>      |    +--------------+             +-------------+
>      |    |              |
>      |    +--------------+
>      |    |              |
>      |    +--------------+             +-------------+
>      |    |  page struct | +---------> |     PFN     |
>      |    +--------------+             +-------------+
>      |    |  page struct | +---------> |     PFN     |
>      v    +--------------+             +-------------+

Awesome, this conveys the message better. Will change it next time
around. Thanks!

* Re: [RFC 4/4] powerpc/mm: Rename global tracker for virtual to physical mapping
  2016-02-19  4:54     ` Anshuman Khandual
@ 2016-02-19 10:30       ` Michael Ellerman
  2016-02-19 11:02         ` Anshuman Khandual
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Ellerman @ 2016-02-19 10:30 UTC (permalink / raw)
  To: Anshuman Khandual, linuxppc-dev; +Cc: aneesh.kumar

On Fri, 2016-02-19 at 10:24 +0530, Anshuman Khandual wrote:
> On 02/18/2016 08:07 PM, Michael Ellerman wrote:
> > On Wed, 2016-02-17 at 17:42 +0530, Anshuman Khandual wrote:
> > 
> > > This renames the global list which tracks all the virtual to physical
> > > mappings and also the global list which tracks all the available unused
> > > vmemmap_hw_map node structures.
> > 
> > But why? Why are the new names *so* much better that we would want to go
> > through all this churn?
> 
> Hmm, okay. It's kind of subjective but then it's up to you.

Yeah it is of course. I'm not saying your names aren't better, but they're not
obviously better to me, and so it's a lot of code churn for not much benefit
IMHO.

But you can try and convince me if you really feel strongly about it.

> > > It also attempts to explain the purpose
> > > of these global linked lists and points out a possible race condition.
> > 
> > I'm happy to take the comments.
> 
> Sure, will send across next time around separately.

Thanks.

cheers

* Re: [RFC 4/4] powerpc/mm: Rename global tracker for virtual to physical mapping
  2016-02-19 10:30       ` Michael Ellerman
@ 2016-02-19 11:02         ` Anshuman Khandual
  0 siblings, 0 replies; 15+ messages in thread
From: Anshuman Khandual @ 2016-02-19 11:02 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: aneesh.kumar

On 02/19/2016 04:00 PM, Michael Ellerman wrote:
> On Fri, 2016-02-19 at 10:24 +0530, Anshuman Khandual wrote:
>> > On 02/18/2016 08:07 PM, Michael Ellerman wrote:
>>> > > On Wed, 2016-02-17 at 17:42 +0530, Anshuman Khandual wrote:
>>> > > 
>>>> > > > This renames the global list which tracks all the virtual to physical
>>>> > > > mappings and also the global list which tracks all the available unused
>>>> > > > vmemmap_hw_map node structures.
>>> > > 
>>> > > But why? Why are the new names *so* much better that we would want to go
>>> > > through all this churn?
>> > 
>> > Hmm, okay. It's kind of subjective but then it's up to you.
> Yeah it is of course. I'm not saying your names aren't better, but they're not
> obviously better to me, and so it's a lot of code churn for not much benefit
> IMHO.
> 
> But you can try and convince me if you really feel strongly about it.

Agreed, it's not worth the churn. Thanks!
