LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH v2 05/17] h8300, nds32, openrisc: simplify detection of memory extents
From: Mike Rapoport @ 2020-08-02 16:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Emil Renner Berthing, linux-sh, Peter Zijlstra,
	Dave Hansen, linux-mips, Max Filippov, Paul Mackerras, sparclinux,
	linux-riscv, Will Deacon, Christoph Hellwig, Marek Szyprowski,
	linux-arch, linux-s390, linux-c6x-dev, Baoquan He, x86,
	Russell King, Mike Rapoport, clang-built-linux, Ingo Molnar,
	linux-arm-kernel, Catalin Marinas, uclinux-h8-devel, linux-xtensa,
	openrisc, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Stafford Horne, Hari Bathini, Michal Simek, Yoshinori Sato,
	linux-mm, linux-kernel, iommu, Palmer Dabbelt, linuxppc-dev,
	Mike Rapoport
In-Reply-To: <20200802163601.8189-1-rppt@kernel.org>

From: Mike Rapoport <rppt@linux.ibm.com>

Instead of traversing memblock.memory regions to find memory_start and
memory_end, simply query memblock_{start,end}_of_DRAM().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Stafford Horne <shorne@gmail.com>
---
 arch/h8300/kernel/setup.c    | 8 +++-----
 arch/nds32/kernel/setup.c    | 8 ++------
 arch/openrisc/kernel/setup.c | 9 ++-------
 3 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/arch/h8300/kernel/setup.c b/arch/h8300/kernel/setup.c
index 28ac88358a89..0281f92eea3d 100644
--- a/arch/h8300/kernel/setup.c
+++ b/arch/h8300/kernel/setup.c
@@ -74,17 +74,15 @@ static void __init bootmem_init(void)
 	memory_end = memory_start = 0;
 
 	/* Find main memory where is the kernel */
-	for_each_memblock(memory, region) {
-		memory_start = region->base;
-		memory_end = region->base + region->size;
-	}
+	memory_start = memblock_start_of_DRAM();
+	memory_end = memblock_end_of_DRAM();
 
 	if (!memory_end)
 		panic("No memory!");
 
 	/* setup bootmem globals (we use no_bootmem, but mm still depends on this) */
 	min_low_pfn = PFN_UP(memory_start);
-	max_low_pfn = PFN_DOWN(memblock_end_of_DRAM());
+	max_low_pfn = PFN_DOWN(memory_end);
 	max_pfn = max_low_pfn;
 
 	memblock_reserve(__pa(_stext), _end - _stext);
diff --git a/arch/nds32/kernel/setup.c b/arch/nds32/kernel/setup.c
index a066efbe53c0..c356e484dcab 100644
--- a/arch/nds32/kernel/setup.c
+++ b/arch/nds32/kernel/setup.c
@@ -249,12 +249,8 @@ static void __init setup_memory(void)
 	memory_end = memory_start = 0;
 
 	/* Find main memory where is the kernel */
-	for_each_memblock(memory, region) {
-		memory_start = region->base;
-		memory_end = region->base + region->size;
-		pr_info("%s: Memory: 0x%x-0x%x\n", __func__,
-			memory_start, memory_end);
-	}
+	memory_start = memblock_start_of_DRAM();
+	memory_end = memblock_end_of_DRAM();
 
 	if (!memory_end) {
 		panic("No memory!");
diff --git a/arch/openrisc/kernel/setup.c b/arch/openrisc/kernel/setup.c
index 8aa438e1f51f..c5706153d3b6 100644
--- a/arch/openrisc/kernel/setup.c
+++ b/arch/openrisc/kernel/setup.c
@@ -48,17 +48,12 @@ static void __init setup_memory(void)
 	unsigned long ram_start_pfn;
 	unsigned long ram_end_pfn;
 	phys_addr_t memory_start, memory_end;
-	struct memblock_region *region;
 
 	memory_end = memory_start = 0;
 
 	/* Find main memory where is the kernel, we assume its the only one */
-	for_each_memblock(memory, region) {
-		memory_start = region->base;
-		memory_end = region->base + region->size;
-		printk(KERN_INFO "%s: Memory: 0x%x-0x%x\n", __func__,
-		       memory_start, memory_end);
-	}
+	memory_start = memblock_start_of_DRAM();
+	memory_end = memblock_end_of_DRAM();
 
 	if (!memory_end) {
 		panic("No memory!");
-- 
2.26.2
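
As a rough user-space illustration of patch 05/17, the sketch below
compares what the removed loops computed with what the two memblock
helpers are assumed to return. The struct and function names are
hypothetical stand-ins, not kernel API; the model assumes regions are
sorted by base address and that an empty table yields 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for one memblock.memory entry. */
struct mem_region {
	uint64_t base;
	uint64_t size;
};

/* Models memblock_start_of_DRAM(): base of the first (lowest) region. */
static uint64_t start_of_dram(const struct mem_region *r, size_t n)
{
	return n ? r[0].base : 0;
}

/* Models memblock_end_of_DRAM(): one past the end of the last region. */
static uint64_t end_of_dram(const struct mem_region *r, size_t n)
{
	return n ? r[n - 1].base + r[n - 1].size : 0;
}

/* What the removed loops computed: the extents of the last region only. */
static void last_region_extents(const struct mem_region *r, size_t n,
				uint64_t *start, uint64_t *end)
{
	*start = *end = 0;
	for (size_t i = 0; i < n; i++) {
		*start = r[i].base;
		*end = r[i].base + r[i].size;
	}
}
```

With a single region, which the openrisc comment says is assumed, both
approaches give the same extents. In this model they differ only in the
start address when more than one region is present.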



* [PATCH v2 04/17] arm64: numa: simplify dummy_numa_init()
From: Mike Rapoport @ 2020-08-02 16:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Emil Renner Berthing, linux-sh, Peter Zijlstra,
	Dave Hansen, linux-mips, Max Filippov, Paul Mackerras, sparclinux,
	linux-riscv, Will Deacon, Christoph Hellwig, Marek Szyprowski,
	linux-arch, linux-s390, linux-c6x-dev, Baoquan He, x86,
	Russell King, Mike Rapoport, clang-built-linux, Ingo Molnar,
	linux-arm-kernel, Catalin Marinas, uclinux-h8-devel, linux-xtensa,
	openrisc, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Stafford Horne, Hari Bathini, Michal Simek, Yoshinori Sato,
	linux-mm, linux-kernel, iommu, Palmer Dabbelt, Jonathan Cameron,
	linuxppc-dev, Mike Rapoport
In-Reply-To: <20200802163601.8189-1-rppt@kernel.org>

From: Mike Rapoport <rppt@linux.ibm.com>

dummy_numa_init() loops over memblock.memory and passes nid=0 to
numa_add_memblk(), which essentially wraps memblock_set_node(). However,
memblock_set_node() can cope with the entire memory span itself, so the
loop over memblock.memory regions is redundant.

Using a single call to memblock_set_node() rather than a loop also fixes an
issue with buggy ACPI firmware in which the SRAT table covers some but
not all of the memory in the EFI memory map.

Jonathan Cameron says:

  This issue can be easily triggered by having an SRAT table which fails
  to cover all elements of the EFI memory map.

  This firmware error is detected and a warning printed. e.g.
  "NUMA: Warning: invalid memblk node 64 [mem 0x240000000-0x27fffffff]"
  At that point we fall back to dummy_numa_init().

  However, the failed ACPI init has left us with our memblocks all broken
  up as we split them when trying to assign them to NUMA nodes.

  We then iterate over the memblocks and add them to node 0.

  numa_add_memblk() calls memblock_set_node() which merges regions that
  were previously split up during the earlier attempt to add them to different
  nodes during parsing of SRAT.

  This means elements are moved in the memblock array and we can end up
  in a different memblock after the call to numa_add_memblk().
  Result is:

  Unable to handle kernel paging request at virtual address 0000000000003a40
  Mem abort info:
    ESR = 0x96000004
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
  Data abort info:
    ISV = 0, ISS = 0x00000004
    CM = 0, WnR = 0
  [0000000000003a40] user address but active_mm is swapper
  Internal error: Oops: 96000004 [#1] PREEMPT SMP

  ...

  Call trace:
    sparse_init_nid+0x5c/0x2b0
    sparse_init+0x138/0x170
    bootmem_init+0x80/0xe0
    setup_arch+0x2a0/0x5fc
    start_kernel+0x8c/0x648

Replace the loop with a single call to numa_add_memblk(), and thus
memblock_set_node(), covering the entire memory span.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/mm/numa.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/numa.c b/arch/arm64/mm/numa.c
index aafcee3e3f7e..0cbdbcc885fb 100644
--- a/arch/arm64/mm/numa.c
+++ b/arch/arm64/mm/numa.c
@@ -423,19 +423,16 @@ static int __init numa_init(int (*init_func)(void))
  */
 static int __init dummy_numa_init(void)
 {
+	phys_addr_t start = memblock_start_of_DRAM();
+	phys_addr_t end = memblock_end_of_DRAM();
 	int ret;
-	struct memblock_region *mblk;
 
 	if (numa_off)
 		pr_info("NUMA disabled\n"); /* Forced off on command line. */
-	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n",
-		memblock_start_of_DRAM(), memblock_end_of_DRAM() - 1);
-
-	for_each_memblock(memory, mblk) {
-		ret = numa_add_memblk(0, mblk->base, mblk->base + mblk->size);
-		if (!ret)
-			continue;
+	pr_info("Faking a node at [mem %#018Lx-%#018Lx]\n", start, end - 1);
 
+	ret = numa_add_memblk(0, start, end);
+	if (ret) {
 		pr_err("NUMA init failed\n");
 		return ret;
 	}
-- 
2.26.2
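
The hazard described in patch 04/17 can be sketched in user space. The
model below is a simplification and not the memblock implementation: all
names are hypothetical, and it assumes region boundaries already line up
with the span passed in, so no splitting is modelled. It shows only that
setting a node id merges neighbours and shifts later array entries.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 8

/* Hypothetical stand-ins for memblock regions and their table. */
struct region {
	uint64_t base;
	uint64_t size;
	int nid;
};

struct region_table {
	struct region r[MAX_REGIONS];
	size_t cnt;
};

/* Merge neighbours that touch and share a node id, shifting entries down. */
static void merge_regions(struct region_table *t)
{
	size_t i = 0;

	while (i + 1 < t->cnt) {
		struct region *a = &t->r[i];
		struct region *b = &t->r[i + 1];

		if (a->base + a->size == b->base && a->nid == b->nid) {
			a->size += b->size;
			for (size_t j = i + 1; j + 1 < t->cnt; j++)
				t->r[j] = t->r[j + 1];
			t->cnt--;
		} else {
			i++;
		}
	}
}

/* Assign nid to every region fully inside [start, end), then merge. */
static void set_node(struct region_table *t, uint64_t start, uint64_t end,
		     int nid)
{
	for (size_t i = 0; i < t->cnt; i++)
		if (t->r[i].base >= start &&
		    t->r[i].base + t->r[i].size <= end)
			t->r[i].nid = nid;
	merge_regions(t);
}
```

In this model, calling set_node() for one region while walking the table
by index can leave that index pointing at a different region after the
merge. A single call over the whole span has no iterator to invalidate.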



* [PATCH v2 03/17] arm, xtensa: simplify initialization of high memory pages
From: Mike Rapoport @ 2020-08-02 16:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Emil Renner Berthing, linux-sh, Peter Zijlstra,
	Dave Hansen, linux-mips, Max Filippov, Paul Mackerras, sparclinux,
	linux-riscv, Will Deacon, Christoph Hellwig, Marek Szyprowski,
	linux-arch, linux-s390, linux-c6x-dev, Baoquan He, x86,
	Russell King, Mike Rapoport, clang-built-linux, Ingo Molnar,
	linux-arm-kernel, Catalin Marinas, uclinux-h8-devel, linux-xtensa,
	openrisc, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Stafford Horne, Hari Bathini, Michal Simek, Yoshinori Sato,
	linux-mm, linux-kernel, iommu, Palmer Dabbelt, linuxppc-dev,
	Mike Rapoport
In-Reply-To: <20200802163601.8189-1-rppt@kernel.org>

From: Mike Rapoport <rppt@linux.ibm.com>

The function free_highpages() in both arm and xtensa essentially open-codes
a for_each_free_mem_range() loop to detect high memory pages that were not
reserved and that should be initialized and passed to the buddy allocator.

Replace the open-coded implementation with for_each_free_mem_range() from
the memblock API to simplify the code.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>		# xtensa
Tested-by: Max Filippov <jcmvbkbc@gmail.com>		# xtensa
---
 arch/arm/mm/init.c    | 48 +++++++------------------------------
 arch/xtensa/mm/init.c | 55 ++++++++-----------------------------------
 2 files changed, 18 insertions(+), 85 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 01e18e43b174..626af348eb8f 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -352,61 +352,29 @@ static void __init free_unused_memmap(void)
 #endif
 }
 
-#ifdef CONFIG_HIGHMEM
-static inline void free_area_high(unsigned long pfn, unsigned long end)
-{
-	for (; pfn < end; pfn++)
-		free_highmem_page(pfn_to_page(pfn));
-}
-#endif
-
 static void __init free_highpages(void)
 {
 #ifdef CONFIG_HIGHMEM
 	unsigned long max_low = max_low_pfn;
-	struct memblock_region *mem, *res;
+	phys_addr_t range_start, range_end;
+	u64 i;
 
 	/* set highmem page free */
-	for_each_memblock(memory, mem) {
-		unsigned long start = memblock_region_memory_base_pfn(mem);
-		unsigned long end = memblock_region_memory_end_pfn(mem);
+	for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE,
+				&range_start, &range_end, NULL) {
+		unsigned long start = PHYS_PFN(range_start);
+		unsigned long end = PHYS_PFN(range_end);
 
 		/* Ignore complete lowmem entries */
 		if (end <= max_low)
 			continue;
 
-		if (memblock_is_nomap(mem))
-			continue;
-
 		/* Truncate partial highmem entries */
 		if (start < max_low)
 			start = max_low;
 
-		/* Find and exclude any reserved regions */
-		for_each_memblock(reserved, res) {
-			unsigned long res_start, res_end;
-
-			res_start = memblock_region_reserved_base_pfn(res);
-			res_end = memblock_region_reserved_end_pfn(res);
-
-			if (res_end < start)
-				continue;
-			if (res_start < start)
-				res_start = start;
-			if (res_start > end)
-				res_start = end;
-			if (res_end > end)
-				res_end = end;
-			if (res_start != start)
-				free_area_high(start, res_start);
-			start = res_end;
-			if (start == end)
-				break;
-		}
-
-		/* And now free anything which remains */
-		if (start < end)
-			free_area_high(start, end);
+		for (; start < end; start++)
+			free_highmem_page(pfn_to_page(start));
 	}
 #endif
 }
diff --git a/arch/xtensa/mm/init.c b/arch/xtensa/mm/init.c
index a05b306cf371..ad9d59d93f39 100644
--- a/arch/xtensa/mm/init.c
+++ b/arch/xtensa/mm/init.c
@@ -79,67 +79,32 @@ void __init zones_init(void)
 	free_area_init(max_zone_pfn);
 }
 
-#ifdef CONFIG_HIGHMEM
-static void __init free_area_high(unsigned long pfn, unsigned long end)
-{
-	for (; pfn < end; pfn++)
-		free_highmem_page(pfn_to_page(pfn));
-}
-
 static void __init free_highpages(void)
 {
+#ifdef CONFIG_HIGHMEM
 	unsigned long max_low = max_low_pfn;
-	struct memblock_region *mem, *res;
+	phys_addr_t range_start, range_end;
+	u64 i;
 
-	reset_all_zones_managed_pages();
 	/* set highmem page free */
-	for_each_memblock(memory, mem) {
-		unsigned long start = memblock_region_memory_base_pfn(mem);
-		unsigned long end = memblock_region_memory_end_pfn(mem);
+	for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE,
+				&range_start, &range_end, NULL) {
+		unsigned long start = PHYS_PFN(range_start);
+		unsigned long end = PHYS_PFN(range_end);
 
 		/* Ignore complete lowmem entries */
 		if (end <= max_low)
 			continue;
 
-		if (memblock_is_nomap(mem))
-			continue;
-
 		/* Truncate partial highmem entries */
 		if (start < max_low)
 			start = max_low;
 
-		/* Find and exclude any reserved regions */
-		for_each_memblock(reserved, res) {
-			unsigned long res_start, res_end;
-
-			res_start = memblock_region_reserved_base_pfn(res);
-			res_end = memblock_region_reserved_end_pfn(res);
-
-			if (res_end < start)
-				continue;
-			if (res_start < start)
-				res_start = start;
-			if (res_start > end)
-				res_start = end;
-			if (res_end > end)
-				res_end = end;
-			if (res_start != start)
-				free_area_high(start, res_start);
-			start = res_end;
-			if (start == end)
-				break;
-		}
-
-		/* And now free anything which remains */
-		if (start < end)
-			free_area_high(start, end);
+		for (; start < end; start++)
+			free_highmem_page(pfn_to_page(start));
 	}
-}
-#else
-static void __init free_highpages(void)
-{
-}
 #endif
+}
 
 /*
  * Initialize memory pages.
-- 
2.26.2
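
The per-range logic that remains in free_highpages() after patch 03/17
can be modelled in user space as below. This is a sketch only: the
function name is hypothetical, PAGE_SHIFT is assumed to be 12 (4 KiB
pages), and instead of freeing pages it returns how many page frames of
one free range lie at or above max_low.

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define PHYS_PFN(x) ((unsigned long)((x) >> PAGE_SHIFT))

/*
 * Number of highmem page frames in the free physical range
 * [range_start, range_end), given the first highmem pfn max_low.
 */
static unsigned long highmem_pages_in_range(uint64_t range_start,
					    uint64_t range_end,
					    unsigned long max_low)
{
	unsigned long start = PHYS_PFN(range_start);
	unsigned long end = PHYS_PFN(range_end);

	/* Ignore complete lowmem entries */
	if (end <= max_low)
		return 0;

	/* Truncate partial highmem entries */
	if (start < max_low)
		start = max_low;

	return end - start;
}
```

Reserved and nomap memory need no handling in this helper because, in
the patch, for_each_free_mem_range() is relied upon to yield only ranges
that are neither.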



* [PATCH v2 02/17] dma-contiguous: simplify cma_early_percent_memory()
From: Mike Rapoport @ 2020-08-02 16:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Emil Renner Berthing, linux-sh, Peter Zijlstra,
	Dave Hansen, linux-mips, Max Filippov, Paul Mackerras, sparclinux,
	linux-riscv, Will Deacon, Christoph Hellwig, Marek Szyprowski,
	linux-arch, linux-s390, linux-c6x-dev, Baoquan He, x86,
	Russell King, Mike Rapoport, clang-built-linux, Ingo Molnar,
	linux-arm-kernel, Catalin Marinas, uclinux-h8-devel, linux-xtensa,
	openrisc, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Stafford Horne, Hari Bathini, Michal Simek, Yoshinori Sato,
	linux-mm, linux-kernel, iommu, Palmer Dabbelt, linuxppc-dev,
	Mike Rapoport
In-Reply-To: <20200802163601.8189-1-rppt@kernel.org>

From: Mike Rapoport <rppt@linux.ibm.com>

The memory size calculation in cma_early_percent_memory() traverses
memblock.memory rather than simply calling memblock_phys_mem_size(). The
comment in that function suggests that at some point there should have been
a call to memblock_analyze() before memblock_phys_mem_size() could be used.
As of now, there is no memblock_analyze() at all and
memblock_phys_mem_size() can be used as soon as cold-plug memory is
registered with memblock.

Replace the loop over memblock.memory with a call to
memblock_phys_mem_size().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 kernel/dma/contiguous.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 15bc5026c485..1992afd8ca7b 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -73,16 +73,7 @@ early_param("cma", early_cma);
 
 static phys_addr_t __init __maybe_unused cma_early_percent_memory(void)
 {
-	struct memblock_region *reg;
-	unsigned long total_pages = 0;
-
-	/*
-	 * We cannot use memblock_phys_mem_size() here, because
-	 * memblock_analyze() has not been called yet.
-	 */
-	for_each_memblock(memory, reg)
-		total_pages += memblock_region_memory_end_pfn(reg) -
-			       memblock_region_memory_base_pfn(reg);
+	unsigned long total_pages = PHYS_PFN(memblock_phys_mem_size());
 
 	return (total_pages * CONFIG_CMA_SIZE_PERCENTAGE / 100) << PAGE_SHIFT;
 }
-- 
2.26.2
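
The arithmetic in cma_early_percent_memory() after patch 02/17 can be
checked with a small user-space sketch. The function name is
hypothetical, PAGE_SHIFT is assumed to be 12, and the percentage is
passed as a parameter here instead of coming from
CONFIG_CMA_SIZE_PERCENTAGE.

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Size in bytes of `percent` percent of physical memory, rounded down
 * to a whole number of pages, following the expression in the patch.
 */
static uint64_t cma_percent_bytes(uint64_t phys_mem_size, unsigned int percent)
{
	uint64_t total_pages = phys_mem_size >> PAGE_SHIFT;

	return (total_pages * percent / 100) << PAGE_SHIFT;
}
```

For 1 GiB of memory and 10 percent, the page count 262144 becomes 26214
pages after the integer division, so the result is slightly below a
tenth of the total.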



* [PATCH v2 01/17] KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
From: Mike Rapoport @ 2020-08-02 16:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Emil Renner Berthing, linux-sh, Peter Zijlstra,
	Dave Hansen, linux-mips, Max Filippov, Paul Mackerras, sparclinux,
	linux-riscv, Will Deacon, Christoph Hellwig, Marek Szyprowski,
	linux-arch, linux-s390, linux-c6x-dev, Baoquan He, x86,
	Russell King, Mike Rapoport, clang-built-linux, Ingo Molnar,
	linux-arm-kernel, Catalin Marinas, uclinux-h8-devel, linux-xtensa,
	openrisc, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Stafford Horne, Hari Bathini, Michal Simek, Yoshinori Sato,
	linux-mm, linux-kernel, iommu, Palmer Dabbelt, linuxppc-dev,
	Mike Rapoport
In-Reply-To: <20200802163601.8189-1-rppt@kernel.org>

From: Mike Rapoport <rppt@linux.ibm.com>

The memory size calculation in kvm_cma_reserve() traverses memblock.memory
rather than simply calling memblock_phys_mem_size(). The comment in that
function suggests that at some point there should have been a call to
memblock_analyze() before memblock_phys_mem_size() could be used.
As of now, there is no memblock_analyze() at all and
memblock_phys_mem_size() can be used as soon as cold-plug memory is
registered with memblock.

Replace the loop over memblock.memory with a call to
memblock_phys_mem_size().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/powerpc/kvm/book3s_hv_builtin.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index 7cd3cf3d366b..56ab0d28de2a 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -95,22 +95,15 @@ EXPORT_SYMBOL_GPL(kvm_free_hpt_cma);
 void __init kvm_cma_reserve(void)
 {
 	unsigned long align_size;
-	struct memblock_region *reg;
-	phys_addr_t selected_size = 0;
+	phys_addr_t selected_size;
 
 	/*
 	 * We need CMA reservation only when we are in HV mode
 	 */
 	if (!cpu_has_feature(CPU_FTR_HVMODE))
 		return;
-	/*
-	 * We cannot use memblock_phys_mem_size() here, because
-	 * memblock_analyze() has not been called yet.
-	 */
-	for_each_memblock(memory, reg)
-		selected_size += memblock_region_memory_end_pfn(reg) -
-				 memblock_region_memory_base_pfn(reg);
 
+	selected_size = PHYS_PFN(memblock_phys_mem_size());
 	selected_size = (selected_size * kvm_cma_resv_ratio / 100) << PAGE_SHIFT;
 	if (selected_size) {
 		pr_debug("%s: reserving %ld MiB for global area\n", __func__,
-- 
2.26.2



* [PATCH v2 00/17] memblock: seasonal cleaning^w cleanup
From: Mike Rapoport @ 2020-08-02 16:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Emil Renner Berthing, linux-sh, Peter Zijlstra,
	Dave Hansen, linux-mips, Max Filippov, Paul Mackerras, sparclinux,
	linux-riscv, Will Deacon, Christoph Hellwig, Marek Szyprowski,
	linux-arch, linux-s390, linux-c6x-dev, Baoquan He, x86,
	Russell King, Mike Rapoport, clang-built-linux, Ingo Molnar,
	linux-arm-kernel, Catalin Marinas, uclinux-h8-devel, linux-xtensa,
	openrisc, Borislav Petkov, Andy Lutomirski, Paul Walmsley,
	Stafford Horne, Hari Bathini, Michal Simek, Yoshinori Sato,
	linux-mm, linux-kernel, iommu, Palmer Dabbelt, linuxppc-dev,
	Mike Rapoport

From: Mike Rapoport <rppt@linux.ibm.com>

Hi,

These patches simplify several uses of memblock iterators and hide some of
the memblock implementation details from the rest of the system.

The patches are on top of v5.8-rc7 + cherry-pick of "mm/sparse: cleanup the
code surrounding memory_present()" [1] from mmotm tree.

v2 changes:
* replace for_each_memblock() with two versions, one for memblock.memory
  and another one for memblock.reserved
* fix overzealous cleanup of powerpc fadump: keep the traversal over the
  memblocks, but use better suited iterators
* don't remove traversal over memblock.reserved in x86 numa cleanup but
  replace for_each_memblock() with new for_each_reserved_mem_region()
* simplify ramdisk and crash kernel allocations on x86
* drop more redundant and unused code: __next_reserved_mem_region() and
  memblock_mem_size()
* add description of numa initialization fix on arm64 (thanks Jonathan)
* add Acked and Reviewed tags

[1] http://lkml.kernel.org/r/20200712083130.22919-1-rppt@kernel.org 

Mike Rapoport (17):
  KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
  dma-contiguous: simplify cma_early_percent_memory()
  arm, xtensa: simplify initialization of high memory pages
  arm64: numa: simplify dummy_numa_init()
  h8300, nds32, openrisc: simplify detection of memory extents
  riscv: drop unneeded node initialization
  mircoblaze: drop unneeded NUMA and sparsemem initializations
  memblock: make for_each_memblock_type() iterator private
  memblock: make memblock_debug and related functionality private
  memblock: reduce number of parameters in for_each_mem_range()
  arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
  arch, drivers: replace for_each_membock() with for_each_mem_range()
  x86/setup: simplify initrd relocation and reservation
  x86/setup: simplify reserve_crashkernel()
  memblock: remove unused memblock_mem_size()
  memblock: implement for_each_reserved_mem_region() using __next_mem_region()
  memblock: use separate iterators for memory and reserved regions

 .clang-format                            |  4 +-
 arch/arm/kernel/setup.c                  | 18 +++--
 arch/arm/mm/init.c                       | 59 ++++------------
 arch/arm/mm/mmu.c                        | 39 ++++-------
 arch/arm/mm/pmsa-v7.c                    | 20 +++---
 arch/arm/mm/pmsa-v8.c                    | 17 +++--
 arch/arm/xen/mm.c                        |  7 +-
 arch/arm64/kernel/machine_kexec_file.c   |  6 +-
 arch/arm64/kernel/setup.c                |  4 +-
 arch/arm64/mm/init.c                     | 11 ++-
 arch/arm64/mm/kasan_init.c               | 10 +--
 arch/arm64/mm/mmu.c                      | 11 +--
 arch/arm64/mm/numa.c                     | 15 ++---
 arch/c6x/kernel/setup.c                  |  9 +--
 arch/h8300/kernel/setup.c                |  8 +--
 arch/microblaze/mm/init.c                | 24 ++-----
 arch/mips/cavium-octeon/dma-octeon.c     | 12 ++--
 arch/mips/kernel/setup.c                 | 31 +++++----
 arch/mips/netlogic/xlp/setup.c           |  2 +-
 arch/nds32/kernel/setup.c                |  8 +--
 arch/openrisc/kernel/setup.c             |  9 +--
 arch/openrisc/mm/init.c                  |  8 ++-
 arch/powerpc/kernel/fadump.c             | 57 ++++++++--------
 arch/powerpc/kvm/book3s_hv_builtin.c     | 11 +--
 arch/powerpc/mm/book3s64/hash_utils.c    | 16 ++---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 11 ++-
 arch/powerpc/mm/kasan/kasan_init_32.c    |  8 +--
 arch/powerpc/mm/mem.c                    | 33 +++++----
 arch/powerpc/mm/numa.c                   |  7 +-
 arch/powerpc/mm/pgtable_32.c             |  8 +--
 arch/riscv/mm/init.c                     | 34 +++-------
 arch/riscv/mm/kasan_init.c               | 10 +--
 arch/s390/kernel/crash_dump.c            |  8 +--
 arch/s390/kernel/setup.c                 | 31 +++++----
 arch/s390/mm/page-states.c               |  6 +-
 arch/s390/mm/vmem.c                      | 16 +++--
 arch/sh/mm/init.c                        |  9 +--
 arch/sparc/mm/init_64.c                  | 12 ++--
 arch/x86/kernel/setup.c                  | 56 +++++-----------
 arch/x86/mm/numa.c                       |  2 +-
 arch/xtensa/mm/init.c                    | 55 +++------------
 drivers/bus/mvebu-mbus.c                 | 12 ++--
 drivers/irqchip/irq-gic-v3-its.c         |  2 +-
 drivers/s390/char/zcore.c                |  9 +--
 include/linux/memblock.h                 | 65 +++++++++---------
 kernel/dma/contiguous.c                  | 11 +--
 mm/memblock.c                            | 85 ++++++++----------------
 mm/page_alloc.c                          | 11 ++-
 mm/sparse.c                              | 10 ++-
 49 files changed, 366 insertions(+), 561 deletions(-)

-- 
2.26.2



* Re: [RFC PATCH 1/2] powerpc/numa: Introduce logical numa id
From: Aneesh Kumar K.V @ 2020-08-02 14:21 UTC (permalink / raw)
  To: Srikar Dronamraju; +Cc: Nathan Lynch, linuxppc-dev
In-Reply-To: <20200801052059.GA24375@linux.vnet.ibm.com>

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:

> * Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [2020-07-31 16:49:14]:
>
>> We use ibm,associativity and ibm,associativity-lookup-arrays to derive the numa
>> node numbers. These device tree properties are firmware indicated grouping of
>> resources based on their hierarchy in the platform. These numbers (group id) are
>> not sequential and hypervisor/firmware can follow different numbering schemes.
>> For ex: on powernv platforms, we group them in the below order.
>> 
>>  *     - CCM node ID
>>  *     - HW card ID
>>  *     - HW module ID
>>  *     - Chip ID
>>  *     - Core ID
>> 
>> Based on ibm,associativity-reference-points we use one of the above group ids as
>> Linux NUMA node id. (On PowerNV platform Chip ID is used). This results
>> in Linux reporting non-linear NUMA node id and which also results in Linux
>> reporting empty node 0 NUMA nodes.
>> 
>
> If its just to eliminate node 0, then we have 2 other probably better
> solutions.
> 1. Dont mark node 0 as spl (currently still in mm-tree and a result in
> linux-next)
> 2. powerpc specific: explicitly clear node 0 during numa bringup.
>


I am not sure I consider them better. But yes, those patches are good
and also resolve the node 0 initialization when the firmware didn't
indicate the presence of such a node.

This patch in addition makes sure that we get the same topology report
across reboots on a virtualized partition as long as the cpu/memory
ratio per PowerVM domain remains the same. This should also help to
avoid confusion after an LPM migration once we start applying topology
updates.

>> This can  be resolved by mapping the firmware provided group id to a logical Linux
>> NUMA id. In this patch, we do this only for pseries platforms considering the
>
> On PowerVM, as you would know the nid is already a logical or a flattened
> chip-id and not the actual hardware chip-id.

Yes. But then they are derived based on PowerVM resources, AKA domains.
Now based on the available resources on a system, we could end up with
different node numbers for the same topology across reboots. Making it
logical at the OS level prevents that.


>
>> firmware group id is a virtualized entity and users would not have drawn any
>> conclusion based on the Linux Numa Node id.
>> 
>> On PowerNV platform since we have historically mapped Chip ID as Linux NUMA node
>> id, we keep the existing Linux NUMA node id numbering.
>> 
>> Before Fix:
>>  # numactl -H
>> available: 2 nodes (0-1)
>> node 0 cpus:
>> node 0 size: 0 MB
>> node 0 free: 0 MB
>> node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
>> node 1 size: 50912 MB
>> node 1 free: 45248 MB
>> node distances:
>> node   0   1
>>   0:  10  40
>>   1:  40  10
>> 
>> after fix
>>  # numactl  -H
>> available: 1 nodes (0)
>> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
>> node 0 size: 50912 MB
>> node 0 free: 49724 MB
>> node distances:
>> node   0
>>   0:  10
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>  arch/powerpc/include/asm/topology.h |  1 +
>>  arch/powerpc/mm/numa.c              | 49 ++++++++++++++++++++++-------
>>  2 files changed, 39 insertions(+), 11 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
>> index f0b6300e7dd3..15b0424a27a8 100644
>> --- a/arch/powerpc/include/asm/topology.h
>> +++ b/arch/powerpc/include/asm/topology.h
>> @@ -118,5 +118,6 @@ int get_physical_package_id(int cpu);
>>  #endif
>>  #endif
>> 
>> +int firmware_group_id_to_nid(int firmware_gid);
>>  #endif /* __KERNEL__ */
>>  #endif	/* _ASM_POWERPC_TOPOLOGY_H */
>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> index e437a9ac4956..6c659aada55b 100644
>> --- a/arch/powerpc/mm/numa.c
>> +++ b/arch/powerpc/mm/numa.c
>> @@ -221,25 +221,51 @@ static void initialize_distance_lookup_table(int nid,
>>  	}
>>  }
>> 
>> +static u32 nid_map[MAX_NUMNODES] = {[0 ... MAX_NUMNODES - 1] =  NUMA_NO_NODE};
>> +
>> +int firmware_group_id_to_nid(int firmware_gid)
>> +{
>> +	static int last_nid = 0;
>> +
>> +	/*
>> +	 * For PowerNV we don't change the node id. This helps to avoid
>> +	 * confusion w.r.t the expected node ids. On pseries, node numbers
>> +	 * are virtualized. Hence do logical node id for pseries.
>> +	 */
>> +	if (!firmware_has_feature(FW_FEATURE_LPAR))
>> +		return firmware_gid;
>> +
>> +	if (firmware_gid ==  -1)
>> +		return NUMA_NO_NODE;
>> +
>> +	if (nid_map[firmware_gid] == NUMA_NO_NODE)
>> +		nid_map[firmware_gid] = last_nid++;
>
> How do we ensure 2 simultaneous firmware_group_id_to_nid() calls dont end up
> at this place in parallel?

Do we have a code path where we do that? All the node id init should
happen early and there should not be two cpus doing node init at the
same time. I might be mistaken. Can you point to the code path where you
expect this to be called in parallel?


>
>> +
>> +	return nid_map[firmware_gid];
>> +}
>> +
>>  /* Returns nid in the range [0..MAX_NUMNODES-1], or -1 if no useful numa
>>   * info is found.
>>   */
>>  static int associativity_to_nid(const __be32 *associativity)
>>  {
>>  	int nid = NUMA_NO_NODE;
>> +	int firmware_gid = -1;
>> 
>>  	if (!numa_enabled)
>>  		goto out;
>> 
>>  	if (of_read_number(associativity, 1) >= min_common_depth)
>> -		nid = of_read_number(&associativity[min_common_depth], 1);
>> +		firmware_gid = of_read_number(&associativity[min_common_depth], 1);
>> 
>>  	/* POWER4 LPAR uses 0xffff as invalid node */
>> -	if (nid == 0xffff || nid >= MAX_NUMNODES)
>> -		nid = NUMA_NO_NODE;
>> +	if (firmware_gid == 0xffff || firmware_gid >= MAX_NUMNODES)
>> +		firmware_gid = -1;
>
> Lets assume two or more invocations of associativity_to_nid for the same
> associativity, end up with -1, In each case aren't giving different
> nids?


I didn't quite get the comment here. But I assume you are indicating the
same one you mentioned above?

>
>
>> +
>> +	nid = firmware_group_id_to_nid(firmware_gid);
>> 
>>  	if (nid > 0 &&
>> -		of_read_number(associativity, 1) >= distance_ref_points_depth) {
>> +	    of_read_number(associativity, 1) >= distance_ref_points_depth) {
>>  		/*
>>  		 * Skip the length field and send start of associativity array
>>  		 */
>> @@ -432,24 +458,25 @@ static int of_get_assoc_arrays(struct assoc_arrays *aa)
>>  static int of_drconf_to_nid_single(struct drmem_lmb *lmb)
>>  {
>>  	struct assoc_arrays aa = { .arrays = NULL };
>> -	int default_nid = NUMA_NO_NODE;
>> -	int nid = default_nid;
>> +	int nid = NUMA_NO_NODE, firmware_gid;
>>  	int rc, index;
>> 
>>  	if ((min_common_depth < 0) || !numa_enabled)
>> -		return default_nid;
>> +		return NUMA_NO_NODE;
>> 
>>  	rc = of_get_assoc_arrays(&aa);
>>  	if (rc)
>> -		return default_nid;
>> +		return NUMA_NO_NODE;
>
> https://lore.kernel.org/linuxppc-dev/87lfjc1b5f.fsf@linux.ibm.com/t/#u

Not sure what I should conclude on that. I am changing the function here
and would like to make NUMA_NO_NODE the error return.

>
>> 
>>  	if (min_common_depth <= aa.array_sz &&
>>  	    !(lmb->flags & DRCONF_MEM_AI_INVALID) && lmb->aa_index < aa.n_arrays) {
>>  		index = lmb->aa_index * aa.array_sz + min_common_depth - 1;
>> -		nid = of_read_number(&aa.arrays[index], 1);
>> +		firmware_gid = of_read_number(&aa.arrays[index], 1);
>> 
>> -		if (nid == 0xffff || nid >= MAX_NUMNODES)
>> -			nid = default_nid;
>> +		if (firmware_gid == 0xffff || firmware_gid >= MAX_NUMNODES)
>> +			firmware_gid = -1;
>
> Same case as above, How do we ensure that we return unique nid for a
> similar assoc_array?

Can you elaborate on this?

>
>> +
>> +		nid = firmware_group_id_to_nid(firmware_gid);
>> 
>>  		if (nid > 0) {
>>  			index = lmb->aa_index * aa.array_sz;
>> -- 
>> 2.26.2
>> 

-aneesh
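
The mapping discussed in this thread can be sketched in user space as
below. It is an illustration, not the patch itself: the struct, the
function names, and the MAX_NUMNODES value are assumptions, the PowerNV
pass-through is left out, and there is no locking, which matches the
expectation stated above that node ids are assigned early by one CPU.

```c
#define MAX_NUMNODES 256 /* assumption: the real value is config-dependent */
#define NUMA_NO_NODE (-1)

/* Hypothetical container for the firmware-gid to logical-nid table. */
struct nid_mapper {
	int map[MAX_NUMNODES];
	int last_nid;
};

static void nid_mapper_init(struct nid_mapper *m)
{
	for (int i = 0; i < MAX_NUMNODES; i++)
		m->map[i] = NUMA_NO_NODE;
	m->last_nid = 0;
}

/*
 * Return the logical node id for a firmware group id, allocating the
 * next sequential id on first use. Invalid ids always map to
 * NUMA_NO_NODE and never consume a logical id.
 */
static int firmware_gid_to_logical_nid(struct nid_mapper *m, int gid)
{
	if (gid < 0 || gid >= MAX_NUMNODES)
		return NUMA_NO_NODE;

	if (m->map[gid] == NUMA_NO_NODE)
		m->map[gid] = m->last_nid++;

	return m->map[gid];
}
```

In this sketch, repeated lookups of the same firmware group id return
the same logical id, and sparse firmware ids such as 1 and 5 become 0
and 1 in the order they are first seen.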


* Re: [PATCH v3] selftests: powerpc: Fix CPU affinity for child process
From: Michael Ellerman @ 2020-08-02 13:39 UTC (permalink / raw)
  To: Harish, mpe; +Cc: srikar, kamalesh, shiganta, sathnaga, sandipan, linuxppc-dev
In-Reply-To: <20200609081423.529664-1-harish@linux.ibm.com>

On Tue, 9 Jun 2020 13:44:23 +0530, Harish wrote:
> On systems with a large number of CPUs, the test fails trying to set
> affinity by calling sched_setaffinity() with a smaller size for the
> affinity mask. This patch fixes it by making sure that the size of
> allocated affinity mask is dependent on the number of CPUs as
> reported by get_nprocs().
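
For readers unfamiliar with the issue described above, here is a minimal
user-space sketch of sizing the affinity mask from the CPU count rather
than relying on the fixed-size cpu_set_t. It is not the selftest code:
pin_to_cpu() is a hypothetical helper, and the guard for sparse CPU
numbering is an extra assumption of this sketch. It relies on the glibc
CPU_ALLOC() family of macros.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>
#include <sys/sysinfo.h>

/*
 * Pin the calling thread to `cpu` using a mask sized from the CPU count
 * instead of the fixed-size cpu_set_t.
 * Returns 0 on success, -1 on failure.
 */
int pin_to_cpu(int cpu)
{
	int ncpus = get_nprocs();
	size_t size;
	cpu_set_t *mask;
	int rc;

	if (cpu < 0)
		return -1;
	/* CPU numbers can be sparse, so make sure `cpu` fits in the mask. */
	if (ncpus <= cpu)
		ncpus = cpu + 1;

	mask = CPU_ALLOC(ncpus);
	if (!mask)
		return -1;

	size = CPU_ALLOC_SIZE(ncpus);
	CPU_ZERO_S(size, mask);
	CPU_SET_S(cpu, size, mask);

	/* Pass the allocated size, not sizeof(cpu_set_t). */
	rc = sched_setaffinity(0, size, mask);
	CPU_FREE(mask);
	return rc;
}
```

A caller that only wants to stay where it is could use
pin_to_cpu(sched_getcpu()).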

Applied to powerpc/next.

[1/1] selftests/powerpc: Fix CPU affinity for child process
      https://git.kernel.org/powerpc/c/854eb5022be04f81e318765f089f41a57c8e5d83

cheers

^ permalink raw reply

* Re: [PATCH v4 0/2] powerpc/papr_scm: add support for reporting NVDIMM 'life_used_percentage' metric
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: linuxppc-dev, Vaibhav Jain, linux-nvdimm; +Cc: Aneesh Kumar K . V
In-Reply-To: <20200731064153.182203-1-vaibhav@linux.ibm.com>

On Fri, 31 Jul 2020 12:11:51 +0530, Vaibhav Jain wrote:
> Changes since v3[1]:
> 
> * Fixed a rebase issue pointed out by Aneesh in first patch in the series.
> 
> [1] https://lore.kernel.org/linux-nvdimm/20200730121303.134230-1-vaibhav@linux.ibm.com

Applied to powerpc/next.

[1/2] powerpc/papr_scm: Fetch nvdimm performance stats from PHYP
      https://git.kernel.org/powerpc/c/2d02bf835e5731de632c8a13567905fa7c0da01c
[2/2] powerpc/papr_scm: Add support for fetching nvdimm 'fuel-gauge' metric
      https://git.kernel.org/powerpc/c/af0870c4e75655b1931d0a5ffde2f448a2794362

cheers

^ permalink raw reply

* Re: [PATCH] powerpc: fix function annotations to avoid section mismatch warnings with gcc-10
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: linuxppc-dev, Vladis Dronov
  Cc: Aneesh Kumar K . V, Paul Mackerras, linux-kernel
In-Reply-To: <20200729133741.62789-1-vdronov@redhat.com>

On Wed, 29 Jul 2020 15:37:41 +0200, Vladis Dronov wrote:
> Certain warnings are emitted for powerpc code when building with a gcc-10
> toolset:
> 
>     WARNING: modpost: vmlinux.o(.text.unlikely+0x377c): Section mismatch in
>     reference from the function remove_pmd_table() to the function
>     .meminit.text:split_kernel_mapping()
>     The function remove_pmd_table() references
>     the function __meminit split_kernel_mapping().
>     This is often because remove_pmd_table lacks a __meminit
>     annotation or the annotation of split_kernel_mapping is wrong.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: fix function annotations to avoid section mismatch warnings with gcc-10
      https://git.kernel.org/powerpc/c/aff779515a070df7e23da9e86f1096f7d10d647e

cheers

^ permalink raw reply

* Re: [PATCH v3] selftests: powerpc: Fix online CPU selection
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: Sandipan Das, mpe
  Cc: srikar, kamalesh, shiganta, nasastry, harish, linuxppc-dev
In-Reply-To: <a408c4b8e9a23bb39b539417a21eb0ff47bb5127.1596084858.git.sandipan@linux.ibm.com>

On Thu, 30 Jul 2020 10:38:46 +0530, Sandipan Das wrote:
> The size of the CPU affinity mask must be large enough for
> systems with a very large number of CPUs. Otherwise, tests
> which try to determine the first online CPU by calling
> sched_getaffinity() will fail. This makes sure that the size
> of the allocated affinity mask is dependent on the number of
> CPUs as reported by get_nprocs_conf().
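
A rough sketch of the approach described above, assuming glibc's
CPU_ALLOC() macros: the mask is sized from get_nprocs_conf() before
calling sched_getaffinity(). first_allowed_cpu() is a hypothetical name,
and the retry on EINVAL is an extra safeguard added for this sketch, not
something the commit message mentions.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/sysinfo.h>

/*
 * Return the lowest-numbered CPU the calling thread may run on,
 * or -1 on failure.
 */
int first_allowed_cpu(void)
{
	int ncpus = get_nprocs_conf();

	if (ncpus < 1)
		ncpus = 1;

	for (;;) {
		size_t size = CPU_ALLOC_SIZE(ncpus);
		cpu_set_t *mask = CPU_ALLOC(ncpus);
		int cpu, err;

		if (!mask)
			return -1;
		CPU_ZERO_S(size, mask);

		if (sched_getaffinity(0, size, mask) == 0) {
			int found = -1;

			for (cpu = 0; cpu < ncpus; cpu++) {
				if (CPU_ISSET_S(cpu, size, mask)) {
					found = cpu;
					break;
				}
			}
			CPU_FREE(mask);
			return found;
		}

		err = errno;
		CPU_FREE(mask);
		/* EINVAL: the mask is smaller than the kernel's; grow and retry. */
		if (err != EINVAL || ncpus > (1 << 20))
			return -1;
		ncpus *= 2;
	}
}
```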

Applied to powerpc/next.

[1/1] selftests/powerpc: Fix online CPU selection
      https://git.kernel.org/powerpc/c/dfa03fff86027e58c8dba5c03ae68150d4e513ad

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/pseries/hotplug-cpu: remove double free in error path
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: linuxppc-dev, Nathan Lynch
In-Reply-To: <20190919231633.1344-1-nathanl@linux.ibm.com>

On Thu, 19 Sep 2019 18:16:33 -0500, Nathan Lynch wrote:
> In the unlikely event that the device tree lacks a /cpus node,
> find_dlpar_cpus_to_add() oddly frees the cpu_drcs buffer it has been
> passed before returning an error. Its only caller also frees the
> buffer on error.
> 
> Remove the less conventional kfree() of a caller-supplied buffer from
> find_dlpar_cpus_to_add().
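
The ownership rule behind this fix can be shown with a small user-space
sketch. The names below (fill_ids(), add_by_count()) are hypothetical
stand-ins, not the kernel functions: the callee reports failure without
touching the caller-supplied buffer, and the caller frees it exactly once.

```c
#include <stdlib.h>

/* Hypothetical callee: fills the caller's buffer or reports an error. */
int fill_ids(unsigned int *buf, size_t n, int have_cpus_node)
{
	size_t i;

	if (!have_cpus_node)
		return -1;	/* report the error; do NOT free buf here */

	for (i = 0; i < n; i++)
		buf[i] = (unsigned int)i;
	return 0;
}

/* The caller allocated the buffer, so the caller alone frees it. */
int add_by_count(size_t n, int have_cpus_node)
{
	unsigned int *ids = calloc(n, sizeof(*ids));
	int rc;

	if (!ids)
		return -1;

	rc = fill_ids(ids, n, have_cpus_node);
	free(ids);	/* single free on both the success and error paths */
	return rc;
}
```

Had fill_ids() also called free(buf) on its error path, the free() in
add_by_count() would release the same pointer a second time.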

Applied to powerpc/next.

[1/1] powerpc/pseries/hotplug-cpu: Remove double free in error path
      https://git.kernel.org/powerpc/c/a0ff72f9f5a780341e7ff5e9ba50a0dad5fa1980

cheers

^ permalink raw reply

* Re: [PATCH 0/4] cacheinfo instrumentation tweaks
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: linuxppc-dev, Nathan Lynch
In-Reply-To: <20190627051537.7298-1-nathanl@linux.ibm.com>

On Thu, 27 Jun 2019 00:15:33 -0500, Nathan Lynch wrote:
> A few changes that would have aided debugging this code's interactions
> with partition migration, maybe they'll help with the next thing
> (hibernation?).
> 
> Nathan Lynch (4):
>   powerpc/cacheinfo: set pr_fmt
>   powerpc/cacheinfo: name@unit instead of full DT path in debug messages
>   powerpc/cacheinfo: improve diagnostics about malformed cache lists
>   powerpc/cacheinfo: warn if cache object chain becomes unordered
> 
> [...]

Applied to powerpc/next.

[1/4] powerpc/cacheinfo: Set pr_fmt()
      https://git.kernel.org/powerpc/c/e2b3c165f27a6bdb197b0dc86683ed36f61c5527
[2/4] powerpc/cacheinfo: Use name@unit instead of full DT path in debug messages
      https://git.kernel.org/powerpc/c/be6f885e97e9304541057fbf25148685847ef310
[3/4] powerpc/cacheinfo: Improve diagnostics about malformed cache lists
      https://git.kernel.org/powerpc/c/1b3da8ffaa158e9a95c19b17c14d7259d58bc0cd
[4/4] powerpc/cacheinfo: Warn if cache object chain becomes unordered
      https://git.kernel.org/powerpc/c/6ec54363f198aae9c1343f82ff5b865546944a73

cheers

^ permalink raw reply

* Re: [PATCH 0/2] migration/prrn instrumentation tweaks
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: linuxppc-dev, Nathan Lynch
In-Reply-To: <20190627053044.9238-1-nathanl@linux.ibm.com>

On Thu, 27 Jun 2019 00:30:42 -0500, Nathan Lynch wrote:
> Mainly this produces better information about what's happening with
> the device tree as a result of LPM or PRRN.
> 
> Nathan Lynch (2):
>   powerpc/pseries/mobility: set pr_fmt
>   powerpc/pseries/mobility: add pr_debug for device tree changes
> 
> [...]

Applied to powerpc/next.

[1/2] powerpc/pseries/mobility: Set pr_fmt()
      https://git.kernel.org/powerpc/c/494a66f34e00b6a1897b5a1ab150a19265696b17
[2/2] powerpc/pseries/mobility: Add pr_debug() for device tree changes
      https://git.kernel.org/powerpc/c/5d8b1f9dea17b4bf5e5f088f39eeab32c7e487be

cheers

^ permalink raw reply

* Re: [PATCH v2] hmi: Move hmi irq stat from percpu variable to paca.
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: Mahesh Salgaonkar, linuxppc-dev; +Cc: Aneesh Kumar K.V, Nicholas Piggin
In-Reply-To: <159290806973.3642154.5244613424529764050.stgit@jupiter>

On Tue, 23 Jun 2020 15:57:50 +0530, Mahesh Salgaonkar wrote:
> With the proposed change in percpu bootmem allocator to use page mapping
> [1], the percpu first chunk memory area can come from vmalloc ranges. This
> makes the hmi handler crash the kernel whenever a percpu variable is accessed
> in real mode.  This patch fixes this issue by moving the hmi irq stat
> inside paca for safe access in realmode.
> 
> [1] https://lore.kernel.org/linuxppc-dev/20200608070904.387440-1-aneesh.kumar@linux.ibm.com/

Applied to powerpc/next.

[1/1] powerpc/64s: Move HMI IRQ stat from percpu variable to paca.
      https://git.kernel.org/powerpc/c/ada68a66b72687e6b74e35c42efd1783e84b01fd

cheers

^ permalink raw reply

* Re: [PATCH v6 00/11] ppc64: enable kdump support for kexec_file_load syscall
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: Michael Ellerman, Hari Bathini
  Cc: Laurent Dufour, kernel test robot, Pingfan Liu, Kexec-ml,
	Dave Young, Nayna Jain, Petr Tesarik, lkml, Sourabh Jain,
	Vivek Goyal, linuxppc-dev, Eric Biederman, Andrew Morton,
	Mahesh J Salgaonkar, Mimi Zohar, Thiago Jung Bauermann
In-Reply-To: <159602259854.575379.16910915605574571585.stgit@hbathini>

On Wed, 29 Jul 2020 17:08:44 +0530, Hari Bathini wrote:
> Sorry! There was a gateway issue on my system while posting v5, due to
> which some patches did not make it through. Resending...
> 
> This patch series enables kdump support for kexec_file_load system
> call (kexec -s -p) on PPC64. The changes are inspired from kexec-tools
> code but heavily modified for kernel consumption.
> 
> [...]

Applied to powerpc/next.

[01/11] kexec_file: Allow archs to handle special regions while locating memory hole
        https://git.kernel.org/powerpc/c/f891f19736bdf404845f97d8038054be37160ea8
[02/11] powerpc/kexec_file: Mark PPC64 specific code
        https://git.kernel.org/powerpc/c/19031275a5881233b4fc31b7dee68bf0b0758bbc
[03/11] powerpc/kexec_file: Add helper functions for getting memory ranges
        https://git.kernel.org/powerpc/c/180adfc532a83c1d74146449f7385f767d4b8059
[04/11] powerpc/kexec_file: Avoid stomping memory used by special regions
        https://git.kernel.org/powerpc/c/b8e55a3e5c208862eacded5aad822184f89f85d9
[05/11] powerpc/drmem: Make LMB walk a bit more flexible
        https://git.kernel.org/powerpc/c/adfefc609e55edc5dce18a68d1526af6d70aaf86
[06/11] powerpc/kexec_file: Restrict memory usage of kdump kernel
        https://git.kernel.org/powerpc/c/7c64e21a1c5a5bcd651d895b8faa68e9cdcc433d
[07/11] powerpc/kexec_file: Setup backup region for kdump kernel
        https://git.kernel.org/powerpc/c/1a1cf93c200581c72a3cd521e1e0a1a3b5d0077d
[08/11] powerpc/kexec_file: Prepare elfcore header for crashing kernel
        https://git.kernel.org/powerpc/c/cb350c1f1f867db16725f1bb06be033ece19e998
[09/11] powerpc/kexec_file: Add appropriate regions for memory reserve map
        https://git.kernel.org/powerpc/c/6ecd0163d36049b5f2435a8658f1320c9f3f2924
[10/11] powerpc/kexec_file: Fix kexec load failure with lack of memory hole
        https://git.kernel.org/powerpc/c/b5667d13be8d0928a02b46e0c6f7ab891d32f697
[11/11] powerpc/kexec_file: Enable early kernel OPAL calls
        https://git.kernel.org/powerpc/c/2e6bd221d96fcfd9bd1eed5cd9c008e7959daed7

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/fsl/dts: add missing P4080DS I2C devices
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: linuxppc-dev, David Lamparter
In-Reply-To: <20180920230422.GK487685@eidolon.nox.tf>

On Fri, 21 Sep 2018 01:04:22 +0200, David Lamparter wrote:
> This just adds the zl2006 voltage regulators / power monitors and the
> onboard I2C eeproms.  The ICS9FG108 clock chip doesn't seem to have a
> driver, so it is left in the DTS as a comment.  And for good measure,
> the SPD eeproms are tagged as such.

Applied to powerpc/next.

[1/1] powerpc/fsl/dts: add missing P4080DS I2C devices
      https://git.kernel.org/powerpc/c/d3c61954fc1827df571e235b9a98e10108ef5c3d

cheers

^ permalink raw reply

* Re: [PATCH v3 0/3] cpuidle-pseries: Parse extended CEDE information for idle.
From: Michael Ellerman @ 2020-08-02 13:35 UTC (permalink / raw)
  To: Anton Blanchard, Michael Ellerman, Nicholas Piggin,
	Gautham R. Shenoy, Nathan Lynch, Michael Neuling,
	Vaidyanathan Srinivasan
  Cc: linuxppc-dev, linux-kernel, linux-pm
In-Reply-To: <1596087177-30329-1-git-send-email-ego@linux.vnet.ibm.com>

On Thu, 30 Jul 2020 11:02:54 +0530, Gautham R. Shenoy wrote:
> This is a v3 of the patch series to parse the extended CEDE
> information in the pseries-cpuidle driver.
> 
> The previous two versions of the patches can be found here:
> 
> v2: https://lore.kernel.org/lkml/1596005254-25753-1-git-send-email-ego@linux.vnet.ibm.com/
> 
> [...]

Applied to powerpc/next.

[1/3] cpuidle: pseries: Set the latency-hint before entering CEDE
      https://git.kernel.org/powerpc/c/3af0ada7dd98c6da35c1fd7f107af3b9aa5e904c
[2/3] cpuidle: pseries: Add function to parse extended CEDE records
      https://git.kernel.org/powerpc/c/054e44ba99ae36918631fcbf5f034e466c2f1b73
[3/3] cpuidle: pseries: Fixup exit latency for CEDE(0)
      https://git.kernel.org/powerpc/c/d947fb4c965cdb7242f3f91124ea16079c49fa8b

cheers

^ permalink raw reply

* Re: [PATCH] selftests/powerpc: return skip code for spectre_v2
From: Michael Ellerman @ 2020-08-02 13:34 UTC (permalink / raw)
  To: Michael Ellerman, Thadeu Lima de Souza Cascardo
  Cc: Shuah Khan, linuxppc-dev, linux-kernel, linux-kselftest
In-Reply-To: <20200728155039.401445-1-cascardo@canonical.com>

On Tue, 28 Jul 2020 12:50:39 -0300, Thadeu Lima de Souza Cascardo wrote:
> When running under older versions of qemu or under newer versions with old
> machine types, some security features will not be reported to the guest.
> This will lead the guest OS to consider itself Vulnerable to spectre_v2.
> 
> So, the spectre_v2 test fails in such cases when the host is mitigated and
> mispredictions cannot be detected as expected by the test.
> 
> [...]

Applied to powerpc/next.

[1/1] selftests/powerpc: Return skip code for spectre_v2
      https://git.kernel.org/powerpc/c/f3054ffd71b5afd44832b2207e6e90267e1cd2d1

cheers

^ permalink raw reply

* Re: [PATCH v4 0/3] Add support for divde[.] and divdeu[.] instruction emulation
From: Michael Ellerman @ 2020-08-02 13:34 UTC (permalink / raw)
  To: Balamuruhan S, mpe
  Cc: ravi.bangoria, jniethe5, paulus, sandipan, naveen.n.rao,
	linuxppc-dev
In-Reply-To: <20200728130308.1790982-1-bala24@linux.ibm.com>

On Tue, 28 Jul 2020 18:33:05 +0530, Balamuruhan S wrote:
> This patchset adds support to emulate divde, divde., divdeu and divdeu.
> instructions and testcases for it.
> 
> Resend v4: rebased on latest powerpc next branch
> 
> Changes in v4:
> -------------
> Fix review comments from Naveen,
> * replace TEST_DIVDEU() instead of wrongly used TEST_DIVDEU_DOT() in
>   divdeu testcase.
> * Include `acked-by` tag from Naveen for the series.
> * Rebase it on latest mpe's merge tree.
> 
> [...]
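
As background on what is being emulated: divdeu divides the 128-bit value
formed by RA followed by 64 zero bits by RB, and the result is undefined
if the divisor is zero or the quotient does not fit in 64 bits. The
following is a user-space model only, not the sstep code from the series.
model_divdeu() is a hypothetical name, and the sketch assumes a 64-bit
GCC or Clang target that provides unsigned __int128.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of divdeu: divide the 128-bit value (ra << 64) by rb.
 * Returns false when the architected result is undefined
 * (divide by zero, or quotient too large for 64 bits).
 */
bool model_divdeu(uint64_t ra, uint64_t rb, uint64_t *rt)
{
	unsigned __int128 dividend = (unsigned __int128)ra << 64;

	/* The quotient fits in 64 bits exactly when ra < rb. */
	if (rb == 0 || ra >= rb)
		return false;

	*rt = (uint64_t)(dividend / rb);
	return true;
}
```

For example, with ra = 1 and rb = 2 the dividend is 2^64 and the quotient
is 2^63. The signed divde form follows the same idea with a signed
dividend and a signed range check.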

Applied to powerpc/next.

[1/3] powerpc/ppc-opcode: Add divde and divdeu opcodes
      https://git.kernel.org/powerpc/c/8902c6f96364d1117236948d6c7b9178f428529c
[2/3] powerpc/sstep: Add support for divde[.] and divdeu[.] instructions
      https://git.kernel.org/powerpc/c/151c32bf5ebdd41114267717dc4b53d2632cbd30
[3/3] powerpc/test_emulate_step: Add testcases for divde[.] and divdeu[.] instructions
      https://git.kernel.org/powerpc/c/b859c95cf4b936b5e8019e7ab68ee2740e609ffd

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/configs: Add BLK_DEV_NVME to pseries_defconfig
From: Michael Ellerman @ 2020-08-02 13:34 UTC (permalink / raw)
  To: Anton Blanchard, paulus, benh, mpe; +Cc: linuxppc-dev
In-Reply-To: <20200729040828.2312966-1-anton@ozlabs.org>

On Wed, 29 Jul 2020 14:08:28 +1000, Anton Blanchard wrote:
> I've forgotten to manually enable NVME when building pseries kernels
> for machines with NVME adapters. Since it's a reasonably common
> configuration, enable it by default.

Applied to powerpc/next.

[1/1] powerpc/configs: Add BLK_DEV_NVME to pseries_defconfig
      https://git.kernel.org/powerpc/c/fdaa7ce2016ccd09a538b05bace5f4479662ddcb

cheers

^ permalink raw reply

* Re: [PATCH 0/2] powerpc: OpenCAPI Cleanup
From: Michael Ellerman @ 2020-08-02 13:34 UTC (permalink / raw)
  To: Alastair D'Silva
  Cc: Andrew Donnellan, Arnd Bergmann, Greg Kroah-Hartman, linux-kernel,
	Paul Mackerras, Frederic Barrat, linuxppc-dev
In-Reply-To: <20200415012343.919255-1-alastair@d-silva.org>

On Wed, 15 Apr 2020 11:23:41 +1000, Alastair D'Silva wrote:
> These patches address checkpatch & kernel doc warnings
> in the OpenCAPI infrastructure.
> 
> Alastair D'Silva (2):
>   ocxl: Remove unnecessary externs
>   ocxl: Address kernel doc errors & warnings
> 
> [...]

Applied to powerpc/next.

[1/2] ocxl: Remove unnecessary externs
      https://git.kernel.org/powerpc/c/c75d42e4c768c403f259f6c7f6217c850cf11be9
[2/2] ocxl: Address kernel doc errors & warnings
      https://git.kernel.org/powerpc/c/3591538a31af37cf6a2d83f1da99e651a822af8b

cheers

^ permalink raw reply

* Re: [PATCH] powerpc/64s/hash: Fix hash_preload running with interrupts enabled
From: Michael Ellerman @ 2020-08-02 13:24 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev; +Cc: Aneesh Kumar K . V, Athira Rajeev
In-Reply-To: <20200727060947.10060-1-npiggin@gmail.com>

On Mon, 27 Jul 2020 16:09:47 +1000, Nicholas Piggin wrote:
> Commit 2f92447f9f96 ("powerpc/book3s64/hash: Use the pte_t address from the
> caller") removed the local_irq_disable from hash_preload, but it was
> required for more than just the page table walk: the hash pte busy bit is
> effectively a lock which may be taken in interrupt context, and the local
> update flag test must not be preempted before it's used.
> 
> This solves apparent lockups with perf interrupting __hash_page_64K. If
> get_perf_callchain then also takes a hash fault on the same page while it
> is already locked, it will loop forever taking hash faults, which looks like
> this:
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/64s/hash: Fix hash_preload running with interrupts enabled
      https://git.kernel.org/powerpc/c/909adfc66b9a1db21b5e8733e9ebfa6cd5135d74

cheers

^ permalink raw reply

* Re: [PATCH 06/15] powerpc: fadamp: simplify fadump_reserve_crash_area()
From: Michael Ellerman @ 2020-08-02 13:14 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-sh, Peter Zijlstra, Dave Hansen, Hari Bathini, linux-mips,
	Max Filippov, Paul Mackerras, sparclinux, linux-riscv,
	Will Deacon, Stafford Horne, Marek Szyprowski, linux-s390,
	linux-c6x-dev, Yoshinori Sato, x86, Russell King, Mike Rapoport,
	clang-built-linux, Ingo Molnar, Catalin Marinas, uclinux-h8-devel,
	linux-xtensa, openrisc, Borislav Petkov, Andy Lutomirski,
	Paul Walmsley, Thomas Gleixner, linux-arm-kernel, Michal Simek,
	linux-mm, linuxppc-dev, linux-kernel, iommu, Palmer Dabbelt,
	Andrew Morton, Christoph Hellwig
In-Reply-To: <20200801101854.GD534153@kernel.org>

Mike Rapoport <rppt@kernel.org> writes:
> On Thu, Jul 30, 2020 at 10:15:13PM +1000, Michael Ellerman wrote:
>> Mike Rapoport <rppt@kernel.org> writes:
>> > From: Mike Rapoport <rppt@linux.ibm.com>
>> >
>> > fadump_reserve_crash_area() reserves memory from a specified base address
>> > till the end of the RAM.
>> >
>> > Replace iteration through the memblock.memory with a single call to
>> > memblock_reserve() with appropriate  that will take care of proper memory
>>                                      ^
>>                                      parameters?
>> > reservation.
>> >
>> > Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>> > ---
>> >  arch/powerpc/kernel/fadump.c | 20 +-------------------
>> >  1 file changed, 1 insertion(+), 19 deletions(-)
>> 
>> I think this looks OK to me, but I don't have a setup to test it easily.
>> I've added Hari to Cc who might be able to.
>> 
>> But I'll give you an ack in the hope that it works :)
>
> Actually, I did some digging in the git log and the traversal was added
> there on purpose by the commit b71a693d3db3 ("powerpc/fadump: exclude
> memory holes while reserving memory in second kernel")
> Presuming this is still required I'm going to drop this patch and will
> simply replace for_each_memblock() with for_each_mem_range() in v2.
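
To illustrate why the traversal matters, here is a small user-space model
(not the memblock API; struct mem_range and reserve_from() are
hypothetical). Reserving range by range from a start address to the end of
RAM never covers the holes between ranges, whereas a single reservation
spanning start to end of RAM would include them.

```c
#include <stddef.h>
#include <stdint.h>

struct mem_range {
	uint64_t base;
	uint64_t size;
};

/*
 * Model only: sum the bytes that a per-range reservation would cover
 * when reserving everything from `start` to the end of RAM.
 * Gaps between ranges (memory holes) are never counted.
 */
uint64_t reserve_from(const struct mem_range *ranges, size_t n, uint64_t start)
{
	uint64_t reserved = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t base = ranges[i].base;
		uint64_t end = base + ranges[i].size;

		if (end <= start)
			continue;	/* range lies entirely below start */
		if (base < start)
			base = start;	/* range straddles start: clip it */
		reserved += end - base;
	}
	return reserved;
}
```

With RAM at [0x0, 0x1000) and [0x3000, 0x4000) and start = 0x800, the
per-range walk covers 0x1800 bytes, while one call spanning 0x800 to
0x4000 would cover 0x3800 bytes, including the 0x2000-byte hole.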

Thanks.

cheers

^ permalink raw reply

* Re: [PATCH v2 15/16] powerpc/powernv/sriov: Make single PE mode a per-BAR setting
From: Michael Ellerman @ 2020-08-02 13:12 UTC (permalink / raw)
  To: Nathan Chancellor, Oliver O'Halloran; +Cc: clang-built-linux, linuxppc-dev
In-Reply-To: <20200801061823.GA1203340@ubuntu-n2-xlarge-x86>

Nathan Chancellor <natechancellor@gmail.com> writes:
> On Wed, Jul 22, 2020 at 04:57:14PM +1000, Oliver O'Halloran wrote:
>> Using single PE BARs to map an SR-IOV BAR is really a choice about what
>> strategy to use when mapping a BAR. It doesn't make much sense for this to
>> be a global setting since a device might have one large BAR which needs to
>> be mapped with single PE windows and another smaller BAR that can be mapped
>> with a regular segmented window. Make the segmented vs single decision a
>> per-BAR setting and clean up the logic that decides which mode to use.
>> 
>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
>> ---
>> v2: Dropped unused total_vfs variables in pnv_pci_ioda_fixup_iov_resources()
>>     Dropped bar_no from pnv_pci_iov_resource_alignment()
>>     Minor re-wording of comments.
>> ---
>>  arch/powerpc/platforms/powernv/pci-sriov.c | 131 ++++++++++-----------
>>  arch/powerpc/platforms/powernv/pci.h       |  11 +-
>>  2 files changed, 73 insertions(+), 69 deletions(-)
>> 
>> diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
>> index ce8ad6851d73..76215d01405b 100644
>> --- a/arch/powerpc/platforms/powernv/pci-sriov.c
>> +++ b/arch/powerpc/platforms/powernv/pci-sriov.c
>> @@ -260,42 +256,40 @@ void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
>>  resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
>>  						      int resno)
>>  {
>> -	struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus);
>>  	struct pnv_iov_data *iov = pnv_iov_get(pdev);
>>  	resource_size_t align;
>>  
>> +	/*
>> +	 * iov can be null if we have an SR-IOV device with IOV BAR that can't
>> +	 * be placed in the m64 space (i.e. The BAR is 32bit or non-prefetch).
>> +	 * In that case we don't allow VFs to be enabled since one of their
>> +	 * BARs would not be placed in the correct PE.
>> +	 */
>> +	if (!iov)
>> +		return align;
>> +	if (!iov->vfs_expanded)
>> +		return align;
>> +
>> +	align = pci_iov_resource_size(pdev, resno);

That's, oof.

> I am not sure if it has been reported yet but clang points out that
> align is initialized after its use:
>
> arch/powerpc/platforms/powernv/pci-sriov.c:267:10: warning: variable 'align' is uninitialized when used here [-Wuninitialized]
>                 return align;
>                        ^~~~~
> arch/powerpc/platforms/powernv/pci-sriov.c:258:23: note: initialize the variable 'align' to silence this warning
>         resource_size_t align;
>                              ^
>                               = 0
> 1 warning generated.

But I can't get gcc to warn about it?

It produces some code, so it's not like the whole function has been
elided or something. I'm confused.
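
For reference, one way to restructure the pattern clang is warning about
is to assign the variable before any early return. This is a user-space
sketch with hypothetical stand-ins (struct iov_data, resource_size(),
resource_alignment()) for the kernel types and helpers in the quoted hunk.
It is not necessarily the fix that was applied, and the final return is
illustrative only.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pnv_iov_data. */
struct iov_data {
	int vfs_expanded;
};

/* Hypothetical stand-in for pci_iov_resource_size(). */
static size_t resource_size(int resno)
{
	return (size_t)4096 << resno;
}

size_t resource_alignment(const struct iov_data *iov, int resno)
{
	/* Assign before any early return so every path returns a defined value. */
	size_t align = resource_size(resno);

	if (!iov)
		return align;
	if (!iov->vfs_expanded)
		return align;

	/* Illustrative only: scale by the expanded VF count. */
	return (size_t)iov->vfs_expanded * align;
}
```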

cheers

^ permalink raw reply

