* Re: [PATCH] mm: memmap_init_zone() performance improvement [not found] <1349276174-8398-1-git-send-email-mike.yoknis@hp.com> @ 2012-10-06 23:59 ` Ni zhan Chen 2012-10-08 15:16 ` Mel Gorman 1 sibling, 0 replies; 13+ messages in thread From: Ni zhan Chen @ 2012-10-06 23:59 UTC (permalink / raw) To: Mike Yoknis Cc: mgorman, mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On 10/03/2012 10:56 PM, Mike Yoknis wrote: > memmap_init_zone() loops through every Page Frame Number (pfn), > including pfn values that are within the gaps between existing > memory sections. The unneeded looping will become a boot > performance issue when machines configure larger memory ranges > that will contain larger and more numerous gaps. > > The code will skip across invalid sections to reduce the > number of loops executed. looks reasonable to me. > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com> > --- > arch/x86/include/asm/mmzone_32.h | 2 ++ > arch/x86/include/asm/page_32.h | 1 + > arch/x86/include/asm/page_64_types.h | 3 ++- > include/asm-generic/page.h | 1 + > include/linux/mmzone.h | 6 ++++++ > mm/page_alloc.c | 5 ++++- > 6 files changed, 16 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/mmzone_32.h b/arch/x86/include/asm/mmzone_32.h > index eb05fb3..73c5c74 100644 > --- a/arch/x86/include/asm/mmzone_32.h > +++ b/arch/x86/include/asm/mmzone_32.h > @@ -48,6 +48,8 @@ static inline int pfn_to_nid(unsigned long pfn) > #endif > } > > +#define next_pfn_try(pfn) ((pfn)+1) > + > static inline int pfn_valid(int pfn) > { > int nid = pfn_to_nid(pfn); > diff --git a/arch/x86/include/asm/page_32.h b/arch/x86/include/asm/page_32.h > index da4e762..e2c4cfc 100644 > --- a/arch/x86/include/asm/page_32.h > +++ b/arch/x86/include/asm/page_32.h > @@ -19,6 +19,7 @@ extern unsigned long __phys_addr(unsigned long); > > #ifdef CONFIG_FLATMEM > #define pfn_valid(pfn) ((pfn) < max_mapnr) > +#define 
next_pfn_try(pfn) ((pfn)+1) > #endif /* CONFIG_FLATMEM */ > > #ifdef CONFIG_X86_USE_3DNOW > diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h > index 320f7bb..02d82e5 100644 > --- a/arch/x86/include/asm/page_64_types.h > +++ b/arch/x86/include/asm/page_64_types.h > @@ -69,7 +69,8 @@ extern void init_extra_mapping_wb(unsigned long phys, unsigned long size); > #endif /* !__ASSEMBLY__ */ > > #ifdef CONFIG_FLATMEM > -#define pfn_valid(pfn) ((pfn) < max_pfn) > +#define pfn_valid(pfn) ((pfn) < max_pfn) > +#define next_pfn_try(pfn) ((pfn)+1) > #endif > > #endif /* _ASM_X86_PAGE_64_DEFS_H */ > diff --git a/include/asm-generic/page.h b/include/asm-generic/page.h > index 37d1fe2..316200d 100644 > --- a/include/asm-generic/page.h > +++ b/include/asm-generic/page.h > @@ -91,6 +91,7 @@ extern unsigned long memory_end; > #endif > > #define pfn_valid(pfn) ((pfn) >= ARCH_PFN_OFFSET && ((pfn) - ARCH_PFN_OFFSET) < max_mapnr) > +#define next_pfn_try(pfn) ((pfn)+1) > > #define virt_addr_valid(kaddr) (((void *)(kaddr) >= (void *)PAGE_OFFSET) && \ > ((void *)(kaddr) < (void *)memory_end)) > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h > index f7d88ba..04d3c39 100644 > --- a/include/linux/mmzone.h > +++ b/include/linux/mmzone.h > @@ -1166,6 +1166,12 @@ static inline int pfn_valid(unsigned long pfn) > return 0; > return valid_section(__nr_to_section(pfn_to_section_nr(pfn))); > } > + > +static inline unsigned long next_pfn_try(unsigned long pfn) > +{ > + /* Skip entire section, because all of it is invalid. */ > + return section_nr_to_pfn(pfn_to_section_nr(pfn) + 1); > +} > #endif > > static inline int pfn_present(unsigned long pfn) > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 5b6b6b1..dd2af8b 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -3798,8 +3798,11 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, > * exist on hotplugged memory. 
> */ > if (context == MEMMAP_EARLY) { > - if (!early_pfn_valid(pfn)) > + if (!early_pfn_valid(pfn)) { > + pfn = next_pfn_try(pfn); > + pfn--; > continue; > + } > if (!early_pfn_in_nid(pfn, nid)) > continue; > } ^ permalink raw reply [flat|nested] 13+ messages in thread
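[Editorial note] For readers not steeped in the sparsemem helpers, the skip the patch performs can be sketched in userspace C. The section-size constant below is an illustrative assumption (x86_64 with 4 KiB pages uses 32768 pages per 128 MiB section), not something the patch itself defines:

```c
#include <assert.h>

/* Illustrative: 2^15 4 KiB pages = one 128 MiB sparsemem section. */
#define PFN_SECTION_SHIFT 15UL
#define PAGES_PER_SECTION (1UL << PFN_SECTION_SHIFT)

/* Which section a pfn lives in. */
static unsigned long pfn_to_section_nr(unsigned long pfn)
{
	return pfn >> PFN_SECTION_SHIFT;
}

/* First pfn of a given section. */
static unsigned long section_nr_to_pfn(unsigned long sec)
{
	return sec << PFN_SECTION_SHIFT;
}

/*
 * The patch's sparsemem next_pfn_try(): jump to the first pfn of the
 * next section, since an invalid pfn means its whole section is absent.
 */
static unsigned long next_pfn_try(unsigned long pfn)
{
	return section_nr_to_pfn(pfn_to_section_nr(pfn) + 1);
}
```

The hunk in memmap_init_zone() does `pfn = next_pfn_try(pfn); pfn--;` before the `continue`, so the loop's own `pfn++` lands exactly on the first pfn of the next section.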
* Re: [PATCH] mm: memmap_init_zone() performance improvement [not found] <1349276174-8398-1-git-send-email-mike.yoknis@hp.com> 2012-10-06 23:59 ` [PATCH] mm: memmap_init_zone() performance improvement Ni zhan Chen @ 2012-10-08 15:16 ` Mel Gorman 2012-10-09 0:42 ` Ni zhan Chen 2012-10-09 14:56 ` Mike Yoknis 1 sibling, 2 replies; 13+ messages in thread From: Mel Gorman @ 2012-10-08 15:16 UTC (permalink / raw) To: Mike Yoknis Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote: > memmap_init_zone() loops through every Page Frame Number (pfn), > including pfn values that are within the gaps between existing > memory sections. The unneeded looping will become a boot > performance issue when machines configure larger memory ranges > that will contain larger and more numerous gaps. > > The code will skip across invalid sections to reduce the > number of loops executed. > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com> This only helps SPARSEMEM and changes more headers than should be necessary. It would have been easier to do something simple like if (!early_pfn_valid(pfn)) { pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1; continue; } because that would obey the expectation that pages within a MAX_ORDER_NR_PAGES-aligned range are all valid or all invalid (ARM is the exception that breaks this rule). It would be less efficient on SPARSEMEM than what you're trying to merge but I do not see the need for the additional complexity unless you can show it makes a big difference to boot times. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
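[Editorial note] Mel's fragment leans on the kernel's ALIGN() round-up macro. A minimal userspace sketch follows; MAX_ORDER_NR_PAGES = 1024 is only an assumed illustrative value (the real value is 1 << (MAX_ORDER - 1) and is configuration dependent):

```c
#include <assert.h>

/* Kernel-style ALIGN(): round x up to a multiple of a (a power of two). */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Assumed for illustration; really 1 << (MAX_ORDER - 1). */
#define MAX_ORDER_NR_PAGES 1024UL

/*
 * Mel's suggested skip: the value assigned to pfn before the loop's own
 * pfn++ runs, so iteration resumes at the start of a later
 * MAX_ORDER-aligned block.
 */
static unsigned long skip_invalid(unsigned long pfn)
{
	return ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
}
```

For a block-aligned invalid pfn such as 0, skip_invalid() returns 1023, and the loop increment resumes scanning at 1024, the first pfn of the next MAX_ORDER block.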
* Re: [PATCH] mm: memmap_init_zone() performance improvement 2012-10-08 15:16 ` Mel Gorman @ 2012-10-09 0:42 ` Ni zhan Chen 2012-10-09 14:56 ` Mike Yoknis 1 sibling, 0 replies; 13+ messages in thread From: Ni zhan Chen @ 2012-10-09 0:42 UTC (permalink / raw) To: Mel Gorman Cc: Mike Yoknis, mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On 10/08/2012 11:16 PM, Mel Gorman wrote: > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote: >> memmap_init_zone() loops through every Page Frame Number (pfn), >> including pfn values that are within the gaps between existing >> memory sections. The unneeded looping will become a boot >> performance issue when machines configure larger memory ranges >> that will contain larger and more numerous gaps. >> >> The code will skip across invalid sections to reduce the >> number of loops executed. >> >> Signed-off-by: Mike Yoknis <mike.yoknis@hp.com> > This only helps SPARSEMEM and changes more headers than should be > necessary. It would have been easier to do something simple like > > if (!early_pfn_valid(pfn)) { > pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1; > continue; > } So can a present memory section in sparsemem contain a MAX_ORDER_NR_PAGES-aligned range that is entirely invalid? If the answer is yes, when will this happen? > > because that would obey the expectation that pages within a > MAX_ORDER_NR_PAGES-aligned range are all valid or all invalid (ARM is the > exception that breaks this rule). It would be less efficient on > SPARSEMEM than what you're trying to merge but I do not see the need for > the additional complexity unless you can show it makes a big difference > to boot times. > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] mm: memmap_init_zone() performance improvement 2012-10-08 15:16 ` Mel Gorman 2012-10-09 0:42 ` Ni zhan Chen @ 2012-10-09 14:56 ` Mike Yoknis 2012-10-19 19:53 ` Mike Yoknis 1 sibling, 1 reply; 13+ messages in thread From: Mike Yoknis @ 2012-10-09 14:56 UTC (permalink / raw) To: Mel Gorman Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote: > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote: > > memmap_init_zone() loops through every Page Frame Number (pfn), > > including pfn values that are within the gaps between existing > > memory sections. The unneeded looping will become a boot > > performance issue when machines configure larger memory ranges > > that will contain larger and more numerous gaps. > > > > The code will skip across invalid sections to reduce the > > number of loops executed. > > > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com> > > This only helps SPARSEMEM and changes more headers than should be > necessary. It would have been easier to do something simple like > > if (!early_pfn_valid(pfn)) { > pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1; > continue; > } > > because that would obey the expectation that pages within a > MAX_ORDER_NR_PAGES-aligned range are all valid or all invalid (ARM is the > exception that breaks this rule). It would be less efficient on > SPARSEMEM than what you're trying to merge but I do not see the need for > the additional complexity unless you can show it makes a big difference > to boot times. > Mel, I, too, was concerned that pfn_valid() was defined in so many header files. But, I did not feel that it was appropriate for me to try to restructure things to consolidate those definitions just to add this one new function. 
Being a kernel newbie I did not believe that I had a good enough understanding of what combinations and permutations of CONFIG and architecture may have made all of those different definitions necessary, so I left them in. Yes, indeed, this fix is targeted for systems that have holes in memory. That is where we see the problem. We are creating large computer systems and we would like for those machines to perform well, including boot times. Let me pass along the numbers I have. We have what we call an "architectural simulator". It is a computer program that pretends that it is a computer system. We use it to test the firmware before real hardware is available. We have booted Linux on our simulator. As you would expect it takes longer to boot on the simulator than it does on real hardware. With my patch - boot time 41 minutes Without patch - boot time 94 minutes These numbers do not scale linearly to real hardware. But indicate to me a place where Linux can be improved. Mike Yoknis ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] mm: memmap_init_zone() performance improvement 2012-10-09 14:56 ` Mike Yoknis @ 2012-10-19 19:53 ` Mike Yoknis 2012-10-20 8:29 ` Mel Gorman 0 siblings, 1 reply; 13+ messages in thread From: Mike Yoknis @ 2012-10-19 19:53 UTC (permalink / raw) To: Mel Gorman Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On Tue, 2012-10-09 at 08:56 -0600, Mike Yoknis wrote: > On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote: > > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote: > > > memmap_init_zone() loops through every Page Frame Number (pfn), > > > including pfn values that are within the gaps between existing > > > memory sections. The unneeded looping will become a boot > > > performance issue when machines configure larger memory ranges > > > that will contain larger and more numerous gaps. > > > > > > The code will skip across invalid sections to reduce the > > > number of loops executed. > > > > > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com> > > > > I do not see the need for > > the additional complexity unless you can show it makes a big difference > > to boot times. > > > > Mel, > > Let me pass along the numbers I have. We have what we call an > "architectural simulator". It is a computer program that pretends that > it is a computer system. We use it to test the firmware before real > hardware is available. We have booted Linux on our simulator. As you > would expect it takes longer to boot on the simulator than it does on > real hardware. > > With my patch - boot time 41 minutes > Without patch - boot time 94 minutes > > These numbers do not scale linearly to real hardware. But indicate to > me a place where Linux can be improved. > > Mike Yoknis > Mel, I finally got access to prototype hardware. It is a relatively small machine with only 64GB of RAM. I put in a time measurement by reading the TSC register. 
I booted both with and without my patch - Without patch - [ 0.000000] Normal zone: 13400064 pages, LIFO batch:31 [ 0.000000] memmap_init_zone() enter 1404184834218 [ 0.000000] memmap_init_zone() exit 1411174884438 diff = 6990050220 With patch - [ 0.000000] Normal zone: 13400064 pages, LIFO batch:31 [ 0.000000] memmap_init_zone() enter 1555530050778 [ 0.000000] memmap_init_zone() exit 1559379204643 diff = 3849153865 This shows that without the patch the routine spends 45% of its time spinning unnecessarily. Mike Yoknis ^ permalink raw reply [flat|nested] 13+ messages in thread
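[Editorial note] As a sanity check, the quoted deltas and the 45% figure can be recomputed directly from the raw TSC stamps in the boot log (integer percentage, as in the message):

```c
#include <assert.h>

/* Elapsed TSC ticks between the enter/exit stamps in the boot log. */
static unsigned long long tsc_delta(unsigned long long t_enter,
				    unsigned long long t_exit)
{
	return t_exit - t_enter;
}

/* Integer percentage of the unpatched time eliminated by the patch. */
static unsigned int percent_saved(unsigned long long without_patch,
				  unsigned long long with_patch)
{
	return (unsigned int)(100 - (100 * with_patch) / without_patch);
}
```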
* Re: [PATCH] mm: memmap_init_zone() performance improvement 2012-10-19 19:53 ` Mike Yoknis @ 2012-10-20 8:29 ` Mel Gorman 2012-10-24 15:47 ` Mike Yoknis 2012-10-30 15:14 ` [PATCH] " Dave Hansen 0 siblings, 2 replies; 13+ messages in thread From: Mel Gorman @ 2012-10-20 8:29 UTC (permalink / raw) To: Mike Yoknis Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On Fri, Oct 19, 2012 at 01:53:18PM -0600, Mike Yoknis wrote: > On Tue, 2012-10-09 at 08:56 -0600, Mike Yoknis wrote: > > On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote: > > > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote: > > > > memmap_init_zone() loops through every Page Frame Number (pfn), > > > > including pfn values that are within the gaps between existing > > > > memory sections. The unneeded looping will become a boot > > > > performance issue when machines configure larger memory ranges > > > > that will contain larger and more numerous gaps. > > > > > > > > The code will skip across invalid sections to reduce the > > > > number of loops executed. > > > > > > > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com> > > > > > > I do not see the need for > > > the additional complexity unless you can show it makes a big difference > > > to boot times. > > > > > > > Mel, > > > > Let me pass along the numbers I have. We have what we call an > > "architectural simulator". It is a computer program that pretends that > > it is a computer system. We use it to test the firmware before real > > hardware is available. We have booted Linux on our simulator. As you > > would expect it takes longer to boot on the simulator than it does on > > real hardware. > > > > With my patch - boot time 41 minutes > > Without patch - boot time 94 minutes > > > > These numbers do not scale linearly to real hardware. But indicate to > > me a place where Linux can be improved. 
> > > > Mike Yoknis > > > Mel, > I finally got access to prototype hardware. > It is a relatively small machine with only 64GB of RAM. > > I put in a time measurement by reading the TSC register. > I booted both with and without my patch - > > Without patch - > [ 0.000000] Normal zone: 13400064 pages, LIFO batch:31 > [ 0.000000] memmap_init_zone() enter 1404184834218 > [ 0.000000] memmap_init_zone() exit 1411174884438 diff = 6990050220 > > With patch - > [ 0.000000] Normal zone: 13400064 pages, LIFO batch:31 > [ 0.000000] memmap_init_zone() enter 1555530050778 > [ 0.000000] memmap_init_zone() exit 1559379204643 diff = 3849153865 > > This shows that without the patch the routine spends 45% > of its time spinning unnecessarily. > I'm travelling at the moment so apologies that I have not followed up on this. My problem is still the same with the patch - it changes more headers than is necessary and it is sparsemem specific. At minimum, try the suggestion of if (!early_pfn_valid(pfn)) { pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1; continue; } and see how much it gains you as it should work on all memory models. If it turns out that you really need to skip whole sections then the stride could be MAX_ORDER_NR_PAGES on all memory models except sparsemem where the stride would be PAGES_PER_SECTION -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] mm: memmap_init_zone() performance improvement 2012-10-20 8:29 ` Mel Gorman @ 2012-10-24 15:47 ` Mike Yoknis 2012-10-25 9:44 ` Mel Gorman 2012-10-30 15:14 ` [PATCH] " Dave Hansen 1 sibling, 1 reply; 13+ messages in thread From: Mike Yoknis @ 2012-10-24 15:47 UTC (permalink / raw) To: Mel Gorman Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On Sat, 2012-10-20 at 09:29 +0100, Mel Gorman wrote: > On Fri, Oct 19, 2012 at 01:53:18PM -0600, Mike Yoknis wrote: > > On Tue, 2012-10-09 at 08:56 -0600, Mike Yoknis wrote: > > > On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote: > > > > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote: > > > > > memmap_init_zone() loops through every Page Frame Number (pfn), > > > > > including pfn values that are within the gaps between existing > > > > > memory sections. The unneeded looping will become a boot > > > > > performance issue when machines configure larger memory ranges > > > > > that will contain larger and more numerous gaps. > > > > > > > > > > The code will skip across invalid sections to reduce the > > > > > number of loops executed. > > > > > > > > > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com> > > > > > > > > I do not see the need for > > > > the additional complexity unless you can show it makes a big difference > > > > to boot times. > > > > > > > > > > Mel, > > > > > > Let me pass along the numbers I have. We have what we call an > > > "architectural simulator". It is a computer program that pretends that > > > it is a computer system. We use it to test the firmware before real > > > hardware is available. We have booted Linux on our simulator. As you > > > would expect it takes longer to boot on the simulator than it does on > > > real hardware. > > > > > > With my patch - boot time 41 minutes > > > Without patch - boot time 94 minutes > > > > > > These numbers do not scale linearly to real hardware. 
But indicate to > > > me a place where Linux can be improved. > > > > > > Mike Yoknis > > > > > Mel, > > I finally got access to prototype hardware. > > It is a relatively small machine with only 64GB of RAM. > > > > I put in a time measurement by reading the TSC register. > > I booted both with and without my patch - > > > > Without patch - > > [ 0.000000] Normal zone: 13400064 pages, LIFO batch:31 > > [ 0.000000] memmap_init_zone() enter 1404184834218 > > [ 0.000000] memmap_init_zone() exit 1411174884438 diff = 6990050220 > > > > With patch - > > [ 0.000000] Normal zone: 13400064 pages, LIFO batch:31 > > [ 0.000000] memmap_init_zone() enter 1555530050778 > > [ 0.000000] memmap_init_zone() exit 1559379204643 diff = 3849153865 > > > > This shows that without the patch the routine spends 45% > > of its time spinning unnecessarily. > > > > I'm travelling at the moment so apologies that I have not followed up on > this. My problem is still the same with the patch - it changes more > headers than is necessary and it is sparsemem specific. At minimum, try > the suggestion of > > if (!early_pfn_valid(pfn)) { > pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1; > continue; > } > > and see how much it gains you as it should work on all memory models. If > it turns out that you really need to skip whole sections then the strice > could MAX_ORDER_NR_PAGES on all memory models except sparsemem where the > stride would be PAGES_PER_SECTION > Mel, I tried your suggestion. I re-ran all 3 methods on our latest firmware. The following are TSC difference numbers (*10^6) to execute memmap_init_zone() - No patch - 7010 Mel's patch- 3918 My patch - 3847 The incremental improvement of my method is not significant vs. yours. If you believe your suggested change is worthwhile I will create a v2 patch. Mike Y ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] mm: memmap_init_zone() performance improvement 2012-10-24 15:47 ` Mike Yoknis @ 2012-10-25 9:44 ` Mel Gorman 2012-10-26 22:47 ` [PATCH v2] " Mike Yoknis 0 siblings, 1 reply; 13+ messages in thread From: Mel Gorman @ 2012-10-25 9:44 UTC (permalink / raw) To: Mike Yoknis Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On Wed, Oct 24, 2012 at 09:47:47AM -0600, Mike Yoknis wrote: > On Sat, 2012-10-20 at 09:29 +0100, Mel Gorman wrote: > > On Fri, Oct 19, 2012 at 01:53:18PM -0600, Mike Yoknis wrote: > > > On Tue, 2012-10-09 at 08:56 -0600, Mike Yoknis wrote: > > > > On Mon, 2012-10-08 at 16:16 +0100, Mel Gorman wrote: > > > > > On Wed, Oct 03, 2012 at 08:56:14AM -0600, Mike Yoknis wrote: > > > > > > memmap_init_zone() loops through every Page Frame Number (pfn), > > > > > > including pfn values that are within the gaps between existing > > > > > > memory sections. The unneeded looping will become a boot > > > > > > performance issue when machines configure larger memory ranges > > > > > > that will contain larger and more numerous gaps. > > > > > > > > > > > > The code will skip across invalid sections to reduce the > > > > > > number of loops executed. > > > > > > > > > > > > Signed-off-by: Mike Yoknis <mike.yoknis@hp.com> > > > > > > > > > > I do not see the need for > > > > > the additional complexity unless you can show it makes a big difference > > > > > to boot times. > > > > > > > > > > > > > Mel, > > > > > > > > Let me pass along the numbers I have. We have what we call an > > > > "architectural simulator". It is a computer program that pretends that > > > > it is a computer system. We use it to test the firmware before real > > > > hardware is available. We have booted Linux on our simulator. As you > > > > would expect it takes longer to boot on the simulator than it does on > > > > real hardware. 
> > > > > > > > With my patch - boot time 41 minutes > > > > Without patch - boot time 94 minutes > > > > > > > > These numbers do not scale linearly to real hardware. But indicate to > > > > me a place where Linux can be improved. > > > > > > > > Mike Yoknis > > > > > > > Mel, > > > I finally got access to prototype hardware. > > > It is a relatively small machine with only 64GB of RAM. > > > > > > I put in a time measurement by reading the TSC register. > > > I booted both with and without my patch - > > > > > > Without patch - > > > [ 0.000000] Normal zone: 13400064 pages, LIFO batch:31 > > > [ 0.000000] memmap_init_zone() enter 1404184834218 > > > [ 0.000000] memmap_init_zone() exit 1411174884438 diff = 6990050220 > > > > > > With patch - > > > [ 0.000000] Normal zone: 13400064 pages, LIFO batch:31 > > > [ 0.000000] memmap_init_zone() enter 1555530050778 > > > [ 0.000000] memmap_init_zone() exit 1559379204643 diff = 3849153865 > > > > > > This shows that without the patch the routine spends 45% > > > of its time spinning unnecessarily. > > > > > > > I'm travelling at the moment so apologies that I have not followed up on > > this. My problem is still the same with the patch - it changes more > > headers than is necessary and it is sparsemem specific. At minimum, try > > the suggestion of > > > > if (!early_pfn_valid(pfn)) { > > pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1; > > continue; > > } > > > > and see how much it gains you as it should work on all memory models. If > > it turns out that you really need to skip whole sections then the strice > > could MAX_ORDER_NR_PAGES on all memory models except sparsemem where the > > stride would be PAGES_PER_SECTION > > > Mel, > I tried your suggestion. I re-ran all 3 methods on our latest firmware. 
> > The following are TSC difference numbers (*10^6) to execute > memmap_init_zone() - > > No patch - 7010 > Mel's patch- 3918 > My patch - 3847 > > The incremental improvement of my method is not significant vs. yours. > > If you believe your suggested change is worthwhile I will create a v2 > patch. I think it is a reasonable change and I prefer my suggestion because it should work for all memory models. Please do a V2 of the patch. I'm still travelling at the moment (writing this from an airport) but I'll be back online next Tuesday and will review it when I can. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2] mm: memmap_init_zone() performance improvement 2012-10-25 9:44 ` Mel Gorman @ 2012-10-26 22:47 ` Mike Yoknis 2012-10-30 22:31 ` Andrew Morton 0 siblings, 1 reply; 13+ messages in thread From: Mike Yoknis @ 2012-10-26 22:47 UTC (permalink / raw) To: Mel Gorman Cc: mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm memmap_init_zone() loops through every Page Frame Number (pfn), including pfn values that are within the gaps between existing memory sections. The unneeded looping will become a boot performance issue when machines configure larger memory ranges that will contain larger and more numerous gaps. The code will skip across invalid pfn values to reduce the number of loops executed. Signed-off-by: Mike Yoknis <mike.yoknis@hp.com> --- mm/page_alloc.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 45c916b..9f9c1a6 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -3857,8 +3857,11 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, * exist on hotplugged memory. */ if (context == MEMMAP_EARLY) { - if (!early_pfn_valid(pfn)) + if (!early_pfn_valid(pfn)) { + pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, + MAX_ORDER_NR_PAGES) - 1; continue; + } if (!early_pfn_in_nid(pfn, nid)) continue; } -- 1.7.11.3 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2] mm: memmap_init_zone() performance improvement 2012-10-26 22:47 ` [PATCH v2] " Mike Yoknis @ 2012-10-30 22:31 ` Andrew Morton 0 siblings, 0 replies; 13+ messages in thread From: Andrew Morton @ 2012-10-30 22:31 UTC (permalink / raw) To: mike.yoknis Cc: Mel Gorman, mingo, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On Fri, 26 Oct 2012 16:47:47 -0600 Mike Yoknis <mike.yoknis@hp.com> wrote: > memmap_init_zone() loops through every Page Frame Number (pfn), > including pfn values that are within the gaps between existing > memory sections. The unneeded looping will become a boot > performance issue when machines configure larger memory ranges > that will contain larger and more numerous gaps. > > The code will skip across invalid pfn values to reduce the > number of loops executed. > So I was wondering how much difference this makes. Then I see Mel already asked and was answered. The lesson: please treat a reviewer question as a sign that the changelog needs more information! I added this text to the changelog: : We have what we call an "architectural simulator". It is a computer : program that pretends that it is a computer system. We use it to test the : firmware before real hardware is available. We have booted Linux on our : simulator. As you would expect it takes longer to boot on the simulator : than it does on real hardware. : : With my patch - boot time 41 minutes : Without patch - boot time 94 minutes : : These numbers do not scale linearly to real hardware. But indicate to me : a place where Linux can be improved. > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -3857,8 +3857,11 @@ void __meminit memmap_init_zone(unsigned long > size, int nid, unsigned long zone, > * exist on hotplugged memory. 
> */ > if (context == MEMMAP_EARLY) { > - if (!early_pfn_valid(pfn)) > + if (!early_pfn_valid(pfn)) { > + pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, > + MAX_ORDER_NR_PAGES) - 1; > continue; > + } > if (!early_pfn_in_nid(pfn, nid)) > continue; > } So what is the assumption here? That each zone's first page has a pfn which is a multiple of MAX_ORDER_NR_PAGES? That seems reasonable, but is it actually true, for all architectures and for all time? Where did this come from? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] mm: memmap_init_zone() performance improvement 2012-10-20 8:29 ` Mel Gorman 2012-10-24 15:47 ` Mike Yoknis @ 2012-10-30 15:14 ` Dave Hansen 2012-11-06 16:03 ` Mike Yoknis 1 sibling, 1 reply; 13+ messages in thread From: Dave Hansen @ 2012-10-30 15:14 UTC (permalink / raw) To: Mel Gorman Cc: Mike Yoknis, mingo, akpm, linux-arch, mmarek, tglx, hpa, arnd, sam, minchan, kamezawa.hiroyu, mhocko, linux-kbuild, linux-kernel, linux-mm On 10/20/2012 01:29 AM, Mel Gorman wrote: > I'm travelling at the moment so apologies that I have not followed up on > this. My problem is still the same with the patch - it changes more > headers than is necessary and it is sparsemem specific. At minimum, try > the suggestion of > > if (!early_pfn_valid(pfn)) { > pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1; > continue; > } Sorry I didn't catch this until v2... Is that ALIGN() correct? If pfn=3, then it would expand to: (3+MAX_ORDER_NR_PAGES+MAX_ORDER_NR_PAGES-1) & ~(MAX_ORDER_NR_PAGES-1) You would end up skipping the current MAX_ORDER_NR_PAGES area, and then one _extra_ because ALIGN() aligns up, and you're adding MAX_ORDER_NR_PAGES too. It doesn't matter unless you run in to a !early_valid_pfn() in the middle of a MAX_ORDER area, I guess. I think this would work, plus be a bit smaller: pfn = ALIGN(pfn + 1, MAX_ORDER_NR_PAGES) - 1; ^ permalink raw reply [flat|nested] 13+ messages in thread
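[Editorial note] Dave's worked example is easy to check numerically; as before, MAX_ORDER_NR_PAGES = 1024 is only an assumed illustrative value:

```c
#include <assert.h>

/* Kernel-style ALIGN(): round x up to a multiple of a (a power of two). */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))
#define MAX_ORDER_NR_PAGES 1024UL	/* assumed for illustration */

/* Mel's original expression. */
static unsigned long skip_mel(unsigned long pfn)
{
	return ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1;
}

/* Dave's variant: stop at the end of the *current* MAX_ORDER block. */
static unsigned long skip_dave(unsigned long pfn)
{
	return ALIGN(pfn + 1, MAX_ORDER_NR_PAGES) - 1;
}
```

The two expressions agree when pfn is already the first page of a block, but for pfn = 3 Mel's form yields 2047 (resuming at 2048, one block further than necessary) where Dave's yields 1023 (resuming at 1024), confirming the extra skip he describes.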
* Re: [PATCH] mm: memmap_init_zone() performance improvement 2012-10-30 15:14 ` [PATCH] " Dave Hansen @ 2012-11-06 16:03 ` Mike Yoknis 2012-12-18 23:03 ` Andrew Morton 0 siblings, 1 reply; 13+ messages in thread From: Mike Yoknis @ 2012-11-06 16:03 UTC (permalink / raw) To: Dave Hansen Cc: Mel Gorman, mingo@redhat.com, akpm@linux-foundation.org, linux-arch@vger.kernel.org, mmarek@suse.cz, tglx@linutronix.de, hpa@zytor.com, arnd@arndb.de, sam@ravnborg.org, minchan@kernel.org, kamezawa.hiroyu@jp.fujitsu.com, mhocko@suse.cz, linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org On Tue, 2012-10-30 at 09:14 -0600, Dave Hansen wrote: > On 10/20/2012 01:29 AM, Mel Gorman wrote: > > I'm travelling at the moment so apologies that I have not followed up on > > this. My problem is still the same with the patch - it changes more > > headers than is necessary and it is sparsemem specific. At minimum, try > > the suggestion of > > > > if (!early_pfn_valid(pfn)) { > > pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1; > > continue; > > } > > Sorry I didn't catch this until v2... > > Is that ALIGN() correct? If pfn=3, then it would expand to: > > (3+MAX_ORDER_NR_PAGES+MAX_ORDER_NR_PAGES-1) & ~(MAX_ORDER_NR_PAGES-1) > > You would end up skipping the current MAX_ORDER_NR_PAGES area, and then > one _extra_ because ALIGN() aligns up, and you're adding > MAX_ORDER_NR_PAGES too. It doesn't matter unless you run in to a > !early_valid_pfn() in the middle of a MAX_ORDER area, I guess. > > I think this would work, plus be a bit smaller: > > pfn = ALIGN(pfn + 1, MAX_ORDER_NR_PAGES) - 1; > Dave, I see your point about "rounding-up". But, I favor the way Mel suggested it. It more clearly shows the intent, which is to move up by MAX_ORDER_NR_PAGES. The "pfn+1" may suggest that there is some significance to the next pfn, but there is not. I find Mel's way easier to understand. Mike Y ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] mm: memmap_init_zone() performance improvement 2012-11-06 16:03 ` Mike Yoknis @ 2012-12-18 23:03 ` Andrew Morton 0 siblings, 0 replies; 13+ messages in thread From: Andrew Morton @ 2012-12-18 23:03 UTC (permalink / raw) To: mike.yoknis Cc: Dave Hansen, Mel Gorman, mingo@redhat.com, linux-arch@vger.kernel.org, mmarek@suse.cz, tglx@linutronix.de, hpa@zytor.com, arnd@arndb.de, sam@ravnborg.org, minchan@kernel.org, kamezawa.hiroyu@jp.fujitsu.com, mhocko@suse.cz, linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org On Tue, 06 Nov 2012 09:03:26 -0700 Mike Yoknis <mike.yoknis@hp.com> wrote: > On Tue, 2012-10-30 at 09:14 -0600, Dave Hansen wrote: > > On 10/20/2012 01:29 AM, Mel Gorman wrote: > > > I'm travelling at the moment so apologies that I have not followed up on > > > this. My problem is still the same with the patch - it changes more > > > headers than is necessary and it is sparsemem specific. At minimum, try > > > the suggestion of > > > > > > if (!early_pfn_valid(pfn)) { > > > pfn = ALIGN(pfn + MAX_ORDER_NR_PAGES, MAX_ORDER_NR_PAGES) - 1; > > > continue; > > > } > > > > Sorry I didn't catch this until v2... > > > > Is that ALIGN() correct? If pfn=3, then it would expand to: > > > > (3+MAX_ORDER_NR_PAGES+MAX_ORDER_NR_PAGES-1) & ~(MAX_ORDER_NR_PAGES-1) > > > > You would end up skipping the current MAX_ORDER_NR_PAGES area, and then > > one _extra_ because ALIGN() aligns up, and you're adding > > MAX_ORDER_NR_PAGES too. It doesn't matter unless you run in to a > > !early_valid_pfn() in the middle of a MAX_ORDER area, I guess. > > > > I think this would work, plus be a bit smaller: > > > > pfn = ALIGN(pfn + 1, MAX_ORDER_NR_PAGES) - 1; > > > Dave, > I see your point about "rounding-up". But, I favor the way Mel > suggested it. It more clearly shows the intent, which is to move up by > MAX_ORDER_NR_PAGES. The "pfn+1" may suggest that there is some > significance to the next pfn, but there is not. 
> I find Mel's way easier to understand. I don't think that really answers Dave's question. What happens if we "run in to a !early_valid_pfn() in the middle of a MAX_ORDER area"? ^ permalink raw reply [flat|nested] 13+ messages in thread
Thread overview: 13+ messages
[not found] <1349276174-8398-1-git-send-email-mike.yoknis@hp.com>
2012-10-06 23:59 ` [PATCH] mm: memmap_init_zone() performance improvement Ni zhan Chen
2012-10-08 15:16 ` Mel Gorman
2012-10-09 0:42 ` Ni zhan Chen
2012-10-09 14:56 ` Mike Yoknis
2012-10-19 19:53 ` Mike Yoknis
2012-10-20 8:29 ` Mel Gorman
2012-10-24 15:47 ` Mike Yoknis
2012-10-25 9:44 ` Mel Gorman
2012-10-26 22:47 ` [PATCH v2] " Mike Yoknis
2012-10-30 22:31 ` Andrew Morton
2012-10-30 15:14 ` [PATCH] " Dave Hansen
2012-11-06 16:03 ` Mike Yoknis
2012-12-18 23:03 ` Andrew Morton