linux-kernel.vger.kernel.org archive mirror
From: Robin Holt <holt@sgi.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Robin Holt <holt@sgi.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Ingo Molnar <mingo@kernel.org>, Nate Zimmer <nzimmer@sgi.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, Rob Landley <rob@landley.net>,
	Mike Travis <travis@sgi.com>,
	Daniel J Blueman <daniel@numascale-asia.com>,
	Greg KH <gregkh@linuxfoundation.org>,
	Yinghai Lu <yinghai@kernel.org>, Mel Gorman <mgorman@suse.de>
Subject: Re: [RFC 4/4] Sparse initialization of struct page array.
Date: Tue, 16 Jul 2013 05:38:58 -0500
Message-ID: <20130716103857.GH3421@sgi.com>
In-Reply-To: <20130715143037.8287ffbf2fb0e72bc8efb287@linux-foundation.org>

On Mon, Jul 15, 2013 at 02:30:37PM -0700, Andrew Morton wrote:
> On Thu, 11 Jul 2013 21:03:55 -0500 Robin Holt <holt@sgi.com> wrote:
> 
> > During boot of large memory machines, a significant portion of boot
> > is spent initializing the struct page array.  The vast majority of
> > those pages are not referenced during boot.
> > 
> > Change this over to only initializing the pages when they are
> > actually allocated.
> > 
> > Besides the advantage of boot speed, this allows us the chance to
> > use normal performance monitoring tools to determine where the bulk
> > of time is spent during page initialization.
> > 
> > ...
> >
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1330,8 +1330,19 @@ static inline void __free_reserved_page(struct page *page)
> >  	__free_page(page);
> >  }
> >  
> > +extern void __reserve_bootmem_region(phys_addr_t start, phys_addr_t end);
> > +
> > +static inline void __reserve_bootmem_page(struct page *page)
> > +{
> > +	phys_addr_t start = page_to_pfn(page) << PAGE_SHIFT;
> > +	phys_addr_t end = start + PAGE_SIZE;
> > +
> > +	__reserve_bootmem_region(start, end);
> > +}
> 
> It isn't obvious that this needed to be inlined?

It is being defined in a header file.  All the other functions I came
across in that header file are defined as inline (or __always_inline).
It feels to me like this is right.  Can I leave it as-is?

> 
> >  static inline void free_reserved_page(struct page *page)
> >  {
> > +	__reserve_bootmem_page(page);
> >  	__free_reserved_page(page);
> >  	adjust_managed_page_count(page, 1);
> >  }
> > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> > index 6d53675..79e8eb7 100644
> > --- a/include/linux/page-flags.h
> > +++ b/include/linux/page-flags.h
> > @@ -83,6 +83,7 @@ enum pageflags {
> >  	PG_owner_priv_1,	/* Owner use. If pagecache, fs may use*/
> >  	PG_arch_1,
> >  	PG_reserved,
> > +	PG_uninitialized2mib,	/* Is this the right spot? ntz - Yes - rmh */
> 
> "mib" creeps me out too.  And it makes me think of SNMP, which I'd
> prefer not to think about.
> 
> We've traditionally had fears of running out of page flags, but I've
> lost track of how close we are to that happening.  IIRC the answer
> depends on whether you believe there is such a thing as a 32-bit NUMA
> system.
> 
> Can this be avoided anyway?  I suspect there's some idiotic combination
> of flags we could use to indicate the state.  PG_reserved|PG_lru or
> something.
> 
> "2MB" sounds terribly arch-specific.  Shouldn't we make it more generic
> for when the hexagon64 port wants to use 4MB?
> 
> That conversational code comment was already commented on, but it's
> still there?

I am going to work on making it non-2m based over the course of this
week, so expect the _2m (current name based on Yinghai's comments)
to go away entirely.

> > 
> > ...
> >
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -740,6 +740,54 @@ static void __init_single_page(struct page *page, unsigned long zone, int nid, i
> >  #endif
> >  }
> >  
> > +static void expand_page_initialization(struct page *basepage)
> > +{
> > +	unsigned long pfn = page_to_pfn(basepage);
> > +	unsigned long end_pfn = pfn + PTRS_PER_PMD;
> > +	unsigned long zone = page_zonenum(basepage);
> > +	int reserved = PageReserved(basepage);
> > +	int nid = page_to_nid(basepage);
> > +
> > +	ClearPageUninitialized2Mib(basepage);
> > +
> > +	for( pfn++; pfn < end_pfn; pfn++ )
> > +		__init_single_page(pfn_to_page(pfn), zone, nid, reserved);
> > +}
> > +
> > +void ensure_pages_are_initialized(unsigned long start_pfn,
> > +				  unsigned long end_pfn)
> 
> I think this can be made static.  I hope so, as it's a somewhat
> odd-sounding identifier for a global.

Done.

> > +{
> > +	unsigned long aligned_start_pfn = start_pfn & ~(PTRS_PER_PMD - 1);
> > +	unsigned long aligned_end_pfn;
> > +	struct page *page;
> > +
> > +	aligned_end_pfn = end_pfn & ~(PTRS_PER_PMD - 1);
> > +	aligned_end_pfn += PTRS_PER_PMD;
> > +	while (aligned_start_pfn < aligned_end_pfn) {
> > +		if (pfn_valid(aligned_start_pfn)) {
> > +			page = pfn_to_page(aligned_start_pfn);
> > +
> > +			if(PageUninitialized2Mib(page))
> 
> checkpatch them, please.

Will certainly do.

> > +				expand_page_initialization(page);
> > +		}
> > +
> > +		aligned_start_pfn += PTRS_PER_PMD;
> > +	}
> > +}
> 
> Some nice code comments for the above two functions would be helpful.

Will do.
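
To make the intent of the two functions above easier to follow, here is
a minimal userspace sketch of the same idea.  It is a model under stated
assumptions, not the patch itself: the sk_* names, the 512-page block
size (4 KiB pages, 2 MiB blocks) and the tiny NR_PAGES array are all
hypothetical, and the bounds check only stands in for pfn_valid().

```c
#include <stdbool.h>

#define BLOCK_PAGES 512UL		/* assumption: pages per block */
#define NR_PAGES    (4 * BLOCK_PAGES)	/* assumption: tiny model memory */

struct sk_page {
	bool initialized;	/* models a fully set up struct page */
	bool uninit_block;	/* models the new flag; base pages only */
};

static struct sk_page sk_pages[NR_PAGES];

/* Boot time: set up only the first page of each block and flag it. */
static void sk_sparse_init(void)
{
	unsigned long pfn;

	for (pfn = 0; pfn < NR_PAGES; pfn++) {
		bool base = (pfn % BLOCK_PAGES) == 0;

		sk_pages[pfn].initialized = base;
		sk_pages[pfn].uninit_block = base;
	}
}

/* Clear the flag on the base page, then set up the rest of its block. */
static void sk_expand(unsigned long base)
{
	unsigned long pfn;

	sk_pages[base].uninit_block = false;
	for (pfn = base + 1; pfn < base + BLOCK_PAGES; pfn++)
		sk_pages[pfn].initialized = true;
}

/* Expand every still-flagged block from start_pfn's block to end_pfn's. */
static void sk_ensure_initialized(unsigned long start_pfn,
				  unsigned long end_pfn)
{
	unsigned long pfn = start_pfn & ~(BLOCK_PAGES - 1);
	unsigned long stop = (end_pfn & ~(BLOCK_PAGES - 1)) + BLOCK_PAGES;

	for (; pfn < stop && pfn < NR_PAGES; pfn += BLOCK_PAGES)
		if (sk_pages[pfn].uninit_block)
			sk_expand(pfn);
}

/* Helper for inspecting the model: how many pages are set up so far. */
static unsigned long sk_count_initialized(void)
{
	unsigned long pfn, n = 0;

	for (pfn = 0; pfn < NR_PAGES; pfn++)
		n += sk_pages[pfn].initialized;
	return n;
}
```

As in the quoted loop, the block containing end_pfn is always visited,
so a range that ends exactly on a block boundary also expands the block
that starts there.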

> > 
> > ...
> >
> > +int __meminit pfn_range_init_avail(unsigned long pfn, unsigned long end_pfn,
> > +				   unsigned long size, int nid)
> > +{
> > +	unsigned long validate_end_pfn = pfn + size;
> > +
> > +	if (pfn & (size - 1))
> > +		return 1;
> > +
> > +	if (pfn + size >= end_pfn)
> > +		return 1;
> > +
> > +	while (pfn < validate_end_pfn)
> > +	{
> > +		if (!early_pfn_valid(pfn))
> > +			return 1;
> > +		if (!early_pfn_in_nid(pfn, nid))
> > +			return 1;
> > +		pfn++;
> > + 	}
> > +
> > +	return size;
> > +}
> 
> Document it, please.  The return value semantics look odd, so don't
> forget to explain all that as well.

Will do.  Will also work on the name to make it clearer what we
are returning.
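
In the meantime, the return value as I read the quoted code is "how many
pfns the caller may step over": size when the whole block can be covered
by one initialized page, 1 otherwise.  A rough userspace sketch of that
rule follows.  The name pages_to_step() and the two sketch_* predicates
are hypothetical stand-ins for early_pfn_valid() and early_pfn_in_nid(),
the memory layout is invented for illustration, and size is assumed to
be a power of two.

```c
#include <stdbool.h>

/* Assumption: the model machine has pfns 0..4095. */
static bool sketch_pfn_valid(unsigned long pfn)
{
	return pfn < 4096;
}

/* Assumption: 2048 pages per node, nodes numbered from 0. */
static bool sketch_pfn_in_nid(unsigned long pfn, int nid)
{
	return (int)(pfn / 2048) == nid;
}

/*
 * Return size if [pfn, pfn + size) is size-aligned, ends before end_pfn,
 * and every pfn in it is valid and on node nid.  Return 1 otherwise, so
 * the caller falls back to handling a single page.
 */
static unsigned long pages_to_step(unsigned long pfn, unsigned long end_pfn,
				   unsigned long size, int nid)
{
	unsigned long p;

	if (pfn & (size - 1))
		return 1;	/* not aligned to the block size */

	if (pfn + size >= end_pfn)
		return 1;	/* block reaches or passes end_pfn */

	for (p = pfn; p < pfn + size; p++)
		if (!sketch_pfn_valid(p) || !sketch_pfn_in_nid(p, nid))
			return 1;

	return size;
}
```

Note that the ">=" comparison, copied from the quoted code, means a
block ending exactly at end_pfn also gets 1; for example,
pages_to_step(3584, 4096, 512, 1) returns 1 in this model.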

> > 
> > ...
> >
> > @@ -6196,6 +6302,7 @@ static const struct trace_print_flags pageflag_names[] = {
> >  	{1UL << PG_owner_priv_1,	"owner_priv_1"	},
> >  	{1UL << PG_arch_1,		"arch_1"	},
> >  	{1UL << PG_reserved,		"reserved"	},
> > +	{1UL << PG_uninitialized2mib,	"Uninit_2MiB"	},
> 
> It would be better if the name which is visible in procfs matches the
> name in the kernel source code.

Done and will try to maintain the consistency.
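
One way the two names could be kept from drifting apart is to generate
both the enum and the name table from a single list.  This is only a
sketch of that technique, not how the kernel's pageflag_names[] is
built today; the SK_ prefix, the flag subset and sk_flag_name() are
hypothetical.

```c
/* Hypothetical subset of flags; the real enum pageflags is larger. */
#define SK_PAGEFLAGS(X)	\
	X(owner_priv_1)	\
	X(arch_1)	\
	X(reserved)	\
	X(uninitialized)

/* Expand the list once into enumerators: SK_PG_reserved, ... */
enum sk_pageflags {
#define X(name) SK_PG_##name,
	SK_PAGEFLAGS(X)
#undef X
	SK_NR_PAGEFLAGS
};

/* Expand the same list into strings, indexed by the enumerator. */
static const char *const sk_pageflag_names[] = {
#define X(name) [SK_PG_##name] = #name,
	SK_PAGEFLAGS(X)
#undef X
};

/* Look up the printable name for a flag bit number. */
static const char *sk_flag_name(unsigned int bit)
{
	return bit < SK_NR_PAGEFLAGS ? sk_pageflag_names[bit] : "unknown";
}
```

With this layout the printed string is always the enumerator's own
suffix, so renaming a flag in the list renames it in both places.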

> >  	{1UL << PG_private,		"private"	},
> >  	{1UL << PG_private_2,		"private_2"	},
> >  	{1UL << PG_writeback,		"writeback"	},

Robin
