From: Minchan Kim <minchan@kernel.org>
To: Borislav Petkov <bp@amd64.org>
Cc: Tejun Heo <tj@kernel.org>, Ralf Baechle <ralf@linux-mips.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: WARNING: at mm/page_alloc.c:4514 free_area_init_node+0x4f/0x37b()
Date: Mon, 6 Aug 2012 09:01:57 +0900
Message-ID: <20120806000157.GA10971@bbox>
In-Reply-To: <20120802110641.GA16328@aftab.osrc.amd.com>

Hi Borislav,

On Thu, Aug 02, 2012 at 01:06:41PM +0200, Borislav Petkov wrote:
> On Thu, Aug 02, 2012 at 08:33:35AM +0900, Minchan Kim wrote:
> > Hello Borislav,
> > 
> > On Wed, Aug 01, 2012 at 07:38:37PM +0200, Borislav Petkov wrote:
> > > Hi,
> > > 
> > > I'm hitting the WARN_ON in $Subject with latest linus:
> > > v3.5-8833-g2d534926205d on a 4-node AMD system. As it looks from
> > > dmesg, it is happening on node 0, 1 and 2 but not on 3. Probably the
> > > pgdat->nr_zones thing but I'll have to add more dbg code to be sure.
> > 
> > From a quick look at the code, free_area_init_node initializes node_id and
> > node_start_pfn a second time. They were already initialized by setup_node_data.
> > 
> > Could you test the patch below? It's not entirely the right way to fix it, but
> > I want to confirm why it happens.
> > 
> > (I'm on vacation now, so please understand that it is hard to reach me.)
> 
> I sincerely hope you're not going to interrupt your vacation because of
> this.
> 
> :-).
> 
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 889532b..009ac28 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -4511,7 +4511,7 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
> >         pg_data_t *pgdat = NODE_DATA(nid);
> >  
> >         /* pg_data_t should be reset to zero when it's allocated */
> > -       WARN_ON(pgdat->nr_zones || pgdat->node_start_pfn || pgdat->classzone_idx);
> > +       WARN_ON(pgdat->nr_zones || pgdat->classzone_idx);
> >  
> >         pgdat->node_id = nid;
> >         pgdat->node_start_pfn = node_start_pfn;
> 
> Yep, you were right: ->node_start_pfn is set. I added additional debug
> output for more info:
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 889532b8e6c1..c249abe4fee2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4511,7 +4511,17 @@ void __paginginit free_area_init_node(int nid, unsigned long *zones_size,
>         pg_data_t *pgdat = NODE_DATA(nid);
>  
>         /* pg_data_t should be reset to zero when it's allocated */
> -       WARN_ON(pgdat->nr_zones || pgdat->node_start_pfn || pgdat->classzone_idx);
> +       WARN_ON(pgdat->nr_zones || pgdat->classzone_idx);
> +
> +       if (pgdat->node_start_pfn)
> +               pr_warn("%s: pgdat->node_start_pfn: %lu\n", __func__, pgdat->node_start_pfn);
> +
> +       if (pgdat->nr_zones)
> +               pr_warn("%s: pgdat->nr_zones: %d\n", __func__, pgdat->nr_zones);
> +
> +       if (pgdat->classzone_idx)
> +               pr_warn("%s: pgdat->classzone_idx: %d\n", __func__, pgdat->classzone_idx);
> +
>  
>         pgdat->node_id = nid;
>         pgdat->node_start_pfn = node_start_pfn;
> 
> 
> 
> Here's what it says:
> 
> [    0.000000] On node 0 totalpages: 4193848
> [    0.000000]   DMA zone: 64 pages used for memmap
> [    0.000000]   DMA zone: 6 pages reserved
> [    0.000000]   DMA zone: 3890 pages, LIFO batch:0
> [    0.000000]   DMA32 zone: 16320 pages used for memmap
> [    0.000000]   DMA32 zone: 798464 pages, LIFO batch:31
> [    0.000000]   Normal zone: 52736 pages used for memmap
> [    0.000000]   Normal zone: 3322368 pages, LIFO batch:31
> [    0.000000] free_area_init_node: pgdat->node_start_pfn: 4423680	<----
> [    0.000000] On node 1 totalpages: 4194304
> [    0.000000]   Normal zone: 65536 pages used for memmap
> [    0.000000]   Normal zone: 4128768 pages, LIFO batch:31
> [    0.000000] free_area_init_node: pgdat->node_start_pfn: 8617984	<----
> [    0.000000] On node 2 totalpages: 4194304
> [    0.000000]   Normal zone: 65536 pages used for memmap
> [    0.000000]   Normal zone: 4128768 pages, LIFO batch:31
> [    0.000000] free_area_init_node: pgdat->node_start_pfn: 12812288	<----
> [    0.000000] On node 3 totalpages: 4194304
> [    0.000000]   Normal zone: 65536 pages used for memmap
> [    0.000000]   Normal zone: 4128768 pages, LIFO batch:31
> [    0.000000] ACPI: PM-Timer IO Port: 0x2008
> [    0.000000] ACPI: Local APIC address 0xfee00000
> 
> Thanks.
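
The node_start_pfn values in the log above can be sanity-checked with a
little arithmetic. The sketch below is a userspace illustration only, not
kernel code: it assumes 4 KiB base pages (a page shift of 12, which the
thread does not state), and the sketch_* names are hypothetical.

```c
#include <stdint.h>

/* Assumption: 4 KiB base pages; the thread does not state the page size. */
#define SKETCH_PAGE_SHIFT 12

/* Convert a page frame number to the physical byte address where it starts. */
static uint64_t sketch_pfn_to_phys(uint64_t pfn)
{
	return pfn << SKETCH_PAGE_SHIFT;
}

/* Number of page frames between two consecutive start PFNs. */
static uint64_t sketch_pfn_span(uint64_t start_pfn, uint64_t next_start_pfn)
{
	return next_start_pfn - start_pfn;
}
```

With these helpers, the three logged values (4423680, 8617984, 12812288)
come out 4194304 page frames apart, which matches the totalpages figure
reported for nodes 1 to 3. Under the 4 KiB assumption that is 16 GiB per
step.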

Thanks for looking at this!

Now that I am back from vacation, I have looked at this bug carefully and I
think the patch I sent is good. The goal of the check is to detect a pgdat
structure that was not zeroed when it was allocated. It does that by testing
a few arbitrarily chosen members, but unfortunately pgdat members like
node_start_pfn are closely tied to arch boot code, so some of them can be set
by arch code before generic mm code is reached.
That was Tejun's concern, and he was correct.

I think nr_zones and classzone_idx should be initialized only by generic MM code
during the boot sequence (not by memory hotplug), so the patch should be okay.
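
To make the reasoning concrete, here is a minimal userspace sketch of the
two conditions. It is not the kernel code: struct sketch_pgdat is a
hypothetical stand-in holding only the three members tested by the WARN_ON,
and sketch_arch_setup() is a hypothetical model of arch boot code setting
node_start_pfn before generic mm code runs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three pg_data_t members discussed here. */
struct sketch_pgdat {
	int nr_zones;
	unsigned long node_start_pfn;
	int classzone_idx;
};

/* Original condition: also fires when node_start_pfn was set early. */
static bool sketch_check_original(const struct sketch_pgdat *p)
{
	return p->nr_zones || p->node_start_pfn || p->classzone_idx;
}

/* Relaxed condition from the patch: node_start_pfn is no longer tested. */
static bool sketch_check_relaxed(const struct sketch_pgdat *p)
{
	return p->nr_zones || p->classzone_idx;
}

/* Hypothetical model of arch boot code filling in the start PFN. */
static void sketch_arch_setup(struct sketch_pgdat *p, unsigned long start_pfn)
{
	p->node_start_pfn = start_pfn;
}
```

For a zeroed structure whose node_start_pfn was then set to a non-zero
value, the original condition is true and the relaxed one is false. A node
whose start PFN is 0 would not trip either condition, and a non-zero
nr_zones or classzone_idx still trips both.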

Linus already applied the patch in rc1, but it might need a better changelog.
I am not sure whom I should send this patch to, Linus or Andrew?
Anyway, please use the one below if it is really needed.

Thanks!
