All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>,
	linux-mm@kvack.org, Pingfan Liu <kernelfans@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	x86@kernel.org, Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Tony Luck <tony.luck@intel.com>,
	linuxppc-dev@lists.ozlabs.org, linux-ia64@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] x86, numa: always initialize all possible nodes
Date: Mon, 11 Feb 2019 13:49:09 +0000	[thread overview]
Message-ID: <20190211134909.GA107845@gmail.com> (raw)
In-Reply-To: <20190125105008.GJ3560@dhcp22.suse.cz>


* Michal Hocko <mhocko@kernel.org> wrote:

> On Thu 24-01-19 11:10:50, Dave Hansen wrote:
> > On 1/24/19 6:17 AM, Michal Hocko wrote:
> > > and nr_cpus set to 4. The underlying reason is tha the device is bound
> > > to node 2 which doesn't have any memory and init_cpu_to_node only
> > > initializes memory-less nodes for possible cpus which nr_cpus restrics.
> > > This in turn means that proper zonelists are not allocated and the page
> > > allocator blows up.
> > 
> > This looks OK to me.
> > 
> > Could we add a few DEBUG_VM checks that *look* for these invalid
> > zonelists?  Or, would our existing list debugging have caught this?
> 
> Currently we simply blow up because those zonelists are NULL. I do not
> think we have a way to check whether an existing zonelist is actually 
> _correct_ other thatn check it for NULL. But what would we do in the
> later case?
> 
> > Basically, is this bug also a sign that we need better debugging around
> > this?
> 
> My earlier patch had a debugging printk to display the zonelists and
> that might be worthwhile I guess. Basically something like this
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2e097f336126..c30d59f803fb 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5259,6 +5259,11 @@ static void build_zonelists(pg_data_t *pgdat)
>  
>  	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
>  	build_thisnode_zonelists(pgdat);
> +
> +	pr_info("node[%d] zonelist: ", pgdat->node_id);
> +	for_each_zone_zonelist(zone, z, &pgdat->node_zonelists[ZONELIST_FALLBACK], MAX_NR_ZONES-1)
> +		pr_cont("%d:%s ", zone_to_nid(zone), zone->name);
> +	pr_cont("\n");
>  }

Looks like this patch fell through the cracks - any update on this?

Thanks,

	Ingo

WARNING: multiple messages have this Message-ID (diff)
From: Ingo Molnar <mingo@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>,
	linux-ia64@vger.kernel.org, linux-mm@kvack.org,
	Peter Zijlstra <peterz@infradead.org>,
	x86@kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Pingfan Liu <kernelfans@gmail.com>,
	Dave Hansen <dave.hansen@intel.com>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [RFC PATCH] x86, numa: always initialize all possible nodes
Date: Mon, 11 Feb 2019 14:49:09 +0100	[thread overview]
Message-ID: <20190211134909.GA107845@gmail.com> (raw)
In-Reply-To: <20190125105008.GJ3560@dhcp22.suse.cz>


* Michal Hocko <mhocko@kernel.org> wrote:

> On Thu 24-01-19 11:10:50, Dave Hansen wrote:
> > On 1/24/19 6:17 AM, Michal Hocko wrote:
> > > and nr_cpus set to 4. The underlying reason is tha the device is bound
> > > to node 2 which doesn't have any memory and init_cpu_to_node only
> > > initializes memory-less nodes for possible cpus which nr_cpus restrics.
> > > This in turn means that proper zonelists are not allocated and the page
> > > allocator blows up.
> > 
> > This looks OK to me.
> > 
> > Could we add a few DEBUG_VM checks that *look* for these invalid
> > zonelists?  Or, would our existing list debugging have caught this?
> 
> Currently we simply blow up because those zonelists are NULL. I do not
> think we have a way to check whether an existing zonelist is actually 
> _correct_ other thatn check it for NULL. But what would we do in the
> later case?
> 
> > Basically, is this bug also a sign that we need better debugging around
> > this?
> 
> My earlier patch had a debugging printk to display the zonelists and
> that might be worthwhile I guess. Basically something like this
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2e097f336126..c30d59f803fb 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5259,6 +5259,11 @@ static void build_zonelists(pg_data_t *pgdat)
>  
>  	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
>  	build_thisnode_zonelists(pgdat);
> +
> +	pr_info("node[%d] zonelist: ", pgdat->node_id);
> +	for_each_zone_zonelist(zone, z, &pgdat->node_zonelists[ZONELIST_FALLBACK], MAX_NR_ZONES-1)
> +		pr_cont("%d:%s ", zone_to_nid(zone), zone->name);
> +	pr_cont("\n");
>  }

Looks like this patch fell through the cracks - any update on this?

Thanks,

	Ingo

WARNING: multiple messages have this Message-ID (diff)
From: Ingo Molnar <mingo@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>,
	linux-mm@kvack.org, Pingfan Liu <kernelfans@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	x86@kernel.org, Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Tony Luck <tony.luck@intel.com>,
	linuxppc-dev@lists.ozlabs.org, linux-ia64@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] x86, numa: always initialize all possible nodes
Date: Mon, 11 Feb 2019 14:49:09 +0100	[thread overview]
Message-ID: <20190211134909.GA107845@gmail.com> (raw)
In-Reply-To: <20190125105008.GJ3560@dhcp22.suse.cz>


* Michal Hocko <mhocko@kernel.org> wrote:

> On Thu 24-01-19 11:10:50, Dave Hansen wrote:
> > On 1/24/19 6:17 AM, Michal Hocko wrote:
> > > and nr_cpus set to 4. The underlying reason is tha the device is bound
> > > to node 2 which doesn't have any memory and init_cpu_to_node only
> > > initializes memory-less nodes for possible cpus which nr_cpus restrics.
> > > This in turn means that proper zonelists are not allocated and the page
> > > allocator blows up.
> > 
> > This looks OK to me.
> > 
> > Could we add a few DEBUG_VM checks that *look* for these invalid
> > zonelists?  Or, would our existing list debugging have caught this?
> 
> Currently we simply blow up because those zonelists are NULL. I do not
> think we have a way to check whether an existing zonelist is actually 
> _correct_ other thatn check it for NULL. But what would we do in the
> later case?
> 
> > Basically, is this bug also a sign that we need better debugging around
> > this?
> 
> My earlier patch had a debugging printk to display the zonelists and
> that might be worthwhile I guess. Basically something like this
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2e097f336126..c30d59f803fb 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5259,6 +5259,11 @@ static void build_zonelists(pg_data_t *pgdat)
>  
>  	build_zonelists_in_node_order(pgdat, node_order, nr_nodes);
>  	build_thisnode_zonelists(pgdat);
> +
> +	pr_info("node[%d] zonelist: ", pgdat->node_id);
> +	for_each_zone_zonelist(zone, z, &pgdat->node_zonelists[ZONELIST_FALLBACK], MAX_NR_ZONES-1)
> +		pr_cont("%d:%s ", zone_to_nid(zone), zone->name);
> +	pr_cont("\n");
>  }

Looks like this patch fell through the cracks - any update on this?

Thanks,

	Ingo


  reply	other threads:[~2019-02-11 13:49 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-14  8:24 [RFC PATCH] x86, numa: always initialize all possible nodes Michal Hocko
2019-01-14  8:24 ` Michal Hocko
2019-01-14  8:24 ` Michal Hocko
2019-01-14  8:24 ` Michal Hocko
2019-01-14 10:26 ` Michael Ellerman
2019-01-14 10:26   ` Michael Ellerman
2019-01-14 10:26   ` Michael Ellerman
2019-01-14 11:48   ` Michal Hocko
2019-01-14 11:48     ` Michal Hocko
2019-01-14 11:48     ` Michal Hocko
2019-01-15  5:32   ` Pingfan Liu
2019-01-15  5:32     ` Pingfan Liu
2019-01-15  5:32     ` Pingfan Liu
2019-01-24 14:17 ` Michal Hocko
2019-01-24 14:17   ` Michal Hocko
2019-01-24 14:17   ` Michal Hocko
2019-01-24 17:51   ` Mike Rapoport
2019-01-24 17:51     ` Mike Rapoport
2019-01-24 17:51     ` Mike Rapoport
2019-01-25 10:40     ` Michal Hocko
2019-01-25 10:40       ` Michal Hocko
2019-01-25 10:40       ` Michal Hocko
2019-01-25 19:25       ` Mike Rapoport
2019-01-25 19:25         ` Mike Rapoport
2019-01-25 19:25         ` Mike Rapoport
2019-01-24 19:10   ` Dave Hansen
2019-01-24 19:10     ` Dave Hansen
2019-01-24 19:10     ` Dave Hansen
2019-01-25 10:50     ` Michal Hocko
2019-01-25 10:50       ` Michal Hocko
2019-01-25 10:50       ` Michal Hocko
2019-02-11 13:49       ` Ingo Molnar [this message]
2019-02-11 13:49         ` Ingo Molnar
2019-02-11 13:49         ` Ingo Molnar
2019-02-11 14:52         ` Michal Hocko
2019-02-11 14:52           ` Michal Hocko
2019-02-11 14:52           ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190211134909.GA107845@gmail.com \
    --to=mingo@kernel.org \
    --cc=benh@kernel.crashing.org \
    --cc=dave.hansen@intel.com \
    --cc=kernelfans@gmail.com \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mhocko@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=peterz@infradead.org \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.