All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andi Kleen <ak@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Yinghai Lu" <yhlu.kernel@gmail.com>,
	mingo@elte.hu, clameter@sgi.com, linux-kernel@vger.kernel.org,
	Yasunori Goto <y-goto@jp.fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [PATCH] mm: fix boundary checking in free_bootmem_core
Date: Thu, 13 Mar 2008 22:59:46 +0100	[thread overview]
Message-ID: <200803132259.47063.ak@suse.de> (raw)
In-Reply-To: <20080312182240.db32c858.akpm@linux-foundation.org>

On Thursday 13 March 2008 02:22:40 Andrew Morton wrote:
> On Wed, 12 Mar 2008 18:11:41 -0700 "Yinghai Lu" <yhlu.kernel@gmail.com> wrote:
> 
> > >  <looks at it>
> > >
> > >  Sorry, but I find the changelog very hard to amke sense of.  I presently
> > >  have:
> > >
> > >
> > >   So call it when numa is enabled, we don't know which node have that
> > >   range.  and make it more robust.
> > >
> > >   Try to trim it to get valid sidx, and eidx.
> > >
> > >  Could you please expand on this?
> > 
> > please check following...
> > 
> 
> Heaps better, thanks ;)  Below is what I now have.
> 
> (cc's people)
> 
> Guys, could you please review this?  Maybe test it a bit?
> 
> Thanks.
> 
> 
> From: "Yinghai Lu" <yhlu.kernel@gmail.com>
> 
> With numa enabled, some callers could have a range o fmemory on one node but
> try to free that on other node.  This can cause some pages to be freed
> wrongly.

Concrete examples?

If that happens it's really just a problem that the bootmem API
is wrong. I was always annoyed by the hardcoded NODE_DATA(0)s in
free_bootmem. 

I would suggest if that happens you just fix free_bootmem to search
for the correct node instead of hardcoding 0 and then eliminate
free_bootmem_node() everywhere and replace it with free_bootmem()
> 
> For example: when we try to allocate 128g boot ram early for gart/swiotlb, and
> free that range later so gart/swiotlb can get some range afterwards.

I'm confused by the example. AFAIK there is no memory freeing in either
gart nor swiotlb. At least there wasn't until very recently.

> 
> With this patch, we don't need to care which node holds the range, just loop
> to call free_bootmem_node for all online nodes.
> 
> This patch make free_bootmem_core() more robust by trimming the sidx and eidx
> according the ram range that the node has.

I think you should just kill free_bootmem_node() and replace it everywhere
with your improved free_bootmem()


> @@ -125,6 +125,7 @@ static int __init reserve_bootmem_core(b
>  	BUG_ON(!size);
>  	BUG_ON(PFN_DOWN(addr) >= bdata->node_low_pfn);
>  	BUG_ON(PFN_UP(addr + size) > bdata->node_low_pfn);
> +	BUG_ON(addr < bdata->node_boot_start);

That seems unrelated?

>  
>  	sidx = PFN_DOWN(addr - bdata->node_boot_start);
>  	eidx = PFN_UP(addr + size - bdata->node_boot_start);
> @@ -156,21 +157,31 @@ static void __init free_bootmem_core(boo
>  	unsigned long sidx, eidx;
>  	unsigned long i;
>  
> +	BUG_ON(!size);
> +
> +	/* out range */
> +	if (addr + size < bdata->node_boot_start ||
> +		PFN_DOWN(addr) > bdata->node_low_pfn)
> +		return;

I don't really like this silent return without error value.
There should be a BUG() or something for someone passing addresses
outside any node. This check should be probably in the caller.

>  	/*
>  	 * round down end of usable mem, partially free pages are
>  	 * considered reserved.
>  	 */
> -	BUG_ON(!size);
> -	BUG_ON(PFN_DOWN(addr + size) > bdata->node_low_pfn);
>  
> -	if (addr < bdata->last_success)
> +	if (addr >= bdata->node_boot_start && addr < bdata->last_success)
>  		bdata->last_success = addr;
>  
>  	/*
> -	 * Round up the beginning of the address.
> +	 * Round up to index to the range.
>  	 */
> -	sidx = PFN_UP(addr) - PFN_DOWN(bdata->node_boot_start);
> +	if (PFN_UP(addr) > PFN_DOWN(bdata->node_boot_start))
> +		sidx = PFN_UP(addr) - PFN_DOWN(bdata->node_boot_start);
> +	else
> +		sidx = 0;
> +
>  	eidx = PFN_DOWN(addr + size - bdata->node_boot_start);
> +	if (eidx > bdata->node_low_pfn - PFN_DOWN(bdata->node_boot_start))
> +		eidx = bdata->node_low_pfn - PFN_DOWN(bdata->node_boot_start);

I'm not sure for what these other changes are needed?  Just adding the
initial range check should be enough.

If you want to fix something else unrelated please do separate patches.

-Andi


  reply	other threads:[~2008-03-13 22:00 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-03-12  1:01 [PATCH] mm: fix boundary checking in free_bootmem_core Yinghai Lu
2008-03-12 23:21 ` Yinghai Lu
2008-03-12 23:33   ` Andrew Morton
2008-03-13  1:11     ` Yinghai Lu
2008-03-13  1:22       ` Andrew Morton
2008-03-13 21:59         ` Andi Kleen [this message]
2008-03-13 22:22           ` Yinghai Lu
2008-03-14 11:58             ` Andi Kleen
2008-03-14 16:44               ` Yinghai Lu
2008-03-14 16:53                 ` Andi Kleen
2008-03-14 17:36                   ` Yinghai Lu
2008-03-21 19:44                     ` Andrew Morton
2008-03-21 20:00                       ` Ingo Molnar
2008-03-21 21:54                       ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200803132259.47063.ak@suse.de \
    --to=ak@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=clameter@sgi.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=y-goto@jp.fujitsu.com \
    --cc=yhlu.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.