All of lore.kernel.org
 help / color / mirror / Atom feed
From: Russ Anderson <rja@sgi.com>
To: Rafael Aquini <aquini@linux.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Russ Anderson <rja@sgi.com>
Subject: Re: [PATCH] [BUGFIX] mm: hugepages can cause negative commitlimit
Date: Thu, 19 May 2011 17:11:01 -0500	[thread overview]
Message-ID: <20110519221101.GC19648@sgi.com> (raw)
In-Reply-To: <BANLkTinyYP-je9Nf8X-xWEdpgvn8a631Mw@mail.gmail.com>

On Thu, May 19, 2011 at 10:37:13AM -0300, Rafael Aquini wrote:
> On Thu, May 19, 2011 at 1:56 AM, Russ Anderson <rja@sgi.com> wrote:
> >
> > The way it was verified was putting a printk in to print totalram_pages
> > and hugetlb_total_pages.  First the system was booted without any huge
> > pages.  The next boot one huge page was allocated.  The next boot more
> > hugepages allocated.  Each time totalram_pages was reduced by the nuber
> > of huge pages allocated, with totalram_pages + hugetlb_total_pages
> > equaling the original number of pages.
> >
> > That behavior is also consistent with allocating over half of memory
> > resulting in CommitLimit going negative (as is shown in the above
> > output).
> >
> > Here is some data.  Each represents a boot using 1G hugepages.
> >   0 hugepages : totalram_pages 16519867 hugetlb_total_pages       0
> >   1 hugepages : totalram_pages 16257723 hugetlb_total_pages  262144
> >   2 hugepages : totalram_pages 15995578 hugetlb_total_pages  524288
> >  31 hugepages : totalram_pages  8393403 hugetlb_total_pages 8126464
> >  32 hugepages : totalram_pages  8131258 hugetlb_total_pages 8388608
> >
> >
> > > hugepages are reserved, hugetlb_total_pages() has to be accounted and
> > > subtracted from totalram_pages in order to render an accurate number of
> > > remaining pages available to the general memory workload commitment.
> > >
> > > I've tried to reproduce your findings on my boxes,  without
> > > success, unfortunately.
> >
> > Put a printk in meminfo_proc_show() to print totalram_pages and
> > hugetlb_total_pages().  Add "default_hugepagesz=1G hugepagesz=1G
> > hugepages=64"
> > to the boot line (varying the number of hugepages).
> >
> > > I'll keep chasing to hit this behaviour, though.
> 
> I got what I was doing different, and you are partially right.
> Checking mm/hugetlb.c:
> 1811 static int __init hugetlb_nrpages_setup(char *s)
> 1812 {
> ....
> 1834         /*
> 1835          * Global state is always initialized later in hugetlb_init.
> 1836          * But we need to allocate >= MAX_ORDER hstates here early to
> still
> 1837          * use the bootmem allocator.
> 1838          */
> 1839         if (max_hstate && parsed_hstate->order >= MAX_ORDER)
> 1840                 hugetlb_hstate_alloc_pages(parsed_hstate);
> 1841
> 1842         last_mhp = mhp;
> 1843
> 1844         return 1;
> 1845 }
> 1846 __setup("hugepages=", hugetlb_nrpages_setup);
> 
> I realize this issue you've reported only happens when you're using
> oversized hugepages. As their order are always >= MAX_ORDER, they got pages
> early allocated from bootmem allocator. So, these pages are not accounted
> for totalram_pages.
> 
> Although your patch covers a fix for the proposed case, it only works for
> scenarios where oversized hugepages are allocated on boot. I think it will,
> unfortunately, cause a bug for the remaining scenarios.

OK, I see your point.  The root problem is hugepages allocated at boot are
subtracted from totalram_pages but hugepages allocated at run time are not.
Correct me if I've mistate it or are other conditions.

By "allocated at run time" I mean "echo 1 > /proc/sys/vm/nr_hugepages".
That allocation will not change totalram_pages but will change
hugetlb_total_pages().

How best to fix this inconsistency?  Should totalram_pages include or exclude
hugepages?  What are the implications?

I have no strong preference as to which way to go as long as it is consistent.

> Cheers!
> --aquini

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc          rja@sgi.com

WARNING: multiple messages have this Message-ID (diff)
From: Russ Anderson <rja@sgi.com>
To: Rafael Aquini <aquini@linux.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	linux-mm <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Christoph Lameter <cl@linux.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Russ Anderson <rja@sgi.com>
Subject: Re: [PATCH] [BUGFIX] mm: hugepages can cause negative commitlimit
Date: Thu, 19 May 2011 17:11:01 -0500	[thread overview]
Message-ID: <20110519221101.GC19648@sgi.com> (raw)
In-Reply-To: <BANLkTinyYP-je9Nf8X-xWEdpgvn8a631Mw@mail.gmail.com>

On Thu, May 19, 2011 at 10:37:13AM -0300, Rafael Aquini wrote:
> On Thu, May 19, 2011 at 1:56 AM, Russ Anderson <rja@sgi.com> wrote:
> >
> > The way it was verified was putting a printk in to print totalram_pages
> > and hugetlb_total_pages.  First the system was booted without any huge
> > pages.  The next boot one huge page was allocated.  The next boot more
> > hugepages allocated.  Each time totalram_pages was reduced by the nuber
> > of huge pages allocated, with totalram_pages + hugetlb_total_pages
> > equaling the original number of pages.
> >
> > That behavior is also consistent with allocating over half of memory
> > resulting in CommitLimit going negative (as is shown in the above
> > output).
> >
> > Here is some data.  Each represents a boot using 1G hugepages.
> >   0 hugepages : totalram_pages 16519867 hugetlb_total_pages       0
> >   1 hugepages : totalram_pages 16257723 hugetlb_total_pages  262144
> >   2 hugepages : totalram_pages 15995578 hugetlb_total_pages  524288
> >  31 hugepages : totalram_pages  8393403 hugetlb_total_pages 8126464
> >  32 hugepages : totalram_pages  8131258 hugetlb_total_pages 8388608
> >
> >
> > > hugepages are reserved, hugetlb_total_pages() has to be accounted and
> > > subtracted from totalram_pages in order to render an accurate number of
> > > remaining pages available to the general memory workload commitment.
> > >
> > > I've tried to reproduce your findings on my boxes,  without
> > > success, unfortunately.
> >
> > Put a printk in meminfo_proc_show() to print totalram_pages and
> > hugetlb_total_pages().  Add "default_hugepagesz=1G hugepagesz=1G
> > hugepages=64"
> > to the boot line (varying the number of hugepages).
> >
> > > I'll keep chasing to hit this behaviour, though.
> 
> I got what I was doing different, and you are partially right.
> Checking mm/hugetlb.c:
> 1811 static int __init hugetlb_nrpages_setup(char *s)
> 1812 {
> ....
> 1834         /*
> 1835          * Global state is always initialized later in hugetlb_init.
> 1836          * But we need to allocate >= MAX_ORDER hstates here early to
> still
> 1837          * use the bootmem allocator.
> 1838          */
> 1839         if (max_hstate && parsed_hstate->order >= MAX_ORDER)
> 1840                 hugetlb_hstate_alloc_pages(parsed_hstate);
> 1841
> 1842         last_mhp = mhp;
> 1843
> 1844         return 1;
> 1845 }
> 1846 __setup("hugepages=", hugetlb_nrpages_setup);
> 
> I realize this issue you've reported only happens when you're using
> oversized hugepages. As their order are always >= MAX_ORDER, they got pages
> early allocated from bootmem allocator. So, these pages are not accounted
> for totalram_pages.
> 
> Although your patch covers a fix for the proposed case, it only works for
> scenarios where oversized hugepages are allocated on boot. I think it will,
> unfortunately, cause a bug for the remaining scenarios.

OK, I see your point.  The root problem is hugepages allocated at boot are
subtracted from totalram_pages but hugepages allocated at run time are not.
Correct me if I've mistate it or are other conditions.

By "allocated at run time" I mean "echo 1 > /proc/sys/vm/nr_hugepages".
That allocation will not change totalram_pages but will change
hugetlb_total_pages().

How best to fix this inconsistency?  Should totalram_pages include or exclude
hugepages?  What are the implications?

I have no strong preference as to which way to go as long as it is consistent.

> Cheers!
> --aquini

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc          rja@sgi.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2011-05-19 22:11 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-18 15:34 [PATCH] [BUGFIX] mm: hugepages can cause negative commitlimit Russ Anderson
2011-05-18 15:34 ` Russ Anderson
2011-05-19  0:51 ` Rafael Aquini
2011-05-19  4:56   ` Russ Anderson
2011-05-19  4:56     ` Russ Anderson
2011-05-19 13:37     ` Rafael Aquini
2011-05-19 22:11       ` Russ Anderson [this message]
2011-05-19 22:11         ` Russ Anderson
2011-05-20 20:04         ` Andrew Morton
2011-05-20 20:04           ` Andrew Morton
2011-05-20 22:30           ` Rafael Aquini
2011-05-20 22:30             ` Rafael Aquini
2011-05-26 21:07             ` Rafael Aquini
2011-05-26 21:07               ` Rafael Aquini
2011-05-27 22:22               ` Russ Anderson
2011-05-27 22:22                 ` Russ Anderson
2011-06-02  4:08               ` Russ Anderson
2011-06-02  4:08                 ` Russ Anderson
2011-06-03  2:55                 ` [PATCH] mm: fix negative commitlimit when gigantic hugepages are allocated Rafael Aquini
2011-06-03  2:55                   ` Rafael Aquini
2011-06-03 12:07                   ` Russ Anderson
2011-06-03 12:07                     ` Russ Anderson
2011-06-09 23:44                   ` Andrew Morton
2011-06-09 23:44                     ` Andrew Morton
2011-06-13 21:11                     ` Rafael Aquini
2011-06-13 21:11                       ` Rafael Aquini
2011-06-13 21:31                       ` Andrew Morton
2011-06-13 21:31                         ` Andrew Morton
2011-06-03  3:08                 ` [PATCH] [BUGFIX] mm: hugepages can cause negative commitlimit Rafael Aquini
2011-06-03  3:08                   ` Rafael Aquini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110519221101.GC19648@sgi.com \
    --to=rja@sgi.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=aquini@linux.com \
    --cc=cl@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.