public inbox for linux-kernel@vger.kernel.org
* How movable is zone movable?
@ 2009-04-12 15:29 Ed Tomlinson
  2009-04-13  1:04 ` KAMEZAWA Hiroyuki
  2009-04-13  9:59 ` Mel Gorman
  0 siblings, 2 replies; 6+ messages in thread
From: Ed Tomlinson @ 2009-04-12 15:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: mel, Nick Piggin

Hi,

How dependable should zone movable be?  After a boot, kvm is able to get enough hugepages to
back the session.  After a day or two it becomes a lot less predictable.  Sometimes it will swap out for 30 seconds
and then succeed; other times it will fail.  Interestingly, it sometimes works if I cancel the kvm session after
it tells me it cannot allocate the hugepages and immediately restart.  Is there some way to determine what
is not respecting zone moveable?  Or is zone moveable just a suggestion and not expected to really be
moveable?

I have the following set in sysctl.conf

# huge_pages with movablecore set to 3G
kernel.shmmax = 8589934592
vm.nr_hugepages = 128
vm.nr_overcommit_hugepages = 1408
vm.hugepages_treat_as_movable = 1
vm.hugetlb_shm_group = 1005

This is with any recent kernel release (2.6.28 and later).

Thanks,
Ed Tomlinson


* Re: How movable is zone movable?
  2009-04-12 15:29 How movable is zone movable? Ed Tomlinson
@ 2009-04-13  1:04 ` KAMEZAWA Hiroyuki
  2009-04-13  9:59 ` Mel Gorman
  1 sibling, 0 replies; 6+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-13  1:04 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-kernel, mel, Nick Piggin

On Sun, 12 Apr 2009 11:29:09 -0400
Ed Tomlinson <edt@aei.ca> wrote:

> Hi,
> 
> How dependable should zone movable be?  After a boot, kvm is able to get enough hugepages to
> back the session.  After a day or two it becomes a lot less predictable.  Sometimes it will swap out for 30 seconds
> and then succeed; other times it will fail.  Interestingly, it sometimes works if I cancel the kvm session after
> it tells me it cannot allocate the hugepages and immediately restart.  Is there some way to determine what
> is not respecting zone moveable?  Or is zone moveable just a suggestion and not expected to really be
> moveable?
> 
> I have the following set in sysctl.conf
> 
> # huge_pages with movablecore set to 3G
> kernel.shmmax = 8589934592
> vm.nr_hugepages = 128
> vm.nr_overcommit_hugepages = 1408
> vm.hugepages_treat_as_movable = 1
> vm.hugetlb_shm_group = 1005
> 
> This is with any recent kernel release (2.6.28 and later).
> 

First, "Movable" means the zone only includes anon/file-cache pages; they are migratable by page
migration (memory hot-remove) and are considered easy to free.
Unfortunately, "move/migrate memory" for memory reclaim is not implemented yet. So, when allocating
hugepages, all the necessary memory has to be freed (swapped out).

Please look at /proc/meminfo before trying to allocate hugepages.
% cat /proc/meminfo
ACTIVE+INACTIVE is then the current usage of "Movable" pages. (AnonPages are the pages that need swap to be freed.)
(Or please see /proc/zoneinfo.)

If ACTIVE+INACTIVE is close to 3G on your system, some amount of memory will be swapped out
at huge page allocation.
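
As a rough illustration of the check described above, the sketch below sums
the Active and Inactive lines from /proc/meminfo-style text. The function
names and the sample figures are made up for the example. Note that
/proc/meminfo is system-wide, so the sum is only an approximation of what
sits in the Movable zone; /proc/zoneinfo has the per-zone counters.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def active_plus_inactive_kb(text):
    """Return Active + Inactive in kB (missing fields count as 0)."""
    fields = parse_meminfo(text)
    return fields.get("Active", 0) + fields.get("Inactive", 0)


# Hypothetical sample, NOT taken from the system discussed in this thread.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
Active:          1200000 kB
Inactive:        1500000 kB
AnonPages:        450000 kB
"""

print(active_plus_inactive_kb(SAMPLE_MEMINFO), "kB")
```

On a live system the same function can be fed the contents of
/proc/meminfo instead of the sample string.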

One tricky way to get a big unused chunk of memory is memory offline->online, which uses
page migration. (But I don't think it has been widely tested.)


Thanks,
-Kame



* Re: How movable is zone movable?
  2009-04-12 15:29 How movable is zone movable? Ed Tomlinson
  2009-04-13  1:04 ` KAMEZAWA Hiroyuki
@ 2009-04-13  9:59 ` Mel Gorman
  2009-04-13 10:04   ` Mel Gorman
  1 sibling, 1 reply; 6+ messages in thread
From: Mel Gorman @ 2009-04-13  9:59 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-kernel, Nick Piggin

On Sun, Apr 12, 2009 at 11:29:09AM -0400, Ed Tomlinson wrote:
> Hi,
> 
> How dependable should zone movable be? 

It should be quite dependable for resizing the static hugepage pool,
particularly if you are willing to wait. However, with the latest
libhugetlbfs, the hugeadm utility still has a --hard option in case you
need to wait a long time for the resize to occur.

> After a boot kvm is able to get enough hugepages to 
> back the session.  After a day or two it becomes a lot less predictable.  Sometimes it will swap out for 30 seconds
> and then succeed; other times it will fail. 

Unfortunately, that sounds about right. If there is significant
fragmentation, it takes time to find a contiguous area that can be
reclaimed. mlock(), if it's a factor, will complicate things further.

> Interestingly, it sometimes works if I cancel the kvm session after
> it tells me it cannot allocate the hugepages and immediately restart.  Is there some way to determine what
> is not respecting zone moveable? 

Ok, that is odd. Are base pages being mlocked()? If they are, they could be
the problem as the patches to move those pages around were never merged.

> Or is zone moveable just a suggestion and not expected to really be 
> moveable?
> 

They are expected to be movable, but pages that get locked do not get
moved at the moment.

> I have the following set in sysctl.conf
> 
> # huge_pages with movablecore set to 3G
> kernel.shmmax = 8589934592
> vm.nr_hugepages = 128
> vm.nr_overcommit_hugepages = 1408
> vm.hugepages_treat_as_movable = 1
> vm.hugetlb_shm_group = 1005

How big did you make the movable zone and what is the hugepage size? I'm
guessing the hugepage size is 2MB but no harm in being sure. If that's
the case, the movable zone should be at least 2816MB to maximise success
rates when resizing.
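
The 2816MB figure appears to follow directly from the sysctl values quoted
above: 1408 overcommit hugepages at 2MB each. A minimal sketch of that
arithmetic, assuming the 2MB hugepage size guessed at here (the helper name
is only for illustration):

```python
HUGEPAGE_MB = 2  # assumption: 2MB hugepages, not confirmed in the thread


def pool_size_mb(pages, hugepage_mb=HUGEPAGE_MB):
    """Memory needed to back `pages` hugepages, in MB."""
    return pages * hugepage_mb


# Values from the sysctl.conf quoted above.
nr_hugepages = 128
nr_overcommit_hugepages = 1408

static_mb = pool_size_mb(nr_hugepages)
overcommit_mb = pool_size_mb(nr_overcommit_hugepages)

print("static pool:     %d MB" % static_mb)      # 256 MB
print("overcommit pool: %d MB" % overcommit_mb)  # 2816 MB
print("both together:   %d MB" % (static_mb + overcommit_mb))  # 3072 MB
```

Under the same assumption, the static and overcommit pools together come to
3072MB, which matches the 3G given to movablecore.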

> 
> This is with any recent kernel release (2.6.28 and later).
> 

That's pretty recent. /proc/vmstat will also tell you how many
allocations are actually failing.
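
For reference, a small sketch of pulling the relevant counters out of
/proc/vmstat-style text. It looks for the htlb_buddy_alloc_success and
htlb_buddy_alloc_fail counters; the function name and the sample numbers are
invented for the example and are not from the system in this thread.

```python
def hugepage_alloc_counters(vmstat_text):
    """Return the htlb_buddy_alloc_* counters found in vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each /proc/vmstat line is "<name> <value>".
        if len(parts) == 2 and parts[0].startswith("htlb_buddy_alloc_"):
            counters[parts[0]] = int(parts[1])
    return counters


# Hypothetical sample; on a live system read /proc/vmstat instead.
SAMPLE_VMSTAT = """\
nr_free_pages 35321
htlb_buddy_alloc_success 1290
htlb_buddy_alloc_fail 118
"""

for name, value in sorted(hugepage_alloc_counters(SAMPLE_VMSTAT).items()):
    print(name, value)
```

Comparing the two counters before and after a failed kvm start should show
whether the failures line up with hugepage allocations from the buddy
allocator.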

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: How movable is zone movable?
  2009-04-13  9:59 ` Mel Gorman
@ 2009-04-13 10:04   ` Mel Gorman
  2009-04-13 12:10     ` Ed Tomlinson
  0 siblings, 1 reply; 6+ messages in thread
From: Mel Gorman @ 2009-04-13 10:04 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-kernel, Nick Piggin

On Mon, Apr 13, 2009 at 10:59:25AM +0100, Mel Gorman wrote:
> > # huge_pages with movablecore set to 3G
> 
> How big did you make the movable zone and what is the hugepage size?

Scratch that question, I missed you said it was 3G. Are pages being
mlocked()?

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


* Re: How movable is zone movable?
  2009-04-13 10:04   ` Mel Gorman
@ 2009-04-13 12:10     ` Ed Tomlinson
  2009-04-14  9:18       ` Mel Gorman
  0 siblings, 1 reply; 6+ messages in thread
From: Ed Tomlinson @ 2009-04-13 12:10 UTC (permalink / raw)
  To: Mel Gorman; +Cc: linux-kernel, Nick Piggin, KAMEZAWA Hiroyuki

On Monday 13 April 2009 06:04:40 you wrote:
> On Mon, Apr 13, 2009 at 10:59:25AM +0100, Mel Gorman wrote:
> > > # huge_pages with movablecore set to 3G
> > 
> > How big did you make the movable zone and what is the hugepage size?
> 
> Scratch that question, I missed you said it was 3G. Are pages being
> mlocked()?

Looks like there are mlocked pages.  What stopped the patches to allow these pages
to be migrated?

TIA
Ed

(from /proc/zoneinfo)
Node 0, zone  Movable
  pages free     35321
        min      858
        low      1072
        high     1287
        scanned  0 (aa: 2 ia: 0 af: 0 if: 0)
        spanned  786432
        present  775680
    nr_free_pages 35321
    nr_inactive_anon 29383
    nr_active_anon 82452
    nr_inactive_file 376826
    nr_active_file 219586
    nr_unevictable 816
    nr_mlock     816
    nr_anon_pages 110648
    nr_mapped    28364
    nr_file_pages 598492
    nr_dirty     131
    nr_writeback 0
    nr_slab_reclaimable 0
    nr_slab_unreclaimable 0
    nr_page_table_pages 0
    nr_unstable  0
    nr_bounce    0
    nr_vmscan_write 95
    nr_writeback_temp 0
        protection: (0, 0, 0, 0)
  pagesets
    cpu: 0
              count: 7
              high:  186
              batch: 31
  vm stats threshold: 24
    cpu: 1
              count: 7
              high:  186
              batch: 31
  vm stats threshold: 24
    cpu: 2
              count: 14
              high:  186
              batch: 31
  vm stats threshold: 24
  all_unreclaimable: 0
  prev_priority:     12
  start_pfn:         1441792
  inactive_ratio:    4
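
To put the nr_mlock figure above in perspective, here is a rough
back-of-the-envelope sketch. It assumes 4KB base pages and 2MB hugepages
(neither is stated in the dump) and that a single locked page is enough to
stop its aligned hugepage-sized block from being freed. The result is a
worst-case bound, not a measurement; the names are made up for the example.

```python
# Figures copied from the zoneinfo dump above.
PRESENT_PAGES = 775680
NR_MLOCK = 816

# Assumptions, not stated in the dump: 4KB base pages and 2MB hugepages.
BASE_PAGE_KB = 4
HUGEPAGE_MB = 2
PAGES_PER_HUGEPAGE = HUGEPAGE_MB * 1024 // BASE_PAGE_KB


def pages_to_mb(pages):
    """Convert a count of base pages to MB."""
    return pages * BASE_PAGE_KB / 1024


def worst_case_blocked_hugepages(locked_pages, present_pages):
    """Upper bound: every locked page sits in a different hugepage-sized block."""
    total_blocks = present_pages // PAGES_PER_HUGEPAGE
    return min(locked_pages, total_blocks)


blocked = worst_case_blocked_hugepages(NR_MLOCK, PRESENT_PAGES)
print("mlocked memory: %.2f MB" % pages_to_mb(NR_MLOCK))
print("hugepage-sized blocks in zone: %d" % (PRESENT_PAGES // PAGES_PER_HUGEPAGE))
print("worst case blocked: %d blocks (%d MB)" % (blocked, blocked * HUGEPAGE_MB))
```

So although only a few MB are locked, in the worst case the locked pages
could be spread over a large share of the zone's hugepage-sized blocks.
Where they actually sit cannot be told from zoneinfo alone.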


* Re: How movable is zone movable?
  2009-04-13 12:10     ` Ed Tomlinson
@ 2009-04-14  9:18       ` Mel Gorman
  0 siblings, 0 replies; 6+ messages in thread
From: Mel Gorman @ 2009-04-14  9:18 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: linux-kernel, Nick Piggin, KAMEZAWA Hiroyuki

On Mon, Apr 13, 2009 at 08:10:58AM -0400, Ed Tomlinson wrote:
> On Monday 13 April 2009 06:04:40 you wrote:
> > On Mon, Apr 13, 2009 at 10:59:25AM +0100, Mel Gorman wrote:
> > > > # huge_pages with movablecore set to 3G
> > > 
> > > How big did you make the movable zone and what is the hugepage size?
> > 
> > Scratch that question, I missed you said it was 3G. Are pages being
> > mlocked()?
> 
> Looks like there are mlocked pages.  What stopped the patches to allow these pages
> to be migrated?
> 

My recollection is fuzzy but it was mainly down to three points.

1. I could occasionally lock up the system if under enough load using
   page migration. I didn't have a reliable reproduction case to pin down
   whether it was something in page migration or something wrong with the
   way I was using it. There have been fixes in page migration since though
   so it's possible that got fixed along the way.

2. At the time, memory partitioning and anti-fragmentation were not long in
   mainline and dynamic hugepage pool resizing was very new. It was not clear
   there were going to be users of dynamic hugepage pool resizing that would
   need memory compaction as well. I was reasonably confident but that's
   not users.

3. There were significant problems in hugetlbfs that needed fixing up
   such as the reliability of MAP_PRIVATE. It was more important to chase
   down the stuff people certainly needed than complete features that they
   might need.

The patches weren't downright rejected as such but I didn't push hard as there
were almost zero users of dynamic hugepage pool resizing to be complaining
about mlock(). Improving hugetlbfs was more important and of obvious benefit
and I reckoned I'd wait and see who complained about mlock.

This has changed now. hugetlbfs is in a way better state than it was and it
sounds like KVM is a user that is both interested in using dynamic pool
resizing and has mlocked pages. How much demand is there, do you think?

There is someone currently working on helping the pool resizing from userspace
by temporarily adding a swap device during pool resize that will take a
while to complete. The plan was that they would revisit memory compaction
based on the prototype patches I did ages ago later in the year but maybe
that can be pushed along a bit harder if there was enough interest.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

