Re: [BUG] mlocked page counter mismatch

All of lore.kernel.org
 help / color / mirror / Atom feed

From: MinChan Kim <minchan.kim@gmail.com>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: linux mm <linux-mm@kvack.org>,
	linux kernel <linux-kernel@vger.kernel.org>,
	Nick Piggin <npiggin@suse.de>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [BUG] mlocked page counter mismatch
Date: Thu, 29 Jan 2009 08:55:14 +0900	[thread overview]
Message-ID: <20090128235514.GB24924@barrios-desktop> (raw)
In-Reply-To: <1233156832.8760.85.camel@lts-notebook>

On Wed, Jan 28, 2009 at 10:33:52AM -0500, Lee Schermerhorn wrote:
> On Wed, 2009-01-28 at 19:28 +0900, MinChan Kim wrote:
> > After executing following program, 'cat /proc/meminfo' shows
> > following result. 
> > 
> > --
> > # cat /proc/meminfo 
> > ..
> > Unevictable:           8 kB
> > Mlocked:               8 kB
> > ..
> 
> Sorry, from your description, I can't understand what the problem is.
> Are you saying that 8kB [2 pages] remains locked after you run your
> test?

Yes. 
Sorry. My explanation was not enought. 

> 
> What did meminfo show before running the test program?  And what kernel
> version?

The meminfo showed mlocked, unevictable pages was zero. 
My kernel version is 2.6.29-rc2. 

> 
> > 
> > -- test program --
> > 
> > #include <stdio.h>
> > #include <sys/mman.h>
> > int main()
> > {
> >         char buf[64] = {0,};
> >         char *ptr;
> >         int k;
> >         int i,j;
> >         int x,y;
> >         mlockall(MCL_CURRENT);
> >         sprintf(buf, "cat /proc/%d/maps", getpid());
> >         system(buf);
> >         return 0;
> > }
> > 
> > --
> > 
> > It seems mlocked page counter have a problem.
> > After I diged in source, I found that try_to_unmap_file called 
> > try_to_mlock_page about shared mapping pages 
> > since other vma had VM_LOCKED flag.
> > After all, try_to_mlock_page called mlock_vma_page. 
> > so, mlocked counter increased
> 
> This path of try_to_unmap_file() -> try_to_mlock_page() should only be
> invoked during reclaim--from shrink_page_list().  [try_to_unmap() is
> also called from page migration, but in this case, try_to_unmap_one()
> won't return SWAP_MLOCK so we don't call try_to_mlock_page().]  Unless
> your system is in continuous reclaim, I don't think you'd hit this
> during your test program.

My system was not reclaim mode. It could be called following path. 
exit_mmap -> munlock_vma_pages_all->munlock_vma_page->try_to_munlock->
try_to_unmap_file->try_to_mlock_page

> 
> > 
> > But, After I called munlockall intentionally, the counter work well. 
> > In case of munlockall, we already had a mmap_sem about write. 
> > Such a case, try_to_mlock_page can't call mlock_vma_page.
> > so, mlocked counter didn't increased. 
> > As a result, the counter seems to be work well but I think 
> > it also have a problem.
> 
> I THINK this is a artifact of the way stats are accumulated in per cpu
> differential counters and pushed to the zone state accumulators when a
> threshold is reached.  I've seen this condition before, but it
> eventually clears itself as the stats get pushed to the zone state.
> Still, it bears more investigation, as it's been a while since I worked
> on this and some subsequent fixes could have broken it:

Hmm... My test result is as follow. 

1) without munlockall
before:

root@barrios-target-linux:~# tail -8 /proc/vmstat 
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           0 kB
Mlocked:               0 kB

after:
root@barrios-target-linux:~# tail -8 /proc/vmstat 
unevictable_pgs_culled 369
unevictable_pgs_scanned 0
unevictable_pgs_rescued 388
unevictable_pgs_mlocked 392
unevictable_pgs_munlocked 387
unevictable_pgs_cleared 1
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           8 kB
Mlocked:               8 kB

after dropping cache

root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           4 kB
Mlocked:               4 kB


2) with munlockall 

barrios-target@barrios-target-linux:~$ tail -8 /proc/vmstat
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

barrios-target@barrios-target-linux:~$ cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           0 kB
Mlocked:               0 kB

after


root@barrios-target-linux:~# tail -8 /proc/vmstat
unevictable_pgs_culled 369
unevictable_pgs_scanned 0
unevictable_pgs_rescued 389
unevictable_pgs_mlocked 389
unevictable_pgs_munlocked 389
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           0 kB
Mlocked:               0 kB

Both tests have to show same result. 
But is didn't. 

I think it's not per-cpu problem. 

When I digged in the source, I found that. 
In case of test without munlockall, try_to_unmap_file calls try_to_mlock_page 
since some pages are mapped several vmas(I don't know why that pages is shared 
other vma in same process. One of page which i have seen is test program's 
first code page[page->index : 0 vma->vm_pgoff : 0]. It was mapped by data vma, too). 
Other vma have VM_LOCKED.
So try_to_unmap_file calls try_to_mlock_page. Then, After calling 
successful down_read_try_lock call, mlock_vma_page increased mlocked
counter again. 

In case of test with munlockall, try_to_mlock_page's down_read_trylock 
couldn't be sucessful. That's because munlockall called down_write.
At result, try_to_mlock_page cannot call try_to_mlock_page. so, mlocked counter 
don't increased. I think it's not right. 
But fortunately Mlocked number is right. :(

--
Kinds Regards
MinChan Kim

WARNING: multiple messages have this Message-ID (diff)

From: MinChan Kim <minchan.kim@gmail.com>
To: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: linux mm <linux-mm@kvack.org>,
	linux kernel <linux-kernel@vger.kernel.org>,
	Nick Piggin <npiggin@suse.de>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Subject: Re: [BUG] mlocked page counter mismatch
Date: Thu, 29 Jan 2009 08:55:14 +0900	[thread overview]
Message-ID: <20090128235514.GB24924@barrios-desktop> (raw)
In-Reply-To: <1233156832.8760.85.camel@lts-notebook>

On Wed, Jan 28, 2009 at 10:33:52AM -0500, Lee Schermerhorn wrote:
> On Wed, 2009-01-28 at 19:28 +0900, MinChan Kim wrote:
> > After executing following program, 'cat /proc/meminfo' shows
> > following result. 
> > 
> > --
> > # cat /proc/meminfo 
> > ..
> > Unevictable:           8 kB
> > Mlocked:               8 kB
> > ..
> 
> Sorry, from your description, I can't understand what the problem is.
> Are you saying that 8kB [2 pages] remains locked after you run your
> test?

Yes. 
Sorry. My explanation was not enought. 

> 
> What did meminfo show before running the test program?  And what kernel
> version?

The meminfo showed mlocked, unevictable pages was zero. 
My kernel version is 2.6.29-rc2. 

> 
> > 
> > -- test program --
> > 
> > #include <stdio.h>
> > #include <sys/mman.h>
> > int main()
> > {
> >         char buf[64] = {0,};
> >         char *ptr;
> >         int k;
> >         int i,j;
> >         int x,y;
> >         mlockall(MCL_CURRENT);
> >         sprintf(buf, "cat /proc/%d/maps", getpid());
> >         system(buf);
> >         return 0;
> > }
> > 
> > --
> > 
> > It seems mlocked page counter have a problem.
> > After I diged in source, I found that try_to_unmap_file called 
> > try_to_mlock_page about shared mapping pages 
> > since other vma had VM_LOCKED flag.
> > After all, try_to_mlock_page called mlock_vma_page. 
> > so, mlocked counter increased
> 
> This path of try_to_unmap_file() -> try_to_mlock_page() should only be
> invoked during reclaim--from shrink_page_list().  [try_to_unmap() is
> also called from page migration, but in this case, try_to_unmap_one()
> won't return SWAP_MLOCK so we don't call try_to_mlock_page().]  Unless
> your system is in continuous reclaim, I don't think you'd hit this
> during your test program.

My system was not reclaim mode. It could be called following path. 
exit_mmap -> munlock_vma_pages_all->munlock_vma_page->try_to_munlock->
try_to_unmap_file->try_to_mlock_page

> 
> > 
> > But, After I called munlockall intentionally, the counter work well. 
> > In case of munlockall, we already had a mmap_sem about write. 
> > Such a case, try_to_mlock_page can't call mlock_vma_page.
> > so, mlocked counter didn't increased. 
> > As a result, the counter seems to be work well but I think 
> > it also have a problem.
> 
> I THINK this is a artifact of the way stats are accumulated in per cpu
> differential counters and pushed to the zone state accumulators when a
> threshold is reached.  I've seen this condition before, but it
> eventually clears itself as the stats get pushed to the zone state.
> Still, it bears more investigation, as it's been a while since I worked
> on this and some subsequent fixes could have broken it:

Hmm... My test result is as follow. 

1) without munlockall
before:

root@barrios-target-linux:~# tail -8 /proc/vmstat 
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           0 kB
Mlocked:               0 kB

after:
root@barrios-target-linux:~# tail -8 /proc/vmstat 
unevictable_pgs_culled 369
unevictable_pgs_scanned 0
unevictable_pgs_rescued 388
unevictable_pgs_mlocked 392
unevictable_pgs_munlocked 387
unevictable_pgs_cleared 1
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           8 kB
Mlocked:               8 kB

after dropping cache

root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           4 kB
Mlocked:               4 kB


2) with munlockall 

barrios-target@barrios-target-linux:~$ tail -8 /proc/vmstat
unevictable_pgs_culled 0
unevictable_pgs_scanned 0
unevictable_pgs_rescued 0
unevictable_pgs_mlocked 0
unevictable_pgs_munlocked 0
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

barrios-target@barrios-target-linux:~$ cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           0 kB
Mlocked:               0 kB

after


root@barrios-target-linux:~# tail -8 /proc/vmstat
unevictable_pgs_culled 369
unevictable_pgs_scanned 0
unevictable_pgs_rescued 389
unevictable_pgs_mlocked 389
unevictable_pgs_munlocked 389
unevictable_pgs_cleared 0
unevictable_pgs_stranded 0
unevictable_pgs_mlockfreed 0

root@barrios-target-linux:~# cat /proc/meminfo | egrep 'Mlo|Unev'
Unevictable:           0 kB
Mlocked:               0 kB

Both tests have to show same result. 
But is didn't. 

I think it's not per-cpu problem. 

When I digged in the source, I found that. 
In case of test without munlockall, try_to_unmap_file calls try_to_mlock_page 
since some pages are mapped several vmas(I don't know why that pages is shared 
other vma in same process. One of page which i have seen is test program's 
first code page[page->index : 0 vma->vm_pgoff : 0]. It was mapped by data vma, too). 
Other vma have VM_LOCKED.
So try_to_unmap_file calls try_to_mlock_page. Then, After calling 
successful down_read_try_lock call, mlock_vma_page increased mlocked
counter again. 

In case of test with munlockall, try_to_mlock_page's down_read_trylock 
couldn't be sucessful. That's because munlockall called down_write.
At result, try_to_mlock_page cannot call try_to_mlock_page. so, mlocked counter 
don't increased. I think it's not right. 
But fortunately Mlocked number is right. :(

--
Kinds Regards
MinChan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2009-01-29  0:31 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-01-28 10:28 [BUG] mlocked page counter mismatch MinChan Kim
2009-01-28 10:28 ` MinChan Kim
2009-01-28 14:38 ` KOSAKI Motohiro
2009-01-28 14:38   ` KOSAKI Motohiro
2009-01-28 15:33 ` Lee Schermerhorn
2009-01-28 15:33   ` Lee Schermerhorn
2009-01-28 23:55   ` MinChan Kim [this message]
2009-01-28 23:55     ` MinChan Kim
2009-01-28 23:57     ` MinChan Kim
2009-01-28 23:57       ` MinChan Kim
2009-01-29  1:48     ` Lee Schermerhorn
2009-01-29  1:48       ` Lee Schermerhorn
2009-01-29  4:29       ` MinChan Kim
2009-01-29  4:29         ` MinChan Kim
2009-01-29 12:35       ` KOSAKI Motohiro
2009-01-29 12:35         ` KOSAKI Motohiro
2009-01-29 14:44         ` Lee Schermerhorn
2009-01-29 14:44           ` Lee Schermerhorn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090128235514.GB24924@barrios-desktop \
    --to=minchan.kim@gmail.com \
    --cc=Lee.Schermerhorn@hp.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.