linux-arm-kernel.lists.infradead.org archive mirror
* [RFC V2] mm:add zero_page _mapcount when mapped into user space
@ 2014-12-02  9:27 Wang, Yalin
  2014-12-02 11:30 ` Kirill A. Shutemov
  0 siblings, 1 reply; 6+ messages in thread
From: Wang, Yalin @ 2014-12-02  9:27 UTC
  To: linux-arm-kernel

This patch increments/decrements the zero_page's _mapcount so that
the mapcount is correct for the zero_page.
Reading /proc/kpagecount then also returns a correct mapcount for the
zero_page, and userspace tools such as procrank can
calculate PSS correctly.

Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
---
 mm/memory.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 3e50383..7215423 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2198,6 +2198,7 @@ gotten:
 		new_page = alloc_zeroed_user_highpage_movable(vma, address);
 		if (!new_page)
 			goto oom;
+		atomic_dec(&pfn_to_page(pte_pfn(orig_pte))->_mapcount);
 	} else {
 		new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, address);
 		if (!new_page)
@@ -2647,6 +2648,7 @@ static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
 		if (!pte_none(*page_table))
 			goto unlock;
+		atomic_inc(&pfn_to_page(my_zero_pfn(address))->_mapcount);
 		goto setpte;
 	}
 
-- 
2.1.3


* [RFC V2] mm:add zero_page _mapcount when mapped into user space
  2014-12-02  9:27 [RFC V2] mm:add zero_page _mapcount when mapped into user space Wang, Yalin
@ 2014-12-02 11:30 ` Kirill A. Shutemov
  2014-12-04  6:10   ` Wang, Yalin
  0 siblings, 1 reply; 6+ messages in thread
From: Kirill A. Shutemov @ 2014-12-02 11:30 UTC
  To: linux-arm-kernel

On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
> This patch add/dec zero_page's _mapcount to make sure
> the mapcount is correct for zero_page,
> so that when read from /proc/kpagecount, zero_page's
> mapcount is also correct, userspace process like procrank can
> calculate PSS correctly.

I don't have a specific code path to point to, but I would expect a zero
page with a non-zero mapcount to cause problems with rmap.

How did you test the change?

-- 
 Kirill A. Shutemov


* [RFC V2] mm:add zero_page _mapcount when mapped into user space
  2014-12-02 11:30 ` Kirill A. Shutemov
@ 2014-12-04  6:10   ` Wang, Yalin
  2014-12-04 12:28     ` Kirill A. Shutemov
  0 siblings, 1 reply; 6+ messages in thread
From: Wang, Yalin @ 2014-12-04  6:10 UTC
  To: linux-arm-kernel

> -----Original Message-----
> From: Kirill A. Shutemov [mailto:kirill at shutemov.name]
> Sent: Tuesday, December 02, 2014 7:30 PM
> To: Wang, Yalin
> Cc: 'linux-kernel at vger.kernel.org'; 'linux-mm at kvack.org'; 'linux-arm-
> kernel at lists.infradead.org'
> Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user
> space
> 
> On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
> > This patch add/dec zero_page's _mapcount to make sure the mapcount is
> > correct for zero_page, so that when read from /proc/kpagecount,
> > zero_page's mapcount is also correct, userspace process like procrank
> > can calculate PSS correctly.
> 
> I don't have specific code path to point to, but I would expect zero page
> with non-zero mapcount would cause a problem with rmap.
> 
> How do you test the change?
> 
I only tested it by reading the mapcount through /proc/pid/pagemap and
/proc/kpagecount, and it works well.

The problem is that in /proc/pid/smaps the Rss/Pss values do not count
zero_page mappings, because smaps_pte_entry() calls vm_normal_page(),
which returns NULL for the zero_page.

But when a userspace process reads /proc/pid/pagemap, it sees the
zero_page as mapped and treats it as Rss.
This is inconsistent. Should we also omit the zero_page from
/proc/pid/pagemap, or count the zero_page as Rss in /proc/pid/smaps?

I think we should count the zero_page in Rss, because it really is mapped
into the userspace address space, and doing so would make userspace
memory analysis more accurate.
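
To make the PSS dependency on mapcount concrete, here is a minimal
sketch of the arithmetic a tool like procrank would have to do. It is
not procrank's actual code. The helper name pss_bytes() is hypothetical,
and skipping pages that report a mapcount of 0 is an assumption of this
sketch (it is one possible policy, and it means such pages add nothing
to the total).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Proportional set size in bytes: every mapped page contributes
 * page_size / mapcount, so a page shared by N mappings is charged
 * 1/N to each of them.
 *
 * mapcounts[i] stands for the 64-bit count a tool would read from
 * /proc/kpagecount for the i-th page mapped by the process.
 * Assumption of this sketch: a count of 0 is skipped.
 */
static uint64_t pss_bytes(const uint64_t *mapcounts, size_t n,
			  uint64_t page_size)
{
	uint64_t pss = 0;

	for (size_t i = 0; i < n; i++) {
		if (mapcounts[i] == 0)
			continue; /* unaccounted page: contributes nothing */
		pss += page_size / mapcounts[i];
	}
	return pss;
}
```

With 4096-byte pages and mapcounts {1, 2, 4} this gives
4096 + 2048 + 1024 = 7168 bytes.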

Thanks


* [RFC V2] mm:add zero_page _mapcount when mapped into user space
  2014-12-04  6:10   ` Wang, Yalin
@ 2014-12-04 12:28     ` Kirill A. Shutemov
  2014-12-05  6:39       ` Konstantin Khlebnikov
  0 siblings, 1 reply; 6+ messages in thread
From: Kirill A. Shutemov @ 2014-12-04 12:28 UTC
  To: linux-arm-kernel

On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote:
> > -----Original Message-----
> > From: Kirill A. Shutemov [mailto:kirill at shutemov.name]
> > Sent: Tuesday, December 02, 2014 7:30 PM
> > To: Wang, Yalin
> > Cc: 'linux-kernel at vger.kernel.org'; 'linux-mm at kvack.org'; 'linux-arm-
> > kernel at lists.infradead.org'
> > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user
> > space
> > 
> > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
> > > This patch add/dec zero_page's _mapcount to make sure the mapcount is
> > > correct for zero_page, so that when read from /proc/kpagecount,
> > > zero_page's mapcount is also correct, userspace process like procrank
> > > can calculate PSS correctly.
> > 
> > I don't have specific code path to point to, but I would expect zero page
> > with non-zero mapcount would cause a problem with rmap.
> > 
> > How do you test the change?
> > 
> I just test it to see the mapcount from /proc/pid/pagemap  and /proc/kpagecount ,
> It works well,

I took a closer look and your patch is broken in multiple places:
 - on zap_pte_range() you don't decrement mapcount;
 - you don't update rss counters for mm;
 - copy_one_pte() doesn't increase mapcount;
 - ...

Basically, each and every vm_normal_page() call must be audited, and that
is only the first step. And you totally skip the huge zero page.

Proper mapcount handling for the zero page would require a lot more work
and I don't think it is worth it. The gain is too small.

NAK.

> The problem is that when I see /proc/pid/smaps ,
> The Rss / Pss don't calculate zero_page map,
> Because smaps_pte_entry() --> vm_normal_page( ),
> Will return NULL for zero_page,
> 
> But when userspace process cat /proc/pid/pagemap  ,
> It will see zero_page mapped,
> And will treat as Rss ,  
> This is weird, should we also omit zero_page in /proc/pid/pagemap ?
> Or add zero_page as Rss in /proc/pid/smaps ? 
> 
> I think we should add zero_page into Rss ,
> Because it is really mapped into userspace address space.
> And will let userspace memory analysis more accurate .

It would be easier for userspace to find out the pfn of the zero page and
take it into account.

Note: some architectures have multiple zero pages due to coloring.
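
A rough sketch of what that could look like on the userspace side,
assuming the documented /proc/pid/pagemap entry layout (bit 63 = page
present, bits 0-54 = PFN when present). The helper names and the idea
of keeping a small array of zero-page PFNs (to cover the coloring case)
are illustrative assumptions, not an existing tool's code. How the tool
learns those PFNs is left open here. One option would be to read-fault
a fresh anonymous mapping and look up its own pagemap entry, which may
require privileges depending on the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of one 64-bit /proc/pid/pagemap entry. */
#define PM_PRESENT  (UINT64_C(1) << 63)        /* page is present in RAM */
#define PM_PFN_MASK ((UINT64_C(1) << 55) - 1)  /* bits 0-54: PFN */

/* Store the PFN and return true if the entry describes a present page. */
static bool pagemap_pfn(uint64_t entry, uint64_t *pfn)
{
	if (!(entry & PM_PRESENT))
		return false;
	*pfn = entry & PM_PFN_MASK;
	return true;
}

/*
 * Return true if the entry maps one of the known zero-page PFNs.
 * zero_pfns holds n PFNs, since some architectures have more than
 * one zero page.
 */
static bool maps_zero_page(uint64_t entry, const uint64_t *zero_pfns,
			   size_t n)
{
	uint64_t pfn;

	if (!pagemap_pfn(entry, &pfn))
		return false;
	for (size_t i = 0; i < n; i++)
		if (zero_pfns[i] == pfn)
			return true;
	return false;
}
```

A tool could then skip (or separately report) every entry for which
maps_zero_page() returns true when it sums up Rss/Pss.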

-- 
 Kirill A. Shutemov


* [RFC V2] mm:add zero_page _mapcount when mapped into user space
  2014-12-04 12:28     ` Kirill A. Shutemov
@ 2014-12-05  6:39       ` Konstantin Khlebnikov
  2014-12-05  8:08         ` Wang, Yalin
  0 siblings, 1 reply; 6+ messages in thread
From: Konstantin Khlebnikov @ 2014-12-05  6:39 UTC
  To: linux-arm-kernel

On Thu, Dec 4, 2014 at 3:28 PM, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote:
>> > -----Original Message-----
>> > From: Kirill A. Shutemov [mailto:kirill at shutemov.name]
>> > Sent: Tuesday, December 02, 2014 7:30 PM
>> > To: Wang, Yalin
>> > Cc: 'linux-kernel at vger.kernel.org'; 'linux-mm at kvack.org'; 'linux-arm-
>> > kernel at lists.infradead.org'
>> > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user
>> > space
>> >
>> > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
>> > > This patch add/dec zero_page's _mapcount to make sure the mapcount is
>> > > correct for zero_page, so that when read from /proc/kpagecount,
>> > > zero_page's mapcount is also correct, userspace process like procrank
>> > > can calculate PSS correctly.

Instead of tweaking mapcount you could mark zero pages in /proc/kpageflags
and handle them accordingly in userspace. Or mark zero pages with a special
magic _mapcount value and detect it in /proc/kpagecount.

>> >
>> > I don't have specific code path to point to, but I would expect zero page
>> > with non-zero mapcount would cause a problem with rmap.
>> >
>> > How do you test the change?
>> >
>> I just test it to see the mapcount from /proc/pid/pagemap  and /proc/kpagecount ,
>> It works well,
>
> I took a closer look and your patch is broken in multiple places:
>  - on zap_pte_range() you don't decrement mapcount;
>  - you don't update rss counters for mm;
>  - copy_one_pte() doesn't increase mapcount;
>  - ...
>
> Basically, each and every vm_normal_page() call must be audited. As first
> step. And you totally skip huge zero page.
>
> Proper mapcount handling for zero page would require a lot more work and I
> don't think it worth it. Gain is too small.
>
> NAK.
>
>> The problem is that when I see /proc/pid/smaps ,
>> The Rss / Pss don't calculate zero_page map,
>> Because smaps_pte_entry() --> vm_normal_page( ),
>> Will return NULL for zero_page,
>>
>> But when userspace process cat /proc/pid/pagemap  ,
>> It will see zero_page mapped,
>> And will treat as Rss ,
>> This is weird, should we also omit zero_page in /proc/pid/pagemap ?
>> Or add zero_page as Rss in /proc/pid/smaps ?
>>
>> I think we should add zero_page into Rss ,
>> Because it is really mapped into userspace address space.
>> And will let userspace memory analysis more accurate .
>
> It would be easier for userspace to find out pfn of zero page and take it
> into account.
>
> Note: some architectures have multiple zero page due to coloring.
>
> --
>  Kirill A. Shutemov


* [RFC V2] mm:add zero_page _mapcount when mapped into user space
  2014-12-05  6:39       ` Konstantin Khlebnikov
@ 2014-12-05  8:08         ` Wang, Yalin
  0 siblings, 0 replies; 6+ messages in thread
From: Wang, Yalin @ 2014-12-05  8:08 UTC
  To: linux-arm-kernel

> -----Original Message-----
> From: Konstantin Khlebnikov [mailto:koct9i at gmail.com]
> Sent: Friday, December 05, 2014 2:39 PM
> To: Kirill A. Shutemov
> Cc: Wang, Yalin; linux-kernel at vger.kernel.org; linux-mm at kvack.org; linux-
> arm-kernel at lists.infradead.org
> Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into user
> space
> 
> On Thu, Dec 4, 2014 at 3:28 PM, Kirill A. Shutemov <kirill@shutemov.name>
> wrote:
> > On Thu, Dec 04, 2014 at 02:10:53PM +0800, Wang, Yalin wrote:
> >> > -----Original Message-----
> >> > From: Kirill A. Shutemov [mailto:kirill at shutemov.name]
> >> > Sent: Tuesday, December 02, 2014 7:30 PM
> >> > To: Wang, Yalin
> >> > Cc: 'linux-kernel at vger.kernel.org'; 'linux-mm at kvack.org';
> >> > 'linux-arm- kernel at lists.infradead.org'
> >> > Subject: Re: [RFC V2] mm:add zero_page _mapcount when mapped into
> >> > user space
> >> >
> >> > On Tue, Dec 02, 2014 at 05:27:36PM +0800, Wang, Yalin wrote:
> >> > > This patch add/dec zero_page's _mapcount to make sure the
> >> > > mapcount is correct for zero_page, so that when read from
> >> > > /proc/kpagecount, zero_page's mapcount is also correct, userspace
> >> > > process like procrank can calculate PSS correctly.
> 
> Instead of tweaking mapcount you could mark zero-pages in /proc/kpageflags
> and handle them accordingly in userspace. Or mark zero pages with special
> magic _mapcount and detect it in /proc/kpagecount.
> 
I think adding KPF_ZERO_PAGE to kpageflags is better.
I will send another patch for review.
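
For illustration, this is roughly how a userspace tool could consume
such a flag. /proc/kpageflags holds one 64-bit flag word per PFN, so the
record for a PFN sits at byte offset pfn * 8. The bit number used below
(24) is purely an assumption of this sketch, since no bit number is
fixed anywhere in this thread, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: bit position of the proposed KPF_ZERO_PAGE flag. */
#define KPF_ZERO_PAGE_BIT 24

/* Byte offset of the 64-bit flag word for a given PFN in /proc/kpageflags. */
static uint64_t kpageflags_offset(uint64_t pfn)
{
	return pfn * sizeof(uint64_t);
}

/* Return true if the flag word marks the page as a zero page. */
static bool kpageflags_is_zero_page(uint64_t flags)
{
	return (flags >> KPF_ZERO_PAGE_BIT) & 1;
}
```

A tool would pread() eight bytes at kpageflags_offset(pfn) and pass the
value to kpageflags_is_zero_page() before deciding how to account the
page.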

Thanks

