[PATCH mmotm] add the pagefault count into memcg stats: shmem fix

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

* [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
@ 2011-05-17 18:24 Hugh Dickins
  2011-05-17 20:00 ` Ying Han
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Hugh Dickins @ 2011-05-17 18:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro, Minchan Kim,
	Daisuke Nishimura, Balbir Singh, linux-mm

mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
target mm, not for current mm (but of course they're usually the same).

We don't know the target mm in shmem_getpage(), so do it at the outer
level in shmem_fault(); and it's easier to follow if we move the
count_vm_event(PGMAJFAULT) there too.

Hah, it was using __count_vm_event() before, sneaking that update into
the unpreemptible section under info->lock: well, it comes to the same
on x86 at least, and I still think it's best to keep these together.

Signed-off-by: Hugh Dickins <hughd@google.com>
---

 mm/shmem.c |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

--- mmotm/mm/shmem.c	2011-05-13 14:57:45.367884578 -0700
+++ linux/mm/shmem.c	2011-05-17 10:27:19.901934756 -0700
@@ -1293,14 +1293,10 @@ repeat:
 		swappage = lookup_swap_cache(swap);
 		if (!swappage) {
 			shmem_swp_unmap(entry);
+			spin_unlock(&info->lock);
 			/* here we actually do the io */
-			if (type && !(*type & VM_FAULT_MAJOR)) {
-				__count_vm_event(PGMAJFAULT);
-				mem_cgroup_count_vm_event(current->mm,
-							  PGMAJFAULT);
+			if (type)
 				*type |= VM_FAULT_MAJOR;
-			}
-			spin_unlock(&info->lock);
 			swappage = shmem_swapin(swap, gfp, info, idx);
 			if (!swappage) {
 				spin_lock(&info->lock);
@@ -1539,7 +1535,10 @@ static int shmem_fault(struct vm_area_st
 	error = shmem_getpage(inode, vmf->pgoff, &vmf->page, SGP_CACHE, &ret);
 	if (error)
 		return ((error == -ENOMEM) ? VM_FAULT_OOM : VM_FAULT_SIGBUS);
-
+	if (ret & VM_FAULT_MAJOR) {
+		count_vm_event(PGMAJFAULT);
+		mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
+	}
 	return ret | VM_FAULT_LOCKED;
 }
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-17 18:24 [PATCH mmotm] add the pagefault count into memcg stats: shmem fix Hugh Dickins
@ 2011-05-17 20:00 ` Ying Han
  2011-05-18  5:43 ` Daisuke Nishimura
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Ying Han @ 2011-05-17 20:00 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, KOSAKI Motohiro, Minchan Kim,
	Daisuke Nishimura, Balbir Singh, linux-mm

[-- Attachment #1: Type: text/plain, Size: 2336 bytes --]

On Tue, May 17, 2011 at 11:24 AM, Hugh Dickins <hughd@google.com> wrote:

> mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> target mm, not for current mm (but of course they're usually the same).
>
> We don't know the target mm in shmem_getpage(), so do it at the outer
> level in shmem_fault(); and it's easier to follow if we move the
> count_vm_event(PGMAJFAULT) there too.
>
> Hah, it was using __count_vm_event() before, sneaking that update into
> the unpreemptible section under info->lock: well, it comes to the same
> on x86 at least, and I still think it's best to keep these together.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>
>  mm/shmem.c |   13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> --- mmotm/mm/shmem.c    2011-05-13 14:57:45.367884578 -0700
> +++ linux/mm/shmem.c    2011-05-17 10:27:19.901934756 -0700
> @@ -1293,14 +1293,10 @@ repeat:
>                swappage = lookup_swap_cache(swap);
>                if (!swappage) {
>                        shmem_swp_unmap(entry);
> +                       spin_unlock(&info->lock);
>                        /* here we actually do the io */
> -                       if (type && !(*type & VM_FAULT_MAJOR)) {
> -                               __count_vm_event(PGMAJFAULT);
> -                               mem_cgroup_count_vm_event(current->mm,
> -                                                         PGMAJFAULT);
> +                       if (type)
>                                *type |= VM_FAULT_MAJOR;
> -                       }
> -                       spin_unlock(&info->lock);
>                        swappage = shmem_swapin(swap, gfp, info, idx);
>                        if (!swappage) {
>                                spin_lock(&info->lock);
> @@ -1539,7 +1535,10 @@ static int shmem_fault(struct vm_area_st
>        error = shmem_getpage(inode, vmf->pgoff, &vmf->page, SGP_CACHE,
> &ret);
>        if (error)
>                return ((error == -ENOMEM) ? VM_FAULT_OOM :
> VM_FAULT_SIGBUS);
> -
> +       if (ret & VM_FAULT_MAJOR) {
> +               count_vm_event(PGMAJFAULT);
> +               mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
> +       }
>        return ret | VM_FAULT_LOCKED;
>  }
>
> Thank you Hugh for the fix.

Acked-by: Ying Han<yinghan@google.com>

--Ying

[-- Attachment #2: Type: text/html, Size: 3366 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-17 18:24 [PATCH mmotm] add the pagefault count into memcg stats: shmem fix Hugh Dickins
  2011-05-17 20:00 ` Ying Han
@ 2011-05-18  5:43 ` Daisuke Nishimura
  2011-05-18 18:25   ` Hugh Dickins
  2011-05-18 18:32 ` Balbir Singh
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Daisuke Nishimura @ 2011-05-18  5:43 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Minchan Kim, Balbir Singh, linux-mm, Daisuke Nishimura

On Tue, 17 May 2011 11:24:40 -0700 (PDT)
Hugh Dickins <hughd@google.com> wrote:

> mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> target mm, not for current mm (but of course they're usually the same).
> 
hmm, why ?
In shmem_getpage(), we charge the page to the memcg where current mm belongs to,
so I think counting vm events of the memcg is right.

Thanks,
Daisuke Nishimura.

> We don't know the target mm in shmem_getpage(), so do it at the outer
> level in shmem_fault(); and it's easier to follow if we move the
> count_vm_event(PGMAJFAULT) there too.
> 
> Hah, it was using __count_vm_event() before, sneaking that update into
> the unpreemptible section under info->lock: well, it comes to the same
> on x86 at least, and I still think it's best to keep these together.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
> 
>  mm/shmem.c |   13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> --- mmotm/mm/shmem.c	2011-05-13 14:57:45.367884578 -0700
> +++ linux/mm/shmem.c	2011-05-17 10:27:19.901934756 -0700
> @@ -1293,14 +1293,10 @@ repeat:
>  		swappage = lookup_swap_cache(swap);
>  		if (!swappage) {
>  			shmem_swp_unmap(entry);
> +			spin_unlock(&info->lock);
>  			/* here we actually do the io */
> -			if (type && !(*type & VM_FAULT_MAJOR)) {
> -				__count_vm_event(PGMAJFAULT);
> -				mem_cgroup_count_vm_event(current->mm,
> -							  PGMAJFAULT);
> +			if (type)
>  				*type |= VM_FAULT_MAJOR;
> -			}
> -			spin_unlock(&info->lock);
>  			swappage = shmem_swapin(swap, gfp, info, idx);
>  			if (!swappage) {
>  				spin_lock(&info->lock);
> @@ -1539,7 +1535,10 @@ static int shmem_fault(struct vm_area_st
>  	error = shmem_getpage(inode, vmf->pgoff, &vmf->page, SGP_CACHE, &ret);
>  	if (error)
>  		return ((error == -ENOMEM) ? VM_FAULT_OOM : VM_FAULT_SIGBUS);
> -
> +	if (ret & VM_FAULT_MAJOR) {
> +		count_vm_event(PGMAJFAULT);
> +		mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
> +	}
>  	return ret | VM_FAULT_LOCKED;
>  }
>  

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-18  5:43 ` Daisuke Nishimura
@ 2011-05-18 18:25   ` Hugh Dickins
  2011-05-19  0:38     ` KAMEZAWA Hiroyuki
  2011-05-19  2:16     ` Daisuke Nishimura
  0 siblings, 2 replies; 13+ messages in thread
From: Hugh Dickins @ 2011-05-18 18:25 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Andrew Morton, Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Minchan Kim, Balbir Singh, linux-mm

On Wed, 18 May 2011, Daisuke Nishimura wrote:
> On Tue, 17 May 2011 11:24:40 -0700 (PDT)
> Hugh Dickins <hughd@google.com> wrote:
> 
> > mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> > target mm, not for current mm (but of course they're usually the same).
> > 
> hmm, why ?
> In shmem_getpage(), we charge the page to the memcg where current mm belongs to,

(In the case when it's this fault which is creating the page.
Just as when filemap_fault() reads in the page, add_to_page_cache
will charge it to the current->mm's memcg, yes.  Arguably correct.)

> so I think counting vm events of the memcg is right.

It should be consistent with which task gets the maj_flt++, and
it should be consistent with filemap_fault(), and it should be a
subset of what's counted by mem_cgroup_count_vm_event(mm, PGFAULT).

In each case, those work on target mm rather than current->mm.

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-17 18:24 [PATCH mmotm] add the pagefault count into memcg stats: shmem fix Hugh Dickins
  2011-05-17 20:00 ` Ying Han
  2011-05-18  5:43 ` Daisuke Nishimura
@ 2011-05-18 18:32 ` Balbir Singh
  2011-05-18 23:17 ` Minchan Kim
  2011-05-22 23:31 ` Minchan Kim
  4 siblings, 0 replies; 13+ messages in thread
From: Balbir Singh @ 2011-05-18 18:32 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Minchan Kim, Daisuke Nishimura, linux-mm

* Hugh Dickins <hughd@google.com> [2011-05-17 11:24:40]:

> mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> target mm, not for current mm (but of course they're usually the same).
> 
> We don't know the target mm in shmem_getpage(), so do it at the outer
> level in shmem_fault(); and it's easier to follow if we move the
> count_vm_event(PGMAJFAULT) there too.
> 
> Hah, it was using __count_vm_event() before, sneaking that update into
> the unpreemptible section under info->lock: well, it comes to the same
> on x86 at least, and I still think it's best to keep these together.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>

Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
 

-- 
	Three Cheers,
	Balbir

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-17 18:24 [PATCH mmotm] add the pagefault count into memcg stats: shmem fix Hugh Dickins
                   ` (2 preceding siblings ...)
  2011-05-18 18:32 ` Balbir Singh
@ 2011-05-18 23:17 ` Minchan Kim
  2011-05-19  0:28   ` Hugh Dickins
  2011-05-22 23:31 ` Minchan Kim
  4 siblings, 1 reply; 13+ messages in thread
From: Minchan Kim @ 2011-05-18 23:17 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Daisuke Nishimura, Balbir Singh, linux-mm

Hi Hugh,

On Wed, May 18, 2011 at 3:24 AM, Hugh Dickins <hughd@google.com> wrote:
> mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> target mm, not for current mm (but of course they're usually the same).
>
> We don't know the target mm in shmem_getpage(), so do it at the outer
> level in shmem_fault(); and it's easier to follow if we move the
> count_vm_event(PGMAJFAULT) there too.
>
> Hah, it was using __count_vm_event() before, sneaking that update into
> the unpreemptible section under info->lock: well, it comes to the same
> on x86 at least, and I still think it's best to keep these together.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>

It's good to me but I have a nitpick.

You are changing behavior a bit.
Old behavior is to account FAULT although the operation got failed.
But new one is to not account it.
I think we have to account it regardless of whether it is successful or not.
That's because it is fact fault happens.

> ---
>
>  mm/shmem.c |   13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> --- mmotm/mm/shmem.c    2011-05-13 14:57:45.367884578 -0700
> +++ linux/mm/shmem.c    2011-05-17 10:27:19.901934756 -0700
> @@ -1293,14 +1293,10 @@ repeat:
>                swappage = lookup_swap_cache(swap);
>                if (!swappage) {
>                        shmem_swp_unmap(entry);
> +                       spin_unlock(&info->lock);
>                        /* here we actually do the io */
> -                       if (type && !(*type & VM_FAULT_MAJOR)) {
> -                               __count_vm_event(PGMAJFAULT);
> -                               mem_cgroup_count_vm_event(current->mm,
> -                                                         PGMAJFAULT);
> +                       if (type)
>                                *type |= VM_FAULT_MAJOR;
> -                       }
> -                       spin_unlock(&info->lock);
>                        swappage = shmem_swapin(swap, gfp, info, idx);
>                        if (!swappage) {
>                                spin_lock(&info->lock);
> @@ -1539,7 +1535,10 @@ static int shmem_fault(struct vm_area_st
>        error = shmem_getpage(inode, vmf->pgoff, &vmf->page, SGP_CACHE, &ret);
>        if (error)
>                return ((error == -ENOMEM) ? VM_FAULT_OOM : VM_FAULT_SIGBUS);
> -
> +       if (ret & VM_FAULT_MAJOR) {
> +               count_vm_event(PGMAJFAULT);
> +               mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
> +       }
>        return ret | VM_FAULT_LOCKED;
>  }
>
>



-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-18 23:17 ` Minchan Kim
@ 2011-05-19  0:28   ` Hugh Dickins
  2011-05-19  0:37     ` Minchan Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Hugh Dickins @ 2011-05-19  0:28 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Daisuke Nishimura, Balbir Singh, linux-mm

On Thu, 19 May 2011, Minchan Kim wrote:
> On Wed, May 18, 2011 at 3:24 AM, Hugh Dickins <hughd@google.com> wrote:
> > mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> > target mm, not for current mm (but of course they're usually the same).
> >
> > We don't know the target mm in shmem_getpage(), so do it at the outer
> > level in shmem_fault(); and it's easier to follow if we move the
> > count_vm_event(PGMAJFAULT) there too.
> >
> > Hah, it was using __count_vm_event() before, sneaking that update into
> > the unpreemptible section under info->lock: well, it comes to the same
> > on x86 at least, and I still think it's best to keep these together.
> >
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> 
> It's good to me but I have a nitpick.
> 
> You are changing behavior a bit.
> Old behavior is to account FAULT although the operation got failed.
> But new one is to not account it.
> I think we have to account it regardless of whether it is successful or not.
> That's because it is fact fault happens.

That's a good catch: something I didn't think of at all.

However, it looks as if the patch remains correct, and is fixing
a bug (or inconsistency) that we hadn't noticed before.

If you look through filemap_fault() or do_swap_page() (or even
ncp_file_mmap_fault(), though I don't take that one as canonical!),
they clearly do not count the major fault on error (except in the
case where VM_FAULT_MAJOR needs VM_FAULT_RETRY, then gets
VM_FAULT_ERROR on the retry).

So, shmem.c was the odd one out before.  If you feel very strongly
about it ("it is fact fault happens") you could submit a patch to
change them all - but I think just leave them as is.

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-19  0:28   ` Hugh Dickins
@ 2011-05-19  0:37     ` Minchan Kim
  2011-05-19  1:35       ` Hugh Dickins
  0 siblings, 1 reply; 13+ messages in thread
From: Minchan Kim @ 2011-05-19  0:37 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Daisuke Nishimura, Balbir Singh, linux-mm

On Thu, May 19, 2011 at 9:28 AM, Hugh Dickins <hughd@google.com> wrote:
> On Thu, 19 May 2011, Minchan Kim wrote:
>> On Wed, May 18, 2011 at 3:24 AM, Hugh Dickins <hughd@google.com> wrote:
>> > mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
>> > target mm, not for current mm (but of course they're usually the same).
>> >
>> > We don't know the target mm in shmem_getpage(), so do it at the outer
>> > level in shmem_fault(); and it's easier to follow if we move the
>> > count_vm_event(PGMAJFAULT) there too.
>> >
>> > Hah, it was using __count_vm_event() before, sneaking that update into
>> > the unpreemptible section under info->lock: well, it comes to the same
>> > on x86 at least, and I still think it's best to keep these together.
>> >
>> > Signed-off-by: Hugh Dickins <hughd@google.com>
>>
>> It's good to me but I have a nitpick.
>>
>> You are changing behavior a bit.
>> Old behavior is to account FAULT although the operation got failed.
>> But new one is to not account it.
>> I think we have to account it regardless of whether it is successful or not.
>> That's because it is fact fault happens.
>
> That's a good catch: something I didn't think of at all.
>
> However, it looks as if the patch remains correct, and is fixing
> a bug (or inconsistency) that we hadn't noticed before.
>
> If you look through filemap_fault() or do_swap_page() (or even
> ncp_file_mmap_fault(), though I don't take that one as canonical!),
> they clearly do not count the major fault on error (except in the
> case where VM_FAULT_MAJOR needs VM_FAULT_RETRY, then gets
> VM_FAULT_ERROR on the retry).
>
> So, shmem.c was the odd one out before.  If you feel very strongly
> about it ("it is fact fault happens") you could submit a patch to
> change them all - but I think just leave them as is.

Okay. I don't feel it strongly now.
Then, could you repost your patch with corrected description about
this behavior change which is a bug or inconsistency whatever. :)


>
> Hugh
>



-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-18 18:25   ` Hugh Dickins
@ 2011-05-19  0:38     ` KAMEZAWA Hiroyuki
  2011-05-19  1:54       ` Hugh Dickins
  2011-05-19  2:16     ` Daisuke Nishimura
  1 sibling, 1 reply; 13+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-05-19  0:38 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Daisuke Nishimura, Andrew Morton, Ying Han, KOSAKI Motohiro,
	Minchan Kim, Balbir Singh, linux-mm

On Wed, 18 May 2011 11:25:48 -0700 (PDT)
Hugh Dickins <hughd@google.com> wrote:

> On Wed, 18 May 2011, Daisuke Nishimura wrote:
> > On Tue, 17 May 2011 11:24:40 -0700 (PDT)
> > Hugh Dickins <hughd@google.com> wrote:
> > 
> > > mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> > > target mm, not for current mm (but of course they're usually the same).
> > > 
> > hmm, why ?
> > In shmem_getpage(), we charge the page to the memcg where current mm belongs to,
> 
> (In the case when it's this fault which is creating the page.
> Just as when filemap_fault() reads in the page, add_to_page_cache
> will charge it to the current->mm's memcg, yes.  Arguably correct.)
> 
> > so I think counting vm events of the memcg is right.
> 
> It should be consistent with which task gets the maj_flt++, and
> it should be consistent with filemap_fault(), and it should be a
> subset of what's counted by mem_cgroup_count_vm_event(mm, PGFAULT).
> 
> In each case, those work on target mm rather than current->mm.
> 

Hmm, I have no strong opinion on this but yes, it makes sense to account
PGMAJFLT to the process whose mm->maj_flt++. BTW,  do you think memcg should
account shmem into vma->vm_mm rather than current->mm ? When vma->vm_mm
is different from current ? At get_user_pages() + MAJFLT ?

Thanks,
-Kame


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-19  0:37     ` Minchan Kim
@ 2011-05-19  1:35       ` Hugh Dickins
  0 siblings, 0 replies; 13+ messages in thread
From: Hugh Dickins @ 2011-05-19  1:35 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Daisuke Nishimura, Balbir Singh, linux-mm

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1983 bytes --]

On Thu, 19 May 2011, Minchan Kim wrote:
> On Thu, May 19, 2011 at 9:28 AM, Hugh Dickins <hughd@google.com> wrote:
> > On Thu, 19 May 2011, Minchan Kim wrote:
> >>
> >> You are changing behavior a bit.
> >> Old behavior is to account FAULT although the operation got failed.
> >> But new one is to not account it.
> >> I think we have to account it regardless of whether it is successful or not.
> >> That's because it is fact fault happens.
> >
> > That's a good catch: something I didn't think of at all.
> >
> > However, it looks as if the patch remains correct, and is fixing
> > a bug (or inconsistency) that we hadn't noticed before.
> >
> > If you look through filemap_fault() or do_swap_page() (or even
> > ncp_file_mmap_fault(), though I don't take that one as canonical!),
> > they clearly do not count the major fault on error (except in the
> > case where VM_FAULT_MAJOR needs VM_FAULT_RETRY, then gets
> > VM_FAULT_ERROR on the retry).
> >
> > So, shmem.c was the odd one out before.  If you feel very strongly
> > about it ("it is fact fault happens") you could submit a patch to
> > change them all - but I think just leave them as is.
> 
> Okay. I don't feel it strongly now.
> Then, could you repost your patch with corrected description about
> this behavior change which is a bug or inconsistency whatever. :)

If I can think up a correct paragraph which makes it clear.
Let me hold off on that, and see what other comments come in.

The situation is less clear-cut than I described above.  I was
dismissing the VM_FAULT_MAJOR|VM_FAULT_RETRY then VM_FAULT_ERROR
case as unlikely, whereas that would be the common case of I/O error
on fault on x86 now - reading in takes page lock, so it will retry.

Whereas other architectures (not using FAULT_FLAG_ALLOW_RETRY)
behave as I described, as filemap_fault() used to behave.

I don't think this a major fault in my patch ;)
Just something we don't care very much about.

Hugh

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-19  0:38     ` KAMEZAWA Hiroyuki
@ 2011-05-19  1:54       ` Hugh Dickins
  0 siblings, 0 replies; 13+ messages in thread
From: Hugh Dickins @ 2011-05-19  1:54 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Daisuke Nishimura, Andrew Morton, Ying Han, KOSAKI Motohiro,
	Minchan Kim, Balbir Singh, linux-mm

On Thu, 19 May 2011, KAMEZAWA Hiroyuki wrote:
> On Wed, 18 May 2011 11:25:48 -0700 (PDT)
> Hugh Dickins <hughd@google.com> wrote:
> 
> > On Wed, 18 May 2011, Daisuke Nishimura wrote:
> > > On Tue, 17 May 2011 11:24:40 -0700 (PDT)
> > > Hugh Dickins <hughd@google.com> wrote:
> > > 
> > > > mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> > > > target mm, not for current mm (but of course they're usually the same).
> > > > 
> > > hmm, why ?
> > > In shmem_getpage(), we charge the page to the memcg where current mm belongs to,
> > 
> > (In the case when it's this fault which is creating the page.
> > Just as when filemap_fault() reads in the page, add_to_page_cache
> > will charge it to the current->mm's memcg, yes.  Arguably correct.)
> > 
> > > so I think counting vm events of the memcg is right.
> > 
> > It should be consistent with which task gets the maj_flt++, and
> > it should be consistent with filemap_fault(), and it should be a
> > subset of what's counted by mem_cgroup_count_vm_event(mm, PGFAULT).
> > 
> > In each case, those work on target mm rather than current->mm.
> > 
> 
> Hmm, I have no strong opinion on this but yes, it makes sense to account
> PGMAJFLT to the process whose mm->maj_flt++.

mm->maj_flt++ would be yet another story!  But it's tsk->maj_flt++.

> BTW,  do you think memcg should
> account shmem into vma->vm_mm rather than current->mm ? When vma->vm_mm
> is different from current ? At get_user_pages() + MAJFLT ?

If what we have at present works well enough, I don't think we should
risk breaking it with a funny change like that.

Is there a reason to treat shmem differently from filemap there?
You can argue that if it's shm then yes, but if it's tmpfs then no.

I suppose shmem.c does usually know the difference (by VM_NORESERVE),
but does not export it; and I'd prefer to keep it that way.

Unless we've got a real bug to fix here, let's not mess with it.

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-18 18:25   ` Hugh Dickins
  2011-05-19  0:38     ` KAMEZAWA Hiroyuki
@ 2011-05-19  2:16     ` Daisuke Nishimura
  1 sibling, 0 replies; 13+ messages in thread
From: Daisuke Nishimura @ 2011-05-19  2:16 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Minchan Kim, Balbir Singh, linux-mm, Daisuke Nishimura

On Wed, 18 May 2011 11:25:48 -0700 (PDT)
Hugh Dickins <hughd@google.com> wrote:

> On Wed, 18 May 2011, Daisuke Nishimura wrote:
> > On Tue, 17 May 2011 11:24:40 -0700 (PDT)
> > Hugh Dickins <hughd@google.com> wrote:
> > 
> > > mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> > > target mm, not for current mm (but of course they're usually the same).
> > > 
> > hmm, why ?
> > In shmem_getpage(), we charge the page to the memcg where current mm belongs to,
> 
> (In the case when it's this fault which is creating the page.
> Just as when filemap_fault() reads in the page, add_to_page_cache
> will charge it to the current->mm's memcg, yes.  Arguably correct.)
> 
> > so I think counting vm events of the memcg is right.
> 
> It should be consistent with which task gets the maj_flt++, and
> it should be consistent with filemap_fault(), and it should be a
> subset of what's counted by mem_cgroup_count_vm_event(mm, PGFAULT).
> 
> In each case, those work on target mm rather than current->mm.
> 
Thank you for your explanation. I can agree that we should count PGMAJFLT of memcg
where the target mm belongs to.

Thanks,
Daisuke Nishimura.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH mmotm] add the pagefault count into memcg stats: shmem fix
  2011-05-17 18:24 [PATCH mmotm] add the pagefault count into memcg stats: shmem fix Hugh Dickins
                   ` (3 preceding siblings ...)
  2011-05-18 23:17 ` Minchan Kim
@ 2011-05-22 23:31 ` Minchan Kim
  4 siblings, 0 replies; 13+ messages in thread
From: Minchan Kim @ 2011-05-22 23:31 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Andrew Morton, Ying Han, KAMEZAWA Hiroyuki, KOSAKI Motohiro,
	Daisuke Nishimura, Balbir Singh, linux-mm

On Wed, May 18, 2011 at 3:24 AM, Hugh Dickins <hughd@google.com> wrote:
> mem_cgroup_count_vm_event() should update the PGMAJFAULT count for the
> target mm, not for current mm (but of course they're usually the same).
>
> We don't know the target mm in shmem_getpage(), so do it at the outer
> level in shmem_fault(); and it's easier to follow if we move the
> count_vm_event(PGMAJFAULT) there too.
>
> Hah, it was using __count_vm_event() before, sneaking that update into
> the unpreemptible section under info->lock: well, it comes to the same
> on x86 at least, and I still think it's best to keep these together.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>

I am okay if memcg maintainer knew behavior change of shmem fault accounting.
What I want was let memcg maintainer know slight behavior change.

Thanks, Hugh.
-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2011-05-22 23:31 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-05-17 18:24 [PATCH mmotm] add the pagefault count into memcg stats: shmem fix Hugh Dickins
2011-05-17 20:00 ` Ying Han
2011-05-18  5:43 ` Daisuke Nishimura
2011-05-18 18:25   ` Hugh Dickins
2011-05-19  0:38     ` KAMEZAWA Hiroyuki
2011-05-19  1:54       ` Hugh Dickins
2011-05-19  2:16     ` Daisuke Nishimura
2011-05-18 18:32 ` Balbir Singh
2011-05-18 23:17 ` Minchan Kim
2011-05-19  0:28   ` Hugh Dickins
2011-05-19  0:37     ` Minchan Kim
2011-05-19  1:35       ` Hugh Dickins
2011-05-22 23:31 ` Minchan Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).