All of lore.kernel.org
 help / color / mirror / Atom feed
From: CGEL <cgel.zte@gmail.com>
To: David Hildenbrand <david@redhat.com>
Cc: akpm@linux-foundation.org, yang.yang29@zte.com.cn,
	dave.hansen@linux.intel.com, ran.xiaokai@zte.com.cn,
	yang.shi@linux.alibaba.com, saravanand@fb.com,
	minchan@kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, xu xin <xu.xin16@zte.com.cn>
Subject: Re: [PATCH v2] mm/vmstat: add events for ksm cow
Date: Thu, 24 Mar 2022 01:36:06 +0000	[thread overview]
Message-ID: <623bcb0a.1c69fb81.ae484.a006@mx.google.com> (raw)
In-Reply-To: <614b33de-cdf0-73d2-08e3-196363d816d2@redhat.com>

On Wed, Mar 23, 2022 at 07:43:04PM +0100, David Hildenbrand wrote:
> On 23.03.22 08:57, cgel.zte@gmail.com wrote:
> > From: Yang Yang <yang.yang29@zte.com.cn>
> > 
> > Users may use ksm by calling madvise(, , MADV_MERGEABLE) when they want
> > to save memory, it's a tradeoff by suffering delay on ksm cow. Users can
> > get to know how much memory ksm saved by reading
> > /sys/kernel/mm/ksm/pages_sharing, but they don't know what's the costs
> > of ksm cow, and this is important of some delay sensitive tasks.
> > 
> > So add ksm cow events to help users evaluate whether or how to use ksm.
> > 
> > Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
> > Reviewed-by: xu xin <xu.xin16@zte.com.cn>
> > Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> > ---
> > v2:
> > - fix compile error when CONFIG_KSM is not set
> > ---
> >  include/linux/vm_event_item.h |  2 ++
> >  mm/memory.c                   | 20 +++++++++++++++++---
> >  mm/vmstat.c                   |  2 ++
> >  3 files changed, 21 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
> > index 16a0a4fd000b..6f32be04212f 100644
> > --- a/include/linux/vm_event_item.h
> > +++ b/include/linux/vm_event_item.h
> > @@ -131,6 +131,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
> >  		SWAP_RA_HIT,
> >  #ifdef CONFIG_KSM
> >  		KSM_SWPIN_COPY,
> > +		KSM_COW_SUCCESS,
> > +		KSM_COW_FAIL,
> >  #endif
> >  #endif
> >  #ifdef CONFIG_X86
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 4111f97c91a0..c24d5f04fab5 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3257,6 +3257,8 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
> >  	__releases(vmf->ptl)
> >  {
> >  	struct vm_area_struct *vma = vmf->vma;
> > +	vm_fault_t ret = 0;
> > +	bool ksm = 0;
> >  
> >  	if (userfaultfd_pte_wp(vma, *vmf->pte)) {
> >  		pte_unmap_unlock(vmf->pte, vmf->ptl);
> > @@ -3294,6 +3296,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
> >  	 */
> >  	if (PageAnon(vmf->page)) {
> >  		struct page *page = vmf->page;
> > +		ksm = PageKsm(page);
> >  
> >  		/*
> >  		 * We have to verify under page lock: these early checks are
> > @@ -3302,7 +3305,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
> >  		 *
> >  		 * PageKsm() doesn't necessarily raise the page refcount.
> >  		 */
> > -		if (PageKsm(page) || page_count(page) > 3)
> > +		if (ksm || page_count(page) > 3)
> >  			goto copy;
> >  		if (!PageLRU(page))
> >  			/*
> > @@ -3316,7 +3319,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
> >  			goto copy;
> >  		if (PageSwapCache(page))
> >  			try_to_free_swap(page);
> > -		if (PageKsm(page) || page_count(page) != 1) {
> > +		if (ksm || page_count(page) != 1) {
> 
> I think we really want to recheck here, after locking the page.
> Consequently, just do a PageKsm() check below.
> 
> >  			unlock_page(page);
> >  			goto copy;
> >  		}
> > @@ -3339,7 +3342,18 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
> >  	get_page(vmf->page);
> >  
> >  	pte_unmap_unlock(vmf->pte, vmf->ptl);
> > -	return wp_page_copy(vmf);
> > +	ret = wp_page_copy(vmf);
> > +
> > +#ifdef CONFIG_KSM
> > +	if (ksm) {
> > +		if (unlikely(ret & VM_FAULT_ERROR))
> > +			count_vm_event(KSM_COW_FAIL);
> > +		else
> > +			count_vm_event(KSM_COW_SUCCESS);
> > +	}
> > +#endif
> 
> Do we really care if we failed or not? I mean, the failure case will
> usually make your app crash either way ... due to OOM.
>
I think we need failed count. Because ksm cow oom is a little different
from general OOM. User may wonder I am not allocing new memory, why it
cause OOM? And OOM may have big impact for some kind of tasks, so we
better let user know that, do we?
>
> Doing
> 
> #ifdef CONFIG_KSM
> if (PageKsm(page))
> 	count_vm_event(COW_KSM);
> #endif
> 
> before the wp_page_copy(vmf) should be good enough, no?
> 
> Should be good enough IMHO.
> 
> -- 
> Thanks,
> 
> David / dhildenb


  reply	other threads:[~2022-03-24  1:36 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-23  7:57 [PATCH v2] mm/vmstat: add events for ksm cow cgel.zte
2022-03-23 18:43 ` David Hildenbrand
2022-03-24  1:36   ` CGEL [this message]
2022-03-24  8:54     ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=623bcb0a.1c69fb81.ae484.a006@mx.google.com \
    --to=cgel.zte@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=ran.xiaokai@zte.com.cn \
    --cc=saravanand@fb.com \
    --cc=xu.xin16@zte.com.cn \
    --cc=yang.shi@linux.alibaba.com \
    --cc=yang.yang29@zte.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.