From: Minchan Kim <minchan@kernel.org>
To: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Linux API <linux-api@vger.kernel.org>,
	Hugh Dickins <hughd@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Rik van Riel <riel@redhat.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Mel Gorman <mgorman@suse.de>, Jason Evans <je@fb.com>,
	Zhang Yanfei <zhangyanfei@cn.fujitsu.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	linux390@de.ibm.com, Gerald Schaefer <gerald.schaefer@de.ibm.com>
Subject: Re: [PATCH v9] mm: support madvise(MADV_FREE)
Date: Thu, 3 Jul 2014 17:37:29 +0900
Message-ID: <20140703083729.GE2939@bbox>
In-Reply-To: <20140703102901.322bfdb0@mschwide>

Hello,

On Thu, Jul 03, 2014 at 10:29:01AM +0200, Martin Schwidefsky wrote:
> On Thu, 3 Jul 2014 16:29:54 +0900
> Minchan Kim <minchan@kernel.org> wrote:
> 
> > Hello,
> > 
> > On Thu, Jul 03, 2014 at 10:03:19AM +0900, Minchan Kim wrote:
> > > Hello,
> > > 
> > > On Tue, Jul 01, 2014 at 05:50:58PM +0300, Kirill A. Shutemov wrote:
> > > > On Tue, Jul 01, 2014 at 09:36:15AM +0900, Minchan Kim wrote:
> > > > > +	do {
> > > > > +		/*
> > > > > +		 * XXX: We can optimize with supporting Hugepage free
> > > > > +		 * if the range covers.
> > > > > +		 */
> > > > > +		next = pmd_addr_end(addr, end);
> > > > > +		if (pmd_trans_huge(*pmd))
> > > > > +			split_huge_page_pmd(vma, addr, pmd);
> > > > 
> > > > Could you implement proper THP support before upstreaming the feature?
> > > > It shouldn't be a big deal.
> > > 
> > > Okay. I hope you will review it.
> > > 
> > > Thanks for the feedback!
> > > 
> > 
> > I tried to implement it but hit an issue.
> > 
> > I need pmd_mkold and pmd_mkclean for the MADV_FREE operation, and
> > pmd_dirty for page_referenced. I checked all the arches that support
> > THP and it is not a big deal for most of them, but I am not sure about
> > s390: I don't understand how s390 tracks this state in software via
> > storage keys instead of page table information.
> > Cc'ed the s390 maintainers. I hope they can help.
> 
> Storage key for dirty and referenced tracking is a thing of the past.
> The current code for s390 uses software tracking for dirty and referenced.
> There is one catch though, for ptes the software implementation covers
> dirty and referenced bit but for pmds only referenced bit is available.
> The reason is that there is no free bit left in the pmd entry for the
> software dirty bit.

Thanks for the quick reply.

>  
> > So, if there isn't any help from s390, I would have to introduce
> > HAVE_ARCH_THP_MADVFREE to disable MADV_FREE support for THP on s390,
> > but I don't want to introduce such a new config option.
> 
> Why is the dirty bit for pmds needed for the MADV_FREE implementation?

The MADV_FREE semantics require it.

When the madvise syscall is called, the VM clears the dirty bit in the
ptes of the range. If memory pressure happens, the VM checks the dirty
bit in the page table; if the entry is still clean, the page is a
"lazyfree" page, so the VM can discard it instead of swapping it out.
If there was a store to the page before the VM picks it for reclaim,
the dirty bit is set again, so the VM swaps the page out instead of
discarding it, in order to keep the up-to-date contents.
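
To make the decision above concrete, here is a small user-space toy
model of it. This is only a sketch and not the code from the patch:
struct toy_pte, toy_madv_free(), toy_store() and toy_reclaim_one() are
hypothetical names made up for illustration, and a single bool stands
in for the dirty bit of a real page table entry.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a page table entry; not a kernel type. */
struct toy_pte {
	bool dirty;
};

enum toy_reclaim { TOY_DISCARD, TOY_SWAP_OUT };

/* madvise(MADV_FREE) time: clear the dirty bit of every pte in the range. */
void toy_madv_free(struct toy_pte *ptes, int n)
{
	for (int i = 0; i < n; i++)
		ptes[i].dirty = false;
}

/* A store to the page after the madvise call sets the dirty bit again. */
void toy_store(struct toy_pte *pte)
{
	pte->dirty = true;
}

/*
 * Reclaim time: a pte that is still clean marks a lazyfree page that can
 * be discarded; a dirty pte means the contents must be kept, so swap out.
 */
enum toy_reclaim toy_reclaim_one(const struct toy_pte *pte)
{
	return pte->dirty ? TOY_SWAP_OUT : TOY_DISCARD;
}
```

In this model, a page that is not touched after toy_madv_free() is
discarded at reclaim time, while a page that sees toy_store() in
between is swapped out.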

If that is hard on s390, maybe we could use just the referenced bit
instead of the dirty bit to check for recent access, but that might
make the semantics differ a bit from other OSes. :(

> 
> -- 
> blue skies,
>    Martin.
> 
> "Reality continues to ruin my life." - Calvin.
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim


Thread overview: 9+ messages
2014-07-01  0:36 [PATCH v9] mm: support madvise(MADV_FREE) Minchan Kim
2014-07-01 14:16 ` Rik van Riel
2014-07-01 14:50 ` Kirill A. Shutemov
2014-07-03  1:03   ` Minchan Kim
2014-07-03  7:29     ` Minchan Kim
2014-07-03  8:29       ` Martin Schwidefsky
2014-07-03  8:37         ` Minchan Kim [this message]
2014-07-03 16:01           ` Martin Schwidefsky
2014-07-04  6:41             ` Minchan Kim
