From: Minchan Kim <minchan@kernel.org>
To: yalin wang <yalin.wang2010@gmail.com>
Cc: Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
lkml <linux-kernel@vger.kernel.org>,
Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH 4/5] mm: simplify reclaim path for MADV_FREE
Date: Tue, 27 Oct 2015 16:09:03 +0900 [thread overview]
Message-ID: <20151027070903.GD26803@bbox> (raw)
In-Reply-To: <EDCE64A3-D874-4FE3-91B5-DE5E26A452F5@gmail.com>
Hello Yalin,
Sorry for missing you in Cc list.
IIRC, mails to send your previous mail address(Yalin.Wang@sonymobile.com)
were returned.
On Tue, Oct 27, 2015 at 11:44:09AM +0800, yalin wang wrote:
>
> > On Oct 27, 2015, at 10:09, Hugh Dickins <hughd@google.com> wrote:
> >
> > On Mon, 19 Oct 2015, Minchan Kim wrote:
> >
> >> I made reclaim path mess to check and free MADV_FREEed page.
> >> This patch simplify it with tweaking add_to_swap.
> >>
> >> So far, we mark page as PG_dirty when we add the page into
> >> swap cache(ie, add_to_swap) to page out to swap device but
> >> this patch moves PG_dirty marking under try_to_unmap_one
> >> when we decide to change pte from anon to swapent so if
> >> any process's pte has swapent for the page, the page must
> >> be swapped out. IOW, there should be no funcional behavior
> >> change. It makes relcaim path really simple for MADV_FREE
> >> because we just need to check PG_dirty of page to decide
> >> discarding the page or not.
> >>
> >> Other thing this patch does is to pass TTU_BATCH_FLUSH to
> >> try_to_unmap when we handle freeable page because I don't
> >> see any reason to prevent it.
> >>
> >> Cc: Hugh Dickins <hughd@google.com>
> >> Cc: Mel Gorman <mgorman@suse.de>
> >> Signed-off-by: Minchan Kim <minchan@kernel.org>
> >
> > Acked-by: Hugh Dickins <hughd@google.com>
> >
> > This is sooooooo much nicer than the code it replaces! Really good.
> > Kudos also to Hannes for suggesting this approach originally, I think.
> >
> > I hope this implementation satisfies a good proportion of the people
> > who have been wanting MADV_FREE: I'm not among them, and have long
> > lost touch with those discussions, so won't judge how usable it is.
> >
> > I assume you'll refactor the series again before it goes to Linus,
> > so the previous messier implementations vanish? I notice Andrew
> > has this "mm: simplify reclaim path for MADV_FREE" in mmotm as
> > mm-dont-split-thp-page-when-syscall-is-called-fix-6.patch:
> > I guess it all got much too messy to divide up in a hurry.
> >
> > I've noticed no problems in testing (unlike the first time you moved
> > to working with pte_dirty); though of course I've not been using
> > MADV_FREE itself at all.
> >
> > One aspect has worried me for a while, but I think I've reached the
> > conclusion that it doesn't matter at all. The swap that's allocated
> > in add_to_swap() would normally get freed again (after try_to_unmap
> > found it was a MADV_FREE !pte_dirty !PageDirty case) at the bottom
> > of shrink_page_list(), in __remove_mapping(), yes?
> >
> > The bit that worried me is that on rare occasions, something unknown
> > might take a speculative reference to the page, and __remove_mapping()
> > fail to freeze refs for that reason. Much too rare to worry over not
> > freeing that page immediately, but it leaves us with a PageUptodate
> > PageSwapCache !PageDirty page, yet its contents are not the contents
> > of that location on swap.
> >
> > But since this can only happen when you have *not* inserted the
> > corresponding swapent anywhere, I cannot think of anything that would
> > have a legitimate interest in its contents matching that location on swap.
> > So I don't think it's worth looking for somewhere to add a SetPageDirty
> > (or a delete_from_swap_cache) just to regularize that case.
> >
> >> ---
> >> include/linux/rmap.h | 6 +----
> >> mm/huge_memory.c | 5 ----
> >> mm/rmap.c | 42 ++++++----------------------------
> >> mm/swap_state.c | 5 ++--
> >> mm/vmscan.c | 64 ++++++++++++++++------------------------------------
> >> 5 files changed, 30 insertions(+), 92 deletions(-)
> >>
<snip>
You added comment bottom line so I'm not sure what PageDirty you meant.
> it is wrong here if you only check PageDirty() to decide if the page is freezable or not .
> The Anon page are shared by multiple process, _mapcount > 1 ,
> so you must check all pt_dirty bit during page_referenced() function,
> see this mail thread:
> http://ns1.ske-art.com/lists/kernel/msg1934021.html
If one of pte among process sharing the page was dirty, the dirtiness should
be propagated from pte to PG_dirty by try_to_unmap_one.
IOW, if the page doesn't have PG_dirty flag, it means all of process did
MADV_FREE.
Am I missing something from you question?
If so, could you show exact scenario I am missing?
Thanks for the interest.
> Thanks
>
>
>
>
>
>
>
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Minchan Kim <minchan@kernel.org>
To: yalin wang <yalin.wang2010@gmail.com>
Cc: Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
lkml <linux-kernel@vger.kernel.org>,
Rik van Riel <riel@redhat.com>, Mel Gorman <mgorman@suse.de>,
Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH 4/5] mm: simplify reclaim path for MADV_FREE
Date: Tue, 27 Oct 2015 16:09:03 +0900 [thread overview]
Message-ID: <20151027070903.GD26803@bbox> (raw)
In-Reply-To: <EDCE64A3-D874-4FE3-91B5-DE5E26A452F5@gmail.com>
Hello Yalin,
Sorry for missing you in Cc list.
IIRC, mails to send your previous mail address(Yalin.Wang@sonymobile.com)
were returned.
On Tue, Oct 27, 2015 at 11:44:09AM +0800, yalin wang wrote:
>
> > On Oct 27, 2015, at 10:09, Hugh Dickins <hughd@google.com> wrote:
> >
> > On Mon, 19 Oct 2015, Minchan Kim wrote:
> >
> >> I made reclaim path mess to check and free MADV_FREEed page.
> >> This patch simplify it with tweaking add_to_swap.
> >>
> >> So far, we mark page as PG_dirty when we add the page into
> >> swap cache(ie, add_to_swap) to page out to swap device but
> >> this patch moves PG_dirty marking under try_to_unmap_one
> >> when we decide to change pte from anon to swapent so if
> >> any process's pte has swapent for the page, the page must
> >> be swapped out. IOW, there should be no funcional behavior
> >> change. It makes relcaim path really simple for MADV_FREE
> >> because we just need to check PG_dirty of page to decide
> >> discarding the page or not.
> >>
> >> Other thing this patch does is to pass TTU_BATCH_FLUSH to
> >> try_to_unmap when we handle freeable page because I don't
> >> see any reason to prevent it.
> >>
> >> Cc: Hugh Dickins <hughd@google.com>
> >> Cc: Mel Gorman <mgorman@suse.de>
> >> Signed-off-by: Minchan Kim <minchan@kernel.org>
> >
> > Acked-by: Hugh Dickins <hughd@google.com>
> >
> > This is sooooooo much nicer than the code it replaces! Really good.
> > Kudos also to Hannes for suggesting this approach originally, I think.
> >
> > I hope this implementation satisfies a good proportion of the people
> > who have been wanting MADV_FREE: I'm not among them, and have long
> > lost touch with those discussions, so won't judge how usable it is.
> >
> > I assume you'll refactor the series again before it goes to Linus,
> > so the previous messier implementations vanish? I notice Andrew
> > has this "mm: simplify reclaim path for MADV_FREE" in mmotm as
> > mm-dont-split-thp-page-when-syscall-is-called-fix-6.patch:
> > I guess it all got much too messy to divide up in a hurry.
> >
> > I've noticed no problems in testing (unlike the first time you moved
> > to working with pte_dirty); though of course I've not been using
> > MADV_FREE itself at all.
> >
> > One aspect has worried me for a while, but I think I've reached the
> > conclusion that it doesn't matter at all. The swap that's allocated
> > in add_to_swap() would normally get freed again (after try_to_unmap
> > found it was a MADV_FREE !pte_dirty !PageDirty case) at the bottom
> > of shrink_page_list(), in __remove_mapping(), yes?
> >
> > The bit that worried me is that on rare occasions, something unknown
> > might take a speculative reference to the page, and __remove_mapping()
> > fail to freeze refs for that reason. Much too rare to worry over not
> > freeing that page immediately, but it leaves us with a PageUptodate
> > PageSwapCache !PageDirty page, yet its contents are not the contents
> > of that location on swap.
> >
> > But since this can only happen when you have *not* inserted the
> > corresponding swapent anywhere, I cannot think of anything that would
> > have a legitimate interest in its contents matching that location on swap.
> > So I don't think it's worth looking for somewhere to add a SetPageDirty
> > (or a delete_from_swap_cache) just to regularize that case.
> >
> >> ---
> >> include/linux/rmap.h | 6 +----
> >> mm/huge_memory.c | 5 ----
> >> mm/rmap.c | 42 ++++++----------------------------
> >> mm/swap_state.c | 5 ++--
> >> mm/vmscan.c | 64 ++++++++++++++++------------------------------------
> >> 5 files changed, 30 insertions(+), 92 deletions(-)
> >>
<snip>
You added comment bottom line so I'm not sure what PageDirty you meant.
> it is wrong here if you only check PageDirty() to decide if the page is freezable or not .
> The Anon page are shared by multiple process, _mapcount > 1 ,
> so you must check all pt_dirty bit during page_referenced() function,
> see this mail thread:
> http://ns1.ske-art.com/lists/kernel/msg1934021.html
If one of pte among process sharing the page was dirty, the dirtiness should
be propagated from pte to PG_dirty by try_to_unmap_one.
IOW, if the page doesn't have PG_dirty flag, it means all of process did
MADV_FREE.
Am I missing something from you question?
If so, could you show exact scenario I am missing?
Thanks for the interest.
> Thanks
>
>
>
>
>
>
>
>
>
next prev parent reply other threads:[~2015-10-27 7:08 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-19 6:31 [PATCH 0/5] MADV_FREE refactoring and fix KSM page Minchan Kim
2015-10-19 6:31 ` Minchan Kim
2015-10-19 6:31 ` [PATCH 1/5] mm: MADV_FREE trivial clean up Minchan Kim
2015-10-19 6:31 ` Minchan Kim
2015-10-19 6:31 ` [PATCH 2/5] mm: skip huge zero page in MADV_FREE Minchan Kim
2015-10-19 6:31 ` Minchan Kim
2015-10-19 6:31 ` [PATCH 3/5] mm: clear PG_dirty to mark page freeable Minchan Kim
2015-10-19 6:31 ` Minchan Kim
2015-10-27 1:28 ` Hugh Dickins
2015-10-27 1:28 ` Hugh Dickins
2015-10-27 6:50 ` Minchan Kim
2015-10-27 6:50 ` Minchan Kim
2015-10-19 6:31 ` [PATCH 4/5] mm: simplify reclaim path for MADV_FREE Minchan Kim
2015-10-19 6:31 ` Minchan Kim
2015-10-27 2:09 ` Hugh Dickins
2015-10-27 2:09 ` Hugh Dickins
2015-10-27 3:44 ` yalin wang
2015-10-27 3:44 ` yalin wang
2015-10-27 7:09 ` Minchan Kim [this message]
2015-10-27 7:09 ` Minchan Kim
2015-10-27 7:39 ` yalin wang
2015-10-27 7:39 ` yalin wang
2015-10-27 8:10 ` Minchan Kim
2015-10-27 8:10 ` Minchan Kim
2015-10-27 8:52 ` yalin wang
2015-10-27 8:52 ` yalin wang
2015-10-28 4:03 ` yalin wang
2015-10-28 4:03 ` yalin wang
2015-10-27 6:54 ` Minchan Kim
2015-10-27 6:54 ` Minchan Kim
2015-10-19 6:31 ` [PATCH 5/5] mm: mark stable page dirty in KSM Minchan Kim
2015-10-19 6:31 ` Minchan Kim
2015-10-27 2:23 ` Hugh Dickins
2015-10-27 2:23 ` Hugh Dickins
2015-10-27 6:58 ` Minchan Kim
2015-10-27 6:58 ` Minchan Kim
2015-10-19 10:01 ` [PATCH 0/5] MADV_FREE refactoring and fix KSM page Minchan Kim
2015-10-19 10:01 ` Minchan Kim
2015-10-20 1:38 ` Minchan Kim
2015-10-20 1:38 ` Minchan Kim
2015-10-20 7:21 ` Minchan Kim
2015-10-20 7:21 ` Minchan Kim
2015-10-20 7:27 ` Minchan Kim
2015-10-20 7:27 ` Minchan Kim
2015-10-20 21:36 ` Andrew Morton
2015-10-20 21:36 ` Andrew Morton
2015-10-20 22:43 ` Kirill A. Shutemov
2015-10-20 22:43 ` Kirill A. Shutemov
2015-10-21 5:11 ` Minchan Kim
2015-10-21 5:11 ` Minchan Kim
2015-10-21 7:50 ` Kirill A. Shutemov
2015-10-21 7:50 ` Kirill A. Shutemov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151027070903.GD26803@bbox \
--to=minchan@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.cz \
--cc=riel@redhat.com \
--cc=vbabka@suse.cz \
--cc=yalin.wang2010@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.