From: Minchan Kim <minchan@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@suse.cz>,
Johannes Weiner <hannes@cmpxchg.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Minchan Kim <minchan@kernel.org>
Subject: [RFC 1/6] mm: keep dirty bit on KSM page
Date: Wed, 3 Jun 2015 15:15:40 +0900
Message-ID: <1433312145-19386-2-git-send-email-minchan@kernel.org>
In-Reply-To: <1433312145-19386-1-git-send-email-minchan@kernel.org>
I encountered a segfault in a test program while testing MADV_FREE
with KSM. Investigation showed the following sequence:

1. A KSM page is mapped in the page tables of processes A and B with
   !pte_dirty (KSM marks the page PG_dirty if pte_dirty was set).
2. MADV_FREE in process A can remove the page from the swap cache,
   if it was there, and then clear *PG_dirty* to indicate that the
   page can be discarded instead of being swapped out.
3. So the KSM page's state is !pte_dirty in both A and B &&
   !PageDirty.
4. The VM judges it to be a freeable page and discards it.
5. Process B segfaults even though B never called MADV_FREE.
Clearing PG_dirty after an anonymous page is removed from the swap cache
was not an integrity problem for a private page (i.e., a normal anon page,
not KSM). The worst case was an unnecessary writeout, which we could have
avoided if the same data was already on swap.

However, with the introduction of MADV_FREE it can cause the problem
above, so this patch fixes it by keeping the dirty bit in the page table
entry when the page is replaced with a KSM page.
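
To make the sequence above concrete, here is a minimal user-space sketch of
the state transitions. It is a toy model, not kernel code: the toy_* types
and functions are hypothetical names invented for illustration, and the
discard rule (no dirty PTE and no PG_dirty) is a simplification of what
reclaim checks. It assumes both A and B had written to their pages before
KSM merged them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical): only the two dirty bits are modelled. */
struct toy_page {
	bool pg_dirty;			/* models PG_dirty on struct page */
};

struct toy_pte {
	struct toy_page *page;		/* page this PTE maps */
	bool dirty;			/* models pte_dirty() */
};

/*
 * Models KSM pointing a PTE at the shared page.
 * keep_pte_dirty == false: old behaviour, the dirty bit is moved from
 * the PTE to PG_dirty and the PTE is left clean.
 * keep_pte_dirty == true: behaviour with this patch, the PTE keeps
 * its dirty bit.
 */
static void toy_ksm_merge(struct toy_pte *pte, struct toy_page *kpage,
			  bool keep_pte_dirty)
{
	bool was_dirty = pte->dirty;

	pte->page = kpage;
	if (keep_pte_dirty) {
		pte->dirty = was_dirty;
	} else {
		if (was_dirty)
			kpage->pg_dirty = true;
		pte->dirty = false;
	}
}

/* Models step 2: the caller's PTE is cleaned and PG_dirty is cleared. */
static void toy_madv_free(struct toy_pte *pte)
{
	pte->dirty = false;
	pte->page->pg_dirty = false;
}

/* Models step 4: discardable only if nothing marks the page dirty. */
static bool toy_can_discard(const struct toy_page *page,
			    const struct toy_pte *ptes, size_t n)
{
	size_t i;

	if (page->pg_dirty)
		return false;
	for (i = 0; i < n; i++)
		if (ptes[i].page == page && ptes[i].dirty)
			return false;
	return true;
}

/* Runs steps 1-4; returns true if B's data would be thrown away. */
static bool toy_b_loses_page(bool keep_pte_dirty)
{
	struct toy_page orig_a = { false }, orig_b = { false };
	struct toy_page kpage = { false };
	struct toy_pte ptes[2] = { { &orig_a, true }, { &orig_b, true } };

	toy_ksm_merge(&ptes[0], &kpage, keep_pte_dirty);	/* process A */
	toy_ksm_merge(&ptes[1], &kpage, keep_pte_dirty);	/* process B */
	toy_madv_free(&ptes[0]);			/* only A calls it */
	return toy_can_discard(&kpage, ptes, 2);
}
```

In this model, toy_b_loses_page(false) reports that the shared page becomes
discardable after A's MADV_FREE, while toy_b_loses_page(true) does not,
because B's PTE still carries its dirty bit.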
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/ksm.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/mm/ksm.c b/mm/ksm.c
index bc7be0ee2080..9c07346e57f2 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -901,9 +901,8 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 			set_pte_at(mm, addr, ptep, entry);
 			goto out_unlock;
 		}
-		if (pte_dirty(entry))
-			set_page_dirty(page);
-		entry = pte_mkclean(pte_wrprotect(entry));
+
+		entry = pte_wrprotect(entry);
 		set_pte_at_notify(mm, addr, ptep, entry);
 	}
 	*orig_pte = *ptep;
@@ -932,11 +931,13 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t *pmd;
 	pte_t *ptep;
+	pte_t entry;
 	spinlock_t *ptl;
 	unsigned long addr;
 	int err = -EFAULT;
 	unsigned long mmun_start;	/* For mmu_notifiers */
 	unsigned long mmun_end;		/* For mmu_notifiers */
+	bool dirty;
 
 	addr = page_address_in_vma(page, vma);
 	if (addr == -EFAULT)
@@ -956,12 +957,22 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 		goto out_mn;
 	}
 
+	dirty = pte_dirty(*ptep);
 	get_page(kpage);
 	page_add_anon_rmap(kpage, vma, addr);
 
 	flush_cache_page(vma, addr, pte_pfn(*ptep));
 	ptep_clear_flush_notify(vma, addr, ptep);
-	set_pte_at_notify(mm, addr, ptep, mk_pte(kpage, vma->vm_page_prot));
+
+	entry = mk_pte(kpage, vma->vm_page_prot);
+	/*
+	 * Keep a dirty bit to prevent a KSM page sudden freeing
+	 * by MADV_FREE.
+	 */
+	if (dirty)
+		entry = pte_mkdirty(entry);
+
+	set_pte_at_notify(mm, addr, ptep, entry);
 
 	page_remove_rmap(page);
 	if (!page_mapped(page))
--
1.9.1