From: Jerome Glisse <jglisse@redhat.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: j.glisse@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: fix special swap entry handling on copy mm
Date: Mon, 12 Aug 2013 17:49:37 -0400
Message-ID: <20130812214936.GA6373@redhat.com>
In-Reply-To: <1376343400-jbl12uc3-mutt-n-horiguchi@ah.jp.nec.com>

On Mon, Aug 12, 2013 at 05:36:40PM -0400, Naoya Horiguchi wrote:
> Hi Jerome,
> 
> On Mon, Aug 12, 2013 at 11:43:24AM -0400, j.glisse@gmail.com wrote:
> > From: Jerome Glisse <jglisse@redhat.com>
> > 
> > Prior to this patch, copy_one_pte will never reach the special swap entry
> > handling code because swap_duplicate will return an error value.
> > 
> > Note that this is not fatal, so nothing bad ever happens because of it.
> > The reason is that copy_pte_range would break out of its loop and call
> > add_swap_count_continuation, which would see that it is a special swap
> > entry and return 0, triggering copy_pte_range to try again. Because
> > we try again, there is a huge chance that the temporary special
> > migration pte is now valid again and pointing to a new valid page.
> > 
> > This patch just splits the handling of special swap entries from regular
> > ones inside copy_one_pte.
> > 
> > (Note: I spotted this while reading the code; I haven't tested my theory.)
> > 
> > Signed-off-by: Jerome Glisse <jglisse@redhat.com>
> 
> non_swap_entry() means not only migration entry, but also hwpoison entry,
> so it seems to me that simply moving the swap_duplicate() into the
> if(!non_swap_entry) block can change the behavior for hwpoison entry.
> Would it be nice to add check for such a case?
> 
> Thanks,
> Naoya Horiguchi

Well, if my reading of the code is right, then for a hwpoison entry the current
code will loop indefinitely inside the kernel on fork if one entry is set to
hwpoison.

My patch does not handle hwpoison specially because that seems unnecessary:
there is nothing to do for a hwpoison pte besides setting the new pte to
hwpoison too. So the forked child will also have a hwpoison pte. My patch does
just that.

So the change in behavior is: the current kernel loops indefinitely on fork when
there is a hwpoison pte, whereas with my patch the child gets a hwpoison pte.
This means that both child and parent can live as long as they don't access the
hwpoisoned ptes.
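
To make the difference concrete, here is a rough user-space sketch of the two
control flows as I read them. It is a toy model, not kernel code: every toy_*
name is made up for illustration, toy_swap_duplicate() failing for special
entries is my assumption about swap_duplicate(), and the retry cap only exists
so that the "loops forever" case terminates. It also ignores that a migration
entry can become a normal pte between retries.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kinds of swap pte discussed here. */
enum toy_kind { TOY_SWAP, TOY_MIGRATION, TOY_HWPOISON };

/* Models non_swap_entry(): true for migration and hwpoison entries. */
static bool toy_non_swap_entry(enum toy_kind k)
{
	return k != TOY_SWAP;
}

/* Assumption: duplicating a special entry fails, a regular one succeeds. */
static int toy_swap_duplicate(enum toy_kind k)
{
	return toy_non_swap_entry(k) ? -1 : 0;
}

/* Old flow: swap_duplicate() is tried first, for every kind of entry.
 * A nonzero return asks the caller to retry. */
static int toy_copy_old(enum toy_kind k)
{
	if (toy_swap_duplicate(k) < 0)
		return 1;
	return 0;
}

/* Patched flow: only regular swap entries go through swap_duplicate();
 * special entries are simply copied to the child. */
static int toy_copy_new(enum toy_kind k)
{
	if (!toy_non_swap_entry(k)) {
		if (toy_swap_duplicate(k) < 0)
			return 1;
	}
	return 0;
}

/* Models the retry loop in the caller. Returns the number of attempts
 * needed, or -1 if the (artificial) cap is reached without success. */
static int toy_fork_attempts(int (*copy)(enum toy_kind),
			     enum toy_kind k, int cap)
{
	for (int i = 1; i <= cap; i++)
		if (copy(k) == 0)
			return i;
	return -1;
}
```

In this model toy_fork_attempts(toy_copy_old, TOY_HWPOISON, cap) hits the cap
for any cap, while the same call with toy_copy_new succeeds on the first
attempt. Regular swap entries behave the same in both flows.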

Cheers,
Jerome

> 
> > ---
> >  mm/memory.c | 26 +++++++++++++-------------
> >  1 file changed, 13 insertions(+), 13 deletions(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 1ce2e2a..9f907dd 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -833,20 +833,20 @@ copy_one_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
> >  		if (!pte_file(pte)) {
> >  			swp_entry_t entry = pte_to_swp_entry(pte);
> >  
> > -			if (swap_duplicate(entry) < 0)
> > -				return entry.val;
> > -
> > -			/* make sure dst_mm is on swapoff's mmlist. */
> > -			if (unlikely(list_empty(&dst_mm->mmlist))) {
> > -				spin_lock(&mmlist_lock);
> > -				if (list_empty(&dst_mm->mmlist))
> > -					list_add(&dst_mm->mmlist,
> > -						 &src_mm->mmlist);
> > -				spin_unlock(&mmlist_lock);
> > -			}
> > -			if (likely(!non_swap_entry(entry)))
> > +			if (likely(!non_swap_entry(entry))) {
> > +				if (swap_duplicate(entry) < 0)
> > +					return entry.val;
> > +
> > +				/* make sure dst_mm is on swapoff's mmlist. */
> > +				if (unlikely(list_empty(&dst_mm->mmlist))) {
> > +					spin_lock(&mmlist_lock);
> > +					if (list_empty(&dst_mm->mmlist))
> > +						list_add(&dst_mm->mmlist,
> > +							 &src_mm->mmlist);
> > +					spin_unlock(&mmlist_lock);
> > +				}
> >  				rss[MM_SWAPENTS]++;
> > -			else if (is_migration_entry(entry)) {
> > +			} else if (is_migration_entry(entry)) {
> >  				page = migration_entry_to_page(entry);
> >  
> >  				if (PageAnon(page))
> > -- 
> > 1.8.3.1
> > 
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majordomo@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> >


Thread overview: 5+ messages
2013-08-12 15:43 [PATCH] mm: fix special swap entry handling on copy mm j.glisse
2013-08-12 21:36 ` Naoya Horiguchi
2013-08-12 21:49   ` Jerome Glisse [this message]
2013-08-13  1:46     ` Naoya Horiguchi
2013-08-13  2:23       ` Jerome Glisse
