From: Wanpeng Li <liwanp@linux.vnet.ibm.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andi Kleen <andi@firstfloor.org>,
	Fengguang Wu <fengguang.wu@intel.com>,
	Tony Luck <tony.luck@intel.com>,
	gong.chen@linux.intel.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/6] mm/hwpoison: fix num_poisoned_pages error statistics for thp
Date: Fri, 23 Aug 2013 12:24:44 +0800
Message-ID: <20130823042444.GA23672@hacker.(null)>
In-Reply-To: <1377228430-o4j77sme-mutt-n-horiguchi@ah.jp.nec.com>

Hi Naoya,
On Thu, Aug 22, 2013 at 11:27:10PM -0400, Naoya Horiguchi wrote:
>Hi Wanpeng,
>
>On Fri, Aug 23, 2013 at 07:52:40AM +0800, Wanpeng Li wrote:
>> Hi Naoya,
>> On Thu, Aug 22, 2013 at 12:43:08PM -0400, Naoya Horiguchi wrote:
>> >On Thu, Aug 22, 2013 at 05:48:24PM +0800, Wanpeng Li wrote:
>> >> There is a race between hwpoisoning and unpoisoning a page. memory_failure()
>> >> sets the page hwpoison and increases num_poisoned_pages without holding the
>> >> page lock, and it accounts one page against a thp in num_poisoned_pages.
>> >> However, unpoison can occur before memory_failure() takes the page lock and
>> >> splits the transparent hugepage. In that case unpoison decreases
>> >> num_poisoned_pages by 1 << compound_order, since memory_failure() has not
>> >> yet split the transparent hugepage with the page lock held. That means we
>> >> account one page for hwpoison and 1 << compound_order for unpoison. This
>> >> patch fixes it by decrementing num_poisoned_pages by one in the
>> >> non-hugetlbfs case.
>> >> 
>> >> Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
>> >
>> >I think that a thp never becomes hwpoisoned without splitting, so "trying
>> >to unpoison thp" never happens (I think that this implicit fact should be
>> 
>> There is a race window here for hwpoison thp: 
>
>OK, thanks for the great explanation (it's worth writing in the description).
>And I found my previous comment was completely pointless, sorry :(
>

Ah, ok, I will fold them in the patch description. ;-)

>>         A                                               B
>> memory_failure
>>   TestSetPageHWPoison(p);
>>   if (PageHuge(p))
>>           nr_pages = 1 << compound_order(hpage);
>>   else
>>           nr_pages = 1;
>>   atomic_long_add(nr_pages, &num_poisoned_pages);
>>                                                 unpoison_memory
>>                                                   nr_pages = 1 << compound_trans_order(page);
>>
>>                                                   if (TestClearPageHWPoison(p))
>>                                                           atomic_long_sub(nr_pages, &num_poisoned_pages);
>>   lock page
>>   if (!PageHWPoison(p))
>>           unlock page and return
>>   hwpoison_user_mappings
>>   if (PageTransHuge(hpage))
>>           split_huge_page(hpage);
>
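
To make the imbalance in the interleaving above concrete, here is a minimal
userspace sketch. It is not kernel code: struct model_page,
model_memory_failure_account(), model_unpoison_account() and replay_race()
are hypothetical names, C11 atomic_long stands in for the kernel counter,
and the order is supplied by the caller (9 is only an assumed example,
matching a 2MB thp with 4KB base pages). It replays the A-then-B ordering
sequentially instead of using real threads.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the few page fields the race touches. */
struct model_page {
	bool hwpoison;       /* models PG_hwpoison */
	bool hugetlb;        /* models PageHuge() */
	unsigned int order;  /* models the compound order of the unsplit page */
};

/* Thread A, first half: flag the page and account before taking the page lock. */
static void model_memory_failure_account(struct model_page *p, atomic_long *counter)
{
	long nr_pages;

	if (p->hwpoison)
		return;                         /* already poisoned */
	p->hwpoison = true;                     /* models TestSetPageHWPoison() */
	nr_pages = p->hugetlb ? 1L << p->order : 1L;
	atomic_fetch_add(counter, nr_pages);
}

/* Thread B, unpatched: subtract the full compound size of the unsplit thp. */
static void model_unpoison_account(struct model_page *p, atomic_long *counter)
{
	long nr_pages = 1L << p->order;         /* models compound_trans_order() */

	if (p->hwpoison) {                      /* models TestClearPageHWPoison() */
		p->hwpoison = false;
		atomic_fetch_sub(counter, nr_pages);
	}
}

/* Replay the interleaving: A accounts, then B unpoisons before A locks the page. */
long replay_race(unsigned int order)
{
	atomic_long counter;
	struct model_page thp = { .hwpoison = false, .hugetlb = false, .order = order };

	atomic_init(&counter, 0);
	model_memory_failure_account(&thp, &counter);
	model_unpoison_account(&thp, &counter);
	return atomic_load(&counter);
}
```

With the assumed order of 9, replay_race(9) returns 1 - 512 = -511, while an
order-0 page balances out to 0.
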
>When this race happens, our expectation is that num_poisoned_pages is
>increased by 1, because thread A eventually succeeds in hwpoisoning one
>normal page. So thread B should fail to unpoison, without clearing
>PageHWPoison or decreasing num_poisoned_pages.  My suggestion is to insert
>a PageTransHuge check before doing TestClearPageHWPoison, as follows:
>
>diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>index 1cb3b7d..f551b72 100644
>--- a/mm/memory-failure.c
>+++ b/mm/memory-failure.c
>@@ -1336,6 +1336,16 @@ int unpoison_memory(unsigned long pfn)
> 		return 0;
> 	}
>
>+	/*
>+	 * unpoison_memory() can encounter thp only when the thp is being
>+	 * worked by memory_failure() and the page lock is not held yet.
>+	 * In such case, we yield to memory_failure() and make unpoison fail.
>+	 */
>+	if (PageTransHuge(page)) {
>+		pr_info("MCE: Memory failure is now running on %#lx\n", pfn);
>+		return 0;
>+	}
>+

Looks reasonable to me, I will fold it in my patch. ;-)

> 	nr_pages = 1 << compound_trans_order(page);
>
> 	if (!get_page_unless_zero(page)) {
>
>
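
The effect of the check in the diff above can be sketched in userspace. This
is only a model under stated assumptions: struct thp_model,
unpoison_with_thp_check() and replay_with_check() are hypothetical names, a
plain trans_huge flag stands in for PageTransHuge(), and C11 atomics stand in
for the kernel counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct thp_model {
	bool hwpoison;    /* models PG_hwpoison */
	bool trans_huge;  /* models PageTransHuge(): thp not yet split */
};

/*
 * Models unpoison with the proposed early return: a page that is still a
 * thp is left alone. Returns true only if the flag was cleared (a reporting
 * convenience of this sketch, not the kernel's return value).
 */
static bool unpoison_with_thp_check(struct thp_model *p, atomic_long *counter)
{
	if (!p->hwpoison)
		return false;                   /* nothing to unpoison */
	if (p->trans_huge)
		return false;                   /* yield to memory_failure() */
	p->hwpoison = false;                    /* models TestClearPageHWPoison() */
	atomic_fetch_sub(counter, 1);           /* models atomic_long_dec() */
	return true;
}

/* Replay: A poisons and accounts one page, B tries to unpoison, A then splits. */
long replay_with_check(void)
{
	atomic_long counter;
	struct thp_model thp = { .hwpoison = false, .trans_huge = true };

	atomic_init(&counter, 0);
	thp.hwpoison = true;                      /* thread A: TestSetPageHWPoison() */
	atomic_fetch_add(&counter, 1);            /* thread A: one page accounted */
	unpoison_with_thp_check(&thp, &counter);  /* thread B: backs off */
	thp.trans_huge = false;                   /* thread A: split_huge_page() */
	return atomic_load(&counter);
}
```

In this model replay_with_check() returns 1, which matches the expectation
stated above: one normal page ends up poisoned and accounted.
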
>I think that replacing atomic_long_sub() with atomic_long_dec() is still
>meaningful, so you don't have to drop that.
>

Agreed.

>> 
>> We increase the count by one page but decrease it by 1 << compound_trans_order.
>> The compound_trans_order you mentioned is used here for thp, which is why
>> I don't drop it in patch 2/6.
>
>I don't think that we have to use compound_trans_order() any more, because
>with the above change we no longer calculate nr_pages for thp.
>We can avoid the cost of locking/unlocking compound_lock, as described in 2/6.
>

Agreed.

>> >commented somewhere or asserted with VM_BUG_ON().)
>> 
>> I will add the VM_BUG_ON() in unpoison_memory after locking the page in
>> the next version.
>
>Sorry, my previous suggestion didn't make sense.
>

Agreed.

Regards,
Wanpeng Li 

>Thank you!
>Naoya Horiguchi
>
>> >And nr_pages in unpoison_memory() can be greater than 1 for a hugetlbfs page.
>> >So does this patch break counting when unpoisoning free hugetlbfs pages?
>> >
>> >Thanks,
>> >Naoya Horiguchi
>> >
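
The hugetlbfs concern raised above can be illustrated with a small userspace
model. It assumes, as the question does, that a free hugetlbfs page can reach
the patched atomic_long_dec() path; poison_account(), unpoison_sub(),
unpoison_dec() and leftover_after_unpoison() are hypothetical helpers, and
the order used in the note below is an arbitrary example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* memory_failure() side: a hugetlbfs page is accounted as 1 << order pages. */
static void poison_account(atomic_long *counter, bool hugetlb, unsigned int order)
{
	atomic_fetch_add(counter, hugetlb ? 1L << order : 1L);
}

/* Original unpoison accounting: subtract nr_pages. */
static void unpoison_sub(atomic_long *counter, unsigned int order)
{
	atomic_fetch_sub(counter, 1L << order);
}

/* Patched unpoison accounting: always subtract one. */
static void unpoison_dec(atomic_long *counter)
{
	atomic_fetch_sub(counter, 1);
}

/* Pages still counted after poisoning and then unpoisoning one hugetlbfs page. */
long leftover_after_unpoison(unsigned int order, bool patched)
{
	atomic_long counter;

	atomic_init(&counter, 0);
	poison_account(&counter, true, order);
	if (patched)
		unpoison_dec(&counter);
	else
		unpoison_sub(&counter, order);
	return atomic_load(&counter);
}
```

For an example order of 9, the model leaves 511 pages counted with the
patched decrement and 0 with the original subtraction.
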
>> >> ---
>> >>  mm/memory-failure.c | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >> 
>> >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> >> index 5092e06..6bfd51e 100644
>> >> --- a/mm/memory-failure.c
>> >> +++ b/mm/memory-failure.c
>> >> @@ -1350,7 +1350,7 @@ int unpoison_memory(unsigned long pfn)
>> >>  			return 0;
>> >>  		}
>> >>  		if (TestClearPageHWPoison(p))
>> >> -			atomic_long_sub(nr_pages, &num_poisoned_pages);
>> >> +			atomic_long_dec(&num_poisoned_pages);
>> >>  		pr_info("MCE: Software-unpoisoned free page %#lx\n", pfn);
>> >>  		return 0;
>> >>  	}
>> >> -- 
>> >> 1.8.1.2
>> >>
>> 
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>>


