From: Naoya Horiguchi <nao.horiguchi@gmail.com>
To: Qian Cai <cai@lca.pw>
Cc: "tony.luck@intel.com" <tony.luck@intel.com>,
	"david@redhat.com" <david@redhat.com>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	"zeil@yandex-team.ru" <zeil@yandex-team.ru>,
	"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mhocko@kernel.org" <mhocko@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"aneesh.kumar@linux.vnet.ibm.com"
	<aneesh.kumar@linux.vnet.ibm.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"osalvador@suse.de" <osalvador@suse.de>,
	"will@kernel.org" <will@kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	"mike.kravetz@oracle.com" <mike.kravetz@oracle.com>
Subject: Re: [PATCH v6 00/12] HWPOISON: soft offline rework
Date: Wed, 12 Aug 2020 04:32:01 +0900
Message-ID: <20200811193201.GA1410457@u2004>
In-Reply-To: <20200811173923.GA39857@lca.pw>

On Tue, Aug 11, 2020 at 01:39:24PM -0400, Qian Cai wrote:
> On Tue, Aug 11, 2020 at 03:11:40AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> > I'm still not sure why the test succeeded by reverting these because
> > current mainline kernel provides similar mechanism to prevent reuse of
> > soft offlined page. So this success seems to me something suspicious.
> > 
> > To investigate more, I want to have additional info about the page states
> > of the relevant pages after soft offlining.  Could you collect it by the
> > following steps?
> > 
> >   - modify random.c not to run hotplug_memory() in migrate_huge_hotplug_memory(),
> >   - compile it and run "./random 1" once,
> >   - to collect page state with hwpoisoned pages, run "./page-types -Nlr -b hwpoison",
> >     where page-types is available under tools/vm in kernel source tree.
> >   - choose a few pfns of soft offlined pages from kernel message
> >     "Soft offlining pfn ...", and run "./page-types -Nlr -a <pfn>".
> 
> # ./page-types -Nlr -b hwpoison
> offset	len	flags
> 99a000	1	__________B________X_______________________
> 99c000	1	__________B________X_______________________
> 99e000	1	__________B________X_______________________
> 9a0000	1	__________B________X_______________________
> ba6000	1	__________B________X_______________________
> baa000	1	__________B________X_______________________

Thank you.  The output shows only 6 records, which I did not expect,
because random.c soft-offlines 2 hugepages with madvise() 1000 times.
Could the other hwpoisoned pages have been cleared somehow (perhaps in an
arch-specific way)?  If so, the success of this test is spurious, and this
patchset can be considered a fix.
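
For anyone repeating the check, here is a minimal sketch of how the
page-types listing could be tallied and compared against the number of
soft-offline attempts.  The helper name parse_page_types() is hypothetical,
and the column layout (hex offset, decimal len, flag string) is an
assumption based on the output quoted above.  The figure of 2000 is only
an upper bound on attempts (2 hugepages x 1000 iterations), not a count the
kernel is known to produce.

```python
def parse_page_types(text):
    """Parse `page-types -l` style output into (pfn, length, flags) tuples.

    Assumes three whitespace-separated columns: hex offset, decimal
    length, and the flag string, as in the output quoted in this thread.
    """
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "offset len flags" header.
        if len(fields) != 3 or fields[0] == "offset":
            continue
        records.append((int(fields[0], 16), int(fields[1]), fields[2]))
    return records


# Output quoted from the report above.
sample = """\
offset	len	flags
99a000	1	__________B________X_______________________
99c000	1	__________B________X_______________________
99e000	1	__________B________X_______________________
9a0000	1	__________B________X_______________________
ba6000	1	__________B________X_______________________
baa000	1	__________B________X_______________________
"""

records = parse_page_types(sample)
# Total number of pages covered by the listed ranges.
poisoned_pages = sum(length for _, length, _ in records)

# Upper bound on soft-offline attempts described for random.c:
# 2 hugepages per iteration, 1000 iterations (assumption: every
# attempt would leave one hwpoisoned record behind).
max_attempts = 2 * 1000

print(f"{poisoned_pages} hwpoisoned pages listed, "
      f"at most {max_attempts} soft-offline attempts")
```

If the same script is fed the full output of a later run, a count far below
the number of attempts would point in the same direction as the question
above.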

> 
> Every single one of pfns was like this,
> 
> # ./page-types -Nlr -a 0x99a000
> offset	len	flags
> 99a000	1	__________B________X_______________________
> 
> # ./page-types -Nlr -a 0x99e000
> offset	len	flags
> 99e000	1	__________B________X_______________________
> 
> # ./page-types -Nlr -a 0x99c000
> offset	len	flags
> 99c000	1	__________B________X_______________________
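
As a reading aid for the flag strings above, here is a small sketch that
decodes the two letters that appear in them.  It assumes that, for the low
bits, the character position in the page-types flag string equals the bit
number in include/uapi/linux/kernel-page-flags.h, where bit 10 is KPF_BUDDY
and bit 19 is KPF_HWPOISON.  The table is deliberately limited to those two
bits, and decode_flags() is a hypothetical helper, not part of page-types.

```python
# Only the two bits that show up in the output above; the position-to-bit
# mapping is an assumption about how page-types prints its short flags.
KPF_BITS = {
    10: "buddy",     # KPF_BUDDY
    19: "hwpoison",  # KPF_HWPOISON
}


def decode_flags(flag_string):
    """Return the names of the known flags that are set in flag_string."""
    return [
        name
        for bit, name in sorted(KPF_BITS.items())
        if bit < len(flag_string) and flag_string[bit] != "_"
    ]


print(decode_flags("__________B________X_______________________"))
```

Under that assumption, every record in the report is a page that is both in
the buddy allocator and marked hwpoison.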
