From: Naoya Horiguchi <nao.horiguchi@gmail.com>
To: Qian Cai <cai@lca.pw>
Cc: "tony.luck@intel.com" <tony.luck@intel.com>,
"david@redhat.com" <david@redhat.com>,
"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
"zeil@yandex-team.ru" <zeil@yandex-team.ru>,
"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"mhocko@kernel.org" <mhocko@kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"aneesh.kumar@linux.vnet.ibm.com"
<aneesh.kumar@linux.vnet.ibm.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"osalvador@suse.de" <osalvador@suse.de>,
"will@kernel.org" <will@kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>,
"mike.kravetz@oracle.com" <mike.kravetz@oracle.com>
Subject: Re: [PATCH v6 00/12] HWPOISON: soft offline rework
Date: Wed, 12 Aug 2020 04:32:01 +0900
Message-ID: <20200811193201.GA1410457@u2004>
In-Reply-To: <20200811173923.GA39857@lca.pw>
On Tue, Aug 11, 2020 at 01:39:24PM -0400, Qian Cai wrote:
> On Tue, Aug 11, 2020 at 03:11:40AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> > I'm still not sure why the test succeeded by reverting these because
> > current mainline kernel provides similar mechanism to prevent reuse of
> > soft offlined page. So this success seems to me something suspicious.
> >
> > To investigate more, I want to have additional info about the page states
> > of the relevant pages after soft offlining. Could you collect it by the
> > following steps?
> >
> > - modify random.c not to run hotplug_memory() in migrate_huge_hotplug_memory(),
> > - compile it and run "./random 1" once,
> > - to collect page state with hwpoisoned pages, run "./page-types -Nlr -b hwpoison",
> > where page-types is available under tools/vm in kernel source tree.
> > - choose a few pfns of soft offlined pages from kernel message
> > "Soft offlining pfn ...", and run "./page-types -Nlr -a <pfn>".
>
> # ./page-types -Nlr -b hwpoison
> offset len flags
> 99a000 1 __________B________X_______________________
> 99c000 1 __________B________X_______________________
> 99e000 1 __________B________X_______________________
> 9a0000 1 __________B________X_______________________
> ba6000 1 __________B________X_______________________
> baa000 1 __________B________X_______________________
Thank you. The output shows only 6 records, which I did not expect,
because random.c soft-offlines 2 hugepages with madvise() 1000 times.
Perhaps the hwpoison flag on the other pages was somehow cleared (maybe in
an arch-specific way)? If so, the test only appears to succeed, and this
patchset can be considered a fix.
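
To make the mismatch concrete, here is a minimal sketch that counts the
hwpoison records in the quoted page-types output and compares them with the
number of soft-offline requests (2 hugepages x 1000 iterations). The helper
name parse_page_types and the embedded SAMPLE text are illustrative only and
are not part of page-types or random.c. It assumes the "offset len flags"
column layout shown in the quoted output and that 'X' in the flags column
marks hwpoison.

```python
import re

# One record: hex offset, decimal length, flags string (optional "> " quoting).
RECORD = re.compile(r"^(?:>\s*)*([0-9a-f]+)\s+(\d+)\s+(\S+)\s*$")


def parse_page_types(text):
    """Return (pfn, length, flags) tuples parsed from page-types -l output."""
    records = []
    for line in text.splitlines():
        match = RECORD.match(line.strip())
        if match:  # header and shell-prompt lines do not match and are skipped
            pfn = int(match.group(1), 16)
            length = int(match.group(2))
            records.append((pfn, length, match.group(3)))
    return records


# Output copied from the quoted report (assumed to be complete).
SAMPLE = """\
# ./page-types -Nlr -b hwpoison
offset len flags
99a000 1 __________B________X_______________________
99c000 1 __________B________X_______________________
99e000 1 __________B________X_______________________
9a0000 1 __________B________X_______________________
ba6000 1 __________B________X_______________________
baa000 1 __________B________X_______________________
"""

records = parse_page_types(SAMPLE)
# Count pages whose flags include 'X' (hwpoison in the page-types legend).
poisoned_pages = sum(length for _, length, flags in records if "X" in flags)

# Soft-offline requests issued by the test, per the description above.
hugepages, iterations = 2, 1000
requests = hugepages * iterations

print(f"hwpoisoned pages reported: {poisoned_pages}")
print(f"soft-offline requests issued: {requests}")
```

The same parser could be pointed at saved output from a real run instead of
SAMPLE; how many records a given kernel is expected to leave per request is
not something this sketch asserts.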
>
> Every single one of pfns was like this,
>
> # ./page-types -Nlr -a 0x99a000
> offset len flags
> 99a000 1 __________B________X_______________________
>
> # ./page-types -Nlr -a 0x99e000
> offset len flags
> 99e000 1 __________B________X_______________________
>
> # ./page-types -Nlr -a 0x99c000
> offset len flags
> 99c000 1 __________B________X_______________________