Subject: Re: memory offline infinite loop after soft offline
To: Michal Hocko, Qian Cai, Naoya Horiguchi
Cc: "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", Mike Kravetz
References: <1570829564.5937.36.camel@lca.pw> <20191014083914.GA317@dhcp22.suse.cz> <20191017093410.GA19973@hori.linux.bs1.fc.nec.co.jp> <20191017100106.GF24485@dhcp22.suse.cz>
From: David Hildenbrand
Organization: Red Hat GmbH
Message-ID: <17d5ed8c-4a0f-55c5-7474-3ae5e4263784@redhat.com>
Date: Thu, 17 Oct 2019 12:03:47 +0200
In-Reply-To: <20191017100106.GF24485@dhcp22.suse.cz>

On 17.10.19 12:01, Michal Hocko wrote:
> On Thu 17-10-19 09:34:10, Naoya Horiguchi wrote:
>> On Mon, Oct 14, 2019 at 10:39:14AM +0200, Michal Hocko wrote:
> [...]
>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>>> index 89c19c0feadb..5fb3fee16fde 100644
>>> --- a/mm/page_isolation.c
>>> +++ b/mm/page_isolation.c
>>> @@ -274,7 +274,7 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
>>>  			 * simple way to verify that as VM_BUG_ON(), though.
>>>  			 */
>>>  			pfn += 1 << page_order(page);
>>> -		else if (skip_hwpoisoned_pages && PageHWPoison(page))
>>> +		else if (skip_hwpoisoned_pages && PageHWPoison(compound_head(page)))
>>>  			/* A HWPoisoned page cannot be also PageBuddy */
>>>  			pfn++;
>>>  		else
>>
>> This fix looks good to me. The original code only addresses hwpoisoned 4kB-page,
>> we seem to have this issue since the following commit,
>
> Thanks a lot for double checking Naoya!
>
>> commit b023f46813cde6e3b8a8c24f432ff9c1fd8e9a64
>> Author: Wen Congyang
>> Date: Tue Dec 11 16:00:45 2012 -0800
>>
>> memory-hotplug: skip HWPoisoned page when offlining pages
>>
>> and extension of LTP coverage finally discovered this.
>
> Qian, could you give the patch some testing?
> ---
>
> From 441a9515dcdb29bb0ca39ff995632907d959032f Mon Sep 17 00:00:00 2001
> From: Michal Hocko
> Date: Thu, 17 Oct 2019 11:49:15 +0200
> Subject: [PATCH] hugetlb, memory_hotplug: fix HWPoisoned tail pages properly
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Qian Cai has noticed that hwpoisoned hugetlb pages prevent memory
> offlining from making a forward progress. He has nailed down the issue
> to be __test_page_isolated_in_pageblock always returning EBUSY because
> of soft offlined page:
> [  101.665160][ T8885] pfn = 77501, end_pfn = 78000
> [  101.665245][ T8885] page:c00c000001dd4040 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
> [  101.665329][ T8885] flags: 0x3fffc000000000()
> [  101.665391][ T8885] raw: 003fffc000000000 0000000000000000 ffffffff01dd0500 0000000000000000
> [  101.665498][ T8885] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [  101.665588][ T8885] page dumped because: soft_offline
> [  101.665639][ T8885] page_owner tracks the page as freed
> [  101.665697][ T8885] page last allocated via order 5, migratetype Movable, gfp_mask 0x346cca(GFP_HIGHUSER_MOVABLE|__GFP_NOWARN|__GFP_RETRY_MAYFAIL|__GFP_COMP|__GFP_THISNODE)
> [  101.665924][ T8885]  prep_new_page+0x3c0/0x440
> [  101.665962][ T8885]  get_page_from_freelist+0x2568/0x2bb0
> [  101.666059][ T8885]  __alloc_pages_nodemask+0x1b4/0x670
> [  101.666115][ T8885]  alloc_fresh_huge_page+0x244/0x6e0
> [  101.666183][ T8885]  alloc_migrate_huge_page+0x30/0x70
> [  101.666254][ T8885]  alloc_new_node_page+0xc4/0x380
> [  101.666325][ T8885]  migrate_pages+0x3b4/0x19e0
> [  101.666375][ T8885]  do_move_pages_to_node.isra.29.part.30+0x44/0xa0
> [  101.666464][ T8885]  kernel_move_pages+0x498/0xfc0
> [  101.666520][ T8885]  sys_move_pages+0x28/0x40
> [  101.666643][ T8885]  system_call+0x5c/0x68
> [  101.666665][ T8885] page last free stack trace:
> [  101.666704][ T8885]  __free_pages_ok+0xa4c/0xd40
> [  101.666773][ T8885]  update_and_free_page+0x2dc/0x5b0
> [  101.666821][ T8885]  free_huge_page+0x2dc/0x740
> [  101.666875][ T8885]  __put_compound_page+0x64/0xc0
> [  101.666926][ T8885]  putback_active_hugepage+0x228/0x390
> [  101.666990][ T8885]  migrate_pages+0xa78/0x19e0
> [  101.667048][ T8885]  soft_offline_page+0x314/0x1050
> [  101.667117][ T8885]  sys_madvise+0x1068/0x1080
> [  101.667185][ T8885]  system_call+0x5c/0x68
>
> The reason is that __test_page_isolated_in_pageblock doesn't recognize
> hugetlb tail pages as the HWPoison bit is not transferred from the head
> page. Pfn walker then doesn't recognize those pages and so EBUSY is
> returned up the call chain.
>
> The proper fix would be to handle HWPoison throughout the huge page but
> considering there is a WIP to rework that code considerably let's go
> with a simple and easily backportable workaround and simply check the
> the head of a compound page for the HWPoison flag.
>
> Reported-and-analyzed-by: Qian Cai
> Fixes: b023f46813cd ("memory-hotplug: skip HWPoisoned page when offlining pages")
> Cc: stable
> Signed-off-by: Michal Hocko
> ---
>  mm/page_isolation.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 89c19c0feadb..5fb3fee16fde 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -274,7 +274,7 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
>  			 * simple way to verify that as VM_BUG_ON(), though.
>  			 */
>  			pfn += 1 << page_order(page);
> -		else if (skip_hwpoisoned_pages && PageHWPoison(page))
> +		else if (skip_hwpoisoned_pages && PageHWPoison(compound_head(page)))
>  			/* A HWPoisoned page cannot be also PageBuddy */
>  			pfn++;
>  		else
>

With the extended description, this makes sense to me now :)

Acked-by: David Hildenbrand

--
Thanks,

David / dhildenb
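
For readers following the thread, the effect of the one-line change can be
modelled in user space. This is only a rough sketch, not kernel code:
struct fake_page, fake_compound_head() and range_is_isolated() are
hypothetical stand-ins for struct page, compound_head() and the pfn walk in
__test_page_isolated_in_pageblock(). It assumes, as the commit message
describes, that the HWPoison bit sits on the head page only, and it treats
every free page as order 0 instead of skipping 1 << page_order(page) pfns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct page; not the kernel layout. */
struct fake_page {
	bool buddy;             /* models PageBuddy(): page is on a free list */
	bool hwpoison;          /* models PageHWPoison(): set on the head only */
	struct fake_page *head; /* compound head; NULL for head or single pages */
};

/* Models compound_head(): a tail page resolves to its head, others to self. */
static struct fake_page *fake_compound_head(struct fake_page *page)
{
	return page->head ? page->head : page;
}

/*
 * Models the pfn walk. Returns true when every page in [start, end) is
 * either free or skipped as hwpoisoned; false corresponds to the walk
 * stopping early, which the caller reports as -EBUSY.
 * check_head == false models the old test, true models the patched one.
 */
static bool range_is_isolated(struct fake_page *pages, size_t start,
			      size_t end, bool check_head)
{
	assert(start <= end);

	for (size_t pfn = start; pfn < end; pfn++) {
		struct fake_page *page = &pages[pfn];
		struct fake_page *flag_page =
			check_head ? fake_compound_head(page) : page;

		if (page->buddy)
			continue; /* free page (simplified to order 0) */
		if (flag_page->hwpoison)
			continue; /* poisoned page is skipped */
		return false;     /* walk stops: range is not isolated */
	}
	return true;
}
```

In this model, a four-page compound page whose head carries the poison flag
fails the walk at the first tail page when check_head is false, and passes
when check_head is true, which is the forward progress the patch is after.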