Date: Wed, 31 Mar 2021 10:43:36 +0800
From: Aili Yao
To: HORIGUCHI NAOYA (堀口　直也)
CC: David Hildenbrand, Matthew Wilcox, "akpm@linux-foundation.org",
    "linux-mm@kvack.org", "linux-kernel@vger.kernel.org",
    "yangfeng1@kingsoft.com", "sunhao2@kingsoft.com", Oscar Salvador,
    Mike Kravetz
Subject: Re: [PATCH v5] mm/gup: check page hwposion status for coredump.
Message-ID: <20210331104303.145aea53@alex-virtual-machine>
In-Reply-To: <20210331015258.GB22060@hori.linux.bs1.fc.nec.co.jp>
References: <20210317163714.328a038d@alex-virtual-machine>
    <20a0d078-f49d-54d6-9f04-f6b41dd51e5f@redhat.com>
    <20210318044600.GJ3420@casper.infradead.org>
    <20210318133412.12078eb7@alex-virtual-machine>
    <20210319104437.6f30e80d@alex-virtual-machine>
    <20210320003516.GC3420@casper.infradead.org>
    <20210322193318.377c9ce9@alex-virtual-machine>
    <20210331015258.GB22060@hori.linux.bs1.fc.nec.co.jp>
Organization: kingsoft

On Wed, 31 Mar 2021 01:52:59 +0000
HORIGUCHI NAOYA (堀口　直也) wrote:

> On Fri, Mar 26, 2021 at 03:22:49PM +0100, David Hildenbrand wrote:
> > On 26.03.21 15:09, David Hildenbrand wrote:
> > > On 22.03.21 12:33, Aili Yao wrote:
> > > > When we do coredump for user process signal, this may be one SIGBUS signal
> > > > with BUS_MCEERR_AR or BUS_MCEERR_AO code, which means this signal is
> > > > resulted from ECC memory fail like SRAR or SRAO, we expect the memory
> > > > recovery work is finished correctly, then the get_dump_page() will not
> > > > return the error page as its process pte is set invalid by
> > > > memory_failure().
> > > >
> > > > But memory_failure() may fail, and the process's related pte may not be
> > > > correctly set invalid, for current code, we will return the poison page,
> > > > get it dumped, and then lead to system panic as its in kernel code.
> > > >
> > > > So check the hwpoison status in get_dump_page(), and if TRUE, return NULL.
> > > >
> > > > There maybe other scenario that is also better to check hwposion status
> > > > and not to panic, so make a wrapper for this check, Thanks to David's
> > > > suggestion().
> > > >
> > > > Link: https://lkml.kernel.org/r/20210319104437.6f30e80d@alex-virtual-machine
> > > > Signed-off-by: Aili Yao
> > > > Cc: David Hildenbrand
> > > > Cc: Matthew Wilcox
> > > > Cc: Naoya Horiguchi
> > > > Cc: Oscar Salvador
> > > > Cc: Mike Kravetz
> > > > Cc: Aili Yao
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Andrew Morton
> > > > ---
> > > >  mm/gup.c      |  4 ++++
> > > >  mm/internal.h | 20 ++++++++++++++++++++
> > > >  2 files changed, 24 insertions(+)
> > > >
> > > > diff --git a/mm/gup.c b/mm/gup.c
> > > > index e4c224c..6f7e1aa 100644
> > > > --- a/mm/gup.c
> > > > +++ b/mm/gup.c
> > > > @@ -1536,6 +1536,10 @@ struct page *get_dump_page(unsigned long addr)
> > > >  				      FOLL_FORCE | FOLL_DUMP | FOLL_GET);
> > > >  	if (locked)
> > > >  		mmap_read_unlock(mm);
> > >
> > > Thinking again, wouldn't we get -EFAULT from __get_user_pages_locked()
> > > when stumbling over a hwpoisoned page?
> > >
> > > See __get_user_pages_locked()->__get_user_pages()->faultin_page():
> > >
> > > handle_mm_fault()->vm_fault_to_errno(), which translates
> > > VM_FAULT_HWPOISON to -EFAULT, unless FOLL_HWPOISON is set (-> -EHWPOISON)
> > >
> > > ?
>
> We could get -EFAULT, but sometimes not (depends on how memory_failure() fails).
>
> If we failed to unmap, the page table is not converted to hwpoison entry,
> so __get_user_pages_locked() get the hwpoisoned page.
>
> If we successfully unmapped but failed in truncate_error_page() for example,
> the processes mapping the page would get -EFAULT as expected. But even in
> this case, other processes could reach the error page via page cache and
> __get_user_pages_locked() for them could return the hwpoisoned page.
>
> >
> > Or doesn't that happen as you describe "But memory_failure() may fail, and
> > the process's related pte may not be correctly set invalid" -- but why does
> > that happen?
>
> Simply because memory_failure() doesn't handle some page types like ksm page
> and zero page. Or maybe shmem thp also belongs to this class.
>
> >
> > On a similar thought, should get_user_pages() never return a page that has
> > HWPoison set? E.g., check also for existing PTEs if the page is hwpoisoned?
>
> Make sense to me. Maybe inserting hwpoison check into follow_page_pte() and
> follow_huge_pmd() would work well.

I think we should be more careful about extending the hwpoison check to
other cases. The SIGBUS coredump is a case where the poisoned page is not
supposed to be touched, and if we return NULL for it, the coredump still
finishes successfully.

Other cases may meet the same requirement as coredump, but we need to
identify them one by one; that is the purpose of the poison check wrapper.
Otherwise we may break the integrity of the related operation, which may
be no better than a panic.

-- 
Thanks!
Aili Yao
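
To illustrate the check discussed above, here is a minimal user-space
sketch. It is not the patch itself. All names in it are hypothetical
stand-ins: model_page mimics a struct page that carries only the hwpoison
flag, model_page_is_hwpoison() mimics the check wrapper, and
model_get_dump_page() mimics only the tail of get_dump_page(), after the
lookup has already returned a page. A real kernel version would also have
to drop the reference taken by FOLL_GET before returning NULL, which this
model does not track.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; none of these names exist in the kernel. */
struct model_page {
	bool hwpoison;	/* stands in for the PG_hwpoison page flag */
};

/* Hypothetical wrapper: true if the page must not be read. */
static bool model_page_is_hwpoison(const struct model_page *page)
{
	return page != NULL && page->hwpoison;
}

/*
 * Models the end of get_dump_page(). `looked_up` is whatever the page
 * lookup returned, or NULL if nothing is mapped at the address.
 * A poisoned page is reported as NULL so the caller never reads it.
 */
static struct model_page *model_get_dump_page(struct model_page *looked_up)
{
	if (model_page_is_hwpoison(looked_up))
		return NULL;	/* refuse to hand out the poisoned page */
	return looked_up;
}
```

In this model a healthy page is passed through unchanged, while a poisoned
page and a missing page both come back as NULL, so the caller needs no new
error path. Keeping the test in one wrapper means each further call site
can opt in only after it has been checked individually.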