From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A536C4345F for ; Fri, 26 Apr 2024 19:05:28 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 1F77C6B0099; Fri, 26 Apr 2024 15:05:28 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1A7416B009B; Fri, 26 Apr 2024 15:05:28 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 096696B009C; Fri, 26 Apr 2024 15:05:28 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id DF6DB6B0099 for ; Fri, 26 Apr 2024 15:05:27 -0400 (EDT) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 9FF09160947 for ; Fri, 26 Apr 2024 19:05:27 +0000 (UTC) X-FDA: 82052611494.11.5019F7B Received: from casper.infradead.org (casper.infradead.org [90.155.50.34]) by imf29.hostedemail.com (Postfix) with ESMTP id 450AC12000E for ; Fri, 26 Apr 2024 19:05:24 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b="SalqeF2/"; dmarc=none; spf=none (imf29.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1714158325; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=J9hzjQzDOk6bgui0ZQTXguiwfYozRtBaHNSEEFobqG8=; b=oYInoxgTQLi/oEI35vZB6qBy0KGKxG2WC+1gDZVpXHhnEWc3MioaC0ojiOSvJJTG+FnEFk iD44kIcYdPQNuWVGXnk42rpGsftbYfxM2X1KrtkdvdJEcP+hllJyDGzr8LyVG6LOgvxOf5 WUXMNsOMlFiSPV1NCCgFxazIkjEL68E= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1714158325; a=rsa-sha256; cv=none; b=497jp47WdKWWPZl5gq4S3lAtmijGgk1UztUI6N2/beWVfwmDDPOjirCyQBFRYZJ7jveAyW ahmdT/ojzGHn15PcL+m7h7+shc4fS4Pr0TWd1ocz/Dzya2N8pRgupz5eB0o5nm8R2Juucb pmjk0DYg66CphytSA0eN1vuDRCPBkL8= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=infradead.org header.s=casper.20170209 header.b="SalqeF2/"; dmarc=none; spf=none (imf29.hostedemail.com: domain of willy@infradead.org has no SPF policy when checking 90.155.50.34) smtp.mailfrom=willy@infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=J9hzjQzDOk6bgui0ZQTXguiwfYozRtBaHNSEEFobqG8=; b=SalqeF2/y6+XjXTwdw2qUijjlL YbazM9RYU7erDUktOkFEsLCooaBAJg0Xw3Ce+NJmWAFAGh2IL9trO0IXSCu5uWeMddLj+hMl6Tbqo 9FivPMriPCjqktelGcq7wXyuhZY38ZPMtxRGN7wigGJOhBwCmDACZYieCW/bBDwXXSTLEaWoISCdL Es2sL4DEEAFtFq50FAwMAPQ+0pR20uTqA1uRY3Y4wFYAIzCbsB1zpw0ulIaivm3fozFr5GBcKC7U8 qTj4E3+O57dIynIli2nRn7x5GPEBxaWKdmtYmuTE1/D0ffFtkQIW90Ei3Ji5s+/urNH2m81TOyjz5 GDy67bpQ==; Received: from willy by casper.infradead.org with local (Exim 4.97.1 #2 (Red Hat Linux)) id 1s0Qsp-00000005pno-0JIt; Fri, 26 Apr 2024 19:05:19 +0000 Date: Fri, 26 Apr 2024 20:05:18 +0100 From: Matthew Wilcox To: Sidhartha Kumar Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, linmiaohe@huawei.com, jane.chu@oracle.com, nao.horiguchi@gmail.com, osalvador@suse.de Subject: Re: [PATCH] mm/memory-failure: remove shake_page() Message-ID: References: <20240426171511.122887-1-sidhartha.kumar@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspamd-Server: rspam08 X-Rspamd-Queue-Id: 450AC12000E X-Stat-Signature: ei7u1oonijrqsofsuchn94zt8hgh33u3 X-Rspam-User: X-HE-Tag: 1714158324-471835 X-HE-Meta: U2FsdGVkX18vc8Ru4fDdIPe4IUC+g9ioUJ+rECwdrx7sl4TzSbRUKUTB8uXXeY9ps9vZkkBGkvnLf2Bzcf9alIGWZ+UnSQGs8Zq1RJgmVDrTdgod2lQUkBhtxAKHmRgdFNZ/hHhiTGZOBWHgp7bXMnYWx688VDQOLspnKbuw4hDSgkTDSTqB7rUxDDwIo3EO+PFq4pexLcKH4kPMB/2P9LtkCfbAPseAEuxejSWH1WVl8Jwif7zR4wYyRnnF6q5npi7grSj2DcKlHy5VwNU0c3qCFgV1OX2r+53vVGu6x1i4nfmT5QrE8W/AEGWbFdEwuyL3BzF0ualXhAPXLxtSKXbfy0BYI3hA6JwHex9QSoyi76bLGhGQEkWXmAm1F0KxskO66Jy6DLzcnGTgiEQT613nGKoDI4FdvOi0gfTCUrKzrI+iZqNkMpeljkEp8EPLlrvbb/Pex4sZFDEBuG+XNa0QEo+6y5tuGhVVHOxmy4gdNMMfXW7oKePFqW+8LrksmAq3EOQWwcLZL6qVFUiZvWNVAmaCrD2e68TXL52giNTaUQuUZfCdsaAuVEHUvYn0id4/+cxeQK1MKOLscwgcNDGqYxBEHsPVHHJhyrRMtJ4BKsqVsKuakUeBZeUic1DY1kgkQylHqsJ28Quc+6D/plduE3oabVUtOZMaBbS4OBqMOpJ+bzTkAly/7a6t5lWUsYE1vEYlvZAKzYbMySfIJ3gs7+WzuPAkTJu6blkSQSYAwVviLUMeHUmVjPMeimtrFhbTJKXR5A5Fg4ky9S3X6rGqsy1PkyCJBgEi6uz0Z691WyExbxPFRMt788P5AYzcaRUFIR9gOzFQykVukx4MnbMg9PVpi1edVuLvrH1JkeZeRdP+aJFamEHiNu1Nv3YIej0plwq1QNtz70/AopDyRd3obR/JO342jxeEPp0KMujlmLK6yqRwleEpYyW18UdjV8WOfCQExGWfyeY9zHI prly7IRY exlT8/Dz9xHULyfiP8Iswmtl6zF7ZR3DAoVW8Ae5gXOv81GogXs2qvRjaLq1ailwFBbol/oJwNsB+S4x06t8e+Q+S8pnlmnyyo48/PxmHGo3ZeoVD9Y7FlD/PWDniRikXus7bcivuBDYrBbd35Henrw7eqV6ZijYefkUeV2AFTpKSiFuV+kwNqYCwkjptzhEjDh94l5xcovkW0hKfqY/mLJTlnzzLkA+IkT1UV3h7tOO6gdVu6vx9SEUvBaHmwE8Ej32y X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Apr 26, 2024 at 11:53:01AM -0700, Sidhartha Kumar wrote: > On 4/26/24 11:27 AM, Matthew Wilcox wrote: > > On Fri, Apr 26, 2024 at 10:57:31AM -0700, Sidhartha Kumar wrote: > > > On 4/26/24 10:34 AM, Matthew Wilcox wrote: > > > > On Fri, Apr 26, 2024 at 10:15:11AM -0700, Sidhartha Kumar wrote: > > > > > Use a folio in get_any_page() to save 5 calls to compound head and > > > > > convert the last user of shake_page() to shake_folio(). This allows us > > > > > to remove the shake_page() definition. > > > > > > > > So I didn't do this before because I wasn't convinced it was safe. > > > > We don't have a refcount on the folio, so the page might no longer > > > > be part of this folio by the time we get the refcount on the folio. > > > > > > > > I'd really like to see some argumentation for why this is safe. > > > > > > If I moved down the folio = page_folio() line to after we verify > > > __get_hwpoison_page() has returned 1, which indicates the reference count > > > was successfully incremented via foliO_try_get(), that means the folio > > > conversion would happen after we have a refcount. In the case we don't call > > > __get_hwpoison_page(), that means the MF_COUNT_INCREASED flag is set. This > > > means the page has existing users so that path would be safe as well. So I > > > think this is safe after moving page_folio() after __get_hwpoison_page(). > > > > See if you can find a hole in this chain of reasoning ... > > > > memory_failure() > > p = pfn_to_online_page(pfn); > > res = try_memory_failure_hugetlb(pfn, flags, &hugetlb); > > (not a hugetlb) > > if (TestSetPageHWPoison(p)) { > > (not already poisoned) > > if (!(flags & MF_COUNT_INCREASED)) { > > res = get_hwpoison_page(p, flags); > > > > get_hwpoison_page() > > ret = get_any_page(p, flags); > > > > get_any_page() > > folio = page_folio(page) > > That would be unsafe, the safe way would be if we moved page_folio() after > the call to __get_hw_poison() in get_any_page() and there would still be one > remaining user of shake_page() that we can't convert. A safe version of this > patch would result in a removal of one use of PageHuge() and two uses of > put_page(), would that be worth submitting? > > get_any_page() > if(__get_hwpoison_page()) > folio = page_folio() /* folio_try_get() returned 1, safe */ I think we should convert __get_hwpoison_page() to return either the folio or an ERR_PTR or NULL. Also, I think we should delete the "cannot catch tail" part and just loop in __get_hwpoison_page() until we do catch it. See try_get_folio() in mm/gup.c for inspiration (although you can't use it exactly because that code knows that the page is mapped into a page table, so has a refcount). But that's just an immediate assessment; you might find a reason that doesn't work.