From: Muchun Song <muchun.song@linux.dev>
Date: Tue, 19 Sep 2023 17:52:51 +0800
Subject: Re: [PATCH v4 4/8] hugetlb: perform vmemmap restoration on a list of pages
To: Mike Kravetz, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Muchun Song, Joao Martins, Oscar Salvador, David Hildenbrand, Miaohe Lin, David Rientjes, Anshuman Khandual, Naoya Horiguchi, Barry Song <21cnbao@gmail.com>, Michal Hocko, Matthew Wilcox, Xiongchun Duan, Andrew Morton
In-Reply-To: <20230918230202.254631-5-mike.kravetz@oracle.com>
References: <20230918230202.254631-1-mike.kravetz@oracle.com> <20230918230202.254631-5-mike.kravetz@oracle.com>

On 2023/9/19 07:01, Mike Kravetz wrote:
> The routine update_and_free_pages_bulk already performs vmemmap
> restoration on the list of hugetlb pages in a separate step.  In
> preparation for more functionality to be added in this step, create a
> new routine hugetlb_vmemmap_restore_folios() that will restore
> vmemmap for a list of folios.
>
> This new routine must provide sufficient feedback about errors and
> actual restoration performed so that update_and_free_pages_bulk can
> perform optimally.
>
> Signed-off-by: Mike Kravetz
> ---
>  mm/hugetlb.c         | 36 ++++++++++++++++++------------------
>  mm/hugetlb_vmemmap.c | 37 +++++++++++++++++++++++++++++++++++++
>  mm/hugetlb_vmemmap.h | 11 +++++++++++
>  3 files changed, 66 insertions(+), 18 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index d6f3db3c1313..814bb1982274 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1836,36 +1836,36 @@ static void update_and_free_hugetlb_folio(struct hstate *h, struct folio *folio,
>  
>  static void update_and_free_pages_bulk(struct hstate *h, struct list_head *list)
>  {
> +	int ret;
> +	unsigned long restored;
>  	struct folio *folio, *t_folio;
> -	bool clear_dtor = false;
>  
>  	/*
> -	 * First allocate required vmemmmap (if necessary) for all folios on
> -	 * list.  If vmemmap can not be allocated, we can not free folio to
> -	 * lower level allocator, so add back as hugetlb surplus page.
> -	 * add_hugetlb_folio() removes the page from THIS list.
> -	 * Use clear_dtor to note if vmemmap was successfully allocated for
> -	 * ANY page on the list.
> +	 * First allocate required vmemmmap (if necessary) for all folios.
>  	 */
> -	list_for_each_entry_safe(folio, t_folio, list, lru) {
> -		if (folio_test_hugetlb_vmemmap_optimized(folio)) {
> -			if (hugetlb_vmemmap_restore(h, &folio->page)) {
> -				spin_lock_irq(&hugetlb_lock);
> +	ret = hugetlb_vmemmap_restore_folios(h, list, &restored);
> +
> +	/*
> +	 * If there was an error restoring vmemmap for ANY folios on the list,
> +	 * add them back as surplus hugetlb pages.  add_hugetlb_folio() removes
> +	 * the folio from THIS list.
> +	 */
> +	if (ret < 0) {
> +		spin_lock_irq(&hugetlb_lock);
> +		list_for_each_entry_safe(folio, t_folio, list, lru)
> +			if (folio_test_hugetlb_vmemmap_optimized(folio))
>  				add_hugetlb_folio(h, folio, true);
> -				spin_unlock_irq(&hugetlb_lock);
> -			} else
> -				clear_dtor = true;
> -		}
> +		spin_unlock_irq(&hugetlb_lock);
>  	}
>  
>  	/*
> -	 * If vmemmmap allocation was performed on any folio above, take lock
> -	 * to clear destructor of all folios on list.  This avoids the need to
> +	 * If vmemmmap allocation was performed on ANY folio , take lock to
> +	 * clear destructor of all folios on list. This avoids the need to
>  	 * lock/unlock for each individual folio.
>  	 * The assumption is vmemmap allocation was performed on all or none
>  	 * of the folios on the list.  This is true expect in VERY rare cases.
>  	 */
> -	if (clear_dtor) {
> +	if (restored) {
>  		spin_lock_irq(&hugetlb_lock);
>  		list_for_each_entry(folio, list, lru)
>  			__clear_hugetlb_destructor(h, folio);
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 4558b814ffab..463a4037ec6e 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -480,6 +480,43 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
>  	return ret;
>  }
>  
> +/**
> + * hugetlb_vmemmap_restore_folios - restore vmemmap for every folio on the list.
> + * @h:		struct hstate.
> + * @folio_list:	list of folios.
> + * @restored:	Set to number of folios for which vmemmap was restored
> + *		successfully if caller passes a non-NULL pointer.
> + *
> + * Return: %0 if vmemmap exists for all folios on the list.  If an error is
> + *		encountered restoring vmemmap for ANY folio, an error code
> + *		will be returned to the caller.  It is then the responsibility
> + *		of the caller to check the hugetlb vmemmap optimized flag of
> + *		each folio to determine if vmemmap was actually restored.
> + */
> +int hugetlb_vmemmap_restore_folios(const struct hstate *h,
> +					struct list_head *folio_list,
> +					unsigned long *restored)
> +{
> +	unsigned long num_restored;
> +	struct folio *folio;
> +	int ret = 0, t_ret;
> +
> +	num_restored = 0;
> +	list_for_each_entry(folio, folio_list, lru) {
> +		if (folio_test_hugetlb_vmemmap_optimized(folio)) {
> +			t_ret = hugetlb_vmemmap_restore(h, &folio->page);

I still think we should free a non-optimized HugeTLB page if we
encounter an OOM situation instead of continuing to restore vmemmap
pages. Restoring vmemmap pages will only aggravate the OOM situation.
The suitable approach is to free a non-optimized HugeTLB page to
satisfy our allocation of vmemmap pages. What's your opinion, Mike?

Thanks.

> +			if (t_ret)
> +				ret = t_ret;
> +			else
> +				num_restored++;
> +		}
> +	}
> +
> +	if (*restored)
> +		*restored = num_restored;
> +	return ret;
> +}
> +
>  /* Return true iff a HugeTLB whose vmemmap should and can be optimized. */
>  static bool vmemmap_should_optimize(const struct hstate *h, const struct page *head)
>  {
> diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
> index c512e388dbb4..bb58453c3cc0 100644
> --- a/mm/hugetlb_vmemmap.h
> +++ b/mm/hugetlb_vmemmap.h
> @@ -19,6 +19,8 @@
>  
>  #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
>  int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head);
> +int hugetlb_vmemmap_restore_folios(const struct hstate *h,
> +			struct list_head *folio_list, unsigned long *restored);
>  void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head);
>  void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
>  
> @@ -45,6 +47,15 @@ static inline int hugetlb_vmemmap_restore(const struct hstate *h, struct page *h
>  	return 0;
>  }
>  
> +static inline int hugetlb_vmemmap_restore_folios(const struct hstate *h,
> +					struct list_head *folio_list,
> +					unsigned long *restored)
> +{
> +	if (restored)
> +		*restored = 0;
> +	return 0;
> +}
> +
>  static inline void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
> {
> }