From: Mike Kravetz <mike.kravetz@oracle.com>
To: Muchun Song <muchun.song@linux.dev>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Muchun Song <songmuchun@bytedance.com>,
Joao Martins <joao.m.martins@oracle.com>,
Oscar Salvador <osalvador@suse.de>,
David Hildenbrand <david@redhat.com>,
Miaohe Lin <linmiaohe@huawei.com>,
David Rientjes <rientjes@google.com>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Naoya Horiguchi <naoya.horiguchi@linux.dev>,
Barry Song <21cnbao@gmail.com>, Michal Hocko <mhocko@suse.com>,
Matthew Wilcox <willy@infradead.org>,
Xiongchun Duan <duanxiongchun@bytedance.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v5 4/8] hugetlb: perform vmemmap restoration on a list of pages
Date: Mon, 25 Sep 2023 15:35:36 -0700
Message-ID: <20230925223536.GA57727@monkey>
In-Reply-To: <20230925171242.GA11309@monkey>

On 09/25/23 10:12, Mike Kravetz wrote:
> On 09/25/23 21:54, Muchun Song wrote:
> >
> >
> > On 2023/9/25 08:39, Mike Kravetz wrote:
> > > The routine update_and_free_pages_bulk already performs vmemmap
> > > restoration on the list of hugetlb pages in a separate step. In
> > > preparation for more functionality to be added in this step, create a
> > > new routine hugetlb_vmemmap_restore_folios() that will restore
> > > vmemmap for a list of folios.
> > >
> > > This new routine must provide sufficient feedback about errors and
> > > actual restoration performed so that update_and_free_pages_bulk can
> > > perform optimally.
> > >
> > > Special care must be taken when encountering an error from
> > > hugetlb_vmemmap_restore_folios. We want to continue making as much
> > > forward progress as possible. A new routine bulk_vmemmap_restore_error
> > > handles this specific situation.
> > >
> > > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> > > ---
> > > mm/hugetlb.c | 98 +++++++++++++++++++++++++++++++-------------
> > > mm/hugetlb_vmemmap.c | 38 +++++++++++++++++
> > > mm/hugetlb_vmemmap.h | 10 +++++
> > > 3 files changed, 118 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index da0ebd370b5f..53df35fbc3f2 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -1834,50 +1834,92 @@ static void update_and_free_hugetlb_folio(struct hstate *h, struct folio *folio,
> > >  		schedule_work(&free_hpage_work);
> > >  }
> > > -static void update_and_free_pages_bulk(struct hstate *h, struct list_head *list)
> > > +static void bulk_vmemmap_restore_error(struct hstate *h,
> > > +					struct list_head *folio_list,
> > > +					struct list_head *non_hvo_folios)
> > >  {
> > >  	struct folio *folio, *t_folio;
> > > -	bool clear_dtor = false;
> > > -	/*
> > > -	 * First allocate required vmemmmap (if necessary) for all folios on
> > > -	 * list. If vmemmap can not be allocated, we can not free folio to
> > > -	 * lower level allocator, so add back as hugetlb surplus page.
> > > -	 * add_hugetlb_folio() removes the page from THIS list.
> > > -	 * Use clear_dtor to note if vmemmap was successfully allocated for
> > > -	 * ANY page on the list.
> > > -	 */
> > > -	list_for_each_entry_safe(folio, t_folio, list, lru) {
> > > -		if (folio_test_hugetlb_vmemmap_optimized(folio)) {
> > > +	if (!list_empty(non_hvo_folios)) {
> > > +		/*
> > > +		 * Free any restored hugetlb pages so that restore of the
> > > +		 * entire list can be retried.
> > > +		 * The idea is that in the common case of ENOMEM errors freeing
> > > +		 * hugetlb pages with vmemmap we will free up memory so that we
> > > +		 * can allocate vmemmap for more hugetlb pages.
> > > +		 */
> > > +		list_for_each_entry_safe(folio, t_folio, non_hvo_folios, lru) {
> > > +			list_del(&folio->lru);
> > > +			spin_lock_irq(&hugetlb_lock);
> > > +			__clear_hugetlb_destructor(h, folio);
> > > +			spin_unlock_irq(&hugetlb_lock);
> > > +			update_and_free_hugetlb_folio(h, folio, false);
> > > +			cond_resched();
> > > +		}
> > > +	} else {
> > > +		/*
> > > +		 * In the case where there are no folios which can be
> > > +		 * immediately freed, we loop through the list trying to restore
> > > +		 * vmemmap individually in the hope that someone elsewhere may
> > > +		 * have done something to cause success (such as freeing some
> > > +		 * memory). If unable to restore a hugetlb page, the hugetlb
> > > +		 * page is made a surplus page and removed from the list.
> > > +		 * If are able to restore vmemmap and free one hugetlb page, we
> > > +		 * quit processing the list to retry the bulk operation.
> > > +		 */
> > > +		list_for_each_entry_safe(folio, t_folio, folio_list, lru)
> > >  			if (hugetlb_vmemmap_restore(h, &folio->page)) {
> >
> > IIUC, the folio should be deleted from the folio list since this
> > huge page will be added to the free list (the list is corrupted),
> > right?
>
> Good catch! Yes, we should remove from the list here.
>
> I did exercise this specific code path and there was no list corruption. In
> any case, I will add the list_del().

Correction: my testing with simulated errors took the else branch here, so
it did not actually exercise the path in question.
--
Mike Kravetz