From: Oscar Salvador <osalvador@suse.de>
To: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	David Hildenbrand <david@redhat.com>,
	Muchun Song <songmuchun@bytedance.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/2] mm: Make alloc_contig_range handle free hugetlb pages
Date: Fri, 26 Feb 2021 10:45:14 +0100
Message-ID: <20210226094507.GA3240@linux>
In-Reply-To: <YDiyvQ2SCXxCjPJ2@dhcp22.suse.cz>

On Fri, Feb 26, 2021 at 09:35:09AM +0100, Michal Hocko wrote:
> I think it would be helpful to call out that specific case explicitly
> here. I can see only one scenario (are there more?)
> __free_huge_page()		isolate_or_dissolve_huge_page
> 				  PageHuge() == T
> 				  alloc_and_dissolve_huge_page
> 				    alloc_fresh_huge_page()
> 				    spin_lock(hugetlb_lock)
> 				    // PageHuge() && !PageHugeFreed &&
> 				    // !PageCount()
> 				    spin_unlock(hugetlb_lock)
>   spin_lock(hugetlb_lock)
>   1) update_and_free_page
>        PageHuge() == F
>        __free_pages()
>   2) enqueue_huge_page
>        SetPageHugeFreed()
>   spin_unlock(&hugetlb_lock)

I do not think there are more scenarios. What we have to hit is a page with
!page_count && !PageHugeFreed.
AFAICS, this can only happen after
put_page->put_page_testzero->..->_free_huge_page gets called, but before
__free_huge_page has reached enqueue_huge_page (as you just described above).
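
To make the states in the diagram above concrete, here is a minimal
userspace sketch. It is not kernel code: struct model_page, the enum and
model_classify() are hypothetical names invented for illustration, and the
three fields only stand in for PageHuge(), PageHugeFreed() and page_count().

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the page state discussed above. */
struct model_page {
	bool huge;		/* models PageHuge() */
	bool huge_freed;	/* models PageHugeFreed(): page is on the free list */
	int refcount;		/* models page_count() */
};

enum model_state {
	MODEL_NOT_HUGE,		/* already dissolved by someone else */
	MODEL_FREE,		/* refcount is zero and the page was enqueued */
	MODEL_IN_FLIGHT,	/* refcount is zero, enqueue not reached yet */
	MODEL_IN_USE,		/* somebody still holds a reference */
};

/* Classify a page the way the race description above distinguishes it. */
static enum model_state model_classify(const struct model_page *p)
{
	if (!p->huge)
		return MODEL_NOT_HUGE;
	if (p->refcount > 0)
		return MODEL_IN_USE;
	return p->huge_freed ? MODEL_FREE : MODEL_IN_FLIGHT;
}
```

MODEL_IN_FLIGHT is the only state that corresponds to the window between the
refcount dropping to zero and enqueue_huge_page() setting the freed flag.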

By calling out this case, do you mean describing it in the changelog?

> 
> > In this case we retry, as the race window is quite small and we have a
> > high chance of succeeding next time.
> > 
> > With regard to the allocation, we restrict it to the node the page belongs
> > to with __GFP_THISNODE, meaning we do not fall back to other nodes' zones.
> > 
> > Note that gigantic hugetlb pages are fenced off since there is a cyclic
> > dependency between them and alloc_contig_range.
> > 
> > Signed-off-by: Oscar Salvador <osalvador@suse.de>
> 
> Thanks, this looks much better than the initial version. One nit below.
> Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!
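
The retry behaviour mentioned in the quoted changelog could be modelled in
userspace roughly as below. This is only a sketch under assumptions: the
bounded max_tries loop, the relax hook and all model_* names are hypothetical
and not taken from the patch, and the node restriction via __GFP_THISNODE is
not modelled at all.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a huge page whose refcount may have dropped. */
struct model_page {
	bool huge_freed;	/* models PageHugeFreed() */
	int refcount;		/* models page_count() */
};

/* Hypothetical hook: give the racing free path a chance to make progress. */
typedef void (*model_relax_fn)(struct model_page *p);

/* Example hook: pretend the freeing side has now enqueued the page. */
static void model_finish_free(struct model_page *p)
{
	p->huge_freed = true;
}

/* One attempt: 0 if the page is free, -EBUSY if in use, -EAGAIN if in flight. */
static int model_try_dissolve(const struct model_page *p)
{
	if (p->refcount > 0)
		return -EBUSY;
	if (!p->huge_freed)
		return -EAGAIN;
	return 0;
}

/* Retry only while the page is in the short in-flight window. */
static int model_dissolve_retry(struct model_page *p, int max_tries,
				model_relax_fn relax)
{
	int ret = -EAGAIN;

	for (int i = 0; i < max_tries && ret == -EAGAIN; i++) {
		ret = model_try_dissolve(p);
		if (ret == -EAGAIN && relax)
			relax(p);
	}
	return ret;
}
```

With model_finish_free() as the hook, the second attempt finds the page on
the free list and succeeds; without a hook the loop gives up with -EAGAIN.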

> > +bool isolate_or_dissolve_huge_page(struct page *page)
> > +{
> > +	struct hstate *h = NULL;
> > +	struct page *head;
> > +	bool ret = false;
> > +
> > +	spin_lock(&hugetlb_lock);
> > +	if (PageHuge(page)) {
> > +		head = compound_head(page);
> > +		h = page_hstate(head);
> > +	}
> > +	spin_unlock(&hugetlb_lock);
> > +
> > +	/*
> > +	 * The page might have been dissolved from under our feet.
> > +	 * If that is the case, return success as if we dissolved it ourselves.
> > +	 */
> > +	if (!h)
> > +		return true;
> 
> I am still fighting with this construct a bit. It is not really clear
> what the lock is protecting us from here. alloc_fresh_huge_page deals
> with all potential races and this looks like an optimistic check to save
> some work. But in fact the lock is really necessary for correctness
> because hstate might be completely bogus without the lock or us holding
> a reference on the page. The following construct would be more
> explicit and compact. What do you think?
> 
> 	struct hstate *h;
> 
> 	/*
> 	 * The page might have been dissolved from under our feet,
> 	 * so make sure to carefully check the state under the lock.
> 	 * Return success when racing as if we dissolved the page
> 	 * ourselves.
> 	 */
> 	spin_lock(&hugetlb_lock);
> 	if (PageHuge(page)) {
> 		head = compound_head(page);
> 		h = page_hstate(head);
> 	} else {
> 		spin_unlock(&hugetlb_lock);
> 		return true;
> 	}
> 	spin_unlock(&hugetlb_lock);

Yes, I find this less of an eyesore.

I will fix it up in v4.
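
For readers following along, the shape of the suggested construct (check the
state under the lock, bail out early if we raced) can be tried in userspace.
The sketch below is an illustration only: a pthread mutex stands in for
hugetlb_lock, and struct model_page, struct model_hstate and
model_lookup_hstate() are hypothetical names, not kernel APIs.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct hstate and the relevant page state. */
struct model_hstate {
	unsigned int order;
};

struct model_page {
	bool huge;			/* models PageHuge() */
	struct model_hstate *hstate;	/* models page_hstate(head) */
};

/* Stands in for hugetlb_lock. */
static pthread_mutex_t model_hugetlb_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Look up the hstate while holding the lock. Returns false and leaves *out
 * untouched if the page is no longer huge, so the caller can treat the race
 * as if it had dissolved the page itself.
 */
static bool model_lookup_hstate(struct model_page *page,
				struct model_hstate **out)
{
	pthread_mutex_lock(&model_hugetlb_lock);
	if (!page->huge) {
		pthread_mutex_unlock(&model_hugetlb_lock);
		return false;
	}
	*out = page->hstate;
	pthread_mutex_unlock(&model_hugetlb_lock);
	return true;
}
```

The early return keeps the "raced with a dissolve" case next to the check
that detects it, so no NULL-initialised hstate has to be tested later.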

-- 
Oscar Salvador
SUSE L3


Thread overview: 25+ messages
2021-02-22 13:51 [PATCH v3 0/2] Make alloc_contig_range handle Hugetlb pages Oscar Salvador
2021-02-22 13:51 ` [PATCH v3 1/2] mm: Make alloc_contig_range handle free hugetlb pages Oscar Salvador
2021-02-25 20:03   ` Mike Kravetz
2021-02-26  9:48     ` Oscar Salvador
2021-02-26  8:35   ` Michal Hocko
2021-02-26  8:38     ` Michal Hocko
2021-02-26  9:25       ` David Hildenbrand
2021-02-26  9:47         ` Oscar Salvador
2021-02-26  9:45     ` Oscar Salvador [this message]
2021-02-26  9:51       ` Michal Hocko
2021-03-01 14:09   ` David Hildenbrand
2021-03-04 10:19     ` Oscar Salvador
2021-03-04 10:32       ` David Hildenbrand
2021-03-04 10:41         ` Oscar Salvador
2021-02-22 13:51 ` [PATCH v3 2/2] mm: Make alloc_contig_range handle in-use " Oscar Salvador
2021-02-25 23:05   ` Mike Kravetz
2021-02-26  8:46   ` Michal Hocko
2021-02-26 10:24     ` Oscar Salvador
2021-02-26 10:27       ` Oscar Salvador
2021-02-26 12:46       ` Michal Hocko
2021-02-28 13:43         ` Oscar Salvador
2021-03-05 17:30   ` David Hildenbrand
2021-03-01 12:43 ` [PATCH v3 0/2] Make alloc_contig_range handle Hugetlb pages David Hildenbrand
2021-03-01 12:57   ` Oscar Salvador
2021-03-01 12:59     ` David Hildenbrand
