Re: [PATCH v10 04/11] mm/hugetlb: Defer freeing of HugeTLB pages

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Oscar Salvador <osalvador@suse.de>
To: Muchun Song <songmuchun@bytedance.com>
Cc: corbet@lwn.net, mike.kravetz@oracle.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
	dave.hansen@linux.intel.com, luto@kernel.org,
	peterz@infradead.org, viro@zeniv.linux.org.uk,
	akpm@linux-foundation.org, paulmck@kernel.org,
	mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com,
	rdunlap@infradead.org, oneukum@suse.com,
	anshuman.khandual@arm.com, jroedel@suse.de,
	almasrymina@google.com, rientjes@google.com, willy@infradead.org,
	mhocko@suse.com, song.bao.hua@hisilicon.com, david@redhat.com,
	naoya.horiguchi@nec.com, duanxiongchun@bytedance.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v10 04/11] mm/hugetlb: Defer freeing of HugeTLB pages
Date: Mon, 21 Dec 2020 11:27:07 +0100	[thread overview]
Message-ID: <20201221102703.GA15804@linux> (raw)
In-Reply-To: <20201217121303.13386-5-songmuchun@bytedance.com>

On Thu, Dec 17, 2020 at 08:12:56PM +0800, Muchun Song wrote:
> In the subsequent patch, we will allocate the vmemmap pages when free
> HugeTLB pages. But update_and_free_page() is called from a non-task
> context(and hold hugetlb_lock), so we can defer the actual freeing in
> a workqueue to prevent from using GFP_ATOMIC to allocate the vmemmap
> pages.

I think we would benefit from a more complete changelog, at least I had
to stare at the code for a while in order to grasp what are we trying
to do and the reasons behind.

> +static void __free_hugepage(struct hstate *h, struct page *page);
> +
> +/*
> + * As update_and_free_page() is be called from a non-task context(and hold
> + * hugetlb_lock), we can defer the actual freeing in a workqueue to prevent
> + * use GFP_ATOMIC to allocate a lot of vmemmap pages.

The above implies that update_and_free_page() is __always__ called from a 
non-task context, but that is not always the case?

> +static void update_hpage_vmemmap_workfn(struct work_struct *work)
>  {
> -	int i;
> +	struct llist_node *node;
> +	struct page *page;
>  
> +	node = llist_del_all(&hpage_update_freelist);
> +
> +	while (node) {
> +		page = container_of((struct address_space **)node,
> +				     struct page, mapping);
> +		node = node->next;
> +		page->mapping = NULL;
> +		__free_hugepage(page_hstate(page), page);
> +
> +		cond_resched();
> +	}
> +}
> +static DECLARE_WORK(hpage_update_work, update_hpage_vmemmap_workfn);

I wonder if this should be moved to hugetlb_vmemmap.c

> +/*
> + * This is where the call to allocate vmemmmap pages will be inserted.
> + */

I think this should go in the changelog.

> +static void __free_hugepage(struct hstate *h, struct page *page)
> +{
> +	int i;
> +
>  	for (i = 0; i < pages_per_huge_page(h); i++) {
>  		page[i].flags &= ~(1 << PG_locked | 1 << PG_error |
>  				1 << PG_referenced | 1 << PG_dirty |
> @@ -1313,13 +1377,17 @@ static void update_and_free_page(struct hstate *h, struct page *page)
>  	set_page_refcounted(page);
>  	if (hstate_is_gigantic(h)) {
>  		/*
> -		 * Temporarily drop the hugetlb_lock, because
> -		 * we might block in free_gigantic_page().
> +		 * Temporarily drop the hugetlb_lock only when this type of
> +		 * HugeTLB page does not support vmemmap optimization (which
> +		 * context do not hold the hugetlb_lock), because we might
> +		 * block in free_gigantic_page().

"
 /*
  * Temporarily drop the hugetlb_lock, because we might block
  * in free_gigantic_page(). Only drop it in case the vmemmap
  * optimization is disabled, since that context does not hold
  * the lock.
  */
" ?

 
Oscar Salvador
SUSE L3

next prev parent reply	other threads:[~2020-12-21 10:28 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-17 12:12 [PATCH v10 00/11] Free some vmemmap pages of HugeTLB page Muchun Song
2020-12-17 12:12 ` [PATCH v10 01/11] mm/memory_hotplug: Factor out bootmem core functions to bootmem_info.c Muchun Song
2020-12-17 12:12 ` [PATCH v10 02/11] mm/hugetlb: Introduce a new config HUGETLB_PAGE_FREE_VMEMMAP Muchun Song
2020-12-18 15:41   ` Oscar Salvador
2020-12-21 23:56   ` Mike Kravetz
2020-12-17 12:12 ` [PATCH v10 03/11] mm/hugetlb: Free the vmemmap pages associated with each HugeTLB page Muchun Song
2020-12-21  9:11   ` Oscar Salvador
2020-12-21 11:25     ` [External] " Muchun Song
2020-12-21 13:43       ` Oscar Salvador
2020-12-21 15:52         ` Muchun Song
2020-12-21 18:00           ` Oscar Salvador
2020-12-22  1:03             ` Mike Kravetz
2020-12-22  2:49             ` Muchun Song
2020-12-17 12:12 ` [PATCH v10 04/11] mm/hugetlb: Defer freeing of HugeTLB pages Muchun Song
2020-12-21 10:27   ` Oscar Salvador [this message]
2020-12-21 11:07     ` [External] " Muchun Song
2020-12-21 14:14       ` Oscar Salvador
2020-12-21 15:18         ` Muchun Song
2020-12-17 12:12 ` [PATCH v10 05/11] mm/hugetlb: Allocate the vmemmap pages associated with each HugeTLB page Muchun Song
2020-12-17 12:12 ` [PATCH v10 06/11] mm/hugetlb: Set the PageHWPoison to the raw error page Muchun Song
2020-12-17 12:12 ` [PATCH v10 07/11] mm/hugetlb: Flush work when dissolving hugetlb page Muchun Song
2020-12-21 10:40   ` Oscar Salvador
2020-12-21 11:07     ` [External] " Muchun Song
2020-12-17 12:13 ` [PATCH v10 08/11] mm/hugetlb: Add a kernel parameter hugetlb_free_vmemmap Muchun Song
2020-12-17 12:13 ` [PATCH v10 09/11] mm/hugetlb: Introduce nr_free_vmemmap_pages in the struct hstate Muchun Song
2020-12-21  8:16   ` Oscar Salvador
2020-12-21  9:33     ` [External] " Muchun Song
2020-12-17 12:13 ` [PATCH v10 10/11] mm/hugetlb: Gather discrete indexes of tail page Muchun Song
2020-12-18  9:06   ` Oscar Salvador
2020-12-18  9:41     ` [External] " Muchun Song
2020-12-17 12:13 ` [PATCH v10 11/11] mm/hugetlb: Optimize the code with the help of the compiler Muchun Song
2020-12-17 12:17 ` [PATCH v10 00/11] Free some vmemmap pages of HugeTLB page David Hildenbrand
2020-12-17 14:59 ` Oscar Salvador
2020-12-17 15:52   ` [External] " Muchun Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201221102703.GA15804@linux \
    --to=osalvador@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=anshuman.khandual@arm.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=duanxiongchun@bytedance.com \
    --cc=hpa@zytor.com \
    --cc=jroedel@suse.de \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mchehab+huawei@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=oneukum@suse.com \
    --cc=paulmck@kernel.org \
    --cc=pawan.kumar.gupta@linux.intel.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=rientjes@google.com \
    --cc=song.bao.hua@hisilicon.com \
    --cc=songmuchun@bytedance.com \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.