From: Oscar Salvador <osalvador@suse.de>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: akpm@linux-foundation.org, muchun.song@linux.dev,
	david@redhat.com, linmiaohe@huawei.com, naoya.horiguchi@nec.com,
	mhocko@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/3] mm: hugetlb: make the hugetlb migration strategy consistent
Date: Wed, 20 Mar 2024 11:06:45 +0100
Message-ID: <Zfq1NWzgpR-msYlg@localhost.localdomain>
In-Reply-To: <3519fcd41522817307a05b40fb551e2e17e68101.1709719720.git.baolin.wang@linux.alibaba.com>

On Wed, Mar 06, 2024 at 06:13:27PM +0800, Baolin Wang wrote:
> As discussed in a previous thread [1], there is an inconsistency when handling
> hugetlb migration. When handling the migration of freed hugetlb, it prevents
> fallback to other NUMA nodes in alloc_and_dissolve_hugetlb_folio(). However,
> when dealing with in-use hugetlb, it allows fallback to other NUMA nodes in
> alloc_hugetlb_folio_nodemask(), which can break the per-node hugetlb pool
> and might result in unexpected failures when node-bound workloads don't get
> what is assumed to be available.
> 
> To make the hugetlb migration strategy clearer, we should list all the scenarios
> of hugetlb migration and analyze whether allocation fallback is permitted:
> 1) Memory offline: will call dissolve_free_huge_pages() to free the freed hugetlb,
> and call do_migrate_range() to migrate the in-use hugetlb. Both can break the
> per-node hugetlb pool, but as this is an explicit offlining operation, there is
> no better choice. So we should allow the hugetlb allocation fallback.
> 2) Memory failure: same as memory offline. We should allow fallback, since a
> different node might be the only option to handle it; otherwise the impact of
> poisoned memory can be amplified.
> 3) Longterm pinning: will call migrate_longterm_unpinnable_pages() to migrate in-use
> and not-longterm-pinnable hugetlb, which can break the per-node pool. But we should
> fail the longterm pinning if we cannot allocate on the current node, to avoid
> breaking the per-node pool.
> 4) Syscalls (mbind, migrate_pages, move_pages): these are explicit user operations
> to move pages to other nodes, so fallback to other nodes should not be prohibited.
> 5) alloc_contig_range: used by CMA allocation and virtio-mem fake-offline to allocate
> a given range of pages. Currently the freed hugetlb migration is not allowed to fall
> back; to stay consistent, the in-use hugetlb migration should also not be allowed
> to fall back.
> 6) alloc_contig_pages: used by kfence, pgtable_debug etc. The strategy should be
> consistent with that of alloc_contig_range().
> 
> Based on the analysis of the various scenarios above, introduce a new helper to
> determine whether fallback is permitted according to the migration reason.
> 
> [1] https://lore.kernel.org/all/6f26ce22d2fcd523418a085f2c588fe0776d46e7.1706794035.git.baolin.wang@linux.alibaba.com/
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> +static inline bool htlb_allow_alloc_fallback(int reason)
> +{
> +	bool allowed_fallback = false;
> +
> +	/*
> +	 * Note: the memory offline, memory failure and migration syscalls will
> +	 * be allowed to fallback to other nodes due to lack of a better chioce,
                                                                         ^
									 choice
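
For readers skimming the thread, the six scenarios in the commit message
boil down to a small decision table. Below is a minimal userspace sketch of
that table, not the helper from the patch: the enum, its values and the
function name are all hypothetical and only mirror the wording of the
changelog (scenarios 5 and 6 share one entry, since they follow the same
strategy).

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the migration scenarios listed in the
 * changelog. These names are invented for this sketch and are not the
 * identifiers used by the kernel or by the patch.
 */
enum sketch_reason {
	SKETCH_MEMORY_OFFLINE,	/* scenario 1 */
	SKETCH_MEMORY_FAILURE,	/* scenario 2 */
	SKETCH_LONGTERM_PIN,	/* scenario 3 */
	SKETCH_SYSCALL,		/* scenario 4: mbind, migrate_pages, move_pages */
	SKETCH_CONTIG_RANGE,	/* scenarios 5 and 6 */
};

/*
 * Return true when, per the changelog, allocating the migration target on
 * another NUMA node is acceptable even though it may break the per-node
 * hugetlb pool.
 */
bool sketch_allow_fallback(enum sketch_reason reason)
{
	switch (reason) {
	case SKETCH_MEMORY_OFFLINE:
	case SKETCH_MEMORY_FAILURE:
	case SKETCH_SYSCALL:
		/* Explicit operation or no better choice: fallback allowed. */
		return true;
	default:
		/* Keep the per-node pool intact: fail rather than fall back. */
		return false;
	}
}
```

Defaulting to "no fallback" means any scenario not explicitly listed keeps
the per-node pool intact, which matches the conservative direction the
changelog argues for.
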
-- 
Oscar Salvador
SUSE Labs

Thread overview:
2024-03-06 10:13 [PATCH v2 0/3] make the hugetlb migration strategy consistent Baolin Wang
2024-03-06 10:13 ` [PATCH v2 1/3] mm: record the migration reason for struct migration_target_control Baolin Wang
2024-03-06 10:13 ` [PATCH v2 2/3] mm: hugetlb: make the hugetlb migration strategy consistent Baolin Wang
2024-03-20 10:06   ` Oscar Salvador [this message]
2024-03-06 10:13 ` [PATCH v2 3/3] docs: hugetlbpage.rst: add hugetlb migration description Baolin Wang