From: Mike Kravetz <mike.kravetz@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Jiaqi Yan <jiaqiyan@google.com>,
Naoya Horiguchi <naoya.horiguchi@linux.dev>,
Muchun Song <songmuchun@bytedance.com>,
Miaohe Lin <linmiaohe@huawei.com>,
Axel Rasmussen <axelrasmussen@google.com>,
James Houghton <jthoughton@google.com>,
Michal Hocko <mhocko@suse.com>,
Andrew Morton <akpm@linux-foundation.org>,
Mike Kravetz <mike.kravetz@oracle.com>
Subject: [PATCH v2 0/2] Fix hugetlb free path race with memory errors
Date: Mon, 17 Jul 2023 17:49:40 -0700
Message-ID: <20230718004942.113174-1-mike.kravetz@oracle.com>

This race window was discovered during discussion of Jiaqi Yan's series
"Improve hugetlbfs read on HWPOISON hugepages":
https://lore.kernel.org/linux-mm/20230616233447.GB7371@monkey/
Freeing a hugetlb page back to the low-level memory allocators is
performed in two steps:
1) Under the hugetlb lock, remove the page from hugetlb lists and clear
   the destructor.
2) Outside the lock, allocate vmemmap if necessary and call the
   low-level free routine.
Between these two steps, the hugetlb page appears to be a normal
compound page; however, vmemmap for the tail pages could be missing.
If a memory error occurs during this window, we could try to update
page flags in non-existent page structs.
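For illustration only, here is a minimal sketch of the pre-patch free
path and the window.  The function names loosely follow mm/hugetlb.c,
but this is simplified pseudo-kernel code, not the actual source:

/*
 * Illustrative sketch only -- not the pre-patch kernel code verbatim.
 * Mirrors the two steps described above.
 */
static void free_hugetlb_page_sketch(struct hstate *h, struct folio *folio)
{
	spin_lock_irq(&hugetlb_lock);
	/* Step 1: take the folio off hugetlb lists, clear destructor. */
	remove_hugetlb_folio(h, folio, false);
	spin_unlock_irq(&hugetlb_lock);

	/*
	 * RACE WINDOW: the folio now looks like a normal compound page,
	 * but if it was vmemmap optimized, the struct pages for most
	 * tail pages do not exist.  A memory error handler updating
	 * tail page flags here writes to non-existent page structs.
	 */

	/* Step 2: restore vmemmap (may sleep), then do the real free. */
	if (hugetlb_vmemmap_restore(h, &folio->page))
		return;	/* restore failed; error handling omitted */
	__free_pages(&folio->page, huge_page_order(h));
}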
A much more detailed description is in the first patch.
The first patch closes the race window.  However, it adds a
hugetlb_lock lock/unlock cycle to every vmemmap-optimized hugetlb
page free operation.  This is suboptimal, but hardly noticeable
on a mostly idle system (the normal case).
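A rough sketch of the fixed ordering in __update_and_free_hugetlb_folio
(again simplified; clear_dtor_sketch() is a hypothetical stand-in for
whatever helper clears the destructor under the lock):

/* Hypothetical helper name; illustrative only. */
static void clear_dtor_sketch(struct hstate *h, struct folio *folio)
{
	lockdep_assert_held(&hugetlb_lock);
	/* clear the hugetlb destructor here */
}

static void __update_and_free_hugetlb_folio_sketch(struct hstate *h,
						   struct folio *folio)
{
	/* First make all tail page structs real again. */
	if (hugetlb_vmemmap_restore(h, &folio->page))
		return;	/* restore failed; folio stays a hugetlb page */

	/*
	 * The extra lock cycle mentioned above: only now, with vmemmap
	 * fully populated, is it safe for the folio to appear as a
	 * normal compound page.
	 */
	spin_lock_irq(&hugetlb_lock);
	clear_dtor_sketch(h, folio);
	spin_unlock_irq(&hugetlb_lock);

	__free_pages(&folio->page, huge_page_order(h));
}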
The second patch optimizes update_and_free_pages_bulk to take the lock
only once per bulk operation.
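An illustrative sketch of that idea (simplified as before, reusing the
hypothetical clear_dtor_sketch() helper from above):

/* Illustrative only: one hugetlb_lock cycle for the whole list. */
static void update_and_free_pages_bulk_sketch(struct hstate *h,
					      struct list_head *list)
{
	struct folio *folio, *t_folio;

	/* Restore vmemmap for every folio without holding the lock. */
	list_for_each_entry(folio, list, lru)
		hugetlb_vmemmap_restore(h, &folio->page);

	/* One lock/unlock cycle to clear all destructors. */
	spin_lock_irq(&hugetlb_lock);
	list_for_each_entry(folio, list, lru)
		clear_dtor_sketch(h, folio);
	spin_unlock_irq(&hugetlb_lock);

	/* Finally, hand each page back to the low-level allocator. */
	list_for_each_entry_safe(folio, t_folio, list, lru) {
		list_del(&folio->lru);
		__free_pages(&folio->page, huge_page_order(h));
	}
}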
-> v2
- Used the more definitive folio_test_hugetlb() check to determine
  whether the destructor must be cleared.
- Added comment to clearly describe why and when we clear the
destructor in __update_and_free_hugetlb_folio.
- Clear destructor in hugetlb demote path.
- Do not send second patch to stable releases.
Mike Kravetz (2):
hugetlb: Do not clear hugetlb dtor until allocating vmemmap
hugetlb: optimize update_and_free_pages_bulk to avoid lock cycles
mm/hugetlb.c | 128 ++++++++++++++++++++++++++++++++++++++++-----------
1 file changed, 100 insertions(+), 28 deletions(-)
--
2.41.0