From: Matthew Wilcox <willy@infradead.org>
To: Will Deacon <will@kernel.org>
Cc: Nanyong Sun <sunnanyong@huawei.com>,
Catalin Marinas <catalin.marinas@arm.com>,
mike.kravetz@oracle.com, muchun.song@linux.dev,
akpm@linux-foundation.org, anshuman.khandual@arm.com,
wangkefeng.wang@huawei.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
Date: Wed, 7 Feb 2024 11:21:17 +0000
Message-ID: <ZcNnrdlb3fe0kGHK@casper.infradead.org>
In-Reply-To: <20240207111252.GA22167@willie-the-truck>
On Wed, Feb 07, 2024 at 11:12:52AM +0000, Will Deacon wrote:
> On Sat, Jan 27, 2024 at 01:04:15PM +0800, Nanyong Sun wrote:
> >
> > On 2024/1/26 2:06, Catalin Marinas wrote:
> > > On Sat, Jan 13, 2024 at 05:44:33PM +0800, Nanyong Sun wrote:
> > > > HVO was previously disabled on arm64 [1] due to the lack of necessary
> > > > BBM(break-before-make) logic when changing page tables.
> > > > This set of patches fix this by adding necessary BBM sequence when
> > > > changing page table, and supporting vmemmap page fault handling to
> > > > fixup kernel address translation fault if vmemmap is concurrently accessed.
> > > I'm not keen on this approach. I'm not even sure it's safe. In the
> > > second patch, you take the init_mm.page_table_lock on the fault path but
> > > are we sure this is unlocked when the fault was taken?
> > I think this situation is impossible. In the implementation of the second
> > patch, when the page table is being corrupted
> > (the time window when a page fault may occur), vmemmap_update_pte() already
> > holds the init_mm.page_table_lock,
> > and does not unlock it until the page table update is done. Another thread could not hold
> > the init_mm.page_table_lock and
> > also trigger a page fault at the same time.
> > If I have missed any points in my thinking, please correct me. Thank you.
>
> It still strikes me as incredibly fragile to handle the fault and trying
> to reason about all the users of 'struct page' is impossible. For example,
> can the fault happen from irq context?
The pte lock cannot be taken in irq context (which I think is what
you're asking?) While it is not possible to reason about all users of
struct page, we are somewhat relieved of that work by noting that this is
only for hugetlbfs, so we don't need to reason about slab, page tables,
netmem or zsmalloc.
> If we want to optimise the vmemmap mapping for arm64, I think we need to
> consider approaches which avoid the possibility of the fault altogether.
> It's more complicated to implement, but I think it would be a lot more
> robust.
>
> Andrew -- please can you drop these from -next?
>
> Thanks,
>
> Will