From: Matthew Brost <matthew.brost@intel.com>
To: Balbir Singh <balbirs@nvidia.com>
Cc: "Mika Penttilä" <mpenttil@redhat.com>,
dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org,
"Andrew Morton" <akpm@linux-foundation.org>,
"David Hildenbrand" <david@redhat.com>, "Zi Yan" <ziy@nvidia.com>,
"Joshua Hahn" <joshua.hahnjy@gmail.com>,
"Rakie Kim" <rakie.kim@sk.com>,
"Byungchul Park" <byungchul@sk.com>,
"Gregory Price" <gourry@gourry.net>,
"Ying Huang" <ying.huang@linux.alibaba.com>,
"Alistair Popple" <apopple@nvidia.com>,
"Oscar Salvador" <osalvador@suse.de>,
"Lorenzo Stoakes" <lorenzo.stoakes@oracle.com>,
"Baolin Wang" <baolin.wang@linux.alibaba.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
"Nico Pache" <npache@redhat.com>,
"Ryan Roberts" <ryan.roberts@arm.com>,
"Dev Jain" <dev.jain@arm.com>, "Barry Song" <baohua@kernel.org>,
"Lyude Paul" <lyude@redhat.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"David Airlie" <airlied@gmail.com>,
"Simona Vetter" <simona@ffwll.ch>,
"Ralph Campbell" <rcampbell@nvidia.com>,
"Francois Dugast" <francois.dugast@intel.com>
Subject: Re: [v3 03/11] mm/migrate_device: THP migration of zone device pages
Date: Thu, 28 Aug 2025 16:14:31 -0700
Message-ID: <aLDi16JRf2VRWVI+@lstrano-desk.jf.intel.com>
In-Reply-To: <e43b3eff-6d9e-4e6e-a0a2-9537e669aa82@nvidia.com>
On Thu, Aug 21, 2025 at 08:24:00PM +1000, Balbir Singh wrote:
> On 8/15/25 10:04, Matthew Brost wrote:
> > On Fri, Aug 15, 2025 at 08:51:21AM +1000, Balbir Singh wrote:
> >> On 8/13/25 10:07, Mika Penttilä wrote:
> >>>
> >>> On 8/13/25 02:36, Balbir Singh wrote:
> >>>
> >>>> On 8/12/25 15:35, Mika Penttilä wrote:
> >>>>> Hi,
> >>>>>
> >>>>> On 8/12/25 05:40, Balbir Singh wrote:
> ...
>
> >> I've not run into this in my testing; let me try with more mTHP sizes enabled. I'll wait for Matthew
> >> to post his test case, or any results or issues seen.
> >>
> >
> > I’ve hit this. In the code I shared privately, I split THPs in the
> > page-collection path. You omitted that in v2 and v3; I believe you’ll
> > need those changes. The code I'm referring to had the below comment.
> >
> > /*
> >  * XXX: No clean way to support higher-order folios that don't match PMD
> >  * boundaries for now; split them instead. Once mTHP support lands, add
> >  * proper support for this case.
> >  *
> >  * The test that exposed this as problematic remapped (mremap) a
> >  * large folio to an unaligned address, resulting in the folio being
> >  * found in the middle of the PTEs. The requested number of pages was
> >  * less than the folio size. Likely to be handled gracefully by upper
> >  * layers eventually, but not yet.
> >  */
> >
> > I triggered it by doing some odd mremap operations, which caused the CPU
> > page-fault handler to spin indefinitely iirc. In that case, a large device
> > folio had been moved into the middle of a PMD.
> >
> > Upstream could see the same problem if the device fault handler enforces
> > a must-migrate-to-device policy and mremap moves a large CPU folio into
> > the middle of a PMD.
> >
> > I’m in the middle of other work; when I circle back, I’ll try to create
> > a selftest to reproduce this. My current test is a fairly convoluted IGT
> > with a bunch of threads doing remap nonsense, but I’ll try to distill it
> > into a concise selftest.
> >
>
> I ran into this while doing some testing as well. I fixed it in a manner similar
> to split_folio() for partial unmaps. I will consolidate the folio splits into
> a single helper and post it with v4.
>
I created a selftest for this one. I'm going to send these over along with
the fixes I've applied to v3. Please include my selftests in v4.
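
As a rough illustration of the condition described in the quoted comment
(a large folio found at a non-PMD-aligned address, or fewer pages requested
than the folio holds), here is a small userspace model of the "split
instead" decision. It is only a sketch: the sketch_* names, the 4 KiB page
size, and the 2 MiB PMD size are assumptions made for illustration. It is
neither the kernel code under discussion nor the selftest mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration only: 4 KiB base pages, 2 MiB PMD mappings. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PMD_SIZE  (512UL * SKETCH_PAGE_SIZE)

/* Hypothetical helper: is the virtual address on a PMD boundary? */
static inline bool sketch_pmd_aligned(uintptr_t addr)
{
	return (addr & (SKETCH_PMD_SIZE - 1)) == 0;
}

/*
 * Hypothetical model of the "split instead" decision: a large folio is
 * split when it is found at a non-PMD-aligned address, when it is not
 * PMD-sized, or when fewer pages were requested than the folio holds.
 */
static inline bool sketch_should_split(uintptr_t addr, size_t folio_pages,
				       size_t requested_pages)
{
	if (folio_pages <= 1)
		return false;	/* order-0 folio: nothing to split */
	if (!sketch_pmd_aligned(addr))
		return true;	/* folio starts in the middle of a PMD */
	if (folio_pages * SKETCH_PAGE_SIZE != SKETCH_PMD_SIZE)
		return true;	/* smaller large folio, not PMD-mappable */
	return requested_pages < folio_pages;	/* partial range requested */
}
```

In this model, a 512-page folio at a 2 MiB-aligned address with all 512 pages
requested is left intact, while the same folio moved one page past the
boundary by mremap would be split.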
Matt
>
> Balbir Singh