* Re: [PATCH v6 1/2] mm: migration: fix the FOLL_GET failure on following huge page

From: Michael Ellerman @ 2022-08-19 11:22 UTC
To: Mike Kravetz, Andrew Morton
Cc: linmiaohe@huawei.com, alex.sierra@amd.com, Vasily Gorbik,
  david@redhat.com, linuxppc-dev, Heiko Carstens, apopple@nvidia.com,
  linux-kernel@vger.kernel.org, linux-mm@kvack.org, Sven Schnelle,
  Huang, Ying, Wang, Haiyue, Alexander Gordeev, Christian Borntraeger,
  naoya.horiguchi@linux.dev, songmuchun@bytedance.com

Mike Kravetz <mike.kravetz@oracle.com> writes:
> On 08/16/22 22:43, Andrew Morton wrote:
>> On Wed, 17 Aug 2022 03:31:37 +0000 "Wang, Haiyue" <haiyue.wang@intel.com> wrote:
>>
>> > > > }
>> > >
>> > > It would be better to fix this for real at those three client code sites?
>> >
>> > Then 5.19 will break for a while to wait for the final BIG patch?
>>
>> If that's the proposal then your [1/2] should have had a cc:stable and
>> changelog words describing the plan for 6.0.
>>
>> But before we do that I'd like to see at least a prototype of the final
>> fixes to s390 and hugetlb, so we can assess those as preferable for
>> backporting. I don't think they'll be terribly intrusive or risky?
>
> I will start on adding follow_huge_pgd() support. Although, I may need
> some help with verification from the powerpc folks, as that is the only
> architecture which supports hugetlb pages at that level.
>
> mpe any suggestions?

I'm happy to test.

I have a system where I can allocate 1GB huge pages.

I'm not sure how to actually test this path though. I hacked up the
vm/migration.c test to allocate 1GB hugepages, but I can't see it going
through follow_huge_pgd() (using ftrace).

Maybe I hacked it up badly, I'll have a closer look on Monday. But if
you have any tips on how to trigger that path let me know :)

cheers
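A minimal sketch of the kind of migration-test hack described above, for
readers following along. This is illustrative only, not the actual
vm/migration.c selftest; the target NUMA node and the presence of a
reserved 1GB gigantic page (via
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages) are
assumptions, not something the thread specifies.

/*
 * Illustrative sketch, not the real selftest. Assumes a reserved 1GB
 * gigantic page and a second NUMA node (node 1 here is an assumption).
 * The kernel side can be watched with the function tracer, e.g.:
 *   echo follow_huge_pgd > /sys/kernel/tracing/set_ftrace_filter
 *   echo function > /sys/kernel/tracing/current_tracer
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_1GB
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)  /* log2(1GB) in the flag bits */
#endif
#define MPOL_MF_MOVE (1 << 1)                /* from linux/mempolicy.h */

int main(void)
{
	size_t sz = 1UL << 30;
	void *p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
		       -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(p, 0, sz);  /* fault the gigantic page in */

	int node = 1;      /* assumed migration target */
	int status = -1;
	/* move_pages() has to follow (and pin) the source page, which is
	 * where FOLL_GET on a huge page mattered in this thread */
	long ret = syscall(SYS_move_pages, 0, 1UL, &p, &node, &status,
			   MPOL_MF_MOVE);
	printf("move_pages ret=%ld status=%d\n", ret, status);
	return 0;
}

Whether this ever reaches follow_huge_pgd() depends on where the
architecture places gigantic pages in the page table, which is exactly
the question Mike answers below.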
* Re: [PATCH v6 1/2] mm: migration: fix the FOLL_GET failure on following huge page

From: Mike Kravetz @ 2022-08-19 16:55 UTC
To: Michael Ellerman
Cc: linmiaohe@huawei.com, alex.sierra@amd.com, Vasily Gorbik,
  david@redhat.com, linuxppc-dev, Heiko Carstens, apopple@nvidia.com,
  linux-kernel@vger.kernel.org, linux-mm@kvack.org, Sven Schnelle,
  Huang, Ying, Wang, Haiyue, Andrew Morton, Christian Borntraeger,
  naoya.horiguchi@linux.dev, Alexander Gordeev, songmuchun@bytedance.com

On 08/19/22 21:22, Michael Ellerman wrote:
> Mike Kravetz <mike.kravetz@oracle.com> writes:
> > On 08/16/22 22:43, Andrew Morton wrote:
> >> On Wed, 17 Aug 2022 03:31:37 +0000 "Wang, Haiyue" <haiyue.wang@intel.com> wrote:
> >>
> >> > > > }
> >> > >
> >> > > It would be better to fix this for real at those three client code sites?
> >> >
> >> > Then 5.19 will break for a while to wait for the final BIG patch?
> >>
> >> If that's the proposal then your [1/2] should have had a cc:stable and
> >> changelog words describing the plan for 6.0.
> >>
> >> But before we do that I'd like to see at least a prototype of the final
> >> fixes to s390 and hugetlb, so we can assess those as preferable for
> >> backporting. I don't think they'll be terribly intrusive or risky?
> >
> > I will start on adding follow_huge_pgd() support. Although, I may need
> > some help with verification from the powerpc folks, as that is the only
> > architecture which supports hugetlb pages at that level.
> >
> > mpe any suggestions?
>
> I'm happy to test.
>
> I have a system where I can allocate 1GB huge pages.
>
> I'm not sure how to actually test this path though. I hacked up the
> vm/migration.c test to allocate 1GB hugepages, but I can't see it going
> through follow_huge_pgd() (using ftrace).

I think you needed to use 16GB pages to trigger this code path. Anshuman
introduced support for page offline (and migration) at this level in
commit 94310cbcaa3c ("mm/madvise: enable (soft|hard) offline of HugeTLB
pages at PGD level"). When asked about the use case, he mentioned:

"Yes, it's in the context of 16GB pages on POWER8 systems where all the
gigantic pages are pre-allocated from the platform and passed on to
the kernel through the device tree. We don't allocate these gigantic
pages at runtime."
--
Mike Kravetz

> Maybe I hacked it up badly, I'll have a closer look on Monday. But if
> you have any tips on how to trigger that path let me know :)
>
> cheers
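For reference, the soft-offline path that commit 94310cbcaa3c enabled is
driven from userspace via madvise(). A rough sketch follows; it assumes
the default huge page size has been set to the system's gigantic size
(e.g. booting with default_hugepagesz=16G on such a POWER8) and that the
caller has CAP_SYS_ADMIN. Both assumptions go beyond what the thread
states.

/*
 * Rough sketch of poking the soft-offline path that 94310cbcaa3c
 * enabled for PGD-level HugeTLB pages. Illustrative only: assumes the
 * default hugepage size is the gigantic one and the process is
 * privileged.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_SOFT_OFFLINE
#define MADV_SOFT_OFFLINE 101  /* from asm-generic/mman-common.h */
#endif

int main(void)
{
	size_t sz = 1UL << 34;  /* 16GB; match the system's gigantic size */
	void *p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(p, 0, sz);  /* populate so there is content to migrate */

	/* Soft offline migrates the page's contents, then retires it */
	if (madvise(p, sz, MADV_SOFT_OFFLINE))
		perror("madvise(MADV_SOFT_OFFLINE)");
	return 0;
}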
* Re: [PATCH v6 1/2] mm: migration: fix the FOLL_GET failure on following huge page

From: Michael Ellerman @ 2022-08-26 13:07 UTC
To: Mike Kravetz
Cc: linmiaohe@huawei.com, alex.sierra@amd.com, Vasily Gorbik,
  david@redhat.com, linuxppc-dev, Heiko Carstens, apopple@nvidia.com,
  linux-kernel@vger.kernel.org, linux-mm@kvack.org, Aneesh Kumar K.V,
  Sven Schnelle, Huang, Ying, Wang, Haiyue, Andrew Morton,
  Christian Borntraeger, naoya.horiguchi@linux.dev, Alexander Gordeev,
  songmuchun@bytedance.com

Mike Kravetz <mike.kravetz@oracle.com> writes:
> On 08/19/22 21:22, Michael Ellerman wrote:
>> Mike Kravetz <mike.kravetz@oracle.com> writes:
>> > On 08/16/22 22:43, Andrew Morton wrote:
>> >> On Wed, 17 Aug 2022 03:31:37 +0000 "Wang, Haiyue" <haiyue.wang@intel.com> wrote:
>> >>
>> >> > > > }
>> >> > >
>> >> > > It would be better to fix this for real at those three client code sites?
>> >> >
>> >> > Then 5.19 will break for a while to wait for the final BIG patch?
>> >>
>> >> If that's the proposal then your [1/2] should have had a cc:stable and
>> >> changelog words describing the plan for 6.0.
>> >>
>> >> But before we do that I'd like to see at least a prototype of the final
>> >> fixes to s390 and hugetlb, so we can assess those as preferable for
>> >> backporting. I don't think they'll be terribly intrusive or risky?
>> >
>> > I will start on adding follow_huge_pgd() support. Although, I may need
>> > some help with verification from the powerpc folks, as that is the only
>> > architecture which supports hugetlb pages at that level.
>> >
>> > mpe any suggestions?
>>
>> I'm happy to test.
>>
>> I have a system where I can allocate 1GB huge pages.
>>
>> I'm not sure how to actually test this path though. I hacked up the
>> vm/migration.c test to allocate 1GB hugepages, but I can't see it going
>> through follow_huge_pgd() (using ftrace).
>
> I think you needed to use 16GB pages to trigger this code path. Anshuman
> introduced support for page offline (and migration) at this level in
> commit 94310cbcaa3c ("mm/madvise: enable (soft|hard) offline of HugeTLB
> pages at PGD level"). When asked about the use case, he mentioned:
>
> "Yes, it's in the context of 16GB pages on POWER8 systems where all the
> gigantic pages are pre-allocated from the platform and passed on to
> the kernel through the device tree. We don't allocate these gigantic
> pages at runtime."

That was true, but isn't anymore. I must have been insufficiently
caffeinated the other day.

On our newer machines 1GB is the largest huge page size, but it's
obviously way too small to sit at the PGD level. So that was a waste of
my time :)

We used to support 16GB at the PGD level, but we reworked the page table
geometry a few years ago, and now they sit at the PUD level on machines
that support 16GB pages:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ba95b5d0359609b4ec8010f77c40ab3c595a6ac6

Note the author :}

So the good news is we no longer have any configuration where a huge
page entry is expected in the PGD. So we can drop our pgd_huge()
definitions, and ours are the last non-zero definitions, so it can all
go away I think.

I'll send a patch to remove the powerpc pgd_huge() definitions after
I've run it through some tests.

cheers
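To see why dropping the last non-zero definitions is enough, note that
the generic header already supplies a zero fallback. The shape of the
relevant code, paraphrased rather than quoted from the tree of that era,
is roughly the following:

/*
 * Paraphrased sketch, not a verbatim copy of the v6.0-era sources.
 * The generic fallback in include/linux/hugetlb.h looks like this:
 */
#ifndef pgd_huge
#define pgd_huge(x)	0
#endif

/*
 * With every architecture falling through to that zero, a caller such
 * as the follow-page walk in mm/gup.c, roughly
 *
 *	if (pgd_huge(*pgd))
 *		page = follow_huge_pgd(mm, address, pgd, flags);
 *
 * reduces to a constant-false branch, so follow_huge_pgd() itself
 * becomes dead code once the powerpc definitions are gone.
 */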