public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
From: Ryan Roberts <ryan.roberts@arm.com>
To: Yu Zhao <yuzhao@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	Yin Fengwei <fengwei.yin@intel.com>,
	David Hildenbrand <david@redhat.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Yang Shi <shy828301@gmail.com>,
	"Huang, Ying" <ying.huang@intel.com>, Zi Yan <ziy@nvidia.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Itaru Kitayama <itaru.kitayama@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v4 2/5] mm: LARGE_ANON_FOLIO for improved performance
Date: Wed, 9 Aug 2023 17:08:32 +0100	[thread overview]
Message-ID: <e80cd2e6-6f7c-4337-a170-152926863290@arm.com> (raw)
In-Reply-To: <CAOUHufbaWjOq4BnQasSeZyq2SZKQZ0d7DQ7mkj7Ses8hFAR5uw@mail.gmail.com>

[...]

>>>> Let me reiterate [1]:
>>>>   My impression is we only agreed on one thing: at the current stage, we
>>>>   should respect things we absolutely have to. We didn't agree on what
>>>>   "never" means ("never 2MB" or "never >4KB"), and we didn't touch on
>>>>   how "always" should behave at all.
>>>>
>>>> And [2]:
>>>>   (Thanks to David, now I agree that) we have to interpret MADV_NOHUGEPAGE
>>>>   as nothing >4KB.
>>>>
>>>> My final take [3]:
>>>>   I agree these points require more discussion. But I don't think we
>>>>   need to conclude them now, unless they cause correctness issues like
>>>>   ignoring MADV_NOHUGEPAGE would.
>>>
>>> Thanks, I've read all of these comments previously, and appreciate the time you
>>> have put into the feedback. I'm not sure I fully agree with your point that we
>>> don't need to conclude on a policy now; I certainly don't think we need the
>>> whole thing in place on day 1, but I do think that whatever we put in should
>>> strive to be a strict subset of where we think we are going. For example, if we
>>> put something in with one policy (i.e. "never" only means "never 2MB") then find
>>> a problem and have to change that to be more conservative, are we risking perf
>>> regressions for any LAF users that started using it on day 1?
>>
>> It's not that I don't want to -- I just don't think we have enough
>> information before we have a wider deployment [1] and gain a better
>> understanding of real-world scenarios.
>>
>> Of course we could force a conclusion, a mostly opinion-based one. But
>> it would still involve prolonged discussions and delay this series, or
>> rush into decisions we might regret later.
>>
>> [1] Our fleets (servers, laptops and phones) support large-scale
>> experiments and I plan to run them on both client and server devices.

This all sounds great and I'm looking forward to seeing results! But I guess I
had been assuming that this sort of testing would be preferable to do before we
merge; that allows us to get confidence in the approach and reduces the changes
of having to change it later. I guess you have policies that prevent you from
testing this series at the scale you want until it is merged?

I'm not convinced this testing will help us answer the "what does never mean?"
question; if nothing breaks in your testing, it doesn't mean there aren't
systems out there that would break - it's hard to prove a negative. I think its
mostly embedded systems that use thp=never to reduce memory footprint to the
absolute minimum?


>>
>>>> But I should have been clear about the parameters to
>>>> hugepage_vma_check(): enforce_sysfs=false.
>>>
>>> So hugepage_vma_check(..., smaps=false, in_pf=true, enforce_sysfs=false) would
>>> give us:
>>>
>>>                 | prctl/fw  | sysfs     | sysfs     | sysfs
>>>                 | disable   | never     | madvise   | always
>>> ----------------|-----------|-----------|-----------|-----------
>>> no hint         | S         | LAF>S     | LAF>S     | THP>LAF>S
>>> MADV_HUGEPAGE   | S         | LAF>S     | THP>LAF>S | THP>LAF>S
>>> MADV_NOHUGEPAGE | S         | S         | S         | S
>>>
>>> Where "prctl/fw disable" trumps the sysfs setting.
>>>
>>> I can certainly see the benefit of this approach; it gives us a way to enable
>>> LAF while disabling THP (thp=never). It doesn't give us a way to enable THP
>>> without enabling LAF though (unless you recompile with LAF disabled). Does
>>> anyone see a problem with this?
>>
>> I do myself :)
>>
>> This is just something temporary to get this series landed. We are
>> hiding behind a Kconfig, not making any ABI changes, and not exposing
>> this policy to userspace (i.e., not updating Documentation/, man
>> pages, etc.)
>>
>> Meanwhile, we can keep discussing all the open questions in parallel.

You're right - don't want to slow down the testing, so I'm going to post a v5
tomorrow with the policy in the table above. We're still waiting for the
prerequisites to land before we can kick off testing in anger though.

> 
> And the stat ABI changes should be discussed before or at the same
> time. If we came up with a policy but there was *zero* observability
> of how well that policy works...

Yep agreed. I have a series at [1] which I hoped would kickstart that discussion.

[1] https://lore.kernel.org/linux-mm/20230613160950.3554675-1-ryan.roberts@arm.com/

Thanks,
Ryan



  reply	other threads:[~2023-08-09 16:08 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-26  9:51 [PATCH v4 0/5] variable-order, large folios for anonymous memory Ryan Roberts
2023-07-26  9:51 ` [PATCH v4 1/5] mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap() Ryan Roberts
2023-07-26  9:51 ` [PATCH v4 2/5] mm: LARGE_ANON_FOLIO for improved performance Ryan Roberts
2023-07-26 16:41   ` Yu Zhao
2023-07-27  4:31     ` Yu Zhao
2023-07-28 10:13       ` Ryan Roberts
2023-08-01  6:36         ` Yu Zhao
2023-08-01 23:30           ` Yin Fengwei
2023-08-02  8:02           ` Ryan Roberts
2023-08-02  9:04             ` Ryan Roberts
2023-08-02 13:51             ` Yin, Fengwei
2023-08-03  8:05         ` Yin Fengwei
2023-08-03  8:21           ` Ryan Roberts
2023-08-03  8:37             ` Yin Fengwei
2023-08-03  9:32               ` Ryan Roberts
2023-08-03  9:58                 ` Yin Fengwei
2023-08-03 10:27                   ` Ryan Roberts
2023-08-03 10:54                     ` Yin Fengwei
2023-08-04  0:28           ` Yu Zhao
2023-08-01  6:18   ` Yu Zhao
2023-08-02  9:33     ` Ryan Roberts
2023-08-02 21:05       ` Yu Zhao
2023-08-03 10:24         ` Ryan Roberts
2023-08-03 12:43   ` Ryan Roberts
2023-08-03 14:21     ` Kirill A. Shutemov
2023-08-04  0:19       ` Yu Zhao
2023-08-04  2:16         ` Zi Yan
2023-08-04  3:35           ` Yu Zhao
2023-08-04  9:06         ` Ryan Roberts
2023-08-04 18:53           ` Yu Zhao
2023-08-07 19:00             ` Ryan Roberts
2023-08-03 23:50     ` Yu Zhao
2023-08-04  8:27       ` Ryan Roberts
2023-08-04 20:23         ` David Hildenbrand
2023-08-04 21:00           ` Yu Zhao
2023-08-04 21:13             ` David Hildenbrand
2023-08-04 21:26               ` Yu Zhao
2023-08-04 21:30                 ` David Hildenbrand
2023-08-04 21:58                   ` Zi Yan
2023-08-05  2:50                     ` Yin, Fengwei
2023-08-07 17:45                       ` Ryan Roberts
2023-08-07 18:10                         ` Zi Yan
2023-08-08  9:58                           ` Ryan Roberts
2023-08-07  5:24   ` Yu Zhao
2023-08-07 19:07     ` Ryan Roberts
2023-08-07 23:21       ` Yu Zhao
2023-08-08  9:37         ` Ryan Roberts
2023-08-08 17:57           ` Yu Zhao
2023-08-08 18:12             ` Yu Zhao
2023-08-09 16:08               ` Ryan Roberts [this message]
2023-07-26  9:51 ` [PATCH v4 3/5] arm64: mm: Override arch_wants_pte_order() Ryan Roberts
2023-07-26  9:51 ` [PATCH v4 4/5] selftests/mm/cow: Generalize do_run_with_thp() helper Ryan Roberts
2023-07-26  9:51 ` [PATCH v4 5/5] selftests/mm/cow: Add large anon folio tests Ryan Roberts

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e80cd2e6-6f7c-4337-a170-152926863290@arm.com \
    --to=ryan.roberts@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=david@redhat.com \
    --cc=fengwei.yin@intel.com \
    --cc=itaru.kitayama@gmail.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mcgrof@kernel.org \
    --cc=shy828301@gmail.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    --cc=yuzhao@google.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox