linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] prctl: introduce PR_SET/GET_THP_POLICY
@ 2025-05-15 13:33 Usama Arif
From: Usama Arif @ 2025-05-15 13:33 UTC
  To: Andrew Morton, david, linux-mm
  Cc: hannes, shakeel.butt, riel, ziy, laoar.shao, baolin.wang,
	lorenzo.stoakes, Liam.Howlett, npache, ryan.roberts, linux-kernel,
	linux-doc, kernel-team, Usama Arif

This series allows changing the THP policy of a process according to the
value passed in arg2. Each policy is inherited across fork+exec:
- PR_THP_POLICY_DEFAULT_HUGE: This will set the MMF2_THP_VMA_DEFAULT_HUGE
  process flag which changes the default of new VMAs to be VM_HUGEPAGE. The
  call also modifies all existing VMAs that are not VM_NOHUGEPAGE
  to be VM_HUGEPAGE.
  This allows systems where the global policy is set to "madvise"
  to effectively have THPs always for the process. In an environment
  where different types of workloads are stacked on the same machine
  whose global policy is set to "madvise", this will allow workloads
  that benefit from always having hugepages to do so, without regressing
  those that don't.
- PR_THP_POLICY_DEFAULT_NOHUGE: This will set the MMF2_THP_VMA_DEFAULT_NOHUGE
  process flag which changes the default of new VMAs to be VM_NOHUGEPAGE.
  The call also modifies all existing VMAs that are not VM_HUGEPAGE
  to be VM_NOHUGEPAGE.
  This allows systems where the global policy is set to "always"
  to effectively have THPs on madvise only for the process. In an
  environment where different types of workloads are stacked on the
  same machine whose global policy is set to "always", this will allow
  workloads that benefit from having hugepages on an madvise basis only
  to do so, without regressing those that benefit from having hugepages
  always.
- PR_THP_POLICY_DEFAULT_SYSTEM: This will clear the MMF2_THP_VMA_DEFAULT_HUGE
  and MMF2_THP_VMA_DEFAULT_NOHUGE process flags.
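
The three policies above can be illustrated with a small userspace model.
This is only a sketch of the semantics as described in this cover letter,
not the kernel implementation: the Hint, Policy and Process names and the
thp_eligible() helper are hypothetical, and the model assumes that
PR_THP_POLICY_DEFAULT_SYSTEM leaves existing VMAs untouched, since the
text only says it clears the two process flags.

```python
from dataclasses import dataclass, field
from enum import Enum


class Hint(Enum):
    """Per-VMA THP hint, standing in for VM_HUGEPAGE / VM_NOHUGEPAGE."""
    NONE = "none"
    HUGEPAGE = "VM_HUGEPAGE"
    NOHUGEPAGE = "VM_NOHUGEPAGE"


class Policy(Enum):
    """Hypothetical stand-ins for the arg2 values of PR_SET_THP_POLICY."""
    DEFAULT_SYSTEM = "PR_THP_POLICY_DEFAULT_SYSTEM"
    DEFAULT_HUGE = "PR_THP_POLICY_DEFAULT_HUGE"
    DEFAULT_NOHUGE = "PR_THP_POLICY_DEFAULT_NOHUGE"


@dataclass
class Process:
    policy: Policy = Policy.DEFAULT_SYSTEM
    vmas: list = field(default_factory=list)

    def set_thp_policy(self, policy):
        """Record the policy and rewrite existing VMA hints as described."""
        self.policy = policy
        if policy is Policy.DEFAULT_HUGE:
            # Every existing VMA that is not VM_NOHUGEPAGE becomes VM_HUGEPAGE.
            self.vmas = [v if v is Hint.NOHUGEPAGE else Hint.HUGEPAGE
                         for v in self.vmas]
        elif policy is Policy.DEFAULT_NOHUGE:
            # Every existing VMA that is not VM_HUGEPAGE becomes VM_NOHUGEPAGE.
            self.vmas = [v if v is Hint.HUGEPAGE else Hint.NOHUGEPAGE
                         for v in self.vmas]
        # DEFAULT_SYSTEM: assumed to leave existing VMAs alone (see lead-in).

    def new_vma(self):
        """Create a VMA whose default hint follows the process policy."""
        defaults = {
            Policy.DEFAULT_HUGE: Hint.HUGEPAGE,
            Policy.DEFAULT_NOHUGE: Hint.NOHUGEPAGE,
        }
        hint = defaults.get(self.policy, Hint.NONE)
        self.vmas.append(hint)
        return hint

    def fork_exec(self):
        """The child inherits the policy; exec gives it a fresh address space."""
        return Process(policy=self.policy)


def thp_eligible(global_policy, hint):
    """Whether a VMA may get THPs under the global 'enabled' setting."""
    if global_policy == "always":
        return hint is not Hint.NOHUGEPAGE
    if global_policy == "madvise":
        return hint is Hint.HUGEPAGE
    return False  # "never"
```

In this model, a process that selects Policy.DEFAULT_HUGE on a system whose
global policy is "madvise" gets THP-eligible VMAs by default, while a
process that selects Policy.DEFAULT_NOHUGE on an "always" system only gets
THPs for VMAs that were explicitly marked VM_HUGEPAGE.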

These patches are needed to roll out hugepages in hyperscaler
configurations for workloads that benefit from them. In such fleets,
workloads are stacked and a single global THP policy is likely to be used
across the entire fleet, so a prctl provides a per-process override.

v1->v2:
- change from modifying the THP decision making for the process, to modifying
  VMA flags only. This prevents further complicating the logic used to
  determine THP order (Thanks David!)
- change from using a prctl per policy change to just using PR_SET_THP_POLICY
  and arg2 to set the policy. (Zi Yan)
- Introduce PR_THP_POLICY_DEFAULT_NOHUGE and PR_THP_POLICY_DEFAULT_SYSTEM
- Add selftests and documentation.

Usama Arif (6):
  prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process
  prctl: introduce PR_THP_POLICY_DEFAULT_NOHUGE for the process
  prctl: introduce PR_THP_POLICY_SYSTEM for the process
  selftests: prctl: introduce tests for PR_THP_POLICY_DEFAULT_NOHUGE
  selftests: prctl: introduce tests for PR_THP_POLICY_DEFAULT_HUGE
  docs: transhuge: document process level THP controls

 Documentation/admin-guide/mm/transhuge.rst    |  40 +++
 include/linux/huge_mm.h                       |   4 +
 include/linux/mm_types.h                      |  14 +
 include/uapi/linux/prctl.h                    |   6 +
 kernel/fork.c                                 |   1 +
 kernel/sys.c                                  |  35 +++
 mm/huge_memory.c                              |  56 ++++
 mm/vma.c                                      |   2 +
 tools/include/uapi/linux/prctl.h              |   6 +
 .../trace/beauty/include/uapi/linux/prctl.h   |   6 +
 tools/testing/selftests/prctl/Makefile        |   2 +-
 tools/testing/selftests/prctl/thp_policy.c    | 286 ++++++++++++++++++
 12 files changed, 457 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/prctl/thp_policy.c

-- 
2.47.1


Thread overview: 51+ messages
2025-05-15 13:33 [PATCH 0/6] prctl: introduce PR_SET/GET_THP_POLICY Usama Arif
2025-05-15 13:33 ` [PATCH 1/6] prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process Usama Arif
2025-05-15 14:40   ` Lorenzo Stoakes
2025-05-15 14:44     ` David Hildenbrand
2025-05-15 14:56       ` Usama Arif
2025-05-15 14:58         ` David Hildenbrand
2025-05-15 15:18           ` Lorenzo Stoakes
2025-05-15 15:45       ` Liam R. Howlett
2025-05-15 15:57         ` David Hildenbrand
2025-05-15 16:38           ` Lorenzo Stoakes
2025-05-15 17:29             ` David Hildenbrand
2025-05-15 18:09               ` Liam R. Howlett
2025-05-15 18:21                 ` Lorenzo Stoakes
2025-05-15 18:42                   ` Zi Yan
2025-05-15 21:04                     ` Lorenzo Stoakes
2025-05-15 18:46                   ` Usama Arif
2025-05-15 19:20                 ` David Hildenbrand
2025-05-15 15:28     ` Usama Arif
2025-05-15 16:06       ` Lorenzo Stoakes
2025-05-15 16:11         ` David Hildenbrand
2025-05-15 18:08           ` Lorenzo Stoakes
2025-05-15 19:12             ` David Hildenbrand
2025-05-15 20:35               ` Lorenzo Stoakes
2025-05-16  7:45                 ` David Hildenbrand
2025-05-16 10:57                   ` Lorenzo Stoakes
2025-05-16 11:24                     ` David Hildenbrand
2025-05-16 12:57                       ` Lorenzo Stoakes
2025-05-16 17:19                         ` Usama Arif
2025-05-16 17:51                           ` Lorenzo Stoakes
2025-05-16 19:34                             ` Usama Arif
2025-05-17 16:20                         ` Is number of process_madvise()-able ranges limited to 8? (was Re: [PATCH 1/6] prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process) SeongJae Park
2025-05-17 18:50                           ` Lorenzo Stoakes
2025-05-17 20:25                             ` SeongJae Park
2025-05-17 19:01                         ` [PATCH 1/6] prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process Lorenzo Stoakes
2025-05-15 16:47         ` Usama Arif
2025-05-15 18:36           ` Lorenzo Stoakes
2025-05-15 19:17             ` David Hildenbrand
2025-05-15 20:42               ` Lorenzo Stoakes
2025-05-16  6:12   ` kernel test robot
2025-05-15 13:33 ` [PATCH 2/6] prctl: introduce PR_THP_POLICY_DEFAULT_NOHUGE " Usama Arif
2025-05-16  8:19   ` kernel test robot
2025-05-15 13:33 ` [PATCH 3/6] prctl: introduce PR_THP_POLICY_SYSTEM " Usama Arif
2025-05-15 13:33 ` [PATCH 4/6] selftests: prctl: introduce tests for PR_THP_POLICY_DEFAULT_NOHUGE Usama Arif
2025-05-15 13:33 ` [PATCH 5/6] selftests: prctl: introduce tests for PR_THP_POLICY_DEFAULT_HUGE Usama Arif
2025-05-15 13:33 ` [PATCH 6/6] docs: transhuge: document process level THP controls Usama Arif
2025-05-15 13:55 ` [PATCH 0/6] prctl: introduce PR_SET/GET_THP_POLICY Lorenzo Stoakes
2025-05-15 14:50   ` Usama Arif
2025-05-15 15:15     ` Lorenzo Stoakes
2025-05-15 15:54       ` Usama Arif
2025-05-15 16:04         ` David Hildenbrand
2025-05-15 16:24         ` Lorenzo Stoakes
