From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <3284ec20-2c3f-46d0-a599-2f322b2883c8@gmail.com>
Date: Fri, 16 May 2025 18:19:32 +0100
Subject: Re: [PATCH 1/6] prctl: introduce PR_THP_POLICY_DEFAULT_HUGE for the process
To: Lorenzo Stoakes, David Hildenbrand, "Liam R. Howlett"
Cc: Andrew Morton, linux-mm@kvack.org, hannes@cmpxchg.org,
 shakeel.butt@linux.dev, riel@surriel.com, ziy@nvidia.com,
 laoar.shao@gmail.com, baolin.wang@linux.alibaba.com,
 Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 kernel-team@meta.com
References: <02ead03b-339b-45c8-b252-d31a66501c39@lucifer.local>
 <3a2a329d-2592-4e31-a763-d87dcd925966@redhat.com>
 <8ea288f2-5196-41f9-bd65-e29f22bb29e8@lucifer.local>
 <5f77366d-e100-46bb-ac85-aa4b216eb2cf@redhat.com>
 <8f0a22c2-3176-4942-994d-58d940901ecf@redhat.com>
 <1a175a2c-8afa-4995-9dec-e3e7cf1efc72@lucifer.local>
From: Usama Arif

On 16/05/2025 13:57, Lorenzo Stoakes wrote:
> On Fri, May 16, 2025 at 01:24:18PM +0200, David Hildenbrand wrote:
>> Looking forward to hearing what your magic thinking cap can do! :)
>
> OK so just to say at the outset, this is purely playing around with a
> theoretical idea here, so if it's crazy just let me know :))
>
> Right now madvise() has limited utility because:
>
> - You have little control over how the operation is done
> - You get little feedback about what's actually succeeded or not
> - While you can perform multiple operations at once via process_madvise(),
>   even to the current process (after my changes to extend it), it's limited
>   to a single advice over 8 ranges.
> - You can't say 'ignore errors just try'
> - You get the weird gap behaviour.
>
> So the concept is - make everything explicit and add a new syscall that
> wraps the existing madvise() stuff and addresses all the above issues.
>
> Specifically pertinent to the case at hand - also add a 'set_default'
> boolean (you'll see shortly exactly where) to also tell madvise() to make
> all future VMAs default to the specified advice.
> We'll whitelist what we're allowed to use here and should be able to use
> mm->def_flags.
>
> So the idea is we'll use a helper struct-configured function (hey, it's me,
> I <3 helper structs so of course) like:
>
> int madvise_ranges(struct madvise_ranges_control *ctl);
>
> With the data structures as follows (untested, etc. etc.):
>
> enum madvise_range_type {
> 	MADVISE_RANGE_SINGLE,
> 	MADVISE_RANGE_MULTI,
> 	MADVISE_RANGE_ALL,
> };
>
> struct madvise_range {
> 	const void *addr;
> 	size_t size;
> 	int advice;
> };
>
> struct madvise_ranges {
> 	const struct madvise_range *arr;
> 	size_t count;
> };
>
> struct madvise_range_stats {
> 	struct madvise_range range;
> 	bool success;
> 	bool partial;
> };
>
> struct madvise_ranges_stats {
> 	unsigned long nr_mappings_advised;
> 	unsigned long nr_mappings_skipped;
> 	unsigned long nr_pages_advised;
> 	unsigned long nr_pages_skipped;
> 	unsigned long nr_gaps;
>
> 	/*
> 	 * Useful for madvise_ranges_control->ignore_errors:
> 	 *
> 	 * If non-NULL, points to an array of size equal to the number of ranges
> 	 * specified. Indicates the specified range, whether it succeeded, and
> 	 * whether that success was partial (that is, the range specified
> 	 * multiple mappings, only some of which had advice applied
> 	 * successfully).
> 	 *
> 	 * Not valid for MADVISE_RANGE_ALL.
> 	 */
> 	struct madvise_range_stats *per_range_stats;
>
> 	/* Error details. */
> 	int err;
> 	unsigned long failed_address;
> 	size_t offset; /* If multi, at which offset did this occur? */
> };
>
> struct madvise_ranges_control {
> 	int version; /* Allow future updates to API. */
>
> 	enum madvise_range_type type;
>
> 	union {
> 		struct madvise_range range; /* MADVISE_RANGE_SINGLE */
> 		struct madvise_ranges ranges; /* MADVISE_RANGE_MULTI */
> 		struct all { /* MADVISE_RANGE_ALL */
> 			int advice;
> 			/*
> 			 * If set, also have all future mappings have this applied by default.
> 			 *
> 			 * Only whitelisted advice may set this, otherwise -EINVAL will be returned.
> 			 */
> 			bool set_default;
> 		};
> 	};
> 	struct madvise_ranges_stats *stats; /* If non-NULL, report information about operation. */
>
> 	int pidfd; /* If is_remote set, the remote process. */
>
> 	/* Options. */
> 	bool is_remote :1; /* Target remote process as specified by pidfd. */
> 	bool ignore_errors :1; /* If error occurs applying advice, carry on to next VMA. */
> 	bool single_mapping_only :1; /* Error out if any range is not a single VMA. */
> 	bool stop_on_gap :1; /* Stop operation if input range includes unmapped memory. */
> };
>
> So the user can specify whether to apply advice to a single range,
> multiple, or the whole address space, with real control over how the operation proceeds.
>

For a single range we have madvise(), for multiple ranges we have
process_madvise(), and for the whole address space we can have a very
simple solution with prctl().

IMHO the above is not really needed (but I might be wrong :)). It would
introduce a lot of code to solve something that can be done in a very
simple way, and it would add another syscall when prctl is designed for
this. I understand that you don't like prctl, but it is there.

I have added below what patch 1 of 6 would look like after incorporating
all your feedback. (Thanks for all the feedback, really appreciate it!!)

Main differences from the current revision:
- no more flags2.
- no more MMF2_...
- renamed policy to PR_DEFAULT_MADV_HUGEPAGE
- mmap_write_lock_killable acquired in PR_GET_THP_POLICY
- mmap_write lock fixed in PR_SET_THP_POLICY
- check if hugepage_global_enabled is enabled in the call and account for s390
- set mm->def_flags VM_HUGEPAGE and VM_NOHUGEPAGE according to the policy in
  the way done by madvise(). I believe VMA merging will not be broken this
  way, please let me know otherwise.
- process_default_madv_hugepage function that does for_each_vma and calls
  hugepage_madvise. (I can move it to vma.c or any other file you prefer).

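A minimal userspace sketch of how a caller might use the interface listed
above. The option numbers are the ones proposed in this patch and are not
part of any released kernel ABI (other kernels may assign 78 to an
unrelated option); thp_policy_set_default_huge() and thp_policy_strerror()
are hypothetical helper names. The error mapping follows the
PR_SET_THP_POLICY hunk of this patch. PR_GET_THP_POLICY is left out of the
sketch: as posted, PR_DEFAULT_MADV_HUGEPAGE is 0, so its return value would
appear to be 0 whether or not the policy is set.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Values proposed in this patch; assumptions, not a released kernel ABI. */
#ifndef PR_SET_THP_POLICY
#define PR_SET_THP_POLICY		78
#endif
#ifndef PR_DEFAULT_MADV_HUGEPAGE
#define PR_DEFAULT_MADV_HUGEPAGE	0
#endif

/*
 * Hypothetical helper: ask the kernel to treat all current and future
 * mappings of the calling process as if MADV_HUGEPAGE had been applied.
 * Returns 0 on success or a negative errno value.
 */
static int thp_policy_set_default_huge(void)
{
	/* prctl() is variadic, so pass every argument as unsigned long. */
	if (prctl(PR_SET_THP_POLICY, (unsigned long)PR_DEFAULT_MADV_HUGEPAGE,
		  0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/* Hypothetical helper: describe the result codes of the proposed call. */
static const char *thp_policy_strerror(int ret)
{
	switch (ret) {
	case 0:
		return "policy applied";
	case -EPERM:
		return "THP globally disabled, or s390 mm with pgste";
	case -EINVAL:
		return "unknown policy, non-zero unused argument, or no kernel support";
	case -EINTR:
		return "interrupted while waiting for the mmap lock";
	default:
		return "unexpected error";
	}
}
```

A caller would typically invoke thp_policy_set_default_huge() once, early in
process start-up, and log thp_policy_strerror() of the result instead of
treating a failure as fatal, since kernels without this patch will reject
the option.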
Please let me know if this looks acceptable and I can send this as RFC v3
for all the 6 patches (the rest are done in a similar way to below).

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 2f190c90192d..a8c3ce15a504 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -260,6 +260,8 @@ static inline unsigned long thp_vma_suitable_orders(struct vm_area_struct *vma,
 	return orders;
 }
 
+void process_default_madv_hugepage(struct mm_struct *mm, int advice);
+
 unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
 					 unsigned long vm_flags,
 					 unsigned long tva_flags,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 43748c8f3454..436f4588bce8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -466,7 +466,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_NO_KHUGEPAGED (VM_SPECIAL | VM_HUGETLB)
 
 /* This mask defines which mm->def_flags a process can inherit its parent */
-#define VM_INIT_DEF_MASK	VM_NOHUGEPAGE
+#define VM_INIT_DEF_MASK	(VM_HUGEPAGE | VM_NOHUGEPAGE)
 
 /* This mask represents all the VMA flag bits used by mlock */
 #define VM_LOCKED_MASK	(VM_LOCKED | VM_LOCKONFAULT)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index e76bade9ebb1..f1836b7c5704 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1703,6 +1703,7 @@ enum {
 					/* leave room for more dump flags */
 #define MMF_VM_MERGEABLE	16	/* KSM may merge identical pages */
 #define MMF_VM_HUGEPAGE		17	/* set when mm is available for khugepaged */
+#define MMF_VM_HUGEPAGE_MASK	(1 << MMF_VM_HUGEPAGE)
 
 /*
  * This one-shot flag is dropped due to necessity of changing exe once again
@@ -1742,7 +1743,8 @@ enum {
 
 #define MMF_INIT_MASK		(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
 				 MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK |\
-				 MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK)
+				 MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK |\
+				 MMF_VM_HUGEPAGE_MASK)
 
 static inline unsigned long mmf_init_flags(unsigned long flags)
 {
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 15c18ef4eb11..15aaa4db5ff8 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -364,4 +364,8 @@ struct prctl_mm_map {
 # define PR_TIMER_CREATE_RESTORE_IDS_ON		1
 # define PR_TIMER_CREATE_RESTORE_IDS_GET	2
 
+#define PR_SET_THP_POLICY		78
+#define PR_GET_THP_POLICY		79
+#define PR_DEFAULT_MADV_HUGEPAGE	0
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index c434968e9f5d..4fe860b0ff25 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2658,6 +2658,44 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		clear_bit(MMF_DISABLE_THP, &me->mm->flags);
 		mmap_write_unlock(me->mm);
 		break;
+	case PR_GET_THP_POLICY:
+		if (arg2 || arg3 || arg4 || arg5)
+			return -EINVAL;
+		if (mmap_write_lock_killable(me->mm))
+			return -EINTR;
+		if (me->mm->def_flags & VM_HUGEPAGE)
+			error = PR_DEFAULT_MADV_HUGEPAGE;
+		mmap_write_unlock(me->mm);
+		break;
+	case PR_SET_THP_POLICY:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		if (mmap_write_lock_killable(me->mm))
+			return -EINTR;
+		switch (arg2) {
+		case PR_DEFAULT_MADV_HUGEPAGE:
+			if (!hugepage_global_enabled())
+				error = -EPERM;
+#ifdef CONFIG_S390
+			/*
+			 * qemu blindly sets MADV_HUGEPAGE on all allocations, but s390
+			 * can't handle this properly after s390_enable_sie, so we simply
+			 * ignore the madvise to prevent qemu from causing a SIGSEGV.
+			 */
+			else if (mm_has_pgste(me->mm))
+				error = -EPERM;
+#endif
+			else {
+				me->mm->def_flags &= ~VM_NOHUGEPAGE;
+				me->mm->def_flags |= VM_HUGEPAGE;
+				process_default_madv_hugepage(me->mm, MADV_HUGEPAGE);
+			}
+			break;
+		default:
+			error = -EINVAL;
+		}
+		mmap_write_unlock(me->mm);
+		break;
 	case PR_MPX_ENABLE_MANAGEMENT:
 	case PR_MPX_DISABLE_MANAGEMENT:
 		/* No longer implemented: */
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2780a12b25f0..2b9a3e280ae4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -98,6 +98,18 @@ static inline bool file_thp_enabled(struct vm_area_struct *vma)
 	return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
 }
 
+void process_default_madv_hugepage(struct mm_struct *mm, int advice)
+{
+	struct vm_area_struct *vma;
+	unsigned long vm_flags;
+
+	VMA_ITERATOR(vmi, mm, 0);
+	for_each_vma(vmi, vma) {
+		vm_flags = vma->vm_flags;
+		hugepage_madvise(vma, &vm_flags, advice);
+	}
+}
+
 unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
 					 unsigned long vm_flags,
 					 unsigned long tva_flags,

> This basically solves the problem this series tries to address while also
> providing an improved madvise() API at the same time.
>
> Thoughts? Have I finally completely lost my mind?
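
The def_flags update in the PR_SET_THP_POLICY branch of the patch can be
modelled in userspace roughly as below. This is an illustrative sketch
only: struct toy_mm, the toy_* helpers and the TOY_* bit values are all
hypothetical stand-ins (the real VM_HUGEPAGE and VM_NOHUGEPAGE values
differ), and "a new mapping ORs in the mm defaults" is a simplification of
what the mmap path does. It shows the two properties the patch relies on:
the two bits are never set together, and mappings created after the call
pick up the new default.

```c
#include <stdbool.h>

/* Arbitrary stand-in bits; not the kernel's actual flag values. */
#define TOY_VM_HUGEPAGE		(1UL << 0)
#define TOY_VM_NOHUGEPAGE	(1UL << 1)

/* Toy stand-in for the one mm_struct field this sketch cares about. */
struct toy_mm {
	unsigned long def_flags;
};

/* Mirrors the patch: clear the "no huge pages" default, set the "huge" one. */
static void toy_set_default_huge(struct toy_mm *mm)
{
	mm->def_flags &= ~TOY_VM_NOHUGEPAGE;
	mm->def_flags |= TOY_VM_HUGEPAGE;
}

/* Simplified model of a new mapping inheriting the mm-wide defaults. */
static unsigned long toy_new_mapping_flags(const struct toy_mm *mm,
					   unsigned long requested)
{
	return requested | mm->def_flags;
}

/* The two THP hints are mutually exclusive; both set would be a bug. */
static bool toy_thp_flags_consistent(unsigned long flags)
{
	unsigned long both = TOY_VM_HUGEPAGE | TOY_VM_NOHUGEPAGE;

	return (flags & both) != both;
}
```

Clearing the opposite bit before setting the new one matches the order used
in the patch, so the flags are consistent even if the process previously
defaulted to "no huge pages".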