Message-ID: <9bc57721-5287-416c-aa30-46932d605f63@redhat.com>
Date: Thu, 17 Jul 2025 10:52:12 +0200
Subject: Re: [RFC PATCH v3 0/5] mm, bpf: BPF based THP adjustment
From: David Hildenbrand
Organization: Red Hat
To: Yafang Shao, Matthew Wilcox
Cc: akpm@linux-foundation.org, ziy@nvidia.com,
 baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com,
 Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
 dev.jain@arm.com, hannes@cmpxchg.org, usamaarif642@gmail.com,
 gutierrez.asier@huawei-partners.com, ast@kernel.org,
 daniel@iogearbox.net, andrii@kernel.org, bpf@vger.kernel.org,
 linux-mm@kvack.org
References: <20250608073516.22415-1-laoar.shao@gmail.com>

On 17.07.25 05:09, Yafang Shao wrote:
> On Wed, Jul 16, 2025 at 6:42 AM David Hildenbrand wrote:
>>
>> On 08.06.25 09:35, Yafang Shao wrote:
>>
>> Sorry for not replying earlier, I was caught up with all other stuff.
>>
>> I still consider this a very interesting approach, although I think we
>> should think more about what a reasonable policy would look like
>> medium-term (in particular, multiple THP sizes, not always falling back
>> to small pages if it means splitting excessively in the buddy etc.)
>
> I find it difficult to understand why we introduced the mTHP sysfs
> knobs instead of implementing automatic THP size switching within the
> kernel. I'm skeptical about their practical utility in real-world
> workloads.
>
> In contrast, XFS large folios (AKA file THP) can automatically select
> orders between 0 and 9. Based on our verification, this feature has
> proven genuinely useful for certain specific workloads, though it's not
> yet perfect.

I suggest you do some digging about the history of these toggles and the
plans for the future (automatic); there has been plenty of talk about all
that.

[...]
>>>
>>> - THP allocator
>>>
>>>   int (*allocator)(unsigned long vm_flags, unsigned long tva_flags);
>>>
>>>   The BPF program returns either THP_ALLOC_CURRENT or THP_ALLOC_KHUGEPAGED,
>>>   indicating whether THP allocation should be performed synchronously
>>>   (current task) or asynchronously (khugepaged).
>>>
>>>   The decision is based on the current task context, VMA flags, and TVA
>>>   flags.
>>
>> I think we should go one step further and actually get advice about the
>> orders (THP sizes) to use. It might be helpful if the program would have
>> access to system stats, to make an educated decision.
>>
>> Given page fault information and system information, the program could
>> then decide which orders to try to allocate.
>
> Yes, that aligns with my thoughts as well. For instance, we could
> automate the decision-making process based on factors like PSI, memory
> fragmentation, and other metrics. However, this logic could be
> implemented within BPF programs; all we'd need is to extend the feature
> by introducing a few kfuncs (also known as BPF helpers).

We discussed this yesterday at a THP upstream meeting, and what we
should look into is:

(1) Having a callback like

    unsigned int (*get_suggested_order)(.., bool in_pagefault);

    Where we can provide some information about the fault (vma
    size/flags/anon_name), and whether we are in the page fault (or in
    khugepaged).

    Maybe we want a bitmap of orders to try (fallback), not sure yet.

(2) Having some way to tag these callbacks as "this is absolutely
    unstable for now and can be changed as we please.".

One idea would be to use this mechanism as a way to easily prototype
policies, and once we know that a policy works, start moving it into the
core.

In general, the core, without a BPF program, should be able to continue
providing a sane default behavior.
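
To make the shape of (1) a bit more concrete, here is a rough userspace
sketch in plain C. It is not kernel or BPF code and not a proposed
interface: only get_suggested_order() and in_pagefault come from the
discussion above. struct thp_fault_info, struct thp_policy_ops,
default_order(), pick_order(), example_policy() and the SKETCH_*
constants (4 KiB base pages, order-9 PMD) are made up for illustration.
It assumes the core falls back to "largest enabled order that fits"
whenever no policy is attached or the policy suggests something that is
not enabled or does not fit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12U /* assumption: 4 KiB base pages */
#define SKETCH_PMD_ORDER   9U /* assumption: 2 MiB PMD-sized THP */

/* Hypothetical fault description handed to the policy. */
struct thp_fault_info {
	unsigned long vma_size;  /* size of the VMA in bytes */
	unsigned long vm_flags;  /* VMA flags, opaque in this sketch */
	const char *anon_name;   /* anonymous VMA name, may be NULL */
};

/* Hypothetical ops table; a NULL callback means "no policy attached". */
struct thp_policy_ops {
	unsigned int (*get_suggested_order)(const struct thp_fault_info *info,
					    bool in_pagefault);
};

/* Core default: start with the largest order that is enabled and fits. */
static unsigned int default_order(const struct thp_fault_info *info,
				  unsigned long enabled_orders)
{
	for (unsigned int order = SKETCH_PMD_ORDER; order > 0; order--) {
		unsigned long size = 1UL << (SKETCH_PAGE_SHIFT + order);

		if ((enabled_orders & (1UL << order)) && size <= info->vma_size)
			return order;
	}
	return 0; /* fall back to a small page */
}

/* Ask the policy if there is one, but never trust its answer blindly. */
unsigned int pick_order(const struct thp_policy_ops *ops,
			const struct thp_fault_info *info,
			unsigned long enabled_orders, bool in_pagefault)
{
	unsigned int order;

	assert(info != NULL);
	if (!ops || !ops->get_suggested_order)
		return default_order(info, enabled_orders);

	order = ops->get_suggested_order(info, in_pagefault);
	if (order == 0)
		return 0; /* a small page is always acceptable */
	if (order > SKETCH_PMD_ORDER ||
	    !(enabled_orders & (1UL << order)) ||
	    (1UL << (SKETCH_PAGE_SHIFT + order)) > info->vma_size)
		return default_order(info, enabled_orders);
	return order;
}

/* Example policy: modest order in the fault path, PMD order in khugepaged. */
unsigned int example_policy(const struct thp_fault_info *info,
			    bool in_pagefault)
{
	(void)info;
	return in_pagefault ? 4U : SKETCH_PMD_ORDER;
}
```

Validating the suggestion in pick_order() is one way to keep the "sane
default without a BPF program" property even when a prototype policy
misbehaves; a bitmap of orders to try would change the return type but
not the overall structure.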
>
>>
>> That means, one would query during page faults and during khugepaged,
>> which order one should try -- compared to our current approach of "start
>> with the largest order that is enabled and fits".
>>
>>>
>>> - THP reclaimer
>>>
>>>   int (*reclaimer)(bool vma_madvised);
>>>
>>>   The BPF program returns either RECLAIMER_CURRENT or RECLAIMER_KSWAPD,
>>>   determining whether memory reclamation is handled by the current task or
>>>   kswapd.
>>
>> Not sure about that, will have to look into the details.
>
> Some workloads allocate all their memory during initialization and do
> not require THP at runtime. For such cases, aggressively attempting
> THP allocation is beneficial. However, other workloads may dynamically
> allocate THP during execution; if these are latency-sensitive, we must
> avoid introducing long allocation delays.
>
> Given these differing requirements, the global
> /sys/kernel/mm/transparent_hugepage/defrag setting is insufficient.
> Instead, we should implement per-workload defrag policies to better
> optimize performance based on individual application behavior.

We'll be very careful about the callbacks we will offer. Maybe the
get_suggested_order() callback could itself make a decision and not
suggest a high order if allocation would require compaction.

Initially, we should keep it simple and see what other callbacks to add
/ how to extend get_suggested_order(), to cover these cases.

-- 
Cheers,

David / dhildenb