From: Usama Arif
Date: Mon, 2 Sep 2024 11:09:20 +0100
Subject: Re: [PATCH v4 4/6] mm: Introduce a pageflag for partially mapped folios
To: Bang Li, Barry Song
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev, roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com, ryan.roberts@arm.com, rppt@kernel.org, willy@infradead.org, cerasuolodomenico@gmail.com, ryncsn@gmail.com, corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, libang.li@antgroup.com, Lance Yang
Message-ID: <3b910105-591c-45d4-b9c8-8b70f20f283b@gmail.com>
In-Reply-To: <9eb5af0d-730c-459d-9c2e-5ad7b78f30d7@gmail.com>
References: <20240819023145.2415299-1-usamaarif642@gmail.com> <20240819023145.2415299-5-usamaarif642@gmail.com> <9a58e794-2156-4a9f-a383-1cdfc07eee5e@gmail.com> <953d398d-58be-41c6-bf30-4c9df597de77@gmail.com> <5ed479c9-21eb-4bc8-8c17-79e1b6081355@gmail.com> <9eb5af0d-730c-459d-9c2e-5ad7b78f30d7@gmail.com>

On 01/09/2024 08:48, Bang Li wrote:
> hi, Usama
>
> On 2024/8/22 3:04, Usama Arif wrote:
>
>>
>> On 20/08/2024 17:30, Barry Song wrote:
>>
>>> Hi Usama,
>>> thanks! I can't judge if we need this partially_mapped flag. but if we
>>> need, the code looks correct to me. I'd like to leave this to David and
>>> other experts to ack.
>>>
>> Thanks for the reviews!
>>
>>> an alternative approach might be two lists? one for entirely_mapped,
>>> the other one for split_deferred. also seems ugly ?
>>>
>> That was my very first prototype! I shifted to using a bool which I sent in v1, and then a bit in _flags_1 as David suggested. I believe a bit in _flags_1 is the best way forward, as it leaves the most space in folio for future work.
>>
>>> On the other hand, when we want to extend your patchset to mTHP other
>>> than PMD-order, will the only deferred_list create huge lock contention
>>> while adding or removing folios from it?
>>>
>> Yes, I would imagine so. the deferred_split_queue is per memcg/node, so that helps.
>>
>> Also, this work is tied to khugepaged. So would need some thought when doing it for mTHP.
>>
>> I would imagine doing underused shrinker for mTHP would be less beneficial compared to doing it for 2M THP. But probably needs experimentation.
>>
>> Thanks
>
> Below is the core code snippet to support "split underused mTHP". Can we extend the
> khugepaged_max_ptes_none value to mthp and keep its semantics unchanged? With a small
> modification, Only folios with page numbers greater than khugepaged_max_ptes_none - 1
> can be added to the deferred_split list and can be split. What do you think?
>

Hmm, I believe it's not as simple as that. First, mTHP support would need to be added to khugepaged. The entire khugepaged code would need to be audited for it and tested extensively. If you just look at all the instances of HPAGE_PMD_NR in khugepaged.c, you will see it would be a significant change that needs to be a series of its own.

Also, a given value of max_ptes_none has a different significance for each mTHP size. A max_ptes_none of 200 does not mean anything for 512K and below (with 4K pages, those folios have at most 128 pages); it means 78% of a 1M mTHP and 39% of a 2M THP. We might want a different max_ptes_none for each mTHP size if we were to do this.

The benefit of splitting underused mTHPs of lower order might not be worth the work needed to split them. Someone needs to experiment with different workloads and show some improvement for these lower orders.

So probably a lot more than the below diff is needed for mTHP support!

> diff --git a/mm/memory.c b/mm/memory.c
> index b95fce7d190f..ef503958d6a0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4789,6 +4789,8 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
>         }
>
>         folio_ref_add(folio, nr_pages - 1);
> +       if (nr_pages > 1 && nr_pages > khugepaged_max_ptes_none - 1)
> +               deferred_split_folio(folio, false);
>         add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
>         count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC);
>         folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
>
> shmem THP has the same memory expansion problem when the shmem_enabled configuration is
> set to always. In my opinion, it is necessary to support "split underused shmem THP",
> but I am not sure if there is any gap in the implementation?
>
> Bang
> Thanks
>
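
The percentages quoted above for a max_ptes_none of 200 can be checked with a small userspace sketch. It assumes 4 KiB base pages, and both helper names are hypothetical: they are not kernel APIs and are not taken from the patch series. The sketch only reproduces the arithmetic; it says nothing about how a per-size threshold would actually be wired into khugepaged.

```c
/* Assumption: 4 KiB base pages (the kernel's PAGE_SIZE varies by config). */
#define BASE_PAGE_SIZE_KB 4u

/* Hypothetical helper: number of base pages in a folio of the given size. */
static unsigned int pages_in_folio_kb(unsigned int folio_size_kb)
{
	return folio_size_kb / BASE_PAGE_SIZE_KB;
}

/*
 * Hypothetical helper: the share of a folio's pages that max_ptes_none
 * represents, as a whole percentage rounded down.
 *
 * Returns -1 when max_ptes_none is not smaller than the number of pages
 * in the folio, because a folio of that size can never have more than
 * max_ptes_none empty PTEs and the threshold carries no information.
 */
static int max_ptes_none_percent(unsigned int max_ptes_none,
				 unsigned int nr_pages)
{
	if (nr_pages == 0 || max_ptes_none >= nr_pages)
		return -1;
	return (int)(max_ptes_none * 100u / nr_pages);
}
```

Calling max_ptes_none_percent(200, pages_in_folio_kb(size)) for 512K, 1M and 2M folios gives -1, 78 and 39 respectively, which matches the figures in the reply and illustrates why a single global value is awkward across mTHP sizes.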