Subject: Re: [PATCH v4 6/8] hugetlb: batch PMD split for bulk vmemmap dedup
From: Muchun Song
Date: Tue, 19 Sep 2023 16:41:44 +0800
To: Joao Martins
Cc: Mike Kravetz, Muchun Song, Oscar Salvador, David Hildenbrand, Miaohe Lin, David Rientjes, Anshuman Khandual, Naoya Horiguchi, Barry Song <21cnbao@gmail.com>, Michal Hocko, Matthew Wilcox, Xiongchun Duan, Linux-MM, Andrew Morton, LKML
Message-Id: <07192BE2-C66E-4F74-8F76-05F57777C6B7@linux.dev>
References: <20230918230202.254631-1-mike.kravetz@oracle.com> <20230918230202.254631-7-mike.kravetz@oracle.com> <9c627733-e6a2-833b-b0f9-d59552f6ab0d@linux.dev>

> On Sep 19, 2023, at 16:26, Joao Martins wrote:
>
> On 19/09/2023 07:42, Muchun Song wrote:
>> On 2023/9/19 07:01, Mike Kravetz wrote:
>>> From: Joao Martins
>>>
>>> In an effort to minimize amount of TLB flushes, batch all PMD splits
>>> belonging to a range of pages in order to perform only 1 (global) TLB
>>> flush.
>>>
>>> Add a flags field to the walker and pass whether it's a bulk allocation
>>> or just a single page to decide to remap. First value
>>> (VMEMMAP_SPLIT_NO_TLB_FLUSH) designates the request to not do the TLB
>>> flush when we split the PMD.
>>>
>>> Rebased and updated by Mike Kravetz
>>>
>>> Signed-off-by: Joao Martins
>>> Signed-off-by: Mike Kravetz
>>> ---
>>> mm/hugetlb_vmemmap.c | 79 +++++++++++++++++++++++++++++++++++++++++---
>>> 1 file changed, 75 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
>>> index 147ed15bcae4..e8bc2f7567db 100644
>>> --- a/mm/hugetlb_vmemmap.c
>>> +++ b/mm/hugetlb_vmemmap.c
>>> @@ -27,6 +27,7 @@
>>> * @reuse_addr: the virtual address of the @reuse_page page.
>>> * @vmemmap_pages: the list head of the vmemmap pages that can be freed
>>> * or is mapped from.
>>> + * @flags: used to modify behavior in bulk operations
>>
>> Better to describe it as "used to modify behavior in vmemmap page table walking
>> operations"
>>
> OK
>
>>> void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head
>>> *folio_list)
>>> {
>>> struct folio *folio;
>>> LIST_HEAD(vmemmap_pages);
>>> + list_for_each_entry(folio, folio_list, lru)
>>> + hugetlb_vmemmap_split(h, &folio->page);
>>> +
>>> + flush_tlb_all();
>>> +
>>> list_for_each_entry(folio, folio_list, lru) {
>>> int ret = __hugetlb_vmemmap_optimize(h, &folio->page,
>>> &vmemmap_pages);
>>
>> This is unlikely to be failed since the page table allocation
>> is moved to the above
>
>> (Note that the head vmemmap page allocation
>> is not mandatory).
>
> Good point that I almost forgot
>
>> So we should handle the error case in the above
>> splitting operation.
>
> But back to the previous discussion in v2... the thinking was that /some/ PMDs
> got split, and say could allow some PTE remapping to occur and free some pages
> back (each page allows 6 more splits worst case). Then the next
> __hugetlb_vmemmap_optimize() will have to split PMD pages again for those
> hugepages that failed the batch PMD split (as we only defer the PTE remap tlb
> flush in this stage).

Oh, yes. Maybe we could break the above traversal as early as possible
once we hit ENOMEM?

>
> Unless this isn't something worth handling
>
> Joao
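
For illustration only, here is a minimal userspace sketch of the early-break idea discussed above. It is a toy model, not kernel code: struct fake_folio, struct stats, fake_split(), optimize_folios() and FREED_PER_FOLIO are hypothetical stand-ins, and the "page-table budget" is an assumed simplification of memory availability. The first pass batches the splits and stops at the first -ENOMEM, a single flush covers the whole batch, and the second pass falls back to a per-folio split (with its own flush) for folios the batch did not reach.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumption: each optimized folio hands one page back to the budget. */
#define FREED_PER_FOLIO 1

/* Hypothetical stand-in for a hugetlb folio; not the kernel's struct folio. */
struct fake_folio {
	int split_done;	/* PMD was split into PTEs */
	int optimized;	/* vmemmap remap completed */
};

struct stats {
	int batch_split_attempts;	/* splits tried in the batch pass */
	int tlb_flushes;		/* total simulated TLB flushes */
};

/* Stand-in for the split step: needs one page-table page from the budget. */
static int fake_split(struct fake_folio *f, int *pt_budget)
{
	if (*pt_budget == 0)
		return -ENOMEM;
	(*pt_budget)--;
	f->split_done = 1;
	return 0;
}

/* Returns the number of folios that ended up optimized. */
static int optimize_folios(struct fake_folio *folios, size_t n,
			   int pt_budget, struct stats *st)
{
	size_t i;
	int done = 0;

	assert(folios != NULL && st != NULL);

	/* Pass 1: batch the splits without flushing; stop at first -ENOMEM. */
	for (i = 0; i < n; i++) {
		st->batch_split_attempts++;
		if (fake_split(&folios[i], &pt_budget) == -ENOMEM)
			break;
	}

	/* One flush for the whole batch (flush_tlb_all() in the patch). */
	st->tlb_flushes++;

	/* Pass 2: remap; folios missed by the batch split here, with a flush. */
	for (i = 0; i < n; i++) {
		if (!folios[i].split_done) {
			if (fake_split(&folios[i], &pt_budget))
				continue;	/* still no memory: skip this folio */
			st->tlb_flushes++;
		}
		folios[i].optimized = 1;
		pt_budget += FREED_PER_FOLIO;	/* remap frees pages back */
		done++;
	}
	return done;
}
```

With four folios and a starting budget of two, the batch pass in this model stops on the third folio instead of also trying the fourth, and the pages freed in the second pass let the remaining folios be split there.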