Date: Tue, 15 Jul 2025 15:53:45 +0200
Subject: Re: [PATCH v2 0/5] add static PMD zero page support
From: Pankaj Raghav
To: Zi Yan, David Hildenbrand
Cc: Dave Hansen, Michal Hocko, Jens Axboe, "Liam R . Howlett",
    Nico Pache, Andrew Morton, Lorenzo Stoakes, Mike Rapoport,
    Vlastimil Babka, "H . Peter Anvin", Borislav Petkov, Baolin Wang,
    Ryan Roberts, Suren Baghdasaryan, linux-kernel@vger.kernel.org,
    Dev Jain, Thomas Gleixner, willy@infradead.org, linux-mm@kvack.org,
    x86@kernel.org, linux-block@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, "Darrick J . Wong", mcgrof@kernel.org,
    gost.dev@samsung.com, hch@lst.de, Pankaj Raghav, Ingo Molnar
References: <20250707142319.319642-1-kernel@pankajraghav.com>
In-Reply-To: <20250707142319.319642-1-kernel@pankajraghav.com>

Hi David,

For now I have some feedback from Zi. It would be great to hear your
feedback before I send the next version :)

--
Pankaj

On Mon, Jul 07, 2025 at 04:23:14PM +0200, Pankaj Raghav (Samsung) wrote:
> From: Pankaj Raghav
>
> There are many places in the kernel where we need to zero out larger
> chunks, but the maximum segment we can zero out at a time with ZERO_PAGE
> is limited by PAGE_SIZE.
>
> This concern was raised during the review of adding Large Block Size support
> to XFS[1][2].
>
> This is especially annoying in block devices and filesystems, where we
> attach multiple ZERO_PAGEs to the bio in different bvecs. With multipage
> bvec support in the block layer, it is much more efficient to send out
> larger zero pages as part of a single bvec.
>
> Some examples of places in the kernel where this could be useful:
> - blkdev_issue_zero_pages()
> - iomap_dio_zero()
> - vmalloc.c:zero_iter()
> - rxperf_process_call()
> - fscrypt_zeroout_range_inline_crypt()
> - bch2_checksum_update()
> ...
>
> We already have huge_zero_folio that is allocated on demand, and it will be
> deallocated by the shrinker if there are no users of it left.
>
> At the moment, the huge_zero_folio infrastructure refcount is tied to the
> lifetime of the process that created it. This might not work for the bio
> layer, as the completions can be async and the process that created the
> huge_zero_folio might no longer be alive.
>
> Add a config option STATIC_PMD_ZERO_PAGE that will always allocate
> the huge_zero_folio via memblock, and it will never be freed.
>
> I have converted blkdev_issue_zero_pages() as an example as part of
> this series.
>
> I will send patches to individual subsystems using the huge_zero_folio
> once this gets upstreamed.
>
> Looking forward to some feedback.
>
> [1] https://lore.kernel.org/linux-xfs/20231027051847.GA7885@lst.de/
> [2] https://lore.kernel.org/linux-xfs/ZitIK5OnR7ZNY0IG@infradead.org/
>
> Changes since v1:
> - Move from .bss to allocating it through memblock (David)
>
> Changes since RFC:
> - Added the config option based on the feedback from David.
> - Encode more info in the header to avoid dead code (Dave Hansen
>   feedback)
> - The static part of huge_zero_folio lives in memory.c and the dynamic
>   part stays in huge_memory.c
> - Split the patches to make it easy for review.
>
> Pankaj Raghav (5):
>   mm: move huge_zero_page declaration from huge_mm.h to mm.h
>   huge_memory: add huge_zero_page_shrinker_(init|exit) function
>   mm: add static PMD zero page
>   mm: add largest_zero_folio() routine
>   block: use largest_zero_folio in __blkdev_issue_zero_pages()
>
>  block/blk-lib.c         | 17 +++++----
>  include/linux/huge_mm.h | 31 ----------------
>  include/linux/mm.h      | 81 +++++++++++++++++++++++++++++++++++++++++
>  mm/Kconfig              |  9 +++++
>  mm/huge_memory.c        | 62 +++++++++++++++++++++++--------
>  mm/memory.c             | 25 +++++++++++++
>  mm/mm_init.c            |  1 +
>  7 files changed, 173 insertions(+), 53 deletions(-)
>
>
> base-commit: d7b8f8e20813f0179d8ef519541a3527e7661d3a
> --
> 2.49.0
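
As a rough illustration of the bvec point in the quoted cover letter, the
userspace sketch below counts how many zero-filled segments it takes to
cover a range when each segment can reference at most one zero page of a
given size. It is not code from the series. The helper zero_segments() and
the SKETCH_* constants are hypothetical names, and the sizes assume 4 KiB
base pages with a 2 MiB PMD (the common x86-64 configuration); other
architectures and page sizes will differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes, not taken from the series: 4 KiB pages, 2 MiB PMD. */
#define SKETCH_PAGE_SIZE ((size_t)4096)
#define SKETCH_PMD_SIZE  ((size_t)2 * 1024 * 1024)

/*
 * Hypothetical helper: number of segments needed to cover len bytes
 * when each segment can reference at most chunk bytes of zeroed memory.
 * This is a ceiling division; a trailing partial chunk costs one segment.
 */
static size_t zero_segments(size_t len, size_t chunk)
{
	assert(chunk > 0);
	return len / chunk + (len % chunk != 0);
}
```

Under these assumptions, zeroing an 8 MiB range takes
zero_segments(8 MiB, SKETCH_PAGE_SIZE) = 2048 segments with a PAGE_SIZE
zero page, versus zero_segments(8 MiB, SKETCH_PMD_SIZE) = 4 segments with
a PMD-sized one.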