Message-ID: <44a57df4-e54c-47ee-96b8-e2361c549239@gmail.com>
Date: Fri, 28 Jun 2024 18:30:43 +0300
From: Usama Arif
To: Johannes Weiner
Cc: akpm@linux-foundation.org, shakeel.butt@linux.dev, david@redhat.com, ying.huang@intel.com, hughd@google.com, willy@infradead.org, yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, Andi Kleen
Subject: Re: [PATCH v7 1/2] mm: store zero pages to be swapped out in a bitmap
In-Reply-To: <20240627161852.GA469122@cmpxchg.org>
References: <20240627105730.3110705-1-usamaarif642@gmail.com> <20240627105730.3110705-2-usamaarif642@gmail.com> <20240627161852.GA469122@cmpxchg.org>
User-Agent: Mozilla Thunderbird

On 27/06/2024 19:18, Johannes Weiner wrote:
> Hi Usama,
>
> On Thu, Jun 27, 2024 at 11:55:29AM +0100, Usama Arif wrote:
>> Approximately 10-20% of pages to be swapped out are zero pages [1].
>> Rather than reading/writing these pages to flash resulting
>> in increased I/O and flash wear, a bitmap can be used to mark these
>> pages as zero at write time, and the pages can be filled at
>> read time if the bit corresponding to the page is set.
>> With this patch, NVMe writes in Meta server fleet decreased
>> by almost 10% with conventional swap setup (zswap disabled).
>>
>> [1] https://lore.kernel.org/all/20171018104832epcms5p1b2232e2236258de3d03d1344dde9fce0@epcms5p1/
>>
>> Signed-off-by: Usama Arif
>> Reviewed-by: Chengming Zhou
>> Reviewed-by: Yosry Ahmed
>> Reviewed-by: Nhat Pham
>> Cc: David Hildenbrand
>> Cc: "Huang, Ying"
>> Cc: Hugh Dickins
>> Cc: Johannes Weiner
>> Cc: Matthew Wilcox (Oracle)
>> Cc: Shakeel Butt
>> Cc: Usama Arif
>> Cc: Andi Kleen
>> Signed-off-by: Andrew Morton
> This looks great to me, and the numbers speak for themselves. A few
> minor comments below:
>
>> ---
>>  include/linux/swap.h |   1 +
>>  mm/page_io.c         | 113 ++++++++++++++++++++++++++++++++++++++++++-
>>  mm/swapfile.c        |  20 ++++++++
>>  3 files changed, 133 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 3df75d62a835..ed03d421febd 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -299,6 +299,7 @@ struct swap_info_struct {
>>  	signed char type; /* strange name for an index */
>>  	unsigned int max; /* extent of the swap_map */
>>  	unsigned char *swap_map; /* vmalloc'ed array of usage counts */
>> +	unsigned long *zeromap; /* vmalloc'ed bitmap to track zero pages */
>>  	struct swap_cluster_info *cluster_info; /* cluster info. Only for SSD */
>>  	struct swap_cluster_list free_clusters; /* free clusters list */
>>  	unsigned int lowest_bit; /* index of first free in swap_map */
>> diff --git a/mm/page_io.c b/mm/page_io.c
>> index 6c1c1828bb88..480b8f221d90 100644
>> --- a/mm/page_io.c
>> +++ b/mm/page_io.c
>> @@ -172,6 +172,88 @@ int generic_swapfile_activate(struct swap_info_struct *sis,
>>  	goto out;
>>  }
>>
> It might be good to have a short comment that gives 1) an overview,
> that we're using a bitmap to avoid doing IO for zero-filled pages and
> 2) the locking, that the bits are protected by the locked swapcache
> folio and atomic updates are used to protect against RMW corruption
> due to other zero swap entries seeing concurrent updates.

Thanks! I have addressed the comments and will include the changes in
the next revision. Just a couple of things.

I will add the overview in swap_writepage(), at the point where the
folio is checked for being zero-filled and the zeromap bits are set,
rather than here.

>> +static bool is_folio_page_zero_filled(struct folio *folio, int i)
>> +{
>> +	unsigned long *data;
>> +	unsigned int pos, last_pos = PAGE_SIZE / sizeof(*data) - 1;
>> +	bool ret = false;
>> +
>> +	data = kmap_local_folio(folio, i * PAGE_SIZE);
>> +	if (data[last_pos])
>> +		goto out;
>> +static void folio_zero_fill(struct folio *folio)
>> +{
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < folio_nr_pages(folio); i++)
>> +		clear_highpage(folio_page(folio, i));
>> +}
> Should this be in highmem.h next to the other folio_zero_* functions?

Thanks for pointing to highmem.h. It already has folio_zero_range(),
which should do the same thing, so I think I can just call
folio_zero_range(folio, 0, folio_size(folio)) and this function
shouldn't be needed.
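
To make the scheme discussed above concrete, here is a minimal userspace sketch of the idea, not the patch itself: one bit per swap slot, set at "swap out" time when the page is zero-filled, and consulted at "swap in" time to refill the page without touching the backing device. Everything named `*_sketch` or `SKETCH_*` is hypothetical and exists only for illustration. The full-page scan after the last-word check is an assumption, since the quoted hunk is trimmed, and C11 atomics stand in for the kernel's atomic bit operations.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constants for the sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_WORDS_PER_PAGE (SKETCH_PAGE_SIZE / sizeof(unsigned long))
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for the per-swap-device zeromap: one bit per slot. */
struct zeromap_sketch {
	atomic_ulong *bits;
	size_t nr_slots;
};

/*
 * Check the last word first as a cheap early exit, then scan the rest.
 * The scan loop is an assumption about the trimmed part of the quoted hunk.
 */
bool page_is_zero_filled(const unsigned long *data)
{
	size_t last_pos = SKETCH_WORDS_PER_PAGE - 1;
	size_t pos;

	if (data[last_pos])
		return false;
	for (pos = 0; pos < last_pos; pos++)
		if (data[pos])
			return false;
	return true;
}

/*
 * "Swap out": returns true when the device write can be skipped.
 * Atomic read-modify-write keeps neighbouring bits in the same word
 * intact if other slots are updated concurrently.
 */
bool zeromap_swap_out(struct zeromap_sketch *zm, size_t slot,
		      const unsigned long *page)
{
	unsigned long mask = 1UL << (slot % SKETCH_BITS_PER_WORD);
	atomic_ulong *word = &zm->bits[slot / SKETCH_BITS_PER_WORD];

	if (page_is_zero_filled(page)) {
		atomic_fetch_or(word, mask);
		return true;
	}
	/* The slot may be reused for non-zero data, so clear a stale bit. */
	atomic_fetch_and(word, ~mask);
	return false;
}

/* "Swap in": returns true when the page was refilled with zeroes locally. */
bool zeromap_swap_in(const struct zeromap_sketch *zm, size_t slot,
		     unsigned long *page)
{
	unsigned long mask = 1UL << (slot % SKETCH_BITS_PER_WORD);

	if (!(atomic_load(&zm->bits[slot / SKETCH_BITS_PER_WORD]) & mask))
		return false;
	memset(page, 0, SKETCH_PAGE_SIZE);
	return true;
}
```

A caller would fall back to the normal device write or read whenever either function returns false. In the kernel the bits are additionally protected by the locked swapcache folio, as noted in the review comment, which this single-file sketch does not model.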