From: Chris Li
Subject: [PATCH v4 0/3] mm: swap: mTHP swap allocator base on swap cluster order
Date: Thu, 11 Jul 2024 00:29:04 -0700
Message-Id: <20240711-swap-allocator-v4-0-0295a4d4c7aa@kernel.org>
To: Andrew Morton
Cc: Kairui Song, Hugh Dickins, Ryan Roberts, "Huang, Ying", Kalesh Singh, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Chris Li, Barry Song

This is the short-term solution "swap cluster order" listed on slide 8 of
my "Swap Abstraction" discussion at the recent LSF/MM conference.

Commit 845982eb264bc ("mm: swap: allow storage of all mTHP orders") only
allocates mTHP swap entries from the new empty cluster list. This causes a
fragmentation issue reported by Barry:

https://lore.kernel.org/all/CAGsJ_4zAcJkuW016Cfi6wicRr8N9X+GJJhgMQdSMp+Ah+NSgNQ@mail.gmail.com/

The reason is that all the empty clusters have been exhausted, while there
are still plenty of free swap entries in clusters that are not 100% free.

This series remembers the swap allocation order in each cluster and keeps
track of a per-order nonfull cluster list for later allocation.

Patch 3 of this series gives SSD swap allocation a code path separate from
HDD allocation. The new allocator uses the cluster lists only and no longer
does a global scan of swap_map[] without the lock. This streamlines swap
allocation for SSD, and the code matches the execution flow much better.

User impact: For users that allocate and free mixed-order mTHP swap entries,
it greatly improves the success rate of mTHP swap allocation after the
initial phase. It also performs faster when the swapfile is close to full,
because the allocator can get a nonfull cluster from a list rather than
scanning a lot of swap_map entries.

This series still lacks the swap cache reclaim feature. The reclaim patches
are under development and testing right now, and I will post them to the
mailing list soon. For this reason, patch 3 is considered RFC and not ready
to merge.

With Barry's mthp test program V2:

Without:
$ ./thp_swap_allocator_test -a
Iteration 1: swpout inc: 32, swpout fallback inc: 192, Fallback percentage: 85.71%
Iteration 2: swpout inc: 0, swpout fallback inc: 231, Fallback percentage: 100.00%
Iteration 3: swpout inc: 0, swpout fallback inc: 227, Fallback percentage: 100.00%
...
Iteration 98: swpout inc: 0, swpout fallback inc: 224, Fallback percentage: 100.00%
Iteration 99: swpout inc: 0, swpout fallback inc: 215, Fallback percentage: 100.00%
Iteration 100: swpout inc: 0, swpout fallback inc: 222, Fallback percentage: 100.00%

$ ./thp_swap_allocator_test -a -s
Iteration 1: swpout inc: 0, swpout fallback inc: 224, Fallback percentage: 100.00%
Iteration 2: swpout inc: 0, swpout fallback inc: 218, Fallback percentage: 100.00%
Iteration 3: swpout inc: 0, swpout fallback inc: 222, Fallback percentage: 100.00%
..
Iteration 98: swpout inc: 0, swpout fallback inc: 228, Fallback percentage: 100.00%
Iteration 99: swpout inc: 0, swpout fallback inc: 230, Fallback percentage: 100.00%
Iteration 100: swpout inc: 0, swpout fallback inc: 229, Fallback percentage: 100.00%

$ ./thp_swap_allocator_test -s
Iteration 1: swpout inc: 0, swpout fallback inc: 224, Fallback percentage: 100.00%
Iteration 2: swpout inc: 0, swpout fallback inc: 218, Fallback percentage: 100.00%
Iteration 3: swpout inc: 0, swpout fallback inc: 222, Fallback percentage: 100.00%
..
Iteration 98: swpout inc: 0, swpout fallback inc: 228, Fallback percentage: 100.00%
Iteration 99: swpout inc: 0, swpout fallback inc: 230, Fallback percentage: 100.00%
Iteration 100: swpout inc: 0, swpout fallback inc: 229, Fallback percentage: 100.00%

$ ./thp_swap_allocator_test
Iteration 1: swpout inc: 0, swpout fallback inc: 224, Fallback percentage: 100.00%
Iteration 2: swpout inc: 0, swpout fallback inc: 218, Fallback percentage: 100.00%
Iteration 3: swpout inc: 0, swpout fallback inc: 222, Fallback percentage: 100.00%
..
Iteration 98: swpout inc: 0, swpout fallback inc: 228, Fallback percentage: 100.00%
Iteration 99: swpout inc: 0, swpout fallback inc: 230, Fallback percentage: 100.00%
Iteration 100: swpout inc: 0, swpout fallback inc: 229, Fallback percentage: 100.00%

With:
$ ./thp_swap_allocator_test -a
Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 3: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
...
Iteration 98: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 99: swpout inc: 215, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 100: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%

$ ./thp_swap_allocator_test -a -s
Iteration 1: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 3: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 4: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 5: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 6: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 7: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 8: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 9: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 10: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 11: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 12: swpout inc: 232, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 13: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 14: swpout inc: 223, swpout fallback inc: 3, Fallback percentage: 1.33%
Iteration 15: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 16: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 17: swpout inc: 212, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 18: swpout inc: 234, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 19: swpout inc: 220, swpout fallback inc: 6, Fallback percentage: 2.65%
Iteration 20: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 21: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 22: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 23: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 24: swpout inc: 232, swpout fallback inc: 1, Fallback percentage: 0.43%
Iteration 25: swpout inc: 215, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 26: swpout inc: 230, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 27: swpout inc: 219, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 28: swpout inc: 225, swpout fallback inc: 1, Fallback percentage: 0.44%
Iteration 29: swpout inc: 226, swpout fallback inc: 2, Fallback percentage: 0.88%
Iteration 30: swpout inc: 224, swpout fallback inc: 1, Fallback percentage: 0.44%
Iteration 31: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 32: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 33: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 34: swpout inc: 226, swpout fallback inc: 2, Fallback percentage: 0.88%
Iteration 35: swpout inc: 230, swpout fallback inc: 3, Fallback percentage: 1.29%
Iteration 36: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 37: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 38: swpout inc: 221, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 39: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
Iteration 40: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 41: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 42: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 43: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 44: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 45: swpout inc: 221, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 46: swpout inc: 221, swpout fallback inc: 2, Fallback percentage: 0.90%
Iteration 47: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 48: swpout inc: 220, swpout fallback inc: 1, Fallback percentage: 0.45%
Iteration 49: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 50: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 51: swpout inc: 224, swpout fallback inc: 2, Fallback percentage: 0.88%
Iteration 52: swpout inc: 229, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 53: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 54: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 55: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 56: swpout inc: 226, swpout fallback inc: 2, Fallback percentage: 0.88%
Iteration 57: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 58: swpout inc: 219, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 59: swpout inc: 224, swpout fallback inc: 1, Fallback percentage: 0.44%
Iteration 60: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
Iteration 61: swpout inc: 217, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 62: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 63: swpout inc: 223, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 64: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 65: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 66: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 67: swpout inc: 220, swpout fallback inc: 2, Fallback percentage: 0.90%
Iteration 68: swpout inc: 224, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 69: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 70: swpout inc: 219, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 71: swpout inc: 225, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 72: swpout inc: 231, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 73: swpout inc: 218, swpout fallback inc: 5, Fallback percentage: 2.24%
Iteration 74: swpout inc: 223, swpout fallback inc: 5, Fallback percentage: 2.19%
Iteration 75: swpout inc: 222, swpout fallback inc: 7, Fallback percentage: 3.06%
Iteration 76: swpout inc: 226, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 77: swpout inc: 229, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 78: swpout inc: 215, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 79: swpout inc: 223, swpout fallback inc: 2, Fallback percentage: 0.89%
Iteration 80: swpout inc: 222, swpout fallback inc: 1, Fallback percentage: 0.45%
Iteration 81: swpout inc: 218, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 82: swpout inc: 228, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 83: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
Iteration 84: swpout inc: 222, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 85: swpout inc: 213, swpout fallback inc: 1, Fallback percentage: 0.47%
Iteration 86: swpout inc: 215, swpout fallback inc: 8, Fallback percentage: 3.59%
Iteration 87: swpout inc: 222, swpout fallback inc: 1, Fallback percentage: 0.45%
Iteration 88: swpout inc: 227, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 89: swpout inc: 222, swpout fallback inc: 6, Fallback percentage: 2.63%
Iteration 90: swpout inc: 224, swpout fallback inc: 1, Fallback percentage: 0.44%
Iteration 91: swpout inc: 214, swpout fallback inc: 1, Fallback percentage: 0.47%
Iteration 92: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 93: swpout inc: 221, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 94: swpout inc: 223, swpout fallback inc: 2, Fallback percentage: 0.89%
Iteration 95: swpout inc: 222, swpout fallback inc: 1, Fallback percentage: 0.45%
Iteration 96: swpout inc: 223, swpout fallback inc: 4, Fallback percentage: 1.76%
Iteration 97: swpout inc: 223, swpout fallback inc: 7, Fallback percentage: 3.04%
Iteration 98: swpout inc: 227, swpout fallback inc: 1, Fallback percentage: 0.44%
Iteration 99: swpout inc: 229, swpout fallback inc: 1, Fallback percentage: 0.43%
Iteration 100: swpout inc: 229, swpout fallback inc: 0, Fallback percentage: 0.00%

$ ./thp_swap_allocator_test
Iteration 1: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 134, swpout fallback inc: 98, Fallback percentage: 42.24%
Iteration 3: swpout inc: 72, swpout fallback inc: 154, Fallback percentage: 68.14%
Iteration 4: swpout inc: 40, swpout fallback inc: 183, Fallback percentage: 82.06%
Iteration 5: swpout inc: 27, swpout fallback inc: 199, Fallback percentage: 88.05%
Iteration 6: swpout inc: 22, swpout fallback inc: 202, Fallback percentage: 90.18%
Iteration 7: swpout inc: 12, swpout fallback inc: 216, Fallback percentage: 94.74%
Iteration 8: swpout inc: 14, swpout fallback inc: 214, Fallback percentage: 93.86%
Iteration 9: swpout inc: 5, swpout fallback inc: 221, Fallback percentage: 97.79%
Iteration 10: swpout inc: 10, swpout fallback inc: 218, Fallback percentage: 95.61%
...
Iteration 97: swpout inc: 12, swpout fallback inc: 207, Fallback percentage: 94.52%
Iteration 98: swpout inc: 8, swpout fallback inc: 219, Fallback percentage: 96.48%
Iteration 99: swpout inc: 16, swpout fallback inc: 218, Fallback percentage: 93.16%
Iteration 100: swpout inc: 10, swpout fallback inc: 218, Fallback percentage: 95.61%

$ ./thp_swap_allocator_test -s
Iteration 1: swpout inc: 233, swpout fallback inc: 0, Fallback percentage: 0.00%
Iteration 2: swpout inc: 84, swpout fallback inc: 148, Fallback percentage: 63.79%
Iteration 3: swpout inc: 39, swpout fallback inc: 195, Fallback percentage: 83.33%
Iteration 4: swpout inc: 16, swpout fallback inc: 217, Fallback percentage: 93.13%
Iteration 5: swpout inc: 11, swpout fallback inc: 214, Fallback percentage: 95.11%
Iteration 6: swpout inc: 10, swpout fallback inc: 218, Fallback percentage: 95.61%
...
Iteration 96: swpout inc: 5, swpout fallback inc: 225, Fallback percentage: 97.83%
Iteration 97: swpout inc: 2, swpout fallback inc: 215, Fallback percentage: 99.08%
Iteration 98: swpout inc: 2, swpout fallback inc: 220, Fallback percentage: 99.10%
Iteration 99: swpout inc: 4, swpout fallback inc: 222, Fallback percentage: 98.23%
Iteration 100: swpout inc: 3, swpout fallback inc: 221, Fallback percentage: 98.66%

Kernel compile under tmpfs with cgroup memory.max = 2G.
12 cores, 24 hyperthreads, 32 jobs.

HDD swap, 3 runs average, 20G swap file:

Without:
user	4186.290
system	421.743
real	597.317

With:
user	4113.897
system	413.123
real	659.543

SSD swap, 10 runs average, 20G swap partition:

Without:
user	4736.810
system	500.921
real	250.243

With:
user	4729.478
system	500.265
real	249.633

Two zram swap devices: zram0 1.4G, zram1 20G.

The idea is to force zram0 to become almost full and then overflow to zram1.

Two zram, 10 runs average:

Without:
user	4600.693
system	384.105
real	238.735

With:
user	4604.502
system	382.087
real	239.063

Reported-by: Barry Song <21cnbao@gmail.com>
Signed-off-by: Chris Li
---
Changes in v4:
- Remove a warning in patch 2.
- Allocate from the free cluster list before the nonfull list, reverting
  the v3 behavior.
- Add the cluster_index and cluster_offset functions.
- Patch 3 has a new allocation path for SSD.
- HDD swap allocation does not need to consider clusters any more.

Changes in v3:
- Use v1 as the base.
- Rename "next" to "list" for the list field, suggested by Ying.
- Update the comment on the locking rules for the cluster fields and list,
  suggested by Ying.
- Allocate from the nonfull list before attempting the free list, suggested
  by Kairui.
- Link to v2: https://lore.kernel.org/r/20240614-swap-allocator-v2-0-2a513b4a7f2f@kernel.org

Changes in v2:
- Abandoned.
- Link to v1: https://lore.kernel.org/r/20240524-swap-allocator-v1-0-47861b423b26@kernel.org

---
Chris Li (3):
      mm: swap: swap cluster switch to double link list
      mm: swap: mTHP allocate swap entries from nonfull list
      RFC: mm: swap: seperate SSD allocation from scan_swap_map_slots()

 include/linux/swap.h |  30 ++--
 mm/swapfile.c        | 490 +++++++++++++++++++++++---------------------------
 2 files changed, 238 insertions(+), 282 deletions(-)
---
base-commit: ff3a648ecb9409aff1448cf4f6aa41d78c69a3bc
change-id: 20240523-swap-allocator-1534c480ece4

Best regards,
-- 
Chris Li
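
For readers who want the shape of the per-order nonfull list idea without
opening mm/swapfile.c, here is a minimal userspace sketch. It is not the
code in this series. struct swap_dev, swap_dev_init(), alloc_entries(),
NR_ORDERS, CLUSTER_SLOTS and NR_CLUSTERS are hypothetical names and sizes
chosen for illustration, and locking, freeing and the per-CPU current
cluster are all left out. It only shows a cluster remembering its order,
one nonfull list per order, and the v4 policy of trying the free list
before the nonfull list.

```c
#include <assert.h>

#define NR_ORDERS	4	/* hypothetical: orders 0..3 */
#define CLUSTER_SLOTS	8	/* hypothetical: real clusters are larger */
#define NR_CLUSTERS	4	/* hypothetical: tiny swap device */

struct cluster {
	struct cluster *prev, *next;	/* doubly linked list node */
	unsigned int used;		/* slots handed out so far */
	int order;			/* order this cluster serves, -1 if unset */
};

struct swap_dev {
	struct cluster clusters[NR_CLUSTERS];
	struct cluster free_list;		/* list head: empty clusters */
	struct cluster nonfull[NR_ORDERS];	/* list heads: one per order */
};

static void list_init(struct cluster *h) { h->prev = h->next = h; }
static int list_empty(const struct cluster *h) { return h->next == h; }

static void list_del(struct cluster *c)
{
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->prev = c->next = c;
}

static void list_add_tail(struct cluster *c, struct cluster *h)
{
	c->prev = h->prev;
	c->next = h;
	h->prev->next = c;
	h->prev = c;
}

void swap_dev_init(struct swap_dev *d)
{
	list_init(&d->free_list);
	for (int o = 0; o < NR_ORDERS; o++)
		list_init(&d->nonfull[o]);
	for (int i = 0; i < NR_CLUSTERS; i++) {
		d->clusters[i].used = 0;
		d->clusters[i].order = -1;
		list_add_tail(&d->clusters[i], &d->free_list);
	}
}

/* Return the first slot offset of 2^order entries, or -1 to signal fallback. */
long alloc_entries(struct swap_dev *d, int order)
{
	unsigned int nr = 1u << order;
	struct cluster *c;
	long off;

	assert(order >= 0 && order < NR_ORDERS);

	if (!list_empty(&d->free_list)) {
		/* Free list first; the cluster remembers its order. */
		c = d->free_list.next;
		list_del(c);
		c->order = order;
	} else if (!list_empty(&d->nonfull[order])) {
		/* Otherwise reuse a partly used cluster of the same order. */
		c = d->nonfull[order].next;
		list_del(c);
	} else {
		return -1;
	}

	off = (long)(c - d->clusters) * CLUSTER_SLOTS + (long)c->used;
	c->used += nr;
	if (c->used < CLUSTER_SLOTS)
		list_add_tail(c, &d->nonfull[order]);	/* still has room */
	return off;
}
```

With these toy sizes, four order-2 allocations consume the four free
clusters, a fifth order-2 allocation is served from the order-2 nonfull
list, and an order-0 request at that point fails because no free cluster
and no order-0 nonfull cluster exists. Reclaiming or stealing from other
orders is outside this sketch.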