From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 12 Aug 2025 15:27:59 -0700
To: 
mm-commits@vger.kernel.org,nphamcs@gmail.com,kasong@tencent.com,hannes@cmpxchg.org,flintglass@gmail.com,david@redhat.com,chrisl@kernel.org,chengming.zhou@linux.dev,bhe@redhat.com,baohua@kernel.org,sj@kernel.org,akpm@linux-foundation.org
From: Andrew Morton
Subject: + mm-zswap-store-page_size-compression-failed-page-as-is.patch added to mm-unstable branch
Message-Id: <20250812222759.EF848C4CEF0@smtp.kernel.org>
Precedence: bulk
X-Mailing-List: mm-commits@vger.kernel.org

The patch titled Subject: mm/zswap: store Subject: mm/zswap: store length == PAGE_SIZE'.

Because the uncompressed data is saved in zpool, the same as the compressed
data, this introduces no change in terms of memory management, including
movability and migratability of the involved pages.

This change also does not increase the per-zswap-entry metadata overhead.
But as the number of incompressible pages increases, the total zswap
metadata overhead increases proportionally. The overhead should not be
problematic in usual cases, since the zswap metadata for a single zswap
entry is much smaller than PAGE_SIZE, and in common zswap use cases there
should be a sufficient amount of compressible pages. It can also be
mitigated by zswap writeback.

When writeback is disabled, the additional overhead could be problematic.
For that case, keep the current behavior, which just returns the failure
and lets swap_writeout() put the page back on the active LRU list.

Knowing how many compression failures happened will be useful for future
investigations. Add a new debugfs file, compress_fail, for the purpose.

Tests
-----

I tested this patch using a simple self-written microbenchmark that is
available at GitHub[1]. You can reproduce the test I did by executing
run_tests.sh of the repo on your system. Note that the repo's
documentation is not good as of this writing, so you may need to read and
use the code.

The basic test scenario is simple. 
Run a test program that makes artificial accesses to memory with
artificial content under a memory.high-set memory limit, and measure how
many accesses were made in a given time.

The test program repeatedly and randomly accesses three anonymous memory
regions. The regions are all 500 MiB in size and are accessed with the
same probability. Two of them are filled with simple content that can
easily be compressed, while the remaining one is filled with content read
from /dev/urandom, which is easy to fail at compressing to

Suggested-by: Nhat Pham
Suggested-by: Takero Funaki
Acked-by: Nhat Pham
Cc: Chengming Zhou
Cc: David Hildenbrand
Cc: Johannes Weiner
Cc: SeongJae Park
Cc: Baoquan He
Cc: Barry Song
Cc: Chris Li
Cc: Kairui Song
Signed-off-by: Andrew Morton
---

 mm/zswap.c | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

--- a/mm/zswap.c~mm-zswap-store-page_size-compression-failed-page-as-is
+++ a/mm/zswap.c
@@ -60,6 +60,8 @@ static u64 zswap_written_back_pages;
 static u64 zswap_reject_reclaim_fail;
 /* Store failed due to compression algorithm failure */
 static u64 zswap_reject_compress_fail;
+/* Compression into a size of req), &acomp_ctx->wait);
 	dlen = acomp_ctx->req->dlen;
-	if (comp_ret)
-		goto unlock;
+
+	/*
+	 * If a page cannot be compressed into a size smaller than PAGE_SIZE,
+	 * save the content as is without a compression, to keep the LRU order
+	 * of writebacks. If writeback is disabled, reject the page since it
+	 * only adds metadata overhead. swap_writeout() will put the page back
+	 * to the active LRU list in the case.
+	 */
+	if (comp_ret || dlen >= PAGE_SIZE) {
+		zswap_compress_fail++;
+		if (mem_cgroup_zswap_writeback_enabled(
+					folio_memcg(page_folio(page)))) {
+			comp_ret = 0;
+			dlen = PAGE_SIZE;
+			dst = kmap_local_page(page);
+		} else {
+			comp_ret = comp_ret ? comp_ret : -EINVAL;
+			goto unlock;
+		}
+	}
 
 	zpool = pool->zpool;
 	gfp = GFP_NOWAIT | __GFP_NORETRY | __GFP_HIGHMEM | __GFP_MOVABLE;
@@ -990,6 +1010,8 @@ static bool zswap_compress(struct page *
 	entry->length = dlen;
 
 unlock:
+	if (dst != acomp_ctx->buffer)
+		kunmap_local(dst);
 	if (comp_ret == -ENOSPC || alloc_ret == -ENOSPC)
 		zswap_reject_compress_poor++;
 	else if (comp_ret)
@@ -1012,6 +1034,14 @@ static bool zswap_decompress(struct zswa
 	acomp_ctx = acomp_ctx_get_cpu_lock(entry->pool);
 	obj = zpool_obj_read_begin(zpool, entry->handle, acomp_ctx->buffer);
 
+	/* zswap entries of length PAGE_SIZE are not compressed. */
+	if (entry->length == PAGE_SIZE) {
+		memcpy_to_folio(folio, 0, obj, entry->length);
+		zpool_obj_read_end(zpool, entry->handle, obj);
+		acomp_ctx_put_unlock(acomp_ctx);
+		return true;
+	}
+
 	/*
 	 * zpool_obj_read_begin() might return a kmap address of highmem when
 	 * acomp_ctx->buffer is not used. However, sg_init_one() does not
@@ -1809,6 +1839,8 @@ static int zswap_debugfs_init(void)
 			   zswap_debugfs_root, &zswap_reject_kmemcache_fail);
 	debugfs_create_u64("reject_compress_fail", 0444,
 			   zswap_debugfs_root, &zswap_reject_compress_fail);
+	debugfs_create_u64("compress_fail", 0444,
+			   zswap_debugfs_root, &zswap_compress_fail);
 	debugfs_create_u64("reject_compress_poor", 0444,
 			   zswap_debugfs_root, &zswap_reject_compress_poor);
 	debugfs_create_u64("decompress_fail", 0444,
_

Patches currently in -mm which might be from sj@kernel.org are

mm-zswap-store-page_size-compression-failed-page-as-is.patch
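
For readers who want to reason about the store-side decision outside the
kernel, the following is a minimal userspace sketch of the branch the patch
adds to zswap_compress(), plus the length check added to zswap_decompress().
It is not kernel code. The names decide_store(), struct store_decision,
load_is_raw(), compress_fail_count and SKETCH_PAGE_SIZE are hypothetical and
exist only for this illustration, the 4096-byte page size is an assumption,
and the writeback_enabled flag stands in for the result of
mem_cgroup_zswap_writeback_enabled().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size for the sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical result type: what the store path would do with a page. */
struct store_decision {
	int ret;           /* 0 on success, negative errno on rejection */
	size_t stored_len; /* bytes kept in the pool, 0 if rejected */
	bool stored_raw;   /* true if the page is kept uncompressed */
};

/* Stand-in for the new compress_fail debugfs counter. */
unsigned long compress_fail_count;

/*
 * Mirror of the added branch: a compressor error, or an output that is
 * not smaller than a page, counts as a compression failure. With
 * writeback enabled the page is stored as is with length PAGE_SIZE;
 * otherwise the store is rejected with the compressor's error, or
 * -EINVAL when the compressor itself reported success.
 */
struct store_decision decide_store(int comp_ret, size_t dlen,
				   bool writeback_enabled)
{
	struct store_decision d = {
		.ret = comp_ret,
		.stored_len = dlen,
		.stored_raw = false,
	};

	if (comp_ret || dlen >= SKETCH_PAGE_SIZE) {
		compress_fail_count++;
		if (writeback_enabled) {
			d.ret = 0;
			d.stored_len = SKETCH_PAGE_SIZE;
			d.stored_raw = true;
		} else {
			d.ret = comp_ret ? comp_ret : -EINVAL;
			d.stored_len = 0;
		}
	}
	return d;
}

/* Load side: an entry of exactly one page is copied, not decompressed. */
bool load_is_raw(size_t entry_length)
{
	return entry_length == SKETCH_PAGE_SIZE;
}
```

Under these assumptions, decide_store(0, 4096, true) reports a raw store of
one page, while the same call with writeback disabled returns -EINVAL. Both
calls increment the counter, which matches the patch counting every
compression failure regardless of whether the page ends up stored.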