From: Chengming Zhou
Date: Mon, 8 Jul 2024 11:56:34 +0800
Subject: Re: [PATCH v2 5/6] mm: zswap: store incompressible page as-is
To: Takero Funaki, Johannes Weiner, Yosry Ahmed, Nhat Pham, Jonathan Corbet, Andrew Morton, Domenico Cerasuolo
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Message-ID: <0afc769e-241a-404e-b2c9-a6a27bdd3c72@linux.dev>
In-Reply-To: <20240706022523.1104080-6-flintglass@gmail.com>
References: <20240706022523.1104080-1-flintglass@gmail.com> <20240706022523.1104080-6-flintglass@gmail.com>

On 2024/7/6 10:25, Takero Funaki wrote:
> This patch allows zswap to accept incompressible pages and store them
> into zpool if possible.
>
> This change is required to achieve zero rejection on zswap_store(). With
> proper amount of proactive shrinking, swapout can be buffered by zswap
> without IO latency. Storing incompressible pages may seem costly, but it
> can reduce latency. A rare incompressible page in a large batch of
> compressive pages can delay the entire batch during swapping.
>
> The memory overhead is negligible because the underlying zsmalloc
> already accepts nearly incompressible pages. zsmalloc stores data close
> to PAGE_SIZE to a dedicated page. Thus storing as-is saves decompression
> cycles without allocation overhead. zswap itself has not rejected pages
> in these cases.
>
> To store the page as-is, use the compressed data size field `length` in
> struct `zswap_entry`. The length == PAGE_SIZE indicates
> incompressible data.
>
> If a zpool backend does not support allocating PAGE_SIZE (zbud), the
> behavior remains unchanged. The allocation failure reported by the zpool
> blocks accepting the page as before.
>
> Signed-off-by: Takero Funaki
> ---
>  mm/zswap.c | 36 +++++++++++++++++++++++++++++++++---
>  1 file changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 76691ca7b6a7..def0f948a4ab 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -186,6 +186,8 @@ static struct shrinker *zswap_shrinker;
>   * length - the length in bytes of the compressed page data. Needed during
>   *          decompression. For a same value filled page length is 0, and both
>   *          pool and lru are invalid and must be ignored.
> + *          If length is equal to PAGE_SIZE, the data stored in handle is
> + *          not compressed. The data must be copied to page as-is.
>   * pool - the zswap_pool the entry's data is in
>   * handle - zpool allocation handle that stores the compressed page data
>   * value - value of the same-value filled pages which have same content
> @@ -969,9 +971,23 @@ static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
>  	 */
>  	comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
>  	dlen = acomp_ctx->req->dlen;
> -	if (comp_ret)
> +
> +	/* coa_compress returns -EINVAL for errors including insufficient dlen */
> +	if (comp_ret && comp_ret != -EINVAL)
>  		goto unlock;

Seems we don't need to care about this? "comp_ret" is no longer useful.
Just:

	if (comp_ret || dlen > PAGE_SIZE - 64)
		dlen = PAGE_SIZE;

And remove the checks of comp_ret at the end.

>
> +	/*
> +	 * If the data cannot be compressed well, store the data as-is.
> +	 * Switching by a threshold at
> +	 * PAGE_SIZE - (allocation granularity)
> +	 * zbud and z3fold use 64B granularity.
> +	 * zsmalloc stores >3632B in one page for 4K page arch.
> +	 */
> +	if (comp_ret || dlen > PAGE_SIZE - 64) {
> +		/* we do not use compressed result anymore */
> +		comp_ret = 0;
> +		dlen = PAGE_SIZE;
> +	}
>  	zpool = zswap_find_zpool(entry);
>  	gfp = __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM;
>  	if (zpool_malloc_support_movable(zpool))
> @@ -981,14 +997,20 @@ static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
>  		goto unlock;
>
>  	buf = zpool_map_handle(zpool, handle, ZPOOL_MM_WO);
> -	memcpy(buf, dst, dlen);
> +
> +	/* PAGE_SIZE indicates not compressed. */
> +	if (dlen == PAGE_SIZE)
> +		memcpy_from_folio(buf, folio, 0, PAGE_SIZE);

We actually don't need to hold the mutex if we are just copying the folio.

Thanks.

> +	else
> +		memcpy(buf, dst, dlen);
> +
>  	zpool_unmap_handle(zpool, handle);
>
>  	entry->handle = handle;
>  	entry->length = dlen;
>
>  unlock:
> -	if (comp_ret == -ENOSPC || alloc_ret == -ENOSPC)
> +	if (alloc_ret == -ENOSPC)
>  		zswap_reject_compress_poor++;
>  	else if (comp_ret)
>  		zswap_reject_compress_fail++;
> @@ -1006,6 +1028,14 @@ static void zswap_decompress(struct zswap_entry *entry, struct page *page)
>  	struct crypto_acomp_ctx *acomp_ctx;
>  	u8 *src;
>
> +	if (entry->length == PAGE_SIZE) {
> +		/* the content is not compressed. copy back as-is. */
> +		src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> +		memcpy_to_page(page, 0, src, entry->length);
> +		zpool_unmap_handle(zpool, entry->handle);
> +		return;
> +	}
> +
>  	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
>  	mutex_lock(&acomp_ctx->mutex);
>
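
For illustration only, the simplified decision discussed in this thread (any compressor error, or a result larger than PAGE_SIZE - 64, means "store the page as-is", and length == PAGE_SIZE marks such an entry on load) could be modelled in userspace roughly as below. This is a sketch under assumptions, not code from the patch: it assumes a 4 KiB page, and every sketch_* name, including the struct, is hypothetical and stands in for the real zswap_entry and zpool handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */
#define SKETCH_GRANULE   64u	/* margin used by the threshold in the quoted patch */

/* Hypothetical stand-in for struct zswap_entry plus its zpool buffer. */
struct sketch_entry {
	size_t length;	/* SKETCH_PAGE_SIZE means "stored as-is" */
	unsigned char data[SKETCH_PAGE_SIZE];
};

/*
 * Pick the number of bytes to store. comp_ret is the compressor's return
 * code (0 on success), dlen the compressed length it reported. Any error or
 * a poorly compressed result falls back to a full, uncompressed page.
 */
static size_t sketch_store_len(int comp_ret, size_t dlen)
{
	if (comp_ret || dlen > SKETCH_PAGE_SIZE - SKETCH_GRANULE)
		return SKETCH_PAGE_SIZE;
	return dlen;
}

/* Store either the raw page or the compressed buffer, depending on length. */
static void sketch_store(struct sketch_entry *e, const unsigned char *page,
			 const unsigned char *compressed, int comp_ret,
			 size_t dlen)
{
	e->length = sketch_store_len(comp_ret, dlen);
	if (e->length == SKETCH_PAGE_SIZE)
		memcpy(e->data, page, SKETCH_PAGE_SIZE);
	else
		memcpy(e->data, compressed, e->length);
}

/*
 * Load path: returns true if the entry was stored as-is and has been copied
 * back into page; returns false if the caller still has to decompress.
 */
static bool sketch_load_raw(const struct sketch_entry *e, unsigned char *page)
{
	assert(e->length <= SKETCH_PAGE_SIZE);
	if (e->length != SKETCH_PAGE_SIZE)
		return false;
	memcpy(page, e->data, SKETCH_PAGE_SIZE);
	return true;
}
```

With these assumptions, a reported length of 4032 bytes is still stored compressed, while 4033 bytes or any non-zero comp_ret ends up as a 4096-byte raw copy. Note that the raw-copy branches never touch the compressor's scratch buffer, which is the reason the as-is copy would not need the per-CPU mutex.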