From: Yosry Ahmed
Date: Fri, 29 Mar 2024 15:29:07 -0700
Subject: Re: [RFC PATCH 6/9] mm: zswap: drop support for non-zero same-filled pages handling
To: Johannes Weiner
Cc: Nhat Pham, Andrew Morton, Chengming Zhou, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Srividya Desireddy
In-Reply-To: <20240329211723.GA882964@cmpxchg.org>
On Fri, Mar 29, 2024 at 2:17 PM Johannes Weiner wrote:
>
> On Fri, Mar 29, 2024 at 11:56:46AM -0700, Yosry Ahmed wrote:
> > > I perf'd zswapping out data that is 10% same-filled and 90% data that
> > > always needs compression. It does nothing but madvise(MADV_PAGEOUT),
> > > and the zswap_store() stack is already only ~60% of the cycles.
> > >
> > > Using zsmalloc + zstd, this is the diff between vanilla and my patch:
> > >
> > > # Baseline  Delta Abs  Shared Object      Symbol
> > > # ........  .........  .................  ..................................
> > > #
> > >      4.34%     -3.02%  [kernel.kallsyms]  [k] zswap_store
> > >     11.07%     +1.41%  [kernel.kallsyms]  [k] ZSTD_compressBlock_doubleFast
> > >     15.55%     +0.91%  [kernel.kallsyms]  [k] FSE_buildCTable_wksp
> > >
> > > As expected, we have to compress a bit more; on the other hand we're
> > > removing the content scan for same-filled for 90% of the pages that
> > > don't benefit from it. They almost amortize each other. Let's round it
> > > up and the remaining difference is ~1%.
> >
> > Thanks for the data, this is very useful.
> >
> > There is also the load path. The original patch that introduced
> > same-filled pages reports more gains on the load side, which also
> > happens to be more latency-sensitive. Of course, the data could be
> > outdated -- but it's a different type of workload than what will be
> > running in a data center fleet AFAICT.
> >
> > Is there also no noticeable difference on the load side in your data?
>
>      9.40%     +2.51%  [kernel.kallsyms]  [k] ZSTD_safecopy
>      0.76%     -0.48%  [kernel.kallsyms]  [k] zswap_load
>
> About 2% net.
>
> Checking other compression algorithms, lzo eats 27.58%, and lz4
> 13.82%. The variance between these alone makes our "try to be smarter
> than an actual compression algorithm" code look even sillier.
>
> > Also, how much increase did you observe in the size of compressed data
> > with your patch? Just curious about how zstd ended up handling those
> > pages.
>
> Checking zsmalloc stats, it did pack the same-filled ones down into 32
> byte objects. So 0.7% of their original size. That's negligible, even
> for workloads that have an unusually high share of them.

Glad to see that this was handled appropriately.

> > > It's difficult to make the case that this matters to any real
> > > workloads with actual think time in between paging.
> >
> > If the difference in performance is 1%, I surely agree.
> >
> > The patch reported 19-32% improvement in store time for same-filled
> > pages depending on the workload and platform. For 10% same-filled
> > pages, this should roughly correspond to 2-3% overall improvement,
> > which is not too far from the 1% you observed.
>
> Right.
>
> > The patch reported much larger improvement for load times (which
> > matters more), 49-85% for same-filled pages.
> > If this corresponds to 5-8% overall improvement in zswap load time
> > for a workload with 10% same-filled pages, that would be very
> > significant. I don't expect to see such improvements tbh, but we
> > should check.
>
> No, I'm seeing around 2% net.

You mentioned that other compressors eat more cycles in this case, so
perhaps the data in the original patch comes from one of those
compressors.

> > > But let's say you do make the case that zero-filled pages are worth
> > > optimizing for.
> >
> > I am not saying they are for sure, but removing the same-filled
> > checking didn't seem to improve performance much in my testing, so the
> > cost seems to be mostly in maintenance. This makes it seem to me that
> > it *could* be a good tradeoff to only handle zero-filled pages. We can
> > make them slightly faster and we can trim the complexity -- as shown
> > by this patch.
>
> In the original numbers, there was a certain percentage of non-zero
> same-filled pages. You still first have to find that number for real
> production loads to determine what the tradeoff actually is.
>
> And I disagree that the code is less complex. There are fewer lines to
> the memchr_inv() than to the open-coded word-scan, but that scan is
> dead simple, self-contained and out of the way.
>
> So call that a wash in terms of maintenance burden, but for a
> reduction in functionality and a regression risk (however small).
>
> The next patch to store them as special xarray entries is also a wash
> at best. It trades the entry having an implicit subtype for no type at
> all. zswap_load_zero_filled() looks like it's the fast path, and
> decompression the fallback; the entry destructor is now called
> "zswap_tree_free_element" and takes a void pointer. It also re-adds
> most of the lines deleted by the zero-only patch.
>
> I think this is actually a bit worse than the status quo.
>
> But my point is, this all seems like a lot of opportunity cost in
> terms of engineering time, review bandwidth, regression risk, and
> readability, hackability, reliability, predictability of the code -
> for gains that are still not shown to actually matter in practice.

Yeah, in terms of code cleanliness it did not turn out the way I thought
it would. Even using different subtypes would mean either similarly
abstract (yet less clear) helpers, or completely separate paths with
code duplication -- neither of which is ideal. So perhaps it's better to
leave the current code alone (and maybe clean it up slightly), or to
remove same-filled handling completely. I wanted to see what others
thought, and I was aware it's controversial (hence the RFC) :)

Anyway, I will send a separate series with only the cleanups and the
removal of the knobs. We can discuss completely removing same-filled
handling separately. I suspect a 2% regression in the load path (and
potentially larger with other compressors) may not be okay.

If handling for zero-filled pages is added directly in reclaim as you
suggested, though, then the justification for handling other patterns
in zswap becomes much weaker :) Handling it in reclaim also means we
avoid IO for zero pages completely, which would be even more beneficial
than just avoiding compression/decompression in zswap. (A rough sketch
of the two detection variants we keep comparing is at the bottom of
this mail.)

>
> https://lore.kernel.org/all/20240328122352.a001a56aed97b01ac5931998@linux-foundation.org/
>
> This needs numbers to show not just that your patches are fine, but
> that any version of this optimization actually matters for real
> workloads. Without that, my vote would be NAK.
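For concreteness, since we keep comparing the two checks: below is a
rough, self-contained userspace-style sketch of a zero-only check next
to an open-coded same-value word scan. This is not the actual kernel
code (the kernel side would use memchr_inv() for the zero check and a
word loop over the kmapped page), and PAGE_SIZE is hardcoded to 4K here
just for illustration; it is only meant to show the size and shape of
what is being debated.

/* Illustrative userspace sketch only -- not kernel code. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Zero-only check: true if every byte of the page is zero.
 * In the kernel this would be memchr_inv(page, 0, PAGE_SIZE). */
static bool page_is_zero_filled(const void *page)
{
	const unsigned char *p = page;

	for (size_t i = 0; i < PAGE_SIZE; i++)
		if (p[i])
			return false;
	return true;
}

/* Open-coded same-value word scan: true if the page repeats a single
 * word; return that word so it can be stored instead of the page. */
static bool page_is_same_filled(const void *page, unsigned long *value)
{
	const unsigned long *word = page;
	size_t nr_words = PAGE_SIZE / sizeof(*word);

	for (size_t i = 1; i < nr_words; i++)
		if (word[i] != word[0])
			return false;

	*value = word[0];
	return true;
}

int main(void)
{
	/* Word-typed buffer so the word scan is naturally aligned. */
	static unsigned long buf[PAGE_SIZE / sizeof(unsigned long)];
	unsigned long val;

	memset(buf, 0, sizeof(buf));
	printf("zero page: zero=%d same=%d\n",
	       page_is_zero_filled(buf), page_is_same_filled(buf, &val));

	memset(buf, 0xaa, sizeof(buf));
	printf("0xaa page: zero=%d same=%d (val=%#lx)\n",
	       page_is_zero_filled(buf), page_is_same_filled(buf, &val),
	       val);
	return 0;
}

Either variant is only a handful of lines, so the real question, as you
say, is whether the cycles and the extra special case pay for
themselves on real workloads.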