From: Yosry Ahmed
Date: Sun, 19 Feb 2023 01:34:26 -0800
Subject: Re: [LSF/MM/BPF TOPIC] Swap Abstraction / Native Zswap
To: Matthew Wilcox
Cc: lsf-pc@lists.linux-foundation.org, Johannes Weiner, Linux-MM,
    Michal Hocko, Shakeel Butt, David Rientjes, Hugh Dickins,
    Seth Jennings, Dan Streetman, Vitaly Wool, Yang Shi, Peter Xu,
    Minchan Kim, Andrew Morton, Huang Ying, NeilBrown

On Sat, Feb 18, 2023 at 8:31 PM Matthew Wilcox wrote:
>
> On Sat, Feb 18, 2023 at 02:38:40PM -0800, Yosry Ahmed wrote:
> > Hello everyone,
> >
> > I would like to propose a topic for the upcoming LSF/MM/BPF in May
> > 2023 about swap & zswap (hope I am not too late).
>
> Submissions are due March 1st, I believe, so not too late.
>
> > ==================== Bottom Line ====================
> > It would be nice to discuss the potential here and the tradeoffs.
> > I know that other folks using zswap (or interested in using it) may find
> > this very useful. I am sure I am missing some context on why things
> > are the way they are, and perhaps some obvious holes in my story.
> > Looking forward to discussing this with anyone interested :)
> >
> > I think Johannes may be interested in attending this discussion, since
> > a lot of ideas here are inspired by discussions I had with him :)
>
> I think an overhaul of the swap code is long overdue. I appreciate
> you're very much focused on zswap, but there are many other problems.

Fully agree. I spent more time than I care to admit just figuring out
the difference between all the functions that have "swap" and "free"
in their names :/

I cannot claim that I am attempting such an overhaul; like you said, I
am focused on zswap. But we can discuss the direction that swap should
head in, and where zswap would fit in the picture. We can at least make
sure that this zswap work is aligned with any future plans for swap, so
that we don't step on each other's toes.

> For example, swap does not work on zoned devices. Swap readahead is
> generally physical (ie optimised for spinning discs) rather than logical
> (more appropriate for SSDs). Swap's management of free space is crude

We have swap_vma_readahead(), which should be on by default for anon
memory on non-rotating devices, but it is only used for anon. shmem
only uses swap_cluster_readahead(), and I am not sure that makes sense
in all cases, especially with zswap.

> compared to real filesystems. The way that swap bypasses the filesystem
> when writing to swap files is awful. I haven't even started to look at
> what changes need to be made to swap in order to swap out arbitrary-order
> folios (instead of PMD-sized + PTE-sized).

I don't know a lot about file systems, so I can't chip in here.

>
> I'm probably not a great person to participate in the design of a
> replacement system. I don't know nearly enough about anonymous memory.

Any input would be helpful, I am sure you know more than I do :)

> I'd be sitting in the back shouting unhelpful things like, "Can't you
> see an anon_vma is the exact same thing as an inode?" and "Why don't
> we steal the block allocation functions from XFS?" and "Why do tmpfs
> pages have to move to the swap cache; can't we just leave them in the
> page cache and pass them to the swap code directly?"

For that last one at least, the proposed design makes the swap cache
much less similar to the page cache, so at least we can stop worrying
about whether we really need to use the swap cache for tmpfs ;)

>
> Maybe Neil Brown or Huang Ying would be good participants, although
> I don't recall seeing either of them at an LSFMM recently.

Looking forward to talking about this with everyone who's interested :)
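
To make the physical vs. logical readahead distinction above a bit more
concrete, here is a toy userspace model. It is only a sketch and not
kernel code: the function names, the page_to_slot mapping, and the
window handling are all hypothetical simplifications, and the real
swap_cluster_readahead() / swap_vma_readahead() do considerably more.

```python
# Toy model only: swap slots and virtual page numbers are plain integers.
# None of these names exist in the kernel; they are illustrative assumptions.

def cluster_readahead(fault_slot, window, nr_slots):
    """Physical readahead: pick the aligned window of swap slots that
    contains the faulting slot, regardless of which pages they back."""
    start = (fault_slot // window) * window
    end = min(start + window, nr_slots)
    return list(range(start, end))


def vma_readahead(fault_vpage, window, page_to_slot):
    """Logical readahead: pick the swap slots backing the virtual pages
    around the faulting page, wherever those slots sit on the device."""
    start = max(fault_vpage - window // 2, 0)
    pages = range(start, start + window)
    # Skip neighbouring pages that are not swapped out at all.
    return [page_to_slot[p] for p in pages if p in page_to_slot]


# Four neighbouring virtual pages whose swap slots are scattered.
page_to_slot = {0: 40, 1: 7, 2: 19, 3: 3}

physical = cluster_readahead(fault_slot=7, window=4, nr_slots=64)
logical = vma_readahead(fault_vpage=1, window=4, page_to_slot=page_to_slot)
```

In this model the physical variant reads whatever happens to sit next
to the faulting slot, while the logical variant follows the virtual
neighbours even when their slots are far apart, which is the property
that matters when there is no seek cost.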