From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5DEDC54E67 for ; Wed, 20 Mar 2024 06:13:35 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 66D316B0092; Wed, 20 Mar 2024 02:13:35 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 61D876B0093; Wed, 20 Mar 2024 02:13:35 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4E4736B0095; Wed, 20 Mar 2024 02:13:35 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 400936B0092 for ; Wed, 20 Mar 2024 02:13:35 -0400 (EDT) Received: from smtpin21.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id D8CA9A1664 for ; Wed, 20 Mar 2024 06:13:34 +0000 (UTC) X-FDA: 81916400748.21.66333D6 Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) by imf17.hostedemail.com (Postfix) with ESMTP id 12EB840007 for ; Wed, 20 Mar 2024 06:13:32 +0000 (UTC) Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=H5nZp7gJ; spf=pass (imf17.hostedemail.com: domain of 3jH76ZQoKCKkhXbahJQVNMPXXPUN.LXVURWdg-VVTeJLT.XaP@flex--yosryahmed.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=3jH76ZQoKCKkhXbahJQVNMPXXPUN.LXVURWdg-VVTeJLT.XaP@flex--yosryahmed.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1710915213; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Zj7+zABO75II9g/KOKonpSfKZraBzKkEO6GqLn+SlEA=; b=v+966ujTaVmscggt+5FQrh0gz3zukOrKdvT3dGbliS5N4bLm7gGtW3zqmJYLrzbZ4aXZRW 2pr5zwq17ZHtD6VW3wbdMMPMq062QjSHBEfGL0y+FAHqom5apQ2zZHKIFxWoSrlOWWGFGt 0Hr3LrvvOFwLlvgYJdYKfb1+z6iB6E0= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1710915213; a=rsa-sha256; cv=none; b=rAoqF40S7ieKY0jTxlLWAq4ki6lGt3gZbobp0FeCK1gFQ7Qvijb7FAxPknA63zH1LCSWcm Jfq/YzNhiwhdnQFjTKvDye7vqhCjaukXMqw0CZWFvrLzieCkI5cIXdvvZY+BDNujwH9zx6 ZOZjGc3WwC6WN5l5uL0zQwrYdH9S8yE= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=H5nZp7gJ; spf=pass (imf17.hostedemail.com: domain of 3jH76ZQoKCKkhXbahJQVNMPXXPUN.LXVURWdg-VVTeJLT.XaP@flex--yosryahmed.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=3jH76ZQoKCKkhXbahJQVNMPXXPUN.LXVURWdg-VVTeJLT.XaP@flex--yosryahmed.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-yb1-f201.google.com with SMTP id 3f1490d57ef6-dbe9e13775aso10263856276.1 for ; Tue, 19 Mar 2024 23:13:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1710915212; x=1711520012; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Zj7+zABO75II9g/KOKonpSfKZraBzKkEO6GqLn+SlEA=; b=H5nZp7gJEKLDG+fGTZ2QmDVWboShBnQ3MfDvwkFHFltnY6472Rkc7/aNIxsUu8HmZa ON8dB1yNvAfXqrpjoZQgb8bDEBmLdoWpBbNe93Q4R7LA1WgOS9n4dfYt5lCy5Pj20+4n cYJS+PR2Kqbwcr1xx7OMBkQZ5Cr1fg6NPG8di6TvixPWcvK6L0eKyJIn583zbbyKFq6f aY4EL+wyCNcmKWIY7yT5GV/e0Xe7XYdof+GGoV+TxPzbEhGF/+NfkW/Xp80a8zfn6Avr u3mLoMvcVv8YvpaUdAysU786VowJxmC9ecpXI581m9H9cnXSJ6ArQ85J325ERobvBHlF WMVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710915212; x=1711520012; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Zj7+zABO75II9g/KOKonpSfKZraBzKkEO6GqLn+SlEA=; b=JSaZzIsE5p6ldrwF0vYTQC0BMqAVM0I4fKGifrwBhWLI2R+9q6qdgCdB7WB/Qlii56 NEml/hgB2NVApeBZ7VHjFGPd24fA/THfV9YWoj070Q9pKc3uIz+Ad+7B2fyYzv30EaZ+ VgtxL2rL18FZMI74CBzqeh2zL6vxAOjJ6rpWGRIXUNHCqraGVEc0W77lavX6tHADxPNa 20ZgfoK/5cfRx3RxEUlK23wJoWaVqKpi7BKbUVFCo7vST0TLZ9VHigtH2FZhxkt4+YEm USCW87qr0XpMSlz8pAnRBl+lioN8/1P88VwnMvLYOKgfoi7+tMLAdJJ1gTzsCxm8uEL4 pNuw== X-Forwarded-Encrypted: i=1; AJvYcCWnSjZASLAbE44pQ0VLvrTQ3fBuIVarGZqwoM/C00GscFw76ECdPmxk8x3KrvUcvQct6N7A+dcTXxQ2FVQn7Aq6MEY= X-Gm-Message-State: AOJu0YxDqNFl4+oMYFW7wfN9dUhvuWWqSYHe25YvvLjOXyzpgpF2v6LC UAduB3mpBEc6JBDF4H4OMDABsgxH6ajikzcGZ8GkCI05pfjcYKgiMuHcPRAiysQMBxjotIkDYzi 53Ww8phhmtcKKyXdKiA== X-Google-Smtp-Source: AGHT+IHO9jutXv2J30LOwCgyAcPL12KG1J8A/WJiPynw9T5KQ+ILRwx7sQKwyC0nvXiVA5ubtcdJojtJnUNqj7tq X-Received: from yosry.c.googlers.com ([fda3:e722:ac3:cc00:20:ed76:c0a8:29b4]) (user=yosryahmed job=sendgmr) by 2002:a05:6902:1b03:b0:dd9:1db5:8348 with SMTP id eh3-20020a0569021b0300b00dd91db58348mr1156570ybb.8.1710915212100; Tue, 19 Mar 2024 23:13:32 -0700 (PDT) Date: Wed, 20 Mar 2024 06:13:29 +0000 In-Reply-To: <20240319-zswap-xarray-v7-1-e9a03a049e86@kernel.org> Mime-Version: 1.0 References: <20240319-zswap-xarray-v7-1-e9a03a049e86@kernel.org> Message-ID: Subject: Re: [PATCH v7] zswap: replace RB tree with xarray From: Yosry Ahmed To: Chris Li Cc: Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Nhat Pham , Johannes Weiner , "Matthew Wilcox (Oracle)" , Chengming Zhou , Barry Song Content-Type: text/plain; charset="us-ascii" X-Stat-Signature: nxxan9zjdh9auma564oze4bj1wgzhand X-Rspamd-Server: rspam10 X-Rspamd-Queue-Id: 12EB840007 X-Rspam-User: X-HE-Tag: 1710915212-500938 X-HE-Meta: U2FsdGVkX1+Qd13L23//zNyhiMbiwTejORm8Vp5PTSHuhlY2Ov1mX5GnA1SBm89zZbLt5yobLfJrGJksoxxahoxA89DOxeZbG5NDgBVsqiV1NSq8tnAsgHfs6hLLu5k/nUMjrdRdlDR/CUMhlDf8yKYkgu0susF7u2yJpAgrZYsEvRfAYm0PQkP1mQ+fAvEQmiZckmfnoZA/ixpsiEAKLXxCgBuGXlc20vVWZRldRx9Brg3/236R3dSAkcFsw+oEpHZ3UYBs7i6OUYxv+yEq6WTMKXpDhat6X9uCsEqV/IfTbFLqKLoe45YurflqTogXvSD4mv+Fn0+ONZAh6pbSERDuShy6n1jNDtgj/OqhDFyaQr4EMJilpVAlS1fN32jWDBxW5GyerzAkWwCqmzUTT+AKkPclxGopuYuc/3IdnRs1dYIbFaFf5itFmGk2R/fTV138r2KxHg30slR19wP7oh63L78+NplSPQOXq9cFdqDZy63zRKQiIhpInn/zowfA9O6kEwO3LKpU0b7JWAxlVkgEe0rTq8iwKbGf8Ttg+tnwqCyRYaWx4zQsC+p87YAcLDo/I/Hq35CMWq6Ka2QSRj4MSvPNeltDPZzZQqWdPpvuWVYAge/xdwz9JNrsijKB47ZLzokKE1cippe6xTwTZTJgLuIK5/E7iJt7nKoosITjyHkxpH6yZOfoCymiJZH1iAkR1CaEwwiogQvKjuQ2DGrW3y44f4ggJIj/Nn0ACDjCC0L+FizfPH2WbIKIlelrltMHeLHmXJ1K+LTDZVTWq/RTitJ9V9gPZ6zFRBTXpx4x1lkn9CHPSuFmrTyp0r6qv3W0BwDEioiOsHASWWhNmK5MVFy/sgkgs96ef9uj5cVgH7ohSxznbfh6cr4CHpK3RLEe16WGzunKmht78puSCeTEFOM19tLPfSjw9FhH4FRTRUv3PRq0cqLu8bwwRN2QoUEVEwCbCbz7grleCIw FWdFpvuZ HuAFfwhmdfwbLm9chSCItKL+iO/EAQNjJp/cxd349RLgLBChLwVsxuYbWr40UP9uJqBDmuCsTzN0cnLCyp3g5wSvtgAqMZZh0y6dzTNOoFf03QNLTlgzzo8MOUVSafXLM4rV+xVeLgcX8m61XfwE778RckayOrAOIPN/wSB3BqNe1HKjpun9voJiTed00NgaTxS8b7rktxq2bsSCxmGL5wEEG0i2T10Si/zN6iEjVyWWSYWh6draed2PlwNEHe3YviFhj4WwE0yjpT08e355hYv+KSmICYP/+RP1tIStbqz2shpML9yZJNLazUcSnR2n/0KPhTHMx5869GfAJwrawaAOCzJdlGcKqDOHIqppoMWOkB7+19QZzFZxfAoZJ2oL40ae4ZJUd7RwsYAlfz7YR+e+RNGFsrA0moSsTok4+IzwNFxXyleG/aaONu+PG+8Br4g/6pmjY4jYwgoTACXKD71niPnQ8/XvnL9xidG/EecVoq8THSOd2oDQkXO10QlFYk3Q7AMao01RrjcGPL18mo868tBqkheuNPyLlPBUywf5Q6MtohT2SlIKlGUupUe5wJJae5va2l3WyZ2aDhePORFFqvWXHq0uFuGEkWkvTX93GjtClT16BJseUCTlpgXSCrjRWnnA509bn7YxxD+0Jzj1fRo9hSIN9tr5q8IgtxYA3cGPNV2Gd5AW6ogIHouUlEUfxY6HA/ML79UIOn5c5xYKhgQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Mar 19, 2024 at 10:52:26PM -0700, Chris Li wrote: > Very deep RB tree requires rebalance at times. That > contributes to the zswap fault latencies. Xarray does not > need to perform tree rebalance. Replacing RB tree to xarray > can have some small performance gain. > > One small difference is that xarray insert might fail with > ENOMEM, while RB tree insert does not allocate additional > memory. > > The zswap_entry size will reduce a bit due to removing the > RB node, which has two pointers and a color field. Xarray > store the pointer in the xarray tree rather than the > zswap_entry. Every entry has one pointer from the xarray > tree. Overall, switching to xarray should save some memory, > if the swap entries are densely packed. > > Notice the zswap_rb_search and zswap_rb_insert always > followed by zswap_rb_erase. Use xa_erase and xa_store > directly. That saves one tree lookup as well. > > Remove zswap_invalidate_entry due to no need to call > zswap_rb_erase any more. Use zswap_free_entry instead. > > The "struct zswap_tree" has been replaced by "struct xarray". > The tree spin lock has transferred to the xarray lock. > > Run the kernel build testing 10 times for each version, averages: > (memory.max=2GB, zswap shrinker and writeback enabled, > one 50GB swapfile, 24 HT core, 32 jobs) > > mm-unstable-a824831a082f xarray v7 > user 3547.264 3541.509 > sys 531.176 526.111 > real 200.752 201.334 > > --- I believe there shouldn't be a separator before Rb and Sb below. > Reviewed-by: Nhat Pham > > Signed-off-by: Chris Li I have some comments below, with them addressed: Acked-by: Yosry Ahmed [..] > @@ -1556,28 +1474,43 @@ bool zswap_store(struct folio *folio) > insert_entry: > entry->swpentry = swp; > entry->objcg = objcg; > + > + old = xa_store(tree, offset, entry, GFP_KERNEL); > + if (xa_is_err(old)) { > + int err = xa_err(old); There should be a blank line after the declaration. > + WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err); > + zswap_reject_alloc_fail++; > + goto store_failed; > + } > + > + /* > + * We may have had an existing entry that became stale when > + * the folio was redirtied and now the new version is being > + * swapped out. Get rid of the old. > + */ This comment is mis-indented. checkpatch would have caught these btw. > + if (old) > + zswap_entry_free(old); > + > if (objcg) { > obj_cgroup_charge_zswap(objcg, entry->length); > - /* Account before objcg ref is moved to tree */ > count_objcg_event(objcg, ZSWPOUT); > } > > - /* map */ > - spin_lock(&tree->lock); > /* > - * The folio may have been dirtied again, invalidate the > - * possibly stale entry before inserting the new entry. > + * We finish initializing the entry while it's already in xarray. > + * This is safe because: > + * > + * 1. Concurrent stores and invalidations are excluded by folio lock. > + * > + * 2. Writeback is excluded by the entry not being on the LRU yet. > + * The publishing order matters to prevent writeback from seeing > + * an incoherent entry. As I mentioned before, writeback is also protected by the folio lock. Concurrent writeback will find the folio in the swapcache and abort. The fact that the entry is not on the LRU yet is just additional protection, so I don't think the publishing order actually matters here. Right?