Date: Thu, 6 Jun 2024 15:36:31 -0700
From: Minchan Kim
To: Yosry Ahmed
Cc: Andrew Morton, Sergey Senozhatsky, Vlastimil Babka, David Rientjes,
 Christoph Lameter, Erhard Furtner, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: zsmalloc: share slab caches for all zsmalloc zpools
In-Reply-To: <20240604175340.218175-1-yosryahmed@google.com>

On Tue, Jun 04, 2024 at 05:53:40PM +0000, Yosry Ahmed wrote:
> Zswap creates
multiple zpools to improve concurrency. Each zsmalloc
> zpool creates its own 'zs_handle' and 'zspage' slab caches. Currently we
> end up with 32 slab caches of each type.
>
> Since each slab cache holds some free objects, we end up with a lot of
> free objects distributed among the separate zpool caches. Slab caches
> are designed to handle concurrent allocations by using percpu
> structures, so having a single instance of each cache should be enough,
> and avoids wasting more memory than needed due to fragmentation.
>
> Additionally, having more slab caches than needed unnecessarily slows
> down code paths that iterate slab_caches.
>
> In the results reported by Eric in [1], the amount of unused slab memory
> in these caches goes down from 242808 bytes to 29216 bytes (-88%). This
> is calculated by (num_objs - active_objs) * objsize for each 'zs_handle'
> and 'zspage' cache. Although this patch did not help with the allocation
> failure reported by Eric with zswap + zsmalloc, I think it is still
> worth merging on its own.
>
> [1]https://lore.kernel.org/lkml/20240604134458.3ae4396a@yea/

I doubt this is the right direction.

Zsmalloc is used for various purposes, each with different object
lifecycles. For example, swap operations involve relatively short-lived
objects, while filesystem use cases might have longer-lived objects.
Mixing these lifecycles in shared caches could lead to fragmentation
with this approach.

I believe the original problem arose when zsmalloc coarsened its lock
granularity from the class level to a global level. Zswap then tried to
mitigate the issue with multiple zpools, but that is essentially another
band-aid on top of the existing problem, IMO.

The correct approach would be to make the zsmalloc locking finer-grained
again.
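
As an aside, for anyone who wants to reproduce the unused-memory figure
from the quoted changelog, here is a minimal userspace sketch of the
(num_objs - active_objs) * objsize calculation. It assumes input lines in
the /proc/slabinfo column order (name, active_objs, num_objs, objsize,
...). The helper names slab_unused_bytes() and zs_unused_total() are
hypothetical and exist only for this illustration; they are not part of
the patch or of any kernel API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: parse one data line in the
 * /proc/slabinfo column order ("name active_objs num_objs objsize ...")
 * and compute the bytes held by allocated-but-unused objects.
 * Returns 0 on success, -1 if the line does not parse (e.g. a header).
 */
static int slab_unused_bytes(const char *line, char name[64],
			     unsigned long *unused)
{
	unsigned long active_objs, num_objs, objsize;

	assert(line && name && unused);
	if (sscanf(line, "%63s %lu %lu %lu", name,
		   &active_objs, &num_objs, &objsize) != 4)
		return -1;
	if (num_objs < active_objs)
		return -1;
	*unused = (num_objs - active_objs) * objsize;
	return 0;
}

/* Sum the unused bytes over every "zs_handle" and "zspage" line. */
static unsigned long zs_unused_total(const char *const lines[], size_t n)
{
	unsigned long total = 0, unused;
	char name[64];
	size_t i;

	for (i = 0; i < n; i++) {
		if (slab_unused_bytes(lines[i], name, &unused))
			continue;	/* skip headers and malformed lines */
		if (!strcmp(name, "zs_handle") || !strcmp(name, "zspage"))
			total += unused;
	}
	return total;
}
```

Feeding it the slabinfo lines captured before and after the patch should
give the two totals being compared. Matching on the cache name means the
32 per-zpool caches and the single shared cache are summed the same way.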
>
> Signed-off-by: Yosry Ahmed
> ---
>  mm/zsmalloc.c | 87 ++++++++++++++++++++++++++-------------------------
>  1 file changed, 44 insertions(+), 43 deletions(-)
>
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index b42d3545ca856..76d9976442a4a 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -220,8 +220,6 @@ struct zs_pool {
>  	const char *name;
>
>  	struct size_class *size_class[ZS_SIZE_CLASSES];
> -	struct kmem_cache *handle_cachep;
> -	struct kmem_cache *zspage_cachep;
>
>  	atomic_long_t pages_allocated;
>
> @@ -289,50 +287,29 @@ static void init_deferred_free(struct zs_pool *pool) {}
>  static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage) {}
>  #endif
>
> -static int create_cache(struct zs_pool *pool)
> -{
> -	pool->handle_cachep = kmem_cache_create("zs_handle", ZS_HANDLE_SIZE,
> -					0, 0, NULL);
> -	if (!pool->handle_cachep)
> -		return 1;
> -
> -	pool->zspage_cachep = kmem_cache_create("zspage", sizeof(struct zspage),
> -					0, 0, NULL);
> -	if (!pool->zspage_cachep) {
> -		kmem_cache_destroy(pool->handle_cachep);
> -		pool->handle_cachep = NULL;
> -		return 1;
> -	}
> -
> -	return 0;
> -}
> +static struct kmem_cache *zs_handle_cache;
> +static struct kmem_cache *zspage_cache;
>
> -static void destroy_cache(struct zs_pool *pool)
> +static unsigned long cache_alloc_handle(gfp_t gfp)
>  {
> -	kmem_cache_destroy(pool->handle_cachep);
> -	kmem_cache_destroy(pool->zspage_cachep);
> -}
> -
> -static unsigned long cache_alloc_handle(struct zs_pool *pool, gfp_t gfp)
> -{
> -	return (unsigned long)kmem_cache_alloc(pool->handle_cachep,
> +	return (unsigned long)kmem_cache_alloc(zs_handle_cache,
>  			gfp & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
>  }
>
> -static void cache_free_handle(struct zs_pool *pool, unsigned long handle)
> +static void cache_free_handle(unsigned long handle)
>  {
> -	kmem_cache_free(pool->handle_cachep, (void *)handle);
> +	kmem_cache_free(zs_handle_cache, (void *)handle);
>  }
>
> -static struct zspage *cache_alloc_zspage(struct zs_pool *pool, gfp_t flags)
> +static struct zspage *cache_alloc_zspage(gfp_t flags)
>  {
> -	return kmem_cache_zalloc(pool->zspage_cachep,
> +	return kmem_cache_zalloc(zspage_cache,
>  			flags & ~(__GFP_HIGHMEM|__GFP_MOVABLE));
>  }
>
> -static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage)
> +static void cache_free_zspage(struct zspage *zspage)
>  {
> -	kmem_cache_free(pool->zspage_cachep, zspage);
> +	kmem_cache_free(zspage_cache, zspage);
>  }
>
>  /* pool->lock(which owns the handle) synchronizes races */
> @@ -837,7 +814,7 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
>  		page = next;
>  	} while (page != NULL);
>
> -	cache_free_zspage(pool, zspage);
> +	cache_free_zspage(zspage);
>
>  	class_stat_dec(class, ZS_OBJS_ALLOCATED, class->objs_per_zspage);
>  	atomic_long_sub(class->pages_per_zspage, &pool->pages_allocated);
> @@ -950,7 +927,7 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
>  {
>  	int i;
>  	struct page *pages[ZS_MAX_PAGES_PER_ZSPAGE];
> -	struct zspage *zspage = cache_alloc_zspage(pool, gfp);
> +	struct zspage *zspage = cache_alloc_zspage(gfp);
>
>  	if (!zspage)
>  		return NULL;
> @@ -967,7 +944,7 @@ static struct zspage *alloc_zspage(struct zs_pool *pool,
>  				dec_zone_page_state(pages[i], NR_ZSPAGES);
>  				__free_page(pages[i]);
>  			}
> -			cache_free_zspage(pool, zspage);
> +			cache_free_zspage(zspage);
>  			return NULL;
>  		}
>
> @@ -1338,7 +1315,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
>  	if (unlikely(size > ZS_MAX_ALLOC_SIZE))
>  		return (unsigned long)ERR_PTR(-ENOSPC);
>
> -	handle = cache_alloc_handle(pool, gfp);
> +	handle = cache_alloc_handle(gfp);
>  	if (!handle)
>  		return (unsigned long)ERR_PTR(-ENOMEM);
>
> @@ -1363,7 +1340,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
>
>  	zspage = alloc_zspage(pool, class, gfp);
>  	if (!zspage) {
> -		cache_free_handle(pool, handle);
> +		cache_free_handle(handle);
>  		return (unsigned long)ERR_PTR(-ENOMEM);
>  	}
>
> @@ -1441,7 +1418,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
>  		free_zspage(pool, class, zspage);
>
>  	spin_unlock(&pool->lock);
> -	cache_free_handle(pool, handle);
> +	cache_free_handle(handle);
>  }
>  EXPORT_SYMBOL_GPL(zs_free);
>
> @@ -2111,9 +2088,6 @@ struct zs_pool *zs_create_pool(const char *name)
>  	if (!pool->name)
>  		goto err;
>
> -	if (create_cache(pool))
> -		goto err;
> -
>  	/*
>  	 * Iterate reversely, because, size of size_class that we want to use
>  	 * for merging should be larger or equal to current size.
> @@ -2234,16 +2208,41 @@ void zs_destroy_pool(struct zs_pool *pool)
>  		kfree(class);
>  	}
>
> -	destroy_cache(pool);
>  	kfree(pool->name);
>  	kfree(pool);
>  }
>  EXPORT_SYMBOL_GPL(zs_destroy_pool);
>
> +static void zs_destroy_caches(void)
> +{
> +	kmem_cache_destroy(zs_handle_cache);
> +	kmem_cache_destroy(zspage_cache);
> +	zs_handle_cache = NULL;
> +	zspage_cache = NULL;
> +}
> +
> +static int zs_create_caches(void)
> +{
> +	zs_handle_cache = kmem_cache_create("zs_handle", ZS_HANDLE_SIZE,
> +					    0, 0, NULL);
> +	zspage_cache = kmem_cache_create("zspage", sizeof(struct zspage),
> +					 0, 0, NULL);
> +
> +	if (!zs_handle_cache || !zspage_cache) {
> +		zs_destroy_caches();
> +		return -1;
> +	}
> +	return 0;
> +}
> +
>  static int __init zs_init(void)
>  {
>  	int ret;
>
> +	ret = zs_create_caches();
> +	if (ret)
> +		goto out;
> +
>  	ret = cpuhp_setup_state(CPUHP_MM_ZS_PREPARE, "mm/zsmalloc:prepare",
>  				zs_cpu_prepare, zs_cpu_dead);
>  	if (ret)
> @@ -2258,6 +2257,7 @@ static int __init zs_init(void)
>  	return 0;
>
>  out:
> +	zs_destroy_caches();
>  	return ret;
>  }
>
> @@ -2269,6 +2269,7 @@ static void __exit zs_exit(void)
>  	cpuhp_remove_state(CPUHP_MM_ZS_PREPARE);
>
>  	zs_stat_exit();
> +	zs_destroy_caches();
>  }
>
>  module_init(zs_init);
> --
> 2.45.1.288.g0e0cd299f1-goog
>