linux-mm.kvack.org archive mirror
From: Jianfeng Wang <jianfeng.w.wang@oracle.com>
To: "Christoph Lameter (Ampere)" <cl@linux.com>,
	Chengming Zhou <chengming.zhou@linux.dev>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	David Rientjes <rientjes@google.com>,
	penberg@kernel.org, iamjoonsoo.kim@lge.com,
	akpm@linux-foundation.org, roman.gushchin@linux.dev,
	42.hyeyoo@gmail.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Chengming Zhou <zhouchengming@bytedance.com>
Subject: Re: [PATCH] slub: avoid scanning all partial slabs in get_slabinfo()
Date: Thu, 22 Feb 2024 23:36:01 -0800
Message-ID: <93497e03-1acf-483e-8695-e103fd1bc044@oracle.com>
In-Reply-To: <6daf88a2-84c2-5ba4-853c-c38cca4a03cb@linux.com>


On 2/22/24 7:02 PM, Christoph Lameter (Ampere) wrote:
> On Thu, 22 Feb 2024, Chengming Zhou wrote:
> 
>> Anyway, I put the code below for discussion...
> 
> Can we guestimate the free objects based on the number of partial slabs. That number is available.
> 

Yes.
I've thought about calculating the average number of free objects in a
partial slab (through sampling) and then estimating the total number of
free objects as (avg * n->nr_partial).

See the following.

---
 mm/slub.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 63d281dfacdb..13385761049c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2963,6 +2963,8 @@ static inline bool free_debug_processing(struct kmem_cache *s,
 #endif /* CONFIG_SLUB_DEBUG */
 
 #if defined(CONFIG_SLUB_DEBUG) || defined(SLAB_SUPPORTS_SYSFS)
+#define MAX_PARTIAL_TO_SCAN 10000
+
 static unsigned long count_partial(struct kmem_cache_node *n,
 					int (*get_count)(struct slab *))
 {
@@ -2971,8 +2973,22 @@ static unsigned long count_partial(struct kmem_cache_node *n,
 	struct slab *slab;
 
 	spin_lock_irqsave(&n->list_lock, flags);
-	list_for_each_entry(slab, &n->partial, slab_list)
-		x += get_count(slab);
+	if (n->nr_partial > MAX_PARTIAL_TO_SCAN) {
+		/* Estimate total count of objects via sampling */
+		unsigned long sample_rate = n->nr_partial / MAX_PARTIAL_TO_SCAN;
+		unsigned long scanned = 0;
+		unsigned long counted = 0;
+		list_for_each_entry(slab, &n->partial, slab_list) {
+			if (++scanned % sample_rate == 0) {
+				x += get_count(slab);
+				counted++;
+			}
+		}
+		x = mult_frac(x, n->nr_partial, counted);
+	} else {
+		list_for_each_entry(slab, &n->partial, slab_list)
+			x += get_count(slab);
+	}
 	spin_unlock_irqrestore(&n->list_lock, flags);
 	return x;
 }
-- 
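
To make the estimate concrete, here is a minimal userspace sketch of the same
sampling logic. It is not kernel code: the array free_objs[] stands in for
calling get_count() on each slab of n->partial, MAX_TO_SCAN is a deliberately
tiny stand-in for MAX_PARTIAL_TO_SCAN, and mult_frac_ul() is a hypothetical
helper written to mimic the kernel's mult_frac() macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cap, playing the role of MAX_PARTIAL_TO_SCAN in the patch. */
#define MAX_TO_SCAN 4

/*
 * Userspace stand-in for the kernel's mult_frac(): computes
 * x * numer / denom without overflowing the intermediate product.
 */
static unsigned long mult_frac_ul(unsigned long x, unsigned long numer,
				  unsigned long denom)
{
	unsigned long quot = x / denom;
	unsigned long rem = x % denom;

	return quot * numer + rem * numer / denom;
}

/* free_objs[i] models get_count() for the i-th slab on the partial list. */
static unsigned long estimate_free(const unsigned long *free_objs,
				   size_t nr_partial)
{
	unsigned long x = 0, counted = 0;
	size_t i, sample_rate;

	/* Short list: count exactly, as in the else branch of the patch. */
	if (nr_partial <= MAX_TO_SCAN) {
		for (i = 0; i < nr_partial; i++)
			x += free_objs[i];
		return x;
	}

	/*
	 * Long list: count every sample_rate-th slab. As in the patch, the
	 * loop still visits every entry; only the counting is skipped.
	 */
	sample_rate = nr_partial / MAX_TO_SCAN;
	for (i = 0; i < nr_partial; i++) {
		if ((i + 1) % sample_rate == 0) {
			x += free_objs[i];
			counted++;
		}
	}
	assert(counted > 0);

	/* Scale the sampled sum up to the whole list: avg * nr_partial. */
	return mult_frac_ul(x, nr_partial, counted);
}
```

For ten slabs holding 1..10 free objects, sample_rate is 2, the sampled slabs
hold 2+4+6+8+10 = 30 free objects, and the estimate is 30 * 10 / 5 = 60 against
a true total of 55. Because sample_rate is rounded down, the number of slabs
actually counted can exceed the cap (5 rather than 4 here).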

> How accurate need the accounting be? We also have fuzzy accounting in the VM counters.
Based on my experience, the total number of objects in a kmem_cache tells
whether the cache has been heavily used by a workload. When the total
number is large: if the number of free objects is small, then either these
objects are really in use or there is a *memory leak* going on (which then must
be further diagnosed). However, if the number of free objects is large, all we
can tell is that slab memory fragmentation is happening.

So I think the object accounting need not be exact. We only have to tell
whether a large percentage of slab objects is free. The code above estimates
the free count by sampling, which should do the job if we take enough samples.
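
As a rough illustration of why an approximate count is enough for that
decision, the sketch below turns a total and an estimated free count into a
percentage and a yes/no answer. The 50% threshold, the clamp to the total, and
the names free_pct() and mostly_free() are all assumptions made for this
example; none of them come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold: "mostly free" means at least this percentage. */
#define MOSTLY_FREE_PCT 50ULL

/* Percentage of objects that are free, given an estimated free count. */
static unsigned long long free_pct(unsigned long long total,
				   unsigned long long free_est)
{
	if (total == 0)
		return 0;
	/* A sampled estimate can overshoot, so clamp it to the total. */
	if (free_est > total)
		free_est = total;
	return free_est * 100 / total;
}

/* True when a large share of the cache's objects appears to be free. */
static bool mostly_free(unsigned long long total, unsigned long long free_est)
{
	return free_pct(total, free_est) >= MOSTLY_FREE_PCT;
}
```

With 1000 objects in total, an estimate of 60 free objects and the exact value
of 55 both land far below the threshold, so the sampling error does not change
the conclusion.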

