patches.lists.linux.dev archive mirror
* [PATCH 0/4] SLUB: calculate_order() cleanups
@ 2023-09-08 14:53 Vlastimil Babka
  2023-09-08 14:53 ` [PATCH 1/4] mm/slub: simplify the last resort slab order calculation Vlastimil Babka
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Vlastimil Babka @ 2023-09-08 14:53 UTC (permalink / raw)
  To: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel
  Cc: Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel, Vlastimil Babka

Since reviewing recent patches made me finally dig into these functions
in detail for the first time, I've also noticed some opportunities for
cleanups that should make them simpler and also deliver more consistent
results for some corner-case object sizes (probably not seen in
practice). Thus patch 3 can increase slab orders in some places, but only
in the way that was already intended. Otherwise there are almost no
functional changes.

Vlastimil Babka (4):
  mm/slub: simplify the last resort slab order calculation
  mm/slub: remove min_objects loop from calculate_order()
  mm/slub: attempt to find layouts up to 1/2 waste in calculate_order()
  mm/slub: refactor calculate_order() and calc_slab_order()

 mm/slub.c | 63 ++++++++++++++++++++++++-------------------------------
 1 file changed, 27 insertions(+), 36 deletions(-)

-- 
2.42.0



* [PATCH 1/4] mm/slub: simplify the last resort slab order calculation
  2023-09-08 14:53 [PATCH 0/4] SLUB: calculate_order() cleanups Vlastimil Babka
@ 2023-09-08 14:53 ` Vlastimil Babka
  2023-09-19  7:56   ` Feng Tang
  2023-09-08 14:53 ` [PATCH 2/4] mm/slub: remove min_objects loop from calculate_order() Vlastimil Babka
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Vlastimil Babka @ 2023-09-08 14:53 UTC (permalink / raw)
  To: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel
  Cc: Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel, Vlastimil Babka

If calculate_order() can't fit even a single large object within
slub_max_order, it will try using the smallest necessary order that may
exceed slub_max_order but not MAX_ORDER.

Currently this is done with a call to calc_slab_order(), which is
unnecessary. We can simply use get_order(size). No functional change.
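For illustration, here is a minimal user-space sketch of why the two are
equivalent on this path (a simplified re-implementation, not the kernel
code; it assumes 4K pages and slub_min_order <= slub_max_order, and the
helper names are local to the sketch):

/*
 * With fract_leftover == 1 the waste test "rem <= slab_size / 1" always
 * passes, so the old fallback returned the very first candidate order,
 * i.e. max(slub_min_order, get_order(size)) -- and we only reach the
 * fallback when size does not fit into slub_max_order, so get_order(size)
 * already exceeds slub_min_order.
 */
#include <assert.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

static unsigned int get_order(unsigned long size)
{
	unsigned int order = 0;

	while ((PAGE_SIZE << order) < size)
		order++;
	return order;
}

/* simplified old fallback: calc_slab_order(size, 1, MAX_ORDER, 1) */
static unsigned int old_fallback(unsigned long size, unsigned int slub_min_order)
{
	unsigned int min_order = get_order(size) > slub_min_order ?
				 get_order(size) : slub_min_order;

	for (unsigned int order = min_order; ; order++) {
		unsigned long slab_size = PAGE_SIZE << order;
		unsigned long rem = slab_size % size;

		if (rem <= slab_size / 1)	/* always true on the first pass */
			return order;
	}
}

int main(void)
{
	/* sizes too large even for a single object with slub_max_order == 3 */
	for (unsigned long size = 32769; size < 300000; size += 777)
		assert(old_fallback(size, 3) == get_order(size));
	printf("old fallback == get_order(size) for all tested sizes\n");
	return 0;
}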

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index f7940048138c..c6e694cb17b9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4193,7 +4193,7 @@ static inline int calculate_order(unsigned int size)
 	/*
 	 * Doh this slab cannot be placed using slub_max_order.
 	 */
-	order = calc_slab_order(size, 1, MAX_ORDER, 1);
+	order = get_order(size);
 	if (order <= MAX_ORDER)
 		return order;
 	return -ENOSYS;
-- 
2.42.0



* [PATCH 2/4] mm/slub: remove min_objects loop from calculate_order()
  2023-09-08 14:53 [PATCH 0/4] SLUB: calculate_order() cleanups Vlastimil Babka
  2023-09-08 14:53 ` [PATCH 1/4] mm/slub: simplify the last resort slab order calculation Vlastimil Babka
@ 2023-09-08 14:53 ` Vlastimil Babka
  2023-09-08 14:53 ` [PATCH 3/4] mm/slub: attempt to find layouts up to 1/2 waste in calculate_order() Vlastimil Babka
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2023-09-08 14:53 UTC (permalink / raw)
  To: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel
  Cc: Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel, Vlastimil Babka

calculate_order() currently has two nested loops. The inner one that
gradually modifies the acceptable waste from 1/16 up to 1/4, and the
outer one that decreases min_objects down to 2.

Upon closer inspection, the outer loop is unnecessary. Decreasing
min_objects could in theory have two effects that would make the inner
loop and its call to calc_slab_order() succeed where a previous iteration
with a higher min_objects did not:

- it could cause the min_objects-derived min_order to fit within
  slub_max_order. But min_objects is already pre-capped to max_objects,
  which is derived from slub_max_order above the loops, so every iteration
  tries at least slub_max_order in calc_slab_order().

- it could cause calc_slab_order() to be called with a lower min_objects
  and thus a potentially lower min_order in its loop. This would make a
  difference if the lower order could cause the fractional waste test to
  succeed where a higher order has already failed with the same
  fract_leftover in the previous iteration with a higher min_order. But
  that's not possible, because increasing the order can only result in a
  lower (or the same) fractional waste. If we increase the slab size 2
  times, we will fit at least 2 times the number of objects (thus the same
  fraction of waste), or it will allow us to fit one more object (a lower
  fraction of waste).

For more confidence, I tried adding a printk to report when decreasing
min_objects resulted in success, and simulated the calculations for a
range of object sizes, nr_cpus and page_sizes. As expected, the printk
never triggered.
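A user-space sketch of that kind of simulation (my reconstruction under
assumed 4K pages, slub_min_order=0 and slub_max_order=3; the helper names
are local to the sketch and it is not the actual test harness used) would
look like this, and running it should only print "done":

#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define MAX_ORDER_TESTED 3		/* assumed slub_max_order */

static unsigned int get_order(unsigned int size)
{
	unsigned int order = 0;

	while ((PAGE_SIZE << order) < size)
		order++;
	return order;
}

static unsigned int order_objects(unsigned int order, unsigned int size)
{
	return (PAGE_SIZE << order) / size;
}

/* mirrors calc_slab_order(): smallest acceptable order, or max_order + 1 */
static unsigned int calc_slab_order(unsigned int size, unsigned int min_objects,
				    unsigned int max_order, unsigned int fract_leftover)
{
	for (unsigned int order = get_order(min_objects * size);
	     order <= max_order; order++) {
		unsigned int slab_size = PAGE_SIZE << order;
		unsigned int rem = slab_size % size;

		if (rem <= slab_size / fract_leftover)
			return order;
	}
	return max_order + 1;
}

/* the inner fraction loop for one min_objects value; 1 on success */
static int fractions_succeed(unsigned int size, unsigned int min_objects)
{
	for (unsigned int fraction = 16; fraction >= 4; fraction /= 2)
		if (calc_slab_order(size, min_objects, MAX_ORDER_TESTED,
				    fraction) <= MAX_ORDER_TESTED)
			return 1;
	return 0;
}

int main(void)
{
	for (unsigned int min_objects = 2; min_objects <= 40; min_objects++) {
		for (unsigned int size = 8; size <= 32768; size += 8) {
			unsigned int capped = min_objects;
			unsigned int max_objects = order_objects(MAX_ORDER_TESTED, size);

			if (capped > max_objects)
				capped = max_objects;
			if (capped < 2 || fractions_succeed(size, capped))
				continue;
			/* would the outer loop's lower min_objects ever help? */
			for (unsigned int m = capped - 1; m >= 2; m--)
				if (fractions_succeed(size, m))
					printf("decrease helped: size=%u min_objects=%u\n",
					       size, capped);
		}
	}
	printf("done\n");
	return 0;
}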

Thus remove the outer loop and adjust comments accordingly.

There's almost no functional change, except for a corner case: before this
patch, slub_min_objects=1 on the boot command line would cause both nested
loops to be skipped entirely. Now it tries to find the best layout as
usual, potentially resulting in a higher order that minimizes waste. This
is not wrong and will be further expanded by the next patch.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 38 ++++++++++++++++++--------------------
 1 file changed, 18 insertions(+), 20 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index c6e694cb17b9..5c287d96b212 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4141,14 +4141,6 @@ static inline int calculate_order(unsigned int size)
 	unsigned int max_objects;
 	unsigned int nr_cpus;
 
-	/*
-	 * Attempt to find best configuration for a slab. This
-	 * works by first attempting to generate a layout with
-	 * the best configuration and backing off gradually.
-	 *
-	 * First we increase the acceptable waste in a slab. Then
-	 * we reduce the minimum objects required in a slab.
-	 */
 	min_objects = slub_min_objects;
 	if (!min_objects) {
 		/*
@@ -4168,18 +4160,24 @@ static inline int calculate_order(unsigned int size)
 	max_objects = order_objects(slub_max_order, size);
 	min_objects = min(min_objects, max_objects);
 
-	while (min_objects > 1) {
-		unsigned int fraction;
-
-		fraction = 16;
-		while (fraction >= 4) {
-			order = calc_slab_order(size, min_objects,
-					slub_max_order, fraction);
-			if (order <= slub_max_order)
-				return order;
-			fraction /= 2;
-		}
-		min_objects--;
+	/*
+	 * Attempt to find best configuration for a slab. This works by first
+	 * attempting to generate a layout with the best possible configuration and
+	 * backing off gradually.
+	 *
+	 * We start with accepting at most 1/16 waste and try to find the
+	 * smallest order from min_objects-derived/slub_min_order up to
+	 * slub_max_order that will satisfy the constraint. Note that increasing
+	 * the order can only result in same or less fractional waste, not more.
+	 *
+	 * If that fails, we increase the acceptable fraction of waste and try
+	 * again.
+	 */
+	for (unsigned int fraction = 16; fraction >= 4; fraction /= 2) {
+		order = calc_slab_order(size, min_objects, slub_max_order,
+					fraction);
+		if (order <= slub_max_order)
+			return order;
 	}
 
 	/*
-- 
2.42.0



* [PATCH 3/4] mm/slub: attempt to find layouts up to 1/2 waste in calculate_order()
  2023-09-08 14:53 [PATCH 0/4] SLUB: calculate_order() cleanups Vlastimil Babka
  2023-09-08 14:53 ` [PATCH 1/4] mm/slub: simplify the last resort slab order calculation Vlastimil Babka
  2023-09-08 14:53 ` [PATCH 2/4] mm/slub: remove min_objects loop from calculate_order() Vlastimil Babka
@ 2023-09-08 14:53 ` Vlastimil Babka
  2023-09-20 13:11   ` Feng Tang
  2023-09-08 14:53 ` [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order() Vlastimil Babka
  2023-09-28  4:46 ` [PATCH 0/4] SLUB: calculate_order() cleanups Jay Patel
  4 siblings, 1 reply; 18+ messages in thread
From: Vlastimil Babka @ 2023-09-08 14:53 UTC (permalink / raw)
  To: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel
  Cc: Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel, Vlastimil Babka

The main loop in calculate_order() currently tries to find an order with
at most 1/4 waste. If that's impossible (for certain large object sizes),
there's a fallback that will try to place one object within
slub_max_order.

If we expand the loop boundary to also allow up to 1/2 waste as the last
resort, we can remove the fallback and simplify the code, as the loop
will find an order for such sizes as well. Note we don't need to allow
more than 1/2 waste as that will never happen - calc_slab_order() would
calculate more objects to fit, reducing waste below 1/2.
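To spell out why 1/2 can never be exceeded (a restatement of the argument,
with rem and slab_size as used in calc_slab_order()): rem = slab_size %
size. If size <= slab_size / 2, at least two objects fit and rem < size <=
slab_size / 2. If slab_size / 2 < size <= slab_size, exactly one object
fits and rem = slab_size - size < slab_size / 2. Either way rem <
slab_size / 2, so any order that fits at least one object already passes
the 1/2 waste test.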

Successfully finding an order in the loop (compared to the fallback) also
has the benefit of trying to satisfy min_objects, because the fallback was
passing 1. Thus the resulting slab orders might be larger (not because it
would improve waste, but to reduce pressure on shared locks), which is one
of the goals of calculate_order().

For example, with nr_cpus=1 and 4kB PAGE_SIZE, slub_max_order=3, before
the patch we would get the following orders for these object sizes:

 2056 to 10920 - order-3 as selected by the loop
10928 to 12280 - order-2 due to fallback, as <1/4 waste is not possible
12288 to 32768 - order-3 as <1/4 waste is again possible

After the patch:

2056 to 32768 - order-3, because even in the range of 10928 to 12280 we
                try to satisfy the calculated min_objects.

As a result the code is simpler and gives more consistent results.
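For reference, a small user-space sketch that reproduces these numbers (my
own reconstruction assuming 4K pages, slub_max_order=3, slub_min_order=0
and min_objects=8 for nr_cpus=1; order_for() and SLUB_MAX_ORDER are made
up for the sketch, this is not the kernel code):

#include <stdio.h>

#define PAGE_SIZE 4096u
#define SLUB_MAX_ORDER 3u

static unsigned int order_for(unsigned int size, unsigned int min_objects,
			      unsigned int last_fraction)
{
	unsigned int max_objects = (PAGE_SIZE << SLUB_MAX_ORDER) / size;
	unsigned int min_order = 0;

	if (min_objects > max_objects)
		min_objects = max_objects;
	while ((PAGE_SIZE << min_order) < min_objects * size)
		min_order++;

	/* last_fraction == 4 models the old loop, == 2 the extended one */
	for (unsigned int fraction = 16; fraction >= last_fraction; fraction /= 2)
		for (unsigned int order = min_order; order <= SLUB_MAX_ORDER; order++) {
			unsigned int slab_size = PAGE_SIZE << order;

			if (slab_size % size <= slab_size / fraction)
				return order;
		}

	/* old single-object fallback: any waste, still capped at slub_max_order */
	for (unsigned int order = 0; order <= SLUB_MAX_ORDER; order++)
		if ((PAGE_SIZE << order) >= size)
			return order;
	return SLUB_MAX_ORDER;
}

int main(void)
{
	unsigned int sizes[] = { 2056, 10920, 10928, 12280, 12288, 32768 };

	for (unsigned int i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("size %5u: before order-%u, after order-%u\n", sizes[i],
		       order_for(sizes[i], 8, 4), order_for(sizes[i], 8, 2));
	return 0;
}

Running it should print order-2 for 10928 and 12280 in the "before" column
and order-3 everywhere else, matching the ranges quoted above.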

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 5c287d96b212..f04eb029d85a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4171,23 +4171,17 @@ static inline int calculate_order(unsigned int size)
 	 * the order can only result in same or less fractional waste, not more.
 	 *
 	 * If that fails, we increase the acceptable fraction of waste and try
-	 * again.
+	 * again. The last iteration with fraction of 1/2 would effectively
+	 * accept any waste and give us the order determined by min_objects, as
+	 * long as at least single object fits within slub_max_order.
 	 */
-	for (unsigned int fraction = 16; fraction >= 4; fraction /= 2) {
+	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
 		order = calc_slab_order(size, min_objects, slub_max_order,
 					fraction);
 		if (order <= slub_max_order)
 			return order;
 	}
 
-	/*
-	 * We were unable to place multiple objects in a slab. Now
-	 * lets see if we can place a single object there.
-	 */
-	order = calc_slab_order(size, 1, slub_max_order, 1);
-	if (order <= slub_max_order)
-		return order;
-
 	/*
 	 * Doh this slab cannot be placed using slub_max_order.
 	 */
-- 
2.42.0



* [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
  2023-09-08 14:53 [PATCH 0/4] SLUB: calculate_order() cleanups Vlastimil Babka
                   ` (2 preceding siblings ...)
  2023-09-08 14:53 ` [PATCH 3/4] mm/slub: attempt to find layouts up to 1/2 waste in calculate_order() Vlastimil Babka
@ 2023-09-08 14:53 ` Vlastimil Babka
  2023-09-11  5:56   ` kernel test robot
                     ` (2 more replies)
  2023-09-28  4:46 ` [PATCH 0/4] SLUB: calculate_order() cleanups Jay Patel
  4 siblings, 3 replies; 18+ messages in thread
From: Vlastimil Babka @ 2023-09-08 14:53 UTC (permalink / raw)
  To: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel
  Cc: Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel, Vlastimil Babka

After the previous cleanups, we can now move some code from
calc_slab_order() to calculate_order() so it's executed just once, and
do some more cleanups.

- move the min_order and MAX_OBJS_PER_PAGE evaluation to
  calc_slab_order().

- change calc_slab_order() parameter min_objects to min_order

Also make MAX_OBJS_PER_PAGE check more robust by considering also
min_objects in addition to slub_min_order. Otherwise this is not a
functional change.
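As a worked example of where considering min_objects can matter
(hypothetical numbers, assuming 4K pages and MAX_OBJS_PER_PAGE == 32767):
with size = 8, order_objects(order, 8) = (4096 << order) / 8 first exceeds
32767 at order 6. The CPU-derived min_objects is far too small to push
min_order that high, but a setup where min_objects ends up around 50000
with a slub_max_order around 10 would give get_order(50000 * 8) = 7 and
order_objects(7, 8) = 65536 > 32767. The old check only consulted
slub_min_order (0 here) and would not have caught that; the new one does.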

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/slub.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index f04eb029d85a..1c91f72c7239 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4110,17 +4110,12 @@ static unsigned int slub_min_objects;
  * the smallest order which will fit the object.
  */
 static inline unsigned int calc_slab_order(unsigned int size,
-		unsigned int min_objects, unsigned int max_order,
+		unsigned int min_order, unsigned int max_order,
 		unsigned int fract_leftover)
 {
-	unsigned int min_order = slub_min_order;
 	unsigned int order;
 
-	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
-		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
-
-	for (order = max(min_order, (unsigned int)get_order(min_objects * size));
-			order <= max_order; order++) {
+	for (order = min_order; order <= max_order; order++) {
 
 		unsigned int slab_size = (unsigned int)PAGE_SIZE << order;
 		unsigned int rem;
@@ -4139,7 +4134,7 @@ static inline int calculate_order(unsigned int size)
 	unsigned int order;
 	unsigned int min_objects;
 	unsigned int max_objects;
-	unsigned int nr_cpus;
+	unsigned int min_order;
 
 	min_objects = slub_min_objects;
 	if (!min_objects) {
@@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
 		 * order on systems that appear larger than they are, and too
 		 * low order on systems that appear smaller than they are.
 		 */
-		nr_cpus = num_present_cpus();
+		unsigned int nr_cpus = num_present_cpus();
 		if (nr_cpus <= 1)
 			nr_cpus = nr_cpu_ids;
 		min_objects = 4 * (fls(nr_cpus) + 1);
@@ -4160,6 +4155,10 @@ static inline int calculate_order(unsigned int size)
 	max_objects = order_objects(slub_max_order, size);
 	min_objects = min(min_objects, max_objects);
 
+	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
+	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
+		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
+
 	/*
 	 * Attempt to find best configuration for a slab. This works by first
 	 * attempting to generate a layout with the best possible configuration and
@@ -4176,7 +4175,7 @@ static inline int calculate_order(unsigned int size)
 	 * long as at least single object fits within slub_max_order.
 	 */
 	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
-		order = calc_slab_order(size, min_objects, slub_max_order,
+		order = calc_slab_order(size, min_order, slub_max_order,
 					fraction);
 		if (order <= slub_max_order)
 			return order;
-- 
2.42.0



* Re: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
  2023-09-08 14:53 ` [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order() Vlastimil Babka
@ 2023-09-11  5:56   ` kernel test robot
  2023-09-15 13:36     ` Vlastimil Babka
  2023-09-16  1:28   ` Baoquan He
  2023-09-20 13:36   ` Feng Tang
  2 siblings, 1 reply; 18+ messages in thread
From: kernel test robot @ 2023-09-11  5:56 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: oe-lkp, lkp, linux-mm, David Rientjes, Christoph Lameter,
	Hyeonggon Yoo, Jay Patel, Roman Gushchin, Pekka Enberg,
	Joonsoo Kim, patches, linux-kernel, Vlastimil Babka, oliver.sang



Hello,

kernel test robot noticed "UBSAN:shift-out-of-bounds_in_mm/slub.c" on:

commit: f04d441027621c16081803832a54f59272112cf5 ("[PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()")
url: https://github.com/intel-lab-lkp/linux/commits/Vlastimil-Babka/mm-slub-simplify-the-last-resort-slab-order-calculation/20230908-225506
base: git://git.kernel.org/cgit/linux/kernel/git/vbabka/slab.git for-next
patch link: https://lore.kernel.org/all/20230908145302.30320-10-vbabka@suse.cz/
patch subject: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()

in testcase: boot

compiler: clang-16
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


+-------------------------------------------------+------------+------------+
|                                                 | a17847b835 | f04d441027 |
+-------------------------------------------------+------------+------------+
| UBSAN:shift-out-of-bounds_in_mm/slub.c          | 0          | 12         |
+-------------------------------------------------+------------+------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202309111340.f59c3f22-oliver.sang@intel.com


[    0.901457][    T0] UBSAN: shift-out-of-bounds in mm/slub.c:463:34
[    0.902458][    T0] shift exponent 52 is too large for 32-bit type 'unsigned int'
[    0.903477][    T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G                T  6.5.0-rc1-00009-gf04d44102762 #1
[    0.904450][    T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[    0.904450][    T0] Call Trace:
[    0.904450][    T0]  <TASK>
[ 0.904450][ T0] dump_stack_lvl (lib/dump_stack.c:107) 
[ 0.904450][ T0] dump_stack (lib/dump_stack.c:114) 
[ 0.904450][ T0] ubsan_epilogue (lib/ubsan.c:218) 
[ 0.904450][ T0] __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:?) 
[ 0.904450][ T0] ? tdx_handle_virt_exception (arch/x86/include/asm/shared/tdx.h:60 arch/x86/coco/tdx/tdx.c:375 arch/x86/coco/tdx/tdx.c:430 arch/x86/coco/tdx/tdx.c:650 arch/x86/coco/tdx/tdx.c:666) 
[ 0.904450][ T0] ? kmemleak_alloc (mm/kmemleak.c:977) 
[ 0.904450][ T0] __kmem_cache_create (mm/slub.c:? mm/slub.c:4159 mm/slub.c:4473 mm/slub.c:4507 mm/slub.c:5104) 
[ 0.904450][ T0] ? kmem_cache_alloc (mm/slub.c:3502) 
[ 0.904450][ T0] kmem_cache_create_usercopy (mm/slab_common.c:236 mm/slab_common.c:340) 
[ 0.904450][ T0] fork_init (kernel/fork.c:1048) 
[ 0.904450][ T0] ? kmem_cache_create (mm/slab_common.c:395) 
[ 0.904450][ T0] start_kernel (init/main.c:1046) 
[ 0.904450][ T0] x86_64_start_reservations (??:?) 
[ 0.904450][ T0] x86_64_start_kernel (arch/x86/kernel/head64.c:486) 
[ 0.904450][ T0] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441) 
[    0.904450][    T0]  </TASK>
[    0.904457][    T0] ================================================================================
[    0.905560][    T0] LSM: initializing lsm=lockdown,capability,landlock,yama,safesetid,integrity
[    0.906497][    T0] landlock: Up and running.



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230911/202309111340.f59c3f22-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
  2023-09-11  5:56   ` kernel test robot
@ 2023-09-15 13:36     ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2023-09-15 13:36 UTC (permalink / raw)
  To: kernel test robot
  Cc: oe-lkp, lkp, linux-mm, David Rientjes, Christoph Lameter,
	Hyeonggon Yoo, Jay Patel, Roman Gushchin, Pekka Enberg,
	Joonsoo Kim, patches, linux-kernel

On 9/11/23 07:56, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed "UBSAN:shift-out-of-bounds_in_mm/slub.c" on:
> 
> commit: f04d441027621c16081803832a54f59272112cf5 ("[PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()")
> url: https://github.com/intel-lab-lkp/linux/commits/Vlastimil-Babka/mm-slub-simplify-the-last-resort-slab-order-calculation/20230908-225506
> base: git://git.kernel.org/cgit/linux/kernel/git/vbabka/slab.git for-next
> patch link: https://lore.kernel.org/all/20230908145302.30320-10-vbabka@suse.cz/
> patch subject: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
> 
> in testcase: boot
> 
> compiler: clang-16
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> 
> (please refer to attached dmesg/kmsg for entire log/backtrace)
> 
> 
> +-------------------------------------------------+------------+------------+
> |                                                 | a17847b835 | f04d441027 |
> +-------------------------------------------------+------------+------------+
> | UBSAN:shift-out-of-bounds_in_mm/slub.c          | 0          | 12         |
> +-------------------------------------------------+------------+------------+
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202309111340.f59c3f22-oliver.sang@intel.com
> 
> 
> [    0.901457][    T0] UBSAN: shift-out-of-bounds in mm/slub.c:463:34
> [    0.902458][    T0] shift exponent 52 is too large for 32-bit type 'unsigned int'
> [    0.903477][    T0] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G                T  6.5.0-rc1-00009-gf04d44102762 #1
> [    0.904450][    T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [    0.904450][    T0] Call Trace:
> [    0.904450][    T0]  <TASK>
> [ 0.904450][ T0] dump_stack_lvl (lib/dump_stack.c:107) 
> [ 0.904450][ T0] dump_stack (lib/dump_stack.c:114) 
> [ 0.904450][ T0] ubsan_epilogue (lib/ubsan.c:218) 
> [ 0.904450][ T0] __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:?) 
> [ 0.904450][ T0] ? tdx_handle_virt_exception (arch/x86/include/asm/shared/tdx.h:60 arch/x86/coco/tdx/tdx.c:375 arch/x86/coco/tdx/tdx.c:430 arch/x86/coco/tdx/tdx.c:650 arch/x86/coco/tdx/tdx.c:666) 
> [ 0.904450][ T0] ? kmemleak_alloc (mm/kmemleak.c:977) 
> [ 0.904450][ T0] __kmem_cache_create (mm/slub.c:? mm/slub.c:4159 mm/slub.c:4473 mm/slub.c:4507 mm/slub.c:5104) 

Oh thanks, fixing up:

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4152,7 +4152,8 @@ static inline int calculate_order(unsigned int size)
                        nr_cpus = nr_cpu_ids;
                min_objects = 4 * (fls(nr_cpus) + 1);
        }
-       max_objects = order_objects(slub_max_order, size);
+       /* min_objects can't be 0 because get_order(0) is undefined */
+       max_objects = max(order_objects(slub_max_order, size), 1);
        min_objects = min(min_objects, max_objects);
 
        min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
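To spell out the chain (consistent with the "shift exponent 52" above):
when size is larger than PAGE_SIZE << slub_max_order,
order_objects(slub_max_order, size) is 0, so min_objects gets clamped to
0. Then get_order(0 * size) == get_order(0) evaluates to BITS_PER_LONG -
PAGE_SHIFT, i.e. 52 with 4K pages on x86_64, and feeding that back into
order_objects() shifts the 32-bit PAGE_SIZE by 52. Clamping max_objects
(and thus min_objects) to at least 1 avoids the get_order(0) and keeps the
subsequent shifts sane.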



* Re: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
  2023-09-08 14:53 ` [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order() Vlastimil Babka
  2023-09-11  5:56   ` kernel test robot
@ 2023-09-16  1:28   ` Baoquan He
  2023-09-22  7:00     ` Vlastimil Babka
  2023-09-20 13:36   ` Feng Tang
  2 siblings, 1 reply; 18+ messages in thread
From: Baoquan He @ 2023-09-16  1:28 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel,
	Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel

On 09/08/23 at 04:53pm, Vlastimil Babka wrote:
> After the previous cleanups, we can now move some code from
> calc_slab_order() to calculate_order() so it's executed just once, and
> do some more cleanups.
> 
> - move the min_order and MAX_OBJS_PER_PAGE evaluation to
>   calc_slab_order().
> 
> - change calc_slab_order() parameter min_objects to min_order
> 
> Also make MAX_OBJS_PER_PAGE check more robust by considering also
> min_objects in addition to slub_min_order. Otherwise this is not a
> functional change.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/slub.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index f04eb029d85a..1c91f72c7239 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4110,17 +4110,12 @@ static unsigned int slub_min_objects;
>   * the smallest order which will fit the object.
>   */
>  static inline unsigned int calc_slab_order(unsigned int size,
> -		unsigned int min_objects, unsigned int max_order,
> +		unsigned int min_order, unsigned int max_order,
>  		unsigned int fract_leftover)
>  {
> -	unsigned int min_order = slub_min_order;
>  	unsigned int order;
>  
> -	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> -		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> -
> -	for (order = max(min_order, (unsigned int)get_order(min_objects * size));
> -			order <= max_order; order++) {
> +	for (order = min_order; order <= max_order; order++) {
>  
>  		unsigned int slab_size = (unsigned int)PAGE_SIZE << order;
>  		unsigned int rem;
> @@ -4139,7 +4134,7 @@ static inline int calculate_order(unsigned int size)
>  	unsigned int order;
>  	unsigned int min_objects;
>  	unsigned int max_objects;
> -	unsigned int nr_cpus;
> +	unsigned int min_order;
>  
>  	min_objects = slub_min_objects;
>  	if (!min_objects) {
> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
>  		 * order on systems that appear larger than they are, and too
>  		 * low order on systems that appear smaller than they are.
>  		 */
> -		nr_cpus = num_present_cpus();
> +		unsigned int nr_cpus = num_present_cpus();
>  		if (nr_cpus <= 1)
>  			nr_cpus = nr_cpu_ids;
>  		min_objects = 4 * (fls(nr_cpus) + 1);

A minor concern: should we change 'min_objects' to be a local static
to avoid the "if (!min_objects) {" code block every time? It's deducing
the value from nr_cpus, so we may not need to do the calculation each time.

> @@ -4160,6 +4155,10 @@ static inline int calculate_order(unsigned int size)
>  	max_objects = order_objects(slub_max_order, size);
>  	min_objects = min(min_objects, max_objects);
>  
> +	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
> +	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> +		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> +
>  	/*
>  	 * Attempt to find best configuration for a slab. This works by first
>  	 * attempting to generate a layout with the best possible configuration and
> @@ -4176,7 +4175,7 @@ static inline int calculate_order(unsigned int size)
>  	 * long as at least single object fits within slub_max_order.
>  	 */
>  	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
> -		order = calc_slab_order(size, min_objects, slub_max_order,
> +		order = calc_slab_order(size, min_order, slub_max_order,
>  					fraction);
>  		if (order <= slub_max_order)
>  			return order;
> -- 
> 2.42.0
> 



* Re: [PATCH 1/4] mm/slub: simplify the last resort slab order calculation
  2023-09-08 14:53 ` [PATCH 1/4] mm/slub: simplify the last resort slab order calculation Vlastimil Babka
@ 2023-09-19  7:56   ` Feng Tang
  2023-09-20  6:38     ` Vlastimil Babka
  0 siblings, 1 reply; 18+ messages in thread
From: Feng Tang @ 2023-09-19  7:56 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel,
	Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm@kvack.org,
	patches@lists.linux.dev, linux-kernel@vger.kernel.org

Hi Vlastimil,

On Fri, Sep 08, 2023 at 10:53:04PM +0800, Vlastimil Babka wrote:
> If calculate_order() can't fit even a single large object within
> slub_max_order, it will try using the smallest necessary order that may
> exceed slub_max_order but not MAX_ORDER.
> 
> Currently this is done with a call to calc_slab_order() which is
> unecessary. We can simply use get_order(size). No functional change.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/slub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index f7940048138c..c6e694cb17b9 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4193,7 +4193,7 @@ static inline int calculate_order(unsigned int size)
>  	/*
>  	 * Doh this slab cannot be placed using slub_max_order.
>  	 */
> -	order = calc_slab_order(size, 1, MAX_ORDER, 1);
> +	order = get_order(size);


This patchset is a nice cleanup, and my previous tests all looked fine.
However, one 'slub_min_order' setup mentioned by Christopher [1] no longer
works, as it doesn't take effect with this 1/4 patch.

The root cause seems to be that, in the current kernel, 'slub_max_order'
is not adjusted according to 'slub_min_order', so there are cases where
'slub_min_order' is bigger than the default 'slub_max_order' (3).

It could be fixed by the patch below:

diff --git a/mm/slub.c b/mm/slub.c
index 1c91f72c7239..dbe950783105 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4702,6 +4702,9 @@ static int __init setup_slub_min_order(char *str)
 {
 	get_option(&str, (int *)&slub_min_order);
 
+	if (slub_min_order > slub_max_order)
+		slub_max_order = slub_min_order;
+
 	return 1;
 }

Though the formal fix may also need to cover cases like this kind of
crazy setting: "slub_min_order=6 slub_max_order=5"

[1]. https://lore.kernel.org/lkml/21a0ba8b-bf05-0799-7c78-2a35f8c8d52a@os.amperecomputing.com/

Thanks,
Feng

>  	if (order <= MAX_ORDER)
>  		return order;
>  	return -ENOSYS;
> -- 
> 2.42.0
> 
> 


* Re: [PATCH 1/4] mm/slub: simplify the last resort slab order calculation
  2023-09-19  7:56   ` Feng Tang
@ 2023-09-20  6:38     ` Vlastimil Babka
  2023-09-20  7:09       ` Feng Tang
  0 siblings, 1 reply; 18+ messages in thread
From: Vlastimil Babka @ 2023-09-20  6:38 UTC (permalink / raw)
  To: Feng Tang
  Cc: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel,
	Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm@kvack.org,
	patches@lists.linux.dev, linux-kernel@vger.kernel.org

On 9/19/23 09:56, Feng Tang wrote:
> Hi Vlastimil,
> 
> On Fri, Sep 08, 2023 at 10:53:04PM +0800, Vlastimil Babka wrote:
>> If calculate_order() can't fit even a single large object within
>> slub_max_order, it will try using the smallest necessary order that may
>> exceed slub_max_order but not MAX_ORDER.
>> 
>> Currently this is done with a call to calc_slab_order() which is
>> unecessary. We can simply use get_order(size). No functional change.
>> 
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>>  mm/slub.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/mm/slub.c b/mm/slub.c
>> index f7940048138c..c6e694cb17b9 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -4193,7 +4193,7 @@ static inline int calculate_order(unsigned int size)
>>  	/*
>>  	 * Doh this slab cannot be placed using slub_max_order.
>>  	 */
>> -	order = calc_slab_order(size, 1, MAX_ORDER, 1);
>> +	order = get_order(size);
> 
> 
> This patchset is a nice cleanup, and my previous test all looked fine. 
> And one 'slub_min_order' setup reminded by Christopher [1] doesn't
> work as not taking affect with this 1/4 patch.

Hmm, I see. Well, the trick should keep working if you pass both
slab_min_order=9 slab_max_order=9? Maybe Christopher actually does that,
but didn't type it out fully in the mail.

> The root cause seems to be, in current kernel, the 'slub_max_order'
> is not ajusted  accordingly with 'slub_min_order', so there is case
> that 'slub_min_order' is bigger than the default 'slub_max_order' (3).
> 
> And it could be fixed by the below patch 
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 1c91f72c7239..dbe950783105 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4702,6 +4702,9 @@ static int __init setup_slub_min_order(char *str)
>  {
>  	get_option(&str, (int *)&slub_min_order);
>  
> +	if (slub_min_order > slub_max_order)
> +		slub_max_order = slub_min_order;
> +
>  	return 1;
>  }

Sounds like a good idea. I would also do the analogous thing in setup_slub_max_order.

> Though the formal fix may also need to cover case like this kind of
> crazy setting "slub_min_order=6 slub_max_order=5" 

Doing both should cover even this, and AFAICS from how param processing
works, the last one passed would "win", so it would set min=max=5 in that
case. That's probably the most sane way we can handle such scenarios.
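For illustration, the combined change could look roughly like this (a
sketch of the idea being discussed, not the patch that was eventually
sent; the existing body of setup_slub_max_order() is abbreviated):

static int __init setup_slub_min_order(char *str)
{
	get_option(&str, (int *)&slub_min_order);

	/* a later slub_max_order= on the command line can lower this again */
	if (slub_min_order > slub_max_order)
		slub_max_order = slub_min_order;

	return 1;
}

static int __init setup_slub_max_order(char *str)
{
	get_option(&str, (int *)&slub_max_order);
	/* ... existing clamping of slub_max_order ... */

	/* a later slub_min_order= on the command line can raise this again */
	if (slub_min_order > slub_max_order)
		slub_min_order = slub_max_order;

	return 1;
}

With that, whichever of the two parameters appears last on the command
line effectively wins, so "slub_min_order=6 slub_max_order=5" ends up as
min=max=5.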

Want to send a full patch or should I finalize it? I would put it as a new
1/5 before the rest. Thanks!

> [1]. https://lore.kernel.org/lkml/21a0ba8b-bf05-0799-7c78-2a35f8c8d52a@os.amperecomputing.com/
> 
> Thanks,
> Feng
> 
>>  	if (order <= MAX_ORDER)
>>  		return order;
>>  	return -ENOSYS;
>> -- 
>> 2.42.0
>> 
>> 



* Re: [PATCH 1/4] mm/slub: simplify the last resort slab order calculation
  2023-09-20  6:38     ` Vlastimil Babka
@ 2023-09-20  7:09       ` Feng Tang
  0 siblings, 0 replies; 18+ messages in thread
From: Feng Tang @ 2023-09-20  7:09 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel,
	Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm@kvack.org,
	patches@lists.linux.dev, linux-kernel@vger.kernel.org

On Wed, Sep 20, 2023 at 08:38:05AM +0200, Vlastimil Babka wrote:
> On 9/19/23 09:56, Feng Tang wrote:
> > Hi Vlastimil,
> > 
> > On Fri, Sep 08, 2023 at 10:53:04PM +0800, Vlastimil Babka wrote:
> >> If calculate_order() can't fit even a single large object within
> >> slub_max_order, it will try using the smallest necessary order that may
> >> exceed slub_max_order but not MAX_ORDER.
> >> 
> >> Currently this is done with a call to calc_slab_order() which is
> >> unecessary. We can simply use get_order(size). No functional change.
> >> 
> >> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> >> ---
> >>  mm/slub.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> 
> >> diff --git a/mm/slub.c b/mm/slub.c
> >> index f7940048138c..c6e694cb17b9 100644
> >> --- a/mm/slub.c
> >> +++ b/mm/slub.c
> >> @@ -4193,7 +4193,7 @@ static inline int calculate_order(unsigned int size)
> >>  	/*
> >>  	 * Doh this slab cannot be placed using slub_max_order.
> >>  	 */
> >> -	order = calc_slab_order(size, 1, MAX_ORDER, 1);
> >> +	order = get_order(size);
> > 
> > 
> > This patchset is a nice cleanup, and my previous test all looked fine. 
> > And one 'slub_min_order' setup reminded by Christopher [1] doesn't
> > work as not taking affect with this 1/4 patch.
> 
> Hmm I see. Well the trick should keep working if you pass both
> slab_min_order=9 slab_max_order=9 ? Maybe Christopher actually does that,
> but didn't type it fully in the mail.

Yes, that's possible. And "slub_min_order=9" alone also works to make
all slabs' orders be 9, as the current code's final fallback will try up
to MAX_ORDER inside calculate_order():

	order = calc_slab_order(size, 1, MAX_ORDER, 1);

though the dmesg looks strange:

  SLUB: HWalign=64, Order=9-3, MinObjects=0, CPUs=16, Nodes=1

> 
> > The root cause seems to be, in current kernel, the 'slub_max_order'
> > is not ajusted  accordingly with 'slub_min_order', so there is case
> > that 'slub_min_order' is bigger than the default 'slub_max_order' (3).
> > 
> > And it could be fixed by the below patch 
> > 
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 1c91f72c7239..dbe950783105 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -4702,6 +4702,9 @@ static int __init setup_slub_min_order(char *str)
> >  {
> >  	get_option(&str, (int *)&slub_min_order);
> >  
> > +	if (slub_min_order > slub_max_order)
> > +		slub_max_order = slub_min_order;
> > +
> >  	return 1;
> >  }
> 
> Sounds like a good idea. Would also do analogous thing in setup_slub_max_order.

Yes.

> > Though the formal fix may also need to cover case like this kind of
> > crazy setting "slub_min_order=6 slub_max_order=5" 
> 
> Doing both should cover even this, and AFAICS how param processing works the
> last one passed would "win" so it would set min=max=5 in that case. That's
> probably the most sane way we can handle such scenarios.

Agreed. The latter setting should take precedence. My test code also
handles it this way.

> Want to set a full patch or should I finalize it? I would put it as a new
> 1/5 before the rest. Thanks!

I can try to make a patch with more detail in the commit log and resend it.
Thanks for the review!

Thanks,
Feng

> 
> > [1]. https://lore.kernel.org/lkml/21a0ba8b-bf05-0799-7c78-2a35f8c8d52a@os.amperecomputing.com/
> > 
> > Thanks,
> > Feng
> > 
> >>  	if (order <= MAX_ORDER)
> >>  		return order;
> >>  	return -ENOSYS;
> >> -- 
> >> 2.42.0
> >> 
> >> 
> 


* Re: [PATCH 3/4] mm/slub: attempt to find layouts up to 1/2 waste in calculate_order()
  2023-09-08 14:53 ` [PATCH 3/4] mm/slub: attempt to find layouts up to 1/2 waste in calculate_order() Vlastimil Babka
@ 2023-09-20 13:11   ` Feng Tang
  0 siblings, 0 replies; 18+ messages in thread
From: Feng Tang @ 2023-09-20 13:11 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel,
	Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm@kvack.org,
	patches@lists.linux.dev, linux-kernel@vger.kernel.org

On Fri, Sep 08, 2023 at 10:53:06PM +0800, Vlastimil Babka wrote:
> The main loop in calculate_order() currently tries to find an order with
> at most 1/4 waste. If that's impossible (for particular large object
> sizes), there's a fallback that will try to place one object within
> slab_max_order.
> 
> If we expand the loop boundary to also allow up to 1/2 waste as the last
> resort, we can remove the fallback and simplify the code, as the loop
> will find an order for such sizes as well. Note we don't need to allow
> more than 1/2 waste as that will never happen - calc_slab_order() would
> calculate more objects to fit, reducing waste below 1/2.
> 
> Sucessfully finding an order in the loop (compared to the fallback) will
> also have the benefit in trying to satisfy min_objects, because the
> fallback was passing 1. Thus the resulting slab orders might be larger
> (not because it would improve waste, but to reduce pressure on shared
> locks), which is one of the goals of calculate_order().
> 
> For example, with nr_cpus=1 and 4kB PAGE_SIZE, slub_max_order=3, before
> the patch we would get the following orders for these object sizes:
> 
>  2056 to 10920 - order-3 as selected by the loop
> 10928 to 12280 - order-2 due to fallback, as <1/4 waste is not possible
> 12288 to 32768 - order-3 as <1/4 waste is again possible
> 
> After the patch:
> 
> 2056 to 32768 - order-3, because even in the range of 10928 to 12280 we
>                 try to satisfy the calculated min_objects.
> 
> As a result the code is simpler and gives more consistent results.
 
The current code already tries the fraction "1" in the following 2 fallback
calls of calc_slab_order(), so trying fraction "2" makes sense to me.

Reviewed-by: Feng Tang <feng.tang@intel.com>

Thanks,
Feng

> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/slub.c | 14 ++++----------
>  1 file changed, 4 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 5c287d96b212..f04eb029d85a 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4171,23 +4171,17 @@ static inline int calculate_order(unsigned int size)
>  	 * the order can only result in same or less fractional waste, not more.
>  	 *
>  	 * If that fails, we increase the acceptable fraction of waste and try
> -	 * again.
> +	 * again. The last iteration with fraction of 1/2 would effectively
> +	 * accept any waste and give us the order determined by min_objects, as
> +	 * long as at least single object fits within slub_max_order.
>  	 */
> -	for (unsigned int fraction = 16; fraction >= 4; fraction /= 2) {
> +	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
>  		order = calc_slab_order(size, min_objects, slub_max_order,
>  					fraction);
>  		if (order <= slub_max_order)
>  			return order;
>  	}
>  
> -	/*
> -	 * We were unable to place multiple objects in a slab. Now
> -	 * lets see if we can place a single object there.
> -	 */
> -	order = calc_slab_order(size, 1, slub_max_order, 1);
> -	if (order <= slub_max_order)
> -		return order;
> -
>  	/*
>  	 * Doh this slab cannot be placed using slub_max_order.
>  	 */
> -- 
> 2.42.0
> 
> 


* Re: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
  2023-09-08 14:53 ` [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order() Vlastimil Babka
  2023-09-11  5:56   ` kernel test robot
  2023-09-16  1:28   ` Baoquan He
@ 2023-09-20 13:36   ` Feng Tang
  2023-09-22  6:55     ` Vlastimil Babka
  2 siblings, 1 reply; 18+ messages in thread
From: Feng Tang @ 2023-09-20 13:36 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel,
	Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm@kvack.org,
	patches@lists.linux.dev, linux-kernel@vger.kernel.org

On Fri, Sep 08, 2023 at 10:53:07PM +0800, Vlastimil Babka wrote:
> After the previous cleanups, we can now move some code from
> calc_slab_order() to calculate_order() so it's executed just once, and
> do some more cleanups.
> 
> - move the min_order and MAX_OBJS_PER_PAGE evaluation to
>   calc_slab_order().

Nit: should this be 'move ... to calculate_order()'?

I tried this patch series with a normal boot on a desktop and a 2-socket
server: patch 2/4 doesn't change the order of any slab, and patch 3/4 does
make the slab order of big objects more consistent.

Thanks for making the code much cleaner! And for the whole series, 

Reviewed-by: Feng Tang <feng.tang@intel.com>

> - change calc_slab_order() parameter min_objects to min_order
> 
> Also make MAX_OBJS_PER_PAGE check more robust by considering also
> min_objects in addition to slub_min_order. Otherwise this is not a
> functional change.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> ---
>  mm/slub.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index f04eb029d85a..1c91f72c7239 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -4110,17 +4110,12 @@ static unsigned int slub_min_objects;
>   * the smallest order which will fit the object.
>   */
>  static inline unsigned int calc_slab_order(unsigned int size,
> -		unsigned int min_objects, unsigned int max_order,
> +		unsigned int min_order, unsigned int max_order,
>  		unsigned int fract_leftover)
>  {
> -	unsigned int min_order = slub_min_order;
>  	unsigned int order;
>  
> -	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> -		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> -
> -	for (order = max(min_order, (unsigned int)get_order(min_objects * size));
> -			order <= max_order; order++) {
> +	for (order = min_order; order <= max_order; order++) {
>  
>  		unsigned int slab_size = (unsigned int)PAGE_SIZE << order;
>  		unsigned int rem;
> @@ -4139,7 +4134,7 @@ static inline int calculate_order(unsigned int size)
>  	unsigned int order;
>  	unsigned int min_objects;
>  	unsigned int max_objects;
> -	unsigned int nr_cpus;
> +	unsigned int min_order;
>  
>  	min_objects = slub_min_objects;
>  	if (!min_objects) {
> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
>  		 * order on systems that appear larger than they are, and too
>  		 * low order on systems that appear smaller than they are.
>  		 */
> -		nr_cpus = num_present_cpus();
> +		unsigned int nr_cpus = num_present_cpus();
>  		if (nr_cpus <= 1)
>  			nr_cpus = nr_cpu_ids;
>  		min_objects = 4 * (fls(nr_cpus) + 1);
> @@ -4160,6 +4155,10 @@ static inline int calculate_order(unsigned int size)
>  	max_objects = order_objects(slub_max_order, size);
>  	min_objects = min(min_objects, max_objects);
>  
> +	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
> +	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
> +		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
> +
>  	/*
>  	 * Attempt to find best configuration for a slab. This works by first
>  	 * attempting to generate a layout with the best possible configuration and
> @@ -4176,7 +4175,7 @@ static inline int calculate_order(unsigned int size)
>  	 * long as at least single object fits within slub_max_order.
>  	 */
>  	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
> -		order = calc_slab_order(size, min_objects, slub_max_order,
> +		order = calc_slab_order(size, min_order, slub_max_order,
>  					fraction);
>  		if (order <= slub_max_order)
>  			return order;
> -- 
> 2.42.0
> 
> 


* Re: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
  2023-09-20 13:36   ` Feng Tang
@ 2023-09-22  6:55     ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2023-09-22  6:55 UTC (permalink / raw)
  To: Feng Tang
  Cc: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel,
	Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm@kvack.org,
	patches@lists.linux.dev, linux-kernel@vger.kernel.org

On 9/20/23 15:36, Feng Tang wrote:
> On Fri, Sep 08, 2023 at 10:53:07PM +0800, Vlastimil Babka wrote:
>> After the previous cleanups, we can now move some code from
>> calc_slab_order() to calculate_order() so it's executed just once, and
>> do some more cleanups.
>> 
>> - move the min_order and MAX_OBJS_PER_PAGE evaluation to
>>   calc_slab_order().
> 
> Nit: here is to 'move ... to calculate_order()'?

Oops, right, fixed.

> I tried this patch series with normal boot on a desktop and one 2
> socket server: patch 2/4 doesn't change order of any slab, and patch
> 3/4 does make the slab order of big objects more consistent.
> 
> Thanks for making the code much cleaner! And for the whole series, 
> 
> Reviewed-by: Feng Tang <feng.tang@intel.com>

Thanks! Applied.

> 
>> - change calc_slab_order() parameter min_objects to min_order
>> 
>> Also make MAX_OBJS_PER_PAGE check more robust by considering also
>> min_objects in addition to slub_min_order. Otherwise this is not a
>> functional change.
>> 
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>>  mm/slub.c | 19 +++++++++----------
>>  1 file changed, 9 insertions(+), 10 deletions(-)
>> 
>> diff --git a/mm/slub.c b/mm/slub.c
>> index f04eb029d85a..1c91f72c7239 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -4110,17 +4110,12 @@ static unsigned int slub_min_objects;
>>   * the smallest order which will fit the object.
>>   */
>>  static inline unsigned int calc_slab_order(unsigned int size,
>> -		unsigned int min_objects, unsigned int max_order,
>> +		unsigned int min_order, unsigned int max_order,
>>  		unsigned int fract_leftover)
>>  {
>> -	unsigned int min_order = slub_min_order;
>>  	unsigned int order;
>>  
>> -	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
>> -		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
>> -
>> -	for (order = max(min_order, (unsigned int)get_order(min_objects * size));
>> -			order <= max_order; order++) {
>> +	for (order = min_order; order <= max_order; order++) {
>>  
>>  		unsigned int slab_size = (unsigned int)PAGE_SIZE << order;
>>  		unsigned int rem;
>> @@ -4139,7 +4134,7 @@ static inline int calculate_order(unsigned int size)
>>  	unsigned int order;
>>  	unsigned int min_objects;
>>  	unsigned int max_objects;
>> -	unsigned int nr_cpus;
>> +	unsigned int min_order;
>>  
>>  	min_objects = slub_min_objects;
>>  	if (!min_objects) {
>> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
>>  		 * order on systems that appear larger than they are, and too
>>  		 * low order on systems that appear smaller than they are.
>>  		 */
>> -		nr_cpus = num_present_cpus();
>> +		unsigned int nr_cpus = num_present_cpus();
>>  		if (nr_cpus <= 1)
>>  			nr_cpus = nr_cpu_ids;
>>  		min_objects = 4 * (fls(nr_cpus) + 1);
>> @@ -4160,6 +4155,10 @@ static inline int calculate_order(unsigned int size)
>>  	max_objects = order_objects(slub_max_order, size);
>>  	min_objects = min(min_objects, max_objects);
>>  
>> +	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
>> +	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
>> +		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
>> +
>>  	/*
>>  	 * Attempt to find best configuration for a slab. This works by first
>>  	 * attempting to generate a layout with the best possible configuration and
>> @@ -4176,7 +4175,7 @@ static inline int calculate_order(unsigned int size)
>>  	 * long as at least single object fits within slub_max_order.
>>  	 */
>>  	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
>> -		order = calc_slab_order(size, min_objects, slub_max_order,
>> +		order = calc_slab_order(size, min_order, slub_max_order,
>>  					fraction);
>>  		if (order <= slub_max_order)
>>  			return order;
>> -- 
>> 2.42.0
>> 
>> 



* Re: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
  2023-09-16  1:28   ` Baoquan He
@ 2023-09-22  7:00     ` Vlastimil Babka
  2023-09-22  7:29       ` Baoquan He
  0 siblings, 1 reply; 18+ messages in thread
From: Vlastimil Babka @ 2023-09-22  7:00 UTC (permalink / raw)
  To: Baoquan He
  Cc: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel,
	Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel

On 9/16/23 03:28, Baoquan He wrote:
> On 09/08/23 at 04:53pm, Vlastimil Babka wrote:
>> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
>>  		 * order on systems that appear larger than they are, and too
>>  		 * low order on systems that appear smaller than they are.
>>  		 */
>> -		nr_cpus = num_present_cpus();
>> +		unsigned int nr_cpus = num_present_cpus();
>>  		if (nr_cpus <= 1)
>>  			nr_cpus = nr_cpu_ids;
>>  		min_objects = 4 * (fls(nr_cpus) + 1);
> 
> A minor concern, should we change 'min_objects' to be a local static
> to avoid the "if (!min_objects) {" code block every time?  It's deducing
> the value from nr_cpus, we may not need do the calculation each time.

Maybe, although it's not a hot path. But we would need to make sure
num_present_cpus() cannot change. Could it e.g. be low (1) very early when
we bootstrap the initial caches, but then get updated so that at least most
of the caches reflect the real number of cpus? With a static we would
create everything with 1.
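To make the concern concrete, a sketch of what the static variant might
look like (illustrative only, not proposed code; cpu_min_objects is a made
up name):

static inline int calculate_order(unsigned int size)
{
	/* hypothetical: compute the CPU-derived default only once */
	static unsigned int cpu_min_objects;
	unsigned int min_objects = slub_min_objects;

	if (!min_objects) {
		if (!cpu_min_objects) {
			unsigned int nr_cpus = num_present_cpus();

			if (nr_cpus <= 1)
				nr_cpus = nr_cpu_ids;
			cpu_min_objects = 4 * (fls(nr_cpus) + 1);
		}
		min_objects = cpu_min_objects;
	}
	/* ... rest unchanged ... */
}

If num_present_cpus() still reports fewer CPUs when the earliest caches
are created than it does later, every subsequent cache would keep using
the stale cached value, which is the scenario described above.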

>> @@ -4160,6 +4155,10 @@ static inline int calculate_order(unsigned int size)
>>  	max_objects = order_objects(slub_max_order, size);
>>  	min_objects = min(min_objects, max_objects);
>>  
>> +	min_order = max(slub_min_order, (unsigned int)get_order(min_objects * size));
>> +	if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
>> +		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
>> +
>>  	/*
>>  	 * Attempt to find best configuration for a slab. This works by first
>>  	 * attempting to generate a layout with the best possible configuration and
>> @@ -4176,7 +4175,7 @@ static inline int calculate_order(unsigned int size)
>>  	 * long as at least single object fits within slub_max_order.
>>  	 */
>>  	for (unsigned int fraction = 16; fraction > 1; fraction /= 2) {
>> -		order = calc_slab_order(size, min_objects, slub_max_order,
>> +		order = calc_slab_order(size, min_order, slub_max_order,
>>  					fraction);
>>  		if (order <= slub_max_order)
>>  			return order;
>> -- 
>> 2.42.0
>> 
> 



* Re: [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order()
  2023-09-22  7:00     ` Vlastimil Babka
@ 2023-09-22  7:29       ` Baoquan He
  0 siblings, 0 replies; 18+ messages in thread
From: Baoquan He @ 2023-09-22  7:29 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: David Rientjes, Christoph Lameter, Hyeonggon Yoo, Jay Patel,
	Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel

On 09/22/23 at 09:00am, Vlastimil Babka wrote:
> On 9/16/23 03:28, Baoquan He wrote:
> > On 09/08/23 at 04:53pm, Vlastimil Babka wrote:
> >> @@ -4152,7 +4147,7 @@ static inline int calculate_order(unsigned int size)
> >>  		 * order on systems that appear larger than they are, and too
> >>  		 * low order on systems that appear smaller than they are.
> >>  		 */
> >> -		nr_cpus = num_present_cpus();
> >> +		unsigned int nr_cpus = num_present_cpus();
> >>  		if (nr_cpus <= 1)
> >>  			nr_cpus = nr_cpu_ids;
> >>  		min_objects = 4 * (fls(nr_cpus) + 1);
> > 
> > A minor concern, should we change 'min_objects' to be a local static
> > to avoid the "if (!min_objects) {" code block every time?  It's deducing
> > the value from nr_cpus, we may not need do the calculation each time.
> 
> Maybe, although it's not a hot path. But we should make sure the
> num_present_cpus() cannot change. Could it be e.g. low (1) very early when
> we bootstrap the initial caches, but then update and at least most of the
> caches then reflect the real number of cpus? With a static we would create
> everything with 1.

Yeah, I was silly and didn't think about it. We could check via
system_state, but it's not worth the bother since it's not a hot path, as
you said. Sorry for the noise.



* Re: [PATCH 0/4] SLUB: calculate_order() cleanups
  2023-09-08 14:53 [PATCH 0/4] SLUB: calculate_order() cleanups Vlastimil Babka
                   ` (3 preceding siblings ...)
  2023-09-08 14:53 ` [PATCH 4/4] mm/slub: refactor calculate_order() and calc_slab_order() Vlastimil Babka
@ 2023-09-28  4:46 ` Jay Patel
  2023-10-02 12:38   ` Vlastimil Babka
  4 siblings, 1 reply; 18+ messages in thread
From: Jay Patel @ 2023-09-28  4:46 UTC (permalink / raw)
  To: Vlastimil Babka, David Rientjes, Christoph Lameter, Hyeonggon Yoo
  Cc: Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel

On Fri, 2023-09-08 at 16:53 +0200, Vlastimil Babka wrote:
> Since reviewing recent patches made me finally dig into these
> functions
> in details for the first time, I've also noticed some opportunities
> for
> cleanups that should make them simpler and also deliver more
> consistent
> results for some corner case object sizes (probably not seen in
> practice). Thus patch 3 can increase slab orders somewhere, but only
> in
> the way that was already intended. Otherwise it's almost no
> functional
> changes.
> 
Hi Vlastimil,

This cleanup patchset looks promising. I've conducted tests on PowerPC
with 16 CPUs and a 64K page size, and here are the results.

Slub Memory Usage

+-------------------+--------+------------+
|                   | Normal | With Patch |
+-------------------+--------+------------+
| Total Slub Memory | 476992 | 478464     |
| Wastage           | 431    | 451        |
+-------------------+--------+------------+

Also, I have not detected any changes in the page order for slub caches
across all objects with 64K page size.

Hackbench Results

+-------+----+---------+------------+----------+
|       |    | Normal  | With Patch |          |
+-------+----+---------+------------+----------+
| Amean | 1  | 1.1530  | 1.1347     | ( 1.59%) |
| Amean | 4  | 3.9220  | 3.8240     | ( 2.50%) |
| Amean | 7  | 6.7943  | 6.6300     | ( 2.42%) |
| Amean | 12 | 11.7067 | 11.4423    | ( 2.26%) |
| Amean | 21 | 20.6617 | 20.1680    | ( 2.39%) |
| Amean | 30 | 29.4200 | 28.6460    | ( 2.63%) |
| Amean | 48 | 47.2797 | 46.2820    | ( 2.11%) |
| Amean | 64 | 63.4680 | 62.1813    | ( 2.03%) |
+-------+----+---------+------------+----------+

Reviewed-by: Jay Patel <jaypatel@linux.ibm.com>
Tested-by: Jay Patel <jaypatel@linux.ibm.com>

Thank You
Jay Patel
> Vlastimil Babka (4):
>   mm/slub: simplify the last resort slab order calculation
>   mm/slub: remove min_objects loop from calculate_order()
>   mm/slub: attempt to find layouts up to 1/2 waste in
> calculate_order()
>   mm/slub: refactor calculate_order() and calc_slab_order()
> 
>  mm/slub.c | 63 ++++++++++++++++++++++++-----------------------------
> --
>  1 file changed, 27 insertions(+), 36 deletions(-)
> 



* Re: [PATCH 0/4] SLUB: calculate_order() cleanups
  2023-09-28  4:46 ` [PATCH 0/4] SLUB: calculate_order() cleanups Jay Patel
@ 2023-10-02 12:38   ` Vlastimil Babka
  0 siblings, 0 replies; 18+ messages in thread
From: Vlastimil Babka @ 2023-10-02 12:38 UTC (permalink / raw)
  To: jaypatel, David Rientjes, Christoph Lameter, Hyeonggon Yoo
  Cc: Roman Gushchin, Pekka Enberg, Joonsoo Kim, linux-mm, patches,
	linux-kernel

On 9/28/23 06:46, Jay Patel wrote:
> On Fri, 2023-09-08 at 16:53 +0200, Vlastimil Babka wrote:
>> Since reviewing recent patches made me finally dig into these
>> functions
>> in details for the first time, I've also noticed some opportunities
>> for
>> cleanups that should make them simpler and also deliver more
>> consistent
>> results for some corner case object sizes (probably not seen in
>> practice). Thus patch 3 can increase slab orders somewhere, but only
>> in
>> the way that was already intended. Otherwise it's almost no
>> functional
>> changes.
>> 
> Hi Vlastimil,

Hi, Jay!

> This cleanup patchset looks promising. I've conducted test on PowerPC
> with 16 CPUs and a 64K page size, and here are the results.
> 
> Slub Memory Usage
> 
> +-------------------+--------+------------+
> |                   | Normal | With Patch |
> +-------------------+--------+------------+
> | Total Slub Memory | 476992 | 478464     |
> | Wastage           | 431    | 451        |
> +-------------------+--------+------------+
> 
> Also, I have not detected any changes in the page order for slub caches
> across all objects with 64K page size.

As expected. That should mean any benchmark differences are noise and not
caused by the patches.

> Hackbench Results
> 
> +-------+----+---------+------------+----------+
> |       |    | Normal  | With Patch |          |
> +-------+----+---------+------------+----------+
> | Amean | 1  | 1.1530  | 1.1347     | ( 1.59%) |
> | Amean | 4  | 3.9220  | 3.8240     | ( 2.50%) |
> | Amean | 7  | 6.7943  | 6.6300     | ( 2.42%) |
> | Amean | 12 | 11.7067 | 11.4423    | ( 2.26%) |
> | Amean | 21 | 20.6617 | 20.1680    | ( 2.39%) |
> | Amean | 30 | 29.4200 | 28.6460    | ( 2.63%) |
> | Amean | 48 | 47.2797 | 46.2820    | ( 2.11%) |
> | Amean | 64 | 63.4680 | 62.1813    | ( 2.03%) |
> +-------+----+---------+------------+----------+
> 
> 
> Reviewed-by: Jay Patel <jaypatel@linux.ibm.com>
> Tested-by: Jay Patel <jaypatel@linux.ibm.com>

Thanks! Applied your Reviewed-and-tested-by:

> Thank You
> Jay Patel
>> Vlastimil Babka (4):
>>   mm/slub: simplify the last resort slab order calculation
>>   mm/slub: remove min_objects loop from calculate_order()
>>   mm/slub: attempt to find layouts up to 1/2 waste in
>> calculate_order()
>>   mm/slub: refactor calculate_order() and calc_slab_order()
>> 
>>  mm/slub.c | 63 ++++++++++++++++++++++++-----------------------------
>> --
>>  1 file changed, 27 insertions(+), 36 deletions(-)
>> 
> 


