From: Tianchen Ding
To: Zefan Li, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Daniel Bristot de Oliveira, Tejun Heo, Johannes Weiner, Tianchen Ding,
    Michael Wang, Cruz Zhao, Masahiro Yamada, Nathan Chancellor, Kees Cook,
    Andrew Morton, Vlastimil Babka, "Gustavo A. R. Silva", Arnd Bergmann,
    Miguel Ojeda, Chris Down, Vipin Sharma, Daniel Borkmann
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: [RFC PATCH v2 0/4] Introduce group balancer
Date: Tue, 8 Mar 2022 17:26:25 +0800
Message-Id: <20220308092629.40431-1-dtcccc@linux.alibaba.com>

Modern platforms are growing fast in CPU count. To make better use of
CPU resources, multiple applications are starting to share the CPUs.

What we need is a way to ease conflicts in shared mode and make groups
as exclusive as possible, to gain both performance and resource
efficiency.

The main idea of the group balancer is to fulfill this requirement by
balancing groups of tasks among groups of CPUs; consider this a dynamic
demi-exclusive mode. A task triggers work to settle its group into a
proper partition (the one with the minimum predicted load), then tries
to migrate itself into it. This gradually settles groups into the most
exclusive partitions.

GB can be seen as an optimization policy built on load balancing: it
obeys the main idea of load balancing and makes adjustments on top of
it.

Our tests on an ARM64 platform with 128 CPUs show that sysbench memory
throughput improves by about 25%, and redis-benchmark improves by up to
about 10%.

See each patch for details:
The 1st patch introduces the infrastructure.
The 2nd patch introduces the details of partition info.
The 3rd patch is the main part of the group balancer.
The 4th patch adds stats.

v2: Put partition info and period settings into the cpuset subsystem of
    cgroup v2.
v1: https://lore.kernel.org/all/98f41efd-74b2-198a-839c-51b785b748a6@linux.alibaba.com/

Michael Wang (1):
  sched: Introduce group balancer

Tianchen Ding (3):
  sched, cpuset: Introduce infrastructure of group balancer
  cpuset: Handle input of partition info for group balancer
  cpuset, gb: Add stat for group balancer

 include/linux/cpuset.h   |   5 +
 include/linux/sched.h    |   5 +
 include/linux/sched/gb.h |  70 ++++++
 init/Kconfig             |  12 +
 kernel/cgroup/cpuset.c   | 405 +++++++++++++++++++++++++++++++-
 kernel/sched/Makefile    |   1 +
 kernel/sched/core.c      |   5 +
 kernel/sched/debug.c     |  10 +-
 kernel/sched/fair.c      |  26 ++-
 kernel/sched/gb.c        | 487 +++++++++++++++++++++++++++++++++++++++
 kernel/sched/sched.h     |  14 ++
 11 files changed, 1037 insertions(+), 3 deletions(-)
 create mode 100644 include/linux/sched/gb.h
 create mode 100644 kernel/sched/gb.c

-- 
2.27.0
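
For readers skimming the cover letter, the "settle the group into the
partition with minimum predicted load" step can be pictured roughly as
below. This is only an illustrative userspace sketch, not code from the
series: struct gb_part_sketch, gb_pick_part_sketch() and the way the
predicted load is computed (partition load plus the group's load, unless
the group already lives there) are all assumptions made for the example.
See patch 3 for the actual logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical partition descriptor; not the structure used in the patches. */
struct gb_part_sketch {
	int cpu_start;		/* first CPU of the partition */
	int cpu_end;		/* last CPU of the partition */
	unsigned long load;	/* load currently accounted to the partition */
};

/*
 * Return the index of the partition whose predicted load is lowest once
 * a group with load grp_load is placed there.
 *
 * cur is the index of the group's current partition, or -1 if it has
 * none. The group's load is assumed to be already included in
 * parts[cur].load, so it is not added a second time for that partition.
 * On ties the lowest index wins.
 */
static int gb_pick_part_sketch(const struct gb_part_sketch *parts, size_t nr,
			       unsigned long grp_load, int cur)
{
	unsigned long best_load = 0;
	int best = -1;
	size_t i;

	assert(parts && nr > 0);

	for (i = 0; i < nr; i++) {
		unsigned long predicted = parts[i].load;

		/* Moving into another partition adds the group's load to it. */
		if ((int)i != cur)
			predicted += grp_load;

		if (best < 0 || predicted < best_load) {
			best = (int)i;
			best_load = predicted;
		}
	}

	return best;
}
```

With three partitions carrying loads 100, 40 and 70, a group of load 50
currently in the first one would be steered to the second, since its
predicted load there (90) is below staying put (100). A task of that
group would then try to migrate itself onto a CPU of the chosen
partition.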