From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751157AbdH1F5p (ORCPT );
        Mon, 28 Aug 2017 01:57:45 -0400
Received: from mga05.intel.com ([192.55.52.43]:61679 "EHLO mga05.intel.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1750720AbdH1F5o (ORCPT );
        Mon, 28 Aug 2017 01:57:44 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.41,440,1498546800"; d="scan'208";a="1166698103"
From: "Huang, Ying"
To: kernel test robot
Cc: Vincent Guittot, Stephen Rothwell, Peter Zijlstra, "lkp@01.org",
        Mike Galbraith, LKML, Thomas Gleixner, Linus Torvalds, Ingo Molnar
Subject: Re: [LKP] [lkp-robot] [sched/cfs] 625ed2bf04: unixbench.score -7.4% regression
References: <20170519060706.GU568@yexl-desktop>
Date: Mon, 28 Aug 2017 13:57:39 +0800
In-Reply-To: <20170519060706.GU568@yexl-desktop> (kernel test robot's message of
        "Fri, 19 May 2017 14:07:06 +0800")
Message-ID: <8760d8i698.fsf@yhuang-dev.intel.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

kernel test robot writes:

> Greeting,
>
> FYI, we noticed a -7.4% regression of unixbench.score due to commit:
>
> commit: 625ed2bf049d5a352c1bcca962d6e133454eaaff ("sched/cfs: Make util/load_avg more stable")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> in testcase: unixbench
> on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
> with following parameters:
>
>     runtime: 300s
>     nr_task: 100%
>     test: spawn
>     cpufreq_governor: performance
>
> test-description: UnixBench is the original BYTE UNIX benchmark suite, which aims to test the performance of Unix-like systems.

This has been merged in v4.13-rc1, so we checked it again.
If my understanding is correct, the patch changes the algorithm used to calculate CPU load, so it influences the load-balance behavior for this test case.

      4.73 ±  8%     -31.3%       3.25 ± 10%  sched_debug.cpu.nr_running.max
      0.95 ±  5%     -29.0%       0.67 ±  4%  sched_debug.cpu.nr_running.stddev

As shown above, the effect is that the tasks are distributed across more CPUs; that is, the system is more balanced. But this triggers more contention on tasklist_lock, which hurts the unixbench score, as shown below.

     26.60           -10.6       16.05        perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.do_idle
     10.10            +2.4       12.53        perf-profile.calltrace.cycles-pp._raw_write_lock_irq.do_exit.do_group_exit.sys_exit_group.entry_SYSCALL_64_fastpath
      8.03            +2.6       10.63        perf-profile.calltrace.cycles-pp._raw_write_lock_irq.release_task.wait_consider_task.do_wait.sys_wait4
     17.98            +5.2       23.14        perf-profile.calltrace.cycles-pp._raw_read_lock.do_wait.sys_wait4.entry_SYSCALL_64_fastpath
      7.47            +5.9       13.33        perf-profile.calltrace.cycles-pp._raw_write_lock_irq.copy_process._do_fork.sys_clone.do_syscall_64

The patch makes the task distribution more balanced, so I think the scheduler does a better job here. The real problem is that tasklist_lock isn't scalable. But considering that this is only a micro-benchmark which specifically exercises the fork/exit/wait syscalls, this may not be a big problem in reality. So, all in all, I think we can ignore this regression.

Best Regards,
Huang, Ying