Date: Mon, 30 Jan 2023 04:15:09 +0000
From: Matthew Wilcox <willy@infradead.org>
To: kernel test robot
Cc: Shakeel Butt, oe-lkp@lists.linux.dev, lkp@intel.com, linux-kernel@vger.kernel.org, Andrew Morton, Marek Szyprowski, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, ying.huang@intel.com, feng.tang@intel.com, zhengjun.xing@linux.intel.com, fengwei.yin@intel.com
Subject: Re: [linus:master] [mm] f1a7941243: unixbench.score -19.2% regression
In-Reply-To: <202301301057.e55dad5b-oliver.sang@intel.com>

On Mon, Jan 30, 2023 at 10:32:56AM +0800, kernel test robot wrote:
> FYI, we noticed a -19.2% regression of unixbench.score due to commit:
>
> commit: f1a7941243c102a44e8847e3b94ff4ff3ec56f25 ("mm: convert mm's rss stats into percpu_counter")
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> in testcase: unixbench
> on test machine: 128 threads 4 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
> with following parameters:
>
> 	runtime: 300s
> 	nr_task: 30%
> 	test: spawn
> 	cpufreq_governor: performance

...

>   9cd6ffa60256e931 f1a7941243c102a44e8847e3b94
>   ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>      11110           -19.2%       8974        unixbench.score
>    1090843           -12.2%     957314        unixbench.time.involuntary_context_switches
>    4243909 ±  6%     -32.4%    2867136 ±  5%  unixbench.time.major_page_faults
>      10547           -12.6%       9216        unixbench.time.maximum_resident_set_size
>  9.913e+08           -19.6%  7.969e+08        unixbench.time.minor_page_faults
>       5638           +19.1%       6714        unixbench.time.system_time
>       5502           -20.7%       4363        unixbench.time.user_time

So we're spending a lot more time in the kernel and correspondingly less
time in userspace.

>   67991885           -16.9%   56507507        unixbench.time.voluntary_context_switches
>   46198768           -19.1%   37355723        unixbench.workload
>  1.365e+08           -12.5%  1.195e+08 ±  7%  cpuidle..usage
>    1220612 ±  4%     -38.0%     757009 ± 28%  meminfo.Active
>    1220354 ±  4%     -38.0%     756754 ± 28%  meminfo.Active(anon)
>       0.50 ±  2%      -0.1       0.45 ±  4%   mpstat.cpu.all.soft%
>       1.73            -0.2       1.52 ±  2%   mpstat.cpu.all.usr%
>     532266           -18.4%     434559        vmstat.system.cs
>     495826           -12.2%     435455 ±  8%  vmstat.system.in
>   1.36e+08           -13.2%   1.18e+08 ±  9%  turbostat.C1
>      68.80            +0.8      69.60         turbostat.C1%
>  1.663e+08           -12.1%  1.462e+08 ±  8%  turbostat.IRQ
>      15.54 ± 20%     -49.0%       7.93 ± 24%  sched_debug.cfs_rq:/.runnable_avg.min
>      13.26 ± 19%     -46.6%       7.08 ± 29%  sched_debug.cfs_rq:/.util_avg.min
>      48.96 ±  8%     +51.5%      74.20 ± 13%  sched_debug.cfs_rq:/.util_est_enqueued.avg
>     138.00 ±  5%     +28.9%     177.87 ±  7%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
>     228060 ±  3%     +13.3%     258413 ±  4%  sched_debug.cpu.avg_idle.stddev
>     432533 ±  5%     -16.4%     361517 ±  4%  sched_debug.cpu.nr_switches.min
>  2.665e+08           -18.9%  2.162e+08        numa-numastat.node0.local_node
>  2.666e+08           -18.9%  2.163e+08        numa-numastat.node0.numa_hit
>  2.746e+08           -20.9%  2.172e+08        numa-numastat.node1.local_node
>  2.747e+08           -20.9%  2.172e+08        numa-numastat.node1.numa_hit
>  2.602e+08           -17.4%  2.149e+08        numa-numastat.node2.local_node
>  2.603e+08           -17.4%  2.149e+08        numa-numastat.node2.numa_hit
>  2.423e+08           -15.0%   2.06e+08        numa-numastat.node3.local_node
>  2.424e+08           -15.0%  2.061e+08        numa-numastat.node3.numa_hit

So we're going off-node a lot more for ... something.

>  2.666e+08           -18.9%  2.163e+08        numa-vmstat.node0.numa_hit
>  2.665e+08           -18.9%  2.162e+08        numa-vmstat.node0.numa_local
>  2.747e+08           -20.9%  2.172e+08        numa-vmstat.node1.numa_hit
>  2.746e+08           -20.9%  2.172e+08        numa-vmstat.node1.numa_local
>  2.603e+08           -17.4%  2.149e+08        numa-vmstat.node2.numa_hit
>  2.602e+08           -17.4%  2.149e+08        numa-vmstat.node2.numa_local
>  2.424e+08           -15.0%  2.061e+08        numa-vmstat.node3.numa_hit
>  2.423e+08           -15.0%   2.06e+08        numa-vmstat.node3.numa_local
>     304947 ±  4%     -38.0%     189144 ± 28%  proc-vmstat.nr_active_anon

Umm.  Are we running vmstat a lot during this test?  The commit says:

    At the moment the readers are either procfs interface, oom_killer and
    memory reclaim which I think are not performance critical and should
    be ok with slow read.  However I think we can make that change in a
    separate patch.

This would explain the increased cross-NUMA references (we're going to
the other nodes to collect the stats), and the general slowdown.  But I
don't think it reflects a real workload; it's reflecting that the
monitoring of this workload that we're doing is now more accurate and
more expensive.