From mboxrd@z Thu Jan 1 00:00:00 1970
Content-Type: multipart/mixed; boundary="===============8355142560644445037=="
MIME-Version: 1.0
From: Jeff Layton
To: lkp@lists.01.org
Subject: Re: [fs] fa629d46c4: vm-scalability.throughput 43.5% improvement
Date: Thu, 15 Dec 2016 13:10:29 -0500
Message-ID: <1481825429.2699.19.camel@redhat.com>
In-Reply-To: <20161205005200.GC27619@yexl-desktop>
List-Id:

--===============8355142560644445037==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

On Mon, 2016-12-05 at 08:52 +0800, kernel test robot wrote:
> Greeting,
>
> FYI, we noticed a 43.5% improvement of vm-scalability.throughput due to commit:
>
>
> commit: fa629d46c4da556a77c7b8c7760e734dd88d1f3e ("fs: only set S_VERSION when updating times if it has been queried")
> git://git.samba.org/jlayton/linux iversion
>
> in testcase: vm-scalability
> on test machine: 32 threads Intel(R) Core(TM) i5-2300 CPU @ 2.80GHz with 4G memory
> with following parameters:
>
>         runtime: 300s
>         size: 1T
>         test: msync-mt
>         cpufreq_governor: performance
>
> test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
>
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> To reproduce:
>
>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
>
> testcase/path_params/tbox_group/run: vm-scalability/300s-1T-msync-mt-performance/lkp-sb02
>
> 142824faf828159d  fa629d46c4da556a77c7b8c776
> ----------------  --------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>            :4           25%           1:4     last_state.is_incomplete_run
>          %stddev      change         %stddev
>              \          |                \
>          %stddev     %change         %stddev
>              \          |                \
>    1920140 ±  0%     +43.5%    2755159 ±  0%  vm-scalability.throughput
>  7.315e+08 ±  0%     +42.0%  1.039e+09 ±  0%  vm-scalability.time.file_system_outputs
>    2326217 ±  3%     +14.4%    2660253 ±  4%  vm-scalability.time.involuntary_context_switches
>       3777 ±  6%     +71.5%       6478 ±  2%  vm-scalability.time.major_page_faults
>  2.023e+08 ±  1%     +18.9%  2.405e+08 ±  0%  vm-scalability.time.minor_page_faults
>       1239 ±  1%     -17.1%       1028 ±  0%  vm-scalability.time.system_time
>     448.56 ±  0%     +46.6%     657.67 ±  0%  vm-scalability.time.user_time
>  1.338e+08 ±  1%     -27.8%   96520722 ±  1%  vm-scalability.time.voluntary_context_switches
>  1.097e+08 ±  1%     +77.4%  1.946e+08 ±  1%  interrupts.CAL:Function_call_interrupts
>     800.00 ±  9%     -41.2%     470.33 ± 22%  slabinfo.proc_inode_cache.active_objs
>     877.25 ±  7%     -36.1%     561.00 ± 21%  slabinfo.proc_inode_cache.num_objs
>      37662 ±  4%     +22.0%      45943 ±  5%  meminfo.Dirty
>      69585 ± 21%     -41.9%      40426 ±  7%  meminfo.Inactive(anon)
>       1695 ± 17%      -6.1%       1592 ± 17%  meminfo.Mlocked
>     413514 ±  0%     +42.0%     587051 ±  0%  vmstat.io.bo
>       2.25 ± 19%     -55.6%       1.00 ±  0%  vmstat.memory.buff
>     279615 ±  1%     -27.8%     201810 ±  1%  vmstat.system.cs
>     135662 ±  0%     +71.0%     231976 ±  1%  vmstat.system.in
>      65.08 ±  0%     +10.9%      72.18 ±  0%  turbostat.%Busy
>       1909 ±  0%     +10.8%       2115 ±  0%  turbostat.Avg_MHz
>      12.16 ±  0%     -39.3%       7.38 ±  1%  turbostat.CPU%c1
>      21.99 ±  1%     -10.5%      19.69 ±  0%  turbostat.CPU%c6
>      32.35 ±  0%      +5.6%      34.17 ±  1%  turbostat.CorWatt
>      36.37 ±  0%      +5.5%      38.38 ±  1%  turbostat.PkgWatt
>
>
>
> [plot: vm-scalability.time.user_time, y axis 0 to 800]
>
> [plot: vm-scalability.time.file_system_outputs, y axis 0 to 1.2e+09]
>
> [plot: vm-scalability.throughput, y axis 0 to 3e+06]
>
> [plot: interrupts.CAL:Function_call_interrupts, y axis 0 to 3.5e+08]
>
>         [*] bisect-good sample
>         [O] bisect-bad sample
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
> Thanks,
> Xiaolong

Hi,

I've gotten a couple of these emails recently on a patchset that I'm
working on but haven't sent upstream for review yet. It's not every day
that you get a mail that says you improved throughput by 43%. Could you
help me interpret the results here?

I'm guessing that this patchset allows the kernel to dirty mmapped pages
faster?

Also, I've looked at what this test does, and I'm wondering...is this
simulating some sort of real-world workload? If so, what?

Thanks!
-- 
Jeff Layton

--===============8355142560644445037==--