From mboxrd@z Thu Jan  1 00:00:00 1970
Content-Type: multipart/mixed; boundary="===============2091646304661186457=="
MIME-Version: 1.0
From: Huang, Ying
To: lkp@lists.01.org
Subject: Re: [fs] fa629d46c4: vm-scalability.throughput 43.5% improvement
Date: Mon, 19 Dec 2016 09:11:55 +0800
Message-ID: <87zijsd8o4.fsf@yhuang-dev.intel.com>
In-Reply-To: <1481825429.2699.19.camel@redhat.com>
List-Id:

--===============2091646304661186457==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Jeff Layton writes:

> On Mon, 2016-12-05 at 08:52 +0800, kernel test robot wrote:
>> Greeting,
>>
>> FYI, we noticed a 43.5% improvement of vm-scalability.throughput due to commit:
>>
>>
>> commit: fa629d46c4da556a77c7b8c7760e734dd88d1f3e ("fs: only set S_VERSION when updating times if it has been queried")
>> git://git.samba.org/jlayton/linux iversion
>>
>> in testcase: vm-scalability
>> on test machine: 32 threads Intel(R) Core(TM) i5-2300 CPU @ 2.80GHz with 4G memory
>> with following parameters:
>>
>>         runtime: 300s
>>         size: 1T
>>         test: msync-mt
>>         cpufreq_governor: performance
>>
>> test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
>> test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
>>
>>
>>
>> Details are as below:
>> -------------------------------------------------------------------------------------------------->
>>
>>
>> To reproduce:
>>
>>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>>         cd lkp-tests
>>         bin/lkp install job.yaml  # job file is attached in this email
>>         bin/lkp run     job.yaml
>>
>> testcase/path_params/tbox_group/run: vm-scalability/300s-1T-msync-mt-performance/lkp-sb02
>>
>> 142824faf828159d fa629d46c4da556a77c7b8c776
>> ---------------- --------------------------
>>        fail:runs  %reproduction    fail:runs
>>            |             |             |
>>            :4           25%           1:4     last_state.is_incomplete_run
>>          %stddev      change         %stddev
>>              \          |                \
>>          %stddev     %change         %stddev
>>              \          |                \
>>    1920140 ±  0%     +43.5%    2755159 ±  0%  vm-scalability.throughput
>>  7.315e+08 ±  0%     +42.0%  1.039e+09 ±  0%  vm-scalability.time.file_system_outputs
>>    2326217 ±  3%     +14.4%    2660253 ±  4%  vm-scalability.time.involuntary_context_switches
>>       3777 ±  6%     +71.5%       6478 ±  2%  vm-scalability.time.major_page_faults
>>  2.023e+08 ±  1%     +18.9%  2.405e+08 ±  0%  vm-scalability.time.minor_page_faults
>>       1239 ±  1%     -17.1%       1028 ±  0%  vm-scalability.time.system_time
>>     448.56 ±  0%     +46.6%     657.67 ±  0%  vm-scalability.time.user_time
>>  1.338e+08 ±  1%     -27.8%   96520722 ±  1%  vm-scalability.time.voluntary_context_switches
>>  1.097e+08 ±  1%     +77.4%  1.946e+08 ±  1%  interrupts.CAL:Function_call_interrupts
>>     800.00 ±  9%     -41.2%     470.33 ± 22%  slabinfo.proc_inode_cache.active_objs
>>     877.25 ±  7%     -36.1%     561.00 ± 21%  slabinfo.proc_inode_cache.num_objs
>>      37662 ±  4%     +22.0%      45943 ±  5%  meminfo.Dirty
>>      69585 ± 21%     -41.9%      40426 ±  7%  meminfo.Inactive(anon)
>>       1695 ± 17%      -6.1%       1592 ± 17%  meminfo.Mlocked
>>     413514 ±  0%     +42.0%     587051 ±  0%  vmstat.io.bo
>>       2.25 ± 19%     -55.6%       1.00 ±  0%  vmstat.memory.buff
>>     279615 ±  1%     -27.8%     201810 ±  1%  vmstat.system.cs
>>     135662 ±  0%     +71.0%     231976 ±  1%  vmstat.system.in
>>      65.08 ±  0%     +10.9%      72.18 ±  0%  turbostat.%Busy
>>       1909 ±  0%     +10.8%       2115 ±  0%  turbostat.Avg_MHz
>>      12.16 ±  0%     -39.3%       7.38 ±  1%  turbostat.CPU%c1
>>      21.99 ±  1%     -10.5%      19.69 ±  0%  turbostat.CPU%c6
>>      32.35 ±  0%      +5.6%      34.17 ±  1%  turbostat.CorWatt
>>      36.37 ±  0%      +5.5%      38.38 ±  1%  turbostat.PkgWatt
>>
>>
>>
>> [plot: vm-scalability.time.user_time, y axis 0 to 800]
>>
>> [plot: vm-scalability.time.file_system_outputs, y axis 0 to 1.2e+09]
>>
>> [plot: vm-scalability.throughput, y axis 0 to 3e+06]
>>
>> [plot: interrupts.CAL:Function_call_interrupts, y axis 0 to 3.5e+08]
>>
>>
>>         [*] bisect-good sample
>>         [O] bisect-bad  sample
>>
>>
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>>
>>
>> Thanks,
>> Xiaolong
>
> Hi, I've gotten a couple of these emails recently on a patchset that I'm
> working on but haven't sent upstream for review yet. It's not every day
> that you get a mail that says you improved throughput by 43%.
>
> Could you help me interpret the results here? I'm guessing that this
> patchset allows the kernel to dirty mmapped pages faster?
>
> Also, I've looked at what this test does, and I'm wondering...is this
> simulating some sort of real-world workload? If so, what?

This is a micro-benchmark. I don't think it is simulating any real-world
workload.

Best Regards,
Huang, Ying

--===============2091646304661186457==--