From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752490AbcDZLVh (ORCPT );
	Tue, 26 Apr 2016 07:21:37 -0400
Received: from mail-pa0-f54.google.com ([209.85.220.54]:33344 "EHLO
	mail-pa0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751548AbcDZLVf (ORCPT );
	Tue, 26 Apr 2016 07:21:35 -0400
Date: Tue, 26 Apr 2016 20:23:05 +0900
From: Sergey Senozhatsky
To: Minchan Kim
Cc: Sergey Senozhatsky, Andrew Morton, linux-kernel@vger.kernel.org,
	Sergey Senozhatsky
Subject: Re: zram: per-cpu compression streams
Message-ID: <20160426112305.GA1155@swordfish>
References: <20160330083419.GA2769@swordfish>
	<20160330221233.GA6736@bbox>
	<20160331012626.GB1758@swordfish>
	<20160331055355.GD6736@bbox>
	<20160331063416.GA3343@swordfish>
	<20160401153829.GA1212@swordfish>
	<20160404002757.GC5833@bbox>
	<20160404011702.GB6164@swordfish>
	<20160418075758.GA1983@swordfish>
	<20160419080025.GE18448@bbox>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160419080025.GE18448@bbox>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello Minchan,

On (04/19/16 17:00), Minchan Kim wrote:
[..]
> I'm convinced now with your data. Super thanks!
> However, as you know, we need data how bad it is in heavy memory pressure.
> Maybe, you can test it with fio and background memory hogger,

it's really hard to produce stable test results when the system
is under mem pressure.

first, I modified zram to export the re-compression number
(put cpu stream and re-try handler allocation)

mm_stat for numjobs{1..10}.
the number of re-compressions is in "< NUM>" format

3221225472 3221225472 3221225472        0 3221229568        0        0 <    6421>
3221225472 3221225472 3221225472        0 3221233664        0        0 <    6998>
3221225472 2912157607 2952802304        0 2952814592        0       84 <    7271>
3221225472 2893479936 2899120128        0 2899136512        0      156 <    8260>
3221217280 2886040814 2899099648        0 2899128320        0       78 <    8297>
3221225472 2880045056 2885693440        0 2885718016        0       54 <    7794>
3221213184 2877431364 2883756032        0 2883801088        0      144 <    7336>
3221225472 2873229312 2876096512        0 2876133376        0       28 <    8699>
3221213184 2870728008 2871693312        0 2871730176        0       30 <    8189>
2899095552 2899095552 2899095552        0 2899136512    78643        0 <    7485>

as we can see, the number of re-compressions can vary from 6421 to 8699.

the test:

-- 4 GB x86_64 box
-- zram 3GB, lzo
-- mem-hogger pre-faults 3GB of pages before the fio test
-- fio test has been modified to have 11% compression ratio (to increase the
   chances of re-compressions)
   -- buffer_compress_percentage=11
   -- scramble_buffers=0

considering buffer_compress_percentage=11, the box was under somewhat
heavy pressure.
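
for anyone who wants to re-check the 6421..8699 range, here is a minimal
Python sketch (not part of the zram patch) that pulls the trailing "< NUM>"
field out of the mm_stat lines above. the "< NUM>" field is the extra
counter exported for this test only; the names MM_STAT, RECOMP_RE and
recompressions() are made up for illustration.

```python
import re

# mm_stat lines copied from the table above. The trailing "< NUM>" field is
# the re-compression counter added for this test, not a regular mm_stat column.
MM_STAT = """\
3221225472 3221225472 3221225472 0 3221229568 0 0 < 6421>
3221225472 3221225472 3221225472 0 3221233664 0 0 < 6998>
3221225472 2912157607 2952802304 0 2952814592 0 84 < 7271>
3221225472 2893479936 2899120128 0 2899136512 0 156 < 8260>
3221217280 2886040814 2899099648 0 2899128320 0 78 < 8297>
3221225472 2880045056 2885693440 0 2885718016 0 54 < 7794>
3221213184 2877431364 2883756032 0 2883801088 0 144 < 7336>
3221225472 2873229312 2876096512 0 2876133376 0 28 < 8699>
3221213184 2870728008 2871693312 0 2871730176 0 30 < 8189>
2899095552 2899095552 2899095552 0 2899136512 78643 0 < 7485>
"""

# Matches "< 6421>" at the end of a line and captures the number.
RECOMP_RE = re.compile(r"<\s*(\d+)>\s*$")


def recompressions(text):
    """Return the re-compression counter of every mm_stat line, in order."""
    counts = []
    for line in text.splitlines():
        match = RECOMP_RE.search(line)
        if match:
            counts.append(int(match.group(1)))
    return counts


counts = recompressions(MM_STAT)
print(min(counts), max(counts))  # prints "6421 8699"
```

the sketch only looks at the last field, so it does not depend on what the
other mm_stat columns mean.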
now, the results


fio stats

        4 streams       8 streams       per cpu
===========================================================
#jobs1
READ:   2411.4MB/s      2430.4MB/s      2440.4MB/s
READ:   2094.8MB/s      2002.7MB/s      2034.5MB/s
WRITE:  141571KB/s      140334KB/s      143542KB/s
WRITE:  712025KB/s      706111KB/s      745256KB/s
READ:   531014KB/s      525250KB/s      537547KB/s
WRITE:  530960KB/s      525197KB/s      537492KB/s
READ:   473577KB/s      470320KB/s      476880KB/s
WRITE:  473645KB/s      470387KB/s      476948KB/s
#jobs2
READ:   7897.2MB/s      8031.4MB/s      7968.9MB/s
READ:   6864.9MB/s      6803.2MB/s      6903.4MB/s
WRITE:  321386KB/s      314227KB/s      313101KB/s
WRITE:  1275.3MB/s      1245.6MB/s      1383.5MB/s
READ:   1035.5MB/s      1021.9MB/s      1098.4MB/s
WRITE:  1035.6MB/s      1021.1MB/s      1098.6MB/s
READ:   972014KB/s      952321KB/s      987.66MB/s
WRITE:  969792KB/s      950144KB/s      985.40MB/s
#jobs3
READ:   13260MB/s       13260MB/s       13222MB/s
READ:   11636MB/s       11636MB/s       11755MB/s
WRITE:  511500KB/s      507730KB/s      504959KB/s
WRITE:  1646.1MB/s      1673.9MB/s      1755.5MB/s
READ:   1389.5MB/s      1387.2MB/s      1479.6MB/s
WRITE:  1387.6MB/s      1385.3MB/s      1477.4MB/s
READ:   1286.8MB/s      1289.1MB/s      1377.3MB/s
WRITE:  1284.8MB/s      1287.1MB/s      1374.9MB/s
#jobs4
READ:   19851MB/s       20244MB/s       20344MB/s
READ:   17732MB/s       17835MB/s       18097MB/s
WRITE:  667776KB/s      655599KB/s      693464KB/s
WRITE:  2041.2MB/s      2072.6MB/s      2474.1MB/s
READ:   1770.1MB/s      1781.7MB/s      2035.5MB/s
WRITE:  1765.8MB/s      1777.3MB/s      2030.5MB/s
READ:   1641.6MB/s      1672.4MB/s      1892.5MB/s
WRITE:  1643.2MB/s      1674.2MB/s      1894.4MB/s
#jobs5
READ:   19468MB/s       18484MB/s       18439MB/s
READ:   17594MB/s       17757MB/s       17716MB/s
WRITE:  843266KB/s      859627KB/s      867928KB/s
WRITE:  1927.1MB/s      2041.8MB/s      2168.9MB/s
READ:   1718.6MB/s      1771.7MB/s      1963.5MB/s
WRITE:  1712.7MB/s      1765.6MB/s      1956.8MB/s
READ:   1705.3MB/s      1663.6MB/s      1767.3MB/s
WRITE:  1704.3MB/s      1662.6MB/s      1766.2MB/s
#jobs6
READ:   21583MB/s       21685MB/s       21483MB/s
READ:   19160MB/s       18432MB/s       18618MB/s
WRITE:  986276KB/s      1004.2MB/s      981.11MB/s
WRITE:  2013.6MB/s      1922.5MB/s      2429.1MB/s
READ:   1797.1MB/s      1678.9MB/s      2038.8MB/s
WRITE:  1794.8MB/s      1675.9MB/s      2035.2MB/s
READ:   1678.2MB/s      1632.5MB/s      1917.4MB/s
WRITE:  1673.9MB/s      1627.6MB/s      1911.6MB/s
#jobs7
READ:   20697MB/s       21677MB/s       21062MB/s
READ:   18781MB/s       18667MB/s       19338MB/s
WRITE:  1074.6MB/s      1099.8MB/s      1105.3MB/s
WRITE:  2100.7MB/s      2010.3MB/s      2598.7MB/s
READ:   1783.2MB/s      1710.2MB/s      2027.8MB/s
WRITE:  1784.3MB/s      1712.1MB/s      2029.6MB/s
READ:   1690.8MB/s      1620.6MB/s      1893.6MB/s
WRITE:  1681.4MB/s      1611.7MB/s      1883.7MB/s
#jobs8
READ:   19883MB/s       20827MB/s       20395MB/s
READ:   18562MB/s       18178MB/s       17822MB/s
WRITE:  1240.5MB/s      1307.3MB/s      1331.7MB/s
WRITE:  2132.1MB/s      2143.6MB/s      2564.9MB/s
READ:   1841.1MB/s      1831.1MB/s      2111.4MB/s
WRITE:  1843.1MB/s      1833.1MB/s      2113.4MB/s
READ:   1795.4MB/s      1778.6MB/s      2029.3MB/s
WRITE:  1791.4MB/s      1774.5MB/s      2024.5MB/s
#jobs9
READ:   18834MB/s       19470MB/s       19402MB/s
READ:   17988MB/s       18118MB/s       18531MB/s
WRITE:  1339.4MB/s      1441.2MB/s      1512.6MB/s
WRITE:  2102.4MB/s      2111.9MB/s      2478.8MB/s
READ:   1754.5MB/s      1777.3MB/s      2050.2MB/s
WRITE:  1753.9MB/s      1776.7MB/s      2049.5MB/s
READ:   1686.4MB/s      1698.2MB/s      1931.6MB/s
WRITE:  1684.1MB/s      1696.8MB/s      1929.1MB/s
#jobs10
READ:   19128MB/s       19517MB/s       19592MB/s
READ:   18177MB/s       17544MB/s       18221MB/s
WRITE:  1397.1MB/s      1567.4MB/s      1683.2MB/s
WRITE:  2151.9MB/s      2205.1MB/s      2642.6MB/s
READ:   1879.2MB/s      1907.3MB/s      2223.2MB/s
WRITE:  1878.5MB/s      1906.2MB/s      2222.8MB/s
READ:   1835.7MB/s      1837.9MB/s      2131.4MB/s
WRITE:  1838.6MB/s      1840.8MB/s      2134.8MB/s


perf stats

                         4 streams                   8 streams                   per cpu
====================================================================================================================
jobs1
stalled-cycles-frontend  52,219,601,943 ( 55.87%)    53,406,899,652 ( 56.33%)    49,944,625,376 ( 56.27%)
stalled-cycles-backend   23,194,739,214 ( 24.82%)    24,397,423,796 ( 25.73%)    22,782,579,660 ( 25.67%)
instructions             86,078,512,819 ( 0.92)      86,235,354,709 ( 0.91)      80,378,845,354 ( 0.91)
branches                 15,732,850,506 ( 532.108)   15,743,473,327 ( 522.592)   14,725,420,241 ( 523.425)
branch-misses               104,546,578 ( 0.66%)        107,847,818 ( 0.69%)        106,343,602 ( 0.72%)
jobs2
stalled-cycles-frontend 118,614,605,521 ( 59.74%)   113,520,838,279 ( 59.94%)   104,301,243,221 ( 59.06%)
stalled-cycles-backend   59,490,170,824 ( 29.96%)    56,518,872,622 ( 29.84%)    50,161,702,782 ( 28.40%)
instructions            169,663,993,572 ( 0.85)     160,959,388,344 ( 0.85)     153,541,182,646 ( 0.87)
branches                 31,859,926,551 ( 497.945)   30,132,524,256 ( 494.660)   28,579,927,064 ( 503.079)
branch-misses               164,531,311 ( 0.52%)        163,509,596 ( 0.54%)        145,472,902 ( 0.51%)
jobs3
stalled-cycles-frontend 153,932,401,104 ( 60.86%)   158,470,334,291 ( 60.81%)   150,767,641,835 ( 59.21%)
stalled-cycles-backend   77,023,824,597 ( 30.45%)    79,673,952,089 ( 30.57%)    72,693,245,174 ( 28.55%)
instructions            197,452,119,661 ( 0.78)     204,116,060,906 ( 0.78)     207,832,729,315 ( 0.82)
branches                 36,579,918,543 ( 404.660)   37,980,582,651 ( 406.326)   39,091,715,974 ( 428.559)
branch-misses               214,292,753 ( 0.59%)        215,861,282 ( 0.57%)        203,320,703 ( 0.52%)
jobs4
stalled-cycles-frontend 237,223,396,661 ( 64.22%)   227,572,336,186 ( 64.37%)   202,100,979,033 ( 61.41%)
stalled-cycles-backend  129,935,296,918 ( 35.17%)   124,957,172,193 ( 35.34%)   103,626,575,103 ( 31.49%)
instructions            270,083,196,348 ( 0.73)     257,652,752,109 ( 0.73)     259,773,237,031 ( 0.79)
branches                 52,120,828,566 ( 391.426)   49,121,254,042 ( 385.647)   49,896,944,076 ( 420.532)
branch-misses               260,480,947 ( 0.50%)        254,957,745 ( 0.52%)        239,402,681 ( 0.48%)
jobs5
stalled-cycles-frontend 257,778,703,389 ( 64.89%)   265,688,762,182 ( 65.13%)   229,916,792,090 ( 61.41%)
stalled-cycles-backend  142,090,098,727 ( 35.77%)   147,101,411,510 ( 36.06%)   117,081,586,471 ( 31.27%)
instructions            291,859,438,730 ( 0.73)     298,380,653,546 ( 0.73)     302,840,047,693 ( 0.81)
branches                 55,111,567,225 ( 385.905)   56,316,470,332 ( 383.545)   57,500,842,324 ( 428.083)
branch-misses               270,056,201 ( 0.49%)        269,400,845 ( 0.48%)        258,495,925 ( 0.45%)
jobs6
stalled-cycles-frontend 311,626,093,277 ( 65.61%)   314,291,595,576 ( 65.77%)   249,524,291,273 ( 61.39%)
stalled-cycles-backend  174,358,063,361 ( 36.71%)   177,312,195,233 ( 37.10%)   126,508,172,269 ( 31.13%)
instructions            345,271,436,105 ( 0.73)     346,679,577,246 ( 0.73)     333,258,054,473 ( 0.82)
branches                 65,298,537,641 ( 381.664)   65,995,652,812 ( 383.717)   62,730,160,550 ( 428.999)
branch-misses               313,241,654 ( 0.48%)        307,876,772 ( 0.47%)        282,570,360 ( 0.45%)
jobs7
stalled-cycles-frontend 333,896,608,350 ( 64.68%)   349,165,441,969 ( 64.85%)   276,185,831,513 ( 59.95%)
stalled-cycles-backend  186,083,638,772 ( 36.05%)   197,000,957,906 ( 36.59%)   138,835,486,733 ( 30.14%)
instructions            388,707,023,219 ( 0.75)     404,347,465,692 ( 0.75)     394,078,203,426 ( 0.86)
branches                 71,999,476,930 ( 387.008)   76,197,698,685 ( 392.759)   73,195,649,665 ( 440.914)
branch-misses               328,598,294 ( 0.46%)        323,895,230 ( 0.43%)        298,205,996 ( 0.41%)
jobs8
stalled-cycles-frontend 378,806,234,772 ( 66.73%)   369,453,970,323 ( 66.55%)   313,738,845,641 ( 62.55%)
stalled-cycles-backend  211,732,966,238 ( 37.30%)   207,691,463,546 ( 37.41%)   161,120,924,768 ( 32.12%)
instructions            406,674,721,912 ( 0.72)     401,922,649,599 ( 0.72)     405,830,823,213 ( 0.81)
branches                 75,637,492,422 ( 369.371)   74,287,789,757 ( 371.226)   75,967,291,039 ( 420.260)
branch-misses               355,733,892 ( 0.47%)        328,972,387 ( 0.44%)        318,203,258 ( 0.42%)
jobs9
stalled-cycles-frontend 422,712,242,907 ( 66.39%)   417,293,429,710 ( 66.14%)   343,703,467,466 ( 61.35%)
stalled-cycles-backend  239,356,726,574 ( 37.59%)   231,725,068,834 ( 36.73%)   172,101,321,805 ( 30.72%)
instructions            465,964,470,967 ( 0.73)     468,561,486,803 ( 0.74)     474,119,504,255 ( 0.85)
branches                 86,724,291,348 ( 377.755)   86,534,438,758 ( 380.374)   88,431,722,886 ( 437.939)
branch-misses               385,706,052 ( 0.44%)        360,946,347 ( 0.42%)        337,858,267 ( 0.38%)
jobs10
stalled-cycles-frontend 451,844,797,592 ( 67.24%)   435,099,070,573 ( 67.18%)   352,877,428,118 ( 62.18%)
stalled-cycles-backend  255,533,666,521 ( 38.03%)   249,295,276,734 ( 38.49%)   179,754,582,074 ( 31.67%)
instructions            472,331,884,636 ( 0.70)     458,948,698,965 ( 0.71)     464,131,768,633 ( 0.82)
branches                 88,848,212,769 ( 366.556)   85,330,239,413 ( 365.282)   86,837,838,069 ( 424.329)
branch-misses               398,856,497 ( 0.45%)        359,532,394 ( 0.42%)        333,821,387 ( 0.38%)


perf reported execution time

                   4 streams       8 streams       per cpu
====================================================================
seconds elapsed    41.359653597    43.131195776    40.961640812
seconds elapsed    37.778174380    38.681792299    38.368529861
seconds elapsed    38.367149768    39.368008799    37.687545579
seconds elapsed    40.402963748    39.177529033    36.205357101
seconds elapsed    44.145428970    43.251655348    41.810848146
seconds elapsed    49.344988495    49.951048242    44.270045250
seconds elapsed    53.865398777    54.271392367    48.824173559
seconds elapsed    57.028770416    56.228105290    51.332017545
seconds elapsed    62.931350164    61.251237873    55.977463074
seconds elapsed    67.088285633    63.544376242    57.690998344

	-ss
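
a rough Python sketch for comparing the "seconds elapsed" rows above. it
assumes the ten rows are jobs1..jobs10 in that order (the table itself does
not label them), and gain_percent() is a made-up helper for illustration,
not something that fio or perf provides.

```python
# "seconds elapsed" rows from the table above, as
# (4 streams, 8 streams, per cpu). Assumption: rows are jobs1..jobs10 in order.
ELAPSED = [
    (41.359653597, 43.131195776, 40.961640812),
    (37.778174380, 38.681792299, 38.368529861),
    (38.367149768, 39.368008799, 37.687545579),
    (40.402963748, 39.177529033, 36.205357101),
    (44.145428970, 43.251655348, 41.810848146),
    (49.344988495, 49.951048242, 44.270045250),
    (53.865398777, 54.271392367, 48.824173559),
    (57.028770416, 56.228105290, 51.332017545),
    (62.931350164, 61.251237873, 55.977463074),
    (67.088285633, 63.544376242, 57.690998344),
]


def gain_percent(baseline, candidate):
    """Relative reduction in elapsed time, in percent (positive = faster)."""
    return (baseline - candidate) / baseline * 100.0


for jobs, (s4, s8, percpu) in enumerate(ELAPSED, start=1):
    print(
        f"jobs{jobs:<2} "
        f"vs 4 streams: {gain_percent(s4, percpu):6.2f}%  "
        f"vs 8 streams: {gain_percent(s8, percpu):6.2f}%"
    )
```

a negative number means per-cpu streams took longer than the fixed-size
configuration for that row.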