From mboxrd@z Thu Jan 1 00:00:00 1970 From: Suravee Suthikulanit Subject: Re: Xen 4.3 development update Date: Thu, 4 Apr 2013 12:14:00 -0500 Message-ID: <515DB4D8.2090809@amd.com> References: <515B186F02000078000CA1A7@nat28.tlf.novell.com> <20130402163440.GB17022@ocelot.phlegethon.org> <515B101A.7050501@amd.com> <515B6E4B.8080108@amd.com> <515C09A8.8010104@eu.citrix.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------040401050409050508060202" Return-path: In-Reply-To: <515C09A8.8010104@eu.citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: George Dunlap Cc: "Tim (Xen.org)" , Andres Lagar-Cavilla , Jan Beulich , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org --------------040401050409050508060202 Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit On 4/3/2013 5:51 AM, George Dunlap wrote: > On 03/04/13 00:48, Suravee Suthikulanit wrote: >> On 4/2/2013 12:06 PM, Suravee Suthikulpanit wrote: >>> On 4/2/2013 11:34 AM, Tim Deegan wrote: >>>> At 16:42 +0100 on 02 Apr (1364920927), Jan Beulich wrote: >>>>>>>> On 02.04.13 at 16:07, George Dunlap >>>>>>>> wrote: >>>>>> * AMD NPT performance regression after c/s 24770:7f79475d3de7 >>>>>> owner: ? >>>>>> Reference: http://marc.info/?l=xen-devel&m=135075376805215 >>>>> This is supposedly fixed with the RTC changes Tim committed the >>>>> other day. Suravee, is that correct? >>>> This is a separate problem. IIRC the AMD XP perf issue is caused >>>> by the >>>> emulation of LAPIC TPR accesses slowing down with Andres's p2m locking >>>> patches. XP doesn't have 'lazy IRQL' or support for CR8, so it >>>> takes a >>>> _lot_ of vmexits for IRQL reads and writes. >>> Is there any tools or good ways to count the number of VMexit in Xen? >>> >> Tim/Jan, >> >> I have used iperf benchmark to compare network performance (bandwidth) >> between the two versions of the hypervisor: >> 1. good: 24769:730f6ed72d70 >> 2. bad: 24770:7f79475d3de7 >> >> In the "bad" case, I am seeing that the network bandwidth has dropped >> about 13-15%. >> >> However, when I uses the xentrace utility to trace the number of VMEXIT, >> I actually see about 25% more number of VMEXIT in the good case. This >> is inconsistent with the statement that Tim mentioned above. > > I was going to say, what I remember from my little bit of > investigation back in November, was that it had all the earmarks of > micro-architectural "drag", which happens when the TLB or the caches > can't be effective. > > Suvaree, if you look at xenalyze, a microarchitectural "drag" looks like: > * fewer VMEXITs, but > * time for each vmexit takes longer > > If you post the results of "xenalyze --svm-mode -s" for both traces, I > can tell you what I see. > > -George > Here's another version of the outputs from xenalyze with only VMEXIT. In this case, I pin all the VCPUs (4) and pin my application process to VCPU 3. NOTE: This measurement is without the RTC bug. BAD: -- v3 -- Runstates: running: 1 4.51s 10815429411 {10815429411|10815429411|10815429411} cpu affinity: 1 10816540697 {10816540697|10816540697|10816540697} [7]: 1 10816540697 {10816540697|10816540697|10816540697} Exit reasons: VMEXIT_CR0_READ 633 0.00s 0.00% 1503 cyc { 1092| 1299| 2647} VMEXIT_CR4_READ 3 0.00s 0.00% 1831 cyc { 1309| 1659| 2526} VMEXIT_CR0_WRITE 305 0.00s 0.00% 1660 cyc { 1158| 1461| 2507} VMEXIT_CR4_WRITE 6 0.00s 0.00% 19771 cyc { 1738| 5031|79600} VMEXIT_EXCEPTION_NM 1 0.00s 0.00% 2272 cyc { 2272| 2272| 2272} VMEXIT_INTR 28 0.00s 0.00% 3374 cyc { 1225| 3770| 6095} VMEXIT_VINTR 388 0.00s 0.00% 1023 cyc { 819| 901| 1744} VMEXIT_PAUSE 33 0.00s 0.00% 7476 cyc { 4881| 6298|18941} VMEXIT_HLT 388 3.35s 14.84% 20701800 cyc {169589|3848166|55770601} VMEXIT_IOIO 5581 0.19s 0.85% 82514 cyc { 4250|81909|146439} VMEXIT_NPF 108072 0.71s 3.14% 15702 cyc { 6362| 6865|37280} Guest interrupt counts: Emulate eip list GOOD: -- v3 -- Runstates: running: 4 12.10s 7257234016 {18132721625|18132721625|18132721625} lost: 12 1.24s 248210482 {188636654|719488416|719488416} cpu affinity: 1 32007462122 {32007462122|32007462122|32007462122} [7]: 1 32007462122 {32007462122|32007462122|32007462122} Exit reasons: VMEXIT_CR0_READ 4748 0.00s 0.01% 1275 cyc { 1007| 1132| 1878} VMEXIT_CR4_READ 6 0.00s 0.00% 1752 cyc { 1189| 1629| 2600} VMEXIT_CR0_WRITE 3099 0.00s 0.01% 1541 cyc { 1157| 1420| 2151} VMEXIT_CR4_WRITE 12 0.00s 0.00% 4105 cyc { 1885| 4380| 5515} VMEXIT_EXCEPTION_NM 18 0.00s 0.00% 2169 cyc { 1973| 2152| 2632} VMEXIT_INTR 258 0.00s 0.00% 4622 cyc { 1358| 4235| 8987} VMEXIT_VINTR 2552 0.00s 0.00% 971 cyc { 850| 928| 1131} VMEXIT_PAUSE 370 0.00s 0.00% 5758 cyc { 4381| 5688| 7933} VMEXIT_HLT 1505 6.14s 27.19% 9788981 cyc {268573|3768704|56331182} VMEXIT_IOIO 53835 1.97s 8.74% 87959 cyc { 4996|82423|144207} VMEXIT_NPF 855101 2.06s 9.13% 5787 cyc { 4903| 5328| 8572} Guest interrupt counts: Emulate eip list Suravee --------------040401050409050508060202 Content-Type: text/plain; charset="windows-1252"; name="xp32.vmexit.xenalyze.bad" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="xp32.vmexit.xenalyze.bad" Content-Description: xp32.vmexit.xenalyze.bad Total time: 22.55 seconds (using cpu speed 2.40 GHz) --- Log volume summary --- - cpu 4 - gen : 632 hvm : 9327532 +-vmentry: 3497832 +-vmexit : 5829700 - cpu 5 - gen : 724 hvm : 11119980 +-vmentry: 4170000 +-vmexit : 6949980 - cpu 6 - gen : 616 hvm : 8962956 +-vmentry: 3361116 +-vmexit : 5601840 - cpu 7 - gen : 3008 hvm : 48541004 +-vmentry: 18202884 +-vmexit : 30338120 |-- Domain 1 --| Runstates: full run: 7 4.51s 1545220099 {10780557145|10780557145|10780557145} Grant table ops: Done by: Done for: Populate-on-demand: Populated: Reclaim order: Reclaim contexts: -- v0 -- Runstates: running: 1 4.51s 10815791389 {10815791389|10815791389|10815791389} cpu affinity: 1 10815791389 {10815791389|10815791389|10815791389} [4]: 1 10815791389 {10815791389|10815791389|10815791389} Exit reasons: VMEXIT_CR0_READ 183 0.00s 0.00% 1812 cyc { 1122| 1632| 3498} VMEXIT_CR4_READ 2 0.00s 0.00% 2062 cyc { 1926| 2199| 2199} VMEXIT_CR0_WRITE 78 0.00s 0.00% 1996 cyc { 1149| 1658| 4217} VMEXIT_CR4_WRITE 4 0.00s 0.00% 56884 cyc {15001|68359|90794} VMEXIT_INTR 7 0.00s 0.00% 5646 cyc { 2429| 6054| 7887} VMEXIT_VINTR 76 0.00s 0.00% 1509 cyc { 878| 1060| 3594} VMEXIT_PAUSE 157 0.00s 0.00% 7879 cyc { 5241| 6691|15013} VMEXIT_HLT 370 3.94s 17.49% 25583983 cyc {829018|16636281|55835489} VMEXIT_IOIO 5578 0.18s 0.82% 79293 cyc { 2231|88278|152838} VMEXIT_NPF 19651 0.30s 1.33% 36744 cyc { 6849|25715|127962} Guest interrupt counts: -- v1 -- Runstates: running: 1 4.49s 10785203391 {10785203391|10785203391|10785203391} cpu affinity: 1 10786466878 {10786466878|10786466878|10786466878} [5]: 1 10786466878 {10786466878|10786466878|10786466878} Exit reasons: VMEXIT_CR0_READ 340 0.00s 0.00% 1415 cyc { 1071| 1272| 2171} VMEXIT_CR4_READ 2 0.00s 0.00% 1746 cyc { 1724| 1769| 1769} VMEXIT_CR0_WRITE 10 0.00s 0.00% 2283 cyc { 1134| 1912| 4267} VMEXIT_CR4_WRITE 4 0.00s 0.00% 29555 cyc { 4872|34069|47533} VMEXIT_EXCEPTION_NM 1 0.00s 0.00% 2021 cyc { 2021| 2021| 2021} VMEXIT_INTR 13 0.00s 0.00% 5124 cyc { 1666| 5530| 6878} VMEXIT_VINTR 165 0.00s 0.00% 986 cyc { 832| 883| 1302} VMEXIT_PAUSE 113 0.00s 0.00% 7287 cyc { 4375| 5933|13541} VMEXIT_HLT 396 3.89s 17.26% 23592797 cyc {176514|12738957|55766839} VMEXIT_IOIO 4127 0.15s 0.68% 88700 cyc { 3721|87403|151659} VMEXIT_NPF 25689 0.34s 1.52% 32124 cyc { 6473|16878|115309} Guest interrupt counts: -- v2 -- Runstates: running: 1 4.49s 10781631107 {10781631107|10781631107|10781631107} cpu affinity: 1 10781820632 {10781820632|10781820632|10781820632} [6]: 1 10781820632 {10781820632|10781820632|10781820632} Exit reasons: VMEXIT_CR0_READ 249 0.00s 0.00% 1755 cyc { 1120| 1584| 3082} VMEXIT_CR4_READ 2 0.00s 0.00% 2033 cyc { 1499| 2567| 2567} VMEXIT_CR0_WRITE 2 0.00s 0.00% 3977 cyc { 3327| 4628| 4628} VMEXIT_CR4_WRITE 4 0.00s 0.00% 72590 cyc {39319|88002|91428} VMEXIT_INTR 15 0.00s 0.00% 4280 cyc { 1394| 4727| 7035} VMEXIT_VINTR 135 0.00s 0.00% 1277 cyc { 851| 974| 2976} VMEXIT_PAUSE 80 0.00s 0.00% 8354 cyc { 5487| 8652|13277} VMEXIT_HLT 364 3.93s 17.43% 25917764 cyc {113516|17552184|55783615} VMEXIT_IOIO 3873 0.14s 0.63% 88434 cyc { 4299|88250|148914} VMEXIT_NPF 29959 0.32s 1.40% 25322 cyc { 6529|14372|104649} Guest interrupt counts: -- v3 -- Runstates: running: 1 4.51s 10815429411 {10815429411|10815429411|10815429411} cpu affinity: 1 10816540697 {10816540697|10816540697|10816540697} [7]: 1 10816540697 {10816540697|10816540697|10816540697} Exit reasons: VMEXIT_CR0_READ 633 0.00s 0.00% 1503 cyc { 1092| 1299| 2647} VMEXIT_CR4_READ 3 0.00s 0.00% 1831 cyc { 1309| 1659| 2526} VMEXIT_CR0_WRITE 305 0.00s 0.00% 1660 cyc { 1158| 1461| 2507} VMEXIT_CR4_WRITE 6 0.00s 0.00% 19771 cyc { 1738| 5031|79600} VMEXIT_EXCEPTION_NM 1 0.00s 0.00% 2272 cyc { 2272| 2272| 2272} VMEXIT_INTR 28 0.00s 0.00% 3374 cyc { 1225| 3770| 6095} VMEXIT_VINTR 388 0.00s 0.00% 1023 cyc { 819| 901| 1744} VMEXIT_PAUSE 33 0.00s 0.00% 7476 cyc { 4881| 6298|18941} VMEXIT_HLT 388 3.35s 14.84% 20701800 cyc {169589|3848166|55770601} VMEXIT_IOIO 5581 0.19s 0.85% 82514 cyc { 4250|81909|146439} VMEXIT_NPF 108072 0.71s 3.14% 15702 cyc { 6362| 6865|37280} Guest interrupt counts: Emulate eip list --------------040401050409050508060202 Content-Type: text/plain; charset="windows-1252"; name="xp32.vmexit.xenalyze.good" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="xp32.vmexit.xenalyze.good" Content-Description: xp32.vmexit.xenalyze.good Total time: 22.58 seconds (using cpu speed 2.40 GHz) --- Log volume summary --- - cpu 4 - gen : 1424 hvm : 19260932 +-vmentry: 7222872 +-vmexit : 12038060 - cpu 5 - gen : 980 hvm : 12922980 +-vmentry: 4846140 +-vmexit : 8076840 - cpu 6 - gen : 792 hvm : 9098224 +-vmentry: 3411864 +-vmexit : 5686360 - cpu 7 - gen : 3836 hvm : 59030776 +-vmentry: 22136556 +-vmexit : 36894220 |-- Domain 1 --| Runstates: full run: 26 12.51s 1154609114 {18123228220|18123228220|18123228220} lost: 6 0.83s 331270856 {401481966|714242403|714242403} Grant table ops: Done by: Done for: Populate-on-demand: Populated: Reclaim order: Reclaim contexts: -- v0 -- Runstates: running: 4 12.31s 7386445159 {18311864874|18311864874|18311864874} lost: 6 0.64s 256384507 {160195345|728981821|728981821} cpu affinity: 1 31084149226 {31084149226|31084149226|31084149226} [4]: 1 31084149226 {31084149226|31084149226|31084149226} Exit reasons: VMEXIT_CR0_READ 148 0.00s 0.00% 1675 cyc { 1093| 1600| 2980} VMEXIT_CR4_READ 11 0.00s 0.00% 1832 cyc { 1064| 1639| 2681} VMEXIT_CR0_WRITE 22 0.00s 0.00% 2615 cyc { 1134| 1796| 4476} VMEXIT_CR4_WRITE 22 0.00s 0.00% 3497 cyc { 1280| 3298| 7146} VMEXIT_INTR 185 0.00s 0.00% 4650 cyc { 1381| 4134| 8633} VMEXIT_VINTR 205 0.00s 0.00% 1398 cyc { 970| 1136| 2320} VMEXIT_PAUSE 1323 0.00s 0.02% 6533 cyc { 4789| 6238| 9305} VMEXIT_HLT 1366 8.57s 37.96% 15057220 cyc {1112579|7520597|56280627} VMEXIT_IOIO 46347 1.61s 7.13% 83338 cyc { 2676|83036|145300} VMEXIT_NPF 390000 1.05s 4.65% 6453 cyc { 5005| 5748| 9907} Guest interrupt counts: -- v1 -- Runstates: running: 3 12.30s 9840745963 {18408860623|18408860623|18408860623} lost: 5 0.65s 310382375 {436938050|714242403|714242403} cpu affinity: 1 31074181912 {31074181912|31074181912|31074181912} [5]: 1 31074181912 {31074181912|31074181912|31074181912} Exit reasons: VMEXIT_CR0_READ 385 0.00s 0.00% 1492 cyc { 1018| 1415| 2223} VMEXIT_CR4_READ 6 0.00s 0.00% 1745 cyc { 1234| 1717| 2514} VMEXIT_CR0_WRITE 34 0.00s 0.00% 2360 cyc { 1259| 2306| 5210} VMEXIT_CR4_WRITE 12 0.00s 0.00% 5338 cyc { 1703| 5089| 9430} VMEXIT_INTR 108 0.00s 0.00% 6004 cyc { 2082| 5121|12020} VMEXIT_VINTR 197 0.00s 0.00% 1138 cyc { 861| 994| 2170} VMEXIT_PAUSE 833 0.00s 0.01% 6431 cyc { 4328| 5904| 9108} VMEXIT_HLT 1158 9.86s 43.67% 20434762 cyc {358508|13214097|56352969} VMEXIT_IOIO 36732 1.35s 5.96% 87903 cyc { 4019|85234|146724} VMEXIT_NPF 130435 0.39s 1.72% 7144 cyc { 5106| 6142|11174} Guest interrupt counts: -- v2 -- Runstates: running: 4 12.08s 7249658168 {18566248465|18566248465|18566248465} lost: 4 0.86s 517210853 {717049906|858313777|858313777} cpu affinity: 1 31067476087 {31067476087|31067476087|31067476087} [6]: 1 31067476087 {31067476087|31067476087|31067476087} Exit reasons: VMEXIT_CR0_READ 689 0.00s 0.00% 1503 cyc { 1024| 1410| 2428} VMEXIT_CR4_READ 6 0.00s 0.00% 1504 cyc { 1346| 1488| 1757} VMEXIT_CR0_WRITE 19 0.00s 0.00% 2624 cyc { 1234| 1923| 4311} VMEXIT_CR4_WRITE 12 0.00s 0.00% 4285 cyc { 1975| 4454| 6885} VMEXIT_INTR 56 0.00s 0.00% 5723 cyc { 1282| 5202| 9900} VMEXIT_VINTR 358 0.00s 0.00% 1181 cyc { 905| 1010| 2202} VMEXIT_PAUSE 416 0.00s 0.01% 7554 cyc { 4645| 8157| 9454} VMEXIT_HLT 1087 10.47s 46.37% 23113320 cyc {172100|16138284|56338801} VMEXIT_IOIO 21645 0.78s 3.47% 86779 cyc { 3764|85054|146473} VMEXIT_NPF 111974 0.36s 1.57% 7609 cyc { 5131| 7397|11731} Guest interrupt counts: -- v3 -- Runstates: running: 4 12.10s 7257234016 {18132721625|18132721625|18132721625} lost: 12 1.24s 248210482 {188636654|719488416|719488416} cpu affinity: 1 32007462122 {32007462122|32007462122|32007462122} [7]: 1 32007462122 {32007462122|32007462122|32007462122} Exit reasons: VMEXIT_CR0_READ 4748 0.00s 0.01% 1275 cyc { 1007| 1132| 1878} VMEXIT_CR4_READ 6 0.00s 0.00% 1752 cyc { 1189| 1629| 2600} VMEXIT_CR0_WRITE 3099 0.00s 0.01% 1541 cyc { 1157| 1420| 2151} VMEXIT_CR4_WRITE 12 0.00s 0.00% 4105 cyc { 1885| 4380| 5515} VMEXIT_EXCEPTION_NM 18 0.00s 0.00% 2169 cyc { 1973| 2152| 2632} VMEXIT_INTR 258 0.00s 0.00% 4622 cyc { 1358| 4235| 8987} VMEXIT_VINTR 2552 0.00s 0.00% 971 cyc { 850| 928| 1131} VMEXIT_PAUSE 370 0.00s 0.00% 5758 cyc { 4381| 5688| 7933} VMEXIT_HLT 1505 6.14s 27.19% 9788981 cyc {268573|3768704|56331182} VMEXIT_IOIO 53835 1.97s 8.74% 87959 cyc { 4996|82423|144207} VMEXIT_NPF 855101 2.06s 9.13% 5787 cyc { 4903| 5328| 8572} Guest interrupt counts: Emulate eip list --------------040401050409050508060202 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --------------040401050409050508060202--