From: xuehai zhang <hai@cs.uchicago.edu>
To: "Santos, Jose Renato G (Jose Renato Santos)" <joserenato.santos@hp.com>
Cc: m+Ian.Pratt@cl.cam.ac.uk, Xen-devel@lists.xensource.com,
	Aravind Menon <aravind.menon@epfl.ch>,
	G John Janakiraman <john@arivalai.hpl.hp.com>,
	"Turner, Yoshio" <yoshio_turner@hp.com>
Subject: Re: MPI benchmark performance gap between native linux and domU
Date: Tue, 05 Apr 2005 23:24:17 -0500
Message-ID: <42536471.7070802@cs.uchicago.edu>
In-Reply-To: <6C21311CEE34E049B74CC0EF339464B902FB2C@cacexc12.americas.cpqcorp.net>

Jose,

Thank you for helping to diagnose the problem!

I largely agree with you that the problem is due to network latency. The throughput reported by the 
SendRecv benchmark is computed directly from the latency, using the following formula (where 
#_of_messages is 2, message_size is in bytes, and latency is in microseconds, as the division by 
10^6 implies):
	throughput = ((#_of_messages * message_size)/2^20)/(latency/10^6)
So the performance gap really comes from the higher latency in domU, and PMB's SendRecv benchmark 
is indeed sensitive to the round-trip latency. I would very much like to hear Keir's comments on 
the behavior of event notifications in the inter-domain I/O channel for networking.
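
As a sanity check on the formula above, here is a minimal Python sketch. It reads the two constants 
as 2^20 (bytes per MB) and 10^6 (microseconds per second) and treats the latency values as 
microseconds, because that reading roughly reproduces the throughput figures reported for SendRecv. 
The function name and the sample rows picked from the quoted SendRecv table are chosen for 
illustration only; this is not PMB's own code.

```python
def sendrecv_throughput(message_size, latency_usec, n_messages=2):
    """Throughput in MB/sec from message size (bytes) and latency (usec)."""
    if latency_usec <= 0:
        raise ValueError("latency must be positive")
    megabytes = (n_messages * message_size) / 2**20  # bytes -> MB
    seconds = latency_usec / 10**6                   # usec -> sec
    return megabytes / seconds


# (message size in bytes, native-linux latency, domU-A latency),
# taken from the quoted SendRecv latency table.
samples = [(1024, 368.72, 758.19), (4096, 730.59, 1612.66)]

for size, native, domu in samples:
    print(size,
          round(sendrecv_throughput(size, native), 2),
          round(sendrecv_throughput(size, domu), 2))
```

The computed values land close to, but not exactly on, the reported throughput columns. That is 
expected if each reported point is an average over several runs, since the average of the per-run 
throughputs is not the same as the throughput of the average latency.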

BTW, as I stated in my previous emails, besides the SendRecv benchmark, I also have results for the 
11 other PMB benchmarks for both native linux and domU. The following are the PingPing results 
(between 2 nodes) from my experiments. As you can see, the performance gap is not as big as for 
SendRecv, and the performance is very close in several test cases. Part of the reason might be that 
only two nodes are used and only the one-way latency is used to calculate the latency and 
throughput values.

Best,
Xuehai

P.S.
Note: each reported data point in the following tables is the
average over 10 runs of the same experiment, as for SendRecv.

PingPing Throughput (MB/sec)
Msg-size(bytes) #repetitions      native-linux    domU
             0         1000         0.00          0.00
             1         1000         0.01           0.00
             2         1000         0.01           0.01
             4         1000         0.02           0.01
             8         1000         0.04           0.02
            16         1000         0.09           0.04
            32         1000         0.17           0.09
            64         1000         0.33           0.17
           128         1000         0.65           0.33
           256         1000         1.19           0.62
           512         1000         1.95           1.06
          1024         1000         2.80           1.73
          2048         1000         3.74           2.52
          4096         1000         5.38           3.77
          8192         1000         6.49           4.79
         16384         1000         7.45           4.97
         32768         1000         6.74           5.27
         65536          640         5.89           3.07
        131072          320         5.27           3.11
        262144          160         5.09           3.88
        524288           80         5.00           4.84
       1048576           40         4.95           4.91
       2097152           20         4.94           4.89
       4194304           10         4.93           4.92

PingPing Latency/Startup (usec)
Msg-size(bytes) #repetitions      native-linux     domU
             0         1000       172.78          342.89
             1         1000       176.12          346.23
             2         1000       173.48          344.20
             4         1000       177.05          346.15
             8         1000       177.54          343.56
            16         1000       178.71          346.47
            32         1000       176.71          351.25
            64         1000       183.83          359.41
           128         1000       188.09          371.94
           256         1000       204.64          393.79
           512         1000       250.63          462.45
          1024         1000       349.20          565.03
          2048         1000       521.56          773.63
          4096         1000       726.62         1036.23
          8192         1000      1204.54         1630.43
         16384         1000      2097.42         3143.95
         32768         1000      4633.77         5930.04
         65536          640     10604.54        20335.55
        131072          320     23717.61        40174.68
        262144          160     49146.14        64505.20
        524288           80     99962.09       103390.30
       1048576           40    202000.30       203478.00
       2097152           20    404857.10       408950.55
       4194304           10    812047.60       813135.50
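
To make the size of the PingPing gap easier to see, the following Python sketch computes the extra 
latency and the domU/native ratio for a few rows copied from the latency table above. The helper 
name and the choice of rows are illustrative assumptions, not part of PMB.

```python
# (message size in bytes, native-linux latency in usec, domU latency in usec),
# copied from the PingPing latency table.
pingping = [
    (0, 172.78, 342.89),
    (1024, 349.20, 565.03),
    (65536, 10604.54, 20335.55),
    (4194304, 812047.60, 813135.50),
]


def overhead(native_usec, domu_usec):
    """Return (extra latency in usec, domU/native latency ratio)."""
    return domu_usec - native_usec, domu_usec / native_usec


for size, native, domu in pingping:
    extra, ratio = overhead(native, domu)
    print(f"{size:>8} bytes: +{extra:10.2f} usec, {ratio:.2f}x native")
```

For these rows the ratio is close to 2 for the smallest messages and close to 1 for the largest 
ones, which matches the observation that the two columns nearly converge at large message sizes.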

Santos, Jose Renato G (Jose Renato Santos) wrote:
>   Xuehai,
> 
>   Thanks for posting your new results. In fact it seems that your
> problem is not the same as the one we encountered.
>   
>   I believe your problem is due to a higher network latency in Xen. Your
> formula to compute throughput uses the inverse of round trip latency (if
> I understood it correctly). This probably means that your application is
> sensitive to the round trip latency. Your latency measurements show a
> higher value for domainU and this is the reason for the lower
> throughput.  I am not sure but it is possible that network interrupts or
> event notifications in the inter-domain channel are being coalesced and
> causing longer latency. Keir, do event notifications get coalesced in
> the inter-domain I/O channel for networking?
> 
>   Renato
>   
> 
> 
>>>-----Original Message-----
>>>From: xuehai zhang [mailto:hai@cs.uchicago.edu] 
>>>Sent: Tuesday, April 05, 2005 3:23 PM
>>>To: Santos, Jose Renato G (Jose Renato Santos); 
>>>m+Ian.Pratt@cl.cam.ac.uk
>>>Cc: Xen-devel@lists.xensource.com; Aravind Menon; Turner, 
>>>Yoshio; G John Janakiraman
>>>Subject: Re: [Xen-devel] MPI benchmark performance gap 
>>>between native linux anddomU
>>>
>>>
>>>Hi Ian and Jose,
>>>
>>>Based on your suggestions, I did two more experiments: one 
>>>(with tag "domU-B" in the table below) changes the TCP 
>>>advertised window of domU to -2 (the default is 2), and the 
>>>other (with tag "dom0" in the table below) repeats the 
>>>experiment in dom0 (with only dom0 running). The following 
>>>table contains the results from these two new experiments 
>>>plus the two old ones (with tags "native-linux" and "domU-A" 
>>>in the table below) from my previous email.
>>>
>>>I have the following observations from the results:
>>>
>>>1. Decreasing the scaling of the TCP window ("domU-B") does 
>>>not improve performance; it slightly degrades it (compared 
>>>with "domU-A").
>>>
>>>2. Generally, the performance of running the experiments in 
>>>dom0 ("dom0" column) is very close to (slightly below) the 
>>>performance on native linux ("native-linux" column). However, 
>>>in certain situations dom0 outperforms native linux, for 
>>>example in throughput when the message size is 64KB and in 
>>>latency when the message size is 1, 2, 4, or 8 bytes.
>>>
>>>3. The performance gap between domU and dom0 is big, similar 
>>>to the gap between domU and native linux.
>>>
>>>BTW, each reported data point in the following table is the 
>>>average over 10 runs of the same experiment. I forgot to 
>>>mention that in the experiments using user domains, the 8 
>>>domUs form a private network and each domU is assigned a 
>>>private network IP (for example, 192.168.254.X).
>>>
>>>Xuehai
>>>
>>>*********************************
>>>*SendRecv Throughput(Mbytes/sec)*
>>>*********************************
>>>
>>>Msg Size(bytes)    native-linux   dom0          domU-A         domU-B
>>>         0         0              0.00          0              0.00
>>>         1         0              0.01          0              0.00
>>>         2         0              0.01          0              0.00
>>>         4         0              0.03          0              0.00
>>>         8         0.04           0.05          0.01           0.01
>>>        16         0.16           0.11          0.01           0.01
>>>        32         0.34           0.21          0.02           0.02
>>>        64         0.65           0.42          0.04           0.04
>>>       128         1.17           0.79          0.09           0.10
>>>       256         2.15           1.44          0.59           0.58
>>>       512         3.4            2.39          1.23           1.22
>>>      1024         5.29           3.79          2.57           2.50
>>>      2048         7.68           5.30          3.5            3.44
>>>      4096         10.7           8.51          4.96           5.23
>>>      8192         13.35          11.06         7.07           6.00
>>>     16384         14.9           13.60         3.77           4.62
>>>     32768         9.85           11.13         3.68           4.34
>>>     65536         5.06           9.06          3.02           3.14
>>>    131072         7.91           7.61          4.94           5.04
>>>    262144         7.85           7.65          5.25           5.29
>>>    524288         7.93           7.77          6.11           5.40
>>>   1048576         7.85           7.82          6.5            5.62
>>>   2097152         8.18           7.35          5.44           5.32
>>>   4194304         7.55           6.88          4.93           4.92
>>>
>>>*********************************
>>>*     SendRecv Latency(usec)    *
>>>*********************************
>>>
>>>Msg Size(bytes)    native-linux   dom0          domU-A         domU-B
>>>         0         1979.6         1920.83       3010.96        3246.71
>>>         1         1724.16        397.27        3218.88        3219.63
>>>         2         1669.65        297.58        3185.3         3298.86
>>>         4         1637.26        285.27        3055.67        3222.34
>>>         8         406.77         282.78        2966.17        3001.24
>>>        16         185.76         283.87        2777.89        2761.90
>>>        32         181.06         284.75        2791.06        2798.77
>>>        64         189.12         293.93        2940.82        3043.55
>>>       128         210.51         310.47        2716.3         2495.83
>>>       256         227.36         338.13        843.94         853.86
>>>       512         287.28         408.14        796.71         805.51
>>>      1024         368.72         515.59        758.19         786.67
>>>      2048         508.65         737.12        1144.24        1150.66
>>>      4096         730.59         917.97        1612.66        1516.35
>>>      8192         1170.22        1411.94       2471.65        2650.17
>>>     16384         2096.86        2297.19       8300.18        6857.13
>>>     32768         6340.45        5619.56       17017.99       14392.36
>>>     65536         24640.78       13787.31      41264.5        39871.19
>>>    131072         31709.09       32797.52      50608.97       49533.68
>>>    262144         63680.67       65174.67      94918.13       94157.30
>>>    524288         125531.7       128116.73     162168.47      189307.05
>>>   1048576         251566.94      252257.55     321451.02      361714.44
>>>   2097152         477431.32      527432.60     707981         728504.38
>>>   4194304         997768.35      1108898.61    1503987.61     1534795.56
>>>
