All of lore.kernel.org
 help / color / mirror / Atom feed
From: xuehai zhang <hai@cs.uchicago.edu>
To: "Petersson, Mats" <mats.petersson@amd.com>
Cc: Xen Mailing List <xen-devel@lists.xensource.com>,
	Kate Keahey <keahey@mcs.anl.gov>,
	Tim Freeman <tfreeman@mcs.anl.gov>
Subject: Re: open/stat64 syscalls run faster on Xen VM than standard Linux
Date: Tue, 29 Nov 2005 07:48:01 -0600	[thread overview]
Message-ID: <438C5C11.2040702@cs.uchicago.edu> (raw)
In-Reply-To: <907625E08839C4409CE5768403633E0B0EAACB@sefsexmb1.amd.com>

Mats,

Thank you so much for your help.

>>-----Original Message-----
>>From: xen-devel-bounces@lists.xensource.com 
>>[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of 
>>xuehai zhang
>>Sent: 29 November 2005 12:40
>>To: Anthony Liguori
>>Cc: Kate Keahey; Tim Freeman; Xen Mailing List
>>Subject: Re: [Xen-devel] open/stat64 syscalls run faster on 
>>Xen VM than standard Linux
>>
>>Anthony Liguori wrote:
>>
>>>This may just be the difference between having the extra level of 
>>>block caching from using a loop back device.
>>>
>>>Try running the same benchmark on a domain that uses an actual 
>>>partition.  While the syscalls may appear to be faster, I 
>>
>>imagine it's 
>>
>>>because the cost of pulling in a block has already been 
>>
>>payed so the 
>>
>>>overall workload is unaffected.
>>
>>I created a new domainU by using the physical partition 
>>instead of the loopback file as the backends of the VBDs and 
>>I reran the "strace -c /bin/sh -c /bin/echo foo" benchmark 
>>inside of the domU. The following are the results. Comparing 
>>with the results for domU with loopback files as VBDs I 
>>reported in the previous email (quoted below), the average 
>>time of open/stat64 syscalls are very similar, but still much 
>>smaller than the values for standard Linux. If the reason 
>>that open/stat64 run faster on domU with loopback files as 
>>VBDs is because of the extra level of block caching from 
>>using a loop back device, why open/stat64 still run 
>>similarily faster on domU with physical partition as VBDs 
>>when there is no extra level of block caching from using a 
>>loop back device?
> 
> 
> Again, have you run the same test-case on the same physical partition?
> You should be able to do the same trick with mount and chroot for this
> one too. 

Yes, I did. I shutdown the partition-based domU first, then mount the partition and chroot to it, 
and redid "strace -c /bin/sh -c /bin/echo foo". The following results show the write cost is similar 
as the value of physical machine while open and stat64 are still similar as values of loopback 
file-based domU.

# strace -c /bin/sh -c /bin/echo foo

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  21.17    0.000482         482         1           write
  16.07    0.000366          24        15           old_mmap
  14.54    0.000331          37         9         3 open
  11.55    0.000263          10        26           brk
   8.83    0.000201          40         5           read
   7.47    0.000170          85         2           munmap
   4.04    0.000092           8        11           rt_sigaction
   3.16    0.000072          12         6           close
   2.85    0.000065          11         6           fstat64
   2.20    0.000050          25         2           access
   2.20    0.000050           8         6           rt_sigprocmask
   1.80    0.000041          14         3           uname
   1.27    0.000029          15         2           stat64
   0.35    0.000008           8         1           time
   0.35    0.000008           8         1           getppid
   0.35    0.000008           8         1           getpgrp
   0.35    0.000008           8         1           getuid32
   0.35    0.000008           8         1           getgid32
   0.31    0.000007           7         1           getpid
   0.26    0.000006           6         1           execve
   0.26    0.000006           6         1           geteuid32
   0.26    0.000006           6         1           getegid32
------ ----------- ----------- --------- --------- ----------------
100.00    0.002277                   103         3 total

Some more information:
The partition I use is /dev/sda6, which is mapped to /tmp in dom0.

# mount /tmp

# df -lh
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             1.5G  728M  661M  53% /
tmpfs                  62M  4.0K   62M   1% /dev/shm
/dev/sda5             938M  250M  640M  29% /var
/dev/sda6             4.2G  2.9G  1.2G  72% /tmp

# ls /tmp
bcfg-run.nightly  dbuta/  initrd/      media/              mpd2.console_root=    opt/      sbin/  usr/
beckman/          dev/    lib/         mnt/                mpd2.logfile_rbutler  proc/     srv/   var/
bin/              etc/    lost+found/  mpd2/               mpd2.logfile_robl     root/     sys/   vosg/
boot/             home/   mdtest*      mpd2.console_robl=  mpd2.logfile_root     sandbox/  tmp/ 
zoid_threads/

> This is just to rule out other variations (where in the file-system
> different files live, just as an example). 
> 
> I'm also not entirely convinced that you don't get some double-caching
> with this method too - there is the real hard-disk driver in Dom0, then
> the virtual disk driver in DomU, and the file-system driver in DomU. I
> don't know if the real or virtual block-device driver itself will cache
> things or not. Someone who knows may want to comment on this... 

I would like to know the caching mechanism of real or virtual block-device driver too.

> Still, I agree with Anthony - run some REAL loads on the system, rather
> than microbenchmark a particular, very small function of the whole
> system. Microbenchmarks are very good at showing how long some
> particular thing takes, but they are also very sensitive to extremely
> small changes in the system architecture, which for a bigger load would
> not be noticable. 

> I'm doing my best to explain things here, but I'm not sure if I'm
> getting the point across, as it seems like you're trying very hard to
> get back to the same original question. We do know that Xen behaves
> different than Linux. One reason for this is that it's dealing with
> hardware devices through a virtual device. This will most of the time
> slow things down, but as you've seen, for example due to the
> double-caching effect, you may sometimes actually gain from it,
> particularly with very small workloads. 

I am sorry that I didn't describe before about why I am doing all these microbechmarks. The reason 
behind is actually coming from the fact that the VM outperforms the physical machine when I ran some 
real application on both of them originally (VM and physical machine are configured as similar as 
possible). We all expect that the Xen VM presents a little bit worse performance than the compatible 
physical machine. So, the performance "gain" on VM to run the same application is really an abnormal 
observation, which has been made me worried a lot in the past week. That is why I spent so much time 
tracing down what happens in the application until the level of system calls. So, the problem is 
original from running the actual application.

Thanks.

Xuehai

>>XenLinux (physical partition as VBDs)
>>
>>root@cctest1:~/c# strace -c /bin/sh -c /bin/echo foo
>>% time     seconds  usecs/call     calls    errors syscall
>>------ ----------- ----------- --------- --------- ----------------
>>  39.56    0.001955        1955         1           write
>>  18.94    0.000936         936         1           execve
>>   7.65    0.000378          24        16           old_mmap
>>   7.57    0.000374          42         9         2 open
>>   6.27    0.000310          52         6           read
>>   5.10    0.000252          84         3           munmap
>>   4.92    0.000243           9        26           brk
>>   1.92    0.000095          14         7           close
>>   1.78    0.000088           8        11           rt_sigaction
>>   1.40    0.000069          10         7           fstat64
>>   1.01    0.000050           8         6           rt_sigprocmask
>>   0.93    0.000046          23         2           access
>>   0.79    0.000039          13         3           uname
>>   0.69    0.000034          17         2           stat64
>>   0.38    0.000019          19         1           ioctl
>>   0.16    0.000008           8         1           getppid
>>   0.16    0.000008           8         1           getpgrp
>>   0.14    0.000007           7         1           time
>>   0.14    0.000007           7         1           getuid32
>>   0.14    0.000007           7         1           getgid32
>>   0.12    0.000006           6         1           getpid
>>   0.12    0.000006           6         1           getegid32
>>   0.10    0.000005           5         1           geteuid32
>>------ ----------- ----------- --------- --------- ----------------
>>100.00    0.004942                   109         2 total
>>
>>Thanks.
>>
>>Xuehai
>>
>>
>>
>>>xuehai zhang wrote:
>>>
>>>
>>>>Dear all,
>>>>When I debugged the execution performance of an application using 
>>>>strace, I found there are some system calls like open and stat64 
>>>>which run faster on XenLinux than the standard Linux. The 
>>
>>following 
>>
>>>>is the output of running "strace -c /bin/sh -c /bin/echo 
>>
>>foo" on both 
>>
>>>>systems. An open call runs averagely 109 usec on standard 
>>
>>Linux but 
>>
>>>>only 41 usecs on XenLinux. An stat64 call runs
>>>>75 usecs on standard Linux but only 19 usecs on XenLinux.
>>>>The Xen VM runs on the same physical machine as the 
>>
>>standard Linux. 
>>
>>>>It uses loopback files in dom0 as the backends of VBDs.
>>>>Any insight is highly appreciated.
>>>>Thanks.
>>>>Xuehai
>>>>
>>>>XenLinux:
>>>># strace -c /bin/sh -c /bin/echo foo
>>>>
>>>>% time     seconds  usecs/call     calls    errors syscall
>>>>------ ----------- ----------- --------- --------- ----------------
>>>> 39.05    0.001972        1972         1           write
>>>> 19.35    0.000977         977         1           execve
>>>>  7.74    0.000391          24        16           old_mmap
>>>>  7.23    0.000365          41         9         2 open
>>>>  6.06    0.000306          51         6           read
>>>>  5.17    0.000261          10        26           brk
>>>>  4.93    0.000249          83         3           munmap
>>>>  1.98    0.000100          14         7           close
>>>>  1.90    0.000096           9        11           rt_sigaction
>>>>  1.52    0.000077          11         7           fstat64
>>>>  1.01    0.000051           9         6           rt_sigprocmask
>>>>  0.95    0.000048          24         2           access
>>>>  0.81    0.000041          14         3           uname
>>>>  0.75    0.000038          19         2           stat64
>>>>  0.38    0.000019          19         1           ioctl
>>>>  0.18    0.000009           9         1           time
>>>>  0.18    0.000009           9         1           getppid
>>>>  0.16    0.000008           8         1           getpgrp
>>>>  0.16    0.000008           8         1           getuid32
>>>>  0.14    0.000007           7         1           getgid32
>>>>  0.12    0.000006           6         1           getpid
>>>>  0.12    0.000006           6         1           geteuid32
>>>>  0.12    0.000006           6         1           getegid32
>>>>------ ----------- ----------- --------- --------- ----------------
>>>>100.00    0.005050                   109         2 total
>>>>
>>>>
>>>>Standard Linux:
>>>># strace -c /bin/sh -c /bin/echo foo
>>>>
>>>>% time     seconds  usecs/call     calls    errors syscall
>>>>------ ----------- ----------- --------- --------- ----------------
>>>> 22.90    0.000982         109         9         2 open
>>>> 22.85    0.000980         980         1           execve
>>>> 10.87    0.000466          29        16           old_mmap
>>>> 10.45    0.000448         448         1           write
>>>>  7.06    0.000303          51         6           read
>>>>  6.67    0.000286          10        30           brk
>>>>  3.61    0.000155          78         2           access
>>>>  3.50    0.000150          75         2           stat64
>>>>  2.91    0.000125          42         3           munmap
>>>>  2.24    0.000096          14         7           close
>>>>  2.12    0.000091          13         7           fstat64
>>>>  1.84    0.000079           7        11           rt_sigaction
>>>>  1.03    0.000044           7         6           rt_sigprocmask
>>>>  0.72    0.000031          10         3           uname
>>>>  0.19    0.000008           8         1           geteuid32
>>>>  0.16    0.000007           7         1           time
>>>>  0.16    0.000007           7         1           getppid
>>>>  0.16    0.000007           7         1           getpgrp
>>>>  0.16    0.000007           7         1           getuid32
>>>>  0.14    0.000006           6         1           getpid
>>>>  0.14    0.000006           6         1           getgid32
>>>>  0.12    0.000005           5         1           getegid32
>>>>------ ----------- ----------- --------- --------- ----------------
>>>>100.00    0.004289                   112         2 total
>>>>
>>>>_______________________________________________
>>>>Xen-devel mailing list
>>>>Xen-devel@lists.xensource.com
>>>>http://lists.xensource.com/xen-devel
>>>>
>>>
>>>
>>
>>_______________________________________________
>>Xen-devel mailing list
>>Xen-devel@lists.xensource.com
>>http://lists.xensource.com/xen-devel
>>
>>
> 
> 
> 

  reply	other threads:[~2005-11-29 13:48 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-11-29 12:59 open/stat64 syscalls run faster on Xen VM than standard Linux Petersson, Mats
2005-11-29 13:48 ` xuehai zhang [this message]
  -- strict thread matches above, loose matches on Subject: below --
2005-11-28 18:17 Petersson, Mats
2005-11-28 16:15 Petersson, Mats
2005-11-28 17:16 ` xuehai zhang
2005-11-28 19:37   ` Anthony Liguori
2005-11-28 14:45 Petersson, Mats
2005-11-28 15:28 ` xuehai zhang
2005-11-28 15:50 ` xuehai zhang
2005-11-28 16:07   ` Stephen C. Tweedie
2005-11-28 16:27     ` xuehai zhang
2005-11-28 17:10       ` Stephen C. Tweedie
2005-11-29  4:50         ` Tim Freeman
2005-11-29 13:27         ` xuehai zhang
2005-11-28 14:36 Petersson, Mats
2005-11-28 14:47 ` xuehai zhang
2005-11-28  8:21 xuehai zhang
2005-11-28 14:38 ` Anthony Liguori
2005-11-28 14:59   ` xuehai zhang
2005-11-29 12:39   ` xuehai zhang
2005-11-29 12:58     ` xuehai zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=438C5C11.2040702@cs.uchicago.edu \
    --to=hai@cs.uchicago.edu \
    --cc=keahey@mcs.anl.gov \
    --cc=mats.petersson@amd.com \
    --cc=tfreeman@mcs.anl.gov \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.