From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Joanna Rutkowska <joanna@invisiblethingslab.com>
Cc: xen-devel@lists.xensource.com
Subject: Re: Xen 4.0.0x allows for data corruption in Dom0
Date: Mon, 08 Mar 2010 14:24:20 -0800
Message-ID: <4B957914.4050408@goop.org>
In-Reply-To: <4B922A89.2060105@invisiblethingslab.com>
On 03/06/2010 02:12 AM, Joanna Rutkowska wrote:
> There is a nasty data corruption problem most likely allowed by a bug in
> the Xen 4.0.0-x hypervisors.
>
> The problem occurs with a frequency of "a few chunks per 10 GB of data
> copied", and only when running a VM (PV domU) with a specific kernel.
> The problem, however, affects not only the VM but also the Dom0, which
> is of significant importance.
>
> How to reproduce:
>
> 1) Start at least one Xen PV VM with a pvops0 kernel. One kernel known
> to demonstrate the problem is the one built by Michael Young, based on
> xen/master git from Dec 23. It has recently been replaced by a newer
> kernel, which doesn't always show the problem, but I uploaded the
> previous one at the URL below, so people can use it for testing:
>
> http://invisiblethingslab.com/pub/kernel-2.6.31.9-1.2.82.xendom0.fc12.x86_64.rpm
>
> Now you can start a dummy VM with this kernel, e.g.:
>
> # xm create -c /dev/null memory=400 kernel=<path/to/kernel>
> extra="rootdelay=1000"
>
> 2) Now, in Dom0, after having started this dummy VM, create a big test
> file, filled all with zeros. Make sure to choose a size bigger than your
> DRAM size, to avoid fs caching effect, e.g.:
>
> $ dd if=/dev/zero of=test bs=1M count=10000
>
> That should create a 10GB file. Make sure to use /dev/zero and not
> /dev/null!
>
> 3) Once the test file has been created, check that it really consists
> only of zeros:
>
> $ xxd test | grep -v "0000 0000 0000 0000 0000 0000 0000 0000"
>
> Normally you should not get any output. However, I consistently get
> something like this:
>
> 4593a000:940d 0000 0000 0000 2d40 d6fc c803 0000 ........-@......
> 4593a010:00f6 1f52 b301 0000 b620 dcd5 ff00 0000 ...R..... ......
> a5df0000:e542 712c 77da c9f9 a429 4b85 ecc4 9395 .Bq,w....)K.....
> a5df0010:d9d6 971f 0d58 5c70 aba6 387d 805f 09e2 .....X\p..8}._..
> ceecb000:f80d 0000 0000 0000 096e 1cdc e403 0000 .........n......
> ceecb010:2460 7ef6 be01 0000 b620 dcd5 ff00 0000 $`~...... ......
> 148432000580e 0000 0000 0000 5665 ed9d ff03 0000 X.......Ve......
> 1484320107bcc a023 ca01 0000 b620 dcd5 ff00 0000 {..#..... ......
> 1c548b000bc0e 0000 0000 0000 6942 387d 1b04 0000 ........iB8}....
> 1c548b010872b 01c8 d501 0000 b620 dcd5 ff00 0000 .+....... ......
> 225d450004448 27cd b966 b37e 1f0c e9e3 c2db b6ee DH'..f.~........
> 225d45010d2b2 55b8 9ef1 e818 a7e3 364d 2322 dc75 ..U.......6M#".u
> 242056000140f 0000 0000 0000 0bb0 3704 3404 0000 ..........7.4...
> 2420560109601 b606 e001 0000 b620 dcd5 ff00 0000 ......... ......
>
> The actual data vary between tests; however, the "dcd5 ff00 0000"
> pattern seems to be repeatable on a given system with a given hypervisor
> binary (the above numbers are for Xen-4.0.0-rc5 built from Michael
> Young's SRPM). The errors always occur in chunks of 32 bytes.
>
> We have tested this in our lab on three different machines, with various
> Dom0 kernels -- based on xen/master (AKA xen/stable-2.6.31) and
> xen/stable (AKA xen/stable-2.6.32) -- and with a few Xen 4 hypervisors
> (rc2, rc4, rc5). Not every kernel allows for reproducing the error with
> such a simple "dummy" VM as the one given above -- e.g. the 2.6.32-based
> kernels required some more regular VMs to be started for the problem to
> be noticeable. However, with the previously mentioned kernel (M. Young
> Dec23), the problem has been 100% reproducible for us.
>
> When downgraded to Xen 3.4.2 the problem went away.
>
> Of course this problem cannot be attributed to a buggy VM kernel, as the
> hypervisor should be resistant to any kind of "wrong" software (buggy or
> malicious) that executes in a VM.
>
Why "of course"? Your report looks to me like a bug in dom0 which is
causing data corruption when there's another domain running. I don't
see anything that specifically implicates Xen. The fact that the
symptoms change with a different Xen version could mean a kernel bug is
affected by the Xen version (different memory layout, for example, or
different paths in the kernel caused by different feature availability).
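
To compare runs across kernels and hypervisor builds, it may help to
record exactly where each bad chunk lands rather than eyeballing xxd
output. What follows is only a minimal Python sketch, not something
taken from the report: the function name, the read size, and the 4 KiB
page size are assumptions, and the 32-byte granularity is simply the
figure quoted above. In the quoted dump every damaged run starts at an
offset ending in 000, so the script also prints the offset within the
page.

```python
import sys

CHUNK = 32        # corruption granularity reported in the quoted mail
PAGE = 4096       # assumed x86 page size
BLOCK = 1 << 20   # read size; a multiple of CHUNK so chunks never straddle reads


def nonzero_chunks(path, chunk=CHUNK, block=BLOCK):
    """Yield (offset, data) for each chunk of the file that is not all zeros."""
    with open(path, "rb") as f:
        base = 0
        while True:
            buf = f.read(block)
            if not buf:
                break
            # Fast path: skip blocks made up entirely of zero bytes.
            if buf.count(0) != len(buf):
                for i in range(0, len(buf), chunk):
                    piece = buf[i:i + chunk]
                    if any(piece):
                        yield base + i, piece
            base += len(buf)


if __name__ == "__main__" and len(sys.argv) > 1:
    for offset, data in nonzero_chunks(sys.argv[1]):
        print(f"{offset:#x} page_offset={offset % PAGE:#x} {data.hex()}")
```

Run against the file written by dd (for example, saved under the
hypothetical name scan_zeros.py and invoked as "python3 scan_zeros.py
test"), it should print nothing for a clean file and one line per
damaged 32-byte chunk otherwise.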
J