From: Joanna Rutkowska
Subject: Re: Xen 4.0.0x allows for data corruption in Dom0
Date: Tue, 09 Mar 2010 00:23:12 +0100
Message-ID: <4B9586E0.2060005@invisiblethingslab.com>
References: <4B922A89.2060105@invisiblethingslab.com> <4B957914.4050408@goop.org> <4B957B93.4060401@invisiblethingslab.com> <4B958475.3050407@goop.org>
In-Reply-To: <4B958475.3050407@goop.org>
To: Jeremy Fitzhardinge
Cc: xen-devel@lists.xensource.com

On 03/09/2010 12:12 AM, Jeremy Fitzhardinge wrote:
> I think it's most likely to be a dom0 bug, specifically a bug in one of
> the backend drivers. The common failure mode which causes symptoms like
> this is when a granted page (=a domU page mapped into dom0) is released
> back into dom0's heap and reused as general memory while still being
> under the control of the domU.
>
> However, given that the domU hasn't got any devices assigned to it aside
> from the console, none of the backends should be coming into play. It
> might be a more general problem with the privcmd interface.
>
> Alternatively, I suppose, the domain builder could end up using some of
> dom0 pages to construct the domU without properly freeing them, which
> would suggest a bug in the balloon driver.
>
> I can't think of a Xen failure-mode which would cause these symptoms
> without also being massively obvious in other cases. (But "I can't
> think of..." is where all the best bugs hide.)
>

But the corruptions always happen in 32-byte chunks, which might suggest
it's not a page-related problem (e.g. a wrongly reused page), as in that
case we would be observing (at least sometimes) much bigger chunks of
corrupted data, I think.

The reason why I still believe it's a hypervisor-related thing is that
I'm currently using the very *same* Dom0 kernel (very recent
xen/stable-2.6.31) with Xen 3.4.2 and the system is damn stable. And I
really mean extensive use, with 5-7 VMs running all the time doing
various things from Web browsing to kernel building.

If I were to make an educated guess, I would say it's something related
to interrupt handling, i.e. Xen mishandling it, e.g. the handler writing
outside its buffer somewhere, and the write just happens to land in the
Dom0 fs buffer used by e.g. a dd operation.

joanna.
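
One way to check the 32-byte observation above is to compare a known-good copy of the data with the corrupted copy and record the offset and length of every differing run. The following is only a minimal Python sketch, assuming both copies are available as equal-length byte strings; the function name `corrupted_runs` and the sample buffers are hypothetical and not taken from the thread.

```python
def corrupted_runs(good: bytes, bad: bytes):
    """Return (offset, length) for each maximal run of differing bytes."""
    if len(good) != len(bad):
        raise ValueError("buffers must have the same length")
    runs = []
    start = None  # offset where the current differing run began, if any
    for i, (a, b) in enumerate(zip(good, bad)):
        if a != b:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        # The buffers still differ at the very last byte.
        runs.append((start, len(good) - start))
    return runs


# Hypothetical data: an 8 KiB zero-filled buffer with one 32-byte chunk overwritten.
good = bytes(8192)
bad = bytearray(good)
bad[4096 + 64:4096 + 96] = b"\xff" * 32

runs = corrupted_runs(good, bytes(bad))
for offset, length in runs:
    # Report whether each run starts on a 32-byte boundary.
    print(f"offset={offset:#x} length={length} aligned32={offset % 32 == 0}")
```

A run can look shorter than the region that was actually overwritten if some of the stray bytes happen to equal the original data, so run lengths are best read as a lower bound. Runs that consistently come out as 32 bytes, rather than a page or more, would be in line with the argument that this is not a wrongly reused page.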