From: Dario Faggioli
Subject: Re: what will happen for a floated vcpu?
Date: Fri, 1 Apr 2016 16:18:34 +0200
Message-ID: <1459520314.5082.275.camel@citrix.com>
To: tutu sky, xen devel
List-Id: xen-devel@lists.xenproject.org

On Thu, 2016-03-31 at 16:27 +0000, tutu sky wrote:
> Thanks Dario,
> Yeah, you are totally right, but software crashes are almost a result
> of hardware faults.
>
Well, no, I'd say that software crashes are almost always due to bugs in the software itself. Then there are also hardware faults.

> I mean if a part of a core for example register file or maybe
> floating point unit or even ALU, suddenly stop working or does
> malfunction, what will happen for the corresponding vcpu which it is
> running? Does that vcpu crash (as a part of a software) or migrate to
> another core?
>
I still don't follow. When you say "register file or ALU stop working or malfunction", that does not mean much unless you specify what "malfunction" means.

For instance, if a CPU starts to behave in such a way that you store 0xabcd in register R1 and then, when you later read it, you read 0xdcba, what would happen? How does Xen (or any other hypervisor or OS) know that this is a malfunction?
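To make the register example concrete, here is a toy Python sketch. It is not Xen code and does not model real hardware; the class and function names are made up purely for illustration. It simulates a register file whose reads return the hex digits reversed, and shows that the read itself raises no error, so nothing tells the running software that the value is wrong.

```python
class ToyRegisterFile:
    """Hypothetical register file, used only to illustrate the example."""

    def __init__(self, faulty=False):
        self.regs = {}
        self.faulty = faulty

    def write(self, name, value):
        self.regs[name] = value

    def read(self, name):
        value = self.regs[name]
        if self.faulty:
            # Simulated fault: the four hex digits come back reversed,
            # so 0xabcd is read as 0xdcba. No exception is raised.
            value = int(f"{value:04x}"[::-1], 16)
        return value


def store_and_load(regfile, value):
    """Store a value in R1 and read it back, as ordinary code would."""
    regfile.write("R1", value)
    return regfile.read("R1")


good = store_and_load(ToyRegisterFile(), 0xABCD)
bad = store_and_load(ToyRegisterFile(faulty=True), 0xABCD)
print(hex(good), hex(bad))
```

Both calls return normally and only the returned value differs. Noticing the second case would need some form of redundancy (for example, a second copy to compare against), which ordinary code does not carry.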
Well, that entirely depends on what specific piece of code is running when the malfunction manifests. What will likely happen is that the execution of the software is altered in a way the software was not expecting and, at some point, things crash. Whether it is the hypervisor that crashes, or the OS of a guest, or a user-level process in a guest, again depends on a lot of things.

So, really, to a question like "what happens to a Xen system if the ALU starts to malfunction?", the only possible answer is "bad things, but who knows".

> You just please imagine recoverable faults. Has Xen any plan for deal
> with such a situation?
>
The "situation" you're asking about is too broad to have a single plan for it as a whole. For specific issues, there may or may not be plans or already implemented solutions.

For example (but I don't really know whether this is related to what you're thinking about), there is this:
https://en.wikipedia.org/wiki/Machine-check_exception

And Xen does some MCE stuff (but I don't know the details myself).

Regards,
Dario
-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)