From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: More RTC issues with Win2k3 Date: Wed, 4 Sep 2013 14:37:51 +0100 Message-ID: <522737AF.9080404@citrix.com> References: <5225E614.50900@citrix.com> <52271BCD02000078000F0466@nat28.tlf.novell.com> <522728DE.2020103@citrix.com> <5227476702000078000F0571@nat28.tlf.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <5227476702000078000F0571@nat28.tlf.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: George Dunlap , Xen-devel List List-Id: xen-devel@lists.xenproject.org On 04/09/13 13:44, Jan Beulich wrote: >>>> On 04.09.13 at 14:34, Andrew Cooper wrote: >> On 04/09/13 10:38, Jan Beulich wrote: >>> Which raises two questions: Does that specific version of Windows >>> not honor the WAET flags saying that REG_C reads are unnecessary? >>> Or does this only occur during very early boot (where iirc a first, >>> temporary RTC interrupt handler gets installed for a very brief period >>> of time that doesn't pay attention to the WAET flag)? >> When the VM falls into the loop, it is still in text mode with "Starting >> windows..." and a block progress bar which is full. This means that >> ntldr has finished loading the base drivers using int 13h. From >> Xentrace, we do see that it is in 64 bit mode, so execution is probably >> right at the beginning of the kernel, even before switching the VGA mode. > So that might then be the early probing interrupt handler that I > had found they install transiently. > >>>> I have attached xen-hvmctx from the affected domain, and do have one >>>> example of a VM in this loop so can poke for other state, if there are >>>> any sensible suggestions >>> The REG_A value says 64Hz for the periodic interrupt if I'm not >>> mistaken, so RTC_PF getting re-set between two iterations would >>> first of all hint at a significantly overcommitted system (such that >>> no two iterations of the loop can complete within 1/64 second). >> This is part of our automatic testing. There are two VMs (32 and 64bit >> variants) running the same set of tests, being basic lifecycle/migrate >> etc loops. The hosts are not overcommitted in the slightest. > In which case, unless there's some scheduler anomaly involved, I > see no explanation for the behavior. > > Not knowing what precise data Xentrace produces - does that include > any timing information? If so, what's the smallest delta between two > of these REG_C reads? > > Jan > Yes. Here is a larger sample: ] 2.862018629 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176 2.862018629 d98v0 io write port 70 val c ] 2.862019461 d98v0 vmentry cycles 1996 ] 2.862019936 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177 2.862019936 d98v0 io read port 71 val c0 ] 2.862021275 d98v0 vmentry cycles 3214 ] 2.862021752 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176 2.862021752 d98v0 io write port 70 val c ] 2.862022560 d98v0 vmentry cycles 1938 ] 2.862023068 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177 2.862023068 d98v0 io read port 71 val c0 ] 2.862024410 d98v0 vmentry cycles 3220 ] 2.862024886 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176 2.862024886 d98v0 io write port 70 val c ] 2.862025735 d98v0 vmentry cycles 2037 ] 2.862026207 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177 2.862026207 d98v0 io read port 71 val c0 ] 2.862027537 d98v0 vmentry cycles 3190 ] 2.862028012 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176 2.862028012 d98v0 io write port 70 val c ] 2.862028854 d98v0 vmentry cycles 2021 ] 2.862029322 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177 2.862029322 d98v0 io read port 71 val c0 ] 2.862030668 d98v0 vmentry cycles 3232 ] 2.862031145 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176 2.862031145 d98v0 io write port 70 val c ] 2.862031954 d98v0 vmentry cycles 1941 ] 2.862032462 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177 2.862032462 d98v0 io read port 71 val c0 ] 2.862033762 d98v0 vmentry cycles 3118 ] 2.862034233 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176 2.862034233 d98v0 io write port 70 val c ] 2.862035082 d98v0 vmentry cycles 2036 ] 2.862035558 d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177 2.862035558 d98v0 io read port 71 val c0 ] 2.862036937 d98v0 vmentry cycles 3309 George will know for certain, but as far as I am aware, the timestamp column is calculated from raw TSC counter values embedded in the trace records. Either way, it is on the order of 2 microseconds difference around iterations of the loops, with a substantial proportion of that being vmextry/vmexit. As an easy starting point of reference, I will completely disable writing the WAET table to see whether that makes a difference. ~Andrew