From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: More RTC issues with Win2k3
Date: Wed, 4 Sep 2013 14:37:51 +0100
Message-ID: <522737AF.9080404@citrix.com>
References: <5225E614.50900@citrix.com>
	<52271BCD02000078000F0466@nat28.tlf.novell.com>
	<522728DE.2020103@citrix.com>
	<5227476702000078000F0571@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <5227476702000078000F0571@nat28.tlf.novell.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich <JBeulich@suse.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>, Xen-devel List <xen-devel@lists.xen.org>
List-Id: xen-devel@lists.xenproject.org

On 04/09/13 13:44, Jan Beulich wrote:
>>>> On 04.09.13 at 14:34, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 04/09/13 10:38, Jan Beulich wrote:
>>> Which raises two questions: Does that specific version of Windows
>>> not honor the WAET flags saying that REG_C reads are unnecessary?
>>> Or does this only occur during very early boot (where iirc a first,
>>> temporary RTC interrupt handler gets installed for a very brief period
>>> of time that doesn't pay attention to the WAET flag)?
>> When the VM falls into the loop, it is still in text mode with "Starting
>> windows..." and a block progress bar which is full.  This means that
>> ntldr has finished loading the base drivers using int 13h.  From
>> Xentrace, we do see that it is in 64 bit mode, so execution is probably
>> right at the beginning of the kernel, even before switching the VGA mode.
> So that might then be the early probing interrupt handler that I
> had found they install transiently.
>
>>>> I have attached xen-hvmctx from the affected domain, and do have one
>>>> example of a VM in this loop so can poke for other state, if there are
>>>> any sensible suggestions
>>> The REG_A value says 64Hz for the periodic interrupt if I'm not
>>> mistaken, so RTC_PF getting re-set between two iterations would
>>> first of all hint at a significantly overcommitted system (such that
>>> no two iterations of the loop can complete within 1/64 second).
>> This is part of our automatic testing.  There are two VMs (32 and 64bit
>> variants) running the same set of tests, being basic lifecycle/migrate
>> etc loops.  The hosts are not overcommitted in the slightest.
> In which case, unless there's some scheduler anomaly involved, I
> see no explanation for the behavior.
>
> Not knowing what precise data Xentrace produces - does that include
> any timing information? If so, what's the smallest delta between two
> of these REG_C reads?
>
> Jan
>

Yes.  Here is a larger sample:

]  2.862018629  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176
   2.862018629  d98v0 io write port 70 val c
]  2.862019461  d98v0 vmentry cycles 1996
]  2.862019936  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177
   2.862019936  d98v0 io read port 71 val c0
]  2.862021275  d98v0 vmentry cycles 3214
]  2.862021752  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176
   2.862021752  d98v0 io write port 70 val c
]  2.862022560  d98v0 vmentry cycles 1938
]  2.862023068  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177
   2.862023068  d98v0 io read port 71 val c0
]  2.862024410  d98v0 vmentry cycles 3220
]  2.862024886  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176
   2.862024886  d98v0 io write port 70 val c
]  2.862025735  d98v0 vmentry cycles 2037
]  2.862026207  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177
   2.862026207  d98v0 io read port 71 val c0
]  2.862027537  d98v0 vmentry cycles 3190
]  2.862028012  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176
   2.862028012  d98v0 io write port 70 val c
]  2.862028854  d98v0 vmentry cycles 2021
]  2.862029322  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177
   2.862029322  d98v0 io read port 71 val c0
]  2.862030668  d98v0 vmentry cycles 3232
]  2.862031145  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176
   2.862031145  d98v0 io write port 70 val c
]  2.862031954  d98v0 vmentry cycles 1941
]  2.862032462  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177
   2.862032462  d98v0 io read port 71 val c0
]  2.862033762  d98v0 vmentry cycles 3118
]  2.862034233  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814176
   2.862034233  d98v0 io write port 70 val c
]  2.862035082  d98v0 vmentry cycles 2036
]  2.862035558  d98v0 vmexit exit_reason IO_INSTRUCTION eip fffff80000814177
   2.862035558  d98v0 io read port 71 val c0
]  2.862036937  d98v0 vmentry cycles 3309

George will know for certain, but as far as I am aware, the timestamp
column is calculated from raw TSC counter values embedded in the trace
records.

Either way, it is on the order of 2 microseconds difference around
iterations of the loops, with a substantial proportion of that being
vmextry/vmexit.

As an easy starting point of reference, I will completely disable
writing the WAET table to see whether that makes a difference.

~Andrew