From: George Dunlap
Subject: Re: Xen 4.3 development update
Date: Fri, 3 May 2013 17:41:35 +0100
Message-ID: <5183E8BF.7030706@eu.citrix.com>
In-Reply-To: <20130502154856.GO65547@ocelot.phlegethon.org>
To: Tim Deegan
Cc: Andres Lagar-Cavilla, Peter Maloney, suravee.suthikulpanit@amd.com,
    Jan Beulich, xen-devel@lists.xen.org

On 02/05/13 16:48, Tim Deegan wrote:
> At 15:21 +0200 on 29 Apr (1367248894), Peter Maloney wrote:
>> On 04/04/2013 07:05 PM, Tim Deegan wrote:
>>> Also, if there is still a bad slowdown, caused by the p2m lookups, this
>>> might help a little bit:
>>>
>>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>>> index 38e87ce..7bd8646 100644
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -1361,6 +1361,18 @@ int hvm_hap_nested_page_fault(paddr_t gpa,
>>>          }
>>>      }
>>>
>>> +
>>> +    /* For the benefit of 32-bit WinXP (& older Windows) on AMD CPUs,
>>> +     * a fast path for LAPIC accesses, skipping the p2m lookup. */
>>> +    if ( !nestedhvm_vcpu_in_guestmode(v)
>>> +         && gfn == vlapic_base_address(vcpu_vlapic(current)) >> PAGE_SHIFT )
>>> +    {
>>> +        if ( !handle_mmio() )
>>> +            hvm_inject_hw_exception(TRAP_gp_fault, 0);
>>> +        rc = 1;
>>> +        goto out;
>>> +    }
>>> +
>>>      p2m = p2m_get_hostp2m(v->domain);
>>>      mfn = get_gfn_type_access(p2m, gfn, &p2mt, &p2ma,
>>>                                P2M_ALLOC | (access_w ? P2M_UNSHARE : 0), NULL);
>> This patch (applied to 4.2.2) has a very large improvement on my box
>> (AMD FX-8150) and WinXP 32 bit.
> Hmm - I expected it to be only a mild improvement.  How about this one,
> which puts in the same shortcut in another place as well?  I don't think
> it will be much better than the last one, but it's worth a try.

So I dusted off my old perf testing scripts and added one to measure boot
performance.  The numbers below are boot times in seconds, measured from
the moment "xl create" returns until a specific python daemon running in
the VM starts responding to requests, so lower is better.  There are a
number of places where a few seconds of noise can creep in either way,
but on the whole the tests seem fairly repeatable.

I ran this with w2k3eesp2 and with winxpsp3, using some of the
auto-install test images made for XenServer regression testing.  All of
them use a flat file disk backend with qemu-traditional.
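For reference, the timing harness amounts to roughly the sketch below.
This is only an illustration, not the actual script; the config path,
guest address and daemon port here are made up.

# Rough sketch of the boot-time measurement (illustration only; the
# real scripts differ, and VM_CONFIG / GUEST_ADDR are hypothetical).
import socket
import subprocess
import time

VM_CONFIG = "/etc/xen/winxp.cfg"      # hypothetical guest config
GUEST_ADDR = ("192.168.1.10", 8000)   # hypothetical in-guest daemon

subprocess.check_call(["xl", "create", VM_CONFIG])
start = time.time()                   # clock starts once "xl create" returns

while True:
    try:
        sock = socket.create_connection(GUEST_ADDR, timeout=1)
        sock.close()
        break                         # daemon answered: boot counts as done
    except socket.error:
        time.sleep(1)

print("boot time: %.0f seconds" % (time.time() - start))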
Results are in order of commits:

Xen 4.1:
  w2k3:  43 34 34 33 34
  winxp: 110 111 111 110 112

Xen 4.2:
  w2k3:  34 44 45 45 45
  winxp: 203 221 210 211 200

Xen-unstable w/ RTC fix:
  w2k3:  43 44 44 45 44
  winxp: 268 275 265 276 265

Xen-unstable with RTC fix + this "fast lapic" patch:
  w2k3:  43 45 44 45 45
  winxp: 224 232 232 232 232

So w2k3 boots fairly quickly anyway; it takes roughly a 30% slow-down
(34s to 45s) going from 4.1 to 4.2, and shows no discernible change
after that.  winxp boots fairly slowly to begin with; its boot time
nearly doubles going from 4.1 to 4.2, and gets even worse on
xen-unstable.  The patch is a measurable improvement, but still nowhere
near 4.1, or even 4.2.

On the whole, however, I'm not sure that boot time by itself is a
blocker.  If the problem really is primarily the "eager TPR" issue for
Windows XP, then I'm not terribly motivated either: the Citrix PV
drivers patch Windows XP so that the routine is lazy (like w2k3's);
there is hardware available which allows the TPR to be virtualized; and
there are plenty of Windows-based OSes available which do not have this
problem.

I'll be doing some more workload-based benchmarks (probably starting
with the Windows DDK example build) to see what other issues turn up.

 -George