I assume this is a PV domU rather than HVM, right?

1. We need to check whether superpages are the culprit, using
SP_check1.patch.

2. If that fixes the problem, we need to check further where the extra 
cost comes from, using SP_check2.patch: the speculative algorithm, or 
the superpage population hypercall.

If SP_check2.patch works, the culprit is the new allocation hypercall (so 
guest creation should suffer as well); otherwise, it is the speculative 
algorithm.
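
To separate the two cases with numbers, each candidate call could be
timed the same way Brendan timed allocate_mfn_list below. Here is a
minimal sketch in C, assuming a POSIX clock_gettime. The names
call_stats, timed_call and fake_populate are made up for illustration;
fake_populate is only a stand-in for whichever allocation call is being
measured and is not a libxc function.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() */
#endif

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Running min/max/total of call durations, in microseconds. */
struct call_stats {
    uint64_t min_us;
    uint64_t max_us;
    uint64_t total_us;
    uint64_t calls;
};

static void stats_init(struct call_stats *s)
{
    s->min_us = UINT64_MAX;  /* any real sample will be smaller */
    s->max_us = 0;
    s->total_us = 0;
    s->calls = 0;
}

/* Fold one measured duration into the running statistics. */
static void stats_record(struct call_stats *s, uint64_t us)
{
    if (us < s->min_us)
        s->min_us = us;
    if (us > s->max_us)
        s->max_us = us;
    s->total_us += us;
    s->calls++;
}

/* Monotonic clock reading in microseconds. */
static uint64_t now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Time a single call of fn(arg), record it in s, return fn's result. */
static int timed_call(struct call_stats *s, int (*fn)(void *), void *arg)
{
    uint64_t t0;
    int rc;

    assert(fn != NULL);
    t0 = now_us();
    rc = fn(arg);
    stats_record(s, now_us() - t0);
    return rc;
}

/*
 * Hypothetical stand-in for the allocation call under test. It only
 * counts how often it was invoked; replace it with the real call.
 */
static int fake_populate(void *arg)
{
    unsigned long *count = arg;

    (*count)++;
    return 0;
}
```

Keeping one call_stats for the superpage path and another for the
normal-page path, and printing min_us and max_us for each kernel, should
give figures that can be compared directly with Brendan's table.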

Does that make sense?

Thanks,
edwin


Brendan Cully wrote:
> On Thursday, 03 June 2010 at 06:47, Keir Fraser wrote:
>   
>> On 03/06/2010 02:04, "Brendan Cully" <Brendan@cs.ubc.ca> wrote:
>>
>>     
>>> I've done a bit of profiling of the restore code and observed the
>>> slowness here too. It looks to me like it's probably related to
>>> superpage changes. The big hit appears to be at the front of the
>>> restore process during calls to allocate_mfn_list, under the
>>> normal_page case. It looks like we're calling
>>> xc_domain_memory_populate_physmap once per page here, instead of
>>> batching the allocation? I haven't had time to investigate further
>>> today, but I think this is the culprit.
>>>       
>> Ccing Edwin Zhai. He wrote the superpage logic for domain restore.
>>     
>
> Here's some data on the slowdown going from 2.6.18 to pvops dom0:
>
> I wrapped the call to allocate_mfn_list in uncanonicalize_pagetable
> to measure the time to do the allocation.
>
> kernel, min call time, max call time
> 2.6.18, 4 us, 72 us
> pvops, 202 us, 10696 us (!)
>
> It looks like pvops is dramatically slower to perform the
> xc_domain_memory_populate_physmap call!
>
> I'll attach the patch and raw data below.
>   

-- 
best rgds,
edwin