From: George Dunlap
Subject: Re: [PATCH 1 of 2 RFC] xen, pod: Zero-check recently populated pages (checklast)
Date: Thu, 14 Jun 2012 15:24:46 +0100
Message-ID: <4FD9F42E.2060707@eu.citrix.com>
In-Reply-To: <20120614090725.GC82539@ocelot.phlegethon.org>
References: <4145b32d0c43d7d46650.1339155932@exile> <4FD205E80200007800088C36@nat28.tlf.novell.com> <20120614090725.GC82539@ocelot.phlegethon.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
To: Tim Deegan
Cc: Jan Beulich, xen-devel
List-Id: xen-devel@lists.xenproject.org

On 14/06/12 10:07, Tim Deegan wrote:
> At 13:02 +0100 on 08 Jun (1339160536), Jan Beulich wrote:
>>>>> On 08.06.12 at 13:45, George Dunlap wrote:
>>> --- a/xen/include/asm-x86/p2m.h
>>> +++ b/xen/include/asm-x86/p2m.h
>>> @@ -287,6 +287,9 @@ struct p2m_domain {
>>>          unsigned reclaim_super;    /* Last gpfn of a scan */
>>>          unsigned reclaim_single;   /* Last gpfn of a scan */
>>>          unsigned max_guest;        /* gpfn of max guest demand-populate */
>>> +#define POD_HISTORY_MAX 128
>>> +        unsigned last_populated[POD_HISTORY_MAX]; /* gpfn of last guest page demand-populated */
>
> This is the gpfns of the last 128 order-9 superpages populated, right?

Ah, yes -- just order 9.

> Also, this line is >80 columns - I think I saw a few others in this series.

I'll go through and check, thanks.

>> unsigned long?
>>
>> Also, wouldn't it be better to allocate this table dynamically, at
>> once allowing its size to scale with the number of vCPU-s in the
>> guest?
>
> You could even make it a small per-vcpu array, assuming that the parallel
> scrubbing will be symmetric across vcpus.

I can't remember exactly what I found here (this was last summer, when I
was doing the tests); it may be that Windows creates a bunch of tasks
which migrate across the various cpus.  If that is the case, a global list
would be better than per-vcpu lists.  The problem with dynamically scaling
the list is that I don't have a heuristic to hand for how to scale it.

In either case, it's quite possible that making a change without re-testing
would significantly reduce the effectiveness of the patch.  Would you
rather hold off until I get a chance to run my benchmarks again (which may
miss the 4.2 cycle), or accept a tidied-up version of this patch first, and
hope to get a revised method (using dynamic scaling or per-vcpu arrays) in
before 4.2, but for sure by 4.3?
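To make the alternatives concrete, here is a rough, self-contained sketch
(plain C, not against the Xen tree -- the names, the ring-buffer layout and
the per-vcpu scaling factor are all made up for illustration, not a tested
heuristic) of what the dynamically-scaled global list might look like:

/* Illustrative sketch only -- not the actual Xen implementation.
 * Models the "dynamically scaled" alternative: the history of recently
 * demand-populated superpage gpfns is a ring buffer whose size is derived
 * from the guest's vCPU count instead of a fixed POD_HISTORY_MAX.  The
 * scaling factor below is a placeholder, not a tested heuristic. */
#include <stdlib.h>
#include <stdio.h>

#define ENTRIES_PER_VCPU 8   /* hypothetical scaling factor */

struct pod_history {
    unsigned long *last_populated;  /* gpfns of recent demand-populates */
    unsigned int size;              /* number of slots in the ring */
    unsigned int next;              /* next slot to overwrite */
};

static int pod_history_init(struct pod_history *h, unsigned int max_vcpus)
{
    h->size = max_vcpus * ENTRIES_PER_VCPU;
    h->next = 0;
    h->last_populated = calloc(h->size, sizeof(*h->last_populated));
    return h->last_populated ? 0 : -1;
}

/* Record a superpage gpfn that has just been demand-populated. */
static void pod_history_record(struct pod_history *h, unsigned long gpfn)
{
    h->last_populated[h->next] = gpfn;
    h->next = (h->next + 1) % h->size;
}

/* Walk the recorded gpfns, e.g. to zero-check them for reclaim. */
static void pod_history_scan(const struct pod_history *h,
                             void (*check)(unsigned long gpfn))
{
    for ( unsigned int i = 0; i < h->size; i++ )
        if ( h->last_populated[i] )
            check(h->last_populated[i]);
}

static void print_gpfn(unsigned long gpfn)
{
    printf("would zero-check gpfn %#lx\n", gpfn);
}

int main(void)
{
    struct pod_history h;

    if ( pod_history_init(&h, 4) )    /* pretend a 4-vCPU guest */
        return 1;

    pod_history_record(&h, 0x100200); /* made-up superpage gpfns */
    pod_history_record(&h, 0x100400);
    pod_history_scan(&h, print_gpfn);

    free(h.last_populated);
    return 0;
}

A per-vcpu variant would simply embed one such ring in each vcpu's PoD
state and drop the sharing; that narrows the scan, but it assumes the
scrubbing stays on the faulting vcpu, which is exactly the behaviour I'd
want to re-measure before committing to it.

 -George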