From: Bob Liu
Subject: Re: [PATCH 2/2] xen: spread page scrubbing across all idle CPU
Date: Wed, 11 Jun 2014 19:17:27 +0800
Message-ID: <53983AC7.1000400@oracle.com>
To: George Dunlap
Cc: Bob Liu, Keir Fraser, Ian Campbell, Andrew Cooper, Jan Beulich, xen-devel

On 06/11/2014 06:45 PM, George Dunlap wrote:
> On Wed, Jun 11, 2014 at 11:36 AM, Bob Liu wrote:
>>
>> On 06/11/2014 06:13 PM, George Dunlap wrote:
>>> On Wed, Jun 11, 2014 at 9:13 AM, Jan Beulich wrote:
>>>>>>> On 11.06.14 at 04:51, wrote:
>>>>> On 06/10/2014 10:12 PM, Jan Beulich wrote:
>>>>>>>>> On 10.06.14 at 14:18, wrote:
>>>>>>> +    if( is_tasklet )
>>>>>>> +        tasklet_schedule_on_cpu(&global_scrub_tasklet, cpu);
>>>>>>
>>>>>> So you re-schedule this tasklet immediately - while this may be
>>>>>> acceptable inside the hypervisor, did you consider the effect this
>>>>>> will have on the guest (normally Dom0)? Its respective vCPU won't
>>>>>> get run _at all_ until you're done scrubbing.
>>>>>>
>>>>>
>>>>> Yes, that's a problem. I don't have any better idea right now.
>>>>>
>>>>> What I'm trying is doing the scrubbing on current CPU as well as on all
>>>>> idle vcpus in parallel.
>>>>> I also considered your suggestion about doing the scrubbing in the
>>>>> background as well as on the allocation path. But I think it's less
>>>>> acceptable for users to get blocked randomly for an uncertain time when
>>>>> allocating a large amount of memory.
>>>>> That's why I still chose the sync way, so that once 'xl destroy'
>>>>> returns all memory is scrubbed.
>>>>
>>>> But I hope you realize that in the current shape, with the shortcomings
>>>> pointed out un-addressed, there's no way for this to go in.
>>>
>>> Would it make more sense to do something like the following:
>>> * Have a "clean" freelist and a "dirty" freelist
>>> * When destroying a domain, simply move pages to the dirty freelist
>>> * Have idle vcpus scrub the dirty freelist before going to sleep
>>>  - ...and wake up idle vcpus to do some scrubbing when adding pages to
>>> the dirty freelist
>>> * In alloc_domheap_pages():
>>>  - If there are pages on the "clean" freelist, allocate them
>>>  - If there are no pages on the "clean" freelist but there are on the
>>> "dirty" freelist, scrub pages from the "dirty" freelist synchronously.
>>>
>>
>> Thank you very much for your suggestion, it's similar to what Jan
>> suggested.
>>
>> My concern with this approach is that in some bad situations the
>> allocation path may be blocked for a long time waiting for the "dirty"
>> freelist to be scrubbed.
>>
>> What the users see is that it's much faster to destroy a domain but may
>> be slower to create a new one (or slow down other routines that need to
>> allocate a lot of memory).
>
> Yes, so on a system with near 100% memory usage, a reboot of a large
> VM would mean short destroy, long start-up, rather than long destroy,
> short start-up. End-to-end it's basically the same; but on a system
> with less memory usage, both operations are significantly faster.
>

Okay, I'll follow your suggestion.
And thanks everyone!

-- 
Regards,
-Bob
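
For reference, a minimal single-threaded sketch of the clean/dirty freelist
scheme George outlines in the quoted text above. Everything in it is
hypothetical and for illustration only: struct page, free_page_dirty(),
idle_scrub() and alloc_page() are not Xen's types or functions, and a real
implementation would need locking around the lists and a way to wake idle
vcpus, both of which are left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Hypothetical page descriptor; NOT Xen's struct page_info. */
struct page {
    struct page *next;
    unsigned char data[PAGE_SIZE];
};

/* Simple LIFO freelist with a page count. */
struct freelist {
    struct page *head;
    size_t count;
};

static struct freelist clean_list, dirty_list;

static void list_push(struct freelist *l, struct page *pg)
{
    assert(pg != NULL);
    pg->next = l->head;
    l->head = pg;
    l->count++;
}

static struct page *list_pop(struct freelist *l)
{
    struct page *pg = l->head;

    if (pg != NULL) {
        l->head = pg->next;
        pg->next = NULL;
        l->count--;
    }
    return pg;
}

/* Scrubbing is modelled as zeroing the page contents. */
static void scrub_page(struct page *pg)
{
    memset(pg->data, 0, PAGE_SIZE);
}

/* Domain destruction path: defer the scrub, just queue the page as dirty. */
static void free_page_dirty(struct page *pg)
{
    list_push(&dirty_list, pg);
}

/*
 * Idle path: scrub at most 'budget' dirty pages and move them to the
 * clean list. Returns the number of pages actually scrubbed.
 */
static size_t idle_scrub(size_t budget)
{
    size_t done = 0;
    struct page *pg;

    while (done < budget && (pg = list_pop(&dirty_list)) != NULL) {
        scrub_page(pg);
        list_push(&clean_list, pg);
        done++;
    }
    return done;
}

/*
 * Allocation path: prefer a clean page; if none is available, take a
 * dirty page and scrub it synchronously. Returns NULL when both lists
 * are empty.
 */
static struct page *alloc_page(void)
{
    struct page *pg = list_pop(&clean_list);

    if (pg == NULL) {
        pg = list_pop(&dirty_list);
        if (pg != NULL)
            scrub_page(pg);
    }
    return pg;
}
```

The budget argument to idle_scrub() is one assumed way to keep an idle vcpu
from scrubbing for too long in one go. The synchronous scrub in alloc_page()
is where the "long start-up" cost discussed above would show up when the
clean list is empty.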