From mboxrd@z Thu Jan 1 00:00:00 1970 From: Juergen Gross Subject: Re: Hypervisor crash(!) on xl cpupool-numa-split Date: Thu, 03 Feb 2011 10:18:16 +0100 Message-ID: <4D4A72D8.3020502@ts.fujitsu.com> References: <4D41FD3A.5090506@amd.com> <201102021539.06664.stephan.diestelhorst@amd.com> <4D4974D1.1080503@ts.fujitsu.com> <201102021701.05665.stephan.diestelhorst@amd.com> <4D4A43B7.5040707@ts.fujitsu.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="------------020907060708050805060404" Return-path: In-Reply-To: <4D4A43B7.5040707@ts.fujitsu.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Stephan Diestelhorst Cc: George Dunlap , "Przywara, Andre" , "xen-devel@lists.xensource.com" , Keir Fraser , Ian Jackson List-Id: xen-devel@lists.xenproject.org This is a multi-part message in MIME format. --------------020907060708050805060404 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Andre, Stephan, could you give the attached patch a try? It moves the cpu assigning/unassigning into a tasklet always executed on the cpu to be moved. This should avoid critical races. Regarding Stephans rant: You should be aware that the main critical sections are only in the tasklets. The locking in the main routines is needed only to avoid the cpupool to be destroyed in between. I'm not sure whether the master_ticker patch is still needed. It seems to break something, as my machine hung up after several 100 cpu moves (without the new patch). I'm still investigating this problem. Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967 Fujitsu Technology Solutions e-mail: juergen.gross@ts.fujitsu.com Domagkstr. 28 Internet: ts.fujitsu.com D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html --------------020907060708050805060404 Content-Type: text/x-patch; name="cpupool-idle.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="cpupool-idle.patch" diff -r 4bdb78db22b6 xen/common/cpupool.c --- a/xen/common/cpupool.c Wed Feb 02 17:06:36 2011 +0000 +++ b/xen/common/cpupool.c Thu Feb 03 10:09:53 2011 +0100 @@ -217,14 +217,30 @@ static int cpupool_assign_cpu_locked(str return 0; } +static long cpupool_assign_cpu_helper(void *info) +{ + int cpu = cpupool_moving_cpu; + long ret; + + cpupool_dprintk("cpupool_assign_cpu(pool=%d,cpu=%d) ret %ld\n", + cpupool_cpu_moving->cpupool_id, cpu, ret); + BUG_ON(!is_idle_vcpu(current)); + BUG_ON(cpu != smp_processor_id()); + spin_lock(&cpupool_lock); + ret = cpupool_assign_cpu_locked(cpupool_cpu_moving, cpu); + spin_unlock(&cpupool_lock); + return ret; +} + static long cpupool_unassign_cpu_helper(void *info) { int cpu = cpupool_moving_cpu; long ret; cpupool_dprintk("cpupool_unassign_cpu(pool=%d,cpu=%d) ret %ld\n", - cpupool_id, cpu, ret); - + cpupool_cpu_moving->cpupool_id, cpu, ret); + BUG_ON(!is_idle_vcpu(current)); + BUG_ON(cpu != smp_processor_id()); spin_lock(&cpupool_lock); ret = cpu_disable_scheduler(cpu); cpu_set(cpu, cpupool_free_cpus); @@ -241,9 +257,51 @@ static long cpupool_unassign_cpu_helper( } /* + * assign a specific cpu to a cpupool + * we must be sure to run on the cpu to be assigned in idle! to achieve this + * the main functionality is performed via continue_hypercall_on_cpu on the + * specific cpu. + * possible failures: + * - cpu not free + * - cpu just being unplugged + */ +int cpupool_assign_cpu(struct cpupool *c, unsigned int cpu) +{ + int ret; + + cpupool_dprintk("cpupool_assign_cpu(pool=%d,cpu=%d)\n", + c->cpupool_id, cpu); + + spin_lock(&cpupool_lock); + ret = -EBUSY; + if ( (cpupool_moving_cpu != -1) && (cpu != cpupool_moving_cpu) ) + goto out; + if ( cpu_isset(cpu, cpupool_locked_cpus) ) + goto out; + + ret = 0; + if ( !cpu_isset(cpu, cpupool_free_cpus) && (cpu != cpupool_moving_cpu) ) + goto out; + + cpupool_moving_cpu = cpu; + atomic_inc(&c->refcnt); + cpupool_cpu_moving = c; + cpu_clear(cpu, c->cpu_valid); + spin_unlock(&cpupool_lock); + + return continue_hypercall_on_cpu(cpu, cpupool_assign_cpu_helper, c); + +out: + spin_unlock(&cpupool_lock); + cpupool_dprintk("cpupool_assign_cpu(pool=%d,cpu=%d) ret %d\n", + cpupool_id, cpu, ret); + return ret; +} + +/* * unassign a specific cpu from a cpupool - * we must be sure not to run on the cpu to be unassigned! to achieve this - * the main functionality is performed via continue_hypercall_on_cpu on a + * we must be sure to run on the cpu to be unassigned in idle! to achieve this + * the main functionality is performed via continue_hypercall_on_cpu on the * specific cpu. * if the cpu to be removed is the last one of the cpupool no active domain * must be bound to the cpupool. dying domains are moved to cpupool0 as they @@ -254,7 +312,6 @@ static long cpupool_unassign_cpu_helper( */ int cpupool_unassign_cpu(struct cpupool *c, unsigned int cpu) { - int work_cpu; int ret; struct domain *d; @@ -302,14 +359,7 @@ int cpupool_unassign_cpu(struct cpupool cpu_clear(cpu, c->cpu_valid); spin_unlock(&cpupool_lock); - work_cpu = smp_processor_id(); - if ( work_cpu == cpu ) - { - work_cpu = first_cpu(cpupool0->cpu_valid); - if ( work_cpu == cpu ) - work_cpu = next_cpu(cpu, cpupool0->cpu_valid); - } - return continue_hypercall_on_cpu(work_cpu, cpupool_unassign_cpu_helper, c); + return continue_hypercall_on_cpu(cpu, cpupool_unassign_cpu_helper, c); out: spin_unlock(&cpupool_lock); @@ -455,27 +505,15 @@ int cpupool_do_sysctl(struct xen_sysctl_ { unsigned cpu; + c = __cpupool_get_by_id(op->cpupool_id, 0); + ret = -ENOENT; + if ( c == NULL ) + break; cpu = op->cpu; - cpupool_dprintk("cpupool_assign_cpu(pool=%d,cpu=%d)\n", - op->cpupool_id, cpu); - spin_lock(&cpupool_lock); if ( cpu == XEN_SYSCTL_CPUPOOL_PAR_ANY ) cpu = first_cpu(cpupool_free_cpus); - ret = -EINVAL; - if ( cpu >= NR_CPUS ) - goto addcpu_out; - ret = -EBUSY; - if ( !cpu_isset(cpu, cpupool_free_cpus) ) - goto addcpu_out; - c = cpupool_find_by_id(op->cpupool_id, 0); - ret = -ENOENT; - if ( c == NULL ) - goto addcpu_out; - ret = cpupool_assign_cpu_locked(c, cpu); - addcpu_out: - spin_unlock(&cpupool_lock); - cpupool_dprintk("cpupool_assign_cpu(pool=%d,cpu=%d) ret %d\n", - op->cpupool_id, cpu, ret); + ret = (cpu < NR_CPUS) ? cpupool_assign_cpu(c, cpu) : -EINVAL; + cpupool_put(c); } break; --------------020907060708050805060404 Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel --------------020907060708050805060404--