Date: Thu, 1 Aug 2013 10:29:58 +0530
From: Srikar Dronamraju
To: Mel Gorman
Cc: Peter Zijlstra, Ingo Molnar, Andrea Arcangeli, Johannes Weiner, Linux-MM, LKML
Subject: Re: [PATCH 18/18] sched: Swap tasks when reschuling if a CPU on a target node is imbalanced
Message-ID: <20130801045958.GB6151@linux.vnet.ibm.com>
References: <1373901620-2021-1-git-send-email-mgorman@suse.de> <1373901620-2021-19-git-send-email-mgorman@suse.de>
In-Reply-To: <1373901620-2021-19-git-send-email-mgorman@suse.de>

> @@ -904,6 +908,8 @@ static int task_numa_find_cpu(struct task_struct *p, int nid)
>  	src_eff_load *= src_load + effective_load(tg, src_cpu, -weight, -weight);
>
>  	for_each_cpu(cpu, cpumask_of_node(nid)) {
> +		struct task_struct *swap_candidate = NULL;
> +
>  		dst_load = target_load(cpu, idx);
>
>  		/* If the CPU is idle, use it */
> @@ -922,12 +928,41 @@ static int task_numa_find_cpu(struct task_struct *p, int nid)
>  		 * migrate to its preferred node due to load imbalances.
>  		 */
>  		balanced = (dst_eff_load <= src_eff_load);
> -		if (!balanced)
> -			continue;
> +		if (!balanced) {
> +			struct rq *rq = cpu_rq(cpu);
> +			unsigned long src_faults, dst_faults;
> +
> +			/* Do not move tasks off their preferred node */
> +			if (rq->curr->numa_preferred_nid == nid)
> +				continue;
> +
> +			/* Do not attempt an illegal migration */
> +			if (!cpumask_test_cpu(cpu, tsk_cpus_allowed(rq->curr)))
> +				continue;
> +
> +			/*
> +			 * Do not impair locality for the swap candidate.
> +			 * Destination for the swap candidate is the source cpu
> +			 */
> +			if (rq->curr->numa_faults) {
> +				src_faults = rq->curr->numa_faults[task_faults_idx(nid, 1)];
> +				dst_faults = rq->curr->numa_faults[task_faults_idx(src_cpu_node, 1)];
> +				if (src_faults > dst_faults)
> +					continue;
> +			}
> +
> +			/*
> +			 * The destination is overloaded but running a task
> +			 * that is not running on its preferred node. Consider
> +			 * swapping the CPU tasks are running on.
> +			 */
> +			swap_candidate = rq->curr;
> +		}
>
>  		if (dst_load < min_load) {
>  			min_load = dst_load;
>  			dst_cpu = cpu;
> +			*swap_p = swap_candidate;

Are we sometimes passing the wrong candidate?

Say the balanced check fails for the first cpu and we set swap_candidate,
but the second cpu (or a later one) turns out to be idle or to have a
lower effective load. Then we could end up passing the task that is
running on the first cpu as the swap candidate.

Would the preferred cpu and swap_candidate still match in that case?

-- 
Thanks and Regards
Srikar
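
The scenario in the question can be reproduced with a small userspace model of the selection loop. This is only a sketch, not the kernel code: struct model_cpu, model_find_cpu(), model_stale_candidate() and NO_TASK are hypothetical names, the three eligibility checks from the patch are collapsed into a single "balanced" flag, and it assumes that the idle branch (elided in the quoted hunk) returns immediately without updating *swap_p.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NO_TASK (-1)

/* Hypothetical stand-in for a runqueue; not the kernel's struct rq. */
struct model_cpu {
	unsigned long load;	/* 0 means the CPU is idle */
	int balanced;		/* models (dst_eff_load <= src_eff_load) */
	int curr_task;		/* id of the running task, NO_TASK if none */
};

/*
 * Model of the CPU selection loop from the quoted hunk.
 * ASSUMPTION: the idle branch returns early and leaves *swap_p alone;
 * the real code for that branch is not shown in the quoted diff.
 */
int model_find_cpu(const struct model_cpu *cpus, int ncpus, int *swap_p)
{
	unsigned long min_load = ULONG_MAX;
	int dst_cpu = -1;
	int cpu;

	assert(cpus != NULL && swap_p != NULL);
	*swap_p = NO_TASK;

	for (cpu = 0; cpu < ncpus; cpu++) {
		int swap_candidate = NO_TASK;

		/* If the CPU is idle, use it (assumed early exit) */
		if (cpus[cpu].load == 0)
			return cpu;

		/* Imbalanced CPU: remember its current task as a candidate */
		if (!cpus[cpu].balanced)
			swap_candidate = cpus[cpu].curr_task;

		if (cpus[cpu].load < min_load) {
			min_load = cpus[cpu].load;
			dst_cpu = cpu;
			*swap_p = swap_candidate;
		}
	}
	return dst_cpu;
}

/* CPU 0 is busy and imbalanced (running task 42); CPU 1 is idle. */
int model_stale_candidate(int *swap_p)
{
	const struct model_cpu cpus[] = {
		{ .load = 10, .balanced = 0, .curr_task = 42 },
		{ .load = 0,  .balanced = 1, .curr_task = NO_TASK },
	};

	return model_find_cpu(cpus, 2, swap_p);
}
```

Under these assumptions, model_stale_candidate() selects CPU 1 while *swap_p still names the task running on CPU 0, which is the mismatch being asked about. In this model a later CPU that wins on load overwrites *swap_p, so only the early idle return leaves a stale candidate; whether the same holds for the real function depends on the code elided from the hunk.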