From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753839AbbJHIT5 (ORCPT );
	Thu, 8 Oct 2015 04:19:57 -0400
Received: from mail-wi0-f177.google.com ([209.85.212.177]:32913 "EHLO
	mail-wi0-f177.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1752744AbbJHITx (ORCPT );
	Thu, 8 Oct 2015 04:19:53 -0400
Message-ID: <1444292390.3389.100.camel@gmail.com>
Subject: Re: CFS scheduler unfairly prefers pinned tasks
From: Mike Galbraith
To: paul.szabo@sydney.edu.au, Peter Zijlstra
Cc: linux-kernel@vger.kernel.org
Date: Thu, 08 Oct 2015 10:19:50 +0200
In-Reply-To: <1444099557.2832.48.camel@gmail.com>
References: <201510052148.t95LmQm6018585@como.maths.usyd.edu.au>
	 <1444099557.2832.48.camel@gmail.com>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.12.11
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2015-10-06 at 04:45 +0200, Mike Galbraith wrote:
> On Tue, 2015-10-06 at 08:48 +1100, paul.szabo@sydney.edu.au wrote:
> > The Linux CFS scheduler prefers pinned tasks and unfairly
> > gives more CPU time to tasks that have set CPU affinity.
> > This effect is observed with or without CGROUP controls.
> > 
> > To demonstrate: on an otherwise idle machine, as some user
> > run several processes pinned to each CPU, one for each CPU
> > (as many as CPUs present in the system) e.g. for a quad-core
> > non-HyperThreaded machine:
> > 
> >   taskset -c 0 perl -e 'while(1){1}' &
> >   taskset -c 1 perl -e 'while(1){1}' &
> >   taskset -c 2 perl -e 'while(1){1}' &
> >   taskset -c 3 perl -e 'while(1){1}' &
> > 
> > and (as that same or some other user) run some without
> > pinning:
> > 
> >   perl -e 'while(1){1}' &
> >   perl -e 'while(1){1}' &
> > 
> > and use e.g. top to observe that the pinned processes get
> > more CPU time than "fair".
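For reference, the fairness expectation in the quoted demo can be sketched with back-of-envelope arithmetic (my numbers, not measurements from the thread): 4 CPUs shared by 6 always-runnable hogs would, under a globally fair split, give each task 4/6 of a CPU. The "unfair" outcome assumed below, where each unpinned hog ends up splitting one CPU 50/50 with a pinned hog, is a hypothetical illustration of the reported effect.

```python
# Back-of-envelope sketch (assumed scenario, not data from the thread):
# ideal fair split of 4 CPUs among 6 CPU-bound tasks, versus a split
# where each unpinned hog shares a single CPU 50/50 with one pinned hog.

def fair_share_pct(ncpus, ntasks):
    """Ideal per-task CPU% if total capacity were divided evenly."""
    return 100.0 * ncpus / ntasks

# Quoted demo: 4 pinned hogs (one per CPU) plus 2 unpinned hogs.
ideal = fair_share_pct(ncpus=4, ntasks=6)
print(f"ideal per-task share: {ideal:.1f}%")   # 66.7%

# Hypothetical unfair outcome: two CPUs run pinned-only, two CPUs are
# split 50/50 between a pinned and an unpinned hog.
pinned = [100.0, 100.0, 50.0, 50.0]
unpinned = [50.0, 50.0]
print(f"pinned avg:   {sum(pinned) / len(pinned):.1f}%")      # 75.0%
print(f"unpinned avg: {sum(unpinned) / len(unpinned):.1f}%")  # 50.0%
```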
I see a fairness issue with pinned tasks and group scheduling, but one
opposite to your complaint.  Two task groups, one with 8 hogs (oink),
one with 1 (pert), all are pinned.

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+  P COMMAND
 3269 root      20   0    4060    724    648 R 100.0 0.004   1:00.02  1 oink
 3270 root      20   0    4060    652    576 R 100.0 0.004   0:59.84  2 oink
 3271 root      20   0    4060    692    616 R 100.0 0.004   0:59.95  3 oink
 3274 root      20   0    4060    608    532 R 100.0 0.004   1:00.01  6 oink
 3273 root      20   0    4060    728    652 R 99.90 0.005   0:59.98  5 oink
 3272 root      20   0    4060    644    568 R 99.51 0.004   0:59.80  4 oink
 3268 root      20   0    4060    612    536 R 99.41 0.004   0:59.67  0 oink
 3279 root      20   0    8312    804    708 R 88.83 0.005   0:53.06  7 pert
 3275 root      20   0    4060    656    580 R 11.07 0.004   0:06.98  7 oink

That group share math would make a huge compute group with progress
checkpoints sharing an SGI monster with one other hog amusing to watch.

	-Mike
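The group share math behind the CPU 7 numbers above can be sketched with a simplified model (my simplification, assuming both groups have the default 1024 shares): a group's entity on a CPU carries a weight proportional to the fraction of the group's load running there, so oink's lone task on CPU 7 carries 1/8 of oink's weight while pert's task carries all of pert's.

```python
# Simplified sketch (assumption, not the actual kernel code) of CFS
# group-share distribution on CPU 7 in the top output above.
SHARES = 1024  # default cpu.shares per group

# oink: 8 runnable tasks, only 1 of them on CPU 7.
oink_w = SHARES * (1 / 8)   # 128
# pert: 1 runnable task, and it is on CPU 7.
pert_w = SHARES * (1 / 1)   # 1024

total = oink_w + pert_w
print(f"pert on CPU 7: {100 * pert_w / total:.1f}%")  # 88.9% (top shows 88.83)
print(f"oink on CPU 7: {100 * oink_w / total:.1f}%")  # 11.1% (top shows 11.07)
```

The roughly 8:1 split this model predicts lines up with the observed 88.83% vs 11.07%.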