public inbox for linux-kernel@vger.kernel.org
From: Mike Galbraith <efault@gmx.de>
To: Willy Tarreau <w@1wt.eu>
Cc: Balazs Scheidler <bazsi@balabit.hu>, linux-kernel@vger.kernel.org
Subject: Re: scheduler oddity [bug?]
Date: Mon, 09 Mar 2009 04:35:22 +0100
Message-ID: <1236569722.14798.5.camel@marge.simson.net>
In-Reply-To: <20090308220319.GA570@1wt.eu>

On Sun, 2009-03-08 at 23:03 +0100, Willy Tarreau wrote:
> Hi Balazs,
> 
> On Sun, Mar 08, 2009 at 08:45:24PM +0100, Balazs Scheidler wrote:
> > On Sat, 2009-03-07 at 19:47 +0100, Balazs Scheidler wrote:
> > > On Sat, 2009-03-07 at 18:47 +0100, Balazs Scheidler wrote:
> > > > Hi,
> > > > 
> > > > I've tested this on 3 computers and each showed the same symptoms:
> > > >  * quad core Opteron, running Ubuntu kernel 2.6.27-13.29
> > > >  * Core 2 Duo, running Ubuntu kernel 2.6.27-11.27
> > > >  * Dual Core Opteron, Debian backports.org kernel 2.6.26-13~bpo40+1
> > > > 
> > > > Is this a bug, or a feature?
> > > > 
> > > 
> > > One new interesting piece of information: I've retested with a 2.6.22-based
> > > kernel, and it still works there, setting the CPU affinity does not
> > > change the performance of the test program and mpstat nicely shows that
> > > 2 cores are working, not just one.
> > > 
> > > Maybe this is CFS related? That was merged for 2.6.23 IIRC.
> > > 
> > > Also, I tried changing various scheduler knobs
> > > in /proc/sys/kernel/sched_* but they didn't help. I've tried to change
> > > these:
> > > 
> > >  * sched_migration_cost: changed from the default 500000 to 100000 and
> > > then 10000 but neither helped.
> > >  * sched_nr_migrate: increased it to 64, but again nothing
> > > 
> > > I'm starting to think that this is a regression that may or may not be
> > > related to CFS. 
> > > 
> > > I don't have a box where I could bisect on, but the test program makes
> > > the problem quite obvious.
> > 
> > Some more test results:
> > 
> > Latest tree from Linus seems to work, at least the program runs on both
> > cores as it should. I bisected the patch that changed behaviour, and
> > I've found this:
> > 
> > commit 38736f475071b80b66be28af7b44c854073699cc
> > Author: Gautham R Shenoy <ego@in.ibm.com>
> > Date:   Sat Sep 6 14:50:23 2008 +0530
> > 
> >     sched: fix __load_balance_iterator() for cfq with only one task
> >     
> >     The __load_balance_iterator() returns a NULL when there's only one
> >     sched_entity which is a task. It is caused by the following code-path.
> >     
> >     	/* Skip over entities that are not tasks */
> >     	do {
> >     		se = list_entry(next, struct sched_entity, group_node);
> >     		next = next->next;
> >     	} while (next != &cfs_rq->tasks && !entity_is_task(se));
> >     
> >     	if (next == &cfs_rq->tasks)
> >     		return NULL;
> >     	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >           This will return NULL even when se is a task.
> >     
> >     As a side-effect, there was a regression in sched_mc behavior since 2.6.25,
> >     since iter_move_one_task() when it calls load_balance_start_fair(),
> >     would not get any tasks to move!
> >     
> >     Fix this by checking if the last entity was a task or not.
> >     
> >     Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
> >     Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> >     Signed-off-by: Ingo Molnar <mingo@elte.hu>
> > 
> > 
> > This patch was integrated for 2.6.28. With the above patch, my test program uses
> > two cores as it should. I could only test this in a virtual machine so I don't
> > know the exact performance metrics, but I'll test 2.6.27 plus this patch on a real
> > box tomorrow to see if this was the culprit.
> 
> Just tested right here and I can confirm it is the culprit. I can reliably
> reproduce the issue here on my core2 duo, and this patch fixes it. With your
> memset() loop at 20k iterations, I saw exactly 50% CPU usage, and a final
> sum of 794. With the patch, I see 53% CPU and 909. Changing the loop to 80k
> iterations shows 53% CPU usage and 541 loops without the patch, versus
> 639 loops and 63% CPU usage with the patch.

Interesting.  I'm testing in .git (Q6600), and it's only using one CPU
unless I actively intervene.  Doing whatever to pry the pair apart takes
loops/sec from 70 to 84.
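
The message does not say how the pair was pried apart. One common way to keep two tasks on separate CPUs is to pin each to its own core, which is also what the "setting the CPU affinity" test quoted above refers to. The following is a minimal sketch under that assumption, using the glibc sched_getaffinity()/sched_setaffinity() wrappers; the helper name pin_to_first_allowed_cpu() is hypothetical and not taken from the thread.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Hypothetical helper: pin the calling thread to the first CPU in its
 * currently allowed set. Returns the CPU number on success, -1 on error.
 * Linux/glibc only.
 */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed, one;

    /* pid 0 means "the calling thread" */
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &allowed))
            continue;

        /* Build a mask containing only this CPU and apply it. */
        CPU_ZERO(&one);
        CPU_SET(cpu, &one);
        if (sched_setaffinity(0, sizeof(one), &one) != 0)
            return -1;
        return cpu;
    }
    return -1;
}
```

To separate a producer/consumer pair, each process would call a variant of this with a different target CPU; taskset(1) achieves the same from the shell.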

	-Mike
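
The iterator problem described in the quoted commit message can be illustrated with a small userspace model. This is only a sketch, not the kernel code or the actual patch: struct entity, iter_buggy(), iter_fixed() and single_task_demo() are hypothetical names, the circular list with a sentinel head stands in for cfs_rq->tasks, and the is_task flag stands in for entity_is_task(). The leading "next == head" guard is an assumption added so the sentinel is never treated as an entity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_entity on a circular list whose
 * sentinel node is `head` (the role of &cfs_rq->tasks in the quote). */
struct entity {
    struct entity *next;
    bool is_task;            /* models entity_is_task(se) */
};

/* Models the quoted code path: gives up whenever the walk reaches the
 * list head, even if the last entity examined was a task. */
static struct entity *iter_buggy(struct entity *head, struct entity *next)
{
    struct entity *se;

    if (next == head)
        return NULL;

    /* Skip over entities that are not tasks */
    do {
        se = next;
        next = next->next;
    } while (next != head && !se->is_task);

    if (next == head)
        return NULL;         /* wrong when se is a task */
    return se;
}

/* Models the described fix: also check whether the last entity was a task. */
static struct entity *iter_fixed(struct entity *head, struct entity *next)
{
    struct entity *se;

    if (next == head)
        return NULL;

    do {
        se = next;
        next = next->next;
    } while (next != head && !se->is_task);

    if (next == head && !se->is_task)
        return NULL;         /* ran out of entities without finding a task */
    return se;
}

/* Returns 1 if, on a list holding exactly one task, the buggy iterator
 * finds nothing while the fixed iterator returns that task. */
int single_task_demo(void)
{
    struct entity head = { .next = NULL, .is_task = false };
    struct entity task = { .next = &head, .is_task = true };

    head.next = &task;

    bool buggy_misses = iter_buggy(&head, head.next) == NULL;
    bool fixed_finds = iter_fixed(&head, head.next) == &task;

    return buggy_misses && fixed_finds;
}
```

In this model the buggy variant misses any task that sits last on the list, which with a single task means it returns nothing at all, matching the "would not get any tasks to move" symptom in the commit message.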

