public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Thomas Pilarski <thomas.pi@arcor.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mike Galbraith <efault@gmx.de>,
	Gregory Haskins <ghaskins@novell.com>,
	bugme-daemon@bugzilla.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [Bugme-new] [Bug 12562] New: High overhead while switching or synchronizing threads on different cores
Date: Thu, 29 Jan 2009 12:37:08 +0100	[thread overview]
Message-ID: <1233229028.4495.34.camel@laptop> (raw)
In-Reply-To: <1233224644.5294.52.camel@bugs-laptop>

On Thu, 2009-01-29 at 11:24 +0100, Thomas Pilarski wrote:
> Some explanation of the test program. 
> 
> ../schedulerissue 1 4096 8 2000
> 1 producer and 1 consumer
> buffer size of 4096 doubles * 8byte 
> 8 buffer (256kB total buffer)
> 2000 messages
> 
> ../schedulerissue 2 4096 8 2000
> 2 producer and 2 consumer
> buffer size of 4096 doubles * 8byte 
> 8 buffer (256kB total buffer)
> 2000 messages
> 
> 
> It was not 512KB bytes in the test before, but 4MB.
> But there is the same problem with a total buffer size of 48kB and 4
> threads (./schedulerissue 2 2048 3 20000).

Linux opteron 2.6.29-rc3-tip #61 SMP PREEMPT Thu Jan 29 11:59:15 CET
2009 x86_64 x86_64 x86_64 GNU/Linux

[root@opteron bench]# schedtool -a 1 -e ./ThreadSchedulingIssue 1 4096 8 20000
All threads finished: 19992 messages in 6.485 seconds / 3082.877 msg/s
[root@opteron bench]# ./ThreadSchedulingIssue 1 4096 8 20000
All threads finished: 19992 messages in 6.496 seconds / 3077.604 msg/s
[root@opteron bench]# ./ThreadSchedulingIssue 1 4096 8 20000 & ./ThreadSchedulingIssue 1 4096 8 20000 &
[1] 10314
[2] 10315
[root@opteron bench]# All threads finished: 19992 messages in 6.720 seconds / 2975.009 msg/s
All threads finished: 19992 messages in 6.792 seconds / 2943.574 msg/s

[1]-  Done                    ./ThreadSchedulingIssue 1 4096 8 20000
[2]+  Done                    ./ThreadSchedulingIssue 1 4096 8 20000
[root@opteron bench]# ./ThreadSchedulingIssue 2 4096 8 20000
All threads finished: 19992 messages in 17.299 seconds / 1155.667 msg/s


[root@opteron bench]# for i in 4 8 16 32 64 128 256 ; do 
> echo -n $((i*1024)) $((80000/i)) " " ; 
> schedtool -a 1 -e ./ThreadSchedulingIssue 1 $((i*1024)) 8 $((80000/i)) ;
> done
4096 20000  All threads finished: 19992 messages in 6.368 seconds / 3139.251 msg/s
8192 10000  All threads finished: 9992 messages in 5.363 seconds / 1863.083 msg/s
16384 5000  All threads finished: 4992 messages in 5.471 seconds / 912.479 msg/s
32768 2500  All threads finished: 2493 messages in 5.730 seconds / 435.059 msg/s
65536 1250  All threads finished: 1242 messages in 5.544 seconds / 224.021 msg/s
131072 625  All threads finished: 617 messages in 5.755 seconds / 107.217 msg/s
262144 312  All threads finished: 305 messages in 6.014 seconds / 50.713 msg/s

[root@opteron bench]# for i in 4 8 16 32 64 128 256 ; do
> echo -n $((i*1024)) $((80000/i)) " " ;
> ./ThreadSchedulingIssue 1 $((i*1024)) 8 $((80000/i)) ;
> done
4096 20000  All threads finished: 19992 messages in 6.462 seconds / 3093.717 msg/s
8192 10000  All threads finished: 9992 messages in 8.767 seconds / 1139.738 msg/s
16384 5000  All threads finished: 5000 messages in 5.366 seconds / 931.798 msg/s
32768 2500  All threads finished: 2494 messages in 20.720 seconds / 120.369 msg/s
65536 1250  All threads finished: 1242 messages in 11.521 seconds / 107.805 msg/s
131072 625  All threads finished: 618 messages in 14.035 seconds / 44.032 msg/s
262144 312  All threads finished: 305 messages in 17.342 seconds / 17.587 msg/s

The above point between 16 and 32 is exactly where the total working set
doesn't fit into cache anymore -- I suspect that pushes the producer's
latency to go to sleep over the edge and everything collapses.


We use wakeup patterns to determine if two tasks are working together
and should thus be kept together.

Task A should wake up B, and B should wake up A. Furthermore, any task
should quickly go to sleep after waking up the other.

This program does neither, with a single pair, the producer continues
production after waking the consumer (until the queue is filled --
which, if the consumer is fast enough, might never happen).

With multiple pairs there is no strict pair relation at all, since they
all work on the same global buffer queue, so P1 can wake Cn etc.

Furthermore the program uses shared memory (not a bad design), and thus
mises out on the explicit affinity hints of pipes, sockets, etc.


In short this program is carefully crafted to defeat all our affinity
tests - and I'm not sure what to do.



  parent reply	other threads:[~2009-01-29 11:37 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-12562-10286@http.bugzilla.kernel.org/>
2009-01-28 20:56 ` [Bugme-new] [Bug 12562] New: High overhead while switching or synchronizing threads on different cores Andrew Morton
2009-01-28 22:15   ` Peter Zijlstra
2009-01-28 22:25   ` Thomas Pilarski
2009-01-29  9:07     ` Peter Zijlstra
2009-01-29 10:12       ` Thomas Pilarski
2009-01-29 10:24         ` Thomas Pilarski
2009-01-29 10:31           ` Peter Zijlstra
2009-01-29 11:37           ` Peter Zijlstra [this message]
2009-01-29 14:05             ` Thomas Pilarski
2009-01-30  7:57               ` Mike Galbraith
2009-02-02  7:43                 ` Thomas Pilarski
2009-02-02  8:19                   ` Peter Zijlstra
2009-02-02  8:33                     ` Thomas Pilarski
2009-02-02  8:52                       ` Mike Galbraith
2009-02-02  8:55                         ` Peter Zijlstra
2009-02-02 12:15                           ` Peter Zijlstra
2009-02-02 18:29                             ` Michael Kerrisk
2009-02-02 18:35                               ` Peter Zijlstra
2009-02-03  4:55                                 ` Mike Galbraith
2009-02-03  3:56                   ` Valdis.Kletnieks

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1233229028.4495.34.camel@laptop \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=bugme-daemon@bugzilla.kernel.org \
    --cc=efault@gmx.de \
    --cc=ghaskins@novell.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=thomas.pi@arcor.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox