From mboxrd@z Thu Jan  1 00:00:00 1970
From: Emmanuel Ackaouy <ackaouy@gmail.com>
Subject: Re: Re: Xen scheduler
Date: Tue, 24 Apr 2007 16:55:48 +0200
Message-ID: <ddeb35cafec7af3c1d71e0a6d6b77015@gmail.com>
References: <907625E08839C4409CE5768403633E0B018E1C53@sefsexmb1.amd.com>
Mime-Version: 1.0 (Apple Message framework v624)
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <907625E08839C4409CE5768403633E0B018E1C53@sefsexmb1.amd.com>
To: "Petersson, Mats" <Mats.Petersson@amd.com>
Cc: ncmike@us.ibm.com, xen-devel@lists.xensource.com, pak333@comcast.net
List-Id: xen-devel@lists.xenproject.org

On Apr 24, 2007, at 16:42, Petersson, Mats wrote:
>> If you feel two VCPUs would do better co-scheduled on a
>> core or socket, you'd currently have to use cpumasks -- as
>> Mike suggested -- to manually restrict where they can run. I'd
>> be curious to know of real world cases where doing this
>> increases performance significantly.
>
> If you have data-sharing between the apps on the same socket, and a 
> shared L2 or L3 cache, and the application/data fits in the cache, I 
> could see that it would help. [And of course, the OS for example will 
> have some data and code-sharing between CPUs - so some application 
> where a lot of time is spent in the OS itself would benefit 
> from "socket sharing"].
>
> For other applications, having better memory bandwidth is most likely 
> better.
>
> Of course, for ideal performance, it would also have to be taken into 
> account which CPU owns the memory being used, as the latency of 
> transferring memory from one CPU to another in a NUMA system can 
> affect the performance quite noticeably.

I understand in theory which workloads would do better scheduled
in either of these ways. What I'm interested in learning about is
actual applications that people use that exhibit the type of L2/L3
cache sharing that would make it significantly better to
co-schedule the VCPUs in question on whole sockets rather than
across them.
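
For anyone who wants to measure this, the two placements being
compared can be sketched as cpumasks. This is only an illustration
in Python, not anything from the Xen tree: the CPU_TO_SOCKET table
is a made-up 2-socket, 2-cores-per-socket layout, and the helper
names (socket_cpus, cpumask, co_scheduled, spread) are hypothetical.
Applying the masks to a real guest (for example with a pinning tool
such as xm vcpu-pin) is left out.

```python
# Hypothetical topology: physical CPU id -> socket id (2 sockets x 2 cores).
# A real host's layout has to be discovered on that host; this table is
# an assumption made purely for illustration.
CPU_TO_SOCKET = {0: 0, 1: 0, 2: 1, 3: 1}


def socket_cpus(socket):
    """Return the sorted physical CPU ids that belong to one socket."""
    return sorted(cpu for cpu, s in CPU_TO_SOCKET.items() if s == socket)


def cpumask(cpus):
    """Encode CPU ids as a bitmask: bit n set means CPU n is allowed."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def co_scheduled(nr_vcpus, socket):
    """Give every VCPU the same mask: all CPUs of a single socket."""
    mask = cpumask(socket_cpus(socket))
    return {vcpu: mask for vcpu in range(nr_vcpus)}


def spread(nr_vcpus):
    """Give VCPU i the mask of socket (i mod number of sockets)."""
    sockets = sorted(set(CPU_TO_SOCKET.values()))
    return {
        vcpu: cpumask(socket_cpus(sockets[vcpu % len(sockets)]))
        for vcpu in range(nr_vcpus)
    }


if __name__ == "__main__":
    for name, placement in (("co-scheduled", co_scheduled(2, 0)),
                            ("spread", spread(2))):
        for vcpu, mask in placement.items():
            print(f"{name}: vcpu {vcpu} -> cpumask {mask:#06b}")
```

Running the same workload under both placements and comparing
throughput would give exactly the kind of real-world data point
I'm asking about.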