From mboxrd@z Thu Jan 1 00:00:00 1970
From: Emmanuel Ackaouy
Subject: Re: Re: Xen scheduler
Date: Tue, 24 Apr 2007 16:55:48 +0200
References: <907625E08839C4409CE5768403633E0B018E1C53@sefsexmb1.amd.com>
Mime-Version: 1.0 (Apple Message framework v624)
Content-Type: text/plain; charset=US-ASCII; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <907625E08839C4409CE5768403633E0B018E1C53@sefsexmb1.amd.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: "Petersson, Mats"
Cc: ncmike@us.ibm.com, xen-devel@lists.xensource.com, pak333@comcast.net
List-Id: xen-devel@lists.xenproject.org

On Apr 24, 2007, at 16:42, Petersson, Mats wrote:

>> If you feel two VCPUs would do better co-scheduled on a
>> core or socket, you'd currently have to use cpumasks -- as
>> mike suggested -- to restrict where they can run manually. I'd
>> be curious to know of real world cases where doing this
>> increases performance significantly.
>
> If you have data-sharing between the apps on the same socket, and a
> shared L2 or L3 cache, and the application/data fits in the cache, I
> could see that it would help. [And of course, the OS for example will
> have some data and code-sharing between CPUs - so some application
> where a lot of time is spent in the OS itself would be benefitting
> from "socket sharing"].
>
> For other applications, having better memory bandwidth is most likely
> better.
>
> Of course, for ideal performance, it would also have to be taken into
> account which CPU owns the memory being used, as the latency of
> transferring memory from one CPU to another in a NUMA system can
> affect the performance quite noticeably.

I understand in theory which workloads would do better scheduled in
either of these ways.
What I'm interested in learning about is actual applications people run that exhibit the kind of L2/L3 cache sharing that would make it significantly better to co-schedule the VCPUs in question within a single socket rather than spread across sockets.
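
For anyone who wants to try the comparison by hand, here is a minimal sketch of the two cpumask layouts discussed above: packing a guest's VCPUs onto one socket versus spreading them across sockets. The helper names are hypothetical, and it assumes physical CPUs are numbered contiguously per socket, which may not match a given machine. It only computes the masks; applying them is left to whatever pinning interface the toolstack provides.

```python
def socket_cpus(socket, cores_per_socket):
    """Physical CPU ids on one socket (assumes contiguous numbering)."""
    start = socket * cores_per_socket
    return list(range(start, start + cores_per_socket))


def pack_on_socket(n_vcpus, socket, cores_per_socket):
    """Let every VCPU run on any core of a single socket (shared cache)."""
    cpus = socket_cpus(socket, cores_per_socket)
    return {vcpu: cpus for vcpu in range(n_vcpus)}


def spread_across_sockets(n_vcpus, n_sockets, cores_per_socket):
    """Assign VCPUs to sockets round-robin (more aggregate memory bandwidth)."""
    return {
        vcpu: socket_cpus(vcpu % n_sockets, cores_per_socket)
        for vcpu in range(n_vcpus)
    }


def to_cpumask(cpus):
    """Encode a list of CPU ids as an integer bitmask (bit i set = CPU i allowed)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    # Hypothetical box: 2 sockets, 2 cores each, one guest with 2 VCPUs.
    for name, layout in (
        ("packed", pack_on_socket(2, 0, 2)),
        ("spread", spread_across_sockets(2, 2, 2)),
    ):
        for vcpu, cpus in layout.items():
            print(name, "vcpu", vcpu, "cpus", cpus, "mask", hex(to_cpumask(cpus)))
```

Running the same workload under both layouts and comparing throughput would be one way to gather the real-world data asked for here.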