From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: Re: Even faster page copy for Xen?
Date: Tue, 10 Aug 2010 13:41:17 +0100
References: <8e0efd21-1863-489c-a58a-317f69646a3b@default>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
In-Reply-To: <8e0efd21-1863-489c-a58a-317f69646a3b@default>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Dan Magenheimer, Jan Beulich, Dulloor
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On 10/08/2010 13:31, "Dan Magenheimer" wrote:

>> You can do so if you feel like saving/restoring all necessary XMM
>> state isn't going to eat up all of the performance win...
>
> Again excuse my x86 ignorance, but on some architectures
> floating point registers can be saved/restored "lazily"
> because there is a privileged bit that disables their use
> (which can be trapped and used as a "floating-point dirty" bit).
> Is there anything equivalent for the XMM state? If so,
> then lazy save might be a good approach. If not, then I agree
> that the state save/restore overhead might eat up the performance
> win. (However, if we were to later use Linux memory compaction
> and NUMA page migration, the performance tradeoff might change
> to positive.)

We do lazy FPU/SSE restore already. But in any case, it is questionable
how much faster you can make a non-temporal and/or non-local bulk memory
copy: it ought to be bottlenecked on FSB bandwidth.

 -- Keir
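
For readers unfamiliar with the lazy save/restore scheme discussed above, here is a rough toy model of the general idea. It is not Xen's code: the class, its fields, and the method names are all made up for illustration. The `ts` flag stands in for the privileged "disable FPU use" bit that Dan describes (on x86 that role is played by CR0.TS), and the register file is reduced to a single value.

```python
class LazyFpuCpu:
    """Toy model of lazy FPU/SSE state switching (hypothetical, not Xen's code)."""

    def __init__(self):
        self.regs = 0            # stand-in for the physical FPU/XMM register file
        self.owner = None        # task whose state currently lives in self.regs
        self.current = None      # task currently scheduled
        self.ts = True           # models the "trap on next FPU use" bit
        self.saved = {}          # per-task saved register images
        self.state_switches = 0  # number of real save/restore operations performed

    def context_switch(self, task):
        # Cheap path: copy no registers, only arm the trap if the
        # incoming task does not already own the register contents.
        self.current = task
        self.ts = self.owner != task

    def _trap(self):
        # First FPU use after a switch: perform the deferred save/restore.
        if self.owner is not None:
            self.saved[self.owner] = self.regs
        self.regs = self.saved.get(self.current, 0)
        self.owner = self.current
        self.ts = False
        self.state_switches += 1

    def fpu_write(self, value):
        if self.ts:
            self._trap()
        self.regs = value

    def fpu_read(self):
        if self.ts:
            self._trap()
        return self.regs
```

In this model a task that is switched in and out without touching the FPU costs no register save or restore at all, which is the point of the lazy approach. A page-copy routine that uses XMM registers inside the hypervisor would count as an FPU user here, so it would pay for the deferred save whenever another context owns the registers.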