From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dulloor
Subject: Re: Re: Even faster page copy for Xen?
Date: Mon, 9 Aug 2010 10:47:37 -0700
References: <4C5BDC79020000780000E9AB@vpn.id2.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
In-Reply-To: <4C5BDC79020000780000E9AB@vpn.id2.novell.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Beulich
Cc: Dan Magenheimer , xen-devel@lists.xensource.com, Keir Fraser
List-Id: xen-devel@lists.xenproject.org

On Fri, Aug 6, 2010 at 12:57 AM, Jan Beulich wrote:
>>>> On 15.07.10 at 20:15, Dan Magenheimer wrote:
>> Hi Jan, Keir --
>>
>> My x86 assembly skills are much too poor to carefully evaluate
>> and, if of value, implement this in Xen but given your previous
>> interest, such as:
>>
>> http://xenbits.xensource.com/xen-unstable.hg?rev/8de4b4e9a435
>>
>> the following might be worth looking at.
>>
>> Intel has just posted memcpy improvements for glibc for recent
>> popular Intel processor families here:
>>
>> http://article.gmane.org/gmane.comp.lib.glibc.alpha/15278
>>
>> The preface to the above patch looks very enticing...
>
> I'm not sure how much of this applies to the much more specific
> case of copying pages... Additionally, I don't think trying to
> use XMM registers in Xen would be a good idea.

Why would you say using XMM/SSE in Xen is a bad idea? We already have
copy_page_sse2 (in copy_page.S) in our code base, and it is available
(by default) on x86_64. Is it a bad idea to use that?

>
>> Semi-related, I wonder if you know, if there were a
>> "copy_page_from_other_node()" to be used if the
>> caller is fairly sure that the page is being copied
>> between nodes, could this be made significantly faster
>> than a normal copy_page()?
>
> I would think that this should mostly be taken care of by
> using non-temporal stores (non-temporal loads unfortunately
> aren't available without using XMM registers). The only other
> meaningful tuning one could do would be to increase the
> prefetch distances and grow the distance between loads and
> stores. The latter would require the use of more registers
> and hence have other drawbacks.
>
> Jan
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
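
For illustration only, here is a rough C sketch of the kind of page copy Jan describes above: ordinary loads into general-purpose registers, non-temporal stores (MOVNTI through the `_mm_stream_si64` intrinsic), and a prefetch running ahead of the loads. This is not the code in copy_page.S. The function name `copy_page_nt_sketch` and the `SKETCH_*` constants are hypothetical, and the page size, cache line size and prefetch distance are assumed values rather than measured ones. On anything other than x86_64 with SSE2 it falls back to plain memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si64 (MOVNTI), _mm_prefetch, _mm_sfence */
#define SKETCH_HAVE_MOVNTI 1
#else
#define SKETCH_HAVE_MOVNTI 0
#endif

#define SKETCH_PAGE_SIZE      4096u  /* assumed page size in bytes */
#define SKETCH_LINE_SIZE        64u  /* assumed cache line size in bytes */
#define SKETCH_PREFETCH_AHEAD  256u  /* hypothetical prefetch distance in bytes */

/* Hypothetical helper: copy one page using non-temporal stores. */
void copy_page_nt_sketch(void *dst, const void *src)
{
    /* The 64-bit accesses below expect 8-byte aligned pointers. */
    assert(((uintptr_t)dst % sizeof(long long)) == 0);
    assert(((uintptr_t)src % sizeof(long long)) == 0);

#if SKETCH_HAVE_MOVNTI
    for (size_t off = 0; off < SKETCH_PAGE_SIZE; off += SKETCH_LINE_SIZE) {
        const long long *s = (const long long *)((const char *)src + off);
        long long *d = (long long *)((char *)dst + off);

        /* Hint the source line we will need a few iterations from now;
         * skipped near the end so we never point past the page. */
        if (off + SKETCH_PREFETCH_AHEAD < SKETCH_PAGE_SIZE)
            _mm_prefetch((const char *)src + off + SKETCH_PREFETCH_AHEAD,
                         _MM_HINT_NTA);

        /* Load a whole line into eight temporaries before storing any of
         * it. Widening this gap needs more registers; the compiler is
         * also free to reschedule these loads. */
        long long r0 = s[0], r1 = s[1], r2 = s[2], r3 = s[3];
        long long r4 = s[4], r5 = s[5], r6 = s[6], r7 = s[7];

        /* Non-temporal stores: write the destination without pulling it
         * into the cache hierarchy. */
        _mm_stream_si64(d + 0, r0);
        _mm_stream_si64(d + 1, r1);
        _mm_stream_si64(d + 2, r2);
        _mm_stream_si64(d + 3, r3);
        _mm_stream_si64(d + 4, r4);
        _mm_stream_si64(d + 5, r5);
        _mm_stream_si64(d + 6, r6);
        _mm_stream_si64(d + 7, r7);
    }
    /* Order the weakly-ordered streaming stores before later stores. */
    _mm_sfence();
#else
    /* Portable fallback: no non-temporal hint available here. */
    memcpy(dst, src, SKETCH_PAGE_SIZE);
#endif
}
```

Calling `copy_page_nt_sketch(dst, src)` on two 8-byte aligned 4096-byte buffers should leave `dst` byte-identical to `src`. Whether this beats a plain copy for the cross-node case would have to be measured. The prefetch distance in particular is a guess.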