From mboxrd@z Thu Jan 1 00:00:00 1970
From: Herbert van den Bergh
Subject: Re: vnif socket buffer mistaken for pagetable page causes major performance problem
Date: Mon, 08 Jun 2009 09:22:10 -0700
Message-ID: <4A2D3AB2.5020104@oracle.com>
References: <2ee9aac0-0772-4306-a9e5-bf711cf2c2d8@default> <4A2CD5B3.4090700@eu.citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <4A2CD5B3.4090700@eu.citrix.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Gianluca Guida
Cc: Dan Magenheimer, "Xen-Devel (E-mail)"
List-Id: xen-devel@lists.xenproject.org

Gianluca Guida wrote:
> Hi,
>
> Sorry for the late reply,
>
> Dan Magenheimer wrote:
>> A recent posting reminded me of this and, though the
>> information is a bit vague, someone familiar with the
>> shadow code might know just where to look to fix this,
>> hopefully in time for 3.4 (if its not already fixed).
>>
>> One of our performance experts discovered a strange
>> major network performance hit seen only under certain
>> circumstances, IIRC mostly in HVM but sometimes in
>> PV when migrating.
>>
>> With some help from Intel, it was determined that
>> the heuristics used by shadow paging to determine
>> whether a guest is modifying a pagetable page were
>> getting fooled by the access pattern used by the
>> code that copies data into a newly allocated socket
>> buffer. As a result, many unnecessary vmenter/vmexits
>> were happening.
>>
>> The workaround was to preallocate all socket buffer
>> memory.
>>
>> That's all I've got, but we can try to answer questions
>> if this isn't already a known fixed problem.
>
> Can you be a little more specific about what is the performance loss,

Network throughput was reduced to 10% of normal.

> in what workload,

A network send throughput test using netperf.

> and what heuristic you found to be wrong in this case?

The access pattern to the memory page that was recognized as a pagetable
access was a regular memcpy doing 4 byte aligned writes into a newly
allocated page. I'm not that familiar with the shadow pagetable code,
so I don't know how "wrong" this is, just that it caused a false
positive on this type of memory access.

Thanks,
Herbert.
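
For illustration only, the access pattern described above (a copy made of consecutive 4-byte aligned writes into a freshly allocated page) can be sketched in C roughly as follows. This is not the vnif driver code and not the Xen shadow heuristic. The names `copy_words`, `fill_fresh_page` and `SKETCH_PAGE_SIZE` are hypothetical, and the 4 KiB page size is an assumption. An optimizing compiler may also turn the loop into wider stores, so treat it as a picture of the pattern rather than a reproducer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB x86 page */

/* Hypothetical helper: copy len bytes as consecutive 4-byte aligned stores,
 * the pattern a simple word-at-a-time memcpy produces.
 * Returns the number of 4-byte stores performed. */
static size_t copy_words(uint32_t *dst, const uint32_t *src, size_t len)
{
    assert(((uintptr_t)dst % sizeof(uint32_t)) == 0);
    size_t n = len / sizeof(uint32_t);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i]; /* one aligned 4-byte write per iteration */
    return n;
}

/* Hypothetical stand-in for filling a newly allocated socket buffer page.
 * Returns the number of 4-byte writes, or 0 on allocation or copy failure. */
static size_t fill_fresh_page(const uint32_t *payload)
{
    uint32_t *page = aligned_alloc(SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
    if (page == NULL)
        return 0;
    size_t writes = copy_words(page, payload, SKETCH_PAGE_SIZE);
    int same = memcmp(page, payload, SKETCH_PAGE_SIZE) == 0;
    free(page);
    return same ? writes : 0;
}
```

Calling `fill_fresh_page()` with a 4096-byte payload performs 1024 such writes into a page the guest has never touched before, which is the kind of access stream the message above reports as triggering the false positive.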