From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Theurer
Subject: Re: [PATCH] turn off writable page tables
Date: Thu, 27 Jul 2006 09:43:56 -0500
Message-ID: <44C8D12C.1060900@us.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Ian Pratt
Cc: xen-devel@lists.xensource.com, Gerd Hoffmann
List-Id: xen-devel@lists.xenproject.org

>> fork a quite linear from small number to large number of dirty pages.
>> Below are the min and max:
>>
>>              1280 pages    128000 pages
>> wtpt:         813 usec      37552 usec
>> emulate:     3279 usec     283879 usec
>>
>
> Good, at least that suggests that the code works for the usage it was
> intended for.
>
>
>> So, in a -perfect-world- this works great. Problem is most workloads
>> don't appear to have a vast percentage of entries that need to be
>> updated. I'll go ahead and expand this test to find out what the
>> threshold is to break even. I'll also see if we can implement a batched
>> call in fork to update the parent - I hope this will show just as good
>> performance even when most entries need modification and even better
>> performance over wtpt with a low number of entries modified.
>>
>
> With license to make more invasive changes to core Linux mm it certainly
> should be possible to optimize this specific case with a batched update
> fairly easily. You could even go further an implement a 'make all PTEs
> in pagetable RO' hypercall, possibly including a copy to the child. This
> could potentially work better than current 'late pin', at least the
> validation would be incremental rather than in one big hit at the end.
>
> Ian
>

FWIW, I found the threshold for emulate vs wtpt. I ran the fork test
with a set number of pages dirtied such that we had x number of PTEs per
pte_page.

writable-pt
-----------
#pte  usec
002   5242
004   5251
006   5373
008   5519
010   5873

emulate
--------
#pte  usec
002   4922
004   5265
006   6074
008   6991
010   7806
012   5988

So, the threshold appears to be around 4 PTEs/page. I was a little
shocked at first how low this number is, but considering the
near-identical performance with the various workloads, this makes sense.
All of the workloads had the vast majority of writable pages flushed
with just 2 PTEs/page changed and a handful with more PTEs/page changed.
It would not surprise me if the overall average was around 4 PTEs/page.
I am having a hard time finding any "enterprise" workloads which have a
lot of PTEs/page right before fork. If anyone can point me to some, that
would be great.

I will look into batching next, but I am curious whether simply using a
hypercall instead of write fault + emulate will make any difference at
all. I'll try that first, then implement the batched update. Eventually
a hypercall which does more would be nice, but I guess we'll have to
convince the Linux maintainers it's a good idea.

-Andrew
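
For anyone who wants to sanity-check the break-even point from the numbers above, here is a minimal sketch that linearly interpolates between the measured points. The function name break_even and the choice of linear interpolation are my own assumptions for illustration, not part of the test harness; the timings are copied from the two tables.

```python
# Timings in usec, keyed by PTEs changed per pte_page, copied from the tables above.
wtpt = {2: 5242, 4: 5251, 6: 5373, 8: 5519, 10: 5873}
emulate = {2: 4922, 4: 5265, 6: 6074, 8: 6991, 10: 7806, 12: 5988}


def break_even(a, b):
    """Estimate the x at which series b stops being cheaper than series a.

    Hypothetical helper: compares only the x values present in both
    series and interpolates linearly between neighbouring points.
    Returns None if b never crosses from cheaper to not cheaper.
    """
    xs = sorted(set(a) & set(b))
    for x0, x1 in zip(xs, xs[1:]):
        d0 = b[x0] - a[x0]  # negative while b is cheaper
        d1 = b[x1] - a[x1]
        if d0 == 0:
            return float(x0)
        if d0 < 0 <= d1:
            return x0 + (x1 - x0) * (-d0) / (d1 - d0)
    return None


# The 012 row is listed only under emulate, so it has no writable-pt
# counterpart and is skipped by the comparison.
threshold = break_even(wtpt, emulate)
print(f"break-even at about {threshold:.1f} PTEs/page")
```

With the posted numbers the crossing falls between the 002 and 004 rows, which agrees with the "around 4 PTEs/page" estimate.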