From: Christoph Egger
Subject: Re: [PATCH 0 of 5] v2: Nested-p2m cleanups and locking changes
Date: Fri, 1 Jul 2011 12:00:15 +0200
Message-ID: <4E0D9AAF.6070801@amd.com>
In-Reply-To: <20110630094908.GV17634@whitby.uk.xensource.com>
References: <20110627105654.GK17634@whitby.uk.xensource.com>
 <4E08762A.2050801@amd.com>
 <20110627131528.GN17634@whitby.uk.xensource.com>
 <20110627154831.GS17634@whitby.uk.xensource.com>
 <20110630094908.GV17634@whitby.uk.xensource.com>
To: Tim Deegan
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On 06/30/11 11:49, Tim Deegan wrote:
> At 16:48 +0100 on 27 Jun (1309193311), Tim Deegan wrote:
>> At 14:15 +0100 on 27 Jun (1309184128), Tim Deegan wrote:
>>> At 14:23 +0200 on 27 Jun (1309184586), Christoph Egger wrote:
>>>>> - Why is there a 10x increase in IPIs after this series? I don't see
>>>>> what sequence of events sets the relevant cpumask bits to make this
>>>>> happen.
>>>>
>>>> In patch 1 the code that sends the IPIs was outside of the loop and
>>>> moved into the loop.
>>>
>>> Well, yes, but I don't see what that causes 10x IPIs, unless the vcpus
>>> are burning through np2m tables very quickly indeed. Maybe removing the
>>> extra flushes for TLB control will do the trick. I'll make a patch...
>>
>> I think I get it - it's a race between p2m_flush_nestedp2m() on one CPU
>> flushing all the nested P2M tables and a VCPU on another CPU repeatedly
>> getting fresh ones. Try the attached patch, which should cut back the
>> major source of p2m_flush_nestedp2m() calls.
>>
>> Writing it, I realised that after my locking fix, p2m_flush_nestedp2m()
>> isn't safe because it can run in parallel with p2m_get_nestedp2m, which
>> reorders the array it walks. I'll have to make the LRU-fu independent
>> of the array order; should be easy enough but I'll hold off committing
>> the current series until I've done it.
>
> I've just pushed 23633 - 26369, which is this series plus the change to
> the LRU code (and a fix to the NULL deref you reported is folded in).
> Hopefully that puts nested SVM back in at least as good a state as it
> was before my locking-order patch broke it! :)

I ran some tests, and nested SVM is now in an even better state than it
was before. Performance is a lot better now, particularly MMIO
performance. The e1000 driver previously needed several minutes to read
the e1000 MAC address; now it takes less than a second.

Christoph

-- 
---to satisfy European Law for business letters:
Advanced Micro Devices GmbH
Einsteinring 24, 85689 Dornach b. Muenchen
Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632
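
For readers following the thread: the quoted concern about making the LRU
independent of array order can be illustrated with a small sketch. This is
not the code that was committed to Xen. Every name below (struct slot,
flush_all, get_slot, NR_SLOTS) is hypothetical and chosen only for
illustration. The idea is that a lookup records a "last used" stamp in
place instead of moving the entry to the front, so a routine that walks
the array by index never sees entries change position underneath it.
Locking is deliberately left out; the sketch is single-threaded and only
shows the data-structure side.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SLOTS    4           /* hypothetical count, not Xen's value */
#define INVALID_KEY UINT64_MAX  /* marks a free slot; callers must not use it */

/* One cached table. 'key' stands in for the guest's nested CR3. */
struct slot {
    uint64_t key;
    uint64_t last_used;  /* logical timestamp; 0 means never used */
};

static struct slot slots[NR_SLOTS];
static uint64_t clock_now;   /* monotonically increasing logical clock */

/* Walk the array by index and invalidate every slot. Because lookups
 * never reorder the array, slot i is the same slot throughout the walk. */
static void flush_all(void)
{
    for (int i = 0; i < NR_SLOTS; i++) {
        slots[i].key = INVALID_KEY;
        slots[i].last_used = 0;
    }
}

/* Return the index of the slot for 'key'. On a hit, only the timestamp
 * is updated. On a miss, the slot with the oldest timestamp is recycled
 * in place (ties go to the lowest index). */
static int get_slot(uint64_t key)
{
    int victim = 0;

    for (int i = 0; i < NR_SLOTS; i++) {
        if (slots[i].key == key) {
            slots[i].last_used = ++clock_now;
            return i;
        }
        if (slots[i].last_used < slots[victim].last_used)
            victim = i;
    }

    assert(victim >= 0 && victim < NR_SLOTS);
    slots[victim].key = key;
    slots[victim].last_used = ++clock_now;
    return victim;
}
```

A caller would run flush_all() once and then call get_slot() per lookup.
The trade-off in this sketch is a linear scan to find the victim instead
of taking it from the tail of an ordered list, which seems acceptable
for a handful of slots.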