From mboxrd@z Thu Jan 1 00:00:00 1970 From: Christoph Egger Subject: Re: [PATCH 3 of 4] Nested p2m: clarify logic in p2m_get_nestedp2m() Date: Mon, 27 Jun 2011 11:46:05 +0200 Message-ID: <4E08515D.30203@amd.com> References: <4E049E64.9080908@amd.com> <20110624143726.GI9784@whitby.uk.xensource.com> <4E04A4F0.4090803@amd.com> <20110624150509.GJ9784@whitby.uk.xensource.com> Mime-Version: 1.0 Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20110624150509.GJ9784@whitby.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xensource.com Errors-To: xen-devel-bounces@lists.xensource.com To: Tim Deegan Cc: "xen-devel@lists.xensource.com" List-Id: xen-devel@lists.xenproject.org On 06/24/11 17:05, Tim Deegan wrote: > Hi, > > At 15:53 +0100 on 24 Jun (1308930816), Christoph Egger wrote: >>> More generally, I think that you need to figure out exactly what >>> behaviour you want from this function. For example in the current code >>> there's no way that two vcpus with the same ncr3 value can share a >>> nested-p2m. Is that deliberate? >> >> By 'current code' do you mean with or w/o this patch ? > > Both, and all versions of the code from before my current series to the > full series applied. > >> It is deliberate that two vcpus with the same ncr3 share a nested-p2m. > > But they don't. The code in current unstable tip does this: > > for (i = 0; i< MAX_NESTEDP2M; i++) { > p2m = d->arch.nested_p2m[i]; > if ((p2m->cr3 != cr3&& p2m->cr3 != CR3_EADDR) || (p2m != nv->nv_p2m)) > continue; > > // ... return this p2m > } > > /* All p2m's are or were in use. Take the least recent used one, > * flush it and reuse. > */ > for (i = 0; i< MAX_NESTEDP2M; i++) { > p2m = p2m_getlru_nestedp2m(d, NULL); > rv = p2m_flush_locked(p2m); > if (rv == 0) > break; > } > > // ... return this p2m > > The first loop never returns a p2m that's != nv->nv_p2m. The second > loop always returns a fresh, flushed p2m. So there's no way that two > different vcpus, starting with nv->nv_p2m == NULL, can ever get the same > p2m as each other. > > The pseudocode is basically: > - If I have an existing nv_p2m and it hasn't been flushed, reuse it. > - Else flush all np2ms in LRU order and return the last one flushed. I see. Thanks. > My patch 3/4 doesn't change the logic at all (I think); your latest fix > just avoids the over-aggressive flushing of all np2ms. Yes and results in a noticable performance boost. >> But fixing the p2m locking problem in upstream tree has a higher >> priority right now and we can work on that after the p2m locking >> issue is fixed upstream. > > AFAICS the locking is fixed by the current set of patches Yes, I can confirm that. patch 4 fixes it. > (though I'm still not able to run Xen-in-Xen well enough to test them). Can you describe what problem you are facing? Is it a hanging L1 Dom0 ? Does L1 Dom0 not see the SVM cpuid feature bit? > I can send the full series again for clarity if you like. Yes, please! > The outstanding bug is that there are many more IPIs than previously; > I suspect that your latest fix will reduce them quite a lot by avoiding a storm of > mutually-destructive flush operations. Yes because p2m_flush_nestedp2m() runs less often. The number of sent IPIs from p2m_flush_nestedp2m() is still 10 times higher. > If the performance is still too bad we can add more IPI-avoidance strategies. Yes, this still brings significant performance for L3 guests because both host and l1 guest send less IPIs then. We can do this performance improvement in a seperate patch series. Christoph -- ---to satisfy European Law for business letters: Advanced Micro Devices GmbH Einsteinring 24, 85689 Dornach b. Muenchen Geschaeftsfuehrer: Alberto Bozzo, Andrew Bowd Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen Registergericht Muenchen, HRB Nr. 43632