From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: additional domain.c memory allocation causes "xm create" to fail Date: Tue, 4 Sep 2012 21:11:26 +0100 Message-ID: <5046606E.9080908@citrix.com> References: <50464954.5030207@citrix.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============4479523484154043251==" Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: misiu godfrey Cc: xen-devel List-Id: xen-devel@lists.xenproject.org --===============4479523484154043251== Content-Type: multipart/alternative; boundary="------------070007080202050205030503" --------------070007080202050205030503 Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit On 04/09/12 20:45, misiu godfrey wrote: > > For what purpose? There was once a bug which caused this to > happen and > it caused Xen to slow to a glacial pace. We got bored of > debugging HVM > domains after several hours and the BIOS has still not loaded the > bootloader. > > > I'm working on an experiment to do with cache-based side channels in > Cloud environments. Part of it involves measuring the effect of > flushing the cache every time there is a VM switch. Ok. My suspicion is that it will be unreasonably slow, although should not be the glacial pace we saw with the bug (which was flushing the L2 cache on every instruction). > > You don't check the return value, so what happens when the allocation > fails? I would say that calling *alloc() is not a sensible thing > to be > doing in __context_switch() anyway, as you are sitting doing a long > operation while in a critical section of Xen code. > > > Unfortunately, I can chalk that up to my inexperience with C > programming. Thanks for pointing that out. > > As for the sensibility of the plan, it is still in rather early stages > and not as robust as I would like it. 
As I get more working I was > planning on leaving the memory buffer permanently allocated so as not > to spend time managing it in a critical section. If you have a > suggestion for a more practical solution I'm all ears. > > Furthermore, this algorithm is not guaranteed to clear the L2 > cache. In > fact it almost certainly will not. > > > This is the code that has worked in all of my prior experiments and > has been ratified by others I have worked with. Are you sure it > wouldn't work? For simplicity's sake I have removed portions > of the code designed to prevent pre-fetching, and may have left out > something important. Even so, my understanding of cache coloring > would still imply that the data in the cache should be dirty, or > flushed, after this loop terminates. > > Perhaps I have misused the term "flush". My objective is to make each > cache line dirty, or flush it to main memory. "flush" is the correct term. However, the structure of caches works against you. With a set-associative cache, the address determines the set, but you have no control over which of the ways within that set gets used for your cache line. So on an N-way set-associative cache, your worst case may only dirty 1/N of the actual lines in the cache. After that, your L1 cache inclusion policy is going to affect how you dirty your L2 cache, as is whether you have unified caches or split instruction and data caches. Furthermore, on newer processors, multiple cores may share an L2 or L3 cache, and context switches are unlikely to occur at exactly the same time on each core, meaning that a context switch on one core is going to (attempt to) nuke the L2 cache of the VM which is in mid-run on another core. Conversely, even executing the loop trying to dirty the cache will mean that you don't get all of it, and having another core executing on the same L2 cache means it will pull its data back during your dirtying loop. 
-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com --------------070007080202050205030503 Content-Type: text/html; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit
On 04/09/12 20:45, misiu godfrey wrote:

For what purpose?  There was once a bug which caused this to happen and
it caused Xen to slow to a glacial pace.  We got bored of debugging HVM
domains when, after several hours, the BIOS had still not loaded the
bootloader.

I'm working on an experiment to do with cache-based side channels in Cloud environments.  Part of it involves measuring the effect of flushing the cache every time there is a VM switch.

Ok.  My suspicion is that it will be unreasonably slow, although it should not be the glacial pace we saw with the bug (which was flushing the L2 cache on every instruction).


You don't check the return value, so what happens when the allocation
fails?  I would say that calling *alloc() is not a sensible thing to be
doing in __context_switch() anyway, as you are sitting doing a long
operation while in a critical section of Xen code.

Unfortunately, I can chalk that up to my inexperience with C programming.  Thanks for pointing that out.

As for the sensibility of the plan, it is still in rather early stages and not as robust as I would like it.  As I get more working I was planning on leaving the memory buffer permanently allocated so as not to spend time managing it in a critical section.  If you have a suggestion for a more practical solution I'm all ears.
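
A minimal userspace sketch of the two points above: checking the allocation result, and keeping the buffer permanently allocated so that the hot path never calls an allocator. The names `flush_buf_init` and `flush_buf_touch`, the 64-byte line size, and the use of `malloc` are assumptions for illustration only; inside Xen the allocation would go through the hypervisor's own allocator and would happen once, well away from `__context_switch()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_LINE 64u  /* assumed cache-line size in bytes */

/* Allocated once and then reused; never freed in this sketch. */
static volatile unsigned char *flush_buf;
static size_t flush_len;

/*
 * Call once at start-up, outside any critical section.
 * Returns 0 on success, -1 if the request is empty or the allocation fails.
 */
int flush_buf_init(size_t cache_bytes)
{
    unsigned char *p;

    if (cache_bytes == 0)
        return -1;

    p = malloc(cache_bytes);
    if (p == NULL)
        return -1;              /* the caller must handle this case */

    flush_buf = p;
    flush_len = cache_bytes;
    return 0;
}

/*
 * Hot path: no allocation, one write per cache line.
 * Returns the number of lines written (0 if the buffer was never set up).
 */
size_t flush_buf_touch(void)
{
    size_t lines = 0;

    if (flush_buf == NULL)
        return 0;

    assert(flush_len > 0);
    for (size_t i = 0; i < flush_len; i += CACHE_LINE) {
        flush_buf[i] = (unsigned char)i;   /* volatile store dirties the line */
        lines++;
    }
    return lines;
}
```

A caller would run `flush_buf_init(cache_size)` once, refuse to enable the experiment if it returns -1, and call only `flush_buf_touch()` on each switch.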

Furthermore, this algorithm is not guaranteed to clear the L2 cache.  In
fact it almost certainly will not.

This is the code that has worked in all of my prior experiments and has been ratified by others I have worked with.  Are you sure it wouldn't work?  For simplicity's sake I have removed portions of the code designed to prevent pre-fetching, and may have left out something important.  Even so, my understanding of cache coloring would still imply that the data in the cache should be dirty, or flushed, after this loop terminates.

Perhaps I have misused the term "flush".  My objective is to make each cache line dirty, or flush it to main memory.

"flush" is the correct term.

However, the structure of caches works against you.  With a set-associative cache, the address determines the set, but you have no control over which of the ways within that set gets used for your cache line.  So on an N-way set-associative cache, your worst case may only dirty 1/N of the actual lines in the cache.
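
To make the 1/N figure concrete, here is a small simulation sketch. The geometry (512 sets, 8 ways, 64-byte lines), the three replacement policies, and all function names are assumptions chosen for illustration, and the buffer is assumed to be physically contiguous; real hardware policies are usually undocumented and sit somewhere between the best and worst cases modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64u   /* assumed cache-line size in bytes */
#define NUM_SETS  512u  /* assumed number of sets */
#define NUM_WAYS  8u    /* assumed associativity (the N in "N-way") */

enum policy { POLICY_LRU, POLICY_RANDOM, POLICY_WORST };

/* The address alone decides which set a line lands in... */
static size_t set_index(size_t addr)
{
    return (addr / LINE_SIZE) % NUM_SETS;
}

/* Small deterministic generator so the "random" policy is repeatable. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 16;
}

/*
 * ...but the replacement policy decides which way is overwritten.
 * Walk a buffer exactly as large as the cache through a cache that starts
 * full of another VM's lines, and return how many of those lines survive.
 */
size_t victim_lines_left(enum policy p)
{
    static unsigned char ours[NUM_SETS][NUM_WAYS];
    static size_t fills[NUM_SETS];
    const size_t cache_bytes = (size_t)NUM_SETS * NUM_WAYS * LINE_SIZE;
    uint32_t rng = 12345u;
    size_t left = 0;

    memset(ours, 0, sizeof ours);
    memset(fills, 0, sizeof fills);

    for (size_t addr = 0; addr < cache_bytes; addr += LINE_SIZE) {
        size_t s = set_index(addr);
        size_t w;

        switch (p) {
        case POLICY_LRU:    w = fills[s] % NUM_WAYS;       break; /* oldest way */
        case POLICY_RANDOM: w = lcg_next(&rng) % NUM_WAYS; break;
        default:            w = 0;                         break; /* same way every time */
        }
        assert(w < NUM_WAYS);
        ours[s][w] = 1;
        fills[s]++;
    }

    for (size_t s = 0; s < NUM_SETS; s++)
        for (size_t w = 0; w < NUM_WAYS; w++)
            if (!ours[s][w])
                left++;
    return left;
}
```

With these assumed numbers, the LRU case replaces every line, while the worst case leaves 7 of every 8 lines untouched, which is the 1/N bound described above; the random policy lands in between.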

After that, your L1 cache inclusion policy is going to affect how you dirty your L2 cache, as is whether you have unified caches or split instruction and data caches.

Furthermore, on newer processors, multiple cores may share an L2 or L3 cache, and context switches are unlikely to occur at exactly the same time on each core, meaning that a context switch on one core is going to (attempt to) nuke the L2 cache of the VM which is in mid-run on another core.  Conversely, even executing the loop trying to dirty the cache will mean that you don't get all of it, and having another core executing on the same L2 cache means it will pull its data back during your dirtying loop.
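
The "pull its data back" effect can be sketched with a toy model of a shared cache. Everything here is an assumption made for illustration: the geometry, a strict LRU policy, a neighbouring core with a 64-line working set, and one neighbour access interleaved with each access of the dirtying loop. It is not a model of any particular processor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_SETS        512u  /* assumed number of sets */
#define NUM_WAYS        8u    /* assumed associativity */
#define NEIGHBOUR_LINES 64u   /* assumed working set of the other core */

enum owner { NONE = 0, FLUSHER, NEIGHBOUR };

struct way {
    enum owner    who;
    size_t        line;
    unsigned long last_use;   /* 0 means the way is still empty */
};

static struct way cache[NUM_SETS][NUM_WAYS];
static unsigned long now;

/* One access to the shared cache with strict LRU replacement. */
static void touch(enum owner who, size_t line)
{
    struct way *set = cache[line % NUM_SETS];
    struct way *victim = &set[0];

    assert(who != NONE);
    now++;
    for (size_t w = 0; w < NUM_WAYS; w++) {
        if (set[w].who == who && set[w].line == line) {
            set[w].last_use = now;          /* hit: refresh its age */
            return;
        }
        if (set[w].last_use < victim->last_use)
            victim = &set[w];               /* remember the oldest way */
    }
    victim->who = who;                      /* miss: evict the oldest way */
    victim->line = line;
    victim->last_use = now;
}

/*
 * Run the dirtying loop over a cache-sized buffer and return how many of
 * the neighbour's lines are still resident afterwards.
 */
size_t neighbour_lines_left(int neighbour_running)
{
    size_t left = 0;

    memset(cache, 0, sizeof cache);
    now = 0;

    for (size_t l = 0; l < NEIGHBOUR_LINES; l++)
        touch(NEIGHBOUR, l);                /* neighbour warms its lines */

    for (size_t i = 0; i < (size_t)NUM_SETS * NUM_WAYS; i++) {
        touch(FLUSHER, i);
        if (neighbour_running)
            touch(NEIGHBOUR, i % NEIGHBOUR_LINES);
    }

    for (size_t s = 0; s < NUM_SETS; s++)
        for (size_t w = 0; w < NUM_WAYS; w++)
            if (cache[s][w].who == NEIGHBOUR)
                left++;
    return left;
}
```

In this model an idle neighbour loses its whole working set, while a neighbour that keeps running retains all of it, because its lines are always more recently used than the oldest line written by the loop.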

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
--------------070007080202050205030503-- --===============4479523484154043251== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============4479523484154043251==--