From mboxrd@z Thu Jan  1 00:00:00 1970
From: George Dunlap <george.dunlap@eu.citrix.com>
Subject: Re: So I tried to use xentrace...
Date: Fri, 7 May 2010 16:16:34 -0500
Message-ID: <4BE48332.6040209@eu.citrix.com>
References: <4BDB4CCC.3080405@goop.org>
	<n2ode76405a1005071348m8f871b8cn49e50dff487943e6@mail.gmail.com>
	<4BE47F26.50904@goop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <4BE47F26.50904@goop.org>
List-Unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Xen-devel <xen-devel@lists.xensource.com>, Keir Fraser <Keir.Fraser@eu.citrix.com>
List-Id: xen-devel@lists.xenproject.org

Jeremy Fitzhardinge wrote:
>>> (XEN) ----[ Xen-4.1-unstable  x86_64  debug=y  Not tainted ]----
>>> (XEN) CPU:    1
>>> (XEN) RIP:    e008:[<ffff82c4801215b3>] check_lock+0x1b/0x45
>
> This suggests the problem is with misusing a lock in the wrong interrupt
> context, rather than anything to do with sizes.
>   
Except that it works for me if I use -S 32, and doesn't if I use -S 512
(on my 2-core box, that's the same number of pages as -S 256 on your
4-core box).  :-)  Try it with -S 32; I suspect it will work.

Also:
* It's a page fault with a null pointer, not a bugcheck.  In a non-debug 
build, it will crash in spin_lock instead of check_lock.
* The fault is in the MMU update hypercall; I believe it happens when
xentrace tries to map garbage pages or invalid MFNs.
* This is the exact bug we were getting in product, and the 
bounds-checking fixed it.

Hmm... the bounds checking should be working.  The index limit is
meant to be 2048 (2 pages = 8k; 8k / sizeof(uint32_t) = 2048), and the
maximum index for you is 1088, well within the t_info size.  Hmm...
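
The arithmetic above can be checked with a short sketch.  It assumes 4k
pages and 4-byte uint32_t entries, and it assumes -S is a per-CPU page
count (as the 2-core / 4-core comparison implies).  The names are
illustrative only; they are not Xen or xentrace identifiers.

```python
# Sketch of the t_info bounds arithmetic; names are illustrative, not Xen's.
PAGE_SIZE = 4096          # assumption: 4k pages (2 pages = 8k)
ENTRY_SIZE = 4            # sizeof(uint32_t)
T_INFO_PAGES = 2          # t_info size quoted in this mail


def t_info_entries(pages=T_INFO_PAGES):
    """Number of uint32_t slots that fit in the t_info area."""
    return pages * PAGE_SIZE // ENTRY_SIZE


def index_in_bounds(index, pages=T_INFO_PAGES):
    """True if index addresses a slot inside the t_info area."""
    return 0 <= index < t_info_entries(pages)


def total_trace_pages(pages_per_cpu, cpus):
    """Total trace pages, assuming -S gives a per-CPU page count."""
    return pages_per_cpu * cpus


if __name__ == "__main__":
    print(t_info_entries())                                        # 2048
    print(index_in_bounds(1088))                                   # True
    print(total_trace_pages(512, 2) == total_trace_pages(256, 4))  # True
```

By this check an index of 1088 fits in the 2048 slots, so the size
arithmetic alone does not explain the crash.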

 -George