From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Hopwood
Subject: Re: Grant Table Network Issues
Date: Sun, 14 Aug 2005 17:53:36 +0100
Message-ID: <42FF7710.9090504@blueyonder.co.uk>
References: <20050813185945.GA23341@vrable.net>
 <85c87b7ce2e667949306ca7da953d219@cl.cam.ac.uk>
 <20050814153953.GA5037@vrable.net>
Reply-To: david.nospam.hopwood@blueyonder.co.uk
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <20050814153953.GA5037@vrable.net>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Michael Vrable wrote:
> On Sun, Aug 14, 2005 at 09:29:02AM +0100, Keir Fraser wrote:
>>On 13 Aug 2005, at 19:59, Michael Vrable wrote:
>>
>>>The line causing trouble is "BUG_ON(in_irq())". [...]
>>
>>The only code between irq_exit and kmap_skb_frag on the stack trace is
>>unmodified Linux code. Assuming that is all correct (and presumably the
>>same whether we enable grant tables or not) I might guess another
>>interrupt arrives and the handler corrupts things?
>
> I discovered the cause of this and other crashes yesterday: when grant
> tables are enabled in the netback driver, net_tx_action() and
> net_tx_action_dealloc() in netback.c each allocate large arrays from the
> kernel's stack ("gnttab_map_grant_ref_t map_ops[MAX_PENDING_REQS]" and
> "gnttab_unmap_grant_ref_t unmap_ops[MAX_PENDING_REQS]"). This results
> in a stack overflow.

Is there no way to make a kernel stack overflow fail fast with an obvious
error, rather than causing other subtle failures?

-- 
David Hopwood