From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeremy Fitzhardinge
Subject: Re: [PV-ops][PATCH] Netback: Fix PV network issue for netback multiple threads patchset
Date: Thu, 01 Jul 2010 18:07:12 +0200
Message-ID: <4C2CBD30.4060704@goop.org>
References: <1276248930.19091.2870.camel@zakaz.uk.xensource.com> <4C1F49B1.3060403@goop.org> <1277995730.28432.24.camel@zakaz.uk.xensource.com> <4C2CB443.1060907@goop.org> <1277999255.28432.50.camel@zakaz.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
In-Reply-To: <1277999255.28432.50.camel@zakaz.uk.xensource.com>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Ian Campbell
Cc: xen-devel@lists.xensource.com, Fantu, Stefano Stabellini, "Xu, Dongxiao", Paul Durrant, djmagee@mageenet.net
List-Id: xen-devel@lists.xenproject.org

On 07/01/2010 05:47 PM, Ian Campbell wrote:
>> Hm, I hadn't meant to commit that properly. I had it locally and
>> accidentally pushed it out.
>>
>> I only did that patch as an RFC in response to an issue alluded to by
>> Dongxiao (or was it you?) about things not being fully initialized by
>> the time the async code starts. Is this a real issue, and if so, what's
>> the correct fix?
>>
> I don't think there is an actual current issue, just a potential one
> since we are relying on data structures being zeroed rather than
> properly initialised to keep the async code from running off into the
> weeds, it just seemed a little fragile this way.
>
> Originally I said:
>
>>> The crash is in one of the calls to list_move_tail and I think it is
>>> because netbk->pending_inuse_head not being initialised until after
>>> the threads and/or tasklets are created (I was running in threaded
>>> mode).
>>> Perhaps even though we are now zeroing the netbk struct those fields
>>> should still be initialised before kicking off any potentially
>>> asynchronous tasks?
>>>
> this specific issue was fixed by zeroing the netbk array as it is
> allocated, I just thought we could make things more robust by not
> triggering the async code until everything was fully setup.
>

It would only affect system startup time, not domain creation?

I was looking at it because Stefano was having fairly consistent crashes
on domain creation, and it looked like sort-of-racy symptoms.

    J
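For readers following the thread, the ordering Ian suggests (initialise every field the asynchronous code touches, then start the threads) can be sketched in user-space C. This is only an illustration under stated assumptions, not the netback code: struct netbk_sketch, worker_fn, netbk_sketch_start and netbk_sketch_run are hypothetical names, the list helpers are simplified stand-ins for the kernel's list primitives, and pthreads stands in for the kernel threads or tasklets. The point it tries to show is that a zeroed list head holds NULL pointers instead of pointing at itself, so it is not a valid empty list for a list_move_tail-style operation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list points at itself; a zeroed one holds NULL instead. */
static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static void list_del_entry(struct list_head *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

static void list_add_tail_entry(struct list_head *e, struct list_head *head)
{
    e->prev = head->prev;
    e->next = head;
    head->prev->next = e;   /* NULL dereference if head was only zeroed */
    head->prev = e;
}

/* Same idea as list_move_tail(): unlink the entry, then append it. */
static void list_move_tail_entry(struct list_head *e, struct list_head *head)
{
    list_del_entry(e);
    list_add_tail_entry(e, head);
}

/* Hypothetical, heavily reduced stand-in for the per-group state. */
struct netbk_sketch {
    struct list_head pending_inuse_head;
    struct list_head entry;             /* one in-flight request */
    pthread_t worker;
};

/* Asynchronous side: assumes the lists are already valid. */
static void *worker_fn(void *arg)
{
    struct netbk_sketch *nb = arg;

    list_move_tail_entry(&nb->entry, &nb->pending_inuse_head);
    return NULL;
}

/* Initialise everything the worker touches, and only then start it. */
int netbk_sketch_start(struct netbk_sketch *nb)
{
    init_list_head(&nb->pending_inuse_head);
    init_list_head(&nb->entry);
    return pthread_create(&nb->worker, NULL, worker_fn, nb);
}

/* Returns 1 if the worker left the entry on the in-use list, else 0. */
int netbk_sketch_run(void)
{
    struct netbk_sketch *nb = calloc(1, sizeof(*nb)); /* zeroed, as in the thread */
    int ok;

    assert(nb != NULL);
    if (netbk_sketch_start(nb) != 0) {
        free(nb);
        return 0;
    }
    pthread_join(nb->worker, NULL);
    ok = nb->pending_inuse_head.next == &nb->entry &&
         nb->pending_inuse_head.prev == &nb->entry;
    free(nb);
    return ok;
}
```

In this sketch, moving the pthread_create() call above the two init_list_head() calls would leave the worker free to run against NULL list pointers, which is the fragile ordering the thread is worried about. Whether the real driver can hit that window depends on details not shown in this message.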