From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pete Zaitcev
Subject: Re: Post-XDR CLD cannot keep session up
Date: Tue, 9 Feb 2010 09:12:06 -0700
Message-ID: <20100209091206.07df7ea2@redhat.com>
References: <20100207000047.46f77de0@redhat.com> <4B713A38.1010106@garzik.org> <4B714FCF.1060708@garzik.org>
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <4B714FCF.1060708@garzik.org>
Sender: hail-devel-owner@vger.kernel.org
List-ID:
Content-Type: text/plain; charset="us-ascii"
To: Jeff Garzik
Cc: Project Hail List

On Tue, 09 Feb 2010 07:06:39 -0500
Jeff Garzik wrote:

> There is definitely something strange going on in the timer routines,
> that is causing session_timeout() not to run even though it re-adds
> itself to the timer list using cld_timer_add(). fprintf() debug output
> in cld_timer_add and cld_timers_run are yielding unexpected results.

Shoot, I think I know what this is, and it's my fault. The list is
"cached" improperly inside cld_timers_run. I remember that at some
point I added a mutex to every list, noticed that the list wasn't
locked correctly, and fixed it. But then I dropped those mutexes
because of some recursion issues, and undid the fix along with them.

I'll retest and send a patch in a few.

-- Pete