public inbox for hail-devel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Garzik <jeff@garzik.org>
To: Pete Zaitcev <zaitcev@redhat.com>
Cc: Project Hail List <hail-devel@vger.kernel.org>
Subject: Re: [Patch 01/12] CLD: fix crash in retransmissions
Date: Sun, 18 Apr 2010 23:46:07 -0400	[thread overview]
Message-ID: <4BCBD1FF.3080909@garzik.org> (raw)
In-Reply-To: <20100417223709.3f902df3@redhat.com>

On 04/18/2010 12:37 AM, Pete Zaitcev wrote:
> For a longest time I was plagued by (very infrequent) crashes like this:
>
> Program received signal SIGSEGV, Segmentation fault.
> sess_retry_output (timer=0x92070c0) at session.c:532
> 532                     if (!next_retry || (op->next_retry<  next_retry))
> (gdb) info threads
> * 1 Thread 0xb72f96c0 (LWP 22417)  sess_retry_output (timer=0x92070c0) at session.c:532
> (gdb) where
> #0  sess_retry_output (timer=0x92070c0) at session.c:532
> #1  session_retry (timer=0x92070c0) at session.c:565
> #2  0x08049aee in cld_timers_run (tlist=0x8056630) at ../lib/libtimer.c:95
> #3  0x0804e9cc in main_loop (argc=5, argv=0xbff70bd4) at server.c:983
> #4  main (argc=5, argv=0xbff70bd4) at server.c:1138
>
> The crash happens because op is NULL. As it turned out, this happens
> if a packet retransmit and a session expiration occur simultaneously
> (in the same pass of timers_run). The scenario is:
>   - timers_run collects expired timers at exec list
>   - timers_run expires session
>   - two timer_del are called, but one of them is on exec list already,
>     so it's ineffective
>   - session is freed, this zeroes ->data in lists (later op)
>   - timers_run continues along the exec list, invokes the retransmission
>     callback, and that crashes with NULL op.
>
> The proposed solution is to rework the timers_run, again. But this
> time, we'll make it simpler by observing that timers are ordered by
> expiration time. Therefore, we can pull next timer off the list,
> expire it, and loop until expiration time is greater than the current
> time. No execution list is kept. The integrity of the main list
> is assured by never walking it and always referring to the head
> anew at each iteration.
>
> This patch appears to fix the problem and stands up to use that
> crashed the old code.
>
> Signed-off-by: Pete Zaitcev<zaitcev@redhat.com>
>
> ---
>   include/cld_common.h |   10 ++++++----
>   lib/libtimer.c       |   41 ++++++++++++++++++++---------------------
>   2 files changed, 26 insertions(+), 25 deletions(-)

applied 1-4

Note that I change the email subject line prefix (which is normally 
copied by automated tools directly into git) from "CLD" to "libcldc", 
when committing to git.  I am trying to follow (and encourage others to) 
the kernel's method of using a prefix to indicate the subsystem or 
section within the current git repo to which a change applies.

To be fully friendly with automated tools, an ideal Project Hail subject 
line might read

	[cld patch 1/1] update Makefile.am
	 	or
	[tabled patch 1/1] update Makefile.am
		or
	[cld patch 1/1] libcldc: add nncld API, the new new CLD API

Not a big deal, just noting what is most friendly to the git automated 
tools, when I'm importing each Hail patch.

	Jeff




  reply	other threads:[~2010-04-19  3:46 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-18  4:37 [Patch 01/12] CLD: fix crash in retransmissions Pete Zaitcev
2010-04-19  3:46 ` Jeff Garzik [this message]
2010-04-19  4:38   ` Pete Zaitcev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BCBD1FF.3080909@garzik.org \
    --to=jeff@garzik.org \
    --cc=hail-devel@vger.kernel.org \
    --cc=zaitcev@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox