All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Garzik <jeff@garzik.org>
To: Pete Zaitcev <zaitcev@redhat.com>
Cc: Project Hail List <hail-devel@vger.kernel.org>
Subject: Re: [Patch 01/12] CLD: fix crash in retransmissions
Date: Sun, 18 Apr 2010 23:46:07 -0400	[thread overview]
Message-ID: <4BCBD1FF.3080909@garzik.org> (raw)
In-Reply-To: <20100417223709.3f902df3@redhat.com>

On 04/18/2010 12:37 AM, Pete Zaitcev wrote:
> For a longest time I was plagued by (very infrequent) crashes like this:
>
> Program received signal SIGSEGV, Segmentation fault.
> sess_retry_output (timer=0x92070c0) at session.c:532
> 532                     if (!next_retry || (op->next_retry<  next_retry))
> (gdb) info threads
> * 1 Thread 0xb72f96c0 (LWP 22417)  sess_retry_output (timer=0x92070c0) at session.c:532
> (gdb) where
> #0  sess_retry_output (timer=0x92070c0) at session.c:532
> #1  session_retry (timer=0x92070c0) at session.c:565
> #2  0x08049aee in cld_timers_run (tlist=0x8056630) at ../lib/libtimer.c:95
> #3  0x0804e9cc in main_loop (argc=5, argv=0xbff70bd4) at server.c:983
> #4  main (argc=5, argv=0xbff70bd4) at server.c:1138
>
> The crash happens because op is NULL. As it turned out, this happens
> if a packet retransmit and a session expiration occur simultaneously
> (in the same pass of timers_run). The scenario is:
>   - timers_run collects expired timers at exec list
>   - timers_run expires session
>   - two timer_del are called, but one of them is on exec list already,
>     so it's ineffective
>   - session is freed, this zeroes ->data in lists (later op)
>   - timers_run continues along the exec list, invokes the retransmission
>     callback, and that crashes with NULL op.
>
> The proposed solution is to rework the timers_run, again. But this
> time, we'll make it simpler by observing that timers are ordered by
> expiration time. Therefore, we can pull next timer off the list,
> expire it, and loop until expiration time is greater than the current
> time. No execution list is kept. The integrity of the main list
> is assured by never walking it and always referring to the head
> anew at each iteration.
>
> This patch appears to fix the problem and stands up to use that
> crashed the old code.
>
> Signed-off-by: Pete Zaitcev<zaitcev@redhat.com>
>
> ---
>   include/cld_common.h |   10 ++++++----
>   lib/libtimer.c       |   41 ++++++++++++++++++++---------------------
>   2 files changed, 26 insertions(+), 25 deletions(-)

applied 1-4

Note that I change the email subject line prefix (which is normally 
copied by automated tools directly into git) from "CLD" to "libcldc", 
when committing to git.  I am trying to follow (and encourage others to) 
the kernel's method of using a prefix to indicate the subsystem or 
section within the current git repo to which a change applies.

To be fully friendly with automated tools, an ideal Project Hail subject 
line might read

	[cld patch 1/1] update Makefile.am
	 	or
	[tabled patch 1/1] update Makefile.am
		or
	[cld patch 1/1] libcldc: add nncld API, the new new CLD API

Not a big deal, just noting what is most friendly to the git automated 
tools, when I'm importing each Hail patch.

	Jeff




  reply	other threads:[~2010-04-19  3:46 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-18  4:37 [Patch 01/12] CLD: fix crash in retransmissions Pete Zaitcev
2010-04-19  3:46 ` Jeff Garzik [this message]
2010-04-19  4:38   ` Pete Zaitcev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BCBD1FF.3080909@garzik.org \
    --to=jeff@garzik.org \
    --cc=hail-devel@vger.kernel.org \
    --cc=zaitcev@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.