public inbox for hail-devel@vger.kernel.org
 help / color / mirror / Atom feed
From: Jeff Garzik <jeff@garzik.org>
To: Pete Zaitcev <zaitcev@redhat.com>
Cc: Project Hail List <hail-devel@vger.kernel.org>
Subject: Re: [cld patch 1/1] fix crash in cldc_close due to array reallocations
Date: Sat, 15 May 2010 16:36:58 -0400	[thread overview]
Message-ID: <4BEF05EA.60807@garzik.org> (raw)
In-Reply-To: <20100515133227.24e72d0d@redhat.com>

On 05/15/2010 03:32 PM, Pete Zaitcev wrote:
> A long-running tabled is virtually guaranteed to crash when it's shut
> down, with the traceback like this:
>
> Program terminated with signal 11, Segmentation fault.
> #0  cldc_close (fh=0x10a71c8, copts=0x7fff2fdef780) at cldc.c:1049
> 1049            if (sess->expired) {
> (gdb) where
> #0  cldc_close (fh=0x10a71c8, copts=0x7fff2fdef780) at cldc.c:1049
> #1  0x00007f8a878ed7c7 in ncld_close (fh=0x10a7790) at cldc.c:2215
> #2  0x00007f8a878ed918 in ncld_sess_close (nsess=0x10a6f10) at cldc.c:2247
> #3  0x0000000000406252 in cld_end () at cldu.c:701
> #4  0x000000000040d6c4 in main (argc=<value optimized out>,
>      argv=<value optimized out>) at server.c:1980
> (gdb) print *fh
> $3 = {fh = 140232956496032, sess = 0x0, valid = 176}
>
> Obviously the boolean value should not be 176.
>
> The reason this happens is array reallocation. The API for libcldc
> (used internally by ncld) returns a pointer to struct cldc_fh that
> the user is expected to save. However, this structure is placed
> inside a GArray. As array expands, it can be relocated in memory
> and the file handle pointer is left hanging. Tabled is vulnerable
> to this because 1) it keeps an open handle for the CLD file that
> it locked and 2) it periodically opens and closes CLD files for
> chunkservers, causing array to grow.
>
> We observed that the indexes into the array are not the same
> as file handles. Therefore, we never access elements by index
> and thus it is not necessary to use an array structure.
>
> This patch replaces array with a list, so that pointers to handle
> structures can be stable.
>
> Also, the patch frees the handles when they are closed, thus
> ending the endless growth of the file handle table. This is
> done in a compatible way, so handle is available to callers
> of cldc_close (ncld does not need it, but it may be useful).
>
> Although we change globally visible structures, we only replace
> one ostensibly opaque pointer type (GArray) with another one
> (GList), so applications need not be rebuilt.
>
> Signed-off-by: Pete Zaitcev<zaitcev@redhat.com>

applied


      reply	other threads:[~2010-05-15 20:36 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-15 19:32 [cld patch 1/1] fix crash in cldc_close due to array reallocations Pete Zaitcev
2010-05-15 20:36 ` Jeff Garzik [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BEF05EA.60807@garzik.org \
    --to=jeff@garzik.org \
    --cc=hail-devel@vger.kernel.org \
    --cc=zaitcev@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox