Re: xenstored crashes with SIGSEGV

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Philipp Hahn <hahn@univention.de>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>, Xen-devel@lists.xen.org
Subject: Re: xenstored crashes with SIGSEGV
Date: Mon, 15 Dec 2014 23:29:19 +0100	[thread overview]
Message-ID: <548F60BF.4020901@univention.de> (raw)
In-Reply-To: <1418665524.16425.171.camel@citrix.com>

Hello Ian,

On 15.12.2014 18:45, Ian Campbell wrote:
> On Mon, 2014-12-15 at 14:50 +0000, Ian Campbell wrote:
>> On Mon, 2014-12-15 at 15:19 +0100, Philipp Hahn wrote:
>>> I just noticed something strange:
>>>
>>>> #3  0x000000000040a684 in tdb_open (name=0xff00000000 <Address
>>>> 0xff00000000 out of bounds>, hash_size=0,
>>>>     tdb_flags=4254928, open_flags=-1, mode=3119127560) at tdb.c:1773
...
> I'm reasonably convinced now that this is just a weird artefact of
> running gdb on an optimised binary, probably a shortcoming in the debug
> info leading to gdb getting confused.
> 
> Unfortunately this also calls into doubt the parameter to talloc_free,
> perhaps in that context 0xff0000000 is a similar artefact.
> 
> Please can you print the entire contents of tdb in the second frame
> ("print *tdb" ought to do it). I'm curious whether it is all sane or
> not.

(gdb) print *tdb
$1 = {name = 0x0, map_ptr = 0x0, fd = 47, map_size = 65280, read_only =
16711680,
  locked = 0xff0000000000, ecode = 16711680, header = {
    magic_food =
"\000\000\000\000\000\000\000\000\000\377\000\000\000\000\377\000\000\000\000\000\000\000\000\000\000\377\000\000\000\000\377",
version = 0, hash_size = 0,
    rwlocks = 65280, reserved = {16711680, 0, 0, 65280, 16711680, 0, 0,
65280,
      16711680, 0, 0, 65280, 16711680, 0, 0, 65280, 16711680, 0, 0,
65280, 16711680,
      0, 0, 65280, 16711680, 0, 0, 65280, 16711680, 0, 0}}, flags = 0,
travlocks = {
    next = 0xff0000, off = 0, hash = 65280}, next = 0xff0000,
  device = 280375465082880, inode = 16711680, log_fn = 0x4093b0
<null_log_fn>,
  hash_fn = 0x4092f0 <default_tdb_hash>, open_flags = 2}

> Please can you also print "info regs" at the point of the segv (in frame
> 0) as well as "disas" at that point.

(gdb) info registers
rax            0x0      0
rbx            0x16bff70        23854960
rcx            0xffffffffffffffff       -1
rdx            0x40ecd0 4254928
rsi            0x0      0
rdi            0xff0000000000   280375465082880
rbp            0x7fcaed6c96a8   0x7fcaed6c96a8
rsp            0x7fff9dc86330   0x7fff9dc86330
r8             0x7fcaece54c08   140509534571528
r9             0xff00000000000000       -72057594037927936
r10            0x7fcaed08c14c   140509536895308
r11            0x246    582
r12            0xd      13
r13            0xff0000000000   280375465082880
r14            0x4093b0 4232112
r15            0x167d620        23582240
rip            0x4075c4 0x4075c4 <talloc_chunk_from_ptr+4>
eflags         0x10206  [ PF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
fctrl          0x0      0
fstat          0x0      0
ftag           0x0      0
fiseg          0x0      0
fioff          0x0      0
foseg          0x0      0
fooff          0x0      0
fop            0x0      0
mxcsr          0x0      [ ]

(gdb) disassemble
Dump of assembler code for function talloc_chunk_from_ptr:
0x00000000004075c0 <talloc_chunk_from_ptr+0>:   sub    $0x8,%rsp
0x00000000004075c4 <talloc_chunk_from_ptr+4>:   mov    -0x8(%rdi),%edx
0x00000000004075c7 <talloc_chunk_from_ptr+7>:   lea    -0x50(%rdi),%rax
0x00000000004075cb <talloc_chunk_from_ptr+11>:  mov    %edx,%ecx
0x00000000004075cd <talloc_chunk_from_ptr+13>:  and
$0xfffffffffffffff0,%ecx
0x00000000004075d0 <talloc_chunk_from_ptr+16>:  cmp    $0xe814ec70,%ecx
0x00000000004075d6 <talloc_chunk_from_ptr+22>:  jne    0x4075e2
<talloc_chunk_from_ptr+34>
0x00000000004075d8 <talloc_chunk_from_ptr+24>:  and    $0x1,%edx
0x00000000004075db <talloc_chunk_from_ptr+27>:  jne    0x4075e2
<talloc_chunk_from_ptr+34>
0x00000000004075dd <talloc_chunk_from_ptr+29>:  add    $0x8,%rsp
0x00000000004075e1 <talloc_chunk_from_ptr+33>:  retq
0x00000000004075e2 <talloc_chunk_from_ptr+34>:  nopw   0x0(%rax,%rax,1)
0x00000000004075e8 <talloc_chunk_from_ptr+40>:  callq  0x401b98 <abort@plt>

> Can you also "p $_siginfo._sifields._sigfault.si_addr" (in frame 0).
> This ought to be the actual faulting address, which ought to give a hint
> on how much we can trust the parameters in the stack trace.

Hmm, my gdb refused to access $_siginfo:
(gdb) show convenience
$_siginfo = Unable to read siginfo

> Since I'm asking for the world I may as well ask you to dump the raw
> stack too "x/64x $sp" ought to be a good starting point.

(gdb) x/64x $sp
0x7fff9dc86330: 0xed6c96a8      0x00007fca      0x00407edf      0x00000000
0x7fff9dc86340: 0x00000000      0x00000000      0x016bff70      0x00000000
0x7fff9dc86350: 0xed6c96a8      0x00007fca      0x0000000d      0x00000000
0x7fff9dc86360: 0x00000000      0x00000000      0x004093b0      0x00000000
0x7fff9dc86370: 0x0167d620      0x00000000      0x0040a348      0x00000000
0x7fff9dc86380: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fff9dc86390: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fff9dc863a0: 0x00000011      0x00000000      0x411d4816      0x00000000
0x7fff9dc863b0: 0x00000001      0x00000000      0x000081a0      0x00000000
0x7fff9dc863c0: 0x00000000      0x00000000      0x00000000      0x00000000
0x7fff9dc863d0: 0x00096000      0x00000000      0x00001000      0x00000000
0x7fff9dc863e0: 0x000004b0      0x00000000      0x5438ba01      0x00000000
0x7fff9dc863f0: 0x07fd332e      0x00000000      0x5438ba01      0x00000000
0x7fff9dc86400: 0x07fd332e      0x00000000      0x5438ba01      0x00000000
0x7fff9dc86410: 0x07fd332e      0x00000000      0x00000000      0x00000000
0x7fff9dc86420: 0x00000000      0x00000000      0x00000000      0x00000000

> I notice in your bugzilla (for a different occurrence, I think):
>> [2090451.721705] univention-conf[2512]: segfault at ff00000000 ip 000000000045e238 sp 00007ffff68dfa30 error 6 in python2.6[400000+21e000]
> 
> Which appears to have faulted access 0xff000000000 too. It looks like
> this process is a python thing, it's nothing to do with xenstored I
> assume?

Yes, that's one univention-config, which is completely independent of
xen(stored).

> It seems rather coincidental that it should be accessing the 
> same sort of address and be faulting.

Yes, good catch. I'll have another look at those core dumps.

> Ian.

Thank you for your help.
Philipp Hahn

next prev parent reply	other threads:[~2014-12-15 22:29 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-13  7:45 xenstored crashes with SIGSEGV Philipp Hahn
2014-11-13  9:12 ` Ian Campbell
2014-12-12 16:14   ` Philipp Hahn
2014-12-12 16:32     ` Ian Campbell
2014-12-12 16:45       ` Philipp Hahn
2014-12-12 16:56         ` Ian Campbell
2014-12-12 17:20           ` Philipp Hahn
2014-12-12 17:58             ` Ian Campbell
2014-12-15 13:17               ` Ian Campbell
2014-12-15 14:19                 ` Philipp Hahn
2014-12-15 14:50                   ` Ian Campbell
2014-12-15 17:45                     ` Ian Campbell
2014-12-15 22:29                       ` Philipp Hahn [this message]
2014-12-16  9:51                         ` Ian Campbell
2014-12-16 10:25                         ` Ian Campbell
2014-12-16 10:45                         ` Ian Campbell
2014-12-16 11:06                           ` Ian Campbell
2014-12-16 11:30                             ` Frediano Ziglio
2014-12-16 12:23                               ` Ian Campbell
2014-12-16 16:13                                 ` Frediano Ziglio
2014-12-16 16:23                                   ` Ian Campbell
2014-12-16 16:44                                     ` Frediano Ziglio
2014-12-17  9:14                                       ` Frediano Ziglio
2014-12-17 12:43                                         ` core dump files do not include all CPU registers? Philipp Hahn
2014-12-18 10:20                                         ` xenstored crashes with SIGSEGV Philipp Hahn
2014-12-18 10:17                                   ` Ian Campbell
2014-12-18 10:25                                     ` David Vrabel
2014-12-19 14:30                                       ` Konrad Rzeszutek Wilk
2014-12-18 10:49                                     ` Jan Beulich
2014-12-18 10:51                                       ` Ian Campbell
2014-12-19 12:36                                     ` Philipp Hahn
2015-01-06  7:19                                       ` Philipp Hahn
2015-03-12 12:08                                         ` Philipp Hahn
2015-03-12 18:17                                           ` Oleg Nesterov
2015-03-12 21:57                                             ` Philipp Hahn
2014-12-16 12:04                           ` Philipp Hahn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=548F60BF.4020901@univention.de \
    --to=hahn@univention.de \
    --cc=Ian.Campbell@citrix.com \
    --cc=Ian.Jackson@eu.citrix.com \
    --cc=Xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.