From: Philipp Hahn <hahn@univention.de>
To: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Xen-devel@lists.xen.org
Subject: Re: xenstored crashes with SIGSEGV
Date: Fri, 12 Dec 2014 17:45:28 +0100
Message-ID: <548B1BA8.3090504@univention.de>
In-Reply-To: <1418401932.16425.34.camel@citrix.com>

Hello Ian,

On 12.12.2014 17:32, Ian Campbell wrote:
> On Fri, 2014-12-12 at 17:14 +0100, Philipp Hahn wrote:
>> We enabled tracing and now have the xenstored-trace.log of one crash:
>> It contains 1.6 billion lines and is 83 GiB.
>> It just shows xenstored crashing on TRANSACTION_START.
>>
>> Is there some tool to feed that trace back into a newly launched xenstored?
> 
> Not that I know of, I'm afraid.

Okay, then I have to continue with my own tool.
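
As a first step for such a tool, the end of the trace can be inspected
without reading all 83 GiB. A minimal sketch in Python, which makes no
assumption about the trace line format; the function name tail_lines and
the block size are only illustrative:

```python
import os


def tail_lines(path, count=20, block_size=1 << 20):
    """Return the last `count` lines of a file, reading backwards in blocks."""
    if count <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Read blocks from the end until more than `count` newlines are
        # buffered, so the last `count` lines are guaranteed to be complete.
        while pos > 0 and data.count(b"\n") <= count:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    lines = data.splitlines()
    return [line.decode("utf-8", errors="replace") for line in lines[-count:]]


if __name__ == "__main__":
    import sys

    # Usage: python tail_trace.py /path/to/xenstored-trace.log
    for line in tail_lines(sys.argv[1]):
        print(line)
```

Only the final blocks of the file are touched, so the run time does not
depend on the size of the trace.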

> Do you get a core dump when this happens? You might need to fiddle with
> ulimits (some distros disable them by default). IIRC there is also some
> /proc knob which controls where core dumps go on the filesystem.

Not for that specific trace: we first enabled core file generation, but
only then discovered that this is not enough. Then we enabled
--trace-file, but on that host something had reset the core file setting.
We have hopefully fixed all hosts now, so on the next crash we should get
both a core file and the trace.
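
To check whether a running xenstored would actually be allowed to dump
core, the limits of the live process and the kernel's core pattern can be
read from /proc. A rough sketch, assuming Linux; the helper names are
hypothetical and the PID has to be supplied by hand:

```python
from pathlib import Path

CORE_LIMIT_LABEL = "Max core file size"


def parse_core_limit(limits_text):
    """Extract (soft, hard) core size limits from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith(CORE_LIMIT_LABEL):
            fields = line[len(CORE_LIMIT_LABEL):].split()
            return fields[0], fields[1]
    return None


def core_settings(pid):
    """Read the core limit of a running process and the global core pattern."""
    limits = Path(f"/proc/{pid}/limits").read_text()
    pattern = Path("/proc/sys/kernel/core_pattern").read_text().strip()
    return {"core_limit": parse_core_limit(limits), "core_pattern": pattern}


if __name__ == "__main__":
    import sys

    # Usage: python core_check.py <pid of xenstored>
    print(core_settings(int(sys.argv[1])))
```

A soft limit of 0 means no core file will be written for that process,
regardless of what the shell's ulimit shows.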

>> My hope would be that xenstored crashes again, because then we could use
>> all those other tools like valgrind more easily.
> 
> That would be handy. My fear would be that this bug is likely to be a
> race condition of some sort, and the granularity/accuracy of the
> playback would possibly need to be quite high to trigger the issue.

cxenstored looks single-threaded to me, or am I wrong?

>>> Do you rm the xenstore db on boot? It might have a persistent
>>> corruption, aiui most folks using C xenstored are doing so or even
>>> placing it on a tmpfs for performance reasons.
>>
>> We're using a tmpfs for /var/lib/xenstored/, as we had a severe
>> performance problem with something updating
>> /local/domain/0/backend/console/*/0/uuid too often, which put xenstored
>> in permanent D state.
> 
> But this is just a process crashing and not the whole host so you still
> have the db file at the point of the crash?

Yes: Running xs_tdb_dump or tdb_dump on it didn't show anything
obviously wrong.

> It might be interesting to see what happens if you preserve the db and
> reboot arranging for the new xenstored to start with the old file. If
> the corruption is part of the file then maybe it can be induced to crash
> again more quickly.

Thanks for the pointer, will try.

Thank you again for your fast reply.
Philipp Hahn
