From: Ian Campbell <Ian.Campbell@citrix.com>
To: Philipp Hahn <hahn@univention.de>
Cc: Xen-devel@lists.xen.org
Subject: Re: xenstored crashes with SIGSEGV
Date: Thu, 13 Nov 2014 09:12:31 +0000
Message-ID: <1415869951.31613.26.camel@citrix.com>
In-Reply-To: <546461A2.2070908@univention.de>

On Thu, 2014-11-13 at 08:45 +0100, Philipp Hahn wrote:
> To me this looks like some memory corruption by some unknown code
> writing into some random memory space, which happens to be the tdb here.

I wonder if running xenstored under valgrind would be useful. I think
you'd want to stop xenstored from starting during normal boot and then
launch it with:
        valgrind /usr/local/sbin/xenstored -N
-N keeps it in the foreground, so you might want to do this in a screen
session or something. Alternatively you could investigate the --log-*
options in the valgrind manpage, together with the various
--trace-children* options, in order to follow the process across its
daemonization.
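
For the non-foreground variant, a minimal sketch of what such an
invocation might look like is below. The log directory and file name
are assumptions chosen for illustration, not anything from this thread;
--log-file and --trace-children are documented valgrind options, but
check the manpage of your valgrind version before relying on them.

```shell
#!/bin/sh
# Sketch only. LOG_DIR and the log file name are assumptions for illustration;
# the xenstored path is the one used in the command above.
LOG_DIR="${LOG_DIR:-/var/log/xenstored-valgrind}"
XENSTORED="${XENSTORED:-/usr/local/sbin/xenstored}"

# Print the valgrind command line instead of running it, so it can be
# reviewed first. valgrind replaces %p in --log-file with the PID, so every
# process created during daemonization writes to its own log file.
build_valgrind_cmd() {
    printf 'valgrind --trace-children=yes --log-file=%s/xenstored.%%p.log %s\n' \
        "$LOG_DIR" "$XENSTORED"
}

build_valgrind_cmd
```

Once the printed command looks right, create the log directory with
mkdir -p "$LOG_DIR" and run the command by hand as root.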

I'm not sure what the impact on the system would be with this, but I
think it is probably OK unless you have a massive xenstore load.

You'll need a version of valgrind with xen support in it, anything from
the last year or so should do I think.

Other than that we don't really have anyone who is an expert in that
aspect of the C xenstore/tdb who we can lean on for pointers (no pun
intended) etc, so in the absence of some sort of ability to trigger on
demand I'm not sure what else to suggest.

> 1. Has someone observed a similar crash?

I think you are the only one I've seen reporting this.

> 2. We've now also enabled "xenstored -T /log --verbose" to log the
> messages in the hope to find the triggering transaction, but until then
> is there something more we can do to track down the problem?
> 
> 3. the crash happens rarely and the host run fine most of the time. The
> crash mostly happens around midnight and seem to be guest-triggered, as
> the logs on the host don't show any activity like starting new or
> destroying running VMs. So far the problem only showed on host running
> Linux VMs. Other host running Windows VMs so far never showed that crash.

If it is really mostly happening around midnight then it might be worth
digging into the host and guest configs for cronjobs and the like, e.g.
log rotation or similar, which might be tweaking things somehow.
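
One rough way to look for such jobs is to filter crontab-format lines
for an hour field of 23 or 0. The helper below is a hypothetical sketch
(the name midnight_jobs is made up here); it only matches literal hour
values plus the @daily and @midnight shortcuts, so entries using "*",
ranges or lists in the hour field would still need checking by hand.

```shell
#!/bin/sh
# Hypothetical helper: read crontab-format lines on stdin and print the
# ones scheduled in the hour before or after midnight.
midnight_jobs() {
    awk '
        # Skip comments and blank lines.
        /^[ \t]*(#|$)/ { next }
        # Shortcut entries that run at midnight.
        $1 == "@daily" || $1 == "@midnight" { print; next }
        # Regular entries: field 2 is the hour (needs 5 time fields + command).
        NF >= 6 && ($2 == "0" || $2 == "00" || $2 == "23") { print }
    '
}
```

Typical use would be something like "crontab -l | midnight_jobs" in a
guest, or "cat /etc/crontab /etc/cron.d/* | midnight_jobs" on the host.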

Does this happen on multiple hosts, or just the one?

Do you rm the xenstore db on boot? It might have a persistent
corruption. AIUI most folks using C xenstored do so, or even
place it on a tmpfs for performance reasons.
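
As a sketch of the cleanup step, assuming the database lives in files
named tdb* under /var/lib/xenstored (an assumption here; check where
your xenstored actually keeps it), something like the following could
run before xenstored starts. The function name is made up for this
example.

```shell
#!/bin/sh
# Assumed default location of the C xenstored database; override if yours differs.
XENSTORED_ROOTDIR="${XENSTORED_ROOTDIR:-/var/lib/xenstored}"

# Remove any stale tdb files in the given directory, leaving everything
# else in place. Must only be called while xenstored is not running.
clean_xenstore_db() {
    rm -f "$1"/tdb*
}

# Example call, left commented out on purpose:
# clean_xenstore_db "$XENSTORED_ROOTDIR"
```

For the tmpfs variant, mounting a tmpfs on that directory before
xenstored starts (e.g. "mount -t tmpfs tmpfs /var/lib/xenstored") has
the same effect of starting from an empty database on every boot.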

If you are running 4.1.x then I think oxenstored isn't an option, but it
might be something to consider when you upgrade.

Ian.
