From: Steve Traugott <stevegt@TerraLuna.Org>
To: Keir Fraser <Keir.Fraser@cl.cam.ac.uk>
Cc: xen-devel <xen-devel@lists.xensource.com>
Subject: blocking Xen 3.X production use: soft lockup bugs
Date: Wed, 2 Aug 2006 13:54:49 -0700
Message-ID: <20060802205449.GA17411@terraluna.org>

Hi All,

I hate to say it, but it's starting to look like soft lockup bug(s)
are turning into a serious roadblock for general production use of Xen
3.X, on a wide range of hardware.  I've been using Xen since the 1.0
days, and I have to say that this is the most serious showstopper bug
I've ever hit -- it usually manifests itself during the first
significant network and/or disk I/O after starting a second or third
domU on the same box, and is the only bug I've ever hit that has
caused permanent damage -- it tends to corrupt guest filesystems.  In
my case it's stopped a deployment dead in its tracks, and our only
options at this point are to go back to Xen 2.X or (horrors) to native
Linux kernels.

The problem (or something that looks identical) is described in
several tickets, status currently NEW or REOPENED, no clear
resolution:
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=543
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=690
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=697
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=705

In our own shop, we consistently hit soft lockups while running on
both IBM x330's and older Netengines (similar to an IBM 4000R).  We've
found no workaround.  We're on xen-3.0-testing, changeset 9732, kernel
2.6.16.13.  On April 6th, Keir posted a note saying this was fixed as
of a blkif_schedule() fix, which we already have because that was way
back in changeset 9587...
http://lists.xensource.com/archives/html/xen-devel/2006-04/msg00121.html

The most recent devel list traffic I've found which covers this is
July 7th:
http://lists.xensource.com/archives/html/xen-users/2006-07/msg00134.html
...this message referred back to Keir's comment as describing a fix,
but that doesn't look true; while Keir's 9587 checkin may have fixed a
soft lockup problem, there appear to be more out there, or else
there's been a regression.

Do we have any consensus that this bug is fixed at all in
xen-3.0-testing, or even unstable?  Is anyone who was hitting soft
lockups in testing *not* hitting them any more on the same hardware?
If so, what changeset are you on now?

If anyone needs any more information, just let me know.  As usual, if
anyone wants login and console server access to one of these boxes to
chase this down, I'm more than happy to provide that.

Thanks, 

Steve
-- 
Stephen G. Traugott  (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org 
http://www.stevegt.com -- http://Infrastructures.Org
