From: Glauber de Oliveira Costa <gcosta@redhat.com>
To: Keir Fraser <Keir.Fraser@cl.cam.ac.uk>
Cc: xen-devel@lists.xensource.com
Subject: Re: [PATCH] BUG() on soft lockup upon suspend/resume
Date: Mon, 9 Oct 2006 21:29:10 -0300
Message-ID: <20061010002910.GC28540@redhat.com>
In-Reply-To: <C1508A1A.2405%Keir.Fraser@cl.cam.ac.uk>

> 
> > In systems with vcpu > 1, a BUG due to a detected soft lockup seems to be
> > triggered after system resume/suspend. This is probably due to the lack of
> > seqlocking around the region that does the local time processing.
> 
> We do SMP save/restore tests regularly and do not see this issue. It ought
> to be avoided by the fact that, when we bring up a CPU, we
> touch_softlockup_watchdog() in cpu_bringup(), before enabling interrupts.
> For CPU0 on resume, the touch is done in time_resume() in
> arch/i386/kernel/time-xen.c.

This does not happen only once, when the system comes back. It happens a
lot afterwards. So even if the first touch is right, I suspect this issue
is more related to a situation in which we have already been resumed for a
long time, with everything set up.
> 
> I think we need to understand the issue you are hitting a bit more before
> deciding on the right fix.

Right, here is more info:

I'm on an 8-way x86_64 machine, and this is the sort of output I see
repeatedly:

BUG: soft lockup detected on CPU#1!

Call Trace:
 <IRQ>  [<ffffffff802ace9d>] softlockup_tick+0xf8/0x113
 [<ffffffff8026d591>] timer_interrupt+0x38a/0x3d8
 [<ffffffff80210e87>] handle_IRQ_event+0x2d/0x60
 [<ffffffff802ad1e6>] __do_IRQ+0xa5/0x107
 [<ffffffff8028be7a>] _local_bh_enable+0x61/0xc5
 [<ffffffff8026b4c9>] do_IRQ+0xe7/0xf5
 [<ffffffff8039386e>] evtchn_do_upcall+0x86/0xe0
 [<ffffffff8025e2a2>] do_hypervisor_callback+0x1e/0x2c
 <EOI>  [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff8026cb13>] raw_safe_halt+0x84/0xa8
 [<ffffffff8026a121>] xen_idle+0x38/0x4a
 [<ffffffff80248e66>] cpu_idle+0x97/0xba

It obviously never happens on CPU#0, but I see it on all the others (vcpus=4).
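
For anyone reading the trace above, here is a minimal user-space sketch of
the kind of check that prints this message. It is a simplified model, not
the kernel's code: touch_watchdog() and watchdog_tick() are hypothetical
stand-ins for touch_softlockup_watchdog() and softlockup_tick(), and the
HZ value, the 10 second threshold and the simulated jiffies counter are
assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4                    /* assumption: matches vcpus=4 above */
#define HZ 250                       /* assumption: timer ticks per second */
#define LOCKUP_TICKS (10UL * HZ)     /* assumption: 10 second threshold */

static unsigned long jiffies;                /* simulated global tick counter */
static unsigned long touch_stamp[NR_CPUS];   /* last check-in time per CPU */

/* Record that this CPU is making progress
 * (hypothetical model of touch_softlockup_watchdog()). */
static void touch_watchdog(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    touch_stamp[cpu] = jiffies;
}

/* Run from the timer tick (hypothetical model of softlockup_tick()).
 * Returns true if a lockup was reported for this CPU. */
static bool watchdog_tick(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);

    /* Unsigned subtraction keeps the comparison valid across wraparound. */
    if (jiffies - touch_stamp[cpu] > LOCKUP_TICKS) {
        printf("BUG: soft lockup detected on CPU#%d!\n", cpu);
        touch_stamp[cpu] = jiffies;  /* avoid repeating the same report */
        return true;
    }
    return false;
}
```

In this model the check fires whenever the tick counter moves more than
the threshold past a CPU's last touch, whether the CPU really stalled or
the counter itself jumped forward between touches.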

If you have any other ideas about what may be causing this, they are
very welcome. I'll keep investigating.


-- 
Glauber de Oliveira Costa
Red Hat Inc.
"Free as in Freedom"
