From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Christopher Pereira <kripper@imatronix.cl>
Cc: kvm@vger.kernel.org
Subject: Re: "BUG: soft lockup" and frozen guest
Date: Tue, 30 Apr 2019 12:49:15 +0200
Message-ID: <874l6f6a50.fsf@vitty.brq.redhat.com>
In-Reply-To: <ba7deff9-6a29-9514-642f-99b3f7cd8fe1@imatronix.cl>

Christopher Pereira <kripper@imatronix.cl> writes:

> On April 29, 2019 7:56:44 AM AST, Vitaly Kuznetsov <vkuznets@redhat.com> 
> wrote:
>
>     Christopher Pereira <kripper@imatronix.cl> writes:
>
>         Hi, I have been experiencing some random guest crashes over the
>         last few years and would like to invest some time in trying to
>         debug them with your help. The symptoms are:
>
>         1) "BUG: soft lockup" and "CPU#* stuck for *s!" messages during
>            high load on the guest.
>         2) At some point later (e.g. 12 hours later), the guest just
>            hangs without any message and must be destroyed / rebooted.
>
>         I attached the relevant kernel messages.
>
>         Host (spec: Intel(R) Xeon(R) CPU E5645) is running:
>
>           kernel-3.10.0-327.el7.x86_64
>           libvirt-daemon-kvm-1.2.17-13.el7_2.5.x86_64
>           qemu-kvm-ev-2.3.0-31.el7_2.10.1.x86_64
>           qemu-kvm-common-ev-2.3.0-31.el7_2.10.1.x86_64
>
>
>     This is pretty old stuff, e.g. kernel-3.10.0-327.el7 was released with
>     RHEL-7.2 (Nov 2015). As this is an upstream mailing list, it would be
>     great if you could build an upstream kernel (should work with EL7
>     userspace) and try to reproduce.
>
> Hi Vitaly,
>
> Yes, but it's a critical production environment and I haven't seen any 
> related patch in the kernel changelog since 3.10. We will try to upgrade 
> whenever possible.

It's hard to tell which changes may be related here (for example, I
also see nf_conntrack_* in your trace and the bug may as well be there),
but even in the RHEL-7.2 updates (kernel-3.10.0-327.*) I can see several
dozen KVM commits (and there are several hundred between 7.2 and 7.6).

>
> I believe this bug could be related to overcommitting resources. Does 
> qemu-kvm throw any log message when resources are overcommitted? Is there 
> some way to enable this?
>
> We have seen this happening once in a while in the last 4 years on 
> different production hardware and wanted to ask if this is a common 
> issue and how to address/debug this issue.


Define "resources" and "overcommit" ;-) If you overcommit CPUs/memory
severely (like dozens/hundreds of vCPUs per pCPU, or a host that is
constantly swapping), guests may, of course, start to misbehave. If it
is just a couple of vCPUs per pCPU and the host is not swapping, guest
soft lockups are not normal.
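
To put a number on "a couple of vCPUs per pCPU", a rough sketch like the
following could be run on the host. It assumes libvirt's virsh is
installed and that `virsh dominfo` prints a `CPU(s):` line for each
guest; the function names are made up for illustration and are not part
of any tool.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of any tool).

# Print the vCPU:pCPU ratio with two decimals, e.g. "overcommit_ratio 8 4".
overcommit_ratio() {
    awk -v v="$1" -v p="$2" 'BEGIN { printf "%.2f\n", v / p }'
}

# Sum the vCPUs of all running libvirt guests (assumes virsh is installed).
total_vcpus() {
    total=0
    for dom in $(virsh list --name); do
        n=$(virsh dominfo "$dom" | awk '/^CPU\(s\):/ { print $2 }')
        total=$((total + ${n:-0}))
    done
    echo "$total"
}

# Only query libvirt when virsh is actually present on this machine.
if command -v virsh >/dev/null 2>&1; then
    echo "vCPU:pCPU ratio: $(overcommit_ratio "$(total_vcpus)" "$(nproc)")"
fi
```

For the memory side, watching the si/so columns of `vmstat 1` on the
host for a while should show whether it is swapping.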

If there's no way to trigger the issue on demand, you may want to enable
kdump and set

sysctl -w kernel.softlockup_panic=1
sysctl -w kernel.softlockup_all_cpu_backtrace=1

and then inspect the crash dump.
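
Settings made with `sysctl -w` are lost on reboot, so for a problem that
may take days to show up it can help to persist them. The following is
only a sketch: it assumes the settings go into the guest (where the
lockups are reported) and that the guest reads fragments from
/etc/sysctl.d; the helper name and the file name are made up.

```shell
#!/bin/sh
# Write the two settings to a sysctl configuration fragment.
# $1 is the destination file; the name used below is only an example.
write_softlockup_conf() {
    cat > "$1" <<'EOF'
kernel.softlockup_panic = 1
kernel.softlockup_all_cpu_backtrace = 1
EOF
}

# As root inside the guest (assumed paths, adjust for your distribution):
#   write_softlockup_conf /etc/sysctl.d/90-softlockup.conf
#   sysctl --system                      # reload all sysctl fragments
#   cat /sys/kernel/kexec_crash_loaded   # 1 means a kdump kernel is loaded
```

Once a vmcore has been captured, opening it with the crash utility
(which needs the matching kernel debuginfo) and running `bt -a` and
`log` is a reasonable starting point.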

-- 
Vitaly

