All of lore.kernel.org
 help / color / mirror / Atom feed
From: jan.glauber@caviumnetworks.com (Jan Glauber)
To: linux-arm-kernel@lists.infradead.org
Subject: RCU stall with high number of KVM vcpus
Date: Mon, 13 Nov 2017 18:35:52 +0100	[thread overview]
Message-ID: <20171113173552.GA13282@hc> (raw)
In-Reply-To: <2832f775-3cbe-d984-fe4f-e018642b6f1d@arm.com>

On Mon, Nov 13, 2017 at 01:47:38PM +0000, Marc Zyngier wrote:
> On 13/11/17 13:10, Jan Glauber wrote:
> > I'm seeing RCU stalls in the host with 4.14 when I run KVM on ARM64 (ThunderX2) with a high
> > number of vcpus (60). I only use one guest that does kernel compiles in
> 
> Is that only reproducible on 4.14? With or without VHE? Can you
> reproduce this on another implementation (such as ThunderX-1)?

I've reproduced it on a distro 4.13 and several vanilla 4.14 rc's and
tip/locking. VHE is enabled. I've not yet tried to reproduce it with
older kernels or ThunderX-1. I can check if it happens also on ThunderX-1.

> > a loop. After some hours (less likely the more debugging options are
> > enabled, more likely with more vcpus) RCU stalls are happening in both host & guest.
> > 
> > Both host & guest recover after some time, until the issue is triggered
> > again. 
> > 
> > Stack traces in the guest are next to useless, everything is messed up
> > there. The host seems to stave on kvm->mmu_lock spin lock, the lock_stat
> 
> Please elaborate. Messed in what way? Corrupted? The guest crashing? Or
> is that a tooling issue?

Every vcpu that oopses prints one line in parallel, so I get blocks like:
[58880.179814] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179834] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179847] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179873] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179893] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179911] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179917] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180288] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180303] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180336] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180363] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180384] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180415] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180461] [<ffff000008084b98>] ret_from_fork+0x10/0x18

I can send the full log if you want to have a look.

> > numbers don't look good, see waittime-max:
> > 
> > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >                               class name    con-bounces    contentions   waittime-min   waittime-max waittime-total   waittime-avg    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total   holdtime-avg
> > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > 
> >                 &(&kvm->mmu_lock)->rlock:      99346764       99406604           0.14  1321260806.59 710654434972.0        7148.97      154228320      225122857           0.13   917688890.60  3705916481.39          16.46
> >                 ------------------------
> >                 &(&kvm->mmu_lock)->rlock       99365598          [<ffff0000080b43b8>] kvm_handle_guest_abort+0x4c0/0x950
> >                 &(&kvm->mmu_lock)->rlock          25164          [<ffff0000080a4e30>] kvm_mmu_notifier_invalidate_range_start+0x70/0xe8
> >                 &(&kvm->mmu_lock)->rlock          14934          [<ffff0000080a7eec>] kvm_mmu_notifier_invalidate_range_end+0x24/0x68
> >                 &(&kvm->mmu_lock)->rlock            908          [<ffff00000810a1f0>] __cond_resched_lock+0x68/0xb8
> >                 ------------------------
> >                 &(&kvm->mmu_lock)->rlock              3          [<ffff0000080b34c8>] stage2_flush_vm+0x60/0xd8
> >                 &(&kvm->mmu_lock)->rlock       99186296          [<ffff0000080b43b8>] kvm_handle_guest_abort+0x4c0/0x950
> >                 &(&kvm->mmu_lock)->rlock         179238          [<ffff0000080a4e30>] kvm_mmu_notifier_invalidate_range_start+0x70/0xe8
> >                 &(&kvm->mmu_lock)->rlock          19181          [<ffff0000080a7eec>] kvm_mmu_notifier_invalidate_range_end+0x24/0x68
> > 
> > .............................................................................................................................................................................................................................
> [slots of stuff]
> 
> Well, the mmu_lock is clearly contended. Is the box in a state where you
> are swapping? There seem to be as many faults as contentions, which is a
> bit surprising...

I don't think it is swapping but need to double check.

> Also, we recently moved arm64 to qrwlocks, which may have an impact.
> Care to give this[1] a go and report the result?

Sure, I that was my first suspision but I can reproduce the issue with
and without the qrwlock patches, so these are not to blame. Also, the
starving lock is a spinlock and not a qrwlock. So maybe the spinlocks
have fairness issues too?

--Jan

> Thanks,
> 
> 	M.
> 
> [1]: https://lkml.org/lkml/2017/10/12/266
> -- 
> Jazz is not dead. It just smells funny...

WARNING: multiple messages have this Message-ID (diff)
From: Jan Glauber <jan.glauber@caviumnetworks.com>
To: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-arm-kernel@lists.infradead.org,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Christoffer Dall" <christoffer.dall@linaro.org>,
	kvm@vger.kernel.org, "Radim Krčmář" <rkrcmar@redhat.com>
Subject: Re: RCU stall with high number of KVM vcpus
Date: Mon, 13 Nov 2017 18:35:52 +0100	[thread overview]
Message-ID: <20171113173552.GA13282@hc> (raw)
In-Reply-To: <2832f775-3cbe-d984-fe4f-e018642b6f1d@arm.com>

On Mon, Nov 13, 2017 at 01:47:38PM +0000, Marc Zyngier wrote:
> On 13/11/17 13:10, Jan Glauber wrote:
> > I'm seeing RCU stalls in the host with 4.14 when I run KVM on ARM64 (ThunderX2) with a high
> > number of vcpus (60). I only use one guest that does kernel compiles in
> 
> Is that only reproducible on 4.14? With or without VHE? Can you
> reproduce this on another implementation (such as ThunderX-1)?

I've reproduced it on a distro 4.13 and several vanilla 4.14 rc's and
tip/locking. VHE is enabled. I've not yet tried to reproduce it with
older kernels or ThunderX-1. I can check if it happens also on ThunderX-1.

> > a loop. After some hours (less likely the more debugging options are
> > enabled, more likely with more vcpus) RCU stalls are happening in both host & guest.
> > 
> > Both host & guest recover after some time, until the issue is triggered
> > again. 
> > 
> > Stack traces in the guest are next to useless, everything is messed up
> > there. The host seems to stave on kvm->mmu_lock spin lock, the lock_stat
> 
> Please elaborate. Messed in what way? Corrupted? The guest crashing? Or
> is that a tooling issue?

Every vcpu that oopses prints one line in parallel, so I get blocks like:
[58880.179814] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179834] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179847] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179873] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179893] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179911] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.179917] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180288] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180303] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180336] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180363] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180384] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180415] [<ffff000008084b98>] ret_from_fork+0x10/0x18
[58880.180461] [<ffff000008084b98>] ret_from_fork+0x10/0x18

I can send the full log if you want to have a look.

> > numbers don't look good, see waittime-max:
> > 
> > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >                               class name    con-bounces    contentions   waittime-min   waittime-max waittime-total   waittime-avg    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total   holdtime-avg
> > -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > 
> >                 &(&kvm->mmu_lock)->rlock:      99346764       99406604           0.14  1321260806.59 710654434972.0        7148.97      154228320      225122857           0.13   917688890.60  3705916481.39          16.46
> >                 ------------------------
> >                 &(&kvm->mmu_lock)->rlock       99365598          [<ffff0000080b43b8>] kvm_handle_guest_abort+0x4c0/0x950
> >                 &(&kvm->mmu_lock)->rlock          25164          [<ffff0000080a4e30>] kvm_mmu_notifier_invalidate_range_start+0x70/0xe8
> >                 &(&kvm->mmu_lock)->rlock          14934          [<ffff0000080a7eec>] kvm_mmu_notifier_invalidate_range_end+0x24/0x68
> >                 &(&kvm->mmu_lock)->rlock            908          [<ffff00000810a1f0>] __cond_resched_lock+0x68/0xb8
> >                 ------------------------
> >                 &(&kvm->mmu_lock)->rlock              3          [<ffff0000080b34c8>] stage2_flush_vm+0x60/0xd8
> >                 &(&kvm->mmu_lock)->rlock       99186296          [<ffff0000080b43b8>] kvm_handle_guest_abort+0x4c0/0x950
> >                 &(&kvm->mmu_lock)->rlock         179238          [<ffff0000080a4e30>] kvm_mmu_notifier_invalidate_range_start+0x70/0xe8
> >                 &(&kvm->mmu_lock)->rlock          19181          [<ffff0000080a7eec>] kvm_mmu_notifier_invalidate_range_end+0x24/0x68
> > 
> > .............................................................................................................................................................................................................................
> [slots of stuff]
> 
> Well, the mmu_lock is clearly contended. Is the box in a state where you
> are swapping? There seem to be as many faults as contentions, which is a
> bit surprising...

I don't think it is swapping but need to double check.

> Also, we recently moved arm64 to qrwlocks, which may have an impact.
> Care to give this[1] a go and report the result?

Sure, I that was my first suspision but I can reproduce the issue with
and without the qrwlock patches, so these are not to blame. Also, the
starving lock is a spinlock and not a qrwlock. So maybe the spinlocks
have fairness issues too?

--Jan

> Thanks,
> 
> 	M.
> 
> [1]: https://lkml.org/lkml/2017/10/12/266
> -- 
> Jazz is not dead. It just smells funny...

  reply	other threads:[~2017-11-13 17:35 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20171113131000.GA10546@hc>
2017-11-13 13:47 ` RCU stall with high number of KVM vcpus Marc Zyngier
2017-11-13 13:47   ` Marc Zyngier
2017-11-13 17:35   ` Jan Glauber [this message]
2017-11-13 17:35     ` Jan Glauber
2017-11-13 18:11     ` Marc Zyngier
2017-11-13 18:11       ` Marc Zyngier
2017-11-13 18:40       ` Jan Glauber
2017-11-13 18:40         ` Jan Glauber
2017-11-14 13:30         ` Marc Zyngier
2017-11-14 13:30           ` Marc Zyngier
2017-11-14 14:19           ` Jan Glauber
2017-11-14 14:19             ` Jan Glauber
2017-11-14  7:52       ` Jan Glauber
2017-11-14  7:52         ` Jan Glauber
2017-11-14  8:49         ` Marc Zyngier
2017-11-14  8:49           ` Marc Zyngier
2017-11-14 11:34           ` Suzuki K Poulose
2017-11-14 11:34             ` Suzuki K Poulose
2017-11-13 18:13     ` Shameerali Kolothum Thodi
2017-11-13 18:13       ` Shameerali Kolothum Thodi
2017-11-14  7:49       ` Jan Glauber
2017-11-14  7:49         ` Jan Glauber

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171113173552.GA13282@hc \
    --to=jan.glauber@caviumnetworks.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.