* [Bug 191481] New: Virtual machine CPU counters are broken forever after live migration to system with "steal time overflow" KVM bug
@ 2016-12-29 16:21 bugzilla-daemon
2017-01-01 1:23 ` [Bug 191481] " bugzilla-daemon
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: bugzilla-daemon @ 2016-12-29 16:21 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=191481
Bug ID: 191481
Summary: Virtual machine CPU counters are broken forever after
live migration to system with "steal time overflow"
KVM bug
Product: Virtualization
Version: unspecified
Kernel Version: Present since 3.18.31; still present in the latest
Ubuntu 3.19..4.8 builds
Hardware: Intel
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: kvm
Assignee: virtualization_kvm@kernel-bugs.osdl.org
Reporter: ds@vo-ix.ru
Regression: No
All kernels (at least those built for Ubuntu) had the steal time overflow bug
described in the links below:
http://lists.gnu.org/archive/html/qemu-devel/2015-06/msg01295.html
https://bugs.launchpad.net/linux/+bug/1494350
It was fixed in Ubuntu in summer 2016, so affected hosts still exist
today.
Kernels built before April 2016 pass the "%steal" counter through as is, so sar
output looks like this:
11:11:15 AM CPU %user %nice %system %iowait %steal %idle
11:10:48 AM all 0.00 0.00 0.50 0.00 0.00 99.50
11:10:49 AM all 0.00 0.00 0.00 0.00 18823208238479134720.00 203.06
11:10:50 AM all 0.00 0.00 0.00 0.00 0.00 100.00
With all newer kernels I have tested, the counter sticks instead:
11:11:15 AM CPU %user %nice %system %iowait %steal %idle
07:07:56 AM all 0.50 0.00 0.00 0.00 0.00 99.50
07:07:57 AM all 0.00 0.00 0.50 0.00 0.00 99.50
07:07:58 AM all 0.00 0.00 0.50 0.00 0.00 99.50
07:07:59 AM all 0.00 0.00 0.00 0.00 100.00 0.00
07:08:00 AM all 0.00 0.00 0.00 0.00 100.00 0.00
07:08:01 AM all 0.00 0.00 0.00 0.00 100.00 0.00
%steal is always 100% after migration.
I suspect, without strong evidence, that the cause is commit
0185604c2d82c560dab2f2933a18f797e74ab5a8.
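The out-of-range %steal figure in the first sar sample above is consistent with a counter difference wrapping around in unsigned 64-bit arithmetic when the steal counter moves backwards, although this report does not confirm the exact arithmetic. A minimal Python sketch of that effect follows; the counter values, the interval length, and the function names are made up for illustration and are not taken from sysstat or the kernel.

```python
# Illustration only: how a counter that moves backwards produces an
# absurd percentage once the difference is taken in unsigned 64-bit math.
# The sample values and function names are hypothetical.

U64_MASK = (1 << 64) - 1


def u64_delta(prev: int, curr: int) -> int:
    """Return curr - prev the way an unsigned 64-bit integer would store it."""
    return (curr - prev) & U64_MASK


def percent(delta: int, interval: int) -> float:
    """Return delta as a percentage of the sampling interval."""
    return 100.0 * delta / interval


if __name__ == "__main__":
    interval = 100  # ticks in one sampling interval (assumed)

    # Normal case: the counter advanced by 5 ticks.
    print(percent(u64_delta(1000, 1005), interval))

    # Broken case: the counter dropped by 5 ticks, so the delta wraps
    # to a value just below 2**64 and the percentage explodes.
    print(percent(u64_delta(1000, 995), interval))
```

A monitoring tool that guards against this would have to treat a decreasing counter as a reset rather than subtract blindly.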
--
You are receiving this mail because:
You are watching the assignee of the bug.
* [Bug 191481] Virtual machine CPU counters are broken forever after live migration to system with "steal time overflow" KVM bug
From: bugzilla-daemon @ 2017-01-01 1:23 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=191481
Wanpeng Li <wanpeng.li@hotmail.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |wanpeng.li@hotmail.com
--- Comment #1 from Wanpeng Li <wanpeng.li@hotmail.com> ---
Could you test commit 2348140d58f4 (KVM: Fix steal clock warp during guest CPU
hotplug)? The commit should be applied to the guest kernel.
Regards,
Wanpeng Li
* [Bug 191481] Virtual machine CPU counters are broken forever after live migration to system with "steal time overflow" KVM bug
From: bugzilla-daemon @ 2017-01-09 9:23 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=191481
--- Comment #2 from Dmitry Svyatogorov <ds@vo-ix.ru> ---
>git show 2348140d58f4
"""
commit 2348140d58f4f4245e9635ea8f1a77e940a4d877
Author: Wanpeng Li <wanpeng.li@hotmail.com>
Date: Mon Jun 13 18:32:44 2016 +0800
…
- memset(st, 0, sizeof(*st));
"""
So it just removes the zeroing of the st counter in kvm_register_steal_time() in
arch/x86/kernel/kvm.c?
OK, I'll try it shortly.
* [Bug 191481] Virtual machine CPU counters are broken forever after live migration to system with "steal time overflow" KVM bug
From: bugzilla-daemon @ 2017-01-10 12:31 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=191481
--- Comment #3 from Dmitry Svyatogorov <ds@vo-ix.ru> ---
I compiled and installed kernel 3.18.46 with 2348140d58f4 applied in an Ubuntu
VM, but no luck: it still falls into st=100%.
# dmesg -T
…
[Tue Jan 10 08:13:09 2017] Linux version 3.18.46-031846-generic (root@repo1) (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04) ) #201612271322 SMP Tue Jan 10 07:24:30 AST 2017
[Tue Jan 10 08:13:09 2017] Command line: BOOT_IMAGE=/boot/vmlinuz-3.18.46-031846-generic root=UUID=9a370490-4f38-4714-a3b1-c51f5562a5b1 ro transparent_hugepage=never vga=792 console=ttyS0 console=tty0
…
# sar 1 600
08:21:55 AM CPU %user %nice %system %iowait %steal %idle
…
08:22:25 AM all 0.00 0.00 0.50 0.00 0.00 99.50
08:22:26 AM all 0.00 0.00 0.00 0.00 100.00 0.00
08:22:27 AM all 0.00 0.00 0.00 0.00 100.00 0.00
08:22:28 AM all 0.00 0.00 0.00 0.00 100.00 0.00
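A stuck counter like the one above is easy to flag automatically. The following Python sketch is a heuristic, not part of sar: it assumes the row layout quoted in this thread (time, AM/PM, CPU, then six percentages), and the function names and the tolerance are hypothetical.

```python
# Illustration only: flag sar CPU samples whose percentages cannot be right.
# The row layout (time, AM/PM, CPU, then six percentages) matches the output
# quoted in this thread; the function names and the tolerance are assumptions.

NAMES = ["user", "nice", "system", "iowait", "steal", "idle"]


def parse_row(line: str) -> dict:
    """Split one sar data row into a dict of percentage columns."""
    parts = line.split()
    values = [float(x) for x in parts[3:9]]
    return dict(zip(NAMES, values))


def is_suspicious(row: dict, tolerance: float = 1.0) -> bool:
    """True if a value is outside 0..100, the row does not sum to
    about 100, or %steal is pegged at 100."""
    values = list(row.values())
    out_of_range = any(v < 0.0 or v > 100.0 for v in values)
    bad_sum = abs(sum(values) - 100.0) > tolerance
    pegged = row["steal"] >= 100.0
    return out_of_range or bad_sum or pegged


if __name__ == "__main__":
    rows = [
        "08:22:25 AM all 0.00 0.00 0.50 0.00 0.00 99.50",
        "08:22:26 AM all 0.00 0.00 0.00 0.00 100.00 0.00",
    ]
    for line in rows:
        print(line, "->", is_suspicious(parse_row(line)))
```

A single sample with %steal at 100 can be legitimate on a badly overcommitted host, so in practice the check is more useful when it fires on many consecutive samples.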
* [Bug 191481] Virtual machine CPU counters are broken forever after live migration to system with "steal time overflow" KVM bug
From: bugzilla-daemon @ 2017-01-11 1:21 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=191481
--- Comment #4 from Wanpeng Li <wanpeng.li@hotmail.com> ---
Please post the output of "cat /proc/stat | grep cpu" from the guest.
In addition, can this be reproduced against the latest kvm tree?
* [Bug 191481] Virtual machine CPU counters are broken forever after live migration to system with "steal time overflow" KVM bug
From: bugzilla-daemon @ 2017-01-11 9:00 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=191481
--- Comment #5 from Dmitry Svyatogorov <ds@vo-ix.ru> ---
# cat /proc/stat | grep cpu
cpu 413 0 352 14180440 27163 0 3 1754446170805 0 0
cpu0 212 0 200 7090896 11611 0 2 285245288023 0 0
cpu1 201 0 152 7089543 15551 0 1 1469200882781 0 0
* The guest has not been rebooted since yesterday, so the counters remain broken.
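To read the dump above: the eighth number on each "cpu" line of /proc/stat is the steal counter, in USER_HZ ticks. The Python sketch below extracts it and converts it to years; the tick rate of 100 per second is an assumption (check with `getconf CLK_TCK`), and the function names are hypothetical.

```python
# Illustration only: pull the steal column out of /proc/stat "cpu" lines.
# Field order after the label: user nice system idle iowait irq softirq
# steal guest guest_nice. Values are in USER_HZ ticks; 100 ticks per second
# is assumed here, not read from the system.

USER_HZ = 100  # assumption

FIELDS = ["user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice"]

SECONDS_PER_YEAR = 365 * 24 * 3600


def parse_cpu_line(line: str) -> tuple:
    """Return (label, {field: ticks}) for one "cpu" line."""
    label, *values = line.split()
    return label, dict(zip(FIELDS, (int(v) for v in values)))


def ticks_to_years(ticks: int, hz: int = USER_HZ) -> float:
    """Convert a tick count to years at the given tick rate."""
    return ticks / hz / SECONDS_PER_YEAR


# The three lines quoted in this comment.
SAMPLE = """\
cpu 413 0 352 14180440 27163 0 3 1754446170805 0 0
cpu0 212 0 200 7090896 11611 0 2 285245288023 0 0
cpu1 201 0 152 7089543 15551 0 1 1469200882781 0 0
"""

if __name__ == "__main__":
    for line in SAMPLE.splitlines():
        label, stats = parse_cpu_line(line)
        print(label, "steal ticks:", stats["steal"],
              "years:", round(ticks_to_years(stats["steal"]), 1))
```

Under that assumption the accumulated steal time works out to several hundred years on a guest that has been up for less than a day, which shows the counter itself is corrupted rather than the tool that displays it.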
Do you mean the "latest kvm tree" on the guest side? So far I have tested, on
Ubuntu only, the latest kernel branches available for Ubuntu 14.04
(4.8.15-040815-generic, 4.4.0-53-generic, 4.2.0-42-generic, 3.19.0-77-generic)
to make sure the problem is in mainline. (As far as I know, the Ubuntu team does
not modify KVM, so this check looks sufficient.)
At this moment I have a simple plan to localize the buggy commit:
1. I already found that, in the 3.18 series, the bug was introduced in 3.18.31.
So, check out 3.18.31, then revert both kvm commits.
2. If the test succeeds, only a rather small amount of code remains to
investigate.
3. Otherwise, think again :)
* [Bug 191481] Virtual machine CPU counters are broken forever after live migration to system with "steal time overflow" KVM bug
From: bugzilla-daemon @ 2017-01-11 11:15 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=191481
--- Comment #6 from Dmitry Svyatogorov <ds@vo-ix.ru> ---
After reviewing the 3.18.31 changelog, the two most suspicious candidates for
causing the breakage are:
f9c904b7613b8b4c85b10cd6b33ad41b2843fa9d "Fix steal_account_process_tick() to
always return jiffies" refactors steal time accounting; this is the best fit.
0185604c2d82c560dab2f2933a18f797e74ab5a8 changes PIT counter initialization; I
don't know whether it can have any effect here.
* [Bug 191481] Virtual machine CPU counters are broken forever after live migration to system with "steal time overflow" KVM bug
From: bugzilla-daemon @ 2017-01-11 14:04 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=191481
Paolo Bonzini <bonzini@gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
CC| |bonzini@gnu.org
Resolution|--- |INVALID
--- Comment #7 from Paolo Bonzini <bonzini@gnu.org> ---
Please open a bug in launchpad, not here.
* [Bug 191481] Virtual machine CPU counters are broken forever after live migration to system with "steal time overflow" KVM bug
From: bugzilla-daemon @ 2017-01-12 11:53 UTC (permalink / raw)
To: kvm
https://bugzilla.kernel.org/show_bug.cgi?id=191481
--- Comment #8 from Dmitry Svyatogorov <ds@vo-ix.ru> ---
LOL! So bugzilla.kernel.org is now an invalid place to fix commits merged into
the master branch.
JFYI, yesterday the problem was traced to
f9c904b7613b8b4c85b10cd6b33ad41b2843fa9d. Kernel 3.18.46 with this commit
reverted has been running under test for ≈20 hours, with no additional problems
found so far.
For those who found this thread because they are stuck with the same problem:
you can safely rebuild the problematic kernel with "git revert
f9c904b7613b8b4c85b10cd6b33ad41b2843fa9d", or create and apply a reverting
patch, depending on your distro/build system. The loss of accuracy in steal
time calculation will hardly be missed.
Good luck!