From: Igor Mammedov <imammedo@redhat.com>
To: Amit Shah <amit.shah@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
kvm@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com,
hpa@zytor.com, x86@kernel.org, johnstul@us.ibm.com,
riel@redhat.com, avi@redhat.com
Subject: Re: x86: kvmclock: abstract save/restore sched_clock_state
Date: Thu, 09 Feb 2012 16:13:29 +0100 [thread overview]
Message-ID: <4F33E299.3090500@redhat.com> (raw)
In-Reply-To: <20120209122742.GA25505@amit.redhat.com>
On 02/09/2012 01:27 PM, Amit Shah wrote:
> On (Tue) 07 Feb 2012 [19:05:42], Marcelo Tosatti wrote:
>>
>> Upon resume from hibernation, CPU 0's hvclock area contains the old
>> values for system_time and tsc_timestamp. It is necessary for the
>> hypervisor to update these values with uptodate ones before the CPU uses
>> them.
>>
>> Abstract TSC's save/restore sched_clock_state functions and use
>> restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
>>
>> Fixes suspend-to-disk with kvmclock.
>
> There are stalls after resume, see trace below.
>
> ./x86_64-softmmu/qemu-system-x86_64 -kernel ~/src/linux/arch/x86/boot/bzImage -append 'root=/dev/vda1 console=tty0 console=ttyS0 no_console_suspend' -drive file=/guests/f14-suspend.qcow2,if=none,id=dr0 -device virtio-blk-pci,drive=dr0 -net nic,model=virtio -net user -serial stdio -enable-kvm -m 512 -cpu host -smp 4
>
> <The following is on resume:>
>
> Disabling non-boot CPUs ...
> CPU 1 is now offline
> CPU 2 is now offline
> CPU 3 is now offline
> lockdep: fixing up alternatives.
> kvm-clock: cpu 0, msr 0:1f1d19c1, primary cpu clock, resume
> PM: Restoring platform NVS memory
> Enabling non-boot CPUs ...
> lockdep: fixing up alternatives.
> Booting Node 0 Processor 1 APIC 0x1
> Calibrating delay loop (skipped) already calibrated this CPU
> INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=314192 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
> <IRQ> [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
> [<ffffffff8108bef5>] rcu_check_callbacks+0xab/0x108
> [<ffffffff81031208>] update_process_times+0x3f/0x75
> [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
> [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
> [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
> [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
> [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
> [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
> <EOI> [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
> [<ffffffff8130840c>] native_cpu_up+0x108/0x127
> [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
> [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
> [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
> [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
> [<ffffffff81059e6d>] hibernate+0x97/0x196
> [<ffffffff81057eb0>] state_store+0x5c/0x106
> [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
> [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
> [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
> [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
> [<ffffffff810eb7ad>] sys_write+0x4d/0x74
> [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
> INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=314193 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
> <IRQ> [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
> [<ffffffff8108bf33>] rcu_check_callbacks+0xe9/0x108
> [<ffffffff81031208>] update_process_times+0x3f/0x75
> [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
> [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
> [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
> [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
> [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
> [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
> <EOI> [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
> [<ffffffff8130840c>] native_cpu_up+0x108/0x127
> [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
> [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
> [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
> [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
> [<ffffffff81059e6d>] hibernate+0x97/0x196
> [<ffffffff81057eb0>] state_store+0x5c/0x106
> [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
> [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
> [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
> [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
> [<ffffffff810eb7ad>] sys_write+0x4d/0x74
> [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
> INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=1713579840232 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
> <IRQ> [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
> [<ffffffff8108bef5>] rcu_check_callbacks+0xab/0x108
> [<ffffffff81031208>] update_process_times+0x3f/0x75
> [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
> [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
> [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
> [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
> [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
> [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
> <EOI> [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
> [<ffffffff8130840c>] native_cpu_up+0x108/0x127
> [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
> [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
> [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
> [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
> [<ffffffff81059e6d>] hibernate+0x97/0x196
> [<ffffffff81057eb0>] state_store+0x5c/0x106
> [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
> [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
> [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
> [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
> [<ffffffff810eb7ad>] sys_write+0x4d/0x74
> [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
> INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=1713579840233 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
> <IRQ> [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
> [<ffffffff8108bf33>] rcu_check_callbacks+0xe9/0x108
> [<ffffffff81031208>] update_process_times+0x3f/0x75
> [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
> [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
> [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
> [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
> [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
> [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
> <EOI> [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
> [<ffffffff8130840c>] native_cpu_up+0x108/0x127
> [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
> [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
> [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
> [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
> [<ffffffff81059e6d>] hibernate+0x97/0x196
> [<ffffffff81057eb0>] state_store+0x5c/0x106
> [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
> [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
> [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
> [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
> [<ffffffff810eb7ad>] sys_write+0x4d/0x74
> [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
>
>
> Amit
Stalls are probably caused by uninitialized percpu hv_clock, with
following patch I don't see stalls. Although I might be just lucky.
http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487
However there is/are a warning/s on suspend path and with following patch:
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 66c74f4..c8e7587 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -117,6 +117,7 @@
static void native_smp_send_reschedule(int cpu)
{
if (unlikely(cpu_is_offline(cpu))) {
+ printk ("curr_cpu: %d, for cpu: %d\n", smp_processor_id(), cpu);
WARN_ON(1);
return;
}
[ 3.101919] Unregister pv shared memory for cpu 1
[ 3.207321] CPU 1 is now offline
[ 3.210177] curr_cpu: 0, for cpu: 1
[ 3.211172] ------------[ cut here ]------------
[ 3.211172] WARNING: at arch/x86/kernel/smp.c:121 native_smp_send_reschedule+0x40/0x71()
[ 3.211172] Hardware name: Bochs
[ 3.211172] Modules linked in: floppy virtio_blk
[ 3.211172] Pid: 264, comm: sh Not tainted 3.3.0-rc1+ #42
[ 3.211172] Call Trace:
[ 3.211172] <IRQ> [<ffffffff81045ecd>] warn_slowpath_common+0x83/0x9b
[ 3.211172] [<ffffffff81045eff>] warn_slowpath_null+0x1a/0x1c
[ 3.211172] [<ffffffff81026dba>] native_smp_send_reschedule+0x40/0x71
[ 3.211172] [<ffffffff810749d6>] trigger_load_balance+0x28e/0x2ba
[ 3.211172] [<ffffffff8106e7ec>] scheduler_tick+0xe9/0xf2
[ 3.211172] [<ffffffff81052db8>] update_process_times+0x65/0x76
[ 3.211172] [<ffffffff81088219>] tick_sched_timer+0x75/0x9f
[ 3.211172] [<ffffffff81064056>] __run_hrtimer+0xb1/0x155
[ 3.211172] [<ffffffff810881a4>] ? tick_nohz_handler+0xdd/0xdd
[ 3.211172] [<ffffffff8106476c>] hrtimer_interrupt+0xe0/0x19e
[ 3.211172] [<ffffffff814d847c>] smp_apic_timer_interrupt+0x77/0x8a
[ 3.211172] [<ffffffff814d7133>] apic_timer_interrupt+0x73/0x80
[ 3.211172] <EOI> [<ffffffff814ba14f>] ? cpuid4_cache_sysfs_exit+0x94/0xad
[ 3.211172] [<ffffffff8111ebe3>] ? cmpxchg_double_slab+0x6/0xef
[ 3.211172] [<ffffffff814c9afc>] __slab_free+0xe6/0x1b7
[ 3.211172] [<ffffffff8111eed2>] kmem_cache_free+0x72/0xac
[ 3.211172] [<ffffffff814ba14f>] ? cpuid4_cache_sysfs_exit+0x94/0xad
[ 3.211172] [<ffffffff81186d23>] release_sysfs_dirent+0x9b/0xb9
[ 3.211172] [<ffffffff81186d59>] __sysfs_put+0x18/0x1a
[ 3.211172] [<ffffffff81187257>] sysfs_addrm_finish+0x8a/0xa2
[ 3.211172] [<ffffffff81185d2b>] sysfs_hash_and_remove+0x5c/0x71
[ 3.211172] [<ffffffff8118674d>] sysfs_remove_file+0x34/0x3b
[ 3.211172] [<ffffffff812ee9da>] device_remove_file+0x17/0x19
[ 3.211172] [<ffffffff814bcaff>] mce_cpu_callback+0xa3/0x186
[ 3.211172] [<ffffffff814d31c0>] notifier_call_chain+0x37/0x63
[ 3.211172] [<ffffffff810655b6>] __raw_notifier_call_chain+0xe/0x10
[ 3.211172] [<ffffffff81047758>] __cpu_notify+0x20/0x32
[ 3.211172] [<ffffffff81047863>] cpu_notify_nofail+0x13/0x1b
[ 3.211172] [<ffffffff814ae6b3>] _cpu_down+0x13a/0x21f
[ 3.211172] [<ffffffff81047915>] disable_nonboot_cpus+0x66/0x11e
[ 3.211172] [<ffffffff8107bff1>] hibernation_restore+0x78/0x106
[ 3.211172] [<ffffffff8107c25f>] software_resume+0x1e0/0x249
[ 3.211172] [<ffffffff8107c353>] resume_store+0x8b/0x97
[ 3.211172] [<ffffffff8122f23b>] kobj_attr_store+0x17/0x19
[ 3.211172] [<ffffffff81186183>] sysfs_write_file+0x101/0x13d
[ 3.211172] [<ffffffff8112f2e3>] vfs_write+0xac/0xf3
[ 3.211172] [<ffffffff810caf1d>] ? trace_hardirqs_on_caller+0xf/0x24
[ 3.211172] [<ffffffff8112f4d2>] sys_write+0x4a/0x6e
[ 3.211172] [<ffffffff814d6629>] system_call_fastpath+0x16/0x1b
[ 3.211172] ---[ end trace 5aa7abdf47c70330 ]---
[ 3.276748] Unregister pv shared memory for cpu 2
[ 3.280771] CPU 2 is now offline
[ 3.285070] Unregister pv shared memory for cpu 3
[ 3.288206] CPU 3 is now offline
[ 31.290110] kvm-clock: cpu 0, msr 0:1fc13341, primary cpu clock, resume
[ 31.290110] Enabling non-boot CPUs ...
[ 31.291319] Booting Node 0 Processor 1 APIC 0x1
[ 31.293720] smpboot cpu 1: start_ip = 96000
[ 31.172141] Calibrating delay loop (skipped) already calibrated this CPUcurr_cpu: 0, for cpu: 2
[ 31.330037] ------------[ cut here ]------------
[ 31.330094] WARNING: at arch/x86/kernel/smp.c:121 native_smp_send_reschedule+0x40/0x71()
[ 31.330094] Hardware name: Bochs
[ 31.330094] Modules linked in: lockd fcoe libfcoe 8021q libfc scsi_transport_fc scsi_tgt garp stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ip6table_filter ip6_tables e1000 pcspkr uinput i2c_piix4 i2c_core ppdev parport_pc parport
microcode sunrpc floppy virtio_blk [last unloaded: scsi_wait_scan]
[ 31.330094] Pid: 1016, comm: bash Tainted: G W 3.3.0-rc1+ #42
[ 31.330094] Call Trace:
[ 31.330094] <IRQ> [<ffffffff81045ecd>] warn_slowpath_common+0x83/0x9b
[ 31.330094] [<ffffffff81045eff>] warn_slowpath_null+0x1a/0x1c
[ 31.330094] [<ffffffff81026dba>] native_smp_send_reschedule+0x40/0x71
[ 31.330094] [<ffffffff810749d6>] trigger_load_balance+0x28e/0x2ba
[ 31.330094] [<ffffffff8106e7ec>] scheduler_tick+0xe9/0xf2
[ 31.330094] [<ffffffff81052db8>] update_process_times+0x65/0x76
[ 31.330094] [<ffffffff81088219>] tick_sched_timer+0x75/0x9f
[ 31.330094] [<ffffffff81064056>] __run_hrtimer+0xb1/0x155
[ 31.330094] [<ffffffff810881a4>] ? tick_nohz_handler+0xdd/0xdd
[ 31.330094] [<ffffffff8106476c>] hrtimer_interrupt+0xe0/0x19e
[ 31.330094] [<ffffffff814d847c>] smp_apic_timer_interrupt+0x77/0x8a
[ 31.330094] [<ffffffff814d7133>] apic_timer_interrupt+0x73/0x80
[ 31.330094] <EOI> [<ffffffff814cffe1>] ? retint_restore_args+0x13/0x13
[ 31.330094] [<ffffffff8108d42a>] ? arch_local_irq_restore+0x6/0xd
[ 31.330094] [<ffffffff814cfd88>] _raw_spin_unlock_irqrestore+0x3d/0x3f
[ 31.330094] [<ffffffff814b1bd3>] workqueue_cpu_callback+0x203/0x214
[ 31.330094] [<ffffffff814d31c0>] notifier_call_chain+0x37/0x63
[ 31.330094] [<ffffffff814c6b68>] ? arch_local_irq_restore+0x6/0xd
[ 31.330094] [<ffffffff810655b6>] __raw_notifier_call_chain+0xe/0x10
[ 31.330094] [<ffffffff81047758>] __cpu_notify+0x20/0x32
[ 31.330094] [<ffffffff814bfb99>] _cpu_up+0xcd/0xfc
[ 31.330094] [<ffffffff814ae88b>] enable_nonboot_cpus+0x57/0xc4
[ 31.330094] [<ffffffff8107becc>] hibernation_snapshot+0x1f3/0x2a0
[ 31.330094] [<ffffffff8107c503>] hibernate+0x97/0x198
[ 31.330094] [<ffffffff8107a613>] state_store+0x5c/0x106
[ 31.330094] [<ffffffff8122f23b>] kobj_attr_store+0x17/0x19
[ 31.330094] [<ffffffff81186183>] sysfs_write_file+0x101/0x13d
[ 31.330094] [<ffffffff8112f2e3>] vfs_write+0xac/0xf3
[ 31.330094] [<ffffffff810caf1d>] ? trace_hardirqs_on_caller+0xf/0x24
[ 31.330094] [<ffffffff8112f4d2>] sys_write+0x4a/0x6e
[ 31.330094] [<ffffffff814d6629>] system_call_fastpath+0x16/0x1b
[ 31.330094] ---[ end trace bbd3f0319d3e2c44 ]---
[ 31.402829] KVM setup async PF for cpu 1
[ 31.402829] kvm-stealtime: cpu 1, msr 1fc8df00
[ 31.656224] NMI watchdog disabled (cpu1): hardware events not enabled
[ 31.659240] CPU1 is up
[ 31.660402] Booting Node 0 Processor 2 APIC 0x2
[ 31.661522] smpboot cpu 2: start_ip = 96000
[ 31.278595] Calibrating delay loop (skipped) already calibrated this CPU
[ 31.697981] NMI watchdog disabled (cpu2): hardware events not enabled
[ 31.698047] KVM setup async PF for cpu 2
[ 31.698051] kvm-stealtime: cpu 2, msr 1fd0df00
[ 31.706326] CPU2 is up
however I think they are not related to subj patch.
next prev parent reply other threads:[~2012-02-09 15:13 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-07 21:05 x86: kvmclock: abstract save/restore sched_clock_state Marcelo Tosatti
2012-02-08 10:53 ` Igor Mammedov
2012-02-09 12:27 ` Amit Shah
2012-02-09 15:13 ` Igor Mammedov [this message]
2012-02-10 10:02 ` Amit Shah
2012-02-10 10:11 ` Igor Mammedov
2012-02-10 10:23 ` Amit Shah
2012-02-10 12:32 ` Marcelo Tosatti
2012-02-10 12:33 ` Marcelo Tosatti
2012-02-10 12:43 ` Igor Mammedov
2012-02-13 12:46 ` Amit Shah
2012-02-13 12:56 ` Amit Shah
2012-02-10 13:18 ` Amit Shah
2012-02-10 20:58 ` Igor Mammedov
2012-02-13 12:39 ` Amit Shah
2012-02-13 13:07 ` x86: kvmclock: abstract save/restore sched_clock_state (v2) Marcelo Tosatti
2012-02-13 15:20 ` Igor Mammedov
2012-02-13 15:52 ` Marcelo Tosatti
2012-02-15 10:28 ` Avi Kivity
2012-02-13 15:45 ` [PATCH RFC] pvclock: Make pv_clock more robust and fixup it if overflow happens Igor Mammedov
2012-02-13 17:48 ` Marcelo Tosatti
2012-02-13 18:15 ` Igor Mammedov
2012-02-13 16:26 ` x86: kvmclock: abstract save/restore sched_clock_state (v2) Amit Shah
2012-03-01 9:58 ` x86: kvmclock: abstract save/restore sched_clock_state Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4F33E299.3090500@redhat.com \
--to=imammedo@redhat.com \
--cc=amit.shah@redhat.com \
--cc=avi@redhat.com \
--cc=hpa@zytor.com \
--cc=johnstul@us.ibm.com \
--cc=kvm@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=mtosatti@redhat.com \
--cc=riel@redhat.com \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).