kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: Amit Shah <amit.shah@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	kvm@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com,
	hpa@zytor.com, x86@kernel.org, johnstul@us.ibm.com,
	riel@redhat.com, avi@redhat.com
Subject: Re: x86: kvmclock: abstract save/restore sched_clock_state
Date: Thu, 09 Feb 2012 16:13:29 +0100	[thread overview]
Message-ID: <4F33E299.3090500@redhat.com> (raw)
In-Reply-To: <20120209122742.GA25505@amit.redhat.com>

On 02/09/2012 01:27 PM, Amit Shah wrote:
> On (Tue) 07 Feb 2012 [19:05:42], Marcelo Tosatti wrote:
>>
>> Upon resume from hibernation, CPU 0's hvclock area contains the old
>> values for system_time and tsc_timestamp. It is necessary for the
>> hypervisor to update these values with uptodate ones before the CPU uses
>> them.
>>
>> Abstract TSC's save/restore sched_clock_state functions and use
>> restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
>>
>> Fixes suspend-to-disk with kvmclock.
>
> There are stalls after resume, see trace below.
>
> ./x86_64-softmmu/qemu-system-x86_64 -kernel ~/src/linux/arch/x86/boot/bzImage  -append 'root=/dev/vda1 console=tty0 console=ttyS0 no_console_suspend' -drive file=/guests/f14-suspend.qcow2,if=none,id=dr0 -device virtio-blk-pci,drive=dr0 -net nic,model=virtio -net user  -serial stdio  -enable-kvm -m 512  -cpu host -smp 4
>
> <The following is on resume:>
>
> Disabling non-boot CPUs ...
> CPU 1 is now offline
> CPU 2 is now offline
> CPU 3 is now offline
> lockdep: fixing up alternatives.
> kvm-clock: cpu 0, msr 0:1f1d19c1, primary cpu clock, resume
> PM: Restoring platform NVS memory
> Enabling non-boot CPUs ...
> lockdep: fixing up alternatives.
> Booting Node 0 Processor 1 APIC 0x1
> Calibrating delay loop (skipped) already calibrated this CPU
> INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=314192 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
>   <IRQ>   [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
>   [<ffffffff8108bef5>] rcu_check_callbacks+0xab/0x108
>   [<ffffffff81031208>] update_process_times+0x3f/0x75
>   [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
>   [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
>   [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
>   [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
>   [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
>   [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
>   <EOI>   [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
>   [<ffffffff8130840c>] native_cpu_up+0x108/0x127
>   [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
>   [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
>   [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
>   [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
>   [<ffffffff81059e6d>] hibernate+0x97/0x196
>   [<ffffffff81057eb0>] state_store+0x5c/0x106
>   [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
>   [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
>   [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
>   [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
>   [<ffffffff810eb7ad>] sys_write+0x4d/0x74
>   [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
> INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=314193 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
>   <IRQ>   [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
>   [<ffffffff8108bf33>] rcu_check_callbacks+0xe9/0x108
>   [<ffffffff81031208>] update_process_times+0x3f/0x75
>   [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
>   [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
>   [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
>   [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
>   [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
>   [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
>   <EOI>   [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
>   [<ffffffff8130840c>] native_cpu_up+0x108/0x127
>   [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
>   [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
>   [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
>   [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
>   [<ffffffff81059e6d>] hibernate+0x97/0x196
>   [<ffffffff81057eb0>] state_store+0x5c/0x106
>   [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
>   [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
>   [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
>   [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
>   [<ffffffff810eb7ad>] sys_write+0x4d/0x74
>   [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
> INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=1713579840232 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
>   <IRQ>   [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
>   [<ffffffff8108bef5>] rcu_check_callbacks+0xab/0x108
>   [<ffffffff81031208>] update_process_times+0x3f/0x75
>   [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
>   [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
>   [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
>   [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
>   [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
>   [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
>   <EOI>   [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
>   [<ffffffff8130840c>] native_cpu_up+0x108/0x127
>   [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
>   [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
>   [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
>   [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
>   [<ffffffff81059e6d>] hibernate+0x97/0x196
>   [<ffffffff81057eb0>] state_store+0x5c/0x106
>   [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
>   [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
>   [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
>   [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
>   [<ffffffff810eb7ad>] sys_write+0x4d/0x74
>   [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
> INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=1713579840233 jiffies)
> Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
> Call Trace:
>   <IRQ>   [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
>   [<ffffffff8108bf33>] rcu_check_callbacks+0xe9/0x108
>   [<ffffffff81031208>] update_process_times+0x3f/0x75
>   [<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
>   [<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
>   [<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
>   [<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
>   [<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
>   [<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
>   <EOI>   [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
>   [<ffffffff8130840c>] native_cpu_up+0x108/0x127
>   [<ffffffff81308f2b>] _cpu_up+0x92/0xfc
>   [<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
>   [<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
>   [<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
>   [<ffffffff81059e6d>] hibernate+0x97/0x196
>   [<ffffffff81057eb0>] state_store+0x5c/0x106
>   [<ffffffff811af49b>] kobj_attr_store+0x17/0x19
>   [<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
>   [<ffffffff810eb5f6>] vfs_write+0xab/0xd2
>   [<ffffffff810ec955>] ? fget_light+0x3a/0xa1
>   [<ffffffff810eb7ad>] sys_write+0x4d/0x74
>   [<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
>
>
> 		Amit

Stalls are probably caused by uninitialized percpu hv_clock, with
following patch I don't see stalls. Although I might be just lucky.
http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487

However there is/are a warning/s on suspend path and with following patch:
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 66c74f4..c8e7587 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -117,6 +117,7 @@
  static void native_smp_send_reschedule(int cpu)
  {
         if (unlikely(cpu_is_offline(cpu))) {
+               printk ("curr_cpu: %d, for cpu: %d\n", smp_processor_id(), cpu);
                 WARN_ON(1);
                 return;
         }


[    3.101919] Unregister pv shared memory for cpu 1
[    3.207321] CPU 1 is now offline
[    3.210177] curr_cpu: 0, for cpu: 1
[    3.211172] ------------[ cut here ]------------
[    3.211172] WARNING: at arch/x86/kernel/smp.c:121 native_smp_send_reschedule+0x40/0x71()
[    3.211172] Hardware name: Bochs
[    3.211172] Modules linked in: floppy virtio_blk
[    3.211172] Pid: 264, comm: sh Not tainted 3.3.0-rc1+ #42
[    3.211172] Call Trace:
[    3.211172]  <IRQ>  [<ffffffff81045ecd>] warn_slowpath_common+0x83/0x9b
[    3.211172]  [<ffffffff81045eff>] warn_slowpath_null+0x1a/0x1c
[    3.211172]  [<ffffffff81026dba>] native_smp_send_reschedule+0x40/0x71
[    3.211172]  [<ffffffff810749d6>] trigger_load_balance+0x28e/0x2ba
[    3.211172]  [<ffffffff8106e7ec>] scheduler_tick+0xe9/0xf2
[    3.211172]  [<ffffffff81052db8>] update_process_times+0x65/0x76
[    3.211172]  [<ffffffff81088219>] tick_sched_timer+0x75/0x9f
[    3.211172]  [<ffffffff81064056>] __run_hrtimer+0xb1/0x155
[    3.211172]  [<ffffffff810881a4>] ? tick_nohz_handler+0xdd/0xdd
[    3.211172]  [<ffffffff8106476c>] hrtimer_interrupt+0xe0/0x19e
[    3.211172]  [<ffffffff814d847c>] smp_apic_timer_interrupt+0x77/0x8a
[    3.211172]  [<ffffffff814d7133>] apic_timer_interrupt+0x73/0x80
[    3.211172]  <EOI>  [<ffffffff814ba14f>] ? cpuid4_cache_sysfs_exit+0x94/0xad
[    3.211172]  [<ffffffff8111ebe3>] ? cmpxchg_double_slab+0x6/0xef
[    3.211172]  [<ffffffff814c9afc>] __slab_free+0xe6/0x1b7
[    3.211172]  [<ffffffff8111eed2>] kmem_cache_free+0x72/0xac
[    3.211172]  [<ffffffff814ba14f>] ? cpuid4_cache_sysfs_exit+0x94/0xad
[    3.211172]  [<ffffffff81186d23>] release_sysfs_dirent+0x9b/0xb9
[    3.211172]  [<ffffffff81186d59>] __sysfs_put+0x18/0x1a
[    3.211172]  [<ffffffff81187257>] sysfs_addrm_finish+0x8a/0xa2
[    3.211172]  [<ffffffff81185d2b>] sysfs_hash_and_remove+0x5c/0x71
[    3.211172]  [<ffffffff8118674d>] sysfs_remove_file+0x34/0x3b
[    3.211172]  [<ffffffff812ee9da>] device_remove_file+0x17/0x19
[    3.211172]  [<ffffffff814bcaff>] mce_cpu_callback+0xa3/0x186
[    3.211172]  [<ffffffff814d31c0>] notifier_call_chain+0x37/0x63
[    3.211172]  [<ffffffff810655b6>] __raw_notifier_call_chain+0xe/0x10
[    3.211172]  [<ffffffff81047758>] __cpu_notify+0x20/0x32
[    3.211172]  [<ffffffff81047863>] cpu_notify_nofail+0x13/0x1b
[    3.211172]  [<ffffffff814ae6b3>] _cpu_down+0x13a/0x21f
[    3.211172]  [<ffffffff81047915>] disable_nonboot_cpus+0x66/0x11e
[    3.211172]  [<ffffffff8107bff1>] hibernation_restore+0x78/0x106
[    3.211172]  [<ffffffff8107c25f>] software_resume+0x1e0/0x249
[    3.211172]  [<ffffffff8107c353>] resume_store+0x8b/0x97
[    3.211172]  [<ffffffff8122f23b>] kobj_attr_store+0x17/0x19
[    3.211172]  [<ffffffff81186183>] sysfs_write_file+0x101/0x13d
[    3.211172]  [<ffffffff8112f2e3>] vfs_write+0xac/0xf3
[    3.211172]  [<ffffffff810caf1d>] ? trace_hardirqs_on_caller+0xf/0x24
[    3.211172]  [<ffffffff8112f4d2>] sys_write+0x4a/0x6e
[    3.211172]  [<ffffffff814d6629>] system_call_fastpath+0x16/0x1b
[    3.211172] ---[ end trace 5aa7abdf47c70330 ]---
[    3.276748] Unregister pv shared memory for cpu 2
[    3.280771] CPU 2 is now offline
[    3.285070] Unregister pv shared memory for cpu 3
[    3.288206] CPU 3 is now offline
[   31.290110] kvm-clock: cpu 0, msr 0:1fc13341, primary cpu clock, resume
[   31.290110] Enabling non-boot CPUs ...
[   31.291319] Booting Node 0 Processor 1 APIC 0x1
[   31.293720] smpboot cpu 1: start_ip = 96000
[   31.172141] Calibrating delay loop (skipped) already calibrated this CPUcurr_cpu: 0, for cpu: 2
[   31.330037] ------------[ cut here ]------------
[   31.330094] WARNING: at arch/x86/kernel/smp.c:121 native_smp_send_reschedule+0x40/0x71()
[   31.330094] Hardware name: Bochs
[   31.330094] Modules linked in: lockd fcoe libfcoe 8021q libfc scsi_transport_fc scsi_tgt garp stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 
nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ip6table_filter ip6_tables e1000 pcspkr uinput i2c_piix4 i2c_core ppdev parport_pc parport 
microcode sunrpc floppy virtio_blk [last unloaded: scsi_wait_scan]
[   31.330094] Pid: 1016, comm: bash Tainted: G        W    3.3.0-rc1+ #42
[   31.330094] Call Trace:
[   31.330094]  <IRQ>  [<ffffffff81045ecd>] warn_slowpath_common+0x83/0x9b
[   31.330094]  [<ffffffff81045eff>] warn_slowpath_null+0x1a/0x1c
[   31.330094]  [<ffffffff81026dba>] native_smp_send_reschedule+0x40/0x71
[   31.330094]  [<ffffffff810749d6>] trigger_load_balance+0x28e/0x2ba
[   31.330094]  [<ffffffff8106e7ec>] scheduler_tick+0xe9/0xf2
[   31.330094]  [<ffffffff81052db8>] update_process_times+0x65/0x76
[   31.330094]  [<ffffffff81088219>] tick_sched_timer+0x75/0x9f
[   31.330094]  [<ffffffff81064056>] __run_hrtimer+0xb1/0x155
[   31.330094]  [<ffffffff810881a4>] ? tick_nohz_handler+0xdd/0xdd
[   31.330094]  [<ffffffff8106476c>] hrtimer_interrupt+0xe0/0x19e
[   31.330094]  [<ffffffff814d847c>] smp_apic_timer_interrupt+0x77/0x8a
[   31.330094]  [<ffffffff814d7133>] apic_timer_interrupt+0x73/0x80
[   31.330094]  <EOI>  [<ffffffff814cffe1>] ? retint_restore_args+0x13/0x13
[   31.330094]  [<ffffffff8108d42a>] ? arch_local_irq_restore+0x6/0xd
[   31.330094]  [<ffffffff814cfd88>] _raw_spin_unlock_irqrestore+0x3d/0x3f
[   31.330094]  [<ffffffff814b1bd3>] workqueue_cpu_callback+0x203/0x214
[   31.330094]  [<ffffffff814d31c0>] notifier_call_chain+0x37/0x63
[   31.330094]  [<ffffffff814c6b68>] ? arch_local_irq_restore+0x6/0xd
[   31.330094]  [<ffffffff810655b6>] __raw_notifier_call_chain+0xe/0x10
[   31.330094]  [<ffffffff81047758>] __cpu_notify+0x20/0x32
[   31.330094]  [<ffffffff814bfb99>] _cpu_up+0xcd/0xfc
[   31.330094]  [<ffffffff814ae88b>] enable_nonboot_cpus+0x57/0xc4
[   31.330094]  [<ffffffff8107becc>] hibernation_snapshot+0x1f3/0x2a0
[   31.330094]  [<ffffffff8107c503>] hibernate+0x97/0x198
[   31.330094]  [<ffffffff8107a613>] state_store+0x5c/0x106
[   31.330094]  [<ffffffff8122f23b>] kobj_attr_store+0x17/0x19
[   31.330094]  [<ffffffff81186183>] sysfs_write_file+0x101/0x13d
[   31.330094]  [<ffffffff8112f2e3>] vfs_write+0xac/0xf3
[   31.330094]  [<ffffffff810caf1d>] ? trace_hardirqs_on_caller+0xf/0x24
[   31.330094]  [<ffffffff8112f4d2>] sys_write+0x4a/0x6e
[   31.330094]  [<ffffffff814d6629>] system_call_fastpath+0x16/0x1b
[   31.330094] ---[ end trace bbd3f0319d3e2c44 ]---
[   31.402829] KVM setup async PF for cpu 1
[   31.402829] kvm-stealtime: cpu 1, msr 1fc8df00
[   31.656224] NMI watchdog disabled (cpu1): hardware events not enabled
[   31.659240] CPU1 is up
[   31.660402] Booting Node 0 Processor 2 APIC 0x2
[   31.661522] smpboot cpu 2: start_ip = 96000
[   31.278595] Calibrating delay loop (skipped) already calibrated this CPU
[   31.697981] NMI watchdog disabled (cpu2): hardware events not enabled
[   31.698047] KVM setup async PF for cpu 2
[   31.698051] kvm-stealtime: cpu 2, msr 1fd0df00
[   31.706326] CPU2 is up

however I think they are not related to subj patch.



  reply	other threads:[~2012-02-09 15:13 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-02-07 21:05 x86: kvmclock: abstract save/restore sched_clock_state Marcelo Tosatti
2012-02-08 10:53 ` Igor Mammedov
2012-02-09 12:27 ` Amit Shah
2012-02-09 15:13   ` Igor Mammedov [this message]
2012-02-10 10:02     ` Amit Shah
2012-02-10 10:11       ` Igor Mammedov
2012-02-10 10:23         ` Amit Shah
2012-02-10 12:32       ` Marcelo Tosatti
2012-02-10 12:33         ` Marcelo Tosatti
2012-02-10 12:43           ` Igor Mammedov
2012-02-13 12:46             ` Amit Shah
2012-02-13 12:56           ` Amit Shah
2012-02-10 13:18         ` Amit Shah
2012-02-10 20:58           ` Igor Mammedov
2012-02-13 12:39             ` Amit Shah
2012-02-13 13:07 ` x86: kvmclock: abstract save/restore sched_clock_state (v2) Marcelo Tosatti
2012-02-13 15:20   ` Igor Mammedov
2012-02-13 15:52     ` Marcelo Tosatti
2012-02-15 10:28       ` Avi Kivity
2012-02-13 15:45   ` [PATCH RFC] pvclock: Make pv_clock more robust and fixup it if overflow happens Igor Mammedov
2012-02-13 17:48     ` Marcelo Tosatti
2012-02-13 18:15       ` Igor Mammedov
2012-02-13 16:26   ` x86: kvmclock: abstract save/restore sched_clock_state (v2) Amit Shah
2012-03-01  9:58 ` x86: kvmclock: abstract save/restore sched_clock_state Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4F33E299.3090500@redhat.com \
    --to=imammedo@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=avi@redhat.com \
    --cc=hpa@zytor.com \
    --cc=johnstul@us.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=mtosatti@redhat.com \
    --cc=riel@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).