From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Ian.Campbell@eu.citrix.com, konrad.wilk@oracle.com,
pbonzini@redhat.com, xen-devel@lists.xensource.com
Subject: Re: Prepping a bugfix push
Date: Thu, 03 Dec 2009 17:10:42 -0800 [thread overview]
Message-ID: <4B186192.8000201@goop.org> (raw)
In-Reply-To: <20091204002406.GB5897@kremvax.cs.ubc.ca>
On 12/03/09 16:24, Brendan Cully wrote:
> On Thursday, 03 December 2009 at 15:22, Jeremy Fitzhardinge wrote:
>
>> On 12/03/09 11:35, Brendan Cully wrote:
>>
>>> Not a patch, but I've just tried out xm save -c again with the latest
>>> xen changes, and while I no longer see the grant table version panic,
>>> the guest's devices (aside from the console) appear to be wedged on
>>> resume. Is anyone else seeing this?
>>>
>>> After a while on the console I see messages like this:
>>>
>>> INFO: task syslogd:2219 blocked for more than 120 seconds.
>>>
>>> which I assume is trouble with the block device.
>>>
>> Is there any backtrace or other info associated with this?
>>
> I take it you couldn't reproduce? Let me know what you want
> captured. In the meantime, here's the kernel log:
>
Nope, works for me :) Ian has been doing testing with hundreds or
thousands of suspend/resumes, so it *can* be reliable.
That said, I am seeing some odd hangs after suspend/resume with the
bugfix branch merged into Linux mainline...
Can you ping the domain when its in that state?
J
> Dec 4 00:16:12 p32 kernel: suspending xenstore...
> Dec 4 00:16:12 p32 kernel: Grant tables using version 2 layout.
>
> Here's a backtrace after the block message:
>
> INFO: task kjournald:880 blocked for more than 120 seconds.
> "echo 0> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> kjournald D 00000000 0 880 2 0x00000000
> cfb6be70 00000246 cf8fa000 00000000 00000002 00000000 c14972a0 00000001
> cf8fa000 c152f380 c152f380 c152f380 cf8fa1fc cf850b10 cf850c9c d0808380
> c1924584 00000000 cfb6be70 c1006f04 c1920284 c1924588 c1920284 cfb49854
> Call Trace:
> [<c1006f04>] ? check_events+0x8/0xc
> [<c137b0a8>] io_schedule+0x1e/0x28
> [<c10c2555>] sync_buffer+0x33/0x37
> [<c137b3a1>] __wait_on_bit+0x45/0x62
> [<c10c2522>] ? sync_buffer+0x0/0x37
> [<c137b41d>] out_of_line_wait_on_bit+0x5f/0x72
> [<c10c2522>] ? sync_buffer+0x0/0x37
> [<c1049b7e>] ? wake_bit_function+0x0/0x47
> [<c10c24e5>] __wait_on_buffer+0x23/0x25
> [<c1102940>] journal_commit_transaction+0x50b/0xdeb
> [<c1006efb>] ? xen_restore_fl_direct_end+0x0/0x1
> [<c137c4f4>] ? _spin_unlock_irqrestore+0x17/0x1d
> [<c103fad4>] ? try_to_del_timer_sync+0x49/0x52
> [<c1105b66>] kjournald+0xab/0x1f0
> [<c1049b46>] ? autoremove_wake_function+0x0/0x38
> [<c1105abb>] ? kjournald+0x0/0x1f0
> [<c1049a85>] kthread+0x74/0x7f
> [<c1049a11>] ? kthread+0x0/0x7f
> [<c1009b37>] kernel_thread_helper+0x7/0x10
>
> and here's the result of xm sysrq<dom> l
> SysRq : Show backtrace of all active CPUs
>
> and xm sysrq<dom> w
>
> SysRq : Show Blocked State
> task PC stack pid father
> pdflush D cf40297c 0 150 2 0x00000000
> cf95dec8 00000246 00000000 cf40297c c1550280 c1553680 c10067df c1550280
> 00000000 c152f380 c152f380 c152f380 c1550280 cf8b3130 cf8b32bc d0808380
> c1553680 ffff995d cf95dec8 c1006f04 c1550280 c1553a6c c1550280 c1006efb
> Call Trace:
> [<c10067df>] ? xen_force_evtchn_callback+0xf/0x14
> [<c1006f04>] ? check_events+0x8/0xc
> [<c1006efb>] ? xen_restore_fl_direct_end+0x0/0x1
> [<c137b1cb>] schedule_timeout+0xd6/0x13e
> [<c10067df>] ? xen_force_evtchn_callback+0xf/0x14
> [<c103fc07>] ? process_timeout+0x0/0xa
> [<c137a8ea>] io_schedule_timeout+0x1e/0x28
> [<c108e9a8>] congestion_wait+0x51/0x66
> [<c1049b46>] ? autoremove_wake_function+0x0/0x38
> [<c1083ef3>] wb_kupdate+0xb3/0xfa
> [<c10842d5>] ? pdflush+0x0/0x205
> [<c10843d5>] pdflush+0x100/0x205
> [<c1006efb>] ? xen_restore_fl_direct_end+0x0/0x1
> [<c137c4f4>] ? _spin_unlock_irqrestore+0x17/0x1d
> [<c1083e40>] ? wb_kupdate+0x0/0xfa
> [<c1049a85>] kthread+0x74/0x7f
> [<c1049a11>] ? kthread+0x0/0x7f
> [<c1009b37>] kernel_thread_helper+0x7/0x10
> kjournald D 00000000 0 880 2 0x00000000
> cfb6be70 00000246 cf8fa000 00000000 00000002 00000000 c14972a0 00000001
> cf8fa000 c152f380 c152f380 c152f380 cf8fa1fc cf850b10 cf850c9c d0808380
> c1924584 00000000 cfb6be70 c1006f04 c1920284 c1924588 c1920284 cfb49854
> Call Trace:
> [<c1006f04>] ? check_events+0x8/0xc
> [<c137b0a8>] io_schedule+0x1e/0x28
> [<c10c2555>] sync_buffer+0x33/0x37
> [<c137b3a1>] __wait_on_bit+0x45/0x62
> [<c10c2522>] ? sync_buffer+0x0/0x37
> [<c137b41d>] out_of_line_wait_on_bit+0x5f/0x72
> [<c10c2522>] ? sync_buffer+0x0/0x37
> [<c1049b7e>] ? wake_bit_function+0x0/0x47
> [<c10c24e5>] __wait_on_buffer+0x23/0x25
> [<c1102940>] journal_commit_transaction+0x50b/0xdeb
> [<c1006efb>] ? xen_restore_fl_direct_end+0x0/0x1
> [<c137c4f4>] ? _spin_unlock_irqrestore+0x17/0x1d
> [<c103fad4>] ? try_to_del_timer_sync+0x49/0x52
> [<c1105b66>] kjournald+0xab/0x1f0
> [<c1049b46>] ? autoremove_wake_function+0x0/0x38
> [<c1105abb>] ? kjournald+0x0/0x1f0
> [<c1049a85>] kthread+0x74/0x7f
> [<c1049a11>] ? kthread+0x0/0x7f
> [<c1009b37>] kernel_thread_helper+0x7/0x10
> syslogd D 00000000 0 2219 1 0x00000000
> cfb33e40 00000286 00000000 00000000 0000aa70 00000000 c10067df cfb7026c
> 00000001 c152f380 c152f380 c152f380 00000000 cfa2a130 cfa2a2bc d0808380
> cfb33e64 cfb70254 c1006efb c137c4f4 c10282ff cfb33e58 cfb33e40 c1049d05
> Call Trace:
> [<c10067df>] ? xen_force_evtchn_callback+0xf/0x14
> [<c1006efb>] ? xen_restore_fl_direct_end+0x0/0x1
> [<c137c4f4>] ? _spin_unlock_irqrestore+0x17/0x1d
> [<c10282ff>] ? __wake_up+0x3a/0x42
> [<c1049d05>] ? prepare_to_wait+0x41/0x46
> [<c110535f>] log_wait_commit+0x87/0xef
> [<c1049b46>] ? autoremove_wake_function+0x0/0x38
> [<c1100ca8>] journal_stop+0x1dc/0x29f
> [<c110156d>] ? journal_start+0x88/0xb2
> [<c11015b4>] journal_force_commit+0x1d/0x1f
> [<c10f2d8a>] ext3_force_commit+0x1c/0x22
> [<c10ed096>] ext3_write_inode+0x2d/0x3b
> [<c10bd21f>] writeback_single_inode+0x1f5/0x2be
> [<c108326f>] ? do_writepages+0x37/0x39
> [<c10bdaa1>] sync_inode+0x1f/0x37
> [<c10e9409>] ext3_sync_file+0x99/0xb0
> [<c10c06fc>] vfs_fsync+0x73/0xb1
> [<c10c075c>] do_fsync+0x22/0x32
> [<c10c078b>] sys_fsync+0xd/0xf
> [<c1009058>] sysenter_do_call+0x12/0x2c
> getty D cf402840 0 2330 1 0x00000004
> cfb23cc0 00000286 00000000 cf402840 00000000 00000000 00000000 00000000
> 00000000 c152f380 c152f380 c152f380 00000000 cf8740f0 cf87427c d0808380
> cfb23d14 c19202c8 c1006efb c137c4f4 00000002 cfb23d08 cfb23cc0 c1049d05
> Call Trace:
> [<c1006efb>] ? xen_restore_fl_direct_end+0x0/0x1
> [<c137c4f4>] ? _spin_unlock_irqrestore+0x17/0x1d
> [<c1049d05>] ? prepare_to_wait+0x41/0x46
> [<c1101016>] do_get_write_access+0x2ab/0x3cd
> [<c1006efb>] ? xen_restore_fl_direct_end+0x0/0x1
> [<c1049b7e>] ? wake_bit_function+0x0/0x47
> [<c1101153>] journal_get_write_access+0x1b/0x2a
> [<c10f72c2>] __ext3_journal_get_write_access+0x19/0x3f
> [<c10eab31>] ext3_reserve_inode_write+0x34/0x6a
> [<c10eab7e>] ext3_mark_inode_dirty+0x17/0x2e
> [<c10eda05>] ext3_dirty_inode+0x6b/0x6d
> [<c107ce9c>] ? file_remove_suid+0x1b/0x59
> [<c10bd91d>] __mark_inode_dirty+0x23/0xd2
> [<c10b6618>] file_update_time+0x76/0xb3
> [<c107e0b9>] __generic_file_aio_write_nolock+0x235/0x4b3
> [<c107f0e5>] ? filemap_fault+0x76/0x3b1
> [<c107e5a5>] generic_file_aio_write+0x5d/0xd1
> [<c10e9267>] ext3_file_write+0x27/0xa5
> [<c10a527a>] do_sync_write+0xcd/0x103
> [<c1049b46>] ? autoremove_wake_function+0x0/0x38
> [<c1006efb>] ? xen_restore_fl_direct_end+0x0/0x1
> [<c10d2c2c>] ? locks_free_lock+0x31/0x40
> [<c10d4103>] ? fcntl_setlk+0x3e/0x185
> [<c10a596e>] vfs_write+0x8b/0x13f
> [<c137e474>] ? do_page_fault+0x1b4/0x3a0
> [<c10a51ad>] ? do_sync_write+0x0/0x103
> [<c10a5f1d>] sys_write+0x3d/0x64
> [<c1009058>] sysenter_do_call+0x12/0x2c
> Sched Debug Version: v0.09, 2.6.31.6 #8
> now at 194118.700060 msecs
> .jiffies : 4294940775
> .sysctl_sched_latency : 20.000000
> .sysctl_sched_min_granularity : 4.000000
> .sysctl_sched_wakeup_granularity : 5.000000
> .sysctl_sched_child_runs_first : 0.000001
> .sysctl_sched_features : 113917
>
> cpu#0, 2500.152 MHz
> .nr_running : 1
> .load : 3121
> .nr_switches : 17143
> .nr_load_updates : 48354
> .nr_uninterruptible : 4
> .next_balance : 4294.940896
> .curr->pid : 14
> .clock : 193919.954084
> .cpu_load[0] : 0
> .cpu_load[1] : 0
> .cpu_load[2] : 0
> .cpu_load[3] : 0
> .cpu_load[4] : 0
>
> cfs_rq[0]:
> .exec_clock : 0.000000
> .MIN_vruntime : 0.000001
> .min_vruntime : 13296.631519
> .max_vruntime : 0.000001
> .spread : 0.000000
> .spread0 : 0.000000
> .nr_running : 1
> .load : 3121
> .nr_spread_over : 0
>
> rt_rq[0]:
> .rt_nr_running : 0
> .rt_throttled : 0
> .rt_time : 0.000000
> .rt_runtime : 950.000000
>
> runnable tasks:
> task PID tree-key switches prio exec-runtime sum-exec sum-sleep
> ----------------------------------------------------------------------------------------------------------
> R xenwatch 14 13276.942024 146 115 0 0 0.000000 0.000000 0.000000
>
>
next prev parent reply other threads:[~2009-12-04 1:10 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-12-03 19:26 Prepping a bugfix push Jeremy Fitzhardinge
2009-12-03 19:35 ` Brendan Cully
2009-12-03 19:38 ` Jeremy Fitzhardinge
2009-12-03 23:22 ` Jeremy Fitzhardinge
2009-12-04 0:24 ` Brendan Cully
2009-12-04 1:10 ` Jeremy Fitzhardinge [this message]
2009-12-04 4:29 ` Brendan Cully
2009-12-04 7:46 ` Ian Campbell
2009-12-04 15:50 ` Ian Campbell
2009-12-04 16:37 ` Jeremy Fitzhardinge
2009-12-04 16:56 ` Ian Campbell
2009-12-05 0:05 ` Jeremy Fitzhardinge
2009-12-04 17:01 ` Brendan Cully
2009-12-04 17:12 ` Jeremy Fitzhardinge
2009-12-04 10:45 ` Ian Campbell
2009-12-04 14:10 ` Konrad Rzeszutek Wilk
2009-12-04 16:18 ` Jeremy Fitzhardinge
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4B186192.8000201@goop.org \
--to=jeremy@goop.org \
--cc=Ian.Campbell@eu.citrix.com \
--cc=konrad.wilk@oracle.com \
--cc=pbonzini@redhat.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.