From: Ankur Arora <ankur.a.arora@oracle.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Philipp Hahn <hahn@univention.de>
Cc: xen-devel <xen-devel@lists.xenproject.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: 4.9.52: INFO: task XXX blocked for more than 300 seconds.
Date: Thu, 5 Oct 2017 10:59:28 -0700
Message-ID: <be4d2137-384d-a0f1-b648-3c55b2ecee4c@oracle.com>
In-Reply-To: <20171005132016.GG24547@char.us.oracle.com>

On 2017-10-05 06:20 AM, Konrad Rzeszutek Wilk wrote:
> On Wed, Oct 04, 2017 at 08:26:27PM +0200, Philipp Hahn wrote:
>> Hello,
>>
> 
> Adding Ankur to this as I think he saw something similar.
> 
> But in the meantime - do you see this with the latest version of Linux?
>> with linux-4.9.52 running on Debian Wheezy with Xen 4.1 I observed
>> several stuck processes. Here is one (truncated) dump of the Linux
>> kernel messages:
>>
>>>   [<ffffffff8160db6d>] ? __schedule+0x23d/0x6d0
>>>   [<ffffffff8160e8a0>] ? bit_wait_timeout+0x90/0x90
>>>   [<ffffffff8160e032>] ? schedule+0x32/0x80
>>>   [<ffffffff8161150c>] ? schedule_timeout+0x1ec/0x360
>>>   [<ffffffff8130f277>] ? __blk_mq_run_hw_queue+0x327/0x3e0* see below
>>>   [<ffffffff8101b7f1>] ? xen_clocksource_get_cycles+0x11/0x20
>>>   [<ffffffff8160e8a0>] ? bit_wait_timeout+0x90/0x90
>>>   [<ffffffff8160d8b4>] ? io_schedule_timeout+0xb4/0x130
>>>   [<ffffffff810bb6f7>] ? prepare_to_wait+0x57/0x80
>>>   [<ffffffff8160e8b7>] ? bit_wait_io+0x17/0x60
>>>   [<ffffffff8160e3ac>] ? __wait_on_bit+0x5c/0x90
>>>   [<ffffffff8160e8a0>] ? bit_wait_timeout+0x90/0x90
>>>   [<ffffffff8160e50e>] ? out_of_line_wait_on_bit+0x7e/0xa0
>>>   [<ffffffff810bba20>] ? autoremove_wake_function+0x40/0x40
>>>   [<ffffffffc00abd18>] ? jbd2_journal_commit_transaction+0xd48/0x17e0 [jbd2]
>>>   [<ffffffff810247d9>] ? __switch_to+0x2c9/0x720
>>>   [<ffffffff810e67fd>] ? try_to_del_timer_sync+0x4d/0x80
>>>   [<ffffffffc00b099d>] ? kjournald2+0xdd/0x280 [jbd2]
>>>   [<ffffffff810bb9e0>] ? wake_up_atomic_t+0x30/0x30
>>>   [<ffffffffc00b08c0>] ? commit_timeout+0x10/0x10 [jbd2]
>>>   [<ffffffff81097660>] ? kthread+0xf0/0x110
>>>   [<ffffffff810247d9>] ? __switch_to+0x2c9/0x720
>>>   [<ffffffff81097570>] ? kthread_park+0x60/0x60
>>>   [<ffffffff81612bb5>] ? ret_from_fork+0x25/0x30

This looks like the race addressed by this patch:
https://patchwork.kernel.org/patch/9853443/

(On xen_clocksource_get_cycles() showing up in impossible places in
these traces: entries marked with '?' are addresses the unwinder found
on the stack, not verified frames of the actual call chain, so stale
entries like that are expected noise.)

Can you dump the output of: cat /sys/block/$xen-frontend-device/mq/*/tags

If you've hit this bug, one or more of the MQs will be wedged: nr_free
in the affected queue(s) will be 0 and will not change.
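
For example, something along these lines (a minimal sketch; "xvda" is a
placeholder for your actual blkfront device, and it assumes the standard
4.9 sysfs layout for the blk-mq tag files):

  # Sample nr_free for each hardware queue a few times; a queue whose
  # nr_free sticks at 0 across samples is likely wedged.
  for i in 1 2 3; do
      grep -H nr_free /sys/block/xvda/mq/*/tags
      sleep 10
  done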

Ankur

>>> NMI backtrace for cpu 2
>>> CPU: 2 PID: 35 Comm: khungtaskd Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
>>>   0000000000000000 ffffffff81331935 0000000000000000 0000000000000002
>>>   ffffffff81335e60 0000000000000002 ffffffff8104cb70 ffff8801f0c90e80
>>>   ffffffff81335f6a ffff8801f0c90e80 00000000003fffbc ffffffff81128048
>>> Call Trace:
>>>   [<ffffffff81331935>] ? dump_stack+0x5c/0x77
>>>   [<ffffffff81335e60>] ? nmi_cpu_backtrace+0x90/0xa0
>>>   [<ffffffff8104cb70>] ? irq_force_complete_move+0x140/0x140
>>>   [<ffffffff81335f6a>] ? nmi_trigger_cpumask_backtrace+0xfa/0x130
>>>   [<ffffffff81128048>] ? watchdog+0x2b8/0x330
>>>   [<ffffffff81127d90>] ? reset_hung_task_detector+0x10/0x10
>>>   [<ffffffff81097660>] ? kthread+0xf0/0x110
>>>   [<ffffffff810247d9>] ? __switch_to+0x2c9/0x720
>>>   [<ffffffff81097570>] ? kthread_park+0x60/0x60
>>>   [<ffffffff81612bb5>] ? ret_from_fork+0x25/0x30
>>> Sending NMI from CPU 2 to CPUs 0-1,3:
>>> NMI backtrace for cpu 1
>>> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
>>> task: ffff8801f4a02ec0 task.stack: ffffc90040ca4000
>>> RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
>>> RSP: e02b:ffffc90040ca7ed0  EFLAGS: 00000246
>>> RAX: 0000000000000000 RBX: ffff8801f4a02ec0 RCX: ffffffff810013aa
>>> RDX: ffffffff81c4ba70 RSI: 0000000000000000 RDI: 0000000000000001
>>> RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
>>> R13: 0000000000000000 R14: ffff8801f4a02ec0 R15: ffff8801f4a02ec0
>>> FS:  00007f23ac595700(0000) GS:ffff8801f5a80000(0000) knlGS:0000000000000000
>>> CS:  e033 DS: 002b ES: 002b CR0: 0000000080050033
>>> CR2: 00007f52537d6d46 CR3: 00000001bba23000 CR4: 0000000000002660
>>> Stack:
>>>   ffff8801bb832201 0000000000000001 ffffffff8101b55c ffffffff81611ec8
>>>   ffff8801f4a02ec0 0000000000000001 ffffffff810bc280 ffff8801f4a02ec0
>>>   ffff8801f4a02ec0 c0a995961d41240f addce6dcadd009c9 0000000000000000
>>> Call Trace:
>>>   [<ffffffff8101b55c>] ? xen_safe_halt+0xc/0x20
>>>   [<ffffffff81611ec8>] ? default_idle+0x18/0xd0
>>>   [<ffffffff810bc280>] ? cpu_startup_entry+0x1f0/0x260
>>> Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>> NMI backtrace for cpu 3
>>> CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
>>> task: ffff8801f4a24f00 task.stack: ffffc90040cb4000
>>> RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
>>> RSP: e02b:ffffc90040cb7ed0  EFLAGS: 00000246
>>> RAX: 0000000000000000 RBX: ffff8801f4a24f00 RCX: ffffffff810013aa
>>> RDX: ffffffff81c4ba70 RSI: 0000000000000000 RDI: 0000000000000001
>>> RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
>>> R13: 0000000000000000 R14: ffff8801f4a24f00 R15: ffff8801f4a24f00
>>> FS:  00007f1a2af19700(0000) GS:ffff8801f5b80000(0000) knlGS:0000000000000000
>>> CS:  e033 DS: 002b ES: 002b CR0: 0000000080050033
>>> CR2: 00007f4a5416b000 CR3: 00000001d83ec000 CR4: 0000000000002660
>>> Stack:
>>>   0000000000000001 0000000000000001 ffffffff8101b55c ffffffff81611ec8
>>>   ffff8801f4a24f00 0000000000000003 ffffffff810bc280 ffff8801f4a24f00
>>>   ffff8801f4a24f00 77816deb133b9979 addce6dcadd009c9 0000000000000000
>>> Call Trace:
>>>   [<ffffffff8101b55c>] ? xen_safe_halt+0xc/0x20
>>>   [<ffffffff81611ec8>] ? default_idle+0x18/0xd0
>>>   [<ffffffff810bc280>] ? cpu_startup_entry+0x1f0/0x260
>>> Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>> NMI backtrace for cpu 0
>>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-ucs105-amd64 #1 Debian 4.9.30-2A~4.2.0.201709271649
>>> task: ffffffff81c0e540 task.stack: ffffffff81c00000
>>> RIP: e030:[<ffffffff810013aa>]  [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
>>> RSP: e02b:ffffffff81c03e90  EFLAGS: 00000246
>>> RAX: 0000000000000000 RBX: ffffffff81c0e540 RCX: ffffffff810013aa
>>> RDX: ffffffff81c4ba70 RSI: 0000000000000000 RDI: 0000000000000001
>>> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000007ff0 R11: 0000000000000246 R12: 0000000000000000
>>> R13: 0000000000000000 R14: ffffffff81c0e540 R15: ffffffff81c0e540
>>> FS:  00007f977d3c2700(0000) GS:ffff8801f5a00000(0000) knlGS:0000000000000000
>>> CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 000055ae9becabc8 CR3: 00000001d6a40000 CR4: 0000000000002660
>>> Stack:
>>>   0000000000000001 0000000000000001 ffffffff8101b55c ffffffff81611ec8
>>>   ffffffff81c0e540 0000000000000000 ffffffff810bc280 ffffffff81c0e540
>>>   ffffffff81c0e540 ebeac214e330bd3f addce6dcadd009c9 ffffffffffffffff
>>> Call Trace:
>>>   [<ffffffff8101b55c>] ? xen_safe_halt+0xc/0x20
>>>   [<ffffffff81611ec8>] ? default_idle+0x18/0xd0
>>>   [<ffffffff810bc280>] ? cpu_startup_entry+0x1f0/0x260
>>>   [<ffffffff81d4bf84>] ? start_kernel+0x46d/0x48d
>>>   [<ffffffff81d510e6>] ? xen_start_kernel+0x52e/0x538
>>> Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>
>>
>> Looking at the disassembly of xen_clocksource_get_cycles() in
>> arch/x86/xen/time.c I see no path by which it could call
>> __blk_mq_run_hw_queue():
>>
>>> 00000000000001a0 <xen_clocksource_get_cycles> mov    %gs:0x0(%rip),%rdi        # 00000000000001a8 <xen_clocksource_get_cycles+0x8>
>>> 00000000000001a8 <xen_clocksource_get_cycles+0x8> add    $0x20,%rdi
>>>          ret = pvclock_clocksource_read(src);
>>> 00000000000001ac <xen_clocksource_get_cycles+0xc> callq  00000000000001b1 <xen_clocksource_get_cycles+0x11>
>>> }
>>>
>>> static cycle_t xen_clocksource_get_cycles(struct clocksource *cs)
>>> {
>>>          return xen_clocksource_read();
>>> }
>>> 00000000000001b1 <xen_clocksource_get_cycles+0x11> retq
>>> 00000000000001b2 <xen_clocksource_get_cycles+0x12> data16 data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
>>>
>>> static void xen_read_wallclock(struct timespec *ts)
>>> {
>>>          struct shared_info *s = HYPERVISOR_shared_info;
>>>          struct pvclock_wall_clock *wall_clock = &(s->wc);
>>> 00000000000001c0 <xen_get_wallclock> mov    0x0(%rip),%rax        # 00000000000001c7 <xen_get_wallclock+0x7>
>>
>>
>> Here's another dump, where xen_clocksource_get_cycles() appears next
>> to some completely different function:
>>
>>> INFO: task btrfs-transacti:522 blocked for more than 300 seconds.
>>> btrfs-transacti D    0   522      2 0x00000000
>>>   ffff8801f3836000 0000000000000000 ffff8801f4a24f00 ffff8801f32da1c0
>>>   ffff8801f5b18780 ffffc9004199fa40 ffffffff8160cd2d 0000000000000000
>>>   ffff8801741a58a8 0000000000000102 000000000000000e ffff8801f32da1c0
>>> Call Trace:
>>>   [<ffffffff8160cd2d>] ? __schedule+0x23d/0x6d0
>>>   [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>>>   [<ffffffff8160d1f2>] ? schedule+0x32/0x80
>>>   [<ffffffff81610729>] ? schedule_timeout+0x249/0x300
>> ***
>>>   [<ffffffff8101b7d1>] ? xen_clocksource_get_cycles+0x11/0x20
>>>   [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>>>   [<ffffffff8160ca74>] ? io_schedule_timeout+0xb4/0x130
>>>   [<ffffffff810bb5e7>] ? prepare_to_wait+0x57/0x80
>>>   [<ffffffff8160da77>] ? bit_wait_io+0x17/0x60
>>>   [<ffffffff8160d56c>] ? __wait_on_bit+0x5c/0x90
>>>   [<ffffffff8117e18e>] ? find_get_pages_tag+0x15e/0x300
>>>   [<ffffffff8117d016>] ? wait_on_page_bit+0x86/0xa0
>>>   [<ffffffff810bb910>] ? autoremove_wake_function+0x40/0x40
>>>   [<ffffffff8117d107>] ? __filemap_fdatawait_range+0xd7/0x150
>>>   [<ffffffff8117d18f>] ? filemap_fdatawait_range+0xf/0x30
>>>   [<ffffffffc0175643>] ? btrfs_wait_ordered_range+0x73/0x110 [btrfs]
>>>   [<ffffffffc01a2a0d>] ? btrfs_wait_cache_io+0x5d/0x1f0 [btrfs]
>>>   [<ffffffffc0141a36>] ? btrfs_write_dirty_block_groups+0x106/0x380 [btrfs]
>>>   [<ffffffffc0140c5d>] ? btrfs_run_delayed_refs+0x1fd/0x2b0 [btrfs]
>>>   [<ffffffffc01551d7>] ? commit_cowonly_roots+0x257/0x2f0 [btrfs]
>>>   [<ffffffffc0157a24>] ? btrfs_commit_transaction+0x4e4/0xa40 [btrfs]
>>>   [<ffffffffc015801d>] ? start_transaction+0x9d/0x4a0 [btrfs]
>>>   [<ffffffffc01523a2>] ? transaction_kthread+0x1b2/0x1f0 [btrfs]
>>>   [<ffffffffc01521f0>] ? btrfs_cleanup_transaction+0x580/0x580 [btrfs]
>>>   [<ffffffff810975c0>] ? kthread+0xf0/0x110
>>>   [<ffffffff8102476b>] ? __switch_to+0x2bb/0x700
>>>   [<ffffffff810974d0>] ? kthread_park+0x60/0x60
>>>   [<ffffffff81611d35>] ? ret_from_fork+0x25/0x30
>>
>> And another one:
>>> INFO: task smbd:20101 blocked for more than 300 seconds.
>>> smbd            D    0 20101   1714 0x00000000
>>>   ffff8801f01cac00 0000000000000000 ffff8801f4a241c0 ffff88007f9a5240
>>>   ffff8801f5b98780 ffffc90049a33bc0 ffffffff8160cd2d ffff8800f78bb5b0
>>>   fd2236313fb5274b 000000008dc3c1a0 ffff8801f3ef0a40 ffff88007f9a5240
>>> Call Trace:
>>>   [<ffffffff8160cd2d>] ? __schedule+0x23d/0x6d0
>>>   [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>>>   [<ffffffff8160d1f2>] ? schedule+0x32/0x80
>>>   [<ffffffff81610729>] ? schedule_timeout+0x249/0x300
>>>   [<ffffffff81336486>] ? __radix_tree_lookup+0x76/0xe0
>> ***
>>>   [<ffffffff8101b7d1>] ? xen_clocksource_get_cycles+0x11/0x20
>>>   [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>>>   [<ffffffff8160ca74>] ? io_schedule_timeout+0xb4/0x130
>>>   [<ffffffff810bb5e7>] ? prepare_to_wait+0x57/0x80
>>>   [<ffffffff8160da77>] ? bit_wait_io+0x17/0x60
>>>   [<ffffffff8160d56c>] ? __wait_on_bit+0x5c/0x90
>>>   [<ffffffffc00f290b>] ? __jbd2_journal_file_buffer+0xcb/0x180 [jbd2]
>>>   [<ffffffff8160da60>] ? bit_wait_timeout+0x90/0x90
>>>   [<ffffffff8160d6ce>] ? out_of_line_wait_on_bit+0x7e/0xa0
>>>   [<ffffffff810bb910>] ? autoremove_wake_function+0x40/0x40
>>>   [<ffffffffc00f2bc8>] ? do_get_write_access+0x208/0x420 [jbd2]
>>>   [<ffffffffc00f2e0e>] ? jbd2_journal_get_write_access+0x2e/0x60 [jbd2]
>>>   [<ffffffffc02c78a6>] ? __ext4_journal_get_write_access+0x36/0x70 [ext4]
>>>   [<ffffffffc02a3913>] ? ext4_orphan_add+0xd3/0x230 [ext4]
>>>   [<ffffffffc0296b4a>] ? ext4_mark_inode_dirty+0x6a/0x200 [ext4]
>>>   [<ffffffffc02a4b8a>] ? ext4_unlink+0x36a/0x380 [ext4]
>>>   [<ffffffff81211677>] ? vfs_unlink+0xe7/0x180
>>>   [<ffffffff81214439>] ? do_unlinkat+0x289/0x300
>>>   [<ffffffff81611abb>] ? system_call_fast_compare_end+0xc/0x9b
>>
>> This does not look normal to me, or did I miss something?
>>
>> Where can I get more information on why there is no progress for 300s,
>> and what should I do to debug which task is waiting for what?
>>
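The 300 seconds, by the way, is just the hung task detector's timeout
(kernel.hung_task_timeout_secs). A rough sketch of the usual knobs for
getting more detail (standard procfs paths; run as root in the guest,
with sysrq enabled):

  # Dump all tasks currently stuck in uninterruptible (D) state:
  echo w > /proc/sysrq-trigger

  # Make the detector fire sooner while reproducing:
  echo 60 > /proc/sys/kernel/hung_task_timeout_secs

  # Optionally panic on detection so a crash dump can be captured:
  echo 1 > /proc/sys/kernel/hung_task_panic
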
>> The traces of the other CPUs look normal to me: the one posted first
>> above is the shortest; in all other cases they were sooner or later
>> waiting for IO (my interpretation, but I can post them if necessary).
>>
>> This problem has occurred since upgrading the Linux kernel inside the
>> VM from 4.1.x to 4.9.32, and now 4.9.52.
>>
>> Any help is appreciated.
>> Philipp Hahn
>> -- 
>> Philipp Hahn
>> Open Source Software Engineer
>>
>> Univention GmbH
>> be open.
>> Mary-Somerville-Str. 1
>> D-28359 Bremen
>> Tel.: +49 421 22232-0
>> Fax : +49 421 22232-99
>> hahn@univention.de
>>
>> http://www.univention.de/
>> Geschäftsführer: Peter H. Ganten
>> HRB 20755 Amtsgericht Bremen
>> Steuer-Nr.: 71-597-02876
>>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Thread overview: 7+ messages
2017-10-04 18:26 4.9.52: INFO: task XXX blocked for more than 300 seconds Philipp Hahn
2017-10-05 10:12 ` Jan Beulich
2017-10-05 12:04   ` Philipp Hahn
2017-10-05 10:38 ` Roger Pau Monné
2017-10-05 13:20 ` Konrad Rzeszutek Wilk
2017-10-05 17:59   ` Ankur Arora [this message]
2017-10-19 14:34     ` 4.9.52: INFO: task X " Philipp Hahn
