From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.fusionio.com ([66.114.96.30]:60720 "EHLO mx1.fusionio.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756429Ab2FYNIm (ORCPT ); Mon, 25 Jun 2012 09:08:42 -0400 Message-ID: <4FE862D7.3060005@fusionio.com> Date: Mon, 25 Jun 2012 09:08:39 -0400 From: Josef Bacik MIME-Version: 1.0 To: Stefan Priebe CC: "linux-btrfs@vger.kernel.org" Subject: Re: btrfs deadlock in 3.5-rc3 References: <4FE58366.8000302@profihost.ag> In-Reply-To: <4FE58366.8000302@profihost.ag> Content-Type: text/plain; charset="ISO-8859-15"; format=flowed Sender: linux-btrfs-owner@vger.kernel.org List-ID: On 06/23/2012 04:50 AM, Stefan Priebe wrote: > Hello list, > > i've seen this deadlock today and can reproduce it while using ceph. > > Is this a known bug? > > [ 599.514534] INFO: task kworker/6:0:29 blocked for more than 120 seconds. > [ 599.530394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 599.546832] kworker/6:0 D ffffffff8180ebc0 0 29 2 > 0x00000000 > [ 599.563725] ffff880e78f4dc90 0000000000000046 ffff880e78f4a180 > 0000000000012280 > [ 599.580812] ffff880e78f4dfd8 ffff880e78f4c010 0000000000012280 > 0000000000012280 > [ 599.580813] ffff880e78f4dfd8 0000000000012280 ffff880e78e8c340 > ffff880e78f4a180 > [ 599.580814] Call Trace: > [ 599.580820] [] schedule+0x24/0x70 > [ 599.580832] [] wait_for_commit+0x55/0x90 [btrfs] > [ 599.580836] [] ? wake_up_bit+0x40/0x40 > [ 599.580844] [] > btrfs_commit_transaction+0x655/0xab0 [btrfs] > [ 599.580846] [] ? idle_balance+0x108/0x130 > [ 599.580847] [] ? wake_up_bit+0x40/0x40 > [ 599.580853] [] ? btrfs_end_transaction+0x20/0x20 > [btrfs] > [ 599.580859] [] do_async_commit+0x1a/0x30 [btrfs] > [ 599.580861] [] process_one_work+0x11f/0x470 > [ 599.580862] [] worker_thread+0x178/0x400 > [ 599.580863] [] ? manage_workers+0x210/0x210 > [ 599.580865] [] kthread+0x96/0xa0 > [ 599.580866] [] kernel_thread_helper+0x4/0x10 > [ 599.580868] [] ? kthread_worker_fn+0x130/0x130 > [ 599.580869] [] ? gs_change+0xb/0xb > [ 599.580872] INFO: task kworker/2:1:1184 blocked for more than 120 > seconds. > [ 599.580872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 599.580873] kworker/2:1 D ffffffff8180ebc0 0 1184 2 > 0x00000000 > [ 599.580874] ffff880e448cfc90 0000000000000046 ffff880e46cce380 > 0000000000012280 > [ 599.580875] ffff880e448cffd8 ffff880e448ce010 0000000000012280 > 0000000000012280 > [ 599.580876] ffff880e448cffd8 0000000000012280 ffff880e78e5a240 > ffff880e46cce380 > [ 599.580877] Call Trace: > [ 599.580879] [] schedule+0x24/0x70 > [ 599.580884] [] wait_for_commit+0x55/0x90 [btrfs] > [ 599.580886] [] ? wake_up_bit+0x40/0x40 > [ 599.580891] [] > btrfs_commit_transaction+0x655/0xab0 [btrfs] > [ 599.580893] [] ? idle_balance+0x108/0x130 > [ 599.580894] [] ? wake_up_bit+0x40/0x40 > [ 599.580899] [] ? btrfs_end_transaction+0x20/0x20 > [btrfs] > [ 599.580905] [] do_async_commit+0x1a/0x30 [btrfs] > [ 599.580906] [] process_one_work+0x11f/0x470 > [ 599.580907] [] worker_thread+0x178/0x400 > [ 599.580908] [] ? manage_workers+0x210/0x210 > [ 599.580910] [] kthread+0x96/0xa0 > [ 599.580911] [] kernel_thread_helper+0x4/0x10 > [ 599.580913] [] ? kthread_worker_fn+0x130/0x130 > [ 599.580914] [] ? gs_change+0xb/0xb > [ 599.580915] INFO: task kworker/5:1:1186 blocked for more than 120 > seconds. > [ 599.580915] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 599.580916] kworker/5:1 D ffffffff8180ebc0 0 1186 2 > 0x00000000 > [ 599.580917] ffff880e448a3c90 0000000000000046 ffff880e46d36600 > 0000000000012280 > [ 599.580919] ffff880e448a3fd8 ffff880e448a2010 0000000000012280 > 0000000000012280 > [ 599.580920] ffff880e448a3fd8 0000000000012280 ffff880e78e88300 > ffff880e46d36600 > [ 599.580920] Call Trace: > [ 599.580922] [] schedule+0x24/0x70 > [ 599.580927] [] wait_for_commit+0x55/0x90 [btrfs] > [ 599.580929] [] ? wake_up_bit+0x40/0x40 > [ 599.580934] [] > btrfs_commit_transaction+0x655/0xab0 [btrfs] > [ 599.580936] [] ? idle_balance+0x108/0x130 > [ 599.580937] [] ? wake_up_bit+0x40/0x40 > [ 599.580942] [] ? btrfs_end_transaction+0x20/0x20 > [btrfs] > [ 599.580948] [] do_async_commit+0x1a/0x30 [btrfs] > [ 599.580949] [] process_one_work+0x11f/0x470 > [ 599.580950] [] worker_thread+0x178/0x400 > [ 599.580951] [] ? manage_workers+0x210/0x210 > [ 599.580952] [] kthread+0x96/0xa0 > [ 599.580954] [] kernel_thread_helper+0x4/0x10 > [ 599.580955] [] ? kthread_worker_fn+0x130/0x130 > [ 599.580957] [] ? gs_change+0xb/0xb > [ 599.580966] INFO: task ceph-osd:3248 blocked for more than 120 seconds. > [ 599.580966] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 599.580967] ceph-osd D 0000000000000006 0 3248 1 > 0x00000000 > [ 599.580968] ffff880bc4201c38 0000000000000082 ffff880e3efc4480 > 0000000000012280 > [ 599.580970] ffff880bc4201fd8 ffff880bc4200010 0000000000012280 > 0000000000012280 > [ 599.580971] ffff880bc4201fd8 0000000000012280 ffff880e78f4a180 > ffff880e3efc4480 > [ 599.580971] Call Trace: > [ 599.580972] [] schedule+0x24/0x70 > [ 599.580978] [] > btrfs_commit_transaction_async+0x1d5/0x240 [btrfs] > [ 599.580983] [] ? block_rsv_add_bytes+0x26/0x60 [btrfs] > [ 599.580985] [] ? wake_up_bit+0x40/0x40 > [ 599.580990] [] ? block_rsv_migrate_bytes+0x35/0x50 > [btrfs] > [ 599.580997] [] btrfs_mksubvol+0x2be/0x350 [btrfs] > [ 599.581004] [] > btrfs_ioctl_snap_create_transid+0x109/0x1a0 [btrfs] > [ 599.581010] [] > btrfs_ioctl_snap_create_v2+0x84/0xf0 [btrfs] > [ 599.581016] [] btrfs_ioctl+0x76f/0x12d0 [btrfs] > [ 599.581019] [] ? fsnotify+0x1ba/0x2e0 > [ 599.581021] [] do_vfs_ioctl+0x93/0x4f0 > [ 599.581022] [] sys_ioctl+0x4a/0x80 > [ 599.581024] [] system_call_fastpath+0x16/0x1b > [ 599.581035] INFO: task ceph-osd:3278 blocked for more than 120 seconds. > [ 599.581035] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 599.581036] ceph-osd D 0000000000000002 0 3278 1 > 0x00000000 > [ 599.581038] ffff880bc41fbc38 0000000000000082 ffff880e412283c0 > 0000000000012280 > [ 599.581039] ffff880bc41fbfd8 ffff880bc41fa010 0000000000012280 > 0000000000012280 > [ 599.581040] ffff880bc41fbfd8 0000000000012280 ffff880e46cce380 > ffff880e412283c0 > [ 599.581040] Call Trace: > [ 599.581042] [] schedule+0x24/0x70 > [ 599.581048] [] > btrfs_commit_transaction_async+0x1d5/0x240 [btrfs] > [ 599.581053] [] ? block_rsv_add_bytes+0x26/0x60 [btrfs] > [ 599.581055] [] ? wake_up_bit+0x40/0x40 > [ 599.581059] [] ? block_rsv_migrate_bytes+0x35/0x50 > [btrfs] > [ 599.581066] [] btrfs_mksubvol+0x2be/0x350 [btrfs] > [ 599.581071] [] > btrfs_ioctl_snap_create_transid+0x109/0x1a0 [btrfs] > [ 599.581077] [] > btrfs_ioctl_snap_create_v2+0x84/0xf0 [btrfs] > [ 599.581082] [] btrfs_ioctl+0x76f/0x12d0 [btrfs] > [ 599.581084] [] ? fsnotify+0x1ba/0x2e0 > [ 599.581085] [] do_vfs_ioctl+0x93/0x4f0 > [ 599.581086] [] sys_ioctl+0x4a/0x80 > [ 599.581088] [] system_call_fastpath+0x16/0x1b > [ 599.581094] INFO: task ceph-osd:3330 blocked for more than 120 seconds. > [ 599.581094] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 599.581095] ceph-osd D 0000000000000005 0 3330 1 > 0x00000000 > [ 599.581096] ffff880bc3bf9c38 0000000000000086 ffff880bc3bd4740 > 0000000000012280 > [ 599.581098] ffff880bc3bf9fd8 ffff880bc3bf8010 0000000000012280 > 0000000000012280 > [ 599.581099] ffff880bc3bf9fd8 0000000000012280 ffff880e46d36600 > ffff880bc3bd4740 > [ 599.581099] Call Trace: > [ 599.581101] [] schedule+0x24/0x70 > [ 599.581106] [] > btrfs_commit_transaction_async+0x1d5/0x240 [btrfs] > [ 599.581111] [] ? block_rsv_add_bytes+0x26/0x60 [btrfs] > [ 599.581113] [] ? wake_up_bit+0x40/0x40 > [ 599.581118] [] ? block_rsv_migrate_bytes+0x35/0x50 > [btrfs] > [ 599.581123] [] btrfs_mksubvol+0x2be/0x350 [btrfs] > [ 599.581129] [] > btrfs_ioctl_snap_create_transid+0x109/0x1a0 [btrfs] > [ 599.581134] [] > btrfs_ioctl_snap_create_v2+0x84/0xf0 [btrfs] > [ 599.581139] [] btrfs_ioctl+0x76f/0x12d0 [btrfs] > [ 599.581141] [] ? fsnotify+0x1ba/0x2e0 > [ 599.581142] [] do_vfs_ioctl+0x93/0x4f0 > [ 599.581143] [] sys_ioctl+0x4a/0x80 > [ 599.581145] [] system_call_fastpath+0x16/0x1b > [ 719.395814] INFO: task kworker/6:0:29 blocked for more than 120 seconds. > [ 719.417426] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 719.439537] kworker/6:0 D ffffffff8180ebc0 0 29 2 > 0x00000000 > [ 719.461937] ffff880e78f4dc90 0000000000000046 ffff880e78f4a180 > 0000000000012280 > [ 719.484547] ffff880e78f4dfd8 ffff880e78f4c010 0000000000012280 > 0000000000012280 > [ 719.507211] ffff880e78f4dfd8 0000000000012280 ffff880e78e8c340 > ffff880e78f4a180 > [ 719.530071] Call Trace: > [ 719.552729] [] schedule+0x24/0x70 > [ 719.552741] [] wait_for_commit+0x55/0x90 [btrfs] > [ 719.552744] [] ? wake_up_bit+0x40/0x40 > [ 719.552752] [] > btrfs_commit_transaction+0x655/0xab0 [btrfs] > [ 719.552755] [] ? idle_balance+0x108/0x130 > [ 719.552756] [] ? wake_up_bit+0x40/0x40 > [ 719.552762] [] ? btrfs_end_transaction+0x20/0x20 > [btrfs] > [ 719.552768] [] do_async_commit+0x1a/0x30 [btrfs] > [ 719.552770] [] process_one_work+0x11f/0x470 > [ 719.552771] [] worker_thread+0x178/0x400 > [ 719.552772] [] ? manage_workers+0x210/0x210 > [ 719.552774] [] kthread+0x96/0xa0 > [ 719.552776] [] kernel_thread_helper+0x4/0x10 > [ 719.552777] [] ? kthread_worker_fn+0x130/0x130 > [ 719.552779] [] ? gs_change+0xb/0xb > [ 719.552781] INFO: task kworker/3:1:578 blocked for more than 120 seconds. > [ 719.552781] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 719.552783] kworker/3:1 D ffffffff8180ebc0 0 578 2 > 0x00000000 > [ 719.552784] ffff880e45dd3c90 0000000000000046 ffff880e7890e080 > 0000000000012280 > [ 719.552786] ffff880e45dd3fd8 ffff880e45dd2010 0000000000012280 > 0000000000012280 > [ 719.552787] ffff880e45dd3fd8 0000000000012280 ffff880e78e7e280 > ffff880e7890e080 > [ 719.552787] Call Trace: > [ 719.552789] [] schedule+0x24/0x70 > [ 719.552795] [] wait_for_commit+0x55/0x90 [btrfs] > [ 719.552797] [] ? wake_up_bit+0x40/0x40 > [ 719.552802] [] > btrfs_commit_transaction+0x655/0xab0 [btrfs] > [ 719.552804] [] ? idle_balance+0x108/0x130 > [ 719.552805] [] ? wake_up_bit+0x40/0x40 > [ 719.552810] [] ? btrfs_end_transaction+0x20/0x20 > [btrfs] > [ 719.552816] [] do_async_commit+0x1a/0x30 [btrfs] > [ 719.552817] [] process_one_work+0x11f/0x470 > [ 719.552818] [] worker_thread+0x178/0x400 > [ 719.552819] [] ? manage_workers+0x210/0x210 > [ 719.552820] [] kthread+0x96/0xa0 > [ 719.552822] [] kernel_thread_helper+0x4/0x10 > [ 719.552824] [] ? kthread_worker_fn+0x130/0x130 > [ 719.552825] [] ? gs_change+0xb/0xb > [ 719.552827] INFO: task kworker/2:1:1184 blocked for more than 120 > seconds. > [ 719.552827] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 719.552828] kworker/2:1 D ffffffff8180ebc0 0 1184 2 > 0x00000000 > [ 719.552830] ffff880e448cfc90 0000000000000046 ffff880e46cce380 > 0000000000012280 > [ 719.552831] ffff880e448cffd8 ffff880e448ce010 0000000000012280 > 0000000000012280 > [ 719.552832] ffff880e448cffd8 0000000000012280 ffff880e78e5a240 > ffff880e46cce380 > [ 719.552832] Call Trace: > [ 719.552834] [] schedule+0x24/0x70 > [ 719.552839] [] wait_for_commit+0x55/0x90 [btrfs] > [ 719.552841] [] ? wake_up_bit+0x40/0x40 > [ 719.552846] [] > btrfs_commit_transaction+0x655/0xab0 [btrfs] > [ 719.552848] [] ? idle_balance+0x108/0x130 > [ 719.552849] [] ? wake_up_bit+0x40/0x40 > [ 719.552854] [] ? btrfs_end_transaction+0x20/0x20 > [btrfs] > [ 719.552859] [] do_async_commit+0x1a/0x30 [btrfs] > [ 719.552861] [] process_one_work+0x11f/0x470 > [ 719.552862] [] worker_thread+0x178/0x400 > [ 719.552863] [] ? manage_workers+0x210/0x210 > [ 719.552864] [] kthread+0x96/0xa0 > [ 719.552866] [] kernel_thread_helper+0x4/0x10 > [ 719.552867] [] ? kthread_worker_fn+0x130/0x130 > [ 719.552868] [] ? gs_change+0xb/0xb > [ 719.552869] INFO: task kworker/5:1:1186 blocked for more than 120 > seconds. > [ 719.552869] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 719.552870] kworker/5:1 D ffffffff8180ebc0 0 1186 2 > 0x00000000 > [ 719.552872] ffff880e448a3c90 0000000000000046 ffff880e46d36600 > 0000000000012280 > [ 719.552873] ffff880e448a3fd8 ffff880e448a2010 0000000000012280 > 0000000000012280 > [ 719.552874] ffff880e448a3fd8 0000000000012280 ffff880e78e88300 > ffff880e46d36600 > [ 719.552874] Call Trace: > [ 719.552876] [] schedule+0x24/0x70 > [ 719.552881] [] wait_for_commit+0x55/0x90 [btrfs] > [ 719.552883] [] ? wake_up_bit+0x40/0x40 > [ 719.552888] [] > btrfs_commit_transaction+0x655/0xab0 [btrfs] > [ 719.552890] [] ? idle_balance+0x108/0x130 > [ 719.552891] [] ? wake_up_bit+0x40/0x40 > [ 719.552896] [] ? btrfs_end_transaction+0x20/0x20 > [btrfs] > [ 719.552901] [] do_async_commit+0x1a/0x30 [btrfs] > [ 719.552902] [] process_one_work+0x11f/0x470 > [ 719.552903] [] worker_thread+0x178/0x400 > [ 719.552904] [] ? manage_workers+0x210/0x210 > [ 719.552906] [] kthread+0x96/0xa0 > [ 719.552907] [] kernel_thread_helper+0x4/0x10 > [ 719.552909] [] ? kthread_worker_fn+0x130/0x130 > [ 719.552910] [] ? gs_change+0xb/0xb > This isn't showing the guy who's actually trying to commit the transaction. Next time this happens can you do a sysrq+w and capture the output and post it here so we can see what everybody is doing? Thanks, Josef