linux-btrfs.vger.kernel.org archive mirror
* task sync:2450 blocked for more than 120 seconds.
From: Neuer User @ 2014-04-30  6:37 UTC (permalink / raw)
  To: linux-btrfs

Hi

I have a non-rootfs btrfs partition that I use for some work where I
like to keep some snapshots. The partition is about 160 GB in size and
holds about 80-90 GB of data.

I often see the following errors:

Apr 29 20:41:24 DesktopMB kernel: [47030.195270] INFO: task sync:2450 blocked for more than 120 seconds.
Apr 29 20:41:24 DesktopMB kernel: [47030.195275]       Tainted: GF    O 3.14.1-031401-generic #201404141220
Apr 29 20:41:24 DesktopMB kernel: [47030.195276] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 29 20:41:24 DesktopMB kernel: [47030.195277] sync            D ffffffff818118e0     0  2450  29298 0x00000004
Apr 29 20:41:24 DesktopMB kernel: [47030.195280]  ffff8801261bbc68 0000000000000002 ffff8801261bbc18 ffff8801261bbfd8
Apr 29 20:41:24 DesktopMB kernel: [47030.195282]  00000000000144c0 00000000000144c0 ffff88022341b1e0 ffff88012606cad0
Apr 29 20:41:24 DesktopMB kernel: [47030.195284]  ffff8801261bbc68 ffff88022f3d4dc8 ffff88012606cad0 ffffffff8115f460
Apr 29 20:41:24 DesktopMB kernel: [47030.195286] Call Trace:
Apr 29 20:41:24 DesktopMB kernel: [47030.195292]  [<ffffffff8115f460>] ? __lock_page+0x70/0x70
Apr 29 20:41:24 DesktopMB kernel: [47030.195295]  [<ffffffff817650e9>] schedule+0x29/0x70
Apr 29 20:41:24 DesktopMB kernel: [47030.195297]  [<ffffffff817651bf>] io_schedule+0x8f/0xd0
Apr 29 20:41:24 DesktopMB kernel: [47030.195299]  [<ffffffff8115f46e>] sleep_on_page+0xe/0x20
Apr 29 20:41:24 DesktopMB kernel: [47030.195300]  [<ffffffff81765882>] __wait_on_bit+0x62/0x90
Apr 29 20:41:24 DesktopMB kernel: [47030.195302]  [<ffffffff8115ff0b>] ? find_get_pages_tag+0xcb/0x170
Apr 29 20:41:24 DesktopMB kernel: [47030.195304]  [<ffffffff8115f5d0>] wait_on_page_bit+0x80/0x90
Apr 29 20:41:24 DesktopMB kernel: [47030.195307]  [<ffffffff810b51c0>] ? wake_atomic_t_function+0x40/0x40
Apr 29 20:41:24 DesktopMB kernel: [47030.195309]  [<ffffffff8115f6d4>] filemap_fdatawait_range+0xf4/0x180
Apr 29 20:41:24 DesktopMB kernel: [47030.195312]  [<ffffffff8108624a>] ? __queue_delayed_work+0xaa/0x1a0
Apr 29 20:41:24 DesktopMB kernel: [47030.195314]  [<ffffffff810867c5>] ? try_to_grab_pending+0x65/0x80
Apr 29 20:41:24 DesktopMB kernel: [47030.195316]  [<ffffffff8115f78b>] filemap_fdatawait+0x2b/0x30
Apr 29 20:41:24 DesktopMB kernel: [47030.195319]  [<ffffffff811fa6a5>] wait_sb_inodes+0xc5/0x120
Apr 29 20:41:24 DesktopMB kernel: [47030.195321]  [<ffffffff81202300>] ? fdatawrite_one_bdev+0x20/0x20
Apr 29 20:41:24 DesktopMB kernel: [47030.195323]  [<ffffffff811fa7a3>] sync_inodes_sb+0xa3/0xd0
Apr 29 20:41:24 DesktopMB kernel: [47030.195325]  [<ffffffff8120231d>] sync_inodes_one_sb+0x1d/0x30
Apr 29 20:41:24 DesktopMB kernel: [47030.195328]  [<ffffffff811d5711>] iterate_supers+0xf1/0x100
Apr 29 20:41:24 DesktopMB kernel: [47030.195330]  [<ffffffff81202485>] sys_sync+0x35/0x90
Apr 29 20:41:24 DesktopMB kernel: [47030.195333]  [<ffffffff8177247f>] tracesys+0xe1/0xe6
Apr 29 20:43:24 DesktopMB kernel: [47150.168257] INFO: task sync:2450 blocked for more than 120 seconds.
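
For readers who want to pull the chain of functions out of a report like this one, here is a minimal Python sketch. It is not part of the original report; the names `FRAME_RE` and `call_trace` are made up for illustration. It tolerates lines that a mail client has wrapped, and it separates frames marked with "?" (which the kernel prints for addresses found on the stack that it could not confirm as part of the call chain) from the rest.

```python
import re

# One stack frame, e.g. "[<ffffffff817650e9>] schedule+0x29/0x70".
# An optional "?" before the symbol marks a frame the kernel could not verify.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

def call_trace(log_text):
    """Return (symbol, verified) pairs in the order they appear in the log."""
    # Collapse all whitespace so frames split across wrapped lines still match.
    flat = " ".join(log_text.split())
    return [(m.group(2), m.group(1) is None) for m in FRAME_RE.finditer(flat)]

# Three frames taken from the report; the first one is wrapped on purpose.
sample = """\
Apr 29 20:41:24 DesktopMB kernel: [47030.195292]  [<ffffffff8115f460>] ?
__lock_page+0x70/0x70
Apr 29 20:41:24 DesktopMB kernel: [47030.195295]  [<ffffffff817650e9>] schedule+0x29/0x70
Apr 29 20:41:24 DesktopMB kernel: [47030.195297]  [<ffffffff817651bf>] io_schedule+0x8f/0xd0
"""

frames = call_trace(sample)
verified = [name for name, ok in frames if ok]
```

Applied to the full trace, the verified frames read from the top down as schedule, io_schedule, sleep_on_page, and so on to sys_sync: the sync call is waiting for page writeback to finish.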

In such a case I can only do an emergency remount and
power off the system. Nothing else.

I am still not sure how exactly to reproduce the error. But it only
happens when the filesystem has been used extensively for some time (I
am doing kernel compilation on that fs).

When I mount the filesystem and then sync, the sync completes without
problems. After a full day of work with some compiles, I can no longer
unmount the fs, as the sync then blocks.

A btrfs scrub reports no errors. A btrfs check reports hundreds of errors:

...
Incorrect global backref count on 91295694848 found 1 wanted 2
backpointer mismatch on [91295694848 16384]
ref mismatch on [91295711232 16384] extent item 1, found 2
Incorrect global backref count on 91295711232 found 1 wanted 2
backpointer mismatch on [91295711232 16384]
ref mismatch on [91295727616 16384] extent item 1, found 2
Incorrect global backref count on 91295727616 found 1 wanted 2
backpointer mismatch on [91295727616 16384]
ref mismatch on [91295744000 16384] extent item 1, found 2
Incorrect global backref count on 91295744000 found 1 wanted 2
backpointer mismatch on [91295744000 16384]
ref mismatch on [91295760384 16384] extent item 1, found 2
Incorrect global backref count on 91295760384 found 1 wanted 2
backpointer mismatch on [91295760384 16384]
ref mismatch on [91295776768 16384] extent item 1, found 2
Incorrect global backref count on 91295776768 found 1 wanted 2
backpointer mismatch on [91295776768 16384]
ref mismatch on [91295793152 16384] extent item 1, found 2
Incorrect global backref count on 91295793152 found 1 wanted 2
backpointer mismatch on [91295793152 16384]
ref mismatch on [91295809536 16384] extent item 1, found 2
Incorrect global backref count on 91295809536 found 1 wanted 2
backpointer mismatch on [91295809536 16384]
ref mismatch on [91295825920 16384] extent item 1, found 2
Incorrect global backref count on 91295825920 found 1 wanted 2
backpointer mismatch on [91295825920 16384]
Errors found in extent allocation tree or chunk allocation
checking free space cache
free space inode generation (0) did not match free space cache generation (3272)
checking fs roots
checking csums
checking root refs
found 42565280463 bytes used err is 0
total csum bytes: 82221224
total tree bytes: 3689676800
total fs tree bytes: 3475308544
total extent tree bytes: 109215744
btree space waste bytes: 609518812
file data blocks allocated: 118278844416
 referenced 117607366656
Btrfs v3.12
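
With hundreds of such lines, a tally can be easier to read than the raw output. The following Python sketch is not from the original post, and `PATTERNS` and `summarize_check` are hypothetical names. It counts each kind of message and collects the distinct extents involved. The patterns are copied from the line shapes shown above and may not match other btrfs-progs versions. The first number in each bracket is assumed to be the extent's starting byte offset.

```python
import re
from collections import Counter

# Line shapes as they appear in the check output above (Btrfs v3.12).
PATTERNS = {
    "ref mismatch": re.compile(r"^ref mismatch on \[(\d+) (\d+)\]"),
    "backref count": re.compile(r"^Incorrect global backref count on (\d+) "),
    "backpointer mismatch": re.compile(r"^backpointer mismatch on \[(\d+) (\d+)\]"),
}

def summarize_check(output):
    """Count messages per kind and collect the distinct extent offsets."""
    counts = Counter()
    extents = set()
    for line in output.splitlines():
        for kind, pattern in PATTERNS.items():
            m = pattern.match(line.strip())
            if m:
                counts[kind] += 1
                extents.add(int(m.group(1)))
                break
    return counts, extents

# A few lines taken from the output above.
sample = """\
Incorrect global backref count on 91295694848 found 1 wanted 2
backpointer mismatch on [91295694848 16384]
ref mismatch on [91295711232 16384] extent item 1, found 2
Incorrect global backref count on 91295711232 found 1 wanted 2
backpointer mismatch on [91295711232 16384]
"""

counts, extents = summarize_check(sample)
```

In the excerpt shown, each flagged extent is 16384 bytes long and starts exactly 16384 bytes after the previous one, so the messages describe one contiguous run of blocks.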

Kernel is 3.14.1, Ubuntu 14.04.

The question is: What shall I do? btrfs check --repair? Or better just
copy all data to a new fs and delete the partition?

Michael



* Re: task sync:2450 blocked for more than 120 seconds.
From: Neuer User @ 2014-05-01  8:14 UTC (permalink / raw)
  To: linux-btrfs

It's probably best to copy all my data to another disk, then delete the
partition and make a new ext4 partition. btrfs is probably still too
experimental, I guess.





* Re: task sync:2450 blocked for more than 120 seconds.
From: Duncan @ 2014-05-01 17:16 UTC (permalink / raw)
  To: linux-btrfs

Neuer User posted on Thu, 01 May 2014 10:14:35 +0200 as excerpted:

> It's probably best to copy all my data to another disk, then delete the
> partition and make a new ext4 partition. btrfs is probably still too
> experimental, I guess.

I thought I had replied to this one, but it was to a different post 
about a similar problem.

The blocked-task problem isn't uncommon, and apparently covers a number 
of distinct bugs with similar symptoms.  One thing the devs have been 
requesting on such reports is the output from a sysrq-w, which dumps the 
blocked tasks so the devs can trace where the blocks are happening and 
fix the bugs.  Use either alt-sysrq-w, if you're running directly on hardware, or
echo w > /proc/sysrq-trigger, which should work in VMs, etc., as well.
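
For anyone who wants to script this, here is a rough Python sketch under stated assumptions. The function names `request_blocked_task_dump` and `hung_tasks` are hypothetical. The first function does the same write as the echo command above. The second picks the hung-task header lines out of kernel log text, using the format seen in the original post.

```python
import re
from pathlib import Path

def request_blocked_task_dump(trigger="/proc/sysrq-trigger"):
    """Write 'w' to the sysrq trigger, as `echo w > /proc/sysrq-trigger` does."""
    # On a real system this needs root; the dump goes to the kernel log.
    Path(trigger).write_text("w\n")

# Header printed by the hung-task detector, as seen in the original post.
# \s+ before "blocked" also accepts a line break added by a mail client.
HUNG_RE = re.compile(
    r"INFO: task (\S+):(\d+)\s+blocked for more than (\d+) seconds"
)

def hung_tasks(kernel_log):
    """Return (command, pid, seconds) for each hung-task header found."""
    return [
        (cmd, int(pid), int(secs))
        for cmd, pid, secs in HUNG_RE.findall(kernel_log)
    ]

sample = (
    "Apr 29 20:41:24 DesktopMB kernel: [47030.195270] INFO: task sync:2450\n"
    "blocked for more than 120 seconds.\n"
)
found = hung_tasks(sample)
```

The trigger path is a parameter so that the function can be tried against an ordinary file first. Calling `request_blocked_task_dump()` with no argument, as root, performs the real request. The dump itself then has to be collected from the kernel log, for example with dmesg.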

But you're correct, tracking down and fixing all these bugs is likely to 
take a while, and if you simply want something that works and don't have 
the luxury of testing and waiting for fixes, switching back to ext4 or 
whatever, until btrfs has matured a bit longer, would be the recommended 
course of action.

And one more comment, a hint for next time.  On mailing lists it's common 
to quote what you're replying to and reply below it.  Top posting the 
reply works for that post, but messes things up for others trying to 
reply to the thread as well and for later readers, since now there's 
either a mixture of top-posted and in-context-posted replies at multiple 
levels or people end up omitting entirely the second level and deeper 
quotes that might have put the whole thing in context.  Either case makes 
the thread difficult to follow or to properly reply to. =:^(

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

