* Btrfs: blocked for more than 120 seconds, made worse by 3.2 rc7
@ 2011-12-28 19:26 Konstantinos Skarlatos
  2011-12-28 20:36 ` Konstantinos Skarlatos
  2011-12-28 21:48 ` Dave Chinner
  0 siblings, 2 replies; 4+ messages in thread
From: Konstantinos Skarlatos @ 2011-12-28 19:26 UTC
To: linux-kernel; +Cc: Linux Btrfs, Chris Mason

Hello all:
I have two machines with btrfs that give me the "blocked for more
than 120 seconds" message. After that I cannot write anything to
disk, I am unable to unmount the btrfs filesystem, and I can only
reboot with sysrq-trigger.

It always happens when I write many files with rsync over the network.
When I used 3.2rc6 it happened randomly on both machines after
50-500 GB of writes. With rc7 it happens after far fewer writes,
probably 10 GB or so, but only on machine 1 for the time being.
Machine 2 has not crashed yet after 200 GB of writes and I am still
testing that.

machine 1: btrfs on a 6 TB sparse file, mounted as loop, on an XFS
filesystem that lies on a 10 TB md raid5. mount options
compress=zlib,compress-force

machine 2: btrfs over md raid 5 (4x2TB) = 5.5 TB filesystem. mount
options compress=zlib,compress-force

pastebins:

machine1:
3.2rc7 http://pastebin.com/u583G7jK
3.2rc6 http://pastebin.com/L12TDaXa

machine2:
3.2rc6 http://pastebin.com/khD0wGXx
3.2rc7 (not crashed yet)
* Re: Btrfs: blocked for more than 120 seconds, made worse by 3.2 rc7
  2011-12-28 19:26 Btrfs: blocked for more than 120 seconds, made worse by 3.2 rc7 Konstantinos Skarlatos
@ 2011-12-28 20:36 ` Konstantinos Skarlatos
  2011-12-28 21:48 ` Dave Chinner
  1 sibling, 0 replies; 4+ messages in thread
From: Konstantinos Skarlatos @ 2011-12-28 20:36 UTC
To: linux-kernel; +Cc: Linux Btrfs, Chris Mason

Well now machine2 has just crashed too...
http://pastebin.com/gvfUm0az

On Wednesday, 28 December 2011 9:26:07 PM, Konstantinos Skarlatos wrote:
> Hello all:
> I have two machines with btrfs, that give me the "blocked for more
> than 120 seconds" message. After that I cannot write anything to disk,
> i am unable to unmount the btrfs filesystem and i can only reboot with
> sysrq-trigger.
>
> It always happens when i write many files with rsync over network.
> When i used 3.2rc6 it happened randomly on both machines after
> 50-500gb of writes. with rc7 it happens after much less writes,
> probably 10gb or so, but only on machine 1 for the time being. machine
> 2 has not crashed yet after 200gb of writes and I am still testing that.
>
> machine 1: btrfs on a 6tb sparse file, mounted as loop, on a xfs
> filesystem that lies on a 10TB md raid5. mount options
> compress=zlib,compress-force
>
> machine 2: btrfs over md raid 5 (4x2TB)=5.5TB filesystem. mount
> options compress=zlib,compress-force
>
> pastebins:
>
> machine1:
> 3.2rc7 http://pastebin.com/u583G7jK
> 3.2rc6 http://pastebin.com/L12TDaXa
>
> machine2:
> 3.2rc6 http://pastebin.com/khD0wGXx
> 3.2rc7 (not crashed yet)
* Re: Btrfs: blocked for more than 120 seconds, made worse by 3.2 rc7
  2011-12-28 19:26 Btrfs: blocked for more than 120 seconds, made worse by 3.2 rc7 Konstantinos Skarlatos
  2011-12-28 20:36 ` Konstantinos Skarlatos
@ 2011-12-28 21:48 ` Dave Chinner
  2011-12-28 21:58   ` Konstantinos Skarlatos
  1 sibling, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2011-12-28 21:48 UTC
To: Konstantinos Skarlatos; +Cc: linux-kernel, Linux Btrfs, Chris Mason

On Wed, Dec 28, 2011 at 09:26:07PM +0200, Konstantinos Skarlatos wrote:
> Hello all:
> I have two machines with btrfs, that give me the "blocked for more
> than 120 seconds" message. After that I cannot write anything to
> disk, i am unable to unmount the btrfs filesystem and i can only
> reboot with sysrq-trigger.
>
> It always happens when i write many files with rsync over network.
> When i used 3.2rc6 it happened randomly on both machines after
> 50-500gb of writes. with rc7 it happens after much less writes,
> probably 10gb or so, but only on machine 1 for the time being.
> machine 2 has not crashed yet after 200gb of writes and I am still
> testing that.
>
> machine 1: btrfs on a 6tb sparse file, mounted as loop, on a xfs
> filesystem that lies on a 10TB md raid5. mount options
> compress=zlib,compress-force
>
> machine 2: btrfs over md raid 5 (4x2TB)=5.5TB filesystem. mount
> options compress=zlib,compress-force
>
> pastebins:
>
> machine1:
> 3.2rc7 http://pastebin.com/u583G7jK
> 3.2rc6 http://pastebin.com/L12TDaXa

These two are caused by it taking longer than 120s for XFS to fsync
the loop file. Writing a significant chunk of a sparse 6TB file on a
software RAID5 volume is going to take some time. However, if IO
is not occurring, then somewhere below XFS an IO has gone missing
(MD or hardware problem) because the fsync on the XFS file is
blocked waiting for an IO completion.
> machine2:
> 3.2rc6 http://pastebin.com/khD0wGXx
> 3.2rc7 (not crashed yet)

These don't have XFS in the picture, but also appear to be hung
waiting on IO completion with MD stuck in
make_request()->get_active_stripe(). That, to me, indicates an MD
problem.....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
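The distinction drawn above, between a slow fsync and an IO that has gone missing below the filesystem, can be checked roughly by sampling /proc/diskstats twice: a device that still reports requests in flight while its completion counters stay frozen is a candidate for the "missing IO" case. The following is only a minimal sketch under stated assumptions, not something proposed in this thread. The names DiskSample, parse_diskstats, stalled_devices and check are hypothetical, the 10 second interval is arbitrary, and it assumes the kernel reports a meaningful in-flight count for the device in question, which may not hold for every stacked device such as md or loop.

```python
import time
from typing import Dict, List, NamedTuple


class DiskSample(NamedTuple):
    completed: int  # reads completed + writes completed since boot
    in_flight: int  # I/Os currently in progress


def parse_diskstats(text: str) -> Dict[str, DiskSample]:
    """Parse /proc/diskstats-style text into {device name: DiskSample}."""
    samples: Dict[str, DiskSample] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        name = fields[2]
        reads_completed = int(fields[3])
        writes_completed = int(fields[7])
        in_flight = int(fields[11])
        samples[name] = DiskSample(reads_completed + writes_completed, in_flight)
    return samples


def stalled_devices(before: Dict[str, DiskSample],
                    after: Dict[str, DiskSample]) -> List[str]:
    """Devices with I/O in flight whose completion counters did not move."""
    return sorted(
        name
        for name, now in after.items()
        if name in before
        and now.in_flight > 0
        and now.completed == before[name].completed
    )


def check(path: str = "/proc/diskstats", interval: float = 10.0) -> List[str]:
    """Take two samples `interval` seconds apart and list suspect devices."""
    with open(path) as f:
        first = parse_diskstats(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_diskstats(f.read())
    return stalled_devices(first, second)
```

Calling check() while the hang is in progress returns the names of devices that look stuck. An empty list suggests IO is still completing, only slowly; a non-empty list is a hint, not proof, that the problem sits in that device or below it.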
* Re: Btrfs: blocked for more than 120 seconds, made worse by 3.2 rc7
  2011-12-28 21:48 ` Dave Chinner
@ 2011-12-28 21:58   ` Konstantinos Skarlatos
  0 siblings, 0 replies; 4+ messages in thread
From: Konstantinos Skarlatos @ 2011-12-28 21:58 UTC
To: Dave Chinner; +Cc: linux-kernel, Linux Btrfs, Chris Mason, linux-raid

On Wednesday, 28 December 2011 11:48:32 PM, Dave Chinner wrote:
> On Wed, Dec 28, 2011 at 09:26:07PM +0200, Konstantinos Skarlatos wrote:
>> Hello all:
>> I have two machines with btrfs, that give me the "blocked for more
>> than 120 seconds" message. After that I cannot write anything to
>> disk, i am unable to unmount the btrfs filesystem and i can only
>> reboot with sysrq-trigger.
>>
>> It always happens when i write many files with rsync over network.
>> When i used 3.2rc6 it happened randomly on both machines after
>> 50-500gb of writes. with rc7 it happens after much less writes,
>> probably 10gb or so, but only on machine 1 for the time being.
>> machine 2 has not crashed yet after 200gb of writes and I am still
>> testing that.
>>
>> machine 1: btrfs on a 6tb sparse file, mounted as loop, on a xfs
>> filesystem that lies on a 10TB md raid5. mount options
>> compress=zlib,compress-force
>>
>> machine 2: btrfs over md raid 5 (4x2TB)=5.5TB filesystem. mount
>> options compress=zlib,compress-force
>>
>> pastebins:
>>
>> machine1:
>> 3.2rc7 http://pastebin.com/u583G7jK
>> 3.2rc6 http://pastebin.com/L12TDaXa
>
> These two are caused by it taking longer than 120s for XFS to fsync
> the loop file. Writing a signficant chunk of a sparse 6TB file on a
> software RAID5 volume is going to take some time. However, if IO
> is not occurring, then somewhere below XFS an IO has gone missing
> (MD or hardware problem) because the fsync on the XFS file is
> blocked waiting for an IO completion.
>
>> machine2:
>> 3.2rc6 http://pastebin.com/khD0wGXx
>> 3.2rc7 (not crashed yet)

Crashed a few hours ago, here is the rc7 pastebin:
http://pastebin.com/gvfUm0az

>
> These don't have XFS in the picture, but also appear to be hung
> waiting on IO completion with MD stuck in
> make_request()->get_active_stripe(). That, to me, indicates an MD
> problem.....
>

I have added the linux-raid mailing list.
Please reply to me too, because I am not subscribed.

> Cheers,
>
> Dave.