* hung task timer + btrfs_convert or btrfs balance = OOPS
@ 2014-04-23 20:44 Robert White
2014-04-23 21:12 ` Marc MERLIN
0 siblings, 1 reply; 3+ messages in thread
From: Robert White @ 2014-04-23 20:44 UTC (permalink / raw)
To: linux-btrfs
The first mount of a non-trivial file system after a btrfs_convert, or
an ongoing btrfs balance on a file system containing large files, may
lead to an oops (and a pathologically damaged file system) if the hung
task timer (CONFIG_DETECT_HUNG_TASK=y) is compiled into the Linux kernel
and not disabled.
I've had two systems destroyed after a btrfs_convert. On the first, the
first mount after the conversion took several minutes. The hung task
timer expired against some internal btrfs daemon; I think it was
'[btrfs-transacti]'. That task then oopsed and the file system was
chock full of errors, so many that I no longer trusted the conversion,
so I ran mkfs.btrfs and restored from backup.
On another system the same thing happened after a successful convert and
mount (I'd remembered to disable the timer during the first mount) when
a btrfs balance was running.
Whatever is blocking in that task really ought not to block for 2+
minutes; it should sleep on some data structure instead.
As it is, the two options are not happy together. Be sure to
echo 0 > /proc/sys/kernel/hung_task_timeout_secs
to disable the timer before doing a mount or balance after a
btrfs_convert (and possibly before any btrfs balance that might move a
very large file, such as a VM disk image).
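A minimal sketch of that workaround, assuming you would rather put the old timeout back once the slow operation has finished. The function name `with_hung_task_timer_off` and the `HUNG_TASK_FILE` override are hypothetical names introduced here for illustration; only the sysctl path comes from the message above. Writing to the real file needs root.

```shell
#!/bin/sh
# Hypothetical helper (not part of btrfs-progs or the kernel tools):
# run one command with the hung task timer disabled, then restore the
# previous timeout value.
#
# HUNG_TASK_FILE is an assumed override so the function can be tried
# against an ordinary file; by default it points at the real sysctl.
with_hung_task_timer_off() {
    htt_file="${HUNG_TASK_FILE:-/proc/sys/kernel/hung_task_timeout_secs}"

    # Remember the current timeout so it can be restored afterwards.
    htt_old=$(cat "$htt_file") || return 1

    # 0 disables the hung task check, as in the message above.
    echo 0 > "$htt_file" || return 1

    # Run the wrapped command and keep its exit status.
    "$@"
    htt_status=$?

    # Put the original timeout back regardless of how the command ended.
    echo "$htt_old" > "$htt_file"
    return "$htt_status"
}

# Example use (device and mount point are placeholders):
#   with_hung_task_timer_off mount /dev/sdX /mnt
#   with_hung_task_timer_off btrfs balance start /mnt
```

Restoring the saved value keeps the timer available for everything else on the system; it does nothing about whatever made the task block in the first place.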
* Re: hung task timer + btrfs_convert or btrfs balance = OOPS
2014-04-23 20:44 hung task timer + btrfs_convert or btrfs balance = OOPS Robert White
@ 2014-04-23 21:12 ` Marc MERLIN
2014-04-23 22:59 ` Robert White
0 siblings, 1 reply; 3+ messages in thread
From: Marc MERLIN @ 2014-04-23 21:12 UTC (permalink / raw)
To: Robert White; +Cc: linux-btrfs
On Wed, Apr 23, 2014 at 01:44:21PM -0700, Robert White wrote:
> As it is, the two options are not happy together. Be sure to
>
> echo 0 > /proc/sys/kernel/hung_task_timeout_secs
>
> to disable the timer before doing a mount or balance after a
But this only removes the messages from syslog, it doesn't actually
fix/remove any hangs, unless I'm missing something.
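One way to check this on a given machine is to look at the detector's sysctls. As far as I know, the detector only logs a warning and a stack trace unless `kernel.hung_task_panic` is set to 1, in which case the kernel panics when a hung task is found. The sketch below just prints both values; the function name `show_hung_task_settings` and its optional directory argument are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the hung task detector settings.
# The optional directory argument is an assumption for illustration
# (it lets the function be pointed at a copy); the default is the
# real sysctl directory.
show_hung_task_settings() {
    hts_dir="${1:-/proc/sys/kernel}"
    for hts_name in hung_task_timeout_secs hung_task_panic; do
        if [ -r "$hts_dir/$hts_name" ]; then
            printf '%s=%s\n' "$hts_name" "$(cat "$hts_dir/$hts_name")"
        else
            # Not readable, e.g. kernel built without CONFIG_DETECT_HUNG_TASK.
            printf '%s=unavailable\n' "$hts_name"
        fi
    done
}
```

If `hung_task_panic` reads 0, an expired timer should produce only log messages, which fits the objection above.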
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
* Re: hung task timer + btrfs_convert or btrfs balance = OOPS
2014-04-23 21:12 ` Marc MERLIN
@ 2014-04-23 22:59 ` Robert White
0 siblings, 0 replies; 3+ messages in thread
From: Robert White @ 2014-04-23 22:59 UTC (permalink / raw)
To: Marc MERLIN; +Cc: linux-btrfs
On 04/23/2014 02:12 PM, Marc MERLIN wrote:
> On Wed, Apr 23, 2014 at 01:44:21PM -0700, Robert White wrote:
>> As it is, the two options are not happy together. Be sure to
>>
>> echo 0 > /proc/sys/kernel/hung_task_timeout_secs
>>
>> to disable the timer before doing a mount or balance after a
>
> But this only removes the messages from syslog, it doesn't actually
> fix/remove any hangs, unless I'm missing something.
All I know for sure is that the prompt returned instantly after the
oops, and the file system was trashed.
Maybe I have the cause and effect backwards, and the btrfs_convert had
some subtle failure that coincidentally led to the delay. With the drive
trashed I assumed the oops was causative, because I had to jump right
into rebuilding the system.
-- Rob.