linux-btrfs.vger.kernel.org archive mirror
* hung task timer + btrfs_convert or btrfs balance = OOPS
From: Robert White @ 2014-04-23 20:44 UTC
  To: linux-btrfs

The first mount of a non-trivial file system after a btrfs_convert, or 
an ongoing btrfs balance on a file system containing large files, may 
lead to an oops (and a pathologically damaged file system) if the hung 
task timer (CONFIG_DETECT_HUNG_TASK=y) is compiled into the Linux kernel 
and not disabled.

I've had two systems destroyed after a btrfs_convert. After the 
conversion the first mount took several minutes. The hung task timer 
expired against some internal btrfs daemon; I think it was 
'[btrfs-transacti]'. That task then oopsed, and the file system was 
chock full of errors, so many that I no longer trusted the conversion. 
I ran mkfs.btrfs and restored from backup.

On another system the same thing happened while a btrfs balance was 
running, after a successful convert and mount (I'd remembered to disable 
the timer during the first mount).

Whatever is blocking in that task really ought not to block for 2+ 
minutes; it should sleep on some data structure instead.

As it is, the two options are not happy together. Be sure to

echo 0 > /proc/sys/kernel/hung_task_timeout_secs

to disable the timer before doing a mount or balance after a 
btrfs_convert (and possibly before any btrfs balance that might move a 
very large file, such as a VM disk image).
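
If you would rather not leave the timer off permanently, the same 
workaround can be wrapped so the old timeout is put back afterwards. 
This is only a sketch: the function name and the HUNG_TASK_FILE override 
are made up for illustration, and the mount point in the example is a 
placeholder. It has to run as root to write the real sysctl file.

```shell
#!/bin/sh
# Hypothetical helper (not part of btrfs-progs): run one command with the
# hung task timer disabled, then restore the previous timeout.
# HUNG_TASK_FILE can be overridden to rehearse against a scratch file.
HUNG_TASK_FILE="${HUNG_TASK_FILE:-/proc/sys/kernel/hung_task_timeout_secs}"

with_hung_task_timer_disabled() {
    # Remember the current timeout so it can be restored later.
    saved=$(cat "$HUNG_TASK_FILE") || return 1
    # Writing 0 disables the timer, as in the echo command above.
    echo 0 > "$HUNG_TASK_FILE" || return 1
    "$@"
    status=$?
    # Restore the old value whether or not the command succeeded.
    echo "$saved" > "$HUNG_TASK_FILE"
    return "$status"
}

# Example (placeholder mount point):
#   with_hung_task_timer_disabled btrfs balance start /mnt/data
```

The function returns the exit status of the wrapped command, so it can 
be used in scripts the same way as the bare mount or balance.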



* Re: hung task timer + btrfs_convert or btrfs balance = OOPS
From: Marc MERLIN @ 2014-04-23 21:12 UTC
  To: Robert White; +Cc: linux-btrfs

On Wed, Apr 23, 2014 at 01:44:21PM -0700, Robert White wrote:
> As it is, the two options are not happy together. Be sure to
> 
> echo 0 > /proc/sys/kernel/hung_task_timeout_secs
> 
> to disable the timer before doing a mount or balance after a 

But this only removes the messages from syslog, it doesn't actually
fix/remove any hangs, unless I'm missing something.
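
One way to check what the detector is set up to do on a given box is to 
read its two sysctl files. A rough sketch, assuming the kernel exposes 
both hung_task_timeout_secs and hung_task_panic under /proc/sys/kernel; 
the function name and the HUNG_TASK_DIR override are made up for 
illustration.

```shell
#!/bin/sh
# Hypothetical helper: describe the hung task detector configuration.
# HUNG_TASK_DIR can be overridden to point at a scratch directory.
HUNG_TASK_DIR="${HUNG_TASK_DIR:-/proc/sys/kernel}"

describe_hung_task_settings() {
    timeout=$(cat "$HUNG_TASK_DIR/hung_task_timeout_secs") || return 1
    panic=$(cat "$HUNG_TASK_DIR/hung_task_panic") || return 1
    if [ "$timeout" -eq 0 ]; then
        # A timeout of 0 means the check is switched off.
        echo "detector disabled"
    elif [ "$panic" -eq 0 ]; then
        # With hung_task_panic=0 the detector only logs a warning.
        echo "detector warns after ${timeout}s"
    else
        echo "detector panics after ${timeout}s"
    fi
}
```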

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: hung task timer + btrfs_convert or btrfs balance = OOPS
From: Robert White @ 2014-04-23 22:59 UTC
  To: Marc MERLIN; +Cc: linux-btrfs

On 04/23/2014 02:12 PM, Marc MERLIN wrote:
> On Wed, Apr 23, 2014 at 01:44:21PM -0700, Robert White wrote:
>> As it is, the two options are not happy together. Be sure to
>>
>> echo 0 > /proc/sys/kernel/hung_task_timeout_secs
>>
>> to disable the timer before doing a mount or balance after a
>
> But this only removes the messages from syslog, it doesn't actually
> fix/remove any hangs, unless I'm missing something.

All I know for sure is that the prompt returned instantly after the 
oops and the file system was trashed.

Maybe I have the cause and effect backwards, and the btrfs_convert had 
some subtle failure that coincidentally led to the delay. With the drive 
trashed I assumed the oops was causative, because I had to jump right 
into rebuilding the system.



-- Rob.
