Re: btrfs-transaction blocked for more than 120 seconds

From: Marc MERLIN <marc@merlins.org>
To: Duncan <1i5t5.duncan@cox.net>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: btrfs-transaction blocked for more than 120 seconds
Date: Sat, 4 Jan 2014 22:39:57 -0800
Message-ID: <20140105063957.GF11749@merlins.org>
In-Reply-To: <pan$6ce5a$53830c0$382c1e0e$7169217c@cox.net>

On Fri, Jan 03, 2014 at 09:34:10PM +0000, Duncan wrote:
> > Thank you for that tip, I had been unaware of it 'till now.
> > This will make my virtualbox image directory much happier :)
> 
> I think I said it, but it bears repeating.  Once you set that attribute 
> on the dir, you may want to move the files out of the dir (to another 
> partition would make sure the data is actually moved) and back in, so 
> they're effectively new files in the dir.  Or use something like cat 
> oldfile > newfile, so you know it's actually creating the new file, not 
> reflinking.  That'll ensure the NOCOW takes effect.

Yes, I got that. That's why I ran btrfs defrag on the files after that (I
explained why: a copy would waste lots of snapshot space by replacing all
the blocks needlessly).
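
For reference, the "cat oldfile > newfile" approach quoted above could be
sketched roughly as follows. This is only a sketch under assumptions: the
directory is assumed to already carry the NOCOW attribute (set separately
with chattr +C), and the function name and temporary suffix are made up
for illustration.

```shell
# Rewrite a file as a brand-new file in the same directory, so that it
# picks up attributes (such as NOCOW) that new files inherit from the dir.
# Assumption: the directory was already prepared with: chattr +C "$dir"
# rewrite_as_new_file and the .nocow-tmp suffix are hypothetical names.
rewrite_as_new_file() {
    src=$1
    tmp="$src.nocow-tmp"
    # cat with a redirect writes the data out again into a new file,
    # rather than sharing extents with the old one.
    cat -- "$src" > "$tmp" || { rm -f -- "$tmp"; return 1; }
    # Replace the old file. Note: mode and ownership of the new file come
    # from the umask and current user, not from the original file.
    mv -- "$tmp" "$src"
}
```

Usage would be something like rewrite_as_new_file Win7.vdi, run inside
the prepared directory. As noted, the old blocks stay referenced by any
existing snapshots, so this costs space until those snapshots are deleted.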
 
> > Unfortunately, on a 83GB vdi (virtualbox) file, with 3.12.5, it did a
> > lot of writing and chewed up my 4 CPUs. Then, it started to be hard to
> > move my mouse cursor and my procmeter graph was barely updating seconds.
> > Next, nothing updated on my X server anymore, not even seconds in time
> > widgets.
> > 
> > But, I could still sometimes move my mouse cursor, and I could sometimes
> > see the HD light flicker a bit before going dead again. In other words,
> > the system wasn't fully deadlocked, but btrfs sure got into a state
> > where it was unable to finish the job, and took the kernel down with
> > it (64bit, 8GB of RAM).
> > 
> > I waited 2H and it never came out of it, I had to power down the system
> > in the end.  Note that this was on a top of the line 500MB/s write
> > Samsung Evo 840 SSD, not a slow HD.
> 
> That was defrag (the command) or autodefrag (the mount option)?  I'd 
> guess defrag (the command).

defrag, the btrfs subcommand.

> That's fragmentation for you!  What did/does filefrag have to say about 
> that file?  Were you the one that posted the 6-digit extents?

Nope, I never posted anything until now. Hopefully you agree that it's
not ok for btrfs/kernel to just kill my system for over 2H, until I power
it off, because of defragging one file. I did hit a severe performance bug,
if it's not a never-ending loop.

gandalfthegreat:/var/local/nobck/VirtualBox VMs/Win7# filefrag Win7.vdi 
Win7.vdi: 156222 extents found

Considering how virtualbox works, that's hardly surprising.
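
To track the extent count over time (say, before and after a defrag
attempt), the number can be pulled out of the filefrag line. A minimal
sketch, assuming the output keeps the "NAME: N extents found" shape shown
above; extent_count is a made-up helper name.

```shell
# Read filefrag output on stdin and print only the extent count.
# Assumes lines of the form "NAME: N extents found" (or "1 extent found").
# extent_count is a hypothetical helper, not part of e2fsprogs.
extent_count() {
    # The greedy ".*: " skips colons and spaces inside the file name and
    # anchors on the last ": " that is followed by the number.
    sed -n 's/.*: \([0-9][0-9]*\) extents\{0,1\} found$/\1/p'
}
```

For example: filefrag Win7.vdi | extent_count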

> For something that bad, it might be faster to copy/move it off-device 
> (expect it to take awhile) then move it back.  That way you're only 
> trying to read OR write on the device, not both, and the move elsewhere 
> should defrag it quite a bit, effectively sequential write, then read and 
> write on the move back.

Yes, I know how I can work around the problem (although I'll likely have
to delete all my historical snapshots to delete the old blocks, which I
don't love to do).
But doesn't it make sense to see why the kernel is near deadlocking on a
single file defrag first?
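
For completeness, the off-device round trip suggested above could look
roughly like this. It is a sketch, not something I ran on the file in
question: roundtrip_defrag is a made-up name, and the second argument is
assumed to be a directory on a different device with enough free space.

```shell
# Copy a file to a scratch directory, verify the copy, then write it back
# sequentially as a new file.
# roundtrip_defrag and scratch_dir are hypothetical names; scratch_dir is
# assumed to live on another device.
roundtrip_defrag() {
    file=$1
    scratch_dir=$2
    copy="$scratch_dir/${file##*/}"
    cat -- "$file" > "$copy" || return 1
    # Compare checksums before touching the original.
    if [ "$(cksum < "$file")" != "$(cksum < "$copy")" ]; then
        rm -f -- "$copy"
        return 1
    fi
    rm -- "$file" || return 1
    # If this write fails, the data is still available in "$copy".
    cat -- "$copy" > "$file" || return 1
    rm -- "$copy"
}
```

The old extents only get freed once no snapshot references them anymore,
which is the cost mentioned above.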

> But even that might be prohibitive.  At some point, you may need to 
> either simply give up on it (if you're lazy), or get down and dirty with 
> the tracing/profiling, working with a dev to figure out where it's 
> spending its time and hopefully get btrfs recoded to work a bit faster 
> for that sort of thing.

I'm on my way to a linux conf where I'm speaking, so I have limited time
and can't crash my laptop, but I'm happy to type some commands and give
output.

> As I suggested above, you might try the old school method of defrag, move 
> the file to a different device, then move it back.  And if possible do it 
> when nothing else is using the system.  But it may simply be practically 
> inaccessible with a current kernel, in which case you'd either have to 
> work with the devs to optimize, or give it up as a lost cause. =:(
 
I can fix my problem. Actually, virtualbox works fine with the fragmented
file, without even feeling slow, so I don't really need to fix it
urgently; I was just trying it out after your post.
 
> Then if the process completed successfully, you could cat the parts back 
> together again... and the written parts would be basically sequential, so 
> that should go MUCH faster! =:^)

All that noted, but I'm not desperate, just trying commands I hadn't
tried yet :)

Thanks for your replies,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901
