* system hangs due to qgroups
@ 2016-12-03 18:40 Marc Joliet
2016-12-03 20:42 ` Chris Murphy
2016-12-05 0:39 ` Qu Wenruo
0 siblings, 2 replies; 22+ messages in thread
From: Marc Joliet @ 2016-12-03 18:40 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 3186 bytes --]
Hello all,
I'm having some trouble with btrfs on a laptop, possibly due to qgroups.
Specifically, some file system activities (e.g., snapshot creation,
baloo_file_extractor from KDE Plasma) cause the system to hang for up to about
40 minutes, maybe more. It always causes (most of) my desktop to hang
(although I can usually navigate between pre-existing Konsole tabs) and
prevents new programs from starting. I've seen the system load go up to >30
before the laptop suddenly resumes normal operation. I've been seeing this
since Linux 4.7, maybe as early as 4.6.
Now, I thought that maybe this was (indirectly) due to an overly full file
system (~90% full), so I deleted some things I didn't need to get it up to 15%
free. (For the record, I also tried mounting with ssd_spread.) After that, I
ran a balance with -dusage=50, which started out promising, but then went back
to the "bad" behaviour. *But* it seemed better than before overall, so I
started a balance with -musage=10, then -musage=50. That turned out to be a
mistake. Since I had to transport the laptop, and couldn't wait for "balance
cancel" to return (IIUC it only returns after the next block (group?) is
freed), I forced the laptop off.
After I next turned on the laptop, the balance resumed, causing bootup to
fail, after which I remembered about the skip_balance mount option, which I
tried in a rescue shell from an initramfs. But wait, that failed, too!
Specifically, the stack trace I get whenever I try it includes as one of the
last lines:
"RIP [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a8"
(I can take photos of the full stack trace if requested.)
So then I ran "btrfs qgroup show /sysroot/", which showed many quota groups,
much to my surprise. On the upside, at least now I discovered the likely
reason for the performance problems.
(I actually think I know why I'm seeing qgroups: at one point I was trying out
various snapshot/backup tools for btrfs, and one (I forgot which)
unconditionally activated quota support, which infuriated me, so I promptly
deactivated it, or so I thought. Is quota support automatically enabled when
qgroups are discovered, or did I perhaps not disable quota support properly?)
Since I couldn't use skip_balance, and logically can't destroy qgroups on a
read-only file system, I decided to wait for a regular mount to finish. That
has been running since Tuesday, and I am slowly growing impatient.
Thus I arrive at my question(s): is there anything else I can try, short of
reformatting and restoring from backup? Can I use btrfs-check here, or any
other tool? Or...?
Also, should I be able to avoid reformatting: how do I properly disable quota
support?
(BTW, searching for qgroup_fix_relocated_data_extents turned up the ML thread
"[PATCH] Btrfs: fix endless loop in balancing block groups", could that be
related?)
The laptop is currently running Gentoo with Linux 4.8.10 and btrfs-progs
4.8.4.
Greetings
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: system hangs due to qgroups
2016-12-03 18:40 system hangs due to qgroups Marc Joliet
@ 2016-12-03 20:42 ` Chris Murphy
2016-12-03 21:46 ` Marc Joliet
2016-12-05 0:39 ` Qu Wenruo
1 sibling, 1 reply; 22+ messages in thread
From: Chris Murphy @ 2016-12-03 20:42 UTC (permalink / raw)
To: Marc Joliet; +Cc: Btrfs BTRFS
On Sat, Dec 3, 2016 at 11:40 AM, Marc Joliet <marcec@gmx.de> wrote:
> Hello all,
>
> I'm having some trouble with btrfs on a laptop, possibly due to qgroups.
> Specifically, some file system activities (e.g., snapshot creation,
> baloo_file_extractor from KDE Plasma) cause the system to hang for up to about
> 40 minutes, maybe more.
Do you get any blocked tasks kernel messages? If so, issue sysrq+w
during the hang, and then check the system log (dmesg may not contain
everything if the command fills the message buffer). If it's a hang
without any kernel messages, then issue sysrq+t.
https://www.kernel.org/doc/Documentation/sysrq.txt
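For reference, once such a log has been saved, the names of the blocked tasks can be pulled out with a short pipeline. A minimal sketch over a fabricated two-line sample (the sample messages are illustrative, not from this incident):

```shell
# Hung-task messages have the form "INFO: task <name>:<pid> blocked ...";
# extract just the task names from a saved kernel log.
# The sample file below is fabricated for illustration.
cat > /tmp/kmsg.sample <<'EOF'
[  120.000001] INFO: task btrfs-transacti:412 blocked for more than 120 seconds.
[  120.000002] INFO: task mount:370 blocked for more than 120 seconds.
EOF
sed -n 's/.*INFO: task \([^:]*\):.*/\1/p' /tmp/kmsg.sample
```

Run against a real capture, this lists one task name per hung-task report.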
>
> After I next turned on the laptop, the balance resumed, causing bootup to
> fail, after which I remembered about the skip_balance mount option, which I
> tried in a rescue shell from an initramfs.
The file system is the root filesystem? If so, skip_balance may not be
happening soon enough. Use the kernel parameter rootflags=skip_balance,
which will apply this mount option at the very first moment the file
system is mounted during boot.
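On a GRUB-based setup, for instance, the flag would go on the kernel line of the boot entry; a sketch, where the kernel version and root= device are placeholders:

```text
linux /boot/vmlinuz-4.8.10 root=/dev/sda2 ro rootflags=skip_balance
```
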
> Since I couldn't use skip_balance, and logically can't destroy qgroups on a
> read-only file system, I decided to wait for a regular mount to finish. That
> has been running since Tuesday, and I am slowly growing impatient.
Haha, no kidding! I think that's very patient.
> Thus I arrive at my question(s): is there anything else I can try, short of
> reformatting and restoring from backup? Can I use btrfs-check here, or any
> other tool? Or...?
Yes, btrfs-progs 4.8.5 has the latest qgroup checks, so if there's
something wrong it should find it, and if it doesn't, that's a bug of its own.
> Also, should I be able to avoid reformatting: how do I properly disable quota
> support?
'btrfs quota disable' is the only command that applies to this and it
requires rw mount; there's no 'noquota' mount option.
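In practice that means getting the filesystem writable first; a rough sketch (the mount point is a placeholder, and the is_rw helper is just illustrative text parsing of /proc/mounts):

```shell
# Sketch: disable btrfs quotas once the filesystem is mounted read-write.
MNT=/mnt/btrfs-root   # placeholder mount point

# is_rw: report whether a /proc/mounts-style line carries the rw option
# (pure text parsing, so it can be exercised on any sample line)
is_rw() {
    echo "$1" | awk '{print $4}' | tr ',' '\n' | grep -qx rw
}

if is_rw "$(grep " $MNT " /proc/mounts)"; then
    btrfs quota disable "$MNT"
else
    echo "would need: mount -o remount,rw $MNT && btrfs quota disable $MNT"
fi
```
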
--
Chris Murphy
* Re: system hangs due to qgroups
2016-12-03 20:42 ` Chris Murphy
@ 2016-12-03 21:46 ` Marc Joliet
2016-12-03 22:56 ` Chris Murphy
2016-12-04 2:10 ` Adam Borowski
0 siblings, 2 replies; 22+ messages in thread
From: Marc Joliet @ 2016-12-03 21:46 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 4148 bytes --]
On Saturday 03 December 2016 13:42:42 Chris Murphy wrote:
> On Sat, Dec 3, 2016 at 11:40 AM, Marc Joliet <marcec@gmx.de> wrote:
> > Hello all,
> >
> > I'm having some trouble with btrfs on a laptop, possibly due to qgroups.
> > Specifically, some file system activities (e.g., snapshot creation,
> > baloo_file_extractor from KDE Plasma) cause the system to hang for up to
> > about 40 minutes, maybe more.
>
> Do you get any blocked tasks kernel messages? If so, issue sysrq+w
> during the hang, and then check the system log (dmesg may not contain
> everything if the command fills the message buffer). If it's a hang
> without any kernel messages, then issue sysrq+t.
>
> https://www.kernel.org/doc/Documentation/sysrq.txt
As it's a rescue shell, I have only the one shell AFAIK, and it's occupied by
mount. So I can't tell if there are dmesg entries; however, when this happens
on a normally running system, I never saw any dmesg entries. Anyway, I ran
both.
The output of sysrq+w mentions two tasks: "btrfs-transaction" with
btrfs_scrub_pause+0xbe/0xd0 as the top-most entry in the call trace, and
"mount" with its top-most entry at schedule+0x33/0x90 (it looks like it's
still in the "early" processing, since there's also
"btrfs_parse_early_options+0x190/0x190" in the call trace).
The output of sysrq+t is too big to capture all of it (i.e., I can't scroll
back to the beginning), but just looking at the task names that I *can* see,
there are: btrfs-fixup, various btrfs-endio*, btrfs-rmw, btrfs-freespace,
btrfs-delayed-m (cut off), btrfs-readahead, btrfs-qgroup-re (cut off),
btrfs-extent-re (cut off), btrfs-cleaner, and btrfs-transaction. Oh, and a
bunch of kworkers.
Should I take photos? That'll be annoying to do with all the scrolling, but I
can do that if need be.
> > After I next turned on the laptop, the balance resumed, causing bootup to
> > fail, after which I remembered about the skip_balance mount option, which
> > I
> > tried in a rescue shell from an initramfs.
>
> The file system is the root filesystem? If so, skip_balance may not be
> happening soon enough. Use kernel parameter rootflags=skip_balance
> which will apply this mount option at the very first moment the file
> system is mounted during boot.
Yes, it's the root file system (there's that plus a swap partition). I
believe I tried rootflags, but I think it also failed, which is why I'm using
a rescue shell now. I can try it again, though, if anybody thinks that
there's no point in waiting, especially if btrfs_scrub_pause in the
btrfs-transaction call trace is significant.
> > Since I couldn't use skip_balance, and logically can't destroy qgroups on
> > a
> > read-only file system, I decided to wait for a regular mount to finish.
> > That has been running since Tuesday, and I am slowly growing impatient.
> Haha, no kidding! I think that's very patient.
Heh :) . I've still got my main desktop (as ancient as it may be), so I'm
content with waiting for now, but I don't want to wait forever, especially if
there might not even be a point.
> > Thus I arrive at my question(s): is there anything else I can try, short
> > of
> > reformatting and restoring from backup? Can I use btrfs-check here, or
> > any
> > other tool? Or...?
>
> Yes, btrfs-progs 4.8.5 has the latest qgroup checks, so if there's
> something wrong it should find it and if not that's a bug of its own.
The initramfs has 4.8.4, but it looks like 4.8.5 was "only" an urgent bug fix,
with no changes to qgroup handling, so I can use 4.8.4 just as well. Can it
repair qgroup problems, too?
> > Also, should I be able to avoid reformatting: how do I properly disable
> > quota support?
>
> 'btrfs quota disable' is the only command that applies to this and it
> requires rw mount; there's no 'noquota' mount option.
OK, thanks.
So what should I try next? I'm sick at home, so I can spend more time on this
than usual.
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
* Re: system hangs due to qgroups
2016-12-03 21:46 ` Marc Joliet
@ 2016-12-03 22:56 ` Chris Murphy
2016-12-04 16:02 ` Marc Joliet
2016-12-04 2:10 ` Adam Borowski
1 sibling, 1 reply; 22+ messages in thread
From: Chris Murphy @ 2016-12-03 22:56 UTC (permalink / raw)
To: Marc Joliet; +Cc: Btrfs BTRFS
On Sat, Dec 3, 2016 at 2:46 PM, Marc Joliet <marcec@gmx.de> wrote:
> On Saturday 03 December 2016 13:42:42 Chris Murphy wrote:
>> On Sat, Dec 3, 2016 at 11:40 AM, Marc Joliet <marcec@gmx.de> wrote:
>> > Hello all,
>> >
>> > I'm having some trouble with btrfs on a laptop, possibly due to qgroups.
>> > Specifically, some file system activities (e.g., snapshot creation,
>> > baloo_file_extractor from KDE Plasma) cause the system to hang for up to
>> > about 40 minutes, maybe more.
>>
>> Do you get any blocked tasks kernel messages? If so, issue sysrq+w
>> during the hang, and then check the system log (dmesg may not contain
>> everything if the command fills the message buffer). If it's a hang
>> without any kernel messages, then issue sysrq+t.
>>
>> https://www.kernel.org/doc/Documentation/sysrq.txt
>
> As it's a rescue shell, I have only the one shell AFAIK, and it's occupied by
> mount. So I can't tell if there are dmesg entries, however, when this happens
> during a normal running system, I never saw any dmesg entries. Anyway, I ran
> both.
OK, so this is the root fs? I would try to work on it from another volume.
An advantage of openSUSE Tumbleweed is that they claim to fully support
qgroups, whereas upstream uses much more guarded language about their
stability.
Meanwhile, last night's Fedora Rawhide has kernel 4.9-rc7 and btrfs-progs 4.8.5.
https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20161203.n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-netinst-x86_64-Rawhide-20161203.n.0.iso
You can use dd to write the ISO to a USB stick, it supports BIOS and
UEFI and Secure Boot.
Troubleshooting > Rescue a Fedora system > option 3 to get to a shell
The sysrq+t and sysrq+w output can be written out in its entirety with
monotonic timestamps using 'journalctl -b -k -o short-monotonic >
kernelmessages.log'
Unfortunately this is not a live system, so you can't (as far as I
know) install the 'script' utility to more easily capture everything to a
single file; 'btrfs check <dev> > btrfscheck.log' should capture most of the
output, but it misses a few early lines for some reason.
And then scp those files to another system, or mount another stick and
copy locally.
>
> Should I take photos? That'll be annoying to do with all the scrolling, but I
> can do that if need be.
I can't decipher it anyway; it's mainly for a dev who wanders across
this thread, or for when you file a bug report. But you can get the complete
output using the method above.
>
>> > After I next turned on the laptop, the balance resumed, causing bootup to
>> > fail, after which I remembered about the skip_balance mount option, which
>> > I
>> > tried in a rescue shell from an initramfs.
>>
>> The file system is the root filesystem? If so, skip_balance may not be
>> happening soon enough. Use kernel parameter rootflags=skip_balance
>> which will apply this mount option at the very first moment the file
>> system is mounted during boot.
>
> Yes, it's the root file system (there's that plus a swap partition). I
> believe I tried rootflags, but I think it also failed, which is why I'm using
> a rescue shell now. I can try it again, though, if anybody thinks that
> there's no point in waiting, especially if btrfs_scrub_pause in the btrfs-
> transaction call trace is significant.
It sounds like it's resuming a scrub. That won't happen if you boot
from an alternate volume. There's a status file under /var/lib/btrfs/
that tracks the progress of scrubs for each btrfs volume - and since that
directory lives on the file system itself, the in-progress scrub record for
your root file system sits on the very volume you're trying to mount. If you
haven't had luck with 'btrfs scrub cancel', you can just remove the files in
that directory when you get a chance to rw mount the volume.
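That cleanup might look like the sketch below. The scrub.status.<UUID> naming matches what btrfs-progs writes under /var/lib/btrfs; the UUID here is a placeholder (the real one comes from 'btrfs filesystem show'):

```shell
# Sketch: remove a stale in-progress scrub record so it can't resume.
remove_scrub_record() {
    # $1: status directory, $2: filesystem UUID
    f="$1/scrub.status.$2"
    if [ -f "$f" ]; then
        rm -- "$f"
        echo "removed $f"
    else
        echo "no scrub record at $f"
    fi
}

# after rw-mounting the volume, something like:
remove_scrub_record /var/lib/btrfs 12345678-abcd-ef01-2345-6789abcdef01
```
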
>
>> > Since I couldn't use skip_balance, and logically can't destroy qgroups on
>> > a
>> > read-only file system, I decided to wait for a regular mount to finish.
>> > That has been running since Tuesday, and I am slowly growing impatient.
>> Haha, no kidding! I think that's very patient.
>
> Heh :) . I've still got my main desktop (as ancient as it may be), so I'm
> content with waiting for now, but I don't want to wait forever, especially if
> there might not even be a point.
How big is the file system? Sounds like it's a single device volume on
a laptop so I'm guessing at most 1TB, and that'd mean at most 100GiB
of metadata, which should mean around 15 minutes max to completely
read and process all the metadata, and maybe a few hours to do a
scrub. I'd bail after a few hours for sure.
>
>> > Thus I arrive at my question(s): is there anything else I can try, short
>> > of
>> > reformatting and restoring from backup? Can I use btrfs-check here, or
>> > any
>> > other tool? Or...?
>>
>> Yes, btrfs-progs 4.8.5 has the latest qgroup checks, so if there's
>> something wrong it should find it and if not that's a bug of its own.
>
> The initramfs has 4.8.4, but it looks like 4.8.5 was "only" an urgent bug fix,
> with no changes to qgroups handling, so I can use that, too. Can it repair
> qgroups problems, too?
Yes, 4.8.4 is fine.
>
>> > Also, should I be able to avoid reformatting: how do I properly disable
>> > quota support?
>>
>> 'btrfs quota disable' is the only command that applies to this and it
>> requires rw mount; there's no 'noquota' mount option.
>
> OK, thanks.
>
> So what should I try next? I'm sick at home, so I can spend more time on this
> than usual.
Well, if it were me, I'd use btrfs check to see what state it thinks the
file system is in. And then I'd do btrfs image to make a copy of the
filesystem metadata, both for the devs and in case the next steps make the
problem worse; in theory the fs can then be restored (or you can set up an
overlay if you prefer).
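The image step might look like this sketch (device and output paths are placeholders; -c9 compresses the image, and -s strips file names for privacy):

```shell
# Sketch: capture a sanitized metadata image before attempting any repair.
DEV=/dev/sdX                       # placeholder for the btrfs device
btrfs-image -c9 -s "$DEV" /tmp/fs-metadata.img || echo "image capture failed"
```
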
And then I'd mount normally, possibly with skip_balance. Capture
sysrq+t or +w or both. And then see if things get more sane if you
disable quotas. If not, then I'd see if it'll tolerate 'btrfs qgroup
destroy' on a few subvolumes. I'd basically use destroy and remove to
wipe away all the quotas - I don't know offhand if quota support needs to be
enabled for qgroup remove/destroy to work, so you'll have to figure that out.
And it might take a while for the commands to complete, but I'd like to
believe that as you wipe away the qgroups, whatever qgroup-related kernel
accounting is happening will eventually stop.
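That wipe-everything loop might look roughly like the sketch below. The mount point is a placeholder, the parsing helper is my own, and (as noted) whether destroy works at all with quotas disabled is untested:

```shell
# Sketch: destroy every qgroup listed by 'btrfs qgroup show'.
MNT=/mnt/btrfs-root   # placeholder mount point

# list_qgroup_ids: pull the qgroupid column (e.g. 0/257) out of
# 'btrfs qgroup show' output on stdin, skipping the two header lines
list_qgroup_ids() {
    awk 'NR > 2 && $1 ~ /^[0-9]+\/[0-9]+$/ {print $1}'
}

btrfs qgroup show "$MNT" | list_qgroup_ids | while read -r id; do
    btrfs qgroup destroy "$id" "$MNT" || echo "could not destroy $id"
done
```
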
It sounds to me like there may be some legacy qgroup confusion going
on, but I haven't tested this much at all, so you're kinda on the
bleeding edge.
--
Chris Murphy
* Re: system hangs due to qgroups
2016-12-03 21:46 ` Marc Joliet
2016-12-03 22:56 ` Chris Murphy
@ 2016-12-04 2:10 ` Adam Borowski
2016-12-04 16:02 ` Marc Joliet
1 sibling, 1 reply; 22+ messages in thread
From: Adam Borowski @ 2016-12-04 2:10 UTC (permalink / raw)
To: Marc Joliet; +Cc: linux-btrfs
On Sat, Dec 03, 2016 at 10:46:40PM +0100, Marc Joliet wrote:
> As it's a rescue shell, I have only the one shell AFAIK, and it's occupied
> by mount. So I can't tell if there are dmesg entries, however, when this
> happens during a normal running system, I never saw any dmesg entries.
You can use "open" (might be named "openvt") to spawn a shell on
tty2/tty3/etc. And if you have "screen" installed, Ctrl-a c spawns new
terminals (Ctrl-a n/p/0-9 to switch).
> The output of sysrq+t is too big to capture all of it (i.e., I can't scroll
> back to the beginning)
You may use netconsole to log everything the kernel says to another machine.
I can't provide you with the incantations off the top of my head (I've got
working serial (far more reliable) on all my dev boxes, and netconsole
doesn't work with bridging, i.e. containers, in production), but since your
rescue shell has no other way to get logs off the machine, then assuming your
network card driver supports the feature netconsole needs _and_ the stars
were aligned right when your network card was manufactured, netconsole is a
valuable aid.
The system might not be dead enough to stop userland network logging from
getting through, too.
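The incantation is a single module parameter (format per the kernel's netconsole documentation); the addresses, interface, and MAC below are placeholders:

```text
# netconsole=[+][src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
modprobe netconsole netconsole=6665@192.168.0.2/eth0,6666@192.168.0.3/aa:bb:cc:dd:ee:ff
# on the receiving machine, listen on UDP:
nc -l -u 6666
```
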
Meow!
--
The bill declaring Jesus as the King of Poland fails to specify whether
the addition is at the top or end of the list of kings. What should the
historians do?
* Re: system hangs due to qgroups
2016-12-03 22:56 ` Chris Murphy
@ 2016-12-04 16:02 ` Marc Joliet
2016-12-04 18:24 ` Duncan
2016-12-04 18:52 ` Chris Murphy
0 siblings, 2 replies; 22+ messages in thread
From: Marc Joliet @ 2016-12-04 16:02 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 11965 bytes --]
OK, so I tried a few things, to no avail; more below.
On Saturday 03 December 2016 15:56:45 Chris Murphy wrote:
> On Sat, Dec 3, 2016 at 2:46 PM, Marc Joliet <marcec@gmx.de> wrote:
> > On Saturday 03 December 2016 13:42:42 Chris Murphy wrote:
> >> On Sat, Dec 3, 2016 at 11:40 AM, Marc Joliet <marcec@gmx.de> wrote:
> >> > Hello all,
> >> >
> >> > I'm having some trouble with btrfs on a laptop, possibly due to
> >> > qgroups.
> >> > Specifically, some file system activities (e.g., snapshot creation,
> >> > baloo_file_extractor from KDE Plasma) cause the system to hang for up
> >> > to
> >> > about 40 minutes, maybe more.
> >>
> >> Do you get any blocked tasks kernel messages? If so, issue sysrq+w
> >> during the hang, and then check the system log (dmesg may not contain
> >> everything if the command fills the message buffer). If it's a hang
> >> without any kernel messages, then issue sysrq+t.
> >>
> >> https://www.kernel.org/doc/Documentation/sysrq.txt
> >
> > As it's a rescue shell, I have only the one shell AFAIK, and it's occupied
> > by mount. So I can't tell if there are dmesg entries, however, when this
> > happens during a normal running system, I never saw any dmesg entries.
> > Anyway, I ran both.
>
> OK so this is root fs? I would try to work on it from another volume.
> An advantage of openSUSE Tumbleweed is they claim to fully support
> qgroups, where upstream uses much more guarded language about its
> stability.
>
> Whereas last night's Fedora Rawhide has kernel 4.9-rc7 and btrfs-progs
> 4.8.5.
> https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20161203.
> n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-netinst-x86_64-Rawhide
> -20161203.n.0.iso
>
> You can use dd to write the ISO to a USB stick, it supports BIOS and
> UEFI and Secure Boot.
>
> Troubleshooting > Rescue a Fedora system > option 3 to get to a shell
> The sysrq+t and sysrq+w can be written out in their entirety with
> monotonic time using 'journalctl -b -k -o short-monotonic >
> kernelmessages.log'
>
> Unfortunately this is not a live system, so you can't (as far as I
> know) install script to more easily capture everything to a single
> file; 'btrfs check <dev> > btrfscheck.log' should capture most of the
> output, but it misses a few early lines for some reason.
>
> And then scp those files to another system, or mount another stick and
> copy locally.
That's a good idea, although I'll probably start with sysrescuecd (Linux 4.8.5
and btrfs-progs 4.7.3), as I already have experience with it.
[After trying it]
Well, crap, I was able to get images of the file system (one sanitized), but
mounting always fails with "device or resource busy" (with no corresponding
dmesg output). (Also, that drive's partitions weren't discovered on bootup, I
had to run partprobe first.) I never see that in the initramfs, so I'm not
sure what's causing that.
Also, now the file system fails with the BUG I mentioned, see here:
[Sun Dec 4 12:27:07 2016] BUG: unable to handle kernel paging request at fffffffffffffe10
[Sun Dec 4 12:27:07 2016] IP: [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a0
[Sun Dec 4 12:27:07 2016] PGD 1c07067 PUD 1c09067 PMD 0
[Sun Dec 4 12:27:07 2016] Oops: 0000 [#1] PREEMPT SMP
[Sun Dec 4 12:27:07 2016] Modules linked in: crc32c_intel serio_raw
[Sun Dec 4 12:27:07 2016] CPU: 0 PID: 370 Comm: mount Not tainted 4.8.11-gentoo #1
[Sun Dec 4 12:27:07 2016] Hardware name: FUJITSU LIFEBOOK A530/FJNBB06, BIOS Version 1.19 08/15/2011
[Sun Dec 4 12:27:07 2016] task: ffff8801b1d90000 task.stack: ffff8801b1268000
[Sun Dec 4 12:27:07 2016] RIP: 0010:[<ffffffff8131226f>] [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a0
[Sun Dec 4 12:27:07 2016] RSP: 0018:ffff8801b126bcd8 EFLAGS: 00010246
[Sun Dec 4 12:27:07 2016] RAX: 0000000000000000 RBX: ffff8801b10b3150 RCX: 0000000000000000
[Sun Dec 4 12:27:07 2016] RDX: ffff8801b20f24f0 RSI: ffff8801b2790800 RDI: ffff8801b20f2460
[Sun Dec 4 12:27:07 2016] RBP: ffff8801b10bc000 R08: 0000000000020340 R09: ffff8801b20f2460
[Sun Dec 4 12:27:07 2016] R10: ffff8801b48b7300 R11: ffffea0005dd0ac0 R12: ffff8801b126bd70
[Sun Dec 4 12:27:07 2016] R13: 0000000000000000 R14: ffff8801b2790800 R15: 00000000b20f2460
[Sun Dec 4 12:27:07 2016] FS: 00007f97a7846780(0000) GS:ffff8801bbc00000(0000) knlGS:0000000000000000
[Sun Dec 4 12:27:07 2016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Sun Dec 4 12:27:07 2016] CR2: fffffffffffffe10 CR3: 00000001b12ae000 CR4: 00000000000006f0
[Sun Dec 4 12:27:07 2016] Stack:
[Sun Dec 4 12:27:07 2016] 0000000000000801 0000000000000801 ffff8801b20f2460 ffff8801b4aaa000
[Sun Dec 4 12:27:07 2016] 0000000000000801 ffff8801b20f2460 ffffffff812c23ed ffff8801b1d90000
[Sun Dec 4 12:27:07 2016] 0000000000000000 00ff8801b126bd18 ffff8801b10b3150 ffff8801b4aa9800
[Sun Dec 4 12:27:07 2016] Call Trace:
[Sun Dec 4 12:27:07 2016] [<ffffffff812c23ed>] ? start_transaction+0x8d/0x4e0
[Sun Dec 4 12:27:07 2016] [<ffffffff81317913>] ? btrfs_recover_relocation+0x3b3/0x440
[Sun Dec 4 12:27:07 2016] [<ffffffff81292b2a>] ? btrfs_remount+0x3ca/0x560
[Sun Dec 4 12:27:07 2016] [<ffffffff811bfc04>] ? shrink_dcache_sb+0x54/0x70
[Sun Dec 4 12:27:07 2016] [<ffffffff811ad473>] ? do_remount_sb+0x63/0x1d0
[Sun Dec 4 12:27:07 2016] [<ffffffff811c9953>] ? do_mount+0x6f3/0xbe0
[Sun Dec 4 12:27:07 2016] [<ffffffff811c918f>] ? copy_mount_options+0xbf/0x170
[Sun Dec 4 12:27:07 2016] [<ffffffff811ca111>] ? SyS_mount+0x61/0xa0
[Sun Dec 4 12:27:07 2016] [<ffffffff8169565b>] ? entry_SYSCALL_64_fastpath+0x13/0x8f
[Sun Dec 4 12:27:07 2016] Code: 66 90 66 2e 0f 1f 84 00 00 00 00 00 41 57 41 56 41 55 41 54 55 53 48 83 ec 50 48 8b 46 08 4c 8b 6e 10 48 8b a8 f0 01 00 00 31 c0 <4d> 8b a5 10 fe ff ff f6 85 80 0c 00 00 01 74 09 80 be b0 05 00
[Sun Dec 4 12:27:07 2016] RIP [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a0
[Sun Dec 4 12:27:07 2016] RSP <ffff8801b126bcd8>
[Sun Dec 4 12:27:07 2016] CR2: fffffffffffffe10
[Sun Dec 4 12:27:07 2016] ---[ end trace bd51bbcfd10492f7 ]---
The main difference is that I remounted rw instead of unmounting and mounting
again. In any case, my hope was to mount the file system from the live
medium, then cancel the scrub from another terminal window.
Ah, but what does work is mounting a snapshot, in the sense that mount doesn't
fail. However, it seems that the balance still continues, so I'm back at
square one.
> > Should I take photos? That'll be annoying to do with all the scrolling,
> > but I can do that if need be.
>
> I can't decipher it anyway, it's mainly for a dev who wanders across
> this thread or if you file a bug report. But you can get the complete
> output using the method above.
Alright, I can try the Fedora image now that sysrescuecd is a dead end. I can
also try to insert the SSD in my desktop (it's a SATA device IIRC).
Oh, and I was wrong: the initramfs rescue shell *does* show dmesg output as it
comes along, as I witnessed when inserting a USB stick.
> >> > After I next turned on the laptop, the balance resumed, causing bootup
> >> > to
> >> > fail, after which I remembered about the skip_balance mount option,
> >> > which
> >> > I
> >> > tried in a rescue shell from an initramfs.
> >>
> >> The file system is the root filesystem? If so, skip_balance may not be
> >> happening soon enough. Use kernel parameter rootflags=skip_balance
> >> which will apply this mount option at the very first moment the file
> >> system is mounted during boot.
> >
> > Yes, it's the root file system (there's that plus a swap partition). I
> > believe I tried rootflags, but I think it also failed, which is why I'm
> > using a rescue shell now. I can try it again, though, if anybody thinks
> > that there's no point in waiting, especially if btrfs_scrub_pause in the
> > btrfs- transaction call trace is significant.
>
> It sounds like it's resuming a scrub. That won't happen if you boot
> from an alternate volume. There's a scrub file found at
> /var/lib/btrfs/ that tracks the progress of scrubs for each btrfs
> volume - that directory with an inprogress scrub for your file system
> is actually in the directory on that file system. If you haven't had
> luck with btrfs scrub cancel, you can just remove the files in that
> directory when you get a chance to rw mount the volume.
OK, I did try again with rootflags=skip_balance, then remounting
rw,skip_balance, but that also fails, as expected. If mount ever returned I
probably wouldn't have to remove those files, though ;) .
> >> > Since I couldn't use skip_balance, and logically can't destroy qgroups
> >> > on
> >> > a
> >> > read-only file system, I decided to wait for a regular mount to finish.
> >> > That has been running since Tuesday, and I am slowly growing impatient.
> >>
> >> Haha, no kidding! I think that's very patient.
> >
> > Heh :) . I've still got my main desktop (as ancient as it may be), so I'm
> > content with waiting for now, but I don't want to wait forever, especially
> > if there might not even be a point.
>
> How big is the file system? Sounds like it's a single device volume on
> a laptop so I'm guessing at most 1TB, and that'd mean at most 100GiB
> of metadata, which should mean around 15 minutes max to completely
> read and process all the metadata, and maybe a few hours to do a
> scrub. I'd bail after a few hours for sure.
It's only 108 GB. I'm tolerating this low performance because it seems to me
that it is tied to the same hangs I get at regular system run-time.
[...]
> >> > Also, should I be able to avoid reformatting: how do I properly disable
> >> > quota support?
> >>
> >> 'btrfs quota disable' is the only command that applies to this and it
> >> requires rw mount; there's no 'noquota' mount option.
> >
> > OK, thanks.
> >
> > So what should I try next? I'm sick at home, so I can spend more time on
> > this than usual.
>
> Well if it were me I'd use btrfs check to see what state it thinks the
> file system is in. And then I'd do btrfs image to make a copy of the
> filesystem metadata both for the devs and also in case the next things
> make the problem worse, in theory the fs can be restored (or you can
> setup an overlay if you prefer).
Well, btrfs check came back clean. And as mentioned above, I was able to get
two images, but with btrfs-progs 4.7.3 (the version in sysrescuecd). I can
get different images from the initramfs (which I didn't think of earlier,
sorry).
> And then I'd mount normally, possibly with skip_balance. Capture
> sysrq+t or +w or both. And then see if things get more sane if you
> disable quotas. If not, then I'd see if it'll tolerate 'btrfs qgroup
> destroy' on a few subvolumes. I'd basically use destroy and remove to
> wipe away all the quotas - I don't know off hand if quotas needs to be
> enabled for qgroup remove/destroy to work so you'll have to figure
> that out. And it might take a while for the command to complete, but
> I'd like to believe as you wipe away the qgroups, whatever qgroup
> related kernel accounting is happening will eventually stop.
skip_balance always fails. The rest sounds good, but I'll have to get a live
system to mount the FS.
> It sounds to me like there may be some legacy qgroup confusion going
> on, but I haven't tested this much at all, so you're kinda on the
> bleeding edge.
OK
I think I'll try mounting the SSD in my desktop first, then I'll try the
Fedora image. Perhaps its newer kernel will help.
Thanks
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
* Re: system hangs due to qgroups
2016-12-04 2:10 ` Adam Borowski
@ 2016-12-04 16:02 ` Marc Joliet
0 siblings, 0 replies; 22+ messages in thread
From: Marc Joliet @ 2016-12-04 16:02 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 1570 bytes --]
On Sunday 04 December 2016 03:10:19 Adam Borowski wrote:
> On Sat, Dec 03, 2016 at 10:46:40PM +0100, Marc Joliet wrote:
> > As it's a rescue shell, I have only the one shell AFAIK, and it's occupied
> > by mount. So I can't tell if there are dmesg entries, however, when this
> > happens during a normal running system, I never saw any dmesg entries.
>
> You can use "open" (might be named "openvt") to spawn a shell on
> tty2/tty3/etc. And if you have "screen" installed, Ctrl-a c spawns new
> terminals (Ctrl-a n/p/0-9 to switch).
I was actually considering adding tmux to the list of programs in the
initramfs after this experience :) .
> > The output of sysrq+t is too big to capture all of it (i.e., I can't
> > scroll
> > back to the beginning)
>
> You may use netconsole to log everything the kernel says to another machine.
> I can't provide you with the incantations off the top of my head (I've got
> working serial (far more reliable) on all my dev boxes, and netconsole
> doesn't work with bridging, i.e. containers, in production), but as your
> rescue shell has no other way to get logs out, then assuming your network
> card driver supports the feature netconsole needs _and_ the stars were
> aligned right when your network card was manufactured, netconsole is a
> valuable aid.
>
> The system might not be dead enough to stop userland network logging from
> getting through, too.
OK, I'll look up netconsole.
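For anyone else looking it up: netconsole boils down to a kernel command-line
(or module) parameter; the addresses below are placeholders, not values from
Marc's setup:

```shell
# netconsole parameter format (per the kernel's netconsole documentation):
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# All values below are placeholders -- substitute your own machines.
SRC="6665@192.168.1.2/eth0"                 # the crashing machine
TGT="6666@192.168.1.10/00:11:22:33:44:55"   # the machine collecting logs
echo "netconsole=${SRC},${TGT}"
# On the receiving machine, capture the UDP stream with e.g.:
#   nc -u -l 6666
```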
> Meow!
Thanks
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: system hangs due to qgroups
2016-12-04 16:02 ` Marc Joliet
@ 2016-12-04 18:24 ` Duncan
2016-12-04 19:20 ` Marc Joliet
2016-12-04 18:52 ` Chris Murphy
1 sibling, 1 reply; 22+ messages in thread
From: Duncan @ 2016-12-04 18:24 UTC (permalink / raw)
To: linux-btrfs
Marc Joliet posted on Sun, 04 Dec 2016 17:02:48 +0100 as excerpted:
> That's a good idea, although I'll probably start with sysrescuecd (Linux
> 4.8.5 and btrfs-progs 4.7.3), as I already have experience with it.
>
> [After trying it]
>
> Well, crap, I was able to get images of the file system (one sanitized),
> but mounting always fails with "device or resource busy" (with no
> corresponding dmesg output). (Also, that drive's partitions weren't
> discovered on bootup, I had to run partprobe first.) I never see that
> in the initramfs, so I'm not sure what's causing that.
If I understand correctly what you're doing, that part is easily enough
explained.
Remember that btrfs, unlike most filesystems, is multi-device capable.
The way it tracks which devices belong to which filesystems is by UUID,
universally *UNIQUE* ID. If you image a device via dd or similar, you of
course image its UUID as well, destroying the "unique" assumption in UUID
and confusing btrfs, which will consider it part of the existing
filesystem if the original devices with that filesystem UUID remain
hooked up.
So if you did what I believe you did, try to mount the image while the
original filesystem devices remain attached and mounted, btrfs is simply
saying that filesystem (which btrfs identifies by UUID) is already
mounted: "device or resource busy".
Furthermore, if the original filesystem remains mounted writable, you're
at serious risk of (further) corruption, because btrfs now considers them
part of the same filesystem and may write partial updates to the wrong
one!
Bottom line, with btrfs, make sure your universally UNIQUE IDs remain
what they say on the tin, UNIQUE. When you do clones including the UUID,
don't expose the new image files as devices to btrfs while the original
filesystem remains mounted! Because btrfs /depends/ on UUIDs actually
being what they say on the tin, UNIQUE.
(And FWIW, this discussion has occurred before on the list. The btrfs
assumption of UUID uniqueness is apparently embedded deeply enough in
btrfs code that it can't be practically removed. The entire thing would
have to be rewritten; we're talking man-years worth of work. So it's not
going to happen and if it did the result would no longer be btrfs. It
may be possible to make btrfs a bit safer in terms of refusing to mount
and/or going read-only if new devices with the same UUIDs appear in
order to avoid corruption, but the basic UUID uniqueness assumption
itself is apparently buried deeply enough in btrfs that it's not going to
be possible to change that. Starting fresh with a new filesystem project
would be easier.)
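A quick sanity check before exposing a cloned device is to look for duplicate
filesystem UUIDs; the blkid output below is a made-up example (on a real
system, pipe the actual `blkid` output instead):

```shell
# Hypothetical blkid output for a system where a dd clone is attached:
blkid_out='/dev/sda2: UUID="11111111-2222-3333-4444-555555555555" TYPE="btrfs"
/dev/sdb1: UUID="11111111-2222-3333-4444-555555555555" TYPE="btrfs"
/dev/sdc1: UUID="99999999-8888-7777-6666-000000000000" TYPE="btrfs"'
# Any UUID printed here appears on more than one device -- don't mount it:
echo "$blkid_out" | grep -o 'UUID="[^"]*"' | sort | uniq -d
```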
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: system hangs due to qgroups
2016-12-04 16:02 ` Marc Joliet
2016-12-04 18:24 ` Duncan
@ 2016-12-04 18:52 ` Chris Murphy
2016-12-05 9:00 ` Marc Joliet
1 sibling, 1 reply; 22+ messages in thread
From: Chris Murphy @ 2016-12-04 18:52 UTC (permalink / raw)
To: Marc Joliet; +Cc: Btrfs BTRFS
On Sun, Dec 4, 2016 at 9:02 AM, Marc Joliet <marcec@gmx.de> wrote:
>
> Also, now the file system fails with the BUG I mentioned, see here:
>
> [Sun Dec 4 12:27:07 2016] BUG: unable to handle kernel paging request at
> fffffffffffffe10
> [Sun Dec 4 12:27:07 2016] IP: [<ffffffff8131226f>]
> qgroup_fix_relocated_data_extents+0x1f/0x2a0
> [Sun Dec 4 12:27:07 2016] PGD 1c07067 PUD 1c09067 PMD 0
> [Sun Dec 4 12:27:07 2016] Oops: 0000 [#1] PREEMPT SMP
> [Sun Dec 4 12:27:07 2016] Modules linked in: crc32c_intel serio_raw
> [Sun Dec 4 12:27:07 2016] CPU: 0 PID: 370 Comm: mount Not tainted 4.8.11-
> gentoo #1
> [Sun Dec 4 12:27:07 2016] Hardware name: FUJITSU LIFEBOOK A530/FJNBB06, BIOS
> Version 1.19 08/15/2011
> [Sun Dec 4 12:27:07 2016] task: ffff8801b1d90000 task.stack: ffff8801b1268000
> [Sun Dec 4 12:27:07 2016] RIP: 0010:[<ffffffff8131226f>]
> [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a0
> [Sun Dec 4 12:27:07 2016] RSP: 0018:ffff8801b126bcd8 EFLAGS: 00010246
> [Sun Dec 4 12:27:07 2016] RAX: 0000000000000000 RBX: ffff8801b10b3150 RCX:
> 0000000000000000
> [Sun Dec 4 12:27:07 2016] RDX: ffff8801b20f24f0 RSI: ffff8801b2790800 RDI:
> ffff8801b20f2460
> [Sun Dec 4 12:27:07 2016] RBP: ffff8801b10bc000 R08: 0000000000020340 R09:
> ffff8801b20f2460
> [Sun Dec 4 12:27:07 2016] R10: ffff8801b48b7300 R11: ffffea0005dd0ac0 R12:
> ffff8801b126bd70
> [Sun Dec 4 12:27:07 2016] R13: 0000000000000000 R14: ffff8801b2790800 R15:
> 00000000b20f2460
> [Sun Dec 4 12:27:07 2016] FS: 00007f97a7846780(0000)
> GS:ffff8801bbc00000(0000) knlGS:0000000000000000
> [Sun Dec 4 12:27:07 2016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [Sun Dec 4 12:27:07 2016] CR2: fffffffffffffe10 CR3: 00000001b12ae000 CR4:
> 00000000000006f0
> [Sun Dec 4 12:27:07 2016] Stack:
> [Sun Dec 4 12:27:07 2016] 0000000000000801 0000000000000801 ffff8801b20f2460
> ffff8801b4aaa000
> [Sun Dec 4 12:27:07 2016] 0000000000000801 ffff8801b20f2460 ffffffff812c23ed
> ffff8801b1d90000
> [Sun Dec 4 12:27:07 2016] 0000000000000000 00ff8801b126bd18 ffff8801b10b3150
> ffff8801b4aa9800
> [Sun Dec 4 12:27:07 2016] Call Trace:
> [Sun Dec 4 12:27:07 2016] [<ffffffff812c23ed>] ?
> start_transaction+0x8d/0x4e0
> [Sun Dec 4 12:27:07 2016] [<ffffffff81317913>] ?
> btrfs_recover_relocation+0x3b3/0x440
> [Sun Dec 4 12:27:07 2016] [<ffffffff81292b2a>] ? btrfs_remount+0x3ca/0x560
> [Sun Dec 4 12:27:07 2016] [<ffffffff811bfc04>] ? shrink_dcache_sb+0x54/0x70
> [Sun Dec 4 12:27:07 2016] [<ffffffff811ad473>] ? do_remount_sb+0x63/0x1d0
> [Sun Dec 4 12:27:07 2016] [<ffffffff811c9953>] ? do_mount+0x6f3/0xbe0
> [Sun Dec 4 12:27:07 2016] [<ffffffff811c918f>] ?
> copy_mount_options+0xbf/0x170
> [Sun Dec 4 12:27:07 2016] [<ffffffff811ca111>] ? SyS_mount+0x61/0xa0
> [Sun Dec 4 12:27:07 2016] [<ffffffff8169565b>] ?
> entry_SYSCALL_64_fastpath+0x13/0x8f
> [Sun Dec 4 12:27:07 2016] Code: 66 90 66 2e 0f 1f 84 00 00 00 00 00 41 57 41
> 56 41 55 41 54 55 53 48 83 ec 50 48 8b 46 08 4c 8b 6e 10 48 8b a8 f0 01 00 00
> 31 c0 <4d> 8b a5 10 fe ff ff f6 85 80 0c 00 00 01 74 09 80 be b0 05 00
> [Sun Dec 4 12:27:07 2016] RIP [<ffffffff8131226f>]
> qgroup_fix_relocated_data_extents+0x1f/0x2a0
> [Sun Dec 4 12:27:07 2016] RSP <ffff8801b126bcd8>
> [Sun Dec 4 12:27:07 2016] CR2: fffffffffffffe10
> [Sun Dec 4 12:27:07 2016] ---[ end trace bd51bbcfd10492f7 ]---
I can't parse this. Maybe someone else can. Do you get the same thing,
or a different thing, if you do a normal mount rather than a remount?
> Ah, but what does work is mounting a snapshot, in the sense that mount doesn't
> fail. However, it seems that the balance still continues, so I'm back at
> square one.
Interesting that mounting a subvolume directly works, seeing as that's
just a bind mount behind the scenes. But maybe there's something wrong
in the top level subvolume that's being skipped when mounting a
subvolume directly.
Are you mounting with skip_balance mount option? And how do you know
that it's a balance continuing? What do you get for 'btrfs balance
status' for this volume? Basically I'm asking if you're sure there's a
balance happening. The balance itself is not bad, it's just that it
slows everything down astronomically. That's the main reason why you'd
like to skip it or cancel it. Instead of balancing it might be doing
some sort of cleanup. Either 'top' or 'perf top' might give a clue
what's going on if 'btrfs balance status' doesn't show a balance is
happening, and yet the drive is super busy as if a balance is
happening.
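As a sketch of the check Chris is asking for (the status strings below are
hypothetical examples of the two cases, not output captured from Marc's
system):

```shell
# Two possible outputs of `btrfs balance status <mnt>`:
status="Balance on '/mnt' is running
2 out of about 11 chunks balanced (3 considered), 82% left"
# status="No balance found on '/mnt'"
case "$status" in
  "No balance found"*) echo "no balance running" ;;
  *)                   echo "balance in progress" ;;
esac
```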
Also, if you boot from alternate media, scrub resuming should not
happen, because the progress file for scrub is in /var/lib/btrfs; there
is no metadata on the Btrfs volume itself that indicates it's being
scrubbed or what the progress is.
> Well, btrfs check came back clean. And as mentioned above, I was able to get
> two images, but with btrfs-progs 4.7.3 (the version in sysrescuecd). I can
> get different images from the initramfs (which I didn't think of earlier,
> sorry).
'btrfs check' using btrfs-progs 4.8.2 or higher came back clean? That
sounds like a bug. You're having quota related problems (at least it's
a contributing factor) but btrfs check says clean, while the kernel is
getting confused. So either 'btrfs check' is correct that there are no
problems, and there's a kernel bug resulting in confusion; or the
check is missing something, and that's why the kernel is mishandling
it. In either case, there's a kernel bug.
So yeah for sure you'll want a sanitized btrfs-image captured for the
developers to look at; put it somewhere like Google Drive or wherever
they can grab it. And put the URL in this thread, and/or also file a
bug about this problem with the URL to the image included.
Looking at 4.9, there aren't many qgroup.c changes, but there's a pile
of other changes, as usual. So even though the problem seems like
it's qgroup related, it might actually be some other problem that then
also triggers qgroup messages.
--
Chris Murphy
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: system hangs due to qgroups
2016-12-04 18:24 ` Duncan
@ 2016-12-04 19:20 ` Marc Joliet
2016-12-05 2:32 ` Duncan
0 siblings, 1 reply; 22+ messages in thread
From: Marc Joliet @ 2016-12-04 19:20 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 1794 bytes --]
On Sunday 04 December 2016 18:24:08 Duncan wrote:
> Marc Joliet posted on Sun, 04 Dec 2016 17:02:48 +0100 as excerpted:
> > That's a good idea, although I'll probably start with sysrescuecd (Linux
> > 4.8.5 and btrfs-progs 4.7.3), as I already have experience with it.
> >
> > [After trying it]
> >
> > Well, crap, I was able to get images of the file system (one sanitized),
> > but mounting always fails with "device or resource busy" (with no
> > corresponding dmesg output). (Also, that drive's partitions weren't
> > discovered on bootup, I had to run partprobe first.) I never see that
> > in the initramfs, so I'm not sure what's causing that.
>
> If I understand correctly what you're doing, that part is easily enough
> explained.
>
> Remember that btrfs, unlike most filesystems, is multi-device capable.
> The way it tracks which devices belong to which filesystems is by UUID,
> universally *UNIQUE* ID. If you image a device via dd or similar, you of
> course image its UUID as well, destroying the "unique" assumption in UUID
> and confusing btrfs, which will consider it part of the existing
> filesystem if the original devices with that filesystem UUID remain
> hooked up.
>
> So if you did what I believe you did, try to mount the image while the
> original filesystem devices remain attached and mounted, btrfs is simply
> saying that filesystem (which btrfs identifies by UUID) is already
> mounted: "device or resource busy".
[...]
Nope, sorry if I wasn't clear, I didn't mean that I tried to mount the image
(can you even mount images created with btrfs-image?). Plus the images are
xz-compressed.
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: system hangs due to qgroups
2016-12-03 18:40 system hangs due to qgroups Marc Joliet
2016-12-03 20:42 ` Chris Murphy
@ 2016-12-05 0:39 ` Qu Wenruo
2016-12-05 11:01 ` Marc Joliet
1 sibling, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2016-12-05 0:39 UTC (permalink / raw)
To: Marc Joliet, linux-btrfs
At 12/04/2016 02:40 AM, Marc Joliet wrote:
> Hello all,
>
> I'm having some trouble with btrfs on a laptop, possibly due to qgroups.
> Specifically, some file system activities (e.g., snapshot creation,
> baloo_file_extractor from KDE Plasma) cause the system to hang for up to about
> 40 minutes, maybe more. It always causes (most of) my desktop to hang,
> (although I can usually navigate between pre-existing Konsole tabs) and
> prevents new programs from starting. I've seen the system load go up to >30
> before the laptop suddenly resumes normal operation. I've been seeing this
> since Linux 4.7, maybe already 4.6.
Qgroup accounting is a CPU-intensive operation.
The main problem is the design of the btrfs extent tree, which is biased
towards snapshot creation speed, but is quite complicated when used for
tracing all referencers (which qgroup heavily relies on).
The main factor affecting qgroup speed is how many shared extents are
in the fs.
This includes reflinked files and snapshots; in most cases snapshots are
the main part.
Until we find a better solution that keeps qgroup both accurate and fast,
I'd recommend keeping the number of qgroups reasonable.
(Personally speaking, 10 would be good.)
Apart from qgroups, relocation (balancing) is also affected by the
number of shared extents.
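To get a feel for that number, one can count the level-0 qgroups (one per
subvolume/snapshot) in the `btrfs qgroup show` output; the listing below is a
made-up example:

```shell
# Made-up `btrfs qgroup show` output (run `btrfs qgroup show <mnt>` for real):
qgroups='qgroupid         rfer         excl
--------         ----         ----
0/5           16.00KiB     16.00KiB
0/257          1.50GiB      1.20GiB
0/258          2.30GiB     48.00MiB'
# One level-0 qgroup exists per subvolume/snapshot:
echo "$qgroups" | grep -c '^0/'
```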
>
> Now, I thought that maybe this was (indirectly) due to an overly full file
> system (~90% full), so I deleted some things I didn't need to get it up to 15%
> free. (For the record, I also tried mounting with ssd_spread.) After that, I
> ran a balance with -dusage=50, which started out promising, but then went back
> to the "bad" behaviour. *But* it seemed better than before overall, so I
> started a balance with -musage=10, then -musage=50. That turned out to be a
> mistake. Since I had to transport the laptop, and couldn't wait for "balance
> cancel" to return (IIUC it only returns after the next block (group?) is
> freed), I forced the laptop off.
>
> After I next turned on the laptop, the balance resumed, causing bootup to
> fail, after which I remembered about the skip_balance mount option, which I
> tried in a rescue shell from an initramfs. But wait, that failed, too!
> Specifically, the stack trace I get whenever I try it includes as one of the
> last lines:
>
> "RIP [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a8"
This seems to be a NULL pointer bug in the qgroup relocation fix.
The latest fix (not yet merged) should address it.
You could try the for-next-20161125 branch from David, which contains it:
https://github.com/kdave/btrfs-devel/tree/for-next-20161125
>
> (I can take photos of the full stack trace if requested.)
>
> So then I ran "btrfs qgroup show /sysroot/", which showed many quota groups,
> much to my surprise. On the upside, at least now I discovered the likely
> reason for the performance problems.
So, the number of qgroups is the cause of the slowness.
>
> (I actually think I know why I'm seeing qgroups: at one point I was trying out
> various snapshot/backup tools for btrfs, and one (I forgot which)
> unconditionally activated quota support, which infuriated me, but I promptly
> deactivated it, or so I thought. Is quota support automatically enabled when
> qgroups are discovered, or did I perhaps not disable quota support properly?)
Qgroups stay enabled from "btrfs quota enable" until "btrfs quota
disable" turns them off.
There is no way to temporarily disable quota, since quota must trace
every modification, or the qgroup numbers will drift out of sync.
So, one has to disable quota manually.
(And the backup tool is to blame here; it should either inform the user
or disable qgroups on uninstallation.)
>
> Since I couldn't use skip_balance, and logically can't destroy qgroups on a
> read-only file system, I decided to wait for a regular mount to finish. That
> has been running since Tuesday, and I am slowly growing impatient.
>
> Thus I arrive at my question(s): is there anything else I can try, short of
> reformatting and restoring from backup? Can I use btrfs-check here, or any
> other tool? Or...?
>
> Also, should I be able to avoid reformatting: how do I properly disable quota
> support?
"btrfs quota disable <mnt>", yes you need RW mount.
Any RW mountable snapshot/subvolume is OK.
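Spelled out, the recovery sequence would be roughly the following (dry run:
`run` only prints each command, and /dev/sdXN and /mnt/rescue are placeholders
for the real device and mount point):

```shell
run() { echo "$@"; }   # dry run -- drop the echo to execute for real
run mount /dev/sdXN /mnt/rescue     # any RW-mountable subvolume is fine
run btrfs quota disable /mnt/rescue
run btrfs qgroup show /mnt/rescue   # should now complain quotas aren't enabled
```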
>
> (BTW, searching for qgroup_fix_relocated_data_extents turned up the ML thread
> "[PATCH] Btrfs: fix endless loop in balancing block groups", could that be
> related?)
Nope, the actual fixing patches are:
[PATCH 1/4] btrfs: qgroup: Add comments explaining how btrfs qgroup works
[PATCH 2/4] btrfs: qgroup: Rename functions to make it follow
reserve,trace,account steps
[PATCH 3/4] btrfs: Export and move leaf/subtree qgroup helpers to qgroup.c
[PATCH 4/4] btrfs: qgroup: Fix qgroup data leaking by using subtree tracing
The 4th patch is the one that does the real work, but it relies on the
previous 3 to apply.
The regression itself was caused by my patch:
[PATCH v3.1 2/3] btrfs: relocation: Fix leaking qgroups numbers on data
extents
Sorry for the trouble.
And for your recovery, I'd suggest installing Arch Linux onto a USB
HDD or USB stick, then compiling David's branch and installing it onto
the USB HDD.
Then use the USB storage as a rescue tool to mount the fs, which should
allow an RW mount with or without the skip_balance mount option.
You could then disable quota.
Thanks,
Qu
>
> The laptop is currently running Gentoo with Linux 4.8.10 and btrfs-progs
> 4.8.4.
>
> Greetings
>
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: system hangs due to qgroups
2016-12-04 19:20 ` Marc Joliet
@ 2016-12-05 2:32 ` Duncan
0 siblings, 0 replies; 22+ messages in thread
From: Duncan @ 2016-12-05 2:32 UTC (permalink / raw)
To: linux-btrfs
Marc Joliet posted on Sun, 04 Dec 2016 20:20:51 +0100 as excerpted:
> On Sunday 04 December 2016 18:24:08 Duncan wrote:
>> Marc Joliet posted on Sun, 04 Dec 2016 17:02:48 +0100 as excerpted:
>> > [After trying it]
>> >
>> > Well, crap, I was able to get images of the file system (one
>> > sanitized),
>> > but mounting always fails with "device or resource busy" (with no
>> > corresponding dmesg output). (Also, that drive's partitions weren't
>> > discovered on bootup, I had to run partprobe first.) I never see
>> > that in the initramfs, so I'm not sure what's causing that.
>>
>> If I understand correctly what you're doing, that part is easily enough
>> explained.
>>
>> Remember that btrfs, unlike most filesystems, is multi-device capable.
>> The way it tracks which devices belong to which filesystems is by UUID,
>> universally *UNIQUE* ID. If you image a device via dd or similar, you
>> of course image its UUID as well, destroying the "unique" assumption in
>> UUID and confusing btrfs, which will consider it part of the existing
>> filesystem if the original devices with that filesystem UUID remain
>> hooked up.
>>
>> So if you did what I believe you did, try to mount the image while the
>> original filesystem devices remain attached and mounted, btrfs is
>> simply saying that filesystem (which btrfs identifies by UUID) is
>> already mounted: "device or resource busy".
> [...]
>
> Nope, sorry if I wasn't clear, I didn't mean that I tried to mount the
> image (can you even mount images created with btrfs-image?). Plus the
> images are xz-compressed.
Namespace collision.
I interpreted "image" as referring to an "image" taken with dd or
similar, which as I explained naturally copies the filesystem UUID as
well, and if both that dd-created image and the original filesystem are
visible to btrfs at the same time, bad things happen because btrfs
considers them part of the same filesystem.
And if you tried to mount /that/ image while the original filesystem was
already mounted, you /might/ get a busy error as it could think it was
already mounted. (And you'd be lucky if so, because it just might save
you serious corruption, tho corruption might still happen due to btrfs
writing to the image instead of the mounted original filesystem.)
I wasn't considering trying to mount the btrfs-created metadata images...
Anyway, as long as you weren't trying to mount or work with the
dd-created image at the same time as the original was mounted, you
should be good. Just remove one (or don't create a loopback device from
the image file, so it's not exposed as a device that btrfs can see)
before trying to mount the other.
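One way to handle a dd image carefully is to attach it read-only, so btrfs
can't write to the clone by accident, and detach it again before the original
filesystem is mounted (dry run: `run` only prints; the image path and loop
device are placeholders):

```shell
run() { echo "$@"; }   # dry run -- drop the echo to execute for real
# Attach the dd image read-only; --show prints the allocated loop device:
run losetup --find --show --read-only disk.img
# Detach before mounting the original filesystem with the same UUID:
run losetup -d /dev/loop0
```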
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: system hangs due to qgroups
2016-12-04 18:52 ` Chris Murphy
@ 2016-12-05 9:00 ` Marc Joliet
2016-12-05 10:16 ` Marc Joliet
0 siblings, 1 reply; 22+ messages in thread
From: Marc Joliet @ 2016-12-05 9:00 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 14372 bytes --]
On Sunday 04 December 2016 11:52:40 Chris Murphy wrote:
> On Sun, Dec 4, 2016 at 9:02 AM, Marc Joliet <marcec@gmx.de> wrote:
> > Also, now the file system fails with the BUG I mentioned, see here:
> >
> > [Sun Dec 4 12:27:07 2016] BUG: unable to handle kernel paging request at
> > fffffffffffffe10
> > [Sun Dec 4 12:27:07 2016] IP: [<ffffffff8131226f>]
> > qgroup_fix_relocated_data_extents+0x1f/0x2a0
> > [Sun Dec 4 12:27:07 2016] PGD 1c07067 PUD 1c09067 PMD 0
> > [Sun Dec 4 12:27:07 2016] Oops: 0000 [#1] PREEMPT SMP
> > [Sun Dec 4 12:27:07 2016] Modules linked in: crc32c_intel serio_raw
> > [Sun Dec 4 12:27:07 2016] CPU: 0 PID: 370 Comm: mount Not tainted 4.8.11-
> > gentoo #1
> > [Sun Dec 4 12:27:07 2016] Hardware name: FUJITSU LIFEBOOK A530/FJNBB06,
> > BIOS Version 1.19 08/15/2011
> > [Sun Dec 4 12:27:07 2016] task: ffff8801b1d90000 task.stack:
> > ffff8801b1268000 [Sun Dec 4 12:27:07 2016] RIP:
> > 0010:[<ffffffff8131226f>]
> > [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a0
> > [Sun Dec 4 12:27:07 2016] RSP: 0018:ffff8801b126bcd8 EFLAGS: 00010246
> > [Sun Dec 4 12:27:07 2016] RAX: 0000000000000000 RBX: ffff8801b10b3150
> > RCX:
> > 0000000000000000
> > [Sun Dec 4 12:27:07 2016] RDX: ffff8801b20f24f0 RSI: ffff8801b2790800
> > RDI:
> > ffff8801b20f2460
> > [Sun Dec 4 12:27:07 2016] RBP: ffff8801b10bc000 R08: 0000000000020340
> > R09:
> > ffff8801b20f2460
> > [Sun Dec 4 12:27:07 2016] R10: ffff8801b48b7300 R11: ffffea0005dd0ac0
> > R12:
> > ffff8801b126bd70
> > [Sun Dec 4 12:27:07 2016] R13: 0000000000000000 R14: ffff8801b2790800
> > R15:
> > 00000000b20f2460
> > [Sun Dec 4 12:27:07 2016] FS: 00007f97a7846780(0000)
> > GS:ffff8801bbc00000(0000) knlGS:0000000000000000
> > [Sun Dec 4 12:27:07 2016] CS: 0010 DS: 0000 ES: 0000 CR0:
> > 0000000080050033 [Sun Dec 4 12:27:07 2016] CR2: fffffffffffffe10 CR3:
> > 00000001b12ae000 CR4: 00000000000006f0
> > [Sun Dec 4 12:27:07 2016] Stack:
> > [Sun Dec 4 12:27:07 2016] 0000000000000801 0000000000000801
> > ffff8801b20f2460 ffff8801b4aaa000
> > [Sun Dec 4 12:27:07 2016] 0000000000000801 ffff8801b20f2460
> > ffffffff812c23ed ffff8801b1d90000
> > [Sun Dec 4 12:27:07 2016] 0000000000000000 00ff8801b126bd18
> > ffff8801b10b3150 ffff8801b4aa9800
> > [Sun Dec 4 12:27:07 2016] Call Trace:
> > [Sun Dec 4 12:27:07 2016] [<ffffffff812c23ed>] ?
> > start_transaction+0x8d/0x4e0
> > [Sun Dec 4 12:27:07 2016] [<ffffffff81317913>] ?
> > btrfs_recover_relocation+0x3b3/0x440
> > [Sun Dec 4 12:27:07 2016] [<ffffffff81292b2a>] ?
> > btrfs_remount+0x3ca/0x560 [Sun Dec 4 12:27:07 2016]
> > [<ffffffff811bfc04>] ? shrink_dcache_sb+0x54/0x70 [Sun Dec 4 12:27:07
> > 2016] [<ffffffff811ad473>] ? do_remount_sb+0x63/0x1d0 [Sun Dec 4
> > 12:27:07 2016] [<ffffffff811c9953>] ? do_mount+0x6f3/0xbe0 [Sun Dec 4
> > 12:27:07 2016] [<ffffffff811c918f>] ?
> > copy_mount_options+0xbf/0x170
> > [Sun Dec 4 12:27:07 2016] [<ffffffff811ca111>] ? SyS_mount+0x61/0xa0
> > [Sun Dec 4 12:27:07 2016] [<ffffffff8169565b>] ?
> > entry_SYSCALL_64_fastpath+0x13/0x8f
> > [Sun Dec 4 12:27:07 2016] Code: 66 90 66 2e 0f 1f 84 00 00 00 00 00 41 57
> > 41 56 41 55 41 54 55 53 48 83 ec 50 48 8b 46 08 4c 8b 6e 10 48 8b a8 f0
> > 01 00 00 31 c0 <4d> 8b a5 10 fe ff ff f6 85 80 0c 00 00 01 74 09 80 be b0
> > 05 00 [Sun Dec 4 12:27:07 2016] RIP [<ffffffff8131226f>]
> > qgroup_fix_relocated_data_extents+0x1f/0x2a0
> > [Sun Dec 4 12:27:07 2016] RSP <ffff8801b126bcd8>
> > [Sun Dec 4 12:27:07 2016] CR2: fffffffffffffe10
> > [Sun Dec 4 12:27:07 2016] ---[ end trace bd51bbcfd10492f7 ]---
>
> I can't parse this. Maybe someone else can. Do you get the same thing,
> or a different thing, if you do a normal mount rather than a remount?
The call trace is of course a bit different, but in both cases the RIP line is
almost identical (if that even matters?). Compare the line from my first
message:
"RIP [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a8"
with the newest line:
"RIP [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a0"
But I just remembered, I have one from trying to mount the top-level subvolume
on my desktop:
[So Dez 4 18:45:19 2016] BUG: unable to handle kernel paging request at
fffffffffffffe10
[So Dez 4 18:45:19 2016] IP: [<ffffffff812f1103>]
qgroup_fix_relocated_data_extents+0x33/0x2e0
[So Dez 4 18:45:19 2016] PGD 1a07067 PUD 1a09067 PMD 0
[So Dez 4 18:45:19 2016] Oops: 0000 [#1] PREEMPT SMP
[So Dez 4 18:45:19 2016] Modules linked in: joydev dummy iptable_filter
ip_tables x_tables hid_logitech_hidpp hid_logitech_dj snd_hda_codec_hdmi
snd_hda_codec_analog snd_hda_codec_generic uvcvideo videobuf2_vmalloc
videobuf2_memops videobuf2_v4l2 videobuf2_core videodev snd_usb_audio
snd_hwdep snd_usbmidi_lib radeon i2c_algo_bit drm_kms_helper cfbfillrect
syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea kvm_amd
kvm ttm irqbypass evdev drm k8temp backlight snd_ice1724 snd_ak4113 snd_pt2258
snd_hda_intel snd_i2c snd_ak4114 snd_hda_codec snd_ac97_codec snd_hda_core
ac97_bus snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_rawmidi snd_seq_device snd_pcm
forcedeth snd_timer snd rtc_cmos asus_atk0110 i2c_nforce2 i2c_core sg sr_mod
cdrom xhci_pci ata_generic ohci_pci xhci_hcd pata_amd pata_acpi ohci_hcd
ehci_pci
[So Dez 4 18:45:19 2016] ehci_hcd
[So Dez 4 18:45:19 2016] CPU: 1 PID: 8545 Comm: mount Not tainted 4.8.12-
gentoo #1
[So Dez 4 18:45:19 2016] Hardware name: System manufacturer System Product
Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 1701 10/30/2008
[So Dez 4 18:45:19 2016] task: ffff88003d0a2100 task.stack: ffff88011a5e0000
[So Dez 4 18:45:19 2016] RIP: 0010:[<ffffffff812f1103>] [<ffffffff812f1103>]
qgroup_fix_relocated_data_extents+0x33/0x2e0
[So Dez 4 18:45:19 2016] RSP: 0018:ffff88011a5e3a38 EFLAGS: 00010246
[So Dez 4 18:45:19 2016] RAX: 0000000000000000 RBX: ffff88007d3d8690 RCX:
0000000000000000
[So Dez 4 18:45:19 2016] RDX: ffff880138be2ef0 RSI: ffff88007c3e3800 RDI:
ffff880138be2e60
[So Dez 4 18:45:19 2016] RBP: ffff88007b69a000 R08: 000000000001f0d0 R09:
ffff880138be2e60
[So Dez 4 18:45:19 2016] R10: ffff88007d3d82a0 R11: ffffea0001f4f600 R12:
ffff88011a5e3ad0
[So Dez 4 18:45:19 2016] R13: 0000000000000000 R14: ffff88007c3e3800 R15:
0000000038be2e60
[So Dez 4 18:45:19 2016] FS: 00007fe24229c780(0000)
GS:ffff88013fc80000(0000) knlGS:0000000000000000
[So Dez 4 18:45:19 2016] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[So Dez 4 18:45:19 2016] CR2: fffffffffffffe10 CR3: 000000010b3e8000 CR4:
00000000000006e0
[So Dez 4 18:45:19 2016] Stack:
[So Dez 4 18:45:19 2016] 0000000000000801 0000000000000801 ffff880138be2e60
ffff88005b94f000
[So Dez 4 18:45:19 2016] 0000000000000801 ffff880138be2e60 ffffffff8129dd8c
ffff88003d0a2100
[So Dez 4 18:45:19 2016] 0000000000000000 00ff88011a5e3a78 ffff88007d3d8690
ffff88005b94a800
[So Dez 4 18:45:19 2016] Call Trace:
[So Dez 4 18:45:19 2016] [<ffffffff8129dd8c>] ? start_transaction+0x8c/0x4e0
[So Dez 4 18:45:19 2016] [<ffffffff812f69df>] ?
btrfs_recover_relocation+0x3bf/0x440
[So Dez 4 18:45:19 2016] [<ffffffff8129a562>] ? open_ctree+0x2182/0x26a0
[So Dez 4 18:45:19 2016] [<ffffffff81383769>] ? snprintf+0x39/0x40
[So Dez 4 18:45:19 2016] [<ffffffff8126d362>] ? btrfs_mount+0xd32/0xe40
[So Dez 4 18:45:19 2016] [<ffffffff81132e41>] ? pcpu_alloc+0x321/0x610
[So Dez 4 18:45:19 2016] [<ffffffff81179915>] ? mount_fs+0x45/0x180
[So Dez 4 18:45:19 2016] [<ffffffff811941e1>] ? vfs_kern_mount+0x71/0x130
[So Dez 4 18:45:19 2016] [<ffffffff8126c7c2>] ? btrfs_mount+0x192/0xe40
[So Dez 4 18:45:19 2016] [<ffffffff81132e41>] ? pcpu_alloc+0x321/0x610
[So Dez 4 18:45:19 2016] [<ffffffff81179915>] ? mount_fs+0x45/0x180
[So Dez 4 18:45:19 2016] [<ffffffff811941e1>] ? vfs_kern_mount+0x71/0x130
[So Dez 4 18:45:19 2016] [<ffffffff81196df5>] ? do_mount+0x1e5/0xc00
[So Dez 4 18:45:19 2016] [<ffffffff8117264f>] ?
__check_object_size+0x13f/0x1ed
[So Dez 4 18:45:19 2016] [<ffffffff8112eaa3>] ? memdup_user+0x53/0x80
[So Dez 4 18:45:19 2016] [<ffffffff81197af5>] ? SyS_mount+0x75/0xc0
[So Dez 4 18:45:19 2016] [<ffffffff8160275b>] ?
entry_SYSCALL_64_fastpath+0x13/0x8f
[So Dez 4 18:45:19 2016] Code: 50 48 89 6c 24 58 4c 89 64 24 60 4c 89 6c 24
68 4c 89 74 24 70 4c 89 7c 24 78 48 8b 46 08 4c 8b 6e 10 48 8b a8 f0 01 00 00
31 c0 <4d> 8b a5 10 fe ff ff f6 85 80 0c 00 00 01 74 09 80 be b0 05 00
[So Dez 4 18:45:19 2016] RIP [<ffffffff812f1103>]
qgroup_fix_relocated_data_extents+0x33/0x2e0
[So Dez 4 18:45:19 2016] RSP <ffff88011a5e3a38>
[So Dez 4 18:45:19 2016] CR2: fffffffffffffe10
[So Dez 4 18:45:19 2016] ---[ end trace 5c46f8b4f82b998c ]---
> > Ah, but what does work is mounting a snapshot, in the sense that mount
> > doesn't fail. However, it seems that the balance still continues, so I'm
> > back at square one.
>
> Interesting that mounting a subvolume directly works, seeing as that's
> just a bind mount behind the scene. But maybe there's something wrong
> in the top level subvolume that's being skipped when mounting a
> subvolume directly.
Yeah, however, just to be clear, that's a new problem, and one that might be
temporary (I had the same subvolume fail, too, only to "work" the next try).
> Are you mounting with skip_balance mount option?
When I try that, I get the BUGs mentioned above. When I'm at the point where
mount "works", but hangs, I do *not* use it.
> And how do you know
> that it's a balance continuing? What do you get for 'btrfs balance
> status' for this volume? Basically I'm asking if you're sure there's a
> balance happening.
I'm not 100% positive, but the behaviour I'm seeing matches my previous
experience. At least, I vaguely remember being in a similar situation with
the laptop back when I first noticed the performance issues, but I can't
remember any details. I *think* the balance finished in a rescue shell, too,
after an hour or so (it was only a data balance, I think).
> The balance itself is not bad, it's just that it
> slows everything down astronomically. That's the main reason why you'd
> like to skip it or cancel it. Instead of balancing it might be doing
> some sort of cleanup. Either 'top' or 'perf top' might give a clue
> what's going on if 'btrfs balance status' doesn't show a balance is
> happening, and yet the drive is super busy as if a balance is
> happening.
Sadly, "balance status" won't work, because it operates on mounted file
systems, and mount never finishes. So "balance cancel" won't work, either.
Also, I can't just watch drive activity, because the symptom of the
performance issue is that there is none for a long time, with one or more
intermittent bursts of activity, during which the system becomes (partially)
responsive, at least temporarily.
But thanks for reminding me of perf top. Is there any particularly
recommended invocation, e.g., specific events it should watch?
Actually, I just randomly tried "perf top -e btrfs:*", which works :D.
I see, amongst others:
253 btrfs:btrfs_qgroup_release_data
1K btrfs:btrfs_qgroup_free_delayed_ref
Sadly, there do not seem to be any balance related events.
Also, I've been running "dstat -df" the entire time, and it shows *no*
activity whatsoever going on.
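For reference, the broad "btrfs:*" glob can be narrowed to just the qgroup tracepoints before handing them to "perf top -e". A minimal sketch of the filtering step, using a hypothetical excerpt of "perf list" output (the real listing comes from `perf list 'btrfs:*'` and depends on the running kernel):

```shell
# Hypothetical excerpt of "perf list 'btrfs:*'" output; the actual set of
# tracepoints varies by kernel version.
sample="btrfs:btrfs_qgroup_release_data            [Tracepoint event]
btrfs:btrfs_qgroup_free_delayed_ref        [Tracepoint event]
btrfs:btrfs_transaction_commit             [Tracepoint event]"

# Pick out only the qgroup-related tracepoint names, e.g. to pass as a
# comma-separated list to "perf top -e" instead of the broad glob.
printf '%s\n' "$sample" | grep -o 'btrfs:[a-z_]*qgroup[a-z_]*'
```

On a live system the same grep would run against real `perf list` output, of course.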
> Also, if you boot from alternate media, scrub resuming should not
> happen because the progress file for scrub is in /var/lib/btrfs, there
> is no metadata on the Btrfs volume itself that indicates it's being
> scrubbed or what the progress is.
>
> > Well, btrfs check came back clean. And as mentioned above, I was able to
> > get two images, but with btrfs-progs 4.7.3 (the version in sysrescuecd).
> > I can get different images from the initramfs (which I didn't think of
> > earlier, sorry).
>
> 'btrfs check' using btrfs-progs 4.8.2 or higher came back clean? That
> sounds like a bug. You're having quota related problems (at least it's
> a contributing factor) but btrfs check says clean, while the kernel is
> getting confused. So either 'btrfs check' is correct that there are no
> problems, and there's a kernel bug resulting in confusion; or the
> check is missing something, and that's why the kernel is mishandling
> it. In either case, there's a kernel bug.
Yeah, I was sure it was a bug when I first had the laptop hang for 30 minutes
for no apparent reason ;) . But I see what you mean. Whether it's a balance
or not, *something* is going on, though, as per the perf events shown above
(none of my other btrfs file systems have quota support enabled, so it must be
from the laptop drive).
> So yeah for sure you'll want a sanitized btrfs-image captured for the
> developers to look at; put it somewhere like Google Drive or wherever
> they can grab it. And put the URL in this thread, and/or also file a
> bug about this problem with the URL to the image included.
OK, I'll post the URLs once the images are uploaded. (I had Dropbox public
URLs right before my desktop crashed -- see below -- but now dropbox-cli
doesn't want to create them.)
> Looking at 4.9 there's not many qgroup.c changes, but there's a pile
> of other changes, per usual. So even though the problem seems like
> it's qgroup related, it might actually be some other problem that then
> also triggers qgroup messages.
Yeah, Qu responded in the meantime, and there are fixes available, but not
merged yet.
Also: I would have sent this mail yesterday, but my desktop hung in the same
way my laptop hangs, except that it never recovered. I waited until this
morning, but it was still unresponsive. After forcing a reboot, the logs
showed nothing, i.e., they stopped around the time of the hang. But this
time, running "mount -o subvol=/.snapshots/rootfs.201611260202,skip_balance
/dev/sdd1 /mnt/tmp" eventually resulted in a recursive fault (I didn't have a
camera with me, sorry). Now I've rebooted normally, and my email draft was
still intact, *phew*.
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
* Re: system hangs due to qgroups
2016-12-05 9:00 ` Marc Joliet
@ 2016-12-05 10:16 ` Marc Joliet
2016-12-05 23:22 ` Marc Joliet
0 siblings, 1 reply; 22+ messages in thread
From: Marc Joliet @ 2016-12-05 10:16 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 703 bytes --]
On Monday 05 December 2016 10:00:13 Marc Joliet wrote:
> OK, I'll post the URLs once the images are uploaded. (I had Dropbox public
> URLs right before my desktop crashed -- see below -- but now dropbox-cli
> doesn't want to create them.)
Alright, here you go:
https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.7.3_sanitized.image.xz
https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.8.5_sanitized.image.xz
(FYI, "dropbox-cli puburl" appears to have broken recently, so I had to use
the Dropbox web interface to get these URLs.)
Greetings
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
* Re: system hangs due to qgroups
2016-12-05 0:39 ` Qu Wenruo
@ 2016-12-05 11:01 ` Marc Joliet
2016-12-05 12:10 ` Marc Joliet
2016-12-05 14:43 ` [SOLVED] " Marc Joliet
0 siblings, 2 replies; 22+ messages in thread
From: Marc Joliet @ 2016-12-05 11:01 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 7010 bytes --]
On Monday 05 December 2016 08:39:02 Qu Wenruo wrote:
> At 12/04/2016 02:40 AM, Marc Joliet wrote:
> > Hello all,
> >
> > I'm having some trouble with btrfs on a laptop, possibly due to qgroups.
> > Specifically, some file system activities (e.g., snapshot creation,
> > baloo_file_extractor from KDE Plasma) cause the system to hang for up to
> > about 40 minutes, maybe more. It always causes (most of) my desktop to
> > hang, (although I can usually navigate between pre-existing Konsole tabs)
> > and prevents new programs from starting. I've seen the system load go up
> > to >30 before the laptop suddenly resumes normal operation. I've been
> > seeing this since Linux 4.7, maybe already 4.6.
>
> Qgroup is a CPU-intensive operation.
>
> The main problem is the design of the btrfs extent tree, which is biased
> towards snapshot creation speed, but is quite complicated when used for
> tracing all referencers (which qgroup heavily relies on).
>
> The main factor affecting qgroup speed is how many shared extents are
> in the fs.
> This includes reflinked files and snapshots; in most cases snapshots are
> the main part.
>
> Unless we find a better solution that keeps qgroup both accurate and fast,
> I'd recommend keeping the number of qgroups reasonable.
> (Personally speaking, 10 would be good.)
>
> Besides qgroups, relocation (balancing) is also affected by the
> number of shared extents.
OK
> > Now, I thought that maybe this was (indirectly) due to an overly full file
> > system (~90% full), so I deleted some things I didn't need to get it up to
> > 15% free. (For the record, I also tried mounting with ssd_spread.)
> > After that, I ran a balance with -dusage=50, which started out promising,
> > but then went back to the "bad" behaviour. *But* it seemed better than
> > before overall, so I started a balance with -musage=10, then -musage=50.
> > That turned out to be a mistake. Since I had to transport the laptop,
> > and couldn't wait for "balance cancel" to return (IIUC it only returns
> > after the next block (group?) is freed), I forced the laptop off.
> >
> > After I next turned on the laptop, the balance resumed, causing bootup to
> > fail, after which I remembered about the skip_balance mount option,
> > which I tried in a rescue shell from an initramfs. But wait, that
> > failed, too!
> > Specifically, the stack trace I get whenever I try it includes as one of
> > the last lines:
> >
> > "RIP [<ffffffff8131226f>] qgroup_fix_relocated_data_extents+0x1f/0x2a8"
>
> This seems to be a NULL pointer bug in qgroup relocation fix.
>
> The latest fix (not merged yet) should address it.
>
> You could try the for-next-20161125 branch from David to fix it:
> https://github.com/kdave/btrfs-devel/tree/for-next-20161125
OK, I'll try that, thanks! I just have to wait for it to finish cloning...
> > (I can take photos of the full stack trace if requested.)
> >
> > So then I ran "btrfs qgroup show /sysroot/", which showed many quota
> > groups, much to my surprise. On the upside, at least now I discovered
> > the likely reason for the performance problems.
>
> So, the number of qgroups is the cause for the slowness.
OK
> > (I actually think I know why I'm seeing qgroups: at one point I was trying
> > out various snapshot/backup tools for btrfs, and one (I forgot which)
> > unconditionally activated quota support, which infuriated me, but I
> > promptly deactivated it, or so I thought. Is quota support automatically
> > enabled when qgroups are discovered, or did I perhaps not disable quota
> > support properly?)
> Qgroup stays enabled after "btrfs quota enable" until "btrfs quota
> disable" turns it off.
>
> There is no way to temporarily disable quota, since quota must trace
> every modification, or the qgroup numbers will go out of date.
>
> So one has to disable quota manually.
> (And that's the backup tool's fault; it should either inform the user or
> disable qgroups on uninstallation.)
Hmm, I must not be remembering the whole story then, because I was pretty sure
that I ran "quota disable" and verified that quotas were off, too, but then
again, it's been quite a while now (a year?) since it happened.
> > Since I couldn't use skip_balance, and logically can't destroy qgroups
> > on a read-only file system, I decided to wait for a regular mount to
> > finish. That has been running since Tuesday, and I am slowly growing
> > impatient.
> >
> > Thus I arrive at my question(s): is there anything else I can try,
> > short of reformatting and restoring from backup? Can I use btrfs-check
> > here, or any other tool? Or...?
> >
> > Also, should I be able to avoid reformatting: how do I properly disable
> > quota support?
>
> "btrfs quota disable <mnt>"; yes, you need a RW mount.
> Any RW-mountable snapshot/subvolume is OK.
OK
> > (BTW, searching for qgroup_fix_relocated_data_extents turned up the ML
> > thread "[PATCH] Btrfs: fix endless loop in balancing block groups", could
> > that be related?)
>
> Nope, the actual fixing patches are:
> [PATCH 1/4] btrfs: qgroup: Add comments explaining how btrfs qgroup works
> [PATCH 2/4] btrfs: qgroup: Rename functions to make it follow
> reserve,trace,account steps
> [PATCH 3/4] btrfs: Export and move leaf/subtree qgroup helpers to qgroup.c
> [PATCH 4/4] btrfs: qgroup: Fix qgroup data leaking by using subtree tracing
>
>
> The 4th patch is the real working one, but relies on previous 3 to apply.
>
> The regression is also caused by my patch:
> [PATCH v3.1 2/3] btrfs: relocation: Fix leaking qgroups numbers on data
> extents
>
> Sorry for the trouble.
No problem, I just wish I would've thought to check for qgroups before getting
into this mess.
Although I'm actually *relieved* that it's qgroups, because before that I was
worried that I had finally hit a nigh-show-stopping bug. I thought that I was
merely not seeing it on my other systems, but that it could happen at any
time. Now I'm more confident in the stability of my systems again :) .
> And for your recovery, I'd suggest installing Arch Linux onto a USB
> HDD or USB stick, then compiling David's branch and installing it onto
> the USB drive.
>
> Then use the USB storage as a rescue tool to mount the fs, which should
> do a RW mount with or without the skip_balance mount option.
> So you could disable quota then.
OK, I'll try that, thanks!
> Thanks,
> Qu
>
> > The laptop is currently running Gentoo with Linux 4.8.10 and btrfs-progs
> > 4.8.4.
> >
> > Greetings
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Greetings
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
* Re: system hangs due to qgroups
2016-12-05 11:01 ` Marc Joliet
@ 2016-12-05 12:10 ` Marc Joliet
2016-12-05 14:43 ` [SOLVED] " Marc Joliet
1 sibling, 0 replies; 22+ messages in thread
From: Marc Joliet @ 2016-12-05 12:10 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 926 bytes --]
On Monday 05 December 2016 12:01:28 Marc Joliet wrote:
> > You could try the for-next-20161125 branch from David to fix it:
> > https://github.com/kdave/btrfs-devel/tree/for-next-20161125
>
> OK, I'll try that, thanks! I just have to wait for it to finish cloning...
FWIW, I get this warning:
CC fs/btrfs/inode.o
fs/btrfs/inode.c: In function 'run_delalloc_range':
fs/btrfs/inode.c:1219:9: warning: 'cur_end' may be used uninitialized in
this function [-Wmaybe-uninitialized]
start = cur_end + 1;
^
fs/btrfs/inode.c:1172:6: note: 'cur_end' was declared here
Should I be worried about that? At a cursory glance, it looks like a false
alarm, but I just want to be sure (and even so, false alarms are annoying).
Greetings
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
* [SOLVED] Re: system hangs due to qgroups
2016-12-05 11:01 ` Marc Joliet
2016-12-05 12:10 ` Marc Joliet
@ 2016-12-05 14:43 ` Marc Joliet
2016-12-06 0:29 ` Qu Wenruo
1 sibling, 1 reply; 22+ messages in thread
From: Marc Joliet @ 2016-12-05 14:43 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 1450 bytes --]
On Monday 05 December 2016 12:01:28 Marc Joliet wrote:
> > This seems to be a NULL pointer bug in qgroup relocation fix.
> >
> > The latest fix (not merged yet) should address it.
> >
> > You could try the for-next-20161125 branch from David to fix it:
> > https://github.com/kdave/btrfs-devel/tree/for-next-20161125
>
> OK, I'll try that, thanks! I just have to wait for it to finish cloning...
>
[...]
> > And for your recovery, I'd suggest installing Arch Linux onto a USB
> > HDD or USB stick, then compiling David's branch and installing it onto
> > the USB drive.
> >
> > Then use the USB storage as a rescue tool to mount the fs, which should
> > do a RW mount with or without the skip_balance mount option.
> > So you could disable quota then.
>
> OK, I'll try that, thanks!
Excellent, thank you, that worked! My laptop is working normally again. I'll
keep an eye on it, but so far two balance operations ran normally (that is,
they completed within a few minutes and without hanging the system).
(Specifically, since I didn't find out how to get a different kernel onto the
Arch USB stick, I simply installed the kernel on my desktop, then did
everything from an initramfs emergency shell, then moved the SSD back into the
laptop.)
Thanks, everyone!
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
* Re: system hangs due to qgroups
2016-12-05 10:16 ` Marc Joliet
@ 2016-12-05 23:22 ` Marc Joliet
2016-12-19 11:17 ` Marc Joliet
0 siblings, 1 reply; 22+ messages in thread
From: Marc Joliet @ 2016-12-05 23:22 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 460 bytes --]
On Monday 05 December 2016 11:16:35 Marc Joliet wrote:
[...]
> https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.7.3_sanitized.image.xz
> https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.8.5_sanitized.image.xz
BTW, since my problem appears to have been known, does anybody still care
about these?
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
* Re: [SOLVED] Re: system hangs due to qgroups
2016-12-05 14:43 ` [SOLVED] " Marc Joliet
@ 2016-12-06 0:29 ` Qu Wenruo
2016-12-06 10:12 ` Marc Joliet
0 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2016-12-06 0:29 UTC (permalink / raw)
To: Marc Joliet, linux-btrfs
At 12/05/2016 10:43 PM, Marc Joliet wrote:
> On Monday 05 December 2016 12:01:28 Marc Joliet wrote:
>>> This seems to be a NULL pointer bug in qgroup relocation fix.
>>>
>>> The latest fix (not merged yet) should address it.
>>>
>>> You could try the for-next-20161125 branch from David to fix it:
>>> https://github.com/kdave/btrfs-devel/tree/for-next-20161125
>>
>> OK, I'll try that, thanks! I just have to wait for it to finish cloning...
>>
> [...]
>>> And for your recovery, I'd suggest installing Arch Linux onto a USB
>>> HDD or USB stick, then compiling David's branch and installing it onto
>>> the USB drive.
>>>
>>> Then use the USB storage as a rescue tool to mount the fs, which should
>>> do a RW mount with or without the skip_balance mount option.
>>> So you could disable quota then.
>>
>> OK, I'll try that, thanks!
>
> Excellent, thank you, that worked! My laptop is working normally again. I'll
> keep an eye on it, but so far two balance operations ran normally (that is,
> they completed within a few minutes and without hanging the system).
>
> (Specifically, since I didn't find out how to get a different kernel onto the
> Arch USB stick, I simply installed the kernel on my desktop, then did
> everything from an initramfs emergency shell, then moved the SSD back into the
> laptop.)
>
> Thanks, everyone!
>
Glad that helped.
I just forgot that you're using Gentoo, not Arch Linux, so the Arch kernel
install script wouldn't have worked for you.
Anyway, I'm glad it works for you.
BTW, if you haven't disabled quota yet, would you please report how many
qgroups you have?
And how much the CPU spins during balancing with quota enabled?
This would help us evaluate how much qgroups slow down the process when
there are too many snapshots.
Thanks,
Qu
* Re: [SOLVED] Re: system hangs due to qgroups
2016-12-06 0:29 ` Qu Wenruo
@ 2016-12-06 10:12 ` Marc Joliet
2016-12-06 14:55 ` Marc Joliet
0 siblings, 1 reply; 22+ messages in thread
From: Marc Joliet @ 2016-12-06 10:12 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 3124 bytes --]
On Tuesday 06 December 2016 08:29:48 Qu Wenruo wrote:
> At 12/05/2016 10:43 PM, Marc Joliet wrote:
> > On Monday 05 December 2016 12:01:28 Marc Joliet wrote:
> >>> This seems to be a NULL pointer bug in qgroup relocation fix.
> >>>
> >>> The latest fix (not merged yet) should address it.
> >>>
> >>> You could try the for-next-20161125 branch from David to fix it:
> >>> https://github.com/kdave/btrfs-devel/tree/for-next-20161125
> >>
> >> OK, I'll try that, thanks! I just have to wait for it to finish
> >> cloning...
> >
> > [...]
> >
> >>> And for your recovery, I'd suggest installing Arch Linux onto a USB
> >>> HDD or USB stick, then compiling David's branch and installing it onto
> >>> the USB drive.
> >>>
> >>> Then use the USB storage as a rescue tool to mount the fs, which should
> >>> do a RW mount with or without the skip_balance mount option.
> >>> So you could disable quota then.
> >>
> >> OK, I'll try that, thanks!
> >
> > Excellent, thank you, that worked! My laptop is working normally again.
> > I'll keep an eye on it, but so far two balance operations ran normally
> > (that is, they completed within a few minutes and without hanging the
> > system).
> >
> > (Specifically, since I didn't find out how to get a different kernel onto
> > the Arch USB stick, I simply installed the kernel on my desktop, then did
> > everything from an initramfs emergency shell, then moved the SSD back
> > into the laptop.)
> >
> > Thanks, everyone!
>
> Glad that helped.
>
> I just forgot that you're using gentoo, not archlinux, and kernel
> install script won't work for archlinux.
>
> Anyway, I'm glad that works for you.
>
> BTW, if you haven't yet disable quota, would you please give a report on
> how many qgroup you have?
I have disabled quotas already (first thing I did after mounting). However,
there were definitely 20-30, maybe more (enough for 2, maybe 3, console pages
-- I don't know how many lines the initramfs rescue shell has, but based on
that, you could estimate the number of qgroups).
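For what it's worth, the count can be read off directly rather than estimated from console pages. A sketch using a hypothetical `btrfs qgroup show` listing (the qgroupids and sizes are made up; on the real system this would be the output of `btrfs qgroup show <mnt>`, which in btrfs-progs of this era prints a two-line header before the rows):

```shell
# Hypothetical "btrfs qgroup show" output, for illustration only.
listing='qgroupid         rfer         excl
--------         ----         ----
0/5           1.00GiB      512.00MiB
0/257         2.00GiB        1.00MiB
0/258         2.00GiB        1.00MiB
0/259         3.50GiB       12.00MiB'

# Skip the two header lines and count the remaining qgroup rows.
printf '%s\n' "$listing" | tail -n +3 | wc -l
```

So `btrfs qgroup show <mnt> | tail -n +3 | wc -l` would have given the exact number, assuming that header layout.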
> And how CPU is spinning for balancing with quota enabled?
All I can say, based on past observations, is that I would see a single
process (usually btrfs-transaction, but often a user-space process, such as
baloo_file_extractor) using a single CPU at 100% and blocking (almost)
everything else; it would either finish after a while if it was quick enough,
or there would be intermittent time frames where other processes weren't
blocked. With balancing the behaviour was the latter, only it was the btrfs
process using 100% CPU. Furthermore, metadata balances were worse than data
balances.
> This would help us to evaluate how qgroup slows down the process if
> there are too many snapshots.
Again, sorry that I was so quick to disable quotas, but I was only willing to
do so much debugging with this laptop.
> Thanks,
> Qu
Greetings
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
* Re: [SOLVED] Re: system hangs due to qgroups
2016-12-06 10:12 ` Marc Joliet
@ 2016-12-06 14:55 ` Marc Joliet
0 siblings, 0 replies; 22+ messages in thread
From: Marc Joliet @ 2016-12-06 14:55 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 573 bytes --]
On Tuesday 06 December 2016 11:12:12 Marc Joliet wrote:
> I have disabled quotas already (first thing I did after
> mounting). However, there were definitely 20-30, maybe more (enough for
> 2, maybe 3, console pages -- I don't know how many lines the initramfs
> rescue shell has, but based on that, you could estimate the number of
> qgroups).
Of course, you can probably check the sanitized images I posted for more
information.
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
* Re: system hangs due to qgroups
2016-12-05 23:22 ` Marc Joliet
@ 2016-12-19 11:17 ` Marc Joliet
0 siblings, 0 replies; 22+ messages in thread
From: Marc Joliet @ 2016-12-19 11:17 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 662 bytes --]
On Tuesday 06 December 2016 00:22:39 Marc Joliet wrote:
> On Monday 05 December 2016 11:16:35 Marc Joliet wrote:
> [...]
>
> > https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.7.3_sanitized.image.xz
> > https://dl.dropboxusercontent.com/u/5328255/arthur_root_4.8.5_sanitized.image.xz
>
> BTW, since my problem appears to have been known, does anybody still care
> about these?
I'll remove these files from Dropbox tomorrow around noon unless somebody says
they still need them.
Greetings
--
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup
end of thread, other threads:[~2016-12-19 11:17 UTC | newest]
Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-03 18:40 system hangs due to qgroups Marc Joliet
2016-12-03 20:42 ` Chris Murphy
2016-12-03 21:46 ` Marc Joliet
2016-12-03 22:56 ` Chris Murphy
2016-12-04 16:02 ` Marc Joliet
2016-12-04 18:24 ` Duncan
2016-12-04 19:20 ` Marc Joliet
2016-12-05 2:32 ` Duncan
2016-12-04 18:52 ` Chris Murphy
2016-12-05 9:00 ` Marc Joliet
2016-12-05 10:16 ` Marc Joliet
2016-12-05 23:22 ` Marc Joliet
2016-12-19 11:17 ` Marc Joliet
2016-12-04 2:10 ` Adam Borowski
2016-12-04 16:02 ` Marc Joliet
2016-12-05 0:39 ` Qu Wenruo
2016-12-05 11:01 ` Marc Joliet
2016-12-05 12:10 ` Marc Joliet
2016-12-05 14:43 ` [SOLVED] " Marc Joliet
2016-12-06 0:29 ` Qu Wenruo
2016-12-06 10:12 ` Marc Joliet
2016-12-06 14:55 ` Marc Joliet