From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: "Stéphane Lesimple" <stephane_btrfs@lesimple.fr>,
"Qu Wenruo" <quwenruo.btrfs@gmx.com>
Cc: <linux-btrfs@vger.kernel.org>
Subject: Re: kernel BUG at linux-4.2.0/fs/btrfs/extent-tree.c:1833 on rebalance
Date: Fri, 18 Sep 2015 08:59:21 +0800 [thread overview]
Message-ID: <55FB61E9.4000300@cn.fujitsu.com> (raw)
In-Reply-To: <53a5553a9c5301789e246144bb264e43@all.all>
Stéphane Lesimple wrote on 2015/09/17 20:47 +0200:
> Le 2015-09-17 12:41, Qu Wenruo a écrit :
>>> In the meantime, I've reactivated quotas, unmounted the filesystem and
>>> ran a btrfsck on it: as you would expect, no qgroup problem has been
>>> reported so far.
>>
>> At least the rescan code is working without problems.
>>
>>> I'll clear all my snapshots, run a quota rescan, then
>>> re-create them one by one by rsyncing from the ext4 system I still have.
>>> Maybe I'll run into the issue again.
>>>
>>
>> Would you mind doing the following check for each subvolume you rsync?
>>
>> 1) Do 'sync; btrfs qgroup show -prce --raw' and save the output
>> 2) Create the needed snapshot
>> 3) Do 'sync; btrfs qgroup show -prce --raw' and save the output
>> 4) Avoid doing IO if possible until step 6)
>> 5) Do 'btrfs quota rescan -w' and save it
>> 6) Do 'sync; btrfs qgroup show -prce --raw' and save the output
>> 7) Rsync data from ext4 to the newly created snapshot
>>
>> The point is: since rescan is working fine (as you mentioned), we can
>> compare the outputs from 1), 3) and 6) to see which qgroup accounting
>> numbers change.
>>
>> If they differ, then either the qgroup update at write time or the one
>> at snapshot creation is wrong, and we can at least narrow the problem
>> down to the qgroup update routine or to snapshot creation.
>
> I was about to do that, but first there's something that seems strange:
> I began by trashing all my snapshots, then ran a quota rescan and
> waited for it to complete, so as to start from a sane base.
> However, this is the output of qgroup show now:
By "trashing", did you mean deleting all the files inside the subvolume?
Or "btrfs subv del"?
>
> qgroupid          rfer                 excl max_rfer max_excl parent child
> -------- ------------- -------------------- -------- -------- ------ -----
> 0/5              16384                16384     none     none    ---   ---
> 0/1906   1657848029184        1657848029184     none     none    ---   ---
> 0/1909    124950921216         124950921216     none     none    ---   ---
> 0/1911   1054587293696        1054587293696     none     none    ---   ---
> 0/3270     23727300608          23727300608     none     none    ---   ---
> 0/3314     23206055936          23206055936     none     none    ---   ---
> 0/3317     18472996864                    0     none     none    ---   ---
> 0/3318     22235709440 18446744073708421120     none     none    ---   ---
> 0/3319     22240333824                    0     none     none    ---   ---
> 0/3320     22289608704                    0     none     none    ---   ---
> 0/3321     22289608704                    0     none     none    ---   ---
> 0/3322     18461151232                    0     none     none    ---   ---
> 0/3323     18423902208                    0     none     none    ---   ---
> 0/3324     18423902208                    0     none     none    ---   ---
> 0/3325     18463506432                    0     none     none    ---   ---
> 0/3326     18463506432                    0     none     none    ---   ---
> 0/3327     18463506432                    0     none     none    ---   ---
> 0/3328     18463506432                    0     none     none    ---   ---
> 0/3329     18585427968                    0     none     none    ---   ---
> 0/3330     18621472768 18446744073251348480     none     none    ---   ---
> 0/3331     18621472768                    0     none     none    ---   ---
> 0/3332     18621472768                    0     none     none    ---   ---
> 0/3333     18783076352                    0     none     none    ---   ---
> 0/3334     18799804416                    0     none     none    ---   ---
> 0/3335     18799804416                    0     none     none    ---   ---
> 0/3336     18816217088                    0     none     none    ---   ---
> 0/3337     18816266240                    0     none     none    ---   ---
> 0/3338     18816266240                    0     none     none    ---   ---
> 0/3339     18816266240                    0     none     none    ---   ---
> 0/3340     18816364544                    0     none     none    ---   ---
> 0/3341      7530119168           7530119168     none     none    ---   ---
> 0/3342      4919283712                    0     none     none    ---   ---
> 0/3343      4921724928                    0     none     none    ---   ---
> 0/3344      4921724928                    0     none     none    ---   ---
> 0/3345      6503317504 18446744073690902528     none     none    ---   ---
> 0/3346      6503452672                    0     none     none    ---   ---
> 0/3347      6509514752                    0     none     none    ---   ---
> 0/3348      6515793920                    0     none     none    ---   ---
> 0/3349      6515793920                    0     none     none    ---   ---
> 0/3350      6518685696                    0     none     none    ---   ---
> 0/3351      6521511936                    0     none     none    ---   ---
> 0/3352      6521511936                    0     none     none    ---   ---
> 0/3353      6521544704                    0     none     none    ---   ---
> 0/3354      6597963776                    0     none     none    ---   ---
> 0/3355      6598275072                    0     none     none    ---   ---
> 0/3356      6635880448                    0     none     none    ---   ---
> 0/3357      6635880448                    0     none     none    ---   ---
> 0/3358      6635880448                    0     none     none    ---   ---
> 0/3359      6635880448                    0     none     none    ---   ---
> 0/3360      6635880448                    0     none     none    ---   ---
> 0/3361      6635880448                    0     none     none    ---   ---
> 0/3362      6635880448                    0     none     none    ---   ---
> 0/3363      6635880448                    0     none     none    ---   ---
> 0/3364      6635880448                    0     none     none    ---   ---
> 0/3365      6635880448                    0     none     none    ---   ---
> 0/3366      6635896832                    0     none     none    ---   ---
> 0/3367     24185790464          24185790464     none     none    ---   ---
>
Nooooo!! What a weird result!
Qgroup 3345 has a negative number again, even after a qgroup rescan...
IIRC, from the code, rescan just passes old_roots as NULL and uses the
correct new_roots to build up "rfer" and "excl".
So in theory it should never go below zero during a rescan.
My only hope is that it's an orphan qgroup (mentioned below).
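(Side note for readers of the archive: the huge "excl" values in the listing above are small negative numbers printed as unsigned 64-bit integers, which is what "minus number" refers to. A quick check, with a hypothetical helper name:)

```python
def u64_to_signed(v: int) -> int:
    """Interpret an unsigned 64-bit value as two's-complement signed."""
    return v - (1 << 64) if v >= (1 << 63) else v

# The suspicious "excl" values from the qgroup show output above:
for raw in (18446744073708421120, 18446744073251348480, 18446744073690902528):
    print(raw, "->", u64_to_signed(raw))
# 18446744073708421120 -> -1130496    (qgroup 0/3318)
# 18446744073251348480 -> -458203136  (qgroup 0/3330)
# 18446744073690902528 -> -18649088   (qgroup 0/3345)
```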
> I would have expected all these qgroupids to have been removed along
> with the snapshots, but it seems not. It reminded me of the bug you
> were talking about, where deleted snapshots don't always clear their
> qgroup correctly, but as these don't disappear after a rescan either,
> I'm a bit surprised.
If you mean you did "btrfs subv del" on the subvolume, then it's known
that the qgroup won't be deleted, and won't be associated with any
subvolume.
(It's possible that a later created subvolume reuses the old subvolid
and gets associated with the old qgroup again.)
If the qgroups above with 0 or even negative "excl" numbers are orphans,
I'll be much relieved, as that would be a minor orphan-qgroup bug rather
than something needing another qgroup rework (or at least a huge review).
>
> I've just tried quota disable / quota enable, and now it seems OK. Just
> wanted to let you know, in case it's not known behavior...
Thanks a lot for your info, which indeed exposes something we didn't
take into much consideration.
And if the qgroups match the description above, would you mind removing
them?
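(For the record, leftover level-0 qgroups can be removed with "btrfs qgroup destroy". A sketch; the mount point is a placeholder, the qgroup ids are examples taken from the listing above, and the loop only prints the commands:)

```shell
#!/bin/sh
# Remove suspected orphan level-0 qgroups left behind by deleted
# snapshots. MNT is a placeholder; the ids are examples from the
# listing above. Drop the 'echo' to actually run the commands.
MNT=/mnt/btrfs
for qg in 0/3317 0/3319 0/3320; do
    echo btrfs qgroup destroy "$qg" "$MNT"
done
```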
>
> The procedure I'll use will be slightly different from what you
> proposed, but to my understanding it won't change the result:
>
>> 0) Rsync data from the next ext4 "snapshot" to the subvolume
>> 1) Do 'sync; btrfs qgroup show -prce --raw' and save the output
>> 2) Create the needed readonly snapshot on btrfs
>> 3) Do 'sync; btrfs qgroup show -prce --raw' and save the output
>> 4) Avoid doing IO if possible until step 6)
>> 5) Do 'btrfs quota rescan -w' and save it
>> 6) Do 'sync; btrfs qgroup show -prce --raw' and save the output
>
> I'll post the results once this is done.
>
Thanks a lot!
Qu
2015-09-14 11:46 kernel BUG at linux-4.2.0/fs/btrfs/extent-tree.c:1833 on rebalance Stéphane Lesimple
2015-09-15 14:47 ` Stéphane Lesimple
2015-09-15 14:56 ` Josef Bacik
2015-09-15 21:47 ` Stéphane Lesimple
2015-09-16 5:02 ` Duncan
2015-09-16 10:28 ` Stéphane Lesimple
2015-09-16 10:46 ` Holger Hoffstätte
2015-09-16 13:04 ` Stéphane Lesimple
2015-09-16 20:18 ` Duncan
2015-09-16 20:41 ` Stéphane Lesimple
2015-09-17 3:03 ` Qu Wenruo
2015-09-17 6:11 ` Stéphane Lesimple
2015-09-17 6:42 ` Qu Wenruo
2015-09-17 8:02 ` Stéphane Lesimple
2015-09-17 8:11 ` Qu Wenruo
2015-09-17 10:08 ` Stéphane Lesimple
2015-09-17 10:41 ` Qu Wenruo
2015-09-17 18:47 ` Stéphane Lesimple
2015-09-18 0:59 ` Qu Wenruo [this message]
2015-09-18 7:36 ` Stéphane Lesimple
2015-09-18 10:15 ` Stéphane Lesimple
2015-09-18 10:26 ` Stéphane Lesimple
2015-09-20 1:22 ` Qu Wenruo
2015-09-20 10:35 ` Stéphane Lesimple
2015-09-20 10:51 ` Qu Wenruo
2015-09-20 11:14 ` Stéphane Lesimple
2015-09-22 1:30 ` Stéphane Lesimple
2015-09-22 1:37 ` Qu Wenruo
2015-09-22 7:34 ` Stéphane Lesimple
2015-09-22 8:40 ` Qu Wenruo
2015-09-22 8:51 ` Qu Wenruo
2015-09-22 14:31 ` Stéphane Lesimple
2015-09-23 7:03 ` Qu Wenruo
2015-09-23 9:40 ` Stéphane Lesimple
2015-09-23 10:13 ` Qu Wenruo
2015-09-17 6:29 ` Stéphane Lesimple
2015-09-17 7:54 ` Stéphane Lesimple