From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: Re: [PATCH 0/6 v2] xfs: fix the bulkstat mess
Date: Wed, 5 Nov 2014 17:32:26 +1100
Message-ID: <20141105063226.GF28565@dastard>
In-Reply-To: <20141105060728.GE28565@dastard>
On Wed, Nov 05, 2014 at 05:07:28PM +1100, Dave Chinner wrote:
> On Wed, Nov 05, 2014 at 11:05:15AM +1100, Dave Chinner wrote:
> > Hi folks, this is version 2 of the bulkstat fixup series first
> > posted here:
> >
> > http://oss.sgi.com/archives/xfs/2014-11/msg00057.html
> >
> > Version 2 fixes the issues Brian found during review:
> > - chunk formatter error leakage (patch 3)
> > - moved main loop chunk formatter error handling from patch 4 to
> > patch 5
> > - reworks last_agino updating in patch 6 to do post-formatting
> > updates and added comments.
> >
> > Comments and testing welcome.
>
> I'm not 100% convinced that this fixes all the problems. I just
> created, dumped and restored a 10 million inode filesystem (about
> 50GB of dump file) and I found 102 missing files in the dump with no
> errors from xfsdump or xfsrestore.
>
> The files are missing from just 4 directories out of about 1000
> directories containing equal numbers of files, so it's not a common
> trigger whatever the issue is. I'll keep digging...
OK, this looks like a problem with handling the last record in the
AGI btree:
$ for i in `cat s.diff | grep "^+/" | sed -e 's/^+//'` ; do ls -i $i; done |sort -n
163209114099 /mnt/scratch/2/dbc/5459605f~~~~~~~~RDJX8QBHPPMCGMD7YJQGYPD2
....
163209114129 /mnt/scratch/2/dbc/5459605f~~~~~~~~U820IYQFKS8A6QYCC8HU3ZBX
292057960758 /mnt/scratch/0/dcc/54596070~~~~~~~~9BUH5D5PZTGAC8BT1YL77OZ0
...
292057960769 /mnt/scratch/0/dcc/54596070~~~~~~~~DAO78GAAFNUZU8PH7Q0UZNRH
1395864555809 /mnt/scratch/1/e60/54596103~~~~~~~~GEMXGHYNREW409N7W9INBMVA
.....
1395864555841 /mnt/scratch/1/e60/54596103~~~~~~~~9XPK9FWHCE21AJ3EN023DU47
1653562593576 /mnt/scratch/5/e79/5459611c~~~~~~~~BSBZ6EUCT9HOIRQPMFZDVPQ5
.....
1653562593601 /mnt/scratch/5/e79/5459611c~~~~~~~~6QY1SO8ZGGNQESAGXSB3G3DH
$
xfs_db> convert inode 163209114099 agno
0x26 (38)
xfs_db> convert inode 163209114099 agino
0x571f3 (356851)
xfs_db> convert inode 163209114129 agino
0x57211 (356881)
xfs_db> agi 38
xfs_db> a root
xfs_db> a ptrs[2]
xfs_db> p
....
recs[1-234] = [startino,freecount,free]
......
228:[356352,0,0] 229:[356416,0,0] 230:[356512,0,0] 231:[356576,0,0]
232:[356672,0,0] 233:[356736,0,0] 234:[356832,14,0xfffc000000000000]
So the inodes in the first contiguous range all fall into the partial
final record in the AG.
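
For anyone following along without xfs_db to hand, the "convert inode"
steps above can be sketched in a few lines. This is only an illustration:
the 32-bit width of the AG-relative part is an assumption inferred from
the numbers printed above (on a real filesystem it comes from the
superblock geometry, sb_agblklog + sb_inopblog), and split_ino is a
made-up helper name, not an xfs_db or kernel function.

```python
# Assumption: the AG-relative inode number (agino) occupies the low 32 bits.
# This width is inferred from the xfs_db output in this mail, not read from
# a superblock; other filesystem geometries will use a different width.
AGINO_BITS = 32


def split_ino(ino, agino_bits=AGINO_BITS):
    """Split an absolute inode number into (agno, agino)."""
    agno = ino >> agino_bits
    agino = ino & ((1 << agino_bits) - 1)
    return agno, agino


# First and last missing inode of the first contiguous range listed above.
for ino in (163209114099, 163209114129):
    agno, agino = split_ino(ino)
    print(f"{ino}: agno={agno} agino={agino} ({agino:#x})")
```

With that assumed width, both inodes decode to AG 38 and to aginos
356851 and 356881, which sit between the startino of record 234 (356832)
and the end of its 64-inode chunk.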
xfs_db> convert inode 292057960758 agino
0x2d136 (184630)
.....
155:[184544,0,0] 156:[184608,30,0xfffffffc00000000]
Same.
xfs_db> convert inode 1395864555809 agino
0x2d121 (184609)
.....
155:[184544,0,0] 156:[184608,30,0xfffffffc00000000]
Same.
xfs_db> convert inode 1653562593576 agino
0x2d128 (184616)
....
155:[184544,0,0] 156:[184608,30,0xfffffffc00000000]
Same.
So they are all falling into the last btree record in the AG, and so
appear to have been skipped as a result of the same issue. At least
that gives me something to look at.
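
The check done by eye above can also be written down as a small sketch.
It assumes the usual inobt record layout of 64 inodes per record, with
bit N of the free mask set when inode startino + N is free. The record
tuples and aginos are copied from the xfs_db output above; locate and
the constant names are hypothetical, chosen only for this illustration.

```python
INODES_PER_CHUNK = 64  # assumed: one inobt record describes 64 inodes

# Last inobt records as printed by xfs_db: (startino, freecount, free mask).
LAST_REC_AG38 = (356832, 14, 0xFFFC000000000000)
LAST_REC_OTHER = (184608, 30, 0xFFFFFFFC00000000)


def locate(agino, rec):
    """Return (in_chunk, allocated) for agino against one inobt record."""
    startino, _freecount, free = rec
    offset = agino - startino
    in_chunk = 0 <= offset < INODES_PER_CHUNK
    # A set bit in the free mask means the inode at that offset is free.
    allocated = in_chunk and not (free >> offset) & 1
    return in_chunk, allocated


# AG-relative numbers of the missing inodes, paired with their AG's last record.
missing = [
    (356851, LAST_REC_AG38),
    (356881, LAST_REC_AG38),
    (184630, LAST_REC_OTHER),
    (184609, LAST_REC_OTHER),
    (184616, LAST_REC_OTHER),
]

for agino, rec in missing:
    # Sanity check: the number of set bits should match the record's freecount.
    assert bin(rec[2]).count("1") == rec[1]
    print(agino, locate(agino, rec))
```

Every agino listed lands inside the final record of its AG and in the
allocated part of that record, which matches the observation that only
inodes covered by the last btree record went missing.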
Still, please review the patches I've already posted - if they are
fine I'll push them to Linus ASAP, and then add whatever I find
from this test later.
Cheers,
Dave.
PS: every AG I looked at had an identical inode allocation pattern.
Given that the directory entries and the file contents created are
all deterministic, it's reassuring to see that the allocator has
created identical metadata structure layouts on disk for a
repeating workload that creates identical user-visible
hierarchies...
--
Dave Chinner
david@fromorbit.com