From: Stefan Priebe <s.priebe@profihost.ag>
To: Brian Foster <bfoster@redhat.com>
Cc: "xfs@oss.sgi.com" <xfs@oss.sgi.com>
Subject: Re: Is XFS suitable for 350 million files on 20TB storage?
Date: Fri, 05 Sep 2014 20:07:38 +0200
Message-ID: <5409FBEA.9050708@profihost.ag>
In-Reply-To: <20140905134810.GA3965@laptop.bfoster>
Hi,
On 05.09.2014 15:48, Brian Foster wrote:
> On Fri, Sep 05, 2014 at 02:40:32PM +0200, Stefan Priebe - Profihost AG wrote:
>>
>> On 05.09.2014 at 14:30, Brian Foster wrote:
>>> On Fri, Sep 05, 2014 at 11:47:29AM +0200, Stefan Priebe - Profihost AG wrote:
>>>> Hi,
>>>>
>>>> I have a backup system with 20TB of storage holding 350 million files.
>>>> This was working fine for months.
>>>>
>>>> But now the free space is so heavily fragmented that I only see the
>>>> kworker threads at 4x 100% CPU, and write speed is very slow. 15TB of the
>>>> 20TB are in use.
>>>>
>>>> In total there are 350 million files, spread across different
>>>> directories, with at most 5000 per dir.
>>>>
>>>> Kernel is 3.10.53 and mount options are:
>>>> noatime,nodiratime,attr2,inode64,logbufs=8,logbsize=256k,noquota
>>>>
>>>> # xfs_db -r -c freesp /dev/sda1
>>>>    from      to  extents    blocks    pct
>>>>       1       1 29484138  29484138   2,16
>>>>       2       3 16930134  39834672   2,92
>>>>       4       7 16169985  87877159   6,45
>>>>       8      15 78202543 999838327  73,41
>>>>      16      31  3562456  83746085   6,15
>>>>      32      63  2370812 102124143   7,50
>>>>      64     127   280885  18929867   1,39
>>>>     256     511        2       827   0,00
>>>>     512    1023       65     35092   0,00
>>>>    2048    4095        2      6561   0,00
>>>>   16384   32767        1     23951   0,00
>>>>
>>>> Is there anything I can optimize? Or is it just a bad idea to do this
>>>> with XFS? Any other options? Maybe rsync options like --inplace /
>>>> --no-whole-file?
>>>>
>>>
>>> It's probably a good idea to include more information about your fs:
>>>
>>> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>>
>> Generally sure but the problem itself is clear. If you look at the free
>> space allocation you see that free space is heavily fragmented.
>>
>> But here you go:
>> - 3.10.53 vanilla
>> - xfs_repair version 3.1.11
>> - 16 cores
>> - /dev/sda1 /backup xfs
>> rw,noatime,nodiratime,attr2,inode64,logbufs=8,logbsize=256k,noquota 0 0
>> - Raid 10 with 1GB controller cache running in write back mode using 24
>> spinners
>> - no lvm
>> - no io waits
>> - xfs_info /serverbackup/
>> meta-data=/dev/sda1              isize=256    agcount=21, agsize=268435455 blks
>>          =                       sectsz=512   attr=2
>> data     =                       bsize=4096   blocks=5369232896, imaxpct=5
>>          =                       sunit=0      swidth=0 blks
>> naming   =version 2              bsize=4096   ascii-ci=0
>> log      =internal               bsize=4096   blocks=521728, version=2
>>          =                       sectsz=512   sunit=0 blks, lazy-count=1
>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>
>> Anything missing?
>>
>
> What's the workload to the fs? Is it repeated rsyncs from a constantly
> changing dataset? Do the files change frequently or are they only ever
> added/removed?
Yes, it is repeated rsync runs against constantly changing files. About 10-20%
of all files change every week: a mixture of modifying, removing and adding.
> Also, what is the characterization of writes being "slow?" An rsync is
> slower than normal? Sustained writes to a single file? How significant a
> degradation?
kworker uses all available CPU while data is written to this XFS partition.
rsync can only write at a rate of 32-128kb/s.
> Something like the following might be interesting as well:
> for i in $(seq 0 20); do xfs_db -c "agi $i" -c "p freecount" <dev>; done
freecount = 3189417
freecount = 1975726
freecount = 1309903
freecount = 1726846
freecount = 1271047
freecount = 1281956
freecount = 1571285
freecount = 1365473
freecount = 1238118
freecount = 1697011
freecount = 1000832
freecount = 1369791
freecount = 1706360
freecount = 1439165
freecount = 1656404
freecount = 1881762
freecount = 1593432
freecount = 1555909
freecount = 1197091
freecount = 1667467
freecount = 63
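
To total the per-AG values above rather than reading them one by one, the loop output can be piped through awk. This is only a rough sketch: the function name sum_freecount is made up for illustration, and it assumes each line has exactly the "freecount = N" form shown above.

```shell
# Sum the "freecount = N" lines printed by xfs_db for each AG.
# Reads from the files given as arguments, or from stdin if none are given.
# Hypothetical helper, not part of xfsprogs.
sum_freecount() {
    awk -F' = ' '
        /^freecount/ { total += $2; n++ }
        END { printf "%d AGs, %d free inodes\n", n, total }
    ' "$@"
}

# Possible usage (read-only, device name as in the report above):
#   for i in $(seq 0 20); do
#       xfs_db -r -c "agi $i" -c "p freecount" /dev/sda1
#   done | sum_freecount
```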
Thanks!
Stefan
> Brian
>
>>> ... as well as what your typical workflow/dataset is for this fs. It
>>> seems like you have relatively small files (15TB used across 350m files
>>> is around 46k per file), yes?
>>
>> Yes - most of them are even smaller. And some files are > 5GB.
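
The "around 46k per file" estimate can be reproduced with integer shell arithmetic. A minimal sketch, assuming "15TB" means 15 TiB and the file count is exactly 350 million; the function name avg_file_size_kib is hypothetical.

```shell
# Average file size in whole KiB.
# $1 = used space in TiB (assumption: the "TB" in the thread is binary)
# $2 = number of files
avg_file_size_kib() {
    used_bytes=$(( $1 * 1024 * 1024 * 1024 * 1024 ))   # TiB -> bytes
    echo $(( used_bytes / $2 / 1024 ))                 # bytes per file -> KiB
}

avg_file_size_kib 15 350000000   # prints 46
```

If "15TB" is read as decimal terabytes instead, the average comes out a little lower, so the figure is an order-of-magnitude estimate either way.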
>>
>>> If so, I wonder if something like the
>>> following commit introduced in 3.12 would help:
>>>
>>> 133eeb17 xfs: don't use speculative prealloc for small files
>>
>> Looks interesting.
>>
>> Stefan
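
The freesp histogram quoted earlier in this message can be condensed into a single number: the share of free blocks that sit in extents shorter than some limit. The sketch below is one way to do that with awk; the function name small_extent_share and the default limit of 16 blocks are assumptions chosen for illustration. It works from the blocks column, so the comma decimals in the pct column do not matter.

```shell
# Report what share of free blocks sits in extents shorter than LIMIT blocks.
# Input on stdin: xfs_db "freesp" output (columns: from to extents blocks pct).
# $1 = LIMIT in filesystem blocks (default 16). Hypothetical helper.
small_extent_share() {
    limit=${1:-16}
    LC_ALL=C awk -v limit="$limit" '
        $1 ~ /^[0-9]+$/ {                 # data rows only, skips the header
            total += $4
            if ($2 < limit) small += $4   # whole bucket lies below the limit
        }
        END {
            if (total > 0)
                printf "%.1f%% of free blocks are in extents < %d blocks\n",
                       100 * small / total, limit
        }'
}

# Possible usage (read-only, device name as in the report above):
#   xfs_db -r -c freesp /dev/sda1 | small_extent_share 16
```

For the table in this thread, the pct column already gives the answer for a limit of 16 blocks: 2,16 + 2,92 + 6,45 + 73,41 adds up to roughly 85% of free space in extents of at most 15 blocks (60k with 4096-byte blocks).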