From: Brian Foster <bfoster@redhat.com>
To: Stefan Priebe <s.priebe@profihost.ag>
Cc: "xfs@oss.sgi.com" <xfs@oss.sgi.com>
Subject: Re: Is XFS suitable for 350 million files on 20TB storage?
Date: Fri, 5 Sep 2014 17:24:01 -0400
Message-ID: <20140905212400.GA8904@laptop.bfoster>
In-Reply-To: <540A19BB.8040404@profihost.ag>
On Fri, Sep 05, 2014 at 10:14:51PM +0200, Stefan Priebe wrote:
>
> On 05.09.2014 21:18, Brian Foster wrote:
> ...
>
> >On Fri, Sep 05, 2014 at 08:07:38PM +0200, Stefan Priebe wrote:
> >Interesting, that seems like a lot of free inodes. That's 1-2 million in
> >each AG that we have to look around for each time we want to allocate an
> >inode. I can't say for sure that's the source of the slowdown, but this
> >certainly looks like the kind of workload that inspired the addition of
> >the free inode btree (finobt) to more recent kernels.
> >
> >It appears that you still have quite a bit of space available in
> >general. Could you run some local tests on this filesystem to try and
> >quantify how much of this degradation manifests on sustained writes vs.
> >file creation? For example, how is throughput when writing a few GB to a
> >local test file?
>
> Not sure if this is what you expect:
>
> # dd if=/dev/zero of=bigfile oflag=direct,sync bs=4M count=1000
> 1000+0 records in
> 1000+0 records out
> 4194304000 bytes (4,2 GB) copied, 125,809 s, 33,3 MB/s
>
> or without sync
> # dd if=/dev/zero of=bigfile oflag=direct bs=4M count=1000
> 1000+0 records in
> 1000+0 records out
> 4194304000 bytes (4,2 GB) copied, 32,5474 s, 129 MB/s
>
> > How about with that same amount of data broken up
> >across a few thousand files?
>
> This results in heavy kworker usage.
>
> 4GB in 32kb files
> # time (mkdir test; for i in $(seq 1 1 131072); do dd if=/dev/zero
> of=test/$i bs=32k count=1 oflag=direct,sync 2>/dev/null; done)
>
> ...
>
> 55 min
>
Both seem pretty slow in general. Any way you can establish a baseline
for these tests on this storage? If not, the only other suggestion I
could make is to allocate inodes until all of those freecount numbers
are accounted for and see if anything changes. That could certainly take
some time and it's not clear it will actually help.
> >Brian
> >
> >P.S., Alternatively if you wanted to grab a metadump of this filesystem
> >and compress/upload it somewhere, I'd be interested to take a look at
> >it.
>
> I think there might be file and directory names in it. If that's the case I
> can't do it.
>
It should enable obfuscation by default, but I would suggest restoring it
yourself first and verifying it meets your expectations.
Brian
> Stefan
>
>
> >
> >>Thanks!
> >>
> >>Stefan
> >>
> >>
> >>
> >>>Brian
> >>>
> >>>>>... as well as what your typical workflow/dataset is for this fs. It
> >>>>>seems like you have relatively small files (15TB used across 350m files
> >>>>>is around 46k per file), yes?
> >>>>
> >>>>Yes - most of them are even smaller. And some files are > 5GB.
> >>>>
> >>>>>If so, I wonder if something like the
> >>>>>following commit introduced in 3.12 would help:
> >>>>>
> >>>>>133eeb17 xfs: don't use speculative prealloc for small files
> >>>>
> >>>>Looks interesting.
> >>>>
> >>>>Stefan
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
Thread overview: 18+ messages
2014-09-05 9:47 Is XFS suitable for 350 million files on 20TB storage? Stefan Priebe - Profihost AG
2014-09-05 12:30 ` Brian Foster
2014-09-05 12:40 ` Stefan Priebe - Profihost AG
2014-09-05 13:48 ` Brian Foster
2014-09-05 18:07 ` Stefan Priebe
2014-09-05 19:18 ` Brian Foster
2014-09-05 20:14 ` Stefan Priebe
2014-09-05 21:24 ` Brian Foster [this message]
2014-09-05 22:39 ` Sean Caron
2014-09-05 23:05 ` Dave Chinner
2014-09-06 7:35 ` Stefan Priebe
2014-09-06 15:04 ` Brian Foster
2014-09-06 22:56 ` Dave Chinner
2014-09-08 8:35 ` Stefan Priebe - Profihost AG
2014-09-08 9:46 ` Dave Chinner
2014-09-08 9:49 ` Stefan Priebe - Profihost AG
2014-09-06 14:51 ` Brian Foster
2014-09-06 22:54 ` Dave Chinner