From: Martin Steigerwald <martin@lichtvoll.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>,
	tux3@tux3.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org,
	Mike Galbraith <umgwanakikbuti@gmail.com>,
	Daniel Phillips <daniel@phunq.net>,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Subject: Re: xfs: does mkfs.xfs require fancy switches to get decent performance? (was Tux3 Report: How fast can we fsync?)
Date: Thu, 30 Apr 2015 11:00:05 +0200
Message-ID: <4154074.ZWLyZCMjhl@merkaba>
In-Reply-To: <20150430002008.GY15810@dastard>

On Thursday, 30 April 2015, 10:20:08, Dave Chinner wrote:
> On Wed, Apr 29, 2015 at 09:05:26PM +0200, Mike Galbraith wrote:
> > Here's something that _might_ interest xfs folks.
> > 
> > cd git (source repository of git itself)
> > make clean
> > echo 3 > /proc/sys/vm/drop_caches
> > time make -j8 test
> > 
> > ext4    2m20.721s
> > xfs     6m41.887s <-- ick
> > btrfs   1m32.038s
> > tux3    1m30.262s
> > 
> > Testing by Aunt Tilly: mkfs, no fancy switches, mount the thing, test.
> 
> TL;DR: Results are *very different* on a 256GB Samsung 840 EVO SSD
> with slightly slower CPUs (E5-4620 @ 2.20GHz), all filesystems
> using defaults:
> 
> 	real		user		sys
> xfs	3m16.138s	7m8.341s	14m32.462s
> ext4	3m18.045s	7m7.840s	14m32.994s
> btrfs	3m45.149s	7m10.184s	16m30.498s
> 
> What you are seeing is physical seek distances impacting read
> performance.  XFS does not optimise for minimal physical seek
> distance, and hence is slower than filesystems that do optimise for
> minimal seek distance. This shows up especially well on slow single
> spindles.
> 
> XFS is *adequate* for the use on slow single drives, but it is
> really designed for best performance on storage hardware that is not
> seek distance sensitive.
> 
> IOW, XFS just hates your disk. Spend $50 and buy a cheap SSD and
> the problem goes away. :)
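
A toy model may make the seek-distance argument above concrete. The
sketch below is an illustration under loud assumptions, not how any of
the filesystems in this thread actually allocate: it treats the disk as
a line of block offsets, places equally sized files either packed at the
start or rotated across a few regions, and sums the head travel needed
to read them back in creation order. All function names and sizes are
made up for illustration.

```python
def total_head_travel(offsets):
    """Sum of absolute head movements when visiting offsets in order."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))


def packed_layout(n_files, file_size):
    """Hypothetical layout: files placed back to back from offset 0."""
    return [i * file_size for i in range(n_files)]


def spread_layout(n_files, file_size, disk_size, n_regions):
    """Hypothetical layout: file i goes to region i % n_regions,
    at the next free slot inside that region."""
    region_size = disk_size // n_regions
    return [
        (i % n_regions) * region_size + (i // n_regions) * file_size
        for i in range(n_files)
    ]


if __name__ == "__main__":
    packed = packed_layout(8, 10)
    spread = spread_layout(8, 10, 1000, 4)
    print("packed travel:", total_head_travel(packed))
    print("spread travel:", total_head_travel(spread))
```

On a device where seeks cost time proportional to distance, the spread
layout pays far more head travel on an empty filesystem. On an SSD the
travel term is effectively free, which fits the quoted SSD timings being
so close together.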


I am quite surprised that a traditional filesystem that was created in the
age of rotating media does not like this kind of media, and even seems to
outperform BTRFS on the new non-rotating media now available.

But…

> ----
> 
> And now in more detail.
> 
> It's easy to be fast on empty filesystems. XFS does not aim to be
> fast in such situations - it aims to have consistent performance
> across the life of the filesystem.

… this is quite an important addition.

> Thing is, once you've abused those filesystems for a couple of
> months, the files in ext4, btrfs and tux3 are not going to be laid
> out perfectly on the outer edge of the disk. They'll be spread all
> over the place and so all the filesystems will be seeing large seeks
> on read. The thing is, XFS will have roughly the same performance as
> when the filesystem is empty because the spreading of the allocation
> allows it to maintain better locality and separation and hence
> doesn't fragment free space nearly as badly as the other filesystems.
> Free space fragmentation is what leads to performance degradation in
> filesystems, and all the other filesystems will have degraded to be
> *much worse* than XFS.

I still see hangs caused by what I take to be free space fragmentation in
BTRFS. My /home on a dual (!) SSD BTRFS setup can basically grind to a
halt once it has reserved all space on the devices for chunks. So this

merkaba:~> btrfs fi sh /home
Label: 'home'  uuid: […]
        Total devices 2 FS bytes used 129.48GiB
        devid    1 size 170.00GiB used 146.03GiB path /dev/mapper/msata-home
        devid    2 size 170.00GiB used 146.03GiB path /dev/mapper/sata-home

Btrfs v3.18
merkaba:~> btrfs fi df /home
Data, RAID1: total=142.00GiB, used=126.72GiB
System, RAID1: total=32.00MiB, used=48.00KiB
Metadata, RAID1: total=4.00GiB, used=2.76GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

is safe, but once I have size 170 GiB, used 170 GiB, writes can stall
even if there is enough free space inside the chunks to allocate from
(enough as in 30-40 GiB), up to the point that applications on the
desktop freeze and I see hung task messages in the kernel log.

This is the case up to kernel 4.0. I have seen Chris Mason fixing some
write stalls for big Facebook setups; maybe that will help here. But until
this issue is fixed, I think BTRFS is not yet fully production ready,
unless you leave a *huge* amount of free space, as in: for 200 GiB of data
you want to write, make a 400 GiB volume.
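
To spell out the arithmetic behind the two btrfs outputs above, here is
a small sketch. The figures are copied from the `btrfs fi sh` and
`btrfs fi df` output; the helper names are made up for illustration, and
the script only does subtraction, it does not query a real filesystem.
GlobalReserve is left out of the sum, on the assumption that it is not a
separate chunk allocation.

```python
# All values in GiB, copied from the btrfs output quoted above.
DEVICE_SIZE = 170.00
DEVICE_USED = 146.03          # space reserved for chunks on each device

DATA_TOTAL, DATA_USED = 142.00, 126.72
METADATA_TOTAL, METADATA_USED = 4.00, 2.76
SYSTEM_TOTAL = 32.0 / 1024.0  # 32.00MiB expressed in GiB


def unallocated_gib(size, used):
    """Device space not yet reserved for any chunk."""
    return size - used


def slack_gib(total, used):
    """Free space inside already reserved chunks."""
    return total - used


def chunks_per_device_gib():
    """With RAID1 on two devices, each device holds one copy of every chunk."""
    return DATA_TOTAL + METADATA_TOTAL + SYSTEM_TOTAL


if __name__ == "__main__":
    print("unallocated per device:", round(unallocated_gib(DEVICE_SIZE, DEVICE_USED), 2))
    print("free inside data chunks:", round(slack_gib(DATA_TOTAL, DATA_USED), 2))
    print("free inside metadata chunks:", round(slack_gib(METADATA_TOTAL, METADATA_USED), 2))
    print("chunk totals per device:", round(chunks_per_device_gib(), 2))
```

The stalls described above show up when the first number reaches zero,
that is when "used" equals "size" in `btrfs fi sh`, even though the
second number can still be tens of GiB.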

> Put simply: empty filesystem benchmarking does not show the real
> performance of the filesystem under sustained production workloads.
> Hence benchmarks like this - while interesting from a theoretical
> point of view and widely used for bragging about who's got the
> fastest - are mostly irrelevant to determining how the filesystem
> will perform in production environments.
> 
> We can also look at this algorithm in a different way: take a large
> filesystem (say a few hundred TB) across a few tens of disks in a
> linear concat.  ext4, btrfs and tux3 will only hit the first disk in
> the concat, and so go no faster because they are still bound by
> physical seek times.  XFS, however, will spread the load across many
> (if not all) of the disks, and so effectively reduce the average
> seek time by the number of disks doing concurrent IO. Then you'll
> see that application level IO concurrency becomes the performance
> limitation, not the physical seek time of the hardware.

Those are the allocation groups. I always wondered how it could be
beneficial to spread allocations across 4 areas of one partition on media
with expensive seeks. Now that makes more sense to me. I always had the
gut feeling that XFS may not be the fastest in all cases, but is one of
the filesystems with the most consistent performance over time; I was just
never able to fully explain why.
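
The linear concat argument can be illustrated with a toy model. This is
only a sketch under stated assumptions: it is not the real XFS
allocator, the round-robin choice of allocation group is a
simplification, and every name and size below is hypothetical. It just
counts which disks of a concat receive allocations when space is handed
out from the start of the address space versus rotated across equally
sized allocation groups.

```python
def first_fit_offsets(n_allocs, alloc_size):
    """Simplified 'pack from the start' policy: allocations back to back."""
    return [i * alloc_size for i in range(n_allocs)]


def ag_rotor_offsets(n_allocs, alloc_size, fs_size, n_ags):
    """Simplified allocation group policy: allocation i goes to
    group i % n_ags, at the next free offset inside that group."""
    ag_size = fs_size // n_ags
    next_free = [0] * n_ags
    offsets = []
    for i in range(n_allocs):
        ag = i % n_ags
        offsets.append(ag * ag_size + next_free[ag])
        next_free[ag] += alloc_size
    return offsets


def disks_touched(offsets, disk_size):
    """Disks of a linear concat that hold at least one allocation."""
    return sorted({offset // disk_size for offset in offsets})


if __name__ == "__main__":
    disk_size, n_disks = 1000, 10
    fs_size = disk_size * n_disks
    packed = first_fit_offsets(40, 10)
    rotated = ag_rotor_offsets(40, 10, fs_size, n_ags=20)
    print("first fit touches disks:", disks_touched(packed, disk_size))
    print("AG rotor touches disks:", disks_touched(rotated, disk_size))
```

With these made-up numbers the packing policy keeps every allocation on
the first disk, while the rotating policy lands on all of them, so
concurrent IO can be served by several spindles at once.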

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
