From: Ric Wheeler <rwheeler@redhat.com>
To: Chris Mason <clm@fb.com>,
sandeen@redhat.com,
Linus Torvalds <torvalds@linux-foundation.org>,
Dave Chinner <david@fromorbit.com>,
"Theodore Ts'o" <tytso@mit.edu>,
Ric Wheeler <rwheeler@redhat.com>,
Andy Lutomirski <luto@amacapital.net>,
One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>,
Gregory Farnum <greg@gregs42.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Christoph Hellwig <hch@infradead.org>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
Jens Axboe <axboe@kernel.dk>,
Andrew Morton <akpm@linux-foundation.org>,
Linux API <linux-api@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
shane.seymour@hpe.com, Bruce Fields <bfields@fieldses.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Jeff Layton <jlayton@poochiereds.net>
Subject: Re: [PATCH 2/2] block: create ioctl to discard-or-zeroout a range of blocks
Date: Thu, 17 Mar 2016 09:49:15 -0400
Message-ID: <56EAB5DB.8080106@redhat.com>
In-Reply-To: <20160316222343.GA53649@clm-mbp.thefacebook.com>
On 03/16/2016 06:23 PM, Chris Mason wrote:
> On Tue, Mar 15, 2016 at 05:51:17PM -0700, Chris Mason wrote:
>> On Tue, Mar 15, 2016 at 07:30:14PM -0500, Eric Sandeen wrote:
>>> On 3/15/16 7:06 PM, Linus Torvalds wrote:
>>>> On Tue, Mar 15, 2016 at 4:52 PM, Dave Chinner <david@fromorbit.com> wrote:
>>>>>> It is pretty clear that the onus is on the patch submitter to
>>>>>> provide justification for inclusion, not for the reviewer/Maintainer
>>>>>> to have to prove that the solution is unworkable.
>>>> I agree, but quite frankly, performance is a good justification.
>>>>
>>>> So if Ted can give performance numbers, that's justification enough.
>>>> We've certainly taken changes with less.
>>> I've been away from ext4 for a while, so I'm really not on top of the
>>> mechanics of the underlying problem at the moment.
>>>
>>> But I would say that in addition to numbers showing that ext4 has trouble
>>> with unwritten extent conversion, we should have an explanation of
>>> why it can't be solved in a way that doesn't open up these concerns.
>>>
>>> XFS certainly has different mechanisms, but is the demonstrated workload
>>> problematic on XFS (or btrfs) as well? If not, can ext4 adopt any of the
>>> solutions that make the workload perform better on other filesystems?
>> When I've benchmarked this in the past, doing small random buffered writes
>> into a preallocated extent was dramatically (3x or more) slower on xfs
>> than doing them into a fully written extent. That was two years ago,
>> but I can redo it.
> So I re-ran some benchmarks, with 4K O_DIRECT random ios on nvme (4.5
> kernel). This is O_DIRECT without O_SYNC. I don't think xfs will do
> commits for each IO into the prealloc file? O_SYNC makes it much
> slower, so hopefully I've got this right.
>
> The test runs for 60 seconds, and I used an iodepth of 4:
>
> prealloc file: 32,000 iops
> overwrite: 121,000 iops
>
> If I bump the iodepth up to 512:
>
> prealloc file: 33,000 iops
> overwrite: 279,000 iops
>
> For streaming writes, XFS converts prealloc to written much better when
> the IO isn't random. You can start seeing the difference at 16K
> sequential O_DIRECT writes, but really it's not a huge impact. The worst
> case is 4K:
>
> prealloc file: 227MB/s
> overwrite: 340MB/s
>
> I can't think of sequential workloads where this will matter, since they
> will either end up with bigger IO or the performance impact won't get
> noticed.
>
> -chris
I think that these numbers are the interesting ones; seeing a 4x slowdown is
certainly significant.
If you rerun the same test after hacking XFS preallocation as Dave suggested with
xfs_db, do we get most of the performance back?
Ric
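
For reference, the slowdown factors implied by the quoted numbers can be
worked out directly. This is only a minimal sketch: the figures are copied
from Chris's benchmark quoted above, while the labels and the helper name
slowdown() are illustrative and do not come from the thread.

```python
# Figures copied from the benchmark quoted above: (prealloc, overwrite).
# The labels and the helper name are illustrative, not from the thread.
RESULTS = {
    "random 4K, iodepth 4 (iops)": (32_000, 121_000),
    "random 4K, iodepth 512 (iops)": (33_000, 279_000),
    "sequential 4K (MB/s)": (227, 340),
}


def slowdown(prealloc, overwrite):
    """Return how many times slower the prealloc case is than overwrite."""
    return overwrite / prealloc


if __name__ == "__main__":
    for label, (prealloc, overwrite) in RESULTS.items():
        print(f"{label}: {slowdown(prealloc, overwrite):.1f}x")
```

The iodepth 4 case comes out at roughly 3.8x, which is the "4x" referred to
above. At iodepth 512 the gap widens to about 8.5x, while the sequential 4K
case is only about 1.5x.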