From: Jens Axboe <axboe@fb.com>
To: Dave Chinner <david@fromorbit.com>
Cc: <linux-kernel@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
<linux-block@vger.kernel.org>
Subject: Re: [PATCHSET v3][RFC] Make background writeback not suck
Date: Thu, 31 Mar 2016 21:25:33 -0600
Message-ID: <56FDEA2D.2030207@fb.com>
In-Reply-To: <20160401004623.GT11812@dastard>

On 03/31/2016 06:46 PM, Dave Chinner wrote:
> On Thu, Mar 31, 2016 at 08:29:35AM -0600, Jens Axboe wrote:
>> On 03/31/2016 02:24 AM, Dave Chinner wrote:
>>> On Wed, Mar 30, 2016 at 09:07:48AM -0600, Jens Axboe wrote:
>>>> Hi,
>>>>
>>>> This patchset isn't so much a final solution as it is a demonstration
>>>> of what I believe is a huge issue. Since the dawn of time, our
>>>> background buffered writeback has sucked. When we do background
>>>> buffered writeback, it should have little impact on foreground
>>>> activity. That's the definition of background activity... But for as
>>>> long as I can remember, heavy buffered writers have not behaved like
>>>> that. For instance, if I do something like this:
>>>>
>>>> $ dd if=/dev/zero of=foo bs=1M count=10k
>>>>
>>>> on my laptop, and then try and start chrome, it basically won't start
>>>> before the buffered writeback is done. The same goes for server-oriented
>>>> workloads, where installation of a big RPM (or similar) adversely
>>>> impacts database reads or sync writes. When that happens, I get people
>>>> yelling at me.
>>>>
>>>> Last time I posted this, I used flash storage as the example. But
>>>> this works equally well on rotating storage. Let's run a test case
>>>> that writes a lot. This test writes 50 files, each 100M, on XFS on
>>>> a regular hard drive. While this happens, we attempt to read
>>>> another file with fio.
>>>>
>>>> Writers:
>>>>
>>>> $ time (./write-files ; sync)
>>>> real 1m6.304s
>>>> user 0m0.020s
>>>> sys 0m12.210s
>>>
>>> Great. So a basic IO test looks good - let's throw something more
>>> complex at it. Say, a benchmark I've been using for years to stress
>>> the IO subsystem, the filesystem and memory reclaim all at the same
>>> time: a concurrent fsmark inode creation test.
>>> (first google hit https://lkml.org/lkml/2013/9/10/46)
>>
>> Is that how you are invoking it as well, with the same arguments?
>
> Yes. And the VM is exactly the same, too - 16p/16GB RAM. Cut down
> version of the script I use:
>
> #!/bin/bash
>
> QUOTA=
> MKFSOPTS=
> NFILES=100000
> DEV=/dev/vdc
> LOGBSIZE=256k
> FSMARK=/home/dave/src/fs_mark-3.3/fs_mark
> MNT=/mnt/scratch
>
> while [ $# -gt 0 ]; do
>     case "$1" in
>     -q) QUOTA="uquota,gquota,pquota" ;;
>     -N) NFILES=$2 ; shift ;;
>     -d) DEV=$2 ; shift ;;
>     -l) LOGBSIZE=$2; shift ;;
>     --) shift ; break ;;
>     esac
>     shift
> done
> MKFSOPTS="$MKFSOPTS $*"
>
> echo QUOTA=$QUOTA
> echo MKFSOPTS=$MKFSOPTS
> echo DEV=$DEV
>
> sudo umount $MNT > /dev/null 2>&1
> sudo mkfs.xfs -f $MKFSOPTS $DEV
> sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV $MNT
> sudo chmod 777 $MNT
> sudo sh -c "echo 1 > /proc/sys/fs/xfs/stats_clear"
> time $FSMARK -D 10000 -S0 -n $NFILES -s 0 -L 32 \
>     -d $MNT/0 -d $MNT/1 \
>     -d $MNT/2 -d $MNT/3 \
>     -d $MNT/4 -d $MNT/5 \
>     -d $MNT/6 -d $MNT/7 \
>     -d $MNT/8 -d $MNT/9 \
>     -d $MNT/10 -d $MNT/11 \
>     -d $MNT/12 -d $MNT/13 \
>     -d $MNT/14 -d $MNT/15 \
>     | tee >(stats --trim-outliers | tail -1 1>&2)
> sync
> sudo umount $MNT

Perfect, thanks!

>>>> The above was run without scsi-mq and with the deadline scheduler;
>>>> results with CFQ are similarly depressing for this test. So IO scheduling
>>>> is in place for this test; it's not pure blk-mq without scheduling.
>>>
>>> virtio in guest, XFS direct IO -> no-op -> scsi in host.
>>
>> That has write back caching enabled on the guest, correct?
>
> No. It uses virtio,cache=none (that's the "XFS Direct IO" bit above).
> Sorry for not being clear about that.

That's fine, it's one less worry if that's not the case. So if you cat
the 'write_cache' file in the virtio-blk device's sysfs queue/ directory,
does it say 'write through'? Just want to confirm that we got that
propagated correctly.
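
For reference, a minimal sketch of that check. It assumes the guest scratch
disk is vdc (taken from DEV=/dev/vdc in the script above) and that the
queue/write_cache attribute from this series is present, reporting either
'write back' or 'write through'. The function name and the SYSFS_ROOT
override are hypothetical, there only so the helper can be tried against a
fake tree:

```shell
#!/bin/bash
# Sketch only: print the write cache mode the block layer reports for a disk.
# write_cache_mode and SYSFS_ROOT are illustrative names, not part of the
# patchset; SYSFS_ROOT defaults to the real /sys.
write_cache_mode() {
    local dev="$1"
    local attr="${SYSFS_ROOT:-/sys}/block/$dev/queue/write_cache"

    if [ -r "$attr" ]; then
        # Expected contents: "write back" or "write through"
        cat "$attr"
    else
        # Attribute missing: kernel without the series, or wrong device name
        echo "unknown"
        return 1
    fi
}

# Usage in the guest (device name is an assumption):
#   write_cache_mode vdc
```

If that prints 'write back' for a cache=none disk, the cache state is not
being propagated the way I intended.
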
--
Jens Axboe