From: Jens Axboe <axboe@fb.com>
To: Dave Chinner <david@fromorbit.com>
Cc: <linux-kernel@vger.kernel.org>, <linux-fsdevel@vger.kernel.org>,
<linux-block@vger.kernel.org>
Subject: Re: [PATCH 2/6] writeback: wb_start_writeback() should use WB_SYNC_ALL for WB_REASON_SYNC
Date: Tue, 22 Mar 2016 16:07:18 -0600
Message-ID: <56F1C216.5050103@fb.com>
In-Reply-To: <20160322220431.GT11812@dastard>

On 03/22/2016 04:04 PM, Dave Chinner wrote:
> On Tue, Mar 22, 2016 at 03:40:28PM -0600, Jens Axboe wrote:
>> On 03/22/2016 03:34 PM, Dave Chinner wrote:
>>> On Tue, Mar 22, 2016 at 11:55:16AM -0600, Jens Axboe wrote:
>>>> If you call sync, the initial call to wakeup_flusher_threads() ends up
>>>> calling wb_start_writeback() with reason=WB_REASON_SYNC, but
>>>> wb_start_writeback() always uses WB_SYNC_NONE as the writeback mode.
>>>> Ensure that we use WB_SYNC_ALL for a sync operation.
>>>
>>> This seems wrong to me. We want background write to happen as
>>> quickly as possible and /not block/ when we first kick sync.
>>
>> It's not going to block. wakeup_flusher_threads() async queues
>> writeback work through wb_start_writeback().
>
> The flusher threads block, not the initial wakeup. e.g. they will
> now block waiting for data writeback to complete before writing the
> inode. i.e. this code in __writeback_single_inode() is now triggered
> by the background flusher:
>
> 	/*
> 	 * Make sure to wait on the data before writing out the metadata.
> 	 * This is important for filesystems that modify metadata on data
> 	 * I/O completion. We don't do it for sync(2) writeback because it has a
> 	 * separate, external IO completion path and ->sync_fs for guaranteeing
> 	 * inode metadata is written back correctly.
> 	 */
> 	if (wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync) {
> 		int err = filemap_fdatawait(mapping);
> 		if (ret == 0)
> 			ret = err;
> 	}

Yeah, that's not ideal, for this case we'd really like something like
WB_SYNC_ALL_NOWAIT...

> It also changes the writeback chunk size in write_cache_pages(), so
> instead of doing a bit of writeback from all dirty inodes, it tries
> to write everything from each inode in turn (see
> writeback_chunk_size()) which will further exacerbate the wait
> above.

But that part would be fine if it weren't for the waiting.

>>> The latter blocking passes of sync use WB_SYNC_ALL to ensure that we
>>> block waiting for all remaining IO to be issued and waited on, but
>>> the background writeback doesn't need to do this.
>>
>> That's fine, they can get to wait on the previously issued IO, which
>> was properly submitted with WB_SYNC_ALL.
>>
>> Maybe I'm missing your point?
>
> Making the background flusher block and wait for data makes it
> completely ineffective in speeding up sync() processing...

Agree, we should not wait on the pages individually; we want them
submitted and then waited on. And I suppose it's no different from
handling the normal buffered write from an application, which then gets
waited on with fsync() or similar. So I can drop this patch, it'll work
fine without it.

--
Jens Axboe
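
The objection in this message comes down to the single condition quoted
from __writeback_single_inode(). As a minimal userspace sketch (not kernel
code), the model below reproduces only that condition so the cases
discussed in the thread can be checked by hand. The names wbc_model and
waits_on_data() are hypothetical stand-ins, and the two-value enum is a
simplified copy of the kernel's sync modes. It also assumes that the work
queued by wb_start_writeback() leaves for_sync unset, as Dave's objection
implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's writeback sync modes. */
enum sync_mode_model { WB_SYNC_NONE, WB_SYNC_ALL };

/* Hypothetical model of the two writeback_control fields the check reads. */
struct wbc_model {
	enum sync_mode_model sync_mode;
	bool for_sync;
};

/*
 * Mirrors the condition quoted above: the flusher waits on data
 * (filemap_fdatawait() in the real code) only for WB_SYNC_ALL
 * writeback that is not marked as sync(2) writeback.
 */
bool waits_on_data(const struct wbc_model *wbc)
{
	assert(wbc != NULL);
	return wbc->sync_mode == WB_SYNC_ALL && !wbc->for_sync;
}
```

Under these assumptions, the patched wakeup path corresponds to
{ WB_SYNC_ALL, false }, for which waits_on_data() returns true. The
unpatched path { WB_SYNC_NONE, false } and a sync(2)-style pass
{ WB_SYNC_ALL, true } both return false, which is why only the patched
case makes the flusher wait per inode.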