From: Dave Chinner <david@fromorbit.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Jeremy Bongio <bongiojp@gmail.com>,
"Darrick J . Wong" <djwong@kernel.org>,
Allison Henderson <allison.henderson@oracle.com>,
linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/1] iomap regression for aio dio 4k writes
Date: Fri, 23 Jun 2023 13:02:26 +1000
Message-ID: <ZJULQjTpcRdEUHY8@dread.disaster.area>
In-Reply-To: <20230623023233.GC34229@mit.edu>

On Thu, Jun 22, 2023 at 10:32:33PM -0400, Theodore Ts'o wrote:
> On Thu, Jun 22, 2023 at 09:59:29AM +1000, Dave Chinner wrote:
> > Ah, you are testing pure overwrites, which means for ext4 the only
> > thing it needs to care about is cached mappings. What happens when
> > you add O_DSYNC here?
>
> I think you mean O_SYNC, right?

No, I *explicitly* meant O_DSYNC.

> In a pure overwrite case, where all
> of the extents are initialized and where the Oracle or DB2 server is
> doing writes to preallocated, pre-initialized space in the tablespace
> file followed by fdatasync(), there *are* no post-I/O data integrity
> operations which are required.

Wrong: O_DSYNC DIO write IO requires the data to be on stable
storage at IO completion. This means the pure overwrite IO must be
either issued as a REQ_FUA write or as a normal write followed by a
device cache flush.

That device cache flush is a post-I/O data integrity operation and
that is handled by iomap_dio_complete() -> generic_write_sync() ->
vfs_fsync_range()....

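
For readers following along from userspace, here is a minimal sketch of the kind of IO under discussion: a 4k overwrite of an already-allocated block through a file descriptor opened with O_DSYNC, optionally combined with O_DIRECT. The helper name overwrite_block_dsync() and its parameters are hypothetical and exist only for illustration. The sketch assumes Linux with glibc, a file that already exists and is at least 4k long, and a 4096-byte alignment that suits the underlying device.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define BLK 4096                /* assumed block size and alignment */

/*
 * Hypothetical helper: overwrite one 4k block at offset 'off' with the
 * byte 'fill'. The file is opened O_DSYNC, so pwrite() should not
 * return success until the data is on stable storage.
 * Returns 0 on success, -1 on any error.
 */
int overwrite_block_dsync(const char *path, off_t off, int fill, int use_direct)
{
    int flags = O_WRONLY | O_DSYNC;     /* no O_CREAT: overwrite only */
#ifdef O_DIRECT
    if (use_direct)
        flags |= O_DIRECT;              /* bypass the page cache */
#endif
    int fd = open(path, flags);
    if (fd < 0)
        return -1;

    void *buf;
    /* O_DIRECT needs an aligned buffer; alignment is harmless otherwise. */
    if (posix_memalign(&buf, BLK, BLK) != 0) {
        close(fd);
        return -1;
    }
    memset(buf, fill, BLK);

    ssize_t n = pwrite(fd, buf, BLK, off);

    free(buf);
    close(fd);
    return n == BLK ? 0 : -1;
}
```

With use_direct set, this is a synchronous stand-in for the aio dio 4k write case in the subject line. Whether the kernel satisfies the O_DSYNC requirement with a REQ_FUA write or with a write plus cache flush is not visible from this code.
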
> If the file is opened O_SYNC or if the blocks were not
> preallocated using fallocate(2) and not initialized ahead of time,
> then sure, we can't use this optimization.

Well, yes. That's the whole point of the IOMAP_F_DIRTY flag - if
that is set, we don't attempt any pure overwrite optimisations
because it's not a pure overwrite and metadata needs flushing to the
journal. Hence we need to call generic_write_sync().

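
The decision described above can be sketched as a small standalone model. This is not kernel code: the enum, the function name dio_integrity_mode() and its boolean parameters are hypothetical stand-ins for the O_DSYNC state of the write, the IOMAP_F_DIRTY flag on the mapping, and whether the device supports FUA. It is a simplification that assumes those three inputs are the only ones that matter.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the choice described above. */
enum dio_integrity {
    DIO_NO_SYNC,     /* not an O_DSYNC write: no integrity work at completion */
    DIO_FUA_WRITE,   /* pure overwrite issued as a FUA write */
    DIO_WRITE_SYNC   /* normal write, then generic_write_sync() at completion */
};

enum dio_integrity dio_integrity_mode(bool o_dsync, bool iomap_f_dirty,
                                      bool device_has_fua)
{
    if (!o_dsync)
        return DIO_NO_SYNC;

    /*
     * Pure overwrite (mapping not dirty) on a FUA-capable device:
     * the write itself lands on stable storage.
     */
    if (!iomap_f_dirty && device_has_fua)
        return DIO_FUA_WRITE;

    /*
     * Either metadata needs flushing to the journal or the device
     * needs a cache flush: both go through the post-I/O sync path.
     */
    return DIO_WRITE_SYNC;
}
```

In this model only DIO_WRITE_SYNC leaves post-I/O integrity work to do at completion, which is the case the O_DSYNC question earlier in the thread is probing.
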
> What we might do is let the file system tell the iomap layer
> via a flag that no post-I/O metadata operations are
> required, and then *if* that flag is set, and *if* the
> inode has no pages in the page cache (so there are no invalidate
> operations necessary), it should be safe to skip using
> queue_work(). That way, the file system has to affirmatively
> state that it is safe to skip the workqueue, so it shouldn't do
> any harm to other file systems using the iomap DIO layer.
>
> What am I missing?

You didn't read my followup email. IOMAP_F_DIRTY is the flag
you describe, and it already exists.

-Dave.
--
Dave Chinner
david@fromorbit.com