From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-ext4@vger.kernel.org, Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org
Subject: Re: [RFC 2/3] ext2: Convert ext2 regular file buffered I/O to use iomap
Date: Tue, 21 Nov 2023 11:26:15 +0530 [thread overview]
Message-ID: <874jhfy7i8.fsf@doe.com> (raw)
In-Reply-To: <ZVw1xxNYQuHimSmx@infradead.org>
Christoph Hellwig <hch@infradead.org> writes:
> On Tue, Nov 21, 2023 at 12:35:20AM +0530, Ritesh Harjani (IBM) wrote:
>> - mmap path of ext2 continues to use generic_file_vm_ops
>
> So this means there are not space reservations taken at write fault
> time.
Yes.
> While iomap was written with the assumption those exist, I can't
> instantly spot anything that relies on them - you are just a lot more
> likely to hit an -ENOSPC from ->map_blocks now.
Which is also true with existing code no? If the block reservation is
not done at the write fault, writeback is likely to fail due to ENOSPC?
> Maybe we should add
> an xfstests that covers the case where we use up more then the existing
> free space through writable mmaps to ensure it fully works (where works
> means at least as good as the old behavior)?
>
Sure, I can try write an fstests for the same.
Also I did find an old thread which tried implementing ->page_mkwrite in
ext2 [1]. However, it was rejected with following reason -
"""
Allocating
blocks on write fault has the unwelcome impact that the file layout is
likely going to be much worse (much more fragmented) - I remember getting
some reports about performance regressions from users back when I did a
similar change for ext3. And so I'm reluctant to change behavior of such
an old legacy filesystem as ext2...
"""
https://lore.kernel.org/linux-ext4/20210105175348.GE15080@quack2.suse.cz/
>> +static ssize_t ext2_buffered_write_iter(struct kiocb *iocb,
>> + struct iov_iter *from)
>> +{
>> + ssize_t ret = 0;
>> + struct inode *inode = file_inode(iocb->ki_filp);
>> +
>> + inode_lock(inode);
>> + ret = generic_write_checks(iocb, from);
>> + if (ret > 0)
>> + ret = iomap_file_buffered_write(iocb, from, &ext2_iomap_ops);
>> + inode_unlock(inode);
>> + if (ret > 0)
>> + ret = generic_write_sync(iocb, ret);
>> + return ret;
>> +}
>
> Not for now, but if we end up doing a bunch of conversation of trivial
> file systems, it might be worth to eventually add a wrapper for the
> above code with just the iomap_ops passed in. Only once we see a few
> duplicates, though.
>
Sure, make sense. Thanks!
I can try and check if the the wrapper helps.
>> +static int ext2_write_map_blocks(struct iomap_writepage_ctx *wpc,
>> + struct inode *inode, loff_t offset)
>> +{
>> + if (offset >= wpc->iomap.offset &&
>> + offset < wpc->iomap.offset + wpc->iomap.length)
>> + return 0;
>> +
>> + return ext2_iomap_begin(inode, offset, inode->i_sb->s_blocksize,
>> + IOMAP_WRITE, &wpc->iomap, NULL);
>> +}
>
> Looking at gfs2 this also might become a pattern. But I'm fine with
> waiting for now.
>
ok.
>> @@ -1372,7 +1428,7 @@ void ext2_set_file_ops(struct inode *inode)
>> if (IS_DAX(inode))
>> inode->i_mapping->a_ops = &ext2_dax_aops;
>> else
>> - inode->i_mapping->a_ops = &ext2_aops;
>> + inode->i_mapping->a_ops = &ext2_file_aops;
>> }
>
> Did yo run into issues in using the iomap based aops for the other uses
> of ext2_aops, or are just trying to address the users one at a time?
There are problems for e.g. for dir type in ext2. It uses the pagecache
for dir. It uses buffer_heads and attaches them to folio->private.
...it uses block_write_begin/block_write_end() calls.
Look for ext4_make_empty() -> ext4_prepare_chunk ->
block_write_begin().
Now during sync/writeback of the dirty pages (ext4_handle_dirsync()), we
might take a iomap writeback path (if using ext2_file_aops for dir)
which sees folio->private assuming it is "struct iomap_folio_state".
And bad things will happen...
Now we don't have an equivalent APIs in iomap for
block_write_begin()/end() which the users can call for. Hence, Jan
suggested to lets first convert ext2 regular file path to iomap as an RFC.
I shall now check for the dir type to see what changes we might require
in ext2/iomap code.
Thanks again for your review!
-ritesh
next prev parent reply other threads:[~2023-11-21 5:56 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <cover.1700506526.git.ritesh.list@gmail.com>
2023-11-20 19:05 ` [RFC 1/3] ext2: Fix ki_pos update for DIO buffered-io fallback case Ritesh Harjani (IBM)
2023-11-21 4:39 ` Christoph Hellwig
2023-11-21 5:36 ` Ritesh Harjani
2023-11-22 6:51 ` Christoph Hellwig
2023-11-20 19:05 ` [RFC 2/3] ext2: Convert ext2 regular file buffered I/O to use iomap Ritesh Harjani (IBM)
2023-11-20 20:00 ` Matthew Wilcox
2023-11-21 4:44 ` Christoph Hellwig
2023-11-21 5:56 ` Ritesh Harjani [this message]
2023-11-21 6:08 ` Christoph Hellwig
2023-11-21 6:15 ` Ritesh Harjani
2023-11-22 12:29 ` Jan Kara
2023-11-22 13:11 ` Christoph Hellwig
2023-11-22 20:26 ` Ritesh Harjani
2023-11-30 3:24 ` Ritesh Harjani
2023-11-30 4:22 ` Matthew Wilcox
2023-11-30 4:37 ` Ritesh Harjani
2023-11-30 4:30 ` Christoph Hellwig
2023-11-30 5:27 ` Ritesh Harjani
2023-11-30 8:22 ` Zhang Yi
2023-11-30 7:45 ` Ritesh Harjani
2023-11-30 10:18 ` Jan Kara
2023-11-30 10:59 ` Ritesh Harjani
2023-11-30 14:08 ` Jan Kara
2023-11-30 15:50 ` Ritesh Harjani
2023-11-30 16:01 ` Jan Kara
2023-11-30 16:03 ` Matthew Wilcox
2023-12-01 23:09 ` Dave Chinner
2023-12-05 15:22 ` Ritesh Harjani
2023-12-07 8:58 ` Jan Kara
2023-11-22 22:26 ` Dave Chinner
2023-11-23 4:09 ` Darrick J. Wong
2023-11-23 7:09 ` Christoph Hellwig
2023-11-29 5:37 ` Darrick J. Wong
2023-11-29 6:32 ` Christoph Hellwig
2023-11-29 9:19 ` Dave Chinner
2023-11-23 7:02 ` Christoph Hellwig
2023-11-22 20:25 ` Ritesh Harjani
2023-11-20 19:05 ` [RFC 3/3] ext2: Enable large folio support Ritesh Harjani (IBM)
2023-11-20 20:00 ` Matthew Wilcox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=874jhfy7i8.fsf@doe.com \
--to=ritesh.list@gmail.com \
--cc=hch@infradead.org \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox