From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-ext4@vger.kernel.org, Jan Kara <jack@suse.cz>,
linux-fsdevel@vger.kernel.org
Subject: Re: [RFC 2/3] ext2: Convert ext2 regular file buffered I/O to use iomap
Date: Tue, 21 Nov 2023 11:26:15 +0530 [thread overview]
Message-ID: <874jhfy7i8.fsf@doe.com> (raw)
In-Reply-To: <ZVw1xxNYQuHimSmx@infradead.org>
Christoph Hellwig <hch@infradead.org> writes:
> On Tue, Nov 21, 2023 at 12:35:20AM +0530, Ritesh Harjani (IBM) wrote:
>> - mmap path of ext2 continues to use generic_file_vm_ops
>
> So this means there are not space reservations taken at write fault
> time.
Yes.
> While iomap was written with the assumption those exist, I can't
> instantly spot anything that relies on them - you are just a lot more
> likely to hit an -ENOSPC from ->map_blocks now.
Which is also true with existing code no? If the block reservation is
not done at the write fault, writeback is likely to fail due to ENOSPC?
> Maybe we should add
> an xfstests that covers the case where we use up more then the existing
> free space through writable mmaps to ensure it fully works (where works
> means at least as good as the old behavior)?
>
Sure, I can try write an fstests for the same.
Also I did find an old thread which tried implementing ->page_mkwrite in
ext2 [1]. However, it was rejected with following reason -
"""
Allocating
blocks on write fault has the unwelcome impact that the file layout is
likely going to be much worse (much more fragmented) - I remember getting
some reports about performance regressions from users back when I did a
similar change for ext3. And so I'm reluctant to change behavior of such
an old legacy filesystem as ext2...
"""
https://lore.kernel.org/linux-ext4/20210105175348.GE15080@quack2.suse.cz/
>> +static ssize_t ext2_buffered_write_iter(struct kiocb *iocb,
>> + struct iov_iter *from)
>> +{
>> + ssize_t ret = 0;
>> + struct inode *inode = file_inode(iocb->ki_filp);
>> +
>> + inode_lock(inode);
>> + ret = generic_write_checks(iocb, from);
>> + if (ret > 0)
>> + ret = iomap_file_buffered_write(iocb, from, &ext2_iomap_ops);
>> + inode_unlock(inode);
>> + if (ret > 0)
>> + ret = generic_write_sync(iocb, ret);
>> + return ret;
>> +}
>
> Not for now, but if we end up doing a bunch of conversation of trivial
> file systems, it might be worth to eventually add a wrapper for the
> above code with just the iomap_ops passed in. Only once we see a few
> duplicates, though.
>
Sure, make sense. Thanks!
I can try and check if the the wrapper helps.
>> +static int ext2_write_map_blocks(struct iomap_writepage_ctx *wpc,
>> + struct inode *inode, loff_t offset)
>> +{
>> + if (offset >= wpc->iomap.offset &&
>> + offset < wpc->iomap.offset + wpc->iomap.length)
>> + return 0;
>> +
>> + return ext2_iomap_begin(inode, offset, inode->i_sb->s_blocksize,
>> + IOMAP_WRITE, &wpc->iomap, NULL);
>> +}
>
> Looking at gfs2 this also might become a pattern. But I'm fine with
> waiting for now.
>
ok.
>> @@ -1372,7 +1428,7 @@ void ext2_set_file_ops(struct inode *inode)
>> if (IS_DAX(inode))
>> inode->i_mapping->a_ops = &ext2_dax_aops;
>> else
>> - inode->i_mapping->a_ops = &ext2_aops;
>> + inode->i_mapping->a_ops = &ext2_file_aops;
>> }
>
> Did yo run into issues in using the iomap based aops for the other uses
> of ext2_aops, or are just trying to address the users one at a time?
There are problems for e.g. for dir type in ext2. It uses the pagecache
for dir. It uses buffer_heads and attaches them to folio->private.
...it uses block_write_begin/block_write_end() calls.
Look for ext4_make_empty() -> ext4_prepare_chunk ->
block_write_begin().
Now during sync/writeback of the dirty pages (ext4_handle_dirsync()), we
might take a iomap writeback path (if using ext2_file_aops for dir)
which sees folio->private assuming it is "struct iomap_folio_state".
And bad things will happen...
Now we don't have an equivalent APIs in iomap for
block_write_begin()/end() which the users can call for. Hence, Jan
suggested to lets first convert ext2 regular file path to iomap as an RFC.
I shall now check for the dir type to see what changes we might require
in ext2/iomap code.
Thanks again for your review!
-ritesh
next prev parent reply other threads:[~2023-11-21 5:56 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <cover.1700506526.git.ritesh.list@gmail.com>
2023-11-20 19:05 ` [RFC 1/3] ext2: Fix ki_pos update for DIO buffered-io fallback case Ritesh Harjani (IBM)
2023-11-21 4:39 ` Christoph Hellwig
2023-11-21 5:36 ` Ritesh Harjani
2023-11-22 6:51 ` Christoph Hellwig
2023-11-20 19:05 ` [RFC 2/3] ext2: Convert ext2 regular file buffered I/O to use iomap Ritesh Harjani (IBM)
2023-11-20 20:00 ` Matthew Wilcox
2023-11-21 4:44 ` Christoph Hellwig
2023-11-21 5:56 ` Ritesh Harjani [this message]
2023-11-21 6:08 ` Christoph Hellwig
2023-11-21 6:15 ` Ritesh Harjani
2023-11-22 12:29 ` Jan Kara
2023-11-22 13:11 ` Christoph Hellwig
2023-11-22 20:26 ` Ritesh Harjani
2023-11-30 3:24 ` Ritesh Harjani
2023-11-30 4:22 ` Matthew Wilcox
2023-11-30 4:37 ` Ritesh Harjani
2023-11-30 4:30 ` Christoph Hellwig
2023-11-30 5:27 ` Ritesh Harjani
2023-11-30 8:22 ` Zhang Yi
2023-11-30 7:45 ` Ritesh Harjani
2023-11-30 10:18 ` Jan Kara
2023-11-30 10:59 ` Ritesh Harjani
2023-11-30 14:08 ` Jan Kara
2023-11-30 15:50 ` Ritesh Harjani
2023-11-30 16:01 ` Jan Kara
2023-11-30 16:03 ` Matthew Wilcox
2023-12-01 23:09 ` Dave Chinner
2023-12-05 15:22 ` Ritesh Harjani
2023-12-07 8:58 ` Jan Kara
2023-11-22 22:26 ` Dave Chinner
2023-11-23 4:09 ` Darrick J. Wong
2023-11-23 7:09 ` Christoph Hellwig
2023-11-29 5:37 ` Darrick J. Wong
2023-11-29 6:32 ` Christoph Hellwig
2023-11-29 9:19 ` Dave Chinner
2023-11-23 7:02 ` Christoph Hellwig
2023-11-22 20:25 ` Ritesh Harjani
2023-11-20 19:05 ` [RFC 3/3] ext2: Enable large folio support Ritesh Harjani (IBM)
2023-11-20 20:00 ` Matthew Wilcox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=874jhfy7i8.fsf@doe.com \
--to=ritesh.list@gmail.com \
--cc=hch@infradead.org \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.