From: Omar Sandoval <osandov@osandov.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Nikolay Borisov <nborisov@suse.com>,
linux-btrfs@vger.kernel.org, kernel-team@fb.com,
linux-fsdevel@vger.kernel.org, Al Viro <viro@zeniv.linux.org.uk>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-api@vger.kernel.org, Qu Wenruo <wqu@suse.com>
Subject: Re: [PATCH v10 06/14] btrfs: optionally extend i_size in cow_file_range_inline()
Date: Fri, 20 Aug 2021 11:11:12 -0700 [thread overview]
Message-ID: <YR/wQPJcv25vPIp7@relinquished.localdomain> (raw)
In-Reply-To: <2eae3b11-d9aa-42b1-122e-49bd40258d9b@gmx.com>
On Fri, Aug 20, 2021 at 05:13:34PM +0800, Qu Wenruo wrote:
>
>
> On 2021/8/20 下午4:51, Nikolay Borisov wrote:
> >
> >
> > On 18.08.21 г. 0:06, Omar Sandoval wrote:
> > > From: Omar Sandoval <osandov@fb.com>
> > >
> > > Currently, an inline extent is always created after i_size is extended
> > > from btrfs_dirty_pages(). However, for encoded writes, we only want to
> > > update i_size after we successfully created the inline extent.
>
> To me, the idea of write first then update isize is just going to cause
> tons of inline extent related prblems.
>
> The current example is falloc, which only update the isize after the
> falloc finishes.
>
> This behavior has already bothered me quite a lot, as it can easily
> create mixed inline and regular extents.
Do you have an example of how this would happen? I have the inode and
extent bits locked during an encoded write, and I see that fallocate
does the same.
> Can't we remember the old isize (with proper locking), enlarge isize
> (with holes filled), do the write.
>
> If something wrong happened, we truncate the isize back to its old isize.
>
> > > Add an
> > > update_i_size parameter to cow_file_range_inline() and
> > > insert_inline_extent() and pass in the size of the extent rather than
> > > determining it from i_size. Since the start parameter is always passed
> > > as 0, get rid of it and simplify the logic in these two functions. While
> > > we're here, let's document the requirements for creating an inline
> > > extent.
> > >
> > > Reviewed-by: Josef Bacik <josef@toxicpanda.com>
> > > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > > ---
> > > fs/btrfs/inode.c | 100 +++++++++++++++++++++++------------------------
> > > 1 file changed, 48 insertions(+), 52 deletions(-)
> > >
> > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > > index 708d8ab098bc..0b5ff14aa7fd 100644
> > > --- a/fs/btrfs/inode.c
> > > +++ b/fs/btrfs/inode.c
> > > @@ -236,9 +236,10 @@ static int btrfs_init_inode_security(struct btrfs_trans_handle *trans,
> > > static int insert_inline_extent(struct btrfs_trans_handle *trans,
> > > struct btrfs_path *path, bool extent_inserted,
> > > struct btrfs_root *root, struct inode *inode,
> > > - u64 start, size_t size, size_t compressed_size,
> > > + size_t size, size_t compressed_size,
> > > int compress_type,
> > > - struct page **compressed_pages)
> > > + struct page **compressed_pages,
> > > + bool update_i_size)
> > > {
> > > struct extent_buffer *leaf;
> > > struct page *page = NULL;
> > > @@ -247,7 +248,7 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans,
> > > struct btrfs_file_extent_item *ei;
> > > int ret;
> > > size_t cur_size = size;
> > > - unsigned long offset;
> > > + u64 i_size;
> > >
> > > ASSERT((compressed_size > 0 && compressed_pages) ||
> > > (compressed_size == 0 && !compressed_pages));
> > > @@ -260,7 +261,7 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans,
> > > size_t datasize;
> > >
> > > key.objectid = btrfs_ino(BTRFS_I(inode));
> > > - key.offset = start;
> > > + key.offset = 0;
> > > key.type = BTRFS_EXTENT_DATA_KEY;
> > >
> > > datasize = btrfs_file_extent_calc_inline_size(cur_size);
> > > @@ -297,12 +298,10 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans,
> > > btrfs_set_file_extent_compression(leaf, ei,
> > > compress_type);
> > > } else {
> > > - page = find_get_page(inode->i_mapping,
> > > - start >> PAGE_SHIFT);
> > > + page = find_get_page(inode->i_mapping, 0);
> > > btrfs_set_file_extent_compression(leaf, ei, 0);
> > > kaddr = kmap_atomic(page);
> > > - offset = offset_in_page(start);
> > > - write_extent_buffer(leaf, kaddr + offset, ptr, size);
> > > + write_extent_buffer(leaf, kaddr, ptr, size);
> > > kunmap_atomic(kaddr);
> > > put_page(page);
> > > }
> > > @@ -313,8 +312,8 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans,
> > > * We align size to sectorsize for inline extents just for simplicity
> > > * sake.
> > > */
> > > - size = ALIGN(size, root->fs_info->sectorsize);
> > > - ret = btrfs_inode_set_file_extent_range(BTRFS_I(inode), start, size);
> > > + ret = btrfs_inode_set_file_extent_range(BTRFS_I(inode), 0,
> > > + ALIGN(size, root->fs_info->sectorsize));
> > > if (ret)
> > > goto fail;
> > >
> > > @@ -327,7 +326,13 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans,
> > > * before we unlock the pages. Otherwise we
> > > * could end up racing with unlink.
> > > */
> > > - BTRFS_I(inode)->disk_i_size = inode->i_size;
> > > + i_size = i_size_read(inode);
> > > + if (update_i_size && size > i_size) {
> > > + i_size_write(inode, size);
> > > + i_size = size;
> > > + }
> > > + BTRFS_I(inode)->disk_i_size = i_size;
> > > +
> > > fail:
> > > return ret;
> > > }
> > > @@ -338,35 +343,31 @@ static int insert_inline_extent(struct btrfs_trans_handle *trans,
> > > * does the checks required to make sure the data is small enough
> > > * to fit as an inline extent.
> > > */
> > > -static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 start,
> > > - u64 end, size_t compressed_size,
> > > +static noinline int cow_file_range_inline(struct btrfs_inode *inode, u64 size,
> > > + size_t compressed_size,
> > > int compress_type,
> > > - struct page **compressed_pages)
> > > + struct page **compressed_pages,
> > > + bool update_i_size)
> > > {
> > > struct btrfs_drop_extents_args drop_args = { 0 };
> > > struct btrfs_root *root = inode->root;
> > > struct btrfs_fs_info *fs_info = root->fs_info;
> > > struct btrfs_trans_handle *trans;
> > > - u64 isize = i_size_read(&inode->vfs_inode);
> > > - u64 actual_end = min(end + 1, isize);
> > > - u64 inline_len = actual_end - start;
> > > - u64 aligned_end = ALIGN(end, fs_info->sectorsize);
> > > - u64 data_len = inline_len;
> > > + u64 data_len = compressed_size ? compressed_size : size;
> > > int ret;
> > > struct btrfs_path *path;
> > >
> > > - if (compressed_size)
> > > - data_len = compressed_size;
> > > -
> > > - if (start > 0 ||
> > > - actual_end > fs_info->sectorsize ||
> > > + /*
> > > + * We can create an inline extent if it ends at or beyond the current
> > > + * i_size, is no larger than a sector (decompressed), and the (possibly
> > > + * compressed) data fits in a leaf and the configured maximum inline
> > > + * size.
> > > + */
> >
> > Urgh, just some days ago Qu was talking about how awkward it is to have
> > mixed extents in a file. And now, AFAIU, you are making them more likely
> > since now they can be created not just at the beginning of the file but
> > also after i_size write. While this won't be a problem in and of itself
> > it goes just the opposite way of us trying to shrink the possible cases
> > when we can have mixed extents.
>
> Tree-checker should reject such inline extent at non-zero offset.
This change does not allow creating inline extents at a non-zero offset.
> > Qu what is your take on that?
>
> My question is, why encoded write needs to bother the inline extents at all?
>
> My intuition of such encoded write is, it should not create inline
> extents at all.
>
> Or is there any special use-case involved for encoded write?
We create compressed inline extents with normal writes. We should be
able to send and receive them without converting them into regular
extents.
next prev parent reply other threads:[~2021-08-20 18:11 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-17 21:06 [PATCH v10 00/14] btrfs: add ioctls and send/receive support for reading/writing compressed data Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 01/14] fs: export rw_verify_area() Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 02/14] fs: export variant of generic_write_checks without iov_iter Omar Sandoval
2021-08-20 7:59 ` Nikolay Borisov
2021-08-20 17:31 ` Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 03/14] btrfs: don't advance offset for compressed bios in btrfs_csum_one_bio() Omar Sandoval
2021-08-20 8:08 ` Nikolay Borisov
2021-08-20 17:37 ` Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 04/14] btrfs: add ram_bytes and offset to btrfs_ordered_extent Omar Sandoval
2021-08-20 8:34 ` Nikolay Borisov
2021-08-20 17:43 ` Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 05/14] btrfs: support different disk extent size for delalloc Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 06/14] btrfs: optionally extend i_size in cow_file_range_inline() Omar Sandoval
2021-08-20 8:51 ` Nikolay Borisov
2021-08-20 9:13 ` Qu Wenruo
2021-08-20 18:11 ` Omar Sandoval [this message]
2021-08-21 1:11 ` Qu Wenruo
2021-08-23 18:16 ` Omar Sandoval
2021-08-23 23:32 ` Qu Wenruo
2021-08-23 23:46 ` Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 07/14] btrfs: add definitions + documentation for encoded I/O ioctls Omar Sandoval
2021-08-20 8:56 ` Nikolay Borisov
2021-08-20 17:48 ` Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 08/14] btrfs: add BTRFS_IOC_ENCODED_READ Omar Sandoval
2021-08-20 12:30 ` Nikolay Borisov
2021-08-20 17:58 ` Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 09/14] btrfs: add BTRFS_IOC_ENCODED_WRITE Omar Sandoval
2021-08-20 13:44 ` Nikolay Borisov
2021-08-20 17:59 ` Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 10/14] btrfs: add send stream v2 definitions Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 11/14] btrfs: send: write larger chunks when using stream v2 Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 12/14] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2 Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 13/14] btrfs: send: send compressed extents with encoded writes Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 14/14] btrfs: send: enable support for stream v2 and compressed writes Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 01/10] btrfs-progs: receive: support v2 send stream larger tlv_len Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 02/10] btrfs-progs: receive: dynamically allocate sctx->read_buf Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 03/10] btrfs-progs: receive: support v2 send stream DATA tlv format Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 04/10] btrfs-progs: receive: add send stream v2 cmds and attrs to send.h Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 05/10] btrfs-progs: receive: process encoded_write commands Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 06/10] btrfs-progs: receive: encoded_write fallback to explicit decode and write Omar Sandoval
2021-08-18 18:07 ` Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 07/10] btrfs-progs: receive: process fallocate commands Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 08/10] btrfs-progs: receive: process setflags ioctl commands Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 09/10] btrfs-progs: send: stream v2 ioctl flags Omar Sandoval
2021-08-17 21:06 ` [PATCH v10 10/10] btrfs-progs: receive: add tests for basic encoded_write send/receive Omar Sandoval
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YR/wQPJcv25vPIp7@relinquished.localdomain \
--to=osandov@osandov.com \
--cc=kernel-team@fb.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=nborisov@suse.com \
--cc=quwenruo.btrfs@gmx.com \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).