All of lore.kernel.org
 help / color / mirror / Atom feed
From: Omar Sandoval <osandov@osandov.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	kernel-team@fb.com, Dave Chinner <david@fromorbit.com>,
	Jann Horn <jannh@google.com>,
	linux-api@vger.kernel.org
Subject: Re: [RFC PATCH v2 5/5] btrfs: implement RWF_ENCODED writes
Date: Fri, 18 Oct 2019 16:33:00 -0700	[thread overview]
Message-ID: <20191018233300.GE59713@vader> (raw)
In-Reply-To: <20191018225513.GD59713@vader>

On Fri, Oct 18, 2019 at 03:55:13PM -0700, Omar Sandoval wrote:
> On Wed, Oct 16, 2019 at 01:44:56PM +0300, Nikolay Borisov wrote:
> > 
> > 
> > On 15.10.19 г. 21:42 ч., Omar Sandoval wrote:
> > > From: Omar Sandoval <osandov@fb.com>
> > > 
> > > The implementation resembles direct I/O: we have to flush any ordered
> > > extents, invalidate the page cache, and do the io tree/delalloc/extent
> > > map/ordered extent dance. From there, we can reuse the compression code
> > > with a minor modification to distinguish the write from writeback.
> > > 
> > > Now that read and write are implemented, this also sets the
> > > FMODE_ENCODED_IO flag in btrfs_file_open().
> > > 
> > > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > > ---
> > >  fs/btrfs/compression.c |   6 +-
> > >  fs/btrfs/compression.h |   5 +-
> > >  fs/btrfs/ctree.h       |   2 +
> > >  fs/btrfs/file.c        |  40 +++++++--
> > >  fs/btrfs/inode.c       | 197 ++++++++++++++++++++++++++++++++++++++++-
> > >  5 files changed, 237 insertions(+), 13 deletions(-)
> > > 

[snip]

> > > +	for (;;) {
> > > +		struct btrfs_ordered_extent *ordered;
> > > +
> > > +		ret = btrfs_wait_ordered_range(inode, start, end - start + 1);
> > > +		if (ret)
> > > +			goto out_pages;
> > > +		ret = invalidate_inode_pages2_range(inode->i_mapping,
> > > +						    start >> PAGE_SHIFT,
> > > +						    end >> PAGE_SHIFT);
> > > +		if (ret)
> > > +			goto out_pages;
> > > +		lock_extent_bits(io_tree, start, end, &cached_state);
> > > +		ordered = btrfs_lookup_ordered_range(BTRFS_I(inode), start,
> > > +						     end - start + 1);
> > > +		if (!ordered &&
> > > +		    !filemap_range_has_page(inode->i_mapping, start, end))
> > > +			break;
> > > +		if (ordered)
> > > +			btrfs_put_ordered_extent(ordered);
> > > +		unlock_extent_cached(io_tree, start, end, &cached_state);
> > > +		cond_resched();
> > > +	}
> > > +
> > > +	ret = btrfs_delalloc_reserve_space(inode, &data_reserved, start,
> > > +					   num_bytes);
> > > +	if (ret)
> > > +		goto out_unlock;
> > > +
> > > +	ret = btrfs_reserve_extent(root, num_bytes, disk_num_bytes,
> > > +				   disk_num_bytes, 0, 0, &ins, 1, 1);
> > > +	if (ret)
> > > +		goto out_delalloc_release;
> > > +
> > > +	em = create_io_em(inode, start, num_bytes, start, ins.objectid,
> > > +			  ins.offset, ins.offset, num_bytes, compression,
> > > +			  BTRFS_ORDERED_COMPRESSED);
> > > +	if (IS_ERR(em)) {
> > > +		ret = PTR_ERR(em);
> > > +		goto out_free_reserve;
> > > +	}
> > > +	free_extent_map(em);
> > > +
> > > +	ret = btrfs_add_ordered_extent_compress(inode, start, ins.objectid,
> > > +						num_bytes, ins.offset,
> > > +						BTRFS_ORDERED_COMPRESSED,
> > > +						compression);
> > > +	if (ret) {
> > > +		btrfs_drop_extent_cache(BTRFS_I(inode), start, end, 0);
> > > +		goto out_free_reserve;
> > > +	}
> > > +	btrfs_dec_block_group_reservations(fs_info, ins.objectid);
> > > +
> > > +	if (start + encoded->len > inode->i_size)
> > > +		i_size_write(inode, start + encoded->len);
> > 
> > Don't we want the inode size to be updated once data hits disk and
> > btrfs_finish_ordered_io is called?
> 
> Yup, you're right, this is too early.

Actually, no, this part is fine. Compare to the call to i_size_write()
in btrfs_get_blocks_direct_write(): we lock the extent in the io_tree,
create the ordered extent, update i_size, then unlock the extent. Anyone
else who comes in is going to find the ordered extent and wait on that.

> > > +
> > > +	unlock_extent_cached(io_tree, start, end, &cached_state);
> > > +
> > > +	btrfs_delalloc_release_extents(BTRFS_I(inode), num_bytes, false);
> > > +
> > > +	if (btrfs_submit_compressed_write(inode, start, num_bytes, ins.objectid,
> > > +					  ins.offset, pages, nr_pages, 0,
> > > +					  false)) {
> > > +		struct page *page = pages[0];
> > > +
> > > +		page->mapping = inode->i_mapping;
> > > +		btrfs_writepage_endio_finish_ordered(page, start, end, 0);
> > > +		page->mapping = NULL;
> > > +		ret = -EIO;
> > > +		goto out_pages;
> > > +	}
> 
> I also need to wait for the I/O to finish here.
> 
> > > +	iocb->ki_pos += encoded->len;
> > > +	return orig_count;
> > > +
> > > +out_free_reserve:
> > > +	btrfs_dec_block_group_reservations(fs_info, ins.objectid);
> > > +	btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1);
> > > +out_delalloc_release:
> > > +	btrfs_delalloc_release_space(inode, data_reserved, start, num_bytes,
> > > +				     true);
> > > +out_unlock:
> > > +	unlock_extent_cached(io_tree, start, end, &cached_state);
> > > +out_pages:
> > > +	for (i = 0; i < nr_pages; i++) {
> > > +		if (pages[i])
> > > +			put_page(pages[i]);
> > > +	}
> > > +	kvfree(pages);
> > > +	return ret;
> > > +}
> > > +
> > >  #ifdef CONFIG_SWAP
> > >  /*
> > >   * Add an entry indicating a block group or device which is pinned by a
> > > 

  reply	other threads:[~2019-10-18 23:33 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-15 18:42 [RFC PATCH v2 0/5] fs: interface for directly reading/writing compressed data Omar Sandoval
2019-10-15 18:42 ` [PATCH man-pages] Document encoded I/O Omar Sandoval
2019-10-21  6:18   ` Amir Goldstein
2019-10-21 18:53     ` Omar Sandoval
2019-10-22  6:40       ` Amir Goldstein
2019-10-23  4:44         ` Aleksa Sarai
2019-10-23  6:06           ` Amir Goldstein
2019-10-23 12:12             ` Aleksa Sarai
2019-10-30 22:46               ` Omar Sandoval
2019-10-30 22:57                 ` Omar Sandoval
2019-10-15 18:42 ` [RFC PATCH v2 1/5] fs: add O_ENCODED open flag Omar Sandoval
2019-10-19  4:50   ` Aleksa Sarai
2019-10-23  4:46     ` Aleksa Sarai
2019-10-30 22:55     ` Omar Sandoval
2019-10-30 23:17       ` Aleksa Sarai
2019-10-15 18:42 ` [RFC PATCH v2 2/5] fs: add RWF_ENCODED for reading/writing compressed data Omar Sandoval
2019-10-16  9:50   ` Nikolay Borisov
2019-10-18 22:19     ` Omar Sandoval
2019-10-19  5:01   ` Aleksa Sarai
2019-10-21 18:28   ` Darrick J. Wong
2019-10-21 18:38     ` Aleksa Sarai
2019-10-21 19:00       ` Darrick J. Wong
2019-10-22  1:37         ` Aleksa Sarai
2019-10-30 22:21           ` Omar Sandoval
2019-10-22  2:02         ` Aleksa Sarai
2019-10-30 22:26           ` Omar Sandoval
2019-10-30 23:11             ` Aleksa Sarai
2019-10-21 19:07     ` Omar Sandoval
2019-10-21 19:07       ` Omar Sandoval
2019-10-15 18:42 ` [RFC PATCH v2 3/5] btrfs: generalize btrfs_lookup_bio_sums_dio() Omar Sandoval
2019-10-16  9:22   ` Nikolay Borisov
2019-10-18 22:19     ` Omar Sandoval
2019-10-15 18:42 ` [RFC PATCH v2 4/5] btrfs: implement RWF_ENCODED reads Omar Sandoval
2019-10-16 11:10   ` Nikolay Borisov
2019-10-18 22:23     ` Omar Sandoval
2019-10-15 18:42 ` [RFC PATCH v2 5/5] btrfs: implement RWF_ENCODED writes Omar Sandoval
2019-10-16 10:44   ` Nikolay Borisov
2019-10-18 22:55     ` Omar Sandoval
2019-10-18 23:33       ` Omar Sandoval [this message]
2019-10-21 13:14       ` David Sterba
2019-10-21 18:05         ` Omar Sandoval
2019-10-20 23:05 ` [RFC PATCH v2 0/5] fs: interface for directly reading/writing compressed data Dave Chinner
2019-10-21 19:04   ` Omar Sandoval

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191018233300.GE59713@vader \
    --to=osandov@osandov.com \
    --cc=david@fromorbit.com \
    --cc=jannh@google.com \
    --cc=kernel-team@fb.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=nborisov@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.