From: Omar Sandoval <osandov@osandov.com>
To: Nikolay Borisov <nborisov@suse.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com,
linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org
Subject: Re: [PATCH v11 08/14] btrfs: add BTRFS_IOC_ENCODED_READ
Date: Mon, 18 Oct 2021 16:59:33 -0700 [thread overview]
Message-ID: <YW4KZd4gGKVen4p4@relinquished.localdomain> (raw)
In-Reply-To: <4a64bf3a-b691-1986-80c8-21ddf9e446a0@suse.com>
On Fri, Oct 15, 2021 at 02:45:46PM +0300, Nikolay Borisov wrote:
>
>
> On 1.09.21 г. 20:01, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > There are 4 main cases:
> >
> > 1. Inline extents: we copy the data straight out of the extent buffer.
> > 2. Hole/preallocated extents: we fill in zeroes.
> > 3. Regular, uncompressed extents: we read the sectors we need directly
> > from disk.
> > 4. Regular, compressed extents: we read the entire compressed extent
> > from disk and indicate what subset of the decompressed extent is in
> > the file.
> >
> > This initial implementation simplifies a few things that can be improved
> > in the future:
> >
> > - We hold the inode lock during the operation.
> > - Cases 1, 3, and 4 allocate temporary memory to read into before
> > copying out to userspace.
> > - We don't do read repair, because it turns out that read repair is
> > currently broken for compressed data.
> >
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > ---
> > fs/btrfs/ctree.h | 4 +
> > fs/btrfs/inode.c | 489 +++++++++++++++++++++++++++++++++++++++++++++++
> > fs/btrfs/ioctl.c | 111 +++++++++++
> > 3 files changed, 604 insertions(+)
> >
> > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> > index b95ec5fb68d5..cbd7e07c1c34 100644
> > --- a/fs/btrfs/ctree.h
> > +++ b/fs/btrfs/ctree.h
> > @@ -3223,6 +3223,10 @@ int btrfs_writepage_cow_fixup(struct page *page);
> > void btrfs_writepage_endio_finish_ordered(struct btrfs_inode *inode,
> > struct page *page, u64 start,
> > u64 end, bool uptodate);
> > +struct btrfs_ioctl_encoded_io_args;
> > +ssize_t btrfs_encoded_read(struct kiocb *iocb, struct iov_iter *iter,
> > + struct btrfs_ioctl_encoded_io_args *encoded);
> > +
> > extern const struct dentry_operations btrfs_dentry_operations;
> > extern const struct iomap_ops btrfs_dio_iomap_ops;
> > extern const struct iomap_dio_ops btrfs_dio_ops;
> > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > index a87a34f56234..1940f22179ba 100644
> > --- a/fs/btrfs/inode.c
> > +++ b/fs/btrfs/inode.c
> > @@ -10500,6 +10500,495 @@ void btrfs_set_range_writeback(struct btrfs_inode *inode, u64 start, u64 end)
> > }
> > }
> >
>
> <snip>
>
> > +
> > +static blk_status_t btrfs_encoded_read_check_bio(struct btrfs_io_bio *io_bio)
>
> nit: The gist of this function is to check the csum so how about
> renaming it to btrfs_encoded_read_verify_csum
Good point, will do.
> > +
> > +static void btrfs_encoded_read_endio(struct bio *bio)
> > +{
> > + struct btrfs_encoded_read_private *priv = bio->bi_private;
> > + struct btrfs_io_bio *io_bio = btrfs_io_bio(bio);
> > + blk_status_t status;
> > +
> > + status = btrfs_encoded_read_check_bio(io_bio);
> > + if (status) {
> > + /*
> > + * The memory barrier implied by the atomic_dec_return() here
> > + * pairs with the memory barrier implied by the
> > + * atomic_dec_return() or io_wait_event() in
>
> nit: I think atomic_dec_return in read_regular_fill_pages is
> inconsequential, what we want to ensure is that when the caller of
> io_wait_event is woken up by this thread it will observe the
> priv->status, which it will, because the atomic-dec_return in this
> function has paired with the general barrier interpolated by wait_event.
>
> So for brevity just leave the text to say "by io_wait_event".
Considering that there is a code path in
btrfs_encoded_read_regular_fill_pages() where atomic_dec_return() is
called but io_wait_event() isn't, I think it's important to mention
both. (This path should be fairly rare, since it can only happen if all
of the bios are completed before the submitting thread checks the
pending counter. But, it's a possible code path, and I wouldn't want
someone reading the comment to be confused and think that we're missing
a barrier in that case.)
> > + * btrfs_encoded_read_regular_fill_pages() to ensure that this
> > + * write is observed before the load of status in
> > + * btrfs_encoded_read_regular_fill_pages().
> > + */
> > + WRITE_ONCE(priv->status, status);
> > + }
> > + if (!atomic_dec_return(&priv->pending))
> > + wake_up(&priv->wait);
> > + btrfs_io_bio_free_csum(io_bio);
> > + bio_put(bio);
> > +}
>
> <snip>
>
> > @@ -4824,6 +4841,94 @@ static int _btrfs_ioctl_send(struct file *file, void __user *argp, bool compat)
> > return ret;
> > }
> >
>
> <snip>
>
> > + memset((char *)&args + copy_end_kernel, 0,
> > + sizeof(args) - copy_end_kernel);
>
> nit: This memset can be eliminated ( in source) by marking args = {};
> and just leaving copy from user above.
args = {} results in slightly worse generated code, but that doesn't
really matter here, so I'll change it.
> > +
> > + ret = import_iovec(READ, args.iov, args.iovcnt, ARRAY_SIZE(iovstack),
> > + &iov, &iter);
> > + if (ret < 0)
> > + goto out_acct;
> > +
> > + if (iov_iter_count(&iter) == 0) {
> > + ret = 0;
> > + goto out_iov;
> > + }
> > + pos = args.offset;
> > + ret = rw_verify_area(READ, file, &pos, args.len);
> > + if (ret < 0)
> > + goto out_iov;
> > +
> > + init_sync_kiocb(&kiocb, file);
> > + ret = kiocb_set_rw_flags(&kiocb, 0);
>
> This call is a noop due to:
> if (!flags)
> return 0;
>
> in kiocb_set_rw_flags.
Good catch, I'll remove that call.
next prev parent reply other threads:[~2021-10-18 23:59 UTC|newest]
Thread overview: 62+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-01 17:00 [PATCH v11 00/14] btrfs: add ioctls and send/receive support for reading/writing compressed data Omar Sandoval
2021-09-01 17:00 ` [PATCH v11 01/14] fs: export rw_verify_area() Omar Sandoval
2021-11-18 19:19 ` Omar Sandoval
2022-01-04 19:06 ` Omar Sandoval
2021-09-01 17:00 ` [PATCH v11 02/14] fs: export variant of generic_write_checks without iov_iter Omar Sandoval
2021-10-14 12:03 ` Nikolay Borisov
2021-09-01 17:00 ` [PATCH v11 03/14] btrfs: don't advance offset for compressed bios in btrfs_csum_one_bio() Omar Sandoval
2021-10-14 12:05 ` Nikolay Borisov
2021-10-18 18:09 ` Omar Sandoval
2021-09-01 17:00 ` [PATCH v11 04/14] btrfs: add ram_bytes and offset to btrfs_ordered_extent Omar Sandoval
2021-10-21 12:44 ` Nikolay Borisov
2021-10-21 16:55 ` Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 05/14] btrfs: support different disk extent size for delalloc Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 06/14] btrfs: optionally extend i_size in cow_file_range_inline() Omar Sandoval
2021-10-14 12:54 ` Nikolay Borisov
2021-09-01 17:01 ` [PATCH v11 07/14] btrfs: add definitions + documentation for encoded I/O ioctls Omar Sandoval
2021-10-15 9:42 ` Nikolay Borisov
2021-10-18 18:16 ` Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 08/14] btrfs: add BTRFS_IOC_ENCODED_READ Omar Sandoval
2021-10-15 11:45 ` Nikolay Borisov
2021-10-18 23:59 ` Omar Sandoval [this message]
2021-09-01 17:01 ` [PATCH v11 09/14] btrfs: add BTRFS_IOC_ENCODED_WRITE Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 10/14] btrfs: add send stream v2 definitions Omar Sandoval
2021-10-18 12:46 ` Nikolay Borisov
2021-10-18 15:11 ` Nikolay Borisov
2021-10-18 18:58 ` Omar Sandoval
2021-10-19 7:01 ` Nikolay Borisov
2021-09-01 17:01 ` [PATCH v11 11/14] btrfs: send: write larger chunks when using stream v2 Omar Sandoval
2021-10-18 12:57 ` Nikolay Borisov
2021-09-01 17:01 ` [PATCH v11 12/14] btrfs: send: allocate send buffer with alloc_page() and vmap() for v2 Omar Sandoval
2021-10-18 11:10 ` Nikolay Borisov
2021-09-01 17:01 ` [PATCH v11 13/14] btrfs: send: send compressed extents with encoded writes Omar Sandoval
2021-10-18 11:59 ` Nikolay Borisov
2021-10-19 0:11 ` Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 14/14] btrfs: send: enable support for stream v2 and compressed writes Omar Sandoval
2021-10-18 12:44 ` Nikolay Borisov
2021-10-18 18:34 ` Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 01/10] btrfs-progs: receive: support v2 send stream larger tlv_len Omar Sandoval
2021-10-20 13:49 ` Nikolay Borisov
2021-10-20 18:48 ` Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 02/10] btrfs-progs: receive: dynamically allocate sctx->read_buf Omar Sandoval
2021-10-20 14:09 ` Nikolay Borisov
2021-10-20 14:35 ` Nikolay Borisov
2021-10-20 17:44 ` Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 03/10] btrfs-progs: receive: support v2 send stream DATA tlv format Omar Sandoval
2021-10-20 14:36 ` Nikolay Borisov
2021-09-01 17:01 ` [PATCH v11 04/10] btrfs-progs: receive: add send stream v2 cmds and attrs to send.h Omar Sandoval
2021-10-20 14:38 ` Nikolay Borisov
2021-09-01 17:01 ` [PATCH v11 05/10] btrfs-progs: receive: process encoded_write commands Omar Sandoval
2021-10-21 13:33 ` Nikolay Borisov
2021-10-21 16:52 ` Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 06/10] btrfs-progs: receive: encoded_write fallback to explicit decode and write Omar Sandoval
2021-10-21 13:55 ` Nikolay Borisov
2021-10-21 17:22 ` Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 07/10] btrfs-progs: receive: process fallocate commands Omar Sandoval
2021-10-21 14:21 ` Nikolay Borisov
2021-10-21 17:28 ` Omar Sandoval
2021-09-01 17:01 ` [PATCH v11 08/10] btrfs-progs: receive: process setflags ioctl commands Omar Sandoval
2021-10-21 14:22 ` Nikolay Borisov
2021-09-01 17:01 ` [PATCH v11 09/10] btrfs-progs: send: stream v2 ioctl flags Omar Sandoval
2021-10-22 6:35 ` Nikolay Borisov
2021-09-01 17:01 ` [PATCH v11 10/10] btrfs-progs: receive: add tests for basic encoded_write send/receive Omar Sandoval
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YW4KZd4gGKVen4p4@relinquished.localdomain \
--to=osandov@osandov.com \
--cc=kernel-team@fb.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=nborisov@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox