linux-fsdevel.vger.kernel.org archive mirror
From: Omar Sandoval <osandov@osandov.com>
To: linux-btrfs@vger.kernel.org
Cc: kernel-team@fb.com, linux-fsdevel@vger.kernel.org,
	Al Viro <viro@zeniv.linux.org.uk>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-api@vger.kernel.org
Subject: Re: [PATCH v10 06/10] btrfs-progs: receive: encoded_write fallback to explicit decode and write
Date: Wed, 18 Aug 2021 11:07:43 -0700
Message-ID: <YR1Mb0i6Fk1LggJb@relinquished.localdomain>
In-Reply-To: <27ad30578c6e4347ff4161183c55ba6dee2e9227.1629234282.git.osandov@fb.com>

On Tue, Aug 17, 2021 at 02:06:52PM -0700, Omar Sandoval wrote:
> From: Boris Burkov <boris@bur.io>
> 
> An encoded_write can fail if the file system it is being applied to does
> not support encoded writes or if it can't find enough contiguous space
> to accommodate the encoded extent. In those cases, we can likely still
> process an encoded_write by explicitly decoding the data and doing a
> normal write.
> 
> Add the necessary fallback path for decoding data compressed with zlib,
> lzo, or zstd. zlib and zstd have reusable decoding context data
> structures which we cache in the receive context so that we don't have
> to recreate them on every encoded_write.
> 
> Finally, add a command line flag for force-decompress which causes
> receive to always use the fallback path rather than first attempting the
> encoded write.
> 
> Signed-off-by: Boris Burkov <boris@bur.io>
> ---
>  Documentation/btrfs-receive.asciidoc |   4 +
>  cmds/receive.c                       | 266 ++++++++++++++++++++++++++-
>  2 files changed, 261 insertions(+), 9 deletions(-)
> 
> diff --git a/Documentation/btrfs-receive.asciidoc b/Documentation/btrfs-receive.asciidoc
> index e4c4d2c0..354a71dc 100644
> --- a/Documentation/btrfs-receive.asciidoc
> +++ b/Documentation/btrfs-receive.asciidoc
> @@ -60,6 +60,10 @@ By default the mountpoint is searched in '/proc/self/mounts'.
>  If '/proc' is not accessible, eg. in a chroot environment, use this option to
>  tell us where this filesystem is mounted.
>  
> +--force-decompress::
> +if the stream contains compressed data (see '--compressed-data' in
> +`btrfs-send`(8)), always decompress it instead of writing it with encoded I/O.
> +
>  --dump::
>  dump the stream metadata, one line per operation
>  +
> diff --git a/cmds/receive.c b/cmds/receive.c
> index b43c298f..7506f992 100644
> --- a/cmds/receive.c
> +++ b/cmds/receive.c
> @@ -40,6 +40,10 @@
>  #include <sys/xattr.h>
>  #include <uuid/uuid.h>
>  
> +#include <lzo/lzo1x.h>
> +#include <zlib.h>
> +#include <zstd.h>
> +
>  #include "kernel-shared/ctree.h"
>  #include "ioctl.h"
>  #include "cmds/commands.h"
> @@ -79,6 +83,12 @@ struct btrfs_receive
>  	struct subvol_uuid_search sus;
>  
>  	int honor_end_cmd;
> +
> +	int force_decompress;
> +
> +	/* Reuse stream objects for encoded_write decompression fallback */
> +	ZSTD_DStream *zstd_dstream;
> +	z_stream *zlib_stream;
>  };
>  
>  static int finish_subvol(struct btrfs_receive *rctx)
> @@ -989,9 +999,222 @@ static int process_update_extent(const char *path, u64 offset, u64 len,
>  	return 0;
>  }
>  
> +static int decompress_zlib(struct btrfs_receive *rctx, const char *encoded_data,
> +			   u64 encoded_len, char *unencoded_data,
> +			   u64 unencoded_len)
> +{
> +	bool init = false;
> +	int ret;
> +
> +	if (!rctx->zlib_stream) {
> +		init = true;
> +		rctx->zlib_stream = malloc(sizeof(z_stream));
> +		if (!rctx->zlib_stream) {
> +			error("failed to allocate zlib stream %m");
> +			return -ENOMEM;
> +		}
> +	}
> +	rctx->zlib_stream->next_in = (void *)encoded_data;
> +	rctx->zlib_stream->avail_in = encoded_len;
> +	rctx->zlib_stream->next_out = (void *)unencoded_data;
> +	rctx->zlib_stream->avail_out = unencoded_len;
> +
> +	if (init) {
> +		rctx->zlib_stream->zalloc = Z_NULL;
> +		rctx->zlib_stream->zfree = Z_NULL;
> +		rctx->zlib_stream->opaque = Z_NULL;
> +		ret = inflateInit(rctx->zlib_stream);
> +	} else {
> +		ret = inflateReset(rctx->zlib_stream);
> +	}
> +	if (ret != Z_OK) {
> +		error("zlib inflate init failed: %d", ret);
> +		return -EIO;
> +	}
> +
> +	while (rctx->zlib_stream->avail_in > 0 &&
> +	       rctx->zlib_stream->avail_out > 0) {
> +		ret = inflate(rctx->zlib_stream, Z_FINISH);
> +		if (ret == Z_STREAM_END) {
> +			break;
> +		} else if (ret != Z_OK) {
> +			error("zlib inflate failed: %d", ret);
> +			return -EIO;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static int decompress_zstd(struct btrfs_receive *rctx, const char *encoded_buf,
> +			   u64 encoded_len, char *unencoded_buf,
> +			   u64 unencoded_len)
> +{
> +	ZSTD_inBuffer in_buf = {
> +		.src = encoded_buf,
> +		.size = encoded_len
> +	};
> +	ZSTD_outBuffer out_buf = {
> +		.dst = unencoded_buf,
> +		.size = unencoded_len
> +	};
> +	size_t ret;
> +
> +	if (!rctx->zstd_dstream) {
> +		rctx->zstd_dstream = ZSTD_createDStream();
> +		if (!rctx->zstd_dstream) {
> +			error("failed to create zstd dstream");
> +			return -ENOMEM;
> +		}
> +	}
> +	ret = ZSTD_initDStream(rctx->zstd_dstream);
> +	if (ZSTD_isError(ret)) {
> +		error("failed to init zstd stream: %s", ZSTD_getErrorName(ret));
> +		return -EIO;
> +	}
> +	while (in_buf.pos < in_buf.size && out_buf.pos < out_buf.size) {
> +		ret = ZSTD_decompressStream(rctx->zstd_dstream, &out_buf, &in_buf);
> +		if (ret == 0) {
> +			break;
> +		} else if (ZSTD_isError(ret)) {
> +			error("failed to decompress zstd stream: %s",
> +			      ZSTD_getErrorName(ret));
> +			return -EIO;
> +		}
> +	}
> +	return 0;
> +}
> +
> +static int decompress_lzo(const char *encoded_data, u64 encoded_len,
> +			  char *unencoded_data, u64 unencoded_len,
> +			  unsigned int page_size)
> +{
> +	uint32_t total_len;
> +	size_t in_pos, out_pos;
> +
> +	if (encoded_len < 4) {
> +		error("lzo header is truncated");
> +		return -EIO;
> +	}
> +	memcpy(&total_len, encoded_data, 4);
> +	total_len = le32toh(total_len);
> +	if (total_len > encoded_len) {
> +		error("lzo header is invalid");
> +		return -EIO;
> +	}
> +
> +	in_pos = 4;
> +	out_pos = 0;
> +	while (in_pos < total_len && out_pos < unencoded_len) {
> +		size_t page_remaining;
> +		uint32_t src_len;
> +		lzo_uint dst_len;
> +		int ret;
> +
> +		page_remaining = -in_pos % page_size;
> +		if (page_remaining < 4) {
> +			if (total_len - in_pos <= page_remaining)
> +				break;
> +			in_pos += page_remaining;
> +		}
> +
> +		if (total_len - in_pos < 4) {
> +			error("lzo segment header is truncated");
> +			return -EIO;
> +		}
> +
> +		memcpy(&src_len, encoded_data + in_pos, 4);
> +		src_len = le32toh(src_len);
> +		in_pos += 4;
> +		if (src_len > total_len - in_pos) {
> +			error("lzo segment header is invalid");
> +			return -EIO;
> +		}
> +
> +		dst_len = page_size;
> +		ret = lzo1x_decompress_safe((void *)(encoded_data + in_pos),
> +					    src_len,
> +					    (void *)(unencoded_data + out_pos),
> +					    &dst_len, NULL);
> +		if (ret != LZO_E_OK) {
> +			error("lzo1x_decompress_safe failed: %d", ret);
> +			return -EIO;
> +		}
> +
> +		in_pos += src_len;
> +		out_pos += dst_len;
> +	}
> +	return 0;
> +}
> +
> +static int decompress_and_write(struct btrfs_receive *rctx,
> +				const char *encoded_data, u64 offset,
> +				u64 encoded_len, u64 unencoded_file_len,
> +				u64 unencoded_len, u64 unencoded_offset,
> +				u32 compression)
> +{
> +	int ret = 0;
> +	size_t pos;
> +	ssize_t w;
> +	char *unencoded_data;
> +	int page_shift;
> +
> +	unencoded_data = calloc(unencoded_len, 1);
> +	if (!unencoded_data) {
> +		error("allocating space for unencoded data failed: %m");
> +		return -errno;
> +	}
> +
> +	switch (compression) {
> +	case BTRFS_ENCODED_IO_COMPRESSION_ZLIB:
> +		ret = decompress_zlib(rctx, encoded_data, encoded_len,
> +				      unencoded_data, unencoded_len);
> +		if (ret)
> +			goto out;
> +		break;
> +	case BTRFS_ENCODED_IO_COMPRESSION_ZSTD:
> +		ret = decompress_zstd(rctx, encoded_data, encoded_len,
> +				      unencoded_data, unencoded_len);
> +		if (ret)
> +			goto out;
> +		break;
> +	case BTRFS_ENCODED_IO_COMPRESSION_LZO_4K:
> +	case BTRFS_ENCODED_IO_COMPRESSION_LZO_8K:
> +	case BTRFS_ENCODED_IO_COMPRESSION_LZO_16K:
> +	case BTRFS_ENCODED_IO_COMPRESSION_LZO_32K:
> +	case BTRFS_ENCODED_IO_COMPRESSION_LZO_64K:
> +		page_shift = compression - BTRFS_ENCODED_IO_COMPRESSION_LZO_4K + 12;
> +		ret = decompress_lzo(encoded_data, encoded_len, unencoded_data,
> +				     unencoded_len, 1U << page_shift);
> +		if (ret)
> +			goto out;
> +		break;
> +	default:
> +		error("unknown compression: %d", compression);
> +		ret = -EOPNOTSUPP;
> +		goto out;
> +	}
> +
> +	pos = unencoded_offset;
> +	while (pos < unencoded_file_len) {
> +		w = pwrite(rctx->write_fd, unencoded_data + pos,
> +			   unencoded_file_len - pos, offset);
> +		if (w < 0) {
> +			ret = -errno;
> +			error("writing unencoded data failed: %m");
> +			goto out;
> +		}
> +		pos += w;
> +		offset += w;
> +	}
> +out:
> +	free(unencoded_data);
> +	return ret;
> +}
> +
>  static int process_encoded_write(const char *path, const void *data, u64 offset,
> -	u64 len, u64 unencoded_file_len, u64 unencoded_len,
> -	u64 unencoded_offset, u32 compression, u32 encryption, void *user)
> +				 u64 len, u64 unencoded_file_len,
> +				 u64 unencoded_len, u64 unencoded_offset,
> +				 u32 compression, u32 encryption, void *user)
>  {
>  	int ret;
>  	struct btrfs_receive *rctx = user;
> @@ -1007,6 +1230,7 @@ static int process_encoded_write(const char *path, const void *data, u64 offset,
>  		.compression = compression,
>  		.encryption = encryption,
>  	};
> +	bool encoded_write = !rctx->force_decompress;
>  
>  	if (encryption) {
>  		error("encoded_write: encryption not supported");
> @@ -1023,13 +1247,21 @@ static int process_encoded_write(const char *path, const void *data, u64 offset,
>  	if (ret < 0)
>  		return ret;
>  
> -	ret = ioctl(rctx->write_fd, BTRFS_IOC_ENCODED_WRITE, &encoded);
> -	if (ret < 0) {
> -		ret = -errno;
> -		error("encoded_write: writing to %s failed: %m", path);
> -		return ret;
> +	if (encoded_write) {
> +		ret = ioctl(rctx->write_fd, BTRFS_IOC_ENCODED_WRITE, &encoded);
> +		if (ret >= 0)
> +			return 0;
> +		/* Fall back for these errors, fail hard for anything else. */
> +		if (errno != ENOSPC && errno != EOPNOTSUPP && errno != EINVAL) {

Just caught something that I missed in the conversion: this check needs
to be ENOTTY instead of EOPNOTSUPP, since a kernel without the encoded
write ioctl fails it with ENOTTY.
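
For illustration, the corrected check could be written as a small
predicate like the one below. This is only a sketch: the helper name
encoded_write_should_fall_back() is hypothetical and does not appear in
the patch, which open-codes the comparison in process_encoded_write().
The set of errno values is the one from the quoted hunk with EOPNOTSUPP
replaced by ENOTTY.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: decide whether a failed
 * BTRFS_IOC_ENCODED_WRITE should fall back to decompressing the data in
 * userspace and doing a normal write. Any other error fails hard.
 */
static bool encoded_write_should_fall_back(int err)
{
	switch (err) {
	case ENOSPC:	/* not enough contiguous space for the encoded extent */
	case ENOTTY:	/* ioctl not recognized, so no encoded write support */
	case EINVAL:	/* listed in the patch; reason not stated in this excerpt */
		return true;
	default:
		return false;
	}
}
```

The caller would pass errno right after the ioctl returns a negative
value, before any other call can overwrite it.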
