linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Lukáš Czerner" <lczerner@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: tytso@mit.edu, linux-ext4@vger.kernel.org
Subject: Re: [PATCH 16/37] libext2fs: support allocating uninit blocks in bmap2()
Date: Tue, 6 May 2014 17:45:01 +0200 (CEST)	[thread overview]
Message-ID: <alpine.LFD.2.00.1405061736390.2255@localhost.localdomain> (raw)
In-Reply-To: <20140501231407.31890.3876.stgit@birch.djwong.org>

On Thu, 1 May 2014, Darrick J. Wong wrote:

> Date: Thu, 01 May 2014 16:14:07 -0700
> From: Darrick J. Wong <darrick.wong@oracle.com>
> To: tytso@mit.edu, darrick.wong@oracle.com
> Cc: linux-ext4@vger.kernel.org
> Subject: [PATCH 16/37] libext2fs: support allocating uninit blocks in bmap2()
> 
> In order to support fallocate, we need to be able to have
> ext2fs_bmap2() allocate blocks and put them into uninitialized
> extents.  There's a flag to do this in the extent code, but it's not
> exposed to the bmap2 interface, so plumb that in.  Eventually fuse2fs
> or somebody will use it.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  lib/ext2fs/bmap.c      |   24 ++++++++++++++++++++++--
>  lib/ext2fs/ext2fs.h    |    1 +
>  lib/ext2fs/mkjournal.c |   17 +++++++++++++++++
>  3 files changed, 40 insertions(+), 2 deletions(-)
> 
> 
> diff --git a/lib/ext2fs/bmap.c b/lib/ext2fs/bmap.c
> index c1d0e6f..a4dc8ef 100644
> --- a/lib/ext2fs/bmap.c
> +++ b/lib/ext2fs/bmap.c
> @@ -72,6 +72,11 @@ static _BMAP_INLINE_ errcode_t block_ind_bmap(ext2_filsys fs, int flags,
>  					    block_buf + fs->blocksize, &b);
>  		if (retval)
>  			return retval;
> +		if (flags & BMAP_UNINIT) {
> +			retval = ext2fs_zero_blocks2(fs, b, 1, NULL, NULL);
> +			if (retval)
> +				return retval;
> +		}
>  
>  #ifdef WORDS_BIGENDIAN
>  		((blk_t *) block_buf)[nr] = ext2fs_swab32(b);
> @@ -214,10 +219,13 @@ static errcode_t extent_bmap(ext2_filsys fs, ext2_ino_t ino,
>  	errcode_t		retval = 0;
>  	blk64_t			blk64 = 0;
>  	int			alloc = 0;
> +	int			set_flags;
> +
> +	set_flags = bmap_flags & BMAP_UNINIT ? EXT2_EXTENT_SET_BMAP_UNINIT : 0;
>  
>  	if (bmap_flags & BMAP_SET) {
>  		retval = ext2fs_extent_set_bmap(handle, block,
> -						*phys_blk, 0);
> +						*phys_blk, set_flags);
>  		return retval;
>  	}
>  	retval = ext2fs_extent_goto(handle, block);
> @@ -254,7 +262,7 @@ got_block:
>  		alloc++;
>  	set_extent:
>  		retval = ext2fs_extent_set_bmap(handle, block,
> -						blk64, 0);
> +						blk64, set_flags);
>  		if (retval) {
>  			ext2fs_block_alloc_stats2(fs, blk64, -1);
>  			return retval;
> @@ -345,6 +353,12 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
>  		goto done;
>  	}
>  
> +	if ((bmap_flags & BMAP_SET) && (bmap_flags & BMAP_UNINIT)) {
> +		retval = ext2fs_zero_blocks2(fs, *phys_blk, 1, NULL, NULL);
> +		if (retval)
> +			goto done;
> +	}
> +
>  	if (block < EXT2_NDIR_BLOCKS) {
>  		if (bmap_flags & BMAP_SET) {
>  			b = *phys_blk;
> @@ -360,6 +374,12 @@ errcode_t ext2fs_bmap2(ext2_filsys fs, ext2_ino_t ino, struct ext2_inode *inode,
>  			retval = ext2fs_alloc_block(fs, b, block_buf, &b);
>  			if (retval)
>  				goto done;
> +			if (bmap_flags & BMAP_UNINIT) {
> +				retval = ext2fs_zero_blocks2(fs, b, 1, NULL,
> +							     NULL);
> +				if (retval)
> +					goto done;
> +			}
>  			inode_bmap(inode, block) = b;
>  			blocks_alloc++;
>  			*phys_blk = b;
> diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
> index 599c972..819a14a 100644
> --- a/lib/ext2fs/ext2fs.h
> +++ b/lib/ext2fs/ext2fs.h
> @@ -527,6 +527,7 @@ typedef struct ext2_icount *ext2_icount_t;
>   */
>  #define BMAP_ALLOC	0x0001
>  #define BMAP_SET	0x0002
> +#define BMAP_UNINIT	0x0004
>  
>  /*
>   * Returned flags from ext2fs_bmap
> diff --git a/lib/ext2fs/mkjournal.c b/lib/ext2fs/mkjournal.c
> index 884d9c0..ecc3912 100644
> --- a/lib/ext2fs/mkjournal.c
> +++ b/lib/ext2fs/mkjournal.c
> @@ -174,6 +174,23 @@ errcode_t ext2fs_zero_blocks2(ext2_filsys fs, blk64_t blk, int num,
>  			return ENOMEM;
>  		memset(buf, 0, fs->blocksize * STRIDE_LENGTH);
>  	}
> +
> +	/* Try discard, if it zeroes data... */
> +	if (io_channel_discard_zeroes_data(fs->io)) {
> +		memset(buf + fs->blocksize, 0, fs->blocksize);
> +		retval = io_channel_discard(fs->io, blk, num);
> +		if (retval)
> +			goto skip_discard;
> +		retval = io_channel_read_blk64(fs->io, blk, 1, buf);
> +		if (retval)
> +			goto skip_discard;
> +		if (memcmp(buf, buf + fs->blocksize, fs->blocksize) == 0)
> +			return 0;
> +		/* Hah!  Discard doesn't zero! */
> +		fs->io->flags &= ~CHANNEL_FLAGS_DISCARD_ZEROES;
> +	}
> +skip_discard:

You did not mention that in the description, but this is actually a
problem. The reason is that discard might not be reliable on some
devices. This has been discussed several times and I am not the only
one who've seen that even if the device itself says that it will
return zeroes from discarded regions sometimes it might return data.

I would rather avoid this kind of optimization. However if the
underlying "device" is a loop device then it will be reliable if
it's supported. Also if then underlying "device" is a image then we
can just simply use punch hole.

Thanks!
-Lukas

> +
>  	/* OK, do the write loop */
>  	j=0;
>  	while (j < num) {
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

  reply	other threads:[~2014-05-06 15:45 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-01 23:12 [PATCH 00/37] e2fsprogs patchbomb 5/14 Darrick J. Wong
2014-05-01 23:12 ` [PATCH 01/37] misc: create better-packaged static analysis reports Darrick J. Wong
2014-05-11 22:33   ` Theodore Ts'o
2014-05-01 23:12 ` [PATCH 02/37] misc: coverity fixes Darrick J. Wong
2014-05-02 11:17   ` Lukáš Czerner
2014-05-05 20:04     ` Darrick J. Wong
2014-05-11 22:40       ` Theodore Ts'o
2014-05-01 23:12 ` [PATCH 03/37] libext2fs: create sockets when populating filesystem Darrick J. Wong
2014-05-02 11:22   ` Lukáš Czerner
2014-05-05 20:08     ` Darrick J. Wong
2014-05-11 22:44       ` Theodore Ts'o
2014-05-01 23:12 ` [PATCH 04/37] mke2fs: always warn if 128-byte inode and inline_data Darrick J. Wong
2014-05-02 11:27   ` Lukáš Czerner
2014-05-05 20:10     ` Darrick J. Wong
2014-05-12  0:26       ` Theodore Ts'o
2014-05-01 23:12 ` [PATCH 05/37] debugfs: teach logdump to deal with 64bit revoke tables Darrick J. Wong
2014-05-02 11:38   ` Lukáš Czerner
2014-05-05 22:23     ` Darrick J. Wong
2014-05-06 11:35       ` Lukáš Czerner
2014-05-12  1:20         ` Theodore Ts'o
2014-05-01 23:13 ` [PATCH 06/37] debugfs: force logdump to display (old) journal contents Darrick J. Wong
2014-05-02 11:49   ` Lukáš Czerner
2014-05-06  0:24     ` Darrick J. Wong
2014-05-12  1:41       ` Theodore Ts'o
2014-05-12  3:31         ` Theodore Ts'o
2014-05-14  0:05         ` Darrick J. Wong
2014-05-01 23:13 ` [PATCH 07/37] resize2fs: fix check for collision between old GDT and superblock on sparse_super2 fs Darrick J. Wong
2014-05-12  3:35   ` Theodore Ts'o
2014-05-01 23:13 ` [PATCH 08/37] mke2fs: set gdt csum when creating packed fs Darrick J. Wong
2014-05-02 11:55   ` Lukáš Czerner
2014-05-12  4:22     ` Theodore Ts'o
2014-05-01 23:13 ` [PATCH 09/37] mke2fs: set error behavior at initialization time Darrick J. Wong
2014-05-02 12:13   ` Lukáš Czerner
2014-05-01 23:13 ` [PATCH 10/37] e2fsck: verify checksums after checking everything else Darrick J. Wong
2014-05-02 12:32   ` Lukáš Czerner
2014-05-05 22:56     ` Darrick J. Wong
2014-05-06 11:32       ` Lukáš Czerner
2014-05-08  0:05         ` Darrick J. Wong
2014-05-01 23:13 ` [PATCH 11/37] e2fsck: fix the extended attribute checksum error message Darrick J. Wong
2014-05-02 12:46   ` Lukáš Czerner
2014-05-05 23:08     ` Darrick J. Wong
2014-05-06 10:12       ` Lukáš Czerner
2014-05-01 23:13 ` [PATCH 12/37] e2fsck: insert a missing dirent tail for checksums if possible Darrick J. Wong
2014-05-02 12:54   ` Lukáš Czerner
2014-05-05 23:16     ` Darrick J. Wong
2014-05-01 23:13 ` [PATCH 13/37] e2fsck: write dir blocks after new inode when reconstructing root/lost+found Darrick J. Wong
2014-05-05 17:13   ` Lukáš Czerner
2014-05-01 23:13 ` [PATCH 14/37] dumpe2fs: add switch to disable checksum verification Darrick J. Wong
2014-05-05 17:20   ` Lukáš Czerner
2014-05-01 23:14 ` [PATCH 15/37] mke2fs: set block_validity as a default mount option Darrick J. Wong
2014-05-05 17:24   ` Lukáš Czerner
2014-05-01 23:14 ` [PATCH 16/37] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
2014-05-06 15:45   ` Lukáš Czerner [this message]
2014-05-06 19:59     ` Darrick J. Wong
2014-05-07 10:02       ` Lukáš Czerner
2014-05-07 21:37         ` Darrick J. Wong
2014-05-08  0:13           ` [PATCH 1/2] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks Darrick J. Wong
2014-05-13 11:11             ` Lukáš Czerner
2014-05-08  0:14           ` [PATCH 2/2] libext2fs: support allocating uninit blocks in bmap2() Darrick J. Wong
2014-05-27 16:28             ` Lukáš Czerner
2014-05-28 19:48               ` Darrick J. Wong
2014-05-01 23:14 ` [PATCH 17/37] libext2fs: file IO routines should handle uninit blocks Darrick J. Wong
2014-05-01 23:14 ` [PATCH 18/37] resize2fs: convert fs to and from 64bit mode Darrick J. Wong
2014-05-01 23:14 ` [PATCH 19/37] resize2fs: when toggling 64bit, don't free in-use bg data clusters Darrick J. Wong
2014-05-01 23:14 ` [PATCH 20/37] resize2fs: adjust reserved_gdt_blocks when changing group descriptor size Darrick J. Wong
2014-05-01 23:14 ` [PATCH 21/37] libext2fs: have UNIX IO manager use pread/pwrite Darrick J. Wong
2014-08-02 23:16   ` Theodore Ts'o
2014-05-01 23:14 ` [PATCH 22/37] ext2fs: add readahead method to improve scanning Darrick J. Wong
2014-05-01 23:14 ` [PATCH 23/37] e2fsck: provide routines to read-ahead metadata Darrick J. Wong
2014-05-01 23:14 ` [PATCH 24/37] e2fsck: read-ahead metadata during passes 1, 2, and 4 Darrick J. Wong
2014-07-28 22:25   ` Darrick J. Wong
2014-05-01 23:15 ` [PATCH 25/37] libext2fs: when appending to a file, don't split an index block in equal halves Darrick J. Wong
2014-08-02 23:43   ` Theodore Ts'o
2014-05-01 23:15 ` [PATCH 26/37] libext2fs: find inode goal when allocating blocks Darrick J. Wong
2014-05-01 23:15 ` [PATCH 27/37] libext2fs: find a range of empty blocks Darrick J. Wong
2014-05-01 23:15 ` [PATCH 28/37] libext2fs: provide a function to set inode size Darrick J. Wong
2014-07-26 18:37   ` Theodore Ts'o
2014-05-01 23:15 ` [PATCH 29/37] libext2fs: implement fallocate Darrick J. Wong
2014-05-01 23:15 ` [PATCH 31/37] fuse2fs: translate ACL structures Darrick J. Wong
2014-05-01 23:15 ` [PATCH 32/37] fuse2fs: handle 64-bit dates correctly Darrick J. Wong
2014-05-01 23:16 ` [PATCH 33/37] fuse2fs: implement fallocate Darrick J. Wong
2014-05-01 23:16 ` [PATCH 35/37] tests: enable using fuse2fs with metadata checksum test Darrick J. Wong
2014-05-01 23:16 ` [PATCH 36/37] tests: test date handling Darrick J. Wong
2014-05-01 23:16 ` [PATCH 37/37] ext5: define new subtype to add features and reduce testing complexity Darrick J. Wong
2014-05-02  9:45   ` Lukáš Czerner
2014-05-02 14:04     ` Theodore Ts'o
2014-05-06  1:59       ` Darrick J. Wong
2014-05-06  1:33     ` Darrick J. Wong
2014-05-06 12:50       ` Lukáš Czerner
2014-05-06 15:21         ` Theodore Ts'o
2014-05-06 15:30           ` Lukáš Czerner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.00.1405061736390.2255@localhost.localdomain \
    --to=lczerner@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).