From: Josef Bacik <josef@toxicpanda.com>
To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 1/4] btrfs: fix missing delalloc new bit for new delalloc ranges
Date: Thu, 5 Nov 2020 13:29:12 -0500 [thread overview]
Message-ID: <196ad815-4fd4-6218-83a2-0c5f44ce3940@toxicpanda.com> (raw)
In-Reply-To: <7f215e2509aa557504a5d352ff4371f2e2606f59.1604486892.git.fdmanana@suse.com>
On 11/4/20 6:07 AM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
>
> When doing a buffered write, through one of the write family syscalls, we
> look for ranges which currenly don't have allocated extents and set the
> 'delalloc new' bit on them, so that we can report a correct number of used
> blocks to the stat(2) syscall until delalloc is flushed and ordered extents
> complete.
>
> However there are a few other places where we can do a buffered write
> against a range that is mapped to a hole (no extent allocated) and where
> we do not set the 'new delalloc' bit. Those places are:
>
> - Doing a memory mapped write against a hole;
>
> - Cloning an inline extent into a hole starting at file offset 0;
>
> - Calling btrfs_cont_expand() when the i_size of the file is not aligned
> to the sector size and is located in a hole. For example when cloning
> to a destination offset beyond eof.
>
> So after such cases, until the corresponding delalloc range is flushed and
> the respective ordered extents complete, we can report an incorrect number
> of blocks used through the stat(2) syscall.
>
> In some cases we can end up reporting 0 used blocks to stat(2), which is a
> particular bad value to report as it may mislead tools to think a file is
> completely sparse when its i_size is not zero, making them skip reading
> any data, an undesired consequence for tools such as archivers and other
> backup tools, as reported a long time ago in the following thread (and
> other past threads):
>
> https://lists.gnu.org/archive/html/bug-tar/2016-07/msg00001.html
>
> Example reproducer:
>
> $ cat reproducer.sh
> #!/bin/bash
>
> MNT=/mnt/sdi
> DEV=/dev/sdi
>
> mkfs.btrfs -f $DEV > /dev/null
> # mkfs.xfs -f $DEV > /dev/null
> # mkfs.ext4 -F $DEV > /dev/null
> # mkfs.f2fs -f $DEV > /dev/null
> mount $DEV $MNT
>
> xfs_io -f -c "truncate 64K" \
> -c "mmap -w 0 64K" \
> -c "mwrite -S 0xab 0 64K" \
> -c "munmap" \
> $MNT/foo
>
> blocks_used=$(stat -c %b $MNT/foo)
> echo "blocks used: $blocks_used"
>
> if [ $blocks_used -eq 0 ]; then
> echo "ERROR: blocks used is 0"
> fi
>
> umount $DEV
>
> $ ./reproducer.sh
> blocks used: 0
> ERROR: blocks used is 0
>
> So move the logic that decides to set the 'delalloc bit' bit into the
> function btrfs_set_extent_delalloc(), since that is what we use for all
> those missing cases as well as for the cases that currently work well.
>
> This change is also preparatory work for an upcoming patch that fixes
> other problems related to tracking and reporting the number of bytes used
> by an inode.
>
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Thanks,
Josef
next prev parent reply other threads:[~2020-11-05 18:29 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-04 11:07 [PATCH 0/4] btrfs: fix cases of stat(2) reporting incorrect number of used blocks fdmanana
2020-11-04 11:07 ` [PATCH 1/4] btrfs: fix missing delalloc new bit for new delalloc ranges fdmanana
2020-11-05 18:29 ` Josef Bacik [this message]
2020-11-04 11:07 ` [PATCH 2/4] btrfs: refactor btrfs_drop_extents() to make it easier to extend fdmanana
2020-11-05 18:39 ` Josef Bacik
2020-11-04 11:07 ` [PATCH 3/4] btrfs: fix race when defragging that leads to unnecessary IO fdmanana
2020-11-05 18:44 ` Josef Bacik
2020-11-04 11:07 ` [PATCH 4/4] btrfs: update the number of bytes used by an inode atomically fdmanana
2020-11-05 19:24 ` Josef Bacik
2020-11-09 10:34 ` Nikolay Borisov
2020-11-09 11:10 ` Filipe Manana
2020-11-10 21:46 ` [PATCH 0/4] btrfs: fix cases of stat(2) reporting incorrect number of used blocks David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=196ad815-4fd4-6218-83a2-0c5f44ce3940@toxicpanda.com \
--to=josef@toxicpanda.com \
--cc=fdmanana@kernel.org \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox