Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 1/4] btrfs: fix missing delalloc new bit for new delalloc ranges
Date: Thu, 5 Nov 2020 13:29:12 -0500	[thread overview]
Message-ID: <196ad815-4fd4-6218-83a2-0c5f44ce3940@toxicpanda.com> (raw)
In-Reply-To: <7f215e2509aa557504a5d352ff4371f2e2606f59.1604486892.git.fdmanana@suse.com>

On 11/4/20 6:07 AM, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> When doing a buffered write, through one of the write family syscalls, we
> look for ranges which currenly don't have allocated extents and set the
> 'delalloc new' bit on them, so that we can report a correct number of used
> blocks to the stat(2) syscall until delalloc is flushed and ordered extents
> complete.
> 
> However there are a few other places where we can do a buffered write
> against a range that is mapped to a hole (no extent allocated) and where
> we do not set the 'new delalloc' bit. Those places are:
> 
> - Doing a memory mapped write against a hole;
> 
> - Cloning an inline extent into a hole starting at file offset 0;
> 
> - Calling btrfs_cont_expand() when the i_size of the file is not aligned
>    to the sector size and is located in a hole. For example when cloning
>    to a destination offset beyond eof.
> 
> So after such cases, until the corresponding delalloc range is flushed and
> the respective ordered extents complete, we can report an incorrect number
> of blocks used through the stat(2) syscall.
> 
> In some cases we can end up reporting 0 used blocks to stat(2), which is a
> particular bad value to report as it may mislead tools to think a file is
> completely sparse when its i_size is not zero, making them skip reading
> any data, an undesired consequence for tools such as archivers and other
> backup tools, as reported a long time ago in the following thread (and
> other past threads):
> 
>    https://lists.gnu.org/archive/html/bug-tar/2016-07/msg00001.html
> 
> Example reproducer:
> 
>    $ cat reproducer.sh
>    #!/bin/bash
> 
>    MNT=/mnt/sdi
>    DEV=/dev/sdi
> 
>    mkfs.btrfs -f $DEV > /dev/null
>    # mkfs.xfs -f $DEV > /dev/null
>    # mkfs.ext4 -F $DEV > /dev/null
>    # mkfs.f2fs -f $DEV > /dev/null
>    mount $DEV $MNT
> 
>    xfs_io -f -c "truncate 64K"   \
>        -c "mmap -w 0 64K"        \
>        -c "mwrite -S 0xab 0 64K" \
>        -c "munmap"               \
>        $MNT/foo
> 
>    blocks_used=$(stat -c %b $MNT/foo)
>    echo "blocks used: $blocks_used"
> 
>    if [ $blocks_used -eq 0 ]; then
>        echo "ERROR: blocks used is 0"
>    fi
> 
>    umount $DEV
> 
>    $ ./reproducer.sh
>    blocks used: 0
>    ERROR: blocks used is 0
> 
> So move the logic that decides to set the 'delalloc bit' bit into the
> function btrfs_set_extent_delalloc(), since that is what we use for all
> those missing cases as well as for the cases that currently work well.
> 
> This change is also preparatory work for an upcoming patch that fixes
> other problems related to tracking and reporting the number of bytes used
> by an inode.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef

  reply	other threads:[~2020-11-05 18:29 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-04 11:07 [PATCH 0/4] btrfs: fix cases of stat(2) reporting incorrect number of used blocks fdmanana
2020-11-04 11:07 ` [PATCH 1/4] btrfs: fix missing delalloc new bit for new delalloc ranges fdmanana
2020-11-05 18:29   ` Josef Bacik [this message]
2020-11-04 11:07 ` [PATCH 2/4] btrfs: refactor btrfs_drop_extents() to make it easier to extend fdmanana
2020-11-05 18:39   ` Josef Bacik
2020-11-04 11:07 ` [PATCH 3/4] btrfs: fix race when defragging that leads to unnecessary IO fdmanana
2020-11-05 18:44   ` Josef Bacik
2020-11-04 11:07 ` [PATCH 4/4] btrfs: update the number of bytes used by an inode atomically fdmanana
2020-11-05 19:24   ` Josef Bacik
2020-11-09 10:34   ` Nikolay Borisov
2020-11-09 11:10     ` Filipe Manana
2020-11-10 21:46 ` [PATCH 0/4] btrfs: fix cases of stat(2) reporting incorrect number of used blocks David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=196ad815-4fd4-6218-83a2-0c5f44ce3940@toxicpanda.com \
    --to=josef@toxicpanda.com \
    --cc=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox