From: Boris Burkov <boris@bur.io>
To: Filipe Manana <fdmanana@kernel.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/3] btrfs: add missing inode updates on each iteration when replacing extents
Date: Tue, 7 Jun 2022 09:41:40 -0700
Message-ID: <Yp9/xIkX137VLByJ@zen>
In-Reply-To: <20220607093139.GA3554947@falcondesktop>

On Tue, Jun 07, 2022 at 10:31:39AM +0100, Filipe Manana wrote:
> On Mon, Jun 06, 2022 at 03:11:52PM -0700, Boris Burkov wrote:
> > On Mon, Jun 06, 2022 at 10:41:18AM +0100, fdmanana@kernel.org wrote:
> > > From: Filipe Manana <fdmanana@suse.com>
> > >
> > > When replacing file extents, called during fallocate, hole punching,
> > > clone and deduplication, we may not be able to replace/drop all the
> > > target file extent items with a single transaction handle. We may get
> > > -ENOSPC while doing it, in which case we release the transaction handle,
> > > balance the dirty pages of the btree inode, flush delayed items and get
> > > a new transaction handle to operate on what's left of the target range.
> > >
> > > By dropping and replacing file extent items we have effectively modified
> >
> > How can you be sure that you definitely modified it? Is it possible for
> > btrfs_drop_extents to return ENOSPC without dropping extents?
>
> If btrfs_drop_extents() fails with -ENOSPC, it means it tried to modify
> more than one leaf. Since we reserved enough space for modifying one leaf,
> it can only fail if it tries to modify more leaves, and if it tries to do
> so, it means it dropped or trimmed file extent items from a leaf already.
>
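The argument above can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct model_range and model_drop_extents() are hypothetical names, and the assumption is that the reservation is counted in whole leaves.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: extent items of the target range spread over leaves. */
struct model_range {
	size_t leaves_left;	/* leaves that still hold items to drop */
};

/*
 * Drop items using a reservation that covers reserved_leaves leaf
 * modifications. Returns 0 once the range is empty, or -ENOSPC when the
 * reservation is exhausted. *modified reports how many leaves were changed.
 */
static int model_drop_extents(struct model_range *r, size_t reserved_leaves,
			      size_t *modified)
{
	*modified = 0;
	while (r->leaves_left > 0) {
		/* Touching another leaf would exceed what was reserved. */
		if (*modified == reserved_leaves)
			return -ENOSPC;
		r->leaves_left--;
		(*modified)++;
	}
	return 0;
}
```

With a reservation of one leaf, the model can only return -ENOSPC after it has already changed one leaf, which is the property the reply relies on.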
> >
> > > the inode, so we should bump its iversion and update its mtime/ctime
> > > before we update the inode item. This is because if the transaction
> > > we used for partially modifying the inode gets committed by someone after
> > > we release it and before we finish the rest of the range, and a power
> > > failure then happens, after mounting the filesystem our inode has an
> > > outdated iversion and mtime/ctime, corresponding to the values it had
> > > before we changed it.
> > >
> > > So add the missing iversion and mtime/ctime updates.
> > >
> > > Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: Boris Burkov <boris@bur.io>

> > > ---
> > > fs/btrfs/ctree.h | 2 ++
> > > fs/btrfs/file.c | 19 +++++++++++++++++++
> > > fs/btrfs/inode.c | 1 +
> > > fs/btrfs/reflink.c | 1 +
> > > 4 files changed, 23 insertions(+)
> > >
> > > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> > > index 55dee1564e90..737cd59d16b6 100644
> > > --- a/fs/btrfs/ctree.h
> > > +++ b/fs/btrfs/ctree.h
> > > @@ -1330,6 +1330,8 @@ struct btrfs_replace_extent_info {
> > >  	 * existing extent into a file range.
> > >  	 */
> > >  	bool is_new_extent;
> > > +	/* Indicate if we should update the inode's mtime and ctime. */
> > > +	bool update_times;
> > >  	/* Meaningful only if is_new_extent is true. */
> > >  	int qgroup_reserved;
> > >  	/*
> > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> > > index 1fd827b99c1b..29de433b7804 100644
> > > --- a/fs/btrfs/file.c
> > > +++ b/fs/btrfs/file.c
> > > @@ -2803,6 +2803,25 @@ int btrfs_replace_file_extents(struct btrfs_inode *inode,
> > >  			extent_info->file_offset += replace_len;
> > >  		}
> > >
> > > +		/*
> > > +		 * We are releasing our handle on the transaction, balance the
> > > +		 * dirty pages of the btree inode and flush delayed items, and
> > > +		 * then get a new transaction handle, which may now point to a
> > > +		 * new transaction in case someone else may have committed the
> > > +		 * transaction we used to replace/drop file extent items. So
> > > +		 * bump the inode's iversion and update mtime and ctime except
> > > +		 * if we are called from a dedupe context. This is because a
> > > +		 * power failure/crash may happen after the transaction is
> > > +		 * committed and before we finish replacing/dropping all the
> > > +		 * file extent items we need.
> > > +		 */
> > > +		inode_inc_iversion(&inode->vfs_inode);
> > > +
> > > +		if (!extent_info || extent_info->update_times) {
> > > +			inode->vfs_inode.i_mtime = current_time(&inode->vfs_inode);
> > > +			inode->vfs_inode.i_ctime = inode->vfs_inode.i_mtime;
> > > +		}
> > > +
> > >  		ret = btrfs_update_inode(trans, root, inode);
> > >  		if (ret)
> > >  			break;
> > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> > > index 3ede3e873c2a..ab4ebcb7878c 100644
> > > --- a/fs/btrfs/inode.c
> > > +++ b/fs/btrfs/inode.c
> > > @@ -9907,6 +9907,7 @@ static struct btrfs_trans_handle *insert_prealloc_file_extent(
> > >  	extent_info.file_offset = file_offset;
> > >  	extent_info.extent_buf = (char *)&stack_fi;
> > >  	extent_info.is_new_extent = true;
> > > +	extent_info.update_times = true;
> > >  	extent_info.qgroup_reserved = qgroup_released;
> > >  	extent_info.insertions = 0;
> > >
> > > diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
> > > index 7e3b0aa318c1..977e0d218d79 100644
> > > --- a/fs/btrfs/reflink.c
> > > +++ b/fs/btrfs/reflink.c
> > > @@ -497,6 +497,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode,
> > >  			clone_info.file_offset = new_key.offset;
> > >  			clone_info.extent_buf = buf;
> > >  			clone_info.is_new_extent = false;
> > > +			clone_info.update_times = !no_time_update;
> > >  			ret = btrfs_replace_file_extents(BTRFS_I(inode), path,
> > >  					drop_start, new_key.offset + datal - 1,
> > >  					&clone_info, &trans);
> > > --
> > > 2.35.1
> > >
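
To make the crash window from the changelog concrete, here is a rough userspace model of the loop. It is a sketch under stated assumptions and not the kernel implementation: struct model_inode and model_replace_extents() are hypothetical, each iteration is assumed to modify exactly one leaf before hitting -ENOSPC, and the behavior without the patch is assumed to be a single update after the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory inode state, also used as the "on disk" snapshot. */
struct model_inode {
	uint64_t iversion;
	uint64_t mtime;
	unsigned int leaves_left;	/* leaves still holding items to drop */
};

/*
 * Replace extents one leaf per transaction handle. The state written by the
 * inode update of the first iteration is copied to *committed_after_first,
 * standing in for a commit done by someone else after the handle is released.
 */
static void model_replace_extents(struct model_inode *inode,
				  bool update_each_iter, bool update_times,
				  uint64_t now,
				  struct model_inode *committed_after_first)
{
	bool first = true;

	*committed_after_first = *inode;
	while (inode->leaves_left > 0) {
		inode->leaves_left--;	/* partial drop, then -ENOSPC */
		if (update_each_iter) {
			/* What the patch adds before updating the inode item. */
			inode->iversion++;
			if (update_times)
				inode->mtime = now;
		}
		if (first) {
			*committed_after_first = *inode;
			first = false;
		}
	}
	if (!update_each_iter) {
		/* Assumed old behavior: update only once the range is done. */
		inode->iversion++;
		if (update_times)
			inode->mtime = now;
	}
}
```

With update_each_iter set to false, the snapshot taken after the first iteration already has fewer extent items but still carries the old iversion and mtime, which is the inconsistency a power failure would expose. With update_times set to false (the dedupe case), only iversion moves.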