* [PATCH 0/3] btrfs: error path fixes for btrfs_link() and a cleanup
@ 2025-07-28 9:39 fdmanana
2025-07-28 9:39 ` [PATCH 1/3] btrfs: abort transaction on failure to add link to inode fdmanana
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: fdmanana @ 2025-07-28 9:39 UTC (permalink / raw)
To: linux-btrfs
From: Filipe Manana <fdmanana@suse.com>
Fix a couple bugs in the error paths for btrfs_link() and a cleanup to make
it more simple.
Filipe Manana (3):
btrfs: abort transaction on failure to add link to inode
btrfs: fix inode leak on failure to add link to inode
btrfs: simplify error handling logic for btrfs_link()
fs/btrfs/inode.c | 43 +++++++++++++++++++++----------------------
1 file changed, 21 insertions(+), 22 deletions(-)
--
2.47.2
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH 1/3] btrfs: abort transaction on failure to add link to inode 2025-07-28 9:39 [PATCH 0/3] btrfs: error path fixes for btrfs_link() and a cleanup fdmanana @ 2025-07-28 9:39 ` fdmanana 2025-07-28 14:42 ` Johannes Thumshirn 2025-07-28 9:39 ` [PATCH 2/3] btrfs: fix inode leak " fdmanana 2025-07-28 9:39 ` [PATCH 3/3] btrfs: simplify error handling logic for btrfs_link() fdmanana 2 siblings, 1 reply; 9+ messages in thread From: fdmanana @ 2025-07-28 9:39 UTC (permalink / raw) To: linux-btrfs From: Filipe Manana <fdmanana@suse.com> If we fail to update the inode or delete the orphan item, we must abort the transaction to prevent persisting an inconsistent state. For example if we fail to update the inode item, we have the inconsistency of having a persisted inode item with a link count of N but we have N + 1 inode ref items and N + 1 directory entries pointing to our inode in case the transaction gets committed. Signed-off-by: Filipe Manana <fdmanana@suse.com> --- fs/btrfs/inode.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 6d9a8d8bea4c..f2b43b1e90de 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -6845,16 +6845,20 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *parent = dentry->d_parent; ret = btrfs_update_inode(trans, BTRFS_I(inode)); - if (ret) + if (ret) { + btrfs_abort_transaction(trans, ret); goto fail; + } if (inode->i_nlink == 1) { /* * If new hard link count is 1, it's a file created * with open(2) O_TMPFILE flag. */ ret = btrfs_orphan_del(trans, BTRFS_I(inode)); - if (ret) + if (ret) { + btrfs_abort_transaction(trans, ret); goto fail; + } } d_instantiate(dentry, inode); btrfs_log_new_name(trans, old_dentry, NULL, 0, parent); -- 2.47.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/3] btrfs: abort transaction on failure to add link to inode 2025-07-28 9:39 ` [PATCH 1/3] btrfs: abort transaction on failure to add link to inode fdmanana @ 2025-07-28 14:42 ` Johannes Thumshirn 0 siblings, 0 replies; 9+ messages in thread From: Johannes Thumshirn @ 2025-07-28 14:42 UTC (permalink / raw) To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org Looks good, Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/3] btrfs: fix inode leak on failure to add link to inode 2025-07-28 9:39 [PATCH 0/3] btrfs: error path fixes for btrfs_link() and a cleanup fdmanana 2025-07-28 9:39 ` [PATCH 1/3] btrfs: abort transaction on failure to add link to inode fdmanana @ 2025-07-28 9:39 ` fdmanana 2025-07-28 14:43 ` Johannes Thumshirn 2025-07-28 9:39 ` [PATCH 3/3] btrfs: simplify error handling logic for btrfs_link() fdmanana 2 siblings, 1 reply; 9+ messages in thread From: fdmanana @ 2025-07-28 9:39 UTC (permalink / raw) To: linux-btrfs From: Filipe Manana <fdmanana@suse.com> If we fail to update the inode or delete the orphan item we leak the inode since we update its refcount with the ihold() call to account for the d_instantiate() call which never happens in case we fail those steps. Fix this by setting 'drop_inode' to true in case we fail those steps. Signed-off-by: Filipe Manana <fdmanana@suse.com> --- fs/btrfs/inode.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index f2b43b1e90de..46675783578e 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -6847,6 +6847,7 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir, ret = btrfs_update_inode(trans, BTRFS_I(inode)); if (ret) { btrfs_abort_transaction(trans, ret); + drop_inode = 1; goto fail; } if (inode->i_nlink == 1) { @@ -6857,6 +6858,7 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir, ret = btrfs_orphan_del(trans, BTRFS_I(inode)); if (ret) { btrfs_abort_transaction(trans, ret); + drop_inode = 1; goto fail; } } -- 2.47.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] btrfs: fix inode leak on failure to add link to inode 2025-07-28 9:39 ` [PATCH 2/3] btrfs: fix inode leak " fdmanana @ 2025-07-28 14:43 ` Johannes Thumshirn 0 siblings, 0 replies; 9+ messages in thread From: Johannes Thumshirn @ 2025-07-28 14:43 UTC (permalink / raw) To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org Looks good, Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 3/3] btrfs: simplify error handling logic for btrfs_link() 2025-07-28 9:39 [PATCH 0/3] btrfs: error path fixes for btrfs_link() and a cleanup fdmanana 2025-07-28 9:39 ` [PATCH 1/3] btrfs: abort transaction on failure to add link to inode fdmanana 2025-07-28 9:39 ` [PATCH 2/3] btrfs: fix inode leak " fdmanana @ 2025-07-28 9:39 ` fdmanana 2025-07-28 14:42 ` Johannes Thumshirn 2 siblings, 1 reply; 9+ messages in thread From: fdmanana @ 2025-07-28 9:39 UTC (permalink / raw) To: linux-btrfs From: Filipe Manana <fdmanana@suse.com> Instead of incrementing the inode's link count and refcount early before adding the link, updating the inode and deleting orphan item, do it after all those steps succeeded right before calling d_instantiate(). This makes the error handling logic simpler by avoiding the need for the 'drop_inode' variable to signal if we need to undo the link count increment and the inode refcount increase under the 'fail' label. This also reduces the level of indentation by one, making the code easier to read. Signed-off-by: Filipe Manana <fdmanana@suse.com> --- fs/btrfs/inode.c | 45 +++++++++++++++++++-------------------------- 1 file changed, 19 insertions(+), 26 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 46675783578e..af47cbeadeaa 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -6798,7 +6798,6 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir, struct fscrypt_name fname; u64 index; int ret; - int drop_inode = 0; /* do not allow sys_link's with other subvols of the same device */ if (btrfs_root_id(root) != btrfs_root_id(BTRFS_I(inode)->root)) @@ -6830,50 +6829,44 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir, /* There are several dir indexes for this inode, clear the cache. */ BTRFS_I(inode)->dir_index = 0ULL; - inc_nlink(inode); inode_inc_iversion(inode); inode_set_ctime_current(inode); - ihold(inode); set_bit(BTRFS_INODE_COPY_EVERYTHING, &BTRFS_I(inode)->runtime_flags); ret = btrfs_add_link(trans, BTRFS_I(dir), BTRFS_I(inode), &fname.disk_name, 1, index); + if (ret) + goto fail; + /* Link added now we update the inode item with the new link count. */ + inc_nlink(inode); + ret = btrfs_update_inode(trans, BTRFS_I(inode)); if (ret) { - drop_inode = 1; - } else { - struct dentry *parent = dentry->d_parent; + btrfs_abort_transaction(trans, ret); + goto fail; + } - ret = btrfs_update_inode(trans, BTRFS_I(inode)); + if (inode->i_nlink == 1) { + /* + * If the new hard link count is 1, it's a file created with the + * open(2) O_TMPFILE flag. + */ + ret = btrfs_orphan_del(trans, BTRFS_I(inode)); if (ret) { btrfs_abort_transaction(trans, ret); - drop_inode = 1; goto fail; } - if (inode->i_nlink == 1) { - /* - * If new hard link count is 1, it's a file created - * with open(2) O_TMPFILE flag. - */ - ret = btrfs_orphan_del(trans, BTRFS_I(inode)); - if (ret) { - btrfs_abort_transaction(trans, ret); - drop_inode = 1; - goto fail; - } - } - d_instantiate(dentry, inode); - btrfs_log_new_name(trans, old_dentry, NULL, 0, parent); } + /* Grab reference for the new dentry passed to d_instantiate(). */ + ihold(inode); + d_instantiate(dentry, inode); + btrfs_log_new_name(trans, old_dentry, NULL, 0, dentry->d_parent); + fail: fscrypt_free_filename(&fname); if (trans) btrfs_end_transaction(trans); - if (drop_inode) { - inode_dec_link_count(inode); - iput(inode); - } btrfs_btree_balance_dirty(fs_info); return ret; } -- 2.47.2 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] btrfs: simplify error handling logic for btrfs_link() 2025-07-28 9:39 ` [PATCH 3/3] btrfs: simplify error handling logic for btrfs_link() fdmanana @ 2025-07-28 14:42 ` Johannes Thumshirn 2025-07-28 16:43 ` Filipe Manana 0 siblings, 1 reply; 9+ messages in thread From: Johannes Thumshirn @ 2025-07-28 14:42 UTC (permalink / raw) To: fdmanana@kernel.org, linux-btrfs@vger.kernel.org On 7/28/25 11:40 AM, fdmanana@kernel.org wrote: > + /* Link added now we update the inode item with the new link count. */ > + inc_nlink(inode); > + ret = btrfs_update_inode(trans, BTRFS_I(inode)); > if (ret) { > - drop_inode = 1; > - } else { > - struct dentry *parent = dentry->d_parent; > + btrfs_abort_transaction(trans, ret); > + goto fail; > + } Don't we need to call "inode_dec_link_count(inode);" here? It used to be in the old "drop_inode" case but it's gone now. And in the commit message you state: [...]This makes the error handling logic simpler by avoiding the need for the 'drop_inode' variable to signal if we need to undo the link count increment and the inode refcount increase under the 'fail' label.[...] So the "undo inode refcount increase" part is gone now. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] btrfs: simplify error handling logic for btrfs_link() 2025-07-28 14:42 ` Johannes Thumshirn @ 2025-07-28 16:43 ` Filipe Manana 2025-07-29 5:48 ` Johannes Thumshirn 0 siblings, 1 reply; 9+ messages in thread From: Filipe Manana @ 2025-07-28 16:43 UTC (permalink / raw) To: Johannes Thumshirn; +Cc: linux-btrfs@vger.kernel.org On Mon, Jul 28, 2025 at 3:42 PM Johannes Thumshirn <Johannes.Thumshirn@wdc.com> wrote: > > On 7/28/25 11:40 AM, fdmanana@kernel.org wrote: > > + /* Link added now we update the inode item with the new link count. */ > > + inc_nlink(inode); > > + ret = btrfs_update_inode(trans, BTRFS_I(inode)); > > if (ret) { > > - drop_inode = 1; > > - } else { > > - struct dentry *parent = dentry->d_parent; > > + btrfs_abort_transaction(trans, ret); > > + goto fail; > > + } > > > Don't we need to call "inode_dec_link_count(inode);" here? It used to be > in the old "drop_inode" case but it's gone now. And in the commit > message you state: > > [...]This makes the error handling logic simpler by avoiding the need for the 'drop_inode' > variable to signal if we need to undo the link count increment and the inode refcount > increase under the 'fail' label.[...] > > So the "undo inode refcount increase" part is gone now. Yes it is gone, because if we fail to update the inode, it means we have added the new link so it's more correct to leave the incremented link count, since the inode has a new link. In the end it doesn't matter much because we abort the transaction, so the new link is never persisted (and neither the incremented link count). Thanks. > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] btrfs: simplify error handling logic for btrfs_link() 2025-07-28 16:43 ` Filipe Manana @ 2025-07-29 5:48 ` Johannes Thumshirn 0 siblings, 0 replies; 9+ messages in thread From: Johannes Thumshirn @ 2025-07-29 5:48 UTC (permalink / raw) To: Filipe Manana; +Cc: linux-btrfs@vger.kernel.org On 7/28/25 6:44 PM, Filipe Manana wrote: > On Mon, Jul 28, 2025 at 3:42 PM Johannes Thumshirn > <Johannes.Thumshirn@wdc.com> wrote: >> On 7/28/25 11:40 AM, fdmanana@kernel.org wrote: >>> + /* Link added now we update the inode item with the new link count. */ >>> + inc_nlink(inode); >>> + ret = btrfs_update_inode(trans, BTRFS_I(inode)); >>> if (ret) { >>> - drop_inode = 1; >>> - } else { >>> - struct dentry *parent = dentry->d_parent; >>> + btrfs_abort_transaction(trans, ret); >>> + goto fail; >>> + } >> >> Don't we need to call "inode_dec_link_count(inode);" here? It used to be >> in the old "drop_inode" case but it's gone now. And in the commit >> message you state: >> >> [...]This makes the error handling logic simpler by avoiding the need for the 'drop_inode' >> variable to signal if we need to undo the link count increment and the inode refcount >> increase under the 'fail' label.[...] >> >> So the "undo inode refcount increase" part is gone now. > Yes it is gone, because if we fail to update the inode, it means we > have added the new link so it's more correct to leave the incremented > link count, since the inode has a new link. > > In the end it doesn't matter much because we abort the transaction, so > the new link is never persisted (and neither the incremented link > count). > > Thanks. > Ah OK that explains it then thanks. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-07-29 5:48 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-28 9:39 [PATCH 0/3] btrfs: error path fixes for btrfs_link() and a cleanup fdmanana 2025-07-28 9:39 ` [PATCH 1/3] btrfs: abort transaction on failure to add link to inode fdmanana 2025-07-28 14:42 ` Johannes Thumshirn 2025-07-28 9:39 ` [PATCH 2/3] btrfs: fix inode leak " fdmanana 2025-07-28 14:43 ` Johannes Thumshirn 2025-07-28 9:39 ` [PATCH 3/3] btrfs: simplify error handling logic for btrfs_link() fdmanana 2025-07-28 14:42 ` Johannes Thumshirn 2025-07-28 16:43 ` Filipe Manana 2025-07-29 5:48 ` Johannes Thumshirn
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.