stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Zygo Blaxell <ce3g8jdj@umail.furryterror.org>,
	Filipe Manana <fdmanana@suse.com>,
	David Sterba <dsterba@suse.com>
Subject: [PATCH 4.9 30/44] Btrfs: do not start a transaction at iterate_extent_inodes()
Date: Mon, 20 May 2019 14:14:19 +0200	[thread overview]
Message-ID: <20190520115234.796445803@linuxfoundation.org> (raw)
In-Reply-To: <20190520115230.720347034@linuxfoundation.org>

From: Filipe Manana <fdmanana@suse.com>

commit bfc61c36260ca990937539cd648ede3cd749bc10 upstream.

When finding out which inodes have references on a particular extent, done
by backref.c:iterate_extent_inodes(), from the BTRFS_IOC_LOGICAL_INO (both
v1 and v2) ioctl and from scrub we use the transaction join API to grab a
reference on the currently running transaction, since in order to give
accurate results we need to inspect the delayed references of the currently
running transaction.

However, if there is currently no running transaction, the join operation
will create a new transaction. This is inefficient as the transaction will
eventually be committed, doing unnecessary IO and introducing a potential
point of failure that will lead to a transaction abort due to -ENOSPC, as
recently reported [1].

That's because the join, creates the transaction but does not reserve any
space, so when attempting to update the root item of the root passed to
btrfs_join_transaction(), during the transaction commit, we can end up
failling with -ENOSPC. Users of a join operation are supposed to actually
do some filesystem changes and reserve space by some means, which is not
the case of iterate_extent_inodes(), it is a read-only operation for all
contextes from which it is called.

The reported [1] -ENOSPC failure stack trace is the following:

 heisenberg kernel: ------------[ cut here ]------------
 heisenberg kernel: BTRFS: Transaction aborted (error -28)
 heisenberg kernel: WARNING: CPU: 0 PID: 7137 at fs/btrfs/root-tree.c:136 btrfs_update_root+0x22b/0x320 [btrfs]
(...)
 heisenberg kernel: CPU: 0 PID: 7137 Comm: btrfs-transacti Not tainted 4.19.0-4-amd64 #1 Debian 4.19.28-2
 heisenberg kernel: Hardware name: FUJITSU LIFEBOOK U757/FJNB2A5, BIOS Version 1.21 03/19/2018
 heisenberg kernel: RIP: 0010:btrfs_update_root+0x22b/0x320 [btrfs]
(...)
 heisenberg kernel: RSP: 0018:ffffb5448828bd40 EFLAGS: 00010286
 heisenberg kernel: RAX: 0000000000000000 RBX: ffff8ed56bccef50 RCX: 0000000000000006
 heisenberg kernel: RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff8ed6bda166a0
 heisenberg kernel: RBP: 00000000ffffffe4 R08: 00000000000003df R09: 0000000000000007
 heisenberg kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff8ed63396a078
 heisenberg kernel: R13: ffff8ed092d7c800 R14: ffff8ed64f5db028 R15: ffff8ed6bd03d068
 heisenberg kernel: FS:  0000000000000000(0000) GS:ffff8ed6bda00000(0000) knlGS:0000000000000000
 heisenberg kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 heisenberg kernel: CR2: 00007f46f75f8000 CR3: 0000000310a0a002 CR4: 00000000003606f0
 heisenberg kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 heisenberg kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 heisenberg kernel: Call Trace:
 heisenberg kernel:  commit_fs_roots+0x166/0x1d0 [btrfs]
 heisenberg kernel:  ? _cond_resched+0x15/0x30
 heisenberg kernel:  ? btrfs_run_delayed_refs+0xac/0x180 [btrfs]
 heisenberg kernel:  btrfs_commit_transaction+0x2bd/0x870 [btrfs]
 heisenberg kernel:  ? start_transaction+0x9d/0x3f0 [btrfs]
 heisenberg kernel:  transaction_kthread+0x147/0x180 [btrfs]
 heisenberg kernel:  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
 heisenberg kernel:  kthread+0x112/0x130
 heisenberg kernel:  ? kthread_bind+0x30/0x30
 heisenberg kernel:  ret_from_fork+0x35/0x40
 heisenberg kernel: ---[ end trace 05de912e30e012d9 ]---

So fix that by using the attach API, which does not create a transaction
when there is currently no running transaction.

[1] https://lore.kernel.org/linux-btrfs/b2a668d7124f1d3e410367f587926f622b3f03a4.camel@scientia.net/

Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/backref.c |   18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -2018,13 +2018,19 @@ int iterate_extent_inodes(struct btrfs_f
 			extent_item_objectid);
 
 	if (!search_commit_root) {
-		trans = btrfs_join_transaction(fs_info->extent_root);
-		if (IS_ERR(trans))
-			return PTR_ERR(trans);
+		trans = btrfs_attach_transaction(fs_info->extent_root);
+		if (IS_ERR(trans)) {
+			if (PTR_ERR(trans) != -ENOENT &&
+			    PTR_ERR(trans) != -EROFS)
+				return PTR_ERR(trans);
+			trans = NULL;
+		}
+	}
+
+	if (trans)
 		btrfs_get_tree_mod_seq(fs_info, &tree_mod_seq_elem);
-	} else {
+	else
 		down_read(&fs_info->commit_root_sem);
-	}
 
 	ret = btrfs_find_all_leafs(trans, fs_info, extent_item_objectid,
 				   tree_mod_seq_elem.seq, &refs,
@@ -2056,7 +2062,7 @@ int iterate_extent_inodes(struct btrfs_f
 
 	free_leaf_list(refs);
 out:
-	if (!search_commit_root) {
+	if (trans) {
 		btrfs_put_tree_mod_seq(fs_info, &tree_mod_seq_elem);
 		btrfs_end_transaction(trans, fs_info->extent_root);
 	} else {



  parent reply	other threads:[~2019-05-20 12:53 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-20 12:13 [PATCH 4.9 00/44] 4.9.178-stable review Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 01/44] net: core: another layer of lists, around PF_MEMALLOC skb handling Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 02/44] locking/rwsem: Prevent decrement of reader count before increment Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 03/44] PCI: hv: Fix a memory leak in hv_eject_device_work() Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 04/44] x86/speculation/mds: Revert CPU buffer clear on double fault exit Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 05/44] x86/speculation/mds: Improve CPU buffer clear documentation Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 06/44] objtool: Fix function fallthrough detection Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 07/44] ARM: exynos: Fix a leaked reference by adding missing of_node_put Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 08/44] power: supply: axp288_charger: Fix unchecked return value Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 09/44] arm64: compat: Reduce address limit Greg Kroah-Hartman
2019-05-20 12:13 ` [PATCH 4.9 10/44] arm64: Clear OSDLR_EL1 on CPU boot Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 11/44] sched/x86: Save [ER]FLAGS on context switch Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 12/44] crypto: chacha20poly1305 - set cra_name correctly Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 13/44] crypto: vmx - fix copy-paste error in CTR mode Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 14/44] crypto: crct10dif-generic - fix use via crypto_shash_digest() Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 15/44] crypto: x86/crct10dif-pcl " Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 16/44] ALSA: usb-audio: Fix a memory leak bug Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 17/44] ALSA: hda/hdmi - Read the pin sense from register when repolling Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 18/44] ALSA: hda/hdmi - Consider eld_valid when reporting jack event Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 19/44] ALSA: hda/realtek - EAPD turn on later Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 20/44] ASoC: max98090: Fix restore of DAPM Muxes Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 21/44] ASoC: RT5677-SPI: Disable 16Bit SPI Transfers Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 22/44] mm/mincore.c: make mincore() more conservative Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 23/44] ocfs2: fix ocfs2 read inode data panic in ocfs2_iget Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 24/44] mfd: da9063: Fix OTP control register names to match datasheets for DA9063/63L Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 25/44] mfd: max77620: Fix swapped FPS_PERIOD_MAX_US values Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 26/44] tty/vt: fix write/write race in ioctl(KDSKBSENT) handler Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 27/44] jbd2: check superblock mapped prior to committing Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 28/44] ext4: actually request zeroing of inode table after grow Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 29/44] ext4: fix ext4_show_options for file systems w/o journal Greg Kroah-Hartman
2019-05-20 12:14 ` Greg Kroah-Hartman [this message]
2019-05-20 12:14 ` [PATCH 4.9 31/44] bcache: fix a race between cache register and cacheset unregister Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 32/44] bcache: never set KEY_PTRS of journal key to 0 in journal_reclaim() Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 33/44] ipmi:ssif: compare block number correctly for multi-part return messages Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 34/44] crypto: gcm - Fix error return code in crypto_gcm_create_common() Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 35/44] crypto: gcm - fix incompatibility between "gcm" and "gcm_base" Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 36/44] crypto: salsa20 - dont access already-freed walk.iv Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 37/44] crypto: arm/aes-neonbs " Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 38/44] fib_rules: fix error in backport of e9919a24d302 ("fib_rules: return 0...") Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 39/44] writeback: synchronize sync(2) against cgroup writeback membership switches Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 40/44] fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 41/44] ext4: zero out the unused memory region in the extent tree block Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 42/44] ext4: fix data corruption caused by overlapping unaligned and aligned IO Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 43/44] ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug Greg Kroah-Hartman
2019-05-20 12:14 ` [PATCH 4.9 44/44] KVM: x86: Skip EFER vs. guest CPUID checks for host-initiated writes Greg Kroah-Hartman
2019-05-20 19:28 ` [PATCH 4.9 00/44] 4.9.178-stable review kernelci.org bot
2019-05-21  8:51 ` Jon Hunter
2019-05-21 10:34 ` Naresh Kamboju
2019-05-21 16:46   ` Greg Kroah-Hartman
2019-05-21 21:39 ` shuah

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190520115234.796445803@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=ce3g8jdj@umail.furryterror.org \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).