From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Hans van Kranenburg <hans.van.kranenburg@mendix.com>,
Qu Wenruo <wqu@suse.com>, David Sterba <dsterba@suse.com>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.19 060/106] btrfs: volumes: Make sure there is no overlap of dev extents at mount time
Date: Thu, 24 Jan 2019 20:20:17 +0100
Message-ID: <20190124190210.230964303@linuxfoundation.org>
In-Reply-To: <20190124190206.342411005@linuxfoundation.org>
4.19-stable review patch. If anyone has any objections, please let me know.
------------------
[ Upstream commit 5eb193812a42dc49331f25137a38dfef9612d3e4 ]
Enhance btrfs_verify_dev_extents() to remember the previously checked dev
extent, so it can verify that no dev extents overlap.
Analysis from Hans:
"Imagine allocating a DATA|DUP chunk.
In the chunk allocator, we first set...
    max_stripe_size = SZ_1G;
    max_chunk_size = BTRFS_MAX_DATA_CHUNK_SIZE
... which is 10GiB.
Then...
    /* we don't want a chunk larger than 10% of writeable space */
    max_chunk_size = min(div_factor(fs_devices->total_rw_bytes, 1),
                         max_chunk_size);
Imagine we only have one 7880MiB block device in this filesystem. Now
max_chunk_size is down to 788MiB.
The next step in the code is to search for max_stripe_size * dev_stripes
amount of free space on the device, which is in our example 1GiB * 2 =
2GiB. Imagine the device has exactly 1578MiB free in one contiguous
piece. This number of bytes is stored in devices_info[ndevs - 1].max_avail.
Next we recalculate the stripe_size (which is actually the device extent
length), based on the actual maximum amount of available raw disk space:
    stripe_size = div_u64(devices_info[ndevs - 1].max_avail, dev_stripes);
stripe_size is now 789MiB.
Next we do...
    data_stripes = num_stripes / ncopies
...where data_stripes ends up as 1, because num_stripes is 2 (the number
of device extents we're going to have), and DUP has ncopies 2.
Next there's a check...
    if (stripe_size * data_stripes > max_chunk_size)
...which matches because 789MiB * 1 > 788MiB.
We go into the if code, and next is...
    stripe_size = div_u64(max_chunk_size, data_stripes);
...which resets stripe_size to max_chunk_size: 788MiB.
Next is a fun one...
    /* bump the answer up to a 16MB boundary */
    stripe_size = round_up(stripe_size, SZ_16M);
...which changes stripe_size from 788MiB to 800MiB.
We're not done changing stripe_size yet...
    /* But don't go higher than the limits we found while searching
     * for free extents
     */
    stripe_size = min(devices_info[ndevs - 1].max_avail,
                      stripe_size);
This is bad. max_avail is twice the stripe_size (we need to fit 2 device
extents on the same device for DUP).
The result here is that 800MiB < 1578MiB, so it's unchanged. However,
the resulting DUP chunk will need 1600MiB disk space, which isn't there,
and the second dev_extent might extend into the next thing (next
dev_extent? end of device?) for 22MiB.
The last shown line of code relies on the variable stripe_size holding
twice the intended stripe size when the profile is DUP. This was
actually the case before commit 92e222df7b
"btrfs: alloc_chunk: fix DUP stripe size handling", from which I quote:
"[...] in the meantime there's a check to see if the stripe_size does
not exceed max_chunk_size. Since during this check stripe_size is twice
the amount as intended, the check will reduce the stripe_size to
max_chunk_size if the actual correct to be used stripe_size is more than
half the amount of max_chunk_size."
In the previous version of the code, the 16MiB alignment (why is this
done, by the way?) would result in a 50% chance that it would actually
do an 8MiB alignment for the individual dev_extents, since it was
operating on double the size. Does this matter?
Does it matter that stripe_size can be set to anything which is not
16MiB aligned because of the amount of remaining available disk space
which is just taken?
What is the main purpose of this round_up?
The most straightforward thing to do seems to be something like...
    stripe_size = min(
        div_u64(devices_info[ndevs - 1].max_avail, dev_stripes),
        stripe_size
    )
...just putting half of the max_avail into stripe_size."
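The arithmetic in the analysis can be replayed outside the kernel. What
follows is only a user-space sketch under stated assumptions, not the
allocator code itself: dup_stripe_size(), round_up_u64(), min_u64() and the
halve_max_avail switch are hypothetical names introduced for illustration,
and only the steps quoted in the report are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* User-space stand-in for the kernel's round_up(); align must be non-zero. */
static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	assert(align != 0);
	return (x + align - 1) / align * align;
}

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: replay the DUP stripe_size steps from the report.
 * With halve_max_avail == false the final clamp uses max_avail as is
 * (the case the report calls bad); with true it uses
 * max_avail / dev_stripes, as suggested at the end of the report.
 */
static uint64_t dup_stripe_size(uint64_t max_avail, uint64_t max_chunk_size,
				bool halve_max_avail)
{
	const uint64_t dev_stripes = 2;	/* DUP: two dev extents on one device */
	const uint64_t num_stripes = 2;
	const uint64_t ncopies = 2;
	uint64_t data_stripes = num_stripes / ncopies;
	uint64_t stripe_size = max_avail / dev_stripes;
	uint64_t limit;

	if (stripe_size * data_stripes > max_chunk_size)
		stripe_size = max_chunk_size / data_stripes;

	/* bump the answer up to a 16MiB boundary */
	stripe_size = round_up_u64(stripe_size, 16 * MiB);

	limit = halve_max_avail ? max_avail / dev_stripes : max_avail;
	return min_u64(limit, stripe_size);
}
```

With the numbers from the report, dup_stripe_size(1578 * MiB, 788 * MiB,
false) returns 800MiB, so the two dev extents need 1600MiB, 22MiB more than
is free. With halve_max_avail set to true the result is 789MiB and both
extents fit in the 1578MiB hole.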
Link: https://lore.kernel.org/linux-btrfs/b3461a38-e5f8-f41d-c67c-2efac8129054@mendix.com/
Reported-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ add analysis from report ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/volumes.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 223334f08530..dbb893d0ae81 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7474,6 +7474,8 @@ int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info)
 	struct btrfs_path *path;
 	struct btrfs_root *root = fs_info->dev_root;
 	struct btrfs_key key;
+	u64 prev_devid = 0;
+	u64 prev_dev_ext_end = 0;
 	int ret = 0;
 
 	key.objectid = 1;
@@ -7518,10 +7520,22 @@ int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info)
 		chunk_offset = btrfs_dev_extent_chunk_offset(leaf, dext);
 		physical_len = btrfs_dev_extent_length(leaf, dext);
 
+		/* Check if this dev extent overlaps with the previous one */
+		if (devid == prev_devid && physical_offset < prev_dev_ext_end) {
+			btrfs_err(fs_info,
+"dev extent devid %llu physical offset %llu overlap with previous dev extent end %llu",
+				  devid, physical_offset, prev_dev_ext_end);
+			ret = -EUCLEAN;
+			goto out;
+		}
+
 		ret = verify_one_dev_extent(fs_info, chunk_offset, devid,
 					    physical_offset, physical_len);
 		if (ret < 0)
 			goto out;
+		prev_devid = devid;
+		prev_dev_ext_end = physical_offset + physical_len;
+
 		ret = btrfs_next_item(root, path);
 		if (ret < 0)
 			goto out;
--
2.19.1
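For readers who want to see the check in isolation, here is a minimal
user-space sketch of the same idea. It assumes the dev extents arrive
sorted by devid and then by physical offset, as the check in the patch
does when it compares each item only against the one before it. struct
dev_extent and find_first_overlap() are hypothetical names, not kernel
interfaces.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one dev extent item. */
struct dev_extent {
	uint64_t devid;
	uint64_t physical_offset;
	uint64_t physical_len;
};

/*
 * Walk extents that are assumed to be sorted by (devid, physical_offset)
 * and return the index of the first extent that starts before the end of
 * its predecessor on the same device, or -1 if there is no overlap.
 */
static long find_first_overlap(const struct dev_extent *ext, size_t n)
{
	/* As in the patch, both start at 0, so the first extent never matches. */
	uint64_t prev_devid = 0;
	uint64_t prev_dev_ext_end = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ext[i].devid == prev_devid &&
		    ext[i].physical_offset < prev_dev_ext_end)
			return (long)i;

		prev_devid = ext[i].devid;
		prev_dev_ext_end = ext[i].physical_offset +
				   ext[i].physical_len;
	}
	return -1;
}
```

Remembering only the previous extent is enough under the sorting
assumption: any extent that overlaps an earlier one on the same device
must also overlap its immediate predecessor or be preceded by an extent
that does.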