stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Nikolay Borisov <nborisov@suse.com>,
	David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>,
	linux-btrfs@vger.kernel.org
Subject: [PATCH AUTOSEL 4.19 61/81] btrfs: Switch memory allocations in async csum calculation path to kvmalloc
Date: Tue,  7 May 2019 01:35:32 -0400	[thread overview]
Message-ID: <20190507053554.30848-61-sashal@kernel.org> (raw)
In-Reply-To: <20190507053554.30848-1-sashal@kernel.org>

From: Nikolay Borisov <nborisov@suse.com>

[ Upstream commit a3d46aea46f99d134b4e0726e4826b824c3e5980 ]

Recent multi-page biovec rework allowed creation of bios that can span
large regions - up to 128 megabytes in the case of btrfs. OTOH btrfs'
submission path currently allocates a contiguous array to store the
checksums for every bio submitted. This means we can request up to
(128mb / BTRFS_SECTOR_SIZE) * 4 bytes + 32bytes of memory from kmalloc.
On busy systems with possibly fragmented memory said kmalloc can fail
which will trigger BUG_ON due to improper error handling IO submission
context in btrfs.

Until error handling is improved or bios in btrfs limited to a more
manageable size (e.g. 1m) let's use kvmalloc to fallback to vmalloc for
such large allocations. There is no hard requirement that the memory
allocated for checksums during IO submission has to be contiguous, but
this is a simple fix that does not require several non-contiguous
allocations.

For small writes this is unlikely to have any visible effect since
kmalloc will still satisfy allocation requests as usual. For larger
requests the code will just fallback to vmalloc.

We've performed evaluation on several workload types and there was no
significant difference kmalloc vs kvmalloc.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/file-item.c    | 15 +++++++++++----
 fs/btrfs/ordered-data.c |  3 ++-
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index ba74827beb32..2ba8de8c0d17 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -7,6 +7,7 @@
 #include <linux/slab.h>
 #include <linux/pagemap.h>
 #include <linux/highmem.h>
+#include <linux/sched/mm.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -434,9 +435,13 @@ blk_status_t btrfs_csum_one_bio(struct inode *inode, struct bio *bio,
 	unsigned long this_sum_bytes = 0;
 	int i;
 	u64 offset;
+	unsigned nofs_flag;
+
+	nofs_flag = memalloc_nofs_save();
+	sums = kvzalloc(btrfs_ordered_sum_size(fs_info, bio->bi_iter.bi_size),
+		       GFP_KERNEL);
+	memalloc_nofs_restore(nofs_flag);
 
-	sums = kzalloc(btrfs_ordered_sum_size(fs_info, bio->bi_iter.bi_size),
-		       GFP_NOFS);
 	if (!sums)
 		return BLK_STS_RESOURCE;
 
@@ -479,8 +484,10 @@ blk_status_t btrfs_csum_one_bio(struct inode *inode, struct bio *bio,
 
 				bytes_left = bio->bi_iter.bi_size - total_bytes;
 
-				sums = kzalloc(btrfs_ordered_sum_size(fs_info, bytes_left),
-					       GFP_NOFS);
+				nofs_flag = memalloc_nofs_save();
+				sums = kvzalloc(btrfs_ordered_sum_size(fs_info,
+						      bytes_left), GFP_KERNEL);
+				memalloc_nofs_restore(nofs_flag);
 				BUG_ON(!sums); /* -ENOMEM */
 				sums->len = bytes_left;
 				ordered = btrfs_lookup_ordered_extent(inode,
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 0c4ef208b8b9..b92edf0600e3 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -6,6 +6,7 @@
 #include <linux/slab.h>
 #include <linux/blkdev.h>
 #include <linux/writeback.h>
+#include <linux/sched/mm.h>
 #include "ctree.h"
 #include "transaction.h"
 #include "btrfs_inode.h"
@@ -442,7 +443,7 @@ void btrfs_put_ordered_extent(struct btrfs_ordered_extent *entry)
 			cur = entry->list.next;
 			sum = list_entry(cur, struct btrfs_ordered_sum, list);
 			list_del(&sum->list);
-			kfree(sum);
+			kvfree(sum);
 		}
 		kmem_cache_free(btrfs_ordered_extent_cache, entry);
 	}
-- 
2.20.1


  parent reply	other threads:[~2019-05-07  5:37 UTC|newest]

Thread overview: 81+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-07  5:34 [PATCH AUTOSEL 4.19 01/81] iio: adc: xilinx: fix potential use-after-free on remove Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 02/81] iio: adc: xilinx: fix potential use-after-free on probe Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 03/81] iio: adc: xilinx: prevent touching unclocked h/w on remove Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 04/81] acpi/nfit: Always dump _DSM output payload Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 05/81] libnvdimm/namespace: Fix a potential NULL pointer dereference Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 06/81] HID: input: add mapping for Expose/Overview key Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 07/81] HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 08/81] HID: input: add mapping for "Toggle Display" key Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 09/81] libnvdimm/btt: Fix a kmemdup failure check Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 10/81] s390/dasd: Fix capacity calculation for large volumes Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 11/81] mac80211: fix unaligned access in mesh table hash function Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 12/81] mac80211: Increase MAX_MSG_LEN Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 13/81] cfg80211: Handle WMM rules in regulatory domain intersection Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 14/81] mac80211: fix memory accounting with A-MSDU aggregation Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 15/81] nl80211: Add NL80211_FLAG_CLEAR_SKB flag for other NL commands Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 16/81] Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 17/81] libnvdimm/pmem: fix a possible OOB access when read and write pmem Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 18/81] mac80211: Honor SW_CRYPTO_CONTROL for unicast keys in AP VLAN mode Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 19/81] s390/3270: fix lockdep false positive on view->lock Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 20/81] drm/amd/display: extending AUX SW Timeout Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 21/81] clocksource/drivers/npcm: select TIMER_OF Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 22/81] clocksource/drivers/oxnas: Fix OX820 compatible Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 23/81] selftests: fib_tests: Fix 'Command line is not complete' errors Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 24/81] mISDN: Check address length before reading address family Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 25/81] vxge: fix return of a free'd memblock on a failed dma mapping Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 26/81] qede: fix write to free'd pointer error and double free of ptp Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 27/81] afs: Unlock pages for __pagevec_release() Sasha Levin
2019-05-07  5:34 ` [PATCH AUTOSEL 4.19 28/81] drm/amd/display: If one stream full updates, full update all planes Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 29/81] s390/pkey: add one more argument space for debug feature entry Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 30/81] x86/build/lto: Fix truncated .bss with -fdata-sections Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 31/81] x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51T Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 32/81] x86/mm/tlb: Revert "x86/mm: Align TLB invalidation info" Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 33/81] KVM: x86: Raise #GP when guest vCPU do not support PMU Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 34/81] KVM: fix spectrev1 gadgets Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 35/81] KVM: x86: avoid misreporting level-triggered irqs as edge-triggered in tracing Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 36/81] tools lib traceevent: Fix missing equality check for strcmp Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 37/81] ipmi: ipmi_si_hardcode.c: init si_type array to fix a crash Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 38/81] ocelot: Don't sleep in atomic context (irqs_disabled()) Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 39/81] x86/mm/KASLR: Fix the size of the direct mapping section Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 40/81] scsi: aic7xxx: fix EISA support Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 41/81] mm: fix inactive list balancing between NUMA nodes and cgroups Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 42/81] init: initialize jump labels before command line option parsing Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 43/81] selftests: netfilter: check icmp pkttoobig errors are set as related Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 44/81] ipvs: do not schedule icmp errors from tunnels Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 45/81] netfilter: ctnetlink: don't use conntrack/expect object addresses as id Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 46/81] netfilter: nf_tables: prevent shift wrap in nft_chain_parse_hook() Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 47/81] MIPS: perf: ath79: Fix perfcount IRQ assignment Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 48/81] s390: ctcm: fix ctcm_new_device error return code Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 49/81] drm/sun4i: Set device driver data at bind time for use in unbind Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 50/81] drm/sun4i: Fix component unbinding and component master deletion Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 51/81] selftests/net: correct the return value for run_netsocktests Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 52/81] netfilter: fix nf_l4proto_log_invalid to log invalid packets Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 53/81] gpu: ipu-v3: dp: fix CSC handling Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 54/81] drm/imx: don't skip DP channel disable for background plane Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 55/81] ARM: 8856/1: NOMMU: Fix CCR register faulty initialization when MPU is disabled Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 56/81] spi: Micrel eth switch: declare missing of table Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 57/81] spi: ST ST95HF NFC: " Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 58/81] x86/mm: Fix a crash with kmemleak_scan() Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 59/81] drm/sun4i: Unbind components before releasing DRM and memory Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 60/81] Input: synaptics-rmi4 - fix possible double free Sasha Levin
2019-05-07  5:35 ` Sasha Levin [this message]
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 62/81] RDMA/hns: Bugfix for mapping user db Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 63/81] mm/memory_hotplug.c: drop memory device reference after find_memory_block() Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 64/81] powerpc/smp: Fix NMI IPI timeout Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 65/81] powerpc/smp: Fix NMI IPI xmon timeout Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 66/81] net: dsa: mv88e6xxx: fix few issues in mv88e6390x_port_set_cmode Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 67/81] mm/memory.c: fix modifying of page protection by insert_pfn() Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 68/81] usb: typec: Fix unchecked return value Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 69/81] f2fs: fix to data block override node segment by mistake Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 70/81] netfilter: nf_tables: use-after-free in dynamic operations Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 71/81] netfilter: nf_tables: add missing ->release_ops() in error path of newrule() Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 72/81] net: fec: manage ahb clock in runtime pm Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 73/81] mlxsw: spectrum_switchdev: Add MDB entries in prepare phase Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 74/81] mlxsw: core: Do not use WQ_MEM_RECLAIM for EMAD workqueue Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 75/81] mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw ordered workqueue Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 76/81] mlxsw: core: Do not use WQ_MEM_RECLAIM for mlxsw workqueue Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 77/81] net/tls: fix the IV leaks Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 78/81] net: strparser: partially revert "strparser: Call skb_unclone conditionally" Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 79/81] NFC: nci: Add some bounds checking in nci_hci_cmd_received() Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 80/81] nfc: nci: Potential off by one in ->pipes[] array Sasha Levin
2019-05-07  5:35 ` [PATCH AUTOSEL 4.19 81/81] x86/kprobes: Avoid kretprobe recursion bug Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190507053554.30848-61-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=dsterba@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nborisov@suse.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).