From: Ben Hutchings <ben@decadent.org.uk>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
alan@lxorguk.ukuu.org.uk, Jeff Layton <jlayton@redhat.com>,
Jian Li <jiali@redhat.com>, Steve French <smfrench@gmail.com>
Subject: [ 24/73] cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps
Date: Tue, 31 Jul 2012 05:43:34 +0100
Message-ID: <20120731044315.148115472@decadent.org.uk>
In-Reply-To: <20120731044310.013763753@decadent.org.uk>
3.2-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Layton <jlayton@redhat.com>
commit 3cf003c08be785af4bee9ac05891a15bcbff856a upstream.
Jian found that when he ran fsx on a 32-bit arch with a large wsize, the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:
crash> bt
PID: 2789 TASK: f02edaa0 CPU: 3 COMMAND: "fsx"
#0 [eed63cbc] schedule at c083c5b3
#1 [eed63d80] kmap_high at c0500ec8
#2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
#3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
#4 [eed63e50] do_writepages at c04f3e32
#5 [eed63e54] __filemap_fdatawrite_range at c04e152a
#6 [eed63ea4] filemap_fdatawrite at c04e1b3e
#7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
#8 [eed63ecc] do_sync_write at c052d202
#9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
EAX: 00000004 EBX: 00000003 ECX: abd73b73 EDX: 012a65c6
DS: 007b ESI: 012a65c6 ES: 007b EDI: 00000000
SS: 007b ESP: bf8db178 EBP: bf8db1f8 GS: 0033
CS: 0073 EIP: 40000424 ERR: 00000004 EFLAGS: 00000246
Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.
This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.
There's no need to do this serialization on non-CONFIG_HIGHMEM arches,
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.
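The locking discipline described above can be illustrated outside the kernel. The following is only a userspace sketch with pthreads, not the cifs code: POOL_SLOTS, slot_get(), slot_put(), do_request() and run_requests() are hypothetical names, the slot counter stands in for the limited kmap address space, and releasing the slots after the simulated I/O stands in for the later unmapping. The map_lock mutex plays the role of cifs_kmap_mutex: a request takes all of its slots while holding it, so no two requests can each sit on a partial set.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define POOL_SLOTS 4	/* assumed stand-in for the limited kmap space */

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pool_cond = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static int free_slots;
static int completed;

struct request {
	int npages;
};

/* Take one mapping slot, sleeping until one is free (like kmap_high). */
static void slot_get(void)
{
	pthread_mutex_lock(&pool_lock);
	while (free_slots == 0)
		pthread_cond_wait(&pool_cond, &pool_lock);
	free_slots--;
	pthread_mutex_unlock(&pool_lock);
}

/* Give one slot back and wake a waiter. */
static void slot_put(void)
{
	pthread_mutex_lock(&pool_lock);
	free_slots++;
	pthread_cond_signal(&pool_cond);
	pthread_mutex_unlock(&pool_lock);
}

static void *do_request(void *arg)
{
	struct request *req = arg;
	int i;

	/* Serialize the mapping step: one request maps all its pages. */
	pthread_mutex_lock(&map_lock);
	for (i = 0; i < req->npages; i++)
		slot_get();
	pthread_mutex_unlock(&map_lock);

	/* Simulated I/O: the request is fully mapped, so it can complete. */
	pthread_mutex_lock(&pool_lock);
	completed++;
	pthread_mutex_unlock(&pool_lock);

	/* Release without needing map_lock, so the lock holder can proceed. */
	for (i = 0; i < req->npages; i++)
		slot_put();
	return NULL;
}

/* Run nthreads requests of npages each; return how many completed. */
int run_requests(int nthreads, int npages)
{
	struct request req = { npages };
	pthread_t *tids;
	int i, started = 0;

	if (nthreads < 1 || npages < 1 || npages > POOL_SLOTS)
		return -1;	/* a request larger than the pool can never map */
	tids = malloc(sizeof(*tids) * nthreads);
	if (!tids)
		return -1;

	free_slots = POOL_SLOTS;
	completed = 0;
	for (i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[started], NULL, do_request, &req))
			break;
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	free(tids);
	return completed;
}
```

Calling run_requests(4, 3) should return 4 in this sketch. If the map_lock calls were removed, two threads could each take two of the four slots and then wait forever for a third, which is the shape of the deadlock reported above.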
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
fs/cifs/cifssmb.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -89,6 +89,32 @@ static struct {
/* Forward declarations */
static void cifs_readv_complete(struct work_struct *work);
+#ifdef CONFIG_HIGHMEM
+/*
+ * On arches that have high memory, kmap address space is limited. By
+ * serializing the kmap operations on those arches, we ensure that we don't
+ * end up with a bunch of threads in writeback with partially mapped page
+ * arrays, stuck waiting for kmap to come back. That situation prevents
+ * progress and can deadlock.
+ */
+static DEFINE_MUTEX(cifs_kmap_mutex);
+
+static inline void
+cifs_kmap_lock(void)
+{
+ mutex_lock(&cifs_kmap_mutex);
+}
+
+static inline void
+cifs_kmap_unlock(void)
+{
+ mutex_unlock(&cifs_kmap_mutex);
+}
+#else /* !CONFIG_HIGHMEM */
+#define cifs_kmap_lock() do { ; } while(0)
+#define cifs_kmap_unlock() do { ; } while(0)
+#endif /* CONFIG_HIGHMEM */
+
/* Mark as invalid, all open files on tree connections since they
were closed when session to server was lost */
static void mark_open_files_invalid(struct cifs_tcon *pTcon)
@@ -1540,6 +1566,7 @@ cifs_readv_receive(struct TCP_Server_Inf
eof_index = eof ? (eof - 1) >> PAGE_CACHE_SHIFT : 0;
cFYI(1, "eof=%llu eof_index=%lu", eof, eof_index);
+ cifs_kmap_lock();
list_for_each_entry_safe(page, tpage, &rdata->pages, lru) {
if (remaining >= PAGE_CACHE_SIZE) {
/* enough data to fill the page */
@@ -1589,6 +1616,7 @@ cifs_readv_receive(struct TCP_Server_Inf
page_cache_release(page);
}
}
+ cifs_kmap_unlock();
/* issue the read if we have any iovecs left to fill */
if (rdata->nr_iov > 1) {
@@ -2171,6 +2199,7 @@ cifs_async_writev(struct cifs_writedata
iov[0].iov_base = smb;
/* marshal up the pages into iov array */
+ cifs_kmap_lock();
wdata->bytes = 0;
for (i = 0; i < wdata->nr_pages; i++) {
iov[i + 1].iov_len = min(inode->i_size -
@@ -2179,6 +2208,7 @@ cifs_async_writev(struct cifs_writedata
iov[i + 1].iov_base = kmap(wdata->pages[i]);
wdata->bytes += iov[i + 1].iov_len;
}
+ cifs_kmap_unlock();
cFYI(1, "async write at %llu %u bytes", wdata->offset, wdata->bytes);