From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Gang He <ghe@suse.com>,
Eric Ren <zren@suse.com>, alex chen <alex.chen@huawei.com>,
piaojun <piaojun@huawei.com>, Mark Fasheh <mfasheh@versity.com>,
Joel Becker <jlbec@evilplan.org>,
Junxiao Bi <junxiao.bi@oracle.com>,
Joseph Qi <jiangqi903@gmail.com>,
Changwei Ge <ge.changwei@h3c.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.9 41/77] ocfs2: try a blocking lock before return AOP_TRUNCATED_PAGE
Date: Wed, 21 Feb 2018 13:48:50 +0100
Message-ID: <20180221124433.925528813@linuxfoundation.org>
In-Reply-To: <20180221124432.172390020@linuxfoundation.org>
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gang He <ghe@suse.com>
commit ff26cc10aec128c3f86b5611fd5f59c71d49c0e3 upstream.
If we can't get the inode lock immediately in
ocfs2_inode_lock_with_page() when reading a page, we should not return
directly, since this leads to a softlockup problem when the kernel is
built without CONFIG_PREEMPT. The fix is to take a blocking lock and
immediately unlock it before returning. This avoids wasting CPU on
many retries, improves fairness in getting the lock among multiple
nodes, and increases efficiency when the same file is modified
frequently from multiple nodes.
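The effect on retry counts can be sketched with a toy, single-threaded
model. This is not ocfs2 code: toy_lock, toy_trylock(),
toy_wait_and_release() and attempts_until_locked() are hypothetical names,
time is simulated in ticks, and the model assumes the retry that follows
the blocking wait succeeds, which a real cluster does not guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy lock: "another node" holds it until release_tick. */
struct toy_lock {
	unsigned long now;          /* current simulated time, in ticks */
	unsigned long release_tick; /* tick at which the other holder lets go */
};

/* Non-blocking attempt: costs the caller one tick of CPU time. */
static bool toy_trylock(struct toy_lock *l)
{
	assert(l != NULL);
	l->now++;
	return l->now > l->release_tick;
}

/* Blocking wait: the caller sleeps until the holder releases the lock. */
static void toy_wait_and_release(struct toy_lock *l)
{
	assert(l != NULL);
	if (l->now < l->release_tick)
		l->now = l->release_tick;
}

/* Count the non-blocking attempts needed before the lock is taken. */
static unsigned long attempts_until_locked(struct toy_lock *l,
					   bool wait_on_contention)
{
	unsigned long attempts = 0;

	for (;;) {
		attempts++;
		if (toy_trylock(l))
			return attempts;
		if (wait_on_contention)
			toy_wait_and_release(l); /* sleep, then let the caller retry */
		/* otherwise the caller retries immediately, burning CPU */
	}
}
```

With a lock held for 1000 ticks, this model needs 1001 attempts when it
returns immediately on contention and 2 attempts when it waits first.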
The softlockup crash (with /proc/sys/kernel/softlockup_panic set to 1)
looks like:
Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 885 Comm: multi_mmap Tainted: G L 4.12.14-6.1-default #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
<IRQ>
dump_stack+0x5c/0x82
panic+0xd5/0x21e
watchdog_timer_fn+0x208/0x210
__hrtimer_run_queues+0xcc/0x200
hrtimer_interrupt+0xa6/0x1f0
smp_apic_timer_interrupt+0x34/0x50
apic_timer_interrupt+0x96/0xa0
</IRQ>
RIP: 0010:unlock_page+0x17/0x30
RSP: 0000:ffffaf154080bc88 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
RAX: dead000000000100 RBX: fffff21e009f5300 RCX: 0000000000000004
RDX: dead0000000000ff RSI: 0000000000000202 RDI: fffff21e009f5300
RBP: 0000000000000000 R08: 0000000000000000 R09: ffffaf154080bb00
R10: ffffaf154080bc30 R11: 0000000000000040 R12: ffff993749a39518
R13: 0000000000000000 R14: fffff21e009f5300 R15: fffff21e009f5300
ocfs2_inode_lock_with_page+0x25/0x30 [ocfs2]
ocfs2_readpage+0x41/0x2d0 [ocfs2]
filemap_fault+0x12b/0x5c0
ocfs2_fault+0x29/0xb0 [ocfs2]
__do_fault+0x1a/0xa0
__handle_mm_fault+0xbe8/0x1090
handle_mm_fault+0xaa/0x1f0
__do_page_fault+0x235/0x4b0
trace_do_page_fault+0x3c/0x110
async_page_fault+0x28/0x30
RIP: 0033:0x7fa75ded638e
RSP: 002b:00007ffd6657db18 EFLAGS: 00010287
RAX: 000055c7662fb700 RBX: 0000000000000001 RCX: 000055c7662fb700
RDX: 0000000000001770 RSI: 00007fa75e909000 RDI: 000055c7662fb700
RBP: 0000000000000003 R08: 000000000000000e R09: 0000000000000000
R10: 0000000000000483 R11: 00007fa75ded61b0 R12: 00007fa75e90a770
R13: 000000000000000e R14: 0000000000001770 R15: 0000000000000000
As for the performance improvement, the test run time is shorter and
CPU utilization is lower; the detailed data follows. I ran the
multi_mmap test case from the ocfs2-test package in a three-node cluster.
Before applying this patch:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 2754 ocfs2te+  20   0  170248   6980   4856 D 80.73 0.341   0:18.71 multi_mmap
 1505 root      rt   0  222236 123060  97224 S 2.658 6.015   0:01.44 corosync
    5 root      20   0       0      0      0 S 1.329 0.000   0:00.19 kworker/u8:0
   95 root      20   0       0      0      0 S 1.329 0.000   0:00.25 kworker/u8:1
 2728 root      20   0       0      0      0 S 0.997 0.000   0:00.24 jbd2/sda1-33
 2721 root      20   0       0      0      0 S 0.664 0.000   0:00.07 ocfs2dc-3C8CFD4
 2750 ocfs2te+  20   0  142976   4652   3532 S 0.664 0.227   0:00.28 mpirun
ocfs2test@tb-node2:~>multiple_run.sh -i ens3 -k ~/linux-4.4.21-69.tar.gz -o ~/ocfs2mullog -C hacluster -s pcmk -n tb-node2,tb-node1,tb-node3 -d /dev/sda1 -b 4096 -c 32768 -t multi_mmap /mnt/shared
Tests with "-b 4096 -C 32768"
Thu Dec 28 14:44:52 CST 2017
multi_mmap..................................................Passed.
Runtime 783 seconds.
After applying this patch:
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 2508 ocfs2te+  20   0  170248   6804   4680 R 54.00 0.333   0:55.37 multi_mmap
  155 root      20   0       0      0      0 S 2.667 0.000   0:01.20 kworker/u8:3
   95 root      20   0       0      0      0 S 2.000 0.000   0:01.58 kworker/u8:1
 2504 ocfs2te+  20   0  142976   4604   3480 R 1.667 0.225   0:01.65 mpirun
    5 root      20   0       0      0      0 S 1.000 0.000   0:01.36 kworker/u8:0
 2482 root      20   0       0      0      0 S 1.000 0.000   0:00.86 jbd2/sda1-33
  299 root       0 -20       0      0      0 S 0.333 0.000   0:00.13 kworker/2:1H
  335 root       0 -20       0      0      0 S 0.333 0.000   0:00.17 kworker/1:1H
  535 root      20   0   12140   7268   1456 S 0.333 0.355   0:00.34 haveged
 1282 root      rt   0  222284 123108  97224 S 0.333 6.017   0:01.33 corosync
ocfs2test@tb-node2:~>multiple_run.sh -i ens3 -k ~/linux-4.4.21-69.tar.gz -o ~/ocfs2mullog -C hacluster -s pcmk -n tb-node2,tb-node1,tb-node3 -d /dev/sda1 -b 4096 -c 32768 -t multi_mmap /mnt/shared
Tests with "-b 4096 -C 32768"
Thu Dec 28 15:04:12 CST 2017
multi_mmap..................................................Passed.
Runtime 487 seconds.
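As a rough check of the two runtimes reported above, the relative
reduction is (783 - 487) / 783. A minimal C sketch of that arithmetic;
percent_reduction() is a hypothetical helper, and its inputs are simply
the two "Runtime" values from the test output.

```c
#include <assert.h>

/* Relative reduction from 'before' to 'after', expressed in percent. */
static double percent_reduction(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

percent_reduction(783.0, 487.0) evaluates to roughly 37.8, so the test
run is a little over a third shorter with the patch applied.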
Link: http://lkml.kernel.org/r/1514447305-30814-1-git-send-email-ghe@suse.com
Fixes: 1cce4df04f37 ("ocfs2: do not lock/unlock() inode DLM lock")
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Eric Ren <zren@suse.com>
Acked-by: alex chen <alex.chen@huawei.com>
Acked-by: piaojun <piaojun@huawei.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/ocfs2/dlmglue.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -2485,6 +2485,15 @@ int ocfs2_inode_lock_with_page(struct in
 	ret = ocfs2_inode_lock_full(inode, ret_bh, ex, OCFS2_LOCK_NONBLOCK);
 	if (ret == -EAGAIN) {
 		unlock_page(page);
+		/*
+		 * If we can't get inode lock immediately, we should not return
+		 * directly here, since this will lead to a softlockup problem.
+		 * The method is to get a blocking lock and immediately unlock
+		 * before returning, this can avoid CPU resource waste due to
+		 * lots of retries, and benefits fairness in getting lock.
+		 */
+		if (ocfs2_inode_lock(inode, ret_bh, ex) == 0)
+			ocfs2_inode_unlock(inode, ex);
 		ret = AOP_TRUNCATED_PAGE;
 	}