From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Jonathan Toppins <jtoppins@redhat.com>,
Doug Ledford <dledford@redhat.com>,
Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
Mel Gorman <mgorman@techsingularity.net>,
Hillf Danton <hillf.zj@alibaba-inc.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.12 01/65] mm: ratelimit PFNs busy info message
Date: Mon, 14 Aug 2017 18:18:52 -0700
Message-ID: <20170815011942.451861682@linuxfoundation.org>
In-Reply-To: <20170815011942.395714306@linuxfoundation.org>
4.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Toppins <jtoppins@redhat.com>
commit 75dddef32514f7aa58930bde6a1263253bc3d4ba upstream.
The RDMA subsystem can generate several thousand of these messages per
second, eventually leading to a kernel crash. Ratelimit these messages
to prevent this crash.
Doug said:
"I've been carrying a version of this for several kernel versions. I
don't remember when they started, but we have one (and only one) class
of machines: Dell PE R730xd, that generate these errors. When it
happens, without a rate limit, we get rcu timeouts and kernel oopses.
With the rate limit, we just get a lot of annoying kernel messages but
the machine continues on, recovers, and eventually the memory
operations all succeed"
And:
"> Well... why are all these EBUSY's occurring? It sounds inefficient
> (at least) but if it is expected, normal and unavoidable then
> perhaps we should just remove that message altogether?
I don't have an answer to that question. To be honest, I haven't
looked real hard. We never had this at all, then it started out of the
blue, but only on our Dell 730xd machines (and it hits all of them),
but no other classes or brands of machines. And we have our 730xd
machines loaded up with different brands and models of cards (for
instance one dedicated to mlx4 hardware, one for qib, one for mlx5, an
ocrdma/cxgb4 combo, etc), so the fact that it hit all of the machines
meant it wasn't tied to any particular brand/model of RDMA hardware.
To me, it always smelled of a hardware oddity specific to maybe the
CPUs or mainboard chipsets in these machines, so given that I'm not an
mm expert anyway, I never chased it down.
A few other relevant details: it showed up somewhere around 4.8/4.9 or
thereabouts. It never happened before, but the printk has been there
since the 3.18 days, so possibly the test to trigger this message was
changed, or something else in the allocator changed such that the
situation started happening on these machines?
And, like I said, it is specific to our 730xd machines (but they are
all identical, so that could mean it's something like their specific
ram configuration is causing the allocator to hit this on these
machines but not on other machines in the cluster, I don't want to say
it's necessarily the model of chipset or CPU, there are other bits of
identicalness between these machines)"
Link: http://lkml.kernel.org/r/499c0f6cc10d6eb829a67f2a4d75b4228a9b356e.1501695897.git.jtoppins@redhat.com
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7567,7 +7567,7 @@ int alloc_contig_range(unsigned long sta
 	/* Make sure the range is really isolated. */
 	if (test_pages_isolated(outer_start, end, false)) {
-		pr_info("%s: [%lx, %lx) PFNs busy\n",
+		pr_info_ratelimited("%s: [%lx, %lx) PFNs busy\n",
 			__func__, outer_start, end);
 		ret = -EBUSY;
 		goto done;
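
For readers unfamiliar with what the one-line change buys: the sketch
below is a small userspace model of an interval/burst rate limiter, the
general technique behind pr_info_ratelimited(). It is not the kernel's
code. The struct, the function names and the tick-based clock are all
hypothetical, chosen only for illustration. As far as I know the kernel
defaults are a 5 second interval (5 * HZ) and a burst of 10, which the
example at the end mirrors by treating one tick as a millisecond.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's ratelimit state. */
struct ratelimit {
	long interval;	/* window length, in ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* messages let through in the current window */
	int missed;	/* messages suppressed in the current window */
	bool started;	/* false until the first call opens a window */
};

/* Return true if a message arriving at tick 'now' may be emitted. */
static bool ratelimit_allow(struct ratelimit *rs, long now)
{
	assert(rs != NULL);

	/* Open a fresh window on first use or once the old one expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	/* Over budget for this window: drop the message, keep a count. */
	rs->missed++;
	return false;
}

/* Offer one message per tick for 'ticks' ticks; count those emitted. */
static int simulate(struct ratelimit *rs, long ticks)
{
	int emitted = 0;

	for (long t = 0; t < ticks; t++)
		if (ratelimit_allow(rs, t))
			emitted++;
	return emitted;
}
```

With interval = 5000 and burst = 10, a caller that offers a message on
every tick gets 10 messages through per 5000-tick window and the rest
are dropped. That bounds the logging cost no matter how often the
-EBUSY path is hit, which is the property the patch relies on.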