From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Lang Yu <lang.yu@amd.com>,
David Hildenbrand <david@redhat.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Oscar Salvador <osalvador@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.4 12/44] mm/kmemleak: avoid scanning potential huge holes
Date: Mon, 7 Feb 2022 12:06:28 +0100 [thread overview]
Message-ID: <20220207103753.556364034@linuxfoundation.org> (raw)
In-Reply-To: <20220207103753.155627314@linuxfoundation.org>
From: Lang Yu <lang.yu@amd.com>
commit c10a0f877fe007021d70f9cada240f42adc2b5db upstream.
When using devm_request_free_mem_region() and devm_memremap_pages() to
add ZONE_DEVICE memory, if requested free mem region's end pfn were
huge(e.g., 0x400000000), the node_end_pfn() will be also huge (see
move_pfn_range_to_zone()). Thus it creates a huge hole between
node_start_pfn() and node_end_pfn().
We found on some AMD APUs, amdkfd requested such a free mem region and
created a huge hole. In such a case, following code snippet was just
doing busy test_bit() looping on the huge hole.
for (pfn = start_pfn; pfn < end_pfn; pfn++) {
struct page *page = pfn_to_online_page(pfn);
if (!page)
continue;
...
}
So we got a soft lockup:
watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [bash:1221]
CPU: 6 PID: 1221 Comm: bash Not tainted 5.15.0-custom #1
RIP: 0010:pfn_to_online_page+0x5/0xd0
Call Trace:
? kmemleak_scan+0x16a/0x440
kmemleak_write+0x306/0x3a0
? common_file_perm+0x72/0x170
full_proxy_write+0x5c/0x90
vfs_write+0xb9/0x260
ksys_write+0x67/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae
I did some tests with the patch.
(1) amdgpu module unloaded
before the patch:
real 0m0.976s
user 0m0.000s
sys 0m0.968s
after the patch:
real 0m0.981s
user 0m0.000s
sys 0m0.973s
(2) amdgpu module loaded
before the patch:
real 0m35.365s
user 0m0.000s
sys 0m35.354s
after the patch:
real 0m1.049s
user 0m0.000s
sys 0m1.042s
Link: https://lkml.kernel.org/r/20211108140029.721144-1-lang.yu@amd.com
Signed-off-by: Lang Yu <lang.yu@amd.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/kmemleak.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1399,7 +1399,8 @@ static void kmemleak_scan(void)
{
unsigned long flags;
struct kmemleak_object *object;
- int i;
+ struct zone *zone;
+ int __maybe_unused i;
int new_leaks = 0;
jiffies_last_scan = jiffies;
@@ -1439,9 +1440,9 @@ static void kmemleak_scan(void)
* Struct page scanning for each node.
*/
get_online_mems();
- for_each_online_node(i) {
- unsigned long start_pfn = node_start_pfn(i);
- unsigned long end_pfn = node_end_pfn(i);
+ for_each_populated_zone(zone) {
+ unsigned long start_pfn = zone->zone_start_pfn;
+ unsigned long end_pfn = zone_end_pfn(zone);
unsigned long pfn;
for (pfn = start_pfn; pfn < end_pfn; pfn++) {
@@ -1450,8 +1451,8 @@ static void kmemleak_scan(void)
if (!page)
continue;
- /* only scan pages belonging to this node */
- if (page_to_nid(page) != i)
+ /* only scan pages belonging to this zone */
+ if (page_zone(page) != zone)
continue;
/* only scan if page is in use */
if (page_count(page) == 0)
next prev parent reply other threads:[~2022-02-07 11:28 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-07 11:06 [PATCH 5.4 00/44] 5.4.178-rc1 review Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 01/44] audit: improve audit queue handling when "audit=1" on cmdline Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 02/44] ASoC: ops: Reject out of bounds values in snd_soc_put_volsw() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 03/44] ASoC: ops: Reject out of bounds values in snd_soc_put_volsw_sx() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 04/44] ASoC: ops: Reject out of bounds values in snd_soc_put_xr_sx() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 05/44] ALSA: usb-audio: Simplify quirk entries with a macro Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 06/44] ALSA: hda/realtek: Add quirk for ASUS GU603 Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 07/44] ALSA: hda/realtek: Add missing fixup-model entry for Gigabyte X570 ALC1220 quirks Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 08/44] ALSA: hda/realtek: Fix silent output on Gigabyte X570S Aorus Master (newer chipset) Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 09/44] ALSA: hda/realtek: Fix silent output on Gigabyte X570 Aorus Xtreme after reboot from Windows Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 10/44] btrfs: fix deadlock between quota disable and qgroup rescan worker Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 11/44] drm/nouveau: fix off by one in BIOS boundary checking Greg Kroah-Hartman
2022-02-07 11:06 ` Greg Kroah-Hartman [this message]
2022-02-07 11:06 ` [PATCH 5.4 13/44] block: bio-integrity: Advance seed correctly for larger interval sizes Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 14/44] Revert "ASoC: mediatek: Check for error clk pointer" Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 15/44] memcg: charge fs_context and legacy_fs_context Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 16/44] IB/rdmavt: Validate remote_addr during loopback atomic tests Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 17/44] RDMA/siw: Fix broken RDMA Read Fence/Resume logic Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 18/44] RDMA/mlx4: Dont continue event handler after memory allocation failure Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 19/44] iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 20/44] iommu/amd: Fix loop timeout issue in iommu_ga_log_enable() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 21/44] spi: bcm-qspi: check for valid cs before applying chip select Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 22/44] spi: mediatek: Avoid NULL pointer crash in interrupt Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 23/44] spi: meson-spicc: add IRQ check in meson_spicc_probe Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 24/44] net: ieee802154: hwsim: Ensure proper channel selection at probe time Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 25/44] net: ieee802154: mcr20a: Fix lifs/sifs periods Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 26/44] net: ieee802154: ca8210: Stop leaking skbs Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 27/44] net: ieee802154: Return meaningful error codes from the netlink helpers Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 28/44] net: macsec: Verify that send_sci is on when setting Tx sci explicitly Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 29/44] net: stmmac: dump gmac4 DMA registers correctly Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 30/44] net: stmmac: ensure PTP time register reads are consistent Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 31/44] drm/i915/overlay: Prevent divide by zero bugs in scaling Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 32/44] ASoC: fsl: Add missing error handling in pcm030_fabric_probe Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 33/44] ASoC: xilinx: xlnx_formatter_pcm: Make buffer bytes multiple of period bytes Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 34/44] ASoC: cpcap: Check for NULL pointer after calling of_get_child_by_name Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 35/44] ASoC: max9759: fix underflow in speaker_gain_control_put() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 36/44] pinctrl: bcm2835: Fix a few error paths Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 37/44] scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 38/44] nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 39/44] selftests: futex: Use variable MAKE instead of make Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 40/44] rtc: cmos: Evaluate century appropriate Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 41/44] EDAC/altera: Fix deferred probing Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 42/44] EDAC/xgene: " Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.4 43/44] ext4: fix error handling in ext4_restore_inline_data() Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.4 44/44] cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning Greg Kroah-Hartman
2022-02-07 21:29 ` [PATCH 5.4 00/44] 5.4.178-rc1 review Shuah Khan
2022-02-07 23:27 ` Florian Fainelli
2022-02-08 1:47 ` Slade Watkins
2022-02-08 3:29 ` Guenter Roeck
2022-02-08 8:09 ` Naresh Kamboju
2022-02-08 8:30 ` Jon Hunter
2022-02-08 13:59 ` Sudip Mukherjee
2022-02-09 2:48 ` Samuel Zou
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220207103753.556364034@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=catalin.marinas@arm.com \
--cc=david@redhat.com \
--cc=lang.yu@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=osalvador@suse.de \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.