All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Lang Yu <lang.yu@amd.com>,
	David Hildenbrand <david@redhat.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Oscar Salvador <osalvador@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.10 19/74] mm/kmemleak: avoid scanning potential huge holes
Date: Mon,  7 Feb 2022 12:06:17 +0100	[thread overview]
Message-ID: <20220207103757.867500990@linuxfoundation.org> (raw)
In-Reply-To: <20220207103757.232676988@linuxfoundation.org>

From: Lang Yu <lang.yu@amd.com>

commit c10a0f877fe007021d70f9cada240f42adc2b5db upstream.

When using devm_request_free_mem_region() and devm_memremap_pages() to
add ZONE_DEVICE memory, if requested free mem region's end pfn were
huge(e.g., 0x400000000), the node_end_pfn() will be also huge (see
move_pfn_range_to_zone()).  Thus it creates a huge hole between
node_start_pfn() and node_end_pfn().

We found on some AMD APUs, amdkfd requested such a free mem region and
created a huge hole.  In such a case, following code snippet was just
doing busy test_bit() looping on the huge hole.

  for (pfn = start_pfn; pfn < end_pfn; pfn++) {
	struct page *page = pfn_to_online_page(pfn);
		if (!page)
			continue;
	...
  }

So we got a soft lockup:

  watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [bash:1221]
  CPU: 6 PID: 1221 Comm: bash Not tainted 5.15.0-custom #1
  RIP: 0010:pfn_to_online_page+0x5/0xd0
  Call Trace:
    ? kmemleak_scan+0x16a/0x440
    kmemleak_write+0x306/0x3a0
    ? common_file_perm+0x72/0x170
    full_proxy_write+0x5c/0x90
    vfs_write+0xb9/0x260
    ksys_write+0x67/0xe0
    __x64_sys_write+0x1a/0x20
    do_syscall_64+0x3b/0xc0
    entry_SYSCALL_64_after_hwframe+0x44/0xae

I did some tests with the patch.

(1) amdgpu module unloaded

before the patch:

  real    0m0.976s
  user    0m0.000s
  sys     0m0.968s

after the patch:

  real    0m0.981s
  user    0m0.000s
  sys     0m0.973s

(2) amdgpu module loaded

before the patch:

  real    0m35.365s
  user    0m0.000s
  sys     0m35.354s

after the patch:

  real    0m1.049s
  user    0m0.000s
  sys     0m1.042s

Link: https://lkml.kernel.org/r/20211108140029.721144-1-lang.yu@amd.com
Signed-off-by: Lang Yu <lang.yu@amd.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/kmemleak.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1401,7 +1401,8 @@ static void kmemleak_scan(void)
 {
 	unsigned long flags;
 	struct kmemleak_object *object;
-	int i;
+	struct zone *zone;
+	int __maybe_unused i;
 	int new_leaks = 0;
 
 	jiffies_last_scan = jiffies;
@@ -1441,9 +1442,9 @@ static void kmemleak_scan(void)
 	 * Struct page scanning for each node.
 	 */
 	get_online_mems();
-	for_each_online_node(i) {
-		unsigned long start_pfn = node_start_pfn(i);
-		unsigned long end_pfn = node_end_pfn(i);
+	for_each_populated_zone(zone) {
+		unsigned long start_pfn = zone->zone_start_pfn;
+		unsigned long end_pfn = zone_end_pfn(zone);
 		unsigned long pfn;
 
 		for (pfn = start_pfn; pfn < end_pfn; pfn++) {
@@ -1452,8 +1453,8 @@ static void kmemleak_scan(void)
 			if (!page)
 				continue;
 
-			/* only scan pages belonging to this node */
-			if (page_to_nid(page) != i)
+			/* only scan pages belonging to this zone */
+			if (page_zone(page) != zone)
 				continue;
 			/* only scan if page is in use */
 			if (page_count(page) == 0)



  parent reply	other threads:[~2022-02-07 11:37 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-07 11:05 [PATCH 5.10 00/74] 5.10.99-rc1 review Greg Kroah-Hartman
2022-02-07 11:05 ` [PATCH 5.10 01/74] selinux: fix double free of cond_list on error paths Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 02/74] audit: improve audit queue handling when "audit=1" on cmdline Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 03/74] ASoC: ops: Reject out of bounds values in snd_soc_put_volsw() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 04/74] ASoC: ops: Reject out of bounds values in snd_soc_put_volsw_sx() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 05/74] ASoC: ops: Reject out of bounds values in snd_soc_put_xr_sx() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 06/74] ALSA: usb-audio: Correct quirk for VF0770 Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 07/74] ALSA: hda: Fix UAF of leds class devs at unbinding Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 08/74] ALSA: hda: realtek: Fix race at concurrent COEF updates Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 09/74] ALSA: hda/realtek: Add quirk for ASUS GU603 Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 10/74] ALSA: hda/realtek: Add missing fixup-model entry for Gigabyte X570 ALC1220 quirks Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 11/74] ALSA: hda/realtek: Fix silent output on Gigabyte X570S Aorus Master (newer chipset) Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 12/74] ALSA: hda/realtek: Fix silent output on Gigabyte X570 Aorus Xtreme after reboot from Windows Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 13/74] btrfs: fix deadlock between quota disable and qgroup rescan worker Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 14/74] drm/nouveau: fix off by one in BIOS boundary checking Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 15/74] drm/amd/display: Force link_rate as LINK_RATE_RBR2 for 2018 15" Apple Retina panels Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 16/74] nvme-fabrics: fix state check in nvmf_ctlr_matches_baseopts() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 17/74] mm/debug_vm_pgtable: remove pte entry from the page table Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 18/74] mm/pgtable: define pte_index so that preprocessor could recognize it Greg Kroah-Hartman
2022-02-07 11:06 ` Greg Kroah-Hartman [this message]
2022-02-07 11:06 ` [PATCH 5.10 20/74] block: bio-integrity: Advance seed correctly for larger interval sizes Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 21/74] dma-buf: heaps: Fix potential spectre v1 gadget Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 22/74] IB/hfi1: Fix AIP early init panic Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 23/74] Revert "ASoC: mediatek: Check for error clk pointer" Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 24/74] memcg: charge fs_context and legacy_fs_context Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 25/74] RDMA/cma: Use correct address when leaving multicast group Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 26/74] RDMA/ucma: Protect mc during concurrent multicast leaves Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 27/74] IB/rdmavt: Validate remote_addr during loopback atomic tests Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 28/74] RDMA/siw: Fix broken RDMA Read Fence/Resume logic Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 29/74] RDMA/mlx4: Dont continue event handler after memory allocation failure Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 30/74] iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 31/74] iommu/amd: Fix loop timeout issue in iommu_ga_log_enable() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 32/74] spi: bcm-qspi: check for valid cs before applying chip select Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 33/74] spi: mediatek: Avoid NULL pointer crash in interrupt Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 34/74] spi: meson-spicc: add IRQ check in meson_spicc_probe Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 35/74] spi: uniphier: fix reference count leak in uniphier_spi_probe() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 36/74] net: ieee802154: hwsim: Ensure proper channel selection at probe time Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 37/74] net: ieee802154: mcr20a: Fix lifs/sifs periods Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 38/74] net: ieee802154: ca8210: Stop leaking skbs Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 39/74] net: ieee802154: Return meaningful error codes from the netlink helpers Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 40/74] net: macsec: Fix offload support for NETDEV_UNREGISTER event Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 41/74] net: macsec: Verify that send_sci is on when setting Tx sci explicitly Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 42/74] net: stmmac: dump gmac4 DMA registers correctly Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 43/74] net: stmmac: ensure PTP time register reads are consistent Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 44/74] drm/i915/overlay: Prevent divide by zero bugs in scaling Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 45/74] ASoC: fsl: Add missing error handling in pcm030_fabric_probe Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 46/74] ASoC: xilinx: xlnx_formatter_pcm: Make buffer bytes multiple of period bytes Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 47/74] ASoC: cpcap: Check for NULL pointer after calling of_get_child_by_name Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 48/74] ASoC: max9759: fix underflow in speaker_gain_control_put() Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 49/74] pinctrl: intel: Fix a glitch when updating IRQ flags on a preconfigured line Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 50/74] pinctrl: intel: fix unexpected interrupt Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 51/74] pinctrl: bcm2835: Fix a few error paths Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 52/74] scsi: bnx2fc: Make bnx2fc_recv_frame() mp safe Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 53/74] nfsd: nfsd4_setclientid_confirm mistakenly expires confirmed client Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 54/74] gve: fix the wrong AdminQ buffer queue index check Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 55/74] bpf: Use VM_MAP instead of VM_ALLOC for ringbuf Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 56/74] selftests/exec: Remove pipe from TEST_GEN_FILES Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 57/74] selftests: futex: Use variable MAKE instead of make Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 58/74] tools/resolve_btfids: Do not print any commands when building silently Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 59/74] rtc: cmos: Evaluate century appropriate Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 60/74] Revert "fbcon: Disable accelerated scrolling" Greg Kroah-Hartman
2022-02-07 11:06 ` [PATCH 5.10 61/74] fbcon: Add option to enable legacy hardware acceleration Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 62/74] perf stat: Fix display of grouped aliased events Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 63/74] perf/x86/intel/pt: Fix crash with stop filters in single-range mode Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 64/74] x86/perf: Default set FREEZE_ON_SMI for all Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 65/74] EDAC/altera: Fix deferred probing Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 66/74] EDAC/xgene: " Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 67/74] ext4: prevent used blocks from being allocated during fast commit replay Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 68/74] ext4: modify the logic of ext4_mb_new_blocks_simple Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 69/74] ext4: fix error handling in ext4_restore_inline_data() Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 70/74] ext4: fix error handling in ext4_fc_record_modified_inode() Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 71/74] ext4: fix incorrect type issue during replay_del_range Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 72/74] net: dsa: mt7530: make NET_DSA_MT7530 select MEDIATEK_GE_PHY Greg Kroah-Hartman
2022-02-09 19:08   ` Pavel Machek
2022-02-07 11:07 ` [PATCH 5.10 73/74] cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning Greg Kroah-Hartman
2022-02-07 11:07 ` [PATCH 5.10 74/74] selftests: nft_concat_range: add test for reload with no element add/del Greg Kroah-Hartman
2022-02-07 17:03 ` [PATCH 5.10 00/74] 5.10.99-rc1 review Pavel Machek
2022-02-07 21:25 ` Shuah Khan
2022-02-07 22:21 ` Guenter Roeck
2022-02-07 23:45 ` Florian Fainelli
2022-02-08  2:45 ` Slade Watkins
2022-02-08  8:02 ` Naresh Kamboju
2022-02-08  8:30 ` Jon Hunter
2022-02-08 14:01 ` Sudip Mukherjee
2022-02-08 20:36 ` Fox Chen
2022-02-09  3:15 ` Samuel Zou

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220207103757.867500990@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=catalin.marinas@arm.com \
    --cc=david@redhat.com \
    --cc=lang.yu@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=osalvador@suse.de \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.