From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	Vlastimil Babka <vbabka@suse.cz>, Michal Hocko <mhocko@suse.com>,
	David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>,
	Wei Yang <richard.weiyang@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.4 70/72] mm/memory_hotplug: drain per-cpu pages again during memory offline
Date: Mon, 21 Sep 2020 18:31:49 +0200
Message-ID: <20200921163125.191811835@linuxfoundation.org>
In-Reply-To: <20200921163121.870386357@linuxfoundation.org>

From: Pavel Tatashin <pasha.tatashin@soleen.com>

commit 9683182612214aa5f5e709fad49444b847cd866a upstream.

There is a race during page offline that can lead to an infinite loop:
a page never ends up on a buddy list, and __offline_pages() keeps
retrying indefinitely or until a termination signal is received.

Thread#1 - a new process:

load_elf_binary
 begin_new_exec
  exec_mmap
   mmput
    exit_mmap
     tlb_finish_mmu
      tlb_flush_mmu
       release_pages
        free_unref_page_list
         free_unref_page_prepare
          set_pcppage_migratetype(page, migratetype);
             // Store a migration type below MIGRATE_PCPTYPES in page->index

Thread#2 - hot-removes memory
__offline_pages
  start_isolate_page_range
    set_migratetype_isolate
      set_pageblock_migratetype(page, MIGRATE_ISOLATE);
        // Set the pageblock migration type to MIGRATE_ISOLATE, then:
        drain_all_pages(zone);
             // drain per-cpu page lists to buddy allocator.

Thread#1 - continue
         free_unref_page_commit
           migratetype = get_pcppage_migratetype(page);
              // get old migration type
           list_add(&page->lru, &pcp->lists[migratetype]);
              // add new page to already drained pcp list

Thread#2
Never drains the pcp lists again, and therefore gets stuck in the loop.

The fix is to try to drain per-cpu lists again after
check_pages_isolated_cb() fails.
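
The interleaving and the fix can be illustrated with a small userspace
model. This is only a sketch, not kernel code: struct sim, the sim_*
helpers, the single-page state and the max_retries bound are all
hypothetical names invented for the illustration. It replays the
sequence above (drain first, then a late add to the pcp list) and runs a
retry loop shaped like the one in __offline_pages(), with and without
the extra drain.

```c
#include <stdbool.h>

/* Toy model (hypothetical): where one freed page currently lives. */
enum page_loc { PAGE_IN_USE, PAGE_ON_PCP, PAGE_ON_BUDDY };

struct sim {
	enum page_loc loc;
	int drains;	/* how many times the pcp lists were drained */
};

/* Stand-in for drain_all_pages(): move a pcp page to the buddy lists. */
static void sim_drain_all_pages(struct sim *s)
{
	s->drains++;
	if (s->loc == PAGE_ON_PCP)
		s->loc = PAGE_ON_BUDDY;
}

/* Stand-in for check_pages_isolated_cb(): 0 only if the page is in buddy. */
static int sim_check_isolated(const struct sim *s)
{
	return s->loc == PAGE_ON_BUDDY ? 0 : -1;
}

/*
 * Replay the race, then run the retry loop.
 * Returns the number of loop iterations needed, or -1 if the loop
 * was still failing after max_retries iterations (the real loop has
 * no such bound; it is here only so the model terminates).
 */
static int sim_offline(bool redrain, int max_retries)
{
	struct sim s = { .loc = PAGE_IN_USE, .drains = 0 };
	int ret, tries = 0;

	/* Thread#1: prepare step has cached the old migration type. */
	/* Thread#2: pageblock is isolated, then pcp lists are drained. */
	sim_drain_all_pages(&s);
	/* Thread#1: commit step adds the page to the drained pcp list. */
	s.loc = PAGE_ON_PCP;

	do {
		if (tries++ == max_retries)
			return -1;
		ret = sim_check_isolated(&s);
		/* The fix: drain again whenever the check fails. */
		if (ret && redrain)
			sim_drain_all_pages(&s);
	} while (ret);

	return tries;
}
```

In this model, sim_offline(false, 10) gives up with -1 because nothing
ever moves the page off the pcp list, while sim_offline(true, 10)
finishes on the second iteration once the extra drain has run.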

Fixes: c52e75935f8d ("mm: remove extra drain pages on pcp list")
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200903140032.380431-1-pasha.tatashin@soleen.com
Link: https://lkml.kernel.org/r/20200904151448.100489-2-pasha.tatashin@soleen.com
Link: http://lkml.kernel.org/r/20200904070235.GA15277@dhcp22.suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/memory_hotplug.c |   14 ++++++++++++++
 mm/page_isolation.c |    8 ++++++++
 2 files changed, 22 insertions(+)

--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1566,6 +1566,20 @@ static int __ref __offline_pages(unsigne
 		/* check again */
 		ret = walk_system_ram_range(start_pfn, end_pfn - start_pfn,
 					    NULL, check_pages_isolated_cb);
+		/*
+		 * per-cpu pages are drained in start_isolate_page_range, but if
+		 * there are still pages that are not free, make sure that we
+		 * drain again, because when we isolated range we might
+		 * have raced with another thread that was adding pages to pcp
+		 * list.
+		 *
+		 * Forward progress should be still guaranteed because
+		 * pages on the pcp list can only belong to MOVABLE_ZONE
+		 * because has_unmovable_pages explicitly checks for
+		 * PageBuddy on freed pages on other zones.
+		 */
+		if (ret)
+			drain_all_pages(zone);
 	} while (ret);
 
 	/* Ok, all of our target is isolated.
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -187,6 +187,14 @@ __first_valid_page(unsigned long pfn, un
  * pageblocks we may have modified and return -EBUSY to caller. This
  * prevents two threads from simultaneously working on overlapping ranges.
  *
+ * Please note that there is no strong synchronization with the page allocator
+ * either. Pages might be freed while their page blocks are marked ISOLATED.
+ * In some cases pages might still end up on pcp lists and that would allow
+ * for their allocation even when they are in fact isolated already. Depending
+ * on how strong of a guarantee the caller needs drain_all_pages might be needed
+ * (e.g. __offline_pages will need to call it after check for isolated range for
+ * a next retry).
+ *
  * Return: the number of isolated pageblocks on success and -EBUSY if any part
  * of range cannot be isolated.
  */


