From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>,
Wei Yang <richard.weiyang@linux.alibaba.com>,
"Michael S . Tsirkin" <mst@redhat.com>,
virtualization@lists.linux-foundation.org, linux-mm@kvack.org,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@kernel.org>,
Oscar Salvador <osalvador@suse.de>
Subject: [PATCH v1 29/29] virtio-mem: Big Block Mode (BBM) - safe memory hotunplug
Date: Mon, 12 Oct 2020 14:53:23 +0200
Message-ID: <20201012125323.17509-30-david@redhat.com>
In-Reply-To: <20201012125323.17509-1-david@redhat.com>

Let's add a safe mechanism to unplug memory, avoiding long/endless loops
when trying to offline memory - similar to SBM.

Fake-offline all memory (via alloc_contig_range()) before trying to
offline+remove it. Use this mode as the default, but allow enabling the
other mode explicitly (which could give better memory hotunplug
guarantees in some environments).

The "unsafe" mode can be enabled, e.g., via virtio_mem.bbm_safe_unplug=0
on the cmdline.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
drivers/virtio/virtio_mem.c | 97 ++++++++++++++++++++++++++++++++++++-
1 file changed, 95 insertions(+), 2 deletions(-)
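
Note (not part of the commit): the control flow described above can be
sketched in plain userspace C. This is a simplified model under stated
assumptions, not the driver code. All names below (struct section,
struct big_block, safe_unplug(), offline_ok(), offline_fail()) are
hypothetical stand-ins; the "busy" flag models alloc_contig_range()
failing, the callback models virtio_mem_bbm_offline_and_remove_bb(), and
both the hotplug_mutex locking and the final device unplug request are
left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel data structures. */
enum bb_state { BB_ADDED, BB_FAKE_OFFLINE, BB_UNUSED };

struct section {
	bool online;       /* models pfn_to_online_page() returning a page */
	bool fake_offline; /* pages are currently held by the driver */
	bool busy;         /* models alloc_contig_range() failing */
};

struct big_block {
	enum bb_state state;
	struct section *sections;
	size_t nr_sections;
};

/* Models fake-offlining one section: 0 on success, -EBUSY on failure. */
static int fake_offline(struct section *s)
{
	if (s->busy)
		return -EBUSY;
	s->fake_offline = true;
	return 0;
}

static void fake_online(struct section *s)
{
	s->fake_offline = false;
}

/* Undo fake-offlining for sections [0, end) and restore the ADDED state. */
static void rollback(struct big_block *bb, size_t end)
{
	for (size_t i = 0; i < end; i++)
		if (bb->sections[i].online)
			fake_online(&bb->sections[i]);
	bb->state = BB_ADDED;
}

/* Example callbacks standing in for the offline+remove step. */
static int offline_ok(struct big_block *bb) { (void)bb; return 0; }
static int offline_fail(struct big_block *bb) { (void)bb; return -EBUSY; }

/*
 * Mark the block fake-offline first, then fake-offline every online
 * section. Any failure rolls back exactly the sections handled so far.
 */
static int safe_unplug(struct big_block *bb,
		       int (*offline_and_remove)(struct big_block *))
{
	int rc;

	if (bb->state != BB_ADDED)
		return -EINVAL;

	bb->state = BB_FAKE_OFFLINE;
	for (size_t i = 0; i < bb->nr_sections; i++) {
		if (!bb->sections[i].online)
			continue;
		rc = fake_offline(&bb->sections[i]);
		if (rc) {
			rollback(bb, i);
			return rc;
		}
	}

	rc = offline_and_remove(bb);
	if (rc) {
		rollback(bb, bb->nr_sections);
		return rc;
	}

	bb->state = BB_UNUSED;
	return 0;
}
```

Calling safe_unplug(&bb, offline_ok) on a block with one busy online
section returns an error and leaves the block in BB_ADDED with nothing
held by the driver; the same holds when the callback fails after all
sections were fake-offlined. Setting the state before the loop mirrors
the patch, where newly onlined memory of a fake-offline block is kept
fake-offline by the online_page callback.
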
diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 6bcd0acbff32..09f11489be6f 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -37,6 +37,11 @@ module_param(bbm_block_size, ulong, 0444);
MODULE_PARM_DESC(bbm_block_size,
"Big Block size in bytes. Default is 0 (auto-detection).");
+static bool bbm_safe_unplug = true;
+module_param(bbm_safe_unplug, bool, 0444);
+MODULE_PARM_DESC(bbm_safe_unplug,
+ "Use a safe unplug mechanism in BBM, avoiding long/endless loops");
+
/*
* virtio-mem currently supports the following modes of operation:
*
@@ -87,6 +92,8 @@ enum virtio_mem_bbm_bb_state {
VIRTIO_MEM_BBM_BB_PLUGGED,
/* Plugged and added to Linux. */
VIRTIO_MEM_BBM_BB_ADDED,
+ /* All online parts are fake-offline, ready to remove. */
+ VIRTIO_MEM_BBM_BB_FAKE_OFFLINE,
VIRTIO_MEM_BBM_BB_COUNT
};
@@ -889,6 +896,32 @@ static void virtio_mem_sbm_notify_cancel_offline(struct virtio_mem *vm,
}
}
+static void virtio_mem_bbm_notify_going_offline(struct virtio_mem *vm,
+ unsigned long bb_id,
+ unsigned long pfn,
+ unsigned long nr_pages)
+{
+ /*
+ * When marked as "fake-offline", all online memory of this device block
+ * is allocated by us. Otherwise, we don't have any memory allocated.
+ */
+ if (virtio_mem_bbm_get_bb_state(vm, bb_id) !=
+ VIRTIO_MEM_BBM_BB_FAKE_OFFLINE)
+ return;
+ virtio_mem_fake_offline_going_offline(pfn, nr_pages);
+}
+
+static void virtio_mem_bbm_notify_cancel_offline(struct virtio_mem *vm,
+ unsigned long bb_id,
+ unsigned long pfn,
+ unsigned long nr_pages)
+{
+ if (virtio_mem_bbm_get_bb_state(vm, bb_id) !=
+ VIRTIO_MEM_BBM_BB_FAKE_OFFLINE)
+ return;
+ virtio_mem_fake_offline_cancel_offline(pfn, nr_pages);
+}
+
/*
* This callback will either be called synchronously from add_memory() or
* asynchronously (e.g., triggered via user space). We have to be careful
@@ -949,6 +982,10 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb,
vm->hotplug_active = true;
if (vm->in_sbm)
virtio_mem_sbm_notify_going_offline(vm, id);
+ else
+ virtio_mem_bbm_notify_going_offline(vm, id,
+ mhp->start_pfn,
+ mhp->nr_pages);
break;
case MEM_GOING_ONLINE:
mutex_lock(&vm->hotplug_mutex);
@@ -999,6 +1036,10 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb,
break;
if (vm->in_sbm)
virtio_mem_sbm_notify_cancel_offline(vm, id);
+ else
+ virtio_mem_bbm_notify_cancel_offline(vm, id,
+ mhp->start_pfn,
+ mhp->nr_pages);
vm->hotplug_active = false;
mutex_unlock(&vm->hotplug_mutex);
break;
@@ -1189,7 +1230,13 @@ static void virtio_mem_online_page_cb(struct page *page, unsigned int order)
do_online = virtio_mem_sbm_test_sb_plugged(vm, id,
sb_id, 1);
} else {
- do_online = true;
+ /*
+ * If the whole block is marked fake offline, keep
+ * everything that way.
+ */
+ id = virtio_mem_phys_to_bb_id(vm, addr);
+ do_online = virtio_mem_bbm_get_bb_state(vm, id) !=
+ VIRTIO_MEM_BBM_BB_FAKE_OFFLINE;
}
if (do_online)
generic_online_page(page, order);
@@ -1969,15 +2016,50 @@ static int virtio_mem_sbm_unplug_request(struct virtio_mem *vm, uint64_t diff)
static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm,
unsigned long bb_id)
{
+ const unsigned long start_pfn = PFN_DOWN(virtio_mem_bb_id_to_phys(vm, bb_id));
+ const unsigned long nr_pages = PFN_DOWN(vm->bbm.bb_size);
+ unsigned long end_pfn = start_pfn + nr_pages;
+ unsigned long pfn;
+ struct page *page;
int rc;
if (WARN_ON_ONCE(virtio_mem_bbm_get_bb_state(vm, bb_id) !=
VIRTIO_MEM_BBM_BB_ADDED))
return -EINVAL;
+ if (bbm_safe_unplug) {
+ /*
+ * Start by fake-offlining all memory. Once we marked the device
+ * block as fake-offline, all newly onlined memory will
+ * automatically be kept fake-offline. Protect from concurrent
+ * onlining/offlining until we have a consistent state.
+ */
+ mutex_lock(&vm->hotplug_mutex);
+ virtio_mem_bbm_set_bb_state(vm, bb_id,
+ VIRTIO_MEM_BBM_BB_FAKE_OFFLINE);
+
+ for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+ page = pfn_to_online_page(pfn);
+ if (!page)
+ continue;
+
+ rc = virtio_mem_fake_offline(pfn, PAGES_PER_SECTION);
+ if (rc) {
+ end_pfn = pfn;
+ goto rollback_safe_unplug;
+ }
+ }
+ mutex_unlock(&vm->hotplug_mutex);
+ }
+
rc = virtio_mem_bbm_offline_and_remove_bb(vm, bb_id);
- if (rc)
+ if (rc) {
+ if (bbm_safe_unplug) {
+ mutex_lock(&vm->hotplug_mutex);
+ goto rollback_safe_unplug;
+ }
return rc;
+ }
rc = virtio_mem_bbm_unplug_bb(vm, bb_id);
if (rc)
@@ -1987,6 +2069,17 @@ static int virtio_mem_bbm_offline_remove_and_unplug_bb(struct virtio_mem *vm,
virtio_mem_bbm_set_bb_state(vm, bb_id,
VIRTIO_MEM_BBM_BB_UNUSED);
return rc;
+
+rollback_safe_unplug:
+ for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
+ page = pfn_to_online_page(pfn);
+ if (!page)
+ continue;
+ virtio_mem_fake_online(pfn, PAGES_PER_SECTION);
+ }
+ virtio_mem_bbm_set_bb_state(vm, bb_id, VIRTIO_MEM_BBM_BB_ADDED);
+ mutex_unlock(&vm->hotplug_mutex);
+ return rc;
}
/*
--
2.26.2