* request for 4.14-stable: c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
@ 2018-09-25 10:02 Sudip Mukherjee
2018-10-11 14:08 ` Greg Kroah-Hartman
0 siblings, 1 reply; 2+ messages in thread
From: Sudip Mukherjee @ 2018-09-25 10:02 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: stable, Tetsuo Handa, Michal Hocko, Wei Wang, Michael S. Tsirkin,
Jan Stancek
[-- Attachment #1: Type: text/plain, Size: 265 bytes --]
Hi Greg,
This was not marked for stable but seems it should be in stable.
Please apply to your queue of 4.14-stable.
And at the same time, there was another later patch which fixed a bug
in this patch. Both are attached as a series to this mail.
--
Regards
Sudip
[-- Attachment #2: 0001-virtio_balloon-fix-deadlock-on-OOM.patch --]
[-- Type: text/x-diff, Size: 8032 bytes --]
>From a95f45a5e326fc63db1d5e3d33e040b78054d9dd Mon Sep 17 00:00:00 2001
From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Fri, 13 Oct 2017 16:11:48 +0300
Subject: [PATCH 1/2] virtio_balloon: fix deadlock on OOM
commit c7cdff0e864713a089d7cb3a2b1136ba9a54881a upstream
fill_balloon doing memory allocations under balloon_lock
can cause a deadlock when leak_balloon is called from
virtballoon_oom_notify and tries to take same lock.
To fix, split page allocation and enqueue and do allocations outside the lock.
Here's a detailed analysis of the deadlock by Tetsuo Handa:
In leak_balloon(), mutex_lock(&vb->balloon_lock) is called in order to
serialize against fill_balloon(). But in fill_balloon(),
alloc_page(GFP_HIGHUSER[_MOVABLE] | __GFP_NOMEMALLOC | __GFP_NORETRY) is
called with vb->balloon_lock mutex held. Since GFP_HIGHUSER[_MOVABLE]
implies __GFP_DIRECT_RECLAIM | __GFP_IO | __GFP_FS, despite __GFP_NORETRY
is specified, this allocation attempt might indirectly depend on somebody
else's __GFP_DIRECT_RECLAIM memory allocation. And such indirect
__GFP_DIRECT_RECLAIM memory allocation might call leak_balloon() via
virtballoon_oom_notify() via blocking_notifier_call_chain() callback via
out_of_memory() when it reached __alloc_pages_may_oom() and held oom_lock
mutex. Since vb->balloon_lock mutex is already held by fill_balloon(), it
will cause OOM lockup.
Thread1 Thread2
fill_balloon()
takes a balloon_lock
balloon_page_enqueue()
alloc_page(GFP_HIGHUSER_MOVABLE)
direct reclaim (__GFP_FS context) takes a fs lock
waits for that fs lock alloc_page(GFP_NOFS)
__alloc_pages_may_oom()
takes the oom_lock
out_of_memory()
blocking_notifier_call_chain()
leak_balloon()
tries to take that balloon_lock and deadlocks
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
---
drivers/virtio/virtio_balloon.c | 24 +++++++++++++++++++-----
include/linux/balloon_compaction.h | 35 ++++++++++++++++++++++++++++++++++-
mm/balloon_compaction.c | 28 +++++++++++++++++++++-------
3 files changed, 74 insertions(+), 13 deletions(-)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 36c9fbf70d44..43fcd17a738e 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -143,16 +143,17 @@ static void set_page_pfns(struct virtio_balloon *vb,
static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
{
- struct balloon_dev_info *vb_dev_info = &vb->vb_dev_info;
unsigned num_allocated_pages;
+ unsigned num_pfns;
+ struct page *page;
+ LIST_HEAD(pages);
/* We can only do one array worth at a time. */
num = min(num, ARRAY_SIZE(vb->pfns));
- mutex_lock(&vb->balloon_lock);
- for (vb->num_pfns = 0; vb->num_pfns < num;
- vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
- struct page *page = balloon_page_enqueue(vb_dev_info);
+ for (num_pfns = 0; num_pfns < num;
+ num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
+ struct page *page = balloon_page_alloc();
if (!page) {
dev_info_ratelimited(&vb->vdev->dev,
@@ -162,6 +163,19 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
msleep(200);
break;
}
+
+ balloon_page_push(&pages, page);
+ }
+
+ mutex_lock(&vb->balloon_lock);
+
+ vb->num_pfns = 0;
+
+ while ((page = balloon_page_pop(&pages))) {
+ balloon_page_enqueue(&vb->vb_dev_info, page);
+
+ vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE;
+
set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
if (!virtio_has_feature(vb->vdev,
diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
index fbbe6da40fed..53051f3d8f25 100644
--- a/include/linux/balloon_compaction.h
+++ b/include/linux/balloon_compaction.h
@@ -50,6 +50,7 @@
#include <linux/gfp.h>
#include <linux/err.h>
#include <linux/fs.h>
+#include <linux/list.h>
/*
* Balloon device information descriptor.
@@ -67,7 +68,9 @@ struct balloon_dev_info {
struct inode *inode;
};
-extern struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info);
+extern struct page *balloon_page_alloc(void);
+extern void balloon_page_enqueue(struct balloon_dev_info *b_dev_info,
+ struct page *page);
extern struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info);
static inline void balloon_devinfo_init(struct balloon_dev_info *balloon)
@@ -193,4 +196,34 @@ static inline gfp_t balloon_mapping_gfp_mask(void)
}
#endif /* CONFIG_BALLOON_COMPACTION */
+
+/*
+ * balloon_page_push - insert a page into a page list.
+ * @head : pointer to list
+ * @page : page to be added
+ *
+ * Caller must ensure the page is private and protect the list.
+ */
+static inline void balloon_page_push(struct list_head *pages, struct page *page)
+{
+ list_add(&page->lru, pages);
+}
+
+/*
+ * balloon_page_pop - remove a page from a page list.
+ * @head : pointer to list
+ * @page : page to be added
+ *
+ * Caller must ensure the page is private and protect the list.
+ */
+static inline struct page *balloon_page_pop(struct list_head *pages)
+{
+ struct page *page = list_first_entry_or_null(pages, struct page, lru);
+
+ if (!page)
+ return NULL;
+
+ list_del(&page->lru);
+ return page;
+}
#endif /* _LINUX_BALLOON_COMPACTION_H */
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index 68d28924ba79..ef858d547e2d 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -11,22 +11,37 @@
#include <linux/balloon_compaction.h>
/*
+ * balloon_page_alloc - allocates a new page for insertion into the balloon
+ * page list.
+ *
+ * Driver must call it to properly allocate a new enlisted balloon page.
+ * Driver must call balloon_page_enqueue before definitively removing it from
+ * the guest system. This function returns the page address for the recently
+ * allocated page or NULL in the case we fail to allocate a new page this turn.
+ */
+struct page *balloon_page_alloc(void)
+{
+ struct page *page = alloc_page(balloon_mapping_gfp_mask() |
+ __GFP_NOMEMALLOC | __GFP_NORETRY);
+ return page;
+}
+EXPORT_SYMBOL_GPL(balloon_page_alloc);
+
+/*
* balloon_page_enqueue - allocates a new page and inserts it into the balloon
* page list.
* @b_dev_info: balloon device descriptor where we will insert a new page to
+ * @page: new page to enqueue - allocated using balloon_page_alloc.
*
- * Driver must call it to properly allocate a new enlisted balloon page
+ * Driver must call it to properly enqueue a new allocated balloon page
* before definitively removing it from the guest system.
* This function returns the page address for the recently enqueued page or
* NULL in the case we fail to allocate a new page this turn.
*/
-struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
+void balloon_page_enqueue(struct balloon_dev_info *b_dev_info,
+ struct page *page)
{
unsigned long flags;
- struct page *page = alloc_page(balloon_mapping_gfp_mask() |
- __GFP_NOMEMALLOC | __GFP_NORETRY);
- if (!page)
- return NULL;
/*
* Block others from accessing the 'page' when we get around to
@@ -39,7 +54,6 @@ struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
__count_vm_event(BALLOON_INFLATE);
spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
unlock_page(page);
- return page;
}
EXPORT_SYMBOL_GPL(balloon_page_enqueue);
--
2.11.0
[-- Attachment #3: 0002-virtio_balloon-fix-increment-of-vb-num_pfns-in-fill_.patch --]
[-- Type: text/x-diff, Size: 1785 bytes --]
>From 4336b085d95e3dbdc70d88715eedcab7bfbfda0a Mon Sep 17 00:00:00 2001
From: Jan Stancek <jstancek@redhat.com>
Date: Fri, 1 Dec 2017 10:50:28 +0100
Subject: [PATCH 2/2] virtio_balloon: fix increment of vb->num_pfns in
fill_balloon()
commit d9e427f6ab8142d6868eb719e6a7851aafea56b6 upstream
commit c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
changed code to increment vb->num_pfns before call to
set_page_pfns(), which used to happen only after.
This patch fixes boot hang for me on ppc64le KVM guests.
Fixes: c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
---
drivers/virtio/virtio_balloon.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 43fcd17a738e..d9873aa014a6 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -174,13 +174,12 @@ static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
while ((page = balloon_page_pop(&pages))) {
balloon_page_enqueue(&vb->vb_dev_info, page);
- vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE;
-
set_page_pfns(vb, vb->pfns + vb->num_pfns, page);
vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
if (!virtio_has_feature(vb->vdev,
VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
adjust_managed_page_count(page, -1);
+ vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE;
}
num_allocated_pages = vb->num_pfns;
--
2.11.0
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: request for 4.14-stable: c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM")
2018-09-25 10:02 request for 4.14-stable: c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM") Sudip Mukherjee
@ 2018-10-11 14:08 ` Greg Kroah-Hartman
0 siblings, 0 replies; 2+ messages in thread
From: Greg Kroah-Hartman @ 2018-10-11 14:08 UTC (permalink / raw)
To: Sudip Mukherjee
Cc: stable, Tetsuo Handa, Michal Hocko, Wei Wang, Michael S. Tsirkin,
Jan Stancek
On Tue, Sep 25, 2018 at 11:02:42AM +0100, Sudip Mukherjee wrote:
> Hi Greg,
>
> This was not marked for stable but seems it should be in stable.
> Please apply to your queue of 4.14-stable.
> And at the same time, there was another later patch which fixed a bug
> in this patch. Both are attached as a series to this mail.
Now queued up, thanks.
greg k-h
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2018-10-11 21:36 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-09-25 10:02 request for 4.14-stable: c7cdff0e8647 ("virtio_balloon: fix deadlock on OOM") Sudip Mukherjee
2018-10-11 14:08 ` Greg Kroah-Hartman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox