From: Michal Hocko <mhocko@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov@parallels.com>,
Hugh Dickins <hughd@google.com>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] memcg, vmscan: Do not wait for writeback if killed
Date: Thu, 3 Dec 2015 10:08:26 +0100
Message-ID: <20151203090826.GD9264@dhcp22.suse.cz>
In-Reply-To: <20151202142503.0921c0d6e06394ff7dff85fa@linux-foundation.org>

On Wed 02-12-15 14:25:03, Andrew Morton wrote:
> On Wed, 2 Dec 2015 15:26:18 +0100 Michal Hocko <mhocko@kernel.org> wrote:
>
> > From: Michal Hocko <mhocko@suse.com>
> >
> > Legacy memcg reclaim waits for pages under writeback to prevent a
> > premature OOM killer invocation because there was no memcg dirty limit
> > throttling implemented back then.
> >
> > This heuristic might complicate the situation when the writeback cannot
> > make forward progress because of a global OOM situation. E.g. a filesystem
> > backed by the loop device relies on the underlying filesystem hosting
> > the image to make forward progress, which cannot be guaranteed, and so
> > we might end up triggering the OOM killer to resolve the situation. If the
> > OOM victim happens to be the task stuck in wait_on_page_writeback in the
> > memcg reclaim then we are basically deadlocked.
> >
> > Introduce wait_on_page_writeback_killable and use it in this path to
> > prevent the issue. shrink_page_list will back off if the wait
> > was interrupted.
> >
> > ...
> >
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1021,10 +1021,19 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >
> >  			/* Case 3 above */
> >  			} else {
> > +				int ret;
> > +
> >  				unlock_page(page);
> > -				wait_on_page_writeback(page);
> > +				ret = wait_on_page_writeback_killable(page);
> >  				/* then go back and try same page again */
> >  				list_add_tail(&page->lru, page_list);
> > +
> > +				/*
> > +				 * We've got killed while waiting here so
> > +				 * expedite our way out from the reclaim
> > +				 */
> > +				if (ret)
> > +					break;
> >  				continue;
> >  			}
> >  		}
>
> This function is 350 lines long and it takes a bit of effort to work
> out what that `break' is breaking from and where it goes next. I think
> you want a "goto keep_killed" here for consistency and sanity.

Yeah, sounds better. See an update below:

> Also, there's high risk here of a pending signal causing the code to
> fall into some busy loop where it repeatedly tries to do something but
> then bails out without doing it. It's unobvious how this change avoids
> such things. (Maybe it *does* avoid such things, but it should be
> obvious!).

shrink_page_list is called from __alloc_contig_migrate_range and
shrink_inactive_list. Both of them handle fatal_signal_pending and bail
out. I was relying on this behavior. I realize this is far from optimal
wrt. readability but I do not have a great idea how to improve it
without sticking more fatal_signal_pending checks into the reclaim path.
So you think a comment would be sufficient?
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 98a1934493af..2e8ee9e5fcb5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1031,9 +1031,12 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 				/*
 				 * We've got killed while waiting here so
 				 * expedite our way out from the reclaim
+				 *
+				 * Our callers should make sure we do not
+				 * get here with fatal signals again.
 				 */
 				if (ret)
-					break;
+					goto keep_killed;
 				continue;
 			}
 		}
@@ -1227,6 +1230,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		VM_BUG_ON_PAGE(PageLRU(page) || PageUnevictable(page), page);
 	}
 
+keep_killed:
 	mem_cgroup_uncharge_list(&free_pages);
 	try_to_unmap_flush();
 	free_hot_cold_page_list(&free_pages, true);
--
Michal Hocko
SUSE Labs
Thread overview: 4+ messages
2015-12-02 14:26 [PATCH] memcg, vmscan: Do not wait for writeback if killed Michal Hocko
2015-12-02 22:25 ` Andrew Morton
2015-12-03  9:08   ` Michal Hocko [this message]
2015-12-05  1:03     ` Andrew Morton