Date: Mon, 25 Jul 2016 17:39:13 +0900
From: Minchan Kim
To: Mel Gorman
CC: Andrew Morton, Johannes Weiner, Michal Hocko, Vlastimil Babka, Linux-MM, LKML
Subject: Re: [PATCH 5/5] mm, vmscan: Account for skipped pages as a partial scan
Message-ID: <20160725083913.GE1660@bbox>
References: <1469110261-7365-1-git-send-email-mgorman@techsingularity.net> <1469110261-7365-6-git-send-email-mgorman@techsingularity.net>
In-Reply-To: <1469110261-7365-6-git-send-email-mgorman@techsingularity.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 21, 2016 at 03:11:01PM +0100, Mel Gorman wrote:
> Page reclaim determines whether a pgdat is unreclaimable by examining how
> many pages have been scanned since a page was freed and comparing that to
> the LRU sizes. Skipped pages are not reclaim candidates but contribute to
> scanned. This can prematurely mark a pgdat as unreclaimable and trigger
> an OOM kill.
>
> This patch accounts for skipped pages as a partial scan so that an
> unreclaimable pgdat will still be marked as such but by scaling the cost
> of a skip, it'll avoid the pgdat being marked prematurely.
>
> Signed-off-by: Mel Gorman
> ---
>  mm/vmscan.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 6810d81f60c7..e5af357dd4ac 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1424,7 +1424,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
>  	LIST_HEAD(pages_skipped);
>
>  	for (scan = 0; scan < nr_to_scan && nr_taken < nr_to_scan &&
> -					!list_empty(src); scan++) {
> +					!list_empty(src);) {
>  		struct page *page;
>
>  		page = lru_to_page(src);
> @@ -1438,6 +1438,12 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
>  			continue;
>  		}
>
> +		/*
> +		 * Account for scanned and skipped separetly to avoid the pgdat
> +		 * being prematurely marked unreclaimable by pgdat_reclaimable.
> +		 */
> +		scan++;
> +
>  		switch (__isolate_lru_page(page, mode)) {
>  		case 0:
>  			nr_pages = hpage_nr_pages(page);
> @@ -1465,14 +1471,24 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
>  	 */
>  	if (!list_empty(&pages_skipped)) {
>  		int zid;
> +		unsigned long total_skipped = 0;
>
> -		list_splice(&pages_skipped, src);
>  		for (zid = 0; zid < MAX_NR_ZONES; zid++) {
>  			if (!nr_skipped[zid])
>  				continue;
>
>  			__count_zid_vm_events(PGSCAN_SKIP, zid, nr_skipped[zid]);
> +			total_skipped += nr_skipped[zid];
>  		}
> +
> +		/*
> +		 * Account skipped pages as a partial scan as the pgdat may be
> +		 * close to unreclaimable. If the LRU list is empty, account
> +		 * skipped pages as a full scan.
> +		 */

node-lru made OOM detection take longer, because freeing a page in any
zone easily resets NR_PAGES_SCANNED, so it is hard to reach the point
where pgdat_reclaimable returns *false*. When I run a stress test, I seem
to hit that situation easily, although I have no numbers right now.

Anyway, this patch makes sense to me because it's better than what we
have now. As for the scan accounting, I support the idea. Still, I doubt
it's okay to keep skipping pages under the irq-disabled spinlock without
any condition.
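
To make the two points above concrete, here is a minimal userspace sketch of the loop in the quoted hunk. It is not the kernel code: struct model_page, struct scan_result, model_isolate() and SKIP_SHIFT are hypothetical names made up for illustration, the scaling factor is an assumption (the quoted hunk stops before the line that applies it), and every eligible page is assumed to isolate successfully. It shows how skipped pages are folded into scan, and how many loop iterations run before the loop exits, which is the work that would happen under the lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page on the LRU; only its zone index matters. */
struct model_page {
	int zid;
};

struct scan_result {
	unsigned long scan;       /* value reported as "scanned" */
	unsigned long nr_taken;   /* pages isolated */
	unsigned long nr_skipped; /* pages with zid above reclaim_idx */
	unsigned long iterations; /* loop iterations, i.e. work done under the lock */
};

/* Assumption for this sketch: a skip costs 1/4 of a scan (shift by 2). */
#define SKIP_SHIFT 2

static struct scan_result model_isolate(const struct model_page *lru,
					size_t nr_pages,
					unsigned long nr_to_scan,
					int reclaim_idx)
{
	struct scan_result r = { 0, 0, 0, 0 };
	size_t pos = 0;	/* pages before pos have been removed from the list */
	bool lru_empty;

	while (r.scan < nr_to_scan && r.nr_taken < nr_to_scan && pos < nr_pages) {
		const struct model_page *page = &lru[pos++];

		r.iterations++;

		/* Skipped pages no longer advance scan, so they do not end the loop. */
		if (page->zid > reclaim_idx) {
			r.nr_skipped++;
			continue;
		}

		r.scan++;
		r.nr_taken++;	/* assume isolation always succeeds */
	}

	/* Partial scan for skips; full scan if the list was drained. */
	lru_empty = (pos == nr_pages);
	r.scan += lru_empty ? r.nr_skipped : r.nr_skipped >> SKIP_SHIFT;

	return r;
}
```

With a hypothetical LRU of 1000 pages where the first 900 sit in a zone above reclaim_idx and nr_to_scan is 32, the model runs 932 iterations, isolates 32 pages and reports scan as 32 + (900 >> 2) = 257. If all 1000 pages are ineligible, the list drains after 1000 iterations and scan is 1000. In both cases the number of iterations is bounded only by the length of the list, not by nr_to_scan.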