Re: [PATCH] mm: compaction: Abort compaction if too many pages are isolated and caller is asynchronous

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mel Gorman <mgorman@suse.de>
To: Minchan Kim <minchan.kim@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Mel Gorman <mel@csn.ul.ie>,
	akpm@linux-foundation.org, Ury Stankevich <urykhy@gmail.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: compaction: Abort compaction if too many pages are isolated and caller is asynchronous
Date: Mon, 6 Jun 2011 11:15:57 +0100	[thread overview]
Message-ID: <20110606101557.GA5247@suse.de> (raw)
In-Reply-To: <BANLkTikA+ugFNS95Zs_o6QqG2u4r2g93=Q@mail.gmail.com>

On Fri, Jun 03, 2011 at 08:01:44AM +0900, Minchan Kim wrote:
> On Fri, Jun 3, 2011 at 7:32 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> > On Fri, Jun 03, 2011 at 07:23:48AM +0900, Minchan Kim wrote:
> >> I mean we have more tail pages than head pages. So I think we are likely to
> >> meet tail pages. Of course, compared to all pages(page cache, anon and
> >> so on), compound pages would be very small percentage.
> >
> > Yes that's my point, that being a small percentage it's no big deal to
> > break the loop early.
> 
> Indeed.
> 
> >
> >> > isolated the head and it's useless to insist on more tail pages (at
> >> > least for large page size like on x86). Plus we've compaction so
> >>
> >> I can't understand your point. Could you elaborate it?
> >
> > What I meant is that if we already isolated the head page of the THP,
> > we don't need to try to free the tail pages and breaking the loop
> > early, will still give us a chance to free a whole 2m because we
> > isolated the head page (it'll involve some work and swapping but if it
> > was a compoundtranspage we're ok to break the loop and we're not
> > making the logic any worse). Provided the PMD_SIZE is quite large like
> > 2/4m...
> 
> Do you want this? (it's almost pseudo-code)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 7a4469b..9d7609f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1017,7 +1017,7 @@ static unsigned long isolate_lru_pages(unsigned
> long nr_to_scan,
>         for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) {
>                 struct page *page;
>                 unsigned long pfn;
> -               unsigned long end_pfn;
> +               unsigned long start_pfn, end_pfn;
>                 unsigned long page_pfn;
>                 int zone_id;
> 
> @@ -1057,9 +1057,9 @@ static unsigned long isolate_lru_pages(unsigned
> long nr_to_scan,
>                  */
>                 zone_id = page_zone_id(page);
>                 page_pfn = page_to_pfn(page);
> -               pfn = page_pfn & ~((1 << order) - 1);
> +               start_pfn = pfn = page_pfn & ~((1 << order) - 1);
>                 end_pfn = pfn + (1 << order);
> -               for (; pfn < end_pfn; pfn++) {
> +               while (pfn < end_pfn) {
>                         struct page *cursor_page;
> 
>                         /* The target page is in the block, ignore it. */
> @@ -1086,17 +1086,25 @@ static unsigned long
> isolate_lru_pages(unsigned long nr_to_scan,
>                                 break;
> 
>                         if (__isolate_lru_page(cursor_page, mode, file) == 0) {
> +                               int isolated_pages;
>                                 list_move(&cursor_page->lru, dst);
>                                 mem_cgroup_del_lru(cursor_page);
> -                               nr_taken += hpage_nr_pages(page);
> +                               isolated_pages = hpage_nr_pages(page);
> +                               nr_taken += isolated_pages;
> +                               /* if we isolated pages enough, let's
> break early */
> +                               if (nr_taken > end_pfn - start_pfn)
> +                                       break;
> +                               pfn += isolated_pages;

I think this condition is somewhat unlikely. We are scanning within
aligned blocks in this linear scanner. Huge pages are always aligned
so the only situation where we'll encounter a hugepage in the middle
of this linear scan is when the requested order is larger than a huge
page. This is exceptionally rare.

Did I miss something?

>                                 nr_lumpy_taken++;
>                                 if (PageDirty(cursor_page))
>                                         nr_lumpy_dirty++;
>                                 scan++;
>                         } else {
>                                 /* the page is freed already. */
> -                               if (!page_count(cursor_page))
> +                               if (!page_count(cursor_page)) {
> +                                       pfn++;
>                                         continue;
> +                               }
>                                 break;
>                         }
>                 }
> 
> >
> > The only way this patch makes things worse is for slub order 3 in the
> > process of being freed. But tail pages aren't generally free anyway so
> > I doubt this really makes any difference plus the tail is getting
> > cleared as soon as the page reaches the buddy so it's probably
> 
> Okay. Considering getting clear PG_tail as soon as slub order 3 is
> freed, it would be very rare case.
> 
> > unnoticeable as this then makes a difference only during a race (plus
> > the tail page can't be isolated, only head page can be part of lrus
> > and only if they're THP).
> >
> >> > insisting and screwing lru ordering isn't worth it, better to be
> >> > permissive and abort... in fact I wouldn't dislike to remove the
> >> > entire lumpy logic when COMPACTION_BUILD is true, but that alters the
> >> > trace too...
> >>
> >> AFAIK, it's final destination to go as compaction will not break lru
> >> ordering if my patch(inorder-putback) is merged.
> >
> > Agreed. I like your patchset, sorry for not having reviewed it in
> > detail yet but there were other issues popping up in the last few
> > days.
> 
> No problem. it's urgent than mine. :)
> 

I'm going to take the opportunity to apologise for not reviewing that
series yet. I've been kept too busy with other bugs to set side the
few hours I need to review the series. I'm hoping to get to it this
week if everything goes well.

-- 
Mel Gorman
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)

From: Mel Gorman <mgorman@suse.de>
To: Minchan Kim <minchan.kim@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	Mel Gorman <mel@csn.ul.ie>,
	akpm@linux-foundation.org, Ury Stankevich <urykhy@gmail.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: compaction: Abort compaction if too many pages are isolated and caller is asynchronous
Date: Mon, 6 Jun 2011 11:15:57 +0100	[thread overview]
Message-ID: <20110606101557.GA5247@suse.de> (raw)
In-Reply-To: <BANLkTikA+ugFNS95Zs_o6QqG2u4r2g93=Q@mail.gmail.com>

On Fri, Jun 03, 2011 at 08:01:44AM +0900, Minchan Kim wrote:
> On Fri, Jun 3, 2011 at 7:32 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> > On Fri, Jun 03, 2011 at 07:23:48AM +0900, Minchan Kim wrote:
> >> I mean we have more tail pages than head pages. So I think we are likely to
> >> meet tail pages. Of course, compared to all pages(page cache, anon and
> >> so on), compound pages would be very small percentage.
> >
> > Yes that's my point, that being a small percentage it's no big deal to
> > break the loop early.
> 
> Indeed.
> 
> >
> >> > isolated the head and it's useless to insist on more tail pages (at
> >> > least for large page size like on x86). Plus we've compaction so
> >>
> >> I can't understand your point. Could you elaborate it?
> >
> > What I meant is that if we already isolated the head page of the THP,
> > we don't need to try to free the tail pages and breaking the loop
> > early, will still give us a chance to free a whole 2m because we
> > isolated the head page (it'll involve some work and swapping but if it
> > was a compoundtranspage we're ok to break the loop and we're not
> > making the logic any worse). Provided the PMD_SIZE is quite large like
> > 2/4m...
> 
> Do you want this? (it's almost pseudo-code)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 7a4469b..9d7609f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1017,7 +1017,7 @@ static unsigned long isolate_lru_pages(unsigned
> long nr_to_scan,
>         for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) {
>                 struct page *page;
>                 unsigned long pfn;
> -               unsigned long end_pfn;
> +               unsigned long start_pfn, end_pfn;
>                 unsigned long page_pfn;
>                 int zone_id;
> 
> @@ -1057,9 +1057,9 @@ static unsigned long isolate_lru_pages(unsigned
> long nr_to_scan,
>                  */
>                 zone_id = page_zone_id(page);
>                 page_pfn = page_to_pfn(page);
> -               pfn = page_pfn & ~((1 << order) - 1);
> +               start_pfn = pfn = page_pfn & ~((1 << order) - 1);
>                 end_pfn = pfn + (1 << order);
> -               for (; pfn < end_pfn; pfn++) {
> +               while (pfn < end_pfn) {
>                         struct page *cursor_page;
> 
>                         /* The target page is in the block, ignore it. */
> @@ -1086,17 +1086,25 @@ static unsigned long
> isolate_lru_pages(unsigned long nr_to_scan,
>                                 break;
> 
>                         if (__isolate_lru_page(cursor_page, mode, file) == 0) {
> +                               int isolated_pages;
>                                 list_move(&cursor_page->lru, dst);
>                                 mem_cgroup_del_lru(cursor_page);
> -                               nr_taken += hpage_nr_pages(page);
> +                               isolated_pages = hpage_nr_pages(page);
> +                               nr_taken += isolated_pages;
> +                               /* if we isolated pages enough, let's
> break early */
> +                               if (nr_taken > end_pfn - start_pfn)
> +                                       break;
> +                               pfn += isolated_pages;

I think this condition is somewhat unlikely. We are scanning within
aligned blocks in this linear scanner. Huge pages are always aligned
so the only situation where we'll encounter a hugepage in the middle
of this linear scan is when the requested order is larger than a huge
page. This is exceptionally rare.

Did I miss something?

>                                 nr_lumpy_taken++;
>                                 if (PageDirty(cursor_page))
>                                         nr_lumpy_dirty++;
>                                 scan++;
>                         } else {
>                                 /* the page is freed already. */
> -                               if (!page_count(cursor_page))
> +                               if (!page_count(cursor_page)) {
> +                                       pfn++;
>                                         continue;
> +                               }
>                                 break;
>                         }
>                 }
> 
> >
> > The only way this patch makes things worse is for slub order 3 in the
> > process of being freed. But tail pages aren't generally free anyway so
> > I doubt this really makes any difference plus the tail is getting
> > cleared as soon as the page reaches the buddy so it's probably
> 
> Okay. Considering getting clear PG_tail as soon as slub order 3 is
> freed, it would be very rare case.
> 
> > unnoticeable as this then makes a difference only during a race (plus
> > the tail page can't be isolated, only head page can be part of lrus
> > and only if they're THP).
> >
> >> > insisting and screwing lru ordering isn't worth it, better to be
> >> > permissive and abort... in fact I wouldn't dislike to remove the
> >> > entire lumpy logic when COMPACTION_BUILD is true, but that alters the
> >> > trace too...
> >>
> >> AFAIK, it's final destination to go as compaction will not break lru
> >> ordering if my patch(inorder-putback) is merged.
> >
> > Agreed. I like your patchset, sorry for not having reviewed it in
> > detail yet but there were other issues popping up in the last few
> > days.
> 
> No problem. it's urgent than mine. :)
> 

I'm going to take the opportunity to apologise for not reviewing that
series yet. I've been kept too busy with other bugs to set side the
few hours I need to review the series. I'm hoping to get to it this
week if everything goes well.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2011-06-06 10:16 UTC|newest]

Thread overview: 126+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-30 13:13 [PATCH] mm: compaction: Abort compaction if too many pages are isolated and caller is asynchronous Mel Gorman
2011-05-30 13:13 ` Mel Gorman
2011-05-30 14:31 ` Andrea Arcangeli
2011-05-30 14:31   ` Andrea Arcangeli
2011-05-30 15:37   ` Mel Gorman
2011-05-30 15:37     ` Mel Gorman
2011-05-30 16:55     ` Mel Gorman
2011-05-30 16:55       ` Mel Gorman
2011-05-30 17:53       ` Andrea Arcangeli
2011-05-30 17:53         ` Andrea Arcangeli
2011-05-31 12:16         ` Minchan Kim
2011-05-31 12:16           ` Minchan Kim
2011-05-31 12:24           ` Andrea Arcangeli
2011-05-31 12:24             ` Andrea Arcangeli
2011-05-31 13:33             ` Minchan Kim
2011-05-31 13:33               ` Minchan Kim
2011-05-31 14:14               ` Andrea Arcangeli
2011-05-31 14:14                 ` Andrea Arcangeli
2011-05-31 14:37                 ` Minchan Kim
2011-05-31 14:37                   ` Minchan Kim
2011-05-31 14:38                   ` Minchan Kim
2011-05-31 14:38                     ` Minchan Kim
2011-06-02 18:23                     ` Andrea Arcangeli
2011-06-02 18:23                       ` Andrea Arcangeli
2011-06-02 20:21                       ` Minchan Kim
2011-06-02 20:21                         ` Minchan Kim
2011-06-02 20:59                         ` Minchan Kim
2011-06-02 20:59                           ` Minchan Kim
2011-06-02 22:03                           ` Andrea Arcangeli
2011-06-02 22:03                             ` Andrea Arcangeli
2011-06-02 21:40                         ` Andrea Arcangeli
2011-06-02 21:40                           ` Andrea Arcangeli
2011-06-02 22:23                           ` Minchan Kim
2011-06-02 22:23                             ` Minchan Kim
2011-06-02 22:32                             ` Andrea Arcangeli
2011-06-02 22:32                               ` Andrea Arcangeli
2011-06-02 23:01                               ` Minchan Kim
2011-06-02 23:01                                 ` Minchan Kim
2011-06-03 17:37                                 ` Andrea Arcangeli
2011-06-03 17:37                                   ` Andrea Arcangeli
2011-06-03 18:07                                   ` Andrea Arcangeli
2011-06-03 18:07                                     ` Andrea Arcangeli
2011-06-04  7:59                                     ` Minchan Kim
2011-06-04  7:59                                       ` Minchan Kim
2011-06-06 10:32                                     ` Mel Gorman
2011-06-06 10:32                                       ` Mel Gorman
2011-06-06 12:49                                       ` Andrea Arcangeli
2011-06-06 12:49                                         ` Andrea Arcangeli
2011-06-06 14:47                                         ` Mel Gorman
2011-06-06 14:47                                           ` Mel Gorman
2011-06-06 14:07                                       ` Minchan Kim
2011-06-06 14:07                                         ` Minchan Kim
2011-06-06 10:15                                 ` Mel Gorman [this message]
2011-06-06 10:15                                   ` Mel Gorman
2011-06-06 10:26                                   ` Mel Gorman
2011-06-06 10:26                                     ` Mel Gorman
2011-06-06 14:01                                   ` Minchan Kim
2011-06-06 14:01                                     ` Minchan Kim
2011-06-06 14:26                                   ` Minchan Kim
2011-06-06 14:26                                     ` Minchan Kim
2011-06-02 23:02                       ` Minchan Kim
2011-06-02 23:02                         ` Minchan Kim
2011-06-01  0:57                 ` Mel Gorman
2011-06-01  0:57                   ` Mel Gorman
2011-06-01  9:24                   ` Mel Gorman
2011-06-01  9:24                     ` Mel Gorman
2011-06-01 17:58                   ` Mel Gorman
2011-06-01 17:58                     ` Mel Gorman
2011-06-01 19:15                     ` Andrea Arcangeli
2011-06-01 19:15                       ` Andrea Arcangeli
2011-06-01 21:40                       ` Mel Gorman
2011-06-01 21:40                         ` Mel Gorman
2011-06-01 23:30                         ` Andrea Arcangeli
2011-06-01 23:30                           ` Andrea Arcangeli
2011-06-02  1:03                           ` Mel Gorman
2011-06-02  1:03                             ` Mel Gorman
2011-06-02  8:34                             ` Minchan Kim
2011-06-02  8:34                               ` Minchan Kim
2011-06-02 13:29                             ` Andrea Arcangeli
2011-06-02 13:29                               ` Andrea Arcangeli
2011-06-02 14:50                               ` Mel Gorman
2011-06-02 14:50                                 ` Mel Gorman
2011-06-02 15:37                                 ` Andrea Arcangeli
2011-06-02 15:37                                   ` Andrea Arcangeli
2011-06-03  2:09                                   ` Mel Gorman
2011-06-03  2:09                                     ` Mel Gorman
2011-06-03 14:49                                     ` Mel Gorman
2011-06-03 14:49                                       ` Mel Gorman
2011-06-03 15:45                                       ` Andrea Arcangeli
2011-06-03 15:45                                         ` Andrea Arcangeli
2011-06-04  7:25                                         ` Minchan Kim
2011-06-04  7:25                                           ` Minchan Kim
2011-06-06 10:39                                         ` Mel Gorman
2011-06-06 10:39                                           ` Mel Gorman
2011-06-06 12:38                                           ` Andrea Arcangeli
2011-06-06 12:38                                             ` Andrea Arcangeli
2011-06-06 14:55                                             ` Mel Gorman
2011-06-06 14:55                                               ` Mel Gorman
2011-06-06 14:19                                           ` Minchan Kim
2011-06-06 14:19                                             ` Minchan Kim
2011-06-06 22:32                                         ` Andrew Morton
2011-06-06 22:32                                           ` Andrew Morton
2011-06-04  6:58                                       ` Minchan Kim
2011-06-04  6:58                                         ` Minchan Kim
2011-06-06 10:43                                         ` Mel Gorman
2011-06-06 10:43                                           ` Mel Gorman
2011-06-06 12:40                                           ` Andrea Arcangeli
2011-06-06 12:40                                             ` Andrea Arcangeli
2011-06-06 13:27                                             ` Minchan Kim
2011-06-06 13:27                                               ` Minchan Kim
2011-06-06 13:23                                           ` Minchan Kim
2011-06-06 13:23                                             ` Minchan Kim
2011-05-31 14:34         ` Mel Gorman
2011-05-31 14:34           ` Mel Gorman
2011-05-30 14:45 ` [stable] " Greg KH
2011-05-30 14:45   ` Greg KH
2011-05-30 16:14 ` Minchan Kim
2011-05-30 16:14   ` Minchan Kim
2011-05-31  8:32   ` Mel Gorman
2011-05-31  8:32     ` Mel Gorman
2011-05-31  4:48 ` KAMEZAWA Hiroyuki
2011-05-31  4:48   ` KAMEZAWA Hiroyuki
2011-05-31  5:38   ` Minchan Kim
2011-05-31  5:38     ` Minchan Kim
2011-05-31  7:14 ` KOSAKI Motohiro
2011-05-31  7:14   ` KOSAKI Motohiro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110606101557.GA5247@suse.de \
    --to=mgorman@suse.de \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=minchan.kim@gmail.com \
    --cc=urykhy@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.