From: Mel Gorman <mgorman@suse.de>
To: Eric Wong <normalperson@yhbt.net>
Cc: linux-mm@kvack.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, Rik van Riel <riel@redhat.com>,
Minchan Kim <minchan@kernel.org>,
Eric Dumazet <eric.dumazet@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: ppoll() stuck on POLLIN while TCP peer is sending
Date: Tue, 8 Jan 2013 22:43:13 +0000 [thread overview]
Message-ID: <20130108224313.GA13304@suse.de> (raw)
In-Reply-To: <20130107223850.GA21311@dcvr.yhbt.net>
On Mon, Jan 07, 2013 at 10:38:50PM +0000, Eric Wong wrote:
> Mel Gorman <mgorman@suse.de> wrote:
> > Right now it's difficult to see how the capture could be the source of
> > this bug but I'm not ruling it out either so try the following (untested
> > but should be ok) patch. It's not a proper revert, it just disables the
> > capture page logic to see if it's at fault.
>
> Things look good so far with your change.
Ok, so minimally reverting is an option once 2e30abd1 is preserved. The
original motivation for the patch was to improve allocation success rates
under load but due to a bug in the patch the likely source of the improvement
was due to compacting more for THP allocations.
> It's been running 2 hours on a VM and 1 hour on my regular machine.
> Will update again in a few hours (or sooner if it's stuck again).
When I looked at it for long enough I found a number of problems. Most
affect timing but two serious issues are in there. One affects how long
kswapd spends compacting versus reclaiming and the other increases lock
contention meaning that async compaction can abort early. Both are serious
and could explain why a driver would fail high-order allocations.
Please try the following patch. However, even if it works the benefit of
capture may be so marginal that partially reverting it and simplifying
compaction.c is the better decision.
diff --git a/mm/compaction.c b/mm/compaction.c
index 6b807e4..03c82c0 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -857,7 +857,8 @@ static int compact_finished(struct zone *zone,
} else {
unsigned int order;
for (order = cc->order; order < MAX_ORDER; order++) {
- struct free_area *area = &zone->free_area[cc->order];
+ struct free_area *area = &zone->free_area[order];
+
/* Job done if page is free of the right migratetype */
if (!list_empty(&area->free_list[cc->migratetype]))
return COMPACT_PARTIAL;
@@ -929,6 +930,11 @@ static void compact_capture_page(struct compact_control *cc)
if (!cc->page || *cc->page)
return;
+ /* Check that watermarks are satisifed before acquiring locks */
+ if (!zone_watermark_ok(cc->zone, cc->order, low_wmark_pages(cc->zone),
+ 0, 0))
+ return;
+
/*
* For MIGRATE_MOVABLE allocations we capture a suitable page ASAP
* regardless of the migratetype of the freelist is is captured from.
@@ -941,7 +947,7 @@ static void compact_capture_page(struct compact_control *cc)
*/
if (cc->migratetype == MIGRATE_MOVABLE) {
mtype_low = 0;
- mtype_high = MIGRATE_PCPTYPES;
+ mtype_high = MIGRATE_PCPTYPES + 1;
} else {
mtype_low = cc->migratetype;
mtype_high = cc->migratetype + 1;
@@ -1118,7 +1124,6 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,
struct zoneref *z;
struct zone *zone;
int rc = COMPACT_SKIPPED;
- int alloc_flags = 0;
/* Check if the GFP flags allow compaction */
if (!order || !may_enter_fs || !may_perform_io)
@@ -1126,10 +1131,6 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,
count_compact_event(COMPACTSTALL);
-#ifdef CONFIG_CMA
- if (allocflags_to_migratetype(gfp_mask) == MIGRATE_MOVABLE)
- alloc_flags |= ALLOC_CMA;
-#endif
/* Compact each zone in the list */
for_each_zone_zonelist_nodemask(zone, z, zonelist, high_zoneidx,
nodemask) {
@@ -1139,9 +1140,8 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,
contended, page);
rc = max(status, rc);
- /* If a normal allocation would succeed, stop compacting */
- if (zone_watermark_ok(zone, order, low_wmark_pages(zone), 0,
- alloc_flags))
+ /* If a page was captured, stop compacting */
+ if (*page)
break;
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 4ba5e37..9d20c13 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2180,10 +2180,8 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
current->flags &= ~PF_MEMALLOC;
/* If compaction captured a page, prep and use it */
- if (page) {
- prep_new_page(page, order, gfp_mask);
+ if (page && !prep_new_page(page, order, gfp_mask))
goto got_page;
- }
if (*did_some_progress != COMPACT_SKIPPED) {
/* Page migration frees to the PCP lists but we want merging */
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-01-08 22:43 UTC|newest]
Thread overview: 53+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-28 1:45 ppoll() stuck on POLLIN while TCP peer is sending Eric Wong
2012-12-28 7:06 ` Eric Wong
2012-12-29 11:34 ` Eric Wong
2012-12-31 13:21 ` [PATCH] poll: prevent missed events if _qproc is NULL Eric Wong
2012-12-31 23:24 ` Eric Wong
2013-01-01 16:58 ` Junchang(Jason) Wang
2013-01-01 18:42 ` Eric Dumazet
2013-01-01 21:00 ` Eric Wong
2013-01-01 21:17 ` Eric Wong
2013-01-01 22:53 ` Linus Torvalds
2013-01-01 23:21 ` Junchang(Jason) Wang
2013-01-01 23:56 ` [PATCH] epoll: prevent missed events on EPOLL_CTL_MOD Eric Wong
2013-01-02 17:45 ` Eric Dumazet
2013-01-02 18:40 ` Eric Wong
2013-01-02 19:03 ` Eric Dumazet
2013-01-02 19:32 ` Eric Wong
2013-01-02 22:08 ` Eric Dumazet
2013-01-02 21:16 ` Eric Wong
2013-01-02 20:08 ` ppoll() stuck on POLLIN while TCP peer is sending Eric Wong
2013-01-02 20:47 ` Eric Wong
2013-01-03 13:41 ` Eric Dumazet
2013-01-03 18:32 ` Eric Wong
2013-01-03 23:45 ` Eric Wong
2013-01-04 0:26 ` Eric Wong
2013-01-04 3:52 ` Eric Wong
2013-01-04 16:01 ` Mel Gorman
2013-01-04 17:15 ` Eric Dumazet
2013-01-04 17:59 ` Eric Wong
2013-01-05 1:07 ` Eric Wong
2013-01-06 12:07 ` Eric Wong
2013-01-07 12:25 ` Mel Gorman
2013-01-07 22:38 ` Eric Dumazet
2013-01-08 0:21 ` Eric Wong
2013-01-07 22:38 ` Eric Wong
2013-01-08 20:14 ` Eric Wong
2013-01-08 22:43 ` Mel Gorman [this message]
2013-01-08 23:23 ` Eric Wong
2013-01-09 2:14 ` Eric Dumazet
2013-01-09 2:32 ` Eric Dumazet
2013-01-09 2:54 ` Eric Dumazet
2013-01-09 3:55 ` Eric Wong
2013-01-09 8:42 ` Eric Wong
2013-01-09 8:51 ` Eric Wong
2013-01-09 13:42 ` Mel Gorman
2013-01-09 13:37 ` Mel Gorman
2013-01-09 13:50 ` Mel Gorman
2013-01-10 9:25 ` Eric Wong
2013-01-10 19:42 ` Mel Gorman
2013-01-10 20:03 ` Eric Wong
2013-01-10 20:58 ` Eric Dumazet
2013-01-11 0:51 ` Eric Wong
2013-01-11 9:30 ` Mel Gorman
2013-01-09 21:29 ` Eric Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130108224313.GA13304@suse.de \
--to=mgorman@suse.de \
--cc=akpm@linux-foundation.org \
--cc=eric.dumazet@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=minchan@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=normalperson@yhbt.net \
--cc=riel@redhat.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).