From: Mel Gorman <mgorman@suse.de>
To: Eric Wong <normalperson@yhbt.net>
Cc: linux-mm@kvack.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Rik van Riel <riel@redhat.com>,
	Minchan Kim <minchan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: ppoll() stuck on POLLIN while TCP peer is sending
Date: Fri, 4 Jan 2013 16:01:48 +0000
Message-ID: <20130104160148.GB3885@suse.de>
In-Reply-To: <20130102200848.GA4500@dcvr.yhbt.net>

On Wed, Jan 02, 2013 at 08:08:48PM +0000, Eric Wong wrote:
> (changing Cc:)
> 
> Eric Wong <normalperson@yhbt.net> wrote:
> > I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a
> > local TCP socket.  The isolated code below can reproduce the issue
> > after many minutes (<1 hour).  It might be easier to reproduce on
> > a busy system while disk I/O is happening.
> 
> s/might be/is/
> 
> Strangely, I've bisected this seemingly networking-related issue down to
> the following commit:
> 
>   commit 1fb3f8ca0e9222535a39b884cb67a34628411b9f
>   Author: Mel Gorman <mgorman@suse.de>
>   Date:   Mon Oct 8 16:29:12 2012 -0700
> 
>       mm: compaction: capture a suitable high-order page immediately when it is made available
> 
> That commit doesn't revert cleanly on v3.7.1, and I don't feel
> comfortable touching that code myself.
> 

That patch introduced an accounting bug that was corrected by ef6c5be6
(fix incorrect NR_FREE_PAGES accounting (appears like memory leak)). In
some cases that could look like a hang and potentially confuse a bisection.

That said, I see you report that 3.7.1 and 3.8-rc2, which include that
fix, are affected and the finger is still pointed at compaction, so
something is wrong.

> Instead, I disabled THP+compaction under v3.7.1 and I've been unable to
> reproduce the issue without THP+compaction.
> 

Implying that it's stuck in compaction somewhere. It could be the case
that compaction alters timing enough to trigger another bug. You say it
tests differently depending on whether TCP or unix sockets are used
which might indicate multiple problems. However, let's try and see if
compaction is the primary problem or not.

> As I mention in http://mid.gmane.org/20121229113434.GA13336@dcvr.yhbt.net
> I run my below test (`toosleepy') with heavy network and disk activity
> for a long time before hitting this.
> 

Using a 3.7.1 or 3.8-rc2 kernel, can you reproduce the problem and then
answer the following questions please?

1. What are the contents of /proc/vmstat at the time it is stuck?

2. What are the contents of /proc/PID/stack for every toosleepy
   process when they are stuck?

3. Can you do a sysrq+m and post the resulting dmesg?
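
Items 1 and 2 could be gathered in one go with something along these
lines. This is only a sketch under assumptions: the helper names
(find_pids, read_or_error, collect) are made up for illustration, the
processes are matched on /proc/PID/comm being exactly "toosleepy", and
reading /proc/PID/stack normally needs root and a kernel built with
stack trace support. Threads other than the main one have their own
stack files under /proc/PID/task/, which this does not walk.

```python
import os


def find_pids(comm, proc="/proc"):
    """Return the sorted PIDs whose /proc/PID/comm equals `comm`."""
    pids = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc, entry, "comm")) as f:
                if f.read().strip() == comm:
                    pids.append(int(entry))
        except OSError:
            continue  # process exited while we were scanning
    return sorted(pids)


def read_or_error(path):
    """Return the file contents, or a marker line if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        return "<unreadable: %s>\n" % e


def collect(comm="toosleepy", proc="/proc"):
    """Build one text report: vmstat first, then each matching stack."""
    vmstat = os.path.join(proc, "vmstat")
    report = ["==== %s ====\n" % vmstat, read_or_error(vmstat)]
    for pid in find_pids(comm, proc):
        path = os.path.join(proc, str(pid), "stack")
        report.append("==== %s ====\n" % path)
        report.append(read_or_error(path))
    return "".join(report)
```

Running print(collect()) as root while the test is stuck would produce
a single report that can be pasted into a reply. The `proc` parameter
exists only so the helpers can be pointed at a copy of the files.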

What I'm looking for is a throttling bug (if pgscan_direct_throttle is
elevated), an isolated page accounting bug (nr_isolated_* is elevated
and a process is stuck in congestion_wait in a too_many_isolated() loop)
or a free page accounting bug (big difference between nr_free_pages and
buddy list figures).
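
The counters named above can be pulled out of the collected files with
a few lines of Python. This is a rough sketch, not a diagnostic tool:
the function names are invented for illustration, and it assumes
/proc/buddyinfo is an acceptable source for the buddy list figures
(each column there is the number of free blocks of that order, so a
block at order n contributes 2^n pages). Some drift between the two
free page figures is normal, so the sketch only reports the numbers and
leaves the judgement of what counts as a big difference to the reader.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat style "name value" lines into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def buddy_free_pages(text):
    """Sum the free pages described by /proc/buddyinfo style text."""
    total = 0
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: Node 0, zone DMA <order-0 count> <order-1 count> ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        for order, count in enumerate(fields[4:]):
            total += int(count) << order  # count blocks of 2**order pages
    return total


def summarize(vmstat_text, buddyinfo_text):
    """Return the counters of interest; missing ones are reported as None."""
    v = parse_vmstat(vmstat_text)
    return {
        "pgscan_direct_throttle": v.get("pgscan_direct_throttle"),
        "nr_isolated_anon": v.get("nr_isolated_anon"),
        "nr_isolated_file": v.get("nr_isolated_file"),
        "nr_free_pages": v.get("nr_free_pages"),
        "buddy_free_pages": buddy_free_pages(buddyinfo_text),
    }
```

Feeding it the contents of /proc/vmstat and /proc/buddyinfo taken at
the time of the hang, and again on an idle system, gives two small
dicts that are easy to compare by eye.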

I'll try reproducing this early next week if none of that shows an
obvious candidate.

Thanks.

-- 
Mel Gorman
SUSE Labs
