From: Mel Gorman <mel@csn.ul.ie>
To: Chris Mason <chris.mason@oracle.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Jens Axboe <jaxboe@fusionio.com>, linux-mm <linux-mm@kvack.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Shaohua Li <shaohua.li@intel.com>
Subject: Re: hunting an IO hang
Date: Mon, 17 Jan 2011 23:03:58 +0000
Message-ID: <20110117230358.GE27152@csn.ul.ie>
In-Reply-To: <1295298806-sup-2802@think>
On Mon, Jan 17, 2011 at 04:23:56PM -0500, Chris Mason wrote:
> Excerpts from Linus Torvalds's message of 2011-01-17 13:24:55 -0500:
> > On Mon, Jan 17, 2011 at 9:40 AM, Chris Mason <chris.mason@oracle.com> wrote:
> > >> >
> > >> > I've reverted 744ed1442757767ffede5008bb13e0805085902e, and
> > >> > d8505dee1a87b8d41b9c4ee1325cd72258226fbc and the run has lasted longer
> > >> > than any runs in the past.
> > >> >
> > >>
> > >> Confirmed that reverting these patches makes the problem unreproducible
> > >> for the many_dd's + fsmark for at least an hour here.
> > >
> > > After 2+ hours I'm still running with those two commits gone. I'm
> > > confident they are the cause of the crashes. I also haven't triggered
> > > the cfq stalls without them.
> >
> > Ok, so the question is how to proceed from here.
> >
> > I can easily revert them, and since I was planning on doing -rc1
> > tonight, I probably will. But I promised Chris to delay until tomorrow
> > if he needed time to chase this down, and while it's now apparently
> > chased down, I'll certainly also be open to delaying until tomorrow if
> > somebody has a patch to fix it.
> >
> > So right now my plan is:
> >  - I will revert those two later today and then release -rc1 in the evening
> > UNLESS
> >  - somebody posts a patch for the problem in the next few hours and
> >    Chris/others are willing to give it a good test overnight (or whatever
> >    people feel is "sufficient" based on how easily they can trigger the
> >    issue), in which case I'd do -rc1 tomorrow (either with the reverts or
> >    the patch, depending on how testing works out)
>
> If a patch does come in, I'm happy to test it. Mel had a test that
> triggered within 1-2 minutes, mine took 30 or so, which means I'd want a
> 2 hour run to convince myself it was really fixed. But, I'll give Mel's
> fs_mark + dd workload a try on the buggy kernel.
>
I spent a while looking for a simple patch, but this is not trivially
fixable. __activate_page() is called in too many different situations for
me to be sure it does the right thing in all of them. I also couldn't
convince myself that the accounting is correct in all cases. Batching
updates from mark_page_accessed() in particular is a good idea, but the
patch needs a do-over.
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab