From: Frederic Weisbecker <fweisbec@gmail.com>
To: Bastien ROUCARIES <roucaries.bastien@gmail.com>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
akpm@linux-foundation.org
Subject: Re: Reiserfs deadlock in 2.6.36
Date: Tue, 8 Mar 2011 15:05:52 +0100
Message-ID: <20110308140549.GA1837@nowhere>
In-Reply-To: <AANLkTikMNw5yMxcUV3a-temcOdWa5gySZ+vTtfuBEAiD@mail.gmail.com>
On Tue, Mar 08, 2011 at 09:41:15AM +0100, Bastien ROUCARIES wrote:
> On Mon, Mar 7, 2011 at 8:00 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > Hi Bastien,
>
> Cc: Ingo Molnar, because he works a lot on soft lockups and could have
> an idea for debugging this.
> Cc: Andrew Morton, who also tracks "File/memory corruption in 2.6.37".
About the corruption, I'm not sure it's the same problem. It's hard to
tell yet.
> >> It took me more than two days of testing to reproduce this bug with tracing enabled. My filesystem was quite slow, and this bug seems
> >> to be timing related.
> >>
> >> One pattern that triggers this bug is git. Doing a lot of git work on my desktop crashes my machine.
> >>
> >> Moreover, trying to reproduce this bug led to data loss. I have rebuilt my / partition twice using --rebuild-tree, and restored
> >> my home partition three times from backups.
> >>
> >> My log is here.
> >>
> >> Do you need more information?
> >
> > Yeah, do you have CONFIG_REISERFS_CHECK enabled? I would just
> > like to make sure we are not missing this important source of
> > information.
>
> Yes I have it
Ok.
> > I'm puzzled because, given the traces, your opening and closing of the journal are
> > well balanced.
> >
> > You have a writer queued and stuck, but I see no trace of it in the trace stream.
> > I only see well-balanced journal operations, including the journal closing that would have
> > woken your queued writer.
> >
> > A theory could be that your queued writer was waiting for someone to close the journal,
> > which finally happened, but only several minutes later, after many
> > journal openings/closings had overwritten the old trace containing the queueing of
> > the stuck writer.
>
> Running 'while true; do sync && sleep 1; done' helps a lot.
Which kernel are you running by the way?
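For reference, the workaround quoted above amounts to forcing a sync about
once per second. A minimal sketch of it follows, assuming a POSIX shell; the
helper name periodic_sync and its iteration/interval arguments are
hypothetical and only there so the loop can be bounded while testing.

```shell
# Sketch of the periodic-sync workaround (hypothetical helper, not from the report).
# Usage: periodic_sync [iterations] [interval_seconds]
# iterations=0 (the default) loops forever, like the quoted one-liner.
periodic_sync() {
    iterations="${1:-0}"
    interval="${2:-1}"
    count=0
    while [ "$iterations" -eq 0 ] || [ "$count" -lt "$iterations" ]; do
        # Unlike the one-liner, give up as soon as sync itself fails.
        sync || return 1
        count=$((count + 1))
        sleep "$interval"
    done
    # Report how many sync calls were issued.
    echo "$count"
}
```

Calling "periodic_sync" with no arguments behaves like the original loop, and
"uname -r" prints the running kernel release.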
> >
> > I don't know what to do yet. I need to think more about it.
> >
>
> Could we do what I suggested at first? Use lockdep to track
> journal open/close using a fake lock?
I don't think that's a suitable test. Lockdep is useful for detecting lock inversion
scenarios, but it's not very useful for detecting a lock that takes too long
to be released. For that we have the hung task detector, whose report we already
have.
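As a side note, on kernels built with CONFIG_DETECT_HUNG_TASK the detector's
timeout is exposed in /proc/sys/kernel/hung_task_timeout_secs (a value of 0
disables the check). A small sketch to read it follows; the helper name
hung_task_timeout and its optional path argument are assumptions made for
illustration.

```shell
# Print the hung task timeout in seconds, or "unavailable" when the file
# is absent (for instance a kernel built without CONFIG_DETECT_HUNG_TASK).
# The optional argument overrides the path; it exists only for testing.
hung_task_timeout() {
    file="${1:-/proc/sys/kernel/hung_task_timeout_secs}"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unavailable"
    fi
}
```

Running "hung_task_timeout" before a reproduction attempt shows how long a
task must stay blocked before the detector reports it.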
> BTW, it seems that someone experienced this condition on ext3. I could
> do more testing if you want, and I will run xfstests to see
> whether I can reproduce it more quickly.
I'm not sure the file corruption and the deadlock are linked. But
maybe xfstests can provoke the deadlock (or the file corruption)
more quickly. It's pretty good at stressing filesystems.