* Is there some bug in ext3 in 2.4.25?
From: Daniel Fenert @ 2004-03-04 6:50 UTC
To: linux-kernel

Message from syslogd@lazy at Thu Mar 4 08:31:58 2004 ...
lazy kernel: Assertion failure in __journal_drop_transaction() at
checkpoint.c:587: "transaction->t_ilist == NULL"

Networking still works. I've tried to log in, but no luck.
I've got one ssh console open and tried to reboot, but nothing happened;
it looks like it lost connection with hda :(
Where should I look for the reason?
The machine is far away, and this is the second or third time it has hung
mysteriously; the only difference is that this time I've got some console
output.

-- 
Daniel Fenert                --==> daniel@fenert.net <==--
==-P o w e r e d--b y--S l a c k w a r e-=-ICQ #37739641-==
=======- http://daniel.fenert.net/ -=======< +48604628083 >

* Re: Is there some bug in ext3 in 2.4.25?
From: Daniel Fenert @ 2004-03-04 7:03 UTC
To: linux-kernel

On Thu, Mar 04, 2004 at 07:50:38AM +0100, Daniel Fenert wrote:
>Message from syslogd@lazy at Thu Mar 4 08:31:58 2004 ...
>lazy kernel: Assertion failure in __journal_drop_transaction() at
>checkpoint.c:587: "transaction->t_ilist == NULL"

One more thing: it happened when /var got full.

-- 
Daniel Fenert                --==> daniel@fenert.net <==--
==-P o w e r e d--b y--S l a c k w a r e-=-ICQ #37739641-==
Absurdity: a belief that contradicts your own views - [Ambrose Bierce]
=======- http://daniel.fenert.net/ -=======< +48604628083 >

* Re: Is there some bug in ext3 in 2.4.25?
From: Marcelo Tosatti @ 2004-03-05 14:06 UTC
To: Daniel Fenert
Cc: linux-kernel, sct, Michelle Konzack

Hi,

This sounds like memory corruption (which could be caused by a misbehaving
driver or by flaky hardware) because transaction->t_ilist is not used at
all by the kernel code. Did this box run stably with other kernels?

I found a similar report from Michelle (CCed), which can be found at:
http://marc.theaimsgroup.com/?l=linux-kernel&m=107529754608448&w=2

Searching a bit more, I found another message from Michelle with the
subject "[SOLVED] Kernel-Bug (at checkpoint.c 587)":
http://lists.debian.org/debian-user-german/2004/debian-user-german-200401/msg04404.html

Unfortunately that message is in German, which I can't understand.
Michelle, can you clarify it for me?

Stephen, Andrew, any idea how transaction->t_ilist can become non-NULL?

On Thu, 4 Mar 2004, Daniel Fenert wrote:

> Message from syslogd@lazy at Thu Mar 4 08:31:58 2004 ...
> lazy kernel: Assertion failure in __journal_drop_transaction() at
> checkpoint.c:587: "transaction->t_ilist == NULL"
>
> Networking still works, I've tried to login, but no luck here.
> I've got one ssh console opened, and tried to reboot, but nothing happend, it
> looks like it lost connection with hda :(
> Where should I look for reason?
> Machine as faaar away, and it's second or third time it hangs mysteriously,
> the only difference is that this time I've got some console output.
>

>From daniel@fenert.net Fri Mar 5 10:48:26 2004
Date: Thu, 4 Mar 2004 08:03:29 +0100
From: Daniel Fenert <daniel@fenert.net>
To: linux-kernel@vger.kernel.org
Subject: Re: Is there some bug in ext3 in 2.4.25?

>Message from syslogd@lazy at Thu Mar 4 08:31:58 2004 ...
>lazy kernel: Assertion failure in __journal_drop_transaction() at
>checkpoint.c:587: "transaction->t_ilist == NULL"

One more thing - it has happened when /var got full.

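For anyone collecting reports like the ones Marcelo links, a minimal sketch of pulling the fields out of the assertion line may help when comparing them. It assumes the message has the shape quoted in this thread; the helper name parse_assertion and the field names func, file, line and expr are invented for this sketch and are not part of any kernel tool.

```python
import re

# Pattern for the JBD assertion line quoted in this thread. The group names
# (func, file, line, expr) are labels chosen for this sketch.
ASSERT_RE = re.compile(
    r'Assertion failure in (?P<func>\w+)\(\) at\s+'
    r'(?P<file>[\w./-]+):(?P<line>\d+):\s+"(?P<expr>[^"]+)"'
)


def parse_assertion(text):
    """Return a dict describing the first assertion failure in text, or None."""
    # Syslog and mail clients may wrap the message, so collapse whitespace first.
    flat = " ".join(text.split())
    match = ASSERT_RE.search(flat)
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])
    return info


sample = '''lazy kernel: Assertion failure in __journal_drop_transaction() at
checkpoint.c:587: "transaction->t_ilist == NULL"'''
report = parse_assertion(sample)
print(report)
```

Two reports that agree on func, file, line and expr are hitting the same check, which is how the Debian and Red Hat reports mentioned in this thread can be matched to this one.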
* Re: Is there some bug in ext3 in 2.4.25?
From: Michael Frank @ 2004-03-05 14:14 UTC
To: Marcelo Tosatti, Daniel Fenert
Cc: linux-kernel, sct, Michelle Konzack

On Fri, 5 Mar 2004 11:06:02 -0300 (BRT), Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:

> Hi,
>
> This sounds like memory corruption (which could be caused by a misbehaving
> driver or by flaky hardware) because transaction->t_ilist is not used at
> all by the kernel code. Did this box run stable with other kernels?
>
> I found a similar report from Michelle (CCed), which can be found at:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=107529754608448&w=2
>
> Searching a bit more, I found another message from Michelle with
> topic "[SOLVED] Kernel-Bug (at checkpoint.c 587)"
> http://lists.debian.org/debian-user-german/2004/debian-user-german-200401/msg04404.html
>
> Unfortunately the said message is in German, which I can't understand.
> Michelle, can you clarify it for me?

> Hallo Leute,
> Auch wenn ich von der kernel-list@vger.kernel.org keine Antwort
> erhalten habe, handelt es sich definitiv um einen echten Kernel-Bug
> in 2.4.22 der in 2.4.24 offensichtlich nicht mehr vorhanden ist.

Although I have not received a reply from LKML, it is definitely
a real kernel bug in 2.4.22 which has been fixed in 2.4.24.

> Ein weiterer Fehler trat mehrfach in 'exit.c' auf, der ebenfalls
> nach der Installation von Linux 2.4.24 verschwunden war.

A further bug occurring several times in 'exit.c' has also vanished
after installation of 2.4.24.

> Stephen, Andrew, any idea how can transaction->t_ilist become not NULL?

* Re: Is there some bug in ext3 in 2.4.25?
From: Stephen C. Tweedie @ 2004-03-05 14:26 UTC
To: Michael Frank
Cc: Marcelo Tosatti, Daniel Fenert, linux-kernel, Michelle Konzack, Stephen Tweedie

Hi,

On Fri, 2004-03-05 at 14:14, Michael Frank wrote:

> Although I have not received a reply from LKML, it is definitely
> a real kernel bug in 2.4.22 which has been fixed in 2.4.24.
>
> Ein weiterer Fehler trat mehrfach in 'exit.c' auf, der ebenfalls
> nach der Installation von Linux 2.4.24 verschwunden war.
>
> A further bug occurring several times in 'exit.c' has also vanished
> after installation of 2.4.24.

Sounds like bad memory. It's quite possible for a bad memory module to
show up a problem in one kernel but not in another, simply because
kernels store their active data in slightly different memory
locations from one release to another (or even from one compiler, or one
set of config options, to another).

I'd definitely be running memtest86 as the next step here.

Cheers,
 Stephen

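The point about different kernels keeping their data at different addresses can be shown with a toy model. This is only a sketch: the addresses, sizes and object names below are invented for illustration and do not describe any real kernel layout, and touched_by_bad_cell is a hypothetical helper.

```python
def touched_by_bad_cell(layout, bad_address):
    """Return the names of objects whose address range covers bad_address.

    layout maps an object name to (start_address, size_in_bytes).
    """
    return [
        name
        for name, (start, size) in layout.items()
        if start <= bad_address < start + size
    ]


# One faulty memory cell at a fixed physical address (hypothetical value).
BAD_CELL = 0x1F40

# Two builds place the same structure at different addresses (invented numbers).
kernel_a = {"transaction": (0x1F00, 0x80), "page_cache": (0x4000, 0x1000)}
kernel_b = {"transaction": (0x2400, 0x80), "free_memory": (0x1F00, 0x80)}

print(touched_by_bad_cell(kernel_a, BAD_CELL))
print(touched_by_bad_cell(kernel_b, BAD_CELL))
```

In the first layout the faulty cell sits inside the transaction structure; in the second it lands in memory that holds nothing important, so the same hardware fault stays invisible. Under these assumptions, a crash that disappears after a kernel upgrade does not by itself rule out bad RAM.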
* Re: Is there some bug in ext3 in 2.4.25?
From: Stephen C. Tweedie @ 2004-03-05 14:25 UTC
To: Marcelo Tosatti
Cc: Daniel Fenert, linux-kernel, Michelle Konzack, Stephen Tweedie

Hi,

On Fri, 2004-03-05 at 14:06, Marcelo Tosatti wrote:

> This sounds like memory corruption (which could be caused by a misbehaving
> driver or by flaky hardware) because transaction->t_ilist is not used at
> all by the kernel code. Did this box run stable with other kernels?

Sounds like bad memory to me. The only other report of this I've seen
was at

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=115935

and that machine didn't pass memtest86.

> Stephen, Andrew, any idea how can transaction->t_ilist become not NULL?

Bad hardware is about the only way I can think of. If it was a random
kernel memory scribble, you'd expect it to show up in other places too:
the transaction struct is a very short-lived struct, so you wouldn't
expect it to be the only place where slab corruption shows up.

Cheers,
 Stephen

* Re: Is there some bug in ext3 in 2.4.25?
From: Daniel Fenert @ 2004-03-08 13:44 UTC
To: Stephen C. Tweedie
Cc: linux-kernel

On Fri, Mar 05, 2004 at 02:25:13PM +0000, Stephen C. Tweedie wrote:
>> This sounds like memory corruption (which could be caused by a misbehaving
>> driver or by flaky hardware) because transaction->t_ilist is not used at
>> all by the kernel code. Did this box run stable with other kernels?
>
>Sounds like bad memory to me. The only other report of this I've seen
>was at
>
>https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=115935
>
>and that machine didn't pass memtest86.

I'll check this during the week. BIG thanks for the replies.
(The machine was stable for a few years, AFAIR 3 years.)

-- 
Daniel Fenert                --==> daniel@fenert.net <==--
==-P o w e r e d--b y--S l a c k w a r e-=-ICQ #37739641-==
Who does not love wine, women, and song, remains a fool his whole life long.
=======- http://daniel.fenert.net/ -=======< +48604628083 >

* Re: Is there some bug in ext3 in 2.4.25?
From: Daniel Fenert @ 2004-04-02 10:20 UTC
To: Stephen C. Tweedie
Cc: Marcelo Tosatti, linux-kernel, Michelle Konzack

Old thread, but I've managed to test the machine.

>> This sounds like memory corruption (which could be caused by a misbehaving
>> driver or by flaky hardware) because transaction->t_ilist is not used at
>> all by the kernel code. Did this box run stable with other kernels?
>
>Sounds like bad memory to me. The only other report of this I've seen
>was at
>
>https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=115935
>
>and that machine didn't pass memtest86.

It passed memtest86 for 6 or 7 hours. Any further hints?

-- 
Daniel Fenert                --==> daniel@fenert.net <==--
==-P o w e r e d--b y--S l a c k w a r e-=-ICQ #37739641-==
It is easiest to ask why, hardest to find the answer --J. Szczawiński
=======- http://daniel.fenert.net/ -=======< +48604628083 >

* Re: Is there some bug in ext3 in 2.4.25?
From: Stephen C. Tweedie @ 2004-04-02 10:37 UTC
To: Daniel Fenert
Cc: Marcelo Tosatti, linux-kernel, Michelle Konzack, Stephen Tweedie

Hi,

On Fri, 2004-04-02 at 11:20, Daniel Fenert wrote:

> >> This sounds like memory corruption (which could be caused by a misbehaving
> >> driver or by flaky hardware) because transaction->t_ilist is not used at
> >> all by the kernel code. Did this box run stable with other kernels?
> >
> >Sounds like bad memory to me. The only other report of this I've seen
> >was at
> >
> >https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=115935
> >
> >and that machine didn't pass memtest86.
>
> It passed memtest86, 6 or 7 hours, any further hints?

Well, 7 hours is often not enough for memtest86; I usually recommend 24
hours if there are signs of bad hardware.

But other than that, I can't think of anything ext3-related: ext3
simply never sets that field. If it's being set, something is
stomping on ext3's transaction struct. That _could_ be the kernel, but
it could be just about anything touching memory after it's freed; or it
could be bad hardware.

What modules are you using? Is there anything unusual in common between
your machine or its use and that in #115935?

Rebuilding the kernel to enable slab debugging may well be useful if
there's something stomping on transaction structs.

Cheers,
 Stephen

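To make the slab debugging suggestion more concrete, here is a minimal user-space sketch of the general idea behind poisoning freed objects: fill an object with a known pattern when it is freed and check the pattern when it is handed out again. This is a toy model, not the kernel's slab allocator; the class PoisoningPool, its methods and the fill byte are all invented for this sketch.

```python
POISON = 0x5A  # arbitrary fill byte chosen for this sketch


class PoisoningPool:
    """Toy fixed-size object pool that poisons objects when they are freed."""

    def __init__(self, object_size, count):
        self.object_size = object_size
        # All memory starts out free, so it starts out poisoned.
        self.memory = bytearray([POISON]) * (object_size * count)
        self.free_slots = list(range(count))

    def _span(self, slot):
        start = slot * self.object_size
        return start, start + self.object_size

    def alloc(self):
        slot = self.free_slots.pop(0)
        start, end = self._span(slot)
        # A free object must still hold the poison pattern; anything else
        # means something wrote to it while it was free.
        if any(byte != POISON for byte in self.memory[start:end]):
            raise RuntimeError(f"slot {slot} was modified while free")
        self.memory[start:end] = bytes(self.object_size)  # hand out zeroed
        return slot

    def free(self, slot):
        start, end = self._span(slot)
        self.memory[start:end] = bytes([POISON]) * self.object_size
        self.free_slots.append(slot)


pool = PoisoningPool(object_size=16, count=1)
slot = pool.alloc()
pool.free(slot)

# Simulate a stale write through a reference kept after the free.
start, _ = pool._span(slot)
pool.memory[start + 4] = 0x01

try:
    pool.alloc()
    detected = False
except RuntimeError:
    detected = True
print(detected)
```

The check catches the stale write at the next allocation rather than later, when an unrelated assertion trips over the damaged field. That is the kind of earlier, more localized report a debugging allocator is meant to provide, which is why it can help narrow down what is writing to the transaction structs.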