* Re: Nikita 19891
2007-07-11 4:46 Nikita 19891 Ingo Bormuth
@ 2007-07-11 4:35 ` Jake Maciejewski
2007-07-11 19:48 ` Edward Shishkin
0 siblings, 1 reply; 6+ messages in thread
From: Jake Maciejewski @ 2007-07-11 4:35 UTC (permalink / raw)
To: Ingo Bormuth; +Cc: reiserfs-devel
I've hit the same panic looping kernel builds (while true ; do make
mrproper ; make allmodconfig ; make -j4 ; done) on 2.6.21.1 with the
Namesys patch and reiser4 debug enabled. I've seen it on my amd64
desktop and x86 laptop.
Another one I've seen is:
reiser4 panicked cowardly: reiser4[fixdep(16043)]: sibling_list_remove (fs/reiser4/tree_walk.c:814)[zam-32245]
In both cases the fsck didn't find anything, as you observed.
On Wed, 2007-07-11 at 06:46 +0200, Ingo Bormuth wrote:
> Hmm, whenever I try to build busybox (1.4.2) I get nikita-19891 panics:
>
> [...]
> cc console_tools/clear.o
> reiser4 panicked cowardly: reiser4[cc1(13066)]: save_file_hint (fs/reiser4/plugin/file.c:705) [nikity-1991]:
> kernel panic - not syncing: reiser4[cc1(13066)]: save_file_hint (fs/reiser4/plugin/file.c:705) [nikity-1991]:
>
> Running fsck.reiser4 before and after the panic doesn't show any complaints.
> The partition is heavily used. I'm not aware of any other problem.
>
> Vanilla-2.6.21.6 (kernel.org) with reiser4-2.6.21-patch (namesys.com).
>
> Not that I understood the code, but why is it an assertion at all?
> Couldn't one just use an empty hint if the current one is invalid?
>
--
Jake Maciejewski <maciejej@msoe.edu>
* Nikita 19891
@ 2007-07-11 4:46 Ingo Bormuth
2007-07-11 4:35 ` Jake Maciejewski
0 siblings, 1 reply; 6+ messages in thread
From: Ingo Bormuth @ 2007-07-11 4:46 UTC (permalink / raw)
To: reiserfs-devel
Hmm, whenever I try to build busybox (1.4.2) I get nikita-19891 panics:
[...]
cc console_tools/clear.o
reiser4 panicked cowardly: reiser4[cc1(13066)]: save_file_hint (fs/reiser4/plugin/file.c:705) [nikity-1991]:
kernel panic - not syncing: reiser4[cc1(13066)]: save_file_hint (fs/reiser4/plugin/file.c:705) [nikity-1991]:
Running fsck.reiser4 before and after the panic doesn't show any complaints.
The partition is heavily used. I'm not aware of any other problem.
Vanilla-2.6.21.6 (kernel.org) with reiser4-2.6.21-patch (namesys.com).
Not that I understood the code, but why is it an assertion at all?
Couldn't one just use an empty hint if the current one is invalid?
--
Ingo Bormuth, voicebox & fax: +49-(0)-12125-10226517
public key 86326EC9, http://ibormuth.efil.de/contact
* Re: Nikita 19891
2007-07-11 4:35 ` Jake Maciejewski
@ 2007-07-11 19:48 ` Edward Shishkin
2007-07-13 5:12 ` Jake Maciejewski
0 siblings, 1 reply; 6+ messages in thread
From: Edward Shishkin @ 2007-07-11 19:48 UTC (permalink / raw)
To: Jake Maciejewski; +Cc: Ingo Bormuth, reiserfs-devel
Jake Maciejewski wrote:
>I've hit the same panic looping kernel builds (while true ; do make
>mrproper ; make allmodconfig ; make -j4 ; done) on 2.6.21.1 with the
>Namesys patch and reiser4 debug enabled. I've seen it on my amd64
>desktop and x86 laptop.
>
>Another one I've seen is:
> reiser4 panicked cowardly: reiser4[fixdep(16043)]: sibling_list_remove (fs/reiser4/tree_walk.c:814)[zam-32245]
>
>In both cases the fsck didn't find anything, as you observed.
>
>On Wed, 2007-07-11 at 06:46 +0200, Ingo Bormuth wrote:
>
>
>>Hmm, whenever I try to build busybox (1.4.2) I get nikita-19891 panics:
>>
>>[...]
>>cc console_tools/clear.o
>>reiser4 panicked cowardly: reiser4[cc1(13066)]: save_file_hint (fs/reiser4/plugin/file.c:705) [nikity-1991]:
>>kernel panic - not syncing: reiser4[cc1(13066)]: save_file_hint (fs/reiser4/plugin/file.c:705) [nikity-1991]:
>>
>>
Somebody missed set_file_hint(), which synchronizes the coords.
Unfortunately I cannot reproduce it. Would you please (if possible)
capture the stack trace with the attached patch?
>>Running fsck.reiser4 before and after the panic doesn't show any complaints.
>>The partition is heavily used. I'm not aware of any other problem.
>>
>>Vanilla-2.6.21.6 (kernel.org) with reiser4-2.6.21-patch (namesys.com).
>>
>>Not that I understood the code, but why is it an assertion at all?
>>Couldn't one just use an empty hint if the current one is invalid?
>>
>>
Sure, it is possible to not use it at all. But if the current one is valid,
it would be nice to use it to avoid a tree traversal that may have to
wait for locks, etc.
Thanks,
Edward.
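
A minimal user-space sketch of the trade-off described above: use a cached
hint when it is still valid, and fall back to a full lookup (an "empty
hint") when it is not. This is not reiser4 code. The names toy_node,
toy_hint, toy_full_lookup and toy_lookup are hypothetical, a sorted array
stands in for the tree, and a version counter stands in for the seal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node: strictly increasing keys plus a
 * version counter that the owner bumps on every modification. */
struct toy_node {
	const int *keys;
	size_t nr_keys;
	unsigned version;
};

/* Hypothetical hint: where the last lookup ended, and the node version
 * it was recorded against (the role a seal plays). */
struct toy_hint {
	int valid;
	unsigned version;
	size_t pos;
};

/* Slow path: binary search, standing in for a full tree traversal.
 * Returns the first position whose key is >= key. */
static size_t toy_full_lookup(const struct toy_node *node, int key)
{
	size_t lo = 0, hi = node->nr_keys;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (node->keys[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Look up key. The hint is trusted only if it is marked valid, was
 * recorded against the current node version, and still points at the
 * requested key. Otherwise it is ignored and refreshed from the slow
 * path instead of being treated as a fatal error. */
static size_t toy_lookup(const struct toy_node *node, struct toy_hint *hint,
			 int key, int *used_hint)
{
	size_t pos;

	assert(node != NULL && hint != NULL && used_hint != NULL);

	if (hint->valid && hint->version == node->version &&
	    hint->pos < node->nr_keys && node->keys[hint->pos] == key) {
		*used_hint = 1;
		return hint->pos;
	}

	*used_hint = 0;
	pos = toy_full_lookup(node, key);
	hint->valid = 1;
	hint->version = node->version;
	hint->pos = pos;
	return pos;
}
```

In this sketch, repeating a lookup for the same key takes the fast path
until the node version changes. After that the stale hint is dropped
silently and the slow path runs once to rebuild it.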
[-- Attachment #2: reiser4-tmp-fix.patch --]
[-- Type: text/x-patch, Size: 568 bytes --]
--- linux-2.6.22-rc6-mm1/fs/reiser4/plugin/file/file.c.orig
+++ linux-2.6.22-rc6-mm1/fs/reiser4/plugin/file/file.c
@@ -707,8 +707,12 @@
 		return;
 	fsdata = reiser4_get_file_fsdata(file);
 	assert("vs-965", !IS_ERR(fsdata));
-	assert("nikita-19891",
-	       coords_equal(&hint->seal.coord1, &hint->ext_coord.coord));
+#if REISER4_DEBUG
+	if (!coords_equal(&hint->seal.coord1, &hint->ext_coord.coord)) {
+		dump_stack();
+		for (; 1 ;) {;}
+	}
+#endif
 	assert("vs-30", hint->lh.owner == NULL);
 	spin_lock_inode(file->f_dentry->d_inode);
 	fsdata->reg.hint = *hint;
* Re: Nikita 19891
2007-07-11 19:48 ` Edward Shishkin
@ 2007-07-13 5:12 ` Jake Maciejewski
2007-07-13 15:34 ` Edward Shishkin
0 siblings, 1 reply; 6+ messages in thread
From: Jake Maciejewski @ 2007-07-13 5:12 UTC (permalink / raw)
To: Edward Shishkin; +Cc: Ingo Bormuth, reiserfs-devel
On Wed, 2007-07-11 at 23:48 +0400, Edward Shishkin wrote:
> Somebody missed set_file_hint(), which synchronizes the coords.
> Unfortunately I can not reproduce it. Would you please (if possible)
> catch the stack with the attached patch?
[<ffffffff88186b5e>] :reiser4:save_file_hint+0xee/0x3c0
[<ffffffff88189c60>] :reiser4:read_unix_file+0x940/0xa10
[<ffffffff80276bbb>] vfs_read+0xdb/0x180
[<ffffffff80277083>] sys_read+0x53/0x90
[<ffffffff8020993e>] system_call+0x7e/0x83
As for reproducing it, I think I should mention that:
1. I'm using distcc to speed things up. Without offloading the compiling
work, my laptop has lasted ~3.5hrs before a panic. My desktop with
distcc configured usually only lasts a few minutes.
2. My local storage is encrypted through dm-crypt, but I've also tried
over open-iscsi and got the same results.
--
Jake Maciejewski <maciejej@msoe.edu>
* Re: Nikita 19891
2007-07-13 5:12 ` Jake Maciejewski
@ 2007-07-13 15:34 ` Edward Shishkin
2007-07-23 23:09 ` Jake Maciejewski
0 siblings, 1 reply; 6+ messages in thread
From: Edward Shishkin @ 2007-07-13 15:34 UTC (permalink / raw)
To: Jake Maciejewski; +Cc: Ingo Bormuth, reiserfs-devel, Vladimir V. Saveliev
Jake Maciejewski wrote:
>On Wed, 2007-07-11 at 23:48 +0400, Edward Shishkin wrote:
>>Somebody missed set_file_hint(), which synchronizes the coords.
>>
>>
err, sorry, its name is reiser4_set_hint
>>Unfortunately I can not reproduce it. Would you please (if possible)
>>catch the stack with the attached patch?
>>
>>
>
>[<ffffffff88186b5e>] :reiser4:save_file_hint+0xee/0x3c0
>[<ffffffff88189c60>] :reiser4:read_unix_file+0x940/0xa10
>[<ffffffff80276bbb>] vfs_read+0xdb/0x180
>[<ffffffff80277083>] sys_read+0x53/0x90
>[<ffffffff8020993e>] system_call+0x7e/0x83
>
>
Thanks!
Indeed, the coords are not synchronized when reading tails. However,
it is not a fatal bug: we are victims of a brain-damaged and unreadable
hint interface.
A possible fix is attached. Would you please test it?
Also don't forget to apply this patch:
http://lkml.org/lkml/diff/2007/7/11/396/1
as it may also be related to the problem.
Edward.
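
A minimal user-space sketch of the invariant that the nikita-19891 check
enforces: the coord recorded with the hint must match the coord the read
path has been moving. It is loosely modelled on the context lines of the
attached patch and is not reiser4 code. All demo_* names are hypothetical,
and the save step returns an error where the kernel assertion would panic.

```c
#include <assert.h>
#include <stddef.h>

enum demo_between { DEMO_AT_UNIT, DEMO_AFTER_UNIT };

/* Hypothetical, simplified coord: a position inside a node. */
struct demo_coord {
	int item_pos;
	int unit_pos;
	enum demo_between between;
};

/* Hypothetical hint holding two copies of the coord that must agree. */
struct demo_hint {
	struct demo_coord sealed;	/* copy recorded when the hint was set */
	struct demo_coord working;	/* copy the read path keeps adjusting */
};

static int demo_coords_equal(const struct demo_coord *a,
			     const struct demo_coord *b)
{
	return a->item_pos == b->item_pos &&
	       a->unit_pos == b->unit_pos &&
	       a->between == b->between;
}

/* Resynchronize the recorded copy with the working copy. */
static void demo_set_hint(struct demo_hint *hint)
{
	hint->sealed = hint->working;
}

/* A read step that steps back one unit. With resync == 0 it forgets to
 * update the recorded copy, which is the kind of omission under
 * discussion; with resync != 0 it ends by resynchronizing the hint. */
static void demo_read_step(struct demo_hint *hint, int resync)
{
	assert(hint != NULL && hint->working.unit_pos > 0);

	hint->working.unit_pos--;
	hint->working.between = DEMO_AFTER_UNIT;
	if (resync)
		demo_set_hint(hint);
}

/* Save the hint only if both copies agree. Returns 0 on success and -1
 * if the copies have diverged. */
static int demo_save_hint(const struct demo_hint *hint,
			  struct demo_hint *saved)
{
	if (!demo_coords_equal(&hint->sealed, &hint->working))
		return -1;
	*saved = *hint;
	return 0;
}
```

Calling demo_read_step() without the resync and then demo_save_hint()
fails the consistency check. Calling it with the resync lets the save go
through.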
[-- Attachment #2: reiser4-fix-read_tail.patch --]
[-- Type: text/x-patch, Size: 348 bytes --]
Update hint when reading tails
Signed-off-by: Edward Shishkin <edward@namesys.com>
--- linux-2.6.22-rc6-mm1/fs/reiser4/plugin/item/tail.c.orig
+++ linux-2.6.22-rc6-mm1/fs/reiser4/plugin/item/tail.c
@@ -758,7 +758,7 @@
 		coord->unit_pos--;
 		coord->between = AFTER_UNIT;
 	}
-
+	reiser4_set_hint(hint, &f->key, ZNODE_READ_LOCK);
 	return 0;
 }
* Re: Nikita 19891
2007-07-13 15:34 ` Edward Shishkin
@ 2007-07-23 23:09 ` Jake Maciejewski
0 siblings, 0 replies; 6+ messages in thread
From: Jake Maciejewski @ 2007-07-23 23:09 UTC (permalink / raw)
To: Edward Shishkin; +Cc: Ingo Bormuth, reiserfs-devel, Vladimir V. Saveliev
On Fri, 2007-07-13 at 19:34 +0400, Edward Shishkin wrote:
> Jake Maciejewski wrote:
>
> >On Wed, 2007-07-11 at 23:48 +0400, Edward Shishkin wrote:
> >>Somebody missed set_file_hint(), which synchronizes the coords.
> >>
> >>
> err, sorry, its name is reiser4_set_hint
>
> >>Unfortunately I can not reproduce it. Would you please (if possible)
> >>catch the stack with the attached patch?
> >>
> >>
> >
> >[<ffffffff88186b5e>] :reiser4:save_file_hint+0xee/0x3c0
> >[<ffffffff88189c60>] :reiser4:read_unix_file+0x940/0xa10
> >[<ffffffff80276bbb>] vfs_read+0xdb/0x180
> >[<ffffffff80277083>] sys_read+0x53/0x90
> >[<ffffffff8020993e>] system_call+0x7e/0x83
> >
> >
>
> Thanks!
> Indeed, the coords are not synchronized when reading tails. However,
> it is not a fatal bug: we are victims of brain damaged and unreadable
> hint interface.
>
> The possible fix is attached. Would you please test it?
> Also don't forget to apply this patch:
> http://lkml.org/lkml/diff/2007/7/11/396/1
> as it also can be related to the problem.
>
> Edward.
Sorry for being so late to reply. Yes, the fix works, but it took some
time to test because I'm still seeing the previously mentioned panic in
sibling_list_remove, except now it takes an hour or two to panic. I'm
reasonably sure I'm not seeing the save_file_hint panic anymore, though.
--
Jake Maciejewski <maciejej@msoe.edu>