* Re: BTW: 2.4.19-patches-to-come?
@ 2002-05-17 15:36 Dieter Nützel
  2002-05-17 15:47 ` Chris Mason
  2002-05-17 15:54 ` Oleg Drokin
  0 siblings, 2 replies; 15+ messages in thread
From: Dieter Nützel @ 2002-05-17 15:36 UTC
  To: Oleg Drokin; +Cc: Chris Mason, Manuel Krause, ReiserFS List

[-- Attachment #1: Type: text/plain, Size: 6338 bytes --]

On Friday 17 May 2002 09:47, Oleg Drokin wrote:
>Hello!
>
>On Fri, May 17, 2002 at 03:49:06AM +0200, Manuel Krause wrote:
>
> > I was curious and I stay curious, as I don't see any program deleting 3 
> > files just before every crash...  (I don't have sizes, places and
> > names...)
>
> It did not necessarily happen to the files directly before a crash.
> Any open file that was deleted cannot actually be removed until it is closed.
> e.g. if you'd do "sleep 1000000 </path/somefile & rm /path/somefile",
> /path/somefile won't be deleted until sleep finishes.
> If the system crashed while there were still some open but deleted files,
> these files get removed as part of the crash recovery process.
> And it is not possible to find the files' names/paths because the corresponding
> directory entries were already deleted.
>
> BTW, absolutely the same thing happens with ext2: once you see
> e2fsck complain about "DTIME is zero", it deletes
> not-yet-deleted-but-scheduled-for-deletion files.
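
The behaviour Oleg describes (a deleted file living on until its last open
descriptor is closed) can be observed from user space. What follows is only a
minimal sketch, assuming a POSIX system and Python 3; the function name is made
up for illustration and nothing here is specific to ReiserFS.

```python
import os
import tempfile


def unlinked_file_still_readable():
    """Create a file, unlink it while it is open, and read it back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        os.unlink(path)                    # the directory entry is gone now
        name_exists = os.path.exists(path)
        # Link count of the open inode; expected to drop to 0 on a local
        # POSIX filesystem, although the data is still reachable via fd.
        links = os.fstat(fd).st_nlink
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)
    finally:
        os.close(fd)                       # last reference dropped here
    return name_exists, links, data


print(unlinked_file_still_readable())
```

If the machine crashes between the unlink and the close, the file has no name
left but still occupies space, which is the case the recovery code has to
clean up.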

Sorry Oleg, for jumping in here...

From time to time I get corrupted files (after reboot and replay) which were 
_NOT_ written during/before the crash!!!
It happens during a huge C++ compilation of one of my 3D VIS apps.

After reboot I have garbage in the original *.cxx files. This is very strange 
and time consuming 'cause I have to remove every broken file by hand and 
recreate it from CVS when the compiler hits it during the next run.

A similar thing happens during kernel compilations.
When the system crashes (due to kernel devel stuff) some *.o or mostly 
.*.o.flags files are broken after replay/reboot. Yes, this time they were 
written before/during the crash and rebuilt during replay, but it would be much 
more useful if they were simply _REMOVED_ during replay, because the "next" 
kernel build can't run smoothly without "make clean" or deletion by hand.
What do you think?

Here is how it looks in the C++ case:

c++ -O -mcpu=k6 -pipe -mpreferred-stack-boundary=2 -malign-functions=4 
-fschedule-insns2 -fexpensive-optimizations -DvtkRenderingPython_EXPORTS 
-fPIC -I/opt/VTK/V4.0/VTK/Rendering -I/opt/VTK/V4.0/VTK/Rendering 
-I/opt/VTK/V4.0/VTK/Hybrid -I/opt/VTK/V4.0/VTK/Patented -I/opt/VTK/V4.0/VTK 
-I/opt/VTK/V4.0/VTK/Common -I/opt/VTK/V4.0/VTK/Filtering 
-I/opt/VTK/V4.0/VTK/Imaging -I/opt/VTK/V4.0/VTK/Graphics 
-I/opt/VTK/V4.0/VTK/IO -I/opt/VTK/V4.0/VTK/Utilities/zlib 
-I/opt/VTK/V4.0/VTK/Utilities/png -I/opt/VTK/V4.0/VTK/Utilities/jpeg 
-I/opt/VTK/V4.0/VTK/Utilities/tiff -I/opt/VTK/V4.0/VTK/Utilities/expat 
-I/opt/VTK/V4.0/VTK/Common/Testing/Cxx -I/usr/include/python2.1    
-I/usr/X11R6/include -c 
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballCameraPython.cxx -o 
vtkInteractorStyleTrackballCameraPython.o
make[3]: *** [vtkInteractorStyleJoystickActorPython.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: *** [vtkInteractorStyleJoystickCameraPython.o] Error 1
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:1: parse 
error before `%'
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:4: character 
constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:4: unknown 
escape sequence: `\' followed by char code 0xe
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:17: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:17: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:44: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:1: 
unterminated character constant
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:14: 
unterminated character constant
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:44: unknown 
escape sequence `\@'
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:44: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:44: 
nondigits in number and not hexadecimal
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:44: 
nondigits in number and not hexadecimal
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballPython.cxx:44: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballCameraPython.cxx:43: 
unterminated string or character constant
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballCameraPython.cxx:43: 
possible real start of unterminated constant
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballCameraPython.cxx:1: 
syntax error before `('
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballCameraPython.cxx:42: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:1: 
parse error before `0'
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:2: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:2: 
nondigits in number and not hexadecimal
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:2: 
nondigits in number and not hexadecimal
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:2: 
unknown escape sequence: `\' followed by char code 0x91
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:2: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:6: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:6: 
unknown escape sequence: `\' followed by char code 0xf
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:14: 
character constant too long
/opt/VTK/V4.0/VTK/Rendering/vtkInteractorStyleTrackballActorPython.cxx:43: 
character constant too long
make[3]: *** [vtkInteractorStyleTrackballPython.o] Error 1
make[2]: *** [default_target] Error 2
make[1]: *** [default_target_Rendering] Error 2
make: *** [default_target] Error 2

Have a look at the falsely "rebuilt" broken files in the attachment, too.

Thanks,
	Dieter
-- 
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuetzel@hamburg.de


[-- Attachment #2: vtkInteractorStylePython.cxx.bz2 --]
[-- Type: application/x-bzip2, Size: 9339 bytes --]

[-- Attachment #3: vtkInteractorStyleTrackballPython.cxx.bz2 --]
[-- Type: application/x-bzip2, Size: 4932 bytes --]


* Re: BTW: 2.4.19-patches-to-come?
  2002-05-17 15:36 BTW: 2.4.19-patches-to-come? Dieter Nützel
@ 2002-05-17 15:47 ` Chris Mason
  2002-05-17 18:20   ` Dieter Nützel
  2002-05-17 15:54 ` Oleg Drokin
  1 sibling, 1 reply; 15+ messages in thread
From: Chris Mason @ 2002-05-17 15:47 UTC
  To: Dieter Nützel; +Cc: Oleg Drokin, Manuel Krause, ReiserFS List

On Fri, 2002-05-17 at 11:36, Dieter Nützel wrote:

> Sorry Oleg, that I jump in, here...
> 
> I get from time to time corrupted files (after reboot and replay) which are 
> _NOT_ written during/before crash!!!
> It happens during huge C++ compilation of one of my 3D VIS apps.
> 
> After reboot I have waste in the original *.cxx files. This is very strange 
> and time consuming 'cause I have to remove every broken file by hand and 
> recreate it with CVS when the compiler hit it during the next run.

Is this on IDE?  With tails turned on?  Are all the affected files less
than 16k?

> 
> A similar thing happen during kernel compilations.
> When the system crash (due to kernel devel stuff) some *.o or mostly 
> .*.o.flags files are broken after replay/reboot. Yes, this time they are 
> written before/during crash and rebuild during replay, 

This will be taken care of by the data=ordered patches.  I'm combining
them with the data logging stuff.

-chris




* Re: BTW: 2.4.19-patches-to-come?
  2002-05-17 15:36 BTW: 2.4.19-patches-to-come? Dieter Nützel
  2002-05-17 15:47 ` Chris Mason
@ 2002-05-17 15:54 ` Oleg Drokin
  2002-05-17 17:05   ` Dieter Nützel
  1 sibling, 1 reply; 15+ messages in thread
From: Oleg Drokin @ 2002-05-17 15:54 UTC
  To: Dieter Nützel; +Cc: Chris Mason, Manuel Krause, ReiserFS List

Hello!

On Fri, May 17, 2002 at 05:36:50PM +0200, Dieter Nützel wrote:

> > It did not necessarily happen to the files directly before a crash.
> > Any open file that was deleted cannot actually be removed until it is closed.
> > e.g. if you'd do "sleep 1000000 </path/somefile & rm /path/somefile",
> > /path/somefile won't be deleted until sleep finishes.
> > If the system crashed while there were still some open but deleted files,
> > these files get removed as part of the crash recovery process.
> > And it is not possible to find the files' names/paths because the corresponding
> > directory entries were already deleted.
> > BTW, absolutely the same thing happens with ext2: once you see
> > e2fsck complain about "DTIME is zero", it deletes
> > not-yet-deleted-but-scheduled-for-deletion files.
> Sorry Oleg, that I jump in, here...
> I get from time to time corrupted files (after reboot and replay) which are 
> _NOT_ written during/before crash!!!

What is the corruption pattern?

> It happens during huge C++ compilation of one of my 3D VIS apps.

atime is still updated, I think. What was the crash, btw?
Hard power loss?

> A similar thing happen during kernel compilations.
> When the system crash (due to kernel devel stuff) some *.o or mostly 
> .*.o.flags files are broken after replay/reboot. Yes, this time they are 

This is expected.

> written before/during crash and rebuild during replay, but it should much 
> more useful if they were only _REMOVED_ during replay. Because the "next" 
> kernel build can't run smooth without "make clean" or deletion by hand.
> What do you think?

These files were not deleted, so why should the fs delete them?

> Have a look into the falsely "rebuild" broken files in the attachment, too.

Later today

Bye,
    Oleg


* Re: BTW: 2.4.19-patches-to-come?
  2002-05-17 15:54 ` Oleg Drokin
@ 2002-05-17 17:05   ` Dieter Nützel
  2002-05-17 18:28     ` Oleg Drokin
  0 siblings, 1 reply; 15+ messages in thread
From: Dieter Nützel @ 2002-05-17 17:05 UTC
  To: Oleg Drokin; +Cc: Chris Mason, Manuel Krause, ReiserFS List

On Friday 17 May 2002 17:54, Oleg Drokin wrote:
> Hello!
>
> On Fri, May 17, 2002 at 05:36:50PM +0200, Dieter Nützel wrote:
> > > It did not necessarily happen to the files directly before a crash.
> > > Any open file that was deleted cannot actually be removed until it is closed.
> > > e.g. if you'd do "sleep 1000000 </path/somefile & rm /path/somefile",
> > > /path/somefile won't be deleted until sleep finishes.
> > > If the system crashed while there were still some open but deleted
> > > files, these files get removed as part of the crash recovery process.
> > > And it is not possible to find the files' names/paths because the corresponding
> > > directory entries were already deleted.
> > > BTW, absolutely the same thing happens with ext2: once you see
> > > e2fsck complain about "DTIME is zero", it deletes
> > > not-yet-deleted-but-scheduled-for-deletion files.
> >
> > Sorry Oleg, that I jump in, here...
> > I get from time to time corrupted files (after reboot and replay) which
> > are _NOT_ written during/before crash!!!
>
> What is the corruption pattern?

Have a look at the two examples.

> > It happens during huge C++ compilation of one of my 3D VIS apps.
>
> atime is still updated, I think.

Yes, could be the case. The partition is not mounted with "noatime".

> What was the crash, btw?
> Hard power loss?

No.
"Only" a hard lockup. Sadly SysRq is not working ;-(

> > A similar thing happen during kernel compilations.
> > When the system crash (due to kernel devel stuff) some *.o or mostly
> > .*.o.flags files are broken after replay/reboot. Yes, this time they are
>
> This is expected.
>
> > written before/during crash and rebuild during replay, but it should much
> > more useful if they were only _REMOVED_ during replay. Because the
> > "next" kernel build can't run smooth without "make clean" or deletion by
> > hand. What do you think?
>
> These files are not deleted, so why fs should delete these?

Yes, you are right. But semantically they are worthless 'cause they are 
"broken"...

> > Have a look into the falsely "rebuild" broken files in the attachment,
> > too.
>
> Later today

Hey, it's the weekend then...;-)

Happy Whitsun everyone!

Regards,
	Dieter


* Re: BTW: 2.4.19-patches-to-come?
  2002-05-17 15:47 ` Chris Mason
@ 2002-05-17 18:20   ` Dieter Nützel
  2002-05-17 18:26     ` Dieter Nützel
  2002-05-20 13:48     ` Chris Mason
  0 siblings, 2 replies; 15+ messages in thread
From: Dieter Nützel @ 2002-05-17 18:20 UTC
  To: Chris Mason; +Cc: Oleg Drokin, Manuel Krause, ReiserFS List

On Friday 17 May 2002 17:47, Chris Mason wrote:
> On Fri, 2002-05-17 at 11:36, Dieter Nützel wrote:
> > Sorry Oleg, that I jump in, here...
> >
> > I get from time to time corrupted files (after reboot and replay) which
> > are _NOT_ written during/before crash!!!
> > It happens during huge C++ compilation of one of my 3D VIS apps.
> >
> > After reboot I have waste in the original *.cxx files. This is very
> > strange and time consuming 'cause I have to remove every broken file by
> > hand and recreate it with CVS when the compiler hit it during the next
> > run.
>
> Is this on IDE?

What? Never ever had such "things" on my private system...;-)

If you need the ultimate U160 disk, look at the Fujitsu MAM3184 (18 GB) or 
its bigger brother (36 GB): 15k RPM, 3.5 ms read, 4.0 ms write, as silent as 
the current ATA-5/6 disks (!!!) and fast as hell (max 88.9 MB/s disk transfer 
rate) (!!!). Sadly my old AHA-2940UW can't handle them correctly even though 
they should be "downward compatible". Time for a real U160 controller (any 
bounties?)...-;)

> With tails turned on?

Not at all.

/dev/sda3 on / type reiserfs (rw,noatime,notail)
/dev/sda2 on /tmp type reiserfs (rw,notail)
/dev/sda5 on /var type reiserfs (rw,notail)
/dev/sda6 on /home type reiserfs (rw,notail)
/dev/sda7 on /usr type reiserfs (rw,notail)
/dev/sda8 on /opt type reiserfs (rw,notail)
/dev/sdb1 on /Pakete type reiserfs (rw,notail)
/dev/sdb5 on /database/db1 type reiserfs (rw,notail)
/dev/sdb6 on /database/db2 type reiserfs (rw,notail)
/dev/sdb7 on /database/db3 type reiserfs (rw,notail)
/dev/sdb8 on /database/db4 type reiserfs (rw,notail)

> Are all the affected files less than 16k?

I have to recheck (regenerate the "test case" --- push the power button).

To be precise, I got this with and without the data logging patch applied, but I 
haven't mounted with data logging.

Most (all) former kernel versions, with and without preemption+lock-break, showed 
the same behaviour.

> > A similar thing happen during kernel compilations.
> > When the system crash (due to kernel devel stuff) some *.o or mostly
> > .*.o.flags files are broken after replay/reboot. Yes, this time they are
> > written before/during crash and rebuild during replay,
>
> This will be taken care of by the data=ordered patches.  I'm combining
> them with the data logging stuff.

Ah, then it's time for my data logging results.

See them in my next post.

-Dieter


* Re: BTW: 2.4.19-patches-to-come?
  2002-05-17 18:20   ` Dieter Nützel
@ 2002-05-17 18:26     ` Dieter Nützel
  2002-05-20 13:48     ` Chris Mason
  1 sibling, 0 replies; 15+ messages in thread
From: Dieter Nützel @ 2002-05-17 18:26 UTC
  To: Chris Mason; +Cc: Oleg Drokin, Manuel Krause, ReiserFS List

On Friday 17 May 2002 20:20, Dieter Nützel wrote:
> On Friday 17 May 2002 17:47, Chris Mason wrote:
> > On Fri, 2002-05-17 at 11:36, Dieter Nützel wrote:
> > > Sorry Oleg, that I jump in, here...

> > Are all the affected files less than 16k?
>
> I have to recheck (regenerate the "test case" --- push the power button).

Oops, see the two attached files again.
One is below 16k, the other above.

-Dieter


* Re: BTW: 2.4.19-patches-to-come?
  2002-05-17 17:05   ` Dieter Nützel
@ 2002-05-17 18:28     ` Oleg Drokin
  2002-05-17 19:44       ` Dieter Nützel
  0 siblings, 1 reply; 15+ messages in thread
From: Oleg Drokin @ 2002-05-17 18:28 UTC
  To: Dieter Nützel; +Cc: Chris Mason, Manuel Krause, ReiserFS List

Hello!

On Fri, May 17, 2002 at 07:05:30PM +0200, Dieter Nützel wrote:
> > What is the corruption pattern?
> Have a look into the two examples.

I will, but later, it seems.

> > What was the crash, btw?
> > Hard power loss?
> No.
> "Only" hard lockup. Sadly SysRq is not working ;-(

And you fix it with the reset switch?

BTW, was that you with a kernel with a lot of patches applied, or
do you not use anything unusual on your workstations?

> > > written before/during crash and rebuild during replay, but it should much
> > > more useful if they were only _REMOVED_ during replay. Because the
> > > "next" kernel build can't run smooth without "make clean" or deletion by
> > > hand. What do you think?
> > These files are not deleted, so why fs should delete these?
> Yes, you are right. But semantically they are worthless 'cause they are 
> "broken"...

The FS cannot know this. Absolutely cannot.
The file was created and written into, so the file appears after reboot (if the
transaction in which it was created was committed).
BTW, I have a feeling that datalogging patches would help to get rid of
garbage in the files, but files still will remain (may be of zero length).
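
A program that cannot tolerate half-written output can also narrow the window
itself: write to a temporary name, force the data to disk, and only then rename
it over the real name. This is only a sketch, assuming a POSIX system and
Python 3; `write_atomically` is a hypothetical helper, not something proposed
in this thread, and it does not change what the filesystem itself guarantees.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write bytes to path via a temporary file, fsync, and rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())    # force the file data to disk first
        os.replace(tmp_path, path)    # atomic rename within one filesystem
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Flush the directory too, so the rename itself reaches the disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as build_dir:
        target = os.path.join(build_dir, "example.o")
        write_atomically(target, b"object code would go here")
        print(os.listdir(build_dir))
```

The cost is an fsync per file, which is probably why compilers and build tools
do not do this by default.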

> > > Have a look into the falsely "rebuild" broken files in the attachment,
> > > too.
> > Later today
> Hey, it's weekend then...;-)

So what? ;)

> Happy Whitsun everyone!

Hm. What's that? ;)

Bye,
    Oleg


* Re: BTW: 2.4.19-patches-to-come?
  2002-05-17 18:28     ` Oleg Drokin
@ 2002-05-17 19:44       ` Dieter Nützel
  2002-05-18  5:44         ` Oleg Drokin
  0 siblings, 1 reply; 15+ messages in thread
From: Dieter Nützel @ 2002-05-17 19:44 UTC
  To: Oleg Drokin; +Cc: Chris Mason, Manuel Krause, ReiserFS List

On Friday 17 May 2002 20:28, Oleg Drokin wrote:
> Hello!
>
> On Fri, May 17, 2002 at 07:05:30PM +0200, Dieter Nützel wrote:
> > > What is the corruption pattern?
> >
> > Have a look into the two examples.
>
> I will, but later, it seems.
>
> > > What was the crash, btw?
> > > Hard power loss?
> >
> > No.
> > "Only" hard lockup. Sadly SysRq is not working ;-(
>
> And you fix it by "reset switch"?

Yes, but fix? 8-)
The system comes up "normally", does the replay, and runs smoothly except for 
the "few" broken files.

> BTW, Was that you with a kernel with lot of patches applied, or
> you do not use anything strange on your workstations?

Yes, but it happens with and without data logging, preemption, aa_VM, etc.

I can reproduce it with some "new" page coloring stuff (only).
When I modprobe the page coloring module (it is under development) the system 
locks up from time to time, during heavy (parallel) compilation for example.
Then I get the broken files.

> > > > written before/during crash and rebuild during replay, but it should
> > > > much more useful if they were only _REMOVED_ during replay. Because
> > > > the "next" kernel build can't run smooth without "make clean" or
> > > > deletion by hand. What do you think?
> > >
> > > These files are not deleted, so why fs should delete these?
> >
> > Yes, you are right. But semantically they are worthless 'cause they are
> > "broken"...
>
> FS cannot know this. Absolutely cannot.

Yes, I know for sure.

> File was created, written into. So file appears after reboot (if
> transaction in which it was created was committed).

Maybe the FS should have some knowledge "built in" to "decide" that *.o files 
are "temporary" and should be removed during replay?

> BTW, I have a feeling that datalogging patches would help to get rid of
> garbage in the files, but files still will remain (may be of zero length).

Yes, Chris said this, too.

>
> > > > Have a look into the falsely "rebuild" broken files in the
> > > > attachment, too.
> > >
> > > Later today
> >
> > Hey, it's weekend then...;-)
>
> So what? ;)

:-)

> > Happy Whitsun everyone!
>
> Hm. What's that? ;)

Here in Germany the two Whitsun days, Sunday and Monday, are (high Christian) 
holidays and most people are away on a short vacation.

-Dieter

BTW I have to work, tomorrow ;-)


* Re: BTW: 2.4.19-patches-to-come?
  2002-05-17 19:44       ` Dieter Nützel
@ 2002-05-18  5:44         ` Oleg Drokin
  2002-05-18 13:18           ` Dieter Nützel
  0 siblings, 1 reply; 15+ messages in thread
From: Oleg Drokin @ 2002-05-18  5:44 UTC
  To: Dieter Nützel; +Cc: Chris Mason, Manuel Krause, ReiserFS List

Hello!

On Fri, May 17, 2002 at 09:44:21PM +0200, Dieter Nützel wrote:

> > BTW, Was that you with a kernel with lot of patches applied, or
> > you do not use anything strange on your workstations?
> Yes, but it happen with and without data logging, preemption, aa_VM, etc.
> I can reproduce it with some "new" page coloring stuff (only).
> When I modprobe the page coloring module (it is under development) the system 
> locks up from time to time during heavy (parallel) compilation for example.
> Then I get the broken files.

Well, it sounds like the page colouring stuff mistakenly writes some pages to
the wrong location, no?
If you cannot reproduce it without the page colouring patch, that would be the
most probable theory, I'd say.

> > File was created, written into. So file appears after reboot (if
> > transaction in which it was created was committed).
> Maybe the FS should have some knowledge "built in" to "decide" that *.o files 
> are "temporary" and should be removed during replay?

No, there is absolutely no way for such kludges to go into the FS, I think.
And it may even be the case that the .o file was created and committed, and then
the journal record became obsolete because the metadata made its way to the disk,
so the file is already in place (without the body, or with garbage instead
of the body), but that has nothing to do with the journal.

> > BTW, I have a feeling that datalogging patches would help to get rid of
> > garbage in the files, but files still will remain (may be of zero length).
> Yes, Chris said this, too.

He said it will help your problem, I think. But zero length .o files will
still confuse the linker.
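
Since the filesystem will not remove such files, a build tree can be checked
for them before the next build. This is a rough sketch, assuming Python 3;
`find_empty_build_outputs` is a hypothetical helper, and it only catches the
zero-length case, not files that contain garbage.

```python
import os
import stat
import tempfile


def find_empty_build_outputs(root, suffixes=(".o", ".o.flags")):
    """Return sorted paths of zero-length regular files with the given suffixes."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            full = os.path.join(dirpath, name)
            info = os.lstat(full)          # lstat: do not follow symlinks
            if stat.S_ISREG(info.st_mode) and info.st_size == 0:
                empty.append(full)
    return sorted(empty)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tree:
        with open(os.path.join(tree, "good.o"), "wb") as f:
            f.write(b"\x7fELF")
        open(os.path.join(tree, "bad.o"), "wb").close()
        for path in find_empty_build_outputs(tree):
            print(os.path.relpath(path, tree))
```

Deleting the reported files and rerunning make is then cheaper than a full
"make clean".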

Bye,
    Oleg


* Re: BTW: 2.4.19-patches-to-come?
  2002-05-18  5:44         ` Oleg Drokin
@ 2002-05-18 13:18           ` Dieter Nützel
  2002-05-18 16:29             ` Oleg Drokin
  0 siblings, 1 reply; 15+ messages in thread
From: Dieter Nützel @ 2002-05-18 13:18 UTC
  To: Oleg Drokin; +Cc: Chris Mason, Manuel Krause, ReiserFS List

On Saturday 18 May 2002 07:44, Oleg Drokin wrote:
> Hello!
>
> On Fri, May 17, 2002 at 09:44:21PM +0200, Dieter Nützel wrote:
> > > BTW, Was that you with a kernel with lot of patches applied, or
> > > you do not use anything strange on your workstations?
> >
> > Yes, but it happen with and without data logging, preemption, aa_VM, etc.
> > I can reproduce it with some "new" page coloring stuff (only).
> > When I modprobe the page coloring module (it is under development) the
> > system locks up from time to time during heavy (parallel) compilation for
> > example. Then I get the broken files.
>
> Well, sounds like page colouring stuff mistakenly writes some pages in
> wrong location, no?

Yes, that could be the case.

> If you cannot reproduce without page colouring patch, that would be the
> most probably theory, I'd say.

That isn't what I said. I only said that page coloring is the easiest way for 
me to "reproduce" the kernel crash.

The problem of broken files appears with "any" kernel crash, which I get from 
time to time while testing/developing devel kernel stuff.

> > > File was created, written into. So file appears after reboot (if
> > > transaction in which it was created was committed).
> >
> > Maybe the FS should have some knowledge "built in" to "decide" that *.o
> > files are "temporary" and should be removed during replay?
>
> No, there is absolutely no way for such kludges to go into the FS, I think.
> And it even may be the case where .o file was created, committed, and then
> journal record obsoleted, because metadata got its way to the disk,
> so file is already in place (without the body, or with garbage instead
> of the body), but it have nothing to do with journal.

I understand this very well, but "everyone" is talking about making computer 
systems and FSs more "intelligent"...;-)

> > > BTW, I have a feeling that datalogging patches would help to get rid of
> > > garbage in the files, but files still will remain (may be of zero
> > > length).
> >
> > Yes, Chris said this, too.
>
> He said it will help your problem, I think. But zero length .o files will
> still confuse linker.

So what's your advice then? Mounting all FSs with "noatime"?

Thanks,
	Dieter


* Re: BTW: 2.4.19-patches-to-come?
  2002-05-18 13:18           ` Dieter Nützel
@ 2002-05-18 16:29             ` Oleg Drokin
  2002-05-18 21:44               ` Manuel Krause
  0 siblings, 1 reply; 15+ messages in thread
From: Oleg Drokin @ 2002-05-18 16:29 UTC
  To: Dieter Nützel; +Cc: Chris Mason, Manuel Krause, ReiserFS List

Hello!

On Sat, May 18, 2002 at 03:18:23PM +0200, Dieter Nützel wrote:
> > > I can reproduce it with some "new" page coloring stuff (only).
> > > When I modprobe the page coloring module (it is under development) the
> > > system locks up from time to time during heavy (parallel) compilation for
> > > example. Then I get the broken files.
> > Well, sounds like page colouring stuff mistakenly writes some pages in
> > wrong location, no?
> Yes, that could be the case.
> > If you cannot reproduce without page colouring patch, that would be the
> > most probably theory, I'd say.
> That isn't what I've said. I only said I can "reproduce" the kernel crash with 
> page coloring the easiest way.

Sorry, I misinterpreted it, then. Perhaps I was confused by the word "only" in the sentence above.

> > > > BTW, I have a feeling that datalogging patches would help to get rid of
> > > > garbage in the files, but files still will remain (may be of zero
> > > > length).
> > > Yes, Chris said this, too.
> >
> > He said it will help your problem, I think. But zero length .o files will
> > still confuse linker.
> So what's your advice then? Mounting all FSs with "noatime"?

noatime has nothing to do with wrong content in recently created files after
a crash.
It was only my guess that atime updates might have caused corruption on file
read. But you are not using IDE. Anyway, you can try and tell us.

Bye,
    Oleg


* Re: Re: BTW: 2.4.19-patches-to-come?
  2002-05-18 16:29             ` Oleg Drokin
@ 2002-05-18 21:44               ` Manuel Krause
  2002-05-19  9:44                 ` Oleg Drokin
  0 siblings, 1 reply; 15+ messages in thread
From: Manuel Krause @ 2002-05-18 21:44 UTC
  To: Oleg Drokin; +Cc: Chris Mason, reiserfs-list, Dieter Nuetzel

On 05/18/2002 06:29 PM, Oleg Drokin wrote:

> Hello!
> 
> On Sat, May 18, 2002 at 03:18:23PM +0200, Dieter Nützel wrote:
> 
>>>>I can reproduce it with some "new" page coloring stuff (only).
>>>>When I modprobe the page coloring module (it is under development) the
>>>>system locks up from time to time during heavy (parallel) compilation for
>>>>example. Then I get the broken files.
>>>>
>>>Well, sounds like page colouring stuff mistakenly writes some pages in
>>>wrong location, no?
>>>
>>Yes, that could be the case.
>>
>>>If you cannot reproduce without page colouring patch, that would be the
>>>most probably theory, I'd say.
>>>
>>That isn't what I've said. I only said I can "reproduce" the kernel crash with 
>>page coloring the easiest way.
>>
> 
> Sorry, I misinterpreted it, then. Perhaps I was confused by "only" word in sentence above.
> 
> 
>>>>>BTW, I have a feeling that datalogging patches would help to get rid of
>>>>>garbage in the files, but files still will remain (may be of zero
>>>>>length).
>>>>>
>>>>Yes, Chris said this, too.
>>>>
>>>He said it will help your problem, I think. But zero length .o files will
>>>still confuse linker.
>>>
>>So what's your advice then? Mounting all FSs with "noatime"?
>>
> 
> noatime have nothing to do with wrong content of recently created files after
> crash.
> It was only my guess that atime updates might have caused corruptions on file
> read. But you are not using IDE. Anyway, you can try and tell us.
> 
> Bye,
>     Oleg
> 


Thank you all for keeping me informed, too!

I have had all my partitions mounted with -o [...+]noatime, and the reiserfs 
ones with [+]notail, for many years, as it was said to bring a performance 
gain (I never verified it myself). As I get the same errors as Dieter, but on 
IDE disks (!), "noatime" might not solve the problem for recently 
accessed files (as may also occur on a crash with a running KDE and its 
setup files).

Even though I am left with some *.o files when a kernel-or-so compile fails 
and I need to run e.g. "make mrproper" to get things in order again, I don't 
want them all to be removed "intelligently" by the FS. Oleg, you 
explained it right. I don't want files vanishing completely after an 
intermediate crash (even if not everything has been built and I may 
need to recompile or reset /all/ after restart). But in fact it can 
sometimes be very complicated to find the corrupted=broken-but-not-deleted 
file.


Oleg, another approach to the original topic's problem, since with my typical 
crash pattern I can't tell which files are accessed and unlinked/deleted 
before a crash: how can I monitor the files accessed 
just before my typical crashes (to see what reiserfs makes of 
them after the crash)?
I really want to watch what gets restored, recovered or deleted, if 
possible. So far I can't see the logic connecting your explanations to my 
ReiserFS's behaviour here, sorry, though I understood your and Chris's 
words on this. Is there a way to take such a "snapshot"?

Dieter, is there a timing pattern to your crashes? I mean e.g. 
in secs or mins after system startup? My "most loved crashes" 
(without-page-colouring, of course) occur approx. 2 min after startup 
under full Seti load, or formerly on a "make bzlilo" without "sync" after 
"make bzImage" (formerly = "with unusable bdflush and harddisk settings").

Chris, can we have an answer to Dieter's questions ("What's wrong?") in the 
thread "[reiserfs-list] Re: data-logging on top of 
linux-2.4.19p7-compound-speedup-2.patch", and a notification about your 
future patchset for the latest official reiserfs stuff? Thank you! Really, 
with your patches from ftp.suse.com I did not feel comfortable adjusting lines 
by hand (see Dieter's questions on this...)


Bye, and thanks again,
have a nice Whitsun (I never knew the English word or that explanation 
before Dieter mentioned it; I have to work all day too, though, thanks!)


Manuel




* Re: Re: BTW: 2.4.19-patches-to-come?
  2002-05-18 21:44               ` Manuel Krause
@ 2002-05-19  9:44                 ` Oleg Drokin
  2002-05-20 17:28                   ` Valdis.Kletnieks
  0 siblings, 1 reply; 15+ messages in thread
From: Oleg Drokin @ 2002-05-19  9:44 UTC (permalink / raw)
  To: Manuel Krause; +Cc: Chris Mason, reiserfs-list, Dieter Nuetzel

Hello!

On Sat, May 18, 2002 at 11:44:55PM +0200, Manuel Krause wrote:

> Oleg, another approach to the original topics' problems as I can't say 
> what files are accessed and unlinked/deleted before crash with sense 
> with my typical crash pattern: How can I try to monitor accessed files 
> just before my typical crash patterns (to see what reiserfs makes of 
> them after crash)?

Usually you cannot reliably predict a crash time (note that if you
can, and it is a SW fault, you should file a bug report instead, I think).

> I really want to watch what gets restored, recovered or deleted if 
> possible. So far I don't get a logic from your explanations to my 

You get the keys of all objects truncated or deleted during mount.
Deleted keys won't help you as they are deleted; truncated files can be found
by their key (e.g. debugreiserfs -d /dev/hda and look for the first 2 
numbers in a key).

> ReiserFS' behaviour on this, sorry, though I understood your and Chris 
> words on this. Is there a way to do this "snapshot"?

I think Chris said you can get a snapshot using intermezzo + the data-logging
patches from Chris.

> Dieter, do you have a time relation of your crashes? I mean e.g. related 
> in secs or mins after system startup? My "most loved crashes" 
> (without-page-colouring, of course) occur approx. 2min after startup 
> with full Seti-load or formerly on a "make bzlilo" without "sync" after 
> "make bzImage" (formerly = "with unusable bdflush and harddisk settings").

You mean, once you start a CPU-intensive task, everything dies pretty soon?
Are you sure you do not have hardware/overheating problems?
I had a problem on my WS: if I ran 20 instances of
while :; do :; done
my WS would hang in under 5 minutes. This was fixed by adding 2
extra coolers to the case.
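
A minimal sketch of that kind of load test, assuming a POSIX shell. The
function name burn and its two arguments (number of loops, seconds to run)
are made up for illustration and are not from the original test:

```shell
# Hypothetical helper: start N busy loops in the background, let them
# run for a fixed time, then stop them again.
burn() {
    n=$1        # number of busy loops to start
    secs=$2     # how long to keep them running
    pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        # Each subshell spins forever and keeps one CPU busy.
        ( while :; do :; done ) &
        pids="$pids $!"
        i=$((i + 1))
    done
    sleep "$secs"
    # Stop the loops and reap them; ignore their non-zero exit status.
    kill $pids 2>/dev/null || true
    wait $pids 2>/dev/null || true
    echo "ran $n busy loops for ${secs}s"
}
```

Something like "burn 20 300" would roughly match the 20 instances and the
5 minute window mentioned above. If the box hangs before the final message
is printed, hardware or cooling is a likely suspect.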

Bye,
    Oleg


* Re: Re: BTW: 2.4.19-patches-to-come?
  2002-05-17 18:20   ` Dieter Nützel
  2002-05-17 18:26     ` Dieter Nützel
@ 2002-05-20 13:48     ` Chris Mason
  1 sibling, 0 replies; 15+ messages in thread
From: Chris Mason @ 2002-05-20 13:48 UTC (permalink / raw)
  To: Dieter Nützel; +Cc: Oleg Drokin, Manuel Krause, ReiserFS List

On Fri, 2002-05-17 at 14:20, Dieter Nützel wrote:
> On Friday 17 May 2002 17:47, Chris Mason wrote:
> > On Fri, 2002-05-17 at 11:36, Dieter Nützel wrote:
> > > Sorry Oleg, that I jump in, here...
> > >
> > > I get from time to time corrupted files (after reboot and replay) which
> > > are _NOT_ written during/before crash!!!
> > > It happens during huge C++ compilation of one of my 3D VIS apps.
> > >
> > > After reboot I have waste in the original *.cxx files. This is very
> > > strange and time consuming 'cause I have to remove every broken file by
> > > hand and recreate it with CVS when the compiler hit it during the next
> > > run.
> >
> > Is this on IDE?
> 
> What? Never ever had such "things" on my private system...;-)

Ah.  What I was hoping for was that tail conversions + writeback caching
were causing these corrupted files.  Clearly that isn't the case.  I'd
be really interested to see if you could reproduce on a vanilla kernel.

-chris




* Re: Re: BTW: 2.4.19-patches-to-come?
  2002-05-19  9:44                 ` Oleg Drokin
@ 2002-05-20 17:28                   ` Valdis.Kletnieks
  0 siblings, 0 replies; 15+ messages in thread
From: Valdis.Kletnieks @ 2002-05-20 17:28 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: reiserfs-list

On Sun, 19 May 2002 13:44:54 +0400, Oleg Drokin said:

> Usually you cannot reliably predict a crash time (note that if you
> can, and it is a SW fault, you should file a bug report instead, I think).

Wasn't there a Dilbert cartoon about a PHB that wanted advance announcement
of unscheduled outages? ;)


Thread overview: 15+ messages
2002-05-17 15:36 BTW: 2.4.19-patches-to-come? Dieter Nützel
2002-05-17 15:47 ` Chris Mason
2002-05-17 18:20   ` Dieter Nützel
2002-05-17 18:26     ` Dieter Nützel
2002-05-20 13:48     ` Chris Mason
2002-05-17 15:54 ` Oleg Drokin
2002-05-17 17:05   ` Dieter Nützel
2002-05-17 18:28     ` Oleg Drokin
2002-05-17 19:44       ` Dieter Nützel
2002-05-18  5:44         ` Oleg Drokin
2002-05-18 13:18           ` Dieter Nützel
2002-05-18 16:29             ` Oleg Drokin
2002-05-18 21:44               ` Manuel Krause
2002-05-19  9:44                 ` Oleg Drokin
2002-05-20 17:28                   ` Valdis.Kletnieks
