public inbox for linux-kernel@vger.kernel.org
* Apparent instability of reiserfs on 2.4.1
@ 2001-02-07 12:06 Hans Reiser
  2001-02-07 15:47 ` Chris Mason
  0 siblings, 1 reply; 45+ messages in thread
From: Hans Reiser @ 2001-02-07 12:06 UTC (permalink / raw)
  To: Vladimir V. Saveliev, zag@zag.botik.ru, Alexander Zarochentcev,
	Yury Shevchuk, Vladimir Demidov, Vitaly Fertman, Edward Shushkin,
	Nikita Danilov, Yury Yu. Rupasov, Alexander Lyamin,
	Elena V. Gryaznova, Chris Mason, gawain@torque.com,
	ragnar@bigstorage.com, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

I know that our number of users has increased, but I doubt that the increase is
sufficient to explain the marked increase in bug reports on reiserfs-list.  Please
be patient as we work on this.  We will issue a patch this week that will fix
some bugs (loss of the NFS i_generation count, and space leakage on crash due to
preallocated blocks being lost).

We will also change the default for mkreiserfs to create the new 2.4-only
format, as the old default (we have belatedly realized) is probably the cause of
many users reporting they can't create large files.

We have a bug affecting add_entry which we suspect is due to our rename not
being adequately atomic and leaving hidden directory entries in the filesystem,
and we are exploring how this might happen (improper journaling, we don't yet
know....)  Treat this description with the usual skepticism attached to any
explanation of a bug not yet fixed; our diagnosis continues....  This is the
most worrisome bug for us stability-wise.  It seems roughly one user a day
encounters it.

This patch certainly won't fix the bug that adds zeros to syslog files, which
we are desperate to learn how to reproduce at our site.

Thank you for your patience.

Hans
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 12:06 Apparent instability of reiserfs on 2.4.1 Hans Reiser
@ 2001-02-07 15:47 ` Chris Mason
  2001-02-07 16:38   ` [reiserfs-list] " David Rees
  2001-02-07 18:41   ` Vedran Rodic
  0 siblings, 2 replies; 45+ messages in thread
From: Chris Mason @ 2001-02-07 15:47 UTC (permalink / raw)
  To: Hans Reiser, Vladimir V. Saveliev, zag@zag.botik.ru,
	Alexander Zarochentcev, Yury Shevchuk, Vladimir Demidov,
	Vitaly Fertman, Edward Shushkin, Nikita Danilov, Yury Yu. Rupasov,
	Alexander Lyamin, Elena V. Gryaznova, gawain@torque.com,
	ragnar@bigstorage.com, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com



Ok, how about we list the known bugs:

zeros in log files, apparently only between bytes 2048 and 4096 (not
reproduced yet).

preallocated block leak on crash (fix in testing)

hidden directory entry cleanup (still reproducing, very hard to hit).

knfsd (patches in testing).

oops in reiserfs_symlink, create_virtual_node (bug in redhat gcc 2.96,
fixed by downloading the update).

We've also had a few reports of other corruptions, most of which have been
traced to hardware problems.  There are two where I'm not sure of the cause
yet, but the method to trigger the bug was too simple to not be a hardware
problem.

-chris


* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 15:47 ` Chris Mason
@ 2001-02-07 16:38   ` David Rees
  2001-02-07 16:48     ` Chris Mason
       [not found]     ` <3A818619.7C3967BC@baldauf.org>
  2001-02-07 18:41   ` Vedran Rodic
  1 sibling, 2 replies; 45+ messages in thread
From: David Rees @ 2001-02-07 16:38 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org, reiserfs-list@namesys.com

On Wed, Feb 07, 2001 at 10:47:09AM -0500, Chris Mason wrote:
> 
> Ok, how about we list the known bugs:
> 
> zeros in log files, apparently only between bytes 2048 and 4096 (not
> reproduced yet).

Could this bug be related to the reported corruption that people with
new VIA chipsets have been also reporting on ext2?  It seems similar
because of the location of the corruption:

http://marc.theaimsgroup.com/?l=linux-kernel&m=98147483712620&w=2

Anyway, it can't hurt to ask the bug reporters if they're using a
newer VIA chipset and see if they will upgrade their BIOS, which seems
to fix the problem.

-Dave

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 16:38   ` [reiserfs-list] " David Rees
@ 2001-02-07 16:48     ` Chris Mason
  2001-02-08  6:34       ` Daniel Stone
       [not found]     ` <3A818619.7C3967BC@baldauf.org>
  1 sibling, 1 reply; 45+ messages in thread
From: Chris Mason @ 2001-02-07 16:48 UTC (permalink / raw)
  To: David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com



On Wednesday, February 07, 2001 08:38:54 AM -0800 David Rees
<dbr@spoke.nols.com> wrote:

> On Wed, Feb 07, 2001 at 10:47:09AM -0500, Chris Mason wrote:
>> 
>> Ok, how about we list the known bugs:
>> 
>> zeros in log files, apparently only between bytes 2048 and 4096 (not
>> reproduced yet).
> 
> Could this bug be related to the reported corruption that people with
> new VIA chipsets have been also reporting on ext2?  It seems similar
> because of the location of the corruption:
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=98147483712620&w=2
> 
> Anyway, it can't hurt to ask the bug reporters if they're using a
> newer VIA chipset and see if they will upgrade their BIOS, which seems
> to fix the problem.

I'd love to blame this on VIA problems, but people are seeing it on other
chipsets too ;-)  

People who report this aren't seeing general corruption, just zeros in
files of specific sizes.  So, it really should be a reiserfs bug.

-chris




* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
       [not found]     ` <3A818619.7C3967BC@baldauf.org>
@ 2001-02-07 17:39       ` Chris Mason
  2001-02-07 17:53         ` Xuan Baldauf
  2001-02-07 19:14         ` Xuan Baldauf
  2001-02-07 21:47       ` Chris Wedgwood
  1 sibling, 2 replies; 45+ messages in thread
From: Chris Mason @ 2001-02-07 17:39 UTC (permalink / raw)
  To: Xuan Baldauf, David Rees
  Cc: linux-kernel@vger.kernel.org, reiserfs-list@namesys.com



On Wednesday, February 07, 2001 06:30:01 PM +0100 Xuan Baldauf
<xuan--reiserfs@baldauf.org> wrote:
> In my case, it's a SIS5513 board.
> 
> I have to note that I now have one case which is between offset 9260 and
> 11016. So probably the tails unpacking theory does not work out.
> 
> After a more systematic search, I have found the following offsets:
> 
> 9260..11016 = 1756
> 4204.. 5964 = 1760
> 2160.. 3243 = 1083
> 2896.. 3534 =  638
> 1392.. 3704 = 2312
> 
> and so on. Maybe I should write a program which automatically detects and
> reports the zero blocks. I think the theory of tails unpacking does not
> work out, because there are also areas affected which are not between 2048
> and 4096. Also the length of the zeroing can be greater than 2048.
> However, I did not encounter a length of over 4096.
> 

Files up to around 16k in length can have tails, and tails can be larger
than 2048 bytes.

Also interesting would be info about when the file closes.  reiserfs only
creates a tail on file close (file writes always go to full blocks).  If
your application has the file mmap'd, the rules change a little (does it?).

-chris


* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 17:39       ` Chris Mason
@ 2001-02-07 17:53         ` Xuan Baldauf
  2001-02-07 19:14         ` Xuan Baldauf
  1 sibling, 0 replies; 45+ messages in thread
From: Xuan Baldauf @ 2001-02-07 17:53 UTC (permalink / raw)
  To: Chris Mason
  Cc: David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com



Chris Mason wrote:

> On Wednesday, February 07, 2001 06:30:01 PM +0100 Xuan Baldauf
> <xuan--reiserfs@baldauf.org> wrote:
> > In my case, it's a SIS5513 board.
> >
> > I have to note that I now have one case which is between offset 9260 and
> > 11016. So probably the tails unpacking theory does not work out.
> >
> > After a more systematic search, I have found the following offsets:
> >
> > 9260..11016 = 1756
> > 4204.. 5964 = 1760
> > 2160.. 3243 = 1083
> > 2896.. 3534 =  638
> > 1392.. 3704 = 2312
> >
> > and so on. Maybe I should write a program which automatically detects and
> > reports the zero blocks. I think the theory of tails unpacking does not
> > work out, because there are also areas affected which are not between 2048
> > and 4096. Also the length of the zeroing can be greater than 2048.
> > However, I did not encounter a length of over 4096.
> >
>
> Files up to around 16k in length can have tails, and tails can be larger
> than 2048 bytes.

Oh, I thought that if tails are larger than 2048 bytes, they get converted to
ordinary 4k blocks. The last case (with 2312 zero block size) is the only case
where the first entry after the zero block is the last entry of the file. (This
file is smaller than 4k, all other files should be larger than 4k).

>
>
> Also interesting would be info about when the file closes.  reiserfs only
> creates a tail on file close (file writes always go to full blocks).  If
> your application has the file mmap'd, the rules change a little (does it?).

I have to admit that I do not really know: the ad server is a Java application,
and I do not know whether Java on Linux uses mmap for FileInputStream and
FileOutputStream. "lsof" shows an ordinary file descriptor (such as "13u") in
the FD column for those files; libraries do not have an fd, they have "mem" in
the FD column. So I guess that the files are not mmapped; the implementation
is easier with ordinary read()s and write()s.

It's interesting that the zeroed section never crosses a 4k boundary...
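
One way to check that observation against the offsets listed above is sketched
here. It is only an illustration: it assumes a 4096-byte block size and treats
each end offset as exclusive, and the function name is made up for this example.

```python
BLOCK_SIZE = 4096  # assumed filesystem block size


def crosses_block_boundary(start, end, block_size=BLOCK_SIZE):
    """Return True if the byte range [start, end) spans more than one block."""
    return start // block_size != (end - 1) // block_size


# (start, end) offsets of the zeroed regions quoted earlier in this message.
ranges = [(9260, 11016), (4204, 5964), (2160, 3243), (2896, 3534), (1392, 3704)]

for start, end in ranges:
    print(start, end, crosses_block_boundary(start, end))
```

A range that started in one 4k block and ended in the next would print True.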

>
>
> -chris

Xuân.



* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 15:47 ` Chris Mason
  2001-02-07 16:38   ` [reiserfs-list] " David Rees
@ 2001-02-07 18:41   ` Vedran Rodic
  2001-02-07 18:45     ` Chris Mason
  1 sibling, 1 reply; 45+ messages in thread
From: Vedran Rodic @ 2001-02-07 18:41 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-kernel, reiserfs-list

On Wed, Feb 07, 2001 at 10:47:09AM -0500, Chris Mason wrote:
> 
> 
> Ok, how about we list the known bugs:
> 
> zeros in log files, apparently only between bytes 2048 and 4096 (not
> reproduced yet).
> 
> preallocated block leak on crash (fix in testing)
> 
> hidden directory entry cleanup (still reproducing, very hard to hit).
> 
> knfsd (patches in testing).
> 
> oops in reiserfs_symlink, create_virtual_node (bug in redhat gcc 2.96,
> fixed by downloading the update).
> 
> We've also had a few reports of other corruptions, most of which have been
> traced to hardware problems.  There are two where I'm not sure of the cause
> yet, but the method to trigger the bug was too simple to not be a hardware
> problem.

So could some of these bugs also be present in the 3.5.x version of reiserfs?
Will you be fixing them for that version?

Vedran Rodic

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 18:41   ` Vedran Rodic
@ 2001-02-07 18:45     ` Chris Mason
  2001-02-07 19:15       ` Ivan Pulleyn
  0 siblings, 1 reply; 45+ messages in thread
From: Chris Mason @ 2001-02-07 18:45 UTC (permalink / raw)
  To: Vedran Rodic; +Cc: linux-kernel, reiserfs-list



On Wednesday, February 07, 2001 07:41:25 PM +0100 Vedran Rodic
<vedran@renata.irb.hr> wrote:

> 
> So could some of these bugs also be present in the 3.5.x version of reiserfs?
> Will you be fixing them for that version?
> 

This list of reiserfs bugs was all specific to the 3.6.x versions, and they
don't appear with the 3.5.x code.  You will probably have problems if you
compile 3.5.x reiserfs with an unpatched redhat gcc 2.96, though.  

-chris




* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 17:39       ` Chris Mason
  2001-02-07 17:53         ` Xuan Baldauf
@ 2001-02-07 19:14         ` Xuan Baldauf
  1 sibling, 0 replies; 45+ messages in thread
From: Xuan Baldauf @ 2001-02-07 19:14 UTC (permalink / raw)
  To: Chris Mason
  Cc: David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

Hi Chris,

this is the output of my zero block detection utility. Note that in all the
files mentioned, zero bytes never can exist there, so every zero byte is a bug.
The output format is:

${filename} ${decompressed?"d":"n"} ${startIndex} ${endIndex} ${length}

The data (sorted):

_log.2001-01-16--00-04-57.broken.orig   n       2312    3858    1546
_log.2001-01-17--00-02-27.orig.broken   n       2088    2391    303
_log.2001-02-06--00-00-30.corrupted     n       9260    10116   856
log.2000-04-07--00-00-20.gz     d       41988   42231   243
log.2000-04-29--00-00-17.gz     d       91843   91924   81
log.2000-07-20--00-00-01.gz     d       14690   14780   90
log.2000-07-20--00-00-08.gz     d       26234   26314   80
log.2000-07-20--00-01-06.gz     d       8255    8430    175
log.2000-11-08--00-00-02        n       37232   37959   727
log.2000-11-08--00-00-10.gz     d       2008    3588    1580
log.2000-11-08--00-01-01.gz     d       27237   27681   444
log.2000-11-08--00-01-38.gz     d       15547   15721   174
log.2000-11-08--00-01-52.gz     d       8254    8342    88
log.2000-11-08--00-03-42.gz     d       10985   11522   537
log.2000-11-11--00-00-03        n       3152    3923    771
log.2000-11-11--00-00-03.gz     d       3152    3923    771
log.2000-11-12--00-40-18.gz     d       239     1167    928
log.2000-11-14--00-01-17.gz     d       2472    3273    801
log.2000-11-14--00-01-26.gz     d       3152    3874    722
log.2000-11-15--00-00-58.gz     d       2944    3112    168
log.2000-11-15--02-55-09.gz     d       203     3343    3140
log.2000-11-16--00-01-47.gz     d       854     2206    1352
log.2000-11-18--00-01-42.gz     d       1704    2941    1237
log.2000-11-20--00-03-28.gz     d       1098    1914    816
log.2000-11-21--00-00-35.gz     d       3208    3392    184
log.2000-11-22--00-15-01.gz     d       6320    6632    312
log.2000-11-23--00-00-57.gz     d       2784    3354    570
log.2000-11-23--00-02-14.gz     d       2896    3697    801
log.2000-11-24--00-03-46.gz     d       2784    3413    629
log.2000-11-25--00-01-25.gz     d       2560    3363    803
log.2000-11-26--00-09-39.gz     d       112     3513    3401
log.2000-11-27--00-56-50.gz     d       252     1660    1408
log.2000-11-28--00-01-34.gz     d       1458    2858    1400
log.2000-11-28--00-02-07.gz     d       757     3117    2360
log.2000-11-29--00-00-07.gz     d       1704    3404    1700
log.2000-11-29--00-02-20.gz     d       936     1384    448
log.2000-11-29--00-05-40.gz     d       2416    3577    1161
log.2000-12-02--00-06-03.gz     d       627     3419    2792
log.2000-12-02--00-12-42.gz     d       321     1417    1096
log.2000-12-05--00-00-47.gz     d       1536    3123    1587
log.2000-12-06--00-00-35.gz     d       1080    3492    2412
log.2000-12-08--00-07-01.gz     d       3592    3692    100
log.2000-12-09--00-11-02.gz     d       1224    3955    2731
log.2000-12-10--00-10-03.gz     d       1211    3691    2480
log.2000-12-12--00-04-25.gz     d       3200    3743    543
log.2000-12-13--00-00-42.gz     d       2408    3104    696
log.2000-12-13--00-01-31.gz     d       5270    6822    1552
log.2000-12-13--11-09-43.gz     d       3592    3757    165
log.2000-12-14--00-01-30.gz     d       1600    2500    900
log.2000-12-15--00-00-28.gz     d       2560    3423    863
log.2000-12-19--00-04-08.gz     d       442     2682    2240
log.2000-12-20--00-01-29.gz     d       2672    3199    527
log.2000-12-20--00-02-47.gz     d       4682    5378    696
log.2000-12-21--00-00-16.gz     d       4872    6683    1811
log.2000-12-21--00-01-23.gz     d       9210    10170   960
log.2000-12-21--00-08-48.gz     d       2544    3256    712
log.2000-12-22--00-01-42.gz     d       4466    6938    2472
log.2000-12-23--00-00-11.gz     d       4872    6774    1902
log.2000-12-23--00-03-01.gz     d       1352    3827    2475
log.2000-12-24--00-00-15.gz     d       2064    3597    1533
log.2000-12-25--00-01-07.gz     d       1004    2188    1184
log.2000-12-26--00-01-36.gz     d       2080    3508    1428
log.2000-12-27--00-01-13.gz     d       1128    2684    1556
log.2000-12-27--00-05-42.gz     d       4396    4836    440
log.2000-12-30--00-06-54.gz     d       1280    3613    2333
log.2001-01-01--00-05-32        n       2896    3534    638
log.2001-01-01--00-12-07        n       4204    5964    1760
log.2001-01-03--00-03-11.gz     d       1536    3104    1568
log.2001-01-03--00-04-22        n       2160    3243    1083
log.2001-01-04--00-23-53.gz     d       2408    3587    1179
log.2001-01-05--00-26-01.gz     d       3184    3422    238
log.2001-01-06--00-01-52.gz     d       1104    3543    2439
log.2001-01-09--00-00-07.gz     d       2814    3310    496
log.2001-01-09--00-01-45.gz     d       1960    3835    1875
log.2001-01-09--09-43-49        n       1392    3704    2312
log.2001-01-11--00-00-18.gz     d       2048    3511    1463
log.2001-01-11--00-01-24.gz     d       1110    2214    1104
log.2001-01-12--00-00-10.gz     d       800     3589    2789
log.2001-01-12--00-00-30.gz     d       1952    3717    1765
log.2001-01-12--00-05-35.gz     d       1904    3861    1957
log.2001-01-14--00-04-22.gz     d       68      2180    2112
log.2001-01-15--00-03-52.gz     d       1038    1534    496
log.2001-01-16--00-00-03.gz     d       992     1673    681
log.2001-01-16--00-04-36.gz     d       2592    2699    107
log.2001-01-17--00-02-27.gz     d       36301   36634   333
log.2001-01-17--00-04-13.gz     d       312     590     278
log.2001-01-18--00-19-34.gz     d       3040    3699    659
log.2001-01-21--00-00-18.gz     d       5152    5624    472
log.2001-01-23--00-01-21.gz     d       1624    3510    1886
log.2001-01-24--00-04-36.gz     d       272     2518    2246
log.2001-01-27--00-00-13.gz     d       6512    7044    532
log.2001-01-29--00-03-12.gz     d       1129    2857    1728
log.2001-01-29--00-20-36.gz     d       3528    3726    198
log.2001-01-31--00-00-23.gz     d       5984    6254    270
log.2001-02-01--00-14-07.gz     d       560     2208    1648
log.2001-02-02--00-00-25.gz     d       1878    2582    704
log.2001-02-02--00-02-49.gz     d       3160    3353    193
log.2001-02-02--00-06-18.gz     d       1320    1423    103
log.2001-02-06--00-01-10.gz     d       2192    3037    845

Note that the files are in about 20 different subdirectories, but the directory
names were stripped in the output. Please also note that some of the zeroing may
not come from the bug, but from crashes. (Maybe the problem which creates
zeroing on crashes and the bug are identical?) The machine's configuration:

plato:~ # uptime
  8:03pm  up 70 days,  8:03,  5 users,  load average: 3.18, 2.50, 2.57
plato:~ # uname -a
Linux plato 2.4.0-test10 #4 Wed Nov 8 18:14:47 CET 2000 i586 unknown
plato:~ #

Because the uptime is 70 days, results from December 2000 onward cannot be due
to crashes.

The .gz files are not created on the fly, they are created after the log file
for the next day is created and the former plain log file has no write accesses
anymore. So the corruption is from the original log file, not from the program
which writes the .gz file.
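
For anyone who wants to run the same kind of scan on their own logs, a detector
along these lines can be sketched in a few lines of Python. This is not the
utility used above, only a rough equivalent under stated assumptions: the names
zero_runs and scan are invented for the example, endIndex is taken to be
exclusive (so length = endIndex - startIndex, which matches the table), and
files ending in ".gz" are decompressed before scanning.

```python
import gzip
import sys


def zero_runs(data, min_len=1):
    """Yield (start, end, length) for each run of NUL bytes in data.

    end is exclusive, so length == end - start. Runs shorter than
    min_len are skipped.
    """
    start = None
    for i, byte in enumerate(data):
        if byte == 0:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                yield start, i, i - start
            start = None
    # A run that extends to the end of the data.
    if start is not None and len(data) - start >= min_len:
        yield start, len(data), len(data) - start


def scan(path):
    """Return report lines for one file, decompressing *.gz files first."""
    compressed = path.endswith(".gz")
    opener = gzip.open if compressed else open
    with opener(path, "rb") as f:
        data = f.read()
    flag = "d" if compressed else "n"
    return [f"{path}\t{flag}\t{s}\t{e}\t{n}" for s, e, n in zero_runs(data)]


if __name__ == "__main__":
    for name in sys.argv[1:]:
        for line in scan(name):
            print(line)
```

Invoked with a list of file names as arguments, it prints one line per zero run
in the format described above. Reading each file fully into memory is fine for
log files of this size; larger files would need a chunked scan.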

Xuân.




* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 18:45     ` Chris Mason
@ 2001-02-07 19:15       ` Ivan Pulleyn
  0 siblings, 0 replies; 45+ messages in thread
From: Ivan Pulleyn @ 2001-02-07 19:15 UTC (permalink / raw)
  To: Chris Mason; +Cc: Vedran Rodic, linux-kernel, reiserfs-list



On Wed, 7 Feb 2001, Chris Mason wrote:

> 
> 
> On Wednesday, February 07, 2001 07:41:25 PM +0100 Vedran Rodic
> <vedran@renata.irb.hr> wrote:
> 
> > 
> > So could some of these bugs also be present in the 3.5.x version of reiserfs?
> > Will you be fixing them for that version?
> > 
> 
> This list of reiserfs bugs was all specific to the 3.6.x versions, and they
> don't appear with the 3.5.x code.  You will probably have problems if you
> compile 3.5.x reiserfs with an unpatched redhat gcc 2.96, though.  

Apologies if I'm mis-understanding (I don't follow the list too
closely), but the zeros-in-log-files thing happens to me a lot on
3.5.X. Is there some sort of debugging info I could offer to help
figure it out?

Ivan...

---------------------------------------------------------------------------
			     Ivan Pulleyn
		      4942 N. Winchester Ave. #3
			  Chicago, IL 60640

			   ivan@torpid.com
			    (847) 980-1400
---------------------------------------------------------------------------


* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
       [not found]     ` <3A818619.7C3967BC@baldauf.org>
  2001-02-07 17:39       ` Chris Mason
@ 2001-02-07 21:47       ` Chris Wedgwood
  2001-02-07 21:55         ` Chris Mason
  1 sibling, 1 reply; 45+ messages in thread
From: Chris Wedgwood @ 2001-02-07 21:47 UTC (permalink / raw)
  To: Xuan Baldauf
  Cc: David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

On Wed, Feb 07, 2001 at 06:30:01PM +0100, Xuan Baldauf wrote:

    and so on. Maybe I should write a program which automatically
    detects and reports the zero blocks. I think the theory of tails
    unpacking does not work out, because there are also areas
    affected which are not between 2048 and 4096. Also the length of
    the zeroing can be greater than 2048.  However, I did not
    encounter a length of over 4096.

these appear on your system every couple of days right? if so... are
you able to run with the fs mounted with notail for a couple of days and
see if you still experience these?

my guess is you probably still will as most log files aren't
candidates for tail-packing (too large) but it will help eliminate
one more thing....


  --cw

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 21:47       ` Chris Wedgwood
@ 2001-02-07 21:55         ` Chris Mason
  2001-02-07 22:05           ` Xuan Baldauf
  0 siblings, 1 reply; 45+ messages in thread
From: Chris Mason @ 2001-02-07 21:55 UTC (permalink / raw)
  To: Chris Wedgwood, Xuan Baldauf
  Cc: David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com



On Thursday, February 08, 2001 10:47:29 AM +1300 Chris Wedgwood
<cw@f00f.org> wrote:

> these appear on your system every couple of days right? if so... are
> you able to run with the fs mounted with notail for a couple of days and
> see if you still experience these?
> 
> my guess is you probably still will as most log files aren't
> candidates for tail-packing (too large) but it will help eliminate
> one more thing....
> 

Yes, it really would.

1) mount -o notail
2) rm old_logfile
3) restart syslog

This will ensure the log files don't have tails at all.  Knowing for sure
the bug doesn't involve tails would remove much code from the search.

-chris


* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 21:55         ` Chris Mason
@ 2001-02-07 22:05           ` Xuan Baldauf
  2001-02-07 22:13             ` Chris Mason
  0 siblings, 1 reply; 45+ messages in thread
From: Xuan Baldauf @ 2001-02-07 22:05 UTC (permalink / raw)
  To: Chris Mason
  Cc: Chris Wedgwood, Xuan Baldauf, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com



Chris Mason wrote:

> On Thursday, February 08, 2001 10:47:29 AM +1300 Chris Wedgwood
> <cw@f00f.org> wrote:
>
> > these appear on your system every couple of days right? if so... are
> > you able to run with the fs mounted with notail for a couple of days and
> > see if you still experience these?
> >
> > my guess is you probably still will as most log files aren't
> > candidates for tail-packing (too large) but it will help eliminate
> > one more thing....
> >
>
> Yes, it really would.
>
> 1) mount -o notail
> 2) rm old_logfile
> 3) restart syslog
>
> This will ensure the log files don't have tails at all.  Knowing for sure
> the bug doesn't involve tails would remove much code from the search.
>
> -chris

Mhhh. It's a busy server from which I am about 700km away. I don't want to
restart it now. (Especially because it cannot boot from hard disk, only from
floppy disk, due to BIOS problems). But I'd be happy if the following is true:

(1) Enabling "-o notail" is possible at runtime, i.e. "mount / -o
remount,notail" works and
(2) notail is compatible with all the tails found on disk (so notail only
changes the way the disk is written, not the way the disk is read).

Is this true?

Xuân.



* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 22:05           ` Xuan Baldauf
@ 2001-02-07 22:13             ` Chris Mason
  0 siblings, 0 replies; 45+ messages in thread
From: Chris Mason @ 2001-02-07 22:13 UTC (permalink / raw)
  To: Xuan Baldauf
  Cc: Chris Wedgwood, David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com



On Wednesday, February 07, 2001 11:05:51 PM +0100 Xuan Baldauf
<xuan--reiserfs@baldauf.org> wrote:

> Mhhh. It's a busy server from which I am about 700km away. I don't like to
> restart it now. (Especially because it cannot boot from hard disk, only
> from floppy disk, due to BIOS problems). But I'd be happy if the following is
> true:
> 
> (1) Enabling "-o notail" is possible at runtime, i.e. "mount / -o
> remount,notail" works and

Nope.

> (2) notail is compatible with all the tails found on disk (so notail
> only changes the way the disk is written, not the way the disk is read).
> 

This part is true.

Honestly, I don't want to do this kind of debugging on a busy server.
Sure, it is completely safe, etc, etc, but ...

We'll get the info elsewhere, leave the busy servers out of it ;-)

-chris



* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-07 16:48     ` Chris Mason
@ 2001-02-08  6:34       ` Daniel Stone
  2001-02-10 13:02         ` Chris Wedgwood
  2001-02-10 14:47         ` Alan Cox
  0 siblings, 2 replies; 45+ messages in thread
From: Daniel Stone @ 2001-02-08  6:34 UTC (permalink / raw)
  To: Chris Mason
  Cc: David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

On 07 Feb 2001 11:48:16 -0500, Chris Mason wrote:
> 
> 
> On Wednesday, February 07, 2001 08:38:54 AM -0800 David Rees
> <dbr@spoke.nols.com> wrote:
> 
> > On Wed, Feb 07, 2001 at 10:47:09AM -0500, Chris Mason wrote:
> >> 
> >> Ok, how about we list the known bugs:
> >> 
> >> zeros in log files, apparently only between bytes 2048 and 4096 (not
> >> reproduced yet).
> > 
> > Could this bug be related to the reported corruption that people with
> > new VIA chipsets have been also reporting on ext2?  It seems similar
> > because of the location of the corruption:
> > 
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=98147483712620&w=2
> > 
> > Anyway, it can't hurt to ask the bug reported if they're using a
> > newer VIA chipset and see if they will upgrade their BIOS which seems
> > to fix the problem.
> 
> I'd love to blame this on VIA problems, but people are seeing it on other
> chipsets too ;-)  
> 
> People who report this aren't seeing general corruption, just zeros in
> files of specific sizes.  So, it really should be a reiserfs bug.

I run Reiser on all but /boot, and it seems to enjoy corrupting my
mbox'es randomly.
Using the old-style Reiser FS format, 2.4.2-pre1, Evolution, on a CMD640
chipset with the fixes enabled.
This also occurs in some log files, but I put it down to syslogd
crashing or something.

d

-- 
Daniel Stone
Linux Kernel Developer
daniel@kabuki.eyep.net

-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
G!>CS d s++:- a---- C++ ULS++++$>B P---- L+++>++++ E+(joe)>+++ W++ N->++ !o
K? w++(--) O---- M- V-- PS+++ PE- Y PGP>++ t--- 5-- X- R- tv-(!) b+++ DI+++ 
D+ G e->++ h!(+) r+(%) y? UF++
------END GEEK CODE BLOCK------




* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-08  6:34       ` Daniel Stone
@ 2001-02-10 13:02         ` Chris Wedgwood
  2001-02-10 13:05           ` Daniel Stone
  2001-02-11  6:58           ` Hans Reiser
  2001-02-10 14:47         ` Alan Cox
  1 sibling, 2 replies; 45+ messages in thread
From: Chris Wedgwood @ 2001-02-10 13:02 UTC (permalink / raw)
  To: Daniel Stone
  Cc: Chris Mason, David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

On Thu, Feb 08, 2001 at 05:34:44PM +1100, Daniel Stone wrote:

    I run Reiser on all but /boot, and it seems to enjoy corrupting my
    mbox'es randomly.

what kind of corruption are you seeing?

    This also occurs in some log files, but I put it down to syslogd
    crashing or something.

syslogd crashing shouldn't corrupt files... 



  --cw

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-10 13:02         ` Chris Wedgwood
@ 2001-02-10 13:05           ` Daniel Stone
  2001-02-10 13:08             ` Chris Wedgwood
  2001-02-11  7:00             ` Hans Reiser
  2001-02-11  6:58           ` Hans Reiser
  1 sibling, 2 replies; 45+ messages in thread
From: Daniel Stone @ 2001-02-10 13:05 UTC (permalink / raw)
  To: Chris Wedgwood
  Cc: Chris Mason, David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

On 11 Feb 2001 02:02:00 +1300, Chris Wedgwood wrote:
> On Thu, Feb 08, 2001 at 05:34:44PM +1100, Daniel Stone wrote:
> 
>     I run Reiser on all but /boot, and it seems to enjoy corrupting my
>     mbox'es randomly.
> 
> what kind of corruption are you seeing?

Zeroed bytes.

>     This also occurs in some log files, but I put it down to syslogd
>     crashing or something.
> 
> syslogd crashing shouldn't corrupt files... 

Actually, I meant to say my hard drive crashing.
I have two hard drives, side-by-side, and sometimes they overheat and
one of them powers down due to the excess heat.
They haven't done that lately, though, as I have a dedicated fan for
both of them, but the corruption persists.

-- 
Daniel Stone
Linux Kernel Developer
daniel@kabuki.eyep.net

-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
G!>CS d s++:- a---- C++ ULS++++$>B P---- L+++>++++ E+(joe)>+++ W++ N->++ !o
K? w++(--) O---- M- V-- PS+++ PE- Y PGP>++ t--- 5-- X- R- tv-(!) b+++ DI+++ 
D+ G e->++ h!(+) r+(%) y? UF++
------END GEEK CODE BLOCK------




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-10 13:05           ` Daniel Stone
@ 2001-02-10 13:08             ` Chris Wedgwood
  2001-02-11  7:00             ` Hans Reiser
  1 sibling, 0 replies; 45+ messages in thread
From: Chris Wedgwood @ 2001-02-10 13:08 UTC (permalink / raw)
  To: Daniel Stone
  Cc: Chris Mason, David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

On Sun, Feb 11, 2001 at 12:05:12AM +1100, Daniel Stone wrote:

    Actually, I meant to say my hard drive crashing.
    I have two hard drives, side-by-side, and sometimes they overheat and
    one of them powers down due to the excess heat.

OK then... if it weren't for the fact other people have reported
similar problems I would say all bets are off. mbox files get
corrupted when machines crash because of their (mis)design; might
this be the case for you here? Or do you see corruption without hard
drive crashes and OS crashes?

It's pretty much impossible to debug and test software when the
hardware is unreliable or unpredictable.


  --cw

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-08  6:34       ` Daniel Stone
  2001-02-10 13:02         ` Chris Wedgwood
@ 2001-02-10 14:47         ` Alan Cox
  2001-02-10 21:16           ` David Ford
  2001-02-11  8:50           ` Hans Reiser
  1 sibling, 2 replies; 45+ messages in thread
From: Alan Cox @ 2001-02-10 14:47 UTC (permalink / raw)
  To: Daniel Stone
  Cc: Chris Mason, David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

> I run Reiser on all but /boot, and it seems to enjoy corrupting my
> mbox'es randomly.
> Using the old-style Reiser FS format, 2.4.2-pre1, Evolution, on a CMD640
> chipset with the fixes enabled.
> This also occurs in some log files, but I put it down to syslogd
> crashing or something.

Before you put that down to reiserfs, can you check 2.4.2-pre2? It may be
a problem below the reiserfs layer.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-10 14:47         ` Alan Cox
@ 2001-02-10 21:16           ` David Ford
  2001-02-11  0:36             ` Andrius Adomaitis
                               ` (2 more replies)
  2001-02-11  8:50           ` Hans Reiser
  1 sibling, 3 replies; 45+ messages in thread
From: David Ford @ 2001-02-10 21:16 UTC (permalink / raw)
  To: Alan Cox
  Cc: Daniel Stone, Chris Mason, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com

Alan Cox wrote:

>> I run Reiser on all but /boot, and it seems to enjoy corrupting my
>> mbox'es randomly.
>> Using the old-style Reiser FS format, 2.4.2-pre1, Evolution, on a CMD640
>> chipset with the fixes enabled.
>> This also occurs in some log files, but I put it down to syslogd
>> crashing or something.
> 
> 
> Before you put that down to reiserfs, can you check 2.4.2-pre2? It may be
> a problem below the reiserfs layer.


Just as an aside, I've watched this conversation go on and on while I 
run reiserfs on several servers, workstations, and a notebook.  I have 
current kernels and have watched carefully for corruption.  I haven't 
seen any evidence of corruption on any of them including my notebook 
which has a bad battery and bad power connection so it tends to 
instantly die.

Alan, is there a particular trigger to this?

-d


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-10 21:16           ` David Ford
@ 2001-02-11  0:36             ` Andrius Adomaitis
  2001-02-11  8:29             ` Hans Reiser
  2001-02-11 10:53             ` Alan Cox
  2 siblings, 0 replies; 45+ messages in thread
From: Andrius Adomaitis @ 2001-02-11  0:36 UTC (permalink / raw)
  To: David Ford, linux-kernel

On Saturday 10 February 2001 22:16, David Ford wrote:

> Just as an aside, I've watched this conversation go on and on while I
> run reiserfs on several servers, workstations, and a notebook.  I
> have current kernels and have watched carefully for corruption.  I
> haven't seen any evidence of corruption on any of them including my
> notebook which has a bad battery and bad power connection so it tends
> to instantly die.
>
> Alan, is there a particular trigger to this?

Want to trigger this? Just install reiserfs on Dual SMP machine with 
huge RAID acting as mail server for 90k mailboxes. After several hours 
you'll get a lot of reiserfs_read_inode2/reiserfs_iget: bad_inode msgs in 
your kern.log... 

Good luck.

--
Andrius
charta@gaumina.lt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-10 13:02         ` Chris Wedgwood
  2001-02-10 13:05           ` Daniel Stone
@ 2001-02-11  6:58           ` Hans Reiser
  1 sibling, 0 replies; 45+ messages in thread
From: Hans Reiser @ 2001-02-11  6:58 UTC (permalink / raw)
  To: Chris Wedgwood
  Cc: Daniel Stone, Chris Mason, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com

Chris Wedgwood wrote:
> 
> On Thu, Feb 08, 2001 at 05:34:44PM +1100, Daniel Stone wrote:
> 
>     I run Reiser on all but /boot, and it seems to enjoy corrupting my
>     mbox'es randomly.
> 
> what kind of corruption are you seeing?
> 
>     This also occurs in some log files, but I put it down to syslogd
>     crashing or something.
> 
> syslogd crashing shouldn't corrupt files...
> 
>   --cw

There is a known bug in which nulls get added to log files.  We are having
trouble reproducing it on our machines.
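
For anyone trying to reproduce this, one way to check a log or mbox file is
to count runs of consecutive NUL bytes in it.  The following is only a
minimal sketch: the names nul_scan, nul_scan_feed, nul_scan_finish and
nul_scan_file are hypothetical, and the choice of minimum run length is left
to the caller, since the thread does not say how long the zeroed regions are.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical scanner state; none of these names come from the thread. */
struct nul_scan {
    size_t min_run;  /* shortest run of NUL bytes worth counting (>= 1) */
    size_t cur;      /* length of the run currently being read */
    size_t runs;     /* completed runs of at least min_run bytes */
};

/* Feed one buffer; a run that straddles two buffers is carried in cur. */
void nul_scan_feed(struct nul_scan *s, const unsigned char *buf, size_t len)
{
    assert(s != NULL && s->min_run > 0);
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0) {
            s->cur++;
        } else {
            if (s->cur >= s->min_run)
                s->runs++;
            s->cur = 0;
        }
    }
}

/* Close a run that reaches the end of the input and return the total. */
size_t nul_scan_finish(struct nul_scan *s)
{
    assert(s != NULL && s->min_run > 0);
    if (s->cur >= s->min_run)
        s->runs++;
    s->cur = 0;
    return s->runs;
}

/* Scan a whole file; returns the run count, or -1 if it cannot be read. */
long nul_scan_file(const char *path, size_t min_run)
{
    struct nul_scan s = { min_run, 0, 0 };
    unsigned char buf[4096];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        nul_scan_feed(&s, buf, n);
    if (ferror(f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return (long)nul_scan_finish(&s);
}
```

Running nul_scan_file() over a log file after each test pass, and comparing
the count with the previous pass, would show when new zeroed regions appear.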

There is an elevator bug in 2.4 which just got found/fixed.  We don't know how
many of our bug reports are due to it.

Hans

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-10 13:05           ` Daniel Stone
  2001-02-10 13:08             ` Chris Wedgwood
@ 2001-02-11  7:00             ` Hans Reiser
  2001-02-12  0:56               ` Chris Mason
  1 sibling, 1 reply; 45+ messages in thread
From: Hans Reiser @ 2001-02-11  7:00 UTC (permalink / raw)
  To: Daniel Stone
  Cc: Chris Wedgwood, Chris Mason, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com,
	Chris Mason, Alexander Zarochentcev

Daniel Stone wrote:
> 
> On 11 Feb 2001 02:02:00 +1300, Chris Wedgwood wrote:
> > On Thu, Feb 08, 2001 at 05:34:44PM +1100, Daniel Stone wrote:
> >
> >     I run Reiser on all but /boot, and it seems to enjoy corrupting my
> >     mbox'es randomly.
> >
> > what kind of corruption are you seeing?
> 
> Zeroed bytes.

This sounds like the same bug as the syslog bug, please try to help Chris
reproduce it.

zam, if Chris can't reproduce it by Monday, please give it a try.

Hans

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-10 21:16           ` David Ford
  2001-02-11  0:36             ` Andrius Adomaitis
@ 2001-02-11  8:29             ` Hans Reiser
       [not found]               ` <wvu261oa80.fsf@freeze.oslo.dnmi.no>
  2001-02-11 10:53             ` Alan Cox
  2 siblings, 1 reply; 45+ messages in thread
From: Hans Reiser @ 2001-02-11  8:29 UTC (permalink / raw)
  To: David Ford
  Cc: Alan Cox, Daniel Stone, Chris Mason, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com

David Ford wrote:
> 
> Alan Cox wrote:
> 
> >> I run Reiser on all but /boot, and it seems to enjoy corrupting my
> >> mbox'es randomly.
> >> Using the old-style Reiser FS format, 2.4.2-pre1, Evolution, on a CMD640
> >> chipset with the fixes enabled.
> >> This also occurs in some log files, but I put it down to syslogd
> >> crashing or something.
> >
> >
> > Before you put that down to reiserfs, can you check 2.4.2-pre2? It may be
> > a problem below the reiserfs layer.
> 
> Just as an aside, I've watched this conversation go on and on while I
> run reiserfs on several servers, workstations, and a notebook.  I have
> current kernels and have watched carefully for corruption.  I haven't
> seen any evidence of corruption on any of them including my notebook
> which has a bad battery and bad power connection so it tends to
> instantly die.
> 
> Alan, is there a particular trigger to this?
> 
> -d

Guys, instability is a relative word.  One of our users in Russia said that
reiserfs was as stable as a mountain, and he didn't understand my email.   We
have some number of users, I wish I really knew how many.  If you look at a few
hundred thousand mountains, you'll discover that a number of them are really
quite unstable.  We used to get a bug report a week, now we get one or two a
day.  Does this mean we went from a few hundred thousand mountains to a few
million?  I don't really know.....  

I can assure the users though that we have an extensive testing procedure, and
that our releases all pass testing that can roughly be described as hammering
the filesystem every different way we can think of (this is more limited than
the testing that comes from being put into the kernel by Linus) for twelve or
more hours.

What I do know is the following:  there was a recent elevator bug fix.  Our
filesystem is a journaling filesystem and it is extremely dependent on an
assumption that nothing is going to get written to disk before it should.  I
think fsck even makes assumptions about certain states relating to rename being
made atomic never reaching disk (and I think this is being fixed thanks to this
bug).  Could this cause the bug in which syslog gets zeros in it?  Don't know,
we haven't reproduced that bug yet though it "should" be straightforward to
reproduce.  We do have an NFS bug, which Nikita is still fixing.

What I can tell you is that in a few weeks we will have it back to a bug report
every week or two, and until we do version 4 of ReiserFS is going to be stalled 
(not so different from Linux 2.5 being stalled until 2.4 satisfies Linus).

Hans

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-10 14:47         ` Alan Cox
  2001-02-10 21:16           ` David Ford
@ 2001-02-11  8:50           ` Hans Reiser
  1 sibling, 0 replies; 45+ messages in thread
From: Hans Reiser @ 2001-02-11  8:50 UTC (permalink / raw)
  To: Alan Cox
  Cc: Daniel Stone, Chris Mason, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com

Alan Cox wrote:

> Before you put that down to reiserfs, can you check 2.4.2-pre2? It may be
> a problem below the reiserfs layer.

I forgot, this bug exists on reiserfs for Linux 2.2.*, so it isn't going to be
fixed by 2.4.2 (assuming that the bug is not in 2.2.*).

Hans

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
       [not found]               ` <wvu261oa80.fsf@freeze.oslo.dnmi.no>
@ 2001-02-11  8:59                 ` Hans Reiser
  2001-02-11  9:52                   ` Adrian Phillips
  0 siblings, 1 reply; 45+ messages in thread
From: Hans Reiser @ 2001-02-11  8:59 UTC (permalink / raw)
  To: Adrian Phillips; +Cc: linux-kernel@vger.kernel.org, reiserfs-list@namesys.com

Adrian Phillips wrote:
> 
> Does your test procedure include other systems, for example reiserfs
> plus NFS ?

Our NFS testing is simply inadequate, we need a copy of LADDIS but haven't found
the money for it yet.

Hans

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-11  9:52                   ` Adrian Phillips
@ 2001-02-11  9:47                     ` Hans Reiser
  2001-02-11 17:10                       ` Alan Cox
  0 siblings, 1 reply; 45+ messages in thread
From: Hans Reiser @ 2001-02-11  9:47 UTC (permalink / raw)
  To: Adrian Phillips; +Cc: linux-kernel@vger.kernel.org, reiserfs-list@namesys.com

Adrian Phillips wrote:
> 
> >>>>> "Hans" == Hans Reiser <reiser@namesys.com> writes:
> 
>     Hans> Adrian Phillips wrote:
>     >>  Does your test procedure include other systems, for example
>     >> reiserfs plus NFS ?
> 
>     Hans> Our NFS testing is simply inadequate, we need a copy of
>     Hans> LADDIS but haven't found the money for it yet.
> 
> Excuse my ignorance, but what is LADDIS ?
> 
> Sincerely,
> 
> Adrian Phillips
> 
> --
> Your mouse has moved.
> Windows NT must be restarted for the change to take effect.
> Reboot now?  [OK]

LADDIS is the industry standard benchmark for NFS.  It crashes for ReiserFS and
NFS.  We can't afford to buy it, as it is proprietary software.  Once Nikita has
finished testing his changes, we will ask someone to test it for us though.

Hans

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-11  8:59                 ` Hans Reiser
@ 2001-02-11  9:52                   ` Adrian Phillips
  2001-02-11  9:47                     ` Hans Reiser
  0 siblings, 1 reply; 45+ messages in thread
From: Adrian Phillips @ 2001-02-11  9:52 UTC (permalink / raw)
  To: Hans Reiser; +Cc: linux-kernel@vger.kernel.org, reiserfs-list@namesys.com

>>>>> "Hans" == Hans Reiser <reiser@namesys.com> writes:

    Hans> Adrian Phillips wrote:
    >>  Does your test procedure include other systems, for example
    >> reiserfs plus NFS ?

    Hans> Our NFS testing is simply inadequate, we need a copy of
    Hans> LADDIS but haven't found the money for it yet.

Excuse my ignorance, but what is LADDIS ?

Sincerely,

Adrian Phillips

-- 
Your mouse has moved.
Windows NT must be restarted for the change to take effect.
Reboot now?  [OK]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-10 21:16           ` David Ford
  2001-02-11  0:36             ` Andrius Adomaitis
  2001-02-11  8:29             ` Hans Reiser
@ 2001-02-11 10:53             ` Alan Cox
  2 siblings, 0 replies; 45+ messages in thread
From: Alan Cox @ 2001-02-11 10:53 UTC (permalink / raw)
  To: David Ford
  Cc: Alan Cox, Daniel Stone, Chris Mason, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com

> run reiserfs on several servers, workstations, and a notebook.  I have 
> current kernels and have watched carefully for corruption.  I haven't 
> seen any evidence of corruption on any of them including my notebook 
> which has a bad battery and bad power connection so it tends to 
> instantly die.
> 
> Alan, is there a particular trigger to this?

The 2.4.1 stuff is a specific low level block I/O pattern. It's fixed in
2.4.2pre2/2.4.1ac-something


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-11  9:47                     ` Hans Reiser
@ 2001-02-11 17:10                       ` Alan Cox
  2001-02-11 19:56                         ` Andi Kleen
  2001-02-11 21:16                         ` Hans Reiser
  0 siblings, 2 replies; 45+ messages in thread
From: Alan Cox @ 2001-02-11 17:10 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Adrian Phillips, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

> LADDIS is the industry standard benchmark for NFS.  It crashes for ReiserFS and
> NFS.  We can't afford to buy it, as it is proprietary software.  Once Nikita has
> finished testing his changes, we will ask someone to test it for us though.
> 

Do you know if the connectathon test suites show the problem?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-11 17:10                       ` Alan Cox
@ 2001-02-11 19:56                         ` Andi Kleen
  2001-02-12  2:17                           ` Rogerio Brito
  2001-02-12 13:39                           ` Henning P. Schmiedehausen
  2001-02-11 21:16                         ` Hans Reiser
  1 sibling, 2 replies; 45+ messages in thread
From: Andi Kleen @ 2001-02-11 19:56 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> > LADDIS is the industry standard benchmark for NFS.  It crashes for ReiserFS and
> > NFS.  We can't afford to buy it, as it is proprietary software.  Once Nikita has
> > finished testing his changes, we will ask someone to test it for us though.
> > 
> 
> Do you know if the connectathon test suites show the problem?

The reiserfs nfs problem in standard 2.4 is very simple -- it'll barf as soon 
as you run out of file handle/inode cache. Any workload that accesses
enough files in parallel can trigger it.

Fixes do exist, but require bigger changes in nfsd.  Basically you need to
hand out a 64-bit inode in the nfs filehandle, and pass the upper 32 bits
to the low level file system for efficient lookup (actually it is all not
too difficult to implement, it just requires very uncodefreezefriendly changes
to nfsd) 
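
The idea of carrying a 64-bit identifier in the file handle can be sketched
roughly as follows.  This is an illustration under stated assumptions, not the
actual nfsd or reiserfs change: struct fh_ident, fh_encode and fh_decode are
hypothetical names, the 8-byte big-endian layout is an arbitrary choice, and
what the upper 32 bits would contain (for reiserfs, presumably something like
the directory id from the object key) is a guess the message does not confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 8 bytes of the opaque file handle hold a 64-bit id. */
struct fh_ident {
    uint32_t hint;  /* upper 32 bits: lookup hint for the low-level fs */
    uint32_t ino;   /* lower 32 bits: the ordinary 32-bit inode number */
};

/* Store the 64-bit id big-endian so the handle bytes are host-independent. */
void fh_encode(unsigned char out[8], struct fh_ident id)
{
    uint64_t v = ((uint64_t)id.hint << 32) | id.ino;

    assert(out != NULL);
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Recover both halves when the client presents the handle again. */
struct fh_ident fh_decode(const unsigned char in[8])
{
    uint64_t v = 0;
    struct fh_ident id;

    assert(in != NULL);
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    id.hint = (uint32_t)(v >> 32);
    id.ino = (uint32_t)(v & 0xffffffffu);
    return id;
}
```

With the hint available on decode, the server would not need the inode to
still be in the cache in order to find the object again, which is the failure
mode described above.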


-Andi


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-11 17:10                       ` Alan Cox
  2001-02-11 19:56                         ` Andi Kleen
@ 2001-02-11 21:16                         ` Hans Reiser
  2001-02-12  9:36                           ` Alan Cox
  1 sibling, 1 reply; 45+ messages in thread
From: Hans Reiser @ 2001-02-11 21:16 UTC (permalink / raw)
  To: Alan Cox
  Cc: Adrian Phillips, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

Alan Cox wrote:
> 
> > LADDIS is the industry standard benchmark for NFS.  It crashes for ReiserFS and
> > NFS.  We can't afford to buy it, as it is proprietary software.  Once Nikita has
> > finished testing his changes, we will ask someone to test it for us though.
> >
> 
> Do you know if the connectathon test suites show the problem?

Not the slightest idea.  Is the connectathon test suite something that stresses
the FS heavily?  If so, we can always add it to our stable, whether or not it
stresses this particular bug.

Hans

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-11  7:00             ` Hans Reiser
@ 2001-02-12  0:56               ` Chris Mason
  2001-02-12 19:11                 ` Marcelo Tosatti
  0 siblings, 1 reply; 45+ messages in thread
From: Chris Mason @ 2001-02-12  0:56 UTC (permalink / raw)
  To: Hans Reiser, Daniel Stone
  Cc: Chris Wedgwood, David Rees, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com, Alexander Zarochentcev



On Sunday, February 11, 2001 10:00:11 AM +0300 Hans Reiser
<reiser@namesys.com> wrote:

> Daniel Stone wrote:
>> 
>> On 11 Feb 2001 02:02:00 +1300, Chris Wedgwood wrote:
>> > On Thu, Feb 08, 2001 at 05:34:44PM +1100, Daniel Stone wrote:
>> > 
>> >     I run Reiser on all but /boot, and it seems to enjoy corrupting my
>> >     mbox'es randomly.
>> > 
>> > what kind of corruption are you seeing?
>> 
>> Zeroed bytes.
> 
> This sounds like the same bug as the syslog bug, please try to help Chris
> reproduce it.
> 
> zam, if Chris can't reproduce it by Monday, please give it a try.
> 

I had a bunch of scripts running over the weekend to try and reproduce
this, but the results were ruined when a major storm killed the power (no,
still haven't gotten around to configuring my UPS to shut things down ;-).

So, I'll try again.

-chris




^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-11 19:56                         ` Andi Kleen
@ 2001-02-12  2:17                           ` Rogerio Brito
  2001-02-12  9:49                             ` Andi Kleen
  2001-02-12 13:39                           ` Henning P. Schmiedehausen
  1 sibling, 1 reply; 45+ messages in thread
From: Rogerio Brito @ 2001-02-12  2:17 UTC (permalink / raw)
  To: linux-kernel

On Feb 11 2001, Andi Kleen wrote:
> The reiserfs nfs problem in standard 2.4 is very simple -- it'll
> barf as soon as you run out of file handle/inode cache. Any workload
> that accesses enough files in parallel can trigger it.

	I'm just trying to evaluate if I should use reiserfs here or
	not: is this phenomenon that you describe above happening
	independently of whether I choose the knfsd or userspace nfsd?

	From your message, I got the impression that it would happen
	with knfsd only, but I'm just checking before I make a wrong
	decision.


	Thanks from a humble (and ignorant) network admin, Roger...

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
  Rogerio Brito - rbrito@iname.com - http://www.ime.usp.br/~rbrito/
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-11 21:16                         ` Hans Reiser
@ 2001-02-12  9:36                           ` Alan Cox
  0 siblings, 0 replies; 45+ messages in thread
From: Alan Cox @ 2001-02-12  9:36 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Alan Cox, Adrian Phillips, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com

> Not the slightest idea.  Is the connectathon test suite something that stresses
> the FS heavily?  If so, we can always add it to our stable, whether or not it
> stresses this particular bug.

It certainly has been stressing the NFS side of things enough to show up a lot
of problems so maybe


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-12  2:17                           ` Rogerio Brito
@ 2001-02-12  9:49                             ` Andi Kleen
  0 siblings, 0 replies; 45+ messages in thread
From: Andi Kleen @ 2001-02-12  9:49 UTC (permalink / raw)
  To: Rogerio Brito; +Cc: linux-kernel

Rogerio Brito <rbrito@iname.com> writes:

> On Feb 11 2001, Andi Kleen wrote:
> > The reiserfs nfs problem in standard 2.4 is very simple -- it'll
> > barf as soon as you run out of file handle/inode cache. Any workload
> > that accesses enough files in parallel can trigger it.
> 
> 	I'm just trying to evaluate if I should use reiserfs here or
> 	not: is this phenomenon that you describe above happening
> 	independently of whether I choose the knfsd or userspace nfsd?

This should all be covered extensively in the reiserfs FAQ and list archives,
but here it is one last time:

It only applies to knfsd, but unfsd unfortunately has different problems
with reiserfs. It makes assumptions about the inode space of the underlying
filesystem, assuming that it can encode a dev_t in the upper bits. Reiserfs,
unlike ext2, periodically cycles through the full 31 bits of inode values, and
after some weeks on a busy file system unfsd starts to complain about 
conflicts. There is a patch at ftp.suse.com:/pub/people/ak/nfs/unfsd*
that works around the problem when you specify --no-cross-mounts (but 
then you can no longer export trees of multiple file systems with a single
mount).
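
The conflict can be illustrated with a toy encoding.  This is a sketch, not
unfsd's real scheme: the 8/24 bit split, the names pseudo_ino and ino_fits,
and the device tag values are all assumptions made for the example.  The point
is only that once the filesystem hands out inode numbers that use the bits
reserved for the device tag, two different files can map to the same
pseudo-inode.

```c
#include <assert.h>
#include <stdint.h>

#define DEV_BITS 8u                      /* hypothetical split, not unfsd's */
#define INO_BITS (32u - DEV_BITS)
#define INO_MASK ((UINT32_C(1) << INO_BITS) - 1u)

/* Pack a small device tag above the low bits of the real inode number. */
uint32_t pseudo_ino(uint32_t dev_tag, uint32_t ino)
{
    assert(dev_tag < (UINT32_C(1) << DEV_BITS));
    return (dev_tag << INO_BITS) | (ino & INO_MASK);
}

/* True if the inode number leaves the device-tag bits untouched. */
int ino_fits(uint32_t ino)
{
    return (ino & ~INO_MASK) == 0;
}
```

A filesystem that only ever uses small inode numbers passes ino_fits() for
every file; one that cycles through all 31 bits eventually does not, and
pseudo_ino() then returns the same value for two distinct files on the same
device.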

Please also note that the patch adds a rather obscure bug of its own, which
triggers very seldom (a fix partly exists, but is not really tested yet).

Another alternative is to use knfsd with Chris Mason's 2.4 knfsd patches.


-Andi


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-11 19:56                         ` Andi Kleen
  2001-02-12  2:17                           ` Rogerio Brito
@ 2001-02-12 13:39                           ` Henning P. Schmiedehausen
  1 sibling, 0 replies; 45+ messages in thread
From: Henning P. Schmiedehausen @ 2001-02-12 13:39 UTC (permalink / raw)
  To: linux-kernel

ak@suse.de (Andi Kleen) writes:

>to the low level file system for efficient lookup (actually it is all not 
>too difficult to implement, just requires very uncodefreezefriendly changes
>to nfsd) 

Well, I at least would really prefer a change for 2.4.x, the sooner the
better, as I never ever want to repeat the NFS nightmare from 2.2. I
will take a working NFS on Reiser over a non-working but code-frozen
one any time. ;-)

	Regards
		Henning
-- 
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen       -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH     hps@intermeta.de

Am Schwabachgrund 22  Fon.: 09131 / 50654-0   info@intermeta.de
D-91054 Buckenhof     Fax.: 09131 / 50654-20   
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://vger.kernel.org/lkml/

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-12  0:56               ` Chris Mason
@ 2001-02-12 19:11                 ` Marcelo Tosatti
  2001-02-12 20:42                   ` Hans Reiser
  0 siblings, 1 reply; 45+ messages in thread
From: Marcelo Tosatti @ 2001-02-12 19:11 UTC (permalink / raw)
  To: Chris Mason
  Cc: Hans Reiser, Daniel Stone, Chris Wedgwood, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com,
	Alexander Zarochentcev


On Sun, 11 Feb 2001, Chris Mason wrote:

> 
> 
> On Sunday, February 11, 2001 10:00:11 AM +0300 Hans Reiser
> <reiser@namesys.com> wrote:
> 
> > Daniel Stone wrote:
> >> 
> >> On 11 Feb 2001 02:02:00 +1300, Chris Wedgwood wrote:
> >> > On Thu, Feb 08, 2001 at 05:34:44PM +1100, Daniel Stone wrote:
> >> > 
> >> >     I run Reiser on all but /boot, and it seems to enjoy corrupting my
> >> >     mbox'es randomly.
> >> > 
> >> > what kind of corruption are you seeing?
> >> 
> >> Zeroed bytes.
> > 
> > This sounds like the same bug as the syslog bug, please try to help Chris
> > reproduce it.
> > 
> > zam, if Chris can't reproduce it by Monday, please give it a try.
> > 
> 
> I had a bunch of scripts running over the weekend to try and reproduce
> this, but the results were ruined when a major storm killed the power (no,
> still haven't gotten around to configuring my UPS to shut things down ;-).
> 
> So, I'll try again.

Chris,

Do you know if the people reporting the corruption with reiserfs on
2.4 were using IDE drives with PIO mode and IDE multicount turned on?

If so, it may be caused by the problem fixed by Russell King on
2.4.2-pre2. 

Without his fix, I was able to corrupt ext2 while using PIO+multicount
very very easily.





^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-12 20:42                   ` Hans Reiser
@ 2001-02-12 19:33                     ` Marcelo Tosatti
  2001-02-12 21:01                       ` Hans Reiser
  2001-02-12 23:03                     ` Chris Mason
  1 sibling, 1 reply; 45+ messages in thread
From: Marcelo Tosatti @ 2001-02-12 19:33 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Chris Mason, Daniel Stone, Chris Wedgwood, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com,
	Alexander Zarochentcev


On Mon, 12 Feb 2001, Hans Reiser wrote:

> Marcelo Tosatti wrote:
> > 
> > On Sun, 11 Feb 2001, Chris Mason wrote:
> > 
> > >
> > >
> > > On Sunday, February 11, 2001 10:00:11 AM +0300 Hans Reiser
> > > <reiser@namesys.com> wrote:
> > >
> > > > Daniel Stone wrote:
> > > >>
> > > >> On 11 Feb 2001 02:02:00 +1300, Chris Wedgwood wrote:
> > > >> > On Thu, Feb 08, 2001 at 05:34:44PM +1100, Daniel Stone wrote:
> > > >> >
> > > >> >     I run Reiser on all but /boot, and it seems to enjoy corrupting my
> > > >> >     mbox'es randomly.
> > > >> >
> > > >> > what kind of corruption are you seeing?
> > > >>
> > > >> Zeroed bytes.
> > > >
> > > > This sounds like the same bug as the syslog bug, please try to help Chris
> > > > reproduce it.
> > > >
> > > > zam, if Chris can't reproduce it by Monday, please give it a try.
> > > >
> > >
> > > I had a bunch of scripts running over the weekend to try and reproduce
> > > this, but the results were ruined when a major storm killed the power (no,
> > > still haven't gotten around to configuring my UPS to shut things down ;-).
> > >
> > > So, I'll try again.
> > 
> > Chris,
> > 
> > Do you know if the people reporting the corruption with reiserfs on
> > 2.4 were using IDE drives with PIO mode and IDE multicount turned on?
> > 
> > If so, it may be caused by the problem fixed by Russell King on
> > 2.4.2-pre2.
> > 
> > Without his fix, I was able to corrupt ext2 while using PIO+multicount
> > very very easily.
> 
> Was the bug you describe also present in the 2.2.* series?  If not, then the
> bugs are not the same.

No.


* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-12 19:11                 ` Marcelo Tosatti
@ 2001-02-12 20:42                   ` Hans Reiser
  2001-02-12 19:33                     ` Marcelo Tosatti
  2001-02-12 23:03                     ` Chris Mason
  0 siblings, 2 replies; 45+ messages in thread
From: Hans Reiser @ 2001-02-12 20:42 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Chris Mason, Daniel Stone, Chris Wedgwood, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com,
	Alexander Zarochentcev

Marcelo Tosatti wrote:
> 
> On Sun, 11 Feb 2001, Chris Mason wrote:
> 
> >
> >
> > On Sunday, February 11, 2001 10:00:11 AM +0300 Hans Reiser
> > <reiser@namesys.com> wrote:
> >
> > > Daniel Stone wrote:
> > >>
> > >> On 11 Feb 2001 02:02:00 +1300, Chris Wedgwood wrote:
> > >> > On Thu, Feb 08, 2001 at 05:34:44PM +1100, Daniel Stone wrote:
> > >> >
> > >> >     I run Reiser on all but /boot, and it seems to enjoy corrupting my
> > >> >     mbox'es randomly.
> > >> >
> > >> > what kind of corruption are you seeing?
> > >>
> > >> Zeroed bytes.
> > >
> > > This sounds like the same bug as the syslog bug, please try to help Chris
> > > reproduce it.
> > >
> > > zam, if Chris can't reproduce it by Monday, please give it a try.
> > >
> >
> > I had a bunch of scripts running over the weekend to try and reproduce
> > this, but the results were ruined when a major storm killed the power (no,
> > still haven't gotten around to configuring my UPS to shut things down ;-).
> >
> > So, I'll try again.
> 
> Chris,
> 
> Do you know if the people reporting the corruption with reiserfs on
> 2.4 were using IDE drives with PIO mode and IDE multicount turned on?
> 
> If so, it may be caused by the problem fixed by Russell King on
> 2.4.2-pre2.
> 
> Without his fix, I was able to corrupt ext2 while using PIO+multicount
> very very easily.

Was the bug you describe also present in the 2.2.* series?  If not, then the
bugs are not the same.

Hans

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-12 19:33                     ` Marcelo Tosatti
@ 2001-02-12 21:01                       ` Hans Reiser
  0 siblings, 0 replies; 45+ messages in thread
From: Hans Reiser @ 2001-02-12 21:01 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Chris Mason, Daniel Stone, Chris Wedgwood, David Rees,
	linux-kernel@vger.kernel.org, reiserfs-list@namesys.com,
	Alexander Zarochentcev

Marcelo Tosatti wrote:
> 
> On Mon, 12 Feb 2001, Hans Reiser wrote:
> 
> > Marcelo Tosatti wrote:
> > >
> > > On Sun, 11 Feb 2001, Chris Mason wrote:
> > >
> > > >
> > > >
> > > > On Sunday, February 11, 2001 10:00:11 AM +0300 Hans Reiser
> > > > <reiser@namesys.com> wrote:
> > > >
> > > > > Daniel Stone wrote:
> > > > >>
> > > > >> On 11 Feb 2001 02:02:00 +1300, Chris Wedgwood wrote:
> > > > >> > On Thu, Feb 08, 2001 at 05:34:44PM +1100, Daniel Stone wrote:
> > > > >> >
> > > > >> >     I run Reiser on all but /boot, and it seems to enjoy corrupting my
> > > > >> >     mbox'es randomly.
> > > > >> >
> > > > >> > what kind of corruption are you seeing?
> > > > >>
> > > > >> Zeroed bytes.
> > > > >
> > > > > This sounds like the same bug as the syslog bug, please try to help Chris
> > > > > reproduce it.
> > > > >
> > > > > zam, if Chris can't reproduce it by Monday, please give it a try.
> > > > >
> > > >
> > > > I had a bunch of scripts running over the weekend to try and reproduce
> > > > this, but the results were ruined when a major storm killed the power (no,
> > > > still haven't gotten around to configuring my UPS to shut things down ;-).
> > > >
> > > > So, I'll try again.
> > >
> > > Chris,
> > >
> > > Do you know if the people reporting the corruption with reiserfs on
> > > 2.4 were using IDE drives with PIO mode and IDE multicount turned on?
> > >
> > > If so, it may be caused by the problem fixed by Russell King on
> > > 2.4.2-pre2.
> > >
> > > Without his fix, I was able to corrupt ext2 while using PIO+multicount
> > > very very easily.
> >
> > Was the bug you describe also present in the 2.2.* series?  If not, then the
> > bugs are not the same.
> 
> > No.

Zam will try to reproduce it tomorrow, he successfully escaped me today and got
to write fun code (a simpler block allocator) instead.

Hans

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-12 23:03                     ` Chris Mason
@ 2001-02-12 22:39                       ` Hans Reiser
  2001-02-13  0:18                         ` Chris Mason
  2001-02-12 22:44                       ` Hans Reiser
  1 sibling, 1 reply; 45+ messages in thread
From: Hans Reiser @ 2001-02-12 22:39 UTC (permalink / raw)
  To: Chris Mason
  Cc: Marcelo Tosatti, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com, Alexander Zarochentcev

Chris Mason wrote:
> 
> On Monday, February 12, 2001 11:42:38 PM +0300 Hans Reiser
> <reiser@namesys.com> wrote:
> 
> >> Chris,
> >>
> >> Do you know if the people reporting the corruption with reiserfs on
> >> 2.4 were using IDE drives with PIO mode and IDE multicount turned on?
> >>
> >> If so, it may be caused by the problem fixed by Russell King on
> >> 2.4.2-pre2.
> >>
> >> Without his fix, I was able to corrupt ext2 while using PIO+multicount
> >> very very easily.
> >
> 
> I suspect the bugfixes in pre2 will fix some of the more exotic corruption
> reports we've seen, but this one (nulls in log files) probably isn't caused
> by a random (or semi-random) lower layer corruption.  These users are not
> seeing random metadata corruption, so I suspect this bug is different (and
> reiserfs specific).
> 
> > Was the bug you describe also present in the 2.2.* series?  If not, then
> > the bugs are not the same.
> >
> 
> In 2.2 code the only data file corruption I know of is caused by a crash....
> 
> -chris

Chris, your quoting is very confusing above..... but I get your very interesting
remark (thanks for noticing) that the nulls are specific to crashes on 2.2, and
therefore could be due to the elevator bug on 2.4.  It even makes rough sense
that the elevator bug (said to occasionally cause a premature write of the wrong
buffer) could cause an effect similar to a crash.  I hope it is true, let's ask
all users to upgrade to pre2 (a good idea anyway) and see if it cures.

zam is perhaps very clever for deferring working on this bug......:-)

Hans

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-12 23:03                     ` Chris Mason
  2001-02-12 22:39                       ` Hans Reiser
@ 2001-02-12 22:44                       ` Hans Reiser
  1 sibling, 0 replies; 45+ messages in thread
From: Hans Reiser @ 2001-02-12 22:44 UTC (permalink / raw)
  To: Chris Mason
  Cc: Marcelo Tosatti, linux-kernel@vger.kernel.org,
	reiserfs-list@namesys.com, Alexander Zarochentcev

Chris Mason wrote:
> 
> On Monday, February 12, 2001 11:42:38 PM +0300 Hans Reiser
> <reiser@namesys.com> wrote:
> 
> >> Chris,
> >>
> >> Do you know if the people reporting the corruption with reiserfs on
> >> 2.4 were using IDE drives with PIO mode and IDE multicount turned on?
> >>
> >> If so, it may be caused by the problem fixed by Russell King on
> >> 2.4.2-pre2.
> >>
> >> Without his fix, I was able to corrupt ext2 while using PIO+multicount
> >> very very easily.
> >
> 
> I suspect the bugfixes in pre2 will fix some of the more exotic corruption
> reports we've seen, but this one (nulls in log files) probably isn't caused
> by a random (or semi-random) lower layer corruption.  These users are not
> seeing random metadata corruption, so I suspect this bug is different (and
> reiserfs specific).
> 
> > Was the bug you describe also present in the 2.2.* series?  If not, then
> > the bugs are not the same.
> >
> 
> In 2.2 code the only data file corruption I know of is caused by a crash....
> 
> -chris

I'd like to announce on our website and mailing list that all XXX users should
upgrade to 2.4.2pre2.  Do you all agree with this?

What is the exact definition of XXX?

Hans

* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-12 20:42                   ` Hans Reiser
  2001-02-12 19:33                     ` Marcelo Tosatti
@ 2001-02-12 23:03                     ` Chris Mason
  2001-02-12 22:39                       ` Hans Reiser
  2001-02-12 22:44                       ` Hans Reiser
  1 sibling, 2 replies; 45+ messages in thread
From: Chris Mason @ 2001-02-12 23:03 UTC (permalink / raw)
  To: Hans Reiser, Marcelo Tosatti
  Cc: linux-kernel@vger.kernel.org, reiserfs-list@namesys.com,
	Alexander Zarochentcev



On Monday, February 12, 2001 11:42:38 PM +0300 Hans Reiser
<reiser@namesys.com> wrote:

>> Chris,
>> 
>> Do you know if the people reporting the corruption with reiserfs on
>> 2.4 were using IDE drives with PIO mode and IDE multicount turned on?
>> 
>> If so, it may be caused by the problem fixed by Russell King on
>> 2.4.2-pre2.
>> 
>> Without his fix, I was able to corrupt ext2 while using PIO+multicount
>> very very easily.
> 

I suspect the bugfixes in pre2 will fix some of the more exotic corruption
reports we've seen, but this one (nulls in log files) probably isn't caused
by a random (or semi-random) lower layer corruption.  These users are not
seeing random metadata corruption, so I suspect this bug is different (and
reiserfs specific).

> Was the bug you describe also present in the 2.2.* series?  If not, then
> the bugs are not the same.
> 

In 2.2 code the only data file corruption I know of is caused by a crash....

-chris


* Re: [reiserfs-list] Re: Apparent instability of reiserfs on 2.4.1
  2001-02-12 22:39                       ` Hans Reiser
@ 2001-02-13  0:18                         ` Chris Mason
  0 siblings, 0 replies; 45+ messages in thread
From: Chris Mason @ 2001-02-13  0:18 UTC (permalink / raw)
  To: Hans Reiser
  Cc: linux-kernel@vger.kernel.org, reiserfs-list@namesys.com,
	Alexander Zarochentcev



On Tuesday, February 13, 2001 01:39:02 AM +0300 Hans Reiser
<reiser@namesys.com> wrote:
> Chris, your quoting is very confusing above..... but I get your very
> interesting remark (thanks for noticing) that the nulls are specific to
> crashes on 2.2, and therefore could be due to the elevator bug on 2.4.  It
> even makes rough sense that the elevator bug (said to occasionally cause
> a premature write of the wrong buffer) could cause an effect similar to a
> crash.  I hope it is true, let's ask all users to upgrade to pre2 (a good
> idea anyway) and see if it cures.
> 

Ok, I'll try again ;-)  People have been seeing null bytes in data files on
reiserfs.  They see this without seeing any other corruption of any kind,
and they only see it on files of very specific sizes.  They see this
without crashing, and without hard drive suspend kicking in.  They see it
on scsi and ide, on servers and laptops.
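
For anyone who wants to check their own files for this symptom, here is a
minimal sketch of a scanner.  It assumes a run of 16 or more consecutive NUL
bytes is worth flagging; that threshold, the name find_nul_runs, and the
output format are illustrative choices and do not come from any of the
reports.  It prints the file size next to each hit so that any clustering in
a particular size range is easy to spot.

```python
import os
import sys


def find_nul_runs(path, min_run=16, chunk_size=1 << 16):
    """Return (offset, length) for each run of NUL bytes at least min_run long.

    min_run and chunk_size are arbitrary illustrative defaults.
    """
    runs = []
    run_start = None  # file offset where the current NUL run began, if any
    offset = 0        # file offset of the start of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for i, byte in enumerate(chunk):
                if byte == 0:
                    if run_start is None:
                        run_start = offset + i
                elif run_start is not None:
                    length = offset + i - run_start
                    if length >= min_run:
                        runs.append((run_start, length))
                    run_start = None
            offset += len(chunk)
    # A run that extends to end of file is still open at this point.
    if run_start is not None and offset - run_start >= min_run:
        runs.append((run_start, offset - run_start))
    return runs


if __name__ == "__main__":
    for path in sys.argv[1:]:
        size = os.path.getsize(path)
        for start, length in find_nul_runs(path):
            print(f"{path}: size={size} nul_run@{start} len={length}")
```

Run it with one or more file names as arguments, for example the affected
mbox or log files.  Files that legitimately contain long runs of zeros (binary
files, sparse files) will also be flagged, so it is only meaningful on files
that are expected to be plain text.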

Elevator bugs and general driver bugs could certainly cause nulls in data
files.  But they would also cause other corruptions and probably would not
be selective enough to pick files that happen to have the same range in
size that reiserfs packs tails on.

In other words, updating to 2.4.2pre2 or your favorite ac series kernel is
probably a good plan.  It won't fix this bug ;-)

Perhaps I haven't seen it yet because I've also been testing code that does
direct->indirect conversions slightly differently, I'll try again on a pure
kernel.

-chris


end of thread, other threads:[~2001-02-13  0:18 UTC | newest]

Thread overview: 45+ messages
2001-02-07 12:06 Apparent instability of reiserfs on 2.4.1 Hans Reiser
2001-02-07 15:47 ` Chris Mason
2001-02-07 16:38   ` [reiserfs-list] " David Rees
2001-02-07 16:48     ` Chris Mason
2001-02-08  6:34       ` Daniel Stone
2001-02-10 13:02         ` Chris Wedgwood
2001-02-10 13:05           ` Daniel Stone
2001-02-10 13:08             ` Chris Wedgwood
2001-02-11  7:00             ` Hans Reiser
2001-02-12  0:56               ` Chris Mason
2001-02-12 19:11                 ` Marcelo Tosatti
2001-02-12 20:42                   ` Hans Reiser
2001-02-12 19:33                     ` Marcelo Tosatti
2001-02-12 21:01                       ` Hans Reiser
2001-02-12 23:03                     ` Chris Mason
2001-02-12 22:39                       ` Hans Reiser
2001-02-13  0:18                         ` Chris Mason
2001-02-12 22:44                       ` Hans Reiser
2001-02-11  6:58           ` Hans Reiser
2001-02-10 14:47         ` Alan Cox
2001-02-10 21:16           ` David Ford
2001-02-11  0:36             ` Andrius Adomaitis
2001-02-11  8:29             ` Hans Reiser
     [not found]               ` <wvu261oa80.fsf@freeze.oslo.dnmi.no>
2001-02-11  8:59                 ` Hans Reiser
2001-02-11  9:52                   ` Adrian Phillips
2001-02-11  9:47                     ` Hans Reiser
2001-02-11 17:10                       ` Alan Cox
2001-02-11 19:56                         ` Andi Kleen
2001-02-12  2:17                           ` Rogerio Brito
2001-02-12  9:49                             ` Andi Kleen
2001-02-12 13:39                           ` Henning P. Schmiedehausen
2001-02-11 21:16                         ` Hans Reiser
2001-02-12  9:36                           ` Alan Cox
2001-02-11 10:53             ` Alan Cox
2001-02-11  8:50           ` Hans Reiser
     [not found]     ` <3A818619.7C3967BC@baldauf.org>
2001-02-07 17:39       ` Chris Mason
2001-02-07 17:53         ` Xuan Baldauf
2001-02-07 19:14         ` Xuan Baldauf
2001-02-07 21:47       ` Chris Wedgwood
2001-02-07 21:55         ` Chris Mason
2001-02-07 22:05           ` Xuan Baldauf
2001-02-07 22:13             ` Chris Mason
2001-02-07 18:41   ` Vedran Rodic
2001-02-07 18:45     ` Chris Mason
2001-02-07 19:15       ` Ivan Pulleyn
