* $25 question - ReiserFS 3.6 data errors on 2.4.23
@ 2004-02-17 13:52 Jens Benecke
0 siblings, 0 replies; 7+ messages in thread
From: Jens Benecke @ 2004-02-17 13:52 UTC
To: reiserfs-list
Hi,
I posted here in December about problems with one specific computer and
having to rebuild-tree a partition because of data errors on it.
Well, it happened again. I don't understand this. I am running a dozen
machines, each with ReiserFS, most of them under much heavier load than
those two in question, but for the FOURTH time in some months THIS
partition has coughed up errors (see below).
I would really like to get to the bottom of this. The same problem happened
with 2.4.19 and 2.4.22 (without data logging) after some weeks of
continuous usage. I am running another server with the same 2.4.23 kernel
(+ data logging patches) and it has not had any file system problems. The
strange thing is, there are no signs of hardware problems in the logs where
these errors occur:
Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
free_space(entry_count) 0
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
format found in block 4683374. Fsck?
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-13070: reiserfs_read_inode2:
i/o failure occurred trying to find stat data of [68636 68667 0x0 SD]
Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
free_space(entry_count) 0
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
format found in block 4683374. Fsck?
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-13070: reiserfs_read_inode2:
i/o failure occurred trying to find stat data of [68636 68668 0x0 SD]
Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
free_space(entry_count) 0
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
format found in block 4683374. Fsck?
Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
free_space(entry_count) 0
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
format found in block 4683374. Fsck?
Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
free_space(entry_count) 0
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
format found in block 4683374. Fsck?
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-13070: reiserfs_read_inode2:
i/o failure occurred trying to find stat data of [68636 68666 0x0 SD]
Feb 17 14:40:18 linux1 imaplogin: LOGOUT, user=et7ks,
ip=[::ffff:134.28.62.44], headers=0, body=0
It's always the same files, the machine has had these for a couple weeks,
and it does not depend on drbd (I had the same problems before starting to
mirror stuff).
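To back up the "always the same files" observation, the log excerpt above can be tallied mechanically. The following is only a minimal sketch, not part of the original report: the function name `tally` and both regular expressions are assumptions derived from the message formats visible in this excerpt, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Both patterns are assumptions based on the log excerpt in this message.
BLOCK_RE = re.compile(
    r"vs-5150: search_by_key: invalid format found in block (\d+)"
)
KEY_RE = re.compile(
    r"vs-13070: reiserfs_read_inode2: i/o failure occurred "
    r"trying to find stat data of \[(\d+) (\d+) "
)

def tally(log_text):
    """Count reported bad blocks and failed stat-data keys in a kernel log."""
    # Mail clients wrap long log lines, so collapse all whitespace first
    # to let the patterns match across line breaks.
    flat = " ".join(log_text.split())
    blocks = Counter(int(b) for b in BLOCK_RE.findall(flat))
    keys = Counter((int(d), int(o)) for d, o in KEY_RE.findall(flat))
    return blocks, keys

sample = """\
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
format found in block 4683374. Fsck?
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-13070: reiserfs_read_inode2:
i/o failure occurred trying to find stat data of [68636 68667 0x0 SD]
Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
format found in block 4683374. Fsck?
"""

blocks, keys = tally(sample)
print(blocks)
print(keys)
```

If every report points at one block, as in the excerpt, the damage is confined to a single on-disk leaf; that alone does not say whether the cause is software or hardware.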
If anybody would like debugreiserfs output mailed somewhere, please say so.
I will arrange for downtime and run debugreiserfs over the partition.
Please help =;)
PS: I'm serious about the subject. If this is something I messed up and you
tell me where, I'm perfectly willing to pay for it.
--
Jens Benecke (jens at spamfreemail.de)
http://www.hitchhikers.de - Europe-wide free ride-sharing service
http://www.spamfreemail.de - 100% clean mailboxes - guaranteed!
http://www.rb-hosting.de - PHP from 9€ - SSH from 19€ - affordable traffic
* Re: $25 question - ReiserFS 3.6 data errors on 2.4.23
[not found] <20040218093510.GD21098@backtop.namesys.com>
@ 2004-02-18 10:08 ` Vladimir Saveliev
2004-02-18 10:32 ` Jens Benecke
2004-03-01 12:30 ` Jens Benecke
0 siblings, 2 replies; 7+ messages in thread
From: Vladimir Saveliev @ 2004-02-18 10:08 UTC
To: jens@spamfreemail.de; +Cc: reiserfs-list
Hello
> Mailing-List: contact reiserfs-list-help@namesys.com; run by ezmlm
> To: reiserfs-list@namesys.com
> From: Jens Benecke <jens@spamfreemail.de>
> Subject: $25 question - ReiserFS 3.6 data errors on 2.4.23
> Date: Tue, 17 Feb 2004 14:52:10 +0100
> Organization: University of the Armed Forces, Hamburg, Germany
>
> Hi,
>
> I posted here in December about problems with one specific computer and
> having to rebuild-tree a partition because of data errors on it.
>
> Well, it happened again. I don't understand this. I am running a dozen
> machines, each with ReiserFS, most of them under much heavier load than
> those two in question, but for the FOURTH time in some months THIS
> partition has coughed up errors (see below).
>
> I would really like to get to the bottom of this. The same problem happened
> with 2.4.19 and 2.4.22 (without data logging) after some weeks of
> continuous usage. I am running another server with the same 2.4.23 kernel
> (+ data logging patches) and it has not had any file system problems. The
> strange thing is, there are no signs of hardware problems in the logs where
> these errors occur:
>
> Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
> one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
> free_space(entry_count) 0
> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
> format found in block 4683374. Fsck?
> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-13070: reiserfs_read_inode2:
> i/o failure occurred trying to find stat data of [68636 68667 0x0 SD]
> Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
> one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
> free_space(entry_count) 0
> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
> format found in block 4683374. Fsck?
> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-13070: reiserfs_read_inode2:
> i/o failure occurred trying to find stat data of [68636 68668 0x0 SD]
> Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
> one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
> free_space(entry_count) 0
> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
> format found in block 4683374. Fsck?
> Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
> one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
> free_space(entry_count) 0
> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
> format found in block 4683374. Fsck?
> Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
> one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
> free_space(entry_count) 0
> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
> format found in block 4683374. Fsck?
> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-13070: reiserfs_read_inode2:
> i/o failure occurred trying to find stat data of [68636 68666 0x0 SD]
> Feb 17 14:40:18 linux1 imaplogin: LOGOUT, user=et7ks,
> ip=[::ffff:134.28.62.44], headers=0, body=0
>
> It's always the same files, the machine has had these for a couple weeks,
> and it does not depend on drbd (I had the same problems before starting to
> mirror stuff).
>
>
> If anybody would like debugreiserfs output mailed somewhere, please say so.
> I will arrange for downtime and run debugreiserfs over the partition.
>
These days, when someone notices corruption like this, they usually
discover a hardware problem later (most often a memory problem). May I
ask you to run memtest for some time?
Does linux1 have other ReiserFS filesystems under similar load?
Can you swap linux1 with another machine that currently runs
flawlessly?
If that machine then gets filesystem corruption, we probably have a bug
that shows up under the load linux1 is currently running. If it works
well, then linux1 is probably not very reliable.
* Re: $25 question - ReiserFS 3.6 data errors on 2.4.23
2004-02-18 10:08 ` $25 question - ReiserFS 3.6 data errors on 2.4.23 Vladimir Saveliev
@ 2004-02-18 10:32 ` Jens Benecke
2004-02-18 13:27 ` Nick Burrett
2004-03-01 12:30 ` Jens Benecke
1 sibling, 1 reply; 7+ messages in thread
From: Jens Benecke @ 2004-02-18 10:32 UTC
To: reiserfs-list
Vladimir Saveliev wrote:
> Hello
Hi,
thanks for your answer.
>> Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
>> one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
>> free_space(entry_count) 0
>> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
>> format found in block 4683374. Fsck?
>
> These days, when someone notices corruption like this, they usually
> discover a hardware problem later (most often a memory problem). May I
> ask you to run memtest for some time?
We did. We even exchanged the memory, and it happened again.
> Does linux1 have other ReiserFS filesystems under similar load?
Only the root file system.
There is only / and /home on this machine, and / never had any problems.
> Can you swap linux1 with another machine that currently runs
> flawlessly?
We actually bought new ones now. When they are running fine we'll be able to
take the old ones and investigate more thoroughly.
But that'll probably take some more weeks, because I don't have time to do
the setup on the new machines ATM. (diploma thesis :)
> If that machine then gets filesystem corruption, we probably have a bug
> that shows up under the load linux1 is currently running. If it works
> well, then linux1 is probably not very reliable.
Well... it's a K6-350 with 128MB, about six years old, though with new RAM
and new hard disks (IBM 120GB).
--
Jens Benecke (jens at spamfreemail.de)
http://www.hitchhikers.de - Europe-wide free ride-sharing service
http://www.spamfreemail.de - 100% clean mailboxes - guaranteed!
http://www.rb-hosting.de - PHP from 9€ - SSH from 19€ - affordable traffic
* Re: $25 question - ReiserFS 3.6 data errors on 2.4.23
2004-02-18 10:32 ` Jens Benecke
@ 2004-02-18 13:27 ` Nick Burrett
2004-02-18 13:42 ` Jens Benecke
0 siblings, 1 reply; 7+ messages in thread
From: Nick Burrett @ 2004-02-18 13:27 UTC
To: Jens Benecke; +Cc: reiserfs-list
Jens Benecke wrote:
> Vladimir Saveliev wrote:
>>>Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
>>>one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
>>>free_space(entry_count) 0
>>>Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
>>>format found in block 4683374. Fsck?
>>
>>These days, when someone notices corruption like this, they usually
>>discover a hardware problem later (most often a memory problem). May I
>>ask you to run memtest for some time?
>
>
> We did. We even exchanged the memory, and it happened again.
This reminds me of a similar problem that we experienced across several
of our servers. It turned out that an upgrade from reiserfsprogs 3.6.8
to 3.6.11 changed the status result codes of 'reiserfsck' meaning that
our scripts failed to detect that '--check' was reporting that we should
also run '--fix-fixable'.
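A wrapper script can be made less sensitive to this kind of change by treating every non-zero exit status as "needs attention" instead of matching individual values. The following is only a rough sketch under stated assumptions, not the scripts described above: `run_checker` is a hypothetical helper, and the stand-in commands merely exit with a chosen status. What each reiserfsck status means should be taken from the reiserfsck(8) man page of the installed reiserfsprogs version.

```python
import subprocess
import sys

def run_checker(cmd):
    """Run a checker command and return (clean, exit_status).

    Only exit status 0 counts as clean; any other value is handed back
    to the caller rather than being compared against a fixed list that
    could go stale after a tools upgrade.
    """
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    )
    return result.returncode == 0, result.returncode

# Stand-in commands so the sketch runs anywhere; a real script would pass
# the reiserfsck invocation it already uses.
ok_a, status_a = run_checker([sys.executable, "-c", "import sys; sys.exit(0)"])
ok_b, status_b = run_checker([sys.executable, "-c", "import sys; sys.exit(6)"])
print(ok_a, status_a)
print(ok_b, status_b)
```

Logging the raw status next to the verdict keeps an unexpected new value visible even when the script does not know what it means.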
Regards,
Nick.
--
Nick Burrett, Senior Systems and Network Engineer
Designer Servers Ltd. http://www.dsvr.co.uk
* Re: $25 question - ReiserFS 3.6 data errors on 2.4.23
2004-02-18 13:27 ` Nick Burrett
@ 2004-02-18 13:42 ` Jens Benecke
2004-02-20 1:41 ` Tom Vier
0 siblings, 1 reply; 7+ messages in thread
From: Jens Benecke @ 2004-02-18 13:42 UTC
To: reiserfs-list
Nick Burrett wrote:
>
>
> Jens Benecke wrote:
>> Vladimir Saveliev wrote:
>>>>Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong
>>>>(second one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location
>>>>1444, free_space(entry_count) 0
>>>>Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key:
>>>>invalid format found in block 4683374. Fsck?
>>>
>>>These days, when someone notices corruption like this, they usually
>>>discover a hardware problem later (most often a memory problem). May I
>>>ask you to run memtest for some time?
>> We did. We even exchanged the memory, and it happened again.
> This reminds me of a similar problem that we experienced across several
> of our servers. It turned out that an upgrade from reiserfsprogs 3.6.8
> to 3.6.11 changed the status result codes of 'reiserfsck' meaning that
> our scripts failed to detect that '--check' was reporting that we should
> also run '--fix-fixable'.
Well, the last time we had these problems it required a --rebuild-tree. I
hope it's not the same this time.
And we don't run reiserfsck at boot time, only manually. ("0 0" in fstab)
--
Jens Benecke (jens at spamfreemail.de)
http://www.hitchhikers.de - Europe-wide free ride-sharing service
http://www.spamfreemail.de - 100% clean mailboxes - guaranteed!
http://www.rb-hosting.de - PHP from 9€ - SSH from 19€ - affordable traffic
* Re: $25 question - ReiserFS 3.6 data errors on 2.4.23
2004-02-18 13:42 ` Jens Benecke
@ 2004-02-20 1:41 ` Tom Vier
0 siblings, 0 replies; 7+ messages in thread
From: Tom Vier @ 2004-02-20 1:41 UTC
To: reiserfs-list
On Wed, Feb 18, 2004 at 02:42:08PM +0100, Jens Benecke wrote:
> And we don't run reiserfsck at boot time, only manually. ("0 0" in fstab)
If it's run after boot, while the filesystem is mounted rw, you could be
reading stale data.
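One way to guard against that is to check the mount table before starting a manual fsck. The following is only a minimal sketch: `mounted_read_write` is a hypothetical helper, the device names in the sample are made up, and the field layout assumed is the usual /proc/mounts one (device, mount point, type, options, dump, pass).

```python
def mounted_read_write(device, mounts_text):
    """Return True if `device` is listed with the rw option in
    /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        dev, _mount_point, _fs_type, options = fields[:4]
        if dev == device and "rw" in options.split(","):
            return True
    return False

# Hypothetical mount table; on a live system the text would come from
# open("/proc/mounts").read().
sample = """\
/dev/hda1 / reiserfs rw 0 0
/dev/drbd0 /home reiserfs rw,noatime 0 0
/dev/hdb1 /mnt/backup ext3 ro 0 0
"""

print(mounted_read_write("/dev/drbd0", sample))
print(mounted_read_write("/dev/hdb1", sample))
```

The comparison is by exact device name, so a filesystem mounted through a symlink or an alternative device path would be missed.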
--
Tom Vier <tmv@comcast.net>
DSA Key ID 0xE6CB97DA
* Re: $25 question - ReiserFS 3.6 data errors on 2.4.23
2004-02-18 10:08 ` $25 question - ReiserFS 3.6 data errors on 2.4.23 Vladimir Saveliev
2004-02-18 10:32 ` Jens Benecke
@ 2004-03-01 12:30 ` Jens Benecke
1 sibling, 0 replies; 7+ messages in thread
From: Jens Benecke @ 2004-03-01 12:30 UTC
To: reiserfs-list
Vladimir Saveliev wrote:
>> Feb 17 14:40:18 linux1 kernel: is_leaf: item location seems wrong (second
>> one): *3.6* [68637 68643 0x1 IND], item_len 8, item_location 1444,
>> free_space(entry_count) 0
>> Feb 17 14:40:18 linux1 kernel: drbd(43,0):vs-5150: search_by_key: invalid
>> format found in block 4683374. Fsck?
> These days, when someone notices corruption like this, they usually
> discover a hardware problem later (most often a memory problem). May I
> ask you to run memtest for some time?
We finally found time to halt one machine and did. memtest86 finds about 150
seemingly random places in RAM where bit errors occur.
It's a wonder that Linux had an uptime of over 90 days with almost flawless
operation on a machine with THAT kind of hopelessly b0rken RAM.
Thank you for being so stubborn about hardware errors. :-)
--
Jens Benecke (jens at spamfreemail.de)
http://www.hitchhikers.de - Europe-wide free ride-sharing service
http://www.spamfreemail.de - 100% clean mailboxes - guaranteed!
http://www.rb-hosting.de - PHP from 9€ - SSH from 19€ - affordable traffic