disk or reiserfs problem?


* disk or reiserfs problem?
@ 2003-05-28 20:07 Jeff Breidenbach
  2003-05-29  5:45 ` Oleg Drokin
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Breidenbach @ 2003-05-28 20:07 UTC (permalink / raw)
  To: reiserfs-list


This is after a hard (power switch) reboot (due to I/O errors). The
disk in question has about 125 GB of data on a single 200GB reiserfs
partition. Do people think the disk is toast, or is this possibly some
correctable filesystem problem? The machine is remote, so I can't
tell if the disk is making funny grinding noises. Uptime was about
45 days with high disk load before the problem occurred.

Any suggestions appreciated.

-Jeff




root@toko:~# mount -t reiserfs /dev/hdb1 /data1
mount: wrong fs type, bad option, bad superblock on /dev/hdb1,
       or too many mounted file systems
       (could this be the IDE device where you in fact use
       ide-scsi so that sr0 or sda or so is needed?)


root@toko:~# reiserfsck /dev/hdb1
 
<-------------reiserfsck, 2002------------->
reiserfsprogs 3.6.3
 
Will read-only check consistency of the filesystem on /dev/hdb1
Will put log info to 'stdout'
 
Do you want to run this program?[N/Yes] (note need to type Yes):Yes
 
bread: Cannot read a block # 2.


root@toko:~# dmesg | tail
hdb1: bad access: block=35, count=5
end_request: I/O error, dev 03:41 (hdb), sector 35
hdb1: bad access: block=36, count=4
end_request: I/O error, dev 03:41 (hdb), sector 36
hdb1: bad access: block=37, count=3
end_request: I/O error, dev 03:41 (hdb), sector 37
hdb1: bad access: block=38, count=2
end_request: I/O error, dev 03:41 (hdb), sector 38
hdb1: bad access: block=39, count=1
end_request: I/O error, dev 03:41 (hdb), sector 39

root@toko:~# cat /proc/version
Linux version 2.4.20-xfs (knoppix@Knoppix) (gcc version 2.95.4 20011002 (Debian prerelease)) #1 SMP Die Dez 10 20:07:25 CET 2002


* Re: disk or reiserfs problem?
  2003-05-28 20:07 disk or reiserfs problem? Jeff Breidenbach
@ 2003-05-29  5:45 ` Oleg Drokin
  2003-06-02 18:36   ` Hans Reiser
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Drokin @ 2003-05-29  5:45 UTC (permalink / raw)
  To: Jeff Breidenbach; +Cc: reiserfs-list

Hello!

On Wed, May 28, 2003 at 01:07:27PM -0700, Jeff Breidenbach wrote:
> This is after a hard (power switch) reboot (due to I/O errors). The
> disk in question has about 125 GB of data on a single 200GB reiserfs
> partition. Do people think the disk is toast, or is this possibly some
> correctable filesystem problem? The machine is remote, so I can't
> hdb1: bad access: block=35, count=5
> end_request: I/O error, dev 03:41 (hdb), sector 35

Looks like the disk has gone bad. If you are lucky, some of the data
can still be recovered. Try copying the entire disk into a file or onto
another disk to see how many bad sectors there are.

Bye,
    Oleg


* Re: disk or reiserfs problem?
  2003-05-29  5:45 ` Oleg Drokin
@ 2003-06-02 18:36   ` Hans Reiser
  2003-06-02 20:00     ` Jeff Breidenbach
  2003-06-03  5:23     ` Oleg Drokin
  0 siblings, 2 replies; 10+ messages in thread
From: Hans Reiser @ 2003-06-02 18:36 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Breidenbach, reiserfs-list

Oleg Drokin wrote:

>Hello!
>
>On Wed, May 28, 2003 at 01:07:27PM -0700, Jeff Breidenbach wrote:
>  
>
>>This is after a hard (power switch) reboot (due to I/O errors). The
>>disk in question has about 125 GB of data on a single 200GB reiserfs
>>partition. Do people think the disk is toast, or is this possibly some
>>correctable filesystem problem? The machine is remote, so I can't
>>hdb1: bad access: block=35, count=5
>>end_request: I/O error, dev 03:41 (hdb), sector 35
>>    
>>
>
>Looks like disk have gone bad. If you are lucky enough, some of the data
>still can be recovered. Try to copy entire disk into a file/to another
>disk to see how much bad sectors are there.
>
>Bye,
>    Oleg
>
>
>  
>
You should provide more details in such advice, such as telling him 
about dd_rescue and why it is better than dd, etc.  You should also 
explain that if it is only a few blocks that are bad, writing to the bad 
blocks can make them go away most of the time (the drive will remap them).
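
As a rough illustration of the copy-the-disk advice above, here is a
minimal sketch using plain dd. The helper name image_device and both
paths are assumptions made for the example, not something from this
thread. Plain dd stops at the first read error unless told otherwise;
conv=noerror,sync keeps it going and zero-fills what it could not read.
dd_rescue is built for this job: it continues past errors by default
and, as far as I know, retries damaged areas with a smaller block size
so that less good data is lost around each bad sector.

```shell
#!/bin/sh
# Sketch: image a possibly failing device while skipping unreadable blocks.
# image_device is a hypothetical helper name; the paths are placeholders.
image_device() {
    src=$1    # source device, e.g. /dev/hdb1 (assumed)
    dst=$2    # image file on a healthy disk (assumed)
    # conv=noerror: continue after a read error instead of aborting.
    # conv=sync:    pad every short or failed read with zeros to a full
    #               block, so later data keeps its original offset.
    # dd reports its record counts and any read errors on stderr.
    dd if="$src" of="$dst" bs=4096 conv=noerror,sync
}

# With dd_rescue installed, the basic form would be:
#     dd_rescue /dev/hdb1 /mnt/spare/hdb1.img
```

Usage would be something like "image_device /dev/hdb1 /mnt/spare/hdb1.img".
Because of conv=sync the image is always a whole number of 4096-byte
blocks, so it can be slightly larger than the source. None of this helps
when the drive does not spin up at all.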

-- 
Hans




* Re: disk or reiserfs problem?
  2003-06-02 18:36   ` Hans Reiser
@ 2003-06-02 20:00     ` Jeff Breidenbach
  2003-06-03  5:33       ` Oleg Drokin
  2003-06-03  5:23     ` Oleg Drokin
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff Breidenbach @ 2003-06-02 20:00 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Oleg Drokin, reiserfs-list

Well, I retrieved the disk from the colocation facility - 
it is basically totalled. The BIOS and the Linux kernel each make
about 5 attempts at spinning up the drive. It spins
up, then spins down after a few seconds. No software tool
in the world will get data off a disk that isn't spinning.
I gave up, bought a new disk, and am restoring from backups.

Incidentally, during the restoration I find that cp is 
giving a throughput of about 1 MB/s when copying from
hard drive to hard drive. Both source and destination disks
use reiserfs and hdparm -tT reports about 50MB/s read rate. 
Is a 1MB/s throughput expected when copying many small files?
(There is a lot of data involved, so I probably have a couple 
of days to ponder the question.)

Cheers,
Jeff

On Mon, 2003-06-02 at 11:36, Hans Reiser wrote:
> Oleg Drokin wrote:
> 
> >Hello!
> >
> >On Wed, May 28, 2003 at 01:07:27PM -0700, Jeff Breidenbach wrote:
> >  
> >
> >>This is after a hard (power switch) reboot (due to I/O errors). The
> >>disk in question has about 125 GB of data on a single 200GB reiserfs
> >>partition. Do people think the disk is toast, or is this possibly some
> >>correctable filesystem problem? The machine is remote, so I can't
> >>hdb1: bad access: block=35, count=5
> >>end_request: I/O error, dev 03:41 (hdb), sector 35
> >>    
> >>
> >
> >Looks like disk have gone bad. If you are lucky enough, some of the data
> >still can be recovered. Try to copy entire disk into a file/to another
> >disk to see how much bad sectors are there.
> >
> >Bye,
> >    Oleg
> >
> >
> >  
> >
> You should provide more details in such advice, such as telling him 
> about dd_rescue and why it is better than dd, etc.  You should also 
> explain that if it is only a few blocks that are bad, writing to the bad 
> blocks can make them go away most of the time (the drive will remap them).
-- 
Jeff Breidenbach <jbreiden@parc.com>
Member of Research Staff, PARC


* Re: disk or reiserfs problem?
  2003-06-02 18:36   ` Hans Reiser
  2003-06-02 20:00     ` Jeff Breidenbach
@ 2003-06-03  5:23     ` Oleg Drokin
  1 sibling, 0 replies; 10+ messages in thread
From: Oleg Drokin @ 2003-06-03  5:23 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Jeff Breidenbach, reiserfs-list

Hello!

On Mon, Jun 02, 2003 at 10:36:00PM +0400, Hans Reiser wrote:
> >>This is after a hard (power switch) reboot (due to I/O errors). The
> >>disk in question has about 125 GB of data on a single 200GB reiserfs
> >>partition. Do people think the disk is toast, or is this possibly some
> >>correctable filesystem problem? The machine is remote, so I can't
> >>hdb1: bad access: block=35, count=5
> >>end_request: I/O error, dev 03:41 (hdb), sector 35
> >Looks like disk have gone bad. If you are lucky enough, some of the data
> >still can be recovered. Try to copy entire disk into a file/to another
> >disk to see how much bad sectors are there.
> You should provide more details in such advice, such as telling him 
> about dd_rescue and why it is better than dd, etc.  You should also 
> explain that if it is only a few blocks that are bad, writing to the bad 
> blocks can make them go away most of the time (the drive will remap them).

Actually, in this case it is impossible to write to the blocks.
The drive's system area, which you cannot write to, is damaged, so the drive
cannot even identify itself.

Bye,
    Oleg


* Re: disk or reiserfs problem?
  2003-06-02 20:00     ` Jeff Breidenbach
@ 2003-06-03  5:33       ` Oleg Drokin
  2003-06-03 17:49         ` Prereading block device doesn't help filesystems Mike Fedyk
  2003-06-05 13:20         ` disk or reiserfs problem? Hans Reiser
  0 siblings, 2 replies; 10+ messages in thread
From: Oleg Drokin @ 2003-06-03  5:33 UTC (permalink / raw)
  To: Jeff Breidenbach; +Cc: Hans Reiser, reiserfs-list

Hello!

On Mon, Jun 02, 2003 at 01:00:36PM -0700, Jeff Breidenbach wrote:

> Incidentally, during the restoration I find that cp is 
> giving a throughput of about 1 MB/s when copying from 
> harddrive to harddrive. Both source and destination disks 
> use reiserfs and hdparm -tT reports about 50MB/s read rate. 
> Is a 1MB/s throughput expected when copying many small files?
> (There is a lot of data involved, so I probably have a couple 
> of days to ponder the question.)

Copying lots of small files can take a long time simply because you
cannot read them in the order they are stored on the disk platter, and most
probably you do not know that order. The end result is that the disk seeks
constantly and throughput drops. If you know the order in which the files were
written (e.g. alphabetically), you can read them in the same order (plain
"cp -R . somedir" reads files in readdir order).
I am not sure what can be done if you do not know the order in
which the files were written. Perhaps you could copy the mounted backup drive
to /dev/zero (e.g. dd if=/dev/hdb1 of=/dev/zero bs=1024k count=500, replacing
500 with half of your RAM in megabytes). If you have enough RAM, there is a
chance that this will cache enough data to keep things moving more quickly.
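
The preread idea above could be scripted roughly as follows. This is
only a sketch under assumptions: the helper names half_ram_mb and
preread are made up for the example, and /dev/null is used as the sink
instead of /dev/zero (both simply discard what is written to them).
Whether the cached blocks actually speed up the copy is a separate
question.

```shell
#!/bin/sh
# half_ram_mb and preread are hypothetical helper names for this sketch.

# Print half of total RAM in megabytes.
# /proc/meminfo reports MemTotal in kB, so kB / 2048 is half the RAM in MB.
half_ram_mb() {
    meminfo=${1:-/proc/meminfo}
    awk '/^MemTotal:/ { print int($2 / 2048) }' "$meminfo"
}

# Read the first N megabytes of a block device and throw the data away.
# N defaults to half of RAM, as suggested in the text.
preread() {
    dev=$1
    mb=${2:-$(half_ram_mb)}
    dd if="$dev" of=/dev/null bs=1024k count="$mb"
}
```

For example, "preread /dev/hdb1" reads half a RAM's worth of data from
the start of the partition, and "preread /dev/hdb1 500" matches the
count=500 figure in the text.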

Bye,
    Oleg


* Prereading block device doesn't help filesystems
  2003-06-03  5:33       ` Oleg Drokin
@ 2003-06-03 17:49         ` Mike Fedyk
  2003-06-03 18:19           ` Chris Mason
  2003-06-05 13:20         ` disk or reiserfs problem? Hans Reiser
  1 sibling, 1 reply; 10+ messages in thread
From: Mike Fedyk @ 2003-06-03 17:49 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Breidenbach, Hans Reiser, reiserfs-list

On Tue, Jun 03, 2003 at 09:33:36AM +0400, Oleg Drokin wrote:
> (e.g. dd if=/dev/hdb1 of=/dev/zero bs=1024k count=500, replacing 500 with half of your RAM
> in megabytes). If you have enough RAM, there is a chance that this will cache
> enough data to keep stuff moving more quickly. 

That won't work either.

Basically you're copying the data into buffer cache, and the page cache
won't be able to pick it up, so you're going to read from the disk twice.
Even with Andrea's buffercache in pagecache from 2.4.10 it doesn't do the
aliasing IIRC.  And there were some long threads where Linus said he
wouldn't accept patches to do the aliasing either.

Don't know if he has changed his mind in 2.5 though...


* Re: Prereading block device doesn't help filesystems
  2003-06-03 17:49         ` Prereading block device doesn't help filesystems Mike Fedyk
@ 2003-06-03 18:19           ` Chris Mason
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Mason @ 2003-06-03 18:19 UTC (permalink / raw)
  To: Mike Fedyk; +Cc: Oleg Drokin, Jeff Breidenbach, Hans Reiser, reiserfs-list

On Tue, 2003-06-03 at 13:49, Mike Fedyk wrote:
> On Tue, Jun 03, 2003 at 09:33:36AM +0400, Oleg Drokin wrote:
> > (e.g. dd if=/dev/hdb1 of=/dev/zero bs=1024k count=500, replacing 500 with half of your RAM
> > in megabytes). If you have enough RAM, there is a chance that this will cache
> > enough data to keep stuff moving more quickly. 
> 
> That won't work either.
> 
> Basically you're copying the data into buffer cache, and the page cache
> won't be able to pick it up, so you're going to read from the disk twice.
> Even with Andrea's buffercache in pagecache from 2.4.10 it doesn't do the
> aliasing IIRC.  And there were some long threads where Linus said he
> wouldn't accept patches to do the aliasing either.
> 
> Don't know if he has changed his mind in 2.5 though...
> 

There are two caches used by reiserfs.  The page cache has file data,
and the buffer cache has FS metadata.  The buffer cache doesn't really
exist anymore, it's really just the page cache for the file
corresponding to the block device.

So, if you dd if=/dev/hdb1 of=/dev/zero, you are reading into the page
cache of the block device which means you are populating the metadata
cache for reiserfs as well.

With packed tails and many small files, this represents the bulk of the
data on the FS, and it should make a performance difference if you have
enough RAM to make it worthwhile, and your access patterns keep the
blocks in cache.
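
One way to see whether a preread really lands in the block device's
cache is to watch /proc/meminfo while it runs. The sketch below assumes
that block-device pages are counted under the "Buffers" field, which is
my understanding for Linux but is worth verifying on the kernel in
question. The helper names meminfo_kb and buffers_delta are invented
for the example.

```shell
#!/bin/sh
# meminfo_kb and buffers_delta are hypothetical helper names.

# Print the value (in kB) of one field of /proc/meminfo, e.g. Buffers.
meminfo_kb() {
    field=$1
    meminfo=${2:-/proc/meminfo}
    awk -v f="$field:" '$1 == f { print $2 }' "$meminfo"
}

# Run a command and print how many kB the Buffers figure changed by.
# Assumption: reads from a block device are accounted under Buffers.
buffers_delta() {
    before=$(meminfo_kb Buffers)
    "$@" >/dev/null 2>&1
    after=$(meminfo_kb Buffers)
    echo $((after - before))
}
```

For example, "buffers_delta dd if=/dev/hdb1 of=/dev/null bs=1024k count=500"
should print a number close to 512000 if the blocks stay cached, and a
much smaller one if memory pressure evicts them during the read.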

-chris


* Re: disk or reiserfs problem?
  2003-06-03  5:33       ` Oleg Drokin
  2003-06-03 17:49         ` Prereading block device doesn't help filesystems Mike Fedyk
@ 2003-06-05 13:20         ` Hans Reiser
  2003-06-05 13:38           ` Nikita Danilov
  1 sibling, 1 reply; 10+ messages in thread
From: Hans Reiser @ 2003-06-05 13:20 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Breidenbach, reiserfs-list

Oleg Drokin wrote:

>Hello!
>
>On Mon, Jun 02, 2003 at 01:00:36PM -0700, Jeff Breidenbach wrote:
>
>  
>
>>Incidentally, during the restoration I find that cp is 
>>giving a throughput of about 1 MB/s when copying from 
>>harddrive to harddrive. Both source and destination disks 
>>use reiserfs and hdparm -tT reports about 50MB/s read rate. 
>>Is a 1MB/s throughput expected when copying many small files?
>>(There is a lot of data involved, so I probably have a couple 
>>of days to ponder the question.)
>>    
>>
>
>Copying lots of small files can take lots of time just because you
>cannot read those in order they are stored on disk platter. And you do not know
>the order most probably. So the end result is the disk does constant seeking and
>the speed is reduced. If you know the order in which files were written (e.g. alphabetically),
>then you can read them in the same order (plain "cp -R . somedir" will read files in readdir order)
>I am not very sure what can be done in such a case if you do not know the order in
>which files were written. Perhaps if you copy the mounted backup drive to /dev/zero
>(e.g. dd if=/dev/hdb1 of=/dev/zero bs=1024k count=500, replacing 500 with half of your RAM
>in megabytes). If you have enough RAM, there is a chance that this will cache
>enough data to keep stuff moving more quickly. 
>
>Bye,
>    Oleg
>
>
>  
>
Your performance will improve after the restoration though....

-- 
Hans




* Re: disk or reiserfs problem?
  2003-06-05 13:20         ` disk or reiserfs problem? Hans Reiser
@ 2003-06-05 13:38           ` Nikita Danilov
  0 siblings, 0 replies; 10+ messages in thread
From: Nikita Danilov @ 2003-06-05 13:38 UTC (permalink / raw)
  To: Hans Reiser; +Cc: Oleg Drokin, Jeff Breidenbach, reiserfs-list

Hans Reiser writes:
 > Oleg Drokin wrote:
 > 

[...]

 > >  
 > >
 > Your performance will improve after the restoration though....

Depends on what one gets in the restaurant, usually.

 > 
 > -- 
 > Hans
 > 

Nikita.

 > 
