read error on superblock

* read error on superblock
@ 2012-07-23  8:45 dexen deVries
  2012-07-23  9:17 ` Vyacheslav Dubeyko
  0 siblings, 1 reply; 14+ messages in thread
From: dexen deVries @ 2012-07-23  8:45 UTC (permalink / raw)
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi list,


a harddrive got some bad sectors and now one NILFS filesystem can't be mounted;

mount: /dev/sda3: can't read superblock

I can (try to) copy this filesystem to another drive; how do I proceed from 
that point? Does it make any sense to substitute another superblock for this 
one? (either to use a spare superblock, if such exists, or put a new 
superblock on the damaged sector(s)).


Regards,
-- 
dexen deVries

[[[↓][→]]]

"all dichotomies are either true or false" is a true paradox because it's 
paradoxical only if it is a paradox ;)
--
To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: read error on superblock
  2012-07-23  8:45 read error on superblock dexen deVries
@ 2012-07-23  9:17 ` Vyacheslav Dubeyko
  2012-07-23  9:24   ` dexen deVries
  2012-07-23  9:39   ` Ryusuke Konishi
  0 siblings, 2 replies; 14+ messages in thread
From: Vyacheslav Dubeyko @ 2012-07-23  9:17 UTC (permalink / raw)
  To: dexen deVries; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi,

On Mon, 2012-07-23 at 10:45 +0200, dexen deVries wrote:
> Hi list,
> 
> 
> a harddrive got some bad sectors and now one NILFS filesystem can't be mounted;
> 
> mount: /dev/sda3: can't read superblock
> 
> I can (try to) copy this filesystem to another drive; how do I proceed from 
> that point? Does it make any sense to substitute another superblock for this 
> one? (either to use a spare superblock, if such exists, or put a new 
> superblock on the damaged sector(s)).
> 
> 
> Regards,

There is a second superblock at the end of the NILFS volume, but I guess
it may not be fully in sync with the primary one.
Theoretically, it is possible to copy the secondary superblock over the
primary one. But I am afraid that the NILFS volume may be in an
inconsistent state anyway.

Are you sure that this volume doesn't contain other damaged sectors?

With the best regards,
Vyacheslav Dubeyko.



* Re: read error on superblock
  2012-07-23  9:17 ` Vyacheslav Dubeyko
@ 2012-07-23  9:24   ` dexen deVries
  2012-07-23  9:37     ` Vyacheslav Dubeyko
  2012-07-23  9:39   ` Ryusuke Konishi
  1 sibling, 1 reply; 14+ messages in thread
From: dexen deVries @ 2012-07-23  9:24 UTC (permalink / raw)
  To: Vyacheslav Dubeyko; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi Vyacheslav,


On Monday 23 of July 2012 13:17:28 you wrote:
> There is a second superblock at the end of the NILFS volume, but I guess
> it may not be fully in sync with the primary one.
> Theoretically, it is possible to copy the secondary superblock over the
> primary one. But I am afraid that the NILFS volume may be in an
> inconsistent state anyway.

Any hints on how to locate the superblock? What offset should I look at, and 
what is the magic number?


> Are you sure that this volume doesn't contain other damaged sectors?


In my case, that doesn't matter: all the data I want to recover is either in 
Git (which does internal consistency checks) or in MySQL, which also does 
/some/ consistency checks.


Cheers,
-- 
dexen deVries

[[[↓][→]]]

"all dichotomies are either true or false" is a true paradox because it's 
paradoxical only if it is a paradox ;)

* Re: read error on superblock
  2012-07-23  9:24   ` dexen deVries
@ 2012-07-23  9:37     ` Vyacheslav Dubeyko
  0 siblings, 0 replies; 14+ messages in thread
From: Vyacheslav Dubeyko @ 2012-07-23  9:37 UTC (permalink / raw)
  To: dexen deVries; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi Dexen,

On Mon, 2012-07-23 at 11:24 +0200, dexen deVries wrote:
> Hi Vyacheslav,
> 
> 
> On Monday 23 of July 2012 13:17:28 you wrote:
> > There is a second superblock at the end of the NILFS volume, but I guess
> > it may not be fully in sync with the primary one.
> > Theoretically, it is possible to copy the secondary superblock over the
> > primary one. But I am afraid that the NILFS volume may be in an
> > inconsistent state anyway.
> 
> Any hints on how to locate the superblock? What offset should I look at, and 
> what is the magic number?

Usually, the secondary superblock is located in the last 4 KB block of
the volume. nilfs2_fs.h contains this macro:

#define NILFS_SB2_OFFSET_BYTES(devsize) ((((devsize) >> 12) - 1) << 12)

It defines the placement of the secondary superblock (devsize is the
size of the device in bytes).

The NILFS2 magic number is 0x3434. It is located at byte offset 0x0006
from the beginning of the superblock.
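
As a rough illustration of the two facts above, the following Python sketch computes the secondary superblock offset with the same arithmetic as the macro and checks for the magic number. The 1024-byte offset of the primary superblock and the little-endian byte order are assumptions that are not stated in this thread, and the helper names are made up for the example.

```python
import os
import struct

NILFS_MAGIC = 0x3434        # magic number quoted above
MAGIC_OFFSET = 0x0006       # offset of the magic within a superblock
PRIMARY_SB_OFFSET = 1024    # assumption: not stated in this thread


def sb2_offset_bytes(devsize):
    """Same arithmetic as NILFS_SB2_OFFSET_BYTES: start of the last full 4 KB block."""
    return ((devsize >> 12) - 1) << 12


def has_nilfs_magic(sb):
    """True if the buffer holds 0x3434 at byte offset 6 (little-endian assumed)."""
    if len(sb) < MAGIC_OFFSET + 2:
        return False
    (magic,) = struct.unpack_from("<H", sb, MAGIC_OFFSET)
    return magic == NILFS_MAGIC


def probe_superblocks(path):
    """Report which superblock copies of an image or device carry the magic."""
    found = {}
    with open(path, "rb") as f:
        devsize = f.seek(0, os.SEEK_END)
        candidates = (("primary", PRIMARY_SB_OFFSET),
                      ("secondary", sb2_offset_bytes(devsize)))
        for name, offset in candidates:
            f.seek(offset)
            found[name] = has_nilfs_magic(f.read(MAGIC_OFFSET + 2))
    return found
```

Calling probe_superblocks() on a copied image would show whether each copy is recognizable at all. A matching magic number only identifies a candidate; it does not prove that the superblock is valid or up to date.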

> 
> > Are you sure that this volume doesn't contain other damaged sectors?
> 
> 
> In my case, that doesn't matter: all the data I want to recover is either in 
> Git (which does internal consistency checks) or in MySQL, which also does 
> /some/ consistency checks.
> 
> 
> Cheers,

With the best regards,
Vyacheslav Dubeyko.



* Re: read error on superblock
  2012-07-23  9:17 ` Vyacheslav Dubeyko
  2012-07-23  9:24   ` dexen deVries
@ 2012-07-23  9:39   ` Ryusuke Konishi
       [not found]     ` <20120723.183907.154986664.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
  1 sibling, 1 reply; 14+ messages in thread
From: Ryusuke Konishi @ 2012-07-23  9:39 UTC (permalink / raw)
  To: Vyacheslav Dubeyko; +Cc: dexen deVries, linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi,
On Mon, 23 Jul 2012 13:17:28 +0400, Vyacheslav Dubeyko wrote:
> Hi,
> 
> On Mon, 2012-07-23 at 10:45 +0200, dexen deVries wrote:
> > Hi list,
> > 
> > 
> > a harddrive got some bad sectors and now one NILFS filesystem can't be mounted;
> > 
> > mount: /dev/sda3: can't read superblock
> > 
> > I can (try to) copy this filesystem to another drive; how do I proceed from 
> > that point? Does it make any sense to substitute another superblock for this 
> > one? (either to use a spare superblock, if such exists, or put a new 
> > superblock on the damaged sector(s)).

NILFS automatically tries the secondary superblock if the primary
superblock is broken.  It even tries to repair the primary superblock
by copying the secondary superblock over it.

> > mount: /dev/sda3: can't read superblock

That looks odd.  mount.nilfs2 doesn't print this error message.

Is mount.nilfs2 installed in the /sbin directory?

Could you try mount.nilfs2 as follows instead of the mount program?

# mount.nilfs2 <device> <mount-point>

 or

# mount -t nilfs2 <device> <mount-point>


Regards,
Ryusuke Konishi


* Re: read error on superblock
       [not found]     ` <20120723.183907.154986664.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
@ 2012-07-23  9:42       ` dexen deVries
  2012-07-23 11:06       ` dexen deVries
  1 sibling, 0 replies; 14+ messages in thread
From: dexen deVries @ 2012-07-23  9:42 UTC (permalink / raw)
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Ryusuke Konishi

Hi Ryusuke,


On Monday 23 of July 2012 18:39:07 you wrote:
> 
> # mount -t nilfs2 <device> <mount-point>

That's what I tried.

I guess the problem is that the hard drive has not re-allocated the sector 
yet, so it is /unreadable/ rather than merely containing wrong data.


I'll check again in a while whether the drive can re-allocate the sector.

-- 
dexen deVries

[[[↓][→]]]

"all dichotomies are either true or false" is a true paradox because it's 
paradoxical only if it is a paradox ;)

* Re: read error on superblock
       [not found]     ` <20120723.183907.154986664.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
  2012-07-23  9:42       ` dexen deVries
@ 2012-07-23 11:06       ` dexen deVries
  2012-07-23 11:19         ` Ryusuke Konishi
  1 sibling, 1 reply; 14+ messages in thread
From: dexen deVries @ 2012-07-23 11:06 UTC (permalink / raw)
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Ryusuke Konishi

Hi again,

On Monday 23 of July 2012 18:39:07 you wrote:
> Looks weird.  mount.nilfs2 doesn't output this error message.


another computer, same drive:

coil!root!/mnt # mount.nilfs2 -v /dev/sdc3 x -o errors=continue,norecovery
mount.nilfs2: Error while mounting /dev/sdc3 on x: Input/output error


also, in dmesg:
> NILFS warning: mounting unchecked fs
> ((lotsa ATA read error stuff))
> NILFS: error searching super root.


-- 
dexen deVries

[[[↓][→]]]

"all dichotomies are either true or false" is a true paradox because it's 
paradoxical only if it is a paradox ;)

* Re: read error on superblock
  2012-07-23 11:06       ` dexen deVries
@ 2012-07-23 11:19         ` Ryusuke Konishi
       [not found]           ` <20120723.201918.94868195.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Ryusuke Konishi @ 2012-07-23 11:19 UTC (permalink / raw)
  To: dexen deVries; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi,
On Mon, 23 Jul 2012 13:06:57 +0200, dexen deVries wrote:
> Hi again,
> 
> On Monday 23 of July 2012 18:39:07 you wrote:
> > Looks weird.  mount.nilfs2 doesn't output this error message.
> 
> 
> another computer, same drive:
> 
> coil!root!/mnt # mount.nilfs2 -v /dev/sdc3 x -o errors=continue,norecovery
> mount.nilfs2: Error while mounting /dev/sdc3 on x: Input/output error
> 
> 
> also, in dmesg:
> > NILFS warning: mounting unchecked fs
> > ((lotsa ATA read error stuff))
> > NILFS: error searching super root.

Hmm, the device seems to have a serious problem.
Can you copy the contents of the device with the dd command?

 # dd if=/dev/sdc3 of=<path-to-other>/nilfs.img
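
Note that plain dd stops at the first read error; GNU dd's conv=noerror,sync options make it continue and pad unreadable blocks with zeros. The Python sketch below illustrates the same idea under stated assumptions: the function name and the 4096-byte block size are chosen for this example only, and a dedicated tool such as ddrescue, which retries and keeps a log, is the better choice for a real rescue.

```python
import os


def copy_skipping_errors(src_path, dst_path, block_size=4096):
    """Copy src to dst block by block; zero-fill blocks that cannot be read.

    Returns the byte offsets of the blocks that raised a read error.
    """
    bad_offsets = []
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(want)
            except OSError:
                data = b""
                bad_offsets.append(offset)
            # Pad short or failed reads so offsets in the copy stay aligned.
            dst.write(data.ljust(want, b"\0"))
            offset += want
    return bad_offsets
```

Keeping every block at its original offset matters here, because the filesystem addresses its structures by position on the device.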

Regards,
Ryusuke Konishi


* Re: read error on superblock
       [not found]           ` <20120723.201918.94868195.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
@ 2012-07-23 18:24             ` dexen deVries
  2012-07-24  0:06               ` Ryusuke Konishi
  0 siblings, 1 reply; 14+ messages in thread
From: dexen deVries @ 2012-07-23 18:24 UTC (permalink / raw)
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Ryusuke Konishi

Hi again,

I've copied the whole filesystem elsewhere (to a file) with `ddrescue'. It found 
one damaged area on the drive, but apparently neither at the start nor at the end 
of the partition.

The FS on the drive was marked as `dirty' (requiring recovery upon mount). My 
guess is that the kernel attempted recovery and gave up on the read error.

Unfortunately, the `norecovery' option did not help with the drive; it only 
helped once I had moved the whole FS to a file.

Log from ddrescue:

# Rescue Logfile. Created by GNU ddrescue version 1.14
# Command line: ddrescue /dev/sdc3 sda3 sda3.log
# current_pos  current_status
0x149E0CCC00     +
#      pos        size  status
0x00000000  0x149E0CC000  +
0x149E0CC000  0x00001000  -
0x149E0CD000  0x11220D3000  +

my understanding is, the following line describes the damaged area, format: 
start length status-marker (`-' for error)
0x149E0CC000  0x00001000  -
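
The reading of the log described above can be sketched in Python. This is a minimal parser that assumes the layout shown in this log (comment lines starting with `#`, one two-field status line, then three-field `pos size status` lines); the function name is made up for the example.

```python
def bad_areas(log_text):
    """Return (start, length) pairs for areas marked '-' in a ddrescue log."""
    areas = []
    for line in log_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue                      # blank line or comment
        if len(fields) != 3:
            continue                      # the current_pos / current_status line
        start, length, status = fields
        if status == "-":
            areas.append((int(start, 16), int(length, 16)))
    return areas


LOG = """\
# Rescue Logfile. Created by GNU ddrescue version 1.14
# Command line: ddrescue /dev/sdc3 sda3 sda3.log
# current_pos  current_status
0x149E0CCC00     +
#      pos        size  status
0x00000000  0x149E0CC000  +
0x149E0CC000  0x00001000  -
0x149E0CD000  0x11220D3000  +
"""

for start, length in bad_areas(LOG):
    print(f"bad area: bytes {start:#x}..{start + length - 1:#x} ({length} bytes)")
```

For this log the parser reports a single 4 KB area, which agrees with the interpretation given above.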

Once the FS was copied to a file, it mounted correctly:
# mount -o ro,loop,norecovery ./sda3.img ./some-mountpoint 

My gripe with the current (linux-3.5.0) NILFS2 driver is that I couldn't tell it 
to ignore read errors and thus force it to mount the filesystem. Only after I had 
moved some 160GB of FS to a file (that's a bit tedious :P) did it open the FS 
just fine.

Cheers,
-- 
dexen deVries

1972 - Dennis Ritchie invents a powerful gun that shoots both forward and 
backward simultaneously. Not satisfied with the number of deaths and permanent 
maimings from that invention he invents C and Unix.

* Re: read error on superblock
  2012-07-23 18:24             ` dexen deVries
@ 2012-07-24  0:06               ` Ryusuke Konishi
       [not found]                 ` <20120724.090604.40913934.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Ryusuke Konishi @ 2012-07-24  0:06 UTC (permalink / raw)
  To: dexen deVries; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

On Mon, 23 Jul 2012 20:24:10 +0200, dexen deVries wrote:
> Hi again,
> 
> 
> I've copied the whole filesystem elsewhere (to a file) with `ddrescue'. It found 
> one damaged area on the drive, but apparently neither at the start nor at the end 
> of the partition.
> 
> The FS on the drive was marked as `dirty' (requiring recovery upon mount). My 
> guess is that the kernel attempted recovery and gave up on the read error.
> 
> Unfortunately, the `norecovery' option did not help with the drive; it only 
> helped once I had moved the whole FS to a file.
> 
> 
> Log from ddrescue:
> 
> 
> # Rescue Logfile. Created by GNU ddrescue version 1.14
> # Command line: ddrescue /dev/sdc3 sda3 sda3.log
> # current_pos  current_status
> 0x149E0CCC00     +
> #      pos        size  status
> 0x00000000  0x149E0CC000  +
> 0x149E0CC000  0x00001000  -
> 0x149E0CD000  0x11220D3000  +
> 
> 
> my understanding is, the following line describes the damaged area, format: 
> start length status-marker (`-' for error)
> 0x149E0CC000  0x00001000  -
> 
> 
> Once the FS was copied to a file, it mounted correctly:
> # mount -o ro,loop,norecovery ./sda3.img ./some-mountpoint 
> 
> 
> My gripe with the current (linux-3.5.0) NILFS2 driver is that I couldn't tell it 
> to ignore read errors and thus force it to mount the filesystem.

Good point.  The current recovery logic is intentionally implemented
so that it aborts when it encounters an I/O error.

At the very least, this behavior should not apply when the norecovery
option is specified.

Thanks,
Ryusuke Konishi


* Re: read error on superblock
       [not found]                 ` <20120724.090604.40913934.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
@ 2012-07-24  6:26                   ` Vyacheslav Dubeyko
  2012-07-24  7:52                     ` dexen deVries
  0 siblings, 1 reply; 14+ messages in thread
From: Vyacheslav Dubeyko @ 2012-07-24  6:26 UTC (permalink / raw)
  To: Ryusuke Konishi; +Cc: dexen deVries, linux-nilfs-u79uwXL29TY76Z2rM5mHXA

Hi,

On Tue, 2012-07-24 at 09:06 +0900, Ryusuke Konishi wrote:
> On Mon, 23 Jul 2012 20:24:10 +0200, dexen deVries wrote:
> > Hi again,
> > 
> > 
> > I've copied the whole filesystem elsewhere (to a file) with `ddrescue'. It found 
> > one damaged area on the drive, but apparently neither at the start nor at the end 
> > of the partition.
> > 
> > The FS on the drive was marked as `dirty' (requiring recovery upon mount). My 
> > guess is that the kernel attempted recovery and gave up on the read error.
> > 
> > Unfortunately, the `norecovery' option did not help with the drive; it only 
> > helped once I had moved the whole FS to a file.
> > 
> > 
> > Log from ddrescue:
> > 
> > 
> > # Rescue Logfile. Created by GNU ddrescue version 1.14
> > # Command line: ddrescue /dev/sdc3 sda3 sda3.log
> > # current_pos  current_status
> > 0x149E0CCC00     +
> > #      pos        size  status
> > 0x00000000  0x149E0CC000  +
> > 0x149E0CC000  0x00001000  -
> > 0x149E0CD000  0x11220D3000  +
> > 
> > 
> > my understanding is, the following line describes the damaged area, format: 
> > start length status-marker (`-' for error)
> > 0x149E0CC000  0x00001000  -
> > 
> > 
> > Once the FS was copied to a file, it mounted correctly:
> > # mount -o ro,loop,norecovery ./sda3.img ./some-mountpoint 
> > 
> > 
> > My gripe with the current (linux-3.5.0) NILFS2 driver is that I couldn't tell it 
> > to ignore read errors and thus force it to mount the filesystem.
> 
> Good point.  The current recovery logic is intentionally implemented
> so that it aborts when it encounters an I/O error.

I am afraid that this is not so good from the end user's point of view.

First of all, the message "mount: /dev/sda3: can't read superblock" can
confuse the user. The real cause is bad sectors inside the volume, but
the user is told that the superblock cannot be read.

Secondly, there are situations in which a volume really needs to be used
despite the presence of bad sectors. And I think that users can expect
such behavior from NILFS because of its declared reliability.

Unfortunately, as far as I understand, NILFS has no bad blocks table and
can't correctly handle the presence of bad blocks on a volume. It
means that NILFS treats bad blocks as an exceptional case. But from my
point of view, it makes sense to treat bad blocks as a usual thing and
to try to work in their presence. For example, fsck could potentially
check a NILFS volume for bad blocks, construct a bad blocks table,
and save it on the volume.

I suggest adding a "virtual" special file that describes bad blocks. It
can be described by an inode in the ifile, and all bad blocks can be
described in the DAT file as parts of this "virtual" special file. As a
result, the NILFS file system driver will have a bad blocks table that
can serve as a basis for excluding bad blocks from operation and for
trying to survive on an unreliable device.

What do you think about this idea?

With the best regards,
Vyacheslav Dubeyko.


* Re: read error on superblock
  2012-07-24  6:26                   ` Vyacheslav Dubeyko
@ 2012-07-24  7:52                     ` dexen deVries
  2012-07-24 16:46                       ` Ryusuke Konishi
  0 siblings, 1 reply; 14+ messages in thread
From: dexen deVries @ 2012-07-24  7:52 UTC (permalink / raw)
  To: linux-nilfs-u79uwXL29TY76Z2rM5mHXA; +Cc: Vyacheslav Dubeyko

Hi Vyacheslav,


On Tuesday 24 of July 2012 10:26:37 you wrote:
> I am afraid that this is not so good from the end user's point of view.
> 
> First of all, the message "mount: /dev/sda3: can't read superblock" can
> confuse the user. The real cause is bad sectors inside the volume, but
> the user is told that the superblock cannot be read.
> 
> Secondly, there are situations in which a volume really needs to be used
> despite the presence of bad sectors. And I think that users can expect
> such behavior from NILFS because of its declared reliability.
> 
> Unfortunately, as far as I understand, NILFS has no bad blocks table and
> can't correctly handle the presence of bad blocks on a volume. It
> means that NILFS treats bad blocks as an exceptional case. But from my
> point of view, it makes sense to treat bad blocks as a usual thing and
> to try to work in their presence. For example, fsck could potentially
> check a NILFS volume for bad blocks, construct a bad blocks table,
> and save it on the volume.
> 
> I suggest adding a "virtual" special file that describes bad blocks. It
> can be described by an inode in the ifile, and all bad blocks can be
> described in the DAT file as parts of this "virtual" special file. As a
> result, the NILFS file system driver will have a bad blocks table that
> can serve as a basis for excluding bad blocks from operation and for
> trying to survive on an unreliable device.
> 
> What do you think about this idea?

I believe bad sectors are mostly a thing of the past; any decent hard drive 
(probably also any decent SSD) should re-map them after some re-reads. Some 
data & meta-data loss is possible, but overall the FS should be accessible 
again.
I have no idea why my particular HDD did not re-map; perhaps it just takes 
much longer than I gave it.

As a point of reference, XFS does not do bad block management either; however, 
the partition driver of IRIX does bad sector management -- so it is 
implemented one layer below the FS.


I guess it /may be/ possible to use Linux's `dm' driver in such a manner.


Cheers,
-- 
dexen deVries

[[[↓][→]]]

"all dichotomies are either true or false" is a true paradox because it's 
paradoxical only if it is a paradox ;)

* Re: read error on superblock
  2012-07-24  7:52                     ` dexen deVries
@ 2012-07-24 16:46                       ` Ryusuke Konishi
       [not found]                         ` <20120725.014653.38326039.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Ryusuke Konishi @ 2012-07-24 16:46 UTC (permalink / raw)
  To: dexen deVries, Vyacheslav Dubeyko; +Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA

On Tue, 24 Jul 2012 09:52:18 +0200, dexen deVries wrote:
> Hi Vyacheslav,
> 
> 
> On Tuesday 24 of July 2012 10:26:37 you wrote:
> > I am afraid that this is not so good from the end user's point of view.
> > 
> > First of all, the message "mount: /dev/sda3: can't read superblock" can
> > confuse the user. The real cause is bad sectors inside the volume, but
> > the user is told that the superblock cannot be read.
> > 
> > Secondly, there are situations in which a volume really needs to be used
> > despite the presence of bad sectors. And I think that users can expect
> > such behavior from NILFS because of its declared reliability.
> > 
> > Unfortunately, as far as I understand, NILFS has no bad blocks table and
> > can't correctly handle the presence of bad blocks on a volume. It
> > means that NILFS treats bad blocks as an exceptional case. But from my
> > point of view, it makes sense to treat bad blocks as a usual thing and
> > to try to work in their presence. For example, fsck could potentially
> > check a NILFS volume for bad blocks, construct a bad blocks table,
> > and save it on the volume.

NILFS doesn't have a sector-based bad blocks table, but it has an error
flag in the segment usage file (sufile).  If a segment is marked
'erroneous', it will not be allocated.

At present, this works together with neither badblocks (mkfs.nilfs2)
nor the recovery logic.  However, it could be used for this purpose if
needed.
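
To make the idea concrete, here is a hypothetical Python sketch of how a damaged byte range could be translated into segment numbers, and how an allocator could then skip segments flagged as erroneous. The 4 KB block size, the 2048 blocks per segment, the assumption that segment n starts at n times the segment size, and all names are chosen for illustration only; this is not the actual sufile code.

```python
BLOCK_SIZE = 4096            # assumption: 4 KB blocks
BLOCKS_PER_SEGMENT = 2048    # assumption: 8 MB segments


def segments_of_range(start, length):
    """Segment numbers touched by the byte range [start, start + length)."""
    seg_bytes = BLOCK_SIZE * BLOCKS_PER_SEGMENT
    first = start // seg_bytes
    last = (start + length - 1) // seg_bytes
    return list(range(first, last + 1))


def allocate_segment(clean_segments, error_segments):
    """Return the first clean segment not flagged as erroneous, or None."""
    for segnum in clean_segments:
        if segnum not in error_segments:
            return segnum
    return None


# The damaged area from the ddrescue log earlier in this thread.
errors = set(segments_of_range(0x149E0CC000, 0x1000))
print(sorted(errors))
print(allocate_segment(range(10556, 10560), errors))
```

Marking whole segments is coarser than a sector-based table, since one bad 4 KB block takes a full segment out of use, but it reuses a flag that already exists.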

> > I suggest adding a "virtual" special file that describes bad blocks. It
> > can be described by an inode in the ifile, and all bad blocks can be
> > described in the DAT file as parts of this "virtual" special file. As a
> > result, the NILFS file system driver will have a bad blocks table that
> > can serve as a basis for excluding bad blocks from operation and for
> > trying to survive on an unreliable device.
> > 
> > What do you think about this idea?
> 
> I believe bad sectors are mostly a thing of the past; any decent hard drive 
> (probably also any decent SSD) should re-map them after some re-reads. Some 
> data & meta-data loss is possible, but overall the FS should be accessible 
> again.

I agree with this opinion.

If a sector-based bad blocks table is sorely needed, it is worth
considering, but it should at least be optional rather than mandatory.

But even if it is implemented well as an option, it still looks like
overkill, because most recent hard drives have spare sectors internally
and most recent flash-based drives have their own remapping mechanism.

Moreover, how a device gets corrupted depends heavily on the nature and
configuration of the underlying block device.  In this sense, an
in-device or in-driver solution looks better to me.

The bad blocks table is about to become a thing of the past; it is
almost a relic of the floppy drive era.

> I have no idea why my particular HDD did not re-map; perhaps it just takes 
> much longer than I gave it.
> 
> As a point of reference, XFS does not do bad block management either; however, 
> the partition driver of IRIX does bad sector management -- so it is 
> implemented one layer below the FS.

Yes.  If we implement some kind of redundancy mechanism in the FS layer,
it absolutely should reflect how data integrity should be enhanced in
the FS layer.


With regards,
Ryusuke Konishi



* RE: read error on superblock
       [not found]                         ` <20120725.014653.38326039.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
@ 2012-07-24 20:05                           ` Nick Martin
  0 siblings, 0 replies; 14+ messages in thread
From: Nick Martin @ 2012-07-24 20:05 UTC (permalink / raw)
  To: Ryusuke Konishi, dexen deVries, Vyacheslav Dubeyko
  Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org

Hello All,

I see in this thread what I think is a misunderstanding of the role of the disk drive in the face of a hard read error.
The drive cannot simply map an unreadable sector to a new sector based on a read failure.  
If the read has failed, the drive does not contain the correct contents for the sector.  
The read failure needs to persist until a write is received for the unreadable sector.
When the write is received, the new data can be written to a good sector and the sector map adjusted.
One of the jobs of RAID is to reconstruct the data from other sources and write the correct data back to the same sector of the drive allowing the drive to do this remapping.
If you are not using RAID software or hardware, there is typically no way to reconstruct the data.
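
The remap-on-write behavior described above can be illustrated with a toy Python model. It is a deliberately simplified sketch, not a description of any real firmware; the class, its fields, and the 512-byte sector size are assumptions made for the example.

```python
class ToyDrive:
    """Toy model of a drive that remaps a bad sector only when it is rewritten."""

    def __init__(self, sectors, spares):
        self.data = [bytes(512) for _ in range(sectors)]
        self.unreadable = set()   # sectors whose contents are lost
        self.remapped = set()     # sectors now backed by a spare
        self.spares = spares

    def read(self, n):
        if n in self.unreadable:
            # The drive has no valid data to return, so the error persists.
            raise OSError(f"unrecoverable read error at sector {n}")
        return self.data[n]

    def write(self, n, payload):
        if n in self.unreadable:
            if self.spares == 0:
                raise OSError("no spare sectors left")
            # New data arrived, so the sector can move to a spare location.
            self.spares -= 1
            self.unreadable.discard(n)
            self.remapped.add(n)
        self.data[n] = payload


drive = ToyDrive(sectors=8, spares=2)
drive.unreadable.add(3)          # simulate a hard read error on sector 3
try:
    drive.read(3)
except OSError as err:
    print(err)                   # repeated reads keep failing
drive.write(3, b"x" * 512)       # e.g. RAID writing back reconstructed data
print(drive.read(3) == b"x" * 512, drive.remapped)
```

In this model, reading never repairs anything; only a write supplies the contents that the drive needs before it can remap the sector.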

If the read error is correctable using ECC, the drive does know the proper contents for the sector and could choose to re-map it, but likely will not do so.
This could be done without reporting the error to the host system.  
It is my understanding that ECC errors detected within the drive are not at all uncommon.
If the ECC can correct the error, the valid data is typically returned, and the drive moves on to the next request.
If ECC cannot correct the error, the first thing the drive will do is attempt to re-read the media.  
If it is able to read the data the next time, even if it had to use ECC to correct it, it will still return the valid data and may move on to the next request.
At the file system level, a slow read would be observed, not a read error.

The behavior of the drive firmware is vendor specific.  Sometimes it is configurable.  The behavior of the firmware will vary across different classes and generations of drives even from the same vendor.
Drive firmware that makes up its own data and remaps the sector to correct a read error should never be sold by a reputable drive vendor.

The origin of the bad block table in the file system pre-dates drive hardware sector re-mapping.   
When it was not likely that writing to a sector whose contents were previously unreadable would result in being able to read that sector back again, it was a good idea not to write anything there in the future.
With modern drive technologies, it is likely that a write to a previously unreadable sector will result in being able to read back the newly written data.   The value of a bad block map in the file system is now minimized.
In addition, with hardware and software RAID technology now available to everyone, many volumes will never in their lifetime return a single read error to a file system.
Errors are hidden and corrected at lower levels.  The file system observes perfect media, or in catastrophic failure of a RAID system, media offline.

I recommend assuming modern storage devices and subsystems, and focusing development efforts on file system issues that remain.

Thanks,
Nick Martin

-----Original Message-----
From: linux-nilfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org [mailto:linux-nilfs-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org] On Behalf Of Ryusuke Konishi
Sent: Tuesday, July 24, 2012 11:47 AM
To: dexen deVries; Vyacheslav Dubeyko
Cc: linux-nilfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: read error on superblock

On Tue, 24 Jul 2012 09:52:18 +0200, dexen deVries wrote:
> Hi Vyacheslav,
> 
> 
> On Tuesday 24 of July 2012 10:26:37 you wrote:
> > I am afraid that it is not so good from the end user point of view.
> > 
> > First of all, the message "mount: /dev/sda3: can't read superblock" 
> > can confuse user. The reason is bad sectors inside the volume but 
> > user is informed about impossibility to read superblock.
> > 
> > Secondly, it is possible situation when it really needs to use a 
> > volume in the case of presence of bad sectors. And I think that 
> > users can expect such NILFS behavior because of declared reliability.
> > 
> > Unfortunately, as I can understand, NILFS hasn't bad blocks table 
> > and can't process situation of bad blocks presence on volume 
> > correctly. It means that NILFS interprets bad blocks as exceptional 
> > case. But from my point of view, it makes sense to interpret bad 
> > blocks as usual thing and try to work in the presence of ones. For 
> > example, fsck potentially can check NILFS volume on bad blocks 
> > presence, construct bad blocks table and save it on the volume.

NILFS doesn't have a sector-based bad blocks table, but it has an error flag in the segment usage file (sufile).  If a segment is marked 'erroneous', it will not be allocated.

At present, this doesn't work together with badblocks (mkfs.nilfs2), nor the recovery logic.  However it is applicable for this purpose if needed.
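
A minimal sketch of how such a flag could be used, written in Python as a toy model. This is not the nilfs2 kernel code or on-disk format: SegmentUsage, allocate_segment, and mark_segments_for_bad_sectors are hypothetical names, and the sector-to-segment mapping ignores any offset of the first segment.

```python
from dataclasses import dataclass


@dataclass
class SegmentUsage:
    """Toy stand-in for one sufile entry (hypothetical fields)."""
    dirty: bool = False  # segment already holds data
    error: bool = False  # segment is marked 'erroneous'


def mark_segments_for_bad_sectors(sufile, bad_sectors, sectors_per_segment):
    """Hypothetical glue: flag every segment that contains a bad sector."""
    for sector in bad_sectors:
        sufile[sector // sectors_per_segment].error = True


def allocate_segment(sufile):
    """Return the first clean, error-free segment number, or None if full."""
    for segnum, usage in enumerate(sufile):
        if usage.error or usage.dirty:
            continue  # erroneous segments are never handed out
        usage.dirty = True
        return segnum
    return None
```

The bad sector list here could come from a tool such as badblocks, but as said above that link does not exist at present.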

> > I suggest to add "virtual" special file for bad blocks description. 
> > It can be described by inode in ifile and all bad blocks can be 
> > described in DAT file as parts of this "virtual" special file. So, 
> > as a result, NILFS file system driver will have bad blocks table 
> > which can be a basis for excluding bad blocks from operation and 
> > trying to survive in the not good device environment.
> > 
> > What do you think about such idea?
> 
> I believe bad sectors to be thing of the past mostly; any decent 
> harddrive (probably also any decent SSD) should re-map them after some 
> re-reads. Some data & meta-data loss is possible, but overall the FS 
> should be accessible again.

I agree with this opinion.

If the sector-based bad blocks table is sorely-needed, it is worth considering, but at least it should be optional and not mandatory.

But even if it is well implemented as an option, it still looks like overkill, because most recent hard drives have alternate sectors internally and most recent flash-based drives have their own remap mechanism.

Moreover, how the device becomes corrupted depends heavily on the nature and configuration of the underlying block device.  In this sense, an in-device or in-driver solution looks better to me.

The bad blocks table is about to become a thing of the past; it is almost a relic of the floppy drive era.

> I have no idea why my particular HDD did not re-map; perhaps it just 
> takes much longer than I gave it.
> 
> As a point of reference, XFS does not do bad block management either; 
> however, the partition driver of IRIX does bad sector management -- so 
> it is implemented one layer below the FS.

Yes.  If we implement some kind of redundancy mechanism in the FS layer, it absolutely should reflect how data integrity should be enhanced in the FS layer.

With regards,
Ryusuke Konishi

> I guess it /may be/ possible to use Linux' `dm' driver in such manner.
> 
> 
> Cheers,
> --
> dexen deVries
> 
> [[[↓][→]]]
> 
> "all dichotomies are either true or false" is a true paradox because 
> it's paradoxical only if it is a paradox ;)

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-07-24 20:05 UTC | newest]

Thread overview: 14+ messages
2012-07-23  8:45 read error on superblock dexen deVries
2012-07-23  9:17 ` Vyacheslav Dubeyko
2012-07-23  9:24   ` dexen deVries
2012-07-23  9:37     ` Vyacheslav Dubeyko
2012-07-23  9:39   ` Ryusuke Konishi
     [not found]     ` <20120723.183907.154986664.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2012-07-23  9:42       ` dexen deVries
2012-07-23 11:06       ` dexen deVries
2012-07-23 11:19         ` Ryusuke Konishi
     [not found]           ` <20120723.201918.94868195.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2012-07-23 18:24             ` dexen deVries
2012-07-24  0:06               ` Ryusuke Konishi
     [not found]                 ` <20120724.090604.40913934.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2012-07-24  6:26                   ` Vyacheslav Dubeyko
2012-07-24  7:52                     ` dexen deVries
2012-07-24 16:46                       ` Ryusuke Konishi
     [not found]                         ` <20120725.014653.38326039.konishi.ryusuke-Zyj7fXuS5i5L9jVzuh4AOg@public.gmane.org>
2012-07-24 20:05                           ` Nick Martin
