Re: [RFD] FS behavior (I/O failure) in kernel summit

linux-fsdevel.vger.kernel.org archive mirror

* Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 19:53 [RFD] FS behavior (I/O failure) in kernel summit fs
@ 2005-06-13 17:59 ` Hans Reiser
  2005-06-13 20:13   ` [Ext2-devel] " Andreas Dilger
  2005-06-13 21:51   ` Jeff Mahoney
  2005-06-14 13:22 ` Dave Kleikamp
  1 sibling, 2 replies; 28+ messages in thread
From: Hans Reiser @ 2005-06-13 17:59 UTC (permalink / raw)
  To: fs
  Cc: Linus Torvalds, Andrew Morton, viro VFS, linux-fsdevel,
	linux-kernel, zhiming, qufuping, madsys, xuh, koichi, kuroiwaj,
	okuyama, matsui_v, kikuchi_v, fernando, kskmori, takenakak,
	yamaguchi, ext2-devel, sct, shaggy, xfs-masters,
	Reiserfs developers mail-list

If you write a patch to implement 1a and 3a for reiserfs and reiser4 I
will accept them.  2a is too vague for me to support --- I can only
answer the question of whether error conditions are fs independent when
it is regarding specified error conditions.  I suspect there are times
when it needs to be fs dependent, but only a comprehensive review could
answer that.

Thanks for your analysis.

Hans

fs wrote:

>Dear Linus, Andrew Morton, and all FS maintainers,
>
>    I've posted email before, but received no response. So I send
>another email in the hope of getting feedback from the community.
>    From the HA application developer's perspective, we want a 
>robust, stable, fast-error-responsive kernel. But the file system
>seems to be a disappointment. 
>  
>  We want to make things clear:
>
>1) When I/O failure occurs(e.g.: unrecoverable media failure - USB
>unplug), FS should
>   a. shutdown the FS right now(XFS does this)
>   b. try to make the media serve as long as possible(EXT3 remounts 
>      read-only, cache is still valid for read)
>   c. do not care, just print some kernel debugging info(EXT2 JFS 
>      ReiserFS)
>
>2) When I/O failure occurs, FS should
>   a. give a unified error
>   b. give errors according to the FS type
>
>3) the returned errno should be
>   a. real cause of failure, e.g. USB unplug returns EIO
>   b. cause from FS, e.g. USB unplug made FS remount read-only,
>      so open(O_RDONLY) returns ENOENT while open(O_RDWR) returns
>      EROFS
>   c. errno means nothing, you already get -1, that's enough
>
>    Unfortunately, recent kernel FSes give mixed answers to the above
>questions. As an end user/developer, this is really BAD! Also, there's
>no correspondent docs/standard, 'de facto' standard varies for different
>people.
>
>    So, we propose 1)a 2)a 3)a as the right behavior. We really hope FS
>maintainers can give us a unified answer on this issue, or AT LEAST 
>positive feedback. If possible, have a discussion in the Kernel Summit.
>
>P.S.: DOUBT has released test results for linux, Solaris, WinXP sp2.
>      Refer to it, then you can know how we feel as a developer.
>
>  
>
>>    I'm taking part in the project DOUBT[1], and my sub-project
>>focuses on the consistency and coherency of FS[2].
>>    Several days ago, I posted a thread, titled "[RFD] What error
>>should
>>FS return when I/O failure occurs?"[3] The purpose of this RFD, is to 
>>see whether the community has agreed on this subject. Unfortunately,
>>NO!
>>
>>    From my test results in [2], we can see different FS returns
>>different error, or even no error. The community has several points,
>>A) some results are caused by bugs, some are correct, some are
>>   implementation compromise. errno is passed to VFS from lower layer,
>>   no need to supply unified error type. User applications should
>>   handle every error type or glibc can convert the types to specified
>>   error type.
>>B) the kernel should give unified error(i.e. errno should be the same
>>   for each FS, and give the correct meaning). User applicatons should
>>   handle specified error type/types.
>>C) the errno that user gets can't provide enough info, so, there's no
>>   need to tell. User application gets -1 from I/O syscalls, that's
>>   enough, don't use errno. If user really have special needs, the 
>>   kernel should use special mechanism to achieve the goal, e.g. add 
>>   new functions to device drivers.
>>D) ...
>>
>>    From the user's perspective, B) seems to be the best, especially
>>for HA purpose. But till now, we can't find any standards or
>>constraints, so each FS implementaion uses 'de facto' way to return
>>errno. This makes users confused. 
>>    So, would you please have a discussion about this issue in Kernel
>>Summit (June 11-18)? If yes, we users are really thankful for this
>>discussion,so we can know how linux is designed for I/O error handling
>>about FS; if not, that means errno is FS implementation dependent, we
>>have to test our app for each FS. :(
>>
>>P.S.: During the presentation of Kenichi Okuyama in Paris, Windows
>>seems to detect every I/O failure immediately, even for async writes.
>>This shows how proprietary software handles I/O failure.
>>
>>[1] http://developer.osdl.jp/projects/doubt/
>>[2]
>>http://developer.osdl.jp/projects/doubt/fs-consistency-and-coherency/index.html
>>[3] http://www.ussg.iu.edu/hypermail/linux/kernel/0505.2/0006.html
>>
>>    
>>
>
>regards,
>----
>Qu Fuping
>
>



* [RFD] FS behavior (I/O failure) in kernel summit
@ 2005-06-13 19:53 fs
  2005-06-13 17:59 ` Hans Reiser
  2005-06-14 13:22 ` Dave Kleikamp
  0 siblings, 2 replies; 28+ messages in thread
From: fs @ 2005-06-13 19:53 UTC (permalink / raw)
  To: Linus Torvalds, Andrew Morton
  Cc: viro VFS, linux-fsdevel, linux-kernel, zhiming, qufuping, madsys,
	xuh, koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v, fernando,
	kskmori, takenakak, yamaguchi, ext2-devel, sct, shaggy,
	xfs-masters, reiser

Dear Linus, Andrew Morton, and all FS maintainers,

    I have posted about this before but received no response, so I am
sending another email in the hope of getting feedback from the community.
    From an HA application developer's perspective, we want a robust,
stable kernel that reports errors quickly. But the file system layer
seems to be a disappointment.
  
  We want to make things clear:

1) When an I/O failure occurs (e.g. an unrecoverable media failure such
as a USB unplug), the FS should
   a. shut down the FS immediately (XFS does this)
   b. keep the media in service as long as possible (EXT3 remounts
      read-only; the cache is still valid for reads)
   c. not care, and just print some kernel debugging info (EXT2, JFS,
      ReiserFS)

2) When an I/O failure occurs, the FS should
   a. give a unified error
   b. give errors according to the FS type

3) The returned errno should be
   a. the real cause of the failure, e.g. a USB unplug returns EIO
   b. the cause as seen by the FS, e.g. a USB unplug made the FS remount
      read-only, so open(O_RDONLY) returns ENOENT while open(O_RDWR)
      returns EROFS
   c. meaningless; you already get -1, and that's enough
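
To make the choice in 3) concrete, here is a minimal Python sketch of an
application telling these errno values apart. It is only an illustration
under assumptions: classify_io_error, try_open and the category strings
are hypothetical names, not part of any kernel or DOUBT interface, and
which errno a given FS really returns after a failure is exactly the
open question.

```python
import errno
import os

def classify_io_error(exc: OSError) -> str:
    """Map an OSError to a coarse category (category names are hypothetical)."""
    if exc.errno == errno.EIO:
        return "device-failure"   # 3a: the real cause, e.g. the media is gone
    if exc.errno == errno.EROFS:
        return "read-only-fs"     # 3b: the FS reacted by going read-only
    if exc.errno == errno.ENOENT:
        return "missing-path"     # 3b: the lookup itself failed
    return "other"

def try_open(path: str, flags: int) -> str:
    """Open and immediately close path; return "ok" or the failure category."""
    try:
        fd = os.open(path, flags)
    except OSError as exc:
        return classify_io_error(exc)
    os.close(fd)
    return "ok"
```

With behavior 3a the application would see only the first category after
an unplug; with 3b the same event could show up as either of the other
two, depending on the open flags.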

    Unfortunately, the FSes in recent kernels give mixed answers to the
above questions. For an end user/developer, this is really BAD! Also,
there are no corresponding docs or standards, and the 'de facto'
standard varies from person to person.

    So, we propose 1)a 2)a 3)a as the right behavior. We really hope the
FS maintainers can give us a unified answer on this issue, or AT LEAST
positive feedback. If possible, please discuss it at the Kernel Summit.

P.S.: DOUBT has released test results for Linux, Solaris, and WinXP SP2.
      Please refer to them to see how this looks to a developer.

>     I'm taking part in the project DOUBT[1], and my sub-project
> focuses on the consistency and coherency of FS[2].
>     Several days ago, I posted a thread titled "[RFD] What error
> should FS return when I/O failure occurs?"[3] The purpose of that RFD
> was to see whether the community has agreed on this subject.
> Unfortunately, NO!
> 
>     From my test results in [2], we can see that different FSes return
> different errors, or even no error. The community has several views:
> A) some results are caused by bugs, some are correct, and some are
>    implementation compromises. errno is passed to the VFS from the
>    lower layer, so there is no need to supply a unified error type.
>    User applications should handle every error type, or glibc can
>    convert the types to a specified error type.
> B) the kernel should give a unified error (i.e. errno should be the
>    same for each FS, and carry the correct meaning). User applications
>    should handle the specified error type or types.
> C) the errno that the user gets can't provide enough info, so there's
>    no need to report it. User applications get -1 from I/O syscalls;
>    that's enough, so don't use errno. If users really have special
>    needs, the kernel should use a special mechanism to achieve the
>    goal, e.g. add new functions to device drivers.
> D) ...
> 
>     From the user's perspective, B) seems to be the best, especially
> for HA purposes. But so far we can't find any standards or
> constraints, so each FS implementation uses its own 'de facto' way to
> return errno. This confuses users.
>     So, would you please discuss this issue at the Kernel Summit
> (June 11-18)? If yes, we users would be really thankful, since we would
> then know how Linux is designed to handle FS I/O errors; if not, that
> means errno is FS implementation dependent, and we have to test our
> apps on each FS. :(
> 
> P.S.: During the presentation of Kenichi Okuyama in Paris, Windows
> seems to detect every I/O failure immediately, even for async writes.
> This shows how proprietary software handles I/O failure.
> 
> [1] http://developer.osdl.jp/projects/doubt/
> [2]
> http://developer.osdl.jp/projects/doubt/fs-consistency-and-coherency/index.html
> [3] http://www.ussg.iu.edu/hypermail/linux/kernel/0505.2/0006.html
> 

regards,
----
Qu Fuping




* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 17:59 ` Hans Reiser
@ 2005-06-13 20:13   ` Andreas Dilger
  2005-06-13 23:56     ` Hans Reiser
                       ` (2 more replies)
  2005-06-13 21:51   ` Jeff Mahoney
  1 sibling, 3 replies; 28+ messages in thread
From: Andreas Dilger @ 2005-06-13 20:13 UTC (permalink / raw)
  To: Hans Reiser
  Cc: fs, linux-fsdevel, linux-kernel, zhiming, qufuping, madsys, xuh,
	koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v, fernando, kskmori,
	takenakak, yamaguchi, ext2-devel, shaggy, xfs-masters,
	Reiserfs developers mail-list

On Jun 13, 2005  10:59 -0700, Hans Reiser wrote:
> If you write a patch to implement 1a and 3a for reiserfs and reiser4 I
> will accept them.  2a is too vague for me to support --- I can only
> answer the question of whether error conditions are fs independent when
> it is regarding specified error conditions.  I suspect there are times
> when it needs to be fs dependent, but only a comprehensive review could
> answer to that.

Hans, it would probably be preferable to get ext2-like behaviour where
the action is configurable (see below).  I personally would be annoyed
if my workstation rebooted because of a read error from the disk.
Better to mark the filesystem read-only on error and continue to allow
users to read from the rest of the filesystem than to just reboot the
node.
That is my experience in any case.  For those systems where there is
e.g. an HA server with dual-channel disk it might be better to reboot
and failover to another server, but even that isn't clear as a real
media error will just cause both nodes to reboot endlessly instead of
providing the best service they can.

> fs wrote:
> >Dear Linus, Andrew Morton, and all FS maintainers,
> >1) When I/O failure occurs(e.g.: unrecoverable media failure - USB
> >unplug), FS should
> >   a. shutdown the FS right now(XFS does this)
> >   b. try to make the media serve as long as possible(EXT3 remounts 
> >      read-only, cache is still valid for read)
> >   c. do not care, just print some kernel debugging info(EXT2 JFS 
> >      ReiserFS)

Actually, 1b is just the default behaviour for ext3 (because of journal
errors).  It is also possible to mount the filesystem with errors=panic,
which will implement 1a, and it is also possible to mount ext2 with
errors=remount-ro (which is default on Debian for ext2) which implements
1b.  I don't think it is possible to get 1c behaviour for journal
errors on ext3.
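
As a rough illustration of checking which policy a mount was given, the
Python sketch below pulls the errors= value out of text in the
/proc/mounts format. It rests on assumptions: errors_policy is a
hypothetical helper, the sample lines are made up, and whether a kernel
lists the option at all (especially when it is the default) is not
guaranteed, so a result of None only means "not shown".

```python
def errors_policy(mounts_text, mount_point):
    """Return the errors= value listed for mount_point, or None if not shown."""
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        result = None  # a later entry for the same mount point wins
        for opt in fields[3].split(","):
            if opt.startswith("errors="):
                result = opt[len("errors="):]
    return result

# Made-up sample in /proc/mounts format.
SAMPLE = (
    "/dev/sda1 / ext3 rw,errors=remount-ro 0 0\n"
    "/dev/sdb1 /data ext2 rw,errors=panic 0 0\n"
    "/dev/sdc1 /mnt xfs rw 0 0\n"
)
```

On a live system the same function could be fed the contents of
/proc/mounts instead of SAMPLE.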

> >2) When I/O failure occurs, FS should
> >   a. give a unified error
> >   b. give errors according to the FS type

What is "unified error"?  Does this mean "-EIO" for all cases?  I also
don't understand why this is so important to your application...  If
you get an error back from the filesystem that isn't expected, that is
generally a problem regardless of what the error is...

> >3) the returned errno should be
> >   a. real cause of failure, e.g. USB unplug returns EIO
> >   b. cause from FS, e.g. USB unplug made FS remount read-only,
> >      so open(O_RDONLY) returns ENOENT while open(O_RDWR) returns
> >      EROFS
> >   c. errno means nothing, you already get -1, that's enough

This doesn't make sense.  If the "real cause of failure" is that the
journal code detected an inconsistency (it might not be an IO error at
the time, just some structure that is not what it should be, maybe the
user tried to format their partition while in use ;-) then the real
error is that the journal turned the filesystem read-only.  In any case,
you can't expect to get more information than "EIO", regardless of the
root cause (e.g. ENOMEM causes async buffer read to not complete, caller
checks buffer_uptodate() and it isn't uptodate, returns EIO).

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.


* Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 17:59 ` Hans Reiser
  2005-06-13 20:13   ` [Ext2-devel] " Andreas Dilger
@ 2005-06-13 21:51   ` Jeff Mahoney
  2005-06-14  0:03     ` Hans Reiser
  1 sibling, 1 reply; 28+ messages in thread
From: Jeff Mahoney @ 2005-06-13 21:51 UTC (permalink / raw)
  To: Hans Reiser
  Cc: fs, Linus Torvalds, Andrew Morton, viro VFS, linux-fsdevel,
	linux-kernel, zhiming, qufuping, madsys, xuh, koichi, kuroiwaj,
	okuyama, matsui_v, kikuchi_v, fernando, kskmori, takenakak,
	yamaguchi, ext2-devel, sct, shaggy, xfs-masters,
	Reiserfs developers mail-list


Hans Reiser wrote:
> fs wrote:
> 
>>Dear Linus, Andrew Morton, and all FS maintainers,
>>
>>   I've posted email before, but received no response. So I send
>>another email in the hope of getting feedback from the community.
>>   From the HA application developer's perspective, we want a 
>>robust, stable, fast-error-responsive kernel. But the file system
>>seems to be a disappointment. 
>> 
>> We want to make things clear:
>>
>>1) When I/O failure occurs(e.g.: unrecoverable media failure - USB
>>unplug), FS should
>>  a. shutdown the FS right now(XFS does this)
>>  b. try to make the media serve as long as possible(EXT3 remounts 
>>     read-only, cache is still valid for read)
>>  c. do not care, just print some kernel debugging info(EXT2 JFS 
>>     ReiserFS)
>>
>>2) When I/O failure occurs, FS should
>>  a. give a unified error
>>  b. give errors according to the FS type
>>
>>3) the returned errno should be
>>  a. real cause of failure, e.g. USB unplug returns EIO
>>  b. cause from FS, e.g. USB unplug made FS remount read-only,
>>     so open(O_RDONLY) returns ENOENT while open(O_RDWR) returns
>>     EROFS
>>  c. errno means nothing, you already get -1, that's enough
>>
>>   Unfortunately, recent kernel FSes give mixed answers to the above
>>questions. As an end user/developer, this is really BAD! Also, there's
>>no correspondent docs/standard, 'de facto' standard varies for different
>>people.
>>
> If you write a patch to implement 1a and 3a for reiserfs and reiser4 I
> will accept them.  2a is too vague for me to support --- I can only
> answer the question of whether error conditions are fs independent when
> it is regarding specified error conditions.  I suspect there are times
> when it needs to be fs dependent, but only a comprehensive review could
> answer to that.

[quote repositioned so it's not top-posted]

Hans -

These tests must have been run on a kernel prior to 2.6.10-rc1. The I/O
error code exhibits behavior similar to ext3, so (1b). There are still
kinks to be worked out, but it's definitely not the "throw up our arms
and give up" that it used to be.

Implementing behavior 1a for ext3 and reiserfs should be fairly trivial
- it just means that tests to check if the filesystem is in an aborted
state ("shutdown" in xfs terms) need to be added to the call path in
some places, and be moved earlier in others.
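
The difference between the three behaviors can be sketched with a toy
in-memory model in Python. This is not kernel code and not how ext3,
reiserfs or xfs are implemented; ToyFS, its policy strings and its
methods are hypothetical names chosen only to show where an "aborted"
check sits in the call path.

```python
import errno

class ToyFS:
    """Toy model. policy: "shutdown" (1a), "remount-ro" (1b) or "ignore" (1c)."""

    def __init__(self, policy):
        self.policy = policy
        self.aborted = False
        self.read_only = False
        self.cache = {}

    def device_failed(self):
        """Record an unrecoverable I/O failure according to the policy."""
        if self.policy == "shutdown":
            self.aborted = True
        elif self.policy == "remount-ro":
            self.read_only = True
        # "ignore": a real FS would only log a message; state is unchanged

    def write(self, name, data):
        if self.aborted:      # 1a: checked first, on every entry point
            raise OSError(errno.EIO, "filesystem is shut down")
        if self.read_only:    # 1b: only modifications are refused
            raise OSError(errno.EROFS, "filesystem is read-only")
        self.cache[name] = data

    def read(self, name):
        if self.aborted:
            raise OSError(errno.EIO, "filesystem is shut down")
        if name not in self.cache:
            raise OSError(errno.ENOENT, "no such file")
        return self.cache[name]   # 1b and 1c still serve cached data
```

In this model, 1a fails every call with EIO after the failure, 1b
refuses writes with EROFS but keeps serving cached reads, and 1c lets
writes appear to succeed.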

-Jeff

--
Jeff Mahoney
SuSE Labs

* Re: Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 20:13   ` [Ext2-devel] " Andreas Dilger
@ 2005-06-13 23:56     ` Hans Reiser
  2005-06-14  2:46       ` Kenichi Okuyama
  2005-06-14 12:51       ` Erik Mouw
  2005-06-14  3:46     ` Valdis.Kletnieks
  2005-06-14 17:41     ` [Ext2-devel] " fs
  2 siblings, 2 replies; 28+ messages in thread
From: Hans Reiser @ 2005-06-13 23:56 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: fs, linux-fsdevel, linux-kernel, zhiming, qufuping, madsys, xuh,
	koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v, fernando, kskmori,
	takenakak, yamaguchi, ext2-devel, shaggy, xfs-masters,
	Reiserfs developers mail-list

Andreas Dilger wrote:

>On Jun 13, 2005  10:59 -0700, Hans Reiser wrote:
>  
>
>>If you write a patch to implement 1a and 3a for reiserfs and reiser4 I
>>will accept them.  2a is too vague for me to support --- I can only
>>answer the question of whether error conditions are fs independent when
>>it is regarding specified error conditions.  I suspect there are times
>>when it needs to be fs dependent, but only a comprehensive review could
>>answer to that.
>>    
>>
>
>Hans, it would probably be preferrable to get ext2-like behaviour where
>action is configurable (see below),
>

> I personally would be annoyed if my
>workstation rebooted if there is a read error from the disk.
>  
>
My concern is that real users don't read their logs and won't notice
that a disk is going bad, and there is no effective method for the
kernel notifying userspace of an error requiring user attention.

However given the existence of USB drives and CDROMs with scratches I
concede the point.

>Better to mark filesystem read-only on error and continue to allow
>users to read from rest of filesystem than to just reboot the node.
>That is my experience in any case.  For those systems where there is
>e.g. an HA server with dual-channel disk it might be better to reboot
>and failover to another server, but even that isn't clear as a real
>media error will just cause both nodes to reboot endlessly instead of
>providing the best service they can.
>
>  
>
>>fs wrote:
>>    
>>
>>>Dear Linus, Andrew Morton, and all FS maintainers,
>>>1) When I/O failure occurs(e.g.: unrecoverable media failure - USB
>>>unplug), FS should
>>>  a. shutdown the FS right now(XFS does this)
>>>  b. try to make the media serve as long as possible(EXT3 remounts 
>>>     read-only, cache is still valid for read)
>>>  c. do not care, just print some kernel debugging info(EXT2 JFS 
>>>     ReiserFS)
>>>      
>>>
>
>Actually, 1b is just the default behaviour for ext3 (because of journal
>errors).  It is also possible to mount the filesystem with errors=panic,
>which will implement 1a, and it is also possible to mount ext2 with
>errors=remount-ro (which is default on Debian for ext2) which implements
>1b.  I don't think it is possible to get 1c behaviour for journal
>errors on ext3.
>
>  
>
>>>2) When I/O failure occurs, FS should
>>>  a. give a unified error
>>>  b. give errors according to the FS type
>>>      
>>>
>
>What is "unified error"?  Does this mean "-EIO" for all cases?  I also
>don't understand why this is so important to your application...  If
>you get an error back from the filesystem that isn't expected, that is
>generally a problem regardless of what the error is...
>
>  
>
>>>3) the returned errno should be
>>>  a. real cause of failure, e.g. USB unplug returns EIO
>>>  b. cause from FS, e.g. USB unplug made FS remount read-only,
>>>     so open(O_RDONLY) returns ENOENT while open(O_RDWR) returns
>>>     EROFS
>>>  c. errno means nothing, you already get -1, that's enough
>>>      
>>>
>
>This doesn't make sense.  If the "real cause of failure" is that the
>journal code detected an inconsistency (it might not be an IO error at
>the time, just some structure that is not what it should be, maybe the
>user tried to format their partition while in use ;-) then the real
>error is that the journal turned the filesystem read-only.  In any case,
>you can't expect to get more information that "EIO", regardless of the
>root cause (e.g. ENOMEM causes async buffer read to not complete, caller
>checks buffer_uptodate() and it isn't uptodate, returns EIO).
>  
>
Well, maybe we should fix this. Or at least be open to his writing a
patch to fix it.

EIO is simply not enough information, don't you agree? I mean, if the
USB drive got unplugged, for us to say IO error rather than "hey you,
where'd the USB drive go? Plug it back in, or I can't do nothing!" and
to distinguish it from some other complex error due to software bugs in
the filesystem is to fail to understand the information needs of the
seven year old using the laptop. The seven year old probably can't cope
with debugging the filesystem's software error, but plugging the USB
drive back in he can do....

>Cheers, Andreas
>--
>Andreas Dilger
>Principal Software Engineer
>Cluster File Systems, Inc.
>
>
>
>  
>




* Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 21:51   ` Jeff Mahoney
@ 2005-06-14  0:03     ` Hans Reiser
  2005-06-15 17:39       ` Jeff Mahoney
  0 siblings, 1 reply; 28+ messages in thread
From: Hans Reiser @ 2005-06-14  0:03 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: fs, Linus Torvalds, Andrew Morton, viro VFS, linux-fsdevel,
	linux-kernel, zhiming, qufuping, madsys, xuh, koichi, kuroiwaj,
	okuyama, matsui_v, kikuchi_v, fernando, kskmori, takenakak,
	yamaguchi, ext2-devel, sct, shaggy, xfs-masters,
	Reiserfs developers mail-list

Jeff, would you be willing to make a proposal for what should be done? 
I would be interested in your suggestions.

Jeff Mahoney wrote:

>
> Hans -
>
> These tests must have been run on a kernel prior to 2.6.10-rc1. The I/O
> error code exhibits behavior similar to ext3, so (1b). There are still
> kinks to be worked out, but it's definitely not the "throw up our arms
> and give up" that it used to be.
>
> Implementing behavior 1a for ext3 and reiserfs should be fairly trivial
> - it just means that tests to check if the filesystem is in an aborted
> state ("shutdown" in xfs terms) need to added to the call path in some
> places, and be moved earlier in others.
>
> -Jeff
>
> --
> Jeff Mahoney
> SuSE Labs




* Re: Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 23:56     ` Hans Reiser
@ 2005-06-14  2:46       ` Kenichi Okuyama
  2005-06-15 14:01         ` [Ext2-devel] " Theodore Ts'o
  2005-06-14 12:51       ` Erik Mouw
  1 sibling, 1 reply; 28+ messages in thread
From: Kenichi Okuyama @ 2005-06-14  2:46 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Andreas Dilger, fs, linux-fsdevel, linux-kernel, zhiming,
	qufuping, madsys, xuh, koichi, kuroiwaj, okuyama, matsui_v,
	kikuchi_v, fernando, kskmori, takenakak, yamaguchi, ext2-devel,
	shaggy, xfs-masters, Reiserfs developers mail-list

Dear Dr. Reiser, and Mr. Dilger,

Hans Reiser wrote:

>Andreas Dilger wrote:
>
>  
>
>>On Jun 13, 2005  10:59 -0700, Hans Reiser wrote:
>> 
>>
>>    
>>
>>>If you write a patch to implement 1a and 3a for reiserfs and reiser4 I
>>>will accept them.  2a is too vague for me to support --- I can only
>>>answer the question of whether error conditions are fs independent when
>>>it is regarding specified error conditions.  I suspect there are times
>>>when it needs to be fs dependent, but only a comprehensive review could
>>>answer to that.
>>>   
>>>
>>>      
>>>
>>Hans, it would probably be preferrable to get ext2-like behaviour where
>>action is configurable (see below),
>>
>>    
>>
>
>  
>
>>I personally would be annoyed if my
>>workstation rebooted if there is a read error from the disk.
>> 
>>
>>    
>>
>My concern is that real users don't read their logs and won't notice
>that a disk is going bad, and there is no effective method for the
>kernel notifying userspace of an error requiring user attention.
>
>However given the existence of USB drives and CDROMs with scratches I
>concede the point.
>  
>
I agree that the kernel cannot directly influence the user.
But an application may have a better chance.

Think about the case of an editor (vi, emacs, almost any text editor
will do).

If you try to save a file and receive no error, you will believe the
data has been written to a disk that you believe still exists.
Even if the log is full of errors, the user will not notice.

If the editor receives an error, then the user knows something is
wrong. He may still be puzzled, but if he receives a message like
"Input Output Error: may be HW error?", he will definitely start by
looking at the cable.
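
The editor case can be sketched in Python as below. This is a minimal
illustration, not taken from any real editor: save_file and
save_and_report are hypothetical names, and the sketch assumes that the
kernel actually reports the failure through write() or fsync(), which is
the behavior being asked for in this thread.

```python
import os

def save_file(path, data):
    """Write data to path and flush it to the device; raise OSError on failure."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:                      # os.write may write only part of the data
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                     # without this, a late I/O error can go unseen
    finally:
        os.close(fd)

def save_and_report(path, data):
    """Return a message the editor could show to the user."""
    try:
        save_file(path, data)
    except OSError as exc:
        return f"save failed: {os.strerror(exc.errno)} (errno {exc.errno})"
    return "saved"
```

The point of the fsync() call is that a plain write() may succeed into
the cache, so the error would otherwise surface only in the log.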

>  
>
>>Better to mark filesystem read-only on error and continue to allow
>>users to read from rest of filesystem than to just reboot the node.
>>That is my experience in any case.  For those systems where there is
>>e.g. an HA server with dual-channel disk it might be better to reboot
>>and failover to another server, but even that isn't clear as a real
>>media error will just cause both nodes to reboot endlessly instead of
>>providing the best service they can.
>>
>> 
>>
>>    
>>
>>>fs wrote:
>>>   
>>>
>>>      
>>>
>>>>Dear Linus, Andrew Morton, and all FS maintainers,
>>>>1) When I/O failure occurs(e.g.: unrecoverable media failure - USB
>>>>unplug), FS should
>>>> a. shutdown the FS right now(XFS does this)
>>>> b. try to make the media serve as long as possible(EXT3 remounts 
>>>>    read-only, cache is still valid for read)
>>>> c. do not care, just print some kernel debugging info(EXT2 JFS 
>>>>    ReiserFS)
>>>>     
>>>>
>>>>        
>>>>
>>Actually, 1b is just the default behaviour for ext3 (because of journal
>>errors).  It is also possible to mount the filesystem with errors=panic,
>>which will implement 1a, and it is also possible to mount ext2 with
>>errors=remount-ro (which is default on Debian for ext2) which implements
>>1b.  I don't think it is possible to get 1c behaviour for journal
>>errors on ext3.
>>
>> 
>>
>>    
>>
>>>>2) When I/O failure occurs, FS should
>>>> a. give a unified error
>>>> b. give errors according to the FS type
>>>>     
>>>>
>>>>        
>>>>
>>What is "unified error"?  Does this mean "-EIO" for all cases?  I also
>>don't understand why this is so important to your application...  If
>>you get an error back from the filesystem that isn't expected, that is
>>generally a problem regardless of what the error is...
>>
>> 
>>
>>    
>>
>>>>3) the returned errno should be
>>>> a. real cause of failure, e.g. USB unplug returns EIO
>>>> b. cause from FS, e.g. USB unplug made FS remount read-only,
>>>>    so open(O_RDONLY) returns ENOENT while open(O_RDWR) returns
>>>>    EROFS
>>>> c. errno means nothing, you already get -1, that's enough
>>>>     
>>>>
>>>>        
>>>>
>>This doesn't make sense.  If the "real cause of failure" is that the
>>journal code detected an inconsistency (it might not be an IO error at
>>the time, just some structure that is not what it should be, maybe the
>>user tried to format their partition while in use ;-) then the real
>>error is that the journal turned the filesystem read-only.  In any case,
>>you can't expect to get more information that "EIO", regardless of the
>>root cause (e.g. ENOMEM causes async buffer read to not complete, caller
>>checks buffer_uptodate() and it isn't uptodate, returns EIO).
>> 
>>
>>    
>>
>Well, maybe we should fix this. Or at least be open to his writing a
>patch to fix it.
>
>EIO is simply not enough information, don't you agree? i mean, if the
>USB drive got unplugged, for us to say IO error rather than "hey you,
>where'd the USB drive go? Plug it back in, or I can't do nothing!" and
>to distinguish it from some other complex error due to software bugs in
>the filesystem is to fail to understand the information needs of the
>seven year old using the laptop. The seven year old probably can't cope
>with debugging the filesystem's software error, but plugging the USB
>drive back in he can do....
>  
>

I do agree that EIO is not enough information. But it is far better
than nothing, or than an error like EROFS.
# Read only? That implies you can still READ the entire area the file
# system is serving, not only the cached area.

At least EIO tells the application that the failure is due to some
hardware problem, not software.

Also, I strongly disagree with the "wait till someone re-plugs the
cable" approach.

- How can you tell that the re-plugged device is the same HDD you
  unplugged?
- How can you tell that the user WILL re-plug it?

At the very least, when we think about what the word "cache" means, we
should not assume that the existing "cache image" is correct. A cache is
a copy of information whose consistency is UNDER CONTROL. When the cable
is unplugged, it means WE HAVE LOST CONTROL, and therefore we should
immediately discard everything related to the cache.

And at this moment, I don't see any reason why the file system can
continue its service, nor why it should.

So, I do think we need at least EIO. EIO could be classified in more
detail, and we do wish to have that information somehow, but just as
error handling starts from -1, EIO handling should start from
( errno == EIO ).


It might be a good idea to make this a mount option the user can
choose. But given the choice, I would definitely choose "immediate
service stop". Knowing that something went wrong is more important than
continuing with something we can't trust.


best regards,
----
Kenichi Okuyama@Project DOUBT




* Re: Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 20:13   ` [Ext2-devel] " Andreas Dilger
  2005-06-13 23:56     ` Hans Reiser
@ 2005-06-14  3:46     ` Valdis.Kletnieks
  2005-06-14 17:41     ` [Ext2-devel] " fs
  2 siblings, 0 replies; 28+ messages in thread
From: Valdis.Kletnieks @ 2005-06-14  3:46 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Hans Reiser, fs, linux-fsdevel, linux-kernel, zhiming, qufuping,
	madsys, xuh, koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v,
	fernando, kskmori, takenakak, yamaguchi, ext2-devel, shaggy,
	xfs-masters, Reiserfs developers mail-list

On Mon, 13 Jun 2005 16:13:15 EDT, Andreas Dilger said:
> > fs wrote:
> > >   c. do not care, just print some kernel debugging info(EXT2 JFS 
> > >      ReiserFS)

> 1b.  I don't think it is possible to get 1c behaviour for journal
> errors on ext3.

Are there any realistic cases where you'd *want* behavior 1c?

(The very idea makes me cringe - I had 2 different vendors' 4.3BSD-based
systems basically do 1c when a Fujitsu Super-Eagle went oxide plow - it merrily
went along all night dragging the crashed head hither and yon failing to write
into newly-destroyed blocks all over the disk....)



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 23:56     ` Hans Reiser
  2005-06-14  2:46       ` Kenichi Okuyama
@ 2005-06-14 12:51       ` Erik Mouw
  2005-06-14 13:48         ` Denis Vlasenko
  2005-06-14 17:16         ` Kenichi Okuyama
  1 sibling, 2 replies; 28+ messages in thread
From: Erik Mouw @ 2005-06-14 12:51 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Andreas Dilger, fs, linux-fsdevel, linux-kernel, zhiming,
	qufuping, madsys, xuh, koichi, kuroiwaj, okuyama, matsui_v,
	kikuchi_v, fernando, kskmori, takenakak, yamaguchi, ext2-devel,
	shaggy, xfs-masters, Reiserfs developers mail-list

On Mon, Jun 13, 2005 at 04:56:58PM -0700, Hans Reiser wrote:
> Andreas Dilger wrote:
> >Hans, it would probably be preferable to get ext2-like behaviour where
> >action is configurable (see below),
> >
> > I personally would be annoyed if my
> >workstation rebooted if there is a read error from the disk.
> >  
> My concern is that real users don't read their logs and won't notice
> that a disk is going bad, and there is no effective method for the
> kernel notifying userspace of an error requiring user attention.

Speaking from experience (not only by profession, but also as a real
user), you figure out pretty fast something is wrong with an Ext[23]
filesystem mounted with 'errors=remount-ro'. All kinds of file writes go
wrong and soon enough you figure out a hardware error is the problem.
Umount the filesystem, recover the filesystem image to a new drive,
fsck it, recover most of your data, and you're up and running again.

Reiserfs will just continue and only issue a few warnings in the log,
which in turn will not be read. Only after a few days, when things
have turned worse, will you figure out there's something wrong that
requires user attention. By that time, chances are that the single disk
error (be it hardware or software) has turned into multiple errors, which
can make you lose quite some data.

I don't want to discredit Reiserfs or Ext[23], but filesystems are not
well tested with real disk errors. In my experience a filesystem trying
to continue to use a faulty medium usually makes things worse and
decreases the probability of a successful recovery.

I'd rather have a filesystem which I can tell to warn me immediately
about a problem and not make things worse by trying to continue.
A mount option for Reiserfs like Andreas proposed would be a good idea.

Erik

-- 
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 19:53 [RFD] FS behavior (I/O failure) in kernel summit fs
  2005-06-13 17:59 ` Hans Reiser
@ 2005-06-14 13:22 ` Dave Kleikamp
  1 sibling, 0 replies; 28+ messages in thread
From: Dave Kleikamp @ 2005-06-14 13:22 UTC (permalink / raw)
  To: fs
  Cc: Linus Torvalds, Andrew Morton, viro VFS, linux-fsdevel,
	linux-kernel, zhiming, qufuping, madsys, xuh, koichi, kuroiwaj,
	okuyama, matsui_v, kikuchi_v, fernando, kskmori, takenakak,
	yamaguchi, ext2-devel, sct, xfs-masters, reiser

On Mon, 2005-06-13 at 15:53 -0400, fs wrote:

> 1) When I/O failure occurs(e.g.: unrecoverable media failure - USB
> unplug), FS should
>    a. shutdown the FS right now(XFS does this)
>    b. try to make the media serve as long as possible(EXT3 remounts 
>       read-only, cache is still valid for read)
>    c. do not care, just print some kernel debugging info(EXT2 JFS 
>       ReiserFS)

In practice, JFS will typically do b.  In some cases, an operation may
simply return -EIO (or not even that if the write is asynchronous), but
eventually, a failure to read or write metadata will lead to the file
system being mounted read-only.  Like ext2/3, this behavior is
configurable with the errors= mount option.

It's possible that JFS may behave like c for a short time, or if an I/O
error is isolated.

> 2) When I/O failure occurs, FS should
>    a. give a unified error
>    b. give errors according to the FS type
> 
> 3) the returned errno should be
>    a. real cause of failure, e.g. USB unplug returns EIO
>    b. cause from FS, e.g. USB unplug made FS remount read-only,
>       so open(O_RDONLY) returns ENOENT while open(O_RDWR) returns
>       EROFS
>    c. errno means nothing, you already get -1, that's enough

I'm not sure I understand the difference between 2) & 3).

If 1)b. applies, then 3)b. makes sense.  The initial error causes the
file system to be mounted read-only.  The original error is history, so
any additional errors must make sense in the current context.  Trying to
write to a read-only filesystem should return -EROFS.  Any new I/O
errors may return -EIO.  I'm not sure about -ENOENT, but it probably
makes sense from the context of the code returning the error.

>     Unfortunately, recent kernel FSes give mixed answers to the above
> questions. As an end user/developer, this is really BAD! Also, there's
> no correspondent docs/standard, 'de facto' standard varies for different
> people.
> 
>     So, we propose 1)a 2)a 3)a as the right behavior. We really hope FS
> maintainers can give us a unified answer on this issue, or AT LEAST 
> positive feedback. If possible, have a discussion in the Kernel Summit.

I don't agree.  I think 1)b is the most useful for most purposes.  Most
users would like to be able to recover as much data as possible if a
disk starts failing.  Allowing the volume to remain mounted read-only
allows this without risking further damage to the file system.

-- 
David Kleikamp
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-14 12:51       ` Erik Mouw
@ 2005-06-14 13:48         ` Denis Vlasenko
  2005-06-14 17:16         ` Kenichi Okuyama
  1 sibling, 0 replies; 28+ messages in thread
From: Denis Vlasenko @ 2005-06-14 13:48 UTC (permalink / raw)
  To: Erik Mouw, Hans Reiser
  Cc: Andreas Dilger, fs, linux-fsdevel, linux-kernel, zhiming,
	qufuping, madsys, xuh, koichi, kuroiwaj, okuyama, matsui_v,
	kikuchi_v, fernando, kskmori, takenakak, yamaguchi, ext2-devel,
	shaggy, xfs-masters, Reiserfs developers mail-list

On Tuesday 14 June 2005 15:51, Erik Mouw wrote:
> I don't want to discredit Reiserfs or Ext[23], but filesystems are not
> well tested with real disk errors. In my experience a filesystem trying
> to continue to use a faulty medium usually makes things worse and
> decreases the probability of a successful recovery.

I recently had this experience. Not nice at all.

> I'd rather have a filesystem which I can tell to warn me immediately
> about a problem and not make things worse by trying to continue.
> A mount option for Reiserfs like Andreas proposed would be a good idea.
--
vda




^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-14 12:51       ` Erik Mouw
  2005-06-14 13:48         ` Denis Vlasenko
@ 2005-06-14 17:16         ` Kenichi Okuyama
  2005-06-14 20:17           ` Szakacsits Szabolcs
  1 sibling, 1 reply; 28+ messages in thread
From: Kenichi Okuyama @ 2005-06-14 17:16 UTC (permalink / raw)
  To: erik
  Cc: reiser, adilger, fs, linux-fsdevel, linux-kernel, zhiming,
	qufuping, madsys, xuh, koichi, kuroiwaj, matsui_v, kikuchi_v,
	fernando, kskmori, takenakak, yamaguchi, ext2-devel, shaggy,
	xfs-masters, Reiserfs-Dev

Dear Erik,

>>>>> "Eric" == Erik Mouw <erik@harddisk-recovery.com> writes:
Eric> I'd rather have a filesystem which I can tell to warn me immediately
Eric> about a problem and not make things worse by trying to continue.
Eric> A mount option for Reiserfs like Andreas proposed would be a good idea.

I 100% agree with you about how a file system should act.

# STOP!! in the name....

best regards,
---- 
Kenichi Okuyama



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-13 20:13   ` [Ext2-devel] " Andreas Dilger
  2005-06-13 23:56     ` Hans Reiser
  2005-06-14  3:46     ` Valdis.Kletnieks
@ 2005-06-14 17:41     ` fs
  2 siblings, 0 replies; 28+ messages in thread
From: fs @ 2005-06-14 17:41 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: Hans Reiser, linux-fsdevel, linux-kernel, zhiming, madsys, xuh,
	koichi, kuroiwaj, Kenichi Okuyama, matsui_v, kikuchi_v, fernando,
	kskmori, takenakak, yamaguchi, ext2-devel, shaggy, xfs-masters,
	Reiserfs developers mail-list

On Mon, 2005-06-13 at 16:13, Andreas Dilger wrote:

> > fs wrote:
> > >Dear Linus, Andrew Morton, and all FS maintainers,
> > >1) When I/O failure occurs(e.g.: unrecoverable media failure - USB
> > >unplug), FS should
> > >   a. shutdown the FS right now(XFS does this)
> > >   b. try to make the media serve as long as possible(EXT3 remounts 
> > >      read-only, cache is still valid for read)
> > >   c. do not care, just print some kernel debugging info(EXT2 JFS 
> > >      ReiserFS)
> 
> Actually, 1b is just the default behaviour for ext3 (because of journal
> errors).  It is also possible to mount the filesystem with errors=panic,
> which will implement 1a, and it is also possible to mount ext2 with
> errors=remount-ro (which is default on Debian for ext2) which implements
> 1b.  I don't think it is possible to get 1c behaviour for journal
> errors on ext3.
> 
> > >2) When I/O failure occurs, FS should
> > >   a. give a unified error
> > >   b. give errors according to the FS type

Of course EIO is not always right. But suppose the same unplug action
results in different errors just because of the FS type? You think both
EIO and EROFS are right, but what if a new FS returns EXXX? Even if that
is correct, the community should AT LEAST define a set of error values
which are considered right, so that the application can handle these
errors one by one. If not, that means errno can't provide enough info,
which is the case of 3)c.

Well, questions 1), 2) and 3) are just examples. FS developers follow a
'de facto' standard, which is ambiguous. We need a precise one.

> What is "unified error"?  Does this mean "-EIO" for all cases?  I also
> don't understand why this is so important to your application...  If
> you get an error back from the filesystem that isn't expected, that is
> generally a problem regardless of what the error is...
> 
> > >3) the returned errno should be
> > >   a. real cause of failure, e.g. USB unplug returns EIO
> > >   b. cause from FS, e.g. USB unplug made FS remount read-only,
> > >      so open(O_RDONLY) returns ENOENT while open(O_RDWR) returns
> > >      EROFS
> > >   c. errno means nothing, you already get -1, that's enough

> This doesn't make sense.  If the "real cause of failure" is that the
> journal code detected an inconsistency (it might not be an IO error at
> the time, just some structure that is not what it should be, maybe the
> user tried to format their partition while in use ;-) then the real
> error is that the journal turned the filesystem read-only.  In any case,
> you can't expect to get more information than "EIO", regardless of the
> root cause (e.g. ENOMEM causes async buffer read to not complete, caller
> checks buffer_uptodate() and it isn't uptodate, returns EIO).
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.

yours,
----
Qu Fuping



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-14 17:16         ` Kenichi Okuyama
@ 2005-06-14 20:17           ` Szakacsits Szabolcs
  0 siblings, 0 replies; 28+ messages in thread
From: Szakacsits Szabolcs @ 2005-06-14 20:17 UTC (permalink / raw)
  To: Kenichi Okuyama
  Cc: erik, reiser, adilger, fs, linux-fsdevel, linux-kernel, zhiming,
	qufuping, madsys, xuh, koichi, kuroiwaj, matsui_v, kikuchi_v,
	fernando, kskmori, takenakak, yamaguchi, ext2-devel, shaggy,
	xfs-masters, Reiserfs-Dev

On Wed, 15 Jun 2005, Kenichi Okuyama wrote:
>>>>>> "Eric" == Erik Mouw <erik@harddisk-recovery.com> writes:
> Eric> I'd rather have a filesystem which I can tell to warn me immediately
> Eric> about a problem and not make things worse by trying to continue.
> Eric> A mount option for Reiserfs like Andreas proposed would be a good idea.
>
> I 100% agree with you about how file system should act.

There are permanent and transient errors.

Removing a device is ENODEV and I think this is not closely related to any 
filesystem. Only Windows seems to detect this properly (the error messages 
are perhaps mistranslated by cygwin?)

If the device hits bad sectors then NTFS adds them to the $BadClust list
on-the-fly and won't try to use them anymore; users don't notice anything
unless asked. _Some_ bad sectors don't mean the disk is dying: not all
disks have a reserved zone or remapping, or the zone is too small, etc.
Many people use NTFS on disks with defective sectors without issues, and
no new ones develop over time.

Thanks for your work. I think it is important.

Cheers,
 	Szaka


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-14  2:46       ` Kenichi Okuyama
@ 2005-06-15 14:01         ` Theodore Ts'o
  2005-06-15 19:40           ` Kenichi Okuyama
                             ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Theodore Ts'o @ 2005-06-15 14:01 UTC (permalink / raw)
  To: Kenichi Okuyama
  Cc: Hans Reiser, Andreas Dilger, fs, linux-fsdevel, linux-kernel,
	zhiming, qufuping, madsys, xuh, koichi, kuroiwaj, okuyama,
	matsui_v, kikuchi_v, fernando, kskmori, takenakak, yamaguchi,
	ext2-devel, shaggy, xfs-masters, Reiserfs developers mail-list

On Tue, Jun 14, 2005 at 11:46:36AM +0900, Kenichi Okuyama wrote:
> I agree that kernel can not directly influence user.
> But, application may have better chance.
> 
> Think about case of editor (vi, emacs, almost any text editors are ok ).
> 
> If you try to save a file and receive no error, the user will believe it
> has been written to the disk they believe to be existing.
> Even if the log yells about an error, the user will not notice.
> 
> If the editor receives an error, then the user can know something is
> wrong. Though he is still wondering, if he receives a message
> like "Input Output Error: may be HW error?", he definitely will start 
> from looking at the cable.

Kenichi-San,

Part of the problem is that we are limited by the constraints of the
POSIX specification for error handling.  For example, we don't have a
way of telling the application, "the reason why the filesystem was
remounted-read-only was in reaction to an I/O error that appears to be
caused by the multiple CRC checksum errors reported by the SCSI
controller".  We can only return EIO or EROFS.  And while the write()
which causes an I/O error that remounts the filesystem read/only can
(and probably does) return EIO, any subsequent writes will return
EROFS, and changing this would be hard, hackish, and probably wouldn't
be accepted.

Also, there is not necessarily one right answer to how to respond to an
underlying I/O error in the filesystem.  So for ext2/3 filesystems, it
is configurable.  In case of an underlying error detected in the
filesystem metadata, the filesystem can be set to either (a) panic and
force a reboot, so that hopefully fsck can resolve the issue, (b)
remount the filesystem read/only, to prevent further damage, or (c)
continue and do nothing (the don't worry, be happy approach).
Different users will want different approaches, and so trying to
standardize what applications will see at the user level doesn't seem
like the right approach, since we want to allow system administrators
some flexibility about how they wish to configure their systems.
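
The three configurable reactions can be sketched in C as a small parser
for an ext2/3-style "errors=" mount option string. This is only an
illustrative userspace sketch: the enum err_policy type and the
parse_errors_option() function are hypothetical names, not kernel code;
the option values continue, remount-ro and panic are the ones ext2/ext3
accept.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical policy type mirroring choices (a), (b) and (c) above. */
enum err_policy {
    ERR_CONTINUE,    /* (c) log it and carry on */
    ERR_REMOUNT_RO,  /* (b) remount read-only to prevent further damage */
    ERR_PANIC,       /* (a) panic so that fsck can run after reboot */
    ERR_UNKNOWN      /* not an errors= option, or an unknown value */
};

/* Map one "errors=<value>" option string to a policy. */
enum err_policy parse_errors_option(const char *opt)
{
    static const char prefix[] = "errors=";

    assert(opt != NULL);
    if (strncmp(opt, prefix, sizeof(prefix) - 1) != 0)
        return ERR_UNKNOWN;
    opt += sizeof(prefix) - 1;          /* skip past "errors=" */

    if (strcmp(opt, "continue") == 0)
        return ERR_CONTINUE;
    if (strcmp(opt, "remount-ro") == 0)
        return ERR_REMOUNT_RO;
    if (strcmp(opt, "panic") == 0)
        return ERR_PANIC;
    return ERR_UNKNOWN;
}
```

The point of the sketch is that the reaction is an administrator's
choice made at mount time, which is why the errno seen by an application
after the first failure differs from one configuration to the next.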

(For example, on an embedded system or a system with higher levels of
redundancy, the right answer might be to panic and either reboot or
halt --- continuing and possibly returning wrong answers might be
completely unacceptable, and it may be that once the system goes down
hard, the adjacent backup blade can pick up operations.)

So instead of trying to standardize the existing error returns (which
are the way they are, and which would probably not be worth the effort
to standardize, since they don't return enough context to the
application anyway), I would suggest the better thing to do is to
design a new mechanism for reporting block device errors and
filesystem-level problems via some kind of notification mechanism
(pick your choice of hotplug, dbus, or netlink --- dbus may make the
most sense, since multiple applications may want to subscribe to such
notifications), so that applications can take corrective action as
necessary.

This is a better approach, since it is far more flexible and returns much
more information to the user.  For example, in a desktop environment,
the desktop can pop up a warning dialog to the user of a failure of a
block device or filesystem corruption, without having to modify every
single application.  In the case of an embedded system, the
notification can trigger an appropriate failover or recovery process.  

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-14  0:03     ` Hans Reiser
@ 2005-06-15 17:39       ` Jeff Mahoney
  2005-06-16  2:18         ` Dave Chinner
  2005-06-16 18:52         ` Hans Reiser
  0 siblings, 2 replies; 28+ messages in thread
From: Jeff Mahoney @ 2005-06-15 17:39 UTC (permalink / raw)
  To: Hans Reiser
  Cc: fs, Linus Torvalds, Andrew Morton, viro VFS, linux-fsdevel,
	linux-kernel, zhiming, qufuping, madsys, xuh, koichi, kuroiwaj,
	okuyama, matsui_v, kikuchi_v, fernando, kskmori, takenakak,
	yamaguchi, ext2-devel, sct, shaggy, xfs-masters,
	Reiserfs developers mail-list

Hans Reiser wrote:
> Jeff, would you be willing to make a proposal for what should be done? 
> I would be interested in your suggestions.
> 
> Jeff Mahoney wrote:
> 
>>Hans -
>>
>>These tests must have been run on a kernel prior to 2.6.10-rc1. The I/O
>>error code exhibits behavior similar to ext3, so (1b). There are still
>>kinks to be worked out, but it's definitely not the "throw up our arms
>>and give up" that it used to be.
>>
>>Implementing behavior 1a for ext3 and reiserfs should be fairly trivial
>>- it just means that tests to check if the filesystem is in an aborted
>>state ("shutdown" in xfs terms) need to be added to the call path in some
>>places, and be moved earlier in others.

Well it seems to me that all the XFS code does is check to see if the FS
is in a shutdown state really early in the call path. Adding a
super->s_errno or MS_ABORTED flag (I prefer the former, for flexibility)
to the VFS level to be checked before calling into the filesystem would
add the consistent behavior to all filesystems.
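
A toy userspace model of that idea, to show where the check would sit.
Everything here is hypothetical: struct toy_super, its s_errno and
fs_calls fields, and toy_vfs_write() are invented for illustration and
only mimic the shape of the proposal; none of them is an existing VFS
interface.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a superblock. */
struct toy_super {
    int s_errno;    /* 0 while healthy, otherwise a sticky errno value */
    int fs_calls;   /* how many times the filesystem hook actually ran */
    int (*fs_write)(struct toy_super *sb, const void *buf, size_t len);
};

/* Generic layer: refuse early if the filesystem has been aborted,
 * before calling into filesystem-specific code at all. */
int toy_vfs_write(struct toy_super *sb, const void *buf, size_t len)
{
    assert(sb != NULL && sb->fs_write != NULL);
    if (sb->s_errno != 0)
        return -sb->s_errno;
    return sb->fs_write(sb, buf, len);
}

/* Example filesystem hook: the write fails and the filesystem marks
 * itself aborted by recording the error in the superblock. */
int toy_failing_write(struct toy_super *sb, const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    sb->fs_calls++;
    sb->s_errno = EIO;
    return -EIO;
}
```

In this model the first failing write reaches the filesystem, and every
later call is answered by the generic layer with the recorded error, so
the behaviour no longer depends on which filesystem is underneath.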

As far as the ReiserFS support goes, I was premature in stating that
ReiserFS supports behavior 1b. It does so in terms of journal errors,
but it does just warn and continue on other errors. I'm working on a
patch that introduces reiserfs_error() similar to ext3_error() that
replaces the warnings in many places. The behavior is configurable using
the mount options introduced with the i/o error patches.

-Jeff

--
Jeff Mahoney
SuSE Labs



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 14:01         ` [Ext2-devel] " Theodore Ts'o
@ 2005-06-15 19:40           ` Kenichi Okuyama
  2005-06-15 20:37             ` [Ext2-devel] " Theodore Ts'o
  2005-06-15 20:38           ` Hans Reiser
  2005-06-16 11:38           ` Matthew Wilcox
  2 siblings, 1 reply; 28+ messages in thread
From: Kenichi Okuyama @ 2005-06-15 19:40 UTC (permalink / raw)
  To: tytso
  Cc: reiser, adilger, fs, linux-fsdevel, linux-kernel, zhiming,
	qufuping, madsys, xuh, koichi, kuroiwaj, matsui_v, kikuchi_v,
	fernando, kskmori, takenakak, yamaguchi, ext2-devel, shaggy,
	xfs-masters, Reiserfs-Dev

Dear Ted-san, and all,

>>>>> "Ted" == Theodore Ts'o <tytso@mit.edu> writes:
Ted> Part of the problem is that we are limited by the constraints of the
Ted> POSIX specification for error handling.  For example, we don't have a
Ted> way of telling the application, "the reason why the filesystem was
Ted> remounted-read-only was in reaction to an I/O error that appears to be
Ted> caused by the multiple CRC checksum errors reported by the SCSI
Ted> controller".  We can only return EIO or EROFS.  And while the write()
Ted> which causes an I/O error that remounts the filesystem read/only can
Ted> (and probably does) return EIO, any subsequent writes will return
Ted> EROFS, and changing this would be hard, hackish, and probably wouldn't
Ted> be accepted.

You said:

Ted> And while the write()
Ted> which causes an I/O error that remounts the filesystem read/only can
Ted> (and probably does) return EIO

No. They return EROFS from the beginning.

Ted> We can only return EIO or EROFS.

I do understand about EIO. What I don't understand is EROFS.
EROFS could be returned if the file system had been mounted r/o from
the beginning.

The point is pretty simple ( I think ).

Q1.  Why does the file system succeed in re-mounting as r/o, when the
     device is totally lost?

If the device did exist, and throwing away the dirty pages did succeed,
then unmounting that device and mounting it read-only should succeed
too. If this is what's happening, EROFS is a good result. I agree with
this.

But in the case of Mr. Qu's test, the device is lost. The USB cable is
unplugged. It is unreachable. How could such a device be *MOUNTED*?
# In other words, why can't I mount a device which does not exist,
# while I can re-mount it?

Ted> So instead of trying to standardize the existing error returns, which
Ted> are the way they are and for which trying to standardize them would
Ted> probably be not worth the effort, since they don't return enough
Ted> context to the application anyway

I'm sorry, but I can't agree with this.

When an error arises from a system call, the first thing an application
cares about is dividing errors into two types.

1) devices and file systems are still under control of the kernel.
2) devices or file systems are not under control of the kernel anymore.

In case 1, the application will wonder whether it has done something
wrong ( including the user mistakenly mounting the filesystem r/o when
it should be r/w, or the application writing to an r/o file system ).

In case 2, the application can do nothing ( and neither should the
kernel ). It's a human who has to decide what to do.
# usually, it means "give up the data", but sometimes, you may
# have the choice of writing that data to some other device.

Once a type 2 error arises, the system should not go back to type 1.
It should be a one-way path.

EROFS is typically an error of type 1.
EIO is typically type 2.
 (SIGBUS is typically type 2 too. But is there any other type 2
  error? None that I know of. )

What I believe (I'm sorry, but this is only my belief) is that we
should not mix errors of type 1 with errors of type 2.  If a type 2
problem arises, a type 2 error should be passed to the application.  The
mode should be one-way.

I do agree that, for devices, it is the device driver's responsibility
to identify which type of error has arisen. But when the file system
receives a type 2 error, it should not change it into a type 1 error
( unless the fs can really guarantee that ).

And, therefore, for type 2, I believe errors can be standardized, and I
think they should be.

I strongly agree that, for type 1, there are many ways to handle
errors. I agree that how you treat a type 1 error is a characteristic
of each file system. Standardizing type 1 errors is (in most cases)
nonsense.

I hope Mr. Qu is looking for the same thing.

regards,
---- 
Kenichi Okuyama


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 19:40           ` Kenichi Okuyama
@ 2005-06-15 20:37             ` Theodore Ts'o
  0 siblings, 0 replies; 28+ messages in thread
From: Theodore Ts'o @ 2005-06-15 20:37 UTC (permalink / raw)
  To: Kenichi Okuyama
  Cc: reiser, adilger, fs, linux-fsdevel, linux-kernel, zhiming,
	qufuping, madsys, xuh, koichi, kuroiwaj, matsui_v, kikuchi_v,
	fernando, kskmori, takenakak, yamaguchi, ext2-devel, shaggy,
	xfs-masters, Reiserfs-Dev

On Thu, Jun 16, 2005 at 04:40:45AM +0900, Kenichi Okuyama wrote:
> Ted> And while the write()
> Ted> which causes an I/O error that remounts the filesystem read/only can
> Ted> (and probably does) return EIO
> 
> No. they return EROFS from beginning.
> 

No, trust me, the *first* read/write to a device which is returning
errors is returning EIO.  But it might not be the application which
you are testing.  It might be an attempt to update the inode last
access time that fails, so it might not even be returned to user space
at all.    

But once the filesystem is remounted read-only the reason why EROFS is
being returned is not because of an I/O error, but because the
filesystem is now read-only.  It makes perfect sense, if you think
like a computer....

> The point is pretty easy ( I think ).
> 
> Q1.  Why does file system succeed in re-mounting as r/o, when device
>      is totally lost?

That's because right now there is no way for block devices to inform
the filesystem that the device is totally gone.

> But in case of Mr. Qu's test, device is lost. USB cable is
> unplugged. They are unreachable. How could such device be *MOUNTED*?
> # In other word, why can't I mount device which does not exist,
> # while I can re-mount them?

Because remounting a filesystem read-only just means toggling the
in-core data structures to indicate that writes are no longer
tolerated.  It doesn't require reading from the device, which a fresh
mount requires.

> 1) devices and file systems are still under control of kernel.
> 2) devices or file systems are not under control of kernel anymore.
> 
> I do agree that, for devices, it is device driver's responsibility
> to identify which type of error has arisen. But when file system
> received type 2 error, he should not change it to type 1 error
> ( unless fs could really guarantee that ).
> 
> And, therefore, for type 2, I believe they can be standardized, and I
> think we should.

The problem is the filesystem right now can't tell the difference
between type 1 and type 2 errors.  All we know is that an attempt to
read or write from a block has failed.  We don't know why it failed.

I agree that *if* the filesystem could be told that a block device has
disappeared, then we should do the equivalent of umount -l on the
filesystem, and revoke all open file descriptors, much like the BSD
revoke(2) system call.  

But this isn't a matter of "standardizing" error returns, but rather a
feature/enhancement request.

						- Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 14:01         ` [Ext2-devel] " Theodore Ts'o
  2005-06-15 19:40           ` Kenichi Okuyama
@ 2005-06-15 20:38           ` Hans Reiser
  2005-06-15 22:53             ` Theodore Ts'o
                               ` (2 more replies)
  2005-06-16 11:38           ` Matthew Wilcox
  2 siblings, 3 replies; 28+ messages in thread
From: Hans Reiser @ 2005-06-15 20:38 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Kenichi Okuyama, Andreas Dilger, fs, linux-fsdevel, linux-kernel,
	zhiming, qufuping, madsys, xuh, koichi, kuroiwaj, okuyama,
	matsui_v, kikuchi_v, fernando, kskmori, takenakak, yamaguchi,
	ext2-devel, shaggy, xfs-masters, Reiserfs developers mail-list

Theodore Ts'o wrote:

>On Tue, Jun 14, 2005 at 11:46:36AM +0900, Kenichi Okuyama wrote:
>  
>
>>I agree that kernel can not directly influence user.
>>But, application may have better chance.
>>
>>Think about case of editor (vi, emacs, almost any text editors are ok ).
>>
>>If you try to save a file and receive no error, the user will believe it
>>has been written to the disk they believe to be existing.
>>Even if the log yells about an error, the user will not notice.
>>
>>If the editor receives an error, then the user can know something is
>>wrong. Though he is still wondering, if he receives a message
>>like "Input Output Error: may be HW error?", he definitely will start 
>>from looking at the cable.
>>    
>>
>
>Kenichi-San,
>
>Part of the problem is that we are limited by the constraints of the
>POSIX specification for error handling. 
>
Ted, if I understand you correctly, I agree with you.  ;-)

What users need is for a window to pop up saying "the usb drive is
turned off" or "we are getting checksum errors from XXX, this may
indicate hardware problems that require your attention".

Now that GUIs exist, and now that more errors are possible because the
kernel is more complex, perhaps kernel error handling should be
reconsidered.  I don't have the feeling that anyone has felt themselves
authorized to take a deep look at how this ought to be designed.  I mean
sure, there are sometimes console windows that things get printed into,
but unsophisticated users basically want to be prompted if something is
wrong that needs their attention and to not have their experience
cluttered by a console window otherwise.  Also, it has long been
irritating having to make error codes conform to one of the existing
error codes when there is often no good connection between the name of
an existing error code and the new error condition one has just coded,
and there is no space left for new error codes.

Ted, what do you think?

> For example, we don't have a
>way of telling the application, "the reason why the filesystem was
>remounted-read-only was in reaction to an I/O error that appears to be
>caused by the multiple CRC checksum errors reported by the SCSI
>controller".  We can only return EIO or EROFS.  And while the write()
>which causes an I/O error that remounts the filesystem read/only can
>(and probably does) return EIO, any subsequent writes will return
>EROFS, and changing this would be hard, hackish, and probably wouldn't
>be accepted.
>
>Also, there is not necessarily one right answer to how to respond to an
>underlying I/O error in the filesystem.  So for ext2/3 filesystems, it
>is configurable.  In case of an underlying error detected in the
>filesystem metadata, the filesystem can be set to either (a) panic and
>force a reboot, so that hopefully fsck can resolve the issue, (b)
>remount the filesystem read/only, to prevent further damage, or (c)
>continue and do nothing (the don't worry, be happy approach).
>Different users will want different approaches, and so trying to
>standardize what applications will see at the user level doesn't seem
>like the right approach, since we want to allow system administrators
>some flexibility about how they wish to configure their systems.
>  
>
Perhaps these policy choices should be mount options, what do you think?

>(For example, in an embedded system or a system where there are higher
>levels of redundancy, the right answer might be to panic and either
>reboot or halt --- continuing and possibly returning wrong answers
>might be completely unacceptable, and it may be that once the
>system goes down hard, the adjacent backup blade can pick up
>operations.)
>
>So instead of trying to standardize the existing error returns, which
>are the way they are and for which trying to standardize them would
>probably not be worth the effort, since they don't return enough
>context to the application anyway --- I would suggest the better
>thing to do is to design a new mechanism for returning block device
>errors via some kind of notification mechanism (pick your choice
>of hotplug, dbus, or netlink --- dbus may make the most amount of
>sense, since multiple applications may want to subscribe to such
>notifications) of problems at the filesystem level, so that
>applications can take corrective action as necessary.  
>
>This is a better approach, since it is far more flexible and returns much
>more information to the user.  For example, in a desktop environment,
>the desktop can pop up a warning dialog to the user of a failure of a
>block device or filesystem corruption, without having to modify every
>single application.  In the case of an embedded system, the
>notification can trigger an appropriate failover or recovery process.  
>
>Regards,
>
>						- Ted
>
>
>  
>




* Re: Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 20:38           ` Hans Reiser
@ 2005-06-15 22:53             ` Theodore Ts'o
  2005-06-16 19:08               ` [Ext2-devel] " Hans Reiser
  2005-06-16 11:52             ` Helge Hafting
  2005-06-16 21:27             ` Pavel Machek
  2 siblings, 1 reply; 28+ messages in thread
From: Theodore Ts'o @ 2005-06-15 22:53 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Kenichi Okuyama, Andreas Dilger, fs, linux-fsdevel, linux-kernel,
	zhiming, qufuping, madsys, xuh, koichi, kuroiwaj, okuyama,
	matsui_v, kikuchi_v, fernando, kskmori, takenakak, yamaguchi,
	ext2-devel, shaggy, xfs-masters, Reiserfs developers mail-list

On Wed, Jun 15, 2005 at 01:38:59PM -0700, Hans Reiser wrote:
> Ted, if I understand you correctly, I agree with you.  ;-)
> 
> What users need is for a window to pop up saying "the usb drive is
> turned off" or "we are getting checksum errors from XXX, this may
> indicate hardware problems that require your attention".

Yes, and as I suggested, this is best done via an out-of-band
notification system, such as hotplug or dbus.  

> Now that GUIs exist, and now that more errors are possible because the
> kernel is more complex, perhaps kernel error handling should be
> reconsidered.  I don't have the feeling that anyone has felt themselves
> authorized to take a deep look at how this ought to be designed.  I mean
> sure, there are sometimes console windows that things get printed into,
> but unsophisticated users basically want to be prompted if something is
> wrong that needs their attention and to not have their experience
> cluttered by a console window otherwise.  Also, it has long been
> irritating having to make error codes conform to one of the existing
> error codes when there is often no good connection between the name of
> an existing error code and the new error condition one has just coded,
> and there is no space left for new error codes.

We could try to add some complicated exception system into system
calls, but it's not productive in my opinion.  First of all, backwards
compatibility is an absolute and unconditional requirement (we can't
break POSIX compatibility, and more importantly, we don't want to
change the number of applications that Linux can run from being
Linux-like to being BeOS-like).  This adds enough of a constraint that
I doubt trying to add changes to the system call error handling
mechanism is likely to work well.  

Secondly, if the goal is to have a pop-up show when there is some
major hardware problem, changing the system call error handling
doesn't really help us unless we want to require every single
application in existence to be modified to use this new exception
handling system.  Having seen how well this BeOS-like approach has
worked for BeOS, I believe this is a Really Bad Idea.  It's better to
have a separate, out-of-band notification scheme --- it's what dbus is
really designed to be for.
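
To make the out-of-band idea concrete, here is a minimal sketch of the
kind of record such a notification could carry.  Everything in it is an
assumption made for illustration: struct fs_event, its fields, and
format_fs_event() are hypothetical names, and no real hotplug, dbus, or
netlink API is used.  The point is only that the record can hold context
that a bare EIO or EROFS cannot.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical event record; not an existing kernel or dbus structure. */
struct fs_event {
        const char *device;  /* block device name, e.g. "sda1" */
        int         err;     /* errno-style code the failing call returned */
        const char *cause;   /* extra context that a bare errno cannot carry */
};

/*
 * Render the event as one text line that a listener could parse or show.
 * Returns the number of characters snprintf() wanted to write.
 */
static int format_fs_event(char *buf, size_t len, const struct fs_event *ev)
{
        assert(buf != NULL && ev != NULL);
        return snprintf(buf, len, "device=%s errno=%d cause=%s",
                        ev->device, ev->err, ev->cause);
}
```

A listener that receives such a line can decide for itself whether to pop
up a dialog, start a failover, or just log it, without any change to the
applications doing the I/O.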

> >Also, there is not necessarily one right answer to how to respond to an
> >underlying I/O error in the filesystem.  So for ext2/3 filesystems, it
> >is configurable.  
> >  
> >
> Perhaps these policy choices should be mount options, what do you think?

We put these policy options as options in the superblock, but there
are some advantages in being able to override them at mount-time with
mount options.  For example, one such advantage is that we can
standardize them across different filesystems.

However, even if we do have standardized mount options, it is a real
pain to have to type a very long mount option when doing manual
mounts.  So having defaults that can be stored in the superblock seems
to be a good idea, in my opinion.
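
The precedence described here (superblock default, overridden by a mount
option when one is given) can be sketched in a few lines.  This is a
userspace illustration, not the ext2/3 code: enum err_policy and
effective_policy() are hypothetical names, and only the option strings
errors=continue, errors=remount-ro, and errors=panic are taken from the
existing ext2/3 mount options.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The three behaviors discussed in this thread. */
enum err_policy {
        ERR_CONTINUE,
        ERR_REMOUNT_RO,
        ERR_PANIC
};

/*
 * Return the effective policy: a recognized errors= mount option wins,
 * otherwise the default stored in the superblock applies.
 */
static enum err_policy effective_policy(enum err_policy sb_default,
                                        const char *mount_opt)
{
        assert(sb_default <= ERR_PANIC);

        if (mount_opt == NULL)
                return sb_default;
        if (strcmp(mount_opt, "errors=continue") == 0)
                return ERR_CONTINUE;
        if (strcmp(mount_opt, "errors=remount-ro") == 0)
                return ERR_REMOUNT_RO;
        if (strcmp(mount_opt, "errors=panic") == 0)
                return ERR_PANIC;
        return sb_default;  /* unrelated option: keep the default */
}
```

With this shape, a manual mount with no options still gets the policy the
administrator stored in the superblock.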

						- Ted


* Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 17:39       ` Jeff Mahoney
@ 2005-06-16  2:18         ` Dave Chinner
  2005-06-16 15:21           ` Jeff Mahoney
  2005-06-16 18:52         ` Hans Reiser
  1 sibling, 1 reply; 28+ messages in thread
From: Dave Chinner @ 2005-06-16  2:18 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: Hans Reiser, fs, Linus Torvalds, Andrew Morton, viro VFS,
	linux-fsdevel, linux-kernel, zhiming, qufuping, madsys, xuh,
	koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v, fernando, kskmori,
	takenakak, yamaguchi, ext2-devel, sct, shaggy, linux-xfs,
	Reiserfs developers mail-list

On Wed, Jun 15, 2005 at 01:39:02PM -0400, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Hans Reiser wrote:
> > Jeff, would you be willing to make a proposal for what should be done? 
> > I would be interested in your suggestions.
> > 
> > Jeff Mahoney wrote:
> > 
> >>Hans -
> >>
> >>These tests must have been run on a kernel prior to 2.6.10-rc1. The I/O
> >>error code exhibits behavior similar to ext3, so (1b). There are still
> >>kinks to be worked out, but it's definitely not the "throw up our arms
> >>and give up" that it used to be.
> >>
> >>Implementing behavior 1a for ext3 and reiserfs should be fairly trivial
> >>- it just means that tests to check if the filesystem is in an aborted
> >>state ("shutdown" in xfs terms) need to be added to the call path in some
> >>places, and be moved earlier in others.
> 
> Well it seems to me that all the XFS code does is check to see if the FS
> is in a shutdown state really early in the call path.

FYI, the up front checks in XFS are simply to stop new I/O from starting
if we're already in the shutdown state.

However, there's more than that in XFS - there are checks all through
its I/O paths so that I/Os and transactions in flight at (or
started after) the time of the shutdown can be aborted before doing
further damage to a potentially corrupted filesystem. This part
cannot be done generically as it is intimately tied to the
filesystem.

It is also worth noting that XFS won't shut down a filesystem on just
any I/O error. Shutdowns due to I/O errors only occur when the
failure has the potential to leave the filesystem in an inconsistent
state.  Hence any given operation can return different errors
depending on where the I/O error occurred in XFS and what effect
that I/O error has on the consistency of the filesystem.....
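
As a rough illustration of the two points above (the up-front check, and
shutting down only when consistency is at risk), here is a userspace
sketch.  It is not XFS code: struct fake_mount, note_io_error(), and
start_io() are hypothetical names, and the in-flight abort checks, which
are the filesystem-specific part, are not modeled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for a mounted filesystem's state. */
struct fake_mount {
        bool shutdown;
};

/* Only errors that could leave the filesystem inconsistent shut it down. */
static void note_io_error(struct fake_mount *mp, bool threatens_consistency)
{
        assert(mp != NULL);
        if (threatens_consistency)
                mp->shutdown = true;
}

/* Up-front check: refuse to start new I/O once shut down. */
static int start_io(const struct fake_mount *mp)
{
        assert(mp != NULL);
        if (mp->shutdown)
                return -EIO;
        return 0;
}
```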

BTW, the correct list to use to get the attention of the XFS folk
is linux-xfs@oss.sgi.com.

Cheers,

Dave.
-- 
Dave Chinner
R&D Software Engineer
SGI Australian Software Group


* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 14:01         ` [Ext2-devel] " Theodore Ts'o
  2005-06-15 19:40           ` Kenichi Okuyama
  2005-06-15 20:38           ` Hans Reiser
@ 2005-06-16 11:38           ` Matthew Wilcox
  2 siblings, 0 replies; 28+ messages in thread
From: Matthew Wilcox @ 2005-06-16 11:38 UTC (permalink / raw)
  To: Theodore Ts'o, Kenichi Okuyama, Hans Reiser, Andreas Dilger,
	fs, linux-fsdevel, linux-kernel, zhiming, qufuping, madsys, xuh,
	koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v, fernando, kskmori,
	takenakak, yamaguchi, ext2-devel, shaggy, xfs-masters,
	Reiserfs developers mail-list

On Wed, Jun 15, 2005 at 10:01:05AM -0400, Theodore Ts'o wrote:
> We can only return EIO or EROFS.  And while the write()
> which causes an I/O error that remounts the filesystem read/only can
> (and probably does) return EIO, any subsequent writes will return
> EROFS, and changing this would be hard, hackish, and probably wouldn't
> be accepted.

I wasn't quite sure why this would be so hard, so I took a look.  Here's
how it works:

In fs/ext2/super.c, we do:
        if (test_opt(sb, ERRORS_RO)) {
                printk("Remounting filesystem read-only\n");
                sb->s_flags |= MS_RDONLY;
        }

From here on, the VFS handles returning -EROFS (except for a couple
of ioctls and an xattr call).  So it's not under the control of the
individual filesystem.  One way of handling this would be to introduce a
new MS_ERRORS flag that allows the VFS to return -EIO instead of -EROFS
for a filesystem that contains errors.
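
A minimal userspace sketch of that idea follows.  It is not kernel code:
MS_ERRORS is the hypothetical flag proposed above with an arbitrarily
chosen bit value, and struct fake_sb, fake_fs_error(), and
write_permission_errno() are illustrative names standing in for the
superblock, the filesystem's error path, and the VFS write check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MS_RDONLY  1UL          /* existing flag: mounted read-only */
#define MS_ERRORS  (1UL << 30)  /* hypothetical flag; bit chosen arbitrarily */

/* Illustrative stand-in for struct super_block. */
struct fake_sb {
        unsigned long s_flags;
};

/* What the filesystem's error path would do with ERRORS_RO set. */
static void fake_fs_error(struct fake_sb *sb)
{
        assert(sb != NULL);
        sb->s_flags |= MS_RDONLY | MS_ERRORS;
}

/* What a VFS-level write check could return. */
static int write_permission_errno(const struct fake_sb *sb)
{
        assert(sb != NULL);
        if (!(sb->s_flags & MS_RDONLY))
                return 0;
        if (sb->s_flags & MS_ERRORS)
                return -EIO;    /* read-only because of an error */
        return -EROFS;          /* read-only by the administrator's choice */
}
```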

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain


* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 20:38           ` Hans Reiser
  2005-06-15 22:53             ` Theodore Ts'o
@ 2005-06-16 11:52             ` Helge Hafting
  2005-06-16 19:52               ` Hans Reiser
  2005-06-16 21:27             ` Pavel Machek
  2 siblings, 1 reply; 28+ messages in thread
From: Helge Hafting @ 2005-06-16 11:52 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Theodore Ts'o, Kenichi Okuyama, Andreas Dilger, fs,
	linux-fsdevel, linux-kernel, zhiming, qufuping, madsys, xuh,
	koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v, fernando, kskmori,
	takenakak, yamaguchi, ext2-devel, shaggy, xfs-masters,
	Reiserfs developers mail-list

Hans Reiser wrote:

>What users need is for a window to pop up saying "the usb drive is
>turned off" or "we are getting checksum errors from XXX, this may
>indicate hardware problems that require your attention".
>  
>
Nice.  And the way to do this right is to have the kernel merely
log the error as usual.  The user can have some daemon listening
to the log, this program may then pop up error messages with
nifty detailed explanations, start up diagnostic software
for various subsystems and so on. 

The kernel can't do GUI stuff - a GUI may or may not be present,
and the kernel cannot know.  The server may not run X at all
but I still run graphical SW on it using a workstation or X-terminal.
Or the pc may have three video cards, each running a different xserver
with different users for each.  Who to report to?

An error-reporting daemon has an easier job: it can look up the
correct (possibly remote) display in its config file for all those
cases when there isn't just _one_ display.

Helge Hafting






* Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-16  2:18         ` Dave Chinner
@ 2005-06-16 15:21           ` Jeff Mahoney
  0 siblings, 0 replies; 28+ messages in thread
From: Jeff Mahoney @ 2005-06-16 15:21 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Hans Reiser, fs, Linus Torvalds, Andrew Morton, viro VFS,
	linux-fsdevel, linux-kernel, zhiming, qufuping, madsys, xuh,
	koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v, fernando, kskmori,
	takenakak, yamaguchi, ext2-devel, sct, shaggy, linux-xfs,
	Reiserfs developers mail-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dave Chinner wrote:
>>Well it seems to me that all the XFS code does is check to see if the FS
>>is in a shutdown state really early in the call path.
> 
> FYI, the up front checks in XFS are simply to stop new I/O from starting
> if we're already in the shutdown state.
> 
> However, there's more than that in XFS - there are checks all through
> its I/O paths so that I/Os and transactions in flight at (or
> started after) the time of the shutdown can be aborted before doing
> further damage to a potentially corrupted filesystem. This part
> cannot be done generically as it is intimately tied to the
> filesystem.
> 
> It is also worth noting that XFS won't shut down a filesystem on just
> any I/O error. Shutdowns due to I/O errors only occur when the
> failure has the potential to leave the filesystem in an inconsistent
> state.  Hence any given operation can return different errors
> depending on where the I/O error occurred in XFS and what effect
> that I/O error has on the consistency of the filesystem.....

Sorry, I should have clarified. I was only referring to the handling of
operations that aren't already in flight.

Currently, ReiserFS (and ext3) will set the filesystem read-only on
error, which ends up returning -EROFS in situations where that error
code is correct, but not entirely appropriate.

- -Jeff

- --
Jeff Mahoney
SuSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFCsZjnLPWxlyuTD7IRAuk5AKCplbYsl3YFml9/M1GRtuvBz21jvwCgoWKn
Mpl0khchSkQ1RwI/mkZ8buY=
=DxvJ
-----END PGP SIGNATURE-----


* Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 17:39       ` Jeff Mahoney
  2005-06-16  2:18         ` Dave Chinner
@ 2005-06-16 18:52         ` Hans Reiser
  1 sibling, 0 replies; 28+ messages in thread
From: Hans Reiser @ 2005-06-16 18:52 UTC (permalink / raw)
  To: Jeff Mahoney
  Cc: fs, Linus Torvalds, Andrew Morton, viro VFS, linux-fsdevel,
	linux-kernel, zhiming, qufuping, madsys, xuh, koichi, kuroiwaj,
	okuyama, matsui_v, kikuchi_v, fernando, kskmori, takenakak,
	yamaguchi, ext2-devel, sct, shaggy, xfs-masters,
	Reiserfs developers mail-list

Jeff Mahoney wrote:

>
> As far as the ReiserFS support goes, I was premature in stating that
> ReiserFS supports behavior 1b. It does so in terms of journal errors,
> but it does just warn and continue on other errors. I'm working on a
> patch that introduces reiserfs_error() similar to ext3_error() that
> replaces the warnings in many places. The behavior is configurable using
> the mount options introduced with the i/o error patches.

Sounds good to me.

>
> -Jeff
>
> --
> Jeff Mahoney
> SuSE Labs



* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 22:53             ` Theodore Ts'o
@ 2005-06-16 19:08               ` Hans Reiser
  0 siblings, 0 replies; 28+ messages in thread
From: Hans Reiser @ 2005-06-16 19:08 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Kenichi Okuyama, Andreas Dilger, fs, linux-fsdevel, linux-kernel,
	zhiming, qufuping, madsys, xuh, koichi, kuroiwaj, okuyama,
	matsui_v, kikuchi_v, fernando, kskmori, takenakak, yamaguchi,
	ext2-devel, shaggy, xfs-masters, Reiserfs developers mail-list

Theodore Ts'o wrote:

>
> It's better to
>have a separate, out-of-band notification scheme --- it's what dbus is
>really designed to be for.
>  
>
If I understand you, you are saying don't change how we notify the app,
change how we notify the user and if it is the user that needs to act
then we should sidestep the app entirely by creating a new method that
talks to the user, including popping up a window if the user is using a
GUI, and doing some other configurable thing for other circumstances. 
We could even create a mapping of errors to what the system should do in
response to them that the user can modify for the rare case that they
care to.  Ok, I agree.

>  
>
>>>Also, there is not necessarily one right answer to how to respond to an
>>>underlying I/O error in the filesystem.  So for ext2/3 filesystems, it
>>>is configurable.  
>>> 
>>>
>>>      
>>>
>>Perhaps these policy choices should be mount options, what do you think?
>>    
>>
>
>We put these policy options as options in the superblock, but there
>are some advantages in being able to override them at mount-time with
>mount options.  For example, one such advantage is that we can
>standardize them across different filesystems.
>
>However, even if we do have standardized mount options, it is a real
>pain to have to type a very long mount option when doing manual
>mounts.  So having defaults that can be stored in the superblock seems
>to be a good idea, in my opinion.
>  
>
agreed.

>						- Ted
>
>
>  
>



* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-16 11:52             ` Helge Hafting
@ 2005-06-16 19:52               ` Hans Reiser
  0 siblings, 0 replies; 28+ messages in thread
From: Hans Reiser @ 2005-06-16 19:52 UTC (permalink / raw)
  To: Helge Hafting
  Cc: Theodore Ts'o, Kenichi Okuyama, Andreas Dilger, fs,
	linux-fsdevel, linux-kernel, zhiming, qufuping, madsys, xuh,
	koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v, fernando, kskmori,
	takenakak, yamaguchi, ext2-devel, shaggy, xfs-masters,
	Reiserfs developers mail-list

Helge, the kernel needs to send a proper message to some proper daemon
that does the right thing.  Neither proper message nor proper daemon
currently exists.  If you think that what we have works, try unplugging
a USB drive while an unsophisticated user is using it.

The conversion to and then from these -EIO style codes with their
limited number of allowed values (what is it, 128 values?) loses
information, such as what the user should do, and whether to tell the
user about it or just tell the app. 

The current system was not exactly designed by a usability expert.....

Some error messages need to be fs specific (e.g. hash collisions), and
some are not at all fs specific and should be standardized across all
filesystems (e.g. USB drive unplugged).

The essential point is that what we have now is incoherent and broken in
its usability.  Fixing it requires deep work, not surface work.  Deeper
work than I think Kenichi-san realized.  Let's not get mired in details
though of what API should be created until someone volunteers to do the
substantial labor required to unbreak the usability.  If someone were to
appear and offer to fix the usability, I would be happy to have
reiserfs/reiser4 cooperate with that.  Ted, what about ext2/3, would you
guys support such an effort?  Maybe if we are encouraging as a group,
someone will volunteer....

Hans

Helge Hafting wrote:

> Hans Reiser wrote:
>
>> What users need is for a window to pop up saying "the usb drive is
>> turned off" or "we are getting checksum errors from XXX, this may
>> indicate hardware problems that require your attention".
>>  
>>
> Nice.  And the way to do this right is to have the kernel merely
> log the error as usual.  The user can have some daemon listening
> to the log, this program may then pop up error messages with
> nifty detailed explanations, start up diagnostic software
> for various subsystems and so on.
> The kernel can't do GUI stuff - a GUI may or may not be present,
> and the kernel cannot know.  The server may not run X at all
> but I still run graphical SW on it using a workstation or X-terminal.
> Or the pc may have three video cards, each running a different xserver
> with different users for each.  Who to report to?
>
> An error-reporting daemon has an easier job: it can look up the
> correct (possibly remote) display in its config file for all those
> cases when there isn't just _one_ display.
>
> Helge Hafting
>
>
>
>
>
>


* Re: [Ext2-devel] Re: [RFD] FS behavior (I/O failure) in kernel summit
  2005-06-15 20:38           ` Hans Reiser
  2005-06-15 22:53             ` Theodore Ts'o
  2005-06-16 11:52             ` Helge Hafting
@ 2005-06-16 21:27             ` Pavel Machek
  2 siblings, 0 replies; 28+ messages in thread
From: Pavel Machek @ 2005-06-16 21:27 UTC (permalink / raw)
  To: Hans Reiser
  Cc: Theodore Ts'o, Kenichi Okuyama, Andreas Dilger, fs,
	linux-fsdevel, linux-kernel, zhiming, qufuping, madsys, xuh,
	koichi, kuroiwaj, okuyama, matsui_v, kikuchi_v, fernando, kskmori,
	takenakak, yamaguchi, ext2-devel, shaggy, xfs-masters,
	Reiserfs developers mail-list

Hi!

> >Kenichi-San,
> >
> >Part of the problem is that we are limited by the constraints of the
> >POSIX specification for error handling. 
> >
> Ted, if I understand you correctly, I agree with you.  ;-)
> 
> What users need is for a window to pop up saying "the usb drive is
> turned off" or "we are getting checksum errors from XXX, this may
> indicate hardware problems that require your attention".
> 
> Now that GUIs exist, and now that more errors are possible because the
> kernel is more complex, perhaps kernel error handling should be
> reconsidered.  I don't have the feeling that anyone has felt themselves
> authorized to take a deep look at how this ought to be designed.  I mean
> sure, there are sometimes console windows that things get printed into,
> but unsophisticated users basically want to be prompted if something

I believe syslog can handle this just fine. Just add some GUI code to
watch syslog, and if a high-enough priority (KERN_CRIT?) message happens,
display that message to the user.

...which raises the interesting question of "how to internationalize this
beast", but the list of KERN_CRIT messages should be reasonably small.
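
A sketch of the filtering step such a watcher would need is below.  It
relies on one fact about the kernel log: each line carries its level as a
"<N>" prefix, where KERN_CRIT is "<2>" and lower numbers are more severe.
The function names kmsg_level() and should_notify_user() are hypothetical,
and how the lines are read and how the message is displayed are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

#define LEVEL_CRIT 2  /* KERN_CRIT is written as "<2>" */

/* Return the level from a "<N>" prefix, or -1 if the line has none. */
static int kmsg_level(const char *line)
{
        assert(line != NULL);
        if (line[0] == '<' && isdigit((unsigned char)line[1]) &&
            line[2] == '>')
                return line[1] - '0';
        return -1;
}

/* Show the message only if it is KERN_CRIT or more severe. */
static bool should_notify_user(const char *line)
{
        int level = kmsg_level(line);

        return level >= 0 && level <= LEVEL_CRIT;
}
```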

								Pavel
-- 
teflon -- maybe it is a trademark, but it should not be.


end of thread, other threads:[~2005-06-16 21:28 UTC | newest]

Thread overview: 28+ messages
2005-06-13 19:53 [RFD] FS behavior (I/O failure) in kernel summit fs
2005-06-13 17:59 ` Hans Reiser
2005-06-13 20:13   ` [Ext2-devel] " Andreas Dilger
2005-06-13 23:56     ` Hans Reiser
2005-06-14  2:46       ` Kenichi Okuyama
2005-06-15 14:01         ` [Ext2-devel] " Theodore Ts'o
2005-06-15 19:40           ` Kenichi Okuyama
2005-06-15 20:37             ` [Ext2-devel] " Theodore Ts'o
2005-06-15 20:38           ` Hans Reiser
2005-06-15 22:53             ` Theodore Ts'o
2005-06-16 19:08               ` [Ext2-devel] " Hans Reiser
2005-06-16 11:52             ` Helge Hafting
2005-06-16 19:52               ` Hans Reiser
2005-06-16 21:27             ` Pavel Machek
2005-06-16 11:38           ` Matthew Wilcox
2005-06-14 12:51       ` Erik Mouw
2005-06-14 13:48         ` Denis Vlasenko
2005-06-14 17:16         ` Kenichi Okuyama
2005-06-14 20:17           ` Szakacsits Szabolcs
2005-06-14  3:46     ` Valdis.Kletnieks
2005-06-14 17:41     ` [Ext2-devel] " fs
2005-06-13 21:51   ` Jeff Mahoney
2005-06-14  0:03     ` Hans Reiser
2005-06-15 17:39       ` Jeff Mahoney
2005-06-16  2:18         ` Dave Chinner
2005-06-16 15:21           ` Jeff Mahoney
2005-06-16 18:52         ` Hans Reiser
2005-06-14 13:22 ` Dave Kleikamp
