* Re: [Lsf-pc] [LSF/MM TOPIC] Online filesystem check framework
From: Dave Chinner @ 2016-01-22 12:04 UTC
To: Jan Kara; +Cc: Goldwyn Rodrigues, lsf-pc, linux-fsdevel, linux-block
On Fri, Jan 22, 2016 at 10:36:26AM +0100, Jan Kara wrote:
> Hello,
>
> On Thu 21-01-16 22:45:29, Goldwyn Rodrigues wrote:
> > Topic: Generic Online filesystem Check framework
> >
> > Motivation:
> > + Better uptime - Filesystems turn read-only at the first error encountered
> > and it may block critical applications which have not encountered the error.
> > + Unmountable Filesystems - Some filesystems such as clustered filesystem
> > may not be unmountable because they are used by too many computers to be
> > taken offline.
> > + Autofix - may sound dangerous as fixing is without user intervention, but
> > an option may help admins which are looking for a good uptime.
> > + Logic inbuilt - most logic of access to filesystem is already in the
> > filesystem driver. The fix/check would make use of existing functionality.
> >
> > Framework would be around providing a generic interface framework which
> > would use inode numbers as the basic unit to check or fix. Other metadata
> > may need special parameters. Userspace scripts will issue check/fixes to the
> > system, which may/may not be driven by
>
> Well, this is a very difficult topic so I think we need some concrete
> proposal and ideally some RFC code about what you think needs to be done to
> make online fsck possible. Without that it will be just a useless
> handwaving since I don't think anybody is able to come up with a decent
> proposal during that half an hour session...

I don't see it being generic, either, because metadata structures,
dependencies, redundancy and algorithms are highly filesystem
specific. It just doesn't seem feasible to me - if it was then we'd
only need one userspace fsck/repair program....

FWIW, for XFS we already have a plan towards implementing online
repair[*] which we've been working towards for the past 5 years, but
it's another 5 years' worth of work before we'll get to the point of
being able to dynamically repair corruptions as they are detected...

[*] I wrote this document back in 2008:
http://xfs.org/index.php/Reliable_Detection_and_Repair_of_Metadata_Corruption

We've got metadata CRCs now, and reverse mapping is just about ready
to be merged. Then we need parent pointers (i.e. directory structure
reverse mappings), transaction rollback on error, etc, before we can
even start to think about online repair algorithms.

Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [Lsf-pc] [LSF/MM TOPIC] Online filesystem check framework
From: Goldwyn Rodrigues @ 2016-01-25 18:37 UTC
To: Jan Kara; +Cc: lsf-pc, linux-fsdevel, linux-block
On 01/22/2016 03:06 PM, Jan Kara wrote:
> Hello,
>
> On Thu 21-01-16 22:45:29, Goldwyn Rodrigues wrote:
>> Topic: Generic Online filesystem Check framework
>>
>> Motivation:
>> + Better uptime - Filesystems turn read-only at the first error encountered
>> and it may block critical applications which have not encountered the error.
>> + Unmountable Filesystems - Some filesystems such as clustered filesystem
>> may not be unmountable because they are used by too many computers to be
>> taken offline.
>> + Autofix - may sound dangerous as fixing is without user intervention, but
>> an option may help admins which are looking for a good uptime.
>> + Logic inbuilt - most logic of access to filesystem is already in the
>> filesystem driver. The fix/check would make use of existing functionality.
>>
>> Framework would be around providing a generic interface framework which
>> would use inode numbers as the basic unit to check or fix. Other metadata
>> may need special parameters. Userspace scripts will issue check/fixes to the
>> system, which may/may not be driven by
>
> Well, this is a very difficult topic so I think we need some concrete
> proposal and ideally some RFC code about what you think needs to be done to
> make online fsck possible. Without that it will be just a useless
> handwaving since I don't think anybody is able to come up with a decent
> proposal during that half an hour session...
Yes, I do have something more than this in mind, and I will come up with
an RFC and hopefully some code. I will try and post it by the deadline
of Feb 11.
--
Goldwyn