reiserfs-devel.vger.kernel.org archive mirror
From: Edward Shishkin <edward.shishkin@gmail.com>
To: intelfx@intelfx.name,
	ReiserFS development mailing list
	<reiserfs-devel@vger.kernel.org>
Subject: Re: Reiser4 Upstream Git Repositories on GitHub
Date: Tue, 4 Oct 2016 17:52:17 +0200
Message-ID: <0a727db9-6f81-3bef-f96a-c328e5b6ed66@gmail.com>
In-Reply-To: <314913f7-5bf0-3edc-ad0d-6a88567c0ae0@gmail.com>

On 09/29/2016 05:07 PM, Edward Shishkin wrote:
[...]
> BTW, your fstrim-scanner is the first candidate to scrub ;)
>>>>> Actually, I think about a common multi-functional scanner, with 3
>>>>> modes:
>>>>> 1) discard only (handle only free blocks);
>>>>> 2) scrub only (handle only busy blocks);
>>>>> 3) combined (scan the whole partition; for free blocks call discard,
>>>>>    for busy ones call scrub).
>>>>> Any ideas?
>>>>>
>>>>> Thanks,
>>>>> Edward.
>>>>> PS: We have an own ioctl number: 0xCD inherited from
>>>>> ReiserFS(v3).
>>>> I still have to finish the erase unit detection (which has completely
>>>> stalled) to merge all this work. Moreover:
>>>>
>>>> For the fstrim, we have dropped all locking and serialization issues
>>>> and declared that fstrim is best-effort: if it misses some blocks due
>>>> to concurrent transactions allocating and freeing blocks, it doesn't
>>>> matter.
>>>>
>>>> For the scrub, this won't fly...
>>> Indeed, the requirements to fstrim and scrub are different,
>>> but, as I remember, the last decision was to not miss:
>>> http://marc.info/?l=reiserfs-devel&m=141391883022745&w=2
>>> so everything will fly just perfectly..
>>>
>>> Edward.
>> This is different thing, it's about grabbing space in bigger chunks...
>> If a concurrent transaction allocates some space and frees some space,
>> we don't care, because it will then be discarded "online".
>>
>> But in case of the scrub, how do we protect from the storage tree
>> changing right beneath us?
>
> Yup, it seems that the idea of common scanner is dead.
> It should be an independent tool. I think, we need to simply scan the
> storage tree, do whatever is needed for each node, and make it dirty.

My last thought is that online scrub is not needed.

Global synchronization issues cannot arise online; they can arise
only offline (after fsck-ing). Accordingly, I suggest moving the
global synchronization stuff to user space, where it will be extremely
simple (a sort of dd-ing partitions in parallel, plus we'll need a
user-space version of init_volume.c to collect all mirrors properly).
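
To make the "dd-ing partitions in parallel" idea concrete, here is a
minimal user-space sketch. It is only an illustration under assumptions:
the names copy_image() and sync_replicas() are hypothetical, the caller
passes the original and replica paths explicitly (a real tool would find
them with the user-space init_volume.c logic mentioned above), and the
volume is assumed to be unmounted.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1 << 20  # copy granularity in bytes; an arbitrary choice for this sketch


def copy_image(src_path, dst_path, chunk=CHUNK):
    """Overwrite the start of dst_path with the contents of src_path, dd-style.

    The destination is opened with "r+b", so it must already exist
    (as a block device would) and is never truncated.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        while True:
            buf = src.read(chunk)
            if not buf:
                break
            dst.write(buf)
            copied += len(buf)
    return copied


def sync_replicas(original, replicas):
    """Copy the original onto every replica, one worker thread per replica.

    Returns the number of bytes written to each replica, in order.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(replicas))) as pool:
        return list(pool.map(lambda r: copy_image(original, r), replicas))
```

One thread per replica is enough here because the work is IO-bound; each
worker reads the original independently, which keeps the sketch short at
the cost of reading the source once per replica.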

What can happen online is only(*) locally fixable problems (when, after
IO completion, the page is uptodate but checksum verification has failed).
There are 2 approaches:

1) Fix those local problems online: if __jparse() detects a local
    problem, simply issue a "correction" - a write request to the
    original subvolume - and wait for its completion _before_ marking
    the jnode parsed (to prevent "rollbacks").

2) In the case of a local problem, mark the status block of the volume
    to indicate that global synchronization is required before fsck-ing,
    and then forget about all local problems for that mount session.
    I haven't calculated the probability of simultaneous corruption of
    original and replica blocks with the same block numbers (I don't
    have any input numbers), but I suspect that it is vanishingly small.
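
A back-of-the-envelope form of that probability, purely as a sketch: no
numbers are plugged in, and the independence assumption is mine. Let p
be the probability that a given block is corrupted on one mirror,
independently across mirrors and across blocks, and let N be the number
of blocks in the volume. Then:

```latex
% Both copies of one particular block number are corrupted:
P_{\text{pair}} = p \cdot p = p^{2}

% At least one of the N block numbers is corrupted on both mirrors:
P_{\text{any}} = 1 - \left(1 - p^{2}\right)^{N} \approx N p^{2}
\qquad \text{when } N p^{2} \ll 1
```

So under these assumptions the risk grows linearly with volume size but
quadratically shrinks with the per-block corruption rate.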

So we need either global offline synchronization both before and after
fsck, or a single global post-fsck synchronization plus online local
self-healing.
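
The two approaches can be compared with a small user-space model. This
is not reiser4 code: Subvolume, Volume and parse_block() are
hypothetical names, zlib.crc32 merely stands in for whatever checksum
the format uses, writes are synchronous (so "wait for completion" is
implicit), and taking the good copy from the replica is my assumption
about where the "correction" data comes from.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Subvolume:
    """Toy block store: blocknr -> (payload, stored checksum)."""
    blocks: dict = field(default_factory=dict)

    def write(self, blocknr, payload):
        # Synchronous in this model: returning means the IO has completed.
        self.blocks[blocknr] = (payload, zlib.crc32(payload))

    def read(self, blocknr):
        # Returns the payload and whether it matches the stored checksum.
        payload, stored = self.blocks[blocknr]
        return payload, zlib.crc32(payload) == stored


@dataclass
class Volume:
    original: Subvolume
    replica: Subvolume
    needs_global_sync: bool = False  # stands in for a flag in the status block
    parsed: set = field(default_factory=set)


def parse_block(vol, blocknr, heal_online):
    """Read a block, handling a checksum failure on the original copy."""
    payload, ok = vol.original.read(blocknr)
    if not ok:
        payload, replica_ok = vol.replica.read(blocknr)
        if not replica_ok:
            # Both copies are bad: not a locally fixable problem.
            raise IOError("block %d is corrupted on both mirrors" % blocknr)
        if heal_online:
            # Approach 1: write the correction to the original and let it
            # complete before the block is marked parsed below.
            vol.original.write(blocknr, payload)
        else:
            # Approach 2: only record that a global sync is needed later.
            vol.needs_global_sync = True
    vol.parsed.add(blocknr)
    return payload
```

In both modes the caller gets good data; the difference is only whether
the original copy is repaired right away or left for the offline pass.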

----
(*) I am not considering non-fixable IO errors (including the death of
one or more mirrors), which you can handle online with the block layer's
RAID-1. However, we can also implement that kind of failover in reiser4.
Downgrading arrays is simple to implement; upgrading them will again
require global online synchronization (scrub).

Edward.
