reiserfs-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Edward Shishkin <edward.shishkin@gmail.com>
To: Ivan Shapovalov <intelfx100@gmail.com>
Cc: reiserfs-devel <reiserfs-devel@vger.kernel.org>
Subject: Re: reiser4: discard support
Date: Wed, 30 Apr 2014 16:11:35 +0200	[thread overview]
Message-ID: <53610497.4060101@gmail.com> (raw)
In-Reply-To: <1496741.djsd6PJ1Ae@intelfx-laptop>

On 04/30/2014 09:43 AM, Ivan Shapovalov wrote:
> Hi Edward,

Hello Ivan.

>
> So I bit the bullet, ditched all tasks for the upcoming holidays and gave
> another try to the discard support in reiser4. Since my last status update,
> I've lost the local kernel tree due to bad scripting (decided to TRIM free
> space between partitions... and TRIM'ed everything that is NOT free space), so
> I had to start from scratch.

Once in a while I send key patches to my gmail mailbox configured with 
IMAP..

>
> I decided to go with following algorithm:
>
> During the transaction deallocated ranges are logged as-is to a list or array
> (let's call it "the delete log").

"Log" is a busy term. It is already used for journal blocks.
Let's use "discard set" instead?

> On atom commit we will generate a minimal
> superset of the delete log, comprised only of whole erase units ("the discard
> set").
> So, at commit time the following actions take place:
> - elements of the delete log are sorted;
> - the delete log is iterated, merging any adjacent ranges;
> - for each resulting range in the delete log its start is rounded down to the
>    closest erase unit boundary, and, starting from this block, extents of erase
>    unit length are created until whole range is covered;
> - for each such new extent it is checked whether does it contain allocated
>    blocks. If no (i. e. the range is completely deallocated), the range is
>    added to the discard set.
>
> Complexity of logging is thus O(n), added complexity of atom joining is O(1)
> and added complexity of atom commit is O(n*log(n) + n).
>
> The first question is how to store the delete log.
> - blocknr_set isn't ordered per se, so adjacent entries can't be easily merged
> - simple linked list (extent per node) seems like too big overhead
> - kmalloc()'ed arrays can't be joined in O(1) and they are static (imposes a
> hard limit on log size)
> - kmem_cache... feels like using a sledge-hammer.

Why not to maintain discard set as extents in a special per-atom tree?
When adding an extent, try to merge it with the neighbors (if any), so
that everything is prepared for discard at any moment. Just walk through
the tree from left to right and discard all suitable extents?

There is a ready implementation of rb-trees in the kernel (see 
./lib/rbtree.c,
./lib/rbtree_test.c). Could you please take a look at this?

>
> And oh, where in commit_current_atom() should these things be done? I guess
> right after the reiser4_write_logs(), under the tmgr mutex,

You are right.
Insert the function issue_discard_requests() right after the
reiser4_invalidate_list(ATOM_OVRWR_LIST(*atom)); and before
mutex_unlock(&sbinfo->tmgr.commit_mutex);

>   but I can't really
> explain why I think so.

reiser4_write_logs() commits the atom's overwrite set, which includes
all system blocks, including bitmap blocks.
If you issue discard requests before reiser4_write_logs(), then it
can happen that the system crashes after issue_discard_requests(),
but before reiser4_write_logs(). So, we'll have that the blocks have
been discarded, while the file system still refers to them (as bitmap
eventually hasn't been updated). Welcome to data corruption..

Thanks,
Edward.
P.S.: You'll need to set up a debugging environment. Take a look at this:
http://bipinkunal.blogspot.cz/2012/05/kgdb-tutorial.html
Do you have a second machine with a serial port?

  reply	other threads:[~2014-04-30 14:11 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-30  7:43 reiser4: discard support Ivan Shapovalov
2014-04-30 14:11 ` Edward Shishkin [this message]
     [not found] ` <1B07908E-B768-407D-ADFA-B3C539FABB49@gmail.com>
     [not found]   ` <53613807.7090605@gmail.com>
2014-05-01 23:51     ` Ivan Shapovalov
     [not found]       ` <CAJZSrNK4=o9vocDfSM=4W5ZgtqZ6RVpmU66sGCu6HFsdN47OHw@mail.gmail.com>
2014-05-02 11:06         ` Ivan Shapovalov
2014-05-02 11:48       ` Edward Shishkin
2014-05-02 12:44         ` Edward Shishkin
2014-05-02 13:47           ` Ivan Shapovalov
2014-05-02 13:36         ` Ivan Shapovalov
2014-05-02 14:07           ` Edward Shishkin
2014-05-02 14:32             ` Ivan Shapovalov
2014-05-02 18:10               ` Edward Shishkin
2014-05-03 18:48                 ` Ivan Shapovalov
2014-05-03 20:21                   ` Edward Shishkin
2014-05-03 20:32                     ` Ivan Shapovalov
2014-05-06  8:58                       ` Edward Shishkin
2014-05-07  7:35                         ` Ivan Shapovalov
2014-05-07 21:04                           ` Edward Shishkin
  -- strict thread matches above, loose matches on Subject: below --
2014-04-30 17:55 Edward Shishkin
2014-04-30 18:53 ` Edward Shishkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53610497.4060101@gmail.com \
    --to=edward.shishkin@gmail.com \
    --cc=intelfx100@gmail.com \
    --cc=reiserfs-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).