From: Edward Shishkin <edward.shishkin@gmail.com>
To: Ivan Shapovalov <intelfx100@gmail.com>
Cc: reiserfs-devel@vger.kernel.org
Subject: Re: reiser4: discard support
Date: Fri, 02 May 2014 14:44:52 +0200 [thread overview]
Message-ID: <53639344.90801@gmail.com> (raw)
In-Reply-To: <5363860C.7060708@gmail.com>
On 05/02/2014 01:48 PM, Edward Shishkin wrote:
> On 05/02/2014 01:51 AM, Ivan Shapovalov wrote:
>> On Wednesday 30 April 2014 at 19:51:03, Edward Shishkin wrote:
>>> On 04/30/2014 05:04 PM, Ivan Shapovalov wrote:
>>> [...]
>>>
>>>>> There is a ready implementation of rb-trees in the kernel (see
>>>>> ./lib/rbtree.c,
>>>>> ./lib/rbtree_test.c). Could you please take a look at this?
>>>> I've checked it already. Well, in case of trees:
>>>> - combined extent insertion+adjacent extent merge complexity is
>>>> O(n*log(n))
>>> Actually, this is:
>>>
>>> 1) Add n extents one-by-one to empty tree:
>>> log(1) + log(2) + ... + log(n-1)
>>>
>>> 2) Merge 2 trees of n extents:
>>> log(n) + log(n+1) + ... + log(2n-1)
>>>
>>> Total: log(1*2*...*2n-1) = log((2n-1)!)
>>>
>>> in accordance with Stirling's formula:
>>>
>>> log((2n)!) = log(((2n)^(1/2))*(2n/e)^n) =
>>> = (1/2)*log(2n) + n*log(2n/e) < n + n*log(n)
>> Hm.
>>
>> Consider we fill two data structures with N extents each, then merge
>> these two
>> data structures and prepare the resulting one for traversal (for
>> lists it will
>> be sorting, for trees that's no-op).
>>
>> For trees:
>> 1) Adding
>> 2*log((n-1)!)
>
> Why "2*" ?
> I suppose they are filled in parallel...
> Total complexity (when everything is serialized) is not interesting ;)
If lists look more simple for you, then do it with lists.
Eventually, we'll replace everything with fast-merge trees ;)
>
>>
>> 2) Merging
>> log((2n-1)!/(n-1)!)
>>
>> 3) Sorting
>> none
>>
>> Total:
>> log((2n-1)!) + log((n-1)!)
>>
>> For lists:
>> 1) Adding
>> 2N
>>
>> 2) Merging
>> constant
>>
>> 3) Sorting
>> 2N*log(2N)
>>
>> Total:
>> 2N+2N*log(2N)
>>
>> Quick testing in WolframAlpha shows that, if these estimations are
>> correct,
>> lists tend to have lesser complexity starting with approx. N=160.
>>
>>> Since atoms can be merged in parallel, we have that
>>> trees work better than lists. Also tree of extents has
>>> better memory consumption because of merging extents.
>>>
>>>> against O(n+n*log(n)) in lists;
>>>>
>>>> - joining two trees is not less than O(log(n)) against O(1) in lists.
>>>>
>>>> So I can't see any benefit in using trees, and join performance is a
>>>> downside of these. Am I missing something?
>>> [...]
>>>
>>>> Why after reiser4_invalidate_list()? I thought that it should be
>>>> called between reiser4_write_logs(), but before
>>>> reiser4_invalidate_list().
>>> OK, insert before. I think, it doesn't matter, since we don't look
>>> at this list when issuing discard requests, no? :)
>> I just don't understand what implications does that function have. It
>> seems to
>> mess with jnodes, and I can imagine that this leads to races with
>> somebody
>> modifying the bitmaps while issue_discard_requests() tries to read
>> them to
>> check discard extents...
>
> "somebody modifying the bitmaps" will wait for commit_mutex.
> Have you found where the bitmaps are modified? This is
> pre_commit_hook_bitmap().
>
>> Again, for me, internals of reiser4 still mostly look
>> like heavy wizardry. :)
>
> This is because the technical level of reiser4 is very high.
> Many very strong scientists contributed to reiser4.
>
> I also don't understand in details how some reiser4 subsystems work.
> Once I need details, I open the respective code and read it with the
> comments.
> The most important thing is general understanding of how it works.
> Nobody is able to grok this only by reading reiser4 code before
> sleeping. You need to implement at least one feature by your hands.
> Yes, it is not easy, and not for everyone. Like every high-level research
> in the science...
>
> Edward.
>
next prev parent reply other threads:[~2014-05-02 12:44 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-30 7:43 reiser4: discard support Ivan Shapovalov
2014-04-30 14:11 ` Edward Shishkin
[not found] ` <1B07908E-B768-407D-ADFA-B3C539FABB49@gmail.com>
[not found] ` <53613807.7090605@gmail.com>
2014-05-01 23:51 ` Ivan Shapovalov
[not found] ` <CAJZSrNK4=o9vocDfSM=4W5ZgtqZ6RVpmU66sGCu6HFsdN47OHw@mail.gmail.com>
2014-05-02 11:06 ` Ivan Shapovalov
2014-05-02 11:48 ` Edward Shishkin
2014-05-02 12:44 ` Edward Shishkin [this message]
2014-05-02 13:47 ` Ivan Shapovalov
2014-05-02 13:36 ` Ivan Shapovalov
2014-05-02 14:07 ` Edward Shishkin
2014-05-02 14:32 ` Ivan Shapovalov
2014-05-02 18:10 ` Edward Shishkin
2014-05-03 18:48 ` Ivan Shapovalov
2014-05-03 20:21 ` Edward Shishkin
2014-05-03 20:32 ` Ivan Shapovalov
2014-05-06 8:58 ` Edward Shishkin
2014-05-07 7:35 ` Ivan Shapovalov
2014-05-07 21:04 ` Edward Shishkin
-- strict thread matches above, loose matches on Subject: below --
2014-04-30 17:55 Edward Shishkin
2014-04-30 18:53 ` Edward Shishkin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53639344.90801@gmail.com \
--to=edward.shishkin@gmail.com \
--cc=intelfx100@gmail.com \
--cc=reiserfs-devel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).