reiserfs-devel.vger.kernel.org archive mirror
From: Edward Shishkin <edward.shishkin@gmail.com>
To: "Dušan Čolić" <dusanc@gmail.com>
Cc: reiserfs-devel <reiserfs-devel@vger.kernel.org>
Subject: Re: [RFC] Smart fibration plugin ext_4321
Date: Sat, 07 Jan 2017 09:58:59 +0200
Message-ID: <58709FC3.5000606@gmail.com>
In-Reply-To: <CADW=+3kzCd-jB9dPFyqAdNR6L2Cs81ibTfdYbWdnQd8zY3J38w@mail.gmail.com>



On 01/07/2017 10:15 AM, Dušan Čolić wrote:
> On Sat, Jan 7, 2017 at 12:05 AM, Edward Shishkin
> <edward.shishkin@gmail.com> wrote:
>> On 01/07/2017 01:09 AM, Dušan Čolić wrote:
>>> On Fri, Jan 6, 2017 at 8:58 PM, Edward Shishkin
>>> <edward.shishkin@gmail.com> wrote:
>>>>
>>>> On 01/06/2017 05:34 PM, Dušan Čolić wrote:
>>>>> On Fri, Jan 6, 2017 at 2:44 PM, Edward Shishkin
>>>>> <edward.shishkin@gmail.com> wrote:
>>>>>> On 12/26/2016 11:13 PM, Dušan Čolić wrote:
>>>>>>> On Mon, Dec 26, 2016 at 7:47 PM, Edward Shishkin
>>>>>>> <edward.shishkin@gmail.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/25/2016 02:59 AM, Dušan Čolić wrote:
>>>>>>>>> Fibration is a great way to decrease fragmentation and increase
>>>>>>>>> throughput.
>>>>>>>>> Currently there are 4 fibration plugins, lex, dot.o, ext_1 and ext_3
>>>>>>>>> and they all have their upsides and downsides.
>>>>>>>>>
>>>>>>>>> The proposed fibration plugin combines them all: it groups files
>>>>>>>>> with the same 1, 2, 3 or 4 character extension and sorts them
>>>>>>>>> within the same fiber group.
>>>>>>>>>
>>>>>>>>> With this fibration plugin all files of one type, e.g. xvid files,
>>>>>>>>> would be in the same group in a folder on disk, sorted alphabetically.
>>>>>>>>
>>>>>>>>
>>>>>>>> What application wants all xvid files to be in the same group?
>>>>>>>> Do you have any benchmark numbers which show advantages
>>>>>>>> of the new plugin?
>>>>>>>>
>>>>>>> Xvid files are just an example.
>>>>>>> ext_1234 fibration would be equal to sum of ext_1, ext_2, ext_3, ext_4
>>>>>>> and dot_o in one.
>>>>>>>
>>>>>>> In the current default plugin (dot_o) we sort all files by name,
>>>>>>> except .o files, which we put at the end.
>>>>>>> So if we had a source directory with .c, .h and .o files in it, the
>>>>>>> files would be ordered by extension like: chchchchchchchchoooooooooooooo
>>>>>>> I presumed that in some use cases it is better to have the files sorted
>>>>>>> ccccccccccchhhhhhhhhhhhhhoooooooooooo
>>>>>>>
>>>>>>> The hypothesis is that files with the same extension are of the same
>>>>>>> order of size, and that this can be used to reduce fragmentation.
>>>>>>
>>>>>>
>>>>>> What kind of fragmentation are you talking about?
>>>>>> Internal (which results in "dead" disk space), or
>>>>>> external (which results in a lot of "extents")?
>>>>>>
>>>>> External
>>>>>
>>>>>> Edward.
>>>>>>
>>>>>>
>>>>>>> If we group files by extension within one directory, then when we
>>>>>>> write files of some extension after deleting other files of that
>>>>>>> extension, they would land in the same group as the deleted files, so
>>>>>>> they would be written in a similar place and occupy a 'hole' of
>>>>>>> similar size.
>>>>
>>>>
>>>> So "similar" means the same order, that is, file sizes can differ by a
>>>> factor of 2?
>>>> TBH, I don't see what can be deduced from this assumption ;)
>>>> It can happen that a new file either doesn't fit into that hole, or
>>>> occupies too small a part of it, so that the next file won't fit into
>>>> the rest of the hole.
>>>>
>>> OFC we can never guarantee that the new file completely fits the hole
>>> (especially as we go through compression in the next layer), but for
>>> files both smaller and larger than the hole we would have a higher
>>> probability of fewer extents in situations with 2 or more types of
>>> files in a directory. For one type of file in a directory the behavior
>>> would be the same as with the dot_o and ext_1 plugins.
>>
>>
>> I have to disappoint you: fibration plugins are about mapping a semantic
>> tree to the storage tree. Simply speaking, they manage the mapping
>> object -> key, which has nothing in common with real locations on disk.
>>
>> It is the block allocator that assigns disk addresses to nodes of the
>> storage tree (right before writing them to disk at flush time).
>> And I am sure that the block allocator doesn't care about fibration groups.
>>
>> I strongly recommend against experimenting with the block allocator,
>> simply because I know how many people have wasted a lot of time on it
>> without results.
> Then what is this comment at the beginning of kassign.c about:
>
>
> * In reiser4 every piece of file system data and meta-data has a key. Keys
> * are used to store information in and retrieve it from reiser4 internal
> * tree. In addition to this, keys define _ordering_ of all file system
> * information: things having close keys are placed into the same or
> * neighboring (in the tree order) nodes of the tree. As our block allocator
> * tries to respect tree order (see flush.c), keys also define order in which
> * things are laid out on the disk, and hence, affect performance directly.

I cannot find where in the code the block allocator respects key ordering.
If you find it, let me know.

Thanks,
Edward.
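
The ordering difference discussed earlier in the thread (dot_o versus
grouping by extension) can be illustrated with a toy model. This is only a
sketch, not reiser4 code: the function names below are hypothetical, and a
"fibre" is modeled simply as a value that is compared before the file name
when directory entries are ordered. As noted above, this concerns key order
only; it says nothing about where the block allocator places nodes on disk.

```python
# Toy model only: NOT reiser4 code. A "fibre" is modeled as a value that is
# compared before the file name when ordering directory entries.

def fibre_dot_o(name: str) -> int:
    """Model of the dot_o behaviour described in the thread: .o files go last."""
    return 1 if name.endswith(".o") else 0


def fibre_by_extension(name: str, max_len: int = 4) -> str:
    """Hypothetical model of the proposal: group by 1..max_len character extension."""
    stem, dot, ext = name.rpartition(".")
    if dot and stem and 1 <= len(ext) <= max_len:
        return ext
    return ""  # no extension (or too long): shared default group


def order(names, fibre):
    # Entries sort by (fibre, name), so names with equal fibres stay adjacent.
    return sorted(names, key=lambda n: (fibre(n), n))


names = ["a.c", "a.h", "a.o", "b.c", "b.h", "b.o"]
dot_o_order = order(names, fibre_dot_o)
ext_order = order(names, fibre_by_extension)

print(" ".join(dot_o_order))  # .c and .h interleaved, .o files at the end
print(" ".join(ext_order))    # all .c, then all .h, then all .o
```

In this model the first ordering corresponds to the "chch...ooo" pattern and
the second to the "ccc...hhh...ooo" pattern from the quoted discussion.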
