From: Edward Shishkin <edward.shishkin@gmail.com>
To: "Dušan Čolić" <dusanc@gmail.com>
Cc: reiserfs-devel <reiserfs-devel@vger.kernel.org>
Subject: Re: [RFC] Smart fibration plugin ext_4321
Date: Fri, 06 Jan 2017 15:44:03 +0200
Message-ID: <586F9F23.5010409@gmail.com>
In-Reply-To: <CADW=+3moCR2ufXeD4BmXFG0WE5uS2VpM95S6ZiggKg4BD47tfg@mail.gmail.com>
On 12/26/2016 11:13 PM, Dušan Čolić wrote:
> On Mon, Dec 26, 2016 at 7:47 PM, Edward Shishkin
> <edward.shishkin@gmail.com> wrote:
>>
>>
>> On 12/25/2016 02:59 AM, Dušan Čolić wrote:
>>> Fibration is a great way to decrease fragmentation and increase throughput.
>>> Currently there are 4 fibration plugins: lex, dot_o, ext_1 and ext_3,
>>> and they all have their upsides and downsides.
>>>
>>> The proposed fibration plugin combines them all, so that files with
>>> the same extension (of 1, 2, 3 or 4 characters) are grouped together
>>> and sorted within the same fibre group.
>>>
>>> With this fibration plugin, all xvid files (for example) would be in the
>>> same group in a folder on disk, sorted alphabetically
>>
>>
>> What application wants all xvid files to be in the same group?
>> Do you have any benchmark numbers which show advantages
>> of the new plugin?
>>
> Xvid files are just an example.
> ext_1234 fibration would be equivalent to ext_1, ext_2, ext_3, ext_4
> and dot_o combined into one.
>
> In the current default plugin (dot_o) we sort all files by name from the
> start, except .o files, which we put at the end.
> So if we had a source directory with .c, .h and .o files in it, the files
> would be ordered by extension like: chchchchchchchchoooooooooooooo
> I presumed that in some use cases it is better to have the files sorted as
> ccccccccccchhhhhhhhhhhhhhoooooooooooo
>
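To make the two orderings concrete, here is a minimal userspace sketch in C. It is an illustration only: it is not Reiser4 code and not part of the original message. The names ext_of, cmp_dot_o, cmp_by_ext and order_pattern are hypothetical, and cmp_dot_o only approximates the dot_o behaviour described above (names ending in ".o" last, everything else by name).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the extension (text after the last '.'), or "" if there is none.
 * A leading dot, as in ".bashrc", is not treated as an extension. */
static const char *ext_of(const char *name)
{
    const char *dot = strrchr(name, '.');

    return (dot != NULL && dot != name) ? dot + 1 : "";
}

/* dot_o-like ordering (approximation): ".o" names last, the rest by name. */
static int cmp_dot_o(const void *a, const void *b)
{
    const char *x = *(const char *const *)a;
    const char *y = *(const char *const *)b;
    int xo = strcmp(ext_of(x), "o") == 0;
    int yo = strcmp(ext_of(y), "o") == 0;

    if (xo != yo)
        return xo - yo;
    return strcmp(x, y);
}

/* Hypothetical grouped ordering: by extension first, then by name. */
static int cmp_by_ext(const void *a, const void *b)
{
    const char *x = *(const char *const *)a;
    const char *y = *(const char *const *)b;
    int c = strcmp(ext_of(x), ext_of(y));

    return c != 0 ? c : strcmp(x, y);
}

/* Sort names[] in place with cmp, then write the last character of each
 * name into out (one character per file). out must hold n + 1 bytes. */
static void order_pattern(const char **names, size_t n,
                          int (*cmp)(const void *, const void *), char *out)
{
    size_t i;

    assert(names != NULL && out != NULL);
    qsort(names, n, sizeof(names[0]), cmp);
    for (i = 0; i < n; i++) {
        size_t len = strlen(names[i]);

        out[i] = len ? names[i][len - 1] : '?';
    }
    out[n] = '\0';
}
```

Applied to the six names a.c, a.h, a.o, b.c, b.h and b.o, cmp_dot_o gives the pattern chchoo and cmp_by_ext gives cchhoo, which are short versions of the two patterns quoted above.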
> The hypothesis is that files with the same extension are of the same
> order of size, and that this can be used to reduce fragmentation.
What kind of fragmentation are you talking about?
Internal (which results in "dead" disk space), or
external (which results in a lot of "extents")?
Edward.
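The two kinds of fragmentation named above can each be measured with a few lines of C. This is a generic sketch, not part of the original message and not Reiser4 code; the function names, the block-granular allocation model and the flat list of block numbers are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internal fragmentation: bytes allocated in the file's last block but
 * not used by file data ("dead" space), assuming whole-block allocation. */
static uint64_t dead_space(uint64_t file_size, uint64_t block_size)
{
    uint64_t tail;

    assert(block_size > 0);
    tail = file_size % block_size;
    return tail == 0 ? 0 : block_size - tail;
}

/* External fragmentation: number of extents, i.e. runs of consecutive
 * block numbers, for a file whose blocks are listed in logical order. */
static size_t count_extents(const uint64_t *blocks, size_t n)
{
    size_t i, extents;

    if (n == 0)
        return 0;
    extents = 1;
    for (i = 1; i < n; i++)
        if (blocks[i] != blocks[i - 1] + 1)
            extents++;
    return extents;
}
```

For example, a 5000-byte file in 4096-byte blocks leaves 3192 dead bytes, and a file stored in blocks 10, 11, 12, 40, 41, 7 consists of three extents.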
>
> If we group files with the same extension together within one directory,
> then when we write new files of some extension after deleting others of
> that extension, they land in the same group as the deleted files, so
> they would be written in a similar place and occupy a 'hole' of
> similar size.
> Of course I am not talking about files of a few kB, where Reiser4 is great
> at packing, but about files from a few MB to a few GB.
>
> E.g. a directory with mp3 and xvid files. The mp3s are on the order of MB
> and the xvids on the order of GB. If we sort them just by name, the order
> of xvid and mp3 files in one directory would be random, so when deleting
> the smaller ones we would make random holes (like from
> mxmxmxxmmmxxxxmxxmmmx to mx xmxx mx xmx mmmx).
> With grouped writing, where all mp3s would be written first and all
> xvids after them, after some deletions we would have the smaller holes
> grouped first and the larger ones last (like from mmmmmmmmmmmmxxxxxxxxxx to
> mm m mmm mmxx xxx xxx). The main thing is that afterwards we would
> write mp3s into mp3 holes and xvids into xvid holes, and so reduce the
> fragmentation (like from mm m mmm mmxx xxx xxx to
> mmMmMMMmmmXmmxxXxxx xxx) that we would create if we tried to write
> xvids over mp3 holes.
>
> One obvious use case where I hypothesize that this type of fibration
> is better in the long term would be directories with content similar to
> a usual Downloads directory: a lot of different types (and sizes) of
> files that get written and deleted a lot.
>
> ext_1234 fibration behaves the same as dot_o for directories that contain
> only one file extension, or one extension plus .o.
>
> Of course this is just a hypothesis that I would like to prove with some
> fragmentation benchmarks, but I wanted to hear your thoughts.
>
> And while I was looking through the code I found a part that I could
> comprehend, elegant and easy to understand, so I wanted to build
> something with it and learn more.
>
>
>> Thanks,
>> Edward.
>>
> Thank you for your time and effort
>
> Dushan
>
>
>>
>>> so that we will avoid putting
>>> small files between them and in that way reduce fragmentation. That
>>> group (xvid, a 4-character extension) would be among the last groups under
>>> one directory, so that all small files would be written before it.
>>>
>>> The problem with the attached patch is that currently every fibre value is
>>> defined as u64 (e.g. static __u64 fibre_ext_3), but if I understood the
>>> comments in kassign.c and fibration.c correctly, the fibration part of the
>>> key is only 7 bits long.
>>> If that is true, how did fibre_ext_3 work?
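One way a 7-bit fibre can be carried in a 64-bit value is to shift it into the top bits of the word, so that the function still returns a u64 while only 7 bits are significant. The sketch below shows that idea in userspace C. It is an assumption made for illustration, not confirmed by this message and not code from Reiser4: the constants FIBRE_BITS and FIBRE_SHIFT, the helpers fibre_no and fibre_of, and the function fibre_ext3_sketch are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the fibre occupies the top 7 bits of a 64-bit word. */
enum { FIBRE_BITS = 7, FIBRE_SHIFT = 64 - FIBRE_BITS };

/* Place a fibre number (0..127) into the top bits of a 64-bit value. */
static uint64_t fibre_no(unsigned int n)
{
    assert(n < (1u << FIBRE_BITS));
    return (uint64_t)n << FIBRE_SHIFT;
}

/* Recover the fibre number from a value built by fibre_no(). */
static unsigned int fibre_of(uint64_t word)
{
    return (unsigned int)(word >> FIBRE_SHIFT);
}

/* Hypothetical 3-character-extension fibre: sum the three extension
 * characters and fold the sum into 7 bits. Names that do not end in
 * ".xxx" get fibre 0. */
static uint64_t fibre_ext3_sketch(const char *name, int len)
{
    if (len > 4 && name[len - 4] == '.') {
        unsigned int sum = (unsigned char)name[len - 3]
                         + (unsigned char)name[len - 2]
                         + (unsigned char)name[len - 1];

        return fibre_no(sum & 0x7f);
    }
    return fibre_no(0);
}
```

Under this assumption only 128 distinct fibres exist, so different extensions can land in the same fibre. For instance, "a.mp3" maps to fibre 16 here ('m' + 'p' + '3' = 272, and 272 & 0x7f = 16), as would any other extension whose character sum is congruent to 16 modulo 128.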
>>>
>>> Thanks
>>>
>>> Dushan
>>