From: David Masover <ninja@slaphack.com>
To: Mike Benoit <ipso@snappymail.ca>
Cc: Hans Reiser <reiser@namesys.com>,
reiserfs-list@namesys.com,
Alexander Zarochentcev <zam@namesys.com>,
vs <vs@thebsh.namesys.com>
Subject: Re: reiser4 status (correction)
Date: Fri, 21 Jul 2006 21:48:29 -0500
Message-ID: <44C191FD.4010302@slaphack.com>
In-Reply-To: <1153525982.6659.108.camel@ipso.snappymail.ca>
Mike Benoit wrote:
> Your detailed explanation is appreciated David and while I'm far from a
> file system expert, I believe you've overstated the negative effects
> somewhat.
>
> It sounds to me like you've gotten Reiser4's allocation process in
> regards to wandering logs correct, from what I've read anyways, but I
> think you've overstated its fragmentation disadvantage when compared
> against other file systems.
>
> I think the thing we need to keep in mind here is that fragmentation
> isn't always a net loss. Depending on the workload, fragmentation (or at
> least not tightly packing data) could actually be a gain. In cases where
defragmented != tightly packed.
> you have files (like log files or database files) that constantly grow
> over a long period of time, packing them tightly at regularly scheduled
> intervals (or at all?) could cause more harm than good.
This is true...
> Consider this scenario of two MySQL tables having rows inserted to each
> one simultaneously, and let's also assume that the two tables were
> tightly packed before we started the insert process.
>
> 1 = Data for Table1
> 2 = Data for Table2
>
> Tightly packed:
>
> 111111111111222222222222----------------------------
>
> Simultaneous inserts start:
>
> 1111111111112222222222221122112211221122------------
>
> Allocate on flush alone would probably help this scenario immensely.
Yes, it would. You'd end up with
1111111111112222222222221111111122222222------------
assuming they both fit into RAM. And of course they could later be
repacked.
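
To make the two diagrams above concrete, here is a rough sketch in Python.
It is a toy model only: the disk is a string of block labels, and the
function names (allocate_immediately, allocate_on_flush, count_fragments)
are made up for this example. It says nothing about how Reiser4 actually
implements allocation, and it assumes all pending writes fit in RAM.

```python
def allocate_immediately(existing, writes):
    """Place every write on disk in arrival order (no delayed allocation)."""
    layout = existing
    for table, nblocks in writes:
        layout += table * nblocks
    return layout


def allocate_on_flush(existing, writes):
    """Buffer all writes first, then lay out each table's blocks together.

    Toy assumption: everything pending fits in memory until the flush.
    """
    pending = {}  # dicts keep insertion order in Python 3.7+
    for table, nblocks in writes:
        pending[table] = pending.get(table, 0) + nblocks
    layout = existing
    for table, nblocks in pending.items():
        layout += table * nblocks
    return layout


def count_fragments(layout, table):
    """Count the contiguous runs of blocks belonging to one table."""
    runs = 0
    previous = None
    for block in layout:
        if block == table and previous != table:
            runs += 1
        previous = block
    return runs


if __name__ == "__main__":
    existing = "1" * 12 + "2" * 12      # the tightly packed starting point
    writes = [("1", 2), ("2", 2)] * 4   # simultaneous inserts, 2 blocks each
    for allocate in (allocate_immediately, allocate_on_flush):
        layout = allocate(existing, writes)
        print(allocate.__name__, layout, count_fragments(layout, "1"))
```

With those inputs, the immediate version reproduces the interleaved layout
from the quoted diagram (5 fragments per table), and the on-flush version
gives the layout shown just above (2 fragments per table).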
By the way, this is the NTFS approach to avoiding fragmentation -- try
to avoid fragmenting anything below a certain block size. I, for one,
would be perfectly happy if my large files were split up every 50 or 100
megs or so.
The problem is when you get tons of tiny files and metadata stored so
horribly inefficiently that things like Native Command Queuing are
actually a huge performance boost.
> The other thing you need to keep in mind is that database files are like
> their own little mini-file system. They have their own fragmentation
> issues to deal with (especially PostgreSQL).
I'd rather not add to that. This is one reason to hate virtualization,
by the way -- it's bad enough to have a fragmented NTFS on your Windows
installation, but worse if the disk itself is a fragmented sparse file
on Linux.
> So in cases like you
> described where you are overwriting data in the middle of a file,
> Reiser4 may be poor at doing this specific operation compared to other
> file systems, but just because you overwrite a row that appears to be in
> the middle of a table doesn't mean that the data itself is actually in
> the middle of the table. If your original row is 1K, and you try to
> overwrite it with 4K of data, it most likely will be put at the end of
> the file anyways, and the original 1K of data will be marked for
> overwriting later on. Isn't this what myisampack is for?
If what you say is true, isn't myisampack also an issue here? Surely it
doesn't write out an entirely separate copy of the file?
Anyway, the most common usage I can see for mysql would be overwriting a
1K row with another 1K row, or dropping a row, or adding a wholly new
row. I may be a bit naive here...
But then, isn't there also some metadata somewhere which says things
like how many rows you have in a given table?
And it's not just databases. Consider BitTorrent. The usual BitTorrent
way of doing things is to create a sparse file, then fill it in randomly
as you receive data. But even if you decide to allocate the whole file
right away, instead of making it sparse, you gain nothing on Reiser4,
since the writes will be just as fragmented as if it were sparse.
Personally, I'd rather leave it as sparse, but repack everything later.
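
For anyone who hasn't looked at that write pattern, here is a minimal
sketch of it in Python. The piece size, piece count and function names are
all invented for the example, and it is not taken from any real BitTorrent
client. Whether the file really ends up sparse on disk depends on the
filesystem, and how the blocks get placed is entirely up to the filesystem
too; the sketch only shows the order in which the writes arrive.

```python
import os
import random
import tempfile

PIECE_SIZE = 16 * 1024  # hypothetical piece size, chosen for the example
NUM_PIECES = 64         # hypothetical number of pieces


def create_sparse_file(path, size):
    """Set the file's logical size without writing any data into it."""
    with open(path, "wb") as f:
        f.truncate(size)


def write_piece(path, index, data):
    """Write one piece at the offset where it belongs in the final file."""
    with open(path, "r+b") as f:
        f.seek(index * PIECE_SIZE)
        f.write(data)


def simulate_download(path, seed=0):
    """Fill the file piece by piece in a shuffled order; return that order."""
    create_sparse_file(path, PIECE_SIZE * NUM_PIECES)
    order = list(range(NUM_PIECES))
    random.Random(seed).shuffle(order)
    for index in order:
        # Fake payload: each piece is filled with its own index as a byte.
        write_piece(path, index, bytes([index % 256]) * PIECE_SIZE)
    return order


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "download.bin")
        order = simulate_download(path)
        print("first pieces written:", order[:8])
        print("final size:", os.path.getsize(path))
```

Every piece lands at its final logical offset, but the writes arrive in
random order, which is the situation described above.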
> So while I think what you described is ultimately correct, I believe
> extreme negative effects from it to be a corner case, and probably not
> representative of the norm. I also believe that other Reiser4
> improvements would outweigh this drawback to wandering logs, again in
> average workloads.
Depends on your definition of average. I'm also speaking from
experience. On Gentoo, /usr/portage started out being insanely fast on
Reiser4, because it barely had to seek at all -- despite being about
145,000 small files. I think it was maybe half that when I first put it
on r4, but it's more than twice as slow now, and you can hear it thrashing.
Now, the wandering logs did make the rsync process pretty fast -- the
entire thing gets rsync'd against one of the Gentoo mirrors. For anyone
using Debian, this is the equivalent of "apt-get update".
Only now, this rsync process is not only entirely disk-bound, it's
something like 10x as slow. I have a gig of RAM, so at least it's fast
once it's cached, but it's obviously horrendously fragmented. I am not
sure if it's individual files or directories, but it could REALLY use a
repack.
From what I remember of v3, it was never quite this bad, but it never
started out as fast as it did on Reiser4.
This is why I'm curious to see some benchmarks, by the way -- all of
this is subjective, and from memory.
> Like you mentioned, if Reiser4 performance gets so poor without the
> repacker, and Hans decides to charge for it, I think that will turn away
> a lot of potential users as they could feel that this is a type of
> extortion. Get them hooked on something that only performs well for a
> certain amount of time, then charge them money to keep it up. I also
> think the community would write their own repacker pretty quick in
> response.
Depends. Unfortunately, it's far more likely that the community would
go "fsck this" and use XFS instead. Or JFS. Or any of the other
filesystems that Linux has which don't need a repacker.
It would eventually get done by the community, but if it's taking the
Namesys guys this long, and if they really expect to be able to make
money off of it, it must not be as trivial as I think it is.
> A much better approach in my opinion would be to have Reiser4 perform
> well in the majority of cases without the repacker, and sell the
> repacker to people who need that extra bit of performance. If I'm not
> mistaken this is actually Hans's intent.
Hans?
> If Reiser4 does turn out to
> perform much worse over time, I would expect Hans would consider it a
> bug or design flaw and try to correct the problem however possible.
Or a design constraint...
> But I guess only time will tell if this is true or not. ;)
I'll tell you now it's true.
To be fair, I'm not entirely up to date, but I've had a Reiser4 root
partition for over a year now. It seems pretty decent for most things,
but I've definitely noticed that anywhere like /usr/portage -- lots of
files changing, lots staying the same, over time -- ends up pretty badly
fragmented. Other examples would be games, especially Steam games and
MMOs, played using Wine.
And I'd like some benchmarks, but I strongly suspect that this problem
is pretty bad -- and that the better suited a workload seems to be for
Reiser4, and the better its benchmarks are initially, the worse it will
degrade if there's any writing going on.