From: David Masover <ninja@slaphack.com>
To: "Chester R. Hosey" <chosey@nauticom.net>
Cc: reiserfs-list@namesys.com
Subject: Re: Our introduction to Reiser-list
Date: Wed, 26 Oct 2005 15:43:10 -0500
Message-ID: <435FEA5E.70502@slaphack.com>
In-Reply-To: <435FB231.5070507@nauticom.net>

Chester R. Hosey wrote:
> Peter van Hardenberg wrote:
> 
>>Although I freely acknowledge my inexperience, I believe the real problems are 
>>related to graph traversal algorithms. Linus has commented on the obvious 
>>hardlink issues. I imagine there are more gremlins lurking in the shadows on 
>>this one. Garbage collectors have largely given up on reference counting, a 
>>luxury afforded by blazingly fast access to small amounts of storage. I am 
>>not particularly up on the research though.
> 
> 
> Just a suggestion from the uninformed peanut gallery...
> 
> Hans already plans on having a repacker, which will run incrementally in
> the background. Might it make sense to do incremental GC, possibly even
> in combination with the repacker's traversal of the disk?

You're not the first person to suggest GC instead of refcounting.  I 
still say, if at all possible, let's not let it come to that.

Try this:  I have a box which I call "the server" because it's headless 
and it does things like my one-man email operation.  It has a TV tuner 
card on it, and it has an 80 gig hard drive.

It wouldn't take a lot of TV to fill up 80 gigs.  My desktop has a 500 
gig RAID, which I use for games, my Windows install, and so on.

So, I can pull the TV from my server onto my desktop relatively easily 
-- there's a gigabit crossover between them, and NFS is fast enough. 
That way, I keep the server disk usage below 50%, even though I don't 
leave the desktop on all the time, and even though it can take a while 
before I watch the shows I'm recording.  That holds even if I just 
choose to record from a particular channel for a full day, then skim 
through the recording to see if there's anything interesting.

With garbage collection, the idea is that maybe once a week, the 
repacker runs, and frees space at the same time.  In other words, if I 
delete something, I may not get the space back for most of a week.  With 
the current reference counting scheme, I get the space back immediately.
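
To make the difference concrete, here is a toy sketch.  It is not 
Reiser4 code; the class names, the sizes and the sweep() call are all 
made up for illustration.  It only models when the space comes back: 
on unlink with reference counting, or at the next sweep with garbage 
collection.

```python
class RefcountStore:
    """Hypothetical store: space is reclaimed when the last link is dropped."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.files = {}  # name -> [size_gb, link_count]

    def used_gb(self):
        return sum(size for size, _links in self.files.values())

    def create(self, name, size_gb):
        if self.used_gb() + size_gb > self.capacity_gb:
            raise OSError("no space left on device")
        self.files[name] = [size_gb, 1]

    def unlink(self, name):
        entry = self.files[name]
        entry[1] -= 1
        if entry[1] == 0:
            # Last reference gone: the space is free right now.
            del self.files[name]


class SweepStore:
    """Hypothetical store: unlink only drops the name; sweep() frees space."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.blocks = {}        # name -> size_gb, everything still on disk
        self.reachable = set()  # names that still have a link

    def used_gb(self):
        return sum(self.blocks.values())

    def create(self, name, size_gb):
        if self.used_gb() + size_gb > self.capacity_gb:
            raise OSError("no space left on device")
        self.blocks[name] = size_gb
        self.reachable.add(name)

    def unlink(self, name):
        # The data stays on disk until the collector gets around to it.
        self.reachable.discard(name)

    def sweep(self):
        for name in list(self.blocks):
            if name not in self.reachable:
                del self.blocks[name]
```

On an 80 gig disk holding a 60 gig recording, the first store accepts 
a new 60 gig recording right after the old one is unlinked.  The 
second refuses it until sweep() has run.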

In virtual machines and such, garbage collection is fast, so it can be 
run much more frequently, even on demand -- need more RAM?  Run the 
garbage collector, flush the buffers, and you have RAM.

You can't do that with an FS, because the garbage collection would take 
insanely long, and you'd never know when it'd hit.  Kind of like lazy 
allocation, only worse.  Lazy allocation means that after a while, my RAM 
fills up and Reiser4 decides to flush to disk, making my FS access 
unresponsive for a few seconds, sometimes 10 or 20.  It's better now, 
not sure if that's because I've got 2 gigs of RAM on my desktop instead 
of half a gig or because the new version of Reiser4 is smarter about it.

But, imagine that annoying random insane disk activity, effectively a 
few seconds of a frozen system, only you very likely have to lock the 
entire FS, and it takes several minutes or hours instead of a few 
seconds.  That's why you can't do on-disk garbage collection on demand.

Also, if you keep disk usage low, it's easier to keep things 
defragmented.  In RAM, no one cares -- use all the RAM, if it gets out 
of order, so what?  It's called "Random Access Memory" for a reason. 
And don't tell me you repack every time you collect garbage, because it 
already takes too long, and repacking would make it take longer.  And if 
you tried to do it in the same pass, you'd end up with a perfectly 
defragmented FS, except for the hundreds of tiny, randomly distributed 
holes where the recently collected garbage was.
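
A toy model of that last point, under one big assumption of mine: the 
pass only learns what is garbage once the traversal is finished, so 
dead blocks get packed along with live ones and are freed in place 
afterward.  The function names and the block-list layout are invented 
for illustration and are not how any real repacker works.

```python
def repack_then_collect(disk, live):
    """disk: list of block owners, None meaning free.  live: reachable owners."""
    # Repack: slide every allocated block left, closing the existing gaps.
    # Garbage is not known yet, so it is packed along with everything else.
    packed = [owner for owner in disk if owner is not None]
    packed += [None] * (len(disk) - len(packed))
    # Collect: reachability is only known now, so dead blocks are freed in place.
    return [owner if owner in live else None for owner in packed]


def count_holes(disk):
    """Count runs of free blocks that still have allocated data after them."""
    used = [i for i, owner in enumerate(disk) if owner is not None]
    if not used:
        return 0
    holes = 0
    in_hole = False
    for owner in disk[: used[-1] + 1]:
        if owner is None and not in_hole:
            holes += 1
        in_hole = owner is None
    return holes


disk = ["a", "x", None, "b", "y", "c", None, "a"]
result = repack_then_collect(disk, live={"a", "b", "c"})
print(result, count_holes(result))
```

Here "x" and "y" are the garbage.  The old gaps are closed, but each 
dead block leaves a new one-block hole in the middle of the packed 
data.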

