* Stupid (?) Idea about extent lifetimes.
@ 2016-03-16  6:32 Robert White
  2016-03-17  8:51 ` Duncan
  0 siblings, 1 reply; 2+ messages in thread
From: Robert White @ 2016-03-16  6:32 UTC (permalink / raw)
  To: Btrfs BTRFS

It occurs to me that it would be desirable to mark extents as "least 
favoured nations" and so all new writes would like to not be written 
there and any data written there would have a desire to be somewhere else.

So let's say the wholly unallocated space has a natural status of 100.

Allocated blocks would normally have statuses less than that by a 
trivial amount, such as 99.

One could then mark blocks with higher numbers for being less favoured,
or lower numbers for being more favoured as desired.

Basically this would create a gravity map of sorts that would be 
factored into allocation decisions.
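
A minimal sketch of how such a gravity map might feed an allocation decision. The values 100 and 99 come from the text above; the `Region` type, the `pick_region` helper, and the value 150 are hypothetical and only illustrate the idea, they are not btrfs code.

```python
from dataclasses import dataclass

UNALLOCATED_STATUS = 100  # from the post: wholly unallocated space
ALLOCATED_STATUS = 99     # from the post: allocated blocks, slightly more favoured


@dataclass
class Region:
    """Hypothetical stand-in for an allocatable region; not a btrfs structure."""
    name: str
    status: int      # lower = more favoured, higher = less favoured
    free_bytes: int


def pick_region(regions, size):
    """Return the most favoured region that can hold `size` bytes, or None."""
    candidates = [r for r in regions if r.free_bytes >= size]
    if not candidates:
        return None
    # Lowest status wins; ties go to the region with the most free space.
    return min(candidates, key=lambda r: (r.status, -r.free_bytes))


regions = [
    Region("fresh", UNALLOCATED_STATUS, 1 << 30),
    Region("in-use", ALLOCATED_STATUS, 64 << 20),
    Region("disfavoured", 150, 512 << 20),   # 150 is an arbitrary example value
]
print(pick_region(regions, 16 << 20).name)   # in-use
print(pick_region(regions, 128 << 20).name)  # fresh
```

The disfavoured region is only chosen when nothing more favoured has room, which is the "would like to not be written there" behaviour.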

So say you just converted an ext4 to btrfs. It's got all those oddly 
sized and placed extents. You could give them all higher numbers in 
hopes that the data would naturally migrate away. Say just number them 
really large with no two numbers the same. Now the largest number would 
naturally become vacant and likely to be freed.
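
As a rough illustration of the numbering step, assuming the converted extents can be listed by some identifier: the function name, the identifiers, and the base value below are all made up for the example.

```python
from itertools import count


def disfavour_legacy_extents(extent_ids, base=1_000_000):
    """Map each legacy extent id to a unique, very high status.

    `base` is an arbitrary assumption; it only needs to sit far above
    the status of unallocated space (100 in the post).
    """
    numbers = count(base)
    return {extent_id: next(numbers) for extent_id in extent_ids}


statuses = disfavour_legacy_extents(["ext4-a", "ext4-b", "ext4-c"])
# No two extents share a number, so there is always a single
# least favoured extent that would be vacated first.
first_to_vacate = max(statuses, key=statuses.get)
print(first_to_vacate)  # ext4-c
```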

Likewise you could weight your data to migrate spindle-ward or such in 
the weeks before a reorg.

Similarly changes in geometry could simply mark segments as ill-favoured 
where the old geometry doesn't match the new and data would migrate 
under pressure.

One could reverse the age induced entropy of a file system by just 
periodically increasing the disfavour values of all the blocks, causing 
the oldest blocks to be the least favoured of all, and so creating a 
slowly rolling pattern.

So say new blocks start life as 50s by default. And empty space is 100.
And every so often every block gets an increment (say 3, so that 100 is
naturally skipped over).
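
The arithmetic can be checked with a few lines. The numbers 50, 100 and 3 are the ones from the paragraph above; the function names are just for illustration.

```python
NEW_BLOCK_STATUS = 50     # from the post
UNALLOCATED_STATUS = 100  # from the post
AGE_STEP = 3              # from the post


def aged_status(rounds, start=NEW_BLOCK_STATUS, step=AGE_STEP):
    """Status of a block after `rounds` periodic increments."""
    return start + step * rounds


def rounds_until_disfavoured(start=NEW_BLOCK_STATUS, step=AGE_STEP,
                             threshold=UNALLOCATED_STATUS):
    """Number of increments before a block is less favoured than free space."""
    rounds = 0
    while aged_status(rounds, start, step) <= threshold:
        rounds += 1
    return rounds


print(aged_status(16))             # 98, still more favoured than empty space
print(aged_status(17))             # 101, now less favoured than empty space
print(rounds_until_disfavoured())  # 17
```

Since 50 + 3k never equals 100 for a whole number k, an aging block never ties with unallocated space.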

Young blocks are now very magnetic. As they age they lose favour.
Eventually they pass the value for unallocated space. Then they start
losing data and eventually, in a system with 100 percent turnover, the
blocks get deallocated.

Defragging and occasional balancing take care of the files that "never"
change.

Very high numbers could also be reserved for pinning. Specially flagged
files would have reverse gravity. A desire to stay put. So say NOCOW
files or swap files might have reverse gravity, and one could use some
tool to allocate blocks at the cold end of the disk with those sorts of
numbers, effectively segregating the static from the churning data.

Fresh files would thereby tend to vacate extents full of snapshot data 
and freeing static (Read only) snapshot data would tend to release 
contiguous space.

As the disk runs out of space the system naturally breaks down into the 
existing best-fit allocation.
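
One way to picture that fallback, as a sketch only: gravity decides while favoured space is available, and once nothing favoured fits the choice comes down to fit alone. The tuple layout and the `allocate` helper are assumptions made for the example, and "best-fit" here just means the smallest free region that still holds the request.

```python
def allocate(regions, size, unallocated_status=100):
    """Pick a region for `size` units.

    regions: list of (name, status, free) tuples (hypothetical layout).
    Returns the chosen region's name, or None if nothing fits.
    """
    fitting = [r for r in regions if r[2] >= size]
    if not fitting:
        return None
    favoured = [r for r in fitting if r[1] <= unallocated_status]
    if favoured:
        # Space is plentiful: gravity decides, lowest status first.
        return min(favoured, key=lambda r: r[1])[0]
    # Space is tight: ignore gravity and take the tightest fit.
    return min(fitting, key=lambda r: r[2])[0]


roomy = [("young", 50, 10), ("empty", 100, 100), ("old", 140, 40)]
tight = [("old-a", 140, 40), ("old-b", 120, 12)]
print(allocate(roomy, 8))   # young
print(allocate(tight, 8))   # old-b
print(allocate(tight, 64))  # None
```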

It's less than a defrag or autodefrag, or balance, and it would tend to
be more like digestive peristalsis. At the extreme end (where people are
taking way too many snapshots) it becomes an elevator sort by extent age.

(If this is a tired idea, my apologies. I took cough medicine a little 
while ago and this thought that's been rattling around in my head for 
months bubbled out of its cauldron.)


* Re: Stupid (?) Idea about extent lifetimes.
  2016-03-16  6:32 Stupid (?) Idea about extent lifetimes Robert White
@ 2016-03-17  8:51 ` Duncan
  0 siblings, 0 replies; 2+ messages in thread
From: Duncan @ 2016-03-17  8:51 UTC (permalink / raw)
  To: linux-btrfs

Robert White posted on Tue, 15 Mar 2016 23:32:05 -0700 as excerpted:

> It occurs to me that it would be desirable to mark extents as "least
> favoured nations" and so all new writes would like to not be written
> there and any data written there would have a desire to be somewhere
> else.

I believe the word you wanted here instead of extents is "chunks".

Extents are parts of files, and btrfs is a COW (copy on write) 
filesystem, so writes modifying existing files already move out of the 
existing file extents.

But chunks, in btrfs context at least, are relatively large, nominally 1 
GiB for data, 256 MiB for metadata (tho sizes can be larger on really 
large filesystems or smaller on real small ones), "chunks", that the 
filesystem allocates first as empty, and lets files (or metadata nodes) 
fill.  It's these chunks that balance rewrites, and thus these chunks 
that would logically get your numeric "gravity" ratings.
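
To make the chunk-level version concrete, here is a toy model, not kernel code. The nominal chunk sizes are the ones given above and the threshold of 100 is from the original post; the `Chunk` type, its fields, and `relocation_order` are hypothetical.

```python
from dataclasses import dataclass

GIB = 1 << 30
MIB = 1 << 20
# Nominal sizes from the paragraph above; real sizes can differ.
NOMINAL_CHUNK_SIZE = {"data": 1 * GIB, "metadata": 256 * MIB}
UNALLOCATED_STATUS = 100  # from the original post


@dataclass
class Chunk:
    """Hypothetical model of a chunk; not the kernel's data structure."""
    chunk_id: int
    kind: str       # "data" or "metadata"
    gravity: int    # higher = less favoured

    @property
    def size(self):
        return NOMINAL_CHUNK_SIZE[self.kind]


def relocation_order(chunks):
    """Chunks less favoured than free space, most disfavoured first."""
    disfavoured = [c for c in chunks if c.gravity > UNALLOCATED_STATUS]
    return sorted(disfavoured, key=lambda c: c.gravity, reverse=True)


chunks = [Chunk(1, "data", 53), Chunk(2, "data", 131),
          Chunk(3, "metadata", 104), Chunk(4, "data", 98)]
print([c.chunk_id for c in relocation_order(chunks)])  # [2, 3]
```

A balance-like pass could then walk that list and rewrite the chunks in order, leaving the still-favoured ones alone.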

With that change, it's an interesting concept.  I'm not a dev and can't 
really guess if it's entirely practical to implement or not, but as a 
btrfs user and admin, I can certainly see practical uses, should it /be/ 
implemented.

Very creative concept and definitely worth discussion.  Thanks for 
throwing it out here /for/ discussion.  =:^)

Tho one caution.  There are a lot more worthwhile ideas about what /could/ 
be done with btrfs than developers and developer time to implement them 
all, at least in anything like a reasonable time frame of say five years 
or less.  So even if it's a pretty good idea, unless it happens to be 
practically usable to help implement some other project the devs are 
working on, simply because there's so many other pretty good ideas, it 
might take quite some time (over five years) to actually be implemented.  
So don't get your hopes up for anything too immediate, or even 
intermediate term (2-5 years out), unless you get extremely lucky and the 
devs see it as a way to implement something they already are working on 
or that's already a high priority "next" item on their roadmap.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

