* btrfs wishlist
@ 2011-03-01 18:35 Roy Sigurd Karlsbakk
2011-03-01 18:39 ` Chris Mason
0 siblings, 1 reply; 6+ messages in thread
From: Roy Sigurd Karlsbakk @ 2011-03-01 18:35 UTC (permalink / raw)
To: linux-btrfs
Hi all
Having managed ZFS for about two years, I want to post a wishlist.
INCLUDED IN ZFS
- Mirror existing single-drive filesystem, as in 'zfs attach'
- RAIDz-stuff - single and hopefully multiple-parity RAID configuration with block-level checksumming
- Background scrub/fsck
- Pool-like management with multiple RAIDs/mirrors (VDEVs)
- Autogrow as in ZFS autoexpand
NOT INCLUDED IN CURRENT ZFS
- Adding/removing drives from VDEVs
- Rebalancing a pool
- dedup
This may be a long shot, but can someone tell if this is doable in a year or five?
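For the first wishlist item, the ZFS command in question is 'zpool attach' (a pool-level operation); a minimal sketch, with hypothetical pool and device names:
zpool attach mypool da0 da1
This turns the single-disk vdev da0 into a two-way mirror of da0 and da1.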
Vennlige hilsener / Best regards
roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly.
It is an elementary imperative for all pedagogues to avoid excessive use of
idioms of foreign origin. In most cases, adequate and relevant synonyms exist
in Norwegian.
--
* Re: btrfs wishlist
2011-03-01 18:35 btrfs wishlist Roy Sigurd Karlsbakk
@ 2011-03-01 18:39 ` Chris Mason
2011-03-01 19:09 ` Freddie Cash
2011-03-02 9:05 ` Thomas Bellman
0 siblings, 2 replies; 6+ messages in thread
From: Chris Mason @ 2011-03-01 18:39 UTC (permalink / raw)
To: Roy Sigurd Karlsbakk; +Cc: linux-btrfs
Excerpts from Roy Sigurd Karlsbakk's message of 2011-03-01 13:35:42 -0500:
> Hi all
>
> Having managed ZFS for about two years, I want to post a wishlist.
>
> INCLUDED IN ZFS
>
> - Mirror existing single-drive filesystem, as in 'zfs attach'
This one is easy; we do plan on adding it.
> - RAIDz-stuff - single and hopefully multiple-parity RAID configuration with block-level checksumming
We'll have raid56, but it won't be variable stripe size. There will be
one stripe size for data and one for metadata but that's it.
> - Background scrub/fsck
These are in the works.
> - Pool-like management with multiple RAIDs/mirrors (VDEVs)
We have a pool of drives now... I'm not sure exactly what the vdevs are.
> - Autogrow as in ZFS autoexpand
We grow to the available storage now.
>
> NOT INCLUDED IN CURRENT ZFS
>
> - Adding/removing drives from VDEVs
We can add and remove drives on the fly today.
> - Rebalancing a pool
We can rebalance space between drives today.
> - dedup
ZFS does have dedup; we don't yet. This one has a firm maybe.
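As a rough sketch of what the add/remove and rebalance operations above look like with the btrfs userspace tools (device names and mount point are hypothetical):
btrfs device add /dev/sdc /mnt/btrfs
btrfs device delete /dev/sdb /mnt/btrfs
btrfs filesystem balance /mnt/btrfs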
-chris
* Re: btrfs wishlist
2011-03-01 18:39 ` Chris Mason
@ 2011-03-01 19:09 ` Freddie Cash
2011-03-02 9:05 ` Thomas Bellman
1 sibling, 0 replies; 6+ messages in thread
From: Freddie Cash @ 2011-03-01 19:09 UTC (permalink / raw)
To: Chris Mason; +Cc: Roy Sigurd Karlsbakk, linux-btrfs
On Tue, Mar 1, 2011 at 10:39 AM, Chris Mason <chris.mason@oracle.com> wrote:
> Excerpts from Roy Sigurd Karlsbakk's message of 2011-03-01 13:35:42 -0500:
>
>> - Pool-like management with multiple RAIDs/mirrors (VDEVs)
>
> We have a pool of drives now....I'm not sure exactly what the vdevs are.
This functionality is in btrfs already, but it's using different
terminology and configuration methods.
In ZFS, the lowest level in the storage stack is the physical block device.
You group these block devices together into a virtual device (aka
vdev). The possible vdevs are:
- single disk vdev, with no redundancy
- mirror vdev, with any number of devices (n-way mirroring)
- raidz1 vdev, single-parity redundancy
- raidz2 vdev, dual-parity redundancy
- raidz3 vdev, triple-parity redundancy
- log vdev, separate device for "journaling", or as a write cache
- cache vdev, separate device that acts as a read cache
A ZFS pool is made up of a collection of the vdevs.
For example, a simple, non-redundant pool setup for a laptop would be:
zpool create laptoppool da0
To create a pool with a dual-parity vdev using 8 disks:
zpool create mypool raidz2 da0 da1 da2 da3 da4 da5 da6 da7
To later add to the existing pool:
zpool add mypool raidz2 da8 da9 da10 da11 da12 da13 da14 da15
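Log and cache vdevs are added to an existing pool the same way, for example (again with hypothetical device names):
zpool add mypool log da16
zpool add mypool cache da17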
Later, you create your ZFS filesystems on top of the pool.
With btrfs, you set up the redundancy and the filesystem all in one
shot, thus combining the "vdev" with the "pool" (aka filesystem).
ZFS has better separation of the different layers (device, pool,
filesystem), and better tools for working with them (zpool / zfs) but
similar functionality is (or at least appears to be) in btrfs already.
Using device mapper / md underneath btrfs also gives you a similar setup to ZFS.
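For comparison, a minimal sketch of that combined vdev/pool/filesystem step on the btrfs side, with hypothetical device names and mount point; -d and -m pick the redundancy profile for data and metadata respectively:
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
mount /dev/sdb /mnt/btrfs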
--
Freddie Cash
fjwcash@gmail.com
* Re: btrfs wishlist
2011-03-01 18:39 ` Chris Mason
2011-03-01 19:09 ` Freddie Cash
@ 2011-03-02 9:05 ` Thomas Bellman
2011-03-02 10:43 ` Hugo Mills
2011-03-02 14:18 ` Justin Ossevoort
1 sibling, 2 replies; 6+ messages in thread
From: Thomas Bellman @ 2011-03-02 9:05 UTC (permalink / raw)
To: Chris Mason; +Cc: Roy Sigurd Karlsbakk, linux-btrfs
On 2011-03-01 19:39, Chris Mason wrote:
> We'll have raid56, but it won't be variable stripe size. There will be
> one stripe size for data and one for metadata but that's it.
Will the stripe *width* be configurable? If I have something like a
Sun Thor with 48 drives, I would probably not be entirely comfortable
having 46 drives data and 2 drives parity; too little redundancy for
my tastes. 2 drives parity per 10 drives data is more like what I
would run, but that would of course be an individual choice.
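To put rough numbers on that trade-off (assuming 48 equal-sized drives):
46 data + 2 parity: ~96% of raw capacity usable, but the whole array survives only 2 drive failures
4 x (10 data + 2 parity): ~83% usable, with each group surviving 2 failures (up to 8 in total if spread across groups)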
/Bellman
* Re: btrfs wishlist
2011-03-02 9:05 ` Thomas Bellman
@ 2011-03-02 10:43 ` Hugo Mills
2011-03-02 14:18 ` Justin Ossevoort
1 sibling, 0 replies; 6+ messages in thread
From: Hugo Mills @ 2011-03-02 10:43 UTC (permalink / raw)
To: Thomas Bellman; +Cc: Chris Mason, Roy Sigurd Karlsbakk, linux-btrfs
On Wed, Mar 02, 2011 at 10:05:28AM +0100, Thomas Bellman wrote:
> On 2011-03-01 19:39, Chris Mason wrote:
>
> > We'll have raid56, but it won't be variable stripe size. There will be
> > one stripe size for data and one for metadata but that's it.
>
> Will the stripe *width* be configurable? If I have something like a
> Sun Thor with 48 drives, I would probably not be entirely comfortable
> having 46 drives data and 2 drives parity; too little redundancy for
> my tastes. 2 drives parity per 10 drives data is more like what I
> would run, but that would of course be an individual choice.
It's something that's been asked for, but isn't supported by the
current (proposed) RAID-5/6 code, as far as I'm aware.
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Try everything once, except incest and folk-dancing. ---
* Re: btrfs wishlist
2011-03-02 9:05 ` Thomas Bellman
2011-03-02 10:43 ` Hugo Mills
@ 2011-03-02 14:18 ` Justin Ossevoort
1 sibling, 0 replies; 6+ messages in thread
From: Justin Ossevoort @ 2011-03-02 14:18 UTC (permalink / raw)
To: Thomas Bellman; +Cc: linux-btrfs
On 02/03/11 10:05, Thomas Bellman wrote:
> Will the stripe *width* be configurable? If I have something like a
> Sun Thor with 48 drives, I would probably not be entirely comfortable
> having 46 drives data and 2 drives parity; too little redundancy for
> my tastes. 2 drives parity per 10 drives data is more like what I
> would run, but that would of course be an individual choice.
One thing to remember is that the parity covers specific pieces of
filesystem data. So your entire dataset is not at risk when write errors
occur in only a few places on a few disks; only the filesystem objects
that have data stored in those places are at immediate risk. This means
that only files unlucky enough to have multiple failing sectors within
the same stripe are really affected.
Of course this only matters as long as we're talking about bad sectors
and not full disk failure.
Regards,
justin....