public inbox for linux-fsdevel@vger.kernel.org
* Re: Large single raid and XFS or two small ones and EXT3?
       [not found]   ` <449BE381.6070000@cjx.com>
@ 2006-06-23 14:01     ` Al Boldi
  2006-06-23 16:06       ` Andreas Dilger
  2006-06-23 16:21       ` Russell Cattelan
  0 siblings, 2 replies; 7+ messages in thread
From: Al Boldi @ 2006-06-23 14:01 UTC (permalink / raw)
  To: linux-raid; +Cc: linux-fsdevel

Chris Allen wrote:
> Francois Barre wrote:
> > 2006/6/23, PFC <lists@peufeu.com>:
> >>         - XFS is faster and fragments less, but make sure you have a
> >> good UPS
> >
> > Why a good UPS ? XFS has a good strong journal, I never had an issue
> > with it yet... And believe me, I did have some dirty things happening
> > here...
> >
> >>         - ReiserFS 3.6 is mature and fast, too, you might consider it
> >>         - ext3 is slow if you have many files in one directory, but
> >> has more
> >> mature tools (resize, recovery etc)
> >
> > XFS tools are kind of mature also. Online grow, dump, ...
> >
> >>         I'd go with XFS or Reiser.
> >
> > I'd go with XFS. But I may be kind of fanatic...
>
> Strange that whatever the filesystem you get equal numbers of people
> saying that they have never lost a single byte to those who have had 
> horrible corruption and would never touch it again. We stopped using XFS 
> about a year ago because we were getting kernel stack space panics under 
> heavy load over NFS. It looks like the time has come to give it another
> try.

If you are keen on data integrity then don't touch any fs w/o data=ordered.

ext3 is still king wrt data=ordered, albeit slow.

Now XFS is fast, but doesn't support data=ordered.  It seems that their 
solution to the problem is to pass the burden onto hw by using barriers.  
Maybe XFS can get away with this.  Maybe.

Thanks!

--
Al



* Re: Large single raid and XFS or two small ones and EXT3?
  2006-06-23 14:01     ` Large single raid and XFS or two small ones and EXT3? Al Boldi
@ 2006-06-23 16:06       ` Andreas Dilger
  2006-06-23 16:41         ` Christian Pedaschus
  2006-06-23 16:21       ` Russell Cattelan
  1 sibling, 1 reply; 7+ messages in thread
From: Andreas Dilger @ 2006-06-23 16:06 UTC (permalink / raw)
  To: Al Boldi; +Cc: linux-raid, linux-fsdevel

On Jun 23, 2006  17:01 +0300, Al Boldi wrote:
> Chris Allen wrote:
> > Francois Barre wrote:
> > > 2006/6/23, PFC <lists@peufeu.com>:
> > >>         - ext3 is slow if you have many files in one directory, but
> > >> has more mature tools (resize, recovery etc)

Please use "mke2fs -O dir_index" or "tune2fs -O dir_index" when testing
ext3 performance for many-files-in-dir.  This is now the default in
e2fsprogs-1.39 and later.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



* Re: Large single raid and XFS or two small ones and EXT3?
  2006-06-23 14:01     ` Large single raid and XFS or two small ones and EXT3? Al Boldi
  2006-06-23 16:06       ` Andreas Dilger
@ 2006-06-23 16:21       ` Russell Cattelan
  2006-06-23 18:19         ` Tom Vier
  1 sibling, 1 reply; 7+ messages in thread
From: Russell Cattelan @ 2006-06-23 16:21 UTC (permalink / raw)
  To: Al Boldi; +Cc: linux-raid, linux-fsdevel

Al Boldi wrote:

>Chris Allen wrote:
>>Francois Barre wrote:
>>>2006/6/23, PFC <lists@peufeu.com>:
>>>>        - XFS is faster and fragments less, but make sure you have a
>>>>good UPS
>>>
>>>Why a good UPS ? XFS has a good strong journal, I never had an issue
>>>with it yet... And believe me, I did have some dirty things happening
>>>here...
>>>
>>>>        - ReiserFS 3.6 is mature and fast, too, you might consider it
>>>>        - ext3 is slow if you have many files in one directory, but
>>>>has more mature tools (resize, recovery etc)
>>>
>>>XFS tools are kind of mature also. Online grow, dump, ...
>>>
>>>>        I'd go with XFS or Reiser.
>>>
>>>I'd go with XFS. But I may be kind of fanatic...
>>
>>Strange that whatever the filesystem you get equal numbers of people
>>saying that they have never lost a single byte to those who have had
>>horrible corruption and would never touch it again. We stopped using XFS
>>about a year ago because we were getting kernel stack space panics under
>>heavy load over NFS. It looks like the time has come to give it another
>>try.
>
>If you are keen on data integrity then don't touch any fs w/o data=ordered.
>
>ext3 is still king wrt data=ordered, albeit slow.
>
>Now XFS is fast, but doesn't support data=ordered.  It seems that their
>solution to the problem is to pass the burden onto hw by using barriers.
>Maybe XFS can get away with this.  Maybe.
>
>Thanks!
>
When you refer to data=ordered are you talking about ext3 user data
journaling?

While user data journaling seems like a good idea, it is unclear what
benefits it really provides.
By writing all user data twice, the write performance of the file system
is effectively halved.
Granted, the log is in one area of the disk, so some performance advantages
show up due to less head seeking for those writes.
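
A rough way to put numbers on the "written twice" point, as a sketch only.
The 60 MB/s disk rate is an invented figure, the function name is
hypothetical, and the model deliberately ignores the seek savings mentioned
above, so it shows the worst case rather than a measured result.

```python
def effective_write_rate(disk_mb_per_s, copies_written):
    """Application-visible write rate when every data block reaches the
    disk `copies_written` times (1 = no data journaling, 2 = once in the
    journal and once in its final spot).

    Simplified model: raw bandwidth only, no seek or batching effects.
    """
    return disk_mb_per_s / copies_written


raw = 60.0  # assumed raw disk bandwidth in MB/s, for illustration only
plain = effective_write_rate(raw, 1)
journaled = effective_write_rate(raw, 2)

print(f"no data journaling: {plain:.1f} MB/s")
print(f"data journaling:    {journaled:.1f} MB/s")
```

Real workloads can land on either side of this estimate depending on how
much of the journal traffic is sequential.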

As far as metadata journaling goes, it is a fundamental requirement that
the journal is synced to disk up to a given point in order to release the
pinned metadata, thus allowing the metadata to be synced to disk.

The way most file systems guarantee file system consistency is to either
sync all outstanding metadata changes to disk or to sync a record of what
in-core changes have been made to disk.

In the XFS case, since it logs metadata deltas to the log, it can record more
change operations in a smaller number of disk blocks; ext3, on the other hand,
writes the entire metadata block to the log.
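
The difference in log volume can be illustrated with a toy calculation.
This is only a sketch: the 4096-byte block size, the 16-byte per-record
header and the list of changes are all assumptions made up for the example,
and neither function reflects the actual on-disk log format of XFS or ext3.

```python
BLOCK_SIZE = 4096  # assumed metadata block size in bytes


def full_block_log_bytes(changes):
    """Log cost when each modified metadata block is copied whole.

    `changes` is a list of (block_number, bytes_changed) tuples; a block
    touched several times in one transaction is logged once.
    """
    touched_blocks = {block for block, _ in changes}
    return len(touched_blocks) * BLOCK_SIZE


def delta_log_bytes(changes, header_bytes=16):
    """Log cost when only the changed bytes are logged, each with a small
    header. `header_bytes` is a hypothetical per-record overhead."""
    return sum(header_bytes + nbytes for _, nbytes in changes)


# Four small metadata updates touching three distinct blocks.
changes = [(10, 32), (11, 64), (12, 8), (10, 16)]

print("whole-block logging:", full_block_log_bytes(changes), "bytes")
print("delta logging:      ", delta_log_bytes(changes), "bytes")
```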

As far as barriers go I assume you are referring to the ide write barriers?

The need for barrier support in the file system is a result of cheap ide
disks providing large write caches but not having enough reserve power to
guarantee that the cache will be sync'ed to disk in the event of a power
failure.

Originally, when xfs was written, the disks/raids used by SGI systems were
pretty much exclusively enterprise-level devices that would guarantee the
write caches would be flushed in the event of a power failure.

Note ext3, xfs, and reiser all use write barriers now for ide disks.






* Re: Large single raid and XFS or two small ones and EXT3?
  2006-06-23 16:06       ` Andreas Dilger
@ 2006-06-23 16:41         ` Christian Pedaschus
  2006-06-23 16:46           ` Christian Pedaschus
  2006-06-23 19:53           ` Nix
  0 siblings, 2 replies; 7+ messages in thread
From: Christian Pedaschus @ 2006-06-23 16:41 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Al Boldi, linux-raid, linux-fsdevel


Andreas Dilger wrote:

>On Jun 23, 2006  17:01 +0300, Al Boldi wrote:
>>Chris Allen wrote:
>>>Francois Barre wrote:
>>>>2006/6/23, PFC <lists@peufeu.com>:
>>>>>        - ext3 is slow if you have many files in one directory, but
>>>>>has more mature tools (resize, recovery etc)
>
>Please use "mke2fs -O dir_index" or "tune2fs -O dir_index" when testing
>ext3 performance for many-files-in-dir.  This is now the default in
>e2fsprogs-1.39 and later.
>
for ext3 use (on unmounted disks):
tune2fs -O has_journal -o journal_data /dev/{disk}
tune2fs -O dir_index /dev/{disk}

if data is on the drive, you need to run a fsck afterwards and it uses a
good bit of ram, but it makes ext3 a good bit faster.

and my main point for using ext3 is still: "it's a very mature fs,
nobody will tell you such horrible stories about data-lossage with ext3
as with any other filesystem."
and there are undelete tools for ext3.

so if you're for data-integrity (i guess you are, else you would not use
raid, or? ;) ), use ext3 and if you need the last single kb/s get a
faster drive or use lots of them with a good raid-combo and/or use a
separate disk for the journal (man 8 tune2fs)

my 0.5 cents,
greets chris

ps. but you know, filesystem choosage is not pure science, it's
half-religion :D

>Cheers, Andreas
>--
>Andreas Dilger
>Principal Software Engineer
>Cluster File Systems, Inc.
>


* Re: Large single raid and XFS or two small ones and EXT3?
  2006-06-23 16:41         ` Christian Pedaschus
@ 2006-06-23 16:46           ` Christian Pedaschus
  2006-06-23 19:53           ` Nix
  1 sibling, 0 replies; 7+ messages in thread
From: Christian Pedaschus @ 2006-06-23 16:46 UTC (permalink / raw)
  Cc: Andreas Dilger, Al Boldi, linux-raid, linux-fsdevel


Christian Pedaschus wrote:

>for ext3 use (on unmounted disks):
>tune2fs -O has_journal -o journal_data /dev/{disk}
>tune2fs -O dir_index /dev/{disk}
>
>if data is on the drive, you need to run a fsck afterwards and it uses a
>good bit of ram, but it makes ext3 a good bit faster.
>
>and my main point for using ext3 is still: "it's a very mature fs,
>nobody will tell you such horrible stories about data-lossage with ext3
>as with any other filesystem."
>and there are undelete tools for ext3.
>
>so if you're for data-integrity (i guess you are, else you would not use
>raid, or? ;) ), use ext3 and if you need the last single kb/s get a
>faster drive or use lots of them with a good raid-combo and/or use a
>separate disk for the journal (man 8 tune2fs)
>
>my 0.5 cents,
>greets chris
>
>ps. but you know, filesystem choosage is not pure science, it's
>half-religion :D
>
Oops, should be:

tune2fs -O has_journal -o journal_data /dev/{partition}
tune2fs -O dir_index /dev/{partition}

;)


* Re: Large single raid and XFS or two small ones and EXT3?
  2006-06-23 16:21       ` Russell Cattelan
@ 2006-06-23 18:19         ` Tom Vier
  0 siblings, 0 replies; 7+ messages in thread
From: Tom Vier @ 2006-06-23 18:19 UTC (permalink / raw)
  To: Russell Cattelan; +Cc: Al Boldi, linux-raid, linux-fsdevel

On Fri, Jun 23, 2006 at 11:21:34AM -0500, Russell Cattelan wrote:
> When you refer to data=ordered are you talking about ext3 user data
> journaling?

iirc, data=ordered just writes new data out before updating block pointers,
the file's length in its inode, and the block usage bitmap. That way you
don't get junk or zeroed data at the tail of the file. However, i think to
prevent data leaks (from deleted files), data=writeback requires a write to
the journal, indicating what blocks are being added, so that on recovery
they can be zeroed if the transaction wasn't completed.
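
The ordering described above can be shown with a toy model. This is a
sketch only: the dict-based "disk", the function names and the simulated
crash are all invented for illustration and have nothing to do with how
ext3 is actually implemented.

```python
def append(disk, new_data, data_first, crash_after_first_step=False):
    """Append new_data to a toy one-file 'disk'.

    disk has 'blocks' (block contents, pre-filled with stale junk) and
    'size' (number of blocks the inode claims). data_first=True mimics the
    ordering described above: data is written before the inode's length
    is updated.
    """
    start = disk["size"]

    def write_data():
        for i, chunk in enumerate(new_data):
            disk["blocks"][start + i] = chunk

    def update_inode():
        disk["size"] = start + len(new_data)

    steps = [write_data, update_inode] if data_first else [update_inode, write_data]
    steps[0]()
    if crash_after_first_step:
        return  # power lost before the second step reached the disk
    steps[1]()


def read_file(disk):
    """Return what a reader sees: the blocks the inode claims."""
    return disk["blocks"][:disk["size"]]


def fresh_disk():
    return {"blocks": ["old", "JUNK", "JUNK"], "size": 1}


ordered = fresh_disk()
append(ordered, ["new1", "new2"], data_first=True, crash_after_first_step=True)
print(read_file(ordered))    # ['old']

unordered = fresh_disk()
append(unordered, ["new1", "new2"], data_first=False, crash_after_first_step=True)
print(read_file(unordered))  # ['old', 'JUNK', 'JUNK']
```

With data written first, a crash between the two steps leaves the old file
intact; with the inode updated first, the file's tail exposes whatever was
in those blocks before.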

> While user data journaling seems like a good idea, it is unclear what
> benefits it really provides.

Data gets committed sooner (until pressure or timeouts force the data to be
written to its final spot - then you lose throughput and there's a net delay).
I think for bursts of small file creation, data=journal is a win. I don't
know how lazy ext3 is about writing the data to its final position. It
probably does it when the commit timeout hits 0 or the journal is full.

> As far as barriers go I assume you are referring to the ide write barriers?
> 
> The need for barrier support in the file system is a result of cheap ide
> disks providing large write caches but not having enough reserve power to
> guarantee that the cache will be sync'ed to disk in the event of a power
> failure.

It's needed on any drive (including scsi) that has writeback cache enabled.
Most scsi drives (in my experience) come from the factory with the cache set
to write thru, in case the fs/os doesn't use ordered tags, cache flushes, or
force-unit-access writes.
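
Why a write-back cache needs some form of barrier can be shown with a toy
drive model. This is a sketch under stated assumptions: ToyDrive, its
methods and the block names are invented, the "worst case" destage order is
picked by hand, and a plain cache flush stands in for whichever mechanism
(ordered tags, cache flushes, force-unit-access writes) is really used.

```python
class ToyDrive:
    """Drive with a volatile write-back cache (purely illustrative)."""

    def __init__(self):
        self.cache = {}    # pending writes, lost on power failure
        self.platter = {}  # durable contents

    def write(self, block, data):
        self.cache[block] = data

    def flush(self):
        """Cache flush: make everything written so far durable."""
        self.platter.update(self.cache)
        self.cache.clear()

    def drive_destages(self, block):
        """The drive writes back one cached block of its own choosing."""
        self.platter[block] = self.cache.pop(block)

    def power_fail(self):
        self.cache.clear()


def commit(drive, use_barrier):
    """Write a journal body and its commit record, then lose power."""
    drive.write("journal_body", "new metadata")
    if use_barrier:
        drive.flush()  # body is durable before the commit record is issued
    drive.write("commit_record", "valid")
    # Worst case: the drive destages the commit record first, then dies.
    drive.drive_destages("commit_record")
    drive.power_fail()
    return drive.platter


without_barrier = commit(ToyDrive(), use_barrier=False)
with_barrier = commit(ToyDrive(), use_barrier=True)

print("no barrier:", sorted(without_barrier))
print("barrier:   ", sorted(with_barrier))
```

Without the flush, the platter can end up holding a commit record that
vouches for a journal body which never reached the disk.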

> Note ext3, xfs, and reiser all use write barriers now for ide disks.

What i've found very disappointing is that my raid1 doesn't support them!

Jun 22 10:53:49 zero kernel: Filesystem "md1": Disabling barriers, not
supported by the underlying device

I'm not sure if it's the sata drives that don't support write barriers, or if
it's just the md1 layer. I need to investigate that. I think reiserfs also
complained that trying to enable write barriers fails on that md1 (i've
been playing with various fs'es on it).

-- 
Tom Vier <tmv@comcast.net>
DSA Key ID 0x15741ECE


* Re: Large single raid and XFS or two small ones and EXT3?
  2006-06-23 16:41         ` Christian Pedaschus
  2006-06-23 16:46           ` Christian Pedaschus
@ 2006-06-23 19:53           ` Nix
  1 sibling, 0 replies; 7+ messages in thread
From: Nix @ 2006-06-23 19:53 UTC (permalink / raw)
  To: Christian Pedaschus, linux-raid, linux-fsdevel

On 23 Jun 2006, Christian Pedaschus said:
> and my main point for using ext3 is still: "it's a very mature fs,
> nobody will tell you such horrible stories about data-lossage with ext3
> as with any other filesystem."

Actually I can, but it required bad RAM *and* a broken disk controller
*and* an electrical storm *and* heavy disk loads (only read loads,
but I didn't have noatime active so read implied write).

In my personal experience it's since weathered machines with `only' RAM
so bad that md5sums of 512Kb files wouldn't come out the same way twice
with no problems at all (some file data got corrupted, unsurprisingly,
but the metadata was fine).

Definitely an FS to be relied upon.

-- 
`NB: Anyone suggesting that we should say "Tibibytes" instead of
 Terabytes there will be hunted down and brutally slain.
 That is all.' --- Matthew Wilcox


end of thread

Thread overview: 7+ messages
     [not found] <449AEB7C.6040108@cjx.com>
     [not found] ` <fd8d0180606230226w10b1982ay2916805d9aa0e3bb@mail.gmail.com>
     [not found]   ` <449BE381.6070000@cjx.com>
2006-06-23 14:01     ` Large single raid and XFS or two small ones and EXT3? Al Boldi
2006-06-23 16:06       ` Andreas Dilger
2006-06-23 16:41         ` Christian Pedaschus
2006-06-23 16:46           ` Christian Pedaschus
2006-06-23 19:53           ` Nix
2006-06-23 16:21       ` Russell Cattelan
2006-06-23 18:19         ` Tom Vier
