* Large single raid and XFS or two small ones and EXT3?
@ 2006-06-22 19:11 Chris Allen
2006-06-22 19:16 ` Gordon Henderson
2006-06-23 8:59 ` PFC
0 siblings, 2 replies; 47+ messages in thread
From: Chris Allen @ 2006-06-22 19:11 UTC (permalink / raw)
To: linux-raid
Dear All,
I have a Linux storage server containing 16x750GB drives - so 12TB raw
space.
If I make them into a single RAID5 array, then it appears my only
choice for a filesystem is XFS - as EXT3 won't really handle partitions
over 8TB.
Alternatively, I could split each drive into 2 partitions and have 2 RAID5
arrays, then put an EXT3 on each one.
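For concreteness, this is roughly what the two layouts would look like
(a sketch only; the sdX names are placeholders, not my actual devices):
# Option 1: one 16-drive RAID5 (~11.2TB usable), XFS on top
mdadm --create /dev/md0 --level=5 --raid-devices=16 /dev/sd[a-p]1
mkfs.xfs /dev/md0
# Option 2: two partitions per drive, two ~5.6TB RAID5 arrays, EXT3 on each
mdadm --create /dev/md0 --level=5 --raid-devices=16 /dev/sd[a-p]1
mdadm --create /dev/md1 --level=5 --raid-devices=16 /dev/sd[a-p]2
mkfs.ext3 /dev/md0
mkfs.ext3 /dev/md1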
Can anybody advise on the pros and cons of these two approaches with
regard to stability, reliability and performance? The store is to be used
for files which will have an even split of:
33% approx 2MB in size
33% approx 50KB in size
33% approx 2KB in size
Also:
- I am running a 2.6.15-1 stock FC5 kernel. Would there be any RAID
benefits in me upgrading to the latest 2.6.16 kernel? (I don't want to
do this unless there is a very good reason to.)
- I am running mdadm 2.3.1. Would there be any benefits for me in
upgrading to mdadm v2.5?
- I have read good things about bitmaps. Are these production ready? Any
advice/caveats?
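(From what I've read, a write-intent bitmap can be added to and removed
from an existing array non-destructively; a sketch, md0 assumed:)
mdadm --grow /dev/md0 --bitmap=internal   # add a write-intent bitmap
mdadm --grow /dev/md0 --bitmap=none       # remove it again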
Many thanks for reading,
Chris Allen.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-22 19:11 Large single raid and XFS or two small ones and EXT3? Chris Allen
@ 2006-06-22 19:16 ` Gordon Henderson
2006-06-22 19:23 ` H. Peter Anvin
2006-06-22 20:00 ` Chris Allen
2006-06-23 8:59 ` PFC
1 sibling, 2 replies; 47+ messages in thread
From: Gordon Henderson @ 2006-06-22 19:16 UTC (permalink / raw)
To: Chris Allen; +Cc: linux-raid
On Thu, 22 Jun 2006, Chris Allen wrote:
> Dear All,
>
> I have a Linux storage server containing 16x750GB drives - so 12TB raw
> space.
Just one thing - Do you want to use RAID-5 or RAID-6 ?
I just ask, as with that many drives (and that much data!) the
possibility of a 2nd drive failure increases, and personally,
wherever I can, I take the hit these days, and have used RAID-6 for
some time... drives are cheap, even the 750GB behemoths!
> If I make them into a single RAID5 array, then it appears my only
> choice for a filesystem is XFS - as EXT3 won't really handle partitions
> over 8TB.
I can't help with this though - I didn't realise ext3 had such a
limitation!
Gordon
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-22 19:16 ` Gordon Henderson
@ 2006-06-22 19:23 ` H. Peter Anvin
2006-06-22 19:58 ` Chris Allen
2006-06-22 20:00 ` Chris Allen
1 sibling, 1 reply; 47+ messages in thread
From: H. Peter Anvin @ 2006-06-22 19:23 UTC (permalink / raw)
To: Gordon Henderson; +Cc: Chris Allen, linux-raid
Gordon Henderson wrote:
> On Thu, 22 Jun 2006, Chris Allen wrote:
>
>> Dear All,
>>
>> I have a Linux storage server containing 16x750GB drives - so 12TB raw
>> space.
>
> Just one thing - Do you want to use RAID-5 or RAID-6 ?
>
> I just ask, as with that many drives (and that much data!) the
> possibility of a 2nd drive failure increases, and personally,
> wherever I can, I take the hit these days, and have used RAID-6 for
> some time... drives are cheap, even the 750GB behemoths!
>
>> If I make them into a single RAID5 array, then it appears my only
>> choice for a filesystem is XFS - as EXT3 won't really handle partitions
>> over 8TB.
>
> I can't help with this though - I didn't realise ext3 had such a
> limitation!
>
16 TB (2^32 blocks) should be the right number.
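(With the default 4KiB block size, 2^32 blocks x 4KiB = 16TiB; an 8TB
ceiling would correspond to 2^31 blocks.)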
-hpa
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-22 19:23 ` H. Peter Anvin
@ 2006-06-22 19:58 ` Chris Allen
0 siblings, 0 replies; 47+ messages in thread
From: Chris Allen @ 2006-06-22 19:58 UTC (permalink / raw)
To: linux-raid
H. Peter Anvin wrote:
> Gordon Henderson wrote:
>> On Thu, 22 Jun 2006, Chris Allen wrote:
>>
>>> Dear All,
>>>
>>> I have a Linux storage server containing 16x750GB drives - so 12TB raw
>>> space.
>>
>> Just one thing - Do you want to use RAID-5 or RAID-6 ?
>>
>> I just ask, as with that many drives (and that much data!) the
>> possibility of a 2nd drive failure increases, and personally,
>> wherever I can, I take the hit these days, and have used RAID-6 for
>> some time... drives are cheap, even the 750GB behemoths!
>>
>>> If I make them into a single RAID5 array, then it appears my only
>>> choice for a filesystem is XFS - as EXT3 won't really handle
>>> partitions
>>> over 8TB.
>>
>> I can't help with this though - I didn't realise ext3 had such a
>> limitation!
>>
>
> 16 TB (2^32 blocks) should be the right number.
>
It should be, but mkfs.ext3 won't let me create a filesystem bigger
than 8TB. It appears that the only way round this is through kernel
patches, and, as this is a production machine, I'd rather stick to
mainstream releases and go for one of the above solutions.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-22 19:16 ` Gordon Henderson
2006-06-22 19:23 ` H. Peter Anvin
@ 2006-06-22 20:00 ` Chris Allen
1 sibling, 0 replies; 47+ messages in thread
From: Chris Allen @ 2006-06-22 20:00 UTC (permalink / raw)
To: Gordon Henderson; +Cc: linux-raid
Gordon Henderson wrote:
> On Thu, 22 Jun 2006, Chris Allen wrote:
>
>
>> Dear All,
>>
>> I have a Linux storage server containing 16x750GB drives - so 12TB raw
>> space.
>>
>
> Just one thing - Do you want to use RAID-5 or RAID-6 ?
>
> I just ask, as with that many drives (and that much data!) the
> possibility of a 2nd drive failure increases, and personally,
> wherever I can, I take the hit these days, and have used RAID-6 for
> some time... drives are cheap, even the 750GB behemoths!
>
>
Each of these boxes has an equivalent mirror box - so I'm happy with using
raid5 for the time being.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-22 19:11 Large single raid and XFS or two small ones and EXT3? Chris Allen
2006-06-22 19:16 ` Gordon Henderson
@ 2006-06-23 8:59 ` PFC
2006-06-23 9:26 ` Francois Barre
2006-06-23 19:48 ` Large single raid and XFS or two small ones and EXT3? Nix
1 sibling, 2 replies; 47+ messages in thread
From: PFC @ 2006-06-23 8:59 UTC (permalink / raw)
To: Chris Allen, linux-raid
- XFS is faster and fragments less, but make sure you have a good UPS
- ReiserFS 3.6 is mature and fast, too, you might consider it
- ext3 is slow if you have many files in one directory, but has more
mature tools (resize, recovery etc)
I'd go with XFS or Reiser.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 8:59 ` PFC
@ 2006-06-23 9:26 ` Francois Barre
2006-06-23 12:50 ` Chris Allen
2006-06-23 19:48 ` Large single raid and XFS or two small ones and EXT3? Nix
1 sibling, 1 reply; 47+ messages in thread
From: Francois Barre @ 2006-06-23 9:26 UTC (permalink / raw)
To: linux-raid; +Cc: Chris Allen, PFC
2006/6/23, PFC <lists@peufeu.com>:
>
> - XFS is faster and fragments less, but make sure you have a good UPS
Why a good UPS? XFS has a good strong journal; I've never had an issue
with it yet... And believe me, I did have some dirty things happening
here...
> - ReiserFS 3.6 is mature and fast, too, you might consider it
> - ext3 is slow if you have many files in one directory, but has more
> mature tools (resize, recovery etc)
XFS tools are kind of mature also. Online grow, dump, ...
>
> I'd go with XFS or Reiser.
I'd go with XFS. But I may be kind of fanatic...
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 9:26 ` Francois Barre
@ 2006-06-23 12:50 ` Chris Allen
2006-06-23 13:14 ` Gordon Henderson
` (3 more replies)
0 siblings, 4 replies; 47+ messages in thread
From: Chris Allen @ 2006-06-23 12:50 UTC (permalink / raw)
To: Francois Barre; +Cc: linux-raid
Francois Barre wrote:
> 2006/6/23, PFC <lists@peufeu.com>:
>>
>> - XFS is faster and fragments less, but make sure you have a
>> good UPS
> Why a good UPS? XFS has a good strong journal; I've never had an issue
> with it yet... And believe me, I did have some dirty things happening
> here...
>
>> - ReiserFS 3.6 is mature and fast, too, you might consider it
>> - ext3 is slow if you have many files in one directory, but
>> has more
>> mature tools (resize, recovery etc)
> XFS tools are kind of mature also. Online grow, dump, ...
>
>>
>> I'd go with XFS or Reiser.
> I'd go with XFS. But I may be kind of fanatic...
Strange that whatever the filesystem, you get equal numbers of people
saying they have never lost a single byte and people who have had
horrible corruption and would never touch it again. We stopped using
XFS about a year ago because we were getting kernel stack space panics
under heavy load over NFS. It looks like the time has come to give it
another try.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 12:50 ` Chris Allen
@ 2006-06-23 13:14 ` Gordon Henderson
2006-06-23 13:30 ` Francois Barre
` (2 subsequent siblings)
3 siblings, 0 replies; 47+ messages in thread
From: Gordon Henderson @ 2006-06-23 13:14 UTC (permalink / raw)
To: Chris Allen; +Cc: linux-raid
On Fri, 23 Jun 2006, Chris Allen wrote:
> Strange that whatever the filesystem, you get equal numbers of people
> saying they have never lost a single byte and people who have had
> horrible corruption and would never touch it again. We stopped using
> XFS about a year ago because we were getting kernel stack space panics
> under heavy load over NFS. It looks like the time has come to give it
> another try.
I had a bad experience with XFS a year or so ago, and after being told
to RTFM by the XFS users list when I'd already RTFMd, I gave up on it
(and them).
However, I've just decided to give it a go again (for the single reason
that it's faster at deleting large swathes of files than ext3, which
this server might have to do from time to time), and so far, so good.
Looking back, I think I really had two problems at the time: one was
that I was using LVM too, and it really wasn't production ready; the
other was that the default kernel stack size was 4KB, which was what
was causing me problems under heavy NFS load...
I'm trying it now on a 3.5TB RAID-6 server with a relatively light NFS
(and Samba) load, but will be rolling it out on an identical server
soon which is expected to have a relatively high load, so here's
hoping...
Gordon
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 12:50 ` Chris Allen
2006-06-23 13:14 ` Gordon Henderson
@ 2006-06-23 13:30 ` Francois Barre
2006-06-23 14:46 ` Martin Schröder
2006-06-23 14:01 ` Al Boldi
2006-06-27 12:05 ` Large single raid... - XFS over NFS woes Dexter Filmore
3 siblings, 1 reply; 47+ messages in thread
From: Francois Barre @ 2006-06-23 13:30 UTC (permalink / raw)
To: linux-raid; +Cc: Chris Allen
> Strange that whatever the filesystem, you get equal numbers of people
> saying they have never lost a single byte and people who have had
> horrible corruption and would never touch it again.
[...]
Losing data is worse than losing anything else. You can buy another
hard drive, you can buy another CPU, but you can't buy back the data
you lost... And as far as I know, real life does not implement an
"Undo" button.
So, as a matter of fact, I have started to think that choosing a FS is
much more a matter of personal belief than any kind of scientific,
statistical, or even empirical benchmarking. Something like a new kind
of religion...
For example, back in reiser3.6's first steps in life, I experienced a
handful of oopses, and fuzzy things that made my box behave as if it
were running Redmond stuff... So I dropped Reiser.
Then, several years later, the Reiser4 concepts came to my attention,
and I thought, well, Hans Reiser has great ideas and promising
theories, let's have a closer look.
So I went back to testing reiser3.6. Which just worked flawlessly.
And you know what? I still haven't had time to play with Reiser4.
So I finally chose XFS for all my larger-than-2GB partitions, following
the thread I started back in January: "Linux MD raid5 and
reiser4... Any experience ?".
Anyway, I'm torn between two points of view regarding the FS
experience in Linux:
- maybe no FS can be generic and cover all usage scenarios. Some are
good at some things, some are better at others... and you have to
choose based on your own usage forecasts;
- or maybe there's simply too much choice: whenever a big problem
arises, it's easier to switch filesystems than to go bug hunting... At
least that's the way I reacted a couple of times. And because data
loss is such a sensitive topic, when trust is broken, you just want to
change everything around, and start hating what you were fond of a
minute ago...
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 12:50 ` Chris Allen
2006-06-23 13:14 ` Gordon Henderson
2006-06-23 13:30 ` Francois Barre
@ 2006-06-23 14:01 ` Al Boldi
2006-06-23 16:06 ` Andreas Dilger
2006-06-23 16:21 ` Russell Cattelan
2006-06-27 12:05 ` Large single raid... - XFS over NFS woes Dexter Filmore
3 siblings, 2 replies; 47+ messages in thread
From: Al Boldi @ 2006-06-23 14:01 UTC (permalink / raw)
To: linux-raid; +Cc: linux-fsdevel
Chris Allen wrote:
> Francois Barre wrote:
> > 2006/6/23, PFC <lists@peufeu.com>:
> >> - XFS is faster and fragments less, but make sure you have a
> >> good UPS
> >
> > Why a good UPS? XFS has a good strong journal; I've never had an issue
> > with it yet... And believe me, I did have some dirty things happening
> > here...
> >
> >> - ReiserFS 3.6 is mature and fast, too, you might consider it
> >> - ext3 is slow if you have many files in one directory, but
> >> has more
> >> mature tools (resize, recovery etc)
> >
> > XFS tools are kind of mature also. Online grow, dump, ...
> >
> >> I'd go with XFS or Reiser.
> >
> > I'd go with XFS. But I may be kind of fanatic...
>
> Strange that whatever the filesystem, you get equal numbers of people
> saying they have never lost a single byte and people who have had
> horrible corruption and would never touch it again. We stopped using
> XFS about a year ago because we were getting kernel stack space panics
> under heavy load over NFS. It looks like the time has come to give it
> another try.
If you are keen on data integrity then don't touch any fs w/o data=ordered.
ext3 is still king wrt data=ordered, albeit slow.
Now XFS is fast, but doesn't support data=ordered. It seems that their
solution to the problem is to pass the burden onto hw by using barriers.
Maybe XFS can get away with this. Maybe.
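(For reference, the ext3 mode is chosen at mount time; a sketch, device
and mount point assumed:)
mount -t ext3 -o data=ordered /dev/md0 /data    # default: data hits disk before metadata commits
mount -t ext3 -o data=journal /dev/md0 /data    # full data journaling: safest, slowest
mount -t ext3 -o data=writeback /dev/md0 /data  # metadata-only ordering: fastest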
Thanks!
--
Al
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 13:30 ` Francois Barre
@ 2006-06-23 14:46 ` Martin Schröder
2006-06-23 14:59 ` Francois Barre
` (2 more replies)
0 siblings, 3 replies; 47+ messages in thread
From: Martin Schröder @ 2006-06-23 14:46 UTC (permalink / raw)
To: linux-raid
2006/6/23, Francois Barre <francois.barre@gmail.com>:
> Losing data is worse than losing anything else. You can buy another
That's why RAID is no excuse for backups.
Best
Martin
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 14:46 ` Martin Schröder
@ 2006-06-23 14:59 ` Francois Barre
2006-06-23 15:13 ` Bill Davidsen
2006-06-23 15:17 ` Chris Allen
2 siblings, 0 replies; 47+ messages in thread
From: Francois Barre @ 2006-06-23 14:59 UTC (permalink / raw)
To: linux-raid; +Cc: Martin Schröder
> That's why RAID is no excuse for backups.
>
Of course yes, but...
(I work in the car industry.) Raid is your active (if not pro-active)
safety system, like a car's ESP; if something goes wrong, it
gracefully and automagically re-aligns to the *safe way*. Whereas
backup is your airbag. It's always too late when you use it.
And I've never seen anyone trying to recover something from a backup
without praying...
So, one day or another, I'll develop the strongest backup technology
ever, using marble-based disks and a redundant cluster of Egyptian
scribes.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 14:46 ` Martin Schröder
2006-06-23 14:59 ` Francois Barre
@ 2006-06-23 15:13 ` Bill Davidsen
2006-06-23 15:34 ` Francois Barre
2006-06-23 15:17 ` Chris Allen
2 siblings, 1 reply; 47+ messages in thread
From: Bill Davidsen @ 2006-06-23 15:13 UTC (permalink / raw)
To: Martin Schröder; +Cc: linux-raid
Martin Schröder wrote:
> 2006/6/23, Francois Barre <francois.barre@gmail.com>:
>
>> Losing data is worse than losing anything else. You can buy another
>
>
> That's why RAID is no excuse for backups.
The problem is that there is no cost-effective backup available. When a
tape was the same size as a disk and 10% the cost, backups were
practical. Today anything larger than a hobby-sized disk is just not
easy to back up. Anything large enough to be useful is expensive, and
small media, or anything you can't take off-site and lock in a vault,
aren't backups so much as copies, which may protect against some
problems but provide little to no protection against site disasters.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 14:46 ` Martin Schröder
2006-06-23 14:59 ` Francois Barre
2006-06-23 15:13 ` Bill Davidsen
@ 2006-06-23 15:17 ` Chris Allen
2 siblings, 0 replies; 47+ messages in thread
From: Chris Allen @ 2006-06-23 15:17 UTC (permalink / raw)
To: Martin Schröder; +Cc: linux-raid
Martin Schröder wrote:
> 2006/6/23, Francois Barre <francois.barre@gmail.com>:
>> Losing data is worse than losing anything else. You can buy another
>
> That's why RAID is no excuse for backups.
>
>
We have 50TB stored data now and maybe 250TB this time next year.
We mirror the most recent 20TB to a secondary array and rely on
the RAID for the rest.
I can't think of a practical tape backup strategy given tape sizes at
the moment...
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 15:13 ` Bill Davidsen
@ 2006-06-23 15:34 ` Francois Barre
2006-06-23 19:49 ` Nix
2006-06-24 5:19 ` Neil Brown
0 siblings, 2 replies; 47+ messages in thread
From: Francois Barre @ 2006-06-23 15:34 UTC (permalink / raw)
To: linux-raid
> The problem is that there is no cost effective backup available.
One-liner questions :
- How does Google make backups ?
- Aren't tapes dead yet ?
- What about a NUMA principle applied to storage ?
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 14:01 ` Al Boldi
@ 2006-06-23 16:06 ` Andreas Dilger
2006-06-23 16:41 ` Christian Pedaschus
2006-06-23 16:21 ` Russell Cattelan
1 sibling, 1 reply; 47+ messages in thread
From: Andreas Dilger @ 2006-06-23 16:06 UTC (permalink / raw)
To: Al Boldi; +Cc: linux-raid, linux-fsdevel
On Jun 23, 2006 17:01 +0300, Al Boldi wrote:
> Chris Allen wrote:
> > Francois Barre wrote:
> > > 2006/6/23, PFC <lists@peufeu.com>:
> > >> - ext3 is slow if you have many files in one directory, but
> > >> has more mature tools (resize, recovery etc)
Please use "mke2fs -O dir_index" or "tune2fs -O dir_index" when testing
ext3 performance for many-files-in-dir. This is now the default in
e2fsprogs-1.39 and later.
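Note that on an existing filesystem only newly created directories get
indexed; existing ones can be rebuilt offline, e.g. (device name
assumed):
tune2fs -O dir_index /dev/md0
e2fsck -fD /dev/md0    # unmounted: -D rebuilds and indexes existing directories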
Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 14:01 ` Al Boldi
2006-06-23 16:06 ` Andreas Dilger
@ 2006-06-23 16:21 ` Russell Cattelan
2006-06-23 18:19 ` Tom Vier
1 sibling, 1 reply; 47+ messages in thread
From: Russell Cattelan @ 2006-06-23 16:21 UTC (permalink / raw)
To: Al Boldi; +Cc: linux-raid, linux-fsdevel
Al Boldi wrote:
>Chris Allen wrote:
>
>
>>Francois Barre wrote:
>>
>>
>>>2006/6/23, PFC <lists@peufeu.com>:
>>>
>>>
>>>> - XFS is faster and fragments less, but make sure you have a
>>>>good UPS
>>>>
>>>>
>>>Why a good UPS? XFS has a good strong journal; I've never had an issue
>>>with it yet... And believe me, I did have some dirty things happening
>>>here...
>>>
>>>
>>>
>>>> - ReiserFS 3.6 is mature and fast, too, you might consider it
>>>> - ext3 is slow if you have many files in one directory, but
>>>>has more
>>>>mature tools (resize, recovery etc)
>>>>
>>>>
>>>XFS tools are kind of mature also. Online grow, dump, ...
>>>
>>>
>>>
>>>> I'd go with XFS or Reiser.
>>>>
>>>>
>>>I'd go with XFS. But I may be kind of fanatic...
>>>
>>>
>>Strange that whatever the filesystem, you get equal numbers of people
>>saying they have never lost a single byte and people who have had
>>horrible corruption and would never touch it again. We stopped using
>>XFS about a year ago because we were getting kernel stack space panics
>>under heavy load over NFS. It looks like the time has come to give it
>>another try.
>>
>>
>
>If you are keen on data integrity then don't touch any fs w/o data=ordered.
>
>ext3 is still king wrt data=ordered, albeit slow.
>
>Now XFS is fast, but doesn't support data=ordered. It seems that their
>solution to the problem is to pass the burden onto hw by using barriers.
>Maybe XFS can get away with this. Maybe.
>
>Thanks!
>
>--
>
>
When you refer to data=ordered, are you talking about ext3 user data
journaling?
While user data journaling seems like a good idea, it is unclear what
benefits it really provides. By writing all user data twice, the write
performance of the file system is effectively halved. Granted, the log
is one area of the disk, so some performance advantage shows up due to
less head seeking for those writes.
As far as metadata journaling goes, it is a fundamental requirement
that the journal be synced to disk to a given point in order to release
the pinned metadata, thus allowing the metadata itself to be synced to
disk.
The way most file systems guarantee consistency is to either sync all
outstanding metadata changes to disk or to sync a record of what
in-core changes have been made.
In the XFS case, since it logs metadata deltas to the log, it can
record more change operations in a smaller number of disk blocks; ext3,
on the other hand, writes the entire metadata block to the log.
As far as barriers go, I assume you are referring to the ide write
barriers? The need for barrier support in the file system is a result
of cheap ide disks providing large write caches but not having enough
reserve power to guarantee that the cache will be synced to disk in the
event of a power failure.
Originally, when xfs was written, the disks/raids used by SGI systems
were pretty much exclusively enterprise-level devices that guaranteed
the write caches would be flushed in the event of a power failure.
Note ext3, xfs, and reiser all use write barriers now for ide disks.
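(For reference, a sketch of how barriers are enabled per filesystem via
mount options - defaults vary by kernel version, so treat the defaults
noted here as assumptions to verify:)
mount -o barrier=1 /dev/md0 /data        # ext3: barriers off by default in mainline
mount -o barrier=flush /dev/md0 /data    # reiserfs
mount -o barrier /dev/md0 /data          # xfs: on by default; nobarrier disables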
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 16:06 ` Andreas Dilger
@ 2006-06-23 16:41 ` Christian Pedaschus
2006-06-23 16:46 ` Christian Pedaschus
2006-06-23 19:53 ` Nix
0 siblings, 2 replies; 47+ messages in thread
From: Christian Pedaschus @ 2006-06-23 16:41 UTC (permalink / raw)
To: Andreas Dilger; +Cc: Al Boldi, linux-raid, linux-fsdevel
Andreas Dilger wrote:
>On Jun 23, 2006 17:01 +0300, Al Boldi wrote:
>
>
>>Chris Allen wrote:
>>
>>
>>>Francois Barre wrote:
>>>
>>>
>>>>2006/6/23, PFC <lists@peufeu.com>:
>>>>
>>>>
>>>>> - ext3 is slow if you have many files in one directory, but
>>>>>has more mature tools (resize, recovery etc)
>>>>>
>>>>>
>
>Please use "mke2fs -O dir_index" or "tune2fs -O dir_index" when testing
>ext3 performance for many-files-in-dir. This is now the default in
>e2fsprogs-1.39 and later.
>
>
for ext3 use (on unmounted disks):
tune2fs -O has_journal -o journal_data /dev/{disk}
tune2fs -O dir_index /dev/{disk}
if data is already on the drive, you need to run a fsck afterwards, and
it uses a good bit of ram, but it makes ext3 a good bit faster.
and my main point for using ext3 is still: "it's a very mature fs;
nobody will tell you such horrible stories about data loss with ext3 as
with any other filesystem."
and there are undelete tools for ext3.
so if you're after data integrity (i guess you are, else you would not
use raid, or? ;) ), use ext3, and if you need the last single kb/s, get
a faster drive, or use lots of them in a good raid combo, and/or use a
separate disk for the journal (man 8 tune2fs)
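(a sketch of that separate-journal setup; device names are
placeholders, and the journal device must use the same block size as
the filesystem:)
mke2fs -O journal_dev -b 4096 /dev/sdc1   # dedicate a partition as an external journal
tune2fs -O ^has_journal /dev/md0          # drop the internal journal (fs unmounted)
tune2fs -j -J device=/dev/sdc1 /dev/md0   # re-add the journal on the external device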
my 0.5 cents,
greets chris
ps. but you know, filesystem choosage is not pure science, it's
half-religion :D
>Cheers, Andreas
>--
>Andreas Dilger
>Principal Software Engineer
>Cluster File Systems, Inc.
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 16:41 ` Christian Pedaschus
@ 2006-06-23 16:46 ` Christian Pedaschus
2006-06-23 19:53 ` Nix
1 sibling, 0 replies; 47+ messages in thread
From: Christian Pedaschus @ 2006-06-23 16:46 UTC (permalink / raw)
Cc: Andreas Dilger, Al Boldi, linux-raid, linux-fsdevel
Christian Pedaschus wrote:
>for ext3 use (on unmounted disks):
>tune2fs -O has_journal -o journal_data /dev/{disk}
>tune2fs -O dir_index /dev/{disk}
>
>if data is already on the drive, you need to run a fsck afterwards,
>and it uses a good bit of ram, but it makes ext3 a good bit faster.
>
>and my main point for using ext3 is still: "it's a very mature fs;
>nobody will tell you such horrible stories about data loss with ext3
>as with any other filesystem."
>and there are undelete tools for ext3.
>
>so if you're after data integrity (i guess you are, else you would not
>use raid, or? ;) ), use ext3, and if you need the last single kb/s,
>get a faster drive, or use lots of them in a good raid combo, and/or
>use a separate disk for the journal (man 8 tune2fs)
>
>my 0.5 cents,
>greets chris
>
>ps. but you know, filesystem choosage is not pure science, it's
>half-religion :D
>
>
Oops, should be:
tune2fs -O has_journal -o journal_data /dev/{partition}
tune2fs -O dir_index /dev/{partition}
;)
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 16:21 ` Russell Cattelan
@ 2006-06-23 18:19 ` Tom Vier
0 siblings, 0 replies; 47+ messages in thread
From: Tom Vier @ 2006-06-23 18:19 UTC (permalink / raw)
To: Russell Cattelan; +Cc: Al Boldi, linux-raid, linux-fsdevel
On Fri, Jun 23, 2006 at 11:21:34AM -0500, Russell Cattelan wrote:
> When you refer to data=ordered, are you talking about ext3 user data
> journaling?
iirc, data=ordered just writes new data out before updating block
pointers, the file's length in its inode, and the block usage bitmap.
That way you don't get junk or zeroed data at the tail of the file.
However, I think to prevent data leaks (from deleted files),
data=writeback requires a write to the journal, indicating what blocks
are being added, so that on recovery they can be zeroed if the
transaction wasn't completed.
> While user data journaling seems like a good idea, it is unclear what
> benefits it really provides.
Data gets committed sooner (until pressure or timeouts force the data
to be written to its final spot - then you lose throughput and there's
a net delay). I think for bursts of small file creation, data=journal
is a win. I don't know how lazy ext3 is about writing the data to its
final position. It probably does it when the commit timeout hits 0 or
the journal is full.
> As far as barriers go I assume you are referring to the ide write barriers?
>
> The need for barrier support in the file system is a result of cheap ide
> disks providing large write caches but not having enough reserve power to
> guarantee that the cache will be sync'ed to disk in the event of a power
> failure.
It's needed on any drive (including scsi) that has write-back cache
enabled. Most scsi drives (in my experience) come from the factory with
the cache set to write-through, in case the fs/os doesn't use ordered
tags, cache flushes, or force-unit-access writes.
> Note ext3, xfs, and reiser all use write barriers now for ide disks.
What I've found very disappointing is that my raid1 doesn't support
them!
Jun 22 10:53:49 zero kernel: Filesystem "md1": Disabling barriers, not
supported by the underlying device
I'm not sure if it's the sata drives that don't support write barriers,
or if it's just the md1 layer. I need to investigate that. I think
reiserfs also complained that trying to enable write barriers failed on
that md1 (I've been playing with various fs'es on it).
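(If it turns out the md layer is eating them, the fallback I know of is
to turn off the drives' write-back caches and take the performance hit;
a sketch, device name assumed:)
hdparm -W 0 /dev/sda   # IDE/SATA: disable the write-back cache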
--
Tom Vier <tmv@comcast.net>
DSA Key ID 0x15741ECE
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 8:59 ` PFC
2006-06-23 9:26 ` Francois Barre
@ 2006-06-23 19:48 ` Nix
2006-06-25 19:13 ` David Rees
1 sibling, 1 reply; 47+ messages in thread
From: Nix @ 2006-06-23 19:48 UTC (permalink / raw)
To: PFC; +Cc: Chris Allen, linux-raid
On 23 Jun 2006, PFC suggested tentatively:
> - ext3 is slow if you have many files in one directory, but has
> more mature tools (resize, recovery etc)
This is much less true if you turn on the dir_index feature.
--
`NB: Anyone suggesting that we should say "Tibibytes" instead of
Terabytes there will be hunted down and brutally slain.
That is all.' --- Matthew Wilcox
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 15:34 ` Francois Barre
@ 2006-06-23 19:49 ` Nix
2006-06-24 5:19 ` Neil Brown
1 sibling, 0 replies; 47+ messages in thread
From: Nix @ 2006-06-23 19:49 UTC (permalink / raw)
To: Francois Barre; +Cc: linux-raid
On 23 Jun 2006, Francois Barre uttered the following:
>> The problem is that there is no cost effective backup available.
>
> One-liner questions :
> - How does Google make backups ?
Replication across huge numbers of cheap machines on a massively
distributed filesystem.
--
`NB: Anyone suggesting that we should say "Tibibytes" instead of
Terabytes there will be hunted down and brutally slain.
That is all.' --- Matthew Wilcox
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 16:41 ` Christian Pedaschus
2006-06-23 16:46 ` Christian Pedaschus
@ 2006-06-23 19:53 ` Nix
1 sibling, 0 replies; 47+ messages in thread
From: Nix @ 2006-06-23 19:53 UTC (permalink / raw)
To: Christian Pedaschus, linux-raid, linux-fsdevel
On 23 Jun 2006, Christian Pedaschus said:
> and my main point for using ext3 is still: "it's a very mature fs;
> nobody will tell you such horrible stories about data loss with ext3
> as with any other filesystem."
Actually I can, but it required bad RAM *and* a broken disk controller
*and* an electrical storm *and* heavy disk loads (only read loads,
but I didn't have noatime active so read implied write).
In my personal experience it's since weathered machines with `only' RAM
so bad that md5sums of 512Kb files wouldn't come out the same way twice
with no problems at all (some file data got corrupted, unsurprisingly,
but the metadata was fine).
Definitely an FS to be relied upon.
--
`NB: Anyone suggesting that we should say "Tibibytes" instead of
Terabytes there will be hunted down and brutally slain.
That is all.' --- Matthew Wilcox
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 15:34 ` Francois Barre
2006-06-23 19:49 ` Nix
@ 2006-06-24 5:19 ` Neil Brown
2006-06-24 7:59 ` Adam Talbot
2006-06-24 12:40 ` Justin Piszcz
1 sibling, 2 replies; 47+ messages in thread
From: Neil Brown @ 2006-06-24 5:19 UTC (permalink / raw)
To: Francois Barre; +Cc: linux-raid
On Friday June 23, francois.barre@gmail.com wrote:
> > The problem is that there is no cost effective backup available.
>
> One-liner questions :
> - How does Google make backups ?
No, Google ARE the backups :-)
> - Aren't tapes dead yet ?
LTO-3 does 300Gig, and LTO-4 is planned.
They may not cope with tera-byte arrays in one hit, but they still
have real value.
> - What about a NUMA principle applied to storage ?
You mean an Hierarchical Storage Manager? Yep, they exist. I'm sure
SGI, EMC and assorted other TLAs could sell you one.
NeilBrown
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-24 5:19 ` Neil Brown
@ 2006-06-24 7:59 ` Adam Talbot
2006-06-24 9:34 ` David Greaves
2006-06-25 23:57 ` Bill Davidsen
2006-06-24 12:40 ` Justin Piszcz
1 sibling, 2 replies; 47+ messages in thread
From: Adam Talbot @ 2006-06-24 7:59 UTC (permalink / raw)
To: Neil Brown; +Cc: Francois Barre, linux-raid
OK, this topic I really need to get in on.
I have spent the last few weeks benchmarking my new 1.2TB, 6-disk,
RAID6 array. I wanted real numbers, not "This FS is faster because..."
I have moved over 100TB of data on my new array running the benchmark
testing. I have yet to have any major problems with ReiserFS, EXT2/3,
JFS, or XFS. I have done extensive testing on all, including just
trying to break the file system with billions of 1KB files, or a 1TB
file. I was able to cause some problems with EXT3 and ReiserFS with the
1KB and 1TB tests, respectively, but both were fixed with a fsck. My
basic test is to move all data from my old server to my new server
(whitequeen2) and clock the transfer time. Whitequeen2 has very little
storage; the NAS's 1.2TB of storage is attached via iSCSI and a
crossover cable to the back of whitequeen2. The data is 100GB of users'
files (1KB~2MB), 50GB of MP3's (1MB~5MB), and the rest is movies and
system backups (600MB~2GB). Here is a copy of my current data sheet,
including specs on the servers and copy times; my numbers are not
perfect, but they should give you a clue about speeds... XFS wins.
The computer: whitequeen2
AMD Athlon64 3200 (2.0GHz)
1GB Corsair DDR 400 (2X 512MB's running in dual DDR mode)
Foxconn 6150K8MA-8EKRS motherboard
Off brand case/power supply
2X os disks, software raid array, RAID 1, Maxtor 51369U3, FW DA620CQ0
Intel pro/1000 NIC
CentOS 4.3 X86_64 2.6.9
Main app server, Apache, Samba, NFS, NIS
The computer: nas
AMD Athlon64 3000 (1.8GHz)
256MB Corsair DDR 400 (2X 128MB's running in dual DDR mode)
Foxconn 6150K8MA-8EKRS motherboard
Off brand case/power supply and drive cages
2X os disks, software raid array, RAID 1, Maxtor 51369U3, FW DA620CQ0
6X software raid array, RAID 6, Maxtor 7V300F0, FW VA111900
Gentoo linux. X86_64 2.6.16-gentoo-r9
System built very light, only built as an iSCSI-based NAS.
NFS mount from whitequeen (old server) goes to /mnt/tmp
Target iSCSI to NAS, or when running on local NAS, is /data
Raw dump /dev/null (Speed mark, how fast is the old whitequeen, Read test)
Config=APP+NFS-->/dev/null
[root@whitequeen2 tmp]# time tar cf - . | cat - > /dev/null
real 216m30.621s
user 1m24.222s
sys 15m20.031s
3.6 hours @ 105371M/hour or 1756M/min or *29.27M/sec*
XFS
Config=APP+NFS-->NAS+iSCSI
RAID6 64K chunk
[root@whitequeen2 tmp]# time tar cf - . | (cd /data ; tar xf - )
real 323m9.990s
user 1m28.556s
sys 31m6.405s
/dev/sdb1 1.1T 371G 748G 34% /data
5.399 hours @ 70,260M/hour or 1171M/min or 19.52M/sec
Pass 2 of XFS (are my numbers repeatable? Yes)
real 320m11.615s
user 1m26.997s
sys 31m11.987s
XFS (Direct NFS connection, no app server, max "real world" speed of my
array?)
Config=NAS+NFS
RAID6 64K chunk
nas tmp # time tar cf - . | (cd /data ; tar xf - )
real 241m8.698s
user 1m2.760s
sys 25m9.770s
/dev/md/0 1.1T 371G 748G 34% /data
4.417 hours @ 85,880M/hour or 1,431M/min or *23.86M/sec*
EXT3
Config=APP+NFS-->NAS+iSCSI
RAID6 64K chunk
[root@whitequeen2 tmp]# time tar cf - . | (cd /data ; tar xf - )
real 371m29.802s
user 1m28.492s
sys 46m48.947s
/dev/sdb1 1.1T 371G 674G 36% /data
6.192 hours @ 61,262M/hour or 1021M/min or 17.02M/sec
EXT2
Config=APP+NFS-->NAS+iSCSI
RAID6 64K chunk
[root@whitequeen2 tmp]# time tar cf - . | ( cd /data/ ; tar xf - )
real 401m48.702s
user 1m25.599s
sys 30m22.620s
/dev/sdb1 1.1T 371G 674G 36% /data
6.692 hours @ 56,684M/hour or 945M/min or 15.75M/sec
JFS
Config=APP+NFS-->NAS+iSCSI
RAID6 64K chunk
[root@whitequeen2 tmp]# time tar cf - . | (cd /data ; tar xf - )
real 337m52.125s
user 1m26.526s
sys 32m33.983s
/dev/sdb1 1.1T 371G 748G 34% /data
5.625 hours @ 67,438M/hour or 1124M/min or 18.73M/sec
ReiserFS
Config=APP+NFS-->NAS+iSCSI
RAID6 64K chunk
[root@whitequeen2 tmp]# time tar cf - . | (cd /data ; tar xf - )
real 334m33.615s
user 1m31.098s
sys 48m41.193s
/dev/sdb1 1.1T 371G 748G 34% /data
5.572 hours @ 68,078M/hour or 1135M/min or 18.91M/sec
Word count
[root@whitequeen2 tmp]# ls | wc
66612 301527 5237755
Actual size = 379,336M
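(The rates above are just total size over wall-clock time, e.g. for the
raw dump: 379,336M / ~3.6 hours = 105,371M/hour = 1,756M/min =
29.27M/sec.)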
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-24 7:59 ` Adam Talbot
@ 2006-06-24 9:34 ` David Greaves
2006-06-24 22:52 ` Adam Talbot
2006-06-25 23:57 ` Bill Davidsen
1 sibling, 1 reply; 47+ messages in thread
From: David Greaves @ 2006-06-24 9:34 UTC (permalink / raw)
To: Adam Talbot; +Cc: Neil Brown, Francois Barre, linux-raid
Adam Talbot wrote:
> OK, this topic I really need to get in on.
> I have spent the last few weeks benchmarking my new 1.2TB, 6-disk,
> RAID6 array.
Very interesting. Thanks.
Did you get around to any 'tuning'?
Things like raid chunk size, external logs for xfs, blockdev readahead
on the underlying devices and the raid device?
David
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-24 5:19 ` Neil Brown
2006-06-24 7:59 ` Adam Talbot
@ 2006-06-24 12:40 ` Justin Piszcz
2006-06-26 0:06 ` Bill Davidsen
1 sibling, 1 reply; 47+ messages in thread
From: Justin Piszcz @ 2006-06-24 12:40 UTC (permalink / raw)
To: Neil Brown; +Cc: Francois Barre, linux-raid
On Sat, 24 Jun 2006, Neil Brown wrote:
> On Friday June 23, francois.barre@gmail.com wrote:
>>> The problem is that there is no cost effective backup available.
>>
>> One-liner questions :
>> - How does Google make backups ?
>
> No, Google ARE the backups :-)
>
>> - Aren't tapes dead yet ?
>
> LTO-3 does 300Gig, and LTO-4 is planned.
> They may not cope with tera-byte arrays in one hit, but they still
> have real value.
>
>> - What about a NUMA principle applied to storage ?
>
> You mean an Hierarchical Storage Manager? Yep, they exist. I'm sure
> SGI, EMC and assorted other TLAs could sell you one.
>
> NeilBrown
LTO3 is 400GB native and we've seen very good compression, so 800GB-1TB
per tape.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-24 9:34 ` David Greaves
@ 2006-06-24 22:52 ` Adam Talbot
2006-06-25 13:06 ` Joshua Baker-LePain
2006-06-25 14:51 ` Large single raid and XFS or two small ones and EXT3? Adam Talbot
0 siblings, 2 replies; 47+ messages in thread
From: Adam Talbot @ 2006-06-24 22:52 UTC (permalink / raw)
To: David Greaves; +Cc: Neil Brown, Francois Barre, linux-raid
Trying to test for tuning with different chunks. Just finished the 16K
chunk and am about 20% done with the 32K test. Here are the numbers on
the 16K chunk; I will send 32, 96, 128, 192 and 256 as I get them. But
keep in mind each one of these tests takes about 4~6 hours, so it is a
slow process... I have settled on XFS as the file system type; it seems
to be able to beat anything else out there.
-Adam
XFS
Config=NAS+NFS
RAID6 16K chunk
nas tmp # time tar cf - . | (cd /data ; tar xf - )
real 252m40.143s
user 1m4.720s
sys 25m6.270s
/dev/md/0 1.1T 371G 748G 34% /data
4.207 hours @ 90,167M/hour or 1502M/min or 25.05M/sec
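(For anyone replaying this: each pass means recreating the array with
the new chunk size and re-making the filesystem, along these lines -
device names are placeholders:)
mdadm --create /dev/md0 --level=6 --raid-devices=6 --chunk=32 /dev/sd[b-g]1
mkfs.xfs -f /dev/md0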
David Greaves wrote:
> Adam Talbot wrote:
>
>> OK, this topic I really need to get in on.
>> I have spent the last few weeks benchmarking my new 1.2TB, 6-disk,
>> RAID6 array.
>>
> Very interesting. Thanks.
>
> Did you get around to any 'tuning'?
> Things like raid chunk size, external logs for xfs, blockdev readahead
> on the underlying devices and the raid device?
>
> David
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-24 22:52 ` Adam Talbot
@ 2006-06-25 13:06 ` Joshua Baker-LePain
2006-06-28 3:45 ` I need a PCI V2.1 4 port SATA card Guy
2006-06-25 14:51 ` Large single raid and XFS or two small ones and EXT3? Adam Talbot
1 sibling, 1 reply; 47+ messages in thread
From: Joshua Baker-LePain @ 2006-06-25 13:06 UTC (permalink / raw)
To: Adam Talbot; +Cc: linux-raid
On Sat, 24 Jun 2006 at 3:52pm, Adam Talbot wrote
> nas tmp # time tar cf - . | (cd /data ; tar xf - )
A (bit) cleaner way to accomplish the same thing:
tar cf - --totals . | tar xC /data -f -
--
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-24 22:52 ` Adam Talbot
2006-06-25 13:06 ` Joshua Baker-LePain
@ 2006-06-25 14:51 ` Adam Talbot
2006-06-25 20:35 ` Chris Allen
1 sibling, 1 reply; 47+ messages in thread
From: Adam Talbot @ 2006-06-25 14:51 UTC (permalink / raw)
To: Adam Talbot; +Cc: David Greaves, Neil Brown, Francois Barre, linux-raid
ACK!
At one point someone stated that they were having problems with XFS
crashing under high NFS loads... Did it look something like this?
-Adam
Starting XFS recovery on filesystem: md0 (logdev: internal)
Filesystem "md0": XFS internal error xlog_valid_rec_header(1) at line
3478 of file fs/xfs/xfs_log_recover.c. Caller 0xffffffff802114fc
Call Trace: <ffffffff80211437>{xlog_valid_rec_header+231}
<ffffffff802114fc>{xlog_do_recovery_pass+172}
<ffffffff8020f0c8>{xlog_find_tail+2344}
<ffffffff802217e1>{kmem_alloc+97}
<ffffffff80211bb0>{xlog_recover+192}
<ffffffff8020c564>{xfs_log_mount+1380}
<ffffffff80213968>{xfs_mountfs+2712}
<ffffffff8016aa3a>{set_blocksize+138}
<ffffffff80224d1d>{xfs_setsize_buftarg_flags+61}
<ffffffff802192b4>{xfs_mount+2724}
<ffffffff8022ae00>{linvfs_fill_super+0}
<ffffffff8022aeb8>{linvfs_fill_super+184}
<ffffffff8024a62e>{strlcpy+78}
<ffffffff80169db2>{sget+722} <ffffffff8016a460>{set_bdev_super+0}
<ffffffff8022ae00>{linvfs_fill_super+0}
<ffffffff8022ae00>{linvfs_fill_super+0}
<ffffffff8016a5bc>{get_sb_bdev+268}
<ffffffff8016a84b>{do_kern_mount+107}
<ffffffff8017eed3>{do_mount+1603}
<ffffffff8011a2f9>{do_page_fault+1033}
<ffffffff80145f66>{find_get_pages+22}
<ffffffff8014d57a>{invalidate_mapping_pages+202}
<ffffffff80149f99>{__alloc_pages+89}
<ffffffff8014a234>{__get_free_pages+52}
<ffffffff8017f257>{sys_mount+151} <ffffffff8010a996>{system_call+126}
XFS: log mount/recovery failed: error 990
XFS: log mount failed
Adam Talbot wrote:
> Trying to test for tuning with different chunks. Just finished the 16K
> chunk and am about 20% done with the 32K test. Here are the numbers on
> the 16K chunk; I will send 32, 96, 128, 192 and 256 as I get them. But
> keep in mind each one of these tests takes about 4~6 hours, so it is a
> slow process... I have settled on XFS as the file system type; it
> seems to be able to beat anything else out there.
> -Adam
>
> XFS
> Config=NAS+NFS
> RAID6 16K chunk
> nas tmp # time tar cf - . | (cd /data ; tar xf - )
> real 252m40.143s
> user 1m4.720s
> sys 25m6.270s
> /dev/md/0 1.1T 371G 748G 34% /data
> 4.207 hours @ 90,167M/hour or 1502M/min or 25.05M/sec
>
>
>
>
> David Greaves wrote:
>
>> Adam Talbot wrote:
>>
>>
>>> OK, this topic I really need to get in on.
>>> I have spent the last few weeks benchmarking my new 1.2TB, 6-disk,
>>> RAID6 array.
>>>
>>>
>> Very interesting. Thanks.
>>
>> Did you get around to any 'tuning'?
>> Things like raid chunk size, external logs for xfs, blockdev readahead
>> on the underlying devices and the raid device?
>>
>> David
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-23 19:48 ` Large single raid and XFS or two small ones and EXT3? Nix
@ 2006-06-25 19:13 ` David Rees
0 siblings, 0 replies; 47+ messages in thread
From: David Rees @ 2006-06-25 19:13 UTC (permalink / raw)
To: Nix; +Cc: PFC, Chris Allen, linux-raid
On 6/23/06, Nix <nix@esperi.org.uk> wrote:
> On 23 Jun 2006, PFC suggested tentatively:
> > - ext3 is slow if you have many files in one directory, but has
> > more mature tools (resize, recovery etc)
>
> This is much less true if you turn on the dir_index feature.
However, even with dir_index, deleting large files is still much
slower with ext2/3 than xfs or jfs.
-Dave
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-25 14:51 ` Large single raid and XFS or two small ones and EXT3? Adam Talbot
@ 2006-06-25 20:35 ` Chris Allen
0 siblings, 0 replies; 47+ messages in thread
From: Chris Allen @ 2006-06-25 20:35 UTC (permalink / raw)
To: Adam Talbot; +Cc: linux-raid
Adam Talbot wrote:
> ACK!
> At one point some one stated that they were having problems with XFS
> crashing under high NFS loads... Did it look something like this?
> -Adam
>
>
>
nope, it looked like the trace below - and I could make it happen
consistently by thrashing xfs.
Not even sure it was over NFS - this could well have been a local test.
----------------------
do_IRQ: stack overflow: 304
Unable to handle kernel paging request at virtual address a554b923
printing eip:
c011b202
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: nfsd(U) lockd(U) md5(U) ipv6(U) autofs4(U) sunrpc(U)
xfs(U) exportfs(U) video(U) button(U) battery(U) ac(U) uhci_hcd(U)
ehci_hcd(U) i2c_i801(U) i2c_core(U) shpchp(U) e1000(U) floppy(U)
dm_snapshot(U) dm_zero(U) dm_mirror(U) ext3(U) jbd(U) raid5(U) xor(U)
dm_mod(U) ata_piix(U) libata(U) aar81xx(U) sd_mod(U) scsi_mod(U)
CPU: 10
EIP: 0060:[<c011b202>] Tainted: P VLI
EFLAGS: 00010086 (2.6.11-2.6.11)
EIP is at activate_task+0x34/0x9b
eax: e514b703 ebx: 00000000 ecx: 028f8800 edx: c0400200
esi: 028f8800 edi: 000f4352 ebp: f545d02c esp: f545d018
ds: 007b es: 007b ss: 0068
Process (pid: 947105536, threadinfo=f545c000 task=f5a27000)
Stack: badc0ded c3630160 f7ae4a80 c0400200 f7ae4a80 c3630160 f545d074
c011b785
00000000 c0220f39 00000001 00000086 00000000 00000001 00000003
f7ae4a80
00000082 00000001 0000000a 00000000 c02219da f7d7cf60 c035d914
00000000
Call Trace:
[<c011b785>] try_to_wake_up+0x24a/0x2aa
[<c0220f39>] scrup+0xcf/0xd9
[<c02219da>] set_cursor+0x4f/0x60
[<c01348b0>] autoremove_wake_function+0x15/0x37
[<c011d197>] __wake_up_common+0x39/0x59
[<c011d1e9>] __wake_up+0x32/0x43
[<c0121e2c>] release_console_sem+0xad/0xb5
[<c0121c48>] vprintk+0x1e7/0x29e
[<c0121a5d>] printk+0x1b/0x1f
[<c010664b>] do_IRQ+0x7f/0x86
[<c0104a3e>] common_interrupt+0x1a/0x20
[<c024b5fa>] cfq_may_queue+0x0/0xcd
[<c02425e4>] get_request+0xf2/0x2b7
[<c02430cc>] __make_request+0xbe/0x472
[<c024375b>] generic_make_request+0x91/0x234
[<f881be38>] compute_blocknr+0xe5/0x16e [raid5]
[<c013489b>] autoremove_wake_function+0x0/0x37
[<f881d0c2>] handle_stripe+0x736/0x109e [raid5]
[<f881b45a>] get_active_stripe+0x1fb/0x36c [raid5]
[<f881deed>] make_request+0x2e1/0x30d [raid5]
[<c013489b>] autoremove_wake_function+0x0/0x37
[<c024375b>] generic_make_request+0x91/0x234
[<c03054e1>] schedule+0x431/0xc5e
[<c024a3f4>] cfq_sort_rr_list+0x9b/0xe6
[<c0148c27>] buffered_rmqueue+0xc4/0x1fb
[<c013489b>] autoremove_wake_function+0x0/0x37
[<c0243944>] submit_bio+0x46/0xcc
[<c0147aae>] mempool_alloc+0x6f/0x108
[<c013489b>] autoremove_wake_function+0x0/0x37
[<c0166696>] bio_add_page+0x26/0x2c
[<f9419fe7>] _pagebuf_ioapply+0x175/0x2e3 [xfs]
[<f941a185>] pagebuf_iorequest+0x30/0x133 [xfs]
[<f9419643>] xfs_buf_get_flags+0xe8/0x147 [xfs]
[<f9419d45>] pagebuf_iostart+0x76/0x82 [xfs]
[<f9419707>] xfs_buf_read_flags+0x65/0x89 [xfs]
[<f940c105>] xfs_trans_read_buf+0x122/0x334 [xfs]
[<f93d9dc2>] xfs_btree_read_bufs+0x7d/0x97 [xfs]
[<f93c0d7a>] xfs_alloc_lookup+0x326/0x47b [xfs]
[<f93bc96b>] xfs_alloc_fixup_trees+0x14f/0x320 [xfs]
[<f93d99d9>] xfs_btree_init_cursor+0x1d/0x17f [xfs]
[<f93bdc38>] xfs_alloc_ag_vextent_size+0x377/0x456 [xfs]
[<f93bcbdb>] xfs_alloc_read_agfl+0x9f/0xb9 [xfs]
[<f93bccf5>] xfs_alloc_ag_vextent+0x100/0x102 [xfs]
[<f93be929>] xfs_alloc_fix_freelist+0x2ca/0x478 [xfs]
[<f93bf087>] xfs_alloc_vextent+0x182/0x570 [xfs]
[<f93cdff3>] xfs_bmap_alloc+0x111e/0x18e9 [xfs]
[<c013489b>] autoremove_wake_function+0x0/0x37
[<c024375b>] generic_make_request+0x91/0x234
[<f891eb40>] EdmaReqQueueInsert+0x70/0x80 [aar81xx]
[<c011cf79>] scheduler_tick+0x236/0x40f
[<c011cf79>] scheduler_tick+0x236/0x40f
[<f93d833e>] xfs_bmbt_get_state+0x13/0x1c [xfs]
[<f93cfebf>] xfs_bmap_do_search_extents+0xc3/0x476 [xfs]
[<f93d1b9f>] xfs_bmapi+0x72a/0x1670 [xfs]
[<f93d833e>] xfs_bmbt_get_state+0x13/0x1c [xfs]
[<f93ffdf7>] xlog_grant_log_space+0x329/0x350 [xfs]
[<f93fb3d0>] xfs_iomap_write_allocate+0x2d1/0x572 [xfs]
[<c0243944>] submit_bio+0x46/0xcc
[<c0147aae>] mempool_alloc+0x6f/0x108
[<f93fa368>] xfs_iomap+0x3ef/0x50c [xfs]
[<f94173fd>] xfs_map_blocks+0x39/0x71 [xfs]
[<f94183b3>] xfs_page_state_convert+0x4b9/0x6ab [xfs]
[<f9418b1d>] linvfs_writepage+0x57/0xd5 [xfs]
[<c014e71d>] pageout+0x84/0x101
[<c014ea1b>] shrink_list+0x281/0x454
[<c014db1b>] __pagevec_lru_add+0xac/0xbb
[<c014ed82>] shrink_cache+0xe7/0x26c
[<c014f33f>] shrink_zone+0x76/0xbb
[<c014f3e5>] shrink_caches+0x61/0x6f
[<c014f4b8>] try_to_free_pages+0xc5/0x18d
[<c0148fbb>] __alloc_pages+0x1cc/0x407
[<c014674a>] generic_file_buffered_write+0x148/0x60c
[<c0180ee8>] __mark_inode_dirty+0x28/0x199
[<f941f444>] xfs_write+0xa36/0xd03 [xfs]
[<f941b89d>] linvfs_write+0xe9/0x102 [xfs]
[<c013489b>] autoremove_wake_function+0x0/0x37
[<c014294d>] audit_syscall_entry+0x10b/0x15e
[<f941b7b4>] linvfs_write+0x0/0x102 [xfs]
[<c0161a27>] vfs_write+0x9e/0x110
[<c0161b44>] sys_write+0x41/0x6a
[<c0104009>] syscall_call+0x7/0xb
Code: 89 45 f0 89 55 ec 89 cb e8 24 57 ff ff 89 c6 89 d7 85 db 75 27 ba
00 02 40 c0 b8 00 f0 ff ff 21 e0 8b 40 10 8b 04 85 20 50 40 c0 <2b> 74
02 20 1b 7c 02 24 8b 45 ec 03 70 20 13 78 24 89 f2 89 f9
hr_ioreq_timedout: (0,5,0) opcode 0x28: Enter
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-24 7:59 ` Adam Talbot
2006-06-24 9:34 ` David Greaves
@ 2006-06-25 23:57 ` Bill Davidsen
2006-06-26 0:42 ` Adam Talbot
1 sibling, 1 reply; 47+ messages in thread
From: Bill Davidsen @ 2006-06-25 23:57 UTC (permalink / raw)
To: Adam Talbot; +Cc: Neil Brown, Francois Barre, linux-raid
Adam Talbot wrote:
>OK, this topic I really need to get in on.
>I have spent the last few weeks benchmarking my new 1.2TB, 6-disk,
>RAID6 array. I wanted real numbers, not "This FS is faster because..."
>I have moved over 100TB of data on my new array running the benchmark
>testing. I have yet to have any major problems with ReiserFS, EXT2/3,
>JFS, or XFS. I have done extensive testing on all, including just
>trying to break the file system with billions of 1KB files, or a 1TB
>file. I was able to cause some problems with EXT3 and ReiserFS with the
>1KB and 1TB tests, respectively, but both were fixed with a fsck. My
>basic test is to move all data from my old server to my new server
>(whitequeen2) and clock the transfer time. Whitequeen2 has very little
>storage; the NAS's 1.2TB of storage is attached via iSCSI and a
>crossover cable to the back of whitequeen2. The data is 100GB of users'
>files (1KB~2MB), 50GB of MP3's (1MB~5MB), and the rest is movies and
>system backups (600MB~2GB). Here is a copy of my current data sheet,
>including specs on the servers and copy times; my numbers are not
>perfect, but they should give you a clue about speeds... XFS wins.
>
>
In many (most?) cases I'm a lot more concerned about filesystem
stability than performance. That is, I want the fastest <reliable>
filesystem. With ext2 and ext3 I've run multiple multi-TB machines
spread over four time zones, and not had a f/s problem updating ~1TB/day.
>The computer: whitequeen2
>AMD Athlon64 3200 (2.0GHz)
>1GB Corsair DDR 400 (2X 512MB's running in dual DDR mode)
>Foxconn 6150K8MA-8EKRS motherboard
>Off brand case/power supply
>2X os disks, software raid array, RAID 1, Maxtor 51369U3, FW DA620CQ0
>Intel pro/1000 NIC
>CentOS 4.3 X86_64 2.6.9
> Main app server, Apache, Samba, NFS, NIS
>
>The computer: nas
>AMD Athlon64 3000 (1.8GHz)
>256MB Corsair DDR 400 (2X 128MB's running in dual DDR mode)
>Foxconn 6150K8MA-8EKRS motherboard
>Off brand case/power supply and drive cages
>2X os disks, software raid array, RAID 1, Maxtor 51369U3, FW DA620CQ0
>6X software raid array, RAID 6, Maxtor 7V300F0, FW VA111900
>Gentoo linux. X86_64 2.6.16-gentoo-r9
> System built very light, only built as an iSCSI-based NAS.
>
>EXT3
>Config=APP+NFS-->NAS+iSCSI
>RAID6 64K chunk
>[root@whitequeen2 tmp]# time tar cf - . | (cd /data ; tar xf - )
>real 371m29.802s
>user 1m28.492s
>sys 46m48.947s
>/dev/sdb1 1.1T 371G 674G 36% /data
>6.192 hours @ 61,262M/hour or 1021M/min or 17.02M/sec
>
>
>EXT2
>Config=APP+NFS-->NAS+iSCSI
>RAID6 64K chunk
>[root@whitequeen2 tmp]# time tar cf - . | ( cd /data/ ; tar xf - )
>real 401m48.702s
>user 1m25.599s
>sys 30m22.620s
>/dev/sdb1 1.1T 371G 674G 36% /data
>6.692 hours @ 56,684M/hour or 945M/min or 15.75M/sec
>
Did you tune the extN filesystems to the stripe size of the raid?
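(A sketch of what I mean, assuming 4K filesystem blocks on the 64K-chunk
array above, so stride = chunk / block = 64/4 = 16; older e2fsprogs
spell it -R stride=16:)
mke2fs -j -E stride=16 /dev/md0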
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-24 12:40 ` Justin Piszcz
@ 2006-06-26 0:06 ` Bill Davidsen
2006-06-26 8:06 ` Justin Piszcz
0 siblings, 1 reply; 47+ messages in thread
From: Bill Davidsen @ 2006-06-26 0:06 UTC (permalink / raw)
To: Justin Piszcz; +Cc: Neil Brown, Francois Barre, linux-raid
Justin Piszcz wrote:
>
> On Sat, 24 Jun 2006, Neil Brown wrote:
>
>> On Friday June 23, francois.barre@gmail.com wrote:
>>
>>>> The problem is that there is no cost effective backup available.
>>>
>>>
>>> One-liner questions :
>>> - How does Google make backups ?
>>
>>
>> No, Google ARE the backups :-)
>>
>>> - Aren't tapes dead yet ?
>>
>>
>> LTO-3 does 300Gig, and LTO-4 is planned.
>> They may not cope with tera-byte arrays in one hit, but they still
>> have real value.
>>
>>> - What about a NUMA principle applied to storage ?
>>
>>
>> You mean an Hierarchical Storage Manager? Yep, they exist. I'm sure
>> SGI, EMC and assorted other TLAs could sell you one.
>>
>
> LTO3 is 400GB native and we've seen very good compression, so
> 800GB-1TB per tape.
The problem is that in small-business use, LTO3 is costly in the 1-10TB
range, and takes a lot of media changes as well. A TB of RAID-5 is
~$500, and at that small size the cost of drives and media is
disproportionately high. Using more drives is cost-effective, but they
are not good for long-term off-site storage, because they're large and
fragile.
No obvious solutions in that price and application range that I see.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-25 23:57 ` Bill Davidsen
@ 2006-06-26 0:42 ` Adam Talbot
2006-06-26 14:03 ` Bill Davidsen
0 siblings, 1 reply; 47+ messages in thread
From: Adam Talbot @ 2006-06-26 0:42 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Neil Brown, Francois Barre, linux-raid
Not exactly sure how to tune for stripe size.
What would you advise?
-Adam
Bill Davidsen wrote:
> Adam Talbot wrote:
>
>> OK, this topic I really need to get in on.
>> I have spent the last few weeks benchmarking my new 1.2TB, 6 disk, RAID6
>> array. I wanted real numbers, not "This FS is faster because..." I have
>> moved over 100TB of data on my new array running the benchmark
>> testing. I have yet to have any major problems with ReiserFS, EXT2/3,
>> JFS, or XFS. I have done extensive testing on all, including just
>> trying to break the file system with billions of 1k files, or a 1TB
>> file. I was able to cause some problems with EXT3 and ReiserFS with the 1KB
>> and 1TB tests, respectively, but both were fixed with an fsck. My basic
>> test is to move all data from my old server to my new server
>> (whitequeen2) and clock the transfer time. Whitequeen2 has very little
>> storage. The NAS's 1.2TB of storage is attached via iSCSI and a
>> crossover cable to the back of whitequeen2. The data is 100GB of users'
>> files (1KB~2MB), 50GB of MP3s (1MB~5MB) and the rest is movies and
>> system backups 600MB~2GB. Here is a copy of my current data sheet,
>> including specs on the servers and copy times; my numbers are not
>> perfect, but they should give you a clue about speeds... XFS wins.
>>
>>
>
> In many (most?) cases I'm a lot more concerned about filesystem
> stability than performance. That is, I want the fastest <reliable>
> filesystem. With ext2 and ext3 I've run multiple multi-TB machines
> spread over four time zones, and not had a f/s problem updating ~1TB/day.
>
>> The computer: whitequeen2
>> AMD Athlon64 3200 (2.0GHz)
>> 1GB Corsair DDR 400 (2X 512MB's running in dual DDR mode)
>> Foxconn 6150K8MA-8EKRS motherboard
>> Off brand case/power supply
>> 2X os disks, software raid array, RAID 1, Maxtor 51369U3, FW DA620CQ0
>> Intel pro/1000 NIC
>> CentOS 4.3 X86_64 2.6.9
>> Main app server, Apache, Samba, NFS, NIS
>>
>> The computer: nas
>> AMD Athlon64 3000 (1.8GHz)
>> 256MB Corsair DDR 400 (2X 128MB's running in dual DDR mode)
>> Foxconn 6150K8MA-8EKRS motherboard
>> Off brand case/power supply and drive cages
>> 2X os disks, software raid array, RAID 1, Maxtor 51369U3, FW DA620CQ0
>> 6X software raid array, RAID 6, Maxtor 7V300F0, FW VA111900
>> Gentoo linux. X86_64 2.6.16-gentoo-r9
>> System built very lite, only built as an iSCSI based NAS.
>>
>> EXT3
>> Config=APP+NFS-->NAS+iSCSI
>> RAID6 64K chunk
>> [root@whitequeen2 tmp]# time tar cf - . | (cd /data ; tar xf - )
>> real 371m29.802s
>> user 1m28.492s
>> sys 46m48.947s
>> /dev/sdb1 1.1T 371G 674G 36% /data
>> 6.192 hours @ 61,262M/hour or 1021M/min or 17.02M/sec
>>
>>
>> EXT2
>> Config=APP+NFS-->NAS+iSCSI
>> RAID6 64K chunk
>> [root@whitequeen2 tmp]# time tar cf - . | ( cd /data/ ; tar xf - )
>> real 401m48.702s
>> user 1m25.599s
>> sys 30m22.620s
>> /dev/sdb1 1.1T 371G 674G 36% /data
>> 6.692 hours @ 56,684M/hour or 945M/min or 15.75M/sec
>>
> Did you tune the extN filesystems to the stripe size of the raid?
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-26 0:06 ` Bill Davidsen
@ 2006-06-26 8:06 ` Justin Piszcz
0 siblings, 0 replies; 47+ messages in thread
From: Justin Piszcz @ 2006-06-26 8:06 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Neil Brown, Francois Barre, linux-raid
On Sun, 25 Jun 2006, Bill Davidsen wrote:
> Justin Piszcz wrote:
>
>>
>> On Sat, 24 Jun 2006, Neil Brown wrote:
>>
>>> On Friday June 23, francois.barre@gmail.com wrote:
>>>
>>>>> The problem is that there is no cost effective backup available.
>>>>
>>>>
>>>> One-liner questions :
>>>> - How does Google make backups ?
>>>
>>>
>>> No, Google ARE the backups :-)
>>>
>>>> - Aren't tapes dead yet ?
>>>
>>>
>>> LTO-3 does 300Gig, and LTO-4 is planned.
>>> They may not cope with tera-byte arrays in one hit, but they still
>>> have real value.
>>>
>>>> - What about a NUMA principle applied to storage ?
>>>
>>>
>>> You mean an Hierarchical Storage Manager? Yep, they exist. I'm sure
>>> SGI, EMC and assorted other TLAs could sell you one.
>>>
>>
>> LTO3 is 400GB native and we've seen very good compression, so 800GB-1TB per
>> tape.
>
> The problem is that in small business use, LTO3 is costly in the 1-10TB range, and
> takes a lot of media changes as well. A TB of RAID-5 is ~$500, and at that
> small size the cost of drives and media is disproportionately high. Using more
> drives is cost effective, but they are not good for long-term off-site
> storage, because they're large and fragile.
>
> No obvious solutions in that price and application range that I see.
>
> --
> bill davidsen <davidsen@tmr.com>
> CTO TMR Associates, Inc
> Doing interesting things with small computers since 1979
>
In the 1-10TB range you are probably correct; as the numbers increase,
however, many LTO2/LTO3 drives + robotics become justifiable.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid and XFS or two small ones and EXT3?
2006-06-26 0:42 ` Adam Talbot
@ 2006-06-26 14:03 ` Bill Davidsen
0 siblings, 0 replies; 47+ messages in thread
From: Bill Davidsen @ 2006-06-26 14:03 UTC (permalink / raw)
To: Adam Talbot; +Cc: Neil Brown, Francois Barre, linux-raid
Adam Talbot wrote:
>Not exactly sure how to tune for stripe size.
>What would you advise?
>-Adam
>
>
See the -R option of mke2fs. I don't have a number for the performance
impact of this, but I bet someone else on the list will. Depending on
what posts you read, reports range from "measurable" to "significant,"
without quantifying.
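To make that concrete, here is a minimal sketch - the device name is an
assumption, and the stride value assumes Adam's 64K chunk with 4K
filesystem blocks (stride = chunk size / block size = 64/4 = 16):

  # ext3 on an md array, with mke2fs told about the raid geometry
  mke2fs -j -b 4096 -R stride=16 /dev/md0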
Note, next month I will set up either a 2x750 RAID-1 or 4x250 RAID-5
array, and if I go with RAID-5 I will have the chance to run some metrics
before putting the hardware into production service. I'll report on the
-R option if I have any data.
>
>Bill Davidsen wrote:
>
>
>>Adam Talbot wrote:
>>
>>
>>
>>>OK, this topic I really need to get in on.
>>>I have spent the last few weeks benchmarking my new 1.2TB, 6 disk, RAID6
>>>array. I wanted real numbers, not "This FS is faster because..." I have
>>>moved over 100TB of data on my new array running the benchmark
>>>testing. I have yet to have any major problems with ReiserFS, EXT2/3,
>>>JFS, or XFS. I have done extensive testing on all, including just
>>>trying to break the file system with billions of 1k files, or a 1TB
>>>file. I was able to cause some problems with EXT3 and ReiserFS with the 1KB
>>>and 1TB tests, respectively, but both were fixed with an fsck. My basic
>>>test is to move all data from my old server to my new server
>>>(whitequeen2) and clock the transfer time. Whitequeen2 has very little
>>>storage. The NAS's 1.2TB of storage is attached via iSCSI and a
>>>crossover cable to the back of whitequeen2. The data is 100GB of users'
>>>files (1KB~2MB), 50GB of MP3s (1MB~5MB) and the rest is movies and
>>>system backups 600MB~2GB. Here is a copy of my current data sheet,
>>>including specs on the servers and copy times; my numbers are not
>>>perfect, but they should give you a clue about speeds... XFS wins.
>>>
>>>
>>>
>>>
>>In many (most?) cases I'm a lot more concerned about filesystem
>>stability than performance. That is, I want the fastest <reliable>
>>filesystem. With ext2 and ext3 I've run multiple multi-TB machines
>>spread over four time zones, and not had a f/s problem updating ~1TB/day.
>>
>>
>>
>>Did you tune the extN filesystems to the stripe size of the raid?
>>
>>
>>
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: Large single raid... - XFS over NFS woes
2006-06-23 12:50 ` Chris Allen
` (2 preceding siblings ...)
2006-06-23 14:01 ` Al Boldi
@ 2006-06-27 12:05 ` Dexter Filmore
3 siblings, 0 replies; 47+ messages in thread
From: Dexter Filmore @ 2006-06-27 12:05 UTC (permalink / raw)
To: Chris Allen, linux-raid
On Friday, 23 June 2006 14:50, you wrote:
> Strange that whatever the filesystem, you get equal numbers of people
> saying that they have never lost a single byte and people who have had
> horrible corruption and would never touch it again. We stopped using
> XFS about a year ago because we were getting kernel stack space panics
> under heavy load over NFS. It looks like the time has come to give it
> another try.
I'd tread on XFS land cautiously - while I always favored XFS over Reiser
(which had way too many issues in its stable releases for my liking), it has
some drawbacks. First, you cannot shrink it, so LVM becomes kinda pointless.
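(For what it's worth, it can at least be grown online after extending the
underlying volume - a sketch, assuming a mount point of /data:

  xfs_growfs /data

It's only the shrink direction that's missing.)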
But especially with NFS I ran into trouble myself.
Copying large amounts of data sometimes stalls and has eventually locked the
machine.
Plus, recently I had some weird filesystem corruption, like /root getting lost
or similar. Running 2.6.14.1 and NFSv3.
If performance is not top priority, stick to ext3 and create 2 partitions or
volume groups.
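Roughly like this - a sketch only, with made-up device names, assuming each
of the 16 drives carries two equal partitions:

  # two independent RAID5 arrays, one partition per drive in each
  mdadm --create /dev/md0 --level=5 --raid-devices=16 /dev/sd[a-p]1
  mdadm --create /dev/md1 --level=5 --raid-devices=16 /dev/sd[a-p]2
  mke2fs -j /dev/md0    # -j for an ext3 journal
  mke2fs -j /dev/md1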
My 0.02$
Dex
P.S.: How about JFS? I don't know whether it can resize or how stable it is,
but I can't remember hearing more or fewer ups and downs about it than about
any other journaling fs.
--
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d--(+)@ s-:+ a- C++++ UL++ P+>++ L+++>++++ E-- W++ N o? K-
w--(---) !O M+ V- PS+ PE Y++ PGP t++(---)@ 5 X+(++) R+(++) tv--(+)@
b++(+++) DI+++ D- G++ e* h>++ r* y?
------END GEEK CODE BLOCK------
http://www.stop1984.com
http://www.againsttcpa.com
^ permalink raw reply [flat|nested] 47+ messages in thread
* I need a PCI V2.1 4 port SATA card
2006-06-25 13:06 ` Joshua Baker-LePain
@ 2006-06-28 3:45 ` Guy
2006-06-28 4:29 ` Brad Campbell
0 siblings, 1 reply; 47+ messages in thread
From: Guy @ 2006-06-28 3:45 UTC (permalink / raw)
To: linux-raid
Hello group,
I am upgrading my disks from old 18 Gig SCSI disks to 300 Gig SATA
disks. I need a good SATA controller. My system is old and has PCI V2.1.
I need a 4-port card, or two 2-port cards. My system has multiple PCI buses,
so 2 cards may give me better performance, but I don't need it. I will be
using software RAID. Can anyone recommend a card that is supported by the
current kernel?
I know this is the wrong group, sorry. But I know this is a very
good place to ask! I did search the archives but don't seem to have the
correct keywords to find what I want.
Btw, I plan to buy 3 or 4 Seagate ST3320620AS disks. Barracuda
7200.10 SATA 320G.
Thanks,
Guy
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: I need a PCI V2.1 4 port SATA card
2006-06-28 3:45 ` I need a PCI V2.1 4 port SATA card Guy
@ 2006-06-28 4:29 ` Brad Campbell
2006-06-28 10:20 ` Justin Piszcz
` (2 more replies)
0 siblings, 3 replies; 47+ messages in thread
From: Brad Campbell @ 2006-06-28 4:29 UTC (permalink / raw)
To: Guy; +Cc: linux-raid
Guy wrote:
> Hello group,
>
> I am upgrading my disks from old 18 Gig SCSI disks to 300 Gig SATA
> disks. I need a good SATA controller. My system is old and has PCI V2.1.
> I need a 4-port card, or two 2-port cards. My system has multiple PCI buses,
> so 2 cards may give me better performance, but I don't need it. I will be
> using software RAID. Can anyone recommend a card that is supported by the
> current kernel?
I'm using Promise SATA150TX4 cards here in old PCI based systems. They work
great and have been rock solid for well in excess of a year of 24/7 hard use.
I have 3 in one box and 4 in another.
I'm actually looking at building another 15 disk server now and was hoping to move to something
quicker using _almost_ commodity hardware.
My current 15 drive RAID-6 server is built around a KT600 board with an AMD Sempron processor and 4
SATA150TX4 cards. It does the job but it's not the fastest thing around (takes about 10 hours to do
a check of the array or about 15 to do a rebuild).
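In case anyone wants to reproduce those timings: on recent 2.6 kernels a
check can be kicked off by hand - a sketch, assuming the array is md0:

  echo check > /sys/block/md0/md/sync_action
  cat /proc/mdstat    # shows progress and an ETA while it runs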
I'd love to do something similar with PCI-E or PCI-X and make it go faster
(the PCI bus bandwidth is the killer). However, I've not seen many affordable
PCI-E multi-port cards that are supported yet, and PCI-X seems to mean moving
to "server" class mainboards and the other expenses that come along with
that.
Brad
--
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: I need a PCI V2.1 4 port SATA card
2006-06-28 4:29 ` Brad Campbell
@ 2006-06-28 10:20 ` Justin Piszcz
2006-06-28 11:55 ` Christian Pernegger
2006-06-28 12:12 ` Petr Vyskocil
2 siblings, 0 replies; 47+ messages in thread
From: Justin Piszcz @ 2006-06-28 10:20 UTC (permalink / raw)
To: Brad Campbell; +Cc: Guy, linux-raid
On Wed, 28 Jun 2006, Brad Campbell wrote:
> Guy wrote:
>> Hello group,
>>
>> I am upgrading my disks from old 18 Gig SCSI disks to 300 Gig SATA
>> disks. I need a good SATA controller. My system is old and has PCI V2.1.
>> I need a 4-port card, or two 2-port cards. My system has multiple PCI buses,
>> so 2 cards may give me better performance, but I don't need it. I will be
>> using software RAID. Can anyone recommend a card that is supported by the
>> current kernel?
>
> I'm using Promise SATA150TX4 cards here in old PCI based systems. They work
> great and have been rock solid for well in excess of a year of 24/7 hard use.
> I have 3 in one box and 4 in another.
>
> I'm actually looking at building another 15 disk server now and was hoping to
> move to something quicker using _almost_ commodity hardware.
>
> My current 15 drive RAID-6 server is built around a KT600 board with an AMD
> Sempron processor and 4 SATA150TX4 cards. It does the job but it's not the
> fastest thing around (takes about 10 hours to do a check of the array or
> about 15 to do a rebuild).
>
> I'd love to do something similar with PCI-E or PCI-X and make it go faster
> (the PCI bus bandwidth is the killer). However, I've not seen many affordable
> PCI-E multi-port cards that are supported yet, and PCI-X seems to mean moving
> to "server" class mainboards and the other expenses that come along with
> that.
>
> Brad
> --
> "Human beings, who are almost unique in having the ability
> to learn from the experience of others, are also remarkable
> for their apparent disinclination to do so." -- Douglas Adams
That is the problem: the only 4-port cards are PCI, not PCI-e, and thus
limit your speed and bandwidth. The only alternative I see is an Areca card
if you want speed.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: I need a PCI V2.1 4 port SATA card
2006-06-28 4:29 ` Brad Campbell
2006-06-28 10:20 ` Justin Piszcz
@ 2006-06-28 11:55 ` Christian Pernegger
2006-06-28 11:59 ` Gordon Henderson
2006-06-28 19:38 ` Justin Piszcz
2006-06-28 12:12 ` Petr Vyskocil
2 siblings, 2 replies; 47+ messages in thread
From: Christian Pernegger @ 2006-06-28 11:55 UTC (permalink / raw)
To: Brad Campbell; +Cc: linux-raid
> My current 15 drive RAID-6 server is built around a KT600 board with an AMD Sempron
> processor and 4 SATA150TX4 cards. It does the job but it's not the fastest thing around
> (takes about 10 hours to do a check of the array or about 15 to do a rebuild).
What kind of enclosure do you have this in?
I also subscribe to the "almost commodity hardware" philosophy,
however I've not been able to find a case that comfortably takes even
8 drives. (The Stacker is an absolute nightmare ...) Even most
rackable cases stop at 6 3.5" drive bays -- either that or they are
dedicated storage racks with integrated hw RAID and fiber SCSI
interconnect --> definitely not commodity.
Thanks,
C.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: I need a PCI V2.1 4 port SATA card
2006-06-28 11:55 ` Christian Pernegger
@ 2006-06-28 11:59 ` Gordon Henderson
2006-06-29 18:45 ` Bill Davidsen
2006-06-28 19:38 ` Justin Piszcz
1 sibling, 1 reply; 47+ messages in thread
From: Gordon Henderson @ 2006-06-28 11:59 UTC (permalink / raw)
To: Christian Pernegger; +Cc: linux-raid
On Wed, 28 Jun 2006, Christian Pernegger wrote:
> I also subscribe to the "almost commodity hardware" philosophy,
> however I've not been able to find a case that comfortably takes even
> 8 drives. (The Stacker is an absolute nightmare ...) Even most
> rackable cases stop at 6 3.5" drive bays -- either that or they are
> dedicated storage racks with integrated hw RAID and fiber SCSI
> interconnect --> definitely not commodity.
I've used these:
http://www.acme-technology.co.uk/acm338.htm
(8 drives in a 3U case), and their variants
eg:
http://www.acme-technology.co.uk/acm312.htm
(12 disks in a 3U case)
for several years with good results. Not the cheapest on the block though,
but never had any real issues with them.
Gordon
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: I need a PCI V2.1 4 port SATA card
2006-06-28 4:29 ` Brad Campbell
2006-06-28 10:20 ` Justin Piszcz
2006-06-28 11:55 ` Christian Pernegger
@ 2006-06-28 12:12 ` Petr Vyskocil
2 siblings, 0 replies; 47+ messages in thread
From: Petr Vyskocil @ 2006-06-28 12:12 UTC (permalink / raw)
To: linux-raid
Brad Campbell wrote:
> I'd love to do something similar with PCI-E or PCI-X and make it go
> faster (the PCI bus bandwidth is the killer). However, I've not seen
> many affordable PCI-E multi-port cards that are supported yet, and
> PCI-X seems to mean moving to "server" class mainboards and the other
> expenses that come along with that.
Recently I was looking for a budget solution to exactly this problem,
and the best I found was to use a 2-port SiI 3132 based PCI-E 1x card
combined with a 1:5 SATA splitter based on the SiI 3726 (e.g.
http://fwdepot.com/thestore/product_info.php/products_id/1245).
Unfortunately I didn't find anyone selling the splitter here in Czechia,
so I went with a 4-port SiI PCI card, which is performing well and is
stable, but of course quite slow.
Some tests I googled up at that time suggested that this combo can get
about 220MB/s of bandwidth through in real life (the test was on Win32
though), so at today's drive speeds you can connect ~4-5 drives to one
PCI-E 1x link without bus bandwidth becoming the limiting factor.
Anyway, for really budget machines I can recommend the PCI SiI 3124
based cards; the in-kernel driver is working rock-stable for me. My only
grudge is that the driver doesn't notice if you disconnect a drive from its
SATA connector, i.e. when you do that, the computer will freeze trying to
write to the disconnected drive. After ~3 minutes it times out and md kicks
the drive out of the array, though.
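Once md has kicked it, the usual recovery steps apply - a sketch, assuming
the member was /dev/sdc1 in /dev/md0 and the drive has come back:

  cat /proc/mdstat                   # the member shows as failed (F)
  mdadm /dev/md0 --remove /dev/sdc1  # remove the faulty member
  mdadm /dev/md0 --add /dev/sdc1     # re-add it; a rebuild starts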
If someone has any experience to share about the SiI 3132+3726 under Linux,
I'll be happy to hear about it. According to
http://linux-ata.org/software-status.html#pmp it should work; the question
is how stable it is, since it is a recent development.
Petr
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: I need a PCI V2.1 4 port SATA card
2006-06-28 11:55 ` Christian Pernegger
2006-06-28 11:59 ` Gordon Henderson
@ 2006-06-28 19:38 ` Justin Piszcz
1 sibling, 0 replies; 47+ messages in thread
From: Justin Piszcz @ 2006-06-28 19:38 UTC (permalink / raw)
To: Christian Pernegger; +Cc: Brad Campbell, linux-raid
On Wed, 28 Jun 2006, Christian Pernegger wrote:
>> My current 15 drive RAID-6 server is built around a KT600 board with an AMD
>> Sempron
>> processor and 4 SATA150TX4 cards. It does the job but it's not the fastest
>> thing around
>> (takes about 10 hours to do a check of the array or about 15 to do a
>> rebuild).
>
> What kind of enclosure do you have this in?
>
> I also subscribe to the "almost commodity hardware" philosophy,
> however I've not been able to find a case that comfortably takes even
> 8 drives. (The Stacker is an absolute nightmare ...) Even most
> rackable cases stop at 6 3.5" drive bays -- either that or they are
> dedicated storage racks with integrated hw RAID and fiber SCSI
> interconnect --> definitely not commodity.
>
> Thanks,
>
> C.
For the case, there are a number of cases (Lian Li) that fit 20 drives with
ease; check here:
http://www.newegg.com/Product/Product.asp?item=N82E16811112062
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: I need a PCI V2.1 4 port SATA card
2006-06-28 11:59 ` Gordon Henderson
@ 2006-06-29 18:45 ` Bill Davidsen
0 siblings, 0 replies; 47+ messages in thread
From: Bill Davidsen @ 2006-06-29 18:45 UTC (permalink / raw)
To: Gordon Henderson; +Cc: Christian Pernegger, linux-raid
Gordon Henderson wrote:
>On Wed, 28 Jun 2006, Christian Pernegger wrote:
>
>
>
>>I also subscribe to the "almost commodity hardware" philosophy,
>>however I've not been able to find a case that comfortably takes even
>>8 drives. (The Stacker is an absolute nightmare ...) Even most
>>rackable cases stop at 6 3.5" drive bays -- either that or they are
>>dedicated storage racks with integrated hw RAID and fiber SCSI
>>interconnect --> definitely not commodity.
>>
>>
>
>I've used these:
>
> http://www.acme-technology.co.uk/acm338.htm
>
>(8 drives in a 3U case), and their variants
>
>eg:
>
> http://www.acme-technology.co.uk/acm312.htm
>
>(12 disks in a 3U case)
>
>
Interesting ad, with a masonic emblem, and a picture of a white case
with a note saying it's only available in black. Of course the hardware
may be perfectly fine, but I wouldn't count on color.
>for several years with good results. Not the cheapest on the block though,
>but never had any real issues with them.
>
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
^ permalink raw reply [flat|nested] 47+ messages in thread
end of thread
Thread overview: 47+ messages -- links below jump to the message on this page --
2006-06-22 19:11 Large single raid and XFS or two small ones and EXT3? Chris Allen
2006-06-22 19:16 ` Gordon Henderson
2006-06-22 19:23 ` H. Peter Anvin
2006-06-22 19:58 ` Chris Allen
2006-06-22 20:00 ` Chris Allen
2006-06-23 8:59 ` PFC
2006-06-23 9:26 ` Francois Barre
2006-06-23 12:50 ` Chris Allen
2006-06-23 13:14 ` Gordon Henderson
2006-06-23 13:30 ` Francois Barre
2006-06-23 14:46 ` Martin Schröder
2006-06-23 14:59 ` Francois Barre
2006-06-23 15:13 ` Bill Davidsen
2006-06-23 15:34 ` Francois Barre
2006-06-23 19:49 ` Nix
2006-06-24 5:19 ` Neil Brown
2006-06-24 7:59 ` Adam Talbot
2006-06-24 9:34 ` David Greaves
2006-06-24 22:52 ` Adam Talbot
2006-06-25 13:06 ` Joshua Baker-LePain
2006-06-28 3:45 ` I need a PCI V2.1 4 port SATA card Guy
2006-06-28 4:29 ` Brad Campbell
2006-06-28 10:20 ` Justin Piszcz
2006-06-28 11:55 ` Christian Pernegger
2006-06-28 11:59 ` Gordon Henderson
2006-06-29 18:45 ` Bill Davidsen
2006-06-28 19:38 ` Justin Piszcz
2006-06-28 12:12 ` Petr Vyskocil
2006-06-25 14:51 ` Large single raid and XFS or two small ones and EXT3? Adam Talbot
2006-06-25 20:35 ` Chris Allen
2006-06-25 23:57 ` Bill Davidsen
2006-06-26 0:42 ` Adam Talbot
2006-06-26 14:03 ` Bill Davidsen
2006-06-24 12:40 ` Justin Piszcz
2006-06-26 0:06 ` Bill Davidsen
2006-06-26 8:06 ` Justin Piszcz
2006-06-23 15:17 ` Chris Allen
2006-06-23 14:01 ` Al Boldi
2006-06-23 16:06 ` Andreas Dilger
2006-06-23 16:41 ` Christian Pedaschus
2006-06-23 16:46 ` Christian Pedaschus
2006-06-23 19:53 ` Nix
2006-06-23 16:21 ` Russell Cattelan
2006-06-23 18:19 ` Tom Vier
2006-06-27 12:05 ` Large single raid... - XFS over NFS woes Dexter Filmore
2006-06-23 19:48 ` Large single raid and XFS or two small ones and EXT3? Nix
2006-06-25 19:13 ` David Rees