* Performance problems with millions of inodes
@ 2008-06-25 14:43 Christoph Litauer
2008-06-25 14:46 ` Christoph Litauer
2008-06-25 23:12 ` Dave Chinner
0 siblings, 2 replies; 7+ messages in thread
From: Christoph Litauer @ 2008-06-25 14:43 UTC (permalink / raw)
To: xfs
Hi,
sorry if this has been asked before, I am new to this mailing list. I
didn't find any hints in the FAQ or by googling ...
I have a backup server driving two kinds of backup software: bacula and
backuppc. bacula saves its backups on raid1, backuppc on raid2
(different hardware, but both fast hardware raids).
I have massive performance problems with backuppc which I tracked down
to performance problems of the filesystem on raid2 (I think so). The
main difference between the two backup systems is that backuppc uses
millions of inodes for its backup (in fact it duplicates the directory
structure of the backup client).
raid1 consists of 91675 inodes, raid2 of 143646439. The filesystems were
created without any options. raid1 is about 7 TB, raid2 about 10TB. Both
filesystems are mounted with options
'(rw,noatime,nodiratime,ihashsize=65536)'.
I used bonnie++ to benchmark both filesystems. Here are the results of
'bonnie++ -u root -f -n 10:0:0:1000':
raid1:
-------------------
Sequential Output: 82505 K/sec
Sequential Input : 102192 K/sec
Sequential file creation: 7184/sec
Random file creation : 17277/sec
raid2:
-------------------
Sequential Output: 124802 K/sec
Sequential Input : 109158 K/sec
Sequential file creation: 123/sec
Random file creation : 138/sec
As you can see, raid2's throughput is higher than raid1's. But the file
creation rates are rather low ...
Maybe the 143 million inodes cause this effect? Any idea how to avoid it?
--
Regards
Christoph
________________________________________________________________________
Christoph Litauer litauer@uni-koblenz.de
Uni Koblenz, Computing Center, http://www.uni-koblenz.de/~litauer
Postfach 201602, 56016 Koblenz Fon: +49 261 287-1311, Fax: -100 1311
PGP-Fingerprint: F39C E314 2650 650D 8092 9514 3A56 FBD8 79E3 27B2
* Re: Performance problems with millions of inodes
2008-06-25 14:43 Performance problems with millions of inodes Christoph Litauer
@ 2008-06-25 14:46 ` Christoph Litauer
2008-06-25 16:02 ` Emmanuel Florac
2008-06-25 23:12 ` Dave Chinner
1 sibling, 1 reply; 7+ messages in thread
From: Christoph Litauer @ 2008-06-25 14:46 UTC (permalink / raw)
To: xfs
Christoph Litauer wrote:
> Hi,
>
> sorry if this has been asked before, I am new to this mailing list. I
> didn't find any hints in the FAQ or by googling ...
>
> I have a backup server driving two kinds of backup software: bacula and
> backuppc. bacula saves its backups on raid1, backuppc on raid2
> (different hardware, but both fast hardware raids).
> I have massive performance problems with backuppc which I tracked down
> to performance problems of the filesystem on raid2 (I think so). The
> main difference between the two backup systems is that backuppc uses
> millions of inodes for its backup (in fact it duplicates the directory
> structure of the backup client).
>
> raid1 consists of 91675 inodes, raid2 of 143646439. The filesystems were
> created without any options. raid1 is about 7 TB, raid2 about 10TB. Both
> filesystems are mounted with options
> '(rw,noatime,nodiratime,ihashsize=65536)'.
>
> I used bonnie++ to benchmark both filesystems. Here are the results of
> 'bonnie++ -u root -f -n 10:0:0:1000':
>
> raid1:
> -------------------
> Sequential Output: 82505 K/sec
> Sequential Input : 102192 K/sec
> Sequential file creation: 7184/sec
> Random file creation : 17277/sec
>
> raid2:
> -------------------
> Sequential Output: 124802 K/sec
> Sequential Input : 109158 K/sec
> Sequential file creation: 123/sec
> Random file creation : 138/sec
>
> As you can see, raid2's throughput is higher than raid1's. But the file
> creation rates are rather low ...
>
> Maybe the 143 million inodes cause this effect? Any idea how to avoid it?
>
Just another (xfs_)info about raid2:
meta-data=/dev/backuppc/backuppc isize=256    agcount=32, agsize=79691776 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=2550136832, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0
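[Editor's note: a possible way to see how those 143M inodes are spread across the 32 allocation groups is to read each AG's AGI header with xfs_db. A sketch, assuming the device path from the xfs_info output above and the xfs_db command names of that era:]

```shell
# Print the allocated-inode count stored in each AG's AGI header.
# -r opens the device read-only; on a mounted filesystem the numbers
# may be slightly stale but are safe to read.
for ag in $(seq 0 31); do
    echo -n "AG $ag: "
    xfs_db -r -c "agi $ag" -c "print count" /dev/backuppc/backuppc
done
```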
--
Regards
Christoph
* Re: Performance problems with millions of inodes
2008-06-25 14:46 ` Christoph Litauer
@ 2008-06-25 16:02 ` Emmanuel Florac
2008-06-25 17:00 ` Mark
2008-06-26 11:29 ` Christoph Litauer
0 siblings, 2 replies; 7+ messages in thread
From: Emmanuel Florac @ 2008-06-25 16:02 UTC (permalink / raw)
To: Christoph Litauer; +Cc: xfs
On Wed, 25 Jun 2008 16:46:30 +0200,
Christoph Litauer <litauer@uni-koblenz.de> wrote:
> >
> > Maybe the 143 million inodes cause this effect? Any idea how to
> > avoid it?
Maybe you should try adding the "nobarrier" mount option on the
slowest machine. Barriers can slow down operations tremendously...
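[Editor's note: for illustration, such a remount might look like the following; the mount point is hypothetical, the device name taken from the xfs_info posted earlier:]

```shell
# Remount with write barriers disabled. CAUTION: with volatile disk
# write caches and no barriers, a power loss can corrupt the journal;
# this is only reasonably safe behind a battery-backed RAID cache.
mount -o remount,nobarrier /dev/backuppc/backuppc /backuppc
```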
--
----------------------------------------
Emmanuel Florac | Intellique
----------------------------------------
* Re: Performance problems with millions of inodes
2008-06-25 16:02 ` Emmanuel Florac
@ 2008-06-25 17:00 ` Mark
2008-06-26 11:29 ` Christoph Litauer
1 sibling, 0 replies; 7+ messages in thread
From: Mark @ 2008-06-25 17:00 UTC (permalink / raw)
To: Christoph Litauer, Emmanuel Florac; +Cc: xfs
What type of RAID? If striping is involved, perhaps you should investigate the "su" and "sw" suboptions.
I also found some performance improvement with "-l lazy-count=1", although you may not wish to slow down repair times for a backup volume.
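[Editor's note: as a sketch of both suggestions combined; the stripe geometry below is made up, so substitute the array's real chunk size and data-disk count:]

```shell
# su = stripe unit (per-disk chunk size); sw = stripe width in data
# disks, e.g. a 9-disk RAID5 with 64 KiB chunks has sw=8.
# lazy-count=1 batches superblock counter updates to cut log traffic.
mkfs.xfs -d su=64k,sw=8 -l lazy-count=1 /dev/backuppc/backuppc
```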
--
Mark
"What better place to find oneself than
on the streets of one's home village?"
--Capt. Jean-Luc Picard, "Family"
* Re: Performance problems with millions of inodes
2008-06-25 16:02 ` Emmanuel Florac
2008-06-25 17:00 ` Mark
@ 2008-06-26 11:29 ` Christoph Litauer
1 sibling, 0 replies; 7+ messages in thread
From: Christoph Litauer @ 2008-06-26 11:29 UTC (permalink / raw)
To: Emmanuel Florac; +Cc: xfs
Emmanuel Florac wrote:
> On Wed, 25 Jun 2008 16:46:30 +0200,
> Christoph Litauer <litauer@uni-koblenz.de> wrote:
>
>>> Maybe the 143 million inodes cause this effect? Any idea how to
>>> avoid it?
>
> Maybe you should try adding the "nobarrier" mount option on the
> slowest machine. Barriers can slow down operations tremendously...
>
Thanks for this hint. nobarrier improved file creation performance
significantly (about 50 times).
--
Regards
Christoph
* Re: Performance problems with millions of inodes
2008-06-25 14:43 Performance problems with millions of inodes Christoph Litauer
2008-06-25 14:46 ` Christoph Litauer
@ 2008-06-25 23:12 ` Dave Chinner
2008-06-26 7:29 ` Christoph Litauer
1 sibling, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2008-06-25 23:12 UTC (permalink / raw)
To: Christoph Litauer; +Cc: xfs
On Wed, Jun 25, 2008 at 04:43:23PM +0200, Christoph Litauer wrote:
> Hi,
>
> sorry if this has been asked before, I am new to this mailing list. I
> didn't find any hints in the FAQ or by googling ...
>
> I have a backup server driving two kinds of backup software: bacula and
> backuppc. bacula saves its backups on raid1, backuppc on raid2
> (different hardware, but both fast hardware raids).
> I have massive performance problems with backuppc which I tracked down
> to performance problems of the filesystem on raid2 (I think so). The
> main difference between the two backup systems is that backuppc uses
> millions of inodes for its backup (in fact it duplicates the directory
> structure of the backup client).
>
> raid1 consists of 91675 inodes, raid2 of 143646439. The filesystems were
> created without any options. raid1 is about 7 TB, raid2 about 10TB. Both
> filesystems are mounted with options
> '(rw,noatime,nodiratime,ihashsize=65536)'.
>
> I used bonnie++ to benchmark both filesystems. Here are the results of
> 'bonnie++ -u root -f -n 10:0:0:1000':
>
> raid1:
> -------------------
> Sequential Output: 82505 K/sec
> Sequential Input : 102192 K/sec
> Sequential file creation: 7184/sec
> Random file creation : 17277/sec
>
> raid2:
> -------------------
> Sequential Output: 124802 K/sec
> Sequential Input : 109158 K/sec
> Sequential file creation: 123/sec
> Random file creation : 138/sec
>
> As you can see, raid2's throughput is higher than raid1's. But the file
> creation rates are rather low ...
>
> Maybe the 143 million inodes cause this effect?
Certainly will be. You've got about 3 AGs holding inodes, so
that's probably 35M+ inodes per AG. With the way allocation works,
it's probably doing a dual traversal of the AGI btree to find a free
inode "near" the parent, and that is consuming lots and lots of
CPU time.
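[Editor's note: this is not the kernel's actual btree code, but the cost pattern Dave describes can be sketched with a toy model: the fuller the neighbourhood around the parent inode, the farther an outward "nearest free" search must travel before it finds a free slot.]

```python
def allocs_scanned(free, parent):
    """Toy model of a 'nearest free inode to the parent' search:
    probe outward from the parent in both directions until a free
    slot is found. The returned radius+1 is a proxy for the work
    done; it grows linearly with the distance to the nearest hole."""
    n = len(free)
    for radius in range(n):
        for idx in (parent - radius, parent + radius):
            if 0 <= idx < n and free[idx]:
                return radius + 1
    return n  # no free slot at all

# Demo: parent inode sits at slot 0 in both cases.
sparse = [False, True] + [True] * 8        # a free neighbour right away
dense = [False] * 100_000 + [True]         # long run of allocated inodes
print(allocs_scanned(sparse, 0))           # tiny search
print(allocs_scanned(dense, 0))            # search spans ~100k slots
```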
> Any idea how to avoid it?
I had a prototype patch back when I was at SGI that stopped this
search once it reached a radius that was no longer "near". This
greatly reduced CPU time for allocation in AGs with large inode
counts, and hence create rates increased significantly.
[Mark - IIRC that patch was in the miscellaneous patch tarball I
left behind...]
The only other way of dealing with this is to use inode64 so that
inodes get spread across the entire filesystem instead of just a
few AGs at the start of the filesystem. It's too late to change the
existing inodes, but new inodes would get spread around....
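[Editor's note: inode64 is a mount option; for reference, an fstab line might look like the following (mount point hypothetical). Note that inode64 produces inode numbers above 2^32, which 32-bit userspace without large-file support may mishandle.]

```shell
# Hypothetical fstab entry; new inode allocations then spread
# across all AGs instead of clustering at the start of the device.
/dev/backuppc/backuppc  /backuppc  xfs  rw,noatime,nodiratime,inode64  0  0
```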
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Performance problems with millions of inodes
2008-06-25 23:12 ` Dave Chinner
@ 2008-06-26 7:29 ` Christoph Litauer
0 siblings, 0 replies; 7+ messages in thread
From: Christoph Litauer @ 2008-06-26 7:29 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Dave Chinner wrote:
> On Wed, Jun 25, 2008 at 04:43:23PM +0200, Christoph Litauer wrote:
>> Hi,
>>
>> sorry if this has been asked before, I am new to this mailing list. I
>> didn't find any hints in the FAQ or by googling ...
>>
>> I have a backup server driving two kinds of backup software: bacula and
>> backuppc. bacula saves its backups on raid1, backuppc on raid2
>> (different hardware, but both fast hardware raids).
>> I have massive performance problems with backuppc which I tracked down
>> to performance problems of the filesystem on raid2 (I think so). The
>> main difference between the two backup systems is that backuppc uses
>> millions of inodes for its backup (in fact it duplicates the directory
>> structure of the backup client).
>>
>> raid1 consists of 91675 inodes, raid2 of 143646439. The filesystems were
>> created without any options. raid1 is about 7 TB, raid2 about 10TB. Both
>> filesystems are mounted with options
>> '(rw,noatime,nodiratime,ihashsize=65536)'.
>>
>> I used bonnie++ to benchmark both filesystems. Here are the results of
>> 'bonnie++ -u root -f -n 10:0:0:1000':
>>
>> raid1:
>> -------------------
>> Sequential Output: 82505 K/sec
>> Sequential Input : 102192 K/sec
>> Sequential file creation: 7184/sec
>> Random file creation : 17277/sec
>>
>> raid2:
>> -------------------
>> Sequential Output: 124802 K/sec
>> Sequential Input : 109158 K/sec
>> Sequential file creation: 123/sec
>> Random file creation : 138/sec
>>
>> As you can see, raid2's throughput is higher than raid1's. But the file
>> creation rates are rather low ...
>>
>> Maybe the 143 million inodes cause this effect?
>
> Certainly will be. You've got about 3 AGs holding inodes, so
> that's probably 35M+ inodes per AG. With the way allocation works,
> it's probably doing a dual traversal of the AGI btree to find a free
> inode "near" the parent, and that is consuming lots and lots of
> CPU time.
So, would more AGs improve performance? As backuppc is still in a
testing phase (for me), it would be no problem to create a new xfs
filesystem with a "better" configuration. I expect the number of
inodes to grow considerably as I back up more clients and
filesystems. So, what configuration would you recommend?
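[Editor's note: the AG count can only be chosen at mkfs time; a rebuilt filesystem with more, smaller AGs would be created roughly as follows. The value 64 is illustrative, not a recommendation from the thread.]

```shell
# More AGs mean fewer inodes per AGI btree (shorter "near parent"
# searches), at some cost in per-AG overhead and free-space spread.
mkfs.xfs -d agcount=64 /dev/backuppc/backuppc
```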
>
>> Any idea how to avoid it?
>
> I had a prototype patch back when I was at SGI that stopped this
> search once it reached a radius that was no longer "near". This
> greatly reduced CPU time for allocation in AGs with large inode
> counts, and hence create rates increased significantly.
>
> [Mark - IIRC that patch was in the miscellaneous patch tarball I
> left behind...]
>
> The only other way of dealing with this is to use inode64 so that
> inodes get spread across the entire filesystem instead of just a
> few AGs at the start of the filesystem. It's too late to change the
> existing inodes, but new inodes would get spread around....
Unfortunately, my backup server is a 32-bit system ...
--
Regards
Christoph