public inbox for linux-xfs@vger.kernel.org
From: pg_xfs@xfs.for.sabi.co.UK (Peter Grandi)
To: Linux XFS <linux-xfs@oss.sgi.com>
Subject: Re: xfsdump -s unacceptable performances
Date: Thu, 17 Aug 2006 13:29:51 +0100	[thread overview]
Message-ID: <17636.24895.74010.977829@base.ty.sabi.co.UK> (raw)
In-Reply-To: <200608170858.11697.daniele@interline.it>

>>> On Thu, 17 Aug 2006 08:58:11 +0200, "Daniele P." 
>>> <daniele@interline.it> said:

[ ... ]

daniele> Hi Timothy, Yes, you are right, but there is another
daniele> problem on my side. The /small/ subtree of the
daniele> filesystem usually contains a lot of hard links (our
daniele> backup software

Given the context, I would imagine that this backup filesystem
is stored on a RAID5 device... Is it? It can be an important
part of the strategy.

daniele> uses hard links to save disk space, so expect one hard
daniele> link per file per day) and using a generic tool like
daniele> tar/star or rsync that uses "stat" to scan the
daniele> filesystem should be significantly slower (no test done)
daniele> than a native tool like xfsdump, as Bill in a previous
daniele> email pointed out.

Depends a lot, for example on whether the system has enough RAM
to cache the inodes affected, and anyhow on the ratio between
inodes in the subtree and inodes in the whole filesystem.
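The extra stat traffic that the hard-link scheme creates can be
seen with a toy tree (a sketch; the day-by-day link names are
made up for illustration):

```shell
# Each day's snapshot adds a new name (hard link) for every unchanged
# file, so a stat-based scanner sees N names per file and must keep a
# (device, inode) table to avoid archiving the same data N times.
d=$(mktemp -d)
echo payload > "$d/file"
ln "$d/file" "$d/file.day1"     # same inode, second name
ln "$d/file" "$d/file.day2"     # same inode, third name
find "$d" -type f -printf '%i links=%n %p\n'
# all three names print the same inode number, each with links=3
```

Every one of those names costs a stat() during a scan, even
though the file data exists only once on disc.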

As to this case:

    daniele> Dumping one directory with 4 files using 4KB of
    daniele> space takes hours (or days, it hasn't finished yet)
    daniele> if the underlying filesystem contains around
    daniele> 10.000.000 inodes.

using 'tar'/'star' would probably be a bit faster...
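For the four-file case above this is easy to see: tar only stats
the subtree it archives, while 'xfsdump -s' still walks the whole
filesystem's inode list. A sketch using a throwaway directory
(the /backup layout and the exact xfsdump invocation are assumed,
not tested here):

```shell
# Demo: tar touches only the subtree it archives. On the real system
# the comparison would be something like (paths assumed, untested):
#   time tar -cf /dev/null -C /backup small
#   time xfsdump -l0 -s small -f /dev/null /backup  # walks every inode
d=$(mktemp -d)
mkdir "$d/small"
: > "$d/small/a"; : > "$d/small/b"
tar -cf /dev/null -C "$d" small && echo "subtree archived"
```

The tar run completes after visiting a handful of inodes no
matter how many inodes the rest of the filesystem holds.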

daniele> It seems that there isn't a right tool for this job.

Or perhaps it is not the right job :-).

Anyhow, sequential scans of large filesystems are not an awesome
idea in general. I wonder how long, and how much RAM, a 'fsck'
of your 10m-inode filesystem would take :-); perhaps you don't
want to read this older entry in this mailing list:

  http://OSS.SGI.com/archives/linux-xfs/2005-08/msg00045.html
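For scale, a back-of-envelope estimate (the per-inode figure is
an assumed ballpark drawn from period discussions, not a measured
or documented number):

```shell
# Rough repair/fsck memory estimate for the 10m-inode case.
inodes=10000000
per_inode=300        # assumed bytes of repair state per inode
echo "approx repair RAM: $(( inodes * per_inode / 1024 / 1024 )) MiB"
```

That is on the order of gigabytes of RAM just to hold repair
state, before any of the actual scanning time is counted.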

The basic problem is that the bottleneck is the ''pickup'' (the
disc arm), whose speed has not grown as fast as disc capacity,
so one has to use RAID to work around that; but RAID only
delivers for parallel, not sequential, scans of the filesystem.
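What a parallel scan looks like in practice can be sketched like
this, assuming GNU find/xargs and using a throwaway tree in place
of the real backup filesystem:

```shell
# One worker per top-level directory, so a RAID array sees several
# outstanding requests instead of one sequential stream.
# GNU xargs -P (parallel jobs) is assumed to be available.
root=$(mktemp -d)                 # stand-in for the backup tree
mkdir "$root/day1" "$root/day2" "$root/day3"
find "$root" -mindepth 1 -maxdepth 1 -type d -print0 |
  xargs -0 -P 3 -n 1 du -s       # three concurrent directory scans
```

A single 'xfsdump' or 'tar' of the whole tree gets none of that
parallelism, so it sees roughly single-spindle speed.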

A significant issue that nobody seems in a hurry to address. As
a recent ''contributor'' to this list wrote:

  > hope i never need to run repair,

:-)


Thread overview: 8+ messages
2006-08-16 13:15 xfsdump -s unacceptable performances Daniele P.
2006-08-16 14:38 ` Klaus Strebel
2006-08-16 18:01   ` Daniele P.
2006-08-17  1:31     ` Timothy Shimmin
2006-08-17  6:58       ` Daniele P.
2006-08-17 12:29         ` Peter Grandi [this message]
2006-08-16 16:38 ` Bill Kendall
2006-08-16 18:05   ` Daniele P.
