* hardlinking and deleting millions of small files
From: Arkadiusz Miśkiewicz @ 2016-07-24 12:38 UTC
To: xfs@oss.sgi.com
Hello.
I'm using rsnapshot to back up big servers (like 5TB fs, 25 000 000 inodes,
small files: mailboxes stored as maildirs, so each mail is a separate file).
Backup server - kernel 4.6.3, V4 xfs filesystems.
cp -al for that amount takes about 1.5 days.
rm -rf of the hardlinked copy takes another 1.5 days
(and tons of RAM for these operations, which caused OOM until recent kernels
made reclaim better, so no more OOM).
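For readers unfamiliar with the operation being timed: `cp -al` recreates the directory tree and adds a hard link for every file instead of copying data, so the cost per file is roughly one new directory entry plus one link-count update on the inode. A minimal Python sketch of that idea follows. `hardlink_tree` is a hypothetical helper written for illustration, not what cp or rsnapshot actually run, and it ignores symlinks, special files and metadata preservation.

```python
import os


def hardlink_tree(src, dst):
    """Recreate the directory tree under src at dst, hard-linking each file.

    Illustrative only: roughly the effect of `cp -al src dst` for regular
    files. dst must be on the same filesystem as src and outside of it.
    Returns the number of hard links created.
    """
    links = 0
    for dirpath, dirnames, filenames in os.walk(src):
        # Map the current source directory onto the destination tree.
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # No file data is copied: this only adds a directory entry
            # and bumps the link count of the existing inode.
            os.link(os.path.join(dirpath, name),
                    os.path.join(target_dir, name))
            links += 1
    return links
```

With 25 000 000 files, every snapshot therefore means about 25 000 000 link operations on creation and the same number of unlink operations when the snapshot is rotated out.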
Now the weird part: similar operations on ext4 finish in a matter of hours.
Are there any possibilities for xfs to improve in these areas?
From IRC #xfs a few months ago, the conclusion was that xfs isn't the best at
such operations.
PS. I didn't do a scientific comparison (I'm just viewing backup logs of two
similar mail servers (similar hardware, similar storage size) being backed up
to a single backup server onto two partitions, one with xfs and one with ext4
on it).
--
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: hardlinking and deleting millions of small files
From: Carlos E. R. @ 2016-07-24 12:48 UTC
To: XFS mailing list
On 2016-07-24 14:38, Arkadiusz Miśkiewicz wrote:
>
> Hello.
>
> I'm using rsnapshot to back up big servers (like 5TB fs, 25 000 000 inodes,
> small files: mailboxes stored as maildirs, so each mail is a separate file).
> Backup server - kernel 4.6.3, V4 xfs filesystems.
>
> cp -al for that amount takes about 1.5 days.
> rm -rf of the hardlinked copy takes another 1.5 days
I hesitate to suggest reiserfs...
I know it is good in that situation, but I doubt it scales well nowadays.
Your filesystems are far bigger than anything I have experience with.
There was some suggestion that btrfs could do it (without snapshots).
But... ? Too green? You could try it out on a spare test server.
--
Cheers / Saludos,
Carlos E. R.
(from openSUSE Leap 42.1 x86_64 (test))
* Re: hardlinking and deleting millions of small files
From: Dave Chinner @ 2016-07-25 0:23 UTC
To: Arkadiusz Miśkiewicz; +Cc: xfs@oss.sgi.com
On Sun, Jul 24, 2016 at 02:38:10PM +0200, Arkadiusz Miśkiewicz wrote:
>
> Hello.
>
> I'm using rsnapshot to back up big servers (like 5TB fs, 25 000 000 inodes,
> small files: mailboxes stored as maildirs, so each mail is a separate file).
> Backup server - kernel 4.6.3, V4 xfs filesystems.
What storage? What mount options? What is the xfs_info output?
(/me points at
http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F)
> cp -al for that amount takes about 1.5 days.
> rm -rf of the hardlinked copy takes another 1.5 days
So what's the bottleneck? Reading the directory structure/inodes
into memory to make the copy? What does the IO performance look like?
Are the operations CPU bound? What else is generating IO load at the
same time? How big are the individual directories? How full is the
filesystem? How many hardlinks in a "copy"?
FWIW, you've got 25M inodes in the filesystem - how many hardlinks
do you have in the filesystem? 100M? 200M? 1B? i.e. what's the scale
of the directory structure that contains all the hard links?
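One way to put numbers on the scale questions above is to walk a snapshot tree and compare the number of file directory entries with the number of distinct inodes behind them. The sketch below is only an illustration: `link_stats` is a hypothetical helper, and it keeps one set entry per inode in memory, so it is better suited to a subtree than to all 25M inodes at once.

```python
import os


def link_stats(root):
    """Count file directory entries, distinct inodes and the largest directory.

    Hypothetical helper for illustration. Hard links show up as several
    directory entries that share one (st_dev, st_ino) pair.
    """
    seen = set()
    file_entries = 0
    largest_dir_entries = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Directory size here means the number of names it holds.
        largest_dir_entries = max(largest_dir_entries,
                                  len(dirnames) + len(filenames))
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            file_entries += 1
            seen.add((st.st_dev, st.st_ino))
    return {
        "file_entries": file_entries,
        "unique_inodes": len(seen),
        "largest_dir_entries": largest_dir_entries,
    }
```

If `file_entries` is many times larger than `unique_inodes`, the directory structure holding the hard links is correspondingly larger than the 25M inode count suggests.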
> (and tons of RAM for these operations, which caused OOM until recent kernels
> made reclaim better, so no more OOM).
What oom problems? Slabtop output during a test?
> Now the weird part: similar operations on ext4 finish in a matter of hours.
So you probably need to identify where the difference in behaviour
is - reading from disk, writing to disk, CPU usage, directory entry
creation/removal speed, etc.
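As a starting point for isolating directory entry creation/removal speed from the rest of the workload, a small timing harness can be run in an empty scratch directory on each filesystem. This is a rough sketch under stated assumptions: `time_link_unlink` is a hypothetical helper, the default count is arbitrary, and because nothing is synced to disk it mostly measures the in-memory and journalling cost rather than steady-state IO.

```python
import os
import time


def time_link_unlink(directory, count=1000):
    """Time creating and then removing `count` hard links to one file.

    Hypothetical micro-benchmark for illustration; run it in an empty
    scratch directory. Returns (create_seconds, remove_seconds).
    """
    source = os.path.join(directory, "bench-source")
    with open(source, "w") as fh:
        fh.write("x")
    names = [os.path.join(directory, "bench-link-%d" % i)
             for i in range(count)]

    start = time.perf_counter()
    for name in names:
        os.link(source, name)
    create_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for name in names:
        os.unlink(name)
    remove_seconds = time.perf_counter() - start

    os.unlink(source)  # leave the scratch directory as it was found
    return create_seconds, remove_seconds
```

Running the same call against a directory on the xfs partition and one on the ext4 partition, while watching IO and CPU usage, would show whether link creation, link removal, or something else accounts for the gap.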
> PS. I didn't do a scientific comparison (I'm just viewing backup logs of two
> similar mail servers (similar hardware, similar storage size) being backed up
> to a single backup server onto two partitions, one with xfs and one with ext4
> on it).
So the /destination/ filesystem is either ext4 or XFS, but the source
filesystem is the same? So how does "cp -al" work to create
hardlinks when copying to a different filesystem? If this is a copy
to a different filesystem, then it's a very different problem to
"creating/removing hardlinks is slow".
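The point about different filesystems can be checked directly: a hard link is just another name for an existing inode, so link(2) fails with EXDEV when source and destination are on different filesystems. The sketch below shows two ways to test this. Both helper names are hypothetical, and a matching `st_dev` is only a necessary condition, since linking across separate mount points of the same filesystem can still fail.

```python
import errno
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths report the same device ID (st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def try_hardlink(src_file, dst_file):
    """Attempt a hard link; return None on success or the errno name.

    Hypothetical helper for illustration. A cross-filesystem attempt is
    expected to come back as "EXDEV".
    """
    try:
        os.link(src_file, dst_file)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

If the snapshot source and the `cp -al` target fail this check, the hardlinked copy must be happening within the backup partition itself, with the data arriving there by a separate copy step.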
Clearly I haven't understood what you are trying to describe, so can
you please describe the problem in more detail and not assume I know
anything about where you are copying from/to, what the hardware or
filesystem layout is, etc.
I know I haven't answered your question and just fired back a bunch
of questions, but I need to know specifics to be able to have any
chance of understanding the problem you are having.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com