public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* Bug in xfsrestore or NFS+XFS Filesystem on NFS Server ? Can anyone confirm ?
@ 2006-08-12  7:56 Juergen Sauer
  2006-08-12 22:57 ` Justin Piszcz
  2006-08-15 13:20 ` Bill Kendall
  0 siblings, 2 replies; 3+ messages in thread
From: Juergen Sauer @ 2006-08-12  7:56 UTC (permalink / raw)
  To: linux-xfs

Hi!
Job to do: transfer a filesystem from one RAID to another / back up a server with
about 1 TB of data and restore it.

The scenario:
Server: Ubuntu 6.06 LTS + own xfs-fixed kernel 2.6.17.7
Backup: Ubuntu 6.06 LTS + own xfs-fixed kernel 2.6.17.7
Network 1Gbit Copper

I xfsdumped the server like this:
1. Logged into the backup system.
1a. Dumped the XFS filesystem like this:
> ssh server "xfsdump -l 0 -F -L transfer -M server_backup - /dev/sdb1" > /backup/backup-server_sdb1_xfsdump
Worked fine. The result was:
-rw-rw-r-- 1 root root 568776542496 2006-08-10 15:16 /backup/backup-server_sdb1_xfsdump

But the ssh transfer has a big CPU overhead (compression, encryption), so the dump
was not as fast as expected; it took about two days.
So I decided to drop that overhead and use an NFS mount on the restore side.

2. Now I exchanged the old hardware RAID in the server and booted from Knoppix V5.0.1
(see: http://ftp.freenet.de/pub/ftp.uni-kl.de/pub/linux/knoppix/KNOPPIX_V5.0.1CD-2006-06-01-DE.iso
http://ftp.freenet.de/pub/ftp.uni-kl.de/pub/linux/knoppix/KNOPPIX_V5.0.1CD-2006-06-01-EN.iso)
Knoppix was booted with "knoppix 2"

2a. Created and formatted partitions sda1 and sda2 (sda2 for swap, 4 GB or so; activated the swap);
mounted the fresh XFS partition on /mnt.
2b. Brought up the NFS mount to the backup server (started portmap and mounted backup:/backup/).

2c. Started xfsrestore like this:
xfsrestore -f /backup/backup-server_sda1_xfsdump /mnt/

After a few hours xfsrestore quit with an error: "Too many open files", abort.
The restore process failed after ca. 66 GB of 500 GB.
OK, I raised the kernel open-file maximum (echo 524288 > /proc/sys/fs/file-max)
and tried again:
xfsrestore -f /backup/backup-server_sda1_xfsdump /mnt/
Same error at the same place.
The version was: xfsrestore: version 2.2.36 (dump format 3.0)
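One possible explanation (an assumption on my part, not verified): "Too many open files" is the per-process EMFILE limit, controlled by ulimit -n (RLIMIT_NOFILE), while /proc/sys/fs/file-max is the system-wide descriptor table, so raising file-max would not help a single leaking process. A minimal sketch comparing the two knobs:

```shell
# Per-process descriptor limit -- this is what triggers EMFILE,
# "Too many open files":
ulimit -Sn        # soft limit, often 1024 by default
ulimit -Hn        # hard limit the soft limit may be raised towards
# It could be raised before starting the restore, e.g.:
#   ulimit -n 65536
# The system-wide table (a different limit, ENFILE):
cat /proc/sys/fs/file-max
```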

Can someone confirm this?

A workaround was to use the much slower ssh pipe construct:
2d.
root@Knoppix[~]# ssh backup "dd if=/backup/backup-server_sda1_xfsdump bs=10M" | xfsrestore - /mnt/

This works.

Since this last restore works, I think there is a bug in xfsrestore, or in XFS/NFS in kernel 2.6.17.
The backup server, the server, and Knoppix all run a bug-fixed 2.6.17(.7). Restoring the dump via NFS failed with "Too many open files".

Does xfsrestore really open 512k files? I don't believe that.
So the problem could be in the kernel or in the NFS stack.

Can anyone confirm this behavior? (so it goes better next time)

Greetings from Northern Germany
	Jürgen Sauer

-- 
Jürgen Sauer - AutomatiX GmbH, +49-4209-4699, jojo@automatix.de
The Linux system house - Service - Support - Servers - Solutions
http://www.automatix.de  Get OpenOffice free of charge here:
http://de.openoffice.org/

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Bug in xfsrestore or NFS+XFS Filesystem on NFS Server ? Can anyone confirm ?
  2006-08-12  7:56 Bug in xfsrestore or NFS+XFS Filesystem on NFS Server ? Can anyone confirm ? Juergen Sauer
@ 2006-08-12 22:57 ` Justin Piszcz
  2006-08-15 13:20 ` Bill Kendall
  1 sibling, 0 replies; 3+ messages in thread
From: Justin Piszcz @ 2006-08-12 22:57 UTC (permalink / raw)
  To: Juergen Sauer; +Cc: linux-xfs


If you mounted the FS with Knoppix 5.0.1 and did reads/writes, you have
corrupted your filesystem. It ships kernel 2.6.17, which suffers from the
corruption bug.

On Sat, 12 Aug 2006, Juergen Sauer wrote:

> 2. Now I exchanged the old hardware RAID in the server and booted from Knoppix V5.0.1
> Knoppix was booted with "knoppix 2"

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Bug in xfsrestore or NFS+XFS Filesystem on NFS Server ? Can anyone confirm ?
  2006-08-12  7:56 Bug in xfsrestore or NFS+XFS Filesystem on NFS Server ? Can anyone confirm ? Juergen Sauer
  2006-08-12 22:57 ` Justin Piszcz
@ 2006-08-15 13:20 ` Bill Kendall
  1 sibling, 0 replies; 3+ messages in thread
From: Bill Kendall @ 2006-08-15 13:20 UTC (permalink / raw)
  To: juergen.sauer; +Cc: linux-xfs

On 08/12/06 02:56, Juergen Sauer wrote:
> Since this last restore works, I think there is a bug in xfsrestore, or in XFS/NFS in kernel 2.6.17.
> The backup server, the server, and Knoppix all run a bug-fixed 2.6.17(.7). Restoring the dump via NFS failed with "Too many open files".
> 
> Does xfsrestore really open 512k files? I don't believe that.
> So the problem could be in the kernel or in the NFS stack.

Not intentionally, but there could be a file descriptor leak somewhere
in xfsrestore or one of the libraries it uses. Take a look at
/proc/<xfsrestore pid>/fd while the restore is going on and see if the
file descriptor list is growing. If so, it should indicate what files
are being opened but not closed.
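A minimal sketch of that check (assuming a Linux /proc; shown here against the current shell's own PID, $$ -- substitute the xfsrestore PID):

```shell
# Count the open descriptors of a process via /proc/<pid>/fd.
# A count that keeps growing during the restore points at a
# descriptor leak.
pid=$$
fd_count=$(ls /proc/"$pid"/fd | wc -l)
echo "pid $pid has $fd_count open descriptors"
# Repeated during the restore, e.g.:
#   while kill -0 "$pid" 2>/dev/null; do
#       ls /proc/"$pid"/fd | wc -l
#       sleep 10
#   done
```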

Bill

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2006-08-15 14:20 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-08-12  7:56 Bug in xfsrestore or NFS+XFS Filesystem on NFS Server ? Can anyone confirm ? Juergen Sauer
2006-08-12 22:57 ` Justin Piszcz
2006-08-15 13:20 ` Bill Kendall

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox